Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems

abstract

© 2017 Elsevier Inc. Policy iteration and value iteration are the two main iterative adaptive dynamic programming frameworks for solving optimal control problems. Policy iteration converges quickly but requires an initial stabilizing control policy, which is a strict constraint in practice. Value iteration does not require an initial admissible control policy but converges much more slowly. This paper aims to combine the advantages of policy iteration and value iteration while avoiding their drawbacks. To that end, a multi-step heuristic dynamic programming (MsHDP) method is developed for solving the optimal control problem of nonlinear discrete-time systems. MsHDP speeds up value iteration while removing policy iteration's requirement of an initial admissible control policy. Convergence of MsHDP is established by proving that it converges to the solution of the Bellman equation. For implementation, an actor-critic neural network (NN) structure is developed. The critic NN estimates the value function, and its weight vector is computed with a least-squares scheme. The actor NN estimates the control policy, and a gradient descent method is proposed for updating its weight vector. Comparative simulation studies on two examples verify the effectiveness and advantages of MsHDP.
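
The multi-step idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or its simulation setup: it assumes a scalar linear system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2, and it replaces the critic and actor neural networks with an exact quadratic value function V(x) = p x^2 and a linear policy u = -k x. The function name `mshdp_scalar_lq`, the parameter values, and the N-step backup written in the comments are all illustrative assumptions. Setting `n_steps=1` gives ordinary value iteration; a larger `n_steps` evaluates the current greedy policy over more steps before the next improvement.

```python
import math


def mshdp_scalar_lq(a, b, q, r, n_steps, p0=0.0, tol=1e-10, max_iter=1000):
    """Multi-step value update for a scalar LQ problem (illustrative sketch).

    V_i(x) = p_i * x^2. Each iteration:
      1. policy improvement: u = -k x minimizes r*u^2 + p_i*(a*x + b*u)^2
      2. N-step evaluation:  p_{i+1} = sum_{l<N} s*c^(2l) + c^(2N) * p_i
         where c = a - b*k is the closed loop and s = q + r*k^2 the stage cost.
    Returns the converged p and the number of iterations used.
    """
    p = p0
    for it in range(1, max_iter + 1):
        k = a * b * p / (r + b * b * p)      # greedy feedback gain for V_i
        c = a - b * k                        # closed-loop coefficient
        s = q + r * k * k                    # stage cost coefficient under u = -k x
        p_new = s * sum(c ** (2 * l) for l in range(n_steps)) + c ** (2 * n_steps) * p
        if abs(p_new - p) < tol:
            return p_new, it
        p = p_new
    raise RuntimeError("no convergence within max_iter")


# Hypothetical example: open-loop unstable plant, zero initial value function,
# so no stabilizing initial policy is supplied.
a, b, q, r = 1.2, 1.0, 1.0, 1.0

# Reference solution: positive root of the scalar Riccati (Bellman) equation
# b^2 p^2 + (r - q b^2 - a^2 r) p - q r = 0.
beta = r - q * b * b - a * a * r
p_star = (-beta + math.sqrt(beta * beta + 4.0 * b * b * q * r)) / (2.0 * b * b)

p_vi, it_vi = mshdp_scalar_lq(a, b, q, r, n_steps=1)   # plain value iteration
p_ms, it_ms = mshdp_scalar_lq(a, b, q, r, n_steps=3)   # three-step variant

print(f"Riccati solution: {p_star:.6f}")
print(f"1-step: p = {p_vi:.6f} after {it_vi} iterations")
print(f"3-step: p = {p_ms:.6f} after {it_ms} iterations")
```

In this toy setting both runs start from p = 0 and reach the same fixed point, with the three-step variant needing fewer outer iterations. The result depends on the assumed system and says nothing about the nonlinear examples studied in the paper.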

authors

Huang, Tingwen

published proceedings

INFORMATION SCIENCES

author list (cited authors)

Luo, B., Liu, D., Huang, T., Yang, X., & Ma, H.

citation count

25

complete list of authors

Luo, Biao; Liu, Derong; Huang, Tingwen; Yang, Xiong; Ma, Hongwen

publication date

October 2017

publisher

Elsevier

published in

ISSN 0020-0255 (Journal)

keywords

Adaptive Dynamic Programming
Discrete-time
Multi-step Heuristic Dynamic Programming
Neural Networks
Nonlinear Systems
Optimal Control

Digital Object Identifier (DOI)

10.1016/j.ins.2017.05.005

start page

66

end page

83

volume

411

URL

http://dx.doi.org/10.1016/j.ins.2017.05.005
