Off-policy reinforcement learning for H∞ control design. - Texas A&M University (TAMU) Scholar

abstract

The H∞ control design problem is considered for nonlinear systems with an unknown internal system model. It is known that the nonlinear H∞ control problem can be transformed into solving the so-called Hamilton-Jacobi-Isaacs (HJI) equation, a nonlinear partial differential equation that generally cannot be solved analytically. Even worse, model-based approaches cannot be used to approximately solve the HJI equation when an accurate system model is unavailable or costly to obtain in practice. To overcome these difficulties, an off-policy reinforcement learning (RL) method is introduced to learn the solution of the HJI equation from real system data instead of a mathematical system model, and its convergence is proved. In the off-policy RL method, the system data can be generated with arbitrary policies rather than the policy being evaluated, which is important and promising for practical systems. For implementation, a neural network (NN)-based actor-critic structure is employed, and a least-squares NN weight update algorithm is derived based on the method of weighted residuals. Finally, the developed NN-based off-policy RL method is tested on a linear F16 aircraft plant and further applied to a rotational/translational actuator system.
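
For reference, the HJI equation mentioned in the abstract can be written in a standard textbook form. This is only a sketch under assumptions not stated in this record: input-affine dynamics with control u and disturbance w, a penalty output h(x), unit control weighting, and an attenuation level γ > 0. The symbols f, g, k, h, V, and γ are illustrative and may differ from the paper's own notation.

```latex
% Assumed setting (not taken from the record): input-affine system and
% an L2-gain bound with attenuation level \gamma, for x(0) = 0.
\begin{aligned}
&\dot{x} = f(x) + g(x)\,u + k(x)\,w, \qquad
\int_0^\infty \left( \|h(x)\|^2 + \|u\|^2 \right) dt
  \le \gamma^2 \int_0^\infty \|w\|^2 \, dt, \\
% HJI equation for the value function V, with V(0) = 0.
&0 = (\nabla V)^{\top} f + h^{\top} h
  - \tfrac{1}{4} (\nabla V)^{\top} g g^{\top} \nabla V
  + \tfrac{1}{4\gamma^2} (\nabla V)^{\top} k k^{\top} \nabla V,
  \qquad V(0) = 0, \\
% Saddle-point control and worst-case disturbance.
&u^*(x) = -\tfrac{1}{2}\, g^{\top} \nabla V, \qquad
w^*(x) = \tfrac{1}{2\gamma^2}\, k^{\top} \nabla V.
\end{aligned}
```

Under these assumptions, the equation follows from minimizing over u and maximizing over w the Hamiltonian $(\nabla V)^{\top}(f + g u + k w) + h^{\top} h + u^{\top} u - \gamma^2 w^{\top} w$. Because the equation contains f(x), solving it directly requires the internal system model, which is why the abstract turns to learning from measured data.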

authors

Huang, Tingwen

published proceedings

IEEE Trans Cybern

altmetric score

0.5

author list (cited authors)

Luo, B., Wu, H., & Huang, T.

citation count

265

complete list of authors

Luo, Biao; Wu, Huai-Ning; Huang, Tingwen

publication date

January 2015

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

IEEE Transactions on Cybernetics

keywords

H-infinity Control Design
Hamilton-Jacobi-Isaacs Equation
Neural Network
Off-policy Learning
Reinforcement Learning

Digital Object Identifier (DOI)

10.1109/TCYB.2014.2319577

start page

65

end page

76

volume

45

issue

1

URL

http://dx.doi.org/10.1109/tcyb.2014.2319577

Off-policy reinforcement learning for H∞ control design. Academic Article
