State Aggregation based Linear Programming approach to Approximate Dynamic Programming (Conference Paper)

abstract

One often encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for controlled Markov chains. In this paper, we provide a method to construct sub-optimal policies along with a bound for the deviation of such a policy from the optimum through the use of restricted linear programming. The novelty of this approach lies in circumventing the need for a value iteration or a linear program defined on the entire state-space. Instead, the state-space is partitioned based on the reward structure and the optimal cost-to-go or value function is approximated by a constant over each partition. We associate a meta-state with each partition, where the transition probabilities between these meta-states can be derived from the original Markov chain specification. The state aggregation approach results in a significant reduction in the computational burden and lends itself to a restricted linear program defined on the aggregated state-space. Finally, the proposed method is benchmarked on a perimeter surveillance stochastic control problem.
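
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `aggregate_chain`, the 4-state chain, the two-cell partition, and the uniform weighting of states within each cell are all assumptions chosen for illustration, and the paper may derive the meta-state transition probabilities differently. The sketch only shows how a partition induces meta-state transition probabilities and a value function that is constant over each cell.

```python
import numpy as np

def aggregate_chain(P, partition, n_meta):
    """Aggregate a Markov chain over a partition of its state space.

    P         : (n, n) row-stochastic transition matrix of the original chain.
    partition : length-n integer array, partition[x] = meta-state of state x.
    n_meta    : number of meta-states (every cell is assumed non-empty).

    Returns the (n, n_meta) indicator matrix Phi and the (n_meta, n_meta)
    aggregated transition matrix.
    """
    n = P.shape[0]
    # Phi[x, k] = 1 if state x belongs to meta-state k, else 0.
    Phi = np.zeros((n, n_meta))
    Phi[np.arange(n), partition] = 1.0
    # Assumption: states inside a cell are weighted uniformly.
    sizes = Phi.sum(axis=0)
    D = (Phi / sizes).T  # (n_meta, n), each row sums to 1
    # Probability of moving from cell k to cell l, averaged over states in k.
    P_agg = D @ P @ Phi
    return Phi, P_agg

# Hypothetical 4-state chain for a single fixed control action.
P = np.array([
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.25, 0.50, 0.00],
    [0.00, 0.00, 0.50, 0.50],
    [0.50, 0.00, 0.00, 0.50],
])
partition = np.array([0, 0, 1, 1])  # states {0, 1} and {2, 3} form the two cells

Phi, P_agg = aggregate_chain(P, partition, n_meta=2)

# A value function that is constant on each cell is Phi @ w,
# where w holds one value per meta-state (values here are arbitrary).
w = np.array([1.0, 3.0])
V_approx = Phi @ w
```

With one such aggregated matrix per control action, a linear program over the meta-state values `w` has as many variables as there are cells rather than states, which is the source of the computational saving the abstract refers to.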

name of conference

49th IEEE Conference on Decision and Control (CDC)

authors

Darbha, Swaroop

published proceedings

49th IEEE Conference on Decision and Control (CDC)

author list (cited authors)

Darbha, S., Krishnamoorthy, K., Pachter, M., & Chandler, P.

citation count

23

complete list of authors

Darbha, S; Krishnamoorthy, K; Pachter, M; Chandler, P

publication date

December 2010

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

IEEE Conference on Decision and Control