State Aggregation based Linear Programming approach to Approximate Dynamic Programming
One often encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for controlled Markov chains. In this paper, we provide a method to construct sub-optimal policies along with a bound for the deviation of such a policy from the optimum through the use of restricted linear programming. The novelty of this approach lies in circumventing the need for a value iteration or a linear program defined on the entire state-space. Instead, the state-space is partitioned based on the reward structure and the optimal cost-to-go or value function is approximated by a constant over each partition. We associate a meta-state with each partition, where the transition probabilities between these meta-states can be derived from the original Markov chain specification. The state aggregation approach results in a significant reduction in the computational burden and lends itself to a restricted linear program defined on the aggregated state-space. Finally, the proposed method is benchmarked on a perimeter surveillance stochastic control problem.
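The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: the abstract does not say how the meta-state transition probabilities are derived, so the sketch assumes a uniform average over the states in each partition. The function names `aggregate_transitions` and `lift`, the 4-state chain `P`, and the two-block `partition` are all hypothetical and chosen only for illustration. The restricted linear program itself is not shown.

```python
from collections import defaultdict


def aggregate_transitions(P, partition):
    """Build meta-state transition probabilities for one fixed action.

    P[x][y] is the probability of moving from state x to state y.
    partition[x] is the index of the meta-state that contains state x.
    Assumption (not from the paper): states inside a partition are
    weighted uniformly when averaging their outgoing probabilities.
    """
    n_meta = max(partition) + 1
    members = defaultdict(list)
    for x, i in enumerate(partition):
        members[i].append(x)

    Q = [[0.0] * n_meta for _ in range(n_meta)]
    for i in range(n_meta):
        for x in members[i]:
            for y, p in enumerate(P[x]):
                # Probability mass into state y is credited to y's meta-state.
                Q[i][partition[y]] += p / len(members[i])
    return Q


def lift(w, partition):
    """Expand one value per meta-state into a piecewise-constant
    value function on the original state-space."""
    return [w[i] for i in partition]


# Hypothetical 4-state chain, split into two partitions {0, 1} and {2, 3}.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.5, 0.0, 0.0, 0.5],
]
partition = [0, 0, 1, 1]

Q = aggregate_transitions(P, partition)
print(Q)
print(lift([1.0, 2.0], partition))
```

Each row of `Q` still sums to one, so the aggregated object is again a Markov chain, here on two meta-states instead of four states. A linear program posed over one variable per meta-state would then have far fewer variables than one posed over the full state-space, which is the computational saving the abstract refers to.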
author list (cited authors)
Darbha, S., Krishnamoorthy, K., Pachter, M., & Chandler, P.