Approximate Dynamic Programming with State Aggregation applied to Perimeter Patrol*
- View All
One encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for large scale controlled Markov Chains. In this paper, we provide an aggregation method to construct sub-optimal policies for a perimeter surveillance control problem which gives rise to a large scale Markov chain. The novelty of this approach lies in circumventing the need for value iteration over the entire state space. Instead, the state space is partitioned and the value function is approximated by a constant over each partition. We associate a meta-state with each partition where the transition probabilities between these meta-states is known. The aggregation approach results in a significant reduction in the computational burden and lends itself to value iteration over the aggregated state space. We provide bounds to assess the quality of the approximation and give numerical results that support the proposed methodology.
author list (cited authors)
Kalyanam, K., Pachter, M., & Chandler, P.