A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem

One encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for large scale controlled Markov chains. In this article, we consider a perimeter patrol stochastic optimal control problem. To determine the optimal control policy, one has to solve a Markov decision problem, whose large size renders exact dynamic programming methods intractable. So, we propose a state aggregation based approximate linear programming method to construct provably good sub-optimal policies instead. The state-space is partitioned and the optimal cost-to-go or value function is restricted to be a constant over each partition. We show that the resulting restricted system of linear inequalities embeds a family of Markov chains of lower dimension, one of which can be used to construct a tight lower bound on the optimal value function. In general, the construction of the lower bound requires the solution to a combinatorial problem. But the perimeter patrol problem exhibits a special structure that enables tractable linear programming formulation for the lower bound. We demonstrate this and also provide numerical results that corroborate the efficacy of the proposed methodology.

Kalyanam, Krishnamoorthy||Darbha, Swaroop||Park, Myoungkuk||Pachter, Meir||Chandler, Phil||Casbeer, David

A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem Conference Paper