UAV perimeter patrol operations optimization using efficient Dynamic Programming
A reduced order Dynamic Programming (DP) method that efficiently computes the optimal policy and value function for a class of controlled Markov chains is developed. We assume that the Markov chains exhibit the property that a subset of the states have a single (default) control action associated with them. Furthermore, we assume that the transition probabilities between the remaining (decision) states can be derived from the original Markov chain specification. Under these assumptions, the suggested reduced order DP method yields significant savings in computation time and also leads to faster convergence to the optimal solution. Most importantly, the reduced order DP has been shown analytically to give exactly the same solution that one would obtain by performing DP on the original full state space Markov chain. The method is illustrated via a multi-UAV perimeter patrol stochastic optimal control problem. © 2011 AACC American Automatic Control Council.
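The idea of eliminating default states before running DP can be illustrated with a minimal value-iteration sketch. Everything below is a hypothetical toy problem, not the paper's patrol model: the five-state chain, the transition probabilities, the stage costs, and the discount factor are all invented for illustration. Default states (those with a single action) are removed algebraically via a discounted fundamental matrix, giving effective decision-to-decision transitions and costs, and DP is then run on the two decision states only; the resulting values match full-state value iteration.

```python
import numpy as np

gamma = 0.9
S = [0, 1]     # decision states (multiple admissible actions)
D = [2, 3, 4]  # default states (single default action)

# Transition rows for decision states, one matrix per action (made-up numbers).
P_dec = np.array([
    [[0.1, 0.2, 0.3, 0.2, 0.2],   # action 0, state 0
     [0.0, 0.1, 0.4, 0.3, 0.2]],  # action 0, state 1
    [[0.3, 0.1, 0.1, 0.3, 0.2],   # action 1, state 0
     [0.2, 0.2, 0.2, 0.2, 0.2]],  # action 1, state 1
])
c_dec = np.array([[1.0, 2.0],     # stage costs under action 0
                  [1.5, 0.5]])    # stage costs under action 1

# Fixed transition rows and stage costs for the default states.
P_def = np.array([[0.5, 0.2, 0.1, 0.1, 0.1],
                  [0.1, 0.4, 0.2, 0.2, 0.1],
                  [0.3, 0.3, 0.1, 0.2, 0.1]])
c_def = np.array([0.2, 0.8, 0.4])

# --- Full-order value iteration over all five states ---
V = np.zeros(5)
for _ in range(3000):
    Q = c_dec + gamma * P_dec @ V   # action values at decision states
    V_new = np.empty(5)
    V_new[S] = Q.min(axis=0)        # minimize over actions
    V_new[D] = c_def + gamma * P_def @ V  # single default action
    V = V_new

# --- Reduced-order DP: eliminate default states analytically ---
P_DD, P_DS = P_def[:, D], P_def[:, S]
N = np.linalg.inv(np.eye(len(D)) - gamma * P_DD)  # discounted fundamental matrix
P_SS, P_SD = P_dec[:, :, S], P_dec[:, :, D]
# Effective one-step costs and discounted decision-to-decision transitions.
c_eff = c_dec + gamma * P_SD @ (N @ c_def)
M_eff = gamma * P_SS + gamma**2 * P_SD @ N @ P_DS

V_S = np.zeros(len(S))
for _ in range(3000):
    V_S = (c_eff + M_eff @ V_S).min(axis=0)

# Recover default-state values from the decision-state solution.
V_D = N @ (c_def + gamma * P_DS @ V_S)

# Reduced-order and full-order solutions agree.
assert np.allclose(V[S], V_S, atol=1e-8)
assert np.allclose(V[D], V_D, atol=1e-8)
```

The inner loop of the reduced method iterates over 2 states instead of 5, which is the source of the computational savings the abstract describes; the algebraic elimination is exact, so the two solutions coincide.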
author list (cited authors)
Krishnamoorthy, K., Pachter, M., Chandler, P., Casbeer, D., & Darbha, S.