Optimal MAV operations in an uncertain environment Academic Article


  • In this paper, we consider a problem of sequential resource allocation. Such a problem arises in a simplified intelligence, surveillance and reconnaissance scenario where a micro air vehicle (MAV) is tasked with classification in an environment with false, that is, clutter, targets. The MAV visits the objects of interest in a specified sequence. A human operator is tasked with aiding the MAV's automatic target recognition system with the classification of objects, based on the images sent to him from the MAV. If the images do not resolve the ambiguity concerning the status of the object being classified, the operator may request that the object be revisited. In this paper, for the sake of exposition of the employed methods and clarity of presentation, we assume that every object may be revisited at most once. Each object can be revisited and re-examined in L ≥ 1 ways. There is an information gain whose value is given by the running reward. The information gain depends on the way an object is re-examined and on the feedback from the operator, but it is the same for all the objects. There is a random operator delay in communicating his findings to the MAV and the probability density function of the delay is assumed known. The MAV has a limited fuel reserve and upon getting feedback from the operator (and hence, knowing the delay associated with the object of interest only and not with those that it must revisit in the future), it must decide whether to revisit the object and if so, which of the L ways is optimal so as to maximize the total expected reward. In every revisit, fuel is expended from the reserve and is proportional to twice the delay plus a fixed cost, which is dependent on the way in which the object is re-examined. We employ a stochastic dynamic programming approach to solve this problem. Specifically, for the case when L = 1, we show that there is an optimal threshold for each object and it is optimal to revisit the object if the delay is at most the threshold and not to revisit otherwise. For the case when L > 1, the structure of the optimal decision algorithm is not as simple, but it can be computed off-line so as to facilitate its real-time implementation. Published in 2007 by John Wiley & Sons, Ltd.
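The threshold structure described for the case L = 1 can be illustrated with a small backward-induction sketch. This is not the authors' formulation, only a toy model under stated assumptions: fuel and delay are discretized to integers, the delay distribution is given as a probability mass function `delay_pmf`, a revisit costs `2 * tau + fixed_cost` fuel units, and every revisit earns the same constant `reward`. The function name `solve_thresholds` and all parameter names are hypothetical.

```python
def solve_thresholds(n_objects, max_fuel, delay_pmf, reward, fixed_cost):
    """Backward induction for a toy L = 1 revisit problem (illustrative assumptions only).

    delay_pmf maps an integer delay tau to its probability.
    Returns (value, threshold), where value[k][f] is the expected reward-to-go
    at object k with fuel f before the delay is known, and threshold[k][f] is
    the largest delay for which revisiting is optimal (-1 means never revisit).
    """
    value = [[0.0] * (max_fuel + 1) for _ in range(n_objects + 1)]
    threshold = [[-1] * (max_fuel + 1) for _ in range(n_objects)]

    for k in range(n_objects - 1, -1, -1):
        for f in range(max_fuel + 1):
            expected = 0.0
            skip = value[k + 1][f]  # keep the fuel and move on
            for tau, p in sorted(delay_pmf.items()):
                cost = 2 * tau + fixed_cost  # assumed fuel model
                if cost <= f:
                    revisit = reward + value[k + 1][f - cost]
                else:
                    revisit = float("-inf")  # not enough fuel to revisit
                if revisit >= skip:
                    threshold[k][f] = tau  # delays are scanned in increasing order
                    expected += p * revisit
                else:
                    expected += p * skip
            value[k][f] = expected
    return value, threshold


# Example: two objects, ten fuel units, delay is 1 or 3 with equal probability.
value, threshold = solve_thresholds(
    n_objects=2, max_fuel=10, delay_pmf={1: 0.5, 3: 0.5}, reward=1.0, fixed_cost=1
)
```

In this toy model the reward-to-go is nondecreasing in the remaining fuel, so the value of revisiting is nonincreasing in the delay, and the revisit decision reduces to comparing the observed delay with `threshold[k][f]`. The table can be computed off-line and looked up in flight.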

published proceedings


author list (cited authors)

  • Pachter, M., Chandler, P. R., & Darbha, S.

citation count

  • 10

complete list of authors

  • Pachter, Meir||Chandler, Phillip R||Darbha, Swaroop

publication date

  • January 2008