Maximal Persistent Surveillance and Optimal Sensor Usage in Denied Environments

Research in monitoring and surveillance has flourished in recent years. Its applications include control of illegal deforestation, search for survivors in disasters, and inspection of large infrastructures, among others. Some of the current challenges lie in establishing control policies that are suited for systems with low power and limited sensing, actuation and communication capabilities. This thesis has two main focuses: i) control design for persistent surveillance, where the goal is to design memoryless policies that achieve surveillance of the largest possible area, while respecting certain constraints; and ii) optimal sensor usage for monitoring of denied environments, inside which state observations are costly.

In the first part of the thesis, we design control policies for Markov decision processes (MDPs) with the objective of generating the maximal set of recurrent states, subject to convex constraints on the set of invariant probability mass functions. We propose a design method for memoryless policies and fully observable MDPs with finite state and action spaces. Our approach relies on a finitely parametrized convex program inspired by principles of entropy maximization.
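To make the objective concrete, the following sketch (our own toy construction, not the thesis's convex program) takes a small MDP with a fixed memoryless policy and computes the recurrent states of the induced Markov chain, i.e. the states lying in closed communicating classes; the design problem described above seeks the policy that makes this set as large as possible.

```python
# Hypothetical 4-state, 2-action MDP: P[s][a] maps next-state -> probability.
# All names and numbers here are illustrative assumptions.
P = {
    0: {0: {1: 1.0}, 1: {2: 1.0}},
    1: {0: {0: 1.0}, 1: {1: 1.0}},
    2: {0: {3: 1.0}, 1: {2: 1.0}},
    3: {0: {3: 1.0}, 1: {2: 1.0}},
}
policy = {0: 1, 1: 0, 2: 0, 3: 1}  # memoryless: one action per state

def recurrent_states(P, policy):
    states = list(P)
    # successor sets of the Markov chain induced by the policy
    succ = {s: {t for t, p in P[s][policy[s]].items() if p > 0} for s in states}

    def reachable(s):
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in succ[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    reach = {s: reachable(s) for s in states}
    # In a finite chain, s is recurrent iff every state reachable from s
    # can reach s back (s lies in a closed communicating class).
    return {s for s in states if all(s in reach[t] for t in reach[s])}

print(sorted(recurrent_states(P, policy)))  # → [2, 3]
```

Under this policy, states 0 and 1 are transient while {2, 3} forms the recurrent set; the thesis's convex program searches over (randomized) policies to maximize such a set subject to constraints on the invariant pmf.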

Next, we explore the problem of designing controllers for autonomous robots tasked with maximal persistent surveillance of an area in which there are forbidden regions. We model each robot as an MDP whose state comprises its position on a finite two-dimensional lattice and the direction of motion. The goal is to find the minimum number of robots and an associated time-invariant memoryless control policy that guarantees that the largest number of states are persistently surveilled without ever visiting a forbidden state.
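The setting above can be illustrated with a deliberately small stand-in (the lattice, headings, and "turn right when blocked" policy are our assumptions, not the thesis's model): a robot state is (row, col, heading), the closed-loop dynamics are deterministic, and we verify that the trajectory never enters a forbidden cell while reporting which cells are persistently surveilled, i.e. visited infinitely often on the limit cycle.

```python
# Toy single-robot model on a 3x3 grid with one forbidden cell.
HEADINGS = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
TURN_RIGHT = {"N": "E", "E": "S", "S": "W", "W": "N"}
ROWS, COLS = 3, 3
FORBIDDEN = {(1, 1)}  # the center cell may never be visited

def policy(state):
    r, c, h = state
    dr, dc = HEADINGS[h]
    nr, nc = r + dr, c + dc
    # go forward if the next cell is on the grid and allowed, else turn right
    if 0 <= nr < ROWS and 0 <= nc < COLS and (nr, nc) not in FORBIDDEN:
        return "forward"
    return "turn"

def step(state):
    r, c, h = state
    if policy(state) == "forward":
        dr, dc = HEADINGS[h]
        return (r + dr, c + dc, h)
    return (r, c, TURN_RIGHT[h])

def surveilled_cells(start):
    seen, order, s = {}, [], start
    while s not in seen:  # deterministic dynamics must enter a cycle
        seen[s] = len(order)
        order.append(s)
        s = step(s)
        assert (s[0], s[1]) not in FORBIDDEN, "policy entered forbidden cell"
    cycle = order[seen[s]:]  # states visited infinitely often
    return {(r, c) for r, c, _ in cycle}

print(sorted(surveilled_cells((0, 0, "E"))))
```

Here one robot's limit cycle covers all eight boundary cells while avoiding the forbidden center; the thesis's problem asks, in general, how few robots (and which memoryless policy) achieve the largest such persistently surveilled set.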

In the second part of the thesis, we study the problem of optimal sensor usage in denied environments, inside which state observations are only available by incurring an extra cost. Observations outside the denied environment are cost free. The goal is to understand the trade-off between paying to access the sensor immediately, and waiting for a free sensor use should the system exit the denied environment. We show that the analysis of this problem simplifies by recasting it as a renewal reward process, which enables us to establish conditions under which any local minimum (if it exists) is also a global minimum, thus facilitating the search for its minimizer.
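A back-of-the-envelope version of the trade-off (a toy model of our own, not the thesis's formulation) makes the tension explicit: suppose the system exits the denied environment at each step independently with probability p, waiting costs w per step (e.g. an estimation-error penalty), and a paid observation inside the region costs c. The expected cost of waiting for a free observation is then w times the mean geometric exit time, so paying immediately wins exactly when c < w / p.

```python
# Toy comparison of "pay for an observation now" vs "wait to exit the
# denied environment"; p, w, c are illustrative assumptions.
def pay_now_is_better(c, w, p):
    expected_wait_cost = w / p  # geometric exit time has mean 1/p
    return c < expected_wait_cost

print(pay_now_is_better(c=5.0, w=1.0, p=0.1))  # 5 < 10  → True
print(pay_now_is_better(c=5.0, w=1.0, p=0.5))  # 5 < 2   → False
```

The thesis studies a far richer version of this comparison, where the renewal-reward recasting yields conditions guaranteeing that any local minimum of the long-run average cost is global.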

Finally, we extend these results to the case in which the state space is discrete and the state update is governed by a Markov chain. We establish conditions on the initial distribution of the Markov chain that guarantee that any local minimum, if it exists, is also global. In particular, these conditions rely on constraining the radial stochastic order of an auxiliary Markov process. By analyzing these problems, we hope to provide valuable insights into the core of some of the challenges that may arise in real-world implementations.
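To give a flavor of the kind of condition involved, the sketch below checks the usual stochastic order between two pmfs on a linearly ordered finite state space: q dominates r when every upper tail of q is at least that of r. This is only a simplified stand-in of our own; the thesis's condition concerns the radial stochastic order of an auxiliary Markov process, but the tail-comparison mechanics are analogous.

```python
# Usual stochastic order on a finite ordered state space (illustrative proxy).
def stochastically_dominates(q, r):
    tail_q = tail_r = 0.0
    # sweep from the largest state down, comparing upper-tail masses
    for a, b in zip(reversed(q), reversed(r)):
        tail_q += a
        tail_r += b
        if tail_q < tail_r - 1e-12:
            return False
    return True

q = [0.1, 0.2, 0.7]  # mass pushed toward larger states
r = [0.4, 0.3, 0.3]
print(stochastically_dominates(q, r))  # True
print(stochastically_dominates(r, q))  # False
```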