Mean-field Approaches in Multi-agent Systems: Learning and Control
In many settings in physics, chemistry, biology, and sociology, when individuals (particles) interact in large collectives, they begin to behave in emergent ways. This is to say that their collective behavior is altogether different from their individual behavior.
In physics and chemistry, particles interact through various forces, and this results in the rich behavior of the phases of matter. A particularly interesting case arises in the dynamics of gaseous star formation. In models of star formation, the gases are subject to the attractive gravitational force, and perhaps to viscosity, electromagnetism, or thermal fluctuations. Depending on the initial conditions and on which additional forces the models include, a variety of interesting configurations can arise, from dense nodules of gas to swirling vortices.
In biology and sociology, these interactions (forces) can be explicitly tied to chemical or physical phenomena, as in the case of microbial chemotaxis, or they can be more abstract or virtual, as in the case of bird flocking or human pedestrian traffic. We focus on the latter cases in this work.
In collective animal or human traffic, we do not say that animals or humans are explicitly subject to physical forces that cause them to move in alignment with each other. Rather, they behave as if there were such forces. In short, we use the language and notation of physics and forces as a convenient tool to build our understanding.
We do so since natural phenomena are rich with sophisticated and adaptive behavior. Bird flocks rapidly adapt to avoid collisions, to fly around obstacles, and to confuse predators. Engineers today can only dream of building drone swarms with such plasticity.
An important question to answer is how one takes a model of interacting individuals and builds a model of a collective. Once one answers this question, another immediately follows: how do we take these models of collectives and use them to discover representations of natural phenomena? Then, can we use these models to build methods to control such phenomena, assuming suitable actuation? Once these questions are answered, our understanding of collective dynamics will improve, broadening the applications we can tackle.
In this thesis, we study collective dynamics via mean-field theory. In mean-field theory, an individual is totally anonymous, and so can be removed from a large collective, or permuted with another individual, without significantly changing the collective dynamics. More precisely, when any single individual is excluded from the empirical measure of all the individuals, the resulting empirical measures converge to the same limiting measure, termed the mean-field measure. The mean-field measure is governed by the forward Kolmogorov equation. In certain scenarios where an analogy can be drawn to particle dynamics, these forward Kolmogorov equations can be converted to compressible Euler equations. When optimal control problems are posed on the particle dynamics, in the mean-field limit we obtain a forward Kolmogorov equation coupled to a backward Hamilton-Jacobi-Bellman (-Isaacs) equation (or a stationary analogue of these). This system of equations describes the solution to the mean-field game.
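As a point of reference, the coupled forward-backward system described above can be sketched in its classical second-order form below; this is the textbook mean-field game system, and the precise generalized system studied in this thesis may differ in its Hamiltonian, couplings, and boundary conditions.

```latex
% Classical mean-field game system on a time horizon [0,T]:
% a backward Hamilton-Jacobi-Bellman equation for the value function u,
% coupled to a forward Kolmogorov (Fokker-Planck) equation for the
% mean-field measure m.
\begin{align*}
  -\partial_t u - \nu \Delta u + H(x, \nabla u) &= F(x, m), \\
  \partial_t m - \nu \Delta m - \operatorname{div}\!\big(m\, D_p H(x, \nabla u)\big) &= 0, \\
  u(T, x) = G(x, m(T)), \qquad m(0) &= m_0,
\end{align*}
% where H is the Hamiltonian, F and G are running and terminal couplings,
% and nu >= 0 is the diffusion coefficient (nu = 0 in the first-order case).
```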
The first two problems we explore in this thesis focus on the system identification (inverse) problem: discover a model of collective dynamics from data. In these problems, we study a generalized hydrodynamic Cucker-Smale-type model of flocking in a bounded region of 3D space. We first prove existence of weak bounded-energy solutions and a weak-strong uniqueness principle for our model. Then, we use the model to learn a representation of the dynamics of data associated with a synthetic bird flock.
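For orientation, the original particle-level Cucker-Smale model, whose hydrodynamic limit underlies models of the type studied here, can be written as below; the particular kernel and any additional forcing terms in the generalized model of this thesis may differ.

```latex
% Cucker-Smale particle model: N agents with positions x_i and
% velocities v_i align their velocities through a communication
% weight psi that decays with distance.
\begin{align*}
  \dot{x}_i &= v_i, \\
  \dot{v}_i &= \frac{1}{N} \sum_{j=1}^{N} \psi\big(|x_j - x_i|\big)\,(v_j - v_i),
  \qquad i = 1, \dots, N,
\end{align*}
% with, e.g., the standard communication weight
% psi(r) = K / (1 + r^2)^beta,  K > 0,  beta >= 0.
```

In the mean-field (hydrodynamic) limit, this particle system gives rise to coupled equations for a local density and a local mean velocity, which is the setting in which the existence and weak-strong uniqueness results above are proved.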
The last two problems we study focus on the control (forward) problem: learn an approximately optimal control for collective dynamics online. We study this first in a relatively simple state- and control-constrained mean-field game of traffic, in which the mean-field term appears only in the game's cost. We first numerically study a finite-horizon version of this problem; this approach is not online. Then, we take an infinite-horizon version, and from the exact dynamic programming PDEs we form a system of approximate dynamic programming ODE-PDEs. This approach yields online learning, adapting the control to the dynamics as they evolve. We prove this ODE-PDE system has a unique weak solution via semigroup and successive approximation methods. We present a numerical example and discuss the tradeoffs of this approach.
We conclude the thesis by summarizing our results, and discussing future directions and applications in theoretical and practical settings.