Learning in Large Multi-Agent Systems

Date

2024

Abstract

In this dissertation, we study a framework of large-scale multi-agent strategic interactions. The agents are nondescript and use a learning rule to repeatedly revise their strategies based on their payoffs. Within this setting, our results are structured around three main themes: (i) guaranteed learning of Nash equilibria, (ii) the inverse problem, i.e., estimating the payoff mechanism from the agents' strategy choices, and (iii) applications to the placement of electric vehicle charging stations.
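For readers unfamiliar with this framework, a minimal sketch of the standard population-games formulation it builds on may help; the notation below (state x, payoff mechanism p, learning rule rho) is ours, not taken from the abstract:

```latex
% Population state: x on the simplex, with x_i the share of agents using strategy i.
% Payoff mechanism: p maps the state x to a payoff vector p(x).
% Learning rule (revision protocol): rho_{ij}(p, x) is the rate of switching from i to j.
\begin{align}
  \dot{x}_i &= \sum_{j} x_j\, \rho_{ji}\big(p(x), x\big)
               - x_i \sum_{j} \rho_{ij}\big(p(x), x\big)
               && \text{(mean dynamic)} \\
  x^\ast \text{ is Nash}
            &\iff \big( x_i^\ast > 0 \;\Rightarrow\; p_i(x^\ast) \ge p_j(x^\ast)
               \text{ for all } j \big)
               && \text{(Nash equilibrium of } p\text{)}
\end{align}
```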

In the traditional setup, the agents' inter-revision times are independent and identically distributed exponential random variables. We expand on this by allowing these intervals to depend on the agents' strategies or to have Erlang distributions. These extensions enhance the framework's modeling capabilities, enabling it to address problems such as task allocation with varying service times or multiple stages. We also explore a third generalization, concerning accessibility among strategies. Most of the existing literature assumes that the agents can transition between any two strategies, whereas we allow only certain alternatives to be accessible from certain others. This adjustment further broadens the framework's modeling capabilities, for example by incorporating constraints on strategy switching that stem from spatial or informational factors. For all of these extensions, we use Lyapunov's method and passivity-based techniques to find conditions on the revision rates, the learning rule, and the payoff mechanism that ensure the agents learn to play a Nash equilibrium of the payoff mechanism.
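As a rough illustration of the kind of revision process these extensions cover (not the dissertation's exact model), the sketch below simulates a finite population in which inter-revision times are Erlang distributed with strategy-dependent rates, an accessibility matrix restricts which strategies can be reached from which, and switches follow a Smith-type rule; the payoff mechanism and all numerical values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_strategies = 1000, 3

# Hypothetical linear payoff mechanism p(x) = F x + b (congestion-style:
# a strategy's payoff decreases as it becomes more crowded).
F = np.diag([-2.0, -1.0, -3.0])
b = np.array([1.0, 0.8, 1.5])

def payoff(x):
    return F @ x + b

# Strategy-dependent Erlang(k, rate) inter-revision times (all values assumed).
erlang_k = 2
rates = np.array([1.0, 1.5, 0.8])

# Accessibility: from strategy i, only strategies j with access[i, j] True are reachable.
access = np.array([[True,  True,  False],
                   [False, True,  True ],
                   [True,  False, True ]])

strategies = rng.integers(n_strategies, size=n_agents)
clocks = rng.gamma(erlang_k, 1.0 / rates[strategies])   # next revision time of each agent

T = 50.0
while True:
    i = int(np.argmin(clocks))                           # agent with the earliest revision time
    t = clocks[i]
    if t >= T:
        break
    x = np.bincount(strategies, minlength=n_strategies) / n_agents
    p, s = payoff(x), strategies[i]
    # Smith-type rule: switch only to accessible strategies, with probability
    # proportional to the positive part of the payoff gain.
    gains = np.where(access[s], np.maximum(p - p[s], 0.0), 0.0)
    total = gains.sum()
    if total > 0 and rng.random() < min(total, 1.0):
        strategies[i] = rng.choice(n_strategies, p=gains / total)
    # Draw the next Erlang inter-revision time for the agent's (possibly new) strategy.
    clocks[i] = t + rng.gamma(erlang_k, 1.0 / rates[strategies[i]])

print("final strategy shares:", np.bincount(strategies, minlength=n_strategies) / n_agents)
```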

For our second class of problems, we adopt a multi-agent inverse reinforcement learning perspective. Here, we assume that the learning rule is known but, unlike in existing work, the payoff mechanism is unknown. We propose a method to estimate the unknown payoff mechanism from sample-path observations of the populations' strategy profile. Our approach is twofold: first, we estimate the agents' strategy-transitioning probabilities; then, we use these estimates, together with the known learning rule, to obtain an estimate of the payoff mechanism. Our results on estimating the transitioning probabilities are general, while for the second step we focus on linear payoff mechanisms and three well-known learning rules (Smith, replicator, and Brown-von Neumann-Nash). Additionally, under certain assumptions, we show that the payoff mechanism estimate can be used to predict the Nash equilibria of the unknown mechanism and to forecast the strategy profile induced by other learning rules.
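To make the second step more concrete, here is a hypothetical least-squares sketch for the Smith rule and a linear payoff mechanism p(x) = Fx + b. It assumes the first step has already produced switch-rate estimates at each observed state; the function name and interface are ours, not the dissertation's. Under the Smith rule, a strictly positive i-to-j switch rate equals the payoff gain p_j(x) - p_i(x), which is linear in (F, b), so those entries can be regressed on the observed states.

```python
import numpy as np

def estimate_linear_payoff(states, rates):
    """Least-squares estimate of a linear payoff mechanism p(x) = F x + b
    from estimated Smith-rule switch rates (illustrative sketch).

    states : (T, n) array of observed strategy profiles x_t (simplex vectors).
    rates  : (T, n, n) array, rates[t, i, j] = estimated i -> j switch rate at x_t.

    Note: payoffs are only identified up to an additive shift common to all
    strategies, so (F, b) is recovered up to that ambiguity.
    """
    T, n = states.shape
    rows, targets = [], []
    for t in range(T):
        for i in range(n):
            for j in range(n):
                if i != j and rates[t, i, j] > 1e-8:
                    # Positive Smith rate: rates[t, i, j] = (F x_t + b)_j - (F x_t + b)_i.
                    row = np.zeros(n * n + n)
                    row[j * n:(j + 1) * n] += states[t]   # + F_{j,:} x_t
                    row[i * n:(i + 1) * n] -= states[t]   # - F_{i,:} x_t
                    row[n * n + j] += 1.0                 # + b_j
                    row[n * n + i] -= 1.0                 # - b_i
                    rows.append(row)
                    targets.append(rates[t, i, j])
    theta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    F_hat = theta[:n * n].reshape(n, n)
    b_hat = theta[n * n:]
    return F_hat, b_hat
```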

Lastly, we contribute to a traffic simulation tool by integrating electric vehicles, their charging behaviors, and charging stations. This simulation tool is based on spatial-queueing principles and, although less detailed than some microscopic simulators, it runs much faster while accurately representing traffic rules. Using this tool, we identify charging station locations on real roadway networks that minimize overall traffic congestion.
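For intuition only, the following sketch shows one simple way a siting search of this kind could be set up. The brute-force objective below (access time plus an M/M/1-style waiting time at the nearest station) is a toy proxy, not the spatial-queueing simulation used in the dissertation, and the function, its parameters, and all numbers are illustrative assumptions.

```python
import itertools
import numpy as np

def best_station_placement(travel_time, demand, candidates, k, service_rate=10.0):
    """Brute-force search over k charging-station locations (toy proxy objective).

    travel_time  : (n, n) matrix of travel times between network nodes.
    demand       : (n,) charging demand originating at each node (vehicles/hour).
    candidates   : indices of nodes where a station may be placed.
    k            : number of stations to place.
    service_rate : assumed service rate of each station (vehicles/hour).
    """
    best_cost, best_sites = np.inf, None
    for combo in itertools.combinations(candidates, k):
        sites = np.asarray(combo)
        nearest = np.argmin(travel_time[:, sites], axis=1)        # nearest station per node
        load = np.bincount(nearest, weights=demand, minlength=k)  # demand routed to each station
        if np.any(load >= service_rate):                          # unstable queue: skip placement
            continue
        wait = 1.0 / (service_rate - load)                        # M/M/1 mean waiting time
        access = travel_time[np.arange(len(demand)), sites[nearest]]
        cost = float(np.sum(demand * (access + wait[nearest])))
        if cost < best_cost:
            best_cost, best_sites = cost, combo
    return best_sites, best_cost
```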
