Agent Modeling in Stochastic Repeated Games
Cheng, Kan Leung
Nau, Dana S.
MetadataShow full item record
There are many situations in which two or more agents (e.g., human or computer decision makers) interact with each other repeatedly in settings that can be modeled as repeated games. In such situations, there is evidence that agents sometimes deviate greatly from what conventional game theory would predict. There are several reasons why this might happen, one of which is the focus of this dissertation: sometimes an agent's preferences may involve not only its own payoff (as specified in the payoff matrix), but also the payoffs of the other agent(s). In such situations, it is important to be able to understand what an agent's preferences really are, and how those preferences may affect the agent's behavior. This dissertation studies how the notion of Social Value Orientation (SVO), a construct in social psychology to model and measure a person's social preference, can be used to improve our understanding of the behavior of computer agents. Most of the work involves the life game, a repeated game in which the stage game is chosen stochastically at each iteration. The work includes the following results: * Using a combination of the SVO theory and evolutionary game theory, we studied how an agent's SVO affects its behavior in Iterated Prisoner's Dilemma (IPD). Our analysis provides a way to predict outcomes of agents playing IPD given their SVO values. * In the life game, we developed a way to build a model of agent's SVO based on observations of its behavior. Experimental results demonstrate that the modeling technique works well. * We performed experiments showing that the measured social preferences of computer agents have significant correlation with that of their human designers. The experimental results also show that knowing the SVO of an agent's human designer can be used to improve the performance of other agents that interact with the given agent. * A limitation of the SVO model is that it only looks at agents' preferences in one-shot games. This is inadequate for repeated games, in which an agent's actions may depend on both its SVO and whatever predictions it makes of the other agent's behavior. We have developed an extension of the SVO construct called the behavioral signature, a model of how an agent's behavior over time will be affected by both its own SVO and the other agent's SVO. The experimental results show that the behavioral signature is an effective way to generalize SVO to situations where agents interact repeatedly.