A Distributed Algorithm for Solving a Class of Multi-agent Markov Decision Problems
Date
2003
Authors
Chang, Hyeong Soo
Fu, Michael C.
Advisor
Fu, Michael C.
Abstract
We consider a class of infinite horizon Markov decision processes (MDPs) with multiple decision makers, called agents, and a general joint reward structure, but a special decomposable state/action structure such that each individual agent's actions affect the system's state transitions independently of the actions of all other agents. We introduce the concept of "localization," where each agent need only consider a "local" MDP defined on its own state and action spaces. Based on this localization concept, we propose an iterative distributed algorithm that emulates gradient ascent and converges to a locally optimal solution for the average reward case. The solution is an "autonomous" joint policy in which each agent's action is based only on its local state. Finally, we discuss the implications of the localization concept for discounted reward problems.
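The localization idea can be illustrated with a small Python sketch. It is not the algorithm from the paper: it swaps in a simple alternating best-response (coordinate ascent) scheme in place of the gradient-ascent-like iteration, and it rests on several assumptions that the abstract does not state. It assumes two agents, a joint transition law that factors into per-agent kernels `P[i][a, s, s']`, local chains that are ergodic under every policy, deterministic autonomous policies, and local MDPs small enough to solve by enumeration. All function names and array layouts are hypothetical.

```python
import itertools
import numpy as np

def stationary(P):
    """Stationary distribution of an ergodic row-stochastic matrix P."""
    n = P.shape[0]
    # Solve pi P = pi together with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.append(np.zeros(n), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]

def local_chain(P_i, policy_i):
    """Transition matrix of one agent's local state under its own policy."""
    # P_i has shape (actions, states, states); policy_i maps state -> action.
    return np.array([P_i[a, s] for s, a in enumerate(policy_i)])

def local_reward(R, other_pi, other_policy, agent):
    """Reward of the local MDP: joint reward R[s0, s1, a0, a1] averaged over
    the other agent's stationary distribution and fixed policy."""
    n_s, n_a = (R.shape[0], R.shape[2]) if agent == 0 else (R.shape[1], R.shape[3])
    r = np.zeros((n_s, n_a))
    for s in range(n_s):
        for a in range(n_a):
            for so, w in enumerate(other_pi):
                ao = other_policy[so]
                r[s, a] += w * (R[s, so, a, ao] if agent == 0 else R[so, s, ao, a])
    return r

def best_local_policy(P_i, r_i):
    """Brute-force the local average-reward MDP (assumed tiny)."""
    n_a, n_s, _ = P_i.shape
    best, best_gain = None, -np.inf
    for policy in itertools.product(range(n_a), repeat=n_s):
        pi = stationary(local_chain(P_i, policy))
        gain = pi @ r_i[np.arange(n_s), list(policy)]
        if gain > best_gain + 1e-12:
            best, best_gain = policy, gain
    return best, best_gain

def alternate(P, R, sweeps=10):
    """Agents take turns improving their own policy on their local MDP."""
    policies = [tuple([0] * P[0].shape[1]), tuple([0] * P[1].shape[1])]
    gains = []
    for _ in range(sweeps):
        for i in (0, 1):
            j = 1 - i
            pi_j = stationary(local_chain(P[j], policies[j]))
            r_i = local_reward(R, pi_j, policies[j], i)
            policies[i], gain = best_local_policy(P[i], r_i)
            gains.append(gain)  # equals the joint average reward
    return policies, gains
```

Under these assumptions the two local chains evolve independently, so the joint stationary distribution is the product of the local ones, and each agent's local gain coincides with the joint average reward. Each turn therefore cannot decrease that reward, and the iteration settles at a joint policy that neither agent can improve alone, which is a local rather than a global optimum.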