A Distributed Algorithm for Solving a Class of Multi-agent Markov Decision Problems

Files

TR_2003-25.pdf (190.97 KB)

Date

2003

Abstract

We consider a class of infinite-horizon Markov decision processes (MDPs) with multiple decision makers, called agents, and a general joint reward structure, but a special decomposable state/action structure such that each individual agent's actions affect the system's state transitions independently of the actions of all other agents. We introduce the concept of "localization," where each agent need only consider a "local" MDP defined on its own state and action spaces. Based on this localization concept, we propose an iterative distributed algorithm that emulates gradient ascent and converges to a locally optimal solution for the average reward case. The solution is an "autonomous" joint policy such that each agent's action is based only on its local state. Finally, we discuss the implication of the localization concept for discounted reward problems.
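
The decomposable structure described in the abstract can be illustrated with a small sketch. It is not the report's algorithm: it only evaluates the average reward of one fixed autonomous joint policy for two agents. It assumes that "independent" transitions mean the joint transition probability factors into a product of local ones, and that each local chain is ergodic, so the joint stationary distribution is the product of the local stationary distributions. All names (`stationary`, `local_chain`, `average_reward`) and all numbers are hypothetical.

```python
from itertools import product


def stationary(T, iters=2000):
    """Approximate the stationary distribution of a row-stochastic
    matrix T by power iteration (assumes the chain is ergodic)."""
    n = len(T)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[s] * T[s][t] for s in range(n)) for t in range(n)]
    return d


def local_chain(P, policy):
    """Transition matrix of one agent's local chain under its local policy.
    P[a][s][t] is the probability of moving from local state s to t
    when the agent takes local action a; policy[s] is the action in s."""
    return [P[policy[s]][s] for s in range(len(policy))]


def average_reward(P1, P2, pi1, pi2, reward):
    """Long-run average of a joint reward under an autonomous joint policy.
    Assumption: transitions factor across agents, so the joint stationary
    distribution is the product of the two local ones."""
    d1 = stationary(local_chain(P1, pi1))
    d2 = stationary(local_chain(P2, pi2))
    return sum(
        d1[s1] * d2[s2] * reward(s1, s2, pi1[s1], pi2[s2])
        for s1, s2 in product(range(len(d1)), range(len(d2)))
    )


# Hypothetical local dynamics: 2 local states and 2 local actions per agent.
P = [
    [[0.9, 0.1], [0.5, 0.5]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
]


def joint_reward(s1, s2, a1, a2):
    # The reward couples the agents: it pays only when both are in state 1.
    return 1.0 if s1 == 1 and s2 == 1 else 0.0


value = average_reward(P, P, [1, 1], [1, 1], joint_reward)
```

Although the reward couples the two agents, each agent's stationary distribution depends only on its own local policy, which is what makes a per-agent "local" view possible under these assumptions.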
