Multi-time Scale Markov Decision Processes

Chang, Hyeong Soo; Fard, Pedram; Marcus, Steven I.; Shayman, Mark

dc.contributor.advisor	Marcus, Steven I.	en_US
dc.contributor.advisor	Shayman, Mark	en_US
dc.contributor.author	Chang, Hyeong Soo	en_US
dc.contributor.author	Fard, Pedram	en_US
dc.contributor.author	Marcus, Steven I.	en_US
dc.contributor.author	Shayman, Mark	en_US
dc.contributor.department	ISR	en_US
dc.date.accessioned	2007-05-23T10:11:51Z
dc.date.available	2007-05-23T10:11:51Z
dc.date.issued	2002	en_US
dc.description.abstract	This paper proposes a simple analytical model called the M time-scale Markov Decision Process (MMDP) for hierarchically structured sequential decision-making processes, where decisions at each level of the M-level hierarchy are made on M different time-scales. In this model, the state space and the control space of each level in the hierarchy do not overlap with those of the other levels, and the hierarchy is structured in a "pyramid" sense: a decision made at level m (slower time-scale) and/or the level-m state will affect the evolutionary decision-making process of the lower level m+1 (faster time-scale) until a new decision is made at the higher level, but the lower-level decisions themselves do not affect the higher level's transition dynamics. The performance produced by the lower level's decisions will affect the higher level's decisions. A hierarchical objective function is defined such that the single-step reward for the level-m decision-making process is the finite-horizon value of following a (nonstationary) policy at level m+1 over one decision epoch of level m, plus an immediate reward at level m. From this we define a "multi-level optimal value function" and derive a "multi-level optimality equation." We discuss how to solve MMDPs exactly or approximately and also study heuristic on-line methods for solving MMDPs. Finally, we give some example control problems that can be modeled as MMDPs.	en_US
dc.format.extent	447130 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1903/6259
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	ISR; TR 2002-6	en_US
dc.subject	Global Communication Systems	en_US
dc.title	Multi-time Scale Markov Decision Processes	en_US
dc.type	Technical Report	en_US
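
The hierarchical objective described in the abstract can be illustrated for two levels with a minimal sketch. It is not taken from the report: the function names, the toy two-state lower-level model, the undiscounted sum, and the deterministic transitions are all assumptions made for illustration. The sketch evaluates a nonstationary lower-level policy over one upper-level decision epoch by backward induction, then adds the upper level's immediate reward to obtain the upper level's single-step reward.

```python
def lower_level_value(P, R, policy, start_state):
    """Finite-horizon value of a nonstationary lower-level policy.

    P[s][a] maps next states to probabilities and R[s][a] is the reward;
    both are assumed to be the lower-level model induced by the current
    upper-level state and decision. policy[t][s] is the action taken at
    fast step t in state s. Undiscounted rewards are an assumption.
    """
    value = {s: 0.0 for s in P}  # value after the last fast step
    for rule in reversed(policy):  # backward induction over fast steps
        value = {
            s: R[s][rule[s]]
            + sum(p * value[s2] for s2, p in P[s][rule[s]].items())
            for s in P
        }
    return value[start_state]


def upper_single_step_reward(immediate_reward, P, R, policy, start_state):
    """Upper-level single-step reward: immediate reward plus lower-level value."""
    return immediate_reward + lower_level_value(P, R, policy, start_state)


# Hypothetical lower-level model with two states and two actions.
P = {
    "low": {"stay": {"low": 1.0}, "move": {"high": 1.0}},
    "high": {"stay": {"high": 1.0}, "move": {"low": 1.0}},
}
R = {
    "low": {"stay": 0.0, "move": 1.0},
    "high": {"stay": 2.0, "move": 0.0},
}
# One upper-level epoch spans three fast steps; the same rule is reused here,
# although a nonstationary policy may use a different rule at each step.
policy = [{"low": "move", "high": "stay"}] * 3

reward = upper_single_step_reward(10.0, P, R, policy, "low")
print(reward)  # 10 + (1 + 2 + 2) = 15.0
```

Passing `P` and `R` as arguments reflects the "pyramid" structure: the upper level chooses which lower-level model is in force, while the lower level influences the upper level only through the value it returns.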

Files

Original bundle

Name: TR_2002-6.pdf
Size: 436.65 KB
Format: Adobe Portable Document Format

Collections

Institute for Systems Research Technical Reports