Multi-time Scale Markov Decision Processes

dc.contributor.advisor: Marcus, Steven I. [en_US]
dc.contributor.advisor: Shayman, Mark [en_US]
dc.contributor.author: Chang, Hyeong Soo [en_US]
dc.contributor.author: Fard, Pedram [en_US]
dc.contributor.author: Marcus, Steven I. [en_US]
dc.contributor.author: Shayman, Mark [en_US]
dc.contributor.department: ISR [en_US]
dc.date.accessioned: 2007-05-23T10:11:51Z
dc.date.available: 2007-05-23T10:11:51Z
dc.date.issued: 2002 [en_US]
dc.description.abstract: This paper proposes a simple analytical model called the M time-scale Markov Decision Process (MMDP) for hierarchically structured sequential decision-making processes, where decisions at each level of the M-level hierarchy are made on M different time-scales.

In this model, the state space and the control space of each level in the hierarchy do not overlap with those of the other levels. The hierarchy is structured in a "pyramid" sense: a decision made at level m (slower time-scale), and/or the state at level m, affects the evolution of the decision-making process at the lower level m+1 (faster time-scale) until a new decision is made at the higher level, whereas the lower-level decisions themselves do not affect the higher level's transition dynamics. The performance produced by the lower level's decisions does, however, affect the higher level's decisions.

A hierarchical objective function is defined such that the single-step reward for the level-m decision-making process is the finite-horizon value of following a (nonstationary) policy at level m+1 over one decision epoch of level m, plus an immediate reward at level m. From this we define a "multi-level optimal value function" and derive a "multi-level optimality equation."

We discuss how to solve MMDPs exactly or approximately and also study heuristic on-line methods for solving MMDPs. Finally, we give some example control problems that can be modeled as MMDPs. [en_US]
dc.format.extent: 447130 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/6259
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: ISR; TR 2002-6 [en_US]
dc.subject: Global Communication Systems [en_US]
dc.title: Multi-time Scale Markov Decision Processes [en_US]
dc.type: Technical Report [en_US]
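
The hierarchical objective described in the abstract can be illustrated with a minimal two-level sketch. It is not taken from the report: the function names, the array layout (`P[a][x][y]`, `r[x][a]`, `policy[t][x]`), the absence of discounting, and the toy numbers are all assumptions made for illustration. The lower-level transition probabilities and rewards are assumed to be already fixed by the current upper-level state and decision, and the sketch only evaluates a given nonstationary lower-level policy over one upper-level decision epoch; it does not solve the multi-level optimality equation.

```python
def lower_level_value(P, r, policy, horizon):
    """Finite-horizon value of a nonstationary lower-level policy.

    P[a][x][y]   : probability of moving from state x to y under action a
    r[x][a]      : lower-level reward for taking action a in state x
    policy[t][x] : action chosen at step t in state x
    (All of these names and layouts are hypothetical.)
    """
    n = len(r)
    v = [0.0] * n  # value after the last step of the epoch
    for t in reversed(range(horizon)):  # backward induction over the epoch
        v = [
            r[x][policy[t][x]]
            + sum(P[policy[t][x]][x][y] * v[y] for y in range(n))
            for x in range(n)
        ]
    return v


def upper_single_step_reward(upper_reward, P, r, policy, horizon, x0):
    """Immediate upper-level reward plus the lower-level value over one epoch,
    starting from lower-level state x0."""
    return upper_reward + lower_level_value(P, r, policy, horizon)[x0]


# Toy lower level: 2 states, 2 actions (action 0 stays, action 1 swaps states).
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
r = [[1.0, 0.0], [2.0, 0.5]]
policy = [[1, 0], [0, 0]]  # step 0: swap from state 0; afterwards always stay

values = lower_level_value(P, r, policy, horizon=2)
reward = upper_single_step_reward(0.5, P, r, policy, horizon=2, x0=0)
```

In this toy instance the upper level would see a single-step reward made up of its own immediate reward (0.5) and the value the lower-level policy accumulates over the two fast-time-scale steps of the epoch.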

Files

Original bundle
Name: TR_2002-6.pdf
Size: 436.65 KB
Format: Adobe Portable Document Format