Steering Policies for Markov Decision Processes Under a Recurrence Condition.

dc.contributor.author: Ma, Dye-Jyun
dc.contributor.author: Makowski, Armand M.
dc.contributor.department: ISR
dc.date.accessioned: 2007-05-23T09:41:20Z
dc.date.available: 2007-05-23T09:41:20Z
dc.date.issued: 1988
dc.description.abstract: This paper presents a class of adaptive policies in the context of Markov decision processes (MDPs) with long-run average performance measures. Under a recurrence condition, the proposed policy alternates between two stationary policies so as to adaptively track a sample average cost to a desired value. Direct sample path arguments are presented for investigating the convergence of sample average costs, and the performance of the adaptive policy is discussed. The obtained results are particularly useful in discussing constrained MDPs with a single constraint. Applications include a wide class of constrained MDPs with finite state space (Beutler and Ross 1985), an optimal flow control problem (Ma and Makowski 1987), and an optimal resource allocation problem (Nain and Ross 1986).
dc.format.extent: 1049169 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/4772
dc.language.iso: en_US
dc.relation.ispartofseries: ISR; TR 1988-41
dc.title: Steering Policies for Markov Decision Processes Under a Recurrence Condition.
dc.type: Technical Report
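
The abstract's steering rule can be illustrated with a minimal sketch: at each step, pick the higher-cost stationary policy when the running sample average sits below the target value, and the lower-cost one otherwise, so the average is driven toward the target. The function name, the deterministic two-policy cost model, and the target value below are illustrative assumptions, not details taken from the report.

```python
# Toy illustration of a steering policy: alternate between two
# stationary policies so the running sample-average cost tracks
# a desired target value. Cost model and names are assumptions.

def steer(n_steps, target, cost_low=0.0, cost_high=1.0):
    """Pick the high-cost policy whenever the running average
    falls below the target; otherwise pick the low-cost policy."""
    total = 0.0
    history = []
    for t in range(1, n_steps + 1):
        running_avg = total / (t - 1) if t > 1 else 0.0
        # Steering rule: push the sample average toward the target.
        cost = cost_high if running_avg < target else cost_low
        total += cost
        history.append(total / t)
    return history

avgs = steer(10_000, target=0.3)
print(abs(avgs[-1] - 0.3) < 0.01)  # the sample average tracks the target
```

In this deterministic setting the deviation of the sample average from the target shrinks like 1/t; the report establishes the analogous tracking behavior for genuine MDPs under a recurrence condition, via direct sample path arguments.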

Files

Original bundle
Name: TR_88-41.pdf
Size: 1 MB
Format: Adobe Portable Document Format