Implementation Issues for Markov Decision Processes.
Makowski, Armand M.
MetadataShow full item record
In this paper, the problem of steering a long-run average coat functional to a prespecified value is discussed in the context of Markov decision processes wish countable statespace; this problem naturally arises in the study of constrained Markov decision processes by Lagrangian arguments. Under reasonable assumptions, a Markov stationary steering control is shown to exist, and to be obtained by fixed memoryless randomization between two Markov stationary policies. The implementability of this randomized policy is investigated in view of the fact that the randomization bias is solution to a (highly) nonlinear equation, which may not even be available in the absence of full knowledge of the model parameter values. Several proposals for implementation are made and their relative properties discussed. The paper closes with an outline of a methodology that was found useful in investigating properties of Certainty Equivalence implementations.