Stochastic Approximation and Optimization for Markov Chains
Bartusek, John D.
Makowski, Armand M.
MetadataПоказать полную информацию
We study the convergence properties of the projected stochasticapproximation (SA) algorithm which may be used to find the root of an unknown steady state function of a parameterized family of Markov chains. The analysis is based on the ODE Method and we develop a set of application-oriented conditions which imply almost sure convergence and are verifiable in terms of typically available model data. Specific results are obtained for geometrically ergodic Markov chains satisfying a uniform Foster-Lyapunov drift inequality.<p>Stochastic optimization is a direct application of the above root finding problem if the SA is driven by a gradient estimate of steady state performance. We study the convergence properties of an SA driven by agradient estimator which observes an increasing number of samples from the Markov chain at each step of the SA's recursion. To show almost sure convergence to the optimizer, a framework of verifiable conditions is introduced which builds on the general SA conditions proposed for the root finding problem.<p>We also consider a difficulty sometimes encountered in applicationswhen selecting the set used in the projection operator of the SA algorithm.Suppose there exists a well-behaved positive recurrent region of the state process parameter space where the convergence conditions are satisfied; this being the ideal set to project on. Unfortunately, the boundaries of this projection set are not known a priori when implementing the SA. Therefore, we consider the convergence properties when the projection set is chosen to include regions outside the well-behaved region. Specifically, we consider an SA applied to an M/M/1 which adjusts the service rate parameter when the projection set includes parameters that cause the queue to be transient.<p>Finally, we consider an alternative SA where the recursion is driven by a sample average of observations. We develop conditions implying convergence for this algorithm which are based on a uniform large deviation upper bound and we present specialized conditions implyingthis property for finite state Markov chains.