
Accession Number : AD0742761
Title : Initially Stationary epsilon-Optimal Policies in Continuous Time Markov Decision Chains.
Descriptive Note : Technical rept.,
Corporate Author : STANFORD UNIV CALIF DEPT OF OPERATIONS RESEARCH
Personal Author(s) : Lembersky,Mark Raphael
Report Date : 20 APR 1972
Pagination or Media Count : 72
Abstract : The asymptotic behavior of continuous time parameter Markov decision chains is studied. It is shown that the maximal total expected t-period reward, less t times the maximal long-run average return rate, converges as t approaches infinity for every initial state. This result is used to establish the existence of policies which are simultaneously epsilon-optimal for all process durations, and which are stationary except possibly for a final, finite segment. Further, the length of the final segment depends on epsilon, but not on t for large enough t, while the initial stationary part of the policy is independent of both epsilon and t. The decision rules comprising the initially stationary part of these policies, called preferred, are characterized. Finite algorithms for finding preferred decision rules are given under varying hypotheses on the underlying structure of the system, though the general case remains unsolved. (Author)
Descriptors : (*DECISION THEORY, STOCHASTIC PROCESSES), DYNAMIC PROGRAMMING, SET THEORY, MATRICES(MATHEMATICS), MEASURE THEORY, OPTIMIZATION, THEOREMS
Subject Categories : Operations Research
Distribution Statement : APPROVED FOR PUBLIC RELEASE