
Accession Number : AD0618406
Title : PERTURBATION THEORY AND MARKOVIAN DECISION PROCESSES.
Descriptive Note : Technical rept.,
Corporate Author : MASSACHUSETTS INST OF TECH CAMBRIDGE OPERATIONS RESEARCH CENTER
Personal Author(s) : Schweitzer,Paul J.
Report Date : JUN 1965
Pagination or Media Count : 318
Abstract : The HowardJewell algorithm for programming over a Markovrenewal process is analyzed in terms of a perturbation theory formalism which describes how the stationary distribution changes when the transition probabilities change. The policy improvement technique is derived from this new viewpoint. The relative values may be interpreted as partial derivatives of the gain rate with respect to policy. The value equations are shown to be solvable, with the relative values unique up to one additive constant, if and only if the underlying Markov chain is irreducible. The policy iteration algorithm is shown not to cycle, this guaranteeing convergence. A discussion of the existence, uniqueness, and characterization of the solution to the functional equation of dynamic programming is given. Emphasis is placed upon the value maximization of transient states. The fundamental matrix is developed as a useful tool for doing perturbation theory, describing firstpassage properties of semiMarkov processes, and for dealing with semiMarkov processes with rewards. (Author)
Descriptors : (*PERTURBATION THEORY, STATISTICAL PROCESSES), (*STATISTICAL PROCESSES, PERTURBATION THEORY), (*DYNAMIC PROGRAMMING, STATISTICAL PROCESSES), OPERATIONS RESEARCH, PROBABILITY, MATRICES(MATHEMATICS), ITERATIONS
Distribution Statement : APPROVED FOR PUBLIC RELEASE