
Accession Number : AD0714833
Title : Maximal Average-Reward Policies for a Class of Semi-Markov Decision Processes with Arbitrary State and Action Space
Corporate Author : CALIFORNIA UNIV LOS ANGELES WESTERN MANAGEMENT SCIENCE INST
Personal Author(s) : Lippman, Steven A.
Report Date : OCT 1970
Pagination or Media Count : 31
Abstract : The report discusses the problem of maximizing the long-run average (also the long-run average expected) reward per unit time in a Semi-Markov Decision Process with arbitrary state and action space. The main result states that one need only consider the set of stationary policies, in that for each epsilon > 0 there is a stationary policy which is epsilon-optimal. This result is derived under the assumptions that (roughly) expected rewards and expected transition times are uniformly bounded over all states and actions, and that there is a state such that the expected length of time until the system returns to this state is uniformly bounded over all policies. The existence of an optimal stationary policy is established under the additional assumption of countable state and finite action space. Applications to queueing reward systems are given. (Author)
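Illustrative note (not part of the report): the long-run average reward per unit time of one fixed stationary policy can be sketched for a small finite-state case. The sketch assumes the embedded chain has a single aperiodic recurrent class, so that the standard renewal-reward ratio applies: expected reward per transition divided by expected time per transition, both weighted by the stationary distribution of the embedded chain. The names `P`, `r`, `tau`, `stationary_distribution`, and `average_reward`, and all numbers, are hypothetical; the report itself treats arbitrary state and action spaces, which this sketch does not cover.

```python
def stationary_distribution(P, iterations=500):
    """Approximate the stationary distribution of the embedded chain
    by repeated multiplication (assumes one aperiodic recurrent class)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def average_reward(P, r, tau):
    """Long-run average reward per unit time under a fixed stationary policy.

    P[i][j]: transition probability of the embedded chain under the policy
    r[i]:    expected reward earned per visit to state i
    tau[i]:  expected time spent per visit to state i
    """
    pi = stationary_distribution(P)
    reward_per_transition = sum(p * x for p, x in zip(pi, r))
    time_per_transition = sum(p * t for p, t in zip(pi, tau))
    return reward_per_transition / time_per_transition


# Hypothetical two-state example (values chosen only for illustration).
P = [[0.5, 0.5], [1.0, 0.0]]
r = [3.0, 0.0]
tau = [1.0, 2.0]
gain = average_reward(P, r, tau)
print(gain)
```

Comparing `gain` across a finite set of candidate stationary policies would give the best of those candidates only; it does not by itself establish epsilon-optimality in the sense of the abstract.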
Descriptors : (*QUEUEING THEORY, STOCHASTIC PROCESSES), (*STOCHASTIC PROCESSES, DECISION THEORY), SET THEORY, PROBABILITY DENSITY FUNCTIONS, DYNAMIC PROGRAMMING, RANDOM VARIABLES, OPTIMIZATION
Subject Categories : Operations Research
Distribution Statement : APPROVED FOR PUBLIC RELEASE