Accession Number : AD0714833

Title : Maximal Average-Reward Policies for a Class of Semi-Markov Decision Processes with Arbitrary State and Action Space

Corporate Author : CALIFORNIA UNIV LOS ANGELES WESTERN MANAGEMENT SCIENCE INST

Personal Author(s) : Lippman,Steven A.

Report Date : OCT 1970

Pagination or Media Count : 31

Abstract : The report addresses the problem of maximizing the long-run average reward per unit time (and likewise the long-run average expected reward per unit time) in a Semi-Markov Decision Process with arbitrary state and action spaces. The main result is that one need only consider stationary policies: for each epsilon > 0 there is a stationary policy that is epsilon-optimal. This result is derived under the assumptions that (roughly) expected rewards and expected transition times are uniformly bounded over all states and actions, and that there is a state for which the expected time until the system returns to it is uniformly bounded over all policies. The existence of an optimal stationary policy is established under the additional assumption of a countable state space and a finite action space. Applications to queueing reward systems are given. (Author)
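
Illustrative note (not part of the report): the criterion in the abstract, long-run average reward per unit time under a stationary policy, can be made concrete in a far simpler setting than the report treats. The sketch below assumes a finite state space and an embedded Markov chain that is irreducible and aperiodic; under those assumptions the renewal-reward ratio (stationary-weighted expected reward divided by stationary-weighted expected transition time) gives the average reward per unit time. All names and numbers (stationary_distribution, average_reward_rate, the two-state policies A and B) are hypothetical and chosen only for illustration.

```python
def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of the embedded chain P
    by power iteration (assumes P is irreducible and aperiodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def average_reward_rate(P, r, tau):
    """Long-run average reward per unit time for one stationary policy.

    P[i][j] : transition probability of the embedded chain under the policy
    r[i]    : expected reward earned per visit to state i
    tau[i]  : expected time spent per visit to state i
    """
    pi = stationary_distribution(P)
    reward_per_step = sum(p * x for p, x in zip(pi, r))
    time_per_step = sum(p * t for p, t in zip(pi, tau))
    return reward_per_step / time_per_step


# Two hypothetical stationary policies on a two-state system.
rate_a = average_reward_rate(
    P=[[0.5, 0.5], [0.2, 0.8]], r=[4.0, 1.0], tau=[2.0, 1.0]
)
rate_b = average_reward_rate(
    P=[[0.9, 0.1], [0.2, 0.8]], r=[3.0, 1.0], tau=[1.0, 1.0]
)
better = "A" if rate_a > rate_b else "B"
```

With these made-up inputs, policy A has stationary distribution (2/7, 5/7) and rate 13/9, while policy B has stationary distribution (2/3, 1/3) and rate 7/3, so B is preferred. The report's result concerns arbitrary state and action spaces, where no such finite computation is available; this sketch only illustrates the quantity being maximized.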

Descriptors : (*QUEUEING THEORY, STOCHASTIC PROCESSES), (*STOCHASTIC PROCESSES, DECISION THEORY), SET THEORY, PROBABILITY DENSITY FUNCTIONS, DYNAMIC PROGRAMMING, RANDOM VARIABLES, OPTIMIZATION

Subject Categories : Operations Research

Distribution Statement : APPROVED FOR PUBLIC RELEASE