
Accession Number : AD0644430
Title : ON THE NARMED BANDIT AND OTHER PROBLEMS INVOLVING SEQUENTIAL DECISIONS.
Descriptive Note : Technical rept.,
Corporate Author : MASSACHUSETTS INST OF TECH CAMBRIDGE OPERATIONS RESEARCH CENTER
Personal Author(s) : Obregon,Ivan
Report Date : OCT 1966
Pagination or Media Count : 155
Abstract : The Narmed bandit problem can be defined as the problem faced by a decision maker who is allowed a given number of opportunities to sample from N populations which generate random variables whose distributions are not known with certainty. If he wants to maximize his expected payoff, what should his strategy be. The paper gives a general formulation which is later specialized to the case where the N random variables have binary distributions. A numerical solution is given and some properties of the optimal strategy are discussed for both finite and infinite processes; finally some near optimal rules are proposed and evaluated. In a second part, the so called actiontiming problem and some variants of it are studied. Models with known probability distributions are considered first and then this assumption is relaxed to allow for imperfectly known distributions with adaptive learning built into the policy. (Author)
Descriptors : (*GAME THEORY, DECISION MAKING), (*DECISION MAKING, UNCERTAINTY), DYNAMIC PROGRAMMING, SEQUENTIAL ANALYSIS, RANDOM VARIABLES, SEQUENCES(MATHEMATICS), OPERATIONS RESEARCH
Subject Categories : Operations Research
Distribution Statement : APPROVED FOR PUBLIC RELEASE