
Accession Number : ADA337740
Title : Issues in Scaling Up Machine Learning
Descriptive Note : Final rept. 1 Sep 93-31 Aug 96
Corporate Author : CALIFORNIA UNIV IRVINE
Personal Author(s) : Pazzani, Michael J.
PDF Url : ADA337740
Report Date : 13 MAR 1997
Pagination or Media Count : 4
Abstract : This grant investigates issues in improving the accuracy of machine learning systems. The classic machine learning paradigm for prediction has been to learn a set of decision structures or models from a training set and select one for prediction on unseen test data. Rather than select a single model from the set, the focus of this project's research has been to combine the predictions of the learned models to form an improved estimate. The two fronts of this research are regression and classification. In the realm of regression, the task is to predict a single continuous value for an example. The majority of research in this area has focused on simple linear combinations of the learned models, in which each model's prediction is multiplied by a weight. These weights may range from highly regularized to completely unconstrained. A set of weights is considered highly regularized if they are all positive, they sum to one, or they are uniform. Completely unconstrained weights have no restrictions and may be derived by methods like ordinary least squares regression. The degree of regularization required depends on the particular regression problem. The project has developed a technique called PCR*, which automatically estimates the appropriate degree of regularization for a given data set. The basic idea is to use the eigenstructure of the model predictions on the training data to derive a continuum of possible weight sets ranging from highly regularized to completely unconstrained. Cross validation is used to estimate which weight set is most appropriate.
Descriptors : *LEARNING MACHINES, *REGRESSION ANALYSIS, *ARTIFICIAL INTELLIGENCE, DATA BASES, ALGORITHMS, NEURAL NETS, DECISION MAKING, COMPUTER LOGIC, EIGENVECTORS, LEAST SQUARES METHOD.
Subject Categories : Cybernetics
Distribution Statement : APPROVED FOR PUBLIC RELEASE
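
Illustrative note : The abstract describes deriving a continuum of weight sets from the eigenstructure of the model predictions and choosing among them by cross validation. The following is a minimal sketch of that general idea, not the report's actual procedure. It assumes a principal-components-regression reading of the abstract, no intercept or centering, and at least as many training examples as models. The names `pcr_weight_sets`, `choose_weights`, `F`, and `y` are hypothetical and do not come from the report.

```python
import numpy as np


def pcr_weight_sets(F, y):
    """Return one combining-weight vector per number of retained components.

    F: (n_examples, n_models) predictions of the learned models.
    y: (n_examples,) target values.
    Assumption: n_examples >= n_models, no intercept term.
    """
    # Eigenstructure of the predictions via SVD: F = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    # Coefficient of y on each component; near-null directions get zero.
    safe = s > 1e-10 * s[0]
    gamma = np.zeros_like(s)
    gamma[safe] = (U.T @ y)[safe] / s[safe]
    # Keeping the first k components gives weights V[:, :k] @ gamma[:k].
    # Small k constrains the weights most; k = n_models recovers the
    # ordinary least squares solution when F has full column rank.
    return [Vt[:k].T @ gamma[:k] for k in range(1, len(s) + 1)]


def choose_weights(F, y, n_folds=5, seed=0):
    """Pick the number of components by k-fold cross validation."""
    n, m = F.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    errors = np.zeros(m)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for i, w in enumerate(pcr_weight_sets(F[train], y[train])):
            # Accumulate squared error of the combined prediction.
            errors[i] += np.sum((F[held_out] @ w - y[held_out]) ** 2)
    best_k = int(np.argmin(errors)) + 1
    # Refit on all training data with the selected number of components.
    return pcr_weight_sets(F, y)[best_k - 1], best_k
```

Under these assumptions, a combined estimate for new examples would be `F_new @ w`, where `w` is the first value returned by `choose_weights` and `F_new` holds the individual models' predictions on those examples.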