Accession Number : ADA327515

Title : Learning Effective and Robust Knowledge for Semantic Query Optimization.

Descriptive Note : Research rept.,

Corporate Author : UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST

Personal Author(s) : Hsu, Chun-Nan

PDF Url : ADA327515

Report Date : DEC 1996

Pagination or Media Count : 151

Abstract : Optimizing queries to heterogeneous, distributed multidatabases is an important problem. Because of the complexity of the queries and the heterogeneity of the databases, conventional optimization approaches have difficulty solving it satisfactorily. Semantic Query Optimization (SQO) can complement conventional approaches to overcome the heterogeneity and considerably reduce redundant data transmission. SQO optimizers use rules about data regularities to yield significant cost reduction. However, hand coding useful rules for SQO is impracticable. This dissertation presents a machine learning approach to this knowledge bottleneck problem. Unlike the search control rules or classification rules studied extensively in machine learning, high utility rules for SQO must be learned by maximizing two roughly correlated measures. The first measure is effectiveness: effective rules must be applicable to many different queries and yield high cost reduction. The second measure is robustness against database changes: robust rules must remain valid regardless of database changes. This dissertation presents a new inductive learning approach to learning effective and robust rules. To learn effective rules, the approach considers both applicability and cost reduction during rule induction. The learned rules are robust because the learner guides induction toward robust rules by estimating the probabilities of database changes. To evaluate the utility of the learning approach, this dissertation also describes an extended SQO approach for query plans that retrieve data from heterogeneous multidatabases. The experimental results show that the learned rules produce significant savings while being robust against database changes.

Descriptors : *LEARNING MACHINES, *KNOWLEDGE BASED SYSTEMS, DATA BASES, ALGORITHMS, OPTIMIZATION, DATA MANAGEMENT, DISTRIBUTED DATA PROCESSING, COMPUTER COMMUNICATIONS, SEMANTICS, RULE BASED SYSTEMS, INFORMATION RETRIEVAL.

Subject Categories : Cybernetics

Distribution Statement : APPROVED FOR PUBLIC RELEASE