Accession Number : ADA309152

Title : Parallel Data Mining for Association Rules on Shared-Memory Multi-Processors.

Descriptive Note : Technical rept.

Corporate Author : ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE

Personal Author(s) : Zaki, M. J. ; Ogihara, M. ; Parthasarathy, S. ; Li, W.

PDF Url : ADA309152

Report Date : MAY 1996

Pagination or Media Count : 25

Abstract : Data mining is an emerging research area whose goal is to extract significant patterns or interesting rules from large databases. High-level inference from large volumes of routine business data can provide valuable information to businesses, such as customer buying patterns, shelving criteria in supermarkets, and stock trends. Many algorithms have been proposed for data mining of association rules. However, research so far has mainly focused on sequential algorithms. In this paper we present parallel algorithms for data mining of association rules and study the degree of parallelism, synchronization, and data locality issues on the SGI Power Challenge shared-memory multi-processor. We further present a set of optimizations for the sequential and parallel algorithms. Experiments show that our proposed optimizations yield a significant performance improvement. We also achieved good speed-up for the parallel algorithm, but we observe a need for parallel I/O techniques for further performance gains.

Descriptors : *DATA BASES, *PARALLEL PROCESSING, *INFORMATION RETRIEVAL, ALGORITHMS, COMMERCE, SEQUENCES, PATTERNS, ITERATIONS.

Subject Categories : Computer Programming and Software

Distribution Statement : APPROVED FOR PUBLIC RELEASE