Accession Number : ADA331110

Title : Browsing, Discovery and Search in Large Distributed Databases of Complex and Scanned Documents

Descriptive Note : Scientific technical rept. 1 Apr-30 Jun 97

Corporate Author : MASSACHUSETTS UNIV AMHERST DEPT OF COMPUTER SCIENCE

Personal Author(s) : Croft, W. B.

PDF Url : ADA331110

Report Date : 15 JUL 1997

Pagination or Media Count : 9

Abstract : This project aims to integrate powerful, new techniques for interactive browsing, discovery, and retrieval in very large, distributed databases of complex and scanned documents. Emphasis is placed on going beyond full-text retrieval techniques developed in the DARPA TIPSTER program to support different types of access and non-textual content. These techniques should be particularly relevant to the patent domain, where it is important to find relationships between documents and where the patent or trademark may be based on a visual design. The specific tasks identified involve studying representation techniques for long documents with complex structure, browsing and discovery techniques for large text databases, image retrieval and scanned document retrieval techniques, and architectures for large, distributed databases.

Descriptors : *DATA BASES, *DISTRIBUTED DATA PROCESSING, *INFORMATION RETRIEVAL, IMAGE PROCESSING, INTERACTIVE GRAPHICS, BAYES THEOREM, TEXT PROCESSING.

Subject Categories : Computer Systems

Distribution Statement : APPROVED FOR PUBLIC RELEASE