Title : AN INVESTIGATION OF THE USE OF CONTEXT IN CHARACTER RECOGNITION USING GRAPH SEARCHING
Corporate Author : CORNELL UNIV ITHACA N Y CENTER FOR APPLIED MATHEMATICS
Personal Author(s) : Christensen, Carl Spencer
Report Date : NOV 1968
Abstract : In this paper, context is introduced into a character recognition system. The recognition system is considered to be reading a text rather than isolated characters. Two types of context are considered: the syntax, or formal grammar, of the text under consideration, and the statistical distribution of the letters in the text. For syntax, a single 'regular' grammar is used. The statistical distribution is approximated by (a) the probabilities of individual letters, (b) the probabilities of letter pairs (or digrams), and (c) the probabilities of letter triplets (or trigrams). For each input character, the character recognizer outputs a list of alternative decisions with their associated 'confidences'. Both types of context are then applied to the entire input string to make the final decision, using a graph-searching procedure. Experiments were run on a computer with a simulated character recognizer. The results indicate that the graph-searching formulation of the problem does indeed allow the syntax and the statistical distribution of the letters to be utilized. The error rate of the character recognition system is reduced when digram and trigram statistics are used; their effectiveness varies inversely with the uncertainty in the statistical distribution of the letters in the text. (Author)
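The idea in the abstract, combining per-character alternatives and confidences with digram statistics through a search over a graph of candidate letters, can be illustrated with a small sketch. The abstract does not specify the report's actual search procedure, so the code below substitutes a standard dynamic-programming best-path search (Viterbi style) as a stand-in. The function names `decode_with_digrams` and `decode_without_context`, the `floor` parameter, and all confidences and digram probabilities are hypothetical values chosen only for illustration; they are not taken from the report.

```python
import math


def decode_without_context(candidates):
    """Pick the highest-confidence alternative at each position independently."""
    return "".join(max(pos, key=lambda alt: alt[1])[0] for pos in candidates)


def decode_with_digrams(candidates, digram, floor=1e-6):
    """Best path through the graph of alternatives (assumed Viterbi-style search).

    candidates: one list per input character of (letter, confidence) pairs,
                with every confidence > 0.
    digram:     dict mapping (previous_letter, letter) -> probability.
    floor:      assumed probability for digrams missing from the table.
    """
    # best[letter] = (log score, path) of the best path ending in that letter.
    best = {c: (math.log(p), [c]) for c, p in candidates[0]}
    for position in candidates[1:]:
        nxt = {}
        for c, p in position:
            # Choose the predecessor that maximizes path score plus digram score.
            score, path = max(
                (s + math.log(digram.get((prev, c), floor)), pth)
                for prev, (s, pth) in best.items()
            )
            nxt[c] = (score + math.log(p), path + [c])
        best = nxt
    return "".join(max(best.values())[1])


# Hypothetical recognizer output for a three-character input string.
candidates = [
    [("t", 0.60), ("f", 0.40)],
    [("b", 0.55), ("h", 0.45)],
    [("e", 0.90), ("c", 0.10)],
]
# Hypothetical digram probabilities (illustrative values, not measured).
digram = {
    ("t", "h"): 0.3, ("h", "e"): 0.3,
    ("t", "b"): 0.001, ("b", "e"): 0.05,
    ("f", "h"): 0.001, ("f", "b"): 0.001,
}

print(decode_without_context(candidates))
print(decode_with_digrams(candidates, digram))
```

In this toy input the recognizer slightly prefers "b" over "h" at the second position, so choosing each character in isolation gives a different string from the one selected when digram probabilities are included in the path score. A trigram version would, under the same assumptions, track pairs of letters as graph nodes instead of single letters.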
