A Statistical Model for Computer Recognition of Handwritten ZIP Codes
Steve C. Wang
Computing Science and Statistics 31, pp. 221-225 (1999)

I present a statistical model for computer recognition of human handwriting, specifically ZIP codes. I use Bayesian principles to build a model that integrates two major tasks in handwriting recognition: the segmentation of a sequence of characters into its individual components, and the recognition of these individual components.

The model describes how to use the information extracted from a ZIP code image to update our prior knowledge about a candidate segmentation. I incorporate a digit classification algorithm developed by Amit, Geman, and Wilder to recognize the characters determined by the candidate segmentation. The strength of this recognition provides additional information about the plausibility of the segmentation.

Combining these sources of information, we obtain a joint posterior distribution over segmentation and recognition. Summing this posterior distribution over all segmentations gives us a posterior distribution on the recognition alone, and we take its mode as our best prediction of the true ZIP code. To make this optimization feasible, I implement a generalized dynamic programming algorithm.
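
The summation over segmentations can be illustrated with a minimal sketch. It is not the generalized dynamic programming algorithm of the paper; it is an ordinary forward recursion under assumptions made purely for illustration: the image is pre-cut into n_pieces ordered pieces, each digit occupies a contiguous run of pieces, and the joint score factors into one term per digit. The functions weight and toy_weight, and all numbers in them, are hypothetical stand-ins for the segmentation prior and the classifier's recognition strength.

```python
from itertools import combinations, product


def marginal_score(weight, n_pieces, digits):
    """Sum the unnormalized joint score over all segmentations for one digit string.

    weight(i, j, d) is a hypothetical nonnegative score for reading
    pieces i..j-1 as digit d (prior on the segment times recognition strength).
    """
    k = len(digits)
    # forward[m][j]: total score of reading the first j pieces as digits[:m]
    forward = [[0.0] * (n_pieces + 1) for _ in range(k + 1)]
    forward[0][0] = 1.0
    for m in range(1, k + 1):
        for j in range(m, n_pieces + 1):
            forward[m][j] = sum(
                forward[m - 1][i] * weight(i, j, digits[m - 1])
                for i in range(m - 1, j)
            )
    return forward[k][n_pieces]


def brute_force_score(weight, n_pieces, digits):
    """Same sum, computed by enumerating every segmentation explicitly."""
    k = len(digits)
    total = 0.0
    for cuts in combinations(range(1, n_pieces), k - 1):
        bounds = (0, *cuts, n_pieces)
        score = 1.0
        for m in range(k):
            score *= weight(bounds[m], bounds[m + 1], digits[m])
        total += score
    return total


def best_reading(weight, n_pieces, n_digits, alphabet):
    """Return the mode of the marginal posterior over digit strings and its probability."""
    scores = {
        d: marginal_score(weight, n_pieces, d)
        for d in product(alphabet, repeat=n_digits)
    }
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def toy_weight(i, j, d):
    """Made-up scores: short segments are favored, one digit fits each segment well."""
    prior = {1: 0.6, 2: 0.3}.get(j - i, 0.1)
    likelihood = 0.9 if d == (i + j) % 3 else 0.05
    return prior * likelihood


if __name__ == "__main__":
    reading, posterior = best_reading(toy_weight, n_pieces=6, n_digits=3, alphabet=range(3))
    print(reading, posterior)
```

The forward recursion needs on the order of n_digits × n_pieces² weight evaluations per digit string, whereas explicit enumeration grows combinatorially with the number of cut points. The outer loop over all digit strings is only practical for a toy alphabet like this one.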

