Building Database using Character Recognition Technology


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 6, No. 7, pp. 1713-1723, Jul. 1999
DOI: 10.3745/KIPSTE.1999.6.7.1713

Abstract

Optical character recognition (OCR) may be the most feasible method for building a database from printed matter. This paper describes the points to consider when selecting an OCR system for database construction. Based on these considerations, we evaluated four commercial OCR systems and chose the one with the best recognition rate to build an OCR-text database. The subject text, the KT-test collection, is a set of abstracts from proceedings that vary in print quality, fonts, and formats; the collection also comes with a typed-text database. The recognition rate was calculated by comparing the recognition results with the typed text. To simulate a practical environment, no preprocessing such as learning or slant correction was applied to the recognition process. The result shows a character recognition rate of 90.5% over 970 abstracts, which is still insufficient for practical use. The errors in OCR texts differ from those in manually typed texts. In this paper, we classify the errors in OCR texts as a basis for further research.
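The abstract says the recognition rate was obtained by comparing OCR output with the typed text, but it does not give the formula. The sketch below shows one common way such a comparison could be done, assuming the rate is defined as 1 minus the character-level edit distance divided by the length of the typed reference. That definition, and the function names `edit_distance` and `recognition_rate`, are illustrative assumptions, not the method confirmed by the paper.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the OCR output
                curr[j - 1] + 1,     # extra character in the OCR output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def recognition_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    typed = "recognition"   # hypothetical typed (ground-truth) text
    ocr = "recognitiom"     # hypothetical OCR output with one substitution
    print(f"{recognition_rate(typed, ocr):.3f}")
```

Under this assumed definition, a collection-level rate would be computed either by averaging per-abstract rates or by pooling errors and characters over all abstracts; the abstract does not state which was used.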



Cite this article
[IEEE Style]
H. S. Hwa, L. C. Sik, L. J. Ho, K. J. Hyung, "Building Database using Character Recognition Technology," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 6, no. 7, pp. 1713-1723, 1999. DOI: 10.3745/KIPSTE.1999.6.7.1713.

[ACM Style]
Hahn Sun Hwa, Lee Chung Sik, Lee Joon Ho, and Kim Jin Hyung. 1999. Building Database using Character Recognition Technology. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 6, 7, (1999), 1713-1723. DOI: 10.3745/KIPSTE.1999.6.7.1713.