Keyword Spotting on Hangul Document Images Using Character Feature Models


The KIPS Transactions:PartB, Vol. 12, No. 5, pp. 521-526, Oct. 2005
DOI: 10.3745/KIPSTB.2005.12.5.521

Abstract

In this paper, we propose a keyword spotting system as an alternative search method for poor-quality Korean document images and compare it with an OCR-based document retrieval system. The system consists of three steps: character segmentation, feature extraction for the query keyword, and word-to-word matching. In the character segmentation step, we propose an effective method for removing the connections between adjacent characters, together with a character segmentation method that minimizes the variance of character widths. In the query creation step, the feature vector for the query is constructed by combining the character models for each typeface. In the matching step, word-to-word matching is performed on the basis of character-to-character matching. We demonstrate that the proposed keyword spotting system is more efficient than the OCR-based one for searching for a keyword in Korean document images, especially when the document quality is poor and the point size is small.
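
The abstract states only the criterion for character segmentation (minimum variance of character widths), not the procedure. As a rough illustration of that criterion, the following Python sketch chooses cut positions for a single word image by exhaustive search. The function name `segment_min_variance`, the idea of a precomputed list of candidate cut columns, and the assumption that the number of characters in the word is known are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from statistics import pvariance


def segment_min_variance(left, right, candidate_cuts, n_chars):
    """Choose n_chars - 1 cut columns that minimise the variance of character widths.

    left, right     -- x-coordinates of the word's left and right edges (assumed known)
    candidate_cuts  -- hypothetical candidate cut columns, e.g. from a projection profile
    n_chars         -- assumed number of characters in the word
    Returns (cuts, variance) for the best combination found.
    """
    if n_chars < 1:
        raise ValueError("n_chars must be at least 1")

    # Keep only candidates strictly inside the word, without duplicates.
    inner = sorted(c for c in set(candidate_cuts) if left < c < right)
    if len(inner) < n_chars - 1:
        raise ValueError("not enough candidate cuts for the requested character count")

    best_cuts, best_var = None, float("inf")
    # Exhaustive search: fine for short words, not meant to be efficient.
    for cuts in combinations(inner, n_chars - 1):
        bounds = [left, *cuts, right]
        widths = [b - a for a, b in zip(bounds, bounds[1:])]
        var = pvariance(widths)  # population variance of the character widths
        if var < best_var:
            best_cuts, best_var = list(cuts), var
    return best_cuts, best_var


if __name__ == "__main__":
    # A 90-pixel-wide word assumed to contain three characters.
    cuts, var = segment_min_variance(0, 90, [12, 29, 47, 61, 75], 3)
    print(cuts, var)
```

In this example the cuts at columns 29 and 61 give widths of 29, 32, and 29 pixels, which are the most uniform widths available among the candidates. Hangul characters are typically set in near-square cells of similar width, which is presumably why a width-uniformity criterion suits this script; a practical system would likely replace the exhaustive search with something cheaper for long words.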


Cite this article
[IEEE Style]
S. C. Park, S. H. Kim, D. J. Choi, "Keyword Spotting on Hangul Document Images Using Character Feature Models," The KIPS Transactions:PartB, vol. 12, no. 5, pp. 521-526, 2005. DOI: 10.3745/KIPSTB.2005.12.5.521.

[ACM Style]
Sang Cheol Park, Soo Hyung Kim, and Deok Jai Choi. 2005. Keyword Spotting on Hangul Document Images Using Character Feature Models. The KIPS Transactions:PartB, 12, 5, (2005), 521-526. DOI: 10.3745/KIPSTB.2005.12.5.521.