Extraction of Text Regions from Spam-Mail Images Using Color Layers


The KIPS Transactions:PartB, Vol. 13, No. 4, pp. 409-416, Aug. 2006
DOI: 10.3745/KIPSTB.2006.13.4.409

Abstract

In this paper, we propose an algorithm for extracting text regions from spam-mail images using color layers. The proposed method, CLTE (color layer-based text extraction), divides the input image into eight planes, one per color layer. It extracts connected components from each of the eight planes and then classifies them as text or non-text regions based on component size. We also propose an algorithm for recovering damaged text strokes in the extracted text image. In the binary image, two types of stroke damage occur: (1) middle strokes such as ‘ㅣ’ or ‘ㅡ’ are deleted, and (2) first and/or last strokes such as ‘ㅇ’ or ‘ㅁ’ are filled in with black pixels. An experiment with 200 spam-mail images shows that the proposed approach is more than 10% more accurate than conventional methods.
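
The pipeline described in the abstract (split into eight color layers, find connected components per layer, classify by size) can be illustrated with a minimal Python sketch. The abstract does not say how the eight layers are formed or which size limits are used, so everything specific here is an assumption: the layers are obtained by thresholding each of the R, G and B channels at 128 (three bits, hence eight layers), and the names `split_into_color_layers`, `connected_components`, `is_text_sized`, `extract_text_boxes`, `min_side` and `max_side` are hypothetical and not taken from the paper.

```python
from collections import deque


def split_into_color_layers(image, threshold=128):
    """Split an RGB image (rows of (r, g, b) tuples) into 8 binary masks.

    Assumption: layer index = 4*R_bit + 2*G_bit + B_bit, where each bit is
    1 when the channel value is >= threshold. The paper may quantize
    colors differently.
    """
    height, width = len(image), len(image[0])
    layers = [[[False] * width for _ in range(height)] for _ in range(8)]
    for y in range(height):
        for x in range(width):
            r, g, b = image[y][x]
            index = ((4 if r >= threshold else 0)
                     + (2 if g >= threshold else 0)
                     + (1 if b >= threshold else 0))
            layers[index][y][x] = True
    return layers


def connected_components(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected components."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def is_text_sized(box, min_side=2, max_side=40):
    """Size-based text test; the limits are illustrative assumptions."""
    x0, y0, x1, y1 = box
    longer_side = max(x1 - x0 + 1, y1 - y0 + 1)
    return min_side <= longer_side <= max_side


def extract_text_boxes(image, min_side=2, max_side=40):
    """Collect text-sized component boxes from all eight color layers."""
    text_boxes = []
    for mask in split_into_color_layers(image):
        for box in connected_components(mask):
            if is_text_sized(box, min_side, max_side):
                text_boxes.append(box)
    return text_boxes
```

Calling `extract_text_boxes(image)` on an image given as a list of rows of `(r, g, b)` tuples returns the bounding boxes of components whose longer side falls within the assumed limits. Large uniform backgrounds are rejected because their bounding box exceeds `max_side`. The stroke-recovery step is not sketched, since the abstract does not describe how it works.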


Cite this article
[IEEE Style]
J. S. Kim, S. H. Kim, S. W. Han, T. Y. Nam, H. J. Son, S. R. Oh, "Extraction of Text Regions from Spam-Mail Images Using Color Layers," The KIPS Transactions:PartB, vol. 13, no. 4, pp. 409-416, 2006. DOI: 10.3745/KIPSTB.2006.13.4.409.

[ACM Style]
Ji Soo Kim, Soo Hyung Kim, Seung Wan Han, Taek Yong Nam, Hwa Jeong Son, and Sung Ryul Oh. 2006. Extraction of Text Regions from Spam-Mail Images Using Color Layers. The KIPS Transactions:PartB, 13, 4, (2006), 409-416. DOI: 10.3745/KIPSTB.2006.13.4.409.