A Spelling Error Correction Model in Korean Using a Correction Dictionary and a Newspaper Corpus


The KIPS Transactions:PartB, Vol. 16, No. 5, pp. 427-434, Oct. 2009
DOI: 10.3745/KIPSTB.2009.16.5.427

Abstract

With the rapid evolution of the Internet and mobile environments, text containing spelling errors, such as newly coined words and abbreviations, has become widespread. These spelling errors make it difficult to develop natural language processing (NLP) applications because they reduce the readability of texts. To resolve this problem, we propose a spelling error correction model that uses a spelling error correction dictionary and a newspaper corpus. One advantage of the proposed model is that the cost of data construction is low, because its training corpus is a newspaper corpus, which is easy to obtain. Another advantage is that it requires no additional external modules, such as a morphological analyzer or a word-spacing error correction system, because it uses a simple string matching method based on a correction dictionary. In experiments with a newspaper corpus and a short message corpus collected from real mobile phones, the proposed model performed well on several evaluation measures: a miss-correction rate of 7.3%, an F1-measure of 97.3%, and a false positive rate of 1.1%.
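
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of what dictionary-based string matching could look like, not the authors' implementation. The function name `correct`, the longest-match-first policy, and the toy dictionary entries are all assumptions made for illustration. The sketch also leaves out the role of the newspaper corpus, which the abstract describes only as a training corpus.

```python
def correct(text, correction_dict):
    """Replace every dictionary key found in `text` with its correction.

    Assumption: at each position the longest matching key wins. This is a
    hypothetical policy chosen for the sketch, not taken from the paper.
    """
    max_len = max((len(key) for key in correction_dict), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring starting at position i first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if candidate in correction_dict:
                out.append(correction_dict[candidate])
                i += n
                break
        else:
            # No dictionary entry starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical correction dictionary: informal spelling -> standard spelling.
toy_dict = {"걍": "그냥", "넘": "너무", "ㄱㅅ": "감사"}

corrected = correct("걍 넘 좋아", toy_dict)
```

Because the scan works directly on the character string, it needs neither a morphological analyzer nor correct word spacing, which is the property the abstract highlights. A pure string match like this will also rewrite keys that occur inside correctly spelled words, so a practical system needs some way to decide when a match should actually be applied.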



Cite this article
[IEEE Style]
S. H. Lee and H. S. Kim, "A Spelling Error Correction Model in Korean Using a Correction Dictionary and a Newspaper Corpus," The KIPS Transactions:PartB, vol. 16, no. 5, pp. 427-434, 2009. DOI: 10.3745/KIPSTB.2009.16.5.427.

[ACM Style]
Se Hee Lee and Hark Soo Kim. 2009. A Spelling Error Correction Model in Korean Using a Correction Dictionary and a Newspaper Corpus. The KIPS Transactions:PartB, 16, 5, (2009), 427-434. DOI: 10.3745/KIPSTB.2009.16.5.427.