Korean Homograph Tagging Model based on Sub-Word Conditional Probability


KIPS Transactions on Software and Data Engineering, Vol. 3, No. 10, pp. 407-420, Oct. 2014
DOI: 10.3745/KTSDE.2014.3.10.407

Abstract

Korean morphological analysis generally proceeds in two steps. The first step, ambiguity generation, analyzes an Eojeol into many candidate morpheme sequences. The second step chooses one appropriate candidate by using contextual information. A Hidden Markov Model (HMM) is typically applied in the second step. This paper proposes the Sub-word Conditional Probability (SCP) model as an alternative algorithm. SCP first uses sub-word information from the adjacent Eojeol. If that fails, SCP then uses morpheme information in a restricted way. In a comparative test of accuracy and speed, the HMM's accuracy was 96.49% and SCP's accuracy was only 0.07% lower, but SCP reduced processing time by 53%.
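
The back-off order described in the abstract (sub-word evidence from the adjacent Eojeol first, morpheme evidence only if that fails) can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not define how sub-words are formed or how probabilities are estimated, so the sketch assumes that sub-words are prefixes of the neighbouring Eojeol tried from longest to shortest, and that the conditional probability is a plain relative frequency. The names `BackoffTagger`, `subwords`, the sense labels, and the training pairs are all hypothetical.

```python
from collections import defaultdict


def subwords(eojeol):
    """Prefixes of an Eojeol, longest first (an assumed definition of 'sub-word')."""
    return [eojeol[:n] for n in range(len(eojeol), 0, -1)]


class BackoffTagger:
    """Toy disambiguator: sub-word context first, morpheme context second."""

    def __init__(self):
        # counts[context][tag] = number of times `tag` co-occurred with `context`
        self.subword_counts = defaultdict(lambda: defaultdict(int))
        self.morpheme_counts = defaultdict(lambda: defaultdict(int))

    def train(self, tag, neighbor_eojeol, neighbor_morpheme):
        """Record one observation of `tag` next to the given Eojeol and morpheme."""
        for sub in subwords(neighbor_eojeol):
            self.subword_counts[sub][tag] += 1
        self.morpheme_counts[neighbor_morpheme][tag] += 1

    @staticmethod
    def _best(counts, candidates):
        """Return (tag, P(tag | context)) over the candidates, or None if unseen."""
        total = sum(counts.get(c, 0) for c in candidates)
        if total == 0:
            return None
        best = max(candidates, key=lambda c: counts.get(c, 0))
        return best, counts.get(best, 0) / total

    def tag(self, candidates, neighbor_eojeol, neighbor_morpheme):
        """Pick one candidate and report which evidence level decided it."""
        # Step 1: sub-word evidence from the adjacent Eojeol, longest prefix first.
        for sub in subwords(neighbor_eojeol):
            hit = self._best(self.subword_counts.get(sub, {}), candidates)
            if hit is not None:
                return hit[0], "subword"
        # Step 2: fall back to morpheme evidence only when step 1 found nothing.
        hit = self._best(self.morpheme_counts.get(neighbor_morpheme, {}), candidates)
        if hit is not None:
            return hit[0], "morpheme"
        # No evidence at all: keep the first candidate as an arbitrary default.
        return candidates[0], "default"


# Hypothetical training data for the homograph 말 ("speech" vs. "horse").
tagger = BackoffTagger()
tagger.train("말/speech", "듣고", "듣")
tagger.train("말/horse", "타고", "타")

senses = ["말/speech", "말/horse"]
seen_prefix = tagger.tag(senses, "타면", "타")    # shares the prefix 타 with 타고
irregular = tagger.tag(senses, "들어", "듣")      # no shared prefix, stem 듣 is known
unseen = tagger.tag(senses, "보고", "보")         # neither level has evidence
```

In this sketch, the irregular form 들어 shares no prefix with the training Eojeol 듣고, so only the morpheme-level lookup can resolve it. That is the kind of case in which a restricted morpheme fallback would be needed under the assumptions stated above.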


Cite this article
[IEEE Style]
S. J. Choul and O. C. Young, "Korean Homograph Tagging Model based on Sub-Word Conditional Probability," KIPS Transactions on Software and Data Engineering, vol. 3, no. 10, pp. 407-420, 2014. DOI: 10.3745/KTSDE.2014.3.10.407.

[ACM Style]
Shin Joon Choul and Ock Cheol Young. 2014. Korean Homograph Tagging Model based on Sub-Word Conditional Probability. KIPS Transactions on Software and Data Engineering, 3, 10, (2014), 407-420. DOI: 10.3745/KTSDE.2014.3.10.407.