Construction of an Efficient Pre-analyzed Dictionary for Korean Morphological Analysis


KIPS Transactions on Software and Data Engineering, Vol. 2, No. 12, pp. 881-888, Dec. 2013
DOI: 10.3745/KTSDE.2013.2.12.881

Abstract

A pre-analyzed dictionary is used to increase the speed and accuracy of morphological analyzers and to reduce over-generation. However, if the dictionary contains "insufficiently analyzed word-phrases", that is, entries that do not list every possible analysis of the word-phrase, it may lower analysis accuracy. In this paper, we use the Sejong corpus to measure how accuracy changes with word-phrase frequency and with corpus size. The integrated system (SMA with the pre-analyzed dictionary) performs best when the sufficient-analysis rate of the dictionary is above 99.82%. In addition, the pre-analyzed dictionary is built from word-phrases with a frequency of more than 32 when the corpus contains 1,600,000 word-phrases, and more than 64 when it contains 6,300,000 word-phrases.
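
The frequency cut-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the (word-phrase, analysis) pair format, the toy corpus with its counts and tags, and the fallback analyzer are all assumptions made for illustration. The sketch keeps only word-phrases seen more than a chosen number of times in a tagged corpus, stores every analysis observed for them, and hands any other word-phrase to a separate analyzer.

```python
from collections import Counter, defaultdict


def build_pre_analyzed_dictionary(tagged_corpus, min_frequency):
    """Map each frequent word-phrase to all analyses observed for it.

    tagged_corpus: iterable of (word_phrase, analysis) pairs (assumed format).
    min_frequency: a word-phrase is kept only if it occurs MORE than this
                   many times (e.g. 32 or 64 in the abstract's setting).
    """
    frequency = Counter()
    analyses = defaultdict(set)
    for word_phrase, analysis in tagged_corpus:
        frequency[word_phrase] += 1
        analyses[word_phrase].add(analysis)
    return {
        word_phrase: sorted(analyses[word_phrase])
        for word_phrase, count in frequency.items()
        if count > min_frequency
    }


def analyze(word_phrase, dictionary, fallback_analyzer):
    """Return the stored analyses, or defer to the fallback analyzer."""
    if word_phrase in dictionary:
        return dictionary[word_phrase]
    return fallback_analyzer(word_phrase)


# Toy data (hypothetical counts and tags, for illustration only).
toy_corpus = (
    [("나는", "나/NP+는/JX")] * 3
    + [("나는", "날/VV+는/ETM")]
    + [("학교에", "학교/NNG+에/JKB")]
)
toy_dictionary = build_pre_analyzed_dictionary(toy_corpus, min_frequency=2)


def toy_fallback(word_phrase):
    # Stand-in for a real morphological analyzer (hypothetical).
    return ["<analyzed by fallback>"]
```

Here "나는" occurs four times and is stored with both observed analyses, while "학교에" occurs once and is left to the fallback analyzer. A low-frequency word-phrase is more likely to have been seen with only some of its possible analyses, which is why a threshold of this kind bears on the insufficiently analyzed entries the abstract warns about.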


Cite this article
[IEEE Style]
S. J. Kwak, B. G. Kim, J. S. Lee, "Construction of an Efficient Pre-analyzed Dictionary for Korean Morphological Analysis," KIPS Transactions on Software and Data Engineering, vol. 2, no. 12, pp. 881-888, 2013. DOI: 10.3745/KTSDE.2013.2.12.881.

[ACM Style]
Su Jeong Kwak, Bo Gyum Kim, and Jae Sung Lee. 2013. Construction of an Efficient Pre-analyzed Dictionary for Korean Morphological Analysis. KIPS Transactions on Software and Data Engineering, 2, 12, (2013), 881-888. DOI: 10.3745/KTSDE.2013.2.12.881.