A Reverse Segmentation Algorithm of Compound Nouns


The KIPS Transactions:PartB, Vol. 8, No. 4, pp. 357-364, Aug. 2001
DOI: 10.3745/KIPSTB.2001.8.4.357

Abstract

In this paper, we propose a new segmentation algorithm for compound noun analysis in Korean. The algorithm segments a compound noun into a sequence of unit nouns and affixes using a unit noun dictionary and an affix dictionary. Because the head of a compound noun appears at the end of the word in most cases, the proposed algorithm segments the given compound noun from the end of the word toward the beginning. To evaluate the accuracy of the proposed algorithm, an experiment was conducted with 3,230 compound nouns extracted from the ETRI tagged corpus. The experimental results show that the accuracy of the proposed method is 96.6% on average. For compound nouns containing unknown words, the accuracy drops to 77.5%. The experiment shows that the proposed algorithm outperforms other methods on compound nouns with unknown words.
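
The right-to-left idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the matching rule, so the sketch assumes a longest-suffix-first dictionary match with backtracking. The function name `reverse_segment`, the `max_len` parameter, and the tiny example dictionaries are hypothetical, and unknown-word handling is not modeled (the function returns `None` when no full segmentation exists).

```python
from typing import Iterable, List, Optional


def reverse_segment(
    word: str,
    nouns: Iterable[str],
    affixes: Iterable[str],
    max_len: Optional[int] = None,
) -> Optional[List[str]]:
    """Segment `word` from its end toward its beginning.

    Assumption: a longest-suffix-first match with backtracking.
    Returns the units in left-to-right order, or None if the word
    cannot be fully covered by dictionary entries.
    """
    # Hypothetical stand-ins for the unit noun and affix dictionaries.
    lexicon = set(nouns) | set(affixes)
    if max_len is None:
        max_len = max((len(entry) for entry in lexicon), default=0)

    def solve(end: int) -> Optional[List[str]]:
        # `end` is the exclusive right boundary of the unsegmented prefix.
        if end == 0:
            return []
        # Try the longest candidate suffix first, then shorter ones.
        for size in range(min(max_len, end), 0, -1):
            piece = word[end - size:end]
            if piece in lexicon:
                rest = solve(end - size)
                if rest is not None:
                    return rest + [piece]
        return None  # no dictionary entry ends at this boundary

    return solve(len(word))


if __name__ == "__main__":
    unit_nouns = {"정보", "검색", "시스템"}
    affix_list = {"들"}
    print(reverse_segment("정보검색시스템", unit_nouns, affix_list))
```

With these example dictionaries, the call matches 시스템 first, then 검색, then 정보, and returns the units in reading order. A word containing a substring absent from both dictionaries yields `None`, which is where a real system would need an unknown-word strategy.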


Cite this article
[IEEE Style]
H. M. Lee and H. R. Park, "A Reverse Segmentation Algorithm of Compound Nouns," The KIPS Transactions:PartB, vol. 8, no. 4, pp. 357-364, 2001. DOI: 10.3745/KIPSTB.2001.8.4.357.

[ACM Style]
Hyun Min Lee and Hyuk Ro Park. 2001. A Reverse Segmentation Algorithm of Compound Nouns. The KIPS Transactions:PartB, 8, 4, (2001), 357-364. DOI: 10.3745/KIPSTB.2001.8.4.357.