Semi-Automatic Annotation Tool for Korean Dependency Structure


The KIPS Transactions:PartB, Vol. 13, No. 1, pp. 63-70, Feb. 2006
DOI: 10.3745/KIPSTB.2006.13.1.63

Abstract

In general, a corpus contains a great deal of linguistic information and is widely used in natural language processing and computational linguistics. Creating such a corpus, however, is expensive, labor-intensive, and time-consuming. To alleviate this problem, annotation tools for building corpora rich in linguistic information are indispensable. In this paper, we design and implement an annotation tool for building a Korean dependency tree-tagged corpus. Ideally, the corpus would be created fully automatically, without any intervention by annotators, but in practice this is impossible. Like most other annotation tools, the proposed tool is semi-automatic: it is designed for correcting the errors produced by basic analyzers such as a part-of-speech tagger and a (partial) parser. It is also designed to avoid repetitive work during error correction and to be easy and friendly to use. Using the proposed tool, 10,000 Korean sentences containing more than 20 words were annotated with dependency structures. Eight annotators worked 4 hours a day for 2 months. We are confident that the tool yields accurate and consistent annotations while reducing labor and time.


Cite this article
[IEEE Style]
J. H. Kim and E. J. Park, "Semi-Automatic Annotation Tool for Korean Dependency Structure," The KIPS Transactions:PartB, vol. 13, no. 1, pp. 63-70, 2006. DOI: 10.3745/KIPSTB.2006.13.1.63.

[ACM Style]
Jae Hoon Kim and Eun Jin Park. 2006. Semi-Automatic Annotation Tool for Korean Dependency Structure. The KIPS Transactions:PartB, 13, 1, (2006), 63-70. DOI: 10.3745/KIPSTB.2006.13.1.63.