A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus


KIPS Transactions on Software and Data Engineering, Vol. 8, No. 5, pp. 213-222, May 2019
https://doi.org/10.3745/KTSDE.2019.8.5.213
Keywords: Dialogue Corpus, Semantic Mark Tagging, Context Vector Similarity
Abstract

Determining the meaning of a keyword is an important technology for implementing intelligent speech dialogue interfaces. After keywords are extracted from a user's utterance to grasp its intention, the intention of the utterance is determined using the semantic marks of those keywords. One keyword can have several semantic marks, and we regard the task of attaching the semantic mark that matches the user's intention to each keyword as a word sense disambiguation problem. In this study, about 23% of all keywords in the corpus are manually tagged to build a semantic mark dictionary, a synonym dictionary, and a context vector dictionary, and the remaining 77% are then tagged automatically. The semantic mark of a keyword is determined by calculating its context vector similarity against the context vector dictionary. For an unregistered keyword, the semantic mark of the most similar keyword, found using the synonym dictionary, is attached. We compare the performance of the system trained on the manually constructed training set with that of the system trained on the semi-automatically expanded training set, using 3 high-frequency and 3 low-frequency keywords selected from the corpus. In the experiments, we obtained an accuracy of 54.4% with the manually constructed training set and 50.0% with the semi-automatically expanded training set.
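
The tagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how context vectors are built or compared, so this example assumes bag-of-words count vectors and cosine similarity. The names `cosine`, `tag_keyword`, `context_vectors`, and `synonyms`, along with the sample keywords and semantic marks, are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_keyword(keyword, context_words, context_vectors, synonyms):
    """Return the semantic mark whose stored context vector is most similar
    to the utterance context, or None if no candidate is available.

    Assumption: an unregistered keyword falls back to the context vectors
    of its registered synonyms.
    """
    query = Counter(context_words)
    if keyword in context_vectors:
        candidates = [keyword]
    else:
        candidates = [s for s in synonyms.get(keyword, []) if s in context_vectors]

    best_mark, best_score = None, -1.0
    for cand in candidates:
        for mark, vec in context_vectors[cand].items():
            score = cosine(query, vec)
            if score > best_score:
                best_mark, best_score = mark, score
    return best_mark


# Hypothetical dictionaries, standing in for those built from manual tags.
context_vectors = {
    "light": {
        "DEVICE": Counter({"turn": 2, "on": 2, "room": 1}),
        "WEIGHT": Counter({"bag": 2, "carry": 1}),
    }
}
synonyms = {"lamp": ["light"]}

registered = tag_keyword("light", ["turn", "on", "the", "light"], context_vectors, synonyms)
unregistered = tag_keyword("lamp", ["turn", "off", "lamp"], context_vectors, synonyms)
```

Here `registered` is resolved directly from the context vector dictionary, while `unregistered` is resolved through the synonym entry for "lamp". A keyword with neither a dictionary entry nor a registered synonym yields `None`.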


Cite this article
[IEEE Style]
J. Park, S. Lee, Y. Lim, J. Choi, "A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus," KIPS Transactions on Software and Data Engineering, vol. 8, no. 5, pp. 213-222, 2019. DOI: https://doi.org/10.3745/KTSDE.2019.8.5.213.

[ACM Style]
Junhyeok Park, Songwook Lee, Yoonseob Lim, and Jongsuk Choi. 2019. A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus. KIPS Transactions on Software and Data Engineering, 8, 5, (2019), 213-222. DOI: https://doi.org/10.3745/KTSDE.2019.8.5.213.