Linguistic Features Discrimination for Social Issue Risk Classification


KIPS Transactions on Software and Data Engineering, Vol. 5, No. 11, pp. 541-548, Nov. 2016
DOI: 10.3745/KTSDE.2016.5.11.541
Keywords: Risk Detection, Text Classification, Linguistic Feature, Feature Discrimination, Word Embedding
Abstract

Social media has become an essential source of information for listening to users' opinions and monitoring them. We define social 'risks' as issues that negatively influence public opinion in social media. This paper aims to discriminate among various linguistic features and reveal their effects on building an automatic classification model of social risks. In particular, we adopt a word embedding technique to represent linguistic clues in risk sentences. As a preliminary experiment to analyze the characteristics of individual features, we correct errors in the automatic linguistic analysis. The results show that the most important feature is NE (Named Entity) information, and that the best condition combines basic linguistic features, word embedding, and word clusters within core predicates. Experiments under real conditions on social big data, including linguistic analysis errors, show precision of 92.08% for the frequent risk category set and 85.84% for the full test set.
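The feature combination described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the embedding values, the cluster assignments, the NE type list, and every function name below are hypothetical and chosen only for illustration. The sketch concatenates an averaged word embedding, a named-entity indicator vector, and a cluster indicator for core predicates into one feature vector that a classifier could consume.

```python
from typing import Dict, List, Sequence

# Hypothetical toy resources. The abstract does not describe the actual
# embeddings, word clusters, or NE tag set used in the paper.
EMBEDDINGS: Dict[str, List[float]] = {
    "recall": [0.9, 0.1],
    "defect": [0.8, 0.3],
    "launch": [0.1, 0.9],
}
WORD_CLUSTERS: Dict[str, int] = {"recall": 0, "defect": 0, "launch": 1}
NE_TYPES: Sequence[str] = ("ORG", "PERSON", "PRODUCT")
NUM_CLUSTERS = 2
EMBED_DIM = 2


def mean_embedding(tokens: Sequence[str]) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def ne_indicator(ne_tags: Sequence[str]) -> List[float]:
    """One slot per NE type: 1.0 if that type occurs in the sentence."""
    return [1.0 if ne_type in ne_tags else 0.0 for ne_type in NE_TYPES]


def cluster_indicator(predicates: Sequence[str]) -> List[float]:
    """One slot per word cluster: 1.0 if a core predicate falls in it."""
    vector = [0.0] * NUM_CLUSTERS
    for predicate in predicates:
        if predicate in WORD_CLUSTERS:
            vector[WORD_CLUSTERS[predicate]] = 1.0
    return vector


def build_features(
    tokens: Sequence[str],
    ne_tags: Sequence[str],
    predicates: Sequence[str],
) -> List[float]:
    """Concatenate the three feature groups into a single vector."""
    return (
        mean_embedding(tokens)
        + ne_indicator(ne_tags)
        + cluster_indicator(predicates)
    )
```

Keeping each feature group in its own function makes it simple to switch a group on or off, which is one way to compare the contribution of individual features as the abstract describes.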


Cite this article
[IEEE Style]
H. Oh, B. Yun, C. Kim, "Linguistic Features Discrimination for Social Issue Risk Classification," KIPS Transactions on Software and Data Engineering, vol. 5, no. 11, pp. 541-548, 2016. DOI: 10.3745/KTSDE.2016.5.11.541.

[ACM Style]
Hyo-Jung Oh, Bo-Hyun Yun, and Chan-Young Kim. 2016. Linguistic Features Discrimination for Social Issue Risk Classification. KIPS Transactions on Software and Data Engineering, 5, 11, (2016), 541-548. DOI: 10.3745/KTSDE.2016.5.11.541.