Comparison of Korean Classification Models’ Korean Essay Score Range Prediction Performance


KIPS Transactions on Software and Data Engineering, Vol. 11, No. 3, pp. 133-140, Mar. 2022
https://doi.org/10.3745/KTSDE.2022.11.3.133
Keywords: Deep Learning-Based Korean Language Model, KoBERT, KcBERT, KR-BERT, Document Classification
Abstract

We investigate the performance of deep learning-based Korean language models on the task of predicting the score range of Korean essays written by foreign students. We constructed a data set containing a total of 304 essays, which include essays discussing the criteria for choosing a job (‘job’), the conditions of a happy life (‘happ’), the relationship between money and happiness (‘econ’), and the definition of success (‘succ’). These essays were labeled according to four letter grades (A, B, C, and D), and a total of eleven essay score range prediction experiments were conducted (i.e., five for predicting the score range of ‘job’ essays, five for predicting the score range of ‘happ’ essays, and one for predicting the score range of mixed-topic essays). Three deep learning-based Korean language models, KoBERT, KcBERT, and KR-BERT, were fine-tuned using various training data sets. In addition, two traditional probabilistic machine learning classifiers, naive Bayes and logistic regression, were also evaluated. Experimental results show that the deep learning-based Korean language models performed better than the two traditional classifiers, with KR-BERT performing best at 55.83% overall average prediction accuracy. A close second was KcBERT (55.77%), followed by KoBERT (54.91%). The naive Bayes and logistic regression classifiers achieved 52.52% and 50.28%, respectively. Due to the scarcity of training data and the imbalance in class distribution, the overall prediction performance was not high for any of the classifiers. Moreover, the classifiers’ vocabularies did not explicitly capture the error features that are helpful for correctly grading Korean essays. We expect the score range prediction performance to improve once these two limitations are overcome.
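
For readers who wish to set up a comparable experiment, the following is a minimal sketch (not the authors' code) of four-class (A–D) score range classification. It assumes scikit-learn for the naive Bayes and logistic regression baselines and Hugging Face Transformers for the fine-tuned Korean language model; the essay texts, labels, and the Hub checkpoint identifier (beomi/kcbert-base) are placeholders and assumptions, not taken from the paper.

```python
# Minimal sketch of four-class (A-D) essay score range classification:
# TF-IDF baselines plus the skeleton of loading a Korean BERT-style model
# with a 4-way classification head. All data below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Placeholder essays and grade labels (0=A, 1=B, 2=C, 3=D); the actual
# data set of 304 learner essays is not reproduced here.
train_texts = [
    "직업을 선택할 때 가장 중요한 기준은 적성이라고 생각한다.",
    "행복한 삶의 조건은 건강과 가족이라고 생각한다.",
    "돈이 많다고 해서 반드시 행복한 것은 아니다.",
    "성공의 정의는 사람마다 다르다고 생각한다.",
]
train_labels = [0, 1, 2, 3]
test_texts = ["행복은 돈으로만 살 수 없다고 생각한다."]
test_labels = [2]

# Traditional baselines: bag-of-words TF-IDF features with naive Bayes
# and logistic regression classifiers.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)
for name, clf in [("naive Bayes", MultinomialNB()),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    clf.fit(X_train, train_labels)
    print(name, "accuracy:", accuracy_score(test_labels, clf.predict(X_test)))

# Deep learning-based model: load a Korean BERT variant with a 4-label
# classification head. The Hub identifier below is an assumption; KoBERT
# and KR-BERT require their own checkpoints and tokenizers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "beomi/kcbert-base"  # assumed KcBERT checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)
# Fine-tuning would then proceed with transformers.Trainer (or a standard
# PyTorch training loop) over the tokenized essays and their grade labels.
```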


Cite this article
[IEEE Style]
H. Cho, H. Im, Y. Yi, J. Cha, "Comparison of Korean Classification Models’ Korean Essay Score Range Prediction Performance," KIPS Transactions on Software and Data Engineering, vol. 11, no. 3, pp. 133-140, 2022. DOI: https://doi.org/10.3745/KTSDE.2022.11.3.133.

[ACM Style]
Heeryon Cho, Hyeonyeol Im, Yumi Yi, and Junwoo Cha. 2022. Comparison of Korean Classification Models’ Korean Essay Score Range Prediction Performance. KIPS Transactions on Software and Data Engineering, 11, 3, (2022), 133-140. DOI: https://doi.org/10.3745/KTSDE.2022.11.3.133.