TAGS: Text Augmentation with Generation and Selection

KIPS Transactions on Software and Data Engineering, Vol. 12, No. 10, pp. 455-460, Oct. 2023
https://doi.org/10.3745/KTSDE.2023.12.10.455
Keywords: Natural Language Processing, Artificial Intelligence, Large Language Model, Few-Shot Learning, Text Augmentation

Text augmentation is a methodology that creates new texts by transforming existing texts or generating new ones from them, with the aim of improving the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recently, text augmentation using large language models and few-shot learning has been shown to overcome these limitations, but it carries a risk of introducing noise through incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate ones as augmented texts. TAGS generates varied expressions using few-shot learning, and it selects suitable data effectively even from a small amount of original text by using contrastive learning and similarity comparison. We applied this method to task-oriented chatbot data and increased the quantity of text more than sixtyfold. We also analyzed the generated texts and confirmed that they are semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model on the augmented texts and showed that performance improved by more than 0.1915, confirming that the method helps improve actual model performance.
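
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the encoder, the thresholds, or the exact selection rule. Here a toy bag-of-words vector stands in for a contrastively trained encoder, the candidate list stands in for few-shot LLM output, and the names `embed`, `cosine`, `select_candidates`, `low`, and `high` are all hypothetical. The upper threshold, which drops near-duplicates of the original, is an added assumption.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_candidates(originals, candidates, low=0.3, high=0.95):
    """Keep candidates whose best similarity to any original is in [low, high).

    low filters out off-topic generations (noise); high filters out
    near-duplicates. Both threshold values are illustrative assumptions.
    """
    original_vecs = [embed(o) for o in originals]
    kept = []
    for cand in candidates:
        cand_vec = embed(cand)
        best = max((cosine(cand_vec, v) for v in original_vecs), default=0.0)
        if low <= best < high:
            kept.append(cand)
    return kept


originals = ["book a table for two tonight"]
candidates = [
    "book a table for two tonight",        # exact duplicate of the original
    "please book a table for two people",  # paraphrase
    "what is the weather in seoul",        # unrelated to the original
]
selected = select_candidates(originals, candidates)
print(selected)
```

In this toy run only the paraphrase survives: the duplicate exceeds the upper threshold and the unrelated sentence shares no tokens with the original. A real pipeline would replace `embed` with sentence embeddings and tune the thresholds on held-out data.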

Cite this article
[IEEE Style]
K. K. Min, D. H. Kim, S. Jo, H. Oh, M. Hwang, "TAGS: Text Augmentation with Generation and Selection," KIPS Transactions on Software and Data Engineering, vol. 12, no. 10, pp. 455-460, 2023. DOI: https://doi.org/10.3745/KTSDE.2023.12.10.455.

[ACM Style]
Kim Kyung Min, Dong Hwan Kim, Seongung Jo, Heung-Seon Oh, and Myeong-Ha Hwang. 2023. TAGS: Text Augmentation with Generation and Selection. KIPS Transactions on Software and Data Engineering, 12, 10, (2023), 455-460. DOI: https://doi.org/10.3745/KTSDE.2023.12.10.455.