Context-Dependent Video Data Augmentation for Human Instance Segmentation


KIPS Transactions on Software and Data Engineering, Vol. 12, No. 5, pp. 217-228, May 2023
https://doi.org/10.3745/KTSDE.2023.12.5.217
Keywords: Drama Video, Human Instance Segmentation, Class Imbalance, Video Data Augmentation, Spatio-temporal Context
Abstract

Video instance segmentation is a highly complex visual task: it requires not only segmenting object instances in each frame of a video, but also tracking those instances accurately across the frame sequence. In particular, human instance segmentation in drama videos has the unique requirement of accurately tracking several main characters who interact across various places and times. It also suffers from a class imbalance problem, because main characters appear far more frequently than supporting or auxiliary characters in drama videos. In this paper, we introduce a new human instance dataset called MHIS, built from videos of the drama Miseang, and propose a novel video data augmentation method, CDVA, to overcome the data imbalance between character classes. Unlike previous video data augmentation methods, CDVA generates more realistic augmented videos by using the rich spatio-temporal context embedded in videos to decide the optimal location within the background clip at which to insert a target human instance. As a result, CDVA can improve the performance of a deep neural network model for video instance segmentation. Through both quantitative and qualitative experiments on the MHIS dataset, we demonstrate the usefulness and effectiveness of the proposed video data augmentation method.
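
To make the insertion step concrete, the following is a minimal sketch of pasting a masked human instance into each frame of a background clip. It is not the CDVA implementation: the abstract does not specify how the optimal location is computed, so the per-frame `locations` are assumed to be given. The names `paste_instance` and `augment_clip`, and the representation of frames as nested lists of grayscale values, are hypothetical choices made only for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame as rows of pixel values (assumption)
Mask = List[List[int]]   # binary mask, 1 where the instance is present


def paste_instance(background: Frame, patch: Frame, mask: Mask,
                   top: int, left: int) -> Tuple[Frame, Mask]:
    """Paste the masked pixels of `patch` onto a copy of one background frame.

    Returns the augmented frame and a full-size instance mask for it.
    Pixels that would fall outside the frame are silently clipped.
    """
    height, width = len(background), len(background[0])
    out = [row[:] for row in background]           # do not modify the input
    out_mask = [[0] * width for _ in range(height)]
    for r, (patch_row, mask_row) in enumerate(zip(patch, mask)):
        for c, (pixel, keep) in enumerate(zip(patch_row, mask_row)):
            y, x = top + r, left + c
            if keep and 0 <= y < height and 0 <= x < width:
                out[y][x] = pixel
                out_mask[y][x] = 1
    return out, out_mask


def augment_clip(clip: List[Frame], patches: List[Frame], masks: List[Mask],
                 locations: List[Tuple[int, int]]) -> Tuple[List[Frame], List[Mask]]:
    """Insert one instance track into every frame of a background clip.

    `locations` holds one (top, left) position per frame. Choosing these
    positions from spatio-temporal context is the part this sketch leaves out.
    """
    results = [
        paste_instance(frame, patch, mask, top, left)
        for frame, patch, mask, (top, left) in zip(clip, patches, masks, locations)
    ]
    frames = [frame for frame, _ in results]
    instance_masks = [inst_mask for _, inst_mask in results]
    return frames, instance_masks
```

Returning the full-size instance mask alongside each augmented frame keeps the segmentation annotation aligned with the pasted pixels, which is what a video instance segmentation model would need for training on the augmented clip.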


Cite this article
[IEEE Style]
H. Chun, J. Lee, I. Kim, "Context-Dependent Video Data Augmentation for Human Instance Segmentation," KIPS Transactions on Software and Data Engineering, vol. 12, no. 5, pp. 217-228, 2023. DOI: https://doi.org/10.3745/KTSDE.2023.12.5.217.

[ACM Style]
HyunJin Chun, JongHun Lee, and InCheol Kim. 2023. Context-Dependent Video Data Augmentation for Human Instance Segmentation. KIPS Transactions on Software and Data Engineering, 12, 5, (2023), 217-228. DOI: https://doi.org/10.3745/KTSDE.2023.12.5.217.