Source Code Identification Using Deep Neural Network


KIPS Transactions on Software and Data Engineering, Vol. 8, No. 9, pp. 373-378, Sep. 2019
https://doi.org/10.3745/KTSDE.2019.8.9.373
Keywords: Computer Forensic, Frequency Based Embedding, TF-IDF, Deep Learning, CNN
Abstract

Because much source code is openly available online, indiscriminate plagiarism and copyright infringement have become problems. Source code written repeatedly by the same author may carry a unique fingerprint that reflects the author's programming habits. This paper identifies authors by training a deep neural network on program sources from Google Code Jam. Each author's source code is first vectorized by a preprocessor, using either a prediction-based embedding or a frequency-based approach such as TF-IDF, and a deep neural network is then trained on the resulting vectors to identify the original author of a program. In addition, a language-independent learning system was built on this preprocessor and compared with existing learning methods. Among the configurations compared, models combining TF-IDF with a deep neural network performed better than those using other preprocessing or other learning methods.
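
To make the frequency-based preprocessing step concrete, the following is a minimal sketch of TF-IDF vectorization over source code tokens. It is not the authors' implementation: the `tokenize` regular expression, the smoothed IDF formula, and the function names are assumptions chosen for illustration, and the abstract does not specify how the paper tokenizes code or weights terms. The tokenizer relies only on generic lexical classes (identifiers, numbers, punctuation), which is one possible way to keep the step language-independent.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split source text into identifiers/keywords, digit runs, and single
    punctuation characters. Hypothetical tokenizer, not taken from the paper."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document.

    Term frequency is the token count divided by the document length.
    Inverse document frequency uses a smoothed form,
    log((1 + N) / (1 + df)) + 1, which is an assumption of this sketch.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(documents)

    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {
        tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
        for tok in vocab
    }

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


if __name__ == "__main__":
    # Two toy snippets standing in for programs by two different authors.
    samples = [
        "for (int i = 0; i < n; i++) sum += a[i];",
        "while (k--) { total = total + x; }",
    ]
    vocab, vectors = tfidf_vectors(samples)
    print(len(vocab), "tokens in vocabulary")
    print("weight of 'for' in sample 0:", vectors[0][vocab.index("for")])
```

Vectors of this kind, one per program, would then serve as the input features for a classifier such as the deep neural network described in the abstract, with the author as the label.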



Cite this article
[IEEE Style]
J. Rhim and T. Abuhmed, "Source Code Identification Using Deep Neural Network," KIPS Transactions on Software and Data Engineering, vol. 8, no. 9, pp. 373-378, 2019. DOI: https://doi.org/10.3745/KTSDE.2019.8.9.373.

[ACM Style]
Jisu Rhim and Tamer Abuhmed. 2019. Source Code Identification Using Deep Neural Network. KIPS Transactions on Software and Data Engineering, 8, 9, (2019), 373-378. DOI: https://doi.org/10.3745/KTSDE.2019.8.9.373.