An Empirical Study of Source Code Detection Using Image Classification

Juntong Hong, Osamu Mizuno, Masanari Kondo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Machine learning techniques can detect the programming language of a complete source code file with high accuracy. For a short piece of source code (called a snippet), language detection is also required, for example to append tags automatically on a question-and-answer site such as Stack Overflow. However, detecting the programming language of a snippet is still a challenge because a snippet is not a complete source file. Experienced developers can usually identify the language of such a snippet at a glance. A task that a human solves this easily may also be solvable by image classification using deep learning. We therefore propose a programming language detection method based on deep-learning image classification, and we evaluate the proposed model on data from an actual Q&A site. The experimental results demonstrate that the method detects the correct programming language of snippets with over 90% accuracy.

Original language: English
Title of host publication: Proceedings - 2019 10th International Workshop on Empirical Software Engineering in Practice, IWESEP 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781728155906
Publication status: Published - Dec 2019
Externally published: Yes
Event: 10th International Workshop on Empirical Software Engineering in Practice, IWESEP 2019 - Tokyo, Japan
Duration: Dec 13 2019 to Dec 14 2019

Publication series

Name: Proceedings - 2019 10th International Workshop on Empirical Software Engineering in Practice, IWESEP 2019

Conference

Conference: 10th International Workshop on Empirical Software Engineering in Practice, IWESEP 2019
Country: Japan
City: Tokyo
Period: 12/13/19 to 12/14/19

All Science Journal Classification (ASJC) codes

  • Software
  • Safety, Risk, Reliability and Quality
