Ties between mined structural patterns in program and their identifier names

Yoshiki Mashima, Sachio Hirokawa, Kazuhiro Takeuchi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Identifier names in readable and maintainable source codes are always descriptive. These names are given based on the implicit knowledge of experienced programmers. In this paper, we propose a structural pattern mining method based on support vector machines (SVM) for source codes. We extract 1,000 method names in object-oriented source codes collected from online software repositories and create 1,000 datasets labeled by positive and negative class. The structural features used for the input feature vectors to the SVM learning are designed for representing partial characteristics in the abstract syntax tree (AST) parsed from a source code. Applying this method, we made an F1 score list of the 1,000 method names, which shows the degree of patterning of each name, by using our structural features. From the list, we confirmed structural patterns were strongly associated with specific method names. A qualitative evaluation of method names was also conducted by mapping the structural feature vector of each program example to the two-dimensional plane in the same way as a previous major study. From the evaluation, we confirmed that the contrasting structure among the programs corresponds to the names given to programs. Furthermore, we show examples of visualization of structural patterns using structural features extracted by feature selection.
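
The pipeline in the abstract (structural features from an AST, one positive/negative dataset per method name, F1 score per name) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it uses Python's standard `ast` module as a stand-in for the object-oriented sources in the paper, the parent-child node-type pair is an assumed feature design, and the function names `structural_features`, `one_vs_rest_labels`, and `f1_score` are hypothetical. The SVM training step is omitted; any binary classifier would take its place between labeling and scoring.

```python
import ast
from collections import Counter


def structural_features(source: str) -> Counter:
    """Count (parent type > child type) pairs in the AST of `source`.

    Assumption: a parent-child pair is one simple way to capture a
    partial characteristic of the tree; the paper's features may differ.
    """
    feats = Counter()
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            feats[f"{type(parent).__name__}>{type(child).__name__}"] += 1
    return feats


def one_vs_rest_labels(names, target):
    """Label a method +1 if its name equals `target`, otherwise -1."""
    return [1 if name == target else -1 for name in names]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the +1 class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy method body; a real dataset would hold many methods per name.
    getter = "def get_name(self):\n    return self.name\n"
    print(structural_features(getter))
    print(one_vs_rest_labels(["get_name", "set_name", "get_name"], "get_name"))
```

Building one labeled dataset per method name keeps each classification problem binary, so a per-name F1 score can be read as how strongly that name is tied to a structural pattern.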

Original language: English
Title of host publication: Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings
Editors: Hirosato Seki, Masahiro Inuiguchi, Canh Hao Nguyen, Van-Nam Huynh
Publisher: Springer Verlag
Pages: 335-346
Number of pages: 12
ISBN (Print): 9783030148140
DOI: https://doi.org/10.1007/978-3-030-14815-7_28
Publication status: Published - Jan 1 2019
Event: 7th International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, IUKM 2019 - Nara, Japan
Duration: Mar 27 2019 to Mar 29 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11471 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, IUKM 2019
Country: Japan
City: Nara
Period: 3/27/19 to 3/29/19

Fingerprint

Tie
Support vector machines
Learning systems
Feature extraction
Visualization
Feature Vector
Support Vector Machine
Evaluation
Patterning
Object-oriented
Repository
Feature Selection
Mining
Machine Learning
Partial
Software

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Mashima, Y., Hirokawa, S., & Takeuchi, K. (2019). Ties between mined structural patterns in program and their identifier names. In H. Seki, M. Inuiguchi, C. H. Nguyen, & V-N. Huynh (Eds.), Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings (pp. 335-346). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11471 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-030-14815-7_28

Ties between mined structural patterns in program and their identifier names. / Mashima, Yoshiki; Hirokawa, Sachio; Takeuchi, Kazuhiro.

Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings. ed. / Hirosato Seki; Masahiro Inuiguchi; Canh Hao Nguyen; Van-Nam Huynh. Springer Verlag, 2019. p. 335-346 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11471 LNAI).

Mashima, Y, Hirokawa, S & Takeuchi, K 2019, Ties between mined structural patterns in program and their identifier names. in H Seki, M Inuiguchi, CH Nguyen & V-N Huynh (eds), Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11471 LNAI, Springer Verlag, pp. 335-346, 7th International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, IUKM 2019, Nara, Japan, 3/27/19. https://doi.org/10.1007/978-3-030-14815-7_28
Mashima Y, Hirokawa S, Takeuchi K. Ties between mined structural patterns in program and their identifier names. In Seki H, Inuiguchi M, Nguyen CH, Huynh V-N, editors, Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings. Springer Verlag. 2019. p. 335-346. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-14815-7_28
Mashima, Yoshiki ; Hirokawa, Sachio ; Takeuchi, Kazuhiro. / Ties between mined structural patterns in program and their identifier names. Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings. editor / Hirosato Seki ; Masahiro Inuiguchi ; Canh Hao Nguyen ; Van-Nam Huynh. Springer Verlag, 2019. pp. 335-346 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{2b2c9e8541564b76bd942f5d70afec43,
title = "Ties between mined structural patterns in program and their identifier names",
abstract = "Identifier names in readable and maintainable source codes are always descriptive. These names are given based on the implicit knowledge of experienced programmers. In this paper, we propose a structural pattern mining method based on support vector machines (SVM) for source codes. We extract 1,000 method names in object-oriented source codes collected from online software repositories and create 1,000 datasets labeled by positive and negative class. The structural features used for the input feature vectors to the SVM learning are designed for representing partial characteristics in the abstract syntax tree (AST) parsed from a source code. Applying this method, we made an F1 score list of the 1,000 method names, which shows the degree of patterning of each name, by using our structural features. From the list, we confirmed structural patterns were strongly associated with specific method names. A qualitative evaluation of method names was also conducted by mapping the structural feature vector of each program example to the two-dimensional plane in the same way as a previous major study. From the evaluation, we confirmed that the contrasting structure among the programs corresponds to the names given to programs. Furthermore, we show examples of visualization of structural patterns using structural features extracted by feature selection.",
author = "Yoshiki Mashima and Sachio Hirokawa and Kazuhiro Takeuchi",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-14815-7_28",
language = "English",
isbn = "9783030148140",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "335--346",
editor = "Hirosato Seki and Masahiro Inuiguchi and Nguyen, {Canh Hao} and Van-Nam Huynh",
booktitle = "Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings",
address = "Germany",

}

TY - GEN

T1 - Ties between mined structural patterns in program and their identifier names

AU - Mashima, Yoshiki

AU - Hirokawa, Sachio

AU - Takeuchi, Kazuhiro

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Identifier names in readable and maintainable source codes are always descriptive. These names are given based on the implicit knowledge of experienced programmers. In this paper, we propose a structural pattern mining method based on support vector machines (SVM) for source codes. We extract 1,000 method names in object-oriented source codes collected from online software repositories and create 1,000 datasets labeled by positive and negative class. The structural features used for the input feature vectors to the SVM learning are designed for representing partial characteristics in the abstract syntax tree (AST) parsed from a source code. Applying this method, we made an F1 score list of the 1,000 method names, which shows the degree of patterning of each name, by using our structural features. From the list, we confirmed structural patterns were strongly associated with specific method names. A qualitative evaluation of method names was also conducted by mapping the structural feature vector of each program example to the two-dimensional plane in the same way as a previous major study. From the evaluation, we confirmed that the contrasting structure among the programs corresponds to the names given to programs. Furthermore, we show examples of visualization of structural patterns using structural features extracted by feature selection.

UR - http://www.scopus.com/inward/record.url?scp=85064214136&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064214136&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-14815-7_28

DO - 10.1007/978-3-030-14815-7_28

M3 - Conference contribution

AN - SCOPUS:85064214136

SN - 9783030148140

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 335

EP - 346

BT - Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings

A2 - Seki, Hirosato

A2 - Inuiguchi, Masahiro

A2 - Nguyen, Canh Hao

A2 - Huynh, Van-Nam

PB - Springer Verlag

ER -