Can we benchmark Code Review studies? A systematic mapping study of methodology, dataset, and metric

Dong Wang, Yuki Ueda, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


Context: Code Review (CR) is the cornerstone of software quality assurance and a crucial practice in software development. As CR research matures, it can be difficult to keep track of best practices and the state of the art in methodology, dataset, and metric. Objective: This paper investigates the potential for benchmarking by collecting the methodologies, datasets, and metrics of CR studies. Methods: A systematic mapping study was conducted. A total of 112 studies from 19,847 papers published in high-impact venues between 2011 and 2019 were selected and analyzed. Results: First, we find that empirical evaluation is the most common methodology (65% of papers), with solution and experience being the least common methodologies. Second, we highlight that 50% of papers that use the quantitative method or mixed-method have the potential for replicability. Third, we identify 457 metrics that are grouped into sixteen core metric sets and applied to nine Software Engineering topics, showing that different research topics tend to use specific metric sets. Conclusion: We conclude that, at this stage, we cannot benchmark CR studies. Nevertheless, a common benchmark will help new researchers, including experts from other fields, innovate new techniques and build on top of already established methodologies. A full replication is available at

Original language: English
Article number: 111009
Journal: Journal of Systems and Software
Publication status: Published - Oct 2021
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture

