High-impact defects: A study of breakage and surprise defects

Emad Shihab, Audris Mockus, Yasutaka Kamei, Bram Adams, Ahmed E. Hassan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

55 Citations (Scopus)

Abstract

The relationship between various software-related phenomena (e.g., code complexity) and post-release software defects has been thoroughly examined. However, to date, defect predictions have seen limited adoption in practice. The most commonly cited reason is that a prediction identifies too much code to review without distinguishing the impact of the defects. Our aim is to address this drawback by focusing on defects that have a high impact on customers and practitioners. Customers are highly impacted by defects that break pre-existing functionality (breakage defects), whereas practitioners are caught off-guard by defects in files that had relatively few pre-release changes (surprise defects). The large commercial software system that we study already had an established concept of breakages as the highest-impact defects; the concept of surprises, however, is novel and not as well established. We find that surprise defects are related to incomplete requirements and that the common assumption that a fix is caused by a previous change does not hold in this project. We then fit prediction models that are effective at identifying files containing breakages and surprises. The number of pre-release defects and file size are good indicators of breakages, whereas the number of co-changed files and the amount of time between the latest pre-release change and the release date are good indicators of surprises. Although our prediction models are effective at identifying files that have breakages and surprises, we learn that the prediction should also identify the nature or type of defects, with each type being specific enough to be easily identified and repaired.
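As a rough illustration of the surprise-defect definition in the abstract (post-release defects in files that had relatively few pre-release changes), the following Python sketch flags such files. The `FileHistory` record, its field names, the sample file names, and the `max_pre_release_changes` threshold are all hypothetical; the abstract does not state the cut-off or the data layout used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHistory:
    """Per-file counts for one release (hypothetical layout, not from the paper)."""
    path: str
    pre_release_changes: int
    post_release_defects: int


def surprise_files(histories, max_pre_release_changes=1):
    """Return paths of files with post-release defects but few pre-release changes.

    The threshold is an illustrative assumption; the study's actual cut-off
    is not given in the abstract.
    """
    return [
        h.path
        for h in histories
        if h.post_release_defects > 0
        and h.pre_release_changes <= max_pre_release_changes
    ]


# Made-up sample data, for illustration only.
histories = [
    FileHistory("billing.c", pre_release_changes=12, post_release_defects=3),
    FileHistory("routing.c", pre_release_changes=0, post_release_defects=2),
    FileHistory("logging.c", pre_release_changes=1, post_release_defects=0),
]
print(surprise_files(histories))
```

Raising `max_pre_release_changes` widens the set of files treated as surprises, so the threshold would need to be calibrated against a project's own change history.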

Original language: English
Title of host publication: SIGSOFT/FSE'11 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering
Pages: 300-310
Number of pages: 11
DOI: https://doi.org/10.1145/2025113.2025155
Publication status: Published - Sep 30 2011
Externally published: Yes
Event: 19th ACM SIGSOFT Symposium on Foundations of Software Engineering, SIGSOFT/FSE'11 - Szeged, Hungary
Duration: Sep 5 2011 to Sep 9 2011

Publication series

Name: SIGSOFT/FSE 2011 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering

Other

Other: 19th ACM SIGSOFT Symposium on Foundations of Software Engineering, SIGSOFT/FSE'11
Country: Hungary
City: Szeged
Period: 9/5/11 to 9/9/11

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Shihab, E., Mockus, A., Kamei, Y., Adams, B., & Hassan, A. E. (2011). High-impact defects: A study of breakage and surprise defects. In SIGSOFT/FSE'11 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering (pp. 300-310). (SIGSOFT/FSE 2011 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering). https://doi.org/10.1145/2025113.2025155

@inproceedings{e5890e6ff3ed4c2d8e49686c07ad50c0,
title = "High-impact defects: A study of breakage and surprise defects",
author = "Emad Shihab and Audris Mockus and Yasutaka Kamei and Bram Adams and Hassan, {Ahmed E.}",
year = "2011",
month = "9",
day = "30",
doi = "10.1145/2025113.2025155",
language = "English",
isbn = "9781450304436",
series = "SIGSOFT/FSE 2011 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering",
pages = "300--310",
booktitle = "SIGSOFT/FSE'11 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering",

}

TY - GEN

T1 - High-impact defects

T2 - A study of breakage and surprise defects

AU - Shihab, Emad

AU - Mockus, Audris

AU - Kamei, Yasutaka

AU - Adams, Bram

AU - Hassan, Ahmed E.

PY - 2011/9/30

Y1 - 2011/9/30

UR - http://www.scopus.com/inward/record.url?scp=80053190898&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053190898&partnerID=8YFLogxK

U2 - 10.1145/2025113.2025155

DO - 10.1145/2025113.2025155

M3 - Conference contribution

AN - SCOPUS:80053190898

SN - 9781450304436

T3 - SIGSOFT/FSE 2011 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering

SP - 300

EP - 310

BT - SIGSOFT/FSE'11 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering

ER -