A lightweight and portable approach to making concurrent failures reproducible

Qingzhou Luo, Sai Zhang, Jianjun Zhao, Min Hu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

8 Citations (Scopus)

Abstract

Concurrent programs often exhibit bugs due to unintended interference among concurrent threads. Such bugs are often hard to reproduce because they typically occur only under very specific interleavings of the executing threads. It is very hard to fix a bug (or software failure) in a concurrent program without being able to reproduce it. In this paper, we present an approach, called ConCrash, that automatically and deterministically reproduces concurrent failures by recording the logical thread schedule and generating unit tests. For a given bug (failure), ConCrash records the logical thread scheduling order and preserves object states in memory at runtime. Then, ConCrash reproduces the failure offline using only the saved information, without the need for JVM-level or OS-level support. To reduce the runtime performance overhead, ConCrash employs a static data race detection technique to report potential race conditions, and instruments only those places. We implemented the ConCrash approach in a prototype tool for Java and evaluated it on a number of multi-threaded Java benchmarks. As a result, we successfully reproduced a number of real concurrent bugs (e.g., deadlocks, data races, and atomicity violations) with acceptable overhead.
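The idea of replaying a recorded logical thread schedule can be illustrated with a small sketch. This is not ConCrash's implementation: the class `ScheduleReplaySketch`, the `Replayer` helper, the thread names `T1` and `T2`, and the hand-written schedules are all hypothetical, and the schedule is assumed to have been recorded earlier at instrumented shared accesses. The sketch only shows how forcing threads through those accesses in a fixed order makes a lost-update race deterministic.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: replays a recorded order of instrumented shared accesses.
class ScheduleReplaySketch {

    /** Forces threads through instrumented points in the recorded order. */
    static final class Replayer {
        private final List<String> schedule; // thread names, one entry per access
        private int next = 0;                // index of the next access to allow

        Replayer(List<String> schedule) {
            this.schedule = schedule;
        }

        /** Blocks until it is this thread's turn, performs the access, then advances. */
        synchronized void step(String threadName, Runnable access) throws InterruptedException {
            while (next < schedule.size() && !schedule.get(next).equals(threadName)) {
                wait();
            }
            access.run();
            next++;
            notifyAll();
        }
    }

    // Shared state; it is only touched inside Replayer.step, which is synchronized.
    static int counter;

    /** A non-atomic increment split into an instrumented read and an instrumented write. */
    static Thread worker(String name, Replayer replayer) {
        return new Thread(() -> {
            int[] local = new int[1];
            try {
                replayer.step(name, () -> local[0] = counter);     // read shared counter
                replayer.step(name, () -> counter = local[0] + 1); // write back stale value + 1
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
    }

    /** Runs two workers under the given schedule and returns the final counter value. */
    static int replay(List<String> schedule) {
        counter = 0;
        Replayer replayer = new Replayer(schedule);
        Thread t1 = worker("T1", replayer);
        Thread t2 = worker("T2", replayer);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        // Interleaved schedule: both threads read 0 before either writes (lost update).
        System.out.println("interleaved: " + replay(Arrays.asList("T1", "T2", "T1", "T2")));
        // Serial schedule: T1 finishes its increment before T2 starts.
        System.out.println("serial: " + replay(Arrays.asList("T1", "T1", "T2", "T2")));
    }
}
```

Under the interleaved schedule the final counter is always 1, and under the serial schedule it is always 2, regardless of how the operating system happens to schedule the two threads. A real tool would additionally need to capture the schedule at runtime and restore the relevant object states, which this sketch leaves out.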

Original language: English
Title of host publication: Fundamental Approaches to Software Engineering - 13th International Conference, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Proceedings
Pages: 323-337
Number of pages: 15
DOIs: https://doi.org/10.1007/978-3-642-12029-9_23
Publication status: Published - Apr 29 2010
Externally published: Yes
Event: 13th International Conference on Fundamental Approaches to Software Engineering, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010 - Paphos, Cyprus
Duration: Mar 20 2010 to Mar 28 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6013 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th International Conference on Fundamental Approaches to Software Engineering, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010
Country: Cyprus
City: Paphos
Period: 3/20/10 to 3/28/10

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Luo, Q., Zhang, S., Zhao, J., & Hu, M. (2010). A lightweight and portable approach to making concurrent failures reproducible. In Fundamental Approaches to Software Engineering - 13th International Conference, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Proceedings (pp. 323-337). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6013 LNCS). https://doi.org/10.1007/978-3-642-12029-9_23