Fixing, preventing, and recovering from concurrency bugs

  • Review
  • Special Focus on High-Confidence Software Technologies
Science China Information Sciences

Abstract

Concurrency bugs are becoming widespread with the emerging ubiquity of multicore processors and multithreaded software. They manifest during production runs and lead to severe losses. Many effective concurrency-bug detection tools have been built. However, the dependability of multithreaded software does not improve until these bugs are handled statically or dynamically. This article discusses our recent progress on fixing, preventing, and recovering from concurrency bugs.

References

  1. Jin G L, Song L H, Zhang W, et al. Automated atomicity-violation fixing. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, San Jose, 2011. 389–400

  2. Jin G L, Zhang W, Deng D D, et al. Automated concurrency-bug fixing. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, Hollywood, 2012. 221–236

  3. Zhang M X, Wu Y W, Lu S, et al. AI: a lightweight system for tolerating concurrency bugs. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, 2014. 330–340

  4. Zhang W, de Kruijf M, Li A, et al. Conair: featherweight concurrency bug recovery via single-threaded idempotent execution. In: Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, 2013. 113–126

  5. Leveson N G, Turner C S. An investigation of the Therac-25 accidents. Computer, 1993, 26: 18–41

  6. Lu S, Park S, Seo E, et al. Learning from mistakes—a comprehensive study of real world concurrency bug characteristics. In: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, Seattle, 2008. 329–339

  7. Godefroid P, Nagappan N. Concurrency at Microsoft: an exploratory survey. Microsoft Research Technical Report MSR-TR-2008-75, 2008

  8. Yin Z N, Yuan D, Zhou Y Y, et al. How do fixes become bugs? In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, 2011. 26–36

  9. Flanagan C, Freund S N. Fasttrack: efficient and precise dynamic race detection. In: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, Dublin, 2009. 121–133

  10. Kasikci B, Zamfir C, Candea G. Data races vs. data race bugs: telling the difference with portend. In: Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, London, 2012. 185–198

  11. Netzer R H B, Miller B P. Improving the accuracy of data race detection. In: Proceedings of the 3rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Williamsburg, 1991. 133–144

  12. Savage S, Burrows M, Nelson G, et al. Eraser: a dynamic data race detector for multithreaded programs. In: Proceedings of the 16th ACM Symposium on Operating Systems Principles, Saint Malo, 1997. 27–37

  13. Yu Y, Rodeheffer T, Chen W. RaceTrack: efficient detection of data race conditions via adaptive tracking. In: Proceedings of the 20th ACM Symposium on Operating Systems Principles, Brighton, 2005. 221–234

  14. Chen F, Serbanuta T F, Rosu G. jpredictor: a predictive runtime analysis tool for java. In: Proceedings of the 30th International Conference on Software Engineering, Leipzig, 2008. 221–230

  15. Flanagan C, Freund S N. Atomizer: a dynamic atomicity checker for multithreaded programs. In: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Venice, 2004. 256–267

  16. Flanagan C, Qadeer S. A type and effect system for atomicity. In: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, San Diego, 2003. 338–349

  17. Flanagan C, Freund S N, Yi J. Velodrome: a sound and complete dynamic atomicity checker for multithreaded programs. In: Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, Tucson, 2008. 293–303

  18. Lu S, Tucek J, Qin F, et al. AVIO: detecting atomicity violations via access interleaving invariants. In: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, 2006. 37–48

  19. Xu M, Bodík R, Hill M D. A serializability violation detector for shared-memory server programs. In: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, Chicago, 2005. 1–14

  20. Gao Q, Zhang W B, Chen Z Z, et al. 2ndStrike: toward manifesting hidden concurrency typestate bugs. In: Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, Newport Beach, 2011. 239–250

  21. Shi Y, Park S, Yin Z N, et al. Do I use the wrong definition?: DefUse: definition-use invariants for detecting concurrency and sequential bugs. In: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Reno/Tahoe, 2010. 160–174

  22. Zhang W, Sun C, Lu S. ConMem: detecting severe concurrency bugs through an effect-oriented approach. In: Proceedings of the 15th Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems, Pittsburgh, 2010. 179–192

  23. Zhang W, Lim J, Olichandran R, et al. ConSeq: detecting concurrency bugs through sequential errors. In: Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, Newport Beach, 2011. 251–264

  24. Jula H, Tralamazza D, Zamfir C, et al. Deadlock immunity: enabling systems to defend against deadlocks. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, San Diego, 2008. 295–308

  25. Li T, Ellis C, Lebeck A, et al. Pulse: a dynamic deadlock detection mechanism using speculative execution. In: Proceedings of the Annual Conference on USENIX Annual Technical Conference, Anaheim, 2005. 3

  26. Wang Y, Kelly T, Kudlur M, et al. Gadara: dynamic deadlock avoidance for multithreaded programs. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, Berkeley, 2008. 281–294

  27. Lucia B, Ceze L. Finding concurrency bugs with context-aware communication graphs. In: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, New York, 2009. 553–563

  28. Musuvathi M, Qadeer S, Ball T, et al. Finding and reproducing heisenbugs in concurrent programs. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, San Diego, 2008. 267–280

  29. Qi S X, Muzahid A, Ahn W, et al. Dynamically detecting and tolerating if-condition data races. In: Proceedings of 20th IEEE International Symposium on High Performance Computer Architecture, Orlando, 2014. 120–131

  30. Yu J, Narayanasamy S, Pereira C, et al. Maple: a coverage-driven testing tool for multithreaded programs. In: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Tucson, 2012. 485–502

  31. Harris T, Fraser K. Language support for lightweight transactions. In: Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, Anaheim, 2003. 388–402

  32. Herlihy M, Moss J E B. Transactional memory: architectural support for lock-free data structures. In: Proceedings of the 20th Annual International Symposium on Computer Architecture, San Diego, 1993. 289–300

  33. Rajwar R, Goodman J R. Speculative lock elision: enabling highly concurrent multithreaded execution. In: Proceedings of the 34th Annual ACM/IEEE International Symposium on Microarchitecture, Austin, 2001. 294–305

  34. Park S, Lu S, Zhou Y Y. Ctrigger: exposing atomicity violation bugs from their finding places. In: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, Washington, 2009. 25–36

  35. Woo S C, Ohara M, Torrie E, et al. The SPLASH-2 programs: characterization and methodological considerations. In: Proceedings of the 22nd Annual International Symposium on Computer Architecture, Margherita Ligure, 1995. 24–36

  36. Lattner C, Adve V. LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, Palo Alto, 2004. 75–86

  37. Marino D, Musuvathi M, Narayanasamy S. Literace: effective sampling for lightweight data-race detection. In: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, Dublin, 2009. 134–143

  38. Serebryany K, Bruening D, Potapenko A, et al. Addresssanitizer: a fast address sanity checker. In: Proceedings of the 2012 USENIX Conference on Annual Technical Conference, Boston, 2012. 28–28

  39. Qin F, Tucek J, Sundaresan J, et al. Rx: treating bugs as allergies - a safe method to survive software failures. In: Proceedings of the 20th ACM Symposium on Operating Systems Principles, Brighton, 2005. 235–248

  40. Sidiroglou S, Laadan O, Perez C, et al. Assure: automatic software self-healing using rescue points. In: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, Washington, 2009. 37–48

  41. Veeraraghavan K, Chen P M, Flinn J, et al. Detecting and surviving data races using complementary schedules. In: Proceedings of the 23rd ACM Symposium on Operating Systems Principles, Cascais, 2011. 369–384

  42. Li Z M, Tan L, Wang X H, et al. An empirical study of bug characteristics in modern open source software. In: Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability, San Jose, 2006. 25–33

  43. Vaziri M, Tip F, Dolby J. Associating synchronization constraints with data in an object-oriented language. In: Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, 2006. 334–345

  44. Volos H, Tack A J, Swift M M, et al. Applying transactional memory to concurrency bugs. In: Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, London, 2012. 211–222

  45. Le Goues C, Dewey-Vogt M, Forrest S, et al. A systematic study of automated program repair: fixing 55 out of 105 bugs for $8 each. In: Proceedings of the 34th International Conference on Software Engineering, Zurich, 2012. 3–13

  46. Logozzo F, Ball T. Modular and verified automatic program repair. In: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Tucson, 2012. 133–146

  47. Liu P, Zhang C. Axis: automatically fixing atomicity violations through solving control constraints. In: Proceedings of the 34th International Conference on Software Engineering, Zurich, 2012. 299–309

  48. Liu P, Tripp O, Zhang C. Grail: context-aware fixing of concurrency bugs. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, 2014. 318–329

  49. Perkins J H, Kim S, Larsen S, et al. Automatically patching errors in deployed software. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, Big Sky, 2009. 87–102

  50. Wu J Y, Cui H M, Yang J F. Bypassing races in live applications with execution filters. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, Vancouver, 2010. 1–13

  51. Lucia B, Devietti J, Strauss K, et al. Atom-aid: detecting and surviving atomicity violations. In: Proceedings of the 35th Annual International Symposium on Computer Architecture, Washington, 2008. 277–288

  52. Yu J, Narayanasamy S. A case for an interleaving constrained shared-memory multi-processor. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, Austin, 2009. 325–336

  53. Yu J, Narayanasamy S. Tolerating concurrency bugs using transactions as lifeguards. In: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, Washington, 2010. 263–274

  54. Lucia B, Ceze L. Cooperative empirical failure avoidance for multithreaded programs. In: Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, 2013. 39–50

  55. Candea G, Kawamoto S, Fujiki Y, et al. Microreboot - a technique for cheap recovery. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, San Francisco, 2004. 3

  56. Erickson J, Musuvathi M, Burckhardt S, et al. Effective data-race detection for the kernel. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, Vancouver, 2010. 1–16

  57. Chew L, Lie D. Kivati: fast detection and prevention of atomicity violations. In: Proceedings of the 5th European Conference on Computer Systems, Paris, 2010. 307–320

  58. Lu S, Park S, Hu C F, et al. MUVI: automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In: Proceedings of the 21st ACM Symposium on Operating Systems Principles, Stevenson, 2007. 103–116

  59. Bergan T, Hunt N, Ceze L, et al. Deterministic process groups in dOS. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, Vancouver, 2010. 1–16

  60. Cui H M, Wu J Y, Gallagher J, et al. Efficient deterministic multithreading through schedule relaxation. In: Proceedings of the 23rd ACM Symposium on Operating Systems Principles, Cascais, 2011. 337–351

  61. Liu T P, Curtsinger C, Berger E D. Dthreads: efficient deterministic multithreading. In: Proceedings of the 23rd ACM Symposium on Operating Systems Principles, Cascais, 2011. 327–336

  62. Olszewski M, Ansel J, Amarasinghe S. Kendo: efficient deterministic multithreading in software. In: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, Washington, 2009. 97–108

  63. Aviram A, Weng S-C, Hu S, et al. Efficient system-enforced deterministic parallelism. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation. Berkeley: USENIX Association Berkeley, 2010. 1–16

  64. McCloskey B, Zhou F, Gay D, et al. Autolocker: synchronization inference for atomic sections. In: Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, 2006. 346–358

  65. Weeratunge D, Zhang X Y, Jagannathan S. Accentuating the positive: atomicity inference and enforcement using correct executions. In: Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, Portland, 2011. 19–34

  66. Upadhyaya G, Midkiff S P, Pai V S. Automatic atomic region identification in shared memory SPMD programs. In: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Reno/Tahoe, 2010. 652–670

  67. Misailovic S, Kim D, Rinard M. Parallelizing Sequential Programs with Statistical Accuracy Tests. MIT Technical Report, MIT-CSAIL-TR-2010-038. 2010

  68. Navabi A, Zhang X Y, Jagannathan S. Quasi-static scheduling for safe futures. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Salt Lake City, 2008. 23–32

Author information

Corresponding author

Correspondence to Shan Lu.

Additional information

Authors are listed in alphabetical order. This article is based on the authors’ previous papers [1–4], which were written when Marc de Kruijf, Shan Lu, and Wei Zhang were at the University of Wisconsin-Madison, and Shanxiang Qi was at the University of Illinois at Urbana-Champaign.

About this article

Cite this article

Deng, D., Jin, G., de Kruijf, M. et al. Fixing, preventing, and recovering from concurrency bugs. Sci. China Inf. Sci. 58, 1–18 (2015). https://doi.org/10.1007/s11432-015-5315-9

  • DOI: https://doi.org/10.1007/s11432-015-5315-9
