Identifying change patterns of API misuses from code changes

Abstract

Library and framework APIs are difficult to learn and use, and misusing them leads to unexpected software behavior or bugs. Various API mining techniques have therefore been introduced to mine API usage patterns that capture the co-occurrence of API calls or the pre-conditions of API calls. However, they fail to mine patterns about an API call itself (e.g., whether the arguments of the API call are correctly set and whether the API is suitably chosen over other similar APIs). To bridge this gap, we propose Cpam, which uses historical code changes to identify change patterns that fix API misuses; each pattern takes the form of a pair of APIs before and after a code change. Given a set of target APIs and a corpus of open-source projects, Cpam first selects the commits in the corpus that potentially fix API misuses, then extracts the changes to API misuses in each selected commit, and finally identifies change patterns of API misuses. We implement Cpam for Java and conduct a large-scale evaluation, targeting Java SE APIs and using a corpus of 1162 Java projects. Our experimental results demonstrate Cpam’s effectiveness and efficiency. By applying the identified change patterns to bug detection, we find 44 new bugs, 18 of which have been confirmed and fixed.
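
To make the notion of a change pattern concrete, the following is a minimal Java sketch of the final step only, assuming the (before, after) API pairs have already been extracted from fixing commits. The abstract does not say how Cpam decides that a pair is a pattern, so the frequency threshold `minSupport` is purely an illustrative assumption, as are the names `ChangePatternSketch`, `ApiChange`, and `frequentPatterns`. The sample pairs use real Java SE signatures but are invented inputs, not results reported by the authors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from Cpam itself.
class ChangePatternSketch {

    // One extracted change: the API call before and after a fixing commit.
    static final class ApiChange {
        final String before;
        final String after;

        ApiChange(String before, String after) {
            this.before = before;
            this.after = after;
        }
    }

    // Counts identical (before, after) pairs and keeps those seen at least
    // minSupport times, in first-seen order. The threshold is an assumption
    // made for illustration, not the criterion used by Cpam.
    static List<String> frequentPatterns(List<ApiChange> changes, int minSupport) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (ApiChange c : changes) {
            counts.merge(c.before + " -> " + c.after, 1, Integer::sum);
        }
        List<String> patterns = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minSupport) {
                patterns.add(e.getKey());
            }
        }
        return patterns;
    }

    public static void main(String[] args) {
        // Invented sample input: two commits add an explicit Charset to a
        // String constructor, one adds it to getBytes.
        List<ApiChange> changes = new ArrayList<>();
        changes.add(new ApiChange("new String(byte[])", "new String(byte[], Charset)"));
        changes.add(new ApiChange("String.getBytes()", "String.getBytes(Charset)"));
        changes.add(new ApiChange("new String(byte[])", "new String(byte[], Charset)"));
        System.out.println(frequentPatterns(changes, 2));
    }
}
```

With a threshold of 2, only the `String` constructor pair survives in this sample, since the `getBytes` pair occurs once. A pattern of this shape could then be matched against call sites that still use the "before" API, which is the kind of bug detection the abstract describes.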

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61802067).

Author information

Corresponding author

Correspondence to Bihuan Chen.

About this article

Cite this article

Liu, W., Chen, B., Peng, X. et al. Identifying change patterns of API misuses from code changes. Sci. China Inf. Sci. 64, 132101 (2021). https://doi.org/10.1007/s11432-019-2745-5

  • DOI: https://doi.org/10.1007/s11432-019-2745-5

Keywords

  • API misuses
  • API mining
  • AST differencing
  • change patterns
  • bug fixing