
Abstract

Formal specifications can help with program testing, optimization, refactoring, documentation, and, most importantly, debugging and repair. Unfortunately, formal specifications are difficult to write manually, while techniques that infer specifications automatically suffer from 90–99% false positive rates. Consequently, neither option is currently practical for most software development projects.

We present a novel technique that automatically infers partial correctness specifications with a very low false positive rate. We claim that existing specification miners yield false positives because they assign equal weight to all aspects of program behavior. By using additional information from the software engineering process, we are able to dramatically reduce this rate. For example, we grant less credence to duplicate code, infrequently tested code, and code that exhibits high turnover in the version control system.
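To make the intuition concrete, the following sketch (ours, not the authors' released implementation; all names, feature values, and the linear weighting are illustrative assumptions) shows how trustworthiness features drawn from the software engineering process might down-weight execution traces before mining two-event temporal specifications of the form "a must be followed by b":

```python
# Minimal illustrative sketch: weight each execution trace by trustworthiness
# features of the code that produced it (duplication, test coverage,
# version-control churn), then rank candidate two-event temporal
# specifications by trust-weighted support. Weights and features are assumed.

from collections import defaultdict
from dataclasses import dataclass
from typing import List

@dataclass
class Trace:
    events: List[str]      # sequence of API events, e.g. ["open", "read", "close"]
    duplication: float     # fraction of emitting code that is cloned elsewhere (0..1)
    test_coverage: float   # fraction of emitting code exercised by tests (0..1)
    churn: float           # normalized version-control churn of emitting code (0..1)

def trust(t: Trace) -> float:
    """Give less credence to duplicated, high-churn, poorly tested code."""
    return max(0.0, 1.0 - 0.4 * t.duplication - 0.4 * t.churn) * (0.2 + 0.8 * t.test_coverage)

def mine(traces: List[Trace]):
    """For each event pair (a, b), compute the trust-weighted fraction of
    traces containing a in which b also occurs afterwards."""
    followed = defaultdict(float)
    total = defaultdict(float)
    for t in traces:
        w = trust(t)
        seen = set()
        for i, a in enumerate(t.events):
            if a in seen:
                continue
            seen.add(a)
            total[a] += w
            for b in set(t.events[i + 1:]):
                followed[(a, b)] += w
    return sorted(((a, b, followed[(a, b)] / total[a]) for (a, b) in followed),
                  key=lambda c: -c[2])

if __name__ == "__main__":
    traces = [
        Trace(["open", "read", "close"], duplication=0.0, test_coverage=0.9, churn=0.1),
        # Cloned, untested, high-churn code: its open-without-close behavior is
        # down-weighted rather than counted against the candidate at full strength.
        Trace(["open", "read"], duplication=0.8, test_coverage=0.1, churn=0.9),
    ]
    for a, b, score in mine(traces):
        print(f"{a} -> {b}: trust-weighted support {score:.2f}")
```

Down-weighting untrustworthy traces, rather than discarding them outright, keeps rare but well-tested behavior in play while preventing cloned or high-churn code from dominating the ranking of candidate specifications.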

We evaluate our technique in two ways: as a preprocessing step for an existing specification miner and as part of novel specification inference algorithms. Our technique identifies which input is most indicative of program behavior, which allows off-the-shelf techniques to learn the same number of specifications using only 60% of their original input. Our inference approach has few false positives in practice, while still finding useful specifications on over 800,000 lines of code. When minimizing false alarms, we obtain a 5% false positive rate, an order-of-magnitude improvement over previous work. When used to find bugs, our mined specifications locate over 250 policy violations. To the best of our knowledge, this is the first specification miner with such a low false positive rate, and thus a low associated burden of manual inspection.

Keywords

False Positive Rate, Mining Algorithm, Execution Trace, Program Behavior, Version Control System


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Claire Le Goues¹
  • Westley Weimer¹

  1. University of Virginia, USA
