Skip to main content

Relational String Abstract Domains

  • 265 Accesses

Part of the Lecture Notes in Computer Science book series (LNTCS,volume 13182)

Abstract

In modern programming languages, more and more functionalities, such as reflection and data interchange, rely on string values. String analysis statically computes the set of string values that are possibly assigned to a variable, and it involves a certain degree of approximation. During the last decade, several abstract domains approximating string values have been introduced and applied to statically analyze programs. However, most of them are not precise enough to track relational information between string variables whose value is statically unknown (e.g., user input), causing the loss of relevant knowledge about their possible values. This paper introduces a generic approach to formalize relational string abstract domains based on ordering relationships. We instantiate it to several domains built upon different well-known string orders (e.g., substring). We implemented the domain based on the substring ordering into a prototype static analyzer for Go, and we experimentally evaluated its precision and performance on some real-world case studies.

Keywords

  • Relational abstract domains
  • Static analysis
  • String analysis
  • Abstract interpretation

This is a preview of subscription content, access via your institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-030-94583-1_2
  • Chapter length: 23 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   79.99
Price excludes VAT (USA)
  • ISBN: 978-3-030-94583-1
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   99.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
Fig. 7.

Notes

  1. 1.

    secName is the result of a slight modification made to the function available at https://www.codota.com/code/java/classes/java.lang.String.

  2. 2.

    Available at https://github.com/UniVE-SSV/go-lisa.

  3. 3.

    In general, while \(\preceq \) (order on string variables) can be a pre or partial order, \(\sqsubseteq _ \mathcal {A}\) (order on the abstract domain \( \mathcal {A}\)) is always a partial order.

  4. 4.

    Full details about how the constant propagation analysis works are reported in [32].

  5. 5.

    maxLen(x) returns the maximum length of the string recognized by the automaton abstracting x if it is finite, \(+\infty \) otherwise.

References

  1. Abdulla, P.A., Atig, M.F., Diep, B.P., Holík, L., Janků, P.: Chain-free string constraints. In: Chen, Y.-F., Cheng, C.-H., Esparza, J. (eds.) ATVA 2019. LNCS, vol. 11781, pp. 277–293. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31784-3_16

    CrossRef  Google Scholar 

  2. Amadini, R., et al.: Reference abstract domains and applications to string analysis. Fundam. Inform. 158(4), 297–326 (2018). https://doi.org/10.3233/FI-2018-1650

    MathSciNet  CrossRef  MATH  Google Scholar 

  3. Amadini, R., Gange, G., Stuckey, P.J.: Dashed strings for string constraint solving. Artif. Intell. 289, 103368 (2020). https://doi.org/10.1016/j.artint.2020.103368

    MathSciNet  CrossRef  MATH  Google Scholar 

  4. Amadini, R., et al.: Combining string abstract domains for JavaScript analysis: an evaluation. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10205, pp. 41–57. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54577-5_3

    CrossRef  Google Scholar 

  5. Arceri, V., Mastroeni, I.: An automata-based abstract semantics for string manipulation languages. In: Proceedings of VPT 2019. EPTCS, vol. 299, pp. 19–33 (2019). https://doi.org/10.4204/EPTCS.299.5

  6. Arceri, V., Mastroeni, I.: Analyzing dynamic code: a sound abstract interpreter for evil eval. ACM Trans. Priv. Secur. 24(2), 10:1–10:38 (2021). https://doi.org/10.1145/3426470

  7. Arceri, V., Mastroeni, I., Xu, S.: Static analysis for ECMAScript string manipulation programs. Appl. Sci. 10, 3525 (2020). https://doi.org/10.3390/app10103525

    CrossRef  Google Scholar 

  8. Arceri, V., Olliaro, M., Cortesi, A., Mastroeni, I.: Completeness of string analysis for dynamic languages. Inform. Comput. 104791 (2021). https://doi.org/10.1016/j.ic.2021.104791

  9. Bagnara, R., Hill, P.M., Zaffanella, E.: The Parma Polyhedra library: toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. 72(1–2), 3–21 (2008). https://doi.org/10.1016/j.scico.2007.08.001

    MathSciNet  CrossRef  Google Scholar 

  10. Bultan, T., Yu, F., Alkhalaf, M., Aydin, A.: String Analysis for Software Verification and Security. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-319-68670-7

    CrossRef  Google Scholar 

  11. Cortesi, A., Lauko, H., Olliaro, M., Ročkai, P.: String abstraction for model checking of C programs. In: Biondi, F., Given-Wilson, T., Legay, A. (eds.) SPIN 2019. LNCS, vol. 11636, pp. 74–93. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30923-7_5

    CrossRef  Google Scholar 

  12. Costantini, G., Ferrara, P., Cortesi, A.: Static analysis of string values. In: Qin, S., Qiu, Z. (eds.) ICFEM 2011. LNCS, vol. 6991, pp. 505–521. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24559-6_34

    CrossRef  Google Scholar 

  13. Costantini, G., Ferrara, P., Cortesi, A.: A suite of abstract domains for static analysis of string values. Softw. Pract. Exp. 45(2), 245–287 (2015). https://doi.org/10.1002/spe.2218

    CrossRef  Google Scholar 

  14. Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of POPL 1977, pp. 238–252 (1977). https://doi.org/10.1145/512950.512973

  15. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Aho, A.V., Zilles, S.N., Rosen, B.K. (eds.) Conference Record of the Sixth Annual ACM Symposium on Principles of Programming Languages, San Antonio, Texas, USA, January 1979, pp. 269–282. ACM Press (1979). https://doi.org/10.1145/567752.567778

  16. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Proceedings of POPL 1979, pp. 269–282 (1979). https://doi.org/10.1145/567752.567778

  17. Cousot, P., Cousot, R.: Abstract interpretation and application to logic programs. J. Log. Program. 13(2 & 3), 103–179 (1992). https://doi.org/10.1016/0743-1066(92)90030-7

    MathSciNet  CrossRef  MATH  Google Scholar 

  18. Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Proceedings of POPL 1978, pp. 84–96 (1978). https://doi.org/10.1145/512760.512770

  19. Ferrara, P., Logozzo, F., Fähndrich, M.: Safer unsafe code for.net. In: Proceedings of OOPSLA 2008, pp. 329–346. ACM (2008). https://doi.org/10.1145/1449764.1449791

  20. Ferrara, P., Negrini, L., Arceri, V., Cortesi, A.: Static analysis for dummies: experiencing LiSA. In: Do, L.N.Q., Urban, C. (eds.) SOAP@PLDI 2021: Proceedings of the 10th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis, Virtual Event, Canada, 22 June 2021, pp. 1–6. ACM (2021). https://doi.org/10.1145/3460946.3464316

  21. Gange, G., Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: An abstract domain of uninterpreted functions. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016. LNCS, vol. 9583, pp. 85–103. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49122-5_4

    CrossRef  MATH  Google Scholar 

  22. Gulwani, S., Tiwari, A.: Combining abstract interpreters. In: Proceedings of PLDI 2006, pp. 376–386 (2006). https://doi.org/10.1145/1133981.1134026

  23. Illous, H., Lemerre, M., Rival, X.: A relational shape abstract domain. In: Proceedings of NFM 2017, pp. 212–229 (2017). https://doi.org/10.1007/978-3-319-57288-8_15

  24. Jensen, S.H., Møller, A., Thiemann, P.: Type analysis for JavaScript. In: Palsberg, J., Su, Z. (eds.) SAS 2009. LNCS, vol. 5673, pp. 238–255. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03237-0_17

    CrossRef  Google Scholar 

  25. Kashyap, V., et al.: JSAI: a static analysis platform for JavaScript. In: Proceedings of FSE-22, pp. 121–132 (2014). https://doi.org/10.1145/2635868.2635904

  26. Lee, H., Won, S., Jin, J., Cho, J., Ryu, S.: SAFE: formal specification and implementation of a scalable analysis framework for ECMAScript. In: Proceedings of FOOL 2012 (2012)

    Google Scholar 

  27. Logozzo, F.: Towards a quantitative estimation of abstract interpretations. In: Workshop on Quantitative Analysis of Software. Microsoft, June 2009. https://www.microsoft.com/en-us/research/publication/towards-a-quantitative-estimation-of-abstract-interpretations/

  28. Logozzo, F., Fähndrich, M.: Pentagons: a weakly relational abstract domain for the efficient validation of array accesses. Sci. Comput. Program. 75(9), 796–807 (2010). https://doi.org/10.1016/j.scico.2009.04.004

    MathSciNet  CrossRef  MATH  Google Scholar 

  29. Madsen, M., Andreasen, E.: String analysis for dynamic field access. In: Cohen, A. (ed.) CC 2014. LNCS, vol. 8409, pp. 197–217. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54807-9_12

    CrossRef  Google Scholar 

  30. Miné, A.: The octagon abstract domain. High. Order Symb. Comput. 19(1), 31–100 (2006). https://doi.org/10.1007/s10990-006-8609-1

    CrossRef  MATH  Google Scholar 

  31. Negrini, L., Arceri, V., Ferrara, P., Cortesi, A.: Twinning automata and regular expressions for string static analysis. In: Henglein, F., Shoham, S., Vizel, Y. (eds.) VMCAI 2021. LNCS, vol. 12597, pp. 267–290. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67067-2_13

    CrossRef  MATH  Google Scholar 

  32. Nielson, F., Nielson, H.R., Hankin, C.: Principles of Program Analysis. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-662-03811-6

    CrossRef  MATH  Google Scholar 

  33. Park, C., Im, H., Ryu, S.: Precise and scalable static analysis of jQuery using a regular expression domain. In: Proceedings of DLS 2016, pp. 25–36 (2016). https://doi.org/10.1145/2989225.2989228

  34. Seidl, H., Wilhelm, R., Hack, S.: Compiler Design - Analysis and Transformation. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-17548-0

    CrossRef  MATH  Google Scholar 

  35. Wang, H., Chen, S., Yu, F., Jiang, J.R.: A symbolic model checking approach to the analysis of string and length constraints. In: Proceedings of ASE 2018, pp. 623–633. ACM (2018). https://doi.org/10.1145/3238147.3238189

  36. Wilhelm, R., Sagiv, S., Reps, T.W.: Shape analysis. In: Proceedings of CC 2000, pp. 1–17 (2000). https://doi.org/10.1007/3-540-46423-9_1

  37. Yu, F., Alkhalaf, M., Bultan, T., Ibarra, O.H.: Automata-based symbolic string analysis for vulnerability detection. Formal Methods Syst. Design 44(1), 44–70 (2013). https://doi.org/10.1007/s10703-013-0189-1

    CrossRef  MATH  Google Scholar 

  38. Yu, F., Bultan, T., Hardekopf, B.: String abstractions for string verification. In: Groce, A., Musuvathi, M. (eds.) SPIN 2011. LNCS, vol. 6823, pp. 20–37. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22306-8_3

    CrossRef  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Vincenzo Arceri .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Verify currency and authenticity via CrossMark

Cite this paper

Arceri, V., Olliaro, M., Cortesi, A., Ferrara, P. (2022). Relational String Abstract Domains. In: Finkbeiner, B., Wies, T. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2022. Lecture Notes in Computer Science(), vol 13182. Springer, Cham. https://doi.org/10.1007/978-3-030-94583-1_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-94583-1_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94582-4

  • Online ISBN: 978-3-030-94583-1

  • eBook Packages: Computer ScienceComputer Science (R0)