On multi-language abstraction: Towards a static analysis of multi-language programs

Formal Methods in System Design

Abstract

Modern software development rarely takes place within a single programming language. Often, programmers appeal to cross-language interoperability. Examples are exploitation of novel features of one language within another, and cross-language code reuse. Our previous works developed a theory of so-called multi-languages, which arise by combining existing languages, defining a precise notion of (algebraic) multi-language semantics. As regards static analysis, the heterogeneity of the multi-language context opens up new and unexplored scenarios. In this paper, we provide a general theory for the combination of abstract interpretations of existing languages, regardless of their inherent nature, in order to gain an abstract semantics of multi-language programs. As a part of this general theory, we show that formal properties of interest of multi-language abstractions (e.g., soundness and completeness) boil down to the features of the interoperability mechanism that binds the underlying languages together. We extend many of the standard concepts of abstract interpretation to the framework of multi-languages.

Notes

  1. A commercial static code analyser for Java (version 3.2.0.1227: for Linux 64 bit).

Author information

Corresponding author

Correspondence to Samuele Buro.

Ethics declarations

Availability of data and materials

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

Examples of algebraic semantics for \(\textsf{Imp}\)

We illustrate a simple imperative language \(\textsf{Imp}\) on which we define various semantics in the algebraic style, namely small step operational, prefix trace, and reachability.

Let \(\mathbb{X}\) be a set of variables and \(\mathbb{V}\) a set of scalar values with metavariables x and v, respectively. Variables and values occur in the language as terminal symbols, and for each production defining the syntax of the language (on the right), we introduce a corresponding algebraic operator (on the left), or a family of operators when they are parametric on a subscript:

$$\begin{aligned} \begin{array}{l|rcll} (v) & \langle exp \rangle & {::=} & v & \text {scalar values} \\ (x) & \langle exp \rangle & {::=} & x & \text {variables} \\ (bop_\odot ) & \langle exp \rangle & {::=} & \langle exp \rangle \odot \langle exp \rangle & \text {binary operations} \\ ( skip ) & \langle com \rangle & {::=} & \text {skip} & \text {do-nothing} \\ ( assign _x) & \langle com \rangle & {::=} & \text {x} = \langle exp \rangle & \text {assignment} \\ ( cond ) & \langle com \rangle & {::=} & \text {if }\langle exp \rangle \text { then }\langle com \rangle \text { else }\langle com \rangle & \text {conditional} \\ ( loop ) & \langle com \rangle & {::=} & \text {while}\,\langle exp \rangle \,\text {do}\,\langle com \rangle & \text {loop statement} \\ ( seq ) & \langle com \rangle & {::=} & \langle com \rangle ;\,\langle com \rangle & \text {composition} \end{array} \end{aligned}$$

where \(\odot \) is a binary operator such as \(\text {+}\), \(\text {-}\), \(\text {{*}}\), etc. We abuse notation and let \(\odot \) denote both a syntactic symbol of the language and a mathematical function \(\odot :\mathbb{V}^2 \rightarrow \mathbb{V}\) over values. The rank of each algebraic operator can be inferred from the non-terminals appearing in the production rules; for instance, the operator \( cond \) is sorted as

$$\begin{aligned} cond : exp , com , com \rightarrow com \end{aligned}$$

In the examples in the following sections, we often use the correspondence between algebraic and context-free terms. For instance, we may write the algebraic term \( cond (bop_{>}(\text {x}, 0), skip , assign _{\text {x}}(bop_{-}(0, \text {x})))\) in the less cumbersome context-free form \(\text {if }\text {x > 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}\).
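The correspondence between algebraic operators and grammar productions can be made concrete with a small sketch. The class names (`Val`, `Var`, `Bop`, and so on) and the printer `show` are illustrative choices, not part of the paper; scalar values are assumed to be integers, and `show` omits parentheses, so it is only meant for small terms like the one above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Val:            # operator (v): a scalar constant
    v: int


@dataclass(frozen=True)
class Var:            # operator (x): a variable
    x: str


@dataclass(frozen=True)
class Bop:            # operator family (bop_op), one per binary symbol
    op: str
    left: "Exp"
    right: "Exp"


@dataclass(frozen=True)
class Skip:           # constant (skip)
    pass


@dataclass(frozen=True)
class Assign:         # operator family (assign_x), one per variable x
    x: str
    e: "Exp"


@dataclass(frozen=True)
class Cond:           # cond : exp, com, com -> com
    test: "Exp"
    then: "Com"
    orelse: "Com"


@dataclass(frozen=True)
class Loop:           # loop : exp, com -> com
    test: "Exp"
    body: "Com"


@dataclass(frozen=True)
class Seq:            # seq : com, com -> com
    first: "Com"
    second: "Com"


Exp = Union[Val, Var, Bop]
Com = Union[Skip, Assign, Cond, Loop, Seq]


def show(t) -> str:
    """Render an algebraic term in the context-free notation."""
    if isinstance(t, Val):
        return str(t.v)
    if isinstance(t, Var):
        return t.x
    if isinstance(t, Bop):
        return f"{show(t.left)} {t.op} {show(t.right)}"
    if isinstance(t, Skip):
        return "skip"
    if isinstance(t, Assign):
        return f"{t.x} = {show(t.e)}"
    if isinstance(t, Cond):
        return f"if {show(t.test)} then {show(t.then)} else {show(t.orelse)}"
    if isinstance(t, Loop):
        return f"while {show(t.test)} do {show(t.body)}"
    if isinstance(t, Seq):
        return f"{show(t.first)}; {show(t.second)}"
    raise TypeError(f"not an Imp term: {t!r}")


# cond(bop_>(x, 0), skip, assign_x(bop_-(0, x)))
term = Cond(Bop(">", Var("x"), Val(0)), Skip(),
            Assign("x", Bop("-", Val(0), Var("x"))))
```

Calling `show(term)` yields the context-free spelling of the conditional used in the running example.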

2.1 A small-step operational semantics

We define a small-step operational semantics \(\mathscr {S}\) describing the program execution steps. The presentation provided here is purely algebraic, and therefore less intuitive than the traditional rule-based style. However, the algebraic framework allows many more kinds of semantics to be expressed in the same formalism, which makes them easier to compare.


Expressions


We treat expressions \(\text {E}\) as “atomic” terms that are fully evaluated into a scalar value in a single step. Let \(\mathbb{S}_{ exp }\triangleq \{ \, \langle {\text {E}, \rho }\rangle \, \vert \; \text {E} \in {\llbracket exp \rrbracket }_{\mathscr {T}_{\textsf{Imp}}} \wedge \rho \in {\mathbbm{Env}} \}\) be the set of configurations where \(\text {E}\) is an expression and \(\rho \) an environment in \({\mathbbm{Env}}\triangleq \mathbb{X}\rightarrow \mathbb{V}\). The small-step semantics of expressions is given in Fig. 17. Intuitively, starting from an expression \(\text {E}\), we build a set of pairs in \(\mathcal {P}(\mathbb{S}_{ exp }\times \mathbb{V})\) representing the one-step evaluation of \(\text {E}\) in each environment \(\rho \). More precisely, \( \langle {\text {E}, \rho }\rangle {\rightarrowtriangle} v \in {\llbracket \text {E}\rrbracket }_{\mathscr {S}}\) means that \(\text {E}\) evaluates to v in \(\rho \). We write \({\llbracket {\text {E}} \rrbracket }_{\mathscr {S}}^\rho \) to denote this v (unique by construction).

Remark 5

Note that from the small-step semantics \(\llbracket \text {E} \rrbracket_{\mathscr{S}}\) of an expression \(\text {E}\), we are able to recover the term \(\text {E}\). Indeed, \(\llbracket \text {E} \rrbracket_{\mathscr {S}} \ne \varnothing \) and if \( {\langle {\text{E}}_{1}, \rho_{1} \rangle} {\rightarrowtriangle} v_{1}\) and \( {\langle {\text{E}}_{2}, \rho_{2} \rangle} {\rightarrowtriangle} v_{2}\) are transitions (that is, pairs) in \({\llbracket {\text{E}} \rrbracket }_{\mathscr {S}}\), then \(\text {E}_{1} = \text{E} = \text {E}_2\) (this can be shown by a simple structural induction on \(\text {E}\)).

Remark 6

There are some missing cases in the definition of the interpretation functions for the operators in Fig. 17. For instance, we have defined \({\llbracket bop_\odot \rrbracket }_{\mathscr {S}}\) on arguments \({\llbracket \text {E}_1\rrbracket }_{\mathscr {S}}\) and \({\llbracket \text {E}_2\rrbracket }_{\mathscr {S}}\). However, there are semantic elements in \(\mathcal {P}(\mathbb{S}_{ exp }\times \mathbb{V})\) that are not the image of any expression \(\text {E}\) (e.g., the empty set \(\varnothing \)). We shall leave implicit that \({\llbracket bop_\odot \rrbracket }_{\mathscr {S}}(e_1, e_2) \triangleq \varnothing \) whenever \(e_1\) or \(e_2\) is not the semantics of an expression, that is, whenever there is no \(\text {E}_1\) such that \(e_1 = {\llbracket \text {E}_1\rrbracket }_{\mathscr {S}}\) or no \(\text {E}_2\) such that \(e_2 = {\llbracket \text {E}_2\rrbracket }_{\mathscr {S}}\). (This remark and Rem. 5 shall also apply to the next definitions.)

Fig. 17 Small-step operational semantics of \(\textsf{Imp}\) expressions
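Since Fig. 17 is not reproduced here, the following is only an assumed reading of the algebraic interpretation of expression operators, restricted to a finite list of environments. Each interpretation function works on semantic elements (sets of pairs \(\langle \text{E}, \rho \rangle \rightarrowtriangle v\)) and recovers the argument terms from them, in the spirit of Rem. 5. The encodings (nested tuples for terms, sorted tuples of pairs for environments) and the names `val_sem`, `var_sem`, `bop_sem` are hypothetical. The sketch does not check that each argument is the image of a single expression; it only returns the empty set when an argument is empty.

```python
import operator

# Binary operators as functions over values; comparisons return 1 or 0.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    ">": lambda a, b: int(a > b),
}


def val_sem(v, envs):
    """Semantics of the constant v: it evaluates to v in every environment."""
    return {((("val", v), rho), v) for rho in envs}


def var_sem(x, envs):
    """Semantics of the variable x: it evaluates to rho(x)."""
    return {((("var", x), rho), dict(rho)[x]) for rho in envs}


def bop_sem(op, s1, s2):
    """Interpretation of bop_op on two semantic elements.

    The argument terms e1 and e2 are read back from the transitions,
    and transitions are combined only when they share the environment.
    """
    out = set()
    for (e1, rho1), v1 in s1:
        for (e2, rho2), v2 in s2:
            if rho1 == rho2:
                out.add(((("bop", op, e1, e2), rho1), OPS[op](v1, v2)))
    return out


# Two sample environments, encoded as hashable tuples of (variable, value).
ENVS = [(("x", -2),), (("x", 3),)]

# Semantics of the expression x > 0 over the sample environments.
gt = bop_sem(">", var_sem("x", ENVS), val_sem(0, ENVS))
```

Here `gt` contains one transition per environment, and passing an empty set as either argument of `bop_sem` yields the empty set.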

Commands


Let \(\mathbb{S}_{ com }\triangleq \{ \, \langle {\text {C}, \rho }\rangle \, \vert \; \text {C} \in {\llbracket com \rrbracket }_{\mathscr {T}_{\textsf{Imp}}} \cup \{\bot \} \wedge \rho \in {\mathbbm{Env}} \}\) where \(\text {C}\) is a command (or \(\bot \), denoting the end of a computation) and \(\rho \) an environment. For each command operator of \(\textsf{Imp}\) we define its semantics by specifying exactly the pairs of configurations which are related by the action of such an operator (Fig. 18). We write \({\llbracket \text {C}\rrbracket }_{\mathscr {S}}^\rho \) for the unique \(\langle {\text {C}^{\prime}, \rho^{\prime}}\rangle \) such that \( \langle {\text {C}, \rho }\rangle {\rightarrowtriangle} \langle {\text {C}^{\prime}, \rho^{\prime}}\rangle \in {\llbracket \text {C}\rrbracket }_{\mathscr {S}} \).

Fig. 18 Small-step operational semantics of \(\textsf{Imp}\) commands

Example 3

We show a small example of the application of the newly defined semantics \({\llbracket -\rrbracket }_{\mathscr {S}}\). We adopt the more intuitive notation provided by the context-free grammar for denoting terms, and we avoid the use of subscripts \(_\mathscr {S}\). Suppose we want to compute the small-step semantics of the conditional statement \(\text {if }\text {x > 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}\). Then,

$$\begin{aligned} {\llbracket \text {if }\text {x> 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}\rrbracket } = {\llbracket cond \rrbracket }({\llbracket \text {x > 0}\rrbracket }, {\llbracket \text {skip}\rrbracket }, {\llbracket \text {x} = {0 - x}\rrbracket }) \end{aligned}$$

where the semantics of the condition is

$$\begin{aligned} \begin{array}{rcl} {\llbracket \text {x> 0}\rrbracket } = {\llbracket bop_{>}\rrbracket }({\llbracket \text {x}\rrbracket }, {\llbracket 0\rrbracket }) &{}= &{}\{ \, \langle {\text {x> 0}, \rho }\rangle {\rightarrowtriangle} \text {1}\, \vert \, \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) \text {> } 0) = \text {1} \} \\ &{}\cup &{}\{ \, \langle {\text {x> 0}, \rho }\rangle {\rightarrowtriangle} 0\, \vert \, \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 0 \} \end{array} \end{aligned}$$

and therefore,

$$\begin{aligned} \begin{array}{rl} {\llbracket cond \rrbracket }&{}({\llbracket \text {x> 0}\rrbracket }, {\llbracket \text {skip}\rrbracket }, {\llbracket \text {x} = {0 - x}\rrbracket }) = \\ &{}= \{\;\langle {\text {if }\text {x> 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}, \rho }\rangle {\rightarrowtriangle} \langle {\text {skip}, \rho }\rangle \\ &{}\qquad \vert \;\rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = \text {1}\;\} \\ &{}\cup \;\,\{\;\langle {\text {if }\text {x> 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}, \rho }\rangle {\rightarrowtriangle} \langle {\text {x} = {0 - x}, \rho }\rangle \\ &{}\qquad \vert \;\rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 0\;\} \end{array} \end{aligned}$$

Note that the same result would have been achieved with a traditional rule-based style for specifying small-step semantics.
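For comparison, a rule-based reading of the same step relation can be sketched as a function from a configuration to its unique successor. Fig. 18 is not reproduced here, so the clauses for `loop` and `seq` below are standard textbook assumptions rather than the paper's definitions; the clauses for `skip`, `assign`, and `cond` follow Ex. 3 and Ex. 4. The names `step`, `eval_exp`, and `BOT` are hypothetical, terms are nested tuples, and environments are plain dictionaries.

```python
BOT = None  # stands for the terminated command (bottom)


def eval_exp(e, rho):
    """Evaluate an expression to a scalar value in environment rho."""
    tag = e[0]
    if tag == "val":
        return e[1]
    if tag == "var":
        return rho[e[1]]
    _, op, e1, e2 = e
    a, b = eval_exp(e1, rho), eval_exp(e2, rho)
    return {"+": a + b, "-": a - b, "*": a * b, ">": int(a > b)}[op]


def step(c, rho):
    """Return the configuration reached from <c, rho> in one step."""
    tag = c[0]
    if tag == "skip":
        return BOT, rho
    if tag == "assign":
        _, x, e = c
        return BOT, {**rho, x: eval_exp(e, rho)}
    if tag == "cond":
        _, e, c1, c2 = c
        return (c1 if eval_exp(e, rho) != 0 else c2), rho
    if tag == "loop":  # assumed rule: unfold the loop once
        _, e, body = c
        if eval_exp(e, rho) != 0:
            return ("seq", body, c), rho
        return BOT, rho
    if tag == "seq":   # assumed rule: run the first command to completion
        _, c1, c2 = c
        c1_next, rho_next = step(c1, rho)
        return (c2 if c1_next is BOT else ("seq", c1_next, c2)), rho_next
    raise ValueError(f"unknown command: {c!r}")


# if x > 0 then skip else x = 0 - x
NEGATE = ("assign", "x", ("bop", "-", ("val", 0), ("var", "x")))
P = ("cond", ("bop", ">", ("var", "x"), ("val", 0)), ("skip",), NEGATE)
```

With `rho = {"x": 3}` the conditional steps to `skip`, and with `rho = {"x": -2}` it steps to the assignment, matching the two sets of transitions computed in Ex. 3.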

2.2 Fixpoint definition of prefix trace semantics

Prefix trace semantics associates each program \(\text{P}\) with the set of all finite traces obtained by iterating the small-step semantics \(\mathscr {S}\) an arbitrary finite number of times from \(\langle {\text{P}, \rho }\rangle \), for each environment \(\rho \).

Let \(\mathbb{S}_{ com }^{*} \triangleq \bigcup _{n \in \mathbb{N}} \mathbb{S}_{ com }^n\) be the set of finite sequences of command configurations (that is, finite traces). A trace \(\tau \in \mathbb{S}_{ com }^n\) is denoted by \( \langle {\text {C}_1, \rho _1}\rangle {\rightarrowtriangle} \cdots {\rightarrowtriangle} \langle {\text {C}_n, \rho _n}\rangle \). The prefix trace semantics \(\mathscr {P}\) is defined by keeping the one-step evaluation semantics for expressions \(\text {E}\) (i.e., \({\llbracket \text {E}\rrbracket }_{\mathscr {P}} \triangleq {\llbracket \text {E}\rrbracket }_{\mathscr {S}}\)), and by defining the following fixpoint semantics for command operators \(f:w \rightarrow s\) on the domain \(\langle {\llbracket com \rrbracket }_{\mathscr {P}} \triangleq \mathcal {P}(\mathbb{S}_{ com }^{*}), \subseteq , \varnothing , \cup \rangle \):

$$\begin{aligned} {\llbracket f:w \rightarrow s\rrbracket }_{\mathscr {P}}({\llbracket \text{P}_1\rrbracket }_{\mathscr {P}}, \ldots , {\llbracket \text{P}_n\rrbracket }_{\mathscr {P}}) \triangleq {{\,\textrm{lfp}\,}}_\varnothing ^\subseteq F_{f(\text{P}_1, \ldots , \text{P}_n)} \end{aligned}$$

where \(F_{f(\text{P}_1, \ldots , \text{P}_n)}:\mathcal {P}(\mathbb{S}_{ com }^{*}) \rightarrow \mathcal {P}(\mathbb{S}_{ com }^{*})\) is defined as

$$\begin{aligned} \begin{array}{rcl} X & \mapsto & \{\varepsilon \} \cup \{ \, \langle {f(\text{P}_1, \ldots , \text{P}_n), \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}} \} \\ & \cup & \{ \, \tau {\rightarrowtriangle} \langle {\text {C}, \rho }\rangle {\rightarrowtriangle} \langle {\text {C}^{\prime}, \rho^{\prime}}\rangle \in \mathbb{S}_{ com }^{*} \, \vert \; \tau {\rightarrowtriangle} \langle {\text {C}, \rho }\rangle \in X \wedge \langle {\text {C}^{\prime}, \rho^{\prime}}\rangle = {\llbracket \text {C}\rrbracket }_{\mathscr {S}}^\rho \} \end{array} \end{aligned}$$

and the trace semantics of the constant \( skip \) is trivially defined by \({\llbracket skip \rrbracket }_{\mathscr {P}} \triangleq \{\varepsilon \} \cup \{ \, \langle { skip , \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}} \} \cup \{ \, \langle skip , \rho \rangle {\rightarrowtriangle} \langle \bot , \rho \rangle \, \vert \; \rho \in {\mathbbm{Env}} \}\). The constructive computation of \({{\,\textrm{lfp}\,}}_\varnothing ^\subseteq F_{f(\text{P}_1, \ldots , \text{P}_n)}\) is guaranteed by Kleene’s theorem (\(F_{f(\text{P}_1, \ldots , \text{P}_n)}\) is continuous on the pointed dcpo \(\langle \mathcal {P}(\mathbb{S}_{ com }^{*}), \subseteq , \varnothing , \cup \rangle \)).
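The shape of this fixpoint computation can be sketched on an abstract transition system, leaving \(\textsf{Imp}\) aside. The helper names `kleene_lfp` and `make_F` are hypothetical, configurations are plain labels, and the step relation is a dictionary. The iteration stops only when the set of finite traces is itself finite; for a system with an infinite run the least fixpoint is reached only in the limit, which is why the sketch carries an iteration bound.

```python
def kleene_lfp(F, max_iter=1000):
    """Iterate F from the empty set until the iterates stabilise."""
    X = frozenset()
    for _ in range(max_iter):
        Y = F(X)
        if Y == X:
            return X
        X = Y
    raise RuntimeError("no fixpoint reached within max_iter iterates")


def make_F(initial, step):
    """Build the trace transformer for given initial configurations.

    F(X) contains the empty trace, every one-element trace made of an
    initial configuration, and every trace of X extended by one step.
    """
    def F(X):
        out = {()} | {(c,) for c in initial}
        for tau in X:
            if tau and tau[-1] in step:
                out.add(tau + (step[tau[-1]],))
        return frozenset(out)
    return F


# A three-state system: a steps to b, b steps to c, c is final.
F = make_F({"a"}, {"a": "b", "b": "c"})
prefixes = kleene_lfp(F)
```

For this system `prefixes` is the set of all prefixes of the single run from `a`, including the empty trace.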

Example 4

We restate Ex. 3 for the prefix trace semantics \(\mathscr {P}\) applied to the same term \(\text{P} \triangleq \text {if }\text {x > 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}\):

$$\begin{aligned} {\llbracket \text {if }\text {x > 0}\text { then }\text {skip}\text { else }\text {x} = {0 - x}\rrbracket } = {{\,\textrm{lfp}\,}}_\varnothing ^\subseteq F_{\text{P}} \end{aligned}$$

where the iterates of \(F_{\text{P}}\) are

$$\begin{aligned} \begin{array}{rcl} F_{\text{P}}^0 & = & \varnothing \\ F_{\text{P}}^1 & = & \{\varepsilon \} \cup \{ \, \langle {\text{P}, \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}} \} \\ F_{\text{P}}^2 & = & \{ \, \langle {\text{P}, \rho }\rangle {\rightarrowtriangle} \langle {\text {skip}, \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 1 \} \\ & \cup & \{ \, \langle {\text{P}, \rho }\rangle {\rightarrowtriangle} \langle {\text {x} = {0 - x}, \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 0 \} \\ & \cup & F_{\text{P}}^1 \\ F_{\text{P}}^3 & = & \{ \, \langle {\text{P}, \rho }\rangle {\rightarrowtriangle} \langle {\text {skip}, \rho }\rangle {\rightarrowtriangle} \langle {\bot , \rho }\rangle \, \vert \; \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 1 \} \\ & \cup & \{ \, \langle {\text{P}, \rho }\rangle {\rightarrowtriangle} \langle {\text {x} = {0 - x}, \rho }\rangle {\rightarrowtriangle} \langle {\bot , \rho [\text {x} \leftarrow 0 - \rho (\text {x})]}\rangle \, \vert \; \rho \in {\mathbbm{Env}}\wedge (\rho (\text {x}) > 0) = 0 \} \\ & \cup & F_{\text{P}}^2 \\ F_{\text{P}}^{\delta > 3} & = & F_{\text{P}}^3 \end{array} \end{aligned}$$

and therefore \({\llbracket \text{P}\rrbracket }_{\mathscr {P}}\) is the union of the iterates.
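The iterates of Ex. 4 can be replayed on a finite sample of environments. This is a sketch under assumptions: only the step rules needed by the example (`skip`, `assign`, `cond`) are implemented, environments are restricted to two sample values of x, and all names (`step`, `make_F`, `BOT`, `ENVS`) are hypothetical.

```python
BOT = ("bot",)  # the terminated command; it has no successor


def eval_exp(e, rho):
    """Evaluate an expression; only '>' and '-' are needed here."""
    if e[0] == "val":
        return e[1]
    if e[0] == "var":
        return dict(rho)[e[1]]
    _, op, e1, e2 = e
    a, b = eval_exp(e1, rho), eval_exp(e2, rho)
    return int(a > b) if op == ">" else a - b


def step(c, rho):
    """One small step from <c, rho>, or None if c has terminated."""
    if c == BOT:
        return None
    if c[0] == "skip":
        return (BOT, rho)
    if c[0] == "assign":
        env = dict(rho)
        env[c[1]] = eval_exp(c[2], rho)
        return (BOT, tuple(sorted(env.items())))
    if c[0] == "cond":
        return (c[2] if eval_exp(c[1], rho) == 1 else c[3], rho)
    raise ValueError(f"unsupported command: {c!r}")


def make_F(prog, envs):
    """Trace transformer F_P restricted to the given environments."""
    def F(X):
        out = {()} | {((prog, rho),) for rho in envs}
        for tau in X:
            if tau:
                nxt = step(*tau[-1])
                if nxt is not None:
                    out.add(tau + (nxt,))
        return frozenset(out)
    return F


NEGATE = ("assign", "x", ("bop", "-", ("val", 0), ("var", "x")))
P = ("cond", ("bop", ">", ("var", "x"), ("val", 0)), ("skip",), NEGATE)
ENVS = [(("x", 3),), (("x", -2),)]

F = make_F(P, ENVS)
iterates = [frozenset()]
for _ in range(4):
    iterates.append(F(iterates[-1]))
```

On this sample the iterates grow until the third one and then stay constant, and every iterate contains the previous one, as expected of a Kleene chain.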

2.3 Reachability semantics as abstraction of trace semantics

Reachability semantics aims at computing the set of states that a program \(\text{P}\) may reach during its execution. Such a set can be parametric on program points (that is, locations), or it can be the union of all the environments reached at any point. We show that both of these versions can be obtained by abstracting the collecting semantics \(\mathscr {P}^{*}\) over traces provided in the previous section.

Reachability on Program Points


Let \(\mathscr {R}\) be the reachability semantics that collects states per program point. Its carrier set of sort \( com \) is defined as \({\llbracket com \rrbracket }_{\mathscr {R}} \triangleq \mathcal {P}(\mathbb{S}_{ com })\), thus a command is interpreted as a set of configurations (where program code denotes locations). We show that \(\mathscr {R}\) can be obtained by abstracting the collecting semantics \(\mathscr {P}^{*}\) by establishing a Galois connection between their carrier sets:

$$\begin{aligned} \mathcal {P}(\mathcal {P}(\mathbb{S}_{ com }^{*})) \overset{\gamma }{\underset{\alpha }{\leftrightarrows }} \mathcal {P}(\mathbb{S}_{ com }) \end{aligned}$$

The abstraction function \(\alpha \) maps a semantic property \(\mathcal {X} \subseteq \mathcal {P}(\mathbb{S}_{ com }^{*})\) (i.e., a set of sets of finite traces) to the set of states that appear in those traces:

$$\begin{aligned} \alpha (\mathcal {X}) \triangleq \{ \, \langle {\text {C},\rho }\rangle \in \mathbb{S}_{ com }\, \vert \; \exists \tau \in \bigcup \mathcal {X}\,.\,\langle {\text {C},\rho }\rangle \in \tau \} \end{aligned}$$

Conversely, the concretisation function \(\gamma \) maps each set of states C to the set of all trace sets whose traces contain only configurations in C:

$$\begin{aligned} \gamma (C) \triangleq \{ \, X \in \mathcal {P}(\mathbb{S}_{ com }^{*})\, \vert \; \forall \tau \in X\,.\,\langle {\text {C},\rho }\rangle \in \tau \implies \langle {\text {C},\rho }\rangle \in C \} \end{aligned}$$

Now, the definition of \(\mathscr {R}\) follows from the existence of a best correct approximation, as shown in Sect. 4.
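The adjunction property behind this Galois connection, \(\alpha (\mathcal {X}) \subseteq C \iff \mathcal {X} \subseteq \gamma (C)\), can be checked by brute force on a small finite universe. In this sketch configurations are abstract labels rather than pairs \(\langle \text{C}, \rho \rangle \), and \(\gamma (C)\) is represented by its membership test `in_gamma` instead of being enumerated, since it is an infinite set in general. All names are hypothetical.

```python
from itertools import chain, combinations


def alpha(XX):
    """Collect every configuration occurring in some trace of some set in XX."""
    return {cfg for X in XX for tau in X for cfg in tau}


def in_gamma(X, C):
    """Membership test for gamma(C): all traces of X stay inside C."""
    return all(cfg in C for tau in X for cfg in tau)


def powerset(universe):
    """Generate all subsets of a finite universe as Python sets."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return (set(s) for s in subsets)


# Two trace sets over the labels a, b, c.
X1 = frozenset({(), ("a",), ("a", "b")})
X2 = frozenset({("c",)})
XX = {X1, X2}

# alpha(XX) is below C exactly when every member of XX belongs to gamma(C).
adjunction_holds = all(
    (alpha(XX) <= C) == all(in_gamma(X, C) for X in XX)
    for C in powerset({"a", "b", "c", "d"})
)
```

The check covers all sixteen subsets of the sample universe; it is an illustration on one input, not a proof.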

Reachability without Program Points


The reachability semantics \(\mathscr {R}_\cup \) forgets about program locations and simply collects the environments reached during the execution of a program. The carrier set of commands is defined as \({\llbracket com \rrbracket }_{\mathscr {R}_\cup } \triangleq \mathcal {P}({\mathbbm{Env}})\). \(\mathscr {R}_\cup \) can be obtained by abstracting the collecting semantics \(\mathscr {P}^{*}\) over traces:

$$\begin{aligned} \alpha \big (\mathcal {X} \in \mathcal {P}(\mathcal {P}(\mathbb{S}_{ com }^{*}))\big )&\triangleq \{ \, \rho \in {\mathbbm{Env}}\, \vert \; \exists \tau \in \bigcup \mathcal {X}\,.\,\exists \text {C}\,.\,\langle {\text {C},\rho }\rangle \in \tau \} \\ \gamma \big (R \in \mathcal {P}({\mathbbm{Env}})\big )&\triangleq \{ \, X \in \mathcal {P}(\mathbb{S}_{ com }^{*})\, \vert \; \forall \tau \in X\,.\,\langle {\text {C},\rho }\rangle \in \tau \implies \rho \in R \} \end{aligned}$$
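As a small illustration, the location-free abstraction can be computed on the prefixes of one run of the conditional from Ex. 4, started with x = -2. The sketch also checks that it coincides with projecting the per-program-point abstraction onto environments. Program text is abbreviated to strings, environments are tuples of pairs, and the names `alpha_env` and `alpha_points` are hypothetical.

```python
def alpha_points(XX):
    """Reachable configurations <C, rho>, keeping program points."""
    return {cfg for X in XX for tau in X for cfg in tau}


def alpha_env(XX):
    """Reachable environments, forgetting program points."""
    return {rho for X in XX for tau in X for (_c, rho) in tau}


# One run of "if x > 0 then skip else x = 0 - x" from x = -2.
start, end = (("x", -2),), (("x", 2),)
run = (("P", start), ("x = 0 - x", start), ("bot", end))

# The prefix-closed trace set generated by that run.
X = frozenset(run[:k] for k in range(len(run) + 1))
reached = alpha_env({X})
```

Here `reached` holds the two environments visited by the run, while `alpha_points({X})` keeps three distinct configurations because the first two share an environment but differ in program point.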

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Buro, S., Crole, R. & Mastroeni, I. On multi-language abstraction: Towards a static analysis of multi-language programs. Form Methods Syst Des (2023). https://doi.org/10.1007/s10703-022-00405-8

  • DOI: https://doi.org/10.1007/s10703-022-00405-8
