In this section we illustrate our code-based approach. This material is partly based on Plate et al. (2015) where we first introduced the idea of shifting the problem of establishing whether an application incorporates OSS components that have exploitable vulnerabilities, to the problem of assessing whether the vulnerable code of those components is reachable.
The code-centric and usage-based approach presented in the following sections overcomes weaknesses of approaches and tools based on metadata. Phenomena like library re-bundling, or the fact that single open source projects commonly distribute multiple, fine-grained libraries containing a subset of the original code base, represent significant challenges for approaches and tools based on metadata. Their vulnerability detection suffers from false positives and false negatives, and the maintenance of the corresponding vulnerability databases is costly and error-prone. Besides precise vulnerability detection (cf. Section 2.2), the code-centric approach also supports functionalities out of reach for approaches based on metadata, especially the analysis of whether vulnerable code can be executed in the context of a given application (cf. Sections 2.3 to 2.5), which is needed to prioritize findings, as well as update metrics that consider the actual library use with the goal of reducing regressions (cf. Section 3).
Compared to previous works, Sections 2.1, 2.2 and 2.3 generalize Plate et al. (2015), and Sections 2.2, 2.4 and 2.5 extend it with unique novel contributions. In particular, Sections 2.4 and 2.5 are the basis of the update metrics presented in Section 3.
Representing vulnerabilities at the code level
Our approach is based on the idea that a vulnerability can be characterized, thus detected and analyzed, by the set of program constructs (such as methods), that were modified, added, or deleted to fix that vulnerability (Plate et al. 2015).
Definition 1
A program construct (or simply construct) is a structural element of the source code characterized by a construct type (e.g., package, class, constructor, method), a language (e.g., Java, Python), and a unique identifier (e.g., the fully-qualified name).
Example 1
The fully-qualified name of method baz(int) in class Bar and package foo is foo.Bar.baz(int). The type of this construct is method.
It is important to remark that the term type as used in this definition denotes the different kinds of syntactic constructs that form the structure of a program (as illustrated in the examples above); the same term is used with a different meaning in the domain of programming languages. The two meanings should not be confused.
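For illustration, a construct as per Definition 1 maps naturally onto a plain data type. The following is a minimal Java sketch; the type and enum names are ours, not those of the Steady code base.

```java
// Minimal sketch of Definition 1; names are illustrative, not Steady's API.
record Construct(ConstructType type, Language language, String qualifiedName) {}

enum ConstructType { PACKAGE, CLASS, CONSTRUCTOR, METHOD }
enum Language { JAVA, PYTHON }

// Example 1: method baz(int) in class Bar and package foo
// -> new Construct(ConstructType.METHOD, Language.JAVA, "foo.Bar.baz(int)")
```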
Changes to program constructs are performed through commits in a source code repository; therefore, the set of changes that fix a vulnerability can be obtained from the analysis of the corresponding fix commit. Note that in cases where a commit includes not only a vulnerability fix but also unrelated changes, a post-processing of the construct changes is required.
Definition 2
We define a construct change as the tuple
$$(c, t, \mathtt{AST}_{f}^{(c)},\mathtt{AST}_{v}^{(c)})$$
where c is a construct, t is a change operation (i.e., addition, deletion or modification) on the construct c, and \(\mathtt {AST}_{v}^{(c)}\), \(\mathtt {AST}_{f}^{(c)}\) are, respectively, the abstract syntax trees of the vulnerable and of the fixed c.
Notice that for deleted (added) constructs only \(\mathtt {AST}_{v}^{(c)}\) (\(\mathtt {AST}_{f}^{(c)}\)) exists.
In practice, the typical sources of the source code changes from which we extract construct changes are commits in version control systems: each commit that modifies source code corresponds to a set of construct changes.
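Continuing the sketch above, a construct change per Definition 2 can be represented as follows, with the AST fields left empty for additions and deletions. Again, the names are illustrative; Ast stands for whatever tree representation the parser yields.

```java
// Sketch of Definition 2; Ast is a placeholder for the parser's tree type.
interface Ast {}

enum ChangeOperation { ADDITION, DELETION, MODIFICATION }

record ConstructChange(Construct construct,  // c
                       ChangeOperation op,   // t
                       Ast fixedAst,         // AST_f^(c), null for deletions
                       Ast vulnerableAst) {} // AST_v^(c), null for additions
```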
These fix commits represent the main input to our approach, and its implementation, Eclipse Steady, requires the maintenance of a knowledge base with triples, each comprising a vulnerability identifier, the URL of the version control system of the vulnerable open source project, and the identifiers of the fix commits (cf. Sections 5 and 6.2.3). Steady automatically processes those fix commits in order to determine all changed constructs and their \(\mathtt {AST}_{f}^{(c)}\) and \(\mathtt {AST}_{v}^{(c)}\), so that this information is available when analyzing concrete applications and the libraries they depend upon (directly or transitively). Differently from the proprietary vulnerability databases mentioned in Section 1, our knowledge base is open source. Furthermore, identifying the (typically few) fix commits is less expensive and error-prone than identifying all the library versions affected by a given open source vulnerability, especially because of a popular technique called re-bundling, where code from one open source library is copied into other libraries distributed under different identifiers. The identification and enumeration of affected library versions with the help of, e.g., Common Platform Enumeration (CPE) identifiers or Maven coordinates is what we consider metadata, and is used by many state-of-the-art approaches.
When a fix is implemented over multiple commits, we rely on commit timestamps to compute the set of construct changes by comparing the source code before the first and after the last commit. If the vulnerability fix includes changes in a nested construct (e.g., a method of a class), two distinct entries are included in the set of construct changes, one for the outer construct (the class), one for the nested construct (the method).
While, ideally, fix commits should be systematically indicated by the developers of open source libraries (e.g., as part of security advisories), they are not always disclosed explicitly. Some OSS projects (e.g., Apache Tomcat) provide such information via security advisories; others reference issue tracking systems, which, in turn, describe the vulnerability being solved; some other OSS projects do not explicitly refer to vulnerabilities being fixed. Thus reconciling the information based on the textual description and code changes still requires manual effort (see Section 6.2.3). A broader discussion of the data integration problem can be found in Plate et al. (2015).
Differently from Plate et al. (2015), we provide a definition of construct change and consider the ASTs of the modified program constructs. This is used in Section 2.2 to establish whether a given library artifact includes the changes introduced by the fix.
Vulnerability detection
In this section we introduce the principles of our approach to detect the presence of vulnerabilities, based on the concepts introduced in the previous subsection. A concrete example of how this approach is applied in practice is illustrated in Section 4.1-(1).
Figure 1 shows how a vulnerability j is associated to an application a. Cj is the set of the constructs obtained by analyzing the fix commits of j, as described above. The set Si contains all the constructs of the OSS library i used by the application a, whereas Sa is the set of all constructs of the application itself.
Definition 3
An application depends on a vulnerable library i, affected by vulnerability j, if that library includes code vulnerable to j (referred to as vulnerable constructs, i.e., constructs that have been changed in the fix commits of j), thus, if
$$ (C_{j} \cap S_{i}) \neq\emptyset \wedge \forall c \in (C_{j} \cap S_{i}), \mathtt{AST}^{(c)} = \mathtt{AST}_{v}^{(c)} $$
(1)
According to Definitions 1 and 2, the list of construct changes contains constructs whose types are not limited to functions and methods but also include the outer constructs, e.g., the class. As a result, even if a vulnerability is fixed by adding new methods to an existing class, the intersection Cj ∩ Si is not empty, as it would contain a construct corresponding to the class.
Condition (1) in Definition 3 requires the abstract syntax tree of each construct of library i that was changed in the fix commits to be strictly equal to the vulnerable abstract syntax tree. If this condition holds, it is straightforward to conclude that the library is vulnerable to j. However, such a strict condition can hardly cope with real-world scenarios. In fact, the vulnerable and fixed abstract syntax trees of the construct change are representations of the code at a single point in time, i.e., before and after the patch is applied, whereas it is often the case that several versions of a library (e.g., Spring Framework v4.x, v5.x) are maintained in parallel, and thus several, possibly different, corrections may be applied for the different versions. If the same construct is changed in different ways in different versions, a given library version would only contain one of such vulnerable or fixed abstract syntax trees, thereby violating condition (1) of Definition 3. Moreover, the code evolves over time, and thus the code where the correction has to be applied may have undergone numerous refactorings during the history of the library. Similarly, the fixed code may be modified and refactored while moving forward. In the case of library versions released long before or after the application of the patch, an exact match of the abstract syntax trees is thus unlikely.
To cope with the above challenges, we relaxed condition (1) of Definition 3 as follows: an exact match is required for only a single construct, as long as the remaining ones do not raise any inconsistencies. Such inconsistencies occur if a single library version contains both fixed constructs with \(\mathtt {AST}_{f}^{(c)}\) and vulnerable ones with \(\mathtt {AST}_{v}^{(c)}\), and they have to be resolved manually.
$$ \exists \ c_{a} \in (C_{j} \cap S_{i}) \ | \ \mathtt{AST}^{(c_{a})} = \mathtt{AST}_{v}^{(c_{a})} \wedge \not\exists \ c_{b} \in (C_{j} \cap S_{i}) \ | \ \mathtt{AST}^{(c_{b})} = \mathtt{AST}_{f}^{(c_{b})} $$
(2)
Similarly, a library version Si includes the patch fixing vulnerability j if
$$ \exists \ c_{a} \in (C_{j} \cap S_{i}) \ | \ \mathtt{AST}^{(c_{a})} = \mathtt{AST}_{f}^{(c_{a})} \wedge \not\exists \ c_{b} \in (C_{j} \cap S_{i}) \ | \ \mathtt{AST}^{(c_{b})} = \mathtt{AST}_{v}^{(c_{b})} $$
(3)
Whenever condition (2) (resp. (3)) holds, we conclude that library version i is vulnerable (resp. fixed) with respect to j according to criterion AST equality.
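For illustration, conditions (2) and (3) amount to the following check. This is a minimal sketch reusing the types introduced earlier and assuming that the Ast type implements structural equality; the class, method and enum names are ours.

```java
import java.util.List;
import java.util.Map;

enum Assessment { VULNERABLE, FIXED, INCONSISTENT, UNKNOWN }

class AstEqualityCriterion {

    // Evaluates conditions (2) and (3): libraryAsts maps each construct of the
    // library version under analysis (S_i) to its AST; changes represents C_j.
    static Assessment assess(Map<Construct, Ast> libraryAsts,
                             List<ConstructChange> changes) {
        boolean anyVulnerable = false, anyFixed = false;
        for (ConstructChange cc : changes) {
            Ast inLibrary = libraryAsts.get(cc.construct());
            if (inLibrary == null) continue;          // construct not in C_j ∩ S_i
            if (inLibrary.equals(cc.vulnerableAst())) anyVulnerable = true;
            if (inLibrary.equals(cc.fixedAst()))      anyFixed = true;
        }
        if (anyVulnerable && anyFixed) return Assessment.INCONSISTENT; // resolve manually
        if (anyVulnerable) return Assessment.VULNERABLE;  // condition (2)
        if (anyFixed)      return Assessment.FIXED;       // condition (3)
        return Assessment.UNKNOWN;  // defer to the release-tree criteria below
    }
}
```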
For a given version of a library, it may happen that no equality is found while comparing abstract syntax trees. In this case, to draw a conclusion as to whether that library version contains the vulnerable or the fixed version of the code, other versions of the same library can be considered. The evaluation based on multiple library versions uses both the result of the comparison of the \(\mathtt{AST}^{(c)}\) of each changed construct and the order of the library versions.
The comparison of the \(\mathtt{AST}^{(c)}\) of each changed construct c is used to determine a distance between \(\mathtt{AST}^{(c)}\) and \(\mathtt {AST}_{v}^{(c)}\) (\(\mathtt {AST}_{f}^{(c)}\), resp.). To obtain the distance between abstract syntax trees we use tree differencing algorithms (Falleri et al. 2014; Fluri et al. 2007). In particular, we use Fluri et al. (2007), which, given two abstract syntax trees, provides an edit script to transform the former into the latter. The number of edit operations in the edit script is used as the distance in our approach.
To order library versions we rely on semantic versioning, following the usual schema major.minor.patch, extended with a fourth segment, build. This extension was made to cover the versioning schema of libraries like Apache Struts that use four segments (major.minor.patch.build).
Given a library name (e.g., the Maven group and artifact identifier), we order library versions by constructing a set of release trees, where a release tree is defined as follows.
Definition 4
A release tree is a binary tree such that
-
the root node always contains the first release of a given major.minor, i.e., major.minor.0.0;
-
the left child contains the next patch release, i.e., given a node Z1, the left child connected through the edge p is Z2, where Z2 > Z1 (denoted as Z1 ≺p Z2);
-
the right child contains the next build release, i.e., given a node W1, the right child connected through the edge b is W2, where W2 > W1 (denoted as W1 ≺b W2).
Figure 2 shows an example of a generic release tree. Note that, when omitted, the build segment of the version corresponds to 0.
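For illustration, a release tree can be sketched as a plain binary node structure (the names are ours):

```java
// Sketch of Definition 4: one release tree covers the versions of a major.minor line.
class ReleaseNode {
    final String version;   // e.g., "X.Y.2", or "X.Y.2.1" with an explicit build segment
    ReleaseNode nextPatch;  // left child, edge p: X.Y.2 ≺p X.Y.3
    ReleaseNode nextBuild;  // right child, edge b: X.Y.2 ≺b X.Y.2.1
    ReleaseNode(String version) { this.version = version; }
}
```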
In the following we present the additional criteria based on which we determine whether a vulnerability should be reported for a certain library version based on abstract syntax tree comparisons and release trees. Differently from criterion AST equality that may be applied to a single library version i, these additional criteria use the knowledge of several library versions. In fact, they require that both a release tree and the results of the abstract syntax tree comparisons be available. These criteria are only used when conditions (2) and (3) do not hold and they are applied in order of precedence.
Intersection
The intersection criterion considers library versions that are adjacent in the release tree (i.e., directly connected) and checks whether the abstract syntax tree comparisons show that the ancestor library version is more similar to the vulnerable code whereas the successor is more similar to the fixed code. If this is the case it concludes that the ancestor contains the vulnerable code whereas the successor contains the correction.
Definition 5
Given libraries S1,S2 in a release tree such that
$$S_{1} \prec_{p} S_{2} \wedge \not\exists \ S_{3} \ | \ S_{1} \prec_{p} S_{3} \wedge S_{3} \prec_{p} S_{2}$$
or
$$S_{1} \prec_{b} S_{2} \wedge \not\exists \ S_{3} \ | \ S_{1} \prec_{b} S_{3} \wedge S_{3} \prec_{b} S_{2}$$
(i.e., S2 is a direct child of S1 either through the patch or the build branch), then S1 is said to be vulnerable and S2 is said to be fixed with respect to vulnerability j according to criterion intersection if there exists a construct whose AST in S1 is more similar to the vulnerable AST whereas the AST of the same construct in S2 is more similar to the fixed one, and no other construct exists for which the opposite holds true. Let \(\mathtt {diff}(\mathtt {AST_{S_{1}}}^{(c)},\mathtt {AST_{S_{2}}}^{(c)})\) be the number of changes required to transform \(\mathtt {AST_{S_{1}}}^{(c)}\) into \(\mathtt {AST_{S_{2}}}^{(c)}\) (i.e., the abstract syntax tree edit distance); then the intersection criterion establishes that S1 contains the vulnerable code and S2 contains the corrected code if
$$ \begin{array}{@{}rcl@{}} \exists \ c \in (C_{j} \cap S_{1} \cap S_{2}) \ | \ \mathtt{diff}(\mathtt{AST_{S_{1}}}^{(c)},\mathtt{AST}_{v}^{(c)}) < \mathtt{diff}(\mathtt{AST_{S_{1}}}^{(c)},\mathtt{AST}_{f}^{(c)})\\ \wedge \ \mathtt{diff}(\mathtt{AST_{S_{2}}}^{(c)},\mathtt{AST}_{v}^{(c)}) > \mathtt{diff}(\mathtt{AST_{S_{2}}}^{(c)},\mathtt{AST}_{f}^{(c)}), \end{array} $$
(4)
and
$$ \begin{array}{@{}rcl@{}} \not\exists \ c_{1} \in (C_{j} \cap S_{1} \cap S_{2}) \ | \ \mathtt{diff}(\mathtt{AST_{S_{1}}}^{(c_{1})},\mathtt{AST}_{v}^{(c_{1})}) > \mathtt{diff}(\mathtt{AST_{S_{1}}}^{(c_{1})},\mathtt{AST}_{f}^{(c_{1})})\\ \wedge \ \mathtt{diff}(\mathtt{AST_{S_{2}}}^{(c_{1})},\mathtt{AST}_{v}^{(c_{1})}) < \mathtt{diff}(\mathtt{AST_{S_{2}}}^{(c_{1})},\mathtt{AST}_{f}^{(c_{1})}). \end{array} $$
(5)
Note that these conditions hold also when either \({\mathtt {diff}(\mathtt {AST_{S_{1}}}^{(c)},\mathtt {AST}_{v}^{(c)})=0}\) or \({\mathtt {diff}(\mathtt {AST_{S_{2}}}^{(c)},\mathtt {AST}_{f}^{(c)})=0}\), i.e., if S1 is vulnerable by criterion AST equality or S2 is fixed by criterion AST equality.
Example 2
Consider the release tree of Fig. 2 and let
-
S1 = X.Y.1
-
S2 = X.Y.2
-
\(\mathtt {diff}(\mathtt {AST_{S_{1}}}^{(c)},\mathtt {AST}_{v}^{(c)}) = 1\)
-
\(\mathtt {diff}(\mathtt {AST_{S_{1}}}^{(c)},\mathtt {AST}_{f}^{(c)}) = 3\)
-
\(\mathtt {diff}(\mathtt {AST_{S_{2}}}^{(c)},\mathtt {AST}_{v}^{(c)}) = 4\)
-
\(\mathtt {diff}(\mathtt {AST_{S_{2}}}^{(c)},\mathtt {AST}_{f}^{(c)}) = 2\)
In Fig. 3, we connect the distances of S1 and S2 to the vulnerable AST and the ones to the fixed AST. The former can be distinguished by the use of a bold line. As becomes clear from Fig. 3, the distances above represent an intersection, as the AST is closer to the vulnerable one for S1 (i.e., the tree edit distance is smaller) whereas it is closer to the fixed one for S2. As a result, according to criterion intersection we conclude that S1 is vulnerable and S2 is fixed.
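For illustration, the per-construct comparisons behind conditions (4) and (5) can be sketched as follows; the distances are the tree edit distances introduced above, and the usage comment plugs in the numbers of Example 2. The class and method names are ours.

```java
class IntersectionCriterion {

    // d1v = diff(AST_S1, AST_v), d1f = diff(AST_S1, AST_f),
    // and analogously d2v, d2f for S2 (edit-script lengths).
    static boolean supportsIntersection(int d1v, int d1f, int d2v, int d2f) {
        return d1v < d1f && d2v > d2f;  // S1 closer to vulnerable, S2 closer to fixed
    }

    static boolean contradictsIntersection(int d1v, int d1f, int d2v, int d2f) {
        return d1v > d1f && d2v < d2f;  // the opposite situation
    }

    // The criterion holds if at least one construct in C_j ∩ S_1 ∩ S_2 satisfies
    // supportsIntersection and none satisfies contradictsIntersection. With the
    // distances of Example 2: supportsIntersection(1, 3, 4, 2) == true, hence
    // S1 = X.Y.1 is assessed vulnerable and S2 = X.Y.2 fixed.
}
```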
Major release
The underlying intuition of the major release criterion is that once the correction of a security defect is included in a library version, all the versions that follow also include the correction. Thus, the release tree for a given major.minor is used to establish the ordering of library versions, and to determine those that are descendants of versions known to be fixed by criteria AST equality or intersection.
Definition 6
Given the libraries S1 and S2 such that S1 is fixed according to criteria AST equality or intersection, then S2 is said to be fixed according to criterion major release if
$$S_{1} \prec_{p} S_{2} \ \vee \ S_{1} \prec_{b} S_{2},$$
where, here and in the examples below, ≺p and ≺b also denote the transitive closure of the respective parent-child relations of Definition 4.
Example 3
Consider the example of Fig. 4, where nodes marked with double circles are fixed with respect to vulnerability j by criteria AST equality or intersection. Then, according to criterion major release, we can conclude that
-
library versions X.Y.3 and X.Y.4 are fixed as X.Y.2 ≺pX.Y.3 and X.Y.2 ≺pX.Y.4, i.e., they follow the fixed one X.Y.2;
-
library versions X.Y.0.2 and X.Y.2.1 are fixed as X.Y.0.1 ≺bX.Y.0.2 and X.Y.2 ≺bX.Y.2.1, i.e., they follow the fixed ones X.Y.0.1 and X.Y.2, resp.
Minor release
Similarly to criterion major release, the idea of the minor release criterion is that library versions preceding one that contains the vulnerable code also contain the vulnerable code. Again, the release tree is used to establish the ordering of library versions. The library versions for which the criteria AST equality or intersection concluded that they contain the vulnerable code are used as the starting point.
Definition 7
Given the library versions S1 and S2 such that S1 is vulnerable according to criteria AST equality or intersection, S2 is said to be vulnerable according to criterion minor release if
$$S_{2} \prec_{p} S_{1} \ \vee \ S_{2} \prec_{b} S_{1}.$$
Example 4
Consider the example of Fig. 5, where dashed nodes are vulnerable to vulnerability j by criteria AST equality or intersection. Then, according to criterion minor release, we can conclude that
-
library versions X.Y.1 and X.Y.0 are vulnerable as X.Y.1 ≺pX.Y.2 and X.Y.0 ≺pX.Y.2, i.e., they are ancestors of the vulnerable one X.Y.2;
-
library versions X.Y.0.1 and X.Y.0 are vulnerable as X.Y.0.1 ≺bX.Y.0.2 and X.Y.0 ≺bX.Y.0.2, i.e., they are ancestors of the vulnerable one X.Y.0.2.
Note that library version X.Y.0 is vulnerable according to both conditions.
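For illustration, both propagation criteria amount to simple traversals of a release tree: descendants of a fixed version inherit the fixed assessment (major release), and the versions on the path from the root to a vulnerable version inherit the vulnerable assessment (minor release). A minimal sketch, reusing the ReleaseNode and Assessment types introduced earlier:

```java
import java.util.Map;

class ReleaseTreeCriteria {

    // Major release criterion: invoked on a node found fixed by AST equality or
    // intersection, marks it and all versions that follow it in the tree as fixed.
    static void markDescendantsFixed(ReleaseNode n, Map<String, Assessment> result) {
        if (n == null) return;
        result.putIfAbsent(n.version, Assessment.FIXED);
        markDescendantsFixed(n.nextPatch, result);
        markDescendantsFixed(n.nextBuild, result);
    }

    // Minor release criterion: marks every version on the path from the root to a
    // vulnerable one. Returns true if the vulnerable version lies in this subtree.
    static boolean markAncestorsVulnerable(ReleaseNode n, String vulnerableVersion,
                                           Map<String, Assessment> result) {
        if (n == null) return false;
        boolean onPath = n.version.equals(vulnerableVersion)
                || markAncestorsVulnerable(n.nextPatch, vulnerableVersion, result)
                || markAncestorsVulnerable(n.nextBuild, vulnerableVersion, result);
        if (onPath) result.putIfAbsent(n.version, Assessment.VULNERABLE);
        return onPath;
    }
}
```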
Greater release
Criterion greater release is meant to cope with the case of library versions released long after the vulnerability was found and subsequently fixed. The underlying idea is to check whether the entire release tree was created after all the versions that were found fixed according to criteria AST equality or intersection.
Example 5
Consider a library having two release trees: the one of Fig. 4 and a tree with root node A.B.0. All versions belonging to the release tree A.B are fixed according to criterion greater release if the release date of A.B.0 is later than those of X.Y.2 and X.Y.0.1, which are the versions that were found fixed according to criteria AST equality and intersection.
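A minimal sketch of this date-based check, assuming a release date is known for every library version (the class and method names are ours):

```java
import java.time.Instant;
import java.util.List;

class GreaterReleaseCriterion {
    // All versions of a release tree are assessed fixed if the tree's first
    // release (its root node) postdates every version already found fixed by
    // the AST equality or intersection criteria.
    static boolean allFixed(Instant rootReleaseDate,
                            List<Instant> knownFixedReleaseDates) {
        return !knownFixedReleaseDates.isEmpty()
            && knownFixedReleaseDates.stream().allMatch(rootReleaseDate::isAfter);
    }
}
```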
Manual inspection is still required whenever no conclusion can be drawn automatically, e.g., when
$$\exists \ c_{1},c_{2} \in (C_{j} \cap S_{i}) {\kern2.2pt} | {\kern2.2pt} \mathtt{AST}^{(c_{1})} = \mathtt{AST}_{v}^{(c_{1})} \ \wedge \ \mathtt{AST}^{(c_{2})} = \mathtt{AST}_{f}^{(c_{2})}.$$
Differently from Plate et al. (2015), we define the set of constructs Cj as independent of any library i, which takes into consideration that single constructs, e.g., classes or packages, are copied to (included in) libraries other than the original ones produced by the respective open source project. A vulnerability in a library, be it one produced in the context of the original open source project or one produced by other developers who just copied (included) the code of other libraries, is then detected through the intersection of its constructs with Cj. This approach has several advantages: First, it makes explicit that the vulnerable constructs responsible for a vulnerability j can be contained in any library artifact i; hence, the approach is robust against the prominent practice of re-bundling the code of OSS libraries. Second, it is sufficient that a library version includes a subset of the vulnerable constructs for the vulnerability to be detected. Last, it improves the accuracy compared to approaches based on metadata, which typically flag entire open source projects as affected, even if projects release functionalities as part of different libraries. Apache POI, for instance, while developed in a single source code repository, is released as a set of distinct, independent libraries. Because Plate et al. (2015) focuses on newly-disclosed vulnerabilities, it assumes that, at the time of disclosure, every library that includes constructs changed in the fix commit must be vulnerable. While this assumption is valid at that moment in time, it does not hold for old vulnerabilities, which require establishing whether a given library contains the fixed code. We achieve this by comparing the ASTs of the constructs in use with those of the vulnerable and fixed versions, and by providing a set of criteria to draw a conclusion automatically.
Reachability of vulnerable code: Dynamic assessment
After having determined that an application depends on a library version that includes vulnerable constructs, it is important to establish whether these constructs are reachable. In this paper, we use the term reachable to denote both the case where dynamic analysis shows that a construct is actually executed and the case where static analysis shows potential execution paths. The underlying idea is that if an application executes (or may execute) vulnerable constructs, there exists a significant risk that the vulnerability can be exploited. The dynamic assessment described here is borrowed from our previous work (Plate et al. 2015).
In the following we explain how we use dynamic reachability analysis in our approach; in Section 4.1-(2-3) we illustrate its application in practice.
Figure 6 illustrates the use of dynamic analysis to assess whether the vulnerable constructs are reachable by observing actual executions. Tai represents the set of constructs, either part of application a or its bundled library i, that were executed at least once during some execution of the application. To increase readability, the figure only visualizes one such library, while the actual analysis always considers all libraries that are directly or transitively used by the application. The intersection Cj ∩ Tai comprises all those constructs that are both changed (added, deleted or modified) in the fix commits of j and executed in the context of application a because of its (direct or transitive) use of library i.
The collection of actual executions of constructs can be done at different times: during unit tests, integration tests, and even during live system operation. In our implementation for Java, for instance, the collection is accomplished with a Java instrumentation agent. If enabled using a command line argument of the Java Virtual Machine (JVM), the agent adds suitable Java statements to each method when the corresponding class is loaded for the first time. This happens regardless of whether a class belongs to a directly or indirectly used library, since they are all included in the JVM’s classpath. Moreover, this implementation does not require specific test cases, but relies on existing (unit, integration, or manual) tests. Therefore, the effectiveness of dynamic analysis in discovering the execution of vulnerable code significantly depends on the coverage achieved by such tests.
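For illustration, such an agent can be built on the standard java.lang.instrument API. The sketch below is not Steady's actual agent and omits the bytecode rewriting itself (which would be done with a library such as ASM or Javassist); it only shows where the instrumentation hooks in.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class TraceAgent {
    // Invoked by the JVM when started with -javaagent:trace-agent.jar
    // (the JAR manifest must declare Premain-Class: TraceAgent).
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> redefined, ProtectionDomain pd,
                                    byte[] classfile) {
                // A bytecode library would insert, at the start of every method
                // body, a call recording the executed construct, thereby
                // populating the set T_ai.
                return null; // null = class left unchanged (placeholder)
            }
        });
    }
}
```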
Reachability of vulnerable code: Static assessment
In addition to the analysis of actual executions (dynamic analysis), our approach uses static analysis to determine whether the vulnerable constructs are potentially executable.
An example of how this approach is applied in practice is presented in Section 4.1-(4).
Our method uses static analysis in two different ways.
First, we use it to complement the results of the dynamic analysis, by identifying the library constructs reachable from the application.
Second, (Section 2.5) we combine the two techniques by using the results of the dynamic analysis as input for the static analysis, thereby overcoming limitations of both techniques: static analyzers are known to struggle with dynamic code (such as, in Java, code loaded through reflection (Landman et al. 2017)); on the other hand, dynamic (test-based) methods suffer from inadequate coverage of the possible execution paths.
Figure 7 illustrates how we use static analysis to complement the results of dynamic analysis. Rai represents the set of all constructs, either part of application a or a bundled library i, that are found reachable starting from the application a and thus can be potentially executed. Again, for the sake of readability, the figure only visualizes one such library, while the actual analysis always considers all libraries that are directly or transitively used by the application. Static analysis is performed by using a static analyzer, e.g., the T.J. Watson Libraries for Analysis or the Soot framework, to compute a graph of all constructs of all libraries reachable from the application constructs. The implementation of the approach, Steady, only requires the distributed Java archives (JARs) of each library the application depends on (but not their source code). However, as those archives are also needed in other contexts, e.g., to compile or test the application source code, their presence does not represent an additional requirement. The intersection Cj ∩ Rai comprises all constructs that are both changed in the fix commit of j and are part of the call graph, thus, can be potentially executed.
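Independently of the concrete analyzer, the reachability computation itself boils down to a traversal of the computed call graph. A minimal sketch, assuming the call graph is given as adjacency sets over construct identifiers (a simplification of what analyzers like WALA or Soot actually produce):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Computes R_ai: all constructs reachable from the given seeds (the
    // application constructs S_a) in the call graph.
    static Set<String> reachable(Map<String, Set<String>> callGraph,
                                 Set<String> seeds) {
        Set<String> visited = new HashSet<>(seeds);
        Deque<String> worklist = new ArrayDeque<>(seeds);
        while (!worklist.isEmpty()) {
            String caller = worklist.poll();
            for (String callee : callGraph.getOrDefault(caller, Set.of())) {
                if (visited.add(callee)) worklist.add(callee);
            }
        }
        return visited;
    }
}
```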
Combination of dynamic and static assessment
This subsection presents our method to combine static and dynamic reachability analysis. The application in practice is illustrated in Section 4.1-(5).
As shown in Fig. 8, in the combined method we use the set of constructs actually executed, Tai, as starting point for the static analysis. The result is the set \(R_{T_{ai}}\) of constructs reachable starting from the ones executed during the dynamic analysis. The intersection \(C_{j}~\cap ~R_{T_{ai}}\) comprises all constructs that are both changed in the fix commit of j and can be potentially executed. Note that library i, as in Figs. 6 and 7, can be directly or transitively used by the application.
We explain the benefits of the combinations of the two techniques through the example in Fig. 9. In the following, we denote a library bundled within a software program with the term dependency.
Example 6
Let Sa be a Java application having two direct dependencies S1 and Sf, where S1 has a direct dependency S2 that in turn has a direct dependency S3 (thus S2 and S3 are transitive dependencies of the application Sa). S1 is a library offering a set of functionalities to be used by the application (e.g., Apache Commons FileUpload). Moreover, the construct γ of S1 calls the construct δ of S2 dynamically, e.g., by using Java reflection, which means the construct to be called is not known at compile time. Sf is what we call a “framework”, providing a skeleton whose functionalities are meant to call the application defining the specific operations (e.g., Apache Struts, Spring Framework). The key difference is the so-called inversion of control, as frameworks call the application instead of the other way round.
With the vulnerability detection step of Section 2.2, our approach determines that Sa includes vulnerable constructs for vulnerabilities j1 and j2 via the dependencies Sf and S3, respectively. Note that even if S3 only contains two out of the three constructs of \(C_{j_{2}}\), our approach is still able to detect the vulnerability.
We start the vulnerability analysis by running the static analysis of Section 2.4, which looks for all constructs potentially reachable from the constructs of Sa. The result is the set Ra1 including all constructs of Sa and all constructs of S1 reachable from Sa. As expected, Sf is not reachable in this case, as frameworks are not called by the application. Moreover, it is well known that static analysis cannot always identify dynamic calls like those performed using Java reflection. As the call from γ to δ uses Java reflection, in this example only S1 is statically reachable from the application. As shown in Fig. 9, Ra1 does not intersect with any of the vulnerable constructs.
The dynamic analysis of Fig. 6 produces the set Ta (omitted from the figure) of constructs that are actually executed. Though no intersection with the vulnerable constructs is found, the dynamic analysis increases the set of reachable constructs (Ra1 ∪ Ta in Fig. 9). In particular, it complements static analysis by revealing paths that the latter missed. First, it contains construct 𝜖 of framework Sf that calls construct α of the application. Second, it follows the dynamic call from γ to δ.
Combining static and dynamic analysis, as shown in Fig. 8, we can use static analysis with the constructs in Ta as starting point. The result is the set \(R_{T_{a}}\) (omitted in the figure) of all constructs that can be potentially executed starting from those actually executed Ta.
After running all the analyses, we obtain the overall set \(R_{a1} \cup T_{a} \cup R_{T_{a}}\) (shown with solid fill in Fig. 9) of all constructs found reachable by at least one technique. Its intersection with Cj1 and Cj2 reveals that both vulnerabilities j1 and j2 are reachable, since one vulnerable construct for each of them is found in the intersection and is thus reachable (\(\eta \in C_{j_{1}}\) is reachable from 𝜖 and \(\omega \in C_{j_{2}}\) is reachable from δ).
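For illustration, the overall assessment of Example 6 can be sketched as follows, reusing the reachable helper from the previous sketch; callGraph, appConstructs, traced, cJ1 and cJ2 are assumed inputs corresponding to the sets discussed above.

```java
// R_a1: statically reachable from the application constructs S_a.
Set<String> rA1 = Reachability.reachable(callGraph, appConstructs);
// R_{T_a}: statically reachable from the constructs traced at run time (T_a).
Set<String> rTa = Reachability.reachable(callGraph, traced);

// R_a1 ∪ T_a ∪ R_{T_a}: everything found reachable by at least one technique.
Set<String> found = new HashSet<>(rA1);
found.addAll(traced);
found.addAll(rTa);

// A vulnerability is flagged as reachable if any of its constructs is in the union.
boolean j1Reachable = cJ1.stream().anyMatch(found::contains);
boolean j2Reachable = cJ2.stream().anyMatch(found::contains);
```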
Applicability of the approach to different programming languages
This section discusses the possibility to apply the above-presented approach to different programming languages and ecosystems, considering their respective particularities.
A prerequisite for both vulnerability detection and reachability analysis is the unique identification of constructs, both during the analysis of source code (altered by fix commits) as well as during the analysis of distributed packages (downloaded during the build process of downstream dependents).
The generic concept of unique construct identifiers, introduced with Definition 1, has been implemented for the Java and Python programming languages. In both cases, constructs are identified using their fully-qualified name.
In Java, the fully-qualified name of, e.g., method baz(int) in class Bar and package foo is foo.Bar.baz(int). The Java compilation process does not remove or alter such identifiers (apart from constructors of non-static inner classes), thus, they can be easily built from source code and Java archives (JARs) containing Java bytecode.
In Python, the fully-qualified name of, e.g., function loads(s,**kwargs) in module json and package flask is flask.json.loads(s,**kwargs). Python package formats, e.g., egg or wheel, contain the Python source code as-is, thus, the same parser can be used to identify Python constructs in source code and in distributed packages.
The implementation choice of using fully-qualified names supports other programming languages as long as they have a comparable naming scheme that allows construct identification in source and distributed code. Moreover, the language's development community must follow consistent naming conventions (that is, construct names can be assumed to be globally unique). However, those properties are not satisfied for certain languages and cases.
Specifically, we acknowledge that fully-qualified names can hardly be used for programming languages compiled into machine code, e.g., C and C++. For such languages, it is very hard to recognize and compare constructs found in source code with those in the distributed binary packages. But even in case of interpreted languages, the code of distributed packages can deviate significantly from its corresponding source code. JavaScript libraries running in browsers, for instance, are commonly minimized and obfuscated in order to reduce their download size, which makes it hard to relate constructs identified in source code to their counterparts in the distributed libraries. Such transformations are less common for JavaScript running on servers, e.g., in the Node.js runtime.
Where fully-qualified names cannot be used, one has to identify constructs using other information, e.g., characteristics of method or function bodies. This, however, comes with its own challenges and represents a separate body of research.
As described in Section 2.2, vulnerability detection is possible as soon as constructs can be uniquely identified and compared, and has been successfully implemented for Java and Python.
The subsequent reachability analyses, dynamic as described in Section 2.3 and static as described in Section 2.4, depend on other language characteristics, and have only been implemented for Java. However, before mentioning potential problems in other languages, one has to note that the absence of one or the other analysis for a given programming language does not render an implementation useless. The precision of code-based vulnerability detection is valuable by itself, even when reachability analyses cannot be performed.
Static program analysis commonly struggles with dynamic languages such as Python and JavaScript. Characteristics like the absence of static type information or dynamic code execution through statements like eval make it hard to build accurate call graphs. Depending on the chosen analysis approach, the approximated call graphs can be either too small or too big, resulting in false negatives and false positives respectively.
Dynamic program analysis is considered straightforward in cases where tracing tools already exist. In other cases, especially for interpreted languages, the source code of downloaded open source packages can be altered prior to or during test execution, e.g., using techniques like aspect-oriented programming.