1 Introduction

The legacy Cell-phone and feature phone were primarily used for messaging, phone calls, short-text, and multimedia communication. However, the advent of smartphone operating systems, the Apple iOS, and Google Android, has ushered a paradigm shift in computing. The current generation smartphone is sophisticated hand-held computers capable of running third-party developer applications (apps). Users prefer Android OS-supported smartphones for social networking, banking, email, and shopping with apps taking an increasing role in our daily life [8, 17, 21]. Android framework components such as Activities, Broadcast receivers, Content providers, and Services are the basic building blocks of any user app. Android can be realized as an asynchronous framework whose components are independent with multiple entry points [12, 15].

Android framework has a rich Inter-component Communication (ICC) model. It enables the development and deployment of highly usable apps with a facility to extend services offered by other apps. Hence, ICC facilitates the communication between components of apps. Intent is a complicated messaging technique for implementing ICC [13]. Although Intent is equipped with a security mechanism to prevent attackers from directly accessing data and services of third-party apps, it does have critical security issues [15]. Each app has a file (manifest file) that includes definitions of Intent filters. These filters determine Intents that can be treated by the app components [13].

Intent resolution [13] is an Android framework mechanism for matching a component via its Intent filter with the corresponding Intent specified in source-code of another component. The Intent filters are defined in the Anroidmanifest.xml, a binary file for app developers used for explicit declarations. Currently, Android does the matching process even if Intent is leaking sensitive data. While an Android app is executed in Sandbox, an app Intent is resolved dynamically. Therefore, it is not possible to afford the delay caused by state-of-the-art techniques, dynamically testing the Intent leakage. Further, Android ICC framework is limited due to non-availability of formal Intent representation model that can match the manifest defined Intent filters with corresponding Intents at run-time. Such a model is invaluable for theoretical representation and potential framework improvements that can be used for proofs of desirable properties of system updates. More specifically, such formal representation would provide a robust basis for a modified Intent resolution algorithm that considers sensitive data leakage.

In this paper, we investigate Intent vulnerabilities responsible for data leakage and propose a formal model which defines Android ICC semantics. The semantics captures Intents, Intent filters, Intent-related APIs, and finally the Intent-resolution algorithm matching Intents and Intent filters. We define a set of key properties that guarantee a consistent and realistic Android state in our proposed semantics. We label the states satisfying such properties as “well-defined”. Further, we prove that the transitions of the proposed semantic preserve the “well-defined” propriety of states. Then, we prove the effectiveness of the semantics in two ways: i) We express different aspects of ICC such as changing Intent types via its attributes; ii) We express and check a specific potential update for improving ICC security. The update includes inserting a caching element to semantic states so that processes that led to security breaches can be reported and avoided later on. We expressed and the model and verified all its lemmas and theorems with Coq proof assistant, a formal modeling tool [10]. We present the proof of one lemma in the paper and make proofs of all other lemmas and theorems available at our GitHub repository.Footnote 1

Based on our ICC formalization, we propose DLAIR, a security-aware algorithm for Intent resolution. To assist DLAIR, we propose a lightweight and heuristic tool (\(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\)) that identifies suspicious execution paths. These are the paths that use Intents in a way that is suspicious of leaking sensitive data. On a set of 2, 000 real-world apps, we evaluated \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and compared it against Flowdroid [3], the state-of-the-art information leakage detection tool. The evaluation proves that \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is more effective and efficient than Flowdroid. \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid reported issues in 824 and 425 applications, respectively. The average running times per application for \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid are 47.8 and 74.6 seconds, respectively. Therefore \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) has a better time complexity than Flowdroid. Efficiency is an important factor as \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is to run dynamically as part of Android. On a set of 100 apps, Flowdroid has higher false-negative rate (21 against 8) and lower false-positive rate (4 against 13) than \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\). Bearing in mind the dynamic environment in which \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is designed to work, the advantages of efficiency and high true positives rate overcome the disadvantage of high false-negative rate (due to the approximation adopted in \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) to boost its lightweight feature). Some of the issues reported by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\), which were not identified by the Flowdroid, are referred to in this paper. The metadata of the dataset and results of the experimental evaluation are available in our GitHub repository.Footnote 2

Contributions This paper has the following contributions:

  1. 1.

    We propose a formal model to theoretically represent the Android framework for Inter-Component Communication (ICC).

  2. 2.

    We formally represent general and security-related ICC properties in our model.

  3. 3.

    We implement and verify our formal model and all the lemmas and theorems of the paper in the well-known proof-assistant Coq [10].

  4. 4.

    We propose DLAIR, a novel security-aware Intent resolution framework.

  5. 5.

    Based on the security guarantee, we propose \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\), a lightweight and heuristic tool that detects Intent generated sensitive data leakage.

  6. 6.

    We compare \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against Flowdroid as one of the state-of-the-art techniques.

Paper Outline The paper is organized as follows. The most relevant state-of-the-art is critically evaluated and discussed in Sect. 2. Necessary background is presented in Sect. 3. The threat model that motivates the work in this paper is presented in Sect. 4. Section 5 presents our formal model for ICC framework. This model comprises the semantic states and key properties defining well-defined state. The main APIs of ICC have been discussed in this section as well. In Sect. 6, we present semantic rules for the framework APIs and prove that, semantic preserves the “well-defined” predicate of states. Section 7 utilizes the semantics presented in the paper to express basic and advanced ICC attributes. Section 8 presents DLAIR including the lightweight tool \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and its design, implementation, and evaluation.

2 Related work

Many studies have been published on intents and component communication using intents. This section reviews recent studies focused on these issues and also on Android formalization [5, 28]. Potential future research is also discussed in this section.

Permissions are located in application manifest files. Permissions can be realized as requests to acquire access to delicate resources. Betarte et al. [5, 6] presented a model for formalizing permission system of Android 6. They also formalized and proved some properties and security mechanisms for the permission model of Android using Coq. While our model targets the interesting behavior of intents and intent filters simply, the work in [5, 6] is focused on the permission system in a complicated way. Chin et al. presented a tool, ComDroid [9], to check application interaction in Android. The tool is based on code analysis and detects security vulnerabilities related to data leakage in Android components. However, unlike our work, ComDroid is not supported by formal semantics to application interaction or its analysis steps.

Xu et al. presented a malware detection method, ICCDetector [29] which needs training with sets of benign applications and malware. This method focuses on the interactions among components of the same application. However, most other similar methods focus on resources claimed by malware such as permissions, system calls, and API calls. The work in our paper can be realized as a formal way to represent this interaction between components. Feizollah et al. [16] studied the effectiveness of including intent information in a characterizing feature for detecting malicious applications. They claimed that using intents reveals more precise malware behaviors than using only permission features. This emphasizes the need for a formal model to intents such as the one presented in our paper.

Building on the Android framework, Schmerl et al. presented a technique, Raindroid [25], that boosts security and preserves extendability. Raindroid relies on static analysis methods to recognize interaction among applications. The technique relies also on run-time methods that observe these interactions. Then the methods advise the system to reject interactions, require user permissions, or grant them. However, formal reasoning about the correctness of the advice is not presented in the paper. The model presented in our paper can be augmented with specific APIs to provide formal reasoning. Barros et al. presented static-analysis methods for Java reflection and intents in [4]. These methods are based on implicit control-flow. The methods also reveal control-flow points and data passed at these points. While our work is based on axiomatic semantics to precisely capture the Intent concept, the work in [4] uses type systems to treat intents.

Android has a mechanism by which an application can have private components. Such components are not accessible by other applications. Using next-intent vulnerability (NIV), this privacy can be bypassed by malicious applications. Hence private components become invocable by other applications. In [26], Tang et al. proposed an intent-flow static analysis NIVAnalyzer to investigate smali code and reveal NIV. The static detection is then followed by a dynamic verification via crafted exploit applications. However, the method is not supported with a formal model proving the correctness of the analysis. Some of the APIs used in NIVAnalyzer are given semantics in our model. Also remaining APIs (such as getIntent()) used in NIVAnalyzer can be added to our model easily. This can lead to formal reasoning about NIV.

3 Background

Android Concepts The building blocks for the Android app development framework are activities (UI tasks); broadcast receiver (system-wide broadcasts); content provider (data handling); and service (background services). A service is a group of background processes that do not have a user interface since they execute in the background. Broadcast receivers are background event observers that are commenced by other components. Content providers enable reaching and treating memory contents such as files, folders, and databases [11, 20].

IntentFootnote 3 is a messaging object used for communication between the app components. Intents are either explicit or implicit. An explicit Intent determines the target component which is usually in the same app. Implicit Intents do not specify a certain component. Instead, they specify the needed action. This makes it possible for a component from another app to do the action. Listing 1 illustrates an example of an explicit Intent usage, ImpViewIntent also manifests implicit intent example. ImpViewIntent exhibits the action for viewing text. ImpViewIntent requests that another adequate app shows the text (announcement).

figure a

An Intent filter is a specification in manifest file. The filter determines Intents that a component can handle. Declaring an Intent filter for an activity enables other apps to invoke the activity using a convenient Intent. Listing 2 demonstrates an example of such Intent filter. The filter declares that its component receives an ACTION_VIEW Intent when text is the value of data type [7].

figure b

Intent Resolution Intent resolution [13] is the process of assigning a convenient component to a given implicit Intent. The assigned component must contain an Intent filter whose attributes match corresponding ones in the Intent. These attributes are action, category, and data. Therefore, Intent resolution includes three tests: action, category, and data tests. The tests are formalized n Table 2.

4 Threat model

This section presents the threat model which motivates us to formalize the intent resolution mechanism. The threat model visualized in Fig. 1 depicts main entities of the Android system; App1, App2, and Intent Resolution Algorithm (IRA). App1 has a component Comp1, that defines an implicit Intent, I, to start an activity. We assumes that I stores sensitive data in its Extra field. This is fair assumption as it common for Intents to store sensitive data [1]. Then I is passed to the API startActivity that asks Android to start an activity with characteristics determined in I. The Android system passes I to IRA to find a matching component. App2 has a component, Comp2. The manifest file of App2 has an Intent filter, F, corresponding to Comp2. IRA uses F to check if Comp2 is a good match for I. If I and F match each other, I (with the wrapped sensitive data) is passed to App2 that initiates an instance of Comp2. The flow of I from App1 to App2 is an example of leaking sensitive data in the ICC framework.

The example above shows that there is a need for a lightweight and heuristic algorithm that identifies Intent-related sensitive data leakage. This need motivated our paper. The lightweight nature of the algorithm is justified by the fact that it is to be used dynamically while the app is running. Therefore the algorithm should not affect the performance significantly. The heuristic aspect of the algorithm is due to the fact that it should recognize cases where there is no need for security checks. Examples of these cases are when the checked Intent and the Intent filter are protected by permissions. In cases like these, the sensitive data passing is intended for some logical reasons related to the functions of the involved apps. The other motivation for this paper is the need to build the algorithm on formal semantics for the process of Inter-Component communication (ICC). It would be an advantage if the semantics is general enough to reason formally about ICC functional and security aspects [29].

Fig. 1
figure 1

Threat model

5 Semantic states

In the following, we present the basis of our formal model for Android ICC. This section introduces the states of the model and axiomatic semantics [18, 23] for Intent APIs. The semantics are formalized as state-transition maps that preserve a well-defined feature of states. The theoretical model is verified with the proof assistant Coq [10].

Fig. 2
figure 2

Intents, Intent-filters, and Applications Formalization

An Ù application, as described in Fig. 2, is a pair of an application id, AppID, and a manifest, Manifest. The Manifest component is a formalization of the Android manifest file. The Manifest is formalized as a pair: Comp\(^*\) representing the application components and Compf representing Intent filters together with their hosting components. A component (Comp) is an element of the set {Activity, Service, BroadReceiver} which are uniquely specified using identifiers [11]. Moreover, the Service has a Boolean (ClientServer) that specifies whether the service is a client or server one. The sets AId, SId, BId denote sets of activity IDs, service IDs, and broadcast receiver IDs, respectively. The union of these sets forms the set of component IDs, CmpId. Since the Components are implemented as classes, the Android framework creates instances of the component classes at run-time. In the following, the instances will be referred to as component instances. As long as the instance exists, we call it an active or a running instance. We describe a component as installed (downloaded) means that the component and its app are included in the system under study [11].

Intent and Intent filters concepts are also formalized in Fig. 2. An Intent is 6-tuple that consists of:

  1. o

    IntId: the Intent identifier,

  2. o

    CmpId: the identifier of the component hosting the Intent,

  3. o

    Action: the action type of the Intent,

  4. o

    Data: the data type of the Intent,

  5. o

    \((\texttt {Category})^*\): the set of categories targeted by the Intent, and

  6. o

    Extra: the extra data (if any) needed to execute the Intent.

An Intent filter is quadruple that consists of:

  1. o

    IntFilId: the Intent filter identifier,

  2. o

    \((\texttt {Action})^*\): the set of action types that can be handled by the Intent filter,

  3. o

    \((\texttt {Category})^*\): the set of categories that can be handled by the Intent filter, and

  4. o

    \((\texttt {Data})^*\): the set of data types that can be handled by the Intent filter.

Fig. 3
figure 3

Model states formalization

The semantic states (Fig. 3) of the model are 6-tuples with the following components:

  1. o

    \(\texttt {Apps}:\) a set of installed applications,

  2. o

    Active: a set of pairs of active instances of components together and their components.

  3. o

    IntFilter: a set of pairs of applications with their Intent filters.

  4. o

    WaitImpInt: a set of implicit Intents still not linked to a matching Intent filter (i.e. needing Intent resolution).

  5. o

    IntentResol: a set of pairs of Intents and Intent filters that were found to match each other.

  6. o

    AppLauncher: a set of pairs of applications and their launching components.

The notation IComp denotes a component instance. The instance is identified via an ID, InstId.

Fig. 4
figure 4

The formalization of the predicate WellDefined and its sub-predicates

The proposed semantics relies on the notion of well-defined system states. The notion captures system states satisfying realistic Android characteristics. These characteristics ensure states respecting the principles of Android OS. Formally, this notion is expressed as a predicate (WellDefined) on system states (SysState). Figure 4 shows the formalization of the predicate and its sub-predicates in the proof assistant Coq [10]. All the predicates assume a state variable \(s\in \texttt {SysState}\). To precisely simulate semantics [18, 23] of Intent APIs, we define some predicates and functions in Coq. These are presented and described in Table 1.

Table 1 Auxiliary predicates and functions

The following sub-predicates compose the WellDefined predicate:

  1. 1.

    The predicate WaitIntsImp holds for a state s if WaitImpInt links applications with implicit Intents only.

  2. 2.

    The predicate UsrIntFsDefd holds for a state s if IntFilter links an application a to an Intent filter f only if a is included in s and f is included in one of the components in Manifest a. The predicate guaranties that treated Intent filters are obtained from installed applications.

  3. 3.

    The predicate NoRepIntFs holds for a state s if an Intent filter can not belong to two identical components of different apps.

  4. 4.

    The predicate NoRepImpInts holds for a state s if no implicit Intent waiting for resolution can belong to two different apps.

  5. 5.

    The predicate ExstAppforWatInt holds for a state s if implicit Intents not having matching components yet belong to system applications. This predicate guaranties that treated Intents are system ones and no foreign Intents are leaking data.

  6. 6.

    The predicate RunInsInSysComp holds for a state s if every running instance ic of a component c has its corresponding component installed.

  7. 7.

    The predicate ResExstIntIntF holds for a state s if the matching between Intent i and Intent filter f implies two conditions. The first condition is that i is an implicit Intent in an installed application, a1. The other condition is that f belongs to an installed application, a2.

We later prove that the predicate WellDefined is invariant concerning transition maps of the semantics [18, 23].

Fig. 5
figure 5

Intent-related APIs

Figure 5 presents APIs that are related to Intents. The formal semantics of the APIs is presented in the next section and their description is as follows:

  1. o

    startActivity(i) starts an activity that is determined by the Intent i.

  2. o

    startService(i) starts a service that is specified by the Intent i.

  3. o

    sendBroadcast(i) sends a broadcast that is fixed by the Intent i.

  4. o

    setComponent(i, option cmpId) explicitly sets the component that the Intent will target.

  5. o

    setAction (i, option act) sets the attribute Action of the given Intent.

  6. o

    setData(i, option dat) sets the attribute Data of the Intent i.

  7. o

    addCategory(i, cat) adds a category to the list attribute Category of the Intent i.

  8. o

    putExtra(i, ef, ev) sets the attributes ef and ev of the Intent i.

These APIs include ones that start a component via an Intent. The APIs also include ones that modify attributes of Intent objects. Therefore, the APIs are among the ones most used by Android programmers when using Intents. The behavior described above for the APIs is based on the official specification by Google [11, 13].

Fig. 6
figure 6

Pre and post conditions of the semantics of intent-related APIs

6 Semantic rules

This section presents the semantics of the Android APIs (Fig. 5) most related to ICC. The semantics is captured by determining pre and post conditions. For each \(\texttt {api}\in \texttt {API}\), a pre-condition expresses the conditions of a system state able to execute api. However, its post-condition describes the main characteristics of the system state resulting from executing api. The relationship \(\rightsquigarrow \) denotes the execution of api. Figure 6 introduces the conditions \(\texttt {pre}(s)\) and \(\texttt {post}(s^\prime )\) such that \(\texttt {api}:s\rightsquigarrow s^\prime \).

We first provide a higher level of explanation for intent resolution and then present its formalization. We then use the formalization in the semantic rules. An Intent filter passes the action test if the Intent action is one of the filter actions. However, if the action list of filters is not empty and the Intent has no action included, the filter passes the test. To pass the category test, Intent categories have to be included in filter categories. The data element (of Intents and Intent filters) can have a MIME data type and a URI structure. The URI has four separate parts: scheme, host, port, and path. Although these parts are optional, their existence respects linear dependencies.

Table 2 Formalization of Intent Resolution Rules

The action, category, and data tests of intent resolution are formalized in Table 2. Rule 1 (R.H.S.) states that a component c resolves an intent i either explicitly or implicitly. Rule 1 (L.H.S.) states that the explicit resolution occurs when i is explicit and CmpId of c coincides with that specified in i. Rules 2 states that the implicit resolution occurs when c includes an intent filter that matches i. That the matching between an intent and an intent filter includes three tests (data, category, and action) is formalized in Rule 3 (L.H.S.). The category and action tests are formalized in Rules  3 (R.H.S.) and 2, respectively. The data test is formalized in Rules 4 (R.H.S.), 5, 6, 7, 8, and 9 . The remaining rules of the figure formalize matching conditions between URL components of data elements contained in an Intent and an Intent filter.

Examples of applying rules of Table 2 are provided by the implicit Intent and the intent filter given in the Listings 1 and 2 of Sect. 3. We show some application cases. Rule 3 (L.H.S.) can be applied as follows: the condition \(\texttt {category-default}\in \texttt {(Category)}^* f\) applies because the category DEFAULT belongs to the intent filter. Also the condition \( \texttt {(Category)}^*\ i \subseteq \texttt {(Category)}^*\ f \) applies because the intent hosts no categories that do not belong to the intent-filter tag. Rule 2 is applicable as the condition

$$\begin{aligned} \texttt {Action i=None}\vee (\exists a:\texttt {Action}.\ (\texttt {Action i}=a)\wedge a\in \texttt {(Action)}^* f) \end{aligned}$$

is satisfied through the second part of the “OR” connector. This is so because both the intent and the intent filter have the action VIEW which is denoted in the rule by a.

In Fig. 6, the precondition of startActivity(i) requires the existence of a component c that resolves the Intent i. The component has to be an activity (IsActivity(c)) and installed in the state s (CompDwnloaded(c,s)). The postcondition of startActivity(i) is the union of the seven sub-conditions stated in the figure. The descriptions of the subfunctions are as follows:

  1. 1.

    The component instances that were running in s are still running in \(s^\prime \).

  2. 2.

    All pairs of an Intent and an Intent filter linked under \(\texttt {IntentResol}\ s\), are also linked under \(\texttt {IntentResol}\ s^\prime \).

  3. 3.

    This sub-condition determines the state parts that are not affected by executing the API. Hence these parts are the same under s and \(s^\prime \).

  4. 4.

    If i is explicit, then no intent resolution is necessary to execute the API. Hence \(\texttt {IntentResol}\ s=\texttt {IntentResol}\ s^\prime \).

  5. 5.

    There is an instance ic of the activity c that is running in \(s^\prime \). The instance was created as a result of executing the API. Therefore, ic does not belong to the state s.

  6. 6.

    Suppose that an \(ic^\prime \) is running in \(s^\prime \). Then \(ic^\prime \) was running in s or \(ic=ic^\prime \).

  7. 7.

    If i is implicit, then the process of intent resolution was carried on. Therefore an Intent filter f was found in c such that i matches f. This is recorded in \(\texttt {IntentResol}\ s^\prime \).

The semantics of the APIs startService(i) and sendBroadcast(i) are similar to that of startActivity(i). However, for startService(i) and sendBroadcast(i), one of the changes is to replace \(\texttt {IsActivity}(c)\) with the convenient function: \(\texttt {IsService c}\) or \(\texttt {IsBroadCast c}\). The descriptions of semantics for setComponent and setAction given in Fig. 6 are similar to that of startActivity. The semantics of APIs setData(i, dat), addCategory(i, cat) and putExtra(i, ef, ev) are similar to that of the API setAction (i, act).

In the following, we prove that the predicate WellDefined is preserved by one-step execution of any of the APIs of Fig. 5. This means that executing an API starting from a WellDefined state results in a WellDefined state. The proofs of all following lemmas and theorems were verified in Coq and their verification is available online.Footnote 4 Hence, we only show the proof of Lemma 1.

Lemma 1

  1. 1.

    \((\texttt {WaitIntsImp}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {WaitIntsImp}(s^\prime ).\)

  2. 2.

    \((\texttt {UsrIntFsDefd}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime ) \Rightarrow \texttt {UsrIntFsDefd}(s^\prime ).\)

  3. 3.

    \((\texttt {NoRepIntFs}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {NoRepIntFs}(s^\prime ).\)

Proof

  1. 1.

    The proof is by induction on the set API. We show only the case where \(\texttt {API}=\texttt {setComponent}(i,\texttt {cid})\). By assumption, for the state s we have:

    $$\begin{aligned} \forall (a : \texttt {App})(i : \texttt {Intent}), (\texttt {WaitImpInt}\ s)\ a\ i\ \Rightarrow \texttt {ImpInt}(i). \end{aligned}$$
    (14)

    We need to show the same for the state \(s^\prime \). Therefore we assume

    $$\begin{aligned} a : \texttt {App},\ i^\prime : \texttt {Intent},\texttt {and (WaitImpInt}\ s^\prime ) \ a\ i^\prime . \end{aligned}$$
    (15)

    Now it is enough to show that \(\texttt {ImpInt}(i^\prime )\). For \(i,i^\prime ,\) we have \(i=i^\prime \) or \(i\not =i^\prime \). We prove the required in both cases:

    • Case \(i\not =i^\prime \): by the postcondition of \(\texttt {API}\) (sub-condition 3), we conclude

      $$\begin{aligned} (\texttt {WaitImpInt}\ s^\prime )\ a\ i^\prime = (\texttt {WaitImpInt}\ s)\ a\ i^\prime . \end{aligned}$$

      By 15, we have \((\texttt {WaitImpInt}\ s)\ a\ i^\prime \). Now by 14, we conclude \(\texttt {ImpInt}(i)^\prime \), as required.

    • Case \(i=i^\prime \): we have two subcases:

      • Case \(\texttt {cid}\not = \texttt {None}\): by the postcondition of \(\texttt {API}\) (sub-condition 2), we conclude

        $$\begin{aligned} \lnot (\texttt {WaitImpInt}\ s^\prime )\ a\ i. \end{aligned}$$

        This is a contradiction to our assumption in 15. Hence the required is proved.

      • Case \(\texttt {cid}= \texttt {None}\): by the postcondition of \(\texttt {API}\) (sub-condition 1), we conclude

        $$\begin{aligned} \texttt {CmpId}\ i=\texttt {None}. \end{aligned}$$

        By definition of implicit intent, this implies \(\texttt {ImpInt}(i),\) which implies \(\texttt {ImpInt}(i^\prime )\).

  2. 2.

    The proof is by induction on the set API. We show only the case where \(\texttt {API}=\texttt {startService}(i)\). By assumption, for the state s we have:

    $$\begin{aligned}&\forall (a : \texttt {App})(f : \texttt {IntentFilter}), (\texttt {IntFilter}\ s)\ a\ f\ \Rightarrow \nonumber \\&(\texttt {Apps}\ s)\ a\ \wedge (\exists c : \texttt {Comp},\ (\texttt {Compf}\ (\texttt {Manifest}\ a))\ c\ f). \end{aligned}$$
    (16)

    We need to prove the same for the state \(s^\prime \). Hence we assume

    $$\begin{aligned} a : \texttt {App},\ f : \texttt {IntentFilter},\ (\texttt {IntFilter}\ s^\prime )\ a\ f. \end{aligned}$$
    (17)

    Now it is enough to show that

    $$\begin{aligned} (\texttt {Apps}\ s^\prime )\ a \wedge (\exists c : \texttt {Comp},\ (\texttt {Compf (Manifest a)) c f)}. \end{aligned}$$

    By 17 and the postcondition of \(\texttt {API}\) (sub-condition 3), we conclude \((\texttt {IntFilters}\ s)\ a\ f\). Hence by 16 we conclude:

    $$\begin{aligned} (\texttt {Apps}\ s)\ a\ \wedge (\exists c : \texttt {Comp},\ (\texttt {Compf}\ (\texttt {Manifest}\ a))\ c\ f). \end{aligned}$$

    By applying the postcondition of \(\texttt {API}\) (sub-condition 3) again, we get the required.

  3. 3.

    The proof is by induction on the set API. We show only the case where \(\texttt {API}=\texttt {putExtra(i,ef,ev)}\). By assumption, for the state s we have:

    $$\begin{aligned}&\forall (a_1, a_2 : \texttt {App})\ (c_1, c_2 : \texttt {Comp})\ (f : \texttt {IntentFilter}),\ (\texttt {IntFilter}\ s)\ a_1\ f\ \Rightarrow \nonumber \\&(\texttt {IntFilter}\ s)\ a_2\ f \Rightarrow (\texttt {Apps}\ s)\ a_1 \Rightarrow (\texttt {Apps}\ s)\ a_2 \Rightarrow (\texttt {Compf (Manifest } a_1))\ c_1\ f\ \nonumber \\&\Rightarrow (\texttt {Compf (Manifest }a_2))\ c_2 f \Rightarrow c_1 = c_2 \Rightarrow a_1 = a_2. \end{aligned}$$
    (18)

    We need to prove the same for the state \(s^\prime \). Hence we assume

    $$\begin{aligned}&(a_1, a_2 : \texttt {App}),\ (c_1, c_2 : \texttt {Comp}),\ (f : \texttt {IntentFilter}),\ (\texttt {IntFilter}\ s^\prime )\ a_1\ f,\nonumber \\&(\texttt {IntFilter}\ s^\prime )\ a_2\ f,(\texttt {Apps}\ s^\prime )\ a_1,\ (\texttt {Apps}\ s^\prime )\ a_2, \nonumber \\&(\texttt {Compf (Manifest } a_1))\ c_1\ f,\ (\texttt {Compf (Manifest }a_2))\ c_2 f,\ c_1 = c_2. \end{aligned}$$
    (19)

    Now it is enough to show that \(a_1 = a_2\). By the postcondition of \(\texttt {API}\) (sub-condition 1), in 19, we can replace \((\texttt {IntFilter}\ s^\prime )\ a_1\ f,\ (\texttt {IntFilter}\ s^\prime )\ a_2\ f,\ (\texttt {Apps}\ s^\prime )\ a_1,\) and \((\texttt {Apps}\ s^\prime )\ a_2\), with \((\texttt {IntFilter}\ s)\ a_1\ f,\ (\texttt {IntFilter}\ s)\ a_2\ f,\ (\texttt {Apps}\ s)\ a_1,\) and \((\texttt {Apps}\ s)\ a_2\), respectively. Then we get the precondition of implication of 18. Hence, we conclude \(a_1=a_2\).

\(\square \)

Lemma 2

  1. 1.

    \((\texttt {ExstAppforWatInt}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {ExstAppfor}{} \texttt {WatInt}(s^\prime ).\)

  2. 2.

    \((\texttt {NoRepImpInts}(s)\wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {NoRepImpInts}(s^\prime ).\)

  3. 3.

    \((\texttt {RunInsInSysComp}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {RunInsInSysComp}(s^\prime ).\)

  4. 4.

    \((\texttt {ResExstIntIntF}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {ResExstIntIntF}(s^\prime ).\)

The following theorem results from Lemmas 1 and 2 .

Theorem 1

\((\texttt {WellDefined}(s) \wedge \texttt {API}:s\rightsquigarrow s^\prime )\Rightarrow \texttt {WellDefined}(s^\prime ).\)

7 Semantics utilization

This section employs the semantics presented in the previous section. The semantics is a base towards formalizing basic and advanced aspects of the Android framework ICC. Here, we demonstrate the utility of proposed semantics towards developing a robust Android ICC framework.

Modifying some fields (action, data, category, and extra) in an Intent does not change the Intent type. However, updating the attribute CmpId possibly change the Intent type. These facts and similar ones are formalized in the Lemmas 3 and 4 .

Lemma 3

  1. 1.

    \((\texttt {ExpInt}(i)\wedge \texttt {setAction}(i, \texttt {act}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ExpInt}(i).\)

  2. 2.

    \((\texttt {ImpInt}(i)\wedge \texttt {setAction}(i, \texttt {act}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ImpInt}(i).\)

  3. 3.

    \((\texttt {ExpInt}(i)\wedge \texttt {setData}(i, \texttt {dat}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ExpInt}(i).\)

  4. 4.

    \((\texttt {ImpInt}(i)\wedge \texttt {setData}(i, \texttt {dat}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ImpInt}(i).\)

  5. 5.

    \((\texttt {ExpInt}(i)\wedge \texttt {addCategory}(i, \texttt {cat}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ExpInt}(i).\)

  6. 6.

    \((\texttt {ImpInt}(i)\wedge \texttt {addCategory}(i, \texttt {cat}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ImpInt}(i).\)

  7. 7.

    \((\texttt {ExpInt}(i)\wedge \texttt {putExtra}(i, \texttt {ef}, \texttt {ev}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ExpInt}(i).\)

  8. 8.

    \((\texttt {ImpInt}(i)\wedge \texttt {putExtra}(i, \texttt {ef}, \texttt {ev}):s\rightsquigarrow s^\prime )\Rightarrow \texttt {ImpInt}(i).\)

Lemma 4

  1. 1.

    \( \texttt {ImpInt}(i)\wedge {\texttt {cid} =\texttt {None}}\wedge \texttt {setComponent}(i, \texttt {cid}):s\rightsquigarrow s^\prime \Rightarrow \texttt {ImpInt}(i).\)

  2. 2.

    \( \texttt {ImpInt}(i)\wedge \texttt {cid} \not =\texttt {None}\wedge \texttt {setComponent}(i, \texttt {cid}):s\rightsquigarrow s^\prime \Rightarrow {\texttt {ExpInt}(i).}\)

  3. 3.

    \( \texttt {ExpInt}(i)\wedge \texttt {cid} \not =\texttt {None}\wedge \texttt {setComponent}(i, \texttt {cid}) :s\rightsquigarrow s^\prime \Rightarrow \texttt {ExpInt}(i).\)

Our proposed semantics does not cache results of Intent resolution process. Hence, if two APIs use the same explicit Intent, the system will resolve the Intent twice. Therefore, the system does not benefit from the first resolution to the second one. This results in resource wastage which can be handled. Further, such instances have serious security implications. For example, if the first Intent resolution is responsible for sensitive data leakage, it will be repeated in the second Intent resolution.

Before running the Intent resolution process, it is convenient to check whether a matching component has been found recently. If this is the case, the system should check whether the component is responsible for a security breach. If there is no history of security breaches, then the system can declare the component to be a match. Otherwise, the system runs the Intent resolution process. The proposed updates can be implemented in our proposed semantics employing the following changes. The first change is to add the component \(\texttt {BadIntResol}: \texttt {Intent} \times \texttt {IntentFilter}\) to the definition of semantic state (SysState). This augmentation, allows a state to store the history of Intent resolution that led to security breaches. The other change is to replace Rule 3 (R.H.S.) with the following three rules:

figure d

With these changes, we can prove that executing Intent APIs such as startActivity(i) is secure as formalized in Theorem 2. The theorem guarantees that executing startActivity(i) for an implicit intent i does not lead to invoking activities having history of security breaches.

Theorem 2

Suppose \( \texttt {API}\in \{ \texttt {startActivity}, \texttt {startService}, \texttt {sendBroad}{} \texttt {cast}\}\). Suppose also that \(\texttt {ImpInt}(i) \wedge \texttt {API}(i):s\rightsquigarrow s^\prime .\) Then there exists \(c\in \texttt {Comp},\ ic\in \texttt {IComp}, \) and\( f\in \texttt {IntentFilter}\) such that:

  1. 1.

    \(\texttt {IsActivity}(c)\vee \texttt {IsService}(c)\vee \texttt {IsBroadCast}(c)\),

  2. 2.

    \(\texttt {CompDwnloaded}(c,s)\wedge \texttt {InsCompNotInState}(ic,s)\wedge (\texttt {Active}\ s^\prime )\ ic\ c \wedge \texttt {InFComp}(c,f)\),

  3. 3.

    \((\texttt {IntentResol}\ s)\ i\ f\wedge (\texttt {IntentResol}\ s)\ i\ f \Rightarrow \lnot (\texttt {BadIntentResol}\ s)\ i\ f, \texttt {and}\)

  4. 4.

    \(\lnot (\texttt {IntentResol}\ s)\ i\ f \Rightarrow \texttt {ActionTest}(i,f),\ \texttt {CategoryTest}(i,f),\) and \(\texttt {DataTest}(i,f). \)

According to the theorem above the API execution results in an instance ic of a component c that has an Intent filter f such that the following conditions hold. The instance ic does not belong to s, but belongs to \(s^\prime \). If f has recently been identified as a match to i, then c has no history of security breaches. This is formalized using the condition \(\lnot (\texttt {BadIntentResol}\ s)\ i\ f\). If there is no history of matching between i and f (\(\lnot (\texttt {IntentResol}\ s)\ i\ f\)), then f has passed the action, category, and data tests.

Executing startActivity creates new instance. In a state s and for an explicit intent i, the execution of the startActivity(i) API results in an instance ic of a component c such that the following holds. c is an activity and ic does not belong to s, but belongs to and is running in \(s^\prime \). The id of c coincides with that specified in attribute CmpId of i. This is formalized in the following lemma.

Lemma 5

Suppose \(\texttt {ExpInt}(i)\) and \( \texttt {startActivity}(i):s\rightsquigarrow s^\prime .\) Then there exists \(c\in \texttt {Comp}\) and \(ic\in \texttt {IComp}\) such that \(\texttt {InsCompNotInState}(ic,s)\wedge (\texttt {Active}\ s^\prime )\ ic\ c\wedge \texttt {CompDwnloaded}(c,s)\wedge \texttt {IsActivity}(c)\wedge \texttt {CmpId}\ i= \texttt {RetCmpId(c)}.\)

It is not possible to execute \(\texttt {sendBroadcast}(i)\) in a state s if i is an implicit intent that can not be resolved in s. Suppose that in state s, i is an implicit intent that was not resolved recently. Additionally, suppose that a is the action value of i. Further, each intent filter in the system does not include a in its action list, it is not possible to execute the APIs \(\{ \texttt {startActivity}(i),\texttt {startService}(i),{} \texttt {sendBroadcast}(i)\}\) in the state s. This is formalized in the following lemma, proof is available in the verification code.

Lemma 6

Suppose that \(\texttt {ImpInt}(i)\wedge \texttt {Action}\ i = a\wedge \forall f\in \texttt {IntentFilter}, \lnot (\texttt {IntentResol}\ s)\ i\ f,\) and \( \forall (c:\texttt {Comp}) (f:\texttt {IntentFilter}),\ \texttt {CompDwn}{} \texttt {loaded}(c,s) \wedge \texttt {InFComp}(c,f)\Rightarrow \texttt {act}\notin (\texttt {Action}^*\ f).\) Then \(\lnot \texttt {pre\_start}{} \texttt {Activity}(i,s)\wedge \lnot \texttt {pre\_startService}(i,s) \wedge \lnot \texttt {pre\_sendBroadcast}(i,s).\)

It is not possible to execute \(\texttt {startService}(i)\) in a state s if, i is an explicit intent pointing to a component that is not installed in s. We assume that in state s i is an explicit intent. We expect that CmpId field of i does not coincide with that of any component in s. In such situations, it is not possible to execute the APIs \(\{ \texttt {startActivity}(i),\texttt {startService}(i),\texttt {sendBroadcast}(i)\}\) in the state s. This is formalized in the following lemma.

Lemma 7

Suppose \(\texttt {ExpInt}(i)\wedge \forall c\in \texttt {Comp}, \texttt {CompDwnloaded}(c,s)\Rightarrow \texttt {CmpId}\ i\not = \texttt {RetCmpId(c)}.\) Then \(\lnot \texttt {pre\_startActivity}(i,s)\wedge \lnot \texttt {pre\_start}{} \texttt {Service}(i,s) \wedge \lnot \texttt {pre\_sendBroadcast}(i,s).\)

Turning an implicit intent i into an explicit one and then executing \(\texttt {startService}(i)\) results in a new service instance in the final state. Suppose that in a state s, i is an implicit intent. Suppose that cid is the Id of the service c. Suppose that i is turned into explicit intent via assigning cid to the filed CmpId of i. Then executing \(\texttt {startService}(i)\) results in a new service instance of c in the final state. This is formalized in the following Theorem.

Theorem 3

Suppose that \(\texttt {ImpInt}(i)\wedge \texttt {setComponent}(i,\texttt {cid}):s\rightsquigarrow s^\prime \wedge \texttt {startService}(i):s^\prime \rightsquigarrow s^{\prime \prime }.\) Then there exists \(c\in \texttt {Comp}\) and \(ic\in \texttt {IComp}\) such that: \(\texttt {InsCompNotInState}(ic,s)\wedge (\texttt {Active}\ s^{\prime \prime })\ ic\ c\wedge \texttt {IsService}(c)\wedge \texttt {RetCmpId(c)}=\texttt {cid}.\)

Executing sendBroadcast (i) with an implicit intent i results in a new Broadcast instance matching i . Suppose that in a state s, i is an implicit intent. Suppose that act is the value of the action property is i. We assume that i was not resolved recently. Then, executing \(\texttt {sendBroadcast}(i)\) results in a new a broadcast instance. The broadcast of the instance is installed in the initial state which includes a filter that matches i. This is formalized in the following theorem.

Theorem 4

Suppose \(\texttt {ImpInt}(i)\wedge \texttt {Action}\ i = \texttt {act}\wedge (\forall f\in \texttt {IntentFilter}, \lnot (\texttt {IntentResol}\ s)\ i\ f)\wedge \texttt {sendBroadcast}(i):s\rightsquigarrow s^\prime \). Then there exists \(c\in \texttt {Comp},\ f\in \texttt {IntentFilter},\) and \(ic\in \texttt {IComp}\) such that: \(\texttt {InsCompNotInState}(ic,s)\wedge (\texttt {Active}\ s^\prime )\ ic\ c\wedge \texttt {CompDwnloaded}(c,s)\wedge \texttt {IsBroadCast}(c)\wedge \texttt {InFComp}(c,f)\wedge \texttt {act}\in (\texttt {Action}^*\ f).\)

8 Data-leakage aware Intent resolution (DLAIR)

This section presents a new security-aware algorithm (DLAIR) for Intent resolution. The algorithm relies on the formal model presented in the previous sections of the paper in addition to \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\), our lightweight tool that identifies suspicious paths related to Android app Intents. In this context, \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) can be thought of as a security analysis tool for reporting execution paths with Intents that are suspicious of leaking sensitive user data in Android apps. DLAIR is outlined in Algorithm 1. DLAIR and the design, implementation, and evaluation of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) (as the main pillar of DLAIR) are presented in the following subsections.

figure e

8.1 Design of DLAIR

DLAIR, as listed in Algorithm 1, takes as input an Intent i of the app p. For i, the algorithm builds two sets of matching Intent Filters (Good_Filters and Suspicious_Filters). Theses sets are initialized in steps 1 and 2 of the algorithm. The set Good_Filters has matching filters of i that are secure and the set Suspicious_Filters has matching filters of i that are suspicious of leaking sensitive data. In Step 3, the algorithm checks if i is implicit. Step 4 calls \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) that returns a triple \([\gamma , \omega , \beta ]\), where:

  1. 1.

    \(\gamma \) is a Boolean that is 1 if the execution path hosting i is suspicious of leaking sensitive data,

  2. 2.

    \(\omega \) specifies the component type targeted by i, and

  3. 3.

    \(\beta \) is the permission required to access the component targeted by i, if the component is protected by a permission. Otherwise, \(\beta \) is none.

Recall that Android allows Intents targeting broadcasts to be protected by permissions. This is not the case for activities and services. Therefore, if \(\omega \) is broadcast (Step 6), and if \(\beta \) is a signature permission (Step 7), then Step 8 uses our proposed algorithm of Intent resolution (\(\texttt {IntMatchIntF}^{mod}\)) to build the set Good_Filters. If \(\beta \) is not signature permission or the component type is not a broadcast, then the algorithm secures the matching process by accepting only filters protected by signature permissions. This is done in Steps 10 and 12. If \(\gamma =0\), then there is no need to care for permissions. This is implemented in Step 14. If i is not implicit, then in Step 16, the traditional algorithm of Intent resolution (outlined in Fig. 2) is used to find the set of Good_Filters. In all cases, Step 17 defines Suspicious_Filters as the set of all filters produced by the traditional algorithm of Intent resolution (outlined in Fig. 2) minus Good_Filters.

It is necessary to note that DLAIR is not a taint analysis. While taint analysis is static and aims at fining data leaks, DLAIR is dynamic analysis. Taint analysis usually is applied once (on one app) at installation time. DLAIR can be realized as an enrichment of the process of Intent-resolutions. This enrichment aims at avoiding repetition of insecure Intent-resolutions. Hence DLAIR works on multiply applications simultaneously. While Taint analysis is typically applied as an extra layer above the Android OS, DLAIR can conveniently be integrated into the Android OS. Flowdroid [3] is built on the Soot tool [27] which replies on optimizing the Java byte code of apk files. This optimization increases significantly the rune time of the tool. Though disabling the optimization part of Soot is possible, the default configuration of Soot is still not fast enough as realized in many tools, such as Dare [22] and ded [14]. This fact has led researchers to utilize other tools (such as Androguard) as reviewed in [24]. Accuracy and soundness arguments [19] of Soot are among the challenges not solved yet [24]. This partially justifies that DLAIR gave better results than Flowdroid as explained later in detail.

8.2 Design of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\)

\(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is useful as a lightweight technique for testing the security of the execution path of a given Intent. In this context, an execution path is intent-suspicious if it contains the following three Dalvik instructions in the following specific order:

  1. o

    The first instruction defines an intent object,

  2. o

    The second instruction invokes a source API that reads sensitive data.

  3. o

    The third instruction invokes a sink API that takes the intent of the first instruction as an argument.

Fig. 7
figure 7

Overview of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\)

Figure 7 presents an overview of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) which is composed of the following modules:

  • Reverse engineering module: aims at decompiling the apk file to get the manifest and dex files.

  • Manifest analysis module: collects lists of activities, broadcast receivers, and services. The analysis decide which elements of these lists are protected by permissions.

  • Identifying Component Classes module: identifies the dex code of classes of different components used in the app. The following stages use these classes.

  • Identifying Intent-Suspicious paths module: investigates paths of methods of component classes. Then paths are analyzed to determine which paths are suspicious according to our definition above. For a suspicious path, the analysis also specifies whether the third instruction, of intent-suspicious definition, is protected with permission. This can be the case for sink methods such as sendBroadcast. The analysis also specifies the type of component object (if any) started by the path.

  • Identifying Intent-Suspicious Pairs of paths module: discovers more involved patterns on leaking data. This pattern was determined from analyzing results of state-of-the-art techniques reporting data leakage such as Flowdroid. In this case, the pattern is composed of two suspicious paths; the first path contains the first and second instructions of the definition of intent-suspicious paths. This path invokes another method that includes the second path which contains the third instruction of definition of intent-suspicious paths. Details specified for the suspicious path of the previous module are also collected for the two paths of this module.

8.3 Implementation of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\)

The details of implementing modules of Fig. 7 are as follows:

  • The reverse engineering is done using Androguard and produces three objects addx.

  • The manifest analysis is implemented using several Androugard APIs including a.find_tags(’receiver’) to get a list of receiver tags. Similarity tags of other components can be obtained. The API tag.get(a._ns(’permission’)) is used to check if the component is protected using a permission.

  • The implementation of Identifying Component Classes relies on the following APIs to get dex classes of different components: a.get_services() and dx.get_class_analysis(c.get_name()).

  • For the implementation of Identifying Suspicious Intent paths and Identifying Intent-Suspicious Pairs of paths phases, the paths of methods of components are built using the APIs basic_blocks.gets() and CurBblock.childs. The analysis of the collected paths relies on identifying path instructions by the API block.get_instructions(). The analysis of instructions details utilizes the APIs get_name() and get_operands().

Our list of source and sink APIs were derived from ones used by state-of-the-art techniques such as Flowdroid. The list includes all common sources and sinks related to intents and bundles only. This list is available online with our result files. Among sink APIs in our list are the sends broadcast ones and the send service API.Footnote 5

8.4 Performance evaluation

This section presents the results of our experiments conducted for evaluating \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\). We got the dataset from a reputable benchmark, namely Andro-Zoo [2]. We downloaded a random set of 2000 applications whose sha256 is available online with our result files. We limited the download to applications that Androzoo obtained from the Google Play Store. All experiments were done on a Dell (Vostro) device with processor: Intel(R) Core(TM) i7-3612 QM CPU @ 2.10 GHz, 8.00GB RAM, and Windows 10 (64-bits) operating system. All implementations were written in Python. Table 3 shows the statistics collected for comparing \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against Flowdroid [3] as one of the state-of-the-art techniques that are most common and related to \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\). We make all our results and data files available online.Footnote 6 These files include 2000 text files produced by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and 1999 text produced by Flowdroid for the applications of the dataset. The Coq files that include the proofs of our semantics are also available online.

Table 3 Statistics comparing \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against Flowdroid

Our experiments answer the following research questions:

  1. RQ1.

    Applicability, Effectiveness, and Performance: How does \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) compare against alternative approaches applicability, discovering suspicious paths related to Intents, and in performance?

  2. RQ2.

    Accuracy: How accurate are the results of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against alternative approaches?

  3. RQ3.

    Common Component Leaking Data: Is there a specific component that is targeted most by Intents used in paths leaking data?

  4. RQ4.

    Permissions: How common is the use of permissions in protecting components in manifest and in protecting intents in the native code?

Applicability, Effectiveness, and Performance: The applicability of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is proved by its results obtained for 2000 real-life Android applications. Moreover, we designed two Android applications that include different patterns of paths leaking sensitive data. These paths include Intents that target different types of components. To cover all possible cases in our applications, some of Intents and components were protected by permissions and others were not. \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) succeeded in reporting all the leaking paths in the example applications. On the other hand, Flowdroid failed to report leaks in these two applications which we make available online along with their \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid results.

Figure 8 (Right-hand-side) provides comparisons for leaks reported by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid. The total number of leaks reported by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid are 49015 and 4664, respectively. It is worth mentioning that Flowdroid reported more leaks than this number, but we considered only leaks related to our problem. For example, Flowdroid typically reports leaks in external components that are not among the native code of the app under analysis. Examples of external components are com.facebook.CampaignTrackingReceiver and com.google.android.gms.ads.AdActivity. Such leaks were not considered in our evaluation. The reported number of leaks proves that \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) is more effective than Flowdroid.

Fig. 8
figure 8

Comparing runtime and reported vulnerabilities in \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid

Figure 8 (Left-hand-side) provides comparisons for running times of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid. The average running times per application for \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) and Flowdroid are 47.8 and 74.6 seconds, respectively. This proves that the performance and efficiency of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) are better than that of Flowdroid.

Accuracy: \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) reported suspicious leaks in 824 apps, while the Flowdroid reported leaks in 425 apps. We manually checked the results reported for a random set of 100 apps for measuring the accuracy of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against Flowdroid. Table 4 illustrates the results of our manual investigation.

Compared to \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\), Flowdroid has higher false-negative and lower false positive. This is in line with the approximation adopted by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\). However, the efficiency and the high number of true positives reported by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) confirm its accuracy and usefulness.

Table 4 Accuracy of \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) against Flowdroid
Fig. 9
figure 9

Comparing statistics collected by \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) for different types of Android components and for leaking paths protection by permissions

Common Component for Leaking Data and Permissions Figure 9 answers research questions 3 and 4. The lower set of bars show that activity components collected from manifest files of the dataset are much more than service and broadcast-receiver components. However, a high percentage (\(21\%\)) of these broadcast components are protected by permissions against \(0.1\%\) only for activity components. This is clear from the middle set of bars of the figure. The abbreviation MC in the y axis stands for manifest components. For some leaking paths, \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) was able to determine the type of component targeted by the Intent of the path. The upper set of bars of Fig. 9 shows that \(72\%\) of these paths targeted activities. Therefore it is common to use activities rather than other types of components to leak data. Only \(0.8\%\) of the paths leaking data were protected by permissions. This supports the logic of our suggested algorithm for Intent resolution DLAIR. This is so because DLAIR tends to trust Intents protected by permissions.

Use Cases: For some apps of our data set Flowdroid did not report any leaks but our technique \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) did. Some of these apps have the following details:

  1. 1.

    The first app has an execution path in the activity named “TiActivity”. The path leaks data in a sequence of instructions that start another activity. The source API, in this case, is getIntent and the sink API is putExtra.

  2. 2.

    The second app has a leaking execution path that uses an Intent to start the activity “FuturePaymentConsentActivity”. The source API in this case is getIntent and the sink API is putExtra.

  3. 3.

    The third app has a leaking path in its activity named “BaseFragmentActivity”. The path uses an Intent to start a service.

Also for our designed app (available online with results) \(\mathcal {L}{} \texttt {ek}\mathcal {I}{} \texttt {nt}\) discovered all the inserted patterns of leaks but Flowdroid did not.