1 Introduction

As of March 2023, the total number of smartphone users globally is approximately 6.92 billion, and Android commands a market share of 71.95% [1,2,3]. As Android surges in popularity, the number of apps available for its users keeps growing. There are currently around 2.7 million apps available in Google Play, the official Android market [4]. With so many users and available apps, it is unsurprising that unscrupulous developers have created many malicious apps to spy on users or steal their personal information or money. In addition to obviously malicious apps, there are a large number of apps that can be considered grayware, including invasive adware that can leak the user's information to advertisers or other third parties [5].

The access that an app has to a user's system and private information is governed by permissions. These permissions include items that can be dangerous to a user's security, such as access to the user's accounts or the ability to send SMS messages. Ashawa and Morris [6] show the relationship between threat and protection level in Android permissions using exploratory factor plane analysis; their results show that every permission, whether normal or dangerous, carries a threat level. Alshehri [7] concludes that READ_PHONE_STATE, which allows access to device identifiers, is the permission most commonly misused by benign apps and the dangerous permission most requested by malicious apps. The danger is magnified when multiple sensitive permissions are granted to the same app. On any device running an OS below Android 6.0, a user who decides to install an app is confronted with a list of permissions and must either grant the app every permission it requests or not install the app at all. Such an all-or-nothing approach can force users to accept permissions if they perceive an app to be desirable. On any device running Android 6.0 (Marshmallow) or above, the user's consent is requested when an app wants to access a sensitive system resource or piece of information; if the user denies the access, the app might not retain full functionality.

This paper examines Android application (app) permissions and their impact on user privacy. It analyzes why smartphone users should understand the importance of each app permission they grant to a certain app and the possible vulnerabilities that can be exposed by each app permission. The decision problem regarding granting Android app permissions is important primarily because the privacy of each user depends on the user’s ability to make effective permission decisions, and this decision-making process has not been formalized thoroughly. We use various classifier models to classify apps into risk categories and then determine which model is most effective under varying conditions.

We focus on apps that are associated with advertisement (ad) networks or malware [8]. The threats posed in these cases are as follows:

  • The ad networks or malicious apps request permissions that are not necessary for the proper functioning of the app, thereby overexposing the user to risks.

  • When the user sends a request to a legitimate app or stores information in the app’s servers, a copy of that information can be stored in the malware/ad servers.

  • Malware can steal information using permissions or phishing and then misuse the information, thus violating user data privacy.

  • Ad networks can also pose a threat to user data privacy by gaining access to user profile information using permissions. Users are usually not fully informed regarding the collection and use of the data that ad networks access.

Our paper makes the following contributions to the field of Android security:

  • Incorporates the relationship between app permissions and advertisement networks as features in our model to determine the impact on user privacy.

  • Presents a detailed analysis of the permissions and other features of 2009 apps: the most downloaded apps in Google Play together with a set of malicious apps.

  • Explores the relationship between app permissions and threats to user data privacy from ad networks and mobile malware.

2 Related work

In prior work, several studies have focused on Android app permissions and potential user privacy vulnerabilities [9]. Zhou et al. [10] detected Android malware in app markets by first filtering out apps that do not contain risky permission combinations and then examining the behavioral footprints of the remaining apps in comparison to known malware. They also used heuristics-based filtering to discover zero-day malware. However, these methods could not be used for scanning individual apps on a smartphone; instead, the system performed offline analysis to examine large datasets of apps and isolate those containing malware.

Sarma et al. proposed risk signals based on the permissions an app requests [11]. Only permissions deemed "critical," a classification that contained 26 permissions in some of their experiments and 24 in others, were considered. Risk signals were based on how rare a critical permission or combination of critical permissions was, either in the set of all apps or in the category of the app being examined. This study provided some promising results but was limited by the fact that it considered only one factor: the rarity of a narrow group of permissions.

In other work, researchers employed probabilistic generative models to compute risk scores for Android apps based on the permissions they requested [12,13,14]. The more sophisticated models considered in these studies were able to distinguish between critical and less-critical permissions and adjust the risk scores accordingly. These models performed very well in the laboratory: all of them achieved AUC values of over 0.94 using tenfold cross validation. However, the effectiveness of these models in realistic scenarios has not been proven. Results show that in the field of Android malware detection, the performance gap between laboratory-based tenfold cross validation and "in the wild" detection can be staggering [15].

Alshehri et al. [16] introduced permission usage and risk estimation for Android (PUREDroid) to measure the security risk of Android app permissions and the magnitude of harm resulting from granting unnecessary permission requests. However, this tool does not take into consideration ad networks and malware effects on app permissions.

Recently, researchers have started exploring machine learning and deep learning-based malware detection systems [13, 17]. Rathore et al. [18] performed a comprehensive feature analysis to identify significant Android permissions and proposed an efficient Android malware detection system using machine learning and deep neural networks. A limitation of this research is that its models consider only 16 Android app permissions, a small number relative to the overall number of permissions in the Android platform.

McDonald et al. [19] investigate the effectiveness of four machine learning algorithms, in conjunction with features selected from Android manifest file permissions, in classifying applications as malicious or benign. This work addresses whether malware can be detected by analyzing the permissions that accompany Android binaries; however, it does not specify how app permissions alone can expose users to privacy risks.

Mathur et al. [20] present a detection framework for Android called NATICUSdroid, which investigates and classifies benign and malicious apps using statistically selected native and custom Android permissions as features for various machine learning (ML) classifiers, but it does not address the case in which app permissions change.

Alsoghyer and Almomani [21] present an analysis of Android permissions aimed at identifying significant permissions that can discriminate ransomware with high accuracy before it harms users' devices. The permissions most commonly used in ransomware apps were identified; a permissions dataset was then created, and data mining techniques were applied to build a predictive model for the ransomware detection system. However, relying only on the occurrence of app permissions in ransomware apps is not sufficient to build a strong case, since other factors can affect the classification process.

Mohamad et al. [22] examine the effectiveness of static analysis in detecting Android malware using permission-based features. They propose machine learning models with different sets of classifiers to evaluate Android malware detection, and they also aim to identify insecure permissions existing in Android apps.

In all the previous works, the identification of malicious behavior in Android apps has been done by analyzing one or two features extracted from app permissions. Studies show that if more features are taken into consideration, better classification accuracy can be achieved. In the current research, we explore additional features involving app permissions, such as their relationship with Android malware and ad networks. Google Play requires app developers to notify users if ad networks are associated with their app, but not which ad networks are present or even exactly how many are associated with a particular app. Every ad network has access to the user's device, sometimes with broader access than the app itself requires.

3 Background

3.1 Android permissions

In the Android OS, a permission system governs an app's access to information outside its sandbox. If an app wants access to additional system capabilities or user information, it must request the correct permissions. If the permissions are granted, the app will be allowed access to the requested resource. Permissions are divided into a few protection levels, including normal, dangerous, and signature [6]. Normal permissions are the least dangerous and are automatically granted to any app that requests them. Dangerous permissions are higher-risk permissions and are granted only when they are directly approved by the user. Signature permissions are available only to a certain group of apps that are signed with the same certificate as the app that initially declared the permission. Therefore, important signature permissions declared by the system are available only to other system apps.
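As an illustration (a minimal sketch, not part of our pipeline), the protection level of each permission an app requests can be inspected statically with the Androguard tool that we use later for data collection. The APK file name below is hypothetical, and Androguard import paths vary across versions:

```python
# A minimal sketch: group an app's requested permissions by protection
# level with Androguard. "example.apk" is a hypothetical local file; on
# Androguard 4.x the import path is androguard.core.apk instead.
from collections import defaultdict

from androguard.core.bytecodes.apk import APK

apk = APK("example.apk")

by_level = defaultdict(list)
# get_details_permissions() maps each requested permission to
# [protection level, label, description] as known to Androguard.
for perm, details in apk.get_details_permissions().items():
    by_level[details[0]].append(perm)  # e.g., "normal", "dangerous"

for level, perms in sorted(by_level.items()):
    print(f"{level}: {len(perms)} permission(s)")
    for p in perms:
        print(" ", p)
```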

In Android 6.0 (Marshmallow) or above, the user can choose to grant some of the dangerous permissions requested by an app while denying others. However, in previous versions of Android, permissions followed an all-or-nothing approach; users had to either accept all the permissions requested by an app during installation or choose not to install the app at all. Because these permissions can give apps access to sensitive information and phone functions, these permissions can pose a threat to user privacy.

3.2 Malicious activity

As new methods and exploits are discovered, mobile malware and adware pose ever greater threats to Android users [23,24,25]. Kaspersky Lab reported detecting over 880,000 new malicious programs targeting mobile devices [26, 27], three times the number detected in 2014. These programs include Trojans that capture a user's bank account credentials and ransomware that hijacks a user's device and forces him/her to pay money to unblock it. The official Google Play store screens all incoming apps for malware with a protective program called Bouncer before they are published [28]. However, there have been multiple incidents in which malware-infected apps fooled Bouncer and slipped through into the store.

3.3 Mobile advertising

Many Android apps rely on in-app advertising for revenue. Ads are generally provided by an ad network, and there are many different ad networks available, ranging from Google's hugely popular AdMob to a number of small networks used in only a few apps. App developers incorporate these ad networks into their apps by using the ad libraries provided by the networks. An ad library provides an application programming interface for displaying ads, which greatly simplifies the process of adding ads to an app. Some ad libraries require certain sets of permissions, which can force developers to include permissions that may not be necessary for the functioning of their app. The ad libraries are also automatically granted any permissions the host app uses, a property that can compromise user privacy: ad libraries can use these permissions to collect information about the user and send it to ad servers without the user's knowledge or consent [10, 29].

3.4 User awareness

Research suggests that typical Android users do not fully understand the risk to their privacy resulting from granting dangerous permissions. Two different groups of researchers [30, 31] separately conducted a series of interviews with Android users, and the results of both sets of interviews suggest that users are often unaware of or confused by the implications of Android permissions. Users can also become conditioned to quickly clicking “Accept” on prompts without reading if they see too many of them, leading to the acceptance of dangerous permissions without evaluation of the risks. This phenomenon is known as “warning fatigue” and can render warning popups and screens essentially useless.

3.5 Problem statement

Legitimate mobile apps, ad networks, and threats all require access to mobile resources and data to function properly. Apps require permissions, which should be granted by the user. The task of minimizing the privacy risk posed by Android apps is difficult for the average user because he/she is unlikely to have sufficient knowledge about security to make informed decisions. Therefore, if users were provided with a detailed but understandable analysis of the security risks posed by Android apps downloaded on their devices, it would greatly increase their control over their data privacy.

3.6 Complexity

Many factors contribute to how dangerous an app is to a user’s security and privacy. Often, privacy-invasive ad networks or malicious apps have certain sets of permissions that they require to operate, but the exact permissions required vary depending on the type of information or functionality being misused. No permission, by itself, can serve as the determining factor in app classification. Ad networks must also be taken into consideration because each ad network included in an app poses the risk of possibly violating user privacy. In addition, apps in Google Play are divided into categories such as Communication, Social, or Games. Naturally, benign apps from some of these categories will require more permissions to run than apps from other categories. If the category an app belongs to is not taken into consideration in an analysis, a benign app from a category that requires many permissions for legitimate functions could be erroneously flagged as malicious. To obtain a complete picture of the risks posed by an app, all these features must be taken into consideration in combination because no single feature can serve as an accurate indicator by itself.

4 Extracted features

We use 201 features as input to our machine-learning models to classify apps as benign, safe, or malicious. The features, extracted from the analysis of a large collection of app data, are the following:

  • List of all app permissions required by each app: we identified 118 unique app permissions.

  • Dangerous combinations of permissions: the dangerous combinations were determined by examining the permissions of 20 high-risk apps that requested many permissions. The combinations causing security risks in these apps were used to create the list of combinations that our model considers dangerous.

  • Number of ad networks associated with an app: during our research, we found that ads could compromise user privacy by leaking personal information without the user’s knowledge or consent. An especially glaring example is MobileAppTracking, which is capable of tracking the user’s clicks and actions. We determined the number of ad networks utilized in each app in our dataset using the AppBrain Ad Detector tool. We identified 79 different ad networks and found that the maximum number of ad networks associated with a single app in our database was 13. We posit that apps that contain many ad networks can pose more threats to a user’s security because in such cases, the user information can be leaked to many different advertisers.

  • Number of businesses associated with an ad network and the industry sector they operate in.

  • Total number of privacy threats posed by an app: by mapping a list of the top Android adware and malware to the permissions they require for operation, we were able to determine the number of threats that a user is exposed to through each permission. The total number of threats is calculated by adding the individual threat counts of each permission present in the app and then subtracting the overlapping threats, ensuring that each threat is counted only once (see the sketch following this list).

  • Permissions required by an app compared to the average permissions required by other apps in its category: for each category, we determined the average number of permissions required by the top apps. The value of this feature is the number of permissions requested by the app minus that category average. The more an app's permission count exceeds the average of the top apps in its category, the greater the likelihood that the app is requesting more permissions than necessary. Granting a large number of permissions to a single app can be dangerous to a user's security.

  • Requests for unnecessary permissions: we considered whether an app requests any permissions that we know are unnecessary for the app's functioning. First, we consider in-app purchases. This permission allows an app to sell products/services to users using Google Play's billing service. By definition, this permission is not required for the base functionality of any app. Therefore, we contend that this permission should not be granted by default; instead, it should be granted only if the user explicitly allows it. The other permission we regard as unnecessary is "Run at Startup," which allows an app to start itself as soon as the smartphone is turned on. Many malicious apps use this permission to ensure that they are always running. Even with legitimate apps, there are almost no scenarios in which this permission is required for the basic functionality of the app.
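To make two of these features concrete, the following sketch (with hypothetical permission-to-threat and category-average tables) computes the total-threats feature as the size of a union of threat sets, so overlapping threats are counted once, and the permissions-versus-average feature as a simple difference:

```python
# A minimal sketch (hypothetical data) of two features described above:
# the total number of distinct threats an app is exposed to, and the
# difference between an app's permission count and its category average.

# Hypothetical mapping from a permission to the set of known
# malware/adware threats that exploit it.
THREATS_BY_PERMISSION = {
    "READ_PHONE_STATE": {"TrojanA", "AdwareX"},
    "SEND_SMS": {"TrojanA", "TrojanB"},
    "RECEIVE_BOOT_COMPLETED": {"TrojanB"},
}

# Hypothetical average permission counts of the top apps per category.
AVG_PERMISSIONS_BY_CATEGORY = {"Social": 12.0, "Tools": 9.0}


def total_threats(permissions):
    """Count each threat once, even if several permissions expose it."""
    exposed = set()
    for perm in permissions:
        exposed |= THREATS_BY_PERMISSION.get(perm, set())
    return len(exposed)


def perm_vs_avg(permissions, category):
    """Positive values: the app requests more permissions than the
    average top app in its category."""
    return len(permissions) - AVG_PERMISSIONS_BY_CATEGORY[category]


app_perms = ["READ_PHONE_STATE", "SEND_SMS"]
print(total_threats(app_perms))         # 3, not 2 + 2: TrojanA counted once
print(perm_vs_avg(app_perms, "Tools"))  # -7.0
```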

5 Machine learning models

We use three supervised classifiers: Naïve Bayes, support vector machine (SVM), and classification tree. All the classifiers are tested using WEKA, a data mining software suite [32]. We use tenfold cross validation with all three classifiers. Cross validation is a model validation technique used mainly to estimate how accurately a predictive model will perform in practice. In tenfold cross validation, the dataset is split randomly into 10 partitions; the model to be validated is fit to a dataset consisting of nine of the 10 parts, and the remaining part is used for testing. This process is repeated 10 times, and the average over the 10 validation runs is reported. There are more benign apps available in the market than identified malicious ones, and our dataset likewise has an imbalance between the numbers of benign, safe, and malicious apps.

To overcome the imbalanced dataset problem, we can use several techniques that change the distribution of imbalanced datasets [13, 33,34,35]. Balanced datasets are provided to the learner to improve the detection rate of the safe and malicious classes [36]. We use the synthetic minority oversampling technique (SMOTE), an oversampling approach in which the safe and malicious classes are oversampled by creating "synthetic" records; this method creates extra training data by performing certain operations on our dataset [37]. In addition, we use the Randomize filter to randomly shuffle the order of instances passed through it. When SMOTE is applied, the new synthetic instances are appended at the end of the original dataset, and tenfold cross validation selects a fixed number of instances for every partition. For the model to give reliable predictions, each partition needs a fair representation of the three classes; it is therefore important to apply the Randomize filter to the dataset during the preprocessing phase. We run each model on the original dataset and on the new dataset obtained after the SMOTE and Randomize techniques have been applied, because it is important to evaluate the accuracy of our models when the imbalance problem has been taken into consideration.
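Outside WEKA, this preprocessing can be sketched with scikit-learn and imbalanced-learn; X and y below are placeholders for our feature matrix and class labels. Note that the sketch applies SMOTE inside each cross-validation fold, a leakage-avoiding variant of the workflow described above, and uses shuffled, stratified folds in place of the Randomize filter:

```python
# A minimal sketch (not the exact WEKA workflow) of SMOTE oversampling
# combined with shuffled, stratified tenfold cross validation.
# Assumes scikit-learn and imbalanced-learn; X and y are placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

rng = np.random.RandomState(42)
X = rng.rand(300, 201)                       # placeholder 201-feature matrix
y = rng.choice(["benign", "safe", "malicious"], size=300,
               p=[0.78, 0.12, 0.10])         # imbalanced classes

# Putting SMOTE inside the pipeline means the minority classes are
# oversampled on the nine training folds only, never on the test fold.
model = Pipeline([
    ("smote", SMOTE(random_state=42)),
    ("tree", DecisionTreeClassifier(random_state=42)),
])

# shuffle=True plays the role of WEKA's Randomize filter here.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv)
print(f"mean CA over 10 folds: {scores.mean():.3f}")
```

Because SMOTE is applied per fold, the model is never evaluated on synthetic points, so the reported scores are conservative relative to oversampling the full dataset before splitting.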

5.1 Classification tree

A classification tree classifies new data points by using their various features to move down through a hierarchical decision structure known as a tree [38]. A tree is composed of a series of nodes. At each node, a feature is compared against a set threshold, and the process advances to one of two other nodes depending on which side of the threshold the feature falls on. Eventually, the bottom of the tree is reached and the point is classified. A classification tree is built by systematically creating the nodes that reduce entropy by the greatest amount. Entropy is a measure of how disordered a dataset is; the ideal entropy of 0 indicates that a set is perfectly separated into the correct classes. Classification trees are very prone to overfitting the training data and thus must be "pruned" to ensure generalizability. Pruning limits the size of the tree by various methods. We find that the best results are obtained when we force each split to put at least 16 training samples in each branch and stop splitting nodes with fewer than five instances.
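For reference, with \(p_c\) denoting the proportion of instances of class \(c\) at a node, entropy and the entropy reduction (information gain) used to select splits follow the standard definitions:

$$H\left( S \right) = - \mathop \sum \limits_{c = 1}^{k} p_{c} \log_{2} p_{c} ,\quad {\text{Gain}}\left( {S,f} \right) = H\left( S \right) - \mathop \sum \limits_{v \in {\text{values}}\left( f \right)} \frac{{\left| {S_{v} } \right|}}{{\left| S \right|}}H\left( {S_{v} } \right)$$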

In our tests, this model provides good results. It also requires no scaling of the data, because the method of building the tree eliminates the need for it. However, considering how prone classification trees are to overfitting the training data, further testing is required to ensure that the model performs well "in the wild."
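As a rough sketch of this configuration in scikit-learn (CART rather than WEKA's J48, so the correspondence is approximate), the two pruning constraints map onto `min_samples_leaf` and `min_samples_split`:

```python
# A rough analogue of the pruned classification tree described above.
# scikit-learn implements CART, not WEKA's J48, so this is approximate.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for our 201-feature dataset.
X, y = make_classification(n_samples=200, n_features=10, n_classes=3,
                           n_informative=5, random_state=42)

tree = DecisionTreeClassifier(
    criterion="entropy",   # split on information gain, as described above
    min_samples_leaf=16,   # at least 16 training samples in each branch
    min_samples_split=5,   # do not split nodes with fewer than 5 instances
    random_state=42,
)
tree.fit(X, y)
print("depth:", tree.get_depth(), "leaves:", tree.get_n_leaves())
```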

5.2 Naive Bayes

The Naive Bayes model calculates the probability that a new data point belongs to a given class by evaluating, for each feature, the likelihood of the feature's value given the class [39]. The features are assumed to be independent of each other. These likelihoods are multiplied together, the product is multiplied by the overall (prior) probability of the class, and the result is divided by the overall probability of the observed feature values to obtain the probability of the point belonging to the class. This process is repeated for each class, and the class with the highest probability is chosen. This model provides poor results compared to the other models we tested. This may be attributed, at least in part, to the independence assumption built into the model, because our features are not in fact fully independent of each other.
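In symbols, with features \(x_1, \ldots, x_n\) assumed conditionally independent given class \(c\), this is the standard naive Bayes computation:

$$P\left( {c \mid x_{1} , \ldots ,x_{n} } \right) = \frac{{P\left( c \right)\mathop \prod \limits_{i = 1}^{n} P\left( {x_{i} \mid c} \right)}}{{P\left( {x_{1} , \ldots ,x_{n} } \right)}},\quad \hat{c} = \arg \mathop {\max }\limits_{c} P\left( c \right)\mathop \prod \limits_{i = 1}^{n} P\left( {x_{i} \mid c} \right)$$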

5.3 Support vector machine

An SVM calculates an optimal separating hyperplane between two classes [12]. It uses the training points that are closest to the gap between the two classes, called the support vectors, to find the hyperplane that separates the classes with the largest margin. This hyperplane is then used as the classification boundary. Every point on one side of the hyperplane is given the label 1 and every point on the other side is given the label − 1. The particular implementation of the SVM that we use is C-SVM, which is explained below. Assume that we have training vectors \(x_{i} \in R^{n}, i = 1, \ldots, m\) and a vector of labels for these training samples \(y \in R^{m}\) that consists only of the labels − 1 and 1. We wish to obtain parameters for the hyperplane that optimally separates the points into the correct classes, namely the hyperplane \(w^{T}x + b = 0\), where w is a vector orthogonal to the hyperplane and b is the bias. Generally, we solve for the hyperplane using the dual problem, which is represented mathematically as follows:

$$\mathop {\min }\limits_{\alpha } \frac{1}{2}\mathop \sum \limits_{i = 1}^{m} \mathop \sum \limits_{j = 1}^{m} \alpha_{i} \alpha_{j} y_{i} y_{j} K\left( {x_{i} ,x_{j} } \right) - \mathop \sum \limits_{i = 1}^{m} \alpha_{i}$$
(1)

subject to \(y^{T} \alpha = 0,\; 0 \le \alpha_{i} \le C, \;i = 1, \ldots ,m\)

where \(K\left( {x_{i} ,x_{j} } \right) \equiv \varphi \left( {x_{i} } \right)^{T} \varphi \left( {x_{j} } \right)\) is the kernel function, with φ representing a transformation. It can implicitly map \(x_{i}\) and \(x_{j}\) to a higher-dimensional feature space, depending on the kernel function chosen. This is useful when a nonlinear decision boundary is desirable. In our work, we use a radial basis function kernel \(K\left( {x_{i} ,x_{j} } \right) = \exp \left( { - \gamma \left\| {x_{i} - x_{j} } \right\|^{2} } \right).\)

Using a grid search, we find that the optimal value of the free parameter γ for our data is 2. We also find that the optimal value of the penalty C is 512. C determines the cost of misclassification, which in turn determines how many support vectors are chosen and how smooth the decision surface is. Once the dual problem is solved and the optimal α is found, the following equation is used to compute w:

$$w = \mathop \sum \limits_{i = 1}^{m} y_{i} \alpha_{i } \varphi \left( { x_{i} } \right)$$
(2)

Decisions are made by calculating the value of \({w}^{T}\varphi \left(x\right)+b\). If the answer is positive, the data point is classified as label 1. If it is negative, the point is classified as label − 1.

The optimized SVM model provides good results. However, because our problem has three classes, a single binary SVM is not an option. We use the one-versus-one method, in which an SVM is generated for each pair of classes and the SVMs then vote to determine the classification of each new data point. This method slows down classification to some degree, but because we have only three classes, the difference is not significant. SVMs also usually require feature scaling to operate effectively. On the other hand, they can easily handle high-dimensional feature spaces and complex decision boundaries.
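A sketch of this setup with scikit-learn (an analogue of our WEKA configuration, not the exact tooling) follows; `SVC` handles the three-class problem one-versus-one internally, scaling is folded into the pipeline, and the values found optimal for our data (γ = 2, C = 512) lie inside the powers-of-two grid:

```python
# A minimal sketch of the RBF-kernel C-SVM with a grid search over the
# kernel width gamma and the misclassification penalty C.
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([
    ("scale", StandardScaler()),   # SVMs usually require feature scaling
    ("svm", SVC(kernel="rbf")),    # multiclass handled one-versus-one
])

grid = {
    "svm__gamma": [2 ** k for k in range(-7, 4)],   # includes gamma = 2
    "svm__C":     [2 ** k for k in range(-1, 12)],  # includes C = 512
}

search = GridSearchCV(
    pipe, grid,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=42),
)
# search.fit(X, y); print(search.best_params_)  # X, y: our feature data
```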

6 Evaluation datasets

We have collected and analyzed two different app datasets: (1) 1624 apps drawn from the 53 categories in Google Play, and (2) 385 malicious apps provided by the cybersecurity firm CloudSEK [40, 41]. The sampled categories include the following:

  • Social

  • Tools

  • Shopping

  • Weather

  • Health & fitness

  • Finance

  • Transportation

6.1 Google play dataset

We collected data from Google Play, the official Android market. As our sample data, we consider the top 1624 free/paid apps and the 385 malicious apps. For example, the TikTok app has many downloads, high ratings, and a high satisfaction ratio and therefore has a large impact on end users [42]. Our goal is to collect three different kinds of data:

  1.

    App Permissions: we created a list of all app permissions for the 1624 top free/paid apps from the categories in Google Play and the 385 malicious apps. We obtain this information from the manifest file of each app we analyze [27]. We identified 118 unique app permissions in the dataset. We used the Androguard tool to extract all the permissions [43]; Androguard is a tool implemented in Python that parses APK files and lists all the app permissions (see the extraction sketch after this list). All the APK files were downloaded from the APKpure dataset [44].

  2.

    Ad Networks: we created a list of all the ad networks associated with the 1624 top apps in Google Play and the 385 malicious apps. Using the AppBrain Ad Detector tool, we identified 79 different ad networks associated with the 2009 apps [45, 46]. Some of these ad networks pose a threat to user privacy; MobileAppTracking, identified in 100 apps, can track the user's actions without the user's knowledge [47]. Another tool we used to identify the ad networks in the malicious app dataset is the Mobile Security Framework (MobSF) [48], an automated, all-in-one mobile application (Android/iOS/Windows) pen-testing, malware analysis, and security assessment framework capable of performing static and dynamic analysis.

  3.

    Malware Permissions: we created a list of all the app permissions that the top 99 Android malware programs request to gain access to a device [8, 26, 49,50,51].
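As an illustration of the permission-extraction step in item 1, the sketch below, assuming the Androguard library and a local directory of APK files with hypothetical paths, turns each app's permission list into a row of binary indicator features:

```python
# A minimal sketch of the permission-extraction step: each APK becomes a
# row of 0/1 indicators over every permission seen in the corpus (118
# unique permissions in our dataset). The "apks/" directory is hypothetical.
from pathlib import Path

from androguard.core.bytecodes.apk import APK

apk_paths = sorted(Path("apks").glob("*.apk"))

# First pass: the permission list of every app.
perms_by_app = {p.name: set(APK(str(p)).get_permissions()) for p in apk_paths}

# The feature vocabulary is the union of all permissions in the corpus.
vocabulary = sorted(set().union(*perms_by_app.values()))

# Second pass: one binary indicator row per app.
rows = {
    app: [1 if perm in perms else 0 for perm in vocabulary]
    for app, perms in perms_by_app.items()
}
print(len(vocabulary), "unique permissions across", len(rows), "apps")
```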

6.2 Malicious dataset

The 385 malicious apps in our dataset were provided by the cybersecurity firm CloudSEK via an online malware repository [40, 41]. CloudSEK used its own proprietary software to discover Android apps that contain or had previously contained malware. Each malicious app was installed, and its features (dangerous combinations, number of ad networks, etc.) were noted. Because the malicious apps do not include category information, we manually assigned categories by searching for the app name and locating similar apps in the Google Play store; the categories of the similar apps were then used to determine the category of each malicious app with a high degree of certainty.

6.3 Class distribution

Our dataset includes 2009 records in total, each with 201 features, such as dangerous permission combinations, the number of ad networks, the list of ad networks (79), the app permission list (118), the number of threats, and the permission count versus the category average ("perm vs. avg"). The dataset has three classes: benign, safe, and malicious apps. Analyzing the class distribution imbalance is very important because it affects the results of the classifiers.

Table 1 indicates the class imbalance that exists in our dataset; the benign class has considerably more records than the safe and malicious classes, covering 77.75% of the whole dataset, and we have the fewest records for apps that fall into the safe category. The imbalance problem has received considerable attention [36]. Imbalanced datasets exist in many real-world domains, such as the detection of unreliable telecommunication customers, text classification, the detection of fraudulent telephone calls, and information retrieval and filtering [36], and app behavior detection is no exception. This representation reflects the reality that there will always be a disproportionate relationship between benign, safe, and malicious apps: more benign apps are available than malicious or safe ones. Our main concern pertains to the two minority classes (safe and malicious) rather than the benign one, and we need an accurate classification method for these minority classes. Traditional data mining algorithms behave undesirably when handling imbalanced datasets because the distribution of the data is not taken into consideration [36]. Because the three classes are not equally distributed, classification accuracy alone cannot reflect reliable classification for the safe and malicious classes.

Table 1 Imbalanced class distribution

To overcome this, we apply SMOTE, as described above, to oversample the safe and malicious classes by creating "synthetic" records [37]. Figures 1, 2, 3, 4 show visualizations of our dataset, produced using the WEKA tool [32].

Fig. 1

Feature visualization using dataset values, where x-axis represents “app name” and y-axis represents “perm vs. avg.” The benign, safe, and malicious classes are represented by blue, red, and green colors, respectively

Fig. 2

Feature visualization using dataset values, where x-axis represents “perm vs. avg.” and y-axis represents “number of ads.” The benign, safe, and malicious classes are represented by blue, red, and green colors, respectively

Fig. 3

Feature visualization using dataset values, where x-axis represents “threats” and y-axis represents “perm vs. avg.”

Fig. 4

Feature visualization using dataset values, where x-axis represents “perm vs. avg.” and y-axis represents “threats.” The benign, safe, and malicious classes are represented by blue, red, and green colors, respectively

7 Discussion on dataset

From our data analysis of the dataset of safe/benign apps, we make the following inferences:

  • On average, the apps in our database require eight permissions each. The minimum number required is zero, whereas the maximum is 72.

  • In many categories, free apps require approximately twice as many app permissions as the corresponding paid ones. On average, the free apps in our database require approximately 50% more permissions than the paid apps.

  • We calculate the number of app permissions required for each of the 1624 top free/paid apps from Google Play. The results for the categories requiring the most permissions on average are shown in Fig. 5.

In almost every app category in Google Play, free apps require almost twice the number of app permissions required by paid ones. The greater the number of app permissions required by an app, the greater the number of threats that could violate user privacy. From our analysis, we infer that ad networks expose mobile users to additional privacy threats: legitimate apps associated with ads require more app permissions to run, and they store information in additional web and database servers as well as in the apps' own legitimate servers (Fig. 5).

Fig. 5

Comparison between number of permissions in free and paid apps in selected categories

The app permissions required most often are: (1) full network access, (2) view network connections, (3) photos/media/files, (4) prevent device from sleeping, (5) Wi-Fi connection information, (6) device ID and call information, (7) in-app purchases, and (8) storage. The following inferences can be made from the results (Fig. 6):

  • Approximately 4% of the apps in our database do not require any permissions at all.

  • The three apps that require the most app permissions in our database are WeChat (72), Du Speed Booster (59), and Lookout Security and Antivirus (42).

  • The Live Wallpaper and Tools categories each contain both apps that require no permissions at all and apps that require the most permissions in our database.

  • The permissions “full network access” and “view network connections” are required by most of the apps in Google Play.

  • Approximately 37.78% of apps require the permission “in-app purchases,” whereas 20.18% of apps require the permission “run at startup.”

Fig. 6

Most common permissions in our dataset

There are 14 Ad networks that are associated with the majority of apps in Google Play. AdMob is the ad network associated with approximately 50% of apps available in the official Android app market (Fig. 7).

Fig. 7

The list of ad networks that are part of the majority of apps in Google Play

8 Results

The primary metric we use for evaluating our data is classification accuracy (CA), which is defined as:

$$CA = \frac{{\text{number of correctly predicted points}}}{{\text{total number of points in the testing data}}}$$
(3)

The overall testing results are listed in Tables 2 and 3. Table 2 lists the results for all three classifiers without SMOTE. The values of the area under the curve (AUC), CA, Precision, Recall, and F-measure are the averages over the three classes (benign, safe, and malicious). The accuracy of the model depends on how well it separates the instances in the training set into the benign, safe, and malicious classes.

Table 2 Results of machine learning testing without using SMOTE
Table 3 Result of machine learning testing using SMOTE and randomize technique

Table 3 lists the results for all three classifiers using the SMOTE and Randomize techniques. Overall, we obtain better results for all three models when SMOTE and Randomize are applied to the dataset. It is evident from the per-model results that the TP rate for the malicious class increases considerably when the class distribution imbalance is taken into account.
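For reference, the class-averaged metrics reported in Tables 2 and 3 can be reproduced from per-class results. The sketch below (hypothetical label arrays, assuming scikit-learn) uses macro averaging, which weights the three classes equally; note that some tools report frequency-weighted averages instead:

```python
# A minimal sketch (hypothetical predictions) of the averaged metrics
# reported in Tables 2 and 3.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)

y_true = ["benign", "benign", "safe", "malicious", "benign", "safe"]
y_pred = ["benign", "safe",   "safe", "malicious", "benign", "benign"]

print("CA       :", accuracy_score(y_true, y_pred))
# Macro averaging weights the three classes equally, which matters
# when the benign class dominates the dataset.
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F-measure:", f1_score(y_true, y_pred, average="macro"))
```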

9 Feature selection

Feature selection is also called variable selection or attribute selection. It is the automatic selection of the attributes (from the training data) that are most relevant to the predictive model being used. Usually, not all features contribute equally to a classification model. For all our classifiers, we identify the feature contribution through tenfold cross validation. We use ClassifierAttributeEval to evaluate the worth of each attribute under the Naïve Bayes, SVM, and classification tree classifiers. Although we have 201 attributes, we consider the first 10 attributes, as ranked by ClassifierAttributeEval, that contribute the most to the model. In addition, we use ClassifierSubsetEval, which evaluates attribute subsets on the training data; this attribute evaluator uses all three classifiers to estimate the "merit" of a set of attributes.
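As a rough analogue of this attribute evaluation (permutation importance rather than WEKA's ClassifierAttributeEval, so a related but not identical technique), a sketch with placeholder data is:

```python
# A sketch of per-attribute contribution ranking, approximating WEKA's
# ClassifierAttributeEval with permutation importance. X, y, and
# feature_names are placeholders for our 201-feature dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=8, n_classes=3, random_state=42)
feature_names = [f"f{i}" for i in range(X.shape[1])]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
model = GaussianNB().fit(X_tr, y_tr)

# How much does shuffling one attribute degrade the fitted model?
result = permutation_importance(model, X_te, y_te, n_repeats=10,
                                random_state=42)
ranking = np.argsort(result.importances_mean)[::-1]
for i in ranking[:10]:   # the ten highest-contributing attributes
    print(feature_names[i], round(result.importances_mean[i], 4))
```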

The features used throughout the tenfold validation in the Naïve Bayes model are ranked as listed in Table 4. The feature that has the highest contribution in the Naïve Bayes model is “threats”. The feature that has the least contribution is “Google Play license check.”

Table 4 Attribute selection in NAÏVE Bayes model

The features used throughout the tenfold validation in the classification tree model are ranked as listed in Table 5. The feature that has the highest contribution in the classification tree model is “threats”. The feature that has the least contribution is “delete all app cache data.”

Table 5 Attribute selection in classification tree model

The features used throughout the tenfold validation in the SVM model are ranked as listed in Table 6. The feature that has the greatest contribution in the SVM model is “threats”. The feature that has the least contribution is “Google Play license check.” All the features are present in all the 10 partitions.

Table 6 Attribute selection in SVM model

The feature that contributes the most in all three classification models is the number of threats. The effectiveness of total threats as a feature is likely attributable to its direct connection with the characteristics of top malware threats; our assumption since the beginning of the study has been that there is a strong relationship between the permissions an app requires and the threats users are exposed to. Another feature that contributes to the classification models is "in-app purchases." This permission is requested extensively even by apps that are free and require no purchase at the time; the sensitive payment information required up front is unnecessary and can expose the user to possible threats and charges. We have likewise assumed a strong relationship between app permissions and mobile ads, with a large possibility of users being exposed to malicious behavior: most ads require more app permissions than the app itself, the number of ads in a single app keeps increasing, and the user is not aware of how many ad networks are associated with an app. Another feature contributing to all three classifiers is "run at startup"; we theorize that this is because malicious apps are much more likely to request it. The feature that contributes the least is "dangerous combo."

10 Conclusion

Our goal is to help end users make informed decisions about the app permissions they grant. App permissions can be misused by malicious apps and advertisement networks to gain access to a user's data and mobile resources. We analyzed app permissions, dangerous combinations of permissions, and lists of ad networks to make predictions about app behavior. Our training set is composed of 2009 apps in total: 1624 benign and safe apps from Google Play and 385 malicious apps. We used our dataset to identify 201 features, which are used to classify apps into the safe, benign, and malicious categories, and we have demonstrated the classification of apps using these features. The classification tree (J48) is an effective model for categorizing apps based on our features; it achieves a classification accuracy of 95.9%. The number of threats is the feature with the greatest impact in our prediction models. A limitation of our model is that we used static analysis to classify malicious app behavior; a dynamic analysis that considers the app permissions used by malicious apps and advertisement networks will be part of our future work. Additionally, the impact of ad networks on violations of user privacy will be explored. Quantifying the severity of the privacy violations caused by malware and ad networks would help increase end-user awareness of the risks that mobile apps pose.