A federated approach to Android malware classification through Perm-Maps

In recent decades, mobile apps have been increasingly used in many application fields and for a wide range of human activities. Unfortunately, the number of cyber-attacks targeting mobile platforms is also increasing day by day. Although advances in Artificial Intelligence have made it possible to address many aspects of the problem, malware classification tasks are still challenging. For this reason, this paper proposes new special features, called permission maps (Perm-Maps), which combine information related to the Android permissions and their corresponding severity levels. Such features have proven to be very effective in classifying different malware families through the usage of a convolutional neural network. Also, the advantages introduced by the Perm-Maps have been enhanced by a training process based on a federated logic. Experimental results show that the proposed approach achieves up to a 3% improvement in average accuracy with respect to J48 trees and the Naive Bayes classifier, and up to 16% compared to a multi-layer perceptron classifier. Furthermore, the combined use of Perm-Maps and federated logic allows dealing with unbalanced training datasets with low computational effort.


Introduction
Since Android-based devices are used by thousands of end-users every year, more and more malicious applications are continuously developed by cyber-criminals in order to steal sensitive information and conduct hostile activities. According to the McAfee Mobile Threat Report, in 2019 cyber-criminals increased the effectiveness of their mobile attacks with the support of a wide variety of methods and new approaches, such as backdoors and cryptocurrencies, making them hard to identify and remove [32]. In addition, as shown in Fig. 1, G DATA and McAfee experts counted more than 4.18 million new malicious applications in 2019 [17], while Kaspersky and TechCrunch estimated that there would be over 6 billion smartphone users worldwide by 2020 [22,41]. Therefore, to face this security trend and support researchers in addressing malware detection tasks, several approaches based on machine learning (ML) and deep learning (DL) have proved to be effective in facing many aspects related to Android threats, especially when combined with static and dynamic features directly extracted from mobile apps [16,21,31]. However, due to the continuous release of new Android malware, the related classification tasks are still challenging. As a consequence, many state-of-the-art approaches suffer from problems related to their dynamic re-training, as well as to the updating of their training datasets.
To address these issues, in this paper, we propose new special features, called permission maps (Perm-Maps), which combine information related to the Android permissions and their corresponding severity levels. Such features are used to classify different malware families through a convolutional neural network (CNN). Also, the advantages introduced by the Perm-Maps are enhanced by a training process based on a federated logic, where end-user devices extract static features locally and send them to a centralized server devoted to training the employed neural network. Next, we explore the effectiveness of the proposed Perm-Maps by comparing them with the most popular state-of-the-art ML- and DL-based approaches. Finally, to reduce the computational effort required by the Perm-Maps generation and CNN training processes, we investigate a feature selection technique based on the most frequent Android permissions.
The main contributions of this paper can be summarized as follows:
1. Novel features, called Perm-Maps, are proposed to combine the Android permissions and their corresponding severity levels into an image.
2. A federated architecture is presented to support the training phase of the Perm-Maps.
3. A CNN is employed to classify several Android malware families and then compared with the most popular state-of-the-art approaches.
4. A feature selection technique based on the most frequent Android permissions is investigated to reduce the computational effort required by the Perm-Maps generation and CNN training processes, respectively.
The rest of the paper is organized as follows. Section 2 will present the related works about malware classification methods for Android devices. Section 3 will report a background overview on Android permissions. Section 4 will show the definition of Perm-Map, which is based on the Android permissions and their corresponding severity levels. Section 5 will present the employed federated architecture. Section 6 will discuss the obtained results related to the proposed CNN and the investigated feature selection technique, respectively. Finally, Sect. 7 will show the conclusions and future works.

Related works
Since new Android malware applications are continuously released every year by cyber-criminals, many detection frameworks based on static and dynamic methodologies have been proposed [16,21,31]. Static techniques can acquire the behaviour of the analyzed applications by performing several reverse engineering steps and, consequently, by extracting useful signatures without executing the application. For instance, Onwuzurike et al. [34] presented MaMaDROID, an Android malware detection solution that checks the sequences of API calls associated with the activity of a mobile application. However, static approaches are often adversely affected by the use of obfuscation techniques and, additionally, they become ineffective against polymorphic malware, which is able to modify itself. For this reason, signature-based detection techniques are often ineffective and are frequently replaced by dynamic approaches, which analyze the behaviour of an application at run time. In 2018, Sruthi et al. [40] proposed a malware detection technique, in a Windows OS environment, based on API calls. Furthermore, several works have adopted ML and DL techniques based on both static and dynamic features [14,33,48].
In 2016, Kolosnjaji et al. [24] investigated a comparison among different deep neural network (DNN) typologies. In particular, they proposed a convolutional long short-term memory (Conv-LSTM) network able to achieve an 89.0% average accuracy by considering 10 different Android malware categories. Kumar et al. [25] proposed a comparison among three well-known ML-based methods to detect Android malware by analyzing the visual representation of APK files formatted as Grayscale, RGB, CMYK, and HSL images, without any code extraction or decompiling operations. More precisely, they investigated the proposed technique by using decision trees (DT), Random Forest (RF), and k-nearest neighbor (k-NN), respectively. The obtained results have shown that RF is able to achieve a 91% accuracy by considering APK files formatted as Grayscale images.
In 2017, Vinayakumar et al. [42] investigated different LSTM neural networks to classify APK files as either benign or malicious. In particular, they proposed an LSTM network able to achieve an 89.7% accuracy by taking into account Android permissions translated into numerical information.
In 2018, Li et al. [27] proposed a comparison among different DNN configurations based on static information, like permissions and Java code. More precisely, they compared ten distinct neural network configurations, achieving an average accuracy between 95 and 97% in the Android malware classification task. Xie et al. [47] proposed a tool called RepassDroid, which is able to classify Android applications as benign or malicious based on permissions and Java methods. Additionally, they explored a comparison among different ML-based approaches like DT, RF, k-NN, Naive Bayes (NB), and support vector machines. The achieved results have shown that RF is able to achieve a 99.7% accuracy by taking into account 24,288 Android applications.
In 2019, Li et al. [26] proposed a novel and highly reliable DNN classifier for Android malware detection based on the extraction of several features from manifest files and source code. In particular, they considered seven different static features like app components, hardware features, permissions, intent filters, restricted and suspicious Java methods, and used permissions. These features have been used to train a DNN able to obtain a 99.25% average accuracy. D'Angelo et al. [13] proposed deep sparse autoencoders (AEs) to classify Android-based malware and goodware (GW) applications downloaded from several app stores. More precisely, they proposed a new API methods representation technique named API-images, and an average accuracy of 95% has been achieved by employing deep sparse AEs.
In 2020, Aonzo et al. [7] presented BAdDroIds, a mobile application that leverages DL for detecting malware on resource-constrained devices. In particular, the proposed application has been compared with the most notable Android malware detection frameworks by achieving a 98% average accuracy.
Finally, in 2021, D'Angelo et al. [12] proposed a CNN and a recurrent neural network (RNN), based on API-images, in order to classify different malware families. More precisely, they used both neural networks on five malware families on the Unisa malware dataset (UMD) by achieving 99% in average accuracy.

Background
In this section, some key concepts related to Android permissions and federated environments are discussed in order to understand and appreciate the novelties of the proposed approach.

Permission's overview
Android permissions can be categorized into three main typologies: Install-time, Runtime, and Special [4]. Install-time permissions grant an application limited access to restricted data, and thus, they allow an application to perform restricted actions that minimally affect the system or other apps. When a developer declares install-time permissions, the system automatically grants the required permissions without notifying the end-user. There are two types of install-time permissions, called normal permissions and signature permissions:
- Normal permissions allow access to data and actions that present minimal risk for the system or the end-user's privacy. They can be identified through a protection level value set to normal.
- Signature permissions are defined in another Android application and are therefore granted only if the requesting and declaring applications are signed with the same certificate. They can be identified through a protection level value set to signature.
Runtime permissions, also known as dangerous permissions, grant an application additional access to restricted data by allowing it to perform actions that substantially affect the system and other apps. When an Android application requests runtime permissions, the system presents a prompt and waits for the end-user to grant or deny the request. Runtime permissions can be identified through a protection level value set to dangerous. Finally, the special permissions can be only defined by the original equipment manufacturers (OEMs) to provide access control concerning several energy-intensive actions, such as access to other applications. More precisely, they are closely associated with an app operation (app op) related to access control, and they can be identified through a protection level value set to appop.

Permission maps
Although the techniques used in the literature include both static and dynamic approaches, the static one is often preferred because it can analyze applications without running them. Accordingly, we propose new features, called Perm-Maps, derived from static malware analysis. More precisely, a Perm-Map is a sparse matrix where Android permissions, and their corresponding severity levels, are represented as fixed points in an x-y plane. As described in the following, the proposed Perm-Maps are able to address three main issues: (i) malicious Android developers could define custom permissions to perform several hostile activities, like theft of sensitive data or launch of cyber-attacks [1]; (ii) since default and custom permissions are associated with different severity levels, also called protection levels or flags (normal, signature, dangerous, or their combinations), an application could be characterized by many permissions and severity levels [3,5]. Therefore, a malicious developer could define some low severity level permissions to perform several actions without notifying the end-user; (iii) since Perm-Maps represent static features extracted only from the manifest file, they cannot be influenced by the most popular obfuscation tools, like DexGuard [18], ProGuard [19], and Obfuscapk [6].

Perm-Map creation workflow
The creation of a Perm-Map consists mainly of the following four steps:
1. Extraction of the Android permissions and their corresponding protection levels.
2. Assignment of an identifier (ID_p) to each Android permission.
3. Assignment of an identifier (ID_s) to each severity level.
4. Creation of the Perm-Map by using the pairs of IDs (ID_p, ID_s) as coordinates of fixed points in an x-y plane.
The first step is accomplished by using tools or libraries devoted to static malware analysis. A typical approach could envisage the creation of dictionaries of the well-known Android permissions, and their protection levels, taken from the official documentation [2]. Alternatively, the <permission> tag can be employed to know the protection level of custom permissions. This approach is adopted by several popular reverse engineering tools, like Androguard [15]. More precisely, for each permission declared in the AndroidManifest file, it obtains the corresponding protection level if the permission is known, and assigns a dangerous protection level otherwise. Next, the second and third steps are accomplished by creating two dictionaries that translate each Android permission and each corresponding severity level into a unique ID number, respectively. Finally, for each analyzed application, the fourth step is conducted by considering each pair of ID numbers (ID_p, ID_s) as coordinates of a fixed point and, consequently, storing the translated information in a sparse matrix. For instance, let p1 and p2 be two Android permissions, and let s3 and s2 be their severity levels, respectively. We can consider two pairs of coordinates C1 = (p1, s3) and C2 = (p2, s2) and draw two points in an x-y plane, where the x and y axes report permissions and severity levels, respectively. However, since severity levels differ from one another, it is possible to use different colour scales (like RGB or Grayscale) to highlight these differences. Figure 2 shows the complete workflow to obtain a Perm-Map.
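The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used in the experiments: the function names, the concrete ID_s values, the marker value 255, and the example permissions are all hypothetical. Rows index severity levels and columns index permissions, so that a dataset with 4 levels and 298 permissions would yield a 4 × 298 grid.

```python
from typing import Dict, List, Tuple

# Hypothetical severity-level dictionary (ID_s); the concrete IDs are an assumption.
SEVERITY_IDS: Dict[str, int] = {"normal": 0, "signature": 1, "dangerous": 2, "appop": 3}

App = List[Tuple[str, str]]  # (permission name, protection level) pairs of one application


def build_permission_ids(apps: List[App]) -> Dict[str, int]:
    """Assign a unique ID_p to every permission, in order of first appearance."""
    ids: Dict[str, int] = {}
    for app in apps:
        for permission, _level in app:
            ids.setdefault(permission, len(ids))
    return ids


def build_perm_map(app: App, permission_ids: Dict[str, int],
                   severity_ids: Dict[str, int] = SEVERITY_IDS,
                   value: int = 255) -> List[List[int]]:
    """Return a grid with one row per severity level and one column per permission."""
    grid = [[0] * len(permission_ids) for _ in range(len(severity_ids))]
    for permission, level in app:
        # Unknown levels fall back to 'dangerous', mirroring the fallback described above.
        s = severity_ids.get(level, severity_ids["dangerous"])
        p = permission_ids[permission]
        grid[s][p] = value  # fixed point at coordinates (ID_p, ID_s)
    return grid


apps = [
    [("android.permission.INTERNET", "normal"), ("android.permission.SEND_SMS", "dangerous")],
    [("android.permission.INTERNET", "normal"), ("com.example.CUSTOM", "unknown")],
]
permission_ids = build_permission_ids(apps)
perm_map = build_perm_map(apps[1], permission_ids)
```

Using a different marker value per severity level, instead of the single value 255, would be one way to obtain the colour-scale variant mentioned in the text.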

A federated architecture
Since millions of Android-based applications are released every year, managing the related data for model training purposes is a process that requires significant effort, mainly associated with accessing, searching, and updating them. To overcome these issues, we present a federated architecture to support Android classification tasks through the proposed Perm-Maps. Federated architectures are based on a federated data production logic, which implies that the participating devices send their own pre-processed permission data to a centralized infrastructure devoted to providing collection services, constructing the classification model, and sharing related information [23]. Due to its great success, the federated logic has been investigated, in the last decade, to face the main issues related to the convergence process among edge and cloud infrastructures, such as data aggregation, data mobility, and services migration [10,30,38]. Also, it has been involved in many other well-known application domains, such as cryptography solutions to preserve data security [36], optimization frameworks for medical Internet of Things devices [37], and vehicular networks optimization [43].
In detail, the proposed architecture aims to provide a data aggregation workflow where federated devices are used as decentralized permission data sources and preliminary processing units. Additionally, a central server is employed to collect data, and then construct, share and update a classification model to be transferred as an update to each federated device, thus providing a management strategy for the involved permission data. The discussed architecture works through two steps, named model creation process and model update process, while its main contributions can be summarized as follows:
1. A data aggregation workflow is presented to collect data from federated devices.
2. A centralized dataset is employed to create a shared DNN model based on Perm-Maps.
3. A data update workflow is discussed to manage centralized data and re-adapt the shared model.

Model creation process
At the beginning of the model creation process, each device decompresses the APK file and sends the AndroidManifest file to the central server. When the data are completely stored, the server performs the Perm-Maps creation process by following the workflow shown in Fig. 2. Then, the server runs the CNN's training and testing phase and sends the classification model to each device. Finally, each end-user receives a notification concerning the classification result of the analyzed application. Figure 3 shows the discussed process, while its main steps can be summarized as follows:
1. End devices decompress the APK file.
2. They also send the manifest file to the central server.
3. The server runs the Perm-Maps creation process, when data are completely available.
4. It then runs the CNN's training and testing phase.
5. The server sends the classification model to each device.
6. The end devices notify the end-users about the classification result.
Note that, when an end device receives the first classification model information, it becomes able to autonomously create its Perm-Maps, and hence perform classification, without affecting the central server.

Model update process
This phase is responsible for collecting new data when the end-user tries to install a new application. At a high level, it differs from the previous process in three main aspects:
1. If an application is unknown, it automatically stores the related manifest file on the central server.
2. If an application is unknown, it considers the end-user's feedback to generate a classification label.
3. If a threshold value is reached, it trains and shares an updated model by considering new data.
Therefore, when an end-user installs an application, the device decompresses the APK, extracts the Perm-Map by reading the AndroidManifest file, and uses the classification model to make a classification. If the application is known, the classification module notifies the end-user by showing the achieved prediction. Otherwise, it asks whether the installed application is known or trusted and, subsequently, sends the manifest file and the user's answer to the central server. The server stores the new data and, when the dataset size reaches a threshold value, it re-performs the Perm-Maps creation process. Finally, the server re-runs the training and testing phase and sends the updated model to each device. Figure 4 shows the discussed process, while the main steps can be summarized as follows:
1. End devices decompress the APK file.
2. They also extract the Perm-Map from the manifest file.
3. End devices also try to obtain a prediction and ask if the analyzed application is known or trusted.
4. They send the manifest file and user's answer to the server.
5. The server stores new data.
6. It then re-runs the Perm-Maps creation process, when the dataset size reaches a threshold value.
7. It also re-runs the CNN's training and testing phase.
8. Finally, it sends the updated model to each device.
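The server side of steps 4 to 8 can be sketched as a small buffer that triggers a retraining once enough new samples have arrived. This is only an illustrative sketch: the class name `UpdateServer`, the `retrain` callback, and the threshold value are hypothetical, and the threshold is assumed here to count the samples received since the last training rather than the absolute dataset size.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (manifest content, label derived from the user's answer)


@dataclass
class UpdateServer:
    """Hypothetical central server that retrains once enough new samples arrive."""
    retrain: Callable[[List[Sample]], None]  # stands in for Perm-Map creation + CNN training
    threshold: int = 3                       # illustrative value, not taken from the paper
    dataset: List[Sample] = field(default_factory=list)
    pending: int = 0                         # samples stored since the last training

    def submit(self, manifest: str, user_label: str) -> bool:
        """Store a new sample; return True if a retraining was triggered."""
        self.dataset.append((manifest, user_label))
        self.pending += 1
        if self.pending >= self.threshold:
            self.retrain(self.dataset)  # the updated model would then be sent to each device
            self.pending = 0
            return True
        return False


trained_on = []
server = UpdateServer(retrain=lambda data: trained_on.append(len(data)), threshold=2)
server.submit("<manifest 1>", "trusted")
server.submit("<manifest 2>", "unknown")  # reaches the threshold and triggers retraining
```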

Experimental results
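The first goal of the experiments reported in this section is to demonstrate the contribution of the proposed approach to the classification of several Android applications. The second goal is to explore a feature selection technique, based on the most frequent Android permissions, that reduces the computational effort required by the Perm-Maps generation and CNN training processes.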

UMD cleaning
In 2021, we developed a new Android malware dataset, called Unisa malware dataset (UMD) [12], that contains 25,275 mobile applications collected by analyzing two well-known datasets: the Android Malware Dataset (AMD) [28,44] and Drebin [8,39]. This first version of UMD consists of two main directories, called amd-cuckoo-family and drebin-cuckoo-family, that contain 66 and 143 Android malware families, respectively. Additionally, it provides, for each analyzed application, the report files obtained through CuckooDroid Sandbox [11,20]. Table 1 shows an overview of the first release of UMD.
In this work, we use a cleaned version of UMD (UMD-v2) obtained by applying the following modifications:
1. Consider the two main folders as a single one.
2. Merge the common families.
3. For each common family, remove the duplicates.
4. Remove each application that has one or more malformed files.
5. Remove each application that has one or more missing files.
The application of points (1) and (2) has reduced the number of considered families from 209 to 185. Instead, the application of points (3), (4) and (5) has reduced the number of analyzed applications from 25,275 to 24,285. Additionally, the application of the entire protocol has reduced the dataset dimensions (Dim.) from 117.63 to 112.45 GB. Table 2 reports a comparison between the two versions of our dataset.
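The cleaning protocol above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: families from the two folders are matched by their lower-cased name, duplicates are detected through the SHA-256 digest of the APK content, and a missing file is represented by `None`. The check for malformed files is omitted, and the function name `clean` and the sample data are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Sample = Tuple[str, str, Optional[bytes]]  # (source folder, family name, APK content)


def clean(samples: Iterable[Sample]) -> Dict[str, Dict[str, bytes]]:
    """Merge both folders, unify common families, and drop duplicates and missing files."""
    merged: Dict[str, Dict[str, bytes]] = defaultdict(dict)
    for _source, family, payload in samples:
        if not payload:  # missing (or empty) file: skip the application
            continue
        digest = hashlib.sha256(payload).hexdigest()
        # The same digest inside the same family counts as a duplicate (assumption).
        merged[family.lower()].setdefault(digest, payload)
    return dict(merged)


samples = [
    ("amd-cuckoo-family", "FakeInst", b"apk-a"),
    ("drebin-cuckoo-family", "fakeinst", b"apk-a"),  # duplicate of the first sample
    ("drebin-cuckoo-family", "fakeinst", b"apk-b"),
    ("amd-cuckoo-family", "Opfake", None),           # missing file
]
cleaned = clean(samples)
```

With the figures reported above, the merge removes 209 - 185 = 24 family entries, and the remaining steps remove 25,275 - 24,285 = 990 applications.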

Proof of concept experimental setting
We built our proof-of-concept testing framework within a virtualization scenario based on VirtualBox. For this work, we considered 10 categories of Android applications. In particular, the dataset used for training has been composed by choosing nine malware families from UMD-v2 and selecting GW applications from the following online stores: ApkPure, GooglePlay, and PlayDrone. To simulate the discussed model creation process, each application has been analyzed through the Android device cross-platform mode of CuckooDroid [11,20]. More precisely, in our proof-of-concept framework we used two Android guest virtual machines, simulating end devices, to decompress each APK file and send the AndroidManifest file to the server virtual machine. Then, we extracted the Perm-Maps by using a dedicated Python script executed on the server machine. We stored each Perm-Map as a 4 × 298 matrix, in accordance with the maximum number of distinct severity levels and Android permissions observed, respectively. Figure 5 shows the distribution of the applications obtained through an exploratory data analysis (EDA) [35,46], and it highlights the unbalanced nature of the employed dataset. Subsequently, we split this dataset in order to run the experiments. To this purpose, the whole dataset has been subdivided into two mutually exclusive subsets called learning and testing datasets, respectively. We used 70% of the entire dataset for learning and the remaining 30% for testing. Then, the K-fold cross-validation algorithm, with k = 10 (as recommended in [9]), has been used to tune the hyper-parameters and provide an unbiased evaluation of each employed CNN. Finally, each CNN has been trained on each training set and evaluated on the corresponding testing set. Table 3 reports the main information about the involved dataset.
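The 70/30 split and the 10-fold cross-validation can be sketched with the standard library alone. The function names and the fixed seed are hypothetical, and the folds are assumed here to be built over the learning subset. The sketch is not stratified; with an unbalanced dataset such as the one described above, a stratified variant may be preferable.

```python
import random
from typing import Iterator, List, Tuple


def split_70_30(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and return mutually exclusive learning/testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * n_samples)
    return indices[:cut], indices[cut:]


def k_fold(indices: List[int], k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, validation) index lists; each fold is used once for validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


learning, testing = split_70_30(1000)
for training, validation in k_fold(learning, k=10):
    pass  # train on `training`, tune hyper-parameters on `validation`
```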
To evaluate the classification quality of the employed neural network, the following metrics have been computed: accuracy (Acc.), sensitivity (Sens.), specificity (Spec.), precision (Prec.), area under the ROC curve (AUC), and F-measure (F-Meas. or F-score). More precisely, they have been derived from a multi-class confusion matrix where, for each category, TPs (true positives) are the applications of the considered category that are correctly classified, TNs (true negatives) are the applications of other categories that are not assigned to the considered category, FPs (false positives) are the applications of other categories incorrectly identified as the considered category, while FNs (false negatives) are the applications of the considered category incorrectly assigned to another category. Subsequently, in order to obtain a global validation, the average value (Avg.) of each metric over all categories has been computed.
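The per-category quantities and their averages can be derived from a multi-class confusion matrix as in the following sketch. The function names and the example matrix are hypothetical; the matrix is assumed to have true categories on rows and predicted categories on columns. The AUC is omitted because it requires the predicted scores rather than the confusion matrix alone.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """cm[i][j] = number of applications of true category i predicted as category j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # category c assigned elsewhere
        fp = sum(cm[r][c] for r in range(n)) - tp   # other categories assigned to c
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f_score = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        results.append({
            "accuracy": (tp + tn) / total if total else 0.0,
            "sensitivity": sens,
            "specificity": spec,
            "precision": prec,
            "f_score": f_score,
        })
    return results


def average(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Unweighted average of each metric over all categories."""
    return {key: sum(r[key] for r in results) / len(results) for key in results[0]}


confusion = [[5, 1, 0],
             [2, 3, 0],
             [0, 0, 4]]
avg = average(per_class_metrics(confusion))
```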

Achieved results
The proposed CNN has been trained and tested on an iMac equipped with an Intel 6-Core i7 CPU @ 3.20 GHz and 16 GB of RAM. The employed neural network has been compiled with the Adam optimizer and the SparseCategoricalFocalLoss function [29], which is useful for fitting neural networks in the presence of unbalanced datasets. Then, it has been trained with batch_size = 64 for 150 epochs by using the 70/30 criterion and the K-fold cross-validation algorithm with k = 10. We chose these hyperparameters according to the results achieved in the testing process. Tables 4 and 5 show the results obtained from the testing phase by using the 70/30 criterion and the K-fold cross-validation algorithm with k = 10, respectively, while Table 6 shows the multi-class confusion matrix related to the 70/30 criterion.
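To illustrate why a focal loss helps with unbalanced datasets, the following sketch computes the standard focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t), for a single sample with an integer (sparse) label. It is a plain-Python illustration of the formula, not the library implementation used in the experiments; the function name mirrors the loss name only for readability, and gamma = 2 is an assumed value, since the text does not report it.

```python
import math
from typing import Sequence


def sparse_categorical_focal_loss(probabilities: Sequence[float], label: int,
                                  gamma: float = 2.0, eps: float = 1e-7) -> float:
    """Focal loss for one sample, given the predicted class probabilities."""
    # p_t is the probability assigned to the true class, clipped for numerical safety.
    p_t = min(max(probabilities[label], eps), 1.0 - eps)
    # The factor (1 - p_t)^gamma down-weights samples that are already well classified.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = sparse_categorical_focal_loss([0.9, 0.05, 0.05], label=0)  # confident and correct
hard = sparse_categorical_focal_loss([0.2, 0.7, 0.1], label=0)    # true class scored low
```

With gamma = 0 the expression reduces to the usual cross-entropy; larger values of gamma shift the training signal towards hard samples, which are often those of the under-represented categories.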
Furthermore, to face the yearly growth of malicious applications and analyze the update process of the presented architecture, we have estimated the data growth range within which the proposed CNN should be readjusted. More precisely, we have iteratively reduced the whole dataset in steps of 5%. At each step, 5% of the data have been randomly removed, and the resulting sub-dataset has been used to train and test the proposed CNN by following the 70/30 criterion. Table 7 summarizes the classification metrics derived from the testing phase for each considered sub-dataset.
The achieved results show that the proposed CNN should be readjusted when the data size grows by between 15 and 20%. In particular, the comparison between the whole dataset (size 100%) and the dataset reduced by 20% (size 80%) shows a worsening of all classification metrics. For instance, the proposed CNN shows a worsening of 3% in average precision, 7% in average sensitivity, and 6% in average F-score.
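The iterative reduction used for this estimate can be sketched as a generator of progressively smaller sub-datasets. The sketch assumes that each step removes 5% of the original dataset size (so that the sizes are 100%, 95%, 90%, and so on) and stops at 80%; the function name, the stopping fraction, and the seed are hypothetical.

```python
import random
from typing import Iterator, List, Sequence


def shrinking_subsets(samples: Sequence, step: float = 0.05,
                      min_fraction: float = 0.80, seed: int = 0) -> Iterator[List]:
    """Yield the full dataset, then sub-datasets with `step` of the original size removed each time."""
    rng = random.Random(seed)
    current = list(samples)
    drop = round(step * len(samples))           # samples removed at each step
    floor = round(min_fraction * len(samples))  # smallest sub-dataset to produce
    yield list(current)
    while drop > 0 and len(current) - drop >= floor:
        rng.shuffle(current)                    # random choice of the removed samples
        current = current[drop:]
        yield list(current)


sizes = [len(subset) for subset in shrinking_subsets(range(100))]
```

Each yielded sub-dataset would then be split with the 70/30 criterion and used to train and test the network, so that the metrics can be compared across sizes.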
In order to show the effectiveness of the proposed representation method, the achieved results have been compared with the most notable ML-based approaches implemented in the WEKA [45] framework. More precisely, we used multi-layer perceptron (MLP), J48 trees (J48), and NB to derive the classification metrics by considering a flattened version of the dataset used to train and test the proposed CNN. Table 8 summarizes the comparison between the proposed CNN (Pr-CNN) and the employed ML-based methods. This comparison shows that the MLP classifier is not able to distinguish different application categories by considering Android permissions and their severity levels, while J48 trees and the NB classifier have achieved good results. More precisely, the proposed CNN has obtained up to a 3% improvement in average accuracy over J48 trees and the NB classifier, and up to a 16% improvement over the MLP classifier. Consequently, the proposed CNN can reduce the number of FPs and FNs and, thus, better minimize the classification error with respect to the most popular ML-based approaches.
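Finally, we compared the proposed CNN with the ML- and DL-based state-of-the-art solutions. We considered the LSTM results achieved by Vinayakumar et al. [42] (Vi-LSTM), the RF results achieved by Kumar et al. [25] (Kum-RF) and Xie et al. [47] (Xie-RF), and the DNN results achieved by Li et al. [26] (Li-DNN). Table 9 summarizes the comparison between the Pr-CNN and the state-of-the-art solutions.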
First of all, this comparison shows that the Vi-LSTM and Kum-RF solutions have achieved fair results, and the proposed CNN has obtained improvements of up to 10% and 8% in average accuracy over them, respectively. As reported in Sect. 2, the Vi-LSTM evaluation metrics have been obtained by only considering Android permissions translated into numerical information, while the Kum-RF evaluation metrics have been achieved by considering Grayscale images directly generated from the APK files, without performing any code extraction or decompiling operations. Consequently, the selected static features are not sufficient to achieve results equivalent to those obtained by the proposed CNN. Second, Xie-RF and Li-DNN have achieved optimal results: the proposed CNN has obtained an improvement of up to 2% in average accuracy over Xie-RF, while its evaluation metrics are similar to those achieved by Li-DNN. However, the proposed Perm-Map representation technique is only based on Android permissions and their severity levels, while Xie-RF and Li-DNN are based on Android permissions and Java methods. Consequently, Xie-RF and Li-DNN become ineffective against obfuscation techniques. Finally, Table 10 reports a final overview of the proposed CNN, the ML-based methods of WEKA, and the state-of-the-art solutions.

Feature selection process
Since the number of employed permissions is 298, the final goal is to explore a feature selection technique, based on the most frequent Android permissions, in order to reduce the computational effort required by the Perm-Map generation and CNN training processes, respectively. To this purpose, we have analyzed the distribution of permission frequencies in order to find the minimum frequency that reduces the number of employed permissions while preserving the number of applications analyzed previously. We have performed this analysis by using a dedicated Python script. More precisely, we first created an ordered dictionary to store each permission and its frequency. Then, we considered all Android permissions requested at least 50 times; as a result, 57 Android permissions have been used for the generation of each Perm-Map. Figure 7 shows the five most requested Android permissions. Subsequently, according to the workflow shown in Fig. 2, we employed the 57 Android permissions to generate and store each Perm-Map as a 4 × 64 matrix, in accordance with the maximum number of distinct severity levels and an upper bound on the number of Android permissions, respectively. We have chosen this upper bound to simplify the operations performed by the convolutional layers. Then, we split this new dataset in order to run the experiments. To this purpose, the whole dataset has been subdivided into two mutually exclusive subsets assuming the role of learning and testing datasets, respectively. We used 70% of the entire dataset for learning and the remaining 30% for testing. The employed neural network has been compiled with the Adam optimizer and the SparseCategoricalFocalLoss function, and trained with batch_size = 64 for 150 epochs. Furthermore, it has the same architecture as the neural network described in Fig. 6, except for input_shape = (4, 64, 1) and dense layers with dropout = 0.45.
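The frequency-based selection described above can be sketched as follows. This is not the script used in the experiments: the function names and the toy data are hypothetical, each permission is assumed to be counted at most once per application, and the width of 64 is assumed here to be obtained as the smallest power of two that can hold the selected permissions.

```python
from collections import Counter
from typing import List


def select_frequent_permissions(apps: List[List[str]], min_count: int = 50) -> List[str]:
    """Return the permissions requested by at least `min_count` applications, most frequent first."""
    counts = Counter(p for app in apps for p in set(app))  # one count per application
    return [permission for permission, count in counts.most_common() if count >= min_count]


def padded_width(n_features: int) -> int:
    """Smallest power of two that is not smaller than `n_features` (assumed padding rule)."""
    width = 1
    while width < n_features:
        width *= 2
    return width


apps = [["INTERNET", "SEND_SMS"]] * 3 + [["INTERNET"]] * 2
selected = select_frequent_permissions(apps, min_count=4)  # only INTERNET reaches 4 requests
width = padded_width(57)                                   # 57 selected permissions fit in 64 columns
```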
Finally, the computational effort for the text substitution, Perm-Maps generation, and training processes has been measured with and without the employed feature selection method. Table 11 reports the computational effort required for each analyzed phase, Table 12 shows the results obtained from the testing phase by using the 70/30 criterion, while Table 13 summarizes the comparison between the proposed CNNs, which have been called CNN-NoExtraction (CNN-NE) and CNN-WithExtraction (CNN-WE), respectively.
The obtained results show that the employed feature selection approach can reduce the computational effort required by each analyzed process. More precisely, Table 11 shows that the text substitution and Perm-Maps generation processes have been slightly improved. Furthermore, it shows that the training process has been improved by 3.5 s, while the total effort has been improved by 3.6 s. Finally, the comparison reported in Table 13 demonstrates that the proposed CNNs have obtained equivalent evaluation metrics in the testing phase, and thus that the employed feature selection criterion can also optimize the proposed representation approach.

Conclusions and future works
In this paper, novel features called Perm-Maps, based on Android permissions and their corresponding severity levels, have been presented. Next, a CNN has been used to show the potential of the proposed approach. More precisely, it has been enhanced by a training process based on a federated logic, where end-user devices extract static features locally and send them to a central server devoted to training a neural network performing malware classification. Then, the effectiveness of the presented methodology has been validated by using statistical metrics and comparing it to the most popular state-of-the-art ML-based approaches, like NB, MLP and J48 DTs. The obtained results show that the proposed CNN has achieved up to a 3% improvement in average accuracy over the J48 tree-based and NB classifiers, and up to 16% over an MLP classifier. Finally, a feature selection technique, based on the most frequent Android permissions, has been explored to reduce the computational effort required by the Perm-Maps generation and CNN training processes, respectively. The achieved results show that the proposed methodology has reduced the total computational time by 3.6 s, of which 3.5 s in the training process, and that the classification results are comparable with those obtained without any feature selection technique. However, due to the high number of existing Android-based applications, we would like to propose two possible future works. First of all, we will investigate the proposed features by considering an enormous quantity of decentralized data and applying a fully federated learning approach, involving end devices in model construction. Finally, since the most popular ML- and DL-based methods consider only features obtained at the end of malware analysis, we will propose new solutions capable of reducing damages caused at run-time by processing streams of dynamic features. For instance, several combinations among LSTM layers, CNNs, and stacked AEs (SAEs) could be explored and combined with the proposed approach.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Data availability
The data used are available at http://antlab.di.unisa.it/malware/.

Conflict of interest
The authors declare that they have no conflict of interest.

Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent
Informed consent was obtained from all individual participants included in the study.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.