Knowledge and Information Systems, Volume 55, Issue 3, pp 771–796

Mobile Apps identification based on network flows

  • Georgi Ajaeiya
  • Imad H. Elhajj
  • Ali Chehab
  • Ayman Kayssi
  • Marc Kneppers
Regular Paper


Network operators and mobile carriers face serious security challenges caused by the growing number of services provided by smartphone Apps. For example, Android OS has more than 1 million Apps in stores. Hence, network administrators tend to adopt strict policies to secure their infrastructure. The aim of this study is to propose an efficient framework with a classification component based on traffic analysis of Android Apps. The framework differs from previously proposed approaches by identifying App traffic from a network perspective, without introducing any overhead on subscribers' smartphones. It also involves a technique for pre-processing the network flows generated by Apps to obtain a set of features, which are used to build an identification model with machine learning algorithms. The classification model is built using classification ensembles. A group of chosen users contributes to training the classification model, which learns the normal behavior of selected Apps. Eventually, the model should be able to detect abnormal behavior of similar Apps across the network. A classification accuracy of 93.78% is achieved with a false positive rate under 0.5%. In addition, the framework detects abnormal flows of unknown classes by implementing an outlier detection mechanism, with a reported accuracy of 94%.
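The pipeline outlined in the abstract (flow features, an ensemble classifier, and an outlier check for unknown classes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four features, the nearest-centroid base learner, the bagging-style bootstrap resampling, the distance threshold used as the outlier rule, and every name (`flow_features`, `train_ensemble`, `classify`, `max_distance`) are assumptions made only for illustration.

```python
import random
import statistics
from collections import Counter


def flow_features(packets):
    """Summarise one flow given as (timestamp_s, size_bytes) pairs.

    The feature set is an assumption; the abstract does not list the features.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    return (
        float(len(packets)),       # packet count
        float(sum(sizes)),         # total bytes
        statistics.mean(sizes),    # mean packet size
        max(times) - min(times),   # flow duration in seconds
    )


def train_centroids(samples):
    """Stand-in base learner: one mean feature vector per App label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(statistics.mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def nearest(centroids, features):
    """Return the closest label and its Euclidean distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label])) ** 0.5
    best = min(centroids, key=distance)
    return best, distance(best)


def train_ensemble(samples, n_models=15, seed=0):
    """Bagging-style ensemble: each model sees a bootstrap resample."""
    rng = random.Random(seed)
    return [
        train_centroids([rng.choice(samples) for _ in samples])
        for _ in range(n_models)
    ]


def classify(ensemble, features, max_distance):
    """Majority vote; flows far from every known App are reported as unknown."""
    votes = Counter()
    for centroids in ensemble:
        label, distance = nearest(centroids, features)
        if distance <= max_distance:   # assumed outlier rule
            votes[label] += 1
    if not votes:
        return "unknown"
    top_label, top_votes = votes.most_common(1)[0]
    return top_label if top_votes > len(ensemble) / 2 else "unknown"


def make_flow(n_packets, size, duration):
    """Synthetic flow with evenly spaced, equally sized packets (toy data)."""
    step = duration / (n_packets - 1)
    return [(k * step, size) for k in range(n_packets)]


# Toy training set: two hypothetical Apps with clearly different traffic.
training = (
    [(flow_features(make_flow(5, 100 + i, 1.0)), "chat") for i in range(10)]
    + [(flow_features(make_flow(50, 1200 + i, 10.0)), "video") for i in range(10)]
)
ensemble = train_ensemble(training)

chat_result = classify(ensemble, flow_features(make_flow(5, 104, 1.0)), 1000.0)
video_result = classify(ensemble, flow_features(make_flow(50, 1205, 10.0)), 1000.0)
odd_result = classify(ensemble, flow_features(make_flow(20, 600, 3.0)), 1000.0)
```

In this toy setup the raw features are left unscaled, so total bytes dominates the distance. A realistic version would normalise the features and choose `max_distance` from held-out data.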


Android security, Traffic analysis, App profiling, Flow-based classification



This research is funded by TELUS Corp., Canada.



Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
  2. TELUS Corp, Vancouver, Canada
