Filtering for Malice Through the Data Ocean: Large-Scale PHA Install Detection at the Communication Service Provider Level

  • Conference paper
Research in Attacks, Intrusions, and Defenses (RAID 2017)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10453)

Abstract

As a key stakeholder in mobile communications, the communication service provider (CSP, including carriers and ISPs) plays a critical role in safeguarding mobile users against potentially harmful apps (PHAs), complementing the security protection at app stores. However, a CSP-level scan faces an enormous challenge: hundreds of millions of apps are installed every day, and retaining their download traffic to reconstruct their packages places a huge burden on the CSP, forces it to change its infrastructure, and can have serious privacy and legal ramifications. To control the cost and avoid these problems, today's CSPs acquire apps from their download URLs for malware analysis. Even this step is extremely expensive and can hardly meet the demand of online protection: for example, a CSP we are working with runs hundreds of machines to check the daily downloads it observes. To rise to this challenge, we present in this paper an innovative "app baleen" framework (called Abaleen) for online security vetting of an extremely large number of app downloads, through high-performance, concurrent inspection of app content at the sources of the downloads. At the center of the framework is the idea of retrieving only a small amount of the content from the remote sources to identify suspicious app downloads and warn the end users, ideally before the installation is complete. Running on 90 million download URLs recorded by our CSP partner, our screening framework achieves a nearly 85× speed-up compared with the existing solution. This level of performance enables online vetting for PHAs at the CSP scale: among all unique URLs used in our study, more than 95% were processed before the completion of unfettered downloads. With the CSP-level dataset, we revealed not only the surprising pervasiveness of PHAs, but also their real-world impact (over 2 million installs in merely 3 days).
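The idea of retrieving only a small part of a remote package can be illustrated with a minimal sketch. It is not the Abaleen implementation; the abstract does not say which bytes the framework fetches. The sketch assumes only that an APK is a ZIP archive whose table of contents (the central directory) sits at the end of the file, so a byte-range read of the tail is enough to list the files it contains. The helpers `make_range_fetcher`, `list_entries`, and `build_demo_apk` are hypothetical names, and the range read is simulated in memory rather than sent over HTTP.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory (EOCD) signature
EOCD_MIN = 22              # fixed EOCD size, excluding the optional comment


def make_range_fetcher(blob):
    """Stand-in for an HTTP Range request (hypothetical helper).

    Returns a function serving inclusive byte ranges of `blob`, plus a
    dict that counts how many bytes were requested in total.
    """
    stats = {"bytes": 0}

    def fetch(start, end):
        chunk = blob[start:end + 1]
        stats["bytes"] += len(chunk)
        return chunk

    return fetch, stats


def list_entries(fetch, total_size, tail=1024):
    """List file names in a ZIP using only its tail and central directory."""
    tail_start = max(0, total_size - tail)
    data = fetch(tail_start, total_size - 1)
    pos = data.rfind(EOCD_SIG)
    if pos < 0 or len(data) - pos < EOCD_MIN:
        raise ValueError("EOCD record not found in the fetched tail")
    # EOCD fields: signature, disk no., CD start disk, entries on disk,
    # total entries, CD size, CD offset, comment length.
    (_, _, _, _, n_entries, cd_size, cd_offset, _) = struct.unpack(
        "<4sHHHHIIH", data[pos:pos + EOCD_MIN])
    cd = fetch(cd_offset, cd_offset + cd_size - 1)
    names, off = [], 0
    for _ in range(n_entries):
        # Each central header has 46 fixed bytes; the name, extra-field
        # and comment lengths are stored at byte offsets 28..33.
        name_len, extra_len, comment_len = struct.unpack(
            "<HHH", cd[off + 28:off + 34])
        names.append(cd[off + 46:off + 46 + name_len].decode("utf-8"))
        off += 46 + name_len + extra_len + comment_len
    return names


def build_demo_apk():
    """Build a small in-memory ZIP that stands in for an APK (demo data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("AndroidManifest.xml", b"<manifest/>")
        zf.writestr("classes.dex", b"\x00" * 200_000)
    return buf.getvalue()
```

Calling `list_entries(fetch, len(blob))` on the demo archive reads the last 1024 bytes plus the central directory instead of the whole 200 KB payload. Against a real server, `fetch` would issue a request with a `Range: bytes=start-end` header, which works only if the server supports range requests.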

Notes

  1. Those terms and images were manually inspected to ensure their correctness.

  2. For each node, its child nodes with most children are visited first.

Acknowledgements

We thank our shepherd Roberto Perdisci and anonymous reviewers for their valuable comments. We also thank VirusTotal for the help in validating suspicious apps in our study. Kai Chen was supported in part by NSFC U1536106, National Key Research and Development Program of China (Grant No. 2016QY04W0805), Youth Innovation Promotion Association CAS, and strategic priority research program of CAS (XDA06010701). The IU authors are supported in part by NSF CNS-1223477, 1223495, 1527141 and 1618493, and ARO W911NF1610127.

Corresponding author

Correspondence to Kai Chen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chen, K., Li, T., Ma, B., Wang, P., Wang, X., Zong, P. (2017). Filtering for Malice Through the Data Ocean: Large-Scale PHA Install Detection at the Communication Service Provider Level. In: Dacier, M., Bailey, M., Polychronakis, M., Antonakakis, M. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2017. Lecture Notes in Computer Science, vol 10453. Springer, Cham. https://doi.org/10.1007/978-3-319-66332-6_8

  • DOI: https://doi.org/10.1007/978-3-319-66332-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66331-9

  • Online ISBN: 978-3-319-66332-6

  • eBook Packages: Computer Science, Computer Science (R0)
