Abstract
Personally identifiable information (PII) is widely used in applications such as network privacy leak detection, network forensics, and user profiling. Internet service providers (ISPs) and administrators are usually concerned with whether PII has been extracted during network transmission. However, most studies have focused on extraction on the client side or the server side. This study proposes a static tainting extraction approach that automatically extracts PII from large-scale, ISP-level network traffic without requiring any manual work or feedback. The proposed approach does not deploy any additional applications on the client side. The information flow graph is drawn via a tainting process that involves two steps: inter-domain routing and intra-domain infection, which contains a constraint function (CF) that limits “over-tainting”. Compared with the existing semantic-based approach that uses network traffic from the ISP, the proposed approach performs better, with 92.37% precision and 94.04% recall. Furthermore, three methods that reduce the computing time and the memory overhead are presented. The number of rounds is reduced to 0.0883%, and the execution time overhead to 0.0153%, of those of the original approach.
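The two-step tainting process named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the record layout, the function names `constraint_ok` and `propagate_taint`, the seed set, and the specific constraint rule are all assumptions made for illustration. The sketch reads "inter-domain routing" as a tainted value marking every (domain, key) pair under which it appears, and "intra-domain infection" as other values under a marked key becoming tainted unless the constraint function rejects them.

```python
def constraint_ok(value, min_len=4):
    """Hypothetical constraint function (CF): reject very short or
    boolean-like values, which would otherwise spread taint too widely."""
    return len(value) >= min_len and value.lower() not in {"true", "false", "null"}


def propagate_taint(records, seeds, cf=constraint_ok):
    """Propagate taint over key-value records until a fixed point.

    records: list of (domain, {key: value}) pairs, one per observed request
             (an assumed layout, not taken from the paper).
    seeds:   set of values initially known to be PII.
    Returns (tainted_values, rounds).
    """
    tainted = set(seeds)
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        # Assumed reading of "inter-domain routing": a tainted value marks
        # every (domain, key) pair it appears under, in any domain.
        tainted_keys = {
            (domain, key)
            for domain, fields in records
            for key, value in fields.items()
            if value in tainted
        }
        # Assumed reading of "intra-domain infection": other values under a
        # marked key become tainted, but only if the CF accepts them.
        for domain, fields in records:
            for key, value in fields.items():
                if (domain, key) in tainted_keys and value not in tainted and cf(value):
                    tainted.add(value)
                    changed = True
    return tainted, rounds
```

Calling `propagate_taint(records, {"alice@example.com"})` on a handful of records in which the seed address shares a key with other users' addresses would taint those addresses and follow them into other domains. Short values such as version flags are rejected by the hypothetical CF, so they cannot act as a bridge between unrelated keys.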
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672101, U1636119, 61866038, 61962059).
Cite this article
Liu, Y., Liao, L. & Song, T. Static tainting extraction approach based on information flow graph for personally identifiable information. Sci. China Inf. Sci. 63, 132104 (2020). https://doi.org/10.1007/s11432-018-9839-6
DOI: https://doi.org/10.1007/s11432-018-9839-6