Abstract
Online tracking technology is a critical tool that lets practitioners on user-centric platforms link users across multiple web pages and build detailed user profiles to improve recommender systems such as targeted advertising. Recently, dynamic address allocation and security upgrades have acted as indirect mitigations that make prior tracking techniques unreliable. To overcome this problem, traffic-based tracking techniques have been proposed that link users' dynamic addresses through similarity learning over the user behaviors observed in their traffic interactions. However, prior work either provides poor similarity learning ability or is impractical when applied at large scale. In this paper, we propose GALG, a graph-based artificial intelligence approach that links addresses for user tracking on TLS encrypted traffic. GALG uses a graph autoencoder framework with adversarial training to learn user embeddings that capture both semantics and distributions. Employing a new theory, link generation, GALG can link all the addresses of target users based on knowledge of address-service links. When evaluated on real-world user datasets, GALG outperforms existing approaches in both performance and practicality.
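To make the graph autoencoder idea concrete, the following is a minimal, untrained sketch of a generic graph autoencoder forward pass (one linear GCN-style encoder layer with an inner-product decoder) over a toy address-service graph. It is not GALG itself: the toy graph, the random weights, the embedding size, and the function names `normalize_adjacency` and `gae_forward` are all assumptions introduced for illustration, and adversarial training and link generation are omitted.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as used by GCN layers."""
    a_tilde = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_tilde.sum(axis=1)                  # node degrees (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_forward(adj, features, weight):
    """Encode nodes with one linear GCN layer, decode with an inner product."""
    a_hat = normalize_adjacency(adj)
    z = a_hat @ features @ weight              # node embeddings
    recon = 1.0 / (1.0 + np.exp(-(z @ z.T)))   # sigmoid(Z Z^T): link scores
    return z, recon

# Hypothetical toy graph: nodes 0-2 are addresses, nodes 3-4 are services.
n_addr, n_serv = 3, 2
links = [(0, 3), (1, 3), (1, 4), (2, 4)]       # assumed address-service links
n = n_addr + n_serv
adj = np.zeros((n, n))
for a, s in links:
    adj[a, s] = adj[s, a] = 1.0

rng = np.random.default_rng(0)
features = np.eye(n)                           # featureless nodes: identity input
weight = rng.normal(scale=0.5, size=(n, 2))    # untrained weights (assumption)

z, recon = gae_forward(adj, features, weight)
addr_scores = recon[:n_addr, :n_addr]          # address-to-address link scores
```

With trained weights, the address-to-address block of the reconstructed matrix could be thresholded to propose links between addresses. Here the weights are random, so the scores only show the data flow and carry no meaning.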
Acknowledgements
This work is supported by the National Key Research and Development Program of China No. 2020YFE0200500 and the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02040400.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cui, T., Xiong, G., Liu, C., Shi, J., Fu, P., Gou, G. (2023). GALG: Linking Addresses in Tracking Ecosystem Using Graph Autoencoder with Link Generation. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13718. Springer, Cham. https://doi.org/10.1007/978-3-031-26422-1_17
DOI: https://doi.org/10.1007/978-3-031-26422-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26421-4
Online ISBN: 978-3-031-26422-1
eBook Packages: Computer Science, Computer Science (R0)