GALG: Linking Addresses in Tracking Ecosystem Using Graph Autoencoder with Link Generation

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13718)

Abstract

Online tracking technology is a critical tool that lets practitioners on user-centric platforms link users across multiple web pages and build detailed user profiles to improve recommender systems such as targeted advertising. Recently, dynamic address allocation and security upgrades have acted as indirect mitigations that make prior tracking techniques unreliable. To overcome this problem, traffic-based tracking techniques have been proposed to link users’ dynamic addresses through similarity learning of user behaviors in their traffic interactions. However, prior work either provides poor similarity learning ability or is impractical at large scale. In this paper, we propose GALG, a graph-based artificial intelligence approach that links addresses for user tracking on TLS encrypted traffic. GALG uses a graph autoencoder framework with adversarial training to learn user embeddings that capture semantics and distributions. Employing a new theory, link generation, GALG can link all the addresses of target users based on knowledge of address-service links. When evaluated on real-world user datasets, GALG outperforms existing approaches in both performance and practicality.
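
The abstract does not spell out GALG's architecture, so the following is only a rough illustration of the general idea of scoring address pairs from address-service links. It is a sketch under stated assumptions, not the authors' method: the toy data, the `encode` and `link_score` helpers, and the untrained stand-in encoder are all hypothetical, and the sigmoid inner-product decoder is the one commonly used in generic graph autoencoders rather than anything confirmed for GALG.

```python
import math

# Hypothetical toy data: the services each address was seen contacting.
address_services = {
    "addr_a": {"mail.example", "video.example", "news.example"},
    "addr_b": {"mail.example", "video.example"},
    "addr_c": {"games.example"},
}

def encode(address_services):
    """Stand-in encoder (assumption, untrained): embed each address as the
    L2-normalized indicator vector over the services it contacted."""
    services = sorted(set().union(*address_services.values()))
    embeddings = {}
    for addr, used in address_services.items():
        vec = [1.0 if s in used else 0.0 for s in services]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        embeddings[addr] = [v / norm for v in vec]
    return embeddings

def link_score(z_u, z_v):
    """Inner-product decoder: sigmoid of the dot product of two embeddings."""
    dot = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-dot))

embeddings = encode(address_services)
# Addresses sharing services score higher than addresses sharing none.
score_ab = link_score(embeddings["addr_a"], embeddings["addr_b"])
score_ac = link_score(embeddings["addr_a"], embeddings["addr_c"])
```

In a real graph autoencoder the encoder would be a trained graph neural network; here it is replaced by a fixed normalization purely to keep the sketch self-contained.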

Acknowledgements

This work is supported by the National Key Research and Development Program of China No. 2020YFE0200500 and the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02040400.

Author information

Corresponding author

Correspondence to Chang Liu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cui, T., Xiong, G., Liu, C., Shi, J., Fu, P., Gou, G. (2023). GALG: Linking Addresses in Tracking Ecosystem Using Graph Autoencoder with Link Generation. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13718. Springer, Cham. https://doi.org/10.1007/978-3-031-26422-1_17

  • DOI: https://doi.org/10.1007/978-3-031-26422-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26421-4

  • Online ISBN: 978-3-031-26422-1

  • eBook Packages: Computer Science, Computer Science (R0)
