Robust website fingerprinting through resource loading sequence

World Wide Web

Abstract

A website fingerprinting (WF) attack is a traffic analysis technique that extracts a unique fingerprint from the traffic generated by visiting a website, demonstrating that the privacy protection currently provided by HTTPS remains fragile. Prior WF attacks extract fingerprints from the Web traffic of the first TCP flow and are therefore easily undermined by frequent website updates. We observe that a website can still be identified accurately by fingerprinting the resource loading sequence generated by multiple TCP flows. We record the multiple TCP flows produced during a website visit and analyse their traffic structure. We find that, despite updates to the website, the pattern of TCP connection establishment usually remains unchanged, so the TCP sequence can be used to fingerprint the website. Hence, we use multiple TCP flows for website fingerprinting attacks and demonstrate their high accuracy in recognizing a website even under HTTPS protection. We collect data from 20 websites over a span of six months and show that the accuracy and robustness are significantly higher than those of state-of-the-art WF solutions.

Notes

  1. The browser is Chrome (version 101.0.1210.47), and the visit took place on 25 July 2022.

  2. We construct RLSeq with a sliding window; that is, a fresh RLSeq structure is constructed whenever a new flow arrives.
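
The sliding-window construction in note 2 can be illustrated with a minimal sketch. It is not the authors' implementation: the `Flow` record, its `server` field, the `window` size, and the choice to represent an RLSeq as a tuple of server labels are all assumptions made for illustration, since the article's actual RLSeq structure is not shown on this page.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Flow:
    """One TCP flow observed during a page visit (hypothetical record)."""
    start_time: float  # time at which the TCP connection was established
    server: str        # label of the remote endpoint (assumed feature)


def sliding_rlseq(flows: Iterable[Flow], window: int) -> Iterator[Tuple[str, ...]]:
    """Yield a fresh sequence each time a new flow arrives.

    Flows are ordered by establishment time, and only the most recent
    `window` flows are kept, so each yielded tuple is one window snapshot.
    """
    recent: Deque[Flow] = deque(maxlen=window)  # oldest flow drops out automatically
    for flow in sorted(flows, key=lambda f: f.start_time):
        recent.append(flow)
        yield tuple(f.server for f in recent)


if __name__ == "__main__":
    observed = [Flow(0.0, "a"), Flow(0.2, "c"), Flow(0.1, "b")]
    for seq in sliding_rlseq(observed, window=2):
        print(seq)
```

Each arriving flow produces a new snapshot rather than waiting for the page load to finish, which is one plausible reading of "constructed whenever a new flow arrives".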

Funding

This work is supported in part by the National Key Research and Development Program of China No. 2019QY1301; the NSFC-General Technology Basic Research Joint Funds under Grant U1836214; NSFC-61872265; the New Generation of Artificial Intelligence Science and Technology Major Project of Tianjin under 19ZXZNGX00010.

Author information

Contributions

Changzhi Li prepared the experiments and figures; Lihai Nie prepared the main manuscript text; all authors reviewed the manuscript.

Corresponding author

Correspondence to Laiping Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Changzhi Li and Lihai Nie contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, C., Nie, L., Zhao, L. et al. Robust website fingerprinting through resource loading sequence. World Wide Web 26, 2329–2349 (2023). https://doi.org/10.1007/s11280-023-01138-2

  • DOI: https://doi.org/10.1007/s11280-023-01138-2
