Extracting human behavior patterns from DNS traffic

Annals of Telecommunications

Abstract

Over recent decades, the Internet has become a fundamental part of human culture. Because users go online for everyday tasks and routines, their habits leave patterns in network traffic. These patterns also appear in DNS (Domain Name System) traffic, since DNS is critical to the operation of the Internet. This work presents a procedure to detect and extract some of those human patterns by applying machine learning techniques to real DNS data. Network traffic retrieved from an authoritative DNS server of .cl, the ccTLD (country-code top-level domain) of Chile, was processed as multiple time series for pattern extraction, a data structure that requires specialized techniques. The procedure has two stages: first, a clustering analysis that groups domains by their activity, so that each group's behavior can be analyzed over time and persistent patterns identified; second, association rule extraction, which retrieves specific differences in activity between the groups. Finding human patterns in the data could be of high interest to researchers who study human behavior in Internet usage. Applying the proposed procedure revealed trends and patterns in DNS traffic that were consistent across different time portions of the data.
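
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes synthetic hourly query counts for made-up domains, swaps in a small Euclidean k-medoids loop for the clustering stage, and computes support and confidence by hand for a single association rule. Every name in it (`daily_profile`, `k_medoids`, `rule_metrics`, the `*.example` domains, the 8:00 to 18:00 business-hours window) is hypothetical.

```python
import math
from statistics import mean, pstdev

def daily_profile(peak_hour, scale):
    """Hypothetical hourly query counts with one smooth daily peak (not real .cl data)."""
    return [scale * (1 + math.exp(-((h - peak_hour) ** 2) / 8.0)) for h in range(24)]

# Made-up domains: three peak during business hours, three in the evening.
domains = {
    "office-a.example": daily_profile(11, 100),
    "office-b.example": daily_profile(12, 900),
    "office-c.example": daily_profile(10, 40),
    "leisure-a.example": daily_profile(21, 300),
    "leisure-b.example": daily_profile(20, 60),
    "leisure-c.example": daily_profile(22, 700),
}

def z_normalize(series):
    """Rescale to zero mean and unit variance so shape, not volume, drives the distance."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma if sigma else 0.0 for x in series]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(dist, medoids):
    """Label each series with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(series, k, max_iter=20):
    """Small k-medoids loop: assign to the nearest medoid, then re-pick each medoid."""
    n = len(series)
    dist = [[euclidean(series[i], series[j]) for j in range(n)] for i in range(n)]
    # Deterministic seeding: start at index 0, then add the point farthest from the chosen medoids.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # The new medoid is the member with the smallest total distance to its cluster.
            best = min(members, key=lambda i: sum(dist[i][j] for j in members)) if members else medoids[c]
            updated.append(best)
        if updated == medoids:
            break
        medoids = updated
    return assign(dist, medoids), medoids

# Stage 1: cluster domains by the shape of their daily activity.
names = list(domains)
labels, medoids = k_medoids([z_normalize(domains[name]) for name in names], k=2)

# Stage 2: describe each domain as a set of items, then score one rule between them.
def peak_item(series):
    hour = max(range(24), key=lambda h: series[h])
    return "peak=business_hours" if 8 <= hour < 18 else "peak=off_hours"

transactions = [{f"cluster={labels[i]}", peak_item(domains[name])}
                for i, name in enumerate(names)]

def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    both = sum(1 for t in transactions if antecedent <= t and consequent <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    return both / len(transactions), (both / ante if ante else 0.0)

support, confidence = rule_metrics(
    transactions, {f"cluster={labels[0]}"}, {"peak=business_hours"})
print(dict(zip(names, labels)))
print(f"support={support:.2f} confidence={confidence:.2f}")
```

On this toy data the three business-hours domains fall into one cluster and the three evening domains into the other, so the rule from the first cluster to `peak=business_hours` has support 0.50 and confidence 1.00. Real DNS series are noisier and may be shifted in time, which is why a shape-aware distance and a dedicated rule-mining library would likely be preferable in practice.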

Author information

Corresponding author

Correspondence to Martín Panza.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Panza, M., Madariaga, D. & Bustos-Jiménez, J. Extracting human behavior patterns from DNS traffic. Ann. Telecommun. 77, 407–420 (2022). https://doi.org/10.1007/s12243-021-00888-2

  • DOI: https://doi.org/10.1007/s12243-021-00888-2
