Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases pp 37-52

Country-Scale Exploratory Analysis of Call Detail Records Through the Lens of Data Grid Models

  • Romain Guigourès
  • Dominique Gay
  • Marc Boullé
  • Fabrice Clérot
  • Fabrice Rossi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9286)

Abstract

Call Detail Records (CDRs) are data recorded by telecommunications companies, consisting of basic information about several dimensions of the calls made through the network: the source, destination, date and time of calls. CDR data analysis has received much attention in recent years since it might reveal valuable information about human behavior. It has shown high added value in many application domains, such as community analysis or network planning.

In this paper, we suggest a generic methodology based on data grid models for summarizing the information contained in CDR data. The method is based on a parameter-free estimation of the joint distribution of the variables that describe the calls. We also suggest several well-founded criteria that allow one to browse the summary at various granularities and to explore it by means of insightful visualizations. The method handles network graph data, temporal sequence data, and user mobility data stemming from the original CDR data. We show the relevance of our methodology on real-world CDR data from Ivory Coast for various case studies, such as network planning strategy and yield management pricing strategy.
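To make the idea of a grid-based summary concrete, the following minimal sketch counts calls per cell of a grid whose dimensions are source group, destination group, and time interval, then normalizes the counts into an empirical joint distribution. It is an illustration under stated assumptions, not the authors' implementation: the function name `build_data_grid`, the toy calls, the antenna groups, and the hour boundaries are all hypothetical and fixed by hand here, whereas the method described above infers the groupings and intervals without user parameters.

```python
from collections import Counter


def build_data_grid(calls, source_group, dest_group, hour_bins):
    """Return the empirical joint distribution over grid cells.

    calls        : iterable of (source, destination, hour) tuples
    source_group : dict mapping each source to a group label (assumed given)
    dest_group   : dict mapping each destination to a group label (assumed given)
    hour_bins    : increasing upper bounds of the time intervals (assumed given)
    """
    def time_interval(hour):
        # Index of the first interval whose upper bound exceeds the hour.
        for index, upper in enumerate(hour_bins):
            if hour < upper:
                return index
        raise ValueError(f"hour {hour} falls outside the given intervals")

    counts = Counter()
    for source, dest, hour in calls:
        cell = (source_group[source], dest_group[dest], time_interval(hour))
        counts[cell] += 1

    total = sum(counts.values())
    # Each cell frequency estimates P(source group, destination group, interval).
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical toy data: four antennas in two hand-picked groups.
calls = [
    ("a1", "a2", 9),
    ("a1", "b1", 10),
    ("a2", "a1", 20),
    ("b1", "b2", 21),
    ("b2", "b1", 22),
]
groups = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
grid = build_data_grid(calls, groups, groups, hour_bins=[12, 24])

for cell, probability in sorted(grid.items()):
    print(cell, probability)
```

Merging groups or intervals in such a grid yields a coarser summary, which is the kind of granularity browsing the abstract refers to; how the merges are chosen is left to the paper's criteria and is not reproduced here.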

Keywords

Classification rule, Bayes theory, Minimum description length

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Romain Guigourès (1)
  • Dominique Gay (2)
  • Marc Boullé (2)
  • Fabrice Clérot (2)
  • Fabrice Rossi (3)

  1. Zalando, Berlin, Germany
  2. Orange Labs Lannion, Lannion, France
  3. SAMM EA 4543, Université Paris 1, Paris, France
