Automated Discovery of Mobile Users Locations with Improved K-means Clustering

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9120)


Location is one of the most commonly used types of contextual information in mobile context-aware systems. It can be considered at many levels of granularity, ranging from geolocation based on GPS to microlocation, which uses Bluetooth Low Energy devices and WiFi access points to locate users inside buildings. The most common use of location is navigation; recently, however, it has also increasingly been treated as an important component of the user profile. One of the biggest challenges in location-based context-aware systems is discovering patterns in user transportation traces and extracting the most frequently visited places. In this paper we present and evaluate a method for automatically extracting clusters from user location traces. These clusters represent user points of interest such as home, work, and favourite restaurants, as well as transportation routines. The original contribution of this work is an approach based on the K-means clustering algorithm, equipped with a module for automatic discovery of the number of clusters and with density-based cluster merging. The method allows for online, adaptable discovery of user points of interest and transportation routines in mobile systems.
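The general idea of clustering location traces and then merging nearby clusters can be illustrated with a minimal sketch. It is not the method proposed in the paper: it runs plain K-means with a deliberately large k and then merges centroids that lie within a fixed radius of each other, which is only a simple stand-in for the automatic discovery of the number of clusters and the density-based merging mentioned above. The function names `kmeans` and `merge_close`, the `radius` parameter, and the synthetic "home" and "work" points are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's K-means on 2-D points; returns the final centroids.

    Assumes 1 <= k <= len(points). Initial centroids are taken at evenly
    spaced positions in the input list (an illustrative choice).
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = []
        for centroid, group in zip(centroids, groups):
            if group:
                updated.append((sum(x for x, _ in group) / len(group),
                                sum(y for _, y in group) / len(group)))
            else:
                updated.append(centroid)  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids


def merge_close(centroids, radius):
    """Greedily merge centroids closer than `radius`.

    Hypothetical stand-in for density-based merging: it looks only at
    centroid distance, not at the density of the underlying points.
    """
    merged = []  # list of (centroid, number of centroids merged into it)
    for c in centroids:
        for i, (m, n) in enumerate(merged):
            if math.dist(c, m) < radius:
                # Fold c into the running average of the merged centroid.
                merged[i] = (((m[0] * n + c[0]) / (n + 1),
                              (m[1] * n + c[1]) / (n + 1)), n + 1)
                break
        else:
            merged.append((c, 1))
    return [m for m, _ in merged]


# Synthetic location fixes in an arbitrary local frame (assumed data):
# a tight 3x3 grid around "home" at (0, 0) and one around "work" at (10, 10).
offsets = (-0.1, 0.0, 0.1)
home = [(dx, dy) for dx in offsets for dy in offsets]
work = [(10 + dx, 10 + dy) for dx in offsets for dy in offsets]

# Over-segment on purpose (k=6), then merge centroids within 1.0 of each other.
places = merge_close(kmeans(home + work, k=6), radius=1.0)
print(len(places), places)
```

Starting with more clusters than expected and merging afterwards means k does not have to be guessed exactly, at the cost of introducing a merge radius that has to be chosen for the data at hand.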


Keywords: Context-awareness, Mobile devices, Clustering, Localisation





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. AGH University of Science and Technology, Krakow, Poland
