Volume 20, Issue 2, pp 327–349

Exploring cell tower data dumps for supervised learning-based point-of-interest prediction (industrial paper)

  • Ran Wang
  • Chi-Yin Chow
  • Yan Lyu
  • Victor C. S. Lee
  • Sarana Nutanong
  • Yanhua Li
  • Mingxuan Yuan


Exploring massive mobile data for location-based services has become one of the key challenges in mobile data mining. In this paper, we investigate the problem of finding a correlation between the collective behavior of mobile users and the distribution of points of interest (POIs) in a city. Specifically, we use large-scale data dumps collected from cell towers and POIs extracted from a popular social network service, Weibo. Our objective is to use the data from these two different types of sources to build a model for predicting the POI densities of different regions in the covered area. One application domain that may benefit from our research is business recommendation, where a prediction result can be used as a recommendation for opening a new store or branch. The crux of our contribution is the method of representing the collective behavior of mobile users as a histogram of connection counts over a period of time in each region. This representation enables us to apply a supervised learning algorithm to our problem, training a POI prediction model with the POI data set as the ground truth. We studied 12 state-of-the-art classification and regression algorithms; experimental results demonstrate the feasibility and effectiveness of the proposed method.
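
The representation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the record format `(region_id, hour)`, the choice of 24 hourly bins, the toy data, the density labels, and the helper names `connection_histograms` and `predict_density_class`. A 1-nearest-neighbour rule stands in for the supervised learner; it is not claimed to be one of the 12 algorithms the authors studied.

```python
from collections import defaultdict
from math import dist

BINS = 24  # assumption: one histogram bin per hour of the day


def connection_histograms(records, bins=BINS):
    """Count connections per region and time bin.

    records: iterable of (region_id, hour) pairs with 0 <= hour < bins.
    Returns a dict mapping region_id to a list of per-bin counts.
    """
    hists = defaultdict(lambda: [0] * bins)
    for region_id, hour in records:
        hists[region_id][hour] += 1
    return dict(hists)


def predict_density_class(train, query_hist):
    """Return the label of the training histogram closest to query_hist.

    train: list of (histogram, poi_density_label) pairs, where the label
    plays the role of the POI ground truth. Closeness is Euclidean distance.
    """
    _, label = min(train, key=lambda pair: dist(pair[0], query_hist))
    return label


# Made-up connection records: region A and C are busy at 9:00, B at 2:00.
records = [
    ("A", 9), ("A", 9), ("A", 18),
    ("B", 2),
    ("C", 9), ("C", 9), ("C", 17),
]
hists = connection_histograms(records)

# Regions A and B have known (hypothetical) POI density labels; C is unseen.
train = [(hists["A"], "high"), (hists["B"], "low")]
print(predict_density_class(train, hists["C"]))  # prints "high"
```

Region C is assigned the label of region A because their hourly connection profiles are closest. In practice the histogram would serve as the feature vector for whichever classifier or regressor is being evaluated.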


Spatio-temporal data analysis; Classification; Regression; Cell tower data dumps; Point-of-interest prediction



R. Wang and C.-Y. Chow were partially supported by a CityU research grant (CityU Project No. 9231131). S. Nutanong was partially supported by a CityU research grant (CityU Project No. 7200387). This work was also supported by the National Natural Science Foundation of China under Grant 61402460.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Ran Wang (1)
  • Chi-Yin Chow (1)
  • Yan Lyu (1)
  • Victor C. S. Lee (1)
  • Sarana Nutanong (1)
  • Yanhua Li (2)
  • Mingxuan Yuan (3)

  1. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
  2. Department of Computer Science, Worcester Polytechnic Institute (WPI), Worcester, USA
  3. Huawei Noah’s Ark Lab, Shatin, Hong Kong
