Information Systems Frontiers, Volume 13, Issue 3, pp 337–348

GIS enabled service site selection: Environmental analysis and beyond

  • Junjie Wu
  • Jian Chen
  • Yili Ren


Selecting the right site for a service entity is an important problem that has attracted great attention in the literature. However, quantifying the interrelationships between a service site and the business types near it remains challenging because of the problem's complexity. In this paper, we propose a novel joint learning scheme for service site selection. The scheme applies Probabilistic Latent Semantic Analysis (PLSA) to Geographical Information System (GIS) data and partitional clustering to service performance data. A case study on bank branch selection demonstrates the usefulness of the method. Finally, building on the joint learning scheme, we present a conceptual framework for the complete procedure of service site selection, with particular emphasis on GIS enabled network analysis.
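To make the PLSA ingredient concrete, the following is a minimal sketch of standard PLSA fitted by expectation-maximization, and is not the authors' implementation. It assumes the GIS data can be summarized as a site-by-business-type count matrix; that representation, the toy counts, and the names `plsa`, `normalize`, and `counts` are all hypothetical and chosen only for illustration. The abstract does not say how the scheme couples PLSA with partitional clustering, so that step is not shown.

```python
import random


def normalize(values):
    """Scale non-negative numbers to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM.

    counts[d][w]: assumed number of businesses of type w near site d.
    Returns (p_z_d, p_w_z) with p_z_d[d][z] = P(z | d) and
    p_w_z[z][w] = P(w | z), where z indexes latent topics.
    """
    rng = random.Random(seed)
    n_sites, n_types = len(counts), len(counts[0])
    # Random positive initialization, normalized into distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_sites)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_types)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_sites)]
        new_w_z = [[0.0] * n_types for _ in range(n_topics)]
        for d in range(n_sites):
            for w in range(n_types):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    weight = counts[d][w] * post[z]
                    new_z_d[d][z] += weight
                    new_w_z[z][w] += weight
        # M-step: renormalize expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Toy data (made up): 4 sites, 4 nearby business types.
counts = [
    [5, 3, 0, 0],
    [4, 6, 0, 1],
    [0, 0, 7, 2],
    [1, 0, 3, 6],
]
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

Each row of `p_z_d` is a site's mixture over latent topics, which gives a low-dimensional description of its business environment that could then be set against clusters found in performance data.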


Site selection; Geographical Information System (GIS); Probabilistic Latent Semantic Analysis (PLSA); Joint learning



This research was partially supported by the National Natural Science Foundation of China (NSFC) (nos. 70901002, 71031001, 70890082). Also, we are grateful to the Information Systems Frontiers anonymous referees for their constructive comments on the paper.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. School of Economics and Management, Beihang University, Beijing, China
  2. Research Center for Contemporary Management, Key Research Institute of Humanities and Social Sciences at Universities, School of Economics and Management, Tsinghua University, Beijing, China
