Location identification for real estate investment using data analytics

  • E. Sandeep Kumar
  • Viswanath Talasila
  • Naphtali Rishe
  • T. V. Suresh Kumar
  • S. S. Iyengar


The modeling and control of complex systems, such as transportation, communication, power grids or real estate, require vast amounts of data to be analyzed. The number of variables in the models of such systems is large, typically a few hundred or even thousands. Computing the relationships between these variables, extracting the dominant variables and predicting the temporal and spatial dynamics of the variables are the general focuses of data analytics research. Statistical modeling and artificial intelligence have emerged as crucial solution enablers for these problems. The problem of real estate investment involves social, governmental, environmental and financial factors. Existing work on real estate investment focuses predominantly on predicting house price trends from financial factors alone. In practice, real estate investment is influenced by all of the factors stated above, so computing an optimal choice is a multivariate optimization problem that lends itself naturally to machine learning-based solutions. In this work, we focus on setting up a machine learning framework to identify an optimal location for investment, given the preference set of an investor. In this paper, we restrict the problem to direct real estate factors (bedroom type, garage spaces, etc.); indirect factors, such as social and governmental ones, will be incorporated into future work within the same framework. Two solution approaches are presented. The first uses decision trees and principal component analysis (PCA) with K-means clustering to compute optimal locations. In the second, PCA is replaced by artificial neural networks. The two methods are then contrasted. To the best of our knowledge, this is the first work in which a machine learning framework is introduced to incorporate all realistic parameters influencing the real estate investment decision. The algorithms are verified on the real estate data available in the TerraFly platform.


Real estate investment; Machine learning; Artificial intelligence; Decision trees; Principal component analysis; K-means clustering; Artificial neural networks; Complex systems
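As a rough illustration of the PCA and K-means idea mentioned in the abstract, the sketch below clusters synthetic listings in a reduced feature space and ranks locations by how many of their listings fall in the cluster nearest to an investor's preference. It is not the authors' implementation: the feature set (bedrooms, garage spaces, price), the synthetic data, the farthest-point initialization, the function names and the ranking rule are all assumptions made for this example, and the decision tree step is omitted.

```python
import numpy as np

def pca_fit(X, n_components):
    """Standardize X and return (mean, std, top principal directions)."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0                       # guard against constant features
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]

def pca_transform(X, mean, std, components):
    """Project rows of X onto the fitted principal directions."""
    return ((X - mean) / std) @ components.T

def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

def farthest_point_init(X, k):
    """Deterministic init (an assumption): start at X[0], then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [X[0]]
    for _ in range(k - 1):
        d = pairwise_dist(X, np.array(centroids)).min(axis=1)
        centroids.append(X[d.argmax()])
    return np.array(centroids)

def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = pairwise_dist(X, centroids).argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    labels = pairwise_dist(X, centroids).argmin(axis=1)
    return centroids, labels

def rank_locations(X, location_ids, preference, k=3, n_components=2):
    """Rank locations by listing count in the cluster nearest the preference."""
    mean, std, comps = pca_fit(X, n_components)
    centroids, labels = kmeans(pca_transform(X, mean, std, comps), k)
    pref = pca_transform(preference[None, :], mean, std, comps)
    best = np.linalg.norm(centroids - pref, axis=1).argmin()
    locs, counts = np.unique(location_ids[labels == best], return_counts=True)
    order = np.argsort(-counts)
    return [(int(locs[i]), int(counts[i])) for i in order]

def make_listings(seed=0, per_location=40):
    """Synthetic listings (hypothetical): bedrooms, garage spaces, price in $1000s."""
    rng = np.random.default_rng(seed)
    centers = np.array([[1.0, 0.0, 150.0], [3.0, 1.0, 400.0], [5.0, 3.0, 900.0]])
    jitter = np.array([0.1, 0.1, 10.0])       # bounded noise per feature
    X = np.vstack([c + rng.uniform(-1, 1, size=(per_location, 3)) * jitter
                   for c in centers])
    return X, np.repeat(np.arange(len(centers)), per_location)

X, location_ids = make_listings()
preference = np.array([3.0, 1.0, 400.0])      # investor wants a mid-range home
ranking = rank_locations(X, location_ids, preference)
print(ranking)
```

With these well-separated synthetic locations, the preference lands in the cluster made up of location 1, so that location is ranked first. Real listings would overlap far more, and the number of clusters and retained components would have to be chosen from the data.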




Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • E. Sandeep Kumar (1, corresponding author)
  • Viswanath Talasila (1)
  • Naphtali Rishe (2)
  • T. V. Suresh Kumar (3)
  • S. S. Iyengar (2)

  1. Department of Telecommunication Engineering, M.S. Ramaiah Institute of Technology, Bengaluru, India
  2. School of Computing and Information Sciences, Florida International University, Miami Dade, USA
  3. Department of Computer Applications, M.S. Ramaiah Institute of Technology, Bengaluru, India
