Estimation of missing prices in real-estate market agent-based simulations with machine learning and dimensionality reduction methods

  • Original Article
Neural Computing and Applications

Abstract

The opacity of real-estate markets poses challenges for their agent-based simulation. While some real-estate Web sites publish the prices of a large number of houses, the prices of the remaining houses are not available. Estimating these missing prices is necessary for simulating the evolution of prices from a complete initial set of houses. This estimation could also be useful for other purposes, such as appraising houses, letting buyers know which offered prices are the best (i.e., the lowest ones compared to the appraisals) and recommending that buyers set an initial price. This work proposes combining dimensionality reduction methods with machine learning techniques to obtain the estimated prices. In particular, it analyzes nonnegative matrix factorization, recursive feature elimination and feature selection with a variance threshold as dimensionality reduction methods. It compares linear regression, support vector regression, k-nearest neighbors and a multilayer perceptron neural network as machine learning techniques. A tenfold cross-validation was applied to compare the estimations and errors and to assess the improvement over a basic estimator commonly used at the beginning of simulations. The developed software and the dataset used are freely available from a data research repository for the sake of reproducibility and to support other researchers.
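
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch combines one of the named dimensionality reduction methods (feature selection with a variance threshold) with one of the named learners (k-nearest neighbors regression) and scores it with a tenfold cross-validation against a simple baseline. It is not the authors' software: the data are synthetic, the feature names (area, rooms), the helper functions, the value k = 5 and the choice of the mean training price as the "basic estimator" are all assumptions made for this example, and plain Python is used in place of a machine learning library.

```python
import random
import statistics


def make_synthetic_houses(n=200, seed=0):
    """Build hypothetical (features, price) rows; not real market data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        area = rng.uniform(40, 200)   # assumed feature: floor area in m2
        rooms = rng.randint(1, 5)     # assumed feature: number of rooms
        constant = 1.0                # uninformative zero-variance feature
        price = 1000 * area + 5000 * rooms + rng.gauss(0, 5000)
        rows.append(([area, rooms, constant], price))
    return rows


def variance_threshold(X, threshold=0.0):
    """Return the indices of columns whose variance exceeds the threshold."""
    columns = list(zip(*X))
    return [j for j, col in enumerate(columns)
            if statistics.pvariance(col) > threshold]


def knn_predict(train_X, train_y, x, k=5):
    """Average the prices of the k training houses closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return statistics.mean([y for _, y in dists[:k]])


def tenfold_mae(rows, k_neighbors=5, folds=10, seed=0):
    """Return (kNN MAE, baseline MAE) averaged over all held-out houses."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    knn_errors, base_errors = [], []
    for f in range(folds):
        test_idx = set(indices[f::folds])
        train = [rows[i] for i in indices if i not in test_idx]
        test = [rows[i] for i in test_idx]
        # Select features on the training fold only, to avoid leakage.
        keep = variance_threshold([x for x, _ in train])
        train_X = [[x[j] for j in keep] for x, _ in train]
        train_y = [y for _, y in train]
        mean_price = statistics.mean(train_y)  # assumed basic estimator
        for x, y in test:
            pred = knn_predict(train_X, train_y,
                               [x[j] for j in keep], k_neighbors)
            knn_errors.append(abs(pred - y))
            base_errors.append(abs(mean_price - y))
    return statistics.mean(knn_errors), statistics.mean(base_errors)
```

Calling `tenfold_mae(make_synthetic_houses())` returns the mean absolute error of the k-nearest neighbors estimator and of the baseline, so the improvement can be read off directly. The features are left unscaled here for brevity; with real listings, scaling them before computing distances would usually matter.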

Acknowledgements

This work has been supported by the program “Estancias de movilidad en el extranjero José Castillejo para jóvenes doctores” funded by the Spanish Ministry of Education, Culture and Sport with reference CAS17/00005. This work also acknowledges the research project “Diseño de actividades de aprendizaje colaborativas con Big Data” with reference PIIDUZ_16_120 funded by the University of Zaragoza. We acknowledge the research project “Construcción de un framework para agilizar el desarrollo de aplicaciones móviles en el ámbito de la salud” funded by the University of Zaragoza and Foundation Ibercaja with grant reference JIUZ-2017-TEC-03. We also acknowledge support from “Universidad de Zaragoza,” “Fundación Bancaria Ibercaja” and “Fundación CAI” in the “Programa Ibercaja-CAI de Estancias de Investigación” with reference IT1/18. This work was partially supported by the Spanish Research grant MTM2015-65433-P (MINECO/FEDER), Gobierno de Aragón and Fondo Social Europeo. Furthermore, we acknowledge the “Fondo Social Europeo” and the “Departamento de Tecnología y Universidad del Gobierno de Aragón” for their joint support with grant number Ref-T81.

Author information

Corresponding author

Correspondence to Iván García-Magariño.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest regarding this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

García-Magariño, I., Medrano, C. & Delgado, J. Estimation of missing prices in real-estate market agent-based simulations with machine learning and dimensionality reduction methods. Neural Comput & Applic 32, 2665–2682 (2020). https://doi.org/10.1007/s00521-018-3938-7

  • DOI: https://doi.org/10.1007/s00521-018-3938-7
