Abstract
Studies have shown that sociodemographic attributes significantly influence individuals' transportation choices. However, not all travel demand models account for this effect when predicting future travel scenarios. Moreover, current integrated urban models (IUMs) that incorporate demographic dynamics mostly rely on conventional logit models and rule-based models. These models may not be optimal for complex modeling because they do not fully capture the non-linear relationships between inputs and outputs. In this research, we explore the feasibility of using machine learning (ML) models, in conjunction with conventional logit models, to enhance the prediction of demographic dynamics within our proposed IUM, known as ‘STELARS’. To address the black-box nature of ML, we employ an explainable AI (xAI) technique to gain insights into the influence of the factors and compare them with the interpretation revealed by the logit models. Three demographic components are modeled this way: marriage/common-law formation, separation and divorce, and childbirth events; the other components are developed using rate-based models. The results on the testing dataset indicate that ML models outperform conventional logit models in overall accuracy by up to 3%. However, for true positive accuracy (correctly predicting the event of interest), a significant improvement of 30–48% is observed. Additionally, the xAI analysis yields an interpretation consistent with that of the logit models. Subsequently, we implement our demographic dynamics module within our integrated urban modeling system to predict population changes in the Okanagan region of Canada. The multi-year validation of the simulation results against Census data suggests a reasonably close prediction of the observed population.
We also optimize the runtime of the demographic dynamics module using vectorization, reducing the simulation time for the demographic changes in our study area (comprising approximately 200,000 individuals living in 85,000 households) to about 100 s for the full 10 years of simulation. The development and implementation of this advanced demographic dynamics module, which predicts the life events of individuals, adds a fundamental capability that STELARS needs in order to be built as an event-based microsimulation model.
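The contrast drawn above between overall accuracy and true positive accuracy can be illustrated with a minimal sketch. The labels and the two "models" below are hypothetical and chosen only for illustration; they are not the study's data or results, and the function names are assumptions rather than anything defined in the article.

```python
def overall_accuracy(y_true, y_pred):
    """Share of all cases (events and non-events) predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def true_positive_rate(y_true, y_pred):
    """Share of actual events (label 1) that are predicted as events."""
    predictions_for_events = [p for t, p in zip(y_true, y_pred) if t == 1]
    hits = sum(p == 1 for p in predictions_for_events)
    return hits / len(predictions_for_events)


# Hypothetical data: 10 life events among 100 person-years (illustrative only).
y_true = [1] * 10 + [0] * 90
# Hypothetical model A: finds 2 of the 10 events, raises 1 false alarm.
model_a = [1] * 2 + [0] * 8 + [1] * 1 + [0] * 89
# Hypothetical model B: finds 6 of the 10 events, raises 3 false alarms.
model_b = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 87

for name, pred in (("A", model_a), ("B", model_b)):
    acc = overall_accuracy(y_true, pred)
    tpr = true_positive_rate(y_true, pred)
    print(f"model {name}: accuracy={acc:.2f}, true positive rate={tpr:.2f}")
```

Because life events are rare, the non-event cases dominate overall accuracy, so two models can differ by only a couple of points on that measure while differing widely in how many actual events they detect.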
Acknowledgements
The authors are grateful for the Discovery Grant provided by the Natural Sciences and Engineering Research Council. The authors would also like to thank Nathan Nichol from the University of British Columbia for proofreading the manuscript and offering helpful suggestions on how to better present the results.
Author information
Contributions
Research Idea: MAK and MF, Methodology: MAK and MF, Data Collection: MAK and MF, Model Development: MAK, MF, and MO. Writing—First Draft: MAK and MF, Writing—Review: MF and MAK, Project Administration: MF, Acquisition of Funding: MF.
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
AI-assisted technologies
During the preparation of this work, the authors used ChatGPT by OpenAI and Bing by Microsoft to proofread some sentences to enhance their readability. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Khalil, M.A., Fatmi, M.R. & Orvin, M. Developing and microsimulating demographic dynamics for an integrated urban model: a comparison between logistic regression and machine learning techniques. Transportation (2024). https://doi.org/10.1007/s11116-024-10468-7
DOI: https://doi.org/10.1007/s11116-024-10468-7