Abstract
Forecasting travel demand is a classic problem in transportation planning. The models made for this purpose take the socioeconomic characteristics of a subset of a population to estimate the total demand, mainly using random utility models. However, with machine learning algorithms fast becoming key instruments in many transportation applications, the past decade has seen the rapid development of such models for travel demand forecasting. As these algorithms rely on fewer prior assumptions, have high pattern recognition ability, and often offer promising results, they can be effective alternatives to discrete choice models for forecasting trip patterns. This paper aimed to predict mandatory and non-mandatory trip patterns using a Deep Neural Network (DNN) algorithm. A dataset containing Metropolitan Washington Council of Government Transportation Planning Board (MWCOGTPB) 2007–2008 survey data and a dataset containing traffic analysis zone (TAZ) characteristics were prepared to extract and predict these patterns. After the modeling phase, the models were evaluated based on accuracy and Cohen’s kappa coefficient. The estimates of mandatory and non-mandatory trips were found to have an accuracy of 70.87% and 50.02%, respectively. The results showed that a DNN could find the relationship between socioeconomic factors and trip patterns. This can be helpful for transportation planners when they are trying to predict travel demand.
1 Introduction
The trip pattern of individuals is based on demographic characteristics and environmental factors, e.g., accessibility (Srinivasan & Ferreira, 2002). These data are collected mainly for a small percentage of an area’s population, which enables transportation modelers to identify similar patterns and estimate the trip demand of a wider population. There is an unambiguous relationship between demographic characteristics, activity participation, and travel behavior (Cheng et al., 2019). Trip pattern choice is a function of the need to participate in activities dispersed across the urban environment and of individual and household characteristics, including a set of alternatives and limitations. Accessibility also plays a crucial role in such pattern recognition since the generation of work and other trips is sensitive to accessibility (Cordera et al., 2017; Currans et al., 2020; Næss et al., 2018; Pitombo et al., 2011; Stead, 2001).
Random utility models (RUMs) are the primary tool for travel demand prediction. While these models allow interpretability, they are less flexible than deep learning models. This matters because they may capture travelers’ choices poorly: the formation of travel behavior patterns is inherently complex and can involve intra-household interactions affected by the built environment (Yang et al., 2019).
Whether machine learning models or random utility models are chosen, there is a trade-off between higher accuracy and interpretability (Ali et al., 2021; Lee et al., 2018; García-García et al., 2022). The higher complexity of machine learning models allows them to identify non-linear relationships between the inputs and designated outputs to a higher degree than random utility models, which yields more accurate predictions. Despite this advantage in prediction, machine learning models deprive the modeler of the interpretability and statistical tools that random utility models provide. This means that the mechanism operating in a choice procedure, and the extent to which each element contributes to the final decision, are unknown when using a machine learning model. On the other hand, despite their simplicity, random utility models often prove computationally expensive, or even infeasible, when fed with a large volume of data (Wang et al., 2021).
There is substantial evidence that people with similar socioeconomic backgrounds show comparable travel behaviors (Carlsson-Kanyama & Linden, 1999; Li et al., 2018). Men, for instance, tend to engage in more business and work-related activities, while women engage in more leisure activities such as visiting family or seeing friends (Collins & Tisdell, 2002). Women prefer to commute over shorter distances, at off-peak hours, or by using flexible modes of transport (Ng & Acker, 2018). Income is another element that influences travel behavior; income level may change travel habits such as the distance traveled (Jain & Tiwari, 2019). Because travel patterns have been shown to vary systematically across demographic groups, they can be inferred and calculated using socioeconomic data.
A number of researchers have sought to explore essential variables in the formation of household travel behaviors. Bhat et al. (2013) proposed a household activity production model in Southern California to find how all individuals in a household make their decisions about activity participation. The daily trips of an individual can also be classified into distinct patterns. In that regard, Hedau and Sanghai (2014) classified daily trips into five patterns to develop an activity-trip choice model using multiple variables. Molla et al. (2017) introduced a probabilistic activity-based travel generation model, which could infer the actual number of trips generated. They assumed that small organizations in an urban area could create activity-based models based on traditional trip surveys. In all of these studies, socioeconomic characteristics were considered significant in generating activity-based trips.
With such a relationship in place, some studies have investigated the derivation of socioeconomic characteristics from travel patterns. Zhu et al. (2017) predicted people’s sociodemographic variables such as work status, age, gender, and income from GPS data by training SVM and logistic regression models. Among the most important differentiating characteristics they utilized for categorization were variables linked to the spatiotemporal variability of tours. Spatiotemporal data obtained from public transit smart cards may also be utilized to study trip patterns (Yang et al., 2018). Li et al. (2019) used large-scale data across three age groups to lessen the usage of survey data in the design of human-centered public transportation. The study concentrated on predicting age groups based on travel to various “points of interest” retrieved from trip destinations. Among the ML approaches trained and compared, the neural network (NN) produced the best results. Zhang and Chen (2018) estimated vehicle ownership, age, gender, and income using attributes extracted from smart card data. After testing multiple supervised ML algorithms, they concluded that the NN produced the best results. While that study manually extracted features from the data before feeding them to ML models, Zhang, Cheng, and Sari Aslam (2019) used a convolutional neural network (CNN) to undertake the same investigation without the need for feature extraction. CNNs are extensively utilized in cutting-edge image processing models, but they may also be trained to recognize hidden patterns in non-image data. Similarly, after training and comparing many ML models, Zhang and Cheng (2019) predicted job status from London’s public transportation smart card data and found that the CNN performed best in their scenario.
Whether traffic analysis zone (TAZ) parameters are incorporated into a model or not can influence trip patterns and, consequently, travel demand forecasts. Some researchers highlight this fact. For example, an urban accessibility relative index (UARI) was developed to integrate the collected multi-mode transportation big data (related to the taxis, buses, and subways) to quantify, visualize and understand the spatiotemporal patterns of accessibility in urban areas (Jiang et al., 2021). Neglecting an accessibility characteristic could lead to incorrect interpretations of travel demand forecasting for non-mandatory trips when modes other than private cars are used, or mandatory trips are made by private cars (Cordera et al., 2017). In addition, population density can affect trip generation (Zhang, Clifton, et al., 2019).
Thanks to the abundance of mobility data and their pattern recognition ability, machine learning algorithms have removed many practical limitations. They are widely used in the forecasting and analysis of travel behavior in activity-travel patterns, including supervised (e.g., neural networks and support vector machines) and unsupervised (e.g., K-means clustering) learning (Koushik et al., 2020). These algorithms can find complex travel behavior patterns through the relationship between spatiotemporal and socioeconomic characteristics and estimate trip patterns. Despite this benefit, such techniques have rarely been employed in activity-based modeling and trip forecasting. Among the few examples, a support vector machine (SVM) was employed to recognize and forecast daily activity sequences (Allahviranloo & Recker, 2013). Furthermore, using a hybrid logit-SVM model, Yang et al. (2016) considered the role of the head of household in forecasting the number of household trips.
Despite their strength in pattern recognition, machine learning algorithms are primarily non-interpretable, meaning that the process through which the final output is produced is unknown to the modeler. For this reason, these models are often referred to as “black boxes.” Nevertheless, a combination of unsupervised algorithms (e.g., clustering algorithms) and supervised algorithms (e.g., decision trees) can sometimes enable interpretability. Pitombo et al. (2011) used this modeling approach to analyze the relationship between travel patterns and activity, land use, and socioeconomic characteristics. In the same vein, Hafezi et al. (2019) proposed a model to identify activity patterns by combining K-means clustering and the classification and regression trees (CART) algorithm. After recognizing homogeneous patterns, they developed the CART model to allow a more in-depth analysis.
This study proposes a novel deep learning method to predict future travel demand based on how the population distribution across designated demographic traits (e.g., age, gender, income) changes over time. The proposed deep learning model is expected to predict future travel demand more accurately than conventional random utility models. A DNN model is developed to predict trip patterns in two categories: mandatory and non-mandatory. For this purpose, the socioeconomic characteristics of people residing in the Washington metropolitan area, coupled with the characteristics of their TAZs, were used.
2 Data
The socioeconomic characteristics extracted from the Metropolitan Washington Council of Government Transportation Planning Board (MWCOGTPB) 2007–2008 survey data were used to train a trip-pattern-predicting model. This data set contains the travel behavior and demographics of 11,000 households in the Washington metropolitan area, including Northern Virginia and some parts of Maryland. The Transportation Planning Board (TPB) periodically conducts the survey to evaluate the transportation system’s effectiveness with respect to the transportation demand of the households. The participants completed a detailed one-day travel diary between February 2007 and March 2008. Although each participant’s data covers only 24 hours, it was collected on different days of the week, which provides a complete picture of the travel behavior in the area. Moreover, the mandatory and non-mandatory patterns are expected to recur throughout the week, especially given the inflexible nature of mandatory trips. Therefore, the results can be confidently generalized to the area’s population.
Although more recent data would have been preferable, it was essential to incorporate the characteristics of each traffic analysis zone to improve the results and reduce bias, and such data was only available from 2007. As the infrastructural condition of the area has developed since then, we needed to match the collection timelines of the two datasets. Nonetheless, the TPB data is one of the most comprehensive survey datasets available, and we focused our attention mainly on the algorithm, which can be trained on any data from any year.
A set of distinct variables typically available in travel surveys and census data was selected for inclusion in the model. As with any data, this data had to be preprocessed before training the machine learning model. Once prepared, the data consisted of TAZ characteristics and the socioeconomic characteristics of individuals, which are shown in Table 1 and Table 2. The features used to train the model were of three types: continuous, categorical, and binary. The categorical data were organized as dummy variables, while the continuous variables were scaled to the range 0–1. This allowed faster model convergence (lower computational cost), resulting from the lower variance of each feature.
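The two preprocessing steps described above (dummy variables for categorical features and 0–1 scaling for continuous ones) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline; the column values and category names are hypothetical and are not taken from Table 1 or Table 2.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map every value to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical column as dummy (0/1) columns, one per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical toy records; names and values are illustrative only.
ages = [8, 35, 62, 44]
employment = ["student", "full_time", "retired", "full_time"]

scaled_ages = min_max_scale(ages)
categories, dummies = one_hot(employment)

print(scaled_ages)
print(categories)
print(dummies)
```

In practice the minimum and maximum would be computed on the training split only and reused for any held-out data, so that no information leaks from the evaluation set.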
There are a total of 3722 TAZs in the Washington metropolitan area, which differ in urban infrastructure depending on whether they are in- or out-of-city zones. This study included TAZ features such as public transportation access, population density, and employment density. The spatial distribution of the participants at the scale of TAZ is shown in Fig. 1.
Table 3 represents the mandatory and non-mandatory trip patterns and their share of data. Mandatory trips were defined based on the following assumption. Mandatory trips are inflexible, meaning they must take place at a specific time and last for a predefined amount of time. They are not made on a voluntary basis, nor is there a choice for the time of their occurrence. Some trips, such as grocery shopping, are necessary for the household, but they are not counted as mandatory based on this assumption.
Mandatory trip patterns were divided into seven groups based on the number of work, educational, and school trips, and one group was also considered for those with no mandatory trips. On the other hand, the remainder of the trip purposes were labeled as non-mandatory (e.g., shopping trips, visiting relatives, and recreational trips). These were classified into five groups based on the number of their occurrences. Since the number of individuals with no non-mandatory trips was tiny (0.2%), they were excluded from the modeling. The mandatory-trip model predicts a combination of trip purposes as categorical classes; for example, “1 work and 1 education” is a single class. School trips refer to trips made for the purpose of formal education—that is, school, university, and college—while educational trips refer to trips made for other educational activities, such as learning music or language.
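To make the labeling idea concrete, the sketch below turns one person's list of trip purposes into a mandatory-pattern class and a non-mandatory-pattern class. It is only an illustration under stated assumptions: the purpose codes, the label strings, and the cap of four for non-mandatory trips are hypothetical choices, and the actual class definitions are those listed in Table 3.

```python
from collections import Counter

# Assumed purpose codes for mandatory trips (hypothetical names).
MANDATORY = ("work", "educational", "school")


def mandatory_pattern(purposes):
    """Build a class label such as '1 work and 1 educational' from trip purposes."""
    counts = Counter(p for p in purposes if p in MANDATORY)
    parts = [f"{counts[p]} {p}" for p in MANDATORY if counts[p] > 0]
    return " and ".join(parts) if parts else "no mandatory trip"


def non_mandatory_pattern(purposes, cap=4):
    """Count non-mandatory trips, merging counts at or above `cap` into one class."""
    n = sum(1 for p in purposes if p not in MANDATORY)
    return f"{cap}+" if n >= cap else str(n)


# One hypothetical travel diary.
diary = ["work", "shopping", "educational", "recreation"]

print(mandatory_pattern(diary))
print(non_mandatory_pattern(diary))
```

Treating each combination as a single categorical class, as the text describes, lets one classifier output the whole daily pattern rather than predicting each purpose count separately.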
3 Methodology
A deep neural network algorithm was used to train the developed model. While a shallow neural network typically consists of a small number of hidden layers, deep neural networks are created by stacking many hidden layers on top of each other. This makes the model bigger and more complex. Although training deeper models means longer training time (higher computational cost), it should be noted that the deeper a neural network, the more likely it is to identify and pick up more complex and non-linear patterns from the dataset.
A node in an NN is a processing unit with weights and a sum function. A weight w is a numerical value representing the relative strength of a connection that transfers data from one layer to another, while the sum function y calculates the weighted sum of all input variables in a processing unit. The signal appearing at the output of neuron j is calculated as:
$$y_j = \sum_{i=1}^{m} w_{ji} x_i + b_j \tag{1}$$
where $m$ is the number of variables introduced to neuron $j$, $x_i$ is the $i$-th input variable of neuron $j$, $y_j$ is the output of neuron $j$, $w_{ji}$ is the weight from neuron $i$ to neuron $j$, and $b_j$ is the bias term. An activation function is required to introduce non-linearity into the neural network. It defines a non-linear relationship between the input and output of a node and a network. The present study adopted the softmax activation function in the output layer:
$$\operatorname{softmax}(y)_i = \frac{e^{y_i}}{\sum_{c=1}^{k} e^{y_c}}, \quad i = 1, \dots, k \tag{2}$$
where $y$ is the input vector and $k$ is the number of classes in multiclass classification.
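The weighted sum of a neuron and the softmax output layer described above can be illustrated with a few lines of plain Python. This is a sketch of a single output layer only; the input values, weights, and biases are made-up numbers, not parameters from the trained model.

```python
import math


def neuron_output(x, w, b):
    """Weighted sum of the inputs plus the bias term for one neuron."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def softmax(y):
    """Convert a vector of k scores into probabilities that sum to 1."""
    shift = max(y)  # subtracting the max avoids overflow and leaves the result unchanged
    exps = [math.exp(v - shift) for v in y]
    total = sum(exps)
    return [e / total for e in exps]


x = [0.2, 0.5, 1.0]            # hypothetical scaled input features
weights = [[0.4, -0.1, 0.3],   # one row of weights per output neuron
           [0.1, 0.2, -0.5],
           [-0.3, 0.6, 0.2]]
biases = [0.0, 0.1, -0.1]

scores = [neuron_output(x, w, b) for w, b in zip(weights, biases)]
probs = softmax(scores)
predicted_class = probs.index(max(probs))

print(scores)
print(probs)
print(predicted_class)
```

The predicted class is simply the index with the largest probability; in a deep network, the inputs to this layer would be the activations of the last hidden layer rather than the raw features.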
We scaled each variable to the range 0–1. Scaling speeds up gradient descent, the step-wise optimization algorithm used to train a neural network, which works hand in hand with back-propagation. Starting from randomly assigned weights, the algorithm takes smaller or larger steps (depending on the learning rate) to minimize a cost function. At each step, it computes an error term, back-propagates the derivatives of the cost function with respect to the weights (which involve the derivatives of the activation functions), and adjusts the weights accordingly. Constraining each variable to a limited range prevents excessively large gradients at each update, which helps the optimization converge in fewer updates and, consequently, shortens the total training time. Scaling also improves the performance and stability of the optimization process (Bishop, 1995).
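The update rule behind gradient descent can be shown on a deliberately small problem. The sketch below fits a one-input linear model with a mean squared error cost; it is a toy stand-in for the idea, not back-propagation through a deep network, and the data, learning rate, and step count are arbitrary assumptions.

```python
def gradient_descent(xs, ys, learning_rate=0.5, steps=500):
    """Fit y = w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0  # fixed start for reproducibility; real training starts from random weights
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n  # derivative of the cost w.r.t. w
        grad_b = 2 * sum(errors) / n                             # derivative of the cost w.r.t. b
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 0.25, 0.5, 0.75, 1.0]   # inputs already scaled to the 0-1 range
ys = [1.0, 1.5, 2.0, 2.5, 3.0]     # generated by y = 2x + 1

w, b = gradient_descent(xs, ys)
print(w, b)
```

With inputs in the 0–1 range, a fairly large learning rate stays stable here; if the same inputs were left on a scale of hundreds or thousands, the same learning rate would make the updates diverge, which is one way to see why scaling helps.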
The sample was collected from a random subset of the population. Sample bias and data imbalance are persistent challenges in such cases. Dominant groups tend to overshadow less frequent observations while a machine learning algorithm is trained. This means DNNs can perform well on uniformly distributed datasets, while their performance on imbalanced datasets cannot be ensured (Wang et al., 2016). One way of dealing with the imbalance problem is to augment underrepresented categories; however, this distorts the distribution of the randomly sampled data, so the data will no longer represent the actual population. Instead of synthesizing samples, which would have undermined the validity of our analysis, we used class weighting during the training phase of the deep learning models, penalizing the error in the cost function according to each class’s share of samples in the data.
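One common way to implement class weighting is sketched below. The paper does not state the exact weighting formula, so the inverse-frequency rule used here, n_samples / (n_classes × class_count), is an assumption; it is the same heuristic that scikit-learn applies for its "balanced" class weights. The labels and probabilities are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Mean cross-entropy where each sample's loss is scaled by its class weight."""
    losses = [-weights[y] * math.log(p[y]) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


# Hypothetical, imbalanced label column.
labels = ["none"] * 6 + ["1 work"] * 3 + ["1 school"] * 1
weights = balanced_class_weights(labels)

# The same predicted probability (0.5) costs more for a rare class than for a common one.
loss_rare = weighted_cross_entropy(["1 school"], [{"1 school": 0.5}], weights)
loss_common = weighted_cross_entropy(["none"], [{"none": 0.5}], weights)

print(weights)
print(loss_rare, loss_common)
```

Because the weights only rescale the loss, the training data keep their original distribution, which is the property the text relies on when preferring weighting over synthetic augmentation.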
The predictions of mandatory and non-mandatory trip patterns from socioeconomic characteristics were both formulated as classification problems. The Python programming language was used to implement the preprocessing and modeling in this paper.
3.1 The evaluation criteria
3.1.1 Accuracy, precision, recall, F1-score
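The classification can be evaluated through the true positives (TP), the number of correctly included classes; true negatives (TN), the number of correctly excluded classes; false positives (FP), the number of wrongly included classes; and false negatives (FN), the number of wrongly excluded classes. These four criteria form a confusion matrix for the classification (Sokolova & Lapalme, 2009). In this respect, accuracy, precision, recall, and F1-score can be calculated as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$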
The accuracy of an ML model indicates how often it was correct overall, while precision measures how well a model predicts a specific category. Precision is a suitable metric when the cost of FPs is high. When a considerable cost is associated with FNs, recall is the appropriate measure for choosing the best model. The F1-score is helpful when seeking a balance between precision and recall. It may also be appropriate when there is an uneven class distribution. Considering the aforementioned criteria collectively was the best way to choose the final model, because it made every aspect of the model’s performance clear.
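For a multiclass problem, these metrics are usually computed per class from the confusion matrix, treating one class as "positive" and all others as "negative". The sketch below shows this in plain Python, with rows as true labels and columns as predictions; the matrix values are invented for illustration and are not the paper's results.

```python
def per_class_metrics(matrix, cls):
    """Precision, recall and F1 for one class; rows are true labels, columns are predictions."""
    tp = matrix[cls][cls]
    fp = sum(row[cls] for row in matrix) - tp   # predicted as cls but actually another class
    fn = sum(matrix[cls]) - tp                  # actually cls but predicted as another class
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(matrix):
    """Share of all samples that lie on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical 3-class confusion matrix.
confusion = [[50, 5, 5],
             [10, 20, 10],
             [5, 5, 40]]

overall_accuracy = accuracy(confusion)
precision0, recall0, f1_0 = per_class_metrics(confusion, 0)

print(overall_accuracy)
print(precision0, recall0, f1_0)
```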
3.1.2 Kappa coefficient
Cohen’s kappa coefficient helps evaluate multiclass classification problems with unevenly distributed classes. This criterion measures the agreement between classified data (Landis & Koch, 1977). Because our data was imbalanced, the kappa coefficient was calculated along with the other evaluation indices to measure the model’s performance effectively. This coefficient is expressed by Eq. (7) as follows:
$$\kappa = \frac{p_0 - p_e}{1 - p_e} \tag{7}$$
where $p_0$ denotes the observed relative agreement between two datasets, while $p_e$ is the probability of random agreement between the datasets. It is required to define boundaries for the calculated coefficients to perform the evaluation. Although different performance levels have been suggested for interpreting the value of the kappa coefficient, the scoring system proposed by Landis and Koch was adopted (Table 4). The kappa coefficient varies from 0 to 1 at six evaluation levels. A larger kappa coefficient represents the higher efficiency and effectiveness of a model.
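A short sketch of how the kappa coefficient can be computed from a confusion matrix is given below. The observed agreement is the share of samples on the diagonal, and the chance agreement is obtained from the row and column totals. The matrix is hypothetical; the level names follow the bands published by Landis and Koch (1977), which are assumed here to match Table 4.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    p0 = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(row[i] for row in matrix) for i in range(k)]
    pe = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p0 - pe) / (1 - pe)


def landis_koch_level(kappa):
    """Map a kappa value to the agreement level of Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical 3-class confusion matrix.
confusion = [[50, 5, 5],
             [10, 20, 10],
             [5, 5, 40]]

kappa = cohen_kappa(confusion)
print(kappa, landis_koch_level(kappa))
```

Unlike accuracy, kappa discounts the agreement that a classifier would reach by always favoring the dominant classes, which is why it is informative for imbalanced data.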
4 Results
The mandatory and non-mandatory trip patterns in the Washington metropolitan area were estimated in this study. The socioeconomic characteristics were extracted from the MWCOGTPB 2007–2008 data, and TAZ characteristics were included. Then, a DNN algorithm was formulated to predict trip patterns.
Figure 2 displays the mandatory trip pattern estimates as a confusion matrix. The vertical axis represents the real values (correct labels), and the horizontal axis represents the estimates. The color of each square represents the probability of correct estimations. Table 5 reports each class’s accuracy, precision, recall, and F1-score.
A total of seven classes were predicted for mandatory trips, including “no mandatory trip.” The estimation accuracy of mandatory trips was 70.87%, indicating promising performance. As its high recall and precision scores show, the “no mandatory trip” class was predicted more accurately than the other classes. Most individuals in the group combining “1 work trip” and “1 school trip” were predicted inaccurately, while individuals with only “1 work trip” were estimated with high accuracy.
Individuals with work trips were rarely confused with individuals who had educational trips. This distinction could be attributed to their socioeconomic characteristics, e.g., age and occupational-educational position. In other words, individuals aged 0 to 18 are mostly students, so they are not expected to generate work trips. In many cases, individuals with two work trips were wrongly predicted as those with one work trip, but the model could differentiate the work from educational trips. In addition, the class of three work trips was mainly confused with other groups of work trips, suggesting that the model was relatively inefficient in recognizing the number of work trips.
Educational trips were the second group of mandatory trips. The pattern of one educational trip was predicted with reasonable accuracy; however, this pattern was confused with the “no mandatory trip” pattern and was rarely predicted as a work trip pattern. The patterns of “one work trip and one educational trip” and “one educational trip and one school trip” (patterns combining multiple purposes) had low estimation accuracies. The former was mostly confused with work trips, while the latter was recognized as school trips. Finally, the “one school trip” pattern had high accuracy.
Apart from the outputs of mandatory trip patterns, Table 6 and Fig. 3 present the estimation results of non-mandatory trips. The model yielded an overall prediction accuracy of 50.02% for non-mandatory trips. The patterns of “one non-mandatory trip” and “four or more non-mandatory trips” had the highest estimation accuracy, followed by the “two non-mandatory trips” pattern. In contrast, the “three non-mandatory trips” pattern had the lowest accuracy. This performance seems acceptable since non-mandatory trips involve a wide range of trip purposes, from buying gas to visiting relatives and recreation.
As presented in Table 7, the kappa coefficient was calculated to be 0.5853 for mandatory trips, which is a moderate coefficient on the Landis and Koch scale, while non-mandatory trips had a kappa coefficient of 0.3014, a fair value, with an accuracy of 50.02%. This implies acceptable performance for both mandatory and non-mandatory trips.
5 Discussion
This study presents a novel deep learning framework for forecasting future travel demand. A DNN model was used to discover a meaningful relationship between socioeconomic characteristics and accessibility measures on one side, and trip patterns on the other side. The proposed model is expected to outperform traditional random utility models in predicting future travel demand. The prediction ability of this model can be deployed on census data to generalize and synthesize the trip behavior of the area’s entire population. In other words, the predicted patterns can be aggregated on a specific geospatial scale (for example, TAZ) to estimate trip productions and attractions as the population distribution across demographic attributes (e.g., age, gender, income) and urban accessibility change over time. This will help in outlining planning and policy measures.
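The aggregation step mentioned above can be sketched as a simple count of predicted patterns per zone. The snippet is illustrative only: the TAZ identifiers and pattern labels are hypothetical, and the predictions are assumed to come from applying a trained model to synthetic or census-based person records.

```python
from collections import Counter, defaultdict


def aggregate_by_taz(predictions):
    """Count predicted trip patterns per TAZ from (taz_id, pattern) pairs."""
    totals = defaultdict(Counter)
    for taz_id, pattern in predictions:
        totals[taz_id][pattern] += 1
    return totals


# Hypothetical model outputs for five persons living in two zones.
predictions = [
    ("TAZ_12", "1 work"),
    ("TAZ_12", "1 work"),
    ("TAZ_12", "no mandatory trip"),
    ("TAZ_47", "1 school"),
    ("TAZ_47", "1 work"),
]

totals = aggregate_by_taz(predictions)
for taz_id, counts in sorted(totals.items()):
    print(taz_id, dict(counts))
```

Multiplying each pattern count by the number of trips it implies would give a zone-level estimate of trips produced, which is the quantity a planner would compare across scenarios.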
Given the accuracy gap between mandatory and non-mandatory patterns’ prediction results, it is apparent that the separate modeling of mandatory and non-mandatory trips helped the deep learning model map socioeconomics to trip patterns in a more distinguishable manner, owing to the nature of each trip category and its relation to socioeconomics. The relation between socioeconomics and spatial features was more recognizable for mandatory trip patterns, which could be attributed to the role of each individual in the household. Socioeconomics such as age, income, and gender define this role, hence affecting the creation of mandatory trips, with each category dependent directly and distinctly on the assigned role. Additionally, mandatory trips are inflexible, meaning they are not conducted voluntarily. Therefore, the question of what pattern an individual has as a routine part of their travel diary is easier to answer because there is less flexibility and thus more certainty about their occurrence.
The reverse reasoning applies to non-mandatory trips. This category of trips is made under less strict circumstances. Such trips can take an arbitrary form and are not necessarily based on a predefined or recurring schedule. This means that, even though socioeconomics plays a key role in the formation of trip patterns, it might not be as influential in the creation of non-mandatory trips. Whether and when non-mandatory trips occur is harder to relate to socioeconomics, so there is much less certainty regarding this category. This explains the lower prediction accuracy for non-mandatory trips.
Based on the literature, adding land use data to the input of similar machine learning models often improves the results, because land-use and socioeconomic characteristics are the main impetus for creating trips. Unfortunately, we could not access the land use information of the area, so we trained the models on related features extractable from the data at hand (travel survey and accessibility measures). Thus, future work could use a more comprehensive set of inputs to improve the results.
With the rapid growth of big data technologies, especially GPS (Global Positioning System) data, the inference of socioeconomic information also seems a promising direction for future work. Socioeconomic information is one of the main inputs of travel demand models, and relating it to mobility data, which are continually and passively collected via sensors in our cellphones and cars, can reduce the cost of survey data collection as well as help transportation modelers and policy designers draw more meaningful conclusions on how mobility is linked to socioeconomic characteristics. Future studies could use state-of-the-art deep learning models to find such linkage.
6 Conclusion
The present study aimed to forecast trip patterns based on socioeconomic and TAZ characteristics. Once the mandatory and non-mandatory trip patterns and socioeconomic characteristics were extracted from the MWCOGTPB 2007–2008 survey data, a DNN was trained to classify these patterns. Mandatory trips included work, education, school, or a combination of such trips, while non-mandatory trips involved the remaining trips of the individuals. The model had an estimation accuracy of 70.87% for mandatory trips (seven groups of trips) and 50.02% for non-mandatory trips (four trip groups). The estimates of mandatory and non-mandatory trips were observed to be significantly different. Mandatory trip patterns with a single trip type were forecast more accurately than combined mandatory trips, regardless of the number of trips. In addition to accuracy, Cohen’s kappa coefficient was calculated to validate the model’s predictive performance. The results of this study showed that a deep learning algorithm could effectively recognize the correlation between socioeconomic features and trip pattern formation. The prediction results of this model can then be aggregated on a larger geospatial scale to estimate trip productions and attractions as population distribution and urban accessibility change over time. This provides transportation modelers with a more accurate tool in the process of travel demand forecasting.
Availability of data and materials
The datasets are available from www.mwcog.org on request.
References
Ali, N. F. M., Sadullah, A. F. M., Majeed, A. P., Razman, M. A. M., Zakaria, M. A., & Nasir, A. F. A. (2021). Travel mode choice modeling: Predictive efficacy between machine learning models and discrete choice model. The Open Transportation Journal, 15(1). https://doi.org/10.2174/1874447802115010241
Allahviranloo, M., & Recker, W. (2013). Daily activity pattern recognition by using support vector machines with multiple classes. Transportation Research Part B: Methodological, 58, 16–43.
Bhat, C. R., Goulias, K. G., Pendyala, R. M., Paleti, R., Sidharthan, R., Schmitt, L., & Hu, H.-H. (2013). A household-level activity pattern generation model with an application for Southern California. Transportation, 40(5), 1063–1086.
Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford University Press.
Carlsson-Kanyama, A., & Linden, A.-L. (1999). Travel patterns and environmental effects now and in the future: Implications of differences in energy consumption among socio-economic groups. Ecological Economics, 30(3), 405–417.
Cheng, L., Chen, X., Yang, S., Wu, J., & Yang, M. (2019). Structural equation models to analyze activity participation, trip generation, and mode choice of low-income commuters. Transportation Letters, 11(6), 341–349.
Collins, D., & Tisdell, C. (2002). Gender and differences in travel life cycles. Journal of Travel Research, 41(2), 133–143.
Cordera, R., Coppola, P., dell’Olio, L., & Ibeas, Á. (2017). Is accessibility relevant in trip generation? Modelling the interaction between trip generation and accessibility taking into account spatial effects. Transportation, 44(6), 1577–1603.
Currans, K. M., Abou-Zeid, G., Clifton, K. J., Howell, A., & Schneider, R. (2020). Improving transportation impact analyses for subsidized affordable housing developments: A data collection and analysis of motorized vehicle and person trip generation. Cities, 103, 102774.
Lee, D., Derrible, S., & Pereira, F. C. (2018). Comparison of four types of artificial neural network and a multinomial logit model for travel mode choice modeling. Transportation Research Record, 2672(49), 101–112.
García-García, J. C., García-Ródenas, R., López-Gómez, J. A., & Martín-Baos, J. Á. (2022). A comparative study of machine learning, deep neural networks and random utility maximization models for travel mode choice modelling. Transportation Research Procedia, 62, 374–382.
Hafezi, M. H., Liu, L., & Millward, H. (2019). A time-use activity-pattern recognition model for activity-based travel demand modeling. Transportation, 46(4), 1369–1394.
Hedau, A. L., & Sanghai, S. (2014). Development of trip generation model using activity based approach. International Journal of Civil, Structural, Environmental and Infrastructure Engineering Research and Development, 4(3), 61–78.
Jain, D., & Tiwari, G. (2019). Explaining travel behaviour with limited socio-economic data: Case study of Vishakhapatnam, India. Travel Behaviour and Society, 15, 44–53.
Jiang, Y., Guo, D., Li, Z., & Hodgson, M. E. (2021). A novel big data approach to measure and visualize urban accessibility. Computational Urban Science, 1(1), 1–15.
Koushik, A. N., Manoj, M., & Nezamuddin, N. (2020). Machine learning applications in activity-travel behaviour research: A review. Transport Reviews, 40(3), 288–311.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
Li, C., Bai, L., Liu, W., Yao, L., & Waller, S. T. (2019). Passenger demographic attributes prediction for human-centered public transport. International Conference on Neural Information Processing.
Li, J., Lo, K., & Guo, M. (2018). Do socio-economic characteristics affect travel behavior? A comparative study of low-carbon and non-low-carbon shopping travel in Shenyang City, China. International Journal of Environmental Research and Public Health, 15(7), 1346.
Molla, M. M., Stone, M. L., & Motuba, D. (2017). Developing an activity-based trip generation model for small/medium size planning agencies. Transportation Planning and Technology, 40(5), 540–555.
Næss, P., Peters, S., Stefansdottir, H., & Strand, A. (2018). Causality, not just correlation: Residential location, transport rationales and travel behavior across metropolitan contexts. Journal of Transport Geography, 69, 181–195.
Ng, W.-S., & Acker, A. (2018). Understanding urban travel behaviour by gender for efficient and equitable transport policies.
Pitombo, C. S., Kawamoto, E., & Sousa, A. J. (2011). An exploratory analysis of relationships between socioeconomic, land use, activity participation variables and travel patterns. Transport Policy, 18(2), 347–357.
Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427–437.
Srinivasan, S., & Ferreira, J. (2002). Travel behavior at the household level: Understanding linkages with residential choice. Transportation Research Part D: Transport and Environment, 7(3), 225–242.
Stead, D. (2001). Relationships between land use, socioeconomic factors, and travel patterns in Britain. Environment and Planning B: Planning and Design, 28(4), 499–528.
Wang, S., Liu, W., Wu, J., Cao, L., Meng, Q., & Kennedy, P. J. (2016). Training deep neural networks on imbalanced data sets. 2016 International Joint Conference on Neural Networks (IJCNN).
Wang, S., Mo, B., Hess, S., & Zhao, J. (2021). Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: An empirical benchmark. arXiv preprint arXiv:2102.01130.
Yang, C., Yan, F., & Ukkusuri, S. V. (2018). Unraveling traveler mobility patterns and predicting user behavior in the Shenzhen metro system. Transportmetrica A: Transport Science, 14(7), 576–597.
Yang, S., Deng, W., Deng, Q., & Fu, P. (2016). The research on prediction models for urban family member trip generation. KSCE Journal of Civil Engineering, 20(7), 2910–2919.
Yang, S., Fan, Y., Deng, W., & Cheng, L. (2019). Do built environment effects on travel behavior differ between household members? A case study of Nanjing, China. Transport Policy, 81, 360–370.
Zhang, Q., Clifton, K. J., Moeckel, R., & Orrego-Oñate, J. (2019). Household trip generation and the built environment: Does more density mean more trips? Transportation Research Record, 2673(5), 596–606.
Zhang, Y., & Chen, G. (2018). Inferring social-demographics of travellers based on smart card data. Proceedings of the 2nd International Conference on Advanced Research Methods and Analytics (CARMA 2018).
Zhang, Y., & Cheng, T. (2019). A deep learning approach to infer employment status of passengers by using smart card data. IEEE Transactions on Intelligent Transportation Systems, 21(2), 617–629.
Zhang, Y., Cheng, T., & Sari Aslam, N. (2019). Deep learning for demographic prediction based on smart card data and household survey. Proceedings of the 27th Conference on GIS Research UK (GISRUK).
Zhu, L., Gonder, J., & Lin, L. (2017). Prediction of individual social-demographic role based on travel behavior variability using long-term GPS data. Journal of Advanced Transportation, 2017, Article ID 7290248. https://doi.org/10.1155/2017/7290248
Acknowledgements
Not applicable.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Author information
Contributions
Hamid Mirzahossein: Supervision, Conceptualization, Methodology, Validation. Ali Bakhtiari: Data curation, Visualization, Software, Writing (original draft preparation). Navid Kalantari: Data curation, Visualization, Investigation, Validation. Xia Jin: Supervision, Writing (review and editing). All authors read and approved the final manuscript.
Ethics declarations
Consent for publication
The authors consent to the publication of the submitted paper and any associated data and accompanying images.
Competing interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mirzahossein, H., Bakhtiari, A., Kalantari, N. et al. Investigating mandatory and non-mandatory trip patterns based on socioeconomic characteristics and traffic analysis zone features using deep neural networks. Comput. Urban Sci. 2, 35 (2022). https://doi.org/10.1007/s43762-022-00063-w
DOI: https://doi.org/10.1007/s43762-022-00063-w