Improved imputation of rule sets in class association rule modeling: application to transportation mode choice

Transportation

Abstract

Predicting transportation mode choice is a critical component of forecasting travel demand. Machine learning methods have recently become increasingly popular for this task. Class association rules (CARs) have been applied to transportation mode choice, but using the imputed rules for prediction remains a long-standing challenge. Building on CARs, this paper proposes a new rule-merging approach, called CARM, to improve predictive accuracy. First, CARs are imputed from the frequent pattern tree (FP-tree) using the frequent pattern growth (FP-growth) algorithm. Next, the rules are pruned based on the pessimistic error rate. Finally, the rules are merged to form new rules without increasing predictive error. Using the 2015 Dutch National Travel Survey, the performance of the proposed model is compared with that of CARIG, which uses the information gain statistic to generate new rules, class-based association rules (CBA), decision trees (DT) and the multinomial logit (MNL) model. The proposed model is assessed using a ten-fold cross-validation test. The results show that the proposed model achieves an accuracy of 91.1%, outperforming CARIG, CBA, DT and the MNL model.
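The ten-fold cross-validation protocol mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the majority-class stand-in classifier and the class labels `"car"` and `"bike"` are hypothetical, and the 60 toy observations are generated only for illustration.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]

def cross_val_accuracy(records, labels, fit, predict, k=10, seed=0):
    """Average the per-fold accuracy of a classifier given as fit/predict callables."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(records), k, seed):
        model = fit([records[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, records[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Hypothetical stand-in classifier: always predict the majority training class.
def fit_majority(train_records, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, record):
    return model

toy_records = list(range(60))                  # placeholder observations
toy_labels = ["car"] * 40 + ["bike"] * 20      # hypothetical mode labels
mean_accuracy = cross_val_accuracy(toy_records, toy_labels, fit_majority, predict_majority)
```

Any rule-based classifier could be substituted for the majority-class stand-in by supplying its own `fit` and `predict` callables.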

Acknowledgements

This work was supported by the China Scholarship Council.

Author information

Contributions

The authors contributed to the paper as follows: study design: J. Zhang, T. Feng, H.J.P. Timmermans and Z. Lin; computer programming and analysis: J. Zhang; interpretation of results: J. Zhang, T. Feng, H.J.P. Timmermans and Z. Lin; draft manuscript preparation: J. Zhang and H.J.P. Timmermans. All authors reviewed the draft manuscript and approved the final version of the manuscript.

Corresponding author

Correspondence to Zhengkui Lin.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

An example dataset with 3 classes contains 60 observations: the training dataset runs from \({x}_{1}\) to \({x}_{45}\), and the test dataset from \({x}_{46}\) to \({x}_{60}\). Details are listed in Table 7.

Table 7 An example dataset

Appendix 2

Table 8 lists the 152 rules (CARs) obtained from the FP-tree; slashes mark redundant rules. The rules have been ordered in accordance with the requirements.

Table 8 CARs obtained from the FP-tree
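To make the notions of rule support, confidence and ordering concrete, the following is a minimal sketch. It does not implement FP-growth: it enumerates condition sets by brute force, which is only practical for toy data. The ordering used (higher confidence, then higher support, then fewer conditions) is the precedence commonly associated with CBA and is assumed here, since the ordering requirements are not restated in this appendix. The attribute names, values and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

def mine_cars(records, labels, min_support=0.1, min_confidence=0.5, max_len=2):
    """Brute-force CAR mining; returns (conditions, class, support, confidence) tuples."""
    n = len(records)
    cond_counts = Counter()   # occurrences of each condition set
    rule_counts = Counter()   # occurrences of each condition set together with a class
    for record, label in zip(records, labels):
        items = sorted(record.items())
        for size in range(1, max_len + 1):
            for conds in combinations(items, size):
                cond_counts[conds] += 1
                rule_counts[(conds, label)] += 1
    rules = []
    for (conds, label), hits in rule_counts.items():
        support = hits / n
        confidence = hits / cond_counts[conds]
        if support >= min_support and confidence >= min_confidence:
            rules.append((conds, label, support, confidence))
    # Assumed precedence: higher confidence, then higher support, then fewer conditions.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules

# Hypothetical toy observations (not the data of Table 7).
records = [
    {"distance": "short", "car_owner": "no"},
    {"distance": "short", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
]
labels = ["bike", "bike", "car", "car"]
rules = mine_cars(records, labels, min_support=0.25, min_confidence=0.6)
```

An FP-tree based miner would return the same frequent condition sets without enumerating every combination, which matters once the number of attributes grows.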

Appendix 3

After pruning, 47 rules remain, as listed in Table 9. These rules are applied to the training dataset to obtain \(cCAR\_record\) and \(wCAR\_record\). From \(cCAR\_record\) and \(wCAR\_record\), \(cCAR\), \(wCAR\) and \(wcCAR\) are found.

Table 9 CARs \(\boldsymbol{prR}\) after pruning
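As an illustration of pruning by pessimistic error rate, the sketch below uses the C4.5-style upper confidence bound on a rule's observed error rate. The confidence factor of 25% and the keep/prune comparison between a general rule and a more specific one are assumptions made for this example; the exact criterion used in the paper is not restated in this appendix.

```python
from math import sqrt
from statistics import NormalDist

def pessimistic_error(errors, covered, cf=0.25):
    """Upper confidence bound on the error rate of a rule (C4.5-style estimate).

    errors  -- number of covered training observations the rule misclassifies
    covered -- number of training observations matching the rule's conditions
    cf      -- confidence factor (assumed 0.25, the usual C4.5 default)
    """
    if covered == 0:
        return 1.0
    z = NormalDist().inv_cdf(1.0 - cf)   # one-sided standard normal quantile
    f = errors / covered                 # observed error rate
    n = covered
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)

def keep_specific_rule(general, specific, cf=0.25):
    """Assumed criterion: keep the more specific rule only if its bound is lower.

    Each argument is an (errors, covered) pair measured on the training data.
    """
    return pessimistic_error(*specific, cf=cf) < pessimistic_error(*general, cf=cf)

small_sample = pessimistic_error(2, 6)     # same observed rate, few observations
large_sample = pessimistic_error(20, 60)   # same observed rate, more observations
```

With the same observed error rate, the bound is tighter for the rule that covers more observations, so rules backed by little evidence are penalised.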

Appendix 4

\(FrequentRules\) and \(RareRules\) are listed in Tables 10 and 11, respectively. An entry such as 2 + 13 means that the conditions of rule 2 and rule 13 in Table 9 are merged to form a new rule, whose class label is the same as that of rule 13. For each new rule, Local_Confidence and Local_Support are calculated.

Table 10 Frequent rules
Table 11 Rare rules
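The merging step can be sketched as follows, under two stated assumptions: merging is taken to mean the conjunction of both rules' conditions, and the statistics computed are the ordinary support and confidence on the training data. The paper's Local_Confidence and Local_Support may be defined differently, since their definitions are not given in this appendix. All attribute names and rules below are hypothetical.

```python
def merge_rules(rule_a, rule_b):
    """Combine the conditions of two rules; the new rule keeps rule_b's class label."""
    conds_a, _ = rule_a
    conds_b, label_b = rule_b
    merged = dict(conds_a)
    for attr, value in conds_b.items():
        if merged.get(attr, value) != value:
            return None   # conflicting values can never match the same observation
        merged[attr] = value
    return merged, label_b

def support_confidence(rule, records, labels):
    """Plain support and confidence of a rule on a labelled dataset."""
    conds, label = rule
    covered = [lab for rec, lab in zip(records, labels)
               if all(rec.get(a) == v for a, v in conds.items())]
    hits = sum(lab == label for lab in covered)
    support = hits / len(records)
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence

# Hypothetical training observations and rules (not those of Table 9).
train_records = [
    {"distance": "short", "car_owner": "no"},
    {"distance": "short", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
]
train_labels = ["bike", "bike", "car", "car"]
rule_x = ({"distance": "long"}, "car")
rule_y = ({"car_owner": "yes"}, "car")

new_rule = merge_rules(rule_x, rule_y)
new_support, new_confidence = support_confidence(new_rule, train_records, train_labels)
```

In this toy case the merged rule is more specific than `rule_y` alone and raises confidence from 2/3 to 1.0 while keeping the same support.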

Appendix 5

Table 12 Predictions for the training and test datasets using the first rule of \(\boldsymbol{FrequentRules}\)
Table 13 Predictions for the training and test datasets using the first rule of \(\boldsymbol{RareRules}\)
Table 14 Comparison of results (%) for the training and test datasets
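A minimal sketch of how predictions and accuracy percentages of this kind could be computed is given below. It assumes that "first rule" means the first rule in an ordered list whose conditions match the observation, with a default class used when no rule matches. The rules, observations, class labels and default class are hypothetical and do not reproduce Tables 12 to 14.

```python
def predict_first_match(ordered_rules, record, default_label):
    """Return the class of the first rule whose conditions all hold for the record."""
    for conds, label in ordered_rules:
        if all(record.get(attr) == value for attr, value in conds.items()):
            return label
    return default_label   # no rule matched

def accuracy_percent(ordered_rules, records, labels, default_label):
    """Share of observations predicted correctly, in percent."""
    hits = sum(predict_first_match(ordered_rules, rec, default_label) == lab
               for rec, lab in zip(records, labels))
    return 100.0 * hits / len(records)

# Hypothetical ordered rule list and data.
ordered_rules = [
    ({"distance": "short"}, "bike"),
    ({"car_owner": "yes"}, "car"),
]
train_records = [
    {"distance": "short", "car_owner": "no"},
    {"distance": "short", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
    {"distance": "long", "car_owner": "yes"},
]
train_labels = ["bike", "bike", "car", "car"]
test_records = [
    {"distance": "long", "car_owner": "no"},
    {"distance": "short", "car_owner": "yes"},
]
test_labels = ["transit", "bike"]

train_accuracy = accuracy_percent(ordered_rules, train_records, train_labels, "car")
test_accuracy = accuracy_percent(ordered_rules, test_records, test_labels, "car")
```

Reporting both figures side by side, as Table 14 does, shows how far accuracy on the training data overstates accuracy on unseen data.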

About this article

Cite this article

Zhang, J., Feng, T., Timmermans, H. et al. Improved imputation of rule sets in class association rule modeling: application to transportation mode choice. Transportation 50, 63–106 (2023). https://doi.org/10.1007/s11116-021-10238-9

  • DOI: https://doi.org/10.1007/s11116-021-10238-9
