Volume 39, Issue 4, pp 773–789

The usefulness of the Sequence Alignment Methods in validating rule-based activity-based forecasting models

  • George Sammour
  • Tom Bellemans
  • Koen Vanhoof
  • Davy Janssens
  • Bruno Kochan
  • Geert Wets


This paper aims at a better understanding of rule-based activity-based models by proposing a new level of validation, at the process model level, for A Learning-based Transportation Oriented Simulation System (ALBATROSS). To that end, the work activity process model, which comprises six decision steps, is investigated. Each decision step is evaluated during the prediction of individuals' schedules. Certain decision steps affect the execution pattern of the work activity process model: execution of the process model contains activation dependencies, so the sequence of decision steps that is executed and evaluated branches for each agent under examination. Sequence Alignment Methods (SAM) can be used to evaluate how similar or dissimilar the predicted and observed decision sequences are at the agent level. The original Chi-squared Automatic Interaction Detector (CHAID) decision trees used at each decision step in ALBATROSS are compared with other well-known induction methods chosen for the purpose of the analysis. The models are validated at four levels: the classifier or decision step level, using confusion matrix statistics; the work activity trip Origin–Destination matrix level; the time-of-day work activity start time level, using a correlation coefficient; and the process model level, using SAM. The validation results at the proposed process model level are consistent with those at the other validation levels. In addition, they provide further insight into the behavior of the process model. Hence, introducing this new level of validation yields new knowledge, helps assess the predictive performance of rule-based activity-based models, and assists in identifying critical decision steps in the work activity process model.
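To make the agent-level comparison concrete, the following is a minimal sketch of one common sequence alignment measure, a Levenshtein-style edit distance between an observed and a predicted decision sequence. It is an illustration only and not necessarily the SAM variant used in the paper: the function name `alignment_distance`, the unit costs, and the decision labels in the example are all assumptions.

```python
from typing import Hashable, Sequence


def alignment_distance(
    observed: Sequence[Hashable],
    predicted: Sequence[Hashable],
    indel_cost: float = 1.0,  # assumed cost of inserting or deleting one decision
    sub_cost: float = 1.0,    # assumed cost of substituting one decision for another
) -> float:
    """Minimum total cost of edits turning `observed` into `predicted`."""
    n, m = len(observed), len(predicted)
    # prev[j] = cost of aligning the first i-1 observed items with the first j predicted items
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            same = observed[i - 1] == predicted[j - 1]
            curr[j] = min(
                prev[j] + indel_cost,                      # delete an observed decision
                curr[j - 1] + indel_cost,                  # insert a predicted decision
                prev[j - 1] + (0.0 if same else sub_cost), # match or substitute
            )
        prev = curr
    return prev[m]


# Hypothetical decision labels for one agent; the real step outcomes are not given in the abstract.
observed = ["work", "1_episode", "long", "am_start", "zone_A", "car"]
predicted = ["work", "1_episode", "long", "zone_A", "bike"]
distance = alignment_distance(observed, predicted)  # one deletion plus one substitution
```

A distance of zero means the predicted decision sequence reproduces the observed one exactly; larger values indicate more skipped, extra, or differing decisions. Because branching changes which steps are executed, an alignment measure of this kind can compare sequences of different lengths, which a position-by-position comparison cannot.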


Activity-based models validation; Sequence Alignment Methods; Classification methods; Process models



Copyright information

© Springer Science+Business Media, LLC. 2012

Authors and Affiliations

  • George Sammour (1)
  • Tom Bellemans (1)
  • Koen Vanhoof (1)
  • Davy Janssens (1)
  • Bruno Kochan (1)
  • Geert Wets (1)
  1. Transportation Research Institute (IMOB), Faculty of Applied Economics, Hasselt University, Diepenbeek, Belgium
