Abstract
Improving the efficiency of Public Transportation (PT) networks is a major goal of any urban area authority. Advances in both location and communication devices have drastically increased the availability of the data generated by their operations. Adequate Machine Learning methods can thus be applied to identify patterns useful for improving the Schedule Plan. In this paper, we propose a fully automated learning framework to determine the best Schedule Coverage to be assigned to a given PT network based on Automatic Vehicle Location (AVL) and Automatic Passenger Counting (APC) data. We formulate this problem as a clustering one, where the best number of clusters is selected through an ad-hoc metric. This metric takes into account multiple domain constraints, computed using Sequence Mining and Probabilistic Reasoning. A case study from a large operator in Sweden was selected to validate our methodology. Experimental results suggest necessary changes to the Schedule Coverage. Moreover, an impact study was conducted through a large-scale simulation over the affected time period. Its results uncovered potential improvements in schedule reliability on a large scale.
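To make the idea of "clustering with a data-driven choice of the number of clusters" concrete, the following is a minimal sketch, not the method of the paper. It assumes a generic stand-in for the ad-hoc metric (the mean silhouette width), a plain one-dimensional k-means, and hypothetical round-trip times per day; the names `kmeans_1d`, `silhouette`, `choose_k` and `round_trip_minutes` are illustrative only. The domain constraints computed through Sequence Mining and Probabilistic Reasoning are not modeled here.

```python
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd's k-means on scalars with a deterministic, quantile-based start."""
    ordered = sorted(values)
    n = len(ordered)
    # Initial centroids: evenly spaced order statistics (arbitrary but repeatable).
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: recompute each centroid; keep the old one if a cluster is empty.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = mean(members)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def silhouette(values, labels):
    """Mean silhouette width, using the absolute difference as the distance."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0  # the silhouette is undefined for a single cluster
    widths = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(abs(v - u) for u in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(widths)


def choose_k(values, candidates=(2, 3, 4)):
    """Score each candidate number of clusters and return the best one."""
    scores = {k: silhouette(values, kmeans_1d(values, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical mean round-trip times (minutes), one value per day.
round_trip_minutes = [50, 51, 52, 49, 50, 70, 71, 69, 72, 70]
best_k, scores = choose_k(round_trip_minutes)
labels = kmeans_1d(round_trip_minutes, best_k)
print(best_k, labels)
```

In this toy setting each resulting cluster would correspond to a group of days served by the same schedule. A domain-specific metric, such as the one proposed in the paper, would replace `silhouette` in `choose_k`.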
Notes
1. Reports stoppage time at stops. It includes a fixed delay due to door opening and closing time, and a variable delay caused by passenger boarding and alighting activities.
2. All the normalizations done throughout this section used the Euclidean distance.
3. Note that this naive timetabling procedure is done only for this specific purpose. Once the coverage is changed, the entire timetable of the affected periods needs to be recomputed. The reader can consult the work in [15] to learn more about this topic.
References
1. Moreira-Matias, L., Mendes-Moreira, J., Freire de Sousa, J., Gama, J.: Improving mass transit operations by using avl-based systems: a survey. IEEE Trans. Intell. Transp. Syst. 16(4), 1636–1653 (2015)
2. Mendes-Moreira, J., Moreira-Matias, L., Gama, J., Freire de Sousa, J.: Validating the coverage of bus schedules: a machine learning approach. Inf. Sci. 293, 299–313 (2015)
3. Mazloumi, E., Mesbah, M., Ceder, A., Moridpour, S., Currie, G.: Efficient transit schedule design of timing points: A comparison of ant colony and genetic algorithms. Transp. Res. Part B: Methodol. 46(1), 217–234 (2012)
4. Cats, O., Mach Rufi, F., Koutsopoulos, H.: Optimizing the number and location of time point stops. Public Transp. 6(3), 215–235 (2014)
5. Jorge, A.M., Mendes-Moreira, J., de Sousa, J.F., Soares, C., Azevedo, P.J.: Finding interesting contexts for explaining deviations in bus trip duration using distribution rules. In: Hollmén, J., Klawonn, F., Tucker, A. (eds.) IDA 2012. LNCS, vol. 7619, pp. 139–149. Springer, Heidelberg (2012)
6. Patnaik, J., Chien, S., Bladikas, A.: Using data mining techniques on apc data to develop effective bus scheduling. J. Syst. Cybern. Inf. 4(1), 86–90 (2006)
7. Pei, J., Han, J., Mortazavi-Asl, N., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: Prefixspan: mining sequential patterns efficiently by prefix-projected pattern growth. In: ICCCN, p. 0215. IEEE (2001)
8. Fraley, C., Raftery, A.: Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Assoc. 97(458), 611–631 (2002)
9. Matias, L., Gama, J., Mendes-Moreira, J., Freire de Sousa, J.: Validation of both number and coverage of bus schedules using avl data. In: 13th IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 131–136 (2010)
10. Schwarz, G., et al.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)
11. Wagner, R., Scholz, S., Decker, R.: The number of clusters in market segmentation. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds.) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization, pp. 157–176. Springer, Heidelberg (2005)
12. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2012). ISBN 3-900051-07-0
13. Fraley, C., Raftery, A., Scrucca, L.: Normal mixture modeling for model-based clustering, classification, and density estimation. Department of Statistics, University of Washington 23, 2012 (2012)
14. Tabei, Y.: An imprementation of prefixspan (prefix-projected sequential pattern mining), August 2015. https://code.google.com/p/prefixspan/people/list. last access at August 2015
15. Ceder, A.: Urban transit scheduling: framework, review and examples. J. Urban Plann. Dev. 128(4), 225–244 (2002)
16. Cover, T., Hart, P.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13(1), 21–27 (1967)
Acknowledgements
This work was also supported by the European Commission under TEAM, a large scale integrated project part of the Seventh Framework Programme for research, technological development and demonstration [Grant Agreement No. 318621]. The authors would like to thank all partners within TEAM for their cooperation and valuable contribution.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Khiari, J., Moreira-Matias, L., Cerqueira, V., Cats, O. (2016). Automated Setting of Bus Schedule Coverage Using Unsupervised Machine Learning. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_44
DOI: https://doi.org/10.1007/978-3-319-31753-3_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science, Computer Science (R0)