Automated Setting of Bus Schedule Coverage Using Unsupervised Machine Learning

Khiari, Jihed; Moreira-Matias, Luis; Cerqueira, Vitor; Cats, Oded

doi:10.1007/978-3-319-31753-3_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The efficiency of Public Transportation (PT) networks is a major goal of any urban area authority. Advances in both location and communication devices have drastically increased the availability of the data generated by PT operations. Adequate Machine Learning methods can thus be applied to identify patterns useful for improving the Schedule Plan. In this paper, we propose a fully automated learning framework to determine the best Schedule Coverage to be assigned to a given PT network based on Automatic Vehicle Location (AVL) and Automatic Passenger Counting (APC) data. We formulate this problem as a clustering one, where the best number of clusters is selected through an ad-hoc metric. This metric takes into account multiple domain constraints, computed using Sequence Mining and Probabilistic Reasoning. A case study from a large operator in Sweden was selected to validate our methodology. Experimental results suggest necessary changes to the Schedule Coverage. Moreover, an impact study was conducted through a large-scale simulation over the affected time period. Its results uncovered potential improvements in schedule reliability on a large scale.
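
To make the clustering formulation concrete, the following is a minimal sketch and not the authors' framework. It assumes each day type is summarised by a small, made-up feature vector (in practice such features would be derived from AVL/APC data), groups the days with plain k-means, and picks the number of clusters with a standard silhouette score. The silhouette is a stand-in for the paper's ad-hoc metric, which is not reproduced here; the names `PROFILES`, `kmeans`, `silhouette`, and `choose_coverage` are hypothetical.

```python
import math

# Hypothetical per-day features (not from the paper):
# (mean peak trip time, mean off-peak trip time), in minutes.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
PROFILES = [
    (42.0, 35.0), (41.5, 34.5), (42.5, 35.5), (41.0, 35.0), (43.0, 36.0),
    (33.0, 30.0), (32.0, 29.5),
]


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic start; returns one label per point."""
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]  # evenly spaced seeds
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centers[c])  # keep an empty cluster's center in place
        if updated == centers:
            break
        centers = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    if len(set(labels)) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    total = 0.0
    for i, p in enumerate(points):
        own = [q for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(math.dist(p, q) for q in own) / len(own)
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) - {labels[i]}
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_coverage(points, k_values):
    """Return the (k, labels) pair with the highest silhouette score."""
    best = None
    for k in k_values:
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]


best_k, best_labels = choose_coverage(PROFILES, range(2, 5))
for day, label in zip(DAYS, best_labels):
    print(day, "-> schedule", label)
```

With these made-up numbers, the sketch assigns the five weekdays to one schedule and the weekend to another. A domain-aware metric such as the one proposed in the paper would replace `silhouette` in `choose_coverage`.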

Notes

1. Reports stoppage time at stops. It includes a fixed delay due to door opening and closing time, and a variable delay caused by passenger boarding and alighting.
2. All the normalizations done throughout this section used the Euclidean distance.
3. Note that this naive timetabling procedure is done only for this specific purpose. Once the coverage is changed, the entire timetable of the affected periods needs to be recomputed. The reader can consult the work in [15] to learn more about this topic.

References

1. Moreira-Matias, L., Mendes-Moreira, J., Freire de Sousa, J., Gama, J.: Improving mass transit operations by using avl-based systems: a survey. IEEE Trans. Intell. Transp. Syst. 16(4), 1636–1653 (2015)
2. Mendes-Moreira, J., Moreira-Matias, L., Gama, J., Freire de Sousa, J.: Validating the coverage of bus schedules: a machine learning approach. Inf. Sci. 293, 299–313 (2015)
3. Mazloumi, E., Mesbah, M., Ceder, A., Moridpour, S., Currie, G.: Efficient transit schedule design of timing points: A comparison of ant colony and genetic algorithms. Transp. Res. Part B: Methodol. 46(1), 217–234 (2012)
4. Cats, O., Mach Rufi, F., Koutsopoulos, H.: Optimizing the number and location of time point stops. Public Transp. 6(3), 215–235 (2014)
5. Jorge, A.M., Mendes-Moreira, J., de Sousa, J.F., Soares, C., Azevedo, P.J.: Finding interesting contexts for explaining deviations in bus trip duration using distribution rules. In: Hollmén, J., Klawonn, F., Tucker, A. (eds.) IDA 2012. LNCS, vol. 7619, pp. 139–149. Springer, Heidelberg (2012)
6. Patnaik, J., Chien, S., Bladikas, A.: Using data mining techniques on apc data to develop effective bus scheduling. J. Syst. Cybern. Inf. 4(1), 86–90 (2006)
7. Pei, J., Han, J., Mortazavi-Asl, N., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: Prefixspan: mining sequential patterns efficiently by prefix-projected pattern growth. In: ICCCN, p. 0215. IEEE (2001)
8. Fraley, C., Raftery, A.: Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Assoc. 97(458), 611–631 (2002)
9. Matias, L., Gama, J., Mendes-Moreira, J., Freire de Sousa, J.: Validation of both number and coverage of bus schedules using avl data. In: 13th IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 131–136 (2010)
10. Schwarz, G., et al.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)
11. Wagner, R., Scholz, S., Decker, R.: The number of clusters in market segmentation. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds.) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization, pp. 157–176. Springer, Heidelberg (2005)
12. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2012). ISBN 3-900051-07-0
13. Fraley, C., Raftery, A., Scrucca, L.: Normal mixture modeling for model-based clustering, classification, and density estimation. Department of Statistics, University of Washington 23, 2012 (2012)
14. Tabei, Y.: An imprementation of prefixspan (prefix-projected sequential pattern mining), August 2015. https://code.google.com/p/prefixspan/people/list. Last accessed August 2015
15. Ceder, A.: Urban transit scheduling: framework, review and examples. J. Urban Plann. Dev. 128(4), 225–244 (2002)
16. Cover, T., Hart, P.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13(1), 21–27 (1967)

Acknowledgements

This work was also supported by the European Commission under TEAM, a large-scale integrated project that is part of the Seventh Framework Programme for research, technological development and demonstration [Grant Agreement No. 318621]. The authors would like to thank all partners within TEAM for their cooperation and valuable contribution.

Author information

Authors and Affiliations

NEC Laboratories Europe, 69115, Heidelberg, Germany
Jihed Khiari, Luis Moreira-Matias & Vitor Cerqueira
Department of Transport and Planning, TU Delft, 2600, Delft, Netherlands
Oded Cats

Corresponding author

Correspondence to Luis Moreira-Matias.

Editor information

Editors and Affiliations

The University of Melbourne, Melbourne, Victoria, Australia
James Bailey
The University of Texas at Dallas, Richardson, Texas, USA
Latifur Khan
Osaka University, Osaka, Japan
Takashi Washio
University of Auckland, Auckland, New Zealand
Gill Dobbie
Shenzhen University, Shenzhen, China
Joshua Zhexue Huang
Massey University, Auckland, New Zealand
Ruili Wang

About this paper

Cite this paper

Khiari, J., Moreira-Matias, L., Cerqueira, V., Cats, O. (2016). Automated Setting of Bus Schedule Coverage Using Unsupervised Machine Learning. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_44

DOI: https://doi.org/10.1007/978-3-319-31753-3_44
Published: 12 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science (R0)