
Derivation of train arrival timings through correlations from individual passenger farecard data

Published in: Transportation

Abstract

In this paper, we propose a method for estimating the timings at which trains arrive at and depart from stations using passenger farecard data and knowledge of the network topology. The problem we consider is essential for understanding commuter movement patterns across metro systems at fine granularity in settings where one does not have access to train logs (records of train arrival and departure timings) or where these records are unreliable. Our technique takes as input the timings at which passengers arrive at and depart from stations, which are easily retrievable from farecard data, and provides as output an estimate of the number of trains running as well as the timings at which each train arrives at and departs from each station. Our method relies on two key observations: (1) passengers tend to exit metro stations as soon as they alight, and (2) we can reliably conclude that groups of passengers who board at the same stop but alight at different stops were on the same train if their boarding timings have similar distributions. In contrast with prior works, our methodology is stand-alone in that it does not rely on external sources of information such as train schedules, and it requires minimal parameter tuning. In addition, because our method infers as a by-product the trains that passengers board, our techniques can be employed as a pre-processing step for downstream tasks such as inferring passenger route choices. We apply our method to recover train logs using synthetically generated data as well as actual ticketing data of passengers in the Singapore metro network. Experiments on synthetic data show that our method reliably recovers train logs even with moderate levels of overcrowding on train platforms.
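The two observations in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not say how bursts of exits are detected or how boarding-time distributions are compared, so the gap-based burst detection, the two-sample Kolmogorov-Smirnov statistic, and the names and parameters `estimate_arrivals`, `ks_statistic`, `likely_same_train`, `max_gap` and `threshold` are all assumptions made for illustration.

```python
from bisect import bisect_right


def estimate_arrivals(tap_out_times, max_gap=30.0):
    """Observation (1): passengers exit soon after alighting, so tap-outs
    at a station come in bursts. Split the sorted tap-out times (seconds)
    wherever two consecutive taps are more than max_gap apart, and return
    the first tap-out of each burst as a rough proxy for a train arrival.
    max_gap is a hypothetical tuning parameter, not taken from the paper."""
    times = sorted(tap_out_times)
    if not times:
        return []
    arrivals = [times[0]]
    for prev, cur in zip(times, times[1:]):
        if cur - prev > max_gap:
            arrivals.append(cur)
    return arrivals


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the two empirical CDFs, checked at every observed point."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def likely_same_train(tap_ins_a, tap_ins_b, threshold=0.3):
    """Observation (2): two groups that boarded at the same stop but
    alighted at different stops are treated as riding the same train when
    their tap-in time distributions are close. threshold is an assumed
    cut-off chosen only for this example."""
    return ks_statistic(tap_ins_a, tap_ins_b) <= threshold


# Toy data (seconds since some reference time), invented for illustration.
tap_outs = [100, 104, 109, 115, 400, 403, 411]
group_a = [10, 40, 70, 100, 130]       # boarded at stop X, alighted at Y
group_b = [20, 50, 80, 110, 140]       # boarded at stop X, alighted at Z
group_c = [310, 340, 370, 400, 430]    # boarded at stop X, much later

arrivals = estimate_arrivals(tap_outs)             # two bursts of exits
same_ab = likely_same_train(group_a, group_b)      # overlapping windows
same_ac = likely_same_train(group_a, group_c)      # disjoint windows
```

On this toy data the exits split into two bursts, groups A and B have nearly identical boarding windows, and group C boards entirely later. A real implementation would also need to handle overcrowded platforms, where passengers left behind by one train blur the boarding-time distributions.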




Acknowledgements

This research is supported by the National Research Foundation, Singapore, and the Land Transport Authority under its Urban Mobility Grand Challenge Programme (Award No UMGC-L005). The views expressed herein are those of the authors and are not necessarily those of the funding agencies.

Author information


Contributions

Conceptualisation, methodology, HET, DWS, YSS and MAR; Coding, formal analysis and investigation, HET, DWS and YSS; Writing—original draft preparation, HET and DWS; Writing—review and editing, HET, DWS, YSS and MAR; Supervision, project administration and funding acquisition, MAR. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Hong En Tan.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Tan, H.E., Soh, D.W., Soh, Y.S. et al. Derivation of train arrival timings through correlations from individual passenger farecard data. Transportation 48, 3181–3205 (2021). https://doi.org/10.1007/s11116-021-10164-w


  • DOI: https://doi.org/10.1007/s11116-021-10164-w
