Abstract
The aim of this study is to estimate both the physical and schedule-based connections of metro passengers from their entry and exit times at the gates and the stations, a data set available from Smart Card transactions in a majority of train networks. By examining the Smart Card data, we observe a set of transit behaviors of metro passengers, manifested by the time intervals that identify the boarding, transferring, or alighting train at a station. The authenticity of the time intervals is ensured by isolating a set of passengers whose trip has a unique connection that is predominantly better in all respects than any alternative connection. The connections of such passengers, known as reference passengers, can be readily determined, and hence their gate times and stations can be used to derive reliable time intervals. To detect an unknown path of a passenger, the proposed method checks, for each alternative connection, whether it admits a sequence of boarding, middle, and alighting trains whose time intervals are all consistent with the gate times and stations of the passenger, a necessary condition of a true connection. Tested on 32 million weekly trips, the proposed method detected unique connections satisfying the necessary condition, which are therefore most likely the true physical and schedule-based connections, in 92.6 % and 83.4 % of the cases, respectively.
Acknowledgments
This research was supported in part by the Basic Science Research Program (2014R1A2A1A11049663) through the National Research Foundation of Korea (NRF), and by the BK21 Plus Program (Center for Sustainable and Innovative Industrial Systems) funded by the Ministry of Education, Korea.
Appendix
Probability estimation of schedule-based connections
Suppose the current physical connection requires a single transfer, say, at Station \(A\). The schedule-based connections on a physical connection can be represented by a time-expanded network as in Fig. 11.
The consistency check is initiated by finding consistent trains at both \(O\) and \(D\). By this assumption, there can be at most two trains, say \(X_1\) and \(X_2\), at \(O\), whose time intervals contain the entry time, while at most one train, say \(Y\), can be consistent with the exit time at \(D\). If there are no such trains at either \(O\) or \(D\), the passenger did not use the physical connection.
If neither \(X_1\) nor \(X_2\) can be connected to \(Y\), in the sense that there is no relevant transfer reference passenger, we conclude that the passenger did not use the physical connection.
If there is only one such train, say \(X_1\), whose connection to \(Y\) can be verified by transfer reference passengers, then the schedule-based connection \(X_1Y\) is confirmed as the unique connection of the passenger.
Finally, if there are two trains, say \(X_1\) and \(X_2\), from both of which we can find transfer reference passengers to \(Y\) as in Fig. 11, we need to return both \(X_1Y\) and \(X_2Y\). This is the worst case, in that the maximum number of schedule-based connections is confirmed as consistent.
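The case analysis above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Train` record, the function names, the representation of transfer reference passengers as a set of verified `(boarding, alighting)` pairs, and all numeric gate times are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Train:
    """Hypothetical record: a train and the gate-time interval that identifies it."""
    name: str
    start: float  # earliest gate time consistent with this train
    end: float    # latest gate time consistent with this train


def consistent_trains(trains: List[Train], gate_time: float) -> List[Train]:
    """Return the trains whose time interval contains the given gate time."""
    return [t for t in trains if t.start <= gate_time <= t.end]


def single_transfer_connections(
    entry_time: float,
    exit_time: float,
    boarding_trains: List[Train],
    alighting_trains: List[Train],
    verified_transfers: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """List the consistent schedule-based connections on a one-transfer path.

    An empty list means the physical connection is rejected.
    """
    xs = consistent_trains(boarding_trains, entry_time)   # candidates at O
    ys = consistent_trains(alighting_trains, exit_time)   # candidates at D
    if not xs or not ys:
        return []
    # Keep only pairs whose transfer is verified by transfer reference passengers.
    return [
        (x.name, y.name)
        for x in xs
        for y in ys
        if (x.name, y.name) in verified_transfers
    ]


# Hypothetical data: two boarding intervals that overlap on [150, 200].
boarding = [Train("X1", 100, 200), Train("X2", 150, 250)]
alighting = [Train("Y", 900, 960)]
verified = {("X1", "Y"), ("X2", "Y")}

both = single_transfer_connections(160, 930, boarding, alighting, verified)
unique = single_transfer_connections(120, 930, boarding, alighting, verified)
rejected = single_transfer_connections(160, 930, boarding, alighting, set())
```

With these made-up intervals, an entry time of 160 falls in the overlap and yields both connections, an entry time of 120 yields only `("X1", "Y")`, and an empty set of verified transfers rejects the physical connection.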
The estimation, however, can be refined by a probability distribution over the two connections. In Fig. 11, we introduce the following notation:

\(p\): The fraction of the boarding reference passengers from the overlap of the two time intervals that boarded train \(X_1\)

\(1-p\): The fraction of the boarding reference passengers from the overlap of the two time intervals that boarded not train \(X_1\) but train \(X_2\)

\(1-q_1\): The fraction of the transfer reference passengers from \(X_1\) to \(Y\)

\(q_2\): The fraction of the transfer reference passengers from \(X_2\) to \(Y\)
It is then not difficult to derive the probability of each of the two connections from these fractions.
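As a sketch of how such a distribution can be obtained, suppose (an assumption made here for illustration, not a statement from the text) that the passenger behaves like the reference passengers, and that the boarding fractions \(p\) and \(1-p\) combine multiplicatively with the transfer fractions \(1-q_1\) and \(q_2\) into \(Y\). Conditioning on the passenger having alighted from \(Y\), Bayes' rule would then give

\[
P(X_1Y) = \frac{p\,(1-q_1)}{p\,(1-q_1) + (1-p)\,q_2},
\qquad
P(X_2Y) = \frac{(1-p)\,q_2}{p\,(1-q_1) + (1-p)\,q_2}.
\]

For instance, with the hypothetical values \(p = 0.6\), \(1-q_1 = 0.5\), and \(q_2 = 0.8\), the two weights are \(0.30\) and \(0.32\), so \(P(X_1Y) \approx 0.484\) and \(P(X_2Y) \approx 0.516\).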
Table 6 summarizes the number and list of consistent schedule-based connection(s), the corresponding conditions, and the probability distributions. If none of the conditions in Table 6 is satisfied, no schedule-based connection can be consistent with the quadruple of our passenger, and hence the physical connection is rejected.
For a physical connection that requires two transfers, there may be up to 3 schedule-based connections consistent with a quadruple if the trip is not abnormally delayed. The previous arguments can be easily extended to such a case.
Cite this article
Hong, S., Min, Y., Park, M. et al. Precise estimation of connections of metro passengers from Smart Card data. Transportation 43, 749–769 (2016). https://doi.org/10.1007/s11116-015-9617-y
Keywords
 Physical and schedule-based connection estimation
 Smart Card data
 Metro network
 Passenger’s behaviors