Abstract
The increasing adoption of automatic vehicle location and automatic passenger count technologies by transit agencies produces passenger boarding and alighting count data on a continuous basis. These data provide new opportunities for origin–destination (O–D) flow estimation. However, state-of-the-art methodologies generate flows within individual routes and rarely consider linked trips. This paper proposes optimization models that identify transfers and approximate network-level O–D flows through three formulations: a quadratic integer program (QIP), a feasible rounding procedure for the quadratic convex programming (QCP) relaxation of the QIP, and an integer program (IP). A case study for the Ann Arbor-Ypsilanti area in Michigan suggests that the IP model outperforms the QCP approach in accuracy and, unlike the QIP, remains computationally tractable. Its O–D estimation achieves an R-squared of \(95.57\%\) at the traffic analysis zone level and \(92.39\%\) at the stop level when compared with the ground truth inferred from state-of-the-practice trip-chaining methods.
Notes
The terms "transfer probabilities" and "transfer rates" both refer to the proportions of transfers.
Acknowledgements
This research is funded by the Michigan Institute of Data Science (MIDAS) and by Grant 7F-30154 from the Department of Energy. The authors would like to thank Forest Yang from the AAATA for his assistance in providing the data. Findings presented in this paper do not necessarily represent the views of the funding agencies.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
The appended figures visualize the spatial distribution of the stop-level accuracy of the model estimation, using the IP approach on the Go!Pass data as an example. Figures 6 and 10 illustrate the benchmark counts of trips originating from and terminating at each stop, respectively, with each of the 2 transit centers treated as a single stop. Figures 7 and 11 depict the inferred counts of trips originating from and terminating at each stop. Figures 8 and 12 visualize the total deviation of the IP model estimation from the benchmark, again in terms of the trip counts at each stop as an origin and as a destination. Both figures show the absolute differences obtained when comparing Figs. 7 and 11 against Figs. 6 and 10.
Specifically, let \(\text{OD}^*\) denote the benchmark matrix, \(\text{OD}\) denote the estimated matrix, n denote the total number of stops, and \((i,j) \in \{1,\ldots ,n\}^2\) denote the indices (or the coordinates) of the origin and destination pairs. For each stop i as the origin, the data presented in Fig. 8 were calculated as \(\big | \sum _{j=1}^{n} \text{OD}^*_{i,j} - \sum _{j=1}^{n} \text{OD}_{i,j} \big |\). For each stop j as the destination, the data presented in Fig. 12 were calculated as \(\big | \sum _{i=1}^{n} \text{OD}^*_{i,j} - \sum _{i=1}^{n} \text{OD}_{i,j} \big |\).
Figures 9 and 13 depict the L2-norm as the distance measure between the benchmark matrix and the estimation, also at the stop level. Specifically, let the vector \(\text{OD}^*_{i,\cdot }\) denote the ith row of the benchmark matrix, and the vector \(\text{OD}_{i,\cdot }\) denote the ith row of the estimation matrix. Also, let the vector \(\text{OD}^*_{\cdot ,j}\) denote the jth column of the benchmark matrix, and the vector \(\text{OD}_{\cdot ,j}\) denote the jth column of the estimation matrix. For each stop i as the origin, the data presented in Fig. 9 were calculated as \(\left\Vert \text{OD}^*_{i,\cdot } - \text{OD}_{i, \cdot }\right\Vert _2 = \sqrt{\sum _{j=1}^n (\text{OD}^*_{i,j} - \text{OD}_{i,j})^2}\). For each stop j as the destination, the data presented in Fig. 13 were calculated as \(\left\Vert \text{OD}^*_{\cdot ,j} - \text{OD}_{\cdot ,j}\right\Vert _2 = \sqrt{\sum _{i=1}^n (\text{OD}^*_{i,j} - \text{OD}_{i,j})^2}\).
The data in Figs. 8, 9, 12, and 13 have the same unit as those in Figs. 6, 7, 10, and 11, and were plotted on the same scale for comparison. In Figs. 6, 7, 10, and 11, the size of the red circles depicts the volume of flows originating from or terminating at each stop. In Figs. 8, 9, 12, and 13, larger red circles correspond to larger differences between the estimation and the benchmark, and therefore to poorer model performance.
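The two stop-level deviation measures described above can be illustrated with a minimal Python sketch. It assumes the standard Euclidean definition of the L2-norm (the square root of the sum of squared differences), which keeps the result in the same unit as the trip counts. The function name `stop_level_deviations` and the 3-stop toy matrices are hypothetical and are not taken from the paper or its data.

```python
import math


def stop_level_deviations(od_benchmark, od_estimate):
    """Per-stop deviation measures between two square O-D matrices.

    Hypothetical helper for illustration only. Matrices are lists of
    rows; entry [i][j] is the flow from origin stop i to destination j.
    """
    n = len(od_benchmark)
    if len(od_estimate) != n or any(
        len(row) != n for row in list(od_benchmark) + list(od_estimate)
    ):
        raise ValueError("both matrices must be n x n with the same n")

    # Element-wise difference OD* - OD.
    diff = [
        [od_benchmark[i][j] - od_estimate[i][j] for j in range(n)]
        for i in range(n)
    ]

    return {
        # |sum_j OD*_ij - sum_j OD_ij| for each origin stop i.
        "origin_abs": [abs(sum(diff[i])) for i in range(n)],
        # |sum_i OD*_ij - sum_i OD_ij| for each destination stop j.
        "dest_abs": [abs(sum(diff[i][j] for i in range(n))) for j in range(n)],
        # Euclidean norm of row i of the difference matrix.
        "origin_l2": [math.sqrt(sum(d * d for d in diff[i])) for i in range(n)],
        # Euclidean norm of column j of the difference matrix.
        "dest_l2": [
            math.sqrt(sum(diff[i][j] ** 2 for i in range(n))) for j in range(n)
        ],
    }


# Toy 3-stop example (invented numbers, not the AAATA data).
benchmark = [[0, 5, 2], [1, 0, 4], [3, 2, 0]]
estimate = [[0, 4, 4], [1, 0, 1], [0, 2, 0]]
result = stop_level_deviations(benchmark, estimate)
```

In this toy example the first origin stop has an absolute difference of 1 but an L2-norm of about 2.24, because an overestimate and an underestimate partly cancel in the row total. This is one reason the two measures can rank stops differently.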
Cite this article
Liu, X., Van Hentenryck, P. & Zhao, X. Optimization Models for Estimating Transit Network Origin–Destination Flows with Big Transit Data. J. Big Data Anal. Transp. 3, 247–262 (2021). https://doi.org/10.1007/s42421-021-00050-3
DOI: https://doi.org/10.1007/s42421-021-00050-3