Abstract
Transportation analysts and planners are beginning to leverage GPS trajectory data to draw additional insight into travel behavior and enhance data-driven decision-making capabilities. However, raw trajectory data cannot be utilized directly; they require extensive processing prior to analysis. This paper presents a scalable approach for enhancing raw GPS trajectory data by snapping and routing waypoints along a user-defined target road network that may have discontinuities and missing links, thus enabling trajectory datasets to be used in conjunction with the types of non-routable road networks often employed by transportation agencies. The proposed approach fuses a well-established map matching solution with a custom waypoint conflation procedure, and provides a framework to execute the trajectory processing in parallel to efficiently leverage available computing resources for large GPS datasets. To demonstrate its capability, four months of 2018 trajectory data from Maryland (2.5 billion waypoints from 46 million trips) are processed in this manner and assigned to a Traffic Message Channel road network. The enhanced trajectory data are then used to demonstrate a real-world use case, identifying key travel patterns along the I-270 spur in Maryland—a key commuting corridor currently being considered by the Maryland Department of Transportation for a congestion mitigation investment.
References
Blazquez CA, Vonderohe AP (2005) Simple map-matching algorithm applied to intelligent winter maintenance vehicle data. Transp Res Rec 1935(1):68–76
De Smith MJ, Goodchild MF, Longley P (2007) Geospatial analysis: a comprehensive guide to principles, techniques and software tools. Troubador Publishing Ltd
Green E, Ripy J, Chen M, Zhang X (2013) Conflation methodologies to incorporate consumer travel data into state HPMS datasets. In: Transportation Research Board 92nd annual meeting, Transportation Research Board, vol 92, pp 1–15
Greenfeld JS (2002) Matching GPS observations to locations on a digital map. 81st annual meeting of the Transportation Research Board, vol 1. Washington DC, pp 164–173
Haklay M, Weber P (2008) OpenStreetMap: user-generated street maps. IEEE Pervasive Comput 7(4):12–18
Kaushik K, Wood E, Gonder J (2018) Coupled approximation of US driving speed and volume statistics using spatial conflation and temporal disaggregation. Transp Res Rec 2672(43):1–11
Li Y, Liu C (2012) Spatial approaches for conflating GIS roadway datasets. In: Sustainable transportation systems: plan, design, build, manage, and maintain, pp 290–298
Lomax T, Wang B, Schrank D, Eisele W, Turner S, Ellis D, Li Y, Koncz N, Geng L et al (2010) Improving mobility information with better data and estimation procedures. Technical report, Texas Transportation Institute
Luo W, Tan H, Chen L, Ni LM (2013) Finding time period-based most frequent path in big trajectory data. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data. ACM, pp 713–724
Marković N, Sekuła P, Vander Laan Z, Andrienko G, Andrienko N (2018) Applications of trajectory data from the perspective of a road transportation agency: literature review and Maryland case study. IEEE Trans Intell Transp Syst 20(5):1858–1869
Miller S, Vander Laan Z, Marković N (2020) Scaling GPS trajectories to match point traffic counts: a convex programming approach and Utah case study. Transp Res Part E Logist Transp Rev 143:1
Newson P, Krumm J (2009) Hidden Markov map matching through noise and sparseness. In: Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 336–343
Ochieng WY, Quddus MA, Noland RB (2003) Map-matching in complex urban road networks
Quddus MA, Ochieng WY, Noland RB (2007) Current map-matching algorithms for transport applications: state-of-the-art and future research directions. Transp Res Part C Emerg Technol 15(5):312–328
Quddus MA, Ochieng WY, Zhao L, Noland RB (2003) A general map matching algorithm for transport telematics applications. GPS Solut 7(3):157–167
Schrank D, Eisele B, Lomax T, Bak J (2015) 2015 urban mobility scorecard. Technical Report, Texas A&M Transportation Institute
Sekuła P, Marković N, Vander Laan Z, Sadabadi KF (2018) Estimating historical hourly traffic volumes via machine learning and vehicle probe data: a Maryland case study. Transp Res Part C Emerg Technol 97:147–158
Syed S, Cannon ME (2004) Fuzzy logic-based map matching algorithm for vehicle navigation system in urban canyons. In: ION National Technical Meeting. number 1, pp 26–28
Yang D, Cai B, Yuan Y (2003) An improved map-matching algorithm used in vehicle navigation system. In: Proceedings of the 2003 IEEE international conference on intelligent transportation systems, vol 2, IEEE, pp 1246–1250
Zheng Y (2015) Trajectory data mining: an overview. ACM Trans Intell Syst Technol (TIST) 6(3):29
Acknowledgements
The authors are grateful to the Editor and three anonymous referees whose comments helped improve this manuscript. The authors also acknowledge Yinhu Wang from the University of Utah as well as Ignacio Tous and Rick Ayers from the Center for Advanced Transportation Technology at UMD for their assistance.
Appendix: OSRM Map Matching
For completeness, we provide a brief description of the method from Newson and Krumm (2009) that is implemented by the Open Source Routing Machine (OSRM), which is used in Step 1 to match raw GPS points to the road network. The map matching algorithm employed by OSRM uses a Hidden Markov Model (HMM) to find the most likely road route represented by a time-stamped sequence of latitude/longitude pairs (Newson and Krumm 2009). Given a sequence of GPS points \(\{z_1,z_2,\ldots ,z_t,\ldots ,z_T\}\) and the road segments in the OSM network \(\{r_1,r_2,\ldots ,r_i,\ldots ,r_N\}\), the HMM calculates the measurement probability \(p(z_t|r_i)\) that the GPS point \(z_t\) would be observed if the vehicle were actually on road segment \(r_i\) as
\[
p(z_t|r_i) = \frac{1}{\sqrt{2\pi }\,\sigma _z} \exp \left( -\frac{1}{2}\left( \frac{\Vert z_t-x_{t,i}\Vert }{\sigma _z}\right) ^2\right) \tag{1}
\]
where \(\sigma _z\) is the standard deviation of GPS measurements and \(\Vert z_t-x_{t,i}\Vert\) measures the great-circle distance between \(z_t\) and the closest point \(x_{t,i}\) to \(z_t\) on the road segment \(r_i\).
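As an illustration, the measurement probability can be sketched in a few lines of Python. This is a minimal sketch, not the OSRM implementation: it assumes the zero-mean Gaussian form given by Newson and Krumm (2009), uses the haversine formula for the great-circle distance, and the function names, the Earth-radius constant, and the example value of \(\sigma _z\) are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def great_circle_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def measurement_probability(distance_m, sigma_z):
    """Zero-mean Gaussian density evaluated at the distance between a GPS
    point z_t and its closest point x_{t,i} on candidate segment r_i."""
    return (math.exp(-0.5 * (distance_m / sigma_z) ** 2)
            / (math.sqrt(2 * math.pi) * sigma_z))


# Illustrative use: a GPS fix and its (hypothetical) projection onto a segment.
sigma_z = 5.0  # metres; illustrative value, not taken from the paper
d = great_circle_m(39.0000, -77.1500, 39.0001, -77.1500)
p = measurement_probability(d, sigma_z)
```

Candidates whose closest point lies farther from the GPS fix receive a smaller density, so nearby segments are favored without being selected outright.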
Based on (1), the probability of matching each GPS point to each of its candidate road segments can be obtained. The next step is to determine the probability of a vehicle moving between candidate road segments for any two temporally adjacent trace points. Following Newson and Krumm (2009), the transition probability between candidate road matches is defined as
\[
p(d_t) = \frac{1}{\beta }\, e^{-d_t/\beta } \tag{2}
\]
in which
\[
d_t = \left| \Vert z_t - z_{t+1}\Vert - \Vert x_{t,i} - x_{t+1,j}\Vert \right| \tag{3}
\]
where \(\beta\) is the scale parameter of the exponential distribution, \(\Vert z_t - z_{t+1}\Vert\) is the great-circle distance between the consecutive GPS points \(z_t\) and \(z_{t+1}\), \(x_{t,i}\) and \(x_{t+1,j}\) represent the closest points on roads \(r_i\) and \(r_j\) for GPS points \(z_t\) and \(z_{t+1}\), respectively, and \(\Vert x_{t,i} - x_{t+1,j}\Vert\) denotes the driving distance between points \(x_{t,i}\) and \(x_{t+1,j}\).
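The transition step can be sketched similarly. The snippet below assumes the exponential form from Newson and Krumm (2009), in which the probability decays with the absolute difference between the great-circle distance separating two consecutive GPS points and the driving distance separating their candidate matches. The function name and the value of the scale parameter `beta` are hypothetical, and the two distances are taken as precomputed inputs.

```python
import math


def transition_probability(great_circle_dist_m, driving_dist_m, beta):
    """Exponential density of the mismatch between the straight-line
    (great-circle) distance of two consecutive GPS points and the driving
    distance between their candidate matched points."""
    d_t = abs(great_circle_dist_m - driving_dist_m)
    return math.exp(-d_t / beta) / beta


beta = 2.0  # illustrative scale parameter, not taken from the paper

# A candidate pair whose driving distance nearly equals the straight-line
# distance is far more plausible than one that implies a long detour.
p_direct = transition_probability(100.0, 103.0, beta)
p_detour = transition_probability(100.0, 450.0, beta)
```

Under this form, candidate pairs that would require an implausible detour between consecutive fixes are penalized heavily, which is what discourages matches from jumping between nearby parallel roads.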
Given the two probability functions, the Viterbi algorithm is used to quickly find the most plausible path that maximizes the product of the measurement probabilities and transition probabilities (Newson and Krumm 2009).
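To make the final step concrete, the following is a minimal, generic Viterbi sketch in log-space. It is not OSRM's code, and the toy probabilities are invented purely for illustration. `log_emission[t][i]` is assumed to hold the log measurement probability of candidate \(i\) for GPS point \(z_t\), and `log_transition[t][i][j]` the log transition probability from candidate \(i\) at step \(t\) to candidate \(j\) at step \(t+1\).

```python
import math


def viterbi(log_emission, log_transition):
    """Return the index of the most likely candidate at each time step."""
    score = list(log_emission[0])  # best log-probability ending in each candidate
    back_pointers = []
    for t in range(1, len(log_emission)):
        new_score, pointers = [], []
        for j, log_e in enumerate(log_emission[t]):
            # Best predecessor i for candidate j at step t.
            best_i = max(range(len(score)),
                         key=lambda i: score[i] + log_transition[t - 1][i][j])
            new_score.append(score[best_i] + log_transition[t - 1][best_i][j] + log_e)
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)
    # Backtrack from the best final candidate.
    state = max(range(len(score)), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: three GPS points, two candidate segments each (made-up values).
emission = [[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]]
stay_or_switch = [[0.9, 0.1], [0.1, 0.9]]  # staying on a segment is likelier
log_emission = [[math.log(p) for p in row] for row in emission]
log_transition = [[[math.log(p) for p in row] for row in stay_or_switch]
                  for _ in range(len(emission) - 1)]

best_path = viterbi(log_emission, log_transition)
```

In this toy case the first point is individually closer to candidate 0, yet the jointly most likely sequence keeps the vehicle on candidate 1 throughout, because switching segments is penalized by the transition term. Working in log-space replaces the product of probabilities with a sum and avoids numerical underflow on long trajectories.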
Cite this article
Vander Laan, Z., Franz, M. & Marković, N. Scalable Framework for Enhancing Raw GPS Trajectory Data: Application to Trip Analytics for Transportation Planning. J. Big Data Anal. Transp. 3, 119–139 (2021). https://doi.org/10.1007/s42421-021-00040-5
DOI: https://doi.org/10.1007/s42421-021-00040-5