Scalable Framework for Enhancing Raw GPS Trajectory Data: Application to Trip Analytics for Transportation Planning

  • Original Paper
Journal of Big Data Analytics in Transportation

Abstract

Transportation analysts and planners are beginning to leverage GPS trajectory data to gain additional insight into travel behavior and enhance data-driven decision-making capabilities. However, raw trajectory data cannot be used directly; they require extensive processing prior to analysis. This paper presents a scalable approach for enhancing raw GPS trajectory data by snapping and routing waypoints along a user-defined target road network that may have discontinuities and missing links, thus enabling trajectory datasets to be used in conjunction with the types of non-routable road networks often employed by transportation agencies. The proposed approach fuses a well-established map matching solution with a custom waypoint conflation procedure, and provides a framework for executing the trajectory processing in parallel so that available computing resources are used efficiently on large GPS datasets. To demonstrate its capability, four months of 2018 trajectory data from Maryland (2.5 billion waypoints from 46 million trips) are processed in this manner and assigned to a Traffic Message Channel road network. The enhanced trajectory data are then used in a real-world use case: identifying key travel patterns along the I-270 spur in Maryland, a major commuting corridor currently being considered by the Maryland Department of Transportation for a congestion mitigation investment.

References

  • Blazquez CA, Vonderohe AP (2005) Simple map-matching algorithm applied to intelligent winter maintenance vehicle data. Transp Res Rec 1935(1):68–76

  • De Smith MJ, Goodchild MF, Longley P (2007) Geospatial analysis: a comprehensive guide to principles, techniques and software tools. Troubador Publishing Ltd

  • Green E, Ripy J, Chen M, Zhang X (2013) Conflation methodologies to incorporate consumer travel data into state HPMS datasets. In: Transportation research board 92nd annual meeting, transportation research board, vol 92, pp 1–15

  • Greenfeld JS (2002) Matching GPS observations to locations on a digital map. 81st annual meeting of the transportation research board, vol 1. Washington DC, pp 164–173

  • Haklay M, Weber P (2008) OpenStreetMap: user-generated street maps. IEEE Pervasive Comput 7(4):12–18

  • Kaushik K, Wood E, Gonder J (2018) Coupled approximation of US driving speed and volume statistics using spatial conflation and temporal disaggregation. Transp Res Rec 2672(43):1–11

  • Li Y, Liu C (2012) Spatial approaches for conflating GIS roadway datasets. In: Sustainable transportation systems: plan, design, build, manage, and maintain, pp 290–298

  • Lomax T, Wang B, Schrank D, Eisele W, Turner S, Ellis D, Li Y, Koncz N, Geng L et al (2010) Improving mobility information with better data and estimation procedures. Technical report, Texas Transportation Institute

  • Luo W, Tan H, Chen L, Ni LM (2013) Finding time period-based most frequent path in big trajectory data. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data. ACM, pp 713–724

  • Marković N, Sekuła P, Vander Laan Z, Andrienko G, Andrienko N (2018) Applications of trajectory data from the perspective of a road transportation agency: literature review and Maryland case study. IEEE Trans Intell Transp Syst 20(5):1858–1869

  • Miller S, Vander Laan Z, Marković N (2020) Scaling GPS trajectories to match point traffic counts: a convex programming approach and Utah case study. Transp Res Part E Logist Transp Rev 143:1

  • Newson P, Krumm J (2009) Hidden Markov map matching through noise and sparseness. In: Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 336–343

  • Ochieng WY, Quddus MA, Noland RB (2003) Map-matching in complex urban road networks

  • Quddus MA, Ochieng WY, Noland RB (2007) Current map-matching algorithms for transport applications: state-of-the-art and future research directions. Transp Res Part C Emerg Technol 15(5):312–328

  • Quddus MA, Ochieng WY, Zhao L, Noland RB (2003) A general map matching algorithm for transport telematics applications. GPS Solut 7(3):157–167

  • Schrank D, Eisele B, Lomax T, Bak J (2015) 2015 urban mobility scorecard. Technical Report, Texas A&M Transportation Institute

  • Sekuła P, Marković N, Vander Laan Z, Sadabadi KF (2018) Estimating historical hourly traffic volumes via machine learning and vehicle probe data: a Maryland case study. Transp Res Part C Emerg Technol 97:147–158

  • Syed S, Cannon ME (2004) Fuzzy logic-based map matching algorithm for vehicle navigation system in urban canyons. In: ION National Technical Meeting. number 1, pp 26–28

  • Yang D, Cai B, Yuan Y (2003) An improved map-matching algorithm used in vehicle navigation system. In: Proceedings of the 2003 IEEE international conference on intelligent transportation systems, vol 2, IEEE, pp 1246–1250

  • Zheng Y (2015) Trajectory data mining: an overview. ACM Trans Intell Syst Technol (TIST) 6(3):29

Acknowledgements

The authors are grateful to the Editor and three anonymous referees whose comments helped improve this manuscript. The authors also acknowledge Yinhu Wang from the University of Utah as well as Ignacio Tous and Rick Ayers from the Center for Advanced Transportation Technology at UMD for their assistance.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: OSRM Map Matching

For completeness, we briefly describe the method of Newson and Krumm (2009) as implemented by the Open Source Routing Machine (OSRM), which is used in Step 1 to match raw GPS points to the road network. The OSRM map matching algorithm uses a Hidden Markov Model (HMM) to find the most likely road route represented by a time-stamped sequence of latitude/longitude pairs (Newson and Krumm 2009). Given a sequence of GPS points \(\{z_1,z_2,\ldots ,z_t,\ldots ,z_T\}\) and the road segments in the OpenStreetMap (OSM) network \(\{r_1,r_2,\ldots ,r_i,\ldots ,r_N\}\), the HMM calculates the measurement probability \(p(z_t|r_i)\) that the GPS point \(z_t\) would be observed if the vehicle were actually on road segment \(r_i\) as

$$\begin{aligned} p(z_t | r_i) = \dfrac{1}{\sqrt{2\pi }\sigma _z}e^{-0.5\left( \dfrac{\Vert z_t-x_{t,i}\Vert }{\sigma _z}\right) ^2}, \end{aligned}$$
(1)

where \(\sigma _z\) is the standard deviation of GPS measurements and \(\Vert z_t-x_{t,i}\Vert\) measures the great-circle distance between \(z_t\) and the closest point \(x_{t,i}\) to \(z_t\) on the road segment \(r_i\).
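As an illustration of Eq. (1), the following is a minimal Python sketch, not the OSRM implementation. The function names, the haversine approximation of the great-circle distance, the Earth radius constant, and the example value of \(\sigma _z\) are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def great_circle_m(p, q):
    """Haversine distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def measurement_probability(z_t, x_ti, sigma_z):
    """Eq. (1): Gaussian density evaluated at the distance ||z_t - x_{t,i}||.

    z_t   -- observed GPS point (lat, lon)
    x_ti  -- closest point to z_t on candidate segment r_i (lat, lon)
    sigma_z -- standard deviation of GPS measurements, in metres
    """
    dist = great_circle_m(z_t, x_ti)
    return math.exp(-0.5 * (dist / sigma_z) ** 2) / (math.sqrt(2 * math.pi) * sigma_z)


# Illustrative values only: a GPS fix and two hypothetical candidate points.
z = (39.0000, -77.1000)
near = (39.0000, -77.10002)   # roughly 2 m away
far = (39.0000, -77.1003)     # roughly 26 m away
p_near = measurement_probability(z, near, sigma_z=5.0)
p_far = measurement_probability(z, far, sigma_z=5.0)
```

In this sketch the nearer candidate receives the larger measurement probability, which is the behavior Eq. (1) is meant to encode.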

Based on (1), the probability of matching each GPS point to each of its candidate road segments can be obtained. The next step is to determine the probability of a vehicle moving between candidate road segments for any two temporally adjacent trace points. The transition probability function measures the probability of a vehicle moving between the candidate road matches,

$$\begin{aligned} p(d_t) = \dfrac{1}{\beta }e^{-\frac{d_t}{\beta }}, \end{aligned}$$
(2)

in which

$$\begin{aligned} d_t = |\ \Vert z_t - z_{t+1}\Vert - \Vert x_{t,i} - x_{t+1,j}\Vert \ |, \end{aligned}$$
(3)

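where \(\beta\) is the scale parameter of the exponential distribution in (2), \(x_{t,i}\) and \(x_{t+1,j}\) represent the closest points on roads \(r_i\) and \(r_j\) to GPS points \(z_t\) and \(z_{t+1}\), respectively, \(\Vert z_t - z_{t+1}\Vert\) is the great-circle distance between the two GPS points, and \(\Vert x_{t,i} - x_{t+1,j}\Vert\) denotes the driving distance between \(x_{t,i}\) and \(x_{t+1,j}\).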
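Equations (2) and (3) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and arguments are hypothetical, and the two distances are passed in as plain numbers because computing the driving distance requires a routing engine that is outside the scope of this example.

```python
import math


def transition_probability(great_circle_dist_m, driving_dist_m, beta):
    """Eqs. (2)-(3): exponential density of the distance mismatch d_t.

    great_circle_dist_m -- ||z_t - z_{t+1}||, distance between the GPS points
    driving_dist_m      -- ||x_{t,i} - x_{t+1,j}||, route distance between
                           the candidate points on the road network
    beta                -- scale parameter of the exponential distribution
    """
    d_t = abs(great_circle_dist_m - driving_dist_m)   # Eq. (3)
    return math.exp(-d_t / beta) / beta               # Eq. (2)


# Illustrative values only: a plausible transition versus a large detour.
p_direct = transition_probability(100.0, 105.0, beta=10.0)
p_detour = transition_probability(100.0, 400.0, beta=10.0)
```

A candidate pair whose driving distance is close to the straight-line distance between the GPS points gets a high transition probability, while a pair that implies a long detour is penalized exponentially.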

Given the two probability functions, the Viterbi algorithm is used to quickly find the most plausible path that maximizes the product of the measurement probabilities and transition probabilities (Newson and Krumm 2009).
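The Viterbi step can be sketched as follows. This is a generic textbook formulation written for illustration, not the OSRM code: the function name, the input layout (nested lists of probabilities), the use of log probabilities to avoid numerical underflow, and the choice to initialize with the first measurement probabilities alone are assumptions of this example.

```python
import math


def _log(p):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(emission, transition):
    """Return the most likely sequence of candidate indices.

    emission[t][i]      -- p(z_t | r_i) for candidate i at time step t
    transition[t][i][j] -- probability of moving from candidate i at step t
                           to candidate j at step t + 1
    """
    score = [_log(p) for p in emission[0]]  # best log-score ending in each candidate
    backpointers = []
    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for j, e in enumerate(emission[t]):
            # Best predecessor i for candidate j at step t.
            best_i = max(range(len(score)),
                         key=lambda i: score[i] + _log(transition[t - 1][i][j]))
            new_score.append(score[best_i] + _log(transition[t - 1][best_i][j]) + _log(e))
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final candidate.
    path = [max(range(len(score)), key=score.__getitem__)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with two candidate segments per GPS point (made-up numbers).
emission = [[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]]
transition = [[[0.9, 0.1], [0.1, 0.9]],
              [[0.9, 0.1], [0.1, 0.9]]]
best_path = viterbi(emission, transition)  # → [1, 1, 1]
```

In the toy example the first GPS point is individually closer to candidate 0, yet the jointly most likely path stays on candidate 1 throughout, because switching segments is unlikely and the last point strongly favors candidate 1.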

Cite this article

Vander Laan, Z., Franz, M. & Marković, N. Scalable Framework for Enhancing Raw GPS Trajectory Data: Application to Trip Analytics for Transportation Planning. J. Big Data Anal. Transp. 3, 119–139 (2021). https://doi.org/10.1007/s42421-021-00040-5

  • DOI: https://doi.org/10.1007/s42421-021-00040-5
