Geospatial Partitioning of Open Transit Data

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12128)


Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the operator’s entire service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. Above all, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.
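
The hub-based clustering mentioned above can be illustrated with a minimal sketch: each stop is assigned to its nearest hub, and each hub then defines one geospatial fragment. This is only an illustration under stated assumptions, not the authors' implementation. The hub and stop coordinates, the identifiers, the choice of great-circle (haversine) distance, and the function names `haversine_km` and `assign_to_hubs` are all hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def assign_to_hubs(stops, hubs):
    """Map each stop id to the id of its nearest hub."""
    return {
        stop_id: min(hubs, key=lambda hub_id: haversine_km(coord, hubs[hub_id]))
        for stop_id, coord in stops.items()
    }


# Hypothetical data: two hubs and three stops (the paper uses 8 hubs).
hubs = {"ghent": (51.0543, 3.7174), "antwerp": (51.2194, 4.4025)}
stops = {"s1": (51.04, 3.73), "s2": (51.20, 4.42), "s3": (51.10, 3.99)}

fragments = assign_to_hubs(stops, hubs)
```

In this sketch a stop's fragment depends only on the fixed hub locations, so adding or deleting a stop leaves every other stop's assignment unchanged. That is one plausible way to obtain the robustness to changes in the source data that the abstract describes.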


Linked open data · Mobility · Maintainability · Web API engineering



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. IDLab, Department of Electronics and Information Systems, Ghent University – imec, Ghent, Belgium
