The First Data Science Challenge at BTW 2017

Abstract

The 17th Conference on Database Systems for Business, Technology, and Web (BTW 2017) of the German Informatics Society (GI) took place in March 2017 at the University of Stuttgart, Germany. For the first time at a BTW conference, a Data Science Challenge was organized by the University of Stuttgart and the sponsor IBM. We challenged the participants to solve a data analysis task within one month and to present their results at the conference. In this article, we give an overview of the organizational process surrounding the Challenge and introduce the task that the participants had to solve. In the subsequent sections, the four finalist groups describe their approaches and results.

Notes

  1. Especially because, in Manhattan alone, about 2.8 million taxi trips take place each week [8].
  2. http://www.nyc.gov/html/dot/html/about/vz_datafeeds.shtml
  3. https://azure.microsoft.com
  4. https://studio.azureml.net
  5. https://www.nytimes.com/2014/01/22/nyregion/east-coast-snowstorm-takes-aim-at-new-york-region.html
  6. https://developers.google.com/maps/documentation/javascript/heatmaplayer?hl=en
  7. http://btw2017-dsc-hhu.azurewebsites.net/
  8. https://developers.google.com/maps
  9. https://developer.here.com
  10. http://scikit-learn.org/stable/modules/generated/sklearn.cluster.dbscan.html
  11. http://postgrest.com
  12. https://github.com/indygemma/nyc-accident-explorer
  13. https://d3js.org/
  14. https://github.com/TobiasHildebrandt/Audiovisual-Analytics
  15. http://btw.lab.indygemma.com/tobias/vis
  16. https://vimeo.com/user25403611
  17. http://cs.univie.ac.at/wst/research/projects/project/infproj/1096
  18. https://www.citylab.com/transportation/2014/02/mapping-new-yorks-traffic-accidents/8516/
  19. https://www.interworks.com/de/blog/modonnell/2015/08/26/exploring-nyc-vehicle-crash-data-tableau
  20. https://www.youtube.com/watch?v=KJr9dn6pmn8
  21. https://github.com/elastic/examples/tree/master/ElasticStack_nyc_traffic_accidents
  22. https://flink.apache.org
  23. http://leafletjs.com/
  24. https://github.com/google/guava
  25. https://en.wikipedia.org/wiki/Geohash
  26. https://github.com/davidmoten/geo/

References

  1. Australian Bureau of Statistics (2017) Time series analysis: the basics
  2. Chaudhuri S, Dayal U (1997) An overview of data warehousing and OLAP technology. ACM SIGMOD Rec 26(1):65–74
  3. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
  4. Ho T (1995) Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, pp 278–282
  5. Keim D, Andrienko G, Fekete JD, Görg C, Kohlhammer J, Melançon G (2008) Visual analytics: definition, process, and challenges. In: Information visualization. Springer, Berlin Heidelberg, pp 154–175
  6. Kisilevich S, Mansmann F, Nanni M, Rinzivillo S (2010) Spatio-temporal clustering. Springer, Boston, pp 855–874
  7. Maciejewski R, Rudolph S, Hafen R, Abusalah A, Yakout M, Ouzzani M, Cleveland WS, Grannis S, Ebert DS (2010) A visual analytics approach to understanding spatiotemporal hotspots. IEEE Trans Vis Comput Graph 16(2):205–220
  8. NYC Taxi & Limousine Commission (2017) TLC trip record data
  9. Slocum TA, McMaster RB, Kessler FC, Howard HH (2005) Thematic cartography and geographic visualization. Geographic information science. Pearson, Prentice Hall
  10. Thomas J, Kielman J (2009) Challenges for visual analytics. Inf Vis 8(4):309–314
  11. Van Brummelen G (2013) Heavenly mathematics: the forgotten art of spherical trigonometry. Princeton University Press, Princeton
  12. World Health Organization (2017) Top 10 causes of death

Acknowledgements

The organizers of the Data Science Challenge thank the participants and the jury for the time they invested. Furthermore, we would like to thank INFOS for sponsoring the prize money of the event.

This work was partially funded by the IST-Hochschule University of Applied Sciences and by the PhD program Online Participation, supported by the North Rhine-Westphalian funding scheme Fortschrittskollegs.

This work was partly funded by the German Federal Ministry of Education and Research within the project Competence Center for Scalable Data Services and Solutions (ScaDS) Dresden/Leipzig (BMBF 01IS14014B) and Explicit Privacy-Preserving Host Intrusion Detection System EXPLOIDS (BMBF 16KIS0522K).

Author information

Correspondence to Pascal Hirmer.

About this article

Cite this article

Hirmer, P., Waizenegger, T., Falazi, G. et al. The First Data Science Challenge at BTW 2017. Datenbank Spektrum 17, 207–222 (2017). https://doi.org/10.1007/s13222-017-0263-8

Keywords

  • BTW 2017
  • Challenge
  • Data science
  • Analytics
  • New York City
  • Citibike
  • Car accidents