The First Data Science Challenge at BTW 2017

Abstract

The 17th Conference on Database Systems for Business, Technology, and Web (BTW 2017) of the German Informatics Society (GI) took place in March 2017 at the University of Stuttgart, Germany. For the first time at a BTW conference, a Data Science Challenge was held, organized by the University of Stuttgart and the sponsor IBM. We challenged the participants to solve a data analysis task within one month and to present their results at the conference. In this article, we give an overview of how the Challenge was organized and introduce the task that the participants had to solve. In the subsequent sections, the four finalist groups describe their approaches and results.

Notes

  1. Especially because, in Manhattan alone, about 2.8 million taxi trips take place each week [8]
  2. http://www.nyc.gov/html/dot/html/about/vz_datafeeds.shtml
  3. https://azure.microsoft.com
  4. https://studio.azureml.net
  5. https://www.nytimes.com/2014/01/22/nyregion/east-coast-snowstorm-takes-aim-at-new-york-region.html
  6. https://developers.google.com/maps/documentation/javascript/heatmaplayer?hl=en
  7. http://btw2017-dsc-hhu.azurewebsites.net/
  8. https://developers.google.com/maps
  9. https://developer.here.com
  10. http://scikit-learn.org/stable/modules/generated/sklearn.cluster.dbscan.html
  11. http://postgrest.com
  12. https://github.com/indygemma/nyc-accident-explorer
  13. https://d3js.org/
  14. https://github.com/TobiasHildebrandt/Audiovisual-Analytics
  15. http://btw.lab.indygemma.com/tobias/vis
  16. https://vimeo.com/user25403611
  17. http://cs.univie.ac.at/wst/research/projects/project/infproj/1096
  18. https://www.citylab.com/transportation/2014/02/mapping-new-yorks-traffic-accidents/8516/
  19. https://www.interworks.com/de/blog/modonnell/2015/08/26/exploring-nyc-vehicle-crash-data-tableau
  20. https://www.youtube.com/watch?v=KJr9dn6pmn8
  21. https://github.com/elastic/examples/tree/master/ElasticStack_nyc_traffic_accidents
  22. https://flink.apache.org
  23. http://leafletjs.com/
  24. https://github.com/google/guava
  25. https://en.wikipedia.org/wiki/Geohash
  26. https://github.com/davidmoten/geo/

References

  1. Australian Bureau of Statistics (2017) Time series analysis: the basics
  2. Chaudhuri S, Dayal U (1997) An overview of data warehousing and OLAP technology. ACM SIGMOD Rec 26(1):65–74
  3. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
  4. Ho T (1995) Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, pp 278–282
  5. Keim D, Andrienko G, Fekete JD, Görg C, Kohlhammer J, Melançon G (2008) Visual analytics: definition, process, and challenges. In: Information visualization. Springer, Berlin Heidelberg, pp 154–175
  6. Kisilevich S, Mansmann F, Nanni M, Rinzivillo S (2010) Spatio-temporal clustering. Springer, Boston, pp 855–874
  7. Maciejewski R, Rudolph S, Hafen R, Abusalah A, Yakout M, Ouzzani M, Cleveland WS, Grannis S, Ebert DS (2010) A visual analytics approach to understanding spatiotemporal hotspots. IEEE Trans Vis Comput Graph 16(2):205–220
  8. NYC Taxi & Limousine Commission (2017) TLC trip record data
  9. Slocum TA, McMaster RB, Kessler FC, Howard HH (2005) Thematic cartography and geographic visualization. Geographic information science. Pearson, Prentice Hall
  10. Thomas J, Kielman J (2009) Challenges for visual analytics. Inf Vis 8(4):309–314
  11. Van Brummelen G (2013) Heavenly mathematics: the forgotten art of spherical trigonometry. Princeton University Press, Princeton
  12. World Health Organization (2017) Top 10 causes of death

Acknowledgements

The organizers of the Data Science Challenge thank the participants and the jury for the time they invested. Furthermore, we would like to thank INFOS for sponsoring the prize money of the event.

This work was partially funded by the IST-Hochschule University of Applied Sciences and by the PhD program Online Participation, supported by the North Rhine-Westphalian funding scheme Fortschrittskollegs.

This work was partly funded by the German Federal Ministry of Education and Research within the project Competence Center for Scalable Data Services and Solutions (ScaDS) Dresden/Leipzig (BMBF 01IS14014B) and Explicit Privacy-Preserving Host Intrusion Detection System EXPLOIDS (BMBF 16KIS0522K).

Author information

Corresponding author

Correspondence to Pascal Hirmer.

About this article

Cite this article

Hirmer, P., Waizenegger, T., Falazi, G. et al. The First Data Science Challenge at BTW 2017. Datenbank Spektrum 17, 207–222 (2017). https://doi.org/10.1007/s13222-017-0263-8

Keywords

  • BTW 2017
  • Challenge
  • Data science
  • Analytics
  • New York City
  • Citibike
  • Car accidents