Datenbank-Spektrum, Volume 19, Issue 3, pp 165–182

Particulate Matter Matters—The Data Science Challenge @ BTW 2019

  • Holger J. Meyer
  • Hannes Grunert
  • Tim Waizenegger
  • Lucas Woltmann
  • Claudio Hartmann
  • Wolfgang Lehner
  • Mahdi Esmailoghli
  • Sergey Redyuk
  • Ricardo Martinez
  • Ziawasch Abedjan
  • Ariane Ziehn
  • Tilmann Rabl
  • Volker Markl
  • Christian Schmitz
  • Dhiren Devinder Serai
  • Tatiane Escobar Gava
Schwerpunktbeitrag

Abstract

For the second time, the Data Science Challenge took place as part of the symposium “Database Systems for Business, Technology and Web” (BTW) of the Gesellschaft für Informatik (GI), held for the 18th time in 2019. The Challenge was organized by the University of Rostock and sponsored by IBM and SAP. This year, the challenge focused on the integration, analysis, and visualization of data on particulate matter pollution. After a preselection round, the accepted participants had one month to adapt the approach they had developed to a more precisely specified problem, the actual challenge. The final presentations took place at BTW 2019 in front of the prize jury and the audience. In this article, we give a brief overview of the schedule and organization of the Data Science Challenge. In addition, the participants present the problem to be solved and their solutions.

Keywords

BTW 2019, Data Science Challenge, Big Data Analytics, Particulate matter, Driving bans

Notes

Acknowledgements

The organizers of the Data Science Challenge would like to take this opportunity to thank the participants and jury members for their contributions, especially Ute Schuerfeld and Stefan Goers for their valuable support throughout the whole process. In addition, we would like to thank IBM and SAP for sponsoring the Challenge.

The TU Dresden would like to thank Elke Sähn from the Fraunhofer-Institut für Verkehrs- und Infrastruktursysteme IVI for her substantial input as its domain expert.

The TU Berlin would like to acknowledge the German Federal Ministry of Transport and Digital Infrastructure (BMVI), which funded the project DAYSTREAM under grant number 19F2031D in the context of the research initiative mFUND; some of the tools and techniques used in this research are based on or inspired by that project. The work is also supported by the BZML under grant number 01IS18037A, BBDC 2 under grant number 01IS18025A, ECDF, and the HEIBRiDS graduate school.

Copyright information

© Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Holger J. Meyer (1, corresponding author)
  • Hannes Grunert (1)
  • Tim Waizenegger (2)
  • Lucas Woltmann (3)
  • Claudio Hartmann (3)
  • Wolfgang Lehner (3)
  • Mahdi Esmailoghli (4)
  • Sergey Redyuk (4)
  • Ricardo Martinez (5)
  • Ziawasch Abedjan (4, 5)
  • Ariane Ziehn (5)
  • Tilmann Rabl (6)
  • Volker Markl (4, 5)
  • Christian Schmitz (7)
  • Dhiren Devinder Serai (7)
  • Tatiane Escobar Gava (7)
  1. Institut für Informatik, Universität Rostock, Rostock, Germany
  2. IBM R & D GmbH, Böblingen, Germany
  3. Database Systems Group, Technische Universität Dresden, Dresden, Germany
  4. Technische Universität Berlin, Berlin, Germany
  5. Deutsches Forschungszentrum für Künstliche Intelligenz, Berlin, Germany
  6. Hasso Plattner Institute, Potsdam, Germany
  7. IPVS, Universität Stuttgart, Stuttgart, Germany
