Open Statistics: The Rise of a New Era for Open Data?

  • Evangelos Kalampokis
  • Efthimios Tambouris
  • Areti Karamanou
  • Konstantinos Tarabanis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9820)

Abstract

A large part of open data concerns statistics, such as demographic, economic and social data (henceforth referred to as Open Statistical Data, OSD). In this paper we start by introducing open data fragmentation as a major obstacle to OSD reuse. We proceed by outlining the data cube as a logical model for structuring OSD. We then introduce Open Statistics as a new area aiming to systematically study OSD. Open Statistics reuses and extends methods from diverse fields like Open Data, Statistics, Data Warehouses and the Semantic Web. In this paper, we focus on benefits and challenges of Open Statistics. The results suggest that Open Statistics provides benefits not present in any of these fields alone. We conclude that in certain cases OSD can realise the potential of open data.

Keywords

Open data, Statistical data, Open statistics, Linked data, Data analytics

Notes

Acknowledgments

This work is funded by the European Commission within the H2020 Programme in the context of the project OpenGovIntelligence (http://OpenGovIntelligence.eu) under grant agreement No. 693849.

Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Evangelos Kalampokis (1, 2)
  • Efthimios Tambouris (1, 2)
  • Areti Karamanou (1, 2)
  • Konstantinos Tarabanis (1, 2)
  1. University of Macedonia, Thessaloniki, Greece
  2. Centre for Research and Technology – Hellas, Information Technologies Institute, Thermi, Greece
