
Towards Interoperable Open Statistical Data

  • Evangelos Kalampokis
  • Areti Karamanou
  • Konstantinos Tarabanis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11685)

Abstract

An important part of Open Data is statistical in nature and describes economic and social indicators that monitor population size, inflation, trade, and employment. Combining and analysing Open Data from multiple datasets and sources enables advanced data analytics scenarios that could result in valuable services and data products. However, it is still difficult to discover and combine open statistical data that reside in different data portals. Although Linked Open Statistical Data (LOSD) provide standards and approaches that facilitate combining statistics on the Web, various interoperability challenges still exist. In this paper, we define interoperability conflicts that hamper combining and analysing LOSD from different portals. To this end, we start from a thorough literature review of interoperability conflicts in databases and data warehouses. Based on this review, we define the interoperability conflicts that may appear in LOSD. We define two types of schema-level conflicts, namely naming conflicts and structural conflicts. Naming conflicts include homonyms and synonyms and result from the different URIs used in the data cubes. Structural conflicts result from different practices of modelling the structure of data cubes.
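The two kinds of naming conflict can be illustrated with a minimal sketch. It assumes the usual database-integration reading of the terms: synonyms are different URIs that denote the same concept, and homonyms are the same URI used for different concepts. The two cubes, their URIs, and the concept labels are hypothetical and do not come from any real portal or from the paper itself.

```python
# Hypothetical cubes: each maps a dimension URI to the concept it denotes.
# All URIs and concept names below are made up for illustration.
cube_a = {
    "http://example.org/a/dimension/area": "geographical area",
    "http://example.org/shared/dimension/period": "calendar year",
}
cube_b = {
    "http://example.org/b/dimension/refArea": "geographical area",
    "http://example.org/shared/dimension/period": "fiscal year",
}


def find_synonyms(a, b):
    """Return pairs of different URIs that denote the same concept."""
    return sorted(
        (uri_a, uri_b)
        for uri_a, concept_a in a.items()
        for uri_b, concept_b in b.items()
        if uri_a != uri_b and concept_a == concept_b
    )


def find_homonyms(a, b):
    """Return URIs used in both cubes but denoting different concepts."""
    return sorted(uri for uri in a.keys() & b.keys() if a[uri] != b[uri])


synonyms = find_synonyms(cube_a, cube_b)
homonyms = find_homonyms(cube_a, cube_b)
print("synonyms:", synonyms)
print("homonyms:", homonyms)
```

In this toy setting the concept behind each URI is given explicitly. In practice it would have to be inferred from labels, definitions, or links between vocabularies, which is the hard part of detecting such conflicts.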

Keywords

Open Data, Linked statistical data, Interoperability

Notes

Acknowledgments

This research is co-financed by Greece and the European Union (European Social Fund- ESF) through the Operational Program “Human Resources Development, Education and Lifelong Learning 2014–2020” in the context of the project “Integrating open statistical data using semantic technologies” (MIS 5007306).


Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  1. University of Macedonia, Thessaloniki, Greece
