Unleashing Semantics of Research Data

  • Florian Stegmaier
  • Christin Seifert
  • Roman Kern
  • Patrick Höfler
  • Sebastian Bayerl
  • Michael Granitzer
  • Harald Kosch
  • Stefanie Lindstaedt
  • Belgin Mutlu
  • Vedran Sabol
  • Kai Schlegel
  • Stefan Zwicklbauer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8163)

Abstract

Research depends to a large degree on the availability and quality of primary research data, i.e., data generated through experiments and evaluations. While the Web in general and Linked Data in particular provide a platform and the necessary technologies for sharing, managing and utilizing research data, an ecosystem supporting those tasks is still missing. The vision of the CODE project is to establish a sophisticated ecosystem for Linked Data. Its major use case is the extraction of knowledge encapsulated in scientific research papers and the public release of that knowledge as Linked Data. Further, Visual Analytics approaches empower end users to analyse, integrate and organize data. These tasks raise specific Big Data issues.

Keywords

Linked Data, Natural Language Processing, Data Warehousing, Big Data


Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Florian Stegmaier (1)
  • Christin Seifert (1)
  • Roman Kern (2)
  • Patrick Höfler (2)
  • Sebastian Bayerl (1)
  • Michael Granitzer (1)
  • Harald Kosch (1)
  • Stefanie Lindstaedt (2)
  • Belgin Mutlu (2)
  • Vedran Sabol (2)
  • Kai Schlegel (1)
  • Stefan Zwicklbauer (1)

  1. University of Passau, Germany
  2. Know-Center, Graz, Austria
