Crowdsourcing Fact Extraction from Scientific Literature

Seifert, Christin; Granitzer, Michael; Höfler, Patrick; Mutlu, Belgin; Sabol, Vedran; Schlegel, Kai; Bayerl, Sebastian; Stegmaier, Florian; Zwicklbauer, Stefan; Kern, Roman

doi:10.1007/978-3-642-39146-0_15

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7947)

Included in the following conference series:

International Workshop on Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data

Abstract

Scientific publications constitute an extremely valuable body of knowledge and can be seen as the roots of our civilisation. However, with the exponential growth of written publications, comparing facts and findings between different research groups and communities becomes nearly impossible. In this paper, we present a conceptual approach and a first implementation for creating an open knowledge base of scientific knowledge mined from research publications. This requires extracting facts, mostly empirical observations, from unstructured texts (mainly PDFs). Because facts must be extracted with high accuracy and automatic methods are imprecise, human quality control is of utmost importance. To establish such quality control mechanisms, we rely on intelligent visual interfaces and on a toolset for crowdsourcing fact extraction, text mining and data integration tasks.

Author information

Authors and Affiliations

University of Passau, Innstraße 33, 94032, Passau, Germany
Christin Seifert, Michael Granitzer, Kai Schlegel, Sebastian Bayerl, Florian Stegmaier & Stefan Zwicklbauer
Know-Center, Inffeldgasse 13/6, 8010, Graz, Austria
Patrick Höfler, Belgin Mutlu, Vedran Sabol & Roman Kern

Editor information

Editors and Affiliations

Institute for Medical Informatics, Statistics and Documentation (IMI), Medical University Graz, Auenbruggerplatz 2/V, 8036, Graz, Austria
Andreas Holzinger
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Viale Sarca 336, 20126, Milano, Italy
Gabriella Pasi

About this paper

Cite this paper

Seifert, C. et al. (2013). Crowdsourcing Fact Extraction from Scientific Literature. In: Holzinger, A., Pasi, G. (eds) Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data. HCI-KDD 2013. Lecture Notes in Computer Science, vol 7947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39146-0_15

DOI: https://doi.org/10.1007/978-3-642-39146-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39145-3
Online ISBN: 978-3-642-39146-0
eBook Packages: Computer Science, Computer Science (R0)
