Abstract

Scientific publications constitute an extremely valuable body of knowledge and can be seen as the roots of our civilisation. However, with the exponential growth of written publications, comparing facts and findings between different research groups and communities becomes nearly impossible. In this paper, we present a conceptual approach and a first implementation for creating an open knowledge base of scientific knowledge mined from research publications. This requires extracting facts (mostly empirical observations) from unstructured texts, mainly PDFs. Because facts must be extracted with high accuracy and automatic methods are imprecise, human quality control is of utmost importance. To establish such quality control mechanisms, we rely on intelligent visual interfaces and on a toolset for crowdsourcing fact extraction, text mining and data integration tasks.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Seifert, C. et al. (2013). Crowdsourcing Fact Extraction from Scientific Literature. In: Holzinger, A., Pasi, G. (eds) Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data. HCI-KDD 2013. Lecture Notes in Computer Science, vol 7947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39146-0_15

  • DOI: https://doi.org/10.1007/978-3-642-39146-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39145-3

  • Online ISBN: 978-3-642-39146-0

  • eBook Packages: Computer Science, Computer Science (R0)
