Big Scale Text Analytics and Smart Content Navigation

  • Karsten Schmidt
  • Sebastian Bächle
  • Philipp Scholl
  • Georg Nold
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 206)

Abstract

Identifying and exploring relevant content in growing document collections is a challenge for researchers, users, and system providers alike. Supporting this is crucial for companies offering knowledge in the form of documents as their core product. Our demo shows an intelligent way of doing guided research in big text collections, using the collection of the major scientific publisher Springer SBM as an example data set. We use the SAP HANA platform for flexible text analysis, ad-hoc calculations and data linkage, in order to enhance the experience of users navigating and exploring publications. We integrate unstructured data (textual documents) and structured data (document metadata and web server logs), and provide interactive filters in order to enable a responsive user experience while searching for relevant content. With HANA, we are able to implement this functionality over big data on a single machine by making use of HANA’s SQL data store and the built-in application server.

Keywords

SAP HANA, Analytics, Information retrieval

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Karsten Schmidt (1)
  • Sebastian Bächle (1)
  • Philipp Scholl (1)
  • Georg Nold (2)

  1. SAP AG, Walldorf, Germany
  2. Springer Science+Business Media, Berlin, Germany
