Real-Time Data Mining with In-Memory Database Technology

Part of the Studies in Computational Intelligence book series (SCI, volume 445)

Abstract

In the past, classical databases came in two flavors: optimized for write access (OLTP, online transaction processing) or for read access (OLAP, online analytical processing). Typical data mining tasks, however, involve preprocessing, feature extraction, model training, and cross-validation, which cannot be fully categorized as either flavor. SAP’s in-memory database HANA stores all data in main memory in a cache-aware layout, allowing for rapid transactional and analytical access. In common three-tier architectures (database, application server, client), computationally intensive applications run at the application server layer, and data is loaded into the main memory of the application servers; enterprise applications developed for or moved to HANA, by contrast, are more tightly integrated with the database. The main principle of application development for HANA is to execute data-intensive computations in the database, close to the raw data, in order to prevent expensive data movement. This shift in application design poses new challenges to application developers: to use HANA efficiently, they have to think differently about how to design their applications. We address these challenges, discuss a real-world data analysis scenario, and present some open questions in this area.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. SAP AG, Berlin, Germany