Data Mining in Life Sciences: A Case Study on SAP’s In-Memory Computing Engine

  • Conference paper
Enabling Real-Time Business Intelligence (BIRTE 2012)

Abstract

While column-oriented in-memory databases have been primarily designed to support fast OLAP queries and business intelligence applications, their analytical performance makes them a promising platform for data mining tasks found in life sciences. One such system is the HANA database, SAP’s in-memory data management solution. In this contribution, we show how HANA meets some inherent requirements of data mining in life sciences. Furthermore, we conducted a case study in the area of proteomics research. As part of this study, we implemented a proteomics analysis pipeline in HANA. We also implemented a flexible data analysis toolbox that can be used by life sciences researchers to easily design and evaluate their analysis models.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boese, JH. et al. (2013). Data Mining in Life Sciences: A Case Study on SAP’s In-Memory Computing Engine. In: Castellanos, M., Dayal, U., Rundensteiner, E.A. (eds) Enabling Real-Time Business Intelligence. BIRTE 2012. Lecture Notes in Business Information Processing, vol 154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39872-8_2

  • DOI: https://doi.org/10.1007/978-3-642-39872-8_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39871-1

  • Online ISBN: 978-3-642-39872-8

  • eBook Packages: Computer Science, Computer Science (R0)
