PopulAid: In-Memory Test Data Generation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8991)

Abstract

During software development, it is often necessary to access real customer data in order to validate requirements and performance thoroughly. However, company and legal policies often restrict access to such sensitive information. Without real data, developers have to either create their own customized test data manually or rely on standardized benchmarks. While the former tends to lack scalability and edge cases, the latter solves these issues but cannot reflect the production data distributions of a company.

In this paper, we propose PopulAid, a tool that allows developers to create customized benchmarks. We offer a convenient data generator that incorporates specific characteristics of real-world applications to generate synthetic data. Thus, companies do not need to reveal sensitive data, yet developers still have access to important development artifacts. We demonstrate our approach by generating a customized test set with medical information for developing SAP’s healthcare solution.

Notes

  1. More information (including a screencast) can be found at: https://epic.hpi.uni-potsdam.de/Home/PopulAid.

  2. More information concerning the tables and the included attributes is accessible under: http://help.sap.com/saphelp_crm60/helpdata/de/09/a4d2f5270f4e58b2358fc5519283be/content.htm.

Acknowledgments

We thank Janusch Jacoby, Benjamin Reissaus, Kai-Adrian Rollmann, and Hendrik Folkerts for their valuable contributions during the development of PopulAid.

Author information

Corresponding author

Correspondence to Ralf Teusner.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Teusner, R., Perscheid, M., Appeltauer, M., Enderlein, J., Klingbeil, T., Kusber, M. (2015). PopulAid: In-Memory Test Data Generation. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobson, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_10

  • DOI: https://doi.org/10.1007/978-3-319-20233-4_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20232-7

  • Online ISBN: 978-3-319-20233-4

  • eBook Packages: Computer Science, Computer Science (R0)
