Thoughts on System Design for Big Data



In the context of computing with data, what exactly is a system? Generally speaking, a system is an aggregation of computing components (and the links between them) that collectively provide a solution to a problem. System design covers the choices that designers make regarding such components: hardware (e.g., servers, networks, sensors); software (e.g., operating systems, cluster managers, applications); data (e.g., collection, retention, processing); and other components that vary with the nature of each solution. There is no free lunch in system design and no silver bullet; instead, there are patterns that can jumpstart a solution, and for the most part there will always be tradeoffs. Skilled system designers learn how to deal with novel problems and ambiguity. One of the skills they practice is decomposing a complex problem into more manageable subproblems that resemble ones solvable with known patterns, then connecting the resulting components to solve the original problem. In this chapter, we put on our designer hats and explore various aspects of system design in practice by creating a hypothetical big-data solution: a productivity bot.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Amazon, Menlo Park, USA
  2. Voicera, Santa Clara, USA
