Data Management Challenges in Cloud Computing Infrastructures

  • Divyakant Agrawal
  • Amr El Abbadi
  • Shyam Antony
  • Sudipto Das
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5999)

Abstract

Building consistent, available, and scalable data management systems capable of serving petabytes of data for millions of users is a challenge that has confronted both the data management research community and large internet enterprises. Currently proposed solutions to scalable data management, driven primarily by prevalent application requirements, limit consistent access to the granularity of single objects, rows, or keys, thereby trading consistency for high scalability and availability. However, the growing popularity of “cloud computing”, the resulting shift of a large number of internet applications to the cloud, and the quest to provide data management services in the cloud have opened up the challenge of designing data management systems that provide consistency guarantees at a granularity larger than single rows and keys. In this paper, we analyze the design choices that allowed modern scalable data management systems to achieve orders of magnitude higher scalability than traditional databases. With this understanding, we highlight some design principles for systems providing scalable and consistent data management as a service in the cloud.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Divyakant Agrawal (1)
  • Amr El Abbadi (1)
  • Shyam Antony (1)
  • Sudipto Das (1)
  1. University of California, Santa Barbara
