Abstract
Until a decade ago, the database world was all SQL, distributed, sometimes replicated, and fully consistent. Then, web and cloud applications emerged that needed to deal with complex big data, and NoSQL came in to address their requirements, trading consistency for scalability and availability. NewSQL is the latest technology in the big data management landscape, combining the scalability and availability of NoSQL with the consistency and usability of SQL. By blending capabilities previously available only in different kinds of database systems, such as fast data ingestion and SQL queries, and by providing online analytics over operational data, NewSQL opens up new opportunities in many application domains where real-time decisions are critical. NewSQL may also simplify data management by removing the traditional separation between NoSQL and SQL (ingest data fast, query it with SQL), as well as between operational database and data warehouse/data lake (no more ETLs!). However, a hard problem is scaling out transactions in mixed operational and analytical (HTAP) workloads over big data, possibly coming from different data stores (HDFS, SQL, NoSQL). Today, only a few NewSQL systems have solved this problem. In this paper, we make the case for NewSQL, introducing the basic principles of NewSQL systems from distributed database systems and illustrating them with Spanner and LeanXcale, two of the most advanced systems in terms of scalable transaction management.
Copyright information
© 2021 Springer-Verlag GmbH Germany, part of Springer Nature
About this chapter
Cite this chapter
Valduriez, P., Jimenez-Peris, R., Özsu, M.T. (2021). Distributed Database Systems: The Case for NewSQL. In: Hameurlain, A., Tjoa, A.M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVIII. Lecture Notes in Computer Science(), vol 12670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-63519-3_1
DOI: https://doi.org/10.1007/978-3-662-63519-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-63518-6
Online ISBN: 978-3-662-63519-3
eBook Packages: Computer Science, Computer Science (R0)