NOSQL Databases

  • Tomasz Wiktorski
Part of the Advanced Information and Knowledge Processing book series (AI&KP)


In the chapters so far, you have relied on HDFS as your storage medium. It has two major advantages for the type of processing we wanted to do: it excels at storing large files, and it enables distributed processing of these files with the help of MapReduce. HDFS is most efficient for tasks that require a pass through all data in a file (or a set of files). If you only need to access a certain element in a dataset (an operation sometimes called a point query) or a contiguous range of elements (sometimes called a range query), HDFS does not provide an efficient toolkit for the task. You are forced to scan over all elements to pick out the ones you are interested in.
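The difference between a full scan and a keyed lookup can be sketched in a few lines of Python. This is an illustration only, not HDFS code or the API of any particular database: the in-memory list `records`, the function names, and the sorted-key index built with the standard `bisect` module are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical dataset: (key, value) records, standing in for the lines of a file.
records = [(k, f"value-{k}") for k in range(0, 1000, 5)]


def scan_point(data, key):
    """Point query by full scan: every record is inspected."""
    return [v for k, v in data if k == key]


def scan_range(data, lo, hi):
    """Range query by full scan: every record is inspected."""
    return [v for k, v in data if lo <= k <= hi]


# Assumed alternative: keep the records sorted by key, so that binary search
# can jump to the relevant position instead of reading everything.
sorted_records = sorted(records)
keys = [k for k, _ in sorted_records]


def index_range(lo, hi):
    """Range query on the sorted keys: O(log n) to locate, plus the result size."""
    start = bisect_left(keys, lo)
    stop = bisect_right(keys, hi)
    return [v for _, v in sorted_records[start:stop]]


def index_point(key):
    """Point query as a range query over a single key."""
    return index_range(key, key)
```

Both approaches return the same answers, for example `scan_range(records, 10, 20)` and `index_range(10, 20)`. The scan touches all records regardless of how many match, while the sorted index only touches the matching ones after a logarithmic search.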



Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Science, Faculty of Science and Technology, University of Stavanger, Stavanger, Norway
