Advertisement

Data Warehousing Using Hadoop

  • Sameer Wadkar
  • Madhu Siddalingaiah
Chapter

Abstract

The Hadoop platform supports several data warehousing solutions, including Apache Hive, Impala, and Shark. These solutions are conceptually similar to relational databases at much larger scale but differ in their implementation and usage model. Relational databases are often used in transactional systems in which single row inserts, updates, and deletes must be executed atomically. Efficient indexing and referential integrity with primary/foreign keys allow modern relational databases to find records quickly and guarantee that all data satisfies a strict schema. Relational databases try to avoid full table scans whenever possible because I/O bandwidth is limited and tends to be the bottleneck in these systems.

Keywords

Relational Database Abstract Syntax Tree Hadoop Cluster Partition Column Select Query 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Sameer Wadkar and Madhu Siddalingaiah 2014

Authors and Affiliations

  • Sameer Wadkar
    • 1
  • Madhu Siddalingaiah
    • 1
  1. 1.MDUS

Personalised recommendations