Apache HBase and HDFS

  • Deepak Vohra


HBase runs on HDFS as its underlying filesystem and benefits from HDFS features such as data reliability, scalability, and replication. HBase stores data as StoreFiles (HFiles) on the HDFS datanodes. HFile is the HBase-specific file format, based on the TFile binary file format, and a StoreFile can be considered a lightweight wrapper around an HFile. HBase also stores its write-ahead logs (WALs) on HDFS; a WAL records data before it is written to HFiles. HBase is an HDFS client: it uses the DFSClient class, references to which appear in HBase client log messages and HBase logs, to connect to the NameNode, get the locations of blocks on the datanodes, and add data to those blocks. In this way HBase leverages the fault tolerance provided by the Hadoop Distributed File System (HDFS). HBase requires some configuration on the client side (HBase) and on the server side (HDFS).
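
As an illustration of the client-side configuration mentioned above, the following is a minimal sketch of an hbase-site.xml fragment that points HBase at an HDFS root directory. The property names hbase.rootdir and hbase.cluster.distributed are standard HBase settings; the NameNode host name, port, and path are placeholder assumptions and must be replaced with the values of the actual cluster. Server-side settings belong in hdfs-site.xml and are not shown here.

```xml
<!-- hbase-site.xml: a minimal sketch, not a complete configuration -->
<configuration>
  <property>
    <!-- Directory on HDFS under which HBase keeps its HFiles and WALs. -->
    <!-- Host, port, and path below are hypothetical placeholders. -->
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:8020/hbase</value>
  </property>
  <property>
    <!-- Run in distributed mode rather than standalone mode. -->
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```

The URI in hbase.rootdir should use the same NameNode address that the HDFS clients in the cluster use, since HBase reaches HDFS through that address.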


Keywords: Block Size, Data Block, Target Size, Bloom Filter, Large File

Copyright information

© Deepak Vohra 2016

Authors and Affiliations

  • Deepak Vohra (White Rock, Canada)
