Encyclopedia of Big Data Technologies

2019 Edition
Editors: Sherif Sakr, Albert Y. Zomaya

Big Data Warehouses for Smart Industries

Reference work entry
DOI: https://doi.org/10.1007/978-3-319-77525-8_204


Definition of Terms

Big Data Warehouse (BDW). A BDW can be defined as a scalable, high-performance, and flexible storage and processing system, capable of dealing with the ever-increasing volume, variety, and velocity of data, i.e., Big Data, while lowering the costs of traditional Data Warehousing architectures through the use of commodity hardware. Big Data poses severe difficulties for traditional data storage and processing technologies; the BDW aims to overcome these challenges and to support near real-time descriptive and predictive Big Data Analytics over huge amounts of heterogeneous data (Krishnan 2013; Russom 2016; Costa et al. 2017; Santos et al. 2017).

Smart Industry. A Smart Industry can be seen as an organization chain from any industrial sector (e.g., manufacturing, services) with high digitalization levels, which supports the replication of the physical world into a virtual world, through an environment that is highly connected...

This entry has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT (Fundação para a Ciência e Tecnologia) within the Project Scope: UID/CEC/00319/2013 and the Doctoral scholarship (PD/BDE/135101/2017) and by European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project n° 002814; Funding Reference: POCI-01-0247-FEDER-002814].


  1. Apache Hive (2017) Apache Hive documentation. Apache Software Foundation. https://cwiki.apache.org/confluence/display/Hive/Home. Accessed 12 May 2017
  2. Cattell R (2011) Scalable SQL and NoSQL data stores. ACM SIGMOD Rec 39:12–27. https://doi.org/10.1145/1978915.1978919
  3. Chen M, Mao S, Liu Y (2014) Big data: a survey. Mob Netw Appl 19:171–209. https://doi.org/10.1007/s11036-013-0489-0
  4. Costa C, Santos MY (2017a) The SusCity Big Data Warehousing approach for smart cities. In: Proceedings of international database engineering & applications symposium, p 10
  5. Costa C, Santos MY (2017b) The data scientist profile and its representativeness in the European e-competence framework and the skills framework for the information age. Int J Inf Manag 37:726–734. https://doi.org/10.1016/j.ijinfomgt.2017.07.010
  6. Costa E, Costa C, Santos MY (2017) Efficient big data modelling and organization for Hadoop Hive-based data warehouses. Coimbra, Portugal
  7. Dumbill E (2013) Making sense of big data. Big Data 1:1–2. https://doi.org/10.1089/big.2012.1503
  8. Floratou A, Minhas UF, Özcan F (2014) SQL-on-Hadoop: full circle back to shared-nothing database architectures. Proc VLDB Endow 7:1295–1306. https://doi.org/10.14778/2732977.2733002
  9. Hermann M, Pentek T, Otto B (2016) Design principles for Industrie 4.0 scenarios. In: 2016 49th Hawaii International Conference on System Sciences (HICSS), pp 3928–3937
  10. Hevner AR, March ST, Park J, Ram S (2004) Design science in information systems research. MIS Q 28:75–105
  11. Kagermann H, Wahlster W, Helbig J (2013) Recommendations for implementing the strategic initiative INDUSTRIE 4.0. National Academy of Science and Engineering, München
  12. Kimball R, Ross M (2013) The data warehouse toolkit: the definitive guide to dimensional modeling, 3rd edn. Wiley, Indianapolis
  13. Krishnan K (2013) Data warehousing in the age of big data, 1st edn. Morgan Kaufmann Publishers, San Francisco
  14. Lipcon T, Alves D, Burkert D, et al (2015) Kudu: storage for fast analytics on fast data. Cloudera. Unpublished paper from the KUDU team. http://getkudu.io/kudu.pdf
  15. Mackey G, Sehrish S, Wang J (2009) Improving metadata management for small files in HDFS. In: 2009 IEEE international conference on cluster computing and workshops, pp 1–4
  16. Manyika J, Chui M, Brown B, et al (2011) Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute
  17. Marz N, Warren J (2015) Big data: principles and best practices of scalable realtime data systems. Manning Publications Co, Shelter Island
  18. NBD-PWG (2015) NIST big data interoperability framework: volume 6, reference architecture. National Institute of Standards and Technology, Gaithersburg
  19. O’Leary DE (2014) Embedding AI and crowdsourcing in the big data lake. IEEE Intell Syst 29:70–73. https://doi.org/10.1109/MIS.2014.82
  20. Russom P (2016) Data warehouse modernization in the age of big data analytics. The Data Warehouse Institute, Renton
  21. Santos MY, Costa C, Galvão J, et al (2017) Evaluating SQL-on-Hadoop for big data warehousing on not-so-good hardware. In: Proceedings of international database engineering & applications symposium (IDEAS’17), Bristol
  22. Vale Lima F (2017) Big data warehousing em tempo real: Da Recolha ao Processamento de Dados. University of Minho, Guimarães
  23. Villars RL, Olofson CW, Eastwood M (2011) Big data: what it is and why you should care. IDC, Framingham

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. CCG – Centro de Computação Gráfica and ALGORITMI Research Centre, University of Minho, Guimarães, Portugal
  2. Department of Information Systems, ALGORITMI Research Centre, University of Minho, Guimarães, Portugal

Section editors and affiliations

  • Kamran Munir (1)
  • Antonio Pescapè (2)
  1. Computer Science and Creative Technologies, University of the West of England, Bristol, UK
  2. Department of Electrical Engineering and Information Technology, University of Napoli Federico II, Napoli, Italy