The Implementation of a Hadoop Ecosystem Portal with Virtualization Deployment

Yang, Chao-Tung; Wu, Chien-Heng; Chang, Wen-Yi; Tsai, Whey-Fone; Chan, Yu-Wei; Kristiani, Endah; Chiang, Yuan-Ping

doi:10.1007/978-3-030-02607-3_11

Conference paper
First Online: 17 October 2018

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 24)

Abstract

The need to research, analyze, process, and store big data is growing because big data is increasingly vital to development in fields such as information technology, finance, and medicine. Most big data environments are built on Hadoop or Spark. However, constructing such platforms is difficult for ordinary users, who often lack professional knowledge of and familiarity with these systems. To make big data platforms easier to use for data processing and analysis, we implemented a web user interface that integrates a big data platform including Hadoop and Spark. We then packaged the whole platform, along with the web user interface, into a virtual machine image file so that users can set up the environment and run jobs more quickly and efficiently. The web user interface reduces the difficulty of building a big data platform and saves time while also delivering excellent system performance. We also compared the performance of the web user interface with that of the command line using the HiBench benchmark suite.

Acknowledgements

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grants MOST 107-2221-E-029-008 and MOST 106-3114-E-029-003.

Author information

Authors and Affiliations

Department of Computer Science, Tunghai University, No.1727, Sec.4, Taiwan Boulevard, Xitun District, Taichung, 40704, Taiwan
Chao-Tung Yang & Yuan-Ping Chiang
High Performance Computing and Applications National Center, High-Performance Computing National Applied Research Laboratories, Hsinchu, 30076, Taiwan
Chien-Heng Wu, Wen-Yi Chang & Whey-Fone Tsai
College of Computing and Informatics, Providence University, 200, Sec.7, Taiwan Boulevard, Shalu Dist, Taichung City, 43301, Taiwan
Yu-Wei Chan
Department of Computer Science, Department of Industrial Engineering and Enterprise Information, Tunghai University, No.1727, Sec.4, Taiwan Boulevard, Xitun District, Taichung, 40704, Taiwan
Endah Kristiani

Corresponding author

Correspondence to Chao-Tung Yang.

Editor information

Editors and Affiliations

Dept De Ciències De La Computació, Universitat Politècnica De Catalunya, Barcelona, Spain
Fatos Xhafa
Tunghai University, Taichung, Taiwan
Fang-Yie Leu
Università Della Campania Luigi Vanvitelli, Caserta, Italy
Massimo Ficco
Tunghai University, Taichung, Taiwan
Chao-Tung Yang

Cite this paper

Yang, CT. et al. (2019). The Implementation of a Hadoop Ecosystem Portal with Virtualization Deployment. In: Xhafa, F., Leu, FY., Ficco, M., Yang, CT. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-030-02607-3_11

DOI: https://doi.org/10.1007/978-3-030-02607-3_11
Published: 17 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02606-6
Online ISBN: 978-3-030-02607-3
eBook Packages: Engineering, Engineering (R0)