A Service-Oriented Infrastructure for Teaching Big Data Technologies

  • Conference paper
Supercomputing (RuSCDays 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 793)

Abstract

The paper presents experience in incorporating Big Data technologies into introductory parallel and distributed computing courses and in building a service-oriented infrastructure to support practical exercises involving these technologies. The presented approach helped provide a smooth practical experience for students with different technical backgrounds by enabling them to run and test their MapReduce and Spark programs on a provided Hadoop cluster via convenient web interfaces. It also enabled automation of routine actions related to submitting programs to the cluster and evaluating programming assignments.

Acknowledgments

This work is supported by the Russian Science Foundation (project No. 16-11-10352).

Author information

Corresponding author

Correspondence to Oleg Sukhoroslov.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Sukhoroslov, O. (2017). A Service-Oriented Infrastructure for Teaching Big Data Technologies. In: Voevodin, V., Sobolev, S. (eds) Supercomputing. RuSCDays 2017. Communications in Computer and Information Science, vol 793. Springer, Cham. https://doi.org/10.1007/978-3-319-71255-0_41

  • DOI: https://doi.org/10.1007/978-3-319-71255-0_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-71254-3

  • Online ISBN: 978-3-319-71255-0

  • eBook Packages: Computer Science, Computer Science (R0)
