Abstract
In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the platforms most widely available to scientists are clusters, grids, and cloud systems. Such infrastructures are currently undergoing revolutionary change due to the integration of many-core technologies, which provide orders-of-magnitude speed improvements for selected compute kernels. As high-performance and distributed computing systems thus become more heterogeneous and hierarchical, programming complexity increases vastly. Further complexities arise because the urgent desire for scalability, together with issues including data distribution, software heterogeneity, and ad hoc hardware availability, commonly forces scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.
In this chapter we explore the possibilities of enabling efficient and transparent use of Jungle Computing Systems in everyday scientific practice. To this end, we discuss the fundamental methodologies required for defining programming models that are tailored to the specific needs of scientific researchers. Importantly, we claim that many of these fundamental methodologies already exist today, as integrated in our Ibis high-performance distributed programming system. We also make a case for the urgent need for easy and efficient Jungle Computing in scientific practice, by exploring a set of state-of-the-art application domains. For one of these domains, we present results obtained with Ibis on a real-world Jungle Computing System. The chapter concludes by exploring fundamental research questions to be investigated in the years to come.
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Seinstra, F.J. et al. (2011). Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds. In: Cafaro, M., Aloisio, G. (eds) Grids, Clouds and Virtualization. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-049-6_8
DOI: https://doi.org/10.1007/978-0-85729-049-6_8
Publisher Name: Springer, London
Print ISBN: 978-0-85729-048-9
Online ISBN: 978-0-85729-049-6
eBook Packages: Computer Science (R0)