Invasive Computing: An Overview

  • Jürgen Teich
  • Jörg Henkel
  • Andreas Herkersdorf
  • Doris Schmitt-Landsiedel
  • Wolfgang Schröder-Preikschat
  • Gregor Snelting


A novel paradigm for designing and programming future parallel computing systems, called invasive computing, is proposed. Its main idea and novelty is to introduce resource-aware programming support: a given program gains the ability to explore and dynamically spread its computations to neighbouring processors in a phase called invasion, and then to execute portions of code with a high degree of parallelism in parallel, based on the region of a given multi-processor architecture it was able to invade. Afterwards, once the program terminates or its degree of parallelism drops again, it may enter a retreat phase, deallocate resources and resume execution, for example, sequentially on a single processor. Supporting this idea of self-adaptive and resource-aware programming requires not only new programming concepts, languages, compilers and operating systems, but also revolutionary architectural changes in the design of Multi-Processor Systems-on-a-Chip, so as to efficiently support invasion, infection and retreat operations through concepts for dynamic processor, interconnect and memory reconfiguration. This contribution presents the main ideas, potential benefits and challenges of supporting invasive computing at the architectural, programming and compiler levels. It serves as an overview of required research topics rather than a presentation of mature solutions.
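The invade/infect/retreat cycle described above can be sketched as a resource-aware programming pattern. The following is a minimal illustration only: the `ResourceManager` and `Claim` names, the thread-pool simulation of processors and the function signatures are hypothetical and are not the actual invasive-computing API proposed by the authors.

```python
from concurrent.futures import ThreadPoolExecutor

class Claim:
    """Resources won during an invade phase (hypothetical sketch)."""
    def __init__(self, workers):
        self.workers = workers

class ResourceManager:
    """Toy manager handing out a bounded pool of 'processors'."""
    def __init__(self, total=8):
        self.free = total

    def invade(self, requested):
        # Invasion: grant as many free processors as possible, up to the request.
        granted = min(requested, self.free)
        self.free -= granted
        return Claim(granted)

    def retreat(self, claim):
        # Retreat: return the claimed processors to the system.
        self.free += claim.workers
        claim.workers = 0

def infect(claim, work, items):
    """Infection: run the parallel portion of the program on the claimed resources."""
    if claim.workers == 0:
        return [work(x) for x in items]  # nothing invaded: fall back to sequential execution
    with ThreadPoolExecutor(max_workers=claim.workers) as pool:
        return list(pool.map(work, items))

rm = ResourceManager(total=8)
claim = rm.invade(requested=4)                       # invasion: explore and claim neighbours
squares = infect(claim, lambda x: x * x, range(10))  # infection: execute parallel code
rm.retreat(claim)                                    # retreat: deallocate, resume sequentially
```

The point of the sketch is the control flow, not the thread pool: how many resources the invasion yields depends on the current system state, and the program adapts its degree of parallelism accordingly instead of assuming a fixed machine size.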


B (Hardware); B.7 (Hardware: Integrated Circuits); C (Computer Systems Organisation); C.1 (Computer Systems Organisation: Processor Architectures); C.3 (Computer Systems Organisation: Special-Purpose and Application-Based Systems)



We thank the following people for their support (in alphabetical order): Dr. Tamim Asfour, Dr. Lars Bauer, Prof. Jürgen Becker, Prof. Hans-Joachim Bungartz, Prof. Rüdiger Dillmann, Prof. Michael Gerndt, Dr. Frank Hannig, Sebastian Harl, Dr. Michael Hübner, Dr. Daniel Lohmann, Prof. Peter Sanders, Prof. Ulf Schlichtmann, Prof. Marc Stamminger, Prof. Walter Stechele, Prof. Rolf Wanka, Dr. Thomas Wild and all of their scientific staff members. Finally, we would like to express our sincere gratitude to the German Research Foundation (DFG) for establishing its collaborative research centre TCRC89 on the topic of invasive computing.



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Jürgen Teich (1)
  • Jörg Henkel
  • Andreas Herkersdorf
  • Doris Schmitt-Landsiedel
  • Wolfgang Schröder-Preikschat
  • Gregor Snelting
  1. Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Erlangen, Germany
