Extrinsic and Intrinsic Text Cloning

  • Conference paper
Computer Architecture (ISCA 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6161))

Abstract

Text Cloning occurs when a processor stores the same text (program instructions) multiple times in its shared caches. Text Cloning has several causes, which we classify as either Extrinsic or Intrinsic.

Extrinsic Text Cloning can arise from user and software practices, or from middleware policies, that create multiple copies of a binary and execute those copies concurrently on the same processor.
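The extrinsic case can be illustrated with a toy model. It is only a sketch under stated assumptions and is not taken from the paper: the shared cache is assumed to identify blocks by physical address, and each separate copy of a binary is assumed to be loaded into its own physical frames. The block size, text size, base addresses, and the helper `cached_blocks` are all hypothetical.

```python
# Toy model (an illustrative assumption, not the paper's simulator):
# a shared cache that identifies blocks by physical block address.

BLOCK_SIZE = 64   # bytes per cache block (assumed)
TEXT_SIZE = 4096  # bytes of instructions per binary (assumed)


def cached_blocks(text_base_addresses, text_size):
    """Return the set of physical block addresses the cache would hold
    if every process fetched its whole text segment once."""
    blocks = set()
    for base in text_base_addresses:
        for offset in range(0, text_size, BLOCK_SIZE):
            blocks.add((base + offset) // BLOCK_SIZE)
    return blocks


# Two processes sharing one copy of the binary: same physical frames.
shared = cached_blocks([0x10000, 0x10000], TEXT_SIZE)

# Two processes running separate copies of the binary: distinct frames.
cloned = cached_blocks([0x10000, 0x20000], TEXT_SIZE)

print(len(shared), len(cloned))  # prints "64 128"
```

In this model the two separate copies occupy twice as many cache blocks as the shared copy, even though the instruction bytes are identical.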

Intrinsic Text Cloning can happen when an instruction cache is Virtually Indexed/Virtually Tagged. A simultaneous multithreaded processor that employs such a cache will map different processes of the same binary to different instruction cache space because each process has a distinct process identifier.
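A minimal sketch of the intrinsic case follows. It assumes, purely for illustration, that a Virtually Indexed/Virtually Tagged cache line is identified by the pair (process identifier, virtual block address). The function `vivt_lines`, the process identifiers, and the addresses are hypothetical and do not describe any specific processor.

```python
# Toy VIVT instruction cache model (hypothetical): each line is
# identified by (process id, virtual block address) when use_pid is True.

BLOCK_SIZE = 64   # bytes per cache block (assumed)
TEXT_SIZE = 4096  # bytes of instructions in the binary (assumed)


def vivt_lines(pids, text_vbase, text_size, use_pid=True):
    """Return the set of distinct cache lines needed when every process
    in `pids` fetches the same binary mapped at virtual address text_vbase."""
    lines = set()
    for pid in pids:
        for offset in range(0, text_size, BLOCK_SIZE):
            vblock = (text_vbase + offset) // BLOCK_SIZE
            lines.add((pid, vblock) if use_pid else vblock)
    return lines


# Two processes of the same binary, same virtual layout.
with_pid = vivt_lines([101, 102], 0x400000, TEXT_SIZE, use_pid=True)
without_pid = vivt_lines([101, 102], 0x400000, TEXT_SIZE, use_pid=False)

print(len(with_pid), len(without_pid))  # prints "128 64"
```

With the process identifier in the line identity, the same instruction blocks are held once per process; without it, the two processes would share a single set of lines.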

Text Cloning can hurt performance, especially on simultaneous multithreaded processors, because concurrent processes compete for cache space to store identical instruction blocks.

Experimental results on simultaneous multithreaded processors indicate that the performance overhead of this type of undesirable cloning is significant.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kleanthous, M., Sazeides, Y., Dikaiakos, M.D. (2011). Extrinsic and Intrinsic Text Cloning. In: Varbanescu, A.L., Molnos, A., van Nieuwpoort, R. (eds) Computer Architecture. ISCA 2010. Lecture Notes in Computer Science, vol 6161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24322-6_26

  • DOI: https://doi.org/10.1007/978-3-642-24322-6_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24321-9

  • Online ISBN: 978-3-642-24322-6

  • eBook Packages: Computer Science (R0)
