Memory Bandwidth Requirements of Tile-Based Rendering

  • Iosif Antochi
  • Ben Juurlink
  • Stamatis Vassiliadis
  • Petri Liuha
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3133)


Because mobile phones are omnipresent and equipped with displays, they are attractive platforms for rendering 3D images. However, because they are powered by batteries, a graphics accelerator for mobile phones should dissipate as little energy as possible. Since external memory accesses consume a significant amount of power, techniques that reduce the amount of external data traffic also reduce the power consumption. A technique that looks promising is tile-based rendering. This technique decomposes a scene into tiles and renders the tiles one by one. This allows the color components and z values of one tile to be stored in small, on-chip buffers, so that only the pixels visible in the final scene need to be stored in the external frame buffer. However, in a tile-based renderer each triangle may need to be sent to the graphics accelerator more than once, since it might overlap more than one tile. In this paper we measure the total amount of external data traffic produced by conventional and tile-based renderers using several representative OpenGL benchmark scenes. The results show that employing a tile size of 32 × 32 pixels generally yields the best trade-off between the amount of on-chip memory and the amount of external data traffic. In addition, the results show that overall, a tile-based architecture reduces the total amount of external data traffic by a factor of 1.96 compared to a traditional architecture.
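The trade-off described above can be illustrated with a small sketch. It is not the measurement method used in the paper: it assumes triangles are assigned to tiles by their axis-aligned bounding box (a conservative test), and it assumes a hypothetical 4 bytes of color plus 4 bytes of z per pixel. The names `tiles_overlapped` and `tile_buffer_bytes` are made up for this illustration.

```python
from math import floor


def tiles_overlapped(triangle, tile_size):
    """Count the tiles touched by the triangle's axis-aligned bounding box.

    Assumption: bounding-box binning. It is conservative, so it may count
    tiles that the triangle itself does not actually cover.
    """
    xs = [x for x, _ in triangle]
    ys = [y for _, y in triangle]
    first_col = floor(min(xs) / tile_size)
    last_col = floor(max(xs) / tile_size)
    first_row = floor(min(ys) / tile_size)
    last_row = floor(max(ys) / tile_size)
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def tile_buffer_bytes(tile_size, bytes_per_pixel=8):
    """On-chip storage for one tile (assumed 4 bytes color + 4 bytes z)."""
    return tile_size * tile_size * bytes_per_pixel


# A single example triangle in screen coordinates (pixels).
triangle = [(10.0, 10.0), (70.0, 20.0), (30.0, 60.0)]

for size in (16, 32, 64):
    sends = tiles_overlapped(triangle, size)
    print(f"tile {size}x{size}: triangle sent {sends} times, "
          f"on-chip buffer {tile_buffer_bytes(size)} bytes")
```

For this one triangle, smaller tiles need less on-chip memory but the triangle has to be sent more often (20, 6, and 2 times for tile sizes 16, 32, and 64), which is the tension the tile-size results in the abstract refer to.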


Keywords: External Data, Frame Buffer, Tile Size, Texture Memory, Traditional Renderer





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Iosif Antochi (1)
  • Ben Juurlink (1)
  • Stamatis Vassiliadis (1)
  • Petri Liuha (2)

  1. Computer Engineering Laboratory, Electrical Engineering, Mathematics and Computer Science Faculty, Delft University of Technology, Delft, The Netherlands
  2. NOKIA Research Center, Tampere, Finland
