A Hexagonal Processor and Interconnect Topology for Many-Core Architecture with Dense On-Chip Networks

  • Zhibin Xiao
  • Bevan Baas
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 418)


Networks-on-chip (NoCs) are used to connect large numbers of processors in many-core architectures because they perform better than less scalable methods such as global shared buses. Among all NoC design parameters, the topology defines how nodes are placed and connected, and it greatly affects the performance, energy efficiency, and circuit area of many-core processor arrays. Because of its simplicity, and because processor tiles are traditionally square or rectangular, the 2D mesh is the most widely used topology in existing on-chip networks. However, mapping applications efficiently can be a challenge when they require communication between processors that are not adjacent on the 2D mesh. Motivated by the fact that applications often have largely localized communication patterns, we propose an 8-neighbor mesh topology and a 6-neighbor topology with hexagonal-shaped processor tiles, both of which increase local connectivity while keeping much of the simplicity of a mesh-based topology. We have physically designed a 16-bit DSP processor and the corresponding processor arrays for all three topologies. A 1080p H.264/AVC residual video encoder and a 54 Mbps 802.11a/11g OFDM wireless LAN baseband receiver are mapped onto all three topologies. The 6-neighbor hexagonal grid topology incurs a 2.9% area increase per tile compared to the 4-neighbor 2D mesh, but its much more effective inter-processor interconnect yields an average total application area reduction of 21%, an average power reduction of 17%, and a total application inter-processor communication distance reduction of 19%.
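The difference in local connectivity between the three topologies can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, the hexagonal grid is assumed to use axial (q, r) coordinates, and distance is idealized as the number of nearest-neighbor hops between two tiles.

```python
from itertools import product

# Illustrative only: coordinates and function names are assumptions,
# not the addressing scheme used in the paper.

def mesh4_hops(a, b):
    """Hops on a 4-neighbor 2D mesh (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mesh8_hops(a, b):
    """Hops on an 8-neighbor mesh, where diagonal moves count as one hop."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def hex6_hops(a, b):
    """Hops on a 6-neighbor hexagonal grid, assuming axial (q, r) coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def neighbor_count(hops):
    """Count the tiles exactly one hop away from the origin tile."""
    origin = (0, 0)
    window = product(range(-1, 2), repeat=2)
    return sum(1 for tile in window if hops(origin, tile) == 1)

if __name__ == "__main__":
    for name, hops in [("4-neighbor mesh", mesh4_hops),
                       ("8-neighbor mesh", mesh8_hops),
                       ("6-neighbor hex", hex6_hops)]:
        print(name, "direct neighbors:", neighbor_count(hops))
```

Under these assumptions the three functions report 4, 8, and 6 directly connected tiles respectively. The same coordinate pair does not denote the same physical position on a square grid and on a hexagonal grid, so the hop counts are best compared within a given mapping rather than across coordinate systems.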


CMOS, many-core processor, interconnection topology, network on chip (NoC), digital signal processing (DSP)



Copyright information

© IFIP International Federation for Information Processing 2013

Authors and Affiliations

  • Zhibin Xiao (1)
  • Bevan Baas (1)
  1. Department of Electrical and Computer Engineering, University of California, Davis, Davis, USA
