Efficient Implementation of Computationally Complex Algorithms: Custom Instruction Approach

  • Waqar Ahmed
  • Hasan Mahmood
  • Umair Siddique
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 130)


Decoding and securing information with minimal hardware and software resources is indispensable in mission-critical and safety-critical applications. Various methodologies have been proposed in which hardware exhibits parallelism either implicitly or explicitly. In this chapter, we report an enhancement to the DLX processor instruction set for efficient implementation of the Viterbi decoding algorithm and an enhanced AES encryption algorithm. We also present results for the enhanced AES encryption algorithm on the PicoJava II processor. We create a custom permutation instruction (WUHPERM) and a custom trellis expansion instruction (Texpand) in the CPUSIM simulator on a RISC-based architecture. In addition, we implement the same WUHPERM instruction on the Mic-1 simulator, which is based on the JVM microarchitecture. The results show that execution is approximately six times faster when the WUHPERM instruction is implemented on the RISC architecture and approximately eight times faster on the stack-based architecture. Execution is approximately three times faster when the Texpand instruction is implemented on the RISC architecture.
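To make the motivation concrete, the following minimal sketch shows a generic byte permutation of a 128-bit block written as an ordinary software loop, which is the kind of multi-step work a single custom permutation instruction is meant to collapse. It is an illustration only: the exact semantics of WUHPERM are not given here, and the names `permute_bytes`, `invert`, and the example table `perm` are hypothetical.

```python
from typing import List, Sequence


def permute_bytes(block: bytes, perm: Sequence[int]) -> bytes:
    """Return a new block where output[i] = block[perm[i]].

    In software this costs one load and one store per byte; a custom
    permutation instruction would aim to do the whole rearrangement at once.
    """
    if sorted(perm) != list(range(len(block))):
        raise ValueError("perm must be a permutation of the block's byte indices")
    return bytes(block[p] for p in perm)


def invert(perm: Sequence[int]) -> List[int]:
    """Build the inverse table, so that permuting with it undoes perm."""
    inverse = [0] * len(perm)
    for i, p in enumerate(perm):
        inverse[p] = i
    return inverse


# Hypothetical example: a 16-byte (128-bit) block and an arbitrary table.
# 5 is coprime to 16, so i -> 5*i mod 16 visits every index exactly once.
block = bytes(range(16))
perm = [(5 * i) % 16 for i in range(16)]

shuffled = permute_bytes(block, perm)
restored = permute_bytes(shuffled, invert(perm))
```

Applying the inverse table restores the original block, which is the property a decryption path would rely on.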



Copyright information

© Springer Science+Business Media, LLC 2013

Authors and Affiliations

  1. Department of Electronics, Quaid-i-Azam University, Islamabad, Pakistan
  2. Research Center for Modeling and Simulation, National University of Sciences and Technology (NUST), Islamabad, Pakistan
