Fast and Generalized Polynomial Time Memory Consistency Verification

  • Amitabha Roy
  • Stephan Zeisset
  • Charles J. Fleckenstein
  • John C. Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4144)


Verifying a multi-threaded execution against a processor's memory consistency model is known to be NP-hard. However, polynomial-time algorithms exist that detect almost all failures in such executions, and these are often used in practice for microprocessor verification. We present a low-complexity, fully parallelized algorithm that checks program execution against the processor consistency model. In addition, our algorithm is general enough to support a number of consistency models without any degradation in performance. An implementation of this algorithm is currently used to verify processors in the post-silicon stage for multiple architectures.
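As a rough illustration of the polynomial-time style of check described above, the sketch below treats an execution as a directed graph of ordering constraints stored in a boolean adjacency matrix, computes its transitive closure with Warshall's algorithm, and reports a violation if any operation ends up ordered before itself. This is not the authors' algorithm: the function names, the way edges are supplied, and the four-operation example are all assumptions made for illustration, and the rules that would derive edges from a particular consistency model are omitted.

```python
from typing import List

Matrix = List[List[bool]]


def transitive_closure(adj: Matrix) -> Matrix:
    """Warshall's algorithm on a boolean adjacency matrix, O(n^3) time."""
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def has_ordering_violation(adj: Matrix) -> bool:
    """Hypothetical check: a cycle means some operation must precede itself."""
    reach = transitive_closure(adj)
    return any(reach[i][i] for i in range(len(adj)))


def build_matrix(n: int, edges: List[tuple]) -> Matrix:
    """Helper (assumed for this sketch): edge (a, b) means a is ordered before b."""
    adj = [[False] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = True
    return adj


# Four memory operations with a consistent ordering chain 0 -> 1 -> 2 -> 3.
consistent = build_matrix(4, [(0, 1), (1, 2), (2, 3)])
# Adding 3 -> 0 closes a cycle, so no valid global order exists.
inconsistent = build_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

Calling `has_ordering_violation(consistent)` yields `False`, while `has_ordering_violation(inconsistent)` yields `True`. A check of this kind is sound but incomplete: a cycle always indicates a failure, but an acyclic graph does not prove the execution is correct, which is in line with the abstract's statement that polynomial-time methods detect almost all failures rather than all of them.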


Keywords: Adjacency Matrix, Consistency Model, Single Instruction Multiple Data, Apply Rule, Device Under Test





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Amitabha Roy, Intel Corporation
  • Stephan Zeisset, Intel Corporation
  • Charles J. Fleckenstein, Intel Corporation
  • John C. Huang, Intel Corporation
