Abstract
We study an automated verification method for functional correctness of parallel programs running on graphics processing units (GPUs). Our method is based on Kojima and Igarashi’s Hoare logic for GPU programs. Our algorithm generates verification conditions (VCs) from a program annotated with specifications and loop invariants, and passes them to off-the-shelf SMT solvers. It is often impossible, however, to solve naively generated VCs in reasonable time. A main difficulty stems from quantifiers over threads, which arise from the parallel nature of GPU programs. To overcome this difficulty, we additionally apply several transformations to simplify VCs before calling SMT solvers. Our implementation successfully verifies the correctness of several GPU programs, including matrix multiplication optimized by using shared memory. In contrast to many existing verification tools for GPU programs, our verifier succeeds in verifying fully parameterized programs: parameters such as the number of threads and the sizes of matrices are all symbolic. We empirically confirm that our simplification heuristics are highly effective for improving the efficiency of the verification procedure.
Notes
We choose these initial values to explain what happens when the control branches. These initial values do not satisfy the precondition on the first line, so the asserted invariant is not preserved during execution.
Actually, this precondition is not necessary for the correctness of the program, but if we removed it, we would need more complicated loop invariants.
Strictly speaking, the postcondition only asserts that the contents of a and b at the end of the program are equal. We use the fact that the program does not modify the contents of a to understand that this postcondition implies that the initial value of a is copied into b.
Some of the terms appearing in this expression are not well-typed. We could write \( assign (b_2, (\lambda t.i_2(t) < {len}_0), b_1, (\lambda t.i_2(t)), (\lambda t.a_0(i_2(t))))\), but for brevity we abbreviate it as above.
In general, it is not the case that all of the instances necessary to prove a formula \(\varphi \) appear in \(\varphi \). However, we believe that this strategy is sufficient in this specific case (that is, finding appropriate instances of \( assign \)).
In this case \(t+1, t+2, \dots \) are also \(\forall \)-bounds, but we do not take them into account. Practically, considering only t seems sufficient in many cases.
Several examples are found at https://fmt.ewi.utwente.nl/redmine/projects/vercors-verifier/wiki/Examples.
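The \( assign \) predicate mentioned in the note above can be illustrated with a small executable sketch. The definition of \( assign \) is not reproduced in these notes, so the reading used here is an assumption: \( assign (b', m, b, i, v)\) holds when array \(b'\) equals array \(b\) after every thread \(t\) with \(m(t)\) true has written \(v(t)\) to index \(i(t)\). The function name `assign_holds`, the finite thread count, and the concrete arrays are all hypothetical and chosen only for illustration; the sketch also assumes race freedom and rejects conflicting writes.

```python
from typing import Callable, Dict, List


def assign_holds(
    new: List[int],
    active: Callable[[int], bool],
    old: List[int],
    index: Callable[[int], int],
    value: Callable[[int], int],
    num_threads: int,
) -> bool:
    """Check an assumed reading of assign(new, active, old, index, value).

    Every active thread t writes value(t) to cell index(t) of `old`;
    the predicate holds if the resulting array equals `new`.
    """
    written: Dict[int, int] = {}
    for t in range(num_threads):
        if not active(t):
            continue  # inactive threads write nothing
        i = index(t)
        if not 0 <= i < len(old):
            raise IndexError(f"thread {t} writes out of bounds at {i}")
        v = value(t)
        if i in written and written[i] != v:
            # Race freedom is assumed, so conflicting writes are rejected.
            raise ValueError(f"conflicting writes to index {i}")
        written[i] = v
    if len(new) != len(old):
        return False
    # Cells that no active thread wrote keep their old contents.
    return all(new[n] == written.get(n, old[n]) for n in range(len(old)))


# Hypothetical instance mirroring the note: thread t copies a0[i2(t)] into b
# whenever i2(t) < len0.
a0 = [10, 20, 30]
b1 = [0, 0, 0]
len0 = len(a0)


def i2(t: int) -> int:
    return t


def is_active(t: int) -> bool:
    return i2(t) < len0


def copied_value(t: int) -> int:
    return a0[i2(t)]


print(assign_holds([10, 20, 30], is_active, b1, i2, copied_value, 4))
```

With four threads and three array cells, the fourth thread is masked out by the condition `i2(t) < len0`, which is the role played by the mask argument in the abbreviated expression.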
References
Asakura, I., Masuhara, H., Aotani, T.: Proof of soundness of concurrent separation logic for GPGPU in Coq. J. Inf. Process. 24(1), 132–140 (2016)
Betts, A., Chong, N., Donaldson, A.F., Ketema, J., Qadeer, S., Thomson, P., Wickerson, J.: The design and implementation of a verification technique for GPU kernels. ACM Trans. Program. Lang. Syst. 37(3), 10:1–10:49 (2015). doi:10.1145/2743017
Blom, S., Huisman, M., Mihelčić, M.: Specification and verification of GPGPU programs. Sci. Comput. Program. 95(3), 376–388 (2014)
Bobot, F., Filliâtre, J.C., Marché, C., Paskevich, A.: Why3: Shepherd Your Herd of Provers. In: Boogie 2011: 1st International Workshop on Intermediate Verification Languages, pp. 53–64. Wroclaw, Poland (2011). URL https://hal.inria.fr/hal-00790310
Bozga, M., Iosif, R.: On decidability within the arithmetic of addition and divisibility. In: Sassone, V. (ed.) Proceedings of FOSSACS 2005, Springer LNCS, vol. 3441, pp. 425–439. (2005). doi:10.1007/978-3-540-31982-5_27
Cachera, D., Jensen, T.P., Jobin, A., Kirchner, F.: Inference of polynomial invariants for imperative programs: A farewell to Gröbner bases. Sci. Comput. Program. 93, 89–109 (2014). doi:10.1016/j.scico.2014.02.028
Collingbourne, P., Cadar, C., Kelly, P.H.: Symbolic testing of OpenCL code. In: Eder, K., Lourenço, J.A., Shehory, O. (eds.) Proceedings of Hardware and Software: Verification and Testing, Springer LNCS, vol. 7261, pp. 203–218. Springer Verlag (2012). doi:10.1007/978-3-642-34188-5_18
Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. J. ACM 52(3), 365–473 (2005). doi:10.1145/1066100.1066102
Ernst, M.D., Perkins, J.H., Guo, P.J., McCamant, S., Pacheco, C., Tschantz, M.S., Xiao, C.: The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1–3), 35–45 (2007). doi:10.1016/j.scico.2007.01.015
Flanagan, C., Leino, K.R.M.: Houdini, an annotation assistant for ESC/Java. In: Oliveira, J.N., Zave, P. (eds.) Proceedings of International Symposium of Formal Methods Europe (FME 2001), Springer LNCS, vol. 2021, pp. 500–517. Springer (2001). doi:10.1007/3-540-45251-6_29
Flanagan, C., Saxe, J.B.: Avoiding exponential explosion: Generating compact verification conditions. In: Proceedings of ACM POPL, POPL ’01, pp. 193–205. ACM, New York, NY, USA (2001). doi:10.1145/360204.360220
Garg, P., Löding, C., Madhusudan, P., Neider, D.: ICE: A robust framework for learning invariants. In: Biere, A., Bloem, R. (eds.) Proceedings of 26th International Conference on Computer Aided Verification (CAV 2014), Springer LNCS, vol. 8559, pp. 69–87. Springer (2014). doi:10.1007/978-3-319-08867-9_5
King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–394 (1976). doi:10.1145/360248.360252
Kojima, K., Igarashi, A.: A Hoare Logic for SIMT programs. In: Chieh Shan, C. (ed.) Proceedings of Asian Symposium on Programming Languages and Systems (APLAS 2013), Springer LNCS, vol. 8301, pp. 58–73 (2013)
Kojima, K., Igarashi, A.: A Hoare logic for GPU kernels. ACM Transactions on Computational Logic (2016). To appear. A revised and extended version of [14]
Kojima, K., Imanishi, A., Igarashi, A.: Automated verification of functional correctness of race-free GPU programs. In: Blazy, S., Chechik, M. (eds.) Verified Software. Theories, Tools, and Experiments–8th International Conference, VSTTE 2016, Toronto, ON, Canada, July 17-18, 2016, Revised Selected Papers, Lecture Notes in Computer Science, vol. 9971, pp. 90–106 (2016). doi:10.1007/978-3-319-48869-1_7
Komuravelli, A., Bjørner, N., Gurfinkel, A., McMillan, K.L.: Compositional verification of procedural programs using Horn clauses over integers and arrays. In: Kaivola, R., Wahl, T. (eds.) Formal Methods in Computer-Aided Design, FMCAD 2015, Austin, Texas, USA, September 27-30, 2015, pp. 89–96. IEEE (2015)
Kovács, L., Voronkov, A.: Finding loop invariants for programs over arrays using a theorem prover. In: Chechik, M., Wirsing, M. (eds.) Fundamental Approaches to Software Engineering, Springer LNCS, vol. 5503, pp. 470–485. Springer Berlin Heidelberg (2009). doi:10.1007/978-3-642-00593-0_33
Lechner, A., Ouaknine, J., Worrell, J.: On the complexity of linear arithmetic with divisibility. In: Proceedings of 30th Annual ACM/IEEE Symposium on Logic in Computer Science, (LICS 2015), pp. 667–676. IEEE (2015). doi:10.1109/LICS.2015.67
Li, G., Gopalakrishnan, G.: Scalable SMT-based verification of GPU kernel functions. In: Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’10), pp. 187–196. ACM (2010). doi:10.1145/1882291.1882320
Li, G., Gopalakrishnan, G.: Parameterized verification of GPU kernel programs. In: IPDPS Workshop on Multicore and GPU Programming Models, Languages and Compilers Workshop, pp. 2450–2459. IEEE (2012)
Li, G., Li, P., Sawaya, G., Gopalakrishnan, G., Ghosh, I., Rajan, S.P.: GKLEE: concolic verification and test generation for GPUs. In: Ramanujam, J., Sadayappan, P. (eds.) Proc. of ACM PPoPP, pp. 215–224. ACM (2012). doi:10.1145/2145816.2145844
Li, P., Li, G., Gopalakrishnan, G.: Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC’12). IEEE Computer Society Press (2012)
Li, P., Li, G., Gopalakrishnan, G.: Practical symbolic race checking of GPU programs. In: Damkroger, T., Dongarra, J. (eds.) Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2014), pp. 179–190. IEEE (2014). doi:10.1109/SC.2014.20
McMillan, K.: Quantified invariant generation using an interpolating saturation prover. In: Ramakrishnan, C., Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of Systems, Springer LNCS, vol. 4963, pp. 413–427. Springer Berlin Heidelberg (2008). doi:10.1007/978-3-540-78800-3_31
Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: intermediate language and tools for analysis and transformation of C programs. In: Proceedings of 11th International Conference on Compiler Construction (CC 2002), Springer LNCS, vol. 2304, pp. 213–228 (2002). doi:10.1007/3-540-45937-5_16
Nguyen, H.: GPU Gems 3, first edn. Addison-Wesley Professional (2007). http://developer.nvidia.com/object/gpu-gems-3.html
NVIDIA: NVIDIA CUDA C Programming Guide (2014). URL http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
Acknowledgements
We thank anonymous reviewers for their valuable comments.
Cite this article
Kojima, K., Imanishi, A. & Igarashi, A. Automated Verification of Functional Correctness of Race-Free GPU Programs. J Autom Reasoning 60, 279–298 (2018). https://doi.org/10.1007/s10817-017-9428-2
DOI: https://doi.org/10.1007/s10817-017-9428-2