Combining Lock Inference with Lock-Based Software Transactional Memory
An atomic block is a language construct that simplifies the programming of critical sections. In the past, software transactional memory (STM) and lock inference have been used to implement atomic blocks. Both approaches have strengths and weaknesses. STM provides fine-grained locking but has high overheads due to logging and potential rollbacks. Lock inference is a static analysis that computes which locks an atomic block must acquire in order to guarantee atomicity. Lock inference avoids both logging overhead and rollbacks, but with a growing number of variables accessed in an atomic block, locking becomes coarse-grained and hence reduces parallelism.
The first contribution of this paper is an approach that combines the advantages of both techniques while avoiding their drawbacks. A compiler analysis determines whether lock inference can achieve fine-grained synchronization for an atomic block or whether STM is the better choice. The generated code then uses lock inference, STM, or a combination of both that allows the atomic block to switch from STM to lock inference during its execution. The second contribution consists of two optimizations that remove some of the limits of state-of-the-art static lock inference analysis and thereby extend its applicability. These optimizations make more atomic blocks amenable to fine-grained lock inference.
We use the STAMP benchmark suite to demonstrate the practicality of our work. Reduced contention due to fine-grained locking, together with lower transactional overhead, leads to execution times that are between \(1.1\) and \(6.0\) times faster than those of a pure STM or pure lock inference implementation.
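To make the notion of lock inference more concrete, the following is a minimal sketch of the kind of code such an analysis might conceptually generate for a small atomic block. It is not the paper's implementation: the `Account` class, the `transfer` function, the `run_demo` helper, and the id-based lock ordering are all hypothetical names and choices introduced here for illustration. The sketch assumes that the analysis can tell that the block touches exactly two objects, so one fine-grained lock per object suffices and neither logging nor rollback is needed.

```python
import threading


class Account:
    """Hypothetical shared object that carries its own fine-grained lock."""

    _next_id = 0  # assumption: accounts are created from a single thread

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()
        self.id = Account._next_id
        Account._next_id += 1


def transfer(src, dst, amount):
    """Stand-in for: atomic { src.balance -= amount; dst.balance += amount }.

    Assumed result of lock inference: the block accesses only `src` and
    `dst`, so it acquires exactly those two locks before running the body.
    """
    if src is dst:
        return  # nothing to move, and a non-reentrant lock must not be taken twice
    # Acquire the locks in a fixed global order (by id) to avoid deadlock.
    first, second = sorted((src, dst), key=lambda account: account.id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


def run_demo(threads=4, rounds=2000):
    """Run opposing transfers concurrently and return the final balances."""
    a, b = Account(1000), Account(1000)

    def worker(index):
        for _ in range(rounds):
            if index % 2 == 0:
                transfer(a, b, 1)
            else:
                transfer(b, a, 1)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return a.balance, b.balance
```

With an even number of threads, the transfers in both directions cancel out, so `run_demo()` should return the initial balances. If the block accessed a number of objects that cannot be bounded statically, a scheme like this would have to fall back to a coarser lock, which is the loss of parallelism the abstract attributes to pure lock inference.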