
Performance Driven Cooperation between Kernel and Auto-tuning Multi-threaded Interval B&B Applications

  • Conference paper
Computational Science and Its Applications – ICCSA 2012 (ICCSA 2012)

Abstract

Determining the number of threads of a multi-threaded application dynamically may lead to higher efficiency than fixing that number beforehand. Interval branch-and-bound (B&B) global optimization algorithms are typically irregular and may benefit from a dynamic number of threads. The question is how to obtain the online information needed to decide on the number of threads. We experiment with a scheme that follows an SPMD (Single Program, Multiple Data) and AMP (Asynchronous Multiple Pool) model. All threads therefore execute the same code and are consequently affected by the same types of blocked time.

Several methods exist to measure the blocked time of an application. The data used here are based on the information that the Linux operating system (OS) provides for tasks: the task_interruptible and task_uninterruptible blocked time. Building on this, we define new metrics that allow the kernel and applications to collaborate through system calls in order to decide on the number of threads for an application.
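As a rough illustration of the kind of information involved, the following Python sketch classifies sampled Linux task states and applies a toy rule to adjust the thread count. It is not the mechanism described in the paper, which relies on kernel-level metrics exposed through system calls. It assumes instead that the state letter is read from user space out of `/proc/<pid>/task/<tid>/stat`, where `S` denotes interruptible sleep and `D` uninterruptible sleep. The function names, the 0.2 threshold, and the grow-or-shrink-by-one rule are hypothetical choices made only for this example.

```python
from collections import Counter

# State letters reported in field 3 of /proc/<pid>/task/<tid>/stat on Linux:
# 'R' running or runnable, 'S' interruptible sleep, 'D' uninterruptible sleep.
INTERRUPTIBLE = "S"
UNINTERRUPTIBLE = "D"


def parse_state(stat_line: str) -> str:
    """Return the one-letter task state from a /proc stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or parentheses, so we split after the last ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def blocked_fractions(states):
    """Fraction of samples found in each of the two blocked states.

    This approximates blocked time by sampling; it is not the exact
    per-task blocked time that the kernel itself can account for.
    """
    total = len(states)
    if total == 0:
        return 0.0, 0.0
    counts = Counter(states)
    return counts[INTERRUPTIBLE] / total, counts[UNINTERRUPTIBLE] / total


def suggest_threads(current, interruptible, uninterruptible,
                    max_threads, threshold=0.2):
    """Hypothetical rule (an assumption, not from the paper):
    drop one thread when the blocked fraction exceeds the threshold,
    otherwise add one thread up to max_threads.
    """
    if interruptible + uninterruptible > threshold:
        return max(1, current - 1)
    return min(max_threads, current + 1)
```

In use, a monitor would call `parse_state` on each thread's stat line at regular intervals, pass the collected letters to `blocked_fractions`, and feed the result to `suggest_threads`. Sampling keeps the sketch in user space at the cost of precision, which is one reason cooperation with the kernel can be attractive.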

This work has been funded by grants from the Spanish Ministry of Science and Innovation (TIN2008-01117), and Junta de Andalucía (P11-TIC-7176), in part financed by the European Regional Development Fund (ERDF). Eligius M. T. Hendrix is a fellow of the Spanish “Ramón y Cajal” contract program, co-financed by the European Social Fund.




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sanjuan-Estrada, J.F., Casado, L.G., García, I., Hendrix, E.M.T. (2012). Performance Driven Cooperation between Kernel and Auto-tuning Multi-threaded Interval B&B Applications. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2012. ICCSA 2012. Lecture Notes in Computer Science, vol 7333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31125-3_5


  • DOI: https://doi.org/10.1007/978-3-642-31125-3_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31124-6

  • Online ISBN: 978-3-642-31125-3

  • eBook Packages: Computer Science, Computer Science (R0)
