Cluster Computing, Volume 17, Issue 2, pp 359–369

High efficient sedimentary basin simulations on hybrid CPU-GPU clusters

  • Mei Wen
  • Huayou Su
  • Wenjie Wei
  • Nan Wu
  • Xing Cai
  • Chunyuan Zhang
Article

Abstract

The key to achieving high performance on a GPU-enhanced cluster is efficient exploitation of each GPU's powerful computing capability. Moreover, rationally balancing the workload between CPUs and GPUs can unlock additional computing power from the CPUs. In this paper, we extend our earlier work on using a hybrid CPU-GPU cluster for real-world sedimentary basin simulation by further improving the CUDA implementations involved. We also carry out a thorough analysis of the newly achieved performance. Using 1024 GPUs and 12288 CPU cores together, our best CPU-GPU hybrid implementation achieves a double-precision performance of 72.8 TFlops for simulations on a huge 131072×131072 mesh.

Keywords

  • Dual-lithology sedimentary simulation
  • CPU-GPU hybrid computing
  • CUDA programming

Notes

Acknowledgements

The authors gratefully acknowledge the support from the National Natural Science Foundation of China under NSFC Nos. 61033008, 61103080 and 61272145, SRFDP Nos. 20104307110002 and 20124307130004, Innovation in Graduate School of NUDT No. B120605, Hunan Provincial Innovation Foundation for Postgraduate under No. CX2012B030, and the FriNatek program of the Research Council of Norway under No. 214113/F20. Technical assistance from the National Supercomputing Center in Changsha is also acknowledged.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Mei Wen (1)
  • Huayou Su (1, 2, 3)
  • Wenjie Wei (2)
  • Nan Wu (1, 2)
  • Xing Cai (2, 3)
  • Chunyuan Zhang (1)

  1. School of Computer and State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China
  2. Simula Research Laboratory, Oslo, Norway
  3. Department of Informatics, University of Oslo, Oslo, Norway