Abstract
In this paper, based on the generalized conjugate residual squared (GCRS2) algorithm of Zhang et al. (2010 Third International Conference on Information and Computing, pp 326–329, 2010) and the ideas of Gu et al. (Appl Math Comput 186:1243–1253, 2007), we present an improved generalized conjugate residual squared (IGCRS2) algorithm designed for distributed parallel environments. By changing the computation sequence of the GCRS2 algorithm, the improved algorithm reduces the number of global synchronization points from two to one. All inner products in an iteration are independent, and the communication time required for the inner products can be overlapped with useful computation. Theoretical analysis and a numerical comparison based on isoefficiency analysis show that the IGCRS2 method has better parallelism and scalability than the GCRS2 method, and that the parallel performance can be improved by a factor of about 2.
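The synchronization idea can be illustrated with a small simulation. The sketch below is not the IGCRS2 recurrence itself; it only shows, under simplifying assumptions, why two independent inner products can share a single global reduction. The "processes" are simulated inside one Python process, and the names `split`, `Reducer`, `two_sync_inner_products`, and `one_sync_inner_products` are hypothetical helpers introduced for this illustration.

```python
# Illustrative sketch only: simulates several "processes" in one Python process.
# This is NOT the IGCRS2 algorithm; all names here are hypothetical.

def split(vec, nprocs):
    """Partition a vector into nprocs contiguous local blocks."""
    n = len(vec)
    bounds = [n * p // nprocs for p in range(nprocs + 1)]
    return [vec[bounds[p]:bounds[p + 1]] for p in range(nprocs)]


def local_dot(x, y):
    """Partial inner product over the entries owned by one process."""
    return sum(a * b for a, b in zip(x, y))


class Reducer:
    """Stand-in for a global sum across processes; counts synchronizations."""

    def __init__(self):
        self.sync_points = 0

    def allreduce_sum(self, per_process_values):
        # per_process_values holds one tuple of partial sums per process.
        self.sync_points += 1
        return tuple(sum(column) for column in zip(*per_process_values))


def two_sync_inner_products(r, s, u, v, nprocs, reducer):
    """Two separate reductions, one per inner product."""
    rs, ss, us, vs = (split(w, nprocs) for w in (r, s, u, v))
    (rho,) = reducer.allreduce_sum(
        [(local_dot(rs[p], ss[p]),) for p in range(nprocs)])
    (sigma,) = reducer.allreduce_sum(
        [(local_dot(us[p], vs[p]),) for p in range(nprocs)])
    return rho, sigma


def one_sync_inner_products(r, s, u, v, nprocs, reducer):
    """One merged reduction; valid only if the inner products are independent."""
    rs, ss, us, vs = (split(w, nprocs) for w in (r, s, u, v))
    partials = [(local_dot(rs[p], ss[p]), local_dot(us[p], vs[p]))
                for p in range(nprocs)]
    rho, sigma = reducer.allreduce_sum(partials)
    return rho, sigma
```

Both functions return the same pair of values, but the merged version passes through the reducer once instead of twice. On a real distributed machine the reduction would be a collective communication call, and starting it in non-blocking form is what would allow local work to proceed while the global sum is in flight.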
References
Bücker, H.M., Sauren, M.: A parallel version of the quasi-minimal residual method based on coupled two-term recurrences. In: Proceedings of Workshop on Applied Parallel Computing in Industrial Problems and Optimization (Para96), Technical University of Denmark, Springer, Lyngby (1996)
Chi, L.H., Liu, J., Liu, X.P., Hu, Q.F., Li, X.M.: An improved conjugate residual algorithm for large symmetric linear systems. In: Computational Physics, Proceedings of the Joint Conference of ICCP6 and CCP2003, pp. 325–328. Rinton Press, New Jersey (2005)
Freund, R.W., Nachtigal, N.M.: QMR: a quasi-minimal residual method for non-Hermitian linear systems. Numer. Math. 60, 315–339 (1991)
Grama, A., Gupta, A., Kumar, V.: Isoefficiency function: a scalability metric for parallel algorithms and architectures. IEEE Parallel Distrib. Technol. 1(3), 12–21 (1993)
Gu, T.X., Liu, X.P., Mo, Z.Y.: Multiple search direction conjugate gradient method I: methods and their propositions. Int. J. Comput. Math. 81(9), 1133–1143 (2004)
Gu, T.X., Zuo, X.Y., Zhang, L.T., Zhang, W.Q., Sheng, Z.Q.: An improved Bi-Conjugate residual algorithm suitable for distributed parallel computing. Appl. Math. Comput. 186, 1243–1253 (2007)
Jiang, D.D., Xu, Z.Z., Chen, Z.H., Han, Y., Xu, H.W.: Joint time-frequency sparse estimation of large-scale network traffic. Comput. Netw. 55(10), 3533–3547 (2011)
Jiang, D.D., Xu, Z.Z., Xu, H.W., Han, Y., Chen, Z.H., Yuan, Z.: An approximation method of origin-destination flow traffic from link load counts. Comput. Electr. Eng. 37(6), 1106–1121 (2011)
Liu, X.P., Gu, T.X., Hang, X.D., Sheng, Z.Q.: A parallel version of QMRCGSTAB method for large linear systems in distributed parallel environments. Appl. Math. Comput. 172(2), 744–752 (2006)
Saad, Y.: Iterative Methods for Sparse Linear Systems. PWS Publishing Company, Boston (1996)
Sogabe, T., Zhang, S.L.: Extended conjugate residual methods for solving nonsymmetric linear systems. In: Yuan, Y.-X. (ed.) Numerical Linear Algebra and Optimization, pp. 88–99. Science Press, Beijing (2003)
Sonneveld, P.: CGS: a fast Lanczos-type solver for nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 10(1), 36–52 (1989)
de Sturler, E., van der Vorst, H.A.: Reducing the effect of the global communication in GMRES(m) and CG on parallel distributed memory computers. Appl. Numer. Math. 18, 441–459 (1995)
de Sturler, E.: A performance model for Krylov subspace methods on mesh-based parallel computers. Parallel Comput. 22, 57–74 (1996)
van der Vorst, H.A.: Bi-CGSTAB: a fast and smoothly converging variant of bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 13, 631–644 (1992)
Yang, T.R.: The improved CGS method for large and sparse linear systems on bulk synchronous parallel architectures. In: 2002 5th Intern. Conf. Algorithms and Architectures for Parallel Processing, pp. 232–237. IEEE Computer Society (2002)
Yang, T.R., Brent, R.P.: The improved BiCGSTAB method for large and sparse nonsymmetric linear systems on parallel distributed memory architectures. In: 2001 5th Intern. Conf. Algorithms and Architectures for Parallel Processing, pp. 324–328. IEEE Computer Society (2002)
Yang, T.R., Brent, R.P.: The improved BiCG method for large and sparse linear systems on parallel distributed memory architectures. Inf. J. 6, 349–360 (2003)
Yang, T.R., Lin, H.X.: The improved quasi-minimal residual method on massively distributed memory computers. In: Proceedings of The International Conference on High Performance Computing and Networking (HPCN-97), April 1997
Zhang, L.T., Huang, T.Z., Gu, T.X., Zuo, X.Y.: An improved conjugate residual squared algorithm suitable for distributed parallel computing. Microelectron. Comput. 25(10), 12–14 (2008)
Zhang, J.H., Zhao, J.: A generalized conjugate residual squared algorithm for solving nonsymmetric linear systems. In: 2010 Third International Conference on Information and Computing, pp. 326–329 (2010)
Zhang, L.T., Zuo, X.Y., Gu, T.X., Huang, T.Z.: Conjugate residual squared method and its improvement for non-symmetric linear systems. Int. J. Comput. Math. 87(7), 1578–1590 (2010)
Acknowledgments
The authors would like to thank the referees and Editor for their helpful and detailed suggestions for revising this manuscript.
Additional information
This research is supported by the NSFC Tianyuan Mathematics Youth Fund (11226337), NSFC (11471098, 61203179, 61202098, 61170309, 91130024, 61272544, 61472462 and 11171039), the Aeronautical Science Foundation of China (2013ZD55006), the Project of Youth Backbone Teachers of Colleges and Universities in Henan Province (2013GGJS-142), the ZZIA Innovation Team Fund (2014TD02), the Major Project of the Development Foundation of Science and Technology of CAEP (2012A0202008), the Defense Industrial Technology Development Program, the Basic and Advanced Technological Research Project of Henan Province (122300410181, 142300410333), the China Postdoctoral Science Foundation (2014M552001), the Henan Province Postdoctoral Science Foundation (2013031), the Natural Science Foundation of Henan Province (13A110399, 14A630019, 14B110023), and the Natural Science Foundation of Zhengzhou City (141PQYJS560).
About this article
Cite this article
Zhang, LT., Dong, XN., Gu, TX. et al. An improved generalized conjugate residual squared (IGCRS2) algorithm suitable for distributed parallel computing. Japan J. Indust. Appl. Math. 32, 143–155 (2015). https://doi.org/10.1007/s13160-014-0163-3
DOI: https://doi.org/10.1007/s13160-014-0163-3