Tall-and-skinny QR factorization with approximate Householder reflectors on graphics processors


We present a novel method for the QR factorization of large tall-and-skinny matrices that introduces an approximation technique for computing the Householder vectors. On a hybrid platform equipped with a graphics processor, this approach gains a performance advantage over the conventional factorization because it reduces the amount of data transferred between the graphics accelerator and the main memory of the host. Our experiments show that, for tall-and-skinny matrices, the new approach outperforms the code in MAGMA by a large margin, and that it is very competitive for square matrices when memory transfers and CPU computations are the bottleneck of the Householder QR factorization.
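For readers unfamiliar with the baseline, the following is a minimal NumPy sketch of the conventional, unblocked Householder QR factorization applied to a tall-and-skinny matrix. It does not reproduce the approximate Householder vectors proposed here, nor any GPU offloading or MAGMA routine. The function names `householder_qr` and `form_q` and the (V, tau) storage layout are assumptions made for illustration only.

```python
import numpy as np


def householder_qr(A):
    """Conventional unblocked Householder QR (illustrative baseline only).

    Returns V (reflector vectors by column), tau (scaling factors) and the
    n-by-n upper triangular factor R, so that H_{n-1} ... H_0 A = [R; 0]
    with H_k = I - tau[k] * v_k v_k^T.
    """
    A = np.array(A, dtype=float)  # work on a copy
    m, n = A.shape
    V = np.zeros((m, n))
    tau = np.zeros(n)
    for k in range(n):
        x = A[k:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue  # column already zero: H_k is the identity
        # Choose the sign that avoids cancellation in v[0].
        alpha = -np.copysign(norm_x, x[0])
        v = x.copy()
        v[0] -= alpha
        tau[k] = 2.0 / (v @ v)
        V[k:, k] = v
        # Apply H_k to the trailing columns (including column k).
        A[k:, k:] -= tau[k] * np.outer(v, v @ A[k:, k:])
    R = np.triu(A[:n, :])
    return V, tau, R


def form_q(V, tau):
    """Accumulate the thin m-by-n orthogonal factor Q = H_0 ... H_{n-1} I[:, :n]."""
    m, n = V.shape
    Q = np.eye(m, n)
    for k in reversed(range(n)):
        v = V[k:, k]
        Q[k:, k:] -= tau[k] * np.outer(v, v @ Q[k:, k:])
    return Q


# Usage on a tall-and-skinny matrix (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8))
V, tau, R = householder_qr(A)
Q = form_q(V, tau)
```

In this baseline, each step needs the norm of the current column before the reflector can be formed and applied. On a hybrid CPU-GPU platform, the panel work around this step is the part that the conventional factorization moves between the accelerator and the host.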



This research was supported by project TIN2017-82972-R of the MINECO (Spain) and by the EU H2020 project 732631, “OPRECOMP: Open Transprecision Computing”.

