We consider an application to matrix completion problems by solving (5.4) with our relaxed Interior Point algorithm for Low-Rank SDPs (IPLR), described in Algorithm 2. IPLR has been implemented in Matlab (R2018b) and all experiments have been carried out on an Intel Core i5 CPU at 1.3 GHz with 8 GB of RAM. The parameters in Algorithm 2 have been chosen as follows:
$$\begin{aligned} \mu _0 = 1,\ \sigma = 0.5,\ \eta _1 = 0.9,\ \eta _2 = \sqrt{n}, \end{aligned}$$
while the starting dual feasible approximation has been chosen as \(y_0=0, S_0=\frac{1}{2}I_n\) and \(U_0\) is defined by the first r columns of the identity matrix \(I_n\).
We considered two implementations of IPLR which differ in the strategy used to find a minimizer of \(\phi _{\mu _k}(U,y)\) (Line 3 of Algorithm 2).
Let IPLR-GS denote the implementation of IPLR where the Gauss–Seidel strategy described in Algorithm 3 is used to find a minimizer of \(\phi _{\mu _k}(U,y)\). We impose a maximum of 5 \(\ell \)-iterations and use the (possibly) preconditioned conjugate gradient (CG) method to solve the linear systems (4.13) and (4.14). We allow a maximum of 100 CG iterations and use a tolerance of \(10^{-6}\) on the relative residual of the linear systems. System (4.13) is solved with unpreconditioned CG. Regarding (4.14), for the sake of comparison, we report in the next section statistics using unpreconditioned CG and CG with the preconditioner defined by (4.19) and (4.21). In the latter case the action of the preconditioner has been implemented through the augmented system (4.22), following the procedure outlined at the end of Sect. 5. The linear system (4.23) has been solved by preconditioned CG with preconditioner (4.24), allowing a maximum of 100 CG iterations and using a tolerance of \(10^{-8}\). In fact, the linear system (4.14) becomes ill-conditioned along the IPLR iterations and the preconditioner needs to be applied with high accuracy. We will refer to the resulting method as IPLR-GS_P.
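To fix ideas, the following is a minimal Matlab sketch (not the authors' code) of the CG settings described above; the handles applyK, applyAXXAt and applyPinv, as well as the right-hand sides rhsU and rhsY, are hypothetical placeholders for the coefficient matrices of (4.13) and (4.14), the inverse action of the preconditioner and the corresponding right-hand sides.

```matlab
% Sketch of the inner linear solves, assuming hypothetical function handles
% that apply the coefficient matrices and the preconditioner inverse.
tolCG = 1e-6;    % relative residual tolerance for (4.13) and (4.14)
maxCG = 100;     % maximum number of CG iterations

% System (4.13): unpreconditioned CG
[dU, flagU] = pcg(applyK, rhsU, tolCG, maxCG);

% System (4.14): CG preconditioned by P_k of (4.19)/(4.21); the action of
% P_k^{-1} is realized through the augmented system (4.22) and is assumed to
% be computed by applyPinv via an inner PCG with tolerance 1e-8.
[dy, flagY] = pcg(applyAXXAt, rhsY, tolCG, maxCG, applyPinv);
```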
As an alternative to IPLR-GS, we considered the use of a first-order approach to perform the minimization at Line 3 of Algorithm 2. We implemented the Barzilai–Borwein method [3, 38] with a non-monotone line search following [17, Algorithm 1] and using the parameter values suggested therein. The Barzilai–Borwein method iterates until \(\Vert \nabla \phi _{\mu _k}(U_k,y_k) \Vert \le \min (10^{-3}, \mu _k)\) or a maximum of 300 iterations is reached. We refer to the resulting implementation as IPLR-BB.
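For illustration, a minimal sketch of a single Barzilai–Borwein step is given below; the handle gradPhi and the stacking of \((U,y)\) into one vector are assumptions made for the sake of the example, and the non-monotone line search of [17, Algorithm 1] that safeguards the step is omitted.

```matlab
% One Barzilai-Borwein step for minimizing phi_mu over (U, y), assuming a
% hypothetical handle gradPhi returning the gradient stacked as a vector.
z = [U(:); y];                     % current iterate, U and y stacked
g = gradPhi(z, mu);                % gradient at the current iterate
s = z - z_old;                     % difference of iterates
w = g - g_old;                     % difference of gradients
alpha = (s' * s) / (s' * w);       % BB1 steplength ((s'*w)/(w'*w) gives BB2)
z_new = z - alpha * g;             % tentative step, accepted or reduced by the
                                   % non-monotone line search in practice
```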
The recent literature on the solution of matrix completion problems is very rich and there exist many algorithms finely tailored for such problems, see e.g. [11, 14, 28, 33, 35, 37, 42, 45] just to name a few. Among these, we chose the OptSpace algorithm proposed in [28, 29] as a reference algorithm in the forthcoming tests. In fact, OptSpace compares favourably [29] with state-of-the-art solvers such as SVT [11], ADMiRA [33] and FPCA [37], and its Matlab implementation is publicly available online (see Footnote 1). OptSpace is a first-order algorithm. Assuming the solution rank r is known, it first generates a good starting guess by computing the truncated SVD (of rank r) of a suitable sparsification of the available data \(B_{\Omega }\) and then uses a gradient-type procedure to minimize the error \(\Vert B-Q\Sigma V^T\Vert _F\), where \(Q,\Sigma , V\) are the SVD factors of the current solution approximation. Since Q and V are orthonormal matrices, the minimization in these variables is performed over the Cartesian product of Grassmann manifolds, while the minimization in \(\Sigma \) is computed exactly in \(\mathbb {R}^{r\times r}\). In [29], OptSpace has been equipped with two strategies to accommodate the unknown solution rank: the first strategy aims at finding a split in the eigenvalue distribution of the sparsified (“trimmed”) matrix and at accurately approximating its singular values and the corresponding singular vectors; the second strategy starts from the singular vectors associated with the largest singular value and incrementally searches for the next singular vectors. The latter strategy yields the so-called Incremental OptSpace variant, proposed to handle ill-conditioned problems whenever an accurate approximation of the singular vector corresponding to the smallest singular value is not possible and the former strategy fails.
Matlab implementations of OptSpace and Incremental OptSpace have been employed in the next sections. We used default parameters except for the maximum number of iterations. The default value is 50 and, as reported in the next sections, it was occasionally increased to improve accuracy in the computed solution.
We perform two sets of experiments: the first aims at validating the proposed algorithms and is carried out on randomly generated problems; the second is an application of the new algorithms to real data sets.
Tests on Random Matrices
As is common practice for a preliminary assessment of new methods, in this section we report on the performance of our proposed IPLR algorithm on randomly generated matrices. We have generated random matrices both with and without noise, random nearly low-rank matrices and random mildly ill-conditioned matrices with and without noise. For the last class of matrices, which we expect to mimic practical problems reasonably well, we also report the solution statistics obtained with OptSpace.
We have generated \({\hat{n}} \times {\hat{n}}\) matrices of rank r by sampling two \({\hat{n}} \times r\) factors \(B_L\) and \(B_R\) independently, each having independently and identically distributed Gaussian entries, and setting \(B = B_L B_R^T\). The set of observed entries \(\Omega \) is sampled uniformly at random among all sets of cardinality m. The matrix B is declared recovered if the (2,1) block \({\bar{X}}\) extracted from the solution X of (5.4) satisfies
$$\begin{aligned} \Vert {\bar{X}} - B\Vert _F / \Vert B\Vert _F < 10^{-3}, \end{aligned}$$
(6.1)
see [13].
Given r, we chose m by setting \(m = c r(2{\hat{n}}-r)\) with \(c=0.01 {\hat{n}}+4\), for \({\hat{n}}=600,700,800,900,1000\). The corresponding values of m are much lower than the theoretical bound provided by [13] and recalled in Sect. 5, but in our experiments they were sufficient for IPLR to recover the sought matrix.
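A minimal Matlab sketch of the construction of one such random test instance is given below; for example, \({\hat{n}}=1000\) and \(r=5\) give \(c = 0.01\cdot 1000 + 4 = 14\) and \(m = 14\cdot 5\cdot(2\cdot 1000-5) = 139{,}650\) observed entries. The variable names are purely illustrative.

```matlab
% Random rank-r test instance and recovery test (6.1); nh stands for n_hat.
nh = 1000;  r = 5;
BL = randn(nh, r);  BR = randn(nh, r);
B  = BL * BR';                         % random rank-r matrix to be recovered
c  = 0.01 * nh + 4;
m  = round(c * r * (2 * nh - r));      % number of observed entries
Omega = randperm(nh^2, m);             % observed positions, uniformly at random
b  = B(Omega);                         % observed data B_Omega

% recovery test (6.1), with Xbar the (2,1) block of the computed solution X
recovered = norm(Xbar - B, 'fro') / norm(B, 'fro') < 1e-3;
```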
In our experiments, the accuracy level in the matrix recovery in (6.1) is always achieved by setting \(\epsilon = 10^{-4}\) in Algorithm 2.
In the forthcoming tables we report: the dimensions n and m of the resulting SDPs and the target rank r of the matrix to be recovered; and, letting X and S denote the computed solution, the final primal infeasibility \(\Vert {{\mathcal {A}}}(X)-b\Vert \), the complementarity gap \(\Vert XS-\mu I\Vert _F\), the error in the solution of the matrix completion problem \({{\mathcal {E}}}= \Vert {\bar{X}} - B\Vert _F /\Vert B\Vert _F\) and the overall CPU time in seconds.
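For clarity, a minimal sketch of how these reported quantities can be evaluated is shown below; Aop is a hypothetical handle implementing the linear operator \({{\mathcal {A}}}(\cdot)\) of (5.4), and Xbar, B are as above.

```matlab
% Quantities reported in the tables (illustrative variable names).
pinf = norm(Aop(X) - b);                        % primal infeasibility ||A(X)-b||
gap  = norm(X * S - mu * eye(n), 'fro');        % complementarity gap ||XS - mu*I||_F
E    = norm(Xbar - B, 'fro') / norm(B, 'fro');  % matrix completion error
```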
Table 2 IPLR-GS on random matrices

In Tables 2 and 3 we report statistics of IPLR-GS and IPLR-BB, respectively. We choose the starting rank r equal to the rank of the matrix B to be recovered. In the last column of Table 2 we report the overall cpu time of IPLR-GS both without preconditioner (cpu) and with preconditioner (cpu_P) in the solution of (4.14). The lowest computational time for each problem is indicated in bold.
Table 3 IPLR-BB on random matrices

As a first comment, we verified that Assumption 1 in Sect. 2 holds in our experiments. In fact, the method manages to preserve positive definiteness of the dual variable and \(\alpha _k<1\) is taken only in the early stage of the iterative process.
Secondly, we observe that both IPLR-GS and IPLR-BB provide an approximation to the solution of the sought rank; in some runs the updating procedure increases the rank, but at the subsequent iteration the downdating strategy is activated and the procedure comes back to the starting rank r. Moreover, IPLR-GS is overall less expensive than IPLR-BB in terms of cpu time, in particular as n and m increase. In fact, the cost of the linear algebra in the IPLR-GS framework is contained, as only one or two inner Gauss–Seidel iterations are performed at each outer IPLR-GS iteration, except for the very few initial ones where up to five inner Gauss–Seidel iterations are needed. To give more detail on the computational cost of both methods, in Table 4 we report some statistics of IPLR-GS and IPLR-BB for \( {\hat{n}}=900\), \(r=3\) and 8. More precisely, we report the average number of inner Gauss–Seidel iterations (avr_GS) and the average number of unpreconditioned CG iterations in the solution of (4.13) (avr_CG_1) and (4.14) (avr_CG_2) for IPLR-GS, and the average number of BB iterations (avr_BB) for IPLR-BB. We notice that the solution of SDP problems becomes more demanding as the rank increases, but both the number of BB iterations and the number of CG iterations remain reasonable.
Table 4 Statistics of IPLR-GS and IPLR-BB on random matrices with \({\hat{n}}=900\), \(r=3\) and 8

To provide an insight into the linear algebra phase, in Fig. 1 we plot the minimum nonzero eigenvalue and the maximum eigenvalue of the coefficient matrix of (4.13), i.e. \(Q_k^T (A^TA + (S_k^2 \otimes I_n)) Q_k\). We remark that the matrix depends both on the outer iteration k and on the inner Gauss–Seidel iteration \(\ell \); we drop the index \(\ell \) to simplify the notation. Eigenvalues are plotted against the inner/outer iterations for \( {\hat{n}}=100\), \(r=4\), with IPLR-GS run until \(\mu _k<10^{-7}\). In this run only one inner iteration is performed at each outer iteration, except for the first outer iteration. We also plot in the left picture of Fig. 2 the number of CG iterations versus inner/outer iterations. The figures show that the conditioning of the coefficient matrix and the overall behaviour of CG do not depend on \(\mu _k\). Moreover, Table 4 shows that unpreconditioned CG is able to reduce the relative residual below \(10^{-6}\) in a low number of iterations even in the solution of larger problems with higher rank. These considerations motivate our choice of solving (4.13) without employing any preconditioner.
We now discuss the effectiveness of the preconditioner \(P_k\) given in (4.19), with \(Z_k\) given in (4.21), in the solution of (4.14). Considering \( {\hat{n}}=100\), \(r=4\), in Fig. 3 we plot the eigenvalue distribution (in percentage) of \(A (I \otimes X_k^2) A^T\) and of \(P_k^{-1}(A (I \otimes X_k^2) A^T)\) at the first inner iteration of the outer IPLR-GS iteration corresponding to \(\mu _k \approx 1.9e\!-\!3\); we again drop the index \(\ell \). We can observe that the condition number of the preconditioned matrix is about 1.3e5, which is significantly smaller than the condition number of the original matrix (about 3.3e10). The preconditioner succeeds both in pushing the smallest eigenvalue away from zero and in reducing the largest eigenvalue. However, CG converges in a reasonable number of iterations even in the unpreconditioned case, despite the large condition number. In particular, we can observe in the right picture of Fig. 2 that preconditioned CG takes fewer than five iterations in the last stages of IPLR-GS and that most of the effort is made in the initial stage of the IPLR-GS method; in this phase the preconditioner is really effective in reducing the number of CG iterations. These considerations remain true even for larger values of \({\hat{n}}\) and r, as shown in Table 4.
Focusing on the computational cost of the preconditioner's application, we can observe from the cpu times reported in Table 2 that for \(r=3,4,5\) the employment of the preconditioner produces a great benefit, with savings that vary from \(20\%\) to \(50\%\). The overhead associated with the construction and application of the preconditioner is then more than compensated by the reduction in the number of CG iterations. The cost of applying the preconditioner increases with r, as the dimension of the diagonal blocks of \(M_k\) in (4.24) increases with r. Hence, for small values of \({\hat{n}}\) and \(r=6,7,8\) unpreconditioned CG is preferable, while for larger values of \({\hat{n}}\) the preconditioner is effective in reducing the overall computational time for \(r\le 7\). This behaviour is summarized in Fig. 4, where we plot the ratio cpu_P/cpu with respect to the dimension n and the rank (from 3 to 8).
Since in the approach proposed in this paper primal feasibility is reached gradually, it is also possible to handle data \(B_{\Omega }\) corrupted by noise. To test how the method behaves in such situations we set \({\hat{B}}_{(s,t)} = B_{(s,t)} + \eta RD_{(s,t)}\) for any \((s,t) \in \Omega \), where \(RD_{(s,t)}\) is a random scalar drawn from the standard normal distribution, generated by the Matlab function randn, and \(\eta >0\) is the level of noise. Then, we solved problem (5.4) using the corrupted data \({\hat{B}}_{(s,t)}\) to form the vector b. Note that in this case \(\Vert {{\mathcal {A}}}(B)-b\Vert _2\approx \eta \sqrt{m}\). In order to take into account the presence of noise we set \(\epsilon = \max (10^{-4},10^{-1} \eta )\) in Algorithm 2.
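A minimal sketch of this noise model, with illustrative variable names, is as follows.

```matlab
% Corrupt the observed entries with Gaussian noise of level eta.
eta  = 1e-1;                                           % noise level
Bhat = B;
Bhat(Omega) = B(Omega) + eta * randn(size(Omega));     % noisy observations
b = Bhat(Omega);                                       % data used in (5.4)
% note: ||A(B) - b|| is then approximately eta*sqrt(m)
epsilon = max(1e-4, 1e-1 * eta);                       % stopping tolerance
```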
Results of these runs are collected in Table 5 where we considered \(\eta = 10^{-1}\) and started with the target rank r. In Table 5 we also report
$$\begin{aligned} RMSE= \Vert {\bar{X}} - B\Vert _F /{\hat{n}}, \end{aligned}$$
that is, the root-mean-square error per entry. Note that the root-mean-square error per entry in the data \(B_{\Omega }\) is of the order of the noise level \(10^{-1}\), as is \(\Vert {{\mathcal {A}}}(B)-b\Vert _2/\sqrt{m}\). We can therefore claim that the matrix is recovered with acceptable accuracy, corresponding to an average error smaller than the level of noise.
Table 5 IPLR-GS_P on noisy matrices (noise level \(\eta = 10^{-1}\))

Mildly Ill-Conditioned Problems
In this subsection we compare the performance of IPLR-GS_P, OptSpace and Incremental OptSpace on mildly ill-conditioned problems with exact and noisy observations. We first consider exact observations and vary the condition number \(\kappa \) of the matrix that has to be recovered. We fixed \({\hat{n}}=600\) and \(r=6\) and, following [29], generated random matrices with a prescribed condition number \(\kappa \) and rank r as follows. Given a random matrix B generated as in the previous subsection, let \(Q\Sigma V^T\) be its SVD and let \({\tilde{Q}}\) and \({\tilde{V}}\) be the matrices formed by the first r columns of Q and V, respectively. Then, we formed the matrix \({\hat{B}}\) that has to be recovered as \({\hat{B}}={\tilde{Q}} {\tilde{\Sigma }} {\tilde{V}}^T\), where \({\tilde{\Sigma }}\) is an \(r\times r\) diagonal matrix with diagonal entries equally spaced between \({\hat{n}}\) and \({\hat{n}}/\kappa \). In Fig. 5 we plot the RMSE value against the condition number for all three solvers considered, using \(13\%\) of the entries as observations. We can observe, as noticed in [29], that OptSpace does not manage to recover mildly ill-conditioned matrices, while Incremental OptSpace improves significantly over OptSpace. According to [29], the convergence difficulties of OptSpace on these tests have to be ascribed to the singular value decomposition of the trimmed matrix needed in Step 3 of OptSpace: the singular vector corresponding to the smallest singular value cannot be approximated with enough accuracy. On the other hand, our approach is more accurate than Incremental OptSpace and its behaviour only slightly deteriorates as \(\kappa \) increases.
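A minimal sketch of this construction, under the stated assumptions, is given below.

```matlab
% Rank-r matrix with prescribed condition number kappa (B generated as before).
nh = 600;  r = 6;  kappa = 100;
[Q, ~, V] = svd(B);
Qt = Q(:, 1:r);  Vt = V(:, 1:r);            % first r singular vectors of B
Sigt = diag(linspace(nh, nh / kappa, r));   % singular values from nh to nh/kappa
Bhat = Qt * Sigt * Vt';                     % matrix to be recovered
```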
Now, let us focus on the case of noisy observations. We first fixed \(\kappa =200\) and varied the noise level. In Fig. 6 we plot the RMSE value against the noise level for all three solvers considered, using \(20\%\) of the entries as observations. Also in this case IPLR-GS_P is able to recover the matrix \({\hat{B}}\) with acceptable accuracy, corresponding to an average error smaller than the level of noise, and it outperforms both OptSpace variants when the noise level is below 0.8. In fact, for every tested noise level OptSpace recovered \({\hat{B}}\) only with an RMSE of the order of \(10^{-1}\), which is consistent only with the largest noise level tested.
In order to get a better insight into the behaviour of the method on mildly ill-conditioned and noisy problems, we fixed \(\kappa =100\) and noise level \(\eta =0.3\), and varied the percentage of known entries from 8.3% to 50%, namely we set \(m=30{,}000,\ 45{,}000,\ 60{,}000,\ 120{,}000,\ 180{,}000\). In Fig. 7 the value of RMSE is plotted against the percentage of known entries. The oracle error value \(RMSE_{or}=\eta \sqrt{(n_1 r-r^2)/m}\) given in [12] is plotted, too. We observe that in our experiments IPLR-GS_P recovers the sought matrix with RMSE values always smaller than \(1.3\,RMSE_{or}\), regardless of the condition number of the matrix. This is not the case for OptSpace and Incremental OptSpace; OptSpace can reach a comparable accuracy only if the percentage of known entries exceeds \(30\%\). As expected, for all methods the error decreases as the number of subsampled entries increases.
In summary, for mildly ill-conditioned random matrices our approach is more reliable than OptSpace and Incremental OptSpace, as the latter algorithms might struggle with computing the singular vectors of the sparsified data matrix accurately and cannot deliver precision comparable to that of IPLR. For the sake of completeness, we remark that we have also tested OptSpace on the well-conditioned random matrices reported in Tables 2, 3, 4 and 5. On these problems IPLR and OptSpace provide comparable solutions, but, as a solver specially designed for matrix completion problems, OptSpace is generally faster than IPLR.
Rank Updating
We now test the effectiveness of the rank updating/downdating strategy described in Algorithm 2. For this purpose, we run IPLR-GS_P starting from \(r=1\), with rank increment/decrement \(\delta _r = 1\), and report the results in Table 6 for \({\hat{n}}=600,800,1000\). In all runs, the target rank has been correctly identified by the updating strategy and the matrix B is well recovered. Runs in italic have been obtained allowing 10 inner Gauss–Seidel iterations; in these cases 5 inner Gauss–Seidel iterations were not enough to sufficiently reduce the residual in (2.4) and the procedure did not terminate with the correct rank. Comparing the values of the cpu time in Tables 2 and 6, we observe that the use of the rank updating strategy increases the overall time; on the other hand, it allows the rank to be modified adaptively in case a solution of (5.4) with the currently attempted rank does not exist.
Table 6 IPLR-GS_P on random matrices starting with \(r=1\)

The typical updating behaviour is illustrated in Fig. 8, where we started with rank 1 and reached the target rank 5. In the first eight iterations a solution of the current rank does not exist and therefore the procedure does not manage to reduce the primal infeasibility as expected; the rank is then increased. At iteration 9 the correct rank has been detected and the primal infeasibility drops down. Interestingly, the method attempted rank 6 at iteration 13, but quickly corrected itself and returned to rank 5, which was the right one.
The proposed approach handles well the situation where the matrix to be recovered is nearly low-rank. We recall that by Corollary 5.2 we generate a low-rank approximation \({\bar{X}}_k\), while the primal variable \(X_k\) is nearly low-rank and gradually approaches a low-rank solution. At termination, we therefore approximate the nearly low-rank matrix that has to be recovered with the low-rank solution approximation.
Letting \(\sigma _1\ge \sigma _2\ge \dots \ge \sigma _{{\hat{n}}}\) be the singular values of B, we perturbed each singular value of B by a random scalar \(\xi = 10^{-3}\eta \), where \(\eta \) is drawn from the standard normal distribution, and, using the SVD of B, we obtained a nearly low-rank matrix \({\hat{B}}\). We applied IPLR-GS_P to (5.4) with the aim of recovering the nearly low-rank matrix \({\hat{B}}\), with the tolerance in the stopping criterion set to \(\epsilon =10^{-4}\). The results reported in Table 7 are obtained starting from \(r=1\) in the rank updating strategy. In the table we also report the rank \({\bar{r}}\) of the rebuilt matrix \({\bar{X}}\). The run corresponding to rank 8, in italic in the table, has been performed allowing a maximum of 10 inner Gauss–Seidel iterations. We observe that the method always rebuilt the matrix with accuracy consistent with the stopping tolerance. The primal infeasibility is larger than the stopping tolerance, as the data b are obtained by sampling a matrix which is not low-rank and therefore the method does not manage to push the primal infeasibility below \(10^{-3}\). Finally, we note that in some runs (rank equal to 4, 5, 6) the returned matrix \({\bar{X}}\) has a rank \({\bar{r}}\) larger than that of the original matrix B. However, in this situation we can observe that \({\bar{X}}\) is nearly low-rank, as \(\sigma _i=O(10^{-3})\), \(i=r+1,\ldots ,{\bar{r}}\), while \(\sigma _i \gg 10^{-3}\), \(i=1,\ldots ,r\). Therefore the matrices are well rebuilt for each considered rank r and the presence of small singular values does not affect the updating/downdating procedure.
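A minimal sketch of this perturbation, with illustrative variable names, is shown below.

```matlab
% Nearly low-rank test matrix: perturb every singular value of the rank-r
% matrix B by 1e-3 times a standard normal scalar.
[Q, Sig, V] = svd(B);
sigma = diag(Sig);                            % singular values of B (zero beyond r)
sigma = sigma + 1e-3 * randn(size(sigma));    % perturbed spectrum, O(1e-3) tail
Bhat  = Q * diag(sigma) * V';                 % nearly low-rank matrix to recover
```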
Table 7 IPLR-GS_P starting from \(r=1\) on nearly low-rank matrices (\(\xi = 10^{-3}\)) Tests on Real Data Sets
In this section we discuss matrix completion problems arising in diverse applications, where the matrix to be recovered represents city-to-city distances, a grayscale image, game statistics in a basketball tournament or the total number of COVID-19 infections.
Low-Rank Approximation of Partially Known Matrices
We now consider an application of matrix completion where one wants to find a low-rank approximation of a matrix that is only partially known.
As the first test example, we consider a \(312 \times 312 \) matrix taken from the “City Distance Dataset” [10] and used in [11], that represents the city-to-city distances between 312 cities in the US and Canada computed from latitude/longitude data.
We sampled 30% of the matrix G of geodesic distances and computed a low-rank approximation \( {\bar{X}}\) by IPLR-GS_P, inhibiting rank updating/downdating and using \(\epsilon =10^{-4}\). We compared the obtained solution with the approximation \({\bar{X}}_{os}\) computed by OptSpace and with the best rank-r approximation \({\bar{X}}_r\) computed by truncated SVD (TSVD), which requires the knowledge of the full matrix G. We considered some small values of the rank (\(r=3,4,5\)) and in Table 8 we report the errors \({{\mathcal {E}}}_{ip}=\Vert G- {\bar{X}}\Vert _F/\Vert G\Vert _F\), \({{\mathcal {E}}}_{os}=\Vert G-{\bar{X}}_{os}\Vert _F/\Vert G\Vert _F\) and \({{\mathcal {E}}}_r=\Vert G-{\bar{X}}_r\Vert _F/\Vert G\Vert _F\). We remark that the matrix G is not nearly low-rank, and our method correctly detects that there does not exist a feasible rank-r matrix, as it is not able to decrease the primal infeasibility below 1e0. On the other hand, the error \({{\mathcal {E}}}_{ip}\) in the provided approximation, obtained using only 23% of the entries, is the same as that of the best rank-r approximation \({\bar{X}}_r\). Note that computing the rank-5 approximation is more demanding: the method requires on average 3.4 Gauss–Seidel iterations, 37 unpreconditioned CG iterations for computing \(\Delta U\) and 18 preconditioned CG iterations for computing \(\Delta y\). In contrast, the rank-3 approximation requires on average 3.8 Gauss–Seidel iterations, 18 unpreconditioned CG iterations for computing \(\Delta U\) and 10 preconditioned CG iterations for computing \(\Delta y\). As a final comment, we observe that IPLR-GS fails when \(r=5\), since unpreconditioned CG struggles with the solution of (4.14): the computed direction \(\Delta y\) is not accurate enough and the method fails to maintain S positive definite within the maximum number of allowed backtracks. Applying the preconditioner cures the problem because more accurate directions become available. The values of the error \( {{\mathcal {E}}}_{os}\) obtained with OptSpace are larger than \({{\mathcal {E}}}_r\). However, it is possible to attain comparable values for \(r=3\) and \(r=5\) provided that the default maximum number of iterations of OptSpace is increased tenfold; in these cases, OptSpace is twice and seven times faster, respectively.
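The reference quantities in this comparison can be computed as in the following minimal sketch; G is the full distance matrix, and Xbar and Xos are hypothetical names for the IPLR-GS_P and OptSpace approximations.

```matlab
% Best rank-r approximation by truncated SVD and relative errors of Table 8.
r = 3;
[Qr, Sr, Vr] = svds(G, r);                  % leading r singular triplets of G
Xr  = Qr * Sr * Vr';                        % best rank-r approximation of G
Eip = norm(G - Xbar, 'fro') / norm(G, 'fro');
Eos = norm(G - Xos,  'fro') / norm(G, 'fro');
Er  = norm(G - Xr,   'fro') / norm(G, 'fro');
```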
Table 8 TSVD, OptSpace and IPLR-GS_P for low-rank approximation of the City Distance matrix

As the second test example, we consider the problem of computing a low-rank approximation of an image that is only partially known because some pixels are missing; we analyzed the cases where the missing pixels are distributed both randomly and non-randomly (inpainting). To this purpose, we examined the Lake \(512\times 512\) original grayscale image (see Footnote 2) shown in Fig. 9a and generated the inpainted versions with 50% of the pixels missing at random (Fig. 9b) and with predetermined missing pixels (Fig. 9c).
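A minimal sketch of the randomly inpainted instance of Fig. 9b is shown below; the file name and variable names are illustrative.

```matlab
% Randomly inpainted test image: discard 50% of the pixels uniformly at random.
I = double(imread('lake.png'));   % hypothetical file name for the 512x512 image
mask = rand(size(I)) > 0.5;       % keep roughly half of the pixels
Omega = find(mask);               % linear indices of the known pixels
b = I(Omega);                     % known pixel values used as data
```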
We performed tests fixing the rank to values ranging from 10 to 150 and therefore used IPLR-BB, which is computationally less sensitive than IPLR-GS to the magnitude of the rank.
In Fig. 10 we plot the quality of the reconstruction in terms of the relative error \( {{\mathcal {E}}}\) and the PSNR (Peak Signal-to-Noise Ratio) against the rank, for IPLR-BB, OptSpace and truncated SVD. We observe that when the rank is lower than 40, IPLR-BB and TSVD give comparable results, but when the rank increases the quality obtained with IPLR-BB does not improve. As expected, since it exploits information available only from the knowledge of the full matrix, the truncated SVD continues to improve in accuracy as the rank increases. The reconstructions produced with OptSpace display noticeably worse values of both quality measures (that is, larger \({{\mathcal {E}}}\) and smaller PSNR) despite the rank increase.
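For reference, the two quality measures can be computed as in the following minimal sketch for an 8-bit grayscale image; I is the original image and Ir the rank-r reconstruction, both stored as double arrays with values in [0, 255].

```matlab
% Relative error and PSNR of a reconstructed 8-bit grayscale image.
E       = norm(I - Ir, 'fro') / norm(I, 'fro');  % relative error
mse     = mean((I(:) - Ir(:)).^2);               % mean squared error per pixel
psnrVal = 10 * log10(255^2 / mse);               % PSNR in decibels
```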
Figure 11 shows that IPLR-BB is able to recover the inpainted image of Fig. 9c and that, visually, the quality of the reconstruction benefits from a larger rank. Images restored by OptSpace are not reported since the corresponding PSNR values are approximately 10 points lower than those obtained with IPLR-BB. The quality of the reconstruction of the images in Fig. 9b and c obtained with OptSpace cannot be improved even if the maximum number of iterations is increased tenfold.
Application to Sports Game Results Predictions
Matrix completion is used in sport predictive models to forecast match statistics [27]. We consider the dataset concerning the NCAA Men's Division I Basketball Championship, in which 364 teams participate each year (see Footnote 3). The championship is organized in 32 groups, called Conferences, whose winning teams face each other in a final single-elimination tournament, called March Madness. Knowing the match statistics of games played in the regular Championship, the aim is to forecast the potential statistics of the missing matches played in the March Madness phase. In our tests, we have selected one match statistic of the 2015 Championship, namely the field goals attempted (FGA), and have built a matrix where teams are placed on rows and columns and the nonzero ij-values correspond to the FGA made by team i against team j. In this season, only 3771 matches were held and therefore we obtained a rather sparse \(364\times 364\) matrix of FGA statistics; in fact, only 5.7% of the entries of the matrix that has to be predicted are known. To validate the quality of our predictions we used the statistics of the 134 matches actually played by the teams in March Madness. We verified that in order to obtain reasonable predictions of the missing statistics the rank of the recovered matrix has to be sufficiently large. Therefore we used IPLR-BB setting the starting rank \(r=20\), rank increment \(\delta _r=10\) and \(\epsilon =10^{-3}\). The algorithm terminated recovering a matrix \({\bar{X}}\) of rank 30. In Fig. 12 we report the bar plot of the exact and predicted values for each March Madness match, with the matches numbered from 1 to 134. We note that, except for 12 mispredicted statistics, the number of field goals attempted is predicted reasonably well: the relative error between the true and the predicted statistic is smaller than \(20\%\) in 90% of the predictions.
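A minimal sketch of the assembly of the FGA matrix from the list of regular-season games is shown below; the arrays teamA, teamB, fgaA and fgaB are hypothetical and simply list, for each game, the two teams and the field goals each attempted.

```matlab
% Assemble the 364x364 FGA matrix from the regular-season game list.
nTeams = 364;
FGA = zeros(nTeams);
for k = 1:numel(teamA)
    FGA(teamA(k), teamB(k)) = fgaA(k);   % FGA made by teamA(k) against teamB(k)
    FGA(teamB(k), teamA(k)) = fgaB(k);   % FGA made by teamB(k) against teamA(k)
end
Omega = find(FGA > 0);                   % known entries (about 5.7% of the matrix)
```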
On this data set, OptSpace gave similar results to those in Fig. 12, returning a matrix of rank 2.
Application to COVID-19 Infections Missing Data Recovery
We now describe a matrix completion problem where the data are the numbers of COVID-19 infections in the provincial capitals of regions in the North of Italy. Each row and column of the matrix corresponds to a city and to a day, respectively, so that the ij-value is the total number of infected people in city i on day j. We used data made available by the Italian Protezione Civile (see Footnote 4) regarding the period between March 11th and April 4th 2020, that is, from when restrictive measures were imposed by the Italian Government until the date of paper submission. To simulate the real situation in which certain laboratories occasionally do not communicate their data to the central board, we assume that a small percentage (5%) of the data is not available. In such a case our aim is to recover the missing data and provide an estimate of the complete set of data to be used for analysis and forecasts of the COVID-19 spread. Overall, we build a \(47 \times 24\) dense matrix and attempt to recover 56 missing entries in it. We used IPLR-GS_P with starting rank \(r=2\), rank increment \(\delta _r=1\) and \(\epsilon =10^{-4}\), obtaining a matrix \({\bar{X}}\) of rank 2. The same rank is obtained using OptSpace, but only if the maximum number of its iterations is increased threefold. In Fig. 13 both the predicted and actual data (top) and the percentage error (bottom) are plotted for the two solvers. We observe that IPLR-GS_P yields an error below 10% in all but 8 cases, and in the worst case the error reaches 22%. The error obtained with OptSpace exceeds 10% in 15 cases and in one case reaches 37%.
The good results obtained with IPLR-GS_P for this small example are encouraging for applying the matrix completion approach to larger scale data sets.