1 Introduction

Inverse problems arise whenever one wants to recover information about a hidden quantity from measurements acquired via a physical process (forward problem). In this paper, we are interested in linear discrete inverse problems, which can be formulated as linear systems of equations or linear least squares problems of the form

$$\begin{aligned} \textbf{A}\textbf{x}= \textbf{b},\qquad \min _{\textbf{x}}\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2, \end{aligned}$$
(1)

where \(\textbf{A}\in \mathbb {R}^{m\times n}\) is a suitable discretization of the forward operator, \(\textbf{x}\in \mathbb {R}^n\) represents a quantity of interest, and \(\textbf{b}\in \mathbb {R}^m\) is the available data, which is typically corrupted by some unknown perturbations (noise) \(\textbf{e}\), i.e.,

$$\begin{aligned} \textbf{b}= \textbf{b}_\textrm{true} + \textbf{e}= \textbf{A}\textbf{x}_\textrm{true} + \textbf{e}\,. \end{aligned}$$
(2)

To keep the following derivations simple, we assume \(m\ge n\); extensions to the \(m<n\) case are often straightforward. Many important applications, such as medical, seismic, and satellite imaging, require the solution of inverse problems; see, for instance, [1,2,3]. The test problems in TRIPs-Py model deblurring (or deconvolution) problems and computerized tomography problems; while many realistic instances of these problems can be generated using the provided synthetic data, instances of dynamic tomography problems that use real data are also included. For these kinds of problems, the matrix \(\textbf{A}\) is typically ill-conditioned, with singular values that gradually decay and cluster at zero. This implies that the solution of (1) is very sensitive to the noise in \(\textbf{b}\), and some regularization should be applied to recover a meaningful approximation of \(\textbf{x}_\textrm{true}\). All the regularization methods considered in this paper and available within TRIPs-Py compute a regularized solution \(\textbf{x}_\textrm{reg}\) by (approximately) solving the following optimization problem

$$\begin{aligned} \textbf{x}_\textrm{reg} = \arg \min _{\textbf{x}\in \mathcal {D}\subseteq \mathbb {R}^{n}}\mathcal {F}(\textbf{x}) + \alpha \mathcal {R}(\textbf{x}),\quad \alpha >0, \end{aligned}$$
(3)

where \(\mathcal {F}(\textbf{x})\) is a fit-to-data term that typically involves the matrix \(\textbf{A}\) (or a modification thereof) and the data \(\textbf{b}\), \(\alpha \) is a regularization parameter, \(\mathcal {R}\) is a regularization term, and \(\mathcal {D}\) is a set of constraints. Most of the methods in this paper take \(\mathcal {F}(\textbf{x})= \Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _p^p\), with \(p>0\), and \(\mathcal {R}(\textbf{x})=\Vert \Psi \textbf{x}\Vert _q^q\), with \(\Psi \in \mathbb {R}^{k\times n}\) and \(q>0\). The choice of p is dictated by the distribution of the noise \(\textbf{e}\) (e.g., \(p = 2\) for Gaussian white noise, \(p = 1\) for impulse noise), while the choices of q and of the regularization matrix \(\Psi \) are dictated by prior information available on \(\textbf{x}_\textrm{true}\) (e.g., if the gradient of \(\textbf{x}_\textrm{true}\) is smooth, one typically takes \(q=2\) and \(\Psi =\nabla \), i.e., a discretization of the gradient operator; if \(\textbf{x}_\textrm{true}\) is sparse one takes \(\Psi \) as the identity and \(0<q\le 1\)). Approaches for solving (3) depend on the particular choices of \(p,\,q,\,\Psi \) and the features of the problem (e.g., if \(\textbf{A}\) is small or large scale). A summary of the possible choices of the functionals \(\mathcal {F}\) and \(\mathcal {R}\) supported within TRIPs-Py is provided in Table 1, and more details are provided in Section 2. Within the TRIPs-Py solvers, the set of constraints \(\mathcal {D}\) is taken to be the whole domain \(\mathbb {R}^n\) for small-scale problems, and an appropriate linear subspace of \(\mathbb {R}^n\) for large-scale problems; choices for the latter are detailed in Section 2.2.

TRIPs-Py serves two interrelated purposes: 1) to provide model implementations of solvers for a variety of both small and large-scale linear inverse problems, and 2) to provide a range of test problems (dealing with both synthetic and real data) for the users to test the TRIPs-Py solvers or, possibly, their own solvers. TRIPs-Py is an open source package that is available through GitHub at https://github.com/trips-py/trips-py, where the users can also find installation instructions and requirements. Figure 1 shows an overview of the TRIPs-Py structure and its modules.

Fig. 1: Overview of TRIPs-Py’s structure and contents. Most of the files available in the ‘Utilities’ directory are auxiliary functions that can be used by the TRIPs-Py solvers, such as functions to set the regularization operators, or to display data and reconstructions

When designing TRIPs-Py we aimed to create Python software that is easy to use, with calls to all the solvers that are simple and consistent. More precisely, default values are provided for all the options and parameters needed by the solvers (including automatic strategies to choose the regularization parameter in (3) or the stopping iteration for iterative solvers); however, experienced users can easily set such parameters by modifying the input options. The test problems in TRIPs-Py include 1D and 2D deconvolution (deblurring) problems, and a variety of computed tomography problems that employ both synthetic and real data. All the test problems are generated with very similar instructions, and default values for the test problem generators are provided for users who are not familiar with the associated applications; as for the solvers, experienced users can easily set such problem-specific parameters by modifying the input options.

Although other packages are already available in MATLAB and Python for solving, and numerically experimenting with, inverse problems (and TRIPs-Py shares some features with them), some specifics make TRIPs-Py unique. For instance, the solvers and test problems for small-scale inverse problems in TRIPs-Py are closely modeled on the ones available in Regularization Tools [4], a pioneering MATLAB package that has proven to be very popular with the wider community of researchers in algorithms for linear inverse problems since the 1990s. TRIPs-Py shares many design objectives, solvers and test problems with IR Tools [5], a recent MATLAB package of iterative regularization methods and test problems for large-scale linear inverse problems. However, when compared to IR Tools, TRIPs-Py also features one of the first publicly available implementations of some recent methods for \(\ell _p\)-\(\ell _q\) regularization (see [6] for a MATLAB implementation of similar solvers that was developed simultaneously to TRIPs-Py), and some test problems in TRIPs-Py also employ real data. Other popular MATLAB toolboxes such as Restore Tools [7], AIR Tools II [8] and TIGRE [9] focus only on specific applications (the first is for image deblurring, the latter two for computerized tomography), while TRIPs-Py encompasses a range of small and large-scale linear inverse problems. Nowadays, as Python is very popular with researchers both inside and outside of academia, and students at universities are increasingly exposed to Python as a programming language in their courses, we envision that TRIPs-Py could be useful for the community of users of these MATLAB toolboxes who may have to switch to Python for collaborative research projects and for didactical purposes. Since many TRIPs-Py functionalities are similar to the ones underlying Regularization Tools and IR Tools, we are confident that such users will find transitioning to Python through TRIPs-Py natural. Even alongside the many powerful and popular Python packages for the solution of inverse problems, we believe that TRIPs-Py provides some valuable additions, for a variety of reasons. First, as mentioned above, TRIPs-Py’s test problems model diverse applications, while many Python packages for inverse problems are focused on a particular application. This differentiates TRIPs-Py from recent packages like CIL [10], ASTRA [11], and the Python version of TIGRE, which all target computerized tomography. Second, TRIPs-Py’s solvers for large-scale problems are based on standard or generalized Krylov subspace methods, some of which are not available elsewhere or are applied to solve non-convex, non-smooth instances of problem (3). This distinguishes TRIPs-Py from other general-purpose libraries for inverse problems like ODL [12], whose solvers are mostly based on optimization methods such as proximal gradient algorithms, primal-dual hybrid gradient algorithms, and ADMM. We anticipate that the users of these Python packages would consider using some of the TRIPs-Py solvers to tackle their applications, and to compare available or newly developed solvers with the ones in TRIPs-Py.

Table 1 List of the solvers available in TRIPs-Py

The remaining part of this paper is organized as follows: Section 2 gives an overview of the solvers available in TRIPs-Py, starting with filtering methods for small-scale problems and including many projection methods for large-scale, smooth and non-smooth regularized problems. Section 3 gives an overview of the test problems available in TRIPs-Py, together with some illustrations of the usage of solvers and test problems. Conclusions and future directions are discussed in Section 4.

2 Overview of the TRIPs-Py solvers

We start this section by discussing the TRIPs-Py solvers for regularization methods expressed in the 2-norm, i.e., to solve problem (3) with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2^2\) and \(\mathcal {R}(\textbf{x})=\Vert \Psi \textbf{x}\Vert _2^2\). This is the main part of the section, and most of the TRIPs-Py solvers are tailored to this case. The last part of this section describes a solver that can be employed for regularization methods expressed in the \(\ell _p\)-\(\ell _q\) norm, i.e., to solve problem (3) with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _p^p\) and \(\mathcal {R}(\textbf{x})=\Vert \Psi \textbf{x}\Vert _q^q\), \(p,q>0\). An overview of the solvers available within TRIPs-Py is given in Table 1. We will not give many details about the methods in this section, but rather provide extensive references for the reader. All the solvers in TRIPs-Py can be called in a consistent way that, at the very least, requires the forward operator (which can be a matrix or a function that acts on vectors) and the measured data. Additional optional inputs can be provided, such as the exact solution \(\textbf{x}_\textrm{true}\) for synthetic test problems, the maximum number of iterations to be performed and information about the stopping criterion for iterative solvers, or other method-specific options. Most of these inputs are otherwise assigned default values. All functions return the computed approximation of the solution of the inverse problem, together with a dictionary that contains additional information about the solver and typically depends on the additional optional inputs assigned when calling the function.
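To illustrate this calling convention, the following sketch shows the general pattern of a TRIPs-Py solver call; the specific names used here (a solver called gmres, and keyword arguments x_true, noise_level and max_iter) are hypothetical placeholders chosen for illustration and are not guaranteed to match the actual TRIPs-Py API.

```python
# Hypothetical sketch of the common TRIPs-Py calling pattern described above.
# The solver name 'gmres' and the keyword arguments are placeholders, not the exact API.
import numpy as np

A = np.random.rand(100, 100)                   # forward operator (a matrix or a function handle)
x_true = np.ones(100)                          # exact solution (optional, for synthetic tests)
b = A @ x_true + 1e-2 * np.random.randn(100)   # noisy measured data

# x_reg, info = gmres(A, b, x_true=x_true, noise_level=0.01, max_iter=50)
# 'info' would be a dictionary collecting, e.g., residual histories and the stopping iteration.
```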

We first survey methods that are suited for small-scale problems, followed by methods that are suited for large-scale problems; both the cases \(\Psi =\textbf{I}\) and \(\Psi \ne \textbf{I}\) will be covered.

2.1 Direct methods for small-scale problems

When problem (1) is small-scale, solvers for problems (3) expressed in the 2-norm with \(\mathcal {R}(\textbf{x})=\Vert \textbf{x}\Vert _2^2\) typically rely on the Singular Value Decomposition (SVD) of \(\textbf{A}\), i.e.,

$$ \textbf{A}= \textbf{U}{\varvec{\Sigma }}\textbf{V}^T, $$

where \(\textbf{U}\in \mathbb {R}^{m\times m}\) and \(\textbf{V}\in \mathbb {R}^{n\times n}\) are the orthogonal matrices of the left and right singular vectors, respectively, and \({\varvec{\Sigma }}\in \mathbb {R}^{m\times n}\) is the matrix formed by the diagonal matrix of the singular values \(\sigma _1\ge \sigma _2\ge \ldots \ge \sigma _n \ge 0\) on top, and an \((m-n)\times n\) matrix of zeros at the bottom. The SVD of \(\textbf{A}\) is also a useful tool to analyze the ill-posedness of problem (1). While the SVD exists for every matrix, algorithms for its computation have a cost of order \(O(mn^2)\) flops, and are therefore prohibitive for large-scale problems; see [22, §8.6] for more details.

When \(\mathcal {R}(\textbf{x})=\Vert \Psi \textbf{x}\Vert _2^2\), with \(\textbf{I}\ne \Psi \in \mathbb {R}^{k\times n}\), problems (3) are naturally handled by considering the generalized singular value decomposition (GSVD) of the matrix pair (\(\textbf{A}\), \(\Psi \)). Assume that \(m\ge n \ge k\), that \(\text {rank}(\Psi )=k\) and that the null spaces of \(\textbf{A}\) and \(\Psi \) intersect trivially. Then the GSVD of \((\textbf{A}\),\(\Psi )\) is given by

$$\begin{aligned} \textbf{A}=\widetilde{\textbf{U}}\widetilde{\Sigma }\textbf{Y}^{T},\qquad \Psi =\widetilde{\textbf{V}}\widetilde{\Lambda }\textbf{Y}^{T}, \end{aligned}$$
(4)

where \(\widetilde{\textbf{U}}\in \mathbb {R}^{m\times n}\) and \(\widetilde{\textbf{V}}\in \mathbb {R}^{k\times n}\) have orthonormal columns, \(\textbf{Y}\in \mathbb {R}^{n\times n}\) is nonsingular; \(\widetilde{\Sigma }\in \mathbb {R}^{n\times n}\) is the diagonal matrix with diagonal entries \(0\le \widetilde{\sigma }_1\le \dots \le \widetilde{\sigma }_n\le 1\), and \(\widetilde{\Lambda }\in \mathbb {R}^{k\times n}\) is the matrix formed by the diagonal matrix of the values \(1\ge \widetilde{\lambda }_1\ge \dots \ge \widetilde{\lambda }_k\ge 0\) on the left and \(k\times (n-k)\) matrix of zeros on the right. The diagonal entries of \(\widetilde{\Sigma }\) and \(\widetilde{\Lambda }\) are such that, for \(1\le i\le k\), \(\widetilde{\sigma }_i^2+\widetilde{\lambda }_i^2=1\); the quantities \(\widetilde{\sigma }_i/\widetilde{\lambda }_i\) are commonly referred to as the generalized singular values of \((\textbf{A},\Psi )\). Similarly to the SVD of \(\textbf{A}\), the cost of computing the GSVD of \((\textbf{A},\Psi )\) is prohibitive for large-scale problems, unless some structure of \(\textbf{A}\) and \(\Psi \) can be exploited; see [22, §6.1.6 and §8.7.4] for properties of the GSVD and its computation.

Next, we describe three TRIPs-Py basic regularization methods that are directly based on the SVD of \(\textbf{A}\) or the GSVD of \((\textbf{A},\Psi )\). Such methods are specific instances of the general class of (G)SVD filtering methods, whose solutions \(\textbf{x}_{\mu }\) can be expressed as

$$\begin{aligned} \begin{aligned} \textbf{x}_{\mu }&= \sum _{i=1}^n\phi _i(\mu )\frac{\textbf{u}_i^T\textbf{b}}{\sigma _i}\textbf{v}_i\quad&\text{(for } \Psi =\textbf{I}\text{) }\\\textbf{x}_{\mu }&= \sum _{i=1}^k\phi _i(\mu )\frac{\widetilde{\textbf{u}}_i^T\textbf{b}}{\widetilde{\sigma }_i}\widetilde{\textbf{y}}_i + \sum _{i=k+1}^n(\widetilde{\textbf{u}}_i^T\textbf{b})\widetilde{\textbf{y}}_i\quad&\text{(for } \Psi \ne \textbf{I}\text{) }\,, \end{aligned} \end{aligned}$$
(5)

where \(\widetilde{\textbf{y}}_i\), \(i=1,\dots ,n\), are the columns of \(\textbf{Y}^{-T}\). The scalars \(\phi _i(\mu )\), \(0\le \phi _i(\mu )\le 1\), appearing in the above sums are called filter factors. The functional expressions of \(\phi _i(\mu )\) determine different filtering methods, which all depend on the parameter \(\mu \). The basic principles underlying filtering methods can be understood by referring to the so-called ‘discrete Picard condition’, which offers insight into the relative behavior of the magnitudes of \(\textbf{u}_i^T\textbf{b}\) and \(\sigma _i\) (for \(\Psi =\textbf{I}\)), and \(\widetilde{\textbf{u}}_i^T\textbf{b}\) and \(\widetilde{\sigma }_i\) (for \(\Psi \ne \textbf{I}\)), which appear in the expression of the filtered solutions. Namely, assuming that the quantities \(|\textbf{u}_i^T\textbf{b}_\textrm{true}|\) and \(\sigma _i\), and \(|\widetilde{\textbf{u}}_i^T\textbf{b}_\textrm{true}|\) and \(\widetilde{\sigma }_i\), decay at the same rate, the noise in \(\textbf{b}\) dominates the unregularized solution for small \(\sigma _i\)’s and \(\widetilde{\sigma }_i\)’s. Therefore, filtering methods successfully compute regularized solutions when the filter factors \(\phi _i(\mu )\) are close to 0 for small \(\sigma _i\)’s and \(\widetilde{\sigma }_i\)’s, and close to 1 for large \(\sigma _i\)’s and \(\widetilde{\sigma }_i\)’s (these are the meaningful components of the solution that we wish to retain). Having too many filter factors close to 1 results in under-regularized solutions (with \(\phi _i(\mu )=1\) for every i recovering the (unregularized) solution of problem (1)); having too many filter factors close to 0 results in over-regularized solutions. The second sum in the second equality in equation (5) expresses the components of \(\textbf{x}_{\mu }\) in the null space of \(\Psi \), which are unaffected by regularization. Many strategies for choosing the regularization parameter \(\mu \) are available in the literature, and Section 2.3 provides more details about the ones implemented in TRIPs-Py. In the following, we specify how the three filtering methods in TRIPs-Py relate to the general framework (5), specifying an expression for the filter factors \(\phi _i(\mu )\).
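Before doing so, the following self-contained NumPy sketch (not part of TRIPs-Py) illustrates how a filtered solution of the form (5) is assembled for \(\Psi =\textbf{I}\): the SVD coefficients of the data are damped by the filter factors and recombined with the right singular vectors. The synthetic matrix and the Tikhonov-type filters used here are only an example.

```python
import numpy as np

def filtered_solution(A, b, phi):
    """Assemble the filtered solution of (5) for Psi = I:
    x_mu = sum_i phi_i * (u_i^T b / sigma_i) * v_i."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ (phi * (U.T @ b) / s)

# Synthetic ill-conditioned test problem.
rng = np.random.default_rng(0)
n = 64
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -8, n)                      # rapidly decaying singular values
A = Q1 @ np.diag(s) @ Q2.T
x_true = np.ones(n)
b = A @ x_true + 1e-4 * rng.standard_normal(n)

phi = s**2 / (s**2 + 1e-6)                     # Tikhonov-type filters, used here only as an example
x_reg = filtered_solution(A, b, phi)
print(np.linalg.norm(x_reg - x_true) / np.linalg.norm(x_true))
```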

Truncated SVD (TSVD)

The truncated SVD (TSVD) method regularizes (1) by computing

$$\begin{aligned} \textbf{x}_h=\sum _{i=1}^{h} \frac{\textbf{u}_i^T\textbf{b}}{\sigma _i} \textbf{v}_i\,, \quad h\in \{1,2,\dots ,n\}\,. \end{aligned}$$
(6)

TSVD is clearly a filtering method, with filter factors \(\phi _i(h)=1\) for \(1\le i\le h\) and \(\phi _i(h)=0\) otherwise. Therefore, the integer h plays the role of the regularization parameter, which can be set by applying the Generalized Cross-Validation (GCV) method or the discrepancy principle (if an estimate of the noise magnitude \(\Vert \textbf{e}\Vert _2\) is provided); see Section 2.3 for more details. Let \(\textbf{A}_h\) denote the best rank-h approximation of \(\textbf{A}\) in the 2-norm, i.e.,

$$ \textbf{A}_h = \textbf{U}_h\Sigma _h\textbf{V}_h^T, $$

where \(\textbf{U}_h\) and \(\textbf{V}_h\) are obtained by taking the first h columns of \(\textbf{U}\) and \(\textbf{V}\), respectively, and \(\Sigma _h\) is the diagonal matrix with the first h singular values on its main diagonal. Then \(\textbf{x}_h\) can be regarded as the solution of the following variational problem

$$ \textbf{x}_h=\arg \min _{\textbf{x}\in \mathbb {R}^n}\Vert \textbf{A}_h\textbf{x}-\textbf{b}\Vert _2^2, $$

which belongs to the framework (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}_h\textbf{x}-\textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=0\), \(\mathcal {D}=\mathbb {R}^n\).
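As an illustration, a minimal NumPy sketch of the TSVD solution (6) is given below; this is not the TRIPs-Py implementation, and the truncation index h is assumed to be given (in practice it would be chosen by GCV or the discrepancy principle).

```python
import numpy as np

def tsvd_solution(A, b, h):
    """TSVD regularized solution (6): keep only the h largest SVD components."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:h, :].T @ ((U[:, :h].T @ b) / s[:h])

# Example usage (h would normally be chosen by GCV or the discrepancy principle):
# x_h = tsvd_solution(A, b, h=10)
```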

Tikhonov regularization

Tikhonov regularization replaces (1) with the problem of computing

$$\begin{aligned} \textbf{x}_{\alpha }=\arg \min _{\textbf{x}\in \mathbb {R}^{n}}\Vert \textbf{A}\textbf{x}- \textbf{b}\Vert _2^2 + \alpha \Vert \Psi \textbf{x}\Vert _2^2,\quad \text{ where } \Psi \in \mathbb {R}^{k\times n}\,. \end{aligned}$$
(7)

When \(\Psi = \textbf{I}\), problem (7) is said to be in standard form, and when \(\Psi \ne \textbf{I}\) it is said to be in general form. Problem (7) clearly belongs to the framework (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=\Vert \Psi \textbf{x}\Vert _2^2\), \(\mathcal {D}=\mathbb {R}^n\). Problem (7) can be equivalently expressed as a damped least squares problem, with associated normal equations

$$\begin{aligned} (\textbf{A}^T\textbf{A}+ \alpha \Psi ^T\Psi )\textbf{x}_{\alpha } = \textbf{A}^T\textbf{b}. \end{aligned}$$
(8)

If the null spaces of \(\textbf{A}\) and \(\Psi \) intersect trivially, the Tikhonov regularized solution \(\textbf{x}_{\alpha }\) is unique. By plugging the SVD of \(\textbf{A}\) (if \(\Psi =\textbf{I}\)) or the GSVD of \((\textbf{A},\Psi )\) (if \(\Psi \ne \textbf{I}\)) into the above equation, one can see that \(\textbf{x}_{\alpha }\) can be expressed in the framework of (5) with

$$ \phi _i(\alpha )=\frac{\sigma _i^2}{\sigma _i^2 + \alpha }\quad \text{(for } \Psi =\textbf{I}\text{) }\quad \text{ and }\quad \phi _i(\alpha )=\frac{(\widetilde{\sigma }_i/\widetilde{\lambda }_i)^2}{(\widetilde{\sigma }_i/\widetilde{\lambda }_i)^2 + \alpha }\quad \text{(for } \Psi \ne \textbf{I}\text{) }\,. $$

From the above expression and looking at (7), it is clear that more regularization is imposed for larger values of \(\alpha \): more weight is put on the regularization term and more filter factors approach 0 (depending on the location of \(\alpha \) within the range of the squared singular values of \(\textbf{A}\) or squared generalized singular values of \((\textbf{A},\Psi )\)). Conversely, smaller values of \(\alpha \) lead to under-regularized solutions. When \(\Psi \ne \textbf{I}\) has a nontrivial null space, it is important to stress (again, from the second equation in (5) or from (7)) that vectors in the null space of \(\Psi \) are unaffected by regularization. The regularization parameter \(\alpha \) can be set by applying the GCV method or the discrepancy principle (if an estimate of the noise magnitude \(\Vert \textbf{e}\Vert _2\) is provided); see Section 2.3 for more details. Finally, we remark that any Tikhonov regularized problem in general form can be equivalently transformed into a Tikhonov regularized problem in standard form. The specific transformation depends on the properties of \(\Psi \). Generically, one expresses the equivalent standard form solution as

$$\begin{aligned} \textbf{x}_{\alpha }=\Psi _{\textbf{A}}^{\dagger }\textbf{z}_{\alpha }+\bar{\textbf{x}}_{0},\quad \text{ where }\quad \textbf{z}_{\alpha }=\arg \min _{\textbf{z}\in \mathbb {R}^{k}}\Vert \textbf{A}\Psi _{\textbf{A}}^{\dagger } \textbf{z}- \bar{\textbf{b}}\Vert _2^2 + \alpha \Vert \textbf{z}\Vert _2^2. \end{aligned}$$
(9)

In the above expression, \(\Psi ^{\dagger }_{\textbf{A}} = (\textbf{I}- (\textbf{A}(\textbf{I}-\Psi ^{\dagger }\Psi ))^{\dagger }\textbf{A})\Psi ^{\dagger }\) denotes the \(\textbf{A}\)-weighted generalized pseudoinverse of the operator \(\Psi \), \(\bar{\textbf{x}}_0\) denotes the component of \(\textbf{x}_{\alpha }\) in the null space of \(\Psi \), and \(\bar{\textbf{b}}=\textbf{b}-\textbf{A}\bar{\textbf{x}}_0\); see [23] for more details.
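The following sketch (illustrative only, assuming \(\Psi =\textbf{I}\)) shows two equivalent ways of computing the standard-form Tikhonov solution: by solving the regularized normal equations (8) directly, and by applying the filter factors given above to the SVD coefficients.

```python
import numpy as np

def tikhonov_normal_equations(A, b, alpha):
    """Standard-form Tikhonov (Psi = I): solve (A^T A + alpha I) x = A^T b, i.e. (8)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)

def tikhonov_via_svd(A, b, alpha):
    """Equivalent filtered-SVD form, with phi_i = sigma_i^2 / (sigma_i^2 + alpha)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    phi = s**2 / (s**2 + alpha)
    return Vt.T @ (phi * (U.T @ b) / s)

# Both routines return the same x_alpha up to rounding errors:
# x1 = tikhonov_normal_equations(A, b, 1e-3); x2 = tikhonov_via_svd(A, b, 1e-3)
```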

Truncated GSVD (TGSVD)

The truncated GSVD (TGSVD) method regularizes (1) by computing

$$\begin{aligned} \textbf{x}_h=\sum _{i=0}^{h-1} \frac{\widetilde{\textbf{u}}_{k-i}^T\textbf{b}}{\widetilde{\sigma }_{k-i}} \widetilde{\textbf{y}}_{k-i} + \sum _{i=k+1}^n(\widetilde{\textbf{u}}_i^T\textbf{b})\widetilde{\textbf{y}}_i\,,\quad h\in \{1,2,\dots ,k\} \,. \end{aligned}$$
(10)

TGSVD is a filtering method that can be expressed as in (5) (second equation), with filter factors \(\phi _i(h)=1\) for \(k-h+1\le i\le k\) and \(\phi _i(h)=0\) otherwise. The TGSVD solution can be linked to both TSVD and Tikhonov regularization in general form, in that the TGSVD solution can also be expressed as

$$\begin{aligned} \textbf{x}_h=\Psi _{\textbf{A}}^{\dagger }\arg \min _{\textbf{z}\in \mathbb {R}^k}\Vert (\textbf{A}\Psi _{\textbf{A}}^{\dagger })_h\textbf{z}- \textbf{b}\Vert _2^2\,+\,\bar{\textbf{x}}_0\,, \end{aligned}$$
(11)

where \(\Psi _{\textbf{A}}^{\dagger }\) and \(\bar{\textbf{x}}_0\) are as in equation (9), and \((\textbf{A}\Psi _{\textbf{A}}^{\dagger })_h\) is the optimal rank-h approximation of \(\textbf{A}\Psi _{\textbf{A}}^{\dagger }\) in the 2-norm, which can be computed using either the SVD of \(\textbf{A}\Psi _{\textbf{A}}^{\dagger }\) or the GSVD of \((\textbf{A},\Psi )\). The variational problem (11) belongs to the framework (3), as it is essentially a transformed least squares problem. As for TSVD, the truncation parameter h plays the role of the regularization parameter, and similar strategies are used to automatically set it.

2.2 Projection methods for large-scale problems

The methods described in the previous section can be properly applied only when it is feasible to compute some factorizations (e.g., the SVD) of the coefficient matrix \(\textbf{A}\), and possibly joint factorizations of the regularization matrix \(\Psi \) and \(\textbf{A}\) (e.g., the GSVD), making them suitable only for small-scale problems or for problems where \(\textbf{A}\), and possibly \(\Psi \), have a special structure that can be exploited for storage and computations; see [13, 24]. In general, when solving large-scale problems, only matrix-free methods, which require neither storage nor factorizations of \(\textbf{A}\) but rather computations of matrix-vector products with \(\textbf{A}\) and, often, with \(\textbf{A}^T\), are viable options. This section is devoted to projection methods, which are iterative methods that determine a sequence of approximate regularized solutions of the original problem (1) in a sequence of subspaces of \(\mathbb {R}^n\) of dimension up to \(d^*\ll n\).

Remark 1

All the projection methods available within TRIPs-Py compute a regularized solution by one of the following strategies:

  1. 1.

    Applying a projection method directly to problems (1) and stopping before the noise corrupts the approximate solution (i.e., exploiting semiconvergence); see [13, Chapter 6].

  2. 2.

    Applying a projection method to approximate the solution of a Tikhonov-like regularized problem.

  3. 3.

    Adopting a ‘hybrid’ approach, whereby Tikhonov regularization is applied to a sequence of projected problems [16].

As far as regularized problems in the 2-norm are concerned, all the approaches listed above can be employed. Moreover, when considering standard form Tikhonov regularization with a given regularization parameter, the second and third approaches are equivalent. When considering regularization problems formulated in some \(\ell _p\) norm, only the second approach will be considered within a computationally convenient majorization-minimization strategy.

In general, a projection method for computing an approximation \(\textbf{x}_d\) of a solution of problems (1) is defined by the two conditions

$$\begin{aligned} \textbf{x}_d\in \mathcal {S}_d,\quad \textbf{r}_d := \textbf{b}- \textbf{A}\textbf{x}_d\perp \mathcal {C}_d, \end{aligned}$$
(12)

where \(\mathcal {S}_d\) and \(\mathcal {C}_d\) are subspaces of \(\mathbb {R}^n\) and \(\mathbb {R}^m\) of dimension \(d\ll \min \{m, n\}\), commonly referred to as the approximation subspace and the constraint subspace, respectively. If \(\textbf{S}_d\in \mathbb {R}^{n\times d}\) and \(\textbf{C}_d\in \mathbb {R}^{m\times d}\) are matrices whose columns span \(\mathcal {S}_d\) and \(\mathcal {C}_d\), respectively, then the above conditions can be equivalently expressed as

$$\begin{aligned} \text{ find }\quad \textbf{t}_d\in \mathbb {R}^d\quad \text{ such } \text{ that }\quad \textbf{C}_d^T(\textbf{b}- \textbf{A}\textbf{S}_d\textbf{t}_d)=\textbf{0}\,, \end{aligned}$$
(13)

so that, since \(d\ll n\), one can solve

$$\begin{aligned} \underbrace{\textbf{C}_d^T\textbf{A}\textbf{S}_d}_{=:\textbf{G}_d\in \mathbb {R}^{d\times d}}{\textbf{t}_d}=\underbrace{\textbf{C}_d^T\textbf{b}}_{=:\textbf{g}_d\in \mathbb {R}^d} \end{aligned}$$
(14)

by any direct method and form the approximate solution of (1) by taking \(\textbf{x}_d=\textbf{S}_d\textbf{t}_d\). In practice, starting without loss of generality from the initial guess \(\textbf{x}_0=\textbf{0}\), a projection method for the solution of (1) is an iterative method that computes a sequence of approximate solutions \(\{\textbf{x}_d\}_{d=1,2,\dots }\) satisfying conditions (12) by building a sequence of nested subspaces

$$ \mathcal {S}_d=\text {range}(\textbf{S}_d)=\text {range}([\textbf{S}_{d-1}, \textbf{s}_d]),\quad \mathcal {C}_d=\text {range}(\textbf{C}_d)=\text {range}([\textbf{C}_{d-1}, \textbf{c}_d]), $$

\(d=1, 2,\dots \), with \(\textbf{S}_0, \textbf{C}_0\) being empty matrices; see [25] for more details.

The majority of the solvers in TRIPs-Py are projection methods onto Krylov subspaces [15], i.e., both \(\mathcal {S}_d\) and \(\mathcal {C}_d\) are Krylov subspaces. In general, given \(\textbf{M}\in \mathbb {R}^{n\times n}\) and \(\textbf{n}\in \mathbb {R}^n\), a Krylov subspace is defined as

$$\begin{aligned} \mathcal {K}_d(\textbf{M},\textbf{n}) = \textrm{span}\{\textbf{n}, \textbf{M}\textbf{n}, \textbf{M}^2 \textbf{n}, \ldots , \textbf{M}^{(d-1)}\textbf{n}\}\,. \end{aligned}$$
(15)

Here and in the following, we assume that the dimension of \(\mathcal {K}_d(\textbf{M},\textbf{n})\) is d; appropriate checks are incorporated into TRIPs-Py to make sure that this assumption is satisfied. Different Krylov methods differ in the choices of their approximation and constraint Krylov subspaces, and in their implementation: when different Krylov subspace methods share the same \(\mathcal {S}_d\) and \(\mathcal {C}_d\), \(d=1,2,\dots \), we say that they are mathematically equivalent. When Krylov methods are used to solve problems (1), the quantities \(\textbf{M}\) and \(\textbf{n}\) appearing in (15) are typically defined in terms of \(\textbf{A}\) and \(\textbf{b}\) (for square matrices), or \(\textbf{A}^T\textbf{A}\) and \(\textbf{A}^T\textbf{b}\), or \(\textbf{A}\textbf{A}^T\) and \(\textbf{b}\) (for rectangular matrices). In the following, we will refer to (15) as standard Krylov subspace, to distinguish it from other Krylov-like subspaces that are generated by computing matrix-vector products with an iteration-dependent modification of the matrix \(\textbf{M}\), and therefore cannot be expressed taking increasing powers of \(\textbf{M}\) as in (15).

Table 2 List of decompositions used by some TRIPs-Py solvers

2.2.1 Methods based on standard Krylov subspaces

In this section, we briefly describe the Krylov subspace methods available within TRIPs-Py: namely, Generalized Minimal Residual (GMRES) (which can be applied when the coefficient matrix in (1) is square), Least Squares QR (LSQR) and Conjugate Gradient Least Squares (CGLS), along with their available hybrid variants. We will emphasize how such methods relate to the framework in Remark 1.

Methods based on the Arnoldi algorithm: GMRES and its hybrid variant

Given \(\textbf{A}\in \mathbb {R}^{n\times n}\) and \(\textbf{b}\in \mathbb {R}^n\), d iterations of the Arnoldi algorithm initialized with \(\textbf{v}_1 = \textbf{b}/\Vert \textbf{b}\Vert _2\) compute the partial factorization

$$\begin{aligned} \textbf{A}\textbf{V}_d = \textbf{V}_{d+1} \textbf{H}_{d}, \end{aligned}$$
(16)

where \(\textbf{V}_{d+1} = [\textbf{v}_1, \textbf{v}_2,..., \textbf{v}_{d+1}]=[\textbf{V}_d, \textbf{v}_{d+1}] \in \mathbb {R}^{n \times (d+1)}\) has orthonormal columns that span the Krylov subspace \(\mathcal {K}_{d+1}(\textbf{A}, \textbf{b})\), and \(\textbf{H}_{d} \in \mathbb {R}^{(d+1)\times d}\) is upper Hessenberg. TRIPs-Py provides two versions of the Arnoldi algorithm, both based on the modified Gram-Schmidt orthogonalization process (see Table 2): one computes a given number of Arnoldi steps, while the other only computes one step, updating an already available partial Arnoldi decomposition. If the matrix \(\textbf{A}\) is symmetric, the Arnoldi algorithm simplifies to the symmetric Lanczos algorithm [28], although only the former is provided in TRIPs-Py.
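For illustration, a minimal NumPy implementation of the Arnoldi process (16) with modified Gram-Schmidt is sketched below; it is not the TRIPs-Py code, and it simply stops expanding the basis in case of (near) breakdown.

```python
import numpy as np

def arnoldi(A, b, d):
    """d steps of the Arnoldi algorithm (16) with modified Gram-Schmidt.
    Returns V (n x (d+1), orthonormal columns) and H ((d+1) x d, upper Hessenberg)."""
    n = b.shape[0]
    V = np.zeros((n, d + 1))
    H = np.zeros((d + 1, d))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(d):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:                 # (near) breakdown: stop expanding the basis
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```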

GMRES is a Krylov method for the solution of the linear system appearing in (1). At the dth iteration, GMRES chooses \(\mathcal {S}_d=\mathcal {K}_d(\textbf{A},\textbf{b})\), \(\mathcal {C}_d=\textbf{A}\mathcal {K}_d(\textbf{A},\textbf{b})\) in (12). Because of this choice, the associated residual \(\textbf{r}_d\) at the dth iteration is minimized in \(\mathcal {K}_d(\textbf{A},\textbf{b})\); see [15]. In practice, exploiting the Arnoldi factorization (16),

$$\begin{aligned} \begin{array}{rcl} \textbf{x}_d& =& {\arg \min }_{\textbf{x}\in \mathcal {K}_d(\textbf{A}, \textbf{b})}\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _{2}= \textbf{V}_d{\arg \min }_{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{A}\textbf{V}_d\textbf{t}- \textbf{b}\Vert _{2}\vspace{0.2cm}\\ & =& \textbf{V}_d\arg \min _{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{H}_{d}\textbf{t}-\Vert \textbf{b}\Vert _2 \textbf{e}_{1}\Vert _{2}, \end{array} \end{aligned}$$
(17)

where \(\textbf{e}_{1}\) denotes the first canonical basis vector of \(\mathbb {R}^{d+1}\). Referring to the quantities introduced in equation (14), GMRES takes \(\textbf{G}_d=\textbf{H}_d^T\textbf{H}_d\) and \(\textbf{g}_d=\textbf{H}_d^T(\Vert \textbf{b}\Vert _2\textbf{e}_1)\). It has been theoretically proven that GMRES is an iterative regularization method, so that the number of GMRES iterations serves as a regularization parameter; in other words, GMRES adopts the first strategy listed in Remark 1. If an estimate of the norm of the noise \(\textbf{e}\) affecting the data \(\textbf{b}\) in (2) is provided, then TRIPs-Py automatically uses the discrepancy principle as stopping criterion for the GMRES iterations; see Section 2.3 for more details. Otherwise, the user may assign a maximum number of iterations to be performed, being aware that too many iterations may result in an under-regularized solution. From the first equality in equation (17), we can see that GMRES can be expressed in the general framework of problem (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}- \textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=0\) and \(\mathcal {D}= \mathcal {K}_{d^*}(\textbf{A},\textbf{b})\), where \(d^*\) is the stopping iteration for GMRES.

The hybrid GMRES method applies some additional, iteration-dependent Tikhonov regularization in standard form to the projected problem appearing in (17), following the third principle described in Remark 1. Namely, the dth iteration of the hybrid GMRES method computes the regularized solution

$$\begin{aligned} \begin{array}{rcl} \textbf{x}_d=\textbf{V}_d\arg \min _{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{H}_{d}\textbf{t}-\Vert \textbf{b}\Vert _2 \textbf{e}_{1}\Vert _{2}^2 + \alpha _d\Vert \textbf{t}\Vert _2^2, \end{array} \end{aligned}$$
(18)

where \(\alpha _d\) is an iteration-dependent regularization parameter that can be automatically chosen applying the generalized cross validation method or the discrepancy principle (if an estimate of the noise magnitude \(\Vert \textbf{e}\Vert _2\) is provided) to the projected problem; we refer to Section 2.3 for more details. Assuming that a suitable value of \(\alpha _d\) is chosen at each iteration, thanks to the additional regularization imposed in (18) to the projected problem, hybrid GMRES is not affected by semiconvergence and setting a stopping criterion is less crucial than for GMRES; see [16]. Therefore hybrid GMRES is stopped when a maximum number \(d^*\) of iterations is performed. By exploiting decomposition (16) within (18) and the definition of the matrix \(\textbf{V}_{d+1}\), it can be shown that the hybrid GMRES method belongs to the general framework stated in (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=\Vert \textbf{x}\Vert _2^2\), \(\alpha =\alpha _{{d}^*}\) and \(\mathcal {D}=\mathcal {K}_{{d}^*}(\textbf{A},\textbf{b})\). The Arnoldi-Tikhonov method is mathematically equivalent to hybrid GMRES, the only difference between the two being that the former performs a given number of Arnoldi steps (which, if an estimate of \(\Vert \textbf{e}\Vert _2\) is available, is determined by the discrepancy principle), and regularizes the projected problem only at the last computed iteration.
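The core computation of a hybrid GMRES step, i.e., the Tikhonov-regularized projected problem (18), can be sketched as follows for a given value of \(\alpha _d\); the helper assumes that \(\textbf{V}_{d+1}\) and \(\textbf{H}_d\) have already been produced by an Arnoldi process such as the one sketched above, and it is only illustrative.

```python
import numpy as np

def hybrid_gmres_step(V, H, b, alpha):
    """Solve the regularized projected problem (18),
    min_t ||H t - ||b||_2 e_1||_2^2 + alpha ||t||_2^2, and lift the solution back as x = V_d t."""
    d = H.shape[1]
    rhs = np.zeros(H.shape[0])
    rhs[0] = np.linalg.norm(b)
    # stacked least squares formulation of the projected Tikhonov problem
    M = np.vstack([H, np.sqrt(alpha) * np.eye(d)])
    f = np.concatenate([rhs, np.zeros(d)])
    t = np.linalg.lstsq(M, f, rcond=None)[0]
    return V[:, :d] @ t

# Usage: V, H = arnoldi(A, b, d) as in the sketch above, then
# x_d = hybrid_gmres_step(V, H, b, alpha_d), with alpha_d chosen, e.g., by GCV.
```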

Methods based on Golub-Kahan bidiagonalization (LSQR and its hybrid variant) and CGLS

Given \(\textbf{A}\in \mathbb {R}^{m\times n}\) and \(\textbf{b}\in \mathbb {R}^m\), d iterations of the Golub–Kahan bidiagonalization algorithm initialized with \(\textbf{v}_1=\textbf{A}^T\textbf{b}/\Vert \textbf{A}^T\textbf{b}\Vert _2\) and \(\textbf{u}_1=\textbf{b}/\Vert \textbf{b}\Vert _2\) compute the partial factorizations

$$\begin{aligned} \textbf{A}\textbf{V}_{d}=\textbf{U}_{d+1}\textbf{B}_{d},\quad \textbf{A}^{T}\textbf{U}_{d+1}=\textbf{V}_{d+1}\bar{\textbf{B}}_{d+1}^{T}, \end{aligned}$$
(19)

where \(\textbf{V}_{d+1} = [\textbf{v}_1, \textbf{v}_2,..., \textbf{v}_{d+1}]=[\textbf{V}_d, \textbf{v}_{d+1}] \in \mathbb {R}^{n \times (d+1)}\) has orthonormal columns that span the Krylov subspace \(\mathcal {K}_{d+1}(\textbf{A}^T\textbf{A}, \textbf{A}^T\textbf{b})\), \(\textbf{U}_{d+1} = [\textbf{u}_1, \textbf{u}_2,..., \textbf{u}_{d+1}]=[\textbf{U}_d, \textbf{u}_{d+1}] \in \mathbb {R}^{m \times (d+1)}\) has orthonormal columns that span the Krylov subspace \(\mathcal {K}_{d+1}(\textbf{A}\textbf{A}^T, \textbf{b})\), \(\bar{\textbf{B}}_{d+1} \in \mathbb {R}^{(d+1)\times (d+1)}\) is a lower bidiagonal matrix, and \(\textbf{B}_{d}\) is obtained by taking the first d columns of \(\bar{\textbf{B}}_{d+1}\). As for the Arnoldi algorithm, TRIPs-Py provides two versions of the Golub–Kahan bidiagonalization algorithm (see Table 2); both versions are implemented without reorthogonalization.
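A minimal NumPy sketch of d Golub-Kahan bidiagonalization steps (19), without reorthogonalization, is given below; it is illustrative only and omits safeguards against breakdown.

```python
import numpy as np

def golub_kahan(A, b, d):
    """d steps of Golub-Kahan bidiagonalization (19), without reorthogonalization.
    Returns U (m x (d+1)), V (n x d) and the (d+1) x d lower bidiagonal matrix B."""
    m, n = A.shape
    U = np.zeros((m, d + 1))
    V = np.zeros((n, d))
    B = np.zeros((d + 1, d))
    U[:, 0] = b / np.linalg.norm(b)
    beta = 0.0
    for j in range(d):
        v = A.T @ U[:, j] - (beta * V[:, j - 1] if j > 0 else 0.0)
        alpha = np.linalg.norm(v)
        V[:, j] = v / alpha
        u = A @ V[:, j] - alpha * U[:, j]
        beta = np.linalg.norm(u)
        U[:, j + 1] = u / beta
        B[j, j], B[j + 1, j] = alpha, beta
    return U, V, B
```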

LSQR is a Krylov method for the solution of the problems appearing in (1) that, at the dth iteration, chooses \(\mathcal {S}_d=\mathcal {K}_d(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\), \(\mathcal {C}_d=\textbf{A}\mathcal {K}_d(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\) in (12). Because of this choice, at the dth iteration the residual \(\textbf{r}_d\) is minimized in \(\mathcal {K}_d(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\); see [15]. In practice, exploiting Golub-Kahan bidiagonalization (19),

$$\begin{aligned} \begin{array}{rcl} \displaystyle \textbf{x}_d& =& \arg \min _{\textbf{x}\in \mathcal {K}_d(\textbf{A}^T\textbf{A}, \textbf{A}^T\textbf{b})}\!\!\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _{2}= \displaystyle \textbf{V}_d\arg \min _{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{A}\textbf{V}_d\textbf{t}- \textbf{b}\Vert _{2}\vspace{0.2cm}\\ & =& \displaystyle \textbf{V}_d\arg \min _{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{B}_{d}\textbf{t}-\Vert \textbf{b}\Vert _2 \textbf{e}_{1}\Vert _{2}. \end{array} \end{aligned}$$
(20)

Referring to the quantities introduced in equation (14), LSQR takes \(\textbf{G}_d=\textbf{B}_d^T\textbf{B}_d\) and \(\textbf{g}_d=\textbf{B}_d^T(\Vert \textbf{b}\Vert _2\textbf{e}_1)\). It is well-known that LSQR is an iterative regularization method, so that the number of LSQR iterations serves as a regularization parameter; in other words, LSQR adopts the first strategy listed in Remark 1. Similarly to GMRES, the LSQR iterations are stopped when the discrepancy principle is satisfied (if an estimate of \(\Vert \textbf{e}\Vert _2\) is provided by the user) or if a maximum number of iterations is performed. From the first equality in equation (20), we can see that LSQR can be expressed in the general framework of problem (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}- \textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=0\) and \(\mathcal {D}= \mathcal {K}_{{d}^*}(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\), where \({d}^*\) is the stopping iteration for LSQR.

Similarly to hybrid GMRES, hybrid LSQR applies some additional, iteration-dependent Tikhonov regularization in standard form to the projected problem appearing in (20), following the third principle described in Remark 1. Namely, the dth iteration of the hybrid LSQR method computes the regularized solution

$$\begin{aligned} \begin{array}{rcl} \displaystyle \textbf{x}_d= \displaystyle \textbf{V}_d\arg \min _{\textbf{t}\in \mathbb {R}^{d}}\Vert \textbf{B}_{d}\textbf{t}-\Vert \textbf{b}\Vert _2 \textbf{e}_{1}\Vert _{2}^2 + \alpha _d\Vert \textbf{t}\Vert _2^2, \end{array} \end{aligned}$$
(21)

where \(\alpha _d\) is an iteration-dependent regularization parameter that can be automatically chosen by applying the same approaches listed for hybrid GMRES. A proper choice of \(\alpha _d\) mitigates the LSQR semiconvergence and the hybrid LSQR iterations are stopped when a maximum number \({d}^*\) of iterations is performed. Similarly to hybrid GMRES, hybrid LSQR belongs to the general framework stated in (3), with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2^2\), \(\mathcal {R}(\textbf{x})=\Vert \textbf{x}\Vert _2^2\), \(\alpha =\alpha _{{d}^*}\) and \(\mathcal {D}=\mathcal {K}_{{d}^*}(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\). Similarly to Arnoldi-Tikhonov, Golub-Kahan-Tikhonov computes a fixed number of Golub-Kahan iterations and applies Tikhonov regularization only to the last computed projected problem.

CGLS is a Krylov method for the solution of (1), which is mathematically equivalent to LSQR. That is, at the dth iteration, \(\mathcal {S}_d\) and \(\mathcal {C}_d\) are as in LSQR, but the CGLS solution is computed using a three-term recurrence formula, rather than by (implicitly) computing the factorization (19) and solving the projected problem (20) as in LSQR. Both CGLS and LSQR can be naturally used to approximate the solution of the Tikhonov-regularized problem (7) (both in standard and general form, with a fixed regularization parameter \(\alpha \)), following the second strategy in Remark 1. This is achieved by applying such solvers to the equivalent formulation of (7) as a damped least squares problem, and stopping when we reach high accuracy in the associated normal equation residual. However, since CGLS does not compute the factorization (19), it is not possible to state a hybrid version of CGLS as in (21).
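As an aside, the second strategy of Remark 1 for standard-form Tikhonov can be reproduced with SciPy's LSQR, whose damp argument implements exactly the damped least squares formulation mentioned above (with damp\(^2\) playing the role of \(\alpha \)); the small synthetic problem below is only an illustration.

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Standard-form Tikhonov (7) with Psi = I solved as a damped least squares problem,
# min ||A x - b||_2^2 + alpha ||x||_2^2, via SciPy's LSQR (strategy 2 in Remark 1).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100))
x_true = np.ones(100)
b = A @ x_true + 1e-2 * rng.standard_normal(200)

alpha = 1e-2
x_alpha = lsqr(A, b, damp=np.sqrt(alpha), iter_lim=200)[0]   # damp**2 plays the role of alpha
print(np.linalg.norm(x_alpha - x_true) / np.linalg.norm(x_true))
```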

2.2.2 Methods based on generalized Krylov subspaces

TRIPs-Py contains methods that rely on generalized Krylov subspaces, whose definition will be made clear in the next paragraph. Such methods include GKS, MMGKS, AnisoTV, IsoTV, and GS (see Table 1 for a few more details). First we briefly describe the generalized Krylov subspace method (GKS) that approximates the solution of (7), with a general regularization matrix \(\Psi \in \mathbb {R}^{k\times n}\); see [19]. Then we describe the majorization-minimization (MM) technique that solves (3) for a broad selection of \(0<p,q \le 2\) and \(\Psi \), and we explain how GKS can be used to approximate the solution of the resulting reweighted problem: the resulting method is referred to as MMGKS.

Generalized Krylov subspace (GKS) method

The GKS method computes an approximation to (7) with a general \(\Psi \in \mathbb {R}^{k\times n}\) by first determining an initial approximation subspace for the solution through \(1\le d\ll \min \{m,n\}\) steps of the Golub-Kahan bidiagonalization algorithm applied to \(\textbf{A}\), with initial vector \(\textbf{b}\). This gives a decomposition like the first one appearing in (19). As long as d is relatively small, it is inexpensive to compute the skinny QR factorizations

$$\begin{aligned} \begin{array}{cccc} \textbf{A}\textbf{V}_{d} = \textbf{Q}_{\textbf{A}} \textbf{R}_{\textbf{A}} \quad \text {with} & \quad \textbf{Q}_{\textbf{A}} \in \mathbb {R}^{m \times d}, & \textbf{R}_{\textbf{A}} \in \mathbb {R}^{d\times d},\\ \Psi \textbf{V}_{d} = \textbf{Q}_{\Psi } \textbf{R}_{\Psi } \quad \text {with} & \quad \textbf{Q}_{\Psi }\in \mathbb {R}^{k \times d}, & \textbf{R}_{\Psi } \in \mathbb {R}^{d \times d}, \end{array} \end{aligned}$$
(22)

where the matrices \(\textbf{Q}_{\textbf{A}}\) and \(\textbf{Q}_{\Psi }\) have orthonormal columns and the matrices \(\textbf{R}_{\textbf{A}}\) and \(\textbf{R}_{\Psi }\) are upper triangular. Taking \(\textbf{x}_d=\textbf{V}_d\textbf{t}\) in (7) and using the factorizations in (22), we obtain the d-dimensional linear system of equations

$$\begin{aligned} (\textbf{R}^T_{\textbf{A}}\textbf{R}_{\textbf{A}} + \alpha \textbf{R}_{\Psi }^T\textbf{R}_{\Psi })\textbf{t}= \textbf{R}^T_{\textbf{A}}\textbf{Q}^T_{\textbf{A}}\textbf{b}, \end{aligned}$$
(23)

to be solved to get \(\textbf{t}=\textbf{t}_d\). We then expand the approximation subspace for the solution of (7) by taking

$$\begin{aligned} \textbf{V}_{d+1}=[\textbf{V}_{d},\textbf{v}_\textrm{new}]\in \mathbb {R}^{n\times (d+1)},\;\text{ where }\; \begin{array}{l} \textbf{v}_\textrm{new}=\widetilde{\textbf{r}}_d/\Vert \widetilde{\textbf{r}}_d\Vert _2,\vspace{0.1cm}\\ \widetilde{\textbf{r}}_d=\textbf{A}^T(\textbf{A}\textbf{V}_{d}\textbf{t}_d -\textbf{b})+\alpha \Psi ^T\Psi \textbf{V}_{d}\textbf{t}_d, \end{array} \end{aligned}$$
(24)

i.e., by adding the current normalized residual of the (full) normal equations. This concludes the first GKS iteration.

Next, the QR factorizations in (22) are updated for the matrices \(\textbf{A}\textbf{V}_{d+1}\) and \(\Psi \textbf{V}_{d+1}\), and the process in steps (22)-(24) is repeated, expanding the solution space during subsequent GKS iterations until a sufficiently accurate solution is reached. Since a suitable value of the regularization parameter \(\alpha \) in (7) is usually not known, and applying some parameter choice strategy to (7) is computationally infeasible for large-scale problems, the GKS method allows an iteration-dependent choice of \(\alpha \). Specifically, in TRIPs-Py one can adopt the discrepancy principle (if a good estimate of the noise magnitude \(\Vert \textbf{e}\Vert _2\) is provided by the user) or generalized cross validation to compute \( \alpha =\alpha _d\) for problem (23); see Section 2.3 for more details. Since, in general, \(\alpha \) may vary at each iteration and \(\Psi \ne \textbf{I}\), the range of \(\textbf{V}_{d+1}\) (the approximation subspace for the solution) is no longer a standard Krylov subspace, and it is called a generalized Krylov subspace. If \(\alpha \) is fixed or \(\Psi =\textbf{I}\), the GKS method is mathematically equivalent to LSQR or CGLS applied to the damped least squares formulation of the Tikhonov problem (7).
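The following sketch condenses one GKS cycle, i.e., steps (22)-(24), into a short NumPy function with a fixed regularization parameter \(\alpha \); it is illustrative only (in particular, the initial subspace is built from a plain Krylov basis of \(\mathcal {K}_{d_0}(\textbf{A}^T\textbf{A},\textbf{A}^T\textbf{b})\) rather than from Golub-Kahan vectors, and the new basis vector is explicitly reorthogonalized against the current basis).

```python
import numpy as np

def gks(A, b, Psi, alpha, d0=5, iters=20):
    """Illustrative GKS iteration, steps (22)-(24), with a fixed regularization parameter alpha."""
    # initial approximation subspace: an orthonormal basis of K_{d0}(A^T A, A^T b)
    K = [A.T @ b]
    for _ in range(d0 - 1):
        K.append(A.T @ (A @ K[-1]))
    V = np.linalg.qr(np.column_stack(K))[0]

    for _ in range(iters):
        # skinny QR factorizations of the projected operators, as in (22)
        QA, RA = np.linalg.qr(A @ V)
        QP, RP = np.linalg.qr(Psi @ V)
        # projected Tikhonov problem (23)
        t = np.linalg.solve(RA.T @ RA + alpha * (RP.T @ RP), RA.T @ (QA.T @ b))
        x = V @ t
        # expand the subspace with the normalized residual of the full normal equations, as in (24)
        r = A.T @ (A @ x - b) + alpha * (Psi.T @ (Psi @ x))
        r = r - V @ (V.T @ r)                    # keep the new vector orthogonal to range(V)
        nr = np.linalg.norm(r)
        if nr < 1e-12:
            break
        V = np.column_stack([V, r / nr])
    return x
```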

Majorization-minimization Generalized Krylov subspace (MMGKS) method

Let us consider problem (3) with \(\mathcal {F}(\textbf{x})=\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _p^p\) and \(\mathcal {R}(\textbf{x}) = \Vert \Psi \textbf{x}\Vert _q^q\). Due to the non-differentiability and possible non-convexity of these terms for certain choices of p and q, it is common to replace the data fidelity and the regularization terms with differentiable approximations thereof, and solve the following problem

$$\begin{aligned} \min _{\textbf{x}} \mathcal {J}_{\epsilon ,\alpha }(\textbf{x}) = \min _{\textbf{x}} \sum _{j = 1}^{m} \phi _{p, \epsilon }(( \textbf{A}\textbf{x}- \textbf{b})_j) + \alpha \sum _{j = 1}^{k} \phi _{q, \epsilon }(( \Psi \textbf{x})_j)\,, \end{aligned}$$
(25)

where

$$\begin{aligned} \phi _{r,\epsilon }(t)= \left( t^2+\epsilon ^2\right) ^{r/2} \text{ with } \left\{ \begin{array}{ll} \epsilon>0 \text{ for } 0<r\le 1,\\ \epsilon =0 \text{ for } r>1, \end{array} \right. \end{aligned}$$
(26)

where \(\epsilon \) is a small positive constant (needed only when \(0<r\le 1\)). A well-known approach to solve (25) is majorization-minimization (MM) [29, 30], which solves a sequence of regularized least squares problems. Namely, at the \((\ell +1)\)th MM iteration, one computes

$$\begin{aligned} \textbf{x}^{(\ell +1)} = \arg \min _{\textbf{x}\in \mathbb {R}^n} \Vert \textbf{P}_\textrm{fid}^{(\ell )}\left( \textbf{A}\textbf{x}- \textbf{b}\right) \Vert _2^2 + \alpha \Vert \textbf{P}_\textrm{reg}^{(\ell )}\Psi \textbf{x}\Vert _2^2 \, , \end{aligned}$$
(27)

where \(\textbf{P}_{s}^{(\ell )}\) (s = fid, reg) denotes a diagonal weighting matrix, whose entries are determined from the current solution \(\textbf{x}^{(\ell )}\), \(\ell =0,1,\dots \).

Classical methods for MM involve solving (27) with a fixed regularization parameter by, e.g., applying CGLS, which results in time-consuming inner-outer iterative strategies. The MMGKS method implemented in TRIPs-Py simultaneously computes a new approximation \(\textbf{x}^{(\ell +1)}\) and updates the weights \(\textbf{P}_{s}^{(\ell +1)}\) (\(s = \mathrm fid, \mathrm reg\)) by projecting the current problem (27) onto a generalized Krylov subspace, whereby an iteration-dependent suitable value for the regularization parameter \(\alpha =\alpha _{\ell }\) can be computed (similarly to the method described in the previous paragraph) and the quantities \(\textbf{P}_\textrm{fid}^{(\ell )}\textbf{A}\), \(\textbf{P}_\textrm{fid}^{(\ell )}\textbf{b}\), and \(\textbf{P}_\textrm{reg}^{(\ell )}\Psi \) can be incorporated. In this way, for the MMGKS method, no inner-outer iteration scheme is needed and an MM iterate corresponds to a GKS iterate. Namely, the generalized Krylov subspace is expanded by one vector at each iteration, so that the \((\ell +1)\)th approximate solution \(\textbf{x}^{(\ell +1)}\) belongs to a generalized Krylov subspace of dimension \(d=d_0+\ell \), where \(d_0\) denotes the dimension of the initial approximation subspace used for building the generalized Krylov subspace. We refer to [20, 31, 32] for further details.
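For reference, the diagonal MM weights used in (27) can be computed as follows; this is a sketch consistent with the weight expression given for AnisoTV below (standard IRLS-type weights), and the helper name and example usage are illustrative, not the TRIPs-Py API.

```python
import numpy as np

def mm_weights(v, r, eps):
    """Diagonal entries of the MM weighting matrices in (27):
    w_j = (v_j**2 + eps**2)**((r - 2) / 4), evaluated at the current iterate."""
    return (v**2 + eps**2) ** ((r - 2) / 4)

# Example: weights for the regularization term, with v = Psi @ x_current, q = 1, eps = 1e-2:
# P_reg = np.diag(mm_weights(Psi @ x_current, r=1.0, eps=1e-2))
```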

A variety of regularized formulations can be handled more or less straightforwardly by the MMGKS method, provided that the parameters p and q (for the \(\ell _p\) and \(\ell _q\) (quasi-)norms), the regularization matrix \(\Psi \) and the weights \(\textbf{P}_s^{(\ell )}\) (s = fid, reg) are properly defined. Some of the solvers listed in Table 1 are indeed drivers for MMGKS, whereby MMGKS is called with specific inputs to conveniently handle particularly relevant regularization terms. Namely,

  • AnisoTV implements anisotropic total variation (TV) regularization. In this instance, \(p=2\) and the matrix \(\Psi \in \mathbb {R}^{k\times n}\) is a rescaled finite difference discretization of the first derivative operator in one or two spatial dimensions, or in three spatio-temporal dimensions; k depends on the dimensionality of the problem. The weights are given by the diagonal matrix with entries

    $$ \left( (\Psi \textbf{x}^{(\ell )})_j^2 + \epsilon ^2\right) ^{(q-2)/4},\quad \text{ where }\quad 0<q\le 1\,. $$
  • IsoTV implements isotropic TV regularization. In one spatial dimension this coincides with anisotropic TV, so that isotropic TV is more meaningful in two spatial dimensions or in three spatio-temporal dimensions.

  • GS implements a regularizer enforcing group sparsity (possibly under transform), for solutions that are naturally partitioned into subsets; see [33].

We refer to Section 3.3 for more details about (an)isotropic TV and group sparsity within MMGKS; see also [21].

2.3 Strategies to choose the regularization parameter

In this section we discuss methods available within TRIPs-Py to choose the regularization parameter for the solvers described in the previous two subsections (and summarized in Table 1). Properly choosing regularization parameters is a pivotal task for any regularization method, as the success of the latter crucially depends on the former. In this section, when considering projection methods, it is convenient to use the following compact notation for the regularized solution associated to the dth projected problem

$$\begin{aligned} \textbf{x}_d=\textbf{V}_d\underbrace{\textbf{F}_{\alpha _d}^\dagger \textbf{f}_d}_{=:\textbf{t}_d}, \end{aligned}$$
(28)

where \(\textbf{V}_d\) is the basis for the dth approximation subspace, \(\textbf{F}_{\alpha _d}^{\dagger }\) is the so-called dth regularized inverse for the projected problem (depending on the dth Tikhonov regularization parameter \(\alpha _d\)), and \(\textbf{f}_d\) is the dth projected right-hand side vector. All quantities appearing in (28) are specific to each projection method. In particular, for projection methods that do not involve any additional Tikhonov regularization, \(\textbf{x}_d=\textbf{V}_d\textbf{F}_{0}^\dagger \textbf{f}_d\).

Discrepancy principle

Let us assume that an estimate \(\delta \) for the norm of the noise \(\textbf{e}\) appearing in (2) is available. Then the discrepancy principle computes the regularization parameter for the regularized solution of the generic regularization problem (3) by imposing

$$\begin{aligned} \mathcal {D}(\textbf{x}_\textrm{reg}):=\Vert \textbf{A}\textbf{x}_\textrm{reg}-\textbf{b}\Vert _2=\eta \delta , \end{aligned}$$
(29)

where \(\eta >1\) (\(\eta \simeq 1\)) is a safety factor. Since the specific expression and properties of the functional \(\mathcal {D}(\textbf{x}_\textrm{reg})\) depend on the considered regularization method, TRIPs-Py includes different versions of (29). In general, for the discrepancy principle to be satisfied, one should make the natural assumption that

$$\begin{aligned} \Vert \textbf{b}_0\Vert _2\le \Vert \textbf{e}\Vert _2\le \Vert \textbf{b}\Vert _2\,, \end{aligned}$$
(30)

where \(\textbf{b}_0\) denotes the orthogonal projection of \(\textbf{b}\) onto the null space of \(\textbf{A}\textbf{A}^T\); see [34].

For T(G)SVD, \(\mathcal {D}(\textbf{x}_h)\) is a decreasing functional of the discrete truncation parameter h: therefore one chooses the truncation parameter \(h^*\) such that

$$ \mathcal {D}(\textbf{x}_{h^*})\ge \eta \delta > \mathcal {D}(\textbf{x}_{h^*+1})\,. $$

Similarly, for purely iterative regularizing methods, \(\mathcal {D}(\textbf{x}_\textrm{reg})\) is evaluated at discrete points (i.e., the number of iterations or, equivalently, the dimension of the projection subspace). For the dth projected problem, \(\mathcal {D}(\textbf{x}_d)\) is essentially the norm of the residual associated to the dth solution. For GMRES and LSQR, \(\mathcal {D}(\textbf{x}_d)\) is a decreasing functional of d: this is a consequence of the optimality properties mentioned in Section 2.2.1. Therefore, to satisfy the discrepancy principle, one stops at the iteration \(d^*\) such that

$$ \mathcal {D}(\textbf{x}_{d^*-1})\ge \eta \delta > \mathcal {D}(\textbf{x}_{d^*})\,. $$

Note that, for GMRES and LSQR, the discrepancy principle is computationally cheap to apply, as the functional \(\mathcal {D}(\textbf{x}_d)\) can be computed with respect to the projected coefficient matrix and right-hand side vector; see the last two equalities in equations (17) and (20), respectively, for a justification.
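In code, the discrepancy-principle stopping rule for a purely iterative method reduces to a simple check on the sequence of (projected) residual norms, as in the following illustrative helper:

```python
def stop_by_discrepancy(residual_norms, delta, eta=1.01):
    """Return the first iteration index d* with ||A x_d - b||_2 <= eta * delta,
    where residual_norms[d] = ||A x_d - b||_2 (or its cheap projected counterpart)."""
    for d, res in enumerate(residual_norms):
        if res <= eta * delta:
            return d
    return len(residual_norms) - 1   # discrepancy principle not satisfied: return the last iterate
```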

When the discrepancy principle is applied to Tikhonov regularization (for either the full-dimensional problem or at each iteration of a projected problem), (29) amounts to a nonlinear equation to be solved with respect to the regularization parameter \(\alpha \). Note that, with the change of variable \(\hat{\alpha }=1/\alpha \) in (7), (23), (27), and \(\hat{\alpha }_d=1/\alpha _d\) in (18), (21), \(\mathcal {D}(\textbf{x}_\textrm{reg})\) is a decreasing convex functional of \(\hat{\alpha }\) and \(\hat{\alpha }_d\), respectively, so that Newton’s method is guaranteed to converge when started on the left of the zero of \(\mathcal {D}(\textbf{x}_\textrm{reg})- \eta \delta \). For hybrid GMRES and hybrid LSQR, i.e., when considering the projected problems appearing in (18) and (21), the discrepancies computed with respect to the full-dimensional and the projected problems coincide, i.e.,

$$\begin{aligned} \mathcal {D}(\textbf{x}_\textrm{reg})=\Vert \textbf{b}-\textbf{A}\textbf{V}_{d}\textbf{F}_{\alpha _d}^{\dagger }\textbf{f}_d\Vert _2= \Vert \textbf{f}_d-\textbf{E}_{d}\textbf{F}_{\alpha _d}^{\dagger }\textbf{f}_d\Vert _2\,,\quad \text {where}\quad \textbf{E}_{d}=\textbf{H}_{d},\textbf{B}_{d} \end{aligned}$$
(31)

for GMRES and LSQR, respectively; see, again, the last two equalities in equations (17) and (20). Therefore, in order for \(\mathcal {D}(\textbf{x}_\textrm{reg})- \eta \delta \) to have a zero, one should also assume that

$$\begin{aligned} \Vert (\textbf{I}-\textbf{E}_d\textbf{F}_0^\dagger )\textbf{f}_d\Vert _2\le \Vert \textbf{e}\Vert _2\le \Vert \textbf{b}\Vert _2\,, \end{aligned}$$
(32)

which is essentially condition (30) applied to the projected problems appearing in (18) and (21). Since the quantity on the left of the first inequality is the norm of the GMRES or the LSQR residuals (recall that, with \(\alpha _d=0\) hybrid GMRES or hybrid LSQR are equivalent to GMRES or LSQR, respectively), and since such residuals decrease with d, the above condition is satisfied when d is sufficiently large (typically after only a few iterations); see also [35]. For methods based on generalized Krylov subspaces, the discrepancies computed with respect to the full-dimensional and the projected problems are different. Namely,

$$\begin{aligned} \mathcal {D}(\textbf{x}_\textrm{reg})=\left( \Vert \textbf{f}_d- \textbf{R}_{\textbf{A}}\textbf{F}_{\alpha }^{\dagger }\textbf{f}_d\Vert _2^2 + \Vert (\textbf{I}- \textbf{Q}_{\textbf{A}}\textbf{Q}_{\textbf{A}}^T)\textbf{b}\Vert _2^2\right) ^{1/2} \end{aligned}$$
(33)

when computed for the full-dimensional problem. The second term in the above sum is dropped when \(\mathcal {D}(\textbf{x}_\textrm{reg})\) is computed with respect to the projected problem; see also [36]. In this setting, in order for \(\mathcal {D}(\textbf{x}_\textrm{reg})-\eta \delta \) to have a zero, a condition similar to (32) should be satisfied; for the full-dimensional problem, the leftmost quantity in (32) should be replaced by \((\Vert (\textbf{I}-\textbf{R}_{\textbf{A}}\textbf{F}_0^\dagger )\textbf{f}_d\Vert _2^2+\Vert (\textbf{I}- \textbf{Q}_{\textbf{A}}\textbf{Q}_{\textbf{A}}^T)\textbf{b}\Vert _2^2)^{1/2}\). By default, for the GKS-based solvers, TRIPs-Py applies the discrepancy principle with respect to the full-dimensional problem.
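To make the Newton-based strategy described above concrete for standard-form Tikhonov regularization, the following standalone sketch (an illustration based on the SVD of \(\textbf{A}\) and the change of variable \(\hat{\alpha }=1/\alpha \), not the TRIPs-Py implementation) selects \(\alpha \) so that the discrepancy equals \(\eta \delta \); it works with the squared discrepancy, which is also decreasing and convex in \(\hat{\alpha }\).

```python
import numpy as np

def dp_tikhonov_svd(A, b, delta, eta=1.01, maxit=50, tol=1e-10):
    """Standard-form Tikhonov: choose alpha such that ||A x_alpha - b||_2 = eta*delta
    by Newton's method in the variable alpha_hat = 1/alpha."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    b_perp2 = np.linalg.norm(b)**2 - np.linalg.norm(beta)**2   # part of b outside range(A)
    target = (eta * delta)**2

    def phi(ahat):
        # squared discrepancy minus target, and its derivative w.r.t. alpha_hat
        w = 1.0 / (1.0 + s**2 * ahat)          # residual filter factors alpha/(s^2 + alpha)
        val = np.sum((w * beta)**2) + b_perp2 - target
        der = -2.0 * np.sum(s**2 * beta**2 * w**3)
        return val, der

    val, der = phi(0.0)
    if val <= 0:
        # condition (30) is violated: eta*delta >= ||b||_2
        raise ValueError("discrepancy principle not applicable to this data")
    ahat = 0.0                                  # start to the left of the zero
    for _ in range(maxit):
        ahat_new = ahat - val / der
        if abs(ahat_new - ahat) <= tol * max(abs(ahat_new), 1.0):
            ahat = ahat_new
            break
        ahat = ahat_new
        val, der = phi(ahat)
    alpha = 1.0 / ahat
    x_reg = Vt.T @ (s * beta / (s**2 + alpha))  # Tikhonov solution for the selected alpha
    return x_reg, alpha
```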

Generalized Cross Validation (GCV)

The GCV criterion prescribes choosing the regularization parameter that minimizes the functional

$$\begin{aligned} \mathcal {G}(\textbf{x}_\textrm{reg})=\frac{\Vert \textbf{A}\textbf{x}_\textrm{reg}-\textbf{b}\Vert _2^2}{\left( \text {trace}(\textbf{I}- \textbf{A}\textbf{A}_\textrm{reg}^{\dagger })\right) ^2}\,. \end{aligned}$$
(34)

This procedure is derived from statistical techniques, starting from the principle that a good regularized solution \(\textbf{x}_\textrm{reg}\) (defined by a good regularization parameter) should be able to predict the exact data \(\textbf{b}_\textrm{true}\) as well as possible. In equation (34), \(\textbf{A}_\textrm{reg}^{\dagger }\) is the regularized inverse of \(\textbf{A}\) (specific to each regularization method), i.e., a matrix such that \(\textbf{x}_\textrm{reg}=\textbf{A}_\textrm{reg}^{\dagger }\textbf{b}\), and the quantity \(\textbf{A}\textbf{A}_\textrm{reg}^{\dagger }\) is often referred to as the ‘influence matrix’. Since GCV does not require any information about the magnitude of the noise affecting the data \(\textbf{b}\), it is the default regularization parameter choice method for the TRIPs-Py solvers that involve TSVD or Tikhonov regularization (for either the full-dimensional or the projected problem).

For (G)SVD spectral filtering methods, the functional \(\mathcal {G}(\textbf{x}_\textrm{reg})\) can be conveniently expressed with respect to the filter factors and quantities appearing in the SVD of \(\textbf{A}\) or the GSVD of \((\textbf{A},\Psi )\). In particular, for T(G)SVD, the functional \(\mathcal {G}(\textbf{x}_h)\) is evaluated at discrete points and its denominator simplifies to \((m-h)^2\), with \(h=1,\dots ,n\) for TSVD and \(h=1,\dots ,k\) for GSVD (see equations (6) and (10), respectively); we refer to [13, §5.4] for further details.
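For TSVD, the simplification just described can be coded in a few lines; the sketch below (an illustration, not the TRIPs-Py implementation) evaluates the GCV function at each truncation index h and returns its minimizer.

```python
import numpy as np

def gcv_tsvd(A, b):
    """Evaluate the GCV functional (34) for TSVD at truncation indices h;
    for TSVD the denominator reduces to (m - h)^2."""
    m, n = A.shape                         # m >= n, as assumed throughout the paper
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    # squared residual norms: ||b||^2 - sum_{i <= h} beta_i^2
    res2 = np.maximum(np.linalg.norm(b)**2 - np.cumsum(beta**2), 0.0)
    h = np.arange(1, min(n, m - 1) + 1)    # keep the denominator (m - h)^2 nonzero
    gcv = res2[:len(h)] / (m - h)**2
    h_opt = int(h[np.argmin(gcv)])
    return h_opt, gcv
```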

When GCV is applied to the hybrid-projection methods based on standard Krylov subspace methods (Section 2.2.1) and to the methods based on generalized Krylov subspaces (Section 2.2.2), the values of \(\mathcal {G}(\textbf{x}_\textrm{reg})\) may depend on whether the regularization parameter is selected for the projected problem only, or by linking the projected problem to the corresponding full-dimensional regularized problem.

For hybrid methods based on standard Krylov subspaces, the numerator of \(\mathcal {G}(\textbf{x}_\textrm{reg})\) (i.e., the square of the functional \(\mathcal {D}(\textbf{x}_\textrm{reg})\)) is the same when computed for both the full-dimensional and the projected problems; see (31). This is not true for GKS-based methods; see (33). Concerning the denominator of \(\mathcal {G}(\textbf{x}_\textrm{reg})\), using the properties of the trace, one can derive the expressions

$$ \zeta - \text {trace}(\textbf{E}_d\textbf{F}_{\alpha }^\dagger ),\quad \text {where}\quad \textbf{E}_d=\textbf{H}_d,\textbf{B}_d,\textbf{R}_{\textbf{A}} $$

for hybrid GMRES, hybrid LSQR and (MM)GKS, respectively. The constant \(\zeta \) is m for all the methods when computed for the full-dimensional problem. For the projected problems, \(\zeta =d+1\) for hybrid GMRES and hybrid LSQR (see [37] for more detailed derivations in the hybrid GMRES case), and \(\zeta =d\) for (MM)GKS. By default, and in agreement with the common choices made in the literature, TRIPs-Py uses the GCV criterion computed for the full-dimensional problem for hybrid GMRES and hybrid LSQR, and the projected GCV version for the solvers based on generalized Krylov subspaces; see [37, 38].

2.4 Regularization operators

This section describes the regularization matrices implemented in TRIPs-Py. We consider two types of operators: those based on a finite-difference discretization of the first derivative operator and those based on framelet operators. Some more details about the usage of these regularization matrices are discussed in Section 3, and an illustration is provided towards the end of the demo demo_Tomo_large_scale.ipynb.

Case 1: Regularization operators based on the first derivative operator

Let

$$\begin{aligned} \Psi _{D}= \begin{bmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \end{bmatrix}\in \mathbb {R}^{(n_D-1)\times n_D} \quad \text{ and }\quad \textbf{I}_{n_D}\in \mathbb {R}^{n_D\times n_D} \end{aligned}$$
(35)

be a rescaled finite-difference discretization of the first derivative operator and the identity matrix of order \(n_D\), respectively. For problems that depend on one or two spatial dimensions (xy) and, possibly, a time dimension t, the matrix \(\Psi _D\) is used to obtain discretizations of the first derivatives in the D-direction, with \(D = x\) (vertical direction), \(D=y\) (horizontal direction), and \(D=t\) (time direction). For a static image represented as a 2D array \(\textbf{X}\in \mathbb {R}^{n_x \times n_y}\), such that \(\textbf{x}= \text {vec}\left( \textbf{X}\right) \in \mathbb {R}^{n_x n_y}\) is obtained by stacking the columns of \(\textbf{X}\), its horizontal and vertical derivatives are given as

$$\begin{aligned} \begin{array}{lcl} \textrm{vec}(\Psi _{x}\textbf{X}) & =& (\textbf{I}_{n_y} \otimes \Psi _{x}) \textbf{x}\in \mathbb {R}^{n_y(n_x-1)}\vspace{0.1cm}\\ \textrm{vec}(\textbf{X}\Psi ^T_y) & =& (\Psi _{y} \otimes \textbf{I}_{n_x}) \textbf{x}\in \mathbb {R}^{(n_y -1) n_x} \end{array}, \end{aligned}$$
(36)

respectively. The discrete gradient is then expressed as

$$\begin{aligned} \Psi _s = [(\textbf{I}_{n_y} \otimes \Psi _{x})^T , (\Psi _{y} \otimes \textbf{I}_{n_x})^T]^T. \end{aligned}$$
(37)

When modelling dynamic inverse problems with a time-varying solution, let \(\textbf{X}\in \mathbb {R}^{n_s\times n_t}\) be the 2D array whose columns store the quantity of interest at the \(n_t\) time instants; note that, if such quantities are 2D images, then the columns of \(\textbf{X}\) are vectorized images with \(n_s=n_xn_y\) pixels. The derivative in the time dimension is then given by \(\textrm{vec}(\textbf{X}\Psi _{t}^T) =(\Psi _{t}\otimes \textbf{I}_{n_s} ) \textbf{x}\in \mathbb {R}^{(n_t -1)n_s}\).
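The operators in (35)–(37), together with the temporal difference operator just introduced, can be assembled with a few sparse Kronecker products. The following is a standalone sketch written directly from these definitions; TRIPs-Py ships its own implementations in trips/utilities/operators.py, and the helper names below are only illustrative.

```python
import numpy as np
import scipy.sparse as sp

def first_derivative(n):
    """(n-1) x n finite-difference matrix Psi_D from (35)."""
    return sp.diags([np.ones(n - 1), -np.ones(n - 1)], [0, 1],
                    shape=(n - 1, n), format='csr')

def discrete_gradient_2d(nx, ny):
    """Stacked discrete gradient Psi_s from (37), acting on x = vec(X), X of size nx x ny."""
    Dx = sp.kron(sp.identity(ny), first_derivative(nx))   # vertical differences, see (36)
    Dy = sp.kron(first_derivative(ny), sp.identity(nx))   # horizontal differences, see (36)
    return sp.vstack([Dx, Dy], format='csr')

def time_derivative(ns, nt):
    """Temporal difference operator (Psi_t kron I_{n_s}) for dynamic problems."""
    return sp.kron(first_derivative(nt), sp.identity(ns), format='csr')
```

For instance, for a \(50\times 50\) image, discrete_gradient_2d(50, 50) returns a sparse matrix of size \(4900\times 2500\), in agreement with the row dimensions in (36).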

Case 2: Regularization operators based on a two-level framelet analysis operator

When regularization leverages sparsity but the desired solution is not sparse in the original domain, a well-known technique is to transform the problem into another domain in which the solution admits a sparse representation. For this purpose, TRIPs-Py provides a two-level framelet analysis operator, defined as follows. Let \(\textbf{W}\in \mathbb {R}^{r\times n}\) with \(1\le n\le r\). The set of the rows of \(\textbf{W}\) is a framelet system for \(\mathbb {R}^n\) if, for all \(\textbf{x}\in \mathbb {R}^n\),

$$\begin{aligned} \Vert \textbf{x}\Vert _2^2=\sum _{j=1}^{r}{(\textbf{w}^T_j\textbf{x})^2}, \end{aligned}$$
(38)

where \(\textbf{w}_j\in \mathbb {R}^n\) denotes the jth row of the matrix \(\textbf{W}\) (written as a column vector), i.e., \(\textbf{W}=[\textbf{w}_1,\textbf{w}_2,\ldots ,\textbf{w}_r]^T\). The matrix \(\textbf{W}\) is referred to as an analysis operator and \(\textbf{W}^T\) as a synthesis operator. We use the same tight frames as in [32, 39, 40], i.e., the system of linear B-splines. This system is formed by a low-pass filter \(\textbf{W}_{0}\in \mathbb {R}^{n \times n}\) and two high-pass filters \(\textbf{W}_1,\textbf{W}_2\in \mathbb {R}^{n \times n}\), whose corresponding masks are

$$\begin{aligned} \textbf{w}^{(0)}=\frac{1}{4}[1,2,1], \quad \textbf{w}^{(1)}=\frac{\sqrt{2}}{4}[1,0,-1], \quad \textbf{w}^{(2)}=\frac{1}{4}[-1,2,-1]. \end{aligned}$$

The analysis operator \(\textbf{W}\) in one space dimension is derived from these masks by imposing reflective boundary conditions, which ensure that \(\textbf{W}^T\textbf{W}=\textbf{I}\). The corresponding two-dimensional operator \(\textbf{W}\) is given by

$$\begin{aligned} \textbf{W}=\begin{bmatrix} [\textbf{W}_{0}\otimes \textbf{W}_{0}]^T, [\textbf{W}_{0}\otimes \textbf{W}_{1}]^T, [\textbf{W}_{0}\otimes \textbf{W}_{2}]^T, [\textbf{W}_{1}\otimes \textbf{W}_{0}]^T, \hdots , [\textbf{W}_{2}\otimes \textbf{W}_{2}]^T \end{bmatrix}^T\!, \end{aligned}$$
(39)

where \(\otimes \) denotes the Kronecker product. This matrix is not explicitly formed. We note that the evaluation of matrix-vector products with \(\textbf{W}\) and \(\textbf{W}^T\) is inexpensive, because the matrix \(\textbf{W}\) is sparse. The operator \(\textbf{W}\) can be used for instance as a regularization operator in GKS and MMGKS.

3 Overview of the TRIPs-Py test problems

In this section, we consider three main classes of test problems. In the first class we consider both 1D and 2D deblurring, with the latter being used to produce both small-scale and large-scale synthetic test problems. In the second class we consider computed tomography with synthetic data, which can be used to generate both small-scale and large-scale inverse problems. The third class contains dynamic inverse problems with real data. The usage of the classes to generate test problems, and of the TRIPs-Py solvers that can be applied to compute their solutions, is illustrated in a number of demos collected in Jupyter notebooks. A complete list of the demos in TRIPs-Py, together with short descriptions, is given in Table 3. For a smooth usage of the package, we recommend that users first download the data needed for the test problems from the Google Drive (Footnote 1) and place the folder data inside the folder demos.

Table 3 List of TRIPs-Py demos

3.1 Deblurring (deconvolution)

Deblurring can be formulated as an integral equation of the kind

$$\begin{aligned} \int \mathcal {B}(s, t)\,x(t)\,dt + e(s) = b(s), \end{aligned}$$
(40)

where \(s, t\in \mathbb {R}^D\) represent spatial information (in TRIPs-Py, \(D=1, 2\)). The kernel \(\mathcal {B}(s, t)\) (also known as the point spread function (PSF)) defines the blur. It is well-known that, if the kernel is spatially invariant (as it is in TRIPs-Py), i.e., if \(\mathcal {B}(s, t) = \mathcal {B}(s-t)\), then (40) is a deconvolution problem. In practical settings, discrete data are collected in finite regions, so that the continuous model (40) is discretized and yields a linear system of equations as in (1). In this particular case, the matrix \(\textbf{A}\in \mathbb {R}^{n\times n}\) represents the blurring operator, which is defined starting from the PSF and the boundary conditions on the unknown quantity of interest. In TRIPs-Py the PSF is given by a rescaled, possibly asymmetric Gaussian of the form

$$\begin{aligned} \mathcal {B}(s,t)=c\exp \left( -\frac{1}{2}(s-t)^T\textbf{B}(s-t)\right) ,\quad \text{ where }\quad c>0,\;\textbf{B}=\text {diag}(\beta _1^2,\dots ,\beta _D^2), \end{aligned}$$
(41)

whose spread parameters \(\beta _1,\dots ,\beta _D\) (\(D=1,2\)) are set by the user; reflective boundary conditions are used. The vector \(\textbf{b}\) contains the vectorized measured blurred and noisy quantity of interest. By default, the synthetic deblurring test problems available within TRIPs-Py avoid inverse crime by allowing a mismatch between the forward operator used to solve the problem, and the forward operator used to generate the data; see [41]. Namely, the latter employs zero boundary conditions. Inverse crimes can be allowed by setting the CommitCrime option to True (the default being False). More details on deblurring can be found in [24].
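For concreteness, the following standalone snippet samples the PSF (41) on a small grid for \(D=2\); this is only an illustration of the role of the spread parameters \(\beta _1,\beta _2\) (TRIPs-Py builds its blurring operators internally from the chosen PSF and boundary conditions).

```python
import numpy as np

def gaussian_psf_2d(size, beta1, beta2):
    """Sample the Gaussian PSF (41) on a size-by-size grid centered at the origin
    and normalize it so that its entries sum to one."""
    r = np.arange(size) - (size - 1) / 2.0
    T1, T2 = np.meshgrid(r, r, indexing='ij')
    psf = np.exp(-0.5 * (beta1**2 * T1**2 + beta2**2 * T2**2))
    return psf / psf.sum()

psf = gaussian_psf_2d(size=15, beta1=3.0, beta2=3.0)   # spreads as in the large-scale example below
```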

Code 1 Setting up a 1D deblurring test problem with the Deblurring1D() class

3.1.1 1D Deblurring

Referring to the notation in (2), in the 1D setting we consider a one-dimensional true signal \(\textbf{x}_\textrm{true} \in \mathbb {R}^{n}\) and a convolution forward operator \(\textbf{A}\in \mathbb {R}^{n \times n}\), with associated smoothed and noisy signal \(\textbf{b}\in \mathbb {R}^{n}\). This problem can be set up in TRIPs-Py by defining an object of the Deblurring1D() class. The method \(\texttt {gen\_xtrue()}\) generates the true signal. The required arguments are the dimension of the problem nx and the test signal test; for the latter, the user can choose among the saved options \(\texttt {piecewise}\), \(\texttt {sigma}\), \(\texttt {curve0}\), \(\texttt {curve1}\), \(\texttt {curve2}\), and \(\texttt {curve3}\). The user can access the forward operator by calling the method \(\texttt {forward\_Op\_1D()}\), whose argument parameter stores the spread of the 1D Gaussian blurring function. The data can be generated using the method \(\texttt {gen\_data()}\). Noise is then added to the data through the method \(\texttt {add\_noise()}\), which takes as input the noiseless data, the distribution of the random noise, and the noise level (for Gaussian and Laplace noise), i.e., the ratio \(\Vert \textbf{e}\Vert _2/\Vert \textbf{A}\textbf{x}_\textrm{true}\Vert _2\). An illustration of the usage of the Deblurring1D() class is shown in Code 1, and more illustrations are presented in the demo \(\texttt {demo\_1D\_Deblurring.ipynb}\). Figure 2 shows the true \(\texttt {'curve2'}\) signal, its blurred and noisy version with 1% Gaussian noise, and reconstructions with TSVD and TGSVD.
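A minimal sketch of the workflow just described follows; the class and method names are those mentioned in the text, while the import path, argument names, and return values are assumptions that may differ from the actual listing in Code 1.

```python
# Hypothetical import path and argument names -- adjust to the actual TRIPs-Py API.
from trips.test_problems import Deblurring1D

Deblur1D = Deblurring1D()
x_true = Deblur1D.gen_xtrue(nx=128, test='curve2')    # true signal; other options: 'piecewise', 'sigma', ...
A = Deblur1D.forward_Op_1D(parameter=10, nx=128)      # 1D Gaussian blurring operator; 'parameter' is the spread
b_true = Deblur1D.gen_data(x_true)                    # blurred, noiseless data
# add 1% Gaussian noise; the returned delta is needed when selecting parameters by 'dp'
b, delta = Deblur1D.add_noise(b_true, 'Gaussian', noise_level=0.01)
```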

Fig. 2

1D image deblurring test problem. (a) \(\textbf{x}_\textrm{true}\) and data with \(1\%\) Gaussian noise, (b) \(\textbf{x}_\textrm{true}\) and naive, unregularized solution, (c) \(\textbf{x}_\textrm{true}\), TSVD and TGSVD solutions, (d) \(\textbf{x}_\textrm{true}\), standard and general form Tikhonov solutions

3.1.2 2D deblurring

In this section we show strategies to define and solve a 2D deblurring problem in TRIPs-Py. Similarly to the 1D case, 2D image deblurring test problems are set up as objects of the Deblurring2D() class. As a first step, we start with a small-scale problem to illustrate the ill-posedness of such an inverse problem and the performance of direct regularization methods applied to it (with some more illustrations available through the notebook \(\texttt {demo\_Deblurring\_small\_scale.ipynb}\)). Then we consider a larger-scale problem to illustrate iterative regularization methods (a full investigation is provided in the demo \(\texttt {demo\_Deblurring\_large\_scale.ipynb}\)). For consistency, we choose the same image in both examples, but the user can choose from other images provided in the package, such as \(\texttt {h\_im}\), \(\texttt {hubble}\), \(\texttt {grain}\), and \(\texttt {sky}\). Other images (in tiff, tif, jpg, png or mat formats) can easily be used in TRIPs-Py by performing the following operations:

  1. Create the folder my_image_data under demos/data and place the desired image inside the folder.

  2. Run the function

    $$\begin{aligned} \texttt {convert\_image\_for\_trips(imag = `image\_name', image\_type=`image\_type')}, \end{aligned}$$

    where \(\texttt {'image\_name'}\) and \(\texttt {'image\_type'}\) refer to the desired image name and type.

This is shown in the demo \(\texttt {demo\_Deblurring\_your\_data.ipynb}\).

Small-scale image deblurring

For this illustration we set \(n_x= n_y = 50\) and define the forward operator to be a Gaussian PSF (41) with parameters \((\beta _1,\beta _2)=(1,1)\). The true image and the blurred and noisy image with \(1\%\) Gaussian noise are shown in Fig. 3(a) and (b). We call methods TSVD, Hybrid LSQR, and MMGKS with parameters specified as follows.

Code 2 Calls to the TSVD, hybrid LSQR, and MMGKS solvers for the small-scale deblurring test problem

We remark that the user should provide the operator \(\textbf{A}\) as a dense matrix for TSVD, and the observed data as a vector for all the methods. The regularization parameter can be a scalar, or it can be chosen by the discrepancy principle (’dp’) or by generalized cross validation (’gcv’). When the regularization parameter is set to ’dp’, the user must specify the noise level parameter delta, returned by the add_noise() function (see Code 1). For MMGKS, the user can specify the regularization matrix \(\textbf{L}\) (for this example we choose \(\textbf{L}\) to be a 2D discretized first-order derivative operator). The values of p and q can be set as well through the input parameters pnorm and qnorm. The approximate solution and information collected through the iterations are returned. Approximate solutions obtained by TSVD, hybrid LSQR, and MMGKS are shown in Fig. 3(c), (d), and (e), respectively. Apart from the visual inspection provided in Fig. 3 for this example, more quantitative measures of the quality of the reconstructed solutions can be found in the demo Jupyter notebook \(\texttt {demo\_Deblurring\_small\_scale.ipynb}\).
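A minimal sketch of the solver calls described above follows; the option names ’dp’, ’gcv’, delta, pnorm, and qnorm come from the text, while the import path and exact function signatures are assumptions.

```python
# Hypothetical imports and signatures -- adjust to the actual TRIPs-Py API.
from trips.solvers import TSVD, hybrid_lsqr, MMGKS

# A_dense: blurring operator as a dense matrix (required by TSVD)
# A, b_vec: forward operator and observed image stored as a vector
# delta:   noise norm returned by add_noise(); L: 2D first-derivative operator
x_tsvd, info_tsvd = TSVD(A_dense, b_vec, regparam='gcv')
x_lsqr, info_lsqr = hybrid_lsqr(A, b_vec, n_iter=100, regparam='dp', delta=delta)
x_mmgks, info_mmgks = MMGKS(A, b_vec, L=L, pnorm=2, qnorm=1, n_iter=100,
                            regparam='dp', delta=delta)
```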

Fig. 3

2D image deblurring small-scale test problem. (a) True image of size \(50 \times 50\) pixels, (b) Blurred and noisy image with \(1\%\) Gaussian noise. Approximate solutions by (c) TSVD, (d) Hybrid LSQR, and (e) MMGKS

Large-scale image deblurring

For the large-scale version we consider the true image of size \(128\times 128\) pixels shown in Fig. 4(a). This image is blurred by a Gaussian PSF with parameters \((\beta _1,\beta _2)=(3,3)\), and \(1\%\) Gaussian noise is added. An illustration of how to define a large-scale 2D deblurring problem is shown in Code 3. The blurred and noisy image is shown in Fig. 4(b). Approximate solutions obtained by hybrid GMRES, hybrid LSQR, and MMGKS are shown in Fig. 4(c), (d), and (e). The calls to these solvers are essentially identical to the ones illustrated for the small-scale example. More details can be found in the demo \(\texttt {demo\_Deblurring\_large\_scale.ipynb}\).

Fig. 4

2D image deblurring large-scale test problem. (a) True image of size \(128 \times 128\) pixels. (b) Blurred and noisy image with \(1\%\) Gaussian noise. Approximate solutions obtained by (c) Hybrid GMRES, (d) Hybrid LSQR, and (e) MMGKS

Code 3 Setting up a large-scale 2D deblurring test problem

3.2 Computerized tomography

Another test problem considered in TRIPs-Py is 2D X-ray computerized tomography (CT), which consists in reconstructing an object (i.e., the attenuation coefficients of an object) from a set of projections along straight lines (i.e., intensities of energy rays recorded by detectors). In the following we briefly describe the physical and mathematical formulation of CT and then illustrate how to define a CT test problem within TRIPs-Py.

Let \(\textbf{x}\!=\![x_1,x_2]^T\), \(\beta \in [-\pi , \pi ]\), \(\varvec{\beta }\!=\![\cos (\beta ), \sin (\beta )]^T\), \(\varvec{\beta }^{\perp }=[-\sin (\beta ), \cos (\beta )]^T\), and \(c\in \mathbb {R}\). Let

$$ \mathcal {L}(\beta , c) =\left\{ \textbf{x}\in \mathbb {R}^2\,|\,\textbf{x}\cdot \varvec{\beta } = c \right\} =\left\{ \textbf{x}\in \mathbb {R}^2\,|\,\textbf{x}=c\varvec{\beta }+\ell \varvec{\beta }^\perp ,\,\ell \in \mathbb {R}\right\} $$

be the line perpendicular to \(\varvec{\beta }\) with signed orthogonal distance c from the origin. Then, assuming that absorption dominates potential scattering effects, the Lambert-Beer law links the attenuation coefficient \(f(\textbf{x})\) of the object we wish to image to the recorded intensity \(I_{\beta ,c}\) of the measured X-ray of incoming intensity \(I_0\) along \(\mathcal {L}(\beta , c)\) as follows

$$\begin{aligned} \underbrace{\int _{\mathcal {L}(\beta , c)}f(\textbf{x})d\ell }_{=\mathfrak {R}[f](\beta , c)}= -\log \left( \frac{I_{\beta ,c}}{I_0}+e_{\beta ,c}\right) , \end{aligned}$$
(42)

where \(e_{\beta ,c}\) is a random perturbation corrupting the measurements. The left-hand side of the above equation, when computed for all \(\beta \)’s and c’s, is the Radon transform of the function f. After a discretization process, from (42) we obtain a discrete formulation of the problem, \(\textbf{A}\textbf{x}= \textbf{b}\). More details on CT can be found in [42].

Code 4 Setting up a large-scale tomography test problem

Similarly to deblurring, we generate a CT test problem in TRIPs-Py by first defining an object of the class Tomography(). By default, the synthetic CT test problems available within TRIPs-Py also avoid inverse crimes by using slightly different forward operators to solve the problem and to generate the data. Namely, the sets of projection angles for the two operators differ by a small constant shift. Inverse crimes can be allowed by setting the CommitCrime option to True (the default being False). The following demos in TRIPs-Py illustrate how to set up and solve tomography test problems.

\(\diamond \):

\(\texttt {demo\_Tomo\_small\_scale.ipynb}\) This notebook defines a tomography test problem with small dimensions, so that the forward operator can be explicitly formed and stored to compute a naive solution or a regularized solution from TSVD.

\(\diamond \):

\(\texttt {demo\_Tomo\_large\_scale.ipynb}\) This notebook showcases how to generate a large-scale tomography problem and how to call solvers for the formulated inverse problem. For both demos described above, to set up the forward operator we use the ASTRA toolbox with ‘fanflat’ 2D geometry. Other geometries, such as ‘parallel’ and ‘fanflat_vec’, can be used. More details on the ASTRA toolbox and its documentation can be found in [11] and references therein.

\(\diamond \):

\(\texttt {demo\_Tomo\_saved\_data.ipynb}\) This notebook demonstrates how to generate a tomography problem where the forward operator and the data are already available. This demo does not depend on the ASTRA toolbox to generate the forward operator.

Code 4 illustrates how to set up a large-scale tomography test problem, where \(n_x = n_y = 256\) and the number of view angles is 50; \(1\%\) Gaussian noise is added to the data. A small-scale problem can be defined similarly by reducing \(n_x\) and \(n_y\). The angles can be limited by varying the parameter \(\texttt {views}\). The true phantom is shown in Fig. 5(a); reconstructions obtained by the iterative methods hybrid LSQR, GKS, and MMGKS are shown in Fig. 5(b), (c), and (d), respectively.
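A minimal sketch of the setup described in Code 4 follows; the class name Tomography() and the parameters nx, ny, views, and CommitCrime come from the text, while the import path, the remaining names, and the call signature are assumptions.

```python
# Hypothetical import path and signature -- adjust to the actual TRIPs-Py API.
from trips.test_problems import Tomography

# keep the default: no inverse crime (whether CommitCrime is a constructor or
# a data-generation argument may differ in the package)
Tomo = Tomography(CommitCrime=False)
# 256 x 256 phantom, 50 view angles, 1% Gaussian noise added to the data
A, b, x_true, delta = Tomo.gen_data(nx=256, ny=256, views=50, noise_level=0.01)
```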

Fig. 5

Tomography test problem. (a) True image of \(256 \times 256\) pixels. Approximate solutions obtained by (b) Hybrid LSQR, (c) GKS, and (d) MMGKS

3.3 Dynamic computerized tomography

Recent technological advancements in detector speed and accuracy have resulted in a growing interest in X-ray computerized tomography, which in turn has prompted the need for new and efficient methods to analyze the collected data. In particular, in this paper and within TRIPs-Py, we are interested in reconstructing a series of images to explore the spatial and temporal properties of data acquired within dynamic CT. The aim of this section is threefold: 1) to briefly describe several regularization terms \(\mathcal {R}\left( \textbf{x}\right) \) for use in dynamic inverse problems, 2) to show how to handle dynamic data and generate dynamic CT problems in TRIPs-Py, and 3) to illustrate the need for temporal regularization along with spatial regularization, as well as the usage of TRIPs-Py solvers for dynamic CT.

In the dynamic setting, we are interested in solving the minimization problem

$$\begin{aligned} \underbrace{ \begin{bmatrix} \widehat{\textbf{A}}^{(1)} \\ & \ddots \\ & & \widehat{\textbf{A}}^{(n_t)} \end{bmatrix}}_{\textbf{A}} \underbrace{\begin{bmatrix} \widehat{\textbf{x}}^{(1)}\\ \vdots \\ \widehat{\textbf{x}}^{(n_t)} \end{bmatrix}}_{\textbf{x}} + \underbrace{\begin{bmatrix} \widehat{\textbf{e}}^{(1)}\\ \vdots \\ \widehat{\textbf{e}}^{(n_t)} \end{bmatrix}}_{\textbf{e}} = \underbrace{ \begin{bmatrix} \widehat{\textbf{b}}^{(1)}\\ \vdots \\ \widehat{\textbf{b}}^{(n_t)} \end{bmatrix}} _{\textbf{b}}. \end{aligned}$$
(43)

The operator \(\textbf{A}\) typically has a block-diagonal structure, where the blocks \(\widehat{\textbf{A}}^{(t)}\) change in time \(t=1,\dots ,n_t\). If \(\widehat{\textbf{A}}^{(t)} = \widehat{\textbf{A}}\) for \(t=1,\dots ,n_t\), the forward operator simplifies to \(\textbf{A}= \textbf{I}_{n_t} \otimes \widehat{\textbf{A}}\), where \(\otimes \) denotes the Kronecker product. Problem (43) may be approached by solving a sequence of ‘static’ regularized problems of the form

$$\begin{aligned} \textbf{x}^{(t)}_\text {static} = \arg \min _{\textbf{x}\in \mathbb {R}^{n_s}} \Vert \widehat{\textbf{A}}^{(t)}\textbf{x}-\widehat{\textbf{b}}^{(t)}\Vert _2^2 + \alpha \Vert \Psi _s\textbf{x}\Vert _q^q, \qquad t = 1,2,\dots , n_t, \end{aligned}$$
(44)

where \(n_s\) is the number of spatial unknowns (constant for all time instances), \(\Psi _s\) is defined as in (37) and \(q>0\) (note that \(q=1\) corresponds to anisotropic total variation in space). However, in order to overcome the difficulties associated with often having only limited information per time instance, and to improve the reconstruction quality, it is crucial to incorporate temporal information into the reconstruction process by solving

$$\begin{aligned} \textbf{x}_\text {dynamic}=\arg \min _{\textbf{x}\in \mathbb {R}^{n}}\Vert \textbf{A}\textbf{x}-\textbf{b}\Vert _2^2 + \alpha \mathcal {R}(\textbf{x}). \end{aligned}$$
(45)

Here \(\mathcal {R}(\textbf{x})\) is a regularization term that takes into consideration both spatial and temporal dimensions. In TRIPs-Py, the user can find three regularization terms for (45); see [21] for more details and a more extended list of such regularizers. In the Bayesian setting, edge-preserving regularization methods for dynamic problems are proposed in [43].

\(\diamond \):

Anisotropic space-time TV We take

$$\begin{aligned} \mathcal {R}(\textbf{x})= \> \sum _{t=1}^{n_t} \Vert \Psi _s\widehat{\textbf{x}}^{(t)}\Vert _1 + \sum _{t=1}^{n_t-1}\Vert \widehat{\textbf{x}}^{(t+1)}-\widehat{\textbf{x}}^{(t)}\Vert _1 = \> \Vert (\textbf{I}_{n_t}\otimes \Psi _s)\textbf{x}\Vert _1 + \Vert (\Psi _t\otimes \textbf{I}_{n_s})\textbf{x}\Vert _1 \end{aligned}$$
(46)

The anisotropic TV terms \(\Vert \Psi _s\widehat{\textbf{x}}^{(t)}\Vert _1\), \(t=1,\dots , n_t\), ensure that the discrete spatial gradients of the images are sparse at each time step; moreover, we enforce that the images do not change considerably from one time instant to the next one by penalizing the 1-norm of their difference.

\(\diamond \):

Isotropic TV in space, anisotropic TV in time Assuming, for simplicity, that \(n_x=n_y\), we take

$$\begin{aligned} \begin{aligned} \mathcal {R}(\textbf{x})&= \> \sum _{\ell =1}^{n}\sqrt{ ((\textbf{I}_{n_t}\otimes \textbf{I}_{n_y}\otimes {\bar{\Psi }}_{x})\textbf{x})_{\ell }^2 + ((\textbf{I}_{n_t}\otimes {\bar{\Psi }}_{y}\otimes \textbf{I}_{n_x})\textbf{x})_{\ell }^2} + \sum _{t=1}^{n_t-1}\Vert \widehat{\textbf{x}}^{(t+1)}-\widehat{\textbf{x}}^{(t)}\Vert _1\\&= \> \Vert \,[(\textbf{I}_{n_t}\otimes \textbf{I}_{n_y}\otimes {\bar{\Psi }}_{x})\textbf{x}, (\textbf{I}_{n_t}\otimes {\bar{\Psi }}_{y}\otimes \textbf{I}_{n_x})\textbf{x}]\,\Vert _{2,1} + \Vert (\Psi _t\otimes \textbf{I}_{n_s})\textbf{x}\Vert _1, \end{aligned} \end{aligned}$$
(47)

where \({\bar{\Psi }}_{D}\) (\(D=x, y\)) denote the square version of the matrix \(\Psi _D\) defined in (35), where a zero row has been added at the bottom, and \(\Vert \cdot \Vert _{2,1}\) denotes the functional defined, for a matrix \(\textbf{Z}\in \mathbb {R}^{m_x\times m_y}\), as \( \Vert \textbf{Z}\Vert _{2,1}=\sum _{i=1}^{m_x}\; \Vert \textbf{Z}_{i,:}\Vert _2\).

\(\diamond \):

Group sparsity In dynamic CT, one can naturally group the spatial variables (pixels) at each time instant, i.e., \(\{\widehat{\textbf{x}}^{(t)}\}_{t=1}^{n_t}\), although there are other possible ways of defining groups. In TRIPs-Py we impose group sparsity across the groups defined by the pixels of the gradient images for all the time instants, i.e., we take

$$\begin{aligned} \mathcal {R}(\textbf{x}) = \sum _{\ell =1}^{n_s'} \left( \sum _{t=1}^{n_t} (\Psi _s\widehat{\textbf{x}}^{(t)})_{\ell }^2 \right) ^{1/2} = \Vert \Psi _s\textbf{X}\Vert _{2,1}, \end{aligned}$$
(48)

where \(n_s' = (n_x-1)n_y + (n_y-1)n_x\) is the total number of pixels in the gradient images and, referring to the notation in (43), \(\textbf{X}=[\widehat{\textbf{x}}^{(1)},\dots ,\widehat{\textbf{x}}^{(n_t)}]\in \mathbb {R}^{n_s\times n_t}\).

Code 5 gives examples of the usage within TRIPs-Py of the basic regularization operators that appear in the regularization functionals listed above, as well as the framelet operator described in Section 2.4. These are coded in the file \(\texttt {operators.py}\) under the directory \(\texttt {trips/utilities}\). All such regularization functionals can be handled in the general framework of MMGKS, as described at the end of Section 2.2.2.

Code 5 Usage of the regularization operators available in TRIPs-Py
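As a purely illustrative complement to Code 5 (this is not the operators.py implementation), the operators whose 1-norms appear in the anisotropic space-time TV term (46) can be assembled directly from the definitions in (35)–(37):

```python
import numpy as np
import scipy.sparse as sp

def first_derivative(n):
    # (n-1) x n forward-difference matrix, as in (35)
    return sp.diags([np.ones(n - 1), -np.ones(n - 1)], [0, 1],
                    shape=(n - 1, n), format='csr')

def space_time_tv_operators(nx, ny, nt):
    """Return the operators I_{n_t} (x) Psi_s and Psi_t (x) I_{n_s} appearing in (46)."""
    Psi_s = sp.vstack([sp.kron(sp.identity(ny), first_derivative(nx)),
                       sp.kron(first_derivative(ny), sp.identity(nx))], format='csr')
    space_op = sp.kron(sp.identity(nt), Psi_s, format='csr')
    time_op = sp.kron(first_derivative(nt), sp.identity(nx * ny), format='csr')
    return space_op, time_op
```

The same \(\Psi _s\) block, applied column-wise to \(\textbf{X}\), is the operator used in the group-sparsity term (48).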

3.3.1 Emoji test problem

This example considers real data of an “emoji” phantom measured at the University of Helsinki [44]. We modify the data to determine two main limited angle problems as follows:

  1. Problem 1: 10 projection angles. From the dataset DataDynamic_128x30.mat we generate the problems \(\widehat{\textbf{A}}^{(t)}\widehat{\textbf{x}}^{(t)} + \widehat{\textbf{e}}^{(t)} =\widehat{\textbf{b}}^{(t)}\), \(t = 1,2,\dots , 33\), where the \(\widehat{\textbf{A}}^{(t)}\) of size \(2,170\times 16,384\) are defined by taking 217 fan-beam projections around only 10 equidistant angles in \([0,2\pi )\). The forward operator \(\textbf{A}\) has size \(71,610 \times 540,672\).

  2. Problem 2: 30 projection angles. From the dataset DataDynamic_128x60.mat we still generate 33 static problems, with \(\widehat{\textbf{A}}^{(t)}\) of size \({6,510\times 16,384}\) computed by taking 217 fan-beam projections around 30 equidistant angles in \([0,2\pi )\). Hence the dynamic forward operator \(\textbf{A}\) has size \(214,830 \times 540,672\).

For both cases, the ground truth \(\textbf{x}_\textrm{true}\) is not available, but photographs of the scanned shapes are shown in Fig. 6.

In TRIPs-Py, the main function to generate emoji data is \(\texttt {generate\_emoji(dataset = your\_dataset)}\), which allows the user to select the dataset to be either 30 or 60. This function automatically downloads the data from the repository (Footnote 2). If a noise level is provided as an argument to \(\texttt {generate\_emoji()}\), then additional Gaussian noise (on top of the unknown noise already present in the recorded data) of the given level is added to the returned data. With \(\texttt {generate\_emoji()}\) we generate data for solving both a sequence of static inverse problems of the form (44), where the operators are saved in Aseq and the sinograms are the columns of B, and the dynamic problem of the form (45), with the block-diagonal matrix A and the stacked sinograms b. These are illustrated in Code 6.

Code 6 Generating the emoji test problem data
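A minimal sketch of the call just described follows; the function name and the returned quantities come from the text, while the import path, the argument names, and the output ordering are assumptions.

```python
# Hypothetical import path and output ordering -- adjust to the actual TRIPs-Py API.
from trips.utilities import generate_emoji

# dataset = 30 or 60 selects DataDynamic_128x30.mat or DataDynamic_128x60.mat;
# an optional noise_level adds Gaussian noise on top of the noise already in the real data.
A, b, Aseq, B = generate_emoji(dataset=30)
# A, b:    block-diagonal operator and stacked sinograms for the dynamic problem (45)
# Aseq, B: per-time-step operators and sinograms for the static problems (44)
```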
Fig. 6

Emoji test problem. True images at time instances \(t = 6,14,20, 26\); taken from [44]

We run 100 iterations of MMGKS to solve both the static problems and the dynamic problem, with the regularization parameter computed by GCV. The usage of the TRIPs-Py functions to solve the static problems is illustrated in Code 7, and the reconstructions obtained at time steps \(t = 6, 14, 20, 26\) are shown in Fig. 7, first row. The function \(\texttt {plot\_recstructions\_series()}\) can be used to display the reconstructions for both the static and the dynamic problems with the argument dynamic = False or dynamic = True, respectively. When displaying the results, we set negative solution entries to 0 as a post-processing step (note that this is different from applying nonnegativity constraints during the reconstruction as done, for instance, in [32, 39]). The usage of the TRIPs-Py functions to solve the dynamic problem is reported in Code 8, where we particularly focus on solvers that, within MMGKS, enforce (an)isotropic TV in space and anisotropic TV in time, as well as group sparsity.

More details about this test problem can be found in the demo demo_dynamic_Emoji.ipynb.

Code 7 Solving the static emoji inverse problems with MMGKS
Fig. 7

Emoji test problem. Reconstruction results with 10 projection angles. The first row shows the reconstructions with spatial anisotropic TV for the static problem, the second and third rows display the reconstructions for the dynamic problem solved with anisotropic and isotropic TV at time instances \(t = 6, 14, 20, 26\), respectively, from left to right

Code 8 Solving the dynamic emoji inverse problem with MMGKS and space-time regularization

3.3.2 STEMPO

Starting from the Spatio-TEmporal Motor-POwered (STEMPO) ground truth phantom from [45], within TRIPs-Py we generate both static and dynamic inverse problems, with both simulated and real data. The data for this example can be downloaded from the repository (Footnote 3).

For the simulated data we consider the STEMPO ground truth phantom stempo_ground_truth_2d_b4.mat, which contains 360 images of size \(560\times 560\) pixels. From this dataset we retain \(n_t\) images, chosen uniformly from 1 to 360 with a factor of 8, i.e., we choose the 1st, the 8th, ... up to the 360th image: these represent the ground truth at \(n_t\) time instances; \(n_t\) can be given in input by the user. Using the ASTRA toolbox [11] we generate the forward operators \(\widehat{\textbf{A}}^{(t)}\), \(t = 1,2,\dots , n_t\), each defined with respect to the angles stored in \(n_t\) vectors of length 11, with 791 parallel rays departing from each angle. Each angle vector is generated by choosing 11 equispaced angles in degrees from \((5(t-1), 5(t-1) + 140)\), for \(t = 1, 2, \dots , n_t\), which are then converted to radians. For instance, for \(n_t = 20\), the forward operators at each time instant are of size \({8701 \times 313600}\), while the dynamic forward operator is a block-diagonal matrix of size \({174020 \times 6272000}\). We perturb each measured vectorized sinogram \(\widehat{\textbf{d}}_\textrm{true}^{(t)}\) with white Gaussian noise of a level given in input by the user. The true image \(\textbf{x}^{(t)}\) at time step \(t = 10\), together with its reconstructions obtained by static anisotropic TV, hybrid LSQR, and dynamic isotropic TV, is shown in Fig. 8. An illustration of both real and simulated data generation is shown in Code 9. More details about this test problem can be found in the demo demo_dynamic_Stempo.ipynb.

Code 9 Generating the STEMPO test problem data
Fig. 8

STEMPO test problem. (a) True image, and reconstructions at time instance \(t=10\) by (b) static anisotropic TV, (c) Hybrid LSQR, (d) dynamic isotropic TV

3.3.3 Cross phantom

The last example considers real data of the cross phantom measured at the University of Helsinki [46]. We obtain the forward operator and the data from the file DataDynamic_128x15.mat, which can be downloaded from the repository (Footnote 4). The spatial resolution is \(128\times 128\) pixels. The CT sinogram that represents the available data consists of 16 time frames that are generated by measuring a 2D cross-section of the dynamic phantom, which is built from an aluminum stick, a graphite stick, and candle wax. The measurements were collected by considering 15 projection angles at each time instance, with the angles shifted by one degree from a given time instance to the next one, i.e., if the angles for time step \(t_1\) are [1, 15, 29, ...], then for time step \(t_2\) they are [2, 16, 30, ...]. The measurement matrix that represents the forward operator, obtained with a cone-beam geometry, is a sparse matrix of size \(33,600 \times 262,144\), and the sinogram matrix is represented as \(\textbf{B}\in \mathbb {R}^{140 \times 240}\), i.e., 15 projections on 16 images. The instance of the Cross Phantom test problem just described can be easily generated in TRIPs-Py as shown in Code 10. The high-resolution filtered back-projection reconstruction of the time-dependent cross phantom computed from 360 projections, along with the sinogram that corresponds to the measurements from the 16 time instances, is shown in Fig. 9. Reconstructed images obtained with MMGKS for the static inverse problems at time instances \(t = 1, 5, 10, 15\) are shown in the first row of Fig. 10. The second row of Fig. 10 shows reconstructed images obtained by solving the dynamic inverse problem with anisotropic TV at time instances \(t = 1, 5, 10, 15\), from left to right. More details about this test problem can be found in the demo demo_dynamic_CrossPhantom.ipynb.

Code 10 Generating the Cross Phantom test problem
Fig. 9

CrossPhantom test problem. Left frame: the high-resolution filtered back-projection reconstruction of the time-dependent cross phantom computed from 360 projections. Right frame: sinogram containing measurements from 80 angles for the 16 time instances

Fig. 10

CrossPhantom test problem. Reconstructions with MMGKS for the static test problems (first row) and anisotropic TV for the dynamic inverse problems (second row) at time steps \(t=1, 5, 10, 15\), from left to right

4 Conclusions and outlook

In this paper we describe TRIPs-Py, a Python package collecting solvers for small- and large-scale discrete ill-posed inverse problems, together with test problems. The package allows the user to easily set up a test problem, with both simulated and real data, and employ built-in regularization methods to solve the inverse problem. Among solvers for small-scale problems, TRIPs-Py includes direct regularization methods such as truncated (G)SVD and Tikhonov. Among iterative regularization methods, TRIPs-Py includes solvers based on both standard and generalized Krylov subspaces. Among the test problems, special emphasis is given to a framework for handling dynamic simulated and real data. We envision that users can provide their own data and use our solution methods within TRIPs-Py, as well as test their own solution methods with the TRIPs-Py test problems.

This is the first public version of TRIPs-Py. Future developments will focus on adding new solution methods, functionalities, and test problems, and on improving the existing data generation and visualization tools. Specifically, in future versions of TRIPs-Py, we plan to:

\(\diamond \):

Include solvers based on flexible Krylov subspace methods to, e.g., enforce sparsity on the desired solution; see [47].

\(\diamond \):

Include methods that can enforce constraints into the solutions; see, e.g., [32, 39, 48].

\(\diamond \):

Implement memory-aware iterative methods for edge-preserving and sparsity-promoting regularization for the solution of large-scale and massive inverse problems; see [49].

\(\diamond \):

Add other strategies to automatically determine the regularization parameter(s).

\(\diamond \):

Add other instances of the existing deblurring test problems (e.g., a variety of PSFs) and tomography test problems (e.g., different scanning geometries).

\(\diamond \):

Add other test problems, such as hyperspectral imaging and spectral CT.

\(\diamond \):

Develop tools for high-order representation of dynamic imaging data.

Users are welcome to contribute solution methods, functionalities, and/or test problems and data to TRIPs-Py.