1 Introduction

Functional magnetic resonance imaging (fMRI) is an imaging technique that can be used to noninvasively monitor brain activity. The working principle of fMRI is based on the fact that neuronal activity is coupled with cerebral blood flow and volume. When a certain part of the brain is activated, the blood flow and amount of oxygen to that area increase more than the metabolic need for oxygen, enabling the related brain activity to be indirectly evaluated by the blood oxygen level-dependent (BOLD) contrast [2].

In a typical fMRI scan, the MRI scanner collects a time series of k-space data, which is used to reconstruct a time series of images. The image reconstruction is traditionally carried out using static image reconstruction methods, based on the assumption that the unknowns are time invariant during the acquisition of the data for the selected k-space trajectory. Typically, when the k-space trajectory is a fully sampled Cartesian trajectory, the reconstruction is carried out using the inverse FFT. When employing non-Cartesian sampling schemes, the usual reconstruction method is regridding, where the non-Cartesian data are interpolated onto a Cartesian grid before the inverse FFT [16]. To obtain sufficient spatial resolution with the classical methods, enough k-space samples per image frame are necessary to fulfill the Nyquist criterion. This comes at the expense of temporal resolution, resulting in a loss of accuracy when studying a physiological process with changes faster than the sampling time for each image frame.

One approach to partially alleviate the loss of temporal resolution is to use a sliding window method [6], where the last available measurement data are combined with a time window of previous data in order to obtain a sufficiently sampled frame of k-space data. This approach, however, is suboptimal as the reconstruction problem becomes inconsistent in the sense that changes might have occurred in the target during the time window of the augmented data frame.

A natural approach to improve the temporal resolution is to use undersampled k-space data. When using undersampled data, it is preferable to use non-Cartesian acquisition, such as spiral or radial sampling, as it allows for denser sampling of the center of the k-space [1]. However, if undersampled data are used, the image reconstruction problem becomes ill-posed. This can lead to severe aliasing artifacts when conventional reconstruction methods, such as inverse FFT or regridding, are used. A well-known static reconstruction approach that allows for the use of drastically undersampled data is compressed sensing (CS) [4]. CS image reconstruction has been extensively used in static MRI, in cardiac MRI [9, 18, 25, 26, 29] and recently also in fMRI [14, 17, 28, 51].

In this paper, we propose a state estimation approach to fMRI. In the state estimation paradigm, the image reconstruction problem is considered explicitly as a time-dependent problem where the image can be updated as soon as a new observation, such as a single spoke of radial data, becomes available. A state evolution model is used to model the unknowns (time series of fMRI images) as a time-dependent process, while the relation between the unknown fMRI image and k-space measurements at each time instant is modeled by a separate observation equation. The objective is to estimate the sequence of states (fMRI images) given the models and the time series of k-space data. Since the state estimation approach allows an update of the state estimate after each new measurement becomes available, it avoids the data inconsistency that is present in the conventional static approaches due to the longer time window needed to acquire the data for a single image in the fMRI time series.

One of the most commonly used methods for computing the state estimates is the Kalman filter (KF) [21], which can additionally be smoothed by using the Kalman smoother (KS). In dynamic cardiac MRI, KF has been utilized in [8, 27, 32, 39, 40, 42] and KS in [32]. In fMRI, KF has been mainly used to estimate certain parameters from the reconstructed data [10, 24, 36]; KF, enhanced with CS, has been used in fMRI image reconstruction [49].

Further, we propose to complement the state estimation approach with a structural prior, which is designed to promote structural similarity of the gradient of the unknown image with the gradient of the anatomical MRI. This anatomical image is usually taken as part of the fMRI measurement protocol, but used only for visualization purposes. Two different approaches are considered. The first one is a structured smoothness prior, obtained by augmenting the state observation matrix with a spatially weighted difference matrix, where the weighting is constructed based on the magnitude of the gradient of the anatomical MRI image [19]. The approach corresponds to augmenting the observation updating density of Bayes filtering with a Gaussian prior model for the spatial features of the unknown state [20]. The second approach is an approximation, where the spatial prior information is utilized by applying a structurally guided total variation (TV) denoising to the Kalman filter estimates at each time step. The rationale behind this heuristic approximation is to obtain shorter computation times compared to the augmented model. However, the approximation does not have known convergence results, whereas the augmented model does. In recent years, there have been several contributions to include an anatomical prior into a static image reconstruction problem, see, e.g., [5, 7, 44, 50]. In this paper, the idea of structured regularization is extended to the dynamic state estimation framework.

The state estimates for both approaches are computed using the Kalman filter and the Rauch–Tung–Striebel (RTS) fixed-interval Kalman smoother. In this study, the state estimation approach is evaluated using simulated and experimental, sparsely collected radial fMRI data from a rat brain. In the computations, the state estimates are updated after each new spoke of radial data becomes available. The approaches are compared with the sliding window method using a full circle of n spokes as well as with a conventional frame-by-frame method using the k-space data in frames of n spokes. The sliding window and conventional estimates are reconstructed with linear least squares (LS).

2 Theory

2.1 Forward Model and Conventional Reconstruction Procedure

Let \(\mathbf {z}_j \in {{\mathbb {C}}}^M\) denote the k-space data for a single spoke in the radial sampling of the k-space, and let \(\mathbf {H}_j\) denote the forward matrix modeling the mapping from the image space to the jth spoke of data in the data space. Assuming that the radial sampling pattern consists of n equally spaced spokes around the full circle, the data and forward matrix for one traditional fMRI image frame can be written as

$$\begin{aligned} \mathbf {z} = \begin{pmatrix} \mathbf {z}_1 \\ \mathbf {z}_2 \\ \vdots \\ \mathbf {z}_n \end{pmatrix}, \quad \mathbf {H} = \begin{pmatrix} \mathbf {H}_1 \\ \mathbf {H}_2 \\ \vdots \\ \mathbf {H}_n \end{pmatrix}, \end{aligned}$$
(1)

respectively, leading to an observation equation

$$\begin{aligned} \mathbf {z} = \mathbf {H}\mathbf {f} + \mathbf {v} \end{aligned}$$
(2)

for a single frame of the sampling scheme. In (2), \(\mathbf {v}\) models the measurement noise and \(\mathbf {f} \in {\mathbb {C}}^{N_{\text {pix}}}\) is the vector representation of the complex-valued \(N_\mathrm{pix}= N \times N\) image.

Assume that the fMRI experiment consists of \(N_\mathrm{frames}\) frames of data of the form (2), and let \(\mathbf {z}^{(\ell )}\) denote the \(\ell \)th frame of data in the time series \(\{ \mathbf {z}^{(\ell )} \in {\mathbb {C}}^{nM}, \ell = 1,\ldots ,N_\mathrm{frames}\}\). In the conventional setup to fMRI reconstruction, the respective image frames of the time series \(\{ \mathbf {f}^{(\ell )} \in {\mathbb {C}}^{N_\mathrm{pix}}, \ell = 1,\ldots ,N_\mathrm{frames}\}\) are reconstructed from the data vector \(\mathbf {z}^{(\ell )}\) using conventional static methods, such as the regridding algorithm or LS. When calculating the LS solution, we need to solve the following minimization problem

$$\begin{aligned} \hat{\mathbf {f}}^{(\ell )} = \arg \min _{\mathbf {f}^{(\ell )}} \left\| \mathbf {z}^{(\ell )} -\mathbf {H} \mathbf {f}^{(\ell )} \right\| ^2, \end{aligned}$$
(3)

which can be solved iteratively by, for example, the LSQR algorithm [31].

We remark that when working with radially sampled data, one can carry out the reconstruction either by using the raw k-space data directly or by using data which have been transformed to the complex-valued Radon transform formalism by a 1D Fourier transform of each measured spoke of k-space data. In the former case, \(\mathbf {H}\) corresponds to a complex-valued (non-uniform) FFT matrix and in the latter to a real-valued Radon transform operator.
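The equivalence of the two formalisms follows from the Fourier slice theorem. As a brief worked statement in continuous form, using notation that is not part of the original text (\(p_\theta \) for the projection at angle \(\theta \), s for the radial coordinate, u for the coordinate along the integration line and k for the position along the spoke), a spoke of k-space data measured at angle \(\theta \) satisfies

$$\begin{aligned} z_\theta (k)&= \iint f(x,y)\, e^{-2\pi i k (x\cos \theta + y\sin \theta )}\, {\hbox {d}}x\, {\hbox {d}}y = \int p_\theta (s)\, e^{-2\pi i k s}\, {\hbox {d}}s, \nonumber \\ p_\theta (s)&= \int f(s\cos \theta - u\sin \theta ,\ s\sin \theta + u\cos \theta )\, {\hbox {d}}u. \end{aligned}$$

Hence the inverse 1D Fourier transform of each spoke with respect to k yields the projection \(p_\theta \), that is, one angle of the Radon transform of the (complex-valued) image. Discretization and scaling conventions are omitted in this sketch.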

2.2 State Estimation Representation

In the state estimation approach, the fMRI problem is modeled by the pair of equations

$$\begin{aligned} \mathbf {f}_t&= \mathbf {F}_{t-1}\mathbf {f}_{t-1}+\mathbf {w}_{t-1} \end{aligned}$$
(4a)
$$\begin{aligned} \mathbf {z}_t&= \mathbf {H}_t\mathbf {f}_t+\mathbf {v}_t, \end{aligned}$$
(4b)

where (4a) is the state evolution model and (4b) is the equation for the observation \(\mathbf {z}_t\) at time index t. In (4a), \(\mathbf {f}_t \in {\mathbb {C}}^{N_\mathrm{pix}}\) is the system state at time t, \(\mathbf {F}_{t-1}\) is the state transition matrix, and \(\mathbf {w}_{t-1}\) is the process noise. \(\mathbf {H}_t\) in (4b) is the observation matrix. In this study, \(\mathbf {z}_t \in {\mathbb {C}}^M\) corresponds to the k-space data for a single radial spoke. Therefore, the state estimation approach provides n images during the cycle of one conventional frame of (2) and \(t=1,\ldots ,T\), where \(T = n N_\mathrm{frames}\), images from the whole fMRI experiment with \(N_\mathrm{frames}\) conventional image frames.

The Kalman filter (KF) is a linear recursive state estimation algorithm that can be used to estimate the states of a dynamic system of the form (4). Under the standard assumptions that \(\mathbf {v}_t\) and \(\mathbf {w}_t\) in (4a)–(4b) are Gaussian zero-mean noise processes with known covariances and mutually independent in the sense that \(\mathbf {v}_k \perp \mathbf {v}_j,\ \mathbf {w}_k \perp \mathbf {w}_j\) for \(k \ne j\) and \(\mathbf {w}_k \perp \mathbf {v}_k\), the Kalman filter produces the minimum mean-square-error estimate of \(\mathbf {f}_t\) based on the measurements \(\mathbf {z}_1, \ldots , \mathbf {z}_t\) collected up to time t. The derivation of the Kalman filter can be found in [38]. In the statistical setup, the KF produces the means and covariances of the time evolution and observation updating probability densities for a Gaussian Bayes filtering problem, see [20, Chapter 4]. The standard KF equations are given by

$$\begin{aligned} \hat{\mathbf {f}}_{0}^+&= E(\mathbf {f}_0) \nonumber \\ \mathbf {P}_{0}^+&= E[(\mathbf {f}_0-\hat{\mathbf {f}}_{0}^+) (\mathbf {f}_0-\hat{\mathbf {f}}_{0}^+)^\mathrm{T}] \nonumber \\ \hat{\mathbf {f}}_t^-&= \mathbf {F}_{t-1}\hat{\mathbf {f}}_{t-1}^+\end{aligned}$$
(5a)
$$\begin{aligned} \mathbf {P}_t^-&=\mathbf {F}_{t-1}\mathbf {P}_{t-1}^+\mathbf {F}_{t-1}^\mathrm{T}+\mathbf {Q}_{t-1}\end{aligned}$$
(5b)
$$\begin{aligned} \mathbf {K}_t&=\mathbf {P} _t^-\mathbf {H}_t^\mathrm{T}(\mathbf {H}_t\mathbf {P}_t^-\mathbf {H}_t^\mathrm{T}+\mathbf {R}_t)^{-1}\end{aligned}$$
(5c)
$$\begin{aligned} \hat{\mathbf {f}}_t^+&=\hat{\mathbf {f}}_t^-+\mathbf {K}_t(\mathbf {z}_t-\mathbf {H}_t\hat{\mathbf {f}}_t^-)\end{aligned}$$
(5d)
$$\begin{aligned} \mathbf {P}_t^+&=(\mathbf {I}-\mathbf {K}_t\mathbf {H}_t)\mathbf {P}_t^-, \end{aligned}$$
(5e)

where \(\hat{\mathbf {f}}_t^-\) is the (complex-valued) a priori estimate, \(\mathbf {P}_t^-\) is the a priori error covariance of the estimate, \(\mathbf {Q}_t\) is the covariance of the process noise, \(\hat{\mathbf {f}}_t^+\) is the a posteriori estimate, \(\mathbf {P}_t^+\) is the a posteriori error covariance of the estimate, \(\mathbf {K}_t\) is the Kalman gain, and \(\mathbf {R}_t\) is the covariance of the observation noise. The error covariances \(\mathbf {P}_t^-\) and \(\mathbf {P}_t^+\) represent the uncertainty in the a priori and a posteriori estimates given the model and the data; large values mean high uncertainty, while smaller values mean a more reliable estimate [38]. The covariance \(\mathbf {Q}_t\) reflects the uncertainty in the state evolution model, and it determines the weighting that is given to the measurements; large covariance values give more weight to the current measurement, while smaller covariance values emphasize the past estimates. Small values of \(\mathbf {Q}_t\) lead to smoother realizations of the time series, but can also cause the estimate to lag behind, while larger values can lead to noisy images.

Since the observation noise \(\mathbf {v}_t\) in MRI is thermal noise, it can be modeled well as a zero-mean white Gaussian process with covariance \(\mathbf {R}_t\) [39]. Furthermore, we assume that there is no correlation in the measurement noise between the time steps and that the variance of the observation noise is the same for all k-space samples, leading to a covariance matrix of the form \(\mathbf {R}_t=\sigma _v^2\mathbf {I}\).

The state evolution model can be used to incorporate temporal prior information and models about the unknowns into the image reconstruction problem. In this study, we employ a simple random walk formulation, where \(\mathbf {F}_{t-1} = \mathbf {I}\) and the process noise \(\mathbf {w}_t\) is modeled as a zero-mean Gaussian process with diagonal covariance \(\mathbf {Q}_t =\sigma _w^2 \mathbf {I}\). In other words, the unknown process is modeled as a random walk in which the variance \(\sigma _w^2\) controls how much the process is allowed to change between consecutive time steps.
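The recursion (5a)–(5e) with the random walk model can be illustrated with a minimal, dependency-free sketch. It is not the implementation used in this study: the state is real-valued, each observation is a single scalar (so the inverse in (5c) reduces to a division), and the function name `kalman_step`, the two-pixel "image" and all numerical values are hypothetical choices made for illustration only.

```python
def kalman_step(f_post, P_post, h, z, sigma_w2, sigma_v2):
    """One pass of (5a)-(5e) for F = I, Q = sigma_w2 * I and a scalar observation."""
    n = len(f_post)
    # (5a)-(5b): the random walk keeps the mean and inflates the covariance by Q.
    f_prior = list(f_post)
    P_prior = [[P_post[i][j] + (sigma_w2 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # (5c): with one scalar measurement the matrix inverse is a division.
    Ph = [sum(P_prior[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + sigma_v2
    gain = [Ph[i] / s for i in range(n)]
    # (5d)-(5e): correct the mean with the innovation and shrink the covariance.
    innovation = z - sum(h[i] * f_prior[i] for i in range(n))
    f_new = [f_prior[i] + gain[i] * innovation for i in range(n)]
    P_new = [[P_prior[i][j] - gain[i] * Ph[j] for j in range(n)] for i in range(n)]
    return f_new, P_new


# Hypothetical toy problem: a static two-pixel "image" observed one pixel at a time,
# mimicking an update after every incomplete measurement.
truth = [1.0, 2.0]
rows = [[1.0, 0.0], [0.0, 1.0]]
f_est = [0.0, 0.0]
P_est = [[1.0, 0.0], [0.0, 1.0]]
for t in range(40):
    h_t = rows[t % 2]
    z_t = sum(h_t[i] * truth[i] for i in range(2))  # noise-free for reproducibility
    f_est, P_est = kalman_step(f_est, P_est, h_t, z_t, sigma_w2=1e-4, sigma_v2=1e-2)
print(f_est)
```

Each update uses an observation that on its own cannot determine the state, yet the sequence of estimates approaches the true values. Raising `sigma_w2` makes the estimate follow the newest measurement more closely, in line with the role of \(\mathbf {Q}_t\) described above.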

2.3 Structural Priors

In this section, we describe how the anatomical image, which is acquired as part of the fMRI experiment, is used to incorporate spatial prior information about the target into the image reconstruction problem. The rationale of the structural prior is to promote structural similarity of the gradients of the unknown fMRI images with the gradient of the anatomical image. In the following, we denote the magnitude of the anatomical image by \(\tilde{\mathbf {f}}_\mathrm{ref}= | \mathbf {f}_\mathrm{ref}|\).

2.3.1 Structured Smoothness Prior

Structured smoothness prior can be obtained by augmenting the observation equation in the state estimation model (4b) by a linear regularization term, leading to the following augmented observation equation

$$\begin{aligned} \begin{pmatrix} \mathbf {z}_t \\ \alpha \mathbf {\kappa } \mathbf {L} \tilde{\mathbf {f}}_{\text {ref}} \end{pmatrix} = \begin{pmatrix} \mathbf {H}_t \\ \alpha \mathbf {\kappa } \mathbf {L} \end{pmatrix} \mathbf {f}_t +\mathbf {v}_t , \end{aligned}$$
(6)

where \(\alpha >0\) is a regularization parameter that controls the strength of the spatial regularization, \(\mathbf {L}\) is an \(N_\mathrm{pix}\times N_\mathrm{pix}\) first-order difference matrix, \(\tilde{\mathbf {f}}_\mathrm{ref}\) is the anatomical MRI, which has been (if needed) resized to the same resolution as the fMRI problem, and \(\mathbf {\kappa }\) is a spatially varying weighting functional

$$\begin{aligned} \mathbf {\kappa } = \mathbf {\kappa }_x + \mathbf {\kappa }_y \end{aligned}$$
(7)

where \(\mathbf {\kappa }_x\) is

$$\begin{aligned} \mathbf {\kappa }_x = \exp \left( -\frac{\left| \tilde{\mathbf {f}}_{\mathrm {ref,}i}- \tilde{\mathbf {f}}_{\mathrm {ref,}i-1} \right| }{C}\right) \end{aligned}$$
(8)

and i denotes the pixel index. \(\mathbf {\kappa }_y\) is defined in the same way for the y-direction. The coefficient \(\mathbf {\kappa }\) controls the weighting of the difference matrix in a given pixel, and C is a positive constant that can be considered as an edge threshold parameter. If the edge magnitude in the anatomical image is clearly larger than C, the weighting \(\mathbf {\kappa }\) for the smoothness regularization at that location will be significantly smaller than one, implying locally less penalty for an edge in the unknown image at that location. Edges whose magnitude in the anatomical image is smaller than C are considered as noise. For details on constructing a similar type of structured regularization for a static inverse problem, see [19, 22, 43].
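The behavior of the weighting (8) can be checked with a short sketch for a single image row. The function name `kappa_x`, the treatment of the first pixel (which has no left neighbor, so its difference is taken as zero) and the reference intensities are assumptions made for illustration; only the formula itself comes from (8).

```python
import math


def kappa_x(ref_row, C):
    """Edge weights of (8) along one row of the reference image.

    The first pixel has no left neighbour; its difference is set to zero
    here, which is an assumption of this sketch.
    """
    weights = []
    for i, value in enumerate(ref_row):
        diff = abs(value - ref_row[i - 1]) if i > 0 else 0.0
        weights.append(math.exp(-diff / C))
    return weights


ref_row = [0.2, 0.2, 0.8, 0.8]  # hypothetical reference intensities scaled to [0, 1]
weights = kappa_x(ref_row, C=0.01)
print(weights)
```

In the flat parts of the row the weight equals one, so the smoothness penalty acts at full strength. At the jump from 0.2 to 0.8 the difference is far above C and the weight is practically zero, which leaves an edge in the unknown image at that location almost unpenalized.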

To make Eq. (6) compatible with the Kalman filter recursions (5a)–(5e), we write

$$\begin{aligned} \tilde{\mathbf {z}}_t =\tilde{\mathbf {H}}_t\mathbf {f}_t +\tilde{\mathbf {v}}_t, \end{aligned}$$
(9)

where

$$\begin{aligned} \tilde{\mathbf {z}}_t&= \begin{pmatrix} \mathbf {z}_t \\ \alpha \mathbf {\kappa } \mathbf {L} \tilde{\mathbf {f}}_{\text {ref}} \end{pmatrix} \nonumber \\ \tilde{\mathbf {H}}_t&= \begin{pmatrix} \mathbf {H}_t \\ \alpha \mathbf {\kappa } \mathbf {L} \end{pmatrix}. \end{aligned}$$

Using these notations, \(\tilde{\mathbf {z}}_t\) replaces \(\mathbf {z}_t\) and \(\tilde{\mathbf {H}}_t\) replaces \(\mathbf {H}_t\) in (5c)–(5e). \(\mathbf {R}_t\) in (5c) becomes \(\tilde{\mathbf {R}}_t\) which is

$$\begin{aligned} \tilde{\mathbf {R}}_t = \begin{pmatrix} \mathbf {R}_t &{} 0 \\ 0 &{} \mathbf {I} \end{pmatrix}. \end{aligned}$$
(10)

The use of the augmented model (6) corresponds to augmenting the observation updating probability density of a Gaussian Bayes filtering problem with a Gaussian prior model for the spatial features of the state \(\mathbf {f}_t\). The augmented Kalman filter algorithm produces the means and covariances for the evolution and modified observation updating densities. For a detailed derivation and interpretation of the augmented model, see [20, Chapter 4.4] and [37].
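To make the role of the augmentation concrete, the measurement update computed with (9) and (10) can be read as a penalized least squares problem. The following is a standard identity for linear Gaussian models, given here as a sketch and not taken from the original derivation; the superscript T is understood as the conjugate transpose for complex-valued quantities:

$$\begin{aligned} \hat{\mathbf {f}}_t^+ = \arg \min _{\mathbf {f}} \Big \{&(\mathbf {z}_t - \mathbf {H}_t\mathbf {f})^\mathrm{T}\mathbf {R}_t^{-1}(\mathbf {z}_t - \mathbf {H}_t\mathbf {f}) + \alpha ^2 \left\| \mathbf {\kappa }\mathbf {L}(\mathbf {f} - \tilde{\mathbf {f}}_\mathrm{ref}) \right\| ^2 \nonumber \\&+ (\mathbf {f} - \hat{\mathbf {f}}_t^-)^\mathrm{T}(\mathbf {P}_t^-)^{-1}(\mathbf {f} - \hat{\mathbf {f}}_t^-) \Big \} . \end{aligned}$$

The first term is the data fidelity for the current spoke, the second term is the structural penalty obtained from the lower block of \(\tilde{\mathbf {H}}_t\) together with the identity block of \(\tilde{\mathbf {R}}_t\), and the third term ties the estimate to the prediction from the state evolution model. Where \(\mathbf {\kappa }\) is small, that is, at edges of the anatomical image, the second term contributes little.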

2.3.2 Structured Total Variation Denoising

As the second approach for incorporating structural prior information into the state estimation, we consider a heuristic approximation, where a structurally guided TV denoising is applied to the Kalman filter estimate (5d) at each time step t of the Kalman recursion [utilizing the original observation equation in (4)]. The rationale in the approach is to utilize the spatial prior information in an approximate but computationally efficient way. However, a downside of the approach is that it does not have theoretical results for convergence.

TV was first introduced for image denoising by Rudin et al. in 1992 [35], and it has since then been used extensively in image denoising and regularization of image reconstruction problems. TV denoising has also been previously implemented in fMRI [28], but did not include anatomical information.

The functional we minimize in the denoising step is a locally and directionally weighted TV functional

$$\begin{aligned} \varPsi (\mathbf {u}_t^s)=\int _{\varOmega }\sqrt{(\nabla \mathbf {u}_t^s)^\mathrm{T}\mathbf {D}(\nabla \mathbf {u}_t^s) + \beta } \, {\hbox {d}}x, \end{aligned}$$
(11)

where \(\mathbf {u}_t := \hat{\mathbf {f}}_t^+\) is the KF estimate from (5d) at time t and

$$\begin{aligned} \mathbf {D}= \mathbf {I} - \lambda \mathbf {\nu }\mathbf {\nu }^\mathrm{T} \end{aligned}$$
(12)

is a weighting matrix, which implements a structured TV using the parallel level sets approach [7, 23], and \(\varOmega \) is the whole image domain. Choosing \(\mathbf {D} = \mathbf {I}\) would convert the functional to conventional (isotropic) TV. \(\lambda \) in (12) is defined as

$$\begin{aligned} \lambda = 1 - \exp \left( {-\frac{\left\| \nabla \tilde{\mathbf {f}}_\mathrm{ref}\right\| ^2}{C^2}}\right) , \end{aligned}$$
(13)

and \(\mathbf {\nu }\) is defined as

$$\begin{aligned} \mathbf {\nu } = \left\{ \begin{array}{ll} \frac{\nabla \tilde{\mathbf {f}}_\mathrm{ref}}{\left\| \nabla \tilde{\mathbf {f}}_\mathrm{ref}\right\| } &{} \mathrm{when}\ \left\| \nabla \tilde{\mathbf {f}}_\mathrm{ref}\right\| \not = 0 \\ \mathbf {0} &{} \mathrm{otherwise}. \end{array} \right. \end{aligned}$$
(14)
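The effect of the weighting matrix can be seen by evaluating (12)–(14) at a single pixel. The sketch below uses hypothetical helper names (`weighting_matrix`, `penalty`) and hypothetical gradient values, and it evaluates the quadratic form \((\nabla \mathbf {u})^\mathrm{T}\mathbf {D}(\nabla \mathbf {u})\) that appears under the square root in (11).

```python
import math


def weighting_matrix(ref_grad, C):
    """Build the 2 x 2 matrix D of (12) from the reference gradient at one pixel."""
    gx, gy = ref_grad
    norm = math.hypot(gx, gy)
    lam = 1.0 - math.exp(-(norm ** 2) / C ** 2)                    # (13)
    nu = (gx / norm, gy / norm) if norm != 0.0 else (0.0, 0.0)     # (14)
    return [[1.0 - lam * nu[0] * nu[0], -lam * nu[0] * nu[1]],
            [-lam * nu[1] * nu[0], 1.0 - lam * nu[1] * nu[1]]]


def penalty(D, grad):
    """Quadratic form grad^T D grad for a 2D gradient vector."""
    gx, gy = grad
    return gx * (D[0][0] * gx + D[0][1] * gy) + gy * (D[1][0] * gx + D[1][1] * gy)


# Hypothetical pixel on a strong vertical edge of the reference image.
D_edge = weighting_matrix((0.5, 0.0), C=0.01)
# Hypothetical pixel in a flat region of the reference image.
D_flat = weighting_matrix((0.0, 0.0), C=0.01)

across = penalty(D_edge, (1.0, 0.0))  # image gradient parallel to the reference gradient
along = penalty(D_edge, (0.0, 1.0))   # image gradient perpendicular to it
print(across, along, D_flat)
```

At the edge pixel, an image gradient that is parallel to the reference gradient is not penalized, while a perpendicular gradient is penalized at full strength. In the flat region \(\mathbf {D}\) reduces to the identity and the functional behaves like conventional TV.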

We minimize (11) using the interior point method of [3], whose iteration is

$$\begin{aligned} \nabla G(\mathbf {u}_t^{s+1})=\nabla G(\mathbf {u}_t^{s}) - \gamma \nabla \varPsi (\mathbf {u}_t^s), \end{aligned}$$
(15)

where \(G(\mathbf {u}_t)\) is a function that defines the domain of \(\mathbf {u}_t\), \(s=0,\ldots ,S\) is the iteration index of the interior point algorithm with a total of S iterations and \(\gamma \) is a regularization parameter that controls the strength of the (structured) TV denoising. In this study, the function \(G(\mathbf {u}_t)\) is selected as \(G(\mathbf {u}_t)=(1/2)\left\| \mathbf {u}_t\right\| ^2\), leading to gradient descent iteration

$$\begin{aligned} \mathbf {u}_t^{s+1}=\mathbf {u}_t^{s} - \gamma \mathbf {W} \nabla \varPsi (\mathbf {u}_t^s) \end{aligned}$$
(16)

of \(\varPsi (\mathbf {u}_t)\), where \(\mathbf {W}\) is a weighting matrix. The weighting matrix \(\mathbf {W}\) in (16) is selected as the a posteriori error covariance \(\mathbf {P}_t^+\) in (5e) and has been included to balance the weighting of the TV denoising based on the magnitude of the errors of the state estimate; the denoising smooths areas of high uncertainty more strongly, while having less of an effect on areas of higher certainty. The inclusion of the weighting matrix improves the contrast of the KF estimates and was validated experimentally.

With the selected G, we do not impose constraints on the domain of the estimate. With a different choice of \(G(\mathbf {u}_t)\), one could impose, for example, a positivity constraint on the solution \(\mathbf {u}_t^{s+1}\). In this case, we could choose \(G(\mathbf {u}_t)=\mathbf {u}_t^\mathrm{T}\ln \mathbf {u}_t -\mathbf {u}^\mathrm{T}_t\mathbf {e}\), where \(\mathbf {e}\) is a vector of ones [34].
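The iteration (16) can be sketched in one dimension as follows. This is a simplified illustration, not the implementation used in this study: the scalar weights `d` stand in for the matrix \(\mathbf {D}\), the list `w` stands in for a diagonal \(\mathbf {W}\), and the signal, the step size and \(\beta \) are hypothetical values chosen so that the step is small enough for the functional to decrease.

```python
import math


def tv_value(u, d, beta):
    """Smoothed, edge-weighted 1D TV: sum_k sqrt(d[k] * (u[k+1] - u[k])**2 + beta)."""
    return sum(math.sqrt(d[k] * (u[k + 1] - u[k]) ** 2 + beta)
               for k in range(len(u) - 1))


def tv_gradient(u, d, beta):
    """Gradient of tv_value with respect to u."""
    grad = [0.0] * len(u)
    for k in range(len(u) - 1):
        diff = u[k + 1] - u[k]
        g = d[k] * diff / math.sqrt(d[k] * diff ** 2 + beta)
        grad[k + 1] += g
        grad[k] -= g
    return grad


def tv_denoise(u, d, w, gamma, beta, steps):
    """Weighted gradient descent in the spirit of (16) with a diagonal W."""
    u = list(u)
    for _ in range(steps):
        grad = tv_gradient(u, d, beta)
        u = [u[k] - gamma * w[k] * grad[k] for k in range(len(u))]
    return u


noisy = [0.02, -0.01, 0.03, 1.01, 0.98, 1.02]  # hypothetical noisy step signal
d = [1.0, 1.0, 0.0, 1.0, 1.0]                  # zero weight at the known edge
w = [1.0] * len(noisy)                         # uniform uncertainty in this sketch
denoised = tv_denoise(noisy, d, w, gamma=0.01, beta=0.01, steps=10)
print(tv_value(noisy, d, 0.01), tv_value(denoised, d, 0.01))
```

Because the difference across the known edge carries zero weight, the iteration smooths the two plateaus without pulling them toward each other. Entries of `w` smaller than one would slow the smoothing at the corresponding pixels, which mirrors the role given to \(\mathbf {P}_t^+\) above.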

2.4 Kalman Smoother

The Kalman smoother (KS) implemented in this work is the Rauch–Tung–Striebel (RTS) fixed-interval smoother. The RTS smoother is defined by the following equations

$$\begin{aligned} \mathbf {K}_{fs,t}&= \mathbf {P}_t^+\mathbf {F}_{t}^\mathrm{T}(\mathbf {P}_{t+1}^-)^{-1} \end{aligned}$$
(17a)
$$\begin{aligned} \hat{\mathbf {f}}_{fs,t}&=\hat{\mathbf {f}}_t^+ + \mathbf {K}_{fs,t}(\hat{\mathbf {f}}_{fs,t+1}-\hat{\mathbf {f}}_{t+1}^-), \end{aligned}$$
(17b)

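where \(\mathbf {K}_{fs,t}\) is the smoother gain, \(\hat{\mathbf {f}}_{fs,t}\) is the smoothed estimate, and \(t=T-1,\ldots ,0\). The backward recursion is initialized with the last filter estimate, \(\hat{\mathbf {f}}_{fs,T} = \hat{\mathbf {f}}_T^+\).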

When calculating the KS estimate, the KF recursion (5a)–(5e) is calculated first through \(t=1,\ldots ,T\) (forward pass through the data) and the smoother is calculated as a backward pass. This requires saving the smoother gains (17a) during the forward pass.
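The forward and backward passes can be sketched for a scalar random walk state. The sketch uses the textbook form of the RTS recursion, in which the correction is measured against the predicted estimate \(\hat{\mathbf {f}}_{t+1}^-\), and it stores the filter covariances and forms the gains in the backward pass instead of storing the gains themselves. The function names and all numerical values are hypothetical.

```python
def kalman_forward(z, f0, p0, q, r):
    """Scalar Kalman filter with F = H = 1; index 0 holds the initial estimate."""
    f_prior, p_prior = [None], [None]   # no prediction exists for t = 0
    f_post, p_post = [f0], [p0]
    for z_t in z:
        f_minus = f_post[-1]                              # (5a)
        p_minus = p_post[-1] + q                          # (5b)
        gain = p_minus / (p_minus + r)                    # (5c)
        f_prior.append(f_minus)
        p_prior.append(p_minus)
        f_post.append(f_minus + gain * (z_t - f_minus))   # (5d)
        p_post.append((1.0 - gain) * p_minus)             # (5e)
    return f_prior, p_prior, f_post, p_post


def rts_backward(f_prior, p_prior, f_post, p_post):
    """Backward pass; the smoothed series starts from the last filter estimate."""
    last = len(f_post) - 1
    smoothed = list(f_post)
    for t in range(last - 1, -1, -1):
        gain = p_post[t] / p_prior[t + 1]                 # (17a) with F = 1
        smoothed[t] = f_post[t] + gain * (smoothed[t + 1] - f_prior[t + 1])
    return smoothed


data = [1.0] * 20  # hypothetical noise-free measurements of a constant signal
f_prior, p_prior, filtered, p_post = kalman_forward(data, f0=0.0, p0=1.0, q=1e-3, r=1e-1)
smoothed = rts_backward(f_prior, p_prior, filtered, p_post)
print(filtered[0], smoothed[0])
```

In this example the filter starts from a poor initial value and approaches the signal only gradually, whereas the smoothed series also uses the later measurements and is therefore closer to the signal at the early time points.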

3 Materials and Methods

3.1 Estimates

The following estimates are computed and compared in the results section:

(AKF):

Augmented Kalman filter with structured regularization. Equations (5a)–(5e), using the observation Eq. (9).

(AKS):

Kalman smoother applied to the results of AKF.

(TV-KF):

Kalman filter (5a)–(5e) with structured TV denoising applied to the KF estimate (5d) at each time step. The observation equation is given by (4b).

(TV-KS):

Kalman smoother applied to the results of TV-KF.

(LS):

Conventional static frame-by-frame estimation based on successive data frames of form (2). Reconstructions are computed using the LS method (3).

(SW):

Sliding window using a full frame of n spokes, such that the last measured spoke of data (say with index \(n-k\) over the rotation of n spokes) is combined with the previous \(n-1\) spokes. The data and observation matrix are of the form

$$\begin{aligned}&\mathbf {z} = \begin{pmatrix} \mathbf {z}_{n-k} \\ \mathbf {z}_{n-k+1} \\ \vdots \\ \mathbf {z}_n \\ \mathbf {z}_1 \\ \vdots \\ \mathbf {z}_{n-k-1} \end{pmatrix}, \mathbf {H} = \begin{pmatrix} \mathbf {H}_{n-k} \\ \mathbf {H}_{n-k+1} \\ \vdots \\ \mathbf {H}_n \\ \mathbf {H}_1 \\ \vdots \\ \mathbf {H}_{n-k-1} \end{pmatrix}. \end{aligned}$$
(18)

The sliding window reconstructions are computed by the least squares approach using data of the form (18).

For an fMRI experiment of \(N_\mathrm{frames}\) conventional frames with n spokes in each frame, the Kalman estimates produce a time series of \(nN_\mathrm{frames}\) images, the sliding window a time series of \(n(N_\mathrm{frames}- 1) + 1\) images and the LS a time series of \(N_\mathrm{frames}\) images from the same time interval. The first sliding window estimate is made after n measurements have been collected, which makes the sliding window time series \(n - 1\) images shorter than the Kalman time series.
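These counts can be verified with a few lines of arithmetic. The helper name `series_lengths` is hypothetical; the spoke and frame counts are the ones reported for the experimental data (52 spokes, 650 frames) and the simulated data (51 spokes, 50 frames) in this paper.

```python
def series_lengths(n_spokes, n_frames):
    """Number of images produced by each method for the same time interval."""
    return {
        "kalman": n_spokes * n_frames,                    # one image per spoke
        "sliding_window": n_spokes * (n_frames - 1) + 1,  # first image after n spokes
        "least_squares": n_frames,                        # one image per full frame
    }


experimental = series_lengths(52, 650)
simulated = series_lengths(51, 50)
print(experimental)
print(simulated)
```

The difference between the Kalman and sliding window counts is \(n - 1\) in both cases, since the sliding window needs one full frame of data before its first estimate.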

We remark that due to the large number of time points in the fMRI time series, the RTS smoother is not calculated over the entire duration nor after every time point in the estimates AKS and TV-KS. Instead, smoother gain matrices (17a) are calculated for all time points, but only h gain matrices are stored in memory at any time point; previous gain matrices are removed when the next ones are calculated. Furthermore, for increased computational speed, after each smoother calculation of (17b) there are b time points that skip the smoother calculations. Selection of the parameters is a compromise between reconstruction properties and computational cost; increasing both h and b would give a temporally smoother estimate at the cost of increased memory usage and computational cost.

3.2 Experimental fMRI Data from a Wistar Rat

Experimental fMRI data were acquired in a 9.4 T horizontal magnet (Agilent, Palo Alto, USA) using a volume coil transmit/quadrature surface coil receive pair (Rapid Biomed, Rimpar, Germany). For animal experiments, a gradient echo-based radial pulse sequence (repetition time 20 ms, echo time 10 ms, flip angle 30\(^{\circ }\), field-of-view 32 \(\times \) 32 mm, slice thickness 1.5 mm, number of points in each spoke 64, 52 spokes collected in sequential order, time per image 1.04 s) was used to repeatedly collect data before and during nicotine injection for 18 min.

There were a total of 1000 frames with 52 spokes in the data. In these computations, we used \(N_\mathrm{frames}= 650\) frames of data by cutting away a part of the initial baseline and the end of the experiment. Thus, while the LS approach produced a time series of 650 images, the state estimation produced a time series of \(T = n N_\mathrm{frames}= 52 \cdot 650 = 33{,}800\) images and the sliding window a time series of \(n (N_\mathrm{frames}- 1) + 1 = 33{,}749\) images from the same time interval.

An anatomical image was collected from the same slice using a gradient echo pulse sequence with otherwise identical acquisition parameters but using full Cartesian sampling to obtain a \(128 \times 128\) anatomical image as the reference image. For the reconstructions, the anatomical image was resized to the same pixel resolution (\(N_\mathrm{pix}= 64^2\)) as the fMRI images and scaled to magnitude values between \([0,\ 1]\). The resized magnitude image is shown in Fig. 1.

All animal procedures were approved by the Animal Ethics Committee of the Provincial Government of Southern Finland and conducted in accordance with the guidelines set by the European Community Council Directives 2010/63/EU. An adult male Wistar rat weighing 385 g was anesthetized with isoflurane for surgery. The femoral artery was cannulated to allow the monitoring of the blood gases and pH. The femoral vein was cannulated for the administration of nicotine. After surgery, the anesthesia was changed to urethane (1.25 g/kg, i.p.). This ensured that the anesthesia level during functional imaging was not at the surgical level, which would hinder the detection of activation [15]. The rat was allowed to breathe a mixture of 70% N\(_2\)–30% O\(_2\) normally, and the breathing rate was monitored. The rat was fixed in a nonmagnetic stereotaxic frame with a bite bar and earplugs for the MRI experiments. Temperature was maintained at approximately 37\(^{\circ }{\text{ C }}\) with water circulation.

During the fMRI experiment, a dose of 0.25 mg/kg nicotine tartrate salt (0.081 mg/kg free base) was injected intravenously. Nicotine was chosen because it is widely used in pharmacological studies and is known to produce large cortical responses [11] under these experimental conditions [30].

For the analysis of the results, a region of interest (ROI) from the cortex was selected. The ROI is highlighted with a red line in Fig. 1.

Fig. 1 Anatomical prior image used with experimental data and the ROI (red line)

3.3 Simulated Data

The \(128 \times 128\) anatomical image of the Wistar rat brain was used as the baseline target of the fMRI simulation. A region of interest (ROI), highlighted in red in the left image of Fig. 2, was defined for the simulation.

Radial k-space data were simulated by using the MATLAB codes by Guerquin-Kern [12] and Guerquin-Kern et al. [13]. Fifty-one equally spaced spokes were used in one frame for a total of \(N_\mathrm{frames}=50\) conventional measurement frames, leading to a time series of 2550 true states. The states were obtained by adding a simulated, time-varying response with peak intensity of 0.1 to the pixel values inside the ROI. The right image in Fig. 2 shows the mean of the time series of true images in the ROI.

Gaussian noise with standard deviation (STD) of 0.005 was added to simulate physiological noise in the phantom images. Random noise was added to the simulated raw k-space data using the code package’s noise simulator, where the noise power is computed relative to the highest k-space frequencies sampled. The average STD of the noise was 0.001, leading to an average SNR of 4.08 in the simulated fMRI data.

A resized version of the baseline phantom was used as the anatomical prior \(\tilde{\mathbf {f}}_\mathrm{ref}\) and normalized to have values between \([0,\ 1]\).

Fig. 2 Left: rat brain baseline image for fMRI simulation. The ROI is highlighted in red. Right: mean signal intensity change of the ROI as a function of time

3.4 Kalman Filter Computations

Computations were done in MATLAB (2015a, The MathWorks Inc., Natick, MA) with Nvidia Geforce GTX Titan X and Nvidia Tesla K40c using single precision.

With both simulated data and experimental data, the TV-KF and TV-KS estimates were computed with Algorithm 1 while the AKF and AKS estimates were computed with Algorithm 2.

The structured TV prior in (11) was discretized by

$$\begin{aligned} \varPsi (\mathbf {u}_t^s)=\sum ^{N_{\text {pix}}}_{k=1}\sqrt{(\nabla \mathbf {u}_t^s)^\mathrm{T}_k\mathbf {D}_k(\nabla \mathbf {u}_t^s)_k + \beta }, \end{aligned}$$
(19)

where the gradient was calculated with the forward difference approximation

$$\begin{aligned} (\nabla \mathbf {u}^s)_k= ({u}_k^s-{u}^s_i,{u}^s_k-{u}^s_j)^\mathrm{T}, \end{aligned}$$
(20)

where i corresponds to the neighboring pixel of pixel k in x-direction and j to the neighboring pixel in y-direction.

In both the simulated and experimental case, the number of iteration steps in the TV denoising (16) was \(S=10\).

3.4.1 Simulated Test Case

The reconstructions were computed using resolution \(N_\mathrm{pix}= 64^2\). The simulated raw k-space data were transformed to the Radon transform formalism by 1D Fourier transform of each simulated spoke. Therefore, the matrix \(\mathbf {H}\) in (2) corresponds to a matrix implementing the Radon transform (or parallel ray x-ray tomography transform).

The value of the variance of the observation noise, \(\sigma ^2_v=1.33 \times 10^{-6}\), was set equal to the variance of the simulated noise that was added to the measurements. The process noise variance, \(\sigma ^2_w=1 \times 10^{-5}\), was chosen by test runs such that the reconstruction errors were smallest, and it was kept constant with respect to time. A complex-valued image reconstructed from the first 51 spokes with the LS approach was used as the initial value for the KF. The initial value of the error covariance \(\mathbf {P}_0\) was diagonal with the values being 0.01% of the variance of the initial estimate. This was selected so that the diagonal values would be close to the final, or steady-state, values, which can be obtained either by iterating Eqs. (5b), (5c) and (5e), which can be done without the data, or by a test run of the whole filter with the data to be reconstructed. In the Kalman smoother, the number of stored smoother gains was \(h=153\) (three full k-space cycles) and the number of time points that were skipped after each smoother calculation was \(b=3\). These values gave a good compromise between computational efficiency and temporal smoothness.

The parameters of the structural priors were selected manually (by test runs) for visually optimal image quality. The parameter values for the structured smoothness regularization (AKF) in Eqs. (6)–(8) were \(\alpha =0.02\) and \(C=0.01\).

The parameter \(\gamma \) in the structured TV denoising (16) was also selected manually for (visually) optimal image quality, leading to the choice \(\gamma =0.25\).

We remark that the structured TV denoising was applied separately to the real and imaginary parts of the Kalman filter estimate (5d), using the real and imaginary parts of the anatomical image as the respective reference images. The same value of \(\gamma \) was used for denoising the real and imaginary parts of the KF estimate.

Fig. 3

Top: mean change of the ROI for the original signal in the simulated case, the LS estimate with 51 spokes per time point, and the SW, TV-KF, TV-KS, AKF and AKS estimates. Middle: relative error of the ROI for LS, SW, TV-KF, TV-KS, AKF and AKS. Bottom: relative error for the entire image \(\varOmega \) for LS, SW, TV-KF, TV-KS, AKF and AKS

3.4.2 Experimental Data

The radial fMRI data were transformed to the Radon transform formalism by 1D Fourier transform of each measured k-space spoke.

The variance of the measurement noise was not available from a calibration or phantom measurement and was therefore selected manually, leading to the choice \(\sigma ^2_v=5 \times 10^{-2}\). The variance of the state noise \(\sigma ^2_w\) was set to the same value as in the simulated test case. The initial values for the estimate and error covariance were also chosen as in the simulated case.

The regularization parameter of the structural prior was selected manually for (visually) optimal estimates. The regularization parameter in (6) was \(\alpha =1\), while the edge weighting parameter C was the same as in the simulated case. In the TV-KF, different values of \(\gamma \) were employed for the denoising of the real and imaginary parts of the Kalman filter estimate: \(\gamma _r=4.7\) for the real part and \(\gamma _i=3.2\) for the imaginary part. Different values were selected because the noise levels in the real and imaginary parts of the images differed. All other parameters were the same as in the simulated case.

3.5 Image Fidelity Measures

In the simulated test case, the fidelity of the reconstructed images with respect to the true target is assessed using the relative \(L_2\) norm error

$$\begin{aligned} \mathrm{Err}(\mathbf {f}) = \frac{\Vert \mathbf {f} - \mathbf {f}_\mathrm{true} \Vert }{\Vert \mathbf {f}_\mathrm{true} \Vert }, \end{aligned}$$

peak signal-to-noise ratio (PSNR) [45] and mean structural similarity index (SSIM) [46] over the whole image domain. The reconstruction quality of the simulated activation response inside the ROI is assessed using the \(L_2\) norm error and contrast-to-noise ratio

$$\begin{aligned} CNR = \frac{|A-A_{\text {base}}|}{\sigma _v}, \end{aligned}$$
(21)

where A is the peak (or minimum if the signal decreases) signal amplitude after the stimulus, \(A_{\text {base}}\) is the mean amplitude of the baseline signal (the signal before the stimulus), and \(\sigma _v\) is the standard deviation of the initial baseline. Other means of calculating the CNR exist for fMRI, but (21) is commonly used [47].
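For concreteness, the relative error and the CNR of (21) can be computed as in the following minimal sketch. The function names, the argument `n_baseline` (number of pre-stimulus samples), the rule for picking A as the post-stimulus sample that deviates most from the baseline mean, and the use of the population standard deviation are assumptions of the example.

```python
import numpy as np

def relative_error(f, f_true):
    """Relative L2 norm error of an estimate f with respect to f_true."""
    return np.linalg.norm(f - f_true) / np.linalg.norm(f_true)

def cnr(signal, n_baseline):
    """Contrast-to-noise ratio of (21) for a 1D ROI time series.

    The first n_baseline samples are treated as the pre-stimulus baseline.
    """
    signal = np.asarray(signal, dtype=float)
    baseline = signal[:n_baseline]
    response = signal[n_baseline:]
    a_base = baseline.mean()
    # Peak or minimum: the post-stimulus sample deviating most from baseline.
    a = response[np.argmax(np.abs(response - a_base))]
    # Assumption: population standard deviation (ddof=0) of the baseline.
    return np.abs(a - a_base) / baseline.std()

f_true = np.array([1.0, 2.0, 2.0])
err_same = relative_error(f_true, f_true)
err_double = relative_error(2.0 * f_true, f_true)

roi_signal = np.array([1.0, -1.0, 1.0, -1.0, 0.0, 5.0, 0.0])
roi_cnr = cnr(roi_signal, n_baseline=4)
```

In the toy series the baseline has zero mean and unit standard deviation, so the CNR equals the peak amplitude after the stimulus.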

Table 1 Integrated values of the relative \(L_2\) norm error in the ROI and the whole domain \(\varOmega \) from Fig. 3, mean SSIM and PSNR values for \(\varOmega \), and mean CNR values for the ROI in the simulated case
Fig. 4

Reconstructions from simulated data. Rows from top to bottom: (1) Left side shows the true target image, with the temporal variations of the indicated column of pixels shown on the right. The vertical line on the temporal variation image shows the time of the image. (2) Same as above, but with SW reconstruction. (3) LS reconstruction. (4) TV-KF reconstruction. (5) TV-KS reconstruction. (6) AKF reconstruction. (7) AKS reconstruction. All images have the color scale of the original image

Fig. 5

a Mean of the experimental data in the ROI with LS, SW, TV-KF, TV-KS, AKF and AKS reconstructions for each time point. The black bars mark the time interval of the close-up. b Close-up of the mean change

4 Results

4.1 Simulated Test Case

Figure 3 shows the mean of the simulated ROI, the relative error of the ROI and the relative error of the entire image domain \(\varOmega \) with respect to time.

Notice that the LS approach produces a new image only after every frame of \(n=51\) spokes. To highlight this, the LS signal and error are displayed as a constant line between consecutive estimates. The image fidelity measures over the time series are given in Table 1.

Figure 4 shows a snapshot of the simulated and reconstructed images at \(t=1530\) and the temporal variations of one column of image pixels for the entire time series. The sliding window estimate is 50 time points shorter than the KF and KS estimates, since the SW time series can be started only once the first full frame of \(n=51\) spokes has been collected.

Figure 3 and Table 1 reveal that the state estimates have smaller reconstruction errors, both in the whole domain and in the ROI, than the LS approach or the sliding window method. The other fidelity measures are also better for the state estimation approaches than for LS or SW. Based on the fidelity measures, the Kalman smoothers give the best estimates, with the TV-KS yielding slightly better results than the AKS.

The rapid and periodic changes shown in Fig. 3 for the TV-KF and AKF are caused by the periodic collection of the radial data, with the largest drop occurring when moving from the last spoke of a circle to the first spoke of the next circle. However, as can be seen, the backward run of the smoother removes this periodic oscillation from the error. Furthermore, the smoothers produce temporally smoother estimates than the other methods.

With the current implementations, the average computation time per time point was 0.0354 s for the TV-KF, 0.4312 s for the TV-KS, 0.4632 s for the AKF and 0.8552 s for the AKS on a GTX Titan X GPU, and 0.0605 s for LS/SW on an Intel Xeon E5507 CPU.

Fig. 6

Reconstructions from experimental data. Rows from top to bottom: (1) Left side shows the LS reconstruction, with the temporal variations of the indicated row of pixels shown on the right. The vertical line on the temporal variation image shows the time of the image. (2) Same as above, but with SW reconstruction. (3) TV-KF reconstruction. (4) TV-KS reconstruction. (5) AKF reconstruction. (6) AKS reconstruction

4.2 Experimental Data

Figure 5a shows the mean of the ROI, indicated by a line in Fig. 1, as a function of time. Figure 5b shows a close-up of the activation region of Fig. 5a, that is, the time interval marked by the black vertical lines in Fig. 5a. Figure 6 shows a snapshot of the reconstructions together with the time series of one row of pixels.

The results in Figs. 5a, b and 6 conform to the findings of the simulated test case: the state estimation approach yields better image quality than LS or SW. The spatial details in the state estimates are clearer, and the images are less noisy than those of the conventional approaches. The KF estimates are also temporally much smoother. All the methods detect the nicotine-induced cortical response in the ROI well. However, the response is detected with a better contrast-to-noise ratio in the state estimates, see Table 2.

Table 2 Mean CNR values calculated with (21) for the ROI in experimental case

5 Discussion and Conclusions

In this paper, we proposed a state estimation approach for fMRI. The proposed method utilizes a structural prior from the anatomical image which is acquired as part of the fMRI measurement protocol. Two different constructions of the structural prior were considered, the first one being a structured smoothness regularization obtained by augmenting the observation equation with a structurally weighted regularization matrix and the second a heuristic approach where the Kalman filter estimates at each time step were denoised by a structurally guided TV denoising.

The state estimation approach was evaluated using simulated and experimental small animal fMRI data from a rat brain and was shown to improve reconstruction quality compared to a conventional frame-by-frame reconstruction or the sliding window method. Overall, the Kalman smoother using the structurally guided TV denoising provided the best estimates of all the methods. While the approach using the structurally guided TV denoising is heuristic and does not have theoretical convergence results, the finding that it produced the best estimates in a computationally efficient form for all test cases considered suggests that it could be a feasible choice in practical applications. The augmented KF, on the other hand, has a sound theoretical justification in the statistical Bayes filtering framework, with the algorithm converging to the means and covariances of the evolution updating density and the spatially regularized observation updating density in a Gaussian Bayes filtering problem [20, 37].

We also experimented with the KF without any spatial regularization and with non-structured regularization. When no regularization was employed, the reconstructed images were highly noisy and clearly inferior to those of the augmented KF or the TV-KF, implying that spatial regularization is highly beneficial in the state estimation approach for fMRI. We also compared the structured regularizations against their non-structured versions and found that the structurally guided regularization improves the details of the images compared to non-structurally guided regularization. To compare the KF estimates with a sparsity-promoting regularized least squares approach, we computed regularized LS estimates using the structured TV functional as the regularization functional and the same data that were used in the LS and SW approaches. While the regularization improved the reconstruction fidelity over the non-regularized LS or SW, the Kalman filter estimates were more accurate than the TV-regularized reconstructions.

The construction of the structural prior is based on the assumption that the reference image and the fMRI images are co-registered. In small animal imaging this seems reasonable, since the animal is anesthetized during the experiment and kept in a fixed position with a support inside the MRI scanner. If needed, the co-registration can be verified by checking the anatomical image against a standard reconstruction of the fMRI data. Also, recent results in [23] indicate that the structurally guided regularization can be an improvement over conventional regularization even in cases where there is uncertainty about the locations of the boundaries.

The weighting for the structurally guided TV denoising in the TV-KF algorithm can be constructed either from the magnitude of the anatomical image, or from the real and imaginary parts of the anatomical image separately for the real and imaginary parts of the unknown, respectively. Whenever the fMRI data and the anatomical reference are acquired using a similar measurement and k-space sampling protocol, either choice produces roughly the same results. However, when different samplings are employed, as in the experimental case of this study with Cartesian sampling for the reference and radial sampling for the fMRI, the magnitude image has to be used for the weighting, because the real and imaginary parts of images obtained with different sampling protocols have different spatial features.

In this work, the state evolution was modeled as a simple random walk process. A topic of future research is the improvement of the methods by using more advanced state evolution models for encoding temporal prior information. These could be based, for example, on simple models of the physiological signals [33] or, if slow changes are expected, on kinematic models [41], where the rates of change of the signal derivatives are controlled by using a higher-order time series model. These approaches, however, have the complication that the computational demand increases due to the increased number of unknown state variables.
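The growth in the number of unknowns can be illustrated by comparing the state transition matrices of the two model types. The sketch below contrasts the random walk model with a first-order kinematic model in which each pixel value is augmented with its rate of change; the function names, the time step `dt` and the specific constant-rate form are assumptions of the example and are not taken from [41].

```python
import numpy as np

def random_walk_transition(n):
    """Random walk: f_{t+1} = f_t + w_t, so the transition matrix is the identity."""
    return np.eye(n)

def kinematic_transition(n, dt):
    """First-order kinematic model on the augmented state [f; f'].

    f_{t+1}  = f_t + dt * f'_t + noise
    f'_{t+1} = f'_t + noise
    """
    block = np.array([[1.0, dt],
                      [0.0, 1.0]])
    # The Kronecker product builds [[I, dt*I], [0, I]] for n pixels.
    return np.kron(block, np.eye(n))

n, dt = 4, 0.5                       # toy image with four pixels (assumption)
F_rw = random_walk_transition(n)
F_kin = kinematic_transition(n, dt)

f = np.array([1.0, 2.0, 3.0, 4.0])   # pixel values
v = np.array([0.2, 0.0, -0.2, 0.4])  # rates of change
predicted = F_kin @ np.concatenate([f, v])
```

With the augmented state, the state dimension doubles and the number of entries in the error covariance matrix grows fourfold, which is the source of the additional computational demand.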

In this study, the state estimation approach was evaluated using time steps of a single k-space spoke. The time resolution and sampling can in principle be selected quite freely; for example, one could use more than one spoke per time instant or use the data in a sliding window manner. For radial sampling, an interesting choice could be to sample the spokes using the golden angle rule [48] instead of small incremental angles, as this would make the sampling of the k-space more uniform and possibly prevent the periodic component of the reconstruction errors that can now be seen with the TV-KF and AKF.
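The difference in angular coverage between the two sampling rules can be checked with a short sketch. We assume here that incremental sampling advances by \(\pi /51\) per spoke (one k-space cycle of 51 spokes) and that the golden angle increment is \(\pi \) divided by the golden ratio, approximately 111.25 degrees; the function names and the choice of comparing the first ten spokes are illustrative.

```python
import numpy as np

GOLDEN_RATIO = (1.0 + np.sqrt(5.0)) / 2.0
GOLDEN_ANGLE = np.pi / GOLDEN_RATIO      # about 111.25 degrees

def spoke_angles(n_spokes, increment):
    """Angles of consecutive spokes, folded into [0, pi)."""
    return (np.arange(n_spokes) * increment) % np.pi

def largest_gap(angles):
    """Largest angular gap between neighbouring spokes on the half circle."""
    a = np.sort(angles)
    # Close the half circle: the last spoke neighbours the first one plus pi.
    gaps = np.diff(np.concatenate([a, [a[0] + np.pi]]))
    return gaps.max()

n_first = 10                             # look at the first ten spokes only
incremental = spoke_angles(n_first, np.pi / 51)   # assumed incremental scheme
golden = spoke_angles(n_first, GOLDEN_ANGLE)

gap_incremental = largest_gap(incremental)
gap_golden = largest_gap(golden)
```

A short run of consecutive golden angle spokes leaves a much smaller largest gap than the same number of incremental spokes, which cover only a narrow sector until the cycle is complete.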