1 Introduction

In computed tomography (CT), objects are scanned using radiographic X-ray projections, which allow for reconstructing the object’s shape and interior structures using well-established formulas and methods. In practice, CT is afflicted with a large variety of errors in the input data, leading to various types of artifacts. Projections in CT are captured by a detector composed of numerous detector elements—one for each pixel. If the detector of the scanning device is imperfect, those defects produce erroneous pixel information, which is then propagated into the volume image that is reconstructed from a scan. As the detector follows the capturing trajectory, so do the defects, leaving accumulated errors in the reconstructed volume, in a shape that depends on the trajectory. For a circular trajectory, for example, the errors form concentric circles. While other trajectories lead to different shapes of the resulting artifacts, the principal problem remains the same, and such errors are commonly known as “ring artifacts”.

Ring artifacts stem from the fact that different detector elements do not measure X-ray radiation identically. Irrespective of the origin, these erroneous pixels are often collectively called defect detector elements in literature [1, 2], though they can be treated differently. While errors stemming from defunct detector elements are usually solved by disregarding their information and replacing it by interpolation, errors coming from incorrectly normalized detectors allow for correction, as they leave usable information in each projection. This work focuses on the latter.

In CT, normalization of the detector is necessary due to hardware limitations. Every detector element, or pixel, exhibits a different gain as well as base response, where the base response is the X-ray intensity reported by the pixel when no radiation reaches the pixel, and gain is the increase in X-ray intensity reported per increase in radiation reaching the pixel. A common method of normalizing the detector is to assume each pixel’s gain to be constant and therefore the relationship between radiation intensity and pixel response to be linear. In that case the function mapping a pixel response to a radiation intensity can be determined from two data points, such as a flood image with maximum radiation intensity and a dark image with no radiation. However, as X-ray images are inherently noisy, so are the flood and dark images, or any other images that would be captured by other normalization methods, leading to normalization errors. The effect of noise can be reduced by averaging multiple images, but it can never be fully eliminated and long normalization times are undesirable, as CT detectors’ pixel response functions change over time, thus requiring frequent re-normalization.
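As an illustration, the two-point normalization described above can be sketched in a few lines (a minimal Python example; the function name and array conventions are ours, not part of any particular CT software):

```python
import numpy as np

def normalize(raw, dark, flood):
    """Two-point (linear) detector normalization sketch.

    Assumes each pixel responds linearly to radiation intensity, so a
    dark image (no radiation) and a flood image (maximum radiation)
    determine the per-pixel mapping. Noise in the dark and flood
    images propagates into per-pixel gain and offset errors, the
    source of the ring artifacts discussed in this work.
    """
    raw = np.asarray(raw, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    flood = np.asarray(flood, dtype=np.float64)
    return (raw - dark) / (flood - dark)
```

In practice, dark and flood are themselves averages over several captured images, which reduces, but never eliminates, the noise discussed above.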

Several researchers perform ring artifact correction (RAC) in the reconstructed image by transforming it into polar coordinates and applying various methods of smoothing, such as a Gaussian filter [3], wavelet and Fourier decomposition [4], or total variation minimization [5, 6]. Since the cause of the artifacts resides in the projections, other methods more commonly seek to correct them in the sinogram, prior to reconstruction. Many authors first replace data from detector elements that are entirely dysfunctional and then correct normalization errors [1, 2]. Researchers often classify the normalization errors of detector elements as defect or non-defect, and correct defect elements using data from non-defect elements [1, 2, 7, 8]. For the correction of normalization errors, a large variety of methods have been proposed, including sinogram filtering [7, 9], adding a DC shift [1], morphological filters [2], multiplication with the inverse mean pixel value [10], the use of neural networks [11], and variation-based regularization [12]. Yousuf et al. perform ring correction in projections that have already been Feldkamp–Davis–Kress (FDK) filtered [7]. Alexeev et al. correct ring artifacts that occur specifically with antiscatter grids [13]. Croton et al. combat normalization errors in phase contrast tomography by correcting both gain and offset of each detector pixel [14]. This requires estimating the correct detector response for varying X-ray intensities, fitting the measured detector response as a linear function of the correct response. They do so by sweeping the X-ray beam across the detector and blurring the resulting projections, using the blurred projections as the correct response. Their gain and offset approach is more physically correct than a flat offset as seen in other works, because normalization errors stem from errors in the normalizing dark and flood projections, leading to errors in the inferred gain and offset.
However, their method requires additional projections to be captured, and directing the X-ray beam is not possible in many CT setups. We adopt their gain and offset based method and adapt the estimation of the correct response to work with conventional CT projections, without requiring alterations to the acquisition process.

A problem of previous correction methods is that they indiscriminately employ all projections of a measurement for estimation of the correct pixel response. While this provides more data and thus less noise, some projections are inherently less fit for inferring a correct response of a pixel, specifically projections with large local variations near that pixel. We show that selectively omitting projections from the estimation process sharply reduces the introduction of blur by the RAC. We devise a method that compensates for normalization errors using sinogram filtering with the main differences to previous work being: (i) correction of every detector element rather than classifying as defect/non-defect, (ii) correction using gain and offset, and (iii) pixel-wise omission of unfit projections for gain and offset estimation. We show that our algorithm can be implemented inline, i.e., it can be performed during image acquisition, at the cost of precision. By performing the compensation inline, any computational overhead of applying the filter can be hidden, therefore allowing reconstruction with no additional time consumption.

2 Method

We aim to improve both image quality and computation time and therefore incorporate the following considerations into our method.

2.1 Image Quality

Our approach is performed on the sinogram, i.e., the projections of the measurement to be corrected. The general idea is to correct the gain and offset of each detector element individually. To estimate the gain and offset, we need to do nothing more than fit a linear function to the dependency ‘measured pixel response’ \(\rightarrow \) ‘correct pixel response’. The fitting itself can be done using simple linear regression. The problem lies in the fact that the ‘correct pixel response’ is unknown and needs to be estimated.

To do this, we make use of the fact that for most projections, a pixel’s correct response is similar to that of its neighbors. A pixel may differ from its neighbors in some projections due to noise or features of the object. But a pixel consistently being brighter or darker than its neighbors would mean there is a tube or ring in the object, exactly one pixel in size, positioned in such a manner that the thickest part of the ring is projected precisely onto the same detector element in each projection. Not only is that improbable, such an object would also have features too small for reliable reconstruction. We assume that any such rings are undesired in the reconstructed image. Our algorithm operates in the following four steps:

Step 1: determine projections to be used. Each detector element p uses its own subset \(\textrm{S}(p)\) of projections for correction. That subset is made up of those projections where the pixel response of p is likely to be similar to those of its neighbors, i.e., projections with little local variation near p.

Let \(\textrm{I}(p,i)\) be the pixel response of detector element p in projection i and let \(\textrm{D}(p,i) = - \log (\textrm{I}(p,i))\) be the negative logarithmized response, which is commonly used for reconstruction. For the local variation, we look at p’s eight neighbors and define the local difference \(\textrm{diff}(p,i)\) in projection i as the greatest difference between the \(\textrm{D}\) of two opposing pixels in that 8-neighborhood. For noise robustness, we select the majority of the projections to be in \(\textrm{S}(p)\). We only filter out those projections where \(\textrm{diff}(p,i)\) is an outlier. We define a projection as an outlier if its \(\textrm{diff}\) is more than one standard deviation greater than the average \(\textrm{diff}\).
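Step 1 can be sketched as follows (a minimal Python illustration; the function name and the (projections, rows, columns) array layout are illustrative choices of ours):

```python
import numpy as np

def projection_subset(D, p):
    """Select the subset S(p) of projections for an interior
    detector element p = (row, col).

    D has shape (n_projections, H, W) and holds the negative-log
    responses. diff(p, i) is the greatest absolute difference in D
    between two opposing pixels of p's 8-neighborhood; a projection
    is an outlier, and excluded, if its diff exceeds the mean diff
    by more than one standard deviation.
    """
    r, c = p
    opposing = [((r - 1, c), (r + 1, c)),          # vertical
                ((r, c - 1), (r, c + 1)),          # horizontal
                ((r - 1, c - 1), (r + 1, c + 1)),  # diagonal
                ((r - 1, c + 1), (r + 1, c - 1))]  # anti-diagonal
    diffs = np.max(np.stack(
        [np.abs(D[:, a[0], a[1]] - D[:, b[0], b[1]])
         for a, b in opposing]), axis=0)
    return np.nonzero(diffs <= diffs.mean() + diffs.std())[0]
```

Note that when all local differences are equal, no projection is excluded, so an unattenuated pixel keeps its full projection set.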

Step 2: estimate correct pixel response. For projections in \(\textrm{S}(p)\), we estimate the correct pixel response \(\bar{\textrm{I}}\) of p to be the median of its eight neighbors. We compute the median \(\bar{\textrm{D}}\) in \(\textrm{D}\), rather than in \(\textrm{I}\), and exponentiate thereafter to avoid a bias towards lower \(\bar{\textrm{I}}\). The median of eight values has a greater variance than an average or weighted average of the same values, assuming Gaussian noise. Nonetheless, we elect to use the median, as it is a better estimator on non-linear slopes.
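The median-based estimate of Step 2, computed in the log domain as described, can be sketched as (again with illustrative naming and the same array layout as above):

```python
import numpy as np

def correct_response_estimate(I, p):
    """Estimate the correct response of interior pixel p = (row, col)
    in each projection as the median of its eight neighbors.

    The median is taken in the negative-log domain D = -log(I) and
    exponentiated back, to avoid a bias towards lower intensities.
    I has shape (n_projections, H, W) with normalized responses.
    """
    r, c = p
    D = -np.log(I)
    neighbors = np.stack([D[:, r + dr, c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)])  # shape (8, n)
    return np.exp(-np.median(neighbors, axis=0))
```

For an even number of values, the median in D averages the two middle log-responses, which corresponds to a geometric mean in I.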

Note that by using neighboring pixel information as the ‘correct pixel response’, we introduce gain and offset errors that those detector elements had into p. Since those errors stem from zero-average noise during the normalization, this process reduces the error on average. In a sense, we are blurring gain and offset errors. Therefore, projections must still be flat-field corrected beforehand as usual, to minimize the initial gain and offset errors. The subset-based approach is used to reduce the risk of blurring features of the projections as well.

While we estimate the correct pixel response using an 8-neighborhood, it is possible to use larger median kernels, to apply a stronger RAC, i.e., a stronger blur to gain and offset errors. However, farther neighbors are poorer predictors of p’s correct pixel response. If a stronger correction is desired, we instead suggest iteratively repeating the RAC with our small kernel size. For the gain and offset noise, this is equivalent to a larger kernel, at a reduced risk of blurring features in the projections.

Step 3: infer gain and offset. Gain and offset can be inferred through linear regression using the data pairs of measured pixel response and correct pixel response from Step 2. However, a small adjustment must be made to the linear regression formula with regard to noise, in order to correct a minor bias.

Additionally, there is a special case where all data pairs of a detector element are too close together to allow for linear regression. This is, for example, the case when a pixel remains unattenuated throughout the measurement and only sees air in each projection. Fortunately, in this case, one can fall back to merely correcting the offset of that detector element, as any error in its gain will have negligible impact. The stability of a linear regression depends on the variance of its data pairs. So to determine whether linear regression is feasible, we can simply ensure the variance of \(\textrm{I}(p,i)\) exceeds some threshold. Since pixel responses are normalized between 0 and 1, that threshold can be data-independent. We chose a threshold of 0.0225, which corresponds to a standard deviation of 0.15. In addition, the gain and therefore covariance of each (uncorrected) detector element is approximately 1. So \(0.9 < \textrm{Cov}(\textrm{I}, \bar{\textrm{I}}) < 1.1\) can be added as a safety check.
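The resulting feasibility test can be sketched as follows (a Python sketch using the thresholds stated above; the function name is ours):

```python
import numpy as np

def gain_fit_feasible(x, y, var_threshold=0.0225):
    """Decide whether a per-pixel gain fit is stable.

    x holds the measured responses I(p, i) over S(p), y the estimated
    correct responses. If Var(x) is at or below the threshold, which
    corresponds to a standard deviation of 0.15 in normalized
    response, we fall back to offset-only correction. The covariance
    bounds implement the safety check stated above.
    """
    if np.var(x) <= var_threshold:
        return False  # fall back to offset-only correction
    cov = np.cov(x, y, bias=True)[0, 1]
    return 0.9 < cov < 1.1
```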

Fig. 1: Visual comparison on synthetic data of no correction, vertical (a) and horizontal (b), purely offset-based correction (c), and gain as well as offset correction (d). Correction using both gain and offset (d) is a visible improvement over offset-only correction (c)

Fig. 2: Deviation from ground truth of the offset inferred from synthetic projections when using all projections (center) or the subset approach (right). Offsets are inferred from projections of a cylinder rotating through the scene (left). Errors are given in the scale of normalized detector response, i.e., one is the difference between minimal and maximal response

Fig. 3: Volume reconstructed from real data using different RAC methods. Higher, more aggressive levels of RAC can be activated in CERA to find broader rings. This somewhat reduces the ring artifacts seen here, but leads to severe streak artifacts in this data set (as illustrated in Fig. 5)

Fig. 4: Volume reconstructed from real data using different RAC configurations. a Using no RAC, b using CERA’s RAC level 1, c using CERA’s RAC level 2, and d using our method

We observe noise both in \(\textrm{I}\) and in \(\bar{\textrm{I}}\). Linear regression assumes that either the x- or y-axis is noisy, but not both. There are variants of linear regression that can work with noise in both axes, but they are unnecessarily complex for this scenario, because we know the ratio of the intensities of both noises. The ratio between the noise of one pixel and the noise of the median of eight pixels is a constant \(r = \frac{1}{0.16817...}\). With gain and offset correction, the relationship between the brightness we should measure and the brightness we measure is some line \(Y = a + m X\). Assume our neighboring pixels without noise describe the correct pixel response. Then each projection is a value pair, measuring x for p’s neighboring pixels and \(y = a + m x\) for p. Both X and Y are measured with noise. Let \(\eta \) be the noise of X and \(\delta \) be the noise of Y, so that \(x = X + \eta \) and \(y = Y + \delta \). Assume \(\eta \) and \(\delta \) each have a mean of 0 and are uncorrelated to all other variables. Let \(e = \text {Var} (\eta )\) be the variance of \(\eta \) and \(d = \text {Var} (\delta )\) be the variance of \(\delta \). Let the mean of X be \(\bar{X} = \bar{x} = \mu \). Then the mean of Y is \(\bar{Y} = \bar{y} = a + m \mu \). Thus, towards calculating m, we start by considering:

$$\begin{aligned} \mathrm {Var(x)} = \mathrm {Var(X)} + e, \end{aligned}$$
$$\begin{aligned} \mathrm {Var(y)} = \mathrm {Var(Y)} + d. \end{aligned}$$

Substituting \(\mathrm {Var(Y)} = m^2 \mathrm {Var(X)}\), which follows from the line equation, as well as \(\mathrm {Cov(x,y)} = m \mathrm {Var(X)}\), into Eq. 2, we can derive:

$$\begin{aligned} m = \frac{\mathrm {Var(y)} - d}{\mathrm {Cov(x,y)}}. \end{aligned}$$

Considering \(r = \frac{d}{e}\), and given that \(e = \mathrm {Var(x)} - \mathrm {Var(X)}\), from Eq. 3 we can obtain:

$$\begin{aligned} m = \frac{\mathrm {Var(y)}}{\mathrm {Cov(x,y)}} - \frac{\mathrm {Var(x)}r}{\mathrm {Cov(x,y)}} + \frac{r}{m}. \end{aligned}$$

Renaming the first two terms on the right side of Eq. 4 as c, we can obtain the gain m as:

$$\begin{aligned} m = \frac{c + \sqrt{c^2 + 4r}}{2}. \end{aligned}$$

With the gain m calculated, the offset can be determined as usual via \(a = \bar{y} - \bar{x} m\).
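A minimal sketch of the resulting estimator, following Eqs. 1–5 with the noise-variance ratio r from above (function and constant names are illustrative):

```python
import numpy as np

# Ratio r between the noise variance of a single pixel and that of
# the median of its eight neighbors (the 1 / 0.16817... from above).
R = 1.0 / 0.16817

def fit_gain_offset(x, y, r=R):
    """Infer gain m and offset a of a pixel from data pairs (x, y),
    where x is the estimated correct response (median of eight
    neighbors) and y the measured response, accounting for noise on
    both axes as derived above.
    """
    # c is the first two terms on the right side of Eq. 4
    c = (np.var(y) - r * np.var(x)) / np.cov(x, y, bias=True)[0, 1]
    m = (c + np.sqrt(c * c + 4.0 * r)) / 2.0  # Eq. 5
    a = np.mean(y) - np.mean(x) * m           # offset as usual
    return m, a
```

In the noise-free case the estimator recovers the exact line: with \(y = a + m x\), one has \(c = m - r/m\) and \(\sqrt{c^2 + 4r} = m + r/m\), so Eq. 5 yields m exactly.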

Step 4: correct pixel response. To correct each pixel value, its response \(\textrm{I}(p,i)\) need merely be inserted as x into \(y = a + m x\), where m and a are the gain and offset of the corresponding detector element p, respectively; y is the corrected pixel response. Note that responses must be corrected for all projections, not only those in \(\textrm{S}(p)\).

2.2 Computation Time

To avoid additional time overhead from applying RAC, it is desirable to perform the correction inline, i.e., during the acquisition of the projections. Naively, this is not possible, as all projections are required for determining the gain and offset values. Fortunately, the algorithm is already able to deal with only a subset of the projections and none of the computations are very costly. This allows for a very simple inline adaptation, where the last few projections acquired are excluded from the subset \(\textrm{S}(p)\) for each pixel p. This enables the computation of gain and offset values and the processing of all projections acquired so far. The remaining projections can then be corrected as soon as they are captured. The number of projections that need to be excluded from \(\textrm{S}(p)\) depends on the acquisition time and the speed of the machine computing the correction. Since our algorithm is trivially parallel, it can easily be implemented on the GPU.

Inlining the algorithm becomes more difficult if the reconstruction is already inlined, i.e., if reconstruction is performed during projection acquisition. If the reconstruction is slower than the acquisition, it is impossible to fully hide the overhead of RAC. If the reconstruction is faster, the difference between acquisition time and reconstruction time leaves room to maneuver. It is possible to sacrifice scope of the RAC for performance. In particular, ring artifacts are most severe near the axis of rotation. By restricting RAC to a region around the axis, one reduces both the number of pixels that need to be corrected and the number of voxels that need to be reconstructed with the correction. As the RAC itself is inherently fast, the reduction in the number of voxels is the major advantage of this region of interest approach. It is possible to first reconstruct the full image without RAC and later perform a much faster region of interest reconstruction using the corrected projections. This way, the full reconstruction need not wait for the RAC and the correction can use most projections to estimate gain and offset. As long as the region of interest reconstruction is faster than the difference between acquisition of the remaining projections and reconstruction using the remaining images, this strategy fully hides the RAC overhead, even for inlined reconstructions. Ring artifacts are more severe in acquisition settings with high noise, which call for longer illumination times to reduce noise. This means that problem sets where RAC is the most useful are also those where acquisition times are long, making the above inlining approach particularly fruitful.

Note that the number of voxels in the region of interest reconstruction scales quadratically with its width. A region of interest with a tenth of the original width only has a hundredth of the original number of voxels. In common filtered backprojection style reconstructions, such as FDK or helical, the effect of pixels on the reconstructed image is entirely additive. This allows us to perform the second, ring artifact corrected reconstruction with only the difference between the corrected and uncorrected projections. This way, only the region of interest inside the projections needs to be stored until RAC, as the difference outside the region of interest will be zero. This sharply reduces memory consumption for small regions of interest.
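The additivity argument can be illustrated numerically with a stand-in linear operator in place of an actual filtered backprojection (a toy example; the matrix B is not a real reconstructor):

```python
import numpy as np

# For any linear reconstruction operator B (filtered backprojection
# is linear in the projection data):
#   B(corrected) = B(uncorrected) + B(corrected - uncorrected).
# The difference projections are zero outside the region of interest
# touched by the RAC, so only that region needs to be stored and
# re-reconstructed.
rng = np.random.default_rng(seed=0)
B = rng.standard_normal((16, 32))   # stand-in for a linear reconstructor
uncorrected = rng.standard_normal(32)
corrected = uncorrected.copy()
corrected[12:18] += 0.1             # RAC alters only a small ROI
one_pass = B @ corrected
two_pass = B @ uncorrected + B @ (corrected - uncorrected)
```

Both reconstructions agree, while the second pass only ever touches the six nonzero difference entries.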

Fig. 5: Top: cross section of an example of artifacts introduced by CERA’s RAC at higher RAC levels on real data. Bottom: a dark (low density) spot gradually forming near the corner of the material as the CERA RAC level increases

Fig. 6: Cross section of reconstructed volume a using no RAC, b using Münch et al.’s method with the DB42 wavelet, 5 decomposition levels, and damping coefficient 1, c using CERA, and d using our method

Table 1 Root mean square error of various RAC methods as compared to ground truth on synthetic data
Fig. 7: Visual comparison on synthetic data reconstructed from a ground truth, b uncorrected projections, and projections corrected with c CERA’s RAC level 1 and d our RAC

Fig. 8: Reconstructed volume using our RAC method only in the volume’s center 62\(\times \)62 column (blue and green boxes). Inferring gain and offset errors from 995 (b) out of 1000 (a) projections yields no significant difference

3 Evaluation and Discussion

In the following, we evaluate our method both with regard to image quality and computational time.

3.1 Image Quality

We compare RAC using gain and offset correction to pure offset correction and no correction on a dataset with severe ring artifacts. Figure 1 shows a visual comparison of the three cases with synthetic data.

To evaluate the projection subset approach, we simulate projections of a cylinder on a detector normalized using dark and flood images with Gaussian noise. We infer offsets first using all projections and second using only the pixel-wise subset of projections. As ground truth, we use the offset that could be inferred from all projections captured without the cylinder. Figure 2 visualizes the deviation from ground truth of the inferred offsets in both cases.

For comparison with existing software, we use the RAC of Siemens CERA, one of the most popular state-of-the-art CT reconstruction programs, as well as Münch et al.’s wavelet and Fourier decomposition method [4].

As can be seen in Figs. 3 and 4, CERA’s RAC level 1 leaves behind some minor artifacts near the center of rotation, which our method is able to remove. CERA allows the user to activate higher, more aggressive levels of RAC. While these higher levels eliminate said remainders of ring artifacts, using them is undesirable, as they introduce new artifacts into the volume, mostly in the form of streaks on and along material borders, depending on their orientation. Figure 5 shows a cross section of these artifacts.

Figure 6 shows a visual comparison between Münch et al.’s wavelet and Fourier decomposition method, CERA, and our method.

For quantitative comparison, we simulated two sets of projections for a Shepp–Logan phantom. In the first set, we simulated noise both in the projections as well as in the gain and offset of pixels. In the second set, we simulated the same noise in the projection images as in the first set, but with ideal pixel gain and offset. We reconstructed the first set with different RAC methods and compared them to a ground truth obtained by reconstructing the second projection set without RAC. Table 1 shows the root mean square error (RMSE) of the individual RAC methods as compared to the ground truth. Both using only an opportune subset of the projections for inference and correcting for gain reduce the remaining error. Although visually reducing ring artifacts significantly, CERA’s RAC slightly alters the density and edges of some parts of the reconstructed phantom, deteriorating the ground truth fit. Münch et al.’s method even increases the error by introducing a broad artifact ring. Figure 7 shows a visual comparison of the volumes reconstructed using our and CERA’s method, as well as the ground truth and uncorrected volumes.

3.2 Computation Time

For performance, we evaluate our algorithm on a single Nvidia Quadro M4000 GPU. We reconstruct a 1024 \(\times \) 1024 \(\times \) 1024 voxel volume using 1000 real data projections sized 1024 \(\times \) 1024. Reconstruction using FDK filtered backprojection takes 21 s. RAC using our algorithm on the full projections takes 1.3 s (6% increase).

To perform the RAC inline, we reconstruct a 1504 \(\times \) 1504 \(\times \) 1256 volume from 1000 projections with 1504 \(\times \) 1256 pixels. We inline both the reconstruction and the RAC with our approach. Since the axis of rotation is the region most affected by ring artifacts, we apply RAC only to a 62 \(\times \) 62 \(\times \) 1256 shaft around the axis of rotation. Therefore, we need to apply the correction only to the center column of 62 \(\times \) 1256 pixels. The projection acquisition requires an average of 843 ms per projection. Offline reconstruction takes 72 s, or 72 ms per projection, leaving 771 ms of headroom relative to the image acquisition time. The ring artifact correction using all projections takes 1900 ms, including re-reconstruction of the 62 \(\times \) 62 \(\times \) 1256 shaft. Therefore, we can use \(1000-\textrm{ceil}\bigg (\frac{1900}{771}\bigg ) = 997\) projections (out of 1000) for RAC and still be able to catch up with the reconstruction. We opt to use only 995 projections, to leave a safety margin, and are able to reconstruct the image fully inline. See Fig. 8 for a comparison between the RAC using all or only 995 images. Thus, our algorithm has completed RAC and backprojection for all but the last projection by the time the last projection finishes capture, thereby requiring virtually no time overhead.

4 Conclusion

We propose an algorithm for ring artifact correction that improves upon common methods of correcting only the normalization offset, by correcting for both gain and offset errors, as well as having each pixel determine its gain and offset errors only from an opportune subset of projections unlikely to introduce spurious corrections. Our method reduces ring artifacts both visually and quantitatively more effectively than if correcting for only offset using all projections for inference, reducing RMSE to 24.3% as compared to 27.5%. We show that our method can be implemented in a computationally efficient manner.