# Interactive GPU active contours for segmenting inhomogeneous objects


## Abstract

We present a segmentation software package primarily targeting medical and biological applications, with a high level of visual feedback and several usability enhancements over existing packages. Specifically, we provide a substantially faster GPU implementation of the local Gaussian distribution fitting energy model, which can segment inhomogeneous objects with poorly defined boundaries as often encountered in biomedical images. We also provide interactive brushes to guide the segmentation process in a semiautomated framework. The speed of our implementation allows us to visualize the active surface in real time with a built-in ray tracer, where users may halt evolution at any time step to correct implausible segmentation by painting new blocking regions or new seeds. Quantitative and qualitative validation is presented, demonstrating the practical efficacy of our interactive elements for a variety of real-world datasets.

## Keywords

Segmentation · Active contours · Level set methods · GPU · Medical applications · Biological applications

## 1 Introduction

- Biosciences:
  - Cellular, developmental and cancer biology.
  - Plant biology, including plant–pathogen interactions.
  - Animal biology, including virus–host interactions and bacterial infections.
  - Microbiology, including food safety.
  - Neuroscience, including connectome projects and developmental neuroscience.
- Medicine:
  - Automated differential diagnosis.
- Diagnostic measurements, shape, and volume, of:
  - Macular holes in retinal degeneration.
  - Aneurysms, clotting and infarction.
  - Tumors, neoplasia and dermatological moles.
  - MRI segmentation in dementia and Alzheimer’s.
- Computer-assisted surgery:
  - Pre-surgical planning and surgery simulation.
  - Guided surgical navigation.

The oldest and most widely cited segmentation approaches are active contours [20]; these are variational frameworks which allow users to define an initial open or closed curve that deforms so as to minimize an energy functional, outlining or surrounding the object of interest. While active contours have been realized as fully automatic approaches without initial contours [25], their original foundation as an assisted approach is still important today as it allows users, such as clinicians, to extract precise measurements from specific objects of interest within a complex image. However, such interactivity relies on real-time visual feedback; therefore, these methods must also be computationally efficient.

Graphics processing units (GPUs) provide energy-efficient parallel computing and enable real-time interactive segmentation for larger 2D or 3D datasets [10, 43], but existing GPU segmentation methods currently rely on simple segmentation criteria, restricting their usage and applications. The popular local Gaussian distribution fitting (LGDF) energy model [47] is much more powerful and able to segment a wider variety of general objects. However, it requires several intermediate processing steps that must be implemented sequentially, making it challenging to implement efficiently on graphics hardware. The current implementation of the LGDF energy model can segment small 2D images (\(99\times 120\) pixels in 27.37 s), but requires several hours of processing for larger 2D or 3D images [47]. For a 3D image of size \(256\times 256\times 160\), this would take 6.6 hours if the implementation were available for 3D, preventing usage in many practical applications.

### 1.1 Contributions

In our approach, we: (1) significantly increase the performance of the LGDF energy model through an optimized GPU implementation, handling much larger 2D images and even 3D images at interactive performance, (2) introduce a novel set of interactive brush functions that are integrated into the GPU kernels so as to modify and constrain the evolving level set in real time, (3) provide a ray tracer to view the segmentation results at each time step, and (4) expose a simpler and more intuitive parameter space to the user, with suggested values and ranges. The combination of these four enhancements greatly improves the practicality of what is already considered a state-of-the-art level set method of particular relevance to the biomedical image processing communities. Our software is shown to be stable with respect to its input parameters and robust to noise through a large experiment on synthetic data and is further evaluated through segmenting a wide variety of real-world images, such as those shown in Fig. 1.

## 2 Related work

The field of active contours first gained mainstream adoption with the ‘active snakes’ model published by [20]. This seminal work proposes iterative evolution of an initial spline curve, with the evolution being governed by the minimization of an energy functional, the local minima of which correspond to curves that fit along prominent edges in the image. Level set methods (core theory explained in [30]) model contours implicitly as the zero-crossing of a scalar field. Originally they were proposed in [31] to model the evolution of inter-region boundaries in physical simulations. Malladi et al. [26] applied level sets to active contours, with the evolution of the contour being governed by its local mean curvature and the intensity gradient magnitude of the image, in such a way that local curvature is reduced and the motion of the contour stops as it approaches an edge. In [5], the authors develop a level set-based active contour framework in which the energy functional is based on the Mumford–Shah model, rather than image edges, which in practice are often faint, blurred, or broken. The Mumford–Shah energy model [28] is minimized by an optimal partition of an image into piecewise smooth segments, and high-quality implementations exist on the GPU [33]. The global optimum can be found using a primal-dual algorithm [4] resulting in a cartoon-like rendering of the original image. Local solutions, such as with a trust-region approach [14], have applications in interactive segmentation, where local edits need to be made frequently.

Deep convolutional neural networks are the state of the art in image segmentation, where millions of parameters of deeply layered convolutions are learned using backpropagation [22]. These models are capable of learning abstract features in the data; however, their reliance on large annotated training datasets makes them unusable for a number of applications.

The influential public datasets with ground-truth segmentations (such as BSDS, MSRC, iCoseg, FlickrMFC, SegTrack) include RGB videos or 2D images of everyday objects such as cars, chairs, and people. Of these, the interactive approaches take as input a set of scribbles where objects follow similar color distributions [53]. Graph cut segmentation is popular in this field, where Grady [15] and Vineet and Narayanan [46] propose GPU implementations. For interactive segmentation in the biosciences, we find the main limitations to be (1) the initialization of the foreground–background scribbles in 3D datasets such as networks and (2) the opaque intermediate steps of the cutting algorithm, which make it difficult to obtain a high level of visual feedback. While popular and easy to validate, these approaches address a different problem from grayscale 3D segmentation of biomedical imaging modalities (such as CT, PET, SPECT, MRI, fMRI, ultrasound, optical imaging and microscopy) [10]. There is still a need for benchmark medical datasets with well-defined interactive performance evaluation [51].

Accelerating image segmentation with GPUs is a large research field with several comprehensive surveys [10, 34, 41, 43]. The survey by [10] covers a broad range of algorithms and different imaging modalities, whereas Smistad et al. [43] focuses more on GPU segmentation with a detailed discussion on the current GPU architecture.

The GPU level set methods in the literature focus on limiting the active computational domain to a small region near the zero-crossing of the level set function, such as the traditional narrow band algorithm [1]. More recent extensions classify the active region using simple operations on the spatial and temporal derivatives of the level set function [36] and then discard unimportant regions through parallel stream compaction. While limiting the active computational domain produces excellent performance with lower memory usage, the current implementations all use simple speed functions that attract the level set to make it grow and/or shrink within a fixed intensity range [18, 23, 36]. In contrast, the LGDF model proposed by [47] is able to segment much more challenging images, in which objects exhibit intensity inhomogeneity or even have the same mean intensity as their background, being distinguished only by intensity variance. However, to date the only existing implementation runs on the CPU, likely due to the sequential dependency of convolutions in the intermediate steps. Further, the LGDF model is derived from [5] who introduce \(C^\infty \) regularization of the Heaviside and Dirac functions which are nonzero everywhere, unlike the \(C^2\) regularized Heaviside (proposed in [52]) which is nonzero only in the vicinity of the contour. \(C^\infty \) regularization restrains the algorithm from converging on local minima, but precludes traditional narrow band or sparse field algorithms because it requires the level set to update at all points on each time step.

GPU active contour methods parallelize the calculation of the energy forces described in the original snakes paper [20]. Traditional methods rely on simple intensity gradients and are prone to converging on local minima; however, [49] introduced a diffusion of the gradient vectors called gradient vector flow (GVF) to address this problem. [16] were one of the first GPU active contour implementations using GVF, and more recent optimizations in OpenCL exploit cached texture memory which has spatial locality in multiple dimensions [42]. The active contour can also be approximated by a surface mesh, such as in [39] who use Laplacian smoothing on local neighborhoods in conjunction with driving mesh vertices with gradient and intensity forces. However, these approaches still rely on the image gradient being a reliable indication of object boundaries, which is not the case in many real-world images [5].

Ever since the original snakes paper, active contours have gained popularity through being able to interactively edit the contour, or set up constraints to guide its motion [20]. Region-based active contour methods provide the option to initialize with a simple primitive shape, or sketch a starting region [7]. The more advanced approach by [27] introduces non-Euclidean radial-basis functions, which are weighted by the image features and blended to form an implicit function whose sign can be fixed at user-defined control points. The tool by [50] provides an interactive interface with geodesic active contours [3] and region competition [55]. Region competition favors a well-defined intensity range, whereas the geodesic approach is better suited for images with clear edges; by combining both approaches, [50] can segment a broad range of images, yet it requires significant tuning and can still fail in complex images with neither a well-defined intensity range nor clear edges.

There are several GPU approaches that produce segmentation without relying on initialization of a seed region [25]. Clustering methods join regions of a high-dimensional feature space [13], and superpixel approaches [35] form clusters that are deliberately over-segmented into more manageable regions. These approaches are good at simplifying complex images, yet they do not capture specific objects. In contrast, active shape and appearance methods fit a model to the data based on prior knowledge; however, this inherently makes assumptions of the overall shape of the objects and fails when these assumptions are not met.

## 3 Method

The LGDF model, originally proposed in [47], builds on existing active contour literature by introducing a new energy functional based on the local Gaussian distributions of image intensity. This functional drives a variational level set approach which is able to segment objects whose intensity mean and variance are inhomogeneous. Rather than creating segments whose intensity is as uniform as possible, this algorithm allows slow changes in intensity across an object, penalizing only sudden changes within it, without relying on a gradient based edge detector [5].

*H* is the \(C^{\infty }\) regularized Heaviside function, discretized to operate on a regular grid, first proposed by [5]:

$$H_{\epsilon }(z) = \frac{1}{2}\left( 1 + \frac{2}{\pi }\arctan \left( \frac{z}{\epsilon }\right) \right)$$

Due to the smooth form of the \(C^{\infty }\) regularized Heaviside (Eq. 7), \(\delta (\phi ) = H'(\phi )\) is nonzero everywhere. This allows \(\phi \) some freedom to change at any point in the image, not just in a narrow band around the contour. This helps prevent convergence on local energy minima [5].

### 3.1 GPU implementation

The goal of the implementation is to iteratively solve Eq. 8 for \(\phi (\mathbf {x}, t)\) and visualize the results at each iteration. This is done by discretizing \(\phi \) with respect to time and applying numerical integration: starting with \(\phi (\mathbf {x}, t=0)\) (which is specified by the user), an update loop computes \(\phi (\mathbf {x}, t+\varDelta t)\) by computing \(\frac{\partial \phi }{\partial t}\) according to Eq. 8 and assuming this quantity stays constant during the short time step \(\varDelta t\). Existing GPU level set methods implement their update rule inside a single kernel function; however, \(E^{\text {LGDF}}\) is more challenging as it relies on intermediate stages with neighborhood operations, such as convolutions and derivatives, whose sequential dependencies must be handled carefully to avoid race conditions.
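The update loop amounts to explicit (forward) Euler integration of \(\frac{\partial \phi }{\partial t}\). Below is a minimal sketch in Python (the real implementation runs as OpenCL kernels); `dphi_dt` is a hypothetical placeholder standing in for the full LGDF right-hand side of Eq. 8:

```python
def evolve(phi, dphi_dt, dt=0.1, steps=100):
    """Explicit (forward) Euler integration of a discretized level set.

    phi     -- list of floats: the level set sampled on a grid (1D for brevity)
    dphi_dt -- callable returning d(phi)/dt at every grid point; in the full
               method this would evaluate the LGDF right-hand side of Eq. 8
    """
    for _ in range(steps):
        rate = dphi_dt(phi)
        # assume the rate stays constant over the short time step dt
        phi = [p + dt * r for p, r in zip(phi, rate)]
    return phi
```

Each iteration reads the previous \(\phi \) and writes a new one, which is why the GPU version must double-buffer rather than update in place.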

Here *I* denotes the input image and *H* the smooth Heaviside function (Eq. 7). All variables of the form *GX* represent the *n*-dimensional Gaussian convolution of *X*. At each iteration we compute *GIH*, *GH*, and \(GI^2H\) (*GI* and \(GI^2\) are constant, so they are convolved only once). To compute the image force term \(e_1 - e_2\), we expand the brackets in Eq. 10, which lets us convolve *I* and \(I^2\) up front and sum the resulting terms. This results in just six convolutions altogether. Note that \(e_1\) and \(e_2\) are not computed separately; the variables \(E_0\), \(E_1\) and \(E_2\) are the three corresponding parts of \(e_1 - e_2\).

### 3.2 GPU architecture

The six required Gaussian convolutions require a large number of buffer reads. However, an *n*-dimensional Gaussian filter can be separated into the matrix product of *n* vectors allowing us to convolve with *n* 1D filters instead of one very large *n*-dimensional filter. This reduces \(l^2\) texture samples to 2*l* in 2D or \(l^3\) texture samples to 3*l* in 3D, for a truncated Gaussian kernel of length *l*. Therefore, our overall algorithmic complexity is \(O(n \cdot l)\) for an input of size *n*.
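The separable convolution strategy can be sketched as follows. This is an illustrative pure-Python version (the paper's version runs as OpenCL kernels over texture memory); the clamp-to-edge boundary handling and the assumption of a pre-normalized kernel are ours:

```python
def convolve_rows(img, kernel):
    """Convolve each row of a 2D image with a normalized 1D kernel,
    clamping reads at the image edge."""
    h, w = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xx = min(max(x + k - half, 0), w - 1)  # clamp-to-edge
                acc += weight * img[y][xx]
            out[y][x] = acc
    return out

def separable_blur(img, kernel):
    """Separable 2D filtering: a row pass, then the same pass on the
    transpose (equivalent to a column pass). Two 1D passes cost 2*l reads
    per pixel instead of l*l for the full 2D filter."""
    tmp = convolve_rows(img, kernel)
    t = [list(col) for col in zip(*tmp)]   # transpose
    t = convolve_rows(t, kernel)           # column pass
    return [list(col) for col in zip(*t)]  # transpose back
```

The sketch transposes between passes for clarity; as the next paragraph explains, the actual implementation avoids transpositions by exploiting texture memory's multidimensional spatial locality.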

The buffer reads for the horizontal Gaussian pass are coalesced, but for the vertical and depth passes the reads are not coalesced and therefore very slow. This could be alleviated by transposing the image between convolutions, making the buffer reads coalesced for vertical and depth passes. However, transposing the image three times per convolution is slow, even when this is optimized by using local/shared memory. In our architecture, we instead make use of texture memory, which preserves spatial locality among neighboring pixels in all three dimensions, making access time for all three passes comparable to coalesced buffer reads. This allows us to skip the transpositions altogether and convolve up to four images at once in the available texture memory channels, yielding faster overall performance than local/shared memory approaches.

We order the *X*, *Y*, and *Z* Gaussian passes accordingly, as shown in Fig. 3. This figure lists our kernels in the order they are called and shows their inputs and outputs (corresponding to the nodes in Fig. 2) within the available 4\(\times \)32-bit channels per GPU texture buffer. Besides the convolutions, the rest of our implementation is straightforward; we store the 1D convolution filter weights in constant memory and all intermediate values reside in registers.

The three Gaussian convolutions of the image and Heaviside (*GIH*, *GH*, \(GI^2H\), Fig. 2) are the result of neighborhood operations, but are not dependent on each other. This is also the case with the three Gaussian convolutions \(GE_0\), \(GE_1\), \(GE_2\). We therefore create kernels shown in Fig. 3 to perform each set of three Gaussian convolutions simultaneously, and two more kernels to prepare for them (called ‘Prep Conv 1’ to compute *H*, *IH*, \(I^2H\), and ‘Prep Conv 2’ to compute \(E_0\), \(E_1\), \(E_2\)). The curvature field \(\kappa \) (Eq. 9) requires all three (two in 2D) gradient components to be first stored in texture memory in order to avoid race conditions, since all differential operations are computed by central finite differences, a neighborhood operation. This is why we compute \(\kappa \) early on and pass it through the Gaussian convolution kernels in the conveniently available *w* channel of the texture buffer; computing \(\kappa \) immediately before ‘Update \(\phi \)’ would require an extra texture buffer since there is only one unused channel at that point. After updating, we force the partial derivatives of \(\phi \) to be zero at their corresponding image boundaries (in the ‘Neumann/Copy’ kernel) to prevent numerical instability and copy the result back into buffer *A* for the next iteration.

### 3.3 Interactive brushes

There are many applications in the biosciences, computer vision, medical, and pattern recognition communities where guidance by human experts is required [7, 20, 27, 48, 50]. The current interactive GPU level set methods, such as [36], provide interfaces to (1) initialize \(\phi \) inside/outside the object, (2) dynamically adjust parameters, and in some cases (3) allow \(\phi \) to be edited (a union operator on new objects/regions, followed by rerunning of the algorithm); however, it is difficult to refine the evolution itself, for example to prevent contour leaking or to constrain it locally. The graph-cuts and radial-basis function approaches [15, 27] allow users to sketch lines or define control points which are tagged to both the desired object and the undesired regions, but we find the process difficult to refine when the segmented boundary lies somewhere between the input locations, where there may be no discernible image intensity features (see Fig. 4 top-left and the accompanying video).

To address these issues, we follow the strategies outlined in the survey [29] with similar functions to the modeling/graphics literature [12]; however, we closely integrate brush functions with our segmentation kernels with the goal of editing and constraining \(\phi \) during the iterative evolution process itself. Specifically, we provide functions to initialize, append, erase, and constrain (locally stop evolution of \(\phi \)) after each iteration of the update step (Eq. 8), and visualize the results after each iteration. Note that for simplicity we define our functions with circular (2D) or spherical (3D) regions, but there is nothing to prevent implementing more bespoke functions, such as surface pulling [12].
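A minimal sketch of how such brush compositions might operate on \(\phi \), assuming a circular 2D region and the convention that seed regions are positive; the `barrier` mask, the mode names, and the function signature are illustrative, not the paper's API:

```python
def apply_brush_2d(phi, cx, cy, r, mode, c=2.0, barrier=None):
    """Point-wise brush compose, a sketch of a 'Compose'-style pass.

    mode: 'seed'    -- set phi to +c inside the circular region
          'erase'   -- set phi to -c inside the region
          'barrier' -- mark the region in `barrier` so evolution skips it
    The constant c=2 follows the paper's empirical choice for the binary
    step initialization."""
    r2 = r * r
    for y in range(len(phi)):
        for x in range(len(phi[0])):
            if (x - cx) ** 2 + (y - cy) ** 2 <= r2:
                if mode == 'seed':
                    phi[y][x] = c
                elif mode == 'erase':
                    phi[y][x] = -c
                elif mode == 'barrier' and barrier is not None:
                    barrier[y][x] = True
    return phi
```

Because each voxel is tested independently, such a pass maps directly onto a GPU kernel launched between update iterations.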

The brushes act within a region of radius *r* and are implemented in the ‘Compose’ kernel (Fig. 3). We have deliberately arranged the read buffer *B* to link to \(\phi \) from the previous update iteration. To complete a brush action, we relaunch the ‘Compose’ kernel with the brush parameters followed by the ‘Neumann/Copy’ kernel between each update iteration. The initialization brush sets \(\phi \) to a binary step function with a small positive constant (we choose 2 empirically). Using the *B* buffer *z*-channel in combination with the rendered value of \(\phi \) stored in the *A* buffer *z*-channel, we can display the current brush size and position without committing the stroke.

In Fig. 4, we illustrate two simple use-cases of our interactive brushes. In the top row, the user paints using the ‘barrier’ brush to cover the full image region, shown in blue. This is followed by the ‘erase’ brush (Eq. 17), to cut a permissible region in which a new seed region is placed (Eq. 16), which evolves to segment the macular hole without leaking into the opening. (We show this in 3D in the accompanying video.) Similarly, in the lower row, the vessels are segmented without leaking into the heart (see also Table 5 2b–c).

### 3.4 Real-time rendering

To render the zero-crossing of the level set function \(\phi \) in 3D, we launch a render kernel after the Neumann/Copy step in the update loop (Fig. 3). We send a camera matrix to initialize each pixel with a ray origin \(\mathbf {o}\) and direction unit vector \(\hat{d}\). We parameterize the ray’s position by \(\mathbf {r} = \mathbf {o} + \hat{d} s\) and, assuming \(\phi \) to be the signed distance to the zero-crossing, advance the ray in steps by \(s_{i+1} = s_i + \phi (\mathbf {r})\). However, \(\phi \) is not a perfect signed distance function; strictly, we would need to divide the step size by the maximum derivative of \(\phi \), which is not known precisely. In practice, we obtain sufficiently small visual artifacts at good performance by scaling the step with a constant factor, \(\varDelta s = 0.3 \phi (\mathbf {r})\). Further, given that \(\phi \) is not defined outside of the image boundaries, we initially advance \(s_0\) to the start of the image’s axis-aligned bounding box (where \(s_0\) is calculated using an analytical ray–box intersection function [21]). To increase visual quality, we implement 3D ambient occlusion and soft shadows by marching the ray in the direction of the normal and light source once it has hit a surface [11].
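The ray-marching loop can be sketched as follows, with an analytic sphere SDF standing in for the sampled \(\phi \); the step scale of 0.3 follows the text, while the iteration cap, hit tolerance, and miss threshold are our own assumptions:

```python
import math

def march(origin, direction, sdf, step_scale=0.3, max_steps=256, eps=1e-3):
    """Sphere-trace a ray: s_{i+1} = s_i + step_scale * sdf(r).

    step_scale < 1 compensates for phi not being a true signed distance
    function, at the cost of more steps. Returns the distance s along the
    ray at the hit, or None on a miss."""
    s = 0.0
    for _ in range(max_steps):
        p = [o + di * s for o, di in zip(origin, direction)]
        dist = sdf(p)
        if dist < eps:
            return s
        s += step_scale * dist
        if s > 1e4:  # ray escaped the scene
            break
    return None

# toy scene: a unit sphere at the origin stands in for the level set
sphere = lambda p: math.sqrt(sum(c * c for c in p)) - 1.0
hit = march([0.0, 0.0, -3.0], [0.0, 0.0, 1.0], sphere)  # expect ~2.0
```

In the full renderer, `sdf` would be a trilinear sample of the \(\phi \) texture, and the loop would additionally clip the ray to the image's bounding box before marching.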

## 4 Results and validation

In this section, we provide quantitative results validating our algorithm’s performance, parameter insensitivity, and robustness to noise. We also provide qualitative results to justify the utility of our interactive brushes and assess the segmentation of real-world images from various domains.

Comparing the Jaccard index for our GPU implementation with the CPU implementation. Segmentation without interactive brushes, attained from a single circular seed region inside the object.

| Image | Jaccard index |
|---|---|
| Synthetic objects 2D | 1 |
| Tumor (small) 2D | 1 |
| Tumor (large) 2D | 0.981 |
| Macular hole 3D | 0.990 |
| Brain 3D | 0.984 |
| Tumor 3D | 0.993 |

These results show the GPU to be near-identical to the CPU implementation; we find small discrepancies at the boundary at sub-voxel precision caused by different implementations of low-level math library functions and different (mathematically equivalent) algebra in the intermediate steps (Eqs. 11 and 12).
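The Jaccard index used in this comparison is the ratio of intersection to union of the two binary segmentations; a minimal sketch:

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two binary masks.

    a, b -- flat iterables of booleans (or 0/1) marking segmented voxels."""
    inter = union = 0
    for x, y in zip(a, b):
        if x and y:
            inter += 1
        if x or y:
            union += 1
    return inter / union if union else 1.0
```

A value of 1 means the GPU and CPU masks agree on every voxel; 0.99 corresponds to roughly 1% of the union disagreeing, here confined to the boundary.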

### 4.1 Noise and parameter insensitivity

The results in Fig. 6 show that the method can segment severely noisy images, corrupted down to a PSNR of about \(10^{1.05}\), under a constant parameter assignment. While the results in Fig. 6 show the method is more robust to Gaussian noise than speckle noise, it is important to understand that this holds only for the parameters chosen; improvements can generally be made by adjusting the parameters for individual scenarios. In addition to Gaussian, salt and pepper, and speckle noise, we implemented a multi-frequency ‘cloud’ noise at a target PSNR, which simulates intensity inhomogeneity. In Fig. 6, segmentation under cloud noise appears to improve below a PSNR of \(10^{0.81}\); however, this is caused by the cloud-like objects inside the synthetic object being captured. In such cases, we can still segment the underlying object, but only through decreasing \(\sigma \) or using the interactive brushes.
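PSNR values above are quoted as powers of ten; assuming this denotes the raw ratio \(\text{peak}^2/\text{MSE}\) (whose base-10 logarithm, times 10, gives the usual decibel form), a sketch:

```python
def psnr_ratio(clean, noisy, peak=1.0):
    """Peak signal-to-noise ratio as a raw power ratio peak^2 / MSE.

    Assumption: the paper's 10^x notation refers to this ratio; take
    10 * log10 of the result for the conventional decibel form."""
    mse = sum((c - n) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return (peak * peak) / mse
```

Under this reading, a PSNR of \(10^{1.05}\) corresponds to about 10.5 dB, i.e., very heavy corruption.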

Our proposed parameters for controlling the method. All images in this paper are generated using these three parameters within their suggested range and constants \(\varDelta t=0.1\) and \(\mu =1.0\)

| Description | Symbol | Suggested range | Default |
|---|---|---|---|
| Capture range | \(\sigma \) | \([1.01, \; 10]\) | 3 |
| Smoothing weight | \(\nu \) | \([10, \; 90]\) | 50 |
| Shrink or grow | \(\lambda \) | \([-0.1, \; 0.1]\) | 0.05 |

We call \(\sigma \) a ‘capture range’ parameter as it describes the range from which a pixel’s energy may be affected by the contour (see Eqs. 2–4) and therefore determines the capture range. The parameter \(\nu \) penalizes the length of the contour (Eqs. 6 and 8); a larger \(\nu \) value results in a smoother contour which is less likely to burst through small gaps or capture small/sharp features. Traditionally many active contour methods have been designed to grow or shrink until they reach the object boundary and then stop; the parameter \(\lambda \) optionally enables this behavior by weighting the image terms \(e_1\) and \(e_2\) by \(\lambda _1\) and \(\lambda _2\), respectively (Eq. 8), biasing the contour toward shrinking or growing. By adjusting these parameters in real time, inexperienced users quickly learn to intuitively manipulate them in combination with our interactive brushes. In most cases, we set \(\lambda =0.05\) to prefer contour growth and adjust only \(\sigma \) and \(\nu \).

The following challenging scenarios are quickly and easily segmented with our interactive brushes

### 4.2 Segmenting real-world images

Segmentation results of multiple objects displayed in different colors. 1a shows a segmented image of HaCaT human cell culture cells using confocal microscopy, 1b shows the interdigitation of segmented layers of eisosome proteins from cryo-EM tomography data [19], 1c shows a malaria sporozoite [38]. Row 2 shows medical CT scans of the abdomen, body, and thorax [37]. 3a shows an MRI of a cerebral aneurysm and 3b an XA angiogram [37]. 3c shows the structure of the Sec13/31 COPII coat cage from cryo-EM data [44]. Row 4 shows the herpes simplex virus capsid [6], phi procapsid [40], and the mumps virus [9], all from cryo-EM data. Row 5 shows applications outside of biology and medicine: 5a is a CT scan of an engine block [2], 5b sintered alumina [38], and 5c shows a selection of objects from a CT scan of a backpack [2]

Many of the segmentations (Table 5 1a, 3a-b, and 5b-c) are not possible with the current GPU level set segmentation approaches, which use simple speed functions to attract and/or shrink the contour within a fixed intensity range [18, 23, 36]. For example, when painting an initial seed region inside a vessel network with intensity inhomogeneity, the active contour will not grow along the vessel. In contrast, the adopted LGDF energy model allows us to paint a simple initial sphere anywhere on the object which then spreads through the network of vessels. In cases where the contour evolution misses a vessel or oversegments part of the object, evolution is temporarily halted (\(\varDelta t=0\)), local amendments are made, and then evolution is resumed (\(\varDelta t=0.1\)). By making local adjustments with a high level of visual feedback, we can spot such issues and make amendments immediately.

### 4.3 Performance and memory usage

Figure 8 plots the running time per iteration for an image of *n* voxels and a truncated 1D Gaussian kernel of length *l*. The running time of the *z*-axis pass becomes more similar to the *y*- and *x*-axis passes with larger \(\sigma \). In the practical and suggested range of \(\sigma \in [1.01, \; 10]\) (Table 3), it can be seen that the running time increases in small steps (zoom to the lower-left of the graph). This is because running time is primarily influenced by the size of the 1D Gaussian filter buffer, whose length is \(\lfloor 4 \sigma + 1 \rfloor \) to approximate the Gaussian function with reasonable support.
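The filter construction implied by this size rule can be sketched as follows; the truncation rule \(\lfloor 4 \sigma + 1 \rfloor \) is from the text, while centering and normalization are our assumptions:

```python
import math

def gaussian_kernel(sigma):
    """Truncated, normalized 1D Gaussian filter of length floor(4*sigma + 1).

    Because the length only grows at integer thresholds of sigma, the
    running time rises in small plateaus rather than continuously."""
    length = int(math.floor(4.0 * sigma + 1.0))
    half = (length - 1) / 2.0
    w = [math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2)) for i in range(length)]
    total = sum(w)
    return [x / total for x in w]
```

For example, \(\sigma = 3\) yields a 13-tap filter, while the smallest suggested value \(\sigma = 1.01\) yields a 5-tap filter.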

We also investigated other optimizations, given that the Gaussian convolution is the primary bottleneck of our approach. We implemented Gaussian convolution in the Fourier domain using MATLAB GPU arrays. While Fourier convolution has a lower order of growth, the benefits are outweighed by its large constant factor: this takes 400 ms per frame using a GTX TITAN X, which is off the scale in Fig. 8.

The mean time of 100 iterations with our C++ OpenCL implementation is evaluated across different hardware and compared to our GPU Fourier implementation and the original MATLAB version on the CPU (which is vectorized and calls code written in C for the Gaussian convolution). These results are shown in Fig. 9.

In Fig. 9, our algorithm substantially outperforms the original implementation in all images. Given that we process the entire dataset with compact kernels and separable convolutions, we can fully utilize high-end GPU hardware to obtain a substantial speedup of up to three orders of magnitude from the original version, and 1–2 orders of magnitude from our GPU Fourier convolution version. This means that segmentations which previously took over an hour can now be achieved in a few seconds, without any trade in quality.

## 5 Discussion

The primary limitation of our implementation is that we require storing the full dataset at the original resolution in GPU texture memory, as the \(C^\infty \) Heaviside and Dirac functions are nonzero everywhere to reduce convergence on local minima [5]. This also limits the algorithm’s speed. In future work, we will investigate dynamically adjusting the resolution away from the zero-crossing of the \(C^\infty \) Heaviside, to reduce the memory requirements and improve performance, and evaluate the impact of this approach on segmentation quality.

While there are some excellent publicly available datasets for interactive segmentation of real-world 2D color images and videos [53], the problem of segmenting everyday objects in color photographs, e.g., with a graph cut approach on distributions of color information, is fundamentally different to segmenting a tissue or organ. In the latter case, the challenge is more often due to intensity inhomogeneity or poorly defined edges, rather than complex backgrounds or discontinuities within the object. As with [51], we would like to see benchmark 3D biological and medical datasets for evaluating interactive performance.

## 6 Conclusion

In conclusion, we have shown that sophisticated level set segmentation energy models, with sequential dependencies among intermediate processing steps, can be implemented efficiently on the GPU through careful structuring of the GPU kernels within the constraints of the GPU memory architecture. While active contours are used in unsupervised algorithms, they continue to benefit from interactive approaches that enable users to guide and constrain the contour to capture specific parts of more challenging objects. We have shown that the LGDF energy model proposed by [47] requires little parameter tuning, is robust against different types of noise, and can be generalized to a broad range of real-world 3D images from biology and medicine. Segmenting many of these images was not possible with existing GPU level set algorithms due to their simple energy functionals. We have greatly enhanced the LGDF model’s performance, making it practical in many more use-cases than before (including 3D images). We also extended its functionality through interactive brush functions that give direct influence over the dynamic contour evolution. In the future, we believe GPU adaptations of advanced segmentation algorithms will continue to proliferate, using similar design processes to ours.

## 7 Availability

We release our C++/OpenCL software and source code under the GNU General Public License Version 3, alongside an optional MATLAB wrapper. The implementation is cross-platform using GLFW with few dependencies, where binaries for Linux and Windows are also available: https://github.com/cwkx/IGAC

## Notes

### Acknowledgements

We are grateful to NVIDIA for providing a GTX TITAN X for this research. Table 5 1a shows fixed HaCaT human cell culture cells stained with SiR-Actin (Spirochrome) (red), rat anti-tubulin antibody/secondary anti-rat Alexa488 antibody (green), and DNA DAPI (blue). The cells were prepared and imaged with a Zeiss 880 Airyscan LSM confocal microscope by Miss Bethany Cole, Miss Joanne Robson and Dr Tim Hawkins of the Durham Centre for Bioimaging Technology, Department of Biosciences, Durham University.

## Supplementary material

Supplementary material 1 (mp4 85151 KB)

## References

- 1. Adalsteinsson, D., Sethian, J.A.: A fast level set method for propagating interfaces. J. Comput. Phys. **118**(2), 269–277 (1995)
- 2. Bartz, D.: Volvis datasets. http://www.volvis.org (2005). Accessed 2016 Mar 30
- 3. Caselles, V., Catté, F., Coll, T., Dibos, F.: A geometric model for active contours in image processing. Numer. Math. **66**(1), 1–31 (1993)
- 4. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. **40**(1), 120–145 (2011)
- 5. Chan, T.F., Vese, L.: Active contours without edges. IEEE Trans. Image Process. **10**(2), 266–277 (2001)
- 6. Chang, J.T., Schmid, M.F., Rixon, F.J., Chiu, W.: Electron cryotomography reveals the portal in the herpesvirus capsid. J. Virol. **81**(4), 2065–2068 (2007)
- 7. Chen, H.L.J., Samavati, F.F., Sousa, M.C., Mitchell, J.R.: Sketch-based volumetric seeded region growing. In: Proceedings of the Third Eurographics Conference on Sketch-Based Interfaces and Modeling, pp. 123–130 (2006)
- 8. Cocosco, C.A., Kollokian, V., Kwan, R.K.-S., Pike, G.B., Evans, A.C.: BrainWeb: online interface to a 3D MRI simulated brain database. NeuroImage **5**, 425 (1997)
- 9. Cox, R., Pickar, A., Qiu, S., Tsao, J., Rodenburg, C., Dokland, T., Elson, A., He, B., Luo, M.: Structural studies on the authentic mumps virus nucleocapsid showing uncoiling by the phosphoprotein. Proc. Natl. Acad. Sci. U.S.A. **111**(42), 15208–15213 (2014)
- 10. Eklund, A., Dufort, P., Forsberg, D., LaConte, S.M.: Medical image processing on the GPU: past, present and future. Med. Image Anal. **17**(8), 1073–1094 (2013)
- 11. Evans, A.: Fast approximations for global illumination on dynamic scenes. In: *ACM SIGGRAPH 2006 Courses*, pp. 153–171. ACM (2006)
- 12. Eyiyurekli, M., Breen, D.: Interactive free-form level-set surface-editing operators. Comput. Graph. **34**(5), 621–638 (2010)
- 13. Fulkerson, B., Soatto, S.: Really quick shift: image segmentation on a GPU. In: Trends and Topics in Computer Vision, pp. 350–358 (2010)
- 14. Gorelick, L., Schmidt, F.R., Boykov, Y.: Fast trust region for segmentation. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1714–1721 (2013)
- 15. Grady, L.: Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. **28**(11), 1768–1783 (2006)
- 16. He, Z., Kuester, F.: GPU-based active contour segmentation using gradient vector flow. In: International Conference on Advances in Visual Computing, pp. 191–201 (2006)
- 17. Jarrin, M., Young, L., Wu, W., Girkin, J.M., Quinlan, R.A.: Chapter twenty-one: in vivo, ex vivo, and in vitro approaches to study intermediate filaments in the eye lens. In: Omary, M.B., Liem, R.K.H. (eds.) Intermediate Filament Proteins, Methods in Enzymology, vol. 568, pp. 581–611. Academic Press (2016)
- 18. Jeong, W.K., Beyer, J., Hadwiger, M., Vazquez, A., Pfister, H., Whitaker, R.T.: Scalable and interactive segmentation and visualization of neural processes in EM datasets. IEEE Trans. Vis. Comput. Graph. **15**(6), 1505–1514 (2009)
- 19. Karotki, L., Huiskonen, J.T., Stefan, C.J., Ziółkowska, N.E., Roth, R., Surma, M.A., Krogan, N.J., Emr, S.D., Heuser, J., Grünewald, K., Walther, T.C.: Eisosome proteins assemble into a membrane scaffold. J. Cell Biol. **195**(5), 889–902 (2011)
- 20. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: active contour models. Int. J. Comput. Vis. **1**(4), 321–331 (1988)
- 21. Kay, T.L., Kajiya, J.T.: Ray tracing complex scenes. In: Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp. 269–278. ACM (1986)
- 22. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature **521**(7553), 436–444 (2015)
- 23. Lefohn, A.E., Kniss, J.M., Hansen, C.D., Whitaker, R.T.: A streaming narrow-band algorithm: interactive computation and visualization of level sets. IEEE Trans. Vis. Comput. Graph. **10**(4), 422–433 (2004)
- 24. Li, C., Xu, C., Gui, C., Fox, M.D.: Level set evolution without re-initialization: a new variational formulation. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 430–436 (2005)
- 25. Li, M., He, C., Zhan, Y.: Adaptive level-set evolution without initial contours for image segmentation. J. Electron. Imaging **20**(2), 023004 (2011)
- 26. Malladi, R., Sethian, J.A., Vemuri, B.C.: Shape modeling with front propagation: a level set approach. IEEE Trans. Pattern Anal. Mach. Intell. **17**(2), 158–175 (1995)
- 27. Mory, B.: Interactive Segmentation of 3D Medical Images with Implicit Surfaces. Ph.D. thesis, STI, Lausanne (2011)
- 28. Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. **42**(5), 577–685 (1989)
- 29. Olabarriaga, S.D., Smeulders, A.W.M.: Interaction in the segmentation of medical images: a survey. Med. Image Anal. **5**(2), 127–142 (2001)
- 30. Osher, S., Fedkiw, R.: Level Set Methods and Dynamic Implicit Surfaces. Applied Mathematical Sciences. Springer, New York (2002)
- 31. Osher, S., Sethian, J.A.: Fronts propagating with curvature-dependent speed: algorithms based on Hamilton–Jacobi formulations. J. Comput. Phys. **79**(1), 12–49 (1988)
- 32. Peng, D., Merriman, B., Osher, S., Zhao, H., Kang, M.: A PDE-based fast local level set method. J. Comput. Phys. **155**(2), 410–438 (1999)
- 33. Pock, T., Cremers, D., Bischof, H., Chambolle, A.: An algorithm for minimizing the Mumford–Shah functional. In: IEEE International Conference on Computer Vision, pp. 1133–1140 (2009)
- 34. Pratx, G., Xing, L.: GPU computing in medical physics: a review. Med. Phys. **38**, 2685 (2011)
- 35. Ren, C.Y., Reid, I.: gSLIC: a real-time implementation of SLIC superpixel segmentation. Technical report, University of Oxford, Department of Engineering (2011)
- 36. Roberts, M., Packer, J., Sousa, M.C., Mitchell, J.R.: A work-efficient GPU algorithm for level set segmentation. In: Proceedings of the Conference on High Performance Graphics, pp. 123–132. Eurographics Association (2010)
- 37. Rosset, A., Spadola, L., Ratib, O.: OsiriX: an open-source software for navigating in multidimensional DICOM images. J. Digit. Imaging **17**(3), 205–216 (2004)
- 38. Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B., Tinevez, J.-Y., White, D.J., Hartenstein, V., Eliceiri, K., Tomancak, P., Cardona, A.: Fiji: an open-source platform for biological-image analysis. Nat. Methods **9**(7), 676–682 (2012)
- 39. Schmid, J., Iglesias-Guitián, J., Gobbetti, E., Magnenat-Thalmann, N.: A GPU framework for parallel segmentation of volumetric images using discrete deformable models. Vis. Comput. **27**(2), 85–95 (2010)
- 40. Sen, A., Heymann, J.B., Cheng, N., Qiao, J., Mindich, L., Steven, A.C.: Initial location of the RNA-dependent RNA polymerase in the bacteriophage Phi6 procapsid determined by cryo-electron microscopy. J. Biol. Chem. **283**(18), 12227–12231 (2008)
- 41. Shi, L., Liu, W., Zhang, H., Xie, Y., Wang, D.: A survey of GPU-based medical image computing techniques. Quant. Imaging Med. Surg. **2**(3), 2223–2292 (2012)
- 42. Smistad, E., Elster, A.C., Lindseth, F.: Real-time gradient vector flow on GPUs using OpenCL. J. Real Time Image Process. **10**(1), 67–74 (2012)
- 43. Smistad, E., Falch, T.L., Bozorgi, M., Elster, A.C., Lindseth, F.: Medical image segmentation on GPUs: a comprehensive review. Med. Image Anal. **20**(1), 1–18 (2015)
- 44. Stagg, S.M., Gürkan, C., Fowler, D.M., LaPointe, P., Foss, T.R., Potter, C.S., Carragher, B., Balch, W.E.: Structure of the Sec13/31 COPII coat cage. Nature **439**(7073), 234–238 (2006)
- 45. Steel, D.H.W., Lotery, A.J.: Idiopathic vitreomacular traction and macular hole: a comprehensive review of pathophysiology, diagnosis, and treatment. Eye **27**(1), 1–21 (2013)
- 46. Vineet, V., Narayanan, P.J.: CUDA cuts: fast graph cuts on the GPU. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)
- 47. Wang, L., He, L., Mishra, A., Li, C.: Active contours driven by local Gaussian distribution fitting energy. Signal Process. **89**(12), 2435–2447 (2009)
- 48. Whitaker, R., Breen, D., Museth, K., Soni, N.: Segmentation of Biological Volume Datasets Using a Level-Set Framework, pp. 249–263. Springer, Vienna (2001)
- 49. Xu, C., Prince, J.L.: Snakes, shapes, and gradient vector flow. IEEE Trans. Image Process. **7**(3), 359–369 (1998)
- 50. Yushkevich, P.A., Piven, J., Hazlett, H.C., Smith, R.G., Ho, S., Gee, J.C., Gerig, G.: User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage **31**(3), 1116–1128 (2006)
- 51. Zhao, F., Xie, X.: An overview of interactive medical image segmentation. Ann. BMVA **2013**(7), 1–22 (2013)
- 52. Zhao, H.-K., Chan, T., Merriman, B., Osher, S.: A variational level set approach to multiphase motion. J. Comput. Phys. **127**(1), 179–195 (1996)
- 53. Zhu, H., Meng, F., Cai, J., Shijian, L.: Beyond pixels: a comprehensive survey from bottom-up to semantic image segmentation and cosegmentation. J. Vis. Commun. Image Represent. **34**, 12–27 (2016)
- 54. Zhu, L., Karasev, P., Kolesov, I., Sandhu, R., Tannenbaum, A.: Interactive image segmentation from a feedback control perspective. *ArXiv e-prints* (2016)
- 55. Zhu, S.C., Yuille, A.: Region competition: unifying snakes, region growing, and Bayes/MDL for multiband image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. **18**(9), 884–900 (1996)

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.