1 Introduction

Diseases of the female reproductive system can cause pain and discomfort, hormonal dysfunction, infertility and even death; cancers of this system represent around 16% of all cancers diagnosed in women worldwide and affect more than 1.85 million women annually [10]. Due to the difficulty of distinguishing between benign and malignant tumors and to high interobserver variability, malignant ovarian tumors have a 68% fatality rate in the European Union [10]. A better characterization of the ovarian structures can play an important role in the early detection of pathologies (e.g., ovarian cysts, polycystic ovarian syndrome or even ovarian cancer), and can also help monitor follicle growth and distribution, important features for assisted reproductive treatments.

Brightness mode (B-mode) ultrasound imaging is commonly used in gynecological clinical practice because it allows visualization of the ovary and its structures. In B-mode images, follicles appear as hypo-echogenic elliptical structures, while the ovarian stroma exhibits a slight variation in texture relative to the surrounding tissue and has partially hyper-echogenic boundaries. Figure 1 shows an example of a gynecological B-mode image containing an ovary with three follicles, together with their manual segmentation.

Image segmentation methods are often used to automatically extract objects from images, reducing both analysis time and diagnostic errors. However, ultrasound image segmentation is challenging due to the presence of several image artifacts and noise [6]. According to the latest review on follicle detection [7], existing methods for segmenting ovarian structures can only detect and measure large follicles. To the best of our knowledge, segmentation of the stroma has not received much attention, having been used only to reduce the search space for follicle detection [1].

Neural network techniques have been achieving impressive results in visual recognition systems. Among them, fully convolutional neural networks (fCNNs) are especially good at learning image features from training data and have proved to be a powerful tool for the segmentation of biomedical images [8]. The research presented herein explores the use of fCNNs for segmenting the ovarian structures, namely stroma and follicles, in a single process.

Fig. 1. B-mode image of an ovary and three follicles, and their segmentation.

2 Methodology

This section presents the methods implemented in this work to segment the ovarian structures in B-mode images. The following subsections detail the proposed system, its fCNN architecture and the loss functions used.

2.1 Architecture

An overview of the proposed system is shown in Fig. 2. Switches \(S_{1}\) and \(S_{2}\) can be triggered to change the input data of the network and the tasks to be trained, respectively. These changes can work as regularization of the fCNN.

Fig. 2. Overview of the proposed system with multiple tasks.

When switch \(S_{1}\) is turned on, the B-mode image is preprocessed with contrast limited adaptive histogram equalization (CLAHE) [11] to enhance local contrast and improve the visibility of the ovarian structures. Both the CLAHE-enhanced and the original images can be used as input data, as represented on the left side of Fig. 2.
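A minimal preprocessing sketch using OpenCV's CLAHE implementation; the clip limit, tile size and file path are illustrative assumptions, as the paper does not report them:

```python
import cv2

# Hypothetical parameters: the paper does not state the CLAHE settings used.
gray_image = cv2.imread("ovary_bmode.png", cv2.IMREAD_GRAYSCALE)  # 8-bit B-mode frame
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray_image)  # locally contrast-equalized image fed to the fCNN
```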

For training, switch \(S_{2}\) can be used to activate multi-task learning, which consists of using the same network to solve multiple tasks simultaneously. In this work, a mask of the ovary is used as the ground truth of an auxiliary task in order to prevent the network from classifying elements outside the ovary as follicles, or pixels inside the ovary as background. The auxiliary task acts as a regularizer during training [9] and can help the network focus its attention on difficult cases [3].

The fCNN architecture used in this work (Fig. 3) is based on U-net [8]. It consists of a downsampling stream (left side) followed by a symmetric upsampling stream (right side), with data from the downsampling stream skip-connected to the corresponding layers in the upsampling stream. Each convolutional layer is followed by a batch normalization layer and a ReLU activation layer; in addition, a dropout layer is inserted wherever pooling or concatenation operations are performed. The last layer is a \(1\times 1\) convolution followed by a softmax, which produces a pixel-wise discrete probability distribution over the three classes of interest (follicle, ovarian stroma or background).

Fig. 3. Architecture of the implemented fCNN.
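A sketch of the described architecture in the current tf.keras API (the original used Keras 1.2.2); the depth, filter widths and dropout rate are assumptions, since Fig. 3 fixes them but they are not reproduced here, and the auxiliary ovary head of Fig. 2 would branch off analogously:

```python
from tensorflow.keras import layers, Model, Input

def conv_block(x, filters):
    # two (Conv -> BatchNorm -> ReLU) stages, as described above
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation('relu')(x)
    return x

inputs = Input((512, 512, 1))
# downsampling stream, storing skip connections (widths are assumptions)
skips, x = [], inputs
for f in (32, 64, 128):
    x = conv_block(x, f)
    skips.append(x)
    x = layers.MaxPooling2D(2)(layers.Dropout(0.5)(x))  # dropout at pooling
x = conv_block(x, 256)
# symmetric upsampling stream
for f, skip in zip((128, 64, 32), reversed(skips)):
    x = layers.UpSampling2D(2)(x)
    x = layers.Dropout(0.5)(layers.Concatenate()([x, skip]))  # dropout at concatenation
    x = conv_block(x, f)
# 1x1 convolution + softmax over (background, follicle, stroma)
outputs = layers.Conv2D(3, 1, activation='softmax')(x)
model = Model(inputs, outputs)
```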

2.2 Loss Function

The proposed loss function can be decomposed into a main-task and an auxiliary-task component. In addition, weight maps can be applied as regularization, in order to penalize wrong classifications. The details of each step are explained below.

The average Dice Similarity Coefficient (DSC) of each class, as proposed in [5], is the main component of the loss function. The average DSC can be defined as \(\overline{ DSC }(Y,\hat{Y}) = 0.5 [ DSC (Y_f,\hat{Y}_f) + DSC (Y_s,\hat{Y}_s)]\), where \(Y\) represents the predictions and \(\hat{Y}\) the ground truth (GT); the subscripts \(f\) and \(s\) denote the follicles and the stroma, respectively. The background was not considered in the loss function because it is the largest region in the image and would otherwise heavily influence the results.
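A soft (differentiable) version of \(\overline{ DSC }\) written against the current tf.keras backend; the channel ordering (background, follicle, stroma) and the smoothing constant are assumptions:

```python
import tensorflow as tf

def soft_dice(y_true, y_pred, eps=1e-6):
    # soft Dice coefficient for a single class channel
    inter = tf.reduce_sum(y_true * y_pred)
    return (2.0 * inter + eps) / (tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + eps)

def mean_dsc(y_true, y_pred):
    # average of the follicle (channel 1) and stroma (channel 2) Dice;
    # the background channel is deliberately excluded, as in the text
    return 0.5 * (soft_dice(y_true[..., 1], y_pred[..., 1]) +
                  soft_dice(y_true[..., 2], y_pred[..., 2]))
```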

In addition, two weight maps were computed to be applied together with the DSC. The first (\(W_f\)) penalizes wrong classifications between nearby follicles, as in the weight map \(w(x)\) defined for U-net [8]: the value of \(W_f(i)\) is calculated from the distances between the ith pixel and the borders of the two nearest follicles. The second (\(W_o\)) penalizes false detections of ovarian structures in the background region, and is defined for each pixel as:

$$\begin{aligned} W_{o}(i) = {\left\{ \begin{array}{ll} 0 & \text {if } i \text { is inside the ovary} \\ 1 & \text {if } \varDelta _{o}(i) > \ln (10)\, \sigma ^2 \\ 0.1 \cdot \exp \left( \frac{\varDelta _{o}(i)}{\sigma ^2}\right) & \text {otherwise,} \end{array}\right. } \end{aligned}$$
(1)

where \(\varDelta _{o}(i)\) is the distance from pixel i to the nearest pixel of the ovary, and \(\sigma \) is a constant that controls the distribution of the weights around the ovary.
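Eq. (1) can be realized with a Euclidean distance transform; a minimal sketch, assuming a binary ovary mask with value 1 inside the ovary:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def ovary_weight_map(ovary_mask, sigma):
    # distance from every pixel to the nearest ovary pixel (0 inside the ovary)
    delta = distance_transform_edt(ovary_mask == 0)
    w = 0.1 * np.exp(delta / sigma ** 2)
    w[delta > np.log(10) * sigma ** 2] = 1.0  # saturate far from the ovary
    w[ovary_mask > 0] = 0.0                   # no penalty inside the ovary
    return w
```

Note that the branches of Eq. (1) meet continuously: at \(\varDelta _{o}(i) = \ln (10)\, \sigma ^2\), the exponential term equals exactly 1.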

Then, the loss function of the main task is computed as:

$$\begin{aligned} \mathcal {L}_{1}(Y,\hat{Y}) = 1 - \lambda _1\overline{ DSC }(Y,\hat{Y}) + \lambda _2 \frac{\sum _i Y_f(i) W_f(i)}{\sum _i Y_f(i)} + \lambda _3 \frac{\sum _i Y_s(i) W_o(i)}{\sum _i Y_s(i)}, \end{aligned}$$
(2)

where \(\lambda _{1,2,3} \in \mathbb {R}^+\) are constants used to adjust the influence of each weight map.
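A sketch of Eq. (2) as a tf.keras loss function, reusing mean_dsc from above. Feeding the per-image weight maps into the loss by closure is one possible wiring, not necessarily the authors'; the \(\lambda \) defaults are placeholders (the paper only states \(\lambda _2 = \lambda _3 \in [0,1]\)):

```python
def main_task_loss(w_f, w_o, lam1=1.0, lam2=0.5, lam3=0.5):
    # w_f, w_o: weight-map tensors aligned with the batch
    def loss(y_true, y_pred):
        y_f, y_s = y_pred[..., 1], y_pred[..., 2]
        pen_f = tf.reduce_sum(y_f * w_f) / (tf.reduce_sum(y_f) + 1e-6)
        pen_o = tf.reduce_sum(y_s * w_o) / (tf.reduce_sum(y_s) + 1e-6)
        return 1.0 - lam1 * mean_dsc(y_true, y_pred) + lam2 * pen_f + lam3 * pen_o
    return loss
```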

The loss function of the auxiliary task is given by:

$$\begin{aligned} \mathcal {L}_{2}(Y_o,\hat{Y}_o) = 1 - DSC (Y_o,\hat{Y}_o), \end{aligned}$$
(3)

where \(\hat{Y}_o\) is the GT mask of the ovary and \(Y_o\) is the predicted ovary.

Finally, the total loss function is defined as:

$$\begin{aligned} \mathcal {L}(Y,\hat{Y}) = \alpha _1 \mathcal {L}_{1}(Y,\hat{Y}) + \alpha _2 \mathcal {L}_{2}(Y_o,\hat{Y}_o), \end{aligned}$$
(4)

where \(\alpha _{1,2} \in \mathbb {R}^+\), with \(\alpha _1 + \alpha _2 = 1\), are constants used to adjust the influence of each component.
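Eq. (3) and the weighting of Eq. (4) map naturally onto Keras loss dictionaries for a hypothetical two-headed model (a "main" softmax output and an "aux" ovary-mask output; the names are ours). A sketch reusing soft_dice and main_task_loss from above:

```python
def aux_task_loss(y_true, y_pred):
    # Eq. (3): one minus the soft Dice of the predicted ovary mask
    return 1.0 - soft_dice(y_true, y_pred)

# Eq. (4): alpha_1 = 0.75 and alpha_2 = 0.25 when multi-task learning is on;
# with S2 "off", alpha_1 = 1 and alpha_2 = 0 (see Sect. 2.3).
losses = {'main': main_task_loss(w_f, w_o), 'aux': aux_task_loss}
weights = {'main': 0.75, 'aux': 0.25}
```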

2.3 Implementation Steps

The proposed system was evaluated using six variations. The input data of the fCNN (Fig. 2) is changed by switch \(S_{1}\), and switch \(S_{2}\) controls the multi-task learning. When \(S_{2}\) is “off”, Eq. (4) is evaluated with \(\alpha_1 = 1\) and \(\alpha_2 = 0\); otherwise, \(\alpha_1 = 0.75\) and \(\alpha_2 = 0.25\). Finally, the values of \(\lambda_2\) and \(\lambda_3\) in Eq. (2) determine whether the weight maps are added to the loss function. For all experiments, \(\lambda_2 = \lambda_3 \in [0,1]\) and \(\lambda_1 = 1\). The values of \(\alpha\), \(\lambda\) and \(\sigma\) were defined empirically and kept fixed during each training run.

All original B-mode images were converted to gray-scale and cropped to \(512 \times 512\) pixels. Aside from the batch normalization layers, no regularization or normalization to zero mean and unit variance was applied to the input images. To enlarge the training set, data augmentation with random linear transformations (rotation, translation, flipping and zooming) was applied at each training iteration. Each iteration was performed with a batch of 4 images.
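A possible augmentation setup with tf.keras's ImageDataGenerator; the transformation ranges, the variable names x_train/y_train (preprocessed images and GT masks) and the single-mask target are assumptions, with the auxiliary-task targets omitted for brevity:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# The paper names only the transform types; these ranges are illustrative.
aug = dict(rotation_range=15, width_shift_range=0.1, height_shift_range=0.1,
           zoom_range=0.1, horizontal_flip=True, fill_mode='constant', cval=0.0)
img_gen, msk_gen = ImageDataGenerator(**aug), ImageDataGenerator(**aug)

# A shared seed keeps every augmented image aligned with its augmented mask.
seed = 1
train_batches = zip(img_gen.flow(x_train, batch_size=4, seed=seed),
                    msk_gen.flow(y_train, batch_size=4, seed=seed))
```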

The network was trained using the Adam (Adaptive Moment Estimation) optimizer [4] with an initial learning rate of \(10^{-2}\). In this state-of-the-art stochastic optimization method, each weight of the network has its own learning rate, and the learning rates are adapted during training. To reduce the probability of overfitting, an early-stopping callback halts the training if the validation loss improves by less than \(10^{-3}\) over 50 epochs. This work was implemented in Python 2.7 using the Keras 1.2.2 framework with TensorFlow 1.0.0 as the backend.
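A training-setup sketch in the current tf.keras API rather than Keras 1.2.2, reusing the losses, weights and train_batches defined above; x_val/y_val and the maximum epoch count are assumptions (the paper relies on early stopping):

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=1e-2), loss=losses, loss_weights=weights)

# Stop if the validation loss improves by less than 1e-3 over 50 epochs.
stop = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=50)
model.fit(train_batches, steps_per_epoch=len(x_train) // 4, epochs=1000,
          validation_data=(x_val, y_val), callbacks=[stop])
```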

3 Results

This section presents the dataset, the evaluation methodologies and the obtained results.

3.1 Dataset

The dataset consists of 87 B-mode images. Each image contains one ovary of a woman of childbearing age, with at least one follicle. The images were acquired with an Ultrasonix SonixTouch Q+ scanner. During acquisition, the medical doctor performed semi-automatic segmentations using the Ultrasonix Auto Follicle segmentation (AF) tool [2]. It must be noted that not all follicles were segmented by the doctor, leading to, for instance, an ovary with 4 follicles but only one semi-automatic segmentation. Subsequently, a medical expert manually segmented each ovary and each follicle to produce the GT. The images were randomly divided into 57 for training, 15 for validation and 15 for testing.

3.2 Evaluation

The quantitative evaluation of the results was divided into two validation methodologies. First, the DSC between the GT and the segmentations predicted by the different trained networks is presented. Second, a single follicle evaluation (SFE) was performed and compared with the AF segmentation mentioned in Sect. 3.1.

The motivation for the SFE lies in the fact that a GT mask may contain more segmented follicles than were annotated by the doctor with the AF tool during acquisition. For example, while the GT of the test set has 44 manually segmented follicles, only 25 follicles were annotated with the AF tool. The SFE verifies whether a follicle segmented by the AF has a corresponding follicle in the GT and in the fCNN segmentations. Then, for each follicle present in the AF data, the DSC of GT vs. AF and of GT vs. fCNN is computed.
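A sketch of the per-follicle matching behind the SFE; matching a reference follicle to the connected component with the largest overlap is our assumption, as the paper does not specify the correspondence criterion:

```python
import numpy as np
from scipy.ndimage import label

def single_follicle_dsc(ref_follicle, seg_mask):
    # ref_follicle: binary mask of one follicle; seg_mask: binary follicle
    # segmentation (GT, AF or fCNN). Returns the DSC of the matched pair.
    labeled, n = label(seg_mask)
    best, best_overlap = None, 0
    for k in range(1, n + 1):
        comp = (labeled == k)
        overlap = np.logical_and(comp, ref_follicle).sum()
        if overlap > best_overlap:
            best, best_overlap = comp, overlap
    if best is None:
        return 0.0  # no corresponding follicle found
    inter = np.logical_and(best, ref_follicle).sum()
    return 2.0 * inter / (best.sum() + ref_follicle.sum())
```

Under this assumption, GT vs. AF is obtained by matching each AF follicle against the GT mask, and GT vs. fCNN by matching the corresponding GT follicle against the fCNN output.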

Figure 4 illustrates two SFE scenarios. Figure 4(a) shows a successful case with a large overlap, while Fig. 4(b) shows an incorrect segmentation: the fCNN and the AF segmented a single large follicle that merged the two existing follicles into one, leading to an inaccurate detection.

Fig. 4. Illustration of SFE for GT (green) vs AF/fCNN (red); yellow represents the agreement with GT. The correspondences between the GT and the predictions are represented by the arrows. (a) Successful case where a single follicle in GT corresponds to a single follicle segmented by AF and fCNN; (b) an incorrect segmentation, since two follicles were merged. (Color figure online)

3.3 Results

The overall DSC results for the six trained networks are shown in Table 1. fCNNs #1 and #4 show the best overall DSC for the follicles and the stroma. Figure 5 shows four examples of the segmentations produced by the developed fCNNs. The highest DSC achieved for the follicles was 0.955, with fCNN #1 (Fig. 5(a)), and for the stroma 0.855, with fCNN #3 (Fig. 5(b)). A standard case and the image with the worst segmentations are presented in Fig. 5(c) and (d), respectively.

Table 1. Overview of the DSC for the segmentations predicted by the trained fCNNs.
Fig. 5. Examples of segmentation results: (a) the best follicle DSC, (b) the best stroma DSC, (c) a standard case, (d) the worst image.

In a qualitative analysis, the application of multi-task learning prevented follicles from being classified outside the ovary. This approach achieved fast convergence during training and the smallest validation errors. However, for three test images with low contrast (e.g. Fig. 5(d)), the ovarian structures were poorly detected or not detected at all, which impaired the overall results. The application of CLAHE improved the results, and the use of the weight map \(W_o\) solved the problem of false-positive ovaries. However, the weight map \(W_f\) did not significantly reduce the misclassification of pixels between closely spaced follicles; in addition, it caused the outer boundary of the follicles to be misclassified as background.

The results of the SFE are presented in Table 2. All architectures except fCNN #2 outperformed the AF. The best overall SFE results were obtained with the simplest architecture. Although CLAHE improved the contrast in boundary regions, the SFE did not improve when CLAHE was used.

Table 2. Overview of DSC for the single follicle evaluation (SFE).

4 Conclusions

In this paper, the first supervised fCNN for the end-to-end segmentation of the stroma and follicles of ovaries in B-mode images was presented. Despite being trained with a small dataset, the developed method does not depend on heavy preprocessing or post-processing strategies. The visual results show that an fCNN can learn features that distinguish the ovarian structures in B-mode images. This capability could allow a better characterization of the often-overlooked stroma region. The proposed method also proved to be more accurate than a commercial semi-automatic method for follicle segmentation.

Despite presenting slightly better results on the validation set, the proposed regularization techniques show worse overall DSC results on the test set when compared with the simplest fCNNs (#1 and #4). This may have been caused by the increased complexity of the segmentation task and by the regularization terms overwhelming the information in the data. An improvement of the proposed regularizations should be investigated to yield better results.

In future steps of this investigation, the proposed fCNN will be extended to a deeper architecture, increasing the number of learnable features. Due to the scarcity of data, k-fold cross-validation should be applied to better evaluate the consistency of each architecture. A more efficient loss function will also be elaborated to force the network to learn the boundaries of the follicles. Finally, enlarging the dataset is fundamental to improve the variability of the training set.