Introduction

Liver cancer is one of the most lethal cancers in the world. It was the third leading cause of cancer mortality (approximately 830,000 deaths) in 2020 [1]. Computed Tomography (CT) is the most frequently used imaging technique for identifying hepatic cancer. Various Computer Aided Diagnosis (CADx) solutions have been investigated to aid radiologists in decision-making and increase diagnosis efficiency. Liver segmentation is the first and most critical stage of a CADx system and is therefore decisive in determining the success of a diagnosis. However, liver delineation is difficult due to: (i) ambiguous boundaries with adjacent structures, (ii) large shape variability, (iii) the presence of organs with similar intensity in the vicinity, (iv) intensity variations and noise in the liver due to image acquisition and injection protocols [2] and (v) the division of the liver into right and left lobes.

Liver segmentation has been the subject of extensive research for over two decades. Earlier studies explored traditional segmentation methods, primarily level set, Fuzzy C-Means (FCM) and region growing. Xu et al. [3] presented a semiautomatic approach in which region growing was utilized for initial liver delineation and the level set method for final refinement. Wang et al. [4] suggested a shape–intensity prior level set method using probabilistic atlas and probability map constraints. Eapen et al. [5] delineated the liver using a Bayesian probabilistic level set framework. Various swarm optimization techniques were explored in [6,7,8]. Most of these methods required user intervention and were not very robust.

In recent years, the application of Deep Learning (DL) approaches for segmentation has risen rapidly. Liu et al. [9] delineated the liver using UNet and dense feature selection. Jeong et al. [10] incorporated a long short-term memory network and an attention mechanism into UNet. Sun et al. [11] proposed a UNet-based architecture that addressed the pitfalls of skip connections and incorporated a self-attention mechanism for liver segmentation. In [12], a 3D version of UNet was developed by incorporating residual connections. Chung et al. [13] presented a Convolutional Neural Network (CNN) combining auto-context and self-supervised sparse contour attention mechanisms. Ahmad et al. [14] employed a deep belief network for initial liver delineation and the Chan-Vese active contour method for final refinement. Senthilvelan et al. [15] developed a cascaded CNN model consisting of V-Net for initial liver segmentation and H-DenseUNet for final refinement. Araújo et al. [16] cascaded multiple UNets for segmenting simple and complex cases; however, the computational cost was high. Fan et al. [17] presented a variant of UNet in which the skip connections were modified to extract better features; they also introduced special modules to fuse high- and low-level features and to capture multiscale details. Xie et al. [18] combined dynamic adaptive pooling, residual modules and UNet to segment the liver from CT data. Ahmad et al. [19] developed an efficient CNN, initialized randomly with Gaussian weights, for liver segmentation. Wei et al. [20] integrated a generative adversarial network into a mask region-based CNN to enhance liver segmentation results. Wang et al. [21] combined EfficientNetB4, attention gates and residual learning for liver delineation. Wu et al. [22] presented a UNet-based DL model that included pyramidal convolution and attention mechanisms. These works were automatic but focused only on extracting the liver from the Portal Venous (PV) phase.

In clinical practice, plain and contrast-enhanced CT images, consisting of arterial, PV and delayed phase images, are generally analyzed for tumor identification. Radiologists diagnose tumors by observing the enhancement patterns (generated by the contrast agent) in and around them. The majority of research on liver segmentation is centered on segmenting the liver solely from the PV phase; very few authors have worked on multiple CT phases. For instance, Xu et al. [23] employed a network derived from UNet to segment the liver from triphasic CT data. The approach of Rusko et al. [24] was based on a region-growing algorithm; they incorporated various pre- and post-processing operations using anatomical and multiphase information to reduce over- and under-segmentation of the liver. These studies required image registration, which is very time-consuming.

A liver segmentation method feasible for a CADx system must be automatic, accurate, robust and computationally efficient; being effective for multiple CT phases would further add value. This paper aims to deliver such a method. We have developed a DL model from SegNet using two key components: an Atrous Spatial Pyramid Pooling (ASPP) module and leaky Rectified Linear Unit (ReLU) layers. The ASPP module captured multiscale features without reducing the feature map resolution, and the leaky ReLU layers improved the model's generalizability. Performance evaluation on a public dataset with challenging cases and on our institutional dataset (consisting of multiphase CT volumes) yielded satisfactory results. Ablation studies justified the significance of the model components, and comparison with state-of-the-art techniques indicated that our model was superior.

The rest of the paper is structured as follows: Section "Materials and Methods" elaborates on the datasets employed and the proposed method; Section "Experimental Results" presents the quantitative and qualitative results; Section "Discussion" discusses the results obtained; finally, Section "Conclusion" outlines the contributions and presents future work.

Materials and Methods

Dataset Details

The training/validation and test datasets were prepared primarily from different databases. Hence, we present the details of the datasets in two separate subsections.

Training and Validation We collected 4994 diverse CT images from three databases (two public and one internal), namely 3D-IRCADb [25], LiTS [26] and Kasturba Medical College (KMC), Manipal. They comprised livers of different shapes and intensity distributions, with and without tumors, as well as abdominal images of the lung, heart, intestine, etc. that did not contain the liver. This data was split into training and validation sets such that over 78% of the total CT images were used for training and the remainder for validation; thus, the training and validation datasets comprised 3930 and 1064 CT images, respectively. The split was done manually so that difficult cases were distributed equally between the two sets. We ensured that both datasets had similar diversity and that neither was dominated by any single type of image. This was important because only the training images are used by the model to learn features; if the training set lacked images of any of the types mentioned above, the model would be unable to learn the features required to identify the liver in challenging images. The validation accuracies for two split ratios, 78:22 and 90:10, were compared; since the former gave slightly better results, it was adopted in this work.

The number of cases and CT images considered from each database, along with other relevant attributes, are detailed in Table 1. The images in the public datasets had a fixed resolution of 512 × 512, whereas the images in the KMC, Manipal dataset had differing resolutions. The 3D-IRCADb and KMC, Manipal datasets are in Digital Imaging and Communications in Medicine (DICOM) format, whereas the LiTS dataset is in Neuroimaging Informatics Technology Initiative (NIfTI) format. Only PV images were considered from the two public databases, whereas multiphase images were taken from the KMC, Manipal database. Since the training images were selected from different databases, reasonable diversity was introduced into the training data.
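As an illustration, both source formats can be read in MATLAB (the implementation environment, see "Experimental Results"); the file paths below are hypothetical placeholders, not the actual dataset layout.

```matlab
% Minimal sketch of loading one slice from each source format; the
% file paths are hypothetical placeholders.
dcmFile = 'IRCADb/patient01/image_45.dcm';   % DICOM slice (3D-IRCADb, KMC)
niiFile = 'LiTS/volume-0.nii';               % NIfTI volume (LiTS)

dcmSlice = dicomread(dcmFile);   % 512 x 512 stored pixel values
dcmInfo  = dicominfo(dcmFile);   % header (rescale slope/intercept, etc.)

niiVol   = niftiread(niiFile);   % full H x W x D volume
niiSlice = niiVol(:, :, 45);     % extract one axial slice
```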

Table 1 Details of training and validation sets

Test set Two datasets were used to test the proposed model: an internal institutional dataset (KMC, Manipal) and CHAOS, a public dataset [27]. The KMC, Manipal dataset consists of ten CT volumes, of which six have all four phases (plain, arterial, PV and delayed) and the remaining four have three phases (plain, arterial, PV), as can be seen in Table 4. The dataset has cases with different abnormalities, viz. metastases, cysts and hepatocellular carcinoma.

The CHAOS dataset comprised twenty CT training volumes (labeled 1, 2, 5, 6, 8, 10, 14, 16, 18, 19, 21, 22, 23, 24, 25, 26, 27, 28, 29 and 30 in the database) acquired in the PV phase. We chose this dataset for evaluation firstly because its CT images are in DICOM format, the standard format for medical images. Secondly, the images were acquired using three different scanners, namely a Philips SecuraCT with 16 detectors, a Philips Mx8000 CT with 64 detectors and a Toshiba AquilionOne with 320 detectors; hence, the robustness of the model could be assessed. Thirdly, the dataset has challenging cases, as discussed in the following sections. It is to be noted that none of the test set images were part of the training set. An overview of the test datasets is provided in Table 2.

Table 2 Details of test datasets

The ground truths for the KMC training, validation and test data were generated using the ITK-SNAP tool [28] under the guidance of an experienced radiologist (co-author) with over twenty years of expertise in medical imaging. The ground truths for the three public datasets are available in their respective databases. The KMC dataset will hereafter be referred to as the institutional dataset.

Preprocessing

The following operations were performed on the training/validation datasets before training: (i) the pixel intensities were first converted to Hounsfield units using the linear transformation described in the DICOM documentation, (ii) the images were converted to unsigned 8-bit integer format (Fig. 1b), (iii) the background pixels in the upper region of the image were removed through cropping, to reduce the unwanted areas and magnify the abdominal region, and (iv) the images were resampled to 384 × 384 × 3 pixels to satisfy the RGB input requirement of the DL model (Fig. 1c). The image dimension was chosen as a tradeoff between image quality and training time: higher-dimension training images gave better results at the cost of longer training time, whereas lower-dimension images reduced training time but produced inferior results. The preprocessed training and validation images were saved in .mat format. For the test sets, all the preprocessing operations except step (iii) were performed. It is to be noted that all the above preprocessing steps were automated.
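A minimal MATLAB sketch of steps (i)-(iv) for a single DICOM slice is given below; the crop margin is an illustrative assumption, since the exact automated cropping rule is not reproduced here.

```matlab
% Sketch of the preprocessing chain for one DICOM slice; dcmInfo is
% the dicominfo header loaded earlier.
raw = double(dicomread(dcmInfo));
hu  = raw * dcmInfo.RescaleSlope + dcmInfo.RescaleIntercept; % (i) Hounsfield units

img8 = im2uint8(rescale(hu));      % (ii) unsigned 8-bit integer format

topMargin = 60;                    % (iii) illustrative crop offset
img8 = img8(topMargin:end, :);     %       remove background above the abdomen

img = imresize(img8, [384 384]);   % (iv) resample to 384 x 384
img = repmat(img, [1 1 3]);        %      replicate to 3 channels (RGB input)
```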

Fig. 1

Overview of the preprocessing steps. a Raw CT axial slice. b CT slice in unsigned 8-bit integer format after rescaling. c CT slice after cropping and resizing

During training, augmentation techniques such as scaling and translation were applied to the training and validation datasets on the fly; hence, only the diversity of the data was increased and the dataset size remained the same. The scaling factor was randomly selected from the range 60–100% (horizontal) and 40–100% (vertical). For translation, the value was randomly chosen from [−25, 25] pixels (horizontal) and [−5, 5] pixels (vertical). The chosen ranges ensured that the abdominal region remained sufficiently intact in the image after augmentation.
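In MATLAB, these ranges can be expressed with an imageDataAugmenter applied through the training datastore; a sketch is given below, where imds and pxds denote assumed image and pixel-label datastores.

```matlab
% On-the-fly augmentation with the stated ranges; imds and pxds are
% assumed datastores for the training images and liver masks.
augmenter = imageDataAugmenter( ...
    'RandXScale',       [0.60 1.00], ...  % horizontal scaling 60-100%
    'RandYScale',       [0.40 1.00], ...  % vertical scaling 40-100%
    'RandXTranslation', [-25 25], ...     % horizontal shift in pixels
    'RandYTranslation', [-5 5]);          % vertical shift in pixels

% The augmenter acts per minibatch, so the stored dataset size is
% unchanged; only the effective diversity of the data increases.
trainDs = pixelLabelImageDatastore(imds, pxds, ...
    'DataAugmentation', augmenter);
```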

Proposed Framework

The proposed SegNet-based framework for liver segmentation consists of three parts: an encoder, an ASPP module and a decoder (Fig. 2). The model has four encoder-decoder pairs, each consisting of convolution (conv.), Batch Normalization (BN) and leaky ReLU layers. The max unpooling layer in the decoder performs non-linear upsampling of the input feature maps using the pooling indices derived at the corresponding encoder's max-pooling layer. The encoder layers are initialized with weights from the VGG-16 network [29] trained on the ImageNet database. The kernel sizes of the convolutional and max-pooling layers are 3 × 3 and 2 × 2, with strides of 1 and 2, respectively. Cross entropy was the loss function employed during training. Table 3 gives the details of the model architecture.
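The sketch below illustrates how one encoder-decoder pair with index-preserving pooling can be assembled in MATLAB; the filter count numF and the layer names are illustrative assumptions, not the exact values of Table 3.

```matlab
% One encoder block and its paired decoder stage; the full model has
% four such pairs, with the ASPP module between them.
numF = 64;
encBlock = [
    convolution2dLayer(3, numF, 'Padding', 'same', 'Name', 'enc1_conv')
    batchNormalizationLayer('Name', 'enc1_bn')
    leakyReluLayer(0.01, 'Name', 'enc1_lrelu')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'enc1_pool', ...
                      'HasUnpoolingOutputs', true)];  % exports pooling indices

decBlock = [
    maxUnpooling2dLayer('Name', 'dec1_unpool')  % consumes enc1_pool indices
    convolution2dLayer(3, numF, 'Padding', 'same', 'Name', 'dec1_conv')
    batchNormalizationLayer('Name', 'dec1_bn')
    leakyReluLayer(0.01, 'Name', 'dec1_lrelu')];

% In the assembled layerGraph, the 'indices' and 'size' outputs of
% enc1_pool are connected to the corresponding inputs of dec1_unpool.
```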

Fig. 2

Proposed framework. a Liver segmentation module consisting of four encoder-decoder blocks and ASPP module. b Detailed structure of the ASPP module

Table 3 Detailed architecture of the proposed model

To overcome the dying ReLU problem sometimes encountered with ReLU layers, we employed leaky ReLU layers [30] in the encoder and decoder blocks. A leaky ReLU is an activation function that multiplies any input less than zero by a fixed scalar. It is mathematically defined as follows:

$$f\left(x\right)=\begin{cases} scale\cdot x, & x<0\\ x, & x\ge 0\end{cases}$$
(1)

where x is the input value and scale is the scalar, chosen as 0.01 in this work.
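For illustration, Eq. (1) can be written out directly, alongside the built-in layer form used when assembling the network; the numeric examples merely show the behavior.

```matlab
% Eq. (1) as an anonymous function, with scale = 0.01 as in this work.
scale = 0.01;
leaky = @(x) max(x, 0) + scale * min(x, 0);

leaky(-3)   % ans = -0.03: negative inputs keep a small, nonzero slope
leaky(2)    % ans =  2:    positive inputs pass through unchanged

layer = leakyReluLayer(scale);   % the equivalent layer form
```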

The ASPP module performs four convolutions in parallel: one with a 1 × 1 filter and three atrous convolutions (with dilation rates of 6, 12 and 18) with 3 × 3 filters (Fig. 2b). Each of these layers is followed by BN and ReLU layers; finally, a concatenation layer fuses the four outputs. The multiple receptive fields incorporated in the ASPP module aid in viewing the abdominal CT image at different scales [31]. Unlike SegNet, which further downsamples the feature map (to 12 × 12), resulting in information loss, the proposed model retrieves useful multiscale context information from the feature map (of resolution 24 × 24 pixels) after the fourth encoder (shown in Table 3).

By increasing the receptive field of the convolution filters, more context information can be extracted, which aids in identifying the liver region more precisely. Different receptive fields enable the model to learn different features pertaining to the region of interest, such as neighboring structures, location and size with respect to other organs, and many inherent patterns that are beyond human perception. However, enlarging the receptive field with standard convolutions increases the number of learnable parameters (weights and biases), and thus the network complexity and training time, both crucial factors when training DL models. A solution is to use multiple atrous convolutions instead. For a k × k kernel with dilation rate d, the effective receptive field is k + (k − 1)(d − 1); hence, dilation rates of 6, 12 and 18 for a 3 × 3 kernel expand the receptive field to 13 × 13, 25 × 25 and 37 × 37, respectively. The same pixel can thus be viewed with respect to 168, 624 and 1368 surrounding pixels, extracting more information without increasing the learnable parameters. The ASPP module was therefore incorporated to improve liver segmentation accuracy at no extra parameter cost.
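A sketch of how the ASPP branch of Fig. 2b can be assembled as a MATLAB layer graph is given below; the filter count and layer names are illustrative assumptions.

```matlab
% ASPP module: one 1x1 convolution and three dilated 3x3 convolutions
% in parallel, each followed by BN and ReLU, fused by concatenation.
numF  = 256;        % illustrative filter count
rates = [6 12 18];  % dilation rates from the paper

lg = layerGraph();
lg = addLayers(lg, [convolution2dLayer(1, numF, 'Name', 'aspp_1x1')
                    batchNormalizationLayer('Name', 'bn_1x1')
                    reluLayer('Name', 'relu_1x1')]);
for r = rates
    n = sprintf('aspp_d%d', r);
    lg = addLayers(lg, [convolution2dLayer(3, numF, 'DilationFactor', r, ...
                            'Padding', 'same', 'Name', n)
                        batchNormalizationLayer('Name', ['bn_' n])
                        reluLayer('Name', ['relu_' n])]);
end
lg = addLayers(lg, depthConcatenationLayer(4, 'Name', 'aspp_concat'));

% The fourth-encoder output feeds all four branches; the branch outputs
% (relu_1x1, relu_aspp_d6, relu_aspp_d12, relu_aspp_d18) are then
% connected to aspp_concat/in1 ... in4, which feeds the first decoder.
```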

Evaluation Metrics

Five standard metrics, namely Dice Coefficient (DC), Jaccard Index (JI), Matthews Correlation Coefficient (MCC), Absolute Volume Difference (AVD) and Average Symmetric Surface Distance (ASD), were used to evaluate the segmentation accuracy. DC and JI compute the percentage of overlap between the segmented and ground truth volumes [32]. MCC measures the quality of a binary classification and is appropriate even when the pixels in the two classes are imbalanced [33]. AVD computes the absolute difference between the segmented and ground truth volumes, and ASD gives the average distance between the surfaces of the two volumes in mm. Higher values indicate better performance for DC, JI and MCC; the opposite holds for AVD and ASD.
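For the overlap-based metrics, a minimal MATLAB sketch is shown below; pred and gt are assumed binary masks, dice and jaccard are Image Processing Toolbox functions, and MCC is computed from the confusion counts.

```matlab
% Overlap metrics for one predicted mask against its ground truth;
% pred and gt are logical masks of equal size.
dc = dice(pred, gt);      % Dice coefficient
ji = jaccard(pred, gt);   % Jaccard index

tp = nnz( pred &  gt);  fp = nnz( pred & ~gt);
fn = nnz(~pred &  gt);  tn = nnz(~pred & ~gt);
mcc = (tp*tn - fp*fn) / ...
      sqrt((tp+fp) * (tp+fn) * (tn+fp) * (tn+fn));  % Matthews correlation
```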

Experimental Results

The programs were implemented in MATLAB R2021a, and the DL models were trained on a server with an NVIDIA T4 GPU with 16 GB memory. The trained models were evaluated on the test sets using a laptop with an Intel Core i7-10750H processor, 16 GB DDR4 RAM and the Windows 10 operating system.

Parameter Setting

The model was trained with the Stochastic Gradient Descent with Momentum (SGDM) optimizer, with a momentum of 0.7 and a minibatch size of 2. The initial learning rate was 0.1 and was lowered by a factor of 0.1 every 50 epochs. The proposed model was trained for eight different epoch counts, viz. 50, 54, 58, 60, 90, 110, 130 and 150, to find the optimal one; the DC for the PV phase of the institutional dataset was computed for each (Fig. 3). Since 150 epochs gave the highest DC, this value was adopted.
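These settings map directly onto MATLAB's trainingOptions; the sketch below mirrors the stated hyperparameters and omits validation and logging options for brevity.

```matlab
% Training configuration as stated above.
options = trainingOptions('sgdm', ...
    'Momentum',            0.7, ...
    'MiniBatchSize',       2, ...
    'InitialLearnRate',    0.1, ...
    'LearnRateSchedule',   'piecewise', ...
    'LearnRateDropFactor', 0.1, ...
    'LearnRateDropPeriod', 50, ...
    'MaxEpochs',           150);

% trainDs: augmented datastore from the preprocessing section;
% lgraph: the proposed layer graph (assumed assembled as sketched above).
net = trainNetwork(trainDs, lgraph, options);
```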

Fig. 3

Dice coefficient for portal venous phase of institutional dataset

Liver Segmentation Results

Tables 4 and 5 show the liver segmentation results for the two test sets. From Table 4, we can see that the best DC achieved was 97.01% (PV phase) and the poorest was 87.3% (plain phase). The average DC values obtained for the four CT phases are 96.12% (PV), 94.61% (arterial), 95.01% (delayed) and 93.23% (plain). For the CHAOS dataset, a DC greater than 97% was achieved for the majority of cases, with an average DC of 96.69%.

Table 4 Quantitative results for internal institutional dataset
Table 5 Quantitative results for CHAOS dataset

The liver segmentation results for some of the cases from the two datasets are illustrated in Figs. 4 and 5. The first row of Fig. 4 shows a grainy case in which the liver has an unusual shape and a large peripheral tumor; although the contrast between the liver and adjacent structures such as the rib muscles is low in the arterial and plain phases, the liver is segmented well. The second row shows a case where most of the liver contains a heterogeneous tumor and the contrast between the liver and heart is very low in all phases except the arterial. In both cases, the liver is segmented quite accurately in all phases. The differences between the ground truths and predicted masks in Figs. 4 and 5 arise mainly from slight discrepancies in the contouring of the liver boundary and the Inferior Vena Cava (IVC).

Fig. 4

Liver segmentation results for multiphase CT images. a Portal venous. b Arterial. c Delayed. d Plain (Ground truth: red, predicted: green) (Note: The images have been cropped for better visualization of the liver and the contours)

Fig. 5

Liver segmentation results for CHAOS dataset. a Axial CT slice. b Ground truth mask. c Predicted liver mask. d Contours marked on CT image (Ground truth: red, predicted: green)

The training and validation accuracy and loss curves for the proposed model are shown in Fig. 6. The validation curves exhibited large fluctuations until the fifty-second epoch, after which the variations were reduced.

Fig. 6

Learning curves of the proposed model. a Accuracy. b Loss

Ablation Study

An ablation study was conducted to validate the necessity of the different components of the proposed model. Three DL models were studied: model 1 (the original SegNet with five encoder-decoder pairs), model 2 (SegNet with four encoder-decoder pairs) and model 3 (model 2 with an ASPP block inserted between encoder 4 and decoder 4). Tables 6 and 7 summarize the results obtained for the institutional and CHAOS datasets, respectively, and Table 8 gives the network complexity of the different models.

Table 6 Quantitative results of ablation study on the institutional test dataset
Table 7 Quantitative results of ablation study on the CHAOS dataset
Table 8 Comparison of network complexity

Comparison with Model 1 It can be observed from Table 6 that the proposed model outperformed the original SegNet (model 1) for all CT phases and all metrics, barring AVD for the arterial phase and ASD for the plain phase (institutional test set). Table 7 shows that, for the CHAOS dataset, the proposed model performed better on all metrics. Table 8 shows that the proposed model is also superior with respect to learnable parameters and training time, by approximately 42% and 5 h, respectively. These results show that replacing the fifth encoder-decoder pair in the original SegNet with the ASPP block and employing leaky ReLU layers improved the liver segmentation accuracy while reducing the network parameters and training time. Hence, the proposed model is superior to SegNet (model 1).

Comparison with Model 2 To examine the usefulness of the fifth encoder-decoder block in model 1, we removed it and developed model 2. The number of learnable parameters and the training time were reduced considerably (Table 8). However, the segmentation results in the tables clearly show that model 1 is better than model 2 for most of the phases. Thus, the fifth encoder-decoder block is critical for SegNet.

Compared to the proposed model, model 2 required 1.9 million fewer learnable parameters. However, the proposed model gave better segmentation results. For the PV, arterial, delayed and plain CT phases of the institutional test set, DC increased by 1.45%, 3.05%, 0.5% and 5.02%, respectively, and JI improved by 2.46%, 4.95%, 0.9% and 7.98% for the four phases. The ASD values were better for the proposed model by 7.27 mm, 8.49 mm, 0.41 mm and 4.37 mm for the four phases in the same order. MCC improved by 1.4%, 2.9%, 0.52% and 4.89%, and AVD by 2.73%, 6.42%, 0.91% and 6.64%, respectively. For the CHAOS dataset, the improvements were 0.46%, 0.78%, 0.47%, 0.86% and 1.26 mm for DC, JI, MCC, AVD and ASD, respectively. These results indicate that removing the fifth encoder-decoder pair reduces the learnable parameters somewhat but deteriorates the segmentation outcomes. Thus, a four encoder-decoder SegNet is not as effective as our proposed model for liver segmentation.

Comparison with Model 3 In an attempt to achieve high accuracy with fewer learnable parameters, we investigated model 3, in which an ASPP module was inserted after the fourth encoder block. Model 3 outperformed model 2 in accuracy; however, it required 3.5 h more for training. Compared to model 1, model 3 gave better results barring ASD for the arterial phase (Table 6). For the CHAOS dataset, model 1 gave better results than model 3 for all metrics except ASD (Table 7); however, model 3 took 1.5 h less to train.

The proposed model performed better than model 3 for all metrics except AVD for the PV phase; for the arterial phase, ASD was better by 1.15 mm (Table 6). For the CHAOS dataset, the proposed model gave better results for all metrics. In addition, although both models had the same number of learnable parameters (17.1 M), the proposed model needed less training time. It is noted from Table 6 that model 3 performed slightly better than the proposed model for some phases of the institutional dataset. Nevertheless, the better results for the PV phase of the institutional dataset and for the challenging CHAOS dataset, together with the lower training time, make the proposed model superior to model 3.

The above results emphasize that replacing the fifth encoder-decoder pair with the ASPP block and using leaky ReLU instead of ReLU layers enhanced the performance of the original SegNet in terms of accuracy, computational complexity, training time and generalizability.

Comparison with Other DL Models

A comparative analysis with other widely used semantic segmentation networks, namely UNet, DeepLab v3+ and SegNet, was performed, and the results are reported in Tables 9, 10 and 11. The proposed model outperformed all these DL models in all CT phases except for AVD in the arterial phase and ASD in the plain phase.

Table 9 Comparison of other DL models on the internal test dataset
Table 10 Comparison of other DL models on the CHAOS dataset
Table 11 Comparison of network complexity

UNet gave the poorest results for most CT phases (Table 9). Compared to UNet, the DC of our model was higher by 1.6%, 2.08%, 1.6% and 4.16% for the PV, arterial, delayed and plain phases, respectively. JI improved by 2.81%, 3.62%, 2.8% and 6.87%, and MCC by 1.53%, 2.06%, 1.64% and 4.26% for the four phases. ASD improved by 4.7 mm, 3.09 mm and 0.23 mm for the PV, arterial and delayed phases. For the CHAOS dataset, the proposed model was better by 0.5%, 0.89%, 0.55%, 0.71% and 0.3 mm for DC, JI, MCC, AVD and ASD, respectively (Table 10).

Compared to DeepLab v3+, the DC of our model was higher by 0.7%, 0.56%, 0.66% and 0.78% for the PV, arterial, delayed and plain CT phases, respectively, and JI improved by 1.28%, 1.02%, 1.19% and 1.35% for the four phases. ASD was better by 2.98 mm, 1.61 mm and 0.34 mm for the PV, arterial and delayed phases, respectively. For the CHAOS dataset, the proposed model gave the best results and DeepLab v3+ the poorest (Table 10); the improvements in DC, JI, MCC, AVD and ASD were 3.47%, 3.95%, 3.01%, 3.87% and 0.77 mm, respectively. The comparison of the proposed model with SegNet has already been discussed in the previous subsection.

From Table 11, it can be observed that the proposed model required the fewest learnable parameters (17,115,718, i.e. ~17.1 million). The learnable parameters comprise the weights and biases of the convolutional layers and the offset and scale of the batch normalization layers; the other layers have no learnable parameters. The layer-wise details are given in Table 12. Compared to the SegNet, UNet and DeepLab v3+ models, our proposed model requires approximately 42%, 86% and 61% fewer learnable parameters, respectively. UNet required the minimum training time (48.5 h); our model required 3.5 h more but produced better segmentation results. Compared to the remaining models, our model was superior in terms of both learnable parameters and segmentation accuracy. Hence, we conclude that our model is the best considering all aspects.
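For reference, such a tally can be reproduced by summing exactly these properties over the network layers; the sketch below assumes net is the trained network object.

```matlab
% Count learnable parameters: conv Weights/Bias and BN Offset/Scale.
total = 0;
props = {'Weights', 'Bias', 'Offset', 'Scale'};
for i = 1:numel(net.Layers)
    for j = 1:numel(props)
        if isprop(net.Layers(i), props{j})
            total = total + numel(net.Layers(i).(props{j}));
        end
    end
end
fprintf('Learnable parameters: %d\n', total);
```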

Table 12 Layer wise details of the learnable parameters of the proposed model

It was observed that all the DL models segmented the simple cases equally well; however, in the majority of the challenging cases, the proposed model outperformed the others. The liver segmentation results for some unusual cases from the CHAOS dataset are presented in Fig. 7. The first column shows the results for case 14, where the liver has varying intensity and nonuniform texture due to contrast injection. SegNet and UNet exhibited over- and under-segmentation, and DeepLab v3+ gave abysmal results for all slices in the volume; however, apart from the inclusion of the IVC, our model delineated the liver quite precisely. For case 6, the shape of the liver is atypical and its boundary with the spleen is vague; it is apparent from Fig. 7c (second column) that the proposed model segmented the liver quite successfully compared to the other models. For cases 23 and 28, the other models incorrectly segmented parts of the spleen as liver, producing False Positives (FP), whereas the proposed model segmented only the liver regions, including both lobes for case 28. In case 25, an unusual liver shape is delineated best by the proposed model. This analysis implies that our model is a clear improvement over the well-known DL models.

Fig. 7

Comparison of liver segmentation results of different DL models. a Input axial CT image. b Ground truth. c Proposed method. d SegNet. e DeepLab v3+. f UNet. g Contours marked on CT image (Ground truth: red, Proposed: cyan, SegNet: blue, DeepLab v3+: yellow, UNet: magenta)

Figure 8 shows the segmentation results obtained for unhealthy (first and second columns) and healthy livers (third and fourth columns). In the first and second columns, tumors are present near the border of the liver; the liver has nevertheless been segmented well in both cases. The third column shows a liver with two lobes, which has been segmented accurately. The fourth column shows a healthy liver with the heart and other structures of similar intensity in the vicinity, also segmented precisely by the proposed model.

Fig. 8

Segmentation results for unhealthy and healthy liver (Red and yellow contours indicate ground truth and predicted output, respectively)

Discussion

The observations made from our study are outlined in this section. Most of the literature on semantic segmentation of the liver has focused on UNet; SegNet-based architectures are rarely used. This trend may be because UNet was initially developed for medical image understanding and segmentation, whereas SegNet was primarily used for road scene segmentation. Our findings suggest that SegNet and the proposed SegNet-based model delineate the liver more accurately than UNet for all CT phases.

The ablation studies highlighted that integrating the ASPP scheme into the four encoder-decoder SegNet model improves the segmentation accuracy. Although the DeepLab v3+ network also uses this scheme, our investigations reveal that the proposed model is more efficient, effective and robust (Tables 9, 10 and 11), requiring 61% fewer learnable parameters and comparatively less training time.

The ablation studies also indicate that the leaky ReLU layers in the encoder and decoder sections made the model more robust. The liver segmentation results for case 14 (with a hyperdense liver) of the CHAOS dataset are depicted in Fig. 9. The results illustrate that model 3 (with ReLU layers) could not segment the liver as effectively as the proposed model (Fig. 9c, d). Although both models gave similar results for most cases, the results for case 14 illustrate that the proposed network is more robust.

Fig. 9

Comparison of liver segmentation results of model 3 and the proposed model for case 14. a Input axial slice. b Ground truth. c Proposed model. d Model 3. e Liver contours marked on CT image (Ground truth: Red, Proposed: Green, Model 3: Blue)

Our model was trained on CT images from three databases, namely 3D-IRCADb, LiTS and our institutional database. We tested it on two test sets, (a) the CHAOS dataset and (b) our institutional dataset, and achieved satisfactory results on both. It is to be noted that no images from the CHAOS database were included in the training set and that the test data from the institutional database were separate from the training images drawn from the same database. Since the images in the CHAOS dataset were acquired using different scanners, as mentioned earlier, the good results on this dataset demonstrate the robustness of our model. Moreover, it was effective in segmenting the liver from multiple CT phases (plain, arterial, PV and delayed) although it was mainly trained on PV images. Hence, our model has the potential to be integrated into a CADx system.

Table 13 compares the proposed method with other recent works that employed the CHAOS dataset. The metrics compared were DC and JI; the former was specified in all the works, whereas the latter was specified in only a few. The proposed method has given better results than these works. Mulay et al. [34] presented a method based on Holistically-nested Edge Detection and a Region-Convolutional Neural Network for liver segmentation; their approach required the images to be enhanced through adaptive histogram equalization and a sigmoid function, and obtained a DC of 94%. Lei et al. [35] proposed a U-shaped network that employed improved pooling operations and skip connections and achieved a DC of 95.58%. Khan et al. [36] integrated UNet, residual networks, dilated convolutions and a new loss function to segment the liver and reported a DC and JI of 95.49% and 89.13%, respectively. Wu et al. [22] developed a CNN based on UNet, multiscale processing and an attention mechanism and obtained a DC of 96.12%; this is only slightly lower (by 0.57%) than that of the proposed method (DC = 96.69%), but their JI was lower by 0.93% and their model required 24.97 million parameters, whereas the proposed method used only around 17 million. Since our model performed better than the other works in terms of DC, JI and other parameters, we conclude that it is superior.

Table 13 Comparison of the proposed method with other works that have used CHAOS dataset

To sum up, the advantages of the proposed model compared to the other architectures are that (i) it delineates the liver more precisely in all CT phases; (ii) it is more robust, as complex and uncommon cases, especially in the CHAOS dataset (livers with unconventional shapes and heterogeneous or hyperdense intensity distributions), were segmented comparatively better, even though the model was trained mainly on PV-phase images and tested on all four CT phases; (iii) it produces fewer FPs, which can adversely affect the diagnosis made by CADx systems; (iv) it outperforms other state-of-the-art methods that employed the same dataset (CHAOS); and (v) it is simple to implement, as it is built from existing components. The key limitation of the model is that it did not identify the IVC in many cases. Another shortcoming is its sensitivity to the CT image format: although it was trained using CT images in both DICOM and NIfTI formats, the algorithm works better on the former and gives inferior results on the latter.

Conclusion

This study developed a DL model for liver segmentation from multiphase abdominal CT volumes. The network was trained on CT images from different databases and tested on two diverse datasets: an institutional multiphase CT dataset and a public dataset. The experimental results of a comparative study indicate that the proposed model is superior to some commonly employed DL models, performing well in terms of accuracy, learnable parameters and training time. Hence, we believe that our liver segmentation algorithm is suitable for incorporation into a CADx system. Future work includes constructing a hepatic CADx system for differentiating between normal and abnormal livers and diagnosing liver cancer.