Background

A challenge still facing scientists is the efficient analysis and management of biomedical data, including images. Advances in biomedical imaging diagnosis have been possible thanks to the development of new imaging technologies. Anatomical Pathology has also benefited from these new technologies, which have provided solutions for whole slide scanning by means of motorized microscopes and scanners [1], that is, whole slide imaging (WSI). However, the image processing performed on these slides is still limited in both the amount of data processed and the processing methods used.

Much research has been carried out on the development of algorithms for histological image analysis. Most of these algorithms are based on the segmentation of just one region of interest (ROI), usually the nucleus, and its classification for diagnostic purposes. To this end, statistical information techniques, region growing algorithms, active contour models and morphological methods have been used for ROI detection and processing [2–5].

The main problem with these methods is that they are not designed to process large amounts of data, which is the case when working with WSI in pathology. Besides that, many of these methods show limited results because they are mainly focused on a single structure or a type of tissue.

There is a need to develop more general and efficient image processing methods. To this end, both the colour model and the colour distance used by the processing algorithm should be analysed, in order to reduce the computational cost and to perform, efficiently, a set of heterogeneous, complex and specific image analyses. In this work, different colour models and distances have been studied and applied under a general parallel image-processing model designed and implemented with MPP (Massively Parallel Processing).

Methods

There are three main colour models:

RGB: channel Red, channel Green and channel Blue,

HSI: channel Hue, channel Saturation and channel Intensity,

L*a*b*: channel Lightness (L*), channel a*, that is, the range of colours between Red and Green, and channel b*, that is, the range of colours between Yellow and Blue.
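As an illustration of how the same colour is expressed in two of these models, the following minimal sketch converts an RGB triple to HSI. It assumes the common textbook HSI definition (intensity as the channel mean, saturation from the minimum channel, hue from an arccosine); the text does not state which HSI variant was used, so the function name `rgb_to_hsi` and its conventions are illustrative assumptions only.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB values in [0, 1] to (hue in degrees, saturation, intensity).

    Assumption: textbook HSI definition, not necessarily the variant
    used in the study.
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        # Black: hue and saturation are undefined, report 0 by convention.
        return 0.0, 0.0, 0.0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        # Grey: hue is undefined, report 0 by convention.
        return 0.0, saturation, intensity
    # Clamp guards against rounding pushing the ratio just outside [-1, 1].
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    hue = 360.0 - theta if b > g else theta
    return hue, saturation, intensity
```

Under this definition pure red, green and blue map to hues of 0, 120 and 240 degrees respectively, each with saturation 1 and intensity 1/3.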

All colour models have their advantages and drawbacks. It is necessary to identify which colour model is suitable to represent and reproduce the ROI under consideration for each tissue type and WSI modality. This may be done by analysing the colour distance formula $d(\vec{x}, \vec{y})$ applied between two colours, $\vec{x} = (x_1, x_2, x_3)$ and $\vec{y} = (y_1, y_2, y_3)$.

The distances considered within this study are: the Euclidean distance for the RGB model (Equation 1), the NBS colour distance formula for the HSI model (Equation 2) and the CIEDE2000 formula for the CIEL*a*b* colour model (Equation 3).

$$d(\vec{x}, \vec{y}) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2} \tag{1}$$

$$d(\vec{x}, \vec{y}) = 1.2 \sqrt{2 x_2 y_2 \left(1 - \cos\left(\frac{2\pi \Delta H}{100}\right)\right) + \Delta S^2 + (4 \Delta I)^2} \tag{2}$$

$$\Delta E_{00} = \sqrt{\left(\frac{\Delta L'}{K_L S_L}\right)^2 + \left(\frac{\Delta C'}{K_C S_C}\right)^2 + \left(\frac{\Delta H'}{K_H S_H}\right)^2 + R_T \left(\frac{\Delta C'}{K_C S_C}\right)\left(\frac{\Delta H'}{K_H S_H}\right)} \tag{3}$$

where $K_L$, $K_C$ and $K_H$ are weighting factors and the remaining components, $S_L$, $S_C$, $S_H$, $C'$ and $H'$, can be calculated from the {L*, a*, b*} coordinates [6].
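The three distances of Equations 1 to 3 can be sketched in a few lines of Python. This is an illustrative sketch, not the libraries used in the study. It makes three assumptions: for the NBS distance, the colour is ordered (H, S, I), the differences ΔH, ΔS and ΔI are taken component-wise, and hue is on a 0 to 100 scale, as the term 2πΔH/100 suggests; for CIEDE2000, the intermediate terms follow the published definition of the formula with all weighting factors defaulting to 1. The function names are hypothetical.

```python
import math


def euclidean_rgb(x, y):
    """Equation 1: Euclidean distance between two RGB triples."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def nbs_hsi(x, y):
    """Equation 2: NBS distance; x and y are (H, S, I), hue assumed in 0..100."""
    dh, ds, di = x[0] - y[0], x[1] - y[1], x[2] - y[2]
    hue_term = 2.0 * x[1] * y[1] * (1.0 - math.cos(2.0 * math.pi * dh / 100.0))
    return 1.2 * math.sqrt(hue_term + ds ** 2 + (4.0 * di) ** 2)


def ciede2000(lab1, lab2, k_l=1.0, k_c=1.0, k_h=1.0):
    """Equation 3: CIEDE2000 difference between two (L*, a*, b*) triples."""
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2
    # Rescale a* so that near-neutral colours are handled better.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0
    # Differences in lightness, chroma and hue.
    dl, dc = l2 - l1, c2p - c1p
    achromatic = c1p * c2p == 0.0
    dh_deg = 0.0 if achromatic else h2p - h1p
    if dh_deg > 180.0:
        dh_deg -= 360.0
    elif dh_deg < -180.0:
        dh_deg += 360.0
    dh = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(dh_deg / 2.0))
    # Mean lightness, chroma and hue.
    l_bar, c_bar_p = (l1 + l2) / 2.0, (c1p + c2p) / 2.0
    if achromatic:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0
    t = (1.0 - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    # Weighting functions S_L, S_C, S_H and rotation term R_T.
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * c_bar_p
    s_h = 1.0 + 0.015 * c_bar_p * t
    d_theta = 30.0 * math.exp(-(((h_bar - 275.0) / 25.0) ** 2))
    r_c = 2.0 * math.sqrt(c_bar_p ** 7 / (c_bar_p ** 7 + 25.0 ** 7))
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c
    tl, tc, th = dl / (k_l * s_l), dc / (k_c * s_c), dh / (k_h * s_h)
    return math.sqrt(tl ** 2 + tc ** 2 + th ** 2 + r_t * tc * th)
```

The difference in length between the three functions gives a rough intuition for why the CIEDE2000 distance is the most expensive of the three to evaluate per pixel.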

Another aspect to be considered is how to deal with the colour coordinates, that is, as a vector or in a marginal way. Both aspects have been analysed in this work. To this end, the 3 × 2 distances (three formulae, each in vector and marginal form) to the most representative colours of the ROIs, identified statistically on the image, were calculated on different WSI, namely prostate biopsies and lung cytology samples stained with haematoxylin-eosin (HEO), immunohistochemistry and Papanicolaou. The images were obtained with the ALIAS II automatic microscope, and the processing was done using our own libraries, implemented by the research group, running under MPI on a grid of 17 Intel Xeon nodes (3.2 GHz) connected by an InfiniBand network (10 GB full-duplex). The results are presented below.
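The difference between the vector and the marginal treatment can be illustrated with a small sketch. The text does not specify how the marginal comparison was implemented, so the version below is only one plausible reading: the vector form compares a single distance over all three coordinates against one threshold, whereas the marginal form compares each coordinate separately against its own threshold. The function names, thresholds and reference colours are hypothetical.

```python
import math


def euclidean(x, y):
    """Vector (joint) distance over all three colour coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def vector_match(pixel, ref, threshold, distance=euclidean):
    """Vector treatment: one distance, one threshold."""
    return distance(pixel, ref) <= threshold


def marginal_match(pixel, ref, thresholds):
    """Marginal treatment (assumed form): each channel is checked on its own."""
    return all(abs(p - r) <= t for p, r, t in zip(pixel, ref, thresholds))


def label_pixel(pixel, roi_colours, distance=euclidean):
    """Assign the pixel to the ROI whose representative colour is nearest."""
    return min(roi_colours, key=lambda name: distance(pixel, roi_colours[name]))


# Hypothetical representative RGB colours, for illustration only.
ROI_COLOURS = {"nucleus": (70, 40, 120), "lumen": (240, 235, 240)}
```

For example, the pixel (10, 10, 10) lies within 12 units of the reference (0, 0, 0) in every channel, so `marginal_match` accepts it with thresholds (12, 12, 12), while `vector_match` with threshold 12 rejects it because the joint Euclidean distance is about 17.3. Any of the three distance functions could be passed as the `distance` argument.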

Results

The results on microscopic images show that the vector Euclidean and NBS distances, for the RGB and HSI models respectively, distinguish between different ROIs, but the vector CIEDE2000 distance for the CIEL*a*b* model reproduces the original colour better. However, the computational cost of the latter is higher than that of the other two.

Figure 1 shows the result for a biopsy stained with haematoxylin-eosin and Figure 2 for a cytology sample stained with Papanicolaou.

Figure 1. Colour distances for ROI detection applied to biopsies. a) Original image, b) RGB (Euclidean), c) HSI (NBS), d) CIEL*a*b* (CIEDE2000).

Figure 2. Colour distances for ROI detection applied to cytology. a) Original image, b) RGB (Euclidean), c) HSI (NBS), d) CIEL*a*b* (CIEDE2000).

The computational cost of the three vector colour distances against the number of ROIs is shown in Figure 3.

Figure 3. Computational cost of the colour distances vs. number of ROIs. Computational cost for the different colour models against the number of ROIs analysed.

To quantify the goodness of the distance formulae, a ROC analysis has been carried out. Figure 4 shows this analysis for two ROIs in a prostate biopsy at 10× stained with HEO. The true pixels belonging to the ROIs were indicated by experts at Hospital General Ciudad Real. Figures 4b) and 4c) show the true values for the two regions of interest, the glandular lumen and the nucleus, extracted from the original image (Figure 4a). Figures 4d) to 4i) show the different colour distance results for these two regions. Finally, Table 1 shows the ROC analysis for the Euclidean, NBS and CIEDE2000 colour distances for the RGB, HSI and CIEL*a*b* models. The specificity (%) is higher for the CIEDE2000 distance, which also yields a lower number of false positives (FP).
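The quantities reported in such an analysis can be obtained by comparing, pixel by pixel, the mask produced by a colour distance with the mask marked by the experts. The following minimal sketch assumes both masks are given as flat sequences of 0/1 values of equal length; the function names are illustrative and the example data are not taken from the study.

```python
def confusion_counts(predicted, truth):
    """Count true/false positives and negatives between two binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(predicted, truth):
        if p and t:
            tp += 1      # pixel detected and truly in the ROI
        elif p and not t:
            fp += 1      # pixel detected but outside the ROI
        elif not p and t:
            fn += 1      # ROI pixel that was missed
        else:
            tn += 1      # background pixel correctly rejected
    return tp, fp, tn, fn


def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity); NaN when a rate is undefined."""
    tp, fp, tn, fn = confusion_counts(predicted, truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Specificity rises as the number of false positives falls, which is why the two are reported together when comparing distance formulae.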

Table 1. ROC analysis of the colour distance formulae for two ROIs

Figure 4. Colour distance validation on different ROIs. a) Original image, b) True section of glandular lumen, c) True section of nucleus, d) RGB (Euclidean), e) HSI (NBS), f) CIEL*a*b* (CIEDE2000), g) RGB (Euclidean), h) HSI (NBS), i) CIEL*a*b* (CIEDE2000).

Conclusion

This article has presented a comparative study of the RGB, HSI and CIEL*a*b* colour models applied to histological images. This analysis, in turn, makes it possible both to distinguish possible regions of interest and to retrieve their proper colour for further region analysis.

The results on prostate biopsies stained with HEO and lung cytology samples stained with Papanicolaou show that the vector CIEDE2000 distance for the CIEL*a*b* model reproduces the original colour better.

Therefore, this comparison allows us to choose the colour model best suited to the microscopic stain and tissue type under consideration. Moreover, a compromise should be kept between the computational cost and the quality of the results, depending on whether the aim is to distinguish between different colours for detection or to retrieve colour for further ROI analysis. The colour model should be taken into consideration when defining standards for histological images.