The microscopic multispectral imaging system that we used in our experiment to capture the H&E stained multispectral images is shown in fig. 1. Attached to the conventional microscope is a multispectral filter with spectral sensitivity in the visible spectrum, i.e. 400 nm to 700 nm. Using our in-house software, we captured multispectral images at a 5 nm bandwidth across the visible spectrum and saved the 1034x1050 pixel images in TIFF format with 16-bit colour depth.
We captured 10 sets of H&E stained images with the microscope objective lens set at 20x. In [6, 8], 16 multispectral bands were used to enhance regions of collagen fiber and differentiate them from muscle fiber, since the two tissue structures share similar colorimetric attributes in an H&E stained image. In our experiment the original 55-band multispectral (MS) images were reduced to 14-band MS images by averaging each MS pixel over four contiguous bands. Table 1 shows the spectral sensitivity of the resulting bands.
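The band reduction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `average_bands` and the synthetic cube are hypothetical, and because 55 is not a multiple of 4, the sketch assumes that the last group holds only three bands (how the original software handled the remainder is not stated).

```python
import numpy as np

def average_bands(cube, n_out=14):
    """Average contiguous bands of an (H, W, B) image cube into n_out bands.

    Assumption: groups are formed with np.array_split, which for B = 55 and
    n_out = 14 gives 13 groups of four bands followed by one group of three.
    """
    groups = np.array_split(np.arange(cube.shape[-1]), n_out)
    return np.stack([cube[..., g].mean(axis=-1) for g in groups], axis=-1)

# Synthetic stand-in for a 55-band multispectral image (not real data).
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 55))
reduced = average_bands(cube)
```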
The $n$-band spectral transmittance of a pixel is calculated by taking, at each band, the ratio between the signal value of the specimen, $i_s$, and that of glass, $i_g$:

\[ f = \left[ \frac{i_s(\lambda_1)}{i_g(\lambda_1)}, \frac{i_s(\lambda_2)}{i_g(\lambda_2)}, \ldots, \frac{i_s(\lambda_n)}{i_g(\lambda_n)} \right]^T \qquad (1) \]

where the entries of the $n \times 1$ column vector $f$ correspond to the spectral values at different wavelengths. We manually extracted the spectral samples for nucleus, cytoplasm, red blood cells (RBC), fiber, and white area (the area in the image which is void of tissue) from 5 of the 10 sets of images that we captured.
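As a small worked example of the transmittance calculation, the sketch below divides a specimen signal by the glass signal band by band. The function name, the `eps` guard against division by zero, and the signal values are all illustrative assumptions and do not come from the experiment.

```python
import numpy as np

def spectral_transmittance(i_s, i_g, eps=1e-12):
    """Band-wise ratio of the specimen signal to the glass (blank) signal.

    eps is an added safeguard (an assumption) so that a zero glass
    signal does not cause a division by zero.
    """
    i_s = np.asarray(i_s, dtype=float)
    i_g = np.asarray(i_g, dtype=float)
    return i_s / np.maximum(i_g, eps)

# Made-up 3-band signals for illustration only.
i_g = np.array([60000.0, 58000.0, 61000.0])   # glass
i_s = np.array([30000.0, 43500.0, 6100.0])    # specimen
f = spectral_transmittance(i_s, i_g)
```

The same function works unchanged on an (H, W, n) specimen cube and a matching glass cube, since the division is element-wise.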
Principal component analysis (PCA)
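The goal of PCA is to reduce the number of feature variables while preserving the variance of the feature data. Let $F$ be an $n \times q$ data matrix representing $q$ samples of an $n$-dimensional feature vector:

\[ F = [f_1, f_2, \ldots, f_q] \qquad (2) \]

The $n \times n$ covariance matrix of $F$ can then be derived as follows:

\[ C = \frac{1}{q} \sum_{k=1}^{q} (f_k - \bar{f})(f_k - \bar{f})^T \qquad (3) \]

where $\bar{f}$ is the mean vector of matrix $F$, given by:

\[ \bar{f} = \frac{1}{q} \sum_{k=1}^{q} f_k \qquad (4) \]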
The diagonal elements of the covariance matrix, $c_{ii}$, denote the variance of the data around the mean, while the off-diagonal elements, $c_{ij}$, express the degree of correlation between the $i$th and $j$th feature variables. From the covariance matrix $C$ we can derive the eigenvectors and eigenvalues. Arranging the eigenvectors in descending order of their eigenvalues forms an orthogonal basis in which the first eigenvector accounts for the largest variance of the original data. The percentage of the variance of the original data explained by the first $m$ eigenvectors is the ratio between the sum of the first $m$ eigenvalues and the sum of all $n$ eigenvalues:

\[ \text{variance}\,(\%) = 100 \times \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \qquad (5) \]

where $\lambda_i$ is the $i$th eigenvalue. If most of the variance in the original data is contained in the first $m$ eigenvectors, say 99.99%, the original data can be reconstructed by forming a linear combination of these vectors:

\[ \hat{f} = \sum_{i=1}^{m} \alpha_i v_i \qquad (6) \]

where $\alpha_i$ is the $i$th PC coefficient and $v_i$ is the $i$th eigenvector.
The reconstruction error that results from applying eqn. 6 depends largely on how accurately the data covariance matrix C is estimated, which in turn is governed by the sufficiency of the data samples in F. If the feature variance of a sample is captured in the data matrix F, fewer eigenvectors are needed to reconstruct that sample's features with a small error.
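The PCA steps described above (mean vector, covariance matrix, sorted eigenvectors, explained variance, and reconstruction from the first m eigenvectors) can be sketched in NumPy as follows. The function names and the synthetic rank-3 data are hypothetical. Two choices are assumptions rather than details taken from the text: the covariance is normalized by q, and the reconstruction is applied to mean-centered data with the mean added back afterwards.

```python
import numpy as np

def pca_basis(F):
    """F is n x q (one sample per column). Returns the mean vector,
    eigenvalues in descending order, and eigenvectors as columns."""
    f_bar = F.mean(axis=1)
    D = F - f_bar[:, None]                 # mean-centered samples
    C = D @ D.T / F.shape[1]               # n x n covariance (1/q assumed)
    lam, V = np.linalg.eigh(C)             # eigh returns ascending order
    order = np.argsort(lam)[::-1]
    return f_bar, lam[order], V[:, order]

def explained_variance(lam, m):
    """Percentage of total variance carried by the first m eigenvectors."""
    return 100.0 * lam[:m].sum() / lam.sum()

def reconstruct(f, f_bar, V, m):
    """Estimate f from its first m PC coefficients."""
    Vm = V[:, :m]
    alpha = Vm.T @ (f - f_bar)             # PC coefficients
    return f_bar + Vm @ alpha

# Synthetic 14-band data that truly lives in a 3-dimensional subspace.
rng = np.random.default_rng(0)
F = rng.random((14, 3)) @ rng.random((3, 200))
f_bar, lam, V = pca_basis(F)
m = 3
pct = explained_variance(lam, m)
f_hat = reconstruct(F[:, 0], f_bar, V, m)
```

In practice m would be chosen as the smallest value for which `explained_variance(lam, m)` reaches the desired level, such as the 99.99% mentioned above.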
PCA can also be used to estimate $n$-dimensional spectral data using $m<n$ eigenvectors. If we consider $F$ as the data matrix representing the spectral samples of some objects, such that $f_k$ represents the spectrum of a particular object and $\hat{f}_k$ its estimate derived from eqn. 6, then we can write the spectral error as follows:

\[ e_k = f_k - \hat{f}_k \qquad (7) \]

where $e_k$ is an $n$-dimensional column vector. The magnitude of the error in eqn. 7 is a function of the estimation of $\hat{f}_k$. Suppose that $c$ classes of objects are identified from an image but only $c-1$ of these classes are represented with spectral samples in the data matrix $F$. Then, for a given $m$ eigenvectors, the spectral errors are smaller for objects that belong to the first $c-1$ classes than for objects that belong to the $c$th class.
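The effect described above can be illustrated with synthetic spectra. In this sketch the represented classes are mixtures of two smooth spectra, while the unrepresented class carries an extra peak and dip at two arbitrary bands. All names, band indices, and values are hypothetical, and the error is taken as the sample minus its estimate, with the estimate formed from mean-centered data (an assumption).

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, m = 14, 300, 2
bands = np.linspace(0.0, 1.0, n)

# Represented classes: random mixtures of a flat and a sloped spectrum.
basis = np.stack([np.ones(n), bands])          # 2 x n
F = (rng.random((q, 2)) @ basis).T             # n x q data matrix

# PCA basis from the represented samples only.
f_bar = F.mean(axis=1)
D = F - f_bar[:, None]
lam, V = np.linalg.eigh(D @ D.T / q)
Vm = V[:, np.argsort(lam)[::-1][:m]]           # first m eigenvectors

def spectral_error(f):
    """Error between a spectrum and its m-eigenvector estimate."""
    d = f - f_bar
    return d - Vm @ (Vm.T @ d)

# A sample from a represented class and one from an unrepresented class.
f_known = F[:, 0]
f_new = 0.4 + 0.3 * bands
f_new[4] += 0.2                                # extra peak (hypothetical band)
f_new[9] -= 0.15                               # extra dip (hypothetical band)

e_known = spectral_error(f_known)
e_new = spectral_error(f_new)
```

With this construction `e_known` is numerically zero, whereas `e_new` is dominated by the two perturbed bands, which is the behaviour the segmentation step relies on.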
Detection and segmentation using the spectral error
The configuration of $e_k$, i.e. the wavelengths at which the error peaks, is a function of the spectral attributes of the object. If we can identify two wavelengths $r$ and $s$ at which the spectral error of the $c$th class has its highest positive and negative peaks, while the $c-1$ classes of objects experience an almost zero spectral error at these wavelengths, it is possible to segment the objects in the $c$th class by taking the difference between the spectral errors at these wavelengths:

\[ d = e_r - e_s \qquad (8) \]

where $e_r$ is the highest positive peak and $e_s$ is the negative peak with the largest magnitude. To segment objects belonging to the $c$th class, a threshold can be applied to the result of eqn. 8.
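A minimal sketch of this detection step is given below. It picks the two bands from a reference error spectrum of the unrepresented class, forms the difference of the errors at those bands for every pixel, and applies a threshold. The function names, the toy error cube, and the threshold value are hypothetical, and the rule "difference greater than threshold" is an assumed form of the thresholding.

```python
import numpy as np

def select_bands(e_ref):
    """Return (r, s): indices of the highest positive peak and of the
    negative peak with the largest magnitude in a reference error spectrum."""
    return int(np.argmax(e_ref)), int(np.argmin(e_ref))

def segment(error_cube, r, s, threshold):
    """error_cube is (H, W, n). Returns the per-pixel difference of the
    errors at bands r and s, and a boolean mask of pixels above threshold."""
    d = error_cube[..., r] - error_cube[..., s]
    return d, d > threshold

# Toy 2 x 2 image with 5 bands: only pixel (0, 1) has a structured error.
e_ref = np.array([0.0, 0.2, 0.0, -0.15, 0.0])
error_cube = np.zeros((2, 2, 5))
error_cube[0, 1] = e_ref

r, s = select_bands(e_ref)
d, mask = segment(error_cube, r, s, threshold=0.1)   # threshold is illustrative
```

Because the two peaks have opposite signs, their difference is larger than either peak alone, which widens the margin between the target class and the near-zero errors of the represented classes.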
Although the tissue structures in an H&E stained slide are generally categorized as either acidophilic or basophilic, each tissue structure has its own distinct spectral attributes due to its unique reactions to the chemical dyes. Hence, for a given m eigenvectors, the spectral error of a tissue component not represented in the data matrix F would likely exhibit peaks at certain wavelengths; these wavelengths might be correlated with the absorption peaks of the dyes themselves. With these specific wavelengths identified, it is possible to detect and segment that tissue component by applying appropriate thresholds. Furthermore, translating the spectral error values at these wavelengths would also result in better visualization of the tissue component.
The general processes involved in the localization of eosinophils are illustrated in the block diagram in fig. 2. First, the spectral transmittance of a pixel is calculated; then its estimate is calculated using m PC vectors that were previously identified in an offline experiment. The difference of the spectral errors between two bands is then calculated. Applying an appropriate threshold, eqn. 9, segments the eosinophils.