Preparation and staining of tissue samples
44 biopsy specimens of DLBCL were selected from the files of the Lymph Node Registry Kiel based on availability of material. Core needle biopsies were excluded. Formalin-fixed paraffin-embedded (FFPE) tissue was cut into 2 μm thin sections and, in addition to conventional HE staining, immunohistochemical staining was performed with antibodies against CD14 (Clone EPR3653; Cell Marque, Rocklin, CA, USA; 1:10) and CD163 (Clone 10D6; Novocastra, Leica Biosystems, Wetzlar, Germany; 1:100). Briefly, after deparaffinization in xylene and rehydration in alcohol, tissue sections were incubated for 3 min in citrate buffer (pH 6) within a pressure cooker. The slides were washed in PBS and then incubated for 1 h with a mixture of the primary antibodies in antibody diluent (medac GmbH, Wedel, Germany). After incubation with the primary antibodies, the sections were washed in PBS and then incubated for 1 h with a mixture of the fluorescent-labeled secondary antibodies in PBS. As secondary antibodies, donkey anti-rabbit Alexa 488 and donkey anti-mouse Alexa 555 were used (both from Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA; 1:100). After washing in PBS, the sections were incubated with DAPI (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA; 1:5000) for 2 min, washed in PBS and cover-slipped with mounting medium. Use of tissue was in accordance with the guidelines of the internal review board of the Medical Faculty of the Christian-Albrechts-University Kiel, Germany (No. 447/10).
Image acquisition, selection of tumor subregions and ROIs
Images were acquired with a Hamamatsu Nanozoomer 2.0 RS slide scanner (Hamamatsu Photonics, Ammersee, Germany) at 20 × magnification. For every fluorescent immunostained tissue slide, the whole tissue sample as well as a tumor subregion were imaged, resulting in single images for the Alexa 488, Alexa 555, and DAPI channels, respectively, and an overlay picture of the channels. Raw image data were saved in .ndpi format (single-channel images) or .ndpis format (overlay image). Pixel size is 0.45 μm × 0.45 μm in all images.
In order to select a tumor subregion within a whole tissue sample, the tumor area was defined and marked by a pathologist by inspection of the HE-stained section. Subsequently, within the immunostained section, a suitable subregion of the tumor area not larger than 10 mm² was selected depending on tissue and staining quality (no tissue artifacts, no scratches or folding in the tissue, no overstaining) and captured. The position of the selected tumor subregion was marked within the raw data using the software NDP.view 2 (Hamamatsu Photonics, Ammersee, Germany), which is available as freeware.
From 25 randomly selected tumor subregions, ROIs of 900 × 600 px (0.109 mm²) size were selected for manual counting and comparison of image analysis methods (CD14 +/488 nm and CD163 +/555 nm channels). Note that the ROIs were chosen so as to reflect the various difficulties of automated macrophage recognition, see Fig. 1.
In order to prepare the scans for image analysis, raw data were converted into uncompressed .tif format and, in the case of whole tissue samples and tumor subregions, sliced into tiles of 1000 × 1000 px (0.202 mm²) size, using the software package ImageJ with the extension ndpitools. Since all obtained images are monochrome, they were further converted from RGB into greyscale mode using the modulus Igrey = |IRGB| of the RGB vector and finally saved in losslessly compressed .png format. Thus we end up with 50 ROIs, 44 datasets for whole tissue samples and 44 datasets for tumor subregions, each comprising image data at three different immunostainings. Note that the image acquisition as well as the tiling and the selection of the ROIs were organized such that no misalignment between the scans at the different wavelengths occurred.
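The physical areas quoted for ROIs and tiles follow directly from the pixel size of 0.45 μm. The short Python sketch below reproduces that arithmetic; the function name `area_mm2` is introduced here for illustration only and is not part of the published procedures. It yields about 0.109 mm² for a 900 × 600 px ROI and 0.2025 mm² for a 1000 × 1000 px tile, the latter being the value reported as 0.202 mm² in the text.

```python
PIXEL_SIZE_UM = 0.45  # edge length of one pixel in micrometres (from the text)


def area_mm2(width_px: int, height_px: int,
             pixel_size_um: float = PIXEL_SIZE_UM) -> float:
    """Physical area in mm^2 of a width x height pixel rectangle."""
    area_um2 = width_px * height_px * pixel_size_um ** 2
    return area_um2 / 1e6  # 1 mm^2 equals 1e6 um^2


roi_area = area_mm2(900, 600)      # ROI used for manual counting
tile_area = area_mm2(1000, 1000)   # tile of a whole sample or tumor subregion
print(f"ROI:  {roi_area:.5f} mm^2")
print(f"tile: {tile_area:.5f} mm^2")
```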
Let us remark that a further staining with Pax5 (polyclonal; Santa Cruz Biotechnology, Heidelberg, Germany; 1:100) and donkey anti goat Alexa 647 (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA; 1:100) has been simultaneously performed and imaged but all related information, as it is not concerned with macrophages, has been completely excluded from the following analyses.
Fully automated ROF filter based segmentation
a) Method description. The described method is a substantial further development of the approach presented in Bredies et al., where IHC stained photoreceptor segmentation was performed with data-dependent but fixed intensity thresholding and without application of geometric rules for feature segmentation. Some of the steps described below are visualized in Fig. 2.
After initialization of the parameters (Step 0), a median-filtered version I(1) is subtracted from the original image I(0) (Step 1), which results in a brightness-normalized, unsharply masked image I(2) = max(I(0) − I(1), 0). We then apply the Rudin-Osher-Fatemi (ROF) filter (Step 2), ending up with I(3). ROF filtering constitutes a well-established standard procedure in image processing, resulting in a slightly coarsened, cartoon-like version of the input image which, nevertheless, conserves the original edge structure. The procedure allows for a surprisingly efficient numerical realization, pp. 175 ff. Steps 0 − 2 are analogous to the algorithm described in Bredies et al. We refer to the appendix of this paper for an outline of the mathematical background of the ROF approach.
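Step 1 can be illustrated with a small, dependency-free Python sketch. It is not the authors' MATLAB code: the function names are hypothetical, the treatment of image borders (clipping the window) is an assumption, and the ROF filter of Step 2 is deliberately left out. The default half-width of 15 corresponds to the 31 × 31 px window stated for the median filter.

```python
from statistics import median


def median_filter(img, half):
    """Median over a (2*half+1)^2 window; the window is clipped at the borders
    (border handling is an assumption, the source does not specify it)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - half), min(h, y + half + 1))
                      for xx in range(max(0, x - half), min(w, x + half + 1))]
            out[y][x] = median(window)
    return out


def normalize_brightness(img, half=15):
    """Step 1: I2 = max(I0 - I1, 0), where I1 is the median-filtered image."""
    background = median_filter(img, half)
    return [[max(p - b, 0) for p, b in zip(row, brow)]
            for row, brow in zip(img, background)]


# Toy 5 x 5 image: flat background of 20 with a single bright pixel of 200.
toy = [[20] * 5 for _ in range(5)]
toy[2][2] = 200
result = normalize_brightness(toy, half=1)
```

In the toy image the flat background is removed entirely, while the isolated bright pixel keeps its excess over the local median.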
Next, we extract the evaluation subregion to which the macrophage segmentation has to be applied (i.e., the part of the image where tissue is present). For this purpose, we apply Steps 1 and 2 to the DAPI image, which is available together with I(0). From the obtained DAPI cartoon, we generate a black-and-white mask Ieval by masking all pixels with intensity less than 10 on an 8-bit scale with black and covering every remaining pixel with a white 31 × 31 px square centered at the given position (Step 3). When the method is applied to the ROIs, this step is skipped, and the evaluation subregion is assumed to coincide with the ROI image as a whole. Note that, in contrast to the following step, the application of a fixed threshold is possible due to the much more regular structure of the DAPI image. The threshold value was chosen experimentally.
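The construction of Ieval in Step 3 amounts to a fixed threshold followed by a dilation with a square. A minimal Python sketch is given below; the function name is hypothetical, the input is assumed to be the already filtered DAPI cartoon as a list of rows, and squares are simply clipped at the image border.

```python
def evaluation_mask(dapi_cartoon, threshold=10, half=15):
    """Step 3 sketch: pixels below `threshold` stay black; every remaining
    pixel is covered with a white (2*half+1)^2 square (31 x 31 px for half=15)."""
    h, w = len(dapi_cartoon), len(dapi_cartoon[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if dapi_cartoon[y][x] >= threshold:
                for yy in range(max(0, y - half), min(h, y + half + 1)):
                    for xx in range(max(0, x - half), min(w, x + half + 1)):
                        mask[yy][xx] = True
    return mask


def mask_area_px(mask):
    """Number of white pixels, i.e. the area of the evaluation subregion."""
    return sum(sum(row) for row in mask)


# Toy 7 x 7 DAPI cartoon with a single nucleus pixel; a 3 x 3 square is used
# here instead of 31 x 31 so that the effect is visible on a small image.
toy = [[0] * 7 for _ in range(7)]
toy[3][3] = 50
toy_mask = evaluation_mask(toy, half=1)
```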
In contrast to Bredies et al., the cartoon I(3) is segmented with a floating intensity threshold instead of a fixed one, and features are identified as macrophages by applying a set of geometrical rules. This subprocedure, which has been newly developed, is described in more detail below. For the geometrical description of a feature F, we employ the following variables: the size s(F) of the feature itself, the size c(F) of the convex hull of the feature, the ratio r(F) of the principal axes’ lengths of the smallest ellipse covering the feature, the perimeter p1(F) of the feature and the perimeter p2(F) of a circle with equal area to the feature F. Further, we define the parameters smin and smax (minimal and maximal feature size, in px), cmin (minimal area excess of the convex hull, in percent), rmax (maximal ratio of axes), and pmax (maximal excess of the feature perimeter p1 when compared with the perimeter p2 of a circle with equal area).
We start at the intensity threshold i, which is given as the mean intensity of I(3), rounded to the next integer value, and the feature mask I(3)(i) := I(3). Using Ieval, we mask in I(3)(i) all pixels outside the obtained evaluation subregion (Step 4). Now we perform the first segmentation step by masking in I(3)(i) all pixels with intensity less than i, subsequent labeling (Step 5) and inspecting the connected features Fj, j = 1, ..., N(i), in I(3)(i) (Step 6). Each of the features Fj is classified by the following rules.
1) If smax < s(Fj), then do nothing, reserving the too large feature for further analysis with incremented intensity threshold (Step 7).
2) If s(Fj) < smin, then neglect the feature as too small and mask it in I(3)(i) (Step 8).
3) If smin ≤ s(Fj) ≤ smax, then test whether the feature satisfies all of the following three criteria:
3a) c(Fj)/s(Fj) ≥ 1 + cmin/100 (the feature is not too round),
3b) r(Fj) ≤ rmax (the feature is not too elongated), and
3c) p1(Fj)/p2(Fj) ≤ pmax (the feature’s boundary is regular enough).
If yes, save the feature Fj into the output mask Isegm, interpreting it as a macrophage, and mask it in I(3)(i). If at least one of the three criteria fails, then neglect the feature and mask it in I(3)(i) as well (Steps 9 and 10).
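The three classification rules can be written as a small decision function. The Python sketch below is an illustration, not the published MATLAB procedure: the names `Params`, `classify` and the three return labels are hypothetical, and the descriptors s, c, r and p1 are assumed to have been measured beforehand. The default parameter values are those stated for the analysis of the ROIs.

```python
import math
from dataclasses import dataclass


@dataclass
class Params:
    s_min: float = 140   # minimal feature size (px)
    s_max: float = 800   # maximal feature size (px)
    c_min: float = 7.5   # minimal area excess of the convex hull (percent)
    r_max: float = 3.0   # maximal ratio of the principal axes
    p_max: float = 2.0   # maximal ratio p1 / p2


def equal_area_circle_perimeter(s: float) -> float:
    """Perimeter p2 of a circle whose area equals the feature size s."""
    return 2.0 * math.sqrt(math.pi * s)


def classify(s: float, c: float, r: float, p1: float, params: Params) -> str:
    """Return 'defer' (rule 1), 'reject' (rule 2 or a failed criterion)
    or 'accept' (rule 3 with criteria 3a to 3c all satisfied)."""
    if s > params.s_max:
        return "defer"    # too large: re-examine at a higher threshold
    if s < params.s_min:
        return "reject"   # too small
    p2 = equal_area_circle_perimeter(s)
    not_too_round = c / s >= 1.0 + params.c_min / 100.0   # 3a
    not_too_elongated = r <= params.r_max                 # 3b
    regular_boundary = p1 / p2 <= params.p_max            # 3c
    if not_too_round and not_too_elongated and regular_boundary:
        return "accept"
    return "reject"


roi_params = Params()
print(classify(s=300, c=360, r=1.5, p1=90, params=roi_params))
```

Features labeled 'accept' or 'reject' are masked in the working image, whereas 'defer' leaves the feature untouched for the next threshold.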
As a result of the classification, we end up with a masked version I(3)(i) of the cartoon and (possibly) a set of features to be interpreted as macrophages, written into the output mask Isegm. Now the segmentation step is repeated with incremented intensity threshold i = i + 1, further application of masking to I(3)(i+1) := I(3)(i) (Step 11) and geometrical analysis of the remaining features. These segmentation steps are repeated until the maximal intensity is reached. The complete algorithm is summarized in Fig. 3.
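The floating-threshold loop can be sketched in plain Python as follows. This is a simplified illustration under several assumptions that are not taken from the source: features are 8-connected, the image holds integer intensities, the evaluation mask is omitted (as for the ROIs), and the geometrical criteria 3a to 3c are delegated to a hypothetical callback `shape_ok` that accepts everything by default.

```python
def connected_features(active):
    """Group a set of (y, x) pixels into 8-connected components (assumption)."""
    remaining = set(active)
    features = []
    while remaining:
        seed = remaining.pop()
        stack, comp = [seed], {seed}
        while stack:
            y, x = stack.pop()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    n = (y + dy, x + dx)
                    if n in remaining:
                        remaining.remove(n)
                        comp.add(n)
                        stack.append(n)
        features.append(comp)
    return features


def floating_threshold_segmentation(cartoon, s_min, s_max,
                                    shape_ok=lambda feature: True):
    """Raise the threshold from the mean intensity to the maximum; at each
    level keep too large features, mask all others, and collect accepted ones."""
    h, w = len(cartoon), len(cartoon[0])
    alive = {(y, x) for y in range(h) for x in range(w)}  # unmasked pixels
    accepted = []
    start = round(sum(map(sum, cartoon)) / (h * w))
    for i in range(start, max(map(max, cartoon)) + 1):
        alive = {(y, x) for (y, x) in alive if cartoon[y][x] >= i}
        for feat in connected_features(alive):
            if len(feat) > s_max:
                continue              # rule 1: defer to a higher threshold
            if len(feat) >= s_min and shape_ok(feat):
                accepted.append(feat)  # rule 3: interpreted as macrophage
            alive -= feat              # accepted or rejected: mask it
    return accepted


# Toy image: two bright 2 x 2 blobs joined by a dimmer bridge.
toy = [[10] * 12 for _ in range(6)]
for y in (2, 3):
    for x in (1, 2, 8, 9):
        toy[y][x] = 100
for x in range(3, 8):
    toy[2][x] = 40
found = floating_threshold_segmentation(toy, s_min=3, s_max=5)
```

In the toy image the two blobs and the bridge first form one oversized feature, which is deferred; once the threshold exceeds the bridge intensity, the blobs separate and are accepted individually. This is the behavior that a fixed threshold cannot provide.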
b) Input, output and implementation. As input for the method, a single-channel greyscale image is required. In the case of whole tissue samples and tumor subregions, the related greyscale DAPI image must be provided as well. The output of the procedure consists of three black-and-white masks. Ieval, the first one, contains the evaluation subregion. Into Isegm, all detected macrophages are plotted as white features which are, as a consequence of the organization of the processing steps, mutually disjoint, see Fig. 4c. Into the third mask Iconv, we plot all convex hulls conv(F) of the detected macrophages F. All result images are of the same size as the input image. Further, the method provides the total area of the evaluated subregion marked in Ieval, the number of features in Isegm as macrophage count and the total area marked in Iconv, i.e. the cumulative area of the convex hulls of the obtained features, as macrophage area. We refer to the obtained count as method (S1) and to the obtained cumulative area as method (S2).
The algorithm has been implemented as a series of MATLAB procedures. They have been tested on MATLAB 9.4.0.813654 (R2018a) and require the MATLAB Image Processing Toolbox [28, 29]. For the ROF filtering in Step 2, the numerical method from  is applied. The window size for the median filter (31 × 31 px) as well as the internal parameters of the ROF filtering are fixed from the outset. The geometrical parameters from Steps 7 − 9 must be initialized as well. For the analysis of the ROIs, we used smin = 140, smax = 800, cmin = 7.5, rmax = 3 and pmax = 2. For the analysis of the whole tissue samples and the tumor subregions, we set the parameters to smin = 160, smax = 1500, cmin = 7.5, rmax = 3 and pmax = 2.5.
The parameter smin has been set to at least 140 px in order to exclude the misidentification of erythrocytes (with a mean diameter of about 6 μm and a corresponding mean area of ca. 100 px) as (parts of) macrophages. The setting of smax is well in agreement with the mean macrophage area reported in the “Results” section below. The values of the parameters cmin, rmax and pmax have been found experimentally. No particular attempts at performance tuning have been made.
Let us remark that dependency on proprietary software can be completely removed, e.g., by reimplementation of the ROF segmentation procedures in the freeware environment OCTAVE.
c) Availability and usage. We made the MATLAB procedures publicly accessible (CC0 1.0 Universal Public Domain Dedication or GNU General Public License v3) at the Leipzig Health Atlas repository under the address . Execution assumes that a single image set, consisting of three greyscale images representing the CD14 +/488 nm, CD163 +/555 nm and DAPI channels, as well as the procedures are stored in the MATLAB working directory. Output images and logfile will be saved at the same location. To start the analysis, type rof_segm_public_step_00_masterfile, which subsequently calls the other procedures, within the MATLAB command window. You will be asked to enter the image filenames and to confirm the parameter settings. Progress of segmentation can be traced by display messages. Parameters are set by default to the values used for the analysis of the whole tissue samples and the tumor subregions as described in the subsection above. They can be changed within the file rof_segm_public_step_01_parameters.m. Modification of the basic procedure in order to enforce batch processing may be easily effected but is left to the user as it depends strongly on the particular structure of the dataset to be analyzed.
Other segmentation methods
a) Commercial software kits. We applied two commercial software packages to the images. The first one is Tissue Studio 64, v3.6.1 (Definiens AG, Munich, Germany). In the case of the ROIs, single-channel images (at 488 and 555 nm) in .png format were separately uploaded and analyzed. Magnification was defined using the image metadata (20 × magnification, pixel resolution 0.45 μm/px), and the stained area was analyzed in “Marker Area Detection” mode. The minimal feature size was set to 30 μm² in order to exclude fragments of macrophage protrusions from counting. Thresholds for IHC marker intensity staining were manually adapted for each image (within ranges from 10 to 23 for the CD14 +/488 nm and from 11 to 26 for the CD163 +/555 nm channel on an 8-bit scale). For the analysis of the whole tissue samples, .ndpis files were uploaded. In order to define the evaluation subregion, all layers were used for tissue background separation. Instead of using the auto-threshold function of the software kit, the homogeneity threshold was set to 0.2, brightness control was manually adapted within a range from 2 to 6, and tissue minimum size was set between 10 and 2000 μm² depending on the tissue sample. Areas with overstaining, scratches or folding were excluded by manual marking. Then the CD14 +/488 nm and CD163 +/555 nm channels were analyzed independently from each other in “Marker Area Detection” mode. Thresholds were manually set in ranges from 13 to 40 for the CD14 +/488 nm and from 12 to 45 for the CD163 +/555 nm channel on an 8-bit scale. As output, the software provides the total area analyzed and the areas bearing the respective stainings. Graphical output is an annotated version of the original image with marking of the detected area, see Fig. 4d. We refer to the method as (S3).
The other software kit is Halo, v2.1.1637.11 (Indica Labs, Corrales, New Mexico, USA). Magnification was set to 0.45 μm/px, and “Area Quantification FL v1.2” mode was applied. In the case of the ROIs, single-channel images (at 488 and 555 nm) in .png format were separately uploaded and analyzed. For the analysis of the whole tissue samples, .ndpi files were uploaded. Based on simultaneous inspection of all layers, the evaluation subregion was marked manually, excluding at the same time areas with apparent overstaining, scratches or folding. Then the CD14 +/488 nm and CD163 +/555 nm channels were analyzed independently from each other. Again, thresholds for IHC marker intensity staining were adapted manually for each image (within ranges from 0.1 to 0.16 for the CD14 +/488 nm and from 0.125 to 0.19 for the CD163 +/555 nm channel for the ROIs, and from 0.021 to 0.097 for the CD14 +/488 nm and from 0.047 to 0.279 for the CD163 +/555 nm channel for the whole tissue samples, on a float scale). As output, the software provides the total area analyzed and the stained areas. Graphical output is an annotated version of the original image with marking of the detected area, see Fig. 4e. We refer to the method as (S4).
b) Machine learning method (Mask R-CNN). Mask R-CNN is a region-based convolutional neural network, providing bounding boxes for candidate target objects together with a binary mask for the objects themselves. It depends on two sets of greyscale images annotated with bounding boxes for the contained features of interest, which are used for training and validation, respectively. In our case, the training set was built from 10 randomly selected ROIs (20 % of data available), and the validation set consisted of a further 5 randomly selected ROIs (10 % of data available), thus leaving 35 ROIs for the application of the method. Selection and annotation of training and validation features within the original images was performed by assigning a centered 31 × 31 px square subregion around every tag obtained by manual counting (whose output is available as a mask) as a valid training feature. Annotation was performed with the software package VGG Image Annotator. Annotated images were converted into a backbone feature map of size 32 × 32 × 2048 by the standard convolutional neural network ResNet-101. Based on the obtained training data, the remaining 35 ROIs (at 488 and 555 nm, 70 % of data available) were subjected to segmentation with Mask R-CNN, using the implementation available at . Single-channel images were uploaded in .png format. The output of the method is an annotated version of the original image with bounding boxes for the detected macrophages and a black-and-white mask of the same size as the input image, into which all detected macrophages have been plotted, see Fig. 4f. For counting and area evaluation, features of size less than 140 px were ignored. We refer to the obtained count as method (S5) and to the obtained cumulative area of macrophages, as derived from the black-and-white mask, as method (S6).
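The data split and the derivation of bounding boxes from the manual tags can be sketched in Python as follows. Both functions are hypothetical helpers written for illustration: the random seed is arbitrary, and clipping boxes at the image border is an assumption, since the source does not say how tags near the border were handled.

```python
import random


def split_rois(roi_ids, n_train=10, n_val=5, seed=0):
    """Randomly split ROI identifiers into training, validation and
    application sets (10 / 5 / 35 for the 50 ROIs described in the text)."""
    ids = list(roi_ids)
    random.Random(seed).shuffle(ids)   # seed chosen arbitrarily here
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def tag_to_box(y, x, height, width, half=15):
    """Inclusive (top, left, bottom, right) of the 31 x 31 px square centered
    at the manual tag (y, x), clipped to the image (clipping is an assumption)."""
    return (max(0, y - half), max(0, x - half),
            min(height - 1, y + half), min(width - 1, x + half))


train, val, rest = split_rois(range(50))
box = tag_to_box(300, 450, height=600, width=900)   # ROI of 900 x 600 px
```

For a tag in the interior of a 900 × 600 px ROI, the resulting box spans exactly 31 pixels in each direction.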
Mutual comparison of the segmentation methods
a) Manual count as reference basis. Within single-channel images of the ROIs (at CD14 +/488 nm and CD163 +/555 nm), macrophage cells were marked with a 3 × 3 px cross and manually counted (see Fig. 4b, wherein, for better visibility, the cross-shaped detection marks have been replaced by squares centered at the same pixel). Tags were saved into a black-and-white mask of equal size as the original image. We refer to the manual count as method (MC).
b) Method comparison by means of the ROIs. To the ROI image set, segmentation methods (S1) − (S6) have been applied and subsequently compared. For this comparison, the relative error turns out to be an inadequate measure. Indeed, since manual counts range from 8 to 311 macrophages per ROI, the relative error would vary from 0.32 % to 12.5 % per erroneously counted single feature, thus considerably overweighing errors made within ROIs with small macrophage numbers. Instead, we will use the Pearson correlation coefficients between the methods’ outputs for the complete sample of ROIs. Since the manual count as reference method gives no information about the area of the tagged cells, this measure has the further advantage to allow for an immediate comparison of count or area information without the necessity of a normalization of the latter.
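The argument against the relative error and the alternative measure can both be made concrete in a few lines of Python. The counts in the example are invented purely for illustration and are not data from the study; the function names are likewise hypothetical.

```python
import math


def relative_error_per_miscount(manual_count: int) -> float:
    """Relative error (in percent) caused by one erroneously counted feature."""
    return 100.0 / manual_count


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One miscounted cell weighs very differently in sparse and dense ROIs.
sparse = relative_error_per_miscount(8)     # 12.5 percent
dense = relative_error_per_miscount(311)    # about 0.32 percent

# Invented example: manual counts versus an automated count for five ROIs.
manual = [8, 40, 95, 180, 311]
automated = [10, 37, 101, 170, 298]
r = pearson(manual, automated)
```

Because the correlation coefficient is invariant under rescaling of either variable, the same function can compare counts with counts or counts with areas without prior normalization.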
For (S1) and (S5), we further provide the percentage of manually counted macrophages which are exactly matched by the output of the respective method. For the reasons mentioned in the “Background” section, a detected feature is considered to match a manually tagged macrophage not only if the marking cross falls inside the convex hull of the detected feature. A match is also given if the tag and the convex hull of the feature are disjoint but visual inspection reveals that the convex hull covers the marked macrophage at least partly.
c) Method comparison by means of the whole samples. To the whole samples, methods (S1) − (S4) have been applied and subsequently compared. We first provide the Pearson correlation coefficients for the methods’ output for the CD14 +/488 nm and CD163 +/555 nm channels. Since, however, the evaluation subregions as well as the overall density of cells contained within them show considerable variation between the samples, the outputs are appropriately normalized and then compared again. As normalizations for (S1), we calculate the density, which is given as total macrophage count divided by area of evaluation subregion, cf. Step 3 of Algorithm 1 above, as well as the cell percentage, which is given as total macrophage count divided by the estimated total number of cell nuclei within the evaluation subregion. The latter is obtained from the cartoon of the DAPI channel by masking all pixels with intensity less than 10 and dividing the number of the remaining pixels by 100. As normalizations for (S2) − (S4), we calculate the area percentages, which are given as cumulative macrophage area divided by the area of the corresponding evaluation subregion.
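The three normalizations can be summarized in a short Python sketch. The function names are hypothetical, and expressing the density per mm² (using the pixel size of 0.45 μm) is a choice made here for readability; the source only states that the count is divided by the area of the evaluation subregion.

```python
def density_per_mm2(count, eval_area_px, pixel_size_um=0.45):
    """Macrophage count divided by the evaluation area (unit chosen here: mm^2)."""
    area_mm2 = eval_area_px * pixel_size_um ** 2 / 1e6
    return count / area_mm2


def estimated_nuclei(dapi_pixels_above_threshold, px_per_nucleus=100):
    """Estimated number of nuclei: DAPI cartoon pixels with intensity >= 10,
    divided by 100."""
    return dapi_pixels_above_threshold / px_per_nucleus


def cell_percentage(count, dapi_pixels_above_threshold):
    """Macrophage count as a percentage of the estimated number of nuclei."""
    return 100.0 * count / estimated_nuclei(dapi_pixels_above_threshold)


def area_percentage(macrophage_area_px, eval_area_px):
    """Cumulative macrophage area as a percentage of the evaluation area."""
    return 100.0 * macrophage_area_px / eval_area_px


# Invented numbers for a single 1000 x 1000 px tile, for illustration only.
d = density_per_mm2(count=50, eval_area_px=1_000_000)
cp = cell_percentage(count=50, dapi_pixels_above_threshold=200_000)
ap = area_percentage(macrophage_area_px=15_000, eval_area_px=1_000_000)
```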
We consider a feature detected within the CD14 +/488 nm channel as double-stained if at least 20 % of the area of its convex hull is covered by convex hulls of some features detected within the corresponding CD163 +/555 nm channel image. Note that the presence of a double staining does not influence the detection of a feature by methods (S1) − (S4) since the channels are analyzed independently from each other. However, the more completely and uniformly a given macrophage is stained, the more probable is the recognition of a possible double staining.
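The double-staining criterion is a simple overlap test on the convex hull masks. In the Python sketch below, hulls are represented as sets of (y, x) pixel coordinates, which is a representation chosen for illustration; the union of overlaps is taken so that pixels covered by several CD163 hulls are not counted twice.

```python
def is_double_stained(hull_cd14, hulls_cd163, min_fraction=0.20):
    """True if at least `min_fraction` of the CD14 hull area is covered by
    convex hulls of features from the CD163 channel."""
    covered = set()
    for hull in hulls_cd163:
        covered |= hull_cd14 & hull   # union avoids double counting
    return len(covered) / len(hull_cd14) >= min_fraction


def rectangle(top, left, height, width):
    """Helper: pixel set of an axis-aligned rectangle (stand-in for a hull)."""
    return {(y, x) for y in range(top, top + height)
            for x in range(left, left + width)}


cd14_hull = rectangle(0, 0, 10, 10)              # 100 px
wide_overlap = [rectangle(0, 8, 10, 10)]         # covers 2 columns = 20 px
narrow_overlap = [rectangle(0, 9, 10, 10)]       # covers 1 column = 10 px
```

With these toy hulls, `is_double_stained(cd14_hull, wide_overlap)` holds, since exactly 20 % of the area is covered, whereas the narrow overlap of 10 % does not qualify.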
d) Analysis of tumor subregions. The tumor subregions have been analyzed with method (S1) only. Here, we will compare the full output with its 50 % and 25 % downsampling, considering only one half or one quarter of the tiles of the given tumor subregion dataset for evaluation. Further, we provide a comparison with the outputs of (S1) and (GE) for the corresponding whole tissue sample. The analysis is repeated with the normalized outputs of (S1), calculated as densities. All comparisons will be given in terms of Pearson correlation. Moreover, the percentage of double-stained features according to the above given definition will be recorded. Finally, we characterize the distribution of the feature sizes, which will be derived from the analysis of the CD14 +/488 nm channel. Frequencies are obtained by counting up all features of a given size and subpopulation over the outputs for all 44 datasets.
Comparison with gene expression data for the whole samples
Digital-multiplexed gene expression (DMGE) profiling was performed with the nCounter platform (NanoString, Seattle, WA, USA), targeting the genes of interest by digitally color-coded oligonucleotides. For a detailed description of the procedure, see [39, 40]. The data were further processed and normalized by the following three steps. First, we performed quality controls using the R package NanoStringQCPro. Here, four samples were flagged and removed from subsequent analysis. Second, we added a pseudo count and normalized the data by dividing sample-wise by the geometric mean of the housekeeper genes (B2M, MTMR14, PGK1, ABCF1, EIF2B4, LDHA, CTCF, TBP, WDR55, POLR2B), and third, we multiplied the data by a factor of 1000 to bring them onto a natural scale. We refer to the normalized gene expression values as method (GE). Below, the normalized counts will be compared with the outputs of image morphometry in terms of Pearson correlation coefficients.
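The second and third normalization steps can be sketched for a single sample as follows. This is an illustration in Python, not the R workflow used in the study: the pseudo count of 1 is an assumption (its value is not stated), it is assumed that the geometric mean is taken after adding the pseudo count, and the raw counts in the example are invented.

```python
import math


def normalize_sample(counts, housekeepers, pseudo_count=1.0, scale=1000.0):
    """Add a pseudo count, divide by the geometric mean of the housekeeper
    genes of the same sample, and multiply by `scale`."""
    shifted = {gene: c + pseudo_count for gene, c in counts.items()}
    log_mean = sum(math.log(shifted[g]) for g in housekeepers) / len(housekeepers)
    geo_mean = math.exp(log_mean)
    return {gene: scale * value / geo_mean for gene, value in shifted.items()}


# Invented raw counts; only two of the ten housekeeper genes are used here.
raw = {"CD14": 199, "CD163": 49, "B2M": 99, "PGK1": 399}
normalized = normalize_sample(raw, housekeepers=["B2M", "PGK1"])
```

In this toy sample the shifted housekeeper counts are 100 and 400, so their geometric mean is 200 and the shifted CD14 count of 200 maps to a normalized value of 1000.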
Summary of methods’ application
In Tables 1 and 2, we provide a summary of the properties of the described macrophage counting approaches and the experiments performed with them. Note that, for the whole tissue samples, comparison of results of (S1) − (S4) is possible for 40 datasets, and of (S1) − (S4) and (GE) for 35 datasets while (S5) and (S6) have not been applied.