Deep Learning Enables Individual Xenograft Cell Classification in Histological Images by Analysis of Contextual Features

Patient-Derived Xenografts (PDXs) are the preclinical models which best recapitulate inter- and intra-patient complexity of human breast malignancies, and are also emerging as useful tools to study the normal breast epithelium. However, the analysis of data generated with such models is often confounded by the presence of host cells, which can give rise to data misinterpretation. For instance, it is important to discriminate between xenografted and host cells in histological sections prior to performing immunostainings. We developed Single Cell Classifier (SCC), a data-driven deep learning-based computational tool that provides an innovative approach for automated cell species discrimination based on a multi-step process entailing nuclei segmentation and single cell classification. We show that human and murine cell contextual features, more than cell-intrinsic ones, can be exploited to discriminate between cell species in both normal and malignant tissues, yielding up to 96% classification accuracy. SCC will facilitate the interpretation of H&E- and DAPI-stained histological sections of xenografted human-in-mouse tissues and it is open to new in-house built models for further applications. SCC is released as an open-source plugin in ImageJ/Fiji available at the following link: https://github.com/Biomedical-Imaging-Group/SingleCellClassifier.

Supplementary Information: The online version contains supplementary material available at 10.1007/s10911-021-09485-4.


Introduction
Most of our understanding of mammary gland development and breast carcinogenesis stems from experiments with animal models. Mice are by far the most widely used experimental system due to their size, ease of use, and most importantly, the ability to establish genetically-engineered mouse models (transgenic mice are one subtype of genetically-engineered mouse models). However, approximately 90% of potential oncology drugs fail in clinical trials [1,2], partly because of the lack of adequate preclinical models, raising concerns about how representative of human physiology and disease the data derived from mice are.
Patient-Derived Xenografts (PDXs), namely preclinical models developed by transplanting human-derived cells into immunosuppressed or humanized mice, currently best recapitulate the complexity of human tissues and are increasingly employed for translational research [3][4][5]. Classically, mammary xenografts are generated by transplantation of pieces of primary breast tissues to the mammary fat pad of recipient immunosuppressed mice [6]. However, under these settings, the growth of the PDXs and their hormone receptor (HR) expression depend on estradiol (E2) supplementation which, by producing serum E2 levels equivalent to those of the mid-menstrual cycle [6,7], alters the physiological relevance of this preclinical model. Recent advances in this field were achieved with the Mouse INtraDuctal (MIND) model, which entails the injection of primary human-derived breast cells directly into the mouse mammary ductal tree via the cleaved teat [8,9]. In the intraductal microenvironment, primary human breast epithelial cells (HBECs) and breast cancer cells grow independently of exogenous hormone supplementation while retaining their HR expression and hormone responsiveness, making the MIND model an appealing preclinical tool [8][10][11][12][13][14]. Hence, the MIND model provides an unprecedented opportunity to study the role that individual HRs play in the luminal compartment of the human breast epithelium by, for instance, histological techniques [9]. However, molecular analyses in xenograft models are hindered by the presence of both human and murine cells, which can lead to data misinterpretation due to the mixing of signals from the two species.

Quentin Juppet and Fabio De Martino contributed equally to this work.
Therefore, prior to performing specific immunostaining to assess levels of proteins of interest in the xenografted cells, paraformaldehyde-fixed paraffin-embedded sections are usually stained with Haematoxylin and Eosin (H&E) in order to obtain a rough estimate of the abundance of human cells within the tissue of interest based on morphological features. Although human cells usually appear bigger and more elongated than their murine counterparts, manual evaluation is error-prone, time-consuming and subject to inter-personal variability. This calls for better tools to reveal species-specific features.
Machine learning techniques have been effectively applied in a number of different fields and have emerged as valuable resources to decipher the content of biological images [15,16]. While some methods have already been developed to analyze H&E-stained human histological sections, their aim was mainly set on tissue segmentation [17][18][19], and more sophisticated supervised learning-based tools have been created to perform nuclei segmentation [19]. Here, we hypothesized that deep learning could be employed to automate human-mouse cell discrimination in intraductal xenografts, a challenging task due to high biological and technical heterogeneity. We developed Single Cell Classifier (SCC), a data-driven machine learning-based approach capable of discriminating individual xenografted cells, either normal or malignant, from murine cells in the same histological section according to specific features rather than images. Upon evaluation of a total of 484 cell-intrinsic or contextual features, SCC reached up to 96% classification accuracy, with contextual information playing a major role in the classification performance. SCC is supplied as a publicly available plugin in ImageJ/Fiji [20] and can be downloaded at the following link: https://github.com/Biomedical-Imaging-Group/SingleCellClassifier.

Image Acquisition
Slides were scanned with an Olympus VS120-L100 slide scanner using a 20x/0.75 objective connected to a Pike F505 C Color camera. Because of the size of the resulting data (Figs. 1a and 2a), images were loaded into QuPath [21] using the BioFormats extension. The slide images are publicly available on Zenodo.

Data Extraction
Due to the pyramidal nature of the whole slide scanner images, a version of each image was extracted to define areas likely to contain ducts. Because ducts can be defined as relatively large, densely packed cell regions having a dense Eosin signal, we can use this information to extract them using Fiji's [20] Color Deconvolution with the built-in H&E DAB vectors. First, a 4-fold downsampled version of each whole slide image was sent from QuPath [21] to ImageJ [20]. The extracted signal was then filtered with a Gaussian kernel of σ = 2 px before thresholding with ImageJ's Default method. Finally, connected components analysis (AnalyzeParticles) was used to obtain ROIs. The bounding boxes of these ROIs were extracted and enlarged slightly to ensure that no duct touched the edge of the image. The resulting bounding boxes were reimported into QuPath [21] as annotations and used to export the full-resolution ducts as .tif images.
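The smoothing, thresholding and bounding-box steps described above can be illustrated with a minimal Python sketch. It is not the ImageJ/QuPath pipeline used in this work: the function name `duct_bounding_boxes`, the `margin` value and the caller-supplied `threshold` (standing in for ImageJ's Default method) are assumptions made for illustration only.

```python
import numpy as np
from scipy import ndimage as ndi


def duct_bounding_boxes(signal, threshold, sigma=2.0, margin=10):
    """Return enlarged bounding boxes (row0, row1, col0, col1) of dense regions.

    `signal` is a 2D array holding the deconvolved stain intensity.
    `threshold` and `margin` are illustrative assumptions, not values
    taken from the original pipeline.
    """
    # Gaussian smoothing (sigma given in pixels).
    smoothed = ndi.gaussian_filter(signal.astype(float), sigma=sigma)
    # Fixed threshold used here as a stand-in for ImageJ's Default method.
    mask = smoothed > threshold
    # Connected components analysis.
    labels, _ = ndi.label(mask)
    boxes = []
    for rows, cols in ndi.find_objects(labels):
        # Enlarge each box by `margin` pixels, clipped to the image extent.
        r0 = max(rows.start - margin, 0)
        r1 = min(rows.stop + margin, signal.shape[0])
        c0 = max(cols.start - margin, 0)
        c1 = min(cols.stop + margin, signal.shape[1])
        boxes.append((r0, r1, c0, c1))
    return boxes
```

Each returned box can then be used to crop the corresponding region from the full-resolution image, after scaling the coordinates by the downsampling factor.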

Nuclei Detection
As defining precise boundaries between neighboring cells can be challenging, we segmented the nuclei as a first step using StarDist [22,23] (Fig. 2b), a state-of-the-art method outperforming classical approaches to detect star-convex objects for 2D images using a neural network. Such networks can be trained from various image types and the detection is represented as a labeled image where each nucleus is associated with a label.
A StarDist model was trained with a set of 24 images of size 320x320 pixels extracted from H&E-stained sections of normal human breast xenografted mammary glands (Supp. Fig. 7). The training data is publicly available on the GitHub of SCC. The nuclei of these images were manually annotated for a total of approximately 2500 nuclei (Fig. 1b-g). The training was performed in Python using TensorFlow on Google Colab with GPU for 100 epochs, which took 40 minutes. The StarDist model was configured to describe each object with 48 rays, and a dropout of 0.5 was added to the network to avoid overfitting.

Cell Delineation
The extent of each cell (Fig. 2c,f) was estimated according to two criteria: (i) cells should not overlap with each other and (ii) their thickness, defined as the distance to the nucleus, should not exceed Δ = 2 µm. This maximal value is an estimate of the expected thickness for the type of cells we aim to classify, and can be edited by the user. Each criterion can be represented as a mask: the mask of the first criterion was computed using a Voronoi diagram on the nuclei labels, and that of the second by thresholding, at Δ, a distance map of the nuclei labels. The two masks are then combined to delineate each cell, so that each detected cell is associated with its nucleus.
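The two delineation criteria can be sketched in a few lines of Python. This is an illustrative approximation rather than the plugin's implementation (which relies on MorphoLibJ): the function name `delineate_cells` is hypothetical, and the thickness is given in pixels because the pixel size needed to convert Δ from micrometres is not stated here.

```python
import numpy as np
from scipy import ndimage as ndi


def delineate_cells(nuclei_labels, max_thickness_px):
    """Grow each labeled nucleus into a cell label.

    Criterion (i), no overlap: every pixel is assigned to its nearest
    nucleus, which yields a discrete Voronoi partition of the image.
    Criterion (ii), thickness: pixels farther than `max_thickness_px`
    from any nucleus are set to background (0).
    """
    # Distance of every pixel to the nearest nucleus pixel, together with
    # the coordinates of that nearest nucleus pixel.
    dist, nearest = ndi.distance_transform_edt(
        nuclei_labels == 0, return_indices=True
    )
    voronoi = nuclei_labels[tuple(nearest)]
    return np.where(dist <= max_thickness_px, voronoi, 0)
```

Because each cell inherits the label of its nucleus, nucleus-level and cell-level measurements can be paired directly by label.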

Measurement
For classification, cells needed to be properly described (Fig. 2d). To do so, 484 features were extracted from the detected nuclei and cells. These features can be related to the object itself (i.e. cell-intrinsic features), or related to their neighbors and their organization (i.e. contextual features). Out of these, 47 concern cell-intrinsic features, 376 contextual features are related to the cell-intrinsic features of the neighboring cells and the remaining 61 contextual features describe the organization of the neighbors.

Shape and Size Features
Cell-intrinsic features related to the shape and the size of both nuclei and cells were measured by fitting an ellipse on the targeted object to extract its elongation, minor axis, and major axis as features. Additionally, the area, and in particular the area ratio between the nuclei and the cell, was exploited for the analysis.

[Fig. 2 caption, continued] [...] each other and the cell thickness (distance to the nuclei) should not exceed 2 µm. g Source example for the contextual features, an H&E-stained "humanized" mouse mammary duct. h The Delaunay graph in red over the detected nuclei associated with the source, randomly colored for distinction purposes. i The cell chain graph over the detected nuclei associated with the source, randomly colored for distinction purposes; red segments correspond to cell chains. Scale bars, 100 µm.
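As an illustration of the ellipse-based shape and size features, the sketch below fits an ellipse to a binary object mask through its second-order central moments. This is one common way to fit an ellipse and is only an assumption about the procedure; the function name `ellipse_features` and the definition of elongation as the ratio of major to minor axis are likewise illustrative.

```python
import numpy as np


def ellipse_features(mask):
    """Shape and size features of a binary object from its second moments.

    The axes are those of the ellipse having the same second-order central
    moments as the object. Elongation is defined here as major / minor,
    which is an assumption made for this sketch.
    """
    rows, cols = np.nonzero(mask)
    coords = np.stack([rows, cols]).astype(float)
    # 2x2 matrix of second-order central moments (population covariance).
    cov = np.cov(coords, bias=True)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    # For a filled ellipse the variance along an axis is (semi-axis)^2 / 4,
    # so the full axis length is 4 * sqrt(variance).
    major, minor = 4.0 * np.sqrt(eigvals)
    return {
        "major_axis": float(major),
        "minor_axis": float(minor),
        "elongation": float(major / minor),
        "area": int(np.count_nonzero(mask)),
    }
```

Applying the function to both the nucleus mask and the cell mask also gives the two areas needed for the area ratio. Objects that are one pixel wide have a zero minor axis and would need special handling.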

Textural Features
Cell-intrinsic textural features were extracted from the pixels of the source images at specific areas for each cell. The cells were divided into two areas: (i) their nuclei and (ii) their estimated cytoplasms, defined as the cell area without the nucleus. Such features can be divided between features related to the color and features related to the texture itself, i.e. the local pattern and spatial organisation of pixel intensities. For the color-related features, the mean and the variation of the values of each color channel were computed. For the features related to the texture, the Haralick texture features [24,25] were calculated on a gray level version of the source image. The Haralick texture features provide 13 statistical parameters related to the pixels such as their entropy or their contrast. The gray level image was computed from a principal component analysis (PCA) on the channels of the image, with the first component corresponding to the factors to apply on the channels, thereby maximizing their variance, so that the maximum amount of information can be measured for the texture.
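The PCA-based gray-level conversion can be sketched as follows. This is a minimal illustration assuming that the weights of the first principal component are applied directly to the raw channel values; the function name `pca_gray` and the sign convention are choices made for this example.

```python
import numpy as np


def pca_gray(image):
    """Convert a multi-channel image to gray levels along its first
    principal component.

    Returns the gray-level image and the per-channel weights. Applying the
    weights to the raw (uncentered) channels is an assumption of this sketch.
    """
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Channel covariance matrix.
    cov = centered.T @ centered / len(pixels)
    # eigh returns eigenvalues in ascending order, so the last eigenvector
    # is the direction of maximal variance.
    _, eigvecs = np.linalg.eigh(cov)
    weights = eigvecs[:, -1]
    # The sign of an eigenvector is arbitrary; make the weights sum positive.
    if weights.sum() < 0:
        weights = -weights
    gray = pixels @ weights
    return gray.reshape(image.shape[:-1]), weights
```

The resulting weights play the role of the user-editable gray conversion factors, and the gray-level image is the input on which texture statistics such as the Haralick features would be computed.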

Contextual Features
Our contribution was to propose an additional set of features based on spatial arrangement of each cell and its neighboring cells, hereafter referred to as contextual features.
To efficiently determine the closest neighbors of a cell, we computed a neighborhood graph using the Delaunay Triangulation algorithm [26] (Fig. 2h). The cells are represented as nodes on this graph, with each node corresponding to the centroid of a cell. We identified clusters of close cells using the shortest-path algorithm (Dijkstra).
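A neighborhood graph of this kind can be built with SciPy, as in the minimal sketch below. The function names `delaunay_graph` and `graph_distances` are hypothetical, and weighting the edges by Euclidean length is an assumption; the plugin's own implementation may differ.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra
from scipy.spatial import Delaunay


def delaunay_graph(centroids):
    """Sparse adjacency matrix of the Delaunay graph of the cell centroids.

    Edge weights are Euclidean distances between connected centroids.
    """
    centroids = np.asarray(centroids, dtype=float)
    tri = Delaunay(centroids)
    indptr, indices = tri.vertex_neighbor_vertices
    # Source node of every (source, neighbor) pair.
    src = np.repeat(np.arange(len(centroids)), np.diff(indptr))
    lengths = np.linalg.norm(centroids[src] - centroids[indices], axis=1)
    n = len(centroids)
    return coo_matrix((lengths, (src, indices)), shape=(n, n)).tocsr()


def graph_distances(graph, source):
    """Shortest-path (Dijkstra) distance from `source` to every node."""
    return dijkstra(graph, directed=False, indices=source)
```

Ranking the nodes by their graph distance from a given cell is one way to collect the cells closest to it along the graph rather than in a straight line.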
For the sake of versatility, several kinds of neighboring structures were inspected: (i) The neighbors directly connected to the cell on the Delaunay graph give information about the position of a given cell in the cluster through the mean and the variance of their distances. A cell with a high variance in the distances to its direct neighbors has a high probability of being located at the border of a cluster. (ii) The lateral neighbors are the two cells located on the left and the right of a given cell given the orientation of the nucleus ellipse. Let N be the line normal to the major axis of a cell c1 and passing through the center of c1. A cell c2 is then considered to be a lateral neighbor of c1 if the distance between the center of c2 and the line N is lower than half of the major axis of c1. Features like the alignment, the distance, and the difference in orientation were measured. When two cells are mutually lateral neighbors, they can be iteratively connected to create a chain of cells (Fig. 2i). Features related to these chains, which are relatively abundant in human cell clusters, are their tortuosity and size. (iii) To cover different types of cell aggregates, 4 sizes of vicinity were analyzed, represented by K equal to 5, 10, 20, and 40 cells. These K neighbors are the set of the K closest neighbors to the current cell.
The simultaneous use of a range of K values allows the extraction of both local (K small) and global (K large) features. It is possible to extract distance features (mean and variance) between the K neighbors and the current cell and, in particular, the distance of the current cell to the centroid of the set of K neighbors. The distance between the current cell and the centroid of the set provides information about the homogeneity of the set since, in a homogeneous cluster, the neighbors are expected to be spread evenly around the current cell, which therefore lies very close to the centroid. (iv) The K connected neighbors are the set of the K closest neighbors to the cell that are physically connected to each other. Like the K neighbors, the K connected neighbors provide both local and global information about the neighborhood of the cell under consideration, using the same K values. Moreover, a notable observation is that cells that are physically connected share common features. This property motivated the extraction of the cell-intrinsic features of the neighbors through means and variances. It is also relevant to observe the shape of the cluster formed by the neighbors, which can be done by computing an ellipse in a similar fashion to what was previously done for the nuclei and cell shape features.
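Two of the neighborhood definitions above, the lateral neighbor test of (ii) and the K-neighbor distance features of (iii), can be sketched as follows. The function names are hypothetical, the K-d tree is merely one convenient way of finding the K closest centroids, and the sketch assumes that there are more than K cells and that no two centroids coincide.

```python
import numpy as np
from scipy.spatial import cKDTree


def is_lateral_neighbor(center1, major_dir1, major_len1, center2):
    """Lateral neighbor test for cell c2 with respect to cell c1.

    N is the line through the center of c1, normal to its major axis. The
    distance from the center of c2 to N equals the absolute projection of
    (center2 - center1) onto the major-axis direction of c1.
    """
    u = np.asarray(major_dir1, dtype=float)
    u = u / np.linalg.norm(u)
    offset = np.dot(np.asarray(center2, float) - np.asarray(center1, float), u)
    return bool(abs(offset) < major_len1 / 2.0)


def k_neighbor_features(centroids, k):
    """Distance features between each cell and its K closest neighbors."""
    centroids = np.asarray(centroids, dtype=float)
    tree = cKDTree(centroids)
    # Query k + 1 points because the closest hit is the cell itself.
    dist, idx = tree.query(centroids, k=k + 1)
    dist, idx = dist[:, 1:], idx[:, 1:]
    set_centroid = centroids[idx].mean(axis=1)
    return {
        "mean_distance": dist.mean(axis=1),
        "distance_variance": dist.var(axis=1),
        "distance_to_centroid": np.linalg.norm(centroids - set_centroid, axis=1),
    }
```

Calling `k_neighbor_features` once per K value (5, 10, 20 and 40) yields the local-to-global family of features; a cell surrounded evenly by its neighbors has a distance to the centroid close to zero, while a cell at the border of a cluster does not.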

Classification
To classify the cells based on their features (Fig. 2e), a neural network model was trained by supervised learning in Python using TensorFlow. A set of 174 images of various sizes was extracted from H&E-stained xenografted mouse mammary glands (Supp. Fig. 8). Among these images, 96 mostly contain normal human cells, whereas 78 contain only mouse cells. To ensure that an image contains normal human cells, a human-specific E-cadherin (E-CAD) antibody was used to uniquely probe for xenografted cells (Supp. Fig. 10). This set of images represented about 60'000 cells after detection, of which 26'000 were human. Accordingly, 26'000 mouse cells were retained to balance the number of cells between the two classes of interest, such that their influence on the training is equivalent.
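Class balancing of this kind can be sketched as a random subsampling of the larger class. How the mouse cells were actually selected is not specified here, so the uniform random choice, the function name `balance_classes` and the `seed` parameter below are assumptions for illustration.

```python
import numpy as np


def balance_classes(features, labels, seed=0):
    """Subsample every class down to the size of the smallest class.

    Uniform random selection without replacement is an assumption of
    this sketch.
    """
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n_keep, replace=False)
        for c in classes
    ])
    keep.sort()
    return features[keep], labels[keep]
```

With equal class sizes, neither species dominates the training loss, which is the stated purpose of the balancing.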
To classify tumor-derived PDXs, 17 images that contain mostly human tumor cells were added to the previous set. In order to circumvent interpretation problems due to variable E-cadherin protein expression levels in tumor cells [27], tumor cells were uniquely detected using a human-specific cytokeratin 7 (CK7) antibody (Supp. Fig. 11). The final set represented about 80'000 cells with about 40'000 mouse cells, 26'000 normal human cells, and 14'000 human tumor cells.
To classify PDXs in fluorescence single-channel DAPI-stained images, 12'000 cells in each class were extracted from DAPI- and E-CAD-stained xenografted mouse mammary glands. The E-CAD channel was used as control and the DAPI channel as source for the classification.
To define the species of a cell, class masks were manually annotated for each of the images based on their fluorescent controls; namely, they were compared to adjacent sections that were probed for xenografted cells by means of the human-specific antibodies described above (Supp. Fig. 10, 11).
A neural network adapted to the problem was designed, which takes the 484 features as input and returns as output the probability that a cell belongs to the human or to the mouse class.

Training for Further Applications
Our ImageJ/Fiji [20] plugin named "Single Cell Classifier" has been implemented to allow users to perform cell classification with our method using built-in or custom models. The parameters of the method, such as the K values for the neighbors, the factors used for the gray-level conversion, or the cell thickness Δ, are also editable by the user.
Like most deep learning models, the provided pre-trained models can only be used with images that are similar to the ones used in their training (i.e. same modality of microscope, tissue, cell, staining, image contrast, ...) [28,29]. To use our method with other images, new models need to be trained; our GitHub provides a set of Python scripts and explanations to help the user perform this task. The nuclei detection and the classification models are trained independently, which allows other classes to be detected in the same kind of images by re-training only the classification model.
This plugin depends on three other plugins: StarDist [22,23] for the nuclei detection, MorphoLibJ [30] for its morphological and analysis tools used in cell delineation and feature extraction, and finally CSBDeep [31], which executes our classification neural network.

Nuclei Detection
The nuclei detection performed by our StarDist model reaches a detection accuracy of 74.74% (Table 1) and remains able to distinguish nuclei even under challenging conditions (Fig. 3). A detailed description of all computed object detection metrics is available in Supp. Note 1.

Features Analysis
Next, to characterize the impact of each feature on the neural network and to assess the morphology of the engrafted human cells in the intraductal environment, we compared the distribution of each feature between human and mouse cells and its correlation with the classes. Our correlation index corresponds to the absolute value of the Pearson correlation, as the correlation direction is not relevant to our problem. The analysis of the feature distributions and their correlation with the classes revealed that the shape and size of individual cells poorly discriminate the two classes, with a correlation index lower than 0.2, suggesting that they do not help the discrimination task (Fig. 4a, 4b). However, the contextual features for shape and size have a better correlation, ranging between 0.3 and 0.4, highlighting the importance of the context for this purpose (Fig. 4a, 4c). Interestingly, textural features seem to offer better help in the cell species discrimination than shape and size, as highlighted by the small degree of overlap between the two cell species and their globally better correlation with the classes, concentrated between 0.15 and 0.35 but reaching a maximum of 0.5. The context seems to decrease the correlation for most of the textural features, which is concentrated between 0.05 and 0.3 (Fig. 4a, 4d). However, some exceptions among the contextual texture features still reach a high correlation of 0.5, making them appealing for such analysis.
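The correlation index defined above, the absolute value of the Pearson correlation between one feature and the class label, can be computed as in this short sketch. The function name `correlation_index` and the encoding of the two classes as 0 and 1 are assumptions for illustration.

```python
import numpy as np


def correlation_index(feature_values, class_labels):
    """Absolute Pearson correlation between a feature and binary labels."""
    r = np.corrcoef(feature_values, class_labels)[0, 1]
    return float(abs(r))
```

Because only the absolute value is kept, swapping the numerical encoding of the two classes leaves the index unchanged. A feature that is constant across all cells has an undefined correlation and would need to be handled separately.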

Classification of Normal Breast PDXs in H&E-stained Histological Sections
SCC performs normal human breast cell discrimination with an accuracy of 96.51% (Fig. 5), as assessed by quantifying the number of accurate calls upon manual annotation of humanized mouse mammary ducts based on fluorescence labeling by a human-specific E-cadherin antibody. Both sensitivity and precision are higher than 96% for both classes taken into consideration (Table 2). [...] estimate the predictive power and revealed that SCC efficiently predicts both human and mouse cells, with the probabilities associated with each of the analyzed cells close to the extrema 0 and 1, suggesting high confidence of our model. Interestingly, the feature importance previously discussed impinges on the classification task (Table 3). In line with our predictions, we observed that the contextual features without any texture information allow for a very good classification with an accuracy of 89.02%, compared to the shape and size features alone, which achieve an accuracy of 66.84%, highlighting the importance of contextual features for such a classification task. In order to dispel the possibility that our analysis may be confounded by host-derived tissue-resident immune cells [32][33][34][35][36], we performed a co-immunostaining using anti-CD45, an antigen expressed on all leucocytes, and anti-human CK7 antibodies, which revealed the lack of host-derived immune cells in intraductal xenografts (Supp. Fig. 12).

[Figure caption fragment] [...] shape and size features. Area ratio as assessed by the ratio between cytoplasm and nuclei areas. px = pixels. c Violin plots of some textural features. d Violin plots of some contextual features. The value 0 (or 1 for the left plot) is assigned to cells that do not have enough neighbours.
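For reference, the accuracy, sensitivity and precision reported for the classification can be computed per class from the true and predicted labels using their standard definitions. The sketch below is generic and does not reproduce the evaluation code of this work; the function name `classification_metrics` is hypothetical.

```python
import numpy as np


def classification_metrics(true_labels, predicted_labels, positive):
    """Accuracy, plus sensitivity and precision for the `positive` class."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    tp = np.sum((predicted_labels == positive) & (true_labels == positive))
    fp = np.sum((predicted_labels == positive) & (true_labels != positive))
    fn = np.sum((predicted_labels != positive) & (true_labels == positive))
    return {
        # Fraction of cells whose predicted species matches the annotation.
        "accuracy": float(np.mean(predicted_labels == true_labels)),
        # Fraction of truly positive cells that were found.
        "sensitivity": float(tp / (tp + fn)),
        # Fraction of positive calls that were correct.
        "precision": float(tp / (tp + fp)),
    }
```

Calling the function once with `positive="human"` and once with `positive="mouse"` gives the per-class values; the sketch assumes that each class is both present and predicted at least once.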

Classification of Breast Cancer-derived PDXs in H&E-stained Histological Sections
As patient-derived xenografts are mainly used in the context of cancer research, we went on to assess whether SCC was also able to classify cell species in tumor-derived PDXs. [...] (Table 4). Overall, this new model reaches an accuracy of 96.21% when both invasive lobular carcinoma (ILC) and no special type (NST) breast tumors are taken into consideration (Fig. 6). The analysis of how the features impinged on the classification showed that the textural features have a higher impact than for normal human cells, reaching an accuracy of 95.50% (Table 3).

Classification of Normal Breast PDXs in fluorescence DAPI-stained Images
Having established the accuracy of SCC in H&E-stained histological sections, we went on to test its performance on fluorescence single-channel DAPI-stained images (Supp. Fig. 9), representing a more challenging task because of the lack of any contextual features available. For nuclei detection, a built-in model named "Versatile" was used, and a new model for classification of human and mouse cells was trained. Notably, SCC reached an accuracy of 94.78% in these settings. The analysis of the impact of the features revealed tendencies similar to the ones previously observed for the H&E counterparts (Table 3). While on the one hand shape and size features reached an accuracy of 70.91%, suggesting that these features are sufficient to perform the classification task under these settings, on the other hand textural features appeared less relevant, in line with cytoplasms providing less information in single-channel DAPI-stained images than in H&E.

Comparison with an Image-based Single-stage Method
Finally, we investigated whether both the nucleus segmentation task and the cell classification task could be performed jointly by a single-stage model. To that end, we extended the model architecture of StarDist [22] and added a dedicated classification head that predicts the probability of a nucleus belonging either to a human or a mouse cell. We then annotated nucleus outlines and cell types (human/mouse) for 16 images of size 330x320 containing in total 1'450 human and 300 mouse nuclei. After training, the extended model achieved a classification accuracy of 85.90% across all matched nucleus instances, well below the accuracy achieved by SCC. This suggests that decoupling the segmentation and classification tasks is beneficial in our case, possibly because only a small amount of annotated training data is available, and that SCC outperforms standard image-based single-stage methods.

Discussion
PDXs are innovative preclinical models and important for translational research. However, their usage is hampered by difficulty in data interpretation arising from the presence of host cells, and no methods are currently available for species discrimination of individual cells in PDX models. To fill this gap, we developed SCC, a publicly available deep learning-based tool distributed as an ImageJ/Fiji plugin aiming at classifying human and mouse cells in PDX-derived histological sections. For the first time, SCC elaborates a comprehensive set of information to efficiently classify the [...]

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.