1 Introduction

1.1 What Are We Presenting?

This chapter aims to assist the reader in discovering and understanding state-of-the-art machine learning techniques used to analyze whole slide images (WSI), an essential data type used in computational pathology (CP). We are restricting our review to brain disorders, classified within four generally accepted groups:

  • Brain injuries: caused by blunt trauma and can damage brain tissue, neurons, and nerves.

  • Brain tumors: can originate directly from the brain (and be cancerous or benign) or be due to metastasis (cancer elsewhere in the body and spreading to the brain).

  • Neurodegenerative diseases: the brain and nerves deteriorate over time. We include, here, Alzheimer’s disease, Huntington’s disease, ALS (amyotrophic lateral sclerosis) or Lou Gehrig’s disease, and Parkinson’s disease.

  • Mental disorders: (or mental illness) affect behavior patterns. Depression, anxiety, bipolar disorder, PTSD (post-traumatic stress disorder), and schizophrenia are common diagnoses.

In the last decade, there has been exponential growth in the application of image processing and artificial intelligence (AI) algorithms within digital pathology workflows. The first FDA (US Food and Drug Administration) clearance of digital pathology for diagnosis protocols came as early as 2017, as the emergence of innovative deep learning (DL) technologies made this possible with the required degree of robustness and repeatability.

Ahmed Serag et al. [1] discuss the translation of AI into clinical practice to provide pathologists with new tools to improve diagnostic consistency and reduce errors. The authors report that, over the preceding five years, academic publications increased (over 1000 articles indexed in PubMed) and over $100M was invested in start-ups building practical AI applications for diagnostics. The three main areas of development are (i) network architectures to extract relevant features from WSI for classification or segmentation purposes, (ii) generative adversarial networks (GANs) to address some of the issues present in the preparation and acquisition of WSIs, and (iii) unsupervised learning to create labeling tools for precise annotations. Regarding data, many top-tier conference competitions have been organized and have released annotated datasets to the community; however, very few of them contain brain tissue samples. Those that do come from brain tumor regions obtained during a biopsy, making it harder to study other brain disorder categories, which frequently require postmortem data.

In [1], the authors also mention seven key challenges in diagnostic AI in pathology, listed as follows:

  • Access to large well-annotated datasets. Most articles on brain disorders use private datasets due to hospital privacy constraints.

  • Context switching between workflows, which calls for a seamless integration of AI into the pathology workflow.

  • Algorithms are slow to run, as image sizes are on the order of gigapixels and require considerable memory.

  • Algorithms require configuration, and fully automated approaches with high accuracy are difficult to develop.

  • Properly defined protocols are needed for training and evaluation.

  • Algorithms are not properly validated due to a lack of open datasets. However, research in data augmentation might help in this regard.

  • Introduction of intelligence augmentation to describe computational pathology improvements in diagnostic pathology. AI algorithms work best on well-defined domains rather than in the context of multiple clinicopathological manifestations among a broad range of diseases; however, they provide relevant quantitative insights needed for standardization and diagnosis.

These challenges limit the translation from research to clinical diagnostics. We intend to give readers some insight into the core problems behind the issues listed above by briefly introducing WSI preparation and image acquisition protocols. We also describe the state of the art of the proposed methods.

1.2 Why AI for Brain Disorders?

An important role of CP in brain disorders is related to the study and assessment of brain tumors, as they cause significant morbidity and mortality worldwide and pathology data is often available. In the United States, over 25k adults (14,170 men and 10,880 women) were expected to be diagnosed with primary cancerous tumors of the brain and spinal cord in 2022 [2]. Between 85% and 90% of all primary central nervous system (CNS) tumors (benign and cancerous) are located in the brain. Worldwide, over 300k people were diagnosed with a primary brain or spinal cord tumor in 2020. These tumors affect all ages: nearly 4.2k children under the age of 15 were also expected to be diagnosed with brain or CNS tumors in the United States in 2022.

It is estimated that around one billion people have a mental or substance use disorder [3]. Some other key figures related to mental disorders worldwide are given by [4]. Globally, an estimated 264 million people are affected by depression. Bipolar disorder affects about 45 million people worldwide. Schizophrenia affects 20 million people worldwide, and approximately 50 million have dementia. In Europe, an estimated 10.5 million people have dementia, and this number is expected to increase to 18.7 million in 2050 [5].

In the neurodegenerative disease group, 50 million people worldwide are living with Alzheimer’s and other types of dementia [6], Alzheimer’s disease being the underlying cause in 70% of people with dementia [5]. Parkinson’s disease affects approximately 6.2 million people worldwide [7] and represents the second most common neurodegenerative disorder. As the incidence of Alzheimer’s and Parkinson’s diseases rises significantly with age and people’s life expectancy has increased, the prevalence of such disorders is set to rise dramatically in the future. For instance, there may be nearly 13 million people with Parkinson’s by 2040 [7].

Brain injuries also account for a considerable number of cases. Every year, around 17 million people suffer a stroke worldwide, with an estimated one in four persons having a stroke during their lifetime [8]. Stroke is also the second leading cause of death worldwide and the leading cause of acquired disability [5].

These disorders also impact American regions, with over 500k deaths reported in 2019, due to neurological conditions. Among the conditions analyzed, the most common ones were Alzheimer’s disease, Parkinson’s, epilepsy, and multiple sclerosis [9].

In the case of brain tumors, treatment and prognosis require accurate and expedient histological diagnosis of the patient’s tissue samples. Trained pathologists visually inspect histology slides, following a time-consuming and labor-intensive procedure. Therefore, the emergence of CP has triggered great hope to ease this tedious task and make it more robust. Clinical workflows in oncology rely on predictive and prognostic molecular biomarkers. However, the growing number of these complex biomarkers increases the cost and the time for decision-making in routine daily practice. Available tumor tissue contains an abundance of clinically relevant information that is currently not fully exploited, often requiring additional diagnostic material. Histopathological images contain rich phenotypic information that can be used to monitor underlying mechanisms contributing to disease progression and patient survival outcomes.

In most other brain diseases, histological images are only acquired postmortem, and this procedure is far from being systematic. Indeed, it depends on the previous agreement of the patient to donate their brain for research purposes. Moreover, as mentioned above, the inspection of such images is complex and tedious, which further explains why it is performed in a minority of cases. Nevertheless, histopathological information is of the utmost importance in understanding the pathophysiology of most neurological disorders, and research progress would be impossible without such images. Finally, there are a few examples, beyond brain tumors, in which a surgical operation leads to an inspection of resected tissue while the patient is alive (this is, for instance, the case of pharmacoresistant epilepsy).

Intraoperative decision-making also relies significantly on histological diagnosis, which is often established when a small specimen is sent for immediate interpretation by a neuropathologist. In resource-poor settings, access to specialists may be limited, which has prompted several groups to develop machine learning (ML) algorithms for automated interpretation. Computerized analysis of digital pathology images offers the potential to improve clinical care (e.g., automated assistive diagnosis) and catalyze research (e.g., discovering disease subtypes or understanding the pathophysiology of a brain disorder).

1.3 How Do We Present the Information?

In order to understand the potential and limitations of computational pathology algorithms, one needs to understand the basics behind the preparation of tissue samples and the image acquisition protocols followed by scanner manufacturers. Therefore, we have structured the chapter as follows.

Subheading 2 presents an overview of tissue preservation techniques and how they may impact the final whole slide image. Subheading 3 introduces the notions of digital pathology and computational pathology, and their differences. It also develops the image acquisition protocol and describes the pyramidal structure of the WSI and its benefits. In addition, it discusses the possible impact of scanners on image processing algorithms. Subheading 4 describes some of the state-of-the-art algorithms in artificial intelligence and its subcategories (machine learning and deep learning). This section is divided into methods for classifying and segmenting structures in WSI, and techniques that leverage deep learning algorithms to extract meaningful features from the WSI and apply them to a specific clinical application. Finally, Subheading 5 explores new horizons in digital and computational pathology regarding explainability and new microscopic imaging modalities to improve tissue visualization and information retrieval.

2 Understanding Histological Images

We dedicate this section to understanding the process of acquiring histological images. We begin by introducing the two main tissue preservation techniques used in neuroscience studies, i.e., the routine-FFPE (formalin-fixed paraffin-embedded) preparation and the frozen tissue. We describe the process involved in each method and the main limitations for obtaining an appropriate histopathological image for analysis. Finally, we present the main procedures used in anatomopathology, based on such tissue preparations.

2.1 Formalin-Fixed Paraffin-Embedded Tissue

FFPE is a technique used for preserving biopsy specimens for clinical examination, diagnostics, experimental research, and drug development. A correct histological analysis of tissue morphology and biomarker localization in tissue samples hinges on the ability to achieve high-quality preparation of tissue samples, which usually requires three critical stages: fixation, processing (also known as pre-embedding), and embedding.

Fixation is the process that allows the preservation of the tissue architecture (i.e., its cellular components, extracellular material, and molecular elements). Histotechnologists perform this procedure right after removing the tissue, in case of surgical pathology, or soon after death, during autopsy. Time is essential in preventing the autolysis and necrosis of excised tissues and preserving their antigenicity. Five categories of fixatives are used in this stage: aldehydes, mercurials, alcohols, oxidizing agents, and picrates. The most common fixative used for imaging purposes is formaldehyde (also known as formalin), included in the aldehyde group. Fixation protocols are not standardized and vary according to the type of tissue and the histologic details needed to analyze it. The variability in this stage induces the possibility for several factors to affect this process, such as buffering (pH regulation), penetration (also depending on tissue thickness), volume (the usual ratio is 10:1), temperature, fixative concentration (10% solution is typical), and fixation time. These factors impact the quality of the scanned image, since stains used to highlight specific tissue properties may not react as expected.

After fixation, the tissue undergoes a processing stage necessary to create a paraffin embedding, which allows histotechnologists to cut the tissue into microscopic slides for further examination. The processing involves removing all water from the tissue using a series of alcohols and then clearing the tissue, which consists of removing the dehydrating agent with a substance that is miscible with paraffin. Nowadays, tissue processors can automate this stage, reducing inter-expert variability.

Dehydration and clearing will leave the tissue ready for the technician to create the embedded paraffin blocks. Depending on the tissue, these embeddings must be correctly aligned and oriented, determining which tissue section or cut is studied. Also, the embedding parameters (e.g., embedding temperature or particular chemicals involved) may differ from the norm for unique studies, so the research entity and the laboratory making the acquisition need to define them beforehand. Figure 1 shows a paraffin embedding cassette where the FFPE tissue samples can be stored even at room temperature for long periods.

Fig. 1
Two photographs. They present containers with tissues on them.

Paraffin cassettes

These embeddings undergo two more stages before being scanned: sectioning and staining. These procedures are discussed in the last section as they are no longer related to tissue preservation; instead, they are part of the tissue preparation stages before imaging.

2.2 Frozen Histological Tissue

Pathologists often use this tissue preservation method during surgical procedures where a rapid diagnosis of a pathological process is needed (extemporaneous preparation). In fact, frozen tissue produces the fastest stainable sections, although, compared to FFPE tissue, its morphological properties are not as good.

Frozen tissue (technically referred to as cryosection) is created by submerging the fresh tissue sample into cold liquid (e.g., pre-cooled isopentane in liquid nitrogen) or by applying a technique called flash freezing, which uses liquid nitrogen directly. As in FFPE, the tissue needs to be embedded in a medium to fix it to a chuck (i.e., specimen holder) in an optimal position for microscopic analysis. However, unlike FFPE tissue, no fixation or pre-embedding processes are needed for preservation.

For embedding, technicians use OCT (optimal cutting temperature compound), a viscous aqueous solution of polyvinyl alcohol and polyethylene glycol designed to freeze, providing the ideal support for cutting the cryosections in the cryostat (microtome under cold temperature). Different embedding approaches exist depending on the tissue orientation, precision and speed of the process, tissue wastage, and the presence of freeze artifacts in the resulting image. Stephen R. Peters describes these procedures and other important considerations needed to prepare tissue samples using the frozen technique [10].

Frozen tissue preservation relies on storing the embeddings at low temperatures. Therefore, the tissue will degrade if the cold chain breaks due to tissue sample mishandling. However, as it better preserves the tissue’s molecular genetic material, it is frequently used in sequencing analysis and immunohistochemistry (IHC).

Other factors that affect the tissue quality and, therefore, the scanned images are the formation of ice crystals and the thickness of the sections. Ice crystals form when the tissue is not frozen rapidly enough, and they may negatively affect the tissue structure and, therefore, its morphological characteristics. In addition, frozen sections are often thicker than FFPE sections, increasing the potential for lower resolution at higher magnifications and poorer images.

2.3 Tissue Preparation

We described the main pipeline to extract and preserve tissue samples for further analysis. Although the techniques described above can also be used for molecular and protein analysis (especially the frozen sections), we now focus only on the image pipeline by describing the slide preparation for scanning and the potential artifacts observed in the acquired images.

Once the tissue embeddings are obtained, either by FFPE or frozen technique, they are prepared for viewing under a microscope or scanner. The tissue blocks are cut, mounted on glass slides, and stained with pigments (e.g., hematoxylin and eosin [H&E], saffron, or molecular biomarkers) to enhance the contrast and highlight specific cellular structures under the microscope.

Cutting the embeddings involves using a microtome to cut very thin tissue sections, later placed on the slide. The thickness of these sections is usually in the range of 4–20 microns. It will depend on the microscopy technique used for image acquisition and the experiment parameters. Special diamond knives are needed to get thinner sections, increasing the price of the microtome employed. If we use frozen embeddings, a cryostat keeps the environment’s temperature low, avoiding tissue degradation.

Once on the slide, the tissue is heated to adhere to the glass and avoid wrinkles. If warming the tissue damages some of its properties (especially for immunohistochemistry), glue-coated slides can be used instead. For cryosections, pathologists often prefer to add a fixation stage to resemble the readings of an FFPE tissue section. This immediate fixation is achieved using several chemicals, including ethanol, methanol, formalin, acetone, or a combination. S. Peters describes the differences in the image quality based on these fixatives, as well as the proposed protocol for cutting and staining frozen sections [10]. For FFPE sections, Zhang and Xiong [11] describe cutting, mounting, and staining methods for neural histology. Protocols suggested by the authors are valuable guidelines for histotechnologists, as tissue usually folds or tears, and bubbles form, when cutting the embeddings. Minimizing these issues is essential to have good-quality images and accurate quantification of histological results.

Staining is the last process applied to the tissue before being imaged. Staining agents do not react with the embedding chemicals used to preserve the tissue sample; therefore, the tissue section needs to be cleaned and dried beforehand (e.g., eliminating all remains of paraffin wax used in the embedding). In [12], the authors present a review of the development of stains, techniques, and applications throughout time. One of the most common stains used in histopathology is hematoxylin and eosin (H&E). This agent highlights cell nuclei with a purple-blue color and the extracellular matrix and cytoplasm with the characteristic pink. Other structures in the tissue will show different hues, shades, and combinations of these colors. Figure 2 shows an H&E-stained human brainstem tissue and specific structures found on it.

Fig. 2
A biopsy highlights white matter tracts, blood vessel walls, red blood cells, and large nuclei with darkly stained nucleoli.

H&E-stained WSI from human brainstem tissue preserved using FFPE. Relevant structures were annotated by an expert pathologist. Abbreviations. H&E: hematoxylin and eosin. FFPE: formalin-fixed paraffin-embedded. WSI: whole slide image

Other staining agents can be used depending on the structure we would like to study or the clinical procedure. For instance, the toluidine blue stain is frequently used for intraoperative consultation. Frozen sections are usually stained with this agent as it reacts almost instantly with the tissue. However, one disadvantage is that it only presents shades of blue and purple, so there is considerably less differential staining of the tissue structures [10].

For brain histopathology, other biomarkers are also available. For instance, the cresyl violet (or Nissl staining) is commonly used to identify the neuronal structure in the brain and spinal cord tissue [13]. Also, the Golgi method, which uses a silver staining technique, is used for observing neurons under the microscope [11]. Studies for Alzheimer’s disease also frequently use ALZ50 and AT8 antibodies to reveal phosphorylated tau pathology using a standardized immunohistochemistry protocol [14,15,16]. Figure 3 shows the difference between ALZ50 and AT8 biomarkers and tau pathologies found in the tissue.

Fig. 3
Two sets of WSI images. The three images on the left and right highlight a neurofibrillary tangle and a neuritic plaque. They are highlighted by marked lines.

[Top left] ALZ50 antibody used to discover compacted structures (tau pathologies). Below the WSI is an example of a neurofibrillary tangle (left) and a neuritic plaque (right) stained with ALZ50 antibody. [Top right] AT8 antibody, the most widely used in clinics, helps to discover all structures in a WSI. Below the WSI, there is an example of a neurofibrillary tangle (left) and a neuritic plaque stained with AT8 antibody (right). Abbreviation. WSI: whole slide image

Having the slide stained is the last stage to prepare for studying microscopic structures of diseased or abnormal tissues. Considering the number of people involved in these processes (pathologists, pathology assistants, histotechnologists, tissue technicians, and trained repository managing personnel) and the precision of each stage, standardizing certain practices to create valuable slides for further analysis is needed.

Eiseman et al. [17] reported a list of best practices for biospecimen collection, processing, annotation, storage, and distribution. The proposal aims to set guidelines for managing large biospecimen banks containing the tissue sample embeddings excised from different organs with different pathologies and demographic distributions.

More specific standardized procedures for tissue sampling and processing have also been reported. For instance, in 2012, the Society of Toxicologic Pathology charged a Nervous System Sampling Working Group with devising recommended practices to routinely screen the central nervous system (CNS) and peripheral nervous system (PNS) during nonclinical general toxicity studies. The authors proposed a series of approaches and recommendations for tissue fixation, collection, trimming, processing, histopathology examination, and reporting [18]. Zhang J. et al. also address the process of tissue preparation, sectioning, and staining but focus only on brain tissue [11]. Although these recommendations aim to standardize specific techniques among different laboratories, they are usually imprecise and approximate, leaving the final decision to the specialists based on the tissue handled.

Due to this lack of automation during surgical removal, fixation, tissue processing, embedding, microtomy, staining, and mounting procedures, several artifacts can impact the quality of the image and the results of the analysis. A review of these artifacts is presented in [19]. The authors review the causes of the most frequent artifacts, explain how to identify them, and propose some ideas to prevent them from interfering with the diagnosis of lesions. Following the order of the tissue preparation and image acquisition procedure, the authors classify artifacts into eight classes: prefixation artifacts, fixation artifacts, artifacts related to bone tissue, tissue-processing artifacts, artifacts related to microtomy, artifacts related to floatation and mounting, staining artifacts, and mounting artifacts. Figure 4 shows some of them.

Fig. 4
Four histological images. The squares highlight different types of artifacts.

[Top left] Folding artifact (floatation and mounting-related artifact), [Top right] Marking fixation process (fixation artifact), [Bottom left] Breaking artifact (microtome-related artifact), [Bottom right] Overlaying tissue (mounting artifact)

3 Histopathological Image Analysis

This section aims to clarify the role that digital pathology plays in the analysis of the complex and large amounts of information obtained from tissue specimens. Whole slide image scanners are briefly discussed as a means of incorporating more images with higher throughput. We also discuss the DICOM standard used in medicine to digitally represent images and, in this case, tissue samples. We then focus on computational pathology, which is the analysis of the reconstructed whole slide images using different pattern recognition techniques such as machine learning (including deep learning) algorithms. This section contains some extracts from Jiménez’s thesis work [20].

3.1 Digital Pathology

Digital systems were introduced to the histopathological examination in order to deal with complex and vast amounts of information obtained from tissue specimens. Digital images were originally generated by mounting a camera on the microscope. The static pictures captured only reflected a small region of the glass slide, and the reconstruction of the whole glass slide was not frequently attempted due to its complexity and the fact that it is time-consuming. However, precision in the development of mechanical systems has made possible the construction of whole slide digital scanners. Garcia et al. [21] reviewed a series of mechanical and software systems used in the construction of such devices. The stored high-resolution images allow pathologists to view, manage, and analyze the digitized tissue on a computer monitor, similar to under an optical microscope but with additional digital tools to improve the diagnosis process.

WSI technology, also referred to as virtual microscopy, has proven to be helpful in a wide variety of applications in pathology (e.g., image archiving, telepathology, image analysis). In essence, the operating principle of a WSI scanner consists of moving the glass slide a small distance every time a picture is taken, until the entire tissue sample is captured. Every WSI scanner has six components: (a) a microscope with lens objectives, (b) a light source (bright field and/or fluorescent), (c) robotics to load and move glass slides around, (d) one or more digital cameras for capture, (e) a computer, and (f) software to manipulate, manage, and view digital slides [22]. The hardware and software used for these six components determine the key features to analyze when choosing a scanner. Some research articles have compared the hardware and software capabilities of different scanners on the market. For instance, in [22], Farahani et al. compared 11 WSI scanners from different manufacturers regarding imaging modality, slide capacity, scan speed, image magnification, image resolution, digital slide format, multilayer support, and special features their hardware and software may offer. This study showed that the robotics and hardware used in a WSI scanner are currently state of the art and almost standard in every device. Software, on the other hand, has room for further development. A similar study by Garcia et al. [21] reviewed 31 digital slide systems, comparing the same characteristics as in Farahani’s work. In addition, the authors classified the devices into digital microscopes (WSI) for virtual slide creation and diagnosis-aided systems for image analysis and telepathology. Automated microscopes were also included in the second group as they are the baseline for clinical applications.
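The step-and-capture principle described above can be sketched as a grid of stage positions. This is only an illustration: the field-of-view size (here a hypothetical 2048 × 2048 pixel camera at 0.25 μm/pixel, i.e., 512 μm per side), the optional overlap between neighboring captures, and the function name `stage_positions` are assumptions and do not describe any particular scanner.

```python
import math


def stage_positions(area_w_um, area_h_um, fov_w_um, fov_h_um, overlap=0.0):
    """Top-left stage coordinates (in micrometers) of each capture needed to
    cover a rectangular area, scanning row by row.

    `overlap` is the fraction of the field of view shared by neighboring
    captures (assumption: scanners may overlap fields to help stitching).
    """
    step_x = fov_w_um * (1.0 - overlap)
    step_y = fov_h_um * (1.0 - overlap)
    # Number of steps so that the last field of view reaches the far edge.
    n_cols = max(1, math.ceil((area_w_um - fov_w_um) / step_x) + 1)
    n_rows = max(1, math.ceil((area_h_um - fov_h_um) / step_y) + 1)
    return [(col * step_x, row * step_y)
            for row in range(n_rows)
            for col in range(n_cols)]


if __name__ == "__main__":
    # Hypothetical camera: 2048 pixels x 0.25 um/pixel = 512 um per side.
    fov_um = 2048 * 0.25
    positions = stage_positions(20_000, 15_000, fov_um, fov_um)
    print(len(positions))  # 1200 captures (40 columns x 30 rows)
```

Under these assumptions, a 20 mm × 15 mm area requires on the order of a thousand captures, which gives a sense of why the robotics and stitching software matter as much as the optics.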

3.2 Whole Slide Image Structure

The Digital Imaging and Communications in Medicine (DICOM) standard was adopted to store WSI digital slides into commercially available PACS (picture archiving and communication system) and facilitate the transition to digital pathology in clinics and laboratories. Due to the WSI dimension and size, a new pyramidal approach for data organization and access was proposed by the DICOM Standards Committee in [23].

A typical digitalization of a 20 mm × 15 mm sample using a resolution of 0.25 μm/pixel, also referred to as 40× magnification, will generate an image of approximately 80,000 × 60,000 pixels. Considering a 24-bit color resolution, the digitized image size is about 15 GB. Data size might even go one order of magnitude higher if the scanner is configured to a higher resolution (e.g., 80×, 100×), Z planes are used, or additional spectral bands are also digitized. In any case, conventional storage and access to these images will demand excessive computational resources to be implemented into commercial systems. Figure 5 describes the traditional approach (i.e., single frame organization), which stores the data in rows that extend across the entire image. This row-major approach has the disadvantage of loading unnecessary pixels into memory, especially if we want to visualize a small region of interest.
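The arithmetic behind these figures can be checked with a short sketch. The function names are illustrative, and the row-major estimate assumes that every row crossing the region of interest must be read in full, which is a simplification of the single frame organization described above.

```python
def wsi_pixel_dimensions(width_mm, height_mm, um_per_pixel):
    """Pixel dimensions of a scanned area at a given resolution (um/pixel)."""
    width_px = round(width_mm * 1000 / um_per_pixel)
    height_px = round(height_mm * 1000 / um_per_pixel)
    return width_px, height_px


def uncompressed_bytes(width_px, height_px, bits_per_pixel=24):
    """Raw size in bytes, ignoring headers, tiling, and compression."""
    return width_px * height_px * bits_per_pixel // 8


def row_major_bytes_for_region(image_width_px, region_height_px,
                               bits_per_pixel=24):
    """Bytes read when each row crossing the region is loaded in full
    (assumption: the layout offers no partial-row access)."""
    return image_width_px * region_height_px * bits_per_pixel // 8


if __name__ == "__main__":
    w, h = wsi_pixel_dimensions(20, 15, 0.25)
    print(w, h)                              # 80000 60000
    print(uncompressed_bytes(w, h) / 1e9)    # 14.4 (decimal GB)

    # Viewing a 1000 x 1000 pixel region of interest:
    needed = uncompressed_bytes(1000, 1000)
    loaded = row_major_bytes_for_region(w, 1000)
    print(loaded // needed)                  # 80
```

The raw size comes to roughly 14.4 GB, consistent with the "about 15 GB" quoted above, and under the stated assumption a 1000 × 1000 pixel region costs about 80 times more data than it actually contains.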

Fig. 5
A WSI. Regions to be analyzed and loaded are highlighted. Regions are divided into rows.

Single frame organization of whole slide images

Other types of organizations have also been studied. Figure 6 describes the storage of pixels in tiles, which decreases the computational time for visualization and manipulation of WSI by loading only the subset of pixels needed into memory. Although this approach allows faster access and rapid visualization of the WSI, it fails when dealing with different magnifications of the images, as is the case in WSI scanners. Figure 7 depicts the issues with rapid zooming of WSI. Besides loading a larger subset of pixels into memory, algorithms to perform the down-sampling of the image are time-consuming. At the limit, to render a low-resolution thumbnail of the entire image, all the data scanned must be accessed and processed [23]. Stacking precomputed low-resolution versions of the original image was proposed in order to overcome the zooming problem. Figure 8 describes the pyramidal structure used to store different down-sampled versions of the original image. The bottom of the pyramid corresponds to the highest resolution, and the levels go up to the thumbnail (lowest resolution) image. For further efficiency, tiling and pyramidal methods are combined to facilitate rapid retrieval of arbitrary subregions of the image as well as access to different resolutions. As depicted in Fig. 8, each image in the pyramid is stored as a series of tiles. In addition, the baseline image tiles can contain different colors or z-planes if multispectral images are acquired or if tracking variations in the specimen thickness is needed. This combined approach can be easily integrated into a web architecture such as the one presented by Lajara et al. [24], as tiles of the current user’s viewport can be cached without high memory impact.
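The combination of tiling and pyramid levels can be sketched in a few lines. The tile size (256 pixels) and the down-sampling factor (2 between consecutive levels) are assumptions chosen for illustration; actual formats and scanners use various values, and the function names are hypothetical.

```python
import math


def pyramid_levels(width, height, tile=256, factor=2):
    """Dimensions of each pyramid level, from full resolution (level 0)
    up to the first level that fits inside a single tile."""
    levels = [(width, height)]
    while levels[-1][0] > tile or levels[-1][1] > tile:
        w, h = levels[-1]
        levels.append((math.ceil(w / factor), math.ceil(h / factor)))
    return levels


def tiles_for_region(x, y, w, h, tile=256):
    """(column, row) indices of the tiles that intersect a pixel region
    given in the coordinates of one pyramid level."""
    col0, col1 = x // tile, (x + w - 1) // tile
    row0, row1 = y // tile, (y + h - 1) // tile
    return [(c, r)
            for r in range(row0, row1 + 1)
            for c in range(col0, col1 + 1)]


if __name__ == "__main__":
    levels = pyramid_levels(82_432, 80_640)
    print(len(levels))     # 10
    print(levels[-1])      # (161, 158)

    # A 512 x 512 viewport at full resolution touches only 9 tiles.
    print(len(tiles_for_region(1000, 1000, 512, 512)))  # 9
```

With these assumptions, a viewer only fetches the handful of tiles under the current viewport at the level closest to the requested zoom, instead of reading and down-sampling the full-resolution image.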

Fig. 6
A WSI. Regions to be analyzed and loaded are highlighted. Regions are highlighted by tiles.

Tiled image organization of whole slide images. Tiles’ size can range from 240 × 240 pixels up to 4096 × 4096 pixels

Fig. 7
A histological image at 2.5× magnification that can be zoomed in further to 10× and 40× magnifications. The 2.5× image shows a larger region of the sample at a lower resolution, while the 40× image shows a smaller region of the sample at a higher resolution with greater detail.

Rapid zooming issue when accessing lower-resolution images: a large amount of data needs to be loaded into memory. In this example, the image size at the highest resolution (221 nm/pixel) is 82,432 × 80,640 pixels

Fig. 8
A pyramidal representation of WSI. The pyramid is labeled with retrieved images at different magnifications. Three layers from top to bottom are labeled single-frame thumbnail image, multi-frame image, and multi-frame image with the lowest, intermediate, and highest resolution.

Pyramidal organization of whole slide images. In this example, the image size at the highest resolution (221 nm/pixel) is 82,432 × 80,640 pixels. The compressed (JPEG) file size is 2.22 GB, whereas the uncompressed version is 18.57 GB
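The combined tiled and pyramidal organization can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not the layout of any particular scanner format: the tile size of 256 pixels, the down-sampling factor of 2 between levels, and the function names are all assumptions made for illustration. Only the image dimensions come from the Fig. 8 example.

```python
import math


def pyramid_levels(width, height, tile=256, factor=2):
    """Return one (width, height, tiles_x, tiles_y) tuple per pyramid level.

    Level 0 is the full-resolution baseline; the last level fits in one tile.
    tile=256 and factor=2 are illustrative assumptions, not values from the text.
    """
    levels = []
    w, h = width, height
    while True:
        levels.append((w, h, math.ceil(w / tile), math.ceil(h / tile)))
        if w <= tile and h <= tile:
            break
        w, h = math.ceil(w / factor), math.ceil(h / factor)
    return levels


def tiles_for_viewport(x, y, w, h, tile=256):
    """Return the (column, row) indices of the tiles overlapping a viewport.

    The viewport is given in pixel coordinates of the level being displayed.
    """
    first_col, first_row = x // tile, y // tile
    last_col, last_row = (x + w - 1) // tile, (y + h - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


def uncompressed_gib(width, height, bytes_per_pixel=3):
    """Size of an uncompressed 8-bit RGB image in units of 2**30 bytes."""
    return width * height * bytes_per_pixel / 2**30


if __name__ == "__main__":
    for level, (w, h, tx, ty) in enumerate(pyramid_levels(82_432, 80_640)):
        print(f"level {level}: {w} x {h} pixels, {tx} x {ty} tiles")
    print(f"{uncompressed_gib(82_432, 80_640):.2f} GiB uncompressed")
```

Under these assumptions, a viewer only has to fetch the handful of tiles returned by `tiles_for_viewport` at the level closest to the requested zoom, instead of reading the full baseline image. With three bytes per pixel, the baseline of the example works out to about 18.57 binary gigabytes, which agrees with the uncompressed size quoted in the caption.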

As mentioned in previous paragraphs, WSI can occupy several terabytes of memory due to the data structure. Depending on the application, lossless or lossy compression algorithms can be applied. Lossless compression typically yields a 3X–5X reduction in size, whereas lossy compression techniques such as JPEG and JPEG2000 can achieve reductions of 15X–20X and 30X–50X, respectively [23]. Because WSI file formats are not standardized, scanner manufacturers may also develop proprietary compression algorithms based on the JPEG and JPEG2000 standards. Commercial WSI formats have a mean default compression value ranging from 13X to 27X. Although this considerably reduces the size of WSI files, efficient data storage was not a primary concern in the design of WSI formats for more than 10 years. In [25], Helin et al. addressed this issue and proposed an optimization to the JPEG2000 format, which yields up to 176X compression. Although no computational time was reported in that study, this breakthrough allows for efficient transmission of data through systems relying on Internet communication protocols.
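To give a sense of scale, the compression ratios quoted above can be applied to the 18.57 GB uncompressed image used as the example in Fig. 8. This is a back-of-the-envelope sketch: the dictionary of schemes simply restates the ranges cited in the text, and the function names are hypothetical.

```python
# Compression ratio ranges as quoted in the text (low, high).
SCHEMES = {
    "lossless": (3, 5),
    "JPEG": (15, 20),
    "JPEG2000": (30, 50),
    "optimized JPEG2000 [25] (upper bound)": (176, 176),
}


def compressed_gb(uncompressed_gb, ratio):
    """File size after compression; ratio=20 stands for a 20X reduction."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return uncompressed_gb / ratio


def achieved_ratio(uncompressed_gb, compressed):
    """Compression ratio implied by an uncompressed and a compressed size."""
    return uncompressed_gb / compressed


def size_ranges(uncompressed_gb):
    """Map each scheme to its (largest, smallest) expected file size in GB."""
    return {
        name: (compressed_gb(uncompressed_gb, low),
               compressed_gb(uncompressed_gb, high))
        for name, (low, high) in SCHEMES.items()
    }


if __name__ == "__main__":
    for name, (largest, smallest) in size_ranges(18.57).items():
        print(f"{name}: {smallest:.2f} to {largest:.2f} GB")
    print(f"Fig. 8 example: {achieved_ratio(18.57, 2.22):.1f}X")
```

The 2.22 GB JPEG file of the Fig. 8 example corresponds to a ratio of roughly 8.4X, which is below the 15X–20X range quoted for JPEG.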

3.3 Computational Pathology

Computational pathology is a term that refers to the integration of WSI technology and image analysis tools in order to perform tasks that were too cumbersome or even impossible to undertake manually. Image processing algorithms have evolved, yielding enough precision to be considered in clinical applications, as is the case for surgical pathology using frozen samples reported by Bauer et al. in [26]. Other examples mentioned in [22] include morphological analysis to quantitatively measure histological structures [27], automated selection of regions of interest such as areas of most active proliferative rate [28], and automated grading of tumors [29]. Moreover, educational activities have also benefited from the development of computational pathology. Virtual tutoring, online medical examinations, performance improvement programs, and even interactive illustrations in articles and books are being implemented thanks to this technology [22].

In order to validate a WSI scanner for clinical use (diagnosis purposes), several tests are conducted following the guidelines developed by the College of American Pathologists (CAP) [30]. On average, reported discrepancies between digital slides and glass slides are in the range of 1–5%. However, even glass-to-glass slide comparative studies can yield discrepancies due to observer variability, and increasing case difficulty.

Although several studies in the medical community have reported using WSI scanners to perform the analysis of tissue samples, pathologists remain reluctant to adopt this technology in their daily practice. Lack of training, limiting technology, shortcomings in scanning all slides, cost of equipment, and regulatory barriers have been reported as the principal issues [22]. In fact, it was not until early 2017 that the first WSI scanner was approved by the FDA and released to the market [31]. Nevertheless, WSI technology represents a milestone in modern pathology, having the potential to enhance the practice of pathology by introducing new tools which help pathologists provide a more accurate diagnosis based on quantitative information. Besides, this technology is also a bridge for bringing omics closer to routine histopathology, toward future breakthroughs such as spatial transcriptomics.

4 Methods in Brain Computational Pathology

This section is dedicated to different machine learning and deep learning methodologies to analyze brain tissue samples. We describe the technology by focusing on how this is applied (i.e., at the WSI or the patch level), the medical task associated with it, the dataset used, the core structure/architecture of the algorithms, and the significant results.

We begin by describing the general challenges in WSI analysis. We then move on to deep learning methods concerning only WSI analysis, and we conclude with machine learning and deep learning applications for brain disorders, focusing on the disease rather than on the processing of the WSI. In addition, as in other biomedical areas, data annotation is a vital issue in computational pathology because accurate and robust results depend on it. Therefore, some new techniques used to create reliable annotations (based on a seed-annotated dataset) will be presented and discussed.

4.1 Challenges in WSI Analysis Using ML

Successful application of machine learning algorithms to WSIs can improve—or even surpass—the accuracy, reproducibility, and objectivity of current clinical approaches and propel the creation of new clinical tools providing new insights on various pathologies [32]. Due to the characteristics of a whole slide image and the acquisition process described in the sections above, researchers usually face two nontrivial challenges related to the visual understanding of the WSIs and the inability of hardware and software to facilitate learning from such high-dimensional images.

Regarding the first challenge, the issue lies in the lack of generalization of ML techniques due to image artifacts and color variability in staining. Imaging artifacts result directly from errors in tissue section processing and from the hardware (scanners) used to digitize the slide. Uneven illumination, focusing, and image tiling are a few of the imaging artifacts present in WSIs; the first is the most relevant and most studied, as it makes it challenging for an algorithm to extract useful features from some regions of the scanned tissue. The problem worsens when staining artifacts such as stain variability are also present.

To address this problem, we find several algorithms for color normalization in the literature. Macenko [33], Vahadane [34], and Reinhard [35] algorithms are classical algorithms for color normalization implementing image processing techniques such as histogram normalization, color space transformations, color deconvolution (color unmixing), reference color density maps, or histogram matching. Extensions from these methods are also reported. For instance, Magee et al. [36] proposed two approaches to extend the Reinhard method: a multimodal linear normalization in the Lab color space and normalization in a representation space using stain-specific color deconvolution.
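As an illustration of the simplest family of these methods, the sketch below matches the per-channel mean and standard deviation of a source image to those of a reference image, which is the statistical core of Reinhard-style color transfer. It is a simplified sketch under stated assumptions: pixels are assumed to be already expressed in a decorrelated color space such as Lab (the color space conversion is omitted), images are represented as plain lists of three-component tuples, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def channel_stats(pixels):
    """Per-channel (mean, standard deviation) of a list of 3-tuples."""
    return [(fmean(channel), pstdev(channel)) for channel in zip(*pixels)]


def match_statistics(source, target_stats, eps=1e-8):
    """Shift and scale each channel of `source` to the target mean and std.

    `source` is assumed to be in a Lab-like color space already; the RGB to
    Lab conversion used by the original method is not shown here.
    `eps` avoids a division by zero for constant channels.
    """
    source_stats = channel_stats(source)
    normalized = []
    for pixel in source:
        normalized.append(tuple(
            (value - s_mean) / (s_std + eps) * t_std + t_mean
            for value, (s_mean, s_std), (t_mean, t_std)
            in zip(pixel, source_stats, target_stats)
        ))
    return normalized


if __name__ == "__main__":
    reference = [(60.0, 20.0, -10.0), (70.0, 25.0, -5.0), (80.0, 30.0, 0.0)]
    source = [(40.0, 5.0, 2.0), (50.0, 9.0, 4.0), (45.0, 7.0, 3.0)]
    print(match_statistics(source, channel_stats(reference)))
```

After the transformation, each non-constant channel of the source has the mean and standard deviation of the reference. Methods such as Macenko or Vahadane instead estimate stain vectors by color deconvolution, which this sketch does not attempt.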

The use of machine learning techniques, specifically deep convolutional neural networks, has also been studied for color normalization. In [37], the authors proposed the StainNet for stain normalization. The framework consists of a GANFootnote 2 (teacher network) trained to learn the mapping relationship between a source and target image, and an FCNNFootnote 3 (student network) able to transfer the mapping relationship of the GAN based on image content into a mapping relationship based on pixel values. A similar approach using cycle-consistent GANs was also proposed for the normalization of H&E-stained WSIs [38]. In the last case, synthetically generated images capture the representative variability in the color space of the WSI, enabling the architecture to transfer any color information from a new source image into a target color space.

On the other hand, the second challenge, related to the high dimensionality of WSIs, is addressed in two ways: processing using patch-level or slide-level annotations. Dimitriou et al. reported an overview of the literature for both approaches in [32]. For patch-based annotations, the authors reported patch sizes ranging from 32 × 32 pixels up to 10,000 × 10,000 pixels, with 256 × 256 pixels being a frequent value. Patches are generated in one of three ways: by sequentially dividing the WSI into tiles, which demands higher computational resources; by random sampling, which can lead to class imbalance issues; or by guided sampling based on pixel annotations. Patch-level annotations usually contain pixel-level labels. Approaches using these annotations frequently focus on the segmentation of morphological structures in patches rather than on the classification of the entire WSI. In [39], the authors studied the potential of semantic segmentation architectures such as the U-Net and compared them to classical CNN approaches for pixel-wise classification. Another approach known as HistoSegNet [40] combines visual attention maps (or activation maps) obtained with the Grad-CAM algorithm and a CNN for semantic segmentation of WSI. In addition, several methods are summarized in [41, 42] using graph deep neural networks to detect and segment morphological structures in WSIs.
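The three ways of generating patches mentioned above (sequential tiling, random sampling, and annotation-guided sampling) can be sketched as functions returning the top-left coordinates of each patch. This is a minimal illustration under assumptions: the patch size of 256 pixels is the frequent value reported in [32], whereas the function names, the non-overlapping stride, and the clamping of guided patches to the image border are choices made here for the example.

```python
import random


def grid_patches(width, height, patch=256, stride=None):
    """Sequential tiling: top-left corners of all complete patches."""
    stride = stride or patch
    return [(x, y)
            for y in range(0, height - patch + 1, stride)
            for x in range(0, width - patch + 1, stride)]


def random_patches(width, height, n, patch=256, seed=0):
    """Random sampling: n top-left corners drawn uniformly inside the image."""
    rng = random.Random(seed)
    return [(rng.randrange(0, width - patch + 1),
             rng.randrange(0, height - patch + 1))
            for _ in range(n)]


def guided_patches(centers, width, height, patch=256):
    """Guided sampling: one patch centered on each annotated pixel.

    Patches that would cross the image border are shifted back inside.
    """
    half = patch // 2
    corners = []
    for cx, cy in centers:
        x = min(max(cx - half, 0), width - patch)
        y = min(max(cy - half, 0), height - patch)
        corners.append((x, y))
    return corners


if __name__ == "__main__":
    print(len(grid_patches(82_432, 80_640)), "patches from sequential tiling")
    print(random_patches(82_432, 80_640, n=3))
    print(guided_patches([(10, 10), (40_000, 40_000)], 82_432, 80_640))
```

For an image of the size used in Fig. 8, sequential tiling yields about one hundred thousand patches of 256 × 256 pixels, which gives an idea of why this strategy is the most demanding computationally.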

Pixel labeling at high resolution is a time-demanding task and is prone to inter- and intra-expert variabilities impacting the learning process of machine learning algorithms. Therefore, despite the lower granularity of labeling, several studies have shown promising results when working with slide-based annotations.

With no available information about the pixel label, most algorithms usually aim to identify patches (or regions of interest in the WSI) that can collectively or independently predict the classification of the WSI. These techniques often rely on multiple instance learning, unsupervised learning, reinforcement learning, transfer learning, or a combination thereof [32]. Tellez et al. [43] proposed a two-step method for gigapixel histopathology analysis based on an unsupervised neural network compression algorithm to extract latent representations of patches and a CNN to predict image-level labels from those compressed images. In [44], the authors proposed a four-stage methodology for survival prediction based on randomly sampled patches from different patients’ slides. They used PCA to reduce the dimension of the feature space prior to the K-means clustering process to group patches according to their phenotype. Then, a deep convolutional network (DeepConvSurv) is used to determine which patches are relevant for the aggregation and final survival score. Qaiser et al. [45] proposed a model mimicking the histopathologist practice using recurrent neural networks (RNN) and CNN. In their proposal, they treat images as the environment and the RNN + CNN as the agent acting as a decision-maker (as a histopathologist would). The agent then looks at high-level tissue components (low magnification) and evaluates different regions of interest at low-level magnification, storing relevant morphological features into memory. Similarly, Momeni et al. [46] suggested using deep recurrent attention models (DRAMs) and CNN to create an attention-based architecture to process large input patches and locate discriminatory regions more efficiently. This last approach needs, however, further validation as results are not conclusive and have not been accepted by the scientific community yet.
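The idea of predicting a slide-level label from patch-level evidence can be illustrated with the simplest form of multiple instance learning pooling. The sketch below is generic and does not reproduce any of the methods cited above: it assumes that some patch classifier has already produced one probability per patch, and it aggregates them either with a maximum (a slide is positive if at least one patch is) or with the mean of the top-k scores. The function name and the decision threshold are hypothetical.

```python
def slide_score(patch_probabilities, top_k=None):
    """Aggregate patch probabilities into a single slide-level score.

    top_k=None applies max pooling (standard multiple instance assumption);
    otherwise the mean of the top_k highest patch scores is returned, which
    is less sensitive to a single spurious patch.
    """
    if not patch_probabilities:
        raise ValueError("at least one patch is required")
    ranked = sorted(patch_probabilities, reverse=True)
    if top_k is None:
        return ranked[0]
    k = min(top_k, len(ranked))
    return sum(ranked[:k]) / k


def classify_slide(patch_probabilities, threshold=0.5, top_k=None):
    """Binary slide label obtained by thresholding the pooled score."""
    return slide_score(patch_probabilities, top_k) >= threshold


if __name__ == "__main__":
    probabilities = [0.05, 0.10, 0.92, 0.15, 0.08]
    print(slide_score(probabilities))           # max pooling
    print(slide_score(probabilities, top_k=3))  # mean of the three best
    print(classify_slide(probabilities, top_k=3))
```

The two pooling choices can disagree on the same slide: here a single high-scoring patch drives the maximum, while the top-3 mean stays below the threshold. Learned aggregation schemes, such as the attention-based or recurrent models discussed above, replace this fixed rule with trainable weights.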

Relevant features for disease analysis, diagnosis, or patient stratification can be extracted from individual patches by looking into cell characteristics or morphology; however, higher structural information, such as the shape or extent of a tumor, can only be captured in more extensive regions. Some approaches to processing multiple magnification levels of a WSI are reported in [47,48,49,50,51]. They involve leveraging the pyramidal structure of WSI to access features from different resolutions and model spatial correlations between patches.

All the studies cited so far have no specific domain of application. Most of them were trained and tested using synthetic or public datasets containing tissue pathologies from different body areas. Therefore, most of the approaches can extend to different pathologies and diseases. In the following subsections, however, we will focus only on specific brain disorder methodologies.

4.2 DL Algorithms for Brain WSI Analysis

In recent times, deep-learning-based methods have shown promising results in digital pathology [52]. Unfortunately, only a few public datasets contain WSI of brain tissue, and most of them only contain brain tumors. In addition, most of them are annotated at the slide level, making the semantic segmentation of structures more challenging. Independently of the task (i.e., detection/classification or segmentation) and the application in brain disorders, we will explore the main ideas behind the methodologies proposed in the literature.

For the analysis of benign or cancerous pathologies in brain tissue, tumor cell nuclei are of significant interest. The usual framework for analyzing such pathologies was reported in [53] and used the WSI of diffuse glioma. The method first segments the regions of interest by applying classical image processing techniques such as mathematical morphology and thresholding. Then, several handcrafted features such as nuclear morphometry, region texture, intensity, and gradient statistics were computed and inputted to a nuclei classifier. Although such an approach—using quadratic discriminant analysis and maximum a posteriori (MAP) as a classification mechanism—reported an overall accuracy of 87.43%, it falls short compared to CNN, which relies on automated feature extractions using convolutions rather than on handcrafted features. Xing et al. [54] proposed an automatic learning-based framework for robust nucleus segmentation. The method begins by dividing the image into small regions using a sliding window technique. These patches are then fed to a CNN to output probability maps and generate initial contours for the nuclei using a region merging algorithm. The correct nucleus segmentation is obtained by alternating dictionary-based shape deformation and inference. This method outperformed classical image processing algorithms with promising results (mean Dice similarity coefficient of 0.85 and detection F1 score of 0.77 computed using gold-standard regions within 15 pixels for every nucleus center) using CNN-based features over classical ones.

Following a similar approach, Xu et al. [55] reported the use of deep convolutional activation features for brain tumor classification and segmentation. The authors used a pre-trained AlexNet CNN [56] on the ImageNet dataset to extract patch features from the last hidden layer of the architecture. Features are then ranked based on the difference between the two classes of interest, and the top 100 are finally input to an SVM for classification. For the segmentation of necrotic tissue, an additional step involving probability mappings from SVM confidence scores and morphological smoothing is applied. Other approaches leveraging the use of CNN-based features for glioma are presented in [47, 57]. The experiments reported achieved a maximum accuracy of 97.5% for classification and 84% for segmentation. Although these results seemed promising, additional tests with different patch sizes in [47] suggested that the method’s performance is data-dependent as numbers increase when larger patches, meaning more context information, are used.

With the improvement of CNN architectures for natural images, more studies are also leveraging transfer learning to propose end-to-end methodologies for analyzing brain tumors. Ker et al. [58] used a pre-trained Google Inception V3 network to classify brain histology specimens into normal, low-grade glioma (LGG), or high-grade glioma (HGG). Meanwhile, Truong et al. [59] reported several optimization schemes for a pre-trained ResNet-18 for brain tumor grading. The authors also proposed an explainability tool based on tile-probability maps to aid pathologists in analyzing tumor heterogeneity. A summary of DL approaches used in brain WSI processing, alongside other brain imaging modalities such as MRI or CT, is reported by Zadeh et al. in [60].

Let us now focus on studies dealing with tau pathology, which is a hallmark of Alzheimer’s disease. In [61], three different DL models were used to segment tau aggregates (tangles) and nuclei in postmortem brain WSIs of patients with Alzheimer’s disease. The three models included an FCNN, U-Net [62], and SegNet [63], with SegNet achieving the highest accuracy in terms of the intersection-over-union index. In [64], an FCNN was used on a dataset of 22 WSIs for semantic segmentation of tangle objects from postmortem brain WSIs. Their model is able to segment tangles of varying morphologies with high accuracy under diverse staining intensities. An FCNN model is also used in [65] to classify morphologies of tau protein aggregates in the gray and white matter regions from 37 WSIs representing multiple degenerative diseases. In [14], tau aggregate analysis is performed on a dataset of six postmortem brain WSIs with a combined classification-segmentation framework which achieved F1 scores of 81.3% and 75.8% on detection and segmentation tasks, respectively. In [16], neuritic plaques have been processed from eight human brain WSIs from the frontal lobe, stained with the AT8 antibody (widely used in clinics, helping to highlight most of the relevant structures). The impact of the staining (ALZ50 [14] vs. AT8 [16]), the normalization method, the slide scanner, the context, and the DL traceability/explainability have been studied, and a comparison with commercial software has been made. A baseline of 0.72 for the Dice score has been reported for plaque segmentation, reaching 0.75 using an attention U-Net.
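Since the studies above report their results with the Dice score and the intersection-over-union (IoU) index, a short sketch of how both are computed from binary masks may help in comparing them. The representation of masks as nested lists of 0 and 1 and the function name are assumptions made for the example; the convention of returning 1.0 when both masks are empty is also a choice made here.

```python
def dice_and_iou(prediction, truth):
    """Dice score and IoU between two binary masks of identical shape.

    Masks are 2D lists containing 0 (background) or 1 (object).
    Both metrics are defined as 1.0 when the two masks are empty.
    """
    tp = fp = fn = 0
    for predicted_row, truth_row in zip(prediction, truth):
        for p, t in zip(predicted_row, truth_row):
            if p and t:
                tp += 1      # pixel correctly labeled as object
            elif p and not t:
                fp += 1      # pixel wrongly labeled as object
            elif t and not p:
                fn += 1      # object pixel that was missed
    if tp + fp + fn == 0:
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou


if __name__ == "__main__":
    truth = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
    prediction = [[0, 1, 1, 1],
                  [0, 0, 1, 0]]
    print(dice_and_iou(prediction, truth))
```

The two metrics are related by Dice = 2·IoU / (1 + IoU), so the Dice score is never lower than the IoU for the same pair of masks. Under this relation, a Dice score of 0.75 corresponds to an IoU of 0.6.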

Several domains in DL-based histopathological analysis of AD tauopathy remain unexplored. Firstly, even if, as discussed, a first work concerning neuritic plaques has been recently published by our team in [16], most of the existing works have used DL for segmentation of tangles rather than plaques, as the latter are harder to identify against the background gray matter due to their diffuse/sparse appearance. Secondly, annotations of whole slide images are frequently affected by errors made by human annotators. In such cases, a preliminary DL model may be trained using weakly annotated data and used to assist the expert in refining annotations. Thirdly, contemporary tau segmentation studies do not consider context information. This is important in segmenting plaques from brain WSIs as these occur as sparse objects against an extended background of gray matter. Finally, DL models with explainability features have not yet been applied in tau segmentation from WSIs. This is a critical requirement for DL models used in clinical applications [66, 67]. The DL models should not only be able to precisely identify regions of interest; clinicians and general users also need to know which discriminative image features the model identifies as influencing its decisions.

4.3 Applications of Brain Computational Pathology

Digital systems were introduced to the histopathological examination to deal with complex and vast amounts of information obtained from tissue specimens. Whole slide imaging technology has proven to be helpful in a wide variety of applications in pathology (e.g., image archiving, telepathology, image analysis), especially when combining this imaging technique with powerful machine learning algorithms (i.e., computational pathology).

In this section, we will describe some applications of computational pathology for the analysis of brain tissue. Most methods focus on tumor analysis and cancer; however, we also find interesting results in clinical applications, drug trials [68], and neurodegenerative diseases. The authors cited in this section aim to understand brain disorders and use deep learning algorithms to extract relevant information from WSI.

In brain tumor research, an early survival study for brain glioma is presented in [44]. The approach was described above. In brief, it is a four-stage methodology based on randomly sampled patches from different patients’ slides. The authors perform dimensionality reduction using PCA and then K-means clustering to group patches according to their phenotype. Then, the patches are sent to a deep convolutional network (DeepConvSurv) to determine which are relevant for the aggregation and final survival score. The deep survival model is trained on a small dataset leveraging the architecture the authors proposed. Also, the method is annotation-free, and it can learn information about one patient, regardless of the number or size of the WSIs. However, it has a high memory footprint as it needs hundreds of patches from a single patient’s WSI. In addition, the authors do not address the evaluation of the progression of the tumor, and a deeper analysis of the clusters could provide information about the phenotypes and their relation to brain glioma.

Whole slide images have been used as a primary source of information for cancer diagnosis and prognosis, as they reveal the effects of cancer onset and its progression at the subcellular level. However, being an invasive image modality (i.e., tissue gathered during a biopsy), it is less frequently used in research and clinical settings. As an alternative, noninvasive and nonionizing imaging modalities, such as MRI, are quite popular for oncology imaging studies, especially in brain tumors.

Although radiology and pathology capture morphologic data at different biological scales, a combination of image modalities can improve image-based analysis. In [69], the authors presented three classification methods to categorize adult diffuse glioma cases into oligodendroglioma and astrocytoma classes using radiographic and histologic image data. Thirty-two cases were gathered from the TCGA projectFootnote 4 containing a set of MRI data (T1, T1C, FLAIR, and T2 images) and its corresponding WSI, taken from the same patient at the same time point. The methods described were proposed in the context of the Computational Precision Medicine (CPM) satellite event at MICCAI 2018, one of the first combining radiology and histology imaging analyses. The first one develops two independent pipelines giving two probability scores for the prediction of each case. The MRI pipeline preprocesses all images to remove the skull, co-register, and resample the data to leverage a fully convolutional neural network (CNN) trained on another MRI dataset (i.e., BraTS-2018) to segment tumoral regions. Several radiomic features are computed from such regions, and after reducing its dimensionality with PCA, a logistic regression classifier outputs the first probability score. WSIs also need a preprocessing stage as tissue samples may contain large areas of glass background. After a color space transformation to HSV (hue saturation value), lower and upper thresholds are applied to get a binary mask with the region of interest, which is then refined using mathematical morphology. Color-normalized patches of 224 × 224 pixels are extracted from the region of interest (ROI) and filtered to exclude outliers. The remaining patches are used to refine a CNN (i.e., DenseNet-161) pre-trained on the ImageNet dataset. In the prediction phase, the probability score of the WSI is computed using a voting system of the classes predicted for individual patches. 
The scores from both pipelines are finally processed in a confidence-based voting system to determine the final class of each case. This proposal achieved an accuracy score of 0.9 for classification.
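Two steps of the WSI pipeline described above lend themselves to a short illustration: the removal of glass background by thresholding in HSV space, and the aggregation of patch predictions by voting. The sketch below is a simplified stand-in rather than the implementation used in [69]. The choice of the saturation channel, the threshold values, and the function names are assumptions, and the refinement by mathematical morphology is omitted.

```python
import colorsys
from collections import Counter


def tissue_mask(rgb_image, s_low=0.1, s_high=1.0):
    """Binary mask marking pixels whose saturation lies in [s_low, s_high].

    rgb_image is a 2D list of (r, g, b) tuples with values in 0..255.
    Glass background is nearly white, hence weakly saturated; thresholding
    the saturation channel is an assumption made for this example.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            _, saturation, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(1 if s_low <= saturation <= s_high else 0)
        mask.append(mask_row)
    return mask


def vote(patch_labels):
    """Majority vote over patch predictions.

    Returns the winning class and the fraction of patches supporting it,
    which can serve as a simple probability score for the slide.
    """
    if not patch_labels:
        raise ValueError("at least one patch label is required")
    label, count = Counter(patch_labels).most_common(1)[0]
    return label, count / len(patch_labels)


if __name__ == "__main__":
    image = [[(255, 255, 255), (200, 120, 170)],
             [(250, 250, 252), (150, 80, 140)]]
    print(tissue_mask(image))
    print(vote(["astrocytoma", "oligodendroglioma", "astrocytoma"]))
```

In this toy image, the two near-white pixels are rejected as background and the two stained pixels are kept. The fraction returned by `vote` plays the role of the per-slide score that is later combined with the MRI score.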

The second and third approaches also processed data in two different pipelines. There are slight variations in the WSI preprocessing step in the second method, including Otsu thresholding for glass background removal and histogram equalization for color normalization of patches of 448 × 448 pixels. Furthermore, the authors used a 3D CNN to generate the output predictions for the MRI data and a DenseNet pre-trained architecture for WSI patch classification. The last feature layer from each classification model is finally used as input to an SVM model for a unified prediction. In addition, regularization using dropout is performed in the test phase to avoid overfitting the models. The accuracy obtained with this methodology was 0.8.
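Otsu thresholding, mentioned above for glass background removal, selects the gray level that maximizes the variance between the two resulting classes. The following is a minimal sketch of the classical algorithm on a flat list of 8-bit gray values. It is not the code of the cited method, and the function name and the toy data are assumptions.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t maximizing the between-class variance.

    gray_values is a flat list of integers in 0..levels-1. Pixels with a
    value <= t form the dark class (tissue); the others form the bright
    class (glass background).
    """
    histogram = [0] * levels
    for value in gray_values:
        histogram[value] += 1

    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for t in range(levels):
        weight_dark += histogram[t]
        sum_dark += t * histogram[t]
        if weight_dark == 0:
            continue                     # no dark pixel yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break                        # every pixel is already dark
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


if __name__ == "__main__":
    # Toy data: dark tissue pixels and bright glass pixels.
    pixels = [50] * 10 + [60] * 10 + [230] * 20 + [240] * 10
    t = otsu_threshold(pixels)
    tissue = [value for value in pixels if value <= t]
    print(t, len(tissue))
```

Unlike the fixed lower and upper HSV thresholds of the first method, this threshold adapts to each slide, which can help when the brightness of the glass background varies between scans.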

The third approach uses larger patches from WSI and an active learning algorithm proposed in [70] to extract regions of interest instead of randomly sampling the tissue samples. Features from the WSI patches are extracted using a VGG16 CNN architecture. The probability score is combined with the output probability of a U-Net +  2D DenseNet architecture used to process the MRI data. The method achieved an accuracy of 0.75 for unified classification. Although results are promising and provide a valid approach to combining imaging modalities, data quality and quantity are still challenging. The use of pre-trained CNN architectures for transfer learning using a completely different type of imaging modality might impact the performance of the whole pipeline. As seen in previous sections, WSI presents specific characteristics depending on the preparation and acquisition procedures not represented in the ImageNet dataset.

An extension to the previous study is presented in [71]. The authors proposed a two-stage model to classify gliomas into three subtypes. WSIs were divided into tiles and filtered to exclude patches containing glass backgrounds. An ensemble learning framework based on three CNN architectures (EfficientNet-B2, EfficientNet-B3, and SEResNeXt101) is used to extract features which are then combined with meta-data (i.e., age of the patient) to predict the class of glioma. MRI data is preprocessed in the same way as described before and input to a 3D CNN network with a 3D ResNet architecture as a backbone.

The release of new challenges and datasets, such as the Computational Precision Medicine: Radiology-Pathology Challenge on brain tumor classification (CPM-RadPath), has also allowed studies using weakly supervised deep learning methods for glioma subtype classification. For instance, in [72], the authors combine 2D and 3D CNNs to process 388 WSIs and their corresponding multiparametric MRI collected from the same patients. Based on a confidence index, the authors were able to fuse WSI- and MRI-based predictions, improving the final classification of the glioma subtype.

Moving on from brain tumors, examining brain WSI also provides essential insights into spatial characteristics helpful in understanding brain disorders.

In this area, analyzing small structures present in postmortem brain tissue is crucial to understanding the disease deeply. For instance, in Alzheimer’s disease, tau proteins are essential markers presenting the best histopathological correlation with clinical symptoms [73]. Moreover, these proteins can aggregate in three different structures within the brain (i.e., neurites, tangles, and neuritic plaques) and constitute one significant biomarker to study the progression of the disease and stratify patients accordingly.

In [14], the authors addressed the detection task of the Alzheimer’s patient stratification pipeline. The authors proposed a U-Net-based methodology for tauopathies segmentation and a CNN-based architecture for tau aggregates’ classification. In addition, the pipelines were completed with a nonlinear color normalization preprocessing and a morphological analysis of segmented objects. These morphological features can aid in the clustering of patients having different disease manifestations. One limitation, however, is the accuracy obtained in the segmentation/detection process.

Understanding the accumulation of abnormal tau protein in neurons and glia allows differentiating tauopathies such as Alzheimer’s disease, progressive supranuclear palsy (PSP), cortico-basal degeneration (CBD), and Pick’s disease (PiD). In [74], the authors proposed a diagnostic tool consisting of two stages: (1) an object detection pipeline based on the CNN YOLOv3 and (2) a random forest classifier. The goal is to detect different tau lesion types and then analyze their characteristics to determine to which specific pathology they belong. With an accuracy of 0.97 over 2522 WSI, the study suggests that machine learning methods can be applied to help differentiate uncommon neurodegenerative tauopathies.

Tauopathies are analyzed using postmortem brain tissue samples. For in vivo studies, there exist tau PET tracers that, unfortunately, have not been validated and approved for clinical use as correlations with histological samples are needed. In [75], the authors proposed an end-to-end solution for performing large-scale, voxel-to-voxel correlations between PET and high-resolution histological signals using open-source resources and MRI as the common registration space. A U-Net-based architecture segments tau proteins in WSI to generate 3D tau inclusion density maps, which are later registered to MRI to validate the PET tracers. Although segmentation accuracy was around 0.91 on 500 WSIs, the most significant limitation is the tissue sample preparation, that is, extracting and cutting brain samples to reconstruct 3D histological volumes. Additional studies combining postmortem MRI and WSI for neurodegenerative diseases were reported by Jonkman et al. in [76].

5 Perspectives

This last section of the chapter deals with new techniques for the explainability of artificial intelligence algorithms. It also describes new ideas related to responsible artificial intelligence in the context of medical applications, computational histopathology, and brain disorders. Besides, it introduces new image acquisition technology mixing bright light and chemistry to improve intraoperative applications. Finally, we will highlight computational pathology’s strategic role in spatial transcriptomics and refined personalized medicine.

In [15, 16], we address the issue of accurate segmentation by proposing a two-loop scheme as shown in Fig. 9. In our method, a U-Net-based neural network is trained on several WSIs manually annotated by expert pathologists. The structures we focus on are neuritic plaques and tangles following the study in [14]. The network’s predictions (in new WSIs) are then reviewed by an expert who can refine the predictions by modifying the segmentation outline or validating new structures found in the WSI. Additionally, an attention-based architecture is used to create a visual explanation and refine the hyperparameters of the initial architecture in charge of the prediction proposal.

Fig. 9
A schematic presents the process to improve the segmentation of tauopathies by leveraging the expertise of human experts. The process involves patch-based detection and segmentation, network predictions, refined predictions by experts, a baseline, and an AI network. The process also involves stratification based on features.

Expert-in-the-loop architecture proposal to improve tauopathies’ segmentation and to stratify AD patients

We tested the attention-based architecture with a dataset of eight WSIs divided into patches following an ROI-guided sampling. Results show qualitatively in Fig. 10 that through this visual explanation, the expert in the loop could define the border of the neuritic plaque (object of interest) more accurately so the network can update its weights accordingly. Additionally, quantitative results (Dice score of approximately 0.7) show great promise for this attention U-Net architecture.

Fig. 10
Six histology images. The images on the top are labeled ground truth mask, iterations 1, loss 0.7631, iterations 100, loss 0.8246. The images on the bottom are labeled original image, iterations, 500, loss 0.5071, and iterations 1000, loss 0.495, from left to right, respectively.

Attention U-Net results. The figure shows a patch of size 128 × 128 pixels, the ground-truth binary mask, and the focus progression using successive activation layers of the network

Our next step is to use a single architecture for explainability and segmentation/classification. We believe our method will improve the accuracy of the neuritic plaques and tangles outline and create better morphological features for patient stratification and understanding of Alzheimer’s disease [15, 16].

Despite their high computational efficiency, artificial intelligence models, and deep learning models in particular, face important usability and translational limitations in clinical use as well as in biomedical research. The main reason for these limitations is their generally low acceptability by biomedical experts, essentially due to the lack of feedback, traceability, and interpretability. Indeed, domain experts usually feel frustrated by a general lack of insights, while the implementation of the tool itself requires them to make a considerable effort to formalize, verify, and provide a tremendous amount of domain expertise. Some authors speak of a “black-box” phenomenon, which is undesirable for a traceable, interpretable, explicable, and, ultimately, responsible use of these tools.

In recent years, explainable AI (xAI) models have been developed to provide insight into AI decision-making processes by interpreting their second-opinion quantifications, diagnoses, and predictions. While explaining simple AI models for regression and classification tasks is relatively straightforward, explainability becomes more difficult as the model’s complexity increases. A novel paradigm is therefore necessary for better interaction between computer scientists, biologists, and clinicians, supported by an essential new actor, xAI, which opens the way toward responsible AI: fairness, ethics, privacy, traceability, accountability, safety, and carbon footprint.

In digital histopathology, several studies report on the usage and benefits of explainable AI models. In [77], the authors describe an xAI-based software named HistoMapr and its application to breast core biopsies. This software automatically identifies regions of interest (ROIs) and rapidly discovers key diagnostic areas in whole slide images of breast cancer biopsies. It generates a provisional diagnosis based on the automatic detection and classification of relevant ROIs and also provides pathologists with a list of the key findings that led to the recommendation. An explainable segmentation pipeline for whole slide images is described in [40]: it performs a patch-level classification of colon glands for different cancer grades using a CNN, followed by inference of class activation maps for the classifier. The activation maps are then used for the final pixel-level segmentation. The method outperforms other weakly supervised methods applied to these types of images and generalizes easily to other datasets. A medical use case of AI versus human interpretation of histopathology data using a liver biopsy dataset is described in [78], which also stresses the need to develop methods for causability, or measurement of the quality of AI explanations. In [67], AI models such as deep auto-encoders were used to generate features from whole-mount prostate cancer pathology images that pathologists could understand. This work showed that a combination of human and AI-generated features produced higher accuracy in predicting prostate cancer recurrence. Finally, in [16], the authors show that, besides providing valuable visual explanations, the attention U-Net also improves neuritic plaque segmentation, raising the Dice score from 0.72 (with the original U-Net) to 0.75.
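
To make the idea of turning class activation maps into a pixel-level segmentation more concrete, the sketch below uses the classic CAM formulation, in which the map for a class is the weighted sum of the last convolutional feature maps, with weights taken from the classifier layer that follows global average pooling. This is an illustrative assumption and not necessarily the exact pipeline of [40]; the function names, the toy feature maps, and the 0.5 threshold are all hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    values = [v for row in cam for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


def threshold_mask(cam, threshold=0.5):
    """Binary mask from a CAM; the threshold is an assumed, tunable value."""
    return [[1 if v >= threshold else 0 for v in row] for row in normalize(cam)]


# Two toy 2 x 2 feature maps and the weights of one class (hypothetical values).
feature_maps = [
    [[0.0, 1.0], [0.0, 2.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]
class_weights = [1.0, -0.5]

cam = class_activation_map(feature_maps, class_weights)
mask = threshold_mask(cam)
print(cam)   # per-pixel evidence for the class
print(mask)  # coarse pixel-level segmentation
```

In practice the map would be upsampled to the patch resolution before thresholding, and the threshold would be chosen on validation data.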

Based on the fusion of MRI and histopathology imaging datasets, a deep learning 3D U-Net model with explanations is used in [79] for prostate tumor segmentation. Grad-CAM [80] heat maps were estimated for the last convolutional layer of the U-Net to interpret its recognition and localization capability. In [81], a framework named NeuroXAI is proposed to render explainability to existing deep learning models in brain imaging research without any architecture modification or reduction in performance. This framework implements seven state-of-the-art explanation methods, including Vanilla gradient [82], Guided back-propagation, Integrated gradients [83], SmoothGrad [84], and Grad-CAM. These methods can be used to generate visual explainability maps for deep learning models such as 2D and 3D CNN, VGG [85], and ResNet-50 [86] (for classification) and 2D/3D U-Net (for segmentation). In [87], the high-level features of three deep convolutional neural networks (DenseNet-121, GoogLeNet, MobileNet) are analyzed using the Grad-CAM explainability technique. The Grad-CAM outputs helped distinguish the brain tumor lesion localization capabilities of these three models. An explainability framework using SHAP [88] and LIME [89] to predict patient age using the morphological features from a brain MRI dataset is developed in [90]. The SHAP explainability model is robust for this imaging modality in explaining morphological feature contributions to age prediction, which would ultimately help develop personalized age-related biomarkers from MRI. Attempts to explain the functional organization of deep segmentation models such as DenseUnet, ResUnet, and SimUnet and to understand how these networks achieve high-accuracy brain tumor segmentation are presented in [91]. While current xAI methods mainly focus on explaining models on a single imaging modality, the authors of [92] address the explainability issue in multimodal medical images, such as PET-CT or multi-stained pathological images. Combining modality-specific information to explain a diagnosis is a complex clinical task, and the authors developed a new multimodal explanation method with modality-specific feature importance.
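
As an illustration of one of the gradient-based methods listed above, the sketch below approximates integrated gradients for a tiny logistic model whose gradient is known in closed form. It is not the NeuroXAI implementation of [81]: the model, its weights, the all-zero baseline, and the number of integration steps are assumptions chosen so that the example runs with the standard library only. The attribution of feature i is (x_i − x'_i) times the average gradient along the straight path from the baseline x' to the input x.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy differentiable model: logistic regression on a feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def model_gradient(x, weights, bias):
    """Analytic gradient of the logistic model with respect to its input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann-sum approximation of the path integral of gradients."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        totals = [t + g for t, g in zip(totals, grad)]
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, totals)]


# Hypothetical parameters and input, chosen only for illustration.
weights, bias = [2.0, -1.0, 0.5], 0.1
x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
output_change = model(x, weights, bias) - model(baseline, weights, bias)
print(attributions)
print(sum(attributions), output_change)  # the two values should nearly match
```

The final line checks the completeness property of integrated gradients: the attributions sum, up to numerical error, to the difference between the model output at the input and at the baseline. For an image model, the same procedure is applied per pixel, with gradients obtained by automatic differentiation.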

Intraoperative tissue diagnostic methods have remained unchanged for over 100 years in surgical oncology. Standard light microscopy used in combination with H&E and other staining biomarkers has improved over the last decades with the appearance of new scanner technology. However, the steps involved in the preparation and some artifacts introduced by scanners pose a potential barrier to efficient, reproducible, and accurate intraoperative cancer diagnosis and other brain disorder analyses. As an alternative, label-free optical imaging methods have been developed.

Label-free imaging is a method for cell visualization which does not require labeling or altering the tissue in any way. Bright-field, phase contrast, and differential interference contrast microscopy can be used to visualize label-free cells. The latter two techniques are used to improve the image quality of standard bright-field microscopy. Among the benefits of label-free imaging, the cells are analyzed in their unperturbed state, so findings are more reliable and biologically relevant. It is also a cheaper and quicker technique, as the tissue does not need any genetic modification or alteration. In addition, experiments can run longer, making them appropriate for studying cellular dynamics [93]. Raman microscopy, a label-free imaging technique, uses infrared incident light from lasers to capture vibrational signatures of chemical bonds in the tissue sample’s molecules. The biomedical tissue is excited with a dual-wavelength fiber laser setup at the so-called pump and Stokes frequencies to enhance the weak vibrational effect [94]. This technique is known as coherent anti-Stokes Raman scattering (CARS) or stimulated Raman scattering histology (SRH).
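
In coherent Raman imaging, the difference between the pump and Stokes frequencies is tuned to the vibrational frequency of the chemical bond of interest, and in CARS the signal is emitted at the anti-Stokes frequency 2ω_pump − ω_Stokes. The short sketch below converts a pair of laser wavelengths into the probed Raman shift in wavenumbers (cm⁻¹) and into the anti-Stokes wavelength. The wavelengths of 790 nm and 1020 nm are illustrative assumptions and are not taken from the setup in [94]; the function names are likewise hypothetical.

```python
def wavenumber_cm1(wavelength_nm):
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm


def raman_shift_cm1(pump_nm, stokes_nm):
    """Vibrational frequency probed by a pump/Stokes pair, in cm^-1."""
    return wavenumber_cm1(pump_nm) - wavenumber_cm1(stokes_nm)


def anti_stokes_wavelength_nm(pump_nm, stokes_nm):
    """Wavelength of the CARS signal at 2 * pump - Stokes (in wavenumbers)."""
    return 1e7 / (2.0 * wavenumber_cm1(pump_nm) - wavenumber_cm1(stokes_nm))


# Assumed example wavelengths (nm); real systems tune these to the target bond.
pump_nm, stokes_nm = 790.0, 1020.0

shift = raman_shift_cm1(pump_nm, stokes_nm)
anti_stokes_nm = anti_stokes_wavelength_nm(pump_nm, stokes_nm)
print(f"Raman shift: {shift:.1f} cm^-1")
print(f"Anti-Stokes wavelength: {anti_stokes_nm:.1f} nm")
```

With these assumed values the probed shift falls near 2850 cm⁻¹, in the C–H stretching region commonly used to contrast lipid-rich and protein-rich structures, and the anti-Stokes signal is blue-shifted relative to the pump.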

Sarri et al. [95] proposed the first one-to-one comparison between SRH and H&E, as the latter technique remains the standard in histopathology analyses. The evaluation was conducted using the same cryogenic tissue sample. SRH data was collected first, as it did not need staining. SRH and SHG (second harmonic generation, another label-free nonlinear optical technique) were combined to generate a virtual H&E slide for comparison. The results showed an almost perfect similarity between SRH and standard H&E slides. Both virtual and real slides show the relevant structures needed to identify cancerous and healthy tissue. In addition, SRH proved to be a fast histologic imaging method suitable for intraoperative procedures.

Similar to standard histopathology, computational methods are also applicable to SRH technology. For instance, Hollon and Orringer [96] proposed a CNN methodology to interpret histologic features from SRH brain tumor images and accurately segment cancerous regions. Results show a slightly better performance (94.6%) than the one obtained by the pathologist (93.9%) in the control group. This study was extended and validated for intraoperative diagnosis in [97]. The study used 2.5 million SRH images and predicted brain tumor diagnosis in under 150 s with an accuracy of 94.6%. The results clearly show the potential of combining computational pathology and stimulated Raman histology for fast and accurate diagnostics in surgical procedures.

Finally, due to its strategic positioning at the crossroads of molecular biology/omics, radiology/radiomics, and clinics, computational pathology, by generating “pathomic” features, is expected to play a crucial role in the revolution of spatial transcriptomics, defined as the ability to capture the positional context of transcriptional activity in intact tissue. Spatial transcriptomics is expected to generate a set of technologies allowing researchers to localize transcripts at tissue, cellular, and subcellular levels by providing an unbiased map of RNA molecules in tissue sections. These techniques use microscopy and next-generation sequencing to allow scientists to measure gene expression in a specific tissue or cellular context, paving the way toward more effective personalized medicine. Coupled with these new technologies for data acquisition, we have the release of new WSI brain datasets [98], new frameworks for deep learning analysis of WSI [99, 100], and methods to address the ever-growing concern of privacy and data sharing policies [101].