Abstract
Purpose
Accurate histological diagnosis of Hirschsprung disease (HD) is challenging because the assessment is complex and prone to error. In this study, we present an artificial intelligence (AI)-based method designed to identify ganglionic cells and hypertrophic nerves in HD histology.
Methods
Formalin-fixed samples were used. An expert pathologist and a surgeon annotated the digitised slides on a web-based platform, identifying ganglionic cells and nerves. Images were partitioned into square sections, augmented through data manipulation techniques and used to develop two distinct U-net models: one for detecting ganglionic cells and normal nerves; the other to recognise hypertrophic nerves.
Results
The study included 108 annotated samples, resulting in 19,600 images after data augmentation and manual segmentation. Subsequently, 17,655 images without target elements were excluded. The algorithm was trained using the remaining 1945 images (930 for model 1 and 1015 for model 2), with 1556 used for training the supervised network and 389 for validation. The accuracy of model 1 was 92.32%, while model 2 achieved an accuracy of 91.5%.
Conclusion
The AI-based U-net technique demonstrates robustness in detecting ganglion cells and nerves in HD. The deep learning approach has the potential to standardise and streamline HD diagnosis, benefiting patients and aiding in training of pathologists.
Purpose
Hirschsprung disease (HD) is a congenital disease characterised by the absence of ganglion cells in the distal bowel, extending proximally for varying distances [1]. A variety of diagnostic tests, including contrast enema and anorectal manometry, may be used as diagnostic screens, but diagnosis ultimately rests on histopathological examination of a rectal biopsy. The diagnostic histological features of HD include the absence of ganglion cells and an increase in hypertrophic cholinergic nerves [2]. The accurate assessment of these histological features plays a pivotal role in planning the surgery needed to remove the non-functioning bowel. However, this histological analysis is far from straightforward. It presents several challenges, primarily in the accurate differentiation of these structures, and is susceptible to errors arising from variations in interpretation, which can significantly impact the diagnosis and, consequently, treatment decisions. To address these challenges, artificial intelligence (AI) has emerged as a valuable tool in pathology [3, 4]. AI algorithms offer the promise of delivering consistent results, mitigating the interobserver variability that often occurs in manual assessment, thus facilitating the comparison of cases across different medical institutions [3, 5]. Furthermore, especially in low-volume centres, AI can help offset the deficiency in specialised expertise, thereby enhancing the quality of care provided to HD patients. In this study, we introduce an AI-based method aimed at automating the identification and quantification of ganglionic cells and hypertrophic nerves in the resected specimens for HD.
Materials and methods
Participant and slide selections
Specimens collected from patients undergoing surgical treatment for Hirschsprung’s disease (HD) between January 2010 and January 2022 were included in this study. Slides with poor quality, such as those exhibiting drying effects, significant air bubbles, or broken glass, were excluded from the analysis. Only formalin-fixed, paraffin-embedded tissue samples stained with haematoxylin–eosin were considered for the study. A high-capacity scanner was employed to capture images of 2048 × 1280 pixels at ×10 magnification, thereby generating the dataset.
Development and training of AI system
The proposed methodology consisted of the following steps:
1. Annotation: An expert pathologist and a paediatric surgeon specialising in HD histology annotated the high-resolution slides (2048 × 1280 pixels at ×10 magnification) using the AAPER web-based platform, identifying and encircling ganglionic cells, nerves, and hypertrophic nerves individually. The annotations made on each image were then exported as multiclass masks. This process produced a TIFF file for each histological slide, ultimately yielding two datasets: one of histological images and another of the corresponding masks.
2. Data augmentation: To strengthen training and mitigate the risk of overfitting, we applied data augmentation techniques, namely random flipping, rotation, and colour normalisation.
3. Dataset subdivision: The dataset was divided into two subsets. The first, comprising 80% of the dataset, served as the training set for algorithm and model development. The second, comprising the remaining 20%, was reserved for subsequent analysis and validation.
4. Manual segmentation: Each slide was partitioned into 40 square patches, each measuring 256 × 256 pixels. This division not only enhanced the overall output quality but also further increased the volume of data available for analysis.
5. Training the supervised U-net convolutional neural network: This extensive dataset served as the foundation for training the neural networks. To create the different models, we curated subsets of the dataset, each with specific image characteristics. One dataset exclusively contained images of ganglion cells and normal nerves, while another comprised solely hypertrophic nerves. To ensure comprehensive learning, we also introduced images characterised solely by background into both datasets. This approach enabled the network to not only recognise the target areas for classification but also understand the broader context of the images, where normal cells and nerves would never appear, unlike hypertrophic nerves. In summary, the datasets we generated encompassed images with the following characteristics:
Dataset 1—model 1: Ganglionic cells and normal nerves
Dataset 2—model 2: Hypertrophic nerves:
The expected segmentations when applying model 1 are as follows:
For images within the ganglionic zone: Segmentation of normal nerves and ganglionic cells
For images within the transition zone: Segmentation of ganglionic cells
For images within the aganglionic zone: No segmentation
For images containing only background: No segmentation
When applying model 2, the anticipated segmentations are:
For images within the ganglionic zone: No segmentation
For images within the transition zone: Segmentation of hypertrophic nerves.
For images within the aganglionic zone: Segmentation of hypertrophic nerves.
For images containing only background: No segmentation.
Consequently, there are images segmented by both model 1 and model 2, all of which belong to the transition zone and require additional scrutiny by the pathologist (Fig. 1).
6. Testing the convolutional neural network: Some slides were used to test the proposed model. Each whole image of 2048 × 1280 pixels was divided into 40 patches of 256 × 256 pixels. The predict function was then applied to each patch, and the predicted patches were reassembled using the un-patchify function to generate a mask of the original size. The manually segmented masks served as a verification control.
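The tiling and reassembly described in steps 4 and 6 can be sketched as follows. This is a minimal illustration using plain NumPy reshaping rather than the library functions used in the study; the helper names (to_patches, from_patches) are hypothetical, and predict_patch is a placeholder threshold standing in for the trained U-net, not the actual model.

```python
import numpy as np

PATCH = 256  # patch side length in pixels, as stated in the text


def to_patches(image: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping tiles in row-major order."""
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    tiles = image[: rows * patch, : cols * patch].reshape(rows, patch, cols, patch, c)
    return tiles.transpose(0, 2, 1, 3, 4).reshape(rows * cols, patch, patch, c)


def from_patches(masks: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Reassemble (rows * cols, patch, patch) masks into one full-size mask."""
    _, ph, pw = masks.shape
    grid = masks.reshape(rows, cols, ph, pw).transpose(0, 2, 1, 3)
    return grid.reshape(rows * ph, cols * pw)


def predict_patch(patch: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the trained model: labels bright pixels as class 1."""
    return (patch.mean(axis=-1) > 127).astype(np.uint8)


# Synthetic slide with the stated dimensions: 1280 rows x 2048 columns, 3 channels.
slide = np.random.default_rng(0).integers(0, 256, size=(1280, 2048, 3), dtype=np.uint8)
patches = to_patches(slide)
mask = from_patches(np.stack([predict_patch(p) for p in patches]), rows=5, cols=8)
print(patches.shape, mask.shape)  # (40, 256, 256, 3) (1280, 2048)
```

A 2048 × 1280 image yields an 8 × 5 grid, which matches the 40 patches per slide reported in the text.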
Analysis
At the conclusion of the training process, the evaluation function was applied to the model, yielding a metric value. In our study, we assessed accuracy, which measures how closely predicted values match their true counterparts. This metric was calculated by dividing the number of correct predictions by the total number of predictions made (accuracy = correct predictions / all predictions). A high accuracy, typically exceeding 90%, is indicative of strong model performance. In addition to evaluating the model's ability to perform automatic segmentation, a prediction function was applied to each input sample through the call y_pred = model.predict(x_test). Here, x_test is the test input, comprising images from the validation set, which represents 20% of the original dataset. This dataset was also used to assess the model’s accuracy at the end of each epoch.
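The accuracy formula above can be illustrated with a short sketch. The text does not state whether predictions were counted per pixel or per image; the example below assumes per-pixel counting on label masks, which is a common convention for segmentation models, and the function name pixel_accuracy is hypothetical.

```python
import numpy as np


def pixel_accuracy(y_true, y_pred) -> float:
    """Fraction of pixels whose predicted class equals the annotated class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("masks must have the same shape")
    return float(np.mean(y_true == y_pred))


# Toy 2 x 4 masks with three classes (0 = background, 1 and 2 = target classes).
y_true = np.array([[0, 0, 1, 1],
                   [0, 2, 2, 1]])
y_pred = np.array([[0, 0, 1, 0],
                   [0, 2, 1, 1]])

acc = pixel_accuracy(y_true, y_pred)
print(f"{acc:.2%}")  # 6 of 8 pixels agree -> 75.00%
```

Because background usually dominates histological patches, a per-pixel accuracy of this kind can be high even when small structures are missed, so it is best read alongside the qualitative predictions shown in the figures.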
Results
During the study period, we identified a total of 31 eligible patients diagnosed with Hirschsprung’s disease (HD). From these patients, we collected a set of 108 tissue samples representing various regions, including the ganglionic zone, transition zone, and aganglionic zone. All these samples were formalin-fixed and paraffin-embedded, and they were stored in the archive of the Institute of Pathology at a single specialised referral centre. Ganglionic cells, nerves, and hypertrophic nerves were annotated using a web-based platform (AAPER). The annotations were then used to generate masks, which served as inputs for the neural network. Following this, we augmented the dataset to obtain 540 slides. Of these, 50 were excluded from the training process because they were reserved for testing the U-net model. Manual segmentation of the remaining 490 slides yielded a dataset of 19,600 images. During data processing, we excluded 17,655 images that did not contain the target elements. Subsequently, the AI algorithm was trained using the remaining 1945 images (930 for model 1 and 1015 for model 2).
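The counts reported above are mutually consistent, as the following arithmetic check shows. All figures are taken from the text except the augmentation factor, which is not stated explicitly and is simply inferred here as the ratio of 540 slides to 108 samples.

```python
# Counts stated in the text.
samples = 108
slides_after_augmentation = 540
slides_reserved_for_testing = 50
patches_excluded = 17_655
model_1_patches, model_2_patches = 930, 1015

# Each 2048 x 1280 slide yields an 8 x 5 grid of 256 x 256 patches.
patches_per_slide = (2048 // 256) * (1280 // 256)

augmentation_factor = slides_after_augmentation // samples  # inferred, not stated
training_slides = slides_after_augmentation - slides_reserved_for_testing
total_patches = training_slides * patches_per_slide
patches_retained = total_patches - patches_excluded

print(augmentation_factor, training_slides, total_patches, patches_retained)
# -> 5 490 19600 1945
```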
Model 1: Ganglionic cells and normal nerves
Three classes were considered for this segmentation: ganglionic cells, normal nerves, and background. The initial dataset comprised 930 images, divided into 744 images for the training set and 186 images for the validation set. The training involved 120 epochs and, as shown in Fig. 2, was deemed sufficient because a plateau was reached from epoch 100 onwards (green curve). With this scheme, the accuracy reached 92.3% (Fig. 2, red curve). Figure 3 shows the predictions of model 1.
Model 2: Hypertrophic nerves
For training this model, we focussed on just two classes: hypertrophic nerves and background. The initial dataset comprised 1015 images, with the following distribution (812 images for the training set; 203 images for the validation set). Training involved 120 epochs, and as depicted in Fig. 4, the training accuracy reached a plateau starting from the 98th epoch, indicating a sufficient number of epochs for effective training. The calculated accuracy value of this model was 91.50% (Fig. 4). As for prediction, Fig. 5 demonstrates the model’s proficiency in accurately identifying the presence of both classes.
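The 80/20 subdivision used for both models can be sketched as below. This is an illustration only: the text does not describe how images were assigned to each subset, so the shuffling, the seed, and the function name split_train_validation are assumptions.

```python
import random


def split_train_validation(items, train_percent: int = 80, seed: int = 0):
    """Shuffle the items and split them into training and validation subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumed: random assignment with a fixed seed
    n_train = len(items) * train_percent // 100
    return items[:n_train], items[n_train:]


# Image indices stand in for the 930 and 1015 images of datasets 1 and 2.
train_1, val_1 = split_train_validation(range(930))
train_2, val_2 = split_train_validation(range(1015))

print(len(train_1), len(val_1))  # 744 186
print(len(train_2), len(val_2))  # 812 203
print(len(train_1) + len(train_2), len(val_1) + len(val_2))  # 1556 389
```

The resulting subset sizes reproduce the per-model figures given above and the totals of 1556 training and 389 validation images reported in the abstract.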
Test set evaluation
During this phase, we employed the 50 slides that were excluded from the training set and applied both models 1 and 2, which had undergone validation in previous stages. When model 1, designed to identify ganglionic cells and normal nerves, was applied to images from the ganglionic zone, it detected a higher number of ganglionic cells than were actually present (Supplementary Fig. 1). In the case of images from the aganglionic zone, when model 1 was employed, only 4 out of 40 patches were incorrectly segmented (Supplementary Fig. 2). On the other hand, applying model 2, specifically trained to recognise hypertrophic nerves, to ganglionic region images resulted in only 3 out of 40 patches being misclassified (Supplementary Fig. 3). When model 2 was applied to ganglionic regions, the system accurately segmented them (Supplementary Fig. 4).
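The expected behaviour of the two models in each zone, as listed in the Methods, amounts to a simple decision rule. The sketch below is a hypothetical illustration of that rule, not the procedure used in the study: the function name suggest_zone and the min_pixels threshold are assumptions, and any non-zero label in a mask is treated as a segmented structure.

```python
import numpy as np


def suggest_zone(mask_model_1: np.ndarray, mask_model_2: np.ndarray,
                 min_pixels: int = 1) -> str:
    """Combine the outputs of both models into a suggested zone for one image."""
    # Model 1 segments ganglionic cells and normal nerves.
    found_1 = np.count_nonzero(mask_model_1) >= min_pixels
    # Model 2 segments hypertrophic nerves.
    found_2 = np.count_nonzero(mask_model_2) >= min_pixels
    if found_1 and found_2:
        return "transition"  # segmented by both models: refer to the pathologist
    if found_1:
        return "ganglionic"
    if found_2:
        return "aganglionic"
    return "background"


empty = np.zeros((256, 256), dtype=np.uint8)
marked = empty.copy()
marked[100:120, 100:120] = 1  # a small segmented region

print(suggest_zone(marked, empty))   # ganglionic
print(suggest_zone(marked, marked))  # transition
print(suggest_zone(empty, marked))   # aganglionic
print(suggest_zone(empty, empty))    # background
```

A threshold above one pixel could reduce the effect of isolated false-positive pixels, such as the misclassified patches reported above, but a suitable value would need to be determined empirically.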
Discussion
Artificial intelligence (AI) and machine learning are emerging technologies that can be used to create algorithms capable of decision making [6]. The medical scientific community has been fascinated by this new opportunity. Researchers and clinicians dedicated to rare conditions foresee tools that could overcome the small size of existing case series and support robust diagnostic and therapeutic processes. Furthermore, image-based diagnosis, in both radiology and pathology, would clearly receive substantial support from AI-based assessments [7, 8]. Digital pathology, coupled with advanced digital slide scanning technology, has opened numerous possibilities for identifying various tissue types and specific target elements [9, 10]. Through the application of machine and deep learning techniques, it is now feasible to train a “computer pathologist” to recognise diverse structures according to their characteristics. However, one current limitation of fully automated pathology lies in the need for pathologist-guided delineation of specific regions within digitised slides. To achieve a diagnostically conclusive result, it has become increasingly important to blend manually guided detection with automated cellular analysis through deep learning methods. Wang et al. have highlighted the advantages of combining these approaches to mitigate issues arising from the vast amount of data or a lack of inherent understanding of histological structures [12]. Deep learning relies on extensive datasets to train neural network algorithms [11]. As the number of slides/images increases, the algorithm's capability for unsupervised cellular analysis improves, enabling it to recognise disease-specific features and patterns through learned associations [12]. Until recently, AI and machine learning technologies were predominantly applied in the field of oncology [13,14,15].
However, more recently, these systems have been introduced into the diagnostic process of rare paediatric diseases, holding great promise [16,17,18,19]. Hirschsprung disease is a rare condition belonging to the anomalies of the enteric nervous system. It occurs worldwide, with an incidence of 1 in 5000 newborns [20]. Although several aspects of the disease have been studied in depth, its aetiology and the variability in phenotype and prognosis still challenge the specialists who treat it. Guidelines for diagnosing and treating these cases are emerging from the editorial efforts of medical societies and supranational institutions, although their methodology is constrained by a low level of evidence [21, 22]. In Hirschsprung’s disease and allied disorders, the involvement of an expert in crucial phases is regarded as the best available guarantee of a correct approach. However, the definition of expertise is currently vague, and the volume of treated cases seems the only reliable parameter.
In this study, we have described the development and technical validation of a novel, supervised AI model for the evaluation of histopathologic features in the spectrum of Hirschsprung diagnosis. The primary goal of this study was to establish a “proof-of-principle” model for an HD-AI system, showing its potential as a semi-automated tool in the field of anatomic pathology that provides accurate, reproducible, quantitative assessment of various microscopic features of interest (ganglionic cells, hypertrophic nerves, normal nerves) and increases both efficiency and reporting standardisation in this specific context. Given the rare nature of Hirschsprung's disease, there have been limited attempts to harness AI for its diagnosis. Schilling et al. made an endeavour in this direction, utilising AI to diagnose HD with the aid of histological slides stained for calretinin, microtubule-associated protein 2, glucose transporter isoform 1, and S100. Their study involved 93 tissue blocks from 31 specimens of 27 patients. In their training set, they reported a sensitivity of 87.5% and specificity of 80%, while in the development set, they achieved 95% sensitivity and 90.4% specificity [18]. Our study diverges in both objectives and methodologies. First, mirroring a recent study by Greenberg et al., we exclusively employed H&E-stained slides, abstaining from the use of immunohistochemistry [23]. Second, through the application of data augmentation and segmentation techniques, our dataset surpassed 1000 images in volume, distinguishing it in terms of scale and potential.
The algorithm developed in this study demonstrates an accuracy of 92.3% for detecting ganglionic cells and 91.5% for identifying hypertrophic nerves. In the literature on Hirschsprung’s disease diagnosis, reported specificity is consistently high, typically exceeding 90%, which translates to a rarity of false-positive results. However, the incidence of false-negative results displays a wider spectrum, ranging from 0 to 40% [24]. In this context, the potential incorporation of immunohistochemistry could further enhance diagnostic accuracy. Nevertheless, it is noteworthy that at our centre, our experienced pathologist achieved a 100% detection rate for pathological markers (including hypertrophic nerves and the absence of ganglionic cells) exclusively through the examination of H&E-stained slides, without any instances of false positives. Consequently, the utilisation of this algorithm may simplify the diagnostic process and empower less-experienced pathologists to perform effectively. To the best of our knowledge, this study represents the first effort to employ two AI models for the histological diagnosis of Hirschsprung’s disease, encompassing both ganglionic cells and hypertrophic nerves. This innovation is significant because the combined use of these models enables the AI system to identify the transition zone, the area situated between the aganglionic and ganglionic zones. Notably, our research group has previously demonstrated that the length of this transitional area serves as a predictive factor for the development of post-operative HAEC (Hirschsprung-associated enterocolitis); these findings are yet to be published. One of the most significant challenges encountered in this model is the accurate detection of ganglion cells. The machine learning algorithm may occasionally misclassify immature ganglion cells as mature ganglion cells, especially when they are not in proximity to the expected context.
It is essential to note that ganglion cells are exclusively located within the submucosa or muscularis propria layers. Any cell or finding identified in any other layer, regardless of how similar it may appear, is highly unlikely to represent a genuine ganglion cell. However, in the absence of this contextual information, some findings can mimic ganglion cells, particularly immature ones, leading to potential misclassification.
To address this issue, our future applications of the algorithm will include tracking the origin of each image within its respective slide. This additional contextual information will significantly enhance the algorithm’s ability to provide a more accurate assessment by considering the specific histological layer in which the cells are located.
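One way such origin tracking could work is to store, for each patch, its position within the parent slide. The sketch below is a hypothetical illustration that assumes the 40 patches of a 2048 × 1280 slide are numbered in row-major order (left to right, then top to bottom); the function name patch_origin is not taken from the study.

```python
PATCH = 256           # patch side length in pixels
COLS = 2048 // PATCH  # 8 patches per row of the slide


def patch_origin(index: int, cols: int = COLS, patch: int = PATCH) -> tuple[int, int]:
    """Return the (top, left) pixel coordinates of a patch within its slide."""
    row, col = divmod(index, cols)
    return row * patch, col * patch


print(patch_origin(0))   # (0, 0): top-left patch
print(patch_origin(8))   # (256, 0): first patch of the second row
print(patch_origin(39))  # (1024, 1792): bottom-right patch
```

Coordinates of this kind could later be matched against an annotation of the histological layers, so that findings outside the submucosa or muscularis propria are treated with caution.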
This study has several limitations which merit mention. As stated, the available dataset was limited and significantly smaller than that of similar studies on the use of AI in pathology [25, 26]. Large datasets are considered necessary to properly represent the wide variability present in clinical samples. Smaller datasets therefore suffer both from a statistical standpoint and from excessive uniformity. Our use of data augmentation techniques somewhat circumvents this problem. Nevertheless, given the rarity of the disease, additional data, including data generated by other institutions, would allow for further validation and could improve the algorithm. Furthermore, from a technical point of view, there are various challenges that will have to be overcome. For instance, artefacts can be mistaken for ganglionic cells if many tissue layers overlap and create a “brown-like colouration”. In addition, in a machine learning approach for histological purposes, the hierarchical analysis of specific structures such as nerves and ganglion cells within the tissue slide is a fundamental aspect that significantly contributes to the accuracy and effectiveness of the AI system.
Conclusion
The results demonstrate the robustness of the AI-based U-net technique in accurately detecting ganglion cells and nerves in HD histology. Furthermore, the streamlined nature of AI-based diagnosis can significantly benefit patients. Timely and accurate diagnoses are crucial in HD, as they directly impact the planning of post-operative care. By reducing the time required for histological analysis, we can expedite treatment decisions and improve patient outcomes. Increasing data transfer speeds may enable scenarios in which an AI-based pathology assistant indicates whether and where to transfer a case for a definitive diagnosis by a human expert. Moreover, the integration of AI can also play a role in the training of pathologists. The technology serves as a valuable educational tool, allowing pathologists with a special interest in rare conditions to learn from a vast dataset of annotated cases.
Data availability
Requests for data sharing will be considered upon written request to the corresponding author.
References
Das K, Mohanty S (2017) Hirschsprung disease—current diagnosis and management. Indian J Pediatr 84:618–623. https://doi.org/10.1007/s12098-017-2371-8
Matsukuma K, Gui D, Saadai P (2023) Hirschsprung disease for the practicing surgical pathologist. Am J Clin Pathol 159:228–241
Stenzinger A, Alber M, Allgäuer M et al (2022) Artificial intelligence and pathology: from principles to practice and future applications in histomorphology and molecular profiling. Semin Cancer Biol 84:129–143
Huang W, Randhawa R, Jain P et al (2021) Development and validation of an artificial intelligence-powered platform for prostate cancer grading and quantification. JAMA Netw Open. https://doi.org/10.1001/jamanetworkopen.2021.32554
Moscalu M, Moscalu R, Dascălu CG et al (2023) Histopathological images analysis and predictive modeling implemented in digital pathology—current affairs and perspectives. Diagnostics 13:2379. https://doi.org/10.3390/diagnostics13142379
Lapuschkin S, Binder A, Montavon G, et al (2016) Analyzing Classifiers: Fisher Vectors and Deep Neural Networks. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, pp 2912–2920
Chen X, You G, Chen Q et al (2023) Development and evaluation of an artificial intelligence system for children intussusception diagnosis using ultrasound images. iScience 26:106456. https://doi.org/10.1016/j.isci.2023.106456
Hosny A, Parmar C, Quackenbush J et al (2018) Artificial intelligence in radiology. Nat Rev Cancer 18:500–510
Irshad H, Veillard A, Roux L, Racoceanu D (2014) Methods for nuclei detection, segmentation, and classification in digital histopathology: a review-current status and future potential. IEEE Rev Biomed Eng 7:97–114. https://doi.org/10.1109/RBME.2013.2295804
Linder N, Konsti J, Turkki R et al (2012) Identification of tumor epithelium and stroma in tissue microarrays using texture analysis. Diagn Pathol. https://doi.org/10.1186/1746-1596-7-22
Wang H, Cruz-Roa A, Basavanhally A et al (2014) Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features. J Med Imag 1:034003. https://doi.org/10.1117/1.jmi.1.3.034003
Madabhushi A, Lee G (2016) Image analysis and machine learning in digital pathology: challenges and opportunities. Med Image Anal 33:170–175
Shimizu H, Nakayama KI (2020) Artificial intelligence in oncology. Cancer Sci 111:1452–1460. https://doi.org/10.1111/cas.14377
Wang K, Xing Z, Kong Z et al (2023) Artificial intelligence as diagnostic aiding tool in cases of Prostate imaging reporting and data system category 3: the results of retrospective multi-center cohort study. Abdom Radiol. https://doi.org/10.1007/s00261-023-03989-9
Ramesh S, Chokkara S, Shen T et al (2021) Applications of artificial intelligence in pediatric oncology: a systematic review. JCO Clin Cancer Inform 5:1208–1219. https://doi.org/10.1200/CCI.21
Jia J, Wang R, An Z et al (2018) RDAD: a machine learning system to support phenotype-based rare disease diagnosis. Front Genet. https://doi.org/10.3389/fgene.2018.00587
Wang B, Xiao L, Liu Y et al (2018) Application of a deep convolutional neural network in the diagnosis of neonatal ocular fundus hemorrhage. Biosci Rep. https://doi.org/10.1042/BSR20180497
Schilling F, Geppert CE, Strehl J et al (2019) Digital pathology imaging and computer-aided diagnostics as a novel tool for standardization of evaluation of aganglionic megacolon (Hirschsprung disease) histopathology. Cell Tissue Res 375:371–381. https://doi.org/10.1007/s00441-018-2911-1
Mahajan A, Burrewar M, Agarwal U et al (2023) Deep learning based clinico-radiological model for paediatric brain tumor detection and subtype prediction. Explor Target Antitumor Ther. https://doi.org/10.37349/etat.2023.00159
Langer JC (2010) Hirschsprung disease. In: Fundamentals of Pediatric Surgery. pp 475–484
Kyrklund K, Sloots CEJ, de Blaauw I et al (2020) ERNICA guidelines for the management of rectosigmoid Hirschsprung’s disease. Orphanet J Rare Dis 15:164
Ambartsumyan L, Patel D, Kapavarapu P et al (2023) Evaluation and management of postsurgical patient with hirschsprung disease neurogastroenterology & motility committee: position paper of north american society of pediatric gastroenterology, hepatology, and nutrition (NASPGHAN). J Pediatr Gastroenterol Nutr 76:533–546. https://doi.org/10.1097/MPG.0000000000003717
Greenberg A, Aizic A, Zubkov A et al (2021) Automatic ganglion cell detection for improving the efficiency and accuracy of hirschprung disease diagnosis. Sci Rep. https://doi.org/10.1038/s41598-021-82869-y
Kapur RP, Reed RC, Finn LS et al (2009) Calretinin immunohistochemistry versus acetylcholinesterase histochemistry in the evaluation of suction rectal biopsies for hirschsprung disease. Pediatr Dev Pathol 12:6–15. https://doi.org/10.2350/08-02-0424.1
Wang S, Wang T, Yang L et al (2019) ConvPath: a software tool for lung adenocarcinoma digital pathological image analysis aided by a convolutional neural network. EBioMedicine 50:103–110. https://doi.org/10.1016/j.ebiom.2019.10.033
Campanella G, Hanna MG, Geneslaw L et al (2019) Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 25:1301–1309. https://doi.org/10.1038/s41591-019-0508-1
Funding
Open access funding provided by Università degli Studi di Padova within the CRUI-CARE Agreement.
Author information
Contributions
Study conception and design: MD, FFL, FU. Data acquisition: MD, AM, LS. Analysis and data interpretation: AM, FU, MD, FFL, LS. Drafting of the manuscript: MD, FFL. Critical revision: FFL, FU, LS, ADT, and PG.
Ethics declarations
Conflict of interest
None of the authors has any conflicts-of-interest to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Duci, M., Magoni, A., Santoro, L. et al. Enhancing diagnosis of Hirschsprung’s disease using deep learning from histological sections of post pull-through specimens: preliminary results. Pediatr Surg Int 40, 12 (2024). https://doi.org/10.1007/s00383-023-05590-z
DOI: https://doi.org/10.1007/s00383-023-05590-z