Introduction: Background and Scope

In 2020, 1.1 million children fell ill with tuberculosis (TB) globally. It is a leading infectious cause of morbidity and mortality in children worldwide and a major health concern, particularly in Africa and Southeast Asia, where the highest number of children are infected [1]. Despite being both curable and preventable, management of TB has been adversely impacted first by the human immunodeficiency virus (HIV) and more recently by the coronavirus pandemic.

The “2020 Global TB” report revealed that the identification of TB fell by 18% from 2019 to 2020, a drop from 7.1 million to 5.8 million estimated diagnoses. This significant reduction during the recent pandemic reflected a reduced capacity to provide adequate screening services because of lockdowns, restrictions on movement and the perceived risks of visiting health care facilities, all of which impaired access to TB testing, diagnosis and treatment [2].

Important progress has been made in improving TB diagnosis in recent decades due to the availability of sensitive bacteriological tests for diagnosis and screening. The “Xpert mycobacterium tuberculosis/resistance to rifampicin assay” is a rapid nucleic acid amplification (NAA) test that can detect Mycobacterium tuberculosis as well as resistance to rifampicin [3]. Although it is a sensitive test, it is costly and requires additional resources including staff and time allocation, especially in paediatrics where obtaining sputum samples can be challenging. Imaging, therefore, plays a critical role in diagnosing pulmonary TB in children, because of their nonspecific clinical manifestations, low bacillary load and the difficulties in obtaining suitable samples [4].

In March 2021, the World Health Organization (WHO) released consolidated guidelines for the management of tuberculosis (module 2) with the recommendation to use chest radiography. Radiography is a sensitive screening tool (pooled sensitivity 98%) [5] and, although lacking sufficient specificity to confirm a TB diagnosis, has an important role in the early detection of the disease in children. Additionally, research has demonstrated that children are at a higher risk of TB than adults, and that early detection has the potential to reduce the overall population burden of the disease when combined with TB preventive treatment [5].
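The trade-off described above, high sensitivity with limited specificity, can be made concrete with a minimal sketch. The counts below are hypothetical and chosen only so that the sensitivity matches the pooled 98% figure; they are not taken from any cited study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of children with TB whom the test flags
    specificity = tn / (tn + fp)  # share of children without TB whom the test clears
    return sensitivity, specificity


# Hypothetical counts for illustration only (not from any cited study).
sens, spec = sensitivity_specificity(tp=98, fn=2, tn=600, fp=300)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these assumed counts almost no TB cases are missed, yet a third of TB-free children are flagged, which is why an abnormal radiograph on its own cannot confirm the diagnosis.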

Unfortunately, many countries with a high TB caseload lack sufficient numbers of paediatric radiologists to interpret the increased influx of radiographic images. This creates challenges when screening high volumes of cases across a large range of countries [6].

In recent years, there has been a renewed emphasis on developing and deploying innovative sustainable solutions to deal with some of the health care resource challenges related to TB screening.

Computer-aided detection and machine learning have the potential to provide a solution to the resource and diagnostic challenges mentioned above. In the consolidated guidelines module 2, WHO updated its recommendations for TB screening to include computer-aided detection software packages to automate the interpretation of digital chest radiography images in patients older than 15 years of age, and to produce a numerical score indicating the likelihood of TB [5].

The “Stop TB” Partnership and the “Foundation for Innovative New Diagnostics” (FIND) launched an online resource centre of computer-aided detection products for the diagnosis of TB. In addition, they undertook a landscaping analysis, gathering intervention-relevant information on different computer-aided detection products for TB acquired from companies known to have developed these products [7].

Kim Y et al. [8] reviewed peer-reviewed research articles on AI in the thorax between 2015 and 2021. The review focused on how AI (specifically, deep learning) could be applied to complement aspects of the current health care system. With advances in technology and appropriate preparation of physicians, AI can address clinical problems that have not been solved due to a lack of clinical resources or technological limitations [8]. Understanding how these AI products work, their real-world applications, key performance metrics and the value proposition is essential for radiologists to consider when evaluating and reviewing the array of AI products for paediatric applications in TB diagnosis.

Chest radiography and tuberculosis

Up to 80% of reported cases of paediatric tuberculosis develop thoracic disease [9]. Chest radiography is an inexpensive tool that provides a highly sensitive method for screening patients (pooled sensitivity 98%) [5]. It is also a widely available aid in diagnosing pulmonary TB when the disease cannot be confirmed bacteriologically, and it uses a minimal amount of radiation [10].

Interobserver variability of chest radiography signs

Studies have evaluated both the diagnostic accuracy of chest radiography for TB diagnosis via the detection of specific radiographic findings and the classification of a normal/abnormal study. Studies that used the classification of normal/abnormal chest radiography have reported variable performance, from low to moderate interobserver agreement [11], and therefore assessing for specific findings on chest radiographs alongside the clinical evaluation is recommended [12].

Classic radiographic findings that may suggest TB include cavitations, nodules, consolidation, pleural effusion and enlarged mediastinal lymph nodes, the latter of which is commonly the only manifestation of the disease [13]. Most studies focusing on the detection of specific radiographic findings using imaging reported a low interobserver agreement, with widely variable kappa values from -0.03 to 0.52 [14,15,16]. Specifically, the diagnostic accuracy of mediastinal lymphadenopathy was reported as low, related to the possibility of the superimposition of structures, which did not improve despite using lateral views [14].
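The kappa values quoted above measure agreement between readers after correcting for the agreement expected by chance. A minimal sketch of Cohen's kappa for two readers follows; the reads are invented for illustration, and the cited studies may have used other variants (for example, weighted or multi-reader kappa).

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers labelling the same radiographs."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must label the same non-empty set")
    n = len(reader_a)
    # Observed agreement: fraction of radiographs given the same label.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's label frequencies, summed.
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical reads: 1 = lymphadenopathy reported, 0 = not reported.
reader_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
reader_b = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
kappa = cohens_kappa(reader_a, reader_b)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance; slightly negative values, such as the -0.03 reported above, arise when readers agree less often than chance alone would predict.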

Recently, WHO published updated recommendations with important alterations on how to use and interpret chest radiography to aid TB treatment and management [17, 18]. The proposed changes aim to resolve some limitations, especially those related to the detection of lymph nodes, by focusing on signs that are more readily detected and show better inter-reader agreement. This allows affected children to be classified into severity groups, some of which might benefit from shorter regimens and fewer drugs.

The challenging landscape of paediatric TB detection using chest radiographs highlights the critical roles played by image quality and interpretation [19].

Several studies report the limitations of non-radiologists interpreting chest radiographs in children [20]. Given the worldwide shortage of radiologists, especially paediatric radiologists [6], this poses a problem particularly in countries with a high burden of TB.

Effectiveness of chest radiography as a screening tool

Chest radiography as a screening tool was studied by Huang et al. [21] in a cohort of 4,468 children exposed to TB. They evaluated the protective efficacy of isoniazid preventive therapy in asymptomatic children with and without abnormal chest radiographs. They showed that exposed asymptomatic children with abnormal chest radiographs were 25.1-fold more likely to have co-prevalent TB and 26.7-fold more likely to be diagnosed with TB during follow-up than exposed asymptomatic children with normal chest radiographs [21]. This finding is important and highlights that there is a large portion of children with subclinical TB who do not show classic signs and are going undiagnosed [22]. Detecting subclinical TB provides an opportunity for health care professionals to deliver care early in the disease history and may limit the progression and risk of post-TB sequelae and extensive lung damage.

According to this evidence, WHO guidelines recommend chest radiography as a screening tool that includes symptom screening for children in all age groups as a part of systematic evaluation processes/protocols [17]. It is not recommended that chest radiography is used for follow-up/evaluating improvement; this must be done through monitoring weight and height as well as clinical evaluation [17].

Artificial intelligence for the diagnosis of tuberculosis from chest radiographs

In recent years, AI and computer-aided detection software have been developed to augment and automate the interpretation of digital chest radiography in TB screening [7]. A literature search for publications on AI in TB imaging, using search terms related to AI, TB and thoracic imaging, yielded 110 results, of which 21 were published in the last 5 years (Table 1) [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43]. It is important to note that all but 2 of the 21 articles [32, 37] included studies where the algorithms were validated on data sets with a population age group > 15 years old. This highlights the lack of clinical studies of AI in the niche area of paediatric imaging.

Table 1 Summary of included publications

Artificial intelligence is the process of making a computer think and learn through programming, training and testing. As a subset of AI, machine learning uses statistical methods to enable machines to improve with experience [44]. Deep learning is a subset of machine learning that is used in circumstances in which a very large amount of data must be processed. These deep learning networks rely on multilayered configurations called artificial neural networks to process data [45] (Fig. 1).

Fig. 1 A simplified view of how artificial neural networks function. Although 2 layers are shown, hundreds or thousands of layers are used

These artificial neural networks, often hundreds of layers deep, can train themselves from large data sets to make accurate predictions on newly input unknown data.
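To illustrate how data flow through the layers of such a network, the following is a minimal sketch of a forward pass through two fully connected layers. The weights are arbitrary, hand-picked values rather than trained ones, the three-value input is a stand-in for image pixels, and the names `dense` and `forward` are hypothetical; real systems learn millions of weights from data and use dedicated libraries.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weighted sum + bias) per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Arbitrary untrained weights, for illustration only.
HIDDEN_W = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]  # 3 inputs -> 2 hidden units
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, -1.0]]                          # 2 hidden units -> 1 output
OUTPUT_B = [0.0]


def forward(pixels):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense(pixels, HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B, sigmoid)[0]


score = forward([0.2, 0.9, 0.4])
print(f"abnormality score = {score:.3f}")
```

Training consists of adjusting the weights and biases so that the output score matches the labels of known examples; the sketch shows only the prediction step.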

Machine learning and deep learning algorithms have been developed to improve workflows in radiology or to assist the radiologist by automating tasks such as lesion detection (Fig. 2) or medical imaging quantification.

Fig. 2 a, b AP chest radiograph in a 2-year-old girl with pulmonary tuberculosis. a Before use of the artificial intelligence (AI) tool. b The AI tool identifies lymphadenopathy (two smallest bounding boxes) and consolidation (two largest bounding boxes) and measures the cardiothoracic ratio (horizontal lines)

Lopez Garnier et al. [46] in 2019 summarised data on computer-aided detection type, study design and diagnostic accuracy. The authors included 53 of the 4,712 articles reviewed: 40 focused on computer-aided detection design methods (development studies) and 13 focused on evaluation of computer-aided detection. Of the 40 development studies, 7 (17%) used deep learning methods while the remaining 33 (83%) used machine learning approaches. The authors concluded that AI-based computer-aided detection programs are promising, but more clinical studies are needed that minimise sources of potential bias to ensure validity of the findings outside of study settings [46].

Triage and worklist prioritisation are important applications of AI in thoracic radiology and may have clinical relevance for TB screening, where a timely diagnosis is critical with such a highly transmissible infection [8]. Of course, this relies either on accurate AI image review, if the triage tool is to prioritise abnormal radiographs for urgent reporting, or on a highly specific AI tool using natural language processing (NLP) that can review the clinical indication for suspicion of TB and bring this to the attention of the radiologist.

In the last few years, several studies have been published on the use of AI in diagnosing pneumonia and TB on chest radiographs; however, the scientific literature that includes the performance measures of AI products in TB intended to be used in the paediatric population is restricted and, further, there are very few existing adult applications that could be applied to the paediatric population [47].

The study by Mouton, Pitcher and Douglas [48] was the first to examine AI detection of abnormalities in paediatric chest radiographs in a population suspected of a high incidence of TB. The system reached reasonable performance with an area under the curve (AUC) of 0.78 for correctly identifying abnormal regions in the image.
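For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case. The sketch below computes it by direct pairwise comparison on invented scores; it illustrates the metric in general and is not the evaluation procedure used in the study above.

```python
def auc(scores, labels):
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly.

    Ties between an abnormal and a normal score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one abnormal and one normal case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical algorithm scores; label 1 = abnormal, 0 = normal.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
print(f"AUC = {example_auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of abnormal from normal cases.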

Schalekamp et al. [47] reviewed 40 CE (Conformité Européenne)-marked commercial software packages and discussed their performances in different chest findings and diseases in paediatrics, where it was found that none of these products was specifically designed for that population. The intended use of most of the current AI products in radiology with CE certification is more tailored to adult screening.

Hwang et al. [49] developed a deep learning-based automated detection (DLAD) algorithm and compared radiology and non-radiology physician performance in image interpretation for the detection of active TB with and without AI assistance using six independent external multicentre test data sets. They concluded that both non-radiology physicians and board-certified radiologists showed improvements in sensitivity with the assistance of DLAD thus highlighting the potential of AI as a second reader [8, 28, 49].

AI as a double reader may replace the staffing needed for double radiologist interpretation and consensus and may increase the value radiologists are able to provide to their patients.

Computed tomography (CT) scans are not always accessible in high TB burden countries, but in countries where health care centres are better resourced with imaging equipment, these scans can aid radiologists in the diagnosis of suspected TB cases when chest radiographs are either inconclusive or in cases where there is clinical and imaging progression.

CT scans can provide additional information to guide diagnosis, monitor imaging changes and evaluate the severity of pulmonary TB.

Yan et al. [41] performed a retrospective, multicohort, diagnostic study where they developed an AI cascading model for fully automated diagnosis and triage of pulmonary TB based on the CT scans of 526 participants. Overall accuracy of 6 pulmonary critical imaging findings indicative of TB in the independent datasets was 81.1–91.1%. Spearman correlation analysis was used to assess the correlation between the radiologist-estimated CT score and the TB score determined by the algorithm, which was shown to be moderate to good (r = 0.453–0.761) [41].
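Spearman correlation, as used in the comparison above, measures how well two sets of scores agree in rank order rather than in absolute value. A minimal sketch with invented scores follows; the variable names and values are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical radiologist CT scores and algorithm TB scores for five patients.
radiologist_ct_score = [2, 5, 7, 10, 12]
algorithm_tb_score = [0.10, 0.35, 0.30, 0.80, 0.95]
rho = spearman(radiologist_ct_score, algorithm_tb_score)
print(f"Spearman r = {rho:.2f}")
```

In this invented example only two patients are ranked in a different order by the algorithm, so the correlation remains high.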

Challenges and future directions of artificial intelligence in tuberculosis imaging

The challenges faced by using AI in TB imaging, particularly for the paediatric population, have many similarities to those of using AI in other diseases, namely a lack of diverse training data, lack of external validation of AI models, the possibility of bias, questionable reference standards (such as human reader opinion on radiographic diagnosis rather than correlation with microbiological reference) and real-world implementation data [50]. Future directions of AI for TB imaging, especially in children, will therefore need to focus on these aspects.

Training data for AI in TB imaging should include imaging from multiple centres, acquired on equipment and modalities from different manufacturers. This may also involve acquiring digital photographs of chest radiographs (either from radiographic film on a light box or from computer screens). This is important given that some AI tools are being developed as smartphone applications, bypassing any need for local information technology (IT) expertise in rural centres where picture archiving and communication systems (PACS) software may not be available, or where IT specialists are lacking.

Adapting AI for TB diagnosis to other settings, and to photographs of chest radiographs, presents challenges. For example, Becker et al. [51] reported an AUC of 0.82 for their deep learning model in the localisation of pathological areas on chest radiograph photographs, although specific diagnostic labels were still challenging in their small data set of 138 adults. Adapting a deep learning tool to interpret photographs of children's chest radiographs for TB would be a useful area for future research and could build on existing tools rather than starting from scratch. Researchers are attempting computational methods for recalibration of existing AI models for chest radiographic interpretation to allow for improved accuracy from smartphone-acquired photographs [52].

Bias has been highlighted in some AI articles as contributing to poor performance [53]. This is of particular concern when AI models are applied to a wider population than they have been adequately trained to make diagnoses on (e.g., using a model trained on adults for use in children). As an example, Harris et al. [50] conducted a systematic review of diagnostic accuracy of AI-based computer programs to analyse chest radiographs for pulmonary tuberculosis. They found there was a risk of bias and higher mean AUC in development studies that used chest radiography databases compared to clinical studies [50].

Training without bias is critically important and requires that the initial data are free of any external influences that might cloud the information, such as demographic information or additional medical information that may suggest a certain diagnosis. Elimination/reduction of bias also requires that the training dataset is sufficiently large to encompass a variety of patients.

Although this article has outlined AI research for paediatric pneumonia [54, 55], only one commercial product (CAD4TB v6) [56] is licensed to interpret chest radiographs for use in children older than 4 years of age (others are for adults and adolescents). The future of this field is likely to see more emerging commercial solutions and products designed for the paediatric population.

In the future, randomised controlled trials implementing AI models across several centres and comparing outcomes with those of centres that do not employ the AI models will be vital in understanding the patient benefit and efficiencies that AI might bring. This is not only important to determine cost-effectiveness across differing sites but also to understand whether there are certain conditions that would be more optimally suited for AI triage. Consideration should also be given to whether diagnostic thresholds require amending between sites with respect to different TB and HIV prevalence levels and patient ages. Such evidence would be helpful for raising funds to provide this standard of care and to enable greater access to emerging digital technologies. Eventually, as more robust AI models are developed across modalities other than radiography, it may be possible to use AI for point of care ultrasound in TB, for example [57].
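One reason thresholds may need to differ between sites is that, for a fixed sensitivity and specificity, the proportion of positive results that are true positives depends strongly on local prevalence. The sketch below applies Bayes' theorem to illustrate this; the sensitivity, specificity and prevalence values are assumptions chosen for illustration, not figures from any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(TB | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(no TB | negative result)
    return ppv, npv


# Assumed test characteristics and site prevalences, for illustration only.
for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(0.90, 0.70, prevalence)
    print(f"prevalence={prevalence:.0%}: PPV={ppv:.1%}, NPV={npv:.1%}")
```

Under these assumptions the same tool yields far more false alarms per true case at a low-prevalence site, so a threshold tuned at one site may not transfer to another without recalibration.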

Recent publications have reported key findings on chest ultrasound in TB [58,59,60], which could be highlighted by AI thus providing a point of care solution and diagnosis for less experienced users who may not know the significance of such appearances.

Artificial intelligence software could even be combined with a portable paediatric hands-free ultrasound device that remotely provides point of care diagnoses for other cardiothoracic conditions [61]. Such a device has been recognised by WHO as an innovative health technology for low resource settings [62] and a prototype is being developed.

The epidemiology of TB has been adversely impacted by HIV. Children with HIV often present without the classic imaging features of TB [63]. Understanding these features and training an algorithm to detect the pertinent signs and triage these patients can provide a solution in identifying the many patients who are missed in screening programmes.

Similarly, through AI analysis of historical data and of the pathologies and signs on chest radiographs and other imaging modalities, immune reconstitution inflammatory syndrome (IRIS) can be identified and highlighted earlier. IRIS is defined as clinical deterioration after initiation of antiretroviral therapy and manifests radiologically as worsening or new hilar or mediastinal lymphadenopathy with or without tracheobronchial compression, worsening or new reticular/nodular infiltrates, worsening or new air space consolidation or pleural effusion.

Finally, without prospective, real-world testing of the AI models, we lack a true understanding of how these tools can aid patient outcomes and impact clinical workflow. This is not unique to TB imaging and relates to other avenues of AI research. One study by Philipsen et al. [64] performed a simulation test to determine the cost-effectiveness of using software (CAD4TB v3.07, Nijmegen, Netherlands) to pre-screen chest radiographs in adults with suspected TB. The software assigned each chest radiograph a score between 0 (normal) and 100 (highly abnormal) to determine which patients should undergo further, more expensive molecular testing for TB. They found that by adopting an optimal threshold score of 85, they would only require 40% of patients to undergo the molecular testing and overall costs per screened subject and costs per notified TB case were reduced by more than 50%. How generalisable this tool and such a workflow may be for paediatric cases is yet to be determined [45, 65].
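The pre-screening workflow described above can be sketched as a simple threshold rule followed by a cost calculation. Everything in the example is an assumption made for illustration: the scores, the unit costs, the function names, and the choice to refer scores equal to or above the threshold. It does not reproduce the cited simulation.

```python
def triage(scores, threshold):
    """Return indices of radiographs whose score is at or above the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def cost_per_screened(n_screened, n_referred, cad_cost, molecular_cost):
    """Average cost per person when only referred patients get molecular tests."""
    return (n_screened * cad_cost + n_referred * molecular_cost) / n_screened


# Hypothetical software scores on a 0 (normal) to 100 (highly abnormal) scale.
scores = [12, 91, 40, 88, 67, 85, 23, 95, 5, 72]
referred = triage(scores, threshold=85)

# Hypothetical unit costs in arbitrary currency units.
cost_with_triage = cost_per_screened(len(scores), len(referred),
                                     cad_cost=1.0, molecular_cost=10.0)
cost_test_everyone = 10.0  # molecular test for every patient, no pre-screening

print(f"referred {len(referred)} of {len(scores)} patients")
print(f"cost per screened subject: {cost_with_triage} vs {cost_test_everyone}")
```

Lowering the threshold refers more patients and raises cost, while raising it risks missing cases, which is the trade-off an optimal threshold seeks to balance.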

Conclusion

Medical applications of AI are becoming increasingly important. The “End TB Strategy” was adopted by WHO with the aim of reducing TB mortality by 90% and TB incidence by 80% by 2030. This emphasises the need to ensure early and correct diagnosis and adequate treatment for people with TB.

An AI tool to assist with the diagnosis, triage and treatment of large numbers of patients with a highly transmissible infection will be transformative and will help us solve some of the challenges in combatting TB. However, more real-world AI applications and multicentre clinical studies are required in paediatric imaging to ensure that the strategy and goals set out to end TB in children are achievable.