Introduction

Rationale

The temporomandibular joint (TMJ) is a unique structure: it is a bilateral diarthrodial synovial joint whose two sides function as a unit. Three-dimensional (3D) technology provides improved visualization of the TMJ structures. Magnetic resonance imaging (MRI) is primarily used for evaluating the TMJ’s soft tissue [1], whereas computed tomography (CT) and cone-beam computed tomography (CBCT) are primarily employed for assessing the hard tissue. Indeed, CT and CBCT enable a more precise analysis of the TMJ’s osseous components than was previously possible [1,2,3,4]. It is worth noting, however, that CBCT exposes patients to less radiation than CT. Studies have reported that high-resolution images can achieve excellent accuracy in TMJ examinations [5,6,7].

Recently, the American Academy of Oral and Maxillofacial Radiology and the American Academy of Orofacial Pain issued recommendations regarding TMJ imaging. According to these recommendations, “maxillofacial CBCT allows for evaluating the osseous and dental hard tissue components. However, due to its limited ability to distinguish soft tissue details, CBCT is inadequate for assessing the necessary information required for diagnosing and managing patients with temporomandibular disorders (TMDs)” [8].

A wide range of methods has been proposed to evaluate various aspects of the TMJ, including the dimensions, inclinations, positions, surface areas, and volumes of the mandibular condyles, as well as the dimensions, inclinations, and volumetric parameters of the glenoid fossa and TMJ spaces. However, the sheer number of these methods has complicated TMJ analyses. These techniques have been applied to patients with TMDs, to individuals with normal TMJs, and to assess the effects of specific interventions on TMJ morphology and dimensions. Unfortunately, the findings derived from these studies have been inconsistent and even contradictory [9,10,11,12,13,14]. These discrepancies can be attributed to a lack of methodological consensus. For instance, while the imaging is 3D, the TMJ measurements are often taken from slice sections, resulting in two-dimensional (2D) data. Furthermore, there is considerable variability in the TMJ measurements used for assessing the osseous structures, and there is wide diversity in the selected study samples.

To date, there has been neither a comprehensive systematic review assessing the reliability and comprehensiveness of CT- or CBCT-based osseous measurements of the TMJ, nor an examination of the extent to which the existing methodologies are valid for evaluating this intricate structure.

Objectives

This systematic review aimed to appraise the reliability and comprehensiveness of CT or CBCT-based methods employed for 3D positional and morphological assessment of the TMJ. Additionally, the review aimed to propose a standardized method for such assessments.

Methods

Protocol registration

The study protocol was registered with PROSPERO under registration number CRD42020199792 and was conducted following the guidelines outlined in the Cochrane Oral Health Group’s Handbook for Systematic Reviews (http://ohg.cochrane.org).

PICOS question and eligibility criteria

The inclusion criteria for this study comprised observational studies, primarily descriptive (either retrospective or prospective), that utilized CT or CBCT for imaging purposes in adult human subjects. These studies evaluated the reliability and comprehensiveness of osseous measurements in the TMJ. Interventional studies, case reports, case series, literature reviews, systematic reviews, opinion articles, and book chapters were excluded, as were studies involving populations other than normal adult humans (such as cadavers, animals, growing patients, and individuals with craniofacial anomalies, TMDs, trauma to the temporal or temporomandibular region, or a history of surgical intervention in the TMJ or surrounding area). Additionally, records focusing on other imaging methods or outcomes were excluded.

Information sources, search strategy, and study selection

In August 2020, three co-authors (BS, LM, and SJ) conducted an independent and thorough search across six search engines/databases, namely PubMed, Scopus, Science Direct, Web of Science, Cochrane, and LILACS. The search was further complemented by a manual examination of the reference lists of the included studies. The same three co-authors subsequently updated the search in September 2022.

The retrieved records were entered into an Excel sheet (version 2010) to identify and remove duplicates. Subsequently, the titles and abstracts of the remaining records were screened to determine their potential for inclusion, and irrelevant studies were excluded. The full texts of the remaining studies were thoroughly reviewed, and any studies deemed irrelevant were eliminated. Two co-authors (AA and SP) carried out these steps independently. All co-authors independently assessed the potentially included studies to ensure they met the predefined inclusion criteria. In the event of disagreements, a discussion was initiated to reach a final consensus. The reporting of this systematic review adheres to the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15].

Data collection

Using a pre-designed template, three co-authors (BS, LM, and SJ) independently extracted the necessary data. In cases where doubts or uncertainties arose, a fourth co-author (MA) was consulted for resolution. The extracted data encompassed two primary categories: CBCT parameters and the demographic, qualitative, and quantitative characteristics of the included studies. The findings are organized and presented in Tables 1 and 2.

Outcome assessment

The parameter outcomes encompassed measurements of the three primary osseous components: the mandibular condyle, glenoid fossa, and TMJ spaces. Supplementary Material I provides detailed definitions of these parameters.

Risk of bias/quality assessment of the included studies

The assessment of the risk of bias was conducted independently by three co-authors (EH, AA, and MA) using a modified checklist derived from previous studies [16,17,18]. Any disagreements that arose during the assessment were resolved through discussion among the co-authors. The checklist consisted of 18 items that examined the methodological robustness of the study sample and the data analysis. The maximum achievable score on the checklist was 37. Based on their scores, the studies were categorized as having a high, medium, or low risk of bias if they fell below 18, between 19 and 27, or between 28 and 37, respectively. Further details can be found in Table 3.
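The scoring bands described above can be illustrated with a minimal sketch. The function name `risk_of_bias_category` is hypothetical. Because the stated bands leave a score of exactly 18 unassigned, the sketch assumes that such a score falls in the high-risk band; this is an assumption made only for illustration and is not stated in the checklist.

```python
MAX_SCORE = 37  # maximum achievable score on the 18-item checklist


def risk_of_bias_category(score: int) -> str:
    """Map a checklist score to a risk-of-bias category.

    Bands follow the text: below 18 = high, 19-27 = medium, 28-37 = low.
    Assumption: a score of exactly 18 is treated as high risk, because
    the stated bands do not cover it.
    """
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 0 and {MAX_SCORE}")
    if score <= 18:
        return "high"
    if score <= 27:
        return "medium"
    return "low"


# Hypothetical scores, used only to show the three bands.
for example_score in (15, 23, 31):
    print(example_score, risk_of_bias_category(example_score))
```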

The comprehensiveness of the methods used in the included studies

The term “comprehensiveness” refers to several aspects, including the view utilized for TMJ measurements (multiplanar or volumetric), the assessed TMJ components, and the variation in measurement types (linear, angular, surface area, or volumetric variables). To evaluate the comprehensiveness of the measurement methods employed for TMJ osseous components, a checklist comprising 22 items was developed. The maximum achievable score on this checklist was 33. One item focused on the reference plane or line used (maximum score of 4). Eight items addressed condylar measurements (maximum score of 12), while another eight items covered glenoid fossa measurements (maximum score of 12). The remaining five items pertained to joint space measurements (maximum score of 5) (Table 4).
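As a quick consistency check, the item counts and maximum scores of the four checklist sections can be tallied. The sketch below uses only the figures given in the paragraph above; the dictionary name and section labels are illustrative.

```python
# (number of items, maximum score) per checklist section, as given in the text
CHECKLIST_SECTIONS = {
    "reference plane or line": (1, 4),
    "condylar measurements": (8, 12),
    "glenoid fossa measurements": (8, 12),
    "joint space measurements": (5, 5),
}

total_items = sum(items for items, _ in CHECKLIST_SECTIONS.values())
total_score = sum(score for _, score in CHECKLIST_SECTIONS.values())

print(total_items, total_score)  # 22 33
```

The totals agree with the 22 items and the maximum score of 33 stated for the checklist.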

Analyses

The data were subjected to qualitative analysis. Due to significant inconsistencies in reporting the outcomes of interest across the included studies, quantitative analysis (meta-analysis) was not feasible.

Results

Study selection

The PRISMA flowchart (Fig. 1) illustrates the study selection process. Initially, 2567 records were retrieved, of which 684 were duplicates and were excluded. After screening the remaining 1883 records based on titles and abstracts, 1827 were deemed irrelevant to the review question and excluded. The full texts of the remaining 56 studies were read, leading to the exclusion of 42 studies. Ultimately, 14 studies were included in the qualitative synthesis. Notably, none of these studies utilized CT.

Fig. 1 PRISMA diagram of article retrieval
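The record counts reported for the selection process can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
retrieved = 2567
duplicates = 684
excluded_on_title_abstract = 1827
excluded_on_full_text = 42

# Records remaining after each stage of the selection process
screened = retrieved - duplicates
full_text_reviewed = screened - excluded_on_title_abstract
included = full_text_reviewed - excluded_on_full_text

print(screened, full_text_reviewed, included)  # 1883 56 14
```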

Study characteristics

Tables 1 and 2 provide details regarding the characteristics of the 14 included studies. These studies were published between 2011 and 2022. No studies utilizing CT were identified; all included studies employed CBCT as the imaging modality, using various machines and equipment. The Next-generation i-CAT CBCT and KaVo 3DeXam CT systems were the most frequently reported machines. The CBCT settings are outlined in Table 1. There was substantial diversity in the reporting of these settings across the studies. While three studies [19,20,21] did not provide basic details, the remaining studies presented most of the setting parameters, which are also included in Table 1.

Table 1 The parameters of the used CT or CBCT machines in the included studies
Table 2 The demographic, qualitative and quantitative characteristics of the included studies

The sample sizes of the included studies varied, ranging from 29 participants [22] to 300 participants [23]. All participants were aged 18 years or older, and both genders were included in most studies. Linear measurements were reported in all of the included studies, while angular measurements were reported in all studies except three [24,25,26]. Four studies reported surface area measurements [20,21,22, 27], and five reported volume measurements [21, 24, 25, 28, 29]. Condylar parameters were reported in all of the included studies; fossa parameters were reported in all studies except two [21, 27], and joint space parameters in all studies except two [22, 26]. There was variability in the software programs used, with eight different programs utilized across the 14 studies. The most commonly used software was Anatomage [19, 20, 27, 28], followed by Dolphin [23, 30]. The reliability test employed in all of the included studies was the intra-class correlation coefficient (ICC). Additionally, one study incorporated a paired t-test [22], and another study utilized the kappa test [26]. Further details can be found in Table 2.

Quality assessment (risk of bias)

Table 3 presents the risk of bias for the included studies. Four studies exhibited a high risk of bias [22, 24, 26, 29], while seven showed a moderate risk [20, 21, 23, 25, 28, 31, 32]. Only three studies [19, 27, 30] were found to have an overall low risk of bias. The main shortcomings observed in most studies included the absence of blinding in measurements, inadequate documentation of the examiners’ experience or professional degree, low inter-examiner agreement, limited inclusion of cases in the reliability analysis, and inadequate presentation of the reliability analyses.

Table 3 Quality assessment tool of the included studies

Characteristics of the measurements and examiners

Eight of the included studies [19,20,21, 23, 27, 28, 30, 32] reported that measurements were conducted by two examiners, while five studies [22, 24, 26, 29, 31] mentioned the involvement of only one examiner. One study [25] did not provide information on this aspect. Regarding intra- or inter-examiner reliability analysis, the measurements were repeated twice in most studies, except for one study [25], where they were repeated three times. The time intervals between repeated measurements varied across studies, with a 1-week interval in two studies [22, 25], a 2-week interval in nine studies [19,20,21, 24, 27,28,29,30, 32], a 3-week interval in two studies [23, 31], and a 1-month interval in one study [26]. It is noteworthy that only one study [31] explicitly mentioned that examiners were blinded during the measurement process, and only three studies [21, 24, 30] reported the experience or qualifications of the examiners. Interestingly, six studies [19, 20, 27, 28, 30, 32] re-examined all cases for reliability, while five studies [21, 22, 26, 29, 31] re-examined between 25 and 50% of the cases. In contrast, only three studies [23,24,25] reported re-examining less than 20% of the total included cases. Additionally, only two studies [26, 32] reported pre-calibrating the examiners. For more detailed information, refer to Supplementary Material II.

Supplementary Material III provides a detailed account of the reliability analysis reporting. Seven studies [22, 24,25,26, 29, 31, 32] did not conduct inter-examiner reliability assessments. Meanwhile, four studies [20, 21, 23, 30] performed inter-examiner reliability assessments but did not specify the exact values; they only mentioned the overall reliability. Two studies [19, 27] presented a comprehensive analysis of inter-examiner reliability for most of the included measurements.

Table 4 Checklist for the validity and comprehensiveness of the methods used in assessment of the TMJ osseous components

Regarding intra-examiner reliability analysis, 11 studies [20,21,22,23,24,25,26, 29,30,31,32] conducted it but did not report the exact values; they only provided information about the overall reliability. Two studies [19, 27] furnished a detailed intra-examiner reliability analysis for most of the measurements included in the study. One study presented intra- and inter-examiner reliability analyses for the landmark coordinates rather than for the measurements [28].

The comprehensiveness of the methods used in the included studies

Table 4 displays the comprehensiveness of the measurement methods employed to assess the osseous components of the TMJ. These methods can be categorized into four main groups: (1) View(s) and reference(s) utilized, which involved either a multiplanar view (sagittal, coronal, or axial) or a 3D view. In the 3D view, landmarks were identified in the frontal, horizontal, and midsagittal planes, with the three coordinates serving as the reference points. (2) Measurements of the mandibular condyle. (3) Measurements of the glenoid fossa. (4) Measurements of the joint spaces.

Among the included studies, eight [21, 23,24,25,26, 30,31,32] performed nearly all measurements using multiplanar views. The remaining six studies [19, 20, 22, 27,28,29] employed a 3D view to identify landmarks and utilized the frontal, horizontal, and midsagittal planes as references for their measurements.

Among the 12 predetermined condylar measurements, three studies [19, 20, 27] reported 11 measurements, while one study [28] reported ten measurements. The remaining studies [21,22,23,24,25,26, 29,30,31,32] reported between two and seven measurements. Regarding the 12 predetermined glenoid fossa measurements, three studies [19, 20, 27] reported 11 measurements, one study [28] reported nine measurements, and one study [22] reported eight measurements. The remaining studies [21, 23,24,25,26, 29,30,31,32] reported between 0 and 6 measurements. Out of the five predetermined joint space measurements, only one study [28] reported all of them. Four studies [19, 20, 27, 31] reported all the measurements but excluded the total joint space volume. The remaining studies [21,22,23,24,25,26, 29, 30, 32] reported between 0 and 3 measurements.

Out of a total score of 33 points on the checklist used, three studies [19, 20, 27] scored 30, while one study [28] scored 28. The remaining studies scored below 60% of the overall score.
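To put these scores in proportion, they can be expressed as percentages of the maximum. The following sketch uses only the scores reported above; the helper name `percent_of_max` is hypothetical.

```python
MAX_SCORE = 33  # maximum achievable score on the comprehensiveness checklist


def percent_of_max(score: float, max_score: float = MAX_SCORE) -> float:
    """Return a checklist score as a percentage of the maximum score."""
    return 100 * score / max_score


print(round(percent_of_max(30), 1))  # 90.9
print(round(percent_of_max(28), 1))  # 84.8

# Number of points corresponding to the 60% threshold
print(round(0.60 * MAX_SCORE, 1))  # 19.8
```

Under this arithmetic, a study scoring below 60% obtained fewer than 19.8 of the 33 points.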

Discussion

There is a growing interest in measuring the TMJ in various contexts: observational, to provide details on specific populations or traits; and interventional, to assess the effects of specific treatments. At the interventional level, studies have evaluated the effects of different orthodontic or orthognathic surgical interventions on the positional and morphological features of the TMJ. These interventions included the use of fixed appliances with extraction therapy [10, 33] or removable functional appliances [9, 34], distalization mechanics, maxillary arch expansion therapy [35], and orthognathic surgery [36, 37]. Furthermore, the effects of different prosthetic interventions, such as dental implants and full mouth rehabilitation [38], as well as other minor restorative procedures [39], have been studied to evaluate their impact on the TMJ.

At the observational level, numerous studies have assessed differences in various anteroposterior skeletal malocclusions [20, 26, 28, 40], vertical facial patterns [19, 30, 41], and transverse discrepancies [5, 23]. Furthermore, many studies assessed the TMJ differences between patients with TMDs and those without TMDs [12, 42]. However, significant discrepancies have been observed among these studies. Some of these discrepancies can be attributed to the lack of standardization in the methodologies employed, such as the use of 2D- versus 3D-based assessment methods, despite the fact that the imaging technique of interest (CBCT) inherently provides a 3D view. Other discrepancies may be due to variations in the reliability of measurements and the parameters assessed: Did they comprehensively or partially represent the TMJ? Therefore, this systematic review aims to appraise the reliability and comprehensiveness of methods used for 3D positional and morphological assessment of the TMJ, utilizing CT and CBCT, and to recommend a standardized approach for the same purpose.

Several important technical considerations regarding CBCT imaging should be emphasized. Image resolution largely depends on voxel size, with smaller voxel sizes providing higher image resolution [43]. The selection of voxel size should align with the study’s specific objectives. In the studies reviewed, voxel sizes in CBCT ranged from 0.3 to 0.5 mm. Given that the TMJ is a delicate and complex region, imaging with a smaller voxel size is preferable [43, 44]. Among the included studies, five [25,26,27,28, 32] utilized a voxel size of 0.3 mm, while the other studies either used larger voxel sizes or did not specify the voxel size. For high-quality images of the TMJ, it is recommended to use a voxel size no larger than 0.3 mm, especially when a large field of view is employed [43].

Another important technical aspect in CBCT imaging is slice thickness, as smaller thicknesses retain more details while larger thicknesses may result in a loss of details. For imaging the TMJ, a recommended slice thickness is 1 mm or smaller [43, 44]. Surprisingly, only five of the included studies [22,23,24, 27, 30] mentioned the slice thickness used, and it is worth noting that only two of them [22, 27] adhered to the recommended thickness.
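The two acquisition recommendations discussed above (a voxel size no larger than 0.3 mm and a slice thickness of 1 mm or smaller) can be expressed as a simple check. This is a minimal sketch; the function name `check_acquisition` and the example values are hypothetical and do not correspond to any specific included study.

```python
from typing import Optional

MAX_VOXEL_MM = 0.3            # recommended upper limit for voxel size
MAX_SLICE_THICKNESS_MM = 1.0  # recommended upper limit for slice thickness


def check_acquisition(voxel_mm: Optional[float],
                      slice_mm: Optional[float]) -> dict:
    """Compare reported CBCT settings with the recommended limits.

    None means the setting was not reported; it is flagged as
    "not reported" rather than being treated as a failure.
    """
    def judge(value: Optional[float], limit: float) -> str:
        if value is None:
            return "not reported"
        if value <= limit:
            return "meets recommendation"
        return "exceeds recommendation"

    return {
        "voxel_size": judge(voxel_mm, MAX_VOXEL_MM),
        "slice_thickness": judge(slice_mm, MAX_SLICE_THICKNESS_MM),
    }


# Hypothetical example: 0.4 mm voxels, slice thickness not reported
print(check_acquisition(0.4, None))
```

Keeping "not reported" as a separate outcome reflects the observation that several studies omitted these settings altogether.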

The TMJ is a complex structure, both mechanically and biologically, where changes occurring in one component of the TMJ can impact the others. Movement of the mandibular condyle in any direction typically results in corresponding changes in the surrounding joint spaces and, at times, bone remodeling in the mandibular fossa encompassing this synovial joint. Furthermore, movement on one side is often accompanied by either parallel or opposite movement on the other side [45]. Consequently, when radiographically assessing the TMJ, it is essential to include the three main components: the mandibular condyle, glenoid fossa, and joint spaces. For this systematic review, studies were included on the condition that they assessed at least two TMJ components. Eight of the included studies [19, 20, 23, 27, 28, 30,31,32] assessed all three TMJ components, while the remaining six studies [21, 22, 24, 25, 26, 29] focused on two components. Depending on the specific aim(s) of a study, certain TMJ component(s) can be selected for evaluation as the main outcome(s). However, it is recommended to evaluate all TMJ components (at least as secondary outcomes) due to the biomechanical interactions that occur among them.

The quality of the included studies was assessed using a checklist adapted from previously published research [16,17,18], with modifications made to make it more flexible and aligned with the objectives of the current systematic review. These modifications encompassed aspects such as the head orientation in 3D analysis, condylar orientation in sectional-based analysis, number of TMJ components included, diversity of measurement types (linear, angular, surface area, and/or volumetric measurements), experience and professional qualification of the examiners, percentage of cases included in the reliability analysis, and presentation of reliability results for the assessed variables. By applying this rigorous checklist, only three studies were identified as having a low risk of bias [19, 27, 30].

Reliability pertains to the accuracy of the obtained data and the extent to which any measuring tool controls random errors, which is crucial for ensuring the validity of the research. However, reliability alone is not sufficient [46]. Reliability analysis is necessary for any quantitative measurements where subjectivity may be a factor. It is recommended that two pre-calibrated examiners independently conduct the measurements, and both intra- and inter-examiner reliabilities should be assessed on at least 25% of the sample [47], although including the entire sample is preferable. All included studies reported intra-examiner reliability, although it was often presented as a single value or a range. In other words, only a few studies reported intra-examiner reliability for all individual variables. The same applies to inter-examiner reliability, which was evaluated in eight studies [19,20,21, 23, 27, 28, 30, 32], while the remaining six studies involved only one examiner [22, 24,25,26, 29, 31]. Only three studies [23,24,25] reported conducting the reliability analysis on fewer than the recommended number of cases (less than 25% of the entire sample).
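For readers unfamiliar with the ICC, the following minimal sketch computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The included studies did not necessarily use this form, so the choice is an assumption made for illustration; the function name `icc_2_1` and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per case; each row holds one
    measurement per examiner (or per repeated session).
    """
    n = len(ratings)                 # number of cases
    k = len(ratings[0]) if n else 0  # number of examiners or sessions
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 cases and 2 ratings per case")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical condylar widths (mm) measured by two examiners on five cases
example = [[18.2, 18.4], [17.5, 17.3], [19.1, 19.0], [16.8, 17.1], [18.0, 18.1]]
print(round(icc_2_1(example), 3))
```

Reporting such a coefficient for each individual variable, rather than as a single overall value or range, is the level of detail that only a few of the included studies provided.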

To ensure a comprehensive TMJ osseous evaluation method, three main criteria must be met. First, landmark identification of this complex should be conducted in all three planes of space to enable precise and accurate measurements rather than relying solely on a multiplanar view. Second, the analysis must encompass the entire complex, including the glenoid fossa, mandibular condyle, and TMJ spaces. Third, there should be a diverse range of measurements covering dimensions, positions, and orientations in the three planes, as well as surface area and volumetric assessments. Interestingly, none of the included studies fulfilled all three of these criteria. The maximum score obtained was 30 out of 33 points, indicating that none of the methods employed in the included studies were comprehensive. Moreover, surprisingly, ten of the included studies scored less than 60% of the maximum score. Consequently, recommended criteria for the selected TMJ osseous evaluation method are listed to establish a standardized approach for future research, enhancing the validity of the findings.

Limitations

Only a limited number of studies were available, and most exhibited a moderate to high risk of bias. The significant inconsistencies among the included studies regarding study design, reporting of reliability, and the comprehensiveness of measured outcomes made it unfeasible to conduct a meta-analysis. Another limitation of the review was the inclusion of only studies published in English. A further limitation worth mentioning is that 3 of the 14 included studies were authored by one of the researchers of this study, which might introduce selection and presentation biases. However, strict criteria were applied, and co-authors not involved in these 3 studies carried out the selection, assessment, and extraction processes. Moreover, 3 of 14 studies represent a small fraction of the overall evidence. Thus, the risk of bias from this source can be considered minimal.

Conclusions

Considering the limitations of this review, the following conclusions can be drawn: (1) None of the evaluated methods provided a comprehensive assessment of the TMJ; (2) There was significant diversity in the performance and reporting of intra- and inter-examiner reliability analyses; and (3) Considerable variation was observed in CBCT imaging settings, particularly in slice thickness and voxel size.

Recommendations were proposed for a selected TMJ osseous evaluation method to improve the validity and reliability of future research. These recommendations include the following:

  A. The recommended voxel size during acquisition is no larger than 0.3 mm when using a large field of view.

  B. The slice thickness should ideally be no more than 1 mm.

  C. For precise identification, TMJ measurements should be conducted in the 3D view, with landmarks identified using the slice locator view.

  D. When using the TMJ view, proper slice orientation is crucial, with the long axis of the condyle in the axial view being perpendicular to the slice cuts.

  E. It is preferable to include the three main components of the TMJ in the assessment: mandibular condyle, glenoid fossa, and joint spaces.

  F. Measurements should encompass morphological variations, evaluating dimensions, positions, and angulations when applicable.

  G. Measurements should encompass different mathematical aspects, including linear, angular, and surface area or volume measurements.

These recommendations and standard criteria should be applied to the positional and morphological evaluation of TMJ osseous structures using CBCT in clinical settings, particularly for patients with TMDs, to establish correlations between the evaluated osseous structures and the existing disease.