Recommendations for standard criteria for the positional and morphological evaluation of temporomandibular joint osseous structures using cone-beam CT: a systematic review

Objective This systematic review aimed to appraise the reliability and comprehensiveness of imaging methods in studies that used three-dimensional assessment of the temporomandibular joint (TMJ) in order to propose a standardized imaging method. Methods Six databases/search engines were searched up until September 2022. The outcomes of interest included measurements of the mandibular condyle, glenoid fossa, joint spaces, or the entire TMJ. Two checklists were utilized: one to assess the risk of bias, with a maximum score of 37, and the other, a pre-designed checklist consisting of 22 items to evaluate the comprehensiveness of the methods used, with a maximum score of 33. Results Out of the 2567 records retrieved, only 14 studies, which used cone bean computed tomography (CBCT), were deemed eligible and thus included in the qualitative analysis. Three studies were deemed of low risk of bias, while the remaining studies were rated as moderate to high risk of bias, primarily due to improper reporting of inter-observer agreement, varying reliability values, and a limited number of cases included in the reliability analysis. Regarding the comprehensiveness of the methods used, only four studies achieved relatively high scores. The deficiencies observed were related to the reporting of variables such as slice thickness and voxel size, absence of or improper reporting of intra- and inter-examiner reliability analyses, and failure to assess all osseous components of the TMJ. Conclusion CBCT-based methods used to assess the positions and morphology of TMJ bony structures appear to be imperfect and lacking in comprehensiveness. Hence, criteria for a standardized assessment method of these TMJ structures are proposed. Clinical relevance statement Accurately, comprehensively, and reliably assessing the osseous structures of the temporomandibular joint will provide valid and valuable diagnostic features of the normal temporomandibular joint, and help establish potential associations between these osseous features and temporomandibular disorders. Registration The protocol for this systematic review was registered at the International Prospective Register of Systematic Reviews (PROSPERO, No.: CRD42020199792). Key Points •Although many methods have been introduced to assess the osseous structure of the temporomandibular joint, they yielded inconsistent findings. •None of the published studies comprehensively assessed the temporomandibular joint. •Recommendations for a comprehensive temporomandibular joint osseous assessment method were suggested for better validity and reliability of future research. Supplementary Information The online version contains supplementary material available at 10.1007/s00330-023-10248-4.


Rationale
The temporomandibular joint (TMJ) is a unique structure as it is the only diarthrodial synovial joint in the body.Threedimensional (3D) technology provides improved visualization of the TMJ structures.Magnetic resonance imaging (MRI) is primarily used for evaluating the TMJ's soft tissue [1], whereas computed tomography (CT) and cone-beam computed tomography (CBCT) are primarily employed for assessing the hard tissue.Indeed, these two technologies enable a more precise analysis of the TMJ's osseous components than ever before [1][2][3][4].It is worth noting, however, that CBCT exposes patients to less radiation compared to CT.Studies have reported that high-resolution images can achieve excellent accuracy in TMJ examinations [5][6][7].
Recently, the American Academy of Oral and Maxillofacial Radiology and the American Academy of Orofacial Pain issued recommendations regarding TMJ imaging.According to these recommendations, "maxillofacial CBCT allows for evaluating the osseous and dental hard tissue components.However, due to its limited ability to distinguish soft tissue details, CBCT is inadequate for assessing the necessary information required for diagnosing and managing patients with temporomandibular disorders (TMDs)" [8].
A wide range of methods has been proposed to evaluate various aspects of the TMJ, including the dimensions, inclinations, positions, surface areas, volume of the mandibular condyles, as well as the dimensions, inclinations, and dimensional/volumetric parameters of the glenoid fossa and TMJ spaces, respectively.However, the utilization of these methods has complicated TMJ analyses.These techniques have been employed in the context of patients with TMDs, normal TMJ patients, and even to assess the effects of specific interventions on TMJ morphology and dimensions.Unfortunately, the findings derived from these studies have been inconsistent and even contradictory [9][10][11][12][13][14].These discrepancies can be attributed to a lack of methodological consensus.For instance, while the imaging is 3D, the TMJ measurements are often taken from slice sections resulting in twodimensional (2D) data.Furthermore, there is considerable variability in the TMJ measurements utilized for assessing the osseous structures, and there is a wide diversity in the selected study samples.
To date, there has been neither comprehensive systematic review assessing the reliability and comprehensiveness of CT or CBCT-based osseous measurements of the TMJ, nor an examination of the extent to which the existing methodologies are valid for evaluating this intricate structure.

Objectives
This systematic review aimed to appraise the reliability and comprehensiveness of CT or CBCT-based methods employed for 3D positional and morphological assessment of the TMJ.Additionally, the review aimed to propose a standardized method for such assessments.

Protocol registration
The study protocol was registered with PROSPERO under registration number CRD42020199792 and was conducted following the guidelines outlined in the Cochrane Oral Health Group's Handbook for Systematic Reviews (http:// ohg.cochr ane.org).

PICOS question and eligibility criteria
The inclusion criteria for this study comprised observational studies, primarily descriptive (either retrospective or prospective), that utilized CT or CBCT for imaging purposes in adult human subjects.These studies evaluated the reliability and comprehensiveness of osseous measurements in the TMJ.On the other hand, interventional studies, case reports, case series, literature reviews, systematic reviews, opinion articles, book chapters, as well as studies involving populations other than normal adult humans (such as cadavers, animals, growing patients, individuals with craniofacial anomalies, the TMDs, trauma to the temporal or temporomandibular region, or a history of surgical intervention/s in the TMJ or surrounding area) were excluded.Additionally, records focusing on other imaging methods or outcomes were also excluded.

Information sources, search strategy, and study selection
In August 2020, three co-authors (BS, LM, and SJ) conducted an independent and thorough search across six search engines/databases, namely PubMed, Scopus, Science Direct, Web of Science, Cochrane, and LILACS.The search was further complemented by a manual examination of the reference lists of the included studies.The same three co-authors subsequently updated the search in September 2022.
The retrieved records were entered into an Excel sheet (version 2010) to identify and remove duplicates.Subsequently, the titles and abstracts of the remaining records were screened to determine their potential for inclusion, and irrelevant studies were excluded.The full texts of the remaining studies were thoroughly reviewed, and any studies deemed irrelevant were eliminated.Two co-authors (AA and SP) carried out these steps independently.All co-authors independently assessed the potentially included studies to ensure they met the predefined inclusion criteria.In the event of disagreements, a discussion was initiated to reach a final consensus.The reporting of this systematic review adheres to the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15].

Data collection
Using a pre-designed template, three co-authors (BS, LM, and SJ) independently extracted the necessary data.In cases where doubts or uncertainties arose, a fourth co-author(MA) was consulted for resolution.The extracted data encompassed two primary categories: CBCT parameters and the demographic, qualitative, and quantitative characteristics of the included studies.The findings are organized and presented in Tables 1 and 2.

Outcome assessment
The parameter outcomes encompassed measurements of the three primary osseous components: the mandibular condyle, glenoid fossa, and TMJ spaces.Supplementary Material I provides detailed definitions of these parameters.

Risk of bias/quality assessment of the included studies
The assessment of the risk of bias was conducted independently by three co-authors (EH, AA, and MA) using a modified checklist derived from previous studies [16][17][18].Any disagreements that arose during the assessment were resolved through discussion among the co-authors.The checklist consisted of 18 items that examined the methodological robustness of the study sample and the data analysis.The maximum achievable score on the checklist was 37. Based on their scores, the studies were categorized as having a high, medium, or low risk of bias if they fell below 18, between 19 and 27, or between 28 and 37, respectively.Further details can be found in Table 3.

The comprehensiveness of the methods used in the included studies
The term "comprehensiveness" refers to several aspects, including the view utilized for TMJ measurements (multiplanar or volumetric), the assessed TMJ components, and the variation in measurement types (linear, angular, surface area, or volumetric variables).To evaluate the comprehensiveness of the measurement methods employed for TMJ osseous components, a checklist comprising 22 items was developed.The maximum achievable score on this checklist was 33.One item focused on the reference plane or line used (maximum score of 4).Eight items addressed condylar measurements (maximum score of 12), while other eight items covered glenoid fossa measurements (maximum score of 12).The remaining five items pertained to joint space measurements (maximum score of 5) (Table 4).

Analyses
The data were subjected to qualitative analysis.Due to significant inconsistencies in reporting the outcomes of interest across the included studies, quantitative analysis (metaanalysis) was not feasible.

Study selection
The PRISMA flowchart (Fig. 1) illustrates the resulting process.Initially, 2567 records were retrieved, out of which 684 were duplicates and subsequently excluded.After screening the remaining 1883 records based on titles and abstracts, 1827 were deemed irrelevant to the review question and excluded.The full texts of the remaining 56 studies were meticulously read, leading to the exclusion of 42 studies.Ultimately, a total of 14 studies were included in the qualitative synthesis.Notably, none of these studies utilized CT.

Study characteristics
Tables 1 and 2 provide details regarding the characteristics of the 14 studies included in the analysis.These studies were published between 2011 and 2022.Notably, no studies    1.It is worth noting that there was substantial diversity in the reporting of these settings across the studies.
While three studies [19][20][21] did not provide basic details, the remaining studies present most of the setting parameters, which are also included in Table 1.
The sample sizes of the included studies varied, ranging from as low as 29 participants [22] to as high as 300 participants [23].The age range of participants was 18 years and older, and both genders were included in most studies.Linear measurements were reported in all of the included studies, while angular measurements were reported in all studies except three [24][25][26].Four studies each reported surface area [20][21][22]27] and volume measurements [21,24,25,28,29].The measurements of condylar parameters were reported in all of the included studies, while parameters related to the fossa and joint space were reported in all studies except two [21,27] not reported fossa and two [22,26] not reported joint space.There was variability in the software programs used, with eight different programs utilized across the 14 studies.The most commonly used software was Anatomage [19,20,27,28], followed by Dolphin [23,30].The reliability test employed in all of the included studies was the intra-class correlation coefficient (ICC).Additionally, one study incorporated a paired t-test [22], and another study utilized the kappa test [26].Further details can be found in Table 2.

Quality assessment (risk of bias)
Table 3 presents the risk of bias for the included studies.Four studies exhibited a high risk of bias [22,24,26,29], while seven showed a moderate risk [20,21,23,25,28,31,32].Only three studies [19,27,30] were found to have an overall low risk of bias.The main shortcomings observed in most studies included the absence of blinding in measurements, inadequate documentation of the examiners' experience or professional degree, low inter-examiner agreement, limited inclusion of cases in the reliability analysis, and inadequate presentation of the reliability analyses.

Characteristics of the measurements and examiners
Eight of the included studies [19-21, 23, 27, 28, 30, 32] reported that measurements were conducted by two examiners, while five studies [22,24,26,29,31] mentioned the involvement of only one examiner.One study [25] did not provide information on this aspect.Regarding intra-or interexaminer reliability analysis, the measurements were repeated twice in most studies, except for one study [25], where they Sample size calculation     were repeated three times.The time intervals between repeated measurements varied across studies, with a 1-week interval in two studies [22,25], a 2-week interval in nine studies [19-21, 24, 27-30, 32], a 3-week interval in two studies [23,31], and a 1-month interval in one study [26].It is noteworthy that only one study [31] explicitly mentioned that examiners were blinded during the measurement process, and only three studies [21,24,30] reported the experience or qualifications of the examiners.Interestingly, six studies [19,20,27,28,30,32] re-examined all cases for reliability, while five studies [21,22,26,29,31] re-examined between 25 and 50% of the cases.In contrast, only three studies [23][24][25] reported re-examining less than 20% of the total included cases.Additionally, only two studies [26,32] reported pre-calibrating the examiners.For more detailed information, refer to Supplementary Material II.Supplementary Material III provides a detailed account of the reliability analysis reporting.Seven studies [22, 24-26, 29, 31, 32] did not conduct inter-examiner reliability assessments.Meanwhile, four studies [20,21,23,30] performed inter-examiner reliability assessments but did not specify the exact values; they only mentioned the overall reliability.Two studies [19,27] presented a comprehensive analysis of interexaminer reliability for most of the included measurements.

The comprehensiveness of the methods used in the included studies
Table 4 displays the comprehensiveness of the measurement methods employed to assess the osseous components of the TMJ.These methods can be categorized into four main groups: (1) View(s) and reference(s) utilized, which involved either a multiplanar view (sagittal, coronal, or axial) or a 3D view.In the 3D view, landmarks were identified in the frontal, horizontal, and midsagittal planes, with the three coordinates serving as the reference points.(2) Measurements of the mandibular condyle.(3) Measurements of the glenoid fossa.(4) Measurements of the joint spaces.
Among the total score of 33 points on the checklist used, three studies [19,20,27] scored 30, while one study [28] scored 28.The remaining studies scored below 60% of the overall score.

Discussion
There is a growing interest in measuring TMJ in various contexts: observational, to provide details on specific populations or traits; and interventional, to assess the effects of specific treatments.At the interventional level, studies have evaluated the effects of different orthodontic or orthognathic surgical interventions on the positional and morphological features of the TMJ.These interventions included the use of fixed appliances with extraction therapy [10,33] or removable functional appliances [9,34], distalization mechanics, maxillary arch expansion therapy [35], and orthognathic surgery [36,37].Furthermore, the effects of different prosthetic interventions, such as dental implants and full mouth rehabilitation [38], as well as other minor restorative procedures [39], have been studied to evaluate their impact on TMJ.
At the observational level, numerous studies have assessed differences in various anteroposterior skeletal malocclusions [20,26,28,40], vertical facial patterns [19,30,41], and transverse discrepancies [5,23].Furthermore, many studies assessed the TMJ differences between patients with TMDs and those without TMDs [12,42].However, significant discrepancies have been observed among these studies.Some of these discrepancies can be attributed to the lack of standardization in the methodologies employed, such as the use of 2D-versus 3D-based assessment methods, despite the fact that the imaging technique of interest (CBCT) inherently provides a 3D view.Other discrepancies may be due to variations in the reliability of measurements and the parameters assessed: Did they comprehensively or partially represent the TMJ?Therefore, this systematic review aims to appraise the reliability and comprehensiveness of methods used for 3D positional and morphological assessment of the TMJ, utilizing CT and CBCT, and to recommend a standardized approach for the same purpose.
There are several important technical considerations to emphasize on CBCT imaging.Image resolution largely depends on voxel size, with smaller voxel sizes providing higher image resolution [43].The selection of voxel size should align with the study's specific objectives.In the studies reviewed, voxel sizes in CBCT ranged from 0.3 to 0.5 mm.Given that the TMJ is a delicate and complex region, imaging with a smaller voxel size is preferable [43,44].Among the included studies, five of them [25][26][27][28]32] utilized a voxel size of 0.3 mm, while other studies either used larger voxel sizes or did not specify.For high-quality images of the TMJ, it is recommended to use a voxel size no larger than 0.3 mm, especially when a large field of view is employed [43].
Another important technical aspect in CBCT imaging is slice thickness, as smaller thicknesses retain more details while larger thicknesses may result in a loss of details.For imaging the TMJ, a recommended slice thickness is 1 mm or smaller [43,44].Surprisingly, only five of the included studies [22-24, 27, 30] mentioned the slice thickness used, and it is worth noting that only two of them [22,27] adhered to the recommended thickness.
The TMJ is a complex structure, both mechanically and biologically, where changes occurring in one component of the TMJ can impact the others.Movement of the mandibular condyle in any direction typically results in corresponding changes in the surrounding joint spaces and, at times, bone remodeling in the mandibular fossa encompassing this synovial joint.Furthermore, movement on one side is often accompanied by either parallel or opposite movement on the other side [45].Consequently, when radiographically assessing the TMJ, it is essential to include the three main components: the mandibular condyle, glenoid fossa, and joint spaces.For this systematic review, studies were included on the condition that they assessed at least two TMJ components.Eight of the included studies [19,20,23,27,28,[30][31][32] assessed all three TMJ components, while the remaining six studies [21,22,24,25,26,29] focused on two components.Depending on the specific aim(s) of a study, certain TMJ component(s) can be selected for evaluation as the main outcome(s).However, it is recommended to evaluate all TMJ components (at least as secondary outcomes) due to the biomechanical interactions that occur among them.
The quality of the included studies was assessed using a checklist adapted from previously published research [16][17][18], with modifications made to make it more flexible and aligned with the objectives of the current systematic review.These modifications encompassed aspects such as the head orientation in 3D analysis, condylar orientation in sectional-based analysis, number of TMJ components included, diversity of measurement types (linear, angular, surface area, and/or volumetric measurements), experience and professional qualification of the examiners, percentage of cases included in the reliability analysis, and presentation of reliability results for the assessed variables.By applying this rigorous checklist, only three studies were identified as having a low risk of bias [19,27,30].
Reliability pertains to the accuracy of the obtained data and the extent to which any measuring tool controls random errors, which is crucial for ensuring the validity of the research.However, reliability alone is not sufficient [46].Reliability analysis is necessary for any quantitative measurements where subjectivity may be a factor.It is recommended that two pre-calibrated examiners independently conduct the measurements, and both intra-and inter-examiner reliabilities should be assessed on at least 25% of the sample [47], although including the entire sample is preferable.All included studies reported intra-examiner reliability, although it was often presented as a single value or a range.In other words, only a few studies reported intraexaminer reliability for all individual variables.The same applies to inter-examiner reliability, which was evaluated in eight studies [19-21, 23, 27, 28, 30, 32], while the remaining six studies involved only one examiner [22, 24-26, 29, 31].Only three studies [23][24][25] reported conducting the reliability analysis on fewer than the recommended number of cases (less than 25% of the entire sample).
To ensure a comprehensive TMJ osseous evaluation method, three main criteria must be met.First, landmark identification of this complex should be conducted in all three planes of space to enable precise and accurate measurements rather than relying solely on a multiplanar view.Second, the analysis must encompass the entire complex, including the glenoid fossa, mandibular condyle, and TMJ spaces.Third, there should be a diverse range of measurements covering dimensions, positions, and orientations in the three planes, as well as surface area and volumetric assessments.Interestingly, none of the included studies fulfilled all three of these criteria.The maximum score obtained was 30 out of 33 points, indicating that none of the methods employed in the included studies were comprehensive.Moreover, surprisingly, ten of the included studies scored less than 60% of the maximum score.Consequently, recommended criteria for the selected TMJ osseous evaluation method are listed to establish a standardized approach for future research, enhancing the validity of the findings.

Limitations
In addition to the limited number of studies, most exhibited a moderate to high risk of bias.The significant inconsistencies among the included studies regarding study design, reporting of reliability, and the comprehensiveness of measured outcomes made it unfeasible to conduct a meta-analysis.Another limitation of the review was the inclusion of only studies published in English.One worthy-to-mention limitation of the current study is that 3 studies among the 14 studies included in this review were authored by the one of the researcher of this study, a matter that might introduce selection and presentation biases.However, we confirm that we applied strict criteria where other co-authors (not involved in these 3 studies) were involved in the selection, assessment, and extraction processes.Besides that, 3 out of 14 studies represent small fraction of the overall evidence.Thus, the fear of bias can be considered minimal.

Conclusions
Considering the limitations of this review, the following conclusions can be drawn: (1) None of the evaluated methods provided a comprehensive assessment of the TMJ; (2) There was significant diversity in the performance and reporting of intra-and inter-examiner reliability analyses; and (3) Considerable variation was observed in CBCT imaging settings, particularly in slice thickness and voxel size.
Recommendations were proposed for a selected TMJ osseous evaluation method to improve the validity and reliability of future research.These recommendations include the following: A. The recommended voxel size during acquisition is 0.3 when using a large field of view.B. The slice thickness should ideally be no more than 1 mm.C. For precise identification, TMJ measurements should be conducted in the 3D view, with landmarks identified using the slice locator view.D. When using the TMJ view, proper slice orientation is crucial, with the long axis of the condyle in the axial view being perpendicular to the slice cuts.E. It is preferable to include the three main components of the TMJ in the assessment: mandibular condyle, glenoid fossa, and joint spaces.F. Measurements should encompass morphological variations, evaluating dimensions, positions, and angulations when applicable.G. Measurements should encompass different mathematical aspects, including linear, angular, and surface area or volume measurements.
It is recommended to apply these current recommendations and standard criteria for the positional and morphological evaluation of TMJ osseous structures using CBCT in clinical settings, particularly for patients with TMDs, to establish correlations between evaluated osseous structures and the existing disease.

4
Checklist for the validity and comprehensiveness of the methods used in assessment of the JMJ osseous componentsAuthor (year) reported 11 measurements, while *Checklist for the validity and comprehensiveness of the methods used in assessment of the osseous JMJ components A: items scored out of 4: -Reference plane (3 D coordinate = 4, 2D multiplanar = 2) B: items scored out of 3: -Point or geometric condyle position (Not assessed = 0, One view assessment = 1, two or three views assessments = 3) -Condyle inclination (Not assessed = 0, One view assessment = 1, two or three views assessments = 3) -Glenoid fossa position (Not assessed = 0, One view assessment = 1, two or three views assessments = 3) -Glenoid fossa inclination (Not assessed = 0, One view assessment = 1, two or three views assessments = 3) C: items scored out of 1: All other items ((Not assessed = 0, assessed = 1) Table 4

Table 1
The parameters of the used CT or CBCT machines in the included studies

Table 2
The demographic, qualitative and quantitative characteristics of the included studies