Introduction

A decrease in bone mineral density (BMD) is associated with changes in the bone microstructure. It results from several factors, including calcium deficiency, vitamin D deficiency, lack of physical activity and genetic factors. Common conditions associated with reduced BMD are osteopenia and osteoporosis. Significant BMD impairment in the course of osteoporosis leads to fractures, which markedly reduce quality of life and increase morbidity, mortality, and disability [1]. The standard method used to measure bone mineral density is dual-energy X-ray absorptiometry (DEXA). The differences between the reference BMD values and the measurements are expressed in standard deviations (SD) and reported as T-scores or Z-scores. The reference given by WHO is data on Caucasian women aged 20–29 years. The T-score is the number of SDs by which BMD differs from the mean expected in young healthy subjects, while the Z-score is the number of SDs by which BMD differs from the mean expected for age and gender. A T-score between -1.0 and -2.5 SD represents osteopenia, which usually precedes osteoporosis. Osteoporosis is diagnosed when the T-score is equal to or lower than -2.5 SD [2]. In addition to DEXA, imaging techniques such as computed tomography (CT), quantitative ultrasound (QUS), and single or dual photon absorptiometry (SPA or DPA) are also used for the diagnosis of osteoporosis [3]. Studies analysing the usefulness of mandibular radiomorphometric indices observed on pantomographic images for the screening or preliminary diagnosis of osteopenia and osteoporosis are becoming increasingly common. Although some studies report no relationship between these indices and BMD measured with DEXA [4, 5], a substantial part of the research on this topic does find one [e.g. 6,7,8,9].
Studies showing a correlation between the indices and BMD values emphasize the great diagnostic importance of this method, as patients could be diagnosed early during a routine visit to the dentist [10,11,12]. This review aims to evaluate the accuracy of various mandibular radiomorphometric indices in comparison with DEXA BMD measurements in the diagnosis of osteopenia and osteoporosis, based on a meta-analysis of the sensitivity and specificity of the indices.
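The SD-score definition and the T-score thresholds quoted above can be illustrated with a minimal Python sketch. The function names and the example BMD values are hypothetical, and treating a T-score of exactly -1.0 as normal is an assumption, since the text only gives the osteopenia range as "between -1.0 and -2.5 SD".

```python
def sd_score(bmd: float, reference_mean: float, reference_sd: float) -> float:
    """Number of reference SDs by which a BMD value differs from the reference mean.

    With a young healthy reference population this is a T-score;
    with an age- and gender-matched reference it is a Z-score.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (bmd - reference_mean) / reference_sd


def classify_t_score(t_score: float) -> str:
    """Classify a T-score with the thresholds quoted in the text."""
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score < -1.0:  # assumption: exactly -1.0 is treated as normal
        return "osteopenia"
    return "normal"


if __name__ == "__main__":
    # Hypothetical values in g/cm^2, for illustration only.
    t = sd_score(bmd=0.80, reference_mean=1.00, reference_sd=0.10)
    print(round(t, 2), classify_t_score(t))
```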

Materials and methods

Focused question

The following focused question was addressed: Are qualitative and quantitative mandibular radiomorphometric indices reliable tools for the diagnosis of osteopenia/osteoporosis in men and women?

Protocol

The present systematic review was prepared following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [13].

Study selection

The materials for analysis were collected in August 2023 by searching three databases: PubMed Central® (United States National Institutes of Health’s National Library of Medicine), Web of Science® (Institute of Scientific Information – Clarivate Analytics), and Scopus® (Elsevier). No restriction on the year of publication was imposed during the search, but results were filtered to English-language articles only. The following keywords were used in each database: radiomorphometric indices AND mandible OR orthopantomography OR orthopantomograms AND bone mineral density. The selection consisted of three stages: screening by title, by abstract and by full text. The screening was performed independently by two researchers (J.H. and A.S.), who then discussed the final results and any discrepancies regarding the inclusion or exclusion of particular publications.

Eligibility criteria

Only original research articles were included in this review. Abstracts, case reports, notes, oral presentations, and reviews were excluded, as were papers that used non-human subjects. All qualified papers had to include both radiomorphometric indices of the mandible measured on orthopantomograms and bone mineral density measured with DEXA. The search was not restricted to a specific population in terms of gender, age or ethnicity.

Risk of bias assessment

Quality assessment was performed twice, two months apart, by one of the authors (J.H.). The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) criteria were used, in line with the Cochrane guidelines for diagnostic test accuracy [14]. The QUADAS-2 tool consists of four domains: patient selection, index test, reference standard, and flow and timing. In addition, the first three domains also include an applicability assessment. Each domain is rated for risk of bias as low, high or unclear. The signalling questions proposed by Calciolari et al. were used for the assessment [15]. If the answer to all questions in a domain was “yes”, the risk of bias was defined as “low”; if the answer to any of the questions was “no”, the risk was defined as “high”. If there was insufficient data to assess, the risk was defined as “unclear”. Studies were assessed for risk of bias, but not for applicability. Detailed information about the signalling questions and the risk of bias assessment is available in Supplementary Table 1.
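The decision rule described above for rating a QUADAS-2 domain can be written as a short Python sketch. The function name and the answer labels are hypothetical, and rating a domain with no recorded answers as "unclear" is an assumption rather than something stated in the text.

```python
from typing import Iterable


def domain_risk(answers: Iterable[str]) -> str:
    """Rate one QUADAS-2 domain from its signalling-question answers.

    Rule taken from the text: any "no" gives high risk, all "yes" gives
    low risk, and anything else (insufficient data) gives unclear risk.
    """
    normalised = [a.strip().lower() for a in answers]
    if any(a == "no" for a in normalised):
        return "high"
    if normalised and all(a == "yes" for a in normalised):
        return "low"
    return "unclear"  # assumption: an empty answer list is also "unclear"


if __name__ == "__main__":
    print(domain_risk(["yes", "yes", "yes"]))      # all questions satisfied
    print(domain_risk(["yes", "no", "unclear"]))   # one question failed
    print(domain_risk(["yes", "unclear"]))         # insufficient data
```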

Data synthesis and statistical analysis

The meta-analysis was performed for studies in which reduced bone mineral density was determined by DEXA, regardless of the site of measurement. For all studies, two-by-two (2 × 2) tables were retrieved or constructed from the available data, and their consistency with the reported sensitivity, specificity, and positive and negative predictive values was then checked. Analyses were performed using R Statistical Software (v4.2.2; R Core Team 2023) [16]. Studies included in the meta-analysis were subdivided according to cut-off values and according to how the participants were grouped (“osteopenic with healthy” or “osteopenic with osteoporotic”). Analyses were performed when at least three studies meeting the selected criteria were available. When results of measurements made by several observers were provided, the arithmetic mean of their 2 × 2 tables was calculated. Forest plots were produced showing, for each study, estimates of sensitivity and specificity with confidence intervals, together with the heterogeneity between studies. For each cut-off value interval, summary ROC plots were made to determine the summary operating points (i.e., summary values for sensitivity and specificity) and the 95% confidence region around the summary point.
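As an illustration of the consistency check and the averaging of observer tables described above, the following Python sketch derives sensitivity, specificity and predictive values from a 2 × 2 table. It is not the authors' R code (which is in Supplementary Material 6). The class and function names and all counts are hypothetical, and the Wilson score interval is an assumed choice, since the text does not state which confidence interval method was used.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: float  # index test positive, DEXA positive
    fp: float  # index test positive, DEXA negative
    fn: float  # index test negative, DEXA positive
    tn: float  # index test negative, DEXA negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def mean_table(tables: list) -> TwoByTwo:
    """Cell-wise arithmetic mean of the tables reported by several observers."""
    n = len(tables)
    return TwoByTwo(
        tp=sum(t.tp for t in tables) / n,
        fp=sum(t.fp for t in tables) / n,
        fn=sum(t.fn for t in tables) / n,
        tn=sum(t.tn for t in tables) / n,
    )


def wilson_ci(successes: float, total: float, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Hypothetical counts from two observers, for illustration only.
    table = mean_table([TwoByTwo(40, 10, 10, 40), TwoByTwo(44, 8, 6, 42)])
    print(table.sensitivity, wilson_ci(table.tp, table.tp + table.fn))
    print(table.specificity, wilson_ci(table.tn, table.tn + table.fp))
```

Comparing the derived values with those printed in a publication is one way to flag a table that was reconstructed incorrectly.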

Results

Studies included

Following a database search and article review, 64 publications were selected for analysis. A summary of the search strategy is presented in Fig. 1. The characteristics of the publications included in the analysis are presented in Supplementary Tables 2 and 3.

Fig. 1

Flow chart of the publication selection procedure, according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement [13]

Mandibular cortex index

Forty-six studies assessed the morphology of the lower mandibular cortex to identify reduced BMD [4, 6,7,8,9,10, 12, 17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53]. The Mandibular Cortical Index (MCI) is also referred to as the Klemetti Index (KI) and Simple Visual Estimation (SVE). It is a qualitative index assessed by classifying the mandibular cortical morphology distal to the mental foramen into one of three categories. For MCI and KI these are: C1, when the endosteal margin is even and sharp; C2, when the endosteal margin shows lacunar resorption or cortical remnants on one or both sides of the mandible; and C3, when the cortex is markedly porous with advanced cortical remnants of the endosteal margin [53]. SVE is a simplification of the other two indices, as it classifies the mandibular cortex morphology as 'normal', 'intermediate' or 'very thin' [44]. In the studies analysed, when an eroded mandibular cortex (categories C2 and C3) was used to identify reduced BMD (T-score < -1), sensitivity ranged from 32% [47] to 95% [26] and specificity from 7.8% [26] to 88.9% [52]. When a markedly eroded mandibular cortex was observed (category C3 or 'very thin' for SVE), sensitivity for the detection of osteoporosis ranged from 19.6% [26] to 81.1% [20] and specificity from 88.9% [52] to 100% [21]. The usefulness of the markedly eroded mandibular cortex for detecting reduced BMD was evaluated in only eight publications [20, 22, 26, 29, 43, 50, 52, 53]. In two of those reports, the authors concluded that MCI is not a precise method and should not be used for diagnostic purposes [22, 53]; the remaining six concluded that MCI assessment is a reliable method for detecting low BMD [20, 26, 29, 43, 50, 52].

Mandibular cortical width

Mandibular Cortical Width (MCW), also referred to as Mental Index (MI) and Mandibular Cortical Thickness (MCT), was used to identify reduced BMD in 44 studies [6, 8, 9, 11, 12, 18,19,20,21,22,23,24,25, 29,30,31,32, 35, 37, 39,40,41,42,43,44,45, 47, 48, 53,54,55,56,57,58,59,60,61,62,63]. In most studies, this index is measured as the cortical width along a line passing through the centre of the mental foramen and perpendicular to the tangent to the lower border of the mandible (Fig. 2). Four studies used indices that are modifications of the MCW: the Cortical Index (CI) or Inferior Cortex (IC), measured as the width of the mandibular cortex along a line tangent to the posterior border of the mental foramen (Fig. 2) [4, 64, 65], and the Mental Posterior Index (MPI), measured at 1, 2 and 3 cm from the MCW measurement site (MPI1, MPI2 and MPI3, respectively) [41]. All measurements were taken manually or digitally using image analysis software. Cut-off values ranged from 2.22 mm to 5 mm and were determined by drawing the ROC curve and selecting the point with the highest levels of sensitivity and specificity. The sensitivity and specificity of the method varied widely and depended heavily on the cut-off selected. Sensitivity ranged from 8.3% [62] to 100% [58], with only two publications reaching > 95% [58, 66]. Specificity ranged from 9.8% [66] to 100% [32].

Fig. 2

Metrical panoramic indices: MCW = A; CI = B; PMI = A/C; M/M Ratio = E/D

Panoramic Mandibular Index

In 20 studies, the Panoramic Mandibular Index (PMI) was used to identify low BMD in study patients [4, 5, 8, 19, 23,24,25, 27, 30, 32, 37, 38, 42, 45, 53, 55, 59, 60, 63, 67]. This index is usually calculated as the ratio of the MCW in the mental region to the distance from the inferior border of the mandible to the inferior edge of the mental foramen in a line perpendicular to the line tangent to the inferior border of the mandible (Fig. 2) [68]. Only seven studies reported cut-off values—five studies reported a cut-off value of approximately 0.3 (range 0.29 to 0.33) [4, 23, 24, 32, 63] and four studies reported a cut-off value of approximately 0.4 (range 0.38 to 0.44) [8, 32, 38, 53]. One study included four different cut-offs [32]. Sensitivity and specificity in detecting patients with reduced BMD (T-score < -1) ranged from 16.6% to 79% and from 58.9% to 91%, respectively. For the 0.3 and the 0.4 cut-off, sensitivity and specificity levels ranged from 57% to 95.2% and from 9% to 84.85%, respectively.

Other indices

In the other studies, indices different from those mentioned above were used to identify reduced BMD. Six publications applied Alveolar Bone Reabsorption (M/M ratio), also called the Mandibular Ratio (MR), which is the total mandibular height divided by the mandibular height from the centre of the mental foramen to the inferior margin of the mandible (Fig. 2) [4, 27, 30, 37, 42, 55, 67]. Nine publications considered gonial and antegonial indices such as angles (Mandibular Angle, MA; Gonial Angle, GA; Antegonial Angle, AA), mandibular cortex height (Gonion Index, GI; Antegonial Index, AI), and antegonial depth (AD) [24, 40, 42,43,44,45,46]. Additionally, three publications considered indicators such as mandibular cortical bone integrity [70], alveolar bone loss [71], and mandibular bone reabsorption [63]. Only two publications reported sensitivity and specificity values for one of the other indices, MR. Drozdzowska et al. [4] used a cut-off of < 1.78 and obtained a sensitivity of 43% and a specificity of 44%. Passos et al. [63] compared a cut-off of < 2 with BMD measured at the femur and spine. For femoral BMD, sensitivity and specificity were 30.6% and 71.2%, and for spine BMD, 31.5% and 73.5%, respectively. Detailed characteristics of all studies are included in Supplementary Table 3.
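The two ratio indices defined in the text and in the caption of Fig. 2 (PMI = A/C, M/M ratio = E/D) reduce to simple divisions of linear measurements. A minimal Python sketch follows; the function and parameter names and the measurement values are hypothetical.

```python
def panoramic_mandibular_index(mcw_mm: float, foramen_to_border_mm: float) -> float:
    """PMI = A / C: mandibular cortical width divided by the distance from the
    inferior border of the mandible to the inferior edge of the mental foramen."""
    if foramen_to_border_mm <= 0:
        raise ValueError("foramen_to_border_mm must be positive")
    return mcw_mm / foramen_to_border_mm


def mm_ratio(total_height_mm: float, foramen_centre_to_border_mm: float) -> float:
    """M/M ratio = E / D: total mandibular height divided by the height from the
    centre of the mental foramen to the inferior margin of the mandible."""
    if foramen_centre_to_border_mm <= 0:
        raise ValueError("foramen_centre_to_border_mm must be positive")
    return total_height_mm / foramen_centre_to_border_mm


if __name__ == "__main__":
    # Hypothetical measurements in millimetres, for illustration only.
    print(round(panoramic_mandibular_index(3.6, 12.0), 3))
    print(round(mm_ratio(30.0, 15.0), 3))
```

Because both indices are ratios of distances taken from the same image, they are less sensitive to uniform image magnification than a single linear measurement such as MCW.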

Assessment of methodological quality (risk of bias)

None of the studies complied with all QUADAS-2 items. The patient selection and index test domains raised the most concerns about the risk of bias, as high risk occurred in 51.6% and 56.3% of the studies, respectively. The reference standard was found to be adequate in 81.3% of cases, suggesting that DEXA is a reliable method for diagnosing osteoporosis (Fig. 3). Additionally, the flow and timing were adequate in 46.9% of studies but were not specified in 48.4% of studies, and in three studies (4.7%) the interval between pantomography and DEXA was longer than 12 months. Most authors did not report whether the researchers were blinded to the patient's skeletal BMD, what the time interval between DEXA and pantomography was, or whether both intra- and inter-observer agreement was assessed for the index measurement. A detailed quality assessment for the 64 included studies is presented in Supplementary Table 4.

Fig. 3

A bar diagram showing the proportion of studies at each level of risk of bias. The highest risk is observed for the index test and patient selection, the lowest for the reference standard, and flow and timing has the highest proportion of unclear risk

Meta-analysis

The meta-analysis was conducted for three indices: MCI, MCW, and PMI. In the publications selected for analysis, BMD was measured at the hip or spine. Thirty publications were included in the analyses. Some of the studies used several cut-off values in a given interval and reported different 2 × 2 tables, so they are listed several times in the presented forest plots. Study-specific estimates of sensitivity and specificity, as well as 95% confidence intervals based on the 2 × 2 tables, are shown in Figs. 4, 5 and 6. For the analyses of sensitivity and specificity, heterogeneity was calculated within each index cut-off value interval; these values are also given in Figs. 4, 5 and 6. The data used to produce the forest plots and SROC curves are presented in Supplementary Table 5, and the code used for the calculations in R Statistical Software is available in Supplementary Material 6. Only some of the studies using MCI included osteoporosis assessment (T-score < -2.5); the others, as well as the studies on MCW and PMI, referred to osteopenia (T-score < -1). The presence of any type of cortical erosion (C1 vs. C2 + C3) had an estimated sensitivity and specificity in detecting reduced BMD of 0.773 (95% CI, 0.693–0.837) and 0.477 (95% CI, 0.398–0.558), respectively (Fig. 7a). The sensitivity and specificity of MCI in detecting osteoporosis were 0.357 (95% CI, 0.161–0.617) and 0.953 (95% CI, 0.826–0.988), respectively (Fig. 7b). The estimated sensitivity and specificity for MCW ≤ 3 mm in detecting reduced BMD were 0.712 (95% CI, 0.477–0.870) and 0.804 (95% CI, 0.589–0.921) (Fig. 8a); for the range 3 mm < MCW ≤ 4 mm they were 0.411 (95% CI, 0.252–0.591) and 0.882 (95% CI, 0.773–0.943) (Fig. 8b); and for the range 4 mm < MCW ≤ 5 mm they were 0.728 (95% CI, 0.621–0.814) and 0.584 (95% CI, 0.492–0.670), respectively (Fig. 8c). For PMI, the estimated sensitivity and specificity for PMI ≤ 0.3 in detecting reduced BMD were 0.404 (95% CI, 0.204–0.643) and 0.722 (95% CI, 0.537–0.853) (Fig. 9a), and for the interval 0.3 < PMI ≤ 0.4 they were 0.470 (95% CI, 0.286–0.662) and 0.682 (95% CI, 0.554–0.788), respectively (Fig. 9b). The heterogeneity of the studies was very high. For sensitivity it ranged from 83 to 99%, while for specificity it was slightly lower, ranging from 63 to 95%. Insufficient data made it impossible to perform analyses for the PMI > 0.4 range.
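The text does not state which heterogeneity statistic was reported. One common choice that is expressed as a percentage is I², derived from Cochran's Q. The Python sketch below computes it for study-level proportions on the logit scale with inverse-variance weights. This is an assumed approach for illustration, not a reproduction of the authors' R code, and the function names, the 0.5 continuity correction and the counts are hypothetical.

```python
import math


def logit_with_variance(events: float, total: float) -> tuple:
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is added to both cells (an assumption)
    so that proportions of 0 or 1 do not produce infinite logits.
    """
    pos = events + 0.5
    neg = total - events + 0.5
    return math.log(pos / neg), 1.0 / pos + 1.0 / neg


def i_squared(estimates: list, variances: list) -> float:
    """I² (in percent) from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Hypothetical (true positives, diseased participants) per study.
    studies = [(10, 100), (50, 100), (90, 100)]
    logits, variances = zip(*(logit_with_variance(e, n) for e, n in studies))
    print(round(i_squared(list(logits), list(variances)), 1))
```

The same function can be applied to specificity by passing true negatives and the number of non-diseased participants.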

Fig. 4

Forest plots depicting MCI sensitivity and specificity values for the detection of reduced BMD (C1 vs. C2 + C3) and osteoporosis (C1 + C2 vs. C3); heterogeneity values are also shown (R Statistical Software)

Fig. 5

Forest plots depicting sensitivity and specificity values for MCW according to the ranges of the cut-off values used in the publications; heterogeneity values are also shown (R Statistical Software)

Fig. 6

Forest plots depicting sensitivity and specificity values for PMI according to the ranges of the cut-off values used in the publications; heterogeneity values are also shown (R Statistical Software)

Fig. 7

Summary receiver operating characteristic (SROC) curves of the studies plotted in Fig. 4 (R Statistical Software). (a) ROC curve plot for detecting reduced BMD (C1 vs. C2 + C3): estimated sensitivity, 0.773 (CI, 0.693–0.837); estimated specificity 0.477 (CI, 0.398–0.558). (b) ROC curve plot for detecting osteoporosis (C1 + C2 vs. C3): estimated sensitivity 0.357 (CI, 0.161–0.617); estimated specificity 0.953 (CI, 0.826–0.988)

Fig. 8

Summary receiver operating characteristic (SROC) curves of the studies plotted in Fig. 5 (R Statistical Software). (a) ROC curve plot for MCW ≤ 3 mm in detecting reduced BMD: estimated sensitivity 0.712 (CI, 0.477–0.870); estimated specificity 0.804 (CI, 0.589–0.921). (b) ROC curve plot for 3 mm < MCW ≤ 4 mm in detecting reduced BMD: estimated sensitivity 0.411 (CI, 0.252–0.591); estimated specificity 0.882 (CI, 0.773–0.943). (c) ROC curve plot for 4 mm < MCW ≤ 5 mm in detecting reduced BMD: estimated sensitivity 0.728 (CI, 0.621–0.814); estimated specificity 0.584 (CI, 0.492–0.670)

Fig. 9

Summary receiver operating characteristic (SROC) curves of the studies plotted in Fig. 6 (R Statistical Software). (a) ROC curve plot for PMI ≤ 0.3 in detecting reduced BMD: estimated sensitivity 0.404 (CI, 0.204–0.643); estimated specificity 0.722 (CI, 0.537–0.853). (b) ROC curve plot for 0.3 < PMI ≤ 0.4 in detecting reduced BMD: estimated sensitivity 0.470 (CI, 0.286–0.662); estimated specificity 0.682 (CI, 0.554–0.788)

Discussion

Bone density changes to a similar extent throughout the skeleton, so a growing number of methods are being proposed to diagnose reduced bone density. Among them is the evaluation of qualitative and quantitative radiomorphometric indices of the mandible observed on panoramic radiographs, which would allow screening during a routine dental appointment. However, the results of published studies are highly heterogeneous, which affects the overall assessment of the accuracy and applicability of the indices in practice. Data on gender, age, demographic characteristics and menopausal status in women are very heterogeneous; moreover, the cut-off values adopted by the authors span a very wide range. Some of the studies also included male participants [19, 21, 23, 24, 34, 38, 43, 52, 54, 58, 60, 69], and in five of them the study group consisted only of men [19, 21, 34, 60, 69]. In Leite et al. [21], DEXA testing was performed only in patients whose MCW was < 3 mm. Nine studies also included patients who had undergone unilateral or bilateral oophorectomy or hysterectomy, or who were using estrogen or bisphosphonate therapy, which may have affected BMD in these patients [9, 22, 30, 39, 40, 47, 49, 50, 71]. In addition, four studies also included perimenopausal women [23, 24, 26, 37]. The average methodological quality of the studies is low. Of the 64 studies included in the analysis, only two [7, 43] had a low risk of bias in all four domains presented in Fig. 3, while only 17 had a low risk of bias in three out of four domains [5, 6, 8, 18, 22, 28, 38, 39, 41, 42, 44, 45, 55, 56, 66, 70, 71] (Supplementary Table 4).

The most commonly reported indices in the literature are MCI, MCW and PMI, which appear to be the most useful for the initial detection of reduced BMD in patients and are also simple to apply. However, both linear measurements and visual assessment of pantomograms have many limitations. MCI is affected primarily by the subjectivity of the examiner, but also by the exposure parameters of the images and by differences between the equipment used to take and measure the pantomographic images. In the case of MCW and PMI measurements, in addition to equipment differences, the method of measurement adopted can also affect the results. Manual measurement carries the greatest risk of error. The evaluation can also be affected by the positioning of the patient, the magnification of the image used, and the bone structure, which can blur the visibility of certain features [72,73,74]. When MCI was used to detect osteoporosis (C1 + C2 vs. C3), very high specificity but low sensitivity was observed, indicating that the index identifies individuals without osteoporosis more accurately than those with it, which makes it less useful for identifying people with osteoporosis. Specificity was also higher than sensitivity for the lower MCW ranges and for both PMI ranges used. However, MCI (C1 vs. C2 + C3) and the highest MCW interval performed better in detecting reduced BMD than in excluding it. Only for MCW ≤ 3 mm did the sum of sensitivity and specificity (1.516) reach the value of 1.5, which is regarded as an indicator of a test's utility [75]. The evaluation of the utility of these indices is undoubtedly influenced by their reproducibility. The average κ value for intraobserver agreement for MCI was 0.818 (0.454–0.965) [26, 38], for MCW it was 0.880 (0.812–0.998) [12, 41], and for PMI it was 0.885 (0.775–0.990) [23, 24, 59].
The κ values for interobserver agreement are slightly lower: 0.744 (0.300–0.922) for MCI [25, 31], 0.769 (0.380–0.926) for MCW [21, 76], and 0.732 (0.474–0.903) for PMI [45, 76]. Based on the mean values of the κ statistic in the analysed studies, the reproducibility of MCI, MCW and PMI is rated as almost perfect for intraobserver agreement and substantial for interobserver agreement [77]. None of the analysed indices has ideal parameters of sensitivity and specificity for identifying reduced BMD. In two reports from the multicentre OSTEODENT Project, the analysis of the indices was supplemented with a clinical risk index (osteoporosis index of risk, OSIRIS) [78], which takes into account age, body weight, the use of hormone replacement therapy and fracture history. Combining indices such as MCI and MCW with OSIRIS increased specificity, meaning a better ability to exclude healthy individuals from further diagnosis of osteopenia and osteoporosis [26, 66].
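The comparison of the sum of sensitivity and specificity against the value of 1.5 [75], made earlier in this paragraph, can be reproduced from the pooled estimates reported in the Results. In the Python sketch below, the dictionary keys and variable names are hypothetical labels; the numbers are the summary values given in the meta-analysis.

```python
# Pooled (sensitivity, specificity) estimates as reported in the meta-analysis.
POOLED = {
    "MCI, C1 vs C2+C3": (0.773, 0.477),
    "MCI, C1+C2 vs C3": (0.357, 0.953),
    "MCW <= 3 mm": (0.712, 0.804),
    "3 mm < MCW <= 4 mm": (0.411, 0.882),
    "4 mm < MCW <= 5 mm": (0.728, 0.584),
    "PMI <= 0.3": (0.404, 0.722),
    "0.3 < PMI <= 0.4": (0.470, 0.682),
}

UTILITY_THRESHOLD = 1.5  # value cited in the text as an indicator of test utility

# Sum of sensitivity and specificity for each index and cut-off interval.
sums = {name: round(sens + spec, 3) for name, (sens, spec) in POOLED.items()}

# Indices whose sum reaches the threshold.
reaching_threshold = [name for name, total in sums.items() if total >= UTILITY_THRESHOLD]

if __name__ == "__main__":
    for name, total in sorted(sums.items(), key=lambda item: item[1], reverse=True):
        print(f"{name}: {total:.3f}")
    print(reaching_threshold)
```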

The lack of studies on radiomorphometric indices of the maxilla may be due to the characteristics of pantomographic images, which do not show any distinctive reference points in this region that would allow reproducible measurements. In addition, the maxilla is composed of relatively thin bones, and the presence of a large amount of spongy bone makes imaging this area difficult. These features, however, would make the maxilla a better predictor of changes in bone mineral density.

Conclusions

Radiomorphometric indices of the mandible can be useful as a screening tool to identify patients with low BMD, but they should not be used as a diagnostic method. None of the analysed indices has ideal values of sensitivity and specificity; the most useful index seems to be MCW with a cut-off value of ≤ 3 mm. MCI appears to be the least applicable index for identifying people with osteoporosis (C1 + C2 vs. C3). Additionally, MCI is the index most prone to bias, although it is the easiest to use. Attention should also be drawn to the generally low methodological quality of the selected articles. The different models of pantomographic apparatus and their software used around the world do not allow complete standardization of the method. However, it would be possible to standardize the software and parameters used to take measurements on the images, for which further research is needed. Special attention should also be paid to analysing the ability of the indices to detect osteoporosis in combination with clinical parameters.