Preciseness of artificial intelligence for lateral cephalometric measurements

Background: The aim of the study was to assess the accuracy and efficiency of a new artificial intelligence (AI) method in performing lateral cephalometric radiographic measurements.
Materials and methods: A total of 200 lateral cephalometric radiographs were assessed for quality and included. Three methods were used to perform the cephalometric measurements: (1) the AI method using WebCeph software (AssembleCircle Corp., Gyeonggi-do, Republic of Korea), (2) the modified AI method using WebCeph software after manual modification of the landmarks' positions, and (3) OnyxCeph software (Image Instruments GmbH, Chemnitz, Germany) with manual landmark identification and digital generation of the measurements. The measurements produced by the three methods were compared, as was the time each method required to generate them.
Results: Statistically significant differences were detected between the measurements resulting from the three methods. Fewer differences were detected between the modified AI method and the OnyxCeph method. The AI method produced the measurements the fastest, followed by the modified AI method and then the OnyxCeph method.
Conclusions: Considering the AI software used, AI followed by manual tuning of the landmarks' positions might be an accurate method for lateral cephalometric analysis. AI alone is still not fully reliable at locating the different landmarks on lateral cephalometric radiographs.


Introduction
Orthodontic data collection and diagnosis is a process that consumes time and effort [1]. This process usually encompasses meticulous analysis of different diagnostic records including photographs, study models, and radiographs [2,3]. Among these, lateral cephalometrics is one of the most important analyses to be performed before and after treatment [4]. Lateral cephalometric analysis is essential in the diagnosis and treatment planning of orthodontic and orthognathic cases to clarify the skeletal and dental relationships, and to analyze growth- and treatment-related changes [2].
Nowadays, the concept of "digital orthodontics" is being increasingly implemented [5]. Recently, the idea of using "artificial intelligence" (AI) in orthodontics has been introduced [6]. Multiple advances utilizing AI technology have been rapidly implemented in diagnosis and treatment, aiming to refine orthodontic practice [6]. One of the promising applications of AI in orthodontics is in the field of digital cephalometry [7]. AI was initially incorporated in tracing software as an adjunct that connects the traced points previously marked by the operator. Calculations of angles and distances would then be performed according to a programmed algorithm. A more recent innovation is fully automatic, operator-independent detection of the different cephalometric landmarks. Accordingly, identification of all landmarks and the resulting generation of measurements would be completed in seconds [8]. If accurate, this fast production of cephalometric analysis would decrease the time consumed in landmark detection to produce the needed measurements and could reduce the orthodontists' treatment planning time.
Two previous studies [9,10] investigated the use of AI in orthodontics. Neither of those studies discussed the accuracy of AI in generating the basic cephalometric measurements.
The aim of the current study was to investigate the accuracy and efficiency of AI by applying a newly proposed, fully automated tracing software, WebCeph (AssembleCircle Corp., Gyeonggi-do, Republic of Korea), with and without further manual support, in the production of cephalometric measurements.

Selection of the lateral cephalometric radiographs
The sample comprised pretreatment lateral cephalograms of 200 growing and adolescent female patients, filtered from 1879 radiographs in the pretreatment records saved on the outpatient clinic computers of the departments of orthodontics of two different universities. It was ensured that both universities use the same cephalometric radiograph equipment to avoid differences in magnification. All radiographs used in the current study were meticulously selected to be free from any artifacts. The radiographs were checked for eligibility against the following criteria: clear anatomical landmarks, presence of a ruler for calibration, patient in natural head position, shadows of the cephalostat, and mandible in centric relation.

Measurements' production
After quality checking, all selected radiographs were used to generate the cephalometric measurements listed in Table 1 and Fig. 1. The measurements were obtained by three methods: (1) using OnyxCeph software (Image Instruments GmbH, Chemnitz, Germany), where manual landmark detection was followed by digital calculation of the measurements using the software's algorithms; (2) using the WebCeph [8] website (www.webceph.com) for automatic landmark detection (using AI) and calculation of the measurements (Fig. 2); and (3) using the AI provided by the WebCeph [8] website followed by manual tuning and modification of the automatically located landmarks (Fig. 3).
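Once the landmarks are located, whether manually or automatically, each angular measurement reduces to plane geometry on the landmark coordinates. The following is a minimal sketch of that step only, assuming landmarks are given as (x, y) coordinates in millimetres after calibration. The function name `angle_at` and all coordinates are hypothetical and chosen for illustration; they are not taken from the study, and the sketch does not describe how OnyxCeph or WebCeph compute their measurements internally.

```python
from math import acos, degrees, hypot


def angle_at(vertex, p1, p2):
    """Return the unsigned angle (degrees) at `vertex` between the rays
    vertex->p1 and vertex->p2."""
    v1 = (p1[0] - vertex[0], p1[1] - vertex[1])
    v2 = (p2[0] - vertex[0], p2[1] - vertex[1])
    norm = hypot(*v1) * hypot(*v2)
    if norm == 0:
        raise ValueError("a landmark coincides with the vertex")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return degrees(acos(cos_theta))


# Hypothetical landmark coordinates in mm (illustration only, not study data).
landmarks = {
    "S": (0.0, 0.0),     # sella
    "N": (66.0, 18.0),   # nasion
    "A": (64.0, -36.0),  # A point
    "B": (58.0, -78.0),  # B point
}

# SNA and SNB are both angles with their vertex at nasion.
sna = angle_at(landmarks["N"], landmarks["S"], landmarks["A"])
snb = angle_at(landmarks["N"], landmarks["S"], landmarks["B"])
# ANB taken as SNA - SNB here, which keeps the sign of the difference;
# this assumes A and B lie on the same side of the S-N line.
anb = sna - snb

print(f"SNA = {sna:.1f}, SNB = {snb:.1f}, ANB = {anb:.1f}")
```

Because every measurement is derived from the landmark coordinates in this way, a small error in locating one landmark propagates to every angle or distance that uses it.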
The primary outcome was the accuracy of the measurements produced by the fully automated software WebCeph (with and without manual modifications) compared with the measurements derived from the regular tracing software. In addition, the complete analysis time required by each of the three methods was recorded and compared as the secondary outcome.
Table 1 Cephalometric measurements and definitions [12]

Sample size calculation
Sample size calculation was performed using data from a previous study measuring the SNA angle with the help of the OnyxCeph software [11]. The acceptable difference between the SNA angle calculated with OnyxCeph and that produced by the AI was set at 1°. Using the standard deviation of 4.6 from that paper, a power of 80%, and a type I error of 0.05, the calculation was performed with the PS calculator (version 3.1.2, Creative Commons Attribution-NonCommercial-NoDerivs 3.0, USA). The calculation indicated a need for a minimum of 168 cephalometric radiographs; thus, 200 radiographs were included in the current study.
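The order of magnitude of this estimate can be checked with the textbook normal-approximation formula for a paired comparison, n = ((z_{1-α/2} + z_{power}) · SD / Δ)². The sketch below is an illustration under stated assumptions, not a reproduction of the PS calculator: it assumes a two-sided test, a paired design, and a type I error of 0.05 (consistent with the significance level given under Statistics). The function name `paired_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(sd, delta, alpha=0.05, power=0.80):
    """Normal-approximation sample size for detecting a mean paired
    difference `delta` given standard deviation `sd` (two-sided test)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) * sd / delta) ** 2)


# Inputs stated in the text: SD = 4.6, acceptable difference = 1 degree.
n = paired_sample_size(sd=4.6, delta=1.0)
print(n)
```

The normal approximation gives 167, one fewer than the 168 reported. Calculators based on the t distribution typically return a slightly larger number than the normal approximation, which may account for the difference.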

Statistics
The significance level was set at P ≤ 0.05. Statistical analysis was performed with SPSS® Statistics (version 20, IBM, Armonk, NY, USA). Handling of data was done using Excel software (Microsoft, Redmond, WA, USA). Data were explored for normality using Kolmogorov-Smirnov and Shapiro-Wilk tests. According to the behavior of the data (either parametric or nonparametric), the suitable statistical test was selected.
The means, standard deviations (SD), and confidence intervals (CI) were calculated for each group in each test. For normally distributed data, one-way analysis of variance (ANOVA) was used to compare the results of the three methods, followed by the Bonferroni test to make pairwise comparisons between the methods. Intraclass correlation coefficients (ICC) were calculated to determine the intra- and interobserver reliability of the manual identification of the landmarks in the study.
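The two-stage logic of an overall one-way ANOVA followed by Bonferroni-corrected pairwise comparisons can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the SPSS procedure used in the study. The function names `one_way_anova_f` and `bonferroni` are hypothetical, and the numbers are toy values, not study data. Converting the F statistic into a P value requires the F distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.f_oneway`) would normally be used for that step.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [fmean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P value by the number of
    comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy values for three methods (illustration only, not study data).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)

# Three methods give three pairwise comparisons; the raw P values here
# are placeholders to show the adjustment.
adjusted = bonferroni([0.01, 0.02, 0.5])
```

With three methods there are three pairwise comparisons, so each raw P value is multiplied by 3 before being compared with the 0.05 threshold.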

Results
All 200 included radiographs were measured using the three methods. Acceptable intraobserver reliability and agreement between all the readings were found (ICC values ranged from 0.81 to 0.91). Acceptable interobserver reliability was also observed for the measurements carried out (ICC values ranged from 0.79 to 0.92).
For the overall comparison between the three methods (Table 2 and Fig. 4), statistically significant differences were found in all measurements and the time required for the measuring process.
For the pairwise comparisons (Table 3), statistically significant differences were found when comparing the AI vs the modified AI methods for all measurements except for the SN/MP and U1/NA angles and the interincisal angle, which did not differ significantly. When comparing the AI method vs the OnyxCeph analysis, statistically significant differences were detected for all measurements except for the L1/NB angle. For the modified AI method vs the OnyxCeph analysis, statistically significant differences were detected for the SNA, ANB, SN/MP, and L1/NB angles, and the L1/NB linear measurement. The other measurements did not differ significantly.
For the time elapsed in the measuring process (Table 2 and Fig. 4), statistically significant differences were detected between the three methods, denoting that the AI method was the fastest followed by the modified AI method then the measurement using OnyxCeph.

Discussion
Lateral cephalometric analysis is an integral step in orthodontic diagnosis and treatment planning [4]. The process of point identification and generation of the measurements is very tedious and time consuming. Thus, any accurate attempt to accelerate the process would be of great benefit. Recent technologies and AI have provided a new approach for cephalometric landmark identification and generation of measurements. Testing the accuracy of these technologies is crucial to confirm the reliability of their use and to provide clinicians with a simpler approach to cephalometric analysis.
Table 2 abbreviations: SD standard deviation; AI fully automated landmark identification and tracing using artificial intelligence (AI); modified fully automated landmark identification and tracing followed by tuning of the landmark positions; OnyxCeph manual landmark detection followed by digital calculation of the measurements using software algorithms.
Fig. 4 The differences between the methods. AI artificial intelligence method; Modified the modified artificial intelligence method; OnyxCeph measurements produced using OnyxCeph software (Image Instruments GmbH, Chemnitz, Germany). Abbreviations are defined in Table 1.
Although the differences between application of the AI alone and the conventional digital tracing were significant, the modified AI method resulted in readings closer to those of the conventional method. Based on this, fine tuning of the automatically located landmarks by the orthodontic clinician would be mandatory to achieve accurate final readings. This also indicates the need for further machine learning of the algorithm so that it locates some cephalometric landmarks more precisely.
Table 3 Mean differences and 95% confidence intervals of the different cephalometric measurements between two methods; the Bonferroni test was used to compare two methods in each measurement.
The current study also helped in detecting the measurements that were most affected by inaccurate localization of the points by the program. This could act as a guideline for the software provider in enhancing the point detection efficiency through further learning of the algorithm. Some differences were detected when comparing the results of the current study with those of other studies that tested the efficiency of other fully automated tracing software programs [9,10]. Both studies [9,10] found that AI alone is an accurate tool for cephalometric landmark identification; this was not the case in the current study.
Assessing the total tracing time required by the three different methods was also crucial in the current study, as the main purpose of inventing the fully automated tracing software was to reduce the time required by the clinician to locate the landmarks. The current software proved to be efficient in this respect, even when the modified version was tested. Both the fully automated AI and the AI with operator modifications required significantly less time than the regular digital tracing method.

Conclusions
Within the limitations of the current study and also considering the use of AI software from only one provider, the following can be concluded: The AI method followed by fine tuning of the location of landmarks (the modified AI method) was successful and can be used as an alternative to ordinary digital landmark identification for lateral cephalometric analysis. The modified AI method was the most efficient method. Use of AI alone was not accurate enough for landmark identification and accordingly not precise in the generation of lateral cephalometric measurements. Measurements were generated fastest using the AI method, followed by the modified AI method, and finally the conventional digital method.
Measurement: Definition
SNA (°): Angle formed by the SN line and the NA line
SNB (°): Angle formed by the SN line and the NB line
ANB (°): Angle formed by the NA line and the NB line
U1 to NA (mm): Distance from incisal edge of upper 1 to N-A line
U1 to NA (°): Angle from incisal edge of upper 1 to N-A line
Interincisal angle (°): The angle between the long axes of the maxillary and mandibular incisors
L1 to NB (mm): Distance from incisal edge of lower 1 to N-B line
L1 to NB (°): Angle from incisal edge of lower 1 to N-B line
SN to GoMe (°): Angle formed by the SN line and the Go-Me line

Table 2 Mean and standard deviation of the different cephalometric measurements by the three methods compared using one-way analysis of variance (ANOVA)
*Significant (p < 0.05); ns nonsignificant (p > 0.05). Values sharing the same letter are not statistically significant; values sharing different letters are statistically significant.