Article Category: Research Article
Online Publication Date: 10 May 2024

Does artificial intelligence predict orthognathic surgical outcomes better than conventional linear regression methods?

Page Range: 549 – 556
DOI: 10.2319/111423-756.1

ABSTRACT

Objectives

To evaluate the performance of an artificial intelligence (AI) model in predicting orthognathic surgical outcomes compared to conventional prediction methods.

Materials and Methods

Preoperative and posttreatment lateral cephalograms from 705 patients who underwent combined surgical-orthodontic treatment were collected. Predictors included 254 input variables, including preoperative skeletal and soft-tissue characteristics, as well as the extent of orthognathic surgical repositioning. Outcomes were 64 Cartesian coordinate variables of 32 soft-tissue landmarks after surgery. Conventional prediction models were built applying two linear regression methods: multivariate multiple linear regression (MLR) and multivariate partial least squares algorithm (PLS). The AI-based prediction model was based on the TabNet deep neural network. The prediction accuracy was compared, and the influencing factors were analyzed.

Results

In general, MLR demonstrated the poorest predictive performance. Among 32 soft-tissue landmarks, PLS showed more accurate prediction results in 16 soft-tissue landmarks above the upper lip, whereas AI outperformed in six landmarks located in the lower border of the mandible and neck area. The remaining 10 landmarks presented no significant difference between AI and PLS prediction models.

Conclusions

AI predictions did not always outperform conventional methods. A combination of both methods may be more effective in predicting orthognathic surgical outcomes.

INTRODUCTION

The number of patients who are willing to undergo combined surgical-orthodontic treatment has been increasing.1 Predicting surgical outcomes is crucial for planning treatment and achieving satisfactory results by visualizing postoperative changes. There have been numerous attempts to predict changes after orthognathic surgery for more than half a century (Table 1). At first, the correlation analysis between hard- and soft-tissue changes was applied to predict surgical outcomes, which was a simple one-to-one correspondence ratio.2–5 Still today, numerous commercial programs based upon simple correlation are available in the market for clinical use. Later, various prediction models based on more sophisticated methods, including multiple linear regression (MLR),6–9 partial least squares (PLS),10–13 probabilistic finite element method,14 and sparse PLS, were reported.15 Among the methods, PLS is known for its effectiveness when many variables are present and highly correlated with each other. The computation of PLS involves simple matrix algebra and it can be performed quickly. Previous publications demonstrated superior predictive performance of PLS to MLR in predicting postoperative soft-tissue changes.10–13

Table 1. Summary of Surgery Prediction Methods

Artificial intelligence (AI) has been popular in orthodontics for automating workflows such as identifying cephalometric landmarks,16–18 image superimposition,19,20 providing subsequent analyses,21 and growth prediction.22,23 A recent growth prediction study applied an AI algorithm based on the TabNet deep neural network (DNN).24 The growth prediction accuracy from this AI outperformed the results from the PLS prediction.23 This AI technology was designed for prediction scenarios involving multiple input and output variables. Soft-tissue changes after orthognathic surgery can be influenced by various factors such as age, gender, type of surgery, individual response to surgery, individual skeletal configuration, and soft-tissue characteristics. In such a complex situation, AI may be a useful tool for predicting postoperative changes because it can handle numerous input and output variables.

This study aimed to evaluate the performance of an AI model in predicting orthognathic surgical outcomes compared to conventional prediction methods.

MATERIALS AND METHODS

Subjects

The institutional review board for the protection of human subjects of the Seoul National University School of Dentistry approved the research protocol (S-D20200036).

The subjects were 705 patients (392 females and 313 males with an average age of 23.4 years) who had undergone orthognathic surgery for correction of skeletal malocclusions at Seoul National University Dental Hospital from January 2002 to December 2022. All the patients were in good health and were of Korean ethnicity. Subjects who had cleft lip and palate, injury, or craniofacial syndrome were excluded from this study. Further characteristics of the subjects are shown in Table 2.

Table 2. Characteristics of Subjects (n = 705)

The preoperative lateral cephalograms (T1) were taken close to the time of orthognathic surgery. The postoperative radiographs (T2) were taken immediately after debonding. On a total of 1410 T1 and T2 images from 705 subjects, 78 cephalometric landmarks were manually identified by a single examiner (SJL, with over 33 years of clinical experience). When the examiner and another examiner repeated the manual identification twice on 283 validation images, the intra- and inter-examiner reliability measures were 0.97 ± 1.03 mm and 1.50 ± 1.48 mm, respectively.17

The 78 landmarks consisted of 46 skeletal and 32 soft-tissue landmarks. The reference planes were set with their origin at Sella. The horizontal reference plane was set as Sella-Nasion −7 degrees (Figure 1).

Figure 1. Reference planes and cephalometric landmarks used in the present study. (A) Skeletal landmarks are shown in capital letters. (B) Soft-tissue landmarks are presented in lowercase letters.

Citation: The Angle Orthodontist 94, 5; 10.2319/111423-756.1

Variables

The predictors were 254 input variables in three groups. The first group comprised 10 variables: age, sex, Angle classification, time after surgery, type of maxillary surgery, type of mandibular surgery, type of genioplasty, type of segmental osteotomy, type of zygomatic surgery, and type of paranasal augmentation. The second group comprised 154 variables describing preoperative skeletal and soft-tissue characteristics. The third group comprised 90 variables describing the amount of surgical skeletal repositioning, namely the change in the x and y coordinates of 45 hard-tissue landmarks, as shown in Figure 1A.

The outcomes were 64 Cartesian coordinate variables of 32 soft-tissue landmarks after surgery from glabella to the terminal point of the neck (Figure 1B).

Prediction Model Construction

The conventional prediction models were two linear regression methods. MLR was based on ordinary least squares, and a stepwise variable selection method based on the Akaike information criterion was applied when developing it. The other conventional prediction model, based on the partial least squares (PLS) algorithm, combines the merits of principal component analysis and MLR.25 The PLS model of the present study included 50 PLS components.

The AI algorithm applied in the present study was TabNet, a DNN architecture capable of handling large numbers of input and output variables.24 To construct the AI-based soft-tissue prediction model, the algorithm was adapted using Python (Python Software Foundation, Wilmington, Delaware). TabNet DNN conditions were tuned with the synthetic minority oversampling technique set at 0.1. Training was limited to 10,000 epochs, with an early stopping condition that halted training once the model performance no longer improved.
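The early stopping rule can be pictured with a small, framework-independent sketch. The function name and the `patience` value are hypothetical (the patience used in the study is not stated), and the loss values are invented for illustration.

```python
def early_stopping_epoch(validation_losses, max_epochs=10_000, patience=50):
    """Return (stopping_epoch, best_loss) for a sequence of validation losses.

    Training stops when the loss has not improved for `patience` consecutive
    epochs, or when `max_epochs` is reached, whichever comes first.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    epoch = 0
    for epoch, loss in enumerate(validation_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience or epoch >= max_epochs:
            break
    return epoch, best_loss


# Invented losses: improvement stalls after the third epoch
losses = [1.0, 0.8, 0.7, 0.7, 0.71, 0.72, 0.69, 0.68]
stop_epoch, best = early_stopping_epoch(losses, patience=3)
```

With a patience of 3, the loop halts at the sixth epoch and never sees the later improvements, which is the usual trade-off when choosing a patience value.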

Statistical Analysis

A prediction model must be validated with new data that were not used during the model-building procedures. To maintain the sample size and ensure the accuracy of prediction, the leave-one-out cross-validation technique (LOOCV) was employed. LOOCV has been demonstrated to be more effective than other validation techniques, such as the classical simple split technique, five-fold, or 10-fold cross-validation methods, particularly in clinical orthodontic research.13,26

At the beginning of LOOCV, a prediction model was formulated by using all subjects except one excluded subject. After constructing the prediction model, a prediction was performed for the excluded subject, calculating a test error for that individual. This procedure was repeated N times to yield the test errors, where N was the total number of subjects.12,26 Consequently, for validation purposes, 705 prediction models were built for each of the AI, MLR, and PLS prediction methods.
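The LOOCV loop described above can be sketched as follows, here with an ordinary least squares model on toy data. The function name and data sizes are hypothetical, and the same loop would wrap any of the three prediction methods.

```python
import numpy as np


def loocv_predictions(X, Y):
    """Held-out predictions from leave-one-out cross-validation with OLS.

    X: (n_subjects, n_predictors), Y: (n_subjects, n_outcomes).
    Returns an array shaped like Y with each subject's held-out prediction.
    """
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty_like(Y, dtype=float)
    for i in range(n):
        train = np.arange(n) != i  # all subjects except subject i
        coef, *_ = np.linalg.lstsq(X1[train], Y[train], rcond=None)
        preds[i] = X1[i] @ coef  # predict the excluded subject
    return preds


# Toy data, far smaller than the 705 subjects and 254 predictors of the study
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
Y = X @ rng.normal(size=(4, 6)) + 0.1 * rng.normal(size=(30, 6))

preds = loocv_predictions(X, Y)
test_errors = preds - Y  # one row of held-out errors per subject
```

Each pass refits the model from scratch, so N subjects require N model fits per method, which is why validation cost grows quickly for slow-to-train models.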

To compare the prediction accuracy for the 32 soft-tissue landmarks, the Euclidean distance was calculated between the actual soft-tissue change after surgery and the prediction result for each landmark.
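A minimal sketch of this error measure is shown below. It assumes the coordinate columns are laid out as x1, y1, x2, y2, and so on, which is an assumption about data layout rather than something stated here; the numbers are invented.

```python
import numpy as np


def landmark_errors(actual, predicted):
    """Euclidean distance between actual and predicted position per landmark.

    actual, predicted: (n_subjects, 2 * n_landmarks) arrays with columns
    ordered x1, y1, x2, y2, ... (assumed layout).
    Returns an (n_subjects, n_landmarks) array in the input units.
    """
    diff = (predicted - actual).reshape(actual.shape[0], -1, 2)
    return np.linalg.norm(diff, axis=2)


# Two hypothetical subjects with two landmarks each (mm)
actual = np.array([[0.0, 0.0, 10.0, 10.0],
                   [1.0, 1.0, 5.0, 5.0]])
predicted = np.array([[3.0, 4.0, 10.0, 12.0],
                      [1.0, 1.0, 6.0, 5.0]])

errors = landmark_errors(actual, predicted)
mean_error_per_landmark = errors.mean(axis=0)
```

For the study's 64 outcome variables, the same reshape would yield one distance for each of the 32 soft-tissue landmarks per subject.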

t-tests with Bonferroni correction were used to compare the prediction accuracy between PLS and AI. To visualize the two-dimensional error patterns, scatterplots with 95% confidence ellipses were depicted.27 All statistical analyses were performed using the R language (Vienna, Austria).
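The two statistical tools named above can be sketched briefly. The example assumes an overall significance level of 0.05 and one test per landmark (neither is stated here), and it builds the ellipse from the sample covariance under an assumption of roughly bivariate normal errors; the analysis in the study was done in R, so this Python version is only an illustration.

```python
import math

import numpy as np


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def confidence_ellipse(errors_xy, level=0.95):
    """Center, semi-axis lengths, and axis directions of a confidence ellipse.

    errors_xy: (n, 2) array of x and y prediction errors for one landmark.
    """
    center = errors_xy.mean(axis=0)         # mean error, i.e. the bias
    cov = np.cov(errors_xy, rowvar=False)   # 2 x 2 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    # The chi-square quantile with 2 degrees of freedom has a closed form.
    scale = -2.0 * math.log(1.0 - level)
    semi_axes = np.sqrt(eigvals * scale)
    return center, semi_axes, eigvecs


# Invented errors (mm) for a single landmark
errors_xy = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
center, semi_axes, directions = confidence_ellipse(errors_xy)
per_test_alpha = bonferroni_alpha(0.05, 32)
```

The ellipse center reflects bias and its size reflects spread, so a smaller ellipse closer to the origin corresponds to a more accurate prediction.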

RESULTS

Approximately 95% of 705 patients had Class II or III malocclusion at their first visit. The average elapsed time after orthognathic surgery was 0.9 years. The most frequent types of orthognathic surgery were Le Fort I osteotomy in the maxilla and bilateral sagittal split ramus osteotomy in the mandible. At least one of these two surgeries was conducted on over 80% of the subjects. Additionally, 59.6% of the patients received genioplasty (Table 2).

Figure 2 demonstrates the scatterplots of prediction errors along with 95% confidence ellipses. A smaller ellipse indicates more accurate results.27 Three different scenarios were represented: 1) PLS prediction was more accurate than AI (Figure 2A), 2) there was no statistically significant difference between PLS and AI (Figure 2B), and 3) AI prediction was more accurate than PLS (Figure 2C). From the visual inspection of the scatterplots for all soft-tissue landmarks, MLR demonstrated the poorest predictive performance, showing either a larger size or a more deformed shape of ellipse than PLS and AI.

Figure 2. Scatterplots and 95% confidence ellipses of prediction errors for soft-tissue landmarks: (A) superior labial sulcus; (B) lower lip; (C) cervical point. The larger points at the center of each ellipse represent the mean or bias of the smaller-dotted error points enclosed by the ellipse.


Table 3 shows pairwise comparisons between the prediction results of PLS and AI. The accuracy of the predictions varied depending on the location of soft-tissue landmarks. Out of the 32 landmarks, PLS showed more accurate results in predicting 16 landmarks from glabella to the upper lip. On the other hand, AI performed better in six landmarks located in the lower border of the mandible and neck area. The remaining 10 landmarks showed no statistically significant difference between the AI and PLS prediction models.

Table 3. Comparisons in the Prediction Accuracy between Partial Least Squares Regression (PLS) and Artificial Intelligence (AI). Errors are the Euclidean Distance (mm) between Prediction Results and Real Soft-Tissue Profile after Surgery. A Superior Model is Marked with a Symbol

The prediction results shown in Figure 2 and Table 3 show many outliers and deviations, respectively. However, those aberrations may not be significant as long as the predicted positions fall within the profile curves. As shown in Figure 3, the soft-tissue prediction results are depicted to compare them with the actual changes after surgery. The soft-tissue landmarks from glabella to the terminal point on the lower neck were connected by applying the natural cubic spline function so that those soft-tissue landmarks could represent a smooth curve. Although the prediction results were distant from real soft-tissue changes in some areas, AI was particularly effective in predicting soft-tissue curves in the lower mandible and neck region (Figure 3).
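Connecting landmarks with a natural cubic spline can be sketched as below, assuming SciPy is available. The landmark coordinates are invented, and parameterizing the curve by cumulative chord length is an assumption made for this sketch, not a detail given here.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical profile landmarks (x, y) in mm, ordered from top to bottom
landmarks = np.array([
    [80.0, 10.0],
    [85.0, -10.0],
    [95.0, -35.0],
    [88.0, -50.0],
    [90.0, -65.0],
    [84.0, -85.0],
    [60.0, -100.0],
])

# Cumulative chord length as the curve parameter, so that the profile may
# double back in x or y without breaking the interpolation
segment_lengths = np.linalg.norm(np.diff(landmarks, axis=0), axis=1)
t = np.concatenate([[0.0], np.cumsum(segment_lengths)])

# "natural" sets the second derivative to zero at both end points
spline = CubicSpline(t, landmarks, bc_type="natural")
curve = spline(np.linspace(0.0, t[-1], 200))  # (200, 2) points on the profile
```

The resulting curve passes exactly through every landmark, so any prediction error at a landmark carries over to the drawn profile.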

Figure 3. Real-case examples illustrating actual soft-tissue changes after orthognathic surgery and the corresponding prediction results. There is a mismatch between the outline curves and the soft-tissue profile line due to the outline being based on the lateral cephalometric image, while the lateral photographs were superimposed for illustrative purposes. In general, AI predictions are more accurate than PLS predictions in the lower border of the mandible and neck curve expression.


DISCUSSION

The purpose of this study was to evaluate the performance of an AI model in predicting orthognathic surgical outcomes compared to conventional prediction methods. The present study was inspired by recent research that developed individualized facial growth prediction models, where AI showed effectiveness in predicting the facial changes of growing children.22,23 In this study, AI was expected to outperform conventional statistical methods such as MLR or PLS when predicting surgical outcomes. However, the results were different from what was envisaged. Among 32 soft-tissue landmarks, AI predicted better in only six outcome variables. Contrary to expectations, PLS performed better in predicting half of the total soft-tissue landmarks. Previously, while predicting facial growth, PLS showed more accurate predictions in nine out of the total of 78 landmarks, primarily located in the cranial base. According to Moon et al., statistical methods based on mathematical manipulation such as PLS or MLR may be more effective than AI when predicting craniofacial growth on landmarks with low variability.22 In this study, AI was found to be more accurate in predicting soft-tissue changes in the lower mandible and neck region, which are areas that typically exhibit significant variability after surgery. These areas may show inherent variability even with a slight postural change or without any surgical procedures.

Training an AI-based prediction model took more than 6 days, while the PLS-based model took less than 10 minutes. However, unlike the time-consuming training and model-building procedures, the prediction itself involves only relatively simple calculations. Consequently, once the prediction model was built, predictions were made in only a few milliseconds, so the prediction time of an AI model is negligible despite its longer development time. Because building the AI model took far longer than building the PLS model, it might seem reasonable to expect more accurate predictions from the AI-based model. However, as previously described, PLS was more successful than AI when predicting landmarks with less variability. Additional studies may be needed to clarify which algorithm offers better predictive performance. Since each algorithm showed different strengths according to the variability of landmarks, a hybrid approach that applies the two prediction models selectively, depending on the landmarks to be predicted, may be a more viable option than choosing a single method. Such combined use of both algorithms might offer an answer to various prediction problems.
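One way such a hybrid could work is sketched below: for each landmark, use whichever model had the lower mean held-out error. This is a hypothetical illustration with invented numbers and function names; no hybrid model was built or tested in this study, and a simple comparison of means stands in for formal significance testing.

```python
import numpy as np


def choose_model_per_landmark(errors_a, errors_b):
    """Boolean mask that is True where model A has the lower mean error.

    errors_a, errors_b: (n_subjects, n_landmarks) held-out Euclidean errors.
    """
    return errors_a.mean(axis=0) < errors_b.mean(axis=0)


def hybrid_prediction(pred_a, pred_b, use_a):
    """Combine two (n_subjects, n_landmarks, 2) predictions per landmark."""
    return np.where(use_a[None, :, None], pred_a, pred_b)


# Invented held-out errors (mm) for two subjects and three landmarks
errors_pls = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 5.0]])
errors_ai = np.array([[2.0, 2.0, 1.0], [2.0, 2.0, 1.0]])
use_pls = choose_model_per_landmark(errors_pls, errors_ai)

pred_pls = np.zeros((2, 3, 2))  # placeholder PLS predictions
pred_ai = np.ones((2, 3, 2))    # placeholder AI predictions
hybrid = hybrid_prediction(pred_pls, pred_ai, use_pls)
```

If the model choice were made on the same held-out errors later used to report accuracy, the hybrid's performance would be overestimated, so the selection step would need its own validation data.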

One of the strengths of the present study was that, as of April 2024, it is the first AI study to use the TabNet DNN algorithm to predict orthognathic surgical outcomes. Additionally, this study included the greatest number of subjects, 705, as shown in Table 1, compared to the 620 subjects in the study by Veltkamp et al. (2002).8 This larger sample size might have contributed to improved prediction accuracy.

A limitation of the current study was that AI was not capable of explaining how the results were obtained. In comparison, conventional statistical models could provide the relationships via coefficient estimates and loading matrices. This may be why AI is sometimes referred to as a black box. Another limitation was that the AI prediction results could have been different if algorithms other than the TabNet DNN algorithm had been used.24 Relying on cephalograms was another weak point of this study. However, computed tomography is not commonly obtained during a patient’s first visit, whereas lateral cephalograms are routinely used to diagnose the need for orthognathic surgery. Although three-dimensional images may offer a more realistic visualization, the lateral profile line from a cephalogram is often viewed as simpler to use in practice.

Although AI may be regarded as a recent tool for which many orthodontists see a need, AI by itself may not be the ultimate solution, at least in predicting orthognathic surgical outcomes. The initial expectation was that AI could be an adaptable solution for various challenges and complex issues in clinical orthodontics. However, this study found that AI predictions might not always be as reliable as expected in certain areas. If so, AI may not always outperform traditional statistical methods, especially when there is low variability and/or a clear cause-and-effect relationship. One such scenario could be predicting changes in the soft-tissue profile after orthodontic treatment. Unlike growth prediction scenarios, there is a clearer cause-and-effect relationship between dentoalveolar and soft-tissue changes in orthodontic treatment.28 Additionally, the changes in soft tissue after orthodontic treatment are not as variable as the changes following orthognathic surgical procedures. Consequently, it is cautiously anticipated that AI-based prediction models might not be as effective as methods based on MLR or PLS in predicting changes in the soft-tissue profile after orthodontic treatment.28 This could be an interesting topic for future AI research in orthodontics.

CONCLUSIONS

  • AI effectively predicted soft-tissue curves in the lower mandible and neck region, which are typically characterized by wide variability after surgery. However, PLS presented superior predictions in more areas. Consequently, a combination of AI and conventional methods seemed to be a more effective way of predicting orthognathic surgical outcomes.

ACKNOWLEDGMENTS

Some of the data presented in the current study was included as part of a doctoral dissertation (JAP). This study was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant no. HI22C1518), and partly supported by grant 05-2023-0027 from the SNUDH Research Fund.

REFERENCES

  • 1. Lee CH, Park HH, Seo BM, Lee SJ. Modern trends in Class III orthognathic treatment: a time series analysis. Angle Orthod. 2017;87:269–278.

  • 2. Robinson SW, Speidel TM, Isaacson RJ, Worms FW. Soft tissue profile change produced by reduction of mandibular prognathism. Angle Orthod. 1972;42:227–235.

  • 3. Hershey HG, Smith LH. Soft-tissue profile change associated with surgical correction of the prognathic mandible. Am J Orthod. 1974;65:483–502.

  • 4. Suckiel JM, Kohn MW. Soft-tissue changes related to the surgical management of mandibular prognathism. Am J Orthod. 1978;73:676–680.

  • 5. Willmot DR. Soft tissue profile changes following correction of class III malocclusions by mandibular surgery. Br J Orthod. 1981;8:175–181.

  • 6. Quast DC, Biggerstaff RH, Haley JV. The short-term and long-term soft-tissue profile changes accompanying mandibular advancement surgery. Am J Orthod. 1983;84:29–36.

  • 7. Mobarak KA, Espeland L, Krogstad O, Lyberg T. Soft tissue profile changes following mandibular advancement surgery: predictability and long-term outcome. Am J Orthod Dentofacial Orthop. 2001;119:353–367.

  • 8. Veltkamp T, Buschang PH, English JD, Bates J, Schow SR. Predicting lower lip and chin response to mandibular advancement and genioplasty. Am J Orthod Dentofacial Orthop. 2002;122:627–634.

  • 9. Kneafsey LC, Cunningham SJ, Petrie A, Hutton TJ. Prediction of soft-tissue changes after mandibular advancement surgery with an equation developed with multivariable regression. Am J Orthod Dentofacial Orthop. 2008;134:657–664.

  • 10. Suh HY, Lee SJ, Lee YS, et al. A more accurate method of predicting soft tissue changes after mandibular setback surgery. J Oral Maxillofac Surg. 2012;70:e553–562.

  • 11. Lee HJ, Suh HY, Lee YS, et al. A better statistical method of predicting postsurgery soft tissue response in Class II patients. Angle Orthod. 2014;84:322–328.

  • 12. Lee YS, Suh HY, Lee SJ, Donatelli RE. A more accurate soft-tissue prediction model for Class III 2-jaw surgeries. Am J Orthod Dentofacial Orthop. 2014;146:724–733.

  • 13. Yoon KS, Lee HJ, Lee SJ, Donatelli RE. Testing a better method of predicting postsurgery soft tissue response in Class II patients: a prospective study and validity assessment. Angle Orthod. 2015;85:597–603.

  • 14. Knoops PGM, Borghi A, Ruggiero F, et al. A novel soft tissue prediction methodology for orthognathic surgery based on probabilistic finite element modelling. PLoS One. 2018;13:e0197209.

  • 15. Suh HY, Lee HJ, Lee YS, Eo SH, Donatelli RE, Lee SJ. Predicting soft tissue changes after orthognathic surgery: the sparse partial least squares method. Angle Orthod. 2019;89:910–916.

  • 16. Moon JH, Hwang HW, Yu Y, Kim MG, Donatelli RE, Lee SJ. How much deep learning is enough for automatic identification to be reliable? Angle Orthod. 2020;90:823–830.

  • 17. Hwang HW, Park JH, Moon JH, et al. Automated identification of cephalometric landmarks: Part 2-Might it be better than human? Angle Orthod. 2020;90:69–76.

  • 18. Park JH, Hwang HW, Moon JH, et al. Automated identification of cephalometric landmarks: Part 1-Comparisons between the latest deep-learning methods YOLOV3 and SSD. Angle Orthod. 2019;89:903–909.

  • 19. Kim MG, Moon JH, Hwang HW, Cho SJ, Donatelli RE, Lee SJ. Evaluation of an automated superimposition method based on multiple landmarks for growing patients. Angle Orthod. 2022;92:226–232.

  • 20. Moon JH, Hwang HW, Lee SJ. Evaluation of an automated superimposition method for computer-aided cephalometrics. Angle Orthod. 2020;90:390–396.

  • 21. Hwang HW, Moon JH, Kim MG, Donatelli RE, Lee SJ. Evaluation of automated cephalometric analysis based on the latest deep learning method. Angle Orthod. 2021;91:329–335.

  • 22. Moon JH, Shin HK, Lee JM, et al. Comparison of individualized facial growth prediction models based on the partial least squares and artificial intelligence. Angle Orthod. 2024;94:207–215.

  • 23. Moon JH, Kim MG, Hwang HW, Cho SJ, Donatelli RE, Lee SJ. Evaluation of an individualized facial growth prediction model based on the multivariate partial least squares method. Angle Orthod. 2022;92:705–713.

  • 24. Arik SÖ, Pfister T. TabNet: attentive interpretable tabular learning. Proceedings of the AAAI Conference on Artificial Intelligence. 2021;35:6679–6687.

  • 25. Kim K, Lee SJ, Eo SH, Cho SJ, Lee JW. Modified partial least squares method implementing mixed-effect model. Commun Stat Appl Methods. 2023;30:65–73.

  • 26. Donatelli RE, Lee SJ. How to test validity in orthodontic research: a mixed dentition analysis example. Am J Orthod Dentofacial Orthop. 2015;147:272–279.

  • 27. Moon JH, Lee JM, Park JA, Suh HY, Lee SJ. Reliability statistics every orthodontist should know. Semin Orthod. 2024;30:45–49.

  • 28. Cho SJ, Moon JH, Ko DY, et al. Orthodontic treatment outcome predictive performance differences between artificial intelligence and conventional methods. Angle Orthod. 2024; in press. https://doi.org/10.2319/111823-767.1

Copyright: © 2024 by The EH Angle Education and Research Foundation, Inc.

Contributor Notes

Clinical Lecturer, Department of Orthodontics, Seoul National University Dental Hospital, Seoul, Korea.
Private Practice, Cheonan, Korea.
Graduate Student (PhD), Department of Orthodontics, Graduate School, Seoul National University, Seoul, Korea.
Professor, Department of Oral and Maxillofacial Surgery, Seoul National University School of Dentistry, Seoul, Korea.
Private Practice, West Palm Beach, Florida, USA.
Professor, Department of Orthodontics, Seoul National University School of Dentistry, Seoul, Korea.
Corresponding author: Dr Shin-Jae Lee, Professor, Dental Research Institute, Seoul National University School of Dentistry, Jongro-Gu, Seoul 03080, Korea (e-mail: nonext.shinjae@gmail.com)
Received: 01 Nov 2023
Accepted: 01 Mar 2024