Article Category: Research Article
Online Publication Date: 31 Jan 2025

Comparison of individualized facial growth prediction models using artificial intelligence and partial least squares based on the Mathews growth collection

Page Range: 249 – 258
DOI: 10.2319/082124-687.1

ABSTRACT

Objectives

To develop facial growth prediction models using artificial intelligence (AI) under various conditions, and to compare the performance of these models with one another and with the partial least squares (PLS) growth prediction model.

Materials and Methods

Longitudinal lateral cephalograms from 33 subjects in the Mathews growth collection were utilized. A total of 1257 pairs of before and after growth lateral cephalograms were included. In each image, 46 hard and 32 soft tissue landmarks were manually identified. Growth prediction models were constructed using a deep learning method based on the TabNet deep neural network and the partial least squares (PLS) method. Prediction accuracies of the two methods were compared.

Results

On average, artificial intelligence (AI) showed 0.61 mm less prediction error than PLS. Among the 77 predicted landmarks, AI was more accurate than PLS in 60 landmarks. When comparing AI models with varying numbers of training epochs, those trained for more epochs yielded more accurate predictions. Overall, PLS and AI exhibited greater prediction errors for soft tissue and mandibular landmarks than for hard tissue and maxillary landmarks. However, AI showed a smaller increase in prediction error in areas with greater variability.

Conclusions

AI proved to be a valuable growth prediction method, with clinically acceptable prediction errors averaging 1.49 mm for 45 hard tissue landmarks and 1.71 mm for 32 soft tissue landmarks. PLS accurately predicted landmarks with low variability. However, AI generally outperformed PLS, particularly for landmarks in the lower part of the craniofacial structure and soft tissue, where uncertainty is considerable.

INTRODUCTION

Understanding and predicting timing, pattern, and amount of human facial growth greatly impacts the effectiveness and efficiency of orthodontic treatment.1 Although some patients benefit from early intervention, others may miss critical windows, necessitating surgery as the only viable option. Additionally, some patients receive multiple rounds of treatment as they outgrow initial results or experience relapse. Ideally, with unlimited resources, prolonged treatments could yield optimal outcomes. Nevertheless, the best available scientific evidence should be used to determine the most effective and efficient treatment options.

Historical growth prediction methods2–10 provided general guidelines but were not always precise for individual variations. Despite efforts to understand and predict growth and development, the subjectivity in predicting dentofacial growth remains a challenge, as highlighted in Dr. Bishara’s article in 2000.11 Although growth prediction has been an important subject in orthodontics, few publications have addressed craniofacial growth prediction in the past 20 years.12 The precise determination of future growth magnitude, direction, and resulting facial changes continues to be uncertain, with most treatment planning relying on the subjective assessment of orthodontists.

Accurate growth prediction is challenging due to its complexity and the influence of genetic and environmental factors, which cause individual variations.13–15 As an attempt to account for these individual variations, recent statistical methods, such as discriminant analysis,16,17 multiple linear regression analysis,18 Bayes’ theorem,19,20 and nonlinear growth models,21,22 have included age and gender in growth prediction. The multivariate partial least squares regression method (PLS) has been utilized in growth prediction for its ability to manage a large number of intercorrelated individual attributes and has shown improved prediction accuracy.12 Recently, there has been growing interest in applying artificial intelligence (AI) to solve complex problems in orthodontics. There have been attempts to use AI in cephalometric landmark detection, automatic image superimposition, orthodontic diagnosis, and growth prediction.23–30

Although advances in technology allow for extensive computational analysis of large datasets, developing a robust growth prediction model remains challenging, as collecting longitudinal growth records solely for research purposes is often not feasible, especially when patients are not undergoing treatment. Therefore, the American Association of Orthodontists Foundation (AAOF) Craniofacial Growth Legacy Collection serves as an invaluable resource, providing longitudinal records of growing adolescents.31

Given the critical role of growth in successful treatment, it is imperative to base growth predictions on the best available scientific knowledge. This study aimed to develop AI-based facial growth prediction models using the Mathews collection from the AAOF growth collections. Another goal was to compare the performance of the AI model with that of a prediction model based on the PLS method, which is one of the most recently implemented statistical approaches.

MATERIALS AND METHODS

Subjects

The institutional review board for the protection of human subjects of the University of the Pacific reviewed and approved the research protocol (#2023-28). Subjects were drawn from one of the AAOF Growth Collections, the Mathews collection, which includes 36 subjects, primarily of European descent. Three subjects were excluded due to missing images. The final sample comprised 33 subjects who received annual cephalograms at a minimum of five timepoints, yielding 1257 pairs of before and after growth data.

Cephalometrics

Cephalometric images of the subjects, taken annually, were processed digitally to enhance quality for reliable landmark identification. Fiducials, pixel information, and magnification factors were considered to resize the images. A total of 78 anatomical landmarks (Table 1) were identified manually by a single examiner (SJL) with 33 years of clinical orthodontic experience (Figure 1). A Cartesian coordinate system was constructed using Sella as the origin, resulting in 77 landmarks for prediction. The horizontal reference plane was established by drawing a line 7° downward from the Sella-Nasion plane.
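The coordinate construction described above can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the y-axis points upward and that the face points toward +x; the function name `to_sella_frame` and its argument names are hypothetical, and the sign of the rotation would need to be flipped for other image orientations.

```python
import numpy as np

def to_sella_frame(landmarks, sella, nasion, sn_angle_deg=7.0):
    """Translate to Sella and rotate so the x-axis lies 7 degrees below Sella-Nasion.

    Assumption (not from the article): y points up and the face points to +x.
    """
    sella = np.asarray(sella, dtype=float)
    pts = np.asarray(landmarks, dtype=float) - sella
    sn = np.asarray(nasion, dtype=float) - sella
    # Direction of the horizontal reference: the SN direction rotated clockwise by 7 degrees.
    axis_angle = np.arctan2(sn[1], sn[0]) - np.deg2rad(sn_angle_deg)
    # Rotate every point by -axis_angle so the reference line becomes the x-axis.
    c, s = np.cos(-axis_angle), np.sin(-axis_angle)
    rot = np.array([[c, -s], [s, c]])
    return pts @ rot.T

# Hypothetical pixel-free coordinates in millimetres, for illustration only.
sella_mm, nasion_mm = (100.0, 200.0), (170.0, 210.0)
framed = to_sella_frame([sella_mm, nasion_mm], sella_mm, nasion_mm)
```

In the resulting frame Sella sits at the origin, which is why it drops out of the set of predicted landmarks, and Nasion lies 7° above the horizontal axis.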

Figure 1. Longitudinal serial growth data source: the University of the Pacific Mathews Growth Study on the AAOF Craniofacial Growth Legacy Collection website, https://www.aaoflegacycollection.org/aaof_collection.html?id = UOPMathews (A). Four fiducial points and 78 cephalometric landmarks that were manually identified using a computer vision annotation tool (B).

Citation: The Angle Orthodontist 95, 3; 10.2319/082124-687.1

Table 1. List of Anatomical Landmarks

Variables

The prediction model incorporated 159 predictor variables and 154 response variables. Predictor variables included individual characteristics such as age, gender, Angle classification, growth observation interval, and the x and y coordinates of 77 anatomic landmarks from the starting timepoint. The x and y coordinates of 77 anatomic landmarks from a later timepoint were used as response variables.
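As a rough illustration of how such variables might be assembled, the sketch below pairs every earlier visit of a subject with every later one and flattens the landmark coordinates. The all-pairs construction, the helper names `make_pairs` and `predictor_row`, and the numeric coding of the covariates are assumptions for illustration, not details taken from the study. The text names four individual characteristics, while 159 − 154 = 5 implies one more that is not named here, so the sketch treats the subject-level covariates as an opaque vector.

```python
from itertools import combinations
import numpy as np

N_LANDMARKS = 77  # predicted landmarks (Sella is the origin and is excluded)

def make_pairs(visits):
    """Pair each earlier visit with each later one (assumed pairing scheme).

    visits: list of (age_years, coords) sorted by age, coords shaped (77, 2).
    Returns tuples (start_age, interval, start_xy_flat, later_xy_flat).
    """
    pairs = []
    for (age0, xy0), (age1, xy1) in combinations(visits, 2):
        pairs.append((age0, age1 - age0, np.ravel(xy0), np.ravel(xy1)))
    return pairs

def predictor_row(start_age, interval, start_xy, subject_covariates):
    """Concatenate age, interval, coded covariates, and 154 starting coordinates."""
    return np.concatenate([[start_age, interval], subject_covariates, start_xy])

# Synthetic subject with five annual visits, for illustration only.
rng = np.random.default_rng(0)
visits = [(7.0 + k, rng.normal(size=(N_LANDMARKS, 2))) for k in range(5)]
pairs = make_pairs(visits)
row = predictor_row(pairs[0][0], pairs[0][1], pairs[0][2], np.zeros(3))
```

Under this pairing scheme a subject with five timepoints contributes 10 pairs, and each response vector holds the 154 later-timepoint coordinates.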

AI and PLS Prediction Models

Leave-one-out cross-validation (LOOCV) was employed to calculate test errors.32 The TabNet Deep Neural Network (DNN) by Arik and Pfister33 was chosen as the base model. The original TabNet DNN architecture was modified using Python programming (Python Software Foundation, Wilmington, Delaware, USA). Models were trained for 100 and 1000 epochs to compare performance while limiting the use of computational resources.34 The PLS prediction model12 was implemented using the open-source programming language, R.
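The cross-validation loop can be sketched as follows. This is a minimal illustration, not the study's pipeline: a ridge regressor stands in for both TabNet and PLS, and holding out one subject (group) at a time is an assumption, since the text does not state whether the unit left out was a subject or a single pair. All function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Stand-in regressor (ridge); NOT the TabNet or PLS model of the study."""
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc = X - mu_x
    W = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (Y - mu_y))
    return mu_x, mu_y, W

def ridge_predict(model, X):
    mu_x, mu_y, W = model
    return (X - mu_x) @ W + mu_y

def leave_one_group_out(X, Y, groups, fit=ridge_fit, predict=ridge_predict):
    """Out-of-sample predictions, refitting with one group held out each time."""
    groups = np.asarray(groups)
    Y_hat = np.empty_like(Y, dtype=float)
    for g in np.unique(groups):
        held_out = groups == g
        model = fit(X[~held_out], Y[~held_out])   # train without group g
        Y_hat[held_out] = predict(model, X[held_out])  # test on group g only
    return Y_hat
```

Grouping by subject keeps all pairs from the same individual out of the training set when that individual is being tested, which avoids an optimistic error estimate when several pairs share a subject.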

Statistical Analysis

Prediction errors were calculated using Euclidean distances between actual growth and predicted outcomes for specific landmarks. To compare the prediction accuracy of PLS and AI, t-tests adjusted for multiple comparisons using the Bonferroni correction were used. Scatterplots with 95% confidence ellipses were created to represent the pattern of prediction errors visually.35
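The quantities named above can be sketched in a few lines. The radial error follows directly from the definition in the text; the significance level of 0.05, the use of one test per landmark, and the bivariate-normal construction of the ellipse are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def radial_errors(actual, predicted):
    """Euclidean distance per landmark; inputs shaped (n_cases, n_landmarks, 2)."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return np.linalg.norm(diff, axis=-1)

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def confidence_ellipse(errors_xy, level=0.95):
    """Centre, semi-axis lengths, and axis directions, assuming bivariate normality."""
    e = np.asarray(errors_xy, dtype=float)
    centre = e.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(e, rowvar=False))
    # Chi-square quantile with 2 degrees of freedom has the closed form -2 ln(1 - level).
    scale = -2.0 * np.log(1.0 - level)
    return centre, np.sqrt(eigvals * scale), eigvecs

threshold = bonferroni_alpha(0.05, 77)  # assumed alpha, one test per landmark
```

With 77 landmark-wise comparisons and an assumed overall level of 0.05, each individual test would be judged against a threshold of about 0.00065.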

RESULTS

Table 2 presents the characteristics of the subjects at the time of growth observation. The average observation period was 8.5 years, with a mean starting age of 7.4 years. Ninety-four percent of subjects had radiographs taken more than five times, whereas about 27.3% had radiographs taken more than 10 times. Regarding malocclusion, 51.5% of the subjects had Class I and 48.5% had Class II malocclusion; no subject presented a Class III molar relationship at initial examination.

Table 2. Subject Characteristics (n = 33)

The performance of the developed models was evaluated based on prediction errors. Among AI models developed under various conditions, the model with an early stopping condition at 1000 training epochs was chosen to calculate the AI prediction errors. On average, AI predictions were more accurate, with an error 0.61 mm smaller than that of the PLS model. The average error for 45 hard tissue landmarks with the PLS prediction model was 1.87 mm, whereas the AI prediction error averaged 1.49 mm. For 32 soft tissue landmarks, the errors averaged 2.63 mm for PLS and 1.71 mm for AI. Among the 77 predicted landmarks, the AI-based prediction model showed better prediction accuracy for 60 landmarks (Table 3). The PLS-based prediction model was more accurate for 13 landmarks (Nasion, Porion, Orbitale, Basion, Articulare, Condylion, Ramus tip, Pterygomaxillary fissure, Pterygoid point, PNS, glabella, glabella-nasion contour point, and cheek point). There was no statistically significant difference for four landmarks (Nasal bone tip, Key ridge contour smoothing point 1, soft-tissue nasion, inferior tip of nasal bone). Overall, both methods showed greater prediction errors for soft tissue and mandibular landmarks than for hard tissue and maxillary landmarks. However, the AI method demonstrated a smaller increase in error for areas with more variability.
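As a quick arithmetic check, the overall averages can be approximated from the reported group means by weighting them with the landmark counts. This is only a back-of-the-envelope sketch: the group means are already rounded to two decimals, so the result need not match the reported 0.61 mm difference exactly.

```python
n_hard, n_soft = 45, 32  # landmark counts reported in the text

# Weighted means of the reported hard- and soft-tissue average errors (mm).
ai = (n_hard * 1.49 + n_soft * 1.71) / (n_hard + n_soft)
pls = (n_hard * 1.87 + n_soft * 2.63) / (n_hard + n_soft)

print(round(ai, 2), round(pls, 2), round(pls - ai, 2))
```

The weighted AI mean comes to about 1.58 mm and the weighted PLS mean to about 2.19 mm, a difference of roughly 0.60 mm, which is in line with the reported 0.61 mm once rounding of the inputs is taken into account.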

Table 3. Growth Prediction Errors (mm) Between the Partial Least Squares Regression (PLS) and Artificial Intelligence (AI). The Errors are Radial Errors or the Euclidean Distance (mm) Between the Actual Growth Position of a Given Landmark in the Lateral Cephalogram and its Predicted Position

The patterns of growth prediction errors for representative landmarks are shown in Figure 2. For hard tissue landmarks, PLS demonstrated better prediction accuracy in 13 landmarks including Nasion, Porion, Orbitale, Basion, and Condylion (Figure 2A). Generally, AI demonstrated significantly more accurate results than PLS, with AI model accuracy improving with more training epochs (Figure 2B). Among the soft tissue landmarks, only glabella, glabella-nasion contour point, and cheek point were better predicted by PLS, whereas all other landmarks were more accurately predicted by AI (Figure 2C). Overall, PLS exhibited prediction performance better than or comparable to that of the AI method in the upper part of the craniofacial structure, whereas AI outperformed PLS in the lower part of the craniofacial structure and in soft tissue.

Figure 2. Scatter plots presenting errors and 95% confidence ellipses for the three prediction models. Green, PLS; Blue, AI developed from the number of training epochs 100; Red, AI developed from the number of training epochs 1000. (A) Hard tissue landmarks better predicted by the PLS model; (B) Hard tissue landmarks better predicted by the AI models; (C) Soft tissue landmarks.


Comparisons of actual growth and prediction outcomes based on the AI and PLS methods for real case examples are shown in Figure 3. The soft tissue landmarks from glabella to the terminal point of the lower neck were connected by applying the natural cubic spline function so that those landmarks could represent a smooth curve. Although both prediction results deviated from the actual profile after growth, AI-based predictions generally appeared to be closer to the actual profile.
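The curve-drawing step can be sketched as a parametric natural cubic spline through the ordered soft tissue landmarks. The chord-length parameterisation and the function names are assumptions for illustration, since the text specifies only that a natural cubic spline was applied; consecutive landmarks are assumed to be distinct.

```python
import numpy as np

def natural_cubic_spline(t, y, t_new):
    """Evaluate the natural cubic spline through (t, y) at t_new; t strictly increasing."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    n = len(t) - 1
    h = np.diff(t)
    # Solve for second derivatives m, with m[0] = m[n] = 0 (the "natural" end condition).
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0
    for i in range(1, n):
        A[i, i - 1], A[i, i], A[i, i + 1] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        b[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, b)
    # Locate the interval of each query point and evaluate the cubic piece.
    t_new = np.asarray(t_new, dtype=float)
    i = np.clip(np.searchsorted(t, t_new, side="right") - 1, 0, n - 1)
    left, right = t[i + 1] - t_new, t_new - t[i]
    return ((m[i] * left**3 + m[i + 1] * right**3) / (6.0 * h[i])
            + (y[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right)

def smooth_profile(points, samples=200):
    """Smooth 2-D curve through ordered landmarks, parameterised by chord length."""
    p = np.asarray(points, dtype=float)
    chord = np.linalg.norm(np.diff(p, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(chord)])
    s = np.linspace(0.0, t[-1], samples)
    return np.column_stack([natural_cubic_spline(t, p[:, k], s) for k in (0, 1)])
```

Fitting x and y separately against a shared parameter lets the curve double back on itself, which a soft tissue profile does around the nose and lips and which a single function y = f(x) could not represent.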

Figure 3. Example of profile predictions for patients included in the study. White, initial; Yellow, actual profile after growth; Red, PLS prediction; Blue, AI prediction. (A) a male subject from 10 y 7 mo to 15 y 0 mo; (B) a female subject from 9 y 4 mo to 14 y 9 mo; (C) a female subject from 6 y 8 mo to 12 y 1 mo.


DISCUSSION

Research on growth prediction has not been actively conducted for about two decades.12 The complexity of predicting craniofacial growth with significant individual variation might have contributed to the lack of active research in this area. However, recent advancements in high-performance computing capable of handling the large computational demands of sophisticated algorithms have enabled the inclusion of a large number of variables to develop more customized growth prediction models. This study used TabNet, one of the DNN algorithms,33 to address challenges in growth prediction. The results indicated that growth prediction remains challenging, as larger error values were observed for specific landmarks in some subjects. Nevertheless, this study utilized the best available growth data and current technological advancements to identify methods that were more effective in predicting individual growth patterns.

Overall, AI predicted growth more accurately than PLS. However, accuracy varied according to the landmarks being predicted. PLS was comparable to AI in landmarks with minor growth variations, whereas AI was more accurate in areas with significant variability, consistent with findings from a previous study.29 Among the 77 cephalometric landmarks, the PLS-based prediction demonstrated higher accuracy in 13 landmarks, primarily cranial base landmarks such as Nasion, Porion, and Basion. Additionally, PLS was more accurate in predicting Articulare and Condylion, aligning with expected outcomes since the positions of these mandibular landmarks are determined by the cranial base. Although statistically significant, the difference in errors between the two methods for these 13 landmarks averaged 0.45 mm, all being less than 1 mm. On the other hand, AI demonstrated greater prediction accuracy for 78% of the landmarks, including most of the landmarks in the lower part of the craniofacial structure and soft tissue. Soft tissue growth is more challenging to predict than skeletal growth, due to the influence of unpredictable factors such as posture and tonicity. This trend suggests that the development of AI growth prediction models can be beneficial for areas with greater variability in which predictions are more challenging.

In terms of model training, more training epochs led to more accurate predictions with AI. This study used 100 and 1000 epochs, although previous publications reported up to 10,000 sessions for AI training.34 However, if 10,000 sessions had been included, a single computation could have taken several months.34 Figure 4 shows that, while the prediction error decreased as the number of sessions increased from 100 to 1000 (Figure 4A), the decrease in error was not significant beyond 800 to 1000 sessions (Figure 4B). As no substantial improvement in prediction performance was expected beyond 1000 sessions, this study chose 1000 epochs to achieve acceptable accuracy with a reasonable input of resources.

Figure 4. Scatter plots presenting errors and 95% confidence ellipses for prediction models from different training epochs. (A) prediction error decreases as the number of sessions increases from 100, 200, 400 to 1000; (B) prediction error does not decrease significantly beyond 800 sessions.


The errors from the models developed in this study were smaller than the errors of previously developed models.29 In a prior study using longitudinal data from 410 subjects, yielding 679 pairs of before and after growth data, PLS exhibited errors 2.11 mm greater than those of the AI method, which showed an average error of 2.78 mm.29 In this study, 33 subjects from longitudinal craniofacial growth records, resulting in 1257 pairs of before and after growth data, were included. PLS showed errors 0.61 mm greater than those of AI, which had an average error of 1.58 mm. Given that the inter-examiner error of cephalometric tracing was reported to be 1.5 ± 1.5 mm,24 the errors from this study were considered clinically acceptable. However, in some subjects or landmarks, the predicted landmarks still showed larger errors (Figure 2), partly due to the inherent nature of landmark location. These deviations may not be significant as long as the predicted landmark positions fall within the traced lines. In Figure 3A, despite a larger AI prediction error of 5.84 mm in soft tissue menton, the predicted lower mandibular soft tissue profile remained close to the actual profile.

Currently, collecting longitudinal growth data is challenging due to ethical concerns. Meanwhile, the AAOF Craniofacial Growth Legacy Collection compiles nine of the 11 recognized longitudinal collections of craniofacial growth records in the United States and Canada.31 Presently, approximately 20,000 digital images from 842 subjects are available on the AAOF website, which could facilitate further development of growth prediction methods using AI. The use of growth collections in this paper involved a significantly larger number of pairs of growth data than previous studies, offering a better representation of the general population. However, this study only included Class I and Class II subjects, whereas growth of Class III subjects is expected to be more challenging to predict due to increased mandibular growth. This limitation can be addressed by incorporating additional growth collections. Additionally, the AAOF growth collections predominantly consist of subjects of European descent, and the effects of different ethnicities on prediction accuracy still need to be explored.

CONCLUSIONS

  • AI was shown to be an effective growth prediction method, with clinically acceptable prediction errors averaging 1.49 mm for 45 hard tissue landmarks and 1.71 mm for 32 soft tissue landmarks.

  • Among AI prediction models, those with increased training epochs showed improved prediction performance, but there was no significant improvement beyond 1000 epochs.

  • AI generally outperformed PLS, particularly for landmarks in the lower part of the craniofacial structure and soft tissue, where uncertainty is considerable.

ACKNOWLEDGMENTS

This study was supported by the AAOF Orthodontic Faculty Development Fellowship Award (OFDFA 2024) and National Institute of Dental and Craniofacial Research (NIDCR) Loan Repayment Program Award.

REFERENCES

  • 1.
    Proffit WR. The timing of early treatment: an overview. Am J Orthod Dentofacial Orthop. 2006;129:S47–49.
  • 2.
    Moorrees CFA, Lebret L. The mesh diagram and cephalometrics. Angle Orthod. 1962;32:214–231.
  • 3.
    Moorrees CF, van Venrooij ME, Lebret LM, Glatky CG, Kent RL, Reed RB. New norms for the mesh diagram analysis. Am J Orthod. 1976;69:57–71.
  • 4.
    Johnston LE. A simplified approach to prediction. Am J Orthod. 1975;67:253–257.
  • 5.
    Harris JE, Johnston LE, Moyers RE. A cephalometric template: its construction and clinical significance. Am J Orthod. 1963;49:249–263.
  • 6.
    Popovich F, Thompson GW. Craniofacial templates for orthodontic case analysis. Am J Orthod. 1977;71:406–420.
  • 7.
    Broadbent BH, Golden WH. Bolton standards of dentofacial developmental growth. Mosby; 1975.
  • 8.
    Ricketts RM. A principle of arcial growth of the mandible. Angle Orthod. 1972;42:368–386.
  • 9.
    Ricketts RM. The value of cephalometrics and computerized technology. Angle Orthod. 1972;42:179–199.
  • 10.
    Ricketts RM. Planning treatment on the basis of the facial pattern and an estimate of its growth. Angle Orthod. 1957;27:14–37.
  • 11.
    Bishara SE. Facial and dental changes in adolescents and their clinical implications. Angle Orthod. 2000;70:471–483.
  • 12.
    Moon JH, Kim MG, Hwang HW, Cho SJ, Donatelli RE, Lee SJ. Evaluation of an individualized facial growth prediction model based on the multivariate partial least squares method. Angle Orthod. 2022;92:705–713.
  • 13.
    Hennessy RJ, Stringer CB. Geometric morphometric study of the regional variation of modern human craniofacial form. Am J Phys Anthropol. 2002;117:37–48.
  • 14.
    Ursi WJ, Trotman CA, McNamara JA Jr, Behrents RG. Sexual dimorphism in normal craniofacial growth. Angle Orthod. 1993;63:47–56.
  • 15.
    Hersberger-Zurfluh MA, Papageorgiou SN, Motro M, Kantarci A, Will LA, Eliades T. Genetic and environmental components of vertical growth in mono- and dizygotic twins up to 15-18 years of age. Angle Orthod. 2021;91:384–390.
  • 16.
    Abu Alhaija ES, Richardson A. Growth prediction in Class III patients using cluster and discriminant function analysis. Eur J Orthod. 2003;25:599–608.
  • 17.
    Oh H, Knigge R, Hardin A, et al. Predicting adult facial type from mandibular landmark data at young ages. Orthod Craniofac Res. 2019;22 Suppl 1:154–162.
  • 18.
    Suzuki A, Takahama Y. Parental data used to predict growth of craniofacial form. Am J Orthod Dentofacial Orthop. 1991;99:107–121.
  • 19.
    Sherwood RJ, Oh HS, Valiathan M, et al. Bayesian approach to longitudinal craniofacial growth: The Craniofacial Growth Consortium Study. Anat Rec (Hoboken). 2021;304:991–1019.
  • 20.
    Rudolph DJ, White SE, Sinclair PM. Multivariate prediction of skeletal Class II growth. Am J Orthod Dentofacial Orthop. 1998;114:283–291.
  • 21.
    Lee SJ, An H, Ahn SJ, Kim YH, Pak S, Lee JW. Early stature prediction method using stature growth parameters. Ann Hum Biol. 2008;35:509–517.
  • 22.
    Lee YS, Lee SJ, An H, Donatelli RE, Kim SH. Do Class III patients have a different growth spurt than the general population? Am J Orthod Dentofacial Orthop. 2012;142:679–689.
  • 23.
    Hwang HW, Moon JH, Kim MG, Donatelli RE, Lee SJ. Evaluation of automated cephalometric analysis based on the latest deep learning method. Angle Orthod. 2021;91:329–335.
  • 24.
    Hwang HW, Park JH, Moon JH, et al. Automated identification of cephalometric landmarks: Part 2-Might it be better than human? Angle Orthod. 2020;90:69–76.
  • 25.
    Kim MG, Moon JH, Hwang HW, Cho SJ, Donatelli RE, Lee SJ. Evaluation of an automated superimposition method based on multiple landmarks for growing patients. Angle Orthod. 2022;92:226–232.
  • 26.
    Moon JH, Hwang HW, Lee SJ. Evaluation of an automated superimposition method for computer-aided cephalometrics. Angle Orthod. 2020;90:390–396.
  • 27.
    Moon JH, Hwang HW, Yu Y, Kim MG, Donatelli RE, Lee SJ. How much deep learning is enough for automatic identification to be reliable? Angle Orthod. 2020;90:823–830.
  • 28.
    Park JH, Hwang HW, Moon JH, et al. Automated identification of cephalometric landmarks: Part 1-Comparisons between the latest deep-learning methods YOLOV3 and SSD. Angle Orthod. 2019;89:903–909.
  • 29.
    Moon JH, Shin HK, Lee JM, et al. Comparison of individualized facial growth prediction models based on the partial least squares and artificial intelligence. Angle Orthod. 2024;94:207–215.
  • 30.
    Park JA, Moon JH, Lee JM, et al. Does artificial intelligence predict orthognathic surgical outcomes better than conventional linear regression methods? Angle Orthod. 2024;94(5):549–556.
  • 31.
    Baumrind S, Curry S. American Association of Orthodontists Foundation Craniofacial Growth Legacy Collection: overview of a powerful tool for orthodontic research and teaching. Am J Orthod Dentofacial Orthop. 2015;148:217–225.
  • 32.
    Donatelli RE, Lee SJ. How to test validity in orthodontic research: a mixed dentition analysis example. Am J Orthod Dentofacial Orthop. 2015;147:272–279.
  • 33.
    Arik SÖ, Pfister T. TabNet: Attentive Interpretable Tabular Learning. Proc Conf AAAI Artif Intell. 2021;35:6679–6687.
  • 34.
    Lee JM, Moon JH, Park JA, Kim JH, Lee SJ. Factors influencing the development of artificial intelligence in orthodontics. Orthod Craniofac Res. 2024.
  • 35.
    Moon JH, Lee JM, Park JA, Suh H, Lee SJ. Reliability statistics every orthodontist should know. Semin Orthod. 2024;30:45–49.
Copyright: © 2025 by The EH Angle Education and Research Foundation, Inc.
Contributor Notes

Resident, Department of Orthodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco, California, USA.
PhD Graduate Student, Department of Orthodontics, Graduate School, Seoul National University, Seoul, Korea.
Private Practice, Cheonan, Korea.
Research Scientist, AI Research Center, DDH Inc, Seoul, Korea.
Professor and Chair, Department of Orthodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco, California, USA.
Professor, Department of Orthodontics and Dental Research Institute, Seoul National University School of Dentistry, Seoul, Korea.
Assistant Professor, Department of Orthodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco, California, USA.
Corresponding author: Dr Heeyeon Suh, Department of Orthodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco, California 94103, USA (e-mail: hsuh1@pacific.edu)
Received: 21 Aug 2024
Accepted: 01 Jan 2025