Evaluation of an individualized facial growth prediction model based on the multivariate partial least squares method
ABSTRACT
Objectives
To develop a facial growth prediction model incorporating individual skeletal and soft tissue characteristics.
Materials and Methods
Serial longitudinal lateral cephalograms were collected from 303 children (166 girls and 137 boys), who had never undergone orthodontic treatment. A growth prediction model was devised by applying the multivariate partial least squares (PLS) algorithm, with 161 predictor variables. Response variables comprised 78 lateral cephalogram landmarks. Multiple linear regression analysis was performed to investigate factors influencing growth prediction errors.
Results
Using the leave-one-out cross-validation method, a PLS model with 30 components was developed. Younger age at prediction resulted in greater prediction error (0.03 mm/y). Further, prediction error increased in proportion to the growth prediction interval (0.24 mm/y). Girls, subjects with Class II malocclusion, growth in the vertical direction, skeletal landmarks, and landmarks on the maxilla were associated with more accurate prediction results than boys, subjects with Class I or III malocclusion, growth in the anteroposterior direction, soft tissue landmarks, and landmarks on the mandible, respectively.
Conclusions
The prediction error of the prediction model was proportional to the remaining growth potential. PLS growth prediction seems to be a versatile approach that can incorporate large numbers of predictor variables to predict numerous landmarks for an individual subject.
INTRODUCTION
Craniofacial growth is a fundamental topic in orthodontics. Particularly in clinical practice, growth prediction assists orthodontists in formulating treatment plans and visualizing therapeutic outcomes to accomplish satisfying results for growing patients. Various growth prediction methods have been developed, with respect to direction and magnitude;1–15 however, accurate growth prediction remains challenging, due to the extremely variable nature of growth in individuals.
Growth is a complex process affected by genetic and environmental factors, and varies according to sex and ethnicity.1,3,4,15 Variation in craniofacial growth according to cephalometric characteristics has been reported previously.7,16–18 Growth prediction methods estimate a patient's residual growth based on average annual increments and add the anticipated amount of growth to the patient's current state. As summarized in Table 1, growth prediction methods have included specific cephalometric templates and guides, such as mesh diagrams,9,19 grids,5,20 templates,10 and Ricketts' visual treatment objective.11,12,21 However, these approaches do not account for individual variation; rather, average growth per year is applied to every patient. Subsequent studies have used more sophisticated approaches based on multivariate statistical methods,14,22 such as Bayes' theorem,13 a multilevel model,2 and application of nonlinear growth functions.6,7 Yet, growth prediction remains among the most daunting challenges in orthodontics. Numerous factors, such as innate skeletal and soft tissue variables, as well as a large amount of biological information, such as age and sex, must be considered to produce accurate and clinically applicable predictions.

When considerable numbers of input predictor variables and output response variables are highly correlated with one another, prediction models based on the partial least squares (PLS) method have demonstrated superior predictive performance over conventional ordinary least squares (OLS) methods, such as linear regression models.23–27 A number of previous reports have demonstrated that the PLS algorithm was significantly more accurate for predicting treatment outcomes than OLS-based methods. The improved prediction capability of the PLS method may be due to its ability to control for significant correlations among the skeletal and soft tissue variables of individual patients.23–27 Furthermore, posttreatment changes are affected by various factors, including age and sex. As predicting treatment outcomes and predicting growth changes likely involve similar considerations, the PLS method is expected to be a useful tool for predicting growth by considering various factors. Through linear combinations of numerous variables via matrix algebra, PLS can reflect the skeletal and soft tissue characteristics of an individual.
The purpose of this study was to develop and evaluate a facial growth prediction model based on the PLS method.
MATERIALS AND METHODS
Growth Data Collection
Subjects comprised 303 growing patients (166 girls and 137 boys) who had not undergone any orthodontic or orthopedic treatment and had at least two serial lateral cephalometric images taken at Seoul National University Dental Hospital, Seoul, Korea, from June 29, 2006 to December 20, 2019. Mean subject ages at the beginning and end of the growth observation period were 10.9 and 14.2 years, respectively (Figure 1). Approximately three-quarters of patients had skeletal Class II or III malocclusion (Table 2), consistent with the proportion of patients with malocclusion visiting the university-affiliated hospital.28,29



Citation: The Angle Orthodontist 92, 6; 10.2319/110121-807.1

Although subjects initially wanted to receive active orthodontic treatment at their first visit, treatment did not begin immediately for various reasons. Some subjects had such a severe skeletal discrepancy that observation was necessary, until their growth ceased, before they could receive combined surgical-orthodontic treatment. For other subjects, reasons for treatment postponement included finances, poor personal timing, and/or other unreported personal issues.
The institutional review board for the protection of human subjects of the Seoul National University Dental Hospital, Seoul, Korea, reviewed and approved the research protocol (ERI 19007).
Inclusion and Exclusion Criteria
The exclusion criteria were cleft lip and palate, and a syndromic or medically compromised condition. Simple space maintainers were considered to have little impact on growth; therefore, subjects who had used one were included in the present study. For every patient, serial lateral cephalometric radiographs were taken at least twice during the growth observation period. The characteristics of the subjects included in this study are summarized in Table 2.
Cephalometric tracing and landmark identification, at the beginning (T1) and end (T2) of growth observation, were manually performed for all images by a single examiner (SJL). A total of 46 hard tissue and 32 soft tissue landmarks were identified. To orient consecutive images to the same head position, the horizontal reference plane was set to Sella-Nasion −7°, with its origin at Sella following along the Sella-Nasion plane. The anatomic landmarks, reference planes, and coordinate system used in the study are presented in Figure 2.
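The head-orientation step described above can be pictured as a translation followed by a rotation. The following is only a minimal sketch under stated assumptions: coordinates are taken with the profile facing +x and y increasing upward, the function name `to_reference_frame` is hypothetical, and the sample coordinates are made up. It is not the software used in the study.

```python
import math

def to_reference_frame(landmarks, sella, nasion, offset_deg=7.0):
    """Translate so Sella is the origin, then rotate so that the line lying
    offset_deg below Sella-Nasion becomes the x-axis.

    Assumes the profile faces +x and y increases upward (both assumptions).
    """
    sx, sy = sella
    sn_angle = math.atan2(nasion[1] - sy, nasion[0] - sx)
    rotation = -(sn_angle - math.radians(offset_deg))
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    aligned = {}
    for name, (x, y) in landmarks.items():
        dx, dy = x - sx, y - sy
        aligned[name] = (dx * cos_r - dy * sin_r, dx * sin_r + dy * cos_r)
    return aligned

# Made-up digitized coordinates (mm), for illustration only.
raw = {"S": (10.0, 20.0), "N": (75.0, 35.0), "Pog": (70.0, -80.0)}
aligned = to_reference_frame(raw, raw["S"], raw["N"])

# After alignment, Sella-Nasion is tilted 7 degrees above the new x-axis.
n_x, n_y = aligned["N"]
sn_tilt_deg = math.degrees(math.atan2(n_y, n_x))
```

Applying the same transform to the T1 and T2 tracings of a subject would place both sets of landmarks in one coordinate system, so that coordinate differences reflect change over the observation period rather than head position.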



Predictor Variables, Response Variables, and Prediction Model
Predictor variables were a heterogeneous set, including individual characteristics that could be categorized into (1) demographic (age, sex); (2) molar relationship; (3) ages before and after the growth period; and (4) Cartesian (x,y) coordinates of 78 anatomic landmarks. A total of 161 predictor variables were incorporated into the input X matrix (Table 3).

Response variables comprised the x and y directions of 46 hard and 32 soft tissue landmarks after the period of growth. A total of 156 response variables were incorporated into the output Y matrix.
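The response-variable count follows directly from the landmark counts: 2 coordinates for each of 46 + 32 landmarks. A minimal sketch of how one subject's landmarks might be flattened into a row of the Y matrix is given below. The interleaved column order (x1, y1, x2, y2, ...), the helper name `flatten_landmarks`, and the sample values are assumptions for illustration; the column order used in the study is not stated.

```python
N_HARD_TISSUE = 46
N_SOFT_TISSUE = 32
N_RESPONSE = 2 * (N_HARD_TISSUE + N_SOFT_TISSUE)  # x and y per landmark

def flatten_landmarks(landmarks, order):
    """Return [x1, y1, x2, y2, ...] following a fixed landmark order."""
    row = []
    for name in order:
        x, y = landmarks[name]
        row.extend([x, y])
    return row

# Three made-up landmarks stand in for the full set of 78.
order = ["S", "N", "Pog"]
after_growth = {"S": (0.0, 0.0), "N": (66.0, 8.1), "Pog": (62.5, -101.3)}
y_row = flatten_landmarks(after_growth, order)
```

Keeping one fixed landmark order for every subject is what allows each column of the matrix to refer to the same coordinate of the same landmark.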
A growth prediction model, based on the PLS method, was established in two steps. First, a prediction model was built and fitted based on sample data (the training data). After construction of the prediction equation, training errors, defined as the discrepancies between the actual and predicted positions after growth, were calculated to evaluate the goodness-of-fit of the prediction model. Second, validation was performed by applying the prediction equation to new data (the test data) that were not used in the prediction model building procedure. The resultant test errors (also known as validation errors) were computed using the leave-one-out cross-validation (LOOCV) technique. Given a total number of subjects, N, LOOCV constructs a prediction equation N times, each time with all the subjects except one, and then applies the prediction equation to the excluded subject.30 Test errors were compared to select an optimal model with the smallest number of PLS components (Figure 3). Since prediction errors in positive and negative directions cancel each other out when averaged, mean absolute errors and root mean squared errors of prediction were used to evaluate prediction performance (Figures 3 and 4).31,32
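The validation loop and the two error summaries can be sketched in a few lines. This is an illustration only: to keep it short, a deliberately simple placeholder model (adding the mean training increment to the T1 value) stands in for PLS, the data are made up, and all function names are hypothetical.

```python
import math

def loocv_errors(x, y, fit, predict):
    """Signed leave-one-out errors (predicted minus actual), one per subject."""
    errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]   # all subjects except subject i
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        errors.append(predict(model, x[i]) - y[i])
    return errors

def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)

def root_mean_squared_error(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Placeholder model (NOT PLS): T2 = T1 + mean increment of the training set.
def fit_mean_increment(x_train, y_train):
    return sum(b - a for a, b in zip(x_train, y_train)) / len(x_train)

def predict_mean_increment(mean_increment, x_value):
    return x_value + mean_increment

# Made-up coordinate of a single landmark (mm) at T1 and T2 for four subjects.
t1 = [60.0, 62.0, 65.0, 61.0]
t2 = [63.0, 64.0, 69.0, 63.0]

errors = loocv_errors(t1, t2, fit_mean_increment, predict_mean_increment)
mean_signed = sum(errors) / len(errors)
mae = mean_absolute_error(errors)
rmse = root_mean_squared_error(errors)
```

In this toy example the signed errors average to essentially zero even though every individual prediction is off, while the mean absolute error is 1.0 mm. That is the cancellation effect that motivates reporting absolute and squared errors instead of signed means.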






Analyzing Growth Prediction Accuracy
Multiple linear regression analysis was performed to investigate the influence of subject characteristics and landmark attributes on the accuracy of the growth prediction model. The open-source statistical language R33 was used.
RESULTS
The optimal prediction model was selected based on the root mean squared error of prediction curve (Figure 3). As the number of PLS components increased, test errors initially decreased, but then gradually increased as the number of components approached its maximum. Consequently, in this study, the optimal prediction model chosen included 30 PLS components.
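The selection step can be illustrated as picking a point on the test-error curve. The sketch below is an assumption about how such a rule might be coded, not the authors' procedure: the function name `select_n_components`, the optional tolerance, and the error values are all hypothetical, and the made-up curve is merely shaped to fall and then rise as described above.

```python
def select_n_components(rmsep_by_k, tolerance=0.0):
    """Smallest component count whose test error is within `tolerance`
    of the lowest test error on the curve."""
    best = min(rmsep_by_k.values())
    return min(k for k, err in rmsep_by_k.items() if err <= best + tolerance)

# Made-up test errors (mm) by number of PLS components, for illustration only.
rmsep = {1: 3.9, 5: 3.1, 10: 2.7, 20: 2.45, 30: 2.4, 40: 2.43, 50: 2.6}

strict_choice = select_n_components(rmsep)           # lowest point on the curve
lenient_choice = select_n_components(rmsep, 0.1)     # fewest components within 0.1 mm
```

A nonzero tolerance trades a slightly higher test error for a simpler model; whether any such tolerance was applied in the study is not stated.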
Figure 4 shows the training and test errors, in the form of mean absolute errors, for several selected anatomical landmarks. Similar patterns were observed for the training and test errors. The magnitude of errors and the differences between the training and test errors tended to increase as landmarks were located at more inferior parts of the face.
The prediction error increased in proportion to the growth prediction interval (0.24 mm/y). Further, prediction error was greater with younger age at prediction (0.03 mm/y). Conversely, the older the age at the prediction, the more accurate the prediction results.
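As a rough worked example of the two coefficients reported above, the expected change in prediction error between two scenarios can be computed as follows. This sketch assumes the effects are additive and that all other covariates are held fixed; the intercept and the remaining coefficients are not given here, so only differences (not absolute errors) are computed. The function name is hypothetical.

```python
INTERVAL_EFFECT_MM_PER_YEAR = 0.24   # error increases with a longer prediction interval
AGE_EFFECT_MM_PER_YEAR = -0.03       # error decreases with older age at prediction

def expected_error_change(delta_interval_years, delta_age_years):
    """Change in expected prediction error (mm) relative to a reference case."""
    return (INTERVAL_EFFECT_MM_PER_YEAR * delta_interval_years
            + AGE_EFFECT_MM_PER_YEAR * delta_age_years)

# Predicting 2 years further ahead for a child who is 3 years younger.
change_mm = expected_error_change(delta_interval_years=2.0, delta_age_years=-3.0)
```

Under these assumptions the expected error rises by about 0.57 mm, of which 0.48 mm comes from the longer interval and 0.09 mm from the younger starting age.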
Girls, subjects with Class II malocclusion, growth in vertical direction, skeletal landmarks, and landmarks on the maxilla had lower prediction errors than boys, subjects with Class I or III malocclusion, growth in anteroposterior direction, soft tissue landmarks, and landmarks on the mandible, respectively (Table 3).
Figure 5 illustrates real case comparisons between actual growth and prediction results. To generate a smooth curve for the soft tissue profile line, cephalometric landmarks were connected using spline functions. The prediction results were far from perfect, and their accuracy varied among subjects.



DISCUSSION
The primary purpose of this study was to develop an automated and reliable growth prediction model that can reflect individual characteristics. Craniofacial growth is considered complex and difficult to predict, since it is influenced by various factors, including sex, ethnicity, and morphological characteristics, among others. To predict such complex skeletal and soft tissue changes accompanied by growth, the present study applied the PLS method, which is capable of reflecting a vast number of predictor variables, and of predicting numerous soft and hard tissue landmarks in an individual subject.
From the clinical perspective, the test error represents the criteria for prediction accuracy, while the training error may reflect the goodness-of-fit of the model. The results demonstrated that the test errors of the prediction model tended to increase with landmarks located in more inferior positions. The reason for the low predictive accuracy of landmarks located in the more inferior portion of the face may be the distance from the cranial base. The prediction results for anatomical landmarks located in the mandible were less accurate than those for landmarks on the maxilla (Figures 4 and 5).
The growth prediction error was greater in boys with Class III malocclusion than in girls with Class II malocclusion. It is speculated that this may be because, if other conditions such as age at prediction and growth observation period were the same, then boys with Class III malocclusion would have greater residual growth potential than girls with Class II malocclusion.
Prediction results were less accurate for soft tissue. It is speculated that soft tissue changes did not follow those of hard tissue in a one-to-one manner. Further, soft tissue landmarks may have been affected by varying subject posture.
Due to the multiple iteration cycles that occur during model building procedures, the PLS algorithm takes considerable time to generate a prediction equation. In addition, applying the LOOCV technique as a validation method for PLS takes much longer than applying any other type of validation method.26,30 Consequently, building the PLS prediction model took several days on the desktop computer with ordinary specifications used in the current study; however, once the prediction model was built, a prediction result was produced within milliseconds. This is because, unlike model building, the prediction step involves no complicated iterative procedure; it requires only matrix algebra, which is simple and fast to compute. Nevertheless, a computer-aided clinical environment would be essential for practical application of this growth prediction model.
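The contrast between iterative model building and non-iterative prediction can be seen in a small sketch. The code below uses NIPALS, a standard textbook algorithm for multivariate PLS; the source does not say which PLS implementation was used, so this is an assumption, as are all function names and the synthetic data. Prediction is written as a short loop over the stored components, which is algebraically equivalent to multiplying by a fixed coefficient matrix.

```python
import random

def col_means(M):
    return [sum(col) / len(M) for col in zip(*M)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def t_times(M, v):
    """Return M^T v for a matrix stored as a list of rows."""
    return [dot(col, v) for col in zip(*M)]

def deflate(M, t, loading):
    """Subtract the rank-one term t * loading^T from M."""
    return [[m - ti * l for m, l in zip(row, loading)] for row, ti in zip(M, t)]

def pls_fit(X, Y, n_components, max_iter=500, tol=1e-12):
    """Model building: iterative NIPALS extraction of PLS components."""
    x_mean, y_mean = col_means(X), col_means(Y)
    Xr = [[v - m for v, m in zip(row, x_mean)] for row in X]
    Yr = [[v - m for v, m in zip(row, y_mean)] for row in Y]
    components = []
    for _ in range(n_components):
        u = [row[0] for row in Yr]            # start from the first response column
        w0 = t_times(Xr, u)
        if dot(w0, w0) < 1e-24:               # nothing left to model; stop early
            break
        t_prev = None
        for _ in range(max_iter):             # inner iteration for one component
            w = t_times(Xr, u)
            norm = dot(w, w) ** 0.5
            w = [v / norm for v in w]         # X weights (unit length)
            t = [dot(row, w) for row in Xr]   # X scores
            tt = dot(t, t)
            c = [v / tt for v in t_times(Yr, t)]          # Y loadings
            u = [dot(row, c) / dot(c, c) for row in Yr]   # Y scores
            if t_prev is not None and sum((a - b) ** 2 for a, b in zip(t, t_prev)) <= tol * tt:
                break
            t_prev = t
        p = [v / tt for v in t_times(Xr, t)]  # X loadings
        Xr, Yr = deflate(Xr, t, p), deflate(Yr, t, c)
        components.append((w, p, c))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}

def pls_predict(model, x):
    """Prediction: no iteration to convergence, only dot products."""
    r = [v - m for v, m in zip(x, model["x_mean"])]
    y = list(model["y_mean"])
    for w, p, c in model["components"]:
        t = dot(r, w)
        r = [rv - t * pv for rv, pv in zip(r, p)]
        y = [yv + t * cv for yv, cv in zip(y, c)]
    return y

# Synthetic, noise-free data: 8 subjects, 3 predictors, 2 responses (all made up).
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
B_true = [[1.0, -2.0], [0.5, 0.0], [-1.5, 3.0]]
Y = [[dot(row, col) for col in zip(*B_true)] for row in X]

model = pls_fit(X, Y, n_components=3)
x_new = [0.3, -1.2, 0.8]
y_hat = pls_predict(model, x_new)
y_expected = [dot(x_new, col) for col in zip(*B_true)]
```

With as many components as predictors and noise-free data, the sketch recovers the underlying linear relation, so `y_hat` matches `y_expected` up to rounding. With real, noisy, highly correlated data, far fewer components than predictors would normally be retained, and the component count would be chosen by cross-validation.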
The current study applied an advanced statistical approach; however, growth prediction performance was not as accurate as envisaged. Although imperfect, the prediction model presented here (see the real case application shown in Figure 5) may be useful as a rough guide, which is better than having no means of estimating growth changes. Because it provides automated and rapid results, it may be especially useful when used alongside other digitally derived methods.
A strength of the present study was that the data included a larger number of patients and more input and output variables than previous growth prediction studies (Table 1). A limitation of the current study was that the growth observation period varied among patients (Figure 1). The way that growth is interpreted may vary according to the measurement method applied and the observation interval.6,7 In the present study, growth observation intervals were not prearranged. Rather, records of subjects who had undergone serial cephalometric imaging were collected retrospectively through medical record collation. Consequently, the interval for growth observation ranged from 1.0 to 13.2 years. Another limitation was that the growth prediction model could not consider the effect of age-related differential growth. Inclusion of additional variables that reflect skeletal age may be necessary.
CONCLUSIONS
- The PLS growth prediction model presented here is versatile and incorporates a large number of predictor variables, as well as predicting numerous landmarks in individual subjects.
- Further refinement using nonlinear age covariates and additional variables reflecting skeletal age may result in a more accurate prediction formula.

Figure 1. Growth observation period for each subject. Red and green dots indicate when the first and second cephalometric images were taken, respectively.

Figure 2. Reference planes and cephalometric landmarks used in present study. (A) Skeletal landmarks are shown in capital letters. (B) Soft tissue landmarks are presented in lowercase letters.

Figure 3. Growth prediction error according to the number of PLS components. Growth prediction errors for Gnathion were chosen to show the pattern of error in the horizontal direction (top) and the vertical direction (bottom).

Figure 4. Growth prediction errors for selected landmarks in the training data set (blue) and the test data set (red).

Figure 5. Comparisons between actual growth and prediction results. To concisely showcase the prediction result, only soft tissue outlines are shown for patients with Class I (left), Class II (middle), and Class III (right) malocclusions.