Reliability of three-dimensional anterior cranial base superimposition methods for assessment of overall hard tissue changes: A systematic review
ABSTRACT
Objective:
The purpose of this systematic review was to synthesize the available literature concerning the reliability of three-dimensional superimposition methods when assessing changes in craniofacial hard tissues.
Materials and Methods:
Four electronic databases were searched. Two authors independently reviewed potentially relevant articles for eligibility. Clinical trials, cohort, case-control, and cross-sectional studies that evaluated the reliability of three-dimensional superimposition methods on the anterior cranial base were included.
Results:
Six studies fulfilled the inclusion criteria. Four studies used the voxel-based registration method, one used the landmark-based method, and one used the surface-based method. Regarding reliability, the voxel-based studies showed on average a difference of 0.5 mm or less between images. The optimized analysis using a six-point correction algorithm in the landmark-based method showed an average error of 1.24 mm between images.
Conclusions:
Although reliability appears to be adequate, the small sample sizes and high risk of bias among studies mean that the available evidence is still insufficient to draw strong conclusions.
INTRODUCTION
Superimposition of cephalometric headfilms taken at defined intervals is used by researchers and clinicians to help in orthodontic diagnosis and treatment planning and to obtain a general view of growth changes and treatment outcomes in the dentofacial complex.1–3 Conventional lateral cephalometric radiographs have proven to be an invaluable part of initial and final orthodontic records to quantify and determine craniofacial growth changes and effects of orthodontic treatment.1,4 However, two-dimensional (2D) cephalometric radiographs suffer from a number of inherent flaws, such as errors generated because of inadequate patient head position, alignment of the imaging device, inherent geometric distortions, and differential magnification created by projection distance and beam divergence.1,5–9
During the past decade, craniofacial three-dimensional (3D) digital records have become increasingly popular among orthodontists as the specialty progressed toward a 3D virtual representation of the patient for diagnosis, treatment planning, and surgical simulation. The advanced imaging capabilities of cone-beam computed tomography (CBCT) are depicted through 3D cephalometric analysis, temporomandibular joint visualization, and 3D evaluation of dental anomalies, to name only a few.4,10 A single scan provides an overlap-free 3D visualization of different components of the skull, enables volumetric measurements to be made, and allows a detailed assessment of the maxillofacial structures in variable thickness of the axial, coronal, and sagittal slices, providing real measurements with no magnification.9,11
Similar to 2D cephalometric tracings, CBCT images can now be superimposed, allowing a 3D evaluation of growth changes, treatment effects, and stability over a certain time interval through registration points, angles, shapes, and volumes.12–14 One of the main challenges of 3D superimposition of serial images is to understand that linear/angular measurements in 2D and 3D images are not directly comparable because of differences in size, shape, and relative spatial location of the skeletal, dental, and soft tissues between the two imaging systems.4,15
The following three general methods of 3D cephalometric superimposition have been published and used for clinical diagnosis and assessment of orthodontic treatment outcomes: (1) voxel based,3,4,16–19 (2) landmark based,7,20 and (3) surface based.4,21
A review addressing the 3D CBCT superimposition methods was published in 2015.4 Although it discussed the three main techniques, it focused mainly on their clinical applications, benefits, and limitations. It did not consider the measuring capabilities of any of those methods. No systematic review has been specifically conducted to investigate the reliability of these 3D superimposition methods when assessing changes in craniofacial hard tissues. Without an in-depth understanding of the measurement properties of each method, indiscriminate use should be questioned, as treatment decisions/assessments may not have been based on sound superimposition evidence.
The purpose of this systematic review was to synthesize the available literature concerning the reliability of 3D superimposition methods for evaluating changes in craniofacial hard tissues.
MATERIALS AND METHODS
This systematic review followed, whenever applicable, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist.22
Protocol and Registration
The study protocol was not registered in advance.
Eligibility Criteria
The following selection criteria were applied for the review:
- Study design: clinical trials, cohort, case-control, and cross-sectional studies that evaluated the reliability, repeatability, or reproducibility of 3D superimposition methods on the anterior cranial base were included. No restrictions were applied regarding language or year of publication.
- Exclusion criteria: review articles, meeting abstracts, book chapters, case reports, editorial letters, and personal opinions were excluded from the review.
Information Sources and Search Strategy
A systematic search of four electronic databases (Embase, Medline via OVID, Web of Science, and SCOPUS) was performed. All searches were inclusive until December 2016. The search strategy was designed with the assistance of a health science senior librarian. Appropriate truncation and word combinations were selected and adjusted for each database search. Keywords used in the search and combination of terms per database can be found in Table 1.

Study Selection
The relevant articles were selected through a two-phase process. In phase 1, two authors (CPG and MLV) independently reviewed the titles and abstracts of all references. In phase 2, full texts of potentially relevant abstracts were retrieved, reviewed, and screened by the same two reviewers according to the same selection criteria to confirm final selection while considering the full manuscript. Any disagreement was settled by means of discussion until a mutual consensus was reached.
Data Collection Process and Data Items
Data were extracted from each of the selected studies using a developed standardized data collection form based on the Cochrane Consumers and Communication Review.23 One reviewer (CPG) collected the required information from the selected articles. The second reviewer (MLV) cross-checked the gathered data and confirmed its accuracy. Once again, any disagreement in either phase was resolved by consensus.
Risk of Bias in Individual Studies
The Consensus-Based Standards for the Selection of Health Measurement Instruments checklist, a standardized tool for assessing the methodological quality of studies that evaluate measurement properties, was used for quality assessment of included studies.24 Disagreements between the reviewers in relation to quality assessment were resolved by means of discussion, and the third reviewer (CFM) made a final decision if consensus was not reached by the first two reviewers.
Synthesis of Studies
Because of the nature of the question and the available data, a meta-analysis was not possible: the included studies assessed reliability of measurements from different craniofacial anatomical regions.
RESULTS
Study Selection
A flow chart of the selection process of articles included in this study is outlined in Figure 1. A total of 254 manuscripts were selected for a phase 1 assessment. Thereafter a total of 219 studies were excluded following abstract/title assessment. Only 35 references and 1 additional study found through a manual search (from the reference list) were subsequently selected and received a full-text reading (phase 2). From the total full-text articles retrieved and reviewed, 30 studies were later excluded; therefore, only 6 studies fulfilled the criteria to be included in this review.
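The counts reported in the selection process can be checked with simple arithmetic. The sketch below is only an illustration of that check; the function name `prisma_flow` and its parameter names are hypothetical and are not part of the review's methods.

```python
# Counts taken from the study-selection paragraph above.
records_screened = 254        # phase 1: title/abstract assessment
excluded_phase1 = 219         # excluded after title/abstract assessment
manual_search_additions = 1   # found through the manual reference-list search
excluded_phase2 = 30          # excluded after full-text reading


def prisma_flow(screened, excluded_1, manual_added, excluded_2):
    """Return (full_text_assessed, included) for a two-phase selection.

    Hypothetical helper used only to re-trace the reported numbers.
    """
    full_text = screened - excluded_1 + manual_added
    included = full_text - excluded_2
    return full_text, included


full_text_assessed, included = prisma_flow(
    records_screened, excluded_phase1, manual_search_additions, excluded_phase2
)
print(full_text_assessed, included)  # 36 6
```

The result agrees with the text: 36 full-text articles were read and 6 studies were included.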



Citation: The Angle Orthodontist 88, 2; 10.2319/071217-468.1
Study Characteristics
All of the selected studies for the qualitative synthesis were published in English. Sample size ranged from 3 to 18 patients and included adults undergoing orthognathic surgery and children undergoing either orthopedic treatment with miniplates for Class III correction or rapid maxillary expansion. To assess treatment or growth changes, all of the studies applied the 3D superimposition method to pre- and posttreatment images. A large field of view was used in all of the studies, and voxel size ranged from 0.25 to 0.8 mm. An abbreviated summary of the descriptive characteristics of the included articles is provided in Table 2.


Risk of Bias Within Studies
Based on the Consensus-Based Standards for the Selection of Health Measurement Instruments checklist, all of the included studies were judged to have a high risk of bias (poor methodological quality) when assessing reliability. Similarly, for measurement error and for validity, five studies scored poor methodological quality and one study on each measurement property obtained fair methodological quality. The critical appraisal details about each of the items and the evaluation criteria are described in Tables 3 to 6.




Results of Individual Studies
For better interpretation, the results were separated according to the superimposition techniques: voxel based, landmark based, and surface based. Pre- and posttreatment images were registered on the anterior cranial base surface in all studies,3,7,12,16,21,25 and in one study they were also superimposed on the left zygomatic arch.12
Voxel-Based Method
Four studies tested this method. A first study carried out by Cevidanes et al.25 assessed interobserver reproducibility in a subset of 10 CBCT scans (before and after treatment) of five patients undergoing orthognathic surgery using three observers. They showed that the 3D color-coded maps were similar and that pre- to postsurgery surface distance measurements differed among the three observers by no more than 0.26 mm (maximal error measured as displacement at the mandibular rami surface). The average inward displacement for all surfaces (mandibular rami, posterior border of the mandibular ramus, and condyles) was smaller than the image spatial resolution of 0.6 mm. The one-sample t-test P values were statistically significant at all surfaces, despite the small displacements that were observed.
A second study12 assessed the voxel-based method on pairs of CBCT scans of 16 adult patients. The mean absolute distances between the two 3D images were calculated in four different regions (cranial base, forehead, and right and left zygomatic arches). The results showed small interobserver variability when the 3D model construction and superimposition procedure was repeated by a second observer. Mean differences between superimpositions performed by the first and second observer were 0.01 mm (95% confidence interval [CI], −0.03 to 0.05) for the forehead region, −0.07 mm (95%CI, −0.13 to −0.003) for the right zygomatic arch, and −0.01 mm (95%CI, −0.09 to 0.07) for the left zygomatic arch. The correlation coefficient between the repeated superimpositions (intraobserver repeatability) ranged from 0.53 to 0.94.
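To make the idea of a mean absolute distance between two registered 3D models concrete, the following is a minimal sketch that assumes each surface is reduced to a list of (x, y, z) points in millimeters and uses a brute-force nearest-point search. The function name and the toy coordinates are hypothetical, and the software used in the reviewed study may compute surface distances differently.

```python
import math


def mean_absolute_distance(surface_a, surface_b):
    """Mean nearest-point distance (mm) from each point of surface_a to surface_b.

    Assumption: both surfaces are plain lists of (x, y, z) tuples that are
    already registered in the same coordinate system.
    """
    total = 0.0
    for p in surface_a:
        # Closest point on the other surface (brute force, fine for toy data).
        total += min(math.dist(p, q) for q in surface_b)
    return total / len(surface_a)


# Hypothetical toy data: the second model is shifted by 0.1 mm along z.
observer_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
observer_2 = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1), (0.0, 1.0, 0.1)]

print(round(mean_absolute_distance(observer_1, observer_2), 3))  # 0.1
```

A nearest-point distance is always non-negative, so it describes the magnitude of the disagreement between two superimpositions but not its direction.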
A third study3 that used a sample of CBCT images of 18 patients also assessed the voxel-based method. A total of 10 patients were used as a reference standard, reorienting the spatial position of the pretreatment CBCT volume and then superimposing on the original image. The other eight patients (four nongrowing and four growing) had pre- and posttreatment superimposed images. The results showed that the surface distance error was less than 0.25 mm for the sample that tested the superimpositions of CBCT images with a 1-year interval for growing patients treated with rapid maxillary expansion. Similarly, the adult sample, which underwent orthognathic surgery, revealed discrepancies in the anterior cranial base between the registered surface models that were less than 0.5 mm for most regions.
A fourth study16 assessed the voxel-based method in growing patients. Three observers were trained for analysis of CBCT images using two images not included in the study. After calibration, each observer examined pre- and posttreatment CBCT scans of three growing patients. The interexaminer range of the measurements across anatomic regions was equal to or less than 0.5 mm.
Landmark-Based Registration
DeCesare et al.7 assessed the six-landmark superimposition method when defining the coordinate system using data from 10 growing patients. The error reported was the absolute value of the difference in distances between the points calculated for the first image and the second image. The results showed high intertest reproducibility and great consistency between trials. The average error seen in the distances between the first image and the second image was 1.24 mm.
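The error definition described above (the absolute difference between corresponding point-to-point distances in the first and second images) can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the landmark names, the coordinates, and both function names are hypothetical, and the six-point correction step itself is not reproduced.

```python
import math
from itertools import combinations


def interlandmark_distances(landmarks):
    """Distance (mm) between every pair of named landmarks, keyed by the pair."""
    return {
        (a, b): math.dist(landmarks[a], landmarks[b])
        for a, b in combinations(sorted(landmarks), 2)
    }


def mean_absolute_distance_error(image_1, image_2):
    """Average absolute difference between matching inter-landmark distances."""
    d1 = interlandmark_distances(image_1)
    d2 = interlandmark_distances(image_2)
    errors = [abs(d1[pair] - d2[pair]) for pair in d1]
    return sum(errors) / len(errors)


# Hypothetical landmark coordinates (mm) located on two images.
first_image = {"landmark_1": (0.0, 0.0, 0.0), "landmark_2": (3.0, 4.0, 0.0)}
second_image = {"landmark_1": (0.0, 0.0, 0.0), "landmark_2": (6.0, 8.0, 0.0)}

print(mean_absolute_distance_error(first_image, second_image))  # 5.0
```

Because only distances between landmarks are compared, this error measure does not depend on how either image is oriented in space.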
Surface-Based Registration
Gkantidis et al.21 tested five surface-based 3D superimposition techniques (three-point registration, anterior cranial base, anterior cranial base + foramen magnum, both zygomatic arches, and one zygomatic arch) on CT scans of eight nongrowing orthodontic patients treated with rapid maxillary expansion. The results showed that all of the techniques differed from each other (P < .005), except for the anterior cranial base and both zygomatic arches superimpositions (P = .43). The anterior cranial base + foramen magnum was the most accurate technique (P = .07). The reproducibility and precision of all the techniques were acceptable because there were no significant differences between the repeated measurements and among examiners on the measured structural changes (anterior cranial base + foramen magnum: examiner 1, 0.11 mm [95%CI, 0.09–0.17]; examiner 2, 0.07 mm [95%CI, 0.04–0.09]; examiner 3, 0.09 mm [95%CI, 0.04–0.14]).
Synthesis of Results
All of the included studies reported adequate reliability of all 3D superimposition methods. Nevertheless, the quality of the studies was consistently poor. Hence it is unknown how the poor quality of evidence influenced the results.
Risk of Bias Across Studies
The main methodological limitations across the studies were related to small sample size, different age groups, treatment type, flaws in the study design, and the lack of a detailed description of statistical analysis.
Additional Analysis
All of the articles used different anatomical regions to assess reliability and measurement error; this made the application of a meta-analysis questionable.
DISCUSSION
Summary of Evidence
In this systematic review, the available evidence concerning the reliability of the 3D superimposition methods when assessing the changes in the craniofacial hard tissues was investigated. Although all of the included studies for all three methods in this review reported acceptable reliability, the quality of evidence was low. Therefore, the reported conclusions cannot be supported with a high level of certainty.
CBCT is currently a well-established diagnostic tool for the 3D assessment of growth and/or treatment changes on craniofacial structures. However, it is important to understand that challenges remain because 3D superimposition is much more complicated than 2D superimposition. The difficulties assessing the reliability of 3D superimpositions are not only a result of registration issues but also a result of the choice of regions to test the reproducibility of the superimposition, with landmark locations on various anatomic surfaces in the three planes of space.26
To be suitable for routine application in medical image processing, a superimposition method should be able to register images precisely and aid understanding of the changes that result from growth or treatment relative to the structures of reference. The image analysis procedures include 3D construction, registration, superimposition, and quantification of changes.
In the voxel-based registration method, all of the steps involved are automated, which may allow image analysis procedures independent of observer errors. The application of this method has been widely described in the literature to assess changes after orthognathic surgery and orthopedic treatment.2,17,27–29 Cevidanes et al.25 introduced this method into dentistry. Their first study used it to assess mandibular anatomy and position before and after maxillary advancement. They applied distance measurement to quantify mandibular rotation and displacement. The results showed that the interobserver errors had a range of 0.26 mm. Similarly, Nada et al.12 used the voxel-based image registration method to test the reliability and measurement error of CBCT superimposition on the anterior cranial base in adult patients who underwent combined surgical orthodontic treatment. The authors reported small differences within 0.5 mm, which were considered to be clinically insignificant. They also mentioned that registration of the superimposed scans on the zygomatic arch could be considered as an alternative to the anterior cranial base when using smaller field-of-view scans in nongrowing patients. However, it is important to be aware that the regions used for quantification of error in this study were close to the region of reference, thus reducing the magnitude of error. It is known that the farther the region of interest is from the superimposition structures, the larger the theoretical error of measurement. Weissheimer et al.3 also tested the voxel-based method, and their results revealed that distances in the anterior cranial base between registered surface scans were <0.5 mm for most regions, indicating reliable superimposition. Nevertheless, the statistical analysis was not reported, and therefore their conclusion should be interpreted with caution.
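The statement that error grows with distance from the superimposition structures can be illustrated geometrically. The sketch below assumes, purely for illustration, that the registration error is a small rigid rotation about a point within the reference region; a point at distance d from that center then moves by the chord length 2 d sin(θ/2). The 0.5 degree rotation and the distances are hypothetical values and are not taken from any of the reviewed studies.

```python
import math


def displacement_from_rotation(distance_mm, rotation_deg):
    """Chord length (mm) moved by a point at distance_mm from the rotation center.

    Assumption: the registration error is a pure rotation of rotation_deg
    degrees about a point within the reference region.
    """
    theta = math.radians(rotation_deg)
    return 2.0 * distance_mm * math.sin(theta / 2.0)


# Hypothetical distances (mm) from the reference region to the region of interest.
for d in (20, 60, 100):
    print(d, round(displacement_from_rotation(d, 0.5), 2))
```

Under this assumption the displacement scales almost linearly with distance, so the same angular misregistration produces an error at 100 mm that is about five times the error at 20 mm.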
When comparing the reliability of the voxel-based and the surface-based methods, Almukhtar et al.30 reported no significant difference between the two methods using pre- and post-CBCT images of orthognathic surgical patients, although voxel-based registration was associated with less variability. The higher variability in the surface-based method could have been a result of the extra 3D model rendering step that this registration requires to generate a 3D surface mesh model, which may have introduced a possible source of error. This was also reported by Kang et al.31 when comparing four different software packages used to produce the 3D surface meshes. They found that all four software programs generated reasonably similar meshing accuracies for clinical use. However, there were statistically significant differences among them at all anatomic regions, revealing an inherent range of error in the CT image-based meshing process and highlighting that precautions should be taken in selecting the appropriate software and/or anatomic regions to avoid potential error in specific clinical applications.
When using the landmark-based method, the main drawback is that it requires landmark registration, which can increase the risk of observer-dependent errors. However, as previous studies have reported, images can offer consistent and reproducible data if protocols for operator training and calibration are followed.32,33 This method uses a reference point for 3D cephalometric analysis with CBCT.34,35 Lagravère et al.6 evaluated the potential errors associated with the superimposition of serial CBCT images. They used reference planes based on cranial base landmarks using a sensitivity analysis. DeCesare et al.7 later optimized this analysis using a six-point correction algorithm. This optimized method added two extra landmarks, foramen ovale right and left, which were shown to decrease the envelope of error when determining the coordinate system and increase the intrapoint reliability when comparing images.
Assessing reproducibility is similarly relevant for a 3D superimposition method to be used in research and clinical settings. Cevidanes et al.16 assessed reproducibility using the voxel-based method for superimposition in growing patients, although in a small sample. They analyzed before and after treatment CBCT scans of only three growing patients who had orthopedic treatment with miniplates to correct a Class III malocclusion. The changes with growth and treatment were measured on the 3D models constructed by three examiners. They reported an interobserver range of measurements across anatomic regions equal to or less than 0.5 mm and concluded that these variations were clinically insignificant and that the technique therefore provided a reproducible 3D assessment of growing patients. Comparable reproducibility results were reported by Nada et al.12 using the voxel-based method and Gkantidis et al.21 using the surface-based method when repeating superimpositions on the anterior cranial base. Although the latter study used CT scans instead of CBCT, the authors claimed that the validity of the proposed superimposition method would not change substantially when applied to CBCT images. Although CBCT images have somewhat lower segmentation accuracy than CT images, anatomical landmarks and models are generated in a reliable and clinically applicable way.36
CONCLUSIONS
- Findings from most of the studies included in the current review suggest that all three methods for 3D superimposition provide an acceptable level of reliability when assessing changes in craniofacial hard tissues.
- However, because of the low methodological quality of the identified evidence, the overall results should be considered cautiously.
- In addition, although the 3D superimposition methods are more convenient for craniofacial assessment than conventional 2D methods, to date no studies have used a gold standard to determine the real accuracy of any of these methods.

Figure 1. Flow diagram of the data search using the PRISMA guidelines.