Article Category: Research Article
Online Publication Date: 01 Oct 2001

A Comparison of 3 Computerized Bolton Tooth-Size Analyses With a Commonly Used Method

DMD,
DMD,
DDS, MSD, and
MS
Page Range: 351 – 357
DOI: 10.1043/0003-3219(2001)071<0351:ACOCBT>2.0.CO;2

Abstract

Four methods of conducting overall and anterior Bolton tooth-size analyses were compared using 22 (11 pretreatment and 11 posttreatment) sets of models. No more than 3 mm of crowding existed in any of the models, and all were in good condition. An analysis employing Vernier calipers was completed 3 times to set a standard. Pearson correlation coefficients revealed a high degree of intra-operator reliability with mean R values of 0.930 and 0.843 for the overall and anterior discrepancies, respectively. The mean Vernier caliper results were compared with each of the following computerized methods: QuickCeph, Hamilton Arch Tooth System (HATS), and OrthoCad. No statistically significant differences from the Vernier caliper results were found for any of the methods using repeated-measures analysis of variance testing and paired t-tests (significance level, P = .05). Clinically significant differences (>1.5 mm) were present for each method. Absolute differences were calculated, and linear regression and R values were determined. The HATS analysis had the highest degree of correlation (R = 0.885 for overall and 0.825 for anterior), followed by OrthoCad (R = 0.715, 0.574), and QuickCeph (R = 0.432, 0.439). Each method also was compared based on the time required to complete each analysis. QuickCeph was the fastest (1.85 minutes), followed by HATS (3.40 minutes), OrthoCad (5.37 minutes), and Vernier calipers (8.06 minutes).

INTRODUCTION

The Bolton tooth-size analysis is commonly used as a diagnostic tool in orthodontics. Sheridan1 reported that 91% of orthodontists polled only use a Bolton analysis when measuring tooth size. Achieving a good functional occlusion with proper overbite and overjet requires that maxillary and mandibular teeth be proportional in size. If an interarch tooth-size discrepancy exists, an ideal occlusion may not be achieved.2 In 1958, Bolton3 studied tooth-size disharmony in relation to treatment of malocclusions. He studied 55 patients with excellent occlusions and produced ratios for the mesiodistal sizes of maxillary and mandibular teeth. These ratios, completed for the anterior 6 teeth and for the 12 teeth from first molar to first molar, gave a definite percentage of mandibular to maxillary tooth size. The formulas derived by Bolton are as follows: overall ratio = sum of the mandibular widths (12 teeth)/sum of the maxillary widths (12 teeth) = 91.3%, and anterior ratio = sum of the mandibular widths (6 teeth)/sum of the maxillary widths (6 teeth) = 77.2%. In 1962, Bolton4 reviewed his original study. He presented several clinical cases to determine if his analysis was a viable diagnostic aid and determined that by employing the analysis, there is rarely a need for a diagnostic setup.
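As an illustration of the two ratios described above, the following minimal Python sketch computes them from a list of mesiodistal widths. The tooth widths, the tooth ordering (first molar to first molar), and the function name bolton_ratio are hypothetical and chosen only for illustration; only the ideal values of 91.3% and 77.2% come from the text.

```python
# Bolton's ideal ratios (mandibular sum / maxillary sum), as quoted in the text.
IDEAL_OVERALL_PCT = 91.3
IDEAL_ANTERIOR_PCT = 77.2


def bolton_ratio(mandibular_widths, maxillary_widths):
    """Return the mandibular-to-maxillary width ratio as a percentage."""
    return 100.0 * sum(mandibular_widths) / sum(maxillary_widths)


# Hypothetical mesiodistal widths in mm, ordered from one first molar to the
# other (12 teeth per arch); positions 3 to 8 are the 6 anterior teeth.
maxillary = [10.0, 7.0, 7.0, 8.0, 7.0, 8.5, 8.5, 7.0, 8.0, 7.0, 7.0, 10.0]
mandibular = [11.0, 7.0, 7.0, 7.0, 6.0, 5.5, 5.5, 6.0, 7.0, 7.0, 7.0, 11.0]

overall_pct = bolton_ratio(mandibular, maxillary)
anterior_pct = bolton_ratio(mandibular[3:9], maxillary[3:9])

print(f"overall ratio:  {overall_pct:.1f}% (ideal {IDEAL_OVERALL_PCT}%)")
print(f"anterior ratio: {anterior_pct:.1f}% (ideal {IDEAL_ANTERIOR_PCT}%)")
```

A ratio above the ideal value suggests relative mandibular excess, and a ratio below it suggests relative maxillary excess.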

Since Bolton's original studies, a number of articles have reported on the incidence of tooth-size discrepancies and the reliability of the analysis. Proffit2 reported that approximately 5% of the population has some degree of disproportion among the sizes of individual teeth. In 1989, Crosby and Alexander5 found that a large percentage of orthodontic patients had mesial-distal tooth-size discrepancies. In 1996, Freeman et al6 published a study of 157 orthodontic patient records and found that 30.6% presented with a significant anterior discrepancy, while 13.5% presented with an overall (first molar to first molar) discrepancy.

Traditionally the Bolton analysis is measured with a Boley gauge (Vernier calipers) or needlepoint dividers. In 1995, Shellhart et al7 evaluated the reliability of the Bolton analysis when performed with these 2 instruments and also looked at the effect of crowding on measurement error. They found that the Boley gauge was slightly more reliable than needlepoint dividers and that clinically significant measurement errors can occur on casts with at least 3 mm of crowding.

Today, many orthodontists are moving toward digitizing orthodontic records and using computers to assist with diagnosis and treatment planning. Proffit2 stated that one advantage of digitizing tooth dimensions for space analysis is that the computer can quickly provide a tooth-size analysis. In 1999, Ho and Freer8 used a computerized version of their Graphical Analysis of Tooth-Width Discrepancy (GATWD) and determined that the use of digital calipers can virtually eliminate measurement transfer and calculation errors.

Previous studies have evaluated Bolton's tooth-size analysis for incidence and reliability as well as how it relates to premolar extractions and different racial groups.6,7,9,10 To date, no study has compared manual and computerized Bolton analyses. The purpose of this study was to compare the accuracy and efficiency of Bolton's tooth-size analysis performed using manual measurements with a Vernier caliper against each of 3 computerized methods: the QuickCeph Image Pro computer program (QuickCeph Systems, Coronado, Calif), the Hamilton Arch Tooth System (HATS) (GAC International, Central Islip, NY), and OrthoCad software (CADENT Inc, Fairview, NJ).

MATERIALS AND METHODS

Twenty-two sets of casts from the files of the Tri-Service Orthodontic Residency Program at Wilford Hall Medical Center were studied. Eleven pretreatment and 11 posttreatment sets of casts were selected using only 2 criteria: (1) no more than 3 mm of crowding was present in any of the arches, and (2) the models were in good condition.

Using a Vernier caliper (OIS® Orthodontics, Aston, PA), the author (JJT) measured tooth sizes and completed a Bolton tooth-size analysis 3 times on each set of casts. Each analysis was timed from first measurement to final computation. To eliminate bias, these initial analyses were completed within a 1-month period with at least 2 weeks between measurements, and the order in which the casts were presented was varied. The data from these measurements were averaged and used as the standard.

The models were then digitized into the QuickCeph Image Pro program (version 6.2) installed on a Power Macintosh 7500/100 computer (system software 7.5.3, system 7.5, update 2.0, Apple Computer Inc., Cupertino, California). A video camera (Cosmicar/Pentax, model CV-255E, Asahi Optical Corporation, Ltd., Tokyo, Japan), mounted on a Kaiser mount (R53, Kaiser Fototechnik GmbH and Company, KG, Buchen, Germany), was calibrated to the manufacturer's standards to produce a one-to-one image for all models. Using the QuickCeph Image Pro, the casts were measured, and the software calculated the Bolton analysis. This procedure was also timed.

A third analysis employed the Hamilton Arch Tooth System (HATS) software. Using digital calipers (PRO-MAX Digital Calipers, Fred V. Fowler Co., Inc., Newton, MA) connected to a computer (Compaq Deskpro EN Series, Intel Pentium II, 350 MHz, 64 MB equipped with Windows NT, version 4.0), the models were measured. The HATS software calculated the Bolton analysis and, again, the entire procedure was timed.

The final analysis was performed with OrthoCad software (version 1.14). The models were shipped to CADENT Inc, where they were scanned to make 3-dimensional images of the casts (Figure 1). These images were sent to the author via the Internet and downloaded on the Compaq computer previously mentioned. The Bolton analysis was performed on the model images, and this procedure was timed. It should be noted here that typically impressions, not models, are sent to OrthoCad. (Editor's Note: OrthoCad uses a destructive scanning process, which destroys the model during scanning. A model must be made either by the doctor or by the company in order for OrthoCad to create the digital image used.) For the purposes of this study, the models were sent instead of impressions of the models to exclude the possibility of deformation, which would have adversely affected the results.

FIGURE 1. An example of an electronic model from OrthoCad

Citation: The Angle Orthodontist 71, 5; 10.1043/0003-3219(2001)071<0351:ACOCBT>2.0.CO;2

For all methods, the Bolton analysis was performed by measuring both the discrepancy between the anterior 6 maxillary and mandibular teeth and the discrepancy for the 12 teeth from first molar to first molar in both arches. For statistical purposes, all Bolton tooth-size discrepancies were expressed as maxillary excesses or deficiencies.
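The text does not give the formula used to convert a ratio into a millimeter discrepancy. One common convention, assumed here purely for illustration, is to compute the maxillary sum that would be ideal for the measured mandibular sum and subtract it from the measured maxillary sum. The function name maxillary_excess_mm and the example sums are hypothetical.

```python
def maxillary_excess_mm(maxillary_sum, mandibular_sum, ideal_pct):
    """Return the maxillary excess (+) or deficiency (-) in mm.

    Assumed convention: the ideal maxillary sum is the measured mandibular
    sum divided by the ideal mandibular/maxillary ratio.
    """
    ideal_maxillary_sum = mandibular_sum / (ideal_pct / 100.0)
    return maxillary_sum - ideal_maxillary_sum


# Hypothetical sums of mesiodistal widths in mm.
overall_mm = maxillary_excess_mm(95.0, 87.0, ideal_pct=91.3)   # 12 teeth per arch
anterior_mm = maxillary_excess_mm(47.0, 37.0, ideal_pct=77.2)  # 6 teeth per arch

print(f"overall discrepancy:  {overall_mm:+.2f} mm")
print(f"anterior discrepancy: {anterior_mm:+.2f} mm")
```

Under this sign convention a negative value is a maxillary deficiency, which is equivalent to a relative mandibular excess.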

RESULTS

To determine examiner reliability, the 3 sets of measurements made with Vernier calipers were compared using absolute differences and Pearson correlation coefficients for the overall and anterior Bolton discrepancies. Tables 1 and 2 show the exact data for all 3 Vernier caliper measurements on each set of casts.

TABLE 1. Overall Bolton Discrepancies for Vernier Caliper Measurements*


Each computerized method was compared with the averaged Vernier caliper results. Tables 3 and 4 show the exact data for all methods on each set of casts. First, the absolute differences of each method were calculated for both the overall Bolton discrepancy and the anterior discrepancy. Linear regression and Pearson correlation coefficients (R values) were then determined for the same data sets. Next, repeated-measures analysis of variance (ANOVA) testing and paired t-testing were performed on the same sets of data. A significance (P) value of .05 was used for each test. Lastly, each method was compared for length of time needed to complete the analysis.
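The comparison steps just described can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the authors' statistical procedure: the discrepancy values are hypothetical, the function names are invented for this example, and the sketch stops at the paired t statistic because converting it to a P value requires a t distribution (for example from a statistics package).

```python
import math
import statistics


def absolute_differences(method, reference):
    """Absolute difference between a method and the reference, case by case."""
    return [abs(m - r) for m, r in zip(method, reference)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for a paired t-test of x against y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Hypothetical Bolton discrepancies in mm for 5 sets of casts.
reference = [1.0, -0.5, 2.0, 0.0, -1.5]  # averaged caliper values
method = [1.4, -0.2, 1.5, 0.6, -1.0]     # one computerized method

diffs = absolute_differences(method, reference)
r = pearson_r(method, reference)
t, df = paired_t_statistic(method, reference)

print("mean absolute difference:", round(statistics.fmean(diffs), 2))
print("Pearson R:", round(r, 3))
print("paired t:", round(t, 3), "with", df, "degrees of freedom")
```

Note that a high R value and a nonsignificant paired t-test answer different questions: the first describes how closely the two methods track each other across casts, the second only whether their mean difference is distinguishable from zero.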

TABLE 3. Overall Bolton Discrepancies for All Methods*


One set of posttreatment models had a unilateral premolar extraction, and the QuickCeph analysis would not calculate the overall Bolton discrepancy. Thus, only 21 sets of models could be compared for the overall analysis using QuickCeph.

Vernier calipers

Comparing the overall discrepancy for the first set of Vernier caliper measurements (V1) to the second set of Vernier caliper measurements (V2) revealed that 72.7% of the calculations were within 0.9 mm of each other. The mean difference was 0.77 mm, and the range was 0.0 mm to 2.4 mm. The Pearson correlation coefficient (R value) was 0.934. For the anterior discrepancy, the differences ranged from 0.0 mm to 2.9 mm with a mean of 0.58 mm. Twenty of 22 measurements (90.9%) were within 1.0 mm, and the R value was 0.805.

Employing the same methods to compare the first and third sets of Vernier caliper measurements yielded similar results. For the overall discrepancy, 68.2% of the measurements fell within 1.0 mm of each other with a range of 0.0 mm to 2.4 mm and a mean of 0.95 mm. The Pearson coefficient was 0.936. For the anterior discrepancy, 90.9% of the measurements were within 0.7 mm, with a range of 0.0 mm to 1.5 mm and a mean of 0.47 mm. The Pearson correlation coefficient was 0.900.

Similarly, the second and third sets of measurements were compared. For the overall discrepancy, 72.7% of the differences were within 1.0 mm, and the differences ranged from 0.0 mm to 2.8 mm. The mean difference was 0.86 mm, and the R value was 0.920. For the anterior discrepancy, 95.5% of the results were within 1.0 mm. The mean difference was 0.5 mm, and the range was 0.0 mm to 2.8 mm. The Pearson coefficient was 0.824.
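The mean R values quoted in the abstract (0.930 overall and 0.843 anterior) are the simple averages of the 3 pairwise coefficients reported above, as this short check shows. The variable names are chosen only for this illustration.

```python
import statistics

# Pairwise Pearson coefficients reported above (V1 vs V2, V1 vs V3, V2 vs V3).
overall_r = [0.934, 0.936, 0.920]
anterior_r = [0.805, 0.900, 0.824]

mean_overall_r = round(statistics.fmean(overall_r), 3)
mean_anterior_r = round(statistics.fmean(anterior_r), 3)

print(mean_overall_r, mean_anterior_r)
```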

QuickCeph vs Vernier calipers

The QuickCeph overall analysis differed from the Vernier calipers by a mean of 1.84 mm. The variations ranged from 0.2 mm to 7.7 mm, with 52.4% within 1.4 mm and 81.0% within 2.5 mm (Figure 2). For the anterior analysis, the mean difference was 1.07 mm, with a range from 0.0 mm to 3.2 mm. Seventeen of 22 (77.3%) were within 1.4 mm, and 20 of 22 (90.9%) were within 2.5 mm (Figure 3).

FIGURE 2. Absolute differences of the 3 methods from the averaged Vernier caliper measurements for overall discrepancy


FIGURE 3. Absolute differences of the 3 methods from the averaged Vernier caliper measurements for anterior discrepancy


The R value for the overall discrepancy was 0.432 and for the anterior discrepancy was 0.439 (Figures 4 and 5).

FIGURE 4. Linear regression lines comparing the results of the 3 computerized methods vs the averaged Vernier caliper results for overall discrepancy


No statistically significant differences were present from either the ANOVA or the paired t-tests. The P values for the ANOVA were .065 for the overall discrepancy and .086 for the anterior discrepancy. For the paired t-test, the P value was .065 for the overall discrepancy and .064 for the anterior discrepancy.

Significant differences were evident for time measurements between the 2 methods. The mean time for the QuickCeph analysis was 1.85 ± 0.27 minutes as compared with 8.06 ± 0.54 minutes for the Vernier calipers.

HATS vs Vernier calipers

For the overall discrepancy, the HATS results differed from the Vernier caliper results by a mean of 0.99 mm, with differences ranging from 0.3 mm to 2.4 mm. Nineteen of 22 (86.4%) were within 1.5 mm of the Vernier caliper method, and 90.9% were within 1.8 mm (Figure 2). The results for the anterior discrepancy differed by 0.1 mm to 1.5 mm, and the mean difference was 0.55 mm. Nineteen of 22 (86.4%) fell within 1.0 mm of the Vernier caliper measurements, and 100% were within 1.5 mm (Figure 3).

Pearson correlation coefficients were 0.885 for the overall discrepancies and 0.825 for the anterior discrepancies (Figures 4 and 5).

ANOVA testing showed no statistically significant differences between the 2 methods. P values of .057 for the overall discrepancy and .437 for the anterior discrepancy were recorded. Paired t-testing again revealed no significant differences, with P values of .115 for the overall and .546 for the anterior.

Significant differences again were present concerning time. The mean time for the HATS analysis was 3.40 ± 0.45 minutes as compared with 8.06 ± 0.54 minutes for the Vernier calipers.

OrthoCad vs Vernier calipers

Comparing the overall discrepancies, OrthoCad differed from the Vernier calipers on average by 1.20 mm. The differences ranged from 0.0 mm to 5.6 mm. Sixteen of 22 (72.7%) fell within 1.4 mm, and 90.9% were within 2.2 mm (Figure 2). For the discrepancy of the anterior 6 teeth, the differences ranged from 0.1 mm to 4.2 mm, with a mean of 1.02 mm. Eighteen of 22 (81.8%) were within 1.5 mm, and 20 of 22 (90.9%) were within 1.9 mm (Figure 3).

Pearson correlation coefficients were calculated to be 0.715 for the overall discrepancies and 0.574 for the anterior discrepancies (Figures 4 and 5).

Repeated-measures ANOVA testing comparing OrthoCad and the Vernier calipers registered P values of .829 for the overall discrepancy and .311 for the anterior discrepancy. Neither was statistically significant (at a significance level of .05). Paired t-tests also found no statistically significant differences between the 2 methods. The P values were .718 for the overall discrepancy and .243 for the anterior discrepancy.

Significant differences were present for time involved. On average, the Vernier caliper method took 8.06 ± 0.54 minutes vs 5.37 ± 0.87 minutes with OrthoCad.
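The time savings relative to the Vernier caliper method follow directly from the mean times reported in this section. The short calculation below uses only those reported means; the variable names are illustrative.

```python
# Mean analysis times in minutes, as reported in the Results.
VERNIER_MINUTES = 8.06
mean_minutes = {"QuickCeph": 1.85, "HATS": 3.40, "OrthoCad": 5.37}

# Minutes saved per analysis compared with the Vernier caliper method.
minutes_saved = {
    name: round(VERNIER_MINUTES - minutes, 2)
    for name, minutes in mean_minutes.items()
}

for name, saved in minutes_saved.items():
    print(f"{name}: {saved:.2f} minutes faster than Vernier calipers")
```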

DISCUSSION

Traditionally, a Boley gauge, Vernier caliper, or needlepoint divider is used to measure teeth and complete a tooth-size analysis. Although it involves much less time than diagnostic setups, manual tooth-size analysis can be time consuming in a busy practice, as well as prone to recording and calculation errors.8 In fact, other than looking at the lateral incisors, many clinicians probably do not routinely measure tooth-size discrepancies. Sheridan1 recently conducted a poll in which only 47% of the respondents routinely used a Bolton analysis. The current study was designed to determine if newer computerized methods were as accurate as, and more time efficient than, manual measurement of Bolton tooth-size discrepancies.

It is noted here that these computerized programs offer more than just Bolton analysis. The HATS program recommends a specific arch form that should be used for each patient. OrthoCad and QuickCeph include a variety of diagnostic tools too numerous to mention. Only the Bolton analyses were studied in this report.

The first step in setting up a “standard” was to determine the reliability of the initial measurements with the Vernier calipers. These values were highly correlated as evidenced by R values of no less than 0.920 for the overall discrepancy and 0.805 for the anterior discrepancy. Perhaps we should also look at the clinical importance of variations in these results. Proffit2 stated that a discrepancy of less than 1.5 mm is rarely significant. Thus, the decision to address tooth-size discrepancies during treatment may rest on this value. Comparing each of the Vernier caliper measurements (V1 vs V2, V1 vs V3, V2 vs V3) for overall discrepancy revealed that 86.4% (19 of 22 in each group) of the cases were within 1.5 mm of each other. The anterior discrepancy had even better reliability. Approximately 91% (20 of 22) to 95.5% (21 of 22) fell within 1.0 mm of each other, and 95.5% were within 1.5 mm. However, these numbers do indicate that some clinically significant variability existed. These findings are similar to the results from Shellhart et al.7 When measuring Bolton discrepancies on crowded dentitions with a Boley gauge and needlepoint dividers, they found that every investigator made at least one error in measurement that was greater than a clinically significant value for tooth-size excess.
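The percentages quoted against the 1.5 mm threshold are simple proportions of cases. A minimal sketch of that tally follows, assuming "within 1.5 mm" means an absolute difference of at most 1.5 mm; the list of differences is hypothetical and was constructed only so that 19 of 22 cases fall within the threshold, and percent_within is an illustrative name.

```python
CLINICAL_THRESHOLD_MM = 1.5  # threshold attributed to Proffit in the text


def percent_within(abs_differences, threshold=CLINICAL_THRESHOLD_MM):
    """Percentage of cases whose absolute difference is at most the threshold."""
    within = sum(1 for d in abs_differences if d <= threshold)
    return 100.0 * within / len(abs_differences)


# Hypothetical absolute differences in mm for 22 sets of casts:
# 19 at or below the threshold and 3 above it.
example_differences = [0.2] * 10 + [0.9] * 9 + [1.8, 2.1, 2.4]

share = percent_within(example_differences)
print(f"{share:.1f}% of cases within {CLINICAL_THRESHOLD_MM} mm")
```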

The HATS analysis had very similar results when compared with the averaged Vernier caliper measurements. For the overall discrepancy, an identical 86.4% (19 of 22) were within 1.5 mm, and the R value was 0.885. When comparing the anterior discrepancy, 86.4% (19 of 22) were within 1.0 mm, and 100% were within 1.5 mm. The R value for the HATS vs the Vernier caliper was 0.825. These results indicate that the HATS system and the Vernier calipers are equally reliable even though there are a few clinically significant errors present with each method. This comes as no surprise since both methods employ handheld calipers. Two differences did exist between these methods. On average, it took less than half the time (4.66 minutes less) to complete the HATS analysis, and there was no chance of computation errors with the use of HATS.

The results from the OrthoCad analyses were slightly less correlated with the Vernier calipers results than were the HATS analyses. The R values for the overall and anterior discrepancies were 0.715 and 0.574 respectively. The data showed that 72.7% (16 of 22) of the results for the overall discrepancy and 81.8% (18 of 22) of the anterior discrepancies were within 1.5 mm. The range of measurements was greater for the OrthoCad than for the HATS or the initial Vernier caliper measurements. Therefore, OrthoCad had slightly less correlated results, and the differences were of a greater magnitude.

A variety of reasons could be responsible for these deviations. First, the operator was less familiar with this system than with the calipers. However, numerous practice sessions were completed until the author was accustomed to the device. The more probable reason is that the investigator found it difficult to pinpoint the exact mesial and distal points to be used for the measurements. At the time of this study, OrthoCad used a “click and drag” method to mark the mesial and distal points, making it difficult to consistently and accurately measure tooth widths. Additionally, the closer the points were on adjacent teeth, the harder the task of distinguishing them. To overcome this, the author had to zoom the image (as recommended by OrthoCad), but this practice increased the amount of time it took to complete the analysis. The average time for the OrthoCad analysis was 2.69 minutes shorter than that for Vernier calipers but 1.97 minutes longer than that for HATS analysis.

QuickCeph analyses were the shortest to complete (6.21 minutes less than the Vernier calipers), but were the least correlated of the 3 methods compared with the Vernier calipers. The R values for the overall and anterior discrepancies were 0.432 and 0.439 respectively. Eleven of 21 (52.4%) measurements for the overall discrepancy and 17 of 22 (77.3%) for the anterior discrepancy fell within the clinically significant threshold of 1.5 mm. The range of differences was the greatest for the QuickCeph technique. There was a tendency for the QuickCeph to measure positive differences or larger discrepancies than the Vernier calipers. Seventeen of 21 (81.0%) for the overall discrepancy and 15 of 22 (68.2%) for the anterior discrepancy measured positive differences. The two main difficulties encountered by the investigator with this method involved identifying the points used to measure the teeth and the loss of 3 dimensions by using video images of the models. Perhaps marking the models before digitizing them would have prevented this variability.

Distinguishing the proper landmarks was difficult with both the OrthoCad and QuickCeph systems. In his review, Houston11 stated that perhaps the greatest source of random errors is difficulty in identifying a particular landmark or imprecision in its definition. This leads to the issue of reproducibility of the computerized analyses. Only 1 analysis was completed for each set of models with each of the computerized methods and, therefore, the reproducibility of these methods cannot be documented. This is an area where further study would be beneficial.

CONCLUSION

In this study, significant differences were present for the time needed to complete the analysis. QuickCeph was the quickest, followed in order by HATS, OrthoCad, and Vernier calipers. No statistically significant differences existed between the methods used to measure tooth-size discrepancies with the Bolton analysis. However, clinically significant differences (>1.5 mm) were evident for all methods. Compared with the Vernier calipers, the HATS program had very similar results, whereas OrthoCad and QuickCeph were less correlated.

This report indicates that there are more time-efficient methods available to perform a Bolton tooth-size analysis. Whether they are more or less accurate than traditional methods is the issue. One can expect that upgrades in the computerized methods will continue to improve until each is at least as reliable as the Vernier caliper method. Each clinician must decide if these alternative methods are acceptable and cost effective.

FIGURE 5. Linear regression lines comparing the results of the 3 computerized methods vs the averaged Vernier caliper results for anterior discrepancy


TABLE 2. Anterior Bolton Discrepancies for Vernier Caliper Measurements*

TABLE 4. Anterior Bolton Discrepancies for All Methods*


Acknowledgments

This study was partially funded by the American Association of Orthodontists Foundation (AAOF).

REFERENCES

Copyright: Edward H. Angle Society of Orthodontists
Contributor Notes

The views expressed in this article are those of the authors and do not reflect the official policy of the Department of Defense or other departments of the United States Government.

Received: 01 Mar 2001
Accepted: 01 Apr 2001