Article Category: Research Article
Online Publication Date: 26 Jan 2015

Does clinical experience affect the reproducibility of cervical vertebrae maturation method?

Page Range: 841 – 847
DOI: 10.2319/080414-544.1

ABSTRACT

Objective: 

To assess interobserver and intraobserver reproducibility of the cervical vertebrae maturation method (CVMM) among three panels of judges with different levels of orthodontic experience (OE).

Materials and Methods: 

Fifty individual lateral cephalograms of good quality with complete visualization of cervical vertebrae 1 to 4 were selected. Thirty clinicians, divided according to their OE into three groups (junior group, JU, OE ≤ 1 year; postgraduate group, PG, 2 ≤ OE ≤ 4 years; specialist group, SP, OE ≥ 7 years), evaluated the cephalograms in two sessions (T1 and T2) 3 weeks apart. Kendall's W and weighted Cohen's kappa (κ) coefficients were calculated to assess interobserver and intraobserver agreement, respectively. The level of significance was set at P < .05. For both the interobserver and the intraobserver datasets, the percentage of perfect agreement (PPA) and the number of stages apart for each disagreement were calculated.

Results: 

Kendall's W at T1 was SP = 0.61, PG = 0.70, and JU = 0.87; at T2 it was SP = 0.78, PG = 0.85, and JU = 0.86. The percentage of total interobserver perfect agreement (Inter-PPA) was 42.3% at T1 and 46.3% at T2. The JU group had the highest Cohen's κ coefficient at 0.78, while the PG and SP had coefficients of 0.64 each. The percentage of total intraobserver perfect agreement (Intra-PPA) was 54.7%.

Conclusions: 

The reproducibility of the method was not improved by the level of orthodontic experience. The group with the lowest level of orthodontic experience had the best performance.

INTRODUCTION

Craniofacial growth may play an important role in the success of orthodontic treatment.1,2 The reliable prediction of patient mandibular and maxillary development could help in understanding the best therapeutic decision regarding treatment timing, appliance choice, and the possible need for surgery.3 As most orthodontic patients are growing individuals, orthodontists have to consider their craniofacial growth path for successful treatment planning.4 However, individuals with the same chronologic age may have different growth patterns regarding onset, duration, speed, direction, and amount of residual growth, as shown in several studies.5–9

Many indicators have been suggested to evaluate the timing of mandibular growth peak and skeletal maturation,10–12 and the most used methods for this assessment are based on radiographic analysis. Specifically, numerous authors investigated the relationship between mandibular growth and skeletal maturation estimated by means of hand-wrist bone analysis (HWBA) or the cervical vertebrae maturation method (CVMM).3,13–19 Since the CVMM is advocated as a timing tool for orthopedic treatment, its validity and reproducibility should be assessed by procedures without any methodologic shortcomings.20 Different investigations have tested the validity of the CVMM by studying its association with skeletal maturity and with the mandibular growth spurt.21–31 Several studies found a good correlation between CVMM and HWBA, suggesting that the CVMM could be used instead of the HWBA to reduce the radiation dose.21–26 However, a recent systematic review of the CVMM was not able to establish the validity of this method due to the lack of moderate/high quality papers on this topic.32 Several studies criticized the validity of the CVMM, showing that it is unable to predict the start of the peak in mandibular growth.27–29 Furthermore, it has been shown that the effective radiation dose for a lateral cephalogram without a thyroid shield is 1.5-fold higher than the effective dose for a lateral cephalogram with a thyroid shield plus a hand-wrist radiograph.33

On the other hand, the reproducibility of the CVMM, which can affect its clinical usefulness, is strongly debated. Very high levels of interobserver and intraobserver reproducibility (over 90%) were reported in some studies.18,22,24 However, these studies have some limitations, such as the analysis of traced vertebrae on the lateral cephalograms instead of actual radiographs, the presence of the authors among the group of judges performing the analysis, the use of small sample sizes, and the use of an improper statistical method.20

Some authors have investigated the reproducibility while overcoming these methodologic weaknesses, but without exploring the influence of clinical experience on the reproducibility of the CVMM.20,30,31,34 Indeed, there is little information about the impact of judges' clinical experience on the CVMM, even though it should not be underestimated.20,35 Therefore, the aim of this study was to evaluate the interobserver and intraobserver reproducibility of the CVMM among three judge panels with different levels of orthodontic experience (OE). The null hypothesis was that orthodontic clinical experience did not have any influence on the reproducibility of the CVMM.

MATERIALS AND METHODS

The study was approved by the Local Ethical Committee of the University of Naples Federico II. Fifty good quality individual lateral cephalograms of patients attending the School of Orthodontics of the University of Naples Federico II, with complete visualization of cervical vertebrae 1 through 4, were selected for our sample. The cephalograms were randomly chosen from the school's electronic database, by means of the random function of a scientific calculator (EL-506VB, Sharp Corp, Osaka, Japan) and were equally divided by sex (mean age 12.4 years ± 3.2; 25 female and 25 male). Afterwards, the original lateral cephalograms were searched and scanned at 600 dpi (Perfection V750 Pro, Seiko Epson Corp, Suwa, Japan) for presentation as high-resolution images in TIFF format to maintain the radiographic quality.

To avoid any additional information that might influence the observer during the evaluation of the CVMM (such as the stage of dentition), the lateral cephalograms were cropped to include only cervical vertebrae C1 to C4.

The judges were divided into three groups according to the level of clinical experience: the junior group (JU) formed by 10 recent graduates in dentistry, with less than 1 year of orthodontic experience (mean age 25.1 years ± 1.2; OE 0.4 years ± 0.5; 6 female and 4 male); the postgraduate group (PG), comprising 10 postgraduate students in orthodontics with clinical experience ranging between 2 and 4 years (mean age 29.7 years ± 1.4; OE 2.6 years ± 0.5; 6 female and 4 male); and the specialist group (SP) including 10 specialists in orthodontics with more than 7 years of orthodontic experience (mean age 41.8 years ± 10.3; OE 19.1 years ± 10.4; 3 female and 7 male). None of the clinicians recruited for the assessment participated in the study design. All of the judges belonged to the School of Orthodontics of the University of Naples Federico II as tutor/professor (SP), postgraduate student (PG), or voluntary frequenter (JU).

Each observer was invited to perform two sessions of evaluation of cervical stage on the lateral cephalograms according to the method suggested by Baccetti et al.3 The two sessions were separated by a 3-week interval (T1 = initial; T2 = 3 weeks). The cephalograms were presented in a high-resolution file, randomly ordered for the two sessions. Before the first session, all participants attended a lecture on the CVMM by one investigator (Dr D'Antò). Moreover, at the start of each session the observers also received a copy of the paper by Baccetti et al.,3 and a schematic representation of the CVMM was shown beside each cephalogram.

Statistical Analysis

The statistical analysis was conducted to calculate Kendall's W coefficient for the interobserver agreement and weighted Cohen's kappa (κ) coefficient for the intraobserver agreement. Kendall's W was calculated for the whole sample (Total) and for each group independently (JU, PG, SP) by means of the Statistical Package for the Social Sciences (SPSS, IBM, Armonk, NY) for the two sessions (T1 and T2). The weighted Cohen's κ was calculated by means of Statistical Analysis Software ver. 9.2 (SAS Inc, Cary, NC), with linear weights, comparing the two sessions for each observer and for each group (JU, PG, SP). The level of significance was set at P < .05. Kendall's W coefficient of concordance and the weighted κ statistic range between 0 for no agreement and 1 for perfect agreement, with five intermediate levels: slight agreement (0.01–0.20), fair agreement (0.21–0.40), moderate agreement (0.41–0.60), substantial agreement (0.61–0.80), and almost perfect agreement (0.81–0.99).36 Furthermore, for both interobserver and intraobserver datasets, the percentage of perfect agreement (PPA) and the number of stages apart for each disagreement were calculated.
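
As an illustration of the two agreement statistics named above, the following minimal Python sketch computes Kendall's W (with a correction for tied ranks, which arise because many cephalograms share the same cervical stage) and a linearly weighted Cohen's κ, and maps a coefficient onto the agreement bands listed in this section. The function names, the stage coding 1 to 6, and the tie correction are assumptions made for this sketch; the study itself used SPSS and SAS, whose exact options are not described here.

```python
from collections import Counter

def average_ranks(scores):
    """Rank scores 1..n, giving tied scores the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def kendalls_w(ratings):
    """Kendall's W for m judges (rows) staging n cephalograms (columns)."""
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied scores per judge.
    ties = sum(c ** 3 - c for row in ratings for c in Counter(row).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)

def linear_weighted_kappa(first, second, n_stages=6):
    """Cohen's kappa with linear weights; stages coded 1..n_stages (assumed)."""
    n = len(first)
    table = [[0] * n_stages for _ in range(n_stages)]
    for a, b in zip(first, second):
        table[a - 1][b - 1] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(n_stages)) for j in range(n_stages)]
    observed = expected = 0.0
    for i in range(n_stages):
        for j in range(n_stages):
            weight = 1 - abs(i - j) / (n_stages - 1)
            observed += weight * table[i][j] / n
            expected += weight * row[i] * col[j] / n ** 2
    return (observed - expected) / (1 - expected)

def agreement_band(value):
    """Map a coefficient to the agreement levels quoted in the text."""
    if value <= 0:
        return "no agreement"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect" if value < 1 else "perfect"

# Hypothetical stagings by three judges of six cephalograms.
judges = [[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 6, 5], [2, 1, 3, 4, 5, 6]]
w = kendalls_w(judges)
print(round(w, 2), agreement_band(w))
```

The weights used here are 1 − |i − j| / (k − 1) for k stages, so a disagreement of one stage is penalized less than a disagreement of several stages.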

For the interobserver dataset, the percentage of perfect agreement (Inter-PPA) was based on the number of identical stagings between each pair of judges for each cephalogram, computed independently for T1 and T2. It was evaluated for all observers (Total) and for each group (JU, PG, SP).

For the intraobserver dataset, the percentage of perfect agreement (Intra-PPA) was based on the number of cephalograms that each judge staged identically at T1 and T2. It was also evaluated for all judges (Total) and for each group (JU, PG, SP).
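
The two PPA definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data layout (a dictionary from judge to a list of stages), the function names, and the tiny example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def inter_observer_pairs(session):
    """Yield one (stage, stage) pair per couple of judges per cephalogram."""
    for judge_a, judge_b in combinations(session, 2):
        yield from zip(session[judge_a], session[judge_b])

def intra_observer_pairs(t1, t2):
    """Yield one (T1 stage, T2 stage) pair per judge per cephalogram."""
    for judge in t1:
        yield from zip(t1[judge], t2[judge])

def stage_differences(pairs):
    """Tally how many stages apart each pair of stagings is (0 = perfect)."""
    return Counter(abs(a - b) for a, b in pairs)

def ppa(tally):
    """Percentage of perfect agreement from a stage-difference tally."""
    return 100 * tally[0] / sum(tally.values())

# Hypothetical data: three judges, four cephalograms, two sessions.
t1 = {"A": [1, 2, 3, 4], "B": [1, 2, 3, 5], "C": [1, 3, 3, 4]}
t2 = {"A": [1, 2, 4, 4], "B": [1, 2, 3, 5], "C": [2, 3, 3, 6]}

inter_t1 = stage_differences(inter_observer_pairs(t1))
intra = stage_differences(intra_observer_pairs(t1, t2))
print("Inter-PPA at T1:", round(ppa(inter_t1), 1), dict(inter_t1))
print("Intra-PPA:", round(ppa(intra), 1), dict(intra))
```

Keeping the full tally of stage differences, rather than only the count of exact matches, also gives the number of disagreements that are one, two, or more stages apart.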

Finally, a Top 10 group was formed from the 10 observers with the highest weighted Cohen's κ; all of the variables were also assessed for this group.

RESULTS

The 30 participants performed a total of 3000 evaluations over the two sessions. The Kendall's W coefficient for each group ranged from 0.61 to 0.87. The interobserver agreement was the highest for the JU group at both time points and showed an almost perfect agreement (T1 W = 0.87; T2 W = 0.86). On the other hand, the SP group achieved the lowest Kendall's W values, presenting a substantial agreement (T1 W = 0.61; T2 W = 0.78). The total interobserver agreement for the 30 participants varied from W = 0.70 at T1 (substantial agreement) to W = 0.81 at T2 (almost perfect agreement) (Table 1).

Table 1. Kendall's W Coefficient for Interobserver Agreement at Two Time Intervals (T1 and T2)

A total of 21,750 comparisons between the evaluations of each pair of judges were analyzed for each session to assess the Inter-PPA. The Inter-PPA was 42.3% (N = 9204/21,750) at T1 and 46.3% (N = 10,082/21,750) at T2. The percentage of one stage apart disagreement was 40.1% (N = 8721/21,750) at T1 and 41.4% (N = 9011/21,750) at T2 (Table 2).
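
The totals quoted above follow from simple counting, which the short Python check below reproduces. The variable names are illustrative; the counts are the ones reported in this section.

```python
from math import comb

JUDGES, CEPHALOGRAMS, SESSIONS = 30, 50, 2

# Every judge stages every cephalogram once per session.
evaluations = JUDGES * CEPHALOGRAMS * SESSIONS

# Each unordered pair of judges is compared on every cephalogram.
judge_pairs = comb(JUDGES, 2)
comparisons_per_session = judge_pairs * CEPHALOGRAMS

# Counts as reported in the text for each session.
reported_counts = {
    "T1 perfect agreement": 9204,
    "T2 perfect agreement": 10082,
    "T1 one stage apart": 8721,
    "T2 one stage apart": 9011,
}
percentages = {
    label: 100 * count / comparisons_per_session
    for label, count in reported_counts.items()
}

print(evaluations, judge_pairs, comparisons_per_session)
for label, value in percentages.items():
    print(f"{label}: {value:.2f}%")
```

The counts give 3000 evaluations, 435 pairs of judges, and 21,750 comparisons per session, in line with the totals reported above; the T2 perfect-agreement ratio works out to roughly 46.35%, which the article reports as 46.3%.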

Table 2. Percentage of Interobserver Perfect Agreement (Inter-PPA) With Cervical Stage Differences

The intraobserver agreement for each observer ranged from κ = 0.24 (fair agreement) to κ = 0.81 (almost perfect agreement). Among the 30 participants, three showed a fair agreement, seven a moderate agreement, 18 a substantial agreement, and two an almost perfect agreement (Table 3).

Table 3. Weighted Cohen's κ Coefficient for Intraobserver Agreement and Percentage of Agreement by Observers

The group showing the best result was the JU, where the κ coefficient of the individual observers ranged from moderate agreement to almost perfect agreement. Moreover, the weighted Cohen's κ coefficient for the whole group was the highest (κ = 0.78, substantial agreement). In the PG results there was one fair agreement but also one almost perfect agreement, while the SP had the worst performance. The latter two groups, PG and SP, showed a similar Cohen's κ coefficient for the whole group (κ = 0.64, substantial agreement; Tables 3 and 4). For each observer, the evaluations of the 50 cephalograms at T1 and T2 were compared to evaluate the Intra-PPA. The total Intra-PPA was 54.7% (N = 821/1500), while the percentage of intraobserver one stage apart disagreement was 34.3% (N = 515/1500). The JU showed the highest Intra-PPA of 57.8% (N = 289/500), and 34% (N = 170/500) of their evaluations differed by just one cervical stage (Table 5).

Table 4. Weighted Cohen's κ Coefficient for Intraobserver Agreement by Orthodontic Experience Groups
Table 5. Percentage of Intraobserver Perfect Agreement (Intra-PPA) With Cervical Stage Differences

The Top 10 group was formed by six observers of the JU, only one of the PG, and three of the SP (Table 3). This group achieved the best results in the Inter-PPA, Intra-PPA, and Kendall's W (Tables 1, 2, and 5).

DISCUSSION

The aim of this study was to analyze the influence of the OE on the reproducibility of the CVMM using three judge panels with different levels of clinical experience. The SP was the group with the highest clinical experience (mean OE = 19.1 years), but this group achieved the lowest values of W, κ, Inter-PPA, and Intra-PPA. On the contrary, the group with less than 1 year of clinical experience (JU) showed the highest values for all of the parameters investigated. These results suggest that the OE does not improve the reproducibility of the CVMM, and they could be explained by the different familiarity with the CVMM. Indeed, during their last year of undergraduate courses at the University of Naples Federico II, the JU were educated on the CVMM and were extensively trained in this method. On the other hand, none of the members of the SP group used the CVMM in their daily clinical practice. This suggests that more in-depth training may be critical to use the CVMM correctly.

The interobserver agreement for the overall group of observers was 0.81 at T2, higher than that reported by Gabriel et al.20 (0.74). The main differences between the two works are the number of participants, the sample size, and the division of judge panels according to the level of clinical experience. Although the results are similar for the weighted Cohen's κ and for the Intra-PPA for all observers, in this study the JU and the PG groups reached a level of intraobserver and interobserver agreement higher than that reported by Gabriel et al.20

These findings suggest that the level of practice and knowledge of the CVMM may be an important factor for its reproducibility, and that simply providing a handout is likely not sufficient to obtain a good level of knowledge and familiarity with the CVMM.

In Zhao et al.,31 the CVMM was explained to the observers by means of training sessions, and the reported W and κ coefficients were similar to those of our best performing group (JU). Hence, to correctly use the CVMM there might be a need for multiple training sessions to understand how to assess the cervical stage and to acquire a consistent method of evaluation. Interestingly, the SP and PG groups showed an increase in Kendall's W between T1 and T2, probably due to a training effect.

In a very recent study,34 the CVMM was tested for accuracy and reproducibility, comparing the cephalometric evaluations of the concavities and shape of the cervical vertebrae with the visual evaluation of CVMM. The authors found the visual method accurate and reproducible; moreover, they stated that in their group the reproducibility was high, independently of the level of the experience of the observers, which is similar to our results.

In the current investigation, although the CVMM showed good reproducibility according to the agreement coefficients, the Inter-PPA and Intra-PPA were low. In fact, in the JU group, the Intra-PPA was 57.8% despite a high κ coefficient of 0.78. This means that in about 4 of 10 cases a clinician could have evaluated a cervical stage differently and might have changed the treatment plan. A similar situation was observed for the other two groups as well. Moreover, the Inter-PPA was similar to that found in the studies by Gabriel et al.20 and Zhao et al.,31 and it exceeded 40%, with a peak of 51.9% for JU at T1. This means that a clinician has roughly a 50% chance of disagreeing with the cervical staging assessed by another clinician. It has to be taken into account that the value of the weighted Cohen's κ coefficient increases with the number of categories of the assessed method; therefore, this can be a possible explanation of the difference between the high κ coefficient and the low perfect agreement.37 Finally, it is interesting to note that in almost all three judge panels, on average, about 42% of the cervical stagings differed by only one stage in both the interobserver and intraobserver analyses. This might be another factor that affects the κ coefficient; in fact, having most disagreements just one category apart increases the weighted Cohen's κ. Therefore, even if the weighted Cohen's κ and the Kendall's W coefficients were sufficiently high, the level of perfect agreement in the intraobserver and interobserver analyses does not seem adequate to support the reproducibility of the CVMM as a method to evaluate skeletal maturation. Even in the Top 10 group, whose results were better than those of the other groups, the Inter-PPA and Intra-PPA were still low enough to affect the clinical decision, and thus the usability of the CVMM.
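
The gap between a high weighted κ and a low perfect agreement can be reproduced with a small synthetic example. The Python sketch below uses invented data (60 hypothetical cases spread evenly over six stages, half re-staged identically and half one stage off), not the study's ratings; the helper names are likewise illustrative.

```python
def kappa(first, second, n_stages, weight):
    """Cohen's kappa for stages coded 1..n_stages with a given agreement weight."""
    n = len(first)
    table = [[0] * n_stages for _ in range(n_stages)]
    for a, b in zip(first, second):
        table[a - 1][b - 1] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(n_stages)) for j in range(n_stages)]
    observed = expected = 0.0
    for i in range(n_stages):
        for j in range(n_stages):
            w = weight(i, j)
            observed += w * table[i][j] / n
            expected += w * row[i] * col[j] / n ** 2
    return (observed - expected) / (1 - expected)

STAGES = 6

def exact(i, j):
    """Unweighted kappa: only identical stages count as agreement."""
    return 1.0 if i == j else 0.0

def linear(i, j):
    """Linear weights: partial credit that shrinks with stage distance."""
    return 1 - abs(i - j) / (STAGES - 1)

# Invented sessions: 10 cases per stage; 5 repeated exactly, 5 one stage off.
t1, t2 = [], []
for stage in range(1, STAGES + 1):
    neighbour = stage + 1 if stage < STAGES else stage - 1
    t1 += [stage] * 10
    t2 += [stage] * 5 + [neighbour] * 5

perfect = 100 * sum(a == b for a, b in zip(t1, t2)) / len(t1)
unweighted = kappa(t1, t2, STAGES, exact)
weighted = kappa(t1, t2, STAGES, linear)
print(f"PPA {perfect:.1f}%  unweighted kappa {unweighted:.2f}  "
      f"linear weighted kappa {weighted:.2f}")
```

With these invented data, perfect agreement is 50%, the unweighted κ is 0.40, and the linearly weighted κ is about 0.74, because every disagreement is only one stage apart and still earns a weight of 0.8.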

Having a powerful means to predict the residual potential growth in young patients, without any added biological cost, is clinically helpful, and the CVMM could assume an important role in orthodontics, especially during clinical diagnoses and treatment planning decisions.3 However, the level of reproducibility of this method may affect the clinical decision in the orthopedic orthodontic treatment.

One limitation of this work is that it cannot provide information on the validity of the CVMM due to the lack of a longitudinal sample analysis. Moreover, without a gold standard observation, it was not possible to assess the accuracy of the CVMM.

CONCLUSIONS

  • The main finding of this research was the lack of influence of orthodontic clinical experience on the interobserver and intraobserver reproducibility of the CVMM. The group with the lowest level of OE had the best performance. Hence, a high level of orthodontic experience does not increase the reproducibility of the CVMM.

  • The Inter-PPA (36.1%–56.1%) and the Intra-PPA (52.6%–66.0%) of the CVMM were too low to suggest the exclusive use of the CVMM in the assessment of skeletal growth.

REFERENCES

  • 1. Petrovic A. Auxologic categorization and chronobiologic specification for the choice of appropriate orthodontic treatment. Am J Orthod Dentofacial Orthop. 1994;105:192–205.

  • 2. Baccetti T, Franchi L, Toth LR, McNamara JA Jr. Treatment timing for Twin-block therapy. Am J Orthod Dentofacial Orthop. 2000;118:159–170.

  • 3. Baccetti T, Franchi L, McNamara JA Jr. The cervical vertebral maturation method for the assessment of optimal treatment timing in dentofacial orthopedics. Semin Orthod. 2005;11:119–129.

  • 4. Verma D, Peltomäki T, Jäger A. Reliability of growth prediction with hand-wrist radiographs. Eur J Orthod. 2009;31:438–442.

  • 5. Björk A. The significance of growth changes in facial pattern and their relationship to changes in occlusion. Dent Rec (London). 1951;71:197–205.

  • 6. Björk A. Variation in the growth pattern of the human mandible: longitudinal radiographic study by the implant method. J Dent Res. 1963;42:400–411.

  • 7. Bambha JK, Van Natta P. Longitudinal study of facial growth in relation to skeletal maturation during adolescence. Am J Orthod. 1963;49:481–492.

  • 8. Bishara SE, Peterson L, Bishara EC. Changes in facial dimensions and relationships between the ages of 5 and 25 years. Am J Orthod. 1984;85:238–252.

  • 9. Bishara SE, Jakobsen JR. Longitudinal changes in three facial types. Am J Orthod. 1985;88:466–502.

  • 10. Lewis AB, Garn SM. The relationship between tooth formation and other maturation factors. Angle Orthod. 1960;30:70–77.

  • 11. Tanner JM. Growth and Adolescence. 2nd ed. Oxford, UK: Blackwell Scientific Publications; 1962.

  • 12. Mitani H, Sato K. Comparison of mandibular growth with other variables during puberty. Am J Orthod Dentofacial Orthop. 1992;62:217–222.

  • 13. Chapman SM. Ossification of the adductor sesamoid and the adolescent growth spurt. Angle Orthod. 1972;42:236–245.

  • 14. Lamparski DG. Skeletal Age Assessment Utilizing Cervical Vertebrae [dissertation]. Pittsburgh: University of Pittsburgh; 1972.

  • 15. Grave KC, Brown T. Skeletal ossification and the adolescent growth spurt. Am J Orthod. 1976;69:611–619.

  • 16. Houston WJ, Miller JC, Tanner JM. Prediction of the timing of the adolescent growth spurt from ossification events in hand-wrist films. Br J Orthod. 1979;6:145–152.

  • 17. Fishman LS. Radiographic evaluation of skeletal maturation. A clinically oriented method based on hand-wrist films. Angle Orthod. 1982;52:88–112.

  • 18. Hassel B, Farman A. Skeletal maturation evaluation using cervical vertebrae. Am J Orthod Dentofacial Orthop. 1995;107:58–66.

  • 19. Franchi L, Baccetti T, McNamara JA Jr. Mandibular growth as related to cervical vertebral maturation and body height. Am J Orthod Dentofacial Orthop. 2000;118:335–340.

  • 20. Gabriel DB, Southard KA, Qian F, et al. Cervical vertebrae maturation method: poor reproducibility. Am J Orthod Dentofacial Orthop. 2009;136:478.e1–7.

  • 21. Gandini P, Mancini M, Andreani F. A comparison of handwrist bone and cervical vertebral analysis in measuring skeletal maturation. Angle Orthod. 2006;76:984–989.

  • 22. Flores-Mir C, Burgess CA, Champney M, Jensen RJ, Pitcher MR, Major PW. Correlation of skeletal maturation stages determined by cervical vertebrae and hand-wrist evaluations. Angle Orthod. 2006;76:1–5.

  • 23. San Roman P, Palma JC, Oteo MD, Nevado E. Skeletal maturation determined by cervical vertebrae development. Eur J Orthod. 2002;24:303–311.

  • 24. Uysal T, Ramoglu SI, Basciftci FA, Sari Z. Chronologic age and skeletal maturation of the cervical vertebrae and hand-wrist: is there a relationship? Am J Orthod Dentofacial Orthop. 2006;130:622–628.

  • 25. Stiehl J, Müller B, Dibbets J. The development of the cervical vertebrae as an indicator of skeletal maturity: comparison with the classic method of hand-wrist radiograph. J Orofac Orthop. 2009;70:327–335.

  • 26. Wong RW, Alkhal HA, Rabie BM. Use of cervical vertebral maturation to determine skeletal age. Am J Orthod Dentofacial Orthop. 2009;136:484.e1–6.

  • 27. Ball G, Woodside D, Tompson B, Hunter WS, Posluns J. Relationship between cervical vertebral maturation and mandibular growth. Am J Orthod Dentofacial Orthop. 2011;139:e455–461.

  • 28. Beit P, Peltomäki T, Schätzle M, Signorelli L, Patcas R. Evaluating the agreement of skeletal age assessment based on hand-wrist and cervical vertebrae radiography. Am J Orthod Dentofacial Orthop. 2013;144:838–847.

  • 29. Mellion ZJ, Behrents RG, Johnston LE Jr. The pattern of facial skeletal growth and its relationship to various common indexes of maturation. Am J Orthod Dentofacial Orthop. 2013;143:845–854.

  • 30. Nestman TS, Marshall SD, Qian F, Holton N, Franciscus RG, Southard TE. Cervical vertebrae maturation method morphologic criteria: poor reproducibility. Am J Orthod Dentofacial Orthop. 2011;140:182–188.

  • 31. Zhao XG, Lin J, Jiang JH, Wang Q, Ng SH. Validity and reliability of a method for assessment of cervical vertebral maturation. Angle Orthod. 2012;82:229–234.

  • 32. Santiago RC, de Miranda Costa LF, Vitral RW, Fraga MR, Bolognese AM, Maia LC. Cervical vertebral maturation as a biologic indicator of skeletal maturity. Angle Orthod. 2012;82:1123–1131.

  • 33. Patcas R, Signorelli L, Peltomäki T, Schätzle M. Is the use of the cervical vertebrae maturation method justified to determine skeletal age? A comparison of radiation dose of two strategies for skeletal age estimation. Eur J Orthod. 2013;35:604–609.

  • 34. Perinetti G, Caprioglio A, Contardo L. Visual assessment of the cervical vertebral maturation stages: a study of diagnostic accuracy and repeatability. Angle Orthod. 2014;84:951–956.

  • 35. Baccetti T, Franchi L, McNamara JA Jr. Reproducibility of the CVM method: a reply. Am J Orthod Dentofacial Orthop. 2010;137:446–447.

  • 36. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37:360–363.

  • 37. Brenner H, Kliebsch U. Dependence of weighted kappa coefficients on the number of categories. Epidemiology. 1996;7:199–202.

Copyright: © 2015 by The EH Angle Education and Research Foundation, Inc.

Contributor Notes

Corresponding author: Dr Vincenzo D'Antò, Department of Neurosciences, Reproductive Sciences and Oral Sciences, University of Naples Federico II, Via Pansini 5, 80131 Naples, Italy (e-mail: vincenzo.danto@unina.it)
Received: 01 Aug 2014
Accepted: 01 Oct 2014