Article Category: Research Article
Online Publication Date: 01 Aug 2002

Reproducibility of Characteristics Assessing the Occlusion of Young Adults

L Odont,
M Sc,
L Odont, D Odont,
L Odont,
L Odont, D Odont, D Soc Sci, and
L Odont, D Odont
Page Range: 310 – 315
DOI: 10.1043/0003-3219(2002)072<0310:ROCATO>2.0.CO;2

Abstract

The aim of the present investigation was to analyze the reproducibility in the assessment of six morphological and three functional characteristics included in a new method evaluating the occlusion in young adults. These characteristics comprised coincidence of midlines, overjet, overbite, canine relationship, crossbite, scissors bite, recurrent deviation on opening, guided lateral excursions, and discrepancy between the centric relation and the intercuspal position. The study was conducted in three stages: (1) five observers assessed the occlusions of five volunteers, (2) seven observers assessed nine volunteers, and (3) five observers assessed nine volunteers. Two calibrated orthodontists were used as references. For numerical variables, the nonparametric method for repeated measurements (Friedman's test) was used to test the significance of differences, while the proportion of agreement was calculated for categorical assessments. The results were analyzed using two precision levels: within a measurement unit/the same category and an acceptable/nonacceptable dichotomy. The magnitude of systematic differences was small and of minor clinical importance except in measurements of recurrent deviation on opening. The proportional agreement for acceptance was good in the assessment of overjet, coincidence of midlines, crossbite, scissors bite, open bite, and discrepancy between the centric relation and the intercuspal position. Moderate agreement was achieved in the assessment of overbite, canine relationship, recurrent deviation on opening, and guided lateral excursions. Among the nonacceptable cases, the agreement ranged from poor to good. The results indicated that noncalibrated observers assess categorical characteristics inconsistently.

INTRODUCTION

Occlusal classifications are descriptive tools used by orthodontists and craniofacial biologists for clinical and research purposes. The usefulness of these classifications has, however, been questioned, mainly because they are found to give inconsistent results.1–3

The reproducibility of classifications has been tested both in clinical settings4–8 and using patient records, such as facial and dental photographs, radiographs, or dental casts.2,9–13 In some studies, clinical data have been combined with data obtained from study models.14,15 Examiners have variously comprised orthodontists,2,3,11–14 orthodontists and other specialists,10 TMD specialists and auxiliary personnel,7,8,16,17 or general practitioners.4,9 In general, the results have shown low consistency in assessments of the tested characteristics,2,4,5,7,11–14,17 but there are findings indicating that specialists can reach an acceptable level of agreement in the assessment of morphological characteristics.3,10 Of functional assessments, on the other hand, only maximal mouth opening has frequently shown high reproducibility.5,6,8,16–20 While some investigators have reported that training and calibration of the examiners result in a high level of agreement,3,7,8,16,17,21 others are sceptical and suggest that these measures will have only a minor impact on reproducibility.10,14

In Finland, free dental care, including orthodontics, is provided on a population basis up to 18 years of age. The health care system is showing increasing interest in the effectiveness, quality, and efficiency of orthodontic treatment, but there are no satisfactory tools that could be applied in occlusal evaluations. Our research group has been developing a method that could be used to assess the occlusions of young adults when studying the targeting and outcome of orthodontic care. A group of specialists in orthodontics and stomatognathic physiology has selected a set of morphological and functional characteristics that would meet the requirements of the health care system and orthodontic professionals in Finland.22 The aim of the present study was to analyze the reproducibility of the assessment of the selected characteristics.

MATERIALS AND METHODS

The investigation was conducted in three stages. In the first stage, five orthodontists examined five orthodontically treated volunteers. In the second stage, seven observers (three orthodontists, three orthodontically experienced general practitioners, and one inexperienced general practitioner) examined nine orthodontically treated volunteers. In the third stage, five observers (three orthodontists, one experienced general practitioner, and one inexperienced general practitioner) examined a group of nine volunteers including both orthodontically treated and untreated individuals. The examinations were carried out during routine orthodontic follow-up visits or annual dental examinations. In all stages, the volunteers were rated in a random sequence, and informed consent was obtained from all of them.

The reproducibility of the assessment of six morphological and three functional characteristics was evaluated. These characteristics were selected using a modified Delphi process. For each characteristic, a group of specialists in orthodontics and stomatognathic physiology had defined a demarcation line for an acceptable–nonacceptable dichotomy. Overbite, canine relationship, crossbite, scissors bite, and guided lateral excursions were assessed categorically, while numerical measurements were taken for the coincidence of the facial midline and midline of the upper dental arch, overjet, recurrent deviation on opening, and discrepancy between the centric relation (CR) and the intercuspal position (ICP) (Table 1). The CR was defined according to Dawson23 as “the relationship of the mandible to the maxilla when the properly aligned condyle–disk assemblies are in the most superior position against the eminentia, irrespective of tooth position or vertical dimension.” Before each stage, all assessment procedures were demonstrated, and detailed instructions were given to the observers. To achieve the CR, a bimanual manipulation technique of the mandible23 was used during the demonstration. However, the use of this technique was not insisted on; the observers were allowed to use their own methods.

TABLE 1. Assessments Used in the Reproducibility Study; Numerical Measurements Taken to the Nearest Millimeter With a Ruler

Two orthodontists, who participated in all stages of the study, were calibrated for the assessment of the chosen criteria. During a training session, they independently evaluated 20 dental casts. In case of a disagreement, the cast was reevaluated and the source of disagreement was discussed. Thereafter, both observers clinically assessed the occlusions of 20 randomly selected adolescents. The first five adolescents were assessed together and their recordings were excluded from the analyses. Calibration of other observers was not performed.

Statistical analyses

For the numerical variables, the disagreement between the observers concerning each volunteer was quantified by calculating the average of the absolute values of the differences between every pair of observers. The percentage of pairs in which the absolute difference was not more than 1 mm was also calculated. Because the comparisons involved five to seven observers at the same time and because the measurements could not be assumed to be normally distributed, the nonparametric method for repeated measurements (Friedman's test) was used to test the significance of differences.24 P-values of less than .05 were interpreted as statistically significant.
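The numerical analysis described above can be illustrated with a minimal sketch. It is not the authors' SAS code: the function names and the example readings are hypothetical, and the Friedman statistic uses the standard tie-corrected form, which is an assumption because the article does not say how ties between millimeter readings were handled.

```python
from itertools import combinations


def pairwise_disagreement(readings_mm, tolerance_mm=1):
    """Summarize one volunteer's readings (one value per observer).

    Returns the mean absolute difference over all observer pairs and the
    percentage of pairs whose absolute difference is <= tolerance_mm.
    """
    diffs = [abs(a - b) for a, b in combinations(readings_mm, 2)]
    mean_abs_diff = sum(diffs) / len(diffs)
    pct_within = 100 * sum(d <= tolerance_mm for d in diffs) / len(diffs)
    return mean_abs_diff, pct_within


def average_ranks(values):
    """Rank values 1..k; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Tie-corrected Friedman statistic.

    table: rows are volunteers (blocks), columns are observers.
    The result is compared with a chi-square distribution with
    (number of observers - 1) degrees of freedom.
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for v in set(row):
            t = row.count(v)
            tie_term += t ** 3 - t
    stat = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("all observers agree on every volunteer; statistic undefined")
    return stat / correction


if __name__ == "__main__":
    # Hypothetical overjet readings (mm): 3 volunteers x 3 observers.
    overjet = [[2, 3, 5], [3, 3, 4], [1, 2, 2]]
    for row in overjet:
        print(pairwise_disagreement(row))
    print(friedman_statistic(overjet))
```

The pairwise summary is computed per volunteer, as in the text; averaging those per-volunteer values across volunteers would give a single figure per characteristic.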

For categorical assessments, the proportion of agreement was used to avoid the pitfalls inherent in the intraclass correlation and the kappa coefficient.21,25,26 Clinically, it is often relevant to be aware of the agreement for both the acceptable and the nonacceptable classifications, especially if there is a low number of observations in one of the categories. Statistical computing was performed using the SAS System for Windows, release 8.1/2000.
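As an illustration of category-specific agreement, the sketch below pools all observer pairs and computes the overall proportion of agreement together with separate proportions for the acceptable and nonacceptable categories. The pooled-pairs formulation, 2a / (2a + discordant pairs), is one common way to define specific agreement and is an assumption here; the article does not give the exact formula, and the function name and example ratings are hypothetical.

```python
from itertools import combinations


def specific_agreement(ratings):
    """Proportions of agreement for dichotomous ratings.

    ratings: rows are volunteers, columns are observers;
    True means "acceptable", False means "nonacceptable".
    Returns (overall, acceptable, nonacceptable); a category-specific
    value is None when that category never occurs in any pair.
    """
    both_acc = both_non = discordant = 0
    for row in ratings:
        for a, b in combinations(row, 2):
            if a and b:
                both_acc += 1
            elif not a and not b:
                both_non += 1
            else:
                discordant += 1
    total = both_acc + both_non + discordant
    overall = (both_acc + both_non) / total
    acc_den = 2 * both_acc + discordant
    non_den = 2 * both_non + discordant
    p_acc = 2 * both_acc / acc_den if acc_den else None
    p_non = 2 * both_non / non_den if non_den else None
    return overall, p_acc, p_non


if __name__ == "__main__":
    # Hypothetical ratings: 3 volunteers x 3 observers.
    example = [
        [True, True, True],
        [True, True, False],
        [False, False, True],
    ]
    print(specific_agreement(example))
```

With few nonacceptable observations, the denominator of the nonacceptable proportion is small, so a single discordant pair shifts it considerably; this is the reason for reporting the two categories separately.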

RESULTS

At the dichotomous level, the proportion of agreement for acceptance among all observers ranged from moderate to good, while that for the nonacceptable category varied between poor and perfect (Tables 2 through 5). In the acceptable category, the orthodontists achieved a good level of agreement for all numerical variables (Tables 2 and 4).

TABLE 2. Reproducibility of Numerical Morphological Variables Among All Observers; Results Are Shown for All Observers or Separately for Orthodontists and General Practitioners (GPs)

Although systematic differences were found in numerical measurements among both orthodontists and general practitioners, these differences were of minor clinical importance except in measurements of recurrent deviation on opening. For this characteristic, the measurements of all observers fell within one measurement unit (1 mm) in only 0–22% of the volunteers. Further, the mean of the average differences (calculated from the absolute values of differences) was more than twice that of the other criteria (Table 4). Even the measurements made by the calibrated orthodontists indicated a systematic difference at the level of calibration (P = .04). Their measurements fell within one measurement unit in 55% of all examined volunteers (n = 38).

TABLE 4. Reproducibility of Numerical Functional Variables Among All Observers; Results Are Shown for All Observers or Separately for Orthodontists and General Practitioners (GPs)

DISCUSSION

In many Finnish health centers, general practitioners, under the supervision of an orthodontist, carry out screening of malocclusions and simple treatment procedures.27 In these cases, a satisfactory level of agreement between the orthodontists and general practitioners is of importance. All orthodontists participating in our study were familiar with the assessments, and their agreement level was considered to represent the level that could be achieved through training. The accuracy of measuring was set to 1 mm, which was considered adequate for measurements taken directly from the mouth. For a number of reasons, the study was conducted in several stages, with relatively few observers participating in each stage. As the assessment took about 6–7 minutes per observer, we suspected that a larger number of repeated examinations could have affected the volunteers' functional status and distorted the results. Furthermore, the time available for the assessment was limited because it took place during an orthodontic follow-up visit or an annual dental examination. This design made it possible to study samples of both orthodontically treated and untreated occlusions and enabled the inclusion of observers with varying orthodontic backgrounds.

Of all assessments, the widest variability was found in measurements of recurrent deviation on opening. This finding is in line with earlier studies, in which the reproducibility of categorically assessed jaw opening patterns has ranged from poor to good.16–18,20,28 It is possible, however, that the high variation in recurrent deviation on opening does not reflect differences in technical management but rather exemplifies the instability of the characteristic.8,17,18,28

As in earlier studies,2,3,11,13,16,17 the classification of canine relationship was found to be ambiguous. Given that the sagittal measurements were reproduced with high precision, it is unlikely that the observed discrepancies in canine classification could be attributed to variation in mandibular position. Instead, it is possible that not all observers were familiar with applying Angle's classification to canines. It is also possible that the observers did not use the same viewing angle when assessing the buccal segment occlusion,29,30 which might explain some of the observed variation. In borderline cases, the differences may have arisen from judgmental variation31 based on differing interpretations of Angle's classes. Practical training, together with clear instructions and well-defined demarcation lines, would probably increase the reproducibility of the classification of this characteristic.

When measured in millimeters, overbite has been shown to have good reproducibility.17 In line with the present results, the agreement in categorical assessments has varied between moderate and good.3,10,13 However, in our study, the percentages of exact agreement (within the same category) indicated a wider variability than was found by Keeling et al.3

CONCLUSIONS

The agreement among all observers concerning the acceptable category was good in the assessment of overjet, coincidence of midlines, crossbite, scissors bite, open bite, and discrepancy between the CR and the ICP. Moderate agreement was achieved in the assessment of overbite, canine relationship, and guided lateral excursions.

In the nonacceptable category, the variability in agreement may partly reflect the low number of observations in this group.

Exact agreement in categorical assessments was highly variable.

The reproducibility of measurements of recurrent deviation on opening was poor, as described by the relatively high mean of the average absolute differences and by the low percentage of pairs within one measurement unit.

TABLE 3. Reproducibility of Categorical Morphological Variables Among All Observers; Results Are Shown for All Observers or Separately for Orthodontists

TABLE 5. Reproducibility of Categorical Functional Variables Among All Observers; Results Are Shown for All Observers or Separately for Orthodontists

Acknowledgments

The authors wish to thank the chief dental officers and their staff in the municipal health centers of Pori and Rauma and the staff at the Department of Oral Development and Orthodontics in the Institute of Dentistry, University of Turku, for their cooperation and assistance in conducting the study. We are grateful to Mr Heikki Hiekkanen for performing the statistical analyses, and we warmly thank all the participating volunteers. This study was supported by a grant from the Emil Aaltonen Foundation.

REFERENCES

  • 1. Rinchuse, D. J. and D. J. Rinchuse. Ambiguities of Angle's classification. Angle Orthod 1989. 59:295–298.

  • 2. Katz, M. I. Angle classification revisited 1: is current use reliable? Am J Orthod Dentofac Orthop 1992. 102:173–179.

  • 3. Keeling, S. D., S. McGorray, T. T. Wheeler, and G. J. King. Imprecision in orthodontic diagnosis: reliability of clinical measures of malocclusion. Angle Orthod 1996. 66:381–392.

  • 4. Carlsson, G. E., I. Egermark-Eriksson, and T. Magnusson. Intra- and inter-observer variation in functional examination of the masticatory system. Swed Dent J 1980. 4:187–194.

  • 5. Kopp, S. and B. Wenneberg. Intra- and interobserver variability in the assessment of signs of disorder in the stomatognathic system. Swed Dent J 1983. 7:239–246.

  • 6. Nielsen, L., B. Melsen, and S. Terp. Clinical classification of 14–16-year-old Danish children according to functional status of the masticatory system. Commun Dent Oral Epidemiol 1988. 16:47–51.

  • 7. Dahlström, L., S. D. Keeling, J. R. Fricton, S. Galloway Hilsenbeck, G. M. Clark, and J. D. Rugh. Evaluation of a training program intended to calibrate examiners of temporomandibular disorders. Acta Odontol Scand 1994. 52:250–254.

  • 8. de Wijer, A., A. M. Lobbezoo-Scholte, M. H. Steenks, and F. Bosman. Reliability of clinical findings in temporomandibular disorders. J Orofacial Pain. 1995;181–191.

  • 9. Lewis, E. A., J. E. Albino, J. J. Cunat, and L. A. Tedesco. Reliability and validity of clinical assessments of malocclusion. Am J Orthod 1982. 81:473–477.

  • 10. Phillips, C., L. T. Bailey, and R. P. Sieber. Level of agreement in clinicians' perceptions of class II malocclusions. J Oral Maxillofac Surg 1994. 52:565–571.

  • 11. Baumrind, S., E. L. Korn, R. L. Boyd, and R. Maxwell. The decision to extract: part 1—interclinician agreement. Am J Orthod Dentofacial Orthop 1996. 109:297–309.

  • 12. Du, S. Q., D. J. Rinchuse, T. G. Zullo, and D. J. Rinchuse. Reliability of three methods of occlusion classification. Am J Orthod Dentofacial Orthop 1998. 113:463–470.

  • 13. Luke, L. S., K. A. Atchison, and S. C. White. Consistency of patient classification in orthodontic diagnosis and treatment planning. Angle Orthod 1998. 68:513–520.

  • 14. Gravely, J. F. and D. B. Johnson. Angle's classification of malocclusion: an assessment of reliability. Br J Orthod 1974. 1:79–86.

  • 15. Buchanan, I. B., A. Downing, and D. R. Stirrups. A comparison of the Index of Orthodontic Treatment Need applied clinically and to diagnostic records. Br J Orthod 1994. 21:185–188.

  • 16. Dworkin, S. F., L. LeResche, and T. DeRouen. Reliability of clinical measurement in temporomandibular disorders. Clin J Pain 1988. 4:89–99.

  • 17. Dworkin, S. F., L. LeResche, T. DeRouen, and M. Von Korff. Assessing clinical signs of temporomandibular disorders: reliability of clinical examiners. J Prosthet Dent 1990. 63:574–579.

  • 18. Kopp, S. Constancy of clinical signs in patients with mandibular dysfunction. Commun Dent Oral Epidemiol 1977. 5:94–98.

  • 19. Westling, L., E. Helkimo, and A. Mattiasson. Observer variation in functional examination of the temporomandibular joint. J Craniomandib Disord Facial Oral Pain 1992. 6:202–207.

  • 20. Wahlund, K., T. List, and S. F. Dworkin. Temporomandibular disorders in children and adolescents: reliability of a questionnaire, clinical examination, and diagnosis. J Orofacial Pain 1998. 12:42–51.

  • 21. Grant, J. M. The fetal heart rate trace is normal, isn't it? Observer agreement of categorical assessments. Lancet 1991. 337:215–218.

  • 22. Svedström-Oristo, A-L., T. Pietilä, I. Pietilä, P. Alanen, and J. Varrela. Morphological, functional and aesthetic criteria of acceptable mature occlusion. Eur J Orthod 2001. 23:373–381.

  • 23. Dawson, P. E. Evaluation, Diagnosis, and Treatment of Occlusal Problems, 2nd edition. St Louis, Mo: Mosby; 1989:29, 41–44.

  • 24. Lehmann, E. L. Nonparametrics: Statistical Methods Based on Ranks. San Francisco: Holden Day Inc; 1975:120–201, 260–285.

  • 25. Haas, M., J. Nyiendo, C. Peterson, H. Thiel, T. Sellers, D. Cassidy, and K. Young-Hing. Interrater reliability of roentgenological evaluation of the lumbar spine in lateral bending. J Manipulative Physiol Ther 1990. 13:179–189.

  • 26. Pett, M. A. Nonparametric Statistics for Health Care Research, Statistics for Small Samples and Unusual Distributions. London: Sage Publications; 1997:237–248.

  • 27. Pietilä, T., I. Pietilä, E. Widström, J. Varrela, and P. Alanen. Extent and provision of orthodontic services for children and adolescents in Finland. Commun Dent Oral Epidemiol 1997. 25:150–155.

  • 28. Smith, J. P. Observer variation in the clinical diagnosis of mandibular pain dysfunction syndrome. Commun Dent Oral Epidemiol 1977. 5:91–93.

  • 29. Scivier, G. A., D. M. Menezes, and C. D. Parker. A pilot study to assess the validity of the orthodontic treatment priority index in English schoolchildren. Community Dent Oral Epidemiol 1974. 2:246–252.

  • 30. Richmond, S., W. C. Shaw, K. O'Brien, I. B. Buchanan, R. Jones, C. D. Stephens, C. T. Roberts, and M. Andrews. The development of the PAR Index (Peer Assessment Rating): reliability and validity. Eur J Orthod 1992. 14:125–139.

  • 31. Kay, E. and N. Nuttal. Clinical decision making—an art or a science? Part II: making sense of treatment decisions. Br Dent J 1995. 178:113–116.

Copyright: Edward H. Angle Society of Orthodontists

Contributor Notes

Corresponding author: Anna-Liisa Svedström-Oristo, Institute of Dentistry, University of Turku, Lemminkäisenkatu 2, FIN 20520 Turku, Finland (anlisve@utu.fi)

Accepted: 01 Feb 2002