SP16 The Effect of Standardization and Training on Inter- and Intra-Rater Reliability of the Modified Ashworth Scale in Children with Cerebral Palsy

Nancy J. Clegg, BSN, MSN, PhD; Deborah A. Baldwin, BS; Sara Baldwin, BS; Carol Chambers, BS, MS, PCS, PT; Hun Epps, PT; Margie Goggans, BS; ChanHee Jo, PhD; Charter Rushing, PhD, PT; Angela Shierk, PhD, OTR/L; Mauricio R. Delgado, MD

Texas Scottish Rite Hospital for Children • The University of Texas Southwestern Medical Center at Dallas, Texas

BACKGROUND:

The reliable measurement of spasticity is critical to monitor the progress of children with cerebral palsy (CP) and to assess the efficacy of current treatments. Most studies of spasticity measurement have used the Ashworth Scale (Ashworth 1964) or the Modified Ashworth Scale (Bohannon & Smith 1987). Both are actually tone or stiffness measuring scales, but they have become a de facto criterion standard for spasticity measurement.

Bohannon and Smith (1987) revised the original Ashworth Scale (1964) to render the scale more discrete. The revised scale is commonly referred to as the Modified Ashworth Scale (MAS):
• 0 = no increase in muscle tone
• 1 = slight increase in muscle tone, manifested by a catch and release or by minimal resistance at the end of the range of motion (ROM) when the affected part(s) is moved in flexion or extension
• 1+ = slight increase in muscle tone, manifested by a catch, followed by minimal resistance throughout the remainder (less than half) of the ROM
• 2 = more marked increase in muscle tone through most of the ROM, but affected part(s) easily moved
• 3 = considerable increase in muscle tone, passive movement difficult
• 4 = affected part(s) rigid in flexion or extension

Although the original scale has been modified twice, reliability studies have demonstrated varied results across ages. Furthermore, there are limited studies in children with CP. (See Table 1.)

| Research study | Raters | Subjects | Muscle groups | Intra-rater reliability (ICC) |
|---|---|---|---|---|
| Fosang et al. (2003) | N=6 | N=18 | 3 Lower Limb | -0.07 to 0.85 |
| Clopton et al. (2005) | N=5 | N=17 | 6 Upper/Lower Limb | 0.54 to 0.80 |
| Yam & Leung (2006) | N=2 | N=17 | 2 Lower Limb | Not performed |
| Mutlu et al. (2008) | N=3 | N=38 | 5 Lower Limb | 0.36 to 0.83 |
| Klingels et al. (2010) | N=2 | N=30 | 8 Upper Limb | 0.57 to 0.85 |
| Numanoğlu & Günel (2012) | N=1 | N=37 | 6 Upper/Lower Limb | 0.26 to 0.66 |
| Delgado et al. (2015) [In Process] | N=3 Untrained, N=3 Trained | N=17 | 4 Upper/Lower Limb | Untrained: 0.37 to 0.81; Trained: 0.71 to 0.85 |

Table 1: Literature review of MAS intra- and inter-rater reliability studies in children with cerebral palsy.

Recently, while executing another study using the MAS, “A phase III, multicenter, double-blind, prospective, randomized, controlled, multiple treatment study assessing efficacy and safety of Dysport® used in the treatment of upper limb spasticity in children” (Protocol Y-52-52120-153), the investigators created more detailed instructions on using the MAS to enhance reliability of measurements across multiple sites.

MATERIALS/METHODS:

The subjects were divided into three groups (6 patients per group) and were assessed at one of three assigned times (AM, Noon, or PM) on both days, approximately 24 hours apart. During the assessment, each clinician rated the subject’s muscle tone using the MAS. Four muscle groups (elbow flexors, wrist flexors, knee flexors, and ankle plantarflexors) on each subject were measured by each rater on 2 consecutive days. Each rater had ten minutes per subject to complete the four MAS assessments.

Raters were divided into 2 groups. Group A had specialized training in rating hypertonia using the MAS following a standardized protocol. Group B performed MAS assessments following Bohannon & Smith’s 1987 published recommendations. The order of evaluations by examiners was randomized.

All raters received the article by Bohannon and Smith (1987) and were instructed to:
• Place the child in a supine position with the head midline.
• For the timing for the extension of the limb, use a ‘fast velocity’ of one second.
• Keep repeated movement cycles at a minimum, 5 to 8 times.

The primary outcome measures were the extent of agreement among all raters (inter-rater reliability) and the extent of agreement between each rater’s 2 evaluations (intra-rater reliability). Statistical analysis was performed using weighted kappas with 95% confidence intervals.

GROUP A: TRAINED RATERS:

For the purposes of this study, raters in Group A were trained in performing MAS assessments using a newly developed specialized training method. Raters in this group attended a lecture (PowerPoint with video demonstrations), received a training manual (See Figure 1.), were given detailed data collection tools for each joint (See Figure 2.), received detailed instructions on dividing the available range of motion into quarters, watched demonstrations of the new assessment method, and participated in supervised practice sessions with clinic patients.

These raters were given specific instructions on dividing the available range of motion into quarters to better delineate the scores on the scale:
• A MAS rating of “1” refers to a catch and release in the 4th quarter (last quarter) of the available range of motion.
• A MAS rating of “1+” refers to a catch and release in the 3rd quarter of the available range of motion.
• A MAS rating of “2” or “3” refers to an increase in muscle tone that begins in the 1st or 2nd quarter of the available range of motion.

Following training and supervised practice, raters were certified in the assessment of tone using the MAS in clinic on pediatric patients diagnosed with CP. (See Figure 3.)

Figure 2: Data collection tool for Group A Trained Raters.
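The quarter-based instructions given to the Group A raters can be illustrated with a short sketch. This is not the study's data collection tool: the function names, the use of joint angles in degrees, and the handling of a catch that falls exactly on a quarter boundary are assumptions made for illustration. The sketch only locates the catch within the available range and lists the MAS scores that the poster associates with that quarter. Choosing between “2” and “3”, and assigning “0” or “4”, still rests on the examiner's judgment of the resistance felt.

```python
from typing import Tuple


def quarter_of_range(start_deg: float, end_deg: float, catch_deg: float) -> int:
    """Return the quarter (1-4) of the available range in which the catch occurs.

    start_deg and end_deg are the joint angles at which the passive movement
    starts and ends; the movement may run in either direction.
    """
    span = end_deg - start_deg
    if span == 0:
        raise ValueError("available range of motion is zero")
    fraction = (catch_deg - start_deg) / span
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("catch angle lies outside the available range")
    # Assumption: a catch exactly on a boundary counts toward the later quarter,
    # and the very end of the range belongs to the 4th quarter.
    return min(int(fraction * 4) + 1, 4)


def candidate_mas_scores(quarter: int) -> Tuple[str, ...]:
    """Map a quarter to the MAS score(s) named in the Group A instructions."""
    if quarter == 4:
        return ("1",)        # catch and release in the last quarter
    if quarter == 3:
        return ("1+",)       # catch and release in the 3rd quarter
    if quarter in (1, 2):
        return ("2", "3")    # tone increase begins in the 1st or 2nd quarter
    raise ValueError("quarter must be 1, 2, 3, or 4")


# Hypothetical example: movement from 20 to 140 degrees, catch felt at 125 degrees.
quarter = quarter_of_range(20.0, 140.0, 125.0)
print(quarter, candidate_mas_scores(quarter))
```

Expressing the catch location as a fraction of the available range, rather than as an absolute angle, is what lets the same rule apply to joints and children with very different ranges of motion.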
Examiner certification form for Group A Trained Raters (see Figure 3). Form fields: Assessor, Subject#. Each item is marked Agree or Disagree for the left and right side of each joint (columns: ELBOW, WRIST):
• Patient and family were comfortable throughout the examination
• Examiner gave patient age-appropriate instructions/explanations
• Patient safety was priority at all times
• Proper disinfectant measures were taken
• Correctly positioned patient’s head/trunk & maintained position
• Correctly stabilized opposite limb and maintained stabilization
• Correctly positioned distal and proximal limb to be measured
• Examiner correctly positioned his/her hands
• Relaxation technique applied correctly

OBJECTIVES:

The purpose of the current study was to examine the effect of standardization and training on inter- and intra-rater reliability of the MAS in children with cerebral palsy. The study compared two groups of raters: Group A with specialized training and Group B using the MAS per standard clinical protocol following the recommendations of Bohannon and Smith (1987).

STUDY PARTICIPANTS/SETTING:

Seventeen children (mean age 10.9 ± 3.38 years) with hypertonia due to CP were recruited from the neurology clinic population at a tertiary care facility, Texas Scottish Rite Hospital for Children (TSRHC). The six raters were healthcare professionals with specific experience in the management of patients with hypertonia: 1 Neurologist, 1 Physician Assistant, 1 Occupational Therapist, and 3 Physical Therapists.

INTRA-RATER RELIABILITY

| Group | Rater | Weighted Kappa | 95% Confidence Interval |
|---|---|---|---|
| Trained | #1 | 0.725 | (0.5665, 0.8355) |
| Trained | #2 | 0.749 | (0.5470, 0.8574) |
| Trained | #3 | 0.853 | (0.7320, 0.9277) |
| Untrained | #4 | 0.595 | (0.3698, 0.7597) |
| Untrained | #5 | 0.822 | (0.6976, 0.9023) |
| Untrained | #6 | 0.387 | (0.0882, 0.6320) |

Table 2: Intra-rater reliability for all healthcare providers with specific experience in the management of patients with hypertonia: three trained raters using a newly developed standardized protocol (highlighted in blue) and three untrained raters performing assessments using routine clinical practice (highlighted in tan).
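The intra-rater values in Table 2 are weighted kappas comparing each rater's Day 1 and Day 2 scores. The sketch below shows how such a statistic can be computed for two sets of MAS ratings. The poster does not state the weighting scheme or the software used, so the linear weights, the category order, the function name, and the example ratings are all assumptions for illustration. The confidence intervals and the way agreement among three raters was combined are not covered here.

```python
from collections import Counter

# Ordered MAS categories; "1+" sits between "1" and "2".
MAS_LEVELS = ["0", "1", "1+", "2", "3", "4"]


def weighted_kappa(first, second, levels=MAS_LEVELS):
    """Cohen's weighted kappa with linear disagreement weights (assumed scheme).

    first and second are equal-length lists of ratings for the same
    subject/joint pairs, e.g. one rater's Day 1 and Day 2 scores.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    index = {level: i for i, level in enumerate(levels)}
    k = len(levels)
    n = len(first)
    a = [index[x] for x in first]
    b = [index[x] for x in second]

    # Observed disagreement: mean normalized distance between paired ratings.
    observed = sum(abs(i - j) for i, j in zip(a, b)) / ((k - 1) * n)

    # Expected disagreement if the two sets of ratings were independent,
    # computed from the marginal counts of each category.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(
        abs(i - j) * count_a[i] * count_b[j]
        for i in range(k)
        for j in range(k)
    ) / ((k - 1) * n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1.0 - observed / expected


# Made-up ratings for four joints on two days (not study data).
day1 = ["1", "1+", "2", "3"]
day2 = ["1", "2", "2", "3"]
print(round(weighted_kappa(day1, day2), 3))  # about 0.8 for this toy example
```

With weighted kappa, a disagreement between adjacent categories (for example “1+” versus “2”) is penalized less than a disagreement between distant ones, which suits an ordinal scale such as the MAS.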
METHODS:

GROUP B: UNTRAINED RATERS:

For the purposes of this study, raters in Group B were instructed to perform assessments using the MAS per standard clinical protocol following the recommendations of Bohannon and Smith (1987). Raters attended a lecture to review the MAS and the recommendations. The raters also received a data collection tool for each joint. (See Figure 4.)

These raters were given the standard definitions provided by Bohannon and Smith (1987) for scoring their assessments:
• A MAS rating of “1” refers to a catch and release at the end of the range of motion.
• A MAS rating of “1+” refers to a catch followed by minimal resistance throughout the remainder (less than half) of the range of motion.
• A MAS rating of “2” or “3” refers to an increase in muscle tone through most of the range of motion.

Raters in Group B were not provided supervised practice sessions, nor were they certified in their method of assessing tone using the MAS.

Figure 4: Data collection tool for Group B Untrained Raters.

Figure 1: Excerpt from the MAS Training Manual on assessment of the elbow flexors.

Figure 3: Examiner certification form for Group A Trained Raters. Form fields include Evaluator. Each item is marked Agree or Disagree for the left and right side of each joint (columns: KNEE, ANKLE):
• Max flexion accurate
• Max extension accurate
• Available range of motion calculated correctly
• Speed accurate: 1 second
• Ashworth: Rated resistance to passive movements of the joint correctly
• Ashworth: Recorded resistance to passive movements of the joint correctly

Table 1, inter-rater reliability column:

| Research study | Inter-rater reliability (ICC) |
|---|---|
| Fosang et al. (2003) | 0.27 to 0.56 |
| Clopton et al. (2005) | 0.33 to 0.79 |
| Yam & Leung (2006) | 0.41 to 0.73 |
| Mutlu et al. (2008) | 0.61 to 0.87 |
| Klingels et al. (2010) | 0.52 to 0.83 |
| Numanoğlu & Günel (2012) | Not performed |
| Delgado et al. (2015) [In Process] | Untrained: 0.27 to 0.29; Trained: 0.716 to 0.717 |

RESULTS:

Intra-rater reliability for the three healthcare providers in the trained group ranged from 0.73 to 0.85, while that for the three healthcare providers in the untrained group ranged from 0.39 to 0.82. (See Table 2.)

Inter-rater reliability for the trained raters was 0.72 on Day 1 and 0.66 on Day 2. The inter-rater reliability for the untrained raters was 0.23 on Day 1 and 0.27 on Day 2. (See Table 3.)

INTER-RATER RELIABILITY BY GROUP

| Group | Day | Weighted Kappa | 95% Confidence Interval |
|---|---|---|---|
| Trained | Day 1 | 0.717 | (0.6040, 0.8133) |
| Trained | Day 2 | 0.663 | (0.5386, 0.7626) |
| Untrained | Day 1 | 0.233 | (0.1085, 0.3568) |
| Untrained | Day 2 | 0.269 | (0.1415, 0.3906) |

Table 3: Inter-rater reliability on Day 1 and Day 2 for Group A Trained Raters using the new standardized protocol (highlighted in blue) compared to Group B Untrained Raters using routine clinical practice (highlighted in tan).

For inter-rater reliability by joint, the trained group had greater weighted kappa values than the untrained group for each of the four joints. For the trained raters, the inter-rater reliability ranged from 0.56 to 0.66. For the untrained raters, inter-rater reliability ranged from 0.11 to 0.41. (See Table 4 and Figure 5.)

INTER-RATER RELIABILITY BY JOINT AND GROUP

| Joint | Group | Weighted Kappa | 95% Confidence Interval |
|---|---|---|---|
| Elbow | Trained | 0.6555 | (0.3572, 0.8604) |
| Elbow | Untrained | 0.4047 | (0.1148, 0.6717) |
| Wrist | Trained | 0.5643 | (0.2865, 0.7909) |
| Wrist | Untrained | 0.2817 | (-0.0086, 0.6257) |
| Knee | Trained | 0.6591 | (0.2675, 0.7995) |
| Knee | Untrained | 0.1943 | (0.0329, 0.3673) |
| Ankle | Trained | 0.6133 | (0.3772, 0.8326) |
| Ankle | Untrained | 0.1106 | (0.0244, 0.2586) |

Table 4: Inter-rater reliability for four joints comparing the Group A Trained Raters (highlighted in blue) versus Group B Untrained Raters (highlighted in tan).

Figure 5: Inter-rater reliability by joint: weighted kappas for Group A Trained Raters (highlighted in blue) versus Group B Untrained Raters (highlighted in tan).

DISCUSSION:

Training was associated with significant improvement in intra-rater reliability.
Intra-rater reliability for the healthcare providers in the trained group showed good to excellent agreement (0.73 to 0.85) compared to raters in the untrained group (0.39 to 0.82).

Training was associated with significant improvement in inter-rater reliability. Inter-rater reliability for the trained raters demonstrated good agreement (Day 1 = 0.72, Day 2 = 0.66). The inter-rater reliability for the untrained raters showed poor agreement (Day 1 = 0.23, Day 2 = 0.27).

For inter-rater reliability by joint, the trained group showed greater reliability (0.56 to 0.66) than the untrained group (0.11 to 0.41) for each of the four joints. As found in previous intra- and inter-rater reliability studies of the MAS, reliability differed between muscle groups. As expected, both trained and untrained raters had fewer differences with measurements for the elbow joint. The trained raters had more challenges with wrist measurements. The untrained raters had challenges with all four joints but had the greatest challenges with the ankle and knee, followed closely by the wrist.

Based on this study as well as existing evidence, when utilizing the MAS, standardizing assessment techniques and providing detailed instruction and practice result in greater intra- and inter-rater reliability. Key components of reliably measuring tone using the MAS include:
1. Understanding the MAS definitions and accurately calculating the quarters of available range
2. Keeping the patient’s body and limb position the same and stabilizing the proximal limb segment
3. Using standardized hand placement for the clinicians
4. Using a consistent speed of movement of one second for the full available range
5. Ability of clinicians to recognize the different manifestations of muscle resistance
6. Ability of clinicians to identify maximum flexion and maximum extension, and to pinpoint the location where resistance begins
7. Accurate recording of all measurements and calculations
8. Practice of the standardized process to gain proficiency and accuracy, even for skilled clinicians

CONCLUSIONS:

Training was associated with significant improvement in intra-rater reliability.
• Intra-rater reliability for the healthcare providers in the trained group showed good to excellent agreement compared to varied results for raters in the untrained group.

Training was associated with significant improvement in inter-rater reliability.
• Inter-rater reliability for the trained raters demonstrated good agreement. The inter-rater reliability for the untrained raters showed poor agreement.

Training was associated with significant improvement in inter-rater reliability by joint.
• For each of the four joints assessed, the trained group showed greater inter-rater reliability than the untrained group.
• As expected, both trained and untrained raters had fewer differences with measurements for the elbow joint.
• The trained raters had more challenges with wrist measurements, although they still demonstrated greater inter-rater reliability than the untrained raters.
• The untrained raters had challenges with all four joints but had the greatest challenges with the ankle and knee, followed closely by the wrist.

With most pediatric studies of hypertonia due to CP using the MAS for tone measurement, it is imperative to have reliable inter- and intra-rater measurements. Standardization and training of investigators significantly improve the accuracy of hypertonia assessment in children with CP, with obvious clinical and research implications.

REFERENCES:

Ashworth B. Preliminary trial of carisoprodol in multiple sclerosis. Practitioner 1964;192:540-2.
Bohannon RW, Smith MB. Interrater reliability of a Modified Ashworth Scale of muscle spasticity. Physical Therapy 1987;67:206-7.
Clopton N, Dutton J, Featherston T, Grigsby A, Mobley J, Melvin J. Interrater and intrarater reliability of the Modified Ashworth Scale in children with hypertonia. Pediatric Physical Therapy 2005;17(4):268-74.
Fosang AL, Galea MP, McCoy AT, Reddihough DS. Measures of muscle and joint performance in the lower limb of children with cerebral palsy. Developmental Medicine & Child Neurology 2003;45:664-70.
Gracies J-M, Burke K, Clegg NJ, Browne R, Rushing C, Fehlings D, Matthews D, Tilton A, Delgado MR. Reliability of the Tardieu Scale for assessing spasticity in children with cerebral palsy. Archives of Physical Medicine and Rehabilitation 2010;91:421-8.
Klingels K, De Cock P, Molenaers G, Desloovere K, Huenaerts C, Jaspers E, Feys H. Upper limb motor and sensory impairments in children with hemiplegic cerebral palsy. Can they be measured reliably? Disability and Rehabilitation 2010;32(5):409-16.
Mutlu A, Livanelioglu A, Gunel MK. Reliability of Ashworth and Modified Ashworth Scales in children with spastic cerebral palsy. BMC Musculoskeletal Disorders 2008;9:44.
Numanoğlu A, Günel MK. Intraobserver reliability of Modified Ashworth Scale and Modified Tardieu Scale in the assessment of spasticity in children with cerebral palsy. Acta Orthopaedica et Traumatologica Turcica 2012;46(3):196-200.
Pandyan AD, Johnson GR, Price CI, Curless RH, Barnes MP, Rodgers H. A review of the properties and limitations of the Ashworth and Modified Ashworth Scales as measures of spasticity. Clinical Rehabilitation 1999;13(5):373-83.
Yam WKL, Leung MSM. Interrater reliability of Modified Ashworth Scale and Modified Tardieu Scale in children with spastic cerebral palsy. Journal of Child Neurology 2006;21:1031-5.

PRESENTER:
Nancy J. Clegg, RN, CNS, PhD, CCRP
Childhood Motor Disorders Research Coordinator
Texas Scottish Rite Hospital for Children

CORRESPONDING AUTHOR:
Mauricio R. Delgado, MD, FRCPC, FAAN
Professor of Neurology and Neurotherapeutics, University of Texas Southwestern Medical Center at Dallas
Director of Pediatric Neurology, Texas Scottish Rite Hospital for Children
2222 Welborn Street, Dallas, Texas 75219
Office: 214-559-7831
Fax: 214-559-8383
[email protected]

Disclosure of Relevant Financial Relationships: We have the following financial relationships to disclose: Grant/Research support from IPSEN.

Disclosure of Off-Label and/or investigative uses: We will not discuss off-label use and/or investigational use in our presentation.