J Shoulder Elbow Surg (2014) -, 1-8
www.elsevier.com/locate/ymse
Measuring shoulder external and internal rotation
strength and range of motion: comprehensive intra-rater
and inter-rater reliability study of several testing
protocols
Ann M. Cools, PT, PhD (a,*), Lieven De Wilde, MD, PhD (b), Alexander Van Tongel, MD (b),
Charlotte Ceyssens, PT (a), Robin Ryckewaert, PT (a), Dirk C. Cambier, PT, PhD (a)
(a) Department of Rehabilitation Sciences and Physiotherapy, Faculty of Medicine and Health Sciences, University Hospital,
Ghent, Belgium
(b) Department of Orthopaedic Surgery and Traumatology, Faculty of Medicine and Health Sciences, University Hospital,
Ghent, Belgium
Background: Shoulder range of motion (ROM) and strength measurements are imperative in the clinical
assessment of the patient’s status and progression over time. The method and type of assessment varies
among clinicians and institutions. No comprehensive study to date has examined the reliability of a variety
of procedures based on different testing equipment and specific patient or shoulder position. The purpose of
this study was to establish absolute and relative reliability for several procedures measuring the rotational
shoulder ROM and strength into internal (IR) and external (ER) rotation.
Methods: Thirty healthy individuals (15 male, 15 female), with a mean age of 22.1 ± 1.4 years, were
examined by 2 examiners who measured ROM with a goniometer and inclinometer and isometric strength
with a hand-held dynamometer (HHD) in different patient and shoulder positions. Relative reliability was
determined by intraclass correlation coefficients (ICC). Absolute reliability was quantified by standard
error of measurement (SEM) and minimal detectable change (MDC). Systematic differences across trials
or between testers, as well as differences among similar measurements under different testing circumstances, were analyzed with dependent t tests or repeated-measures analysis of variance in case of 2 or
more than 2 conditions, respectively.
Results: Reliability was good to excellent for IR and ER ROM and isometric strength measurements,
regardless of patient or shoulder position or equipment used (ICC, 0.85-0.99). For some of the measurements, systematic differences were found across trials or between testers. The patient’s position and the
equipment used resulted in different outcome measures.
Conclusions: All procedures examined showed acceptable reliability for clinical use. However, patient position and equipment might influence the results.
The Gent University Ethics Committee approved this study (Approval No.
B67020109775).
*Reprint requests: Ann M. Cools, PT, PhD, Department of Rehabilitation Sciences and Physiotherapy, University Hospital Ghent, De Pintelaan 185, 2B3, B-9000 Ghent, Belgium.
E-mail address: [email protected] (A.M. Cools).
1058-2746/$ - see front matter © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees.
http://dx.doi.org/10.1016/j.jse.2014.01.006
Level of evidence: Level III, Diagnostic Study.
© 2014 Journal of Shoulder and Elbow Surgery Board of Trustees.
Keywords: Shoulder rotation; range of motion measurement; strength measurement; reliability; goniometer; inclinometer; hand-held dynamometer
Clinicians and researchers routinely evaluate changes in
the status of patients over time. The assessment of range of
motion (ROM) and muscle strength is important in (1) the
diagnosis of glenohumeral disorders and pathologies, (2)
the assessment of treatment progression and effectiveness,
and (3) for quantifying the amount of change in movement
quality and force development occurring over time.30-32 In
addition, establishing objective measurements of ROM and
strength is essential to the identification of risk factors for
shoulder pain, particularly in an athletic population.3,28 It
is, therefore, important for clinicians and researchers to
have accurate and reliable examination tools to objectively
assess the functional status of the shoulder joint.19,25 In
general, ROM and strength assessment are considered to
be necessary outcome measures of shoulder function besides self-report outcome scores and subjective clinical
examinations.11,24,26
The method and type of functional shoulder assessment,
including the patient’s position and testing equipment,
varies among clinicians and institutions by factors such as
time, the clinician’s educational background, availability of
equipment, and the specific movement or muscle being
assessed. Goniometry has been used widely for ROM
assessment because of its low cost and portability. However, in the assessment of shoulder external (ER) and internal (IR) rotation ROM, the clinician is required to use
both hands, which makes stabilization of the trunk and the
scapula more difficult and thus often leads to increased risk
for measurement errors or the need for a second assessor.35
Inclinometry is another practical alternative in which
gravity is used as a reference point for ROM measurements.
Digital inclinometers are portable and lightweight but are
more costly than goniometers. In addition, ROM measurements with an inclinometer can only be performed in a
vertical plane because the tool depends on gravity for
interpretation of ROM measurements.17
A variety of methods have been used for the assessment
of shoulder rotational strength, including manual muscle
testing (MMT), hand-held dynamometry (HHD), and isokinetic testing. Although considered to be the gold standard, isokinetic testing is not always user-friendly because
of the high costs and the laboratory setting required.30 HHD
is a more objective evaluation method and far superior to subjective MMT when evaluating changes in muscle
strength after injury.27
Numerous studies30,32 have examined the reliability of
one specific testing protocol or novel equipment4,6,14,16,29
or have compared testing positions or equipment.15,18,20
In general, the following conclusions may be drawn:
First, intra-rater and inter-rater reliability for the measurement of the passive movements of the shoulder varies with
the method of measurement and the equipment used.32
Universal goniometers, as well as inclinometers, are recommended for the assessment of shoulder ROM.32 Standardized trunk and scapula fixation, as well as
standardization of the amount of overpressure at the end
ROM, increase reliability.12,35
Second, considering HHD’s ease of use, portability,
cost, and compact size, the HHD can be regarded as a
reliable and valid instrument for shoulder muscle strength
assessment in a clinical setting.30 However, results are
prone to error that might arise from the strength of the
investigator, the testing position, and the stabilization of
the patient.30
In clinical practice, the minimal detectable change
(MDC) is one of the most important values to consider
when using objective outcome measurements.5 The MDC is
the minimum amount of change in a patient’s score that
ensures the change is not the result of measurement error.
The MDC is calculated in terms of confidence of prediction; for example, MDC90 is based on a 90% confidence
interval. However, only a few studies16,17 have mentioned
MDC results in the interpretation of their data regarding
shoulder ROM and strength evaluation.
Although measurement techniques for shoulder rotation
ROM and strength have been reported using supine, prone,
and sitting procedures, as well as using varying shoulder
positions from neutral rotational position up to 90° of
abduction, to our knowledge, no study has combined all of
these variables into one comprehensive reliability study
performed by the same team of examiners. This condition
mimics optimally the clinical reality in which often a team
of health professionals (medical doctors and paramedic
assistants) performs a set of tests using the available
assessment tools. In addition, the study design allows statistical analysis including all measurements, and in particular, stating the MDC, which is known to be very relevant
in clinical practice. Moreover, providing a variety of measurement protocols, based on established norms and potential functional requirements of the patient, this study
may improve the quality of the patient’s assessment over
time.
The purpose of this study was to examine the intra-rater
and inter-rater reliability and the MDC for clinical goniometric, inclinometric, and HHD measurements of shoulder passive ROM into ER and IR, and
isometric strength into ER and IR, using different procedures, body and shoulder positions, and testing equipment. In addition, we hypothesized that ROM and strength
measurements might be influenced by gender and the position of the patient and his or her shoulder.
Materials and methods
Participants
A convenience sample of 30 asymptomatic adults (15 women and
15 men) was recruited for this investigation from a local university
setting during a 7-month interval. The estimated sample size was
based on literature5 suggesting that, with 2 raters, a significance
level of 0.05, and a power of 80% to detect an ICC of
0.7, a minimum of 19 participants would be required. The mean
(standard deviation) for the participants’ age, body weight, and
height were 22.1 (1.4) years, 76.8 (17.8) kg, and 1.72 (1.9) meters,
respectively. None of the participants reported a history of
shoulder or neck pain or current participation in overhead sports
on a competitive level. Inclusion and exclusion criteria were
assessed with a questionnaire. Before participation, participants
read and signed the informed consent form.
Instruments
Shoulder ROM was measured with 2 instruments: a plastic
Baseline goniometer (Gymna hoofdzetel, Bilzen, Belgium),
marked in 1° increments, with 2 adjustable overlapping arms, and
an Acumar Digital Inclinometer (model ACU360; Lafayette Instrument Co, Lafayette, IN, USA). The manufacturer’s specifications indicate that this instrument is capable of measuring a range
up to 180° with an accuracy of 1°.
Isometric muscle strength of the shoulder external and internal
rotators was measured using a MicroFET 2 HHD (Hoggan Health
Industries Inc., West Jordan, UT, USA).
All measurements were performed by 2 assessors who were
familiarized with the procedures before the measurements.
Examiner 1 was male (height, 1.73 meters; weight, 75 kg), and
examiner 2 was female (height, 1.63 meters; weight, 59.5 kg).
Procedures
After a standardized warm-up procedure consisting of multiplanar
shoulder movements that was supervised by the testers, the testing
procedure started. A summary of all measurements performed in
this investigation is given in Table I. The procedure started with
goniometric measurements (1-8) of the rotational ROM in
different positions. Two examiners participated in these measurements: 1 examiner provided stabilization of the scapula and
trunk and performed the shoulder ER or IR, and the other
examiner handled the goniometer to perform the measurement.
Subsequently, each examiner independently measured the shoulder in the various test positions (as mentioned in Table I) with the
inclinometer for the ROM measurement and with the HHD for the
strength measurement. Three familiarization trials were performed for each measurement, and 3 subsequent trials were
executed for further analysis. All measurements were performed
in the order of (1) supine (measurements 12-14, 19-22), (2) sitting
(measurements 9-11, 15-18), and (3) prone (measurements 23 and
24), thus randomizing ROM and strength measures and avoiding
fatigue from the strength measurement. The raters were blinded to
the results because the second assessor recorded the data from the
first assessor and vice versa.
For the goniometric measurements, the goniometer was positioned with the fulcrum placed at the olecranon, the stable arm
horizontal (measurements 1-4 and 8) or vertical (measurements 5-7), depending on the specific
procedure, and the moving arm along the forearm with the
processus styloideus ulnae as a reference point.28,34
For the ROM measurements with the inclinometer, the digital
inclinometer was zeroed by using a fixed horizontal (measurements 9-11) or vertical (measurements 12 and 13) reference point,
depending on the measurement, to ensure accuracy. Then, the
inclinometer was placed dorsally or ventrally midway on the
forearm, depending on movement direction.
For all ROM measurements, goniometric as well as inclinometric, scapular compensation was controlled by palpating the
coracoid (with the examiner’s thumb) and the spine of the scapula
(with the examiner’s fingers). This method has been shown to
more accurately control for scapular movements compared with
putting the full hand on the shoulder during the measurement.35
For the strength measurements, the HHD was placed 2 cm
proximal of the processus styloideus ulnae, on the dorsal (ER
strength) or ventral (IR strength) forearm.7 Three repetitions of
5 seconds of maximal voluntary effort were performed using a
‘‘make’’ test (gradually increasing resistance up to maximum
without ‘‘breaking’’ the subject’s strength). A 10-second resting
period was given in between trials. Stabilization of the upper arm,
shoulder, scapula, and trunk was provided by manual fixation by
the examiner’s hand, arm, and trunk, if necessary.
Statistical analysis
Means (standard deviations) were calculated across participants
for the dependent variables: the ROM values in degrees and the
absolute isometric strength data in newtons (N). All the dependent
variables demonstrated a normal distribution (Kolmogorov-Smirnov test), and parametric tests were applied. To assess relative
reliability (the degree to which individuals maintained their position in a sample with repeated measures), intraclass correlation
coefficients (ICC) were calculated with the corresponding 95%
confidence interval (CI).8 For the intra-rater reliability, the 3 trials
for each measurement were used, and ICC3,k (two-way random
model, absolute agreement) was calculated because measurements were performed by 1 individual. For the inter-rater reliability, in which measurements from 2 independent assessors are
compared to determine whether the particular instrument (in this
case the inclinometer and the HHD) can be used with confidence
and reliability among equally trained clinicians,21 the mean of the
3 trials was used, and ICC2,k (two-way mixed model, absolute
agreement) was calculated.5 Only intra-rater reliability was
calculated for the goniometric measurements, and intra-rater as
well as inter-rater reliability was determined for the inclinometer
and HHD measurements.
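As a rough illustration of how average-measure ICCs of this kind can be derived from a subjects-by-trials (or subjects-by-raters) table, the following Python sketch computes the two-way ANOVA mean squares and applies the Shrout and Fleiss formulas for ICC(2,k) and ICC(3,k). It is not the SPSS procedure used in the study. The function names are hypothetical, the demonstration ratings are the classic 6-subject, 4-judge example of Shrout and Fleiss rather than data from this study, and how these two forms map onto the coefficients reported here depends on the model settings chosen in the statistical software.

```python
from typing import List, Tuple


def two_way_mean_squares(scores: List[List[float]]) -> Tuple[float, float, float]:
    """Return (MSR, MSC, MSE) for a table of n subjects (rows) by k measurements (columns)."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between trials or raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return msr, msc, mse


def icc_2k(scores: List[List[float]]) -> float:
    """ICC(2,k): two-way random effects, absolute agreement, mean of k measurements."""
    n = len(scores)
    msr, msc, mse = two_way_mean_squares(scores)
    return (msr - mse) / (msr + (msc - mse) / n)


def icc_3k(scores: List[List[float]]) -> float:
    """ICC(3,k): two-way mixed effects, consistency, mean of k measurements."""
    msr, _, mse = two_way_mean_squares(scores)
    return (msr - mse) / msr


# Illustrative ratings (Shrout and Fleiss example), NOT data from this study.
demo_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    print(round(icc_2k(demo_ratings), 2))  # approximately 0.62
    print(round(icc_3k(demo_ratings), 2))  # approximately 0.91
```

The gap between the two coefficients in the demonstration data reflects a systematic offset between columns: the absolute-agreement form penalizes it, whereas the consistency form does not.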
ICC interpretation was based on the guidelines of Fleiss, according to whom high (>0.90) values reflect excellent reliability,
values between 0.80 and 0.89, good; between 0.70 and 0.79,
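moderate; and values <0.70, low reliability.5

Table I  Overview of range of motion and HHD dynamometry measurements*

| No. | Type of measurement | Testing equipment | Test position | Testing procedure |
|---|---|---|---|---|
| 1 | ROM | Goniometer | Sitting | ER at 0° abduction |
| 2 | ROM | Goniometer | Sitting | ER at 90° abduction |
| 3 | ROM | Goniometer | Sitting | IR at 90° abduction |
| 4 | ROM | Goniometer | Sitting | IR at 90° forward flexion |
| 5 | ROM | Goniometer | Supine | ER at 0° abduction |
| 6 | ROM | Goniometer | Supine | ER at 90° abduction |
| 7 | ROM | Goniometer | Supine | IR at 90° abduction |
| 8 | ROM | Goniometer | Supine | IR at 90° forward flexion |
| 9 | ROM | Inclinometer | Sitting | ER at 90° abduction |
| 10 | ROM | Inclinometer | Sitting | IR at 90° abduction |
| 11 | ROM | Inclinometer | Sitting | IR at 90° forward flexion |
| 12 | ROM | Inclinometer | Supine | ER at 0° abduction |
| 13 | ROM | Inclinometer | Supine | ER at 90° abduction |
| 14 | ROM | Inclinometer | Supine | IR at 90° abduction |
| 15 | Isometric strength | HHD | Sitting | ER at 0° abduction |
| 16 | Isometric strength | HHD | Sitting | IR at 0° abduction |
| 17 | Isometric strength | HHD | Sitting | ER at 90°-90° abduction/ER |
| 18 | Isometric strength | HHD | Sitting | IR at 90°-90° abduction/ER |
| 19 | Isometric strength | HHD | Supine | ER at 0° abduction |
| 20 | Isometric strength | HHD | Supine | IR at 0° abduction |
| 21 | Isometric strength | HHD | Supine | ER at 90°-90° abduction/ER |
| 22 | Isometric strength | HHD | Supine | IR at 90°-90° abduction/ER |
| 23 | Isometric strength | HHD | Prone | ER at 90°-90° abduction/ER |
| 24 | Isometric strength | HHD | Prone | IR at 90°-90° abduction/ER |

ER, external rotation; HHD, hand-held dynamometry; IR, internal rotation; ROM, range of motion.
* ER and IR measurements of ROM with the goniometer and the inclinometer, and isometric strength with the HHD, measured in a sitting, supine, or
prone position, with the shoulder in 0° or 90° of abduction, 90°-90° abduction/ER, or 90° forward flexion. All measurements are numbered for further
details in the text.

However, ICC values may be influenced by intersubject variability of scores, and a large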
ICC may be reported despite poor trial-to-trial consistency if the
intersubject variability is too high.21 Therefore, to examine the
absolute reliability (the degree to which repeated measurements
vary for individuals), the standard error of measurement (SEM)
was calculated as $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}}$, where the SD is the SD from
all scores from the participants.33 The SEM was used for calculating the minimal detectable
change (MDC90), which was calculated as $\mathrm{MDC}_{90} = \mathrm{SEM} \times 1.65 \times \sqrt{2}$.
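The two quantities defined in this paragraph (SEM as SD multiplied by the square root of 1 minus ICC, and MDC90 as SEM multiplied by 1.65 and by the square root of 2) can be sketched in a few lines of Python. The function names and the input values below are hypothetical and chosen only for illustration; they are not results from this study.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc90(sem_value: float) -> float:
    """Minimal detectable change at 90% confidence: SEM * 1.65 * sqrt(2)."""
    return sem_value * 1.65 * math.sqrt(2.0)


def exceeds_mdc(observed_change: float, mdc_value: float) -> bool:
    """True if the absolute change is larger than the minimal detectable change."""
    return abs(observed_change) > mdc_value


if __name__ == "__main__":
    # Hypothetical inputs: SD of 10 degrees across participants and an ICC of 0.95.
    s = sem(10.0, 0.95)      # about 2.2 degrees
    m = mdc90(s)             # about 5.2 degrees
    print(round(s, 2), round(m, 2), exceeds_mdc(6.0, m))
```

Under these assumed inputs, a 6° change between two sessions would exceed the MDC90, whereas a 3° change could not be distinguished from measurement error.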
Paired samples t test (for inter-rater differences) and a general
linear model analysis of variance (ANOVA) for repeated measures
(for intra-rater variability) were used to examine whether there
were systematic differences between testers or trials. In case of
systematic differences among trials, a Bonferroni procedure was
used as a post hoc test for pairwise comparisons.
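For readers unfamiliar with the Bonferroni procedure mentioned above, a minimal Python sketch follows. It assumes that adjusted P values are reported, that is, each raw pairwise P value is multiplied by the number of comparisons and capped at 1. The function name and the raw P values are hypothetical and are not taken from this study.

```python
from typing import List


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Multiply each raw P value by the number of comparisons, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical raw P values for 3 pairwise comparisons among 3 trials.
    raw = [0.010, 0.020, 0.500]
    adjusted = bonferroni_adjust(raw)
    alpha = 0.05
    for p_raw, p_adj in zip(raw, adjusted):
        print(p_raw, round(p_adj, 3), p_adj < alpha)
```

An equivalent approach is to leave the P values unchanged and compare each with the α level divided by the number of comparisons.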
In view of the research question of whether gender, the position
of the patient, or the equipment used influences ROM or strength
results, comparisons were performed between similar measurements with respect to measurement equipment (for ROM), gender,
patient position, and shoulder position. A general linear ANOVA for
repeated measures with the within-subjects factor of “position”
and the between-subjects factor of “gender” was used. In case of
significant interaction or main effects in the ANOVA model, post
hoc Bonferroni tests were performed. The α level was set at .05.
All statistical analyses were performed using IBM SPSS 21
software (IBM Corp, Armonk, NY, USA).
Results
Data from the reliability analysis for ROM measurements
are summarized in supplementary Table I (available on the
journal’s website at www.jshoulderelbow.org). ICCs for
intra-rater reliability varied from 0.85 (IR in 90° forward
flexion/goniometer/supine) to 0.99 (IR in 90° abduction/
inclinometer/sitting) and for inter-rater reliability from 0.96
(IR in 90° forward flexion/sitting/inclinometer) to 0.99 (IR
in 90° abduction/sitting/inclinometer), showing good to excellent
intra-rater and inter-rater reliability of all procedures. The
MDC90, derived from the mean of 3 repetitions, varied
from 4.02° (IR in forward flexion/sitting/inclinometer) to
7.48° (ER in 0° abduction/supine/goniometer).
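The interpretation bands attributed to Fleiss in the statistical analysis section can be expressed as a small lookup, sketched here in Python. The function name is hypothetical, and because the stated bands leave a value of exactly 0.90 unassigned, this sketch assumes it falls in the "good" category.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC to the reliability bands described in the text (Fleiss guidelines)."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.80:   # assumption: exactly 0.90 is treated as "good"
        return "good"
    if icc >= 0.70:
        return "moderate"
    return "low"


if __name__ == "__main__":
    # Lowest and highest intra-rater ROM ICCs and the lowest inter-rater ROM ICC reported above.
    for value in (0.85, 0.99, 0.96):
        print(value, classify_icc(value))
```

Applied to the reported range, the lowest intra-rater ICC (0.85) falls in the "good" band and the remaining values shown fall in the "excellent" band.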
For isometric strength measurements, ICC values for the
intra-rater reliability were between 0.93 (ER in 90°-90°/
sitting/examiner 2) and 0.99 (IR in 0° sitting/examiner 1;
IR in 90° abduction/examiner 1; IR in 0° abduction/supine/
examiner 1; IR in 90° abduction/supine/examiner 1; IR in
90° abduction/prone/examiner 1). Inter-rater reliability values
ranged from 0.94 (ER in 90° abduction/sitting)
to 0.99 (IR in 90° abduction/supine). The MDC, derived
from the mean of 3 repetitions, varied from 7.87 N (ER in
90° abduction/prone/examiner 1) to 22.11 N (IR in
0° abduction/sitting/examiner 2). Data are summarized in
supplementary Table II (available on the journal’s website
at www.jshoulderelbow.org).

Table II  Detail of the comparative statistical analysis for range of motion measurements based on position*

| ROM | GSI | GSU | ISI | ISU | Statistical test | P value (ANOVA) | Pairwise comparisons |
|---|---|---|---|---|---|---|---|
| ER in 0° abduction | X | X | | X | ANOVA repeated measurements with within-subject factor “position,” 3 levels | <.001† | GSI-GSU: .081; GSI-ISU: .258; GSU-ISU: <.001† |
| ER in 90° abduction | X | X | X | X | ANOVA repeated measurements with within-subject factor “position,” 4 levels | .428 | GSI-GSU: NA; GSI-ISI: NA; GSI-ISU: NA; GSU-ISI: NA; GSU-ISU: NA; ISI-ISU: NA |
| IR in 90° abduction | X | X | X | X | ANOVA repeated measurements with within-subject factor “position,” 4 levels | .002† | GSI-GSU: <.001†; GSI-ISI: .001†; GSI-ISU: .031†; GSU-ISI: .053; GSU-ISU: <.001†; ISI-ISU: 1.00 |
| IR in 90° forward flexion | X | X | X | | ANOVA repeated measurements with within-subject factor “position,” 3 levels | <.001† | GSI-GSU: .348; GSI-ISI: .039; GSU-ISI: <.001† |

ANOVA, analysis of variance; ER, external rotation; GSI, goniometer/sitting; GSU, goniometer/supine; IR, internal rotation; ISI, inclinometer/sitting; ISU,
inclinometer/supine; NA, not applicable; ROM, range of motion.
* P values for the ANOVA for repeated measurements and for the post hoc pairwise comparisons between conditions, with NA indicating no post hoc test
performed.
† P < .05 indicating statistical significance.
From the P values of the ANOVA for repeated measurements (between trials) and the dependent t tests (between testers; see Supplementary Tables I and II), we can
conclude that despite the high ICCs for all measurements,
some are susceptible to systematic errors over trials and
between testers. In particular, some ROM measurements
with the inclinometer and some strength measurements in
the supine and prone position seem to be prone to systematic errors.
Because the ANOVA for repeated measurements
revealed no significant gender-based interaction or main
effects, men and women were combined for further analysis. The results of the comparative analysis of similar
measurements in different conditions according to testing
equipment (goniometer vs inclinometer) and patient position (sitting vs supine) are summarized in Table II. The
results show marked ROM differences between positions
and equipment used, particularly for the measurement of IR
in 90° of abduction. In addition, results from the goniometer and the inclinometer often differ significantly, despite
the high reliability.
Results for strength measurements, comparing patient
position (sitting, supine, and prone) and shoulder position (0°
vs 90°-90° of abduction and ER), are described in Table III.
The results show that here, too, the outcome is not consistent across patient positions for most measurements. In particular, the prone
position results in higher values for IR and lower values for
ER strength compared with the other positions.
Discussion
This study established good to excellent reliability for IR
and ER passive ROM measured with a goniometer or an
inclinometer and isometric strength measurements using a
HHD, regardless of the patient’s position, shoulder position, or equipment used. This is, to our knowledge, the first
study to provide a comprehensive reliability analysis of
ROM and strength measurements performed by the same
examiners and combining several procedures frequently
used in clinical practice.
Table III  Details of the comparative statistical analysis for isometric strength measurements based on position*

| Isometric strength | SI | SU | PR | Statistical test | P value (ANOVA or dependent t test) | Pairwise comparisons |
|---|---|---|---|---|---|---|
| ER in 0° abduction | X | X | | Dependent t test | <.001† | NA |
| IR in 0° abduction | X | X | | Dependent t test | <.001† | NA |
| ER in 90°-90° position | X | X | X | ANOVA repeated measurements with within-subject factor “position,” 3 levels | <.001† | SI-SU: .624; SI-PR: .006†; SU-PR: <.001† |
| IR in 90°-90° position | X | X | X | ANOVA repeated measurements with within-subject factor “position,” 3 levels | .001† | SI-SU: .579; SI-PR: .042†; SU-PR: <.001† |

ANOVA, analysis of variance; ER, external rotation; IR, internal rotation; NA, not applicable; PR, prone; SI, sitting; SU, supine.
* P values for the ANOVA for repeated measurements or t tests and for post hoc pairwise comparisons between conditions, with NA indicating no post
hoc test performed.
† P < .05, indicating statistical significance.

IR and ER ROM measurements
When adhering to the procedures outlined in this investigation, the goniometer and the inclinometer were
found to be suitable instruments to quantify shoulder ROM,
with ICCs from 0.85 to 0.99 and a slight advantage for the
inclinometer (0.89 to 0.99). Several studies, summarized
in the systematic review by van de Pol et al,32 have
examined the reliability of goniometry and digital
inclinometry separately to measure shoulder ROM, with
considerably varying results.1,9,15,20,22 The trend for better
reliability with an inclinometer was also observed in a
study of Kolber et al,15 whereas Mullaney et al20 reported
that reliability estimates were similar between the goniometer and the digital inclinometer. It should be noted that
the Kolber study15 measured active ROM, whereas the
measurement of interest in our study was the passive
rotational ROM.
Moreover, contrary to these studies, goniometric measurements in our study were performed by 2 examiners.
This probably increases patient stabilization and measurement accuracy, which is reflected in the high ICC values.
The lowest ICC values for the goniometric measurements
were in the IR ROM in 90° of forward flexion (0.85), with
the patient sitting as well as supine. This might be due to
difficult scapular stabilization in this position or to difficulties in defining the starting reference position. Indeed,
some have suggested that manual scapular stabilization in
an appropriate manner is important to better isolate glenohumeral IR but that ER is not much affected by manual
stabilization.2,35
For measurement error and what might constitute true
change, the MDC90 for the intra-rater analysis indicated
that a change from 4.44° to 8.03° (depending on the
specific test) for the goniometer and from 4.02° to 6.36° for
the inclinometer is required to be 90% certain this change is
not due to intra-rater variability or measurement error.5 The
MDC90 for the inter-rater analysis of the inclinometric
measurements revealed that a change from 2.82° to 5.47°
(depending on the specific measurement) is required to be
90% certain that this change is not the result of inter-rater
variability or measurement error. Only a few studies
mention MDC results in the interpretation of their data, so
we have little literature with which to compare our results. Kolber
et al16,17 found MDC values of 8° to 9° in their studies,
which are slightly higher than in our study. However, the
clinician should bear in mind that MDC values depend on
the maximum value that can be obtained and that these
differ according to the movement direction. In our case, the
very low MDC90 of 2.82° for the measurement of IR supine
into forward flexion is the result of the relatively small arc
of motion of this movement, with a mean value of 14°. To
truly compare absolute reliability and measurement errors
for different kinds of movements, the percentage SEM
should be calculated. However, in view of the clinical
perspective, we reported the MDC on the assumption that
the MDC is the value the clinician uses in the interpretation
of the measurement data obtained in a patient.
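To make the point about the percentage SEM concrete, the following Python sketch back-calculates the SEM from an MDC90 (by inverting the relation MDC90 = SEM × 1.65 × √2) and expresses it relative to the mean. The definition of the percentage SEM as SEM divided by the mean, times 100, is an assumption, as are the function names; the example inputs are the MDC90 of 2.82° and the mean of 14° quoted above.

```python
import math


def sem_from_mdc90(mdc90_value: float) -> float:
    """Recover the SEM from an MDC90 by inverting MDC90 = SEM * 1.65 * sqrt(2)."""
    return mdc90_value / (1.65 * math.sqrt(2.0))


def sem_percent(sem_value: float, mean_value: float) -> float:
    """SEM expressed as a percentage of the mean (assumed definition)."""
    if mean_value == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sem_value / mean_value


if __name__ == "__main__":
    s = sem_from_mdc90(2.82)         # about 1.2 degrees
    print(round(s, 2))
    print(round(sem_percent(s, 14.0), 1))  # about 8.6 percent of the mean
```

Expressed this way, a small absolute MDC on a movement with a small arc of motion can still represent a non-trivial relative error.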
Isometric strength measurement
Traditionally, isometric rotational shoulder strength is
assessed in a seated,4,13,14,23 supine,7,10 or prone position.23
Our results regarding relative reliability reveal ICCs between 0.93 and 0.99, showing excellent reliability for all of
the tests performed, regardless of the examiner, the position
of the patient, or the shoulder position. Our results are
slightly higher than the values previously reported.4,12,23
Although relative reliability values showed excellent
reproducibility, MDC values still are an important issue to
be considered. Depending on the specific measurement,
MDC90 varied from 7.87 N (ER prone 90°-90° examiner 1) to
26.60 N (IR prone 90°-90° inter-rater analysis). Because other
studies examining reliability of strength measurement
procedures did not calculate MDC23 or used other statistical analysis, such as limits of agreement,4 we have no
literature to compare our results with.
Systematic error of measurement between
trials and testers
Despite the high ICCs for all measurements, our results
revealed some systematic differences between trials and
between testers for the ROM as well as for the strength
measurements. In particular, some ROM measurements
with the inclinometer and some strength measurements in
the supine and prone position seem to be prone to systematic errors. Systematic differences in the strength
measurements might be the result of differences in the
strength of the testers.
Comparison of similar measurements using
different equipment or positions, or both
Although not the primary research question, a comparison of
procedures for similar measurements, which is relevant for
clinicians, was performed. Several significant
differences became apparent, based on equipment and on
patient and shoulder position. In particular, ROM into IR
lacks consistency. For the strength measurements, the prone
position seems to correlate the least with the other positions
tested in this study. In conclusion, the clinician should
remember that, despite the high relative and absolute reliability, the procedures should not be mixed in the evaluation of 1 patient in view of the systematic differences found
between procedures and positions. We recommend that
clinicians use at least 2 positions to establish the strength
impairment in a more functional position for the patient but
urge the examiner to take into account the possible differences in outcome by patient and shoulder position.
7
of these procedures in patients with various pathologic
shoulder conditions.
Clinical implications
Our study shows excellent relative reliability for several
protocols for the measurement of ROM and strength into
ER and IR of the shoulder and clinically acceptable absolute reliability values. Our results show that none of the
procedures examined should be avoided, and it is up to the
clinician’s preference to use a specific examination protocol. From the perspective of practical utility, we recommend the supine position for all ROM measurements
because maximal trunk and scapula stabilization are
assured in this position.
With respect to testing equipment, the goniometer and
the inclinometer show excellent reliability. The goniometer
shows slightly higher accuracy and the inclinometer has the
advantage that only 1 tester is needed to perform the
testing.
For the strength measurements, sitting and supine positions both seem to give a reliable evaluation of muscle
function, whereas the prone position should be avoided, in
view of the large differences with the strength results in
other positions.
Limitations and future research
Some limitations of this study need to be noted. Owing to
technical issues, it was not possible to perform some testing
(for instance, because the inclinometer depends on gravity
to interpret movement angles) or statistical analysis
(inter-rater reliability analysis with the goniometer, because
both assessors were needed during the protocol). In addition, despite
extreme effort to standardize not only the measurement but
also the stabilization of the patient’s trunk and scapula, no
external fixation was used for reasons of clinical relevance.
Clinicians in the field do not always have access to external
devices to measure ROM and strength, and very often,
external fixation makes the procedure more time-consuming
and, thus, less attractive for the clinician. Nevertheless, the
absence of external fixation might have influenced our results.
The end range during the
ROM measurements was determined by subjective criteria,
based on familiarization trials by the assessors, but was not
objectively controlled. An inclinometer was recently
introduced that simultaneously measures angle (in degrees)
and force (in kg), making it possible to quantify the load
applied at the end of ROM during passive movement
testing.4 Future studies should consider the use of this
kind of device for ROM measurement; however, further
proof of reliability and validity is needed.
Finally, the use of asymptomatic individuals needs to be
acknowledged as a limitation. Future studies should
examine not only the reliability but also the responsiveness
Conclusions
The purpose of this study was to establish absolute and
relative reliability for several procedures measuring the
rotational shoulder ROM and strength into IR and ER.
The study results show good to excellent reliability
values for all procedures performed. Clinicians should
consider their choice based on the available equipment
and the ability of the patient to achieve the body or
shoulder position. In general, measurements in the supine position are recommended because of practical
applicability and body stabilization, and clinicians are
advised to use more than 1 procedure to allow
functional measurements based on the patient’s abilities
at the moment of evaluation.
Supplementary data
Supplementary data related to this article can be found
online at http://dx.doi.org/10.1016/j.jse.2014.01.006.
References
1. Awan R, Smith J, Boon AJ. Measuring shoulder internal rotation range
of motion: a comparison of 3 techniques. Arch Phys Med Rehabil
2002;83:1229-34. http://dx.doi.org/10.1053/apmr.2002.34815
2. Boon AJ, Smith J. Manual scapular stabilization: its effect on shoulder
rotational range of motion. Arch Phys Med Rehabil 2000;81:978-83.
3. Byram IR, Bushnell BD, Dugger K, Charron K, Harrell FE Jr,
Noonan TJ. Preseason shoulder strength measurements in professional
baseball pitchers: identifying players at risk for injury. Am J Sports
Med 2010;38:1375-82. http://dx.doi.org/10.1177/0363546509360404
4. Cadogan A, Laslett M, Hing W, McNair P, Williams M. Reliability of
a new hand-held dynamometer in measuring shoulder range of motion
and strength. Man Ther 2011;16:97-101.
http://dx.doi.org/10.1016/j.math.2010.05.005
5. Carter RE, Lubinsky J, Domholdt E. Rehabilitation research: principles and applications. St. Louis, MO: Saunders; 2011. ISBN-13: 978-1-437-70840-0.
6. Celik D, Dirican A, Baltaci G. Intrarater reliability of assessing
strength of the shoulder and scapular muscles. J Sport Rehabil 2012;
Technical Notes 3:1-5.
7. Couppe C, Thorborg K, Hansen M, Fahlstrom M, Bjordal JM,
Nielsen D, et al. Shoulder rotational profiles in young healthy elite
female and male badminton players. Scand J Med Sci Sports 2014;24:
122-8. http://dx.doi.org/10.1111/j.1600-0838.2012.01480.x
8. de Vet HC, Terwee CB, Knol DL, Bouter LM. When to use agreement
versus reliability measures. J Clin Epidemiol 2006;59:1033-9.
http://dx.doi.org/10.1016/j.jclinepi.2005.10.015
9. de Winter AF, Heemskerk MA, Terwee CB, Jans MP, Deville W, van
Schaardenburg DJ, et al. Inter-observer reproducibility of measurements of range of motion in patients with shoulder pain using a digital
inclinometer. BMC Musculoskelet Disord 2004;5:18.
http://dx.doi.org/10.1186/1471-2474-5-18
10. Donatelli R, Ellenbecker TS, Ekedahl SR, Wilkes JS, Kocher K,
Adam J. Assessment of shoulder strength in professional baseball
pitchers. J Orthop Sports Phys Ther 2000;30:544-51.
11. Ginn KA, Cohen ML, Herbert RD. Does hand-behind-back range of
motion accurately reflect shoulder internal rotation? J Shoulder Elbow
Surg 2006;15:311-4. http://dx.doi.org/10.1016/j.jse.2005.08.005
12. Hayes K, Walton JR, Szomor ZL, Murrell GA. Reliability of 3
methods for assessing shoulder strength. J Shoulder Elbow Surg 2002;
11:33-9. http://dx.doi.org/10.1067/mse.2002.119852
13. Hurd WJ, Kaplan KM, ElAttrache NS, Jobe FW, Morrey BF,
Kaufman KR. A profile of glenohumeral internal and external rotation
motion in the uninjured high school baseball pitcher, part I: motion. J
Athl Train 2011;46:282-8.
14. Kolber MJ, Beekhuizen K, Cheng MS, Fiebert IM. The reliability of
hand-held dynamometry in measuring isometric strength of the
shoulder internal and external rotator musculature using a stabilization
device. Physiother Theory Pract 2007;23:119-24.
http://dx.doi.org/10.1080/09593980701213032
15. Kolber MJ, Fuller C, Marshall J, Wright A, Hanney WJ. The reliability
and concurrent validity of scapular plane shoulder elevation measurements using a digital inclinometer and goniometer. Physiother Theory
Pract 2012;28:161-8. http://dx.doi.org/10.3109/09593985.2011.574203
16. Kolber MJ, Hanney WJ. The reliability, minimal detectable change
and construct validity of a clinical measurement for identifying posterior shoulder tightness. N Am J Sports Phys Ther 2010;5:208-19.
17. Kolber MJ, Vega F, Widmayer K, Cheng MS. The reliability and
minimal detectable change of shoulder mobility measurements using a
digital inclinometer. Physiother Theory Pract 2011;27:176-84.
http://dx.doi.org/10.3109/09593985.2010.481011
18. McCully SP, Kumar N, Lazarus MD, Karduna AR. Internal and
external rotation of the shoulder: effects of plane, end-range determination, and scapular motion. J Shoulder Elbow Surg 2005;14:602-10. http://dx.doi.org/10.1016/j.jse.2005.05.003
19. Muir SW, Corea CL, Beaupre L. Evaluating change in clinical status:
reliability and measures of agreement for the assessment of glenohumeral range of motion. N Am J Sports Phys Ther 2010;5:98-110.
20. Mullaney MJ, McHugh MP, Johnson CP, Tyler TF. Reliability of
shoulder range of motion comparing a goniometer to a digital level.
Physiother Theory Pract 2010;26:327-33.
http://dx.doi.org/10.3109/09593980903094230
21. Portney LG, Watkins MP. Foundations of clinical research: applications to practice. 3rd ed. Upper Saddle River, NJ: Pearson & Prentice
Hall; 2009. ISBN-10: 0131716409.
22. Riddle DL, Rothstein JM, Lamb RL. Goniometric reliability in
a clinical setting. Shoulder measurements. Phys Ther 1987;67:
668-73.
23. Riemann BL, Davies GJ, Ludwig L, Gardenhour H. Hand-held
dynamometer testing of the internal and external rotator musculature
based on selected positions to establish normative data and unilateral
ratios. J Shoulder Elbow Surg 2010;19:1175-83.
http://dx.doi.org/10.1016/j.jse.2010.05.021
24. Roddey TS, Cook KF, O’Malley KJ, Gartsman GM. The relationship
among strength and mobility measures and self-report outcome scores
in persons after rotator cuff repair surgery: impairment measures are
not enough. J Shoulder Elbow Surg 2005;14:95S-8S.
http://dx.doi.org/10.1016/j.jse.2004.09.023
25. Roy JS, MacDermid JC, Orton B, Tran T, Faber KJ,
Drosdowech D, et al. The concurrent validity of a hand-held versus
a stationary dynamometer in testing isometric shoulder strength. J
Hand Ther 2009;22:320-6; quiz 327.
http://dx.doi.org/10.1016/j.jht.2009.04.008
26. Rudiger HA, Fuchs B, von Campe A, Gerber C. Measurements of
shoulder mobility by patient and surgeon correlate poorly: a prospective study. J Shoulder Elbow Surg 2008;17:255-60.
http://dx.doi.org/10.1016/j.jse.2007.07.012
27. Schwartz S, Cohen ME, Herbison GJ, Shah A. Relationship between two
measures of upper extremity strength: manual muscle test compared to
hand-held myometry. Arch Phys Med Rehabil 1992;73:1063-8.
28. Shanley E, Thigpen CA, Clark JC, Wyland DJ, Hawkins RJ,
Noonan TJ, et al. Changes in passive range of motion and development
of glenohumeral internal rotation deficit (GIRD) in the professional
pitching shoulder between spring training in two consecutive years. J
Shoulder Elbow Surg 2012;21:1605-12.
http://dx.doi.org/10.1016/j.jse.2011.11.035
29. Shin SH, Ro du H, Lee OS, Oh JH, Kim SH. Within-day reliability of
shoulder range of motion measurement with a smartphone. Man Ther
2012;17:298-304. http://dx.doi.org/10.1016/j.math.2012.02.010
30. Stark T, Walker B, Phillips JK, Fejer R, Beck R. Hand-held dynamometry correlation with the gold standard isokinetic dynamometry: a
systematic review. PM R 2011;3:472-9.
http://dx.doi.org/10.1016/j.pmrj.2010.10.025
31. Streiner DL, Norman GR. Health measurement scales: a practical
guide to their development and use. Oxford: Oxford University Press;
2008. ISBN-10: 0199231885.
32. van de Pol RJ, van Trijffel E, Lucas C. Inter-rater reliability for
measurement of passive physiological range of motion of upper extremity joints is better if instruments are used: a systematic review. J
Physiother 2010;56:7-17.
33. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005;19:231-40.
http://dx.doi.org/10.1519/15184.1
34. Wilk KE, Macrina LC, Fleisig GS, Porterfield R, Simpson CD 2nd,
Harker P, et al. Correlation of glenohumeral internal rotation deficit
and total rotational motion to shoulder injuries in professional baseball
pitchers. Am J Sports Med 2011;39:329-35.
http://dx.doi.org/10.1177/0363546510384223
35. Wilk KE, Reinold MM, Macrina LC, Porterfield R, Devine KM,
Suarez K, et al. Glenohumeral internal rotation measurements differ
depending on stabilization techniques. Sports Health 2009;1:131-6.
http://dx.doi.org/10.1177/1941738108331201