Cross-Validation of the Behavioral and Emotional Rating Scale

J Child Fam Stud
DOI 10.1007/s10826-006-9117-y
ORIGINAL PAPER
Cross-Validation of the Behavioral and Emotional Rating
Scale-2 Youth Version: An Exploration of Strength-Based
Latent Traits
Michael J. Furlong · Jill D. Sharkey · Peter Boman ·
Roslyn Caldwell
© Springer Science+Business Media, LLC 2007
Abstract High-quality measurement is a necessary requirement to develop and evaluate
the effectiveness of programs that use strength-based principles and strategies. Using independent cross-validation samples, we report two studies that explored the construct validity
of the BERS-2 Youth Report, a popular measure designed to assess youth strengths, whose
conceptual structure has not yet been examined. In Study 1, an exploratory factor analysis
found a four-factor solution with conceptual support, which included both internal assets
associated with (a) the management of emotions and positive social interaction skills and (b)
engagement in the important social contexts of family and school. In Study 2, confirmatory
factor analyses found reasonable model fit for the BERS-2 five-factor structure and superior
model fit for the more parsimonious four-factor solution found in Study 1. In future studies,
parallel reporting of the four-factor model may provide additional insight to the nature and
structure of the BERS-2 Youth Version’s clinical validity and utility when compared with
the five-factor model, thus potentially contributing to a broader objective to develop a better
understanding of important strength-based latent traits.
Keywords Strengths . Assessment . Rating scales . Test validity . Youth self-report
M. J. Furlong (✉)
Gevirtz Graduate School of Education, Department of Counseling, Clinical, and School Psychology,
University of California, Santa Barbara, CA 93106-9490, USA
e-mail: [email protected]
J. D. Sharkey
Center for School-Based Youth Development, University of California,
Santa Barbara, Santa Barbara, CA, USA
P. Boman
School of Education, James Cook University, Cairns, Queensland, Australia
R. Caldwell
Department of Psychology, John Jay College of Criminal Justice, City University of
New York, New York, NY, USA
There has been a surge of advocacy for positive psychology as a legitimate topic of research
in its own right and to offer an alternative to the predominance of deficit-based models that
have defined research regarding the mental health needs of youth (Seligman, Steen, Park, &
Peterson, 2005). Reflecting this growing interest, recent journal special issues have examined the values that accrue from integrating information about internal and external assets
into prevention and treatment program planning—School Psychology Quarterly (Huebner
& Gilman, 2003), Psychology in the Schools (Chafouleas & Bray, 2004), California School
Psychologist (Jimerson, 2004), and the Journal of School Health (Blum & Libbey, 2004). In
addition to discussing the theoretical rationales and clinical approaches of positive psychology, as in any research field, there is a need to develop, enhance, and refine the conceptual and
psychometric foundation of measures that assess relevant positive psychology constructs.
This is what Epstein (1999) and others have called “strength based” assessment, which is
based on the hypothesis that prevention and intervention services are enhanced when they
incorporate a youth’s personal strengths and his or her external social resources (Jimerson,
Sharkey, Nyborg, & Furlong, 2004; Rhee, Furlong, Turner, & Harari, 2001). Absent reliable
and valid procedures to assess youth strengths, it is not possible to develop and evaluate
the effectiveness of programs that purport to use strength-based principles and strategies
(Libbey, 2004). Without high-quality assessments, positive youth development and its focus on resiliency and strengths becomes primarily a philosophical or values orientation. To
operate from a science base, high-quality measurement is a necessary requirement.
Assessments have begun to be developed, but have not had time to fully mature, and
theoretical models to guide scale development have not been fully explored (Libbey, 2004;
O’Farrell & Morrison, 2003). Within this context, Epstein and colleagues offer one instrument
designed to measure youth strengths—the Behavioral and Emotional Rating Scale (BERS;
Epstein, 2004; Epstein & Sharma, 1998). The BERS was originally developed as part of
a Children’s Mental Health Services-funded system of care project (Center for Mental
Health Services, 2001; ORC Macro, 2003) and, as such, was partially based on developing
wraparound services for youth with emotional-behavioral disorders (EBD) that considered
the youth’s and their family’s resources and strengths (Lourie, Stroul, & Friedman, 1998).
Although the System of Care initiative extolled the values of treatment plans based on
strengths, Epstein and colleagues recognized that there were no psychometrically proven
instruments to assess positive changes in the youths with EBD involved in cross-agency,
wraparound programs. In response to this need, Epstein and Sharma (1998) conducted a
literature search of strength-based assessment, developmental psychopathology, resilience,
and protective factors. After developing an item pool of 1,200 items, consulting with content
experts, and preliminary analyses, items that produced the greatest differentiation between
EBD and non-EBD youth were retained for scale development. Exploratory factor analysis
identified a final scale consisting of 52 items with five factors that were called: Interpersonal
Strength, Family Involvement, Intrapersonal Strength, School Functioning, and Affective
Strength; hence the scale had a mixture of positive internal assets and external resources.
The original BERS norm group was comprised of a mixed group of adult raters (teachers,
caregivers, and case managers) (Epstein & Sharma, 1998). Epstein (2004) expanded the reach
of the BERS (now called the BERS-2) by re-norming it with separate groups of teachers and
parents, and extended it by developing a youth self-report format.
The BERS was developed using a mixture of professional judgement and empirical procedures; however, it had no prespecified theoretical foundation or set of psychological
constructs. Following from this observation, reviewers of the original BERS raised questions
about the robustness of the original BERS factors. Doll (2001), for example, commented
that, “The factor analysis underlying the scale needs to be verified through replication on
independent samples, and a conceptual model is needed to describe the meaning of the scale
and its subscales” (p. 144). Similarly, Olmi’s (2001) judgement was that, “The presentation on construct validity that was offered in the BERS manual was less than adequate”
(p. 145). Since the original exploratory factor analysis reported in the BERS manual, no
other researcher has independently examined its construct validity. These early questions
about the generalizability of the factor structure of the original BERS when completed by
adult raters point to the need to carefully examine the cross-informant concordance of the
factor structures, particularly for the BERS-2 Youth Version.
Epstein, Mooney, Ryser, and Pierce (2004) presented three analyses of the BERS-2 Youth
Version for convergent validity with the Social Skills Rating Scale and the Achenbach (1991)
Youth Self Report. They also examined the stability of scores. Although the samples in these
three analyses were small and had restricted range, they provided evidence that the BERS-2
Youth Version score had short-term stability and had positive correlations when corrected for
restricted range, with the correlations of the BERS-2 Total Strength Index being .71 with the
SSRS Social Skills Composite and −.50 with the YSR Total Problems score. The one-week
test-retest coefficient was .80.
Additional support for the use of the BERS-2 Youth Version is provided by Uhing,
Mooney, and Ryser (2005) who compared the responses of 386 adolescents without an
emotional disturbance (non-ED) with those of 71 youth with ED. Predictive validity was
shown in that the Total Strength Index for the ED youth (92.3, with a mean of 100) was
significantly lower than that for the non-ED youth (101.3), with significant and moderate
effect size differences on each of the five subscales.
Farmer and colleagues (2005) examined the relationship between the BERS Parent and
youth self-reported behavior using the Interpersonal Competence Scale–Self (Cairns, Leung,
Gest, & Cairns, 1995) in a sample of African American middle school students. They found
that students placed into low, middle, and high groups based on the Total Strength Index
derived from parent ratings differed on aggression (high group reported lowest aggression),
affiliation (high group reported most affiliation, for girls only), and popularity (high group
reported most popularity). Results for the other subscales, including academics and internalizing problems, were not significant. This study provided evidence that the parent BERS-2
ratings had some concurrent validity; however, the analysis did not use the BERS-2 subscales
and did not add information about its construct validity across parent and youth BERS-2
ratings.
The only study to report the construct validity of the BERS-2 can be found in the BERS-2
Manual (Epstein, 2004), which presents CFA results using the 52 items to assess their fit with
the original BERS five-factor model as measured variables assessing the latent trait of a Total
Strength Index. The resulting fit indices were generally adequate (CFI = .995; TLI = .986;
NFI = .995). However, the RMSEA of .12 was above the generally accepted range of
.05–.08 for even a reasonable fit (Browne & Cudeck, 1993; Browne & Mels, 1990; Steiger,
1989; Thompson, 2004). Hence, there is a need for further exploration and verification of the
conceptual structure of the BERS-2 Youth Self Version using an independent cross-validation
sample.
Given the emerging interest in strength-based assessment and the BERS-2’s wide use in
the research (its use is mandated as part of the national evaluation of all CMHS System
of Care Grants) (Center for Mental Health Services, 2001), it is important to further examine its psychometric properties to refine and advance its use as a research and clinical
instrument. Furthermore, aligning the BERS-2 with theoretical models based in fields of risk
and resilience (e.g., Cicchetti & Lynch, 1993; Cicchetti & Toth, 1997) and school engagement (e.g., Jimerson et al., 2004; National Research Council and the Institute of Medicine,
2003) will allow for a better understanding of how the BERS-2 subscales relate to youth
functioning. Hence, this study has two major purposes. First, we examine the factor structure
of the BERS-2 Youth Version with an independent sample of adolescents. Exploratory factor
analysis (EFA) will be used with an independent sample to examine item characteristics of
the 52 items to determine if they fit best within the proposed five-factor structure or an
alternative factor structure and if all items have adequate loadings (the BERS-2 manual does
not report any EFA analyses for the youth version). Second, we sought to confirm the optimal
factor structure of the BERS-2 Youth Version with a second independent sample of adolescents. Confirmatory factor analysis (CFA) will be used to examine the comparative
fit of the original BERS-2 Youth Version model versus alternative structures identified in the
EFA. In order to align the BERS-2 with theoretical models related to strength-based assessment, this analysis seeks to identify organizations of the items and subscales that have clear
links to constructs used in the related literatures identified by Epstein (2004; i.e., behavioral
and emotional skills, strength-based assessment, developmental psychopathology, risk and
resilience, and protective factors).
Study 1: Exploratory factor analysis
Method
Participants
The 752 adolescents who participated in Study 1 included two distinct samples of youths. The
first group consisted of 386 students attending a comprehensive high school located in the
central coast region of California with 194 males and 192 females, of whom most were
European American (87%). The majority of students were in Grade 9 (93%) with a few in
Grades 10 (4%), 11 (2%), and 12 (1%).
Participants in the second group were 366 youths referred to a California county juvenile
probation department for a first-offense intake interview and risk assessment. This group
included more males (66%) than females (34%) and reported Latino American (61%),
European American (33%), African American (3%), and other (3%) ethnicities. Youths
were ages 10 to 18, with 3% under 12-years-old, 28% between 12- and 13-years-old, and
69% above 14-years-old.
Participants in the normative sample of the BERS-2, to whom our study participants
are being compared, included 54% males, 80% “Whites,” 12% “Blacks,” and 8% “Other,”
with 8% of the total sample responding “Yes” to being “Hispanic.” The “West” was represented by 22% of the normative sample. With the exception of geographical location, the
comprehensive high school sample was demographically similar to the BERS-2 normative
sample.
Measure
Behavioral and Emotional Rating Scale-2 (BERS-2). The BERS-2 youth version (Epstein, 2004)
is a 52-item, standardized, norm-referenced scale originally designed for parents, teachers,
and/or other caregivers to report about five aspects of the behavioral and emotional strengths
of children and adolescents aged 11 years, 0 months to 18 years, 11 months. Raters respond
to each question using a 4-point Likert scale, which for the youth version ranges from 0 (not
like me) to 3 (very much like me). Epstein (2004) describes five subscales. The Interpersonal
Strength subscale measures a youth’s ability to control his or her emotions or behaviors in
social situations (e.g., “I can express my anger in the right way”). For the Youth Version, the
BERS-2 manual (Epstein, 2004) reports an internal consistency alpha of .82 across all ages
and a test-retest reliability of .89 over two weeks for a sample (N = 42) of 11- and 12-year-olds. The Family Involvement subscale measures a child’s participation in and involvement
with his or her family (e.g., “My family makes me feel wanted”). The BERS-2 manual
reported an internal consistency alpha of .80 and a test-retest reliability of .85. The School
Functioning subscale measures competence in school and classroom tasks (e.g., “I complete
tasks when asked”). The BERS-2 manual reported an internal consistency alpha of .88 and a
test-retest reliability of .89. The Intrapersonal Strength subscale measures a youth’s outlook
on his or her competence and accomplishments (e.g., “I believe in myself”). The BERS-2
manual reported an internal consistency alpha of .82 and a test-retest reliability of .91. Finally,
the Affective Strength subscale measures the ability of a child to accept affection from others
and express feelings towards others (e.g., “It’s okay when people hug me”). The BERS-2
manual reported an internal consistency alpha of .80 and a test-retest reliability of .84. The
BERS-2 also provides an overall Strength Quotient made up of all items with a reported
internal consistency alpha of .95 and a test-retest reliability of .91. Overall, inter-rater and
retest reliability indicates moderate to high correlation across all subscales.
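The internal consistency values cited above are coefficient alpha estimates. As an illustration only, the following minimal sketch shows how such a coefficient is conventionally computed from item-level responses. The function name and the demonstration data are hypothetical and are not drawn from the BERS-2 manual or from this study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for complete data.

    rows: one list of item scores per respondent (hypothetical input layout);
    every respondent must have a score on every item.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses on a 0-3 scale (five respondents, four items).
demo = [
    [3, 2, 3, 3],
    [1, 1, 0, 1],
    [2, 2, 2, 3],
    [0, 1, 0, 0],
    [3, 3, 2, 3],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why longer scales made up of related items, such as the overall Strength Quotient, tend to show higher values than the shorter subscales.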
Procedures
Measure. At the time this study was conducted, the youth version of the BERS-2
(Epstein, 2004) was unavailable. Given the timing of our study, the youth survey used in
this study is not exactly the same as the published BERS-2 measure due to subtle wording
differences though the content and essential meaning of each question examined is the same.
To use the BERS with youth participants, we merely changed the four-point Likert response
format of the BERS parent version to the first person (e.g., 2 = Like my child changed to 2 =
Like me). Subsequently, the BERS-2 was published with the same 52 items used in this study
and the original BERS (Epstein & Sharma, 1998) with minor modifications. Information in
the BERS-2 manual describing how the youth version was adapted only indicates that as a
result of focus group feedback, “. . . individuals were satisfied with the content, format, and
uses of the instrument” (Epstein, 2004, p. 66). BERS-2 authors identified the need for a brief
career strengths subscale and five items to address this were added to the youth version. Thus,
the career strengths subscale is not evaluated in this study. The manual does not mention any
process for piloting or seeking input from youth on item content. For most items, authors of
the BERS-2 made wording changes so items reflected the first person (e.g., “studies for test”
became “I study for tests” in the youth version). Other items were changed apparently to
contain vocabulary that would be more easily understood by the younger adolescents (e.g.,
“Trusts a significant person with own life” became “I trust at least one person very much”).
Items in our study were presented in the same order as in the BERS-2. For this study, surveys
were developed into a machine-readable format using the Teleform® software package for
efficient data entry.
Sample 1. For the comprehensive high school sample, classroom teachers under the direction
of the school counselor (substance abuse prevention coordinator) administered surveys in
Winter 2002. All students in a freshman Health class who were in attendance at school
on the day of data collection anonymously completed the BERS-2 as part of a tobacco
prevention school unit (non-ninth graders were taking the class to satisfy a graduation
requirement). Data were sent to researchers for data entry with neither names nor identifying
numbers.
Sample 2. All participants in the juvenile probation department sample were drawn from
the evaluation of the early intervention component of a state-funded delinquency prevention program. The youths had no probation interventions prior to this first referral. The
BERS-2 Youth Version was administered between July 1997 and June 2001 as part of
the regular intake assessment process by probation officers trained in assessment procedures and sent to evaluators for data entry and analysis with codes substituted for
names.
Data preparation. Prior to analysis, all surveys were examined for marking errors and
ambiguities (i.e., bubbles that were not completely filled in were darkened or markings
outside of the bubble were corrected). If an item had two marked responses, it was considered
missing data. After carefully examining the surveys, research assistants scanned and verified
the data accuracy using the Teleform software package, which automatically uploaded into
an SPSS file. During further review of the data, extreme responders were excluded; for
example, participants who marked all 0’s or 3’s. Per manual procedures, for surveys that had
fewer than two items of missing data per subscale, missing data points were substituted with
the overall subscale mean (Epstein & Sharma, 1998; Switzer & Roth, 2002).
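The data preparation rules described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure as implemented in Teleform or SPSS: all function names are hypothetical, and "overall subscale mean" is interpreted here as the respondent's own mean on the items he or she answered within that subscale.

```python
def resolve_marks(marks):
    """Return the single marked score, or None (missing) when an item has
    no mark or more than one mark. `marks` is a hypothetical list of the
    bubbles filled in for one item."""
    return marks[0] if len(marks) == 1 else None


def is_extreme_responder(item_scores):
    """True when every answered item is 0 or every answered item is 3."""
    answered = [s for s in item_scores if s is not None]
    if not answered:
        return False
    return all(s == 0 for s in answered) or all(s == 3 for s in answered)


def fill_subscale(item_scores, max_missing=1):
    """Substitute missing items with the mean of the answered items.

    Assumption: the substituted value is the respondent's own mean on the
    subscale. Returns None when more than `max_missing` items are missing,
    i.e., when the survey does not have fewer than two missing items.
    """
    answered = [s for s in item_scores if s is not None]
    n_missing = len(item_scores) - len(answered)
    if n_missing > max_missing or not answered:
        return None
    fill = sum(answered) / len(answered)
    return [fill if s is None else s for s in item_scores]


# One hypothetical respondent on a four-item subscale, second item missing.
print(fill_subscale([3, None, 1, 2]))
```

If the sample-wide subscale mean was intended instead, only the computation of `fill` would change.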
Analysis
Exploratory Factor Analysis (EFA). EFA was performed using SPSS 11.0. EFA is a method
of investigating how scale items may optimally group together into different subsets to best
measure an overall construct. General characteristics of EFA are that (a) the number of
factors identified can range from a single factor to a number equal to the number of items in
a scale, (b) all items are able to correlate with all of the factors, which may make identifying
distinct factors difficult, and (c) rotation to change correlations between items and factors
is used to identify the clearest pattern of results (Kline, 1998; Thompson, 2004). Though
empirical techniques, such as creating an eigenvalue cut-off or minimum factor loading, can
be used to identify the ideal number of factors, it is most accurate to consider empirical data
within a theoretical framework that explains item groupings.
Results
Description of the youth responses
Table 1 displays the mean scores for the two groups of participants included in Study 1.
The first group of participants, who attend a comprehensive high school, were not assessed
for their gender and thus, direct comparisons to the normative sample cannot be made.
When compared to the male normative sample descriptively, these comprehensive high
school participants had a similar mean on the Interpersonal Strength and School Functioning
subscales, a lower average score on the Family Involvement and Intrapersonal Strength
subscales, and a higher score on the Affective Strength subscale. Regarding the second
group of participants, who were first-time offenders on probation in California, males had a
similar mean on the Interpersonal Strength subscale as the normative sample and both males
and females had similar scores to the normative sample on the Affective Strength subscale.
All other scores were lower than the norm, which is not surprising given that a lack of assets
is associated with delinquent behavior.
Table 1 Descriptive statistics: Average raw scores for three groups of participants compared to range of
raw scores at a scaled score of 10 in the normative sample

Gender   BERS-2 scale             Normative   Comprehensive   Probation   Detention
                                  sample      high schoolᵃ    CA          NV
Male     Interpersonal strength   31–32       31.2            31.4        25.9
         Family involvement       22          19.7            20.4        16.3
         Intrapersonal strength   27–28       25.0            24.9        21.2
         School functioning       19–20       18.6            17.9        13.5
         Affective strength       14          15.5            14.8        12.2
Female   Interpersonal strength   35–36                       32.3        27.8
         Family involvement       22–23                       19.0        15.1
         Intrapersonal strength   29                          25.3        23.7
         School functioning       22                          18.5        13.9
         Affective strength       16                          15.8        14.4

ᵃ Gender was not available for the comprehensive high school sample. Thus, means are shown for the total
sample.
Exploratory factor analysis
An exploratory factor analysis was conducted on the 52 BERS-2 Youth Version items.
The principal components factor method was used. A scree plot of the eigenvalues was
examined to determine the number of factors to be retained. Four pre-rotational factors were
found with eigenvalues ranging from 3.60 to 28.99. Because of the moderate correlations
between subscales, a direct oblimin rotation was then performed to find the simplest structure
(Fabrigar, Wegener, MacCallum, & Strahan, 1999; Thompson, 2004; Widaman, 1993).
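As a brief worked check, the share of total variance associated with each retained component follows directly from its eigenvalue. The sketch below assumes the components were extracted from the 52 × 52 item correlation matrix, in which case each standardized item contributes one unit of variance and the total equals the number of items. The eigenvalues are those reported in the text and in Table 2; the function name is illustrative.

```python
N_ITEMS = 52  # items entered into the analysis

# Pre-rotation eigenvalues of the four retained components (text and Table 2).
eigenvalues = [28.99, 4.90, 3.90, 3.60]


def percent_variance(eigenvalue, n_items=N_ITEMS):
    """Percent of total variance for one component.

    Assumes extraction from a correlation matrix, so that the total
    variance equals the number of items.
    """
    return 100.0 * eigenvalue / n_items


shares = [percent_variance(ev) for ev in eigenvalues]
cumulative = sum(shares)

for ev, share in zip(eigenvalues, shares):
    print(f"eigenvalue {ev:5.2f} -> {share:4.1f}% of total variance")
print(f"four components together: {cumulative:.1f}%")
```

Under this assumption the first component alone accounts for more than half of the total variance, which is consistent with the moderate correlations between subscales noted above.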
Table 2 contains all within-scale loadings for the BERS-2, which were between .26 and
.71. The first factor contained a mixture of 15 items from three of the five original BERS-2 subscales (Interpersonal Strength, Intrapersonal Strength, and Affective Strength; e.g.,
expresses affection for others, requests support from peers and friends). We interpret this
pattern to reflect youths’ global assessment of their social skills and their management and
expression of emotions particularly as they relate to their relationships with others. Given
this content, this factor could be labeled General Social Skills.
The second factor contained seven items from the original BERS-2’s School Functioning
subscale (e.g., completes homework regularly, pays attention in class). This differs from the
original scale in that two items (14 and 41) did not load. The retained items focus on active
participation in school-related tasks and not other types of school involvement such as sense
of membership and engagement; thus, this factor could be called School Participation.
The third factor contained six items from the original BERS-2 Interpersonal Strength
subscale (e.g., accepts responsibility for own actions, considers consequences of own behavior). These six items appear to represent a youth’s sense of his or her self-control and coping
with emotions. It makes conceptual sense that all Interpersonal Strength items would load
together in an adult rating format because adults observe self-control by observing a youth’s
social interactions. However, these results suggest that youths may distinguish between external social skills and internal emotional control (a youth may experience the need to calm
themselves and not react in a social situation and this may not be obvious from his or her
external behavior); thus, this factor could be called Emotional Control.
The fourth factor contained nine items primarily from the original BERS-2 Family Involvement subscale, but with two items crossing over from the Intrapersonal Strength subscale
(e.g., participates in family activities, demonstrates a sense of belonging to family). These
two Intrapersonal Strength items (i.e., is self-confident, is enthusiastic about life) appear to
differ from other items in the original BERS-2 Youth Version scale in that they may be more
related to family values and functioning than to peer and social functioning. Consistent with
the original BERS subscale, we label this factor Family Involvement.

Table 2 Exploratory factor analysis: BERS-2 items with loadings above .40 (N = 752)

Original BERS-2   BERS-2 subscaleᵃ      I       II      III     IV
item number
34.               Affective strength    .68
13.               Affective strength    .64
25.               Affective strength    .63
23.               Affective strength    .60
6.                Affective strength    .57
21.               INTRApersonal         .56
3.                Affective strength    .53
12.               INTERpersonal         .53
33.               INTERpersonal         .51
44.               INTERpersonal         .46
49.               INTERpersonal         .43
26.               INTRApersonal         .41
46.               INTRApersonal         .40
38.               INTRApersonal         .40
32.               INTRApersonal         .40
31.               School functioning            .71
24.               School functioning            .71
39.               School functioning            .68
51.               School functioning            .67
47.               School functioning            .64
52.               School functioning            .61
40.               School functioning            .49
37.               INTERpersonal                         .64
35.               INTERpersonal                         .58
28.               INTERpersonal                         .55
30.               INTERpersonal                         .55
17.               INTERpersonal                         .50
16.               INTERpersonal                         .44
15.               Family involvement                            .80
7.                Family involvement                            .76
29.               Family involvement                            .63
5.                INTRApersonal                                 .63
1.                Family involvement                            .60
11.               Family involvement                            .55
36.               Family involvement                            .54
42.               INTRApersonal                                 .46
45.               Family involvement                            .41
Eigenvalue                              28.99   4.90    3.90    3.60
% of variance                           56%     9%      8%      7%
No. of items                            16      7       6       9
Score range                             0–48    0–21    0–18    0–27
Alpha                                   .88     .81     .76     .87

Note. Item numbers are those in the BERS-2. Four factor Oblimin resolution. Loadings lower than
0.40 suppressed for purposes of clarity of presentation.
ᵃ BERS-2 subscale on which the item is reported loading for the youth version.
Discussion
EFA yielded a somewhat different factor structure than the original five-factor model provided
by the BERS-2 developer (Epstein, 2004). Our four-factor model is more parsimonious, in
that it relies on fewer variables (37 instead of 52) and still has good internal consistency
characteristics. However, an over-reliance on the statistics of a particular EFA can overfit
the model to a particular sample; thus, confirmatory factor analysis with a second sample is
desirable before drawing conclusions.
Study 2: Confirmatory factor analysis
Method
Participants
Participants were 358 youths referred to Juvenile Justice Services of Clark County, Nevada
for a first-time intake interview and risk assessment. This sample included more males
(58%) than females (42%) and reported African American (34%), European American
(29%), Latino American (28%), Native American (7%), and Asian (2%) ethnicities. Youths
were the following ages: 13 (5%), 14 (14%), 15 (21%), 16 (27%), and 17 (33%).
Measures
The same modified BERS-2 Youth Version was used for Study 2.
Procedures
The modified BERS-2 Youth Version was administered to adjudicated youth by trained
detention staff during 2002 and 2003 as part of the regular juvenile justice intake assessment
process. Data were sent to the researchers for data entry and analysis, with codes substituted
for names. The same data preparation strategies used in Study 1 were used in Study 2.
Analyses
AMOS 4.0 was used to conduct a Confirmatory Factor Analysis (CFA) to compare the
original five-factor BERS-2 model to the four-factor model suggested by Study 1. CFA
helps address questions of validity and “is at the heart of the measurement of psychological
constructs” (Nunnally, 1978, p. 113). CFA allows researchers to test predetermined models
by plotting a proposed factor structure with measured variables loading onto proposed,
or “latent” variables (Kline, 1998). If CFA finds reasonable results, (a) items have high
correlations with latent variables and (b) correlations between factors are not excessively
high (<.85; Kline, 1998). CFA is evaluated by the Chi-square statistic; however, this statistic
is not informative regarding model fit because it is too stringent for model testing, though it
may be used to compare models. Thus, absolute and incremental fit indices are consulted.
The root mean square error of approximation (RMSEA) is the absolute fit measure most
commonly used to evaluate CFA models. It approximates the error value in comparing
the proposed factors to a hypothetical population covariance matrix (Byrne, 2001). When
interpreting the values of RMSEA, .05 and under indicates a good fit, .05 through .08 indicates a
reasonable fit, and above .10 indicates a poor fit (Browne & Cudeck, 1993; Browne & Mels,
1990; Steiger, 1989; Thompson, 2004). Researchers have increasingly used RMSEA as a
key CFA index (DiStefano & Hess, 2005). Two incremental fit measures that are valuable in
interpreting model fit are the Tucker-Lewis Index (TLI; also known as the non-normed fit index) and the
Comparative Fit Index (CFI). These indices are superior to alternatives. TLI favors a simple
model and is not impacted by sample size. CFI does not favor a simple model; thus, it is
not fair to use when comparing multiple models. TLI and CFI values above .95 indicate a
well-fitting model.
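To make the RMSEA benchmarks concrete, the sketch below computes the point estimate from a model chi-square, its degrees of freedom, and the sample size, using the conventional formula sqrt(max(χ² − df, 0) / (df × (N − 1))). That the software used exactly this formula is an assumption. The chi-square values and degrees of freedom are those reported in Table 3, and N = 358 is the Study 2 sample size. The label for values between .08 and .10 is a placeholder, because the benchmarks cited above do not name that band.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate from the conventional single-group formula."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def describe_rmsea(value):
    """Apply the benchmarks cited in the text; the .08-.10 band is not
    labeled there, so a neutral placeholder is used."""
    if value <= 0.05:
        return "good"
    if value <= 0.08:
        return "reasonable"
    if value <= 0.10:
        return "between reasonable and poor"
    return "poor"


N = 358  # Study 2 sample size
models = {  # (chi-square, df) as reported in Table 3
    "A": (3739.36, 1274),
    "B": (2775.41, 1269),
    "C": (2328.43, 665),
    "D": (1258.68, 625),
}

for name, (chi2, df) in models.items():
    value = rmsea(chi2, df, N)
    print(f"Model {name}: RMSEA = {value:.3f} ({describe_rmsea(value)})")
```

Because the formula divides the excess of chi-square over its degrees of freedom by df × (N − 1), RMSEA rewards parsimony and is less sensitive to sample size than the raw chi-square statistic.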
Results
Description of the youth responses
Table 1 displays the mean scores for the group of participants included in Study 2, who were
adjudicated youth detained in a southern Nevada juvenile justice facility. Both males and
females had lower mean scores on each of the BERS subscales than the normative sample
and both Study 1 samples, suggesting that this sample differs from any of the other samples
in terms of the level of strengths experienced by the youths.
Confirmatory factor analysis
CFA was used to test alternative strength-based assessment constructs of the BERS-2 Youth
Version with an independent sample. To test the original five-factor BERS, CFA was first
run for a one-factor solution in which all 52 items loaded on to a single general strengths
factor (Model A) and subsequently run for the five-factor model suggested by Epstein (2004),
which included its five core subscales, all of which loaded onto a single latent variable entitled
Strength Quotient (Model B). The same general models were examined to test our derived
four-factor model—a one-factor solution with all 37 items loading on to a single general
strengths index (Model C, equivalent to Model A) and subsequently for the four-factor model
suggested by Study 1 (Model D, equivalent to Model B).
The results of the CFAs for each model are shown in Table 3. The indices for all models
fit the data at least minimally well. For each model, the multiple factor to strength quotient
models (Models B and D) performed significantly better than the one-factor models (Models
A and C). That is, the Chi-square difference was significant when comparing the one-factor
model to the multiple-factor-strength-quotient model for both the original BERS-2 model,
χ²(5) = 963.95, p < .001, and our derived BERS-2 model, χ²(40) = 1069.75, p < .001.
In examining the model fit indices, both the original BERS-2 Model B and the new BERS-2 Model D had RMSEA indices in the acceptable range. However, the new four-factor
BERS-2 Model D had superior fit values across all indices. Using the Akaike Information
Criterion, which is a goodness-of-fit measure that adjusts model chi-square to penalize for
model complexity and allows for comparisons between non-hierarchical models, this study’s
Springer
J Child Fam Stud
Table 3
Confirmatory factor analysis: Comparison of model fit indices
Model
2
df
3,739.36f
2,775.41f
1,274 .92
1,269 .97
.93
.97
.074
.058
.071–.076
.055–.061
8.72
.74
2,328.43f
1,258.68f
665
625
.93
.98
.084
.053
.080–.087
.049–.058
2.94
.10
Original BERS-2 five factors
A. One-factor model 52 items
B. STQ model (5 subscales)
Cross-Validation on this study’s four
derived factors
C. One-factor model 37 items
D. Strength quotient (STQ) 4 factors
a TLI:
Tucker-Lewis index.
b CFI:
Comparative fit index.
c RMSEA:
d CI:
.92
.98
RMSEAc RMSEA CId AICe
Root-mean-square error approximation.
Confidence interval.
e AIC:
fp
TLIa CFIb
Akaike Information Criterion.
< .001.
alternative four-factor BERS-2 Model D (AIC = .10) emerged as a better fit to the data than
the original BERS-2 Model B (AIC = .74), and it is a more parsimonious model overall.
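The chi-square difference tests reported for the nested model pairs can be re-derived from the chi-square and df values in Table 3. The following is a minimal sketch rather than the software used in the study: the function names and the dictionary are illustrative, and the p-value comes from a closed-form recurrence for the chi-square upper tail that holds for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate, x > 0, integer df >= 1.

    Uses Q(x; nu + 2) = Q(x; nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1),
    starting from Q(x; 1) = erfc(sqrt(x/2)) or Q(x; 2) = exp(-x/2).
    """
    half_x = x / 2.0
    if df % 2 == 0:
        q, nu = math.exp(-half_x), 2
    else:
        q, nu = math.erfc(math.sqrt(half_x)), 1
    while nu < df:
        # Work on the log scale so large chi-square values do not overflow.
        log_term = (nu / 2.0) * math.log(half_x) - half_x - math.lgamma(nu / 2.0 + 1.0)
        q += math.exp(log_term)
        nu += 2
    return q

def chi2_difference(restricted, general):
    """Chi-square difference test for two nested models given as (chi2, df) pairs."""
    delta_chi2 = restricted[0] - general[0]
    delta_df = restricted[1] - general[1]
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)

# (chi-square, df) pairs as read from Table 3.
table_3 = {
    "A": (3739.36, 1274),  # one-factor model, 52 items
    "B": (2775.41, 1269),  # strength quotient model, 5 subscales
    "C": (2328.43, 665),   # one-factor model, 37 items
    "D": (1258.68, 625),   # strength quotient model, 4 factors
}

for restricted, general in (("A", "B"), ("C", "D")):
    d_chi2, d_df, p = chi2_difference(table_3[restricted], table_3[general])
    print(f"Model {restricted} vs. Model {general}: "
          f"delta chi2({d_df}) = {d_chi2:.2f}, p < .001 is {p < 0.001}")
```

Models B and D are not nested in each other because they use different item sets, which is why that comparison relies on the AIC rather than on a chi-square difference.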
Discussion
As efforts to build the conceptual foundation of strength-based assessment evolve, and with
them the refinement of instruments to measure its core latent constructs, it will be necessary
to test alternative assessment models. We sought to add to this discussion by examining
alternative factor structure models for the BERS-2 Youth Version with an independent
sample. The results of the CFA showed that the four-factor solution (Model D) had the best
fit statistics of the four models tested.
General discussion
As research and clinical interest in positive psychology and resilience assessment has grown,
the BERS-2, as one of the few psychometrically developed measures available, has been
quickly adopted for use in such important contexts as large-scale national mental health
outcome studies (Center for Mental Health Services, 2001). Thus, it has had a substantial
impact on defining key constructs within the emerging area of strength-based assessment
(Buckley & Epstein, 2004). This study was conducted to further examine the structural
validity of the BERS-2 Youth Version. The results of Study 1 showed that using a diverse
sample of youth, including both general education students and youth with involvement in
the juvenile justice system, the original five BERS factors were not replicated. Based on
this EFA, we found that only 37 of the 52 items loaded onto four factors with a modified
BERS-2 self-report format. Although there was substantial overlap in scale content, some of
the items loaded on to different factors.
Study 2 used CFA to further cross-validate different conceptual models for both the
original five-factor configuration and our derived alternative four-factor solution using an
independent sample. One difference in our CFA is that the BERS-2 CFA
reported by Epstein (2004) used subscale scores and did not include individual items. Hence,
all 52 items were retained and it is unknown if all of these items would add unique variance;
that is, if all of these items are needed and if the proposed model is the best conceptual fit
to understand youth responses to the BERS-2. Epstein did not test any other hypothesized
conceptual models. Nonetheless, the results of Study 2 provided some support for the original
BERS-2 Youth Version five-factor configuration because the fit indices and the RMSEA for
Model B were better than those reported by Epstein in the BERS-2 manual. In fact, the
RMSEA index, although not ideal, indicated a moderate fit to the data, suggesting that youths’
perceptions of their personal strengths are best portrayed as a combination of first-order
constructs that are correlated and map onto a second-order global strength index. However,
the four-factor CFA solution derived through Study 1 had better fit indices than the BERS-2
model and an RMSEA that approached .05.
Of importance, this analysis examined alternative configurations that have implications
for developing a better understanding of the latent constructs (strengths) measured by the
BERS-2 Youth Version. First, it is important to acknowledge that both the original and
alternative solutions showed that models with a global strengths index were superior to
models with uncorrelated specific constructs. At least with the item content included in
the BERS-2, it appears that youths’ own perceptions of their personal strengths have a
superordinate element.
Second, it is instructive to examine the ways in which Model D (the best four-factor solution) differs from Model B (the best five-factor solution). The results of the EFA suggest that
youth may not differentiate between what is being measured by the BERS-2 Intrapersonal
Strengths and Interpersonal Strengths subscales. The Intrapersonal Strengths subscale was
not replicated in the EFA analysis and items that were retained loaded on the Interpersonal
Strengths subscale. This suggests that from a youth’s perspective, the personal skills needed to successfully manage and negotiate social relationships
may not be differentiated from those skills needed to understand and control their own emotional experiences. This may be particularly true for youth with emotional and behavioral
challenges because the need to control impulses often arises in the context of conflicted
social interactions.
Differences between the BERS-2 factor structure and the current study are not surprising because the BERS-2 youth version appears to be adapted directly from the original
BERS’s parent/caregiver report without additional work to examine its validity for a youth
population. It is not accurate to assume that parents’, teachers’, and youths’ experiences
and ratings of strengths will include the same latent constructs, given developmental and
cognitive differences in functioning. For example, youths may not have the self-awareness
to differentiate between their feelings and actions. Moreover, rating oneself versus another
may lead to different patterns of responding, and thus, different latent traits being accessed.
Recent efforts to explain informant discrepancies in assessing child psychopathology have
proposed an ABC model, which hypothesizes that memory recall, the actor-observer phenomenon, and source monitoring all differentially impact responding, depending on the rater
(De Los Reyes & Kazdin, 2005). For example, a parent may attribute behaviors to their
child’s disposition, whereas his or her child may attribute negative behaviors to settings
expectations (De Los Reyes & Kazdin, 2005). Results of this study indicate that a different
factor structure is applicable to the youth report, which maintains the external influences
(i.e., school participation and family involvement) proposed by the BERS-2 developers and
combines and reduces the three social-emotional factors into two factors with more concrete latent constructs (i.e., general social skills and emotional control versus interpersonal
strength, intrapersonal strength, and affective strength).
Although this study provided an important validity analysis to extend our understanding
of the theoretical and empirical utility of the BERS-2 Youth Report, its limitations
require that the results be interpreted with caution. The primary
limitation of this study is that a modified version of the actual BERS-2 was used. Thus,
though the number, order, and content of items were the same, the actual wording of items
was slightly different. This limitation is partly offset by the strengths of the study methodology.
Namely, conducting independent analyses with diverse samples allowed us to investigate
the generalizability of the BERS-2 factor structures. Although our study samples differed
somewhat from each other and from the original BERS-2 normative sample, CFA results
indicated reasonable to good model fit for both the four-factor and five-factor models. In
addition, the BERS-2 youth report was developed by making cosmetic changes to the original
parent/caregiver report. The BERS-2 developers did not report an exploratory factor analysis
for the youth self-report and they found inadequate model fit through confirmatory factor
analysis. Thus, this study is the first to explore the factor structure of the BERS items with a
youth sample. Additional research with the published BERS-2 Youth Report is necessary to
replicate results.
We note that the BERS-2 was empirically derived without reference to a specific theory
of strength-based assessment or a core set of a priori defined latent constructs. Nonetheless,
when experts were asked to identify “strengths” in those youths with whom they worked,
they were able to identify a set of personal skills and social supports that seem to have
captured important aspects of positive youth development. This investigation
aimed to explore the validity of the original BERS-2 Youth Version conceptual model
as it compares to other possible conceptual organizations. The results suggest that as the
field of strength-based assessment evolves, it may benefit by increasing its efforts to refine
instruments by linking them more directly with conceptual models derived from positive
youth development paradigms.
The BERS-2 items, in both the four-factor and five-factor versions, appear to correspond to theories of positive youth development and positive psychology. Zins, Bloodworth,
Weissberg, and Walberg (2004), for example, propose five constructs for their model of youth
social-emotional competence that include: Self-Awareness, Social Awareness, Responsible
Decision Making, Self-Management, and Relationship Management. The BERS-2 subscales
do not include items that measure decision making, and the Zins model does not have content
that focuses specifically on the family context. The first factor from our derived four-factor
model, which we called General Social Skills, appears to be an undifferentiated combination
of Self-Awareness, Social Awareness, and Relationship Management in the Zins model.
Though the original five-factor BERS-2 model proposed a factor structure that included
these three separate subscales, this conceptual structure was not supported in our study. The
school element does not contain items with content that assesses the more affective elements of school involvement (e.g., school bonding; O’Farrell & Morrison, 2003; O’Farrell,
Morrison, & Furlong, 2006); instead, its items focus on more functional participation in school activities, such as
completing homework assignments. What the BERS-2 Youth Version adds to the Zins et al.
model are items that include content related to positive social connections, both in families
and schools. Benard (2004) refers to these social assets as External Resources in her model
used to develop the Resilience Youth Development Module of the California Healthy Kids
Survey (Constantine, Benard, & Diaz, 1999). Her suggestions draw on research by Blum
and Libbey (2004) showing that both family and school positive social bonds, what they
call “connections,” are associated with reduction in the rates of high-risk behaviors such
as aggression and substance use. Mapping theoretical constructs on to the BERS-2 items
and scales will allow researchers to more confidently investigate the relationship between
strength-based latent constructs measured by the BERS-2 and outcomes.
Lambert and colleagues (2005) provide an illustration of how to move the field forward
in linking theoretical constructs to measurement tools. These researchers set out to carefully
develop an assessment package that measured the behavioral and emotional strengths in
Black youths by first drawing on the perspectives of the African American community.
The resulting scale, Behavioral Assessment for Children of African Heritage (BACAH), has
parent, teacher, and youth self-report formats. Of particular interest, the results of independent
EFAs found two unidimensional factors for all three forms, which they called Resilience and
Self-Regulation–Prosocial Behavior (similar to the General Social Skills factor found in our
four-factor BERS-2 solution). Although each rating group had two primary factors, there
were some differences in the item content, suggesting that the specific items for each group
may vary. They also examined item invariance across pairs of the three groups and found
some differential item functioning across the groups. However, their IRT analysis suggested
that it may be possible to measure the latent variables for Resilience and Self-Regulation–
Prosocial Behavior across all three groups and points to an approach that focuses more on
the level of traits being measured and not the total sum of items. In brief, their analysis
suggests another approach to selecting items for scales such as the BERS-2. In this instance,
items would be selected not only because they increased reliability, stability, or structural
validity, but because, for example, they had known characteristics that place them along a
continuum from low to high on the latent trait of “Resilience.” Such an IRT analysis with
the BERS-2, comparing youth responses to those of parents and teachers, in addition to
comparing subgroups of youth, would provide another strategy with which to enhance the
BERS-2 and to refine its use and interpretation.
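The idea of selecting items by where they sit along a latent trait can be illustrated with a two-parameter logistic (2PL) item response function. This is only a sketch under stated assumptions: the IRT model used by Lambert and colleagues is not specified here, the item names and parameter values below are hypothetical rather than BACAH or BERS-2 estimates, and items are treated as dichotomous (endorsed or not endorsed) for simplicity, whereas rating-scale items would call for a polytomous model.

```python
import math

def p_endorse(theta, a, b):
    """2PL item response function.

    theta: respondent's level on the latent trait (e.g., "Resilience")
    a: item discrimination (how sharply the item separates trait levels)
    b: item location (trait level at which endorsement probability is .50)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical (a, b) parameters; NOT estimates from any published scale.
items = {
    "item_mid": (1.0, 0.0),
    "item_high": (1.5, 1.0),
    "item_low": (1.2, -1.0),
}

# Order items along the trait continuum from low to high location.
ordered = sorted(items, key=lambda name: items[name][1])

for name in ordered:
    a, b = items[name]
    probs = [round(p_endorse(theta, a, b), 2) for theta in (-2.0, 0.0, 2.0)]
    print(f"{name}: location b = {b:+.1f}, P(endorse) at theta = -2, 0, 2: {probs}")
```

Under this kind of model, a scale could be assembled so that its items cover the low, middle, and high regions of the trait, rather than being chosen only for their contribution to a total score.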
Given the results of this investigation, we conclude that the most parsimonious organization of the BERS-2 Youth Version items was associated with a conceptual model that
included both internal assets associated with (a) the management of emotions and positive
social interaction skills and (b) engagement in the important social contexts of family and
school. Furthermore, our results should not be used to abandon the original BERS-2 scoring
rubric and norms because the results provided increased verification of the original BERS-2
five-factor model and the four-factor solution was not overwhelmingly superior. However,
the four-factor model produces similar information with fewer items, and as such, may prove
to be an efficient option in large-scale evaluation projects where the length of questionnaires
is an issue and confidence in the latent traits being measured is necessary. In future studies,
parallel reporting of the four-factor model may provide additional insight to the nature and
structure of the BERS-2 Youth Version’s clinical validity and utility when compared with
the five-factor model, thus potentially contributing to a better understanding of important
strength-based latent traits.
References
Achenbach, T. (1991). Manual for the Child Behavior Checklist. Burlington, VT: University of Vermont,
Department of Psychiatry.
Benard, B. (2004). Resiliency: What we have learned. San Francisco: WestEd.
Blum, R. W., & Libbey, H. P. (2004). School connectedness—Strengthening health and education outcomes
for teenagers: Executive summary. Journal of School Health, 74, 231–232.
Browne, M., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. Long (Eds.),
Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Browne, M. W., & Mels, G. (1990). RAMONA PC: User Manual. Pretoria: University of South Africa.
Buckley, J. A., & Epstein, M. H. (2004). The Behavioral and Emotional Rating Scale-2 (BERS-2): Providing
a comprehensive approach to strength-based assessment. California School Psychologist, 9, 21–27.
Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
Cairns, R. B., Leung, M.-C., Gest, S. D., & Cairns, B. D. (1995). A brief method for assessing social
development: Structure, reliability, stability, and developmental validity of the Interpersonal Competence
Scale. Behaviour Research and Therapy, 33, 725–736.
Center for Mental Health Services. (2001). Annual report to Congress on the evaluation of the Comprehensive
Community Mental Health Services for Children and their Families Program, 2001. Atlanta, GA: ORC
Macro.
Chafouleas, S. M., & Bray, M. A. (2004). Introducing positive psychology: Finding a place within school
psychology. Psychology in the Schools. Special Issue Positive Psychology and Wellness in Children, 41,
1–5.
Cicchetti, D., & Lynch, M. (1993). Toward an ecological/transactional model of community violence and
child maltreatment. Psychiatry, 56, 96–118.
Cicchetti, D., & Toth, S. L. (1997). Transactional ecological systems in developmental psychopathology. In
S. S. Luthar, J. A. Burack, D. Cicchetti, & J. R. Weisz (Eds.), Developmental psychopathology: Perspectives on adjustment, risk, and disorder (pp. 317–349). United Kingdom: Cambridge University
Press.
Constantine, N., Benard, B., & Diaz, M. (1999). Measuring protective factors and resilience traits in youth:
The Healthy Kids Resilience assessment. Paper presented at the Seventh Annual Meeting of the Society
for Prevention Research, New Orleans, LA, June.
De Los Reyes, A., & Kazdin, A. E. (2005). Informant discrepancies in the assessment of childhood psychopathology: A critical review, theoretical framework, and recommendations for further study. Psychological Bulletin, 131, 483–509.
DiStefano, C., & Hess, B. (2005). Using confirmatory factor analysis for construct validation: An empirical
review. Journal of Psychoeducational Assessment, 23, 225–241.
Doll, B. (2001). Review of the Behavioral and Emotional Rating Scale: A strength-based approach to assessment. In B. S. Plake & J. C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 142–144).
Lincoln, NE: The University of Nebraska-Lincoln.
Epstein, M. H. (1999). The development and validation of a scale to assess the emotional and behavioral
strengths of children and adolescents. Remedial and Special Education, 20, 258–263.
Epstein, M. H. (2004). Behavioral and Emotional Rating Scale-2: A strength-based approach to assessment.
Austin, TX: PRO-ED.
Epstein, M. H., Mooney, P., Ryser, G., & Pierce, C. D. (2004). Validity and reliability of the behavioral and
emotional rating scale (2nd edition): Youth rating scale. Research on Social Work Practice, 14, 358–367.
Epstein, M. H., & Sharma, J. (1998). Behavioral and Emotional Rating Scale: A Strength-based Approach to
Assessment. Austin, TX: PRO-ED.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory
factor analysis in psychological research. Psychological Methods, 4, 272–299.
Farmer, T. W., Clemmer, J. T., Leung, M., Goforth, J. B., Thompson, J. H., Keagy, K., et al. (2005).
Strength-based assessment of rural African American early adolescents: Characteristics of students in
high and low groups on the Behavioral and Emotional Rating Scale. Journal of Child and Family Studies,
14, 57–69.
Huebner, E. S., & Gilman, R. (Eds.). (2003). Toward a focus on positive psychology in school psychology.
School Psychology Quarterly, 18, 9–102.
Jimerson, S. R. (2004). The California School Psychologist provides valuable information regarding strength-based assessment. California School Psychologist, 9, 3–7.
Jimerson, S. R., Sharkey, J. D., Nyborg, V., & Furlong, M. J. (2004). Strength-based assessment and school
psychology: A summary and synthesis. California School Psychologist, 9, 9–20.
Kline, R. B. (1998). Principles and practice of structural equation modeling. New York: Guilford Press.
Lambert, M. C., Rowan, G. T., Kim, S., Rowan, S. A., An, S. J., Kirsch, E. A., & Williams, O. (2005).
Assessment of behavioral and emotional strengths in Black children: Development of the Behavioral
Assessment for Children of African Heritage. Journal of Black Psychology, 31, 321–351.
Libbey, H. P. (2004). Measuring student relationships to school: Attachment, bonding, connectedness, and
engagement. Journal of School Health, 74, 274–283.
Lourie, I. S., Stroul, B. A., & Friedman, R. M. (1998). Community-based systems of care: From advocacy
to outcomes. In M. H. Epstein & K. Kutash (Eds.), Outcomes for children and youth with emotional
and behavioral disorders and their families: Programs and evaluation best practices (pp. 3–19). Austin, TX:
PRO-ED.
National Research Council and the Institute of Medicine. (2003). Engaging schools: Fostering high school
students’ motivation to learn. Board on Children, Youth, and Families, Division of Behavioral and Social
Sciences and Education. Washington, DC: The National Academies Press.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
O’Farrell, S. L., & Morrison, G. M. (2003). A factor analysis exploring school bonding and related constructs
among upper elementary students. California School Psychologist, 8, 53–72.
O’Farrell, S., Morrison, G. M., & Furlong, M. J. (2006). School engagement. In G. Bear & K. Minke (Eds.),
Children’s needs III (pp. 45–58). Bethesda, MD: National Association of School Psychologists.
Olmi, D. J. (2001). Review of the Behavioral and Emotional Rating Scale: A Strength-Based Approach
to Assessment. In B. S. Plake & J. C. Impara (Eds.), The fourteenth mental measurements yearbook
(pp. 144–145). Lincoln, NE: The University of Nebraska-Lincoln.
ORC Macro. (2003, May). System-of-Care evaluation brief: Resilient adaptation in children with serious emotional disturbance. Retrieved May 6, 2006, from http://www.dhh.louisiana.gov/offices/publications/pubs142/SOCEB%20May%2003.pdf.
Rhee, S., Furlong, M. J., Turner, J., & Harari, I. (2001). Integrating strength-based perspectives in psychoeducational evaluations. California School Psychologist, 6, 5–17.
Seligman, M. E. P., Steen, T. A., Park, N., & Peterson, C. (2005). Positive psychology progress: Empirical
validation of interventions. American Psychologist, 60, 410–421.
Steiger, J. H. (1989). EzPATH: A supplementary module for SYSTAT and SYGRAPH. Evanston, IL: SYSTAT
Inc.
Switzer, F. S., III, & Roth, P. L. (2002). Coping with missing data. In S. G. Rogelberg (Ed.), Handbook of
research methods in industrial and organizational psychology: Blackwell handbooks of research methods
in psychology (pp. 310–323). Malden, MA: Blackwell.
Synhorst, L. L., Buckley, J. A., Reid, R., Epstein, M. H., & Ryser, G. (2005). Cross informant agreement of
the Behavioral and Emotional Rating Scale-2nd edition (BERS-2) parent and youth rating scales. Child
& Family Behavior Therapy, 27, 1–11.
Thompson, B. (2004). Exploratory and confirmatory factor analysis. Washington, DC: American Psychological Association.
Uhing, B. M., Mooney, P., & Ryser, G. R. (2005). Differences in strength assessment scores for youth with
and without ED across the youth and parent rating scales of the BERS-2. Journal of Emotional and
Behavioral Disorders, 13, 181–187.
WestEd. (2006). Healthy Kids Resilience Module (HKRM). Retrieved May 6, 2006, from
http://www.wested.org/cs/we/view/rs/562.
Widaman, K. (1993). Common factor analysis versus principal component analysis: Differential bias in
representing model parameters? Multivariate Behavioral Research, 28, 263–311.
Zins, J. E., Bloodworth, M. R., Weissberg, R. P., & Walberg, H. J. (2004). The scientific base linking social
and emotional learning to school success. In J. E. Zins, R. P. Weissberg, M. C. Wang, & H. J. Walberg
(Eds.), Building academic success on social and emotional learning: What does the research say?
(pp. 3–22). New York: Teachers College Press.