
Systematic Screening for Behavior Disorders (SSBD)
Technical Manual
Universal Screening for PreK–9
Hill Walker, Ph.D.
Herbert H. Severson, Ph.D.
Edward G. Feil, Ph.D.
SECOND EDITION
Copyright © 2014 by Hill M. Walker, Herbert H. Severson, and Edward G. Feil
All rights reserved.
Cover and interior design by Aaron Graham
The purchaser is granted permission to use, reproduce, and distribute the reproducible forms in the book and on the CD solely for use in a single classroom.
Except as expressly permitted above and under the United States Copyright
Act of 1976, no parts of this work may be used, reproduced, or distributed in
any form or by any means, electronic or mechanical, without the prior written
permission of the publisher.
Published in the United States by
Pacific Northwest Publishing
21 West 6th Avenue
Eugene, OR 97401
ISBN 978-1-59909-065-8
Pacific Northwest Publishing
Eugene, Oregon | www.pacificnwpublish.com
TABLE OF CONTENTS
List of Figures and Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
SSBD National Standardization Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Grades 1–6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Preschool and Kindergarten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Supplemental SSBD Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
SSBD Instrument Development Procedures . . . . . . . . . . . . . . . . . . . . . . . 7
Phase 1: Initial Development of SSBD Instruments . . . . . . . . . . . . . . . . . 7
Stage 1 Instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Interrater Agreement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Test-Retest Reliability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Sensitivity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Stage 2 Instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
SIMS Behavior Observation Codes . . . . . . . . . . . . . . . . . . . . . . . . . 10
Phase 2: Trial Testing, Field Testing, and Validation of
SSBD Instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Trial Testing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Stage 1 Instruments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Stage 2 Instruments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
SIMS Behavior Observation Codes. . . . . . . . . . . . . . . . . . . . 19
Discriminating Externalizers, Internalizers
and Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Efficiency in Classifying Participant Groups . . . . . . . 21
Sex Differences. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Intercorrelations Among Stage 2 Measures and
SIMS Behavior Observation Code Variables. . . . . . . 24
SSBD Field Testing and Replication. . . . . . . . . . . . . . . . . . . . . . . . . 27
Validation Studies of the SSBD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Reliability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Test-Retest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Internal Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Interrater. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Validity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Item Validity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Factorial Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Concurrent Validity . . . . . . . . . . . . . . . . . . . . . . . . 36
Discriminant Validity . . . . . . . . . . . . . . . . . . . . . . . 36
Construct Validity . . . . . . . . . . . . . . . . . . . . . . . . . 53
Social Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Phase 3: Extensions of SSBD Instruments . . . . . . . . . . . . . . . . . . . . . . . . . 57
The Early Screening Project: Using the SSBD with Preschool
and Kindergarten Students . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
History and Development of ESP . . . . . . . . . . . . . . . . . . . . . . . 57
Validation Studies: Reliability . . . . . . . . . . . . . . . . . . . . . . . 59
Interrater Reliability . . . . . . . . . . . . . . . . . . . . . . . 59
Test-Retest Reliability . . . . . . . . . . . . . . . . . . . . . . 60
Consistency Across Measures . . . . . . . . . . . . . . . . . . . . 60
Validation Studies: Validity . . . . . . . . . . . . . . . . . . . . . . . . 61
Content Validity . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Concurrent Validity . . . . . . . . . . . . . . . . . . . . . . . . 62
Discriminative Validity . . . . . . . . . . . . . . . . . . . . . . 62
Treatment Utility . . . . . . . . . . . . . . . . . . . . . . . . . 63
Summary of ESP Technical Adequacy . . . . . . . . . . . . . . . . . . . . 64
Using the SSBD with Students in Grades 7–9 . . . . . . . . . . . . . . . . . . . 65
Update on SSBD Research and Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . 66
Research Conducted by Other Professionals . . . . . . . . . . . . . . . . . . . 66
Research Conducted by the SSBD Authors
and Colleagues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Appendix A: Normative Comparisons: SSBD Original Norms and
Updated Supplemental Normative Databases . . . . . . . . . . . . . . . . . . . . . . . . 79
Appendix B: SSBD Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
LIST OF FIGURES AND TABLES
Figure 1: Means of Children Ranked Highest on Externalizing
Dimension, Internalizing Dimension, and Nonranked
Peers on T-Scores of ESP Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Figure 2: First Steps Intervention Results on the SSBD . . . . . . . . . . . . . . . . . . . . 63
Table 1: Proportion of Cases on SSBD Stage 2 and SIMS Behavior
Observation Codes by Standardization Sample Size . . . . . . . . . . . . . . 2
Table 2: SSBD Standardization Sample Demographic Characteristics . . . . . . . 4
Table 3: Number and Age of Children in the ESP Normative Sample . . . . . . . 6
Table 4: Test-Retest Stability Coefficients for Individual Teachers
on the Stage 1 Rank-Ordering Procedures . . . . . . . . . . . . . . . . . . . . . . . . 16
Table 5: Item-Total Correlations for the Stage 2 Behavior
Scale Across Time 1 and Time 2 Rating Occasions . . . . . . . . . . . . . . 17
Table 6: Means, Standard Deviations, and ANOVAs for the
Three Participant Groups on the Stage 2 Instruments . . . . . . . . . . . . 18
Table 7: Means, Standard Deviations, and ANOVAs for the Three
Participant Groups on the Classroom and Playground
Observations Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Table 8: Scheffé Analysis of Mean Differences on Discriminating
SSBD Stage 2 and SIMS Behavior Observation Codes Variables . . . 21
Table 9: Correlations Between Predictor Variables and Group
Membership and Corresponding Beta Weights . . . . . . . . . . . . . . . . . .23
Table 10: Sex Differences on Stage 2 and SIMS Behavior Observation
Codes Variables for Combined Participant Groups . . . . . . . . . . . . . . 23
Table 11: Correlation Matrix for Stage 2 and SIMS Behavior
Observation Codes Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Table 12: Item-Total Correlations for the Stage 2 Adaptive and
Maladaptive Rating Scales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Table 13: Adaptive and Maladaptive Behavior Rating Scale Factor
Structure and Item Loadings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Table 14: PSB Code Category and Code Category Combination
Means, Standard Deviations, and Significance Tests by
Participant Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Table 15: Comparison of Frequency of Items Checked on the
Critical Events Index for Externalizing and Internalizing
Elementary Students Who Met Risk Criteria on the SSBD . . . . . . . . 40
Table 16: Means, Standard Deviations, and Significance Tests for Four
Participant Groups of North Idaho Children’s Home Residents . . . 41
Table 17: Chi-square Analysis of Critical Events Items for Four
North Idaho Children’s Home Participant Groups . . . . . . . . . . . . . . . 43
Table 18: Means, Standard Deviations, and Significance Tests for
Fourth-Grade Externalizing, Internalizing, and Nonranked
Students . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 19: Means, Standard Deviations, and Significance Tests for
Participant Groups on the Adaptive and Maladaptive
Rating Scale Items . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Table 20: SARS Behavior Profiles for Externalizing, Internalizing,
and Nonranked Control Students . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Table 21: Means, Standard Deviations, and Significance Tests for Isolate
and Nonisolate Participants on Teacher Social Skills Ratings and
SSBD Stage 2 and SIMS Observation Codes Measures . . . . . . . . . . . 51
Table 22: Correlations Between Year One and Year Two Follow-up
Scores on SSBD Stage 2 Measures for Combined Externalizing
and Internalizing Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Table 23: SIMS Behavior Observation Codes Predictor
Variables for Discrimination Analysis Classifying
Previous Year’s Participant Group Status . . . . . . . . . . . . . . . . . . . . . . 52
Table 24: Correlations Between SSBD Stage 2 Measures and
Achenbach TRF Scales and SSRS Scales . . . . . . . . . . . . . . . . . . . . . . . . 69
Table 25: Similarities in SSBD Score Profiles for Normative and
Research-Based Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table 26: Original SSBD Norms vs. Supplemental Practice—
Research Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Table 27: Profiles for Externalizing and Internalizing Students
Meeting vs. Not Meeting SSBD Stage 2 Risk Criteria . . . . . . . . . . . . . 84
Table 28: Descriptive Statistics for SSBD Stage 2 Measures . . . . . . . . . . . . . . . . 85
Table 29: Lane et al. Supplemental Norms From Research
Conducted in the U.S. Southeast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Table 30: Meeting/Not Meeting SSBD Stage 2 Risk Criteria by
Ethnicity and Externalizing vs. Internalizing Status . . . . . . . . . . . . . 87
ACKNOWLEDGMENTS
A large number of professionals have made important contributions
to the research and development of the SSBD system. Project staff
members and colleagues of the authors who participated directly in the
research process on the SSBD were Bonnie Todis, Alice Block-Pedego,
Maureen Barckley, Greg Williams, Norris Haring, and Richard Rankin.
Their contributions and dedication always met the highest standards of
professionalism.
The SSBD system was field-tested at a number of sites around the country in order to develop its normative database and test its efficacy. In
particular, Vicki Phillips, Marilyn McMurdie, and Gayle Richards of the
Kentucky Department of Education and State of Utah made enormous
contributions to this process. Their generosity, dedication, and contributed time were outstanding and are greatly appreciated.
Fulvia Nicholson of the Jordan School District in Utah conducted a
full-scale, year-long replication of the SSBD through a grant from the
Utah Department of Education. Her skill, professional dedication, and
generosity were instrumental in making this a highly successful replication. The authors are indebted to her for these consistently high-quality
efforts. Linda Colson and Lisa York of Illinois also cooperated with the
authors and their staff in testing the SSBD over a year-long period. Our
thanks and gratitude are also extended to them for the quality and generosity of their efforts.
Other individuals who made important contributions to the SSBD’s development include Ken Reavis, Stevan Kukic, Steve Forness, Bill Jenson,
Mike Nelson, Ken Sturm, Ray Lamour, Kathy Ludholtz, Gary Adams,
Hyman Hops, Lew Lewin, Peter Nordby, Bob Hammond, Bob Lady, and
Kathy Keim-Robinson.
We would especially like to acknowledge the professional colleagues
who more recently shared their SSBD research and practice databases
with us to supplement and substantially expand our original normative
base of 4,463 cases. These supplemental norms comprised nearly 7,000
additional cases drawn from five different regions of the United States.
We acknowledge the following individuals for their invaluable efforts
in this regard: Doug Cheney, Lucille Eber, Kathleen Lane, Gale Naquin,
Jen Rose, Jason Small, Scott Stage, and Rich and Ben Young and their
colleagues at Brigham Young University. These individuals were also
instrumental in conducting data runs and analyses that made it possible
to maximally utilize these normative case data. We are most indebted to
them for their generosity and support.
The validation and norming of the SSBD over a 5-year period were
supported in part by research and model development grants to the
authors from a series of federal and state agencies. Finally, we would like
to acknowledge the excellent work of Jason Small, a research analyst at
the Oregon Research Institute, for his comprehensive analysis of the psychometric characteristics of the SSBD resulting from three large-scale
evaluation studies of the First Step program in which the SSBD was used
as a universal screener.
INTRODUCTION
This document describes the development, trial testing, validation studies, and norming procedures and outcomes for the Systematic Screening
for Behavior Disorders (SSBD) screening system. These initial activities
occurred over a 5-year period prior to the SSBD’s publication. The development and validation of other measures in the Screening, Identification,
and Monitoring System (SIMS), including the School Archival Records
Search (SARS) and SIMS Behavior Observation Codes, occurred in
conjunction with work around SSBD Stages 1 and 2. As part of SSBD
development and validation, the SARS and SIMS Behavior Observation
Codes were often administered to students who met risk criteria at Stage
2. Therefore, this manual also presents technical information around the
development, validation, and use of these SSBD follow-up assessments in
conjunction with Stage 1 and 2 measures.
SSBD NATIONAL STANDARDIZATION SAMPLE
At the completion of screening Stage 2 and/or use of the SIMS Behavior
Observation Codes, data and information are available to make normative comparisons. These normative data allow schools to determine how
an individual student compares with his or her peers on dimensions
assessed by the SSBD, and can help determine the student’s specific
behavioral status and possible eligibility for referral, special education
certification, access to interventions, and/or specialized services and
supports. Normative data are presented in Appendix A tables of the
Administrator’s Guide, and should be of value to professionals during
decision making about potentially at-risk students. Normative data
were also used to identify cut-offs for decision rules at Stage 2 that are
associated with risk for externalizing or internalizing disorders. The
composition of this national standardization sample is described in the
next sections, by grade level range.
National Standardization Sample: Grades 1–6
The national standardization sample for the SSBD comprises
approximately 4,400 cases (N = 4,463) on the Stage 2 measures and
approximately 1,300 cases (N = 1,219) on the SIMS Behavior Observation
Codes. These cases were developed within 17 school districts located
in 8 states across the country. These states were Oregon, Washington,
Utah, Illinois, Wisconsin, Rhode Island, Kentucky, and Florida. Table 1
contains the proportion of the total cases in the standardization sample
from each of these sites for both Stage 2 measures and SIMS Behavior
Observation Codes variables.
Table 1 Proportion of Cases on SSBD Stage 2 Measures and SIMS Behavior
Observation Codes by Standardization Sample Size

SSBD STAGE 2 MEASURES
State           n       % Total Sample
Florida         82      1.8
Washington      280     6.3
Illinois        198     4.4
Kentucky        1,144   25.6
Oregon          1,284   28.8
Rhode Island    261     5.8
Utah            1,038   23.3
Wisconsin       176     3.9
Total           4,463   100

SIMS BEHAVIOR OBSERVATION CODES
State           n       % Total Sample
Washington      77      6.3
Illinois        99      8.1
Kentucky        212     17.4
Oregon          455     37.3
Utah            316     25.9
Rhode Island    60      4.9
Total           1,219   100
This sample was developed over a 2-year period spanning the 1987–88
and 1988–89 school years. The development of the sample was made
possible through state education department (Utah and Kentucky) and
school district contacts of the authors. Two sites (Illinois and Wisconsin)
contacted the authors regarding participation in the standardization
process and in facilitating a field test of the SSBD.
Correlations were computed between the Stage 2 measures and grade
and sex of students in the standardization sample. For the Critical Events
Index and the Adaptive and Maladaptive Behavior Scales, the correlations with grade were .02, −.04, and .00. The corresponding correlations
with sex of student were −.18, .28, and −.26. Although several of these
correlations reached statistical significance (p < .05), they were clearly in the low
range of magnitude and, in the authors’ estimation, did not justify the
creation of separate samples and distributions based on grade or sex.
The SIMS Behavior Observation Codes AET code and some of the PSB
code categories, however, showed substantial age (AET) and/or sex differences (e.g., participation, social engagement, as well as positive and
negative social interaction). Thus, separate distributions were calculated
by age and sex of student on these variables for externalizers, internalizers, and nonranked participants. (See the SSBD Administrator's Guide and SSBD Observer Training Manual for tables resulting from these distributions.)
The authors were able to obtain data on the demographic and socioeconomic status characteristics for 12 of the 17 school districts participating in the standardization sample development effort. Table 2 displays
this information by total school district enrollment, total number and
proportion of non-white students, and the total proportion of students
coming from low-income homes.
Non-White proportions of the school population across these districts
ranged from less than 1% to 33%. The proportion of students coming
from low-income families ranged from 4.3% to 40%. Across school districts in the standardization sample, both non-White and low-income
student status appeared to be broadly represented.
Table 2 SSBD Standardization Sample Demographic Characteristics

                    Total         Total Non-White           Total Low-Income
State/District      Enrollment    Enrollment       %        Enrollment        %
Oregon
  Springfield       —             —                3.0      —                 33.0
  Park Rose         1,254         179              14.3     —                 35.0
Kentucky
  Fayette           17,686        —                24.6     —                 25.9
  Ohio              4,157         55               1.3      1,524             37.0
  Owen              —             —                <1.0     —                 40.0
  Henderson         1,732         —                7.3      —                 24.4
Illinois
  SASED             5,354         625              12.0     12                4.3
  Dist. #33         2,026         590              29.0     468               23.0
  Dist. #34         239           5                <1.0     0                 —
  Dist. #25         473           1                <1.0     25                5.0
Utah
  Granite           76,799        6,374            8.3      12,364            16.1
  Jordan            62,281        3,346            5.4      6,237             10.6
Washington
  Tacoma            29,268        9,671            33.0     11,414            39.0
  Peninsula         7,064         376              5.3      805               12.3
Florida             62,778        25,738           31.0     —                 —
Rhode Island        5,659         1,129            20.0     2,175             38.0
Wisconsin           879           10               1.1      59                6.7
Standard score and percentile distributions of externalizing, internalizing, and nonranked student cases are presented and discussed in the
SSBD Administrator’s Guide. Cutoff scores based on these distributions
are used as decision criteria for determining whether individual students meet criteria for risk at Stage 2 and may benefit from additional
assessments, referral, certification, and access to needed supports and
specialized services. Complete instructions for making these decisions
are contained in the SSBD Administrator’s Guide.
National Standardization Sample: Prekindergarten and Kindergarten
The normative sample for prekindergarten and kindergarten was developed as part of the Early Screening Project (ESP). The sample consisted
of 2,853 children, aged 3 to 6 years old, who were enrolled in typical and
specialized programs from 1991 to 1994. Because the SSBD uses a gating
procedure and a comparison group, a decreasing number of children
participated across stages. Of the 2,853 children beginning in Stage 1,
1,401 (49%) moved to Stage 2 and 541 (19%) were assessed with the SIMS
Behavior Observation Codes.
The participating children were from preschool and kindergarten classrooms in the following states: California (n = 517), Kentucky (n = 687),
Louisiana (n = 386), Nebraska (n = 65), New Hampshire (n = 25), Oregon
(n = 220), Texas (n = 612), and Utah (n = 341). The specialized preschools
included programs for children identified as having serious emotional/
behavioral disorders, having developmental and language delays, and
living in families with low incomes (Head Start). The sample consisted
of 46% females and 54% males, with most of the children not eligible for
Special Education services (78%). Of those who did qualify for Special
Education services, 2% were eligible under the behavioral disorder category, 14% under developmental or language delay, and 6% under other
categories (e.g., at risk and other health impaired). Sixty-nine percent
of the children were White (as reported by their teachers), with 16%,
12%, and 3% reported as Hispanic, Black, and Native American or Asian,
respectively. Family income (as reported by teachers) was “middle” income ($15,000–$75,000/year) for 39% of families, and a substantial portion (58%) were reported to be “low” income (less than $15,000/year or Head
Start eligible). Of the 1,304 families with low incomes, 974 had children
enrolled in Head Start. Community size was 10% urban (over 1 million),
6% semi-urban (between 250,000 and 1 million), 21% suburban, and 63%
rural (less than 100,000).
Table 3 uses data from the Early Screening Project and concurrent measures collected over a 3-year period (from September 1991 through June
1994). This research involved separate but related studies conducted to replicate and extend findings on the reliability and validity of the instrument with preschool and kindergarten students.
Table 3 Number and Age of Children in the ESP Normative Sample

Age            Stage 1   Stage 2   SIMS Behavior Observation Codes
Not reported   140       61        5
3 years old    260       137       61
4 years old    1,463     721       278
5 years old    915       448       179
6 years old    75        34        18
Total          2,853     1,401     541
SUPPLEMENTAL SSBD NORMS
In the last several years, the SSBD authors have been able to recruit new
research as well as practice SSBD data and results from a series of ten
sites and colleagues from across the United States. A number of professional colleagues have developed substantial databases from conducting
research studies in which the SSBD was used as a study measure or
was the focus of the research. We find that SSBD data from numerous
research studies are a close match to our original norms when they are
collected in exactly the same fashion and under similar conditions.
This result argues for the relevance of the original norms in decision
making regarding today’s students as such normative student profiles
have remained stable across school years.
Based on the stability of the SSBD normative behavior levels, it is justifiable to retain the original cutoff points when using the SSBD as a
universal screener (i.e., decision rules and risk criteria cutoffs at Stage
2) and as a determinant for optional, additional screening and/or access
to supports and intervention services. The new supplemental norms of
6,743 cases for externalizers and internalizers, generated for students
who do and do not meet Stage 2 risk criteria, provide important and
highly consistent benchmarks across regional sites for evaluating the
behavioral status of today’s students.
A presentation of these updated norms is provided in Appendix A: Normative Comparisons: SSBD Original Norms and Updated Supplemental
Normative Databases.
SSBD INSTRUMENT DEVELOPMENT PROCEDURES
Construction and testing of the measures that make up SSBD screening
Stages 1 and 2 and the SIMS Behavior Observation Codes are described
herein. Research on the SSBD’s development has been conducted
in three phases. In Phase 1, research efforts were focused on the initial development and testing of SSBD instruments, definitions, and
response formats of measures used across the screening stages. These
efforts occurred over a 1-year development period. In Phase 2, 4 years of
research and development were devoted to validation and field testing of
the developed measures. Lastly, Phase 3 is characterized by research and
applied work in extending the SSBD to other populations and settings.
Phase 1: Initial Development of SSBD Instruments
Stage 1 Instruments
Three separate versions of the SSBD Stage 1 definitions and rating formats were investigated and evaluated prior to selection of those included
in the final version of the SSBD. Each prototype version was trial-tested
with teachers and aides in elementary classrooms in school districts in
Oregon and Washington. Three criteria were used in evaluating these
prototype versions:
•• Interrater Reliability: The degree to which teachers with identical
amounts of exposure to the same students agree in their rank
orderings of them on externalizing and internalizing behavioral
dimensions.
•• Test-retest Reliability: The extent to which teacher rankings of
students are stable over time.
•• Sensitivity: The accuracy of the procedures in identifying students
in general education classrooms who had been previously certified
by a child study team as having behavioral disorders.
These criteria guided revisions of the SSBD Stage 1 instruments and
procedures during the development process. Their application is
described below.
Interrater Agreement
Initial testing of the first prototype version of the Stage 1 definitions
and rank ordering procedures yielded Spearman rank order correlations
(rhos) for teacher-teacher and teacher-aide rater pairs, ranging from .60
to .94 for the externalizing rank-ordering dimension and from .35 to
.72 for the internalizing dimension. Although these agreement levels
were promising, they were not sufficiently high to achieve the authors’
Stage 1 goal of reliably identifying students potentially at risk for externalizing and internalizing behavior disorders. Stage 1 is arguably the
most important of the SSBD screening stages because it determines
which nominated students are included in subsequent screening stages
and thereby qualify for further assessment(s), possible referral, and
access to supports and services. Consequently, the Stage 1 procedures
were revised to achieve greater behavioral specificity and precision in
the externalizing and internalizing definitions and to also simplify the
ranking procedure.
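To make the agreement statistic concrete, the following is a minimal sketch, in Python, of how a Spearman rank-order correlation between two raters' Stage 1 rank orderings can be computed. The rankings shown are hypothetical and are not data from the study.

```python
# Illustrative sketch only: Spearman rho between two raters' rank orderings
# of the same students (hypothetical rankings, not study data).
from scipy.stats import spearmanr

teacher_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # teacher's externalizing ranks
aide_ranks = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]     # aide's ranks for the same students

rho, p_value = spearmanr(teacher_ranks, aide_ranks)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```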
The revised version of the Stage 1 procedures improved interrater agreement for the externalizing dimension but reduced it for the internalizing
dimension. Spearman rhos, computed for two teachers and an aide on
the externalizing dimension, ranged from .89 to .99; on the internalizing
dimension, however, the range was a disappointing .11 to .28. The internalizing rhos and feedback from participating teachers indicated that
the internalizing definition was still ambiguous and lacked sufficient
clarity for effective use.
Following this phase, the internalizing definition was rewritten a second
time to provide additional behavioral specificity. This version of the
Stage 1 procedures was then trial tested using eight rater pairs (teacher-teacher and teacher-aide pairs). Noticeable improvements in agreement levels were
obtained. Spearman rhos for participating teachers across the two sites
ranged from .89 to .94 for the externalizing dimension and from .82 to
.90 for the internalizing dimension. The agreement levels achieved were
considered acceptable for achieving Stage 1 screening goals.
Test-Retest Reliability
A series of studies was conducted, also in the Oregon and Washington
sites, on the temporal stability of the revised version of the Stage 1 procedures. Ten teachers participated in these studies, and temporal interval
lengths ranged from 10 days to 1 month. Test-retest estimates over these
time intervals ranged from .81 to .88 for the externalizing dimension
and from .74 to .79 for the internalizing dimension. These estimates, in
the authors’ view, met acceptable standards of temporal stability.
Sensitivity
To test the sensitivity of the Stage 1 procedures, nine general education
teachers in kindergarten through sixth grade were identified in whose
classrooms ten certified students with behavioral disorders (BD) had
been placed previously by the school district. These teachers were not
informed of the purpose of the study but were asked simply to complete the Stage 1 ranking procedures using all students enrolled in their
classrooms. It was assumed that if the SSBD were sensitive to behavioral
differences among students enrolled in least restrictive environment
(LRE) settings, previously certified BD students would be ranked high
relative to other students on the Stage 1 externalizing and internalizing
behavioral dimensions. This proved to be the case. The Stage 1 procedures identified nine of the ten BD students as being within the highest
three ranks on the externalizing dimension; the remaining pupil was
ranked fifth on the internalizing dimension.
Stage 2 Instruments
The SSBD Stage 2 screening instruments (Critical Events Index and
Combined Frequency Index of Adaptive and Maladaptive Behavior)
were developed from prototype item lists contributed by Walker and his
colleagues (Hersh & Walker, 1983; Walker, 1982; Walker, Reavis, Rhode
& Jenson, 1985). The items that made up these three lists had been trial
tested extensively in prior studies, refined and socially validated by both
regular and special education teachers as measures of teacher behavioral
standards and academic expectations for general education students
(Walker, 1986; Walker & Rankin, 1983). Additional items included in the
Critical Events Index (CEI) prototype list were based on externalizing
and internalizing dimensions as conceptualized by Achenbach & Edelbrock (1979) and Ross (1980). These lists were informally trial tested in
the Oregon site using a sample of 15 cooperating elementary teachers
who rated the behavioral status of randomly selected students on them.
Feedback from teachers regarding the items and inspection of means
and variances were used as a basis for revising these items.
SIMS Behavior Observation Codes
The observation codes used in conjunction with SSBD Stages 1 and 2
were derived from codes developed by Walker and colleagues for recording pupil behavior within instructional and playground settings in prior
research (Walker, Hops, & Greenwood, 1984). These two codes were
trial tested extensively in school settings during the 1984–85 school
year as part of a related research study (Shinn, Ramsey, Walker, Stieber,
& O’Neill, 1987). Observer training times required for mastery of these
codes were relatively brief. Interobserver agreement ratios were consistently in the .90 to .99 range for the Academic Engaged Time (AET) code
and in the .78 to .90 range for the Peer Social Behavior (PSB) code during
their testing and refinement.
In Shinn et al. (1987), the AET code powerfully discriminated between a group of 39 antisocial and 41 at-risk control fifth-grade boys. The antisocial students averaged 70% academic engagement during structured classroom observations, while the at-risk controls averaged 83%.
Results of these instrument development procedures indicated that the
SSBD measures appeared to have sufficient levels of reliability, sensitivity, and content validity to justify efforts for systematically investigating
their psychometric characteristics and including them in the overall
SSBD system. In 1986, the authors were awarded a 3-year field-initiated
research grant from the U.S. Office of Special Education Programs to
support these research efforts. This grant made it possible to study the
psychometric properties of the SSBD measures extensively, to field-test
the SSBD system, and to collect normative data on the Stages 2 and 3
instruments within 8 states and 18 school districts across the United
States. Results of Phase 2 of the SSBD research and development process,
supported by this external funding, are described in the next three sections under the headings of Trial Testing, Validation, and Field Testing.
Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments
Trial Testing
A year-long study designed to trial test the instruments comprising
the SSBD was implemented during the 1985–86 school year. The study
posed a number of crucial questions regarding the reliability and validity
of teacher judgments and the psychometric characteristics of the instruments comprising each of the SSBD assessment stages. This year-long
trial test of the SSBD within a Springfield, Oregon, elementary school
had two major goals:
•• To evaluate the psychometric characteristics of the instruments
used at each SSBD screening stage
•• To evaluate teacher accuracy in identifying, via the SSBD Stage
1 ranking procedures, contrasted groups of students (i.e., high-ranked
students) who would be expected to behave differently from each
other within instructional and free-play settings.
This study also assessed teachers’ general acceptance of the screening
procedures in terms of their perceived value and consumer satisfaction.
In this regard, the authors were particularly interested in how long it took
to implement the Stage 1 and 2 procedures as well as their ease of use.
Participants in this study were 18 teachers assigned to grades 1 through
5 in a cooperating elementary school located in Springfield, Oregon, and
students enrolled in their classes (N = 454). All 18 teachers individually
completed the SSBD Stage 1 and 2 assessment procedures on two occasions 31 days apart. In Stage 1, teachers nominated two mutually exclusive lists of students whose characteristic behavior patterns were best
represented respectively by the externalizing or internalizing behavioral
definitions (n = 10 each). Next, each teacher rank ordered the students
within both lists in terms of the degree to which their characteristic
behavior patterns matched the appropriate behavioral profile (i.e., externalizing or internalizing).
It was necessary for the pupil lists to be identical at both Stage 1 ranking
occasions in order to assess the test-retest stability of teacher rank orderings using Spearman rhos. Therefore, after the participating teachers
had completed their Time 2 rank orderings, the pupil membership of
the externalizing (n = 10) and internalizing (n = 10) lists for Times 1 and 2 was compared. Teachers whose Time 2 lists differed from their initial lists were given their original Time 1 lists (in scrambled order) for the externalizing and/or internalizing behavioral dimensions and asked
to re-rank this list of students.
Teachers were then asked to rate the top three ranked externalizers and
top three internalizers from their Stage 1 lists on the Stage 2 Critical
Events Index and the Combined Frequency Index. These procedures were
completed in the same manner at both Time 1 and Time 2 (one month
follow-up interval). The teacher form of the Child Behavior Checklist
(Achenbach & Edelbrock, 1979) was also completed by the classroom
teacher for the three top ranked externalizing and internalizing students
in each class following completion of the second set of SSBD rankings
and ratings. In addition, the Stage 2 Combined Frequency Index was
completed on two students from each teacher’s classroom (n = 33) who
did not appear on either the externalizing or internalizing lists in SSBD
Stage 1. These students served as normative controls for both Stage 2 and
Stage 3 assessments.
Following completion of the Time 1 and Time 2 teacher ranking/rating
tasks, a sample of students was selected from each of the 18 classrooms
for direct observation within instructional and free-play settings using
the SIMS Behavior Observation Codes. From each classroom, parental
permission was sought to observe four students: one externalizer, one
internalizer, and two unselected, nonranked students who served as
controls. The externalizing and internalizing participating students
were those who had been ranked highest on these behavioral dimensions across both ranking occasions. Letters of consent were sent first to
parents of the students with the highest average rankings across Times 1
and 2; if consent was denied, consent was then sought for observation of
the student with the second highest average ranking. Signed permission
forms were returned for 16 externalizers (8 first choices and 8 second
choices) and 15 internalizers (6 first choices and 9 second choices). Parental consent was not sought for students who ranked lower than second
on either the externalizing or internalizing dimensions. Thirty-three of
36 consent forms were signed and returned for the unranked control students. Thus, a total of 64 students were observed on the SIMS Behavior
Observation Codes (33 controls, 16 externalizers and 15 internalizers).
Each student from the three groups was observed on four occasions,
twice under seatwork conditions in the regular classroom setting and
twice under regular recess conditions on the playground. Observers were
uninformed as to the group membership (externalizing, internalizing, or
control) of any of the study participants.
Classroom observations were 15 minutes in length, and these sessions
were recorded only during reading, mathematics, social studies and
language periods. Whenever possible, observations were conducted
during independent seatwork periods and no data were collected during
teacher-led activities and classroom periods involving group unison
responding, such as those used in direct instruction formats. Observers
were provided with a stopwatch that was allowed to run whenever the
target student was academically engaged, i.e., attending appropriately to
academic materials and tasks, making appropriate motor responses, and
requesting teacher assistance with academic tasks. Whenever the target
student was not academically engaged (e.g., disturbing others, talking
out, off task, out of seat and so forth), the stopwatch was stopped and
remained off until the student resumed being academically engaged.
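As a minimal illustration of how the AET measure is scored, the sketch below converts hypothetical stopwatch time into a percentage of the observation session; the values are invented for the example.

```python
# Illustrative sketch only: percent Academic Engaged Time (AET) from
# stopwatch data (hypothetical values).
session_seconds = 15 * 60    # 15-minute classroom observation
engaged_seconds = 620        # total time the stopwatch was running

percent_aet = engaged_seconds / session_seconds * 100
print(f"AET = {percent_aet:.1f}% of the session")
```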
Playground observations were scheduled for 15 minutes each but were
sometimes shorter because recess periods did not always last for this
length of time. Observations were conducted only under regular recess conditions, using a partial-interval coding procedure; coding did not take place during playground activities that were actively led or controlled by an adult. Observers coded the target student's playground social behavior according to the following guidelines (a coding sketch follows the list).
•• Only one code category could be recorded during any given
observation interval.
•• The category of social engagement overrode all other categories. If
any social engagement was observed, the participant’s behavior was
coded as socially engaged (SE) for that interval.
•• If the target student changed activities during an interval, the
activity that occurred for most of the interval was coded.
•• All other code categories overrode the no codeable response (NC)
category in the recording process. NC was coded only if no other
category could be determined.
•• The student’s behavior was coded as negative if any negative
behavior occurred during the interval.
Five graduate student observers were trained on the SIMS Behavior
Observation Codes by the authors’ colleagues. Each observer received
from 3.5 to 5 hours of direct, supervised training distributed across
the classroom and playground codes. The first three training sessions
occurred in a simulation training setting where observers practiced
recording while viewing videotapes of classroom and playground behavior. The final two training sessions occurred in naturalistic classroom
and playground settings.
Reliability criteria for the termination of training were two consecutive
sessions of a minimum of 90% agreement on the AET code and two
consecutive sessions of a minimum of 70% agreement per session, with a
mean of 80% or greater agreement on the PSB code. Observer agreement
for the AET code was calculated by dividing the larger amount of time
on the stopwatch recorded by one observer into the smaller amount
recorded by the other observer and multiplying by 100. Agreement on
the playground code was determined by dividing the number of intervals
in which there was complete agreement among the observer pair by the
total number of intervals observed and multiplying by 100.
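A minimal sketch of the two agreement computations described above, using hypothetical values.

```python
# Illustrative sketch only: observer agreement for the AET and PSB codes
# (hypothetical values).

def aet_agreement(engaged_sec_obs1, engaged_sec_obs2):
    """Smaller recorded engaged time divided by the larger, times 100."""
    smaller, larger = sorted([engaged_sec_obs1, engaged_sec_obs2])
    return smaller / larger * 100

def psb_agreement(codes_obs1, codes_obs2):
    """Percent of intervals on which both observers recorded the same code."""
    agreements = sum(a == b for a, b in zip(codes_obs1, codes_obs2))
    return agreements / len(codes_obs1) * 100

print(aet_agreement(610, 655))                                          # about 93.1
print(psb_agreement(["SE", "P", "A", "NC"], ["SE", "P", "PLP", "NC"]))  # 75.0
```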
A total of 256 observations of students was completed in classroom and playground settings. Reliability checks were conducted by a colleague
who served as the observer trainer/calibrator during 40 of these observation sessions (15.6% of total observations). The five observers conducted
a total of 224 observation sessions while the trainer/calibrator conducted
the remaining 32.
Results for the SSBD trial test are described below by each assessment
stage. Correlations are reported between selected Stage 2 measures,
SIMS Behavior Observation Codes outcomes, and the Achenbach Child
Behavior Checklist (CBC). Further, additional analyses are reported that
assess the combined effects of Stage 2 measures and SIMS Behavior
Observation Codes outcomes in correctly classifying students assigned
to the three participant groups (externalizers, internalizers, controls) by
teacher rankings in SSBD Stage 1. Finally, score differences for males
and females across the three participant groups are reported on selected
measures.
Stage 1 Instruments
The gender ratio of students nominated by participating teachers to
form the externalizing and internalizing groups differed markedly. For
example, at ranking Time 1, there were 46 males and 8 females in the
externalizing group and 27 males and 27 females in the internalizing
group. These proportions were nearly identical for the Time 2 ranking
occasion, with 45 males and 9 females in the externalizing group and 25
males and 29 females in the internalizing group.
The test-retest stability of teacher rankings was assessed in two ways.
First, stability was measured by determining the percentage of students
who were placed into the same participant groupings (externalizing,
internalizing) by their teachers on the two ranking occasions. A statistically significant relationship between teachers’ classifications of students
on the externalizing and internalizing behavioral dimensions across
ranking occasions was obtained. The proportions of identical students comprising these participant groups from Time 1 to Time 2 exceeded chance expectations. Results indicated that of the 168 students who were classified by teachers as externalizers at Time 1, 130, or 77%, were so classified one month later. A chi-square analysis indicated that this result was significant (p < .001). Further, of the 51 students ranked
among the top three externalizers by each teacher at Time 1, 35, or 69%,
also were ranked in the top three at ranking Time 2. For the internalizers, 132 of 165, or 80%, were classified as members of the same group on
both ranking occasions (p < .001). A similar proportion of students (69%)
ranked in the top three internalizers across the two ranking occasions.
The second method of assessing the stability of teacher rankings in Stage
1 involved computing Spearman rank order coefficients (rhos) between
the Time 1 and Time 2 data sets for each teacher. This analysis produced
34 rho coefficients (one classroom was excluded from this analysis
because the teacher changed between the two data collection occasions).
Across the 17 remaining teachers, these rho coefficients ranged from .33
to .98 for the externalizing dimension and averaged .76. For the internalizing dimension, the range was from .45 to .94 and averaged .74. Table
4 below contains test-retest rhos over a 1-month period for individual
teachers on the externalizing and internalizing dimensions.
Table 4 Test-Retest Stability Coefficients for Individual Teachers on the Stage 1
Rank-Ordering Procedures

Teacher   Externalizing Dimension   Internalizing Dimension
1         .81                       .73
2         .78                       .66
3         .82                       .74
4         .67                       .87
5         .89                       .73
6         .59                       .82
7         .92                       .57
8         .49                       .70
9         .96                       .81
10        .83                       .72
11        .77                       .94
12        .73                       .78
13        .87                       .45
14        .72                       .69
15        .84                       .87
16        .98                       .76
17        .33                       .67
Only two teachers on the externalizing dimension and one teacher on
the internalizing dimension had stability coefficients of less than .50 for
their rank orderings of students over a 1-month period. Some teachers
had substantial discrepancies in their stability coefficients between the externalizing and internalizing dimensions; however, there was no systematic pattern to such discrepancies.
Stage 2 Instruments
The stability of the Combined Frequency Index (CFI) Adaptive and
Maladaptive Behavior Scales was assessed across rating occasions for
the three highest ranked students on the externalizing and internalizing lists. Internal consistency of the scales was assessed at both rating
time points. These analyses were not conducted for the Critical Events
Index due to the scoring system used (1 or 0) and to the extremely low
frequencies of positively checked events for all three student groups.
Pearson correlations were computed for the top-ranked externalizers and
top-ranked internalizers in order to assess the stability of teacher ratings
over a 1-month period. Students in the externalizing and internalizing
groups were combined for this analysis. The resulting correlations were
.88 for the Adaptive Behavior Scale and .83 for the Maladaptive Behavior
Scale between the Time 1 and Time 2 ratings. These correlations may be
inflated, however, due to the combined influence of sex, grade, and group
membership factors and should be interpreted cautiously. An inspection
of the raw data for the two scales indicated a normal distribution for the
Adaptive Behavior Scale; for the Maladaptive Behavior Scale, there was a
generally normal distribution with a slightly positive skew.
Internal consistency analyses (coefficient alpha) were conducted on
both CFI scales. For the Adaptive Behavior Scale, alpha was .85 and .88,
respectively, for the two rating occasions. For the Maladaptive Behavior
Scale, the comparable figures were .82 and .87.
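For reference, coefficient alpha can be computed from a respondents-by-items score matrix as in the minimal sketch below; the rating matrix is hypothetical and is not study data.

```python
# Illustrative sketch only: coefficient alpha (internal consistency) for a
# rating scale, from a hypothetical students-by-items score matrix.
import numpy as np

def cronbach_alpha(scores):
    """scores: 2-D array with rows = students and columns = scale items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

ratings = np.array([[3, 4, 3, 5],
                    [2, 2, 3, 2],
                    [5, 4, 5, 4],
                    [1, 2, 1, 2]])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```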
Item analyses were conducted on the Adaptive and Maladaptive Behavior Scales to determine which items correlated positively with total
scale scores. Table 5 presents item-total correlations for the Adaptive
and Maladaptive Behavior Scales across the Time 1 and Time 2 rating
occasions.
Table 5 Item-Total Correlations for the Stage 2 Behavior Scales Across Time 1 and
Time 2 Rating Occasions

       ADAPTIVE BEHAVIOR          MALADAPTIVE BEHAVIOR
Item   Time 1    Time 2     Item   Time 1    Time 2
1      0.66      0.67       1      0.64      0.68
2      0.68      0.67       2      −0.24     −0.12
3      0.59      0.59       3      0.62      0.69
4      0.48      0.66       4      0.57      0.57
5      0.64      0.70       5      0.69      0.87
6      0.62      0.75       6      0.60      0.62
7      0.46      0.69       7      0.60      0.75
8      0.68      0.70       8      0.71      0.69
9      0.67      0.50       9      0.17      0.22
10     0.16      0.27       10     0.65      0.70
11     0.61      0.72       11     0.54      0.59
12     0.06      0.02
Across rating occasions and scales, the item-total correlations ranged from −.24 to .87. The deletion of one item each in the Adaptive and Maladaptive Behavior Scales would have increased alpha; deletion of any other items would have either lowered alpha or left it unchanged.
One item each in the CFI Adaptive and Maladaptive Rating Scales was
subsequently revised to improve clarity and ratability.
Table 6 contains means and standard deviations for the three participant
groups on the Stage 2 measures for the Time 2 rating occasion. Data
on the Stage 2 rating scales for all three participant groups, controls
included, were recorded only at rating Time 2.
Table 6 Means, Standard Deviations, and ANOVAs for the Three Participant
Groups on the Stage 2 Instruments

Variable                    Externalizers   Internalizers   Controls     F Ratio   p Value
Adaptive Behavior           M = 36.38       M = 44.50       M = 54.68    40.56     <0.01
Rating Scale                SD = 6.16       SD = 7.69       SD = 4.37
Maladaptive Behavior        M = 29.61       M = 18.16       M = 13.71    50.90     <0.01
Rating Scale                SD = 6.90       SD = 4.83       SD = 3.30
Critical Events Index       M = 1.72        M = 1.57        —            —         —
Range of Critical Events    0–6             0–5             —            —         —
The mean differences on the CFI Adaptive and Maladaptive Behavior
Scales among the three participant groups were highly significant. These differences were also in the predicted direction, with controls, internalizers, and externalizers ordered from most to least adaptive and from least to most maladaptive.
The incidence of positive occurrences on the two SSBD Critical Events
Indices was extremely low for both the externalizing and internalizing
participant groups. A Critical Events Index was not completed by participating teachers for control participants.
Correlations were computed between Adaptive and Maladaptive
Behavior Scales and the externalizing and internalizing subscales of
the Achenbach Child Behavior Checklist (CBC) in order to assess the
concurrent validity of the Stage 2 instruments. For the Adaptive Behavior Scale, the correlations with the CBC externalizing scale at rating
Times 1 and 2 were −.63 and −.68 (p < .001); for the Maladaptive Behavior Scale, these correlations were .81 and .77 (p < .001). Correlations
between the SSBD Adaptive Behavior Scale and the CBC internalizing
scale were .22 and .01 for the Time 1 and Time 2 ratings, respectively;
correlations were not computed between the CBC internalizing scale
and the SSBD Maladaptive Behavior Scale due to the behavioral content
differences between these two scales. Neither of the correlations with
the internalizing scale was significantly different from zero.
SIMS Behavior Observation Codes
Of the 64 students observed with the SIMS Behavior Observation
Codes, there were 16 externalizers, 15 internalizers, and 33 controls.
Fourteen were in first grade, 16 were in second grade, 11 were in third
grade, 13 were in fourth grade, and 10 were fifth graders. There were 36
males and 28 females. Again, there were gender proportion differences
by participant group. The 16 externalizers consisted of 12 males and 4
females; the 15 internalizers consisted of 8 males and 7 females. Controls
consisted of 16 males and 17 females.
Reliability estimates were calculated on the AET and PSB codes by
computing interobserver agreement coefficients between the observer
trainer/calibrator and each study observer. The mean agreement level for
the 19 AET reliability checks was .96 with a range from .86 to 1.00. The
mean agreement level for the 21 reliability checks on the PSB code was
.84 and ranged from .65 to 1.00.
Table 7 presents means and standard deviations on measures derived
from the AET and PSB behavior observation codes. Significance levels
for mean differences among the participant groups are also reported for
ANOVAs conducted on each measure.
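A minimal sketch of a one-way ANOVA of this kind, comparing the three groups on a single observation measure, is shown below; the score lists are hypothetical.

```python
# Illustrative sketch only: one-way ANOVA comparing three participant groups
# on a single observation measure (hypothetical scores).
from scipy.stats import f_oneway

externalizers = [48, 55, 60, 42, 51]
internalizers = [66, 70, 61, 72, 68]
controls = [74, 69, 71, 77, 68]

f_ratio, p_value = f_oneway(externalizers, internalizers, controls)
print(f"F = {f_ratio:.2f}, p = {p_value:.4f}")
```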
Table 7 Means, Standard Deviations, and ANOVAs for the Three Participant
Groups on Classroom and Playground Observation Measures

                               Externalizers    Internalizers    Controls
Variable                       M       SD       M       SD       M       SD       F      p Value
Academic Engaged Time (AET)    53.88   16.53    68.20   13.25    71.56   11.87    9.61   <0.01
Socially Engaged (SE)          28.22   19.85    28.36   19.51    39.25   20.78    2.34   0.10
Parallel Play (PLP)            21.09   19.34    25.63   22.34    23.38   16.64    0.22   0.80
Participation (P)              31.53   30.54    13.40   25.15    18.86   22.81    2.15   0.13
Alone (A)                      13.70   15.26    20.66   27.95    5.87    9.69     4.16   0.02
No Code (NC)                   0.59    1.50     0.93    1.48     0.84    1.56     —      —
Positive Behavior              54.91   21.59    45.28   32.15    62.95   18.15    3.14   0.05
Negative Behavior              13.18   13.33    1.70    2.74     6.19    7.18     7.27   <0.01
Table 7 indicates that the following measures discriminated between the
three participant groups:
•• Academic Engaged Time
•• Alone
•• Total Positive Behavior
•• Total Negative Behavior
These observational measures were also correlated with the Achenbach
CBC externalizing and internalizing scales. Academic Engaged Time correlated significantly with the CBC externalizing scale, −.42 (p < .01). None of the SIMS Behavior Observation Codes categories correlated significantly with the CBC internalizing scale.
Discriminating Externalizers, Internalizers, and Controls
The Scheffé procedure for the analysis of mean differences was applied
to those Stage 2 measures and SIMS Behavior Observation Codes variables for which significant F ratios were obtained. Table 8 lists variables
on which at least one pair of participant groups differed at the .05 level
or beyond.
The results in Table 8 indicate that externalizers:
•• Were rated by teachers as engaging in significantly less adaptive
behavior than both internalizers and controls.
•• Were rated as significantly more maladaptive than either
internalizers or controls.
•• Spent less time academically engaged than internalizers
and controls.
The results in Table 8 indicate that internalizers:
•• Engaged in significantly less adaptive behavior than controls.
•• Produced significantly more maladaptive behavior than controls.
•• Spent significantly more time alone than controls.
Table 8 Scheffé Analysis of Mean Differences on Discriminating SSBD Stage 2 and
Observation Coding Variables

PAIRS OF PARTICIPANT GROUPS SIGNIFICANTLY DIFFERENT AT THE .05 LEVEL
(Group Means in Parentheses)

Adaptive Behavior Rating Scale
  Externalizers (36.38) and Controls (54.68)
  Internalizers (44.50) and Controls (54.68)
  Externalizers (36.38) and Internalizers (44.50)

Maladaptive Behavior Rating Scale
  Externalizers (29.61) and Controls (13.71)
  Internalizers (18.16) and Controls (13.71)
  Externalizers (29.61) and Internalizers (18.16)

SIMS Behavior Observation Codes: Academic Engaged Time (AET)
  Externalizers (53.88) and Controls (71.56)
  Externalizers (53.88) and Internalizers (68.20)

SIMS Behavior Observation Codes: Alone
  Internalizers (20.66) and Controls (5.87)
Efficiency in Classifying Participant Groups
A discriminant function analysis was conducted to determine the
number of study participants who could be correctly classified into their
respective participant groups (externalizers, internalizers, controls) on
the basis of their scores on SSBD Stage 2 measures and SIMS Behavior
Observation Codes outcomes. The Stage 2 variables entered in this
analysis were Adaptive Behavior Scale score and Maladaptive Behavior
Scale score. The SIMS Behavior Observation Codes variables used were
Academic Engaged Time, Social Engagement, Social Involvement, Parallel Play, Participation, Alone, Positive Social Interaction, and Negative
Social Interaction. Results of the discriminant analysis indicated that
89.47% of the study participants were correctly classified into their
respective participant groups on the basis of their SSBD Stage 2 scores
and SIMS Behavior Observation Codes variable scores. Of the 13 externalizers included in this analysis, one was misclassified as an internalizing student. Three of the 12 internalizers were misclassified as controls
and 2 of the 30 controls were misclassified, one as an externalizer and
one as an internalizer. Incomplete data on three externalizers and three
internalizers led to their deletion from this analysis.
It should be noted that with 3 groups, 55 participants, and 10 discriminating variables, the participants to variables ratio was quite low in this
analysis. Further, the achieved 89.47% correct classification rate was not
adjusted for the prior probabilities of group membership. There were
approximately twice as many controls as there were externalizers and
internalizers in this analysis.
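
As an illustration of this type of analysis, the sketch below applies a linear discriminant function to simulated data using scikit-learn. The feature values, group sizes, and the equal-priors setting are placeholders chosen for illustration rather than the study data; the structure only mirrors the analysis described above, including an explicit choice of prior probabilities.

    # A minimal sketch of a discriminant function analysis of the kind described
    # above. The feature matrix and labels are simulated placeholders; the original
    # study data are not reproduced here. Setting explicit priors illustrates the
    # prior-probability issue noted in the text.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    # Simulated scores: columns might stand in for Adaptive, Maladaptive, AET, Alone
    X = np.vstack([
        rng.normal([36, 30, 54, 14], 8, size=(13, 4)),   # externalizers
        rng.normal([45, 18, 68, 21], 8, size=(12, 4)),   # internalizers
        rng.normal([55, 14, 72, 6], 8, size=(30, 4)),    # controls
    ])
    y = np.array([0] * 13 + [1] * 12 + [2] * 30)

    lda = LinearDiscriminantAnalysis(priors=[1/3, 1/3, 1/3])  # equal priors rather than sample proportions
    lda.fit(X, y)
    pct_correct = 100 * (lda.predict(X) == y).mean()
    print(f"correct classification rate: {pct_correct:.2f}%")
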
A multiple regression analysis was conducted to determine the extent
that scores on SSBD variables that discriminated the participant groups
could predict group membership. The six variables that discriminated
the three participant groups were entered into this analysis. The multiple
correlation (R) between group membership and these six variables was
.849. In combination, these variables accounted for approximately 72%
of the variance between groups. Group membership was dummy coded
in this analysis.
A simultaneous regression analysis was also conducted to determine the
relative weights of the variables entered in the equation, thereby permitting an analysis of which variables were most effective in predicting
group membership. Table 9 shows the correlations between each predictor variable and group membership and corresponding beta weights.
These results indicate that virtually all of the variance in group membership could be accounted for by Maladaptive Behavior Scale score,
Adaptive Behavior Scale score, and scores on the Alone and Academic
Engaged Time variables from the SIMS Behavior Observation Codes.
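
The dummy-coding and beta-weight logic can be sketched as follows. All values are simulated placeholders; the point is only to show how standardized predictors yield beta weights and how the multiple R is obtained from the predicted scores.

    # A rough sketch of a simultaneous multiple regression with dummy-coded group
    # membership as the criterion, using standardized variables so the resulting
    # coefficients can be read as beta weights. Data are simulated placeholders.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 55
    X = rng.normal(size=(n, 6))                 # six discriminating predictors
    group = rng.integers(0, 3, size=n)          # 0 = externalizer, 1 = internalizer, 2 = control
    y = (group == 2).astype(float)              # one possible dummy coding of membership
    y_z = (y - y.mean()) / y.std()
    X_z = (X - X.mean(axis=0)) / X.std(axis=0)

    betas, *_ = np.linalg.lstsq(X_z, y_z, rcond=None)   # standardized (beta) weights
    y_hat = X_z @ betas
    R = np.corrcoef(y_hat, y_z)[0, 1]
    print("beta weights:", np.round(betas, 2))
    print(f"multiple R = {R:.3f}, R^2 = {R**2:.3f}")
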
Table 9  Correlations Between Predictor Variables and Group Membership and Corresponding Beta Weights

    Variable                                                        Correlation With Group Membership   Beta Weight
    Maladaptive Behavior Rating Scale                               −.79                                −.42
    Adaptive Behavior Rating Scale                                  .77                                 .35
    SIMS Behavior Observation Codes: Alone                          −.30                                −.19
    SIMS Behavior Observation Codes: Academic Engaged Time          .45                                 .11
    SIMS Behavior Observation Codes: Positive Social Interaction    .35                                 .04
    SIMS Behavior Observation Codes: Negative Social Interaction    −.24                                .04
Sex Differences
An analysis was conducted to determine which of the SSBD Stage 2 and
SIMS Behavior Observation Codes variables registered sex differences
across the three participant groups. Statistically significant mean differences among males and females were obtained for two of the playground
measures (Social Engagement and Participation in structured games and
activities) and teacher ratings on the Adaptive Behavior Scale in Stage 2.
The Social Engagement variable measures peer-to-peer social interactions
in free play contexts. Table 10 presents means and standard deviations on
these variables across the three participant groups by sex of student.
Table 10  Sex Differences on Stage 2 and SIMS Behavior Observation Codes Variables for Combined Participant Groups

    Variable                           Females (n = 28) M (SD)   Males (n = 36) M (SD)   Significance
    Social Engagement                  45.21 (17.06)             25.18 (16.88)           p < 0.01
    Participation                      9.65 (12.63)              29.39 (29.55)           p < 0.01
    Adaptive Behavior Scale Ratings    51.49 (5.81)              45.71 (7.27)            p < 0.03
The relatively small numbers of participants prohibited analysis of sex
differences within each of the participant groups.
Intercorrelations Among Stage 2 Measures and SIMS Behavioral
Observation Code Variables
Because variables from Stage 2 measures and the SIMS Behavioral
Observation Codes were treated as independent measures in the analyses reported above, the authors examined the intercorrelations among
these variables in order to assess the extent of their covariation. Table
11 shows intercorrelations among the Stage 2 measures and the SIMS
Behavioral Observation Codes reported for the SSBD trial test.
Table 11  Correlation Matrix for Stage 2 and SIMS Behavior Observation Codes Variables

    Variable                 Adaptive   Maladap.   AET     Social   Particip.   Parallel   Alone
                             Scale      Scale              Eng.                 Play
    Adaptive Scale           1.00
    Maladaptive Scale        −.60       1.00
    Academic Engaged Time    .02        −.21       1.00
    Social Engagement        .12        .15        −.16    1.00
    Participation            −.10       −.18       .24     −.40     1.00
    Parallel Play            −.27       .10        −.13    −.25     −.40        1.00
    Alone                    .21        −.16       .03     −.26     −.05        −.20       1.00
Inspection of the correlation matrix in Table 11 indicates the intercorrelations among these variables were in the low to moderate range. The
highest correlations were between teacher ratings of adaptive and maladaptive student behavior (r = −.60) and among the Social Engagement,
Participation, and Parallel Play code categories (r = −.40, −.40). Correlations
of this magnitude among these variables are not unexpected because (a)
teachers rated the same participants on both the Adaptive and Maladaptive Behavior Scale item lists, (b) the Positive Social Interaction category
subsumes both the Social Engagement and Social Involvement codes,
and (c) Social Interaction opportunities are severely restricted during
structured playground activities during which the Participation category
was coded. However, statistical significance for group differences on
any one of these moderately correlated measures could predict similar
outcomes on the others.
Overall, results of the initial trial testing of the SSBD were encouraging.
Estimates of reliabilities for the instruments comprising each of the
three SSBD screening stages were judged acceptable for their assessment
purposes. Both the test-retest stability of teacher rankings of students on
the final form of the Stage 1 procedures and interrater agreement levels
among pairs of teachers and/or teachers and aides were satisfactory and
provided a foundation for future research on the system. The accuracy
of the teachers’ classification of students, as indicated by the consistency
of students’ group membership (i.e., externalizing, internalizing) from
Time 1 to Time 2, was substantial.
However, while overall test-retest rhos for the Stage 1 ranking dimensions averaged .75, several teachers' coefficients were in the low .30s and .40s. Similarly, some teachers were not as consistent as others in the accuracy of
assigned group membership for students from Time 1 to Time 2. Overall,
however, the Stage 1 procedures allow teachers to identify behavior patterns that remain quite stable over periods of 1 month or less, a finding
that has important implications for the referral of students to special
education and related services.
Teachers participating in this initial trial testing of the SSBD, via their
Stage 1 ranking tasks, also validated findings from the professional literature on the differential representation of sex differences within externalizing and internalizing behavior patterns and disorders. In this study,
the ratio of boys to girls in the teacher-nominated Stage 1 sample was
nearly six to one for members of the externalizing group. For members
of the internalizing group, boys and girls each comprised about half of
the sample.
At Stage 2, the CFI Adaptive and Maladaptive Behavior Scales demonstrated acceptable internal consistency and short-term stability, with
all correlations in the mid to high .80s. As noted earlier, however, these
coefficients should be interpreted cautiously. Item-total correlations for
these scales were, with the exception of one item on each scale, adequate.
Coefficient alpha for the two scales was in the mid to high .80s and would
be improved with deletion of these items.
The SIMS Behavior Observation Codes proved to be highly reliable and
sensitive in discriminating the three participant groups in both
classroom and playground settings. The average interobserver agreement level for the AET code was .96 during the study. The PSB code
was somewhat less reliable than the AET code, perhaps because of the
greater complexity of peer social behavior and the uncontrolled stimulus
conditions of playground settings. However, this code demonstrated an
acceptable interobserver agreement level of .84 in this trial test. Leff and
his colleagues, in a comprehensive review of over 80 coding systems,
have rated the PSB code as one of the best coding systems available for
recording playground social behavior (see Leff & Lakin, 2005).
The direction of the observed behavioral differences for the externalizing,
internalizing, and control participants was consistent with the authors’
expectations based on empirical evidence presented in the literature.
Teachers’ ability to identify intact groups of students, using the Stage
1 ranking procedures, and the clear differentiation of these groups on
teacher ratings and direct observational measures recorded by professionally trained observers, serves to validate both teacher judgment and
the viability of the SSBD approach.
The construct validity of the SSBD rests upon the bipolar externalizing-internalizing behavioral classification of Achenbach (1978) and Ross
(1980) and the assumption that it is possible to reliably differentiate
externalizers and internalizers from each other, and both behavior
patterns from nonranked control students. The discriminant function
analysis conducted in this study addressed this question. Collectively,
the Stage 2 measures and SIMS Behavior Observation Codes variables
were efficient in correctly classifying students whose group membership
was based on teacher assignments in Stage 1. As noted, these measures
correctly classified 89.47% of the study participants. Four variables in
combination accounted for 72% of the variance in determining this
group membership (i.e., Maladaptive Behavior Scale score, Adaptive
Behavior Scale score, and SIMS Behavior Observation Codes scores for
Academic Engaged Time and Alone). This level of overall precision in
separating students into identifiable groups spoke well for the continued
development of the SSBD.
Overall, results of the initial trial testing of the SSBD system and its
component measures were quite encouraging and provided a basis for
the design of a series of more extensive validation studies designed to
investigate a range of validity types and psychometric characteristics of
the SSBD instruments under field test conditions. Descriptions of and
results from these studies are reported in the next section.
SSBD Field Testing and Replication
The SSBD has been formally field tested within six sites across the
country. These sites were located in the states of Oregon, Utah, Illinois,
Wisconsin, Kentucky, and Rhode Island. SSBD Stage 1 and 2 measures
were recorded in all six of these field sites. The SIMS Behavior Observation Codes were administered in all of these sites except Wisconsin. The
authors and their colleagues conducted on-site training of school district
personnel involved in the field testing process prior to initiation of any
data collection within each site. Attempts were made to field-test the
SSBD in these sites under conditions that would approximate as closely
as possible those that would exist under normal conditions of screening
and SSBD usage. The adherence to a minimum set of research requirements across these sites no doubt attenuated achievement of this goal to
some degree. However, these requirements were necessary to produce
comparable, reliable, and generalizable data across these sites.
Formal training in administration of the SSBD procedures and dealing
with logistics involved in meeting field-test research requirements were
usually accomplished within a single day; however, the training of school
personnel as reliable observers on the SIMS Behavior Observation Codes
procedures generally required a second day of training and supervised
practice using the codes within in vivo settings. In some cases, follow-up
visits were made to field-test sites to conduct additional training, coordinate the monitoring of observer cadres, and assist with the calibration
of interobserver agreement indices. Telephone contacts were maintained
with field-test sites throughout the field-testing process, which spanned
periods ranging from 4 to 8 months. A supervisor/coordinator was identified within each field-test site to monitor and troubleshoot problems
that arose during the field-testing process. These field-testing activities
were supported by a 3-year, field-initiated research grant from the U.S.
Office of Special Education Programs.
Field-test results from these sites allowed for intersite replication of
SSBD procedures and outcomes, and were quite consistent across sites.
In addition, data and results from these sites were included in the SSBD
national standardization sample for the Stage 2 (N = 4,463) measures
and the SIMS Behavior Observation Codes (N = 1,219).
A formal replication of both the SSBD’s implementation and the results
of its initial field testing, as reported by Walker, Severson, Stiller, Williams, Haring, Shinn, and Todis (1988), was conducted by Nicholson
(1988) during the 1987–88 school year and is also reported in Walker,
Severson, Nicholson, Kehle, Jenson, & Clark (1994). This systematic
replication was supported by a grant from the Utah Office of Special
Education. The replication effort spanned the entire 1987–1988 school
year and was conducted within the Jordan School District, which serves
the suburbs of Salt Lake City, Utah. Though the full range of SES levels
is represented in this district, it serves primarily a middle-class population.
Three elementary schools within the Jordan School District participated in the SSBD replication study. Participants involved in the SSBD
Stage 1 screening procedures were 1,468 students and their respective
teachers (n = 58) in grades 1–5 within these three elementary schools. At
SSBD screening Stage 2, participants consisted of 475 students in grades
1–5 who were selected from Stage 1 based on their teachers’ rankings.
Participants who were observed in classroom and playground settings
with the SIMS Behavior Observation Codes were 225 students: the top-ranked externalizer, the top-ranked internalizer, and two
nonranked students selected from each participating classroom.
A total of 900 observations of a minimum 12 minutes’ duration were
recorded on these participants in classroom and playground settings
on two occasions each. Classroom observations were recorded during
independent seatwork periods whenever possible and during normal
recess periods on the playground. Observers were rigorously trained and
carefully monitored in this study. Reliability checks were conducted on
16 of the classroom observation sessions, and interobserver agreement
averaged 95%. Similarly, interobserver agreement was calculated for 49
of the playground observation sessions and averaged 88%.
Of the 173 students who appeared in the highest three ranks on the Stage
1 externalizing rank order dimension, 82% were males and 18% were
females. In contrast, 44% of the top-ranked internalizers were males
and 66% were females. These results are very similar to the proportions
identified in the Walker et al. (1988) initial field test (see SSBD Trial
Testing above).
Similarly, interrelationships among the Stage 2 measures and SIMS
Behavior Observation Codes variables also closely replicated the initial
trial test results. Correlations among the SSBD Stage 2 measures ranged
from .61 to .77 and intercorrelations among SIMS Behavior Observation
Codes variables ranged from .13 to .62. As in the Walker et al. (1988) trial
test, correlations between Stage 2 and SIMS Behavior Observation Codes
variables were low and ranged from −.21 to .17. In addition, coefficient
alpha for the Stage 2 Adaptive and Maladaptive Behavior Rating Scales
were .94 and .90, respectively, in this replication.
The Stage 2 measures and SIMS Behavior Observation Codes variables
were highly sensitive in discriminating behavioral differences between
high-ranked externalizers, high-ranked internalizers, and nonranked
control participants. A discriminant function analysis indicated that the
SSBD Stage 2 measures and SIMS Behavior Observation Codes variables
correctly classified 84% of the three participant groups overall. As in
the Walker et al. (1988) study, the classification ratios were highest for
nonranked students, followed by externalizers and then internalizers.
Externalizers exhibited less adaptive behavior, more maladaptive
behavior and more critical events than either internalizers or nonranked
students. They also spent less time academically engaged and produced
fewer positive interactions as compared to internalizers and nonranked
students. Internalizers exhibited less adaptive behavior, more maladaptive behavior and more critical events than nonranked students. They
also spent a lower percentage of observed time academically engaged
than nonranked students. Though fewer between-participant differences
were found on the SIMS Behavior Observation Codes PSB code categories in the Nicholson (1988) study, these results overall closely replicated
those reported by Walker et al. (1988) in a separate replication study.
Resource teachers, psychologists, and general education teachers were
surveyed to assess their general satisfaction with the SSBD procedure
and to compare its efficacy with traditional procedures. Resource
teachers and psychologists completed a 13-item survey, and general
education teachers completed an 11-item survey. Results indicated that
resource teachers and psychologists were much more favorable about
their experiences with the SSBD than were general education teachers.
Seventy-five percent of the resource teachers and psychologists would
recommend the SSBD for use by other school faculties, and a majority
of these respondents (n = 8) responded to each survey item favorably. In
contrast, only 33% of the general education teachers sampled (n = 51)
would recommend the system to other school faculties. However, when
the survey items were analyzed by individual school faculties (n = 3), it
was apparent that the faculty of one school was very negative in its
responses to the survey while the other two faculties were considerably
more positive. In this school, the percentage of faculty responding favorably ranged from 12–72% across the survey items and averaged 32%. In
the second school, in contrast, these percentages ranged from 53–87%
and averaged 67%. In the third school, these figures ranged from 42–90%
and averaged 64%. Thus, the results from the most negative school may
have been substantially biased or influenced by variables over which the
field-test personnel had little control or knowledge.
The results of this replication study provided considerable overlap in
findings with the initial trial study conducted by Walker et al. (1988) in
one school and involving 18 teachers. The Nicholson (1988) replication
involved 58 teachers in 3 elementary schools located in another state.
These findings extended substantially the results of the original trial test
of the SSBD as reported by Walker et al. (1988). The consumer satisfaction
surveys from the Nicholson (1988) study are an important contribution to the
comprehensive evaluation of alternative approaches, such as the SSBD, to
existing practices.
Validation Studies of the SSBD
An extensive series of validation studies has been conducted on the
SSBD to date, examining the psychometric properties of Stage 1 and
Stage 2 instruments as well as the SIMS Behavior Observation Codes
and School Archival Records Search.
Results from these studies and others described in this section relate
to the following types of reliability: test-retest, internal consistency, and
interrater reliability. Additionally, this section describes findings that
provide empirical SSBD support for each of the following validity types:
item, factorial, concurrent, discriminant, criterion related, predictive,
and construct.
Reliability
Test-Retest
Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley
(1990) investigated the test-retest stability of the SSBD Stage 1 and 2
measures over a 1-month period. Forty teachers in the elementary age
range completed the SSBD Stage 1 and 2 procedures on two occasions
separated by 31 days. The mean test-retest rank order correlations (rhos)
on the externalizing and internalizing behavioral profiles (n = 10 students each) were .79 and .72, respectively. Individual teacher rank order
correlations ranged from −.16 to .96 on the externalizing dimension and
from −.07 to .92 on the internalizing dimension. Eighty-eight percent
of the 40 participating teachers had test-retest rhos greater than .45. In
a reanalysis of these data, the authors excluded two teachers from the
sample whose rhos were negative (−.16 and −.07) and treated them as
outliers. The average externalizing rho improved to .88 and the average
internalizing rho improved to .74 in this reanalysis.
Pearson correlations were computed for the Stage 2 measures from
Time 1 to Time 2. For the Critical Events Index, the resulting r was .81;
for the Combined Frequency Index adaptive behavior rating scale, the
r was .90; and for the maladaptive behavior rating scale, the r was .87.
These correlations were all statistically significant at (p < .01). Overall,
these results suggest that teachers are capable of making relatively stable
judgments regarding child behavioral characteristics using the Stage 1
and 2 screening instruments.
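
For readers replicating these reliability computations, the following minimal sketch shows how the Stage 1 rank-order (Spearman) and Stage 2 (Pearson) coefficients can be obtained; the ratings shown are invented for illustration and are not study data.

    # A brief sketch of the test-retest computations described above: Spearman
    # rank-order correlations for a teacher's Stage 1 rankings and a Pearson
    # correlation for Stage 2 scores across two occasions. Values are illustrative.
    from scipy import stats

    # One teacher's Stage 1 externalizing ranks for 10 students at Time 1 and Time 2
    ranks_t1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ranks_t2 = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]
    rho, p_rho = stats.spearmanr(ranks_t1, ranks_t2)

    # Stage 2 Critical Events Index totals for the same students on both occasions
    cei_t1 = [6, 2, 4, 1, 0, 3, 5, 2, 1, 0]
    cei_t2 = [5, 3, 4, 1, 1, 2, 6, 2, 0, 0]
    r, p_r = stats.pearsonr(cei_t1, cei_t2)

    print(f"Stage 1 test-retest rho = {rho:.2f} (p = {p_rho:.3f})")
    print(f"Stage 2 CEI test-retest r = {r:.2f} (p = {p_r:.3f})")
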
Internal Consistency
Walker, Severson, Stiller, Williams, Haring, Shinn, and Todis (1988)
calculated coefficient alpha to estimate the internal consistency of the
Combined Frequency Index Adaptive and Maladaptive Behavior Scales.
Using a sample of 18 teachers, each of whom rated 8 students in their
classes on two occasions (i.e., the three highest ranked externalizers and
internalizers and two nonranked comparison students), coefficient alpha
was derived for these two scales at two time points separated by one
month. For the Adaptive Behavior Scale, alpha was .85 and .88, respectively, across the two rating occasions. For the Maladaptive Behavior
Scale, these coefficients were .82 and .87. Coefficient alpha was also
calculated for the SSBD national standardization sample of 4,463 cases
on the Stage 2 instruments. The resulting alpha coefficients were .94 for
the Adaptive Behavior Scale and .92 for the Maladaptive Behavior Scale.
The average item intercorrelations for these two scales were .59 and .49,
respectively. Coefficient alpha was not calculated for the Critical Events
Index because of the divergent behavioral content (externalizing and
internalizing) sampled by the items comprising this instrument.
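
Coefficient alpha can be computed directly from its definition, as in the sketch below; the item responses are fabricated for illustration and do not reproduce the standardization data.

    # A minimal sketch of coefficient (Cronbach's) alpha for a set of rating scale
    # items, computed from its standard definition. The responses are invented.
    import numpy as np

    def cronbach_alpha(items):
        """items: 2-D array, rows = respondents, columns = scale items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()    # sum of item variances
        total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
        return (k / (k - 1)) * (1 - item_vars / total_var)

    # Ten students rated on a five-item scale (1-5 Likert responses)
    ratings = np.array([
        [5, 4, 5, 4, 5], [2, 2, 1, 2, 2], [4, 4, 3, 4, 4], [1, 2, 1, 1, 2],
        [3, 3, 4, 3, 3], [5, 5, 5, 4, 5], [2, 1, 2, 2, 1], [4, 3, 4, 4, 3],
        [3, 4, 3, 3, 4], [1, 1, 2, 1, 1],
    ])
    print(f"alpha = {cronbach_alpha(ratings):.2f}")
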
Interrater
The authors extensively investigated the interrater agreement of the
SSBD Stage 1 ranking procedures in the process of developing and
refining the externalizing and internalizing behavioral profiles (results
of these activities were described earlier under “Instrument Development Procedures”). Interrater agreement levels have been established for
the Stage 2 measures. Interrater agreement was the primary criterion
used to develop, evaluate, and revise the SIMS Behavior Observation
Codes. The Academic Engaged Time (AET) code uses a one-paragraph
definition and a stopwatch duration recording procedure to estimate the
proportion of time spent academically engaged (see SSBD Observation
Manual). Interrater agreement indices were calculated by dividing the
smaller amount of stopwatch recorded time from one observer by the
larger amount recorded by the second observer and multiplying by 100.
These ratios consistently have ranged between 90 and 100% and have averaged approximately 95% in SSBD studies. Similarly, interrater agreement
ratios have been calculated for the partial interval Peer Social Behavior
(PSB) (see SSBD Observation Manual). However, since the PSB is a
five-category code with a 10-second recording interval, interrater agreement among pairs of observers was determined by dividing the number
of recording intervals on which there was complete agreement by the
total number of intervals recorded and multiplying by 100. Interrater
agreement ratios for the PSB code have consistently averaged 85% and
have generally ranged between 80 and 90%. (Note: The PSB was originally a six-category code; however, the categories of social engagement
and social involvement were combined into one code category—social
engagement—because of their overlapping content and failure to discriminate among participant groups.)
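
The two agreement indices just described reduce to simple arithmetic, illustrated in the following sketch with hypothetical observer records.

    # A short sketch of the two agreement indices described above: the AET ratio of
    # the smaller to the larger stopwatch duration, and interval-by-interval
    # agreement for the PSB code. Both observer records below are hypothetical.
    def aet_agreement(seconds_obs1, seconds_obs2):
        """Smaller recorded duration divided by the larger duration, times 100."""
        smaller, larger = sorted([seconds_obs1, seconds_obs2])
        return 100 * smaller / larger

    def psb_agreement(intervals_obs1, intervals_obs2):
        """Percent of 10-second intervals on which both observers recorded the same category."""
        agreements = sum(a == b for a, b in zip(intervals_obs1, intervals_obs2))
        return 100 * agreements / len(intervals_obs1)

    print(aet_agreement(412, 430))                      # e.g., 95.8
    obs1 = ["SE", "P", "A", "SE", "PLP", "SE", "NC", "P"]
    obs2 = ["SE", "P", "A", "SE", "SE", "SE", "NC", "P"]
    print(psb_agreement(obs1, obs2))                    # e.g., 87.5
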
Validity
The following types of validity have been estimated to date on the SSBD
system: item, factorial, concurrent, discriminant, criterion related, predictive, and construct. Evidence in support of each of these validity types
is described in this section.
Item Validity
Item validity was estimated on the Stage 2 Adaptive and Maladaptive
Behavior Scales by calculating item-total correlations using the SSBD
standardization sample (n = 4,463). Table 12 contains corrected item-total correlations for the Adaptive and Maladaptive Behavior Scale items.
These correlations ranged from .64 to .81 for the Adaptive Behavior
Scale and from .32 to .83 for the Maladaptive Behavior Scale. Mean
interitem correlations were .59 for the Adaptive Scale items and .49 for the
Maladaptive Scale items. All items in both scales met the minimum
criterion of .30 and above for acceptable item-total correlations.
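
A corrected item-total correlation excludes the item in question from the total score before correlating. The sketch below illustrates the computation on simulated ratings; it is not the standardization analysis itself.

    # A sketch of the corrected item-total correlation: each item is correlated
    # with the total score computed from the remaining items. The rating data are
    # simulated stand-ins, not the standardization sample.
    import numpy as np

    def corrected_item_total(items):
        """Return one corrected item-total correlation per column of `items`."""
        items = np.asarray(items, dtype=float)
        out = []
        for j in range(items.shape[1]):
            rest_total = np.delete(items, j, axis=1).sum(axis=1)   # total excluding item j
            out.append(np.corrcoef(items[:, j], rest_total)[0, 1])
        return np.round(out, 2)

    rng = np.random.default_rng(2)
    trait = rng.normal(size=200)                                   # underlying severity
    ratings = np.clip(np.rint(3 + trait[:, None] + rng.normal(0, 0.8, size=(200, 5))), 1, 5)
    print(corrected_item_total(ratings))
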
Table 12  Item-Total Correlations for the Stage 2 Adaptive and Maladaptive Rating Scales (n = 4,463)

    Adaptive Rating Scales                        Maladaptive Rating Scales
    Item    Corrected Item-Total Correlation      Item    Corrected Item-Total Correlation
    A1      .79                                   M1      .79
    A2      .78                                   M2      .34
    A3      .75                                   M3      .73
    A4      .72                                   M4      .66
    A5      .64                                   M5      .83
    A6      .80                                   M6      .75
    A7      .72                                   M7      .78
    A8      .77                                   M8      .76
    A9      .77                                   M9      .32
    A10     .68                                   M10     .76
    A11     .81                                   M11     .68
    A12     .73
    Alpha = .94                                   Alpha = .92
    Mean interitem correlation = .59              Mean interitem correlation = .49
Factorial Analysis
A Principal Components Factor Analysis with a Varimax rotation, using
SPSS-X procedures, was used to investigate the factor structure of the
SSBD Stage 2 Adaptive and Maladaptive Behavior Scales. The items in
these two scales were factor analyzed in tandem rather than separately
within the two scales. This procedure, conducted on the SSBD national
standardization sample, yielded seven factors with eigenvalues greater
than one that collectively accounted for 79% of the variance. Two factors
were then specified for extraction in a second-order analysis. The results
of this analysis, with corresponding item loadings on each of the two
factors, are presented in Table 13.
It was expected that these two rating scales would collectively provide
measures of the two primary forms of school adjustment required of all
students in school settings, i.e., peer related and teacher related (Walker,
Ramsey, & Gresham, 2004). The two factors that emerged from the factor
analysis confirmed this hypothesis. Factor One was very dominant in
the overall structure and accounted for 52% of the total variance; in
contrast, Factor Two accounted for only 9%. The corresponding eigenvalues for these two factors were 12.11 and 2.09, respectively. Factor One
consisted of scale items that define school adjustment according to adult
expectations while the content of factor two seemed to focus primarily
on peer relations. Table 13 lists the item factor loadings on these two
factors across the Adaptive and Maladaptive Behavior Scale items. (See
the SSBD Instruments and Forms Packet for descriptions of these items.)
Factor scores were calculated for these two factors based on a sample of
1,337 externalizers, 1,310 internalizers, and 862 nonidentified comparison students drawn from the standardization sample of 4,463 cases. For
Factor One (teacher related), these scores for externalizers, internalizers
and comparison students were, respectively, .89, −.67, and −.36. For Factor
Two (peer related), these scores were −.27, −.37, and .99, respectively.
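
The following sketch outlines, in Python rather than SPSS-X, how a two-component principal components solution with varimax rotation can be obtained; the item responses are simulated, so the resulting loadings are illustrative only.

    # A rough sketch of a two-factor principal components solution with varimax
    # rotation, in the spirit of the analysis described above. Item responses are
    # simulated; loadings are eigenvectors scaled by the square roots of eigenvalues.
    import numpy as np

    def varimax(loadings, max_iter=100, tol=1e-6):
        """Orthogonal varimax rotation of an items-by-factors loading matrix."""
        L = np.asarray(loadings, dtype=float)
        n, k = L.shape
        R = np.eye(k)
        total = 0.0
        for _ in range(max_iter):
            LR = L @ R
            u, s, vt = np.linalg.svd(L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / n))
            R = u @ vt
            if s.sum() - total < tol:
                break
            total = s.sum()
        return L @ R

    rng = np.random.default_rng(3)
    items = rng.normal(size=(300, 23))                     # simulated 23-item response matrix
    corr = np.corrcoef(items, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:2]                  # keep the two largest components
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    rotated = varimax(loadings)
    print(np.round(rotated[:5], 2))                        # loadings for the first five items
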
Table 13  Adaptive and Maladaptive Behavior Rating Scale Factor Structure and Item Loadings (n = 4,463)

    ROTATED FACTOR MATRIX
    Item    Factor 1 Teacher Related    Factor 2 Peer Related
    M5      .84                         −.31
    M7      .82                         −.28
    M8      .82                         −.18
    M1      .79                         −.34
    M6      .79                         −.19
    M3      .75                         −.25
    M10     .75                         −.29
    A1      −.70                        .51
    A2      −.63                        .56
    A11     −.63                        .59
    M11     .61                         −.34
    M9      .37                         −.05
    A12     −.16                        .83
    A8      −.24                        .81
    A10     −.14                        .78
    A6      −.43                        .72
    A3      −.32                        .71
    A9      −.38                        .70
    A4      −.33                        .69
    A7      −.33                        .68
    M2      .08                         −.57
    M4      .49                         −.52
    A5      −.48                        .50

    FACTOR SCORES
    Group           Teacher Related M (SD)    Peer Related M (SD)
    Externalizer    .89 (.89)                 −.27 (.77)
    Internalizer    −.67 (.63)                −.37 (1.00)
    Nonranked       −.36 (.36)                .99 (.54)
Concurrent Validity
The concurrent validity of the SSBD Stage 2 measures was estimated as
part of a study of the SSBD system’s use within resource room settings
(Walker, Block-Pedego, Severson, Barckley & Todis, 1989). Teacher
rating and direct observational measures were completed on a sample of
56 resource room students in the elementary age range. Six teachers and
their resource room students participated in this study.
The SSBD Stage 2 measures were correlated with the Walker-McConnell
Scale of Social Competence and School Adjustment (Walker & McConnell, 1988) and with direct observation code measures recorded by the
Classroom Adjustment Code (CAC) (Walker, Block-Pedego, McConnell,
& Clarke, 1983). The Walker-McConnell scale is designed for use by
teachers in the K–12 grade range in rating students’ social skills. The elementary scale consists of 43 items and has three subscales. Its national
standardization sample contains 1,812 cases. Extensive studies of the
scale’s validity and reliability have been conducted by the authors and
are reported in the manual for the scale.
Correlations between total score on the Walker-McConnell scale and
total scores on the Critical Events Index and the Adaptive and Maladaptive Rating Scales were −.57 (p < .001), .79 (p < .001), and −.44 (p < .001)
respectively. Correlations of this magnitude provide partial support
for the concurrent validity of the SSBD Stage 2 measures with a well-constructed and validated measure of teacher-rated social skills. Similarly, correlations between the three Stage 2 measures and the child
behavior code categories of on task and unacceptable were as follows:
for the Critical Events Index, the correlations with on task and unacceptable were −.45 (p < .01)
and .15 (p < .05), respectively. These same correlations for the Adaptive
Behavior Scale were .45 (p < .01) and −.16 (p < .05); for the Maladaptive
Behavior Scale, they were −.37 (p < .03) and .26 (p < .05). Though they
were in the low to moderate range of magnitude, these correlations provide evidence that teachers' ratings of students' classroom
behavior were related to the students' actual observed behavior as recorded by
independent, professionally trained observers.
Discriminant Validity
A number of studies have been conducted by the authors and their
colleagues investigating the discriminant validity of the SSBD system
and its component measures. These studies have ranged from assessing the performance of clinical samples on selected SSBD measures
to discriminant function analyses of the classification efficiency of the
Stage 2 measures and SIMS Behavior Observation Codes. The studies
and results reported herein under discriminant validity are extensive
in documenting the SSBD procedure’s ability to identify externalizers
and internalizers with problematic behavioral profiles and to separate
them accurately and reliably from students with well-adjusted school
behavior patterns. In one sense, this is the most important form of SSBD
validity because the procedure and its component instruments were
designed to identify and discriminate externalizing and internalizing
students from pools of students who do not manifest such behavioral
tendencies. The studies described below provide substantial evidence
in support of the SSBD’s ability to accomplish this goal.
Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley
(1990) conducted validation and replication studies on the SSBD system
in Oregon and Washington field sites. In the Oregon site, 170 teachers
in grades 1–5 completed the Stage 1 and 2 measures. Stage 2 measures
were completed by each participating teacher on the three top-ranked
externalizing students, the top three internalizing students, and two
nonranked comparison students. ANOVAs were computed to test for
participant group differences on the Stage 2 measures. The resulting F
ratios were as follows: Critical Events Index, F (2,853) = 163.62 (p < .001);
Adaptive Behavior Scale, F (2,850) = 500.51 (p < .001); and Maladaptive
Behavior Scale, F (2,821) = 596.97 (p < .001). The corresponding Omega
squared coefficients for these one way ANOVAs were .28, .54, and .59,
respectively. Post hoc Scheffe tests indicated that all possible pairs of
mean differences among the three participant groups were significant
at (p < .01).
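
The omega squared coefficients reported above follow directly from the F ratios and degrees of freedom. The short sketch below applies the standard one-way ANOVA estimator, omega squared = df_between(F − 1) / (df_between(F − 1) + N), to the three reported F ratios and reproduces the coefficients of .28, .54, and .59.

    # A worked sketch of the omega squared estimates reported above, using the
    # one-way ANOVA identity omega^2 = df_between(F - 1) / (df_between(F - 1) + N),
    # where N is the total number of participants (df_between + df_within + 1).
    def omega_squared(f_ratio, df_between, df_within):
        n_total = df_between + df_within + 1
        num = df_between * (f_ratio - 1)
        return num / (num + n_total)

    # F ratios reported for the Oregon site Stage 2 measures
    print(round(omega_squared(163.62, 2, 853), 2))   # Critical Events Index      -> 0.28
    print(round(omega_squared(500.51, 2, 850), 2))   # Adaptive Behavior Scale    -> 0.54
    print(round(omega_squared(596.97, 2, 821), 2))   # Maladaptive Behavior Scale -> 0.59
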
In a replication of these findings, 40 regular classroom teachers in the
Washington site completed SSBD Stages 1 and 2, resulting in ratings
on 270 high-ranked externalizers, high-ranked internalizers, and nonranked comparison students who did not appear on either rank-ordering
list of ten students each per classroom. ANOVAs were computed separately for the Critical Events Index, the Adaptive Behavior Scale, and the
Maladaptive Behavior Scale to test for between-participant differences
on these measures. As in the Oregon site, the Stage 2 measures proved
to be highly sensitive in discriminating these three participant groups.
The F ratios for the Critical Events Index, the Adaptive Behavior Scale,
and the Maladaptive Behavior Scale were F (2,267) = 77.97 (p < .001); F
(2,267) = 152.00 (p < .001); and F (2,267) = 214.93 (p < .001), respectively.
Omega squared coefficients for these analyses were .34, .47, and .56.
Post hoc Scheffe tests indicated that all mean differences for the three
participant groups exceeded chance expectations (p < .05) on each of the
Stage 2 instruments.
A sample of four students was selected from each of the 97 participating
classrooms in the Oregon site where the SIMS Behavior Observation
Codes was used. These were the highest ranked externalizer, the highest
ranked internalizer, and two unranked comparison students (a boy and
a girl). These participant groups were observed on two occasions each
in academic and playground settings using the AET and PSB codes,
respectively. Comparison students averaged 77% of observed time
academically engaged; the corresponding figures for the internalizing
and externalizing participants were 73% and 65%. The mean differences
between comparison and externalizing students and between comparison and internalizing students were significant at (p < .01) and (p < .05);
the mean difference between externalizers and internalizers was not
statistically significant.
Table 14 contains means and standard deviations for the three participant groups on individual PSB code categories and on variables derived
from combining selected code categories.
Table 14  PSB Code Category and Code Category Combination Means, Standard Deviations, and Significance Tests by Participant Group (N = 336)

    SIMS Behavior Observation Codes:   Externalizers     Internalizers     Comparison        Omega
    PSB Variables                      (n = 73)          (n = 76)          (n = 152)         Squared
                                       Mean (SD)         Mean (SD)         Mean (SD)
    Socially Engaged (SE)              31.1¹ (16.8)      27.4² (13.1)      35.1 (15.3)       0.03
    Participation (P)                  22.6¹ (28.7)      7.4² (16.1)       14.2 (22.6)       0.05
    Parallel Play (PLP)                5.8 (7.5)         10.7²,³ (10.6)    4.9 (7.5)         0.07
    Alone (A)                          6.1¹ (8.3)        8.6² (10.0)       3.5 (5.5)         0.06
    No Codeable Response               1.5 (1.7)         1.7 (3.6)         1.3 (2.0)         —
    Total Positive Behavior (+)        80.5¹ (17.2)      77.2² (18.2)      88.6 (11.5)       0.10
    Total Negative Behavior            6.0¹,³ (7.9)      1.8 (4.2)         1.9 (5.6)         0.07

    1 = Externalizers vs. Comparison Students (p < 0.05)
    2 = Internalizers vs. Comparison Students (p < 0.05)
    3 = Externalizers vs. Internalizers (p < 0.05)
Statistically significant differences were obtained between one or more
pairs of student participant groups on all but one of these variables (i.e.,
No Codeable Response). Thus, both Stage 2 measures and SIMS Behavior
Observation Codes variables proved to be highly sensitive to behavioral
differences among teacher-nominated externalizing, internalizing, and
nonranked comparison students, as reflected in both teacher ratings
and direct observations recorded in natural settings by independent,
professionally trained observers.
Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley
(1990) investigated two forms of discriminant validity for the SSBD. These
were (a) the ability of the SSBD Stage 2 measures and SIMS Behavior
Observation Codes variables to correctly classify the group membership
assignments of study participants by teachers in screening Stage 1, and (b)
the existence of statistically significant differences on Stage 2 measures and
SIMS Behavior Observation Codes variables for first- vs. second-ranked
students identified by their teachers in SSBD Stage 1. Using a discriminant
function analysis procedure, these authors found that the SSBD Stage 2
measures and SIMS Behavior Observation Codes variables correctly
classified 84.69% of the three participant groups selected by their teachers
in screening Stage 1. In this same analysis, 142 of 150 (95%) nonranked
comparison students were correctly classified, with four misclassified as
externalizers and four misclassified as internalizers. Similarly, 56 of 69
(81%) externalizers were correctly classified, with 4 misclassified as nonranked comparison students and 9 misclassified as internalizers. Finally,
51 of 75 (68%) internalizers were correctly classified, with 13 misclassified
as comparison students and 11 misclassified as externalizers. Although
this level of classification efficiency far exceeds chance levels, the SSBD
measures were least successful in correctly classifying internalizers whose
behavioral characteristics appear to have less salience for adult raters and
often more closely overlap with those of nonranked comparison students.
Independent t tests were used to test for significant differences between
Stage 1 first- and second-ranked students on the Stage 2 measures and
SIMS Behavior Observation Codes variables. For externalizers, all three
Stage 2 measures discriminated at (p < .05) while the SIMS Behavior
Observation Codes variables of Parallel Play and Total Positive Behavior
discriminated at this level for first- and second-ranked students. All mean
differences favored the highest ranked externalizing students as predicted. For internalizing participants, the corresponding discriminating
variables were the Critical Events Index, the Adaptive Behavior Scale, and
the SIMS Behavior Observation Codes variable of Participation. As with
externalizers, these differences favored the highest ranked internalizers.
These results provide further empirical evidence for the sensitivity of
teacher ranking judgments when using the Stage 1 screening procedures.
Todis, Severson, and Walker (1990) investigated which items on the Critical
Events Index significantly discriminated between externalizers and
internalizers whose scores on the SIMS Behavior Observation Codes
indicated risk for externalizing or internalizing disorder. Using two separate samples drawn from SSBD field test sites, a total of 41 participants
(27 externalizers, 14 internalizers) enrolled in grades 1–5 from the two
sites met these criteria. Nine of the 33 items of the Critical Events Index
significantly discriminated between externalizers and internalizers.
Table 15 contains these items along with the proportion of the two participant samples that had the item(s) checked as present by their teachers
along with corresponding significance levels.
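
Each item comparison in Table 15 rests on a 2 x 2 table of checked versus not-checked counts for the two groups. The sketch below shows the form of the test; the cell counts are illustrative rather than the study's exact frequencies.

    # A minimal sketch of the two-tailed Fisher's exact test used for the item
    # comparisons in Table 15. The counts below are illustrative only; they are not
    # the exact cell frequencies from the study.
    from scipy.stats import fisher_exact

    # Rows: item checked vs. not checked; columns: externalizers, internalizers
    table = [[25, 1],    # checked as present
             [2, 13]]    # not checked
    odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
    print(f"odds ratio = {odds_ratio:.1f}, two-tailed p = {p_value:.4f}")
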
Table 15  Comparison of Frequency of Items Checked on the Critical Events Index for Externalizing and Internalizing Elementary Students Who Met Risk Criteria on the SSBD

    Item                                           Externalizers    Internalizers    p Value (2-tailed
                                                   (n = 27)         (n = 14)         Fisher's Exact Test)
    Ignores teacher reprimands                     96.2%            7.7%             <.01
    Is physically aggressive                       73.1%            0%               <.01
    Damages other's property                       50.0%            0%               <.01
    Steals                                         38.5%            0%               .02
    Uses obscene language, swears                  34.6%            0%               .02
    Has tantrums                                   30.8%            0%               .04
    Makes lewd or obscene gestures                 30.8%            0%               .04
    Demonstrates obsessive/compulsive behavior     38.5%            7.7%             .06
    Exhibits painful shyness                       7.7%             84.6%            <.01
    Is teased, neglected by peers                  57.7%            30.8%            <.01
Item 8 (“demonstrates obsessive/compulsive behavior”) in Table 15
approached statistical significance at (p < .06). Eight of these 10 items
were in the predicted directions for externalizers and internalizers
in terms of expected or anticipated prevalence rates. However, on
two items (“is teased and/or neglected by peers” and “demonstrates
obsessive-compulsive behavior”), teachers assigned higher rates to
externalizers than to internalizers. Both of these items were originally
judged by the authors to be more characteristic of internalizers than
externalizers.
In prior research, the Critical Events Index has proved to be highly sensitive in discriminating externalizers and internalizers from nonranked
comparison students. The results of the above study indicate that nearly
a third of the items discriminated externalizers from internalizers as
well using the combined sample of 41 participants from two sites.
The SSBD Stage 2 measures were completed on 106 participants by
teachers within the North Idaho Children’s Home, a residential facility
serving severely emotionally disturbed and/or abused children in the
K–12 grade range. This facility maintains two residential programs and
two day treatment programs. Elementary and middle school age students
were included in the sample rated by their teachers for this study. A total
of 52 participants were rated from the regular residential population,
along with a total of 17 participants from the secure treatment residential
program, which serves severely involved students (e.g., homicidal, suicidal, depressed and so forth). The nonresidential portion of the sample
comprised 20 day treatment students served by the residential facility
and 17 students assigned to a community-based educational program
operated by the facility staff on the campus of a cooperating college.
Table 16 contains means and standard deviations for the four participant
groups on the three Stage 2 measures.
Table 16  Means, Standard Deviations, and Significance Tests for Four Participant Groups of North Idaho Children's Home Residents (N = 106)

    PARTICIPANT GROUP
    Measure                             Day Treatment     Secure Treatment    Residential       Community
                                        (n = 20)          (n = 17)            (n = 52)          (n = 17)
                                        M (SD)            M (SD)              M (SD)            M (SD)
    Critical Events Index               5.50 (3.26)       9.11 (3.93)         5.84 (4.13)       6.47 (4.16)
    Adaptive Behavior Rating Scale      3.26 (0.61)       3.09 (0.73)         3.40 (0.60)       3.27 (0.80)
    Maladaptive Behavior Rating Scale   2.73 (0.89)       2.90 (0.65)         2.53 (0.77)       2.64 (0.87)
For the Critical Events Index, the entries in Table 16 represent the average
number of items checked as present by the teacher; for the Adaptive and
Maladaptive Behavior Scales, the entries are the average item scores, on
a 5-point Likert rating scale of frequency, as assigned by teachers. Separate one-way ANOVAs conducted for these three measures indicated
statistically significant differences among the four groups on the Critical
Events Index, F(3, 102) = 3.22, p < .02, but no significant differences on
either the Adaptive or Maladaptive Behavior Scale. A post hoc Scheffé
test indicated that the means for the secure treatment (n = 17) and regular residential (n = 52) students were significantly different (p < .05);
none of the other mean differences approached significance.
Chi-square analyses were conducted for the four groups on each of the
33 items comprising the Critical Events Index to identify discriminating
items. Table 17 contains the results of these analyses along with average
percentages of each participant group having that item checked by their
teachers as present.
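
The form of these item-level analyses is sketched below with hypothetical counts; each item yields a contingency table of checked versus not-checked frequencies across the four groups.

    # A short sketch of the item-level chi-square analysis: for a given Critical
    # Events item, the counts of students with and without the item checked are
    # compared across the four groups. Counts shown are hypothetical.
    from scipy.stats import chi2_contingency

    # Columns: Day Treatment, Secure Treatment, Residential, Community
    checked     = [6, 13, 32, 6]
    not_checked = [14, 4, 20, 11]
    chi2, p, dof, expected = chi2_contingency([checked, not_checked])
    print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
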
Seven of the 33 items significantly discriminated the four groups at (p
< .05). Students in the secure treatment group had the highest prevalence rates of any of the four groups on five of the seven discriminating
items. Their profiles across the 33 items of the Critical Events Index
are extremely problematic and indicate very severe levels of behavioral
pathology. In addition, profiles of the other three participant groups
were also highly indicative of serious behavioral adjustment problems
in comparison with nonranked students of the same age. The sensitivity
of the Critical Events Index in profiling the greater severity of the secure
treatment participant group and in discriminating the other three groups
from nonranked students provides additional evidence in support of its
discriminant validity.
Table 17  Chi-square Analysis of Critical Events Items for Four North Idaho Children's Home Participant Groups

    PERCENTAGE OF PARTICIPANTS HAVING ITEM CHECKED

    Critical Event Item              Day (n = 20)    Secure (n = 17)    Residential (n = 52)    Community (n = 17)
    1. Steals                        70              18                 25                      23.5
    2. Sets Fire                     —               —                  —                       5.8
    3. Vomits                        5               —                  7.6                     5.8
    4. Tantrums                      60              52.9               53.8                    64.7
    5. Assaults Adult*               —               29.4               53.8                    64.7
    6. Painful Shyness               15              41.1               15.3                    17.6
    7. Weight Loss/Gain              10              17.6               1.9                     5.8
    8. Sad Affect*                   30              76.4               61.5                    35.2
    9. Physically Aggressive         30              35.2               21.1                    23.5
    10. Damages Property             30              29.4               21.1                    35.2
    11. Obsessive-Compulsive         20              58.8               42.3                    35.2
    12. Nightmares                   15              11.7               26.9                    76.4
    13. Sexual Behaviors             —               17.6               15.3                    5.8
    14. Self-Abusive                 5               17.6               11.5                    29.4
    15. Seriously Injure             —               —                  3.8                     11.7
    16. Suddenly Cries               20              32.5               13.4                    35.2
    17. Severe Headaches*            10              5.8                34.6                    23.5
    18. Suicide Thoughts             20              5.8                13.4                    5.8
    19. Thought Disorders*           —               58.8               23.0                    23.5
    20. Ignores Warnings             45              64.7               40.3                    47.0
    21. Lewd Gestures                20              47.0               15.3                    23.5
    22. Physical Abuse               5               11.7               —                       11.7
    23. Drug Abuse                   —               5.8                13.4                    23.5
    24. Sexually Abused              15              17.6               21.1                    —
    25. Obscene Language             45              64.7               40.3                    58.8
    26. Cruelty to Animals           —               5.8                1.9                     —
    27. Teased by Peers*             55              70.5               21.1                    35.2
    28. Restricted Activity*         5               41.1               5.7                     11.7
    29. Enuretic                     5               —                  9.6                     —
    30. Encopretic                   5               11.7               3.8                     —
    31. Sexually Molests Children    5               17.6               5.7                     —
    32. Hallucinations               —               5.8                5.7                     5.8
    33. Lacks Interests*             5               35.2               1.9                     —

    *Significance at p < 0.05
As part of a larger study of the social competence of regular students
enrolled in second- and fourth-grade classrooms, Hops, Lewin, Walker,
and Severson (1990) incorporated the SSBD Stage 1 measures as part of
their overall data collection procedures. A total of 47 students participated
in this part of the overall study, with 26 enrolled in grade 2 and 21 enrolled
in grade 4. Of the 26 second graders, 10 were
highly ranked by their teachers as externalizers, 9 were highly ranked as
internalizers, and 7 did not appear on either rank-order list using the SSBD
Stage 1 procedures. For the 21 fourth graders, there were 7 externalizers, 7 internalizers, and 7 nonranked students. The second- and fourth-grade
participant groups were combined for purposes of statistical analysis.
These authors recorded a broad range of social competence and academic
skill measures on all students, including the participants described above,
who were enrolled in the participating classes and for whom prior parental consent was obtained. These measures included direct observations
recorded on child and/or teacher behavior in classroom and playground
settings, sociometric assessment procedures, and teacher ratings of the
students’ status on such dimensions as aggression, social withdrawal,
and academic competence. Comparative analyses were conducted of
the SSBD participant groups’ relative status on these measures. Separate
one-way ANOVAs were conducted for each of these study variables,
with post hoc tests conducted for variables yielding significant F ratios.
Results of these analyses are presented in Table 18.
Table 18 contains means, standard deviations, and corresponding post hoc test results for each statistically significant study
variable. A relatively large number of variables in the Hops et al. (1990)
study discriminated the three participant groups. As a rule, mean levels
and corresponding discriminations were in expected directions (e.g.,
externalizers were rated as significantly more aggressive than internalizers or nonranked students; internalizers were rated as significantly
more socially withdrawn than either of these participant groups; both
groups had lower average likability ratings assigned by their teachers;
externalizers and internalizers had lower peer preference scores than
nonranked students; externalizers had much higher observed rates of
teacher disapproval in reading than either internalizers or nonranked
students; and so forth). However, it should be noted that the codes used
by Hops et al. (1990) for recording the participants’ playground social
behavior were not sensitive in discriminating the three groups.
Table 18  Means, Standard Deviations, and Significance Tests for Fourth-Grade Externalizing, Internalizing, and Nonranked Students (N = 47)

    Variable                                 Externalizers M (SD)    Internalizers M (SD)    Nonranked M (SD)

    TEACHER RATINGS OF SOCIAL SKILLS
    Subscale 1                               46.00¹ (11.31)          56.68 (16.19)           66.35 (12.26)
    Subscale 2                               58.52¹ (13.95)          53.31 (15.12)           70.78 (7.99)
    Subscale 3                               31.76¹ (8.16)           36.75² (10.61)          45.64 (5.12)

    TEACHER RATINGS OF ACADEMIC SKILLS
    Reading                                  5.11¹ (2.34)            5.18² (2.31)            7.71 (1.20)
    Math                                     5.35 (2.26)             6.00 (1.78)             7.85 (1.40)

    TEACHER RATINGS OF BEHAVIORAL ATTRIBUTES
    Likability                               0.70¹ (0.77)            0.87² (1.20)            2.14 (1.40)
    Aggression                               7.58¹,³ (4.01)          1.68 (2.54)             1.35 (2.37)
    Withdrawal                               1.29 (1.79)             3.31²,³ (2.96)          0.42 (0.75)

    SOCIOMETRIC STATUS
    Popularity                               7.11¹ (6.43)            6.68² (5.58)            13.71 (5.95)
    Peer Preference (Positive choices
      minus negative choices)                −0.50¹ (1.13)           −0.43² (0.96)           0.58 (0.80)
    Social Impact (Positive choices
      plus negative choices)                 0.27 (0.99)             0.10 (0.67)             0.48 (0.72)

    OBSERVED TEACHER BEHAVIOR
    Teacher Disapproval (Reading)            0.06¹,³ (0.08)          0.01 (0.04)             0.01 (0.04)
    Teacher Disapproval (Math)               0.01 (0.06)             0.02 (0.05)             0.02 (0.06)

    OBSERVED STUDENT BEHAVIOR
    Classroom
    On Task (Reading)                        75.36¹ (24.91)          86.16 (10.87)           84.01 (27.73)
    On Task (Math)                           86.22 (9.80)            85.92 (11.63)           90.96 (7.69)
    Playground
    Sociability Index (Initiations to
      peers/Initiations from peers)          15.66 (23.11)           5.85 (11.23)            20.84 (35.48)
    % Negative Social Behavior               0.57 (1.53)             0.10 (0.41)             0.00 (0.00)
    % Time Alone                             4.45 (5.36)             7.68 (9.10)             8.76 (17.32)

    1 = Externalizers vs. Nonranked Students (p < .05)
    2 = Internalizers vs. Nonranked Students (p < .05)
    3 = Externalizers vs. Internalizers (p < .05)
Block-Pedego, Walker, Severson, Todis, and Barckley (1989) conducted
a study in which the Alone category of the SIMS Behavior Observation
Codes was used as a selection variable for identifying subgroups of
students identified as externalizers and internalizers in Stage 1. Those
students who had very low rates of social contact with peers in free-play
settings were selected for inclusion in the study. A pool of externalizers (n
= 22) and internalizers (n = 30) represented the highest 25% of scores on
the Alone variable of the SIMS Behavior Observation Codes. A random
sample of 52 participants, drawn from the pool of nonidentified students,
served as controls in difference analyses conducted on SSBD Stage 2
measures and SIMS Behavior Observation Codes variables; these analyses
were designed to identify behavioral correlates of large amounts of time
spent alone in free-play settings.
The Critical Events Index average scores in this study were 4.50 (SD =
2.90), 2.30 (SD = 2.08) and .08 (SD = .27) for externalizers, internalizers, and nonranked students, respectively. Means for externalizers and
internalizers were significantly different from the mean for nonranked
students at (p < .05).
The Adaptive Behavior Scale average scale scores for these groups were
31.53 (SD = 6.89), 42.80 (SD = 8.75), and 55.55 (SD = 3.26); similarly, Maladaptive Behavior Scale average scale scores were 35.18 (SD = 5.90), 19.00
(SD = 4.59), and 13.76 (SD = 3.00). The mean scores for externalizers
and internalizers were significantly different from control participant
means for both scales at (p < .05). Item-by-item comparisons were also
conducted for the three participant groups on the Stage 2 rating scales.
Means and standard deviations for each of these item comparisons are
presented in Table 19.
The results of post hoc tests designating statistical significance at (p <
.05) are also presented in Table 19 on an item-by-item basis for the two
Combined Frequency Index scales. Inspection of this table indicates that
all of the rating scale items discriminated externalizers from nonranked
participants; these same items were substantially less powerful in discriminating internalizers and nonranked participants. Nearly all these
items discriminated externalizers from internalizers.
Separate t tests were conducted for externalizers vs. nonranked participants and for internalizers vs. nonranked participants on each of the
SIMS Behavior Observation Codes PSB categories.
Table 19  Means, Standard Deviations, and Significance Tests for Participant Groups on the Adaptive and Maladaptive Rating Scale Items (N = 104)

    Item                                                                                  Externalizers          Internalizers          Nonranked
                                                                                          (Highest 25% Alone)    (Highest 25% Alone)    (Random Sample)
                                                                                          M (SD)                 M (SD)                 M (SD)

    ADAPTIVE BEHAVIOR RATING SCALE
    A1. Follows established classroom rules                                               2.81¹,³ (0.73)         4.22² (0.85)           4.80 (0.40)
    A2. Is considerate of the feelings of others                                          2.22¹,³ (0.75)         3.96² (0.75)           4.70 (0.50)
    A3. Produces work of acceptable quality given her/his skill level                     2.90¹ (1.10)           3.51² (1.22)           4.62 (0.66)
    A4. Gains peers' attention in an appropriate manner                                   2.65¹,³ (1.12)         3.29² (1.03)           4.64 (0.74)
    A5. Expresses anger appropriately (i.e., reacts to situations without being violent or destructive)    2.69¹,³ (0.99)    4.07 (1.23)    4.78 (0.67)
    A6. Cooperates with peers in group activities or situations                           2.63¹,³ (0.72)         3.85² (1.06)           4.90 (0.30)
    A7. Makes assistance needs known in an appropriate manner (e.g., asks to go to the bathroom, raises hand when finished with work, asks for help with work, etc.)    2.86¹,³ (0.99)    3.74² (1.16)    4.74 (0.44)
    A8. Is socially perceptive (i.e., "reads" social situations accurately)               2.27¹,³ (0.88)         3.14² (1.02)           4.66 (0.59)
    A9. Does seatwork assignments as directed                                             2.77¹,³ (1.23)         3.66² (1.17)           4.68 (0.58)
    A10. Compliments peers regarding their behavior or personal attributes (e.g., appearance, special skills, etc.)    1.91¹,³ (0.88)    2.57² (1.24)    3.62 (0.85)
    A11. Complies with teacher requests and commands                                      2.77¹,³ (0.81)         4.33² (0.91)           4.88 (0.32)
    A12. Initiates positive social interactions with peers                                3.00¹ (1.15)           2.77² (1.08)           4.68 (0.58)

    MALADAPTIVE BEHAVIOR RATING SCALE
    M1. Requires punishment (or threat of same) before she/he terminates an inappropriate activity or behavior    3.50¹,³ (1.19)    1.74 (1.16)    1.34 (0.74)
    M2. Refuses to participate in games and activities with other children at recess      2.47¹,³ (0.98)         3.22 (1.18)            1.20 (0.40)
    M3. Behaves inappropriately in class when corrected (e.g., shouts back, defies the teacher, etc.)    2.71¹,³ (1.38)    1.07 (0.38)    1.40 (1.21)
    M4. Responds inappropriately when other children try to interact socially with her/him    3.00¹,³ (0.94)    2.25² (0.94)    1.26 (0.82)
    M5. Child tests or challenges teacher-imposed limits (e.g., classroom rules)          3.95¹,³ (1.07)         1.39 (0.84)            1.36 (0.63)
    M6. Uses coercive tactics to force the submission of peers (e.g., manipulates, threatens, etc.)    3.20¹,³ (1.28)    1.29 (0.54)    1.12 (0.38)
    M7. Creates a disturbance during class activities (e.g., is excessively noisy, bothers other students, out of seat, etc.)    4.14¹,³ (1.19)    1.44 (0.64)    1.24 (0.43)
    M8. Manipulates other children and/or situations to get his/her own way               3.23¹,³ (1.13)         1.48 (0.64)            1.22 (0.46)
    M9. Is overly affectionate with others (peers and adults), e.g., touching, hugging, kissing, hanging on, etc.    1.76¹ (1.09)    1.40 (0.97)    1.18 (0.69)
    M10. Is excessively demanding (e.g., requires or demands too much individual attention)    3.66¹,³ (0.96)    1.66² (0.91)    1.18 (0.48)
    M11. Pouts or sulks                                                                   3.50¹,³ (1.14)         1.88² (0.75)           1.18 (0.43)

    1 = Externalizers vs. Nonranked Students (p < .05)
    2 = Internalizers vs. Nonranked Students (p < .05)
    3 = Externalizers vs. Internalizers (p < .05)
For externalizers vs. nonranked participants, the following codes and
variables, derived from code category combinations, discriminated at
p < .05: Parallel Play, Alone, Positive Interaction, and Negative Interaction.
For internalizers vs. nonranked participants, the discriminating code
categories were Parallel Play, Alone, and Positive Interaction. Behavioral
profiles on each of these code categories favored the nonranked participants.
Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley
(1990) examined the school records of 85 high-ranked externalizers,
86 high-ranked internalizers, and 161 nonranked comparison students
from their larger sample of 856 participants in grades 1–5. These school
records were systematically coded using the School Archival Records
Search (SARS) procedure (Walker, Block-Pedego, Todis, & Severson,
1991). The SARS makes it possible to systematically code 11 archival
school record variables commonly found in school folders (e.g.,
referrals, achievement, attendance, discipline contacts, and so forth).
Table 20 contains profiles of the three participant groups on each of
these SARS variables.
Table 20  SARS Record Search Profiles for Externalizing, Internalizing, and Nonranked Control Students (N = 332)

SARS Variable | Externalizers (n = 85) | Internalizers (n = 86) | Nonranked Control (n = 161)
Achievement Test, M (SD) | 38.35 (27.58) | 43.61 (28.73) | 68.56 (23.74)
 | Yes / No | Yes / No | Yes / No
Behavioral Referrals Within School | 18 / 67 | 5 / 81 | 0 / 161
Academic Referrals Within School | 25 / 60 | 28 / 58 | 9 / 152
Current IEP | 18 / 67 | 22 / 64 | 8 / 153
Placement in Non-Regular Classroom | 17 / 68 | 22 / 64 | 18 / 143
Chapter I Services | 30 / 55 | 33 / 53 | 18 / 143
Referrals Out of School | 9 / 76 | 3 / 83 | 1 / 160
Negative Narrative Comments | 46 / 39 | 35 / 51 | 17 / 144
Disciplinary Contacts with Principal | 37 / 48 | 5 / 81 | 5 / 156
Speech and Language Referrals Within School | 10 / 75 | 24 / 62 | 13 / 148

*p < .01 for all SARS variables on all chi-square group comparisons
Chi-square analyses indicated that all of the SARS variables discriminated
the three participant groups at p < .01. The Yes column shows the number
of students who met SARS at-risk criteria for each variable; the No column
shows those who did not. Inspection of Table 20 indicates that the SARS
profiles for externalizers and internalizers were highly problematic and
document extensive adjustment problems in the school setting. Their
profiles were dramatically different from those of the nonranked
comparison students. Although both groups appear to be considerably at
risk for school failure, the profiles for externalizers appeared to be
somewhat more problematic than the internalizers' profiles. Thus, the
teacher-identified participant groups in SSBD Stage 1 proved to have
school records profiles that indicate very serious adjustment problems
and that provide strong support for the utility of teacher judgment
in the screening and identification of behaviorally at-risk students.
Finally, as part of a larger study that investigated the construct validity
of the overall SSBD procedure, Walker, Block-Pedego, Severson, Todis,
Williams, Barckley, and Haring (1991) examined the social competence
of externalizing and internalizing participant groups. Participants for
this study were participating teachers (grades 1–5) and students (n =
280) in their classrooms in a single elementary school located in the
Peninsula School District of Washington State. Two teachers from four
of the grade levels and three teachers from one grade level were study
participants. The SSBD Stage 1 and 2 measures and SIMS Behavior
Observation Codes variables were recorded for students participating in
the study. In addition, sociometric procedures, teacher ratings of social
skills, and SARS archival school record searches were also completed for
these participants.
These authors used a sociometric assessment procedure in which
students indicated their three most preferred work and play choices to
identify participants for the study. Students who received one or fewer
work and play choices in this assessment procedure were selected for
study inclusion. Results of this procedure indicated that 22% of the
externalizing participants (n = 24), 25% of the internalizing participants
(n = 27), and 5% of nonranked comparison participants (n = 3) met the
criterion for being socially isolated. For comparative purposes, isolate
and nonisolate groups of externalizers and internalizers were constructed by the authors. Isolate participants were those ranked by their
teachers as having either externalizing or internalizing behavior patterns
and as receiving one or fewer work and play choices from their peers.
Comparison students (n = 57) consisted of those who did not appear on
either the externalizing or internalizing rank-order teacher lists and had
more than one work and play choice from peers. The isolate and nonisolate participant groups were compared on teacher ratings of social
skills, the SSBD Stage 2 measures, and the SIMS Behavior Observation
Codes PSB variables. Table 21 contains means, standard deviations,
and results of statistical tests of participant group differences on each
of these variables. This table also contains means and standard deviations for the nonranked comparison students to provide a normative
standard for evaluation of externalizing and internalizing participants’
behavioral level.
Substantial mean differences were identified between the isolate and
nonisolate externalizing and internalizing participant groups on each of
the three major classes of dependent measures. Total scale score and all
three subscales of the Walker-McConnell scale discriminated isolates
from nonisolates on both externalizing and internalizing dimensions.
Similarly, all three Stage 2 instruments also significantly discriminated
these participant groups. Isolates had less favorable behavioral profiles on
these two classes of measures than nonisolates in every case. Four SIMS
Behavior Observation Codes PSB categories discriminated isolates from
nonisolates: the No Code category discriminated isolate from nonisolate
externalizers, and the categories of Parallel Play, Alone, and Total Positive
Behavior discriminated isolate from nonisolate internalizers. As with the
other study measures, the profiles for nonisolates were more favorable
than those for isolates on each of these variables. Overall, these results and
measures demonstrate the sensitivity of teacher judgment to differences
in child behavior as expressed through the SSBD instruments.
Table 21  Means, Standard Deviations, & Significance Tests for Isolate and Nonisolate Participants on Teacher Social Skills Ratings and SSBD Stage 2 and SIMS Observation Codes Measures

[Table 21 reports means, standard deviations, F values, and p values for normal controls and for isolate and nonisolate externalizers and internalizers on the teacher social skills ratings (Subscale 1: Teacher Preferred; Subscale 2: Peer Preferred; Subscale 3: School Adjustment; Total Scale Score), the SSBD Stage 2 measures (Critical Events Index, Adaptive Behavior Rating Scale, Maladaptive Behavior Rating Scale), and the SIMS observation coding measures (Parallel Play, Alone, No Code, Total Positive).]
Table 22  Correlations Between Year One and Year Two Follow-up Scores on SSBD Stage 2 Measures for Combined Externalizing and Internalizing Groups (N = 155)

 | Critical Events Index | Adaptive Rating Scale | Maladaptive Rating Scale
Critical Events Index | .32 (p < .01) | −.41 (p < .01) | .36 (p < .02)
Adaptive Behavior Scale | −.26 (p < .02) | .45 (p < .01) | −.39 (p < .01)
Maladaptive Behavior Scale | .34 (p < .01) | −.55 (p < .01) | .70 (p < .01)
Inspection of Table 22 indicates that the correlations for the Stage 2
measures across the 1-year follow-up period ranged from .32 (Critical
Events Index) to .70 (Maladaptive Behavior Scale). Correlations among
these measures at each time point were in the low to moderate range.
The SIMS Behavior Observation Codes variables, recorded in the
follow-up year (1987–88), were entered into a discriminant function
analysis procedure in order to test their accuracy in classifying the
participants’ group membership in the previous year (1986–87). A total
of 10 observation code categories and combinations were used as predictors in this analysis. Overall, these variables correctly classified 53%
of participants in the three student groups, with 44%, 43%, and 60% of
externalizers, internalizers and nonranked students, respectively, being
correctly classified. Table 23 lists the predictor variables for each of the
two discriminant functions in the analysis arranged by order of magnitude according to the size of their correlations within each function.
Table 23  SIMS Behavior Observation Codes Predictor Variables for Discriminant Analysis Classifying Previous Year's Participant Group Status (N = 155)

PSB Variables | Function 1 | Function 2
No Code | .59 | .21
Total Positive | −.59 | .12
Academic Engaged Time | −.56 | .02
Alone | .51 | .23
Positive Interaction | −.44 | .27
Total Negative | −.34 | .80
Negative Interaction | .20 | .75
Social Engagement | −.28 | .49
Parallel Play | .19 | −.24
Participation | .11 | −.20
These results indicate that the follow-up observation data classified the
participants’ group status at Rating Time 1 above chance expectation
levels. However, the classification efficiency of these variables was only
in the moderate range.
Overall, these results suggest relatively modest levels of predictive validity for the SSBD measures as recorded over a 1-year follow-up period.
The fact that 69% and 52% of externalizers and internalizers, respectively, appeared in their teachers’ top three Stage 1 rankings in Year 2
suggests moderate behavioral stability on the part of these participants
and considerable teacher sensitivity to their behavioral characteristics.
The stability of the participants’ maladaptive behavior, as rated by teachers, was substantially higher than the stability for either their adaptive
behavior or their status on the Critical Events Index.
Construct Validity
This type of validity refers to an instrument’s ability to measure a particular construct and is usually established through indirect evidence and
inference (Salvia & Ysseldyke, 1988). Two examples of the SSBD’s construct validity have been demonstrated. In the Walker, Severson, Todis,
Block-Pedego, Williams, Haring, and Barckley (1990) study, these authors
identified 40 elementary classrooms (grades 1–5) in a cooperating school
district in the State of Washington in which a total of 54 certified SED
students had been previously mainstreamed. The participating teachers
were not informed about these students or about their relationship to
the proposed study. These teachers completed SSBD Stages 1 and 2 for
their classrooms. SARS record searches were also completed on eight
students from each classroom (three externalizers, three internalizers,
and two nonranked students).
The authors were especially interested in whether the 54 SED students
would appear among their teachers’ top three ranks on either the externalizing or internalizing dimensions in SSBD Stage 1. Teachers assigned
all 54 SED students to either the externalizing (n = 45) or internalizing
(n = 9) rank order lists. In terms of their actual rank orders, 39 of 45
externalizers were ranked by their teachers as among the top three students on the externalizing list (n = 10), while all 9 internalizers appeared
among the top three ranks. These results provide support for the sensitivity of the SSBD in measuring the construct of school-related behavior
disorders.
Severson, Walker, Barckley, Block, Todis, and Rankin (1989) investigated
the construct validity of the SSBD procedure using participants enrolled
in grades 1, 3, and 5 of a single elementary school in Washington State.
Data were collected on all students (N = 76) in the three classrooms
using the following instruments:
•• Sociometric assessments (work and play)
•• Walker-McConnell Scale of Social Competence and School
Adjustment
•• SIMS Behavior Observation Codes for Academic Engaged Time (AET)
•• SIMS Behavior Observation Codes for Peer Social Behavior (PSB)
•• School Archival Records Search (SARS)
•• SSBD Critical Events Scale
•• SSBD Adaptive and Maladaptive Behavior Scales
The focus of analyses conducted on these variables was to confirm the
degree of problematic adjustment status of high-ranked externalizers
and internalizers in SSBD Stage 1 as indicated by multiple indicators of
school adjustment (e.g., sociometrics, direct observations, school records,
social skills ratings). These school adjustment indicators provided a large
number of specific measures that could register confirmatory evidence
of school adjustment problems.
Two subsets of five variables each from this array were chosen for the
internalizing and externalizing participant groups on the basis of both
theoretical and empirical grounds. That is, five variables were selected
for each behavioral dimension that fit a theoretical model of factors
likely to contribute to internalizing vs. externalizing dimensions and
that manifested a unique contribution to total variance based on their
partial correlations within each dimension. Five variables were judged to
be an appropriate number for a sample size of 76. This five-variable set
for externalizers consisted of:
•• School adjustment subscale of the Walker-McConnell Social Skills
Scale
•• Sociometric work and play choices
•• Amount of negative social interaction, based on SIMS Behavior
Observation Codes
•• Amount of total positive social behavior, based on SIMS Behavior
Observation Codes
•• SIMS Behavior Observation Codes PSB No Code category
The variable set for internalizers was as follows:
•• Peer subscale of the Walker-McConnell Social Skills Scale
•• Sociometric work and play choices
•• Referral for speech and language (SARS)
•• Amount of time spent alone on the playground, based on SIMS
Behavior Observation Codes
•• Amount of time spent in parallel play at recess, based on SIMS
Behavior Observation Codes
These variables were then submitted to a discriminant function analysis
procedure using a direct entry method where the groups to be discriminated were the top three ranked students (teacher ranks 1–3) vs. the
bottom seven (teacher ranks 4–10) on both the externalizing and internalizing dimensions. These analyses created two discriminant functions
which were linear combinations of weighted standard scores, thus
making it possible to aggregate these sources of data in order to create
an overall score for each participant. This score served as an aggregated
criterion variable for establishing the “behavioral at-risk status” of each
participant. The students who appeared on the externalizer (n = 10)
and the internalizer (n = 10) teacher-generated lists were respectively
rank ordered on this “risk index” according to the composite scores
they received based on the five variables identified for externalizers or
internalizers. The two discriminant function analyses determined how
well SSBD Stage 1 teacher rankings correctly classified participants as
being in the top three ranks vs. the bottom seven ranks on this index for
the externalizing and internalizing dimensions.
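To illustrate the style of computation described above, the following sketch runs a direct-entry discriminant function analysis on a small, entirely hypothetical data set. It uses scikit-learn's LinearDiscriminantAnalysis as a stand-in for the statistical software actually used in the study; the predictor values, group labels, and sample size are placeholders, not study data.

```python
# Illustrative sketch only: a direct-entry discriminant function analysis of the
# kind described above. All data below are hypothetical placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Hypothetical standardized scores on five externalizer-relevant variables for
# ten teacher-ranked students (ranks 1-3 vs. ranks 4-10).
X = rng.normal(size=(10, 5))                   # rows: students; columns: predictors
y = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])   # 1 = top three ranks, 0 = ranks 4-10

lda = LinearDiscriminantAnalysis()             # all five predictors entered directly
lda.fit(X, y)

# The discriminant function is a weighted linear combination of the predictors;
# the composite score plays the role of the aggregated "risk index."
risk_index = X @ lda.coef_.ravel() + lda.intercept_
print("Risk index (first three students):", np.round(risk_index[:3], 2))

# Proportion of students whose index-based classification matches their rank group.
print("Correct classification rate:", lda.score(X, y))
```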
The Spearman rank order correlation between the problem behavior index
and the SSBD Stage 1 rankings was rho = .71 (p < .001) for externalizers
and rho = .76 (p < .001) for internalizers. The results of the discriminant
function analyses indicated that Stage 1 teacher rankings correctly
classified 73% of externalizers as being in the top three ranks vs. ranks
4–10 on the index, and similarly, Stage 1 teacher rankings correctly
classified 82% of the internalizers (ranks 1–3 versus 4–10) on this index.
These classification rates are well above chance expectations. The resulting
Wilks' lambdas of .54 and .50 for these analyses were both
significant at p < .05.
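The rank-order agreement reported here can be reproduced, in principle, with any routine that computes Spearman's rho between two sets of ranks. The brief sketch below uses SciPy and hypothetical rankings purely to illustrate the calculation.

```python
# Illustrative sketch only: Spearman rank-order correlation between a teacher-
# generated Stage 1 ranking and a ranking on the composite risk index.
# The ranks below are hypothetical, not the study's data.
from scipy.stats import spearmanr

teacher_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # SSBD Stage 1 rank order
risk_index_ranks = [2, 1, 4, 3, 5, 7, 6, 10, 8, 9]   # ranking on the composite index

rho, p_value = spearmanr(teacher_ranks, risk_index_ranks)
print(f"rho = {rho:.2f}, p = {p_value:.3f}")
```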
In the authors' view, these results provide confirmatory evidence, of a
largely external nature, that teacher rankings on the externalizing and
internalizing dimensions, as expressed through the structured format
of the SSBD procedure, are predictive of school adjustment problems as
indicated by multiple measures of behavioral status. The correlations
between teacher-generated Stage 1 rankings and rankings of the same
participants on the externally generated behavioral deviance index are in
the moderate to high range and provide evidence in support of the SSBD's
construct validity.
Social Validity
The SSBD procedure is broadly perceived by both experts and practitioners in education as representing an innovative and best-practices
approach to the task of behavioral screening and identifying appropriate
candidates for further evaluation and access to supports, services, and
intervention(s). A large number of educators and psychologists in higher
education, state education department offices, and school district settings have endorsed the SSBD as an effective response to the need for
systematic universal screening and identification and the development
of higher quality referrals (i.e., students with behavioral problems/disorders whose school adjustment is seriously impaired).
On the basis of documented need and the strength of the research data
presented, and following a rigorous peer review, the SSBD received
program validation in the form of approval by the U.S. Department of
Education’s Program Effectiveness Panel in February 1990. This panel of
experts in the field of evaluation viewed the SSBD as effective in meeting
its stated goals, as adoptable and transportable to other sites, and as a
program that fills a critical need in education. PEP validation made the
SSBD eligible for access to funds for supporting its national dissemination and adoption. Since its publication, the SSBD has been positively
reviewed by a number of professionals concerned with the EBD student
population.
Phase 3: Extensions of SSBD Instruments
The SSBD was originally validated for use with students in Grades
1–6. Since its release, practitioners and researchers have explored the
relevance of the SSBD to other contexts outside of Grades 1–6. These
extensions characterize Phase 3 of SSBD validation, which is ongoing
and seeks to validate the use of the SSBD with students in other grades,
including preschool/kindergarten and middle school/junior high
students.
The Early Screening Project: Using the SSBD with
Preschool and Kindergarten Students
Since introduction of the SSBD, there has been considerable interest by
other researchers in using the system with preschool children, and preliminary investigations confirmed that the SSBD could be successfully
adapted for preschool use.
History and Development of ESP
The primary adaptation of the SSBD procedure was to alter the SIMS
Behavior Observation Codes procedures. Eisert, Walker, Severson,
and Block (1989) found the Peer Social Behavior observations were
able to discriminate reliably among preschool groups of externalizers,
internalizers, and control children. Sinclair, Del’Homme, and Gonzalez
(1993) also reported a pilot study using the SSBD with preschool children. Sinclair et al. used the SSBD intact except that (1) in Stage 1, the
teachers were asked to nominate and rank seven externalizers and seven
internalizers (out of classes of 15), (2) the direct observation of Academic
Engaged Time was eliminated, and (3) the direct observation of Peer
Social Behavior during free play in the classroom and on the playground
was doubled to four 10-minute sessions. The three top-ranked externalizers
and internalizers were then assessed with the SSBD Stage 2 rating
scales and the SIMS Behavior Observation Codes.
While their results were encouraging, Sinclair et al. found that changes
were needed to make the SSBD more appropriate for the preschool population. For example, the cutoff criteria for defining problem children
needed adjustment to take into account the developmental status of
younger vs. older children (e.g., younger children engage in more parallel
and solitary forms of play than do older students). In 1990 Edward Feil,
working with the SSBD authors Walker and Severson, began to modify
the SSBD to make it appropriate for younger children. From 1990 to 1994,
this research was supported in part through grants from (1) the U.S.
Department of Education, Office of Special Education and Rehabilitative
Services, Research in Education of the Handicapped Program: Student
Initiated and Field-Initiated Research, and (2) the U.S. Department of
Health and Human Services, Administration for Children and Families:
Head Start Research Fellows Program. This resulted in publication of
Feil’s dissertation research (Feil, 1994; Feil & Becker, 1993). The modified
screener became known as the Early Screening Project (ESP).
In revising the original elementary-based SSBD for use with preschoolers,
the authors found it necessary to consider changing some of the SSBD
procedures. Because most preschool children will exhibit some problem
behaviors at one time or another (Campbell, 1990; Paget, 1990), the
frequency and intensity of the behaviors were most likely the important
discriminative features. The Stage 2 SSBD behavior checklist measures
were substantially modified to make them appropriate for rating preschool-level children. Approximately half of the occurrence/nonoccurrence items on the Critical Events Index were changed to a 5-point Likert
scale to allow a better report on frequency and/or intensity of behavior
problems. Also, items regarding academics were omitted due to their
inapplicability to preschool activities, and wordings were changed to
make the items more appropriate to preschool children. Items that specifically referred to aggressive acting-out behavior were put together into a
new scale titled the Aggressive Behavior Scale. Consequently, nine SSBD
occurrence/nonoccurrence items were converted to frequency ratings
and used for externalizers only. The Critical Events Index for preschool
and kindergarten contains 16 occurrence/nonoccurrence items, and the
Aggressive Behavior Scale consists of nine 5-point Likert response scales
that are sensitive to both frequency and intensity dimensions. In order
to better distinguish children with internalizing behavior problems—
who are generally more difficult to identify accurately—the authors
used the Social Interaction Rating Scale (Hops, Walker, & Greenwood,
1988) for children who were highly ranked as internalizers. This scale
uses eight 7-point Likert-type scale behavioral items that (1) correlated
with observational measures of social interaction, and (2) discriminated
between appropriate referrals and nonranked peers. A score of 28 or less
successfully discriminated between referred children and their typical
nonreferred classmates, with 90% correct classification.
This promising adaptation of the SSBD procedure for use with preschools led to additional validation studies and inclusion of preschool
and kindergarten students in screening procedures outlined in the 2nd
edition of the SSBD. These validation studies support the reliability and
validity of preschool and kindergarten application of the SSBD, and are
described in more detail below.
Validation Studies: Reliability
Interrater Reliability
Pearson correlations and kappa coefficients between raters (i.e., teacher/
assistant teacher pairs) were computed for Stage 1, Stage 2, and the
concurrent measures (i.e., Preschool Behavior Questionnaire and Conners
Teacher Rating Scale, when applicable) to obtain interrater reliability
coefficients. For Stage 1, a cross-tabulation table was constructed
considering only whether a child was nominated by the teacher and the
assistant teacher to be among the three highest ranked externalizers and
internalizers. Kappa coefficients computed between the teachers and
assistant teachers ranged from .42 to .70. These coefficients show that
Stage 1 has adequate reliability for screening purposes. In Stage 2,
comparing the teachers' and assistant teachers' scale scores resulted in
highly significant reliability coefficients ranging from .48 to .79, with a
median coefficient of .71. These coefficients are comparable to those of
the Preschool Behavior Questionnaire (Behar & Stringfield, 1974) and
Conners Teacher Rating Scale (1989), two published measures used for
the identification of preschool behavior problems (e.g., Attention Deficit
Hyperactivity Disorder and Oppositional Defiant Disorder). The
observational interrater reliability coefficients were calculated from a
random sample of 20% of the observations; interrater reliability was
derived by dividing the smaller score by the larger score. In two research
studies, this provided a proportion indicator of rater differences weighted
for length of observation and resulted in coefficients of .87 and .88, which
are within acceptable limits for a screening device of this type (Salvia &
Ysseldyke, 1988).
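The two agreement indices described above are simple to compute. The sketch below illustrates both with hypothetical data (teacher/assistant Stage 1 nominations for the kappa coefficient, and two observers' session totals for the smaller-divided-by-larger ratio); the values are not drawn from the ESP samples, and scikit-learn's cohen_kappa_score is used here only as a convenience.

```python
# Illustrative sketch only: two interrater agreement indices, computed on
# hypothetical data rather than the ESP validation samples.
from sklearn.metrics import cohen_kappa_score

# (1) Kappa between teacher and assistant-teacher Stage 1 nominations
# (1 = child nominated among the three highest ranked, 0 = not nominated).
teacher_nominations   = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
assistant_nominations = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
print("kappa =", round(cohen_kappa_score(teacher_nominations, assistant_nominations), 2))

# (2) Observational agreement as the smaller score divided by the larger score,
# e.g., two observers' totals for the same observation session.
observer_a, observer_b = 47.0, 53.0
agreement_ratio = min(observer_a, observer_b) / max(observer_a, observer_b)
print("agreement ratio =", round(agreement_ratio, 2))
```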
Test-Retest Reliability
For test-retest reliability, teachers and assistant teachers were asked to
rank order and rate the children again in the spring after a 6-month
interim period. In Stage 1, considering only whether a child was nominated by the teacher and assistant teacher to be in the three highest
ranked externalizer and internalizer groups, a cross-tabulation table
was constructed to examine stability over time. Kappa coefficients were
computed between the teachers and assistant teachers and resulted in
coefficients of .59 for externalizers and .25 for internalizers. These coefficients show a drop, but this is to be expected with 6 months between
data collection periods. Classroom fall and spring scores on the Critical
Events, Adaptive, and Maladaptive scales were compared and resulted
in highly significant correlations ranging between .75 and .91, with a
median correlation of .77. Correlations of classroom fall and spring scores
on the two concurrent measures in this study (i.e., Preschool Behavior
Questionnaire and Conners scales) resulted in highly significant coefficients ranging between .61 and .79, with a median correlation of .72. One
study assessed the ESP measures as compared to a concurrent measure
(i.e., Conners scale) over a 1-year test-retest reliability period. Pearson
correlation coefficients of the ESP Stage 2 measures were generally
greater than the Conners stability coefficients. With the exception of the
Critical Events Scale, the correlation coefficients of the Stage 2 measures
were highly significant (p < .001). Although the attrition rate was high
(from 121 to 26 subjects) and therefore makes these results inconclusive,
the representativeness of the study’s participants and the strong validity
coefficients of the Stage 2 measures are encouraging after a time span
of more than 1 year (November 1991–February 1993). These results are
above expectations for coefficients over a 1-year time span (Elliot, Busse,
& Gresham, 1993).
Consistency Across Measures
The consistency across measures was examined by comparing the
standard T-scores (M = 50, SD = 10) of the children ranked highest on
Stage 1 externalizer and internalizer dimensions and children ranked
as average (nonranked) who served as a control comparison across ESP
and concurrent measures. As Figure 1 shows, these groups were
discriminated on all measures used and were most clearly differentiated
on the Aggressive, Adaptive, and Maladaptive Behavior scales. Both the
externalizer and internalizer groups had relatively equivalent scores on
the Critical Events Index and the Social Behavior Observation.
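The T-score metric used in Figure 1 is the conventional linear transformation of a raw score to a scale with a mean of 50 and a standard deviation of 10. A minimal sketch follows; the normative mean and standard deviation shown are hypothetical placeholders, not ESP norm values.

```python
# Illustrative sketch only: converting a raw scale score to the T-score metric
# (M = 50, SD = 10). The normative statistics below are hypothetical.
def to_t_score(raw, norm_mean, norm_sd):
    """Return the T-score equivalent of a raw score given normative statistics."""
    z = (raw - norm_mean) / norm_sd
    return 50 + 10 * z

print(to_t_score(raw=24, norm_mean=18.0, norm_sd=4.0))  # -> 65.0
```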
Figure 1  Means of Children Ranked Highest on Externalizing Dimension, Internalizing Dimension, and Nonranked Peers on T-Scores of ESP Measures

[Figure 1 plots mean T-scores (approximately 40 to 65) for the Externalizer, Internalizer, and Nonranked groups across the Critical Events, Aggressive Behavior, Adaptive Behavior, Maladaptive Behavior, and Negative/Nonsocial Behavior Observation measures.]
Validation Studies: Validity
Content Validity
Content validity is the degree to which a measure is representative of
the domain of interest (Elliot et al., 1993). In this case, content validity
refers to externalizing and internalizing behavioral dimensions. Content
validity was inferred from three data sources: empirical findings from
past studies, the judgments of a panel of experts, and preschool teacher
feedback. In the formulation phase of this research (from October 1990
to June 1991), all the above sources were consulted. The literature search
was completed in Fall 1991 and is represented in the item selection and
adaptations of the ESP instruments. A draft of the ESP was presented to a
panel of experts during Fall 1990. The few changes suggested were minor
and were implemented before any data were collected. A pilot study was
conducted in Spring 1991 in one preschool classroom of nine children
and two teachers. After completion of the Stage 2 behavior questionnaires, these teachers did not object to any of the items on the ESP.
Concurrent Validity
The concurrent validity of ESP measures was examined through correlations with the Behar and Connors scales. These data showed very
good overall concurrent validity, with significant correlations ranging
from .19 to .95 and a median and mode of .69 and .80, respectively. The
Aggressive, Adaptive, and Maladaptive Behavior Scales also showed substantial concurrent validity. Consistent with past findings, the observational data have lower correlations than the teacher rating scale data. All
the ESP scales were statistically significant on at least two of the three
concurrent scales. Further concurrent validity of the ESP was examined
by comparing the Stage 2 behavior questionnaire with SIMS Behavior
Observation Codes variables using Pearson’s correlations. Most of the
correlation coefficients were significant, ranging from .23 to .35. Because
these data are from a different source (i.e., observational measures vs.
teacher ratings), the low correlations (r) for these measures is expected
and was typical (Elliot, Busse, & Gresham, 1993; Schaughency & Rothlind, 1991).
Discriminative Validity
Discriminant function analysis using the general linear model estimates
the accuracy of a set of dependent measures in predicting a priori groupings. The a priori groups were teacher recommendation of Behavior
Disorders (BD) eligibility status (i.e., whether the teacher listed the child
for further evaluation for BD status), and the dependent measures were
the ESP measures. A discriminant analysis provides a measure of the accuracy
of the ESP with specificity and sensitivity coefficients. Specificity and
sensitivity are important criteria when choosing an assessment method
(Elliot et al., 1993). Sensitivity is the percentage of true positives, and
specificity is the percentage of true negatives (Schaughency & Rothlind,
1991). The discriminant classification resulted in sensitivity and specificity rates ranging from 62% to 100% and 94% to 100%, respectively. This
shows that the ESP has a low false diagnosis rate. An overall MANOVA
test of the group means for the ESP measures on the combined samples
found a highly significant difference (F = 24.67, df = (7,203), p < .001)
between those students identified by teachers and those who were not
identified. The discriminant function and MANOVA test indicate that
the ESP is an accurate measure for predicting BD behaviors in preschoolers. The discriminant function results show that the ESP has a very low
chance of overidentifying children with behavior problems. Usually it
would be desirable for a screening instrument to slightly overidentify
potentially at-risk children because later assessment could separate the
false positives from the true positives. Since the issue of labeling young
children with behavior disorders can be fraught with personal feelings
of stigmatization, the ESP’s small chance of obtaining a false-positive
outcome is an asset. That is, practitioners can be confident that a child
who is identified with the ESP is actually different from his/her peers.
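Sensitivity and specificity follow directly from the cross-tabulation of screening decisions against the a priori groups. The sketch below illustrates the arithmetic with hypothetical screening outcomes; it is not drawn from the ESP validation data.

```python
# Illustrative sketch only: sensitivity and specificity from hypothetical outcomes
# (1 = teacher-recommended for BD evaluation / identified by the screener).
teacher_recommended = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # a priori group membership
esp_identified      = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # screening decision

pairs = list(zip(teacher_recommended, esp_identified))
tp = sum(1 for t, e in pairs if t == 1 and e == 1)   # true positives
fn = sum(1 for t, e in pairs if t == 1 and e == 0)   # false negatives
tn = sum(1 for t, e in pairs if t == 0 and e == 0)   # true negatives
fp = sum(1 for t, e in pairs if t == 0 and e == 1)   # false positives

sensitivity = tp / (tp + fn)   # proportion of true positives correctly identified
specificity = tn / (tn + fp)   # proportion of true negatives correctly identified
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```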
Treatment Utility
Treatment utility is the degree to which assessment activities are shown
to contribute to beneficial intervention outcomes (Hayes, Nelson, & Jarrett, 1987). To assess the ESP’s utility for intervention, it was used as part
of an intervention study of The First Step to Success Program. First Step is
a school enhancement program for kindergartners and their parents that
targets three areas that are very important for every child’s school success:
(1) getting along with teachers, (2) getting along with peers, and (3) doing
school work. Trained consultants provided a behavior intervention plan
for 25 at-risk children identified by the ESP with problematic aggressive
and externalizing behaviors. As shown in Figure 2, teacher ratings of the
children’s behavior improved after the First Step intervention. Adaptive
Behavior scores and percent of Academic Engagement increased, while
Maladaptive Behavior and Child Behavior Checklist (Achenbach & Edelbrock, 1986) scores decreased from preintervention to postintervention.
These results indicate that the ESP can be used as an effective monitor
of intervention effects as well as a streamlined identification procedure.
Figure 2  First Step Intervention Results on the SSBD

[Figure 2 displays preintervention and postintervention scores (plotted on a scale of 10 to 90) for Adaptive Behavior, Maladaptive Behavior, the Child Behavior Checklist, and Academic Engagement.]
Summary of ESP Technical Adequacy
This line of ESP research consists of a series of studies designed to evaluate
the psychometric properties of the ESP. The results from these studies
show that the ESP can be used with diverse groups of preschool children,
that the results can be interpreted with confidence, and that the
instruments meet criteria for acceptable technical adequacy. As noted earlier,
the correlations between Stage 2 teacher measures and SIMS Behavior
Observation Codes variables were low, but this is an expected outcome
(Cairns & Green, 1979; Schaughency & Rothlind, 1991). Observational
measures record child behavior directly with less bias and filtering of
information. However, observational measures are very sensitive to ecological variables, such as situation-dependent interactions and physical
settings. Both ratings and observational measures are important to
develop an effective understanding of the child within the preschool
context. Ratings appear to be more effective predictors of individual differences, and observations appear to be more effective in the analysis of
interactional regulation and development (Cairns & Green, 1979). Both
kinds of ESP data and analyses are important to understanding behavior
problems with their socially dependent basis.
In sum, the ESP has excellent psychometric characteristics and procedures that justify its use for its intended purposes. The ESP meets
current standards for special education best practices in student decision making. The ESP also conforms to developmental standards for
procedural integrity among preschool-age populations (Bredekamp,
1987). The ESP assesses preschool-age children’s social and emotional
behavior with multimethodological techniques and with an emphasis
on teacher judgments (Stages 1 and 2). Developmental differences
between preschool and school-age children have been accounted for in
developing the ESP. Finally, the technical adequacy of the ESP rating
scales demonstrates that teachers have a wealth of normative information
regarding children's development and competencies across differing
domains and situations. The ESP procedures take advantage of teachers'
extensive normative knowledge base through cost-effective and systematic
screening procedures. In addition to normative teacher ratings, the ESP
includes direct observations of the child’s behavior in the context of peer
interactions. The information gained in these assessments can be used
to plan interventions, identify children with special needs, communicate
with parents, and evaluate program effectiveness.
Using the SSBD With Students in Grades 7–9
The SSBD has also been used successfully within the middle and junior
high school context. While less research attention has focused on this
population to date, emerging findings suggest that the SSBD can be an
effective tool in identifying students at-risk for externalizing and internalizing disorders during the middle school years.
The work of Caldarella and colleagues has successfully replicated the
SSBD procedure for use at middle and junior high school levels (See
Caldarella, Young, Richardson, Young, & Young, 2008; Richardson,
Caldarella, Young, Young, & Young, 2009). These investigators used
the standard SSBD implementation guidelines and procedures in this
process. In Caldarella et al. (2008), the SSBD was administered to
adolescents in middle and junior high schools in Utah. The SSBD was
implemented within two suburban secondary schools that enrolled
a total of 2,173 students. The ASEBA Teacher Report Form and Social
Skills Rating System were also administered to students nominated in
Stage 1 in order to compare the SSBD with other established measures of
student behavior.
Findings from this study provide support for:
•• Concurrent validity of the SSBD at Stage 1: The number of office
discipline referrals and grade point average for students nominated
during Stage 1 differed significantly from ODRs and GPA of
students not identified.
•• Internal consistency and interrater reliability at Stage 2: Internal
consistency estimates for the Critical Events Index, Adaptive
Behavior Scale, and Maladaptive Behavior Scale were in the
adequate range (.54–.90). Additionally, interrater reliability
correlations indicated adequate agreement between teachers on
ratings (.58–.60).
•• Convergent validity at Stage 2: Students identified as at risk on the
SSBD were also identified as having higher levels of externalizing
or internalizing behavior on the other rating scales. Significant
correlations (p < 0.05) between SSBD Stage 2 measures and the
ASEBA Teacher Report Form, Social Skills Rating System, number
of office discipline referrals, and grade point average provide
support for the convergent validity of Stage 2 measures.
More research should be conducted to assess the reliability and validity
of using the SSBD with middle school students, but these initial findings
provide support for extending screening with the SSBD to older students.
Caldarella and colleagues concluded that, based on their SSBD research
in these contexts, the screening procedure works effectively with middle
school students and teachers without the need for structural adaptation
or redesign. As with the development of the Early Screening Project
(ESP), a preschool adaptation of the SSBD, these research outcomes
substantially extend the SSBD's applicability across the age-grade
range from preschool through middle school (Feil, Walker, Severson, &
Ball, 2000; Lane et al., 2012; Walker, Severson, & Feil, 1995).
UPDATE ON SSBD RESEARCH
AND OUTCOMES
Research Conducted by Other Professionals
Since its publication, excellent research has been conducted on the SSBD
by numerous professionals. By far, the most prolific researcher in this
regard has been Kathleen Lane and her colleagues. Lane et al. have also
provided a superb review of the empirical and practical knowledge
bases on the SSBD system (see Lane, Menzies, Oakes, & Kalberg, 2012,
Chapter Two). They have conducted the most comprehensive research to
date on the SSBD and have used the instrument (1) as a criterion measure
for evaluating other screening approaches (Lane, Kalberg, Lambert,
Crnobori, & Bruhn, 2010), (2) as an efficacious screening approach in
its own right (Lane, Oakes, & Menzies, 2010), and (3) as a system for
enhancing and evaluating the impact of behavioral-academic interventions (Lane et al., 2012). Additional applications of the SSBD by Lane
and her colleagues include monitoring at-risk students’ behavior within
and across school years, preventing negative behavioral and academic
outcomes, and providing a basis for determining movement across tiers
within three-tiered PBIS approaches. Lane concluded her review by
noting that the SSBD is a cost-effective screening approach with excellent
psychometrics of its constituent measures, and is the only commercially
available screening system that identifies both externalizers and internalizers. She notes that the SSBD is regarded by many professionals as
the gold standard of universal behavioral screening (Lane et al., 2012).
SSBD research has also been conducted by Cheney and his associates at
the University of Washington. Their research has focused primarily on
elementary-age students (see Cheney & Breen, 2008; Cheney, Flower, &
Templeton, 2008; Walker, Cheney, Stage, & Blum, 2005) within studies of
Positive Behavior Support approaches and contexts.
Other SSBD applications have been reported as well. Epstein and Cullinan
(1998) found positive convergent validity between the SSBD and the
Scale for Assessing Emotional Disturbance, and Epstein and Sharma
(1998) found similar relationships between the SSBD and the Behavioral
and Emotional Rating Scale. Further, Epstein, Nordness, Nelson, and
Hertzog (2002) reported moderate convergent validity between the SSBD
and the BERS measure of behavioral and emotional functioning. Lane et
al. (2012) have empirically documented strong relationships
between the SSBD and the Drummond Student Risk Screening Scale
(Drummond, 1994). Finally, Kamps and colleagues have used the SSBD
effectively as a screening evaluation measure in a series of prevention
and intervention studies with students having EBD or at risk for same
(see Kamps, Kravits, Rauch, Kamps, & Chung, 2000; Kamps, Kravits,
Stolze, & Swaggart, 1999).
The extensive use of the SSBD in research and related applications by
other professionals and the positive empirical and practical outcomes
associated with its use are gratifying (Severson, Walker, Hope-Doolittle,
Kratochwill, & Gresham, 2007). These results indicate that the SSBD
instrument continues to be a valid and reliable tool in addressing the needs
of behaviorally at-risk students and the school staffs that accommodate
them. Next, we report some recent research findings from the SSBD
authors and their colleagues resulting from the SSBD's use in two large-scale
randomized controlled trials of the First Step to Success program
(Walker et al., 1997).
Research Conducted by the SSBD Authors
and Colleagues
Walker and his colleagues conducted two randomized controlled trials
(RCTs) of the First Step Early Intervention program in which the SSBD
procedure was used as a screening device and also as a measure of
intervention outcomes from pre- to postintervention to 1-year follow-up
assessments. The first RCT was a 4-year efficacy trial of First Step to
Success (Walker et al., 2009), conducted in the Albuquerque, NM, school
district, where 72% of students were students of color and 70% were
eligible for free and reduced-price lunch. The Albuquerque district ranks
as the 17th largest in the United States. The second RCT was a national
effectiveness trial of First Step involving six participating sites in Oregon,
California, Illinois, West Virginia, and Florida.
A total of 200 primary grade participants in the efficacy trial were evenly
divided between intervention and usual care conditions. In the national
effectiveness trial, a total of 286 students were enrolled across the six
sites, with 142 students in the intervention group and 146 in the comparison, usual care condition. The efficacy trial is reported in Walker,
Seeley, Small, Severson, Graham, Feil, Serna, Golly and Forness (2009);
the effectiveness trial is reported in Sumi, Woodbridge, Javitz, Thornton,
Wagner, Rouspil, Yu, Seeley, Walker, Golly, Small, Feil, & Severson (2012).
The SSBD proved to be a reliable and accurate instrument for the
broad-based screening of general education classrooms in these two
large-scale studies. The SSBD Stage 2 measures and the Academic Engaged
Time (AET) code of the SIMS Behavior Observation Codes proved to be
sensitive in discriminating between externalizing and comparison students
at baseline and in documenting gains produced by the First Step
intervention in these two studies. In terms of effect sizes, the efficacy trial
reflected these gains: the figures were .82 for Adaptive Behavior ratings
and .87 for Maladaptive Behavior ratings by participating teachers, and
the effect size for observer-recorded AET was .44. For the effectiveness
RCT, the comparable figures were, respectively, .42, .67, and .35. The
relative sensitivity of these SSBD measures provides support for their
general use as outcome measures for short-term interventions
in school settings.
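The effect sizes reported above are standardized mean differences. As a point of reference only, the sketch below computes a Cohen's d-type effect size from pooled group statistics; the group values are hypothetical, and the trials' actual estimator may have differed (for example, by adjusting for baseline scores or clustering).

```python
# Illustrative sketch only: a standardized mean-difference (Cohen's d-type) effect
# size computed from hypothetical group statistics.
import math

def cohens_d(mean_tx, sd_tx, n_tx, mean_ctrl, sd_ctrl, n_ctrl):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_sd = math.sqrt(((n_tx - 1) * sd_tx**2 + (n_ctrl - 1) * sd_ctrl**2)
                          / (n_tx + n_ctrl - 2))
    return (mean_tx - mean_ctrl) / pooled_sd

print(round(cohens_d(mean_tx=42.0, sd_tx=9.0, n_tx=100,
                     mean_ctrl=35.0, sd_ctrl=9.5, n_ctrl=100), 2))  # -> 0.76
```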
Table 24  Correlations Between SSBD Stage 2 Measures and Achenbach TRF Scales and SSRS Scales

Achenbach TRF Scales
 | Conduct Disorder | Oppositional Defiant Disorder | Aggression
CEI | .53 | .54 | .62
ABI | −.44 | −.40 | −.43
MBI | .57 | .68 | .69

SSRS Scales
 | Social Skills | Problem Behavior | Academic Competence
CEI | −.45 | .59 | −.23
ABI | .53 | −.33 | .20
MBI | −.37 | .62 | −.21

SSRS Parent Ratings
 | Social Skills Scale | Problem Behavior Scale
CEI | −.13 | .30
ABI | .14 | −.30
MBI | −.09 | .29

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
These two studies used randomized controlled designs and so provided an opportunity to further establish the concurrent and convergent
validity of the SSBD Stage 2 screening measures. Table 24 shows correlations between the SSBD Stage 2 measures and the Achenbach TRF scales
of Conduct Disorder, Oppositional Defiant Disorder, and Aggression
and the Gresham and Elliott SSRS social skills, problem behavior, and
academic competence scales. Correlations between SSBD measures and
parent ratings of social skills and problem behavior scales of the SSRS
are also shown.
Seeley, Small, Walker, Feil, Severson, Golly, and Forness (2009) analyzed
the Albuquerque RCT database for the First Step program's sensitivity to
students with ADHD. This analysis allowed calculation of correlations
between the Conners ADHD Scale (Conners, 1997) and the SSBD Stage
2 measures. Correlations were run for teacher-reported total score,
inattentive score, and hyperactive score. For the CEI, these correlations
were: .31, .32, and .36. For the ABI, the correlations were −.37, −.42, and
−.50. Finally, for the MBI they were .47, .43, and .34.
For the most part, these validity coefficients were in the moderate range,
with the exception of the correlations between SSBD measures and parent
ratings of social skills and problem behavior on the Gresham and Elliott
scale. They provide further confirmation that teachers and parents view
the behavior of the same children quite differently and also that children
may well behave differently across home and school settings.
Caldarella and his colleagues (Caldarella et al., 2008) conducted an
analysis of the CEI in which they identified externalizing items and
internalizing items by evaluating which of them discriminated significantly between Stage 1 externalizing- and internalizing-identified
students using SSBD data from middle and junior high school students.
In a similar fashion, Small (2014), as part of an analysis of the SSBD’s
psychometrics, identified lists of externalizing, internalizing, and other
items using a combination of content analysis by experts followed by
confirmation through factor analyses. The sample used for this analysis
consisted of 2,237 cases of primary grade-level students distributed
among three separate studies (n = 723, 1,098, and 416, respectively), in
which the SSBD served as a universal screener within investigations of
the First Step program’s efficacy and effectiveness. The externalizing
items identified are listed by their item numbers as follows: 1, 2, 4, 5, 9,
10, 15, 20, 21, 23, 25, and 26. Identified internalizing items are 3, 6, 7, 8,
11, 12, 14, 16, 17, 18, 27, and 33. The other items are 13, 19, 22, 24, 28, 29,
30, 31, and 32. Although these analyses do not change the way the CEI
is scored for decision-making purposes regarding screening, having this
item classification available may be helpful to professionals for diagnostic purposes and for use in identifying strategies and interventions for
use with externalizing and internalizing students.
Overall, the above research outcomes a) clearly establish the sensitivity
of the Stage 2 Adaptive and Maladaptive Behavior Scales, as well as the
AET code, in documenting intervention effects produced by a
well-implemented behavioral intervention, and b) show that the CEI, ABI,
and MBI are moderately related to three well-known and highly respected
scales for assessing social competence, ADHD symptoms, and behavioral
pathology, respectively, among general education students. These results
extend the applicability of these SSBD measures well beyond universal
screening purposes.
CONCLUSION
This Technical Manual has described the research and development
process the SSBD authors and their colleagues conducted to establish the
psychometric integrity, efficacy, and social validity of the SSBD procedure and the instruments that make up each of its screening stages. The
resulting outcomes of this 5-year development and testing process are
impressive in establishing the SSBD’s accuracy, validity, and reliability.
There appears to be little doubt that students who meet risk criteria on
the SSBD have very serious behavioral adjustment problems of either an
externalizing or internalizing nature. As such, the quality and accuracy
of teacher referrals are likely to be significantly enhanced when using the
SSBD or multi-gating systems like it (see Kettler et al., 2014). In addition,
“all” students enrolled in general education classrooms will have the
opportunity, through the SSBD’s systematic application, to be screened
on a regular basis and to access needed services.
The SSBD meets best-practice standards in the areas of proactive,
universal screening and in providing information for use in designing
interventions and matching students’ problems with available programs,
placements, and/or intervention procedures. The accompanying SSBD
Administrator’s Guide provides instructions and guidelines for applying
the SSBD system with high levels of implementation fidelity. This guide
is designed to provide a roadmap for the SSBD coordinator to use in
setting up, implementing, and troubleshooting the SSBD’s application.
REFERENCES
Achenbach, T. M. (1978). The child behavior profile. I. Boys aged 6–11.
Journal of Consulting and Clinical Psychology, 46, 478–488.
Achenbach, T., & Edelbrock, C. (1979). The child behavior profile. II.
Journal of Consulting and Clinical Psychology, 47, 223–233.
Block-Pedego, A., Walker, H. M., Severson, H. H., Todis, B., & Barckley,
M. (1989). Behavioral correlates of social isolation occurring within free
play settings (Technical Report). Eugene, OR: Oregon Research Institute.
Caldarella, P., Young, E., Richardson, M., Young, B., & Young, K. (2008).
Validation of the Systematic Screening for Behavior Disorders in middle
and junior high school. Journal of Emotional and Behavioral Disorders,
16, 105–117.
Cheney, D., Breen, K., & Rose, J. (2008). Universal school-wide screening
to identify students for Tier 2/Tier 3 interventions. National Forum for
Implementation of School-wide Positive Behavior Supports. Chicago, Ill.
Cheney, D., Flower, A., & Templeton, T. (2008). Applying response to
intervention metrics in the social domain for students at-risk of developing emotional and behavioral disorders. Journal of Special Education,
42, 108–126.
Eisert, D., Walker, H. M., Severson, H. H., & Block-Pedego, A. E. (1989).
Patterns of social-behavioral competence in behavior-disordered preschoolers. Early Child Development and Care, 41, 139–152.
Epstein, M., & Cullinan, D. (1998). Manual for the Scale for Assessing Emotional Disturbance (SAED). Austin, TX: Pro-Ed.
Epstein, M., Nordness, P., Nelson, R., & Hertzog, M. (2002). Convergent
validity of the Behavioral and Emotional Rating Scale with primary
grade-level students. Topics in Early Childhood Special Education,
22, 114–122.
Epstein, M., & Sharma, J. (1998). Manual for the Behavioral and Emotional Rating Scale (BERS). Austin, TX: Pro-Ed.
Feil, E., Walker, H., Severson, H., & Ball, A. (2000). Proactive screening
for emotional/behavioral concerns in Head Start preschools: Promising
practices and challenges in applied research. Behavioral Disorders,
26, 13–25.
Feil, E., Walker, H., Severson, H., Golly, A., Seeley, J., & Small, J. (2009).
Using positive behavior support procedures in Head Start classrooms
to improve school readiness: A group training and behavioral coaching
model. NHSA Dialog, 12, 88–103.
Hersh, R., & Walker, H. M. (1983). Great expectations: Making
schools effective for all children. Policy Studies Review [Special issue],
2(1), 147–188.
Hops, H., Lewin, L., Walker, H. M., & Severson, H. H. (1990). Social competence correlates of externalizing, internalizing and normal behavior
patterns among elementary aged students (Technical Report). Eugene,
OR: Oregon Research Institute.
Kamps, D., Kravits, T., Rauch, J., Kamps, J., & Chung, N. (2000). A prevention program for students with or at risk for ED: Moderating effects
of variation in treatment and classroom structure. Journal of Emotional
and Behavioral Disorders, 8, 141–154.
Kamps, D., Kravits, T., Stolze, J., & Swaggart, B. (1999). Prevention strategies for at-risk students and students with EBD in urban elementary
schools. Journal of Emotional and Behavioral Disorders, 7, 178–188.
Kettler, R., Glover, T., Albers, C., & Feeney-Kettler, K. (Eds.). (2014). Universal
screening in educational settings. Washington, DC: American Psychological Association.
Lane, K., Kalberg, J., Lambert, W., Crnobori, M., & Bruhn, A. (2010). A
comparison of systematic screening tools for emotional and behavioral
disorders: A replication. Journal of Emotional and Behavioral Disorders,
18, 100–112.
Lane, K., Menzies, H. M., Oakes, W. P., & Kalberg, J. R. (2012). Systematic screenings of behavior to support instruction: From preschool to high
school. New York, NY: Guilford.
Lane, K. L., Menzies, H. M., Oakes, W. P., Lambert, W., Cox, M., &
Hawkins, K. (2012). A validation of the Student Risk Screening Scale for
internalizing and externalizing behaviors: Patterns in rural and urban
elementary schools. Behavioral Disorders, 37, 244–270.
Lane, K., Oakes, W., & Menzies, H. (2010). Systematic screenings to prevent the development of learning and behavior problems: Considerations
for practitioners, researchers, and policy makers. Journal of Disabilities
Policy Studies, 21, 160–172.
Leff, S., & Lakin, R. (2005). Playground-based observation systems: A
review and implications for practitioners and researchers. School Psychology Review, 34(4), 475–489.
Nicholson, F. (1988). Evaluation of a three-stage, multiple-gating, standardized procedure for the screening and identification of elementary
school students at risk for behavior disorders. Unpublished doctoral
dissertation, University of Utah, Salt Lake City.
Reynolds, W. M. (1984). Depression in children and adolescents: Phenomenology, evaluation and treatment. School Psychology Review,
13(2), 171–182.
Richardson, M., Caldarella, P., Young, B., Young, E., & Young, K. (2009).
Further validation of the Systematic Screening for Behavior Disorders in
middle and junior high school. Psychology in the Schools, 46, 605–615.
Ross, A. O. (1980). Psychological disorders of childhood. New York:
McGraw-Hill.
Salvia, J., & Ysseldyke, J. (1988). Assessment in special and remedial education. Palo Alto: Houghton Mifflin.
Seeley, J., Small, J., Walker, H., Feil, E., Severson, H., Golly, A., & Forness,
S. (2009). Efficacy of the First Step to Success intervention for students
with attention-deficit/hyperactivity disorder. School Mental Health,
1, 37–48.
Severson, H. H., Walker, H. M., Barckley, M., Block-Pedego, A. E., Todis,
B., & Rankin, R. (1989). Confirmation of the accuracy of teacher rankings
on internalizing and externalizing behavior profiles: Mostly hits and a
few misses (Technical Report). Eugene, OR: Oregon Research Institute.
Severson, H. H., Walker, H. M., Hope-Doolittle, J., Kratochwill, T., &
Gresham, F. M. (2007). Proactive, early screening to detect behaviorally
at-risk students: Issues, approaches, emerging innovations, and professional practices. Journal of School Psychology, 45, 193–223.
Shinn, M. R., Ramsey, E., Walker, H. M., Stieber, S., & O’Neill, R. E. (1987).
Antisocial behavior in school settings: Initial differences in an at risk
and normal population. The Journal of Special Education, 21(2), 69–84.
Small, J. (2014). Psychometric analysis of the Systematic Screening for Behavior Disorders (SSBD) (Technical Report). Eugene, OR: Oregon Research Institute.
Sumi, W., Woodbridge, M., Javitz, H., Thornton, S., Wagner, M., Rouspil, K., Yu, J., Seeley, J., Walker, H., Golly, A., Small, J., Feil, E., & Severson, H. (2013). Journal of Emotional and Behavioral Disorders, 21(1), 66–78.
Todis, B., Severson, H. H., & Walker, H. M. (1990). The critical events scale: Behavioral profiles of students with externalizing and internalizing behavior disorders. Behavioral Disorders, 15(2), 75–86.
Tsai, S., & Cheney, D. (2012). The impact of the adult-child relationship on school adjustment for children at risk of serious behavior problems. Journal of Emotional and Behavioral Disorders, 20(2), 105–114.
Volpe, R., DiPerna, J., Hintze, J. & Shapiro, E. (2005). Observing students
in classroom settings: A review of seven coding schemes. School Psychology Review, 34(4), 454–474.
Walker, H. M. (1986). The assessments for integration into mainstream
settings (AIMS) assessment system: Rationale, instruments, procedures,
and outcomes. Journal of Clinical Child Psychology, 15(1), 55–65.
Walker, H. M., Block-Pedego, A. E., Severson, H. H., Barckley, M., &
Todis, B. J. (1989). SSBD profiles of resource room students in restrictive
and less restrictive classroom settings (Technical Report). Eugene, OR:
Oregon Research Institute.
Walker, H. M., Block-Pedego, A., Severson, H., Todis, B., Williams, G.,
Barckley, M., & Haring, N. (1991). Behavioral profiles of sociometrically
isolated students at risk for externalizing and internalizing behavioral
disorders (Technical Report). Eugene, OR: Oregon Research Institute.
Walker, H. M., Block-Pedego, A., Todis, B., & Severson, H. H. (1991). The
school archival records search (SARS). Longmont, CO: Sopris West.
Walker, H. M., Cheney, D., Stage, S., & Blum, C. (2005). School-wide
screening and positive behavior supports: Identifying and supporting
students at-risk for school failure. Journal of Positive Behavior Intervention, 7, 194–204.
Walker, H. M., & McConnell S. R. (1988). The Walker-McConnell scale
of social competence and school adjustment. Chico, CA: Duerr Evaluation
Associates.
Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior
in school. Belmont, CA: Wadsworth.
Walker, H. M., & Rankin, R. (1983). Assessing the behavioral expectations and demands of less restrictive settings. School Psychology Review,
12, 274–284.
Walker, H. M., Reavis, H. K., Rhode, G., & Jenson, W. R. (1985). A conceptual model for delivery of behavioral services to behavior disordered
children in educational settings. In P. Bornstein & A. Kazdin (Eds.),
Handbook of clinical behavioral services with children. Homewood, IL:
Richard D. Irwin.
Walker, H. M., Seeley, J., Small, J., Severson, H., Graham, B., Feil, E., Serna,
L., Golly, A., & Forness, S. (2009). A randomized controlled trial of the
First Step to Success early intervention: Demonstration of program efficacy outcomes in a diverse, urban school district. Journal of Emotional
and Behavioral Disorders, DOI: 10.1177/1063426609341645.
Walker, H. M., Severson, H., & Feil, E. (1995). The Early Screening Project
(ESP). Eugene, OR: Deschutes Research.
Walker, H. M., Severson, H., Nicholson, F., Kehle, T., Jenson, W. R., & Clark, E. (1994). Replication of the Systematic Screening for Behavior Disorders (SSBD) procedure for the identification of at-risk children. Journal of Emotional and Behavioral Disorders, 2(2), 66–77.
Walker, H. M., Severson, H. H., Stiller, B., Williams, G., Haring, N. G.,
Shinn, M. R., & Todis, B. (1988). Systematic screening of students in the
elementary age range at risk for behavior disorders: Development and
trial testing of a multiple gating model. Remedial and Special Education,
9(3), 8–14.
Walker, H. M., Severson, H. H., Todis, B., Block-Pedego, A., Williams,
G., Haring, N., & Barckley, M. (1990). Systematic screening for behavior
disorders (SSBD): Further validation, replication, and normative data.
Remedial and Special Education, 11(2), 32–46.
Walker, H. M., Stiller, B., Golly, A., Kavanagh, K., Severson, H., & Feil, E.
(1997). First Step to Success: Helping young children overcome antisocial
behavior (an early intervention program for grades K-3). Longmont, CO:
Sopris West.
APPENDIX A
Normative Comparisons: SSBD Original Norms and
Updated Supplemental Normative Databases
SSBD procedures and instruments are completely standardized and
self-contained. Normative levels on the SSBD have been established to
assist in the following tasks:
• Facilitate decision making in moving from one screening stage to another
• Serve as tools in determining eligibility in relation to generalized normative criteria
The original national standardization sample for the SSBD Stage 2 consists of 4,463 cases, and the SIMS Behavior Observation Codes have a
national normative sample of 1,219 cases. It should be noted that one
rarely encounters an observation code in the professional literature
that has been nationally normed. Details on the original SSBD normative sample were described earlier. Here we discuss new supplemental
normative databases that have been developed over the past decade through research and practice uses of the SSBD.
It is recognized that students passing both SSBD screening Stages 1
and 2 may exceed normative levels and expectations for the referring
classroom setting but may not exceed normative cutoff scores derived
from the SSBD national standardization sample. This possible outcome
highlights the importance, whenever possible, of using normative
criteria that are external to the behavioral norms and expectations of
specific school settings and local districts in screening referred students
who may have moderate to severe behavior disorders. However, in this
context, it is also important to demonstrate that the referred student’s
behavioral profile is substantially divergent from those of nonreferred,
same-sex peers within the referring classroom setting. Ideally, local
practice norms on the SSBD should be used to inform and supplement
decision making whenever they exist.
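To make the interplay between national and local normative criteria concrete, the sketch below shows one way a screening coordinator might encode both checks in software. It is a minimal illustration only, written in Python: the cut points, the z-score threshold, and the local means and standard deviations shown are hypothetical placeholders, not the published SSBD criteria, which should be taken from the SSBD Administrator's Guide and from locally developed practice norms.

```python
# Minimal sketch, assuming hypothetical cut points and thresholds; the real
# SSBD Stage 2 criteria come from the Administrator's Guide, not this code.
from dataclasses import dataclass

@dataclass
class Stage2Profile:
    cei: float  # Critical Events Index (higher = greater risk)
    abi: float  # Adaptive Behavior Index (lower = greater risk)
    mbi: float  # Maladaptive Behavior Index (higher = greater risk)

# Hypothetical national cut points (placeholders only)
NATIONAL_CUTS = {"cei": 5, "abi": 30, "mbi": 35}

def exceeds_national_criteria(p: Stage2Profile, cuts=NATIONAL_CUTS) -> bool:
    """Check the profile against (placeholder) national normative cut points."""
    return p.cei >= cuts["cei"] or (p.abi <= cuts["abi"] and p.mbi >= cuts["mbi"])

def diverges_from_local_norms(p: Stage2Profile, means: dict, sds: dict,
                              z: float = 1.5) -> bool:
    """Check whether the profile is substantially divergent from local
    (same-sex, same-classroom) peer norms; z is an illustrative threshold."""
    return (
        (p.cei - means["cei"]) / sds["cei"] >= z
        or (means["abi"] - p.abi) / sds["abi"] >= z
        or (p.mbi - means["mbi"]) / sds["mbi"] >= z
    )

# Example: a referred student's Stage 2 scores (values are made up)
student = Stage2Profile(cei=6, abi=29, mbi=37)
local_means = {"cei": 1.5, "abi": 45.0, "mbi": 18.0}
local_sds = {"cei": 1.3, "abi": 9.0, "mbi": 8.0}

if exceeds_national_criteria(student) and diverges_from_local_norms(
        student, local_means, local_sds):
    print("Candidate for Stage 3 observation / further assessment")
```

The point of the sketch is simply that both comparisons are made explicit: the student is checked against external normative criteria and against the behavioral expectations of the local setting, mirroring the recommendation above.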
Supplemental SSBD Practice and Research-Generated Norms
Since its publication, the SSBD has been used extensively by single
schools and districts. As noted, it has also been a popular research tool
and has been the focus of considerable research attention in the context
of universal, proactive screening while frequently serving as a validation
criterion or standard against which other screeners are judged (Lane,
Menzies, Oakes, & Kalberg, 2012). Relatively large databases exist for
SSBD screenings as a result of this extensive usage.
During the 2012–13 school year, the authors were able to assemble
an extensive supplemental normative database on the SSBD Stage 2
instruments consisting of 6,743 cases. These cases and resulting student profiles were generated through research conducted on the SSBD
by other investigators and through use of the SSBD in ongoing school
district practices involving regular universal screening of grade 1–6
general education student populations for behavior problems over the
past decade. They represent the following regions of the United States:
Northwest, Mountain West, Southwest, Midwest, Southeast, South, and
East. The supplemental SSBD normative data from across these regions
and cities provide an important context for evaluating the relevance of
the SSBD’s original normative sample for current decision making about
behaviorally at-risk students.
Table 25 contrasts the original SSBD normative levels on the Stage 2 instruments (Critical Events Index, Adaptive Behavior Index, and Maladaptive
Behavior Index) with normative levels drawn from a series of research
studies in which the SSBD Stage 1 and 2 instruments were administered.
Participating teachers and schools were asked to simply complete Stages
1 and 2 SSBD screening without making any decisions about qualifying
scores or cut points for individual students or selecting any of them from
the larger student pool based on their Stage 2 score profiles. The original norming of the SSBD Stage 2 instruments followed this instrument
administration and data collection procedure exactly.
As can be seen in Table 25, the normative levels for externalizing and internalizing students who were nominated from screening Stage 1, when averaged for each of the Stage 2 instruments, were quite similar to the original SSBD normative levels and to each other, even though they represented different regions of the United States and many years between screening occasions. Thus, these samples contained an undifferentiated mix of students, some of whom would and some of whom would not meet SSBD Stage 2 risk criteria.
Table 25  Similarities in SSBD Score Profiles for Original Normative and Research-Based Samples

SSBD Original Norms (N = 4,463)
       M Externalizing   M Internalizing   M Nonranked
CEI          3.40              2.03             .12
ABI         35.10             44.42           55.29
MBI         29.45             17.20           13.37
From Walker, H. M., & Severson, H. H. (1990). Systematic screening for behavior disorders (SSBD). Longmont, CO: Sopris West.

Study 1 — NW Region (N = 454)
       M Externalizing   M Internalizing   M Nonranked
CEI          1.72              1.57              —
ABI         36.38             44.50              —
MBI         29.61             18.16              —
From Walker, H. M., Severson, H., Stiller, B., Williams, G., Haring, N., Shinn, M., & Todis, B. (1988). Systematic screening of pupils in the elementary age range at risk for behavior disorders: Development and trial testing of a multiple gating model. Remedial and Special Education, 9(3), 8–14.

Study 2 (N = 856)
       M Externalizing   M Internalizing   M Nonranked
CEI          3.02              1.96             .11
ABI         36.52             44.78           55.43
MBI         30.65             18.62           13.52
From Walker, H. M., Severson, H. H., Todis, B. J., Block-Pedego, A. E., Williams, G. J., Haring, N. G., & Barckley, M. (1990). Systematic screening for behavior disorders (SSBD): Further validation, replication and normative data. Remedial and Special Education, 11(2), 32–46.

Study 3 (N = 1,468)
       M Externalizing   M Internalizing   M Nonranked
CEI          3.20              1.80             .10
ABI         36.10             44.70           55.90
MBI         28.40             15.90           13.30
From Walker, H. M., Severson, H. H., Nicholson, F., Kehle, T., Jenson, W. R., & Clark, E. (1994). Replication of the Systematic Screening for Behavior Disorders (SSBD) procedure for the identification of at-risk children. Journal of Emotional and Behavioral Disorders, 2(2), 66–77.

Utah Sample (N = 2,188), Positive Behavior Support Initiative, Brigham Young University
       M Externalizing   M Internalizing   M Nonranked
CEI          3.51              2.91              —
ABI         36.30             41.91              —
MBI         28.57             18.73              —
From Caldarella, P., Young, E. L., Richardson, M. J., Young, B. J., & Young, K. R. (2008). Validation of the Systematic Screening for Behavioral Disorders in middle and junior high school. Journal of Emotional and Behavioral Disorders, 16(2), 105–117.

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
The close correspondence among the student profiles in Table 25 across these divergent research samples, and their similarity to the original SSBD norms, speaks to the stability of externalizing, internalizing, and normative or representative behavior patterns across both years and regional sites.
Table 26 contrasts the original SSBD normative profiles with a series of samples drawn from different U.S. regions in which participants were required to meet Stage 2 risk criteria. As can be seen in this table, the SSBD Stage 2 profiles are significantly more problematic, as expected. Again, the Stage 2 measure scores are highly consistent across samples and regions.
In a large SSBD Midwest sample of students (n = 1,970), those students who met Stage 2 risk criteria based on their CEI scores had much higher scores and more problematic profiles across the Stage 2 instruments than those who did not meet these criteria. Table 27 displays the Stage 2 instrument profiles for these two groups of students.
Similarly, in a replication of these results within two randomized controlled trials implemented to investigate the efficacy as well as the effectiveness of the First Step to Success early intervention program, the SSBD was used to screen grade 1–3 students for inclusion in these studies (Sumi, Woodbridge, Javitz, Thornton, Wagner, Rouspil, & Severson, 2013; Walker, Seeley, Small, Severson, Golly, & Feil, 2009). The target student populations for these studies were required to have externalizing-type problems and disorders, so screening for internalizers was not conducted. Participating students in these two trials totaled more than 500 and represented six sites across the United States.
As can be seen in Table 28, those externalizing students who met SSBD Stage 2 exit criteria had substantially more problematic profiles on the Stage 2 measures than those who did not. Note: For the First Step trials, 96% of the included students met Stage 2 criteria based on their CEI profiles only.
Table 26  Original SSBD Norms vs. Supplemental Practice—Research Samples

SSBD Original Norms (U.S. Region; N = 4,463)
                        Externalizers M   Internalizers M
Critical Events Index         3.4               2.03
Adaptive                     35.91             44.42
Maladaptive                  29.45             17.20

Lane et al. Samples (Southeast Region; N = 1,195)
                        Externalizers M   Internalizers M
Sample 1 (n = 180)
Critical Events Index         6.96              4.96
Adaptive                     28.74             35.94
Maladaptive                  38.03             23.86
Sample 2 (n = 254)
Critical Events Index         8.17              5.57
Adaptive                     28.10             34.71
Maladaptive                  38.03             24.28
Sample 3 (n = 761)
Critical Events Index         6.16              4.37
Adaptive                     29.18             36.34
Maladaptive                  36.76             22.44

Stage-Cheney Sample (Northwest Region; N = 432)
                        Externalizers M   Internalizers M
Critical Events Index         6.41              4.56
Adaptive                     28.36             34.24
Maladaptive                  37.86             26.35

Naquin Sample (Southern Region; N = 800+)
                        Externalizers M   Internalizers M
Critical Events Index         6.43              4.97
Adaptive                     27.45             33.97
Maladaptive                  37.56             23.76

Eber-Rose et al. Sample (Midwestern Region; N = 1,970)
                        Externalizers M   Internalizers M
Critical Events Index         6.01              4.04
Adaptive                     30.87             37.76
Maladaptive                  36.60             22.83

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
Table 27  Profiles for Externalizing and Internalizing Students Meeting vs. Not Meeting SSBD Stage 2 Risk Criteria

Met Stage 2 Risk Criteria, Externalizer • Descriptive Statistics
                              N     Minimum   Maximum     Sum     Mean     SD
Critical Events Index        336       0        15        2020     6.01    2.17
Adaptive Behavior Score      200      12        48        6174    30.87    6.76
Maladaptive Behavior Score   199       0        52        7283    36.60    7.23
Valid N (listwise)           199

Met Stage 2 Risk Criteria, Internalizer • Descriptive Statistics
                              N     Minimum   Maximum     Sum     Mean     SD
Critical Events Index        368       1        13        1487     4.04    1.84
Adaptive Behavior Score      259      16        60        9781    37.76    8.89
Maladaptive Behavior Score   256      11        56        5845    22.83    7.92
Valid N (listwise)           256

Did Not Meet Stage 2 Risk Criteria, Externalizer • Descriptive Statistics
                              N     Minimum   Maximum     Sum     Mean     SD
Critical Events Index        813       0         4        1332     1.64    1.28
Adaptive Behavior Score      813       0        60       32264    39.69    8.36
Maladaptive Behavior Score   812       0        51       19495    24.01    7.61
Valid N (listwise)           812

Did Not Meet Stage 2 Risk Criteria, Internalizer • Descriptive Statistics
                              N     Minimum   Maximum     Sum     Mean     SD
Critical Events Index        453       1         3         655     1.45    0.61
Adaptive Behavior Score      453       0        60       20924    46.19   10.03
Maladaptive Behavior Score   453       0        55        6285    13.87    4.60
Valid N (listwise)           453
Table 28  Descriptive Statistics for SSBD Stage 2 Measures

Albuquerque Efficacy Trial
           Total Sample (N = 723)     Met Stage 2 Criteria (n = 371)   Didn't Meet Stage 2 Criteria (n = 352)
Measure    M (SD)       Min   Max     M (SD)       Min   Max           M (SD)       Min   Max
CEI         4.9 (3.6)    0    17       7.7 (2.7)    1    17             2.1 (2.7)    0     4
ABI        34.5 (7.7)   13    60      30.5 (5.6)   13    48            38.7 (7.3)   18    60
MBI        31.8 (8.5)   11    51      36.7 (6.5)   16    51            26.6 (7.3)   11    46

National Effectiveness Trial
           Total Sample (N = 1,084)   Met Stage 2 Criteria (n = 660)   Didn't Meet Stage 2 Criteria (n = 424)
Measure    M (SD)       Min   Max     M (SD)       Min   Max           M (SD)       Min   Max
CEI         6.0 (3.3)    2    20       7.9 (2.9)    4    20             3.0 (0.8)    2     4
ABI        36.7 (8.1)   13    60      33.6 (6.8)   13    57            41.6 (7.5)   25    60
MBI        28.7 (8.7)   11    50      32.4 (7.7)   11    50            22.9 (6.8)   11    47

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
Similar findings were demonstrated in both trials. As Table 28 indicates, profiles of students meeting Stage 2 criteria were more problematic than profiles of students who did not meet risk criteria.
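Readers who maintain their own SSBD screening archives may want to reproduce summaries of this kind for local data. The short sketch below shows one way to compute the same descriptive statistics (N, minimum, maximum, sum, mean, and SD) by risk-status group using pandas; the column names and example values are hypothetical, since the case-level data behind Tables 27 and 28 are not distributed with this manual.

```python
# Minimal sketch, assuming a case-level data frame with one row per screened
# student; column names ("group", "cei", "abi", "mbi") are illustrative.
import pandas as pd

def stage2_descriptives(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize Stage 2 scores (N, min, max, sum, mean, SD) by screening group."""
    return df.groupby("group")[["cei", "abi", "mbi"]].agg(
        ["count", "min", "max", "sum", "mean", "std"]
    )

# Example with made-up values for three screened students
df = pd.DataFrame({
    "group": ["met_stage2", "met_stage2", "did_not_meet"],
    "cei": [7, 5, 1],
    "abi": [28, 33, 41],
    "mbi": [39, 35, 22],
})
print(stage2_descriptives(df))
```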
Table 29 presents average score levels for students exceeding Stage 2 risk criteria at three additional supplemental normative sites developed in research by Kathleen Lane of the University of Kansas and her colleagues, representing the Southeast United States, wherein students had to exceed Stage 2 cutoff points to be included in a research study or to be considered for further screening (Lane, Kalberg, Lambert, Crnobori, & Bruhn, 2010; Lane, Menzies, Oakes, Lambert, Cox, & Hawkins, 2012). Score levels on the Stage 2 instruments were substantially higher than average in these samples, where risk criteria on the Stage 2 measures had to be met. As can be seen in this table, the average score levels for students in these three samples were significantly higher, and their SSBD profiles more problematic, than for students who typically did not meet risk criteria at Stage 2.
Table 29  Lane et al. Supplemental Norms from Research Conducted in the U.S. Southeast

Sample 1 (N = 761)
        Controls              Externalizers          Internalizers
        n     M      SD       n     M      SD        n     M      SD
CEI    614   1.26   1.29     77    6.16   2.44      70    4.37   2.14
ABI    614  46.24   9.48     77   29.18   5.71      70   36.34   7.31
MBI    614  18.46   7.94     77   36.76   6.64      70   22.44   5.83
From Lane, K. L., Kalberg, J. R., Lambert, E. W., Crnobori, M., & Bruhn, A. L. (2010). A comparison of systematic screening tools for emotional and behavioral disorders. Journal of Emotional and Behavioral Disorders, 18, 100–112.

Sample 2 (N = 180)
        Controls              Externalizers          Internalizers
        n     M      SD       n     M      SD        n     M      SD
CEI     70   1.57   1.11     59    6.96   2.91      51    4.96   1.96
ABI     70  44.42   9.36     59   28.7    6.61      51   35.94   8.16
MBI     70  20.32   9.31     59   38.03   6.90      51   23.86   6.94
From Lane, K. L., Menzies, H. M., Oakes, W. P., Lambert, W., Cox, M., & Hawkins, K. (2012). A validation of the Student Risk Screening Scale for internalizing and externalizing behaviors: Patterns in rural and urban elementary schools. Behavioral Disorders, 37, 244–270.

Sample 3 (N = 253)
        Controls              Externalizers          Internalizers
        n     M      SD       n     M      SD        n     M      SD
CEI    181   1.48   1.31     51    8.17   4.85      21    5.71   3.97
ABI    181  43.15   9.85     50   28.10   6.94      21   34.71   8.74
MBI    181  18.76   8.39     50   37.32   6.82      21   24.28   5.71
From Lane, K. L., Oakes, W. P., Harris, P. J., Menzies, H. M., Cox, M., & Lambert, W. (2012). Initial evidence for the reliability and validity of the Student Risk Screening Scale for internalizing and externalizing behavior at the elementary level. Behavioral Disorders, 37, 99–122.

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
Similar SSBD databases, drawn from the South (N = 850) and Northwest (N = 429) regions and developed under these same conditions, closely matched these score levels. For example, the CEI average scores for externalizing students in these two regional samples were 6.43 and 6.41, respectively. For the Adaptive Behavior Index, the corresponding externalizer averages were quite low, at 27.45 and 28.36, and the Maladaptive Behavior Index averages were 37.56 and 37.86. Similar results were obtained for internalizing students across these two sites. The CEI average scores for internalizers were elevated at 4.97 and 4.56, respectively; their Adaptive Behavior Index averages were 33.97 and 34.24; and their Maladaptive Behavior Index scores were 23.76 and 26.35 for the South and Northwest sites, respectively.
Table 30  Meeting/Not Meeting SSBD Stage 2 Risk Criteria by Ethnicity and Externalizing vs. Internalizing Status

PART A — STUDENTS MEETING STAGE 2 RISK CRITERIA
        White            African-American    Hispanic/Latino     Asian              Multi-Ethnic
        Ext M    Int M   Ext M    Int M      Ext M    Int M      Ext M    Int M     Ext M    Int M
CEI      6.63     4.13    5.72     4.25       6.01     3.83       6.33     3.82      6.40     4.00
ABI     30.50    38.31   31.08    35.60      30.91    38.29      31.00    37.67     33.78    39.33
MBI     36.12    23.78   36.90    24.44      35.68    20.70      38.00    21.67     36.56    21.75

PART B — STUDENTS NOT MEETING STAGE 2 RISK CRITERIA
        White            African-American    Hispanic/Latino     Asian              Multi-Ethnic
        Ext M    Int M   Ext M    Int M      Ext M    Int M      Ext M    Int M     Ext M    Int M
CEI      1.54     1.45    1.71     1.49       1.71     1.46       1.10     1.28      1.67     1.55
ABI     41.19    47.36   38.41    46.05      38.72    44.57      42.90    47.64     38.62    44.91
MBI     23.49    14.74   26.30    14.37      23.30    13.29      20.70    13.84     24.21    14.27

Ext M = mean for externalizing students; Int M = mean for internalizing students.
CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
We have also examined a portion of this supplemental normative database for the influence of ethnicity on student behavior disorder status (i.e., externalizing, internalizing). We analyzed the Midwest database of 1,970 cases for this purpose. Table 30 (Parts A and B) provides average Stage 2 instrument scores for five ethnic groups: White, African American, Hispanic/Latino, Asian, and Multi-Ethnic. Examination of this table
shows remarkable similarities in Stage 2 score profiles across these ethnic
groups. Those students who met Stage 2 risk criteria show very similar
patterns in their externalizing and internalizing profiles regardless of their
ethnicity, and they also reflect known differences between the severity of
externalizing and internalizing profiles on the CEI, ABI, and MBI.
Thus, students who are screened via Stages 1 and 2 of the SSBD and meet risk criteria for both look very much alike in the content of their behavioral profiles as well as in their severity levels. These results are
encouraging and indicate that separate norms do not need to be created
to accommodate ethnicity in SSBD screening and decision making.
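For districts that track ethnicity in their screening records, the same kind of breakdown shown in Table 30 can be produced locally. The sketch below uses a pandas pivot table to compute mean Stage 2 scores by ethnicity and externalizing/internalizing status; the column names and example rows are hypothetical and stand in for a local screening archive.

```python
# Minimal sketch, assuming case-level records with illustrative column names.
import pandas as pd

def stage2_means_by_ethnicity(df: pd.DataFrame) -> pd.DataFrame:
    """Mean CEI/ABI/MBI for each ethnicity-by-status cell, as in Table 30."""
    return df.pivot_table(
        values=["cei", "abi", "mbi"],
        index="ethnicity",
        columns="status",   # e.g., "externalizing" vs. "internalizing"
        aggfunc="mean",
    )

# Example with made-up records
df = pd.DataFrame({
    "ethnicity": ["White", "White", "Hispanic/Latino", "Hispanic/Latino"],
    "status": ["externalizing", "internalizing", "externalizing", "internalizing"],
    "cei": [7, 4, 6, 4],
    "abi": [30, 38, 31, 38],
    "mbi": [36, 24, 36, 21],
})
print(stage2_means_by_ethnicity(df))
```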
APPENDIX B
SSBD Bibliography and Resource List
A bibliography of information resources regarding the SSBD is provided below for those seeking greater detail about the screening system and its use by others. This material describes SSBD applications along with empirical outcomes and provides commentaries on and perceptions of the system by both users and researchers. The bibliography is organized into the following categories: books, chapters, journal articles, and websites. The vast majority of this information has been contributed by other professionals. The websites listed are those that have emerged through Google searches for SSBD-related PowerPoints. Rather than reproduce those PowerPoints here, we refer you to the websites in which information about the SSBD is contained.
Books
Crone, D. A., Horner, R. H., & Hawken, L. S. (2004). Responding to
problem behavior in schools: The Behavior Education Program. New
York: Guilford.
Kauffman, J. M. (2001). Characteristics of emotional and behavioral
disorders of children and youth (7th ed.). Columbus, OH: Merrill.
Kauffman, J. M., & Brigham, F. J. (2009). Working with troubled children.
Verona, WI: Full Court Press.
Kettler, R., Glover, T., Albers, C., & Feeney-Kettler, K. (Eds.). (2014). Universal screening in educational settings: Identification, implementation, and interpretation. Washington, DC: Division 16 Practitioners' Series of the American Psychological Association.
Lane, K. L., & Beebe-Frankenberger, M. E. (2004). School-based interventions: The tools you need to succeed. Boston: Allyn & Bacon.
Lane, K. L., Kalberg, J. R., & Menzies, H. M. (2009). Developing schoolwide programs to prevent and manage problem behaviors: A step-by-step
approach. New York: Guilford.
Lane, K. L., Menzies, H. M., Bruhn, A. L., & Crnobori, M. (2011). Managing challenging behaviors in schools: Research-based strategies that work.
New York: Guilford Press.
Lane, K. L., Oakes, W. P., & Cox, M. (2011). Functional assessment-based
interventions: A university-district partnership to promote learning and
success. Manuscript submitted for publication.
Lane, K. L., Menzies, H. M., Oakes, W. P., & Kalberg, J. R. (2012). Systematic Screenings of Behavior to Support Instruction: From Preschool to
High School. New York: Guilford.
Lane, K. L., Robertson, E. J., & Wehby, J. H. (2002). Primary Intervention
Rating Scale. Unpublished rating scale.
Nelson, J. R., Cooper, P., & Gonzalez, J. (2004). Stepping Stones to Literacy. Frederick, CO: Cambium Learning Group.
Pashler, H., Bain, P. M., Bottge, B. A., Graesser, A., Koedinger, K.,
McDaniel, M., et al. (2007). Organizing instruction and study to improve
student learning: A practice guide (NCER 2007–2004). Washington, DC:
National Center for Education Research, Institute of Education Sciences,
U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/practiceguides/20072004.pdf
Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior
in school: Evidence-based practices (2nd ed.). Belmont, CA: Wadsworth.
Walker, H. M., Stiller, B., Golly, A., Kavanagh, K., Severson, H. H., &
Feil, E. (1997). First Step to Success: Helping young children overcome
antisocial behavior. Longmont, CO: Sopris West.
Chapters
Morris, R. J., Shah, K., & Morris, Y. P. (2002). Internalizing behavior
disorders. In K. L. Lane, F. M. Gresham, & T. E. O’Shaughnessy (Eds.),
Interventions for children with or at risk for emotional and behavioral
disorders (pp. 223–241). Boston: Allyn & Bacon.
Severson, H., & Walker, H. (2002). Proactive approaches for identifying
children at risk for sociobehavior problems. In K. Lane, F. M. Gresham,
& T. E. O’Shaughnessy (Eds.), Interventions for children with or at risk for
emotional and behavioral disorders (pp. 33–54). Boston: Allyn & Bacon.
Walker, H. M., Severson, H. H., Seeley, J. R., & Feil, E. G. (in press).
Multiple gating approaches to the universal screening of students with
school related behavior disorders. In R. Kettler, R. Glover, C. Albers, & K.
Feeney-Kettler (Eds.), Universal screening in educational settings: Identification, implementation, and interpretation. Washington DC: Division
16 Practitioners’ Series of the American Psychological Association.
Journal Articles
Caldarella, P., Young, E. L., Richardson, M. J., Young, B. J., & Young, K. R.
(2008). Validation of the Systematic Screening for Behavioral Disorders
in middle and junior high school. Journal of Emotional and Behavioral
Disorders, 16(2), 105–117. DOI: 10.1177/106342660731312
Cheney, D., Flower, A., & Templeton, T. (2008). Applying response to
intervention metrics in the social domain for students at risk of developing emotional or behavioral disorders. Journal of Special Education,
42, 108–126.
Epstein, M.H., Nordness, P. D., Cullinan, D., & Hertzog, M. (2002). Scale
for Assessing Emotional Disturbance: Long-term test-retest reliability
and convergent validity with kindergarten and first-grade students.
Remedial and Special Education, 23, 141–148.
Epstein, M. H., Nordness, P. D., Nelson, J. R., & Hertzog, M. (2002).
Convergent validity of the Behavioral and Emotional Rating Scale with
primary grade-level students. Topics in Early Childhood Special Education, 22, 114–121.
Gresham, F. M., Lane, K. L., & Lambros, K. M. (2000). Comorbidity of
conduct problems and ADHD: Identification of “fledgling psychopaths.”
Journal of Emotional and Behavioral Disorders, 8, 83–93.
Hasselbring, T. S., & Goin, L. I. (1999). Read 180 [Computer software]. New York: Scholastic.
Hastings, R. P. (2003). Brief report: Behavioral adjustment of siblings of children with autism. Journal of Autism and Developmental Disorders, 33, 99–105.
Kalberg, J. R., Lane, K. L., & Menzies, H. M. (2010). Using systematic
screening procedures to identify students who are nonresponsive to primary prevention efforts: Integrating academic and behavioral measures.
Education and Treatment of Children, 33, 561–584.
Kamps, D., Kravits, T., Rauch, J., Kamps, J. L., & Chung, N. (2000). A prevention program for students with or at risk for ED: Moderating effects
of variation in treatment and classroom structure. Journal of Emotional
and Behavioral Disorder, 8, 141–154.
Kamps, D., Kravits, T., Stolze, J., & Swaggart, B. (1999). Prevention strategies for at-risk students and students with EBD in urban elementary
schools. Journal of Emotional and Behavioral Disorders, 7, 178–188.
Kazdin, A. E. (1977). Assessing the clinical or applied importance of
behavior change through social validation. Behavior Modification,
1, 427–452.
Lane, K. L. (1999). Young students at risk for antisocial behavior: The
utility of academic and social skills intervention. Journal of Emotional
and Behavioral Disorders, 7, 211–223.
Lane, K. L. (2003). Identifying young students at risk for antisocial behavior: The utility of “teachers as tests.” Behavioral Disorders, 28, 360–389.
Lane, K. L. (2007). Identifying and supporting students at risk for emotional and behavioral disorders with multi-level models: Data-driven
approaches to conducting secondary interventions with academic
emphasis. Education and Treatment of Children, 30, 135–164.
Lane, K. L., Eisner, S. L., Kretzer, J. M., Bruhn, A. L., Crnobori, M. E.,
Funke, L. M., et al. (2009). Outcomes of functional assessment-based
interventions for students with and at risk for emotional and behavioral
disorders in a job-share setting. Education and Treatment of Children,
32, 573–604.
Lane, K. L., Harris, K., Graham, S., Weisenbach, J., Brindle, M., & Morphy,
P. (2008). The effects of self-regulated strategy development on the writing performance of second grade students with behavioral and writing
difficulties. Journal of Special Education, 41, 234–253.
Lane, K. L., Kalberg, J. R., Bruhn, A. L., Mahoney, M. E., & Driscoll, S. A.
(2008). Primary prevention programs at the elementary level: Issues of
treatment integrity, systematic screening, and reinforcement. Education
and Treatment of Children, 31, 465–494.
Lane, K. L., Kalberg, J. R., Lambert, W., Crnobori, M., & Bruhn, A. (2010).
A comparison of systematic screening tools for emotional and behavioral
disorders: A replication. Journal of Emotional and Behavioral Disorders,
18, 100–112.
Lane, K. L., Kalberg, J. R., Menzies, H., Bruhn, A., Eisner, S., & Crnobori,
M. (2011). Using systematic screening data to assess risk and identify
students for targeted supports: Illustrations across the K-12 continuum.
Remedial and Special Education, 32, 39–54.
Lane, K. L., Kalberg J. R., Parks, R. J., & Carter, E. W. (2008). Student
Risk Screening Scale: Initial evidence for score reliability and validity
at the high school level. Journal of Emotional and Behavioral Disorders,
16, 178–190.
Lane, K. L., Little, M. A., Casey, A. M., Lambert, W., Wehby, J. H.,
Weisenbach, J. L., et al. (2009). A comparison of systematic screening
tools for emotional and behavioral disorders: How do they compare?
Journal of Emotional and Behavioral Disorders, 17, 93–105.
Lane, K. L., Little, M. A., Redding-Rhodes, J. R., Phillips, A., & Welsh, M.
T. (2007). Outcomes of a teacher-led reading intervention for elementary
students at-risk for behavioral disorders. Exceptional Children, 74, 47–70.
Lane, K. L., Mahdavi, J. N., & Borthwick-Duffy, S. A. (2003). Teacher
perceptions of the prereferral intervention process: A call for assistance
with school-based interventions. Preventing School Failure, 47, 148–155.
Lane, K. L., Oakes, W. P., Ennis, R. P., Cox, M. L., Schatschneider, C., &
Lambert, W. (in press). Additional evidence for the reliability and validity
of the Student Risk Screening Scale at the high school level: A replication
and extension. Journal of Emotional and Behavioral Disorders.
Lane, K. L., Oakes, W. P., & Menzies, H. M. (2010). Systematic screenings
to prevent the development of learning and behavior problems: Considerations for practitioners, researchers, and policy makers. Journal of
Disabilities Policy Studies, 21, 160–172.
Lane, K. L., Parks, R. J., Kalberg, J. R., & Carter E. W. (2007). Systematic
screening at the middle school level: score reliability and validity of the
Student Risk Screening Scale. Journal of Emotional and Behavioral Disorders, 15, 209–222.
Todis, B., Severson, H. H., & Walker, H. M. (1990). The Critical Events
Scale: Behavioral profiles of students with externalizing and internalizing behavior disorders. Behavioral Disorders, 15, 75–86.
Walker, B., Cheney, D., Stage, S., & Blum, C. (2005). School wide screening and positive behavior supports: Identifying and supporting students
at risk for school failure. Journal of Positive Behavior Interventions,
7, 194–204.
Walker, H. M., Golly, A., McLane, J. Z., & Kimmich, M. (2005). The
Oregon First Step to Success replication initiative: State-wide results of
an evaluation of the program's impact. Journal of Emotional and Behavioral Disorders, 13(3), 163–172.
Walker, H. M., Kavanagh, K., Stiller, B., Golly, A., Severson, H. H., & Feil,
E. G. (1998). First Step to Success: An early intervention approach for
preventing school antisocial behavior. Journal of Emotional and Behavioral Disorders, 6, 66–80.
Walker, H. M., Severson, H., Nicholson, F., Kehle, T., Jenson, W. R., &
Clark, E. (1994). Replication of the Systematic Screening for Behavior
Disorders (SSBD) procedure for the identification of at-risk children.
Journal of Emotional and Behavioral Disorders, 2, 66–77.
Walker, H. M., Severson, H., Stiller, B., Williams, G., Haring, N., Shinn,
M., et al. (1988). Systematic screening of pupils in the elementary age
range at risk for behavior disorders: Development and trial testing of a
multiple gating model. Remedial and Special Education, 9, 8–20.
Walker, H. M., Severson, H., Todis, B. J., Block-Pedego, A. E., Williams,
G. J., Haring, N. G., et al. (1990). Systematic Screening for Behavior Disorders (SSBD): Further validation, replication, and normative data. RASE:
Remedial and Special Education, 11, 32–46.
Webster-Stratton, C. (2000). The Incredible Years training series. Washington, DC: Office of Juvenile Justice and Delinquency Prevention,
Juvenile Justice Bulletin.
Wolf, M. M. (1978). Social validity: The case for subjective measurement
or how applied behavior analysis is finding its heart. Journal of Applied
Behavior Analysis, 11, 203–214.
Websites
Association for Positive Behavior Support
http://www.apbs.org
Delaware PBS Project
http://www.pbisnetwork.org/wp-content/uploads/2011/04/ScreeningNWPBISMay2011.ppt
The Illinois PBIS Network
http://www.pbisillinois.org/home
http://www.pbisillinois.org/curriculum/universalscreening/scoring-tools
Maine Department of Education—Screening & Progress
Monitoring
http://www.maine.gov/doe/rti/screening.html
http://www.mepbis.org/docs/wcc-03-02-10-systematic-screening-ofbehavior.pdf
New Hampshire Center for Effective Behavioral Interventions and
Supports NH CEBIS (PBIS)
http://www.nhcebis.seresc.net
http://www.nhcebis.seresc.net/universal_ssbd
Northwest PBIS Network
http://www.pbisnetwork.org/
Positive Behavior Support Initiative, Brigham Young University
http://education.byu.edu/pbsi
Southeastern Regional Education Service Center (SERESC)
http://www.seresc.net