University of Massachusetts Amherst
ScholarWorks@UMass Amherst
Masters Theses 1911 - February 2014
Dissertations and Theses
1982
A construct validity study of the sentence
verification technique as a method of measuring
reading comprehension.
Douglas J. Lynch
University of Massachusetts Amherst
Lynch, Douglas J., "A construct validity study of the sentence verification technique as a method of measuring reading
comprehension." (1982). Masters Theses 1911 - February 2014. 1745.
http://scholarworks.umass.edu/theses/1745
This thesis is brought to you for free and open access by the Dissertations and Theses at ScholarWorks@UMass Amherst. It has been accepted for
inclusion in Masters Theses 1911 - February 2014 by an authorized administrator of ScholarWorks@UMass Amherst.
A Construct Validity Study
of the Sentence Verification Technique
as a Method of Measuring Reading Comprehension
A Thesis Presented
By
DOUGLAS JAY LYNCH
Submitted to the Graduate School of the
University of Massachusetts in partial fulfillment
of the requirements for the degree of
MASTER OF SCIENCE
September 1982
Psychology
James M. Royer, Chairperson of Committee
Nancy A. Myers, Member
Ronald K. Hambleton, Member
Bonnie Strickland, Department Head
Psychology
TABLE OF CONTENTS
INTRODUCTION
Chapter
I. NORM REFERENCED READING COMPREHENSION TEST QUESTIONS
    Characteristics of norm referenced reading comprehension tests
    Do reading comprehension tests measure reading comprehension?
    Test performance is not highly associated with reading the passage
    Reading comprehension tests and intelligence tests
    Reading test performance varies with type of test question
II. AN EXPERIMENT INVESTIGATING THE CONSTRUCT VALIDITY OF THE SENTENCE VERIFICATION TECHNIQUE
    Method
    Results and discussion
    Final discussion
    Concluding remarks
TABLES and FIGURES
BIBLIOGRAPHY
APPENDIX
LIST OF TABLES
1. Mean Reading Comprehension Test Performance either with Passages or Without Passages
2. Structural Variables used by Drum et al. (1981) to Predict Test Performance
3. The Prediction of Reading or Aptitude Test Performance from Earlier Tests
4. Test Performance as a Function of Type of Test Question and Deleted Text
5. Wide Range Reading Test and Metropolitan Reading Test Scores from the Direct Instruction Project
6. Sample of Test Sentences used by Sachs (1974)
7. Sample of Test Sentences used in the SVT
8. Mean Proportion Correct SVT Scores by Content Type and Expertise of Subjects
9. Pairs of Tests Administered throughout the Semester
10. Mean Proportion Correct Scores by Passage and Time of Test Session
11. Mean Proportion Correct SVT Scores by Question Type and Time of Administration
12. Mean Proportion Correct SVT Scores by Question Type, Content of Passage, and Group
13. Mean Proportion Correct SVT Scores by Question Type, Content of Passage, and Passage Pair
14. Analysis of Variance Table of Proportion Correct SVT Performance
15. Mean Combined SVT and Confidence Rating by Content Type and Time of Administration
16. Mean Combined SVT and Confidence Rating by Question Type and Time of Administration
17. Mean Combined SVT and Confidence Rating by Question Type, Content of Passage, and Group
18. Mean Combined SVT and Confidence Rating by Question Type, Content of Passage, and Passage Pair
19. Analysis of Variance Table of Combined SVT and Confidence Rating Variable
20. Mean Confidence Ratings per SVT Test Sentence for Correct and Incorrect SVT Responses by Content Type and Time of Test Session
21. Mean d' Scores by Content Type and Time of Test Session
22. Analysis of Variance Table of d' Variable
23. Sample of Idea Units from one Passage
24. Mean Recall Scores by Content and Time of Test Session
25. Analysis of Variance Table of Recall Variable
26. Correlations between SVT and Recall Performance by Time of Test Session and Content Type for d' and Proportion Correct Scores
27. Conditional Probabilities of SVT Performance given Recall Performance by Content Type
LIST OF FIGURES
1. Mean SVT Proportion Correct, Combined SVT and Confidence Rating, and d' Scores by Content Type and Time of Test Session
INTRODUCTION
Overview of the Thesis
The primary purpose of this thesis is to investigate the construct validity of the sentence verification technique as a method of measuring reading comprehension. The thesis is organized into two chapters. The first chapter presents research suggesting that the two most common methods of assessing reading comprehension on norm referenced reading comprehension tests are inadequate. The two methods critiqued in chapter one are the multiple choice procedure and the cloze procedure. The second chapter presents an experiment investigating the construct validity of the sentence verification technique.
The first chapter reports research which suggests that cloze tests and multiple choice test questions are inadequate methods of measuring reading comprehension on norm referenced reading comprehension tests. There are three sections within the first chapter. Each section presents research investigating either the cloze technique or the multiple choice technique of assessing reading comprehension. The first section presents evidence that test performance on several major norm referenced reading comprehension tests using multiple choice questions is not dependent on reading the passages from which the questions were derived. The second section presents several studies demonstrating that norm referenced reading comprehension test performance is highly associated with intelligence test performance. This relationship suggests the possibility that the reading comprehension tests may be measuring intelligence rather than reading comprehension. The third section of the first chapter reports evidence that reading comprehension test performance varies with the type of test question.
The second chapter of this thesis investigates the construct validity of the sentence verification technique as a method of measuring reading comprehension. The chapter presents an example of previous research which used a sentence verification task to assess memory of text sentences. Two experiments by other researchers which used the sentence verification technique to assess reading comprehension are described in detail. The primary purpose of the chapter is the description of an experiment which extends these two previous studies, while supporting the argument that the sentence verification technique is a valid method of measuring reading comprehension.
In its most general form, the experiment assesses the reading comprehension performance of college students as measured by the sentence verification technique. The students were tested early and late in the semester. All of the students were members of the same psychology laboratory course. The experiment investigates whether there was improvement in reading comprehension performance for those passages which are related to the knowledge the students presumably learned in the psychology course.
CHAPTER I
NORM REFERENCED READING COMPREHENSION TEST QUESTIONS
Characteristics of Norm Referenced Reading Comprehension Tests.
Norm referenced tests are used by many school systems to assess reading comprehension. These tests use different types of questions. Two of the most common types of test questions on norm referenced reading comprehension tests are cloze and multiple choice test questions (cf. California Achievement Test, Reading; Metropolitan Achievement Test; Iowa Test of Basic Skills, Reading; Wide Range Reading Test).
Cloze tests are constructed by deleting every nth word from a text. Usually the deleted words are every fifth random or function word. The subject's task is to either supply or select the word which has been deleted. The test scores are usually reported as the absolute number of correct responses, or the percentage of correct responses out of the total possible responses.
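The cloze construction and scoring just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_cloze` and `score_cloze`, the blank marker, and the exact-match scoring rule are hypothetical choices made here, not procedures taken from any published test.

```python
import string


def make_cloze(text, n=5, blank="_____"):
    """Delete every nth word of `text` and return (cloze_text, answer_key).

    Assumption: words are whitespace-separated, and punctuation attached
    to a deleted word is kept in the cloze text.
    """
    cloze_words = []
    answers = []
    for position, word in enumerate(text.split(), start=1):
        core = word.strip(string.punctuation)
        if position % n == 0 and core:
            answers.append(core)
            # Replace only the word itself, leaving any punctuation in place.
            cloze_words.append(word.replace(core, blank, 1))
        else:
            cloze_words.append(word)
    return " ".join(cloze_words), answers


def score_cloze(responses, answers):
    """Return (number correct, percentage correct) using exact matching.

    Assumption: a response counts only if it matches the deleted word,
    ignoring case and surrounding spaces.
    """
    correct = sum(
        r.strip().lower() == a.lower() for r, a in zip(responses, answers)
    )
    return correct, 100.0 * correct / len(answers)


passage = "The boy and his father went camping near the lake last summer."
cloze_text, key = make_cloze(passage, n=5)
print(cloze_text)
print(score_cloze(["father", "pond"], key))
```

With n=5 the fifth and tenth words of the sample sentence ("father" and "lake") are deleted, and the two reporting conventions mentioned above correspond to the two values returned by `score_cloze`.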
The second type of test question which is often used in norm referenced reading comprehension tests is the multiple choice question. The multiple choice question can be divided into three components. The first part is the text or passage upon which the questions are based. The passage may be several sentences of text. The second part is the stem of the question. The stem "poses the question", specifying the information required to answer the question. The third component is the four or five answers. Usually one of these is keyed as the correct answer, and the other answers (commonly called distractors) are incorrect responses.
There are different types of multiple choice questions used in reading comprehension tests. The distinguishing feature of the different multiple choice types is the relationship between the passage and the question. Johnston (Note 1) described three types of passage-question relationships from a typology developed by Pearson and Johnson (1978). The three types are textually explicit, textually implicit, and scriptally implicit. Textually explicit questions have both the information in the question stem and the correct answer stated in a single sentence in the passage. Textually implicit questions have the information in the question stem and the correct answer in different sentences of the passage. Scriptally implicit questions have some of the information required to answer the question in the passage, but the subject must supply additional information from world knowledge or "script" of the topic.
These three types of multiple choice questions may be used in testing situations in which the passage is either available or not available for rereading as the subject selects an answer. Therefore, with the three types of test questions crossed with the two types of text availability conditions, there are six possible types of multiple choice questions used to measure reading comprehension.
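The crossing of three question types with two availability conditions can be enumerated directly. The following is a minimal sketch; the label strings and the function name `question_conditions` are illustrative assumptions, not terms taken from the tests themselves.

```python
from itertools import product

# Labels follow the typology described above; the wording of the
# availability labels is an assumption made for this example.
QUESTION_TYPES = ("textually explicit", "textually implicit", "scriptally implicit")
TEXT_AVAILABILITY = ("passage available", "passage not available")


def question_conditions():
    """Return every (question type, availability) pairing."""
    return list(product(QUESTION_TYPES, TEXT_AVAILABILITY))


for question_type, availability in question_conditions():
    print(f"{question_type} / {availability}")
```

The list has 3 x 2 = 6 entries, one for each possible type of multiple choice question.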
Do reading comprehension tests measure reading comprehension?
This section of the chapter will review evidence suggesting that many reading comprehension tests may not be measuring reading comprehension. The first area of research demonstrates that examinees can perform at above chance levels on standardized tests using a multiple choice format without reading the passages. The second area of research presents evidence that performance on norm referenced reading comprehension tests is highly associated with performance on intelligence tests. The last section of this chapter reports research that indicates that reading comprehension performance varies as a function of the test question used to assess reading comprehension.
Test performance is not highly associated with reading the passage.
A major study by Tuiman (1973-1974) demonstrated that examinees who responded to multiple choice questions contained in several standardized reading comprehension tests could receive above chance scores even if they didn't read the passage. He gathered test scores from an unusually large sample (n=600) of 4th, 5th, and 6th grade students on six major norm referenced reading comprehension tests. Tuiman had the students answer the questions either after reading or without reading the passages. Table 1 reports the mean scores for the six reading tests when subjects answered questions without the passage, compared to mean scores of the tests when subjects answered the questions after reading the passage. The table also shows the mean test scores which would be expected if students answered on the basis of chance alone.
It is clear from Tuiman's data that the students can receive test scores considerably above chance without reading the passages. Royer and Cunningham (1981) report a less formal study which indicates that adults can also perform above chance when they answer multiple choice reading comprehension questions without reading the passages. Seventeen secretaries and staff members at the Center for the Study of Reading, University of Illinois, answered 45 questions from a fifth grade level reading comprehension test without reading the passages. For 36 out of the 45 test items, the scores were significantly above chance.
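To make "above chance" concrete, the expected chance score and the probability of an observed result under pure guessing can be computed from the binomial distribution. The sketch below is an illustration only: the number of answer options (four) and the count of correct responses (10 of 17) are hypothetical values chosen for the example, and the statistical test actually used by Royer and Cunningham is not described in the text above.

```python
from math import comb


def expected_chance_score(n_items, n_options):
    """Mean number correct if every item is answered by blind guessing."""
    return n_items / n_options


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 45 items with 4 answer options each.
print(expected_chance_score(45, 4))

# Hypothetical example: 10 of 17 examinees answer one 4-option item
# correctly without the passage. How likely is that under guessing?
print(prob_at_least(10, 17, 0.25))
```

A small tail probability for an item indicates that more examinees chose the keyed answer than guessing alone would predict.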
Tuiman and Gray (1972) conducted a study demonstrating that 7th grade students could answer most of the multiple choice questions included in a reading comprehension test even if the text was incomplete. The researchers used three versions of the reading passages contained in these tests: 1) the passages in the original form, 2) the passages with 30% of the non-function words deleted, and 3) the passages with 50% of the words deleted. The students read one of the three types of passages and answered multiple choice questions based on the passages. The mean test performance of the group which read the 30% reduced passages was only 13% less than that of the group that read the unmutilated passages. The group that read the text which had half of the words missing had scores 23% less than the group that read the whole text.
The studies by Tuiman (1973-1974; Tuiman & Gray, 1972) and Royer and Cunningham (1981) investigated test performance when examinees did not read the passages upon which the multiple choice questions were based. They found that test performance was not dependent upon reading the text passages. A study by Drum, Calfee, and Cook (1981) provides additional evidence that test performance is not dependent upon comprehending the passage part of multiple choice questions.
Drum, Calfee, and Cook (1981) investigated the relationship between performance on norm referenced reading comprehension tests and several structural properties of the tests. They analysed 18 major reading comprehension tests. Test performance was indexed by the p value for each test question. A p value is the proportion of all examinees who get a particular test question correct. The p values used in the study were those reported in the norming statistics obtained from the test publishers.
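The p value defined above is simple to compute from scored responses. The sketch below assumes a hypothetical data layout (one list of 0/1 item scores per examinee) and a hypothetical function name `item_p_values`; the values used by Drum, Calfee, and Cook came from publishers' norming statistics, not from a calculation like this one.

```python
def item_p_values(scored_responses):
    """Return the p value (proportion correct) for each test question.

    `scored_responses` holds one list per examinee, with 1 for a correct
    answer and 0 for an incorrect answer on each item (assumed layout).
    """
    n_examinees = len(scored_responses)
    n_items = len(scored_responses[0])
    return [
        sum(examinee[item] for examinee in scored_responses) / n_examinees
        for item in range(n_items)
    ]


# Four hypothetical examinees answering three items.
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
print(item_p_values(scores))
```

A high p value marks an easy item and a low p value a difficult one, which is why it can serve as an index of test performance at the item level.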
Drum et al. (1981) specified the structural components of the test by identifying sixteen predictor variables characterizing the passage, the stem of the question, the correct answer, and the incorrect answers. The sixteen variables mentioned above are listed in Table 2. There were four variables for each of the four components of the test question.
By using a multiple regression procedure, Drum et al. (1981) found the 16 predictor variables accounted for 72% of the variance in test performance. However, only 12% of this variance was associated with the predictor variables representing the structure of the passage. Structural variability in the stem of the question and in the correct and incorrect answers accounted for 59% of the variance in test performance.
The small amount of variability in test performance associated with passage characteristics poses a significant question about norm referenced tests which use multiple choice questions. Although the primary purpose of the test is to assess the comprehension of text, test performance appears to be more highly associated with reading questions and answers.
Drum et al. (1981) address a second issue that suggests test performance is difficult to interpret. Passage difficulty and item difficulty are often inconsistent. For example, one passage may have easier words than the words in the stem or answer. Another question on the same test may have a similar passage, but the stem of the question may require the comprehension of a difficult phrase. From their structural analysis of 18 norm referenced tests, Drum et al. (1981) report that "vocabulary, propositional density, syntax, and test-taking demands are all changed in a confounded manner." (p. 511)
One implication of the Drum et al. (1981) study is that reading test scores may be difficult to unambiguously interpret. Since components of the test do not vary systematically and the structural characteristics of the passage are not highly related to test performance, one can rarely isolate the variables which account for test performance.
This section of the chapter has presented research on multiple choice test performance. Although much of the research suggests that test questions can be answered at above chance levels without reading the text, the relative independence of structural characteristics in the stem and answers from the text presents another possibility. Examinees might correctly comprehend the text, but find the question stem or answers incomprehensible. Given this possibility, and the evidence that subjects can answer the questions correctly without reading the passage, the tests could easily misrepresent the student's reading comprehension ability. In the case where a student responded correctly without reading the passage, the test score could overestimate reading comprehension performance. In another instance, the student might comprehend the passages of a test, but choose an incorrect response because the question stem or answers were incomprehensible. This might lead to a test score which underestimates a student's reading comprehension ability when reading text.
The next section of this chapter presents a different problem in interpreting test scores from norm referenced reading comprehension tests: reading comprehension performance may be confounded with intelligence.
Reading comprehension tests and intelligence tests.
This section of the chapter will describe research which suggests intelligence and reading comprehension as measured by standardized tests are closely related. The research that will be cited supports this contention by demonstrating high correlations between scores on intelligence tests and scores on reading comprehension tests. Initially evidence of these high correlations will be presented, followed by a discussion of two alternative interpretations of the high correlations between intelligence test performance and reading comprehension test performance.
Several studies report high correlations between intelligence test scores and reading comprehension test scores. Harootunian (1966), for example, assessed 7th and 8th grade students' performance on the California Test of Mental Maturity, the California Achievement Test (vocabulary and comprehension), the Iowa Every-Pupil Test of Basic Skills, and an assortment of 15 other tasks. These tasks were similar to those used by Guilford in isolating different intellectual abilities. For example, the "Critical Thinking" task required the subject to decide whether inferences from given information were logical, and the "Incomplete Words" task required the subjects to identify whole words when they were given part of a word. Having calculated an intercorrelation matrix between scores on these 17 tests or tasks, he found reading test scores were highly correlated with IQ scores (r=.57) and with Critical Thinking scores (r=.53).
A different study by Guice (1969) found that reading comprehension scores of college students as measured by the Co-operative English Test of Reading Comprehension were correlated .64 with their scores from the Otis Quick Scoring Test of Mental Abilities.
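The correlations reported in these studies are Pearson product-moment coefficients, and squaring one gives the proportion of variance the two sets of scores share. For example, r = .64 corresponds to .64 x .64 = .41, or about 41% shared variance. The sketch below shows the calculation; the function name `pearson_r` and the score lists are hypothetical and are not data from any of the studies cited.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six students (illustration only).
reading_scores = [12, 15, 9, 20, 17, 11]
aptitude_scores = [98, 105, 92, 118, 101, 99]

r = pearson_r(reading_scores, aptitude_scores)
print(round(r, 2), round(r * r, 2))  # correlation and shared variance
```

The squared value is the quantity at issue whenever one test is said to account for a percentage of the variance in another.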
Sassenrath (1972-1973) presents a different type of analysis of the relationship between intelligence test performance and reading comprehension test performance. He collected test scores of college, high school and 4th grade students. The test scores represented intelligence, reading comprehension, and many other reading subskills. The subskills were assessed by test questions used in previous research (Holmes & Singer 1961; Singer 1964, 1965). These researchers used test questions to assess what they believed were all the conceivable reading subskills. A few examples of subskill tests are "vocabulary in isolation", "verbal analysis", "word sounds", "phoneme similarity", and "perception of verbal relationships". Sassenrath calculated an intercorrelation matrix using a single composite score for each of the subskill tests. He wanted to identify common reading factors which were shared by several tests. Therefore, he factor analysed the correlations between the test scores. The logic behind factor analysis is that different tests which load on the same factor may represent the same underlying cognitive trait. The result of interest here is that the reading comprehension test and the intelligence test loaded on the same factor for both college and high school students.
A different study by Thorndike (1973-1974) found very high positive correlations between intelligence test scores and reading comprehension test scores. He found such high correlations between norm referenced reading comprehension test scores and intelligence test scores that he concluded that reading beyond decoding is inseparable from reasoning. He gathered data from an elementary school district which used an aptitude test one year and a norm referenced reading comprehension test the next. Therefore, the data represents reading comprehension test performance and aptitude test performance on the same students over eight years. Thorndike reports that test scores from early elementary grades will predict a substantial amount of the variance in the scores of a test given later in the elementary grades. Table 3 reports the seven tests sequentially to emphasize the amount of variance in the 7th grade Otis Alpha and the 8th grade reading test scores that is accounted for by earlier reading or aptitude test performance. The grade two Metropolitan Reading test accounts for 62.4% of the variance of grade seven Otis Alpha test performance. The grade three Otis Alpha aptitude test accounts for only 7.6% more grade seven aptitude test variance than the reading test. Reading tests and aptitude tests at the second grade are both very effective in predicting later test performance.
Several studies have been reported suggesting that reading comprehension and intelligence are closely related. The evidence for this assertion is the consistently high positive correlations between reading comprehension test scores and aptitude test scores. The discussion will now turn to the two alternative interpretations of these high correlations. The first interpretation is that performance on reading comprehension tests and on intelligence tests is supported by essentially the same cognitive process. The second interpretation suggests that the high correlations may be due to similarities in the tests rather than cognitive processes.
Thorndike (1973-1974) argues that reading comprehension is a highly complex cognitive process which is inseparable from reasoning. If this is true, there are certain implications for how difficult it may be to improve the reading comprehension of students who perform poorly on norm referenced reading tests. These students may have general intellectual deficits which may be difficult to improve. Thorndike writes that a barrier "is set by the child's limited comprehension of what he reads, which we see now as not primarily a deficit in one or more specific and readily teachable reading skills but as a reflection of generally meager intellectual processes. And this barrier promises to stand in the way of a wide range of future learnings." (p. 147) In other words, Thorndike suggests that student improvements in reading comprehension may be limited by the student's intelligence.
An alternative interpretation of the high correlations between intelligence test scores and norm referenced reading comprehension test scores argues there are similarities between the tests rather than in the cognitive processes underlying reading comprehension and intelligence. Norm referenced reading comprehension tests and intelligence tests may be similar in at least two ways: 1) performance on some of the multiple choice questions used to test reading comprehension may be more dependent on inferential reasoning than on language comprehension, and 2) there may be similar procedures used by the publishers of tests in selecting items for both types of tests, thereby creating tests that are artificially measuring the same process.
One type of multiple choice question which may require inferential reasoning is the scriptally implicit test question. In this type of question, part of the information which is needed to answer the question correctly is in the text. Another part of the information must be supplied from the reader's world knowledge. Royer and Cunningham (1981) give an example of this type of question. The passage from which the question was derived was about a boy and his father going camping. The passage did not describe what instrument was used to pound tent stakes. However, the question asked: "What did John's father probably use to pound the tent stakes?"
The response may very well depend upon the reader's previous knowledge of the subject. A child without camping experience may respond: "pound the stake with a rock", while a child with camping experience may mark the keyed response: "pound the stake with a hatchet." The scriptally implicit multiple choice question appears to require both reading skill and general knowledge. A rich store of general knowledge may also affect performance on intelligence tests. For example, the Wechsler Intelligence Scale for Children (WISC) has a subtest titled "information". In this test, test items assess "knowledge that an individual with an average opportunity may be able to acquire" (Ellias, Ellias, & Ellias 1977, p. 57). The Stanford-Binet Intelligence test also has a "general information" subtest assessing general world knowledge (Kagen & Lang 1978).
A second reason that performance on reading comprehension tests and aptitude tests is highly related may be item selection practices when the tests are written. Norm referenced reading comprehension tests are often called achievement tests. Popham (1978) comments upon the similarities between achievement tests and aptitude tests. He contends that norm referenced achievement tests are continually revised to achieve an appropriate distribution of scores. If a test question is consistently answered correctly by examinees, that question will be removed from the test. This suggests that if teachers are successful in teaching the particular skill assessed by the test item, the test item may be removed from the test. After a few years, Popham contends, the achievement test will become less sensitive to instructional effects if students do well on the tests. In time, the achievement test will measure just what intelligence tests were designed to measure. Intelligence tests assess cognitive processes which are "brought to school", rather than what is learned "in school". If Popham's contention is correct, the item selection procedure would contribute to the high positive correlation between intelligence test scores and norm referenced reading comprehension test scores.
This section of the thesis has presented evidence that reading comprehension test performance is highly related to intelligence test performance. Multiple choice test questions which require the reader to generate inferences may use similar cognitive processes to those used in intelligence tests. Research needs to demonstrate whether intelligence and reading comprehension are inherently related or whether the association is merely due to test characteristics. One way this could be done is to show that readers can improve their reading comprehension performance without changing their intelligence.
If inferential multiple choice questions require students to use cognitive processes similar to the cognitive processes they would use on an intelligence test, then norm referenced reading comprehension test performance may be confounded with intelligence. It is possible that another type of test question would yield a different reading comprehension score for the students. This presents a problem for the teacher who wants the most accurate assessment of reading comprehension performance. If two types of questions yield different scores, which one is more accurate? The third section of this chapter addresses this problem in greater detail, focusing on research that demonstrates reading test performance varies with the type of test question used to assess comprehension.
Reading test performance varies with type of test question.
This section of the chapter continues the argument that cloze and multiple choice test questions are inadequate methods for assessing reading comprehension. The major problem that will be emphasized is that reading test scores vary both with the type of test question used to measure reading comprehension, and with the subjects who take the test. In other words, reading performance appears to change depending upon the type of test question and on certain characteristics of the reader. Initially research will be reported investigating how cloze tests yield different performance indices from other reading comprehension tests. Then research will be presented demonstrating differences in reading test scores due to the type of multiple choice test question and the world knowledge that the reader brings to the test.
The cloze procedure reportedly measures reading comprehension by assessing whether subjects can supply the missing words from a partially deleted text. However, Carroll (1972) suggests that the cloze test is primarily a measurement of syntactic redundancy of a language rather than semantic understanding of a text.
Tuiman, Blanton, and Gray (1975) designed an experiment to investigate Carroll's (1972) assertion. The Tuiman et al. (1975) study is a different analysis of the Tuiman and Gray (1972) experiment reported earlier in this chapter. Two groups of subjects answered either a cloze test or a multiple choice test assessing their comprehension of the same text. Each test group was divided further into three conditions which differed in the amount of text which was read by the subject: 1) a complete text, 2) a text that had 30% of the words from the complete text deleted, and 3) a text that had 50% of the words from the complete text deleted. Table 4 reports the mean test performance for the six test conditions. Tuiman and Gray argue that the scores are so much lower on the reduced text cloze test (compared to the regular cloze test of the complete text) because the deletion of the words eliminated the syntactic redundancy of the text. Since they believe the poor performance was due to the inability of the subject to use text redundancy, they infer that performance in a cloze test is primarily a measure of syntactic redundancy rather than reading comprehension.
Table 4 also demonstrates that cloze test scores are considerably different from multiple choice test scores. This suggests that the two types of test questions may assess different reading processes.
A study by Weaver (1963) provides further evidence that cloze tests and multiple choice tests measure different skills. Weaver (1963) administered 18 tests to college students. Eight of these were cloze tests. The remaining tests assessed a variety of reading and language abilities. He factor analysed the scores from the tests, finding that the eight cloze tests loaded on the same factor. The cloze test scores were highly correlated with each other, but cloze test scores were not correlated with the scores from other tests of reading comprehension.
This research suggests that test scores from cloze tests may present a different estimate of reading comprehension performance than other reading comprehension tests. Test scores will also change depending on the type of multiple choice question and certain characteristics of subjects. Research showing varied scores with different types of multiple choice questions is cited below.
Johnston (Note 1) recently reported a study in which subjects received different scores on a reading comprehension test depending on the type of multiple choice test question and the subject's world knowledge. He manipulated the type of multiple choice question, the availability of text, and the relative knowledge the readers had of the text. Eighth grade students from a rural school and from an urban school read three passages. Each passage had a different theme. The passages were about corn, urban transit problems, and the Civil War. According to their scores on a vocabulary test, rural readers had a greater knowledge of corn, while the urban readers had greater knowledge of urban transit problems. They had equivalent knowledge of the Civil War.
These texts were assessed with the three types of multiple choice questions mentioned earlier in the thesis: textually explicit, textually implicit, and scriptally implicit. In a textually explicit question, a single sentence in the passage contains both the question information and the answer information. Therefore the reader does not have to combine information across sentences. In textually implicit questions, the question information is in a different sentence from the answer information. In order to answer a textually implicit question, the reader must combine information across sentences. Scriptally implicit test questions have part of the answer in the text, but the reader must use his world knowledge to generate a reasonable answer.
These three question types were further varied to assess the centrality of the test question. Test questions either assessed central ideas (main ideas) of the text or peripheral ideas of the text.
The other manipulation in Johnston's experiment varied whether the subjects could reread the text. This condition presumably placed different demands upon long term memory (LTM) by altering the availability of the text when answering multiple choice questions. There were three memory demand conditions: no demand on LTM, slight demand on LTM, and greatest demand on LTM. Subjects could reread the text in the no demand condition. In the slight demand condition, subjects were not allowed to reread the text, but they answered the questions immediately after their initial reading. In this test condition, the readers may have used their short term memory of the text to answer some of the questions. The third test condition involved the greatest demand upon LTM, requiring subjects to complete an intervening task five minutes after their initial reading of the passage and then answer the questions without rereading the text.
In summary, Johnston assessed reading comprehension varying subject knowledge, question type, memory demand, and centrality of the test question. He reports three results which are relevant to this thesis:

1) Prior knowledge of the topic accounted for a significant amount of the variance in test performance. Subjects with greater knowledge of a theme had a higher percentage of correct answers than subjects with less knowledge of that theme.

2) The type of test question affected performance. The percentage of correct responses was highest on textually explicit questions, followed by textually implicit questions. Scriptally implicit test questions had the lowest percentage of correct responses.

3) Subjects scored higher when the text could be reread than when they had to rely on their memory of the text.
The basic conclusions which can be drawn from these results are that multiple choice reading test performance varies with three conditions: the type of test question, whether subjects can reread the text, and whether subjects have previous knowledge of the text theme. This suggests that different norm referenced tests of reading comprehension might yield different assessments of reading performance. There may even be different types of questions on the same test. It would be interesting to investigate whether the type of test question and the centrality of the questions are equally represented in major norm referenced reading comprehension tests.

Both the type of test question and the centrality of the question could be identified with considerable effort. However, the suggestion that multiple choice test performance is affected by the world knowledge of the reader presents a special problem in interpreting test scores. The problem is particularly evident when subjects receive low scores on a norm referenced reading comprehension test. The scores may correctly suggest that the examinees have poor reading ability. However, the examinees may also have performed poorly on the test due to a mismatch between their world knowledge and the content of the multiple choice passages on the test. The same examinees might perform at a higher level if they answered another multiple choice reading comprehension test in which the passages matched their world knowledge. This reasoning suggests that examinees could have their reading comprehension ability underestimated if the passage content does not relate to their world knowledge. Or, if the norm referenced reading comprehension tests were used to assess changes in reading comprehension performance over time, the tests might be insensitive to changes in reading ability.
There is some suggestion that norm referenced reading comprehension tests may be insensitive to gains in reading comprehension performance of lower socio-economic children. The evidence comes from the educational improvement program called Follow Through. Follow Through was a federal project authorized by Congress in 1967 to provide poor children in the primary grades with educational, health, and social services. Appropriations did not allow all three of these services, so Follow Through was converted into a massive educational research project. The project established a variety of different programs to improve the education of poor children. Each of these programs attempted to improve the reading comprehension of poor children. Reading comprehension performance, as measured by norm referenced tests, did not improve significantly over control groups in any of the programs (Becker, 1977). This has generally been interpreted as evidence that the children's reading comprehension did not improve. But it is also possible that the tests were insensitive to gains in reading comprehension.

A more thorough examination of one of the Follow Through programs will suggest that reading comprehension might have improved, but the norm referenced test used to assess reading comprehension was insensitive to that gain. The program which most clearly presents this pattern was the Direct Instruction program. In this program, reading was initially taught with emphasis on decoding skills. Later, the program stressed reading comprehension. According to Becker (1977), comprehension "skills" were taught within a structured format, with children learning to extract and use information and "rules" of problem solving. These skills were instructed by using content areas which had considerable structure. For example, a few of the content areas were astronomy, muscle function, and measurement.
Reading performance was measured by two norm referenced tests. The Wide Range Reading Test (WRAT) was used to assess decoding performance. The Metropolitan Total Reading Test (MAT) was used to assess reading comprehension performance. Table 5 reports the reading percentile scores for the WRAT in pre-kindergarten compared to after grade three. Reading comprehension of pre-kindergarten Direct Instruction children was not assessed. These scores suggest decoding performance improved considerably due to the Direct Instruction experience. Table 5 also reports that the same children had less gain in the MAT reading comprehension scores. After grade three they scored at the 40th percentile. This is compared to a mean control percentile of 20, which is the expected percentile of lower socio-economic children after the third grade who did not participate in Follow Through (Becker & Carnine, 1978).

Given these test scores, one may arrive at different interpretations of the reading comprehension abilities of children participating in Direct Instruction. Becker (1977) argues that the children did not improve considerably in reading comprehension. He suggested the vocabulary items on the MAT were too difficult for the children. He cites examples of vocabulary items from the MAT: "amazon ant, disease-causing germs, penicillin, et. al." (p. 530). He says they were too difficult because the children had little exposure to the appropriate words in their homes. Becker argues that these items were beyond the experience of the children enrolled in the Follow Through program.
Another interpretation of the Direct Instruction test scores is that the children's reading comprehension performance may have been underestimated by the MAT test. The test might have required them to know vocabulary terms as Becker noted. The test may also have had multiple choice test items which required the children to generate inferences from a knowledge base which they had never developed. Since the children had greatly improved in their decoding performance, they might have received scores higher than the 40th percentile if they answered different types of reading comprehension test questions.
The Direct Instruction reading comprehension results have provided an example of how difficult it is to interpret reading comprehension test scores of disadvantaged children. The need for an alternative method of measuring reading comprehension stems naturally from the evidence presented in this chapter.

The primary focus of the chapter has been to build an argument that norm referenced reading comprehension tests are inadequate when they consist of cloze tests and multiple choice test items. The second chapter of this thesis presents an experiment which used the sentence verification technique to measure reading comprehension. The sentence verification technique may provide a viable alternative to traditional means of assessing reading comprehension.
Footnotes

[1] The term "script" has been used by several researchers to refer to the structure of knowledge in memory (cf. Schank & Abelson, 1977).
CHAPTER II
AN EXPERIMENT INVESTIGATING THE CONSTRUCT
VALIDITY OF THE SENTENCE VERIFICATION TECHNIQUE
This chapter presents an experiment investigating the construct validity of the sentence verification technique as a method of measuring reading comprehension. In its most general form, the sentence verification technique involves having subjects read or listen to a sentence or passage. The subjects are then presented with a sentence that they are asked to make a decision about. This decision can involve judging whether the sentence is true or false, or, more relevant to the present experiment, it can involve judging whether the sentence is the "same" or "different" from a sentence the subject was exposed to in the earlier phase of the experiment. Same judgments can be made on the basis of exact similarity between an original sentence and a test sentence, or on the basis of semantic similarity (i.e., a paraphrase relationship) between original and test sentence. Different judgments can be based on total dissimilarity between original and test sentence, or on the basis of altering one or more words in an original sentence so as to alter its meaning relative to the original sentence.
Researchers have frequently used a sentence verification task to assess memory of discourse. Sachs (1967, 1974), for example, conducted two studies using a sentence verification task to assess memory. The results of these two studies are also particularly relevant to the use of the sentence verification technique for measuring reading comprehension.
Sachs conducted two experiments in which subjects had to make a decision of whether sentences were changed based on their memory of the original sentence. The first study (1967) investigated sentence memory with a task in which subjects heard both the original passage and the test sentences. The second study (1974) replicated the listening task of the first study and extended the task by having subjects read a passage and respond to test sentences which were in print. The second study will be described in greater detail.
Sachs (1974) had college students either listen to a passage and respond to aural test sentences or read a passage and respond to written test sentences. In both cases, test sentences were used which were based on the original passage sentence. Sachs used five different forms of test sentences: identical, semantic, passive-active, formal, and lexical. Table 6 lists one example of each form of test sentence. Having heard or read the passage, the subjects decided whether the test sentence was changed from the original passage sentence. They could not listen to the passage again, or reread the passage. The correct decision would be that all semantic, passive-active, formal, and lexical sentences were "changed" from the original sentence.
Sachs also varied the amount of material interpolated between the original sentence and when the subject heard or read the test sentence. The range of interpolated material was from 0 to 80 syllables. In the condition with 0 interpolated syllables, the test sentence would be presented immediately after the subject read or heard the original sentence. Sentences were added between the original sentence and the test sentence to create the interpolated syllables. These additional text sentences provided a continuation of the passage, and were consistent with the theme of the original sentence.
Sachs measured the percentage of correct decisions for the different test sentences at various levels of intervening material. She found the pattern of the data was similar for both the listening and reading conditions. The accuracy for correctly identifying that the test sentence was changed from the original decreased with an increase in the amount of intervening material. After 80 syllables had intervened between the original sentence and the test sentence, the subjects' ability to detect that formal and lexical test sentences were changed was considerably reduced from accuracy at the 0 syllable level. Sachs suggested that these results provide evidence that the exact words of a sentence are not retained in memory when subjects hear or read discourse. Subjects had difficulty identifying changes in a few words when the overall meaning of the sentence remained the same. Given these results, Sachs contends that meaning or "gist" of the sentence is retained in memory rather than the exact wording of the sentence.
The Sachs experiments illustrate the use of a sentence verification technique to assess memory of a sentence. Royer, Hastings, and Hook (1980) were the first researchers to use a sentence verification technique to assess reading comprehension. They reported two experiments in which a sentence verification technique was used to measure reading comprehension of elementary school students.
The sentence verification technique used by Royer et al. (1980) requires the subject to read a text and then respond to a series of test sentences without rereading the text. The basic decision for the subject is whether test sentences have the same meaning or a different meaning from the original text sentence. The sentence verification technique will be referred to as the SVT.

The SVT consists of four test sentences for each sentence in the text. The four test sentences are an original sentence, a paraphrase sentence, a meaning change sentence, and a distractor sentence. Table 7 lists an example of each form of the SVT. These four test sentences will be described in greater detail in the method section of the thesis. The procedure for writing the four kinds of test sentences is the same for both the Royer et al. (1980) study and the experiment reported in this thesis.
The subject's task in the Royer et al. (1980) study was to decide whether a test sentence had an old meaning or a new meaning compared to the passage sentence. The subjects did not reread the passage, but made an old or new response on the basis of their memory of the text.
In the first experiment reported by Royer, Hastings, and Hook (1980), students from the 5th and 6th grades read three passages which varied in text difficulty. The students' teachers had selected texts which were two grades below, on grade level, and two grades above the reading level of the class. Therefore a fifth grade student read texts which were approximately at grade three, grade five, and grade seven reading difficulty.
In the second experiment reported by Royer, Hastings, and Hook (1980), fourth grade and sixth grade students read passages which were either two grade levels below their reading grade level, on their grade level, or two grade levels above their current reading grade level. This range of text difficulty meant that the above grade text for the fourth grade students was the same text as the on grade level text for sixth grade students. Likewise, the below grade text for sixth grade readers was the on grade text for fourth grade readers.
The procedure was the same for the two studies reported by Royer et al. (1980). Since the SVT was used in both studies, the results of both studies will be presented together. After a practice session in which the students learned how to respond to the sentence verification technique, the students read the three passages, responding to the SVT on each passage.

Royer et al. (1980) used the proportion of correct SVT responses and d' scores for dependent variables. They used d' scores because they are considered a criterion free measure of response accuracy. A d' score may be considered a z score representing the probability that the subject is responding on the basis of chance. A d' of 3.0 is a very high score, indicating very little probability that the subject's responses were made on the basis of chance.

Analysis of the proportion of correct responses and the d' scores for each passage showed that SVT performance declined with increasing text difficulty. In their comparisons of students at different grade levels who read the same text (experiment two), Royer et al. (1980) found that the higher grade level students received higher SVT scores than the lower grade level students. This is what one would expect, given the greater experience of the older students.
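The two dependent variables described above can be illustrated with a short computational sketch. The thesis text does not give a formula for d', so the sketch assumes the standard signal detection definition, d' = z(hit rate) - z(false alarm rate). It further assumes that an "old" response to an original or paraphrase sentence counts as a hit and an "old" response to a meaning change or distractor sentence counts as a false alarm. The function names, the correction for extreme rates, and the example counts are all hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, n_old, false_alarms, n_new):
    """Return d' = z(hit rate) - z(false alarm rate).

    Assumption of this sketch: a log-linear correction (+0.5 / +1) keeps
    the rates away from exactly 0 or 1, where the z transform is undefined.
    """
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def proportion_correct(hits, n_old, false_alarms, n_new):
    """Correct responses are hits plus correct rejections."""
    correct = hits + (n_new - false_alarms)
    return correct / (n_old + n_new)


# Hypothetical subjects on a 12 sentence test (6 old meaning, 6 new meaning).
guessing = d_prime(hits=3, n_old=6, false_alarms=3, n_new=6)
accurate = d_prime(hits=6, n_old=6, false_alarms=0, n_new=6)
print(round(guessing, 2), round(accurate, 2))
```

Under these assumptions a subject who says "old" equally often to old and new sentences obtains a d' of zero regardless of response bias, which is the sense in which the measure is criterion free.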
Therefore, the Royer et al. (1980) study contributed evidence that the SVT was a valid method of measuring reading comprehension in two ways: 1) the SVT was sensitive to text difficulty, and 2) the SVT was sensitive to expected differences in student reading ability.
A recent experiment by Royer, Lynch, and Bulgarelli (Note 2) extended the investigation of using the SVT as a method of assessing reading comprehension. Their experiment investigated whether reading comprehension varied for readers with different degrees of subject matter expertise when they read passages within or outside their presumed knowledge area.
Royer, Lynch, and Bulgarelli (Note 2) had three groups differing in general educational experience and psychology expertise read passages about psychology and non psychology topics. Undergraduates with no previous university psychology courses were the group with the least expertise. Upper division undergraduate students with several previous courses in psychology were the group with moderate expertise. Psychology graduate students had the greatest amount of expertise. All of the subjects read non psychology passages and psychology passages.
Each subject in Royer et al. (Note 2) read three psychology passages and three non psychology passages. The passages were drawn from a pool of six psychology and six non psychology passages. The six tests were assembled into test packets. When complete, there were two sets of six passages in the test packets which were evenly distributed in the undergraduate, advanced undergraduate, and graduate groups.
Royer, Lynch, and Bulgarelli (Note 2) assessed reading comprehension performance by measuring the subject's responses to six sentence verification tests. Each test had 12 test sentences. They constructed each SVT so that it contained 3 original test sentences, 3 paraphrase sentences, 3 meaning change sentences, and 3 distractor sentences. By using four different forms of the test for each passage, they were able to assess each passage sentence with an equal proportion of original, paraphrase, meaning change, and distractor test sentences.
Royer, Lynch, and Bulgarelli (Note 2) measured both the amount of time subjects used to read the passages and three variables assessing accuracy: proportion correct on the SVT, d' on the SVT, and a d' combination of SVT performance and a confidence rating. Since the d' and combined SVT-confidence rating data were similar to the proportion correct data, only the proportion correct data will be described here.
The results of the study were consistent with the interpretation that the SVT was measuring reading comprehension. Table 8 reports the mean proportion correct SVT scores for the three expertise groups. An analysis of variance indicated that performance improved with greater educational experience. Another result was that reading comprehension scores were significantly higher for the upper level undergraduates than undergraduates, even though they read the passages in the same amount of time. This suggests that the SVT was sensitive to reading comprehension differences which were not related to the reading time for a passage.

Royer et al. (Note 2) found that SVT performance improved with greater educational experience. They had initially expected SVT performance to improve more for the psychology passages compared to the non psychology passages across the three Groups. They failed to find a significant Content of passage X Group interaction.
These results could mean that overall reading "ability" increased in accordance with increasing educational experience, thereby producing better SVT performance, or they could mean that the knowledge required to comprehend both the psychology and the non-psychology passages increased with advancing educational experience, and it was the knowledge gain rather than general ability that produced the pattern of results. The results seem to suggest that the non psychology passages had such wide ranging topics that students with greater general education and knowledge may have utilized their world knowledge to comprehend the non psychology passages as well as the psychology passages.
There are numerous experiments which have found that previous knowledge affects the comprehension of text. Previous knowledge may affect the comprehension of text in several ways. Johnston (Note 1) reported that subjects who had greater knowledge of a topic (compared to subjects with less knowledge of the topic) had higher scores on his reading comprehension test. These subjects did not receive higher test scores when they read text outside their high knowledge area. Subjects also recall greater amounts of a previously heard text if they have more knowledge of the theme of the text (Voss, Vesonder, Spilich, 1980). The pattern of recall is also similar to previous knowledge of a particular topic. The pattern of recalled ideas may be influenced by the subject's previous knowledge of the topic (Anderson, 1977; Bransford & Franks, 1971; Dooling & Lachman, 1971).

Subjects may also develop schemata of certain types of text. For example, through repeated experience with stories, subjects may learn that stories consist of parts such as: beginnings, settings, reactions, outcomes, and endings. Several experiments have found subjects have better recall for particular parts of stories. The parts of the stories recalled appear to be based upon their previous knowledge of story "scripts" (Mandler & Johnson, 1977; Stein & Glenn, 1979).
The second alternative interpretation of the Royer, Lynch, and Bulgarelli (Note 2) results suggests that the groups may have differed in "intelligence" or overall reading ability. The Royer et al. (Note 2) study could not distinguish between the two alternative interpretations of the results.
The experiment in this thesis stems naturally from the Royer, Lynch, and Bulgarelli (Note 2) study. The thesis experiment is essentially a construct validity experiment investigating whether the SVT is a valid method of assessing reading comprehension. Therefore, it extends the research from both the Royer, Lynch, and Bulgarelli (Note 2) and the Royer, Hastings, and Hook (1980) experiments.
There are three areas of investigation in which the thesis experiment extends previous research on the SVT. The first area of investigation addressed the association between SVT performance and general ability. The current experiment is designed to reduce effects of general ability on reading comprehension performance. The experiment uses reading comprehension measurements on the same subjects at two different time periods. If there are improvements in reading comprehension performance, it is not likely that their general ability has also improved. In the Royer, Lynch, and Bulgarelli (Note 2) research, the three groups of subjects were at such different academic levels that it was plausible the subjects could differ in ability. The current experiment uses subjects with a much more narrow range of academic experience.
The second extension from previous research is that the current experiment will utilize both the SVT and free recall performance as indices of reading comprehension. Both of the previous studies used the SVT as the only method of measuring reading comprehension.
The third extension is that readers in the thesis experiment may develop a knowledge base or schemata which is relevant to interpreting the psychology passages they will be reading. Previous research has shown that subjects develop schemata based upon the organization of text. Most of this research has demonstrated the effects of story schemas upon the recall of story text (Mandler & Johnson, 1977; Stein & Glenn, 1979). The thesis experiment is designed to assess the reading comprehension of college students during the time that they may be developing schemata of the structure of text from psychology journals. Schemata of psychology text may develop as the students conduct psychology experiments and write lab reports which conform to the general guidelines of psychology journal articles.
The hypotheses of the current experiment build on the research areas of interest reported in this chapter. These previous research areas demonstrated that the world knowledge of the reader influences their reading comprehension, and the SVT has received support as a valid technique for measuring reading comprehension. Therefore, the experiment reported below addresses two basic hypotheses: 1) there should be improvement in the SVT scores of psychology text (between early and late tests) which is not matched by an improvement in SVT scores of non psychology text. 2) There should be a positive correlation between SVT reading comprehension scores and recall scores. If the results of this experiment are as predicted, the results will provide additional evidence supporting the interpretation that the SVT is a valid method of measuring reading comprehension.
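The two hypotheses can be stated computationally. The following is a minimal sketch only: it assumes each subject has a proportion correct SVT score and a recall score, and it expresses the first hypothesis as a difference between mean gains and the second as a Pearson correlation. The statistical tests actually applied to the data are not specified in this section, and the function names and all scores shown are hypothetical.

```python
from math import sqrt
from statistics import mean


def gain_difference(psych_early, psych_late, non_early, non_late):
    """Hypothesis 1: psychology gain minus non psychology gain.

    A positive value means SVT scores improved more on psychology text.
    """
    psych_gain = mean(psych_late) - mean(psych_early)
    non_gain = mean(non_late) - mean(non_early)
    return psych_gain - non_gain


def pearson_r(x, y):
    """Hypothesis 2: product moment correlation of paired scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five subjects (not data from the experiment).
svt_scores = [0.58, 0.67, 0.75, 0.83, 0.92]
recall_scores = [0.20, 0.25, 0.35, 0.40, 0.55]
print(pearson_r(svt_scores, recall_scores) > 0)
print(gain_difference([0.6, 0.7], [0.8, 0.9], [0.6, 0.7], [0.6, 0.7]) > 0)
```

Descriptive values of this kind would only indicate the direction of an effect; a significance test would still be needed to evaluate either hypothesis.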
METHOD
Subjects and Design. The 82 subjects in this experiment were students enrolled in an upper division undergraduate psychology course entitled Methods of Inquiry in Psychology. Almost all of the students were psychology majors. The curriculum of the course required students to conduct experiments and write laboratory reports which followed a style similar to the American Psychological Association format for journal articles (cf. Publications Manual, 1974). The students attended two and one-half hours per week of lecture and another two and one-half hours per week of laboratory experiments.

The reading comprehension tests were taken during the regularly scheduled lecture time. There was an early semester test session and a late test session. The students received the tests in a counterbalanced fashion, as noted in Table 9. Students received both a psychology passage and a non psychology passage at each test session.
Materials. All of the materials were selected from a larger pool of passages and SVT tests used in the Royer, Lynch, and Bulgarelli (Note 2) study. Each subject read four passages that were twelve sentences in length. The passages differed from each other in total number of words and topic. Two of the passages were concerned with psychology and two were concerned with non psychology topics. The psychology passages were re-written psychological abstracts taken from psychology journals published in the late 1960's. The non psychology passages were re-written book reviews from the Non-Fiction in Brief section of the New York Times Sunday Book Review Section appearing in the late 1960's. All of the passages are reproduced in the Appendix.

There were four types of SVT test sentences: original, paraphrase, meaning change, and distractor. An example of each type of test sentence is listed in Table 7. Original sentences were identical to a sentence from the passage. Paraphrase sentences preserved the same meaning as the original sentence, but the meaning was expressed with different words. Meaning change sentences had most of the words of the original sentence, but one or a few words were changed to alter the meaning of the sentence. Distractor sentences were consistent with the general theme of the passage, but unrelated to any original sentence. The distractor sentences were written (intuitively) to be the same as the original sentence in length, difficulty, and syntactical structure.
The SVT test sentences were used in four test forms. There were four forms based on each passage. Each form had 12 test sentences assessing comprehension of the passage. The test forms were written in several steps. The first step was to write test Form 1 so that the first six test sentences were based on the initial six sentences of the passage, and the next six test sentences were based on the last six passage sentences. The purpose of this ordering was to increase the amount of time intervening between the appearance of an original passage sentence and the appearance of a test sentence based on that original sentence, thereby reducing the possibility that the response to the test sentence would be based on the contents of short term memory. The second step was to randomize the test sentences within the first six sentences and the second six sentences. The result of this procedure was a test in which the order of test questions did not correspond to the order of sentences in the passage. For example, the first test sentence might be derived from the fourth passage sentence.
53
The third step was to randomly select one type of test sentence (original, paraphrase, meaning change, or distractor) to appear in the test. The restriction on this procedure was that the total test form had to have an equal proportion of each of the test sentence types (i.e., 3 original, 3 paraphrase, 3 meaning change, and 3 distractor). These three steps completed the construction of Form 1.
Forms 2, 3, and 4 were based on Form 1. The order of test sentences on Forms 2, 3, and 4 was the same as on Form 1. Therefore, the 1st, 2nd, 3rd, .... 12th test sentences on all the test forms assessed the same sentence from the passage. If the first test sentence on Form 1 was derived from the fourth passage sentence, the first test sentence on Forms 2, 3, and 4 also assessed the fourth passage sentence. However, the type of test question varied from one form to another. The example below describes how types of test sentences were chosen for Forms 2, 3, and 4. If a distractor sentence was the first test sentence on Form 1, the first test sentence of Form 2 was randomly selected from one of the remaining sentence types: original, paraphrase, or meaning change. If a meaning change sentence was selected for Form 2, the first test sentence on Form 3 was randomly selected from the remaining set of test sentences. If the paraphrase sentence was selected for Form 3, the first test sentence on Form 4 would be an original test sentence. All 12 test sentences on each of the test forms were selected in the same manner. By following this procedure, each sentence of the original passage was assessed with each type of test sentence across the four test forms.
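The three construction steps described above can be sketched in a few lines of Python. This is only an illustration, not the author's actual procedure: the function name `build_forms`, the seed argument, and the use of Python's `random` module are assumptions introduced here. The sketch enforces the equal-proportion restriction on Form 1 only, since that is the only form for which the text states it.

```python
import random

SENTENCE_TYPES = ["original", "paraphrase", "meaning change", "distractor"]

def build_forms(n_sentences=12, seed=None):
    """Return {form_number: [(passage_sentence, sentence_type), ...]}."""
    rng = random.Random(seed)
    half = n_sentences // 2

    # Steps 1 and 2: test the first half of the passage first, the second
    # half last, and randomize the order within each half.
    first_half = list(range(1, half + 1))
    second_half = list(range(half + 1, n_sentences + 1))
    rng.shuffle(first_half)
    rng.shuffle(second_half)
    order = first_half + second_half

    # Step 3: Form 1 gets an equal number of each sentence type.
    form1_types = SENTENCE_TYPES * (n_sentences // len(SENTENCE_TYPES))
    rng.shuffle(form1_types)

    forms = {1: [], 2: [], 3: [], 4: []}
    for passage_sentence, form1_type in zip(order, form1_types):
        # Forms 2-4 draw, without replacement, from the types not yet used
        # for this passage sentence.
        remaining = [t for t in SENTENCE_TYPES if t != form1_type]
        rng.shuffle(remaining)
        for form_number, sentence_type in zip((1, 2, 3, 4), [form1_type] + remaining):
            forms[form_number].append((passage_sentence, sentence_type))
    return forms
```

With this scheme every passage sentence is tested once with each sentence type across the four forms, and all four forms share the same order of passage sentences.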
Another aspect of the materials which is important is the type of answer sheet. After the subject read a passage, they would respond to two answer sheets. The first was a recall answer sheet where the subject wrote his/her best recollection of the passage. The second answer sheet had twelve old or new responses and was used for the sentence verification task. Next to each response was a five point confidence scale to record the degree of confidence the subject had in each old or new response.
Procedure. The subjects were tested early in the semester and late in the semester. The early semester test session was during the 5th week of the course. The late semester test session was during the 12th week of the course. Subjects read the passages and responded to the free recall tasks and SVT tests as a group. The testing was done in the same lecture room, and at the same time, as their regular course lecture.
After the subjects were seated, one test envelope was distributed to each student. Subjects were instructed to write their mother's maiden name on the test envelope. This was done for two reasons: 1) to assure the student that performance on the reading test would not affect their course grade, and 2) so that two different passages could be conveniently distributed to the subjects at the late semester test session.
The test envelopes had been previously arranged so that there was a balanced distribution of passages and test forms. The test envelopes were stacked so that every other envelope had different passages and SVT tests. For example, the first envelope contained passages 1 & 4 with SVT Form 1, the second envelope had passages 2 & 3 with SVT Form 1. The third and fourth envelopes repeated the alternating passages, but used SVT test Form 2. This pattern was repeated through the eighth envelope. The ninth envelope would start the sequence again with passage 1 & 4, SVT Form 1.
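The stacking rule for the envelopes amounts to an eight-envelope cycle. The sketch below is one way to generate it; the name `envelope_sequence` is hypothetical, and the assumption that the fifth through eighth envelopes used Forms 3 and 4 is inferred from the statement that the pattern repeats through the eighth envelope.

```python
from itertools import cycle, islice

PASSAGE_PAIRS = [(1, 4), (2, 3)]   # passage pairs named in the text
SVT_FORMS = [1, 2, 3, 4]           # assumed: one form per two envelopes

def envelope_sequence(n_envelopes):
    """Return [(passage_pair, svt_form), ...] for a stack of envelopes."""
    # One full cycle: both passage pairs with Form 1, then both with Form 2, ...
    pattern = [(pair, form) for form in SVT_FORMS for pair in PASSAGE_PAIRS]
    return list(islice(cycle(pattern), n_envelopes))
```

Calling `envelope_sequence(9)` yields passages 1 & 4 with Form 1 in the first and ninth positions, matching the description above.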
The order of materials within each envelope was: 1) directions, 2) first passage, 3) recall answer sheet, 4) SVT for the first passage, 5) second passage, 6) recall answer sheet, 7) SVT for the second passage.
After the subjects read the directions, the experimenter explained how to complete the tests. He stressed that the passages could not be reread, and demonstrated on a blackboard at the front of the room how to respond to the four types of SVT test sentences. After questions were answered, he instructed the subjects to start the test. While the subjects were taking the test, the experimenter watched for evidence that the subjects were rereading the passages. He did not witness any rereading of the passages.
Most of the subjects completed the test within the hour. The subjects placed all the materials back in the envelopes and returned the envelopes to the experimenter. Subjects left the room after they returned the envelopes. Before the second test session, the experimenter removed the test contents from each envelope, and replaced the contents with two different psychology and non psychology passages and tests.
The late semester test session was conducted in essentially the same manner as the early semester test session. The two differences from the early semester session were: 1) subjects selected the envelopes with their mother's maiden name, and 2) after completing the test, each subject received an "experimental credit form". This form was given to their course instructor. Each subject received course credit which was added to their semester average in the laboratory course.
RESULTS and DISCUSSION
The data was analysed to determine whether the following hypotheses were supported. The first hypothesis was that there should be improvement in psychology text SVT scores between the early and the late test. This improvement in psychology SVT scores should not be matched by improvement on the SVT scores on the non psychology text. The second hypothesis was there should be a positive correlation between SVT scores and recall scores.
Three dependent variables were analysed assessing sentence verification test performance. The first variable was the proportion of correct test responses. The second variable combined SVT performance with confidence estimates for each test sentence. The third variable was SVT scores converted to d' scores according to the procedure of signal detection analysis (Swets, Tanner, & Birdsall, 1961; Banks, 1970).
Proportion Correct. Mean psychology passage proportion correct scores improved relative to non psychology passages from the early test session to the late test session. Mean psychology scores increased from .750 to .777 whereas non psychology means decreased from .747 to .726. Table 10 indicates mean and difference scores for both individual passages and type of passage at the two test sessions. Panel A of Figure 1 shows the proportion correct SVT scores by passage content at early and late administrations.
Performance was higher on psychology passages (.763) compared to non psychology passages (.737).
Performance varied with the type of test sentence, with the highest mean performance on distractor (.857) and original test sentences (.839). Meaning change mean performance was .693, followed by mean performance on paraphrase test sentences (.610). As noted in Table 11, performance improved from the early test session to the late test session on distractor and meaning change test sentences, and performance declined for paraphrase test sentences. The mean performance remained stable for original test sentences across the two test times.
Performance on the sentence types interacted with other factors in the experiment, such as the pair of passages and the order of the test set. The mean performance data for these interactions is shown in Tables 12 and 13.
A hierarchical repeated measure analysis of variance was utilized to test for the Content type (psychology or non psychology passage) X Time of test session interaction. The analysis used the following factors: 2 (Content type) X 2 (Time of test administration) X 2 (Pairs of tests) X 4 (Question type on the SVT) X 2 (Groups). Time consisted of early or late test sessions. Pairs of tests was a psychology and non psychology tests given to the same subject at a particular test session. Groups consisted of the order in which subjects received either pair of passages. Question type is nested within type of passage, and paired tests are nested within time of administration. Group was a between subject factor.
The Content type X Time of administration interaction was marginally significant at the p=.05 level, (F (1, 78)=3.928, while the critical F at p=.05 = 3.976).
With this analysis of variance design, there were many sources of variance. The sources of variance which were less relevant to the experimental hypothesis, but which were statistically significant were as follows: 1) Content of passage, 2) Question type, 3) Question type X Time of test, 4) Question type X Pairs of passage, 5) Question type X Group, 6) Question type X Content X Group, 7) Question Type X Content X Pairs. The F value of each effect and the appropriate level of significance are reported with the complete analysis of variance table listed in Table 14.
Combination Variable. Following each SVT response, subjects rated their degree of confidence on a 5 point scale. (A mark of "5" indicated that subjects were "very sure" of their old/new response, while "1" indicated they were "not at all sure" of the response.) The second dependent variable assessing SVT performance was the product of the confidence rating and the SVT performance. (Subjects received a "1" for a correct response and "-1" for an incorrect response to a SVT test sentence.) The possible range of scores for this combination variable was from -5 to +5 with no zero point.
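The combination variable is simple enough to state as code. The following is a minimal sketch of the scoring rule as described in the text; the function names `combined_score` and `mean_combined` are hypothetical labels introduced for illustration.

```python
def combined_score(correct, confidence):
    """Signed confidence: +rating for a correct response, -rating otherwise."""
    if confidence not in (1, 2, 3, 4, 5):
        raise ValueError("confidence must be an integer from 1 to 5")
    # Correct responses count as +1, incorrect as -1, multiplied by the rating,
    # so the result ranges from -5 to +5 and can never be zero.
    return confidence if correct else -confidence

def mean_combined(responses):
    """Mean combined score over (correct, confidence) pairs."""
    scores = [combined_score(correct, conf) for correct, conf in responses]
    return sum(scores) / len(scores)
```

For example, a subject who is "very sure" of a wrong answer scores -5 on that sentence, which is why confident errors lower this variable more than hesitant ones.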
As shown in Table 15, mean combined variable performance increased for psychology passages, from a mean of 2.20 at the early test session to a mean of 2.46 at the late test session. Mean performance declined for non psychology passages from a mean of 2.19 at the early test session to a mean of 2.03 at the late test session. Panel B of Figure 1 displays the pattern of the combined variable data.
The mean combined variable score for psychology passages (2.33) was greater than the mean combined variable score for non psychology passages (2.11). Mean performance was highest on original test sentences (3.10) and distractor test sentences (3.02). Meaning change test sentences had a mean score of 1.76, while paraphrase test sentences had a mean score of 1.00.
The mean performance on the sentence types varied with the time of test session. Table 16 shows that mean scores increased between the early and late test sessions for meaning change and distractor test sentences, while mean scores declined for original and paraphrase test sentences. Additional interactions between sentence type and other factors are shown in Tables 17 and 18.
The combination variable was analysed using the same hierarchical analysis as the proportion correct variable. The interaction of Content type X Time of test session was not significant at the p<.05 level. Several other effects were statistically significant, even though they are less relevant to the experimental hypotheses. The following sources of variance were statistically significant: 1) Question type, 2) Question type X Time, 3) Question Type X Group, 4) Question type X Pairs of passages, 5) Question type X Content X Pairs of passages. The F values and levels of statistical significance are listed with the complete source of variance table in Table 19.
Table 20 shows the mean confidence ratings per test sentence for correct or incorrect SVT responses within psychology and non psychology tests at the two test times. Confidence ratings increased from early to late test sessions for both psychology and non psychology passages. However, confidence ratings increased more for incorrect sentences than they did for correct sentences on both psychology and non psychology tests. There is no obvious explanation for why subjects would become more confident in their responses which were scored incorrect compared to confidence in responses which were scored correct.
d' scores. The d' scores improved for both psychology and non psychology passages from the early semester test to the late semester test. Table 21 reports the mean d' scores summed across the two passages within each content area for the two test sessions. Panel C of Figure 1 presents a graphic pattern of the d' scores.
The analysis of d' scores required a different analysis of variance design from the design used with the proportion correct and combination variables. The d' scores are based upon the distribution of correct responses to the test sentences, whereas the proportion correct variable and the combination SVT and confidence rating variable are based upon each test sentence from the SVT. The d' analysis by necessity could not have a sentence type factor. The restriction of 1 d' score per SVT for the d' variable required an ANOVA design with fewer factors than the ANOVA design used with the proportion correct variable. The d' variable was analysed with a 2 (Content of passage) X 2 (Time of test session) analysis of variance. The interaction of Content type X Time of test was not significant at the p<.05 level. The complete sources of variance table is listed in Table 22.
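For readers unfamiliar with signal detection analysis, the sketch below shows the textbook yes/no computation of d' for a single 12-sentence SVT. It is not taken from the thesis: the treatment of original and paraphrase sentences as "old" items and of meaning change and distractor sentences as "new" items, the function name `d_prime`, and the 0.5 correction for perfect hit or false alarm rates are all assumptions made for illustration. The thesis may have derived d' differently, for example from the confidence ratings.

```python
from statistics import NormalDist

def d_prime(hits, n_old, false_alarms, n_new):
    """d' = z(hit rate) - z(false alarm rate) for one test.

    hits:         "old" responses to old items (assumed: originals, paraphrases)
    false_alarms: "old" responses to new items (assumed: meaning changes, distractors)
    """
    # Assumed correction (not from the source): add 0.5 to each count so that
    # rates of exactly 0 or 1 do not produce infinite z scores.
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

Because this yields one score per test rather than one per sentence, it illustrates why the d' analysis could not include a sentence type factor.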
Free Recall. The dependent variable assessing free recall was the proportion of correct idea units in the subject's recall protocol. Idea units were intuitively derived for each passage. The idea units for one passage are listed in Table 23.
The mean proportion of correct idea units remained stable for non psychology passages (.33) but decreased for the psychology passages from .36 at the early test to .32 at the late test session. Table 24 reports the mean recall scores for psychology and non psychology passages at the early and late test sessions.
The recall variable was analysed with the same ANOVA design as the d' variable. This analysis found a significant Time X Content of passage interaction, F(1,81)=4.46, p<.05. No other effects were significant. The complete analysis of variance table is listed in Table 25.
The recall protocols were initially analysed by the experimenter for the proportion of correct idea units. In order to check the reliability of idea unit scoring, an independent evaluator rescored 112 free recall passages. The overall correlation between the two scorers was r=.84 (p<.001).
The second major hypothesis of interest predicted a positive correlation between SVT performance and free recall performance. Correlations between proportion correct SVT, d' scores and recall scores indicated significant relationships (p<.01) for both combinations. Table 26 reports these correlations. The overall correlation between SVT proportion correct scores and recall scores was r=.37, (p<.001). This correlation is based upon 328 observations (82 subjects responding to 4 passages).
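The correlations reported above are presumably ordinary product-moment correlations; the text does not name the coefficient, so that is an assumption. A minimal sketch of the computation over paired SVT and recall scores follows, with `pearson_r` as a hypothetical function name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In the analysis described above, each of the 328 observations would contribute one pair: a subject's proportion correct on the SVT for a passage and that subject's proportion of idea units recalled for the same passage.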
Conditional Probability. The subjects free recalled the passages prior to responding to the SVT. Conditional probabilities were calculated to assess the degree of relationship between recalling particular information and the assessment of that information on the SVT. In order to calculate the conditional probability of responding to the SVT given recall of idea units, 64 subjects' tests were randomly selected for analysis. This selection allowed for 32 subjects from the early test session and 32 subjects from the later test session. Within the early or late test session, there was an equal proportion of 8 tests for each passage. Pooling across the two test times, the analysis of conditional probabilities was based upon 16 subjects per passage.
In most cases the number of idea units in a sentence was greater than one. A subject did not receive credit for recalling an entire sentence if they recalled less than 50% of the idea units in that sentence. If they recalled 50% or more idea units in a sentence, they were credited with correct recall of the sentence.
Table 27 lists the conditional probabilities of SVT performance by content type given previous recall performance. As can be seen, the likelihood for correct performance on the SVT is greater if the subject had correctly recalled at least 50% of the idea units in the sentence. The pattern of the conditional probability data is reasonable.
The overall mean proportion correct SVT score (.757) may be considered in examining the conditional probability data. The probability of correctly responding to the SVT task once the subject had correctly recalled at least 50% of the idea units was larger (.81) than the overall mean performance on the SVT. However, if the subjects did not recall at least 50% of the idea units of the sentence, they had a probability of .72 that they would respond correctly to the SVT test sentence. This conditional probability is lower than the overall mean performance on the SVT.
This relationship is reasonable because it is what one would expect if subjects either comprehended or failed to comprehend the passages. Those subjects who comprehend the text well should receive higher scores on both the recall task and the SVT compared to overall mean performance on the SVT. Following the same reasoning, those subjects who do not comprehend the text as well should receive lower scores on both the recall task and the SVT test compared to overall mean performance on the SVT.
The conditional probabilities reported here were based upon a criterion of less than perfect recall (i.e., 50% of the idea units in a sentence). If a more stringent criterion had been adopted (i.e., 75% of the idea units in a sentence), the pattern reported above may have been quite different.
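The conditional probability analysis, including the effect of changing the recall criterion, can be sketched as follows. The record layout (idea units recalled, idea units in the sentence, whether the SVT response was correct) and the function names are assumptions made for illustration; the thesis does not describe how the data were tabulated.

```python
def sentence_recalled(units_recalled, units_total, criterion=0.5):
    """Credit a sentence as recalled if the criterion proportion of its idea units was recalled."""
    return units_recalled / units_total >= criterion

def conditional_svt_accuracy(records, criterion=0.5):
    """Return (P(correct | recalled), P(correct | not recalled)).

    records: iterable of (units_recalled, units_total, svt_correct) tuples,
             one per subject-by-sentence observation (hypothetical layout).
    A probability is None when its conditioning group is empty.
    """
    recalled, not_recalled = [], []
    for units_recalled, units_total, svt_correct in records:
        if sentence_recalled(units_recalled, units_total, criterion):
            recalled.append(svt_correct)
        else:
            not_recalled.append(svt_correct)

    def proportion(group):
        return sum(group) / len(group) if group else None

    return proportion(recalled), proportion(not_recalled)
```

Rerunning the same records with `criterion=0.75` shows how a more stringent definition of recall moves observations between the two groups, which is the sensitivity noted in the paragraph above.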
FINAL DISCUSSION
This section is organized into two general areas of discussion corresponding to the two experimental hypotheses. The first section considers performance on the SVT tests. The second section discusses the recall data and the correlations between recall and SVT performance.
Pattern of SVT scores. The first hypothesis was there should be improvement in the SVT scores for the psychology passages between the early and late test sessions which was not matched by change in the SVT scores for the non psychology passages. All three dependent variables assessing SVT performance present a pattern which supports the hypothesis. Mean accuracy on the proportion correct variable clearly indicated interaction in the predicted direction, but only neared significance. The mean scores of the other two dependent variables, d' scores (Table 21) and combined SVT and confidence rating scores (Table 15), showed greater differences between psychology and non psychology passage comprehension at the late test session compared to the early test session.
The mean d' scores and the mean combined SVT and confidence rating scores improved more for the psychology passages than the non psychology passages. Figure 1 illustrates the greater differences at the late test session compared to the early test session for all three variables.
Although the SVT performance was in the direction predicted by the hypothesis, the interaction was only marginally significant when proportion correct was the dependent variable. There was no significant Content type X Time of test interaction for either the d' variable or the combined SVT and confidence rating variable. The failure to find a statistically significant interaction may be due to several factors. One of the most plausible will be discussed below.
One of the issues raised earlier in this chapter was that reading comprehension is aided by reader's schemata of text organization (cf. Mandler & Johnson, 1977; Stein & Glenn, 1979). The students in the psychology Methods class may very well have acquired a schematic knowledge of the organization of psychology articles. The laboratory experiences and written assignments required them to learn the format of major sections of psychology articles: introduction, method, results, and discussion. The APA format was repeatedly stressed throughout the course.
However, the psychology passages which were used in this experiment did not strictly conform to this APA format. The passages did not have headings indicating different research journal sections. Neither did the passages necessarily follow the APA sequence of introduction, method, results, and discussion sections. It is conceivable that the subject's performance would have improved if there were schematic markers in the passage corresponding to the APA format.
Another related factor concerns the type and amount of "psychology knowledge" acquired by the subjects. The psychology passages in this experiment represent research areas which were not in the Methods course curriculum or laboratory experiences. The two psychology passages concerned "identification of first born children as superior students" and "teaching young children bilingual abilities", whereas their Methods course dealt with "operant conditioning", "transfer of learning tasks", a "project utilizing correlational analysis" and an "independent project". Because the experiment psychology passages did not match lab content, the students may not have gained enough functional knowledge from their course which was relevant to the experimental test.
Free Recall and SVT performance. The second hypothesis of this experiment was there should be a positive correlation between recall scores and SVT scores. Consistent positive correlations were found between recall performance and both proportion correct and d' scores for both psychology and non psychology passages (Table 26).
Free recall performance is a commonly accepted method of measuring the subject's memory of text. Kintsch and his colleagues (Kintsch, 1979; Kintsch & van Dijk, 1978; Vipond, 1980) utilize free recall protocols in assessing reading comprehension. Further evidence supporting the relationship between free recall and reading comprehension was found by Bransford and Johnson (1972). They found reader's subjective estimates of their level of reading comprehension after reading a passage corresponded to their level of recall performance. The positive correlations between free recall scores and SVT scores contribute evidence to the validity of the SVT as a method of measuring reading comprehension.
There was a significant Time of test session X Content of passage interaction with recall scores. Mean recall performance for non psychology passages remained stable between the early test session and the late test session, but the mean recall scores declined for psychology passages (Table 24). There is no clearly evident explanation for this pattern of results. However, one explanation which might account for the pattern is subjects may have been less motivated to write complete psychology protocols at the late test session than they were at the early test session. The students may have been more highly motivated to complete the task at the early test session for two reasons. First, they might be motivated if the task was a novel experience. And second, some students may have believed performance on the tasks would affect their course grade. By the late test session the experiment may have been less intriguing and students may have decided their course grades would not be affected by their performance in the experiment.
CONCLUDING REMARKS
This chapter of the thesis has presented evidence which contributes to the conclusion that the sentence verification technique is a valid method of measuring reading comprehension. The experiment presented here is important because it complements previous research which supported the construct validity of the SVT. At this date, experimental research has demonstrated that the SVT is sensitive to 1) reading skill and increased level of text difficulty (Royer, Hastings, & Hook, 1980), 2) differences in reading comprehension of readers with different degrees of subject matter expertise when they read passages within or outside their presumed knowledge area (Royer, Lynch, & Bulgarelli, Note 2). The experiment in this chapter has demonstrated two more points which suggest the SVT is a valid method of measuring reading comprehension: 1) The SVT is sensitive to changes in reading comprehension in subjects who increase their knowledge that applies to text of a particular subject matter. This increase in knowledge is presumably not matched with an increase in general intellectual ability. 2) Responses to the SVT are positively correlated with free recall of the same text. These studies are building an increasingly stronger argument that the SVT is a valid method of measuring reading comprehension.
TABLES AND FIGURES
TABLE
M
A
p™p
tS
EITHER St
WITH
:
1
COMPREHENSION TEST PERFORMANCE
PASSAGES OR WITHOUT PASSAGES a
With
Passages
Name of Test
Without
Passage
Chance b
Nelson Reading
45. 96
29. 36
18. 75
California Achievement Test
26. 66
14. 36
10. 10
SRA Achievement Test
37. 17
22. 17
15. 00
29. 54
22. 27
11. 25
Metropolitan Achievement Test
(Intermediate)
28. 82
20. 27
11. 25
Iowa Test Basic Skills
27. 05
19. 29
11. 50
Metropolitan Achievement Test
(Elementary)
'Cited in Tuiman
1
(1973-1974).
'Mean test score estimated on the basis of chance.
Chance
is defined as n/4 where n is the number of test items.
This test contained a few
*
5
choice test items.
c
TABLE 2
STRUCTURAL VARIABLES USED BY DUNN ET AL. TO PREDICT TEST PERFORMANCE

Passage Components
    Percent Content Words
    Unique Information
    Percent Content-Function Words
    Average Sentence Length
Stem Components
    Percent Content Words
    Percent New Content Words
    Percent Non-Dale-Chall Words
    Percent Content Function Words
Correct Choice Components
    Percent Content Words
    Percent New Content Words
    Percent Non-Dale-Chall Words
    External Information
Incorrect Choice Components
    Percent Content Words
    Percent New Content Words
    Percent Non-Dale-Chall Words
    Plausibility
TABLE
3
THE PREDICTION OF READING OR
APTITUDE TEST
PERFORMANCE FROM EARLIER TESTS a
Variable Being Predicted
7th Grade
Otis
Added Predictor
Gr.
Gr.
Gr.
Gr.
Gr.
Gr.
Gr.
1:
2:
3:
4:
5:
6:
7:
California Mental
Maturity
8th Grade
Reading
Increase
.
375
.
624
.
R
Increase
.
334
.251
.
522
.
188
700
.076
.
616
.
094
.
784
.
084
.
708
.
092
Otis Beta
.
823
.039
.
726
.
018
Stanford Reading
.
843
.
020
.
778
.
052
.
808
.
030
Metropolitan
Reading
Otis Alpha
Metropolitan
Reading
Otis Beta
Cited in Thorndike (1973-1974)
79
TABLE
4
TEST PERFORMANCE AS A FUNCTION
OF TYPE OF
TEST QUESTION AND DELETED TEXT
Type of Test Question
Type of Text
Complete
Cloze a
Multiple Choice b
26. 33
25.17
30% reduced
5.98
21.90
50% reduced
6.64
19. 33
Tuiman, Blanton, Gray (1975)
b-
Tuiman, Gray (1972)
*
80
TABLE
5
WIDE RANGE READING TEST AND
METROPOLITAN READING
FROM THE DIRECT INSTRUCTION PROJECT TEST SCORES
Grade Level
Pre K
Wide Range Test
Metropolitan Test
Test scores are percentiles.
18
Post
83
40
3
81
TABLE
6
SAMPLE OF TEST SENTENCES USED BY
SACHS a
BaSG:
Immoral
Semantic:
Lexical:
fatheTS consider ed owning slaves
to b
The founding fathers didn't
consider owningg
slaves to be immoral.
Passive/Active:
Formal:
1112
Owning slaves was considered to be
immoral
by the founding fathers.
The founding fathers considered owning
slaves
immoral.
The founding fathers thought owning
slaves to be
immoral.
Cited in Sachs (1974)
TABLE 7
SAMPLE OF TEST SENTENCES USED IN THE SVT

Original:        Then suddenly, one windy, cold day, the bright leaves tumble to the ground in a golden shower.
Paraphrase:      Then abruptly, on some gusty, brisk day, the leaves fall from the trees like a colorful rain.
Meaning Change:  Then suddenly, one windy, cold day, the dead branches tumble to the ground in a dangerous shower.
Distractor:      Jerry collects the brightest, most colorful leaves for his mother to use in her Fall decorations.
TABLE
8
MEAN PROPORTION CORRECT SVT
SCORES BY CONTENT TYPE
AND EXPERTISE OF SUBJECTS 3
Group
Non-Psychology
Content Type
Maj or
Undergrads
Psychology
Non-Psychology
Cited in Royer, Lynch
Advanced
Psychology
Undergrads
Psychology
Graduate
Students
.78
.81
.88
75
.80
.85
.
$
Bulgarelli (Note
2)
84
TABLE
9
PAIRS OF TESTS ADMINISTERED
THROUGHOUT THE SEMESTER
Time of Test Session
Group
Early
A
B
Late
1
§
4
2
§
3
2
§
3
1
$
4
re£lec Psychology tests, while tests
J
reflect non-psychology
tests.
2
4
!
3
§
85
TABLE 10
MEAN PROPORTION CORRECT SVT SCORES BY PASSAGE
AND TIME OF ADMINISTRATION
Time of Administration
Passage Type
Early
Non-Psychology
Note
Cell means are based upon 41 subjects
Late
Figure 1. (A) Mean SVT Proportion Correct, (B) Combined SVT and Confidence Rating, and (C) d' Scores by Content Type and Time of Test Session.
[Figure: three panels, each plotting Psychology and Non-Psychology passages against Time of Test Session (Early, Late). Panel A: Mean SVT Proportion Correct Scores. Panel B: Mean Combined SVT and Confidence Rating. Panel C: Mean d' Scores.]
TABLE 11
MEAN PROPORTION CORRECT SVT SCORES BY QUESTION TYPE
AND TIME OF ADMINISTRATION

                      Time of Administration
Question Type         Early       Late
Original              .839        .840
Paraphrase            .650        .571
Meaning Change        .663        .724
Distractor            .843        .873

Note. All cells are based upon 164 responses.
TABLE 12
MEAN PROPORTION CORRECT SVT
SCORES BY QUESTION TYPE
CONTENT OF PASSAGE AND GROUP
Group
l
a
'
Group
2
b
Psychology
Original
Paraphrase
Meaning Change
Dis tractor
847
701
734
846
.844
.579
680
.874
.809
.680
.651
863
.855
.479
709
.847
.
.
.
.
.
Non-Psychology
Original
Paraphrase
Meaning Change
Dis tractor
.
.
Group 1 received passages 1 § 4 in the early test session
and passages 2 § 3 in the late test session.
i
Group 2 received passages 2 § 3 in the early test session
and passages 1 5 4 in the late test session.
Note.
All cells are based upon 82 responses.
TABLE 13
MEAN PROPORTION CORRECT SVT SCORES
BY QUESTION TYPE
CONTENT OF PASSAGE, AND PASSAGE PAIR
Pair
1
Pair
Psychology
Original
Paraphrase
Meaning Change
Distractor
840
596
684
853
.851
.687
.730
867
.
Non- Psychology
Original
Paraphrase
Meaning Change
Distractor
Passages
1
§
4.
Passages
2
§
3.
Note.
826
517
746
859
All cells are based upon 82 responses
838
.643
.613
.850
.
'
2
91
TABLE 14
ANALYSIS OF VARIANCE TABLE OF PROPORTION
CORRECT SVT PERFORMANCE
Mean Square
Time (T)
Subjects (S)
Group (G)
S:G
Pairs (P)
Error a
Contents
79
1
78
1
78
(C)
1
C x T
C x S
C x G
1
79
1
CS:G
Error b
Question Type
Q x T
Q x S
Q x G
QS:G
Error
78
78
(Q)
3
3
237
3
c
Q x C
Q x C x T
Q x C x S
Q x C x G
234
3
3
237
3
117
1.939
.
3.
720
1.
153
.26450
.20910
.06460
00226
.06275
05323
4.969
3.928
1.214
4.44272
26046
08376
.55146
.07776
.29332
.05974
74. 368
.05607
06820
05947
.17203
05803
19810
05885
.953
.
.05
.042
1.179
.
.
3
.
.
.
234
Q x P
.00957
15898
30505
15712
.09453
.08201
1
.
.
QCS:G
234
Q x C x P
Error d
3
.
234
.
.
360
402
231
1.302
.001
4.
1.
9.
.001
4.910
.01
1.159
1.011
2. 923
.986
3.
366
.01
.05
.05
TABLE 15
MEAN COMBINED SVT AND CONFIDENCE RATING BY CONTENT TYPE
AND TIME OF ADMINISTRATION

                      Time of Administration
Content Type          Early       Late
Psychology            2.20        2.46
Non-Psychology        2.19        2.03

Note. All cells are based on 82 subjects.
TABLE 16
MEAN COMBINED SVT AND CONFIDENCE RATING BY QUESTION TYPE
AND TIME OF ADMINISTRATION
Time of Administration
Question Type
Paraphrase
Note_.
All cells are based upon 164 responses.
TABLE 17
MEAN COMBINED SVT AND CONFIDENCE RATING
BY QUESTION TYPE
CONTENT OF PASSAGE, AND GROUP
Content Type
Psychology
Original
Paraphrase
Meaning Change
Distractor
3.18
1.74
2.10
3.02
3.01
2.97
3.23
.
75
1.69
3.18
Non-Psychology
Original
Paraphrase
Meaning Change
Distractor
1
.64
1.46
2.99
-
.13
1.83
2.89
Group 1 received passages 1 § 4 in the early test session
and passages 2 § 3 in the late test session.
Group 2 received passages 2 § 3 in the early test session
and passages 1 § 4 in the late test session.
Note.
All cells are based upon 82 responses.
95
TABLE 18
MEAN COMBINED SVT AND CONFIDENCE
RATING BY QUESTION TYPE
CONTENT OF PASSAGE, AND PASSAGE PAIR
Content Type
Pair
1
Pair
Psychology
Original
Paraphrase
Meaning Change
Distractor
3.04
2.93
3.15
1.60
2.07
3.26
3.06
3. 14
.14
2.29
1.37
1.00
3. 14
2
.90
1. 72
Non- Psychology
Original
Paraphrase
Meaning Change
Distractor
Passages
1
§
4
Passages
2
$
3
Note_.
All cells are based upon 82 responses
.
74
2
'
96
TABLE 19
ANALYSIS OF VARIANCE TABLE OF
COMBINED
SVT AND CONFIDENCE RATING
Mean Square
Time (T)
Subjects (S)
Group (G)
S:G
Pairs (P)
Error a
Contents
1
79
78
1
78
1
79
Q x T
Q x S
Q x G
QS:G
Q x C
Q x C x T
Q x C x S
Q x C x G
.
6.43
3. 43
1
.
.
.
.
75
62
46
01
47
330. 60247
82. 27
3
25.91850
6. 50292
39. 28163
6. 08268
29. 05354
4. 01849
6.45
1.63
237
3
3
c
.53
6 59
19. 03
3
234
Q x P
Error
78
78
(Q)
24496
.12403
4. 29779
9. 16050
4.
1
CS:G
Error b
.
16. 02498
15. 63797
1
C x T
C x S
C x G
Question Type
.
1
(C)
.98235
12 17908
35 15889
11. 88447
6.33939
1. 18480
234
3
3
237
3
QCS:G
234
Q x C x P
Error d
3
234
3.42911
4.39724
4.14361
9. 88652
4.06998
17.38122
4. 08322
.001
.005
9. 78
1. 51
.001
7.23
.001
.84
1.08
1.01
2.42
1.00
4.26
.01
TABLE 20
MEAN CONFIDENCE RATINGS PER SVT TEST SENTENCE
FOR CORRECT AND INCORRECT SVT RESPONSES BY CONTENT TYPE
AND TIME OF TEST SESSION
Time of Test Session
Content Type
Early
Late
4.23
3.77
4.29
3.95
Psychology
Non-Psychology
Correct
Incorrect
TABLE 21
MEAN D PRIME SCORES BY CONTENT TYPE
AND TIME OF ADMINISTRATION

                      Time of Administration
Content Type          Early       Late
Psychology            1.56        1.71
Non-Psychology        1.56        1.69

Note. All cells are based on 82 subjects.
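
The d' scores in Table 21 are signal detection measures (see Banks, 1970; Swets, Tanner, & Birdsall, 1961, in the Bibliography). As a minimal sketch, assuming the standard equal-variance formula d' = z(hit rate) - z(false alarm rate), the computation might look as follows. The function name, the mapping of SVT responses onto hits and false alarms, and the example rates are illustrative assumptions, not values or procedures taken from this study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal detection d': z(hits) - z(false alarms).

    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 need a correction before this function is called.
    """
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("rates must be strictly between 0 and 1")
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical subject: 80% hits and 20% false alarms (assumed numbers).
example = d_prime(0.80, 0.20)
```

Equal hit and false alarm rates give a d' of zero (no discrimination), and larger values indicate better discrimination between sentence types.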
TABLE 22
ANALYSIS OF VARIANCE TABLE
d' SCORES ON SVT TESTS

Source                df      Mean Square
Time (T)               1         1.76
Content (C)            1         1.01
Subject X T           81         1.57
Subject X C           81         1.42
Time X Content         1          .44
Subject X T X C       81         1.07
IDEA UNITS FOR PSYCHOLOGY TEST #2
1.  Recently, an experimental method was developed for
    teaching young children bilingual abilities
2.  they were instructed in French (exclusively)
3.  they used English at home.
4.  There were two monolingual control groups
5.  each was instructed in their maternal language
6.  the control groups/experimental were matched
    by environment and socio-economics
7.  also matched for IQ
8.  Instruction was for 2 years
9.  They were tested for communication skills
10. Controls were tested in their maternal language
11. Experimentals were tested in both French and English
12. One experiment measured abilities as decoders of
    novel information
13. the other experiment measured encoding
14. Experimentals did as well as controls in both experiments
15. The apparent result of the experiment was that young
    children instructed exclusively in a foreign language
    could apply abilities developed mainly through teacher
    pupil interaction
16. (apply to) non-academic, peer to peer communications
17. there was no decrement of maternal language performance
18. This contradicts the idea that learning a second
    language will handicap the first
19. The handicap may become evident over time
20. as increasing complexity of the language requires more
    complex communication skills.
TABLE 24
MEAN RECALL SCORES BY CONTENT TYPE
AND TIME OF TEST SESSION

                      Time of Test Session
Type of Passage       Early       Late
Psychology            36          32
Non-Psychology        33          33
TABLE 25
ANALYSIS OF VARIANCE TABLE OF RECALL SCORES

Source                df      Mean Square      F        p
Time (T)               1        228.890       1.78     n.s.
Content (C)            1         36.890        .39     n.s.
Subject X T           81        128.841
Subject X C           81         95.606
Time X Content         1        487.805       4.46     .05
Subject X T X C       81        109.447
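
Each F ratio in a within-subjects analysis of this kind is the mean square for an effect divided by the mean square of its error term. As a check on the values in Table 25, the sketch below recomputes the three F ratios from the tabled mean squares. The pairing of each effect with its interaction with subjects as the error term is inferred from the table layout, and the variable names are illustrative.

```python
# Mean squares as reported in Table 25 (recall scores).
mean_squares = {
    "T": 228.890,      # Time
    "C": 36.890,       # Content
    "TxC": 487.805,    # Time X Content
    "SxT": 128.841,    # Subject X T
    "SxC": 95.606,     # Subject X C
    "SxTxC": 109.447,  # Subject X T X C
}

# Assumed error term for each effect: its interaction with subjects.
error_term = {"T": "SxT", "C": "SxC", "TxC": "SxTxC"}


def f_ratio(effect: str) -> float:
    """F = MS(effect) / MS(error term for that effect)."""
    return mean_squares[effect] / mean_squares[error_term[effect]]


f_ratios = {effect: round(f_ratio(effect), 2) for effect in error_term}
# f_ratios == {"T": 1.78, "C": 0.39, "TxC": 4.46}
```

The recomputed ratios agree with the tabled values of 1.78, .39, and 4.46, each evaluated on 1 and 81 degrees of freedom.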
TABLE 26
CORRELATIONS BETWEEN SVT AND RECALL PERFORMANCE BY TIME OF
ADMINISTRATION AND CONTENT TYPE FOR D PRIME AND
PROPORTION CORRECT SCORES
Time of Administration
Early
Variable
Psych.
12
32
13
14
(n s
.
Late
Non-Psych.
Psych.
.49
.43*
.46*
56
.38*
.33*
.
*p < .01
a
Free recall and Proportion Correct
Free recall and D Prime Scores
Note. All cells are based on 82 subjects.
Non-Psych
TABLE 27
CONDITIONAL PROBABILITIES OF SVT PERFORMANCE
BY CONTENT TYPE GIVEN RECALL PERFORMANCE
Content Type
Correct/Correct a
Correct/Incorrect b
a The probability of answering correctly on the SVT having
correctly recalled at least 50% of the idea units for that
sentence.
b The probability of answering correctly on the SVT having
failed to recall at least 50% of the idea units for that
sentence.
Note 1. The mean proportion correct performance on psychology
passages was .768. Mean proportion correct performance on
non-psychology passages was .744. The mean proportion of
correctly recalled idea units on psychology passages was .34.
The mean proportion of correctly recalled idea units on
non-psychology passages was .33.
Note 2. Each of the cells in the above table is based upon
32 subjects.
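
The two conditional probabilities defined in the notes to Table 27 could be computed from per-sentence records along the following lines. This is only a sketch: the record layout (an SVT correctness flag paired with the proportion of idea units recalled for that sentence), the function name, and the sample data are hypothetical. Only the 50% recall criterion comes from the table notes.

```python
def conditional_svt_accuracy(records, criterion=0.5):
    """Return (P(SVT correct | recalled), P(SVT correct | not recalled)).

    records: iterable of (svt_correct, proportion_recalled) pairs, one
    per test sentence. A sentence counts as recalled when at least
    `criterion` of its idea units were recalled.
    """
    records = list(records)
    recalled = [ok for ok, p in records if p >= criterion]
    not_recalled = [ok for ok, p in records if p < criterion]

    def proportion(flags):
        # None signals that no sentences fell into this category.
        return sum(flags) / len(flags) if flags else None

    return proportion(recalled), proportion(not_recalled)


# Hypothetical records, not data from this study.
sample = [(True, 0.75), (True, 0.50), (False, 0.60),
          (True, 0.25), (False, 0.00)]
p_given_recalled, p_given_not_recalled = conditional_svt_accuracy(sample)
```

Comparing the two probabilities shows how far SVT accuracy depends on whether the sentence's content could also be freely recalled.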
BIBLIOGRAPHY
Anderson, R.C. Schema-directed processes in language comprehension.
    Center for the Study of Reading, Technical Report 50,
    University of Illinois, 1977.
Banks, W.P. Signal detection theory and human memory.
    Psychological Bulletin, 1970, 74 (2), 81-99.
Becker, W. Teaching reading and language to the disadvantaged:
    What we've learned from field research.
    Harvard Educational Review, 47, 4, November 1977, 518-543.
Becker, W.C., Carnine, D.W. Direct instruction: A behavior
    theory model for comprehensive educational intervention
    with the disadvantaged. Paper presented at the Eighth
    Symposium on Behavior Modification, Caracas, Venezuela,
    February 1978.
Bransford, J.D., Franks, J.J. The abstraction of linguistic
    ideas. Cognitive Psychology, 1971, 2, 331-350.
Bransford, J.D., Johnson, M.K. Contextual prerequisites for
    understanding: Some investigations of comprehension
    and recall. Journal of Verbal Learning and Verbal
    Behavior, 1972, 11, 717-726.
Carroll, J.B. Defining language comprehension: Some speculations.
    In J.B. Carroll and R.O. Freedle (Eds.),
    Language comprehension and the acquisition of knowledge.
    Washington, D.C.: Winston & Sons, 1972.
Dooling, D.J., Lachman, R. Effects of comprehension on
    retention of prose. Journal of Experimental Psychology,
    1971, 88, 216-222.
Drum, P.A., Calfee, R.C., Cook, L.K. The effects of surface
    structure variables on performance in reading comprehension
    tests. Reading Research Quarterly, 1981, 16 (4), 486-514.
Elias, M.F., Elias, P.K., Elias, J.W. Basic Processes in
    Adult Developmental Psychology. St. Louis: C.V. Mosby, 1977.
Guice, B.M. The use of the cloze procedure for improving
    reading comprehension of college students. Journal of
    Reading Behavior, 1969, 1 (3), 81-92.
Harootunian, B. Intellectual abilities and reading achievement.
    The Elementary School Journal, 1966, 66, 386.
Holmes, J.A., Singer, H. The substrata-factor theory:
    Substrata factor differences underlying reading ability in
    known groups at the high school level. U.S.O.E.
    Cooperative Research Project No. 538, S.A.E. 8176, 1961.
Kagan, J., Lang, C. Psychology and Education: An Introduction.
    New York: Harcourt Brace Jovanovich, 1978.
Kintsch, W. On modeling comprehension. Educational Psychologist.
Kintsch, W., van Dijk, T. Toward a model of text comprehension
    and production. Psychological Review, 1978, 85, 363-394.
Mandler, J.M., Johnson, N.S. Remembrance of things parsed:
    Story structure and recall. Cognitive Psychology, 1977.
Pearson, R.D., Johnson, D.D. Teaching Reading Comprehension.
    New York: Holt, Rinehart & Winston, 1978.
Popham, W.J. Criterion-Referenced Measurement. Englewood
    Cliffs, N.J.: Prentice-Hall, 1978.
Publication Manual of the American Psychological Association
    (2nd Edition). Washington, D.C.: American Psychological
    Association, 1974.
Royer, J.M., Cunningham, D.J. On the theory and measurement
    of reading comprehension. Contemporary Educational
    Psychology, 1981, 6, 187-216.
Royer, J.M., Hastings, N., & Hook, C. A sentence verification
    technique for measuring reading comprehension.
    Journal of Reading Behavior, 1979, 11, 355-363.
Sachs, J.S. Memory in reading and listening to discourse.
    Memory and Cognition, 1974, 2, 95-100.
Sachs, J.S. Recognition memory for syntactic and semantic
    aspects of connected discourse. Perception and
    Psychophysics, 1967, 2, 437-442.
Sassenrath, J.M. Alpha factor analysis of reading measures
    at the elementary, secondary, and college levels.
    Journal of Reading Behavior, 1972-1973, 5 (4), 304-316.
Shank, R., Abelson, R. Scripts, Plans, Goals, and
    Understandings. Hillsdale, N.J.: Erlbaum, 1977.
Singer, H. Substrata-factor patterns accompanying development
    of power of reading, elementary through college level.
    The Philosophical and Sociological Bases of Reading,
    Fourteenth Yearbook of the National Reading Conference, 1964.
Singer, H. Substrata-factor reorganization accompanying
    development in speed and power of reading at the elementary
    school level. U.S.O.E., Cooperative Research
    Project No. 2001, 1965.
Stein, N.S., Glenn, C.G. An analysis of story comprehension
    in elementary school children. In R. Freedle (Ed.),
    New directions in discourse processing. Hillsdale,
    N.J.: Ablex, 1979.
Swets, J.A., Tanner, W.P., Birdsall, T.G. Decision processes
    in perception. Psychological Review, 1961, 68, 301-340.
Thorndike, R.L. Reading as reasoning. Reading Research
    Quarterly, 1973-1974, 9, 135-147.
Tuiman, J.J. Determining the passage dependency of
    comprehension questions in five major tests. Reading
    Research Quarterly, 1973-1974, 9 (2), 206-223.
Tuiman, J.J., Blanton, W., Gray, G. A note on cloze as a
    measure of comprehension. Journal of Psychology,
    90 (20), 159-162.
Tuiman, J.J., Gray, G. The effect of reducing the redundancy
    of written messages by deletion of function words.
    Journal of Psychology, 1972, 82, 299-306.
Vipond, D. Micro- and macroprocesses in text comprehension.
    Journal of Verbal Learning and Verbal Behavior, 1980,
    19, 276-296.
Voss, J.F., Vesonder, G.T., Spilich, G.J. Text generation
    and recall by high knowledge and low knowledge individuals.
    Journal of Verbal Learning and Verbal Behavior,
    1980, 19, 651-667.
Weaver, W.W., Kingston, A.J. A factor analysis of the cloze
    procedure and other measures of reading and language.
    Journal of Communication, 1963, 13 (Dec), 252-261.
REFERENCE NOTES
1.  Johnston, P. Question type and the assessment of reading
    comprehension. Paper presented at the annual meeting of
    the American Educational Research Association,
    New York, March 1982.
2.  Royer, J.M., Lynch, D.J., Bulgarelli, C. Using the
    sentence verification technique to assess the comprehension
    of technical text. Paper presented at the annual meeting
    of the American Educational Research Association,
    New York, March 1982.
APPENDIX
Psychology Passage 1
Between 1956 and 1965, teachers and counselors in 90 schools used a
list of 14 behavioral criteria to select 1,503 ninth-grade students to
participate in a special counseling program for superior students.
Students who were selected generally ranked in the top 5% of their
class and above the 95th percentile on standard measures of academic
performance. The selected students' birth orders were compared with
census figures and with chance expectancies based on the number of
children in their families. Significant over-representations of
firstborns were found for 9 of the 10 years and for every family size.
Because of this consistency over a ten-year period, variability due to
cultural change could be practically disregarded. Furthermore, the
selected students represented only about 1 out of 5 possible students
who ranked in the top 5% of their class. Although the study did not
assess the possibility, it seems quite likely that firstborns would
not be significantly over-represented in the entire top 5% of their
class. The excess of firstborns among the selected students may have
reflected their teachers' judgments of their academic performance in
ways other than conventional measures of academic performance or
ranking. It is known that a strong relationship exists between being
firstborn and having high levels or drive states on several traits,
such as achievement motivation, seriousness, and adult orientation.
Behavioral differences such as these between firstborns and other
offspring could have been instrumental in the teachers' selection
process. Thus, behavioral differences specific to firstborns, rather
than other factors such as superior intelligence, may have accounted
for the significant over-representation of firstborns among the
students selected for participation in the laboratory. These factors
may also have accounted for the striking similarities which were found
between the over-representation of firstborns in the selected
population reported in this study, and the over-representation of
firstborns in populations of eminent persons reported in previous
studies.
Psychology Passage 2
Recently, an experimental method was developed for teaching young
children bilingual abilities by instructing them exclusively in a
second language (French), while having them use their native language
(English) at home and outside the school. The children in this
experimental group were compared with two monolingual control groups
who were instructed only in French or English depending on which was
their maternal language. The experimental group and the control groups
were matched by socio-economic, environmental, and IQ criteria to
avoid confounding factors. After two years of instruction for all
groups, the groups were tested for communication skills. The
monolinguals, of course, were tested only in their maternal language,
while the experimental bilinguals were tested in both French and
English. One experiment examined their abilities as decoders of novel
information. The other experiment tested their proficiency of
encoding. In both instances, the experimental bilinguals were found to
be as capable as the matched monolingual control groups. The apparent
result of the experiment was that young children instructed
exclusively in a foreign language could apply abilities developed
mainly through teacher-pupil interaction to non-academic peer-to-peer
communications. This occurred with no decrement in maternal language
performance. This evidence contradicts the notion that a bilingual's
progress in one language will be balanced or offset by a handicap in
the other. Of course, the handicap may become more evident over time
as increasing complexity of the languages requires higher levels of
mastery in communication skills.
Non-Psychology Passage 3
Edmund Halley (apparently pronounced "Haw-lee") was one of the
greatest of the seventeenth century astronomers. According to his
biographer, Mr. Ronan (a British science writer and editor), the name
Halley is familiar, of course, because of the comet named after him.
The comet, sighted as far back as 239 B.C., was most recently seen in
1910. However, most people do not realize that in addition to his work
on the comet, it was Halley who first made use of Newton's mechanical
equations--published in 1687--to predict that the comet would return
at Christmastime, 1758. And return it did--it was first sighted on
Christmas Day of that year--seventeen years after Halley's death at
eighty-six. Halley was a polyhistor--at home in all the sciences,
arts, and letters of his time. He made several daring sea voyages of
exploration. He also plotted the earth's winds and magnetic fields in
North and South America. He was a man of great charm and tact, and
was, perhaps, the only person who was in a position to persuade the
neurotic Newton to publish his greatest work, the "Principia." In
fact, Halley supervised the printing of the book, read print, and even
paid the printer out of his own pocket. Mr. Ronan's biography of
Halley makes delightful reading. From it one gets a sense of how
science functioned in one of its greatest epochs.
Non-Psychology Passage 4
Mrs. Elizabeth: A Memoir was written by Elizabeth Anderson with help
from Gerald R. Kelley. Mrs. Anderson, now eighty-four, was Sherwood
Anderson's third wife. She met him in New York (where she was managing
the Doubleday Doran bookstore) and lived with him in New Orleans,
Paris, and rural Virginia until 1929. At that time, he sent her to
visit her parents and then wrote her a one-line letter: "I just wish
you would not come back." Mrs. Anderson then moved to Taxco, Mexico,
renewed a friendship with William Spratling, whom she had known in New
Orleans, and opened what became a successful dress shop. Her book ends
with Spratling's death in an automobile accident in 1967, of which she
comments: "I miss Bill Spratling so very much more than I ever missed
Sherwood Anderson." It is a curious book, bland in describing her
early years, dutiful and matter-of-fact about the Anderson years, and
chatty about the Mexican years that followed. The writing is clearly
that of Mr. Kelley, a professional journalist. But Mrs. Anderson's
observations on her celebrated friends are just as clearly her own.
"Others might eat an apple, Sherwood experienced it," she says. And,
"Edna St. Vincent Millay always had a coterie of followers but did not
care about them one way or the other." Or, "Bill Faulkner's studied
courtesies and Southern mannerisms were a pose."