Language Measures
Common Measures of
Naturalistic Language: What
They Do and Don’t Tell Us
Cheryl M. Scott, PhD, CCC-SLP
Rush University Medical Center
Nickola W. Nelson, PhD, CCC-SLP
Western Michigan University
MLU, MLTU, MLCU, NDW, TTR,
TNW, SI, etc., etc., etc.
Preschool: MLU changes rapidly; obligatory grammatical markers; small # of sampling venues
School-age: MLTU changes slowly; optional grammatical structures; large # of sampling venues

Reliability, Validity, Sensitivity, Dimensionality
Increased use of naturalistic
language samples with school-age
children and adolescents
• Searching for phenotypes/clinical markers of
language impairments (sensitivity/specificity)
• Comparing language impairments, e.g., SLI,
Autism, brain injury, WS
• Formulating intervention goals
• Documenting language change with
intervention
Learning objectives
• Identify and define several commonly used
measures of naturalistic language production at
word, sentence, and discourse levels
• Critique measures for reliability and validity
(variance), sensitivity, and dimensionality
• Account for modality and genre effects on these
measures
• Illustrate the use of language sample measures
using a large scale writing intervention project
• Suggest alternative measures and tasks for
particular clinical questions
MODALITY affects word, sentence,
and discourse features
Speaking
• Acoustic (fleeting; nuance; audience feedback)
• Structure is linear
• Lexicon is less dense; higher frequency words

Writing
• Visual (deliberative; permanent; imagined audience)
• Structure is hierarchical
• Lexicon is more dense; lower frequency words

(Features compared at word, sentence, and discourse/text levels)

Scott & Nelson: ASHA 2007
GENRE affects word, sentence,
and discourse features
• Conversation
• Narration
• Expository (informational)
• Persuasive (argument)

Reliability/ Consistency
Within-subject
• Compare apples to apples
• Sample-to-sample variability
• Less for more tightly controlled samples
• Obligatory vs. optional forms

Validity/Truthfulness
Construct validity
• Did I conceptualize (and operationalize) a construct (a language domain) in a reasonable way?
• Is this the right measure for what I want to know (or do)?
Criterion validity
• Does the measure behave the way I would expect based on my theory of the construct?

Measures
Macrostructure
• Productivity
  – TNW
  – Total # utterances
  – Timed measures
• Cohesion
• Text structure
Microstructure
• Sentential complexity
  – Mean length of utterance (C-unit, T-unit)
  – Clause density (Subordination Index)
  – PropComplex
  – Fine-grained analyses
  – Error analyses
• Lexical diversity (TTR, NDW, VOCD)
• Lexical density
• Word length, word frequency
• Finer-grained lexical analyses
Reliability/ Consistency
Measurement (inter-rater)
• It depends on the measure
• COMPUTERS are good at some things, and bad at others
• HUMANS are better at some things, but far from perfect

Example:
– A couple of weeks later a knight was traveling[v] down a path when the five people threw[v1A] a letter out of the bars with an arrow that was asking[v2R] for someone to save[v3VCNF2] them.
Examples:
• Average length of utterance/sentence is sensitive
to developmental comparisons at ages 2 and 3,
but not 9 and 10
• Average length of sentence varies greatly across
discourse genres and tasks
• Type token ratio (TTR) should not be used to
compare two language samples of different sizes
• Average length of sentence tells me whether a
child with a language impairment is/is not “in the
ballpark” for his/her age, but does not inform
intervention planning
Sensitivity and specificity
Examples:
• Does this measure contribute in an
important way to the identification of a
particular condition and how accurately
does it do this?
– A “stand-alone” clinical marker?
– Additive evidence?
• Does this measure contribute to the differential diagnosis of related conditions?
• Morphosyntactic measures (e.g., verb marking) are known to be sensitive indicators of SLI
• Sentence length and
morphosyntax measures
versus verbal formulation
measures distinguish
between ADHD and SLI
• Measures of average
sentence length and
clause density are
equivocal in
distinguishing TD, LI, LD
groups
Dimensionality
• How does this measure relate to other measures?
• If our 10 measures “cluster” nicely into 2 dimensions, I can identify the two measures (one for each dimension).

Example:
• Lexical diversity measures align more with productivity measures than they do with sentence complexity measures in children’s narratives (Justice et al., 2006)

WORDS/ Lexical diversity
Does the individual have a sufficient “store” of different words (for the task) or, alternatively, use the same (high-frequency) words over and over?
WORDS/ Lexical diversity

MEASURES
1. NDW: # different words (types) in sample
2. TTR: # types / # tokens
3. VOCD (Vocabulary Diversity)

Reliability: Computer does a good job (+)

Validity: Construct (variety and specificity of words C/A can call up) seems solid; sample size effects are problematic; task may constrain (+/-)

Sensitivity: When sample size is strictly controlled by # words rather than # utterances, several studies have not found SLI/TD group differences in either age or language ability matches (?)

Dimensionality: Not known because it is conflated with sample length in several studies (?)

WORDS/ Other measures
• Lexical density: Proportion of content words (nouns, verbs, adjectives) to total words
• Word length: Average # characters (letters); # words with 7+ letters
• Word frequency: Distribution of words among 2000 most frequent (http://www.edict.com.hk/textanalyser)
  – Significant effect of modality (W>S) from age 10+
  – Significant age effects in speech and writing, but only for particular developmental “windows”
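The NDW and TTR definitions above reduce to a few lines of code. This minimal sketch (with a hypothetical mini-sample) also shows why TTR should not compare samples of different sizes: repeating the same sample doubles the tokens but adds no new types, so TTR drops while NDW holds steady.

```python
def ndw(tokens):
    """Number of different words: count of unique types in the sample."""
    return len(set(tokens))

def ttr(tokens):
    """Type-token ratio: # types / # tokens."""
    return ndw(tokens) / len(tokens)

# Hypothetical 11-token sample
sample = "the dog saw the frog and the frog saw the dog".split()
print(ndw(sample))                 # 5 types: the, dog, saw, frog, and
print(round(ttr(sample), 2))       # 0.45
# Doubling the sample leaves NDW unchanged but cuts TTR roughly in half
print(ndw(sample * 2))             # still 5
print(round(ttr(sample * 2), 2))   # 0.23
```

This is the same sample-size effect visible in the Dick and Jane comparison later in the handout.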
Let’s compare Dick and Jane
Narrative: Self-generated story
from a wordless picture book
One Frog Too Many,
Mayer & Mayer, 1975
D (um) there was a boy who got (um) like a Christmas or a
birthday present.
D I don/’t know which present.
D but it was a present.
D and so (did he) he had three pets a frog a dog and a turtle.
D and when he opened it he saw something inside.
……….
D and so the little frog was really scared.
D now lets make names for the little frog and the big frog.
……….
D and Sam was very happy riding along right behind Jack
and the dog riding behind the boy and the boy right in front
like military soldiers.
D hey that/’s what they used to do.
D well Jack when the boy was/n’t (looking) looking and no
one else was looking (he kicked the frog off of the turtle
and I mean) he kicked Sam off of the turtle.
D and Sam started crying because he was just a baby.
D and so the boy did/n’t really appreciate what Jack did.
SENTENCES/ Complexity
• MLU, MLTU, MLCU
• Subordination Index
• Clause density (# clauses / # T-units)
• # Complex (Simple)
• PropComplex
• COORD
• SUBORD
• Types of complexity (complements, relatives, adverbials)
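As a minimal sketch (assuming the sample is already segmented into T-units, e.g., by SALT conventions), MLTU is just total words divided by number of T-units; the sample here is hypothetical.

```python
def mltu(t_units):
    """Mean length of T-unit: total words across all T-units / # T-units."""
    total_words = sum(len(t.split()) for t in t_units)
    return total_words / len(t_units)

# Hypothetical segmented sample: 5 + 3 + 5 = 13 words over 3 T-units
t_units = ["the boy got a present", "he opened it", "he got a little frog"]
print(round(mltu(t_units), 2))  # 4.33
```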
J The boy got a present from
his friend.
J He open/*ed it.
J He got a little frog.
J He had two frog/*s.
J One was big.
J One was little.
J The big one did/n't like the
little one.
J The big one bite[EW:bit]
his leg.
J And then he was XX.
J He was riding on his back.
J And the big one was in the
front.
J He kicked the little one (out
of the) out[EW:off] of the
turtle's back.
J And he made him cry.
J (He leave him) he
leave[EW:left] him over
there.
J He was (he go) going
away.
J He *was mad at him.
J He jump/*ed on the X.
J He kick/*ed him off.
J He's rowing away.
J He kick/*ed him off the
boat.
J And he did/n't move.
J He never move/*ed.
J The turtle X.
J (and the) and the frog
stick[EW:stuck]
he[EP:his] tongue out at
him.
                      Jane (7;4)                Dick (6;9)
No. main body words   147                       596
NDW                   66                        175 (first 147 words: 66; middle 147 words: 77)
TTR                   0.45                      0.29 (first 147 words: 0.46; middle 147 words: 0.50)
Low frequency words   rowing, moved, tongue,    ribbiting, talking, pond, military soldiers,
                      squeaky, squished         appreciate, safety, stick, paddle, sneakily,
                                                quietly, breath, choice, tongue, attending,
                                                nowhere, angry, scared, awarded, good deeds
1. Yanis’ father was a farmer (1)
2. And he wanted him to be a farmer (2)
3. And then Yanis let the little goats out to go up the
mountain and stuff (2)
4. And one went off (1)
5. And he was going to go after it (2)
6. And he got to the top of the mountain and saw the sea
(2)
7. And he had a dream that night (1)
8. And he told his dad that he could be a fisherman and
be back in a couple of days (3)
Clause density: 14 clauses / 8 T-units (1.75)
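The clause density computation in the Yanis example above (clause counts per T-unit shown in parentheses) can be sketched as:

```python
# Clause counts per T-unit, taken from the eight numbered T-units above
clauses_per_t_unit = [1, 2, 2, 1, 2, 2, 1, 3]

# Clause density = total clauses / total T-units
clause_density = sum(clauses_per_t_unit) / len(clauses_per_t_unit)
print(clause_density)  # 1.75 (14 clauses / 8 T-units)
```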
SENTENCES/ Complexity/
MLU, MLTU, MLCU
Reliability:
95% inter-rater segmentation agreement
Computers are good at the actual calculation
Validity: Growth slows after age 6;
Sentence length does not always reflect
complexity
Huge effects of task, genre, and modality
(E>N, W>S in some languages)
Examples of slow growth
[Figure: average sentence length (y-axis, 1-9) plotted against age 6 to 12; two data series (Justice; Bishop) rise only gradually.]
Sensitivity:
Large standard deviations;
Equivocal findings of group differences
Sentence length and complexity mismatches

1. The dog that comes every evening sniffing around is mean. (10 words, 3 clauses)
2. The dog has a big head and a stubby tail. (10 words, 1 clause)

1. The residents of London were surprised last fall when they saw a whale swimming in the Thames river. (18 words, 3 clauses)
2. Although the whale that was spotted in the Thames river was taken back to sea, it died anyway. (18 words, 3 clauses)

SENTENCES/ Complexity
SI, CD, Complex, PropComplex

Reliability: Fair reliability in coding instances and type of subordinate clauses; different studies code different types of clauses

Validity: Growth curve slows; counting instances may not reflect developmental differences; text length is a confound for counting instances

Sensitivity: Equivocal findings for language ability; but more sensitive metrics may come to the rescue
WORDS & SENTENCES: Dimensionality (Justice et al., 2006)

Predicted
• Productivity: TNW, NDW, LENGTH (# T-units)
• Complexity: MLTU-W, COMPLEX, COORD, SUBORD, PROPCOMPLEX

Found
• Productivity: TNW, NDW, LENGTH (# T-units), COMPLEX, COORD, SUBORD
• Complexity: MLTU-W, PROPCOMPLEX
SENTENCES/ Grammatical error

Sensitivity
• In study after study, grammatical error has been found to be a robust measure distinguishing TD and LI children and adolescents
  – Larger effect sizes than other measures
  – One of the few measures that also distinguishes LI from language-matched controls (a true clinical marker)
  – Error is exacerbated in LI in writing compared to speaking and in expository compared to narrative discourse

PRODUCTIVITY
• Is there an adequate amount of language for the task?
• Is the language produced in a timely way (productivity and fluency)?
1. TNW
2. Total # T-units
3. Utterances per turn
Caveat: A child with the highest # sentences may/may not have the highest # words
Productivity findings
• Significant effect for age
• Significant effect for language ability
• Narratives > expository
• Spoken > written

Discourse/ Text Level
Measures include
• Points assigned for text structure components
• Ratings (e.g., 0-3) for rubric domains (character development, plot development, organization)

Discourse/ Text Level
Reliability
• Narrative macrostructure: 85-90% inter-rater reliability
• Expository macrostructure: More difficult
  – Deciding what is a generalization, e.g., The desert is a hot and inhospitable place (Scott & Jennings, 2004)

Sensitivity (+/-)
• Liles et al. (1995): Microstructure measures were more effective distinguishing LI and TD groups than macrostructure measures
• Fey et al. (2004): Low-moderate effect sizes distinguished TD, SLI, and NLI
• Gillam & Pearson (2004) report good sensitivity (.92) for the Test of Narrative Language, but caution that “tests don’t diagnose”

Written Discourse Analysis: Clinical Applications
Nickola Wolf Nelson, Ph.D., CCC-SLP
Western Michigan University
Acknowledgments
• Adelia Van Meter, M.S., Western Michigan University
• Christine Bahr, Ph.D., St Mary of the Woods College,
Terre Haute, IN
• Many graduate assistants, especially Sally Andersen,
Pam Ansell, Kylee Biddel, Karey Hill, Carrie Kopitzki,
Amanda Luna-Bailey, Kristen Kopacz, Carey Nagayda,
& Anna Putnam
• U.S. Department of Education; however, the views
expressed in this presentation are those of the authors
and no endorsement should be construed on the part of
the U.S. government
Clinical Meaning of Variance

Clinically Useful Measures

Sampling Method
• Valid
  – Relevant
  – Sensitive to development and disorder
  – Captures evidence (sensitive, specific)
  – Specific to disorder
• Reliable
  – One sample to the next
  – Consistent when sampling conditions (time, genre) remain constant

Analysis Techniques
• Valid
  – Represent what matters (qualitative, quantitative)
  – Dimensional: unique variance, not too redundant
• Reliable
  – Not too complex or subjective
  – One scorer to the next
  – One scoring attempt to the next

Sampling methods: Continuum of naturalness
Hunt, 1970 (W-free writing); Loban, 1976 (portfolio)
Nelson & Van Meter, 2002, 2007 (W-original stories)
Fey et al., 2004 (OW-3 pictures)
Gillam & Johnston, 1992 (OW-3 pictures)
O’Donnell et al., 1967 (OW-soundless films)
Puranik & Lombardino, 2006 (W-passages read aloud)
Scott & Windsor, 2000 (OW-films w/ sound)
Andersen, Nelson, & Scott, 2007 (W-phrase combining)
Hunt, 1970 (W-rewriting)
O = Oral; W = Written
Three Purposes of
Written Language Sampling
1. Does this child have a language disorder
in the area of written language?
2. What written language abilities should be
targeted at this point with this child?
3. Is change occurring?
– Is this child making progress?
– Is this treatment working?
• RCT
• Do children with special needs move closer to typical children?

Andersen, Nelson, & Scott, 2007 (W-sentence combining)
Purpose One. Diagnosing disorder (we’ll return to this one later)
Purpose Two. What to target next?
The Writing Lab Approach
• Writing process instruction
– Curriculum-based, authentic writing projects
– Language targets: discourse, sentence, word, social
interaction
– Instructional tools: scaffolding with audience perspective,
minilessons, author chair, peer conferencing, author
notebooks, environmental supports
• Computer Supports
• Inclusive, collaborative, individualized instruction
– 2 to 3 days a week working in classrooms and computer
labs, teachers and SLPs side-by-side
– Special needs students included for all instruction
– Intentional goals based on analysis of narrative probes
Probing Written Language
Nelson, N. W., Bahr, C. M., & Van Meter, A. M. (2004). The writing lab approach to language instruction and intervention. Baltimore, MD: Paul H. Brookes.

Nelson, N. W., & Van Meter, A. M. (2002). Assessing reading and writing samples for planning and evaluating change. Topics in Language Disorders, 22(2), 47-72.

Nelson, N. W., & Van Meter, A. M. (2006). The writing lab approach for building language, literacy, and communication abilities. In R. McCauley & M. Fey (Eds.), Treatment of language disorders in children (pp. 383-421). Baltimore, MD: Paul H. Brookes.

Nelson, N. W., & Van Meter, A. M. (2007). Measuring written language ability in narrative samples. Reading and Writing Quarterly, 23, 287-309.
www.wmich.edu/hhs/sppa
“special projects”
Assessment of Written Language: Describe initial performance in
• Writing processes
• Written products: language levels
• Spoken language in writing process context

We are interested in the stories that __ graders write. You know something about stories. Stories have a problem. They tell what happened and how the story ended. Your story can be real or imaginary.
WRITING ASSESSMENT SUMMARY AND OBJECTIVES
Student _____________ Grade _____ Teacher ________________
Assessment sources ________________Genre ______ Date ______
OBSERVATIONS AND
IMPRESSIONS
GOALS AND OBJECTIVES
Writing Processes
Planning and organizing
Drafting
Revising and editing
Written Products
Discourse level
Sentence level
Word level
Conventions
Oral Language
Writing process oral contexts
Genre specific
www.wmich.edu/hhs/sppa
“special projects”
Plan for Initial Probe,
Jan. 22, 2nd Grade.
“April”
• African-American girl, age 7;10 when
we began working with her 2nd grade
classroom mid-year
• Teacher identified her as high-risk
– History
• Near fatal vehicular accident at age 4
• Head trauma
• Not identified as special ed
“Story” for Initial Probe,
Jan. 22, 2nd Grade
Assessing Writing Processes
• Planning and organizing
  – Picture
  – Graphic organizer (semantic web)
  – Notes (used classroom supports)
  – Dictation (teacher scaffolded)
• Drafting
  – Refers to planning (used plan to generate list)
  – Pauses periodically
• Revising and editing
  – Rereads work
  – Makes corrections (letter-level corrections)
  – Etc.
Assessing Discourse Level
• Productivity (# words; T/C-units) (20 words)
• Structural organization (per genre) (not a story)
• Sense of audience
  – Creative and original (copied text)
  – Relevant/adequate information (few intelligible words)
  – Dialogue/other literary devices (n/a)
  – Title (none)
• Cohesion
  – Within/across sentences (list)
  – Pronoun reference
  – Verb tense

Narrative Scoring
1. Isolated description (heaps)
2. Action (temporal) sequence
   – “What next?” strategy
   – Often linked by and, so, then
3. Reactive sequence
   – Causally linked, without planning
   – May be implied but should characterize story
4. Abbreviated episode
   – Problem stated
   – Character’s intentions may be implied, “decided to”
5. Complete episode
   – Plan stated
   – Clear ending
6. Complex/multiple episodes
Discourse Level: Baseline & Goals
• Baseline:
  – Willing to attempt task
  – Produced expository description to story probe
  – Limited independence; dependent on classroom text
• Will use:
  – Planning to use story grammar elements
  – Higher level narrative maturity
  – Temporal connections
  – Cause-effect elements

Assessing Sentence Level
• T-units
  – Number of T/C-units (5 C-units)
  – MLTU (MLTU = 4.0)
• Types of Sentences & Variability
  – Simple and complex
  – Correct and incorrect
  – [si] [sc] [ci] [cc]
  – 4 [si], 1 [ci]
Sentence Types
• Simple incorrect [si]
  – Her and I saw you.
  – Taking care of people.
• Simple correct [sc]
  – Tiger and Coco were best buddies.
• Complex (or compound) incorrect [ci]
  – These are{r} the thing/*s that a nurse does{bus}. [1 T-unit; 2 verb phrases]
• Complex (or compound) correct [cc]
  – That night I dressed up as a witch too, but I was a lot uglier. [2 T-units]

Sentence Level: Baseline & Goals
• Baseline:
  – Produced 5 T-units
    • 4 verb phrases [si]
    • 1 complex incorrect [ci] sentence: “These are {r} the thing/*s that a nurse does {bus}” [no word boundaries]
• Will use higher level syntactic maturity:
  – Generate independently a majority of complete sentences ([sc], [ci], or [cc]) with a subject phrase and verb phrase (may receive scaffolding to transcribe)
  – Reread and revise to correct syntactic errors.
Assessing Word Level
• Word choice
– Mature and interesting words
– Correct for context
– Number of different words
• Knowledge of sounds &
morphemes in words
– Spelling accuracy (% correct)
– Evidence of spelling knowledge &
developmental advances
Spelling knowledge/strategies
• Prephonetic (meaningless letter or letter-like sequences
e.g., takyskrp for “I am coming”)
• Semiphonetic (partial phonetic representation; letter-name strategies, e.g., “ne” for any)
• Phonetic (representing all or most phonemes)
• Transitional (ortho-morphographic representation)
– Orthographic patterns (e.g., silent –e rule, -ought, -ould)
– Morphemes (e.g., -ed, -ing, un-, re-)
• Conventional (most words spelled correctly)
– letter doubling rules
– less frequent patterns
• Higher level, mixed issues
– Use Masterson & Apel sources for assessment and intervention.
Word Level
Baseline & Goals
• Baseline:
  – Word choice dependent on context (6 of 20 words were intelligible; no word boundaries)
  – Spelling accuracy (30% correct)
  – Semiphonetic for generative spelling
    • chavThrhrLe “checking blood pressure”
    • pypoll/pepll “people”
• Will generate meaningful words independently
  – Produce appropriate words for context
  – Increase %age of correctly spelled words:
    • Phonetic knowledge
    • Morpho-orthographic knowledge

Plan for Final Probe, April 10, 3rd Grade.
Purpose Three. Measuring change associated with instruction/intervention.

Evidence of Maturity & Ability: Study Across 5 Grade Levels (Nelson & Van Meter, 2003)
• Sampling task: original probe stories (Beg, Mid, End)

Race/Ethnicity
• Of 322 students identified ethnically: 165 African American, 136 Euro American, 20 Hispanic, and 1 Asian (or other) students.
Study of Midyear Probes for all Grades (N = 277)
• Special Needs (53): ECDD 9, LD 17, S-L impaired 6, EI 4, Cog Imp 2, ASD 1, HI 1

Evidence of Change with Intervention: Study of Third Graders
(Nelson & Van Meter, Reading & Writing Quarterly, 2007; Nelson, Bahr, & Van Meter, 2004; Nelson & Van Meter, 2006)
• 5 third grade classes (N = 101): Typical Devel. (82), Special Needs (19)
• Original Story Probes: Beg, Mid, End
Methods of Analysis
• Students’ written samples were transcribed using SALT (Miller & Chapman, 2000).
• Utterances were divided into T-units.
• Samples were coded at three levels:
  – Discourse level: coded for maturity using story grammar sequence (Glenn & Stein, 1980)
  – Sentence level: coded as simple or complex and correct or incorrect [si][sc][ci][cc], grammatical errors, and SALT counts of number and types of conjunctions
  – Word level: coded for spelling accuracy by entering intended word, followed by student’s spelling, e.g., “night{nite}[sp]”, and SALT counts of number of different words
  – Coding reliability: ranged from 81% to 99% agreement.

Discourse Level: Differences across & changes within grade level
Story Grammar Maturity and Productivity (Fluency)
[Figures: mean story grammar scores and total words in story for grades 1-5 (Typical Developing, n = 224) at beginning, mid, and final probes of the school year.]
Story Scores: 1 = isolated description, 2 = temporal sequence, 3 = causal sequence, 4 = abbreviated episode, 5 = complete episode, 6 = complex or multiple episodes.
Discourse Level: Narrative Story Score by Ability
[Figures: story grammar scores for typical vs. special needs students at beginning, mid, and end of the school year, shown separately for 1st through 5th grades.]
Key: 0 = heaps, 1 = isolated description, 2 = temporal sequence, 3 = causal sequence, 4 = abbreviated episode, 5 = complete episode
Note: Some 1st or 2nd grade children with special needs entered the group at mid-year or end-year when they achieved enough literate language abilities to write their own words.

Sentence Level: Differences across & changes within grade level
[Figures: Mean Length of T-Units (MLTU) and total T-units in sample by grade and schoolyear point; MLTU by ability (typical vs. special) for each grade, 1st through 5th.]

Sentence Type Changes
[Figures: proportions of sentence codes (SI, SC, CI, CC, RO) in typical and special students’ samples by grade, beginning to final probe.]

Word Level: Differences across & changes within grade level
[Figure: total, different, and correctly spelled words by grade level.]
Word Level: Number of Different Words by Ability
[Figures: number of different words for typical vs. special needs students at beginning, mid, and end of the school year, shown separately for 1st through 5th grades.]
Note: Some 1st or 2nd grade children with special needs entered the group at end-year when they achieved enough literate language abilities to write their own words.
Best Measures for Comparing Scores for Typical and Special Needs Students at Points Across the School Year
[Table: rows are Discourse Level (Story Scores), Sentence Level (MLTU, No. Conj., Types Conj., % Gram. Error (drop)), and Word Level (No. Diff. Wds., Tot. Wds., % Spelling Error (drop)); columns are 1st through 5th Grade; cells mark the probe points (Beg, Mid, End) at which group differences reached significance.]
* or ** independent t-test results significant at p<.05* (p<.01**) for typical and special needs students at a particular point in the school year.

• Grade level changes NDW**: 1st grade (midyear samples) M = 22.6 (SD 9.3) to 5th grade (midyear samples) M = 87.2 (SD 43.3)
• Significant difference based on ability: NDW***

What about older children and adolescents with more severe disabilities... who are not on the charts?
Developmental Writing Scale
Primary Trait Scoring
D’s Independent Efforts
(Sturm & Nelson, 2007)
1. Drawing
2. Scribbling [D’s work fits here]
3. Strings of letters not grouped into words
4. Strings of letters grouped into “words” with one or less intelligible possible real words set apart or embedded in a string of letters
5. Two to three semi-intelligible words
6. More than three semi-intelligible words in a list format
7. More than three attempted words with a sentence-like frame
8. At least one “sentence” with a subject phrase or a verb phrase
9. Several sentences with somewhat related or unrelated content
10. Several sentences on one topic
11. Multiple paragraphs with different main topics and one or more related sentences for each topic
12. The writing has a beginning, middle, and end with almost all sentences relating to a main topic
From Sturm’s data

Lltons
Football
Dethott
Lbons
This

Level 6
• Total Intelligible Words = 5
• Developmental Writing Level = Level 6 (>3 wds, list format)
• Topic = Lions football
• Total Unique Words = 4
• Number of C-Units = 0

Back to ...
Purpose One. Diagnosing disorder
• Differentiating based on ability
• Not differentiating based on dialect or race/ethnicity
Sampling constraint influences on validity and reliability

The question of...

Original Stories/Free Writing
• Higher ecological validity
• Reliability
  – “Test-retest”?
  – “Inter-scorer”?
• Measure curriculum-based language ability

Sentence Combining/Re-writing
• Lower ecological validity
• Reliability
  – Less likely to be influenced by knowledge of topic/genre or internal pace/time
  – Test-retest
  – Inter-scorer
• Measure curriculum-relevant language ability
  – Read stimulus sentences/words
  – Comprehend and re-present discourse meaning
  – Manipulate kernel sentences in working memory
  – Formulate more complex syntax while maintaining meaning
  – Spell/transcribe words into text
Inter-scorer Coding Reliability
• Inter-scorer (trained graduate assistants)
agreement based on 20 randomly selected
samples
– 97% for T-unit division
– 100% for number of words
– 94% for omitted words
– 93% for omitted bound morphemes
– 90% for total sentences
• 68% for simple incorrect
• 75% for complex correct
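Point-by-point percent agreement like the figures above is a simple ratio of matching decisions to total decisions; the scorer decisions in this sketch are hypothetical.

```python
def percent_agreement(scorer_a, scorer_b):
    """Point-by-point inter-scorer agreement: matching decisions / total decisions."""
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100 * matches / len(scorer_a)

# Hypothetical T-unit boundary decisions (1 = break, 0 = no break) from two scorers
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(percent_agreement(a, b))  # 90.0 (9 of 10 decisions match)
```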
Interscorer Variability
Test-Retest Variability
Test-Retest Reliability
• Same task: Story probes
• Same students: N = 11 students (3rd
grade-Spring; mean age 8.9 years)
• Gathered 2 weeks apart -- not enough time for change to be attributed to:
  – development
  – instruction
  – intervention
Data from Test-Retest Study

Pair                            N    Correlation   Sig.
story5 & story6                 11   .638          .035*
totut5 & totut6                 11   .736          .010**
mltu5 & mltu6                   11   .014          .968
ndw5 & ndw6                     11   .849          .001***
totwd5 & totwd6                 11   .825          .002***
ow5 & ow6                       11   -.329         .324
obmorph5 & obmorph6             11   .849          .001***
conjtot5 & conjtot6             11   .358          .280
ConjType5 & ConjType6           11   .209          .536
RelProTot5 & RelProTot6         11   .301          .368
RelProType5 & RelProType6       11   .209          .538
sc5 & sc6                       11   .475          .140
si5 & si6                       11   .084          .806
cc5 & cc6                       11   .852          .001***
ci5 & ci6                       11   .652          .030*
ro5 & ro6                       11   .335          .313

Results for Test-Retest Reliability
• T-tests showed no significant differences from time 1 to time 2 in any of the quantitative measures
• Correlations showed:
  – Most reliable (highest correlations): story scores, total utterances, total words, number of different words, error counts of omitted morphemes
  – Most unreliable (lowest correlations): MLTU; totals and type counts for conjunctions and relative pronouns

Can a constrained, but curriculum-relevant task be developed as part of a standardized test?

Test of Integrated Language & Literacy Development
Still looking for a few more Beta test sites
[email protected]
The TILLS Reading-Writing Stories: Sampling Method
• Challenge: Develop a structured task that
  – Has a degree of ecological validity
  – Can yield a lot of information in a short amount of time
  – Is sensitive and specific to identifying disorders of written language
  – Can be scored reliably

Graphic Organizer Task: News Story Plan
Who/Where? Our school
What? It was closed.
Why? Pipe broke.
When? Last week
What was the problem? A flood
What was done to solve the problem? Pipe was fixed.
How did it end? Children could come back.
Sentence Combining “Rewriting” Task

7. Reading the News (reading fluency)
The Little Dog
There was a dog.
He was little.
He was brown.
He was white.
A car almost hit him.
It was in front of our school.
He was scared.
He was okay.

8. Writing the News
There are many good ways to put these notes together to write an interesting story. Here is one example...
A little brown and white dog almost got hit by a car in front of our school. He was scared, but he was okay.
The TILLS Reading-Writing Stories
Analysis Techniques
• Decisions:
– Rubric scoring attempts did not yield
satisfactory reliability
– T-unit division may not be a skill that all
examiners possess
– Counting words and content units is more
likely to result in reliable scoring by most
examiners
– Counting correctly and incorrectly spelled
words makes sense.
Perhaps we need to move on
• Better tie the language measure to the clinical and/or research questions
• Get out of the lab and pay attention to real language that’s out there, for example…
“If you had asked most architects 25 years ago whether modern architecture could make a good city, the answer would have been a rousing ‘no.’ Wounded by spiritless steel-and-glass boxes and the social tumult at notorious public housing projects such as Cabrini-Green, the dominant style of the 20th Century was in full retreat, even in Chicago, the nation’s pre-eminent stronghold of steel and glass.”
Future directions for language
measures
• Finer-grained analyses of form (syntax)
• Interactions of measures
– Form with function
– Lexicon with syntax
– Confluence of forms
• Proxy measures
A few examples
• Sentence Complexity Index (SCI; Scott & Nelson, 2006)
  – A way of capturing more detail about syntactic complexity
  – Features deemed more complex and/or later-developing are weighted differently
  – For example:
    A couple of weeks later a knight was traveling[v] down a path when the five people threw[v1A] a letter out of the bars with an arrow that was asking[v2R] for someone to save[v3VCNF2] them.
    Although he suspected[v1LLBA] the whole team was[v2VC] against him and put[v2C] that stuff in his locker, Ron thought[v] he could convince[v1VC] a few of them to help[v2VCNF2] him.
• Kernel Sentence Index (KSI; Scott & Nelson, 2006)
  – A sentence combining task (writing)
  – KSI = # input “kernels” / # T-units

Example of sentence combining
Input kernels:
1. There was a building.
2. It was old.
3. No one knew how old.
4. No one knew exactly.
5. People talked about it.
6. It was used in a war.
7. It was the Civil War.
8. The building was a hospital.
9. Many soldiers died.
10. They were buried.
11. The graveyard was still there.

Combined (student output):
1. There was an old building.
2. No one knew how old, at least not exactly.
3. People talked about it being used in the Civil War.
4. It was a hospital where soldiers died.
5. They were buried in the graveyard that’s still there.

11 kernels / 5 T-units
KSI = 2.2

Findings to date:
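The KSI arithmetic in the worked example reduces to a single division; a minimal sketch (the function name is ours, not the authors’):

```python
def kernel_sentence_index(num_kernels: int, num_t_units: int) -> float:
    """Kernel Sentence Index: input kernels per output T-unit.

    Higher values mean the writer packed more kernels into each
    T-unit when combining sentences (greater syntactic complexity).
    """
    if num_t_units <= 0:
        raise ValueError("at least one T-unit is required")
    return num_kernels / num_t_units

# The worked example above: 11 input kernels combined into 5 T-units.
print(kernel_sentence_index(11, 5))  # -> 2.2
```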
• SCI shows a robust developmental effect (ages 7-15) that is reasonably independent of task (unlike MLTU)
• KSI (sentence combining) correlates significantly with complexity in a free writing task (SCI)
• KSI offers a “level playing field” for children and adolescents to show the full range of complexity types
• KSI shows a more linear developmental effect

Summary and Conclusions
• Match the measure:
  – To the question
  – To the age of the child or adolescent
• When measuring change over time, or group
differences, compare apples to apples
• Beware of effects of sample length on measures
  – Many increase monotonically with sample length (e.g., TNW)
• Productivity (TNW, NDW) is a good measure:
– Captures general language development and ability
• Error analyses yield greatest sensitivity to LIs
• Still looking for the best measures to capture
growth in syntactic ability during the school-age
years
– But it does occur!
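The productivity measures recommended above (TNW, NDW) are simple token and type counts. A minimal sketch with a naive tokenizer (the regex split is a simplifying assumption; clinical transcription tools define word boundaries more carefully):

```python
import re

def productivity(sample: str) -> tuple[int, int]:
    """Return (TNW, NDW): total number of words and number of
    different words, using a naive lowercase word split."""
    words = re.findall(r"[a-z']+", sample.lower())
    return len(words), len(set(words))

tnw, ndw = productivity("The dog was little. The dog was brown and white.")
print(tnw, ndw)  # -> 10 7
```

Note that TNW grows monotonically with sample length, which is exactly why samples being compared should be equated for length.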
REFERENCES
Andersen, S., Nelson, N. W., & Scott, C. M. (2007, November). Measuring syntactic
complexity across three writing tasks. Poster presented at the Annual Conference of
the American Speech-Language-Hearing Association, Boston, MA.
Apel, K., & Masterson, J. J. (2001). Theory-guided spelling assessment and intervention.
Language, Speech, and Hearing Services in the Schools, 32, 182-195.
Apel, K., Masterson, J. J., & Hart, P. (2004). Integration of language components in
spelling: Instruction that maximizes students’ learning. In E. R. Silliman and L. C.
Wilkinson (Eds.), Language and literacy learning in schools (pp. 292-315). New York:
Guilford Press.
Apel, K., Masterson, J. J., & Niessen, N. L. (2004). Spelling assessment frameworks. In
A. Stone, E. R. Silliman, B. Ehren, & K. Apel (Eds.), Handbook of Language and
Literacy: Development and Disorder (pp. 644-660). New York: Guilford Press.
Bishop, D., & Clarkson, B. (2003). Written language as a window into residual language deficits: A study of children with persistent and residual speech and language impairments. Cortex, 39, 215-237.
Fey, M. E., Catts, H. W., Proctor-Williams, K., Tomblin, J. B., & Zhang, X. (2004). Oral and written story composition skills of children with language impairment. Journal of Speech, Language, and Hearing Research, 47, 1301-1318.
Francis, W., & Kucera, H. (1982). Frequency analysis of English usage: Lexicon and
grammar. Boston, MA: Houghton Mifflin.
Freedman, S. W. (1982). Language assessment and writing disorders. Topics in
Language Disorders, 2 (4), 34-44.
Gentry, J. R. (1982). An analysis of developmental spelling in GNYS AT WRK. The
Reading Teacher, 36, 192-200.
Nelson, N. W., Bahr, C. M., & Van Meter, A. M. (2004). The writing lab approach to language instruction and intervention. Baltimore, MD: Paul H. Brookes Publishing Co.
Nelson, N. W., & Van Meter, A. M. (2002). Assessing reading and writing samples for
planning and evaluating change. Topics in Language Disorders 22(2), 47-72.
Nelson, N. W., & Van Meter, A. M. (2003, June). Measuring written language change
through the elementary school years. Poster presented at the annual Symposium
on Research in Child Language Disorders, University of Wisconsin-Madison.
Nelson, N. W., & Van Meter, A. M. (2006). The writing lab approach for building
language, literacy, and communication abilities. In R. McCauley & M. Fey (Eds.).
Treatment of language disorders in children. (pp. 383-421). Baltimore, MD: Paul H
Brookes.
Nelson, N. W., & Van Meter, A. M. (2007). Measuring Written Language Ability in
Narrative Samples. Reading and Writing Quarterly, 23, 287-309.
O’Donnell, R. C. (1976). A critique of some indices of syntactic maturity. Research in the Teaching of English, 10, 31-38.
O’Donnell, R. C., Griffin, W. J., & Norris, R. C. (1967). Syntax of kindergarten and elementary school children. Champaign, IL: National Council of Teachers of English.
Owen, A., & Leonard, L. (2002). Lexical diversity in the spontaneous speech of children with specific language impairment: Application of D. Journal of Speech, Language, and Hearing Research, 45, 927-937.
Gillam, R. B., & Johnston, J. R. (1992). Spoken and written language relationships in
language/learning-impaired and normally achieving school-age children. Journal of
Speech and Hearing Research, 35, 1303-1315.
Gillam, R., & Pearson, N. (2004). Test of Narrative Language. Austin, TX: Pro-Ed
Golub, L. S., & Kidder, C. (1974). Syntactic density and the computer. Elementary English, 51(8), 1128-1131.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. Urbana, IL:
National Council of Teachers of English.
Hunt, K. W. (1970). Syntactic maturity in school children and adults. Monographs of the Society for Research in Child Development, No. 134.
Hunt, K. W. (1977). Early blooming and late blooming syntactic structures. In C. R. Cooper & L. Odell (Eds.), Evaluating writing: Describing, measuring, judging (pp. 91-106). Urbana, IL: National Council of Teachers of English.
Justice, L., et al. (2006). The index of narrative microstructure: A clinical tool for analyzing school-age children’s narrative performance. American Journal of Speech-Language Pathology, 15, 177-191.
Loban, W. D. (1963). The language of elementary school children. NCTE Research
Report No. 1. Urbana, IL: National Council of Teachers of English.
Loban, W. D. (1976). Language development: Kindergarten through grade twelve.
Urbana, IL: National Council of Teachers of English.
Masterson, J. J., & Apel, K., (2000). Spelling assessment: Charting a path to optimal
intervention. Topics in Language Disorders, 20(3), 50-65.
Masterson, J. J., Apel, K., & Wasowicz, J. (2006). SPELL-2 Spelling Performance
Evaluation for Language and Literacy (2nd ed.) [Computer software]. Evanston, IL:
Learning By Design. www.learningbydesign.com
Puranik, C. S., & Lombardino, L. J. (2006, November). Assessing the
microstructure of written language. Presented at the annual ASHA Convention,
Miami, FL.
Redmond, S. (2004). Conversational profiles of children with ADHD, SLI, and
typical development. Clinical Linguistics & Phonetics, 18:2, 107-125.
Scott, C. M., & Nelson, N. W. (2006, June). Capturing Sentence Complexity in
Children’s Writing: Promising Tasks and Measures. Poster presented at the
annual SRCLD Convention, Madison, WI.
Scott, C. M., Nelson, N. W., Andersen, S. A., & Zielinski, K. (2006, November).
Development of Written Sentence Combining Skills in School-Age Children.
Poster presented at the annual ASHA Convention, Miami, FL.
Scott, C., & Windsor, J. (2000). General language performance measures in
spoken and written narrative and expository discourse of school-age children
with language learning disabilities. Journal of Speech, Language, and Hearing
Research, 43, 324-339.