Dynamic Testing in Schizophrenia:
Does Training Change the Construct
Validity of a Test?
by Karl H. Wiedl, Henning Schottke, Michael F. Green,
and Keith H. Nuechterlein
Grigorenko and Sternberg 1998). DT was reformulated
for clinical purposes by Wiedl and Schottke (1995) and
has recently been advocated for further application to
schizophrenia research by Green et al. (2000).
For DT in clinical samples, the targeted cognitive
performance should have relevance to the psychiatric disorder and the intervention should be suitable for integration into the testing procedure. Schizophrenia patients
often show deficits in tasks linked to frontal lobe functioning, in particular those tapping executive functioning
(concept formation, planning, organization of behavior;
see Morice and Delahunty 1996). The Wisconsin Card
Sorting Test (WCST, Heaton 1981) has frequently been
used to examine such concept formation deficits. Several
interventions for remediating performance deficits have
been developed for the WCST (see Goldberg and
Weinberger 1994; Stratta et al. 1997). Special instructions,
teaching, and continuous verbal feedback seem to be particularly effective in improving the formation, maintenance, and change of concepts in at least some schizophrenia patients (Goldberg et al. 1987; Green et al. 1992).
Compared with the standard administration of the WCST,
which includes only simple feedback (right, wrong), these
interventions require the patient to use previous feedback
and instruction to a high degree to make subsequent correct choices.
Starting from the assumption that these cognitive
remediation techniques bring about short-term gains in at
least some patients, we first showed that interindividual
differences in responding to these interventions can be
replicated in different samples and can be meaningfully
related to other psychological and clinical variables
(Wiedl and Wienobst 1999; Wiedl et al. 1999). Switching
our approach of investigation from a cognitive remediation perspective to a dynamic assessment perspective, we
Abstract
Dynamic testing typically involves specific interventions for a test to assess the extent to which test performance can be modified, beyond level of baseline
(static) performance. This study used a dynamic version of the Wisconsin Card Sorting Test (WCST) that
is based on cognitive remediation techniques within a
test-training-test procedure. From results of previous
studies with schizophrenia patients, we concluded that
the dynamic and static versions of the WCST should
have different construct validity. This hypothesis was
tested by examining the patterns of correlations with
measures of executive functioning, secondary verbal
memory, and verbal intelligence. Results demonstrated a specific construct validity of WCST dynamic
(i.e., posttest) scores as an index of problem solving
(Tower of Hanoi) and secondary verbal memory and
learning (Auditory Verbal Learning Test), whereas the
impact of general verbal capacity and selective attention (Verbal IQ, Stroop Test) was reduced. It is concluded that the construct validity of the test changes
with dynamic administration and that this difference
helps to explain why the dynamic version of the WCST
predicts functional outcome better than the static version.
Keywords: Executive functioning, learning, verbal
capacity, cognitive modifiability, Wisconsin Card
Sorting Test, schizophrenia.
Schizophrenia Bulletin, 30(4):703-711, 2004.
Measuring a person's capacity to improve cognitive performance is the goal of an approach that is called dynamic
assessment or dynamic testing (DT). To achieve this goal,
repeated administrations of a test and specific interventions are applied within a test-training-test paradigm or
are integrated into the presentation of the test items
(Grigorenko and Sternberg 1998). This approach was
developed in the field of intelligence assessment and then
extended to other domains (see Guthke and Wiedl 1996;
Send reprint requests to Prof. Dr. Karl H. Wiedl, Fachbereich
Psychologie, Universität Osnabrück, 49069 Osnabrück, Germany; email: [email protected].
K.H. Wiedl et al.
Schizophrenia Bulletin, Vol. 30, No. 4, 2004
instruction and feedback, we also assessed whether
WCST performance was related to verbal learning and
memory. Finally, we examined the extent to which WCST
performance was related to general verbal intelligence
level at the pre- and posttest points, because prior research
(Heaton 1981; Morice and Delahunty 1996) has suggested
that general verbal intelligence may contribute to WCST
performance. Thus, our research question is as follows:
Does test repetition after WCST training alter the relations
of the WCST to other tests of executive functioning, to
verbal memory, and to general intelligence?
next looked for the predictive validity of indicators of performance change in the WCST following the specific
interventions. Using Brenner's Integrated Psychological
Treatment Program (Brenner et al. 1992), we found that
the degree of performance change assessed in the test-training-test paradigm of the WCST is related to proficiency in cognitive differentiation training (Wiedl and
Wienobst 1999). Performance in the dynamic WCST was
also shown to be related to training gain in a standard
clinical intervention program (eight sessions of medication management and problem solving), adding up to 20
percent of explained variance of the external criteria
(medication knowledge, problem-solving knowledge)
(Wiedl 1999; Carlson and Wiedl 2000; Wiedl et al. 2001a;
Wiedl and Schottke 2002).
Based on these studies, which demonstrate improved
predictive validity of the test after the specific intervention (third administration), we concluded that moving
from the first (static) to the third (dynamic) version of the
test probably altered its construct validity. Results from a
study in which we used a typological approach of assessing performance change and classified patients according
to degree of learning (learners, nonlearners, high scorers)
were consistent with this assumption. When attentional characteristics of these subgroups of patients were examined, signal/noise discrimination in a vigilance task (the Degraded
Continuous Performance Test, Nuechterlein et al. 1986)
and the subjective feeling of distractibility (Test of
Attentional Styles, van den Bosch et al. 1993)
discriminated among the learner groups. Learners
and high scorers (good performance at both pre- and
posttest) scored higher on signal/noise discrimination during vigilance than did nonlearners. Also, in contrast to
members of the other two groups, nonlearners tended to
be less aware of feelings of "distractibility." Indicators of
performance change thus seem to be related to specific
aspects of attention (Wiedl et al. 2001b).
In the present study, we examined changes in the construct validity of the WCST during a DT procedure by
comparing the correlation of WCST pre- and posttest
scores with other measures of executive functioning and
with measures of verbal learning and general intelligence.
One component of construct validity involves whether
relationships that are theoretically predicted do occur.
Various tasks assessing executive functioning were used
to examine whether expected relationships between
WCST scores and other measures of executive functioning were present. Problem solving and selective attention,
two hypothesized contributors to WCST performance,
were examined. The Stroop test was used to measure
selective attention, while the Tower of Hanoi indexed
problem-solving capacity.
Because learning relevant to WCST performance in a
DT paradigm involves profiting from intensive verbal
Method
Subjects. The sample is identical to the sample that
Wiedl et al. (2001b) examined to replicate and validate
the typological classification of schizophrenia patients
based on WCST DT and to demonstrate group differences
in attentional functioning. Subjects were 49 inpatients of a
psychiatric state hospital who met DSM-III-R criteria for
schizophrenia or schizoaffective disorder. Thirty-three
patients were male, and 16 were female. Patients were
diagnosed by an experienced psychiatrist and a senior
research clinical psychologist through a best-estimate
diagnostic conference using all available sources of information. Information included the SKID (the German version of the Structured Clinical Interview for DSM-III-R,
Wittchen et al. 1990; for n = 32 patients), clinical records,
and indicators of illness course. The latter types of information were confined to chronic patients with a well-established clinical diagnosis of schizophrenia or
schizoaffective psychosis. Patients were taken into the
sample only when they came from a rehabilitation ward
and had been judged by the senior psychiatrist of the ward
to be testable. Subjects were excluded if they had a history of substance or alcohol dependence or an identifiable
neurological disorder. After being given full information
about the project, patients who consented were administered the assessments described below. All were paid $20
for full participation in the research project, which
included additional testing and training in medication
management and problem solving. All patients were
receiving neuroleptic medication, which for 15 patients
involved atypical neuroleptics. The sample comprised relatively young patients whose condition was moderately
chronic. Further clinical and demographic data are given
in table 1.
DT With the WCST. In the standard WCST (see Heaton
1981), subjects are required to match 128 cards to one of
4 target cards. Matching rules are color, shape, or number
of symbols on each card. Under standard administration,
the subjects are told "right" or "wrong" after each match.
Table 1. Frequencies, means, and standard deviations (SDs) of the total sample on clinical and
demographic variables, symptomatology, and neurocognition

Tests                              n      Mean       SD
Sex
  Female                          16
  Male                            33
Age                               49     32.60      7.15
Age at first admission            48     25.17      5.64
Number of years at school         48     10.75      1.60
BPRS                              33     46.06     11.66
SANS                              37     22.73     18.84
WST IQ (verbal intelligence)      47     96.02     14.57
Stroop color word reading         46     39.00      7.94
Stroop color naming               46     65.96     18.13
Stroop interference               46    111.37     33.96
AVLT A1-A5 no. correct            46     42.70     10.66
AVLT A6 no. correct               46      8.65      3.60
AVLT B no. correct                46      4.62      1.98
AVLT recognition no. correct      46     12.96      2.05
AVLT recognition errors           46      1.34      2.29
TOH rule breaks                   40      5.40     12.86
TOH moves 3 disks                 40      9.60      3.36
TOH time 3 disks                  40     86.04     45.06
TOH moves 4 disks                 40     28.78     16.59
TOH time 4 disks                  40    305.80    198.46
TOH moves 5 disks                 40     62.60     26.97
TOH time 5 disks                  40    566.26    266.37

Note. AVLT = Auditory Verbal Learning Test; BPRS = Brief Psychiatric Rating Scale; SANS = Scale for the Assessment of Negative
Symptoms; TOH = Tower of Hanoi; WST = Wortschatztest (Test of Word Power).
After ten consecutive correct matches, the tester changes
the rule without informing the subject. The most commonly used measures are number of correct responses,
categories achieved, and perseverative errors.
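The rule-change criterion described here can be illustrated with a short sketch. It assumes a hypothetical list of booleans, one per sorted card (True for a correct match); the function name and the example block are illustrative and do not come from the test manual.

```python
def categories_achieved(responses, criterion=10):
    """Count completed categories.

    A category is completed once `criterion` consecutive correct sorts
    have been made; the tester then changes the rule and the streak
    starts again from zero.
    """
    categories = 0
    run = 0
    for correct in responses:
        run = run + 1 if correct else 0
        if run == criterion:
            categories += 1
            run = 0  # rule changes without warning; streak restarts
    return categories


# Hypothetical 28-card sequence: 10 correct, 3 errors, 15 correct
block = [True] * 10 + [False] * 3 + [True] * 15
print(categories_achieved(block))
```

Passing `criterion=6` would correspond to a shorter switching criterion, which yields more categories for the same number of cards.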
In contrast to the standard procedure, the WCST was
given in a pretest-training-posttest sequence in one session with each block comprising 64 cards. Pre- and
posttest (Time 1 [T1], Time 3 [T3]) were identical, using
the standard procedures described by Heaton (1981). As
in previous studies (Wiedl and Wienobst 1999; Wiedl et
al. 2001b), the training block (Time 2 [T2]) was administered according to the trial-by-trial intervention procedures described by Green et al. (1992) and Goldberg et al.
(1987). These procedures have been shown to improve
schizophrenia patients' performance (T2). However, durability
of the effects appears to be variable from subject to
subject (T3).
After finishing the first block, patients were informed
that they would now get help. Before starting the second
block, they were told the three sorting rules (color, form,
number). After every card sort, the patients were also told
why their choice was right or wrong (e.g., "This was
wrong. We don't sort for color now, but for form or number."). Subjects were informed of change of category
(e.g., "Correct, you had to sort for color. Having performed ten consecutive correct sorts, the rule will change.
You will now no longer sort for color but for form or
number."). Between the three blocks, brief breaks of
approximately 5 minutes were provided. Altogether,
WCST administration took between 30 and 45 minutes.
that contains all nouns from the two lists and additional
distractors with the instruction to mark the items from list
A. Number of words correctly recalled in the initial five
trials (A1 to A5), number of words of the second list (B)
and of the first list correctly recalled after list B has been
recalled (A6), and number of correct recognitions and of
recognition errors (items from list B) when the total list is
finally presented were selected as performance indexes.
For the assessment of general intelligence, the WST
(Wortschatztest, or Test of Word Power, Metzler and
Schmidt 1992) was used. This test assesses verbal comprehension as an indicator of crystallized intelligence
(Cattell 1963) and is considered to be a robust measure of
premorbid intelligence. The WST was constructed following the logistic model of psychological testing and was
shown to possess high reliability. The task is to identify
meaningful words from rows of meaningless distractors
(42 rows, 5 distractors each).
Among the test scores that can be computed, the number
of correct responses, categories achieved, and perseverative errors are the most commonly used. For our analysis of interindividual differences, number of correct
responses and number of perseverative errors were
selected because of their advantageous distributional characteristics.
Further Cognitive Variables. The Tower of Hanoi
(TOH) is used for the assessment of problem solving and
planning capacity and is considered to be a valid measure
of executive functioning (Morice and Delahunty 1996). A
version adapted by Schottke (2000) for computer administration was used. The task consists of three rods and
three to five disks of different sizes. The subjects are
required to transfer the disks from the primary rod to a target rod according to the following rules: Only one disk
can be moved at a time, and a disk cannot be put on top
of a smaller disk. Measures used for scoring are number
of moves, solution time, and number of rule breaks. The
first two measures yield intercorrelations around 0.60 and
are considered to indicate speed-accuracy trade-off components of problem-solving ability. Number of rule breaks
assesses regard for context information (task demands)
while processing the test and is related to number of
moves and to solution time to a lower degree (around
0.45, Schottke 2000). Given this and the fact that higher
numbers of rule breaks (>3) are frequently found in
patients with frontal (especially left frontal) closed-head
injuries, this measure is believed to be related to the functioning of working memory (Schottke 2000).
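The two rules and the three scoring measures can be made concrete with a small sketch. It is not the computer version by Schottke (2000): the function names, the choice of rod 2 as the target, and the decision to count only "larger disk on smaller disk" attempts as rule breaks are assumptions made for illustration. Solution time is omitted.

```python
def play_tower(n_disks, moves):
    """Replay (source, target) moves on three rods.

    Returns (legal moves made, rule breaks, solved). Disks are integers,
    a larger number meaning a larger disk. Rod 0 is the primary rod and
    rod 2 is assumed to be the target rod.
    """
    rods = [list(range(n_disks, 0, -1)), [], []]
    made = 0
    rule_breaks = 0
    for src, dst in moves:
        if not rods[src]:
            raise ValueError("no disk on source rod")
        if rods[dst] and rods[dst][-1] < rods[src][-1]:
            rule_breaks += 1  # attempted to place a disk on a smaller one
            continue
        rods[dst].append(rods[src].pop())
        made += 1
    solved = len(rods[2]) == n_disks
    return made, rule_breaks, solved


def optimal_moves(n, src=0, dst=2, via=1):
    """Shortest solution, which needs 2**n - 1 moves."""
    if n == 0:
        return []
    return (optimal_moves(n - 1, src, via, dst)
            + [(src, dst)]
            + optimal_moves(n - 1, via, dst, src))


for n in (3, 4, 5):
    print(n, play_tower(n, optimal_moves(n)))
```

The minimum number of moves is 7, 15, and 31 for three, four, and five disks, which gives a reference point for the mean move counts in table 1.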
The Color Word Interference Test (Baumler 1985), a
German version of the Stroop test, was used to assess
selective attention. The subjects are presented pages containing color words for which the color of the printed
word and the meaning of the word are not congruent.
Also, there are colored dashes printed on the pages. The
tasks for the subjects are to first read aloud the color
words (color word reading), then say aloud the colors of
the dashes (color naming), and then say aloud the print
color of the words (interference, selective attention). For
the present analysis, we used transformed median scores
(t distribution) of the time that the patients spent on these
three tasks.
The Auditory Verbal Learning Test (AVLT; German
version by Heubrock 1992) was used to assess secondary
verbal learning and memory. A list of 15 nouns (list A) is
read to the subjects with the instruction to reproduce these
nouns after their presentation is completed. No feedback
is given by the tester. This procedure is repeated five
times, followed by a second list (list B) to assess the
effects of interference, and again by another request to
recall the first list. Finally, the patients are given a table
Symptoms, Clinical and Demographic Variables.
Negative symptoms were assessed with a 25-item version
of the Scale for the Assessment of Negative Symptoms
(SANS, Andreasen 1984). Aggregated item scores for the
different subscales were used to estimate the overall
degree of negative symptoms. For an estimation of the
general level of psychopathology, the sum score of the
Brief Psychiatric Rating Scale (BPRS, version by Ventura
et al. 1993) was used. Besides chronological age, age at
first admission and number of years at school were registered as indicators of chronicity and educational level.
Procedures. A time span of 4 days was provided for
assessment. WCST was always administered on the first
day, followed by TOH and WST (second day), Stroop and
AVLT (third day), and then other variables not considered
in this report (fourth day). BPRS and SANS were rated by
the senior psychiatrists or senior clinical psychologists of
the wards where the patients were recruited. Additional
assessments were conducted to evaluate the patients'
training proficiency in a subsequent rehabilitation training
program. These procedures are described elsewhere
(Wiedl 1999; Wiedl and Schottke 2002).
Results
Description of the Sample. Means and standard deviations (SDs) of WCST scores for the three testing conditions were for number of correct responses (NCR) M1 =
35.55 (SD = 12.09), M2 = 60.29 (SD = 2.41), and M3 =
48.53 (SD = 11.83); for number of perseverative errors
(NPE) M1 = 16.96 (SD = 10.60), M2 = 0.65 (SD =
11.16), and M3 = 6.88 (SD = 6.41); and for number of
categories achieved M1 = 1.69 (SD = 1.31), M2 = 5.18
(SD = 0.86), and M3 = 3.29 (SD = 1.79). All pairwise
comparisons of M1, M2, and M3 for these different measures were significant (p < .001). The patients had thus
significantly improved performance in T2; although performance by T3 had decreased, it was still significantly
higher than performance at T1. The subsequent analyses
will be based on pre- and posttest scores (T1, T3).
Because of restrictions of variance in number of categories achieved as a consequence of the shortening of the
WCST in the three different blocks, only NCR and NPE
will be used.
Table 1 presents the patients' scores in the other variables that were assessed.
The opposite change in level of correlation can be
seen for the TOH. There is a significant or close to significant (0.05 < p < 0.10) change of correlation size from pre- to posttest for both WCST scores, indicating an increase
of shared variance of problem-solving ability with the
posttest scores. These results refer to the complex tower
tasks. The result for the three-disks tower was not significant.
In terms of correlation with clinical variables, neither
BPRS scores, SANS scores, nor years since first admission (age controlled) were substantially related to WCST
performance. The only significant correlations (p < 0.05)
were found between age and NPE-T3 (r = 0.29) and
between number of years at school and NPE-T3 (r = -0.37).
To clarify further the specific construct validity of
WCST pre- and posttest-scores, multiple regression analyses on the WCST target variables were computed with a
set of predictors that had yielded significant correlation
coefficients (Stroop color word reading and Stroop interference; TOH five-disk moves and time, and rule breaks;
AVLT sum and recognition errors, WST-IQ). The variables were entered simultaneously, not stepwise, to determine the best multivariate contribution using all variables
showing promise in univariate analyses. Visual inspection
of the histograms of standardized residuals did not suggest violations of the assumption of normal distributions
(see Olkins 1967). Different regression models proved to
be significant or close to significant for the different target
variables (NCR and NPE, at T1 and T3). For NCR, the
regression model was only close to significance for T1
(p < 0.08) but became highly significant at T3 (p < 0.001).
However, none of the predictors reached a significant
weight. Visual inspection of the results indicated that
while there was a decrease of weight of Stroop color word
reading, there was an increased weight of TOH rule
breaks and moves and AVLT recognition error scores,
accounting for the increase in explained variance (R² from
0.35 to 0.55).
For NPE, there were significant predictors within the
significant (p < 0.002) regression model for T1 (verbal
intelligence, p < 0.01; Stroop color word reading, p <
0.03; interference, p < 0.08). In the regression model for
T3 (p < 0.03), these variables lost their substantial weight.
The result of this was a reduction of R² (0.53 to 0.40). The
only variable that stayed close to significance was TOH
rule breaks (T1: p < 0.05; T3: p < 0.07). The regression
analyses thus showed different trends—increase of weight
of consistent variables versus decrease of variables that
are not consistent with the construct—which converge on
the interpretation of increased construct validity.
So far we have shown that WCST scores alter their
pattern of correlation and the regression model from
pre- to posttest. What pure performance change con-
Correlational Analyses of WCST Performance and
Performance Change. Before running correlational
analyses, the test scores were checked for distributional
characteristics with the help of the Kolmogorov-Smirnov
algorithm. Violations of the assumption of normality were
detected for some of the variables. Therefore, all scores
were submitted to a log transformation (base 10). If there
were zero scores, a score of 1 was added to all raw scores
of this variable. Table 2 gives a comprehensive view of
the correlational structure of the transformed variables.
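The transformation step can be sketched as follows. This is a minimal illustration of the rule as stated (log base 10, with 1 added to every raw score of a variable that contains zeros); it assumes non-negative scores, and the function name and example values are hypothetical.

```python
import math


def log10_transform(scores):
    """Base-10 log transform of one variable.

    If the variable contains a zero score, 1 is added to all raw scores
    of that variable before taking the logarithm.
    """
    shift = 1 if any(s == 0 for s in scores) else 0
    return [math.log10(s + shift) for s in scores]


# Hypothetical error counts that include a zero, so every score is shifted by 1
print(log10_transform([0, 4, 9, 19]))
```

Applying the shift to the whole variable rather than only to the zero entries keeps the rank order of the scores intact.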
The general strategy for analyzing these data was to
check for significant differences between correlational
coefficients between the WCST and the other cognitive
variables at the pretest versus posttest point. This was
done using Fisher's z distribution of correlation coefficients (Glass and Stanley 1970; for the calculation algorithm, see Steiger 1980).
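Because the pretest and posttest correlations come from the same patients and share one variable, the comparison is one of dependent correlations. The sketch below implements one Fisher-z-based statistic of the kind discussed by Steiger (1980), using the pooled correlation in the covariance term. The text does not say which variant was computed, so this choice is an assumption. The function also needs the correlation between the pretest and posttest WCST scores, which is not reported here; the value 0.5 in the example is purely hypothetical.

```python
import math


def dependent_corr_test(r_pre, r_post, r_prepost, n):
    """Z test for the difference between two dependent correlations.

    r_pre     : correlation of an external variable with the pretest score
    r_post    : correlation of the same variable with the posttest score
    r_prepost : correlation between pretest and posttest scores
    n         : sample size
    Returns (z, two-sided p).
    """
    r_bar = (r_pre + r_post) / 2.0
    # Covariance term of the two Fisher z values, using the pooled correlation
    psi = (r_prepost * (1 - 2 * r_bar ** 2)
           - 0.5 * r_bar ** 2 * (1 - 2 * r_bar ** 2 - r_prepost ** 2))
    c = psi / (1 - r_bar ** 2) ** 2
    z = (math.atanh(r_pre) - math.atanh(r_post)) * math.sqrt((n - 3) / (2 - 2 * c))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p


# Color word reading with NCR at T1 and T3 (table 2); r_prepost is hypothetical
z, p = dependent_corr_test(-0.46, -0.19, 0.5, 46)
print(round(z, 2), round(p, 3))
```

The higher the pretest-posttest correlation, the smaller the difference that reaches significance, which is why the independent-samples formula would be too conservative for this design.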
Inspection of the results yields the following picture.
General verbal intelligence is significantly related to one
aspect of WCST performance (NPE) both before and after
intervention. The correlation with NCR is not significant.
For AVLT, there seems to be a differential pattern.
Whereas the nonsignificant correlations with scores of B
(immediate recall), A6, and recognition do not change
with intervention, there is a significant change of correlations for the recognition error score with regard to NPE
and NCR. For the global measure of learning and memory
(A1-A5), the difference of correlations is close to significance for NPE (p < 0.10).
Another set of variables, the Stroop measures, seems
to be related to WCST performance only before intervention. After intervention, the coefficients go down and lose
significance (exception: color word reading and
NPE-T3). Most salient among the Stroop variables is
color word reading, a measure of verbal processing and
production with regard to written words. For this variable
and NCR scores (T1, T3), the difference for the correlation coefficients is significant (p < 0.05).
Table 2. Pearson correlations between WCST pre- and posttraining scores (T1/T3) and other
neurocognitive variables

Variables                      n    NCR-T1    NCR-T3    NPE-T1    NPE-T3
WST (verbal intelligence)     47     0.21      0.20     -0.51**   -0.41**
Stroop (median score)
  Color word reading (t)      46    -0.46**   -0.19      0.44**    0.35*
  Color naming (t)            46    -0.42**   -0.21      0.13      0.23
  Interference                46    -0.36*    -0.21      0.10      0.20

Difference between the NCR-T1 and NCR-T3 correlations for color word reading: p < 0.05
TOH
p<0.06
p<0.01
Moves 5 disks
40
-0.57**
-0.23
Moves 4 disks
40
-0.14
Moves 3 disks
40
-0.02
p<0.01
-0.55**
-0.21
-0.16
-0.09
40
-0.03
-0.35*
40
0.04
Time 3 disks
40
-0.06
40
-0.39*
0.19
p<0.05
-0.26
-0.20
-0.06
0.16
0.19
0.12
p<0.10
p<0.05
Rule breaks
0.15
-0.24
p < 0.06
Time 4 disks
0.55
p<0.05
p<0.05
Time 5 disks
0.21
-0.15
p<0.05
0.56**
-0.67**
0.28
-0.48**
AVLT
p<0.10
46
0.30*
0.33*
-0.21
B
46
0.22
0.10
-0.16
-0.14
A6
46
0.26
0.27
-0.18
-0.33*
0.29
0.34*
Sum score A1-A5
Recognition
46
Recognition errors
46
-0.17
-0.21
0.05
p<0.05
p < 0.06
-0.44**
0.09
0.43**
Note.—AVLT = Auditory Verbal Learning Test; NCR = number of correct responses; NPE = number of perseverative errors; TOH = Tower
of Hanoi; WST = Wortschatztest (Test of Word Power).
* p < 0.05 (2-sided); ** p < 0.01 (2-sided)
tributes to these results, however, cannot be directly
inferred from the data presented, because posttest
scores combine two aspects—initial performance and
performance change. To get some hints with regard to
the construct validity of change scores, an additional
step of analysis was therefore included. The target variables were determined via residuals from regression
analysis on NCR-T3 and NPE-T3, with the respective
pretest scores (NCR-T1, NPE-T1) as predictors. The
selection of the patients for this step of analysis was
conducted with the help of the statistical algorithm
designed by Schottke et al. (1993). It excludes all subjects
who start at pretest with an NCR score of 43 or
higher to allow sufficient room for performance
improvement. For this reason, the sample had to be
reduced to 26 patients. For change in NCR, a regression model with TOH rule breaks (p < 0.04) and AVLT
recognition errors (p < 0.05) was significant (p < 0.006,
R² = 0.69). For change in NPE, the regression model
did not reach significance. The capacity to improve
NCR performance after specific intervention thus
seems to be related to aspects of working memory during planning and problem solving and to word recognition under conditions of distraction.
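A residualized change score of this kind can be sketched in a few lines. The example regresses posttest on pretest scores by ordinary least squares and returns the residuals, after dropping cases at or above a pretest cutoff. Only the cutoff of 43 comes from the text. The selection algorithm of Schottke et al. (1993) is not reproduced, and the function name and data are hypothetical.

```python
def residual_change(pre, post, max_pre=None):
    """Residuals of post regressed on pre (simple least squares).

    Cases with a pretest score of `max_pre` or higher are excluded first,
    so that the remaining cases have room to improve.
    """
    pairs = [(x, y) for x, y in zip(pre, post)
             if max_pre is None or x < max_pre]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in pairs]


# Hypothetical NCR scores at T1 and T3 for five patients
ncr_t1 = [20, 30, 40, 50, 25]
ncr_t3 = [40, 45, 60, 62, 30]
print(residual_change(ncr_t1, ncr_t3, max_pre=43))
```

A positive residual marks a patient who scored higher at posttest than the pretest score alone would predict. By construction the residuals are uncorrelated with the pretest scores.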
of the regression analyses indicate that the construct being
measured by the WCST has become more specific in two
ways. First, it seems to have gained a higher loading of
executive functioning, probably including the activity of
working memory, and of aspects of secondary verbal
memory (total recall in the course of repeated presentations, inhibition). Second, it has undergone a reduction in
weight of nonspecific components, especially verbal intelligence (WST IQ) and verbal reading capacity (Stroop
color word reading).
The question subsequently addressed was what in
particular is indicated by change of performance. Based
on the regression analyses using standardized residuals as
a criterion, use of context information and—to a lesser
degree—word recognition under the influence of distracting stimuli proved to be salient predictors. The associations of these WCST change scores to the predictor variables are thus very similar to those of the posttest scores.
This confirms that the construct validity of the posttest
scores following specific intervention is related to these
components of learning.
Some other issues need clarification. One relates to
the TOH, which shows clearest results for the four- and
five-disk tasks, indicating that learning ability relates to
complex problem solving. However, the five-disk task
(and to a lesser degree the four-disk task) is not only complex but also is preceded by learning while working on
the less complex three-disk task condition. An alternative
interpretation might thus be that the results reported for
the TOH indicate procedural learning capacity, as had
been stated by Goldberg et al. (1990). Further studies will
have to be conducted to clarify this issue.
Another aspect of interest is the pervasive effect that
can be found for the error scores of different tasks (TOH,
AVLT), in particular the TOH rule breaks. In all error
scores, instructional rules that need to be followed while
working on a task have been disregarded. One hypothesis
would be that working memory did not function in keeping online the specific context information (task
demands), and thus transgressions of rules could occur in
those persons who do poorly, even after WCST training.
This interpretation is backed by results on use of context
information in neurological patients with specific frontal
lobe lesions (Schottke 2000). A second hypothesis would
be that the ability to learn and retain successful WCST
performance covaries with the ability to learn these rules.
Intelligence should thus be related to rule breaking. Low
correlations between the intelligence scores and the error
scores (-0.18, -0.15, ns) indicate that this hypothesis is
not very plausible, however.
In the introduction, we noted that DT results in a
clear increase in explained variance in the prediction of
functional outcome (clinical rehabilitation training). What
Discussion
Generally, the data presented show that the pattern of correlations between key neurocognitive variables and
WCST performance differs when examined before and
after a specific WCST intervention. Considering the single variables, some evidently do not change their relationships: verbal intelligence (IQ), simple problem solving
(TOH, three disks), immediate memory (AVLT, first presentation, B), and word recognition (AVLT). In contrast,
significant or close to significant alterations in correlations from before to after WCST training can be observed
for AVLT secondary verbal memory (A1-A5) and recognition errors, for complex problem solving (TOH, four
and five disks), for use of context information (TOH, rule
breaks), and for verbal processing (Stroop color word
reading). These latter variables suggest that a change in
the construct being measured by the WCST may occur
with training. As was shown, this change is related to both
NCR and NPE. If one wanted to use the number of categories achieved as the dependent measure, it might help to
use Nelson's modification of the test, which can yield a
greater range in this index because categories are
switched after six (not ten) consecutive correct responses
(see Lezak 1995).
More specifically, these results, together with the
results from regression analyses, give a rather clear
answer to the research question addressed by this study.
With regard to the variables linked to executive functioning, a "shift of loading" from pre- to posttest takes place
for two tests: Stroop and TOH. Whereas the "loading" of
the Stroop variables diminishes, the importance of TOH
indicators is augmented. According to the results of the
regression analyses, two aspects of TOH performance
seem to be particularly important: complex problem-solving ability (number of moves, time), which is an aspect of
executive functioning, and number of rule breaks, which
indicates use of context information. It appears that the
specific intervention may increase the extent to which
these nonverbal frontal lobe functions are used to perform
the WCST. On the other hand, certain aspects of general
verbal capacity (Stroop color word reading, WST verbal
intelligence) and selective attention (Stroop interference)
seem to be of some relevance during pretest performance
(perseverative errors) but lose their importance for WCST
performance after intervention.
In the introductory section, we also suggested that
memory functioning should show higher correlations with
WCST posttest performance than with pretest performance because of the reliance on a verbal intervention. The
correlations presented in table 2 are in the expected direction; the correlation differences are substantial for recall
and learning (A1-A5) and for recognition errors. Results
709
Schizophrenia Bulletin, Vol. 30, No. 4, 2004
brings about this change in predictive validity? Based on
the results reported here, we suggest that executive functioning, use of context information as a function of working memory, and aspects of secondary verbal memory
may be the capabilities that mediate the patients' functioning in a training program. In contrast to the static version
of the WCST, the dynamic test version may better tap
these abilities and thereby improve its ability to predict
functional outcome. An alternative explanation may be
that DT is interactive, so that doing well on it requires readiness for social interaction. Future studies may want to consider
whether a subject's readiness for social interaction is a
key determinant of his or her ability to benefit from DT.
In summary, we believe that a testing procedure that
systematically integrates certain brief interventions and
examines their impact gives the WCST more specific construct
validity as an index of learning potential,
which, in turn, seems to be related to problem solving and
to aspects of secondary verbal memory and learning. In
addition, the impact of general verbal capacity and speed
and of selective attention on the scores appears to be
reduced. Because this procedure is related to functional
outcome and is easy to apply, it seems to have advantages
for further use in both basic research and clinical application.
References
Andreasen, N.C. Scale for the Assessment of Negative Symptoms (SANS). Iowa City, IA: University of Iowa, 1984.
Bäumler, G. Farbe-Wort-Interferenztest (FWIT) nach J.R. Stroop. Göttingen, Germany: Hogrefe, 1985.
Brenner, H.D.; Hodel, B.; Roder, V.; and Corrigan, P. Treatment of cognitive dysfunctions and behavioral deficits in schizophrenia. Schizophrenia Bulletin, 18(1):21-26, 1992.
Carlson, J.S., and Wiedl, K.H. The validity of dynamic assessment. In: Lidz, C.S., and Elliot, J., eds. Dynamic Assessment: Prevailing Models and Applications. New York, NY: Elsevier, 2000. pp. 881-912.
Cattell, R.B. Theory of fluid and crystallized intelligence. International Journal of Educational Psychology, 54:1-22, 1963.
Glass, G.V., and Stanley, J.C. Statistical Methods in Education and Psychology. Upper Saddle River, NJ: Prentice Hall, 1970.
Goldberg, T.E.; Saint-Cyr, J.A.; and Weinberger, D.R. Assessment of procedural learning and problem solving in schizophrenic patients by Tower of Hanoi type tasks. Journal of Neuropsychiatry, 2:165-173, 1990.
Goldberg, T.E., and Weinberger, D.R. Schizophrenia, training paradigms, and the Wisconsin Card Sorting Test Redux. Schizophrenia Research, 11:291-296, 1994.
Goldberg, T.E.; Weinberger, D.R.; Bergman, K.F.; Pliskin, N.H.; and Podd, M.H. Further evidence for dementia of the prefrontal type in schizophrenia? A controlled study of teaching the Wisconsin Card Sorting Test. Archives of General Psychiatry, 44:1008-1014, 1987.
Green, M.F.; Kern, R.S.; Braff, D.L.; and Mintz, J. Neurocognitive deficits and functional outcome in schizophrenia: Are we measuring the "right stuff"? Schizophrenia Bulletin, 26(1):119-136, 2000.
Green, M.F.; Satz, P.; Ganzell, S.; and Vaclav, J.F. Wisconsin Card Sorting Test performance in schizophrenia: Remediation of a stubborn deficit. American Journal of Psychiatry, 149:62-67, 1992.
Grigorenko, E.L., and Sternberg, R.J. Dynamic testing. Psychological Bulletin, 124:75-111, 1998.
Guthke, J., and Wiedl, K.H. Dynamisches Testen. Zur Psychodiagnostik der intraindividuellen Variabilität. Göttingen, Germany: Hogrefe, 1996.
Heaton, R.K. Wisconsin Card Sorting Test Manual. Odessa, FL: Psychological Assessment Resources, 1981.
Heubrock, D. Der Auditiv-Verbale Lerntest (AVLT) in der klinischen und experimentellen Neuropsychologie. Durchführung, Auswertung und Forschungsergebnisse. Zeitschrift für Differentielle und Diagnostische Psychologie, 3:161-174, 1992.
Lezak, M.D. Neuropsychological Assessment. 3rd ed. New York, NY: Oxford University Press, 1995.
Metzler, P., and Schmidt, K.-H. Wortschatztest (WST). Stuttgart, Germany: Beltz Testverlag, 1992.
Morice, R., and Delahunty, A. Frontal/executive impairments in schizophrenia. Schizophrenia Bulletin, 22(1):125-137, 1996.
Nuechterlein, K.H.; Edell, W.S.; Norris, M.; and Dawson, M.E. Attentional vulnerability indicators, thought disorder, and negative symptoms. Schizophrenia Bulletin, 12(3):408-426, 1986.
Olkin, J. Correlations revisited. In: Stanley, J.C., ed. Improving Experimental Design and Statistical Analysis. Chicago, IL: Rand McNally, 1967.
Schottke, H. Arbeitsgedächtnis und Kontextinformationen mit dem Turm von Hanoi. Zeitschrift für Differentielle und Diagnostische Psychologie, 21:304-318, 2000.
Schottke, H.; Bartram, M.; and Wiedl, K.H. Psychometric implications of learning potential assessment: A typological approach. In: Hamers, J.H.M.; Sijtsma, K.; and Ruijssenaars, A.J.J.M., eds. Learning Potential Assessment: Theoretical, Methodological and Practical Issues. Amsterdam, The Netherlands: Swets and Zeitlinger, 1993. pp. 153-173.
Steiger, J.H. Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87:245-251, 1980.
Stratta, P.; Mancini, F.; Mattei, P.; Daneluzzo, E.; Casacchia, M.; and Rossi, A. Remediation of Wisconsin Card Sorting Test performance in schizophrenia. Psychopathology, 30:59-66, 1997.
van den Bosch, R.J.; Rombouts, R.P.; and Asma, M.J.O. Subjective cognitive dysfunction in schizophrenic and depressed patients. Comprehensive Psychiatry, 34:130-136, 1993.
Ventura, J.; Lukoff, D.; Nuechterlein, K.H.; Liberman, R.P.; Green, M.; and Shaner, A. Appendix 1: Brief Psychiatric Rating Scale (BPRS). Expanded Version (4.0). Scales, anchor points and administration manual. International Journal of Methods in Psychiatric Research, 3:227-243, 1993.
Wiedl, K.H. Assessing cognitive modifiability as a supplement to readiness for rehabilitation in schizophrenic patients. Psychiatric Services, 50:1411-1419, 1999.
Wiedl, K.H., and Schottke, H. Dynamic assessment of selective attention in schizophrenic subjects: The analysis of intraindividual variability of performance. In: Carlson, J.S., ed. European Contributions to Dynamic Assessment. London: JAI Press, 1995. pp. 185-208.
Wiedl, K.H., and Schottke, H. Vorhersage des Erfolgs schizophrener Patienten in einem psychoedukativen Behandlungsprogramm durch Indikatoren des Veränderungspotentials im Wisconsin Card Sorting Test. Verhaltenstherapie, 12:90-96, 2002.
Wiedl, K.H.; Schottke, H.; and Calero, D. Dynamic assessment of cognitive rehabilitation potential in schizophrenic persons and in old people with and without dementia. European Journal of Psychological Assessment, 17(2):112-119, 2001a.
Wiedl, K.H., and Wienobst, J. Interindividual differences in cognitive remediation research with schizophrenic patients—indicators of rehabilitation potential? International Journal of Rehabilitation Research, 22:1-5, 1999.
Wiedl, K.H.; Wienobst, J.; Schottke, H.; Green, M.F.; and Nuechterlein, K.H. Attentional characteristics of schizophrenic patients differing in learning proficiency in the Wisconsin Card Sorting Test. Schizophrenia Bulletin, 27(4):687-696, 2001b.
Wiedl, K.H.; Wienobst, J.; Schottke, H.; and Kauffeldt, S. Differentielle Aspekte kognitiver Remediation bei schizophren Erkrankten auf der Grundlage des Wisconsin Card Sorting Tests. Zeitschrift für Klinische Psychologie, 28(3):214-219, 1999.
Wittchen, H.-U.; Zaudig, M.; Schramm, E.; Spengler, P.; Mombour, W.; Klug, J.; and Horn, R. SKID. Strukturiertes Klinisches Interview für DSM-III-R. Weinheim, Germany: Beltz Test GmbH, 1990.
Acknowledgment
This research was supported by grants from the German Research Council (Wi-484/7-2) and the American Council of Learned Societies (ACL-III/63).
The Authors
Karl H. Wiedl, Ph.D., is Professor of Clinical Psychology, Department of Psychology, University of Osnabrück, Osnabrück, Germany; and Co-Director, Center for Psychiatric Rehabilitation, Osnabrück. Henning Schottke, Ph.D., is Professor of Clinical Psychology, Department of Psychology, University of Osnabrück. Michael F. Green, Ph.D., is Professor, Department of Psychiatry and Biobehavioral Sciences, Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA; and Department of Veterans Affairs, VISN 22, Mental Illness Research, Education, and Clinical Center, Los Angeles, CA. Keith H. Nuechterlein, Ph.D., is Professor, Department of Psychiatry and Biobehavioral Sciences, Geffen School of Medicine, University of California, Los Angeles.