Instructional Science (2006) 34: 159–188
DOI 10.1007/s11251-005-4056-3
© Springer 2006
Thinking with a theory: Theory-prediction consistency
and young children’s identification of causality
PAUL HOWARD-JONES (1,*), RICHARD JOINER (2) & JENNIFER BOMFORD (3)
(1) The Graduate School of Education, University of Bristol, 35 Berkeley Square, Bristol,
BS8 1JA, UK; (2) University of Bath; (3) University of Wales Institute Cardiff
(*Author for correspondence, e-mail: [email protected])
Received: 5 May 2004; in final form: 2 March 2005; accepted: 18 March 2005
Abstract. Theory-testing can only inform scientific inquiry when the prediction of test
outcome is based upon the current theory (theory-prediction consistency). This investigation explores children’s theory-prediction consistency in a computer-mediated task
in which multiple opportunities were provided to predict outcomes and review theories.
An initial correlation study revealed that theory-prediction consistency was associated
with children’s success when attempting to identify causation. The second study
investigated the effect of goal and a simple intervention upon children’s theory-prediction consistency. The type of goal appeared to have no effect but the intervention,
which encouraged the children to use their theory to make predictions, significantly
improved their ability to identify cause. Interestingly, it also improved other aspects of
their performance – such as encouraging more reflection upon the outcomes of tests.
The results imply that poor theory-prediction consistency may be related to difficulties
in identifying the type of problem being presented.
Keywords: causation, evidence, goal, predictions, scientific inquiry, theories
Introduction
Previous investigations of children’s scientific problem solving have
revealed both the importance of effective coordination of theories and
evidence, and the development of the investigative skills needed to test
the validity of a theory. This latter area of theory testing may involve
the design of experiments comparing pre-determined conditions,
or the interpretation of data sets arising from more quasi-experimental scenarios in which the dependent variables are not directly controllable. In either case, however, the theoretical significance of the test
outcome depends upon a conscious application of the theory when
predicting it. Having explicitly theorised about a particular cause, the
production of a prediction that appears consistent with that theory
tends to indicate both an ability to apply the theory and an awareness
that such application is useful to the overall problem-solving process.
We have coined the phrase ‘theory-prediction consistency’ to describe
this observable quality of problem-solvers’ behaviour. There are reasons why we may hold and espouse a theory without necessarily using
it to make predictions. Firstly, everyday experience informs us that there
are indeed limitations to how useful theories are in making predictions
(e.g. We got wet on Saturday because it was raining – will we get wet
next Saturday?). We may also espouse theories for social purposes,
such as post-hoc justification, that are very different from those associated with prediction making. Within the particular situation of scientific inquiry, however, not basing predictions upon the current theory
can be very counter-productive in terms of testing a theory and, therefore, determining causality. The assumed importance of theory-prediction consistency is part of the particular epistemology of science
but clearly may not be considered as self-evidently advantageous in all
situations. Identifying a problem as being amenable to scientific
inquiry may, therefore, be an important precursor to applying scientific epistemological principles that include theory-prediction consistency. Both identification of the problem type and recognition of the
need for theory-prediction consistency may be prompted by unsuccessful attempts to use other types of strategy. (We have chosen the
term ‘theory-prediction consistency’ rather than ‘appropriate prediction’ because the latter term may be confused with merely making the correct
prediction.)
Ensuring that a prediction is based upon a current theory has largely been ignored as a source of difficulty for children, with the implicit assumption that children generally understand the need for
theory-prediction consistency when solving a problem scientifically.
For example, Klahr (2000) investigated the ability of children to discover the function of a mystery computer key when programming the
movement of a toy vehicle. Through analysing the children’s theories
about the key’s function and the experiments the children designed to
test these theories, Klahr proposed a model of scientific thinking
involving a search in multiple, interacting, problem spaces. Klahr
(2000) compared the prediction he made from their theory with the
subsequent experiment they designed, thus making the assumption
that the children themselves would have made such a prediction.
At first sight, such an assumption might appear reasonable. Indeed, if
a child has designed an appropriate experiment to test their
theory, one can probably assume that they have used their theory to
appropriately predict a possible outcome. However, where experiments do not reflect theories, there must be some uncertainty as to
whether this arises from an unsuccessful search in experiment space
or from poor theory-prediction consistency. A lack of such consistency may arise from a lack of knowledge about how to apply the
theory, but may also be due to an immature epistemological understanding.
The work of Kuhn has emphasised the difficulties experienced by
children in coordinating theory and evidence, identifying the fundamental issue here as the ability to think about a theory rather than
just with a theory. We believe this may somewhat understate theory-prediction consistency as a critical issue in children’s problem solving.
Indeed, there is some suggestion from Kuhn’s own work that children
do not always use their theories to devise the tests they carry out. In
studies involving inference from covariation evidence, Kuhn et al.
(1988) asked participants to evaluate information about, for example,
which sorts of food were responsible for catching colds. As predicted,
difficulties in theory-evidence coordination were indicated by children
ignoring or distorting evidence that was at odds with their theory.
However, difficulties in understanding the importance of theory-prediction consistency may also be inferred from the responses to
prediction questions used by Kuhn to shed further light upon
evidence evaluation skills. Although no attempt was made to compare
predictions of the children with the theories they held, Kuhn et al.
(1988) noted the use of a ‘matching’ strategy by some children. When
making predictions, children often attempted to identify which prior
situation was closest to the present scenario in terms of all potential
causes present, rather than base their prediction upon their current
theory about which was the single causal factor. Using such a
strategy, a prediction that eating a set of four foods would lead to
health was made by identifying that another set of similar foods, say
with only a single difference, had previously led to health. Children
justified such a prediction by indicating this similarity: ‘‘It will come
out good, because it has almost the same things as this one that came
out good.’’ This matching strategy, which has also been observed by
Downing et al. (1985), can be helpful in some real-world contexts but
is inefficient in situations where a single cause, or a limited number of
causes, is likely to be solely responsible for the effect. Other strategies,
such as a purely random approach, may involve the same lack of
theory-prediction consistency but be even less successful in making
correct predictions. Of course, even perfect theory-prediction
consistency may still precede an incorrect prediction, since the
theory upon which the prediction is based may be incorrect. This may
be a crucial event, however, in the correct determination of causality,
since it may prompt a revision of the theory. Thus, theory-prediction
consistency is important not just in terms of making correct predictions from a theory of causation, but also in terms of producing
evidence indicating the need to revise or dismiss the theory itself.
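The contrast between the two ways of predicting described above can be made concrete with a short sketch. It is illustrative only: the food names, the overlap-based similarity measure and both function names are assumptions introduced here, not part of the studies by Kuhn et al. (1988) or Downing et al. (1985).

```python
def theory_based_prediction(theory, items):
    """Predict the effect exactly when the single theorised cause is present."""
    return theory in items


def matching_prediction(items, history):
    """Predict the outcome of the most similar previously observed set.

    history is a list of (items, outcome) records; similarity is assumed
    here to be the number of shared items (ties go to the earliest record).
    """
    _, outcome = max(history, key=lambda rec: len(set(rec[0]) & set(items)))
    return outcome


# Hypothetical evidence: True means a cold followed the meal.
history = [
    ({"apple", "bread", "cola", "fish"}, False),
    ({"cake", "eggs", "rice", "beans"}, True),
]
new_meal = {"apple", "bread", "cola", "cake"}

by_theory = theory_based_prediction("cake", new_meal)   # theory: cake causes colds
by_matching = matching_prediction(new_meal, history)    # closest meal "came out good"
```

In this example the two strategies disagree: the theory that cake is the cause predicts a cold, whereas matching predicts health because the new meal shares three of its four foods with a meal that previously led to health.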
In a multicausal study by Leach (1999), participants included 162
younger children aged 9 years old. They were asked to select, from
four different accounts, the explanation that best suited a set of observations about the behaviour of a simple electrical circuit. Having chosen an explanation, students were asked to use it to generate a
prediction about the behaviour of four other circuits and to comment
on the actual behaviour of each circuit in the light of their explanation. Leach reports poor theory-prediction consistency, although the
explanation-based methodology limits interpretation of his results. He
records that only 21% of the predictions made by the 9-year-olds in
his study were clearly consistent with the explanations preceding them
and 31% were clearly inconsistent, leading him to highlight this as an
area of weakness for students struggling to understand the ‘‘rules of
the game’’ of theory evaluation in science.
Sodian et al. (1991) provided evidence for good theory-prediction
consistency amongst children as young as 6 years old who were able
to choose appropriate experimental procedures to test alternate
hypotheses. As with Klahr (2000), it can be argued that successful
selection of these experimental procedures involved reflection upon
what sort of outcomes might be predicted from the theory being tested, i.e. a grasp and application of theory-prediction consistency.
Additionally, as part of this study, children were asked what prediction should arise and most of the youngest children (aged 6–7 years
old) were indeed able to do this. However, as also commented by
Sodian et al. (1991), several differences exist between their study and
those of the type carried out by Kuhn et al. (1988). In the Sodian
et al. (1991) study, the hypotheses were not held with any conviction
by the children but were generated by the experimenter, and the children were not required to go on and generate their own alternative
hypothesis in the light of disconfirmatory evidence. It is entirely possible that children (and adults) tend to operate differently when assessing and applying the theories of others than when testing their own
ideas. Thus, it cannot be said that Sodian et al. (1991) demonstrated
theory-prediction consistency in situations similar to those considered
in the present study or by Kuhn. In both cases, children were
required to adapt their own ideas about causation in the face of
evidence. Furthermore, the Sodian et al. (1991) study did not present
covariation evidence but used a task in which a single test on only
two potential causes allowed a definite conclusion to be reached
about which was responsible. Ruffman et al. (1993) employed a faked
evidence methodology using covariation evidence with children also
aged 6–7 years old. In this task, evidence was presented in which an
effect covaried with an outcome, leading the children to express a particular causal theory. The evidence was then rearranged in the presence of the children to support an alternative theory, and they were
asked what a story character who viewed this ‘faked’ evidence would
consider was causing the effect. In one experiment, participants were
also asked what prediction the story character might make about the
outcome of a subsequent test. Ruffman et al. (1993) considered that
the ability of most of their participants to competently carry out the
tasks demonstrated an appropriate epistemological understanding of
the relationship between evidence, theory, and predictions by 7 years
old. Again, however, in explaining the improved success of the children compared with those in other studies, Ruffman et al. point to
the decreased number of potential causes in their task compared with
Kuhn et al. (1988). It is reasonable to assume that the number of potential causes in a problem may influence theory-prediction consistency since, as well as influencing the complexity of making a theory-based prediction, it may also influence the usefulness of the prediction
outcome. Prediction about situations involving only two potential
causes (as in Ruffman et al., 1993) may conclusively test a theory in a
single stroke. In multicausal problems producing covariation evidence
(as in Kuhn et al., 1988; Leach, 1999), a single correct prediction
must be considered collectively with other evidence and does not, in
itself, prove a theory. Incorrect predictions may also vary in their significance. As pointed out by Koslowski (1996), a single incorrect prediction can indicate the need to reject a theory or, in some situations,
suggest the need for only a revision of the theory. If the number of
potential causes influences the theoretical value of a single prediction,
it follows that the perceived usefulness of theory-prediction consistency may also vary.
Theory-prediction consistency appears, then, to be less problematic
for children undertaking tasks involving small numbers of potential
causes, with some studies suggesting that difficulties may arise in
multicausal tasks. No previous work, however, has directly investigated whether theory-prediction consistency is a significant limiting
factor for children attempting multicausal problems requiring scientific method. (By scientific method, we refer to a set of thinking skills
that have a popular association with fields often labelled as scientific.
We do not wish here to make any contribution to debates concerning
the universality of the scientific method, which we are aware has been
questioned (e.g. Latour & Woolgar, 1986). We make no claim that
these skills, the approaches they support, or the outcomes they lead
to, possess any such perfect universality. We do assume, however,
that they are invaluably useful in certain problem-solving situations
such as the type investigated here.) The two studies reported here focus upon the role of theory-prediction consistency in solving multicausal problems involving covariation evidence. With Kuhn et al.
(1988), we have focused upon that most fundamental type of theory
in scientific thinking: a theory asserting a relationship between one
category of phenomena and another. In our studies, the investigator
first explained the domain-specific knowledge of the theory and only
children who had mastered this knowledge were involved in the main
part of the investigation. What was needed to complete the theory
and be able to make successful predictions was identification of the
membership of the causal category (in all our tests a single member)
through consideration of the covariation evidence. It should be noted,
therefore, that the term ‘theory’, in the sense used in this report, refers to identification of the single member of the causal category and
not to the underlying generative mechanism. Also, in both studies,
prior to making a prediction about the outcome of a test, children
were asked to select one cause from a set of potential causes that
they considered responsible for the effect (their theory) and allowed to
review this selection before the next prediction. Thus, in so far as the
children were making predictions about tests being selected and simulated by a computer, the scenario might be considered closer to a
quasi-experimental situation than fully experimental since they did
not have control over manipulating independent variables.
The first study investigated whether children’s ability to identify
causation was associated with their theory-prediction consistency and
allowed the types of strategies being used by the children to be characterised. The second study investigated the effects of a simple intervention aimed at encouraging theory-prediction consistency and
also whether the type of goal had any discernible influence upon
outcomes.
Study 1
If theory-prediction consistency is a critical factor influencing the
success of children in multicausal problems, then, amongst a sample
of children who are developing their scientific thinking skills, it should
be associated with the ability to identify cause. The chief objective of
this initial study was to test this hypothesis and also to investigate the
extent of theory-prediction consistency amongst such children. It was
also hypothesised that success in correctly identifying the cause would
be associated with other factors, such as the degree to which rejection
of theories is prediction-driven and also with the extent of reflective
pausing prompted by disconfirmatory evidence. These latter hypotheses were tested in Study 1, in order to appraise the usefulness of such
measures as indicators of emerging problem solving strategies. The
study involved three computer-based tasks. The first was to check
understanding of the domain and the nature of the task, the second
was to identify the children whose knowledge of covariance was
developing and the third allowed these children’s developing skills to
be studied in a situation where evidence was accumulating.
Method
Design
This was a correlation study measuring associations between the ability to identify cause (with the theory score as the number of occasions
when the correct cause was identified) and a range of other measures
of participants’ behaviour chosen as potential indicators of progress.
Experimental hypotheses (see below) were based on predicted
associations between these variables.
Participants
The participants were 80 randomly-selected children in Year 3 (aged
7–8 years) attending three urban Primary Schools in South Wales (32
boys and 48 girls, M = 7;11 years; range = 7;4–8;9 years).
Procedure
Domain Task
All children first received a short tutorial about electricity and were
assessed for their understanding of simple cause–effect relationships
within this domain, in terms of a conducting circuit allowing a battery to light a bulb. This involved being presented with a simple circuit comprising a battery, bulb and a break in the circuit where tubes,
said to contain different materials, could be inserted. Using this real
circuit, the children were shown how, if the material inside the tube
was a conductor, the bulb would light. They were then shown that if
the material was not a conductor the bulb would not light. They were
shown how such a circuit could be used by a scientist to identify whether a material was a conductor or not and they were asked to do
this for a pair of materials hidden inside tubes numbered ‘1’ and ‘2’.
They were told that just one of the pair was a conductor. The tutorial
was followed by a simulation task on the computer involving sequential presentation of evidence from a similar simple circuit (see Figure 1). This first computer-based task involved observing a simulation
of a scientist’s tests upon two materials using the simple circuit. Prior
to each occasion that one of the two materials appeared in the circuit,
the children were asked to enter the number of the material they
thought was the conductor (i.e. their theory). With the material to be
tested in the circuit, the children were then asked to make a prediction about whether the bulb would light or not by pressing one of
two keys (marked with a dark bulb and a lit bulb). After a 0.5 second
delay, the children were informed by the program whether their prediction was correct, with the word ‘RIGHT’ or ‘WRONG’ displayed
for 1 second. The materials that had been tested and the outcome of
the test (in terms of whether the bulb lit or not) were then stored in
the top left hand corner of the screen. A block of ten tests (five identical tests on each of the two materials) was presented in this way.
The children’s ability to engage with the computer and to evaluate
simple cause–effect relationships in the chosen domain was determined by whether they could make consistently correct predictions by
the time they reached the second half of the block.
Figure 1. Sequential presentation of evidence from a simple circuit (screen prompt: ‘Will the bulb light?’).
Cumulative covariation task. Following this, the children were assessed
for their ability to use cumulative covariation evidence. They were
introduced to a parallel circuit using a physical demonstration. This
circuit was similar to the simple circuit in all respects except that
there was not one break but two breaks in parallel. It was explained
that two tubes of different materials could be put together into this
circuit, but only one had to be a conductor for the bulb to light. It
was shown that the conducting material could occupy either of the
breaks and the bulb would light, but if materials that were not conductors occupied both breaks then the bulb would not light. It was
stated that a scientist wishing to identify a conductor amongst a set
of materials could use this parallel circuit to test them. It was explained that this could be done by putting in different pairs of materials and looking to see whether the bulb lit up or not and that, after
having done this a few times, one could work out from the results
which was the conductor. No further explanation was provided about
how the solution could be derived from this procedure. The children
were then assessed for their ability to use covariation evidence in a
second computer-based task using a cumulative presentation of evidence in a simulation scenario. Here, their attention was drawn to a
computer screen displaying the results of tests by a scientist who had
used a parallel circuit to identify one conductor out of a group of
materials. They were consecutively shown four complete sets of evidence accumulated from tests involving each of 3, 4 and 5 materials (12 sets
in total). For each set, the pairs tested consisted of all possible permutations of materials once (i.e. all combinations twice), so that there
were 6 pieces of evidence for each set of 3 materials, 12 pieces of
evidence for each set of 4 materials, and 20 pieces of evidence for each
set of 5 materials. For each set, the children were asked to enter, via
the keyboard, the number of the material they thought was the conductor. They were told that, for each set, there was just one conductor. The number of times, out of 12 sets, for which they correctly
identified the conductor was recorded as a measure of the children’s
ability to use covariation evidence. Figure 2 shows a typical screen
display from this task.
Figure 2. Cumulative presentation of evidence from a parallel circuit (screen prompt: ‘Which is the conductor?’).
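The structure of an evidence set in this task can be sketched in a few lines. The sketch below is an illustration under stated assumptions rather than the software used in the study: the function names are hypothetical, materials are simply numbered from 1, and the bulb is assumed to light exactly when the single conductor occupies either break.

```python
from itertools import permutations


def ordered_pairs(n_materials):
    """All ordered pairs of distinct materials numbered 1..n_materials."""
    return list(permutations(range(1, n_materials + 1), 2))


def bulb_lights(pair, conductor):
    """Parallel circuit: the bulb lights if either break holds the conductor."""
    return conductor in pair


def consistent_materials(evidence, n_materials):
    """Materials whose presence covaries perfectly with the bulb lighting."""
    return [
        m for m in range(1, n_materials + 1)
        if all((m in pair) == lit for pair, lit in evidence)
    ]


# Sizes of the evidence sets for 3, 4 and 5 materials.
set_sizes = [len(ordered_pairs(n)) for n in (3, 4, 5)]

# A complete evidence set for 4 materials in which material 2 conducts.
evidence = [(pair, bulb_lights(pair, 2)) for pair in ordered_pairs(4)]
candidates = consistent_materials(evidence, 4)
```

Here `set_sizes` reproduces the 6, 12 and 20 pieces of evidence described above, and only the true conductor remains in `candidates` once the whole set is considered.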
Fifteen children had some difficulty (i.e. were not able to achieve a
perfect prediction score in the second half of the block) with the first
computer-based task involving the simple circuit. These children
(19%) were excluded from the study on the basis of insufficient
understanding of the domain, the information provided about the
task or the computer procedures associated with it. Informal observation and interview revealed occasional misconceptions about the role
of the material in the circuit, such as considering it as a power source
rather than as either a conductor or a break in the circuit. A less than
perfect score in the second half of the block was also sometimes associated with difficulties in understanding that the causal agent did not
change during the block, or with using keys that were not involved in
the task or simply with errors when pressing the keys to indicate a response. The 23 children (29%) who achieved a perfect score (12) in
the second computer-based task (assessment of ability to use covariation evidence) were also excluded from the study, since these children
appeared to have little difficulty in applying scientific method in the
consideration of covariance information involving up to five potential
causes. The remaining 42 children (53%) were judged as understanding the domain and engaging appropriately with the computer, but
possessing a range of developing ability to consider covariation evidence. Twenty-four children were randomly selected from this group (10 boys
and 14 girls, M = 7;11 years; range = 7;4–8;4 years). The mean and
standard deviation of their scores in the preliminary assessment were
4.79 and 2.60, respectively. These children were introduced to a third
computer-based task involving the sequential presentation of evidence
from a parallel circuit.
Sequential covariation task. This task involved a computer simulation
in which a scientist was using a parallel circuit to test pairs of materials to determine which one was the conductor. All participants were
told that they needed to use the test outcomes to identify the conductor and be able to predict whether the bulb would light or not. Prior
to each pair of materials appearing in the circuit, the children were
asked to enter the number of the material they thought was the conductor. With the pair of materials to be tested in the circuit, but prior
to the rest of the circuit being completed, the children were asked to
make a prediction about whether the bulb would light by pressing
one of two other keys on the computer (marked with a dark bulb and
a lit bulb). The children were told whether their prediction was correct by the program and the results of tests were stored in the top left
hand corner of the screen. The children were presented with 3 blocks
of increasing complexity, comprising trials arising from identifying the
conductor amongst a set of 3, 4 and then 5 materials. Permutation of
materials that composed the test pairs was as previously described.
Presentation orders of the pairs were randomised except that no two
materials were tested twice in the same half of the block. This constrained effective application of matching strategies to the second half
of the presentation sequence. Figure 3 shows a typical screen display
arising from this task. The computer recorded the children’s theories,
how long they took to respond with them and their predictions. At
the end of all three blocks, the children were asked to explain how
they had tried to identify the conductor, using the initial question:
‘‘Can you tell me how you worked out that (number) was the conductor?’’, followed by the further prompt: ‘‘So when we’ve got two numbers like this next to a lit up bulb, how does that help us to work out
which one’s the conductor?’’. The responses to these questions were
recorded on tape and transcribed for analysis.
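One way of producing a presentation order with the constraint described above is sketched here. It rests on an interpretation that is an assumption: that each pairing of two materials occurs once in each half of the block, in opposite arrangements. The function name and the use of a seeded random generator are likewise illustrative.

```python
import random
from itertools import combinations


def presentation_order(n_materials, rng=None):
    """Randomised block in which each pairing appears once per half."""
    if rng is None:
        rng = random.Random()
    first_half, second_half = [], []
    for a, b in combinations(range(1, n_materials + 1), 2):
        arrangements = [(a, b), (b, a)]
        rng.shuffle(arrangements)          # which arrangement comes first
        first_half.append(arrangements[0])
        second_half.append(arrangements[1])
    rng.shuffle(first_half)                # randomise order within each half
    rng.shuffle(second_half)
    return first_half + second_half


order = presentation_order(5, random.Random(0))
```

Under this scheme a child cannot find an earlier test on the same two materials until the second half of the block, which is when a matching strategy could begin to succeed.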
Figure 3. Sequential presentation of evidence from a parallel circuit (screen prompt: ‘Will the bulb light?’).
Measures
Progress in the sequential covariation task was measured by a theory
score, calculated as the total number of occasions that the child
correctly reported the number of the material that was conducting
electricity and causing the bulb to light.
The following measures were chosen as indicators that might be
reasonably associated with participants’ progress in the problem-solving task:
– Ability to predict whether the bulb would light or not. This was measured using a prediction score equal to the number of correct
predictions of lighting.
– Tendency to base predictions on current theory of causation. This
was measured using a theory-prediction consistency score equal to
the number of occasions when the prediction could be reasonably
derived from a correct application of the expressed theory –
irrespective of whether the theory was correct or not.
– Differentiating appropriately between confirmatory evidence and disconfirmatory evidence. This was measured by calculating the difference time: the time taken to enter a theory following an incorrect prediction
compared to the time taken following a correct prediction. Whereas confirmatory evidence
requires only that the existing theory be maintained and applied
again, disconfirmatory evidence might reasonably prompt an additional pause whilst a theory is revised. Indeed, it has often been
noted that such discontinuous transitions in strategy use are
marked by a critical slowing down (Van der Maas & Jansen, 2003).
– Appropriately revising a theory. This was measured as the percentage of occasions of Incorrect Prediction that were followed by
Rejection of the current theory (IPR).
– Inappropriately revising a theory. This was measured as the percentage of occasions of Correct Prediction that were followed by Rejection of the current theory (CPR). Unlike the other measures,
progress may be associated with decreases in this indicator.
The experimental hypotheses were that these dependent variables
would be positively correlated with each other, except for CPR,
which would be negatively correlated. Alpha was set at 0.05.
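The measures listed above can be expressed as a scoring routine over a log of trials. This is a minimal sketch under assumptions, not the program used in the study: the `Trial` record and `score_block` function are hypothetical, the prediction score is taken to be the number of correct predictions, a theory counts as rejected when the next theory entered differs from it, and the difference time is taken as the mean theory-entry time after incorrect predictions minus the mean after correct ones.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    theory: int            # material the child named as the conductor
    pair: tuple            # the two materials placed in the circuit
    predicted_lit: bool    # the child's prediction for this test
    theory_latency: float  # seconds taken to enter the theory


def score_block(trials, conductor):
    """Compute the block measures from a list of Trial records."""
    correct = [t.predicted_lit == (conductor in t.pair) for t in trials]
    consistent = [t.predicted_lit == (t.theory in t.pair) for t in trials]

    # Group each later trial by the outcome of the preceding prediction.
    after = {True: [], False: []}
    for prev, nxt, ok in zip(trials, trials[1:], correct):
        after[ok].append((nxt.theory != prev.theory, nxt.theory_latency))

    def pct_rejected(records):
        if not records:
            return None
        return 100 * sum(rejected for rejected, _ in records) / len(records)

    def mean_latency(records):
        return mean(lat for _, lat in records) if records else None

    slow, fast = mean_latency(after[False]), mean_latency(after[True])
    return {
        "theory_score": sum(t.theory == conductor for t in trials),
        "prediction_score": sum(correct),
        "tpc_score": sum(consistent),
        "IPR": pct_rejected(after[False]),
        "CPR": pct_rejected(after[True]),
        "difference_time": None if None in (slow, fast) else slow - fast,
    }
```

Note that the theory-prediction consistency score is computed without reference to the true conductor, so a child can score perfectly on it while holding an incorrect theory.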
Results
The experimental hypotheses concerned data arising from the third
computer-based task, referred to above as the sequential covariation
task. These hypotheses were that the theory score would be positively
correlated with prediction score, theory-prediction consistency,
difference time in entering a theory following an incorrect prediction
compared to the time in entering a correct prediction and IPR. Additionally, it was hypothesised that the theory score would be negatively
correlated with CPR. For the blocks containing 3, 4 and 5 materials,
the chance theory scores were 2, 3 and 5, respectively. The children did
not achieve theory scores significantly above chance for the blocks containing 3 materials (t(23) = 1, p = ns.), the blocks containing 4
materials (t(23) < 1, p = ns.) or the blocks containing 5 materials (t(23) < 1, p = ns.). For the blocks containing 3, 4 and 5
materials, the chance scores for prediction were 3, 6 and 10, respectively. The children scored significantly above chance for the blocks
containing 3 materials (t(23) = 2.2, p < 0.05), the blocks containing 4
materials (t(23) = 3.2, p < 0.05) and the blocks containing 5 materials
(t(23) = 2.5, p < 0.05). For the blocks containing 3, 4 and 5 materials,
the chance scores for theory-prediction consistency were 3, 6 and 10,
respectively. The children scored significantly above chance for the
blocks containing 3 materials (t(23) = 3.3, p < 0.05) and the blocks
containing 5 materials (t(23) = 4.1, p < 0.05), but not for the blocks
containing 4 materials (t(23) < 1, p = ns.). Mean scores (with standard deviations in parentheses) for theory-prediction consistency
across the blocks containing 3, 4 and 5 materials were 3.63 (0.92), 6.21
(2.41) and 12.54 (3.02).
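The comparison with chance for theory-prediction consistency can be checked from the summary statistics alone, using the one-sample t statistic t = (M - chance) / (SD / sqrt(N)). The sketch below assumes that the reported standard deviations are sample standard deviations and that N = 24 in every block; the function name is illustrative.

```python
from math import sqrt


def one_sample_t(sample_mean, sample_sd, n, chance):
    """One-sample t statistic for a mean compared with a chance value."""
    return (sample_mean - chance) / (sample_sd / sqrt(n))


# Reported mean, SD and chance score for blocks of 3, 4 and 5 materials.
t3 = one_sample_t(3.63, 0.92, 24, 3)
t4 = one_sample_t(6.21, 2.41, 24, 6)
t5 = one_sample_t(12.54, 3.02, 24, 10)
```

These come to roughly 3.35, 0.43 and 4.12, in line with the reported t(23) values of 3.3, less than 1 and 4.1, allowing for rounding of the published means and standard deviations.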
Scatter plots indicated associations according to the experimental
hypotheses, with the exception of any discernible association of theory
scores with IPR or CPR. All distributions passed Kolmogorov–
Smirnov tests of normality except for CPR. A non-parametric test of
association (Spearman’s rho) was applied to investigate any association
between this variable and the theory score of the participants but did
not reveal a significant correlation (rs = 0.063, p = ns.). To test
associations between the other parameters, correlation coefficients
(Pearson’s r) were calculated. This analysis revealed significant associations between theory score and prediction score (r = 0.63, p < 0.005),
theory score and theory-prediction consistency score (r = 0.68,
p < 0.005), theory score and difference time in entering a theory following an incorrect prediction compared to the time entering a correct prediction (r = 0.56, p < 0.005), but no statistically significant
association between theory score and IPR (r = 0.24, p = 0.250).
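For readers unfamiliar with the two coefficients used above, a minimal sketch of how they are computed follows. It uses only the standard library; the function names are illustrative and the example scores are invented for demonstration, not data from the study. Spearman's rho is obtained here as Pearson's r applied to ranks, with tied values sharing their average rank.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented scores for five hypothetical participants.
theory_scores = [1, 2, 3, 4, 5]
other_scores = [1, 4, 9, 16, 25]
r = pearson_r(theory_scores, other_scores)
rho = spearman_rho(theory_scores, other_scores)
```

Because the rank-based coefficient depends only on ordering, it does not require the normality assumption that CPR failed to meet.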
Three children referred to generating or rejecting hypotheses about
which material was the conductor, but were unable to explain how
they had done this or relate their decisions coherently to the evidence.
Five other children did refer to the evidence when explaining their
selection or rejection of a hypothesis but only one child amongst the
sample was able to convincingly describe a strategy based upon the
covariation principle. The five explanations categorized as ‘other’ included consideration of which material had been the conductor previously and a calculation-based strategy involving the numbers assigned
to the materials. Three children were unable to provide any attempt
at explaining how they had tried to solve the problem, saying that
they did not know or had guessed. The explanations of seven children
were too incoherent to provide insights into their strategies. These difficulties in using an explanation-based methodology to identify strategies echo reports of other studies. Children’s expression of their
understanding often lags behind their understanding (e.g. Flavell,
1985) and, even when children appear able to provide coherent explanations, doubts must remain about the extent to which these are
influenced by post-hoc justification.
[Figure 4, panels (a)–(e). Each panel contains two plots against trials (1–12): the material (1–4) named as the current theory, marked according to whether the following prediction was consistent or inconsistent with that theory and whether the prediction was correct; and latency in seconds (0–15).]
Figure 4. The trial-by-trial progress made by participants pursuing strategies classified as (a) mature (perfect or near-perfect theory-prediction consistency and rapid
movement to a correct theory that is then maintained throughout the remaining trials, with a rapid decrease in response time after the correct theory is identified), (b) random
(theory-prediction consistency which is near-chance and with no apparent consideration of test outcome when revising theory, resulting in near-chance prediction and
theory scores), (c) pattern matching (almost perfect prediction scores in the latter half
of the block, but with near-chance theory-prediction consistency and near-chance theory scores throughout the block), (d) prior belief (theory-prediction consistency well above chance
but with persistent retention of a theory in the face of poor prediction performance),
(e) vacillation (theory-prediction consistency well above chance, only temporarily settling upon correct theory and then abandoning it).
Graphical representation of the trial-by-trial decisions and progress
of the children was more helpful in revealing the strategies used in
each block. This was achieved by plotting the current theory, the subsequent prediction success, the theory-prediction consistency and the
time taken to produce the theory for each trial. Figure 4 provides an
example of such a plot. A mature application of scientific method to
the covariation data was typified by perfect or near-perfect theory-prediction
consistency, and the rapid movement from an incorrect
theory to a correct theory that was then maintained throughout the
rest of the block (see Figure 4(a)). Here, it was also observed that initial response times were longer, reflecting slower responses when evidence was being scrutinized more carefully. These response times
decreased considerably as soon as the participant became confident
that his/her theory about the cause was correct and thus responded
more rapidly.
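The quantities plotted for each trial can be made concrete with a small sketch. It assumes, for illustration, that each test presents two materials and that a prediction counts as theory-consistent when the child predicts that the bulb will light exactly when the material they currently name as the conductor is among those tested. The `Trial` record, its field names and the example data are hypothetical and are not the study's software or data.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    theory: int          # material the child currently names as the conductor
    tested: tuple        # the two materials placed in the circuit on this trial
    predicted_lit: bool  # the child's prediction: will the bulb light?
    lit: bool            # the observed outcome

def is_consistent(trial):
    """True when 'bulb lights' is predicted exactly when the currently
    held conductor is among the tested materials (assumed definition)."""
    return trial.predicted_lit == (trial.theory in trial.tested)

def block_summary(trials):
    """Count theory-consistent predictions and correct predictions in a block."""
    return {
        "theory_prediction_consistency": sum(is_consistent(t) for t in trials),
        "prediction_score": sum(t.predicted_lit == t.lit for t in trials),
    }

# Hypothetical four-trial record in which material 2 is the true conductor.
example = [
    Trial(theory=1, tested=(1, 3), predicted_lit=True, lit=False),
    Trial(theory=1, tested=(2, 4), predicted_lit=True, lit=True),
    Trial(theory=2, tested=(2, 3), predicted_lit=True, lit=True),
    Trial(theory=2, tested=(1, 4), predicted_lit=False, lit=False),
]
summary = block_summary(example)
print(summary)
```

In this made-up record the second prediction is correct but inconsistent with the stated theory, which shows why prediction score and theory-prediction consistency need to be tracked separately.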
There were many examples of an apparently random approach,
characterised by theory-prediction consistency which was near-chance
(50%) and with no apparent consideration of test outcome in
choosing a theory, resulting in near-chance prediction scores and
near-chance theory scores (see Figure 4(b)). With respect to the
trial-and-error nature of this approach and absence of appropriate
strategy, this might be said to resemble Vygotsky’s ‘vague syncretic’
stage of conceptual development (Vygotsky, 1935). However, it
should be pointed out that at least one of the approaches alluded to
in the children’s explanations (calculation – see above) was systematic
and non-random in its approach but still gave rise to outcomes which
resulted in this classification. Thus, in the strictest sense, the term
random is used here to refer to a group of strategies that produced
apparently random outcomes, rather than one strategy that was
purely random in its underlying approach.
Table 1. The distribution of participants using different strategies in each block
Strategy                                                      1st block      2nd block      3rd block
                                                              (3 materials)  (4 materials)  (5 materials)
Mature                                                        1              2              3
Good theory-prediction consistency with some vacillation      0              1              0
Good theory-prediction consistency hampered by prior belief   1              0              4
Pattern matching                                              4              6              8
Random                                                        18             15             9
The types of strategies identified were mature, periods of good theory-prediction consistency but with vacillation, good theory-prediction
consistency hampered by an unwillingness to depart from a prior belief, pattern matching and random.
There were several examples of children gaining an above-chance
prediction score using the pattern matching strategy identified by
Kuhn et al. (1988) and Downing et al. (1985). Here, the child appeared to be matching global information (e.g. “there is now a 1 followed by a 2”) with previous occurrences of a similar type of instance
(e.g. “there was a 2 followed by a 1 in this previous test”) to make a
successful prediction. As would be expected, this produced some limited success in terms of predictions but was not effective at all in
terms of identifying the cause. In the task presented to the children,
combinations in the second half of the block were repeated from the
first half (but with some permutation of order of the two potential
causes presented). Thus, this strategy was associated with almost
perfect prediction scores in the latter half of the block, but with
near-chance theory-prediction consistency and near-chance theory
scores throughout the block (see Figure 4(c)). This would explain why
it was found that, overall, the children were scoring significantly
above chance for their prediction scores but scoring at chance levels
for their theory scores.
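Why pattern matching yields good predictions late in a block without any causal theory can be illustrated with a short simulation. This is a sketch under stated assumptions: a 4-material block of 12 tests in which the second half repeats the first-half pairs in reversed order, a simulated child who simply recalls the outcome of the same pair and guesses "lit" for an unseen pair, and an arbitrarily chosen conductor. None of these details are taken from the task software.

```python
def pattern_matching_predictions(tests, outcomes):
    """Predict each outcome by recalling an earlier test with the same pair of
    materials (ignoring order); guess 'lit' when the pair has not been seen."""
    memory = {}
    predictions = []
    for pair, outcome in zip(tests, outcomes):
        key = frozenset(pair)  # order of the two materials is ignored
        predictions.append(memory.get(key, True))
        memory[key] = outcome  # store what actually happened
    return predictions

CONDUCTOR = 2  # hypothetical choice of the conducting material

first_half = [(1, 2), (3, 4), (2, 3), (1, 4), (2, 4), (1, 3)]
second_half = [(b, a) for a, b in first_half]  # same pairs, order permuted
tests = first_half + second_half
outcomes = [CONDUCTOR in pair for pair in tests]

predictions = pattern_matching_predictions(tests, outcomes)
half = len(first_half)
first_half_correct = sum(p == o for p, o in zip(predictions[:half], outcomes[:half]))
second_half_correct = sum(p == o for p, o in zip(predictions[half:], outcomes[half:]))
print(first_half_correct, second_half_correct)
```

The simulated child never represents which material is the conductor, yet every second-half prediction is correct, whereas first-half predictions are only as good as the guessing rule.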
There was some evidence that even when children produced
predictions that were consistent with their theory, they sometimes
retained their theory even when faced with unexpected outcomes.
The tendency to retain a prior belief in the face of conflicting evidence has been well-documented in children and even some adults
(Kuhn et al., 1988). Strategies involving some theory-prediction
consistency but hampered by prior belief were characterised in the
present study by levels of theory-prediction consistency well above
chance (75% or above) but with persistent retention of a theory
in the face of poor prediction performance (see Figure 4(d)). In one such
example in Study 1, the child eventually abandoned their theory-prediction consistency, possibly in a misguided attempt to avoid
abandoning their theory. There was also one example of what
may be an additional vacillation stage preceding mature conceptualisation of the problem. In the block involving 4 materials, this
participant displayed theory-prediction consistency of 75%, rejected
their prior theory and even reached the correct solution but then
discarded it after a short period (see Figure 4(e)). In the next
block, the same participant displayed perfect theory-prediction consistency, settled efficiently upon the correct solution and stayed
with it. Table 1 shows the distribution of participants using the
different strategies in each block.
Discussion
The children were performing at chance level in terms of their theory
scores, but achieved above chance for prediction. Children’s success in
identifying causation in this task was strongly associated with
theory-prediction consistency, and was also associated with a tendency to pause longer following an incorrect prediction than a correct
prediction. However, the lack of association between theory scores
and when children rejected a theory (CPR, IPR) was unexpected. A
more detailed analysis of the data revealed that those children who
rapidly developed a mature strategy and achieved higher scores often
tended to initially make more changes to their theory that were both
appropriate and inappropriate, while children who were using random
or pattern-matching strategies tended to change their causal theories
less frequently. This relationship between early variability and later
learning has been observed in a number of other studies (Alibali &
Goldin-Meadow, 1993; Graham & Perry, 1993; Perry & Lewis, 1999;
Siegler, 1995). In the present study, the initial variability of the more
successful problem-solvers extended to frequently abandoning correct
theories in the face of supporting evidence.
Theory-prediction consistency scores were poor in all three blocks.
No child based their predictions on their theories with perfect consistency throughout all three blocks, although those children who identified the causal agent most quickly did come close (with 34 out of 38
predictions consistent with theory in one case). Indeed, for the middle
block (4 materials), theory-prediction consistency scores were not significantly
above chance. Such findings, echoing the types of observations made
by Leach (1999), indicate that this is a problematic area for children
attempting to solve problems that require a scientific approach. However, it cannot be inferred from this study that children lack an ability
to ensure theory-prediction consistency, since their difficulties may
also derive from a lack of awareness that such consistency is desirable
for this type of problem.
Four children stuck steadfastly to their theories throughout the last
block and this approach resulted in very low theory scores. This is,
perhaps, more difficult to understand than the unwillingness to abandon beliefs that was identified by Kuhn et al. (1988) using covariance
tasks with children of a similar age. Kuhn identified the children’s
beliefs within a familiar domain before presenting the children with
evidence that conflicted with these ideas. It seems likely that some of
these beliefs may have been held with conviction for a considerable
period of time. This cannot be said of the theories expressed in the
present study, and yet these four children remained unwilling to give
them up in the face of the accumulating evidence. They appeared to
understand the theory-prediction relationship, but still displayed difficulties in co-ordinating their theories with the evidence that was accumulating.
After strategies classified as random, pattern matching was the most
popular strategy. Only a few developed a mature strategy which was
robust enough to be applied in the final block involving five materials,
but over half of the children exhibited a strategy in at least two consecutive blocks that was more sophisticated than random. More than
half the children applied more than one strategy during the session.
Study 2
Study 2 investigated the effects of a simple intervention that encouraged theory-prediction consistency by encouraging the children to
base their predictions on their ideas and correcting them if they failed
to do so. If the poor theory-prediction consistency observed in Study 1 was
derived from a lack of epistemological knowledge regarding its desirability, then this simple encouragement should bring about a rapid
improvement in achievement with this type of problem in terms of
theory scores. Additionally, it was considered possible that an encouragement to base predictions upon theories might prompt other behaviours associated with evidence-based problems, such as greater
reflection upon unexpected outcomes.
Study 2 also provided the opportunity to determine whether the
type of goal was influencing the achievement of the children. Owen &
Sweller (1985) have demonstrated that high school students given a
“non-specific goal” of finding out how to solve problems in geometry
showed greater understanding of the underlying mathematical
principle than students given the “specific” goal of solving them.
Geddes & Stevenson (1997) have shown similar effects when university students were asked to solve a problem involving a causal relationship. Although these studies were with adults and older children
(see also Vollmeyer et al., 1996), it seemed probable that the type of
goal might also influence the tendency of younger children to generate hypotheses about causal relations. A performance goal,
with an emphasis on maximising prediction success, was therefore expected to be less
effective at encouraging scientific strategies of successful causal investigation, as reflected in lower theory scores, less theory-prediction
consistency and less reflection upon unexpected outcomes. Such a
goal might encourage other approaches, such as pattern matching
strategies. A procedural or, in the terms of Owen & Sweller (1985), a
more non-specific goal, with an emphasis on finding out how, might
be more successful at encouraging a greater depth of thought. Thus, it
was predicted that this goal would give rise to higher theory scores,
greater theory-prediction consistency and more reflection upon unexpected outcomes.
Table 2. Means and standard deviations of preliminary assessment scores for children
in each of the four groups (n = 12 for each group)
                        Intervention    No intervention
Performance goal        Group 1         Group 3
  Mean                  7.25            6.83
  Standard deviation    3.08            3.51
Procedural goal         Group 2         Group 4
  Mean                  6.75            6.75
  Standard deviation    3.74            3.28
Method
Design
Study 2 employed a two-factor between-participants design in which
the dependent variables being measured were, as above, theory score,
prediction score, theory-prediction consistency, and difference time
following an incorrect prediction. IPR and CPR were not included in
this second study, since Study 1 had provided insufficient evidence to
associate these dependent variables with overall success in identifying
causation. The independent variables were intervention (two levels:
with and without intervention) and learning goal (two levels: performance and procedural).
Participants
The original pool of participants comprised 135 randomly selected children
in Year 3 (aged 7–8 years) attending 3 urban primary schools in
South Wales not previously involved in this investigation (72 boys
and 63 girls; mean age = 8;0 years; range = 7;0–8;11 years).
Table 3. Descriptive statistics for performance in the last block (involving 5 materials)
for each of the groups in Study 2
                                Intervention                          No intervention
                                Procedural goal   Performance goal    Procedural goal   Performance goal
                                Group 1           Group 2             Group 3           Group 4
Theory score                    16.33 (4.19)      15.83 (3.64)        11.08 (6.64)      12.00 (7.75)
Prediction score                17.67 (1.61)      16.92 (1.83)        14.08 (4.40)      15.50 (4.21)
Theory-prediction consistency   19.17 (1.70)      19.25 (1.22)        14.92 (4.52)      15.92 (5.16)
Difference time (ms)            5419 (2584)       7764 (6071)         3406 (3170)       3436 (3573)
The table shows means, with standard deviations in parentheses, of theory scores, prediction scores, theory-prediction consistency and difference time (ms) in entering a
theory following an incorrect prediction (n = 12 for each group).
Procedure
A preliminary assessment of the children’s ability was again carried
out. Fifty-seven children (42%) achieved perfect scores and were excluded, and a further 14 children (10%)
were excluded following difficulties with the simple circuit. Of the
remaining 64 children, 48 were randomly selected and allocated to 4
groups arising from the combination of the two conditions. Allocation to
these groups was on the basis of preliminary assessment scores, so
that each group displayed similar distributions of ability (see Table 2).
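The text does not specify how allocation on the basis of preliminary scores was carried out. One common way of producing groups with similar score distributions is to rank participants by score and deal them out in alternating ("serpentine") order; the sketch below shows that approach purely as an illustration. The function name, the participant identifiers and the simulated scores (integers from 0 to 11) are all hypothetical.

```python
import random

def allocate_matched(scores, n_groups=4, seed=0):
    """Rank participants by score, then deal them into groups in serpentine
    order so that each group receives a similar spread of scores."""
    rng = random.Random(seed)
    ids = list(scores)
    rng.shuffle(ids)  # ties between equal scores are broken at random
    ranked = sorted(ids, key=lambda pid: scores[pid], reverse=True)  # stable sort
    groups = [[] for _ in range(n_groups)]
    for start in range(0, len(ranked), n_groups):
        chunk = ranked[start:start + n_groups]
        if (start // n_groups) % 2 == 1:
            chunk = chunk[::-1]  # reverse direction on alternate passes
        for group, pid in zip(groups, chunk):
            group.append(pid)
    return groups

# Hypothetical preliminary scores for 48 children
score_rng = random.Random(1)
scores = {f"child_{i:02d}": score_rng.randint(0, 11) for i in range(48)}

groups = allocate_matched(scores)
group_means = [sum(scores[pid] for pid in g) / len(g) for g in groups]
print([round(m, 2) for m in group_means])
```

With 48 participants and four groups, each group receives 12 children, and the group means differ by less than one point on this simulated scale.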
As in Study 1, children in all groups were presented with 3 blocks
of tests (involving 3, 4 and 5 materials) and then asked how they had
tried to identify the conductor. Again, the children were asked to
enter their predictions via the keyboard to observe how the circuit
behaved. However, the method by which the children reported their
theories was modified in order to make the task more comparable to
previous studies (including those referred to above) in which participants expressed their theories verbally. In Study 2, all participants
were asked to report their current theory verbally to the experimenter,
rather than via the keyboard, prior to making their next prediction.
[Figure 5, panels (a) and (b): flow diagrams linking boxes for the strategy categories mature, vacillation, prior belief (the latter two grouped as “strategies with good theory-prediction consistency”), pattern matching and random, with arrows between boxes labelled by pairs of participant counts. Box counts shown: (a) mature 13, 13; pattern matching 1, 3; random 4, 0; (b) mature 12, 20; prior belief 1, 1; pattern matching 0, 1.]
Figure 5. Retention and changes in strategies for (a) participants who were not
encouraged to base their prediction on their theory (no intervention) and (b) participants who did receive this encouragement (intervention). The first number in each
pair refers to events between the first and second blocks, the second number refers to
events between the second and third blocks. Numbers inside boxes represent the participants who retained a particular strategy, numbers on arrows represent participants
who changed strategies.
At the beginning of each block, children in the intervention condition (groups 1 and 2) were asked to make sure that they based their
predictions on their theories, but were given no advice on how to do
this. In the first half of each block, if a child was about to enter a prediction that could not be derived from their current theory, the experimenter pointed this out to him/her. The children were then given the
opportunity to revise their prediction before entering it into the computer. Children in groups 3 and 4 did not receive this intervention.
All children were told that they needed to identify the conductor so
that they could be sure whether the bulb would light or not. However,
children in the procedural goal condition (groups 1 and 3) were asked
to find out how to do this and be able to explain how to the experimenter: “You need to find out which one is the conductor so that you
can get your predictions right. Your aim is to find out how to do this.
Later on I’m going to ask you how.” This procedural goal was reinforced by asking the children to explain how they had tried to identify
the conductor at the end of each block. Children in the performance
goal condition (groups 2 and 4) were asked to identify the conductor so
that they could say correctly, as many times as possible, whether the
bulb would light: “You need to find out which one is the conductor so
that you can get your predictions right. Your aim is to try to get as
many right as you can. Later on, I’m going to tell you how many you
got right.” This performance goal was reinforced by telling the children
at the end of each block how many correct predictions they had made.
Results
Descriptive statistics (including means and standard deviations) for
dependent variables in the last block (in which the effects of learning
goal and intervention are likely to have reached maximum effect) are
shown in Table 3. One-sample Kolmogorov–Smirnov tests on dependent variables for each group showed normal distributions in all cases.
A MANOVA was carried out on these variables, with independent
variables of Goal (two levels: procedural and performance) and Intervention (two levels: intervention, no intervention). This revealed an
effect of intervention (F(4, 41) = 4.02, p = 0.008), but not an effect of
goal (F(4, 41) = 0.32, p = ns), with no significant interaction effect
(F(4, 41) = 1.07, p = ns.). However, the data for prediction scores,
theory scores and theory-prediction consistency failed Levene’s tests of
homogeneity of variance, and so independent samples t-tests were carried out (with equal variances not assumed) to confirm the effects of
the intervention on theory score, prediction score and difference time
following incorrect prediction. (Further analysis of theory-prediction
consistency was not considered to be of sufficient interest to warrant
inclusion, since differences in this dependent variable were a trivial
outcome of the intervention.) With significance levels set at 0.017
to account for data dependency in these tests, all three dependent
variables appear significantly influenced by the intervention (with
significance values for theory scores, prediction score and difference
time following incorrect prediction: t(35.5) = 2.77, p = 0.009;
t(30.4) = 2.66, p = 0.012; t(41.2) = 2.70, p = 0.010, respectively).
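The unequal-variance comparison for theory score can be approximately reconstructed from the group means and standard deviations given in Table 3. The sketch below pools the two intervention groups and the two no-intervention groups from their summary statistics and then applies Welch's formula. It also assumes that the 0.017 threshold corresponds to a Bonferroni-style division of 0.05 by the three tests, which the text does not state explicitly. All function names are hypothetical.

```python
import math

def pool(groups):
    """Combine (mean, sd, n) summaries of several groups into a single
    (mean, variance, n) summary for the merged sample."""
    n_total = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / n_total
    sum_sq = sum((n - 1) * sd ** 2 + n * (mean - grand_mean) ** 2
                 for mean, sd, n in groups)
    return grand_mean, sum_sq / (n_total - 1), n_total

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, var_a, n_a = a
    mean_b, var_b, n_b = b
    va, vb = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df

alpha_per_test = 0.05 / 3  # assumed origin of the 0.017 threshold

# Theory score: (mean, sd, n) for Groups 1-2 (intervention) and 3-4 (no intervention)
intervention = pool([(16.33, 4.19, 12), (15.83, 3.64, 12)])
no_intervention = pool([(11.08, 6.64, 12), (12.00, 7.75, 12)])

t, df = welch_t(intervention, no_intervention)
print(f"alpha per test = {alpha_per_test:.3f}; t({df:.1f}) = {t:.2f}")
```

Under these assumptions the recomputed statistic comes out at roughly t = 2.76 on about 35.5 degrees of freedom, close to the reported t(35.5) = 2.77; the small discrepancy is consistent with rounding in the published summary statistics.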
The individual progress of each child within each block was analysed as in Study 1. The strategies adopted by each child, in each block,
were categorised using the criteria set out in Study 1 but including
this additional category. Figure 5 shows the strategy changes made by
individuals between the three blocks.
Discussion
Encouraging the children to base their predictions upon
their theory improved their ability to identify the cause. Given that
theory-prediction consistency is problematic for some children, the
fact that support in this area improved their ability to identify causation has practical implications but is theoretically unsurprising.
No instruction was provided about how to use a theory to make a
prediction, although it is possible that feedback, in the form of the
corrections provided by the experimenter, might have supported the
participants in developing this ability. However, the small number of
corrections that were required tends to indicate that many children
already possessed this ability but needed the instruction to apply it.
Encouraging the children to base their predictions upon their theories
also appears to have positively influenced another characteristic of
effective hypothesis-testing behaviour, since children in the intervention groups spent longer reflecting upon outcomes that challenged
their present thinking. It may be that the emphasis upon
theory-prediction consistency activated a particular framework, or
schema, containing other information associated with problems
requiring a scientific approach. Kuhn et al. (2004) suggest that a
major dimension of cognitive development is the increasing role of
metalevel components in monitoring and managing procedural level
skills. Our results suggest that one of these skills is the assigning of
predictive power to one’s theories of causation. Metalevel functioning
has also been divided into metatask understanding and metastrategic
competence (Keselman, 2003). Advances in metatask understanding
may be characterized by, for example, improved understanding of the
need to identify causal attribution rather than pursuing other, non-scientific, outcomes. Advances in metastrategic understanding include
more appropriate selectivity and monitoring of strategies. The effect
of encouraging theory-prediction consistency upon other areas of strategy reminds us of the
likelihood of a close relationship between these metalevel components
if, as we suggest, these additional strategic improvements arise via
improved understanding of task type.
The provision of a procedural learning goal appeared no more
beneficial than a performance goal. This is despite the fact that
self-explanation (used to reinforce the procedural goal) has itself been
previously associated with the enhancement of understanding (e.g.
Chi et al., 1994). It would appear that the combined effects of the
procedural learning goal and the benefits of self-explanation anticipated when children are asked to explain how to identify the cause of
an effect have not been effective in improving their ability to do so.
Some explanation for the lack of learning goal effect in young children may be derived from considering how such an effect may operate in older children and adults. In the study by Geddes & Stevenson
(1997), it is possible that the increased depth of understanding
achieved by those who were asked to find out how an effect operated
was due to an association between such a goal and anticipation of the
need for self-explanation. It may be that children are less aware of, or
less concerned with, how difficult it is to clearly explain extended
unsystematic approaches. Adults, on the other hand, may more
readily anticipate such issues and thus adapt their approach to avoid
having to make explanations that are difficult to articulate or may
make them appear foolish.
Analysis of strategies again revealed variability in strategy use. Experience with the problem generally brought about adaptive changes in
strategy use, with a few examples of children occasionally reverting to
less successful strategies. Comparing Figure 5 (a) and (b), it would
appear that encouraging theory-prediction consistency supported an
increase in the prevalence of strategies with improved theory-prediction
consistency which, in turn, were often precursors to mature problem
solving. Some instances of reversion to less sophisticated strategies may
have been prompted by increased task complexity but may also reflect
the co-existence in time of different ways of thinking about the problem,
as in Siegler’s overlapping waves model (Siegler, 1996).
General discussion and conclusions
This investigation has demonstrated that children do not always use
their theories to generate predictions and that this can undermine
their attempts to identify causation in contexts requiring the application of scientific method. Their rapid improvement, when instructed
and encouraged to be consistent, suggests that this tendency arises, at
least partially, not from a lack of ability to make theory-based predictions but from a failure to apply that ability. Why did children already
possessing this skill not spontaneously apply it? The simplest explanation might be based on a ‘cognitive miser’ model, since the scientific
method requires considerable cognitive resources. Just as Koslowski &
Masnick (2002) have summarized the experimental evidence for causal
reasoning being an essentially empirical activity, we believe prediction
making may be characterized in a similar way. These children may
not have immediately recognized the desirability of strategies based
on a normative model of multicausal variability, but adapted their
strategies appropriately in the face of feedback and more so when
given a clue about the need to test their ideas. Their original
approaches to making predictions may have been inappropriate to the
type of scientific task they were presented with, but we cannot assume
they were necessarily flawed in a more general sense. In the social
domain, where children and adults confront most real-life problems,
the scientific method is only infrequently required. In this domain, the
ideas we verbalise are considered to perform many functions, but are
rarely accurate reports of the generally complex internal representations we use to guide our behaviour. Neither, therefore, can such
expressions be considered as models that may successfully predict
future outcomes in an explicit and scientific manner. Indeed, even in
simple cases, it has been shown that children’s predictions of future
social behaviour are often not in line with the dispositional implications of past behaviour (Rholes et al., 1990; Newman & Ruble, 1992).
Asking children to base their predictions upon their ideas may, therefore, be sending a clear message that the problem requires a theory-based and scientific approach. Such an explanation is supported by
the fact that instruction and encouragement to make theory-based
predictions also prompted greater reflection upon unexpected
outcomes of tests. However, if this is a true explanation of what
occurred, then a cue, such as a procedural learning goal, that prompts
adults and older children to think more analytically about a problem
does not always work for younger children. This may be because
young children, when asked to determine how to solve a problem,
may possess insufficient experience to want to avoid the difficulty of
explaining an unsystematic approach.
Such conclusions about the relationship of theory-prediction consistency with other behaviours must remain tentative but are worthy
of further investigation. What remains clear from the present study is
that children can profess a theory without using it and this lack of
theory-prediction consistency limits the successful application of scientific method in problems requiring it. In addition to theory-evidence
co-ordination, theory-prediction consistency thus deserves consideration as a developing area of understanding in children that is crucial
to scientific problem solving.
Finally, the trial-by-trial analysis of the children’s progress, allied
to other recent microgenetic methodologies, has again proved fruitful
in allowing a re-examination of a problem solving process. As well as
identifying an additional factor influencing children’s success in scientific inquiry, the study has again highlighted the potential usefulness
of computer-generated feedback in supporting the development of
children’s problem-solving skills. The potential advantages of computers in supporting children’s conceptual development have been proposed for some time (e.g. Chaillé & Littman, 1985) and it may well be
that the rapid rate of testing and feedback facilitated the speed with
which adaptive change occurred. (Participants observed 38 outcomes
of tests in a period of around 25 minutes.) The findings of this study
support the notion that computer-mediated simulations of scientific
inquiry are themselves a worthy focus of investigation for those
wishing to accelerate children’s acquisition and development of
problem solving strategies.
Acknowledgements
This work was initiated with help of the late Professor Rosemary
Stevenson and was made possible by a grant from the Spencer
Foundation. The data presented, the statements made, and the views
expressed are solely those of the authors.
References
Alibali, M.W. & Goldin-Meadow, S. (1993). Gesture-speech mismatch and mechanisms
of learning: What the hands reveal about a child’s state of mind. Cognitive Psychology
25: 468–523.
Chaillé, C. & Littman, B. (1985). Computers in early education: the child as theory
builder. Children and Computers: New Directions for Child Development 28: 5–18.
Chi, M.T.H., de Leeuw, N., Chiu, M.H. & LaVancher, C. (1994). Eliciting self-explanation improves understanding. Cognitive Science 18: 439–477.
Downing, C., Sternberg, R. & Ross, B. (1985). Multicausal inference: evaluation of
evidence in causally complex situations. Journal of Experimental Psychology:
General 114: 239–263.
Flavell, J.H. (1985). Cognitive development. Englewood Cliffs, NJ: Prentice-Hall.
Geddes, B.W. & Stevenson, R.J. (1997). Explicit learning of a dynamic system with a
non-salient pattern. The Quarterly Journal of Experimental Psychology 50A(4): 742–
765.
Graham, T. & Perry, M. (1993). Indexing transitional knowledge. Developmental
Psychology 29: 779–788.
Keselman, A. (2003). Supporting inquiry learning by promoting normative understanding of multivariable causality. Journal of Research in Science Teaching 40(9):
898–921.
Klahr, D. (2000). Exploring science: The cognition and development of discovery
processes. Cambridge, MA: MIT Press.
Koslowski, B. (1996). Theory and evidence: The development of scientific reasoning.
Cambridge, MA: MIT Press.
Koslowski, B. & Masnick, A. (2002). The development of causal reasoning. In
U. Goswami, (ed.), Blackwell handbook of childhood cognitive development, pp. 257–
281. Blackwell Publishing: Oxford.
Kuhn, D., Amsel, E. & O’Loughlin, M. (1988). The development of scientific thinking
skills. San Diego: Academic Press.
Kuhn, D., Katz, J.B. & Dean, D. (2004). Developing reason. Thinking and Reasoning
10(2): 197–219.
Latour, B. & Woolgar, S. (1986). Laboratory life: The construction of scientific facts.
Princeton NJ: Princeton University Press.
Leach, J. (1999). Students’ understanding of the co-ordination of theory and evidence in
science. International Journal of Science Education 21(8): 789–806.
Newman, L.S. & Ruble, D.N. (1992). Do young children use the discounting principle?
Journal of Experimental Psychology 28: 572–593.
Owen, E. & Sweller, J. (1985). What do students learn while solving mathematics
problems? Journal of Educational Psychology 77(3): 272–284.
Perry, M. & Lewis, J.L. (1999). Verbal imprecision as an index of knowledge in
transition. Developmental Psychology 25: 749–759.
Rholes, W.S., Newman, L.S. & Ruble, D.N. (1990). Developmental and motivational
aspects of perceiving persons in terms of invariant dispositions. In E. Tory Higgins
and Richard M. Sorrentino, (eds.), Handbook of motivation and cognition, vol. 2. The
Guilford Press: NY.
Ruffman, T., Perner, J., Olson, D. & Doherty, M. (1993). Reflecting on scientific
thinking: Children’s understanding of the hypothesis-evidence relation. Child
Development 64: 1617–1636.
Siegler, R.S. (1995). How does change occur: a microgenetic study of number
conservation. Cognitive Psychology 28: 225–273.
Siegler, R.S. (1996). Emerging minds: The process of change in children’s thinking.
New York: Oxford University Press.
Sodian, B., Zaitchik, D. & Carey, S. (1991). Young children’s differentiation of
hypothetical beliefs from evidence. Child Development 62: 753–766.
Van der Maas, H.L.J. & Jansen, B.R.J. (2003). What response times tell of
children’s behavior on the balance scale task. Journal of Experimental Child
Psychology 85: 141–177.
Vollmeyer, R., Burns, B.D. & Holyoak, K.J. (1996). The impact of goal specificity on
strategy use and the acquisition of problem structure. Cognitive Science 20: 75–100.
Vygotsky, L.S. (1935). Mental development of children and the process of learning. In
M. Cole, V. John-Steiner, S. Scribner and E. Souberman, (eds.), (1978) L.S.
Vygotsky: Mind in society. Cambridge (Mass.): Harvard University Press.