Improving Math Learning Through Intelligent Tutoring
and Basic Skills Training
Ivon Arroyo,1 Beverly Park Woolf,1 James M. Royer,2 Minghui Tai,1 Sara English3
1 Department of Computer Science, 2 Department of Psychology, 3 School of Education
University of Massachusetts, Amherst Mass. 01002
{bev, ivon}@cs.umass.edu;[email protected]; [email protected];
[email protected]
Abstract. We studied the effectiveness of a math fact fluency tool integrated with an
intelligent tutor as a means to improve student performance in math standardized
tests. The study evaluated the impact of Math Facts Retrieval Training (MFRT) on
250 middle school students and analyzed the main effects of the training by itself
and also as a supplement to the Wayang Tutoring System on easy and hard items of
the test. Efficacy data shows improved student performance on tests and positive
impact on mathematics learning. We also report on interaction effects of MFRT with
student gender and incoming math ability.
1. Motivation
Memory retrieval is an important skill in mathematics development since problem solving
takes place in a cognitive system constrained by a limited capacity of working memory
[1]. Many students have problems in mathematics in part because they are slow and/or
inaccurate in retrieval of simple math facts from memory. Training the speed and
accuracy of math fact retrieval (MFR) skills has been shown to be effective for students
with learning disabilities, who may show number processing inefficiencies [2]. We
studied how to impact students’ learning through training this mathematical fluency, using
software modules that supplement a traditional mathematics tutoring system.
One hypothesis is that training math facts retrieval would help female students in
particular. An affective gender gap towards mathematics increases as children progress
through the school system, and despite equal performance in mathematics classes and on
individual mathematics projects [3,4], girls lose interest in math-related careers [4,5]. In
addition, girls have consistently under-performed boys on time-based standardized tests in
mathematics (e.g. SAT-M) and this underperformance has been particularly pronounced
among higher-than-average-achieving girls [6].
Girls’ underachievement in timed tests compared to boys has been mainly attributed to
cognitive differences in spatial abilities [7], memory retrieval skills [8]
and strategy use. Gender differences in strategy use have been found in the first grades of
elementary school, suggesting that girls continue to use concrete strategies to solve
arithmetic problems (finger counting) while boys move on to using retrieval from memory
[10]. Continuing to use concrete strategies like these makes math tougher for girls when
they move on to more abstract topics and timed tests. Gender differences in mathematics
performance do not appear to be biological [11], as even those basic skills can be trained
and computational fluency can be enhanced with software-based interventions [12].
However, the advantages of using computer-based tools to train mathematics fact retrieval
skills and their impact on mathematics achievement have not yet been fully investigated.
In addition, students with mild cognitive disabilities and/or emotional disturbances
are also under-represented in mathematics-intensive careers and fail to take additional
mathematics classes. Learning disabilities (LD) do appear to have a biological basis and
there is evidence that students with LD have concrete difficulties with working memory as
well as executive control of math problems and procedural knowledge [1]. As a result,
many students with LD may also persist in using counting strategies (e.g., finger
counting) [13], take longer to solve arithmetic problems and perform poorly in classrooms
and on high-stakes standards-based tests [6]. This population is poorly reached by
traditional methods, and the potential lost when these students are not educated to their
maximum has a large negative impact on society. Learning disability is a complex
multi-factor problem and educational institutions do not provide potent cost-effective
instruction tailored to the individual.
Fig. 1. The Wayang Tutoring System. Gendered affective learning companions talk to students
about the need to practice and to exert effort.
2. The Math Fact Retrieval Hypothesis
The math fact retrieval (MFR) hypothesis suggests that the speed of math fact retrieval,
defined as an individual’s ability to “automatically retrieve correct answers to addition,
subtraction, multiplication and division problems,” is a source of this gap [8]. If students
can quickly and automatically retrieve math facts while taking a mathematics test, for
example, they will have more cognitive capacity to devote to higher-level problem-solving
activities and will also be able to complete the test more quickly. This hypothesis
has been explored as one source for the math performance gap for women and low
performing students. The speed of math-fact retrieval is a significant predictor of middle
school students’ performance on mathematics tests and of college students’ performance
on the mathematics portion of a standardized college admissions test [8]. Males are not
inherently faster than females at retrieving facts from memory: females tend to show an
advantage when retrieval speed is measured for word-naming and sentence
understanding tasks (i.e., verbal processing tasks instead of mathematics tasks). In
addition, the gap can be reduced: when a group of participants was allowed to practice math
fact retrieval before their speed was measured, the gap seemed to disappear among both
Chinese students living in the U.S. and Chinese students living in Hong Kong, though not
for a group of U.S. students [8].
We hypothesized that training students in mathematics fact retrieval (MFR) every day
before using the Wayang math tutoring system would improve their learning, because it
would free up cognitive resources that could be used for learning new math skills. We
used a fact retrieval drill and practice system to provide interventions that emphasized fast
and accurate mathematics fact retrieval along with an intelligent tutor, as described next.
3. The Wayang Tutor and MFR Training Software
Wayang Outpost is an adaptive, web-based multimedia tutoring system that teaches
students how to solve geometry, statistics and
algebra problems of the type that commonly appear on standardized tests [14]. To answer
problems in the Wayang interface, students choose a solution from a list of multiple
choice options (see Figure 1). Wayang provides immediate feedback on students’ entries
by coloring them red or green in the interface. As students solve a problem, they can ask
the tutor for hints that are displayed in a progression from general suggestions to
bottom-out solution. In addition to this domain-based help, the tutor provides a wide range of
meta-cognitive and affective support, delivered by learning companions or agents
designed to act like peers who care about a student's progress and offer support and advice
[15,16]. The learning companions’ interventions are tailored to each student’s needs
according to two models [16]. A simple effort model is used to assess the degree of effort
a student invests to develop a problem solution, and is based on time per action. A linear
regression affect model is used to assess a student’s emotional state; this model is derived
from data obtained from sensors, models and surveys [16,18].
General results showed that low performing students who used Wayang improved on
standardized tests compared to matched groups that did not use the tutoring software. In
addition, students of lower than median math ability learned more than students of high
ability. Similarly positive results indicate that, while all students improved their liking of
and self-concept in math when they used affective pedagogical agents in the tutor, female
high school students responded to affective pedagogical agents better than did male
students.
The Math Facts Retrieval Training software is commercially available and is based on more
than 20 years of laboratory research with problem learners [17] (see footnote 1). The
software provides training and assessment. In the training phase, students study full
digital pages of math facts (e.g. two operand addition/subtraction/multiplication/division
of at most two digit numbers). Students click on each item to hear the answer (to learn or
confirm that their guess was right). In the assessment phase, students are tested for their
accuracy and speed (at the millisecond level). Students say the answer aloud and
immediately hit the space bar, after which the correct answer is spoken back to the
student. Students were instructed to code whether their answers were right or wrong.
Cheating was not an issue as the
goal was to have students think of the answers in their head and hear the feedback. At the
end of the assessment session, students saw a line chart that showed their progress (in
speed and accuracy) compared to the previous assessment session (see Figure 2). Students
frequently became faster as they worked on more pages and progress charts showed their
response times declining, which was a motivation to “go for another round.” While students
generally demonstrate ceiling effects on accuracy, MFR speed predicts performance on SAT-M
problems [8]. This software for math fluency was based on similar software for reading
fluency, created with a similar working memory limitation hypothesis and especially used
with children who had dyslexia.
Fig. 2. Student accuracy (left) and speed (right) as displayed in progress charts in the
Math Facts Retrieval Training Software
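The session-to-session comparison behind the progress charts can be illustrated with a minimal sketch. The record layout (a response time in milliseconds paired with the student's self-coded correctness), the function names and the sample values are all hypothetical; the commercial software's internal format is not described here.

```python
from statistics import mean

def summarize_session(responses):
    """Summarize one assessment session.

    responses: list of (response_time_ms, self_coded_correct) pairs,
    one per math fact. This layout is an assumption for illustration.
    """
    times = [t for t, _ in responses]
    correct = [ok for _, ok in responses]
    return {
        "mean_time_ms": mean(times),
        "accuracy": sum(correct) / len(correct),
    }

def progress(previous, current):
    """Change from the previous session to the current one.

    A negative time change means the student answered faster.
    """
    prev, curr = summarize_session(previous), summarize_session(current)
    return {
        "time_change_ms": curr["mean_time_ms"] - prev["mean_time_ms"],
        "accuracy_change": curr["accuracy"] - prev["accuracy"],
    }

# Made-up sessions for one student (not data from the study).
session_1 = [(1800, True), (2200, True), (2600, False), (1400, True)]
session_2 = [(1500, True), (1700, True), (2000, True), (1200, True)]
change = progress(session_1, session_2)
```

In this made-up example the mean response time drops between sessions, which is the pattern the progress charts made visible to students.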
3. Empirical Studies Using the Tutor and Basic Skills Training
A Spring 2009 study evaluated the impact of using the Wayang Tutor and Math Facts
Retrieval Training with 250 middle school students enrolled in a public school in Western
Massachusetts, United States. The objective was to analyze the main effects of MFR by
itself and as a supplement to the Wayang Math Tutoring System.
Footnote 1: MFR Software, published by Math Success Lab, see http://www.mathsuccesslab.com/
Conditions and Subjects. Middle school students (7th and 8th graders) were randomly
assigned to one of four conditions: 1) use of Wayang Tutor after working on the MFR
Training software for 15 minutes (Wayang-MFR); 2) use of Wayang Tutor alone
(Wayang-noMFR); 3) use of the MFR Training software and then use of other modules and
web sites (e.g., National Library of Virtual Manipulatives, see footnote 2) that did
not tutor (noWayang-MFR); and 4) classroom instruction instead of software instruction or
use of math web sites (noWayang-noMFR). All students had similar exposure time to the
software or math class. The existence of six classes of each grade created a challenge to
match classes to each condition. As a result, either two 7th grade classes and one 8th
grade class were assigned to the same condition, or two 8th grade classes and one 7th
grade class.
Procedure. On the first and last (fourth) day of the study, students completed a
mathematics mock standardized test (counterbalanced, so that half of the students received
test A for the pretest and the other half received test B; on the last day the tests were
reversed for the posttest). Tests A and B were similar in difficulty and consisted of a
combination of easy, medium and hard items that addressed skills covered throughout the
tutoring system. Students also completed an online pretest of computation items (addition,
subtraction, multiplication and division), and their accuracy and speed were
recorded. On the last day, students completed a math facts retrieval posttest within the
MFR software. Speed and accuracy on individual items and averages across items were
recorded. Students also completed the mock standardized test that they had not taken the
first day (A or B). Students using the MFR software used the quiz-game modules, drilling
on single digit multiplication tables, single digit addition, double and single digit
subtraction, and double digit by single digit division, in the fashion described in the
previous section, for about 15 minutes every day. Students using the Wayang software
were directed to the tutoring module (after MFR training in the case of the Wayang-MFR
group), where they progressed through 9 topics, practicing in each topic on the problems
assigned by an adaptive pedagogical module. Students were encouraged to request hints
via the help button and to remember that the goal was to learn from the software. Students
with learning disabilities were identified by the fact that they had Individual Educational
Plans (IEP) [18].
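The counterbalancing of test forms described in the procedure can be sketched as follows. This is only an illustration: the shuffling step, the fixed seed and the function name are assumptions, since the text does not state how students were split between the two orders.

```python
import random

def counterbalance(student_ids, seed=0):
    """Assign each student a (pretest_form, posttest_form) pair.

    Half of the students take form A first and form B last; the other
    half take the reverse order. Shuffling with a fixed seed is an
    assumption made so that this sketch is reproducible.
    """
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    plan = {}
    for position, student in enumerate(ids):
        plan[student] = ("A", "B") if position < half else ("B", "A")
    return plan

# Ten hypothetical student ids.
plan = counterbalance(range(1, 11))
```

Because every student sees both forms, any difference in difficulty between forms A and B is spread evenly across pretest and posttest.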
Expected Outcomes. We expected the following outcomes: 1) improved performance
on the mock-standardized posttest for the cohort who received MFR training, compared to
those who did not; 2) improved performance for students in the Wayang conditions
compared to the no-Wayang conditions; 3) improved performance for female students
doing MFR training, compared to those females who did not train MFR; 4) improved
performance for low achieving students and students with LD doing MFR training, compared to
low achieving students who did not train MFR.
However, realistically, we were hesitant to predict that 15 minute blocks of MFR training
during 2-3 days would produce improved results at retrieval. In addition, we were also
concerned that taking time away from Wayang for MFR training would be detrimental to
learning from Wayang.
Footnote 2: http://nlvm.usu.edu
4. Experimental Results
Despite the limited exposure time to Wayang, the two groups that received tutoring during
days 1, 2 and 3 improved in the math test by an overall 3%. This is not much compared to
our past studies, but it is reasonable considering that the average student went through
half of the topics in the system, and that it was the first time that Wayang was used with
middle school students (7th and 8th grades). Interestingly, students in the no-Wayang
groups actually decreased in performance, indicating perhaps that students in general did
not want to take yet another test and were less careful during the posttest than during
the pretest (see Figure 3). The effect size for Wayang vs. no-Wayang groups (Cohen’s d)
was 0.39.
Fig. 3. Improvement on hard items of the test (top), especially for
students who used both the tutor and the MFR training. Ceiling
effect on the easy items of the test for all students (bottom)
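For readers unfamiliar with the effect size reported above, the following sketch computes Cohen's d using the pooled standard deviation of two groups. The pooled-SD variant is an assumption (the text does not say which variant was used), and the sample values are made up rather than taken from the study.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd

# Made-up gain scores (posttest minus pretest) for illustration only.
wayang_gain = [2, 4, 6]
no_wayang_gain = [1, 3, 5]
d = cohens_d(wayang_gain, no_wayang_gain)
```

A value of d expresses the group difference in standard deviation units, so it can be compared across studies that use different tests.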
The group with the highest scores at posttest time was the Wayang-MFR group, which received
both Wayang and MFR training (see Figure 3). Items in the test were split into easy and
hard items depending on pretest performance across the whole population, and scores were
computed separately as if there were two pre- and post-tests, an easy and a hard one.
Wayang helped students improve (or maintain, in the case of easy items) their math test
performance compared to the no-Wayang control groups. An ANCOVA for posttest
percent correct, with pretest score as a covariate and Wayang [yes/no] and MFR [yes/no]
as fixed factors, revealed the following: a significant effect for Wayang on posttest
performance (F(222,1)=3.8, p=.05), a non-significant effect for Math Fact Retrieval
Training (p=.97), and a significant interaction effect for Wayang x MFR (F(222,1)=7.9,
p=.005) suggesting a differential impact of a combination of MFR Training and Wayang
on student improvement.
Fig. 4. Students who received MFR Training became faster at responding to simple arithmetic
questions and students who used the Wayang Tutor with MFR performed the fastest (right).
Means and SD.
We analyzed the improvement of students for easy and hard items separately, in part
because students did quite well in the pretest (the overall test was too easy for them). We
generated two pretest and posttest scores, one for the “easier” half and one for the
“harder” half of the items of each of the tests, depending on general performance at each
item at pretest time, across the whole population of students. In addition, because we
wanted to analyze the impact of the interventions on gender and students with learning
disability, we analyzed the following
fixed factors: Wayang, MFR, Gender and MathAbility [low or high achievement; see footnote 3].
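The two splits used in this analysis (easier vs. harder items, and low vs. high math ability by a median split on pretest performance) can be sketched as below. The data layout, the function names and the handling of ties (students exactly at the median are placed in the low group) are assumptions for illustration; the sample answers are made up.

```python
from statistics import median

def split_items_by_difficulty(pretest):
    """Split items into an easier and a harder half.

    pretest: dict mapping student -> dict mapping item -> 0 or 1.
    Items are ranked by the proportion of all students answering
    them correctly at pretest time.
    """
    items = sorted(next(iter(pretest.values())))
    p_correct = {
        item: sum(answers[item] for answers in pretest.values()) / len(pretest)
        for item in items
    }
    ranked = sorted(items, key=lambda item: p_correct[item], reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

def percent_correct(answers, items):
    """Percent of the given items that one student answered correctly."""
    return 100 * sum(answers[item] for item in items) / len(items)

def median_split(scores):
    """Label each student 'high' or 'low' relative to the median score.

    Placing students exactly at the median in 'low' is an assumption.
    """
    cut = median(scores.values())
    return {s: ("high" if v > cut else "low") for s, v in scores.items()}

# Made-up pretest answers for three students on four items.
pretest = {
    "s1": {"q1": 1, "q2": 1, "q3": 0, "q4": 0},
    "s2": {"q1": 1, "q2": 0, "q3": 1, "q4": 0},
    "s3": {"q1": 1, "q2": 1, "q3": 0, "q4": 1},
}
easy, hard = split_items_by_difficulty(pretest)
overall = {s: percent_correct(a, easy + hard) for s, a in pretest.items()}
ability = median_split(overall)
```

Each student then has separate easy-item and hard-item scores, obtained by calling `percent_correct` with the `easy` or `hard` list.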
For EASY items, an ANCOVA revealed a significant effect for Wayang alone
(F(221,1)=10.6, p=.001); a non-significant effect for MFR alone (F(221,1)=.1, p=.7); a
significant main effect for MathAbility (F(221,1)=14.7, p<.001); and a significant
interaction effect for Wayang x MFR (F(221,1)=5.1, p=.025). While a significant
interaction effect reveals that at least two of the means are different (corresponding to the
four groups defined by the combinations of MFR[yes/no] and Wayang[yes/no]), it is not
clear which group(s) are better and which are worse. Bonferroni confidence intervals
make it possible to answer specific questions such as whether one of the treatments is
better than the rest, or whether two of the groups are better than the other two. For
instance, Bonferroni confidence intervals revealed that the Wayang-MFR group had the
significantly highest improvement (higher than the other three groups), and that both
Wayang groups scored higher on easy items of the posttest than the no-Wayang groups.
However, confidence intervals also revealed that the Wayang-MFR group did not do
significantly better than the Wayang-noMFR group, suggesting that MFR Training does not
help to significantly improve performance on easy items. Wayang seems better at doing that.
Footnote 3: Math Ability was determined by a median split on the overall math pretest
performance. 88% of students with an identified learning disability were part of the low
achievement group. More than one third (35%) of students in the low achievement group had
a learning disability, while only 4% of students in the high achievement group had a
learning disability (not necessarily math related).
For HARD items, the ANCOVA revealed again a significant effect for Wayang
(F(222,1)=6.8, p=.01); a non-significant effect for MFR alone (F(222,1)=.5, p=.5); and a
significant interaction effect for Wayang x MFR (F(222,1)=6.8, p=.009). Confidence intervals
revealed that the Wayang-MFR group had significantly higher improvement than the
other three groups on hard items, and that both Wayang groups scored higher on hard
items than the two no-Wayang groups. Bonferroni confidence intervals also revealed
that the Wayang-MFR group did do better than the Wayang-noMFR group, suggesting
that MFR training did help to improve performance on hard items for students who used
Wayang.
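The idea behind Bonferroni confidence intervals for pairwise group comparisons can be sketched as follows. This is a simplified illustration, not the analysis reported here: it uses raw group means rather than covariate-adjusted means, and a normal approximation (z critical values) instead of the t distribution. The gain scores are made up.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist, mean, stdev

def bonferroni_intervals(groups, alpha=0.05):
    """Confidence intervals for all pairwise differences in group means.

    groups: dict mapping group name -> list of scores.
    The family-wise error rate alpha is divided across all pairs
    (Bonferroni adjustment). Normal approximation, for illustration.
    """
    pairs = list(combinations(sorted(groups), 2))
    z = NormalDist().inv_cdf(1 - alpha / (2 * len(pairs)))
    intervals = {}
    for a, b in pairs:
        xa, xb = groups[a], groups[b]
        diff = mean(xa) - mean(xb)
        se = sqrt(stdev(xa) ** 2 / len(xa) + stdev(xb) ** 2 / len(xb))
        intervals[(a, b)] = (diff - z * se, diff + z * se)
    return intervals

# Made-up gain scores for the four conditions (not data from the study).
groups = {
    "Wayang-MFR": [8, 10, 12, 9, 11],
    "Wayang-noMFR": [3, 5, 4, 6, 2],
    "noWayang-MFR": [-2, 0, -1, 1, -3],
    "noWayang-noMFR": [0, 1, -1, 2, -2],
}
intervals = bonferroni_intervals(groups)
```

An interval that excludes zero indicates that the two groups differ at the family-wise level alpha; with four groups there are six pairwise comparisons, so each interval is wider than an unadjusted one.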
An interpretation of these results is that being more math fluent (thanks to the MFR
training) frees up cognitive resources that are essential to approach hard math problems.
Easy items don’t seem to require so many cognitive resources, so the math fluency
training did not make a difference in performance at these easy items.
The advantage of the Wayang-MFR group can be attributed to MFR training only if
students in the MFR groups had gotten faster at retrieving those simple math facts from
memory. Thus, we analyzed the gain in MFR posttest speed and accuracy of students who
received Math Facts training compared to those who did not (see Figure 4). Given that
pretest and posttest accuracy was at ceiling (reasonably, students were highly accurate at
simple arithmetic operations), we analyzed only speed, that is, whether students had gotten
faster. We ran an ANCOVA for Math Facts Speed Posttest (a mean speed for all items in the
MFR Posttest for each student) with Math Facts Speed Pretest as a covariate, and
MFR[Yes/No] and Wayang[Yes/No] as fixed factors. The result was a highly significant effect
for MFR (F(197, 1)=13.9, p<.001). Figure 4 shows that students who received MFR training
were faster to answer those simple math facts at posttest time. A significant effect for
Wayang (F(197,1)=8.6, p=.023) was unexpected, and suggests that using Wayang helps students
become more math fluent, that is, faster at retrieving simple math facts from memory.
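The structure of a two-factor ANCOVA like the ones reported in this section can be sketched in plain Python. The sketch fits posttest on pretest, Wayang, MFR and their interaction by least squares, then obtains the F statistic for one term by comparing the full model with a model that omits that term. The effect coding (-1/+1), the function names and the data are assumptions for illustration; this is not the statistical package or the data used in the study, it computes no p-values, and it omits the Gender and MathAbility factors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def ancova_f(rows, effect):
    """F statistic (1 and N-5 degrees of freedom) for one model term.

    rows: list of (pretest, wayang, mfr, posttest), with wayang and mfr
    given as 0/1 flags. effect: 'wayang', 'mfr' or 'interaction'.
    """
    names = ["intercept", "pre", "wayang", "mfr", "interaction"]
    full, y = [], []
    for pre, wayang, mfr, post in rows:
        w = 1.0 if wayang else -1.0   # effect coding, so that main effects
        f = 1.0 if mfr else -1.0      # are tested at the average of the other factor
        full.append([1.0, float(pre), w, f, w * f])
        y.append(float(post))
    drop = names.index(effect)
    reduced = [[v for i, v in enumerate(r) if i != drop] for r in full]
    rss_full = rss(full, y)
    df_error = len(rows) - len(names)
    return (rss(reduced, y) - rss_full) / (rss_full / df_error)

# Made-up data: posttest = pretest + 10 for Wayang users, plus small noise.
rows = [
    (40, 0, 0, 41), (60, 0, 0, 59), (40, 0, 1, 39), (60, 0, 1, 61),
    (40, 1, 0, 51), (60, 1, 0, 69), (40, 1, 1, 49), (60, 1, 1, 71),
]
f_wayang = ancova_f(rows, "wayang")
f_mfr = ancova_f(rows, "mfr")
```

In this constructed example only the Wayang term has a built-in effect, so its F statistic is large while the F statistic for MFR is essentially zero. A real analysis would use a statistics package, which also reports p-values.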
5. Discussion
Despite the limited exposure to the software (3 days), the Math Facts Retrieval Training
software combined with the Wayang tutor effectively improved students’ performance on a
standardized test and specifically improved learning on hard questions. Hard items on
these tests generally involved several steps and much computation, and MFR training
probably freed up memory resources that were used to think about the problem. In
addition, a ceiling effect for easy items might have made that score harder to improve.
While the Wayang main effect did not surprise us, as we had evidence that Wayang can
improve performance for standardized test items even with short amounts of exposure, the
improvement in students’ speed to retrieve simple arithmetic operation answers from
memory due to Wayang was unexpected. The math facts retrieval speed improvement may be
attributed to the repeated need for computation to solve these problems.
Math Facts Retrieval Training alone (without Wayang) did not help middle school
students perform better at standardized test items, suggesting that MFR training should be
supplemented with appropriate instruction on the test topics for such training to have a
real impact on standardized math test scores. Wayang tutoring seemed better than the
alternative math computer activities that students used in the no-Wayang groups. The fact
that the noWayang-noMFR group did somewhat better than the noWayang-MFR group
can be partly attributed to classroom instruction: a large group of the students in the
noWayang-noMFR group had their regular math class, and the teacher covered some of
the same topics taught by the tutor during math class.
The lack of gender effects, math ability effects, or interaction effects involving gender
or math ability suggests that MFR Training was highly effective for all students, not only
for females or low achieving students. We conclude that MFR Training software is an
invaluable supplement to traditional math intelligent tutoring software, for students of all
levels, both females and males. We plan to continue to include this basic skills training
within our mathematics intelligent tutoring system.
Acknowledgement
This research was funded by an NSF grant, What kind of Math Software works for Girls?
Arroyo, I. (PI) with Royer, J.M. and Woolf, B.P. (#0734060, HRD GSE/RES), and a
grant from the US Department of Education, Institute of Education Sciences (IES), Using Intelligent
Tutoring and Universal Design To Customize The Mathematics Curriculum, to Woolf (PI)
with Arroyo, Maloy co-PIs. Any opinions, findings, conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views
of the funding agencies.
References
1. Geary, D. C., Hoard, M. K., Byrd-Craven, J., Nugent, L., Numtee, C. (2007). Cognitive
Mechanisms Underlying Achievement Deficits in Children With Mathematical Learning
Disability, Child Development, Vol 78; # 4, pp 1343-1359.
2. Royer, J. M., & Tronsky, L. N. (1998). Addition practice with math disabled students
improves subtraction and multiplication performance. In T. E. Scruggs and M. A.
Mastropieri (Eds.), Advances in Learning and Behavioral Disabilities (Vol 12).
Greenwich, Conn.: JAI Press, Inc. (pp 185-218).
3. Catsambis, S. (1994). The path to math: Gender and racial-ethnic differences in
mathematics participation from middle school to high school. Sociology of Education, 67,
199-215.
4. Catsambis, S. (2005). The gender gap in mathematics: Merely a step function? In A. M.,
Gallagher & J. C. Kaufman (Eds.), Gender differences in mathematics. Cambridge, UK:
Cambridge University Press, pp. 220-245.
5. Midgley, C., Feldlaufer, H., & Eccles, J. (1989). Student/teacher relations and attitudes
toward mathematics before and after the transition to junior high school. Child
Development, 60, 981-992.
6. Olson, L., (2005) State Test Programs Mushroom as NCLB Mandate Kicks In. Education
Week, November 30, 2005.
7. Casey, M; Nuttall, R.; Pezaris, E.; Benbow, C. (1995) The influence of spatial ability on
gender differences in math college entrance test scores across diverse samples.
Developmental Psychology, 31, 697-705.
8. Royer, J. M., Tronsky, L. N., Chan, Y., Jackson, S. J., & Marchant, H. G. (1999). Math
fact retrieval as the cognitive mechanism underlying gender differences in math test
performance. Contemporary Educational Psychology, 24, 181-266, pg 196.
9. Eccles, J., Wigfield, A., Harold, R. D., & Blumenfeld, P. (1993). Age and gender
differences in children’s self and task perceptions during elementary school. Child
Development, 64, 830-847.
10. Carr, M. and Jessup, D. (1997) "Gender Differences in First-Grade Mathematics Strategy
Use: Social and Metacognitive Influences." Journal of Educational Psychology, Vol. 89
(No. 2) 318-328.
11. Beal, C. R. (1993) Boys and girls: the development of gender roles. McGraw-Hill.
12. Royer, J. M., & Garofoli, L. (2005). Cognitive contributions to sex differences in math
performance. In A. M., Gallagher & J. C. Kaufman (Eds.), Gender differences in
mathematics. Cambridge University Press. (pp. 99-120).
13. Fletcher, J., Lyon, G., Fuchs, L., & Barnes, M. (2007). Learning disabilities: From
identification to intervention. NY.
14. Arroyo, I., Beal, C. R., Bergman, A., Lindenmuth, M., Marshall, D., Woolf, B. P. (2003)
Intelligent Tutoring for high-stakes achievement tests. Proceedings of the 11th
International Conference on Artificial Intelligence in Education. IOS press.
15. Arroyo, I., Woolf, B.P., Royer, J.M., Tai, M. (2009) Affective Gendered Learning
Companions, International Conference on Artificial Intelligence and Education, IOS
Press, July.
16. Arroyo, I., Cooper, D.G., Burleson, W., Woolf, B.P., Muldner, K., Christopherson, R.
(2009) Emotion Sensors Go To School, International Conference on Artificial Intelligence
and Education, IOS Press, July.
17. http://ReadingSuccessLab.com
18. Arroyo, I., Woolf, B., Muldner, K., Burleson, W., Cooper, D., Razzaq, L., Dolan, R., Low
Achieving and Learning Disability Students are Helped by Motivational Learning
Companions, to ITS2010