Multithreading RePast Models
Alex Voss1, Jing-Ya You2, Eric Yen2, Simon Lin2, Ji-Ping Lin3, Andy Turner4
1School of Computer Science, University of St Andrews
2Academia Sinica Grid Computing, Academia Sinica, Taiwan
3Center for Survey Research, Academia Sinica, Taiwan
4School of Geography, University of Leeds
Workshop on Future Directions in Agent Based Modelling
Leeds, UK, June 2010
SICSA student induction day, 2009
Overview
•
Something about the model we want to build on
migration
•
Quite a bit about how we tweaked the model to make
use of multiple CPUs/cores
•
A bit about what we will do next and questions we
want to explore
•
My interest in this…
Migration in Taiwan
•
Migration has been an important factor in Taiwanese
social development and has been influenced by outside
factors since the 1600s
•
Aim is to test existing theories of migration
constructively and to investigate recent developments
such as increased outward migration to China.
•
Timely as Taiwan is running another census in 2010.
•
Based on work conducted by Ji-Ping Lin of the
Academia Sinica Center for Survey Research on
Migration using the 1990 and 2000 Taiwan
Population and Housing Census
SimTaiwan: Migration in Taiwan
•
Based on Taiwan 2000 Population and Housing Census
•
Dataset is individual-level but with restricted variables
•
Held at Academia Sinica
•
Need to identify additional
datasets to complement census
•
Issues with data protection
•
Need to scale up to ca.
22 million individuals
•
They are heterogeneous agents with quite a large number of
attributes and history.
SimTaiwan Tests
•
Four different model implementations:
1. Naïve single-threaded model
2. Improved single-threaded model
3. Initial multi-threaded model
4. Improved multi-threaded model
•
Test runs with each of these models to measure:
1. Wallclock and CPU time
2. Memory usage
3. Code hotspots
4. Worker thread activity (where applicable)
Test Code and Parameters
•
Simplified model with only fertility and mortality, same
for all measured models
•
250k male and 250k female random initial population,
running for 365 ticks (=days)
•
Measurements taken using JProfiler 6.0.4
–
CPU sampling (5 sec intervals)
–
Memory allocations recording
•
JVM Parameters: -Xmx8192M -Xss128M
•
Hardware: Dell PowerEdge R610 with 2 x Xeon E5504
(4 cores each at 2 GHz, 8 cores total) and 16GB RAM
Naïve Serial Version
•
More time spent in RePast scheduling code than in
model code because events are scheduled for each
individual agent every step.
Improved Serial Version
•
Event scheduled on DemographicsContext, code
iterating through individual agents
•
Wallclock time down from 5:32 to 2:23
•
Opens up opportunities for parallelising code as
well…
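A minimal sketch of the "iterate instead of schedule" idea, in plain Java so that it runs without RePast. The classes Person and DemographicsContextSketch are hypothetical stand-ins, not the authors' code and not RePast API. With the test population of 500k agents over 365 ticks, per-agent scheduling means 182,500,000 scheduler events, whereas one event per tick on the context means 365.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a model agent; not a RePast class. */
class Person {
    int ageInDays;

    void step() {
        ageInDays++;
    }
}

/** Hypothetical stand-in for the DemographicsContext named on the slide. */
class DemographicsContextSketch {
    private final List<Person> persons = new ArrayList<>();

    void add(Person p) {
        persons.add(p);
    }

    /** The single method a scheduler would call once per tick. */
    void stepAll() {
        for (Person p : persons) {
            p.step();
        }
    }

    /** Runs the given number of ticks; returns how many scheduler events were needed. */
    int run(int ticks) {
        int events = 0;
        for (int t = 0; t < ticks; t++) {
            stepAll();   // one event per tick, regardless of population size
            events++;
        }
        return events;
    }

    /** Events needed if every agent were scheduled individually every tick. */
    static long naiveEventCount(int agents, int ticks) {
        return (long) agents * ticks;
    }
}
```

In a real RePast model the loop in run() would be replaced by a single repeating scheduled action on the context; the point here is only the change in event count.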
Initial Parallel Version
•
Need to partition data to allow multiple worker
threads to exploit multiple CPUs & cores
•
PartitionedContext keeping agents in separate
HashSets that can return independent Iterators for
use by multiple threads.
•
ThreadPoolExecutor with configurable number of
worker threads (here 8)
•
Initial version brings only modest or no improvement;
in some runs wallclock time exceeds that of the improved serial code
•
Max. CPU utilisation ~ 200% (as reported by top)
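The partitioning scheme above can be sketched in plain Java as follows. This is an illustration under assumptions, not the authors' PartitionedContext: the names PartitionedAgents, Citizen and runDemo are hypothetical, agents are assigned round-robin, and each tick submits one task per partition to a fixed-size pool (Executors.newFixedThreadPool returns a ThreadPoolExecutor).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical agent; each instance is only ever stepped by one worker per tick. */
class Citizen {
    int ageInDays;

    void step() {
        ageInDays++;
    }
}

/** Sketch of a partitioned agent store (hypothetical, not RePast API). */
class PartitionedAgents {
    private final List<Set<Citizen>> partitions = new ArrayList<>();
    private int next = 0;

    PartitionedAgents(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashSet<>());
        }
    }

    /** Round-robin assignment keeps the partitions roughly equal in size. */
    void add(Citizen c) {
        partitions.get(next).add(c);
        next = (next + 1) % partitions.size();
    }

    /** One tick: every partition is iterated by its own task; returns when all are done. */
    void stepAll(ExecutorService pool) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (Set<Citizen> partition : partitions) {
            tasks.add(() -> {
                for (Citizen c : partition) {
                    c.step();
                }
                return partition.size();
            });
        }
        try {
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                f.get();   // surfaces any exception thrown inside a worker
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    /** Builds a population, runs it for some ticks, and returns the summed ages. */
    static long runDemo(int agents, int threads, int ticks) {
        PartitionedAgents store = new PartitionedAgents(threads);
        List<Citizen> all = new ArrayList<>();
        for (int i = 0; i < agents; i++) {
            Citizen c = new Citizen();
            all.add(c);
            store.add(c);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int t = 0; t < ticks; t++) {
                store.stepAll(pool);
            }
        } finally {
            pool.shutdown();
        }
        long total = 0;
        for (Citizen c : all) {
            total += c.ageInDays;
        }
        return total;
    }
}
```

Because the agents here share no state, this sketch scales cleanly; the modest gains reported on the slide come from the shared RePast constructs that a real model touches inside step().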
Initial Parallel Version (II)
•
Worker threads blocking a lot on monitors placed around RePast
constructs.
•
Main issue seems to be that RandomHelper is not thread-safe
•
Simulation schedule relatively minor issue
•
Some contention around simulation objects
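One common way to ease contention on a shared simulation object is to let each worker accumulate results locally and take the monitor once per partition rather than once per agent. The sketch below illustrates that pattern under assumptions: PopulationStats and its methods are hypothetical names, and the slides do not say which shared objects were affected or how the authors restructured them.

```java
/** Hypothetical shared statistics object updated by several worker threads. */
class PopulationStats {
    private final Object lock = new Object();
    private long births;

    /** Contended variant: the monitor is taken once per agent event. */
    void recordBirth() {
        synchronized (lock) {
            births++;
        }
    }

    /**
     * Less contended variant: a worker counts births in a local variable
     * while iterating its partition, then merges the count in one short
     * synchronized section.
     */
    void mergeBirths(long localBirths) {
        synchronized (lock) {
            births += localBirths;
        }
    }

    long births() {
        synchronized (lock) {
            return births;
        }
    }
}
```

With 8 workers and one merge per tick, the lock is taken 8 times per tick instead of once per birth event.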
Improved Parallel Version
•
Overriding some of RePast’s code to make it
thread-safe.
•
Reducing scope of monitor objects used and pulling
code parts that are safe out of synchronized sections
•
Introducing thread-local variable containing a per
thread random number generator:
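import cern.jet.random.Uniform;                  // Colt
import cern.jet.random.engine.MersenneTwister;
import cern.jet.random.engine.RandomEngine;

// One generator per worker thread, so no monitor is needed around draws
protected static ThreadLocal<Uniform> uniform = new ThreadLocal<Uniform>() {
    @Override
    protected Uniform initialValue() {
        // Mix in the thread id: threads started in the same millisecond
        // would otherwise get identical seeds and identical streams
        int seed = (int) System.currentTimeMillis() + 31 * (int) Thread.currentThread().getId();
        RandomEngine generator = new MersenneTwister(seed);
        return new Uniform(generator);
    }
};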
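For readers without Colt on the classpath, the same per-thread generator pattern can be sketched with the JDK alone. This is a substitute for illustration, not the authors' code: java.util.Random replaces Colt's MersenneTwister, and PerThreadRandom, current and happens are hypothetical names. Seeds come from a shared counter, so each thread gets a distinct seed without relying on the clock.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: one random number generator per thread, JDK classes only. */
class PerThreadRandom {
    // Base seed is an arbitrary illustrative value; each new thread takes the next one.
    private static final AtomicLong NEXT_SEED = new AtomicLong(42L);

    private static final ThreadLocal<Random> RNG =
            ThreadLocal.withInitial(() -> new Random(NEXT_SEED.getAndIncrement()));

    /** Returns the calling thread's own generator; no locking is involved. */
    static Random current() {
        return RNG.get();
    }

    /** Bernoulli draw, e.g. "does this agent die today?" for a given daily probability. */
    static boolean happens(double probability) {
        return current().nextDouble() < probability;
    }
}
```

A caveat on reproducibility: which thread receives which seed depends on the order in which worker threads first draw a number, so two runs produce identical results only if the assignment of partitions to threads is also fixed.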
Improved Parallel Version (II)
•
Monitor contention is eased significantly
•
Wallclock running time down to 1:03 and max. CPU utilisation
up to ~ 600%
•
Time spent in serial code for analysis and production of charts is
now significant
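The reported timings allow a rough back-of-envelope estimate of how much of the run is parallelisable. The sketch below assumes the times are minutes:seconds (5:32 = 332 s, 2:23 = 143 s, 1:03 = 63 s) and that Amdahl's law, S = 1 / ((1 - p) + p / n), describes the run; both are assumptions, and SpeedupEstimate is a hypothetical helper. Under them, the improved parallel version is about 2.3 times faster than the improved serial one on 8 cores, which corresponds to a parallel fraction p of roughly 0.64.

```java
/** Back-of-envelope speedup arithmetic; assumes "m:ss" timings and Amdahl's law. */
class SpeedupEstimate {

    /** Converts a "m:ss" string into seconds. */
    static int seconds(String minutesSeconds) {
        String[] parts = minutesSeconds.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** Ratio of the earlier wallclock time to the later one. */
    static double speedup(String before, String after) {
        return (double) seconds(before) / seconds(after);
    }

    /** Amdahl's law solved for the parallel fraction p, given speedup S on n cores. */
    static double parallelFraction(double speedup, int cores) {
        return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores);
    }
}
```

For example, SpeedupEstimate.parallelFraction(SpeedupEstimate.speedup("2:23", "1:03"), 8) gives the estimate quoted above. The figure is only indicative: it folds JVM start-up, chart production and any remaining lock contention into the serial share.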
What have we learned/developed?
•
Advice on structuring RePast code
–
Parallelise using PartitionedContext
–
Iteration instead of scheduling events
•
RePast does put some barriers in the way but these
should be possible to overcome
•
Speed-up initially not as much as hoped for but was
overcome by introducing thread-local random number
generators
•
Can we factor this work into development of RePast?
•
Or present as tutorial?
Next Step: Debugging/Profiling on the Grid
•
Tests to establish optimum number of partitions and
threads vs no. of agents
•
Verification of the code and sensitivity analysis
•
Repeated runs to uncover rare events & need to
repeat runs to obtain comparable average figures
•
Availability of high-memory machines will become an
issue once we scale up to full 22 million agents;
–
48GB server available at ASGC
–
Upgrade / purchase of server at St Andrews planned
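The planned tests (thread count against population size, with repeated runs to obtain comparable averages) could be driven by a small harness along these lines. This is a sketch under assumptions: RepeatTimer is a hypothetical name, the task is any function that runs the model with a given number of threads, and wallclock time is taken with System.nanoTime rather than a profiler.

```java
import java.util.function.IntConsumer;

/** Sketch of a repeat-and-average timing harness for model runs. */
class RepeatTimer {

    /**
     * Calls task.accept(threads) the given number of times and returns
     * the mean wallclock time per run in milliseconds.
     */
    static double meanMillis(IntConsumer task, int threads, int repeats) {
        long totalNanos = 0;
        for (int i = 0; i < repeats; i++) {
            long start = System.nanoTime();
            task.accept(threads);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e6 / repeats;
    }
}
```

A sweep would call meanMillis once per candidate thread count (for instance 1, 2, 4 and 8) and per population size, discarding the first run or two if JIT warm-up distorts the averages.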
Questions
•
What will happen when we make the model more
complex?
•
What decisions about the model (will) affect the
degree of parallelism and running times?
•
How many CPU cores can we effectively utilise?
–
Need machine with more cores as well as more memory
–
This is now becoming affordable thanks to AMD
•
Commodity computing is what we are interested in:
fewer skills involved (?) and availability for social
scientists
My Interests…
•
Not about building the most sophisticated model
•
or the highest performance one but…
•
about making ABM framework(s) (RePast) usable for
social scientists interested in population-level
phenomena,
•
addressing the practical issues of developing and
using agent-based models in anger
•
cf. challenges outlined by Peter McBurney today