Global best - MilkyWay@Home - Rensselaer Polytechnic Institute

"The Maximum Likelihood
Problem and Fitting the
Sagittarius Dwarf Tidal Stream"
Matthew Newby
Astronomy Seminar
RPI Oct. 22, 2009
1
Overview:
•Introduction
•The Sagittarius Stream
• SDSS
• Locating
• Maximum Likelihood
• Methods
•Differential Evolution
• Markov Chain Monte Carlo
•Gradient Descent
•Genetic Search
•Particle Swarm
• Revisit the Sagittarius Stream
• BOINC
•Overview
•Current and Future Work
2
Introduction
•Modern Astronomy – No longer staring through a telescope
•Automated Surveys produce large data sets
•Errors in measurements – statistical methods
needed
•Fast and accurate computer routines are needed
in order to analyze this information!
Image : NASA.gov
computer$ go faster_
Image : Wikimedia Commons
3
The Sloan Digital Sky Survey (SDSS):
Image: sdss.org
• 230+ million objects
• 8,400 square degrees in the sky
• Large percentage of north galactic cap
• Very little data in galactic plane (too much dust)
• Several hundred thousand stars
4
The Sagittarius Dwarf Tidal Stream
• The Sagittarius Dwarf Galaxy is
merging with the Milky Way
• The dwarf is being tidally
disrupted by the Milky Way,
creating long “tails.”
Mapping the Tidal Stream will:
• Provide information on matter
distribution in Milky Way
• Provide constraints on Galactic
Halo
Image (above): [Ibata et al. 1997, AJ]
5
Image (left): David Martinez-Delgado (MPIA) & Gabriel Perez (IAC)
Halo
Bulge
Thin Disk
Thick Disk
The Milky Way:
Data Wedge
Sun
Sagittarius Dwarf Galaxy
Tidal Stream
~30 kiloparsecs (100,000 light-years)
6
Data Stripe:
F-turnoff stars on the H-R diagram
Stripe 82 (southern galactic cap)
7
Image: Newberg & Yanny 2006, Journal of Physics: Conference Series (modified by N. Cole)
Sag. Stream: Model
• Assume stream is a cylinder
• Radial drop-off given by a Gaussian Distribution
• 2 background parameters: r0, q
• 6 parameters per stream: ε, μ, r, θ, φ, σ
Cole, N.
Background distribution: [equation image not recovered]
At least 8 parameters in the search –
8-dimensional solution space!
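A minimal sketch of the "cylinder with a Gaussian radial drop-off" idea, assuming the relative density depends only on the perpendicular distance d from the cylinder axis as exp(-d^2 / (2 σ^2)). The function name, the Cartesian inputs, and the way the axis is specified are illustrative assumptions; how the slide's ε, μ, r, θ, φ map onto the cylinder's weight, position, and orientation is not modeled here.

```python
import math

def stream_density(point, center, axis, sigma):
    """Relative stream density at `point` (hypothetical helper).

    The stream is treated as a cylinder through `center` along `axis`;
    the density falls off as a Gaussian of width `sigma` in the
    perpendicular distance from that axis.
    """
    rel = [p - c for p, c in zip(point, center)]
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    along = sum(r * u for r, u in zip(rel, unit))      # component along the axis
    perp_sq = sum(r * r for r in rel) - along * along  # squared distance from the axis
    return math.exp(-max(perp_sq, 0.0) / (2.0 * sigma * sigma))

on_axis = stream_density((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
one_sigma = stream_density((1.0, 0.0, 7.0), (0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 1.0)
```

A point on the axis gets the peak value, and the value drops with distance from the axis regardless of position along the cylinder.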
8
Maximum Likelihood:
• Bayesian Method
• Must assume a “prior” – a model explaining the data
• Find the parameters that are the “most likely” in a data set, given the prior
• Central limit theorem
• Can assume that large data sets have normally distributed data points
• Find probability that each data point lies in the given distribution
• Then you can get the likelihood:
$L(Q|D) = \prod_i \mathrm{DataPointProb}_i$
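As a rough illustration of the likelihood as a product of per-point probabilities, the sketch below assumes a one-dimensional normal model with parameters Q = (mu, sigma) and a handful of made-up measurements. It sums logarithms instead of multiplying probabilities, which gives the log of the same product and avoids numerical underflow for large data sets. All names and numbers are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Probability density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(params, data):
    """log L(Q|D): the sum of log-probabilities over all data points."""
    mu, sigma = params
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in data)

data = [4.8, 5.1, 5.3, 4.9, 5.0]           # made-up measurements
good = log_likelihood((5.0, 0.2), data)    # parameters close to the data
bad = log_likelihood((6.0, 0.2), data)     # parameters far from the data
print(good > bad)
```

Parameters that describe the data well give a higher likelihood, which is what the search methods on the following slides try to maximize.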
9
Computational Algorithms
Overview:
• Set up problem
• Parameter space: all allowed values of parameters
• Likelihood evaluator for given parameters
• Evaluation method – moves in parameter space in an efficient way
• End conditions: when the change in the best likelihood falls below a limit, or a
predefined number of iterations is reached.
Problems:
•Likelihood calculation is usually time-consuming
• Need to avoid local maxima – find the global maximum
What is the best method?
10
Computational Methods:
“No Free Lunch”
(David H. Wolpert, William G. Macready)
Poor Students:
Rosencrantz
•Only eats meat
Guildenstern
•Low Carb Diet
Ophelia
•Vegetarian
Local Eateries, same menus, random prices:
Burger Palace
Gourmet Salads
No Carbs at All
Prices differ by restaurant! Not everyone can eat cheaply!
One restaurant cannot be the best solution for every person (problem)!
•One solution method (or algorithm) will not be ideal for all problems!
•Need to choose the best solution for the job at hand!
11
Conjugate Gradient Descent (CGD)
• Calculates the gradient of the surface for each parameter
• Moves towards best likelihood using a line search
• Conjugate gradient uses the gradient of the previous step to converge faster
•Requires many likelihood calculations per move
• Unfortunately, may end at a local maximum
• Need to run from several different starting points in order to find the global best
[Figure: "Gradient Descent: 1-dimensional case". Likelihood vs. Position, with labels for the location, the gradient, a local maximum, and the best solution.]
The gradient, G: [equation image not recovered]
L = likelihood function
Q = Parameter (i or j)
h_i = step size for ith parameter
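One common way to estimate a gradient from likelihood evaluations alone is a central finite difference, using a step size h_i for each parameter Q_i. The sketch below assumes that form; the formula on the original slide is not available, so this is an illustration, not necessarily the expression used in the talk. The toy likelihood is made up.

```python
def numerical_gradient(likelihood, q, h):
    """Central-difference estimate of dL/dQ_i for each parameter i.

    Costs two likelihood evaluations per parameter, which is why each
    move needs many likelihood calculations.
    """
    grad = []
    for i in range(len(q)):
        up = list(q)
        up[i] += h[i]
        down = list(q)
        down[i] -= h[i]
        grad.append((likelihood(up) - likelihood(down)) / (2.0 * h[i]))
    return grad

def toy_likelihood(q):
    """Made-up surface with a single maximum at (1, -2)."""
    return -(q[0] - 1.0) ** 2 - (q[1] + 2.0) ** 2

g = numerical_gradient(toy_likelihood, [0.0, 0.0], [1e-4, 1e-4])
```

For this toy surface the estimated gradient at the origin points toward the maximum, with components close to 2 and -4.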
12
Line Search
• Evaluates two points in direction of gradient: one a distance 1d away, the other 2d
• d is usually related to the gradient (slope)
• If the middle point is not at a better likelihood than the end points, d is doubled and
the process repeated
• If the middle point is higher, then the middle point becomes the starting point for
another CGD
• Line Search causes the algorithm to reach the best likelihood efficiently
Line Search example:
The first search does not find a better
likelihood for the middle point (yellow),
so the distance is doubled. This time, the
new middle point (red) has the best
likelihood. The next iteration of CGD will
start at this point.
[Figure labels: starting point, first middle point, first end point, next middle point, next end point]
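The doubling procedure described above can be sketched as follows. This is a simplified reading of the slide: the function name, the cap on the number of doublings, and the toy likelihood are assumptions added for illustration.

```python
def line_search(likelihood, start, direction, d, max_doublings=30):
    """Double d until the middle point beats both end points.

    The middle point is start + d * direction and the far end point is
    start + 2d * direction. Returns the middle point, or None if no such
    point is found within max_doublings (a safety cap added here).
    """
    def point(t):
        return [s + t * u for s, u in zip(start, direction)]

    l_start = likelihood(start)
    for _ in range(max_doublings):
        middle, end = point(d), point(2.0 * d)
        l_mid = likelihood(middle)
        if l_mid > l_start and l_mid > likelihood(end):
            return middle          # becomes the starting point of the next step
        d *= 2.0
    return None

def toy_likelihood(q):
    """Made-up 1-dimensional surface with its maximum at 3.5."""
    return -(q[0] - 3.5) ** 2

result = line_search(toy_likelihood, [0.0], [1.0], 0.5)
```

In this toy run the distance is doubled three times (0.5, 1, 2, 4) before the middle point is better than both the start and the far end.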
13
Markov Chain Monte Carlo (MCMC)
• A “random walk” method
• Samples parameter space well
• Automatically produces error distribution
• Easy to code
•Sensitive to running time and step size
• Never truly converges
•Metropolis-Hastings:
• Take a step in each direction (parameter)
• Step size/direction is random, drawn from
a normal distribution
• If the new location has a better likelihood,
move to it
• If the new location has a worse likelihood,
then there is a chance of moving to it
The trajectory of a 1000-step MCMC straight-line fit
(top) and the distribution in b (bottom).
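A minimal Metropolis-style random walk, assuming normally distributed steps in every parameter at once and the usual acceptance rule in which a worse point is accepted with probability L_new / L_old. The function name, the fixed seed, and the toy log-likelihood are hypothetical; the straight-line fit from the figure is not reproduced.

```python
import math
import random

def metropolis(log_likelihood, start, step_sizes, n_steps, seed=0):
    """Random walk in parameter space; returns every visited position."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    chain = [list(current)]
    for _ in range(n_steps):
        # Random step in each parameter, drawn from a normal distribution.
        proposal = [q + rng.gauss(0.0, s) for q, s in zip(current, step_sizes)]
        proposal_ll = log_likelihood(proposal)
        # Better points are always accepted; worse points are accepted
        # with probability exp(proposal_ll - current_ll).
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain

def toy_log_likelihood(q):
    """Made-up log-likelihood: a unit-width normal centered on 2."""
    return -0.5 * (q[0] - 2.0) ** 2

chain = metropolis(toy_log_likelihood, [0.0], [1.0], 5000)
kept = chain[500:]                      # discard early steps as burn-in
mean = sum(q[0] for q in kept) / len(kept)
```

The spread of the kept positions is what gives the error distribution "for free"; the result still depends on the step size and on how long the chain runs.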
14
Genetic Search
• Inspired by natural selection
• Start with multiple “individuals” (positions) in parameter space
• Evaluate likelihood for each individual
• Remove individuals with the worst likelihoods
• Replace the removed individuals with “children” of the remaining individuals
(“parents”)
• Parents can be chosen randomly or from the best likelihoods
• Create children through crossover and mutation:
• Crossover: A child inherits the parameters of multiple parents, either by
averaging the parents’ parameters or by inheriting select parameters from
each parent
•Mutation: Replace a parameter with a new, randomly generated one
• Repeat until end conditions are met
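The loop above can be sketched as follows, assuming random parent selection among the survivors, crossover by averaging, and mutation that redraws a parameter uniformly within its allowed range. The population size, rates, fixed generation count, and toy likelihood are illustrative choices, not values from the talk.

```python
import random

def genetic_search(likelihood, bounds, pop_size=30, n_generations=100,
                   n_removed=10, mutation_rate=0.1, seed=0):
    """Evolve a population of positions; returns the best individual found."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(n_generations):
        # Rank by likelihood and remove the worst individuals.
        population.sort(key=likelihood, reverse=True)
        survivors = population[:pop_size - n_removed]
        children = []
        for _ in range(n_removed):
            a, b = rng.sample(survivors, 2)                 # random parents
            child = [(x + y) / 2.0 for x, y in zip(a, b)]   # crossover: average
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)          # mutation
            children.append(child)
        population = survivors + children
    return max(population, key=likelihood)

def toy_likelihood(q):
    """Made-up surface with a single maximum at (1, -2)."""
    return -(q[0] - 1.0) ** 2 - (q[1] + 2.0) ** 2

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best = genetic_search(toy_likelihood, bounds)
```

Because the best individuals always survive, the best likelihood in the population never gets worse from one generation to the next.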
15
Differential Evolution
• An individual moves according
to the weighted difference
between the locations of two
“parent” individuals
• If the new position has a worse
likelihood, then the individual
does not move
• Parents may be random or
chosen from the population best
• Also, multiple pairs of parents
may be used (averaging over the
differences)
[Figure labels: Difference Vector, Change in position, No Change (X); center is global best]
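A minimal sketch of the move rule described on this slide: each individual tries a step equal to the weighted difference between two randomly chosen parents and keeps the step only if the likelihood improves. The weight, population size, and toy likelihood are assumptions, and the per-parameter crossover step used by many differential evolution variants is left out to stay close to the slide.

```python
import random

def differential_evolution(likelihood, bounds, pop_size=20,
                           n_generations=200, weight=0.5, seed=0):
    """Returns (best position, best likelihood) after n_generations."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [likelihood(p) for p in population]
    for _ in range(n_generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b = rng.sample(others, 2)        # two random parents
            # Move by the weighted difference vector of the parents.
            # Trial positions are not clipped to the bounds in this sketch.
            trial = [x + weight * (pa - pb)
                     for x, pa, pb in zip(population[i], population[a], population[b])]
            trial_score = likelihood(trial)
            if trial_score > scores[i]:         # otherwise: no change
                population[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]

def toy_likelihood(q):
    """Made-up surface with a single maximum at (1, -2)."""
    return -(q[0] - 1.0) ** 2 - (q[1] + 2.0) ** 2

best_position, best_score = differential_evolution(
    toy_likelihood, [(-5.0, 5.0), (-5.0, 5.0)])
```

The step size adapts on its own: as the population gathers around a maximum, the difference vectors between parents shrink.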
16
Particle-Swarm Optimization
Parameter Space
• Physically Intuitive –
based on animal behavior
• Particles have velocities
• “Forces” towards
personal best, global best
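[Figure: a particle in parameter space with its velocity, its personal best and the global best, and the directions "to personal best" and "to global best"]
Position (x) change at step t: [equation image not recovered]
w, c1, c2 are weighting parameters, p is personal best, g is global best, rand() is a random number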
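The sketch below uses the commonly quoted particle-swarm update, v = w*v + c1*rand()*(p - x) + c2*rand()*(g - x) followed by x = x + v, with the symbols defined above. This is assumed to match the slide's equation, which is not available; the parameter values, swarm size, and toy likelihood are illustrative.

```python
import random

def particle_swarm(likelihood, bounds, n_particles=20, n_steps=200,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Returns (global best position g, its likelihood)."""
    rng = random.Random(seed)
    dims = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dims for _ in range(n_particles)]
    p = [list(xi) for xi in x]                   # personal bests
    p_score = [likelihood(xi) for xi in x]
    best = max(range(n_particles), key=p_score.__getitem__)
    g, g_score = list(p[best]), p_score[best]    # global best
    for _ in range(n_steps):
        for i in range(n_particles):
            for k in range(dims):
                # Inertia plus "forces" toward personal and global best.
                v[i][k] = (w * v[i][k]
                           + c1 * rng.random() * (p[i][k] - x[i][k])
                           + c2 * rng.random() * (g[k] - x[i][k]))
                x[i][k] += v[i][k]
            score = likelihood(x[i])
            if score > p_score[i]:
                p[i], p_score[i] = list(x[i]), score
                if score > g_score:
                    g, g_score = list(x[i]), score
    return g, g_score

def toy_likelihood(q):
    """Made-up surface with a single maximum at (1, -2)."""
    return -(q[0] - 1.0) ** 2 - (q[1] + 2.0) ** 2

g_best, g_best_score = particle_swarm(toy_likelihood, [(-5.0, 5.0), (-5.0, 5.0)])
```

Each particle needs only its own likelihood and the current global best, so likelihood evaluations for different particles can be handed out to separate machines.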
17
BOINC
Berkeley Open Infrastructure for Network Computing
Milkyway@home stats:
                               Total            Active
Users                          37,251           16,010
Hosts                          79,023           25,101
Teams                          1,410            922
Countries                      163              124
Total Credit                   9,302,434,280
Recent average credit (RAC)    52,731,529
Average floating point operations per second: 527,315.3 GigaFLOPS / 527.315 TeraFLOPS
• Users volunteer spare processor / graphics card time to the project
• Massively parallel
• Graphics processor technology has created a large increase in processing power
• Milkyway@home is now the #2 ranked BOINC project
• You can help, too: http://milkyway.cs.rpi.edu/milkyway/
18
Separation: Stripe 82
[Figure panels: Sgr Stream Stars (two panels), Non-Sgr Stream Stars]
19
Conclusions:
• Modern astronomy produces large data sets
• The Maximum Likelihood method is ideal for analyzing this data
• Powerful computer algorithms exist to perform MLE
• Mapping the Sagittarius Stream is possible by using these methods
20
Credits
The Sloan Digital Sky Survey
BOINC.com
Milkyway@home
Prof. Heidi Newberg, Rensselaer Polytechnic Institute
Nathan Cole, “Maximum Likelihood Fitting of Tidal Streams with Applications to the
Sagittarius Dwarf Tidal Tails” (PhD Thesis, Rensselaer Polytechnic Institute, 2008)
Travis Desell, “Aysnchronous [sic] Global Optimization for Massively Distributed Computing”
(PhD candidacy document, 2009)
Shakespeare, et al. “Hamlet”
21
3 stream search:
22