Data reduction and processing tutorial

25-31 October 2010 EMBO Course
Data reduction and
processing tutorial
Petr V. Konarev
European Molecular Biology Laboratory,
Hamburg Outstation
BioSAXS group
25-31 October 2010 EMBO Course
EMBL BioSAXS beamline X33, 2010
Optics
Vacuum
cell
Completely redesigned 2005-2010
Efficiency gain: about 5-10 times
Hutch
25-31 October 2010 EMBO Course
PILATUS Pixel X-ray Detector at X33
PILATUS 1M (10*100K modules)
Active area 70*420 mm2 , pixel size: 170µm
Readout time: 3.6ms, framing rate: 50Hz
AgBeh pattern
In user operation on X33 from 22.11.07
25-31 October 2010 EMBO Course
Ag-Behenate 2D Pattern (FIT2D View)
25-31 October 2010 EMBO Course
Ag-Behenate 2D Pattern (with MASK)
25-31 October 2010 EMBO Course
Ag-Behenate 1D Pattern (FIT2D View)
25-31 October 2010 EMBO Course
Creating angular axis
Axis-adat
Axis-adat
Makeaxis
Makeaxis dialog
dialog
Angular
Angular axis
axisfile
file
Ag-Behenate
Ag-Behenate
Channels
Channels
19-26 October 2008 EMBO Course
Raw data reduction steps
•
Radial integration of 2D image into 1D curve
•
Check for radiation damage, averaging of different time
frames (total of 8 frames, each for 15 seconds)
•
Associated errors in the data points are computed from the
numbers of counts using Poisson statistics
•
Mask file is used to eliminate beamstop and inactive
detector area
•
Exact coordinates of the beam center are required for
integration (determined from AgBeh data)
•
Data are normalized to the pindiode value (intensity of the
transmitted beam)
•
Data are transferred into ASCII format containing 3
columns:
s I(s)
Er(s)
An automated SAXS
pipeline at X33
Data normalization
2D-1D reduction
Data processing
Check for radiation damage
Computation of overall parameters
Database search
HardwareAb initio modelling
independent
analysis block XML-summary file generation
25-31 October 2010 EMBO Course
Program PRIMUS- graphical package
for data manipulations and analysis
♦statistical analysis, 2D integration and scaling of raw data in MAR and OTOKO format,
transforming MAR image plate/gas detector data into ASCII format
(modules MAR-PRIMUS, Sapoko and Binasc)
♦data manipulations (averaging, background subtraction, merging of data in
different angular ranges, extrapolation to infinite dilution )
♦evaluation of radius of gyration and forward intensity (Guinier plot, module AUTORG),
estimation of Porod volume
♦calculation of distance/size distribution function p(r)/V(r) (module GNOM)
♦data fitting using the parameters of simple geometrical bodies
(ellipsoid, elliptic/hollow cylinder, rectangular prism) (module BODIES)
♦data analysis for polydisperse and interacting systems, mixtures and partially ordered
systems (modules OLIGOMER, SVDPLOT, MIXTURE and PEAK)
P.V. Konarev, V.V. Volkov, A.V. Sokolova, M.H.J. Koch, D.I. Svergun
J.Appl. Cryst. (2003) 36, 1277-1282
25-31 October 2010 EMBO Course
PRIMUS: graphical user interface
PRIMUS:
25-31 October 2010 EMBO Course
data processing
buffer before
sample: BSA 5.4 mg/ml
buffer after
Plot
Plot
PRIMUS:
25-31 October 2010 EMBO Course
buffer averaging
buffer before
buffer after
buffer averaged
Average
Average
PRIMUS:
25-31 October 2010 EMBO Course
data processing
sample: BSA 5.4 mg/ml
buffer averaged
Plot
Plot
PRIMUS:
25-31 October 2010 EMBO Course
buffer subtraction
sample: BSA 5.4 mg/ml
buffer averaged
subtracted curve
Subtract
Subtract
25-31 October 2010 EMBO Course
PRIMUS: data at different angular ranges
saxs range
waxs range
Plot
Plot
PRIMUS:
25-31 October 2010 EMBO Course
merging data
saxs range
waxs range
Merge
Merge
PRIMUS:
25-31 October 2010 EMBO Course
merging data
saxs range
waxs range
merged curve
Merge
Merge
PRIMUS:
25-31 October 2010 EMBO Course
Guinier plot
I (s) = I (0) ⋅ exp(− s 2 ⋅ Rg2 3)
Rg – radius of gyration
Rg
Rg == 2.68
2.68 ++- 1.11e-2
1.11e-2
I0
I0 == 271.07
271.07 ++- 0.605
0.605
M≈MBSA*(I(0)/IBSA(0))
Guinier
Guinier
25-31 October 2010 EMBO Course
PRIMUS:
AutoRG module
AutoRg
AutoRg
Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I.
(2007) J. Appl. Cryst., 40, s223-s228.
PRIMUS:
V = 2π
2
25-31 October 2010 EMBO Course
Porod volume
I (0 ) Q = 2π
2
I (0)
∞
∫
s 2 ( I ( s ) − K ) ds
0
V
V == 92.37
92.37
V – excluded volume of particle
Porod
Porod
25-31 October 2010 EMBO Course
Extrapolation to zero concentration
25-31 October 2010 EMBO Course
Merging for making final composite curve
9.3
9.3 mg/ml
mg/ml
1.4
mg/m
1.4 mg/m
3.8
3.8 mg/ml
mg/ml
5.7
mg/ml
5.7 mg/ml
8.5
8.5 mg/mll
mg/mll
9.3
9.3 mg/ml
mg/ml
1.4
1.4 mg/m
mg/m
3.8
mg/ml
3.8 mg/ml
5.7
5.7 mg/ml
mg/ml
8.5
8.5 mg/mll
mg/mll
25-31 October 2010 EMBO Course
Checking Guinier plot
5.7
9.8
1.4
3.8
8.5
mg/ml
9.8
1.4
3.8
5.7
8.5mg/ml
mg/ml
mg/ml
25-31 October 2010 EMBO Course
Merging for making final composite curve
9.3
9.3 mg/ml
mg/ml
1.4
1.4 mg/m
mg/m
3.8
mg/ml
3.8 mg/ml
5.7
5.7 mg/ml
mg/ml
8.5
8.5 mg/mll
mg/mll
9.3
9.3
mg/ml
9.3
9.3mg/ml
mg/ml
mg/ml
1.4
1.4
mg/m
1.4
1.4mg/m
mg/m
mg/m
3.8
3.8
mg/ml
mg/ml
3.8
3.8mg/ml
mg/ml
5.7
5.7
mg/ml
5.7
5.7mg/ml
mg/ml
mg/ml
8.5
8.5
mg/mll
8.5
8.5mg/mll
mg/mll
mg/mll
merged
merged
25-31 October 2010 EMBO Course
Real/reciprocal space transformation
p(r)=r2 γ(r) distance distribution function
γ0(r)=γ(r)/γ(0)
Probability to find a point at
distance r from a given point
inside the particle
i
rij
j
25-31 October 2010 EMBO Course
Distance distribution function
from simple shapes
25-31 October 2010 EMBO Course
Distance distribution function of helix
PRIMUS:
25-31 October 2010 EMBO Course
GNOM menu
Gnom
Gnom
Indirect Fourier Transform
Run
Run
PRIMUS:
25-31 October 2010 EMBO Course
P(R) function
Gnom
Gnom
Indirect Fourier Transform
25-31 October 2010 EMBO Course
AUTOGNOM – automated version of GNOM
for monodisperse systems
♦ In
the original version of GNOM the maximum particle
size Dmax is a user-defined parameter and successive
calculations with different Dmax are required to select its
optimum value.
♦ This
optimum Dmax should provide a smooth real
space distance distribution function p(r) such that
p(Dmax) and its first derivative p'(Dmax) are approaching
zero, and the back-transformed intensity from the p(r)
fits the experimental data.
Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I.
(2007) J. Appl. Cryst., 40, s223-s228.
25-31 October 2010 EMBO Course
Estimation of Dmax with GNOM
(under-estimation)
6.0
Poor fit to experimental data
Distance distribution function p(r)
goes to zero too abruptly
25-31 October 2010 EMBO Course
Estimation of Dmax with GNOM
(over-estimation)
12.0
Good fit to experimental data
BUT: Distance distribution function
p(r) becomes negative
25-31 October 2010 EMBO Course
Estimation of Dmax with GNOM
(correct case)
8.0
Good fit to experimental data
Distance distribution function p(r)
goes smoothly to zero
25-31 October 2010 EMBO Course
AUTOGNOM – automated version of GNOM
for monodisperse systems
♦ The
maximum size is determined from automated
comparison of the p(r) functions calculated at different
Dmax values ranging from 2Rg to 4Rg, where Rg is the
radius of gyration provided by AUTORG.
♦The
calculated p(r) functions and corresponding fits to
the experimental curves are compared using the
perceptual criteria of GNOM (Svergun, 1992) together
with the analysis of the behavior of p(r) function near
Dmax and the best p(r) function is chosen for the final
output.
Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I.
(2007) J. Appl. Cryst., 40, s223-s228.
25-31 October 2010 EMBO Course
Examples for data processing
Go to directory /Examples/Primus/
../AgBeh
- data from saxs and waxs range as well as the merged curve
../BSA - standard sample for calibration of molecular mass
../Lysozyme – example of p(r) from globular particle
../Lymazine – example of p(r) from hollow globular (virus-like) particle
../ISWI_Cterm – example of p(r) from elongated particle