
WHITEPAPER
Application of Dynamic Light Scattering (DLS) to Protein Therapeutic Formulations: Principles, Measurements and Analysis - 3. DLS Deconvolution Algorithms
A Malvern Instruments' Bioscience Development Initiative
Executive Summary
Dynamic light scattering (DLS) is an analytical technique used to measure the
particle size distribution of protein formulations across the oligomer and sub-micron
size ranges of approximately 1 nm to 1 µm. The popularity of DLS within the
biopharmaceutical industry is a consequence of its wide working size and extended
sample concentration ranges, as well as its low volume requirements. With that
said, the challenge that remains with the application of DLS to protein therapeutic
formulations is centered around data interpretation. In this four-part white paper
series, common issues and questions surrounding the principles, measurements and
analysis of DLS data are discussed in order to help minimize the time required for
and complexity of acquiring and interpreting DLS data that is critical throughout the
development process. In this third white paper of the series, we cover the basic types
of DLS deconvolution algorithms used to extract the intensity weighted particle size
distribution from the measured correlogram.
Dynamic Light Scattering
Dynamic light scattering (DLS) is an analytical technique used within bioapplications
to measure particle size distributions across the oligomer and sub-micron size ranges.
In a DLS measurement, scattering intensity fluctuations are correlated across small
time spans, yielding a correlogram. The distribution of particle diffusion coefficients
is extracted from the measured correlogram using various deconvolution algorithms,
each of which can yield different results. Identifying the proper algorithm, and hence
the correct distribution of diffusion coefficients, requires a basic understanding of the
operational limitations of the algorithm.
DLS Correlogram
For particles in solution moving under the influence of Brownian motion, the measured
scattering intensity will fluctuate with time, producing an intensity trace which
appears to represent random fluctuations about a mean value. The signal fluctuation
rate will depend upon the rate of change of the position of the particles, with slow
moving particles leading to slow fluctuations and fast moving particles leading to fast
fluctuations.
Correlation is a 2nd order statistical technique for measuring the degree of non-randomness in an apparently random data set. When applied to a time-dependent intensity trace, the intensity correlation coefficients, G2(τ), are calculated as shown below, where τ is the delay time and the 2 subscript in G2 indicates the intensity, rather than the field, autocorrelation.
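In its normalized form, the intensity autocorrelation function is the time-averaged product of the intensity with a delayed copy of itself:

$$G_2(\tau) = \frac{\langle I(t)\, I(t+\tau) \rangle}{\langle I(t) \rangle^{2}}$$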
For direct application, the correlation equation can be expressed as the summation
shown below and detailed in Figure 1.
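For a correlator operating with a sampling interval Δt over N samples, the time average becomes a finite sum of products of intensities separated by the delay time (a schematic form of the summation detailed in Figure 1):

$$G_2(\tau = m\,\Delta t) \approx \frac{\tfrac{1}{N}\sum_{j=1}^{N} I(j\,\Delta t)\; I\big((j+m)\,\Delta t\big)}{\left(\tfrac{1}{N}\sum_{j=1}^{N} I(j\,\Delta t)\right)^{2}}$$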
Figure 1: Schematic detailing measurement and construction of the DLS correlogram.
Typically, the correlation coefficients are normalized, such that G2(∞) = 1. For
monochromatic laser light, this normalization imposes an upper correlation curve limit
of 2 for G2(τ0) and a lower baseline limit of 1 for G2(∞). In practice, the theoretical upper
limit can only be achieved in carefully optimized optical systems. Typical experimental
upper limits are around 1.8 to 1.9 for G2 or 0.8 to 0.9 for G2 - 1, which is what is usually
displayed in DLS correlogram figures (Figure 2).
Figure 2: DLS measured correlograms for 6 nm ovalbumin and 95 nm silicon dioxide in PBS.
Deconvolution
The scattering intensity fluctuations measured during a DLS experiment are a
manifestation of fluctuations in the electric field generated by the ensemble collection
of solution particles. The electric field fluctuations are a consequence of the
superposition of fields (or waves) generated by each of the scattering particles as
they diffuse through the solution. So information regarding particle motion is contained
within the electric field function, as indicated in the field autocorrelation expression (G1)
given below, where E is the field function, Γ is the decay rate, D is the mean diffusion
coefficient, q is the scattering vector, λ0 is the vacuum laser wavelength, ñ is the
medium refractive index, θ is the scattering angle, and the 1 subscript in G1 indicates
the field autocorrelation.
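For a monodisperse collection of diffusing particles the field autocorrelation decays as a single exponential, with the decay rate set by the diffusion coefficient and the scattering vector:

$$G_1(\tau) = \langle E(t)\,E^{*}(t+\tau)\rangle \propto e^{-\Gamma \tau}, \qquad \Gamma = D q^{2}, \qquad q = \frac{4 \pi \tilde{n}}{\lambda_0}\,\sin\!\left(\frac{\theta}{2}\right)$$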
The intensity, which is the actual parameter measured in a light scattering experiment, is equivalent to the square of the field (I = E²), with the respective autocorrelation functions related to each other through the Siegert relationship, where γ is a coherence factor expressing the efficiency of the photon collection system.
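In terms of the normalized correlation functions, the Siegert relationship takes the form:

$$G_2(\tau) = 1 + \gamma\, \big|G_1(\tau)\big|^{2}$$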
So the Siegert relationship can be used to deconvolute particle motion information from the DLS measured intensity autocorrelation function.
Cumulants Analysis
Cumulants analysis is the ISO recommended approach for extracting the mean or Z
average size from a DLS measured correlogram. In the cumulants analysis, a single
particle size family is assumed and the correlogram is fitted to a single exponential as
shown below, where A is the amplitude or intercept of the correlation function, B is the
baseline, and Γ is the correlation decay rate.
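In this single-exponential form, the fitted intensity autocorrelation function is:

$$G_2(\tau) = B + A\, e^{-2 \Gamma \tau}$$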
The exponential fitting expression is expanded to account for polydispersity or peak broadening effects and then linearized, as shown below, where the 1st moment (a1) is equivalent to the decay rate (Γ) and the 2nd moment (a2) is proportional to the distribution width (µ2).
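One common linearized form (the exact expansion varies slightly between implementations; see, e.g., the Frisken reference in Additional Reading) is:

$$\ln\!\big[G_2(\tau) - B\big] = \ln A - 2 a_1 \tau + a_2 \tau^{2} + \ldots, \qquad a_1 = \Gamma, \qquad a_2 \propto \mu_2$$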
As shown earlier, the decay rate is related to the mean diffusion coefficient (D), which
facilitates determination of the mean hydrodynamic size using the Stokes-Einstein
relationship, where k is the Boltzmann constant, T is the absolute temperature, η
is the viscosity of the medium, and RH is the hydrodynamic radius. The mean size
determined from the cumulants analysis is described as the Z (or intensity weighted)
average.
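The Stokes-Einstein relationship takes the familiar form:

$$D = \frac{k T}{6 \pi \eta R_H}$$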
As noted above, the 2nd moment from the cumulants analysis is related to the width
of the distribution. That relationship is given in the expression below, where PdI is
the polydispersity index and σ is the standard deviation of a hypothetical Gaussian
distribution centered on the Z average size.
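The polydispersity index is derived from the ratio of the 2nd moment to the square of the decay rate, and relates to the width of the hypothetical Gaussian as:

$$\mathrm{PdI} = \frac{\mu_2}{\Gamma^{2}} = \left(\frac{\sigma}{Z_{\mathrm{avg}}}\right)^{2}$$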
The cumulants analysis is unique from other DLS algorithms in that it yields only
a mean size and polydispersity index, with no additional information regarding the
modality of the distribution. While it is not uncommon to see hypothetical Gaussian
distributions derived from the mean and PdI in print, it is not recommended, due
to the potential for misinterpretation. If distribution information is desired, the best
approach is to utilize one of the NNLS algorithms, all of which produce a full particle
size distribution.
NNLS Algorithms
The DLS particle size distribution is derived from the measured correlogram using a
non-negatively constrained least squares (NNLS) fitting algorithm. While there are
subtle differences between the various algorithms, all use a similar non-negative least
squares fitting approach.
In contrast to cumulants analysis, which assumes an ideal field correlation function
of identical diffusing spheres and fits the measured intensity correlation curve to a
single exponential, NNLS algorithms make no assumption with regard to the number
of particle families represented in the intensity correlation function. The challenge, of
course, is in the identification of "how many" particle families or decays are actually
present. In mathematical circles this is viewed as an ill-posed problem, since relatively
small amounts of noise can significantly alter the solution of the integral equation and
hence the number of predicted size families.
The fitting function, G1Fit, for the field autocorrelation function can be represented as a summation of single exponential decays, where the factor Ai is the area under the curve for each exponential contribution and represents the strength of that particular i-th exponential function.
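$$G_1^{\mathrm{Fit}}(\tau) = \sum_{i} A_i\, e^{-\Gamma_i \tau}$$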
The best fit is found by minimizing the deviation of the fitting function from the
measured data points, where a weighting factor σ is incorporated to place more
emphasis on the strongly correlated, rather than the low correlation (and noisy), data
points.
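Schematically, the weighted deviation being minimized has the form below (the exact weighting scheme varies between algorithms, as discussed later):

$$\xi^{2} = \sum_{j} \sigma_j \Big[ G_1(\tau_j) - G_1^{\mathrm{Fit}}(\tau_j) \Big]^{2}$$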
The weighting factor is proportional to the correlation coefficient, i.e. the correlation function value at a given delay time τ. As mentioned above, this serves to give more weight
to highly correlated data points. As an example, consider the correlation curve shown
in Figure 3. As evident in the inset view, there is experimental noise in the baseline. In
the absence of a weighting factor, this noise could be interpreted as 'decays' arising
from the presence of very large particles. So the weighting function provides one
means of addressing experimental noise in the ill-posed deconvolution problem.
Figure 3: Example autocorrelation function highlighting baseline noise.
Identifying the solution of Ai values in the G1Fit fitting expression is accomplished by minimizing the deviation ξ² with respect to each Ai, and then solving the resulting system of equations.
While the details are left to more advanced texts, the standard procedure for
solving the above equation is to construct the solution as a linear combination of
eigenfunctions. With that said, when the eigenvalues are small, a small amount of
noise can make the solution (or the number of Γi) extremely large, hence the previous
ill-posed problem classification. To mitigate this problem, a stabilizer (α) is added
to the system of equations. This parameter is called the regularizer, and with its
incorporation, we are performing a first order regularization of the linear combination of
eigenfunctions in the deconvoluted solution.
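One schematic way to write the regularized objective (here denoted Z, with the first derivative in Ai represented by the difference between adjacent amplitudes) is:

$$Z = \sum_{j} \sigma_j \Big[ G_1(\tau_j) - G_1^{\mathrm{Fit}}(\tau_j) \Big]^{2} + \alpha \sum_{i} \big( A_{i+1} - A_i \big)^{2}$$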
The above expression is called a first order regularization because the first derivative
(in Ai) is added to the system of equations. The alpha (α) parameter or regularizer
determines how much emphasis we put on this derivative. In other words, it defines the
degree of smoothness in the solution. If α is small, it has little influence and the solution
can be quite 'choppy'; whereas a larger α will force the solution to be very smooth.
In addition to the smooth solution constraint, NNLS algorithms also require that the
solution be physical, i.e. all Ai > 0. With these constraints, Z is minimized by requiring
that the first derivatives with respect to Ai be zero. As indicated previously, this
minimization corresponds to solving a system of linear equations in Ai. The solution of
Ai values is found using an iterative approach called the gradient projection method.
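To illustrate how the weighting factor, the regularizer, and the non-negativity constraint interact, the sketch below inverts a synthetic correlogram with a regularized non-negative least squares fit. It is a minimal illustration under stated assumptions only: the decay-rate grid, the quadratic weighting, the synthetic data, and the use of scipy.optimize.nnls on an augmented system are choices made for demonstration, not the gradient projection implementation used in commercial software.

```python
import numpy as np
from scipy.optimize import nnls

# --- Illustrative inputs (assumed, not measured data) ---------------------
tau = np.logspace(-6, -1, 200)            # delay times tau_j, 1 us to 100 ms
gammas = np.logspace(1, 5, 60)            # trial decay rates Gamma_i (1/s)
K = np.exp(-np.outer(tau, gammas))        # kernel: column i is exp(-Gamma_i * tau)

# Synthetic field correlation function for a single decay rate, plus noise
rng = np.random.default_rng(0)
g1 = np.exp(-1.0e3 * tau) + 0.002 * rng.standard_normal(tau.size)

# Quadratic weighting: emphasize the strongly correlated (early) points
w = g1.clip(min=0.0) ** 2

# Regularizer alpha: larger values force a smoother amplitude distribution
alpha = 0.01

# First-order regularization operator: differences of adjacent amplitudes
L = np.diff(np.eye(gammas.size), axis=0)

# Minimize  sum_j w_j [g1_j - sum_i A_i exp(-Gamma_i tau_j)]^2 + alpha*||L A||^2
# subject to A_i >= 0, by stacking the weighted kernel with the scaled
# difference operator and solving a single non-negative least squares problem.
A_aug = np.vstack([K * np.sqrt(w)[:, None], np.sqrt(alpha) * L])
b_aug = np.concatenate([g1 * np.sqrt(w), np.zeros(L.shape[0])])
amplitudes, residual = nnls(A_aug, b_aug)  # A_i vs Gamma_i: the decay-rate distribution
```

Reducing alpha in this sketch sharpens the recovered amplitude distribution, while increasing it smooths the peaks, mirroring the behavior of the α parameter discussed below.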
The normalized display of Ai vs. Ri (or Ai vs. diameter) is called the intensity particle size distribution. Mean peak sizes are intensity-weighted average values, and are obtained directly from the size histogram using the following expression:
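$$\bar{R} = \frac{\sum_i A_i R_i}{\sum_i A_i}$$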
The peak width or standard deviation (σ), indicative of the unresolved distribution in the
peak itself, can also be obtained directly from the particle size distribution histogram:
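$$\sigma = \sqrt{\frac{\sum_i A_i \big(R_i - \bar{R}\big)^{2}}{\sum_i A_i}}$$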
Figure 4 shows an example of a DLS-derived intensity weighted particle size
distribution for a 60 nm latex standard, calculated using an NNLS type algorithm, along
with the mean and standard deviation of the DLS peak.
Figure 4: Example DLS results for 60 nm latex, in histogram format.
What's The "Best" NNLS Algorithm?
A common question from users of DLS instrumentation is "what is the best NNLS
algorithm?". Intuitively, one might think that the obvious best method for fitting the
correlogram would be to use an iterative approach until the sum of squares error
is minimized. For a perfect noise free correlation function, this approach would be
ideal. But in practice, there is no such thing as a perfect noise free correlogram, and
minimizing the sum of squares error in the presence of noise can lead to erroneous
results, with no reproducibility and minimal validity. So the question, "what's the best
NNLS algorithm", is a good one.
The problem here is that the answer depends very much on the type of sample being
analyzed, the working size range of the instrument being used, and most importantly,
the level of noise in the measured correlogram. There are a variety of named NNLS
type algorithms available to light scattering researchers, either through the web or
through the collection of DLS instrument vendors. While the algorithms are all NNLS
based, what generally makes them unique is the locking of certain variables, such as
the weighting factor or the regularizer, in order to optimize the algorithm for a given set
of instrument and sample conditions. Some examples of named algorithms include:
CONTIN
The CONTIN algorithm was originally written by Steven Provencher and has become
the industry standard for general DLS analysis. CONTIN is considered to be a
conservative algorithm, in that the choice of the alpha (α) parameter controlling the
smoothness of the distribution assumes a moderate level of noise in the correlogram.
As a consequence, particle distribution peaks which are close in terms of size
tend to be blended together in a CONTIN-derived size distribution. See http://sprovencher.com/pages/contin.shtml for additional information.
Regularization
The Regularization algorithm, written by Maria Ivanova, is a more aggressive algorithm
which is optimized for dust-free small particle samples, such as pure proteins and
micelles. The Regularization algorithm utilizes a small α parameter, thereby assuming
a low level of noise in the measured correlogram. As a consequence, Regularization
derived distributions tend to have sharper peaks. However, this low noise estimate can
lead to phantom (nonexistent) peaks if noise is present in the correlogram.
GP & MNM
The GP and MNM algorithms, distributed with the Zetasizer Nano instrument, are
general NNLS algorithms that have been optimized for the wide range of sample sizes
and concentrations suitable for measurement with the Nano system. The GP (General
Purpose) algorithm is conservative, with a moderate estimate of noise, and is suitable
for milled or naturally-occurring samples. The MNM (Multiple Narrow Mode) algorithm
is more aggressive, with a lower noise estimate, and is better suited for mixtures of
narrow polydispersity particles such as latices and pure proteins.
REPES & DYNALS
The REPES and DYNALS algorithms are available for purchase through various
internet sites. Both are similar to the industry standard CONTIN, although more
aggressive with regard to noise estimates.
There are two primary parameters that are varied in NNLS algorithms: the weighting
factor and the alpha parameter (regularizer). The table below shows a comparison
of the default values of these two parameters for some of the algorithms cited
above. Note that the algorithms listed in the table are listed in order of increasing
aggressiveness.
Algorithm              Weighting scheme    α Parameter
CONTIN                 Quartic             Variable
General Purpose        Quadratic           0.01
Multiple narrow mode   Quadratic           0.001
Regularization         Quadratic           0.00002
Data Weighting
As described earlier, data weighting in the DLS deconvolution algorithm places
emphasis on the larger and more significant correlation coefficients rather than the
less important smaller baseline values. Figure 5 shows the effects of data weighting
on a DLS correlogram for 1 mg/mL lysozyme, after filtration through a 20 nm Anotop
filter. As shown in this figure, the weighting serves to stretch out the correlogram along
the Y axis. In the absence of data weighting, noise in the baseline can lead to the
appearance of ghost or noise peaks.
Figure 5: Comparison of quadratic and quartic weighting on the measured correlogram for a 1 mg/mL
lysozyme sample, after filtration through a 20 nm filter, along with the resultant size distributions derived
using the Malvern General Purpose algorithm.
Alpha (α) Parameter Or Regularizer
The Regularizer or α parameter in NNLS based deconvolution algorithms controls the
acceptable degree of spikiness in the resultant distribution. Large α values generate
smoother, less resolved distributions, whereas smaller alpha values generate more
spiky distributions, with an appearance of better resolution. The α parameter then, can
be loosely described as an estimate of the expected level of noise in the measured
correlogram.
There is no ideal or best alpha parameter. The appropriate value depends upon the
sample being analyzed. For mixtures of narrow mode (low polydispersity) and strongly
scattering particles, decreasing the α parameter can sometimes enhance the resolution
in the intensity particle size distribution. Consider Figure 6 for example, which shows
the distribution dependence on the α parameter for a mixture of 60 nm and 220 nm
latexes. The results derived using the default regularizer for the Malvern General
Purpose and Multiple Narrow Mode algorithms, α = 0.01 and 0.001 respectively, are
noted for comparison.
Figure 6: Intensity particle size distribution dependence on the α parameter for a mixture of 60 and 220
nm latex particles.
As evident in the above figure, a decrease in the α parameter leads to an increase
in both the number of resolved modes and the sharpness of the peaks. It is also
instructive to note that once baseline resolution is achieved, the resultant sizes (peak
positions) are independent of the α value, with only the apparent width of the peaks
changing with further changes in the regularizer.
The influence of the α parameter on the measured size distribution for a monomodal
220 nm latex sample is shown in Figure 7. As with the latex mixture, reduction of
the regularizer has no influence on the measured particle size, and serves only to
decrease the apparent polydispersity of the peak, i.e. decrease the peak width.
Figure 7: Intensity particle size distribution dependence on the α parameter for a monomodal 220 nm
latex sample.
In the previous two examples for low polydispersity latices, the more aggressive
algorithms utilizing smaller α values generated results that were consistent with
the sample properties. For samples that are not composed of narrow mode particle
families, however, aggressive reduction in the α parameter can lead to over-interpretation of the measured data and the generation of more modes or peaks
than are actually present in the sample.
Figure 8 shows the influence of the α parameter on the resultant size distribution for
a dilute protein sample. The monomeric protein has a known hydrodynamic diameter
of 3.8 nm. Under the conditions employed here, the protein is also known to exist as a
mixture of low order oligomers. As evident in this figure, the intensity weighted mean
size of the sample is independent of the α parameter selected, and is consistent with
the expected average size of an oligomeric mix. If the General Purpose algorithm
is selected, with an α value of 0.01, the peak width is also representative of the
expected polydispersity for a mix of protein oligomers (~ 25-30%). Over reduction of
the α parameter (< 0.01) however, generates a phantom peak at circa 2 nm, and leads
to the erroneous conclusion that the sample is composed of only two particle sizes,
one of which is much smaller than the monomer itself.
Figure 8: Influence of the α parameter on the resultant size distribution for a 0.3 mg/mL lysozyme sample
in PBS at pH 6.8.
The results shown in Figure 9 represent another example where the less
aggressive α value for the Malvern General Purpose algorithm is the appropriate value
to use in the generation of the particle size distribution. In the absence of stabilizing
agents, hemoglobin (Hb) denatures and aggregates at temperatures >38°C. When the
protein denatures, the aggregates formed are random in size, with no specificity, i.e.
very polydisperse. As such, the distribution best representative of the actual sample
is that generated using the Malvern General Purpose algorithm, with an α value of
0.01. Reduction of the α parameter to values < 0.01 leads to the generation of two
apparently unique size classes in the 300 nm & 800 nm regions that are inconsistent
with the actual properties of the sample.
Figure 9: Influence of the α parameter on the resultant size distribution for denatured hemoglobin at 44 °C in PBS buffer.
Multiple Solutions (CONTIN)
CONTIN is unique among DLS algorithms, in that it generates a collection of
solutions, each with a set of qualifying descriptors. The descriptors used to identify
the most probable solution are 1) the number of peaks, 2) the degrees of freedom, 3)
the α parameter, and 4) the probability to reject. The most probable solution is selected
using the principle of parsimony, which states that after elimination of all solutions
inconsistent with a priori information, the best solution is the one revealing the least
amount of new information.
Figure 10 shows a comparison of the CONTIN generated solution set for the 60 nm
and 220 nm latex mixture discussed earlier. As seen in this figure, one of the solutions
(CONTIN 1) is consistent with the results generated using the Malvern Multiple Narrow
Mode algorithm (α = 0.001) and is a good representation of the actual sample. The
CONTIN determined most probable solution however, is CONTIN 6, which shows a
blending of the populations to form a single peak of high polydispersity.
Figure 10: Comparison of the CONTIN generated solution set of size distributions for the 60 nm and 220
nm latex mixture.
In comparison to the Malvern General Purpose and Multiple Narrow Mode algorithms,
CONTIN tends to be even more conservative than the GP algorithm. While this works well
for noise recognition and management (dilute protein in Figure 11), it can also lead to a
reduction in apparent particle size resolution (latex mixture in Figure 11) for mixtures.
Figure 11: Comparison of CONTIN (▬), General Purpose (▬), and Multiple Narrow Mode (▬) results for a
mixture of 60 nm and 220 nm latices and a dilute protein (0.3 mg/mL lysozyme) sample.
To finally address the question of "what is the best DLS algorithm?", the truthful answer
is that there is no best algorithm. Each of the algorithms gives useful information to
the researcher. The best approach is to couple what you know with what you suspect
about the sample, compare results from various algorithms, recognizing the strengths
and limitations of each, and then look for robustness and repeatability in the results.
In other words, if multiple measurements all indicate a shoulder in a wide peak,
which resolves itself into a unique repeatable population upon application of a more
aggressive algorithm, the chances are strong that this unique population is real. If
repeat measurements generate inconsistencies, then it is best to err on the side of a
more conservative algorithm, such as the Malvern General Purpose or CONTIN.
Additional Reading
Braginskaya, Dobitchin, Ivanova, Klyubin, Lomakin, Noskin, Shmelev, & Tolpina
"Analysis of the polydispersity by PCS. Regularization procedure", Phys. Scr. 1983, 28,
73.
Frisken "Revisiting the method of cumulants for the analysis of dynamic light scattering
data", Applied Optics 2001, 40(24), 4087-4091.
ISO 13321 "Particle size analysis: Photon correlation spectroscopy", 1996.
Liu, Arnott, & Hallett "Particle size distribution retrieval from multispectral optical depth:
Influences of particle nonsphericity and refractive index", Journal of Geophysical
Research 1999, 104(D24), 31753.
Pecora "Dynamic Light Scattering: Applications of Photon Correlation Spectroscopy",
Plenum Press, 1985.
Provencher "Contin: a general purpose constrained regularization program for inverting
noisy linear algebraic and integral equations", Computer Physics Communications
1982, 27, 229-242.
Provencher "A constrained reqularization method for inverting data represented by
linear algebraic or integral equations", Computer Physics Communications 1982, 27,
213-227.
Schmitz "An Introduction To Dynamic Light Scattering By Macromolecules", Academic
Press, New York, 1990.
Stepanek "Data analysis in dynamic light scattering" in Light scattering: principles and
development, Ed: Brown, Pub: Clarendon Press, Oxford, 1996, 177-241.
"Application of dynamic light scattering (DLS) to protein therapeutic formulations: part I
- Basic Principles". Inform White Paper. Malvern Instruments Limited.
"Application of dynamic light scattering (DLS) to protein therapeutic formulations:
part II - concentration effects and particle interactions". Inform White Paper. Malvern
Instruments Limited.
"Application of dynamic light scattering (DLS) to protein therapeutic formulations: part
IV - frequently asked questions". Inform White Paper. Malvern Instruments Limited.
"A basic guide to particle characterization". Inform White Paper, Malvern Instruments
Limited.
"Developing a bioformulation stability profile". Inform White Paper, Malvern Instruments
Limited.
About Malvern's Bioscience Development
Initiative
Malvern Instruments' Bioscience Development Initiative was established to accelerate
innovation, development, and the promotion of novel technologies, products, and
capabilities to address unmet measurement needs in the biosciences market.
Malvern Instruments Limited
Grovewood Road, Malvern,
Worcestershire, UK. WR14 1XZ
Malvern Instruments is part of Spectris plc, the Precision Instrumentation and Controls Company.
Spectris and the Spectris logo are Trade Marks of Spectris plc.
Tel: +44 1684 892456
Fax: +44 1684 892789
www.malvern.com
All information supplied within is correct at time of publication.
Malvern Instruments pursues a policy of continual improvement due to technical development. We therefore reserve
the right to deviate from information, descriptions, and specifications in this publication without notice. Malvern
Instruments shall not be liable for errors contained herein or for incidental or consequential damages in connection with
the furnishing, performance or use of this material.
Malvern Instruments owns the following registered trademarks: Bohlin, FIPA, Insitec, ISYS, Kinexus, Malvern, Malvern
'Hills' logo, Mastersizer, Morphologi, Rosand, 'SEC-MALS', Viscosizer, Viscotek, Viscogel and Zetasizer.
©2014 Malvern Instruments Limited