What Does `Nonparametric` Mean?

John Hughes
September 2, 2014
These notes answer the question "What does 'nonparametric' mean?"
by developing a model for quantum dot images of kinesin motor
assays. The model is semiparametric (it has a parametric component
and a nonparametric component) and so provides a natural point of
departure from the fully parametric methods with which the student is
familiar.
A kinesin is a type of processive motor protein. Processive motor
proteins are ATP-powered biological nano-machines that transport
large and/or massive cargoes within eukaryotic cells. The existence
of eukaryotic organisms depends on these tiny motors because the
passive process of diffusion cannot generate enough force to transport massive payloads quickly, and larger cargoes are likely to also
be impeded by collisions with other objects. A motor protein overcomes these difficulties by hydrolyzing ATP in order to tow a cargo,
such as an organelle or vesicle, rapidly and in a directed path along
a substrate (e.g., a microtubule). A kinesin motor protein is shown in
Figure 1.
Figure 1: This figure depicts a kinesin towing its cargo along a
microtubule; the labeled parts are the cargo, stalk, neck linker, head,
and microtubule. The diagram is not only stylized but is also not to
scale: a kinesin is many times smaller than its cargo. For a typical
kinesin, the maximum distance between the heads is approximately 8 nm,
whereas a mitochondrion cargo, for example, would have a diameter of
500–10,000 nm.
Fluorescence Microscopy
One way to study small biological specimens like kinesins is fluorescence microscopy. In fluorescence microscopy, a specimen of interest
is tagged with a fluorescent molecule or particle (also called a fluorophore). A fluorescence microscope then irradiates the specimen
with light at the excitation wavelength of the fluorophore, and when
the excited electrons revert to the ground state they emit photons at
the emission wavelength. A filter separates the emitted light from the
excitation light so that only the light from the fluorescent material
can pass to the microscope’s eyepiece and camera system (Figure 2).
Images of Kinesin Motor Assays
To produce an image of kinesins, each fluorophore is attached to a
head (roughly 5 nm in diameter) or near the neck linker of a kinesin
motor so that the movement of the fluorophore roughly mirrors the
movement of the motor. In order to infer the dynamics of the motor,
it is important to determine the location of the fluorophore to within
a few nanometers, a task that is complicated by a pixel width on the
order of several tens of nanometers. Luckily, the true location of the
fluorophore is at the center of a point spread function that contains a
great deal of information about the center. By extracting this information we can overcome the limited resolution of the imaging apparatus
and so accurately track a processing motor across the frames of a
"movie."

Figure 2: This figure shows the structure of a fluorescence microscope
(http://en.wikipedia.org/wiki/File:FluorescenceFilters_2008-09-28.svg).
Two images of kinesin assays are shown in Figure 3. The first
image was created using green fluorescent proteins (GFP). A fully
parametric model (call it Model 1) is sufficient for such images. The
second image was created using quantum dots. Quantum dot images
require that Model 1 be extended in a couple of ways, one of which is
nonparametric.
Model 1: A Model for GFP Images
The material in this section was adapted from Hughes, Fricks, and
Hancock [2010].
Represent a microscope slide as a rectangular region $T \subset \mathbb{R}^2$, with
$N$, a Poisson random field on $T$, representing the light emission from
the slide. The intensity function for $N$ consists of background fluorescence of magnitude $B$ and a sum of bivariate Gaussian functions,
one for each fluorophore. (A fluorophore's emission pattern follows
an Airy function (Figure 4), but the Airy function is approximated
well by a Gaussian function.) Hence, for any Borel set $R \subset T$, $N(R)$ is a Poisson
random variable with expectation
\[
E N(R) = \iint_R \Big\{ B + \sum_{j=1}^{J} g_j(x, y) \Big\} \, dx \, dy,
\]
where
\[
g_j(x, y) = A_j \cdot \exp\left\{ -\frac{(x - s_j)^2 + (y - t_j)^2}{S^2} \right\}
\]
and $J$ is the number of fluorophores. Note that $g_j$, which represents
the $j$th fluorophore, is the Gaussian function centered at $(s_j, t_j)$
with "height" $A_j$ and "spread" $S$.
The light emitted from the slide is collected by the pixels of a
CCD camera. We represent the pixels by partitioning $T$ into a uniform grid with each cell $d$ nanometers on a side, thus arriving at,
say, $n$ pixels, $R_1, \dots, R_n$, and their corresponding random variables,
$N(R_1), \dots, N(R_n)$. To ease notation and description, we will henceforth let $Z_i = N(R_i)$ for $i = 1, \dots, n$ and use the term 'pixel' to refer
to the random variable $Z_i$ rather than to the region $R_i$. Thus, for a
given pixel $Z_i$ with center $(x_i, y_i)$,
\[
E Z_i = \int_{y_i - d/2}^{y_i + d/2} \int_{x_i - d/2}^{x_i + d/2} \Big\{ B + \sum_{j=1}^{J} g_j(x, y) \Big\} \, dx \, dy,
\]
which is approximately equal to
\[
\left[ B + \sum_j A_j \cdot \exp\left\{ -\frac{(x_i - s_j)^2 + (y_i - t_j)^2}{S^2} \right\} \right] d^2.
\]

Figure 3: This figure shows GFP and quantum dot images of kinesin assays.

Figure 4: This figure shows a surface plot of an Airy pattern
(http://en.wikipedia.org/wiki/File:Airy-3d.svg) and the accuracy of a
Gaussian approximation (http://en.wikipedia.org/wiki/File:Airy_vs_gaus.svg).
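The approximation above replaces the integral over a pixel by the integrand at the pixel center times the pixel area $d^2$. The following is a minimal numerical check of that step, not part of the original analysis. All numeric values (pixel width, spread, heights, fluorophore location) are hypothetical, and the function names are illustrative.

```python
import math

def gaussian_spot(x, y, height, s, t, spread):
    """g_j(x, y) = A_j * exp(-((x - s_j)^2 + (y - t_j)^2) / S^2)."""
    return height * math.exp(-((x - s) ** 2 + (y - t) ** 2) / spread ** 2)

def pixel_integral(xc, yc, d, background, height, s, t, spread, m=50):
    """Integrate B + g over the d-by-d pixel centered at (xc, yc),
    using a midpoint rule on an m-by-m subgrid of the pixel."""
    h = d / m
    total = 0.0
    for a in range(m):
        for b in range(m):
            x = xc - d / 2 + (a + 0.5) * h
            y = yc - d / 2 + (b + 0.5) * h
            total += (background + gaussian_spot(x, y, height, s, t, spread)) * h * h
    return total

# Hypothetical values: 50 nm pixels, spread S = 150 nm, one fluorophore.
d, B, A, s, t, S = 50.0, 0.02, 0.2, 110.0, 95.0, 150.0

exact = pixel_integral(100.0, 100.0, d, B, A, s, t, S)
approx = (B + gaussian_spot(100.0, 100.0, A, s, t, S)) * d ** 2
relative_error = abs(exact - approx) / exact
print(exact, approx, relative_error)
```

The approximation is good when the pixel width is small relative to the spread $S$, so that the intensity is nearly constant across a pixel.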
We reparameterize so that $B$ and the $A_j$ absorb the constant $d^2$. This
gives
\[
E Z_i \approx B + \sum_j A_j \cdot \exp\left\{ -\frac{(x_i - s_j)^2 + (y_i - t_j)^2}{S^2} \right\} = \mu_i. \tag{1}
\]
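As a concrete illustration of (1), the sketch below evaluates $\mu_i$ at the center of every pixel of a small grid. The grid size, pixel width, and parameter values are hypothetical, and `pixel_mean` and `mean_image` are names introduced here for illustration only.

```python
import math

def pixel_mean(x, y, background, spread, spots):
    """Equation (1): mu = B + sum_j A_j * exp(-((x - s_j)^2 + (y - t_j)^2) / S^2).

    `spots` is a list of (A_j, s_j, t_j) triples.
    """
    total = background
    for height, s, t in spots:
        total += height * math.exp(-((x - s) ** 2 + (y - t) ** 2) / spread ** 2)
    return total

def mean_image(n_rows, n_cols, pixel_width, background, spread, spots):
    """Evaluate mu_i at the center of each pixel of an n_rows x n_cols grid."""
    return [
        [
            pixel_mean((c + 0.5) * pixel_width, (r + 0.5) * pixel_width,
                       background, spread, spots)
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]

# Hypothetical values, already reparameterized so that B and A_j absorb d^2.
mu = mean_image(n_rows=5, n_cols=5, pixel_width=50.0,
                background=100.0, spread=120.0,
                spots=[(400.0, 125.0, 125.0)])

for row in mu:
    print(" ".join(f"{value:7.1f}" for value in row))
```

With the single fluorophore placed at the center of the middle pixel, that pixel's mean is $B + A_1$ and the means fall off symmetrically toward the background level $B$.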
The intensity of the background fluorescence is sufficiently large to
allow for a normal approximation to the Poisson, which implies that
\[
Z_i \mathrel{\dot{\sim}} \mathcal{N}(\mu_i, \mu_i).
\]
(Recall that if $X \sim \mathcal{P}(\mu)$ and $\mu$ is large, then
$X \mathrel{\dot{\sim}} \mathcal{N}(\mu, \mu)$.)
However, the stochasticity in an image may not be limited to Poisson noise. Randomness may also arise from the camera system due
to signal quantization and dark current, for example. This source of
variation is modeled as Gaussian white noise. Thus we arrive at the
approximate sampling model for pixel i:
\[
Z_i \mathrel{\dot{\sim}} \mathcal{N}(\mu_i, \mu_i) + \mathcal{N}(0, \sigma^2) \overset{L}{=} \mathcal{N}(\mu_i, \mu_i + \sigma^2), \tag{2}
\]
and $Z_i$ is independent of the other pixels because the underlying process is Poisson and the instrumentation error is independent.
(Recall that a Poisson process has independent increments.)
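The equality in law in (2) says that adding independent $\mathcal{N}(0, \sigma^2)$ camera noise to the $\mathcal{N}(\mu_i, \mu_i)$ approximation leaves the mean at $\mu_i$ and inflates the variance to $\mu_i + \sigma^2$. A small simulation sketch can check this empirically. The values of $\mu_i$ and $\sigma^2$ and the function name `simulate_pixel` are hypothetical.

```python
import random

def simulate_pixel(mu, sigma2, rng):
    """One draw from model (2): normal approximation to the Poisson count
    plus independent Gaussian white noise from the camera system."""
    poisson_part = rng.gauss(mu, mu ** 0.5)       # N(mu, mu)
    camera_noise = rng.gauss(0.0, sigma2 ** 0.5)  # N(0, sigma^2)
    return poisson_part + camera_noise

rng = random.Random(2014)
mu_i, sigma2 = 500.0, 25.0   # hypothetical values

draws = [simulate_pixel(mu_i, sigma2, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((z - sample_mean) ** 2 for z in draws) / (len(draws) - 1)

# The sample moments should be close to mu_i and mu_i + sigma2.
print(sample_mean, sample_var)
```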
Model 2: A Model for Quantum Dot Images
The material in this section was adapted from Hughes and Fricks
[2011].
Maximum likelihood estimation based on (2) works well for certain GFP images, but our quantum dot images depart from (2) in two
ways:
1. the variance may not be equal to the shifted mean, and
2. a quantum dot image may exhibit salt-and-pepper noise, so called
because it is present or absent at random.
More specifically, the variance function for our quantum dot images
is unknown but Poisson-like in that the variance changes with the
mean. And the salt-and-pepper noise is exponentially distributed.
These differences led us to formulate a two-component mixture
model for the pixels of these quantum dot images. The mixture density for $Z_i$ is given by
\[
f_i(z_i) = (1 - \pi) f_N^i(z_i) + \pi \{ f_N^i(z_i) * f_E(z_i) \}, \tag{3}
\]
where $f_N^i$ denotes the normal density corresponding to pixel $i$, $f_E$
refers to an exponential density, $*$ is convolution, and $\pi \in (0, 1)$.
(A random variable $X$ is said to have a mixture distribution if the density
for $X$ has the form $f(x) = \sum \pi_i f_i(x)$, where the $f_i$ are densities
and the $\pi_i$ are non-negative and sum to 1. The $\pi_i$ are called mixing
proportions.) The normal-exponential convolution density accommodates the
exponential error. This density is given by
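\[
f_{N * E}(z) = f_N(z) * f_E(z)
= \frac{1}{\lambda} \exp\left( \frac{\mu - z}{\lambda} + \frac{v}{2\lambda^2} \right)
\left\{ 1 - \Phi\left( \frac{\mu - z}{\sqrt{v}} + \frac{\sqrt{v}}{\lambda} \right) \right\}
\]
for a normal with mean $\mu$ and variance $v$ and an exponential with
mean $\lambda$, where $\Phi$ is the standard normal cdf.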
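The closed form above can be checked against the defining convolution integral, $\int_0^\infty f_N(z - e) f_E(e) \, de$. The sketch below does this with a simple midpoint rule. It is an illustrative check under hypothetical parameter values, and the function names are introduced here, not taken from the cited papers.

```python
import math

def normal_pdf(z, mu, v):
    """Density of a normal with mean mu and variance v."""
    return math.exp(-(z - mu) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def normal_exp_pdf(z, mu, v, lam):
    """Closed-form normal-exponential convolution density:
    (1/lam) exp{(mu - z)/lam + v/(2 lam^2)} [1 - Phi((mu - z)/sqrt(v) + sqrt(v)/lam)].
    """
    root_v = math.sqrt(v)
    arg = (mu - z) / root_v + root_v / lam
    upper_tail = 0.5 * math.erfc(arg / math.sqrt(2.0))  # equals 1 - Phi(arg)
    return math.exp((mu - z) / lam + v / (2.0 * lam ** 2)) * upper_tail / lam

def normal_exp_pdf_numeric(z, mu, v, lam, steps=20000, upper=40.0):
    """Direct convolution by the midpoint rule on [0, upper * lam]."""
    h = upper * lam / steps
    total = 0.0
    for k in range(steps):
        e = (k + 0.5) * h
        total += normal_pdf(z - e, mu, v) * math.exp(-e / lam) / lam * h
    return total

# Hypothetical values: mu = 500, v = 525, exponential mean lam = 200.
for z in (500.0, 650.0):
    print(z, normal_exp_pdf(z, 500.0, 525.0, 200.0),
          normal_exp_pdf_numeric(z, 500.0, 525.0, 200.0))
```

The upper tail is computed with `math.erfc` rather than as one minus a cdf, which avoids loss of precision when the argument is large.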
Retaining (1) and combining it with the new error model gives the
new expectation for pixel $i$:
\[
E(Z_i \mid W_i) = B + \sum_{j=1}^{J} g_j(x_i, y_i) + W_i \lambda = \mu_i + W_i \lambda,
\]
where the $W_i$ are iid Bernoulli random variables that indicate the
presence or absence of the exponential error, which has mean $\lambda$.
Since the variance function for the normal component is unknown,
we model it as a function of the mean, $v(\mu)$. This implies that pixel $i$
is approximately distributed as
\[
Z_i \mathrel{\dot{\sim}} \mathcal{N}\{\mu_i, v(\mu_i)\} + W_i \, \mathcal{E}(\lambda), \tag{4}
\]
conditional on $W_i$ and independent of the other pixels. Note that this
model can include the Gaussian white noise from (2) through $v(\mu)$.
Thus (4) can be viewed as an extension of the previous model, (2).
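To see what (4) implies for a single pixel, one can simulate from it. In the sketch below the variance function is a made-up linear stand-in, since the true $v$ is unknown. The values of $\mu_i$, $\lambda$, and $\pi$ are likewise hypothetical, and the function names are illustrative.

```python
import random

def variance_function(mu):
    """Hypothetical stand-in for the unknown variance function v(mu)."""
    return 1.5 * mu + 25.0

def simulate_qd_pixel(mu, lam, pi, rng):
    """One draw from model (4): N{mu, v(mu)} + W * Exponential(mean lam),
    where W ~ Bernoulli(pi) indicates salt-and-pepper noise."""
    w = 1 if rng.random() < pi else 0
    z = rng.gauss(mu, variance_function(mu) ** 0.5)
    if w == 1:
        z += rng.expovariate(1.0 / lam)  # expovariate takes the rate 1/mean
    return z, w

rng = random.Random(67)
draws = [simulate_qd_pixel(500.0, 200.0, 0.1, rng) for _ in range(50000)]

clean = [z for z, w in draws if w == 0]
noisy = [z for z, w in draws if w == 1]

# Conditional means should be near mu and mu + lam, matching E(Z | W).
print(sum(clean) / len(clean), sum(noisy) / len(noisy))
```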
From Parametric to Semiparametric
The log-likelihood corresponding to (3) is given by
\[
\ell_n(\theta \mid Z) = \sum_{i=1}^{n} \log\{ (1 - \pi) f_N^i(Z_i \mid \psi_1) + \pi f_{N * E}^i(Z_i \mid \psi_2) \},
\]
where $\psi_1 = (B, S, A_1, s_1, t_1, \dots, A_J, s_J, t_J)'$ are the parameters for the
mean of the normal, $\psi_2 = (\psi_1', \lambda)'$, and $\theta$ is the full parameter vector,
$(\psi_2', \pi)'$.
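The log-likelihood is a sum over pixels of the log of the two-component mixture density. The following sketch evaluates it for given pixel means and variances. It assumes the means $\mu_i$ have already been computed from $\psi_1$ via (1) and that the variances are supplied by the caller. The data values and function names are hypothetical.

```python
import math

def normal_pdf(z, mu, v):
    """Density of a normal with mean mu and variance v."""
    return math.exp(-(z - mu) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def normal_exp_pdf(z, mu, v, lam):
    """Normal-exponential convolution density (exponential mean lam)."""
    root_v = math.sqrt(v)
    arg = (mu - z) / root_v + root_v / lam
    upper_tail = 0.5 * math.erfc(arg / math.sqrt(2.0))  # 1 - Phi(arg)
    return math.exp((mu - z) / lam + v / (2.0 * lam ** 2)) * upper_tail / lam

def mixture_loglik(z_values, means, variances, lam, pi):
    """Sum over pixels of log{(1 - pi) f_N(z_i) + pi f_{N*E}(z_i)}."""
    total = 0.0
    for z, mu, v in zip(z_values, means, variances):
        density = (1.0 - pi) * normal_pdf(z, mu, v) + pi * normal_exp_pdf(z, mu, v, lam)
        total += math.log(density)
    return total

# Hypothetical data: the second pixel looks like salt-and-pepper noise.
z_obs = [500.0, 900.0]
mu_obs = [500.0, 500.0]
v_obs = [525.0, 525.0]

print(mixture_loglik(z_obs, mu_obs, v_obs, lam=200.0, pi=0.1))
print(mixture_loglik(z_obs, mu_obs, v_obs, lam=200.0, pi=1e-6))
```

In this toy example the outlying pixel is far better explained when the mixing proportion $\pi$ is not negligible, which is the role the second mixture component plays.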
Note that $\theta$ is a Euclidean parameter: $\theta \in \mathbb{R}^p$, where $p = 3J + 4$.
And so the specification given in (3) appears to leave us in "parametricland." But recall that the normal component of the mixture has an
unknown variance function $v(\mu)$. The fact that $v$ is unknown leaves
us with one foot in "parametricland" and the other foot in "nonparametricland." Our starting point was a parametric model. We added
a nonparametric component. The result is a model of hybrid type: a
semiparametric model.
To summarize: a parametric model has a Euclidean parameter; a
nonparametric model has an infinite-dimensional parameter; and a model
with both finite-dimensional and infinite-dimensional parameters is
called a semiparametric model.
Infinite-Dimensional Parameters
In what sense is $v$ (or any function, $f_0$ say) infinite dimensional? This
question has a rather abstract answer, but we will answer it using a
concrete example.
Suppose that we are willing to make some assumptions about our
infinite-dimensional parameter. Perhaps we have reason to believe
that our function is a real polynomial in one variable, for example.
That is, we assume that
\[
f_0(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m,
\]
where $m$ is an unknown non-negative integer and the $a_i$ are unknown
real numbers. It turns out that the set of all such functions,
\[
\mathbb{R}[x] = \{ a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m : m \in \mathbb{N}, \ a_i \in \mathbb{R} \},
\]
is a vector space. A basis for this space is $\{1, x, x^2, \dots\}$. Recall that the
dimension of a vector space is defined to be the number of vectors in
any basis. Since this basis contains infinitely many vectors, $\mathbb{R}[x]$ must
be infinite dimensional. This implies that the problem of estimating
$f_0$ is akin to finding a needle in an infinite haystack!
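One way to make this concrete is to store a polynomial as its list of coordinates with respect to the basis $\{1, x, x^2, \dots\}$. The sketch below is only an illustration, with made-up function names. Each basis element $x^k$ needs its own coordinate, so no fixed list length can hold every polynomial.

```python
from itertools import zip_longest

def poly_add(f, g):
    """Pointwise sum of two polynomials given as coefficient lists
    [a_0, a_1, ..., a_m]; the shorter list is padded with zeros."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0.0)]

def poly_scale(a, f):
    """Scalar multiple of a polynomial."""
    return [a * c for c in f]

def poly_eval(f, x):
    """Evaluate a_0 + a_1 x + ... + a_m x^m by Horner's rule."""
    result = 0.0
    for c in reversed(f):
        result = result * x + c
    return result

def monomial(k):
    """Coordinates of the basis element x^k: k zeros followed by a one."""
    return [0.0] * k + [1.0]

f = [1.0, 2.0]                       # 1 + 2x
g = [0.0, 0.0, 3.0]                  # 3x^2
h = poly_add(f, poly_scale(2.0, g))  # 1 + 2x + 6x^2

print(h, poly_eval(h, 2.0))
print(len(monomial(5)))  # x^5 needs six coordinates
```

A parametric model would fix the list length in advance, whereas here the length $m + 1$ is itself unknown and unbounded.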
(Let $f(x)$ and $g(x)$ be elements of $\mathbb{R}[x]$. Then the addition and
scalar multiplication on $\mathbb{R}[x]$ are defined as
$(f + g)(x) = f(x) + g(x)$ and $(a f)(x) = a f(x)$.)

References
John Hughes and John Fricks. A mixture model for quantum dot
images of kinesin motor assays. Biometrics, 67(2):588–595, 2011.
John Hughes, John Fricks, and William O. Hancock. Likelihood
inference for particle location in fluorescence microscopy. The Annals
of Applied Statistics, 4(2):830–848, 2010.