Mathematical characterization of three

BIOINFORMATICS
Vol. 20 no. 11 2004, pages 1653–1662
doi:10.1093/bioinformatics/bth135
Mathematical characterization of
three-dimensional gene expression patterns
L. da F. Costa1,∗ , M. S. Barbosa1 , E. T. M. Manoel1 , J. Streicher2 ,
and G. B. Müller3,4
1 Cybernetic
Vision Research Group, GII-IFSC, Universidade de São Paulo,
São Carlos, SP, Caixa Postal 369, 13560-970, Brasil, 2 Department of Anatomy and
3 Department of Zoology, Integrative Morphology Group, University of Vienna,
1090 Vienna, Austria and 4 Konrad Lorenz Institute for evolution and Cognition
Research, 3422 Altenberg, Austria
Received on July 31, 2003; revised on December 18, 2003; accepted on December 19, 2004
Advance Access publication February 26, 2004
ABSTRACT
Motivation: The importance of a systematic methodology for
the mathematical characterization of three-dimensional gene
expression patterns in embryonic development.
Methods: By combining lacunarity and multiscale fractal
dimension analyses with computer-based methods of threedimensional reconstruction, it becomes possible to extract
new information from in situ hybridization studies. Lacunarity
and fractality are appropriate measures for the cloud-like gene
activation signals in embryonic tissues. The newly introduced
multiscale method provides a natural extension of the fractal
dimension concept, being capable of characterizing the fractality of geometrical patterns in terms of spatial scale. This tool
can be systematically applied to three-dimensional patterns of
gene expression.
Results: Applications are illustrated using the threedimensional expression patterns of the myogenic marker gene
Myf5 in a series of differentiating somites of a mouse embryo.
Contact: [email protected]
1
INTRODUCTION
Over the last two centuries the biological sciences underwent
an ever increasing specialization, which led to the establishment of important areas such as genetics, developmental
biology and ecology. At the beginning of the 21st century,
we experience a reintegration of many of those areas, with
implications not only for a better understanding of life and
treatment of its pathologies but also for making sense of the
prodigious diversity of biological shapes. A key element in
this process is the systematic development and application
of mathematic–computational concepts to biology, through
the methods of mathematical biology, computational biology
and, more recently, bioinformatics. The potential of such
approaches is manifold. First, their computational nature is
inherently adequate for storing, organizing and mining the
∗ To
whom correspondence should be addressed.
Bioinformatics 20(11) © Oxford University Press 2004; all rights reserved.
myriad of complex biological data generated by empirical
research projects. Second, these methodologies allow not only
a more objective and reproducible quantification of results
but also a sharpening of analytical procedures traditionally
performed by human operators. Present research in biology concentrates much on the biomolecular aspects relating
development and evolution (Raff and Kaufman, 1983; Hall,
1992; Carroll, 2001; Müller and Newman, 2003; Wilkins,
2001), and it becomes particularly critical to have access to
effective means for analyzing the data produced by experiments addressing these issues. This is a task that can only
be achieved through the systematic and intensive application of computational methods for sequence comparison and
gene identification, involving areas, such as genomics, signal
processing, pattern recognition, artificial intelligence, database construction and many others. The investigation of gene
expression networks (e.g. through macro- and microarrays),
has demanded a whole series of additional mathematical
and computational resources, such as dynamical systems
approaches and advanced multivariate methods. This type of
technology allows a high ‘molecular resolution’—in the sense
that thousands of genes can be screened simultaneously—
but their temporal and spatial resolutions are very limited. In
contrast, recently introduced techniques of three-dimensional
(3D) computer representation of gene expression patterns in
developing embryos (Streicher et al., 1997, 2000; Jernvall
et al., 2000; Weninger and Mohun, 2002; Sharpe et al., 2002)
allow high spatial resolution (and a comparable time resolution) at the expense of taking into account only a small
number of genes (Fig. 1). Such techniques call for equally
advanced and adapted mathematical tools for characterizing the spatio-temporal dynamics of developmental gene
activation associated with the formation of microscopic and
macroscopic characters.
Gene expression is characterized by an intrinsic statistical
nature, in the sense that the spatio-temporal dynamics leading
to developmental changes in two different individuals from the
1653
L.da F.Costa et al.
Fig. 1. GeneEMAC (Streicher et al., 2000) reconstruction of Myf5
expression (blue) by the myogenic cells in thoracic somites 14–21
of an 11-day mouse embryo. Yellow, neural tube and red, epidermis.
same species is never identical as a consequence of genetic differences and environmental influences. At the same time, the
experimental conditions and methods used to produce gene
expression data imply a certain level of variability, such as
usually observed in histological marking and registration of
seriate sections. Given such a complexity of gene expression
data, it is foremost necessary to identify suitable geometrical measures capable of capturing meaningful information
about the intricate 3D and 4D patterns of gene activation.
One of the most intuitive approaches to such a characterization is through the spatial density of the expression signals.
But this measurement reflects only in a highly degenerated
fashion the interaction between cells expressing a significant
amount of gene product, not the intensity with which they
interact with each other or with the environment. In other
words, a virtually infinite number of expression patterns can
exhibit the same overall density, hence the term degeneracy.
Although such a type of behavior is in principle quantifiable
by traditional fractal dimensions (Peitgen and Saupe, 1992;
Falconer, 1986; Mandelbrot, 1983), these measurements do
not take into account the fact that natural objects, especially
the biological ones, have limited and varying self-similarity.
Indeed, they often represent the limitations to the fractality of real objects. First, since every object in the natural
world has a finite diameter (defined as the maximum distance
between any of its points), its fractal behavior approaches that
of a single point when observed at sufficiently large spatial
scales. Next, real objects only exhibit self-similarity along
rather limited intervals of spatial scales. A fern leaf e.g. is
fractal only through three or four hierarchical orders, and
even there the self-similarities at each of these scales are not
absolutely identical. A third and serious limitation is imposed
by the finite spatial resolution of digital acquisition and storage devices, in the sense that even an infinitely self-similar
object, such as a Koch curve, cannot be properly represented in a digital image. It is important to observe that a large
1654
number of different objects can exhibit the same fractal value.
As a matter of fact, a single scalar value cannot express properly the varied fractal behaviors occurring at different spatial
scales. An interesting approach to providing richer information about the geometry of real objects, namely the lacunarity,
was proposed by Benoit B. Mandelbrot (1983). Basically, this
multiscale measurement quantifies the size and spatial distribution of holes. More specifically, fractal patterns exhibiting
a high invariance to translation tend to present a low lacunarity. While the combined use of traditional fractal dimensions
and lacunarity reduces degeneracy, i.e. two objects may have
the same fractal dimension but different lacunarity measurements, the interpretation of the fractality remains incomplete
and subject to the problems discussed above. This implies that
the different fractal properties of objects remain unnoticed
when observed at different spatial scales and that their limited and varying fractality is not taken into account. Based
on the concept of exact Euclidean distances and dilations
(Costa and Cesar, 2001), the newly introduced multiscale
fractal dimension (MFD) provides a natural means for effectively addressing this shortcoming. Instead of yielding a single
fractal dimension value for each given datum, the MFD quantifies the fractal behavior in terms of a whole interval of spatial
scales. While the whole MFD signal can be considered for a
detailed analysis of the objects, a series of global features can
also be derived from it—such as the peak fractal area under
the MFD curve and its dispersion—which allow the quantification of meaningful geometric properties. The multiscale
nature of lacunarity and MFD ensures a much less degenerate
characterization and interpretation of the considered objects.
The current article describes and illustrates how the combined application of these two measurements can be used as
a powerful method in the analysis of 3D gene expression in
development. Special attention is given to the interpretation
of the implications of these measurements for cell interaction
(between themselves and with the environment) and migration during embryogenesis. The potential of the proposed
approach is demonstrated using the example of differential
expression of a muscle precursor gene by the myogenic cell
population within successive somites of a mouse embryo.
Since the somites of a vertebrate embryo form and differentiate in a cranio-caudal sequence, somitic gene expression
has both a spatial and a temporal dimension within a single
embryo.
2 MATERIALS AND METHODS
2.1 In situ hybridization and image acquisition
Mouse embryos were in situ hybridized against the myogenic
marker gene Myf5. A Theiler stage 18 embryo was embedded
in hydroxyethyl-methacrylate and was serially sectioned at
5 µm intervals. From every section, phase-contrast and brightfield images of the gene expression signals were captured
using a video camera attached to a Nikon Microphot-FXA
Mathematical characterization of three-dimensional gene expression patterns
microscope. Computer-based image registration, segmentation and 3D reconstruction were performed as described
previously (Streicher et al., 1997). Binary images of the
segmented signal patterns of gene expression were used for
mathematical analysis.
2.2
Density estimation by Parzen windows
One of the essential properties of gene expression patterns,
which often assume the form of clouds of points, is their variability and irregular distribution along 3D space. Therefore,
density estimation by statistical means represents one of the
most natural and immediate means for quantifying and modeling such kinds of data. The concept of density of a certain
element (e.g. cells), considering a specific measure m (e.g.
the number of cells), at a point P = (x, y) can be formulated
in terms of the following limit:
d(x, y){P = (x, y) ∈ } = lim
→0
m
,
(1)
where stands for the considered region around the point P
and . represents its size, area (for 2D spaces) or volume
(for 3D spaces). So, while for non-infinitesimal the density
can be understood as the average density of m inside , the
density, d, becomes a local measure at point P as tends to
zero. Such a point density is henceforth represented as d(x, y).
Although this definition is suitable for continuous spaces, it
cannot be applied immediately to cases where the measured
quantity is not continuous along the space (e.g. dead cells
along a tissue or when the space is quantized such as in a
digital image). In such situations, it is necessary to interpolate
through the quantified space, which can be done by using the
well-known Parzen windows approach (Duda et al., 2001),
which involves convolving the binary image (2D or 3D) with
a normalized Gaussian kernel (i.e. having unit volume under
its curve), using a specific dispersion given by the respective
covariance matrix. Therefore, for a discrete set containing
M samples, each at position pi and with intensity ai , the
density along the space, represented in terms of the position
vector, pi , is given by the following equation:
d(p)
=
M
ai gσ (p − pi ),
(2)
k=1
where gσ (p − pi ) is a circularly symmetric Gaussian centered
at p − pi with covariance matrix
2
0
σ
.
K=
0 σ2
Figure 2 shows a cloud of points divided into three regions
(a) and the respective density interpolation from the Parzen
windows approach for σ = 1 (b), 10 (c), 25 (d) and 50 (e).
Observe that while regions (1) and (3) in Figure 2a correspond
to uniform densities, the intermediate region (i.e. 2) represents a graded transition. It is clear from this example that the
interpolation obtained depends on the choice of σ , with more
sparse results being obtained for smaller values of this parameter, whose choice should take into account the expected
overall smoothness. In the current work, this value is chosen
in order to avoid deep valleys between peaks.
It is important to observe that this interpolation assigns a
density value to each point in the considered space around the
original set of points, e.g. Figure 2b–e. While such results
provide complete information about the density distribution
along space, it is often necessary to capture the overall properties of such distributions in terms of a reduced number of
global measures, such as the area and the SD, which are
considered in the present study.
2.3
Lacunarity
Lacunarity was introduced by Mandelbrot (1983) as a means
of complementing the degenerated characterization of the geometry of the analyzed objects implied by traditional fractal
dimensions. Given the set of points S composing the object of
interest, its lacunarity is defined by Equation (3), where m(r)
is the number of elements found inside the sphere of radius r
centered at a specific point of a region R around the object,
µm(r) is the average of m(r) considering all points and σm(r) is
the respective variance. Although the region R can be defined
in several ways, such as corresponding to the space where the
object is represented (e.g. the whole image space), here we
consider the more objective assumption that R corresponds to
the dilation of the original set S by
L(r) =
2
σm(r)
µ2m(r)
.
(3)
The lacunarity measure provides the means to quantify deviations from translational invariance by describing the distribution of gaps inside S for multiple spatial scales, represented
by r. Higher lacunarity values are obtained for more heterogeneous arrangements of the gaps inside the object. Figure 3
shows the lacunarity curves for each of the three parts of
Figure 2a and also for that whole image. For small values
of r, the lacunarity generally varies inversely with the point
density. This is a consequence of the fact that higher densities
imply more points fall inside the window defined by r, allowing a more accurate density estimation and therefore lower
variance of the estimate averages taken inside those windows.
For large enough values of r, the lacunarity starts to reflect the
translational invariance of the holes in the image. This effect
is observed for the lacunarity curve for part 2, which, being
more heterogeneous, implies higher lacunarity values than the
curve for part 1 and also for the curve considering the whole
image (reflecting much less translational invariance).
2.4
Object dilations
The dimension of the space where the set of points is embedded (in principle 2D or 3D) is represented by N . Let S be
1655
L.da F.Costa et al.
(a)
(b)
(d)
(c)
(e)
Fig. 2. A gradient of points divided into three regions (a) and respective Parzen interpolations considering σ = 1 (b), 10 (c), 25 (d) and
50 (e).
a set of points in N -dimensional space such as in Figure 2a,
which are represented in terms of their Cartesian coordinates.
Observe that such a set can be used to represent the most general type of objects, including those that are continuous (e.g.
a region or a curve, etc.) and disconnected (e.g. a cloud of
disconnected points). The exact dilation by radius r of the
object represented by S is defined as the set of points resulting from the union of balls of radius r centered at each of the
points in S, up to a maximum radius, rmax . Figure 4a and b
shows the dilations of the image in Figure 2a for r = 3 and
r = 7, respectively. The area of the dilation of the set S by
radius r is expressed as A(r). Observe that the values of r
1656
introduce a spatial scale representation with respect to each
of the properties of the object under analysis. The dilation
by radius r can be alternatively obtained by thresholding by
r the distance transform of the object (i.e. the transformation
that assigns to each surrounding point the minimal distance
from that point to the object. Figure 4c presents the values of
A(r) for several radii. As is clear from this figure, the fact that
different values of A(r) are typically obtained for different
objects suggests that such curves can be used to characterize the geometrical property of the objects in terms of the
spatial scale. For instance, a set of points that are closely
packed tend to yield A(r) values with slower increasing rates
Mathematical characterization of three-dimensional gene expression patterns
Scale
7
Scale
part 1
part 2
part 3
whole
1.8
Fractal Dimension
Lacunarity
5
3
1.6
1.4
1.2
1
part 1
part 2
part 3
0.8
0.6
1
20
50
4
80
Fig. 3. Lacunarities for each of the three pieces and the whole image
of Figure 2a.
Area
(b)
0
5
10 15 20
Dilation radius
(c)
25
30
than those obtained for sparser sets of points, such as in the
above example. This is exactly the concept underlying the
multiscale fractal dimension.
The multiscale fractal dimension
The traditional Minkowski–Bouligand dimension of a set of
points, represented as B, is defined (Schroeder, 1991) as
log A(r)
.
r→0 log(r)
B = N − lim
20
30
Fig. 5. The multiscale fractal dimension curve for the image in
Figure 2a, considering its three pieces.
F (s) = N −
Fig. 4. The dilation of the image in Figure 2a by a ball with radius
equal to 3 (a) and 7 (b) and area function (c).
2.5
6 7 8 910
It has been conjectured that the Minkowski–Bouligand dimension is greater or equal to the Hausdorff dimension (the
dimension usually taken as reference), with equality being
achieved for all strictly self-similar fractals (Schroeder, 1991).
The main problem with the above definition is that it reflects
the properties of the log–log function only near the coordinate
origin, overlooking its behavior along the remaining spatial
scales. A means of circumventing this limitation, namely the
recently reported MFD (2), consists of considering the derivative of the log–log curve along all spatial scales. By making
s = log r, the MFD can be formally defined as
(a)
8
7
6
5
4
3
2
1
0
5
(4)
d
log[A(s)].
ds
(5)
Note that F (s) corresponds no longer to a single scalar value
but to a whole function of the spatial scale s, therefore allowing a richer characterization of the geometrical features of the
analyzed set of points along the spatial scale, accounting for
the varying fractality of natural objects. One of the most interesting features of the MFD is its interpretation as quantifying
the degree to which the object (i.e. set of points) self-limits
its own dilations. For instance, a more closely packed set of
points allows less interstitial space for the dilations, implying a smaller inclination (i.e. derivative) of the function A(s)
and, consequently, a higher F (s). Observe that the MFD
also quantifies the intensity in which the points composing
the object interact with one another and with the surrounding
space. Such an interpretation is particularly relevant from the
biological perspective as it is closely related to several biological processes, such as diffusion, migration, dispersion and
auto-catalysis. For instance, a high fractal dimension value
indicates that the gene expression dynamics exhibited by the
cells is being readily spread to the surrounding cells.
The curves in Figure 5 provide a clear indication of the
spatial interactions between the components of the image in
Figure 2a. Although the graph for part 1 grows slower than that
for part 2, which grows slower than that for part 3, all curves
ultimately saturate at dimension 2, representing full spatial
1657
L.da F.Costa et al.
coverage of the planar image space. The varying growth rates
of these curves indicate that the objects in part 1 of Figure 2a
tend to flood the void spaces along a larger interval of spatial
scales. This is an immediate consequence of the fact that more
separated point distributions tend to imply lower constraints
to the point dilations. In this respect, we can say that the distribution in part 1 is less ‘complex’, in the sense of spatial
coverage, than the distribution in parts 2 and 3. It is clear
from this figure that, compared with the density and traditional fractal dimension (see graphs in respective sections),
the MFD provides a substantially richer characterization of
the spatial interactions between the components of the image.
While the whole set of MFD values can be used to characterize the object, it is often necessary to consider a reduced
number of measurements quantifying especially representative features of the MFD curve, such as those listed in the
following:
• Peak fractality (P): the maximum fractal value along the
MFD function.
Fig. 6. The spatial distribution of the Myf5 signal in the epimere
(left) and hypomere (right) regions of eight consecutive myotomes
along the cranio-caudal axis.
• Characteristic scale (C): the value of the spatial scale for
which P is obtained.
• Total fractality (T): the total area below the MFD func-
tion. When divided by the width of the spatial scale interval, this measure yields the average fractal dimension in
that interval.
• Half-area scale (H): the value of spatial scale where the
area under the MFD reaches its half-value.
• Dispersion (D): a measure of the distribution of the MFD
function, expressed, for instance, in terms of its SD. This
feature identifies the extent of spatial scales along which
the data exhibit more intense fractal behavior.
• Entropy (E): indicates the number of bits necessary to
encode the variation of the MFD along the considered
interval of spatial scales. The extended potential of the
MFD is illustrated in Figure 5, where the MFD curves
are presented for each of the set of points in Figure 2a.
One important issue related to the above proposed methodology of spatial gene expression characterization is regarding
the degree to which variations in the processing of the biological data interfere with the repeatability of the obtained
measurements. Such variations include registration accuracy, voxel size and staining procedure. In order to provide
some indication about such a repeatability, we apply the
considered measurements to the left and right somites independently and compare the agreement between the respective
results. In addition, we performed an experiment where one
somite is isolated and progressively perturbed (by adding new
points at uniformly distributed positions) while its multiscale
fractal dimension is estimated. Such an experiment provides
an indication not only of the effect of the remaining background in the gene expression pattern but also of the sensitivity
1658
of the measurement to small changes in the gene expression
profile.
3
EXPERIMENTAL RESULTS
Expression signals of the early myogenic marker gene Myf5
in the myotomes (portions from which muscle precursor
cells arise in vertebrate embryos) of eight consecutive mouse
somites from the thoracic region (14–21) are shown in
Figure 6. Each myotome consists of two characteristic regions,
a dorsal one (epimere) and a ventral one (hypomere). The
results presented henceforth concerning the expression patterns of Myf5 are given separately for each of the two regions.
In order to provide enhanced accuracy for fractal estimation, the space where the data under analysis is represented
is enlarged (six times in the current case), and the initial
areas are discarded since they are affected strongly by anisotropies inherent to the orthogonal lattice. Low-pass filtering
(Gaussian smoothing) is also applied to the accumulated area
function after differentiation in order to regularize the noise
caused by spatial quantization of the data. Figure 7 shows
the lacunarity values measured for the left and right somites.
Figure 8 presents the multiscale fractal dimension for all left
and right somites, from which the functionals of Figure 9 are
derived.
4
DATA ANALYSIS
A number of indications about the spatial behavior of gene
expression patterns can be extracted from the graphs resulting
from the mathematical treatment of the 3D Myf5 expression
data. Generally, as expected, the lacunarity tends to decrease
with the radius values for all situations in Figure 7. In addition,
the lacunarity also decreases with the somite number, meaning
Mathematical characterization of three-dimensional gene expression patterns
0.2
Lacunarity
Lacunarity
0.2
0.15
0.1
0.15
0.1
0.05
0.05
14
14
15
15
16
16
3
17
Somite number
1.5
20
21
2.5
18
2
19
3
17
2.5
18
Somite number
Windows radius
2
19
1.5
20
1
21
(a) Left somites/Epimeres
1
Window radius
(b) Right somites/Epimeres
0.25
0.25
Lacunarity
Lacunarity
0.2
0.15
0.2
0.15
0.1
0.1
0.05
0.05
14
14
15
15
16
3
17
Somite number
2.5
18
2
19
1.5
20
21
1
Window radius
(c) Left somites/Hypomeres
16
3
17
2.5
18
Somite number
2
19
1.5
20
21
1
Window radius
(d) Right somites/Hypomeres
Fig. 7. The logarithm of the lacunarity function for the epimeres (a, b) and hypomeres (c, d) in terms of the somite number and the window
radius.
that the more cranial somites are less translationally invariant
than those at the caudal region. In other words, the former
somites tend to be less spatially uniform. Similar values were
obtained for left and right somites, substantiating the robustness of the measurement. It is clear from Figure 8 that such
a dimension reaches a peak value near the middle of the considered spatial scales interval. It can also be verified from this
figure that the results obtained for the left and right somites
exhibit overall agreement, substantiating the repeatability of
the adopted methodology. In other words, considering that the
left and right somites are expected to be biologically similar,
the comparable results obtained for both sides can be interpreted as an indication that the computational treatment of
the gene expression data is consistent and reasonably invariant to artifacts. Figure 9 shows the global features derived
from the multiscale fractal dimensions shown in Figure 8 and
the mean density as a function of somite number, also considering both left and right somites. It is clear from the curves
in Figure 9 that the difference between the results obtained
for the epimere and hypomere parts is much larger than that
verified between the corresponding left and right somite parts.
As illustrated in Figure 9a and c, the overall fractality tends
to increase continuously from the cranial towards the caudal
somites (i.e. from somite 14 towards 21). This corresponds
to the fact that somite formation and myotome differentiation
progress in a cranio-caudal sequence during embryogenesis.
Hence, it can be inferred that the fractality of the gene expression pattern reflects a degree of ‘maturation’ of the somites
and myotomes. Second, the characteristic scale, shown in
Figure 9b, indicates that the peak fractal behavior for the
1659
2.6
2.6
Multiscale fractal dimension
Multiscale fractal dimension
L.da F.Costa et al.
2.5
2.4
2.3
2.2
2.1
2.5
2.4
2.3
2.2
2.1
2
14
14
15
15
16
16
4
17
Somite number 18
Somite number
2
19
1
20
21
3
18
2
19
1
20
Logarithm of the dilation radius
21
0
(a) Left somites/Epimeres
Logarithm of the dilation radius
0
(b) Right somites/Epimeres
Multiscale fractal dimension
2.5
Multiscale fractal dimension
4
17
3
2.4
2.3
2.2
2.1
2
2.5
2.4
2.3
2.2
2.1
2
1.9
1.9
14
14
15
15
16
4
17
Somite number
3
18
2
19
1
20
21
0
Logarithm of the dilation radius
(c) Left somites/Hypomeres
16
4
17
Somite Number
3
18
2
19
1
20
21
0
Logarithm of the dilation radius
(d) Right somites/Hypomeres
Fig. 8. The multiscale fractal dimension function for the epimeres (a, b) and hypomeres (c, d) in terms of the somite number and the logarithm
of the dilation radius.
hypomere occurs always at larger spatial scales, indicating
that the gene expression in such structures occurs at a larger
average spacing between points, which is corroborated by the
respective density values shown in Figure 9e. In this regard,
it can be said that the epimere structures are systematically
denser than the hypomere counterparts. Since the hypomeres
represent regions from which the myoblasts migrate out to
form the ventral musculature, this reflects the looser packing of the incipient migratory cells. In this case fractality can
be interpreted as a measure of the degree of differentiation of
the myotome cells toward a migratory, myoblastic population.
Both aspects, the quantitative maturation of gene expression
behavior and the degree of cell differentiation state, reflect
important developmental parameters.
Figure 10 shows the results obtained in the sensitivity analysis of the multiscale fractal dimension, as motivated in the
1660
last paragraph of Section 2. In this experiment, we generated 25 artificially perturbed images by adding spatially
uniform noise, in steps of 1.5%, to the original voxel image
containing the gene expression data for one of the somites
(number 22). From Figure 10a we observe that the multiscale
fractal dimension is sensitive to small perturbations, as shown
by the visually distinct curves obtained after each addition of
1.5% of noise. Figure 10b shows the root means square (rms)
value for all curves, indicating monotonic accumulation of
error for the sequence of noisy images, tending to saturate
for the noisier cases (right-hand side of the graph). It is clear
from such an analysis that the multiscale fractal dimension is
sensitive to small perturbations, while retaining its shape even
for substantial levels (i.e. almost 45%) of noise.
The possibility of gene expression characterization through
3D multiscale fractality indicates that, in addition to adding
Mathematical characterization of three-dimensional gene expression patterns
Fig. 9. Peak fractality (a), characteristic scale (b), mean fractality (c), entropy of the multiscale fractal dimension (a) and mean density (e) as
functions of the somite numbers, considering all left and right somites.
1661
L.da F.Costa et al.
for exploring and comparing the correlation of gene activity
with morphogenetic events in related but phenotypically distinct species opens the possibility of a new kind of theoretical
biology that can mathematically address connections between
developmental and evolutionary processes.
ACKNOWLEDGEMENTS
L.d.F.C is grateful to FAPESP (99/12765-2) and CNPq
(468413/00-6 and 301422/92-3) for financial support. M.S.B
is supported by FAPESP (02/02504-01). The left–right symmetry analysis was funded by HFSP grant to L.d.C and
G.B.M.
REFERENCES
Fig. 10. The sensitivity of the multiscale fractal dimension: various
curves for progressively perturbed images (a) and the rms (b) for
each sample.
quantitative data to in situ hybridization studies, a new quality
of data can be obtained from computer generated 3D representations of gene expression in embryonic development.
Basically, measurements such as the lacunarity and multiscale
fractal dimension provide richer characterization of the spatial distribution of gene activation to a degree that cannot be
matched by traditional measurements, such as density and
statistical variance. Interestingly, the two measurements considered provide complementary information about the gene
activation spatial distribution. More specifically, while the
lacunarity quantifies translational invariance, the multiscale
fractal dimension provides an indication of the interaction
between the points, a property that is also related to spatial coverage. As such, the quantitative methodology for gene
expression characterization reported in this article has good
potential for application to a series of situations in genetics and
animal development, from microscopic (individual cell) to
macroscopic (organs) spatial scales. These and other methods
1662
Carroll,S.B., Grenier,J.K., and Weatherbee, S.D. (2001) From DNA
to Diversity. Blackwell Science, Malden.
Costa,L.da F., Campos,A.G. and Manoel,E.T.M. (2001) An integrated approach to shape analysis: Results and perspectives, in
Proceedings of the International Conference on Quality Control
by Artificial Vision, Le Creusot, France, Vol. 1, 23–34.
Duda,R.O., Hart,P.E. and Stork.P.G. (2001) Pattern Classification.
Wiley-Interscience, New York.
Falconer,K.J. (1986) The Geometry of Fractal Sets. Cambridge
University Press, Cambridge, England.
Hall,B.K. (1992) Evolutionary Developmental Biology. Chapman &
Hall, London.
Jernvall,J., Keranen,S.V. and Thesleff,I. (2000) Evolutionary modification of development in mammalian teeth: quantifying gene
expression patterns and topography. Proc. Natl Acad. Sci., USA,
97, 14444–14448.
Mandelbrot,B.B. (1983). The Fractal Geometry of Nature.
W.H. Freeman and Co., New York.
Müller,G.B. and Newman,S.A. (eds) (2003) Origination of Organismal Form. MIT Press, Cambridge, MA.
Peitgen,H.O. and Saupe,D. (1992) Chaos and Fractals: New
Frontiers of Science. Springer-Verlag, New York.
Raff,R.A. and Kaufman,T.C. (1983) Embryos, Genes, and Evolution.
Macmillan Publishing Co., New York.
Schroeder,M. (1991) Fractals, Chaos, Power Laws: Minutes from
an Infinite Paradise. W. H. Freeman, New York.
Sharpe,J., Ahlgren,U., Perry,P., Hill,B., Ross,A., HecksherSorensen,J., Baldock,R. and Davidson,D. (2002) Optical projection tomography as a tool for 3D microscopy and gene expression
studies. Science, 296, 541–545.
Streicher,J., Donat,M.A., Strauss,B., Sporle,R., Schughat,K. and
Muller,G.B. (2000) Computer-based three-dimensional visualization of developmental gene expression. Nat. Genet., 25,
147–152.
Streicher,J., Weninger,W.J. and Muller,G.B. (1997) External markerbased automatic congruencing: a new method of 3D reconstruction from serial sections. Anat. Rec., 248, 583–602.
Weninger,W.J. and Mohun,T. (2002) Phenotyping transgenic
embryos: a rapid 3D screening method based on episcopic
fluorescence image capturing. Nat. Genet., 30, 59–65.
Wilkins,A.S. (2001) The Evolution of Developmental Pathways.
Sinauer, Sunderland, MA.