TOWARDS INTELLIGENT CHROMOSOME ANALYSIS
Aleksandar Jovanović, Miroslav Marić, Momčilo Borovčanin, Aleksandar Perović
GIS - Group for Intelligent Systems, School of Mathematics, University of Belgrade,
Studentski trg 16, 11000 Belgrade, Serbia and Montenegro
www.gisss.com, www.matf.bg.ac.yu, contact: [email protected]
Abstract: Investigation of chromosomes is based on systems of induced banding: procedures which generate characteristic patterns of light and dark areas (bands) along a chromosome. These patterns are used to identify individual chromosomes, to diagnose certain diseases expressed as characteristic pattern changes, to address gene locations precisely (gene mapping), and to determine the origin of extra chromosomal material. Less than ten years ago, even in the best scientific papers, chromosomes were processed manually, with scissors applied to photographic prints. With the introduction of CCD microscopy we developed methods for precise representation, photo morphology, normalization and rigorous measurement, implemented them in software, and supplied our genetics, oncology and hematology departments with comfortable and efficient tools which replaced glue and scissors.
Introduction
Long ago researchers in genetics noticed that in a regular mitosis all chromosomes appear in pairs, except the sex-determining X and Y, which appear as {X, X} or {X, Y} pairs. This means that in normal cells each chromosome pattern, except those of X and Y, has an identical or very similar matching twin. According to the distribution of bands and their length, chromosomes are designated as the first through the twenty-second and X or Y. This process is regularly accomplished by visual observation and results in chromosome sequencing, the cariotype, which includes identification of irregular chromosomes or chromosomal segments in cell divisions characterizing the appearance of a variety of syndromes, especially in oncology. Early and precise determination of chromosomal changes and of divergence from the standard would assist in early syndrome classification and diagnosis. Software tools assisting this work, ultimately aiming at completely intelligent systems, are being developed by a number of teams. The necessary elements here are a precise chromosome description (representation), a measurement tool set including comparison and similarity measurements, object extraction and normalization, gene addressing, and tools for genetic back tracing of wrong chromosomes.
Fig. 1 Mitosis
Method
The regular banding of chromosomes is clearly visible on the chromosomes of the mitosis presented in Fig. 1. Photometrically the banding pattern is represented as a sequence of parallel mountains whose peaks correspond to the places with low (or high) light absorption. Each longitudinal (i.e. meridian) section of the whole photometric chromosome surface is a function of one argument, the photometric polynomial $M(x)$, whose algebraic-combinatorial invariants should be common to similar chromosomes. For meridian sections of two chromosomes of the same type, with photometric polynomials $M_1(x)$ and $M_2(x)$ having the same arrangement and proportions of the positions of their local extremes, we should have
$$(\exists \varepsilon > 0)(\forall x)\; |M_1(x) - a M_2(x) + c| < \varepsilon, \tag{1}$$
for some $a$ and $c$ and small enough $\varepsilon$.
Let
$$\mathcal{M} = \{ M_i(x) \mid i \in I \}$$
be a set of chromosomal photometric representations. In $\mathcal{M}$ define an approximation of the natural equivalence $\rho$ by $\rho_\varepsilon(M_i, M_j)$ iff (1) holds for $M_i$ and $M_j$. Different values of $\varepsilon$ result in different granulations of the relation $\rho$.
The set $\mathcal{M}$ could include etalons (obtained statistically). Let, for example,
$$E = \{ M_{e_1}, \ldots, M_{e_n} \}$$
be an etalon and let $\varepsilon$ be such that for the members of $E$
$$\rho_\varepsilon(M_{e_i}, M_{e_j}) \text{ iff } i = j.$$
This enables the introduction of a normalized similarity relation
$$\rho^{*}_\varepsilon(M_{e_i}, M_{e_j}) = \neg\delta^{i}_{j} = \begin{cases} 0, & \text{if } i = j \\ 1, & \text{otherwise.} \end{cases} \tag{2}$$
Then the equivalence classes are centered on the members of $E$, giving a set of metric balls
$$K^{\varepsilon}_{e_i} = \{ M_j \mid \rho_\varepsilon(M_{e_i}, M_j) \}, \qquad K_\varepsilon = \{ K^{\varepsilon}_{e_i} \mid e_i \in E \}.$$
Let
$$R_\varepsilon = \mathcal{M} \setminus \bigcup K_\varepsilon, \tag{3}$$
then, for a given $\varepsilon$, $R_\varepsilon$ contains the photometric polynomials that fall outside all classes of the etalon $E$ ("mutants", i.e. significantly changed chromosomes).
The above reasoning could be refined if more subtle aspects need to be included. In practice, for polynomials $M_1$ and $M_2$ representing two chromosomes of the same type, even when they come from the same cell division, the relative intensities of local maxima are hardly maintained, due to a variety of factors acting in chromosome formation. Fortunately, the positions of local extremes, more precisely their relative distances, are well preserved, which is a good basis for the definition of chromosome invariants. With a reasonable allocation of dark band edges to the saddle points nearest to the local maxima of absorption, the photometric polynomials are replaced by their simplifications $C_i$, characteristic functions of bands, which serve as less sensitive chromosome representations and lead to a comparison (similarity) function for chromosomes. Thus we can calculate the similarity of chromosomes A and B, with the corresponding characteristic-like functions $C_A$, $C_B$, as
$$d_{A,B}(C_A, C_B) = \min_{a \in A,\, b \in B} \int_{x \in D} \left| C_A(bx) - C_B\!\left(\frac{x}{b} + a\right) \right| \frac{dx}{m(D)}, \tag{4}$$
where $a$ is a translation, $b$ a contraction factor and $m(D)$ the length of $D$, the domain of the longer of the chromosomal representations $C_A$ and $C_B$.
Fig. 2 Comparison of chromosomes - best matching of characteristic functions of meridian (longitudinal) sections
Fig. 3 3D - photomorphology view
In the right half of Fig. 2 we have a pair of type 1 chromosomes, extracted with their photometric polynomial representations $M_i$ and the corresponding simplifications $C_i$, represented as band patterns. Three variants of the similarity measurement of the two $C_i$'s, in percent, are shown at lower left, exhibiting a very good match of these two chromosomes (≤ 6%, i.e. very high similarity). Fig. 3 shows the complete photo morphology of the chromosome 1 pair. These representations are suitable when more detailed insight into the chromosomal structure is needed.
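The similarity measure (4) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes band profiles sampled at unit spacing, replaces the integral divided by m(D) with a mean over sample points, and searches a finite, caller-supplied list of translations and contraction factors. The names `sample`, `band_distance` and `assign_to_etalon` are hypothetical. As a further assumption, `assign_to_etalon` uses the distance (4) in place of relation (1) to decide whether a profile falls into the ball of some etalon or into the residual set R_ε.

```python
import math

def sample(profile, t):
    """Linearly interpolate a sampled profile (length >= 2) at real position t.

    Positions outside the profile's support evaluate to 0.
    """
    if t < 0 or t > len(profile) - 1:
        return 0.0
    i = min(int(math.floor(t)), len(profile) - 2)
    frac = t - i
    return (1.0 - frac) * profile[i] + frac * profile[i + 1]

def band_distance(ca, cb, shifts, scales):
    """Discrete analogue of (4): minimum over a in shifts, b in scales of
    the mean of |C_A(b x) - C_B(x / b + a)| over the longer profile's domain."""
    n = max(len(ca), len(cb))            # D = [0, n - 1]
    best = math.inf
    for b in scales:                     # contraction factors (must be non-zero)
        for a in shifts:                 # translations
            total = sum(abs(sample(ca, b * x) - sample(cb, x / b + a))
                        for x in range(n))
            best = min(best, total / n)  # mean approximates integral / m(D)
    return best

def assign_to_etalon(c, etalons, eps, shifts, scales):
    """Index of the first etalon closer than eps to c, or None if c lies in
    the residual set (assumption: distance (4) stands in for relation (1))."""
    for i, e in enumerate(etalons):
        if band_distance(c, e, shifts, scales) < eps:
            return i
    return None

# Hypothetical 0/1 band pattern and a copy of it translated by three samples.
pattern = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
shifted = [0, 0, 0] + pattern
percent = 100.0 * band_distance(pattern, shifted, range(-5, 6), [1.0])
```

For 0/1 characteristic functions the result lies between 0 and 1, so multiplying by 100 gives a mismatch in percent, comparable in spirit to the values quoted for Fig. 2. A finer grid of `shifts` and `scales` trades running time for a tighter minimum.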
Fig. 6 Automatized chromosome normalization
Fig. 4 3D-photomorphology with two meridian
and one latitude section
We developed a set of measurement tools on the 3D photo morphology chromosome representations, some of which are shown in Fig. 4. Before comparison, chromosomes usually need some "rectification", which is also necessary for precise gene location. We first implemented manual chromosome extraction with manual rectification. In order to reduce human interaction and get closer to the goal of automatization, we then implemented automatic chromosome normalization. Both comparison and localization of genes on chromosomes demand the introduction of a cariotype coordinate system (chromosomal coordinate system), which facilitates identification of a chromosome and of a locus bearing a gene signal or other specific features.
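One simple way to picture a chromosomal coordinate is as a relative position along the length of an already normalized (straight) chromosome. The sketch below makes that assumption explicit. The function name `locus_address` and the image layout (rows are latitudes, columns are positions along the chromosome) are hypothetical and not taken from the described software.

```python
def locus_address(signal_image):
    """Locate the strongest signal along a straightened chromosome.

    signal_image: list of rows; each row is one latitude, each column one
    position along the chromosome length (assumed layout).
    Returns (peak_column, relative_address), where the relative address
    runs from 0.0 at one end of the chromosome to 1.0 at the other.
    """
    length = len(signal_image[0])
    if length < 2:
        raise ValueError("need at least two positions along the chromosome")
    # Collapse each latitude section into one value per longitudinal position.
    profile = [sum(row[x] for row in signal_image) for x in range(length)]
    peak = max(range(length), key=profile.__getitem__)
    return peak, peak / (length - 1)

# Hypothetical example: a 5 x 1001 pixel signal channel with one bright spot.
image = [[0] * 1001 for _ in range(5)]
image[2][250] = 10
column, address = locus_address(image)
```

Expressing the locus relative to the chromosome length makes addresses comparable between chromosomes imaged at different scales, which is the point of a shared coordinate system.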
Fig. 5 Automatized cariotyping - construction of
a chromosome central meridian
Fig. 7 Automatized cariotyping - meridian of
chromosome 1
This is done following the biological cariotyping standard, which differentiates chromosomes in their development phases and uses standardized banding techniques that introduce chromosome-specific banding patterns. In order to reach any level of automatization, we need to "straighten" these objects in the most reasonable way, that is, to reconstruct them "straight" (normalized), so that their bands are distributed as they would be if the chromosome had been straight in the first place. In fact, the whole chromosome image, considered as the corresponding absorption function, a two-argument polynomial $F(x, y)$, rather than its single longitudinal section $M(x)$, exhibits a characteristic positioning of local extremes, which constitutes the chromosome invariant best suited for its identification and classification in any kind of detail or change investigation.
The coordinate transformation is implemented in the following way. The central meridian of the original $F(x, y)$ is deduced from the primary latitudes, the normals of already determined tangents on the contour (Fig. 5). The end points of an individual segment of the central meridian are the midpoints of adjacent primary latitude lines (beam sections). The whole central meridian is then formed from such elementary segments. Next, the network of latitudes normal to their central meridian segments is refined, segment by segment. Finally, by rotating each segment of the central meridian so that they all become collinear, while maintaining the normality of the corresponding latitudes, we obtain the "rectified" (normalized) coordinate system. Mapping the original pixels to their target coordinates then yields the transform of the original chromosome, which is normalized, i.e. straight. An alternative construction of the central meridian is based on contour thinning: by successively inscribing contours we finally reach a well-determined large portion of the central meridian.
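The rectification step can be sketched in a few lines of Python under simplifying assumptions. The central meridian is taken as given, as a polyline of (row, column) vertices with no repeated points. Instead of the forward mapping of original pixels described above, the sketch uses the inverse direction: for each target pixel it looks up the nearest source pixel along the latitude (the normal to the local meridian segment). The function name `straighten` and its parameters are hypothetical.

```python
import math

def straighten(image, meridian, half_width):
    """Resample `image` so that the polyline `meridian` becomes a straight line.

    image: list of rows of pixel values.
    meridian: list of (row, col) vertices of the central meridian (assumed given).
    Returns 2 * half_width + 1 rows; the middle row follows the meridian and
    each column is one latitude section.
    """
    # One centre point and unit tangent per pixel of arc length.
    samples = []
    carry = 0.0                      # arc length carried over from the previous segment
    for (r0, c0), (r1, c1) in zip(meridian, meridian[1:]):
        length = math.hypot(r1 - r0, c1 - c0)
        tr, tc = (r1 - r0) / length, (c1 - c0) / length
        s = carry
        while s <= length + 1e-9:
            samples.append((r0 + s * tr, c0 + s * tc, tr, tc))
            s += 1.0
        carry = s - length

    rows, cols = len(image), len(image[0])
    out = []
    for off in range(-half_width, half_width + 1):
        line = []
        for r, c, tr, tc in samples:
            # Latitude direction: the tangent rotated by 90 degrees.
            rr = int(round(r + off * tc))
            cc = int(round(c - off * tr))
            inside = 0 <= rr < rows and 0 <= cc < cols
            line.append(image[rr][cc] if inside else 0)   # nearest-neighbour lookup
        out.append(line)
    return out

# Hypothetical example: an L-shaped bright line becomes a straight middle row.
bent = [[0] * 20 for _ in range(20)]
for c in range(2, 11):
    bent[5][c] = 1
for r in range(5, 15):
    bent[r][10] = 1
straight = straighten(bent, [(5, 2), (5, 10), (14, 10)], half_width=3)
```

Nearest-neighbour lookup keeps the sketch short. At sharp bends neighbouring latitudes overlap on the inner side and spread apart on the outer side, so some source pixels are sampled twice and others are skipped, which is one way fine detail can be lost in strongly curved parts.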
After the automatized normalization, the chromosomes are sorted by length. The normalization of the object from Fig. 5 is shown in Fig. 6. Fig. 7 and Fig. 8 show the same steps on another chromosome. The result of the complete automatized chromosome extraction and normalization is presented in Fig. 9. We have also introduced one step backwards, namely controlled normalization, which allows the operator to redefine the central meridian and to perform the "rectification" step by step. This provides insight into the highly convex chromosome parts, which, when compacted, might lose some fine micro detail (shown as edge holes in Fig. 10), leading to destruction of essential topology.
Fig. 8 Normalized chromosome 1
Fig. 9 Complete automatic cariotyping
Fig. 10 Semiautomatic mode: chromosomal step by step normalization
Fig. 11 Trisomy of chromosomes: in the extracted pair in the right corner, the chromosome on the right has an extra band, one band more than the corresponding chromosome immediately to its left
Further automatization of this process, by comparing the cariotype obtained in this way with chromosome image databases containing details on identified syndromes, would be the next huge step. Before that, all identified chromosome-expressed pathologies need to be cataloged. We now briefly illustrate the application of the surveyed functions on real material, problems brought by our customers. A serious genetic syndrome, trisomy, here the appearance of one extra band in one of the paired chromosomes, is shown in Fig. 11.
The presence of an irregular, normally nonexistent chromosome, called a marker chromosome, is shown in Fig. 12: it is the third in the first column of big chromosomes, from the left.
Assistance in genetic back tracing of the material in this marker chromosome (Fig. 17) is illustrated in Fig. 13, Fig. 14, Fig. 15, Fig. 16 and Fig. 17, together with the visual equation in Fig. 18 confirming the congruence of the irregular marker (top) with the concatenation of chromosome Y and the longer arm of chromosome 1, which led to the identification of a rare hematology syndrome (8th recorded case).
Fig. 12 Marker chromosome (from the left, the first bigger chromosome below the horizontally positioned one); regularly it does not exist
Fig. 13 Detailed photometric similarity comparison of the marker chromosome with other big chromosomes indicates that it is redundant
Fig. 14 Suspected match
Fig. 15 Chromosome 1, long arm - strong similarity
Fig. 16 Chromosome 1, long arm - different angle
Fig. 17 Marker - photomorphology
Fig. 18 Visual equation: congruence of marker with the fusion of Y and longer arm of chromosome 1
Fig. 19 Mitosis: with a couple of "wrong" chromosomes
Fig. 20 Top normal pair, lower material from the small moved to the tail of the bigger chromosome
In Fig. 19 there is evidence of a move of genetic material from the small chromosome (top row, to the right) to the chromosome below it, as demonstrated in detail in Fig. 20. The top row has a regular pair of these two chromosomes, while the lower row shows, with photomorphologic details, the results of the translocation process.
The last example illustrates how the implementation of the described methods assists in gene localization. In Fig. 21 the bright dots on the chromosomes correspond to a gene made visible by the fluorescent in situ hybridization (FISH) method. Our photometric representation of chromosomes, together with the image measurement tools, provides a highly precise location of the maximum of the gene signal, the spot closer to the chromosome center.
Fig. 21 Gene signals (fluorescent in situ hybridization - FISH)
Fig. 22 Gene-signal addressing in chromosome address space
The position of the signal maximum can be determined to within 2-3 pixels or better after chromosomal normalization. Thus, when working with mega-pixel chips, reaching thousands of pixels per chromosome length, we are approaching sub-per-mille precision of gene-signal addressing (with respect to the chromosome length), which will become more important with the introduction of multiple and finer gene hybridization techniques. Our implementations are still experimental and their functionality is growing. The software is free for download from our web site.