Cell identification in Differential Interference Contrast microscope images using template matching
D. Young (1), C.A. Glasbey (2), A.J. Gray (1) and N.J. Martin (3)
(1) Department of Statistics and Modelling Science, University of Strathclyde,
26 Richmond Street, Glasgow G1 1XH, Scotland, UK
email: [email protected]
(2) Scottish Agricultural Statistics Service, JCMB, King’s Buildings,
Edinburgh EH9 3JZ, Scotland, UK
email: [email protected]
(3) Department of Biochemical Sciences, Scottish Agricultural College,
Auchencruive, Ayr KA6 5HW, Scotland, UK
Abstract
Some issues are considered which arise when template matching is used to
identify cells in differential interference contrast (DIC) microscope images.
An automatic method of counting and sizing the cells in such images is
proposed and two examples given. Template matching is complex due
to the nature of DIC images, namely the way in which light is captured
across the image. Further complications arise due to different cell sizes
and orientations. The method used compensates for these problems.
1 Motivation
The work described here results from a need in research into high rate algal ponds,
an environmentally important development in applied microbiology. These are simple, energy-efficient, low-technology waste treatment systems (see e.g. [3]). Achieving
optimum efficiency of such systems relies on knowledge of the biomass of algae and
bacteria in the mixed microbial populations of the pond. This is determined by viewing pond samples under a microscope, counting the cells, measuring their sizes
and using standard formulae to estimate biomass from these measurements.
The current methods for identifying, counting and measuring cells are at best
semi-automatic and slow. No fully automatic method (e.g. edge-detection algorithms) has so far proved successful. This is due to the complex nature of the
microscope images, e.g. the presence of different types and shapes of cells, blurring from out-of-focus cells and the presence of detritus. Also, algal cells are
typically clustered and/or overlapping, so a method is needed for accurately separating, identifying and counting individual cells in a sample while ignoring noise.
Overcoming these problems using image analysis will be the first step in developing
automatic methods of estimating biomass directly from the microscope image.
Ideally any method developed would be applicable to all microscope modalities,
e.g. brightfield, differential interference contrast (DIC) (see e.g. [2], [8]), phase
contrast and epifluorescence microscopy (see e.g. [7]). Template matching is
investigated here as a means of achieving this.
2 Introduction
Template matching methods (see e.g. [4]) find matches of a sub-image with grey
levels w(x, y) within an image of grey levels f(x, y), based on a ‘goodness of match’
statistic evaluated at each pixel in the image.
Here the statistic used is the covariance, taking the value g(i, j) at pixel location
(i, j) within the image, where
\[
g(i, j) = \sum_{k=-m}^{m} \sum_{l=-m}^{m} \bigl( f(i + k, j + l) - \bar{f}(i, j) \bigr) \bigl( w(k, l) - \bar{w} \bigr),
\]
using a (2m + 1) × (2m + 1) template (i.e. with ‘window size’ 2m + 1), centred
at (i, j), and where w̄ is the template mean grey level and f̄(i, j) is a local image
mean evaluated over the area being matched. High covariance values indicate a good
match.
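As an illustration of the statistic defined above, the following minimal Python sketch evaluates g(i, j) directly from its definition. It assumes NumPy and a square template of odd width; the function name covariance_map and the choice of assigning minus infinity to border pixels where the template does not fit are illustrative assumptions, not details of the original implementation.

```python
import numpy as np

def covariance_map(f, w):
    """Covariance g(i, j) between template w and the image window centred at (i, j).

    Assumption: w is square with odd width 2m + 1. Border pixels, where the
    window would leave the image, are set to -inf so they are never chosen.
    """
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    m = w.shape[0] // 2
    w_centred = w - w.mean()                      # w(k, l) - w-bar
    g = np.full(f.shape, -np.inf)
    for i in range(m, f.shape[0] - m):
        for j in range(m, f.shape[1] - m):
            patch = f[i - m:i + m + 1, j - m:j + m + 1]
            # local image mean f-bar(i, j) is taken over the matched area
            g[i, j] = np.sum((patch - patch.mean()) * w_centred)
    return g
```

The double loop costs of the order of (2m + 1)^2 operations per pixel, which is consistent with the method being computationally expensive for large templates.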
3 Summary of Method
The stages below indicate the procedure developed for counting and sizing the cells
in a DIC image (of most interest here):
(a) semi-automatic part
1. Construct templates corresponding roughly in size to the largest, middle-sized
and smallest cells in the image. (Different templates are required for different
types of cell.) Choose a shape which corresponds approximately to the shape
of the cells, e.g. circular for algae, ellipsoidal for yeast. Also record the size of the cells
within these templates.
2. Apply differencing to the templates in the approximate direction from which the light
hits the cells. If necessary, modify these templates by weighted differencing or thresholding to match the cells in the image more closely (see below).
Add Gaussian blur to the templates to correspond more closely to the image.
(b) automatic part
– cell identification
1. Apply the largest template. If rotation is required (for non-circular cells), take
the maximum value across rotations of the goodness-of-fit statistic at each
pixel location. Choose cell centres as points with g(i, j) values greater than
those of their 8 neighbours and greater than or equal to 80% of the maximum
g(i, j) value. (Note that for some matching statistics a good fit corresponds to
a minimum, so values less than or equal to 20% of the maximum value could
be chosen.) If there is a difference of 200 or more between consecutive ordered
statistic values, increase the cut-off value to cut above the lower value.
2. Repeat the previous step for the middle-sized and smallest templates in turn.
The algorithm automatically removes cells as they are identified.
– cell sizing
1. Site a fuller range of templates at each cell centre, choosing the best fitting
template to estimate the cell size. These templates are created automatically
by the algorithm. For images in which the templates require rotating, a range
of orientations is considered, to improve the accuracy of sizing.
2. Output the number, size and location of the identified cells.
The application and development of these ideas are considered using two examples of
DIC images: Candida yeast cells (Section 4) and algal cells (Section 5).
4 Yeast Cells
Fig. 1 shows a 256 × 256 grey-level image of transparent Candida yeast cells, a
relatively simple example chosen for algorithm development.
4.1 Template Construction
The yeast cell templates were constructed using knowledge of the theory of DIC
microscopy [2]. The cells are assumed to be ellipsoidal, purely from visual, not
biological, information. A template was constructed by assigning grey levels to
pixels within an ellipse according to their distance from the edge, the centre pixel
receiving an intensity of 1 and points at the edge and beyond an intensity of 0. A 63 × 63
template with semi-major axis length 25 and semi-minor axis length 15 was constructed using
(1).
\[
f(i, j) = 1 - \left\{ \left( \frac{i}{\sigma_x} \right)^2 + \left( \frac{j}{\sigma_y} \right)^2 \right\} \qquad (1)
\]
where σx = 25, σy = 15 and if f (i, j) < 0 then f (i, j) is set to 0. The grey levels
were then scaled to the full range (255, 0). The result is shown in Fig. 2.
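The construction in (1) can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy, that i and j are measured from the template centre, and that i runs along rows; the function name ellipse_template and the axis convention are assumptions made here for illustration.

```python
import numpy as np

def ellipse_template(size=63, sigma_x=25.0, sigma_y=15.0):
    """Grey-level ellipse following equation (1), scaled to the range 0..255.

    Assumption: (i, j) are offsets from the central pixel, with i along rows.
    """
    m = size // 2
    i, j = np.mgrid[-m:m + 1, -m:m + 1]
    t = 1.0 - ((i / sigma_x) ** 2 + (j / sigma_y) ** 2)
    t = np.clip(t, 0.0, None)        # negative values are set to 0
    return 255.0 * t                 # centre pixel maps to 255, edge to 0

template = ellipse_template()
```

With these defaults the template is zero at offsets of 25 pixels along one axis and 15 pixels along the other, and reaches 255 at the centre.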
To mimic the light in the DIC image, which is a first derivative image of optical
specimen density, first-order differencing was applied to Fig. 2 at the angle at which
the light hits the specimen. Here it appears to come from the top left, so differencing
was done at 45° in this direction, taking
w′(i, j) = w(i, j) − w(i − 1, j − 1)
as the differenced template value at location (i, j). (To adequately represent the
cells in an image, it may be necessary to apply a weighting factor to the differencing;
see Section 5.)
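The diagonal differencing step can be sketched as follows. This is a minimal Python illustration assuming NumPy; the name diagonal_difference, the weight parameter (1 for plain differencing, other values for the weighted form mentioned in Section 5) and the treatment of the first row and column, where the neighbour w(i − 1, j − 1) lies outside the template and is taken as 0, are assumptions.

```python
import numpy as np

def diagonal_difference(w, weight=1.0):
    """Return w'(i, j) = w(i, j) - weight * w(i - 1, j - 1).

    Assumption: values outside the template are treated as 0, which matches
    a template that is already 0 at and beyond the edge of the ellipse.
    """
    w = np.asarray(w, dtype=float)
    shifted = np.zeros_like(w)
    shifted[1:, 1:] = w[:-1, :-1]    # shifted[i, j] holds w(i - 1, j - 1)
    return w - weight * shifted
```

For a smooth template the differenced values are small compared with the original grey-level range, so some rescaling is likely to be needed before any grey-level thresholds are applied.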
The light has quite a striking effect on the cells in the image (possibly because
the cells are higher in the centre). This was reproduced in the template by thresholding grey-level values of the differenced template above 200 to white and below 3
to black. Gaussian blur (with variance 16/3) was also applied, chosen to correspond
visually with the blur in the image. The final template is shown in Fig. 3.
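The thresholding and blurring of the differenced template might be sketched as below. This is a hedged illustration assuming NumPy only. The text does not say how the differenced template is scaled before the thresholds of 200 and 3 are applied, so the rescaling to 0..255 in rescale_to_grey is an assumption, as are the function names, the kernel truncation at three standard deviations and the edge padding.

```python
import numpy as np

def rescale_to_grey(t):
    """Linearly rescale to 0..255 (assumed step; not stated in the text)."""
    t = np.asarray(t, dtype=float)
    lo, hi = t.min(), t.max()
    if hi == lo:
        return np.zeros_like(t)
    return 255.0 * (t - lo) / (hi - lo)

def threshold_extremes(t, high=200.0, low=3.0):
    """Set values above `high` to white (255) and below `low` to black (0)."""
    t = np.array(t, dtype=float)
    out = t.copy()
    out[t > high] = 255.0
    out[t < low] = 0.0
    return out

def gaussian_blur(t, variance=16.0 / 3.0):
    """Separable Gaussian blur with the given variance, edge-padded."""
    t = np.asarray(t, dtype=float)
    radius = int(np.ceil(3.0 * np.sqrt(variance)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * variance))
    kernel /= kernel.sum()                         # weights sum to 1
    padded = np.pad(t, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
```

A final template would then be obtained by composing the steps, for example gaussian_blur(threshold_extremes(rescale_to_grey(d))) for a differenced template d.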
For cell identification three square templates of size 63, 49 and 39 pixels wide
were used. They were rotated in steps of 10° from 0° to 170° (to match roughly the
orientations of cells in the image), then differenced and thresholded. The maximum
covariance value at each pixel, for each template, was recorded and this information
used to extract the cell centres. The result of choosing points with g(i, j) values
greater than those of their 8 neighbours, ordering these values and then cutting
off at the level of the maximum difference is shown in Fig. 4, where a black dot
represents a ‘centre’. The actual range of covariance values in this image is from
−723.034 to 2221.59.
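The rotation step and the per-pixel maximum can be illustrated with the sketch below, assuming NumPy. Instead of rotating a template image, it generates each rotated ellipse analytically by rotating the coordinates in (1); this is a substitute chosen for simplicity, and the sense of rotation and the function names are assumptions. Differencing and thresholding would then be applied to each rotated ellipse as described in the text.

```python
import numpy as np

def rotated_ellipse(size, sigma_x, sigma_y, angle_deg):
    """Ellipse of equation (1) with its axes rotated by angle_deg (assumed sense)."""
    m = size // 2
    i, j = np.mgrid[-m:m + 1, -m:m + 1]
    theta = np.deg2rad(angle_deg)
    u = i * np.cos(theta) + j * np.sin(theta)      # coordinate along first axis
    v = -i * np.sin(theta) + j * np.cos(theta)     # coordinate along second axis
    t = 1.0 - ((u / sigma_x) ** 2 + (v / sigma_y) ** 2)
    return 255.0 * np.clip(t, 0.0, None)

def max_over_rotations(maps, angles):
    """Per-pixel maximum over covariance maps, and the angle attaining it."""
    stack = np.stack(maps)
    best = stack.max(axis=0)
    best_angle = np.asarray(angles)[stack.argmax(axis=0)]
    return best, best_angle

angles = list(range(0, 180, 10))                   # 0, 10, ..., 170 degrees
templates = [rotated_ellipse(63, 25.0, 15.0, a) for a in angles]
```

Recording the angle that attains the maximum is useful later, since the sizing step considers orientations close to the recorded one.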
4.2 Cell Identification
The method identifies 20 cells in the image. The centres of the smaller cells appear
to be fairly accurate; however, the larger cells have multiple centres and a point has
also been picked between two touching cells (see Fig. 4).
A good fit corresponds to a high covariance, so the 11 points with
the highest values should correspond to the 11 cell centres. Fig. 5 shows these 11
best fitting points. These do not all occur at the centres of cells. While this suggests
that the templates may not be very good representations of cells, they clearly
match a cell more closely than any other area within the image.
The reason for larger cells giving multiple centres is that smaller templates
have matched up better within a cell or even across a cell. Similarly, two touching
cells may lead to a match which results in a ‘centre’ at or near their border (see Fig.
4). This reasoning was checked by applying the smallest (39 × 39) template to the
image. In theory this should give maximum fits at the smallest cells. The range of
g(i, j) values for this template is from −1347.76 to 2135.98. The points given by
the highest three g(i, j) values are shown as black points on the cells in Fig. 1. The
template correctly identifies the cell with one centre, at an angle of 90°. The cell
with the two centres identified had two matches across the cell at angle 150°, while
the actual cell is rotated at approximately 70°.
Any fully automatic method must allow for this. If smaller templates are
not allowed to lie inside bigger cells the problem should not occur. The method
proposed applies the templates individually (starting with the largest template),
choosing centres after each template has been fitted then eliminating these larger
identified cells from the image before running the next largest template.
The criterion used for choosing centres was to record the maximum covariance
values from the largest rotated templates. The points with covariance greater than
that of their 8 neighbours were identified, and then reduced to those with values
greater than or equal to 80% of the maximum. A further constraint was added to
ensure that only the best fits were chosen: if consecutive ordered values among these
top points differed by more than 200, the cut-off was raised so that only the points
above that gap were chosen. In this case the maximum value was 1688.54, giving a
cut-off level of 1350. The maximum difference between the ordered values was 178,
so the cut-off remains at 1350 and 5 cells are identified in the image.
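The selection rule can be sketched in Python as follows, assuming NumPy. In the example of the text, 0.8 × 1688.54 = 1350.83, reported as 1350, and the largest gap of 178 is below 200, so the cut-off is unchanged. The text leaves open which gap is used when several exceed 200; this sketch assumes the first one met when reading down from the maximum, and uses "200 or more" as in Section 3. The function names are illustrative.

```python
import numpy as np

def local_maxima(g):
    """Mask of interior pixels strictly greater than all 8 neighbours."""
    g = np.asarray(g, dtype=float)
    rows, cols = g.shape
    core = g[1:-1, 1:-1]
    is_max = np.ones(core.shape, dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbour = g[1 + di:rows - 1 + di, 1 + dj:cols - 1 + dj]
            is_max &= core > neighbour
    mask = np.zeros(g.shape, dtype=bool)
    mask[1:-1, 1:-1] = is_max
    return mask

def select_centres(g, fraction=0.8, gap=200.0):
    """Apply the '80% - 200' rule; returns (centres, values), best first."""
    g = np.asarray(g, dtype=float)
    peaks = np.argwhere(local_maxima(g))
    if len(peaks) == 0:
        return peaks, np.array([])
    values = g[peaks[:, 0], peaks[:, 1]]
    order = np.argsort(values)[::-1]               # descending order
    peaks, values = peaks[order], values[order]
    keep = values >= fraction * values[0]          # 80% of the maximum
    peaks, values = peaks[keep], values[keep]
    drops = values[:-1] - values[1:]               # gaps between ordered values
    big = np.nonzero(drops >= gap)[0]
    if big.size:                                   # assumed: cut at first big gap
        peaks, values = peaks[:big[0] + 1], values[:big[0] + 1]
    return peaks, values
```

The sketch assumes the maximum of g is positive, as it is for the covariance ranges reported in the text.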
These 5 cells are then removed from the image by centring an ellipse (equal
in size to the ellipse part of the template) on these points and setting pixels within
this ellipse to a grey level similar to the average grey level of the background (in this
example taken as the average pixel value in the image, i.e. 157). Alternatively, the
area of match could simply be masked out without modifying the image.
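A minimal sketch of the removal step is given below, assuming NumPy. The function name remove_cell, the parameter names and the rotation convention are assumptions; the background defaults to the image mean, as in the example above. The boolean mask is returned as well, so that the alternative of masking out the matched area without modifying the image can use the same function.

```python
import numpy as np

def remove_cell(image, centre, semi_axes, angle_deg=0.0, background=None):
    """Fill an ellipse centred on `centre` with a background grey level.

    Returns a modified copy of the image and the boolean mask of the ellipse.
    Assumption: semi_axes = (a, b) are semi-axis lengths in pixels.
    """
    out = np.array(image, dtype=float)             # copy; input is untouched
    if background is None:
        background = out.mean()                    # image mean as background
    ci, cj = centre
    i, j = np.indices(out.shape)
    theta = np.deg2rad(angle_deg)
    u = (i - ci) * np.cos(theta) + (j - cj) * np.sin(theta)
    v = -(i - ci) * np.sin(theta) + (j - cj) * np.cos(theta)
    inside = (u / semi_axes[0]) ** 2 + (v / semi_axes[1]) ** 2 <= 1.0
    out[inside] = background
    return out, inside
```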
The middle sized (49 × 49) template was applied to the remaining cells and the
same criterion used for choosing the centres. Cutting at the 80% level (covariance
value 1529) identified a further 5 cells in the image, which were also then removed.
The smallest (39 × 39) template was then applied and the remaining cells
identified using the 80% rule with the difference of 200 between covariance values
now being applicable. The image with all centres identified is shown in Fig. 6.
This fully automatic method of identifying centres gives an accurate cell count.
The centres were then successfully used as a basis for estimating cell sizes (results
not shown) by fitting at the centres all templates from smallest to largest (width
35, 37, ..., 63 pixels) rotated at angles from 4° less to 4° more than the recorded orientation.
The size is then estimated from the best fitting template.
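The sizing step can be sketched as a search over candidate templates at a fixed centre. This Python illustration assumes NumPy and takes "best fitting" to mean the highest raw covariance; whether scores from templates of different widths were normalised before comparison is not stated in the text, so that choice, like the function names and the dictionary of candidates, is an assumption.

```python
import numpy as np

def covariance_at(f, w, centre):
    """Covariance between template w and the image window centred at `centre`."""
    m = w.shape[0] // 2
    ci, cj = centre
    patch = f[ci - m:ci + m + 1, cj - m:cj + m + 1]
    if patch.shape != w.shape:                     # template overhangs the image
        return -np.inf
    return float(np.sum((patch - patch.mean()) * (w - w.mean())))

def best_template(f, templates, centre):
    """Return the key of the best-scoring template and all scores.

    `templates` maps a label, e.g. (width, angle), to a template array.
    """
    scores = {key: covariance_at(f, w, centre) for key, w in templates.items()}
    return max(scores, key=scores.get), scores
```

Because only one location is evaluated per template, this stage is far cheaper than the identification stage, even with many sizes and orientations.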
5 Algal Cells
The method developed for the yeast image is now applied to a 512 × 512 DIC image
of algal cells with 256 grey-levels.
Templates were constructed similarly to the yeast cell template, but assuming
that cells are circular (so rotation is unnecessary). The differencing was applied at
45° from the top right using a weighting factor with
w′(i, j) = w(i, j) − 1.5 × w(i − 1, j − 1),
to more adequately represent the cells in the image. The algal cells are semi-transparent and the weighting was chosen to remove the brightfield image component
in the DIC image.
Three sizes of template were now applied, namely squares 35, 29 and 23
pixels wide. The results from the largest template, applying the ‘80% – 200’ rule,
identified 9 cells (cut-off at 1915, i.e. 80% of the maximum covariance value). After
removing these cells (average value of background taken to be 144), the middle-sized
template identified another two cells (using the cut-off at a difference of 200). The
remaining two cells were found by the smallest template (cut-off at the 80% value of
1694). All the centres are shown in Fig. 7. Cells were then sized using square
templates of width 23, 25, ..., 35 pixels.
6 Computational Issues
After construction (and rotation if required) of the appropriate templates to represent the cells in the DIC image, the method described is fully automatic. It is fairly
accurate but computationally very intensive. For example, the CPU time taken to apply
the largest (35 × 35) template to the algal cell image was 896 seconds on a SUN
SPARCstation 10.
Various approaches to speeding up template matching have been taken in the literature. Parallel processing architectures and algorithms for template matching are considered, for instance, in [11] and [12]. For non-parallel computation, Fast
Fourier Transforms may produce faster computation of the degree of match, depending
on the chosen similarity criterion and on image and template size. This is currently
being implemented.
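For the covariance statistic used here, an FFT formulation is straightforward because the local mean drops out: the centred template sums to zero, so Σ (f − f̄)(w − w̄) = Σ f (w − w̄), which is a cross-correlation of the image with the centred template. The sketch below, assuming NumPy and a square template of odd width, illustrates this; it is not the implementation referred to in the text, and the function name and the use of minus infinity at the border are assumptions.

```python
import numpy as np

def covariance_map_fft(f, w):
    """Covariance map computed via the FFT cross-correlation theorem.

    Uses sum((f - fbar) * (w - wbar)) == sum(f * (w - wbar)), valid because
    the centred template sums to zero. Border pixels are set to -inf.
    """
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    m = n // 2
    rows, cols = f.shape
    padded = np.zeros((rows, cols))
    padded[:n, :n] = w - w.mean()                  # zero-padded centred template
    spectrum = np.fft.rfft2(f) * np.conj(np.fft.rfft2(padded))
    corr = np.fft.irfft2(spectrum, s=(rows, cols)) # circular cross-correlation
    g = np.full((rows, cols), -np.inf)
    # corr[a, b] has the template's top-left corner at (a, b); shift by m so
    # that g is indexed by the template centre, keeping only unwrapped terms.
    g[m:rows - m, m:cols - m] = corr[:rows - n + 1, :cols - n + 1]
    return g
```

The cost is then dominated by three transforms of the image size and no longer grows with the template area, which matters most for the larger templates.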
The number of computations may also be reduced by ordered matching. In [9]
probabilistic information (e.g. a grey level histogram) from the image and template
is used to order computation of the matching criterion so as to minimise the number
of computations required before the threshold value is exceeded. The degree of computational saving depends on the shape of the grey level distribution. However, this
approach is not directly applicable to the examples considered here as the threshold
is not fixed.
A different approach to speeding up matching is multi-stage matching. A two-stage approach is taken in [6], using a sub-template initially to determine good candidate
locations for a match (i.e. with similarity measure values beyond a chosen threshold),
then applying the whole template in the second stage only to locations identified
by the sub-template. The degree of speed-up (and matching success) depends on sub-template size and the stage one threshold value. In [10] the initial stage is used
to search for matches of a template feature which occurs only rarely in the image,
exhaustive template matching then being applied in the vicinity of the identified feature locations. Hierarchical matching algorithms generalise these ideas to matching
at multiple resolutions. See e.g. [1], which also considers template orientation, as
does Goshtasby [5], which uses normalised invariant moments as similarity measures
in a two-stage approach similar to that of [6]. The need to rotate templates adds
greatly to the computational burden of the algorithm described in this paper. It
would be useful to investigate the applicability of [5] to the algal and yeast images.
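The general two-stage idea can be illustrated with the covariance statistic of Section 2. The sketch below is not the algorithm of [6]: it assumes NumPy, takes the sub-template to be the central block of the full template, and uses a fixed stage one threshold supplied by the caller; all names and these choices are illustrative assumptions.

```python
import numpy as np

def covariance_at(f, w, i, j):
    """Covariance between template w and the image window centred at (i, j)."""
    m = w.shape[0] // 2
    patch = f[i - m:i + m + 1, j - m:j + m + 1]
    return float(np.sum((patch - patch.mean()) * (w - w.mean())))

def two_stage_match(f, w, sub_half, stage_one_threshold):
    """Evaluate the full template only where a central sub-template passes.

    Returns the covariance map (-inf where not evaluated) and the number of
    locations at which the full template was applied.
    """
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    m = w.shape[0] // 2
    sub = w[m - sub_half:m + sub_half + 1, m - sub_half:m + sub_half + 1]
    g = np.full(f.shape, -np.inf)
    evaluated = 0
    for i in range(m, f.shape[0] - m):
        for j in range(m, f.shape[1] - m):
            # stage one: cheap test with the small central sub-template
            if covariance_at(f, sub, i, j) >= stage_one_threshold:
                # stage two: full template at candidate locations only
                g[i, j] = covariance_at(f, w, i, j)
                evaluated += 1
    return g, evaluated
```

The saving depends on how few locations pass stage one, and a threshold that is too strict risks discarding true matches, in line with the dependence on sub-template size and threshold noted above.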
The method described has also been applied to other microscope images, namely
a 512 × 768 DIC image of algal and bacterial cells and two 512 × 512 brightfield images (without template differencing), one of rod-shaped algal cells and the other of
an insect-like organism comprising four distinct but touching parts. The results were
less successful for images in which objects overlap, and also disappointing for very
blurred images.
Further work will involve assessment of accuracy of sizing via simulation, after
issues of computational efficiency have been addressed.
Acknowledgement
The first author was supported by an Earmarked Studentship from the Engineering
and Physical Sciences Research Council.
References
[1] Anisimov, V.A. and Gorsky, N.D. (1993). Fast hierarchical matching of an arbitrarily oriented template. Pattern Recognition Letters, 14, 95-101.
[2] Cogswell, C.J. and Sheppard, C.J.R. (1990). Confocal differential interference contrast (DIC) microscopy: including a theoretical analysis of conventional and confocal DIC imaging. Journal of Microscopy, 165, 81-101.
[3] Fallowfield, H.J. and Martin, N.J. (1990). The operation, performance and computer design of high rate algal ponds. Institute of Chemical Engineering Symposium Series, 111.
[4] Gonzalez, R.C. and Woods, R.E. (1992). Digital Image Processing. Addison-Wesley, Massachusetts.
[5] Goshtasby, A. (1985). Template matching in rotated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7, 338-344.
[6] Goshtasby, A., Gage, S.H. and Bartholic, J.F. (1984). A two-stage cross correlation approach to template matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 374-378.
[7] Herman, B. and Jacobson, K. (1990). Optical Microscopy for Biology. Wiley, New York.
[8] Holmes, T.J. and Levy, W.J. (1987). Signal-processing characteristics of differential interference contrast microscopy. Applied Optics, 26, 3929-3939.
[9] Margalit, A. and Rosenfeld, A. (1990a). Using probabilistic domain knowledge to reduce the expected computational cost of template matching. Computer Vision, Graphics and Image Processing, 51, 219-234.
[10] Margalit, A. and Rosenfeld, A. (1990b). Using feature probabilities to reduce the expected computational cost of template matching. Computer Vision, Graphics and Image Processing, 52, 110-123.
[11] Prasanna Kumar, V.K. and Krishnan, V. (1989). Efficient parallel algorithms for image template matching on Hypercube SIMD machines. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 665-669.
[12] Sid-Ahmed, M.A. (1990). Serial architectures for the implementation of 2-D digital filters and for template matching in digital images. IEEE Transactions on Acoustics, Speech and Signal Processing, 38, 853-857.
Figure 1: Yeast-cell image
Figure 2: Ellipse template
Figure 3: Final template with added Gaussian blur
Figure 4: Centres identified by the method
Figure 5: 11 highest covariance values
Figure 6: Centres of cells identified by proposed method
Figure 7: Algal cell centres identified