Tooth Segmentation from Cone-Beam CT using Graph Cut

L.T. Hiew∗, S.H. Ong∗ and Kelvin W.C. Foong†
∗ National University of Singapore, Electrical and Computer Engineering
† National University of Singapore, Preventive Dentistry
Abstract—Cone beam computed tomography (CBCT) can provide
dentists with accurate 3D diagnostic images of the maxillofacial
region at a lower radiation dose compared to conventional
medical CT. Due to low image contrast, higher image noise
and missing image boundaries, tooth segmentation in CBCT
is difficult even for experienced radiographic interpreters. In
this paper, we propose a graph cuts segmentation approach for
obtaining a 3D tooth model from CBCT images. A 3D Markov
Random Field (MRF) is used to model the CBCT 3D images. We
then use graph cuts to obtain the optimal image segmentation.
For a total of 25 tooth data sets, our results show an average
Dice similarity coefficient of 0.89.
I. INTRODUCTION
Cone beam computed tomography (CBCT) has been used
in dentistry since the mid 1990s. It uses a cone shaped X-ray
beam which rotates around the patient to acquire a volumetric
data set. The main advantage of CBCT is that the radiation
dosage is considerably less than conventional CT scanning. In
orthodontics, CBCT is commonly used for management and
treatment planning of ectopic teeth, impacted teeth, unerupted
molars, root resorption and fractured roots [1] [2] [3]. Several
authors [1] [2] have shown that the reconstruction of virtual 3D
teeth models can provide more accurate diagnostic information
that may lead to better treatment planning decisions and
potentially more predictable outcomes for the above clinical
cases.
Teeth segmentation in CBCT is an important step in creating
accurate 3D teeth models. Several factors make teeth
segmentation a difficult task: a) the close proximity
of adjacent tooth structures, b) topological differences in the
roots of a tooth, and c) higher image noise in CBCT images.
The majority of work relevant to this area of research uses the
level set method [9] for teeth segmentation. The level set method
is a numerical technique for tracking interfaces and shapes.
A 2D level set with shape and intensity priors was used for
tooth segmentation from CT images [4]. In [5], panoramic
resampling and a 2D level set were used to segment teeth from
CT images. Using thin plate splines (TPS) and a level set with
shape priors, [6] obtained the “best-fit” polygonal surface of
the tooth from CT. Lastly, using a graphical representation,
a generic model-based segmentation algorithm was used for
tooth segmentation from CBCT [7].
Although the level set method is widely used in teeth
segmentation, energy minimization in the level set framework
relies on gradient descent and can therefore get
stuck in a local minimum. This reduces the robustness and
the accuracy of the segmentation results. In this paper, we
explore a graph cuts approach for teeth segmentation. Graph
cuts [8] can be employed to solve a wide variety of low-level
computer vision problems efficiently. A two-label image
segmentation problem can be solved exactly using
this approach. Furthermore, for problems with more than two
labels, solutions can be found within a known factor of the
global minimum [8].
This paper is structured into four parts. We first discuss
the formulation of the problem and explain how graph cuts is
applied. We then talk about the materials that are used in this
study. This is followed by experiments, results and discussion.
Finally, we end our paper with a conclusion.
II. METHOD
A. Problem formulation
The segmentation problem can be treated as an energy
minimization such that for a set of voxels P and a set of labels
L, the goal is to find a labeling f : P → L that minimizes
some energy E(f ). Using Markov Random Fields (MRFs)
with unary and pairwise cliques to model f [10], the energy
is given by
$$E(f) = \sum_{p \in P} U_p(f_p) + \sum_{(p,q) \in N} V_{pq}(f_p, f_q), \qquad f \in L^P \tag{1}$$
where N is the set of pairs of adjacent voxels. Up(fp) is the penalty of
assigning label fp ∈ L to p, and Vpq(fp, fq) is the penalty of
labeling the pair p and q with labels fp, fq ∈ L, respectively.
In this paper, L = {0, 1}, and the minimum of E(f) can be
computed efficiently with graph cuts when Vpq is a submodular
function, i.e. Vpq(0, 0) + Vpq(1, 1) ≤ Vpq(0, 1) + Vpq(1, 0) [11].
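As an illustration only, the following Python sketch evaluates the energy of Eq. (1) for a binary labeling and checks the submodularity condition. The three-voxel chain, its penalty values and all function names are assumptions chosen for the example; they are not taken from the experiments in this paper. The minimum is found here by brute-force enumeration, which is feasible only for toy problems.

```python
from itertools import product

def pairwise_table(weight):
    """V_pq as a 2x2 table with V[a][b] = weight * |a - b| (illustrative choice)."""
    return [[0.0, weight], [weight, 0.0]]

def is_submodular(V):
    """Check V(0,0) + V(1,1) <= V(0,1) + V(1,0)."""
    return V[0][0] + V[1][1] <= V[0][1] + V[1][0]

def energy(labels, unary, edges):
    """E(f) = sum_p U_p(f_p) + sum_{(p,q) in N} V_pq(f_p, f_q).

    labels: dict voxel -> 0 or 1
    unary:  dict voxel -> (U_p(0), U_p(1))
    edges:  dict (p, q) -> 2x2 table V_pq
    """
    data = sum(unary[p][labels[p]] for p in labels)
    smooth = sum(V[labels[p]][labels[q]] for (p, q), V in edges.items())
    return data + smooth

# Hypothetical toy problem: a chain of three voxels 0 - 1 - 2.
unary = {0: (2.0, 0.5), 1: (1.0, 1.0), 2: (0.2, 3.0)}
edges = {(0, 1): pairwise_table(0.8), (1, 2): pairwise_table(0.3)}

all_submodular = all(is_submodular(V) for V in edges.values())

# Enumerate all 2^3 labelings and keep the one with the lowest energy.
best = min(
    (dict(zip(unary, f)) for f in product((0, 1), repeat=len(unary))),
    key=lambda f: energy(f, unary, edges),
)
best_energy = energy(best, unary, edges)
print(all_submodular, best, best_energy)
```

For a volume of realistic size the number of labelings grows as 2^|P|, which is why an exact polynomial-time method such as graph cuts is needed.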
B. Graph-Cuts
Let G = (V ∪ {s, t}, E) be an arc-weighted directed graph.
In addition to the nodes V, which correspond to image pixels
(voxels), the graph contains two special terminal nodes, namely
the source s and the sink t. The edge set E consists of n-links
and t-links. n-links connect pairs of neighboring pixels, and
their costs are derived from the smoothness term Vpq. t-links
connect pixels with the terminals, and their costs are derived
from the data term Up.
A subset of edges C ⊂ E is called an s-t cut in G if its
removal partitions the nodes into two disjoint subsets S and
T in the induced graph (V ∪ {s, t}, E − C), such that s ∈ S,
t ∈ T and no path can be established from s to t. The cost
|C| of the cut is the sum of all edge weights in C. For a given
graph, the minimum cost cut (mincut) can be found by solving
an equivalent maximum flow (maxflow) problem [12].
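The mincut/maxflow equivalence can be illustrated with a short Python sketch. It uses the textbook Edmonds-Karp augmenting-path algorithm, chosen here only for brevity; it is not the max-flow implementation used in our experiments. The four-node graph and its capacities are hypothetical.

```python
from collections import defaultdict, deque

def min_cut(capacity, s, t):
    """Edmonds-Karp max-flow. Returns (flow value, source-side node set S)."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0.0  # make sure the reverse arc exists

    def reachable_from_source():
        # Breadth-first search over arcs with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push
    # Nodes still reachable from s in the residual graph form S.
    return flow, set(parent)

# Hypothetical graph with two intermediate nodes a and b.
capacity = {("s", "a"): 4.0, ("s", "b"): 2.0, ("a", "b"): 1.0,
            ("a", "t"): 2.0, ("b", "t"): 3.0}
flow, S = min_cut(capacity, "s", "t")
cut_cost = sum(c for (u, v), c in capacity.items() if u in S and v not in S)
print(flow, sorted(S), cut_cost)
```

In this example the cost of the arcs leaving S equals the maximum flow value, as the max-flow min-cut theorem states.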
Proceedings of the Second APSIPA Annual Summit and Conference, pages 272–275,
Biopolis, Singapore, 14-17 December 2010.
10-0102720275©2010 APSIPA. All rights reserved.
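TABLE I
AXIAL SLICES OF A CANINE, INCISOR, PREMOLAR AND MOLAR.
[Image table: rows are Canine, Incisor, Premolar and Molar; columns are slice numbers 15, 35, 55 and 75.]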
C. Minimizing E(f ) with Graph-Cuts
A voxel p ∈ P is assigned label fp = 1 (object) if
p ∈ S and fp = 0 (background) if p ∈ T. As a result, each cut
produces a binary labeling f and a corresponding energy
E(f). The goal is to assign appropriate weights to the graph’s
edges so that the mincut cost |C| is equal to the minimum
energy E(f).
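The correspondence between cuts and labelings can be checked numerically on a toy problem. The sketch below assumes one possible weight assignment that is consistent with the labeling convention above: the t-link from s to p carries Up(0) (it is cut when p falls in T), the t-link from p to t carries Up(1) (it is cut when p falls in S), and each neighboring pair is joined by n-links in both directions. The voxel chain, the numerical values and the function names are hypothetical.

```python
from itertools import product

def build_arcs(unary, nlinks):
    """Arc weights for the s-t graph under the assumed assignment."""
    arcs = {}
    for p, (u0, u1) in unary.items():
        arcs[("s", p)] = u0   # cut when p lands in T, i.e. f_p = 0
        arcs[(p, "t")] = u1   # cut when p lands in S, i.e. f_p = 1
    for (p, q), g in nlinks.items():
        arcs[(p, q)] = g      # n-links in both directions
        arcs[(q, p)] = g
    return arcs

def cut_cost(arcs, source_side):
    """Sum of the weights of arcs going from S to T."""
    return sum(w for (u, v), w in arcs.items()
               if u in source_side and v not in source_side)

def energy(labels, unary, nlinks):
    """E(f) with a pairwise term of the form g(p, q) * |f_p - f_q|."""
    data = sum(unary[p][labels[p]] for p in unary)
    smooth = sum(g * abs(labels[p] - labels[q]) for (p, q), g in nlinks.items())
    return data + smooth

# Hypothetical chain of three voxels with (U_p(0), U_p(1)) pairs.
unary = {0: (2.0, 0.5), 1: (1.0, 1.0), 2: (0.2, 3.0)}
nlinks = {(0, 1): 0.8, (1, 2): 0.3}
arcs = build_arcs(unary, nlinks)

# For every labeling, compare the cost of the induced cut with E(f).
max_gap = 0.0
for f in product((0, 1), repeat=len(unary)):
    labels = dict(zip(unary, f))
    source_side = {"s"} | {p for p, lab in labels.items() if lab == 1}
    gap = abs(cut_cost(arcs, source_side) - energy(labels, unary, nlinks))
    max_gap = max(max_gap, gap)
print(max_gap)
```

Because the cut cost matches E(f) for every labeling in this example, the minimum cut and the minimum-energy labeling coincide.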
In this paper, the unary penalty Up(fp) is defined as the
negative log-likelihood of the voxel intensity under the given
object and background image intensity histograms:
$$U_p(f_p) = -\ln p(I_p \mid f_p) \tag{2}$$
For the pairwise penalty Vpq, we use the following weight assignment [13]:
$$V_{pq}(f_p, f_q) = g(p, q) \cdot |f_p - f_q| \tag{3}$$
where
$$g(p, q) = \exp\left(-\frac{(I_p - I_q)^2}{2\sigma^2}\right) \cdot \frac{1}{\mathrm{dist}(p, q)} \tag{4}$$
Here Ip is the intensity of pixel p and dist(p, q) is the Euclidean
distance between pixels p and q. The parameter σ is the estimated
image noise.
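A minimal Python sketch of Eqs. (2) and (4) is given below. The number of histogram bins, the intensity range, the small probability floor eps (added so that the logarithm stays finite for empty bins) and the sample intensities are all assumptions made for this example; they are not the settings used in our experiments.

```python
import math

def bin_index(intensity, n_bins, lo, hi):
    """Map an intensity in [lo, hi] to a histogram bin."""
    width = (hi - lo) / n_bins
    return max(0, min(int((intensity - lo) / width), n_bins - 1))

def histogram_likelihood(samples, n_bins, lo, hi, eps=1e-6):
    """Normalized intensity histogram p(I | label), floored at eps."""
    counts = [0] * n_bins
    for x in samples:
        counts[bin_index(x, n_bins, lo, hi)] += 1
    return [max(c / len(samples), eps) for c in counts]

def unary_penalty(intensity, hist, n_bins, lo, hi):
    """U_p(f_p) = -ln p(I_p | f_p), as in Eq. (2)."""
    return -math.log(hist[bin_index(intensity, n_bins, lo, hi)])

def boundary_weight(ip, iq, p, q, sigma):
    """g(p, q) = exp(-(I_p - I_q)^2 / (2 sigma^2)) / dist(p, q), as in Eq. (4)."""
    return math.exp(-((ip - iq) ** 2) / (2.0 * sigma ** 2)) / math.dist(p, q)

# Hypothetical seed intensities for object and background.
N_BINS, LO, HI = 8, 0, 256
obj_hist = histogram_likelihood([180, 190, 200, 210, 220], N_BINS, LO, HI)
bkg_hist = histogram_likelihood([20, 30, 40, 50, 60], N_BINS, LO, HI)

u_obj = unary_penalty(205, obj_hist, N_BINS, LO, HI)   # low: 205 looks like object
u_bkg = unary_penalty(205, bkg_hist, N_BINS, LO, HI)   # high: unlikely background

g_flat = boundary_weight(100, 100, (0, 0, 0), (0, 0, 1), sigma=10.0)
g_edge = boundary_weight(100, 110, (0, 0, 0), (0, 0, 1), sigma=10.0)
g_diag = boundary_weight(100, 100, (0, 0, 0), (1, 1, 0), sigma=10.0)
print(u_obj, u_bkg, g_flat, g_edge, g_diag)
```

The weight g is largest between neighbors of similar intensity, so the cut is encouraged to pass through intensity edges, and diagonal neighbors are down-weighted by their larger distance.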
III. EXPERIMENT AND RESULTS
A. Data acquisition
CBCT head scans are acquired from real human subjects.
The CBCT images have an isotropic resolution of 0.3 mm.
As the aim of this study is to evaluate the accuracy of graph
cuts in individual tooth segmentation, 3D volumes each containing
an embedded tooth are carefully cropped out from the CBCT head
scans. The numbers of canines, incisors, pre-molars and molars
included in this study are 6, 8, 7 and 4, respectively, for a total
of 25 teeth. The average size of the tooth data sets is 55 × 48 × 83
voxels. Some of the cropped CBCT images are shown in Table I.
B. Ground truth segmentation
The ground truth segmentation is obtained using a semi-automatic
approach with the open-source tool ITK-SNAP [14]. ITK-SNAP
allows the user to iteratively refine the segmentation
results by controlling the parameters of a geodesic active
contour. Users can also edit the segmentation results
using an interactive brush editing tool provided by ITK-SNAP.
C. Foreground and background voxels selection
Foreground (F) and background (B) voxels are needed to
initialize graph cuts. In this paper, F is obtained using
morphological erosion on the ground truth segmentation, while
B is obtained using morphological dilation.
Both procedures use a disk structuring element. The size of
the structuring element is 5 and 10 for F and B, respectively.
D. Dice similarity coefficient
The Dice similarity coefficient [15] is used to measure the
accuracy of the segmentation results. The equation for this
measure is
$$\mathrm{DSC} = \frac{TP}{TP + FP + FN} \tag{5}$$
where TP stands for true positives (voxels correctly classified as F),
FP for false positives (voxels incorrectly classified as F) and
FN for false negatives (voxels incorrectly classified as B).
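The overlap between a segmentation and the ground truth can be computed from two binary masks, as in the following Python sketch. The function overlap_eq5 follows Eq. (5) literally; this ratio has the same form as the Jaccard index. The function dice_standard uses the more common form of the Dice coefficient, 2TP / (2TP + FP + FN). Both are shown because the text does not say which form produced the reported values. The flattened six-voxel masks and the function names are hypothetical.

```python
def confusion_counts(segmentation, ground_truth):
    """Count TP, FP and FN over two equally long binary sequences."""
    tp = fp = fn = 0
    for s, g in zip(segmentation, ground_truth):
        if s and g:
            tp += 1        # labeled F and truly F
        elif s and not g:
            fp += 1        # labeled F but truly B
        elif g and not s:
            fn += 1        # labeled B but truly F
    return tp, fp, fn

def overlap_eq5(tp, fp, fn):
    """TP / (TP + FP + FN), the ratio written in Eq. (5)."""
    return tp / (tp + fp + fn)

def dice_standard(tp, fp, fn):
    """2 TP / (2 TP + FP + FN), the usual form of the Dice coefficient."""
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical flattened masks (1 = tooth, 0 = background).
segmentation = [1, 1, 1, 0, 0, 1]
ground_truth = [1, 1, 0, 0, 1, 1]

tp, fp, fn = confusion_counts(segmentation, ground_truth)
print(tp, fp, fn, overlap_eq5(tp, fp, fn), dice_standard(tp, fp, fn))
```

For a 3D volume the same counts are obtained by iterating over all voxels of the two masks.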
E. Results and discussion
For the given foreground and background initializations, our
segmentation results have an average DSC of 0.89, with all
DSC measures greater than 0.70. Average DSC measures for
canines, incisors, pre-molars and molars are 0.90, 0.91, 0.87
and 0.88, respectively. Some of the segmentation results are
shown in Table II. A detailed tabulation of our DSC measures is
given in Table III. Segmentation of the roots of a tooth is
more difficult than that of the crown in CBCT images. Out of the
25 data sets, there were 9 in which the roots were not properly
segmented: 2 canines, 2 incisors, 4 molars and 1 premolar.
We observed that the extent of the foreground initialization
can have a serious impact on the segmentation results.
Segmentation of the roots of the tooth becomes increasingly
difficult when no foreground voxels are labeled in
the root regions. We can demonstrate the effect of diminishing
foreground labels by controlling the erosion disk size for F.
A visualization of a canine segmentation is shown in Table IV.
The DSC decreases as the erosion disk size is increased.
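The loss of root seeds under stronger erosion can be reproduced with a small 2D Python sketch. The synthetic shape (a wide crown on top of a narrow root), the use of the disk radius as the size parameter and all names below are assumptions made for illustration. The experiments in this paper operate on 3D CBCT volumes, and the sketch is not the code used there.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) that lie inside a disk of the given radius."""
    r2 = radius * radius
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= r2]

def erode(mask, radius):
    """Binary erosion: keep a pixel only if the whole disk fits in the mask."""
    h, w = len(mask), len(mask[0])
    offsets = disk_offsets(radius)

    def inside(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x] == 1

    return [[int(all(inside(y + dy, x + dx) for dy, dx in offsets))
             for x in range(w)]
            for y in range(h)]

ROOT_START = 12  # first row of the synthetic root

def toy_tooth(h=30, w=20):
    """Synthetic mask: 10 x 14 crown above a 14 x 4 root."""
    mask = [[0] * w for _ in range(h)]
    for y in range(2, ROOT_START):
        for x in range(3, 17):
            mask[y][x] = 1
    for y in range(ROOT_START, 26):
        for x in range(8, 12):
            mask[y][x] = 1
    return mask

def root_seed_count(mask):
    """Number of foreground pixels left in the root rows."""
    return sum(sum(row) for row in mask[ROOT_START:])

truth = toy_tooth()
root_seeds = {r: root_seed_count(erode(truth, r)) for r in (0, 1, 2)}
print(root_seeds)
```

In this toy shape the crown still contains seeds at radius 2 while the narrow root has none, which mirrors the situation in which the roots are left without foreground labels.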
TABLE II
INITIALIZATION AND SEGMENTATION RESULTS OF A CANINE AND A MOLAR. THE RED AND BLUE CONTOURS ARE THE INITIAL FOREGROUND AND BACKGROUND VOXELS, RESPECTIVELY. SEGMENTATION RESULTS ARE INDICATED IN GREEN.
[Image table: rows are Canine (Initial, Graph Cut) and Molar (Initial, Graph Cut); columns are slice numbers 10, 20, 30, 40, 50 and 60.]

TABLE III
DSC, SIZE, TIME AND SUCCESS OF ROOT SEGMENTATION OF ALL 25 TEETH. FOR THE LAST COLUMN, Y INDICATES A SUCCESS, N INDICATES A FAILURE.
Index  Type      Size (voxels)    DSC   Time (s)  Root
1      Canine    45 × 60 × 83     0.87  14.67     Y
2      Canine    60 × 57 × 71     0.86  16.99     N
3      Canine    60 × 51 × 91     0.89  18.95     N
4      Canine    60 × 53 × 92     0.94  19.86     Y
5      Canine    59 × 59 × 72     0.91  29.88     Y
6      Canine    59 × 59 × 71     0.94  30.39     Y
7      Incisor   43 × 49 × 106    0.91  11.74     N
8      Incisor   46 × 49 × 80     0.88  13.89     Y
9      Incisor   47 × 58 × 78     0.93  14.59     Y
10     Incisor   59 × 38 × 97     0.87  14.81     N
11     Incisor   56 × 39 × 86     0.91  15.50     Y
12     Incisor   46 × 32 × 98     0.89  16.88     Y
13     Incisor   59 × 38 × 96     0.91  19.21     Y
14     Incisor   44 × 52 × 74     0.95  25.41     Y
15     Molar     58 × 56 × 76     0.83  17.64     N
16     Molar     57 × 25 × 84     0.92  18.64     Y
17     Molar     72 × 26 × 81     0.96  19.44     Y
18     Molar     57 × 46 × 97     0.96  21.53     Y
19     Molar     54 × 42 × 84     0.85  36.65     N
20     Molar     51 × 66 × 81     0.82  38.66     N
21     Molar     72 × 30 × 72     0.77  40.88     N
22     Premolar  39 × 49 × 101    0.95  13.58     Y
23     Premolar  46 × 47 × 75     0.93  15.77     Y
24     Premolar  59 × 56 × 78     0.70  19.95     N
25     Premolar  59 × 56 × 83     0.95  20.64     Y

TABLE IV
RECONSTRUCTION AND DSC MEASURES OF A CANINE BASED ON VARIOUS EROSION DISK SIZES FOR F.
[Image table: rows are the Front, Side and Back views; columns are Ground Truth and erosion disk sizes 3, 5 and 7 for F.]
DSC for erosion disk sizes 3, 5 and 7: 0.9232, 0.9148 and 0.8229.

F. Graph cuts vs level sets
Graph cuts is in many ways superior to level sets, for a
few reasons: a) flexibility in adding hard constraints (B and
F), b) segmentation results that are closer to the global optimum, and
c) fewer parameters to be fine-tuned. One way to demonstrate
the strength of graph cuts is to compare the segmentation
results of the two methods given the same initialization.
This is shown in Table V. As level sets
often leak in regions where edges are weak (the roots of the
tooth), graph cuts is more robust and accurate in this situation.

TABLE V
COMPARISON BETWEEN GRAPH CUTS AND LEVEL SETS SEGMENTATION WITH THE SAME INITIALIZATION FOR AN INCISOR. THE LEVEL SET HAS AN OBVIOUS LEAKING ISSUE AT THE ROOTS. GRAPH CUT HAS NO LEAKING ISSUE; HOWEVER, THE RECONSTRUCTED SURFACE IS NOT AS SMOOTH AS THAT OF THE LEVEL SET.
[Image table: rows are the Front, Side and Back views; columns are Initialization, Level Sets and Graph Cuts.]

G. Performance and time
Our program is written in MATLAB and runs on an
AMD Athlon II X4 630 processor (4 CPUs, 2.8 GHz) with 64-bit
Windows 7 and 4 GB RAM. The max-flow implementation
for graph cuts is provided by Boykov et al. [16]. The average
computational time recorded for each tooth is 21 seconds.
A detailed tabulation of segmentation performance is also given
in Table III.
IV. CONCLUSIONS
We present a method for tooth segmentation from CBCT
images using graph cuts. By formulating the 3D images in a
Markov Random Field framework, we can obtain optimal
tooth segmentation using graph cuts. Experimental results
show that the approach is accurate and robust for a total
of 25 data sets, with an average DSC score of 0.89. The
method is generally superior to the conventional level set based
approach, which is widely used in teeth segmentation. Future
improvements include: a) incorporating a shape prior to control
the growing/expansion properties of the graph cuts method,
b) incorporating higher-order smoothness constraints, and
c) multi-label teeth segmentation.
REFERENCES
[1] S. J. Merrett, N. A. Drage and P. Durning, “Cone beam computed tomography: A useful tool in orthodontic diagnosis and treatment planning,”
Journal of Orthodontics, vol. 36, pp. 202-210, September 2009.
[2] S. L. Hechler, “Cone-Beam CT: Applications in Orthodontics,” Dental
Clinics of North America, vol. 52, pp. 809-823, October 2008.
[3] C. H. Kau, S. Richmond, J. M. Palomo and M. G. Hans, “Three-dimensional cone beam computerized tomography in orthodontics,” Journal of Orthodontics, vol. 32, pp. 282-293, December 2005.
[4] H. Gao and O. Chae, “Individual tooth segmentation from CT images
using level set method with shape and intensity prior (Article in Press),”
Pattern Recognition, 2010.
[5] M. Hosntalab, R. A. Zoroofi, A. A. Tehrani-Fard and G. Shirani, “Segmentation of teeth in CT volumetric dataset by panoramic projection
and variational level set,” International Journal of Computer Assisted
Radiology and Surgery, vol. 3, pp. 257-265, June 2008.
[6] S. Liao, W. Han, R. Tong and J. Dong, “A Hybrid Approach to Extracting
Tooth Models from CT Volumes,” IMA Conference on the Mathematics
of Surfaces, pp. 308-317, June 2005.
[7] J. Keustermans, D. Seghers, W. Mollemans, D. Vandermeulen and
P. Suetens, “Image Segmentation Using Graph Representations and Local
Appearance and Shape Models,” GbRPR, pp. 353-365, June 2009.
[8] Y. Boykov, O. Veksler and R. Zabih, “Fast approximate energy
minimization via graph cuts,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 23, pp. 1222-1239, Nov 2001.
[9] S. Osher and J. A. Sethian, “Fronts propagating with curvature-dependent
speed: Algorithms based on Hamilton-Jacobi formulations,” Journal of
Computational Physics, vol. 79, pp. 12-49, Nov 1988.
[10] D. M. Greig, B. T. Porteous and A. H. Seheult, “Exact Maximum A
Posteriori Estimation for Binary Images,” J. Royal Stat. Soc. Series B,
vol. 51, pp. 271-279, Nov 1989.
[11] V. Kolmogorov and R. Zabih, “What Energy Functions Can Be Minimized via Graph Cuts?,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 26, pp. 147-159, Nov 2004.
[12] L. R. Ford and D. R. Fulkerson, “Flows in Networks,” Princeton University Press,
1962.
[13] Y. Y. Boykov and M. P. Jolly, “Interactive graph cuts for optimal
boundary & region segmentation of objects in N-D images,” International
Conference on Computer Vision, vol. 1, pp. 105-112, 2001.
[14] P. A. Yushkevich, J. Piven, H. C. Hazlett, R. G. Smith, S. Ho, J. C. Gee
and G. Gerig, “User-guided 3D active contour segmentation of anatomical
structures: Significantly improved efficiency and reliability,” Neuroimage,
vol. 31, pp. 1116-1128, 2006.
[15] L. R. Dice, “Measures of the Amount of Ecologic Association Between
Species,” Ecology, vol. 26, pp. 297-302, July 1945.
[16] Y. Boykov and V. Kolmogorov, “An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp.
1124-1137, Sept 2004.