A Dilation-based Clustering Algorithm for Anti-Reflection Glass
Inspection
Chun-Wei Yeh
University of Birmingham
Electronic, Electrical and
Computer Engineering
[email protected]
Chin-Sheng Chen
National Taipei University of
Technology
Graduate Institute of Automation
Technology
[email protected]
Chien-Liang Huang
National Taipei University of
Technology
Graduate Institute of Automation
Technology
[email protected]
ABSTRACT
This paper develops an efficient and effective dilation-based clustering algorithm (DBCA) for
anti-reflection (AR) glass defect detection using run-length encoding (RLE). The fundamental
concept of dilation-based connectivity and its limitations are described first. The architecture
of DBCA is then constructed from five procedures: (1) run-length encoding, (2) RLE-based
morphological operation, (3) RLE-based component detection algorithm, (4) relationship
construction, and (5) re-labeling connection, each of which is discussed in detail. Finally, the
experimental results indicate that DBCA can successfully overcome the effects of broken defects in
AR glass if an appropriate structuring element is selected. Moreover, the performance evaluation
shows that DBCA can be applied in real applications as a post-processing step of defect inspection.
Key words: dilation-based clustering algorithm (DBCA), run-length encoding (RLE), component
detection.
1. INTRODUCTION
In recent years, the demand for liquid crystal displays (LCDs) has increased rapidly, and quality
control in LCD manufacturing has become increasingly important. Anti-reflection (AR) glass,
which is a part of an LCD, improves the optical performance of the display by enhancing its light
transmission rate. However, it is difficult to assure the quality of AR glass by eye because human
inspection is unstable. Automatic optical inspection (AOI), a machine vision application, is
therefore used to detect defects (scratches, cracks and bright dots) during the fabrication
process, resembling the behavior of human inspectors. This can effectively increase the quality
and yield rate of AR glass manufacturing.
The scratch defect of AR glass shown in Fig. 1(b) is obtained by digital image pre-processing.
Fig. 1(c) shows that this scratch is segmented into several lines and points due to the limited
ability of image processing and clustering methodology. A human inspector easily recognizes these
fragments as one broken scratch, but an AOI system does not. The inspection procedure adversely
influences the fabrication process of AR glass if it gives the manufacturer wrong information
about the frequency of occurrence and the description of defects. Therefore, the defects of AR
glass need to be clustered (grouped) after image pre-processing, as in Fig. 1(d). This helps the
AOI system deliver defect information that is close to human eyesight.
Among previous research on clustering, spatial clustering is suitable for this application because
it can effectively partition a large number of objects consisting of two-dimensional (2D) data
into different clusters [1, 2], and it is easy to implement on digital images, which are
represented by 2D data matrices.
Figure 1 (a) Original image of AR glass, (b) defects of AR glass after image pre-processing, (c)
wrong clustering results of AR glass, and (d) correct clustering results of AR glass.
Spatial clustering makes clusters according to the distance between objects, which imitates the
ability of human eyesight. For this reason, spatial knowledge of the medium, called domain
knowledge, is necessary to decide the distance parameter. Several spatial clustering methods have
been developed, and Ester et al. [1] reported the following requirements for spatial clustering
algorithms:
(1) Minimal requirements of domain knowledge to determine the input parameters, because appropriate
values are often not known in advance when dealing with large databases.
(2) Discovery of clusters with arbitrary shape, because the shape of clusters in spatial databases
may be spherical, drawn-out, linear, elongated etc.
(3) Good efficiency on large databases, i.e. on databases of significantly more than just a few
thousand objects. (P. 226)
According to the above requirements, morphology-based clustering algorithms were proposed to
discover clusters of different shapes effectively by using morphological operators [3, 4].
Subsequently, Braga-Neto and Goutsias [5] proposed a geometry-oriented hierarchical clustering
algorithm using dilation-based multi-scale connectivity to reduce the required input parameters.
Dilation-based connectivity, which is illustrated in Fig. 2, applies the dilation operator with a
structuring element to probe the relationship between different objects. Moreover, Wang et al. [6]
utilized a morphology-based scale space method to recognize the linear and near-linear features of
seismic belts. The above algorithms are useful in general applications; however, they can incur
expensive computation time when the application involves huge data sets to connect and a large
structuring element.
Figure 2 Illustration of dilation-based multi-scale connectivity [5]. (P. 895)
Although many researchers have investigated morphology-based clustering algorithms, relatively few
works have addressed implementations that could improve performance in AOI applications. In AOI,
only the dilation operation is needed to make clusters; the method is therefore called a
dilation-based clustering algorithm, which is a kind of morphology-based clustering algorithm.
Based on the above discussion, the aim of this paper is to develop an efficient and effective
dilation-based clustering algorithm for AOI, especially for AR glass inspection, by using
run-length encoding (RLE), a data compression technique for digital images that is used here to
reduce the computation complexity. First, the concept of dilation-based connectivity is employed
to design a tool that probes the connectivity of different objects with an arbitrary structuring
element. Then the relationship between these objects is constructed using a component-labeling
algorithm, which reduces the computation complexity with RLE. Finally, the information related to
the domain knowledge is extracted and used, together with the relationship between these objects,
to make the clusters in the re-labeling procedure.
This paper is organized as follows. Section 2 presents the architecture of the dilation-based
clustering algorithm (DBCA) and describes each procedure in detail. Section 3 first shows the
performance evaluation of DBCA and then discusses the experimental results for AR glass images.
Finally, the conclusions of this paper are given in Section 4.
2. THE ARCHITECTURE OF DBCA
Fig. 3 shows the architecture of DBCA, which includes five procedures: (1) run-length
encoding, (2) RLE-based morphological operation, (3) RLE-based component detection algorithm,
(4) relationship construction, and (5) re-labeling connection. First, the original image R is
translated from a binary data matrix into the RLE structure as an RLE Image. The RLE-based
morphological operation, which uses the dilation operation, is applied to this original RLE Image
with structuring element S. Subsequently, the dilation-based component information of the dilated
RLE Image is extracted by the RLE-based component detection algorithm and stored in a data
structure called the Blob table; this table is used to build the indexes of the dilated RLE Image
in the relationship construction procedure. Finally, the clustering results, stored in the RLE
and Blob tables, are produced in the re-labeling connection procedure, which re-labels the
original RLE Image by comparing it with the indexes of the dilated RLE Image. The details of these
five procedures are described in the following subsections.
2.1 Run-Length Encoding
Run-length encoding (RLE) is a data compression technique. In a binary image, each horizontal
sequence of connected foreground pixels is translated into a structure called a run, which is
represented by three parameters: Xs, Xe and Y. These denote the starting position, the ending
position and the y position of the run in the binary data matrix, respectively [7].
Figure 3 The architecture of DBCA.
Assume that the object P can be represented by RLE as $P = \bigcup_{n=1}^{N} R_n$, where N is the
number of runs in object P, and $R_n = (Xs_n, Xe_n, Y_n)$ is the n-th run of object P. Fig. 4
illustrates the translation between the binary image and the table of runs in RLE. The indexes at
the starting position of each run in Fig. 4 (a) are the run order of the RLE table in Fig. 4 (b),
which records the run information of the binary image.
Figure 4 An example of run-length encoding: (a) a binary image; (b) an RLE table (columns Xs, Xe
and Y) which records the run information translated from the binary image.
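The encoding step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Run` type and the `rle_encode` function are hypothetical names, and the ending position `xe` is assumed to be inclusive.

```python
from typing import List, NamedTuple, Sequence


class Run(NamedTuple):
    xs: int  # starting column of the run
    xe: int  # ending column of the run (assumed inclusive)
    y: int   # row index of the run


def rle_encode(image: Sequence[Sequence[int]]) -> List[Run]:
    """Scan a binary image row by row and return its foreground runs."""
    runs: List[Run] = []
    for y, row in enumerate(image):
        x, width = 0, len(row)
        while x < width:
            if row[x]:
                start = x
                # Advance to the first background pixel after the run.
                while x < width and row[x]:
                    x += 1
                runs.append(Run(start, x - 1, y))
            else:
                x += 1
    return runs


image = [
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
runs = rle_encode(image)
```

The position of each run in the returned list plays the role of the run order in the RLE table; empty rows simply contribute no runs.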
2.2 RLE-based morphological operation
Morphological operation is a mathematical tool which analyzes the geometry and structure of
objects, and is also a useful tool for digital image processing [8]. Therefore, many researchers
have worked on improving the performance of morphological operations for general applications of
image processing. In order to reduce computation complexity, Kim et al. [7] applied RLE to the
binary morphological operations of dilation and erosion, and showed that it is more efficient
than the conventional algorithm. However, its performance was not compared with the bit-packed
implementation, a powerful algorithm for binary morphology [9]. Breuel [10] reported that RLE
morphological processing is more efficient than the bit-packed implementation, so RLE
morphological processing is adopted in this paper.
RLE-based morphological operation applies the principles of mathematical morphology to the RLE
representation, using transposition. Only the dilation operator is considered in this section
because its clustering results are close to those of human eyesight (natural clustering). An
RLE-based dilation of object P with a structuring element S can be expressed [7] as
$$P \oplus S = \bigcup_{n=1}^{N} \bigcup_{m=1}^{M} \left( R^{P}_{n} \oplus R^{S}_{m} \right) \qquad (1)$$
where N and M are the number of runs in object P and structuring element S, respectively. In
addition, $R^{P}_{n}$ and $R^{S}_{m}$ are the runs of object P and structuring element S, which
are defined as $R^{P}_{n} = (Xs^{P}_{n}, Xe^{P}_{n}, Y^{P}_{n})$ and
$R^{S}_{m} = (Xs^{S}_{m}, Xe^{S}_{m}, Y^{S}_{m})$ in the n-th and m-th run order, respectively.
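A run-wise dilation of this kind can be sketched as follows. The sketch assumes that dilating one object run by one structuring-element run is the Minkowski sum of the two, which adds the starting positions, the ending positions and the row indexes; the exact formulation in [7] may differ in detail. The names `dilate_rle` and `merge_runs` are hypothetical, and the structuring element is given in coordinates relative to its origin.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Run = Tuple[int, int, int]  # (xs, xe, y), with xe inclusive


def merge_runs(runs: List[Run]) -> List[Run]:
    """Merge overlapping or touching runs that lie on the same row."""
    rows: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for xs, xe, y in runs:
        rows[y].append((xs, xe))
    merged: List[Run] = []
    for y in sorted(rows):
        spans = sorted(rows[y])
        cur_s, cur_e = spans[0]
        for xs, xe in spans[1:]:
            if xs <= cur_e + 1:          # overlaps or touches the current run
                cur_e = max(cur_e, xe)
            else:
                merged.append((cur_s, cur_e, y))
                cur_s, cur_e = xs, xe
        merged.append((cur_s, cur_e, y))
    return merged


def dilate_rle(obj: List[Run], se: List[Run]) -> List[Run]:
    """Dilate every object run by every structuring-element run, then merge."""
    pairwise = [
        (pxs + sxs, pxe + sxe, py + sy)  # Minkowski sum of two runs
        for pxs, pxe, py in obj
        for sxs, sxe, sy in se
    ]
    return merge_runs(pairwise)


# Two fragments on one row and a 3 x 3 structuring element centred on its origin.
fragments = [(2, 3, 0), (6, 6, 0)]
square3 = [(-1, 1, -1), (-1, 1, 0), (-1, 1, 1)]
dilated = dilate_rle(fragments, square3)
```

In this example the two fragments are two pixels apart, so the 3 × 3 element makes their dilations touch and each dilated row collapses into a single run. The cost is proportional to N × M run pairs plus a sort, rather than to the number of pixels.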
The RLE-based morphological operation can be further divided into horizontal and vertical
operations. Within-line operations, which perform a one-dimensional morphological operation along
each row, replace the between-line operations by transposing the RLE Image. Therefore, an
RLE-based dilation of object P with a rectangular structuring element of size u × v can be
written as [10]:
FUNCTION Dilation2d (objects, u, v)
    Dilate1d (objects, u);
    Transpose (objects);
    Dilate1d (objects, v);
    Transpose (objects);
END
Fig. 5 illustrates the main idea with the following procedure. First, the original object is
represented by RLE in Fig. 5 (a) as the input parameter objects, and u and v are the horizontal
and vertical sizes of the rectangular structuring element, respectively. Then, one-dimensional
(1D) dilation is applied to extend the original object in the horizontal direction with parameter
u using equation (1), and the dilated result is shown in Fig. 5 (b). After the 1D dilation,
transposition exchanges the horizontal and vertical axes, as in Fig. 5 (c). Fig. 5 (d) is then
obtained from Fig. 5 (c) by extending it in the horizontal direction with parameter v. Finally,
the dilation of the original object with a rectangular structuring element of size u × v is
obtained in Fig. 5 (e) by transposing Fig. 5 (d) again.
Figure 5 RLE-based dilation procedures: (a) an original object and a rectangular structuring
element of size u × v; (b) result after the 1D dilation (within-line) of the original object with
size u; (c) result after transposition of (b); (d) result after the 1D dilation of (c); (e) final
result of the dilation of the original object with a rectangular structuring element of size u × v.
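The dilate, transpose, dilate, transpose sequence can be sketched in Python as below. Several points are assumptions made for illustration: the functions return new run lists instead of modifying `objects` in place, the origin of the structuring element is placed at index `size // 2`, and `transpose` is a naive version that expands runs into pixels, whereas Breuel [10] describes a more efficient transposition. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Run = Tuple[int, int, int]  # (xs, xe, y), with xe inclusive


def encode_rows(rows: Dict[int, List[int]]) -> List[Run]:
    """Build merged runs from a mapping of row index to foreground columns."""
    runs: List[Run] = []
    for y in sorted(rows):
        cols = sorted(set(rows[y]))
        start = prev = cols[0]
        for x in cols[1:]:
            if x != prev + 1:            # gap found: close the current run
                runs.append((start, prev, y))
                start = x
            prev = x
        runs.append((start, prev, y))
    return runs


def dilate1d(runs: List[Run], size: int) -> List[Run]:
    """Within-line dilation by a horizontal segment of `size` pixels."""
    left = size // 2                     # assumed origin position
    right = size - 1 - left
    rows: Dict[int, List[int]] = defaultdict(list)
    for xs, xe, y in runs:
        rows[y].extend(range(xs - left, xe + right + 1))
    return encode_rows(rows)


def transpose(runs: List[Run]) -> List[Run]:
    """Swap the x and y axes (naive pixel expansion, for illustration only)."""
    cols: Dict[int, List[int]] = defaultdict(list)
    for xs, xe, y in runs:
        for x in range(xs, xe + 1):
            cols[x].append(y)            # pixel (x, y) becomes (y, x)
    return encode_rows(cols)


def dilation2d(runs: List[Run], u: int, v: int) -> List[Run]:
    """Dilate by a u x v rectangle using only within-line operations."""
    runs = dilate1d(runs, u)
    runs = transpose(runs)
    runs = dilate1d(runs, v)
    return transpose(runs)


single_pixel = [(5, 5, 5)]
block = dilation2d(single_pixel, 3, 3)
```

A single pixel dilated by a 3 × 3 rectangle becomes three runs of length three, which is an easy way to check that both passes and both transpositions are applied.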
2.3 RLE-based component detection algorithm
A computer sees a binary image only as a binary data matrix; therefore, connected component
labeling algorithms were developed to help the computer extract information such as the area,
location and number of objects from a binary image. The concept of connected component labeling
is to check the relationship between a pixel and its 4-adjacent or 8-adjacent neighboring pixels
in a binary image. If pixels are connected in this sense, they belong to the same component.
Many component-labeling algorithms, which can be divided into two types, pixel-based and
RLE-based, have been proposed in past years. One well-known pixel-based algorithm is that of
Chang et al. [12], who applied a contour tracing technique to group the components of an image in
a one-pass component-labeling algorithm. In order to reduce computation complexity, a few
researchers focused on RLE-based component labeling, such as the Line-Scan cluster (LSC)
algorithm [11] and the run-based two-scan labeling algorithm [13]. Finally, the authors [14]
applied two data structures, called the RLE and Blob tables, to obtain the component information,
and the resulting performance significantly exceeds that of Chang et al. [12].
The component-detecting algorithm [14] is adopted in this section to obtain the component
information from the binary data matrix of an image by using two data structures: the RLE and
Blob tables. In this algorithm, the run structure $R_n = (Xs_n, Xe_n, Y_n)$ discussed in Section
2.1 needs to be re-defined as $R_n = (Xs_n, Xe_n, Y_n, Next_n)$, where Next records the run order
of the next run in the same object. In brief, the algorithm performs a raster scan to detect the
connection of each run in the RLE Image using the information between consecutive rows, and the
Blob table stores the information of each component, which is related to the connected runs of
the RLE table. Fig. 6 [14] illustrates the concept of the RLE-based connection algorithm: blob 0
and blob 1 are grouped into the same component through the processing shown in Fig. 6.
Figure 6 RLE-based component detection: (a) a binary image; (b) RLE table before scanning row
3 of the original image; (c) RLE table after scanning row 3 of the original image; (d) Blobs table
before scanning row 3 of the original image; (e) Blobs table after scanning row 3 of the original
image [14]. (P. 182)
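The row-by-row connection test can be illustrated with the short Python sketch below. It is not the RLE and Blob table algorithm of [14]: a union-find structure is swapped in for the Blob table to keep the example small, and the names `label_runs` and `find` are hypothetical. Two runs on consecutive rows are treated as connected when their column ranges overlap, with one pixel of slack for 8-connectivity.

```python
from typing import Dict, List, Optional, Tuple

Run = Tuple[int, int, int]  # (xs, xe, y), with xe inclusive


def find(parent: List[int], i: int) -> int:
    """Return the root of run i, compressing the path on the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def label_runs(runs: List[Run], eight_connected: bool = True) -> List[int]:
    """Return one component label per run, in the order of the input list."""
    order = sorted(range(len(runs)), key=lambda i: (runs[i][2], runs[i][0]))
    parent = list(range(len(runs)))
    slack = 1 if eight_connected else 0
    prev_row: List[int] = []
    cur_row: List[int] = []
    cur_y: Optional[int] = None
    for i in order:
        xs, xe, y = runs[i]
        if y != cur_y:
            # Only the directly preceding row can connect to the new row.
            prev_row = cur_row if cur_y is not None and y == cur_y + 1 else []
            cur_row, cur_y = [], y
        for j in prev_row:
            pxs, pxe, _ = runs[j]
            if pxs <= xe + slack and xs <= pxe + slack:
                ra, rb = find(parent, i), find(parent, j)
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)  # merge two blobs
        cur_row.append(i)
    ids: Dict[int, int] = {}
    return [ids.setdefault(find(parent, i), len(ids)) for i in range(len(runs))]


# A U-shaped object plus one isolated run: the two arms start as separate
# blobs and are merged when the bottom row is scanned.
u_shape = [(0, 0, 0), (4, 4, 0), (8, 9, 0), (0, 0, 1), (4, 4, 1), (0, 4, 2)]
labels = label_runs(u_shape)
```

The merge of the two arms of the U mirrors the situation in Fig. 6, where two blobs turn out to belong to the same component once a later row is scanned.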
2.4 Relationship construction
The clusters are constructed by RLE-based connected-component detection using dilation-based
connectivity. Therefore, the run structure $R_n = (Xs_n, Xe_n, Y_n, Next_n)$ discussed in Section
2.3 needs to be extended as $R_n = (Xs_n, Xe_n, Y_n, Next_n, Label_n)$, where Label records the
index of the cluster to which the run belongs. After RLE-based connected-component detection, the
cluster information can be extracted from the RLE and Blob tables by the following function.
FUNCTION RelationshipConstruction (RLE, Blob, Blob_amount)
    FOR i FROM 0 TO Blob_amount DO
        IF Blob(i).Parent = NULL THEN
            ClusterId = i;
            RLEId = Blob(i).First;
            FOR j FROM 0 TO Blob(i).Size DO
                RLE(RLEId).Label = ClusterId;
                RLEId = RLE(RLEId).Next;
            END FOR
        END IF
    END FOR
END
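The function above can be transcribed into Python roughly as follows. The record layouts are assumptions made for illustration: `RunRecord` and `Blob` are hypothetical classes, a root blob is assumed to have `parent` set to `None`, and the runs of merged blobs are assumed to have already been chained into the root blob through `next`, with `size` counting the runs in that chain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunRecord:
    xs: int
    xe: int
    y: int
    next: Optional[int] = None    # index of the next run in the same blob
    label: Optional[int] = None   # cluster index, filled in below


@dataclass
class Blob:
    first: int                    # index of the first run of the blob
    size: int                     # number of runs chained through `next`
    parent: Optional[int] = None  # None marks a root blob (assumption)


def relationship_construction(rle: List[RunRecord], blobs: List[Blob]) -> None:
    """Write the index of each root blob into the label of all its runs."""
    for blob_id, blob in enumerate(blobs):
        if blob.parent is not None:
            continue              # merged blobs are labeled through their root
        run_id = blob.first
        for _ in range(blob.size):
            rle[run_id].label = blob_id
            run_id = rle[run_id].next


rle_table = [
    RunRecord(0, 0, 0, next=2),   # blob 0: run 0 -> run 2
    RunRecord(4, 5, 0),           # blob 1: run 1 only
    RunRecord(0, 1, 1),
]
blob_table = [Blob(first=0, size=2), Blob(first=1, size=1), Blob(first=2, size=1, parent=0)]
relationship_construction(rle_table, blob_table)
cluster_labels = [run.label for run in rle_table]
```

Here the third blob has a parent and is skipped, because its run is already reachable from blob 0 through the `next` chain.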
2.5 Re-labeling connection
Re-labeling connection is applied between the original RLE Image R and the dilated RLE Image D.
From Section 2.3, R can be expressed as $R = \{ R^{R}_{n} \mid n = 1, \dots, N \}$, where N is
the number of runs in the original RLE Image. Similarly, D can be expressed as
$D = \{ R^{D}_{m} \mid m = 1, \dots, M \}$, where M is the number of runs in the dilated RLE
Image D. Therefore, the overlap between the runs of these two RLE Images is used to make clusters
of the original RLE Image: run $R^{R}_{n}$ takes the Label of run $R^{D}_{m}$ if
$$Y^{R}_{n} = Y^{D}_{m} \quad \text{and} \quad [Xs^{R}_{n}, Xe^{R}_{n}] \cap [Xs^{D}_{m}, Xe^{D}_{m}] \neq \emptyset. \qquad (2)$$
According to equation (2), each run of the original RLE Image can easily find the cluster it
belongs to by comparing with the Label of the dilated RLE Image. Fig. 7 shows the whole procedure
applied to a binary image: the original image R is grouped into two clusters using dilation-based
clustering with structuring element S.
Figure 7 The architecture of DBCA with a binary image (original RLE Image with its RLE and Blob
tables).
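The re-labeling step can be sketched as an overlap test between runs on the same row. This is a minimal sketch under the assumption that a run of the original image inherits the label of the first dilated run it overlaps; the function name `relabel` and the list-based layout are hypothetical and do not reproduce the RLE and Blob tables of the paper.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Run = Tuple[int, int, int]  # (xs, xe, y), with xe inclusive


def relabel(original: List[Run], dilated: List[Run],
            dilated_labels: List[int]) -> List[Optional[int]]:
    """Give each original run the label of the dilated run it overlaps."""
    by_row: Dict[int, List[Tuple[int, int, int]]] = defaultdict(list)
    for (xs, xe, y), lab in zip(dilated, dilated_labels):
        by_row[y].append((xs, xe, lab))
    labels: List[Optional[int]] = []
    for xs, xe, y in original:
        label = None
        for dxs, dxe, lab in by_row.get(y, []):
            if dxs <= xe and xs <= dxe:   # same row and overlapping columns
                label = lab
                break
        labels.append(label)
    return labels


# Three fragments of a row; after dilation the first two fall inside one
# dilated run, so they end up in the same cluster.
original_runs = [(1, 2, 0), (5, 5, 0), (12, 13, 0)]
dilated_runs = [(0, 6, 0), (11, 14, 0)]
dilated_run_labels = [0, 1]
clusters = relabel(original_runs, dilated_runs, dilated_run_labels)
```

The original runs keep their own extent; only their labels change, which is what lets a broken scratch be reported as one defect without altering its measured area.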
3. EXPERIMENTAL RESULTS
In this section, the performance of DBCA is evaluated on three synthetic databases taken from
Ester et al. [1], which are shown in Fig. 8. All the experiments were performed with Borland C++
Builder 6.0 on a laptop with a 1.3 GHz Pentium dual-core processor and 2 GB of memory, and the
execution time was recorded.
Figure 8 The synthetic databases for evaluation: database 1, database 2 and database 3 [1]. (P. 227)
Fig. 9 shows the clustering results, which are used to verify the accuracy of DBCA by human
inspection on the three synthetic databases. In database 1, four roughly circular clusters are
obtained using DBCA with a rectangular structuring element of size 15 × 15, and the execution
time is approximately 13 ms, which includes all the procedures described in Section 2. Database 2
contains four clusters of irregular shape, which are detected using DBCA with a rectangular
structuring element of size 10 × 10 in about 13 ms. In database 3, DBCA with a rectangular
structuring element of size 11 × 11 produces the four main clusters together with some noise;
an area constraint could be used to discard the noise clusters after clustering, or a
morphological erosion could be applied to remove the noise from the database before clustering.
The execution time for database 3 is around 9 ms without noise removal. In terms of accuracy,
DBCA can recognize all clusters effectively if an appropriate size of structuring element
(domain knowledge) is selected.
Figure 9 The clustering results of the synthetic databases.
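The area constraint mentioned for database 3 could look like the following Python sketch. The threshold `min_area` and the function name `filter_small_clusters` are hypothetical; the paper does not state which threshold, if any, was used.

```python
from collections import Counter
from typing import List, Tuple

Run = Tuple[int, int, int]  # (xs, xe, y), with xe inclusive


def filter_small_clusters(runs: List[Run], labels: List[int],
                          min_area: int) -> List[Tuple[Run, int]]:
    """Keep only the runs whose cluster covers at least `min_area` pixels."""
    area: Counter = Counter()
    for (xs, xe, _), lab in zip(runs, labels):
        area[lab] += xe - xs + 1          # a run contributes its length
    keep = {lab for lab, pixels in area.items() if pixels >= min_area}
    return [(run, lab) for run, lab in zip(runs, labels) if lab in keep]


clustered_runs = [(0, 9, 0), (0, 9, 1), (20, 20, 5)]
cluster_ids = [0, 0, 1]
kept = filter_small_clusters(clustered_runs, cluster_ids, min_area=5)
```

Because the area of a cluster is the sum of its run lengths, this filter works directly on the RLE representation and does not need the pixel image.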
In the performance evaluation of DBCA, different sizes of structuring element are tested on the
same database to examine the relationship between the size of the structuring element and the
execution time. Different sizes of structuring element detect different clusters in the original
databases, and would be expected to affect the execution time. However, the results show that the
size of the structuring element has only a small influence on the execution time when DBCA is
applied to these databases. The performance of DBCA on the three databases is shown in Fig. 10.
Figure 10 Performance evaluation of DBCA with different sizes of structuring element.
Finally, DBCA is applied to cluster the defects of AR glass with different sizes of structuring
element, as shown in Fig. 11. These clustering results show that the input parameter, the
structuring element, plays a significant role in DBCA because it decides the final clusters of
the image.
Figure 11 (a) Defects of AR glass after image pre-processing, and the results of applying DBCA to
these defects with a rectangular structuring element of size (b) 10 × 10, (c) 15 × 15 and (d)
20 × 20, respectively.
4. CONCLUSION
A dilation-based clustering algorithm for AR glass defect inspection is developed with the
concept of dilation-based connectivity. It consists of five procedures: (1) run-length encoding,
(2) RLE-based morphological operation, (3) RLE-based component detection algorithm, (4)
relationship construction, and (5) re-labeling connection. The limitation of this algorithm is
that the data must lie in a 2D discrete space, as digital images do. Based on the analytical and
experimental investigations presented under this limitation, the following conclusions may be
drawn: (1) The clustering capability of DBCA is verified; it is an accurate clustering algorithm.
(2) DBCA was applied to AR glass defect inspection using the five procedures described above.
(3) The results indicate that this algorithm can successfully overcome the effects of broken
defects in AR glass if an appropriate size of structuring element is selected. The performance
evaluation further shows that DBCA can be applied in real work; post-processing for defect
inspection, such as defect classification, might be an important direction for future study.
REFERENCES
[1] M. Ester, H. P. Kriegel, J. Sander and X. Xu (1996), “A Density-Based Algorithm for Discovering
Clusters in Large Spatial Databases with Noise,” In Proc. 1996 Int. Conf. Knowledge Discovery and
Data Mining, pp. 226-231.
[2] P. Viswanath and R. Pinkesh (2006), “l-DBSCAN: A Fast Hybrid Density Based Clustering Method,”
In Proc. 18th International Conference on Pattern Recognition (ICPR 2006), Vol. 1, pp. 912-915.
[3] H. Luo, F. Kong, K. Zhang and L. He (2006), “A Clustering Algorithm Based on Mathematical
Morphology,” In Proceedings of the 6th World Congress on Intelligent Control and Automation, pp.
6064-6067.
[4] Q. Zhang (2008), “A Mathematics Morphology Based Algorithm of Obstacles Clustering,” In
Proceedings of 2008 International Conference on Computer Science and Software Engineering, pp.
670-673.
[5] U. Braga-Neto and J. Goutsias (2005), “Object-Based Image Analysis Using Multiscale
Connectivity,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 6, pp.
892-906.
[6] M. Wang, Y. Leung, C. Zhou, T. Pei and J. Luo (2006), “A Mathematical Morphology Based Scale
Space Method for the Mining of Linear Features in Geographic Data,” Data Mining and Knowledge
Discovery, Vol. 12, pp. 97-118.
[7] W. J. Kim, S. D. Kim and K. Kim (2005), “Fast Algorithm for Binary Dilation and Erosion Using
Run-Length Encoding,” Electronic and Telecommunications Research Institute (ETRI) Journal, Vol.
29, No. 6, pp. 814-817.
[8] R. C. Gonzalez and R. E. Woods (2008), Digital Image Processing, Prentice-Hall, 3rd Edition, pp.
628-637.
[9] D. S. Bloomberg (2002), “Implementation Efficiency of Binary Morphology,” International
Symposium on Mathematical Morphology VI.
[10] T. M. Breuel (2008), “Binary Morphology and Related Operations on Run-length Representation,”
In Proceedings of the 3rd International Conference on Computer Vision Theory and Applications, pp.
159-166.
[11] Y. Yang and D. Zhang (2003), “A Novel Line Scan Clustering Algorithm for Identifying Connected
Components in Digital Images,” Image and Vision Computing, Vol. 21, pp. 459-472.
[12] F. Chang, C. J. Chen, and C. J. Lu (2004), “A Linear-time Connected-Component Labeling Using
Contour Tracing Technique,” Computer Vision and Image Understanding, Vol. 93, pp. 206-220.
[13] L. He, Y. Chao, and K. Suzuki (2008), “A Run-based Two-scan Labeling Algorithm,” IEEE
Transactions on Image Processing, Vol. 17, No. 5, pp. 749-756.
[14] C. S. Chen, C. W. Yeh and P. Y. Yin (2009), “A Novel Fourier Descriptor Based Image Alignment
Algorithm for Automatic Optical Inspection,” Journal of Visual Communication and Image
Representation, Vol. 20, pp. 178-189.