International Conference on challenges in IT, Engineering and Technology (ICCIET’2014) July 17-18, 2014 Phuket (Thailand) Dynamic Calculation and Characterization of Threshold Value in Optical Character Recognition Ayşe Eldem, and Fatih Başçiftçi decided how to binarize each region [3]. A learning framework for the optimization of the binarization methods was developed, which was designed to determine the optimal parameter values for a document image [4]. Text segmentation in generic displays was learnt the best binarization values for a commercial OCR system [5]. A local adaptive thresholding method based on a water flow model was used, in which an image surface was considered as a three-dimensional (3-D) terrain [6]. A new adaptive approach was presented for the binarization and enhancement of degraded documents. In the approach a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image while incorporating image up-sampling and finally a post-processing step in order to improve the quality of text regions and preserved stroke connectivity [2]. 40 selected thresholding methods from various categories were compared in the context of nondestructive testing applications as well as for document images and the comparison was based on the combined performance measures [7]. In this study, image was first transformed into gray and noises on the image were eliminated by median and mean filters. Subsequently, various thresholding algorithms were applied for transforming into black-white. Binary image representation is the preferred format for image analysis in most textual image techniques, and the performance of the subsequent steps in document analysis applications depends on the accuracy of the binarization process [8]. In this study, the best thresholding algorithm was determined among Otsu [9], Sauvola [10], Niblack [11] and White [12] algorithms. Consequently, characters separation process was applied on the resulting image. Abstract—Optical character recognition (OCR) can be used in some management mechanisms of state and business world to organize documents scanned or captured by camera. Therefore, OCR is one of the subjects rapidly evolving in the recent times. This study investigates the procedure steps until character recognition. That’s because, the more correctly preprocessings are applied; the better results can be obtained in character recognition. In this study, the algorithms providing the best results are determined by following the procedure stages of gray scale transformation, noise removal and image thresholding. Binarization plays an important role in character recognition. For this reason, algorithms dynamically determine thresholds are preferred for character recognition in this study. Accordingly, Otsu thresholding method was determined to give the best results in terms of both picture quality and speed. Therefore, this method was used for character recognition. Primarily, lines were determined for character separation and then letters were individually obtained. Keywords—Image Processing, OCR, Thresholding, Binarization, Pattern Recognition. I. INTRODUCTION O PTICAL Character Recognition (OCR) is the process of converting and digitizing documents written by a tool, machine or hand into computer language [1]. For this reason, optical character recognition is an important step for image processing. That’s because, it is necessary to distinguish characters from background and properly obtain the characters. Otherwise, characters cannot be defined correctly. The obtained image is first transformed into gray and the present noises are eliminated. Then, binarization is applied. Document image binarization (threshold selection) refers to the conversion of a gray-scale image into a binary image [2]. These are the basic steps of image processing. A novel binarization method was developed for document images produced by cameras. To resolve the problem, the proposed method divided an image into several regions and II. MATERIAL AND METHOD The obtained optical images were first transformed into gray. Then, noise reduction techniques were applied to eliminate degradations (e.g. noises) on images that could be caused by machines. Instead of thresholding by a certain gray level, image binarization was made by thresholding methods dynamically determining threshold values by the condition of gray level pixels, and thus, black and white version of the Ayşe Eldem is with the Department of Computer Technology and Computer Programming, Technical Science Vocational High School, Karamanoğlu Mehmetbey University, 70100, Karaman, TURKEY. Fatih Başçiftçi is with the Department of Computer Engineering, Faculty of Technology, Selcuk University, 42003 Konya, TURKEY. This work is supported by Selçuk and Karamanoğlu MehmetBey Universities Scientific Research Projects Coordinatorships, Konya, Karaman, Turkey. http://dx.doi.org/10.15242/IIE.E0714006 9 International Conference on challenges in IT, Engineering and Technology (ICCIET’2014) July 17-18, 2014 Phuket (Thailand) P(T) is the histogram of the image; mf(T) and mb(T) are the mean intensity for the whole image [17]. image was obtained. The steps of the process are shown in figure 1. Obtain the image Preprocessing Binarization process White: In this method, gray value of the pixel is compared to the mean value of its neighbors. If its value is lower than the average, that pixel indicates the object, and otherwise, it shows the background [12]. This thresholding method is formulized as follows (2). Division into characters Fig.1 Process steps A. Noise Reduction Methods Degradation (noises) occurs due to out of focus viewfinder during photo shooting, lens quality, very bright environment, lack of light or objective angle [13]. Mean and median filters are generally used to reduce this kind of noises. 1 B(i, j ) 0 Niblack: In this method, threshold value is determined with the following equation based on local mean and standard deviation for bxb window size (3). T (i, j) m(i, j) k.s(i, j) (3) m(i,j) and s(i,j) are mean and standard deviation, respectively. k is accepted as -0.2. [11][14]. Median: Median filter is another way to eliminate the noises in an image. Similarly, a window size like 3x3 is determined in the median filter and applied to its neighbors. Gray level values of pixels within the window size are taken and ordered among themselves, and then the median value is assigned as the new pixel value. Sauvola: It is an adaptive and threshold-based binarization method that can be seen as a generalization of Niblack’s method [4]. In this method, pixel density at (x,y) point is defined as the g(x,y) function getting values between 0-255. As can be seen in the equation below (4), if g(x,y) value is lower than threshold, it becomes 0, and otherwise, it gets 255 [10]. B. Thresholding Methods It is one of the most important processes in separating background and foreground in document image binarization. Because binarization plays a key role in document processing since its performance affects quite critically the degree of success in a subsequent character segmentation and recognition [2]. In this way, image is transformed into black and white grayscale and it becomes easier for processing. Binarization as separation of text from background regions has formed the foundation of many learning methods, which start with a rough estimation of the text and background regions, and then attempt to learn their behavior, in order to classify regions that are in the confusion interval [4]. Thresholding algorithms depend on a multitude of factors such as the gray level distribution of the document, local shading effects, the presence of denser, non-text components such as photographs, the quality of the paper etc. [14]. In this study, 4 thresholding methods were used. 0 o(x, y)= 255 if g(x, y) < t(x, y) otherwise (4) When threshold t(x,y) is determined, the mean and standard deviation (s(x,y)) are used. R is 128 and k is a parameter changing between [0.2, 0.5]. The equation is given below (5). T(i, j)= m(i, j)+[1+k.( s(i, j) - 1)] R (5) III. APPLICATION In this study, an image is obtained from outer environment and subject to some preprocessing. These preprocessing include transformation to grayscale and using mean or median filter. In pattern recognition systems, the aim of these preprocessing is to distinguish the pattern from undesired components in order to obtain similar key properties from similar patterns [18]. Following preprocessing, separation into letters is made to determine the best methods (White, Sauvola, Niblack and Otsu) for dynamically determining the threshold value. In the current study, after an image is taken, it is initially transformed into grayscale. The gray image is obtained with (R+G+B)/3 method as an image consists of RGB color space. The gray form of the image is shown in figure 2. Otsu: Otsu’s method is a powerful, parameterless method belonging to threshold-based method [15]. When the background and foreground intensities are well separated, Otsu’s method yields good binarized results [3]. In Otsu thresholding method, optimum threshold is determined through minimizing the changes for weighted total class of the object and background pixels [9][16]. Optimum threshold is measured with the equation below (1). http://dx.doi.org/10.15242/IIE.E0714006 (2) µw(x,y), is the mean gray value of the window being processed; bias is generally taken as 2; w is the window size (15 in general); and I(i,j) is the value of the current pixel [16]. Means: Mean filter is one of the methods used to reduce noises in images. It can be used with neighborhood types like 3x3 and 5x5. Each pixel value is replaced with the mean (average) value of its neighbor pixels. For instance, the new pixel value is assigned the mean value of 9 pixels in a 3x3 mask. Topt arg max[ P(T ).(1- P(T )).(m f (T ) - mb (T )) 2 ] if mw (i, j ) I (i, j )* bias otherwise (1) 10 International Conference on challenges in IT, Engineering and Technology (ICCIET’2014) July 17-18, 2014 Phuket (Thailand) (a) (c) (b) (d) Fig. 2 a) Color image b) Grayscale image Fig. 4 a) White Thresholding b) Sauvola Thresholding c) Niblack Thresholding d) Otsu Thresholding Subsequently, noise removal methods are applied. For this purpose, mean [3x3] and Median [3x3] filters were used on the figure 3, respectively. Of these filters, mean filter blurred the image even more. Therefore, the image subject to median filter was used in the following stages. In the comparison of durations of 4 thresholding methods, Otsu method was determined to give the best performance. Time schedule is given in Table I. TABLE I DURATIONS OF THRESHOLDING METHODS Method Name White Thresholding Sauvola Thresholding Niblack Thresholding Otsu Thresholding (a) Time (msecond) 89 ms 119 ms 384 ms 48 ms Initially, the rows are determined for character separation as shown in figure 5. In this process, the image is scanned from the beginning and the initial black point is accepted as the start of row and the last black point is determined as the end of row. (b) Figure 3. a) Mean [3x3] b) Median [3x3] The results of the methods used for binarization following the noise removal are given in Figure 4. Accordingly, Otsu method was determined to give the best results in binarization process. Therefore, Otsu thresholding was preferred character separation. Fig. 5 Start and end of row Subsequently, for determining the columns, the initial character is found by advancing from the first black point to right side until the first column of white points is reached, and upon finishing this letter, it is taken from the image and stored. As can be seen in Figure 6 below, initially the letter “i” and then “m”, “a” and “g” letters in the first row are distinguished with image scan. The same process is repeated for the other rows. Thus, character separation process is completed. (a) (b) (a) http://dx.doi.org/10.15242/IIE.E0714006 11 International Conference on challenges in IT, Engineering and Technology (ICCIET’2014) July 17-18, 2014 Phuket (Thailand) [6] [7] (b) [8] [9] [10] (c) [11] [12] (d) [13] Fig. 6 Letter separation process [14] IV. CONCLUSION Many applications have been developed in optical character determination. In the current study, the process steps until character separation are investigated. For this purpose, primarily the obtained image was transformed into grayscale and noise removal was applied. At this stage, median filter was determined to give better results in image quality, and therefore, median filter was preferred. Subsequently, 4 thresholding methods were applied, respectively, and Otsu thresholding method provided the best fit. Otsu thresholding method was found to be better in both image quality and performance. Using Otsu thresholding method, the start and end of rows were determined on black and white image and character separation process was achieved. [15] [16] [17] [18] REFERENCES [1] [2] [3] [4] [5] S. Ay, and A. Varol, “Karakter Tanıma İçin Düzenli Özellik Çıkarma İşleminin İncelenmesi ve Uygulanması”, Bilimde Modern Yöntemler Sempozyumu (BMYS 2008), Eskişehir Osmangazi Üniversitesi, Eskişehir, 2008, pp. 1063-1070. B. Gatos, I. Pratikakis, and S.J. Perantonis, “Adaptive degraded document image binarization”, Pattern Recognition, vol. 39, pp. 317 – 327, 2006. http://dx.doi.org/10.1016/j.patcog.2005.09.010 C. Chou, W.H. Lin, and F. Chang, “A binarization method with learning-built rules for document images produced by cameras”, Pattern Recognition, vol. 43, pp. 1518–1530, 2010. http://dx.doi.org/10.1016/j.patcog.2009.10.016 M. Cheriet, R.F. Moghaddam, and R. Hedjam, “A learning framework for the optimization and automation of document binarization methods”, Computer Vision and Image Understanding, vol. 117, pp. 269–280, 2013. http://dx.doi.org/10.1016/j.cviu.2012.11.003 A. Fernández-Caballero, M.T. López, and J.C. Castillo, “Display text segmentation after learning best-fitted OCR binarization parameters”, Expert Systems with Applications, vol. 39, pp. 4032–4043, 2012. http://dx.doi.org/10.1016/j.eswa.2011.09.162 http://dx.doi.org/10.15242/IIE.E0714006 12 I.K. Kim, D.W. Jung, and R.H. Park, “Document image binarization based on topographic analysis using a water flow model”, Pattern Recognition, vol. 35, pp. 265-277, 2002. http://dx.doi.org/10.1016/S0031-3203(01)00027-9 M. Sezgin, and B. Sankur, “Survey over image thresholding techniques and quantitative performance evaluation”, Journal of Electronic Imaging, vol. 13(1), pp. 146–165, 2004. http://dx.doi.org/10.1117/1.1631315 B. Bataineh, S.N.H. Sheikh Abdullah, and K. Omar, “An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows”, Pattern Recognition Letters, vol. 32, pp. 1805–1813, 2011. http://dx.doi.org/10.1016/j.patrec.2011.08.001 N. Otsu, “A threshold selection method from gray-level histograms”, IEEE Trans. Systems, Man, and Cybernetics, vol. 9(1), pp. 62–66, 1979. http://dx.doi.org/10.1109/TSMC.1979.4310076 J. Sauvola, and M. Pietikainen, “Adaptive document image binarization”, Pattern Recognition, vol. 33(2), pp. 225–236, 2000. http://dx.doi.org/10.1016/S0031-3203(99)00055-2 W. Niblack, An Introduction to Digital Image Processing, Englewood Cliffs, NJ: Prentice-Hall. 1986, pp. 115-116. J.M. White, and G.D. Rohrer, “Image Thresholding for Optical Character Recognition and Other Applications Requiring Character Image Extraction”, IBM Journal of Research and Development, Vol. 27(4), pp. 400-411, 1983. http://dx.doi.org/10.1147/rd.274.0400 Ö Taban, http://omertaban.com/tag/median-filtre/, Access Time: 01.10.2013. B. Sankur, M. Sezgin, Image Thresholding Techniques: A Survey Over Categories, http://algoval.essex.ac.uk/rep/thresholdingreview.doc, Access Time: 20.06.2013. R. F. Moghaddam, and M. Cheriet, “AdOtsu: An adaptive and parameterless generalization of Otsu’s method for document image binarization”, Pattern Recognition, vol. 45, pp. 2419–2431, 2012. http://dx.doi.org/10.1016/j.patcog.2011.12.013 M. Sezgin, “İmge Eşikleme Yöntemlerinin Başarım Değerlendirmesi ve Tahribatsız Muayenede Kullanımı”, PhD Thesis, 2002. P. Liao, T. Chen, and P. Chung, “A Fast Algorithm for Multilevel Thresholding”, Journal of Information Science and Engineering, vol. 17, pp. 713-727, 2001. A. Şengür and İ. Türkoğlu, “Değişmez Momentlerle Türkçe Karakter Tanıma”, Doğu Anadolu Bölgesi Araştırmaları, pp. 114-117, 2004.
© Copyright 2024 Paperzz