International Journal of Applied Engineering Research, ISSN 0973-4562, Volume 11, Number 9 (2016), pp. 6591-6597
© Research India Publications. http://www.ripublication.com

Nutrition Facts Label Processing using Image Segmentation and Token Matching based on OCR

Apoorva P, Lecturer, Department of Computer Science, Amrita Vishwa Vidyapeetham, Mysuru Campus, Amrita University, India.
Bindushree M, PG Scholar, Department of Computer Science, Amrita Vishwa Vidyapeetham, Mysuru Campus, Amrita University, India.
Bhamati H A, PG Scholar, Department of Computer Science, Amrita Vishwa Vidyapeetham, Mysuru Campus, Amrita University, India.

Abstract
The system focuses on recognizing nutrition facts from the facts label. A combination of Otsu's method for segmentation and Tesseract for optical character recognition (OCR) is introduced in this work. It is a combined approach to identify and tabulate line items on nutrition facts labels from mobile images. First, image processing techniques including adaptive thresholding, small-region removal, eccentricity filtering, orientation detection, and rotation are applied to locate the nutrition facts label in a given image. Region segmentation is then used to isolate the individual lines of the facts table. Optical character recognition is applied to the segmented line images, and token matching is used to correct the output against a dictionary of expected words and values. This combination of segmentation and OCR gives better results than OCR on unprocessed images, which matters because mobile images are often of poor resolution. The proposed method achieves a better success rate in identifying each label line item.

Keywords: Image segmentation, Hough transform, Optical character recognition, Tesseract, Token matching.

Introduction
Optical character recognition, often abbreviated as OCR, refers to the recognition of printed or written text characters in scanned images by a computer. The main functions of OCR are to acquire the scanned image, preprocess it, extract features, and generate patterns. In the image acquisition stage, text images are obtained from a scanner or a pre-stored image file. If the image is noisy or was captured by a low-resolution device, a preprocessor smooths the characters and handles touching characters, proportional spacing, and variable line spacing. Feature extraction is the process of obtaining information about an object or a group of objects in order to facilitate classification.

The input document may contain several lines of text that need to be split into single characters for recognition. Each row of characters in the image is detected first. Each character within a row is then isolated by scanning the row column by column, each column from top to bottom: the first column containing a dark pixel marks the left boundary of a character, and the next fully blank column marks its right boundary. The character cropped from the scanned image (using its top, left, right, and bottom boundaries) is normalized from its original size to 15 x 15 pixels. The cropped 15 x 15 image is then binarized into a 15 x 15 array, with black represented as 1 and white as 0. Once this pattern is generated, pattern-based recognition, or token matching, is performed by comparing the binary pattern against existing templates.
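The normalization and template comparison described above can be sketched in a few lines; this is a minimal illustration, assuming a grayscale character crop and a hypothetical template dictionary (the paper itself does not provide an implementation):

```python
import cv2
import numpy as np

def character_pattern(char_crop):
    """Normalize a cropped grayscale character image to a 15 x 15 binary
    array (1 = black ink, 0 = white background), as described above."""
    resized = cv2.resize(char_crop, (15, 15), interpolation=cv2.INTER_AREA)
    # Dark pixels become 1, light pixels become 0.
    _, binary = cv2.threshold(resized, 128, 1, cv2.THRESH_BINARY_INV)
    return binary.astype(np.uint8)

def match_template(pattern, templates):
    """Return the label of the template whose 15 x 15 pattern agrees with
    the input on the most pixels. 'templates' is a hypothetical dict
    mapping character labels to 15 x 15 binary arrays."""
    scores = {label: int(np.sum(pattern == tmpl))
              for label, tmpl in templates.items()}
    return max(scores, key=scores.get)
```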
The usage of mobile phones has become simple and is gaining a lot of popularity, and phones with high-resolution cameras make everyday tasks easier. In this work, a method to identify and list the nutritional facts printed on the labels of food packets is developed. It is important for consumers today to know about their daily diet: most of us refer to nutrition facts labels, and there is a need to maintain a log of these facts about the food we consume. The labels are captured using mobile phone cameras, and the text on the labels is extracted with the help of OCR. A method for preprocessing is also discussed: an image, once taken, is preprocessed and segmented before being sent to the OCR engine. To improve the OCR output, token matching and value rectification methods are used.

The paper is organized as follows. A brief literature survey is presented in section 2. The working principles of the proposed method are presented in section 3. Details of the dataset used, the experimental settings, and the obtained results are presented in section 4. The paper is concluded in section 5.

Related Work
Some of the important papers related to nutrition label information extraction are described in this section. OCR is the mechanical or electronic conversion of images of typed, handwritten, or printed text into machine-encoded text. For better recognition it is important to provide the image in a suitable form [2]: image preprocessing is an important step, carried out to increase the recognition accuracy of the software, and it involves steps such as filtering, grayscale conversion, and binarization, after which the final enhanced image is obtained. Tesseract is an open-source engine for optical character recognition and one of the most accurate OCR engines available; it was developed at Google Inc. and is licensed under Apache 2.0 [3]. Character recognition is performed on the enhanced image using a two-pass approach: the second pass, known as "adaptive recognition," uses the letter shapes recognized with high confidence on the first pass to better recognize the remaining letters on the second pass. This is advantageous for unusual fonts or low-quality scans where the font is distorted.

Once the words on the label are recognized, it is important to understand the meaning of the terms used on the nutrition label. The US Food and Drug Administration has published detailed instructions on how to read the nutrition contents on the label [14], and a journal published by MyFitnessPal [15] gives an idea of how to combine the calorie value of the food a person eats with the daily exercise that person does, in order to give information on a proper diet for that person.

Several methods have been introduced in the past for the detection and localization of text in images. These approaches consider different properties of text in an image, such as color, intensity, connected components, and edges, and use these properties to distinguish text regions from the background and other regions of the image. The algorithm proposed by Wang and Kangas in [16] is based on color clustering. The input image is first preprocessed to remove any noise present; then the image is grouped into different color layers and a gray component. This approach exploits the fact that the color data of text characters usually differs from the color data of the background. Potential text regions are localized from these layers using connected-component-based heuristics. An aligning and merging analysis (AMA) method is also used, in which each row and column value is analyzed [16]. The experiments conducted show that the algorithm is robust in locating mostly Chinese and English characters in images; some false alarms occurred due to uneven lighting or reflections in the test images. The text detection algorithm in [17] is also based on color continuity; in addition, it uses multi-resolution wavelet transforms and combines low-level as well as high-level image features for text region extraction.

The text finder algorithm proposed in [18] is based on the frequency, orientation, and spacing of text within an image. Texture-based segmentation is used to distinguish text from its background. Further, a bottom-up "chip generation" process is carried out, which uses the spatial cohesion property of text characters; the chips are collections of pixels in the image consisting of potential text strokes and edges. The results show that the algorithm is robust in most cases, except for very small text characters that are not properly detected; in the case of low contrast in the image, misclassifications also occur in the texture segmentation. A focus-of-attention-based system for text region localization has been proposed by Liu and Samarabandu in [19]. Intensity profiles and spatial variance are used to detect text regions in images. A Gaussian pyramid is created with the original image at different resolutions or scales; text regions are detected first in the highest-resolution image and then in each successively lower-resolution image in the pyramid. The approach used in [20] utilizes a support vector machine (SVM) classifier to segment text from non-text in an image or video frame. Text is initially detected in multi-scale images using edge-based techniques, morphological operations, and projection profiles of the image; these detected text regions are then verified using wavelet features and an SVM. The algorithm is robust with respect to variations in the color and size of the font, as well as the language.
Proposed System
The mobile phone images are processed in two phases. In the first phase, an image is processed to extract the individual lines of text within the nutrition facts label and to segment them; these segmented images are later used for word recognition. In the second phase, Tesseract is used for the recognition of words: the OCR engine extracts text and numbers from the images produced in the first phase.

Image Processing
The steps of the image processing phase are shown in Figure 1.

Figure 1: Identification of the nutrition facts label region.

Preprocessing and Binarization
The original image is converted to a binarized image, small-region removal is applied, and the result is passed through eccentricity filtering. The Hough transform is then applied to the resultant image, and further filtering based on k-means is performed together with a morphological closing operation to detect the lines. To remove noise before region filtering, the original color images are first preprocessed and binarized, and then small-region removal is applied.

Filtering is done using 3 x 3 neighborhood median filtering to smooth high-frequency noise. Adaptive histogram equalization is used to identify and improve shadowed regions; after that, bright areas caused by glare are corrected in the original image. The image is then converted to a binary image using local adaptive thresholding over 16 x 16 pixel tiles: the pixels within a tile are set to zero if the maximum intensity variation in that tile is below 0.3. Noisy and bright regions that are not part of any line in the image are filtered out during this adaptive binarization stage. To preserve the pixels corresponding to the label's separator lines, the image is filtered further: any region with eccentricity less than 0.98 is removed, so the smaller, rounder regions in the binarized image are discarded while the larger, elongated regions are retained. It is observed that the separator lines in a nutrition facts label are parallel; clusters of regions whose orientation deviates from that of the parallel lines are therefore deleted.
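A minimal sketch of this preprocessing chain is given below, using OpenCV in Python. The parameter values (median kernel, tile size, variation threshold, eccentricity cutoff) follow the description above; the per-tile mid-range threshold, the minimum-size guard, and the eccentricity computation via the component covariance are illustrative assumptions, since the paper does not specify its implementation:

```python
import cv2
import numpy as np

def binarize_label_image(bgr, tile=16, min_variation=0.3, min_ecc=0.98):
    """Median filtering, adaptive histogram equalization, tile-wise
    adaptive thresholding, and eccentricity-based region filtering."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 3)                       # 3x3 median filter
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    gray = clahe.apply(gray)                             # recover shadowed regions

    norm = gray.astype(np.float32) / 255.0
    binary = np.zeros_like(gray)
    h, w = gray.shape
    for y in range(0, h, tile):                          # 16x16 tile thresholding
        for x in range(0, w, tile):
            block = norm[y:y + tile, x:x + tile]
            if block.max() - block.min() < min_variation:
                continue                                 # flat tile stays zero
            t = (block.max() + block.min()) / 2.0        # local mid-range threshold
            binary[y:y + tile, x:x + tile] = (block < t) * 255

    # Small-region removal plus keeping only elongated (line-like) components.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, 8)
    out = np.zeros_like(binary)
    for i in range(1, n):
        ys, xs = np.nonzero(labels == i)
        if xs.size < 20:                                 # drop tiny noise regions
            continue
        cov = np.cov(np.vstack([xs, ys]).astype(np.float64))
        evals = np.sort(np.linalg.eigvalsh(cov))[::-1]
        if evals[0] <= 0:
            continue
        ecc = np.sqrt(max(0.0, 1.0 - evals[1] / evals[0]))  # ellipse eccentricity
        if ecc >= min_ecc:
            out[labels == i] = 255
    return out
```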
To rotate the filtered image to its proper orientation, the Hough transform (HT) is applied to it. The following steps are involved in detecting straight lines in an image. First, edge detection is performed, e.g. using the Canny edge detector [8]. Then the edge points are mapped to the Hough space and stored in an accumulator. The accumulator is interpreted to yield lines of infinite length; this interpretation is done by thresholding, possibly with other constraints. Finally, the infinite lines are converted to finite lines, which can then be superimposed on the original image. The Hough transform itself is performed in the second step; all the other steps except edge detection are covered in this section.

The Hough transform takes a binary edge map as input and attempts to locate edges placed as straight lines. The idea of the Hough transform is that every edge point in the edge map is transformed to all possible lines that could pass through that point. Figure 3 illustrates this for a single point, and Figure 4 illustrates it for two points. A typical edge map contains many points, but the principle of line detection is the same as illustrated in Figure 4 for two points: each edge point is transformed to a curve in the Hough space, and the areas where most Hough-space curves intersect are interpreted as true lines in the edge map.

Representation of Lines in the Hough Space
Lines can be represented uniquely by two parameters. Often the form in Equation (1) is used, with parameters a and b:

y = a·x + b    (1)

This form, however, cannot represent vertical lines. Therefore, the Hough transform uses the form in Equation (2), which can be rewritten as Equation (3) to resemble Equation (1). The parameters θ and r are the angle of the line and its distance from the origin, respectively:

r = x·cos θ + y·sin θ    (2)

y = -(cos θ / sin θ)·x + r / sin θ    (3)

All lines can be represented in this form when θ ∈ [0, 180] and r ∈ R (or θ ∈ [0, 360] and r ≥ 0). The Hough space for lines therefore has these two dimensions, θ and r, and a line is represented by a single point corresponding to a unique pair of parameters (θ0, r0). The line-to-point mapping is illustrated in Figure 2.

Figure 2: Line-to-point mapping.

An important concept for the Hough transform is the mapping of single points. The idea is that a point is mapped to all lines that can pass through it, which yields a sinusoid-like curve in the Hough space. The principle is illustrated for a point p0 = (40, 30) in Figure 3.

Figure 3: (a) Point p0. (b) All possible lines through p0 represented in the Hough space.

Figure 4: (a) Points p0 and p1. (b) All possible lines through p0 and/or p1 represented in the Hough space.

Infinite lines are detected by interpreting the accumulator once all edge points have been transformed; an example of the entire line detection process is shown in Figure 5. The most basic way to detect lines is to set a threshold on the accumulator and interpret every value above the threshold as a line; the threshold could, for instance, be 50% of the largest value in the accumulator. This approach may occasionally suffice, but in many cases additional constraints must be applied. As is evident from Figure 4, several entries in the accumulator around one true line in the edge map will have large values, so a simple threshold tends to detect several almost identical lines for each true line. To avoid this, a suppression neighborhood can be defined, so that two lines must be significantly different before both are detected.
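The voting and peak-interpretation procedure just described can be written directly in numpy; this is an illustrative implementation of the classical accumulator, not the authors' code:

```python
import numpy as np

def hough_lines(edges, n_theta=180, threshold_ratio=0.5):
    """Classical Hough transform on a binary edge map. Each edge point
    votes for all (theta, r) pairs of lines through it (Equation 2);
    accumulator cells above a fraction of the peak are reported as lines."""
    h, w = edges.shape
    thetas = np.deg2rad(np.arange(n_theta))            # theta in [0, 180)
    r_max = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * r_max, n_theta), dtype=np.int32)

    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # r = x cos(theta) + y sin(theta), one vote per theta
        rs = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rs + r_max, np.arange(n_theta)] += 1

    thresh = threshold_ratio * acc.max()               # e.g. 50% of the peak
    r_idx, t_idx = np.nonzero(acc > thresh)
    return [(r - r_max, np.rad2deg(thetas[t])) for r, t in zip(r_idx, t_idx)]
```

A suppression neighborhood, as mentioned above, would replace the simple threshold in the last step with non-maximum suppression over the accumulator.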
The classical Hough transform detects lines given only by the parameters r and θ, with no information regarding length; all detected lines are therefore infinite. If finite lines are desired, some additional analysis must be performed to determine which areas of the image contribute to each line, and several algorithms exist for this. One way is to store coordinate information for all points in the accumulator and use it to limit the lines, but this causes the accumulator to use much more memory. Another way is to search along the infinite lines in the edge image for finite segments. An advantage of the Hough transform is that the pixels lying on one line need not be contiguous; this is very useful when detecting lines with short breaks due to noise, or when objects are partially occluded. As for its disadvantages, it can give misleading results when objects happen to be aligned by chance, and the detected lines are infinite lines described by their parameter values rather than finite segments with defined end points.

Figure 5: Hough transform applied to the filtered label image.

At this point, two challenges sometimes persisted: small letters clinging to the separator lines due to blur in the original image, and additional noise to the left and right of the detected lines. To address these issues, additional filtering was performed based on row and column intensities: any row containing too few white pixels in total was set to zeros, and any column passing through too few distinct white regions was set to zeros. In both cases, the threshold was selected by k-means clustering. Finally, we performed morphological closing with a long, thin, horizontal structuring element to close disjointed lines in the binary mask. Based on the lines in the mask, we identified the bounding boxes of the text regions of the nutrition facts label, and the corresponding regions of the original grayscale image were segmented and binarized with Otsu's method.
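The row/column filtering and closing step can be sketched as follows. The use of 2-means clustering to pick the cutoffs and the long horizontal closing element are stated in the text; the exact feature definitions, cluster-midpoint thresholds, and kernel size are our assumptions:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def split_threshold(values):
    """Pick a cutoff between the two clusters found by 2-means."""
    km = KMeans(n_clusters=2, n_init=10).fit(values.reshape(-1, 1))
    c = np.sort(km.cluster_centers_.ravel())
    return (c[0] + c[1]) / 2.0

def clean_line_mask(mask):
    """Zero out weak rows/columns, then close gaps in the line mask
    with a long, thin, horizontal structuring element."""
    white = mask > 0
    row_counts = white.sum(axis=1).astype(np.float64)   # white pixels per row
    # number of distinct white runs each column passes through
    col_regions = np.count_nonzero(
        np.diff(white.astype(np.int8), axis=0) == 1, axis=0).astype(np.float64)

    mask = mask.copy()
    mask[row_counts < split_threshold(row_counts), :] = 0
    mask[:, col_regions < split_threshold(col_regions)] = 0

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (51, 3))
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
```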
Further noise removal was performed using orientation filtering, since binarization alone often resulted in a significant number of small noise regions. Finally, each segmented region of text was cropped (by removing all dark and all white padding on the image borders) and further divided by identifying all-white horizontal lines running through the center of the region. In most cases, the segmentation of each nutrition facts line was performed cleanly.

Figure 6: After segmentation. At the top, a grayscale image is shown with its binarization; after cropping and orientation-based noise filtering, the text quality is significantly enhanced.

Segmentation
There are different types of region-based methods, such as thresholding, region growing, and region splitting and merging [8]. Thresholding is an important technique in image segmentation applications. The basic idea of thresholding is to select an optimal gray-level threshold value for separating the objects of interest in an image from the background, based on their gray-level distribution. While humans can easily distinguish an object from a complex background, automatically thresholding an image to separate them is a difficult task. The gray-level histogram of an image is usually considered an efficient tool for developing image thresholding algorithms. Thresholding creates a binary image from a gray-level one by turning all pixels below some threshold to zero and all pixels above that threshold to one. If g(x, y) is a thresholded version of f(x, y) at some global threshold T, it can be defined as [4]:

g(x, y) = 1 if f(x, y) ≥ T, and 0 otherwise.

The thresholding operation is defined as:

T = M[x, y, p(x, y), f(x, y)]

In this equation, T stands for the threshold, f(x, y) is the gray value at point (x, y), and p(x, y) denotes some local property of the point, such as the average gray value of the neighborhood centered on (x, y). Based on this, there are two types of thresholding methods. Global thresholding: when T depends only on f(x, y) (in other words, only on gray-level values) and the value of T relates solely to the character of the pixels, the technique is called global thresholding. Local thresholding: if the threshold T depends on both f(x, y) and p(x, y), the technique is called local thresholding. This method divides an original image into several sub-regions and chooses a suitable threshold T for each sub-region [6].
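The global thresholding rule above is a one-liner in array form; the following minimal numpy sketch simply restates the definition of g(x, y):

```python
import numpy as np

def global_threshold(f, T):
    """g(x, y) = 1 if f(x, y) >= T else 0, applied to the whole image."""
    return (f >= T).astype(np.uint8)

# Example: threshold a synthetic gradient image at T = 128;
# the left half becomes 0 and the right half becomes 1.
f = np.tile(np.arange(256, dtype=np.uint8), (4, 1))
g = global_threshold(f, 128)
```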
The Otsu method is a type of global thresholding that depends only on the gray values of the image. It was proposed by Otsu in 1979 and is widely used because it is simple and effective [7]. The method requires computing a gray-level histogram before running. However, because the one-dimensional method considers only gray-level information, it does not always give good segmentation results; for that reason, a two-dimensional Otsu algorithm was proposed, which works on both the gray-level threshold of each pixel and its spatial correlation information within the neighborhood. The two-dimensional Otsu algorithm can thus obtain satisfactory segmentation results when applied to noisy images [11]. Many techniques have been proposed to reduce the computation time while maintaining reasonable thresholding results; [12] proposed a fast recursive technique that efficiently reduces computational time. Otsu's method is one of the better threshold selection methods for general real-world images with regard to uniformity and shape measures. However, it uses an exhaustive search to evaluate the criterion for maximizing the between-class variance, and as the number of classes in an image increases, it takes too much time to be practical for multilevel threshold selection [13].

Otsu's method aims to find the optimal value for the global threshold and is based on maximizing the between-class variance; well-thresholded classes have well-discriminated intensity values. The weighted within-class variance is given in Equation (4):

σ²_w(t) = q1(t)·σ1²(t) + q2(t)·σ2²(t)    (4)

where the class probabilities are estimated as:

q1(t) = Σ_{i=1}^{t} P(i),    q2(t) = Σ_{i=t+1}^{I} P(i)    (5)

and the class means are given by:

μ1(t) = Σ_{i=1}^{t} i·P(i) / q1(t),    μ2(t) = Σ_{i=t+1}^{I} i·P(i) / q2(t)    (6)

Finally, the individual class variances are:

σ1²(t) = Σ_{i=1}^{t} [i − μ1(t)]²·P(i) / q1(t),    σ2²(t) = Σ_{i=t+1}^{I} [i − μ2(t)]²·P(i) / q2(t)    (7)

We could actually stop here: all we need to do is run through the full range of t values [1, 256] and pick the value that minimizes σ²_w(t). But the relationship between the within-class and between-class variances can be exploited to generate a recursion relation that permits a much faster calculation. The basic idea is that the total variance does not depend on the threshold: for any given threshold, the total variance is the sum of the weighted within-class variances and the between-class variance, which is the sum of the weighted squared distances between the class means and the grand mean μ. After some algebra, the total variance can be expressed as Equation (8):

σ² = σ²_w(t) + q1(t)·[1 − q1(t)]·[μ1(t) − μ2(t)]²    (8)

Since the total variance is constant and independent of t, the effect of changing the threshold is merely to move the contributions back and forth between the two terms. Minimizing the within-class variance is therefore the same as maximizing the between-class variance. The advantage is that the required quantities can be computed recursively as we run through the range of t values:

q1(t + 1) = q1(t) + P(t + 1),  with q1(1) = P(1)    (9)

μ1(t + 1) = [q1(t)·μ1(t) + (t + 1)·P(t + 1)] / q1(t + 1)    (10)

μ2(t + 1) = [μ − q1(t + 1)·μ1(t + 1)] / [1 − q1(t + 1)]    (11)
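A compact sketch of Otsu's threshold selection by between-class variance maximization (Equations 4-8) follows. It restates the textbook algorithm using cumulative sums rather than the explicit recursion, and is not the authors' MATLAB code:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t maximizing the between-class variance
    q1(t)[1 - q1(t)][mu1(t) - mu2(t)]^2 (cf. Equation 8)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    P = hist / hist.sum()                      # gray-level probabilities P(i)
    i = np.arange(256)

    q1 = np.cumsum(P)                          # q1(t), Equation (5)
    m = np.cumsum(i * P)                       # running first moment
    mu = m[-1]                                 # grand mean
    with np.errstate(divide="ignore", invalid="ignore"):
        mu1 = m / q1                           # mu1(t), Equation (6)
        mu2 = (mu - m) / (1.0 - q1)            # mu2(t), Equation (6)
        sigma_b = q1 * (1.0 - q1) * (mu1 - mu2) ** 2   # between-class variance
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# Binarize a grayscale image (values 0..255) with the selected threshold:
# binary = (gray >= otsu_threshold(gray)).astype(np.uint8)
```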
Word Recognition
Optical character recognition (OCR) is used to extract text and numeric values from the preprocessed images. Tesseract was used as the OCR engine because it is one of the most widely used open-source engines. However, Tesseract performs character-based rather than word-based recognition, so its output was refined with word-based token matching to improve accuracy. Tesseract was restricted to English letters, digits, and the special characters %, ?, and period; this improved the ability to rectify misread characters in the OCR output. Tesseract proved very sensitive to image skew, clutter, and noise, especially on unsegmented block images. Figure 7 shows a comparison of Tesseract's performance on block versus segmented inputs; the improvement with segmented images is evident.

Figure 7: (a) Raw OCR output from unsegmented (top) and segmented images. (b) The segmented images led to cleaner text outputs.

Token Matching
Since the words used on nutrition facts labels are limited (e.g., calories, cholesterol), the output from Tesseract was matched against a predefined dictionary to improve the recognition results. Two token matching methods were implemented. Naive string matching looks for exact appearances of tokens in each line. Scored matching is based on the Levenshtein distance between each input line and each token in the dictionary, normalized by the line length; since a lower score indicates a higher confidence, this normalization ensures that shorter tokens are not disproportionately favored. Output tokens and related numeric values with a confidence score above a given threshold (60%) were accepted. In the naive string matching phase, label quantities that contain numbers (e.g., '10g') are considered first, and the numbers that directly follow the matched tokens are taken without any rectification. In the second phase, the scored matching algorithm searches for tokens such as 'mg', 'g', or '%' that mark the values present in a line. If both gram and percent values are given, the recommended daily values of each item are used to reconcile any conflicts between the values. Figure 8 shows the token matching and value rectification results, which show further improvement.

Figure 8: (a) Token matching and numeric rectification. (b) Levenshtein distance matching resulted in improved token recognition.
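A sketch of this recognition-plus-matching stage is shown below, using pytesseract for the OCR call and a standard dynamic-programming Levenshtein distance. The dictionary shown is a small illustrative subset, and the whitelist string and exact scoring are hedged approximations, since the authors' configuration is not published:

```python
import numpy as np
import pytesseract

# Restrict Tesseract to letters, digits, and the few special characters
# mentioned above (approximation of the authors' configuration).
TESS_CONFIG = ('--psm 7 -c tessedit_char_whitelist='
               'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
               '0123456789%?.')

DICTIONARY = ["calories", "total fat", "cholesterol", "sodium",
              "total carbohydrate", "dietary fiber", "sugars", "protein"]

def levenshtein(a, b):
    """Classic edit distance via single-row dynamic programming."""
    d = np.arange(len(b) + 1)
    for i, ca in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, cb in enumerate(b, 1):
            # deletion, insertion, substitution
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (ca != cb))
    return int(d[-1])

def best_token(line, min_confidence=0.6):
    """Score each dictionary token against an OCR'd line; the distance is
    normalized by the line length so shorter tokens are not favored."""
    line = line.lower().strip()
    scored = [(levenshtein(line, tok) / max(len(line), 1), tok)
              for tok in DICTIONARY]
    score, token = min(scored)
    return token if (1.0 - score) >= min_confidence else None

# line_img is one segmented line image (numpy array or PIL image):
# text = pytesseract.image_to_string(line_img, config=TESS_CONFIG)
# print(best_token(text))
```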
Experimental Results
The combined algorithm was tested on a dataset of 50 images of processed food items from a local supermarket. All the images were taken using an iPhone 5. For each image, the values for calories, fat, cholesterol, sodium, carbohydrates, dietary fiber, sugar, and protein were manually verified, and the performance of the algorithms was measured against these values. The algorithm resulted in 36% error on the full set of 50 images, and 34% error if we exclude the images whose segmentation failed due to confusion with barcode recognition.

Table 1: Error rates and average runtime across the four test scenarios.

                 Raw Image             Segmented Image
                 Naive    Token Match  Naive    Token Match
Error Rate       70%      68%          53%      36%
Avg Time         13 sec   33 sec       16 sec   38 sec

The error rate is calculated as the percentage of line items, across all 50 test images, that the algorithm either did not detect or detected with incorrect values. The results were also tested using another OCR engine (ABBYY FineReader 11), and the result remained the same (34% with the failed-segmentation images removed). Table 1 compares our processing techniques and demonstrates the effectiveness of segmentation by contrasting the unprocessed (raw) and segmented image results. The large difference in error rate between the naive and token matching methods on the segmented images indicates that strong word recognition built on high-quality OCR output is needed to obtain good results; the improved result also requires thorough image processing steps that segment the images for a high-quality recognizer.

Conclusion and Future Work
The experiments conducted show a decreased error rate of 36%, compared to the 68% error rate obtained when raw images were used with the same token matching algorithm. The image processing steps and the token matching method together give an improved result of 34% over raw images with the naive matching algorithm. When a different OCR engine (ABBYY FineReader 11) was used, the error rate remained the same. The results are based on a dataset of 50 images taken with an iPhone 5; the method proves efficient for mobile images of nutrition facts labels and should remain consistent on a larger dataset. The runtime was measured in MATLAB R2013a on a 2.4 GHz Intel dual-core processor: for the image segmentation step alone, the average processing time on this machine was 15-25 s per image, and an average runtime of 20 s per image can be estimated on improved hardware.

The results are mainly affected by the difficulty of interpreting bold and blurred text. Along with template matching, morphological operators, such as eroding the bold text, may improve the accuracy of the OCR. Recognizers other than token matching, such as template matching, bag of words, or multi-class support vector machines, could also be used. The main problem faced during image segmentation was images containing barcodes and other high-eccentricity clutter oriented parallel to the nutrition facts lines; the method discussed finds it difficult to differentiate the actual lines from such lines of no interest when they are parallel, and a filtering method to handle these difficulties can be considered. The proposed work does not support significant distortion or curvature in the image; however, it is robust to small distortions in adjacent regions.

References
[1] Hetal J. Vala and Astha Baxi, "A Review on Otsu Image Segmentation Algorithm," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), ISSN: 2278-1323, Vol. 2, Issue 2, February 2013.
[2] W. Bieniecki, S. Grabowski, and W. Rozenberg, "Image Preprocessing for Improving OCR Accuracy," in Proceedings of the International Conference on Perspective Technologies in MEMS Design, 2007, pp. 75-80.
[3] R. Smith, "An Overview of the Tesseract OCR Engine," in Proceedings of the Ninth International Conference on Document Analysis and Recognition, vol. 2, 2007, pp. 629-633.
[4] Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing," 2nd ed., Beijing: Publishing House of Electronics Industry, 2007.
[5] W. X. Kang, Q. Q. Yang, and R. R. Liang, "The Comparative Research on Image Segmentation Algorithms," IEEE Conference on ETCS, pp. 703-707, 2009.
[6] Nirpjeet Kaur and Rajpreet Kaur, "A Review on Various Methods of Image Thresholding," IJCSE, 2011.
[7] Zhong Qu and Li Hang, "Research on Image Segmentation Based on the Improved Otsu Algorithm," 2010.
[8] W. X. Kang, Q. Q. Yang, and R. R. Liang, "The Comparative Research on Image Segmentation Algorithms," IEEE Conference on ETCS, pp. 703-707, 2009.
[9] Z. Ningbo, W. Gang, Y. Gaobo, and D. Weiming, "A Fast 2D Otsu Thresholding Algorithm Based on Improved Histogram," in Pattern Recognition, 2009 (CCPR 2009), Chinese Conference on, 2009, pp. 1-5.
[10] L. Dongju and Y. Jian, "Otsu Method and K-means," in Hybrid Intelligent Systems, 2009 (HIS '09), Ninth International Conference on, vol. 1, 2009, pp. 344-349.
[11] Liu Jian-zhuang and Li Wen-qing, "The Automatic Thresholding of Gray-Level Pictures via Two-Dimensional Otsu Method," Acta Automatica Sinica, 1993.
[12] J. Gong, L. Li, and W. Chen, "Fast Recursive Algorithms for Two-Dimensional Thresholding," Pattern Recognition, vol. 31, no. 3, pp. 295-300, 1998.
[13] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. Chen, "A Survey of Thresholding Techniques," Computer Vision, Graphics, and Image Processing, vol. 41, 1988, pp. 233-260.
[14] H. Jiang, C. Han, and K. Fan, "A Fast Approach to Detect and Correct Skew Documents," in Proceedings of the 13th International Conference on Pattern Recognition, vol. 3, 1996, pp. 742-746.
[15] Shrinath Janvalkar, Paresh Manjrekar, and Sarvesh Pawar, "Text Recognition from an Image," International Journal of Engineering Research and Applications, ISSN: 2248-9622, Vol. 4, Issue 4 (Version 5), April 2014, pp. 149-151.
[16] Kongqiao Wang and Jari A. Kangas, "Character Location in Scene Images from Digital Camera," Pattern Recognition, March 2003.
[17] K. C. Kim, H. R. Byun, Y. J. Song, Y. W. Choi, S. Y. Chi, K. K. Kim, and Y. K. Chung, "Scene Text Extraction in Natural Scene Images Using Hierarchical Feature Combining and Verification," in Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), IEEE.
[18] Victor Wu, Raghavan Manmatha, and Edward M. Riseman, "TextFinder: An Automatic System to Detect and Recognize Text in Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 11, November 1999.
[19] Xiaoqing Liu and Jagath Samarabandu, "A Simple and Fast Text Localization Algorithm for Indoor Mobile Robot Navigation," Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 5672, 2005.
[20] Qixiang Ye, Qingming Huang, Wen Gao, and Debin Zhao, "Fast and Robust Text Detection in Images and Video Frames," Image and Vision Computing 23, 2005.