Slant Correction for Offline Handwritten Telugu Isolated Characters

International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 4 (2016) pp 2755-2760
© Research India Publications. http://www.ripublication.com
Slant Correction for Offline Handwritten Telugu Isolated Characters and
Cursive Words
Ch. N. Manisha*
Research Scholar, Department of Computer Science,
Krishna University, Machilipatnam, Andhra Pradesh, India.
Y.K. Sundara Krishna
Professor, Department of Computer Science, Krishna University,
Machilipatnam, Andhra Pradesh, India.
E. Sreenivasa Reddy
Professor, Department of Computer Science Engineering,
Acharya Nagarjuna University, Guntur, Andhra Pradesh, India.
Abstract
Slant correction is one of the preprocessing steps in
character recognition. The proposed method is simple and
efficient for correcting the slant of offline handwritten Telugu
isolated characters and cursive words. It is motivated by the
chain code of the word contour slant correction method, but it
does not identify the chain code of the word or character
contour. Instead, it uses the vertical starting and ending
co-ordinates to estimate the slant. It determines the slant
direction from the maximum vertical projection profile and
corrects the slant of offline handwritten Telugu isolated
characters and cursive words using shear transformation. The
experimental results compare the efficiency of the proposed
method with that of the existing method. The proposed method
reduced the processing time compared with the existing
method.
Keywords: Slant, Telugu, Characters, Offline, Handwritten,
Correction.
Introduction
Slant correction is a major step that influences segmentation;
it is a crucial pre-segmentation step. In [4], we identified
slant correction as one of the important levels of the
preprocessing stage in offline handwritten character
recognition. Slant correction uses the vertical co-ordinates of
the character for correcting the slant.
The proposed method deals with offline handwritten Telugu
isolated characters and cursive words. Whether Telugu words
are cursive or non-cursive depends on the writing style of the
writer. Some people write the characters cursively; segmenting
such words is the toughest task because cursive words contain
slant errors. Non-cursive Telugu words, in contrast, contain
non-uniform slant errors for each character.
In this paper, we correct non-uniformly slanted isolated
characters and cursive words. We propose a novel approach
for estimating and correcting the slant of offline handwritten
Telugu isolated characters and cursive words. The method
corrects this slant efficiently. Properly slant-corrected cursive
words are easier to segment, and properly slant-corrected
isolated characters increase the recognition accuracy.
Therefore, slant correction is required for Telugu isolated
characters and cursive words.
The remainder of the paper is organized as follows. Section 2
presents related work on slant correction in various
languages. Section 3 explains the preprocessing step. Section 4
presents the proposed slant estimation and correction method
for offline handwritten Telugu isolated characters and cursive
words. Section 5 presents the results in two subsections: the
first compares the results of the existing method and the
proposed method, and the second compares their processing
times. Section 6 concludes the study.
Related Work
Alceu de S. Britto Jr. et al. [1] implemented a modified KSC
method to correct the slant of numeral strings. This method
handles overlapping numerals. Taira et al. [21] corrected the
non-uniform slant of handwritten English words using a
dynamic-programming-based algorithm. Mohamed et al. [14]
analyzed the slant of online signatures with a
vector-rule-based approach.
Bušta et al. [3] proposed five skew estimators to correct text
skew in scene images: vertical dominant (VD), VD on the
convex hull, longest edge, thinnest profile and symmetric
glyph. Yamaguchi et al. [23] classified digits for telephone
numbers on digitized signboards and corrected the slant of
the numbers by circumscribing the digits with tilted
rectangles.
Kimura et al. [11, 12] corrected the slant of handwritten words
with the chain code of the word contour. Madhvanath et al.
[13] used chain codes to extract vertical line elements. They
calculated a slant angle from the start and end points of each
line, took the average of all vertical line slant angles as the
global slant angle, and corrected the slant by a shear
transformation with that angle.
Ali and Jumari [2] detected the slant angle using the
Wigner-Ville distribution (WVD). The authors computed the
WVD of all histograms and took the histogram with the
highest intensity as the non-slanted word. Sun and Si [20]
detected the slant of characters in a text line by a gradient
operation and corrected it by a shear operation; connected
components can be corrected by fitting parallelograms.
Ding et al. [8] proposed a simple, high-speed iterative method
using the eight-directional chain code to correct the slant of
English words. Ding et al. [7] used four-, eight-, twelve- and
sixteen-directional chain codes for correcting the slant of
words in the English language and showed that the
eight-directional chain code gave better results than the
others.
Ishaan et al. [9] reconstructed images by Zernike moments.
They observed that the Radon transform gives better results
than the Hough transform, and they corrected the slant of both
English and Hindi text. Kavallieratou et al. [10] developed a
slant removal algorithm for English and Greek text; they
estimated slants by vertical projection and WVDs.
Papandreou and Gatos [17, 18] detected the core region based
on the upper and lower baselines, used black run profiles to
remove small fragments, and corrected the slant. de Zeeuw
[5] identified slants using the vertical projection histogram:
vertical projection histograms are computed for angles
between -45° and 45°, and the angle whose histogram has the
highest peak is used for slant correction.
Proposed Method
In this section, we present the proposed method for slant
estimation and correction of offline handwritten Telugu
isolated characters and cursive words.
We propose a novel approach for estimating the slant of
offline handwritten Telugu isolated characters and cursive
words. The slant estimation method is motivated by the
chain code of the word contour slant correction method in
[12]. Figure 2 shows the chain code sequence, and Eq. (1)
gives the slant estimation angle of Kimura et al. [12]. The
proposed method does not use the chain code of the word or
character contour.
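For reference, the chain code estimator of Kimura et al. [12] is commonly stated as θ = arctan((n1 − n3) / (n1 + n2 + n3)), where n1, n2 and n3 count the contour chain code elements at 45°, 90° and 135°. The sketch below assumes that form of Eq. (1). The function name and the zero-count fallback are illustrative choices, not taken from the paper.

```python
import math

def kimura_slant_angle(n1, n2, n3):
    """Slant angle in radians from chain code counts (assumed form of Eq. (1)).

    n1, n2, n3: number of chain code elements at 45, 90 and 135 degrees.
    """
    total = n1 + n2 + n3
    if total == 0:
        # Assumption: treat an image with no such elements as unslanted.
        return 0.0
    return math.atan((n1 - n3) / total)

# Equal counts at 45 and 135 degrees cancel out, giving zero slant.
angle = kimura_slant_angle(10, 20, 10)
```

The proposed method avoids computing n1, n2 and n3 altogether, which is where its saving in processing time comes from.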
To extract the isolated characters and cursive words from the
document images, we used horizontal and vertical projection
profiles. Words made up of isolated characters contain
non-uniform slant errors; therefore, each isolated character is
corrected separately.
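The extraction step can be sketched as follows: rows or columns whose projection profile is zero separate one text line, word or character from the next. This is a minimal NumPy sketch, assuming a binary image in which ink pixels are 1 and background pixels are 0. The function names are hypothetical, and the paper does not state how gaps were thresholded.

```python
import numpy as np

def runs_of_ink(profile):
    """Return (start, end) pairs, end exclusive, where the profile is > 0."""
    ink = (np.asarray(profile) > 0).astype(np.int8)
    padded = np.concatenate(([0], ink, [0]))
    changes = np.flatnonzero(np.diff(padded))
    # Even positions mark where a run starts, odd positions where it ends.
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

def split_columns(binary):
    """Column ranges of connected ink, from the vertical projection profile."""
    return runs_of_ink(binary.sum(axis=0))

def split_rows(binary):
    """Row ranges of connected ink, from the horizontal projection profile."""
    return runs_of_ink(binary.sum(axis=1))

# Two separate blobs of ink on one text line.
page = np.zeros((4, 9), dtype=np.uint8)
page[1:3, 1:3] = 1
page[0:4, 5:8] = 1
column_ranges = split_columns(page)
```

Each returned column range can then be cropped and slant-corrected on its own, which is what allows a different angle per isolated character.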
Let ‘m’ and ‘n’ be the width and height of the character image
‘I’. Furthermore, ‘x’ and ‘y’ represent the vertical and
horizontal co-ordinates of the pixel, respectively.
We identified the starting and ending vertical co-ordinates of
the character in the image, denoted ‘x1’ and ‘xm’,
respectively (see Figure 3).
We simplified Eq. (1) to reduce the processing time and
increase the efficiency of the method: ‘n1’ and ‘n3’ are
replaced by ‘xm’ and ‘x1’, and ‘n2’ is replaced by Eq. (2).
For the slant estimation of offline handwritten Telugu
characters, ‘x1’ and ‘xm’ were used. We substituted ‘x1’ and
‘xm’ in Eq. (1) to obtain the slant estimation angle ‘θ’ for the
offline handwritten Telugu characters (see Eq. (3)).
Preprocessing
Before applying the proposed method, we preprocessed the
image with noise removal, binarization, and skeletonization.
Noise removal is a crucial step in preprocessing, because most
handwritten documents suffer from salt-and-pepper noise. The
removal of this noise is discussed by Nair et al. [15]. The
conversion of a grayscale image to a binary image is called
binarization. For binarization, we used Otsu's threshold
method, described by Otsu in [16]. Skeletonization is the
process of extracting the essential structure of characters.
Deng et al. [6] developed a fast parallel algorithm for
skeletonization. Figure 1 shows a Telugu character and a
cursive word.
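The binarization step can be illustrated with a small NumPy version of Otsu's method [16], which picks the gray level that maximizes the between-class variance. This sketch assumes an 8-bit grayscale image with dark ink on a light background; the function names are hypothetical, and noise removal and skeletonization are not shown.

```python
import numpy as np

def otsu_threshold(gray):
    """Gray level t that maximizes the between-class variance (classes <= t, > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    w0 = np.cumsum(hist)                  # pixel count at or below each level
    w1 = hist.sum() - w0                  # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    sum1 = sum0[-1] - sum0
    valid = (w0 > 0) & (w1 > 0)           # both classes must be non-empty
    between = np.zeros(256)
    mu0 = sum0[valid] / w0[valid]
    mu1 = sum1[valid] / w1[valid]
    between[valid] = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return int(np.argmax(between))

def binarize(gray):
    """Return 1 for ink (dark) pixels and 0 for background pixels."""
    return (gray <= otsu_threshold(gray)).astype(np.uint8)

# Dark 4 x 2 stroke on a light background.
gray = np.full((10, 10), 200, dtype=np.uint8)
gray[3:7, 4:6] = 50
ink = binarize(gray)
```

The resulting 0/1 image is the form that projection profiles and shear transformation operate on.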
It is necessary to identify the slant direction because it
decides the positive or negative sign of the slant angle.
Shear transformation is widely used for correcting the slant of
a character. It is also called transvection [19]. Shear
transformation is generally of two types: horizontal and
vertical. We used the vertical shear transformation, which
shifts each vertical co-ordinate in a particular direction.
Applying the shear transformation with ‘θ’ gives the new
pixels (x, y), as shown in Eq. (4) and Eq. (5).
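A shear of this kind can be sketched in NumPy as below. The sketch assumes a row/column image layout, keeps the bottom row fixed, and shifts each row sideways by its height above the bottom row times tan θ. This convention, the padding of the output and the function name are assumptions made for illustration and may differ from the paper's Eq. (4) and Eq. (5).

```python
import numpy as np

def shear_rows(binary, theta):
    """Shear a binary image; positive theta (radians) moves upper rows right."""
    rows, cols = binary.shape
    t = np.tan(theta)
    # Pad both sides so that no shifted row falls outside the output.
    max_shift = int(np.ceil(abs(t) * (rows - 1)))
    out = np.zeros((rows, cols + 2 * max_shift), dtype=binary.dtype)
    for r in range(rows):
        shift = int(np.rint((rows - 1 - r) * t))
        start = max_shift + shift
        out[r, start:start + cols] = binary[r]
    return out

# A stroke leaning to the right: bottom-left to top-right diagonal.
stroke = np.zeros((5, 5), dtype=np.uint8)
for r in range(5):
    stroke[r, 4 - r] = 1

# Shearing by -45 degrees stands the stroke upright.
upright = shear_rows(stroke, -np.pi / 4)
```

After the shear, all five ink pixels of the example fall in one column, which is the property the vertical projection profile later rewards.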
In our experiments, the angle ‘θ’ had to be adjusted before the
shear transformation to obtain better slant correction. Using
a trial-and-error approach, we set 0.01 as the value used to
adjust ‘θ’.
Figure 1: Before Slant Correction (a) Offline Handwritten
Telugu Character and (b) Offline Handwritten Telugu Cursive
Word
We estimated the slant angle ‘θ’ for the offline handwritten
Telugu characters. It is also necessary to identify the
direction of the slant, because the direction decides whether
the slant correction angle is positive or negative.
De Zeeuw [5] estimated the slant angle with the help of
vertical projection histograms. The author shear-transformed
the image at different angles from −45 to +45 and computed
the vertical projection histogram for each angle. The sheared
image whose vertical projection histogram had the highest
peak was taken as the slant-corrected word.
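The exhaustive search attributed to de Zeeuw [5] can be sketched as follows. Rather than building every sheared image, this version shears the column co-ordinates of the ink pixels and counts them per column. It assumes ink pixels are 1, a one-degree step, and the convention that a positive angle moves upper rows to the right. The function names are hypothetical and the code is not taken from [5].

```python
import numpy as np

def peak_after_shear(binary, theta_deg):
    """Highest vertical projection value after shearing by theta_deg degrees."""
    rows, cols = np.nonzero(binary)
    if rows.size == 0:
        return 0
    height = binary.shape[0]
    shift = (height - 1 - rows) * np.tan(np.deg2rad(theta_deg))
    new_cols = np.round(cols + shift).astype(int)
    new_cols -= new_cols.min()            # bincount needs non-negative indices
    return int(np.bincount(new_cols).max())

def sweep_slant(binary, angles=range(-45, 46)):
    """Try every angle and keep the first one with the highest peak."""
    best_angle, best_peak = 0, -1
    for angle in angles:
        peak = peak_after_shear(binary, angle)
        if peak > best_peak:
            best_angle, best_peak = angle, peak
    return best_angle, best_peak

# A 20-pixel stroke leaning right by 45 degrees.
stroke = np.zeros((20, 20), dtype=np.uint8)
for r in range(20):
    stroke[r, 19 - r] = 1
best = sweep_slant(stroke)
```

The sweep evaluates 91 candidate angles per image, which is the cost that a two-candidate check avoids.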
In this paper, we considered only two angles for verification.
To select the negative or positive ‘θ’, we applied −θ and +θ in
Eq. (4) and obtained two images, ‘I1’ and ‘I2’. After
calculating the vertical projection profiles of I1 and I2 (see
Eq. (6)), we determined the maximum vertical projection
profile value of each image (see Eq. (7)).
Figure 4: After Slant Correction (a) Offline Handwritten
Telugu Character and (b) Offline Handwritten Telugu
Cursive Word
Table 1: Result Analysis of proposed method for Figure 4(a)
and Figure 4(b).
Fig No.   Slant Angle Estimation   Processing Time (in Sec)
4(a)      57.13                    0.10
4(b)      -59.87                   0.05
Experimental Results
We collected offline handwritten Telugu cursive words
written by teachers and students, and characters from the
hpl-telugu-iso-char-offline dataset [22]. In this section, we
present a comparative analysis of the existing chain code of
the word contour slant correction method (CCWC) and the
proposed method (PM). In the first subsection, we compare
the results of the two methods. In the second subsection, we
compare their processing times.
Let L1 and L2 be the maximum vertical projection profile
values of I1 and I2, respectively. The image corresponding to
the higher of L1 and L2 is selected as the slant-corrected
image (see Eq. (8)).
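The selection rule can be sketched directly from this description: compute the vertical projection profile of each candidate, take its maximum, and keep the image with the larger value. The sketch assumes binary images with ink pixels equal to 1. The function names are hypothetical, and keeping I1 on a tie is an assumption, since the text does not say how ties are resolved.

```python
import numpy as np

def vertical_projection(binary):
    """Number of ink pixels in each column."""
    return binary.sum(axis=0)

def select_corrected(i1, i2):
    """Return the candidate with the higher maximum projection value, and that value."""
    l1 = int(vertical_projection(i1).max())
    l2 = int(vertical_projection(i2).max())
    # Assumption: on a tie, keep the first candidate.
    return (i1, l1) if l1 >= l2 else (i2, l2)

# Candidate 1 is an upright stroke; candidate 2 is still slanted.
i1 = np.zeros((5, 5), dtype=np.uint8)
i1[:, 2] = 1
i2 = np.eye(5, dtype=np.uint8)
chosen, peak = select_corrected(i1, i2)
```

An upright stroke stacks its pixels in few columns and so produces a tall peak, while a slanted stroke spreads them across many columns; this is why the higher maximum identifies the correctly signed angle.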
Figure 4 shows the offline handwritten Telugu character and
cursive word after slant correction. The processing time and
slant angle analysis are shown in Table 1.
Comparative Analysis of Testing Results
Figure 5 compares the results of the chain code of the word
contour slant correction method and the proposed method for
offline handwritten Telugu isolated characters.
We also analyzed five offline handwritten Telugu cursive
words. The results of this comparison are shown in Figure 6.
The comparative analysis of slant angle correction for offline
handwritten Telugu isolated characters and cursive words is
shown in Figure 7 and Figure 8, respectively.
Figure 2: Chain Code Sequence
Figure 3: Starting and Ending Vertical Co-ordinates
Identification (a) Offline Handwritten Telugu Character
(b) Offline Handwritten Telugu Cursive Word
Comparative Analysis of Processing Time
We used a 3.00 GHz processor and a 32-bit operating system
for this experiment. We compared the processing time of the
chain code of the word contour method and the proposed
method for offline handwritten Telugu isolated characters and
cursive words, as shown in Figure 9 and Figure 10,
respectively. The average processing times of the two methods
for offline handwritten Telugu isolated characters and cursive
words are compared in Figure 11.
Figure 5: Slant Correction Comparison of Offline
handwritten Telugu isolated characters (a) Before Slant
Correction (b) Slant corrected with Chain code of the Word
Contour Method (c) Slant corrected with Proposed Method
Figure 6: Slant Correction Comparison of Offline
handwritten Telugu cursive words (a) Before Slant Correction
(b) Slant corrected with Chain code Word Contour Method (c)
Slant corrected with Proposed Method
Figure 7: Comparative Analysis of Slant Angle Correction
for Isolated Characters. [Chart: samples 1 to 5; series CCWC
and PM; vertical axis from -100 to 100.]
Figure 8: Comparative Analysis of slant Angle Correction for
Cursive words [Chart: samples 1 to 5; series CCWC and PM;
vertical axis from -100 to 100.]
Figure 9: Comparative Analysis of Processing Time for
Offline Handwritten Telugu Isolated Characters [Chart:
vertical axis Processing Time, 0 to 0.16; samples 1 to 5;
series CCWC and PM.]
Figure 10: Comparative Analysis of Processing Time for
Offline Handwritten Telugu Cursive words [Chart: vertical
axis Processing Time, 0 to 0.09; samples 1 to 5; series CCWC
and PM.]
Figure 11: Avg. Processing Time Comparison of Chain Code
Word Contour and Proposed Method for Offline Handwritten
Telugu Isolated Characters and Offline Handwritten Telugu
Cursive words [Chart: categories Non-Cursive and Cursive;
series CCWC and PM; vertical axis from 0 to 0.14.]
Conclusion
In this study, we proposed a novel and simple method for
slant correction of offline handwritten Telugu isolated
characters and cursive words. The method was motivated by
the chain code of the word contour slant correction method in
[12]. However, the proposed method does not identify the
chain code of the character contour. It deals with
non-uniformly slanted offline handwritten Telugu isolated
characters and cursive words.
In the proposed method, the starting and ending vertical
co-ordinates of offline handwritten Telugu characters were
considered for slant angle estimation. Shear transformation
was used to transform the direction of the vertical
co-ordinates. To identify the slant direction, the estimated
angle was applied to the shear transformation with negative
and positive signs, and the image whose vertical projection
profile had the higher maximum was selected as the
slant-corrected image. We used handwritten Telugu characters
and cursive words from different people and from the dataset
[22] for testing, and the proposed method efficiently corrected
the slant of offline handwritten Telugu isolated characters and
cursive words.
References
[1] Alceu de S. Britto Jr, Robert Sabourin, Edouard
Lethelier, Flávio Bortolozzi and Ching Y. Suen.
(1999). “Slant normalization of handwritten numeral
strings”. [Online] Available:
www.etsmtl.ca/ETS/media/ImagesETS/Labo/LIVIA/Publications/1999/BrittoISKDM.pdf.
[2] Ali, M. A., and Jumari, K. B., “Base-Area Detection
and Slant Correction Techniques Applied for Arabic
Handwritten Characters Recognition Systems”, Int.
Conf. on Artificial Intelligence and Pattern
Recognition, Orlando, USA. 2009. pp. 133-138.
[3] Bušta, M., Drtina, T., Helekal, D., Neumann, L., and
Matas, J., “Efficient Character Skew Rectification in
Scene Text Images”, Computer Vision-ACCV 2014
Workshops, Springer Int. Publishing. 2014. pp. 134-146.
[4] Ch. N. Manisha, E. Sreenivasa Reddy and Y. K.
Sundara Krishna, “A Hierarchical Pre-processing
Model for Offline Handwritten Document Images”,
Int. Journal of Research Studies in Computer Science
and Engineering, Vol. 2, Issue. 3, pp. 41-45, 2015.
[5] de Zeeuw, F., (2006). “Slant Correction Using
Histograms”, Undergraduate Thesis. [Online].
Available:
http://www.ai.rug.nl/~axel/teaching/bachelorprojects/zeeuw_slantcorrection.pdf.
[6] Deng, W., Iyengar, S. S., and Brener, N. E., “A fast
parallel thinning algorithm for the binary image
skeletonization”, Int. Journal of High Performance
Computing Applications, Vol. 14, No. 1, pp. 65-81,
2000.
[7] Ding, Y., Kimura, F., Miyake, Y., and Shridhar, M.,
“Slant estimation for handwritten words by
directionally refined chain code”, Proc. of the 7th Int.
Workshop on Frontiers in Handwriting Recognition,
2000. pp. 53-61.
[8] Ding, Y., Ohyama, W., Kimura, F., and Shridhar, M.,
“Local slant estimation for handwritten English
words”, IWFHR, 2004, pp. 328-333.
[9] Ishaan Agrawal, Alok Vashishtha and Rajiv Kumar,
“Slant Angle Estimation in Handwritten
Documents”, Int. Journal of Computer Science and
Management Studies. Vol. 14, Issue 5, pp. 1-5, 2014.
[10] Kavallieratou, E., Fakotakis, N., and Kokkinakis, G.,
“A slant removal algorithm”, Pattern Recognition,
Vol. 33, pp. 1261-1262, 2000.
[11] Kimura, F., Shridhar, M., and Chen, Z.,
“Improvements of a Lexicon Directed Algorithm for
Recognition of Unconstrained Handwritten Words”,
Proc. of the 2nd ICDAR, 1993. pp. 18-22.
[12] Kimura, F., Tsuruoka, S., Miyake, Y., and Shridhar,
M., “A lexicon directed algorithm for recognition of
unconstrained handwritten words”, IEICE
Transactions on Information and Systems, Vol. E77-D,
No. 7, pp. 785-793, 1994.
[13] Madhvanath, S., Kim, G., and Govindaraju, V.,
“Chaincode contour processing for handwritten word
recognition”, IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 9, pp. 928-932, 1999.
[14] Mohamed, A., Yusof, R., Mutalib, S., and Rahman,
S. A., “Online slant signature algorithm analysis”,
WSEAS Transactions on Computers, Vol. 8, Issue 5,
pp. 864-873, 2009.
[15] Nair, M. S., Revathy, K., and Tatavarti, R.,
“Removal of salt-and pepper noise in images: a new
decision-based algorithm”, Proc. of the Int. Multi
Conf. of Engineers and Computer Scientists, March
2008. Vol. I.
[16] Otsu, N., “A Threshold Selection Method from
Gray-Level Histograms”, IEEE Transactions on Systems,
Man, and Cybernetics, Vol. 9, No. 1, pp. 62-66,
1979.
[17] Papandreou, A., and Gatos, B., “Word slant
estimation using non-horizontal character parts and
core-region information”, Document Analysis
Systems. 10th IAPR Int. Workshop on IEEE, 2012.
pp. 307-311.
[18] Papandreou, A., and Gatos, B., “Slant estimation and
core-region detection for handwritten Latin words”,
Pattern Recognition Letters, Vol. 35, pp. 16-22,
2014.
[19] ShearMapping. [Online] Available:
https://en.wikipedia.org/wiki/Shear_mapping.
[20] Sun, C., and Si, D., “Skew and slant correction for
document images using gradient direction”,
Document Analysis and Recognition, Proc. of the 4th
Int. Conf on. IEEE. 1997. Vol. 1, pp. 142-146.
[21] Taira, E., Uchida, S., and Sakoe, H., “Nonuniform
slant correction for handwritten word recognition”,
IEICE Transactions on Information and Systems,
Vol. E87-D, No. 5, pp. 1247-1253, 2004.
[22] hpl-telugu-iso-char-offline data set. [Online]
Available:
http://lipitk.sourceforge.net/datasets/teluguchardata.htm.
[23] Yamaguchi, T., Nakano, Y., Maruyama, M., Miyao,
H., and Hananoi, T., “Digit classification on
signboards for telephone number recognition”,
Document Analysis and Recognition, Proc. of the 7th
Int. Conf on. IEEE. 2003. pp. 359-363.