International Journal of Artificial Life Research, 1(3), 1-16, July-September 2010
Handling of Infinitives in English to Sanskrit Machine Translation
Vimal Mishra, Banaras Hindu University, India
R. B. Mishra, Banaras Hindu University, India
Abstract
The development of a Machine Translation (MT) system for an ancient language like Sanskrit is a fascinating and
challenging task. In this paper, the authors handle the infinitive type of English sentences in the English to
Sanskrit machine translation (EST) system. The EST system is an integrated model of a rule-based approach
to machine translation with an Artificial Neural Network (ANN) model that translates an English sentence (source
sentence) into the equivalent Sanskrit sentence (target sentence). The authors use a feed forward ANN for the
selection of Sanskrit words, such as nouns, verbs, objects, and adjectives, from the English to Sanskrit User Data
Vector (UDV). Due to the morphological richness of Sanskrit, this system uses only morphological markings to
identify Subject, Object, Verb, Preposition, Adjective, Adverb, and Conjunctive, as well as infinitive types
of sentence. The performance evaluations of our EST system with different methods of MT evaluation are
shown using a table.
Keywords:
English to Sanskrit Machine Translation, Infinitives, Machine Translation, Rule Based Approach
of Machine Translation, Sanskrit
1. Introduction
India is a multilingual country with eighteen
constitutionally recognized languages (Sinha &
Jain, 2003). Sanskrit, however, is understood
by only 0.01% of the population (49,736 people)
according to the 1991 Census of India.
Machine translation therefore offers a way to
break the language barrier within the
country. Correct karaka assignment poses the
greatest problem in this regard (Samantaray,
2004; Bharti & Kulkarni, 2007). There are
no existing machine translation systems that
work on English to Sanskrit translation. Some
work on Sanskrit parsers and morphological
analyzers has been done earlier, which is briefly
described below.
DOI: 10.4018/jalr.2010070101
Ramanujan (1992) states
that morphological analysis of Sanskrit is the
basic requirement for the computer processing
of Sanskrit. Nyaya (Logic), Vyakarana
(Grammar) and Mimamsa (Vedic interpretation) are suitable frameworks that cover syntactic,
semantic and contextual analysis of Sanskrit
sentences. P. Ramanujan has developed a Sanskrit
parser, ‘DESIKA’, which is a Paninian grammar
based analysis program. DESIKA includes
Vedic processing and shabda-bodha.
Briggs (1985) uses semantic nets (a knowledge representation scheme) to analyze sentences unambiguously. He examines the similarity
between English and Sanskrit and discusses the
theoretical implications of their equivalence.
Copyright © 2010, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global
is prohibited.
Huet (2003) has developed a Grammatical Analyzer System, which tags NPs (noun
phrases) by analyzing sandhi, samasa and sup
affixation.
Work on Sanskrit processing tools
and a Sanskrit authoring system has been carried out at
Jawaharlal Nehru University, New Delhi, India.
The group is currently working on a karaka analyzer,
a sandhi splitter and analyzer, a verb analyzer, NP
gender agreement, POS tagging of Sanskrit,
an online multilingual Amarakosa, online Mahabharata indexing and a model of a Sanskrit Analysis
System (SAS) (Jha et al., 2006).
Morphological analyzers for Sanskrit
have been developed by the Akshara Bharathi Group at
the Indian Institute of Technology, Kanpur, India,
and the University of Hyderabad.
We have developed a prototype
English to Sanskrit machine translation
(EST) system that combines an ANN model with a rule
based approach. The ANN model matches an
English word to its equivalent Sanskrit word
and handles nouns and verbs. It gives
fast matching of an English
noun (subject or object) or verb to the appropriate
Sanskrit noun (subject or object) or dhaatu. The
rule based model generates the Sanskrit translation
of the input English sentence using rules
that generate the verb and noun forms for Sanskrit.
Rule based approaches mostly rely on
hand written transfer rules to translate
substructures from the source language (English
sentence) to the target language (Sanskrit sentence).
The main advantages of rule based approaches
are easy implementation and small memory
requirements (Jain et al., 2001).
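The matching step described above, from an English noun or verb to a Sanskrit noun or dhaatu, can be pictured with a minimal sketch. This is not the authors' feed forward ANN: a plain dictionary stands in for the learned selection, and the names `Entry`, `TOY_LEXICON` and `lookup`, as well as the five sample entries (romanized without diacritics), are illustrative assumptions rather than the system's actual data.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    sanskrit: str  # romanized Sanskrit stem or root, diacritics omitted
    kind: str      # "noun" for a nominal stem, "dhaatu" for a verbal root


# Hypothetical toy lexicon; the EST system selects words with an ANN instead.
TOY_LEXICON = {
    ("boy", "noun"): Entry("balaka", "noun"),
    ("book", "noun"): Entry("pustaka", "noun"),
    ("go", "verb"): Entry("gam", "dhaatu"),
    ("read", "verb"): Entry("path", "dhaatu"),
    ("eat", "verb"): Entry("khad", "dhaatu"),
}


def lookup(word: str, pos: str) -> Optional[Entry]:
    """Return the Sanskrit stem or dhaatu for an English word, or None."""
    return TOY_LEXICON.get((word.lower(), pos))


for word, pos in [("Boy", "noun"), ("read", "verb"), ("table", "noun")]:
    print(word, pos, lookup(word, pos))
```

A rule based stage would then inflect the returned stem or dhaatu for case, person, number and tense; that stage is not sketched here.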
The rest of the paper is organized as follows. Section 2 describes the linguistic
features of Sanskrit, their equivalence in English
and a comparative view of English and Sanskrit.
Section 3 presents infinitives in English and
Sanskrit and describes the rules, based on
Panini's grammar, for forming infinitive words
in Sanskrit. Section 4 describes the
system model of our EST system. Section 5
presents the implementation, illustrated
with examples, and the translation
results in GUI form. Section 6 reports, in a table,
the performance of our EST
system under different MT evaluation methods:
BLEU (BiLingual Evaluation Understudy), unigram precision, unigram recall,
F-measure and METEOR score.
Section 7 gives the conclusions and the scope
for future work.
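Three of the measures named above have simple standard definitions, which the following minimal sketch illustrates. It assumes whitespace tokenization, a single reference translation, clipped unigram counts and a balanced F-measure (the harmonic mean of precision and recall); the function name `unigram_scores` and the placeholder token strings are assumptions, and the paper's exact evaluation setup may differ. BLEU and METEOR add further components (higher order n-grams, a brevity penalty, stemming and synonym matching, a recall-weighted mean) that are not shown.

```python
from collections import Counter
from typing import Tuple


def unigram_scores(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) over whitespace-separated unigrams."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    # Clipped matches: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Placeholder tokens, not real translation output.
p, r, f = unigram_scores("a b c d", "a b e")
print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

In this example two of the four candidate tokens appear in the three-token reference, so precision is 2/4 and recall is 2/3.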
2. Linguistic Features of Sanskrit and Its Equivalence in English
The Sanskrit language is called “devbhashaa”,
which means the language of God. It
is written in the devnagari script.
Sanskrit has forty-two characters or
varnas, of which thirty-three are vyanjanas or consonants and nine are swaras or vowels (Kale, 2005).
In Sanskrit, there are six tenses (kala) and four
moods (arthaa). The tenses are: Present, Aorist,
Imperfect, Perfect, First Future and Second
Future. The moods are: Imperative, Potential,
Benedictive and Conditional. The ten tenses and
moods are technically called the ten Lakaaras
in Sanskrit grammar. There are three numbers:
singular, dual and plural. The singular stands for
one, the dual for two and the plural for more than two.
As in English grammar, there are three persons: first,
second and third. Thus, Sanskrit grammar has
structural vastness. Despite this vastness, Sanskrit grammar is very well organized
and among the least ambiguous of natural
languages, which is reflected in the increasing
fascination for this ancient Aryan language. According to Paninian grammar, Sanskrit grammar
possesses well organized rules and meta rules.
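The inventory of tenses, moods, persons and numbers listed above can be written down as data. The sketch below simply enumerates every (Lakaara, person, number) combination; the constant and function names are illustrative assumptions, and the count is the number of feature combinations, not a claim about how many distinct surface verb forms Sanskrit has.

```python
from itertools import product

# Feature values exactly as listed in the text above.
TENSES = ["Present", "Aorist", "Imperfect", "Perfect",
          "First Future", "Second Future"]
MOODS = ["Imperative", "Potential", "Benedictive", "Conditional"]
LAKAARAS = TENSES + MOODS  # the ten Lakaaras
PERSONS = ["first", "second", "third"]
NUMBERS = ["singular", "dual", "plural"]


def feature_combinations():
    """Return every (Lakaara, person, number) triple."""
    return list(product(LAKAARAS, PERSONS, NUMBERS))


combos = feature_combinations()
print(len(LAKAARAS), "Lakaaras,", len(combos), "feature combinations")
```

A rule based generator needs an inflection rule or table entry for each such combination, which gives a concrete sense of the structural vastness mentioned above.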
Some salient features
of Sanskrit are described below.
(a) All words are made of characters, either vowels
(swaras) or consonants (vyanjanas). Swaras
exist independently, while vyanjanas depend on swaras.
(b) Words consist of two parts: a fixed base
part and a variable affix part. The variable