आभ्यंतर (Aabhyantar), SCONLI-12 Special Issue, ISSN: 2348-7771
18. Malayalam Experience of Google Translate: Morpho-syntactic Observations
Prajisha Areecode
Abstract
Google Translator is one of the most widely used online machine translators.
This paper evaluates the errors in Malayalam-to-English translation produced by
this tool. Sample sentences are tested in order to understand the translation
problems and to assess the accuracy of the online machine translation offered by
Google. It is observed that the morpho-syntactic peculiarities of Malayalam
largely remain untranslated. An evaluation of this issue suggests that the
development of NLP for Malayalam is not yet adequate to support machine
translation. The study suggests that the morpho-syntactic errors identified here
should be addressed promptly to ensure the effectiveness of machine translation
in the context of the Malayalam language.
Keywords: Google Translate, NLP, machine translation, Malayalam, morpho-syntax
Introduction
Apart from the Google initiative, there have been few systematic efforts at
machine translation (MT) in the Malayalam context. This makes it necessary to
test how well the Malayalam Google Translator performs. This paper presents an
accuracy test and identifies the major issues involved in translating Malayalam
with Google Translator. It also suggests ways to address these issues.
Considering the low success rate of earlier MT attempts, Google Translator is
good in responsiveness and usability. But the mismatches in its translations
should be studied so that suggestions can be made for improvement. This study is
designed with that aim.
This paper is organized into 3 sections. Section 1 gives a brief introduction to
MT attempts in the context of Malayalam. Section 2 illustrates the translation
of Malayalam by Google Translator; with the support of examples, problems are
identified and discussed. Finally, concluding remarks are made in Section 3.
Translation of Malayalam by Google Translator
In this part, translations of Malayalam sentences produced by Google Translator
(GT) are presented alongside their human translations (HT).
1)
1.(a) ii panthinu nalla valippamuNDu
This ball+DAT pretty size+CONJ
This sound is pretty big    GT
1.(b) ii panthinu nalla valuppamuNDu
This ball+DAT pretty size+CONJ
This ball is pretty big    HT
This ball has a good size    HT
This ball has a good size    GT
Consider 1.(a) and 1.(b). These sentences show the words valippam and valuppam
respectively, which are in free variation. 1.(a) has valippam, and Google
mistranslates it. When the form is valuppam, as in 1.(b), GT gets it right. This
suggests that the two forms are not listed as variants of the same lexical entry
in the corpus.
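One way to picture the hypothesis above is a lexicon in which spelling variants
point to a single entry. The sketch below is purely illustrative: the names
VARIANTS, LEXICON and lookup are hypothetical, the gloss 'size' is taken from
example 1, and nothing here describes how Google Translator actually stores its
vocabulary.

```python
# Hypothetical variant table: each free variant points to one canonical form.
VARIANTS = {"valippam": "valuppam"}

# Hypothetical lexicon keyed by canonical form only (gloss from example 1).
LEXICON = {"valuppam": "size"}


def lookup(form):
    """Return the gloss for a word form, resolving known variants first."""
    canonical = VARIANTS.get(form, form)
    return LEXICON.get(canonical)  # None when the form is not listed


print(lookup("valuppam"))  # canonical form
print(lookup("valippam"))  # variant, resolved to the same entry
```

Without the VARIANTS table, the second lookup would return nothing, which is
the kind of gap the contrast between 1.(a) and 1.(b) points to.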
2)
UrappayiTTum avare kaNDaal thallaam
Definitely they-Acc see-if beat-may
Definitely beat them when (you) meet    HT
Even if they can see them.    GT
Here aal is equivalent to the word ‘if’ in English. But in Malayalam it is
homonymous with the instrumental suffix, which might create confusion in MT.
The verb thallu (beat) is lost in GT. The mistranslation of this sentence
indicates incorrect performance of the morphological analyser. The suffix aam
expresses judgemental modality.
3)
Nii kaLLi aaNu
You thief-FEM.SG be
You are (a) thief    HT
You’re in the mosque    GT
Here ‘thief-feminine-singular’ is translated as ‘in the mosque’. This may be
due to errors in the lexical items. The translation mismatches the lexical item
and also reframes the case structure: in the Malayalam sentence (3) the noun is
in the nominative, but in the translation it appears with the locative ‘in’.
Such an unrelated translation suggests that the sentence is not understood by
the translator.
4)
manushyaR nanma uLLavaraNu
human-Pl goodness having-Pl-be
Humans are good beings    HT
Humans are virtuous    HT
Humans have good things    GT
There is a mix-up between the copular verbs ‘aanu’ (be) and ‘uND’ (have) in
sentence 4.
Of the sentences in 1-4 above, all fail in translation except 1.(b). In the
first one (1), the object ‘ball’ is misrepresented: the Malayalam word for ball
is translated as ‘sound’ in 1.(a). Likewise, kaLLi (thief-feminine-singular) is
translated as ‘in the mosque’ in (3). The appearance of such unrelated lexical
items is a major problem in MT. Another problem is sense identification, seen
in the case of nanma. It means ‘virtuous’, but it is reduced to ‘good’. This
suggests that Google could not capture its semantic value.
2.1 Verbs
5)
Njaan avaLe viLichathayirunnu
I she-Acc call-PAST.PCPL-be-CONJ.PAST
I had called her.    HT
I called her    GT
6)
Njaan avaLe viLichirunnu
I she-Acc call-perfect past
I had called her.    HT
I called her    GT
7)
Njaan avaLe viLichu
I she-Acc call-PAST
I called her    HT
I called her    GT
(7 is a simple sentence with only the past tense marker. Only Eg. 7 is
translated correctly.)
8)
Njaan avaLe viLichiTTuND
I she-Acc call-REMO.PERF-be-PRES
I had called    HT
I called her    GT
In 6, the perfective aspect is not translated. Instead, the sentence is
rendered in the simple past.
9)
Njaan avaLe viLichiTTuNDaayirunnu
I she-Acc call-REMO.PERF-be-PRES-CONJ.PAST
I had called her    HT
I called her    GT
The translations of 5-9 do not retain the tense and aspect meanings of the
corresponding Malayalam verbs. Sentences 5-9 are typical examples of Malayalam
verb inflection. Inflection is an important characteristic of Malayalam, but it
is not addressed in Google's translation. In 5-9 the translations are all
alike, while the Malayalam forms are distinct in use, and this distinction is
not covered by the translation. The past tense is expressed in different ways
in Malayalam, but it is uniformly translated with the -ed form.
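The observation about 5-9 can be checked with a simple tally. The sketch below
is only an illustration and not the procedure used in this study: it copies the
HT and GT strings of examples 5-9 as given above, and the helper names
(normalise, exact_matches, distinct_outputs) are hypothetical. Exact string
match is a crude measure, assumed here for simplicity.

```python
# HT (human) and GT (Google Translator) outputs for examples 5-9,
# copied from the examples above as (HT, GT) pairs.
EXAMPLES = {
    5: ("I had called her.", "I called her"),
    6: ("I had called her.", "I called her"),
    7: ("I called her", "I called her"),
    8: ("I had called", "I called her"),
    9: ("I had called her", "I called her"),
}


def normalise(text):
    """Lower-case and strip spaces/full stops so trivial differences do not count."""
    return text.lower().strip(" .")


def exact_matches(examples):
    """Return the example numbers whose GT output equals the HT output."""
    return [n for n, (ht, gt) in sorted(examples.items())
            if normalise(ht) == normalise(gt)]


def distinct_outputs(examples):
    """Count how many different GT strings cover the whole set."""
    return len({normalise(gt) for _, gt in examples.values()})


print("GT matches HT in examples:", exact_matches(EXAMPLES))
print("Distinct GT outputs:", distinct_outputs(EXAMPLES))
```

Five distinct Malayalam verb forms receive a single English output, and only
the plain past form in 7 agrees with the human translation.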
Sentences 10-13 below show the same pattern for differences in the sense of the
future tense.
10)
Njan avaLe viLikkum
I she-Acc call-FUT
I will call her    HT
I’ll call her    GT
11)
Njan avaLe viLikkumaayirikkum
I she-Acc call-FUT-may
I may/might call her.    HT
I’ll call her    GT
Here the verb stem viLikk carries the desiderative mood (the desiderative mood
is used to denote a situation in which the speaker intends to say that a
particular action which was not done should have been done; 2012:63). That is,
GT fails to capture the mood features of the verb.
12)
Njan avaLe viLikkaam
I she-Acc call-PROM
I will call her    HT
I’ll call her    GT
Some other examples:
The following sentences illustrate failures in translating Malayalam verb
inflections. Even tense is not translated consistently.
13)
Enikk viSakkum
I-DAT hungry-FUT
I will be hungry    HT
I’m hungry    GT
14)
Enikk viSannu
I-DAT hungry-PAST
I got hungry    HT
I was hungry    GT
15)
Enikk viSakkunnilla
I-DAT hungry-PRES-NEG
I didn’t get hungry    HT
I’m not hungry    GT
16)
Enikk viSannilla
I-DAT hungry-PAST-NEG
I was not hungry    HT
I’m not hungry    GT
17)
Enikk viSakkunnuNDaayirunnilla
I-DAT hungry-PRES-be-CONJ-NEG
I was not feeling hungry    HT
I was not hungry    GT
18)
Enikk viSanniTTuNDaayirunnilla
I-DAT hungry-PAST-REMO.PERF-be-PAST-NEG
I have not got hungry    HT
I was not hungry    GT
2.2 Habitual action (seelabhaavi) in Malayalam
‘Seelabhaavi’ covers all tenses. It expresses continuous and habitual action,
for instance daily processes such as sunrise and sunset.
19)
Sooryan kiZhakkee udikkuu
Sun east-HAB rise-HAB
The sun rises only in the east    HT
The Sun rises in the east    GT
20)
naayayuTe vaal vaLnjnjee irikkoo
dog-GEN tail bent-HAB be-HAB
The dog’s tail always will be bent    HT
Eat the dog’s tail    GT
Here the main verb irikk (which usually means ‘sit’) is used in the sense of
aak (‘be’). But GT gets the verb wrong and translates it as ‘eat’. ‘ee’ is an
emphasis marker that signals habitual action, and oo also denotes habitual
action. These two indicators of habitual action are ignored by GT, so that
sense is completely lost in translation.
In 19 and 20, the translation does not retain the habitual sense.
2.3  Auxiliary Verbs (anuprayoogam)
This category is a notable speciality of Malayalam. The use of auxiliaries
(traditional Malayalam grammar distinguishes aspect-mood markers from
auxiliaries) is not captured by Google Translator. For instance,
21)
Njaan sathyam paRanju pooyi
I truth tell-PAST go-PAST
I happened to tell the truth    HT
I’m saying the truth    GT
Here GT gets the tense wrong. pooyi is an auxiliary/light verb. It denotes that
the action was done involuntarily, but this sense is lost in GT.
2.4 Agglutinative Nature
Compare the sentence pairs 22(a) and 22(b), and 23(a) and 23(b).
22)
22(a) Ninte veeTeviTeyaaNu
You-GEN home where-is
Here’s your home    GT
22(b) Ninte veeTu EviTeyaaNu
You-GEN home where-is
Where is your home    HT
Where is your home    GT
23)
23(a) Raamanetthi
Raaman reach-PAST
Ram    GT
23(b) Raaman etthi
Raaman reach-PAST
Raaman reached    HT
Raman reached    GT
GT correctly analyses and translates complex words only when they are given as
separate morphemes. In the above examples, the translator could not handle the
agglutinative nature of Malayalam. But when the same forms are written
separately, GT could identify the individual morphemes. This means that the low
accuracy in these cases is mainly due to unfamiliarity with the agglutinative
nature of the language.
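The contrast between the fused and the separated forms can be sketched as a
small lexicon-driven splitter. This is a minimal sketch under stated
assumptions, not a description of Google Translator: the lexicon holds only the
four words from 22 and 23, the function name split_fused is hypothetical, and
the single sandhi rule (a word-final u is dropped before a vowel-initial word)
is an assumption inferred from veeTu + eviTeyaaNu. Real Malayalam sandhi is far
richer.

```python
# Toy lexicon: only the words from examples 22 and 23.
# "eviTeyaaNu" is stored with a lower-case initial, as it appears in the fused form.
LEXICON = {"Raaman", "etthi", "veeTu", "eviTeyaaNu"}
VOWELS = set("aeiouAEIOU")


def split_fused(token, lexicon=LEXICON):
    """Try to split a fused token into two lexicon words.

    Returns (left, right) on success, or None if no split is found.
    """
    for i in range(1, len(token)):
        left, right = token[:i], token[i:]
        if right not in lexicon:
            continue
        if left in lexicon:
            return left, right  # plain concatenation
        # Assumed sandhi rule: final 'u' of the first word is dropped
        # before a vowel-initial second word, so restore it.
        if right[0] in VOWELS and left + "u" in lexicon:
            return left + "u", right
    return None


print(split_fused("Raamanetthi"))
print(split_fused("veeTeviTeyaaNu"))
```

A real system would need a full morphological analyser. The point is only that
the fused input has to be segmented before lexical lookup can succeed.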
2.5
The mistakes can be listed as follows.
1.       Errors in the corpus and lexicon.
2.       Mistakes in translating copular verbs.
3.       GT could not distinguish the differences of tense, aspect and mood in
verb morphology.
4.       When auxiliary verbs/light verbs are used, GT could not capture their
various functional meanings.
5.       Problems in analysing agglutinated forms.
Conclusion
The present paper discusses various instances of mistranslation by Google
Translator in the Malayalam context. From this test of accuracy, it is observed
that there are instances of mismatch and incoherence in Google's translations.
Malayalam sentences were used as input, and the English output often
corresponds poorly to them. The language examined in this work has a rich
morphology, which leads to poor translation quality. This opens a new vista for
MT in Malayalam. The untranslatability mapped above is not so much a failure of
the Google tool as a sign that NLP for Malayalam is not yet scientifically
developed. It may be concluded that the quality of translation depends directly
on the scope and quality of NLP resources and parallel language corpora.
Malayalam linguistics should concentrate primarily on the morphological aspects
of the language and their computation in order to make MT a reality in
Malayalam.
Bibliography
·        Antony, P.J. 2012. Machine Translation Approaches and Survey for Indian Languages. Computational Linguistics and Chinese Language Processing. Vol.18, No.1, March 2013.
·        Garje, G.V. and Kharate, G.K. Survey of Machine Translation Systems in India. International Journal on Natural Language Computing (IJNLC). Vol.2, No.4, October 2013.
·        http://GloablSecurity.org/intell/systems/mt.history.html
·        Hutchins, W. John. Machine Translation: A Concise History. http://ourworld.compuserve.com/homepages/wjHutchins
·        http://ourworld.compuserve.com Latest version November 2005
·        Jomysose. Machine Translation with special reference to Malayalam language. International Journal of Computer Science and Engineering Technology (IJCSET). Vols. No.04, April 2014.
·        Translation directory.com/articles/articles 190t.php
Abbreviations
Acc – Accusative case.
CONJ – conjunctive
CONT – Continuous
DAT – Dative
DES – Desiderative
FEM – Feminine
FUT – Future tense
GEN – Genitive
HAB – Habitual
LOC – Locative
NEG – Negative
PAST – Past Tense
PCPL – Participle
PERF – Perfective
PERM – permissive
PL – Plural
POSS – Possibilitive
PRES – Present tense
PROM – Promissive
REMO – Remote
 
 
 