Dr. Pushpak Bhattacharya represents the Department of Computer Science and Engineering, Indian Institute of Technology, Bombay. His contribution to the Prof. M. B. Emeneau Centenary International Conference on South Asian Linguistics is a presentation on "Machine Translation, Language Divergence and Lexical Resources", which deals with the technical aspects of translating between languages, with special emphasis on machine translation (MT) between Hindi, English and Marathi.

          Dr. Bhattacharya begins with an introduction to Machine Translation, whose purpose is to convert documents in one language (L1, the Source Language) into another (L2, the Target Language). The key concern is the language divergence problem, which arises from the different syntactic and lexical choices that languages make in expressing an idea. Effort is under way, he says, on building Human Aided Machine Translation (HAMT) and Machine Aided Human Translation (MAHT) systems. To ease the analysis and generation processes, the input and output texts also need pre-editing and post-editing respectively.

          MAHT is translation done by human beings with the help of computer-based support tools such as online dictionaries, terminology databanks and translation memories. Dr. Bhattacharya opines that translation memories (TM 1 and TM 2) often produce incorrect translations, owing to confusion arising from issues such as text alignment, affix considerations and sense ambiguity.
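          To make the translation memory idea concrete, the following is a minimal sketch of a fuzzy lookup, not the system described in the talk. The stored segments, the romanized Hindi targets, the function name `tm_lookup` and the 0.7 threshold are all illustrative assumptions. It also shows how a surface match can return a wrong translation when a word is ambiguous.

```python
from difflib import SequenceMatcher

# Toy translation memory: English source segment -> romanized Hindi target.
# These entries are illustrative assumptions, not data from the talk.
TRANSLATION_MEMORY = {
    "The bank is closed.": "baink band hai.",
    "What is your name?": "aapka naam kya hai?",
}


def tm_lookup(segment, memory, threshold=0.7):
    """Return (stored_source, stored_target, score) for the closest
    stored segment, or None if no segment reaches the threshold."""
    best = None
    for source, target in memory.items():
        # Character-level similarity in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    # Exact match: the stored translation is reused as-is.
    print(tm_lookup("The bank is closed.", TRANSLATION_MEMORY))
    # Fuzzy match: the river sense of "bank" still retrieves the
    # financial-sense translation, which is wrong for this input.
    print(tm_lookup("The river bank is closed.", TRANSLATION_MEMORY))
```

          The second lookup matches on surface similarity alone, so the memory proposes the financial "bank" for a sentence about a river. This is the kind of sense ambiguity that a human translator in an MAHT setting would have to catch.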

          Classifying Machine Translation systems, Dr. Bhattacharya mentions two criteria: Domain Coverage, which distinguishes general purpose systems like SYSTRAN from special purpose systems like TAUM-METEO; and Point of Entry from the Source Text to the Target Text, which he illustrates with the Vauquois Triangle and its three well-known translation methodologies, viz., the Direct, Transfer and Interlingua approaches.

          Dr. Bhattacharya's paper focuses on Language Divergence, which refers to differences in the lexical and syntactic choices that languages make in expressing ideas. He explains this concept on the basis of trilingual Machine Translation between Hindi, English and Marathi. He illustrates the grammatical changes that occur in translation, for instance a noun turning into an adjective or verb, an adjective into an adverb or verb, or a preposition into an adverb, along with idiomatic usages. He observes that translation between Hindi and Marathi is as difficult as between Hindi and English, even though Hindi and Marathi are more closely related.

          Speaking of disambiguation, he mentions Yarowsky's work on Word Sense Disambiguation (WSD), which is defined as the task of finding the correct sense of a word in a context. He emphasizes the need to develop Wordnets, which aid in solving the problem of sense disambiguation by providing a repository of sense enumerations from which the correct sense is picked. Here, a Wordnet is defined as an electronic lexical reference system in which each word meaning is represented by a set of word-forms known as a synonym set or synset. The basic semantic relations in the Hindi Wordnet, which currently contains approximately 30,000 unique words and 13,000 synsets, are Hypernymy / Hyponymy, Entailment / Troponymy and Meronymy / Holonymy.
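          The idea of picking a sense from a repository of sense enumerations can be sketched as follows. This is a simplified gloss-overlap (Lesk-style) heuristic, chosen here only for brevity; it is not Yarowsky's method and not necessarily the approach presented in the talk. The `Synset` class, the two toy English synsets, their identifiers and the stopword list are hypothetical and do not come from the Hindi Wordnet.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    sense_id: str       # hypothetical identifier for this sense
    word_forms: tuple   # synonymous word-forms sharing this meaning
    gloss: str          # short definition of the meaning
    hypernym: str = ""  # sense_id of a more general synset, if any


# Toy sense repository (an assumption for illustration only).
TOY_WORDNET = [
    Synset("bank.n.1", ("bank", "depository"),
           "an institution that accepts deposits and lends money",
           "institution.n.1"),
    Synset("bank.n.2", ("bank", "riverbank"),
           "sloping land beside a river or stream",
           "slope.n.1"),
]

STOPWORDS = {"a", "an", "the", "of", "in", "on", "or", "and",
             "that", "he", "they", "is"}


def content_words(text):
    """Lowercase the text and return its non-stopword tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(word, context, wordnet):
    """Return the synset of `word` whose gloss shares the most content
    words with the context, or None if the word has no synset."""
    context_set = content_words(context)
    best, best_overlap = None, -1
    for synset in wordnet:
        if word not in synset.word_forms:
            continue
        overlap = len(context_set & content_words(synset.gloss))
        if overlap > best_overlap:  # ties keep the earlier-listed sense
            best, best_overlap = synset, overlap
    return best


if __name__ == "__main__":
    print(disambiguate("bank", "He deposits money in the bank", TOY_WORDNET))
    print(disambiguate("bank", "They sat on the bank of the river", TOY_WORDNET))
```

          With a real Wordnet the glosses could be enriched with words from related synsets (for example hypernyms), which is one reason the semantic relations listed above matter for disambiguation.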

          In conclusion, the paper discusses Machine Translation systems, emphasizing the pre-editing and post-editing processes, the classification of systems, language divergence, WSD and the Wordnet principle. It also analyses trilingual MT between Hindi, English and Marathi. By dealing with the technical aspects of Machine Translation with Hindi and Marathi in view, it lays a foundation for making MT between different Indian languages a reality.

          Prof. Milind Malshe presented the paper in Dr. Pushpak Bhattacharya's stead. After the session, chaired by Dr. Kikkeri Narayan, many issues related to Machine Translation were discussed. The general consensus of the discussion was that machines would have to be made to disambiguate words.

 
