Dhanji Prasad, Language technology
Mahatma Gandhi Antarrashtriya
Hindi Vishwavidyalaya, Wardha
(Research paper presented at SCONLI-05, held at HCU, Hyderabad)
Abstract: Linguistics is the scientific study of language, and it has many
applied areas. The application of linguistic knowledge may be grouped into
three types: Practical Application (direct application in real life),
Interdisciplinary Application (application together with other sciences to
learn more about all aspects of language) and Technological Application
(application in technical areas). Technological Application is the most recent
area and offers many opportunities. Three major disciplines apply the
theoretical knowledge of languages in technical areas: Computational
Linguistics, Language Technology and Language Engineering. Natural Language
Processing (NLP) is the means by which knowledge of human languages is
established in computers (and computer-like machines). With the help of this
knowledge, machines are used for language-related work such as Machine
Translation, Information Extraction/Information Retrieval, Text Summarization,
Computer Assisted Language Learning, Optical Character Recognition and
Computational Lexicology.
1.0 INTRODUCTION
Linguistics, the scientific study of language, explains its structure as a
system of units found at various levels. In ascending order, these units are:
Phoneme – Morpheme – Word – Phrase – Clause – Sentence – Discourse
Each larger unit is made by assembling one or more smaller units. Linguistics
tries to find the rules and conditions under which they are assembled and
produced. In actual linguistic behavior, the ‘phone’ is the minimal unit of
language. When phones are produced in a syntactically well-formed and
meaningful form, they become communicative. In this way, they are used for
sending and receiving ‘meaning’ (meaningful ideas). The important point here is
that neither the ‘phone’ nor the ‘meaning’ is ‘Language’. Language is the
systemic form that connects ‘phones’ and ‘meaning’. It is abstract and resides
in the human mind. Linguists try to understand and explain this system as a
Grammar, for example Chomsky’s Transformational Generative Grammar, Halliday’s
Systemic Grammar, Lamb’s Stratificational Grammar and Fillmore’s Case Grammar.
In each of these grammars, the linguist provides units, levels and methods to
explain the system/structure of Language.
2.0 APPLIED AREAS OF LINGUISTICS
The discussion above gives the theoretical background for looking at the
applications of linguistic knowledge. Every theoretical science has
applications, because application realizes theories in the practical world, and
linguistics is no exception: it has many applied areas. R.N. Shrivastava, in
his book ‘Anupryukt Bhashavigyaan’ (1995), categorizes them into three types
according to their contexts:
i) Context of Knowledge Area (Gyaanchhetra ka Sandarbh)
ii) Context of Particular Discipline (Vidha-vishesh ka Sandarbh)
iii) Context of Language Teaching (Bhaasha ShikshaN ka Sandarbh)
He explains these three further and places various subfields (such as
Sociolinguistics, Psycholinguistics, Translation and Lexicography) among them.
However, I think that from today’s point of view the applications of
linguistics cannot be explained completely by this categorization. So, here the
applications of linguistics are grouped into three categories on the basis of
their use, form and nature:
1. Practical Application
2. Interdisciplinary Application
3. Technological Application
1. Practical Application: - Here ‘Practical’ means the direct application of
linguistics to the real world. All the disciplines that use linguistic
knowledge to produce something directly for our practical life belong to this
type. These include Language Teaching, Translation, Lexicography, Language
Planning and Speech Therapy. All these subfields use linguistic knowledge and
provide things that can be seen and used directly in our life, for example:
• Language Teaching uses linguistic knowledge to make learners able to
communicate in the Target Language in the practical world.
• Translation uses linguistic knowledge of both languages to produce a text in
the Target Language.
• Lexicography produces ‘dictionaries’ for our use.
2. Interdisciplinary Application: - There are many disciplines in which
‘linguistics’ and ‘other sciences’ are combined to learn more about various
aspects of language in relation to other areas. All these disciplines belong to
this type. The most popular of them are Sociolinguistics, Psycholinguistics,
Neurolinguistics and Stylistics.
• Sociolinguistics combines Sociology and Linguistics. It studies and explains
the interrelationship between ‘language’ and ‘society’.
• Psycholinguistics studies the relationship between ‘language’ and the ‘human
mind’. This includes all the psychological aspects of the recognition and
production of language.
• Neurolinguistics studies the relationship between the ‘neural schema of the
human brain’ and ‘language’.
• Stylistics tries to find the elements that turn a linguistic expression into
literature. Here ‘linguistics’ and ‘literature’ meet each other.
We can see that each of the above disciplines involves more than one science,
but linguistics is common to all of them. This is the Interdisciplinary
Application of linguistics.
3. Technological Application: - Technology is the field where a scientific
theory gets realized. In its conventional meaning the word has a broad sense:
‘Technology’ generally covers all the tools, methods and crafts that are
employed to make an artifact. It is thus a process/method by which natural
materials are changed into useful products with the help of scientific
knowledge about them. This may be understood from the following diagram:
Natural Material --[ TECHNOLOGY, using Scientific Knowledge ]--> Product
Nowadays, the meaning of ‘Technology’ has narrowed to the methods and skills
used for making electronic tools and machines. Electronic devices have spread
into all areas of day-to-day life, but it is the sciences that provide the
theoretical knowledge behind them. This is why all the sciences are looking for
‘Technological Applications’ of their theories in this specific sense.
Linguistics too has many such applications.
3.0 DISCIPLINES FOR TECHNOLOGICAL APPLICATIONS
The Technological Application of linguistics is related to digital machines,
particularly computers. Various types of ‘software and application systems’
are being developed and used on computers, and they touch all areas of our
life. With them we can do our daily, official and professional work faster and
more easily than in the past. Most of this work, however, involves language. So
linguistic knowledge in the computer is necessary to make it more useful and
communicative. This is why language-related software is being developed
rapidly all over the world. For this task, many disciplines have emerged in the
past five decades. The three most important and popular of them are:
a) Computational Linguistics
b) Language Technology
c) Language Engineering
It should be noted that these three disciplines are not quite separate. They
are co-related and overlap each other in many areas.
The development of language-related software relies on a process called
“Natural Language Processing” (NLP). NLP is the way to establish knowledge of
Natural Language in machines. This requires a systematic and explicit grammar
that can be expressed as algorithms. Such grammars are called formal grammars.
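To illustrate what it means for a grammar to be explicit enough to run as an algorithm, here is a minimal sketch in Python. The toy rules, the small lexicon and the function name `recognizes` are assumptions made up for this illustration and are not taken from any system discussed in this paper. The sketch uses the CYK algorithm, one standard way of checking whether a context-free grammar in Chomsky normal form derives a sentence.

```python
# Toy context-free grammar in Chomsky normal form (illustrative assumption).
# Binary rules: (left category, right category) -> parent category
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}

# Lexical rules: word -> category (hypothetical mini-lexicon)
LEXICON = {
    "the": "Det",
    "a": "Det",
    "child": "N",
    "book": "N",
    "reads": "V",
}


def recognizes(sentence):
    """Return True if the toy grammar derives the sentence (CYK recognition)."""
    words = sentence.lower().split()
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that cover words[i..j] inclusive
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        if word in LEXICON:
            table[i][i].add(LEXICON[word])
    # Build longer spans from pairs of adjacent shorter spans
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                for left in table[start][split]:
                    for right in table[split + 1][end]:
                        parent = BINARY_RULES.get((left, right))
                        if parent:
                            table[start][end].add(parent)
    return "S" in table[0][n - 1]


print(recognizes("the child reads a book"))   # True
print(recognizes("child the book reads a"))   # False
```

The grammar here is deliberately tiny. A realistic formal grammar would need far more rules and some treatment of ambiguity, but the principle of stating rules explicitly so that a program can apply them is the same.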
These disciplines are now briefly introduced:
a) Computational Linguistics: - As the name indicates, this is an
interdisciplinary field of ‘computer science’ and ‘linguistics’. Ralph Grishman
in ‘Computational Linguistics’ (1994) says about it, “Computational Linguistics
is the study of computer systems for understanding and generating Natural
Language.”
Computers have a fast processing speed and vast memory. A well-programmed
computer can perform a task thousands of times faster than a human and can send
or receive a message anywhere in the world. Therefore, all work done by or with
the help of computers and computerized machines counts as hi-tech today. This
is why researchers are trying to equip computers with the knowledge of Natural
Language. This work is now going on for all the major languages of the world,
and Computational Linguistics is emerging as a major research area with many
opportunities.
b) Language Technology: - Language Technology is the second discipline, and it
covers a broader area for the technical application of linguistic knowledge.
Here language-related software systems are developed not only for computers but
also for other platforms such as mobile devices, A.I. tools and other
language-oriented machines. Computational Linguistics and Language Technology
do the same job, but Computational Linguistics is restricted to computers while
Language Technology includes other platforms too.
‘Natural Language Processing’ is the basic idea for Language Technology too. In
this discipline, software is developed to handle tasks related to Natural
Languages. Natural Languages are highly complex and ambiguous in their nature
and structure. They are processed using formal grammars such as LFG, HPSG, GPSG
and TAG. Formal grammars state the rules of a language in the form of logical
expressions.
c) Language Engineering: - Language Engineering is the most recent concept
among scholars. It was introduced because the challenges of Natural Language
Understanding and Natural Language Production cannot be handled properly and
completely by conventional systems. On this point Ralph Grishman writes,
“Constructing a fluent, robust natural language
interface is a difficult and complex task. Perhaps our understanding of the
language improves, we will be able to construct simpler natural language
systems. For the present, however, much of the challenges of building such a
system lies in integrating many different types of knowledge – syntactic
knowledge, semantic knowledge, knowledge of domain of discourse – and using
them effectively in language processing. In this respect, the building of
natural language systems – like other large computer systems – is a major task
of engineering.”
This calls for designing, building, implementing and modifying language-related
tools and systems. Language Engineering has emerged to do this job and is
growing rapidly.
4.0 AREAS/ FIELDS OF TECHNOLOGICAL APPLICATION
It is now clear that the technological application of linguistics means the use
of linguistic knowledge in electronic machines and tools so that they can
perform language-related tasks. This is taking place in many areas. Some of the
most important fields are:
i) Machine Translation
ii) Optical Character Recognition
iii) Speech-to-Text and Text-to-Speech
iv) Information Extraction and Information Retrieval
v) Text Summarization
vi) Computational Lexicography and Computational Lexicology
vii) Computer Assisted Language Learning
viii) Question Answering Systems
ix) Language Reading and Writing Aids
x) Voice Recognizer
i) Machine Translation: - Machine Translation is the most popular applied area
within Computational Linguistics and Language Technology. It concerns the
development of software programs that automatically translate text (or speech)
from one natural language to another. The linguistic knowledge in these
programs is stored as syntactic rules and a lexicon, or as large collections
such as a corpus. Several approaches have been used to develop machine
translation systems:
(a) Rule-based: - Systems based on this approach include the transfer-based,
interlingua and dictionary-based machine translation paradigms.
(b) Statistical: - These systems use statistical methods based on bilingual
text corpora.
(c) Example-based: - This approach also uses a bilingual corpus as its main
knowledge base, at runtime. It is essentially translation by analogy and can be
viewed as an implementation of the case-based reasoning approach of machine
learning.
(d) Hybrid: - This approach combines statistical and rule-based methodologies.
Using these approaches, many machine translation systems are being developed
all over the world, for example statistical systems trained on parallel corpora
such as the Canadian Hansard and Europarl, and Anusaaraka (based on Paninian
Grammar).
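The dictionary-based and transfer-based ideas listed under the rule-based approach (a) can be sketched in a few lines of Python. Everything in the sketch is a hypothetical toy: the romanized English-to-Hindi lexicon, the single word-order rule and the function name `translate` are assumptions for illustration and do not describe how Anusaaraka or any other real system works.

```python
# Hypothetical toy English-to-Hindi lexicon (romanized); illustration only.
LEXICON = {
    "ram": "raam",
    "eats": "khaataa hai",
    "reads": "paRhtaa hai",
    "mango": "aam",
    "book": "kitaab",
}


def translate(sentence):
    """Translate a three-word subject-verb-object sentence with one transfer rule."""
    words = sentence.lower().split()
    if len(words) != 3:
        raise ValueError("this sketch only handles subject-verb-object sentences")
    subject, verb, obj = words
    # Transfer rule: English is SVO, Hindi is SOV, so the verb moves to the end.
    reordered = [subject, obj, verb]
    # Lexical transfer: words missing from the lexicon are passed through unchanged.
    return " ".join(LEXICON.get(word, word) for word in reordered)


print(translate("Ram eats mango"))   # raam aam khaataa hai
print(translate("Ram reads book"))   # raam kitaab paRhtaa hai
```

A real rule-based system would also need morphological analysis, agreement and many more transfer rules. The sketch only shows how a lexicon and a syntactic rule work together.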
ii) Optical Character Recognition: - This is the mechanical or electronic
conversion of scanned images of printed, typed or handwritten text into
machine-readable text. It is used to convert books and documents into
electronic files.
iii) Speech-to-Text and Text-to-Speech: - The development of these application
systems is an applied area of Speech Recognition and Speech Synthesis. These
systems are used to convert spoken sentences into written text and vice versa.
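iv) Information Extraction and Information Retrieval (IE/IR): - In Information
Extraction, software systems are developed to automatically extract structured
information from unstructured machine-readable documents. Information
Retrieval is the related task of finding the documents relevant to a query
within a large collection.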
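As a minimal sketch of extracting structured information from unstructured text, the Python example below pulls author and year pairs out of a running sentence using a regular expression. The pattern, the record layout and the function name `extract_citations` are assumptions chosen for illustration. Practical extraction systems rely on much richer linguistic analysis than a single pattern.

```python
import re

# Assumed pattern: a capitalized surname followed by a four-digit year in parentheses.
CITATION = re.compile(r"([A-Z][a-z]+) \((\d{4})\)")


def extract_citations(text):
    """Return a list of {'author', 'year'} records found in free text."""
    return [
        {"author": author, "year": int(year)}
        for author, year in CITATION.findall(text)
    ]


text = "Grishman (1994) defines the field, and Shrivastava (1995) groups its applications."
print(extract_citations(text))
# [{'author': 'Grishman', 'year': 1994}, {'author': 'Shrivastava', 'year': 1995}]
```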
v) Text Summarization: - This is the task of producing a summary of a text or
document by a computer program. The summary should contain all the significant
and key points of the original document.
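One simple way to approximate this task is extractive summarization, which selects the highest-scoring sentences of the original text instead of writing new ones. The Python sketch below scores each sentence by the frequency of its words in the whole text. The stopword list, the scoring scheme and the function name `summarize` are assumptions for illustration, and this is only one of several possible techniques.

```python
import re
from collections import Counter

# Small assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "is", "and", "to", "in", "it", "are"}


def content_words(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        # A sentence scores higher when it contains frequent content words.
        return sum(freq[w] for w in content_words(sentence))

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


text = "Language is a system. Linguistics studies language and language structure. Birds sing."
print(summarize(text))
# Linguistics studies language and language structure.
```

Frequency-based scoring favors long sentences that repeat common words, so it gives no guarantee that every key point of the original is covered.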
vi) Computational Lexicography and Computational Lexicology: - Computational
Lexicography is concerned with the use of computers in making
dictionaries/lexicons, while Computational Lexicology is concerned with the use
of computers in the study of dictionaries/lexicons.
vii) Computer Assisted Language Learning: - This is concerned with providing
courseware and additional material for language learning and teaching.
viii) Question Answering Systems: - The ability of a system to automatically
answer a question asked in natural language is a major task in IR and NLP. The
development of such systems is also an application field.
ix) Language Reading and Writing Aids: - Application software systems that
assist with reading and writing are also being developed.
x) Voice Recognizer: - Voice Recognizers are also being developed for various
tasks, and many types are already available.
5.0 CONCLUDING REMARKS
In this way we see that linguistic knowledge of a language is necessary for
many tasks in our life. We sometimes use this knowledge directly, and we also
use it to learn more about all the aspects of language. So linguistics has
various types of applications.
Today, technology is a demand of our faster and smarter life, and it has
rapidly made its place in our daily work. All the sciences have therefore
applied their theoretical knowledge in this field. Linguistics also has many
Technological Applications of various types, which are significant at the
global level.
6.0 REFERENCES
1. Chaitanya, Vineet and Sangal, Rajeev (2000) Natural Language Processing: A
Paninian Perspective, New Delhi: Prentice-Hall of India Private Limited
2. Grishman, Ralph (1994) Computational Linguistics: An Introduction, Cambridge
University Press
3. Malhotra, Vijay Kumar (2002) Computer ke Bhaashik Anupryog, New Delhi: Vani
Prakashan
4. Shrivastava, R.N. and others (1995) Anupryukt Bhashavigyaan, New Delhi:
Aalekh Publication
5. Bolshakov, Igor A. and Gelbukh, Alexander (2004) Computational Linguistics:
Models, Resources, Applications, Instituto Politecnico Nacional
7.0 WEBSITES