Plaesarn: Machine Aided Translation Tool for English-to-Thai
Assoc. Prof. Dr. Asanee Kawtrakul (อัศนีย์ ก่อตระกูล)
1. Abstract
Internet technology now plays an important role in Thailand's
development. Because most of the information and knowledge available
on the Internet is written in English, Thai people are slow to absorb
it. The major aim of this engineering project is to develop a web-based
software tool that assists document translation from English to Thai,
using the Structural Transfer Model as its mechanism. The system
architecture comprises two parts: a server part and a client part. The
server part consists of four subsystems: the Syntactic Analysis System,
the Structural Transformation System, the Sentence Generation System,
and the Linguistic Knowledge Acquisition System. The client part is an
Internet browser. The system's primary advantage is that it helps Thai
people read English documents more easily. It has two major features:
when an English sentence has several possible translations, the user
can manually select the most appropriate one, and the user can manually
teach the system a new translation rule when necessary. The principal
limitation is that the system translates English sentences without
handling semantic ambiguity. In this project, three translation
problems are addressed, namely word rearrangement, verbal phrase
translation, and Thai noun classifier identification; semantic
ambiguity, in contrast, remains unresolved. The system was evaluated on
the Future Magazine Corpus and achieved 75% translation accuracy.
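As a rough illustration of the Thai noun classifier problem mentioned above, the sketch below renders an English counted noun as a Thai noun, number, and classifier, in that order. The lookup tables and the function name are hypothetical and are not taken from Plaesarn; the campus entry mirrors the phrase that appears in the output paragraph of Figure 4.

```python
# Hypothetical lexicon: English plural noun -> Thai noun.
NOUNS = {"campuses": "วิทยาเขต", "books": "หนังสือ", "students": "นักเรียน"}

# Hypothetical classifier table: Thai noun -> classifier used when counting.
# Some nouns, such as "วิทยาเขต" in Figure 4, act as their own classifier.
CLASSIFIERS = {"วิทยาเขต": "วิทยาเขต", "หนังสือ": "เล่ม", "นักเรียน": "คน"}


def translate_counted_noun(number, english_noun):
    """Render 'NUMBER NOUN' as Thai 'NOUN NUMBER CLASSIFIER'."""
    thai_noun = NOUNS[english_noun]        # KeyError for unknown nouns
    classifier = CLASSIFIERS[thai_noun]    # classifier identification step
    return f"{thai_noun} {number} {classifier}"


print(translate_counted_noun(7, "campuses"))
```

A table lookup is the simplest possible choice here; how the real system identifies classifiers is not described in this abstract.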
2. Objectives
The project was developed with the following three major purposes:
- To study Automatic Machine Translation theories and existing
Machine Translation systems
- To study the problems that occur in English-to-Thai translation
- To design and implement a software tool to assist English-to-Thai
translation
3. System Architecture
The Machine Translation System comprises four principal components,
namely:
- Syntactic Analysis: This stage analyzes every sentence of the
source documents using grammatical knowledge from the Linguistic
Knowledge Base. It then transforms each sentence into a parse tree,
which is chosen as the sentence representation because it reduces
the complexity of translation.
- Structural Transformation: This stage transforms a source language
parse tree into a target language parse tree using grammatical
knowledge from the Linguistic Knowledge Base.
- Sentence Generation: This stage generates a target language
sentence from the parse tree. It also relies on grammatical
knowledge from the Linguistic Knowledge Base.
- Linguistic Knowledge Acquisition: This stage provides tools for
training the system with translation knowledge.
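The first three stages above can be sketched as a small pipeline. This is a minimal toy under stated assumptions, not the Plaesarn implementation: the `Node` class, the two-word lexicon, the single reordering rule, and all function names are hypothetical stand-ins for the Linguistic Knowledge Base and the subsystems.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                    # tag such as "NP", "ADJ", "N"
    children: list = field(default_factory=list)
    word: str = ""                                # set only on leaf nodes


# Hypothetical knowledge base: English word -> (tag, Thai word).
LEXICON = {"red": ("ADJ", "แดง"), "car": ("N", "รถ")}

# Hypothetical transfer rule: English ADJ N becomes Thai N ADJ.
REORDER_RULES = {("ADJ", "N"): (1, 0)}


def analyze(sentence):
    """Syntactic analysis: build a flat NP parse tree from known words."""
    leaves = [Node(LEXICON[w][0], word=w) for w in sentence.lower().split()]
    return Node("NP", leaves)


def transform(tree):
    """Structural transformation: reorder children when a rule matches."""
    pattern = tuple(child.label for child in tree.children)
    order = REORDER_RULES.get(pattern, range(len(tree.children)))
    return Node(tree.label, [tree.children[i] for i in order])


def generate(tree):
    """Sentence generation: emit Thai words (written without spaces)."""
    return "".join(LEXICON[leaf.word][1] for leaf in tree.children)


def translate(sentence):
    return generate(transform(analyze(sentence)))


print(translate("red car"))
```

The reordering rule is keyed on the sequence of child tags, so adding a rule means adding one dictionary entry. That is one simple way to picture how newly taught rules could extend the knowledge base without changing the pipeline code.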
4. Figures
Kasetsart University was established
on 2 February 1943 with the
prime aims in promoting subjects
related to agricultural sciences.
The university revised its
curricula and expanded the
subject areas to cover science,
arts, social science, humanity,
education, engineering,
and architecture.
Recently, the university
made an attempt to include
medicine and health science.
Kasetsart University has
established 7 campuses distributed
scatterly to cover all regions
of Thailand.
At present, the number of
enrolled students at all
levels of study is 23,000.
Figure 1: The input paragraph
Figure 2: This is the homepage
of the translation service
Figure 3: The user can manually
choose the most appropriate translation.
Kasetsart University ถูกก่อตั้งเมื่อ
2 February 1943 ด้วยจุดประสงค์หลักในการสนับสนุนวิชาซึ่งถูกเกี่ยวข้องถึงวิทยาศาสตร์การเกษตร
มหาวิทยาลัยทบทวนหลักสูตรและขยายพื้นที่วิชาเพื่อครอบคลุมวิทยาศาสตร์,
ศิลปศาสตร์, สังคมวิทยาศาสตร์,
มนุษยศาสตร์, การศึกษา, วิศวกรรม,
และสถาปัตยกรรม
เมื่อเร็ววันนี้, มหาวิทยาลัยทำความพยายามเพื่อรวมแพทยศาสตร์และวิทยาศาสตร์สุขภาพ
Kasetsart University ได้ก่อตั้งวิทยาเขต
7 วิทยาเขตซึ่งถูกกระจายอย่างกระจัดกระจายเพื่อครอบคลุมพื้นที่ทั้งหมดของ
Thailand
ที่ปัจจุบัน, จำนวนของนักเรียนที่เข้าเรียนที่ระดับชั้นทั้งหมดของการศึกษาเป็น
23,000
Figure 4: The output paragraph
Figure 5: The user can manually teach the system new translation
rules.