Scopus EID: 2-s2.0-84966862607

Experiment on a phrase-based statistical machine translation using PoS Tag information for Sundanese into Indonesian

Suryani A.A. (a), Widyantoro D.H. (a), Purwarianti A. (a), Sudaryat Y. (b)

(a) Sekolah Teknik Elektro Dan Informatika-ITB, Bandung, Indonesia
(b) Fakultas Pendidikan Bahasa Dan Seni-UPI, Bandung, Indonesia

Abstract

© 2015 IEEE. This paper discusses Sundanese-to-Indonesian text translation, a low-resource language pair. The size of the parallel corpus has a significant impact on statistical machine translation, yet to date no ready-to-use Sundanese-Indonesian parallel corpus exists. We therefore apply PoS tags, rather than surface forms alone, in the translation model to obtain better translation results. The experiment was conducted to obtain early results for Sundanese-to-Indonesian text translation and to identify the problems that arise. The results show that the model using surface forms and PoS tags slightly outperformed the model using only surface forms. However, the experiment faced two problems: a large number of out-of-vocabulary (OOV) words, caused by the limited size of the parallel corpus, and improper phrase translations, caused by noise in the parallel corpus such as typos and inconsistent spelling of words in the Sundanese corpus.

Author keywords

Bleu scores, Language model, Low resource languages, Phrase translations, Phrase-based machine translations, Phrase-based statistical machine translation, Statistical machine translation, Translation models

Indexed keywords

bleu score, language model, phrase-based machine translation, PoS Tag, translation model

DOI

https://doi.org/10.1109/ICITSI.2015.7437678