Data augmentation on spontaneous Indonesian automatic speech recognition using statistical machine translation
Hadiwinoto P.N. (a), Lestari D.P. (a)
(a) School of Electrical Engineering and Informatics, Bandung Institute of Technology, Bandung, Indonesia
Abstract

© Published under licence by IOP Publishing Ltd.

The language model plays an important role in the decoding process of automatic speech recognition. For Indonesian automatic speech recognition, accuracy on spontaneous speech is still very low compared to dictated speech, owing to the scarcity of spontaneous data. Collecting spontaneous data is difficult, so one candidate solution is to augment the existing spontaneous data. In this research, experiments are conducted on language models to improve the accuracy of spontaneous Indonesian speech recognition through data augmentation. The augmentation is performed with the statistical machine translation system ‘Moses’. The language models are n-gram models, and GMM-HMM is used for acoustic modeling. First, a spontaneous text corpus is added to the text corpus; then data augmentation is conducted. When the language model is built with the added spontaneous text corpus, accuracy increases by 3.59% relative to the baseline. When the language model is built with data augmentation, accuracy increases by 2.74% relative to the baseline. The gain from augmentation is smaller, but this decrease is considered insignificant compared to the effort required to collect spontaneous data manually.

DOI: https://doi.org/10.1088/1757-899X/803/1/012030
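To illustrate the idea in the abstract, the following is a minimal sketch of how adding spontaneous-style text to the training corpus can change an n-gram language model. It is not the authors' implementation: it uses a bigram model with add-one smoothing (the abstract does not state the n-gram order or smoothing method), it does not call Moses, and the sentences in `formal_corpus` and `augmented_sentences` are invented toy data standing in for a real text corpus and for machine-translated output. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their histories over whitespace-tokenized sentences."""
    history_counts, bigram_counts, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        history_counts.update(tokens[:-1])            # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return history_counts, bigram_counts, vocab


def perplexity(model, sentence):
    """Perplexity of one sentence under an add-one-smoothed bigram model."""
    history_counts, bigram_counts, vocab = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab_size = len(vocab) + 1                       # +1 reserves mass for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigram_counts[(prev, word)] + 1) / (history_counts[prev] + vocab_size)
        log_prob += math.log(prob)
    return math.exp(-log_prob / (len(tokens) - 1))


# Toy data (assumption): formal sentences and spontaneous-style variants.
formal_corpus = ["saya tidak tahu", "saya sudah makan"]
augmented_sentences = ["gue nggak tahu", "gue udah makan"]

baseline_model = train_bigram(formal_corpus)
augmented_model = train_bigram(formal_corpus + augmented_sentences)

test_sentence = "gue nggak tahu"
baseline_ppl = perplexity(baseline_model, test_sentence)
augmented_ppl = perplexity(augmented_model, test_sentence)
print(f"baseline perplexity:  {baseline_ppl:.2f}")
print(f"augmented perplexity: {augmented_ppl:.2f}")
```

On this toy data the model trained with the added sentences assigns lower perplexity to the spontaneous test sentence. Lower language model perplexity does not by itself guarantee higher recognition accuracy; the accuracy figures in the abstract come from full recognition experiments.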