Text Corpus and Acoustic Model Addition for Indonesian-Arabic Code-switching in Automatic Speech Recognition System
Barik R.E. (a), Lestari D.P. (a)
(a) School of Electrical Engineering and Informatics, Bandung Institute of Technology, Bandung, Indonesia

Abstract
© 2019 IEEE. Code-switching is a common phenomenon in daily conversation, especially in Indonesia. In this Muslim-majority country, Arabic occurs quite frequently in conversation. However, no special handling has yet been applied to minimize the errors caused by Indonesian-Arabic code-switching. Such handling can be done at several levels. In the lexicon, Arabic vocabulary is added. Code-switching sentences are also added to the training text corpus to improve the language model. In the acoustic model, we merge Indonesian and Arabic phones according to the IPA (International Phonetic Association) rule to improve acoustic model performance. Recognition of Indonesian-Arabic code-switched speech achieved a WER (word error rate) improvement of 20.87% over the baseline system. The out-of-vocabulary (OOV) rate and perplexity of the system also improved.

Author keywords
Acoustic model, Arabic languages, Automatic speech recognition system, Baseline systems, Code-switching, Language model, Text corpora, Word error rate

Indexed keywords
acoustic model, ASR, code-switching, language model, lexicon

Funding details
Baseline data, audio recording apparatus, and training environment were provided by PT. Prosa Solusi Cerdas. This research was partially funded by “Program Penelitian, Pengabdian kepada Masyarakat dan Inovasi (P3MI) Kelompok Keahlian ITB”.

DOI
https://doi.org/10.1109/ICAICTA.2019.8904183