Indonesian automatic speech recognition system using English-based acoustic model
Ferdiansyah V. (a), Purwarianti A. (a)
(a) Sekolah Teknik Elektro dan Informatika, Institut Teknologi Bandung, Indonesia

Abstract
Building an automatic speech recognizer (ASR) requires an acoustic model, a language model, and a lexicon for the target language, and Indonesian ASR is no exception. Unfortunately, unlike the language model and the lexicon, an acoustic model is expensive to provide, because one has to record many utterances from several speakers to build a speaker-independent ASR. In our research, we attempted to build an Indonesian ASR without providing an Indonesian acoustic model directly. Instead, we used an English acoustic model and mapped the English phonemes to Indonesian ones. There are 39 English phonemes and 29 Indonesian phonemes. For Indonesian phonemes with no corresponding English phoneme, we made an estimate; for example, "ny" is mapped to "n" and "y". The lexicon contains 9,509 Indonesian words, each equipped with its corresponding English phonemes. The English acoustic model is 5,523 KB in size, and the Indonesian language model is built from 405 KB. The Indonesian ASR was built by customizing Sphinx (a Hidden Markov Model based ASR tool) with the Indonesian lexicon and the Indonesian language model. The goal of this paper is to compare the system's accuracy with that of existing Indonesian ASR systems that use an Indonesian acoustic model.
© 2011 IEEE.

Author keywords
Acoustic model, Automatic speech recognition, Automatic speech recognition system, Automatic speech recognizers, English-Indonesian phoneme mapping, Language model

Indexed keywords
English acoustic model, English-Indonesian phoneme mapping, Indonesian automatic speech recognition

DOI
https://doi.org/10.1109/ICEEI.2011.6021583
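
The phoneme mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that Indonesian spelling is close enough to phonemic for a greedy longest-match over letters to work, and that the English phonemes are written as ARPAbet symbols in the style of the CMU Pronouncing Dictionary. Only the "ny" to "n" + "y" entry comes from the abstract; every other table entry, and the names `ID_TO_EN`, `to_english_phonemes`, and `lexicon_line`, are hypothetical.

```python
# Hypothetical Indonesian-to-English phoneme table (ARPAbet-style symbols).
# Only the "ny" entry is taken from the abstract; all others are assumptions.
ID_TO_EN = {
    # Digraphs: tried first by the longest-match loop below.
    "ny": ["N", "Y"],  # from the abstract: "ny" is mapped to "n" and "y"
    "ng": ["NG"],      # assumption
    "sy": ["SH"],      # assumption
    "kh": ["HH"],      # assumption
    # Vowels (assumptions).
    "a": ["AA"], "e": ["AH"], "i": ["IY"], "o": ["OW"], "u": ["UW"],
    # Consonants (assumptions).
    "b": ["B"], "c": ["CH"], "d": ["D"], "f": ["F"], "g": ["G"],
    "h": ["HH"], "j": ["JH"], "k": ["K"], "l": ["L"], "m": ["M"],
    "n": ["N"], "p": ["P"], "r": ["R"], "s": ["S"], "t": ["T"],
    "w": ["W"], "y": ["Y"], "z": ["Z"],
}


def to_english_phonemes(word):
    """Convert an Indonesian word to a list of English phoneme symbols."""
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        # Prefer a two-letter unit ("ny", "ng", ...) over a single letter.
        for size in (2, 1):
            chunk = word[i:i + size]
            if len(chunk) == size and chunk in ID_TO_EN:
                phones.extend(ID_TO_EN[chunk])
                i += size
                break
        else:
            raise ValueError(f"no mapping for {word[i]!r} in {word!r}")
    return phones


def lexicon_line(word):
    """Format one lexicon entry as 'word PHONE PHONE ...'."""
    return f"{word.lower()} {' '.join(to_english_phonemes(word))}"


if __name__ == "__main__":
    for w in ["nyanyi", "bunga"]:
        print(lexicon_line(w))
```

Each resulting line pairs an Indonesian word with a sequence of English phonemes, which is the kind of entry the abstract describes for its 9,509-word lexicon. A real table would need to be tuned by ear, since a single letter such as "e" can stand for more than one Indonesian vowel.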