2-s2.0-84960981964

Acoustic and language models adaptation for Indonesian spontaneous speech recognition

Lestari D.P.a, Irfani A.a

a Department of Informatics Engineering, Bandung Institute of Technology, Bandung, Indonesia

Abstract

© 2015 IEEE. The performance of Indonesian automatic speech recognition decreases significantly on spontaneous speech. Spontaneous speech differs from read speech in both its acoustics and its language rules. In spontaneous speech, pronunciation and expression vary with the speaker's fluency and the topic. Disfluencies disrupt otherwise fluent sentences and often violate the rules of the formal language. To improve the recognition of spontaneous speech by an Indonesian automatic speech recognizer, several model enhancement methods were applied: adding spontaneous data and retraining both the acoustic model and the language model on those data; adapting the acoustic model using maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) approaches; and adapting the language model using linear interpolation. Experimental results show that all methods improve the recognizer's ability to recognize spontaneous data. However, all methods decreased the accuracy of read speech recognition. On average, retraining both the acoustic and language models on a combination of read and spontaneous data is more effective than model adaptation. An absolute accuracy improvement of 28.34% is achieved after retraining both the language model and the acoustic model on a combination of read and spontaneous data.

Author keywords

Automatic speech recognition, Linear interpolation, Maximum a posteriori, Maximum likelihood linear regression, Spontaneous speech

Indexed keywords

Indonesian automatic speech recognition, Linear interpolation adaptation, Maximum a posteriori, Maximum likelihood linear regression, Spontaneous speech

DOI

https://doi.org/10.1109/ICAICTA.2015.7335375
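
The abstract names language model linear interpolation but does not give its formula or settings. As a rough illustration only, the textbook form mixes a read-speech model and a spontaneous-speech model as P(w) = lam * P_spont(w) + (1 - lam) * P_read(w). The sketch below applies this to toy unigram tables; the function name `interpolate`, the weight `lam`, and all probabilities are hypothetical and are not taken from the paper.

```python
def interpolate(p_read, p_spont, lam):
    """Linearly interpolate two word-probability tables.

    lam is the weight on the spontaneous-speech model (assumed name);
    words missing from one table are treated as probability 0 there.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    vocab = set(p_read) | set(p_spont)
    return {
        w: lam * p_spont.get(w, 0.0) + (1.0 - lam) * p_read.get(w, 0.0)
        for w in vocab
    }


# Toy unigram models with made-up probabilities (not from the paper).
p_read = {"saya": 0.5, "makan": 0.5}
p_spont = {"saya": 0.4, "makan": 0.2, "eh": 0.4}

mixed = interpolate(p_read, p_spont, lam=0.5)
```

Because each input table sums to 1 and the weights sum to 1, the mixed table is again a probability distribution. In practice the weight would typically be tuned on held-out spontaneous text, and the same mixing is applied to n-gram probabilities conditioned on a history rather than to unigrams.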