
Scopus EID: 2-s2.0-84961160763


Filled pause detection in Indonesian spontaneous speech

Sani A., Lestari D.P., Purwarianti A.

Institut Teknologi Bandung, Bandung, Indonesia

Abstract

© Springer Science+Business Media Singapore 2016.

Detecting filled pauses is important for spontaneous speech recognition, since most speech is spontaneous and the filled pause is the most frequent disfluency phenomenon in Indonesian spontaneous speech. This paper discusses the detection of filled pauses in Indonesian spontaneous speech using acoustic features of the speech signal. Detection was performed with statistical classifiers: Naïve Bayes, Classification Tree, and Multilayer Perceptron. To build the models, speech data were collected from an entertainment program; word segments in the data were labeled and their features extracted, namely formant stability, pitch stability, energy drop, and duration. Half an hour of speech, containing 295 filled-pause and 2,082 non-filled-pause words, was used as training data.
Using 25 sentences as test data, Naïve Bayes gave the best detection accuracy: 74.35% on a closed data set and 71.43% on an open data set.

Author keywords

Acoustic features, Classification trees, Filled pause, Speech signals, Spontaneous speech, Spontaneous speech recognition, Testing data, Training data

Indexed keywords

Acoustic, Filled pause, Spontaneous speech

DOI

https://doi.org/10.1007/978-981-10-0515-2_4
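The abstract's approach — classifying word segments as filled pause or not from four acoustic features (formant stability, pitch stability, energy drop, duration) with a Naïve Bayes model — can be sketched as below. This is a minimal illustration, not the paper's actual pipeline: the feature values are synthetic, the class means are invented for the sketch, and the Gaussian Naïve Bayes is hand-rolled rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-dim feature vectors per word segment:
# [formant_stability, pitch_stability, energy_drop, duration].
# Filled pauses are modeled (illustratively) as long, spectrally
# stable segments; the cluster centers are made up for this sketch.
filled = rng.normal([0.9, 0.9, 0.7, 0.45], 0.05, size=(50, 4))
non_filled = rng.normal([0.4, 0.3, 0.3, 0.15], 0.05, size=(50, 4))

X = np.vstack([filled, non_filled])
y = np.array([1] * 50 + [0] * 50)  # 1 = filled pause, 0 = other word

def fit_gnb(X, y):
    """Estimate per-class feature means, variances, and class priors."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return params

def predict_gnb(params, x):
    """Return the class maximising log prior + Gaussian log-likelihood."""
    scores = {}
    for c, (mu, var, prior) in params.items():
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        scores[c] = np.log(prior) + log_lik
    return max(scores, key=scores.get)

params = fit_gnb(X, y)
preds = np.array([predict_gnb(params, x) for x in X])
accuracy = (preds == y).mean()
```

On these well-separated synthetic clusters the classifier recovers the labels almost perfectly; the paper's reported 71–74% accuracy on real speech reflects how much harder the features overlap in practice.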