Automatic rhetorical sentence categorization on Indonesian meeting minutes
Rachman G.H. (a), Khodra M.L. (a)
(a) School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia

Abstract
© 2016 IEEE. Meeting minutes contain much important information about a meeting. Because meeting minutes are unstructured documents, every sentence should be classified so that this information can be retrieved and summarized easily. Some work in this area has been done for meeting minutes in English, but none yet for Indonesian. This paper therefore presents rhetorical sentence categorization for Indonesian meeting minutes using the following features: length, position, previous label, significant terms, and cue phrases per class. It also reports the results of applying SMOTE and resampling to balance the number of instances per class. Every experiment is run with four classifiers: Naïve Bayes, linear SVM, IBk, and the J48 tree. The results show that using the previous label together with both significant terms and cue phrases per class improves performance, and that resampling performs better than SMOTE. Under 10-fold cross-validation with the IBk classifier, the model using SMOTE achieved an F-measure of 85.22% and the resampling model achieved 94.52%.

Author keywords
10-fold cross-validation, Cue phrase, F measure, Indonesian meeting minutes, Resampling, Rhetorical categorization, Unstructured documents

Indexed keywords
Balancing data, Feature extraction, Indonesian meeting minutes, Rhetorical categorization

DOI
https://doi.org/10.1109/ICODSE.2016.7936103
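
The abstract does not say how the resampling step was implemented. As an illustration only, the sketch below shows one common form of class balancing, random oversampling with replacement, in plain Python. The function name `oversample`, the toy feature vectors, and the class labels are all hypothetical and are not taken from the paper; the authors' actual procedure may differ.

```python
import random
from collections import Counter, defaultdict


def oversample(instances, labels, seed=0):
    """Duplicate randomly chosen minority-class instances until every
    class has as many instances as the largest class.

    Illustrative assumption: this is one possible reading of "resampling",
    not the procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group instances by class label.
    by_class = defaultdict(list)
    for x, y in zip(instances, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_x, out_y = [], []
    for y, items in by_class.items():
        # Draw the missing instances with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: (sentence length, relative position) pairs
# with made-up rhetorical labels.
X = [(12, 0.1), (8, 0.2), (15, 0.5), (9, 0.6), (20, 0.9)]
y = ["opening", "opening", "opening", "decision", "closing"]

X_bal, y_bal = oversample(X, y)
print(Counter(y_bal))
```

After balancing, each class contributes the same number of instances to the training set. SMOTE differs in that it synthesizes new minority-class points by interpolating between neighbouring instances rather than duplicating existing ones.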