Modified weighting method in TF*IDF algorithm for extracting user topic based on email and social media in integrated digital assistant
Pramono L.H.a, Rohman A.S.a, Hindersah D.H.a
a Electrical Engineering Department, School of Electrical Engineering and Informatics, Bandung Institute of Technology, Indonesia
Abstract
The Integrated Digital Assistant (IDA) is a system designed to act as a 'personal secretary' that works full time for the user. IDA stays active while the user is relaxing at home, working at the office, traveling, or engaged in outdoor activities, and it aims to minimize the interaction required between user and system. The system finds external information that the user needs by identifying the user's topics from email and social media data. To search for and extract user interests or topics from this data, IDA uses a modified TF*IDF weighting algorithm named TF*IDF*DF, which extends the TF*IDF method. The modified weighting is expected to yield topics that are more representative and better matched to the information the user needs. With TF*IDF*DF extraction, the number of extracted terms (words) whose document frequency (df) is greater than one increases. On the other hand, the computational load also increases because of the additional df multiplier. News retrieved on the basis of topics extracted with TF*IDF*DF increases in number and is more diverse. The extracted terms still contain noisy text that does not follow standard written grammar; this needs to be fixed so that the extracted terms are cleaner.
© 2013 IEEE.

Author keywords
Computational loads, Digital assistants, Document frequency, Social media datum, TF-IDF, Topic extraction, Topic Modeling, user topic

Indexed keywords
feature selection, TF-IDF, topic extraction, topic model, user topic

DOI
https://doi.org/10.1109/rICT-ICeVT.2013.6741547
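
The abstract does not state the exact TF*IDF*DF formula, so the following is only a minimal sketch of what such a weighting might look like. It assumes raw term counts for tf, idf = log(N / df), whitespace tokenization, and a weight of tf * idf * df; the function name `term_weights`, the `use_df` flag, and the sample documents are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_weights(docs, use_df=True):
    """Return one {term: weight} dict per document.

    Assumed weighting (not confirmed by the abstract):
      use_df=False -> tf * idf
      use_df=True  -> tf * idf * df
    with tf = raw count in the document and idf = log(N / df).
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # df: number of documents in which each term appears at least once
    df = Counter(term for tokens in tokenized for term in set(tokens))
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {}
        for term, count in tf.items():
            w = count * math.log(n_docs / df[term])
            if use_df:
                w *= df[term]  # extra multiplier favoring terms shared across documents
            weights[term] = w
        result.append(weights)
    return result


# Hypothetical sample messages standing in for email / social media posts.
docs = [
    "meeting budget report",
    "budget travel plan",
    "budget meeting notes",
    "holiday photos",
]
plain = term_weights(docs, use_df=False)
modified = term_weights(docs, use_df=True)
```

Under these assumptions, a term such as "meeting" (df = 2) ranks below "report" (df = 1) in the first document with plain tf * idf, but reaches the same weight once the df multiplier is applied. This is in line with the abstract's observation that more terms with df greater than one are extracted, at the cost of one extra multiplication per term.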