2-s2.0-85084034373

Optimizing Deep Learning for Detection Cyberbullying Text in Indonesian Language

Anindyati L.a, Purwarianti A.a, Nursanti A.b

a School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Indonesia
b Psychology Faculty, Universitas Yayasan Rumah Sakit Islam (YARSI), Indonesia

Abstract

© 2019 IEEE. Cyberbullying in Indonesia has become a concern due to the increasing use of social media. Cyberbullying detection is an important step toward a healthy environment for social media interactions. This research belongs to computational linguistics and focuses on using deep learning to detect bullying sentences on Twitter. The study has two main processes: first, building a word representation; second, classifying sentences to detect bullying. The pre-training process that builds the new term/word representation is performed independently, using Word2vec as the tool. Two types of data are used in the pre-training process. The first type uses only the testing and training data, while the second type is the overall data, a total of 26,800 unique Twitter sentences including the test and training data. The classification process uses three main algorithms that are popular for text classification: LSTM, bi-LSTM, and CNN. The dataset consists of 9,854 labeled sentences extracted from 2,584 Twitter conversations. In the dataset, 1,680 sentences are labeled as bully and 6,343 sentences are labeled as neutral. A total of 504 experiments are conducted by varying the preprocessing stage that determines the machine learning features, the dropout layer configuration, and the deep learning algorithm. The experiments show that accuracy reaches 90.57%, while recall for the bully class reaches 75.7%.

Author keywords

Classification process, Cyber bullying, Indonesian languages, Social media, Testing data, Text classification, Training data, Word representations

Indexed keywords

BLSTM, CNN, computational linguistic, cyberbullying, deep learning, LSTM, Text Classification

DOI

https://doi.org/10.1109/ICAICTA.2019.8904108
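
Note on the reported metrics

The abstract reports overall accuracy and bully-class recall as two separate figures. The dataset contains far fewer bully sentences than neutral ones, so the two can diverge. The Python sketch below illustrates this on toy labels. The function names and the example labels are hypothetical and are not taken from the paper; this is an illustration of the standard metric definitions, not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive):
    """True positives divided by all reference examples of the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no reference examples of the positive class")
    return tp / (tp + fn)


# Hypothetical toy data: 8 neutral sentences and 2 bully sentences.
y_true = ["neutral"] * 8 + ["bully"] * 2
# The classifier gets every neutral sentence right but misses one bully sentence.
y_pred = ["neutral"] * 8 + ["neutral", "bully"]

acc = accuracy(y_true, y_pred)            # 9 correct out of 10
bully_recall = recall(y_true, y_pred, "bully")  # 1 found out of 2

print(f"accuracy = {acc:.2f}, bully recall = {bully_recall:.2f}")
```

In this toy case a single missed bully sentence barely lowers accuracy but halves the bully-class recall, which is why a minority-class recall is worth reporting alongside accuracy.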