Comparison of distance measures for clustering data with mix attribute types for Indonesian potential-based regional grouping
Prasetyo H.ᵃ, Purwarianti A.ᵃ
ᵃ School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia

Abstract
© 2014 IEEE. Every region in Indonesia has different potentials that need to be analyzed for national development considerations. This analysis can be accomplished by clustering Indonesian regional potential data, which is collected from the PODES enumeration. This data consists of both numeric and categorical attributes. However, most clustering algorithms can be applied to either numeric or categorical data, but not to both. The k-prototypes algorithm, a clustering algorithm that can deal with mixed data types, has limitations, one of which is its distance measurement. Selecting distance measures properly is thus important to increase its performance. This paper presents a comparison of distance measures for clustering data with mixed attribute types. We applied the k-prototypes algorithm with several distance measures to the PODES11-DESA dataset and used the Silhouette index for clustering evaluation. The results show that the best clustering is achieved by applying the Ratio on Mismatches distance for categorical attributes. For numeric attributes, there is no single best-performing distance measure, since the performance of numeric distance measures varies for each treatment.

Author keywords
Categorical attributes, Clustering evaluation, clustering mix attribute types, Distance measure, K-prototype, National development, Regional groupings, Silhouette indices

Indexed keywords
clustering mix attribute types, distance measures, k-prototypes algorithm, Regional potentials

DOI
https://doi.org/10.1109/ICITSI.2014.7048230
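
The mixed-type distance that k-prototypes relies on, as described in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard k-prototypes form (a numeric distance plus a weight `gamma` times a categorical distance), uses squared Euclidean distance for the numeric part as one possible choice, and assumes that "Ratio on Mismatches" means the fraction of categorical attributes whose values differ. The function names, the `gamma` default, and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def squared_euclidean(x: Sequence[float], q: Sequence[float]) -> float:
    """Numeric part: squared Euclidean distance (one of several possible choices)."""
    return sum((a - b) ** 2 for a, b in zip(x, q))


def mismatch_ratio(x: Sequence[str], q: Sequence[str]) -> float:
    """Categorical part. ASSUMPTION: the ratio is the number of mismatching
    attributes divided by the total number of categorical attributes."""
    if not x:
        return 0.0
    return sum(a != b for a, b in zip(x, q)) / len(x)


def mixed_distance(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    q_num: Sequence[float],
    q_cat: Sequence[str],
    gamma: float = 1.0,  # hypothetical weight balancing the two parts
) -> float:
    """k-prototypes style dissimilarity: numeric distance + gamma * categorical distance."""
    return squared_euclidean(x_num, q_num) + gamma * mismatch_ratio(x_cat, q_cat)


def nearest_prototype(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    prototypes: List[Tuple[Sequence[float], Sequence[str]]],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype with the smallest mixed distance."""
    distances = [
        mixed_distance(x_num, x_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)
```

Swapping `squared_euclidean` or `mismatch_ratio` for another measure is the kind of substitution the comparison in the abstract is about; the resulting cluster assignments could then be scored with a Silhouette index computed from the same mixed distance.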