Static mapping for OpenCL workloads in heterogeneous computer systems
Rahmawan H. (a), Kuspriyanto (a), Gondokaryono Y.S. (a)
(a) School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, 40132, Indonesia
Abstract
© 2005 – ongoing JATIT & LLS. Today, heterogeneous computer systems (HCS) commonly rely on CPUs and GPUs as processing elements and on OpenCL as the programming framework. In an HCS, a workload should execute on the processor that suits it best in order to achieve its best speedup. OpenCL currently leaves the selection of the best-fit processor, termed workload mapping, entirely to programmers. However, workload mapping is NP-complete, so it is not a task that programmers can easily perform by hand, and effective computational approaches are necessary. This research proposes a static mapping method for OpenCL workloads that automatically selects the best-fit processor for each workload. The method accepts the static features of a workload and uses the K-Nearest Neighbor algorithm to classify the workload as either CPU or GPU. The static features are collected using the LLVM/Clang compiler framework. To increase classification accuracy while preserving the physical meaning of the features, the feature set is reduced using feature selection approaches. Two feature selection models, the filter model and the wrapper model, are used in this research. The approach was evaluated using k-fold cross-validation on 18 OpenCL kernels obtained from standard benchmark packages. According to the evaluation results, the workload mapping accuracy was in the range of 93% to 97%, indicating that the method is applicable to a heterogeneous computing environment with two processors.
Floating-point operations combined with vector-integer operations, or floating-point operations combined with vector-global memory access, are the feature combinations that contribute significantly to the classification of workloads. The main contribution of the method in this research, compared to previous related research, lies in its capability to state which features are significant in the classification process.

Indexed keywords
GPU, Heterogeneous computing, K-nearest neighbor, OpenCL, Workload mapping
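
The pipeline described in the abstract (static features, K-Nearest Neighbor classification to CPU or GPU, wrapper-model feature selection, k-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the toy feature values, the fold assignment, and the choice of k = 3 are all assumptions made for illustration, and the LLVM/Clang feature extraction step is omitted entirely.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy data: (static feature vector, best device). The values are
# invented for illustration and are NOT taken from the paper or its benchmarks.
FEATURES = ["float_ops", "vector_int_ops", "global_mem_access", "branches"]
KERNELS = [
    ([0.90, 0.80, 0.30, 0.50], "GPU"),
    ([0.80, 0.70, 0.60, 0.20], "GPU"),
    ([0.85, 0.90, 0.40, 0.80], "GPU"),
    ([0.70, 0.75, 0.50, 0.40], "GPU"),
    ([0.20, 0.10, 0.40, 0.60], "CPU"),
    ([0.10, 0.30, 0.50, 0.30], "CPU"),
    ([0.30, 0.20, 0.30, 0.70], "CPU"),
    ([0.15, 0.25, 0.60, 0.45], "CPU"),
]


def knn_predict(train, x, cols, k=3):
    """Majority vote among the k training kernels nearest to x (Euclidean)."""
    def dist(a):
        return math.sqrt(sum((a[c] - x[c]) ** 2 for c in cols))

    nearest = sorted(train, key=lambda item: dist(item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, cols, folds=4, k=3):
    """k-fold cross-validation accuracy using only the feature columns in cols."""
    correct = 0
    for f in range(folds):
        # Assumed fold assignment: every folds-th kernel goes to the test fold.
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, cols, k) == y for x, y in test)
    return correct / len(data)


def wrapper_select(data, n_features, max_size=2):
    """Wrapper-model selection: score each small feature subset by CV accuracy."""
    best_cols, best_acc = None, -1.0
    for size in range(1, max_size + 1):
        for cols in combinations(range(n_features), size):
            acc = cv_accuracy(data, cols)
            if acc > best_acc:  # keep the first subset reaching the best score
                best_cols, best_acc = cols, acc
    return best_cols, best_acc


best_cols, best_acc = wrapper_select(KERNELS, len(FEATURES))
print([FEATURES[c] for c in best_cols], best_acc)
```

Because the wrapper search scores subsets of the original features rather than projecting them into a new space, the selected columns keep their physical meaning, which is the property the abstract emphasizes. A filter model would instead rank features by a statistic computed independently of the classifier.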