Study of data imbalance and asynchronous aggregation algorithm on federated learning system
Diwangkara S.S.ᵃ, Kistijantoro A.I.ᵃ
ᵃ Institut Teknologi Bandung, School of Electrical Engineering and Informatics, Bandung, Indonesia
Abstract
© 2020 IEEE. As the use of machine learning techniques becomes more widespread, the need for larger and more elaborate datasets grows with it. Such datasets are usually built with collection methods that pay little to no attention to the data owners' privacy and consent. Federated learning is an approach that addresses this problem: the system trains a machine learning model without centrally storing the training data. One weakness of current implementations, however, is slow convergence, even though they distribute the training task across many nodes. This is mainly caused by the synchronous nature of the current aggregation algorithm. In this paper, we observe the effect an asynchronous aggregation algorithm has on convergence time and test two factors that might affect it, staleness and data imbalance, at various levels. We implement the asynchronous aggregation algorithm by adapting the Stale Synchronous Parallel algorithm. We test our system on the MNIST dataset and find that the asynchronous aggregation algorithm improves convergence time in a federated learning system that has a large inequality in server-wise update frequency and a relatively balanced data distribution.

Author keywords
Aggregation algorithms, Convergence time, Data collection methods, Data distribution, Federated learning systems, Machine learning models, Machine learning techniques, Slow convergence

Indexed keywords
Asynchronous, Distributed system, Distributed training, Federated learning, Machine learning, Non-IID

DOI
https://doi.org/10.1109/ICITSI50517.2020.9264958
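The abstract describes adapting the Stale Synchronous Parallel (SSP) algorithm into an asynchronous aggregation rule: the server applies client updates as they arrive rather than waiting at a synchronization barrier, but bounds how stale an update may be. The Python sketch below illustrates that idea only; every name in it (StaleSyncServer, staleness_bound, the 1/(1 + staleness) damping factor) is an assumption made for illustration, not the authors' actual implementation.

import numpy as np

class StaleSyncServer:
    """Hypothetical SSP-style asynchronous aggregator (illustrative only)."""

    def __init__(self, init_weights, staleness_bound=3, lr=1.0):
        self.weights = init_weights          # global model parameters
        self.version = 0                     # global logical clock
        self.staleness_bound = staleness_bound
        self.lr = lr

    def pull(self):
        # A client fetches the current model together with its version,
        # so the server can later measure how stale the client's update is.
        return self.weights.copy(), self.version

    def push(self, update, base_version):
        # Staleness = number of global versions committed since the client
        # pulled the model it trained on.
        staleness = self.version - base_version
        if staleness > self.staleness_bound:
            # Too stale: reject, forcing the client to re-pull a fresh model.
            # This bound keeps fast clients from running unboundedly ahead.
            return False
        # Apply immediately, with no barrier; damp stale updates so older
        # gradients contribute less (an assumed weighting scheme).
        scale = self.lr / (1.0 + staleness)
        self.weights += scale * update
        self.version += 1
        return True

In this sketch a client would call pull(), train locally, then call push(local_weights - pulled_weights, base_version); a rejected push signals the client to re-pull before continuing. The staleness bound is what distinguishes SSP from a fully asynchronous scheme: updates are applied without waiting for stragglers, but only within a bounded number of global versions, which is the trade-off the paper studies against data imbalance.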