Modified Breadth-first order-based link categorization for finding financial statement documents
Jatmiko A.B.a, Widyantoro D.H.b
a Bureau of Indonesian Statistics, Jakarta, Indonesia
b School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Abstract
© 2016 IEEE. With the exponential growth of information on the World Wide Web, finding a document that contains specific information is a challenge. Many statistical documents are available on the Web, but searching for, recognizing, and retrieving them takes considerable effort and time. One solution is a web crawler. In this paper, we develop a method for finding statistical documents, specifically in the finance domain, using a web crawler based on link categorization. The method is a modified form of the breadth-first ordering strategy in which every discovered link is categorized into one of three groups: positive link, negative link, or neutral link. The objective is to identify whether a link is likely to point to relevant documents. In addition, we use keywords related to the financial domain to recognize relevant documents. In our experiments, the proposed method achieved higher precision and F-measure scores than the baseline.

Author keywords
Breadth-first, Exponential growth, F-measure scores, Financial domains, Financial statements, Relevant documents, Specific information, Statistical document

Indexed keywords
Breadth-first Ordering, Statistical document, Web Crawler

DOI
https://doi.org/10.1109/CyberneticsCom.2016.7892572
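
Illustrative sketch
The abstract describes a breadth-first crawl in which each discovered link is labeled positive, negative, or neutral. The Python sketch below is one possible reading of that idea and not the authors' implementation. Everything specific in it is an assumption made for illustration: the keyword lists, the `categorize` and `crawl_order` helpers, the rule that positive links are queued ahead of neutral ones while negative links are dropped, and the in-memory link graph that stands in for real page fetching.

```python
from collections import deque

# Hypothetical hint lists; the abstract does not give the actual categorization rules.
POSITIVE_HINTS = ("financial", "annual-report", "investor")
NEGATIVE_HINTS = ("login", "career", "contact")


def categorize(url):
    """Label a link 'positive', 'negative', or 'neutral' from its URL text."""
    lowered = url.lower()
    if any(hint in lowered for hint in POSITIVE_HINTS):
        return "positive"
    if any(hint in lowered for hint in NEGATIVE_HINTS):
        return "negative"
    return "neutral"


def crawl_order(link_graph, seed, max_pages=100):
    """Return the visit order of a breadth-first crawl over an in-memory graph.

    Assumption: among the links found on one page, positive links are
    queued before neutral ones, and negative links are never queued.
    """
    queue = deque([seed])
    seen = {seed}
    visited = []
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        links = link_graph.get(page, [])
        positives = [link for link in links if categorize(link) == "positive"]
        neutrals = [link for link in links if categorize(link) == "neutral"]
        for link in positives + neutrals:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# A toy site: each key is a page, each value is the list of links on that page.
EXAMPLE_GRAPH = {
    "site/home": ["site/about", "site/login", "site/investor/financial-report"],
    "site/about": ["site/contact", "site/news"],
    "site/investor/financial-report": ["site/investor/financial-2015.pdf"],
}

if __name__ == "__main__":
    for url in crawl_order(EXAMPLE_GRAPH, "site/home"):
        print(url)
```

On the toy graph, the login and contact pages are never visited, and the financial report page is visited before the neutral about page found on the same parent. A real crawler would also need page fetching, link extraction, and the keyword-based relevance check that the abstract mentions, none of which are shown here.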