DC Field | Value | Language |
---|---|---|
dc.contributor.author | Min-Soo Kim | - |
dc.contributor.author | Sangyeon Lee | - |
dc.contributor.author | Han, WS | - |
dc.contributor.author | Himchan Park | - |
dc.contributor.author | Jeong-Hoon Lee | - |
dc.date.accessioned | 2017-07-19T13:36:40Z | - |
dc.date.available | 2017-07-19T13:36:40Z | - |
dc.date.created | 2017-02-13 | - |
dc.date.issued | 2015-10 | - |
dc.identifier.issn | 1041-4347 | - |
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/37320 | - |
dc.description.abstract | Computing connected components is a core operation on graph data. Since billion-scale graphs cannot be resident in memory of a single server, several approaches based on distributed machines have recently been proposed. The representative methods are Hash-To-Min and PowerGraph. Hash-To-Min is the state-of-the art disk-based distributed method which minimizes the number of MapReduce rounds. PowerGraph is the-state-of-the-art in-memory distributed system, which is typically faster than the disk-based distributed one, however, requires a lot of machines for handling billion-scale graphs. In this paper, we propose an I/O efficient parallel algorithm for billion-scale graphs in a single PC. We first propose the Disk-based Sequential access-oriented Parallel processing (DSP) model that exploits sequential disk access in terms of disk I/Os and parallel processing in terms of computation. We then propose an ultra-fast disk-based parallel algorithm for computing connected components, DSP-CC, which largely improves the performance through sequential disk scan and page-level cache-conscious parallel processing. Extensive experimental results show that DSP-CC 1) computes connected components in billion-scale graphs using the limited memory size whereas in-memory algorithms can only support medium-sized graphs with the same memory size, and 2) significantly outperforms all distributed competitors as well as a representative disk-based parallel method. | - |
dc.language | English | - |
dc.publisher | IEEE | - |
dc.relation.isPartOf | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | - |
dc.title | DSP-CC: I/O Efficient Parallel Computation of Connected Components in Billion-scale Networks | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/TKDE.2015.2419665 | - |
dc.type.rims | ART | - |
dc.identifier.bibliographicCitation | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, v.27, no.10, pp.2658 - 2671 | - |
dc.identifier.wosid | 000361245300006 | - |
dc.date.tcdate | 2019-02-01 | - |
dc.citation.endPage | 2671 | - |
dc.citation.number | 10 | - |
dc.citation.startPage | 2658 | - |
dc.citation.title | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | - |
dc.citation.volume | 27 | - |
dc.contributor.affiliatedAuthor | Han, WS | - |
dc.identifier.scopusid | 2-s2.0-84941554873 | - |
dc.description.journalClass | 1 | - |
dc.description.journalClass | 1 | - |
dc.description.wostc | 1 | - |
dc.description.isOpenAccess | N | - |
dc.type.docType | Article | - |
dc.subject.keywordAuthor | Graphs | - |
dc.subject.keywordAuthor | disk-based | - |
dc.subject.keywordAuthor | parallel | - |
dc.subject.keywordAuthor | connected components | - |
dc.subject.keywordAuthor | SSD | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
library@postech.ac.kr Tel: 054-279-2548
Copyrights © by 2017 Pohang University of Science ad Technology All right reserved.