Open Access System for Information Sharing


Article
Cited 45 times in Web of Science; cited 0 times in Scopus
Full metadata record
Files in This Item:
There are no files associated with this item.
DC Field | Value | Language
dc.contributor.author | Yu, HJ | -
dc.contributor.author | Yang, J | -
dc.contributor.author | Han, JW | -
dc.contributor.author | Li, XL | -
dc.date.accessioned | 2016-04-01T08:46:40Z | -
dc.date.available | 2016-04-01T08:46:40Z | -
dc.date.created | 2009-08-05 | -
dc.date.issued | 2005-11 | -
dc.identifier.issn | 1384-5810 | -
dc.identifier.other | 2005-OAK-0000017235 | -
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/28743 | -
dc.description.abstract | Support vector machines (SVMs) have been promising methods for classification and regression analysis due to their solid mathematical foundations, which include two desirable properties: margin maximization and nonlinear classification using kernels. However, despite these prominent properties, SVMs are usually not chosen for large-scale data mining problems because their training complexity is highly dependent on the data set size. Unlike traditional pattern recognition and machine learning, real-world data mining applications often involve huge numbers of data records. Thus it is too expensive to perform multiple scans on the entire data set, and it is also infeasible to put the data set in memory. This paper presents a method, Clustering-Based SVM (CB-SVM), that maximizes the SVM performance for very large data sets given a limited amount of resource, e.g., memory. CB-SVM applies a hierarchical micro-clustering algorithm that scans the entire data set only once to provide an SVM with high quality samples. These samples carry statistical summaries of the data and maximize the benefit of learning. Our analyses show that the training complexity of CB-SVM is quadratically dependent on the number of support vectors, which is usually much less than that of the entire data set. Our experiments on synthetic and real-world data sets show that CB-SVM is highly scalable for very large data sets and very accurate in terms of classification. | -
dc.description.statementofresponsibility | X | -
dc.language | English | -
dc.publisher | SPRINGER | -
dc.relation.isPartOf | DATA MINING AND KNOWLEDGE DISCOVERY | -
dc.subject | SUPPORT VECTOR MACHINES | -
dc.title | MAKING SVMS SCALABLE TO LARGE DATA SETS USING HIERARCHICAL CLUSTER INDEXING | -
dc.type | Article | -
dc.contributor.college | Department of Computer Science and Engineering | -
dc.identifier.doi | 10.1007/S10618-005-0 | -
dc.author.google | Yu, HJ | -
dc.author.google | Yang, J | -
dc.author.google | Han, JW | -
dc.author.google | Li, XL | -
dc.relation.volume | 11 | -
dc.relation.issue | 3 | -
dc.relation.startpage | 295 | -
dc.relation.lastpage | 321 | -
dc.contributor.id | 10162777 | -
dc.relation.journal | DATA MINING AND KNOWLEDGE DISCOVERY | -
dc.relation.index | SCI-level, SCOPUS-indexed paper | -
dc.relation.sci | SCI | -
dc.collections.name | Journal Papers | -
dc.type.rims | ART | -
dc.identifier.bibliographicCitation | DATA MINING AND KNOWLEDGE DISCOVERY, v.11, no.3, pp.295 - 321 | -
dc.identifier.wosid | 000233732500005 | -
dc.date.tcdate | 2019-01-01 | -
dc.citation.endPage | 321 | -
dc.citation.number | 3 | -
dc.citation.startPage | 295 | -
dc.citation.title | DATA MINING AND KNOWLEDGE DISCOVERY | -
dc.citation.volume | 11 | -
dc.contributor.affiliatedAuthor | Yu, HJ | -
dc.description.journalClass | 1 | -
dc.description.wostc | 27 | -
dc.type.docType | Article | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
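
The abstract describes summarizing a large data set in a single scan so that an SVM trains on compact samples rather than raw records. The Python sketch below illustrates that general idea only; it is not the paper's algorithm. It assumes a flat, threshold-based clustering instead of the hierarchical structure the abstract mentions, and the names `MicroCluster`, `micro_cluster`, and `threshold` are hypothetical. Each cluster keeps a count, a linear sum, and a squared sum, which is enough to recover a centroid and a radius without storing the points.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class MicroCluster:
    """Running summary of a group of points (hypothetical helper, not from the paper)."""
    n: int              # number of points absorbed
    linear_sum: list    # per-dimension sum of coordinates
    square_sum: float   # sum of squared norms of the points

    @classmethod
    def from_point(cls, p):
        return cls(1, list(p), sum(x * x for x in p))

    def add(self, p):
        # Update the summary in place; the point itself is not stored.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, p)]
        self.square_sum += sum(x * x for x in p)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        # Root-mean-square distance of members from the centroid,
        # computed from the summary statistics alone.
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return sqrt(max(variance, 0.0))


def micro_cluster(points, threshold):
    """Single pass over the data: absorb each point into the nearest
    cluster if its centroid is within `threshold`, else open a new cluster."""
    clusters = []
    for p in points:
        best, best_d = None, None
        for c in clusters:
            d = sqrt(sum((x - y) ** 2 for x, y in zip(p, c.centroid())))
            if best_d is None or d < best_d:
                best, best_d = c, d
        if best is not None and best_d <= threshold:
            best.add(p)
        else:
            clusters.append(MicroCluster.from_point(p))
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0.1, 0), (5, 5), (5.1, 5)]
    for c in micro_cluster(data, threshold=1.0):
        print(c.n, c.centroid(), c.radius())
```

Under these assumptions, the centroids (optionally weighted by `n`) would stand in for the raw records when training a classifier, so the training cost depends on the number of clusters rather than the number of records.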


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Researcher

유환조 (YU, HWANJO)
Dept. of Computer Science & Engineering
