Open Access System for Information Sharing

Article
Cited 145 times in Web of Science; cited 0 times in Scopus
Full metadata record
DC Field | Value | Language
dc.contributor.author | Yu, HJ | -
dc.contributor.author | Han, JW | -
dc.contributor.author | Chang, KCC | -
dc.date.accessioned | 2016-04-01T08:46:41Z | -
dc.date.available | 2016-04-01T08:46:41Z | -
dc.date.created | 2009-08-05 | -
dc.date.issued | 2004-01 | -
dc.identifier.issn | 1041-4347 | -
dc.identifier.other | 2004-OAK-0000017230 | -
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/28744 | -
dc.description.abstract | Web page classification is one of the essential techniques for Web mining because classifying Web pages of an interesting class is often the first step of mining the Web. However, constructing a classifier for an interesting class requires laborious preprocessing such as collecting positive and negative training examples. For instance, in order to construct a "homepage" classifier, one needs to collect a sample of homepages (positive examples) and a sample of nonhomepages (negative examples). In particular, collecting negative training examples requires arduous work and caution to avoid bias. This paper presents a framework, called Positive Example Based Learning (PEBL), for Web page classification which eliminates the need for manually collecting negative training examples in preprocessing. The PEBL framework applies an algorithm, called Mapping-Convergence (M-C), to achieve high classification accuracy (with positive and unlabeled data) as high as that of a traditional SVM (with positive and negative data). M-C runs in two stages: the mapping stage and convergence stage. In the mapping stage, the algorithm uses a weak classifier that draws an initial approximation of "strong" negative data. Based on the initial approximation, the convergence stage iteratively runs an internal classifier (e.g., SVM) which maximizes margins to progressively improve the approximation of negative data. Thus, the class boundary eventually converges to the true boundary of the positive class in the feature space. We present the M-C algorithm with supporting theoretical and experimental justifications. Our experiments show that, given the same set of positive examples, the M-C algorithm outperforms one-class SVMs, and it is almost as accurate as the traditional SVMs. | -
dc.description.statementofresponsibility | X | -
dc.language | English | -
dc.publisher | IEEE COMPUTER SOC | -
dc.relation.isPartOf | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | -
dc.subject | Web page classification | -
dc.subject | Web mining | -
dc.subject | document classification | -
dc.subject | single-class classification | -
dc.subject | Mapping-Convergence (M-C) algorithm | -
dc.subject | SVM (Support Vector Machine) | -
dc.subject | EM | -
dc.title | PEBL: WEB PAGE CLASSIFICATION WITHOUT NEGATIVE EXAMPLES | -
dc.type | Article | -
dc.contributor.college | Department of Computer Science and Engineering | -
dc.identifier.doi | 10.1109/TKDE.2004.1264823 | -
dc.author.google | Yu, HJ | -
dc.author.google | Han, JW | -
dc.author.google | Chang, KCC | -
dc.relation.volume | 16 | -
dc.relation.issue | 1 | -
dc.relation.startpage | 70 | -
dc.relation.lastpage | 81 | -
dc.contributor.id | 10162777 | -
dc.relation.journal | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | -
dc.relation.index | SCI-level, SCOPUS-indexed paper | -
dc.relation.sci | SCI | -
dc.collections.name | Journal Papers | -
dc.type.rims | ART | -
dc.identifier.bibliographicCitation | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, v.16, no.1, pp.70 - 81 | -
dc.identifier.wosid | 000187435500007 | -
dc.date.tcdate | 2019-01-01 | -
dc.citation.endPage | 81 | -
dc.citation.number | 1 | -
dc.citation.startPage | 70 | -
dc.citation.title | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | -
dc.citation.volume | 16 | -
dc.contributor.affiliatedAuthor | Yu, HJ | -
dc.description.journalClass | 1 | -
dc.description.wostc | 95 | -
dc.type.docType | Article | -
dc.subject.keywordAuthor | Web page classification | -
dc.subject.keywordAuthor | Web mining | -
dc.subject.keywordAuthor | document classification | -
dc.subject.keywordAuthor | single-class classification | -
dc.subject.keywordAuthor | Mapping-Convergence (M-C) algorithm | -
dc.subject.keywordAuthor | SVM (Support Vector Machine) | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
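
The two-stage loop described in the abstract can be illustrated with a small one-dimensional toy. This is only a sketch under stated assumptions, not the authors' implementation: the weak-classifier rule in `mapping_stage`, the `slack` parameter, the stopping rule (stop when no new negatives are found), and the sample data are all hypothetical. The internal classifier is simplified to a 1-D hard-margin boundary (the midpoint between the closest positive and the closest known negative), which stands in for the SVM named in the abstract and assumes all negatives lie above all positives.

```python
"""Toy 1-D sketch of a Mapping-Convergence (M-C) style loop.

Assumptions (not taken from the record): the weak-classifier rule, the
`slack` parameter, the stopping rule, and the sample data are illustrative.
"""


def mapping_stage(pos, unlabeled, slack=2.0):
    """Weak classifier (hypothetical rule): unlabeled points lying far
    beyond the positive range are taken as "strong" negatives."""
    spread = max(pos) - min(pos)
    cutoff = max(pos) + slack * spread
    strong_neg = [x for x in unlabeled if x > cutoff]
    rest = [x for x in unlabeled if x <= cutoff]
    return strong_neg, rest


def max_margin_threshold(pos, neg):
    """1-D hard-margin boundary: the midpoint between the closest pair of
    opposite-class points. Assumes every negative exceeds every positive."""
    return (max(pos) + min(neg)) / 2.0


def mapping_convergence(pos, unlabeled):
    """Return (final threshold, accumulated negatives, threshold history)."""
    new_neg, rest = mapping_stage(pos, unlabeled)
    if not new_neg:
        raise ValueError("mapping stage found no strong negatives")
    neg, history = [], []
    while new_neg:
        # Grow the negative set, then retrain the max-margin boundary.
        neg.extend(new_neg)
        threshold = max_margin_threshold(pos, neg)
        history.append(threshold)
        # Unlabeled points now on the negative side become new negatives.
        new_neg = [x for x in rest if x > threshold]
        rest = [x for x in rest if x <= threshold]
    return threshold, neg, history


# Illustrative data: positives cluster in [1.0, 2.5]; the unlabeled set
# mixes points from that cluster with points further out.
POS = [1.0, 1.5, 2.0, 2.5]
UNLABELED = [1.2, 2.2, 2.4, 3.6, 4.0, 4.4, 5.0, 6.0, 7.0, 9.0]

threshold, negatives, history = mapping_convergence(POS, UNLABELED)
print("thresholds per iteration:", history)
print("negatives extracted from unlabeled data:", sorted(negatives))
```

In this toy, each pass moves the boundary closer to the positive cluster, and the loop stops once no remaining unlabeled point falls on the negative side. No negative example is labeled by hand; the negatives are drawn entirely from the unlabeled set.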

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Yu, Hwanjo (유환조)
Dept. of Computer Science & Engineering