dc.contributor.convenor: Heng Tao Shen and Athman Bouguettaya
dc.contributor.author: Kalinov, P
dc.contributor.author: Stantic, B
dc.contributor.author: Sattar, A
dc.contributor.editor: Heng Tao Shen and Athman Bouguettaya
dc.date.accessioned: 2017-05-03T11:26:27Z
dc.date.available: 2017-05-03T11:26:27Z
dc.date.issued: 2010
dc.date.modified: 2010-07-08T08:09:02Z
dc.identifier.issn: 1445-1336
dc.identifier.refuri: http://www.itee.uq.edu.au/~adc2010/
dc.identifier.uri: http://hdl.handle.net/10072/31361
dc.description.abstract: Due to the lack of in-built tools to navigate the web, people have to use external solutions to find information. The most popular of these are search engines and web directories. Search engines allow users to locate specific information about a particular topic, whereas web directories facilitate exploration over a wider topic. In the recent past, statistical machine learning methods have been successfully exploited in search engines. Web directories remained in their primitive state, which resulted in their decline. Exploration however is a task which answers a different information need of the user and should not be neglected. Web directories should provide a user experience of the same quality as search engines. Their development by machine learning methods however is hindered by the noisy nature of the web, which makes text classifiers unreliable when applied to web data. In this paper we propose Stochastic Prior Distribution Adjustment (SPDA) - a variation of the Multinomial Naive Bayes (MNB) classifier which makes it more suitable to classify real-world data. By stochastically adjusting class prior distributions we achieve a better overall success rate, but more importantly we also significantly improve error distribution across classes, making the classifier equally reliable for all classes and therefore more usable.
dc.description.peerreviewed: Yes
dc.description.publicationstatus: Yes
dc.format.extent: 185583 bytes
dc.format.mimetype: application/pdf
dc.language: English
dc.language.iso: eng
dc.publisher: Australian Computer Society
dc.publisher.place: Sydney, Australia
dc.publisher.uri: http://dl.acs.org.au/
dc.relation.ispartofstudentpublication: N
dc.relation.ispartofconferencename: Australasian Database Conference
dc.relation.ispartofconferencetitle: Conferences in Research and Practice in Information Technology Series
dc.relation.ispartofdatefrom: 2010-01-18
dc.relation.ispartofdateto: 2010-01-21
dc.relation.ispartoflocation: Brisbane, Australia
dc.relation.ispartofpagefrom: 113
dc.relation.ispartofpageto: 122
dc.relation.ispartofvolume: 104
dc.rights.retention: Y
dc.subject.fieldofresearch: Information systems organisation and management
dc.subject.fieldofresearchcode: 460908
dc.title: Building a Dynamic Classifier for Large Text Data Collections
dc.type: Conference output
dc.type.description: E1 - Conferences
dc.type.code: E - Conference Publications
gro.faculty: Griffith Sciences, School of Information and Communication Technology
gro.rights.copyright: © 2010 Australian Computer Society Inc. The attached file is posted here in accordance with the copyright policy of the publisher, for your personal use only. No further distribution permitted. Use hypertext link for access to the conference website.
gro.date.issued: 2010
gro.hasfulltext: Full Text
gro.griffith.author: Stantic, Bela
gro.griffith.author: Sattar, Abdul


This item appears in the following Collection(s)

  • Conference outputs
    Contains papers delivered by Griffith authors at national and international conferences.