Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/2557

Title: Using IR techniques to improve Automated Text Classification
Authors: Gonçalves, Teresa
Quaresma, Paulo
Keywords: machine learning
text classification
Issue Date: 2004
Publisher: Springer-Verlag
Abstract: This paper performs a study on the pre-processing phase of the automated text classification problem. We use the linear Support Vector Machine paradigm applied to datasets written in the English and the European Portuguese languages – the Reuters and the Portuguese Attorney General’s Office datasets, respectively. The study can be seen as a search, for the best document representation, in three different axes: the feature reduction (using linguistic information), the feature selection (using word frequencies) and the term weighting (using information retrieval measures).
URI: http://hdl.handle.net/10174/2557
Type: article
Appears in Collections: INF - Artigos em Livros de Actas/Proceedings

Files in This Item:

File: tcg04b-usingIR.pdf
Description: Article
Size: 126.3 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
