Repositório Digital de Publicações Científicas: Using IR techniques to improve Automated Text Classification

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/2557

Title:	Using IR techniques to improve Automated Text Classification
Authors:	Gonçalves, Teresa; Quaresma, Paulo
Keywords:	machine learning; text classification
Issue Date:	2004
Publisher:	Springer-Verlag
Abstract:	This paper studies the pre-processing phase of the automated text classification problem. We apply the linear Support Vector Machine paradigm to datasets written in English and European Portuguese: the Reuters and the Portuguese Attorney General’s Office datasets, respectively. The study can be seen as a search for the best document representation along three axes: feature reduction (using linguistic information), feature selection (using word frequencies), and term weighting (using information retrieval measures).
URI:	http://hdl.handle.net/10174/2557
Type:	article
Appears in Collections:	INF - Artigos em Livros de Actas/Proceedings

Files in This Item:

File	Description	Size	Format
tcg04b-usingIR.pdf	Article	126.3 kB	Adobe PDF