Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/2557
| Title: | Using IR techniques to improve Automated Text Classification |
| Authors: | Gonçalves, Teresa; Quaresma, Paulo |
| Keywords: | machine learning; text classification |
| Issue Date: | 2004 |
| Publisher: | Springer-Verlag |
| Abstract: | This paper performs a study on the pre-processing phase of the automated text classification problem. We use the linear Support Vector Machine paradigm applied to datasets written in the English and the European Portuguese languages – the Reuters and the Portuguese Attorney General’s Office datasets, respectively.
The study can be seen as a search for the best document representation along three different axes: feature reduction (using linguistic information), feature selection (using word frequencies) and term weighting (using information retrieval measures). |
| URI: | http://hdl.handle.net/10174/2557 |
| Type: | article |
| Appears in Collections: | INF - Artigos em Livros de Actas/Proceedings |