|Title: ||Analysing part-of-speech for Portuguese text classification|
|Authors: ||Gonçalves, Teresa|
|Keywords: ||machine learning|
|Issue Date: ||2006|
|Abstract: ||This paper proposes and evaluates the use of linguistic information in the pre-processing phase of text classification. We present several experiments evaluating the selection of terms based on different measures and linguistic knowledge. To build the classifier we used Support Vector Machines (SVM), which are known to produce good results on text classification tasks.
Our proposals were applied to two different datasets written in the Portuguese language: articles from a Brazilian newspaper (Folha de São Paulo) and juridical documents from the Portuguese Attorney General’s Office. The results show the relevance of part-of-speech information for the pre-processing phase of text classification allowing for a strong reduction of the number of features needed in the text classification.|
|Appears in Collections:||INF - Artigos em Livros de Actas/Proceedings|