Repositório Digital de Publicações Científicas: An Approach to the POS Tagging Problem Using Genetic Algorithms


Repositório Digital de Publicações Científicas da Universidade de Évora

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/17099

Title:	An Approach to the POS Tagging Problem Using Genetic Algorithms
Authors:	Silva, Ana Paula; Silva, Arlindo; Pimenta Rodrigues, Irene
Editors:	Madani, Kurosh; Correia, Dourado Antonio; Rosa, Agostinho; Filipe, Joaquim
Keywords:	Part-of-speech Tagging; Disambiguation; Evolutionary Algorithms; Natural Language Processing
Issue Date:	2015
Publisher:	Springer International Publishing
Citation:	Ana Paula Silva, Arlindo Silva, Irene Rodrigues. An Approach to the POS Tagging Problem Using Genetic Algorithms. In: Computational Intelligence, Studies in Computational Intelligence, vol. 577, pp. 3-17. Springer, 2015.
Abstract:	Automatic part-of-speech tagging is the process of assigning a part-of-speech (POS) tag to each word of a text. The words of a language are grouped into grammatical categories that represent the function they may have in a sentence. These grammatical classes (or categories) are usually called parts of speech. However, in most languages, a large number of words can be used in different ways and therefore have more than one possible part of speech. To choose the right tag for a particular word, a POS tagger must consider the parts of speech of the surrounding words. The neighboring words may themselves have more than one possible tag. This means that, to solve the problem, we need a method to disambiguate each word's set of possible tags. In this work, we modeled the part-of-speech tagging problem as a combinatorial optimization problem, which we solve using a genetic algorithm. The search for the best combinatorial solution is guided by a set of disambiguation rules that we first discovered using a classification algorithm, which also includes a genetic algorithm. Using rules to disambiguate the tagging, we were able to generalize the context information present in the training tables adopted by approaches based on probabilistic data. We were also able to incorporate other types of information that help to identify a word's grammatical class. The results obtained on two different corpora are among the best published.
URI:	http://dx.doi.org/10.1007/978-3-319-11271-8_1; http://hdl.handle.net/10174/17099
Type:	bookPart
Appears in Collections:	INF - Publicações - Capítulos de Livros
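
The abstract frames tagging as a combinatorial search over the possible tags of each word, solved with a genetic algorithm and scored by disambiguation rules. The following is a minimal sketch of that general idea only, not the authors' implementation. The lexicon, the tag names, the rule scores, and every function name are hypothetical, and the rules here are hand-written tag-pair scores, whereas the chapter discovers its rules with a separate classification algorithm.

```python
import random

# Hypothetical toy lexicon: each word maps to the tags it may take.
LEXICON = {
    "the": ["DET"],
    "dog": ["NOUN", "VERB"],
    "can": ["AUX", "NOUN", "VERB"],
    "run": ["VERB", "NOUN"],
}

# Hypothetical hand-written rules: a score for each (previous tag, tag) pair.
# Pairs that are not listed score 0.
RULES = {
    ("DET", "NOUN"): 2.0,
    ("NOUN", "AUX"): 2.0,
    ("AUX", "VERB"): 2.0,
    ("DET", "VERB"): -2.0,
    ("VERB", "VERB"): -1.0,
}


def fitness(tags):
    """Sum the rule scores over all adjacent tag pairs."""
    return sum(RULES.get(pair, 0.0) for pair in zip(tags, tags[1:]))


def random_individual(words, rng):
    """One candidate tagging: a random allowed tag for every word."""
    return [rng.choice(LEXICON[w]) for w in words]


def crossover(a, b, rng):
    """One-point crossover; positions stay aligned, so tags remain valid."""
    if len(a) < 2:
        return a[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(tags, words, rng, rate=0.2):
    """Resample each tag from the word's allowed tags with probability rate."""
    return [
        rng.choice(LEXICON[w]) if rng.random() < rate else t
        for w, t in zip(words, tags)
    ]


def tournament(population, rng, k=3):
    """Pick the fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def tag_sentence(words, pop_size=20, generations=30, seed=0):
    """Evolve a population of taggings and return the fittest one found."""
    rng = random.Random(seed)
    population = [random_individual(words, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [max(population, key=fitness)]
        while len(children) < pop_size:
            child = crossover(
                tournament(population, rng), tournament(population, rng), rng
            )
            children.append(mutate(child, words, rng))
        population = children
    return max(population, key=fitness)


if __name__ == "__main__":
    print(tag_sentence(["the", "dog", "can", "run"]))
```

Under these toy rules the highest-scoring assignment for "the dog can run" is DET NOUN AUX VERB. A search space this small could simply be enumerated, so the example only illustrates the encoding (one gene per word, restricted to that word's allowed tags) and the rule-based fitness function.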


Serviços de Ciência e Cooperação - Universidade de Évora