Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/34684

Title: Generating a European Portuguese BERT Based Model Using Content from Arquivo.pt Archive
Authors: Miquelina, Nuno
Quaresma, Paulo
Nogueira, Vitor
Editors: Yin, Hujun
Camacho, David
Tino, Peter
Keywords: BERT
Portuguese European
Arquivo.pt
Issue Date: 21-Nov-2022
Publisher: Springer International Publishing
Citation: Miquelina, N., Quaresma, P., Nogueira, V.B. (2022). Generating a European Portuguese BERT Based Model Using Content from Arquivo.pt Archive. In: Yin, H., Camacho, D., Tino, P. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2022. IDEAL 2022. Lecture Notes in Computer Science, vol 13756. Springer, Cham. https://doi.org/10.1007/978-3-031-21753-1_28
Abstract: Building a language model from freely available internet content involves several steps and challenges. This new model aims to be a BERT-based language model for European Portuguese that is not tied to any specific context. The corpus was built using the web page archive infrastructure provided by Arquivo.pt and was restricted to .pt domains. This paper describes the overall process of building the corpus and training a BERT model.
URI: http://hdl.handle.net/10174/34684
Type: bookPart
Appears in Collections: INF - Publicações - Capítulos de Livros

Files in This Item:

File: Generating_a_Portuguese_European_BERT_based_model_using_content_from_Arquivo_pt_archive_preprint.pdf
Size: 909.81 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
