Repositório Digital de Publicações Científicas: Intrinsic and Extrinsic Evaluation of the Quality of Biomedical Embeddings in Different Languages


Repositório Digital de Publicações Científicas da Universidade de Évora


Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/29061

Title:	Intrinsic and Extrinsic Evaluation of the Quality of Biomedical Embeddings in Different Languages
Authors:	Franceschini, Paula; Santos, Henrique; Vieira, Renata
Keywords:	Language Models; Health Informatics
Issue Date:	Jul-2020
Publisher:	IEEE
Citation:	P. M. Franceschini, H. D. P. dos Santos and R. Vieira, "Intrinsic and Extrinsic Evaluation of the Quality of Biomedical Embeddings in Different Languages," 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS), Rochester, MN, USA, 2020, pp. 271-276, doi: 10.1109/CBMS49503.2020.00058.
Abstract:	Lately, language models have been applied to several tasks in biomedical natural language processing. Some public language models are available online, each built with different corpora. In this paper, we evaluate different public word embedding models trained with both general and biomedical corpora for English and Portuguese. We present intrinsic evaluations based on semantic analogies that use word pairs extracted from the MeSH biomedical thesaurus and also from benchmarks that are available for general-domain evaluation. For extrinsic evaluations we rely on a classification task over Electronic Health Records. Our experiments show that biomedical embeddings can better capture semantics for biomedical analogies in both languages. On the other hand, for extrinsic evaluation, based on classification tasks using the language models, larger general textual corpora appeared equally or more effective.
URI:	https://doi.org/10.1109/CBMS49503.2020.00058 https://ieeexplore.ieee.org/document/9182968 http://hdl.handle.net/10174/29061
Type:	article
Appears in Collections:	CIDEHUS - Artigos em Livros de Actas/Proceedings

Files in This Item:

File	Description	Size	Format
09182968.pdf		90.73 kB	Adobe PDF
