Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/29056
Title: | Evaluating the performance and improving the usability of parallel and distributed Word Embeddings tools |
Authors: | Silva, Mateus; Meyer, Vinicius; Kirchoff, Dionatrã; Neto, Joaquim; Vieira, Renata; De Rose, Cesar |
Keywords: | Language models |
Issue Date: | Mar-2020 |
Publisher: | IEEE |
Citation: | M. L. d. Silva, V. Meyer, D. F. Kirchoff, F. S. Joaquim Neto, R. Vieira and A. F. César De Rose, "Evaluating the performance and improving the usability of parallel and distributed Word Embeddings tools," 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Västerås, Sweden, 2020, pp. 201-206, doi: 10.1109/PDP50117.2020.00038. |
Abstract: | The representation of words by means of vectors, also called Word Embeddings (WE), has been receiving great attention from the Natural Language Processing (NLP) field. WE models are able to express syntactic and semantic similarities, as well as relationships and contexts of words within a given corpus. Although the most popular implementations of WE algorithms scale poorly, newer approaches apply High-Performance Computing (HPC) techniques. This creates an opportunity to analyze the main differences among the existing implementations in terms of performance and scalability metrics. In this paper, we present a study that addresses resource utilization and performance aspects of known WE algorithms found in the literature. To improve scalability and usability, we propose a wrapper library for local and remote execution environments that bundles a set of optimized implementations, such as pWord2vec, pWord2vec MPI, and Wang2vec, along with the original Word2vec algorithm. With these optimizations, it is possible to achieve an average performance gain of 15x on multicore and 105x on multinode systems compared to the original version. The memory footprint is also substantially lower than that of the most popular Python versions. |
URI: | https://doi.org/10.1109/PDP50117.2020.00038 https://ieeexplore.ieee.org/document/9092420 http://hdl.handle.net/10174/29056 |
Type: | article |
Appears in Collections: | CIDEHUS - Artigos em Livros de Actas/Proceedings |