Repositório Digital de Publicações Científicas: Stroke Outcome Measurements from Electronic Medical Records: On the Effectiveness of Neural and Nonneural Classifiers


Repositório Digital de Publicações Científicas da Universidade de Évora

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/30381

Title:	Stroke Outcome Measurements from Electronic Medical Records: On the Effectiveness of Neural and Nonneural Classifiers
Authors:	Zanotto, Bruna; Etges, Ana; dal Bosco, Avner; Cortes, Eduardo; Ruschel, Renata; Souza, Ana; Andrade, Claudio; Viegas, Felipe; Canuto, Sergio; Luiz, Washington; Martins, Sheila; Vieira, Renata; Polanczyk, Carisi; Gonçalves, Marcos
Keywords:	Electronic Health Records; text classification
Issue Date:	2021
Citation:	Zanotto BS, Beck da Silva Etges AP, dal Bosco A, Cortes EG, Ruschel R, De Souza AC, Andrade CMV, Viegas F, Canuto S, Luiz W, Ouriques Martins S, Vieira R, Polanczyk C, André Gonçalves M. Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers. JMIR Med Inform 2021;9(11):e29120. doi: 10.2196/29120
Abstract:	Background: With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management.
	Objective: This study compares the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs.
	Methods: Our study addressed the computational problem of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: Tier 1 (achieved healthcare status), Tier 2 (recovery process), care-related (clinical management and risk scores), and baseline characteristics. The analyzed dataset was retrospectively extracted from the EMRs of stroke patients from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised machine learning (ML) methods, including state-of-the-art neural and non-neural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1 score), supported by statistical significance tests. Feature importance analysis was conducted to provide insights regarding the results.
	Results: The top-performing models were support vector machines (SVMs) trained with lexical and semantic textual features, showing the importance of dealing with noise in the textual representations of EMRs. The SVM models produced statistically superior results in 17 of 24 tasks (71%), with an F1 score >80% on care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the recovery process (ability to feed orally, ambulate, and communicate), achieved healthcare status (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional non-neural methods given the characteristics of the dataset. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future.
	Conclusions: Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to the clinical conditions of stroke patients and thus, ultimately, to assess the possibility of proactively using these machine learning techniques in real-world situations.
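Illustrative note (not part of the original record): the evaluation protocol named in the abstract, 5-fold cross-validation repeated 6 times with subject-wise sampling and scored by F1, can be sketched roughly as follows. This is a minimal standard-library sketch, not the authors' implementation. The function names `repeated_group_kfold` and `f1_score`, the round-robin fold assignment, the random seed, and the use of binary (single positive class) F1 are all assumptions; the abstract does not state how folds were built or how F1 was averaged.

```python
import random
from collections import defaultdict


def repeated_group_kfold(groups, n_splits=5, n_repeats=6, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping each subject whole.

    groups[i] is the subject (patient) identifier of sample i, so every
    sentence from one patient lands on the same side of a split.
    """
    by_subject = defaultdict(list)
    for idx, subject in enumerate(groups):
        by_subject[subject].append(idx)
    subjects = sorted(by_subject)
    if len(subjects) < n_splits:
        raise ValueError("need at least n_splits distinct subjects")

    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)  # a fresh subject ordering for each repetition
        # Deal subjects round-robin so folds differ by at most one subject.
        folds = [shuffled[k::n_splits] for k in range(n_splits)]
        for fold, test_subjects in enumerate(folds):
            held_out = set(test_subjects)
            test_idx = [i for s in test_subjects for i in by_subject[s]]
            train_idx = [i for i, s in enumerate(groups) if s not in held_out]
            yield repeat, fold, train_idx, test_idx


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage: 10 hypothetical patients with 3 sentences each.
toy_groups = [f"patient{p}" for p in range(10) for _ in range(3)]
toy_splits = list(repeated_group_kfold(toy_groups))
```

With the default arguments the generator yields 5 x 6 = 30 train/test splits, and no patient ever appears on both sides of a split. That separation is the point of subject-wise sampling: sentences from the same patient tend to be correlated, so splitting them across train and test would inflate the measured F1.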
URI:	https://medinform.jmir.org/2021/11/e29120
	http://hdl.handle.net/10174/30381
Type:	article
Appears in Collections:	CIDEHUS - Publicações - Artigos em Revistas Internacionais Com Arbitragem Científica

Files in This Item:

File	Description	Size	Format
PDF.pdf		873.19 kB	Adobe PDF

Serviços de Ciência e Cooperação - Universidade de Évora