|
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/30381
|
Title: | Stroke Outcome Measurements from Electronic Medical Records: On the Effectiveness of Neural and Nonneural Classifiers |
Authors: | Zanotto, Bruna; Etges, Ana; dal Bosco, Avner; Cortes, Eduardo; Ruschel, Renata; Souza, Ana; Andrade, Claudio; Viegas, Felipe; Canuto, Sergio; Luiz, Washington; Martins, Sheila; Vieira, Renata; Polanczyk, Carisi; Gonçalves, Marcos |
Keywords: | Electronic Health Records; text classification |
Issue Date: | 2021 |
Citation: | Zanotto BS, Beck da Silva Etges AP, dal Bosco A, Cortes EG, Ruschel R, De Souza AC, Andrade CMV, Viegas F, Canuto S, Luiz W, Ouriques Martins S, Vieira R, Polanczyk C, André Gonçalves M
Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers
JMIR Med Inform 2021;9(11):e29120
doi: 10.2196/29120 |
Abstract: | Background: With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management.
Objective: The research reported in this article aims at comparing the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs.
Methods: Our study addressed the computational problem of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: Tier 1 (achieved healthcare status), Tier 2 (recovery process), care-related (clinical management and risk scores), and baseline characteristics. The analyzed dataset was retrospectively extracted from the EMRs of stroke patients from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised computational machine learning (ML) methods, including state-of-the-art neural and non-neural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1-score), supported by statistical significance tests. Feature importance analysis was conducted to provide insights regarding the results.
Results: The top-performing models were support vector machines (SVM) trained with lexical and semantic textual features, showing the importance of dealing with noise in EMR textual representations. The SVM models produced statistically superior results in a total of 17 tasks out of 24 (71%), with an F1-score > 80% regarding care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the process of recovery (ability to feed orally/ambulate and communicate), healthcare status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional non-neural methods given the characteristics of the dataset. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future.
Conclusions: Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to stroke victims’ clinical conditions, and thus, ultimately assess the possibility of proactively using these machine-learning techniques in real-world situations. |
URI: | https://medinform.jmir.org/2021/11/e29120 http://hdl.handle.net/10174/30381 |
Type: | article |
Appears in Collections: | CIDEHUS - Publicações - Artigos em Revistas Internacionais Com Arbitragem Científica
|
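The evaluation protocol named in the Methods part of the abstract (5-fold cross-validation repeated 6 times with subject-wise sampling, scored by F1) can be illustrated with a minimal sketch. This is not the authors' code: the function names `subject_wise_repeated_kfold` and `f1_score`, the toy patient identifiers, and the choice of a binary (single positive class) F1 are all assumptions made for illustration. The point it shows is that every sentence from a given patient lands on the same side of each train/test split, so a model is never tested on a patient it has seen in training.

```python
import random
from collections import defaultdict


def subject_wise_repeated_kfold(subject_ids, n_splits=5, n_repeats=6, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping each subject on one side.

    Hypothetical helper: subject_ids[i] is the patient that sentence i belongs to.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)
    if len(subjects) < n_splits:
        raise ValueError("need at least as many subjects as folds")

    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)  # a fresh patient ordering for each repetition
        folds = [shuffled[k::n_splits] for k in range(n_splits)]
        for fold, test_subjects in enumerate(folds):
            held_out = set(test_subjects)
            test_idx = [i for s in test_subjects for i in by_subject[s]]
            train_idx = [
                i for s in shuffled if s not in held_out for i in by_subject[s]
            ]
            yield repeat, fold, train_idx, test_idx


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for one positive class (an assumed variant of the metric)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: ten sentences written about six hypothetical patients.
subject_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
splits = list(subject_wise_repeated_kfold(subject_ids))
```

With the defaults, `splits` holds 5 × 6 = 30 train/test partitions, and a per-task score would be the average F1 over those partitions. In practice a library splitter that groups by patient would likely replace the hand-written generator; it is written out here only to make the grouping explicit.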