ORIGINAL

Classification of Ischemic Stroke Subtypes Using Machine Learning: insights from the International Stroke Trial dataset

Classification of Cerebral Infarction Subtypes Using Machine Learning: insights from the International Stroke Trial dataset

  • Samuel Pedro Pereira Silveira
  • Gustavo del Rio Lima
  • Gustavo Branquinho Alberto
  • Luiza Carolina Moreira Marcolino
  • Larissa Batista Xavier
  • Carlos Umberto Pereira
  • Murillo Martins Correia
  • Roberto Alexandre Dezena

Summary

Introduction: Classifying ischemic stroke subtypes is essential for prognosis and treatment but is challenging in clinical practice. Objective: To develop and evaluate machine learning (ML) models for automated classification of ischemic stroke subtypes (OCSP) using clinical data. Methods: Using 13,056 cases from the IST study, we trained Random Forest, XGBoost, Logistic Regression, Support Vector Machine, and k-Nearest Neighbors models. We assessed accuracy, sensitivity, specificity, PPV, NPV, and AUC-ROC with stratified 10-fold cross-validation. Results: Clinical variables were strongly associated with the subtypes (p < 0.001). Random Forest and XGBoost achieved perfect performance (all metrics = 1.000 ± 0.000). Logistic Regression and SVM performed almost perfectly (accuracy ≈ 0.998; AUC-ROC = 1.000). KNN showed lower sensitivity, especially for POCS (mean sensitivity = 0.898). Conclusion: ML models, especially Random Forest and XGBoost, can classify ischemic stroke subtypes with high accuracy using routine clinical data.

Keywords

Ischemic stroke; Machine learning; Random forest; Artificial intelligence

Abstract

Introduction: Ischemic stroke subtype classification supports prognosis and treatment but can be challenging in acute care. Objective: To develop and evaluate machine learning (ML) models for automated classification of ischemic stroke subtypes under the Oxfordshire Community Stroke Project (OCSP) scheme using clinical data. Methods: Using 13,056 cases from the International Stroke Trial (IST), we trained Random Forest (RF), XGBoost, Logistic Regression, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN) models. Performance was assessed by accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC-ROC) using stratified 10-fold cross-validation. Results: Clinical variables were strongly associated with stroke subtypes (p < 0.001). RF and XGBoost achieved perfect performance (all metrics = 1.000 ± 0.000). Logistic Regression and SVM performed near-perfectly (accuracy ≈ 0.998, AUC-ROC = 1.000). KNN showed lower sensitivity, especially for POCS (macro-averaged sensitivity = 0.898). Conclusion: ML models, particularly RF and XGBoost, enable highly accurate ischemic stroke subtype classification using routine clinical data.

Keywords

Ischemic stroke; Machine learning; Random forest; Artificial intelligence
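
The abstract reports sensitivity, specificity, PPV, and NPV for a multi-class problem, along with a macro-averaged sensitivity. As a minimal sketch of how such per-class, one-vs-rest metrics can be derived from predictions, the Python below uses only the standard library. It is not the authors' code: the function names are hypothetical, the four label codes are assumed to be the standard OCSP categories, and the eight cases are made-up toy data, not IST records.

```python
# Illustrative sketch only: toy labels, not IST data or the authors' code.
LABELS = ["TACS", "PACS", "LACS", "POCS"]  # assumed OCSP category codes


def one_vs_rest_counts(y_true, y_pred, label):
    """Count TP, FP, FN, TN for one class against all the others."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == label:
            if truth == label:
                tp += 1
            else:
                fp += 1
        else:
            if truth == label:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Sensitivity, specificity, PPV, and NPV for each class."""
    metrics = {}
    for label in labels:
        tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, label)
        metrics[label] = {
            "sensitivity": ratio(tp, tp + fn),
            "specificity": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp),
            "npv": ratio(tn, tn + fn),
        }
    return metrics


def macro_average(metrics, name):
    """Unweighted mean of one metric across classes."""
    values = [m[name] for m in metrics.values()]
    return sum(values) / len(values)


# Toy example: eight made-up cases, one POCS case mislabeled as PACS.
y_true = ["TACS", "TACS", "PACS", "PACS", "LACS", "LACS", "POCS", "POCS"]
y_pred = ["TACS", "TACS", "PACS", "PACS", "LACS", "LACS", "POCS", "PACS"]

metrics = per_class_metrics(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(metrics["POCS"]["sensitivity"])         # 0.5
print(macro_average(metrics, "sensitivity"))  # 0.875
print(accuracy)                               # 0.875
```

In this toy case a single misclassified POCS case halves that class's sensitivity, which illustrates why a macro average can expose weak performance on one subtype even when the other classes are classified perfectly.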




1 Faculty of Medicine, Universidade Federal do Triângulo Mineiro, Uberaba, MG, Brazil.

2 Center for Mathematics, Computing and Cognition (CMCC), Universidade Federal do ABC, Santo André, SP, Brazil.

3 Discipline of Neurosurgery, Hospital das Clinicas, Universidade Federal do Triângulo Mineiro, Uberaba, MG, Brazil.

4 Neurosurgery Division, Universidade Federal de Sergipe (UFS), Aracaju, SE, Brazil.

5 Neurosurgery Division, Universidade Federal do Triângulo Mineiro, Uberaba, MG, Brazil.


 

Received May 13, 2025

Accepted June 4, 2025


JBNC  Brazilian Journal of Neurosurgery

  •   ISSN (print version): 0103-5118
  •   e-ISSN (online version): 2446-6786