Electrocardiogram arrhythmia detection with novel signal processing and persistent homology-derived predictors

Tracking #: 808-1788


Hunter Dlugas (https://orcid.org/0000-0002-6819-0045)

Responsible editor: 

Karin Verspoor

Submission Type: 

Research Paper


Many approaches to computer-aided electrocardiogram (ECG) arrhythmia detection have been proposed, several of which combine persistent homology and machine learning. We present a novel ECG signal processing pipeline and a method of constructing predictor variables for use in statistical models. Specifically, we introduce an isoelectric baseline to yield non-trivial topological features corresponding to the P, Q, S, and T-waves (if they exist). For some choice of N, we then use the N most persistent 1-dimensional homological features of the ECG signal, together with their corresponding area-minimal cycle representatives, to construct the predictor variables. Three binary classification tasks were performed: (1) Atrial Fibrillation vs. Non-Atrial Fibrillation, (2) Arrhythmia vs. Normal Sinus Rhythm, and (3) Arrhythmias with Morphological Changes vs. Sinus Rhythm with Bradycardia and Tachycardia Treated as Non-Arrhythmia. The models were Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naive Bayes, Random Forest, Gradient Boosted Decision Tree, K-Nearest Neighbors, and Support Vector Machine with linear, radial, and polynomial kernels, each evaluated with stratified 5-fold cross-validation. The Gradient Boosted Decision Tree model attained the best results, with (mean F1-score, mean accuracy) of (0.967, 0.946), (0.839, 0.946), and (0.943, 0.921) across the five folds for tasks (1), (2), and (3), respectively.
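
To illustrate the evaluation protocol named in the abstract (stratified 5-fold cross-validation, with mean F1-score and mean accuracy taken across the folds), here is a minimal Python sketch. It is not the author's pipeline. The one-feature threshold classifier, the synthetic data, and every function name below are hypothetical stand-ins chosen to keep the example self-contained; the persistent homology-derived predictors and the Gradient Boosted Decision Tree model are not reproduced.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for a binary task; F1 is 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def fit_threshold(xs, ys):
    """Hypothetical stand-in classifier: midpoint of the two class means.

    Assumes class 1 has the larger mean feature value.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def cross_validate(xs, ys, k=5, seed=0):
    """Return (mean F1, mean accuracy) over k stratified folds."""
    scores = []
    for test_idx in stratified_folds(ys, k, seed):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        threshold = fit_threshold(train_x, train_y)
        preds = [1 if xs[i] > threshold else 0 for i in test_idx]
        truth = [ys[i] for i in test_idx]
        scores.append(accuracy_and_f1(truth, preds))
    mean_acc = mean(a for a, _ in scores)
    mean_f1 = mean(f for _, f in scores)
    return mean_f1, mean_acc


# Synthetic, trivially separable data (an assumption for illustration only).
ys = [1] * 20 + [0] * 30
xs = [2.0 + 0.01 * i for i in range(20)] + [0.01 * i for i in range(30)]
mean_f1, mean_acc = cross_validate(xs, ys)
print(mean_f1, mean_acc)
```

Because accuracy counts true negatives and F1 does not, the two metrics can differ noticeably on imbalanced classes, which is one reason to report both per task.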


Previous Version: 


  • Reviewed

Data repository URLs: 

Date of Submission: 

Wednesday, April 10, 2024

Date of Decision: 

Monday, April 22, 2024

Nanopublication URLs:



Solicited Reviews:

1 Comment

meta-review by editor

We are pleased to inform you that your paper has been conditionally accepted for publication, provided that you address the remaining minor issues:

The reviewers have acknowledged your revisions aimed at addressing their comments, and found the manuscript improved. A key concern remains the comparison with related work and the associated discussion. As has been pointed out by Reviewer 2, the list of previous performance results presented on p.21 is meaningless in the absence of grounding in the same task and dataset. 90% accuracy on one dataset cannot be compared directly to 98% accuracy on a different dataset, and even more so if the addressed task is different. A table summarizing the other work, the dataset used, the approach used and the performance is needed. Further discussion of the relationship between the presented results/findings and this other work would add important depth.

Please also confirm that this work was not done in collaboration with anyone else who may meet the criteria for authorship; it is rare to see single-authored work these days, particularly from more junior researchers.

Karin Verspoor (https://orcid.org/0000-0002-8661-1544)