A Novel Machine Learning Algorithm for Classifying Mortality Risk Patients for Intensive Care Unit Admissions

Tracking #: 626-1606

Responsible editor: 

Michael Krauthammer

Submission Type: 

Research Paper


Introduction: In this paper, we focus on the classification of high mortality risk patients in Intensive Care Units (ICUs). Classification algorithms for identifying ICU mortality are necessary for measuring and improving ICU performance. Mortality risk severity scores are an essential part of hospital management and clinical decision-making, and proper application of classification models can support decisions that lower hospital costs. Models that classify high mortality risk have become a necessary tool for explaining differences in mortality risk.
Purpose of Study: The purpose of this study is to develop and evaluate a new algorithm that predicts patient mortality in the ICU more accurately, using only vital signs and laboratory results from the first 24 hours of ICU admission.
Methods: We evaluate a novel approach that statistically converts continuous variables into categorical variables and identifies optimal threshold cut points to stabilize the coefficients of the mortality risk classification model. Using logistic regression as the machine learning method, together with the optimal threshold cut points and open-source tools, we developed and evaluated a mortality risk algorithm for ICU patients.
Results: An optimal set of 3 threshold values was derived that partitioned the data into 4 groups, making the patient mortality risk scores more distinguishable across the 4 groups. The most important variable for ICU mortality risk was PO2 (120 – 125), followed by Cardiac Arrest (Yes), Bilirubin (0.75 – 1), Vasopressors (Yes), SPO2 (< 66), Bilirubin (> 7.75), Foley (< 6), Severe COPD (Yes), WBC (> 19.5) and BUN (> 49).
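The thresholding step described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the cut points, the sample values, and the helper names `assign_group` and `one_hot` are hypothetical, chosen only to show how 3 cut points partition a continuous variable into 4 categorical groups that a logistic regression could then use as indicator variables.

```python
from bisect import bisect_right

def assign_group(value, cut_points):
    """Return the 0-based group index for value, given sorted cut points.

    With 3 cut points there are 4 groups; a value equal to a cut point
    falls into the upper group.
    """
    return bisect_right(cut_points, value)

def one_hot(group, n_groups):
    """Encode a group index as a list of 0/1 indicator variables."""
    return [1 if i == group else 0 for i in range(n_groups)]

# Hypothetical cut points and lab values (assumptions, not from the paper).
cut_points = [1.0, 2.0, 6.0]
values = [0.4, 1.5, 3.2, 9.1]

groups = [assign_group(v, cut_points) for v in values]
encoded = [one_hot(g, len(cut_points) + 1) for g in groups]

for v, g, e in zip(values, groups, encoded):
    print(v, g, e)
```

Each indicator column would receive its own coefficient in the logistic regression, so a single extreme measurement can only place a patient in the top group rather than scale the linear predictor without bound.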
Conclusion: We present a new binary classification algorithm, logistic regression with threshold cut points, designed to address the problem of continuous variables with high variability and extreme values, to stabilize the model coefficients, and to improve the accuracy of classifying high mortality risk ICU patients. Our proposed optimal threshold cut point model performed substantially better (AUC = 0.944) in identifying ICU patients with high mortality risk than the scoring systems commonly used in hospitals, such as SAPS II (AUC = 0.771), APACHE II (AUC = 0.736) and SOFA (AUC = 0.699). In relative terms, this AUC is between about 22% and 35% (1.22 to 1.35 times) higher than those of the current mortality risk scoring systems. SAPS II, APACHE II and SOFA are static algorithms, whereas our new optimal threshold algorithm is data-driven, predicts mortality in ICU patients in real time, and may be useful for the timely identification of deteriorating patients. Our new binary classification algorithm will allow clinicians to accurately identify high mortality risk patients within the first 24 hours so that they can be given prompt treatment to reduce their risk of deteriorating or dying.
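The relative comparison of AUC values in the Conclusion can be checked with simple arithmetic. The sketch below uses only the four AUC figures reported in the abstract; the variable names are illustrative.

```python
# AUC values as reported in the abstract.
proposed_auc = 0.944
baseline_aucs = {"SAPS II": 0.771, "APACHE II": 0.736, "SOFA": 0.699}

# Ratio of the proposed model's AUC to each baseline's AUC.
ratios = {name: proposed_auc / auc for name, auc in baseline_aucs.items()}

for name, ratio in ratios.items():
    percent_higher = (ratio - 1) * 100
    print(f"{name}: {ratio:.2f} times ({percent_higher:.0f}% higher)")
```

The ratio depends on which baseline is used, so the 1.35 figure corresponds to the comparison with SOFA, the weakest of the three baselines.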



  • Reviewed

Data repository URLs: 

Date of Submission: 

Sunday, March 29, 2020

Date of Decision: 

Wednesday, May 13, 2020



Solicited Reviews:


Meta-Review by Editor

Both reviewers commented on serious flaws in the evaluation of the method and raised questions about the novelty of the approach. They pointed to the lack of comparison with any other state-of-the-art method and the lack of testing on an independent dataset.

Michael Krauthammer (https://orcid.org/0000-0002-4808-1845)