Methods: The dataset was obtained from Acıbadem Maslak Hospital. Risk factors of the European System for Cardiac Operative Risk Evaluation (EuroSCORE) were used to predict mortality risk. Because 30-day follow-up information of patients was not available in the dataset, Standard EuroSCORE scores of patients were first calculated and risk groups were determined. Models were created with five different machine learning algorithms on two datasets: age, serum creatinine, left ventricular dysfunction, and pulmonary hypertension were numeric in Dataset 1 and categorical in Dataset 2. Model performance was evaluated with 10-fold cross-validation.
Results: Data analysis and performance evaluation were performed with R, RStudio, and Shiny. C4.5 was selected as the best algorithm for risk prediction (accuracy= 0.989) in Dataset 1. This model indicated that pulmonary hypertension, recent myocardial infarct, and surgery on the thoracic aorta are the three primary risk factors that affect the mortality risk of patients during or shortly after cardiac surgery. This model was also used to develop a dynamic web application that is accessible from mobile devices (https://elifkartal.shinyapps.io/euSCR/).
Conclusion: The C4.5 decision tree model was identified as having the highest performance in Dataset 1 in predicting the mortality risk of patients. Using the numerical values of the risk factors can be useful in increasing the performance of machine learning models. Development of hospital-specific local assessment systems using hospital data, such as the application in this study, would be beneficial for both patients and doctors.
Risk grouping and forecasting models are seen as essential tools for assessing the quality of care, medical decision making, patient counseling, and patient consent.[4] Different risk stratification models such as the Parsonnet Scoring System, the Cleveland Clinic Scoring System, and The Society of Thoracic Surgeons National Database Risk Scoring System have been developed to evaluate the results of open-cardiac surgery.[5] Geissler et al.[6] compared six different scoring techniques and reported that the European System for Cardiac Operative Risk Evaluation (EuroSCORE) gave the best performance in mortality prediction. Dişcigil et al.[7] pointed out that the EuroSCORE has only four factors related to surgery and is therefore the least affected by surgical factors. In this regard, increasing patient-based risk assessment and minimizing differences which may arise due to the surgical team are seen as advantages of EuroSCORE.[6,7] In addition, a system based on EuroSCORE called Cardiac Risk Scoring is used by hospitals in Turkey,[8] and hospital charges are determined according to this risk score. Karabulut et al.[9] found that EuroSCORE is easy to use and applicable in the cardiovascular surgery clinic, although multi-centered studies and a larger number of observations would increase the validity of the system in Turkey.
European system for cardiac operative risk evaluation
EuroSCORE is a scoring system which was developed to predict early death in cardiac surgery patients.[10-12] Roques et al.[13] identified risk factors for mortality in adult cardiac surgical patients as part of EuroSCORE's development process. A large portion of that study's database was also used to develop the EuroSCORE. Ninety-seven risk factors were collected from 20 thousand patients from 128 hospitals in eight European countries; however, only 17 of these risk factors (Table 1) were selected for the scoring system as significant, reliable, and objective.[10] Today, there are three EuroSCORE models that provide online risk calculations: Standard (Additive) EuroSCORE,[14] Logistic EuroSCORE,[15] and EuroSCORE II[16] (Figure 1).
Figure 1: Comparison of EuroSCORE models.
Machine learning in cardiac risk assessment
The machine learning field is concerned with building computer programs that improve automatically with experience.[17] Machine learning involves programming computers to optimize a performance criterion using sample data or past experience.[18] Simon[19] described learning as any change that improves a system's subsequent performance on the same task or on a new task drawn from the same population. Mitchell[17] stated how a machine can be said to learn by taking performance into consideration: "A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in (T), as measured by (P), improves with experience (E)."
There are two main types of learning: supervised learning and unsupervised learning. Supervised learning is a form of learning in which the learner receives a set of labeled examples of training data and makes predictions for points that it has not seen before.[20] Unsupervised learning is a form of learning in which no labeled sample is found in the learners' training data.[20] The main difference between supervised and unsupervised learning is the presence of the target attribute in the dataset.
Both machine learning and common scoring systems have been used for predicting mortality risk after cardiac surgery. Nouei et al.[21] proposed the Lookup Genetic Fuzzy Annealing System to predict mortality risk after coronary artery bypass grafting (CABG) surgery and compared its accuracy (acc= 0.853) with two well-known machine learning techniques: logistic regression (acc= 0.781) and the multilayer perceptron neural network (acc= 0.748). Tu et al.[22] compared the performance of artificial neural networks and logistic regression in estimating in-hospital mortality risk after CABG operation, and found that the two methods reported similar relationships between patient characteristics and mortality. Lippmann et al.[23] estimated the risk of death, stroke, and renal impairment for patients who underwent CABG operation using artificial neural networks. Tunca[24] developed a risk prediction model by using the REMARC (Risk Estimation by Maximizing Area under Receiver Operating Characteristic Curve) algorithm and the TurkoSCORE system, which involves a database and a learning system to estimate mortality risk for patients in Turkey.
This study aimed to predict the mortality risk of patients during or shortly after cardiac surgery by using EuroSCORE mortality risk factors and machine learning techniques.
Business understanding
Business understanding is defined as understanding the problem in the business environment. In this study, business understanding was treated as problem understanding. The problem was defined as predicting the mortality risk of patients during or shortly after cardiac surgery.
Data understanding
In this study, data were obtained from Acıbadem Maslak Hospital. Initially, the dataset consisted of 17 predictive attributes (Table 1). The total number of observations was 1482. Roques et al.[13] determined the dead/alive status of patients according to the 30 days following surgery. However, when the dates of operation and discharge from hospital were examined in this study, a standard 30-day postoperative follow-up period could not be obtained. Therefore, patients were grouped according to standard EuroSCORE scores: low (0-2 points), moderate (3-5 points), and high (≥6 points). This attribute was also used as the target attribute for the machine learning algorithms in the analyses.
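The grouping rule above can be written as a small function. The following is a minimal Python sketch, not the authors' R code; the function name `euroscore_risk_group` is hypothetical, and only the cut-offs (0-2, 3-5, ≥6) come from the text.

```python
def euroscore_risk_group(score: int) -> str:
    """Map a standard (additive) EuroSCORE total to a risk group label."""
    if score < 0:
        raise ValueError("EuroSCORE total cannot be negative")
    if score <= 2:
        return "low"        # 0-2 points
    if score <= 5:
        return "moderate"   # 3-5 points
    return "high"           # 6 points or more


# Illustrative scores only; these are not values from the study dataset.
example_scores = [0, 2, 3, 5, 6, 11]
example_groups = [euroscore_risk_group(s) for s in example_scores]
print(example_groups)
```

The resulting label plays the role of the target attribute for the classifiers described later.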
Table 1: The European system for cardiac operative risk evaluation risk factors
Data preparation
A large number of missing values were detected. The standard and logistic EuroSCORE calculator[26] does not have any option for missing values, whereas the calculator designed for patients[27] offers the options "Do not know" and "No", and the calculated scores were equal in both cases. Therefore, in this study, it was decided to complete the missing values in the dataset before the analyses. Missing values of the categorical and numerical attributes were completed with the most frequent category and the mean of each attribute, respectively, within each risk group (class label of the target attribute).
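The class-conditional completion described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors' implementation: missing values are represented as `None`, the helper name `impute_by_group` and the column names are hypothetical, and a value is left missing when its risk group has no observed value for that attribute.

```python
from collections import Counter, defaultdict
from statistics import mean


def impute_by_group(rows, target, categorical, numerical):
    """Fill None values per risk group: mode for categorical, mean for numerical."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[target]].append(row)

    filled = [dict(row) for row in rows]  # copy, so the input stays unchanged
    for row in filled:
        peers = by_group[row[target]]     # original rows of the same risk group
        for col in categorical + numerical:
            if row[col] is not None:
                continue
            observed = [p[col] for p in peers if p[col] is not None]
            if not observed:
                continue                  # nothing to learn from; leave missing
            if col in categorical:
                row[col] = Counter(observed).most_common(1)[0][0]
            else:
                row[col] = mean(observed)
    return filled


# Toy records (hypothetical values, not study data).
records = [
    {"risk": "low", "sex": "F", "creatinine": 0.8},
    {"risk": "low", "sex": "F", "creatinine": None},
    {"risk": "low", "sex": None, "creatinine": 1.0},
    {"risk": "high", "sex": "M", "creatinine": 2.0},
    {"risk": "high", "sex": None, "creatinine": None},
]
completed = impute_by_group(records, "risk", ["sex"], ["creatinine"])
```

Because the fill values are computed from the target attribute's groups, this kind of imputation uses label information; that is a property of the approach described in the text, not something added by the sketch.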
Outliers were detected and removed from the dataset by considering the rules provided by experts. Since the post-infarct septal rupture attribute was only seen in one patient, it was removed from the dataset. Duplicated observations were also removed.
EuroSCORE only works with categorical attributes; however, numerical values of the age, serum creatinine, left ventricular dysfunction, and pulmonary hypertension attributes were also available in the dataset. To examine the possible effects of the data types of these attributes, analyses were performed on two different datasets in which these attributes are numerical in Dataset 1 and categorical in Dataset 2. The numerical attributes in Dataset 1 were normalized using the min-max normalization technique.[28] Table 2 shows the frequency distribution of risk groups in Dataset 1 and Dataset 2.
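Min-max normalization rescales each value v of an attribute to (v - min) / (max - min), so that the attribute falls in the range [0, 1]. A minimal Python sketch is given below; the function name and the sample ages are hypothetical, and mapping a constant attribute to the lower bound is an assumption made here to avoid division by zero.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant attribute: assumed to map to the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


# Illustrative ages only; not taken from the study dataset.
ages = [40, 55, 70]
normalized_ages = min_max_normalize(ages)
print(normalized_ages)
```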
Table 2: Frequency distribution of risk groups in the datasets
Modeling
Alternative models were created with Naive Bayes
classifier, k-nearest neighbor algorithm, logistic
regression analysis, ID3, and C4.5 decision tree
algorithms to predict the mortality risk of patients
during or shortly after cardiac surgery. The basic
concepts of these algorithms are briefly explained
below.[17,28-30]
Naive Bayes Classifier: An easily understandable method which makes use of Bayes' theorem. The probabilities of an observation belonging to each class label of the target attribute can be found with this method. The maximum a posteriori hypothesis and the assumption of class conditional independence are the two key elements used in the classification process.
K-Nearest Neighbor Algorithm: The distance is calculated between the unlabeled observation and all observations in the dataset. The k observations with the smallest distances are selected, and the most frequent class among these k observations is assigned as the class value.
In this study, the k parameter of the algorithm was selected first. In order to obtain the best k, the algorithm was applied for k= 1, 2, ..., 10. Furthermore, Gower distance[31] was preferred for Dataset 1 since it has both binary coded and numerical attributes, whereas Jaccard distance[32] was used for Dataset 2 because the attributes in Dataset 2 are encoded in asymmetric binary format. The function that makes the Gower distance available to the algorithm was developed in R by the authors for the analyses.
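The authors' R function is not shown in the text, so the following Python sketch only illustrates the idea under stated assumptions. The Gower distance here averages a range-scaled absolute difference for numerical attributes and a simple 0/1 mismatch for the other attributes; the Jaccard distance ignores positions where both observations are 0. All function names and the toy data are hypothetical.

```python
from collections import Counter


def gower_distance(a, b, numeric_ranges):
    """Mixed-type distance; numeric_ranges maps attribute index -> (max - min)."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            total += abs(x - y) / span if span else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def jaccard_distance(a, b):
    """Distance for asymmetric binary vectors (joint absences are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - both / either


def knn_predict(train_x, train_y, query, k, distance):
    """Assign the most frequent class among the k nearest training observations."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: (numeric attribute, binary attribute, binary attribute).
train_x = [(30, 0, 0), (35, 0, 1), (60, 1, 1), (65, 1, 1)]
train_y = ["low", "low", "high", "high"]
ranges = {0: 35}  # range of the numeric attribute in the toy data (65 - 30)

predicted = knn_predict(train_x, train_y, (58, 1, 1), k=3,
                        distance=lambda a, b: gower_distance(a, b, ranges))
print(predicted)
```

To choose k as described above, `knn_predict` could be evaluated for k = 1, 2, ..., 10 and the value with the best validation performance retained.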
Logistic Regression Analysis: Models the relationship between the predictive attributes and the target attribute when the target attribute is categorical. It is classified as binary, ordinal, or multinomial logistic regression according to the data type of the target attribute.[33]
In this study, due to the number of zero frequency cells, some categories of the attributes (including age, left ventricular dysfunction, and the target attribute) were merged to make the data more appropriate for the analyses, and binary logistic regression was performed. The purpose of binary logistic regression is to estimate the probability that the target attribute takes the value 1, where 1 codes the risky situation in the target attribute.[33]
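For a fitted binary logistic regression model, the estimated probability of the risky class is 1 / (1 + exp(-z)), where z is the intercept plus the weighted sum of the predictive attributes. The short Python sketch below shows this calculation only; the function name and the coefficient values are hypothetical and are not estimates from this study.

```python
import math


def logistic_probability(intercept, coefficients, features):
    """Return P(target = 1 | features) for given logistic regression coefficients."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two binary risk factors.
p_absent = logistic_probability(-2.0, [1.5, 0.8], [0, 0])
p_present = logistic_probability(-2.0, [1.5, 0.8], [1, 1])
print(round(p_absent, 3), round(p_present, 3))
```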
ID3 and C4.5 Decision Tree Algorithms: ID3 is one of the simplest decision tree algorithms. It uses entropy and information gain to measure how well the training samples are split. The information gain criterion used in ID3 is replaced in C4.5 by the gain ratio, which normalizes information gain by a quantity called split information. Since C4.5 can work with attributes that take both categorical and numerical values whereas ID3 works only with categorical attributes, in this study the analysis was performed with C4.5 on Dataset 1 and with ID3 on Dataset 2.
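The two splitting criteria can be illustrated for a single categorical attribute. The Python sketch below is a simplified illustration, not the RWeka implementation used in the study; the function names and the toy attribute values are hypothetical. Split information is computed as the entropy of the attribute's own value distribution.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """ID3 criterion: entropy reduction after splitting on an attribute."""
    parts = defaultdict(list)
    for v, y in zip(values, labels):
        parts[v].append(y)
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in parts.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5 criterion: information gain divided by split information."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


risk = ["low", "low", "high", "high"]
recent_mi = ["no", "no", "yes", "yes"]   # two-valued attribute
patient_id = [1, 2, 3, 4]                # many-valued attribute

print(information_gain(recent_mi, risk), gain_ratio(recent_mi, risk))
print(information_gain(patient_id, risk), gain_ratio(patient_id, risk))
```

In this toy case both attributes reach the same information gain, but the gain ratio penalizes the many-valued identifier, which is the motivation for the normalization in C4.5.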
Evaluation
Various methods have been developed for model
performance evaluation such as hold-out, stratified
sampling, three-way split, cross-validation, etc.
In this study, the stratified 10-fold cross-validation method was chosen to compare the performance of the models. In k-fold cross-validation, the dataset is divided into k equal parts. In each round, one part is used for testing and the remaining k-1 parts are used for training. In the end, k error rates (or other performance evaluation measures) are obtained, and their average is taken as the performance of the model.
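The procedure above can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the authors' R workflow: observations are dealt to folds in their original order without shuffling, every fold is assumed to be non-empty, and the `fit` and `predict` callables stand in for any of the classifiers described earlier. All names are hypothetical.

```python
from collections import Counter, defaultdict


def stratified_k_fold(labels, k):
    """Yield (train_indices, test_indices), keeping class proportions similar per fold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)   # deal each class round-robin
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def cross_validated_accuracy(features, labels, k, fit, predict):
    """Average the per-fold accuracy over k train/test rounds."""
    scores = []
    for train, test in stratified_k_fold(labels, k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / k


# Toy example with a majority-class "model" (illustrative only).
example_labels = ["low"] * 6 + ["high"] * 4
example_features = list(range(10))
mean_accuracy = cross_validated_accuracy(
    example_features, example_labels, k=2,
    fit=lambda xs, ys: Counter(ys).most_common(1)[0][0],
    predict=lambda model, x: model,
)
print(mean_accuracy)
```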
In addition, various measures can be used for model performance evaluation.[34] In this study, accuracy, error, and also more comprehensive measures such as F-measure and diagnostic odds ratio were calculated.
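For a single class treated as "positive", these measures can be computed from the counts of a 2x2 confusion matrix. The Python sketch below is illustrative: the text does not state how the three risk groups were combined into these measures, so the one-class-versus-rest view is an assumption, and the function name and the counts are hypothetical. The diagnostic odds ratio is reported as infinite here when its denominator is zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, error, F-measure, and diagnostic odds ratio from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # DOR = (TP / FN) / (FP / TN) = (TP * TN) / (FP * FN)
    dor = (tp * tn) / (fp * fn) if fp and fn else float("inf")
    return {"accuracy": accuracy, "error": 1.0 - accuracy,
            "f_measure": f_measure, "dor": dor}


# Hypothetical counts, not results from the study.
example_metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(example_metrics)
```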
Analyses were performed with the R programming language and RStudio. R is a free language and environment for statistical computing and graphics.[35] RStudio[36] is an integrated development environment for R. Various R packages such as e1071,[37] knnGarden,[38] RWeka,[39,40] shiny,[41] and shinythemes[42] were used to perform the analyses. A dynamic web application of the best model was developed with Shiny,[43] which enables R code to be delivered as web applications. One way to share such applications on the web is to publish them on shinyapps.io.[44]
The ranking of the attributes in terms of their contribution levels to the models was obtained from ID3, C4.5, and logistic regression analysis (Table 3).
Table 3: Top three attributes according to contribution level of the models
Deployment
The C4.5 decision tree model, which gave the best performance in Dataset 1, was integrated into a dynamic web application which is also accessible from mobile devices (https://elifkartal.shinyapps.io/euSCR/) (Figure 2). Rules similar to those below can be produced by using the decision tree.
IF pulmonary hypertension is less than or equal to 32 and recent myocardial infarct= NO and Other than isolated CABG= NO; Then the RISK is LOW
IF pulmonary hypertension is less than or equal to 32 and recent myocardial infarct= NO and Other than isolated CABG= YES; Then the RISK is MEDIUM
IF pulmonary hypertension is greater than 33 and pulmonary hypertension is less than or equal to 42 and recent myocardial infarct= YES; Then the RISK is HIGH.
Figure 2: Web application for cardiac risk assessment using the C4.5 decision tree model.
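The three example rules above can be encoded directly as conditional logic. The Python sketch below covers only these three rules and not the full decision tree behind the web application; the function name `predict_risk` and its parameter names are hypothetical, and `None` is returned when none of the listed rules applies.

```python
def predict_risk(pulmonary_hypertension, recent_mi, other_than_isolated_cabg):
    """Apply the three example rules; return None if no listed rule matches."""
    if pulmonary_hypertension <= 32 and not recent_mi:
        return "MEDIUM" if other_than_isolated_cabg else "LOW"
    if 33 < pulmonary_hypertension <= 42 and recent_mi:
        return "HIGH"
    return None  # the remaining branches of the tree are not given in the text


print(predict_risk(30, recent_mi=False, other_than_isolated_cabg=False))
print(predict_risk(30, recent_mi=False, other_than_isolated_cabg=True))
print(predict_risk(40, recent_mi=True, other_than_isolated_cabg=False))
```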
Since the dataset, unlike that of EuroSCORE, contained no 30-day follow-up data for patients, the standard EuroSCORE scores of the patients were first calculated and predictions were made using the risk groups as the target attribute. Seventeen risk factors were used in the calculation of the Standard EuroSCORE; however, since the post-infarct septal rupture attribute was only seen in one patient, this attribute was not used in the analyses.
In EuroSCORE, if the patient did not know the exact value of the risk factor, the factor was calculated as absent. However, in this study, the missing values of the remaining 16 risk factors were completed.
Numerical and categorical values of age, serum creatinine, left ventricular dysfunction, and pulmonary hypertension attributes were used in Dataset 1 and Dataset 2, respectively.
Not only accuracy and error, but more comprehensive performance evaluation measures were also used.
The highest performance was obtained from the C4.5 decision tree algorithm model in Dataset 1 and the lowest performance was obtained from the ID3 decision tree algorithm in Dataset 2.
It was determined that the performance measures
obtained from Dataset 2 were significantly lower
than the values obtained from Dataset 1. The general
evaluation showed that the errors in Table
Table 4: Results of model performance evaluation
Sixteen attributes were ranked with the help of the ID3 and C4.5 decision tree algorithms and logistic regression analysis. Pulmonary hypertension ranked first in the models derived from Dataset 1. It was also determined that the age factor was in the top three for two different machine learning algorithms.
Conclusion
The C4.5 decision tree model had the highest performance in predicting the mortality risk of patients (accuracy= 0.989). This model can be accepted as a predictor model based on learning from the data of this study. Using numerical values of the risk factors may be useful in increasing the performance of machine learning models. Developing hospital-specific local assessment systems, such as the application in this study, would be beneficial for both patients and doctors. Furthermore, this model should be tested with datasets collected from other hospitals.
Acknowledgement
The authors would like to thank Acıbadem Maslak Hospital Chief Physician Prof. Dr. Çağlar Çuhadaroğlu, Acıbadem University School of Medicine Head of Department of Cardiovascular Surgery Prof. Dr. Cem Alhan, and Acıbadem Maslak Hospital Cardiovascular Surgery Department Assistant Sevinç Kocaman, who provided the dataset.
This study was formed within the scope of Kartal E. (2015). Machine Learning Techniques Based on Classification and a Study on Cardiac Risk Assessment (PhD Thesis). İstanbul University, İstanbul.
Declaration of conflicting interests
The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
Funding
This study was supported by Scientific Research Projects Coordination Unit of Istanbul University (Project number 49091).
1) World Health Organization, Cardiovascular diseases (CVDs). Available at: http://www.who.int/mediacentre/factsheets/fs317/en/. [Accessed: November 10, 2017].
2) American Heart Association. New statistics show one of every three U.S. deaths caused by cardiovascular disease. Available at: http://newsroom.heart.org/news/new-statistics-show-one-of-every-three-u-s-deaths-caused-by-cardiovascular-disease. [Accessed: November 10, 2017].
3) Turkish Statistical Institute. Türkiye İstatistik Kurumu Ölüm Nedeni İstatistikleri. 2016, 24572. Available at: http://www.tuik.gov.tr/PreHaberBultenleri.do?id=24572. [Accessed: November 10, 2017].
4) Akar AR, Kurtcephe M, Sener E, Alhan C, Durdu S, Kunt
AG, et al. Validation of the EuroSCORE risk models in
Turkish adult cardiac surgical population. Eur J Cardiothorac
Surg 2011;40:730-5.
5) Okutan H, Yavuz T, Peker O, Tenekeci C, Düver H, Öcal A, et al. Outcomes of Euroscore (European System for Cardiac Operative Risk Evaluation) at opareted patients in our clinic. Turk Gogus Kalp Dama 2002;10:201-5.
6) Geissler HJ, Hölzl P, Marohl S, Kuhn-Régnier F, Mehlhorn
U, Südkamp M, et al. Risk stratification in heart surgery:
comparison of six score systems. Eur J Cardiothorac Surg
2000;17:400-6.
7) Dişcigil B, Badak Mİ, Gürcün U, Boğa M, Özkısacık
EA, Güneş TÜ. Açık Kalp Cerrahisi Sonuçlarının Avrupa Kardiyak Risk Skorlama Sistemi (Euroscore) ile
Değerlendirilmesi. ADÜ Tıp Fakültesi Dergisi 2005;6:19-23.
8) Republic of Turkey Social Security Institution. Social Security Institution Declaration of Healthcare Implementation, (22.10.2014 Change Notification up-to-date 2013). Available at: http://www.sgk.gov.tr/wps/portal/sgk/tr/kurumsal/merkez-teskilati/ana_hizmet_birimleri/gss_genel_mudurlugu/anasayfa_duyurular/duyuru_20141022_03. [Accessed: November 10, 2017].
9) Karabulut H, Toraman F, Dağdelen S, Çamur G, Alhan
C. EuroSCORE (European System for Cardiac Operative
Risk Evaluation) risk skorlama sistemi gerçekçi mi?
Türk Kardiyol Dern Arş 2001;29:364-7.
10) What is euroSCORE? Available at: http://www.euroscore.org/what_is_euroscore.htm. [Accessed: November 10, 2017].
11) What is euroSCORE? (for patients). Available at: http://www.euroscore.org/patient.htm. [Accessed: November 10, 2017].
12) Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow
S, Salamon R. European system for cardiac operative
risk evaluation (EuroSCORE). Eur J Cardiothorac Surg
1999;16:9-13.
13) Roques F, Nashef SA, Michel P, Gauducheau E, de Vincentiis
C, Baudet E, et al. Risk factors and outcome in European
cardiac surgery: analysis of the EuroSCORE multinational
database of 19030 patients. Eur J Cardiothorac Surg
1999;15:816-22.
14) Roques F, Gabrielle F, Michel P, De Vincentiis C, David M,
Baudet E. Quality of care in adult heart surgery: proposal
for a self-assessment approach based on a French multicenter
study. Eur J Cardiothorac Surg 1995;9:433-9.
15) Roques F, Michel P, Goldstone AR, Nashef SA. The logistic
EuroSCORE. Eur Heart J 2003;24:881-2.
16) EuroSCORE Project Group, EuroSCORE II. Available at: http://www.euroscore.org/calc.html. [Accessed: November 10, 2017].
17) Mitchell TM. Machine Learning. 1st ed. New York: McGraw-
Hill; 1997.
18) Alpaydın E. Introduction to Machine Learning. Cambridge:
MIT Press; 2014.
19) Simon HA, Why should machines learn?. In: Michalski
RS, Carbonell JG, Mitchell TM, editors. Machine learning:
An artificial intelligence approach. Berlin: Springer; 1984.
p. 25-37.
20) Mohri M, Rostamizadeh A, Talwalkar A. Foundations of
Machine Learning. London: The MIT Press; 2012.
21) Nouei MT, Kamyad AV, Sarzaeem M, Ghazalbash S.
Developing a genetic fuzzy system for risk assessment of
mortality after cardiac surgery. J Med Syst 2014;38:102.
22) Tu JV, Weinstein MC, McNeil BJ, Naylor CD. Predicting
mortality after coronary artery bypass surgery: what do
artificial neural networks learn? The Steering Committee of
the Cardiac Care Network of Ontario. Med Decis Making
1998;18:229-35.
23) Lippmann RP, Kukolich L, Shahian D. Predicting the risk of complications in coronary artery bypass operations using neural networks. In: Tesauro G, Touretzky DS, Leen TK, editors. Advances in Neural Information Processing Systems 7. Cambridge: The MIT Press; 1995. p. 1055-62.
24) Tunca A. Predicting Risk of Mortality in Patients Undergoing
Cardiovascular Surgery. Master of Science Thesis, Ankara:
Bilkent University; 2008.
25) Shearer C. The CRISP-DM model: the new blueprint for data
mining. J Data Warehous 2000;5:13-22.
26) Additive/logistic EuroSCORE interactive calculator. Available at: http://euroscore.org/calc.html. [Accessed: November 10, 2017].
27) EuroSCORE for patients. Available at: http://www.euroscore.org/patienteuroscore2.html. [Accessed: November 10, 2017].
28) Han J, Kamber M. Data mining: concepts and techniques
(the Morgan Kaufmann Series in data management systems).
2nd ed. California: Elsevier; 2006.
29) Özkan Y. Veri madenciliği yöntemleri. İstanbul: Papatya
Yayıncılık Eğitim; 2008.
30) Balaban ME, Kartal E. Veri Madenciliği ve Makine Öğrenmesi Temel Algoritmaları ve R Dili ile Uygulamaları. 1st ed. İstanbul: Çağlayan Kitabevi; 2015.
31) Gan G. Data Clustering in C++: An Object-Oriented
Approach. Florida: CRC Press; 2011.
32) Dekhtyar A. CSC 466: Knowledge Discovery from Data - Distance/Similarity Measures. Available at: http://users.csc.calpoly.edu/~dekhtyar/560-Fall2009/lectures/lec09.466.pdf. [Accessed: November 10, 2017].
33) Karabulut E, Alpar R. Lojistik regresyon. In: Alpar R, editor. Uygulamalı Çok Değişkenli İstatistiksel Yöntemler. Ankara: Detay Yayıncılık; 2011. p. 591-660.
34) Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Proc Manage 2009;45:427-37.
35) R-project, The R Project for Statistical Computing. Available at: http://www.r-project.org/. [Accessed: November 10, 2017].
36) RStudio, Take control of your R code. Available at: https://www.rstudio.com/products/rstudio/. [Accessed: November 10, 2017].
37) Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F. e1071: Misc Functions of the Department of Statistics (e1071), TU Wien, 2017. Available at: https://CRAN.R-project.org/package=e1071. [Accessed: November 10, 2017].
38) Wei B, Yang F, Wang X, Ge Y. knnGarden: Multi-distance based k-Nearest Neighbors, 2012. Available at: https://CRAN.R-project.org/package=knnGarden. [Accessed: November 10, 2017].
39) Hornik K, Buchta C, Zeileis A. Open-source machine
learning: R meets weka. Comput Stat 2009;24:225-32.
40) Witten IH, Frank E. Data Mining: Practical machine
learning tools and techniques. 2nd ed. San Francisco:
Morgan Kaufmann; 2005.
41) Chang W, Cheng J, Allaire JJ, Xie Y, McPherson J. Shiny: Web Application Framework for R. 2017. Available at: https://CRAN.R-project.org/package=shiny. [Accessed: November 10, 2017].
42) Chang W. Shinythemes: Themes for Shiny. 2016. Available at: https://CRAN.R-project.org/package=shinythemes. [Accessed: November 10, 2017].
43) RStudio. Shiny. Available at: https://shiny.rstudio.com/.
[Accessed: November 10, 2017].
44) RStudio. shinyapps.io. Available at: http://www.shinyapps.io/.
[Accessed: November 10, 2017].