Methods: The medical files of 456 pediatric patients who were monitored in a 12-bed postoperative cardiac surgery care unit following heart surgery between June 2003 and January 2009 were retrospectively analyzed. A total of 373 files were included in the study. Age, gender, diagnosis, the length of stay in the intensive care unit and hospital, survival rates, PIM 1 and PIM 2 scores, m-SOFA scores on admission and at 24 and 48 hours, and peak m-SOFA scores were recorded. Student’s t-test was used to compare the normally distributed data, whereas the Mann-Whitney U test was used to compare non-parametric data. Calibration of the scores was assessed with the Hosmer-Lemeshow goodness-of-fit test. Discrimination power of the scores was analyzed using receiver operating characteristic (ROC) curves.
Results: Fifty patients (13.4%) died perioperatively. The peak m-SOFA score and the m-SOFA score on admission were significantly higher in non-survivors (9.8±2 and 9.2±2, respectively) than in survivors (5±2.5 and 4.6±2.5, respectively; p<0.01). The Hosmer-Lemeshow goodness-of-fit test yielded a chi-square of 30.4 (df=8, p=0.0002) for PIM 1 and 13.5 (df=9, p=0.13) for PIM 2. The PIM 2 score showed good discrimination power (ROC 0.82) and good calibration, whereas PIM 1 had better discrimination power (ROC 0.87) but poor calibration. The peak m-SOFA score and the m-SOFA score on admission had good discrimination power (ROC 0.93 and 0.92, respectively).
Conclusion: Our results demonstrate that the peak m-SOFA score and the m-SOFA score on admission predict mortality in pediatric cardiac surgery better than the PIM 1 and PIM 2 scores.
Modified sequential organ failure assessment (m-SOFA) is another assessment instrument that consists of the evaluation of five organ systems. It is designed to predict organ failure, but has also been successfully used to predict mortality after pediatric heart surgery.[5]
In this study, we retrospectively collected the data of pediatric patients who underwent heart surgery, calculated their PIM 1, PIM 2, and m-SOFA scores, and evaluated how well each score predicted mortality.
The age, gender, diagnosis, ICU and hospital length of stay, mortality, and cardiopulmonary bypass (CPB) and aortic cross-clamp times were recorded, along with the PIM 1 and PIM 2 scores, the m-SOFA scores on admission and at 24 and 48 hours, and the peak m-SOFA score during the study period. The m-SOFA score had a maximum value of 20 (Table 1),[5] and the PIM 1 and PIM 2 scores were recorded according to the ICU referral data.
Table 1: Modified sequential organ failure assessment score
Parametric values were expressed as mean ± standard deviation (SD) where appropriate, and all variables were tested for normal distribution using the Kolmogorov-Smirnov test. Student’s t-test was used to compare the normally distributed parametric data, while the m-SOFA scores of the survivors and non-survivors were compared with the Mann-Whitney U test. The performance of the scoring systems was assessed with the standardized mortality ratio (SMR), which was calculated by dividing the number of observed deaths by the number of predicted deaths in the total group. Calibration of the PIM 1 and PIM 2 scores was assessed with the Hosmer-Lemeshow goodness-of-fit test for deciles of mortality risk, based on the ranked mortality risks of all patients for each scoring system. A p value <0.05 indicated poor calibration or fit. The discrimination power of the scores was assessed with receiver operating characteristic (ROC) curves, in which sensitivity was plotted against 1 − specificity. The area under the receiver operating characteristic (AUROC) curve was the overall summary measure of discriminatory performance. An AUROC of 0.5 indicates no discriminative ability (equal to random chance), whereas an AUROC of 1.0 indicates perfect discrimination power. Statistical analysis was performed with the SPSS for Windows version 10.0 software program (SPSS Inc, Chicago, IL, USA), and a value of p<0.05 was considered significant.
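The two summary measures defined above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study; the function names and the six-patient toy data are hypothetical and serve only to show the arithmetic. The AUROC is computed here through its equivalent interpretation as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half.

```python
def standardized_mortality_ratio(observed_deaths, predicted_risks):
    """SMR: observed deaths divided by the predicted number of deaths.

    The predicted number of deaths is taken as the sum of the individual
    predicted death probabilities (each between 0 and 1).
    """
    expected_deaths = sum(predicted_risks)
    return observed_deaths / expected_deaths


def auroc(scores, died):
    """Area under the ROC curve from pairwise comparisons.

    Every non-survivor is compared with every survivor: a higher score in
    the non-survivor counts 1, a tie counts 0.5, a lower score counts 0.
    """
    non_survivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("both outcome groups must be non-empty")
    wins = 0.0
    for ns in non_survivors:
        for sv in survivors:
            if ns > sv:
                wins += 1.0
            elif ns == sv:
                wins += 0.5
    return wins / (len(non_survivors) * len(survivors))


# Invented toy data (NOT from the study): six patients, two deaths.
toy_scores = [3, 5, 6, 7, 9, 10]          # hypothetical severity scores
toy_died = [0, 0, 0, 1, 0, 1]             # 1 = non-survivor
toy_risks = [0.05, 0.10, 0.10, 0.25, 0.20, 0.30]  # hypothetical predicted risks

toy_auroc = auroc(toy_scores, toy_died)
toy_smr = standardized_mortality_ratio(sum(toy_died), toy_risks)
```

An SMR above 1 means that more deaths were observed than the score predicted, that is, the score underestimated the mortality risk in that group.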
Table 2: Distribution and mortality rate according to the operative procedures
Table 3: Demographic and operative data
The length of ICU stay and the aortic cross-clamp and CPB times were longer for the non-survivors (Table 3). Furthermore, the m-SOFA admission scores were higher for the non-survivors (9.2±2) than for the survivors (4.6±2.5) (p<0.01), and the m-SOFA scores on day one and day two were higher for the non-survivors (7.8±2 and 8.7±2.5, respectively) than for the survivors (4.6±2.5 and 3.8±2.7, respectively) (p<0.01 for both). In addition, the peak m-SOFA score was significantly higher for the non-survivors (9.8±2) than for the survivors (5±2.5) (p<0.001) (Figure 1). All the non-survivors had both peak m-SOFA and m-SOFA admission scores of ≥6 (range 6-14).
At a cut-off value of 6.5, the m-SOFA admission score had a sensitivity of 97% and a specificity of 80%, and the peak m-SOFA score had a similar sensitivity and specificity (96% and 74%, respectively) at the same cut-off value. The m-SOFA on day one had a sensitivity of 88% and a specificity of 75% at a cut-off value of 5.5. On day two, the sensitivity and the specificity were both 86% at a cut-off value of 6.5. Among the non-survivors, only two patients had peak m-SOFA and m-SOFA admission scores below the cut-off value of 6.5. These results indicate that, at a cut-off value of 6.5, the peak m-SOFA and m-SOFA admission scores have reliable sensitivity and specificity for mortality.
The overall performance of the PIM 1 and PIM 2 scores was evaluated via the SMR, which was calculated by comparing the expected and observed deaths in the whole group (Table 4). The PIM 1 score had a sensitivity of 83% and a specificity of 78% at a cut-off value of 2.85, and the PIM 2 score had a similar sensitivity and specificity (83% and 76%, respectively) at a cut-off value of 2.45. Among the non-survivors, 10 patients had PIM 1 scores below the cut-off value of 2.85 and seven had PIM 2 scores below the cut-off value of 2.45. Therefore, at a cut-off value of 6.5, the admission and peak m-SOFA scores had higher discriminative power than either PIM score.
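How sensitivity and specificity follow from a chosen cut-off can be sketched in a few lines of Python. The sketch assumes that a score at or above the cut-off counts as a predicted death; the source does not state this convention or how the cut-off values were selected. The function name and the eight-patient data are hypothetical.

```python
def sensitivity_specificity(scores, died, cutoff):
    """Return (sensitivity, specificity) for the rule 'score >= cutoff'.

    Sensitivity: share of non-survivors at or above the cut-off.
    Specificity: share of survivors below the cut-off.
    """
    true_pos = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
    false_neg = sum(1 for s, d in zip(scores, died) if d and s < cutoff)
    true_neg = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
    false_pos = sum(1 for s, d in zip(scores, died) if not d and s >= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented toy data (NOT from the study): four survivors, four non-survivors.
toy_scores = [4, 5, 6, 7, 8, 9, 10, 6]
toy_died = [0, 0, 0, 0, 1, 1, 1, 1]

toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_died, cutoff=6.5)
```

With integer scores, a cut-off of 6.5 separates scores of 6 or less from scores of 7 or more, which is why a non-survivor with a score of 6 is counted as missed.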
Table 4: Observed and expected mortality as standardized mortality ratio (n=373)
Calibration with the Hosmer-Lemeshow goodness-of-fit test for PIM 1 revealed a chi-square value of 30.4 with 8 degrees of freedom (df) and a 95% confidence interval (CI) of 1.2 (range 1.12-1.27) (p=0.0002). The results for the PIM 2 showed a chi-square value of 13.5 with 9 df and a 95% CI of 1.38 (range 1.2-1.5) (p=0.13). The PIM 2 mortality prediction model was therefore better calibrated than the PIM 1 for predicting mortality in pediatric patients undergoing heart surgery, as its Hosmer-Lemeshow goodness-of-fit test had a p value of >0.05 (p=0.13). However, both scores underestimated the mortality risk in this study (Table 5).
The discriminatory performance of the scores, assessed with the ROC curves, showed that the PIM 2 had fair discrimination power (AUROC 0.82), while that of the PIM 1 was better (AUROC 0.87). However, the PIM 1 was poorly calibrated (p=0.0002). The m-SOFA scores had good discrimination power, and the m-SOFA on admission and on day two had the greatest power of all (Table 6). The m-SOFA score is shown in Table 1.
Table 6: Discriminatory performance of the scores assessed by receiver operating characteristic curves
A preliminary study evaluating the mortality-predicting power of PIM over PRISM in pediatric cardiac surgery demonstrated that PIM is better than PRISM when applied to infants and children.[6] Furthermore, the authors declared that collecting data for the PIM is much easier than for the PRISM. However, in a recent study, the performance of the PIM 2 was found to be poor regarding calibration and predictive ability in pediatric cardiac surgery.[7] The difference between these studies regarding the PIM scores might be attributable to differences among the centers and poor inter-rater reliability.[8] The study by Czaja et al.,[7] which comprised 8,391 pediatric patients who underwent cardiac surgery, reported an AUROC of 0.80, and the PIM 2 score in our study had a similar AUROC of 0.82.
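For readers unfamiliar with the test, the Hosmer-Lemeshow statistic can be sketched as follows. This is a simplified illustration, not the SPSS routine used in the study: patients are ranked by predicted risk, split into groups of roughly equal size (deciles when `groups=10`), and the squared gap between observed and expected deaths is summed across groups. The function name, the equal-size grouping rule, and the toy data are assumptions.

```python
def hosmer_lemeshow_statistic(predicted_risks, died, groups=10):
    """Hosmer-Lemeshow chi-square over risk-ranked groups.

    For each group g with n_g patients, observed deaths O_g and expected
    deaths E_g (sum of predicted risks), the contribution is
    (O_g - E_g)^2 / (n_g * p_g * (1 - p_g)), where p_g = E_g / n_g.
    """
    ranked = sorted(zip(predicted_risks, died))
    n = len(ranked)
    chi_square = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer patients than groups: skip empty groups
        size = len(chunk)
        expected = sum(risk for risk, _ in chunk)
        observed = sum(outcome for _, outcome in chunk)
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance == 0.0:
            raise ValueError("a group has mean predicted risk of 0 or 1")
        chi_square += (observed - expected) ** 2 / variance
    return chi_square


# Invented toy data (NOT from the study): four patients, two risk groups.
toy_risks = [0.1, 0.1, 0.5, 0.5]
toy_died = [0, 0, 1, 1]
toy_chi_square = hosmer_lemeshow_statistic(toy_risks, toy_died, groups=2)
```

The p value is then read from a chi-square distribution. The degrees of freedom depend on convention (commonly the number of groups minus 2 when the model was fitted on the same data), so the sketch returns only the statistic and leaves that choice to the analyst.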
In a study performed on 75 pediatric patients who underwent cardiac surgery, Barlas et al.[9] determined that the PRISM score had a poor mortality prediction power when compared with the modified Acute Physiology and Chronic Health Evaluation (APACHE) III score.
The SOFA is an organ failure assessment score. However, it is also valuable for predicting ICU mortality in both adult and pediatric ICU patients.[10,11] This score can also be used as an independent predictor of mortality in adult patients undergoing heart surgery, as shown in the study by Pätilä et al.[12] In that same study, the peak SOFA scores were measured during the first three days. A SOFA score of over 20 points in pediatric patients undergoing heart surgery was found to be reliable for predicting death within the first 36 hours.[5] The study by Shime et al.,[5] which had a smaller number of participants (n=142) than our study, had a very high neonatal mortality rate (7 out of 8). Furthermore, the authors did not report the AUROC or the sensitivity and specificity of the SOFA scores. They did not perform a neurological evaluation based on the SOFA score, which we also chose not to include. In our study, the AUROC values for the m-SOFA scores on admission and at 48 hours were higher than those of all the other m-SOFA scores. At a cut-off value of 6.5, the m-SOFA admission and peak m-SOFA scores showed good discrimination power. In a systematic review of SOFA-based models for predicting mortality in the ICU, Minne et al.[13] concluded that SOFA admission scores were competitive with severity of illness scores limited to the first 24 hours of admission. They advocated the use of a combination of sequential SOFA scores in conjunction with traditional models (e.g. APACHE). We agree that studies which combine SOFA with other severity scores, such as PIM, could also be valuable for predicting mortality after pediatric cardiac surgery.
We acknowledge that our study had several limitations. A retrospective analysis cannot provide evidence for the predictive power of a mortality scoring system that is as strong as that of a prospective study. Moreover, although statistically significant results emerged in our limited patient set, multi-center studies with larger patient numbers are necessary to enhance the statistical strength when testing the universality and reliability of the m-SOFA score as a mortality predictor.
Declaration of conflicting interests
The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
Funding
The authors received no financial support for the research and/or authorship of this article.
1) Pollack MM, Ruttimann UE, Getson PR. Pediatric risk of mortality (PRISM) score. Crit Care Med 1988;16:1110-6.
2) Pollack MM, Patel KM, Ruttimann UE. PRISM III: an updated Pediatric Risk of Mortality score. Crit Care Med 1996;24:743-52.
3) Shann F, Pearson G, Slater A, Wilkinson K. Paediatric index of mortality (PIM): a mortality prediction model for children in intensive care. Intensive Care Med 1997;23:201-7.
4) Slater A, Shann F, Pearson G; Paediatric Index of Mortality (PIM) Study Group. PIM2: a revised version of the Paediatric Index of Mortality. Intensive Care Med 2003;29:278-85.
5) Shime N, Kageyama K, Ashida H, Tanaka Y. Application of modified sequential organ failure assessment score in children after cardiac surgery. J Cardiothorac Vasc Anesth 2001;15:463-8.
6) Jones GD, Thorburn K, Tigg A, Murdoch IA. Preliminary data: PIM vs PRISM in infants and children post cardiac surgery in a UK PICU. Intensive Care Med 2000;26:145.
7) Czaja AS, Scanlon MC, Kuhn EM, Jeffries HE. Performance of the Pediatric Index of Mortality 2 for pediatric cardiac surgery patients. Pediatr Crit Care Med 2011;12:184-9. doi:10.1097/PCC.0b013e3181e89694.
8) van Keulen JG, Polderman KH, Gemke RJ. Reliability of PRISM and PIM scores in paediatric intensive care. Arch Dis Child 2005;90:211-4.
9) Barlas S, Tireli E, Tekinalp H, Karslı M, Dağlı S, Dayıoğlu E, et al. Apache III sisteminin kalp ameliyatı sonrasında kardiopulmoner yetersizlikli çocuklardaki değeri. Turk Gogus Kalp Dama 1994;4:304-8.
10) Vincent JL, de Mendonça A, Cantraine F, Moreno R, Takala J, Suter PM, et al. Use of the SOFA score to assess the incidence of organ dysfunction/failure in intensive care units: results of a multicenter, prospective study. Working group on “sepsis-related problems” of the European Society of Intensive Care Medicine. Crit Care Med 1998;26:1793-800.
11) Vincent JL, Ferreira F, Moreno R. Scoring systems for assessing organ dysfunction and survival. Crit Care Clin 2000;16:353-66.