ORIGINAL ARTICLE

The Surgical Mortality Probability Model
Derivation and Validation of a Simple Risk Prediction Rule for Noncardiac Surgery

Laurent G. Glance, MD,∗† Stewart J. Lustik, MD,∗ Edward L. Hannan, PhD,‡ Turner M. Osler, MD,§ Dana B. Mukamel, PhD,¶ Feng Qian, MD, PhD,∗ and Andrew W. Dick, PhD||

Objective: To develop a 30-day mortality risk index for noncardiac surgery that can be used to communicate risk information to patients and guide clinical management at the “point-of-care,” and that can be used by surgeons and hospitals to internally audit their quality of care.

Background: Clinicians rely on the Revised Cardiac Risk Index to quantify the risk of cardiac complications in patients undergoing noncardiac surgery. Because mortality from noncardiac causes accounts for many perioperative deaths, there is also a need for a simple bedside risk index to predict 30-day all-cause mortality after noncardiac surgery.

Methods: Retrospective cohort study of 298,772 patients undergoing noncardiac surgery during 2005 to 2007 using the American College of Surgeons National Surgical Quality Improvement Program database.

Results: The 9-point S-MPM (Surgical Mortality Probability Model) 30-day mortality risk index was derived empirically and includes three risk factors: ASA (American Society of Anesthesiologists) physical status, emergency status, and surgery risk class. Patients with ASA physical status I, II, III, IV, or V were assigned 0, 2, 4, 5, or 6 points, respectively; intermediate- or high-risk procedures were assigned 1 or 2 points, respectively; and emergency procedures were assigned 1 point. Patients with risk scores less than 5 had a predicted risk of mortality less than 0.50%, whereas patients with a risk score of 5 to 6 had a risk of mortality between 1.5% and 4.0%. Patients with a risk score greater than 6 had a risk of mortality of more than 10%. S-MPM exhibited excellent discrimination (C statistic, 0.897) and acceptable calibration (Hosmer–Lemeshow statistic 13.0, P = 0.023) in the validation data set.

Conclusions: Thirty-day mortality after noncardiac surgery can be accurately predicted using a simple risk score based on information readily available at the bedside. This risk index may play a useful role in facilitating shared decision making, developing and implementing risk-reduction strategies, and guiding quality improvement efforts.

(Ann Surg 2012;255:696–702)

From the Departments of ∗Anesthesiology and †Community and Preventive Medicine, University of Rochester School of Medicine, Rochester, NY; ‡School of Public Health, Department of Health Policy, Management and Behavior, Albany, NY; §Department of Surgery, University of Vermont Medical College, Burlington, VT; ¶Center for Health Policy Research, University of California, Irvine, CA; and ||RAND, Pittsburgh, PA.
Disclosure: The authors declare no conflicts of interest.
Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal’s Web site (www.annalsofsurgery.com).
Reprints: Laurent G. Glance, MD, Department of Anesthesiology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 604, Rochester, NY 14642. E-mail: [email protected].
Copyright © 2012 by Lippincott Williams & Wilkins. ISSN: 0003-4932/12/25504-0696. DOI: 10.1097/SLA.0b013e31824b45af

For more than 20 years, clinicians have used the Goldman Index,1 and its successor, the Revised Cardiac Risk Index (RCRI), to quantify the risk of cardiac complications and cardiac mortality in patients scheduled to undergo noncardiac surgery.2 Estimates of patient risk based on the RCRI are used in conjunction with AHA guidelines to implement cardiac risk-reduction strategies for preoperative patients.3 However, the RCRI was not designed, nor should it be used, to predict all-cause mortality in surgical patients.3 Mortality from noncardiac causes accounts for a large portion of perioperative deaths.4 In addition to the RCRI, we need a global measure of surgical risk to guide the clinical care of noncardiac surgical patients and make true informed consent more feasible. Like the RCRI, such a risk index needs to be based on readily available clinical data, simple enough to implement at the bedside, and robust enough to convey accurate risk information to patients and clinicians.

Although many models have been developed to estimate the risk of all-cause mortality in patients undergoing noncardiac surgery, no simple mortality risk score has been implemented in the United States.5–9 The American College of Surgeons (ACS) has spearheaded a national effort to benchmark general surgical outcomes in US hospitals.10 However, the ACS National Surgical Quality Improvement Program (NSQIP) model is too complicated to use at the bedside.11 Furthermore, because of the data collection burden and cost of participation, only 3% of US hospitals currently participate in the ACS NSQIP.12 By comparison, hospital report cards based on universally available administrative data using prediction models developed by the Agency for Healthcare Research and Quality (AHRQ)13 are prominently featured on the Web sites of many third-party payers. Unfortunately, models based on administrative data may generate biased measures of hospital performance because of poor data quality14,15 and, like the ACS model, are also too complicated to use to risk stratify individual patients at the bedside.
In contrast to the ACS and AHRQ surgical mortality prediction models, the RCRI is based on easily obtainable clinical data and can be rapidly calculated at the bedside. However, the RCRI was created to estimate cardiac risk and does not accurately predict overall mortality risk.16 Thus, the RCRI cannot be used to assess the overall risk of mortality for individual patients undergoing noncardiac surgery, or to perform hospital and physician benchmarking.

We sought to develop a simple risk index that can be used to communicate risk information to patients and guide clinical management at the “point-of-care,” and that can be used by surgeons and hospitals to internally audit their quality of care. Using clinical data from the ACS NSQIP, our objective was to develop a simple risk score for noncardiac surgical patients, which could be easily implemented without sacrificing predictive accuracy. In creating this risk score, we had 3 goals. First, the risk score should be based on readily available clinical data and should not require intensive data collection resources. Second, the risk score should be simple enough to use at the bedside to estimate risk of mortality without the use of a calculator. And third, this prediction model should be accurate enough to be used by hospitals and physicians with limited data collection resources to internally audit their outcomes. With respect to performance measurement, our goal is not to replace the NSQIP model, but rather to provide a reasonable alternative for nonpublic performance measurement when hospital participation in the ACS NSQIP is not feasible due to cost considerations.

Annals of Surgery, Volume 255, Number 4, April 2012. Copyright © 2012 Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.

METHODS

Data Source

This study used data from the ACS NSQIP database for patients undergoing noncardiac surgery between 2005 and 2007.
This database includes information on patient demographics, functional status, admission source, preoperative risk factors, intraoperative variables, 30-day postoperative outcomes, and predicted probability of 30-day mortality (based on the ACS NSQIP model) for patients undergoing major surgery in more than 200 participating hospitals.17 Hospital demographic information is not available in the research data file. A systematic sampling strategy is used to avoid bias in case selection and to ensure a diverse surgical case mix.17 Trained surgical clinical reviewers collect patient data from the medical chart, operative log, anesthesia record, interviews with the surgical attending, and telephone interviews with the patient. Data quality is ensured through comprehensive training of the nurse reviewers, and through an interrater reliability audit of participating sites. The University of Rochester School of Medicine institutional review board (Rochester, NY) approved this study after expedited review.

Study Population and Outcomes

We first identified 322,389 records for patients who underwent noncardiac surgery using current procedural terminology codes. We excluded patients who received no anesthesia, local anesthesia, or monitored anesthesia care (20,490); patients who were missing American Society of Anesthesiologists Physical Status (ASA PS) codes (158); and patients who were in coma or mechanically ventilated (2,969). The study cohort consisted of 298,772 patients.

Development and Validation of Risk Score

The outcome of interest was 30-day mortality. We elected not to create a comprehensive prediction model because the complexity of the resulting risk score would have rendered it impractical for clinical use. We based our selection of risk factors on clinical judgment and our review of the literature.
Because our objective was to create a parsimonious model to use as the basis for a risk score, we selected 3 components for the risk model: (1) ASA PS (I, II, III, IV, or V); (2) surgery-specific risk (low, intermediate, or high risk); and (3) emergent versus nonemergent operation. We used the ASA PS as a summary measure of baseline patient risk because it is highly correlated with other preoperative clinical risk predictors18 and has been shown to be a very strong predictor of outcomes19–23 (Table 1). A recent study designed to examine the predictive ability of parsimonious models based on ACS NSQIP data found that ASA PS was one of the 5 key clinical predictors, and that the limited model had similar statistical performance to the full model.23

TABLE 1. ASA PS Classification
ASA PS | Definition
I | A normal healthy patient
II | A patient with mild systemic disease
III | A patient with severe systemic disease
IV | A patient with severe systemic disease that is a constant threat to life
V | A moribund patient who is not expected to survive without the operation

We used a 3-stage approach to construct a risk score for surgical mortality.24,25 In the first stage, we used an empiric data-driven approach to classify surgical procedures into low-, intermediate-, and high-risk procedures. By comparison, risk scores, such as the RCRI,2 typically assign specific surgical procedures to risk categories based on expert opinion.5,7 First, we grouped current procedural terminology codes for similar procedures into categories (eg, pulmonary resection) and then created dummy variables for each of the procedure groups. Second, we estimated a logistic regression model using ASA PS, emergency status, and each of the procedure dummy variables as explanatory variables.
Third, we assigned each procedure to 1 of the 3 mutually exclusive risk categories (low, intermediate, and high risk) based on the estimated regression coefficient for each procedure (Supplemental Digital Content 1–3: Appendices 1a, 1b, and 1c, available at http://links.lww.com/SLA/A223, http://links.lww.com/SLA/A224, and http://links.lww.com/SLA/A225, respectively), as opposed to categorizing them using crude mortality rates. In the second stage, we reestimated a logistic regression model using ASA PS, surgery risk category, and emergency status as explanatory variables for 30-day mortality. We then assigned points to the levels of each risk factor based on the estimated coefficients. For each risk factor, the base category was assigned 0 points (ASA PS I, low-risk surgery, and nonemergent surgery). We then rounded the estimated coefficients to the nearest whole number to obtain the points associated with each of the risk factor levels. For example, because the estimated coefficient for ASA PS II was 2.009, patients with ASA PS II were assigned 2 points.25 In the third stage, we summed up the points for each patient. We then estimated a final logistic regression model in which patient score was the sole explanatory variable used to predict 30-day mortality. The data set was randomly split (50:50) into a derivation data set and a validation data set. The risk score was developed entirely using the derivation data set. The multivariate model and risk score estimated in the derivation data set were then cross-validated in the validation data set using measures of discrimination and calibration. Fractional polynomials were used to verify the linearity of the association between the logit of the risk score and 30-day mortality.26 The performance of the ACS NSQIP model was evaluated in the validation data set for general surgical and vascular patients using the ACS NSQIP probability-of-death (POD) present in the database. 
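The rounding step in the second stage can be illustrated with a minimal Python sketch. It assumes the coefficients reported in Table 4 and takes the intercept of −9.18 from the worked example in the Results; the names `COEFFICIENTS`, `INTERCEPT`, `coefficients_to_points`, and `full_model_probability` are hypothetical and are not part of the original analysis, which was performed in STATA.

```python
import math

# Coefficients as reported in Table 4; reference levels have a coefficient of 0.
COEFFICIENTS = {
    "asa_ps": {"I": 0.0, "II": 2.009, "III": 3.827, "IV": 5.062, "V": 6.366},
    "procedure_risk": {"low": 0.0, "intermediate": 1.250, "high": 2.065},
    "emergency": {False: 0.0, True: 0.934},
}

# Intercept taken from the worked example in the Results (assumption:
# it belongs to the same categorical model as the Table 4 coefficients).
INTERCEPT = -9.18


def coefficients_to_points(coefficients):
    """Round each coefficient to the nearest whole number to obtain its points."""
    return {
        factor: {level: int(round(beta)) for level, beta in levels.items()}
        for factor, levels in coefficients.items()
    }


def full_model_probability(asa_ps, procedure_risk, emergency):
    """Predicted 30-day mortality from the categorical logistic model."""
    linear_predictor = (
        INTERCEPT
        + COEFFICIENTS["asa_ps"][asa_ps]
        + COEFFICIENTS["procedure_risk"][procedure_risk]
        + COEFFICIENTS["emergency"][emergency]
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


POINTS = coefficients_to_points(COEFFICIENTS)
```

Under these assumptions, the rounded values reproduce the point assignments of the scoring system (for example, 2.009 becomes 2 points for ASA PS II), and `full_model_probability("III", "intermediate", False)` returns roughly 0.016.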
Data management and statistical analyses were performed using STATA SE/MP version 11 (STATA Corp, College Station, TX). All statistical tests were 2-tailed, and P values less than 0.05 were considered significant. We used robust variance estimators to account for the nonindependence of observations within hospitals.27 Model discrimination was assessed using the C statistic, and model calibration was assessed using the Hosmer–Lemeshow statistic and calibration plots.28

RESULTS

Patient demographics are displayed in Table 2. The overall mortality rate for the study cohort was 1.34%. The median age was 55, with an interquartile range between 42 and 68. Nearly 60% of the patients were male. The majority of the patients were ASA PS II (47%) or ASA PS III (37%), with the remainder distributed across ASA PS I (11%), IV (5.4%), and V (0.15%). Nearly 30% underwent either an intermediate-risk (12%) or high-risk (17%) surgical procedure. Approximately 6% had a history of previous open-heart surgery, and 5% had undergone angioplasty. The prevalence of diabetes was 14%, and 12% were morbidly obese. Figure 1 and Table 3 show mortality as a function of ASA PS and surgery-specific risk.

TABLE 2. Preoperative Risk Factors of the Study Cohort and of the S-MPM Classes
Risk Factor | Overall (N = 298,772) | S-MPM Class I, Low Risk (N = 229,779) | S-MPM Class II, Intermediate Risk (N = 55,882) | S-MPM Class III, High Risk (N = 13,111)
ASA PS
  I | 10.8 | 14.1 | 0 | 0
  II | 47.0 | 60.1 | 0 | 0
  III | 36.7 | 25.9 | 81.2 | 36.6
  IV | 5.39 | 0 | 14.7 | 60.2
  V | 0.15 | 0 | 0.03 | 3.24
Procedure risk
  Low risk | 70.4 | 87.6 | 16.2 | 0.10
  Intermediate risk | 12.3 | 5.87 | 40.2 | 6.66
  High risk | 17.3 | 6.54 | 43.6 | 93.2
Urgency
  Emergency | 12.9 | 9.15 | 14.8 | 69.8
Demographics
  Age∗ | 55 (42, 68) | 51 (39, 63) | 66 (56, 76) | 70 (58, 79)
  Male | 58.1 | 61.9 | 44.9 | 46.9
Cardiac
  Angina | 0.78 | 0.40 | 1.79 | 3.12
  Myocardial infarction | 0.59 | 0.15 | 1.52 | 4.19
  Congestive heart failure | 0.85 | 0.20 | 2.13 | 6.73
  Open-heart surgery | 6.06 | 3.23 | 15.1 | 17.2
  PCI | 5.14 | 3.00 | 12.2 | 12.5
  HTN | 44.3 | 37.1 | 68.0 | 69.5
Pulmonary
  COPD | 4.30 | 2.17 | 10.5 | 15.2
  Pneumonia, current | 0.31 | 0.08 | 0.64 | 2.99
  Dyspnea-at-rest | 1.13 | 0.43 | 2.42 | 7.78
Renal
  Renal failure | 2.08 | 0.77 | 5.13 | 11.9
Neurologic
  TIA | 2.89 | 2.14 | 5.33 | 5.67
  CVA | 2.36 | 1.34 | 5.27 | 7.98
Other
  Diabetes | 14.2 | 10.5 | 26.3 | 28.3
  Underweight | 2.31 | 1.65 | 4.15 | 6.16
  Morbid obesity | 11.7 | 13.1 | 7.37 | 6.00
Observed mortality rate | 1.34 | 0.21 | 2.80 | 35.6
Values are expressed as % unless otherwise stated.
∗Median (interquartile range).

FIGURE 1. The observed mortality rate as a function of American Society of Anesthesiologists’ physical status and surgery-specific risk.

The baseline logistic regression model, which included each of the risk factors coded as categorical variables (ASA PS, emergency status, and surgical risk), demonstrated excellent discrimination and acceptable calibration (Table 4). The C statistic, a measure of discrimination, was 0.902 in the derivation data set and 0.900 in the validation data set. The Hosmer–Lemeshow statistic was 12.7 (P = 0.026) in the derivation data set and 10.5 (P = 0.063) in the validation data set. These values for the Hosmer–Lemeshow statistic reflect acceptable calibration given the large sample sizes of the derivation and validation data sets, and the recognized sensitivity of the Hosmer–Lemeshow statistic to sample size.29

On the basis of the regression coefficients estimated using the derivation data set, a point value was assigned to each of the risk factors (Table 5). The total score was then calculated for each patient by summing points for each of the 3 predictors.
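As a bedside-style illustration of this summation, the following Python sketch adds up the points from the scoring system in Table 7, maps the total to an S-MPM class using the cut points in Table 6, and looks up the rounded estimated mortality from Table 5. The function and dictionary names are hypothetical, and the sketch is illustrative only, not a validated clinical tool.

```python
# Points per risk factor level, as listed in Table 7.
ASA_POINTS = {"I": 0, "II": 2, "III": 4, "IV": 5, "V": 6}
PROCEDURE_POINTS = {"low": 0, "intermediate": 1, "high": 2}
EMERGENCY_POINTS = {False: 0, True: 1}

# Rounded estimated 30-day mortality (%) by point total, as listed in Table 5.
ROUNDED_MORTALITY_PCT = {
    0: 0.01, 1: 0.02, 2: 0.07, 3: 0.2, 4: 0.5,
    5: 1.5, 6: 4.0, 7: 10, 8: 25, 9: 50,
}


def smpm_score(asa_ps, procedure_risk, emergency):
    """Sum the points for the 3 predictors (total ranges from 0 to 9)."""
    return (
        ASA_POINTS[asa_ps]
        + PROCEDURE_POINTS[procedure_risk]
        + EMERGENCY_POINTS[emergency]
    )


def smpm_class(score):
    """Map a point total to an S-MPM class using the Table 6 cut points."""
    if not 0 <= score <= 9:
        raise ValueError("S-MPM point total must be between 0 and 9")
    if score <= 4:
        return "I"
    if score <= 6:
        return "II"
    return "III"


# Example: ASA PS III patient, intermediate-risk procedure, nonemergent.
score = smpm_score("III", "intermediate", False)
print(score, smpm_class(score), ROUNDED_MORTALITY_PCT[score])
```

A lookup table is used here rather than a fitted equation because the coefficients of the final score-only logistic model are not reported in the text.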
The final logistic regression model, which included only total score as an explanatory variable, also exhibited excellent discrimination and acceptable calibration. The C statistic was 0.899 in the derivation data set and 0.897 in the validation data set. The Hosmer–Lemeshow statistic was 5.53 (P = 0.35) in the derivation data set and 13.0 (P = 0.023) in the validation data set. Visual inspection of the calibration plot, and comparisons of the observed and predicted mortality rates for each score level (Table 5), indicate very good model calibration. We created 3 S-MPM (Surgical Mortality Probability Model) classes to facilitate bedside determination of the approximate risk of all-cause mortality: Class I <0.50%; Class II 1.5%–4.0%; and Class III >10% based on point totals (Table 6).

Of the 149,172 patients in the validation data set, 144,404 had an ACS NSQIP predicted POD in the database. The performance of ACS NSQIP was compared to S-MPM in this subset of the validation data set. The discrimination of S-MPM (C statistic, 0.897) was slightly worse than that of ACS NSQIP (C statistic, 0.935). S-MPM was better calibrated (Hosmer–Lemeshow statistic 11.8; P = 0.04) than the ACS NSQIP model (Hosmer–Lemeshow statistic 25.3; P < 0.001) (Fig. 3).

To illustrate the application of S-MPM to calculate the POD, we show a specific example of the correspondence between the mortality prediction estimated by the full logistic regression model and the mortality estimate based on the risk index:

Case. A 60-year-old ASA PS III patient undergoing an elective cholecystectomy.

Risk Factor | Value | Points
ASA PS | ASA III | 4
Procedure risk | Intermediate | 1
Emergency | Nonemergent | 0
Point total | | 5
Estimate of risk | | 1.5%

Estimating the probability of mortality using the full logistic regression model (Table 4):

$$\sum_i \beta_i X_i = -9.18 + 3.83(1) + 1.25(1) = -4.10$$

$$P = \frac{1}{1 + \exp\left(-\sum_i \beta_i X_i\right)} = \frac{1}{1 + \exp(4.10)} = 1.6\%$$

Using the point system (Table 7), the estimated risk of mortality is 1.5% versus 1.6% for the full logistic regression model.

TABLE 3. Observed Mortality Percent/Number at Risk as a Function of ASA PS and Surgery Risk Category
ASA PS | Low risk | Intermediate risk | High risk
I | 0.03 / 15,226 | 0.65 / 459 | 0.00 / 427
II | 0.08 / 56,980 | 0.52 / 6,673 | 0.75 / 6,306
III | 0.49 / 32,200 | 2.21 / 12,338 | 4.81 / 10,267
IV | 3.31 / 2,420 | 7.15 / 2,419 | 19.2 / 3,251
V | 7.69 / 13 | 32.7 / 49 | 54.2 / 144

TABLE 4. Logistic Regression Model Used to Assign Points to Risk Factors in S-MPM
Risk Factor | Coefficient | 95% CI | P | Odds Ratio
ASA physical status
  I | Reference | | |
  II | 2.009 | (0.861, 3.157) | 0.001 | 7.46
  III | 3.827 | (2.687, 4.968) | <0.001 | 45.9
  IV | 5.062 | (3.920, 6.206) | <0.001 | 158
  V | 6.366 | (5.194, 7.539) | <0.001 | 582
Procedure risk
  Low risk | Reference | | |
  Intermediate risk | 1.250 | (1.082, 1.419) | <0.001 | 3.49
  High risk | 2.065 | (1.921, 2.210) | <0.001 | 7.89
Emergency
  Nonemergent | Reference | | |
  Emergency surgery | 0.934 | (0.829, 1.039) | <0.001 | 2.54

TABLE 5. Goodness-of-Fit of the S-MPM Risk Score in the Validation Sample
Point Total | n | Observed Mortality, % | Estimated Mortality, % | Mortality, % (Rounded)
0 | 11,313 | 0.035 | 0.009 | 0.01
1 | 4,245 | 0 | 0.024 | 0.02
2 | 49,810 | 0.072 | 0.067 | 0.07
3 | 12,366 | 0.27 | 0.19 | 0.2
4 | 36,951 | 0.48 | 0.52 | 0.5
5 | 14,354 | 1.67 | 1.46 | 1.5
6 | 13,555 | 3.98 | 3.98 | 4.0
7 | 4,762 | 10.4 | 10.4 | 10
8 | 1,677 | 25.3 | 24.6 | 25
9 | 139 | 56.8 | 47.8 | 50

DISCUSSION

We have used a large multicenter database to create a simple scoring system for predicting all-cause mortality in patients undergoing noncardiac surgery.
Our scoring system requires that clinicians determine only 3 risk factors (ASA PS, surgical risk category, and emergency status) to predict 30-day all-cause mortality for noncardiac surgical patients with a high degree of accuracy. We believe that this risk index is simple enough to implement at the patient’s bedside and can be used to help inform clinical decision making. By providing an estimate of overall mortality risk, S-MPM complements the preoperative assessment of cardiac risk obtained using the RCRI, and may provide a framework for the development and implementation of risk-reduction strategies based on all-cause mortality. The additional information provided by S-MPM may also facilitate informed consent and shared decision making between patients and their physicians.30 Finally, our risk score may prove useful to hospitals that lack the resources to participate in ACS NSQIP, but nonetheless want to assess their risk-adjusted outcomes after noncardiac surgery for quality improvement.

FIGURE 2. Calibration graph for S-MPM. The solid line is the predicted mortality, and the open circles represent observed mortality rates. Vertical bars represent 95% confidence intervals for the observed mortality rates.

TABLE 6. S-MPM Class Levels and Associated Risk of Mortality
Class | Point Total | Mortality
I | 0–4 | <0.50%
II | 5–6 | 1.5%–4.0%
III | 7–9 | >10%

FIGURE 3. Calibration graph for S-MPM versus ACS NSQIP models. The black line represents perfect calibration, where observed and predicted mortality are identical.

TABLE 7. S-MPM Scoring System for Estimating Risk of 30-Day Mortality After Noncardiac Surgery
Risk Factor | Points Assigned
ASA physical status
  I | 0
  II | 2
  III | 4
  IV | 5
  V | 6
Procedure risk
  Low risk | 0
  Intermediate risk | 1
  High risk | 2
Emergency
  Nonemergent | 0
  Emergency surgery | 1

The ASA PS is the most important risk factor in S-MPM. The relationship between ASA PS and mortality was recognized early on by Dripps and colleagues19 and by other investigators.20–22,31,32 Despite this finding, the ASA PS was never designed to “prognosticate the effect of a surgical procedure.”33 Yet, 7 decades after it was introduced,33 the ASA PS remains one of the most important single predictors of mortality and morbidity for general surgery.23,34,35 One of the major strengths of the ASA PS is its simplicity, but this simplicity is also its greatest limitation. Specifically, the lack of precise definitions for each of the ASA PS levels can result in inconsistent ratings.36–38 Different physicians may not always agree on whether a patient should be classified as an ASA II patient (a patient with mild systemic disease) or an ASA III patient (a patient with severe systemic disease). However, despite the subjective nature of the ASA PS, it is one of the key risk factors in both the ACS NSQIP and VA NSQIP prediction models, and it is also 1 of the 5 risk factors in the parsimonious models proposed to replace the more comprehensive current models employed by ACS NSQIP.23 Furthermore, predictions of ASA PS classes using objective NSQIP risk variables were found to correlate strongly with assignments made by anesthesiologists, further underlining the objective basis of the ASA PS classification system.18

To our knowledge, only 2 other surgical risk indices based on the ASA PS have been proposed, and neither has been widely incorporated into clinical practice.
The first, published more than 20 years ago, was developed using a cohort of 2055 cases. It predicted major complications occurring within the first 24 hours after surgery, as opposed to 30-day mortality. This risk score was based on patient age, ASA PS score, emergency status, and surgical risk.39 The second, the Surgical Risk Scale (SRS) for in-hospital mortality, was developed using a cohort of 4903 patients treated by 3 surgeons.7 This risk index was based on ASA PS, surgical urgency, and surgery-specific risk. The major limitation of SRS is that it is based on a relatively small patient cohort and may not be readily generalizable to a US patient population. In contrast, S-MPM was developed and validated using a cohort of 298,772 patients undergoing surgery at more than 200 centers.

Compared to the ACS NSQIP mortality model, S-MPM has slightly worse discrimination and marginally better calibration. Based on only 3 predictors, S-MPM exhibits excellent discrimination with a C statistic of 0.90, compared to the 35-variable ACS NSQIP risk adjustment model, which has a C statistic of 0.94. Recently, Dimick and colleagues23 have developed parsimonious 5-variable procedure-specific models also based on the ACS NSQIP database. These models had C statistics ranging between 0.73 and 0.92. The goal of creating these models was to reduce the burden of data collection and make participation in the ACS NSQIP more affordable. However, unlike S-MPM, these risk-adjustment models are not risk scores that can be easily implemented at the bedside. Furthermore, these models are only applicable to a limited set of procedures: cholecystectomy, ventral hernia repair, gastric bypass, pancreatectomy, and colectomy. Finally, because each of these procedure-specific models has unique coefficients and different variables, no single risk score could be created to quantify mortality risk based on these separate procedure-specific models.
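The C statistics compared above can be read as the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. The following minimal Python sketch computes it by direct pairwise comparison; the function name `c_statistic` and the toy scores and outcomes are hypothetical and do not come from the study data.

```python
def c_statistic(scores, died):
    """Concordance (C) statistic by pairwise comparison.

    Counts the share of (died, survived) patient pairs in which the
    patient who died had the higher score; tied scores count as half.
    """
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for score_dead in dead:
        for score_alive in alive:
            if score_dead > score_alive:
                concordant += 1.0
            elif score_dead == score_alive:
                concordant += 0.5
    return concordant / (len(dead) * len(alive))


# Hypothetical toy data: S-MPM point totals and 30-day outcomes.
toy_scores = [2, 4, 5, 7, 8, 3, 6]
toy_died = [False, False, True, False, True, False, False]
print(c_statistic(toy_scores, toy_died))
```

The pairwise loop is quadratic in the number of patients, so for a cohort the size of the one used here a rank-based calculation would be the practical choice; the loop is kept because it mirrors the definition directly.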
Gawande et al40 proposed the Surgical Apgar Score to predict major postoperative complications and mortality for patients undergoing general or vascular surgery.11,41–43 This risk index incorporates heart rate, blood pressure, and estimated blood loss. This surgical outcome score was initially developed and validated at 2 major academic centers in the United States. The Surgical Apgar Score was subsequently tested at 8 international sites serving as pilot sites for a surgical quality improvement program sponsored by the World Health Organization.43 As a standalone risk index, it exhibits moderately good discrimination and conveys important prognostic information. One of the primary advantages of the Surgical Apgar Score is its simplicity. Its primary limitation is the potential for measurement variability, especially with respect to estimated blood loss.40 Unlike S-MPM, the Surgical Apgar Score was not designed for performance benchmarking41 because it does not adjust for preoperative risk factors or surgical complexity. If it were used for surgical quality reporting, it would give “credit” to surgeons with more extensive surgical blood loss. In other words, if 2 surgeons were performing an identical procedure (such as a cholecystectomy) and had identical crude mortality rates, the surgeon with more intraoperative blood loss would have a higher predicted mortality rate and would thus appear to have a lower risk-adjusted mortality rate.

Our goal in creating this new surgical risk index was to create a risk score that is both accurate enough to convey important prognostic information and simple enough to use at the patient’s bedside using readily available risk factors. S-MPM meets these criteria.
Data collection is limited to 3 risk factors: ASA PS, emergency status, and type of operation. These data elements are easily obtained. The goodness-of-fit of S-MPM in the validation sample suggests that the predicted probabilities accurately reflect the observed outcomes in our patient sample. Because the ACS NSQIP collects data from a self-selected group of major medical centers, mortality estimates based on S-MPM will reflect the mortality experience in a group of hospitals that may have somewhat lower mortality rates compared to “average” hospitals in the United States. Additional work may be necessary to recalibrate this model to a more representative population-based database.

Risk scores, like practice guidelines, should not be used in isolation to make clinical decisions. Although prognostication based on a prediction model may at first appear to be devoid of the subjectivity of clinical reasoning, it has long been recognized that “statistical model building can be as much an art as it is a science.”44 Nonetheless, risk scores can be used to enhance clinical decision making by supplementing a clinician’s subjective “gut feeling”41 with a summary measure based on a patient sample much larger than the sum total of any individual physician’s clinical experience. One of the central tenets of evidence-based medicine is that clinical decision making should involve the integration of best evidence with individual expertise.45 Risk scores, properly constructed, represent the best evidence on patient prognosis.

The utility of risk scores to help guide clinical decision making is exemplified by the widespread application of the RCRI to guide the preoperative cardiac evaluation of surgical patients based on a patient’s risk of developing a cardiac complication.46 Risk-reduction strategies, such as the intensity of postoperative surveillance, are dictated by patient risk and surgical risk.
The introduction of a more general risk score, such as S-MPM, may facilitate the development of other evidence-based practice guidelines, like the AHA/ACC guidelines, in which the branch points in decision trees are based on objective measures of patient risk.

This study has some potential limitations. First, it can be argued that using the ASA PS as one of the key variables in S-MPM introduces too much subjectivity, limiting the potential value of this risk index. As noted earlier, however, the potential for measurement error in the ASA PS does not appear to significantly impact the value of the ASA PS as a key predictor of surgical outcome, nor its use in the 2 largest surgical benchmarking efforts in the United States: the VA NSQIP and the ACS NSQIP. Second, the subjectivity of the ASA PS may lead to up-coding of the ASA PS if S-MPM were used as the basis for public reporting of hospital surgical outcomes. However, this risk index is not designed to be used in this manner. Third, our categorization of specific surgical procedures into high-, intermediate-, and low-risk procedures may not always match clinical intuition. For example, we classify pulmonary resections and exploratory laparotomies as high-risk surgery, whereas these are typically considered to be intermediate-risk procedures. In mapping specific procedures to risk categories, we relied on the results of our regression analyses, as opposed to expert clinical judgment. To the extent that some procedures in S-MPM are classified differently than expected, clinicians will need to “recalibrate” their choice of surgical risk to use S-MPM. Because most surgeons perform a limited number of surgical procedures, we do not believe that this should present a significant obstacle for most clinicians. Fourth, despite being based on a large patient population, our study cohort is not population-based, and our risk score may not perform as well in an external data set.
This drop-off in performance when a scoring system is applied to a different population has long been recognized,47 and it is unlikely to limit the clinical value of this new risk score3 because it would be straightforward to recalibrate S-MPM to the new patient population (as long as the characteristics of the new population are not markedly different from ACS NSQIP). Finally, our risk score is intended to provide a baseline estimate of risk before a patient goes to surgery. In some cases, the nature and extent of the planned operation will change during the course of the operation. In theory, the observed statistical performance of this risk index may be overstated because the risk index classifies surgery-specific risk on the basis of the actual procedure, as opposed to the planned procedure. Although we do not believe that this increment in model performance is clinically important, we cannot rule it out using available data.

CONCLUSIONS

In summary, we have developed a simple risk index for all-cause 30-day mortality for noncardiac surgery. S-MPM is a 9-point score based on a patient’s ASA PS, surgery-specific risk, and whether the procedure is performed on an emergency basis. Despite its simplicity and ease of application, this risk score exhibits excellent statistical performance. S-MPM may play a useful role in facilitating shared decision making, developing and implementing risk-reduction strategies, and guiding quality improvement efforts.

ACKNOWLEDGMENTS

This project was supported by a grant from the Agency for Healthcare Research and Quality (RO1 HS 16737) and the Department of Anesthesiology, University of Rochester. The sponsors of this study had no role in the conduct of this study; in the analysis or the interpretation of the data; or in the preparation, review, or approval of the article.
The views presented in this manuscript are those of the authors and may not reflect those of the Agency for Healthcare Research and Quality.

REFERENCES

1. Goldman L, Caldera DL, Nussbaum SR, et al. Multifactorial index of cardiac risk in noncardiac surgical procedures. N Engl J Med. 1977;297:845–850.
2. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100:1043–1049.
3. Goldman L. The revised cardiac risk index delivers what it promised. Ann Int Med. 2010;152:57–58.
4. Devereaux PJ, Yang H, Yusuf S, et al. Effects of extended-release metoprolol succinate in patients undergoing non-cardiac surgery (POISE trial): a randomised controlled trial. Lancet. 2008;371:1839–1847.
5. Copeland GP, Jones D, Walters M. POSSUM: a scoring system for surgical audit. Br J Surg. 1991;78:355–360.
6. Prytherch DR, Whiteley MS, Higgins B, et al. POSSUM and Portsmouth POSSUM for predicting mortality. Physiological and Operative Severity Score for the enUmeration of Mortality and morbidity. Br J Surg. 1998;85:1217–1220.
7. Sutton R, Bann S, Brooks M, et al. The Surgical Risk Scale as an improved tool for risk-adjusted analysis in comparative surgical audit. Br J Surg. 2002;89:763–768.
8. Neary WD, Prytherch D, Foy C, et al. Comparison of different methods of risk stratification in urgent and emergency surgery. Br J Surg. 2007;94:1300–1305.
9. Liebman B, Strating RP, van Wieringen W, et al. Risk modelling of outcome after general and trauma surgery (the IRIS score). Br J Surg. 2010;97:128–133.
10. Hall BL, Hamilton BH, Richards K, et al. Does surgical quality improve in the American College of Surgeons National Surgical Quality Improvement Program: an evaluation of all participating hospitals. Ann Surg. 2009;250:363–376.
11. Regenbogen SE, Lancaster RT, Lipsitz SR, et al. Does the Surgical Apgar Score measure intraoperative performance? Ann Surg. 2008;248:320–328.
12. Birkmeyer JD, Shahian DM, Dimick JB, et al. Blueprint for a new American College of Surgeons: National Surgical Quality Improvement Program. J Am Coll Surg. 2008;207:777–782.
13. Glance LG, Osler TM, Mukamel DB, et al. Impact of the present-on-admission indicator on hospital quality measurement: experience with the Agency for Healthcare Research and Quality (AHRQ) Inpatient Quality Indicators. Med Care. 2008;46:112–119.
14. Iezzoni LI. Assessing quality using administrative data. Ann Int Med. 1997;127:666–674.
15. Gordon HS, Johnson ML, Wray NP, et al. Mortality after noncardiac surgery: prediction from administrative versus clinical data. Med Care. 2005;43:159–167.
16. Ford MK, Beattie WS, Wijeysundera DN. Systematic review: prediction of perioperative cardiac complications and mortality by the revised cardiac risk index. Ann Int Med. 2010;152:26–35.
17. Khuri SF, Henderson WG, Daley J, et al. The patient safety in surgery study: background, study design, and patient populations. J Am Coll Surg. 2007;204:1089–1102.
18. Davenport DL, Bowe EA, Henderson WG, et al. National Surgical Quality Improvement Program (NSQIP) risk factors can be used to validate American Society of Anesthesiologists physical status classification (ASA PS) levels. Ann Surg. 2006;243:636–641; discussion 641–644.
19. Dripps RD, Lamont A, Eckenhoff JE. The role of anesthesia in surgical mortality. JAMA. 1961;178:261–266.
20. Vacanti CJ, VanHouten RJ, Hill RC. A statistical analysis of the relationship of physical status to postoperative mortality in 68,388 cases. Anesth Analg. 1970;49:564–566.
21. Marx GF, Mateo CV, Orkin LR. Computer analysis of postanesthetic deaths. Anesthesiology. 1973;39:54–58.
22. Wolters U, Wolf T, Stutzer H, et al. ASA classification and perioperative variables as predictors of postoperative outcome. Br J Anaesth. 1996;77:217–222.
23. Dimick JB, Osborne NH, Hall BL, et al. Risk adjustment for comparing hospital quality with surgery: how many variables are needed? J Am Coll Surg. 2010;210:503–508.
24. Wasson JH, Sox HC, Neff RK, et al. Clinical prediction rules. Applications and methodological standards. N Engl J Med. 1985;313:793–799.
25. Sullivan LM, Massaro JM, D’Agostino RB, Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med. 2004;23:1631–1660.
26. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. Applied Statistics. 1994;43:429–467.
27. White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817–830.
28. Hosmer DW, Lemeshow S. Applied Logistic Regression. Vol 2. New York, NY: Wiley-Interscience Publication; 2000.
29. Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: the Hosmer-Lemeshow test revisited. Crit Care Med. 2007;35:2052–2056.
30. Krumholz HM. Informed consent to promote patient-centered care. J Am Med Assoc. 2010;303:1190–1191.
31. Prause G, Ratzenhofer-Comenda B, Pierer G, et al. Can ASA grade or Goldman’s cardiac risk index predict peri-operative mortality? A study of 16,227 patients. Anaesthesia. 1997;52:203–206.
32. Prause G, Offner A, Ratzenhofer-Komenda B, et al. Comparison of two preoperative indices to predict perioperative mortality in non-cardiac thoracic surgery. Eur J Cardiothorac Surg. 1997;11:670–675.
33. Saklad M. Grading of patients for surgical procedures. Anesthesiology. 1941;2:281–284.
34. Khuri SF, Daley J, Henderson W, et al. Risk adjustment of the postoperative mortality rate for the comparative assessment of the quality of surgical care: results of the National Veterans Affairs Surgical Risk Study. J Am Coll Surg. 1997;185:315–327.
35. Daley J, Khuri SF, Henderson W, et al. Risk adjustment of the postoperative morbidity rate for the comparative assessment of the quality of surgical care: results of the National Veterans Affairs Surgical Risk Study. J Am Coll Surg. 1997;185:328–340.
36. Owens WD, Felts JA, Spitznagel EL, Jr. ASA physical status classifications: a study of consistency of ratings. Anesthesiology. 1978;49:239–243.
37. Ranta S, Hynynen M, Tammisto T. A survey of the ASA physical status classification: significant variation in allocation among Finnish anaesthesiologists. Acta Anaesthesiol Scand. 1997;41:629–632.
38. Mak PH, Campbell RC, Irwin MG. The ASA physical status classification: inter-observer consistency. American Society of Anesthesiologists. Anaesth Intensive Care. 2002;30:633–640.
39. Tiret L, Hatton F, Desmonts JM, et al. Prediction of outcome of anaesthesia in patients over 40 years: a multifactorial risk index. Stat Med. 1988;7:947–954.
40. Gawande AA, Kwaan MR, Regenbogen SE, et al. An Apgar score for surgery. J Am Coll Surg. 2007;204:201–208.
41. Regenbogen SE, Ehrenfeld JM, Lipsitz SR, et al. Utility of the surgical Apgar score: validation in 4119 patients. Arch Surg. 2009;144:30–36; discussion 37.
42. Regenbogen SE, Bordeianou L, Hutter MM, et al. The intraoperative Surgical Apgar Score predicts postdischarge complications after colon and rectal resection. Surgery. 2010;148:559–566.
43. Haynes AB, Regenbogen SE, Weiser TG, et al. Surgical outcome measurement for a global patient population: validation of the Surgical Apgar Score in 8 countries. Surgery. 2011;149:519–524.
44. Lemeshow S, Teres D, Klar J, et al. Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients. J Am Med Assoc. 1993;270:2478–2486.
45. Sackett DL, Rosenberg WM, Gray JA, et al. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312:71–72.
46. Fleisher LA, Beckman JA, Brown KA, et al. ACC/AHA 2007 Guidelines on Perioperative Cardiovascular Evaluation and Care for Noncardiac Surgery: Executive Summary: A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Writing Committee to Revise the 2002 Guidelines on Perioperative Cardiovascular Evaluation for Noncardiac Surgery): Developed in Collaboration With the American Society of Echocardiography, American Society of Nuclear Cardiology, Heart Rhythm Society, Society of Cardiovascular Anesthesiologists, Society for Cardiovascular Angiography and Interventions, Society for Vascular Medicine and Biology, and Society for Vascular Surgery. Circulation. 2007;116:1971–1996.
47. Teres D, Lemeshow S. As American as apple pie and APACHE. Acute physiology and chronic health evaluation. Crit Care Med. 1998;26:1297–1298.

Copyright © 2012 Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.