University of Iowa
Iowa Research Online
Theses and Dissertations
Spring 2013

A critical evaluation of healthcare quality improvement and how organizational context drives performance

Justin Mathew Glasgow
University of Iowa

Copyright 2013 Justin Mathew Glasgow

This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2503

Recommended Citation
Glasgow, Justin Mathew. "A critical evaluation of healthcare quality improvement and how organizational context drives performance." PhD (Doctor of Philosophy) thesis, University of Iowa, 2013. http://ir.uiowa.edu/etd/2503.

A CRITICAL EVALUATION OF HEALTHCARE QUALITY IMPROVEMENT AND HOW ORGANIZATIONAL CONTEXT DRIVES PERFORMANCE

by
Justin Mathew Glasgow

An Abstract

Of a thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Epidemiology in the Graduate College of The University of Iowa

May 2013

Thesis Supervisor: Associate Professor Peter J. Kaboli

ABSTRACT

This thesis explored healthcare quality improvement, considering the general question of why the last decade's worth of quality improvement (QI) had not significantly improved quality and safety. The broad objective of the thesis was to explore how hospitals perform when completing QI projects and whether any organizational characteristics were associated with that performance. First, the project evaluated a specific QI collaborative undertaken in the Veterans Affairs (VA) healthcare system. The goal of the collaborative was to improve patient flow throughout the entire care process, leading to shorter hospital length of stay (LOS) and an increased percentage of patients discharged before noon. These two goals became the primary outcomes of the analysis and were balanced by three secondary quality-check outcomes: 30-day readmission, in-hospital mortality, and 30-day mortality.
The analytic model consisted of a five-year interrupted time-series examining baseline performance (the two years prior to the intervention), the year of the QI collaborative, and the two years after the intervention, to determine how well improvements were maintained post-intervention. The results of these models were then used to create a novel four-level classification of hospital performance. Overall, the analysis indicated a significant amount of variation in performance; however, subgroup analyses could not identify any patterns among hospitals falling into specific performance categories.

Given this potentially meaningful variation, the second half of the thesis worked to understand whether specific organizational characteristics provided support for, or acted as key barriers to, QI efforts. The first step in this process involved developing an analytic model describing how various categories of organizational characteristics interact to create an environment that modifies a QI collaborative's ability to produce measurable outcomes. This framework was then tested using a collection of variables extracted from two surveys, the categorized hospital performance from part one, and data mining decision trees. Although the results did not identify any strong associations between QI performance and organizational characteristics, the analysis generated a number of interesting hypotheses and provided mild support for the conceptual model.

Overall, this thesis generated more questions than it answered. Nevertheless, it made three key contributions to the field of healthcare QI. First, this thesis represents the most thorough comparative analysis of hospital performance on QI to date and identified four unique hospital performance categories. Second, the conceptual model developed here represents a comprehensive approach for considering how organizational characteristics modify a standardized QI initiative.
Third, data mining was introduced to the field as a useful tool for analyzing large datasets and developing important hypotheses for future studies.

Abstract Approved: _______________________________________________
Thesis Supervisor

Associate Professor, Department of Internal Medicine
Title and Department

October 4, 2011
Date

A CRITICAL EVALUATION OF HEALTHCARE QUALITY IMPROVEMENT AND HOW ORGANIZATIONAL CONTEXT DRIVES PERFORMANCE

by
Justin Mathew Glasgow

A thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Epidemiology in the Graduate College of The University of Iowa

May 2013

Thesis Supervisor: Associate Professor Peter J. Kaboli

Graduate College
The University of Iowa
Iowa City, Iowa

CERTIFICATE OF APPROVAL

PH.D. THESIS

This is to certify that the Ph.D. thesis of Justin Mathew Glasgow has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Epidemiology at the May 2013 graduation.
Thesis Committee:
Peter Kaboli, Thesis Supervisor
James Torner
Elizabeth Chrischilles
Ryan Carnahan
Jason Hockenberry
Jill Scott-Cawiezell

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 – INTRODUCTION
  Study Overview
  Summary

CHAPTER 2 – QUALITY IMPROVEMENT COLLABORATIVES
  The Collaborative Approach to Quality
  Flow Improvement Inpatient Initiative (FIX)
  FIX Analysis Overview
  Conclusions

CHAPTER 3 – TIME-SERIES METHODS
  Data Sources
  Data Elements
  Patient Cohort
  Risk Adjustment
  Time-Series Model
  Improvement and Sustainability
  Sub-group Analyses
  Conclusions

CHAPTER 4 – TIME-SERIES RESULTS AND DISCUSSION
  System-Wide Analysis
  Facility Analysis
  Evaluation of the Specific Aims
  Discussion
  Limitations
  Conclusions

CHAPTER 5 – SUPPORTING QUALITY IMPROVEMENT
  Relationships with Healthcare Quality
  Relationships with Quality Improvement Efforts
  Analytic Framework
  Conclusions

CHAPTER 6 – ANALYTIC VARIABLES AND DATA MINING
  Organizational Characteristics in VA
  VA Hospital Organizational Context
  Data Mining Overview
  Decision Tree Development
  Decision Tree Interpretation
  Conclusions

CHAPTER 7 – DECISION TREE RESULTS AND DISCUSSION
  Decision Tree Performance Metrics
  Individual Decision Trees
  Discussion
  Interpreting the Analytic Framework
  Limitations
  Conclusions

CHAPTER 8 – SUMMARY AND FUTURE WORK
  Project Summary
  Human Factors and Change Management
  Recommendations for Improving QI
  Future Studies
  Conclusions

APPENDIX A – RISK ADJUSTMENT MODEL SAS CODE
APPENDIX B – SAS OUTPUT FOR RISK ADJUSTMENT
APPENDIX C – FACILITY PERFORMANCE BY SIZE AND REGION
APPENDIX D – FULL VARIABLE LISTS
REFERENCES

LIST OF TABLES

Table 2-1: Reported calculation of cost savings from FIX
Table 3-1: List of outcome measures
Table 3-2: Comparison of risk adjustment cohort to all other FY07 discharges
Table 3-3: List of potential risk adjustment variables, the number of discrete categories, and a description of how categories were defined
Table 3-4: Modeling of age risk adjustment categories
Table 3-5: Modeling of race risk adjustment categories
Table 3-6: Modeling of service connected risk adjustment categories
Table 3-7: Modeling of admission source risk adjustment categories
Table 3-8: Modeling of place of discharge risk adjustment categories
Table 3-9: Highly correlated risk adjustment variables
Table 3-10: Description of full classification categories
Table 4-1: Hospital classification across the 5 outcome measures (N = 130)
Table 4-2: LOS Improvers classification (N = 45)
Table 4-3: Discharge before noon Improvers classification (N = 60)
Table 4-4: P-values from chi-square tests examining facility performance in subgroups by size and regional location
Table 6-1: Categories for different response scales in the CPOS survey
Table 6-2: Variables measuring facility structure
Table 6-3: Variables measuring QI structure
Table 6-4: Calculated and composite measures of QI structure
Table 6-5: Variables measuring QI process
Table 6-6: Calculated and composite measures of QI process
Table 6-7: Point ranges for composite model classification
Table 7-1: Data mining sample performance classifications (N = 100)
Table 7-2: Decision tree performance metrics
Table 7-3: Count of factors in each of the decision trees
Table 7-4: List of individual and composite variables in the decision trees

LIST OF FIGURES

Figure 2-1: Model of the IHI BTS Collaborative timeline for FY07 FIX
Figure 3-1: Decision tree used to classify hospital performance
Figure 4-1: Aggregate results for LOS (FY05 - FY09)
Figure 4-2: Aggregate results for in-hospital mortality (FY05 - FY09)
Figure 4-3: Aggregate results for 30-day mortality (FY05 - FY09)
Figure 4-4: Aggregate results for discharges before noon (FY05 - FY09)
Figure 4-5: Aggregate results for 30-day readmissions (FY05 - FY09)
Figure 5-1: Analytic framework for how organizational context impacts QI
Figure 7-1: Full decision tree for LOS performance
Figure 7-2: Full decision tree for discharges before noon performance
Figure 7-3: Full decision tree for LOS/Noon composite performance
Figure 7-4: Full decision tree for overall composite performance
CHAPTER 1 – INTRODUCTION

In the years since the Institute of Medicine (IOM) reported that as many as 98,000 people die each year as a result of medical errors,1 the healthcare community has been focused on efforts to improve quality, efficiency, and safety. While considerable effort has gone into improving healthcare quality, broad measures of quality do not show the expected gains. One common monitor of quality is the National Healthcare Quality Report (NHQR), which tracks annual performance on several quality measures. In 2008, the report found only a 1.4% average annual increase across all measures of quality, with a concomitant 0.9% average annual decrease in scores on patient safety measures.2 The 2009 report continued the theme, noting that while it was possible to identify small pockets of success, the overall variability across the healthcare industry was too great to claim any success in improving quality and safety.3

Further confirming the lack of improvement in quality and safety was a recent review of patient medical records by the Centers for Medicare and Medicaid Services (CMS). The review evaluated records of 780 Medicare beneficiaries recently discharged from a hospital and found that 13.5% experienced an adverse event during their hospital stay.4 Further, an expert panel review determined that 44% of these adverse events were clearly or likely preventable.4 Taken together, the NHQR reports and the CMS chart reviews suggest a disconnect between the success that quality improvement (QI) efforts report in the literature and their actual impact. The broad driving force behind the research reported in this thesis is to understand potential causes for this disconnect and to explore possible modifications to the healthcare environment that will support and increase the probability of successful QI in the future.
Two theories have been particularly instructive in approaching and understanding why individual reports of successful QI projects may not translate into widespread improvements in quality. First, human factors theory advocates that, when designing a device or a process, careful attention must be paid to how the innate limitations of human physical and mental capabilities will affect how people interact with that device or process.5 This concept means that even the greatest of technological solutions can be unsuccessful if people cannot successfully interact with the system. Building from this idea, a potential hypothesis for why there is little overall improvement in quality is that many QI projects propose and implement solutions that impose too much additional cognitive burden on those tasked with providing high-quality care. In this situation, there may be initial success while the excitement and energy surrounding the project are sufficient to overcome the additional cognitive burden. However, as time passes and the improvements become less of a focus, task-specific energy declines. Eventually the additional cognitive burden becomes overwhelming and performance begins to fall. Such a process would produce a QI project that initially appears successful but cannot sustain performance over time, resulting in a slow decline in quality, likely back to baseline, as providers abandon the new solution for their original process.

The other instructive theory for understanding the quality disconnect is change management theory. This theory acknowledges that going through and accepting change is a difficult and emotional process that people often resist.6 It suggests that even if a QI effort is technically correct from a human factors perspective, resistance to change from healthcare providers could still result in an unsuccessful QI project.
This sort of change resistance could help explain why a QI project that succeeds at one hospital is not successful when translated to other settings. Without the correct institutional QI culture or change management process, QI projects will not sustain their improvements and will likely have difficulty achieving even initial gains.

As a concrete example of how QI solutions may fail to consider the cognitive or emotional hurdles involved in improving and sustaining quality, consider that many QI projects rely predominantly on provider education as the main component of the solution. In the standard approach, providers are gathered in a meeting room or lecture hall where someone presents them with a problem, for example a growing backlog of patients waiting to be admitted from the emergency department each afternoon. Having established the problem, the speaker asks the group to improve quality by increasing the number of inpatient discharges that occur before noon. After discussion and pushback from the audience, the presenter wraps up hoping the group is energized and ready to go fix the problem.

This approach has a number of short- and long-term problems that reduce the likelihood of lasting improvements in quality. Perhaps the biggest barrier to success in this situation is the feasibility of achieving what the speaker proposes. Mornings are generally a busy time for physicians and nurses as they go through their rounds, provide care, and make plans for the rest of the day. This period may already be so busy that adding the extra cognitive task of planning and carrying out a patient discharge is simply not feasible. Combine this cognitive difficulty with various emotional reactions, such as denial of the problem, blaming others, or simple avoidance, and the intervention would be lucky to produce an initial improvement; it certainly will not lead to sustained improvements.
While this is an interesting theoretical example, the real question is whether actual QI projects generate and sustain improvements in quality across multiple healthcare settings. Unfortunately, the current QI literature predominantly consists of case reports that describe projects in a single setting and do not provide the in-depth evaluation necessary to fully understand QI in healthcare. Even systematic reviews of QI have a hard time reaching definitive conclusions: they generally find that project evaluations are not methodologically sound enough to establish whether improvements in quality occurred and, if improvements were present, whether they were causally related to the QI effort.7-9 With so little focus on establishing whether interventions create initial results, it is no surprise that few reports broach the subject of sustained quality or present any data covering the period after project completion.

Since those initial reviews, two approaches to quality improvement, Lean and Six Sigma, have become increasingly popular in healthcare. These two approaches are important because both include a specific focus on trying to sustain improvements after initial project completion. However, a recent systematic review of these two approaches found that few articles discussed whether project interventions led to sustained improvements.10 Of the few cases that did, two were particularly informative about the challenges healthcare faces as it works toward sustaining QI. The first case involved an intervention to reduce nosocomial urinary tract infections (UTI) using the general approach of nursing staff education and training.11 The initial effort resulted in a steady decrease in the number of UTIs recorded, which lasted for about a year after the intervention.
After that year, however, the rates slowly began to rise, erasing the initial improvements and eventually producing the highest quarterly UTI rate observed in a four-year period. Because the unit was monitoring its UTI rates, it responded to the increase with another round of staff education, which led to at least a temporary reduction in rates. This QI initiative mimics the earlier theoretical example and highlights that relying solely on provider education is unlikely to produce sustained improvements in quality. While the root cause behind the loss of quality was not discussed in the article, emotional or cognitive challenges could certainly have contributed to the nurses' inability to maintain low UTI rates.

In contrast, the second case focused on reducing catheter-related bloodstream infections (CRBSI) by identifying process changes that would not only improve quality but also reduce provider cognitive burden. The solutions in this project involved developing a system to monitor catheter dwell time, as well as creating a catheter insertion kit that ensured all materials were immediately available in one place.12 These changes reduced provider burden in two ways. First, by creating a method for monitoring and alerting providers about catheter dwell time, providers did not have to remember when a catheter was inserted or whether it was time for it to be changed or removed; instead, they received a reminder when action was appropriate. Second, with a procedure kit there was no longer the burden of searching for necessary components in a time-pressed environment. Any time a catheter needed to be placed, only one item, the kit, had to be located, and everything necessary for high-quality care was then available. Even though these were effective changes, long-term monitoring of CRBSI rates found a substantial spike the first winter after implementation.
Review of that increase led to the identification of a specific subset of patients with characteristics different from those evaluated in the original project. This led to an additional change to the process mandating the use of antibiotic-coated catheters for select subsets of patients. While this example paints a more promising picture about the future of quality in healthcare (i.e., that well-designed process improvements can improve quality), it also reveals that fixing quality problems may require more than a single intervention.

Study Overview

As established in the introduction, this study is driven by the apparent disconnect between reports of successful QI efforts and the lack of measured improvements in healthcare quality. There are likely many root causes of this disconnect, but this study focuses first on two potential causes. First, current evaluation approaches may overestimate how well hospitals perform on QI efforts, and stronger methodologies may reveal that fewer hospitals than expected successfully improve quality. Second, projects that do improve quality initially may not be able to sustain their results long term. To explore these two areas, the first objective of this study was to conduct an in-depth examination of whether a collection of Veterans Affairs (VA) hospitals were able to improve and sustain quality after participating in the same quality improvement collaborative, the Flow Improvement Inpatient Initiative (FIX). This analysis addresses the following two specific aims:

Aim 1: Determine the impact of the FIX collaborative upon quality and efficiency as measured by LOS, percent of patients discharged before noon, in-hospital and 30-day mortality rates, and 30-day readmission rates.

Hypothesis 1: The FIX collaborative will shorten patient LOS and increase the percentage of patients discharged before noon. There will be no changes in mortality or readmission rates attributable to FIX.
Aim 2: Determine whether improvements attributable to FIX are sustained post-implementation.

Hypothesis 2a: Improvements in the outcome measures will continue on a downward slope after completion of FIX.

Hypothesis 2b: The rate of further improvement in the outcome measures after completion of FIX will be at or below the rate of pre-FIX improvement.

With this initial description of how well hospitals improve and sustain quality after a QI effort, the next question becomes what can be done to increase the ability of QI to produce sustained improvements. The goal of this analysis is to understand whether there are structural issues that may be root-cause barriers to improvement. Therefore, the second half of this project focuses on understanding which organizational characteristics may be associated with successful and unsuccessful QI projects. This will be accomplished using data mining decision trees to determine which organizational characteristics, as reported in responses to the 2007 Survey of ICUs & Acute Inpatient Medical & Surgical care in VHA (HAIG)13 and the VA Clinical Practice Organizational Survey (CPOS),14 are associated with different performance classifications. This analysis addresses the third specific aim of the project:

Aim 3: Describe how selected organizational structures are associated with sustaining improvements.

Summary

The following chapters introduce the reader to relevant portions of the QI literature, cover the study methods, present the study results, and discuss what the findings mean for QI efforts in healthcare. Chapter 2 begins the task of addressing the first two specific aims by discussing the collaborative approach to QI, examining the current understanding of the approach in the literature, and exploring the specific collaborative that served as the case study for analysis.
Chapter 3 discusses the analytic methods, and the reasons for selecting them, for analyzing hospital performance during the QI collaborative. Chapter 4 concludes that analysis by presenting and discussing its results. The second half of the thesis then addresses the third specific aim. Chapter 5 begins by summarizing the current literature on how organizational characteristics relate to quality measures and QI efforts; the result of this discussion is a new analytic framework that guides the subsequent analysis. Chapter 6 reviews data from the two surveys that serve as measures of organizational characteristics and then discusses why data mining decision trees are well suited to modeling the relationship between organizational characteristics and hospital performance on QI. Chapter 7 presents and discusses the results of the data mining decision trees. Finally, Chapter 8 summarizes the findings of the thesis, reviews recommendations for hospitals to consider when trying to improve their success with QI, and concludes with a discussion of future studies that could build on this work and improve the overall understanding of how to successfully improve and sustain quality in healthcare.

CHAPTER 2 – QUALITY IMPROVEMENT COLLABORATIVES

The goal of this chapter is to introduce the collaborative approach to quality improvement (QI), discuss the current evaluation of the approach in the literature, and examine a specific QI collaborative. The initial introduction to collaborative QI considers its origins and development by the Institute for Healthcare Improvement (IHI). The IHI collaborative model prescribes a specific approach that has been employed to tackle a broad range of QI issues.
The review of the literature evaluates the success of these efforts, the current understanding of the strengths and weaknesses of the approach, and the strengths and weaknesses of the literature itself. The next section of the chapter examines the Flow Improvement Inpatient Initiative (FIX), a specific QI collaborative undertaken in the Veterans Affairs (VA) healthcare system. This QI collaborative serves as the case study for all the analyses reported in this study. The review of FIX considers how it fits the IHI collaborative model and its utility as a case study both for meeting the goals of this thesis and for contributing knowledge to the broader literature. Lastly, the chapter concludes with an overview of the first two specific aims of this project.

The Collaborative Approach to Quality

First conceived by Paul Batalden, MD, and refined by others at the IHI, the QI collaborative was viewed as an effective means of overcoming a key limiting factor in improving healthcare quality: diffusion of knowledge.15 Batalden and the IHI felt that for many topics there was good underlying science on what needed to happen to improve quality, but that hospitals could not implement that science in a meaningful way because they were unaware of it, unable to disseminate it among employees, or lacked the resources or experience necessary to make effective improvements. They envisioned the QI collaborative as a process that could overcome these barriers and lead to "breakthrough" improvements in healthcare quality, while also helping to reduce costs.15 This thinking led to the establishment of the IHI Breakthrough Series (BTS) collaborative, which has become the common framework for QI collaboratives in healthcare. The general concept is to have a group of hospitals interested in specific and similar quality goals work together to identify solutions.
A benefit of the collaborative format over traditional in-house QI efforts is that it allows hospitals to collectively invest in relevant subject matter experts, who participate by initially training and then guiding participants through the processes necessary to achieve change and improve quality. The collaborative also establishes a structure through which participants at different hospitals communicate regularly, allowing teams to serve as resources for one another so that everyone learns effective solutions for overcoming the inevitable obstacles that arise during a QI effort.

In the BTS model there are three learning sessions with alternating action periods (Figure 2-1), most frequently distributed over a year but ranging from 6 to 15 months.15 Each learning session is attended by at least three team members from each participating institution as well as the subject matter expert. The first learning session typically focuses on learning about the topic through relevant training, refining the team aim, and making plans for change. Common focuses include learning to use the Plan-Do-Study-Act (PDSA) change cycle, developing specific and measurable aims, and defining the ideal state of care. The second and third learning sessions bring the teams back together to report experiences, discuss challenges, learn from other teams, and work with QI experts to apply additional skills. There is often also a final concluding session where teams review their successes and discuss goals moving forward. The alternating action periods are times when the teams focus on implementing improvement projects at their facilities. During the action periods the participating hospitals interact with each other through conference calls, providing regular opportunities to brainstorm solutions to any new problems.
The literature reporting on collaborative QI projects suggests the approach can be successful in improving quality and disseminating QI across a variety of settings. Example collaboratives include efforts to improve care for patients with chronic heart failure (CHF),16 reduce door-to-balloon time for heart attack care,17 reduce fall-related injuries,18 and improve medication reconciliation.19 There are also reports showing collaboratives have worked in other healthcare systems, both in developed (Holland, Norway, and Australia)20-22 and developing countries.23 While these reports state that each collaborative was a success, it is important to note that there is variation in performance across hospitals within individual collaboratives.

There are also potential systematic barriers that may either prevent participation in a collaborative or greatly reduce a hospital's chance of success. As an example, consider the medication reconciliation effort that aimed to involve all hospitals in the state of Massachusetts. The collaborative was able to recruit 88% of hospitals in the state, but the non-participating facilities were clearly distinguished by their small size and often isolated locations.19 Among the participating hospitals, only 50% achieved at least partial implementation of the initiatives related to improving the medication reconciliation process. For the hospitals that did not, frequently cited barriers to success were an inability to get people to change the way they work, an inability to get clinician buy-in, and overall project complexity.19 These barriers, particularly the inability to get buy-in or to get people to change the way they work, are directly related to the change management and human factors issues discussed earlier as challenges for QI in healthcare.
Another critical consideration is that many articles on collaboratives, much like the broader QI literature, used methodologies with limited ability to establish the cause-effect relationships needed to prove the effectiveness of collaboratives. Reports often focused on a team's ability to implement planned changes, as in the Massachusetts article, but this does not speak to whether the implementation was effective or led to any improvement in quality. Another common assessment approach is to have the team self-report whether they felt their efforts led to improved quality. Although the collaborative format encourages rigorous data collection, publications rarely include data that would increase the reader's confidence that teams were truly successful. In short, the assessments of collaboratives make it difficult to quantify what measurable improvements in quality a collaborative achieved and, further, which actions are most directly associated with any improvements.

Establishing this causal association is particularly important in healthcare because collaboratives typically target highly publicized quality problems. Any observed improvement may therefore be attributable to outside events, such as continuing education sessions and conferences, which increase awareness of the topic and may produce small modifications in provider behavior. This particular problem was addressed in a study analyzing whether the CHF BTS collaborative led to improvements in care above and beyond what would have occurred naturally.16 The study design involved sampling 4 hospitals from the collaborative and then identifying 4 control hospitals that did not participate in the collaborative but had similar hospital structures, i.e., matched controls.
Using a panel of 21 common metrics for CHF care quality, the analysis identified that the collaborative sites exhibited greater improvements on 11 of them, with the strongest improvements associated with patient counseling and education metrics. For some of the metrics where there was no difference between participants and controls, there were still sizable improvements in performance. As an example, collaborative hospitals increased the percentage of patients who had their left ventricular ejection fraction (LVEF) measured by 16%, but the controls also increased LVEF testing by 13%, leading to a non-significant comparison (p = 0.49).16 This article highlights that observed improvements cannot always be directly attributed to the collaborative, and careful consideration should be taken in developing program evaluations that can best establish a causal relationship between measured improvements and collaborative efforts. VA was an early adopter of the BTS model and has used it to target adverse drug events, safety in high-risk areas, home-based primary care, fall risk, and many other patient safety areas.24-27 An example from primary care was the effort to improve, across a system of nearly 1,300 sites of care, the average number of days until the next available primary care appointment.24 Over a four-year period the Advanced Clinic Access collaborative was able to reduce the average days until the first available appointment from 42.9 to 15.7. On the inpatient side, a review of 134 QI teams participating in 5 different VA collaboratives found that between 51% and 68% of teams were successful with their efforts.25 Success in this case was defined as a self-reported reduction in at least one outcome of 20% from baseline, sustained at that level for 2 months before the end of the collaborative.
Some example outcomes for the collaboratives were to reduce adverse drug events, reduce infection rates, reduce caregiver stress for home-based dementia care, reduce delays in the compensation process, and reduce patient falls. A unique feature of this article is that it evaluated whether any organizational, systemic, and interpersonal characteristics of hospitals and teams were associated with performance in the collaborative. When comparing ratings at the end of the collaborative to those at the beginning, some key findings were that low-performing teams showed reductions in their ratings of resource availability, physician participation, and team leadership.25 In contrast, high-performing teams were more likely to report that they had worked as a team before, were part of their organization's strategic goals, and had stronger team leadership. A main takeaway from the analysis of collaboratives in VA, as well as from the study of the medication reconciliation collaborative in Massachusetts, is that there may be challenges faced by hospitals that are not directly addressed in the current QI collaborative structure. Two common barriers were a lack of resources and difficulty getting support and buy-in from physicians. One consideration with these barriers, particularly the availability of resources, is whether the presence of such a barrier could be identified prior to a collaborative and, if identified, whether those hospitals should participate in a collaborative at all. It may be that a hospital needs to develop a certain baseline of behaviors before success in a QI collaborative is likely, and if those behaviors are not present, that may be where the hospital needs to focus first. This question will be addressed as part of the third aim for this study; however, before it can be analyzed it is necessary to measure and understand which hospitals succeed in a QI collaborative.
In order to measure and understand which hospitals succeed, it is necessary to move past the current style of reporting, which relies too heavily on pre-post analyses (when quantitative data are reported at all) that cannot establish which measured improvements are due to collaborative participation. The next sections of this chapter provide an in-depth introduction to the QI collaborative studied throughout this research and an overview of the initial analyses undertaken to establish which hospitals improved and also sustained quality as part of their participation in the collaborative.

Flow Improvement Inpatient Initiative (FIX)

The collaborative of interest for this study was the Flow Improvement Inpatient Initiative (FIX). This was a system redesign initiative undertaken in VA during fiscal year 2007 (FY07) that closely followed the IHI BTS collaborative model. The aim of the collaborative was to improve and optimize inpatient hospital flow through the continuum of inpatient care.28, 29 The efforts focused on addressing potential barriers to smooth flow in the emergency department, in operating suites, and on the inpatient wards. The objective was simply to identify and eliminate bottlenecks, delays, waste, and errors that may hinder a patient's smooth progression through the hospital. Among the outcome measures associated with the collaborative were shorter hospital length of stay (LOS) and an increased percentage of patients discharged before noon.30 The goal of these outcome measures was to ensure that sufficient patient beds were available (particularly in the early afternoon) for patients needing to be admitted from the emergency department (ED) or after surgical procedures. By improving bed availability, not only are patient care and safety improved, but VA also hopes to reduce the need for fee-service care, where veterans are cared for at VA expense in private hospitals.
This collaborative followed the general BTS model with 3 learning sessions and then a final wrap-up session;31 the approximate timing of these events is outlined in Figure 2-1. In total, 130 VA hospitals participated, with approximately 500 participants attending at least one learning session.31 Given the need for active participation and interaction during learning sessions, the collaborative was split and implemented in five separate regions (Northeast, Southeast, Central, Midwest, and West). During the action periods, teams met at least weekly to work on their QI projects. Commonly reported projects focused on efforts to reduce LOS, reduce bed turnover time, increase the percentage of patients given a discharge appointment, increase the percentage of patients discharged before noon, decrease the time to admission from the ED, and decrease ED diversion time.30

Figure 2-1: Model of the IHI BTS Collaborative timeline for FY07 FIX

Despite VA's prior experience with collaboratives, only limited evaluation plans were established for FIX. Teams likely measured their performance as they worked to improve patient flow, but these data were never systematically collected. An external consulting group was tasked with evaluating success after the completion of FIX. This evaluation focused predominantly on determining whether participants were satisfied with the collaborative and felt that they gained knowledge or skills during the process.32 However, the evaluation also considered whether there was a positive business impact or return on investment based on changes in the Observed Minus Expected LOS (OMELOS) during the collaborative. This pre-post analysis compared the FY06 OMELOS with the FY08 OMELOS for patient time in an ICU or on a general acute care floor at 10 hospitals. The process also involved querying the FIX team leader at each of those hospitals to estimate what percentage of the improvement they would attribute to FIX.
The average of these values was then extrapolated to the entire VA population and used to determine an estimated cost savings. An overview of these results is presented in Table 2-1.32 After adjusting for the estimated benefits attributable to FIX, the final conclusion was that implementation of FIX saved $141 million. In order to determine a return on investment, the analysis considered the costs at the 10 facilities related to oversight, planning, implementation, and evaluation. The extrapolated costs came to $5.8 million for VA, equating to an overall return on investment of 2,327%.

Table 2-1: Reported calculation of cost savings from FIX
       FY08 – FY06 OMELOS   Cost/day    # of Annual Admissions   Amount Saved   % Attributed to FIX
Acute  0.51 days            $684.75     530,000                  $185 million   40.37%
ICU    0.31 days            $3,500.00   150,000                  $110 million   52.18%

While impressive, these results have their limitations and are insufficient for truly understanding the impact of FIX. One major concern is that the analysis uses a pre-post study design based on unspecified single time points, i.e., the analysis does not report how many days or patients are averaged together. For any number of reasons these single time points may not accurately reflect a hospital's performance as measured by OMELOS. Particularly noteworthy is that OMELOS fluctuates, sometimes significantly at smaller hospitals, and yet the report provided no indication of how much variation was associated with the measure. Further, LOS has a documented pre-existing temporal trend, which was not considered and could account for a considerable proportion of the observed improvements.33, 34 Although the analysis adjusted for self-perceived impact, that measure only considers whether the QI team felt they had targeted activities that would impact OMELOS, not whether they felt those activities were responsible for a specific reduction in OMELOS.
Beyond the potentially misleading conclusions about reductions in OMELOS, the cost savings calculations also have two important limitations. First, the calculations assume that inpatient costs are distributed evenly throughout the inpatient stay, which is unlikely to be true. Second, with much of the involved costs representing fixed expenses, a reduction in LOS only represents a savings to VA if it allows fee-based care to be avoided. Unfortunately, diversion rates and fee-based care costs are not systematically collected or available for analysis. One final consideration: although the final report did not appear until May 2010, there was no attempt to consider how well hospitals maintained improvements after the completion of FIX. The sustainability of interventions is a critical component of achieving high quality care, yet there is no assessment of it in any collaborative reports. The next section shows how, even retrospectively, it is possible to conduct an in-depth study of FIX that provides insight into whether hospitals were able to improve outcomes and then sustain quality after participating in FIX.

FIX Analysis Overview

There are a number of challenges in developing a study for analyzing FIX, yet the FIX collaborative has some important characteristics that make it an ideal collaborative to study. First, the goals of FIX make it amenable to a retrospective analysis that uses available administrative data sets. The two primary outcomes of FIX, LOS and discharge before noon, are easily ascertained in administrative records of patient stays. Second, since FIX occurred in FY07 there are now two years of data available to analyze whether initial improvements in outcomes were sustained after FIX. Third, FIX occurred at the same time as two major surveys that assessed organizational characteristics in VA hospitals.
These two surveys will play a major role in the second half of this study as it attempts to identify characteristics that distinguish sites on their ability to succeed during FIX. As a final strength, FIX was in effect 5 simultaneous collaboratives, providing a large sample (130 hospitals) and offering the possibility of some sub-group analyses. Given these strengths, FIX was selected to serve as a case study that could help identify whether a QI collaborative leads to quantifiable improvements in quality, whether hospitals sustain those improvements, whether there is significant variation in performance, and whether organizational characteristics might help explain success or failure in the collaborative. As discussed throughout the literature reviews, the ideal study would involve an analysis that would either establish or provide strong arguments for a cause-effect relationship between specific improvements and changes in the outcomes. Unfortunately, there were no data defining the specific improvements implemented by teams. Without this, or other qualitative assessments from the teams, it was impossible to suggest a causal relationship between FIX and the observations of this study. Instead, the study strives to use a methodologically strong quasi-experimental approach that provides some support for suggesting that any identified improvements were attributable to FIX. One such approach could be a case-control study such as that done to analyze improvements in the CHF collaborative. However, this is not a possibility, since FIX involved all VA acute care hospitals, eliminating any natural controls. Additionally, selecting private sector hospitals as controls would be unrealistic, as the unique structural characteristics (i.e., federal funding, a comprehensive electronic medical record, extensive catchment areas) of VA hospitals make direct comparisons difficult. Instead, this study employed an interrupted time-series analysis.
The exclusion of a case-control study, combined with the use of administrative data, leaves the options for analyzing FIX as structural equation modeling, latent growth curve modeling, hierarchical linear modeling, and time-series analysis.35 Of these four choices, hierarchical linear models and time-series analysis are best suited for analyzing and understanding the changes over time in outcomes such as LOS and discharges before noon. Since a separate outcome model is planned for each facility, all measures are at the individual level and the utility of hierarchical linear models would be for analyzing the data as a repeated measures model. In comparing a repeated measures approach with a time-series model, the trade-off is between a greater ability to model the correlation between individuals (hierarchical model) or the correlation between events over time (time-series). This analysis focuses on the correlation between events over time (i.e., uses a time-series analysis) for three reasons. First, the ability to risk-adjust for different patient characteristics provides some protection against correlation between individuals at a facility that may impact their outcomes. Furthermore, since most admissions represent a unique case (rather than a related readmission), risk adjustment better accounts for correlation between individuals than a repeated measures hierarchical model would. Second, the use of time-series models allows more flexibility for evaluating and adjusting for auto-regressive relationships in the data. There is a notable relationship between outcomes on separate days, the strength of which dissipates over time. Further, there is the potential for periodicity effects (e.g., weekly, seasonal). While these are not commonly found in healthcare outcomes, an analysis of this type should evaluate for their presence.
Third, time-series models are considered most appropriate when the question of interest focuses on the impact of an intervention at a system level rather than at an individual level.36 While some individuals may have benefited more from the FIX initiative, the general hypothesis was that FIX resulted in systemic changes and that benefits were essentially uniform across individuals. A risk-adjusted time-series model provides the best balance of adjustment for individual characteristics and correlation between data points over time while focusing on the key underlying question of what impact FIX had on the ability of each facility to provide high quality care. Based on these considerations, it was determined that an interrupted time-series evaluation was the strongest study design for taking into account the pre-existing temporal trends in the data that might help explain observed improvements, as well as for indicating whether facilities were able to sustain improvements after FIX. The primary outcomes of the analysis will be LOS and percent of patients discharged before noon, in order to directly reflect the goals of FIX. Additionally, three secondary outcomes (in-hospital mortality, 30-day mortality, and 30-day all-cause readmission) will be evaluated. The purpose of these secondary outcomes was to ensure that improvements in the primary outcomes were not associated with reductions in quality on other quality measures. The analyses of FIX address the following two specific aims:

Aim 1: Determine the impact of the FIX collaborative upon quality and efficiency as measured by LOS, percent of patients discharged before noon, in-hospital and 30-day mortality rates, and 30-day readmission rates.

Hypothesis 1: The FIX collaborative will shorten patient LOS and increase the percentage of patients discharged before noon. There will be no changes in mortality or readmission rates attributable to FIX.
Aim 2: Determine whether improvements attributable to FIX are sustained post-implementation.

Hypothesis 2a: Improvements in the outcome measures will continue on a downward slope after completion of FIX.

Hypothesis 2b: The rate of further improvement in the outcome measures after completion of FIX will be at or below the rate of pre-FIX improvement.

Conclusions

This chapter established the background for this analysis of FIX as a case study representing quality improvement in healthcare. The first half of the chapter discussed the IHI's development of the collaborative model and its utility for supporting broad improvements in healthcare quality. This introduction was followed by a review of the collaborative literature, which suggested that while collaboratives do generate improvements, individual hospitals vary in their success. Additionally, the findings were weakened because they frequently relied on team self-report of success in implementing project components or in improving outcomes. The second half of the chapter moved from the broad literature to discuss the FIX collaborative and how an analysis of that collaborative could improve the understanding of collaboratives generally as well as begin to address the questions of this thesis. Lastly, the chapter reviewed several potential analytic approaches and identified the reasons for selecting an interrupted time-series model for analyzing FIX. The upcoming chapter provides further detail on the methods used to risk-adjust the five outcomes of interest and then develop the final time-series models for evaluating FIX.

CHAPTER 3 – TIME-SERIES METHODS

This chapter presents the methods used to address the first two specific aims of this research, which focus on understanding the impact of the Flow Improvement Inpatient Initiative (FIX) on five outcome measures. The initial sections of the chapter describe the data sources used in this analysis and define the patient cohort.
Subsequently, there is a discussion of the process used to develop the risk-adjustment models for each outcome. The risk-adjusted patient values are then input into a time-series model, with the final parameters calculated in this model serving to determine hospital performance on each of the outcomes. Finally, the chapter discusses a classification scheme, developed from the potential outcomes of the time-series model, that was used to group hospitals into performance categories to facilitate future analyses.

Data Sources

Data for this study came from VA administrative discharge records. While administrative databases were not originally intended for research, they have played a valuable role in health services research in the Veterans Affairs (VA) healthcare system.37, 38 Based on the 1972 Uniform Hospital Discharge Data Set (UHDDS),39 healthcare administrative databases have a standard form which includes patient demographics as well as the International Classification of Diseases, 9th revision, Clinical Modification (ICD-9-CM) codes that serve as a proxy for clinical status. The accuracy of some ICD-9-CM codes has been challenged, but a VA study on the level of agreement between administrative and medical records data reported kappa statistics of 0.92 for demographics, 0.75 for principal diagnosis, and 0.53 for bed section.40 Variables to determine patient outcomes and adjust for severity at admission come from several existing administrative databases compiled at the Austin Automation Center for all VA hospitals. These files include: 1) the Patient Treatment File (PTF); 2) the Enrollment File; and 3) the Vital Status File. All files were linked using unique patient identifiers, which also allow for monitoring a patient over time to detect a sequence of hospital visits. PTF data are updated on a quarterly basis as SAS datasets and provided the majority of descriptive variables for patient outcomes and risk adjustment models.
Available data fields were derived from the 45,000 data fields contained within the Veterans Health Information Systems and Technology Architecture (VISTA). Quality control protocols ensure data fields contain appropriate numbers and types of characters. VISTA modules cover a variety of important hospital services and functions including admission, discharge, transfer, scheduling, pharmacy, laboratory, and radiology. The Enrollment File contains details on basic demographic variables as well as VA-specific measures such as a listing of medical conditions that are considered directly connected to military service. The Vital Status File combines data from four sources: the VA Beneficiary Identification and Record Locator System (BIRLS), the VA Patient Treatment File (PTF), the Social Security Administration (SSA) death master file, and the Medicare vital status file. It provides date of death for VA users with a sensitivity of 98.3% and specificity of 99.8% compared to the National Death Index.41

Data Elements

This study analyzes five outcomes (Table 3-1): two primary outcomes and three secondary outcomes. The primary outcomes, length of stay (LOS) and percent of discharges before noon, were chosen to reflect the stated goals of the FIX collaborative. As stated in Hypothesis 1, FIX is expected to result in improved performance on these outcomes. The secondary outcomes, 30-day all-cause readmission, 30-day mortality, and in-hospital mortality, serve as quality checks focused on identifying whether the efforts to improve patient flow led to any unintended consequences. The hypothesis was that there would be no changes attributable to FIX in any of the secondary outcomes. For the purpose of defining readmissions, an index admission was any new admission not occurring within 30 days of a prior admission, with any subsequent admission within 30 days classified as a readmission.
A readmission cannot itself count as an index admission for a later admission, although the initial index admission could potentially have multiple associated readmissions. Visits to an emergency department or admissions to a non-VA hospital are not captured in these data.

Patient Cohort

The study population was all patients admitted to acute medical care in each of 130 VA hospitals between FY05 and FY09. This includes patients directly admitted as medical patients (as opposed to surgical patients) to an ICU as well as those admitted and discharged under observation status. While observation patients are billed as outpatients, they are important to include in this analysis for several reasons. First, an ability to discharge a patient within 24 hours (the standard set in VA to maintain observation status) may be a sign of good patient flow, so removing these patients from the analyses could inadvertently penalize facilities for some of their improvements. Second, there is inconsistent use of observation status (reflecting policy issues as well as patient flow) across VA. A quick analysis identified 9 facilities that had never used observation status and one facility that classified 50% of admissions as observation patients. With no direct understanding of how high or low use of observation status impacts patient outcomes, exclusion of observation patients could have severe unknown consequences for the evaluation. Lastly, observation patients are treated on the same wards as traditional acute admissions, meaning their presence impacts the overall flow and provider workload on medical wards, making it inappropriate to exclude them from these analyses.
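As an illustration of how the primary outcomes and the index/readmission rule could be derived from discharge records, the following sketch uses hypothetical timestamps (the actual analysis was done in SAS against the PTF). Note one labeled assumption: the 30-day window here is anchored to the index admission date, which is one reading of the definition above.

```python
# Illustrative sketch, not the thesis code: derive LOS, a before-noon
# discharge flag, and index/readmission labels for one patient's stays.
from datetime import datetime, timedelta

def classify_stays(stays):
    """stays: list of (admit, discharge) datetimes, sorted by admit time.
    Returns one dict per stay with LOS in days, a before-noon flag,
    and an 'index' or 'readmission' label."""
    results = []
    last_index_admit = None
    for admit, discharge in stays:
        los_days = (discharge - admit).total_seconds() / 86400.0
        before_noon = discharge.hour < 12
        # Assumption: the 30-day window is measured from the index admission.
        # A readmission cannot itself serve as the index for a later stay,
        # so one index stay may accumulate multiple readmissions.
        if last_index_admit is not None and admit - last_index_admit <= timedelta(days=30):
            label = "readmission"
        else:
            label = "index"
            last_index_admit = admit
        results.append({"los": los_days, "before_noon": before_noon, "label": label})
    return results

stays = [
    (datetime(2007, 1, 1, 14), datetime(2007, 1, 5, 11)),   # discharged before noon
    (datetime(2007, 1, 20, 9), datetime(2007, 1, 23, 15)),  # within 30 days of index
    (datetime(2007, 3, 10, 8), datetime(2007, 3, 12, 10)),  # new index admission
]
for r in classify_stays(stays):
    print(r)
```

In this example the second stay falls 19 days after the index admission and is labeled a readmission, while the third stay falls well outside the window and starts a new index admission.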
Table 3-1: List of Outcome Measures
Variable               Type        Description
Length of Stay         Continuous  Calculated: Time of Discharge – Time of Admission
Noon Discharge         Rate        Percentage of patients discharged before noon
30-Day Readmission     Rate        Any readmission to any VA hospital
30-Day Mortality       Rate        Death recorded during the hospital stay or within 30 days of discharge
In-Hospital Mortality  Rate        Death recorded during the hospital stay

Risk Adjustment

Separate risk-adjustment models were developed for each of the outcome measures before modeling outcomes in the time-series equations. Risk adjustment evaluation was done in a cohort of patients discharged in FY07. Following standard VA procedure, a cohort was identified that represented a stratified sample of 10 VA hospitals covering each of the five geographic regions (Northeast, Southeast, South, Midwest, West).42 One large (>200 medical/surgical beds) and one medium (100–199 medical/surgical beds) VA hospital were randomly sampled to represent each region. Small facilities were not included, as their small volumes can lead to dramatic variation which can have adverse effects on the final risk adjustment coefficients. The final risk adjustment cohort represented 42,725 discharges in FY07. Table 3-2 provides a comparison of some basic descriptive statistics between the risk adjustment cohort and all other FY07 discharges. While the vast majority of these comparisons were statistically different, the differences were attributable to the large sample sizes and do not represent meaningful clinical differences. The only concerning difference in the table is the difference between the two groups in the proportion of missing race information. This example shows why data from small facilities can be problematic and why they are not included in risk adjustment model evaluation for VA data.
A broad collection of variables, listed in Table 3-3, measuring patient socio-demographics, primary diagnosis, diagnosed comorbidities, and admission and discharge characteristics, was evaluated to determine each variable's impact on each outcome measure. Modeling for LOS was done on the log scale due to the skewed nature of LOS data.43 All other outcomes were treated as rates and modeled with binomial distributions.

Table 3-2: Comparison of risk adjustment cohort to all other FY07 discharges
                                Risk Adjustment   All Other FY07    p-value
                                (N=42,725)        (N=291,484)
Age (SD)                        65.94 (12.85)     65.49 (13.11)     <0.001
Male (%)                        41,032 (96.0%)    279,735 (96.0%)   0.50
Income (SD)                     23,275 (47,775)   22,162 (42,390)   <0.001
Race
  White (%)                     26,511 (62.1%)    146,748 (50.4%)   <0.001
  Black (%)                     7,614 (17.8%)     43,571 (15.0%)    <0.001
  Hispanic (%)                  668 (1.6%)        3,171 (1.1%)      <0.001
  Asian / Pacific Islander (%)  407 (1.0%)        2,111 (0.7%)      <0.001
  Native American (%)           161 (0.4%)        1,375 (0.5%)      0.006
  Missing (%)                   7,964 (18.6%)     97,069 (33.3%)    <0.001
ICU Direct Admit (%)            7,471 (17.5%)     53,685 (18.4%)    <0.001
Un-adjusted LOS (SD)            5.43 (8.90)       5.22 (8.08)       <0.001
Died In-hospital (%)            1,104 (2.6%)      8,311 (2.85%)     0.002
Discharge Before Noon (%)       7,082 (16.6%)     54,075 (18.6%)    <0.001
All Cause Readmit (%)           6,332 (15.3%)     42,995 (15.3%)    0.85

Table 3-3: List of potential risk adjustment variables, the number of discrete categories, and a description of how categories were defined

Socio-demographics
  Age (10 categories): everyone under 45*; 5-year increments from 45–84; everyone 85 and older
  Sex (2): Male*, Female
  Marital Status (6): Married*, Divorced, Never Married, Separated, Unknown, Widowed
  Income: continuous variable
  Race (4): White*, Asian / Pacific Islander, Missing, Other (includes Black, Hispanic, Native American)
  Service Connected (3): percentage that admission condition is connected to military service: 0%*, 10–90%, 100%
Admission
  Primary Diagnosis (25): Major Diagnostic Code categories; Circulatory System*
  Comorbidities (41): Quan adjustment to Elixhauser algorithm44, 45
  Source (9): Direct*, VA Nursing Home, Community Nursing Home, Outpatient, Observation, Community Hospital, VA Hospital, Federal Hospital
  Direct to ICU: Yes / No
Discharge
  Place of Discharge (13): Community*, Irregular, Death, VA hospital, Federal hospital, Community hospital, VA nursing home, Community nursing home, State home nursing, Boarding house, Paid home care, Home-based primary care, Hospice
  Type of Discharge (5): Regular*, Discharge of a committed patient for a 30-day trial, Discharge of a nursing home patient due to 6-month limitation, Irregular, Transfer, Death with autopsy, Death without autopsy
  Died In-Hospital: Yes / No
  Transferred out of Hospital: Yes / No
* Reference category

The first decision in the risk adjustment process was to identify the appropriate number of categories for some of the variables. This was done by running univariate categorical models to determine the predictive association between each category and LOS. The goal in this process was to maximize model fit (as measured by the Akaike information criterion (AIC)) while working toward a parsimonious list of categories; that is, to identify a collection of categories for each variable in which the individual point estimates for each category were statistically significant. As an example, the field for place of discharge took on 26 different values in the administrative files, with 14 of these having non-significant point estimates in the initial full model. While a number of these categories are meaningful for administrative purposes, they have no clinical significance. Therefore, categories such as military hospital, other federal hospital, and other government hospital were grouped together and the models re-evaluated.
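The AIC-guided comparison of candidate category groupings can be sketched as follows. This is an illustrative reimplementation, not the thesis's SAS code: it fits univariate models of simulated log-LOS under two hypothetical age groupings and compares their Gaussian AICs, where a lower AIC indicates the better trade-off between fit and parsimony.

```python
# Illustrative sketch of AIC-based category selection (data and groupings
# are simulated; the thesis's actual models and AIC values differ).
import numpy as np

def aic_for_grouping(values, log_los, grouping):
    """Fit log_los ~ dummy-coded grouped categories by least squares and
    return Gaussian AIC (up to an additive constant): n*ln(RSS/n) + 2k."""
    groups = sorted(set(grouping.values()))
    # Intercept plus dummies for all but the first (reference) group.
    X = np.column_stack(
        [np.ones(len(values))]
        + [[1.0 if grouping[v] == g else 0.0 for v in values] for g in groups[1:]]
    )
    beta, *_ = np.linalg.lstsq(X, log_los, rcond=None)
    rss = float(np.sum((log_los - X @ beta) ** 2))
    n, k = len(values), X.shape[1]
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(1)
ages = rng.integers(20, 95, size=5000)
# Simulated truth: log-LOS rises with age only after 65.
log_los = 1.0 + 0.02 * np.clip(ages - 65, 0, None) + rng.normal(0, 0.5, size=5000)

fine = {a: a // 5 for a in range(20, 95)}                  # 5-year bins
coarse = {a: (0 if a < 65 else 1) for a in range(20, 95)}  # two bins at 65
for name, g in [("5-year bins", fine), ("<65 vs >=65", coarse)]:
    print(name, round(aic_for_grouping(ages, log_los, g), 1))
```

Under this simulated truth the 5-year bins capture the post-65 gradient that the two-bin grouping flattens out, so the finer grouping attains the lower AIC despite its larger parameter penalty; with a weaker age effect the comparison could reverse, which is exactly the trade-off the iterative process in the text navigates.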
For some small groups there were no ideal clinical comparisons, in which case categories were grouped by the similarity of their initial point estimates. This process was iterated, trialing different groupings as necessary, until the best model (lowest AIC, all categories significant) was identified. Full details on the modeling process for Age (Table 3-4), Race (Table 3-5), Percent Service Connected (Table 3-6), Admission Source (Table 3-7), and Place of Discharge (Table 3-8) are available in their respective tables. No changes were necessary for Marital Status or Type of Discharge. A full description of the individual categories has been published elsewhere.46

Table 3-4: Modeling of age risk adjustment categories
Model  # of Categories  AIC     Description
1      Continuous       114938
2      2                115130  <60, ≥60
3      4                115010  <40, [40,60), [60,80), ≥80
4      15               114932  <20, 5-year increments, ≥90
5      8                114953  <20, 10-year increments, ≥90
6      12               114932  <25, 5-year increments, ≥80
7      10               114926  <45, 5-year increments, ≥85

Table 3-5: Modeling of race risk adjustment categories
Model  # of Categories  AIC     Description
1      6*               115308  Native American, Hispanic were non-significant
2      4                115305  White, Asian/Pacific Islander, Missing, All others
3      4                115307  Black, Asian/Pacific Islander, Missing, All others
* Coded categories are: White, Black, Hispanic, Asian/Pacific Islander, Native American, Missing

Table 3-6: Modeling of service connected risk adjustment categories
Model  # of Categories  AIC     Description
1      11*              115318  30, 40, 50, 70, 90 all non-significant
2      3                115308  0, [10,90], 100
3      6                115314  Grouped in increments of 20
4      4                115310  0, [10,50], [60,90], 100
* Service connected is recorded in increments of 10 from 0 to 100

Table 3-7: Modeling of admission source risk adjustment categories
Model  # of Categories  AIC     Description
1      19*              114579  1E, 1H, 1J, 1L, 1R, 1S, 2A, 2B, 2C, 3B, 3E were non-significant
2      9                114574  1E, 1J, 1L, 1R, 1S all paired with 1P; 1G with 1H; 2A, 2B, 2C grouped as 2A; 3B, 3E all paired with 3C; 2A (p=.0573)
3      8                114576  2A paired with 1M
4      8                114573  2A paired with 1P
* See VA data documentation for complete listing of fields46

Table 3-8: Modeling of place of discharge risk adjustment categories
Model  # of Categories  AIC     Description
1      26*              113126  Categories 1, 2, 3, 12, 13, 15, 16, 19, 20, 21, 27, 29, 34, 35 were non-significant
2      14               113121  1, 2, 3 paired as 3; 12, 13, 15, 20 paired with 11; 16, 19 paired with 17; 27 paired with 5; 21, 29, 35 paired as 21; 34 paired with 22; 21 is non-significant
3      13               113119  21 (including 29 & 35) paired with -3
* See VA data documentation for complete listing of fields46; categories 9, 10, & 14 were not recorded for any discharges in the study

Once the final categorizations were set, the next step in the risk adjustment model development was to evaluate the univariate relationship between each outcome and the potential risk adjustment variables. All variables having a p<0.1 association in univariate analyses were included in the initial full model for that outcome. Reduced models were then generated by removing variables that did not meet a determined threshold; the full sequence of steps taken to develop each model is detailed in the SAS code available in Appendix A. The goal of model selection was to identify the simplest model with the best AIC. In instances where the AICs were too similar (within 2 points), the model with the greater number of variables, even if some were only marginally significant, was selected. The model evaluation process also evaluated potential correlations between variables. Correlation was tested between single-level variables (e.g., comorbidities, direct admission to ICU). Correlation between multi-level variables was not compared, but potential correlations such as place of discharge and type of discharge were never relevant in identifying the best model. The key correlations that were identified, and evaluated if necessary during model development, are listed in Table 3-9.
Table 3-9: Highly correlated risk adjustment variables
Variable 1            Variable 2                Correlation (ρ)
Rheumatic Arthritis   Arthritis                 0.88
Paralysis             Hemiparesis               0.82
Renal Disease         Complicated Hypertension  0.81
Renal Failure         Complicated Hypertension  0.81
Mild Liver            Liver                     0.98
Nonmetastatic Cancer  Malignancy                0.92
Ulcer No Bleed        Peptic Ulcer              0.82
Renal Disease         Renal Failure             1.00

Once the final risk adjustment models were developed, a second cohort of 60,000 patients was randomly sampled from all FY07 discharges; this random sample included patients from small facilities and discharges from the original risk adjustment cohort. The final models were run in this cohort to verify model performance and to generate point estimates for use in risk adjustment. A listing of these final point estimates is available in Appendix B. These risk-adjustment point estimates were used to calculate the expected outcome for each patient, which in turn was used to determine the indirectly adjusted outcome.

Time-Series Model

There are several issues, many discussed in Chapter 2, to consider in determining how best to model and evaluate the impact of FIX. This study employed an interrupted time-series model given the design's ability to account for pre-existing temporal trends, to allow for evaluation of the outcomes after the intervention, and to protect against some threats to internal validity relative to other quasi-experimental designs.21, 22 All outcomes were individually modeled using a time-series analysis covering 5 years, from the start of FY05 (October 1, 2004) through the end of FY09 (September 30, 2009). This provided two years of data prior to FIX establishing baseline performance, a year of data identifying whether hospitals made improvements during FIX, and two years of data identifying whether those hospitals that improved were able to sustain those improvements. After determining the risk-adjusted outcomes, the next step was determining the best level of outcome aggregation.
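The indirect adjustment step just described follows the standard observed-to-expected (O/E) construction: each patient's expected outcome comes from the fitted risk model, and a hospital's indirectly adjusted rate is its O/E ratio scaled by the reference rate. The sketch below is a Python illustration with made-up numbers (the study's actual computation was done in SAS), not the dissertation's code:

```python
# Minimal sketch of indirect standardization, with hypothetical values.
import numpy as np

# Per-patient predicted probabilities of the outcome (e.g., 30-day
# readmission) from the fitted risk-adjustment model -- made-up numbers:
expected_p = np.array([0.10, 0.22, 0.08, 0.15, 0.30])
observed = np.array([0, 1, 0, 0, 1])   # observed outcomes at one hospital

reference_rate = 0.14                  # hypothetical system-wide (VA) rate
oe_ratio = observed.sum() / expected_p.sum()   # observed / expected events
adjusted_rate = oe_ratio * reference_rate      # indirectly adjusted rate
# expected events = 0.85, observed = 2, O/E ~= 2.35, adjusted ~= 0.329
```

Here a hospital that observes more events than its case mix predicts (O/E > 1) receives an adjusted rate above the reference rate, and vice versa.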
At the individual patient level, LOS and the rates of the other outcomes were highly variable, so modeling at that granular a level would make it difficult to detect meaningful changes in any outcome due to excessive variability, or noise, in the signal. Conversely, modeling at a highly aggregated level, such as a 6-month mean, would potentially ignore key fluctuations in the outcome measures. This study settled on having each data point represent a 14-day average, which results in 26 data points per year, or 130 data points over the 5 study years. This level of outcome aggregation was based on power calculations determining the appropriate tradeoff between variability and the overall number of time points. Assuming moderate autocorrelation (φ = 0.3), these models have a power of 0.88 to detect a change in the outcome in response to the intervention equivalent to one standard deviation (power = 0.87 for detecting sustainability).47, 48 While a simple 14-day average works well for the LOS and discharges-before-noon models, it presents a challenge for the other models, most notably in smaller VA hospitals, where it is reasonable to expect 14-day periods without any observed outcomes, particularly for in-hospital mortality. To avoid the unnecessary variance introduced by this possibility, the outcome models for readmission and mortality rates were still plotted every 14 days, but each point represents a moving average of the previous 70 days (5 data points). This did shorten these time series by 4 data points at the beginning. The final concern in developing this model, which supports the selection of a time-series approach, is that these data are unlikely to meet the assumptions of standard linear regression. Most importantly, while each discharge was essentially an independent event, it was not appropriate to assume independent error terms. Therefore, all models were evaluated and adjusted for correlation between error terms.
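The aggregation scheme just described (14-day averages, with a 5-point, 70-day moving average for the rarer outcomes) can be sketched as follows; this is a Python illustration on synthetic data, not the study's SAS code:

```python
# Sketch of the outcome aggregation: daily values averaged into 14-day
# points (26 per year, 130 over 5 years); rarer outcomes additionally
# smoothed with a 5-point (70-day) moving average, which drops the
# first 4 points of the series.
import numpy as np

rng = np.random.default_rng(1)
daily_los = rng.normal(3.3, 1.0, 130 * 14)   # synthetic daily LOS values

# One data point per two-week block: mean of each run of 14 days
points_14d = daily_los.reshape(130, 14).mean(axis=1)

# 70-day moving average: mean of the current and 4 prior 14-day points
moving_70d = np.convolve(points_14d, np.ones(5) / 5, mode="valid")
# points_14d has 130 points; moving_70d has 126 (4 fewer at the start)
```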
The potential for autocorrelation was evaluated up to 26 time points back, allowing capture of seasonal correlation up to a year. The second concern was that the measures may not have homoscedastic variance. There were two potential sources of heteroscedasticity in this analysis. First, a different number of discharges may be averaged in each 14-day measure. Second, as the outcomes improve they may approach a floor at which no further improvements are possible, and the variance around that point is likely to tighten. All models were evaluated for autocorrelation and heteroscedasticity and, where identified, corrected.49 With the above considerations, the final form of the basic outcome model is:

y_t = β0 + β1·t05 + β2·t06 + β3·t07 + β4·t08 + β5·t09 + β6·t² + v_t

In the above model, β1 – β5 represent the slope associated with the modeled outcome during FY05 – FY09, respectively. The time component is parameterized to create a continuous linear regression, so t05 counts from 0 – 129, while t06 is 0 for the first 27 time points and then begins counting. This parameterization continues, with each subsequent year beginning 26 points later; thus t07 = 1 at t = 53, t08 at 79, and t09 at 105. The β6 term represents a quadratic component to the overall trend; this parameter was only included in models where it was significant (p < 0.05). The final component of this model, v_t, represents the autocorrelated error term:

v_t = φ1·v_(t−1) + φ2·v_(t−2) + … + φ26·v_(t−26) + e_t

In this equation, φ_i represents the degree of correlation between the error term at the current time point and that at a prior point. For these models, only those correlations that were statistically significant (p < 0.05) were included in the final model of v_t. The final component of the model is the remaining error term, e_t:

e_t ~ N(0, σ²)

That is, e_t carries the typical linear regression assumption that error terms are normally distributed with mean 0 and variance σ².
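The yearly-slope parameterization described above can be sketched as a design matrix. This is a Python illustration (the study's models were fit in SAS); the variable names are assumed, and the offsets follow the text:

```python
# Build the piecewise time terms t05..t09: t05 runs over all 130 points,
# and each subsequent year's term starts counting 26 points later, so
# each fiscal year contributes an incremental slope.
import numpy as np

t = np.arange(130)                 # 26 two-week points x 5 years
t05 = t.astype(float)              # counts 0..129
t06 = np.maximum(0, t - 26)        # 0 for the first 27 points
t07 = np.maximum(0, t - 52)        # equals 1 at t = 53
t08 = np.maximum(0, t - 78)        # equals 1 at t = 79
t09 = np.maximum(0, t - 104)       # equals 1 at t = 105
X = np.column_stack([t05, t06, t07, t08, t09])   # 130 x 5 design matrix
```

With this design matrix, an ordinary regression of the 14-day outcome series on X (plus an intercept and, where significant, t²) yields the yearly slopes β1 – β5; the autocorrelated error structure would then be layered on top.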
However, as discussed, these data may not fit this assumption, so when heteroscedasticity is detected the conditional error variance h_t, calculated below, is used to estimate and correct for the changing error variance:

h_t = ω + Σ_(i=1..q) α_i·e²_(t−i) + Σ_(j=1..p) γ_j·h_(t−j)

Here ω, α_i, and γ_j are estimated variance parameters, with e_(t−i) the lagged model errors and h_(t−j) the lagged conditional variances.

Improvement and Sustainability

With the time-series equation developed, the final step was to develop a classification approach that would identify whether hospitals improved on any outcome measure and which hospitals then went on to sustain those improvements. The final classification system, listed in Table 3-10, defined 11 sub-categories that collapse into 4 major categories. This approach to classifying performance predominantly focuses on the results for parameters β3 – β5; β1 and β2 serve to establish a baseline of performance and control for improvements that would have been expected, based on historical trends, had FIX not occurred. The first major category is those hospitals classified as having No Change, meaning no statistically significant (p < 0.05) changes were observed for β1 – β4 (FY05 – FY08). The purpose of this category is to separate out those facilities whose outcome performance was characterized by high variance, meaning any signal was buried among a significant amount of noise. It is potentially important to note this type of performance in quality improvement, as high variation suggests the lack of a consistently performing process, which is a different quality improvement challenge than saying a hospital was unsuccessful in its efforts to improve a process. For this reason the No Change category was kept separate from the No Benefit category. One last consideration about the No Change category: some of these hospitals did exhibit a detectable change in the outcome in FY09 but were still classified here, for two reasons. First, given the high variability displayed by many of these facilities, any detected change in FY09 was unlikely to be a true change and more likely represented a chance occurrence.
Second, any improvement observed in FY09 was too distant from the occurrence of FIX to suggest any association.

Table 3-10: Description of full classification categories
No Change
  A.1  No changes observed from FY05 – FY09
  A.2  No changes observed from FY05 – FY08, improvement in FY09
  A.3  No changes observed from FY05 – FY08, decline in FY09
Improve, Not Sustain
  B.1  Immediate Loss: Improve in FY07, return to baseline in FY08
  B.2  Delayed Loss: Improve in FY07, return to baseline in FY09
  B.3  Delayed Impact: No change in FY07, improve in FY08
Improve and Sustain
  C.1  High Sustain: Additional improvements observed in FY08/09
  C.2  Moderate Sustain: No additional improvements in FY08/FY09
  C.3  Weak Sustain: Diminishing improvements, still better than FY05/06
No Benefit
  D.1  No change in FY07, but statistical changes observed elsewhere
  D.2  Decline observed in FY07

The other three categories deal with hospitals that had observable statistical changes during the first four years of the study. For these hospitals, the first step in the classification was to examine performance in FY07 (β3). Figure 3-1 is the flow chart depicting the decision process used to classify each hospital's performance. Starting with Part B of the figure, any facility that showed a decline in performance during FIX was classified as D.2. While it is possible that such facilities showed improvements in FY08 or FY09, given the lack of an observed improvement during FIX it was impossible to determine whether those improvements represented a delayed effect of FIX, the effect of a different QI project, or simple regression to the mean. With this consideration, it was determined there was no need to further sub-classify hospitals based on their outcomes in FY08/09 if there was a decline in performance relative to baseline in FY07. Next, Part C of the flow chart represents those hospitals whose performance during FY07 was flat (i.e., performance continued on the baseline trend established in FY05 and FY06).
The outcomes for these hospitals fell into one of two categories. First, the hospital could record an improvement in FY08, leading to classification as B.3. This was recorded as a possible improvement attributable to FIX, with the reasoning that FIX was a yearlong effort that aimed to improve outcomes across an entire hospital. It seemed reasonable that not all hospitals would have an immediately measurable impact in FY07; some would instead record their biggest gains in the latter half of FY07 and the first half of FY08. This is certainly the weakest category for asserting that improvements were associated with FIX and should be interpreted accordingly. The other possibility for hospitals with flat performance in FY07 was that they would continue on the pre-established baseline or exhibit some decline in FY08. These hospitals were classified as D.1 and deemed to have had no benefit attributable to FIX. The No Benefit category, representing hospitals with a D.1 or D.2 classification, marks those hospitals that initially performed with low variability, allowing detection of a clear baseline trend, which suggests they had processes in place that performed with some consistency. The key feature of these hospitals is that, as measured by the individual outcome, they were unable to make improvements to that process as part of their participation in FIX. The last set of hospitals is those that had an initial improvement during FY07, which is charted in Part A. All of these hospitals are classified as improving; it is then a question of whether they sustained those improvements. Hospitals that made an improvement in FY08 or FY09 with no declines in either period were classified as C.1, high sustainers, since they not only sustained their initial improvements but went on to make further improvements. A facility that neither declined nor improved (i.e., it simply continued the new baseline established in FY07) was classified as C.2, a moderate sustainer.
The last category of sustainer (C.3) was those hospitals that exhibited a decrease in the rate of improvement in FY08 or FY09; however, their overall performance did not decline to the point that performance on the outcome returned to pre-FIX levels. This category acknowledges that rates of improvement may level off after a QI collaborative completes, but hospitals may still maintain a high level of performance.

Figure 3-1: Decision tree used to classify hospital performance. A. Hospitals showing an initial improvement during FY07. B. Hospitals with decreased performance in FY07. C. Hospitals with non-significant (p > 0.05) performance changes in FY07.

The final category was those hospitals that were unable to sustain their improvements. If a hospital returned to baseline performance in FY08 it was classified as B.1, immediate loss. If, however, it had a slower return to baseline that did not occur until FY09, it was classified as B.2, delayed loss.

Sub-group Analyses

Although the later chapters of this study provide an in-depth evaluation of the relationship between organizational characteristics and hospital performance, this initial evaluation did consider three sub-group comparisons. The first comparison evaluated hospitals by size to determine whether the collaborative was effective across all size categories. Hospitals were classified as large (≥ 200 beds), medium (100 – 199), or small (< 100) based on the number of approved medical/surgical beds. The second comparison evaluated whether performance varied based on which learning session a team attended.
Since 130 hospitals participated in FIX, the learning sessions were broken into five separate regions (Northeast, Southeast, Central, Midwest, and West) to allow all participants to actively engage.31 Lastly, the final comparison examined whether facilities that improved (whether they sustained or not) on the primary outcomes had a different distribution of performance on the other outcomes (particularly the secondary outcomes) compared to the full group. This comparison ensured that these hospitals did not have higher than expected rates of classification into No Benefit on the secondary outcomes. All of these comparisons were done using Pearson chi-square tests comparing the distribution of the relevant sub-group to that of the overall group.

Conclusions

This chapter has discussed the methods used to evaluate performance at each hospital for each of the five outcomes of interest. Overall, the chapter covered the data sources, defined the patient cohort, and provided a detailed description of the risk adjustment and time-series modeling processes. Although the time-series methods used in this analysis are not novel, they have not previously been applied in this manner to evaluate healthcare QI. Additionally, the classification algorithm generated to aggregate facilities based on performance is a new approach. This classification approach focused on understanding how facilities may be grouped to facilitate later analyses examining how organizational characteristics impact QI efforts. The next chapter presents and discusses the results of this time-series evaluation and classification algorithm.

CHAPTER 4 – TIME-SERIES RESULTS AND DISCUSSION

This chapter concludes the first half of this study by presenting the results from the analysis of the Flow Improvement Inpatient Initiative (FIX). The analysis first considers results at the aggregate Veterans Affairs (VA) level by grouping patient discharges across hospitals.
This provides some understanding of the overall impact of FIX. However, the real purpose of this analysis is to examine the performance of each individual hospital using the time-series approach outlined in Chapter 3. After presenting these results, the chapter continues with an in-depth discussion. First, the discussion focuses on addressing the first two specific aims of the project. Second, it considers the greater implications of the findings for quality improvement in healthcare and whether there is support for using large collaboratives, such as FIX, to improve quality.

System-Wide Analysis

Although the main interest of this analysis was to understand performance at each individual hospital, it was useful to first understand the aggregate impact of FIX on the VA as a system. Viewing the data at the aggregate level provides some understanding of average performance, offering a basis for comparing high and low performing hospitals. The five years of data in this study covered 1,690,191 discharges from 130 VA hospitals. Three of the outcome measures (LOS, in-hospital mortality, and 30-day mortality) exhibited a natural 3 – 4% annual improvement in performance prior to FIX. For LOS (Figure 4-1), the time-series model identified a subtle but statistically significant increase in the rate of improvement during FIX, which was sustained through the post-intervention period. This was in contrast to in-hospital and 30-day mortality, which showed no aggregate improvements associated with FIX. In-hospital mortality (Figure 4-2) showed no statistical changes in FY07 – FY09 from the pre-established trends. For 30-day mortality there was a slight decline in performance in FY07, although, as seen in Figure 4-3, this decline does not mean 30-day mortality rates were rising; it only signified a leveling of 30-day mortality rates. Most likely this simply reflects that 30-day mortality rates were reaching optimal potential performance, leaving few achievable improvements.
The other two outcomes in this study, discharges before noon and 30-day all-cause readmission, were both statistically flat prior to FIX. The aggregate results for discharges before noon are perhaps the most intriguing in this study. As shown in Figure 4-4, there is a clear improvement during and after FIX, with discharges before noon jumping to near 24% from a baseline of 17%. Unfortunately, partway through FY08 the percentage of patients discharged before noon began to decline, reaching a rate around 20% at the end of the study. While this level of performance is still improved compared to baseline, it is unclear whether performance will level off at 20% or continue to decline back to baseline. Lastly, 30-day readmissions (Figure 4-5) showed highly variable performance, with an overall worsening during FIX.

Figure 4-1: Aggregate results for LOS (FY05 – FY09)

Figure 4-2: Aggregate results for in-hospital mortality (FY05 – FY09)

Figure 4-3: Aggregate results for 30-day mortality (FY05 – FY09)

Figure 4-4: Aggregate results for discharges before noon (FY05 – FY09)

Figure 4-5: Aggregate results for 30-day readmissions (FY05 – FY09)

Facility Analysis

Working from this initial introduction of how FIX impacted VA performance, the focus now shifts to classifying individual hospitals using the
classification approach outlined in Chapter 3. The breakdown in performance for all 130 hospitals across each of the 5 outcomes is listed in Table 4-1. These results suggest there was considerable variation both within each hospital on individual outcomes and across the five outcomes.

Table 4-1: Hospital classification across the 5 outcome measures (N = 130)
                Noon        30-Day      30-Day    In-Hospital
       LOS   Discharge   Readmission  Mortality   Mortality
A.1      2        3            5           3            2
A.2      4        8            2           1            1
A.3     30       25           78          32           28
B.1      4        3            3           0            4
B.2      0       17            0           2            0
B.3     14       21            6           4           11
C.1     16       13            4           4            7
C.2      8        3            1           7            6
C.3      3        3            1           8            4
D.1     36       23           19          27           28
D.2     13       11           11          42           39

Beginning with LOS, 45 hospitals (35%) made an initial improvement, with 27 of them (60%) able to sustain the improvements. Further, 14 of the 45 improvers (31%) had a delayed onset of improvements, which means they were not evaluated for whether they sustained the improvements. These successes are balanced by 36 hospitals (28%) in which there were no statistical changes over the entire study and 49 (38%) that saw a decline or showed no benefit associated with FIX. This breakdown between categories contrasts with how hospitals performed on efforts to improve the percentage of patients discharged before noon. Interestingly, exactly the same number of hospitals, 36, showed no statistical changes; however, they were not the same hospitals, as only 13 were categorized as no change for both LOS and discharge before noon. For improvements, a greater number made initial improvements, 60 (46%), but fewer were able to sustain them (19 of 60, 32%). Once again a fair number of hospitals, 21 of the 60 improvers (35%), exhibited flat performance during FY07 but recorded improvements in FY08. Lastly, 34 facilities (26%) did not record any benefit from their participation in FIX related to increasing the rate of discharges before noon.
The secondary outcomes were included mainly to determine whether there were declines during FIX; there was less expectation that hospitals would improve these outcomes in response to FIX. This expectation was supported, as few hospitals showed improvements, with a substantial percentage either recording no statistical changes over the study or no benefit from FIX. Of the hospitals not recording any benefit from FIX, those classified as D.2 are the most concerning, as that classification signifies a decline in performance during FIX that might mean FIX had a negative impact. For the mortality rates, 39 hospitals (30%) had declining in-hospital mortality performance and 42 (32%) had declining 30-day mortality performance. While these are concerning numbers, perhaps the more telling association would be a strong association between improvement on the primary outcomes and a subsequent decline on the secondary outcomes. As shown in Table 4-2 (LOS) and Table 4-3 (discharges before noon), the distribution of facilities that improved on either of these outcomes is no different from the overall distribution of all facilities, suggesting that improvements attributable to FIX were not associated with direct declines on the secondary outcomes. The last feature to notice in Table 4-1 is the high proportion of hospitals (65%) that showed no statistical change on 30-day readmission. This fits with the aggregate readmission graph (Figure 4-5), which suggests hospital readmissions are highly variable and potentially even associated with a random process.

Table 4-2: LOS improvers classification (N = 45)
                    Noon        30-Day      30-Day    In-Hospital
                 Discharge   Readmission  Mortality   Mortality
A.1                   0            2           1            1
A.2                   1            1           1            0
A.3                   6           28           9            9
B.1                   0            0           0            2
B.2                   8            0           1            0
B.3                   8            2           2            5
C.1                   7            1           2            5
C.2                   1            0           1            1
C.3                   1            1           4            3
D.1                  10            7          10            6
D.2                   3            3          14           13
χ² p-value (df)   0.73 (10)    0.96 (9)    0.93 (9)     0.55 (9)

Table 4-3: Discharge before noon improvers classification (N = 60)
                               30-Day      30-Day    In-Hospital
                    LOS     Readmission  Mortality   Mortality
A.1                   2            3           0            1
A.2                   3            1           1            1
A.3                  11           35          16           12
B.1                   2            2           0            1
B.2                   0            0           0            0
B.3                   8            3           2            5
C.1                   8            4           1            4
C.2                   5            0           3            4
C.3                   2            1           4            3
D.1                  13            5          13           15
D.2                   6            6          20           14
χ² p-value (df)   0.87 (9)     0.75 (9)    0.94 (9)     0.93 (9)

The other sub-group analyses showed that performance did not vary by hospital size or region. Table 4-4 displays the p-values of the chi-square tests comparing each hospital size or region sub-group to the overall population distribution. None of the comparisons were statistically significant at the p < 0.05 level, which, given the number of comparisons, may have been an inappropriately conservative threshold. The full breakdown showing the number of hospitals classified into each performance category by hospital size and region is available in Appendix C.

Table 4-4: P-values from chi-square tests examining facility performance in sub-groups by size and regional location
                              Noon        30-Day      30-Day    In-Hospital
Category (N)       LOS     Discharge   Readmission  Mortality   Mortality
Size
  Small (54)       0.97       0.73         0.58         0.51        0.77
  Medium (60)      0.71       0.18         0.75         0.37        0.95
  Large (16)       0.11       0.83         0.77         0.96        0.97
Region
  Northeast (23)   0.96       0.76         0.67         0.84        0.06
  Southeast (26)   0.34       0.98         0.72         0.72        0.62
  Central (25)     0.71       0.71         0.68         0.43        0.49
  Midwest (29)     0.76       0.75         0.36         0.51        0.87
  West (27)        0.72       0.06         0.67         0.92        0.93

Evaluation of the Specific Aims

The first specific aim evaluated in this study was whether FIX positively impacted quality and efficiency as measured by the five selected outcomes.
An evaluation of both the aggregate results and the individual facility results leads to the conclusion that FIX did result in a reduction in LOS and an increase in the percentage of patients discharged before noon, and that these improvements were not associated with any systematic negative impacts as measured by mortality or readmission rates. The aggregate results for both primary outcomes showed improvements in FY07 that were greater than expected given pre-existing trends. Both mortality rates had promising aggregate results, with in-hospital mortality showing a continuation of the pre-existing trend during FIX. The observed leveling in the rate of 30-day mortality during FIX likely reflects that there was little left to improve on that outcome. The 30-day readmission rate did show a slight increase during FIX, but given the high variability of this outcome (at the individual hospital level, 65% had no statistical changes over the entire study) and the lack of association between high performance on LOS or discharges before noon and poor performance on readmissions, this increase was unlikely to be a direct effect of FIX. This conclusion is supported by prior work showing no increase in hospital readmissions with lower hospital LOS.34 Although the aggregate results are impressive, the analyses at the individual hospital level provide a more complex evaluation of FIX. At the individual hospital level, only 35% of hospitals improved LOS and 46% improved discharges before noon. Looked at from another perspective, 50 hospitals (39%) did not improve on either of the primary outcomes and 30 (23%) did not improve on any of the five outcomes. These results are similar to, but somewhat lower than, other published reports of hospital success with collaboratives. The most likely explanation for this difference is that most of the other reports relied on team self-reports of success.
Certainly a small selection of teams that believed they had succeeded would not have produced any measurable improvements. Overall, the conclusion is that FIX was successful based on the aggregate results, but it is important to recognize that, despite receiving the same training, individual hospital performance was quite variable. This variation suggests that, while successful, there are components of QI collaboratives that can be improved to help all hospitals obtain measurable benefits from the effort. The second specific aim evaluated whether those hospitals that achieved initial improvements as part of FIX sustained the improvements for two years post-intervention. This evaluation only considers the results for the primary outcomes given that, as predicted, few hospitals recorded improvements on readmission or mortality rates during the intervention period. The two primary outcomes paint distinctly different pictures. For LOS, considering only those hospitals that improved in FY07 (i.e., those classified as B.1, B.2, C.1, C.2, or C.3), 87% sustained improvements (27 of 31). Further, 59% of the sustaining hospitals (16 of 27) were classified as high sustainers (C.1), meaning they not only sustained a new rate of improvement but exhibited additional improvements after FIX. From these results it would appear the collaborative was successful, with some individual variation, in creating sustained quality. In contrast, for discharges before noon, only 49% of hospitals improving in FY07 sustained the improvements (19 of 39). Although fewer hospitals sustained improvements, those that did were frequently high sustainers (13 of 19, 68%). The results for this outcome paint a less promising picture about sustainability.
With only about half of hospitals sustaining improvements and a clear declining trend in the aggregate results, it is hard to conclude that the solutions developed during the collaborative were specifically designed for sustained improvement. Considering the results of these two outcomes together, the overall picture suggests it is possible to improve and sustain quality after a collaborative; however, there may be important lessons in the observation that more facilities improved and sustained on LOS than on discharges before noon.

Discussion

This analysis found that a selection of hospitals achieved sustained improvements as part of their participation in FIX. However, individual hospital performance was highly variable, suggesting an opportunity to improve on the success of hospitals participating in a QI collaborative. Since variation in performance was consistent across all 5 regions and across hospital size categories, it appears the collaborative was successfully implemented; not all hospitals, however, had measurable benefits from the experience. Given this mixed evaluation, it is important to remember the complexity of these outcomes and that many different factors impact the final measurement. So while FIX strove to take a system-wide approach to improving patient flow, there may still be factors that the framework of the collaborative did not address, which would explain the limited success at some hospitals. Further, it may have been difficult to widely disseminate improvements across all medical patients in the course of a single year. Despite these inherent limitations of the effort to evaluate FIX, the results of the study still uncovered some interesting challenges to achieving high-quality healthcare. These challenges are perhaps best highlighted by the overall performance on the efforts to increase the number of discharges before noon.
While not every hospital was successful, it is evident that improvements were achieved system-wide (Figure 4-4). Yet once the collaborative ended and focus on the performance metric was reduced, only 49% of those that made improvements in FY07 sustained that performance. If many QI initiatives have a similar response profile (initial success that regresses back to the mean over time), that would explain why there have been limited measurable improvements in quality. In contrast, for the other primary outcome, LOS, 87% of improving hospitals went on to sustain improvements. However, there may be inherent differences between these two outcomes that explain the greater success in sustaining LOS compared to discharges before noon. Perhaps the most significant difference is the long history in healthcare of LOS as a performance metric. As such, providers generally accept the premise that they should work to shorten LOS, recognize a potential personal benefit from shortened LOS, know their average LOS (at least among physicians), and know how their performance compares to others. The major benefit of these features is that providers are likely to be less resistant to change, suggesting a low barrier for teams to overcome in implementing and sustaining an intervention designed to shorten LOS. The only real obstacle would be ensuring the intervention was well designed and did not create an unmanageable burden. This stands in stark contrast to the environment around increasing the rate of discharges before noon, which was a newly introduced performance metric. In that case providers will not have considered it before, will know little or nothing about current performance, and will have no basis for comparing performance. If the hospital culture is not otherwise accepting of change, this is likely to be a change-averse situation. 
Successful sustainment of improvement would therefore require a solution that not only improves outcomes but also helps providers accept and maintain the change. It may be this last part, how to handle and maintain change, where implementation teams were most likely to be unsuccessful in sustaining improvements related to discharges before noon during FIX. It is not surprising that teams would have difficulty achieving sustained improvements in discharging before noon considering that many morning activities work in direct conflict with the process of trying to discharge patients. The morning is when physician teams round on patients, nurses provide medications, phlebotomy collects blood, and the labs run tests. Not only do these activities represent a significant effort by providers, but many of them, particularly the results of morning lab tests, provide critical information for deciding whether or not to discharge a patient. With all of these barriers, a proposed solution must not only be effective but must also reduce workload burden and address information needs. If the proposed solutions did not meet all of these necessities, it is reasonable to predict that providers trialed a proposed solution, found it unacceptable, and then returned to the old way of caring for patients in the morning. Such a response certainly fits with the observed aggregate profile, where initial improvements were quickly lost with a trend back toward pre-implementation performance levels. Despite these concerns, it cannot be forgotten that at the aggregate level the observed percentage of patients discharged before noon remained above the baseline level. The final observed rate of 20% of patients discharged before noon is a 3-percentage-point absolute increase, or roughly an 18% relative increase, over the baseline of 17%. 
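A quick arithmetic check of the baseline and final rates quoted above (a sketch; rates taken from the text):

```python
# Discharges before noon: baseline vs. final observed rate (from the text).
baseline_rate = 17.0   # percent of patients discharged before noon at baseline
final_rate = 20.0      # percent at the end of the follow-up period

absolute_gain = final_rate - baseline_rate            # percentage points -> 3.0
relative_gain = 100 * absolute_gain / baseline_rate   # percent of baseline -> ~17.6
```

Note that the relative increase is computed against the baseline rate, so a 3-point gain over a 17% baseline works out to roughly 18%.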
It is worth considering the possibility that a decline from a high of around 23% of patients to a final rate of 20% may not indicate worse care or poorer hospital flow. Instead, particularly if performance levels off and does not continue to decline, a final rate of 20% may mean that hospitals have achieved an appropriate balance between provider workload burden and meeting the flow needs of their hospital. Other measures that would better capture the flow concerns of a hospital could be emergency department (ED) diversion rates, ED-to-medicine admission times, or the amount of fee-basis care for medical admissions. These, however, were not considered as measured outcomes during FIX, nor are they systematically collected. An important lesson here is that while the metric of discharges before noon was potentially useful for driving improvements, it needed to be evaluated in tandem with a more clinically or business-relevant metric to determine true success in improving flow. A secondary consideration from this analysis was the higher-than-expected percentage (28%) of hospitals classified as No Change on the primary outcomes. In fact, more hospitals recorded No Change on both primary outcomes than recorded Sustain on both (13 compared to 5). While discharges before noon did have a flat baseline in aggregate, it was particularly surprising to see so many hospitals participating in a QI collaborative exhibit no statistical change in LOS, an outcome with a distinct baseline trend. These data serve as a stark reminder that improvements in quality require a standardized process that can be analyzed and improved. QI teams should remember that they must first understand the relevant process, or lack of process, before trying to make change. 
In the end, whether the implemented solutions were ineffective, unneeded, or unacceptable, individual performance varied considerably despite all participants receiving the same training and having access to national resources. Not only did just a fraction of hospitals show sustained improvements through FY09, but 50 hospitals (39%) did not show any improvement on the two primary outcomes. This leads to the concern that perhaps the collaborative approach to QI does not add value compared to a more individual hospital approach (i.e., a QI project not associated with a collaborative). The main reason for making this comparison is that a collaborative can be an expensive undertaking; the estimated cost of FIX for VA was $5.8 million.32 When hospitals have to pay for this directly out of their budget (it is not clear who bore the individual costs of FIX), they may not want to participate should they recognize that anywhere from a third to a half of participating hospitals would not improve on measured patient outcomes. However, there are two points worth considering when evaluating the tradeoff between in-house QI projects and QI collaboratives. First, collaboratives likely provide many benefits beyond measurable improvements in outcomes. A key purported benefit of a collaborative is that it brings hospitals together to learn skills, coordinate activities, and share knowledge. For a hospital with limited QI experience, a collaborative may provide many worthwhile benefits even if that collaborative cannot be directly associated with improved quality. These sorts of benefits have been noted in prior analyses of collaboratives, which often acknowledge important cultural changes.17 Unfortunately, no data on these types of benefits were collected during FIX, so it was not possible to factor them into the analysis. 
While hospitals could achieve some of these benefits from an in-house QI effort, if they have to bring in outside resources to provide initial training, the cost is likely to be the same as, if not more than, the cost of training at a collaborative. Second, there is no good basis for understanding the individual success rate of in-house QI efforts. Further, there is little data about the costs of these QI projects. Considering that individual QI efforts are not uniformly successful and have many associated costs as well, investing in a collaborative may represent little additional risk. Given the generally poor knowledge about QI success rates and costs, there is no clear conclusion about whether a QI collaborative is a worthwhile investment. However, given the potential for hospitals to work together, there should be a general benefit from participating in a collaborative. Therefore, the second half of this study works to develop an understanding of what factors may predict an ability to succeed in a collaborative. This understanding can help hospitals decide whether they can succeed in a collaborative and, if not, identify the issues they should focus on to create an environment that will support a successful QI collaborative.

Limitations

While this study generated some intriguing results, it is important to remember that these were exploratory analyses subject to some key limitations. First, these results are based on administrative data, which means there are many unmeasured and unaccounted-for variables. A key consequence is that the analyses could not be tied to specific areas of a hospital if improvements were initially trialed on specific units before dissemination. In the case of FIX, a hospital-wide approach is supported and in some ways most appropriate. 
Since FIX aimed to improve flow throughout the entire hospital, improvement projects should have targeted broad initiatives that improved flow for all patients, not just a small subset. This is also why the internal evaluation of FIX considered all patients, not just those on targeted wards. So if teams made only small improvements during FIX, while beneficial, there is reason to argue that this would not have been a fully successful collaborative experience. Second, these results cannot isolate the impact of FIX. FIX was not a one-time, isolated QI initiative but rather the first of many systems redesign collaboratives (examples include the Patient Flow Center, Transitioning Levels of Care, and the Bedside Care Collaborative), some of which occurred during the two-year follow-up period. Additionally, VA hospitals have been encouraged to conduct numerous local QI efforts each year. Some of these other projects are likely to impact the measured outcomes (LOS and 30-day readmission in particular), meaning the detected improvements can only weakly be attributed to FIX. However, the impact of these other QI projects is of limited concern for two reasons. First, the time-series analysis accounts for baseline trends in the outcomes. To the degree that VA hospitals maintain a regular focus on QI projects, the national focus on FIX represents a single increase in effort, and all other QI projects would be accounted for by the baseline trends. Second, it is reasonable to expect that for complex outcomes, such as LOS and discharge before noon, sustained quality will not come out of a single QI project. Instead, the importance of any single QI project may be the attention it brings to a topic, the training it provides team members, and its contribution to a greater culture focused on QI. With these considerations, sustained results generated by a continuous cycle of improvement in response to FIX would be just as meaningful. 
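The way an interrupted time-series separates an intervention's effect from pre-existing trends can be illustrated with a minimal segmented-regression sketch. The data below are simulated, and the single post-intervention break is a simplified stand-in for the study's actual five-year, three-segment model:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(60, dtype=float)           # months over five fiscal years
post = (t >= 36).astype(float)           # 1 after the collaborative year ends
t_post = np.where(post == 1, t - 36, 0.0)

# Simulated LOS with a mild baseline decline, a level drop after the
# intervention, and a steeper post-intervention slope (all values made up).
los = 6.0 - 0.01 * t - 0.4 * post - 0.02 * t_post + rng.normal(0, 0.1, 60)

# Segmented regression: intercept, baseline trend, level change, slope change.
X = np.column_stack([np.ones_like(t), t, post, t_post])
beta, *_ = np.linalg.lstsq(X, los, rcond=None)
intercept, trend, level_change, slope_change = beta
```

A hospital whose `level_change` and `slope_change` are both favorable after accounting for `trend` would correspond, loosely, to the "sustain" categories described above.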
The final limitation of this analysis was the lack of information at the individual team level. Key metrics such as team leadership quality, support from hospital leadership, and actual team engagement with FIX would provide critical information for distinguishing high and low performers. FIX was a mandated QI collaborative; thus much of the variation in performance may simply be due to varying levels of engagement by teams or hospitals with the collaborative. Even if this is the reason for non-success, it is telling for VA and other policy makers that simple presence at a QI collaborative did not ensure success.

Conclusions

This chapter brings to a conclusion the first half of this study, which utilized a five-year time-series analysis to evaluate whether a large QI collaborative led to sustained improvements in quality as measured by two primary outcomes, LOS and discharges before noon. The analyses found that in aggregate there were improvements in LOS and discharges before noon. However, performance at individual hospitals was quite variable, and not all hospitals showed improvements. For those hospitals that improved, there was a high likelihood of sustaining LOS but a low likelihood of sustaining discharges before noon. Some of the decline may simply reflect a balance between patient flow and provider workload. However, if many other newly introduced quality metrics see a similar post-implementation decline, it will be difficult to achieve substantial improvements in quality. The study also considered three secondary outcomes which, as expected, showed little change and little impact associated with FIX. Based on this analysis, there are four important findings. First, in comparison to the traditional pre-post study involving team-reported success, an analysis that accounts for pre-existing temporal trends in patient outcomes identifies a smaller-than-expected group of QI teams that made initial improvements. 
Second, there may be significant loss of quality, or regression to the mean, after the completion of QI projects. Third, this novel classification approach highlighted that many hospitals operate with processes that lead to highly variable performance. These hospitals likely need to focus on creating a standardized process before undertaking serious efforts to improve any of those processes. Fourth, this analysis showed that success can be achieved across multiple hospital settings; but given the overall variation there needs to be a better understanding of what factors predict success in a collaborative.

CHAPTER 5 – SUPPORTING QUALITY IMPROVEMENT

This chapter begins the second half of this project, which considers another body of literature, develops an analytic framework, and analyzes survey data in conjunction with the results from the prior analysis to meet the goals of the study's third specific aim. This specific aim was to describe how selected components of an organization's structure were associated with an ability to sustain improvements in quality. The first half of this chapter reviews the extensive literature evaluating the relationships between different organizational characteristics and high-quality healthcare. The second half then works from the conclusions reached in this literature to develop a guiding analytic framework. The goal of this analytic framework is to posit how different classes of organizational characteristics interact to generate an environment that may or may not support successful QI initiatives. The next chapter in this section discusses how the framework was applied to analyze FIX and the methods used to generate hypotheses based on those results. The third chapter of this section then presents and discusses the results of the analysis and their implications for QI and the overall framework. 
Relationships with Healthcare Quality

A number of studies have evaluated whether different features or characteristics of an organization were associated with higher-quality healthcare. This literature has been nicely summarized in three systematic reviews. The first of these evaluated 81 publications that examined the relationship between a measured organizational variable and mortality rates.50 While mortality rates were the primary outcome of interest, the review also included studies that evaluated other adverse healthcare outcomes such as nosocomial infections, falls, and medication errors. The review considered structural variables (professional expertise, professionalization, nurse-to-patient ratio, care team mix, not-for-profit status, teaching status, hospital size, technology use, and location), organizational process variables (measures of caregiver interaction, patient volumes), and clinical process variables (implicit quality, explicit quality, and computer decision support). The general conclusion of the review was that the body of evidence for each of the organizational variable categories was equivocal at best. The only organizational variable with a consistently positive impact on mortality rates was having high levels of technology, which at the time of these studies meant having access to equipment such as ventilators and pacemakers.50, 51 The second review built on the first by focusing on how each study operationally defined the outcome of interest. 
The objective was to determine whether the operational definitions for the studied adverse events identified a mechanism through which altering an organizational characteristic could realistically improve a care process and result in fewer adverse events.52 Based on the lack of consistent evidence showing an association between any single organizational characteristic and improved quality, the authors theorized that perhaps adverse events were too broadly defined, meaning there were too many factors impacting quality and thus the measured characteristics could not reasonably lead to improved quality. This review analyzed 42 articles that provided 67 measures of different organizational characteristics and their association with medical errors and patient safety outcomes. The measured organizational characteristics broke down into 13 groups: team quality, implementation of standard operating procedures, feedback, technology, training, leadership, staffing, communication, simplification of the work process, culture, organization structure, employee empowerment, and group decision making. The operational definitions for adverse events in the studies included medication errors, medication complications, diagnostic errors, treatment-related errors, patient falls, specimen labeling errors, and other unspecified patient safety concerns. The authors noted that while most of the studies focused on medication errors and complications, there was no consistency across studies in how to define and measure a medication error or complication. This made drawing any systematic conclusions about organizational characteristics and adverse events a challenge. Additionally, the authors noted that only 9 of the studies provided sufficient detail to allow the reader to identify a specific relationship between an organizational variable and the measured adverse event. 
Given these limitations, as well as others, the review concluded that no generalizable statements could be made about how a specific organizational factor could address errors or safety in healthcare.52 The third of these systematic reviews continued to refine the process, this time by using Donabedian's structure-process-outcome model as a framework for structuring the analysis.53 This review identified 92 articles and analyzed them to understand whether sequentially close Donabedian relationships (e.g., process-outcome) had more consistent and positive findings than distant relationships (e.g., structure-outcome).54 The review also examined whether studies considered definitions of quality that included improving services rather than simply defining quality as a reduction in negative events. The study evaluated 19 structure-process, 58 structure-outcome, 20 process-outcome, and 9 process-process relationships. Much like the prior reviews, this systematic review found that the preponderance of organizational factors studied were associated with non-significant findings.54 These non-significant findings were most frequent when examining the distant structure-outcome relationships, which were the most commonly examined in the literature. A general concern with these studies was that they did not consider or evaluate any of the intervening process variables that would help explain why some studies identified positive impacts while others had negative or non-significant outcomes. When studies examined the sequential Donabedian relationships of structure-process or process-outcome, cross-study results were more consistent and there were greater odds of detecting a statistically significant relationship between an organizational variable and a measure of improved quality. 
The review of this literature highlights that components of organizational structure and care quality have a complex relationship that is difficult to analyze. Those components that have a direct cause-effect relationship (e.g., certain forms of technology, nurse-patient ratios) quite frequently have positive effects on quality. However, more peripheral factors (e.g., affiliation with a medical university) that lack that direct linear relationship show contradictory results across studies, leading to a conclusion of non-significant impact when analyzed in aggregate. One key conclusion from this research is that multiple organizational characteristics contribute to any single measure of quality. Therefore, any analysis that does not appropriately model the complex relationships between organizational characteristics and quality outcomes cannot expect to ascertain a strong relationship between factors. This approach would likely require a multilevel analysis that could test how different variables interact and mediate each other to support quality. Few studies have the data for this type of analysis, but when such data were available they did help identify meaningful relationships, even helping identify how important intervening factors could inhibit quality. For example, an analysis of reengineering efforts across 497 hospitals initially found that reengineering was detrimental from a cost-competitive standpoint.55 However, when using a multivariable analysis that adjusted for indicators of organizational support and quality of the implementation, the study identified trends showing that, if successfully implemented, the reengineering efforts were beneficial.55 Of course, as potentially indicated by the variability in performance with FIX, the question of how to successfully implement QI is an important and little-examined topic. 
Before addressing the literature related to the implementation of QI, it is worth noting some key limitations associated with these reviews and the studies they summarized. The first limitation was the difficulty of defining and measuring quality. Early studies focused on efforts to reduce mortality rates, which, as a generally rare and complex event, were difficult for any broad organizational characteristic to significantly impact.50 Later efforts identified more modifiable targets of quality (e.g., reduce adverse events, improve patient satisfaction) and were able to uncover some relationships. However, the operational definition of the same outcome frequently varied between studies, making it difficult to determine whether any relationships existed across healthcare institutions or only in those where the studies occurred. Some of these same problems plagued the analysis of FIX: LOS and discharges before noon represented composite outcomes that likely did not measure the true quality goals, and this limitation will impact the results of the analyses in this study. Some recent efforts have addressed these issues and will lead to better and more consistent measures of quality. As one example, the National Healthcare Quality Report, published annually since 2003, promotes the systematic collection of quality measures allowing comparisons between hospitals.2 A second limitation identified in the reviews was weak methodology. One weakness of the early studies was that they did not adjust for patient severity. Now that risk adjustment is an accepted standard in health services research, the more recent studies all used appropriate risk-adjustment procedures. However, even with risk adjustment these studies often suffered from methodologically weak study designs. 
Most of these studies employed an observational design and could not address characteristics that varied between different healthcare institutions or how those variables might confound any observed relationships. In fact, given the number of postulated organizational factors that may impact quality, with each individual study considering only a few organizational characteristics, they all potentially suffered from significant unmeasured confounding. A few studies did use a stronger methodology and utilized an interventional design with quality measured before and after a change in the organizational characteristic. These studies, however, utilized pre-post designs, did not consider natural trends in the outcomes, frequently analyzed distant structure-outcome relationships, and reported results from only a single site. A number of biases, particularly historical bias and regression to the mean, threaten the validity of these studies. While not an inherent limitation of these systematic reviews, one final consideration was that the reviews focused only on how the presence or absence of different organizational characteristics was associated with quality. However, it may be more important to evaluate how an organizational characteristic supports the process of improving quality. This concept moves away from efforts focused on identifying distant relationships between features and instead explores how QI teams conduct improvement projects and how they use resources and otherwise interact with their surrounding environment. The first step in this process was to examine whether different organizational characteristics were associated with successful QI initiatives.

Relationships with Quality Improvement Efforts

The relationship between organizational characteristics and quality improvement efforts has been less studied, but there are three notable studies to consider. 
The first of these studies considers the process of organizational learning in neonatal intensive care units (NICUs).56 This paper synthesizes theories from best-practice transfer, team learning, and process change to develop hypotheses testing the relationship between concepts such as learn-what (activities related to learning what the best practice is), learn-how (activities related to operationalizing or implementing best practice), and psychological safety with success in a QI initiative. The data in the study represent 1,440 survey respondents spread over 23 NICUs. The results of the survey indicated that perceived implementation success was associated with respondents feeling there was a greater body of evidence supporting the intervention, a greater sense of psychological safety at the institution, and high use of learn-how activities. They did not find any association with learn-what activities, nor did any of the control variables measuring structural characteristics have any impact. Some limitations of the study were that it studied only 23 NICUs, all of which had self-selected into the collaborative. Additionally, among the NICUs participating in the collaborative, there was a low rate of participation in this study and a low response rate among providers at the NICUs that did participate in the survey. Although this study did not examine more traditional organizational characteristics, it did establish that certain characteristics are associated with perceived success at implementation of a QI collaborative. The next critical article was a systematic review that examined how organizational context was related to quality improvement success. The majority of the 47 studies in the review examined QI projects associated with the Total Quality Management (TQM) or Continuous Quality Improvement (CQI) approaches.57 The analyzed studies most frequently measured success with QI based on pre-post data. 
A small selection of the studies reported only team-perceived success. Factors that were associated with improvement were management leadership, organizational culture, use of information systems, and prior experience with QI. Additionally, there was support for physician involvement, microsystem motivation to change, available resources, and quality of the QI team leadership. The findings of this review were difficult to interpret since it could only measure those factors included in the reports, none of which had the specific goal of testing the role of specific organizational characteristics. As such, any individual factor was mentioned in only 20% of articles, leaving small sample sizes from which to draw conclusions. The strength of the paper is that it starts to identify a collection of variables that studies should evaluate when working to identify which organizational characteristics best support QI. The last article to consider reported on 99 interviews conducted at 12 hospitals that participated in the Door-to-Balloon (D2B) Alliance.17 The hospitals were recruited into this study based on the reported influence of the D2B Alliance on improving care at their hospital, with 6 reporting a strong influence and 6 a limited influence. The qualitative analysis of the interviews was based on a realistic evaluation framework focused on identifying the contextual environment that led to the hospitals' perceived impact of the D2B Alliance. This analysis revealed that a perceived need to change, openness to external sources of information, and a strong champion for change were all contextual factors consistently associated with the D2B Alliance having a strong impact. 
While this study considered only a small number of hospitals, the interviews provided a wealth of information on various organizational characteristics, providing the best assurance that the identified associations between organizational characteristics and QI success were at least true associations at those individual hospitals. This collection of articles suggested that a number of factors can impact a team's success with a QI effort. In contrast to the prior section, the supported organizational characteristics are generally closely associated along the causal pathway with the measured outcome of interest. The most notable exceptions to this concept were the more broadly defined features such as psychological safety and organizational culture. While it is important to recognize that the identified organizational characteristics were associated with successful QI efforts, there is little available information on what constituted a good organizational culture or supportive leadership. The next challenge for healthcare QI may be in determining how to best create the environment and necessary support structures to allow effective QI. In concluding this literature synthesis around the relationship between organizational characteristics and healthcare quality, three key concepts stand out. Future studies should focus on these concepts as they work to overcome the limitations of this prior work and begin to develop an understanding of how to best improve healthcare quality. First, there should be consideration of how organizational features and processes interact to support quality. Second, the overall context of an organization impacts its QI efforts. As such, analyses need to compare across multiple organizations in order to best understand the relationships between organizational characteristics and outcomes. 
Third, longitudinal analyses related to specific interventions will help establish a causal relationship showing how structures support quality. This study's analysis of the results from the FIX collaborative addresses some of these limitations. The analyses use survey data collected during FIX to understand how a large collection of organizational characteristics was associated with performance during FIX. The focus was to identify whether any modifiable organizational characteristics were part of a collection of characteristics commonly associated with success in FIX. The identified characteristics would then be potential targets for intervention, allowing an unsuccessful hospital to adopt changes that will help support future QI efforts.

Analytic Framework

In order to best understand how organizational characteristics related to FIX performance, the first step was to develop an analytic framework to structure the analyses. The starting point in this process was to identify a theoretical approach to guide the development. Based on the literature review, there was no established theoretical approach guiding the field. After surveying a selection of organizational theories, realistic evaluation was selected as the approach that best matched the purpose of this analysis. Realistic evaluation theory, originally developed for improving the quality of evaluation for public policy interventions, focuses on understanding the context of the situation where an intervention occurs and how factors interact to lead to the observed result.58 A common quote that succinctly summarizes the theory is to understand "what works for whom in what circumstances."58 In effect, the work argues that success in one situation will not always translate to another and that it is a complex interaction of factors that results in improvement or failure. This theory contributes two important characteristics to this analysis. 
First, it led to the decision to use a data mining approach to analyze the data. The support for this decision will be discussed in the next chapter. Second, it provides the superstructure for the framework. This superstructure conceptualizes a QI effort (in this case FIX) as an external stimulus applied to a specific organizational context. This organizational context responds to the QI effort and produces a set of measureable outcomes. A model of this framework is outlined in Figure 5-1. This superstructure, however, does not address the key objective of realist evaluation, which was to thoroughly understand the characteristics of the organizational context and how those characteristics interact to generate the outcomes. Understanding this required developing a more detailed model of the organizational context that shapes a QI effort. This process began with a consideration of the organizational characteristics covered in the literature. This consideration revealed that there was no succinct list of factors, but instead suggested that factors may be categorized into specific classes. A further refinement of this concept came from a review of the SQUIRE (Standards for QUality Improvement Reporting Excellence) Guidelines.59 These publication guidelines encourage authors to describe various aspects of the organizational context that might impact a QI project. The consideration of these two factors led to the identification of four classes of contextual factors that may impact success with QI efforts: 1) facility structure, 2) QI structure, 3) QI processes, and 4) team character.

Figure 5-1: Analytic framework for how organizational context impacts QI

The first class, facility structure, represented factors describing the basic structural characteristics of the healthcare institution. These factors were conceptualized as generally unmodifiable variables (e.g., facility size).
Despite their unmodifiable nature, these factors create a critical foundation that not only supports but also interacts with the other classes to create the environment that responds to the QI project. So even though these factors may be unmodifiable, their interactions were critical and necessary to include in the analytic framework. The next class, QI structure, also represented structural components, but these were distinguished from facility structural factors in two ways. First, variables selected for this class should be more likely to directly impact or support QI activities. Second, these structural variables should be more modifiable than the innate characteristics of a hospital. Some examples of variables that might fit into this category include nurse-to-patient ratios, levels of support staff, or the availability of critical resources for providing quality care. In total, these first two variable classes provide a general context for understanding overall characteristics, unique challenges, and available resources for successful QI. The third class, QI processes, consisted of factors that measured prior experience with QI. The goal of these factors was to understand how ubiquitous QI is in the environment. These factors were considered important to include based on two theories. Most directly, hospitals that consistently pursue QI should be more likely to have determined how to best run a QI project. Further, the more ubiquitous QI was at a facility, the greater the odds of sustaining improvements due to a continuous cycle of improvement preventing any significant decline in quality. Indirectly, high levels of QI activities should increase the likelihood of an overall hospital culture that supports QI. When providers accept, support, and participate in QI, there should be a decreased prevalence of change resistance, suggesting a greater probability for successful implementation of QI solutions.
The last class, team character, consists of variables defining the QI team. Important variables in this class would measure team make-up, team functioning, and team organization. These variables measure the quality of the team, as it is important to recognize, particularly in the setting of a failure, whether it was poor support that a quality team could not overcome, or whether quality support seemed to be present but the QI team was unsuccessful because it could not function effectively. The last step in developing the analytic framework was to consider how to model the interaction between each of the four components. Some of the most recent analyses of organizational structure have been based on Donabedian's structure-process-outcome model for quality assurance.53 A major concern with that framework was its prescription of a tight linear interaction between sequential components in the model. It seems more likely that for quality improvement there is a complex interplay between components to generate the specific organizational context. A potential example of this is the interaction between QI process and QI structure. As QI activity becomes increasingly common, hospitals are more likely to see benefit from QI and increase their willingness or desire to invest in QI structure. Therefore, rather than seeing these four components as part of a causal pathway, they are thought of as layers that build on top of each other. These layers are represented as a triangle because that representation helps emphasize a few concepts. First, it introduces the concept that organizations need to build up toward QI success. It seems unlikely that even the most highly functioning team can succeed if a proper foundation does not support the efforts. Building on this concept, the area associated with each class of variables signifies its relative importance.
Although a hospital may not be able to modify its innate characteristics, it is important to understand how those characteristics impact how the hospital functions. A successful QI effort in a large urban hospital may not translate to a small low-volume critical access hospital, and it is important to recognize and understand what characteristics result in success and failure in these disparate settings. Lastly, the overall shape should convey the idea of scaling or climbing a mountain. The intention is to remind people that improving quality is not an easy task, but instead a skill that must be carefully honed and perfected if there is a hope of achieving the ultimate summit.

Conclusions

This chapter has focused on understanding how organizational characteristics relate to quality improvement. A review of the literature revealed that, although extensively studied, there were few conclusions about which organizational characteristics can improve quality or effectively support QI efforts. This was most likely attributable to the complex nature of healthcare and the inability to isolate any single factor. Instead, efforts to understand how organizational characteristics can improve or support quality need to consider how individual factors interact, an effort that likely requires more complex modeling and analytic approaches. As a first step along this path, this chapter introduced a new analytic framework. This framework identified four key classes of factors that likely play a role in modulating the success of a QI project. This analytic framework will be further explored in the following chapter as it is applied to the FIX initiative to understand whether any collections of organizational factors were commonly associated with an ability to improve and then sustain those improvements.
CHAPTER 6 – ANALYTIC VARIABLES AND DATA MINING

This chapter defines and provides an overview of the analysis of how organizational context modifies a quality improvement (QI) collaborative to result in measured outcomes. The first portion of this chapter continues to expound on the analytic framework introduced in Chapter 5 by discussing two additional data sources that served as the basis for this analysis. These two surveys measured a number of organizational variables in Veterans Affairs (VA) hospitals during FY07, the same year as the Flow Improvement Inpatient Initiative (FIX). The measured variables will be introduced and classified into categories based on the analytic framework and their perceived relationships to QI activities. The second portion of this chapter addresses the analytic methods. This involves an introduction to the data mining process as well as the specific method used in this analysis, a decision tree. After covering the details involved in establishing and analyzing the dataset, the chapter concludes with a brief discussion about the process of interpreting and evaluating the decision tree. This covers both the identification of hypotheses to drive future research as well as determining whether the models suggest a need to modify the analytic framework.

Organizational Characteristics in VA

While this research examined individual hospital performance in response to participation in FIX, it cannot be forgotten that these hospitals are all part of VA healthcare, the largest integrated healthcare system in the US. There are many characteristics of VA healthcare that make it a useful case study for this analysis but that also may impact how well some of the findings generalize to a larger population. VA healthcare is one of the three administrations of the cabinet-level Department of Veterans Affairs and operates with direct congressional oversight.
Currently over 5.5 million veterans receive some portion of their healthcare through VA, 3.1 million of whom have conditions connected to their military service.60 The VA patient population is just over 8% female and 20% minority. The healthcare network includes over 1,000 service locations, including the 130 acute care hospitals in this study. As part of a large integrated system, VA healthcare has developed regional divisions that coordinate efforts in a region and promote high quality care. The VA adopted this regional network, known as the Veterans Integrated Service Network (VISN), in 1996.61 The 21 individual VISNs promote high quality care through a number of mechanisms, including VISN-level budget control, adoption of VISN-mandated performance measures, establishment of drug formularies, and the promotion or coordination of QI efforts. The VISN structure potentially impacts this study, as interactions between an individual hospital, its network hospitals, and VISN leadership could modify a hospital's behavior in response to a QI effort. As such, VISN membership as well as other measures of hospital-VISN interaction were considered in the analyzed variables. Another critical feature of the integrated VA network is the existence of a comprehensive electronic medical record. Early versions of the VA electronic medical record first appeared in 1978.62 Over the years the electronic medical record, known as CPRS, has evolved into a format that is highly standardized but also allows for flexibility. The basic standardized structure supports patient care and uniform data collection across facilities. The flexibility in the interface allows individual facilities to develop, test, and implement unique solutions to address local needs. As such, the electronic medical record and its interface play a significant role in many QI solutions, and measures of CPRS use as part of quality improvement were appropriately represented in many study variables.
An additional key characteristic of VA that supports this research is its broad survey culture. VA conducts numerous surveys each year; some are repeated at regular intervals while others represent targeted research efforts. This study utilized data from two surveys, both of which were completed by key hospital representatives during FY07, making these surveys an accurate snapshot of the organizational context at the time of FIX. The first of these surveys, the Survey of Intensive Care Units & Acute Inpatient Medical & Surgical Care in VHA (HAIG), was a biennial survey of key facility attributes that all VA hospitals completed.13 Any values reported in this survey were considered the official record for those features. The other survey used in this study, the VA Clinical Practice Organizational Survey (CPOS), was a one-time survey designed to evaluate clinical practice characteristics that may be associated with quality care and high performance.14 The objective of the survey was to measure organizational readiness for change, particularly in the primary care arena, since VA was shifting its care focus away from acute care episodes to coordinated primary care efforts. Since many VA hospitals support both acute care and primary care facilities, a number of the survey responses are relevant to the FIX efforts to improve acute care flow. The survey was sent to Chiefs of Staff at 160 VA facilities, with 86% of facilities responding. A few data elements overlap between these two surveys; since the HAIG survey is more complete and considered an official VA record, it was used in any situation where data elements were duplicated.

VA Hospital Organizational Context

The process of providing inpatient care, let alone attempting to improve the quality of that care, is exceedingly complex.
As such, there were a considerable number of variables identified in these two surveys that may help characterize how organizational context responds to a QI effort to produce measured outcomes. This section walks through these variables and, where appropriate, discusses how the variables were categorized into the various classes established in the analytic framework. An important first step in describing these variables is to understand the different scales used for categorical variables in the CPOS survey. A total of 12 response scales were used in the CPOS survey, with the scales varying from 3 to 6 response options. Table 6-1 lists each of these scales, with the first column listing the stem of the response. For example, the stem "useful" with 3 response options means that the respondents selected from the options: not useful, somewhat useful, or very useful. Those stems that are numbered signify that there were multiple scales with the same stem. The stems in this table serve as the "type" labels in the tables that follow in this section. For analytic purposes, these variables were recorded with numeric values equivalent to the scale levels in the table.

Table 6-1: Categories for different response scales in the CPOS survey
Stem | 1 | 2 | 3 | 4 | 5 | 6
Useful | Not | Somewhat | Very | | |
Barrier | Not | Small | Moderate | Large | |
Monitoring | Not | Annually | Quarterly | Monthly | |
Importance | Not | Somewhat | Moderately | Very | |
Implement1 | No Plans | Planned | Partially | Fully | |
Implement2 | Not | V. Little | Some | Great | V. Great |
Challenge | V. Difficult | Difficult | Neutral | Easy | V. Easy |
Responsible | Mostly VISN | More VISN | Share equally | More Hospital | Mostly Hospital |
Sufficient1 | Not | Barely | Somewhat | Mostly | Completely |
Sufficient2 | Never | Rarely | Sometimes | Usually | Always |
Cooperate | V. Great | Great | Some | V. Little | None |
Percentage | None (0%) | Few (1-20%) | Some (21-40%) | ~Half (41-60%) | Most (61-90%) | All (≥91%)

Working from the base of the triangle that describes the organizational context in the analytic framework, the first collection of variables to describe were those that measure characteristics of the facility structure. These variables, listed in Table 6-2, ideally measure facility characteristics that were not related to specific quality improvement efforts and were generally stable from year to year. The first six variables in the table, region, VISN, facility type, wards, ICU level, and ICU status, were easily identified as basic hospital demographic variables, making them measures of facility structure. The next variable, academic affiliation, was a strong candidate for this class, but it was frequently tested in early studies for its relationship with quality, suggesting it could be classified as a measure of QI structure. Considering that in those studies academic affiliation had no consistent direct association with mortality rates and that the associations between VA hospitals and academic institutions were static, the conclusion was that any association academic affiliation has with quality was distant and more emblematic of a measure of facility structure. The counts of operational beds (whether ICU or acute care) were slotted into this class because, while they may change, in VA they do not change dynamically but instead only in response to long-term facility planning, which has been driven by patient volume or building remodeling, not by a specific consideration about improving quality.
Table 6-2: Variables measuring facility structure
Variable | Source | Type | Description
Region | — | Categorical | FIX learning session region
VISN | — | Categorical | Veterans Integrated Service Network
Facility Type | HAIG | Categorical | Primary, secondary or tertiary
Ward | HAIG | Count | Number of 9 different types of wards in a hospital
ICU Level | HAIG | Categorical | Level 1, 2, 3, 4 or No ICU
ICU Status | HAIG | Categorical | Closed, Open, Open with Mandatory Consult
Academic Affiliation | HAIG | Yes/No | Is hospital affiliated with an academic medical center
# Operational ICU Beds | HAIG | Continuous | Total number of active ICU beds
# Operational Acute Care Beds | HAIG | Continuous | Total number of active medical and surgical beds
Annual Volume | PTF | Count | Number of FY07 medicine discharges
Rural | PTF | Categorical | Based on % of rural patients
Total Wards | Calculated | Count | Total number of separate wards
Specialty Wards | Calculated | Count | Number of telemetry, step-down or respiratory specific wards
Discharges/Bed | Calculated | Continuous | # FY07 discharges / # acute beds

The last individual variables in this class, annual volume and rural, were two measures that reflect broad features (size and location) of a hospital's patient population. These data come from the Patient Treatment File (PTF) and not from either of the surveys. The rural classification is a three-tiered classification of urban, rural, and highly rural based on the percentage of patients discharged from the facility in FY07 that fall into the classifications of Small Rural or Isolated using the Rural-Urban Commuting Area Code system.63, 64 These variables were classified as facility structures because, while they may change from year to year based on which patients seek inpatient care, the changes were outside the hospitals' control and cannot be manipulated to directly impact quality. Beyond these individual variables available in the surveys or patient records, this class also included three composite or calculated variables.
The two composite variables dealt with the number of wards in the hospital. The first was a count of the total number of wards, while the second was a count of the specialty wards in the hospital. Generally, the number of wards serves as an additional marker of facility size, although the counts may also have potential positive or negative associations with quality. For example, allowing specialization on wards may improve quality for certain conditions, but this situation could also increase the number of in-hospital care transitions, which can be vulnerable periods. The calculated variable, discharges per bed, represented the simple division of the total annual volume by the total number of active acute care beds. This was conceptualized as a crude measure of workload or provider burden, with the theory that it may be a marker for provider motivation to accept suggested changes for both of the primary FIX analysis outcomes. The next set of variables, listed in Table 6-3, represent the individual measures of hospital QI structure. These variables consider hospital characteristics that exhibit increased flexibility compared to those identified as measures of facility structure. Additionally, these variables should have more direct theorized associations with quality. As a broad set, these variables generally measure the structures necessary for providing care. The quality or quantity of these structures impacts not only the ability to provide basic care but potentially the ability to undertake improvement efforts. The underlying theory is that an effective structure for supporting QI likely requires providers, support staff, and resources that sufficiently meet basic patient needs while also having the flexibility or additional capacity to support QI efforts.
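As a concrete illustration, the calculated workload variable described above, discharges per bed, is a simple ratio of annual volume to operational acute care beds. A minimal sketch follows; the values and function name are hypothetical, used only to make the computation explicit.

```python
# Minimal sketch (hypothetical values, not study data) of the calculated
# workload variable: annual medicine discharge volume divided by the
# number of operational acute care beds.

def discharges_per_bed(fy07_discharges, acute_care_beds):
    """Crude workload / provider-burden proxy for a single hospital."""
    if acute_care_beds <= 0:
        raise ValueError("acute_care_beds must be positive")
    return fy07_discharges / acute_care_beds

# A hypothetical hospital with 2,400 FY07 medicine discharges and 60 beds:
print(discharges_per_bed(2400, 60))  # 40.0
```

The same pattern applies to the discharges-per-nurse variable introduced with the QI structure class, substituting medicine nurse FTEE for bed count.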
As an example, hospitalists were classified as QI structure because one influence in adopting a hospitalist program was the concept that physicians employed by a hospital focusing on inpatient care will be more efficient, have a better understanding of the inpatient care system, and can potentially justify protected time to participate in QI.65-67 Similarly, nurse staffing and nurse-to-patient ratios consider whether sufficient staffing was present to provide consistent care and whether nurses would be able to participate in and support QI efforts. Included in this section of variables was the set labeled as barriers to improvement, as all 3 measures ask whether there were insufficient numbers of providers or staff for achieving desired improvements. The last collection of variables in this set measured basic events that were not directly related to QI, meaning they did not qualify as QI processes, but still potentially contribute to the development of an organizational culture that supports QI. The first set of these variables measured the cooperation and communication between providers and departments, a particularly important consideration for a broad QI effort such as FIX. The last two measures, performance monitoring and utilization review, provided information about how much data was available for QI as well as establishing how accustomed providers would be to interpreting performance data.
Table 6-3: Variables measuring QI structure
Variable | Source | Type | Description*
Hospitalists | HAIG | Categorical | Hospitalists used on: All, Some, or No Wards
Medical Nurse FTEE | CPOS | Continuous | Number of Full Time Employee Equivalents (FTEE)
ICU Nurse to Patient Ratio | HAIG | Categorical | Reported for Day, Evening & Night Shifts; 1:1, 1:2 or 1:3
Sufficient Staff | CPOS | Sufficient 1 | Were 7 types of staff sufficient for inpatient care needs
Barriers to Improvement | CPOS | Barrier | Extent to which 3 measures were a barrier to improvement
Inpatient Resources | CPOS | Sufficient 2 | Were 9 types of resources adequate for inpatient care
Communication & Cooperation | CPOS | Cooperate | 3 measures of the quality of communication or cooperation
Performance Monitoring | CPOS | Monitoring | How frequently are 6 performance measures monitored
Utilization Review | CPOS | Percentage | What percentage of 3 types of admissions are reviewed
* For full listing of grouped variables see Appendix D

Table 6-4: Calculated and composite measures of QI structure
Variable | Description*
Discharges / Nurse | # FY07 discharges / # medicine nurse FTEE
Total Staff | Sum of all 7 sufficient staff variables
Clinical Staff | Sum of 3 sufficient clinical staff variables
Support Staff | Sum of 4 sufficient support staff variables
Total Barriers | Sum of all 3 barrier variables
Total Resources | Sum of all 9 resource variables
Space Resources | Sum of 2 space resource variables
Technology Resources | Sum of 6 technology resource variables
Communication | Sum of 2 communication variables
Total Performance Monitoring | Sum of all 6 performance monitoring variables
Monitoring Level | Level of monitoring for each of the 6 performance monitoring measures
Total Utilization Review | Sum of all 3 utilization review variables
* For further details see description in Appendix D

In addition to these individual variables, the QI structure class also included several calculated and composite variables, listed in Table 6-4.
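The construction of these composites can be made concrete with a short sketch. The responses and helper names below are invented for illustration: a categorical CPOS response is recoded to its numeric scale level (Table 6-1) and the levels are summed within a variable set, as in the Total Barriers composite (Table 6-4).

```python
# Minimal sketch (invented responses and function name) of composite scoring:
# recode each categorical CPOS response to its numeric scale level, then sum
# within a variable set to form a composite.

# "Barrier" stem from Table 6-1: Not=1, Small=2, Moderate=3, Large=4
BARRIER_SCALE = {"Not": 1, "Small": 2, "Moderate": 3, "Large": 4}

def composite_score(responses, scale):
    """Recode each categorical response to its numeric level and sum them."""
    return sum(scale[r] for r in responses)

# One hypothetical hospital's three barrier-to-improvement responses:
total_barriers = composite_score(["Small", "Large", "Not"], BARRIER_SCALE)
print(total_barriers)  # 2 + 4 + 1 = 7
```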
Much like the calculated variable in the facility structure class, the calculated variable here was a crude measure of workload, this time the annual volume divided by the total medicine nurse FTEE. The composite variables represent aggregate measures for each set of individual variables. These variables acknowledge the likelihood that any single resource or barrier cannot significantly and consistently impact QI, but instead it may be the collection of these variables that has meaning. Therefore, all sets of variables have a composite variable representing the sum of the individual variables. For two sets of variables, sufficient staff and inpatient resources, two additional composite variables were calculated representing distinct subsets within those variable sets. The last variable of note was monitoring level. This variable was recorded for each of the 6 performance measures and reflects whether the performance monitor was measured at the facility, clinic, provider, or some combination of those three levels. The next set of variables, listed in Table 6-5, represent measures of hospital QI processes. These variables measured prior experience with quality improvement at the hospital and likely should have a strong relationship with current QI performance. The first set of variables in this collection measured products and outcomes likely associated with prior QI efforts. The first measure, clinical order sets, evaluated whether a hospital implemented an electronic order set or a clinical reminder for 8 common inpatient conditions. Further, if an order set had been implemented, respondents indicated whether the clinical order set or reminder was viewed as useful. The next variable was a similar measure related to the implementation of evidence bundles for common ICU events.
The last variable fitting into this set measured which of 7 different approaches hospitals used to encourage adherence to clinical practice guidelines for 3 different conditions: acute myocardial infarction (AMI), congestive heart failure (CHF), and community-acquired pneumonia (CAP). The next set of variables in this class considered different drivers of local QI. The first variable, QI information, considered the role played by seven different potential sources of information for guiding QI efforts or strategic planning. The other, driving force, considered the split in responsibility between the VISN office and the individual hospital for six activities that support QI.

Table 6-5: Variables measuring QI process
Variable | Source | Type | Description*
Clinical Order Sets | CPOS | Implement 1 / Useful | Are clinical order sets or reminders implemented in CPRS for 8 conditions; if implemented, how useful
ICU Evidence Bundles | HAIG | — | Are 10 different evidence bundles implemented, paper or electronically
Clinical Practice Guideline Adherence | CPOS | Yes/No | Use of 7 methods for adhering to guidelines for 3 conditions (AMI, CHF, CAP)
QI Information | CPOS | Importance | How important are 7 sources of information for guiding QI efforts
Driving Force | CPOS | Responsible | Is the VISN or hospital primarily responsible for 6 activities
Clinical Reminders | CPOS | Yes/No | Which of 8 methods are usually used to develop reminders
Performance Improvement | CPOS | Implement 2 | To what extent have 11 actions been implemented
Guideline Implementation | CPOS | Implement 2 | Presence of 6 factors in response to guideline implementation
Clinical Champions | CPOS | Challenge | 6 challenges related to clinical champions
Facility Environment | CPOS | Cooperate | Four measures of hospital culture and support
Performance Awards | CPOS | Yes/No | Use of 4 types of incentives related to improving performance measures
Award Distribution | CPOS | Percentage | On average, percentage of awards given to groups
ED QI Teams | CPOS | Yes/No / Continuous | Has the hospital implemented a QI program in the ER; if yes, how many teams
* For full listing of grouped variables see Appendix D

The final set of variables in this class examined factors related to general team performance and actions with QI activities. These were included in the QI process class, as opposed to the team character class, in the framework because these were general measures about QI at a facility and not measures of the specific members of the team involved in FIX. The first in this section, clinical reminders, considered whether eight different methods were typically used to develop reminders in CPRS. The next, performance improvement, measured the extent of implementation of eleven actions to improve VA clinical performance. The next three, guideline implementation, clinical champions, and facility culture, all examined different challenges or responses from hospital staff related to QI efforts. Lastly, a collection of variables considered the use of awards to encourage performance improvement and the number of QI teams implemented in the emergency department (ED) in the year prior to the survey. Just as in the QI structure class, the QI process class had several composite variables, listed in Table 6-6. Generally these composites represented the sum of measures across a group of individual categories. However, for order sets, guideline adherence, clinical reminders, and performance improvement there were additional sub-groupings. For clinical order sets, these sub-groups consider the number of fully or partially implemented sets, the number of planned sets, and an average usefulness rating across the implemented sets. The guideline adherence sub-groups reflect the number of methods used to address each of the individual diseases (disease total) as well as for how many diseases a method was used (method total).
The sub-groups for clinical reminders separated the five activities involved in the development of a clinical reminder from the two activities involved in evaluating a clinical reminder after implementation. Lastly, for performance improvement, the sub-groups considered a collection of measures related to establishing a team as well as shifting resources between areas in the hospital in an effort to improve performance.

Table 6-6: Calculated and composite measures of QI process
Variable | Description*
Total Clinical Order Sets | Sum of all 8 conditions
Implemented | Count of all partially or fully implemented
Planned | Count of planned order sets
Usefulness | Average usefulness of implemented order sets
Total Evidence Bundles | Count of all implemented ICU evidence bundles
Total Guideline Adherence | Sum of all 21 fields (3 diseases x 7 methods)
Disease Total | Sum of all 7 methods for each disease
Method Total | Sum of method use across the 3 diseases
Total Information | Sum of all 7 sources of information
Average Driving Force | Average across the 6 variables
Total Clinical Reminders | Sum of all 8 methods
Development | Sum of 5 measures related to development
Post | Sum of 2 measures related to post-implementation
Total Performance | Sum of all 11 activities
Establish | Sum of 3 activities related to establishing a team
Shift | Sum of 2 activities related to resource shifting
Guideline Implementation | Sum of 2 measures of implementation process
Guideline Resistance | Sum of 2 measures of resistance
Total Clinical Champion | Sum of 6 clinical champion measures
Facility Culture | Sum of 2 measures of culture
Facility Support | Sum of 2 measures of financial support
Total Incentives | Sum of 4 measures of incentive use
* For further details see description in Appendix D

The final class in the analytic framework, team character, was not represented in this analysis.
The data from the two surveys reviewed thus far did not directly pertain to FIX, but only represented the organizational environment at the time of FIX. As part of FIX there were some surveys completed by the participants, but these data were not available at the individual or team level, as they were aggregated at the regional level. Further, this information was unlikely to provide much insight, as many of the questions showed greater than 95% of respondents responding positively (agree / strongly agree). So rather than include these data, which could lead to erroneous interpretations of the final model, they were excluded. While this was a clear limitation of the analysis, for an exploratory analysis it was not a fatal one. The impacts of this limitation will be discussed during the results review in the next chapter.

Data Mining Overview

This section provides an overview and discussion of data mining as a technique for developing an understanding of how these measures of organizational context may relate to hospital performance during FIX. The challenge to this task was that while the literature review in Chapter 5 identified a number of studies that examined relationships between quality and organizational characteristics, due to the complexity of these relationships there were few consistent meaningful associations. This study selected data mining as a tool that could capably analyze these data and effectively identify complex associations and patterns that describe hospital performance during FIX. Any identified associations would then serve as a basis for developing future studies that should involve qualitative and quantitative analyses to better understand the specific relationships. While data mining was the selected analytic method, logistic regression was also considered.
The selection of data mining mainly reflected its ability to uncover complex relationships between factors in a dataset.68 This was in contrast to logistic regression approaches, which would require a considerably larger sample than was available to effectively analyze the potential variables of interest. In short, two particular concerns suggested logistic regression was inappropriate for achieving the goals of this study. First, as shown in the literature review, none of these organizational factors are likely to have a clear univariate relationship with hospital performance during FIX. As such, an attempt to estimate the average effect of a given variable across hospitals would generally lead to non-significant findings. Second, given the available sample size, any effort to model interactions between variables would be underpowered, once again increasing the likelihood of non-significant findings. The next few paragraphs provide an overview of data mining, highlighting its strengths and showing why it was an appropriate tool for developing a set of hypotheses about how organizational context modifies a QI initiative to produce a set of measured outcomes. The term data mining in fact encompasses a large tool box of analytic methods; this study focused on decision trees. Decision trees belong to the class of symbolic learning and rule induction algorithms.68 Symbolic learning algorithms aim to generate a structured set of hypotheses that can be used to understand and classify a specific concept. In this study the concept to classify was facility performance, with the classification representing the four-level major classifications listed in Table 4-1. The decision tree process begins with the concept of an information system (IS) with four key components, S, Q, V, and f, where IS = <S, Q, V, f>.
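To make the notation concrete, the tuple IS = <S, Q, V, f> can be encoded directly. The following is a minimal Python sketch; the hospital names, features, and values are invented for illustration and are not drawn from the study data:

```python
# Hypothetical encoding of an information system IS = <S, Q, V, f>.
S = ["hospital_1", "hospital_2", "hospital_3"]   # examples (hospitals)
Q = ["academic_affiliate", "hospitalists"]       # features characterizing them
V = {                                            # discrete value set per feature
    "academic_affiliate": {"yes", "no"},
    "hospitalists": {"none", "some", "all"},
}
f = {                                            # value of each feature for each example
    ("hospital_1", "academic_affiliate"): "yes",
    ("hospital_1", "hospitalists"): "all",
    ("hospital_2", "academic_affiliate"): "no",
    ("hospital_2", "hospitalists"): "none",
    ("hospital_3", "academic_affiliate"): "yes",
    ("hospital_3", "hospitalists"): "some",
}

# Sanity check: f maps every (example, feature) pair to a legal value in V.
assert all(f[(s, q)] in V[q] for s in S for q in Q)
```

In this representation, a decision tree algorithm repeatedly partitions S by querying f for one feature in Q at a time.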
S represents the set of examples used to develop the hypotheses, in this case the sample of hospitals that participated in FIX and had complete data on the HAIG and CPOS surveys. The next component, Q, represents the collection of features that characterize the sample, in this case the collection of variables discussed in the prior section, "VA Hospital Organizational Context". Within the set Q, each individual feature (F) can take on a discrete set of values (V), for example the potential answers on the scales listed in Table 6-1. The last component, f, represents a function encoding the individual values of each feature for each example (individual hospital) within the entire dataset. The decision tree analysis begins with a root node, which constitutes the entire set of examples, S, in the IS. Working with a selected algorithm, the process identifies the feature (F) in the dataset that best separates the data into smaller subsets; the definition of what constitutes the best separation varies between algorithms. This process iterates, growing the limbs of a tree as the examples in each node are evaluated and features are identified to create nodes containing smaller numbers of examples. The process terminates when all the examples in a node have the same classification value; these terminating nodes are called leaves. A tree could theoretically have as many leaves as there are examples, although that would generally be an undesirable outcome. Similarly, a limb can have as many nodes as necessary to reach the final classification, and limbs can be of varying nodal lengths. To clarify this process, it is useful to understand how decision tree development differs from stepwise variable selection in logistic regression. In stepwise variable selection, variables are sequentially added to a model based on their relationship with the entire data set.
Each variable enters the model based on the amount of variance explained by its addition. In decision tree modeling, by contrast, the addition of a variable to extend a limb is based only on its relationship with the examples in that node. In effect, it is a conditional relationship based on the factors identified at prior points along the limb. For example, in a decision tree with 100 hospitals in the full sample, the first decision point may split into a node of 75 hospitals with an academic affiliation and a node of 25 hospitals without one. The next decision point along each of these limbs would then be separately and conditionally evaluated based on the hospitals' academic affiliation. So for the hospitals with an academic affiliation the feature that best splits them into smaller groups may be the use of hospitalists, while for the non-academic hospitals it may be ICU level. Taking this to a third level, hospitals with academic affiliations and hospitalist programs would be evaluated with no direct consideration of their ICU level, as that variable had been included on a separate limb. Keeping in mind that this analysis was in part driven by the realistic evaluation framework and its goal to thoroughly describe and understand the context that leads to a measured outcome, data mining seemed the tool best suited to achieve this goal. The algorithms that create decision points in the decision tree process do not have the traditional concerns about statistical power that arise in regression approaches. Further, the process of evaluating nodes and identifying key features provides critical insight into how interactions between variables may differ based on the context in which they interact. These strengths led to the conclusion that data mining, and more specifically decision trees, was the best approach for modeling the interactions between components of facility structure, QI structure, and QI process.
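The conditional nature of these splits can be sketched with a small hand-built tree in Python (the study itself used WEKA; the features, values, and outcome labels below are hypothetical):

```python
# A toy tree mirroring the example above: hospitalist use is only
# consulted on the academic limb, and ICU level only on the
# non-academic limb.
tree = {
    "feature": "academic_affiliate",
    "branches": {
        "yes": {"feature": "hospitalists",
                "branches": {"all": "sustain", "none": "no change"}},
        "no": {"feature": "icu_level",
               "branches": {"high": "improve", "low": "no benefit"}},
    },
}

def classify(node, hospital):
    """Walk a single limb; each decision is conditional on the nodes above it."""
    while isinstance(node, dict):
        node = node["branches"][hospital[node["feature"]]]
    return node

# An academic hospital is classified without ever consulting its ICU level.
print(classify(tree, {"academic_affiliate": "yes", "hospitalists": "all"}))  # sustain
```

The key property is visible in the data structure itself: a feature appearing on one limb places no constraint on, and is never evaluated for, examples routed down the other limb.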
A main weakness of decision trees is that they can generate long, complex limbs that are difficult to interpret. However, two modifications to the general data mining structure help to address this weakness. The first and most common is to develop a pruned tree. Pruned trees consider the tradeoff between 100% accurate classification and the risk of overfitting the data in a manner that reduces the interpretability and generalizability of the findings. Pre-pruning algorithms balance this trade-off by testing whether the addition of another feature to the limb provides sufficient additional information. If the magnitude of information gained (as calculated by the selected algorithm) does not meet a defined threshold, the limb terminates, leaving some misclassification. There are also post-pruning processes that generate a fully classified tree and then trim limbs back based on the information gain at each step. These two processes generally lead to the same decision tree and differ mostly in computational efficiency; as such, pre-pruning is the generally adopted approach and is what was used in this analysis. The other and more recently developed technique, boosting, involves the development and combination of multiple decision trees.67, 68 The trees are combined based on a weighting that reflects the misclassification at each individual node. These approaches produce more accurate classifications but increase the complexity of the overall interpretation. Since the application of data mining to this type of healthcare data is a novel technique, and the goal of this analysis was to generate hypotheses for future studies rather than to identify definitive associations, a boosting algorithm was not employed. The final conclusion was that a pruned decision tree would best facilitate description of the findings to audiences generally unfamiliar with data mining and decision trees.
Decision Tree Development

The decision tree modeling process was completed in the Waikato Environment for Knowledge Analysis (WEKA) version 3.6.4 data mining environment.69 The selected model was the J48 algorithm, an implementation of the C4.5 pre-pruning information entropy algorithm.70 This algorithm operates by calculating a level of information entropy (uncertainty) for the examples within a node and then determining which feature provides the most information gain. Entropy was defined by the following equation:71

Entropy(S) = -\sum_{i=1}^{c} p_i \log_2 p_i

where p_i signifies the proportion of examples within the set that have a given classification, out of c possible classifications. After determining the entropy of the set, the algorithm compares each of the features, determining which provides the greatest information gain, as defined by the following equation:71

Gain(S, F) = Entropy(S) - \sum_{v \in Values(F)} \frac{|S_v|}{|S|} Entropy(S_v)

where F represents one of the features from the entire feature set (Q) and S_v represents the subset of examples in S for which feature F takes the value v. After calculating the feature that provides the maximum information gain, sub-nodes are generated and the program continues to iterate through the process until all limbs terminate as leaves or as nodes in which the addition of further features does not meet the information gain pruning threshold. Although decision tree algorithms can be developed to respond as desired to variables with either numeric or text values, in WEKA there is a clear distinction in the natural evaluation of these two variable types.69 Numeric variables were treated as continuous, with the evaluation process considering the optimal point to split the group in two. As an example, for a 5-point Likert scale recorded with 1 = very difficult, 2 = difficult, 3 = neutral, 4 = easy, and 5 = very easy, the model would consider the different binary groupings of sizes 1 and 4 (i.e. very difficult vs. all others) or 2 and 3 (i.e.
easy and very easy vs. neutral, difficult, and very difficult) to identify which split would provide the greatest information gain in that specific setting. In contrast, if that same Likert scale was encoded using the text descriptors, the decision tree would not be able to identify that very difficult is similar to difficult and could thus potentially be grouped with it. Instead it would treat all of these as separate values and evaluate the information gained if the group was split into five different subgroups. In general, there was no expectation that responses on the Likert scale questions from the CPOS survey would provide much information gain if treated as text variables, so these were all numerically encoded. There were, however, some variables with a limited number of categories for which hypotheses suggested they may split into two groups or into multiple groups. The first of these, ICU evidence bundles, had three response options: not used, used without electronic orders, and used with electronic orders. These three situations could all lead to different levels of performance with quality improvement. However, it could well be that only having electronic orders is associated with quality while the other two have no direct impact, suggesting a two-level split. A similar theory applies to the hospitalist presence variable, which was encoded with the options: no hospitalists, hospitalists used for some patients, and hospitalists used for all patients. The other variables that were given both text and numeric encodings were a few additional facility structure variables that were originally encoded as text but were likely to have the most meaning as numeric variables: nurse to patient ratios, rurality, ICU type, ICU level, ICU management, and facility status. There were two complementary processes for developing and interpreting decision trees, both of which were utilized in the analysis.
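The entropy and information-gain definitions above, together with the binary-split search applied to numerically coded variables, can be sketched in a few lines of Python (the study used WEKA's J48; the Likert responses below are invented for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum over classes of p_i * log2(p_i)."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(values, labels):
    """Gain for a nominal feature: Entropy(S) minus the size-weighted
    entropy of each subset S_v sharing a feature value v."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

def best_numeric_split(values, labels):
    """For a numerically coded scale, try each midpoint between adjacent
    observed values and return the binary split with the greatest gain."""
    points = sorted(set(values))
    best = None
    for lo, hi in zip(points, points[1:]):
        cut = (lo + hi) / 2
        gain = information_gain([v <= cut for v in values], labels)
        if best is None or gain > best[1]:
            best = (cut, gain)
    return best

# Hypothetical 5-point Likert responses (1 = very difficult ... 5 = very easy):
likert = [1, 1, 2, 4, 5, 5]
outcome = ["no change"] * 3 + ["sustain"] * 3
print(best_numeric_split(likert, outcome))  # -> (3.0, 1.0): cut between 2 and 4
```

Treated numerically, the scale yields a single clean cut point; treated as text, the same responses would be evaluated as five unordered values with no notion of adjacency.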
The first and more standard process was to evaluate the dataset using an n-fold cross validation. This process involved splitting the dataset into n equal splits, most commonly, and as implemented here, into 10 equal splits. The model development process then created a decision tree based on an information set that included n-1 splits; in a 10-fold cross validation that involved 90% of the data. The remaining split (10% in this example) was a test set used to assess classification accuracy. This process iterated n times, such that each split served as the test split once. The classification results from the collection of tests were then used to calculate the following performance metrics, based on traditional confusion matrix principles, where TP, FP, TN, and FN represent true positive, false positive, true negative, and false negative counts:72, 73

True Positive Rate (Recall) = TP / (TP + FN)
False Positive Rate = FP / (FP + TN)
Precision = TP / (TP + FP)
F-Measure = 2 x (Precision x Recall) / (Precision + Recall)

These metrics were calculated for each individual classification class as well as an average for the entire classification scheme. Additionally, the interrater reliability, or kappa statistic, was calculated comparing across each of the test sets, with P(a) representing the observed agreement between raters (or test sets) and P(e) representing the potential agreement due to chance:72

kappa = (P(a) - P(e)) / (1 - P(e))

The other model development process used all of the available examples to create a single decision tree, with each of the leaves noting the number of correctly and incorrectly classified instances. There were no formal performance metrics to evaluate for this tree, but it offers a single, easy to interpret presentation of the data. The results from both of these model development processes will be presented to evaluate the decision tree models. The last consideration during model development was which classifications to use. The initial FIX analysis used an 11-point classification model and applied it to 5 different outcome measures.
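The pooled confusion-matrix metrics and the kappa statistic can be computed directly from a matrix of counts. A minimal sketch follows; the matrix values are hypothetical, not the study's results:

```python
# matrix[i][j] = count of examples with true class i predicted as class j.

def class_metrics(matrix, i):
    """Per-class confusion-matrix metrics for class i."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[i][i]
    fn = sum(matrix[i]) - tp
    fp = sum(row[i] for row in matrix) - tp
    tn = total - tp - fn - fp
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"tp_rate": recall, "fp_rate": fp_rate,
            "precision": precision, "recall": recall, "f": f_measure}

def kappa(matrix):
    """Cohen's kappa = (P(a) - P(e)) / (1 - P(e))."""
    total = sum(sum(row) for row in matrix)
    p_a = sum(matrix[i][i] for i in range(len(matrix))) / total
    p_e = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
              for i in range(len(matrix))) / total ** 2
    return (p_a - p_e) / (1 - p_e)

# A perfectly diagonal matrix yields kappa = 1; a uniform one yields 0.
print(kappa([[2, 0], [0, 2]]), kappa([[1, 1], [1, 1]]))  # 1.0 0.0
```

Note that kappa near zero, as reported for the models in the next chapter, means agreement no better than would be expected by chance given the marginal class frequencies.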
However, given the small number of hospitals in some of the individual categories, this final analysis used a simplified classification model that considers only a 4-level classification based on the major categories: No Change, Improve not Sustain, Sustain, and No Benefit. Further, given the limited numbers of hospitals that improved or sustained on the three quality check outcomes, only length of stay (LOS) and discharges before noon were individually modeled. In addition to modeling hospital performance on the two primary outcome measures, models were created for two composite measures. The first combined hospital performance on LOS and discharges before noon, while the other considered performance across all 5 outcomes. The composite classification was created by assigning the following point values to each performance category: 2 for sustained improvement, 1 for non-sustained improvement, 0 for no change, and -1 for no benefit. Table 6-7 lists the point ranges assigned to each classification category for both composite models. In the LOS/Noon composite model both the sustained and non-sustained classifications include hospitals with a score of two. The distinction reflects a decision that any facility classified as sustain needed to have successfully sustained at least one of the two outcomes. Thus a few hospitals with a score of 2, reflecting that they improved but did not sustain on both outcomes, had their classifications set to improve not sustain for that composite, while hospitals that sustained on one outcome and recorded no change on the other maintained a classification of sustain. A similar check was performed on the total composite ranking, but all hospitals classified as sustain by point score showed sustainment on at least one of the five outcomes.
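The point assignment and range lookup for the LOS/Noon composite, including the sustain override just described, can be sketched as follows (a paraphrase of the stated rules, not the study's actual code):

```python
# Points per performance category, as assigned in the text.
POINTS = {"sustain": 2, "improve": 1, "no change": 0, "no benefit": -1}

def los_noon_composite(los_cat, noon_cat):
    """Score two outcome classifications and map them to the LOS/Noon
    composite categories of Table 6-7, applying the override that a
    'sustain' composite requires sustaining at least one outcome."""
    score = POINTS[los_cat] + POINTS[noon_cat]
    if score <= -1:
        label = "no benefit"              # range -2 to -1
    elif score == 0:
        label = "no change"
    elif score == 1:
        label = "improve not sustain"
    elif "sustain" in (los_cat, noon_cat):
        label = "sustain"                 # range 2 to 4, with a true sustain
    else:
        label = "improve not sustain"     # score of 2 from two non-sustained improvements
    return score, label

print(los_noon_composite("improve", "improve"))    # (2, 'improve not sustain')
print(los_noon_composite("sustain", "no change"))  # (2, 'sustain')
```

The two print statements show the boundary case the text singles out: a score of 2 is classified as sustain only when one of the two underlying outcomes was itself sustained.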
Table 6-7: Point ranges for composite model classification

                      LOS/Noon Composite   Total Composite
No change             0                    -1 to 0
Improve not Sustain   1 to 2               1 to 2
Sustain               2 to 4               3 to 6
No Benefit            -2 to -1             -5 to -2

Decision Tree Interpretation

The overall purpose of this analysis was to generate hypotheses that could serve as a basis for guiding future in-depth studies focusing on how organizational characteristics support effective quality improvement efforts. As such, this evaluation focuses on the performance metrics from the 10-fold analysis of the data as well as each of the four developed decision trees. The performance metrics provided some insight into the potential external validity of this analysis by determining whether the hospitals in this dataset were able to predict each other's performance. After evaluating the performance metrics, the next step considered the decision trees created from the entire dataset. This analysis examined the different limbs of the trees, aiming to identify collections of organizational characteristics that consistently result in similar performance classifications. It is these limbs and the identified associations that serve as the basis for future studies. This process also considered whether the structure of the decision trees provided any support for the guiding conceptual framework. In general, this analysis considered whether broad facility structures were located close to the main node of the tree, suggesting they provide a basic foundation that is modified by QI structure and QI process to result in the final performance classification. Following this logic, after facility structure the next variables along a limb should be QI structures, with QI processes serving to establish the final leaves.

Conclusions

This section has introduced the methods used in this study to begin developing hypotheses on how organizational characteristics interact to create an environment that supports sustained improvement as part of a QI collaborative.
The initial sections discussed the collection of individual, calculated, and composite variables used to measure different components of facility structure, QI structure, and QI process. This was followed by an overview of data mining and efforts to establish it as an effective tool for this task, and then by a description of the specific data mining steps involved in this analysis. Lastly, a short discussion focused on how to interpret the data mining models and established the goals for the analysis. The actual results of the analysis appear in the next chapter.

CHAPTER 7 – DECISION TREE RESULTS AND DISCUSSION

This chapter reports and evaluates the results from the decision tree modeling efforts. These decision trees examined how different organizational characteristics interacted to create the organizational context that contributed to how hospitals responded to FIX to generate the measured outcomes reported in Chapter 4. The first section of this chapter identifies the sample of hospitals that had complete data allowing inclusion in the study. Next the chapter examines the performance metrics from the 10-fold decision tree analysis. The last portion of the results examines the individual decision trees for the length of stay (LOS), discharges before noon, LOS/Noon composite, and overall composite classifications. The discussion of these results interprets the performance metrics and the decision trees before exploring whether the results suggest any modifications to the guiding analytic framework. Before concluding, the chapter considers some of the key limitations of the analysis.

Decision Tree Performance Metrics

Of the 130 hospitals that participated in FIX, the chief of staff at 100 completed the VA Clinical Practice Organizational Survey (CPOS), leading to a final sample size of 100 hospitals for the data mining analysis.
Table 7-1 lists the number of hospitals classified into each of the four performance categories for the two primary outcomes as well as the two composite measures. Chi-square tests comparing the performance distribution between the full sample and this sample on LOS and discharges before noon showed no signs of systematic non-response (χ2 (df = 3), p(LOS) = 0.99; p(Noon) = 0.92). Since the full analysis in Chapter 4 did not show any variation in performance by hospital size or region, the distribution of these factors in the data mining sample was not compared to the full sample. Overall, the decision trees developed for each of these outcomes had a difficult time identifying consistent relationships among features that combined to create an organizational context consistently associated with a specific classification of hospital performance. Even with a total of 263 individual and composite variables to consider, the kappa statistics for all models were equivalent to chance (κ(LOS) = -0.03, κ(Noon) = -0.02, κ(LOS/Noon Composite) = -0.06, κ(Overall Composite) = 0.02). The other performance metrics from the evaluation similarly suggest that the models performed no better than chance; see Table 7-2. Further, a review of the receiver operating characteristic (ROC) measures showed that the only categorization level consistently identified at a rate better than chance was hospitals that performed with no statistical change in response to FIX. One promising note was that the decision tree with the best performance at classifying facilities was the overall composite measure. This suggests that organizational context, particularly as reflected by the measures in this analysis, impacted the larger environment at the hospital, and that measuring a single outcome, such as LOS, cannot adequately capture how well an organization supports QI.
Table 7-1: Data mining sample performance classifications (N = 100)

                      LOS   Noon   LOS/Noon Composite   Overall Composite
No Change             27    26     22                   29
Improve Not Sustain   13    34     30                   20
Improve and Sustain   22    13     19                   17
No Benefit            38    27     29                   34

Table 7-2: Decision tree performance metrics

LOS
              TP Rate  FP Rate  Precision  Recall  F-Measure  ROC
No Change     0.407    0.288    0.344      0.407   0.373      0.55
Improve Only  0        0.172    0          0       0          0.421
Sustained     0.182    0.231    0.182      0.182   0.182      0.454
No Benefit    0.263    0.339    0.323      0.263   0.29       0.443
Average       0.25     0.28     0.255      0.25    0.251      0.471

Noon
              TP Rate  FP Rate  Precision  Recall  F-Measure  ROC
No Change     0.346    0.297    0.29       0.346   0.316      0.526
Improve Only  0.294    0.318    0.323      0.294   0.308      0.496
Sustained     0.077    0.115    0.091      0.077   0.083      0.373
No Benefit    0.222    0.288    0.222      0.222   0.222      0.456
Average       0.26     0.278    0.257      0.26    0.258      0.477

LOS/Noon Composite
              TP Rate  FP Rate  Precision  Recall  F-Measure  ROC
No Change     0.455    0.256    0.333      0.455   0.385      0.594
Improve Only  0.133    0.314    0.154      0.133   0.143      0.375
Sustained     0.105    0.21     0.105      0.105   0.105      0.476
No Benefit    0.172    0.282    0.2        0.172   0.185      0.369
Average       0.21     0.272    0.197      0.21    0.201      0.441

Overall Composite
              TP Rate  FP Rate  Precision  Recall  F-Measure  ROC
No Change     0.379    0.352    0.306      0.379   0.338      0.517
Improve Only  0.25     0.188    0.25       0.25    0.25       0.518
Sustained     0.176    0.145    0.2        0.176   0.188      0.541
No Benefit    0.265    0.303    0.31       0.265   0.286      0.526
Average       0.28     0.267    0.278      0.28    0.277      0.524

Individual Decision Trees

This section considers the results of the full decision trees, which represent the pruned classification of all 100 hospitals. Before examining the trees individually, the first evaluation step was to examine which variable categories were emphasized across models. Table 7-3 lists each of the major variable categories identified in at least one model and a count of how many times a factor from that category appeared in each of the four models. Variable categories in the table were ordered by the total number of appearances, with separations between each of the three major classes from the analytic framework.
Surprisingly, the four decision trees all featured a similar number of factors, with the LOS tree using 28 factors to reach the pruned classification while the other three trees each used 24. There were very few components of facility structure in the models, with only 4 of the potential 12 variable categories identified in any model. In contrast, the QI structure and QI process classes were frequently observed. The QI structure class played a prominent role in the LOS model, while QI process was the prominent class in the discharges before noon and overall composite models. The LOS/Noon composite model had an even number of features from these two classes. Most of the major variable categories in the QI structure and QI process classes were represented in at least one model.

Table 7-3: Count of factors in each of the decision trees

Class subtotals by model:
                        LOS   Noon   LOS/Noon Composite   Overall Composite   Total
Facility Structure      5     2      2                    2                   11
QI Structure            15    8      11                   8                   42
QI Process              8     14     11                   14                  47
Total Decision Points   28    24     24                   24                  100

Total appearances by variable category (all models combined):
Facility Structure: Ward 6; Academic Affiliate 2; # of ICU Beds 2; Rural 1
QI Structure: Sufficient Staff 13; Performance Monitoring 12; Inpatient Resources 7; Utilization Review 5; Communication 2; Nurse FTEE 2; Hospitalists 1
QI Process: Guideline Adherence 10; Performance Improvement 6; QI Information 5; Clinical Reminders 5; ICU Bundles 5; Driving Force 4; Clinical Champion 3; ER QI Teams 3; Performance Awards 3; Clinical Order Sets 2; Facility Environment 1

For QI structure there were two variable categories that did not appear in any model: ICU nurse to patient ratio and barriers to improvement. Although the nurse to patient ratio is a factor with a strong association to quality, nearly every hospital ICU has a 1:2 nurse to patient ratio across all shifts.
Given the lack of variation across hospitals, it was not surprising that this factor did not appear in any model. In contrast, the hospitals did vary in their reporting of barriers to improvement, so it was not clear why none of these factors appeared in the models. Interestingly, the only variable category from the QI process class that was not selected into a model was guideline implementation, which similarly measured the presence of barriers, specifically negative behavioral responses to efforts to implement clinical practice guidelines. So while the exact explanation for why these variable categories were not included was not clear, it seems that a consistent issue with the measurement or definition of barriers to improvement impacted the overall understanding of how the presence of any barriers affected efforts to improve quality. The two most frequently utilized variable categories from the QI structure class were the measures evaluating whether certain staffing levels were sufficient and the measures considering the frequency (annual, quarterly, monthly, or never) or level (hospital, ward, or provider) of data monitoring on a collection of performance measures. For the QI process class, the two most frequent variable categories were measures related to efforts to improve clinical guideline adherence and measures related to implementation of actions to improve performance. Before considering the individual decision trees in more depth, a second evaluation step considered whether the selected factors represented individual measures from the surveys or one of the composite measures created to represent a variable category. Across the four models, 17 composite variables were selected into the various models. These composite measures represented 3 QI structure and 3 QI process variable categories, each listed in Table 7-4. In general, these composite variables did not seem to play a significant role in summarizing the individual measures.
The one exception was the guideline adherence variable category, where a composite measure was selected for eight of the ten times a factor from that variable category appeared in the decision tree models. Within the decision trees, the LOS model had 7 composite variables, the overall composite 5, the LOS/Noon composite 3, and the discharges before noon model 2.

Table 7-4: List of individual and composite variables in the decision trees

                          Individual   Composite   Total
QI Structure
  Sufficient Staff        11           2           13
  Performance Monitoring  9            3           12
  Utilization Review      4            1           5
QI Process
  Guideline Adherence     2            8           10
  Clinical Reminders      3            2           5
  Performance Awards      2            1           3

The review of the individual decision trees began with the model depicting hospital performance on LOS, which is displayed in Figure 7-1. To ease reference while describing the decision trees, the boxes that list selected features were numbered. A striking feature of this decision tree upon initial review was that it appeared less like a tree and more like a long vine with just a few small offshoots. An additional feature of this model, which was to be expected based on the decision tree performance metrics as well as the use of 28 measures to classify the facilities, was that most of the final classification groups included only a small number of hospitals. The few exceptions include the classification off box 2, which led to 6 hospitals classified as sustaining performance, and off box 28, where 15 hospitals were grouped as having no statistical changes in the study. Across the full pruned tree, a total of 10 hospitals were misclassified. On a substantive level, there were four interesting findings to highlight. First, higher levels of data availability or monitoring were associated with better performance.
The clearest indication of this occurs at box 2, which considers a collection of 8 hospitals that used incentives to encourage guideline adherence for all 3 measured disease categories (acute myocardial infarction (AMI), chronic heart failure (CHF), and community acquired pneumonia (CAP)). These hospitals were split into 6 with sustained performance and 2 with no statistical change based on whether they had a utilization review for more than half (sustainers) or less than half (no change) of non-VA admissions. The misclassified hospital in the no change category did register improved performance. Due to pruning, the decision tree did not distinguish between the improver and no change, but an examination of values for the defining variable reveals that the improving hospital reported reviewing a few (1-20%) of its non-VA admissions, compared with the no-change hospital, which reported reviewing none. Other decision points based on data availability were boxes 21, 22, and 28. Box 28 does not fully support the theory, as the two hospitals with sustained performance had a low level of concurrent utilization review of acute admissions. The decision at box 27 was also counter-intuitive, in that not having an ICU protocol for weight-based heparin administration was labeled as leading to sustained performance. Likely, an improved model would have added a decision point after box 25 that split the remaining 24 hospitals more cleanly into their appropriate sustaining and no-change categories. The second substantive finding from the model suggested that high ratings on staff sufficiency measures were associated with better performance (improve or sustain), but at the same time the decision tree suggests that relying on high staff sufficiency ratings may not be the most effective approach to QI. This was best exemplified by the decision at box 16.
This decision point considered whether four hospitals reported a sufficient number of clinical pharmacists. Those that reported positively were able to improve LOS during FIX, and those that felt they did not have sufficient clinical pharmacists showed no statistical change. However, considering not only this decision point but also the decision at box 15 suggests a more complex interpretation of how factors interacted to support QI. Box 15 revealed that the four hospitals evaluated in box 16 had low monitoring levels of hospital readmission rates. So it may have been more efficient to institute methods for monitoring readmission rates rather than relying on clinical pharmacists to overcome the challenges associated with not understanding a hospital's readmission rates. Other decision points with similar findings were 4, 14, and 25, although box 14, where a completely sufficient staffing of laboratory technicians leads to 5 hospitals exhibiting no benefit from FIX, does not exactly fit this pattern. The third substantive finding from this decision tree was that a lack of proven techniques, most notably ICU evidence bundles for ventilator associated pneumonia (VAP) and catheter related blood stream infections (CRBSI), was associated with poor performance, as exhibited in boxes 6 and 7. The prominence of ICU evidence bundles in the LOS decision tree was not surprising, as higher levels of VAP and CRBSI due to a lack of methods to support evidence based medicine would lead to extended LOS and present many challenges to reducing LOS. It should also be noted that box 6 splits out the 4 hospitals in this sample that had no ICU. So while there is no expectation that they would have a VAP bundle, their lack of success in reducing LOS may suggest that ICUs often play a role in the early development of QI programs.
The fourth finding was the surprising appearance in this LOS decision tree of factors in boxes 21 and 22 related to monitoring emergency department (ED) visits. The presence of these factors may indirectly represent a robust data collection and dissemination culture at hospitals with high rates of ED monitoring. These factors may also have a more direct relationship with quality, as the ED serves as one of the two main entry points for hospital admission. Since FIX focused on patient flow throughout the hospital experience, the ED was likely included in many improvement projects. As such, an understanding of ED admission rates and admission times would have helped support many QI efforts and provided extra motivation for overcoming any resistance to change. 
Figure 7-1: Full decision tree for LOS performance 
The next decision tree, Figure 7-2, overviewed the classification of hospital performance on improving rates of discharge before noon. Much like the LOS model, this one was mostly one long vine, although it has a couple more tree-like splits at boxes 10 and 13. It similarly did not have many large groupings of classifications, with only a grouping of 9 improvers off box 12, 8 with no change off box 16, and 8 improvers off box 24. This pruned tree had a slightly higher rate of misclassification, with a total of 12. There were four substantive points in this decision tree that further clarify the association between organizational context and hospital performance. The first point, which echoes the findings from the LOS tree, was that high ratings on staff sufficiency were associated with better performance on improving and sustaining gains related to discharging patients before noon. This point is perhaps best displayed in the facility classification related to boxes 11 and 12. Box 11 splits off as no benefit two facilities that rated their laboratory technician levels as insufficient. 
The other 15 facilities in this collection were then split into improvers or sustainers based on their efforts to encourage adherence to CHF clinical guidelines. Other decision points supporting this sentiment were at boxes 4 and 15. The second point from this decision tree considers the role of specific performance activities undertaken at facilities to improve performance. The first example, boxes 7 and 10, shows how performance was related to efforts to shift staff from high-performing to low-performing areas in hopes of improving performance in the low-performing area. While these decision points do not lead to direct classification of many hospitals, the presence of this measure in the decision tree further solidifies the notion that staff sufficiency plays a critical role in successful quality improvement. Another hospital activity that had an impact on quality was the decision to create performance improvement teams to address a specific performance measure (box 18). This action has clear relevance for performance on discharges before noon: hospitals would generally have created a team to send to the FIX learning sessions, so those hospitals with greater experience pulling together teams to address specific measures could have better odds of succeeding. The third issue identified in this decision tree considered the role of different information sources in supporting quality improvement efforts. In box 5, five hospitals that did not rate a local hospital as an important resource for QI information were classified as no benefit. Then in box 6, four hospitals that rated VA newsletters as an important QI information resource were also classified as no benefit. Both of these classification points had one hospital misclassified into the group. The fourth and final issue from this decision tree considered the role of hospitalists in supporting quality improvement efforts. 
Of the 53 hospitals evaluated in box 13, 15 reported no hospitalist program while 38 reported at least some hospitalist program. None of the 15 hospitals without hospitalists were able to sustain improvements. Seven of these hospitals did make initial improvements, apparently as a result of using incentive programs or relying on completely sufficient registered nurse (RN) staffing. While this suggests hospitalists have a critical role in supporting QI, this decision tree also shows that the presence of a hospitalist program did not guarantee success. Only four (with two misclassifications) of the 28 hospitals with hospitalists were classified as sustainers. 
Figure 7-2: Full decision tree for discharges before noon performance 
Of the four decision trees, the next one (Figure 7-3), which modeled composite hospital performance on both primary outcomes, had the most tree-like structure. This decision tree has two classification points that successfully classified 11 hospitals as no change (box 7) and 11 hospitals as no benefit (box 15). This full pruned model had a high rate of classification success, with only 3 misclassified hospitals. The results of this decision tree provide little additional insight, as it appears to be generally a merging or averaging of the results from the LOS and discharges before noon trees. The major insight from this decision tree comes from the ordering of different variable categories within it. The first variable category, appearing in the early portions of the model, was the measures of hospital resources. These measures, which appeared in boxes 4, 5, 6, and 8, establish several early divisions in the tree. These divisions support the importance of a resource or point towards alternative combinations of factors that can help overcome the limitations associated with a missing resource. The next variable category appearing in the decision tree was the measures of performance improvement activities. 
These were seen in boxes 7, 10, 12, and 16. The decision point at box 12 was particularly illustrative. Of the hospitals evaluated at that point, 19 reported not using pilot testing, and only one of these hospitals (actually misclassified as an improver) showed sustained improvements on the composite measure. 
Figure 7-3: Full decision tree for LOS/Noon composite performance 
The last two sets of variable categories in the model serve to achieve final performance classification. Boxes 16, 17, 21, and 23 represent different measures of staff sufficiency, while boxes 9, 15, and 24 were different measures of data collection and availability. One final intriguing finding in the decision tree relates to how hospitals were classified based on the presence of an evidence bundle in the ICU for glycemic control. In the previous two decision trees, ICU evidence bundles were evaluated as either present or absent in leading to performance classifications. In this decision tree, box 20 splits the 11 hospitals into 3 different categories. Most surprising were the two hospitals classified as achieving some improvements despite not having any evidence bundle, while three hospitals that had a non-electronic evidence bundle showed no benefit from FIX. A further examination of how the hospitals performed on the individual measures of LOS or discharges before noon did not provide any insight into why two hospitals without any evidence bundle for glycemic control achieved some initial improvements. The last of the four decision trees (Figure 7-4) examined the composite measure of hospital performance across all five outcomes. While this tree has a few more splits than the individual outcome decision trees, there was still a prominent backbone running the length of the tree with no major splits. 
Almost all of the classification decisions represent just a small number of facilities, except for box 18, in which 9 hospitals were classified as no change, and box 23, with 19 hospitals classified as no benefit. This decision tree had eight misclassifications, which was fewer than the two individual outcome models but greater than the misclassification seen in the LOS/noon composite model. Of the four decision trees, this one had the greatest number of unexpected or counterintuitive findings. As a first example, variables measuring the sufficiency of staff still appear regularly in the model, as seen in boxes 2, 16, 21, and 24. However, quite frequently the resulting classification decisions were opposite the expectations created by the previous models. In box 2, a rating of insufficient registered nurse staffing was associated with five hospitals classified as sustainers, although the one misclassification did represent a hospital whose actual performance was no benefit. Similarly, in box 16, high ratings of radiology technician sufficiency were associated with no benefit, while low ratings were associated with non-sustained improvements. Box 21 resulted in a fairly expected classification, while box 24 had an inverse association, with completely sufficient computer application coordinators resulting in 3 hospitals classified as no benefit. The second noticeable counterexample occurs in boxes 11 and 12. This series of boxes identifies as sustainers two hospitals that made low use of electronic reminders for supporting evidence-based care for AMI, CHF, and CAP. These hospitals also did not use any techniques to review electronic reminders after implementation. 
While it was logical that hospitals not using electronic reminders would not review their non-existent reminders post-implementation, it was surprising that this collection of hospitals would exhibit apparent success in improving and sustaining composite quality. This example reflects the experience of only two hospitals, once again reaffirming the importance of local context. 
Figure 7-4: Full decision tree for overall composite performance 
Discussion 
This data mining decision tree analysis, examining how a hospital's organizational context modified the response to FIX as measured by several patient outcomes, generated several key insights into the challenges involved in improving and sustaining quality in healthcare. The key finding to remember from this analysis was that the decision trees had low overall performance and were unable to classify performance levels at a rate better than chance. As a positive finding, there were variables appearing in the full decision trees whose relationships with hospital performance may be useful for determining how to improve hospital quality. While the performance of these decision trees on the metrics associated with the 10-fold analysis was disappointing, the review of the individual decision trees suggests two likely mechanisms behind this low level of performance. The first mechanism was the difficulty of measuring or defining specific concepts. For example, several variable classes appear repeatedly in the individual decision trees, but often only as individual factors; the composite factors were rarely included. This suggests that better or more refined measures of these variable classes could lead to a better understanding of how these factors support QI. An additional consideration for this mechanism was that the performance classifications represented a novel approach to evaluating QI. 
This analysis was subject to its own set of limitations, which could have contributed to some performance misclassification and poorer decision tree performance. The second mechanism was that quality improvement context may be a purely local phenomenon: specific organizational features that support QI at one hospital may not play any role at another hospital. Increasingly the QI literature has focused on a theme that may best be summarized by the quote "The devil might be in the details of local context and culture."74 The simple idea is that the difference between successful and unsuccessful implementation of a QI project is related to many unmeasured, and perhaps even nonmeasurable, local factors, presenting a significant challenge for efforts to identify pathways that would support successful QI. The overall appearance of these decision trees as vines rather than trees provides some support for this mechanism. The vine-like structure implied that none of the evaluated factors created a unique context that impacted hospital performance; instead, a relatively unique factor helped separate out hospitals at each step of the decision tree. Despite the inability of these models to reach definite findings on how individual variables were associated with QI performance, the full decision tree models did highlight several variable categories that merit attention when considering how to improve healthcare quality. The first of the variables that appeared multiple times in the models was the different measures of sufficient staff. Overall, these measures appeared 13 times, generally (11 times) as an individual measure of staff sufficiency. The general impression from the decision tree models was that low ratings of staff sufficiency were unlikely to be associated with any improvements in quality. 
In contrast, high ratings of staff sufficiency did not guarantee success but certainly contributed to an environment that could succeed. Most importantly, high levels of sufficient staff were often depicted in tree limbs suggesting that the primary role of staff in the quality process was to provide manpower that could overcome other limitations in the system. To improve quality, hospitals likely need to understand whether their staffing levels meet basic needs, but beyond this there should be careful consideration of whether the role of staff is to meet a critical need or simply to provide manpower to overcome some other limitation that might be more effectively addressed through a different approach. If it is the latter, then the hospital will find greater benefit in investing to correct or overcome the limitation rather than trying to use brute force to improve quality. The second set of variables was the performance monitoring and utilization review variables, which served as measures of data collection at each hospital. Just as for staff sufficiency, the selected data measures in the decision trees were most commonly individual factors and not any of the composite factors. The relationship between these measures and hospital performance suggested that data availability played a crucial role in distinguishing hospitals that simply improved from those that sustained. In fact, data monitoring was one of the few variables for which higher levels were consistently associated with high performance, rather than the common observation that low levels of a variable were associated with poor performance. The critical importance of data measurement and availability to a QI team's ability to successfully improve and sustain quality is an important concept for hospitals to consider as many of them work to implement an electronic medical record, or, in the case of VA, work to develop the next-generation electronic medical record. 
The third key set of variables was the measures of inpatient resource availability. These measures examine whether space and equipment were sufficient to support inpatient care, making them similar to the measures of staff sufficiency. In the decision trees these measures often appeared towards the top of the tree, suggesting they may be factors that establish environments requiring different approaches to improve quality. However, since these decision trees were generally linear, they do not support this as a conclusion but rather suggest it as an area for further study. While not a surprise, the decision trees do indicate that critical resources must be sufficiently available if a team is to successfully improve quality. The fourth, and final, set of variables comprised those related to activities to ensure adherence to clinical practice guidelines for treatment of AMI, CHF, and CAP. In this case, 8 of the 10 times one of these variables appeared in a decision tree, it was as a composite measure. The classifications from these measures were roughly equally split between low use of techniques leading to low performance and high use of techniques leading to at least initial, and sometimes sustained, improvements. The selection of variables from this class does not support any specific method, as the selected variables represented a number of different approaches to supporting guideline adherence; instead it suggests a need to focus on repetition and consistency when trying to achieve quality goals. Repetition here simply means there is support for using multiple approaches to address a specific quality problem. From a consistency viewpoint, there is benefit in using similar approaches (whether incentives or specialized templates) when addressing slightly different quality problems. 
The benefits of repetition and consistency likely contribute to the development of an accepting culture by giving providers appropriate reminders about expectations and also helping them see the larger picture of the hospital's quality improvement efforts. There were two additional findings from this analysis that merit discussion. First, the data suggest that hospitalists may contribute in a meaningful way to QI. While the overall level of evidence was limited, since the measure of hospitalist presence only appeared in the discharges before noon decision tree, there were signs that hospitalists contributed meaningfully to QI. In that model (box 13), hospitals without hospitalists were unable to sustain improvements and often had to use methods that may not effectively support sustainment, such as performance awards or incentives, to achieve initial improvements. Of course, the discharges before noon outcome was particularly physician-sensitive given the critical role that physicians play in the discharge process. It should be remembered that not all hospitals with hospitalists were able to improve or sustain improvements. So while hospitalists can be effectively used, and may increase the chances for success, they cannot be viewed as an easy solution, as there will always be other factors within the organizational context that interact to support or block efforts to improve quality. The second additional consideration was the number of identified associations that were unexpected. In the LOS model these unexpected classifications were frequently at the tail end of the decision tree. It may be that, even with the pruning process, the decision tree was overfitting and these final decision points have limited substantive meaning. 
However, these findings should not be dismissed, as some of the factors, such as sufficient laboratory technicians in box 14, could in certain local contexts have unexpected detrimental effects on QI. The number of unexpected findings in the composite measure decision tree was also concerning. The challenge with this decision tree was that it considered hospital performance on the three secondary FIX outcomes: in-hospital mortality, 30-day mortality, and 30-day all-cause readmission rates. Since these outcomes were not specific targets for improvement, they do not specifically reflect the organizational context in relation to quality improvement efforts. As such, the findings from this model played a minimal role in the final project interpretations. A final consideration for this discussion was whether there were any differences between the LOS and discharges before noon decision trees that mirror some of the differences identified during the hospital performance analysis. In the original analysis, hospitals were more likely to improve on the discharges before noon outcome, but of the improving hospitals a greater percentage sustained improvements for LOS. This analysis uncovered two differences between the individual decision trees for these two outcomes that provided additional insight into the challenges associated with sustaining QI. First, the LOS model highlighted more QI structure components while the discharges before noon model highlighted more QI process components. Second, the LOS model had a greater number of composite measures selected into the decision tree. These findings suggested that measures of QI processes were most applicable for understanding how well a facility could come together and mount an initial effort to address a perceived quality problem. In contrast, QI structural components may better address the ability to monitor (i.e., data collection) and support (i.e., sufficient staff) successful QI projects. 
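The decision trees discussed in this chapter were produced by standard recursive partitioning, in which each node selects the variable and value that best separate the performance categories. As a minimal pure-Python sketch of that split-selection step (the feature values and performance labels below are hypothetical illustrations, not the dissertation's data or its actual implementation):

```python
def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Find the (feature_index, value) binary split that minimizes the
    weighted Gini impurity of the two child nodes."""
    n = len(rows)
    best = (None, None, gini(labels))  # (feature, value, impurity)
    for f in range(len(rows[0])):
        for v in set(r[f] for r in rows):
            left = [labels[i] for i in range(n) if rows[i][f] == v]
            right = [labels[i] for i in range(n) if rows[i][f] != v]
            w = (len(left) * gini(left) + len(right) * gini(right)) / n
            if w < best[2]:
                best = (f, v, w)
    return best
```

A full implementation would apply `best_split` recursively to each child node and then prune branches that do not improve held-out accuracy, which is the pruning step referenced for the trees above.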
Interpreting the Analytic Framework 
The results of this study did not provide strong support for the analytic framework, but similarly did not generate evidence suggesting the analytic framework was completely incorrect. Before discussing the model, it must be remembered that this analysis did not have any measures of team quality or performance for the QI teams that participated in FIX. This lack of information makes a full assessment of the conceptual model impossible. In evaluating the conceptual model, the most unexpected result from this analysis was the lack of facility structure components appearing in the decision trees. Of the 11 times (out of a total of 100 decision points) that a facility structure was included, most frequently it was a count of the number of specialty wards in the hospital. From the way hospitals split at these decision points, it does not appear that these wards conveyed a specific quality benefit; instead they may have simply served as a convenient marker for hospital size, as the existence of these specialty wards typically just reflects a hospital large enough to justify a ward targeting a specific patient population. Otherwise, it seems that broad facility characteristics did not impact each hospital's path to quality. Although these models did not show a critical role for facility structure, suggesting that facility structure might be removed from the analytic framework, the VA healthcare system has enough unique characteristics that it may diminish the impact of certain facility characteristics. These differences are likely most evident at small VA facilities, which receive a number of structural benefits from their association with the VA system that similarly sized critical access hospitals elsewhere in the country may not. 
As such, the facility structure component should remain in the analytic framework until it can be clearly determined what role, if any, these components play in defining the organizational context for QI. For the other two components in the conceptual model, QI structure and QI process, it was clear that both played significant roles in shaping the quality environment. Additionally, there were numerous interactions between the two classes, with the decision tree occasionally alternating between the selection of a QI process and a QI structure. What was unclear from the decision trees was whether these two concepts built on each other, were equally important factors, or even represented two discrete concepts. However, considering the differences between the LOS and discharges before noon trees, it does seem that QI structure played a distinct role in supporting sustained quality that differs from the role of QI process components in creating initial improvements. As such, there was no evidence for revising the analytic framework until future research can clarify the role of team character and better understand the interaction between QI structure and QI process components. 
Limitations 
While this study has generated some important findings, there were key limitations to consider when evaluating the meaning of these findings. The key limitation was the lack of variables measuring characteristics of the QI team in charge of FIX implementation. These variables could have been particularly informative, especially if a minimally engaged team was otherwise in a supportive organizational context. With FIX representing a nationally mandated QI program, there was a distinct possibility that the goals of FIX were not relevant for all participating hospitals. 
Additionally, these features could have provided intriguing insight into whether high-functioning teams can overcome some of the identified barriers, or whether certain environmental barriers are too substantial to expect a QI team to overcome. Despite lacking measures of QI team behavior, the findings identified in the decision trees provide a good foundation for future research and can help current administrators evaluate whether their hospital has any foundational barriers to quality that should be addressed prior to undertaking large QI efforts. A second limitation of the data was that the CPOS data came from chief of staff reports and most frequently represented subjective opinions. While chiefs of staff may continue to participate in clinical responsibilities, they will certainly have less frequent interaction with direct patient care and may see only one side of patient care (i.e., participating in outpatient clinics but not in inpatient care). Given this perspective, a chief of staff's subjective evaluation on many of these variables may not perfectly match the evaluation from the providers that interact with the system regularly. Additionally, these subjective evaluations may not be comparable across sites, as opinions may differ on what would be defined as completely sufficient staffing levels. The third limitation was that this analysis only considered performance in relation to a single QI project. VA hospitals conduct multiple QI projects each year, many of which are individual projects that occur under circumstances that can vary dramatically from those associated with a national collaborative. The real interest of this research was to develop an understanding of what creates an organizational context that supports all QI efforts. 
A broader measure that considered the overall success of many QI efforts in the hospital could lead to more accurate classification of hospital performance in relation to its organizational context, and thus to better performing decision tree models. The fourth limitation relates to how well any conclusions from this study generalize to the broader healthcare environment. This study did have complete data for 77% of national VA hospitals, so while this sample was quite representative, and analyses did not indicate any systematic reasons for non-response, there was the potential for some non-response bias in the sample. While this sample may be sufficiently representative of VA, there were unique characteristics of VA healthcare that impacted these findings, suggesting limited generalizability to private sector hospitals. First, as a large integrated healthcare system, the VA has numerous interactions between different hospitals that do not translate as well to the private sector. Second, the VA has a much longer history with an electronic medical record, so the prominent role of CPRS in the decision trees may not be mirrored in hospitals that are just adopting an electronic medical record and have not determined how best to incorporate that tool into their QI toolbox. This study, however, may help them understand ways to develop and design intra-hospital networks and effective electronic medical records to help optimize future patient safety and quality efforts. Overall, these four limitations show why this analysis was focused on generating hypotheses and not geared towards testing specific hypotheses. The findings from this study should help in the design of future studies that will better illuminate the pathways that generate quality in healthcare and how those pathways differ across settings that vary on a number of factors. 
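The 10-fold evaluation referenced in the discussion partitions the hospitals into ten folds, fits a classifier on nine of them, scores the held-out fold, and compares the pooled accuracy against chance. A minimal pure-Python sketch of the procedure, using hypothetical performance labels and a majority-class baseline standing in for a fitted decision tree:

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in range(k) if f != i for j in folds[f]]
        yield train, test

def majority_label(labels):
    """Most frequent label in a collection."""
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return max(counts, key=counts.get)

def cv_accuracy(labels, k=10):
    """Cross-validated accuracy of a majority-class baseline; the real
    analysis would fit a decision tree on each training fold instead."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        pred = majority_label([labels[i] for i in train])
        correct += sum(1 for i in test if labels[i] == pred)
    return correct / len(labels)
```

A tree whose cross-validated accuracy does not exceed this baseline is, in the terms used in the discussion, classifying no better than chance.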
Conclusions 
This chapter covered a number of details related to the effort to model how a hospital's organizational context modified improvement efforts in response to FIX, leading to the performance that was evaluated and reported in Chapter 4. The main finding from this analysis was that the characteristics of the organizational context that either support or hinder success with a QI initiative were highly variable across hospitals. In effect, knowing the performance at one hospital would not facilitate predicting performance at a hospital with similar characteristics. This likely reflects the complex nature of quality improvement and the inability to measure and model all important factors at one time. It may also reflect the severe limitation associated with not having any characteristics of the teams trying to create local improvements. Despite the lack of broad statements on how to create an organizational context to support QI, the review of individual decision trees identified some variables that can help hospitals develop an environment to support QI efforts. The first three sets of variables, sufficient staff, inpatient resources, and data collection, all identify features that play a role in supporting QI efforts. For any hospital undertaking a new QI initiative, or with a sense that far too many QI efforts fail, these may represent areas to address before investing in further QI. The fourth set of variables measured efforts to improve adherence to clinical practice guidelines. These measures should serve as an important reminder that success with QI takes time and often repeated trials. For this variable category, the identified factors were those suggesting that hospitals utilized multiple approaches, often applied consistently across different diseases, to achieve the desired adherence to clinical guidelines. This suggests that successful QI requires a long-term investment and commitment. 
Hospitals likely need time to learn what methods work best for them and to allow time for the development of a culture that appreciates the change brought about by successful QI. The remainder of the chapter focused on examining the differences between the LOS and discharges before noon decision trees, re-evaluating the analytic framework first introduced in Chapter 5, and discussing the limitations of this analysis. The differences between the LOS and discharges before noon decision trees reaffirmed that sustaining improvements is a different process than making initial improvements. For the analytic framework, the data did not fully support the original design, but given the study limitations there was no clear evidence to support a refinement of the framework. The limitations of the study focus on the challenges posed by not having measures related to team character, as well as some unique aspects of the integrated VA healthcare system that will not translate to other hospitals or healthcare systems. Overall, this study has shown some of the strengths and challenges associated with the data mining approach to examining how organizational context supports QI. This model and the results from this study have highlighted some important areas for future study, which will be discussed in further detail in the next section. 
CHAPTER 8 – SUMMARY AND FUTURE WORK 
This chapter serves to summarize and conclude this project. It begins by reviewing the analyzed data and then highlights the major results that increase the understanding of quality improvement (QI) in healthcare. After establishing the basic findings of the project, the chapter revisits the concepts of human factors and change management first discussed in the opening chapter. This discussion focuses on whether the data in this study support these as two theoretical areas that could help improve how hospitals approach and perform in their QI efforts. 
The summary will then conclude with some final recommendations for hospitals and quality leaders to consider as they work to improve safety and quality for patients as well as develop robust QI programs. Lastly, a section explores some key research questions and outlines potential future research projects that will help expand knowledge and improve hospital QI.

Project Summary

The first step in this project was to establish how a collection of hospitals performed while participating in a national QI collaborative. The QI literature often presents a rosy picture of the success rate of QI projects, so this project aimed to identify a collection of measurable patient outcomes that could establish whether participating hospitals in fact made improvements during a QI effort. Additionally, the project worked to introduce and analyze the concept that QI should not only initially improve quality, but should also ensure that new levels of quality are sustained for an extended period after project completion. The case study for this analysis was a national QI collaborative undertaken by all 130 Veterans Affairs (VA) hospitals, named the Flow Improvement Inpatient Initiative (FIX). In this yearlong collaborative, participating hospitals worked together to improve patient flow at their individual facilities. Two goals of the collaborative, which became the primary outcome measures for this project, were to shorten patient length of stay (LOS) and increase the percentage of patients discharged before noon. Additionally, the analyses in this project considered three secondary outcome measures: 30-day all-cause readmission rates, in-hospital mortality, and 30-day mortality. These secondary outcomes were not a specific focus during FIX, and there was no expectation that hospitals would make improvements in these areas. Instead, the secondary outcomes served as safety checks to ensure that the efforts to improve patient flow did not result in unexpected negative outcomes.
The process of evaluating hospital performance during FIX utilized an interrupted time-series analysis. The goal of this quasi-experimental approach was to provide the best available control for pre-existing trends in the outcomes (two years of pre-FIX data), which would help determine whether changes in the outcomes during FIX were likely attributable to FIX efforts. Additionally, the interrupted time-series approach evaluated two years of post-implementation data in order to evaluate whether those hospitals that improved during FIX sustained those improvements. The consideration of how hospitals performed on each of the five evaluated outcomes, combined with a need to develop a framework for comparison, led to the creation of a novel classification system that included four major performance categories. The first category included those hospitals that exhibited outcomes with high levels of variance during the initial four years of the five-year study. These hospitals were classified as no change, as they exhibited no detectable changes in a set of outcomes that generally have clear temporal trends. Hospitals in this group likely had highly non-standardized patient care processes, which presents its own unique QI challenge. These hospitals likely need to work to develop a standardized process, rather than trying to target interventions to improve specific elements of patient care. This was in contrast to the fourth classification category, which included those hospitals that did not benefit from their participation in FIX. These hospitals had less variability in the outcomes, such that the time-series models could detect changes over time; they simply had no improvement, or in some cases actually had declining performance, during FIX.
These hospitals were more likely to have standardized care processes; they were simply unsuccessful in implementing changes that created measurable improvements in the process as part of their participation in FIX. The other two classification categories considered those hospitals that showed improvements in response to FIX. In effect, the time-series models indicated that these hospitals changed their outcome measures during FIX in a manner that was not predicted by any pre-implementation trends. The distinction between the two categories was how hospitals performed in the two years after FIX. Those hospitals whose performance on the outcomes returned to (or above) levels predicted by the pre-FIX baseline were classified as improvers, while those that maintained performance better than predicted by the pre-FIX baseline were categorized as sustainers. Overall, the results of the analysis found that a number of hospitals improved LOS (35%) and discharges before noon (46%). However, of those facilities that improved, hospitals were more likely to sustain improvements related to LOS. In total, only 27 (21%) hospitals showed sustained improvements for LOS and 19 (17%) for discharges before noon. Assuming that long-term sustained improvements are important, this analysis revealed a need for a better understanding of how QI efforts interact with the broader organizational context and how to create an organizational context that would better support sustained QI results. To begin to understand the interaction between organizational context and QI efforts, the second half of this project considered the literature addressing the relationship between different organizational characteristics and a variety of measures related to quality. This literature review identified several shortcomings and found no standard model for evaluating how organizational context modifies QI programs.
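To make the four-category logic concrete, the following is a minimal sketch, in Python for illustration only (the thesis's actual analysis used SAS interrupted time-series models), of how a hospital's LOS trajectory might be projected from its pre-FIX trend and mapped onto the four categories. The noise threshold, toy data, and function names are invented for this example.

```python
# Hedged sketch: project the pre-FIX linear trend forward and classify one
# hospital's LOS trajectory into the four performance categories. The 0.25
# noise threshold and the simulated data are illustrative assumptions.

def fit_trend(xs, ys):
    """Ordinary least-squares slope/intercept for the baseline segment."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def classify(months, los, fix_start, fix_end, threshold=0.25):
    """Return 'no change', 'no benefit', 'improver', or 'sustainer'."""
    pre = [(m, y) for m, y in zip(months, los) if m < fix_start]
    slope, intercept = fit_trend([m for m, _ in pre], [y for _, y in pre])

    def predict(m):
        return intercept + slope * m

    resid = [y - predict(m) for m, y in pre]
    noise = (sum(r * r for r in resid) / len(resid)) ** 0.5
    if noise > threshold:                      # baseline too variable to model
        return "no change"
    during = [y - predict(m) for m, y in zip(months, los)
              if fix_start <= m < fix_end]
    after = [y - predict(m) for m, y in zip(months, los) if m >= fix_end]
    gain = sum(during) / len(during)           # negative gap = shorter LOS
    if gain >= 0:
        return "no benefit"
    return "sustainer" if sum(after) / len(after) < 0 else "improver"

# Toy hospital: stable baseline, a 1-day LOS drop during FIX, sustained after.
months = list(range(60))                       # 24 pre, 12 FIX, 24 post
los = [6.0] * 24 + [5.0] * 36
print(classify(months, los, fix_start=24, fix_end=36))  # -> sustainer
```

A hospital whose post-FIX gap returns to zero would instead be labeled an improver, mirroring the improver/sustainer distinction described above.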
In order to guide further analysis, a new analytic framework was developed to describe the interaction between four components of an organization's context. These components (facility structure, QI structure, QI process, and team character) were modeled as building on top of each other. This meant that the facility structure provided a basic framework, and that framework would itself modify how the QI structure contributed to the organizational context, with similar relationships between QI structure and QI process, and between QI process and team character. This analytic framework drove the selection of key variables that measured components of the organizational context. The framework also led to the selection of data mining decision trees as an analytic tool for modeling and understanding the complex interactions between different variables. The major analysis in the second half of the project still worked with the basic FIX case study. In this analysis, the goal was to develop decision tree models that would identify combinations of organizational factors commonly associated with hospital classification into one of the four performance categories. The tested organizational factors came from facility responses to two surveys completed during the same time frame as FIX. Despite a list of 263 potential factors, the data mining analysis was unable to produce models that predicted hospital performance better than chance. This lack of success was likely due in part to the lack of measures of team character, but it also clearly suggests that there are many challenges in effectively measuring the unique hospital characteristics that create the context determining whether efforts with FIX were successful. While these findings were unfortunate, the decision trees did help to uncover a number of important findings that can guide hospital policy considerations and future research.
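As a toy illustration of the kind of decision-tree analysis described above (a hand-rolled Gini split in Python rather than the actual data-mining software used in the thesis), a tree repeatedly picks the survey factor whose split best separates the performance categories. The factor names and hospital data below are invented for illustration.

```python
# Illustrative sketch of a single decision-tree split: choose the binary
# survey factor that most reduces Gini impurity. Data are invented.

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels, factors):
    """Return the factor giving the lowest weighted impurity after a split."""
    best = None
    n = len(labels)
    for f in factors:
        left = [lab for row, lab in zip(rows, labels) if row[f]]
        right = [lab for row, lab in zip(rows, labels) if not row[f]]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (f, score)
    return best[0]

# Toy data: "sufficient_staff" cleanly separates sustainers from the rest.
rows = [{"sufficient_staff": True,  "pilot_tests": False},
        {"sufficient_staff": True,  "pilot_tests": True},
        {"sufficient_staff": False, "pilot_tests": True},
        {"sufficient_staff": False, "pilot_tests": False}]
labels = ["sustainer", "sustainer", "no benefit", "improver"]
print(best_split(rows, labels, ["sufficient_staff", "pilot_tests"]))
# -> sufficient_staff
```

A full tree would recurse on each branch; the thesis's analysis grew such trees over 263 candidate survey factors and the four outcome categories.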
The first important finding was that the decision trees consistently identified four variable categories that played critical roles in establishing the nature of the organizational context. The first three categories (sufficient staff, inpatient resources, and data collection) all provided separate but complementary avenues for creating an organizational context that can either facilitate or hinder QI projects. The fourth category focused on different activities used to promote adherence to clinical practice guidelines. While there can be no definitive conclusions about the relationship between these variables and success with QI, they did provide valuable insight that local leaders can evaluate when determining how to optimize, or at least improve, their QI programs. The other important finding from the decision trees related to the differences between the decision trees for LOS and discharges before noon. These two trees further confirmed that improving and sustaining QI is a two-stage process. The discharges before noon tree, in which a greater number of hospitals were able to improve but not sustain, emphasized QI process variables that measured whether the hospital had the experience to pull together a QI team and make initial improvements. In contrast, the LOS decision tree, which had a greater number of sustainers, highlighted the critical role of QI structural components in providing appropriate support to help maintain quality levels even after the completion of the initial QI project. The data mining decision tree analysis was subject to a number of limitations. The key limitation to remember was that there were no specific measures of the FIX QI teams and how their interactions or approaches impacted the hospitals' success with FIX. Despite these limitations, the analysis was able to highlight the strengths of the data mining approach as a potential tool for certain analytic activities, particularly hypothesis-generating research.
Further, this study clearly identified that there were challenges to sustaining healthcare quality and began to identify different approaches that may help in meeting this challenge.

Human Factors and Change Management

The introduction to this project considered the roles that poor design, from a human factors perspective, and resistance to change may play in the current inability of healthcare organizations to make substantial improvements in quality. Both of these factors were hard to examine directly because the available data did not truly measure concepts from these theories; however, a few variables did provide some insight. This section briefly discusses these insights and whether they suggest human factors and change management theories could potentially overcome some of the observed challenges associated with creating sustained quality. The challenge in assessing the role of human factors was the lack of information about the decisions QI teams made in trying to achieve the goals of FIX. As such, it is not clear whether successful approaches had better human factors designs than the unsuccessful approaches. In the survey data there was a specific factor that assessed whether hospitals typically used human factors assessment to develop electronic reminders. This factor did not appear in any of the models. Another potentially related factor, use of pilot testing for electronic reminders, did appear in the discharges before noon and LOS/noon composite decision trees. In those models, the factor suggested that hospitals that did not pilot test reminders would not sustain improvements and were even unlikely to make initial improvements. Another situation that suggests human factors approaches could improve the likelihood of sustaining results comes from the many appearances of the staff sufficiency variable in the decision trees.
When factors from this variable class appeared in the models, they often separated improvers or sustainers from hospitals that did not succeed with FIX. However, almost uniformly the factor appeared after some other resource or activity was viewed as insufficient, suggesting that QI at these hospitals may rely on staff remembering that they are responsible for completing certain actions to ensure quality. While providers will always play a significant role in ensuring quality, the incorporation of human factors design principles into quality improvement projects may help identify more robust solutions that avoid the potential for declining quality over time. For change management, there were three factors in the guideline implementation variable category that measured whether physicians, nurses, or other providers had any resistance to relevant QI projects. None of these individual factors, nor the composite resistance measure, appeared in any of the decision trees. This likely was because approximately 85% of hospitals reported very little or some resistance for each of these three measures. With so many hospitals reporting middle-range values, there may not have been enough variation between sites for these measures to appear in the models. Further, these were broad measures of resistance to efforts to promote guideline adherence, so individual QI teams may have experienced different levels of resistance to the changes associated with FIX. While these data make it seem that change resistance played a minimal or negligible role in the shortage of hospitals that improved or sustained quality, there was some indirect evidence that change resistance could be an important factor to consider. As discussed in Chapter 7, the decision trees showed that consistent and repetitive approaches to ensuring adherence to clinical practice guidelines were associated with a greater likelihood of success with FIX.
To the degree that consistent and repetitive approaches help providers accept change and reduce resistance, this can serve as an indicator that there can be meaningful resistance to change. So, while a weak finding, it is an important concept to keep in mind and to consider should hospitals find they have difficulty achieving their quality goals.

Recommendations for Improving QI

Although this work represents only a single case series involving 130 hospitals and resulted in a collection of predictive models with little predictive ability, the entire synthesis of literature and data has highlighted several issues. The following are five recommendations based on this project that represent approaches local quality leaders should consider as they work to develop an effective QI program.

1. Make your QI efforts about quality, not about meeting a requirement. Successful projects are those that people believe in and want to see become successful. Far too often, the people affected by a QI project (if not the actual QI team) are told they must change in order to meet some, often seemingly arbitrary, internal or external requirement. This is a setting where change resistance may be maximized and the chances of project success minimized. These situations will often be marked by initial improvements followed by rapid degradation of those improvements after project completion. This type of effect could explain the overall system-wide response on the discharges before noon outcome. If enough QI teams treated the goal as something they had to do to satisfy a request from central office, the teams would have had sufficient buy-in to produce initial improvements to report; but once no one was monitoring and reporting rates of discharge before noon, providers returned to their original discharge process.
The key here is to encourage QI teams to identify early on, and properly communicate, a project value (ideally for all of the stakeholders) that goes beyond simply meeting arbitrary requirements. Done properly, the larger healthcare community should have the necessary motivation to improve and to sustain those improvements.

2. Aim for real change, not just re-education. While effective QI will include education, an effective QI team must work to understand the process and what about that process allows poor quality to occur. Then the team can identify ways to change the process that will eliminate sources of poor quality. Education can then focus on helping providers understand the new process and the benefits of that process. In contrast, education that relies solely on encouraging providers to perform better (which they will strive to do) is unlikely to sufficiently support efforts and will not lead to lasting improvements in quality. Another important consideration is that the QI team should make sure there is a definable and consistent process relevant to the outcome of interest. Quite often the issue in healthcare is that there are few standardized processes, making it difficult to broadly implement changes when everyone performs differently. This potential issue was the motivating factor for developing the FIX classification approach that separated facilities with high variability (No change) from those that had low variability but did not improve (No benefit).

3. Empower and excite. Change is most lasting when those who provide frontline care are involved and truly excited about the QI project. The data in this study indicated that staff were critical to supporting QI; the real question was how to most efficiently utilize staff in achieving goals.
While it is critically important that those who formulate the strategic plan for an organization make it clear that they value and support QI, there is only so much that management in many healthcare systems can do to effect change. Instead, it must be the frontline leaders who recognize a quality problem, communicate the need for change, and motivate those around them to overcome the challenge. Additionally, it is these people who understand how a process truly occurs and can best identify the waste or potential sources of error. Only when there is true energy at the front lines for supporting and making a change is it possible to achieve long-term quality.

4. Measure and evaluate. Measures of data collection were frequently used to separate different performance categories in the decision tree models. In short, it is impossible to improve quality if there is no clear understanding of the current state of performance. Similarly, sustaining performance requires monitoring performance and being prepared to respond should new sources of error emerge. This process has its own challenges, as hospitals must carefully identify how frequently to collect and report data, as well as how to ensure that data are reported in a format that local quality leaders can interpret and use to develop plans of action.

5. Start small, dream big. All QI approaches include some level of focus on continuous improvement and monitoring. The continuous improvement process serves many critical purposes, but perhaps most importantly it recognizes that most processes are subject to multiple sources of waste or error. This means that QI teams need the ability to systematically and sequentially tackle different issues rather than feeling that a successful project must tackle all problems with a single intervention.
In addition to keeping the team from tackling too large a project, this approach helps teams meet individual goals, which can be an excellent way to maintain interest and excitement about the project.

Future Studies

As with any hypothesis-generating study, these analyses have generated a number of potential future research avenues. This section considers four key areas and outlines some potential research projects. First, the clearest finding from this research was that hospitals had varying levels of success with QI and that it was difficult to predict hospital performance. This finding suggests it is critical to develop a better understanding of the QI process and to begin to identify critical events that establish a greater likelihood of project success or failure. Studies in this area could build on the analytic framework from this study, making sure that they evaluate key characteristics of participating QI teams. Some considerations are which individuals (professions, positions in the organization) compose the team, how well the team members interact with each other, and how well the team interacts with others in the organization. Additionally, there is a need for studies that can develop composite measures of hospital QI success that consider success rates across all QI efforts and not just a single QI project. This avenue of research should not be limited to primarily quantitative studies such as this one, but should also utilize mixed-method or purely qualitative study designs for understanding the QI process. A potential structure for these studies could be to identify a framework for improvement (such as those found in the work of Kotter,6 Nelson et al.,75 or Langley et al.76) and identify how well improvement teams adhere to these frameworks and whether their adherence to the framework relates to their ability to improve (and even sustain) quality.
The challenge with these sorts of studies will be in determining how to generate enough scale to create generalizable knowledge, so an important secondary objective of these studies may be to determine how to effectively survey QI teams to collect key measures, thus facilitating less resource-intensive quantitative studies. This concept introduces a second important area for future research. One surprising finding from the data mining trees was that very few composite measures representing different variable categories appeared in the final decision trees. As this work moves away from considering whether individual projects succeed to evaluating whether an organizational context can broadly support multiple QI initiatives, it will be important to identify and validate composite measures that will have some relevancy across multiple settings. As this study showed, individual measures will have little predictive power when applied in different settings. A third area for future research is a more in-depth examination of the variable categories that were commonly identified in the decision trees. As already discussed, this can include examining the exact meaning of sufficient staffing or inpatient resources and further understanding how those staff or resources are being used to support improvements in quality. Other considerations from the findings include examining sources of information about QI, the accuracy of that information, and how well information is disseminated. Some of the decision points in the discharges before noon decision tree suggested that some VA teams were relying on information sources that were ineffective in supporting their QI efforts. The final area of research builds further on this concept of team information. However, this research needs to focus less on where teams get their information and more on how to develop information systems to support clinical care and quality improvement.
These efforts need not only to understand how best to collect data, but also what data are worth collecting and how to create useful syntheses of those data. While this research is outside the typical realm of health services research and is more of an engineering or information systems project, it is far too critical an area for the future of healthcare quality not to mention as a future area of study.

Conclusions

As demonstrated by the distribution of hospital performance during FIX across the four classification categories, as well as the inability of the decision trees to utilize a set of 263 variables to create accurate predictive models for hospital performance, there is still considerable work required to understand how and why QI projects are successful. The four areas of future study outlined in this chapter represent some key areas, but realistically they are only a small portion of the many questions to consider. Hopefully this research has highlighted that QI in healthcare is a challenge and that hospitals should not expect easy fixes. Instead, hospitals and QI teams need to recognize the challenges they face as they try to improve and then sustain quality. With reflection (particularly about barriers for unsuccessful projects), commitment to a process, and investment in appropriate structural supports, hospitals can achieve their quality goals.
APPENDIX A – RISK ADJUSTMENT MODEL SAS CODE

/* Test Correlation */
Proc CORR Data=Risk_Adjustment OUTP=CorrMat;
  var QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem
      QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem
      QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC
      QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM
      QUAN05_Hyper_NC QUAN05_Hypothyr QUAN05_Liver QUAN05_Lymphom
      QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol
      QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pepticulcer
      QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis
      QUAN05_RenalFail QUAN05_Rheumatic QUAN05_Sevliver QUAN05_UlcerNoBleed
      QUAN05_Valve QUAN05_WeightLoss ICU_DIRECT_ADMIT DISP_DiedHosp
      DISP_Transout;
run;

/* Full Variable List:
   age_cat(Ref=FIRST) sex MS(Ref='M') income race_category(Ref=Last)
   SCPER_cat(REF=FIRST) MDC(REF='5')
   QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem
   QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem
   QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC
   QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM
   QUAN05_Hyper_NC QUAN05_Hypothyr QUAN05_Liver QUAN05_Lymphom
   QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol
   QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pepticulcer
   QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis
   QUAN05_RenalFail QUAN05_Rheumatic QUAN05_Sevliver QUAN05_UlcerNoBleed
   QUAN05_Valve QUAN05_WeightLoss SOURCE_cat(REF='1M') ICU_DIRECT_ADMIT
   DISTO_cat(REF='-1') DISTYPE(REF='1') DISP_DiedHosp DISP_Transout */

/* LOS Model */
/* Includes all variables found to have a p < 0.1 association in single variable models */
/* Removed = Diab_NC Hypothyr Obesity UlcerNoBleed */
/* AIC = 106542.6150 */
Proc Genmod Data=Risk_Adjustment;
  class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(REF='5') SOURCE_cat(REF='1M')
        DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
  model log_los = age_cat sex MS income race_category SCPER_cat MDC
      QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem
      QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem
      QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_DrugAbus
      QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC
      QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI
      QUAN05_Mildliver QUAN05_Neurol QUAN05_Nometast QUAN05_paraly
      QUAN05_Pepticulcer QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD
      QUAN05_RenalDis QUAN05_RenalFail QUAN05_Rheumatic QUAN05_Sevliver
      QUAN05_Valve QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat
      DISTYPE DISP_DiedHosp DISP_Transout / DIST=NORMAL;
run;

/* Removed variables that do not have a p < 0.1 value in the full model */
/* Removed = Sex COPD Depression DrugAbuse Hyper_CM Lymphoma MI Mildliver
   Nonmetast PepticUlcer RenalDisease RenalFailure SevLiver DISTYPE
   Disp_DiedHosp Disp_Transout */
/* AIC = 106562.6133 */
Proc Genmod Data=Risk_Adjustment;
  class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
        DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
  model log_los = age_cat MS income race_category SCPER_cat MDC QUAN05_AIDS
      QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF
      QUAN05_Coagulat QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia
      QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_NC
      QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_Neurol QUAN05_paraly
      QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD QUAN05_Rheumatic QUAN05_Valve
      QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=NORMAL;
run;

/* Adding back in variables, up to p < 0.2 */
/* Variables Added = COPD Depression MI RenalDisease */
/* AIC = 106524.5677 */
Proc Genmod Data=Risk_Adjustment;
  class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(Ref='5')
        SOURCE_cat(REF='1M') DISTO_cat(REF='-1') DISTYPE(REF='1')
        / PARAM=REFERENCE;
  model log_los = age_cat MS income race_category SCPER_cat MDC QUAN05_AIDS
      QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF
      QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia
      QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hemipara
      QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI
      QUAN05_Neurol QUAN05_paraly QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD
      QUAN05_RenalDis QUAN05_Rheumatic QUAN05_Valve QUAN05_WeightLoss
      SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=NORMAL;
run;

/* The above is the FINAL MODEL; next steps as well as correlation testing make
   no significant improvement to the AIC */

/* Removing variables back to the p < 0.1 level */
/* Variables Removed = COPD Depression MI */
/* AIC = 106542.4696 */
Proc Genmod Data=Risk_Adjustment;
  class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
        DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
  model log_los = age_cat MS income race_category SCPER_cat MDC QUAN05_AIDS
      QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF
      QUAN05_Coagulat QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia
      QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_NC
      QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_Neurol QUAN05_paraly
      QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis
      QUAN05_Rheumatic QUAN05_Valve QUAN05_WeightLoss SOURCE_cat
      ICU_DIRECT_ADMIT DISTO_cat / DIST=NORMAL;
run;

/* Removing variables to the p < 0.05 level */
/* Variables Removed = Hemiparesis */
/* AIC = 106525.5483 */
Proc Genmod Data=Risk_Adjustment;
  class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
        DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
  model log_los = age_cat MS income race_category SCPER_cat MDC QUAN05_AIDS
      QUAN05_ALCOHOL
      QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat
      QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia QUAN05_Diab_CM
      QUAN05_FluidDis QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
      QUAN05_Metastic QUAN05_Neurol QUAN05_paraly QUAN05_Psychosis
      QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis QUAN05_Rheumatic QUAN05_Valve
      QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=NORMAL;
run;

/* Discharge Time Model */
/* Includes all variables found to have a p < 0.1 association in single variable models */
/* Removed = Income AIDS BloodAnem Coagulation CVD Dementia Depression FluidDis
   Hypothyr Liver Lymphom Malignant MildLiver Neuro PepticUlcer Pulmcirc
   SevLiver UlcerNoBleed Valve WeightLoss */
/* AIC = 36850.3500 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
  class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
        SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
        DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
  model NoonDischarge = age_cat sex MS race_category SCPER_cat MDC
      QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_COPD
      QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
      QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Metastic QUAN05_MI
      QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Psychosis QUAN05_PVD
      QUAN05_RenalDis QUAN05_RenalFail QUAN05_Rheumatic SOURCE_cat
      ICU_DIRECT_ADMIT DISTO_cat DISTYPE DISP_DiedHosp DISP_Transout
      / DIST=BINOMIAL;
run;

/* Remove variables not meeting p < 0.2 in full model */
/* Removed: Sex SCPER_CAT Arthrit COPD DrugAbuse Hemiparesis Metastic MI
   Nonmetast Obesity Paralysis Psychosis RenalDisease Rheumatic Distype
   Disp_DiedHospital Disp_Transout */
/* AIC = 36827.7762 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
  class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last) MDC(Ref='5')
        SOURCE_cat(REF='1M') DISTO_cat(REF='-1') / PARAM=REFERENCE;
  model NoonDischarge = age_cat MS race_category MDC QUAN05_ALCOHOL
      QUAN05_Arrhyth QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM
QUAN05_Diab_NC QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_PVD QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat QUAN05_Arthrit/ DIST=BINOMIAL; run; 163 /* Remove variables not meeting p<0.1 in reduced model */ /* Removed: Alcohol Diab_NC Hyper_CM Hyper_NC*/ /* AIC = 36828.0531*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’)/ PARAM=REFERENCE; model NoonDischarge = age_cat MS race_category MDC QUAN05_Arrhyth QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM QUAN05_PVD QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=BINOMIAL; run; /* Add correlation variables to the p<0.2 reduced model */ /* Added: Arthritis */ /* AIC = 36823.6495*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’)/ PARAM=REFERENCE; model NoonDischarge = age_cat MS race_category MDC QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_PVD QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=BINOMIAL; run; /* The Above is the FINAL MODEL, next steps as well as any other correlation testing make no significant improvement to the AIC */ /* Reduce correlation model to p<0.1 */ /* Removed: Alcohol Diab_NC Hyper_CM Hyper_NC */ /* AIC = 36823.9588*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’)/ PARAM=REFERENCE; model NoonDischarge = age_cat MS race_category MDC QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM QUAN05_PVD QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat / DIST=BINOMIAL; run; /* 30-day & Inhospital Mortality Model*/ /* Includes all variables found to have a p<.1 association in single variable models */ /* Removed = Income AIDS 
Arthrit Deficanem Hypothyr PepticUlcer Psychosis PVD Rheumatic UlcerNoBleed Disto_cat DisType DISP_DiedHosp DISP_Transout */ /* AIC = 11741.2237*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex MS(Ref=’M’) race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) / PARAM=REFERENCE; model Died30Day = age_cat sex MS race_category SCPER_cat MDC QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Dementia 164 QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_RenalFail QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial; run; /* Includes all variables found to have a p<.2 association in full model */ /* Removed = Sex SCPER_Cat BloodAnemia COPD Diab_NC DrugAbuse Hemiparesis Lymphoma MildLiver RenalDisease RenalFailure Valve*/ /* AIC = 11751.0545*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) / PARAM=REFERENCE; model Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_CVD QUAN05_DementiaQUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pulmcirc QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial; run; /* Add in Correlations to Reduced Model*/ /* Added = RenalDisease */ /* AIC = 11722.0331*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) / PARAM=REFERENCE; model 
Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_CVD QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_LiverQUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial; run; /* The Above is the FINAL MODEL, next steps as well as any other correlation testing make no significant improvement to the AIC */ /* Reducing Correlation Model to p<0.1*/ /* Removed = CVD Diab_CM Obesity*/ /* AIC = 11723.3343 */ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) MS(Ref=’M’) race_category(Ref=Last) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) / PARAM=REFERENCE; model Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_Dementia QUAN05_Depression QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial; run; 165 /* In-hospital Mortality Model*/ /* Includes all variables found to have a p<.1 association in single variable models */ /* Removed = Income AIDS Alcohol Arthrit DeficAnem Hypothyr Psychosis PVD Rheumatic UlcerNoBleed Disto_cat Distype Disp_Transout*/ /* AIC = 8562.6092*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex MS(Ref=’M’) race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model DISP_DiedHosp= age_cat sex MS race_category SCPER_cat MDC QUAN05_Arrhyth QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM 
QUAN05_Hyper_NC QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_RenalFail QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=BINOMIAL; run; /* Includes all variables found to have a p<.2 association in the full model */ /* Removed = BloodAnemia CVD Diab_NC DrugAbuse Hemiparesis Lymphom MildLiver Obesity RenalFail Valve */ /* AIC = 8551.4851*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex MS(Ref=’M’) race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’)DISTYPE(REF=’1’) / PARAM=REFERENCE; model DISP_DiedHosp= age_cat sex MS race_category SCPER_cat MDC QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT/ DIST=BINOMIAL; run; /* Remove MS from the p<.2 Reduced Model */ /* Removed = MS */ /* AIC = 8546.3365*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model DISP_DiedHosp= age_cat sex race_category SCPER_cat MDC QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT/ DIST=BINOMIAL; run; 166 /* The Above is the FINAL MODEL, next steps as well as 
any other correlation testing make no significant improvement to the AIC */ /* Includes all variables found to have a p<.1 association in the reduced model */ /* Removed = Sex Diab_CM PepticUlcer*/ /* AIC = 8547.6959*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model DISP_DiedHosp= age_cat race_category SCPER_cat MDC QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Dementia QUAN05_Depression QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT/ DIST=BINOMIAL; run; /* All Cause Readmission Model*/ /* Includes all variables found to have a p<.1 association in single variable models */ /* Removed = MS Income Alcohol Dementia Depression Hemiparesis Hypothyroidism Neuro Paralysis PepticUlcer Psychosis Rheumatic UlcerNoBleed ICU_Direct_Admit Disp_DiedHosp Disp_Transout*/ /* AIC = 34500.2496 */ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model allcause_readmit_flag = age_cat sex race_category SCPER_cat MDC QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Nometast QUAN05_Obesity QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss SOURCE_cat DISTO_cat DISTYPE / DIST=BINOMIAL; run; /* Includes all variables found to have 
a p<.2 association in Full Model*/ /* Removed = Sex BloodAnem CVD Depression DrugAbuse Hyper_Cm Liver Lymphoma MildLiver NonMetast Pulmcirc Valve Distype*/ /* AIC = 34489.2946 */ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model allcause_readmit_flag = age_cat race_category SCPER_cat MDC QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_FluidDis QUAN05_Hyper_NC QUAN05_Malignant QUAN05_Metastic 167 QUAN05_MI QUAN05_Obesity QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat DISTO_cat / DIST=BINOMIAL; run; /* Reduced to p<0.1 and Adjust for Correlation Effects*/ /* Removed = Hyper_NC RenalFail*/ /* Added = Liver*/ /* AIC = 34487.3895 */ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model allcause_readmit_flag = age_cat race_category SCPER_cat MDC QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_FluidDis QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Obesity QUAN05_PVD QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat DISTO_cat / DIST=BINOMIAL; run; /* The Above is the FINAL MODEL, next steps as well as any other correlation testing make no significant improvement to the AIC */ /* Reduced to p<0.05 */ /* Removed = Arthrit Obesity*/ /* AIC = 34490.2071*/ Proc Genmod Data=Risk_Adjustment DESCENDING; class age_cat(Ref=FIRST) sex race_category(Ref=Last) SCPER_cat(REF=FIRST) MDC(Ref=’5’) SOURCE_cat(REF=’1M’) DISTO_cat(REF=’-1’) DISTYPE(REF=’1’) / PARAM=REFERENCE; model allcause_readmit_flag = age_cat race_category 
SCPER_cat MDC QUAN05_AIDS QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_FluidDis QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_PVD QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat DISTO_cat / DIST=BINOMIAL; run; 168 APPENDIX B – SAS OUTPUT FOR RISK ADJUSTMENT The GENMOD Procedure Model Information Data Set Distribution Link Function Dependent Variable MYDATA.VERIFICATION Normal Identity log_los Number of Observations Read Number of Observations Used 60000 60000 Criteria For Assessing Goodness Of Fit Criterion Deviance Scaled Deviance Pearson Chi-Square Scaled Pearson X2 Log Likelihood Full Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) DF Value Value/DF 6E4 6E4 6E4 6E4 41474.6831 60000.0000 41474.6831 60000.0000 -74058.4710 -74058.4710 148308.9420 148309.2529 149173.1436 0.6923 1.0016 0.6923 1.0016 Algorithm converged. Analysis Of Maximum Likelihood Parameter Estimates Parameter Intercept age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat MS MS MS MS MS INCOME race_category race_category race_category SCPER_Cat SCPER_Cat MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC 2 3 4 5 6 7 8 9 10 D N S U W 1 2 3 2 3 0 1 2 3 4 6 7 8 9 10 DF Estimate Standard Error 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0.4390 0.0732 0.1011 0.1270 0.1500 0.1538 0.1585 0.2018 0.2053 0.2283 0.0523 0.0782 0.0598 0.0468 0.0524 -0.0000 -0.0767 0.0436 -0.0113 -0.0148 0.0117 0.0468 0.1963 0.1666 0.1031 0.3880 0.3064 0.4640 0.5131 0.5059 0.0547 0.0201 0.0226 0.0200 0.0186 0.0190 0.0204 0.0203 0.0204 0.0206 0.0222 0.0085 0.0121 0.0170 0.0595 0.0119 0.0000 0.0404 0.0100 0.0077 0.0081 0.0124 0.3722 0.0168 0.0643 0.0279 0.0118 0.0129 0.0195 0.0184 0.0195 0.0182 Wald 95% Confidence Limits 0.3996 0.0289 0.0618 0.0905 0.1128 0.1138 0.1186 0.1617 0.1649 0.1847 0.0356 0.0544 0.0265 -0.0698 0.0290 -0.0000 -0.1558 0.0239 -0.0264 -0.0306 -0.0126 
-0.6826 0.1634 0.0407 0.0485 0.3648 0.2810 0.4258 0.4770 0.4676 0.0190 0.4783 0.1174 0.1403 0.1635 0.1871 0.1938 0.1984 0.2418 0.2458 0.2718 0.0690 0.1020 0.0931 0.1634 0.0757 -0.0000 0.0024 0.0632 0.0039 0.0011 0.0360 0.7763 0.2292 0.2925 0.1577 0.4111 0.3317 0.5022 0.5492 0.5441 0.0905 Wald Chi-Square Pr > ChiSq 477.89 10.52 25.48 46.45 62.58 56.71 60.73 97.57 99.11 105.63 37.59 41.45 12.39 0.62 19.32 10.56 3.61 18.86 2.12 3.34 0.90 0.02 136.79 6.72 13.68 1081.92 561.66 567.27 774.50 672.02 9.00 <.0001 0.0012 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.0004 0.4313 <.0001 0.0012 0.0574 <.0001 0.1455 0.0675 0.3437 0.8999 <.0001 0.0095 0.0002 <.0001 <.0001 <.0001 <.0001 <.0001 0.0027 169 Parameter MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC QUAN05_AIDS QUAN05_ALCOHOL QUAN05_ARRHYTH QUAN05_ARTHRIT QUAN05_BLOODANEM QUAN05_CHF QUAN05_COAGULAT QUAN05_COPD QUAN05_CVD QUAN05_DEFICANEM QUAN05_DEMENTIA QUAN05_DEPRESSION QUAN05_DIAB_CM QUAN05_FLUIDDIS QUAN05_HEMIPARA QUAN05_HYPER_NC QUAN05_LIVER QUAN05_MALIGNANT QUAN05_METASTIC QUAN05_MI QUAN05_NEUROL QUAN05_PARALY QUAN05_PSYCHOSIS QUAN05_PULMCIRC QUAN05_PVD QUAN05_RENALDIS QUAN05_RHEUMATIC QUAN05_VALVE QUAN05_WEIGHTLOSS Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat ICU_DIRECT_ADMIT Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Disto_cat Scale 11 12 13 14 16 17 18 19 20 21 22 23 24 25 1D 1G 1K 1P 1T 2A 3A 3B -3 -2 0 3 4 5 7 11 17 22 25 30 DF 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Estimate 0.3205 0.2486 0.2644 -0.2870 -0.0163 0.4815 0.5710 0.1007 0.1837 -0.1093 0.5390 -0.0453 0.9161 0.7255 0.1304 0.1134 0.1321 0.3025 0.1734 0.2023 0.2776 0.0433 0.0884 0.1379 0.2217 0.0498 0.2013 0.2028 0.1270 0.0034 0.0636 0.2146 0.1550 0.1212 0.1728 0.1788 0.1003 0.2221 0.1253 0.1127 -0.2124 0.1586 0.3654 
-0.3523 -0.2524 0.1297 -0.0665 -0.8782 -0.0006 0.2313 0.1680 0.3351 -0.2792 0.3537 0.1105 0.0371 -0.2963 0.6290 0.7024 -0.0340 0.0743 0.3951 0.0217 -0.0488 0.8314 Standard Error 0.0156 0.0410 0.1549 0.8318 0.0253 0.0335 0.0281 0.0310 0.0253 0.0302 0.1910 0.0279 0.2632 0.0708 0.0450 0.0133 0.0090 0.0565 0.0342 0.0099 0.0200 0.0086 0.0145 0.0166 0.0239 0.0104 0.0138 0.0094 0.0547 0.0073 0.0147 0.0121 0.0191 0.0137 0.0146 0.0449 0.0200 0.0213 0.0134 0.0109 0.0642 0.0156 0.0195 0.0310 0.0851 0.0332 0.0070 0.0628 0.0484 0.0578 0.0444 0.0091 0.0236 0.0214 0.0292 0.0956 0.0373 0.0189 0.0320 0.0435 0.2225 0.0761 0.3398 0.2087 0.0024 Wald 95% Confidence Limits 0.2899 0.3512 0.1683 0.3289 -0.0392 0.5681 -1.9173 1.3432 -0.0659 0.0332 0.4158 0.5473 0.5159 0.6262 0.0400 0.1615 0.1341 0.2333 -0.1686 -0.0500 0.1646 0.9133 -0.0999 0.0093 0.4002 1.4319 0.5868 0.8643 0.0422 0.2186 0.0873 0.1394 0.1145 0.1497 0.1918 0.4131 0.1065 0.2404 0.1829 0.2217 0.2384 0.3167 0.0265 0.0601 0.0600 0.1167 0.1055 0.1704 0.1747 0.2686 0.0294 0.0703 0.1743 0.2283 0.1845 0.2212 0.0199 0.2341 -0.0109 0.0177 0.0348 0.0924 0.1908 0.2383 0.1176 0.1924 0.0943 0.1482 0.1441 0.2015 0.0908 0.2669 0.0611 0.1395 0.1802 0.2639 0.0991 0.1514 0.0914 0.1341 -0.3382 -0.0866 0.1280 0.1893 0.3272 0.4036 -0.4131 -0.2915 -0.4191 -0.0856 0.0645 0.1948 -0.0803 -0.0528 -1.0012 -0.7552 -0.0955 0.0944 0.1179 0.3447 0.0811 0.2550 0.3174 0.3529 -0.3254 -0.2330 0.3118 0.3955 0.0533 0.1677 -0.1502 0.2245 -0.3695 -0.2231 0.5919 0.6662 0.6397 0.7652 -0.1192 0.0511 -0.3617 0.5103 0.2458 0.5443 -0.6443 0.6878 -0.4579 0.3603 0.8267 0.8361 NOTE: The scale parameter was estimated by maximum likelihood. 
Wald Chi-Square 420.84 36.80 2.91 0.12 0.42 206.12 412.41 10.56 52.74 13.05 7.96 2.65 12.11 105.04 8.39 72.92 216.01 28.71 25.78 418.24 193.08 25.55 37.32 69.43 85.70 22.77 213.20 468.74 5.40 0.21 18.70 313.61 65.95 77.76 139.40 15.85 25.15 108.19 87.98 107.25 10.96 103.13 351.40 128.95 8.80 15.22 89.95 195.80 0.00 15.99 14.33 1368.72 140.33 274.31 14.33 0.15 62.96 1102.22 481.57 0.61 0.11 26.93 0.00 0.05 Pr > ChiSq <.0001 <.0001 0.0879 0.7300 0.5189 <.0001 <.0001 0.0012 <.0001 0.0003 0.0048 0.1039 0.0005 <.0001 0.0038 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.0201 0.6430 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.0009 <.0001 <.0001 <.0001 0.0030 <.0001 <.0001 <.0001 0.9907 <.0001 0.0002 <.0001 <.0001 <.0001 0.0002 0.6977 <.0001 <.0001 <.0001 0.4334 0.7385 <.0001 0.9490 0.8151 170 The GENMOD Procedure Model Information Data Set Distribution Link Function Dependent Variable Number Number Number Number of of of of MYDATA.VERIFICATION Binomial Logit NoonDischarge Observations Read Observations Used Events Trials 60000 60000 10911 60000 PROC GENMOD is modeling the probability that NoonDischarge=’1’. Criteria For Assessing Goodness Of Fit Criterion DF Log Likelihood Full Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) Value Value/DF -27126.7305 -27126.7305 54401.4609 54401.6461 55067.6163 WARNING: Negative of Hessian not positive definite. 
Analysis Of Maximum Likelihood Parameter Estimates Parameter Intercept age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat MS MS MS MS MS race_catego race_catego race_catego MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC 2 3 4 5 6 7 8 9 10 D N S U W 1 2 3 0 1 2 3 4 6 7 8 9 10 11 12 13 14 16 DF Estimate Standard Error Wald 95% Confidence Limits 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 -1.5855 -0.0997 -0.0025 0.0080 0.0052 0.0380 -0.0171 -0.0649 -0.0490 -0.0372 0.0929 0.0130 0.0529 0.1667 -0.0354 -0.2107 -0.1970 0.0613 -18.0777 -0.0398 -0.0910 0.0413 -0.2520 -0.1402 -0.2207 0.0003 -0.2209 -0.0738 -0.2257 0.0147 -0.2711 -17.9954 -0.2482 0.0608 0.0712 0.0622 0.0580 0.0590 0.0634 0.0633 0.0638 0.0646 0.0697 0.0266 0.0381 0.0544 0.1784 0.0389 0.1378 0.0336 0.0243 7079.609 0.0457 0.2112 0.0854 0.0366 0.0408 0.0610 0.0560 0.0653 0.0577 0.0514 0.1241 0.5421 16037.12 0.0859 -1.7047 -0.2392 -0.1244 -0.1057 -0.1104 -0.0862 -0.1412 -0.1900 -0.1755 -0.1738 0.0407 -0.0617 -0.0538 -0.1829 -0.1116 -0.4808 -0.2628 0.0137 -13893.9 -0.1294 -0.5050 -0.1260 -0.3237 -0.2201 -0.3403 -0.1096 -0.3489 -0.1869 -0.3264 -0.2285 -1.3337 -31450.2 -0.4165 -1.4662 0.0398 0.1195 0.1217 0.1208 0.1622 0.1069 0.0602 0.0775 0.0995 0.1452 0.0877 0.1595 0.5163 0.0408 0.0593 -0.1313 0.1089 13857.70 0.0497 0.3230 0.2086 -0.1802 -0.0603 -0.1012 0.1101 -0.0930 0.0392 -0.1250 0.2578 0.7915 31414.19 -0.0798 Wald Chi-Square Pr > ChiSq 678.95 1.96 0.00 0.02 0.01 0.36 0.07 1.03 0.58 0.28 12.17 0.12 0.94 0.87 0.83 2.34 34.47 6.38 0.00 0.76 0.19 0.23 47.35 11.82 13.10 0.00 11.46 1.64 19.30 0.01 0.25 0.00 8.35 <.0001 0.1614 0.9686 0.8902 0.9300 0.5488 0.7867 0.3095 0.4477 0.5939 0.0005 0.7329 0.3311 0.3500 0.3628 0.1262 <.0001 0.0116 0.9980 0.3830 0.6665 0.6288 <.0001 0.0006 0.0003 0.9961 0.0007 0.2006 <.0001 0.9060 0.6171 0.9991 0.0039 171 Parameter DF MDC 17 1 MDC 18 1 MDC 19 1 MDC 20 1 MDC 21 1 MDC 22 1 MDC 23 1 MDC 24 1 MDC 25 1 QUAN05_ALCOHOL 1 
QUAN05_ARRHYTH 1 QUAN05_ARTHRIT 1 QUAN05_CHF 1 QUAN05_DEFICANEM 1 QUAN05_DIAB_CM 1 QUAN05_DIAB_NC 1 QUAN05_HYPER_CM 1 QUAN05_HYPER_NC 1 QUAN05_PVD 1 QUAN05_RENALFAIL 1 Source_cat 1D 1 Source_cat 1G 1 Source_cat 1K 1 Source_cat 1P 1 Source_cat 1T 1 Source_cat 2A 1 Source_cat 3A 1 Source_cat 3B 1 ICU_DIRECT_ADMIT 1 Disto_cat -3 1 Disto_cat -2 1 Disto_cat 0 1 Disto_cat 3 1 Disto_cat 4 1 Disto_cat 5 1 Disto_cat 7 1 Disto_cat 11 1 Disto_cat 17 1 Disto_cat 22 1 Disto_cat 25 1 Disto_cat 30 1 Scale 0 Estimate -0.0810 -0.2286 0.0435 0.6409 0.0504 -0.8949 0.2751 -0.8062 -0.2048 0.0129 -0.0739 -0.1177 -0.1790 -0.1805 -0.1962 -0.0758 -0.1302 -0.1184 0.0405 -0.1160 -0.1479 -0.1623 0.2462 0.1627 1.3782 -0.0626 0.2036 0.1279 0.2348 0.4560 1.6054 0.9000 1.1959 0.6038 1.4094 0.3295 0.9143 -0.9306 -0.2399 -17.5448 -1.0321 1.0000 Standard Error 0.0999 0.0874 0.0923 0.0697 0.0922 0.7625 0.0810 1.0749 0.1821 0.0401 0.0290 0.0910 0.0325 0.0567 0.0474 0.0260 0.0622 0.0236 0.0422 0.0562 0.0810 0.2688 0.0988 0.0226 0.1555 0.1554 0.1682 0.1287 0.0273 0.0677 0.0529 0.0759 0.2354 0.1021 0.0481 0.0983 0.1111 1.0441 0.2779 6455.267 1.0347 0.0000 Wald 95% Confidence Limits -0.2768 0.1147 -0.4000 -0.0573 -0.1374 0.2244 0.5043 0.7775 -0.1304 0.2311 -2.3893 0.5995 0.1163 0.4339 -2.9131 1.3006 -0.5617 0.1521 -0.0656 0.0915 -0.1307 -0.0171 -0.2960 0.0605 -0.2427 -0.1154 -0.2917 -0.0694 -0.2891 -0.1033 -0.1269 -0.0248 -0.2522 -0.0083 -0.1646 -0.0721 -0.0422 0.1233 -0.2261 -0.0058 -0.3066 0.0108 -0.6891 0.3645 0.0526 0.4398 0.1185 0.2069 1.0735 1.6829 -0.3671 0.2419 -0.1260 0.5332 -0.1243 0.3801 0.1813 0.2883 0.3234 0.5887 1.5018 1.7091 0.7513 1.0487 0.7345 1.6572 0.4036 0.8039 1.3152 1.5036 0.1369 0.5221 0.6965 1.1322 -2.9771 1.1158 -0.7847 0.3048 -12669.6 12634.55 -3.0600 0.9959 1.0000 1.0000 Wald Chi-Square 0.66 6.84 0.22 84.54 0.30 1.38 11.53 0.56 1.26 0.10 6.50 1.68 30.38 10.14 17.14 8.48 4.38 25.12 0.92 4.26 3.33 0.36 6.21 52.01 78.59 0.16 1.47 0.99 74.00 45.39 921.99 140.75 25.81 34.96 860.19 
11.24 67.69 0.79 0.75 0.00 0.99 Pr > ChiSq 0.4173 0.0089 0.6375 <.0001 0.5849 0.2405 0.0007 0.4532 0.2608 0.7472 0.0108 0.1955 <.0001 0.0015 <.0001 0.0036 0.0364 <.0001 0.3370 0.0391 0.0678 0.5460 0.0127 <.0001 <.0001 0.6868 0.2260 0.3203 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.0008 <.0001 0.3728 0.3880 0.9978 0.3185 172 The GENMOD Procedure Model Information Data Set Distribution Link Function Dependent Variable Number Number Number Number of of of of MYDATA.VERIFICATION Binomial Logit Died30Day Observations Read Observations Used Events Trials 60000 60000 2627 60000 PROC GENMOD is modeling the probability that Died30Day=’1’. Criteria For Assessing Goodness Of Fit Criterion DF Log Likelihood Full Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) Value Value/DF -8576.3926 -8576.3926 17300.7853 17300.9705 17966.9407 Analysis Of Maximum Likelihood Parameter Estimates Parameter Intercept age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat age_cat MS MS MS MS MS race_category race_category race_category MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC MDC 2 3 4 5 6 7 8 9 10 D N S U W 1 2 3 0 1 2 3 4 6 7 8 9 10 11 12 13 14 16 17 18 19 DF Estimate Standard Error 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 -6.0997 0.4912 0.7316 1.1184 1.2381 1.4199 1.5908 1.9216 1.9501 2.3331 0.0113 -0.0091 -0.0114 0.3065 0.0762 0.1025 0.0905 -0.0817 -15.5073 0.4896 0.2475 -0.2209 1.0701 0.4147 0.9986 0.5621 -0.0192 -0.0542 0.4499 -0.2165 1.6022 -14.0090 -0.0333 1.0349 1.6278 0.3938 0.2571 0.2930 0.2628 0.2523 0.2526 0.2564 0.2544 0.2531 0.2537 0.2552 0.0551 0.0805 0.1175 0.3646 0.0638 0.2511 0.0634 0.0495 11452.30 0.1074 0.5962 0.2308 0.0698 0.0892 0.1099 0.1254 0.1939 0.1308 0.0961 0.2989 1.0584 26440.74 0.1793 0.1571 0.1146 0.2411 Wald 95% Confidence Limits -6.6037 -0.0831 0.2166 0.6239 0.7429 0.9173 1.0923 1.4255 1.4530 1.8329 -0.0968 -0.1669 -0.2418 -0.4081 -0.0489 -0.3896 
-0.0337 -0.1787 -22461.6 0.2792 -0.9210 -0.6733 0.9333 0.2399 0.7832 0.3164 -0.3993 -0.3106 0.2615 -0.8024 -0.4723 -51836.9 -0.3847 0.7270 1.4032 -0.0788 -5.5957 1.0656 1.2466 1.6130 1.7332 1.9225 2.0894 2.4177 2.4473 2.8333 0.1193 0.1487 0.2190 1.0212 0.2013 0.5946 0.2147 0.0153 22430.59 0.7001 1.4160 0.2314 1.2069 0.5895 1.2140 0.8078 0.3610 0.2021 0.6383 0.3694 3.6766 51808.90 0.3182 1.3427 1.8524 0.8664 Wald Chi-Square Pr > ChiSq 562.72 2.81 7.75 19.65 24.01 30.66 39.11 57.64 59.11 83.58 0.04 0.01 0.01 0.71 1.43 0.17 2.04 2.73 0.00 20.79 0.17 0.92 235.15 21.62 82.58 20.11 0.01 0.17 21.90 0.52 2.29 0.00 0.03 43.40 201.79 2.67 <.0001 0.0937 0.0054 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.8381 0.9104 0.9229 0.4005 0.2323 0.6830 0.1532 0.0987 0.9989 <.0001 0.6780 0.3384 <.0001 <.0001 <.0001 <.0001 0.9213 0.6783 <.0001 0.4689 0.1301 0.9996 0.8527 <.0001 <.0001 0.1024 173 Parameter MDC MDC MDC MDC MDC MDC QUAN05_ALCOHOL QUAN05_ARRHYTH QUAN05_CHF QUAN05_COAGULAT QUAN05_CVD QUAN05_DEMENTIA QUAN05_DEPRESSION QUAN05_DIAB_CM QUAN05_FLUIDDIS QUAN05_HYPER_CM QUAN05_HYPER_NC QUAN05_LIVER QUAN05_MALIGNANT QUAN05_METASTIC QUAN05_MI QUAN05_NEUROL QUAN05_NOMETAST QUAN05_OBESITY QUAN05_PARALY QUAN05_PULMCIRC QUAN05_RENALDIS QUAN05_SEVLIVER QUAN05_WEIGHTLOSS Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat Source_cat ICU_DIRECT_ADMIT Scale 20 21 22 23 24 25 1D 1G 1K 1P 1T 2A 3A 3B DF 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 Estimate -0.6304 -1.1221 1.3677 0.4862 -16.3130 2.2312 0.2461 0.2942 0.4632 0.8462 0.2827 0.3465 -0.3371 0.0427 0.7946 -0.3502 -0.4008 0.5090 0.9408 1.2994 0.6754 0.6307 -0.0732 -0.1022 0.6545 0.4784 0.4185 0.9237 0.7553 1.2628 0.9146 0.1431 -0.2452 -1.6016 -0.8856 0.5152 0.9138 0.6837 1.0000 Standard Error 0.3191 0.4263 1.0594 0.1886 8088.934 0.2524 0.0834 0.0510 0.0552 0.0812 0.0786 0.1097 0.0806 0.0878 0.0470 0.0943 0.0471 0.0934 0.1203 0.0738 0.0738 0.0757 0.1279 0.1329 0.1289 0.1000 
0.0842 0.1313 0.0819 0.0946 0.3292 0.1797 0.0460 1.0209 0.5817 0.2911 0.1878 0.0500 0.0000 Wald 95% Confidence Limits -1.2559 -0.0050 -1.9577 -0.2865 -0.7087 3.4442 0.1165 0.8560 -15870.3 15837.71 1.7366 2.7259 0.0827 0.4095 0.1943 0.3941 0.3550 0.5714 0.6869 1.0054 0.1286 0.4368 0.1315 0.5615 -0.4950 -0.1792 -0.1293 0.2147 0.7024 0.8867 -0.5349 -0.1654 -0.4931 -0.3085 0.3259 0.6921 0.7050 1.1766 1.1548 1.4440 0.5307 0.8201 0.4823 0.7792 -0.3239 0.1775 -0.3626 0.1582 0.4019 0.9070 0.2823 0.6744 0.2534 0.5835 0.6665 1.1810 0.5948 0.9159 1.0774 1.4482 0.2694 1.5598 -0.2091 0.4954 -0.3353 -0.1551 -3.6026 0.3994 -2.0257 0.2545 -0.0553 1.0857 0.5458 1.2818 0.5856 0.7817 1.0000 1.0000 Wald Chi-Square 3.90 6.93 1.67 6.64 0.00 78.17 8.72 33.30 70.34 108.47 12.93 9.98 17.50 0.24 285.71 13.80 72.37 29.69 61.15 310.09 83.67 69.33 0.33 0.59 25.80 22.87 24.69 49.53 85.03 178.20 7.72 0.63 28.46 2.46 2.32 3.13 23.68 186.82 Pr > ChiSq 0.0482 0.0085 0.1967 0.0100 0.9984 <.0001 0.0032 <.0001 <.0001 <.0001 0.0003 0.0016 <.0001 0.6264 <.0001 0.0002 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.5672 0.4418 <.0001 <.0001 <.0001 <.0001 <.0001 <.0001 0.0055 0.4258 <.0001 0.1167 0.1279 0.0767 <.0001 <.0001 174 The GENMOD Procedure Model Information Data Set Distribution Link Function Dependent Variable Number Number Number Number of of of of MYDATA.VERIFICATION Binomial Logit DISP_DIEDHOSP Observations Read Observations Used Events Trials DISPOSITION: IN-HOSPITAL MORTALITY 60000 60000 1691 60000 PROC GENMOD is modeling the probability that DISP_DIEDHOSP=’1’. 
Criteria For Assessing Goodness Of Fit

  Criterion                    Value
  Log Likelihood          -6127.9942
  Full Log Likelihood     -6127.9942
  AIC  (smaller is better) 12397.9885
  AICC (smaller is better) 12398.1591
  BIC  (smaller is better) 13037.1376

Analysis Of Maximum Likelihood Parameter Estimates

  Parameter           Level  DF  Estimate  Std Error  Wald 95% Conf Limits    Chi-Square  Pr > ChiSq
  Intercept                   1   -6.8314     0.3533  (-7.5238, -6.1390)         373.95     <.0001
  age_cat             2       1    0.7511     0.3928  (-0.0188, 1.5209)            3.66     0.0559
  age_cat             3       1    1.0474     0.3592  (0.3434, 1.7514)             8.50     0.0035
  age_cat             4       1    1.3446     0.3492  (0.6602, 2.0290)            14.83     0.0001
  age_cat             5       1    1.5080     0.3493  (0.8235, 2.1926)            18.64     <.0001
  age_cat             6       1    1.6338     0.3538  (0.9403, 2.3272)            21.33     <.0001
  age_cat             7       1    1.9215     0.3506  (1.2343, 2.6087)            30.03     <.0001
  age_cat             8       1    2.1898     0.3493  (1.5052, 2.8745)            39.30     <.0001
  age_cat             9       1    2.1492     0.3497  (1.4637, 2.8347)            37.76     <.0001
  age_cat             10      1    2.5846     0.3502  (1.8982, 3.2710)            54.47     <.0001
  SEX                 F       1   -0.2450     0.1999  (-0.6369, 0.1468)            1.50     0.2204
  race_category       1       1    0.0885     0.3109  (-0.5207, 0.6978)            0.08     0.7758
  race_category       2       1    0.2371     0.0751  (0.0900, 0.3842)             9.98     0.0016
  race_category       3       1   -0.0366     0.0605  (-0.1551, 0.0820)            0.37     0.5456
  SCPER_Cat           2       1   -0.1401     0.0653  (-0.2681, -0.0121)           4.60     0.0319
  SCPER_Cat           3       1    0.0639     0.0888  (-0.1101, 0.2379)            0.52     0.4715
  MDC                 0       1  -14.8793   11329.29  (-22219.9, 22190.12)         0.00     0.9990
  MDC                 1       1    0.5801     0.1265  (0.3321, 0.8281)            21.02     <.0001
  MDC                 2       1    0.4339     0.7256  (-0.9882, 1.8560)            0.36     0.5498
  MDC                 3       1   -0.0052     0.2802  (-0.5544, 0.5440)            0.00     0.9853
  MDC                 4       1    1.2882     0.0862  (1.1192, 1.4572)           223.16     <.0001
  MDC                 6       1    0.5039     0.1093  (0.2896, 0.7182)            21.24     <.0001
  MDC                 7       1    0.9533     0.1355  (0.6878, 1.2188)            49.51     <.0001
  MDC                 8       1    0.4188     0.1683  (0.0890, 0.7487)             6.19     0.0128
  MDC                 9       1    0.1566     0.2336  (-0.3013, 0.6145)            0.45     0.5026
  MDC                 10      1   -0.4878     0.1938  (-0.8677, -0.1079)           6.33     0.0118
  MDC                 11      1    0.4772     0.1184  (0.2451, 0.7093)            16.24     <.0001
  MDC                 12      1   -0.3403     0.4266  (-1.1764, 0.4958)            0.64     0.4251
  MDC                 13      1  -14.5034   4634.019  (-9097.01, 9068.007)         0.00     0.9975
  MDC                 14      1  -12.9909   26440.74  (-51835.9, 51809.92)         0.00     0.9996
  MDC                 16      1   -0.0381     0.2338  (-0.4963, 0.4201)            0.03     0.8706
  MDC                 17      1    1.1179     0.1971  (0.7316, 1.5043)            32.16     <.0001
  MDC                 18      1    1.8836     0.1274  (1.6339, 2.1334)           218.51     <.0001
  MDC                 19      1    0.4765     0.3066  (-0.1244, 1.0774)            2.42     0.1202
  MDC                 20      1   -0.0240     0.3206  (-0.6522, 0.6043)            0.01     0.9404
  MDC                 21      1   -0.9947     0.5146  (-2.0033, 0.0139)            3.74     0.0532
  MDC                 22      1  -15.4449   5781.123  (-11346.2, 11315.35)         0.00     0.9979
  MDC                 23      1    0.4470     0.2484  (-0.0398, 0.9338)            3.24     0.0719
  MDC                 24      1  -15.7453   8026.288  (-15747.0, 15715.49)         0.00     0.9984
  MDC                 25      1    1.8943     0.3382  (1.2315, 2.5572)            31.38     <.0001
  QUAN05_ARRHYTH              1    0.3446     0.0611  (0.2248, 0.4644)            31.77     <.0001
  QUAN05_CHF                  1    0.4420     0.0666  (0.3114, 0.5725)            44.01     <.0001
  QUAN05_COAGULAT             1    0.9579     0.0909  (0.7797, 1.1362)           110.99     <.0001
  QUAN05_COPD                 1   -0.1435     0.0627  (-0.2664, -0.0205)           5.23     0.0222
  QUAN05_DEMENTIA             1    0.3263     0.1351  (0.0615, 0.5912)             5.83     0.0157
  QUAN05_DEPRESSION           1   -0.4706     0.1067  (-0.6797, -0.2614)          19.44     <.0001
  QUAN05_DIAB_CM              1   -0.0269     0.1091  (-0.2407, 0.1869)            0.06     0.8052
  QUAN05_FLUIDDIS             1    0.8728     0.0561  (0.7630, 0.9827)           242.34     <.0001
  QUAN05_HYPER_CM             1   -0.2746     0.1119  (-0.4939, -0.0553)           6.02     0.0141
  QUAN05_HYPER_NC             1   -0.4691     0.0580  (-0.5829, -0.3553)          65.32     <.0001
  QUAN05_LIVER                1    0.6980     0.1054  (0.4915, 0.9045)            43.89     <.0001
  QUAN05_MALIGNANT            1    0.8289     0.1452  (0.5443, 1.1135)            32.59     <.0001
  QUAN05_METASTIC             1    1.0994     0.0941  (0.9149, 1.2838)           136.44     <.0001
  QUAN05_MI                   1    0.7861     0.0852  (0.6191, 0.9530)            85.20     <.0001
  QUAN05_NEUROL               1    0.7515     0.0889  (0.5772, 0.9258)            71.39     <.0001
  QUAN05_NOMETAST             1   -0.2692     0.1566  (-0.5761, 0.0377)            2.96     0.0856
  QUAN05_PARALY               1    0.8250     0.1445  (0.5417, 1.1083)            32.57     <.0001
  QUAN05_PEPTICULCER          1    0.0925     0.1810  (-0.2623, 0.4473)            0.26     0.6094
  QUAN05_PULMCIRC             1    0.5530     0.1181  (0.3216, 0.7845)            21.93     <.0001
  QUAN05_RENALDIS             1    0.3938     0.1007  (0.1964, 0.5912)            15.29     <.0001
  QUAN05_SEVLIVER             1    0.9330     0.1505  (0.6381, 1.2279)            38.45     <.0001
  QUAN05_WEIGHTLOSS           1    0.7331     0.0982  (0.5405, 0.9256)            55.68     <.0001
  Source_cat          1D      1    0.6505     0.1236  (0.4083, 0.8928)            27.70     <.0001
  Source_cat          1G      1    1.0222     0.3633  (0.3101, 1.7343)             7.92     0.0049
  Source_cat          1K      1    0.1565     0.2098  (-0.2548, 0.5677)            0.56     0.4559
  Source_cat          1P      1   -0.2980     0.0562  (-0.4082, -0.1878)          28.09     <.0001
  Source_cat          1T      1  -16.0848   1827.756  (-3598.42, 3566.251)         0.00     0.9930
  Source_cat          2A      1   -0.7725     0.6927  (-2.1302, 0.5853)            1.24     0.2648
  Source_cat          3A      1    0.5727     0.3261  (-0.0664, 1.2118)            3.08     0.0790
  Source_cat          3B      1    0.3937     0.2605  (-0.1169, 0.9044)            2.28     0.1307
  ICU_DIRECT_ADMIT            1    0.9608     0.0579  (0.8472, 1.0743)           274.99     <.0001
  Scale                       0    1.0000     0.0000  (1.0000, 1.0000)

NOTE: The scale parameter was held fixed.

The GENMOD Procedure

Model Information
  Data Set            MYDATA.VERIFICATION
  Distribution        Binomial
  Link Function       Logit
  Dependent Variable  allcause_readmit_flag

  Number of Observations Read  60000
  Number of Observations Used  57732
  Number of Events              8904
  Number of Trials             57732
  Missing Values                2268

PROC GENMOD is modeling the probability that allcause_readmit_flag='1'.
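The fit criteria and Wald intervals in these listings follow the standard maximum-likelihood definitions. As an illustrative cross-check (a sketch, not part of the original analysis), the readmission model's reported criteria can be reproduced from its log likelihood (-24108.8967), its 76 parameters with DF = 1, and its 57,732 used observations, and any listed confidence limit can be recovered from its estimate and standard error:

```python
import math

# Values reported by PROC GENMOD for the 30-day readmission model:
log_likelihood = -24108.8967
k = 76       # parameters estimated (rows with DF = 1)
n = 57732    # number of observations used

# Standard definitions of the "Criteria For Assessing Goodness Of Fit".
aic = -2 * log_likelihood + 2 * k                      # ~48369.7934 as listed
aicc = aic + (2 * k * (k + 1)) / (n - k - 1)           # ~48369.9964 as listed
bic = -2 * log_likelihood + k * math.log(n)            # ~49051.0244 as listed

# Wald 95% limits for one coefficient (QUAN05_FLUIDDIS: 0.1736, SE 0.0307).
z = 1.959964  # 97.5th percentile of the standard normal
lo, hi = 0.1736 - z * 0.0307, 0.1736 + z * 0.0307      # (0.1134, 0.2338) as listed
```

The agreement (to listing-level rounding) confirms the column alignment of the reconstructed tables.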
Criteria For Assessing Goodness Of Fit

  Criterion                     Value
  Log Likelihood          -24108.8967
  Full Log Likelihood     -24108.8967
  AIC  (smaller is better) 48369.7934
  AICC (smaller is better) 48369.9964
  BIC  (smaller is better) 49051.0244

Analysis Of Maximum Likelihood Parameter Estimates

  Parameter           Level  DF  Estimate  Std Error  Wald 95% Conf Limits    Chi-Square  Pr > ChiSq
  Intercept                   1   -2.2763     0.0709  (-2.4153, -2.1373)        1030.62     <.0001
  age_cat             2       1    0.0151     0.0847  (-0.1508, 0.1811)            0.03     0.8581
  age_cat             3       1    0.1057     0.0742  (-0.0397, 0.2511)            2.03     0.1542
  age_cat             4       1    0.1041     0.0693  (-0.0316, 0.2399)            2.26     0.1328
  age_cat             5       1    0.1144     0.0701  (-0.0230, 0.2518)            2.66     0.1028
  age_cat             6       1    0.1127     0.0745  (-0.0333, 0.2586)            2.29     0.1303
  age_cat             7       1    0.1803     0.0734  (0.0364, 0.3241)             6.03     0.0141
  age_cat             8       1    0.1409     0.0737  (-0.0035, 0.2853)            3.66     0.0559
  age_cat             9       1    0.1580     0.0736  (0.0137, 0.3022)             4.60     0.0319
  age_cat             10      1    0.0732     0.0785  (-0.0806, 0.2270)            0.87     0.3508
  race_category       1       1   -0.0753     0.1418  (-0.3532, 0.2027)            0.28     0.5956
  race_category       2       1   -0.0332     0.0341  (-0.1000, 0.0336)            0.95     0.3299
  race_category       3       1   -0.0955     0.0269  (-0.1481, -0.0428)          12.63     0.0004
  SCPER_Cat           2       1   -0.0443     0.0280  (-0.0991, 0.0105)            2.51     0.1129
  SCPER_Cat           3       1    0.0301     0.0405  (-0.0493, 0.1096)            0.55     0.4573
  MDC                 0       1    0.8129     1.1643  (-1.4691, 3.0949)            0.49     0.4851
  MDC                 1       1   -0.0632     0.0548  (-0.1706, 0.0442)            1.33     0.2488
  MDC                 2       1   -0.4100     0.2814  (-0.9616, 0.1416)            2.12     0.1452
  MDC                 3       1   -0.0679     0.1013  (-0.2664, 0.1306)            0.45     0.5026
  MDC                 4       1    0.1347     0.0401  (0.0561, 0.2133)            11.28     0.0008
  MDC                 6       1    0.1838     0.0440  (0.0976, 0.2700)            17.47     <.0001
  MDC                 7       1    0.5372     0.0612  (0.4173, 0.6572)            77.04     <.0001
  MDC                 8       1    0.0948     0.0644  (-0.0315, 0.2210)            2.16     0.1413
  MDC                 9       1   -0.0010     0.0707  (-0.1397, 0.1377)            0.00     0.9888
  MDC                 10      1    0.0483     0.0618  (-0.0729, 0.1694)            0.61     0.4349
  MDC                 11      1    0.0895     0.0526  (-0.0136, 0.1926)            2.90     0.0888
  MDC                 12      1   -0.1340     0.1459  (-0.4200, 0.1520)            0.84     0.3584
  MDC                 13      1   -1.1449     1.0214  (-3.1467, 0.8570)            1.26     0.2623
  MDC                 14      1  -17.1240   16037.12  (-31449.3, 31415.06)         0.00     0.9991
  MDC                 16      1    0.3399     0.0781  (0.1869, 0.4930)            18.95     <.0001
  MDC                 17      1    1.0246     0.0899  (0.8484, 1.2007)           129.93     <.0001
  MDC                 18      1    0.1579     0.0986  (-0.0354, 0.3511)            2.56     0.1094
  MDC                 19      1   -0.1107     0.1158  (-0.3375, 0.1162)            0.91     0.3391
  MDC                 20      1    0.0253     0.0867  (-0.1447, 0.1953)            0.09     0.7706
  MDC                 21      1   -0.1533     0.1166  (-0.3818, 0.0752)            1.73     0.1884
  MDC                 22      1  -17.4183   3647.222  (-7165.84, 7131.005)         0.00     0.9962
  MDC                 23      1    0.1557     0.0948  (-0.0302, 0.3415)            2.70     0.1006
  MDC                 24      1    1.1641     0.6955  (-0.1990, 2.5273)            2.80     0.0942
  MDC                 25      1    0.5650     0.2147  (0.1443, 0.9857)             6.93     0.0085
  QUAN05_AIDS                 1    0.2513     0.1434  (-0.0297, 0.5324)            3.07     0.0797
  QUAN05_ARRHYTH              1    0.0898     0.0303  (0.0305, 0.1492)             8.81     0.0030
  QUAN05_ARTHRIT              1    0.2208     0.0880  (0.0484, 0.3932)             6.30     0.0121
  QUAN05_CHF                  1    0.3378     0.0319  (0.2752, 0.4004)           111.81     <.0001
  QUAN05_COAGULAT             1    0.0383     0.0662  (-0.0914, 0.1680)            0.33     0.5631
  QUAN05_COPD                 1    0.1166     0.0283  (0.0612, 0.1720)            17.01     <.0001
  QUAN05_DEFICANEM            1    0.1432     0.0526  (0.0401, 0.2463)             7.41     0.0065
  QUAN05_DIAB_CM              1    0.2370     0.0445  (0.1498, 0.3242)            28.36     <.0001
  QUAN05_DIAB_NC              1    0.0213     0.0271  (-0.0319, 0.0745)            0.62     0.4327
  QUAN05_FLUIDDIS             1    0.1736     0.0307  (0.1134, 0.2338)            31.95     <.0001
  QUAN05_LIVER                1    0.2036     0.0509  (0.1038, 0.3033)            16.00     <.0001
  QUAN05_MALIGNANT            1    0.4719     0.0374  (0.3986, 0.5452)           159.30     <.0001
  QUAN05_METASTIC             1    0.3036     0.0572  (0.1915, 0.4156)            28.17     <.0001
  QUAN05_MI                   1    0.1604     0.0459  (0.0704, 0.2504)            12.21     0.0005
  QUAN05_OBESITY              1   -0.0758     0.0556  (-0.1847, 0.0332)            1.86     0.1729
  QUAN05_PVD                  1    0.2026     0.0432  (0.1178, 0.2873)            21.95     <.0001
  QUAN05_RENALDIS             1    0.2547     0.0333  (0.1893, 0.3200)            58.37     <.0001
  QUAN05_SEVLIVER             1    0.2378     0.0926  (0.0563, 0.4192)             6.60     0.0102
  QUAN05_WEIGHTLOSS           1    0.1562     0.0631  (0.0325, 0.2799)             6.13     0.0133
  Source_cat          1D      1    0.4320     0.0991  (0.2377, 0.6262)            19.00     <.0001
  Source_cat          1G      1   -0.0608     0.3023  (-0.6532, 0.5316)            0.04     0.8406
  Source_cat          1K      1   -0.1056     0.1180  (-0.3368, 0.1256)            0.80     0.3707
  Source_cat          1P      1    0.0350     0.0241  (-0.0123, 0.0822)            2.11     0.1466
  Source_cat          1T      1   -0.2121     0.2410  (-0.6844, 0.2603)            0.77     0.3789
  Source_cat          2A      1   -0.5704     0.2132  (-0.9881, -0.1526)           7.16     0.0075
  Source_cat          3A      1    0.1394     0.1906  (-0.2342, 0.5130)            0.54     0.4645
  Source_cat          3B      1   -0.0115     0.1549  (-0.3151, 0.2921)            0.01     0.9408
  Disto_cat           -3      1    0.4225     0.0731  (0.2793, 0.5657)            33.42     <.0001
  Disto_cat           -2      0    0.0000     0.0000  (0.0000, 0.0000)               .          .
  Disto_cat           0       1    0.7733     0.0809  (0.6148, 0.9317)            91.47     <.0001
  Disto_cat           3       1    1.2723     0.9261  (-0.5427, 3.0874)            1.89     0.1695
  Disto_cat           4       0    0.0000     0.0000  (0.0000, 0.0000)               .          .
  Disto_cat           5       1    0.0350     0.0614  (-0.0853, 0.1554)            0.33     0.5682
  Disto_cat           7       1    0.1619     0.1012  (-0.0365, 0.3602)            2.56     0.1097
  Disto_cat           11      1   -0.2720     0.1689  (-0.6031, 0.0592)            2.59     0.1074
  Disto_cat           17      1    0.5254     0.6556  (-0.7596, 1.8104)            0.64     0.4229
  Disto_cat           22      1    0.1129     0.2402  (-0.3579, 0.5836)            0.22     0.6384
  Disto_cat           25      1    1.0131     0.8705  (-0.6931, 2.7193)            1.35     0.2445
  Disto_cat           30      1   -1.7497     1.0414  (-3.7909, 0.2914)            2.82     0.0929
  Scale                       0    1.0000     0.0000  (1.0000, 1.0000)

APPENDIX C – FACILITY PERFORMANCE BY SIZE AND REGION

Columns in each table: LOS, 30-Day Readmission, 30-Day Mortality, In-Hospital Mortality, Noon Discharge. Rows are the facility performance categories A through D.2; the original tables group them under the side labels Improve, No Sustain Benefit, and No Change.

Large (N = 16)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            7              4             5              2                4
  A.1          0              0             3              0                0
  A.2          2              2             0              0                0
  B.1          1              0             0              0                0
  B.2          0              4             0              0                0
  B.3          2              1             2              0                1
  C.1          0              1             1              0                1
  C.2          1              0             0              1                0
  C.3          1              1             0              3                0
  D.1          1              2             4              3                4
  D.2          1              1             1              7                6

Medium (N = 60)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A           11             10            34             17               11
  A.1          1              2             2              3                1
  A.2          1              3             1              0                0
  B.1          0              3             2              0                1
  B.2          0              9             0              2                0
  B.3          6             10             3              3                5
  C.1         12              7             3              1                5
  C.2          3              0             0              3                3
  C.3          1              0             1              2                1
  D.1         19              9             8             10               16
  D.2          6              7             6             19               17

Small (N = 54)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A           12             11            39             13               13
  A.1          1              1             0              0                1
  A.2          1              3             1              1                1
  B.1          3              0             1              0                3
  B.2          0              4             0              0                0
  B.3          6             10             1              1                5
  C.1          4              5             0              3                1
  C.2          4              3             0              3                3
  C.3          1              2             2              3                3
  D.1         16             12             7             14                8
  D.2          6              3             4             16               16

Northeast (N = 23)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            6              7            12              4                7
  A.1          0              0             2              0                1
  A.2          0              1             0              0                0
  B.1          1              0             1              0                3
  B.2          0              1             0              0                0
  B.3          2              3             1              0                0
  C.1          4              4             0              1                1
  C.2          1              0             1              2                0
  C.3          1              0             0              3                2
  D.1          7              5             2              6                5
  D.2          1              2             4              7                4

Southeast (N = 26)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            7              4            12              4                3
  A.1          1              1             0              1                1
  A.2          0              1             1              0                0
  B.1          0              2             1              0                0
  B.2          0              5             0              1                0
  B.3          4              2             2              1                1
  C.1          7              3             1              0                2
  C.2          1              1             0              2                3
  C.3          0              1             0              1                1
  D.1          4              4             7              9                7
  D.2          2              2             2              7                8
Central (N = 25)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            2              3            14              9                6
  A.1          1              0             0              1                0
  A.2          1              1             1              0                1
  B.1          0              0             0              0                0
  B.2          0              6             0              0                0
  B.3          2              3             1              2                4
  C.1          4              1             2              0                2
  C.2          2              1             0              1                1
  C.3          1              1             1              1                0
  D.1          8              7             4              1                3
  D.2          4              2             2             10                8

Midwest (N = 29)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            8              5            22              6                7
  A.1          0              2             3              1                0
  A.2          1              2             0              1                0
  B.1          1              0             1              0                1
  B.2          0              4             0              1                0
  B.3          4              6             1              0                4
  C.1          0              4             0              2                0
  C.2          2              1             0              1                2
  C.3          0              0             0              0                0
  D.1          9              2             1              7                6
  D.2          4              3             1             10                9

West (N = 27)
  Category   LOS   30-Day Readm   30-Day Mort   In-Hosp Mort   Noon Discharge
  A            7              6            18              9                5
  A.1          0              0             0              0                0
  A.2          2              3             0              0                0
  B.1          2              1             0              0                0
  B.2          0              1             0              0                0
  B.3          2              7             1              1                2
  C.1          1              1             1              1                2
  C.2          2              0             0              1                0
  C.3          1              1             0              3                1
  D.1          8              5             5              4                7
  D.2          2              2             2              8               10

APPENDIX D – FULL VARIABLE LISTS

• Wards
  o Telemetry
  o Step Down
  o Respiratory
  o Medicine-ECG
  o Medicine
  o Surgery-ECG
  o Surgery
  o Combined Medicine & Surgery
• Sufficient Staff
  o Registered nurses (Clinical)
  o Clinical nurse specialists (Clinical)
  o Radiology technologists (Support)
  o Laboratory technologists (Support)
  o Clinical pharmacists (Clinical)
  o IRM or CPRS technical support staff (Support)
  o Clinical Applications Coordinators (CACs) (Support)
• Barriers to Improvement
  o Insufficient numbers of specialists in target acute care conditions
  o Insufficient numbers of skilled inpatient nurses
  o Insufficient numbers of administrative and support staff
• Inpatient Resources
  o Clinical resources (number of beds) (Space)
  o Administrative space (Space)
  o Computers or workstations on the units (Technology)
  o CPRS training time for basic functions (Technology)
  o CPRS training time for advanced functions (Technology)
  o CPRS training time for non-clinical staff (Technology)
  o General (non-CPRS) (Technology)
  o Access to medical informatics expertise (Technology)
  o Availability of QI / performance measurement-related training
• Communication & Cooperation
  o Effective communication between physicians and senior admin
  o Effective communication between
physicians and nurses
  o Cooperation between departments
• Performance Monitoring
  o Hospital admission rates
  o Bed days of care (# of hospital days / 1000 uniques)
  o Hospital readmission rates
  o Hospital mortality rate
  o Number of emergency room visits
  o Subspecialty consult turnaround time
• Monitoring Level
  o 1 = Facility Level only
  o 2 = Clinic Level only
  o 3 = Provider only
  o 4 = Facility & Clinic
  o 5 = Facility & Provider
  o 6 = Clinic & Provider
  o 7 = All 3 Levels
• Utilization Review (review for appropriateness)
  o Acute care admissions
  o Non-VA care admission paid by your VA
  o Concurrent inpatient stays
• Clinical order sets
  o Community acquired pneumonia
  o Congestive heart failure exacerbations
  o Gastrointestinal bleeds
  o Diabetic ketoacidosis
  o Gastrointestinal bleed prophylaxis
  o Deep venous thrombosis prophylaxis
  o Pain Management
  o Heparin dosing
• ICU Evidence Bundles
  o Myocardial Infarction
  o Ventilator Associated Pneumonia
  o Glycemic Control
  o Weight Based Heparin
  o Sedation
  o Ventilator Weaning
  o Severe Alcohol Withdrawal
  o GI Prophylaxis
  o Severe Sepsis
  o Catheter Related Blood Stream Infection (CRBSI)
  o Other
• Clinical Practice Guideline Adherence
  o Disease
    - Acute myocardial infarction
    - Congestive heart failure
    - Community-acquired pneumonia
  o Method
    - Computerized reminders
    - Specialized CPRS templates
    - Performance profiling and feedback to providers
    - Incentives
    - Designated local clinical champion
    - Delegated RN for disease-specific management
    - Provider Education
• QI Information
  o VA central office directives
  o VISN-level leadership or work groups focused
  o Local healthcare system or medical center QI department
  o National or regional teleconference
  o VA or non-VA web-based resources
  o VA newsletters or other literature
  o Local VA or non-VA conferences or seminars
• Driving Force
  o Overseeing task forces or work groups focused on specific VA performance measures
  o Arranging educational activities related to performance improvement
  o Arranging provider education regarding clinical practice guidelines
  o Arranging staff education in QI methods
  o Providing statistical analysis on VA facility performance
  o Provide technical consultation and support of template development
• Clinical Reminders
  o Informal discussions between providers and clinical application coordinators (Development)
  o Requests to provider experts for clinical opinion (Development)
  o Formal input from relevant clinical departments (Development)
  o Committees for review of the research evidence (Development)
  o Test piloting reminders prior to full scale implementation (Development)
  o Post-implementation assessment of provider satisfaction (Post)
  o Formal evaluation of reminder usability (human factors) (Development)
  o Analysis of reminder impact on performance improvement (Post)
• Performance Improvement
  o Established teams to work broadly on VA performance measures (Establish)
  o Established teams to work on specific disease / conditions (Establish)
  o Established teams to work on specific VA performance measures (Establish)
  o Implemented a program or activities focused on enhancing a cooperative culture
  o Reallocated financial resources to focus on improving a specific performance measure (Shift)
  o Shifted staff from one part of the facility to another to improve performance at a specific department or clinic (Shift)
  o Actively partnered high- and low-performing clinics to improve one or more performance measures
  o Designated a site champion for specific clinical guidelines or performance measures
  o Monitored the pace at which guidelines were implemented
  o Provided visible support for clinical guideline implementation
  o Fostered collaboration among facilities in guideline implementation
• Guideline Implementation
  o Teamwork exists to implement guidelines
  o Key implementation steps planned (Implementation)
  o Implementation steps monitored (Implementation)
  o Resistance from physicians (Resistance)
  o Resistance from nurses (Resistance)
  o Resistance
from other providers (Resistance)
• Clinical Champions
  o Time constraints
  o Lack of interest in the topic
  o Trust and respect
  o Protected time
  o Maintain through the duration of a project
  o Replace a departing champion
• Facility Environment
  o Foster flexibility
  o Emphasize participative decision-making
  o Sufficient financial support
  o Sufficient personnel support
• Performance Awards
  o Monetary incentives
  o Ceremonial awards
  o Perks (i.e. parking, additional annual leave)
  o Other
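Two of the survey items above imply simple computations. The following is a hypothetical sketch (the function names are illustrative; only the codes and definitions come from the lists above): the 1-7 "Monitoring Level" code combines which levels are monitored, and "Bed days of care" is hospital days per 1,000 unique patients.

```python
# Hypothetical helper: encode which levels a facility monitors
# (facility, clinic, provider) as the survey's 1-7 Monitoring Level code.
def monitoring_level(facility, clinic, provider):
    codes = {
        (True,  False, False): 1,  # Facility Level only
        (False, True,  False): 2,  # Clinic Level only
        (False, False, True):  3,  # Provider only
        (True,  True,  False): 4,  # Facility & Clinic
        (True,  False, True):  5,  # Facility & Provider
        (False, True,  True):  6,  # Clinic & Provider
        (True,  True,  True):  7,  # All 3 Levels
    }
    return codes.get((facility, clinic, provider))  # None if nothing monitored

# "Bed days of care": # of hospital days / 1000 uniques.
def bed_days_of_care(total_hospital_days, unique_patients):
    return total_hospital_days / (unique_patients / 1000)

print(monitoring_level(True, False, True))       # -> 5
print(bed_days_of_care(45000, 60000))            # -> 750.0
```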