
University of Iowa
Iowa Research Online
Theses and Dissertations
Spring 2013
A critical evaluation of healthcare quality
improvement and how organizational context
drives performance
Justin Mathew Glasgow
University of Iowa
Copyright 2013 Justin Mathew Glasgow
This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2503
Recommended Citation
Glasgow, Justin Mathew. "A critical evaluation of healthcare quality improvement and how organizational context drives performance."
PhD (Doctor of Philosophy) thesis, University of Iowa, 2013.
http://ir.uiowa.edu/etd/2503.
A CRITICAL EVALUATION OF HEALTHCARE QUALITY
IMPROVEMENT AND HOW ORGANIZATIONAL CONTEXT DRIVES
PERFORMANCE
by
Justin Mathew Glasgow
An Abstract
Of a thesis submitted in partial fulfillment of the requirements
for the Doctor of Philosophy degree in Epidemiology
in the Graduate College of The University of Iowa
May 2013
Thesis Supervisor: Associate Professor Peter J. Kaboli
ABSTRACT
This thesis explored healthcare quality improvement, considering the
general question of why the last decade’s worth of quality improvement (QI) had
not significantly improved quality and safety. The broad objective of the thesis
was to explore how hospitals perform when completing QI projects and whether
any organizational characteristics were associated with that performance.
First the project evaluated a specific QI collaborative undertaken in the
Veterans Affairs (VA) healthcare system. The goal of the collaborative was to
improve patient flow throughout the entire care process leading to shorter
hospital length of stay (LOS) and an increased percentage of patients discharged
before noon. These two goals became the primary outcomes of the analysis,
which were balanced by three secondary quality check outcomes: 30-day
readmission, in-hospital mortality, and 30-day mortality.
The analytic model consisted of a five-year interrupted time-series analysis
examining baseline performance (two years prior to the intervention), the year
of the QI collaborative, and the two years after the intervention to determine
how well improvements were maintained post-intervention. The results of these
models were then used to create a novel 4-level classification model. Overall, the
analysis indicated a significant amount of variation in performance; however, subgroup analyses could not identify any patterns among hospitals falling into
specific performance categories.
Given this potentially meaningful variation, the second half of the thesis
worked to understand whether specific organizational characteristics provided
support or acted as key barriers to QI efforts. The first step in this process
involved developing an analytic model to describe how various categories of
organizational characteristics interacted to create an environment that modified
how a QI collaborative produced measurable outcomes. This framework was then
tested using a collection of variables extracted from two surveys, the categorized
hospital performance from part one, and data mining decision trees. Although the
results did not identify any strong associations between QI performance and
organizational characteristics, the analysis generated a number of interesting
hypotheses and some mild support for the developed conceptual model.
Overall, this thesis generated more questions than it answered. Even so,
it made three key contributions to the field of healthcare QI. First,
this thesis represents the most thorough comparative analysis of hospital
performance on QI and was able to identify four unique hospital performance
categories. Second, the developed conceptual model represents a
comprehensive approach for considering how organizational characteristics
modify a standardized QI initiative. Third, data mining was introduced to the field
as a useful tool for analyzing large datasets and developing important
hypotheses for future studies.
Abstract Approved: _______________________________________________
Thesis Supervisor
Associate Professor, Department of Internal Medicine
Title and Department
October 4, 2011
Date
A CRITICAL EVALUATION OF HEALTHCARE QUALITY
IMPROVEMENT AND HOW ORGANIZATIONAL CONTEXT DRIVES
PERFORMANCE
by
Justin Mathew Glasgow
A thesis submitted in partial fulfillment of the requirements
for the Doctor of Philosophy degree in Epidemiology in
the Graduate College of The University of Iowa
May 2013
Thesis Supervisor: Associate Professor Peter J. Kaboli
Graduate College
The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL

PH.D. THESIS

This is to certify that the Ph.D. thesis of

Justin Mathew Glasgow

has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Epidemiology at the May 2013 graduation.

Thesis Committee:
Peter Kaboli, Thesis Supervisor
James Torner
Elizabeth Chrischilles
Ryan Carnahan
Jason Hockenberry
Jill Scott-Cawiezell
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 – INTRODUCTION
    Study Overview
    Summary
CHAPTER 2 – QUALITY IMPROVEMENT COLLABORATIVES
    The Collaborative Approach to Quality
    Flow Improvement Inpatient Initiative (FIX)
    FIX Analysis Overview
    Conclusions
CHAPTER 3 – TIME-SERIES METHODS
    Data Sources
    Data Elements
    Patient Cohort
    Risk Adjustment
    Time-Series Model
    Improvement and Sustainability
    Sub-group Analyses
    Conclusions
CHAPTER 4 – TIME-SERIES RESULTS AND DISCUSSION
    System-Wide Analysis
    Facility Analysis
    Evaluation of the Specific Aims
    Discussion
    Limitations
    Conclusions
CHAPTER 5 – SUPPORTING QUALITY IMPROVEMENT
    Relationships with Healthcare Quality
    Relationships with Quality Improvement Efforts
    Analytic Framework
    Conclusions
CHAPTER 6 – ANALYTIC VARIABLES AND DATA MINING
    Organizational Characteristics in VA
    VA Hospital Organizational Context
    Data Mining Overview
    Decision Tree Development
    Decision Tree Interpretation
    Conclusions
CHAPTER 7 – DECISION TREE RESULTS AND DISCUSSION
    Decision Tree Performance Metrics
    Individual Decision Trees
    Discussion
    Interpreting the Analytic Framework
    Limitations
    Conclusions
CHAPTER 8 – SUMMARY AND FUTURE WORK
    Project Summary
    Human Factors and Change Management
    Recommendations for Improving QI
    Future Studies
    Conclusions
APPENDIX A – RISK ADJUSTMENT MODEL SAS CODE
APPENDIX B – SAS OUTPUT FOR RISK ADJUSTMENT
APPENDIX C – FACILITY PERFORMANCE BY SIZE AND REGION
APPENDIX D – FULL VARIABLE LISTS
REFERENCES
LIST OF TABLES

Table 2-1: Reported calculation of cost savings from FIX
Table 3-1: List of Outcome Measures
Table 3-2: Comparison of risk adjustment cohort to all other FY07 discharges
Table 3-3: List of potential risk adjustment variables, the number of discrete categories, and a description of how categories were defined
Table 3-4: Modeling of age risk adjustment categories
Table 3-5: Modeling of race risk adjustment categories
Table 3-6: Modeling of service connected risk adjustment categories
Table 3-7: Modeling of admission source risk adjustment categories
Table 3-8: Modeling of place of discharge risk adjustment categories
Table 3-9: Highly correlated risk adjustment variables
Table 3-10: Description of full classification categories
Table 4-1: Hospital classification across the 5 outcome measures (N = 130)
Table 4-2: LOS Improvers classification (N = 45)
Table 4-3: Discharge before noon Improvers classification (N = 60)
Table 4-4: P-values from chi-square tests examining facility performance in sub-groups by size and regional location
Table 6-1: Categories for different response scales in the CPOS survey
Table 6-2: Variables measuring facility structure
Table 6-3: Variables measuring QI structure
Table 6-4: Calculated and composite measures of QI structure
Table 6-5: Variables measuring QI process
Table 6-6: Calculated and composite measures of QI process
Table 6-7: Point ranges for composite model classification
Table 7-1: Data mining sample performance classifications (N = 100)
Table 7-2: Decision tree performance metrics
Table 7-3: Count of factors in each of the decision trees
Table 7-4: List of individual and composite variables in the decision trees
LIST OF FIGURES

Figure 2-1: Model of the IHI BTS Collaborative timeline for FY07 FIX
Figure 3-1: Decision tree used to classify hospital performance
Figure 4-1: Aggregate results for LOS (FY05 - FY09)
Figure 4-2: Aggregate results for in-hospital mortality (FY05 - FY09)
Figure 4-3: Aggregate results for 30-day mortality (FY05 - FY09)
Figure 4-4: Aggregate results for discharges before noon (FY05 - FY09)
Figure 4-5: Aggregate results for 30-day readmissions (FY05 - FY09)
Figure 5-1: Analytic framework for how organizational context impacts QI
Figure 7-1: Full decision tree for LOS performance
Figure 7-2: Full decision tree for discharges before noon performance
Figure 7-3: Full decision tree for LOS/Noon composite performance
Figure 7-4: Full decision tree for overall composite performance
CHAPTER 1 – INTRODUCTION
In the years since the Institute of Medicine (IOM) reported that as many as
98,000 people die each year as a result of medical errors,1 the healthcare
community has been focused on efforts to improve quality, efficiency, and safety.
While considerable effort has gone into improving healthcare quality, broad
measures do not show the expected improvements. One
common monitor of quality is the National Healthcare Quality Report (NHQR),
which tracks annual performance on several quality measures. In 2008, the
report found only a 1.4% average annual increase across all quality measures,
with a concomitant 0.9% average annual decrease in scores on patient
safety measures.2 The 2009 report continued the theme, noting that while it was
possible to identify small pockets of success, the overall variability across the
healthcare industry was too great to claim any success in improving quality and
safety.3
Further confirming the lack of improvement in quality and safety was a
recent review of patient medical records by the Centers for Medicare and
Medicaid Services (CMS). The review evaluated records of 780 Medicare
beneficiaries recently discharged from a hospital and found that 13.5%
experienced an adverse event during their hospital stay.4 Further, an expert panel
review of these adverse events determined that 44% of the events were clearly
or likely preventable.4 Taken together, the NHQR reports and the CMS chart
reviews suggest a disconnect between what quality improvement (QI) efforts
report in the literature and their actual success. The broad driving force behind
the research reported in this thesis is to understand potential causes for this
disconnect and to explore possible modifications to the healthcare environment
that will support and increase the probability of successful QI in the future.
Two theories have been particularly instructive in approaching and
understanding why individual reports of successful QI projects may not translate
into widespread improvements in quality. First, human factors theory advocates
that when designing a device or a process, careful attention must be paid to how
innate limitations of human physical and mental capabilities will impact how
people interact with the device or process.5 This concept means that even the
greatest of technological solutions can be unsuccessful if people cannot
successfully interact with the system. Building from this idea, a potential
hypothesis for why there is little overall improvement in quality is that many QI
projects propose and implement solutions that may impose too much additional
cognitive burden on those tasked with providing high quality care. In this
situation, there may be initial success as excitement and energy related to the
project are sufficient to overcome the additional cognitive burden. However, as
time passes and the improvements become less of a focus, there is a reduction in
task-specific energy. This eventually leads to a point where the additional
cognitive burden becomes overwhelming and performance begins to decline.
This sort of process would suggest a QI project that initially appears successful
but over time cannot sustain performance, resulting in a slow decline in quality,
likely back to baseline, as providers abandon the new solution for their
original process.
The other instructive theory for understanding the quality disconnect was
change management theory. This theory acknowledges that going through and
accepting change is a difficult and emotional process that people often resist.6
This theory suggests that even if a QI effort is technically correct from a human
factors perspective, resistance to change from healthcare providers could still
result in an unsuccessful QI project. This sort of change resistance could help
identify why a successful QI project at one hospital is not successful when
translated to other settings. Without the correct institutional QI culture or change
management process, QI projects will not sustain their improvements and likely
will have difficulty achieving even initial improvements.
As a concrete example of how QI solutions may fail to consider the cognitive
or emotional hurdles involved in improving and sustaining quality, consider that
many projects rely predominantly on provider education as the main
component of the intervention. In the standard approach, providers are gathered
together in a meeting room or lecture hall where someone presents them with a
problem, for example a growing backlog of patients waiting to be admitted from
the emergency department each afternoon. Having established the problem, the
speaker asks the group to improve quality by increasing the number of inpatient
discharges that occur before noon. After discussions and pushback from the
audience, the presenter wraps up the presentation hoping the group is energized
and ready to go fix the problem.
This approach has a number of short- and long-term problems that
impact the likelihood of lasting improvements in quality. Perhaps the biggest
barrier to success in this situation is the feasibility of achieving what the speaker
proposes. Mornings are generally a busy time for physicians and nurses as they
go through their rounds, provide care, and make plans for the rest of the day.
This period may already be so busy that adding the extra cognitive task of
planning and taking care of a patient discharge may not be feasible. Combine
this cognitive difficulty with various emotional reactions, such as change
avoidance, denial of the problem (or blaming others), or just simple avoidance,
and this intervention would be lucky to produce an initial improvement; it
certainly will not lead to sustained improvements.
While this is an interesting theoretical example, the real question is
whether actual QI projects generate and sustain improvements in quality
across multiple healthcare settings. Unfortunately, the current QI literature
predominantly focuses on case reports that describe projects in a single setting
and do not provide the in-depth project evaluation necessary to fully understand
QI in healthcare. Even systematic reviews of QI have a hard time reaching
definitive conclusions, as they generally find that project evaluations are not
methodologically sound and so cannot establish whether improvements in quality
occurred or, when improvements were present, whether they were causally
related to the QI effort.7-9 With so little focus on establishing whether
interventions create initial results, it is no surprise that few reports broach the
subject of sustained quality or present any data covering the period after
project completion.
Since those initial reviews, two approaches to quality improvement, Lean
and Six Sigma, have become increasingly popular in healthcare. These two
approaches are important because both include, as part of the process, a
specific emphasis on sustaining improvements after initial project completion.
However, a recent systematic review of these two
approaches found that few articles discussed whether project interventions led to
sustained improvements.10 Of the few cases that discussed sustained
improvements, two were particularly informative about the challenges healthcare
faces as it works towards sustaining QI.
The first case involved an intervention targeted to reduce nosocomial
urinary tract infections (UTI) using the more general approach of nursing staff
education and training.11 The initial effort resulted in a steady decrease in the
number of UTIs recorded, which lasted for about a year after the intervention.
After that year, however, rates slowly began to rise, erasing the initial
improvements and eventually producing the highest quarterly UTI rate observed
in a 4-year period. Because the unit was monitoring its UTI rates, it responded
to the increase with another round of staff education, which led to at least a
temporary reduction in rates. This QI initiative mimics the prior
theoretical example and highlights that relying solely on provider education is
unlikely to produce sustained improvements in quality. While the root cause
behind the loss of quality was not discussed in the article, there certainly could
have been emotional or cognitive challenges that contributed to the nurses’
inability to maintain low UTI rates.
In contrast, the second case focused on reducing catheter-related
bloodstream infections (CRBSI) by identifying process changes that would not
only improve quality but also reduce provider cognitive burden.
The solutions in this project involved developing a system to
monitor catheter dwell time, as well as the creation of a catheter insertion kit that
ensured all materials were immediately available in one area.12 This change
reduced provider burden in two ways. First, by creating a method for monitoring
and alerting providers about catheter dwell time, providers did not have to
remember when a catheter was inserted and whether it was time to be changed
or removed. Instead, they would receive a reminder when action was
appropriate. Second, by creating a procedure kit there was no longer the burden
of searching for necessary components in a time-pressed environment. Any time
a catheter needed to be placed, only one item, the kit, needed to be located, and
then everything necessary for high-quality care was available.
Even though these were effective changes, long-term monitoring of CRBSI
rates found a substantial spike the first winter after implementation. Review of
that increase led to the identification of a specific subset of patients with
characteristics different than those evaluated in the original project. This led to an
additional change to the process mandating the use of antibiotic coated catheters
for select subsets of patients. While this example paints a more promising picture
about the future of quality in healthcare (i.e. that well designed process
improvements can improve quality), it also reveals that fixing quality problems
may require more than a single intervention.
Study Overview
As established in the introduction, this study is driven by the apparent
disconnect between reports of successful QI efforts and the lack of measured
improvements in healthcare quality. There are likely many root causes of this
disconnect, but this study will first focus on two potential causes. First, current
evaluation approaches may overestimate how well hospitals perform on QI
efforts and stronger methodologies may identify that fewer hospitals than
expected successfully improve quality. Second, those projects that do
successfully improve quality initially may not be able to sustain results long term.
In order to explore these two areas, the first objective of this study was to
conduct an in-depth examination of whether a collection of Veterans Affairs (VA)
hospitals were able to improve and sustain quality after participating in the same
quality improvement collaborative, the Flow Improvement Inpatient Initiative
(FIX). This analysis will address the following two specific aims:
Aim 1: Determine the impact of the FIX collaborative upon quality and efficiency
as measured by LOS, percent of patients discharged before noon, in-hospital
and 30-day mortality rates, and 30-day readmission rates.
Hypothesis 1: The FIX collaborative will shorten patient LOS and
increase the percentage of patients discharged before noon. There will be no
changes in mortality or readmission rates attributable to FIX.
Aim 2: Determine whether improvements attributable to FIX are sustained post-implementation.
Hypothesis 2a: Improvements in the outcome measures will continue on
a downward slope after completion of FIX.
Hypothesis 2b: The rate of further improvements in the outcome
measures after completion of FIX will be at or below the rate of pre-FIX
improvements.
With this initial description of how well hospitals are able to improve and
sustain quality after a QI effort, the next question becomes what can be done to
increase the ability of QI to lead to sustained improvements. The goal of this
analysis is to understand whether there are any structural issues that may be
potential root-cause barriers to improvement. Therefore, the second half of this
project will focus on an effort to understand what organizational characteristics
may be associated with successful and unsuccessful QI projects. This will be
accomplished using data mining decision trees to determine which organizational
characteristics, as reported in responses to the 2007 Survey of ICUs & Acute
Inpatient Medical & Surgical care in VHA (HAIG)13 and the VA Clinical Practice
Organizational Survey (CPOS),14 are associated with different performance
classifications. This analysis meets the third specific aim of this project:
Aim 3: Describe how selected organizational structures are associated with
sustaining improvements.
Summary
The following chapters will introduce the reader to relevant portions of the
QI literature, cover the study methods, present study results, and discuss what
this means for QI efforts in healthcare. Chapter 2 begins the task of addressing
the two specific aims by discussing the collaborative approach to QI, examining
the current understanding of the approach in the literature, and exploring a
specific collaborative that served as the case study for analysis. Chapter 3
discusses the analytic methods and reasons for selecting those methods for
analyzing hospital performance during the QI collaborative. Chapter 4 concludes
the analysis by presenting and discussing the results of the analysis.
The second half of the thesis then addresses the third specific aim of the
study. Chapter 5 begins by summarizing the current literature examining how
organizational characteristics are related to quality measures and QI efforts. The
result of this discussion is the development of a new analytic framework that
guides the subsequent analysis. Chapter 6 reviews data from two surveys that
serve as the measures of organizational characteristics and then discusses how
data mining decision trees are ideal tools for modeling the relationship between
organizational characteristics and hospital performance on QI. Chapter 7
presents and discusses the results of the data mining decision trees. Finally,
Chapter 8 summarizes the findings from this thesis, overviews some
recommendations for hospitals to consider when trying to improve their success
with QI, and concludes with a discussion about future studies that will build on
this work and improve the overall understanding of how to successfully
improve and sustain quality in healthcare.
CHAPTER 2 – QUALITY IMPROVEMENT COLLABORATIVES
The goal of this chapter is to introduce the collaborative approach to
quality improvement (QI), discuss the current evaluation of the approach in the
literature and examine a specific QI collaborative. The initial introduction to
collaborative QI considers its origins and development by the Institute for
Healthcare Improvement (IHI). The IHI collaborative model prescribes a specific
approach that has been employed to tackle a broad range of QI issues. The
review of the literature evaluates the success of these approaches, the current
understanding of the strengths and weaknesses of the approach, and also
considers the strengths and weaknesses of the literature. The next section of the
chapter examines the Flow Improvement Inpatient Initiative (FIX) which
represents a specific QI collaborative undertaken in the Veteran Affairs (VA)
healthcare system. This QI collaborative serves as the case study for all the
analyses reported in this study. The review of FIX considers how it fits the IHI
collaborative model and its utility as a case study to meet the goals of this thesis
as well as to contribute knowledge to the broader literature. Lastly, the chapter
concludes with an overview of the first two specific aims of this project.
The Collaborative Approach to Quality
First conceived by Paul Batalden, MD, and refined by others at the IHI, the
QI collaborative was viewed as an effective means of overcoming a key factor
limiting improvement in healthcare quality: diffusion of knowledge.15 Batalden and the
IHI felt that for many topics there was good underlying science on what needed
to happen to improve quality, but hospitals could not implement that science in
a meaningful manner because they were unaware of it, unable to disseminate it
among employees, or lacked the resources and experience necessary to make
effective improvements. They
envisioned the QI collaborative as a process that could overcome these barriers
and lead to “breakthrough” improvements in healthcare quality, while also helping
to reduce costs.15
This thinking led to the establishment of the IHI Breakthrough Series
(BTS) collaborative, which has become the common framework for QI
collaboratives in healthcare. The general concept is to have a group of hospitals
that are interested in specific and similar quality goals work together to identify
solutions. A benefit of the collaborative format, over traditional in-house QI
efforts, is that it allows hospitals to collectively invest in relevant subject matter
experts who participate by initially training and then guiding participants through
the processes necessary to achieve change and improve quality. The
collaborative also establishes a structure through which participants at
different hospitals communicate regularly, allowing teams to serve as resources
for one another so that everyone learns effective solutions for overcoming the
inevitable obstacles that arise during a QI effort.
In the BTS model there are three learning sessions with alternating action
periods (Figure 2-1), most frequently distributed over a year but ranging
from 6 to 15 months.15 Each learning session is attended by at least three team
members from each participating institution as well as the subject matter expert.
The first learning session typically focuses on learning about the topic through
relevant training, refining the team aim, and making plans for change. Some
common focuses are learning how to use the Plan-Do-Study-Act (PDSA) change
cycle, how to develop specific and measurable aims, and how to define the ideal
state of care. The second and third learning sessions bring the teams back together to
report experiences, discuss challenges, learn from other teams, and work with QI
experts to apply additional skills. There is often also a final conclusion session
where teams review their successes and discuss any goals moving forward. The
alternating action periods are times when the teams focus on implementing
improvement projects at their facilities. During the action periods, the
participating hospitals interact with each other through conference calls,
providing regular opportunities to brainstorm solutions to any new problems.
The literature reporting on collaborative QI projects suggests the approach
can be successful in improving quality and disseminating QI across a variety of
settings. Some example collaboratives include efforts to improve chronic heart
failure (CHF) patient care,16 reduce door-to-balloon time for heart attack care,17
reduce fall-related injuries,18 and improve medication reconciliation.19 There are
also reports showing collaboratives have worked in other healthcare systems
both in developed (Holland, Norway and Australia)20-22 and developing
countries.23
While these reports state that each collaborative is a success, it is important
to note that there is variation in performance across hospitals in individual
collaboratives. There are also some potential systematic barriers that may either
prevent participation in or greatly reduce the chance of hospital success with a
collaborative. As an example, consider the efforts to improve medication
reconciliation that aimed to involve all hospitals in the state of Massachusetts.
The collaborative was able to recruit 88% of hospitals in the state, but the
non-participating facilities were clearly distinguished by their small size and often
isolated locations.19 Of the participating hospitals, only 50% succeeded in
achieving at least partial implementation of the initiatives related to improving the
medication reconciliation process. Among hospitals that did not achieve partial
implementation, frequently cited barriers to success were an inability to get
people to change the way they work, an inability to get clinician buy-in, and
overall project complexity.19 These barriers, particularly an inability to
get buy-in or get people to change the way they work, are directly related to the
change theory and human factors issues discussed as a challenge for QI in
healthcare.
Another critical consideration about the literature discussing collaboratives
was that many articles, much like the broader QI literature, used
methodologies with limited ability to establish the cause-effect
relationships needed to prove the effectiveness of collaboratives. The reports often
focused on a team’s ability to implement planned changes, as in the
Massachusetts article, but this does not speak to whether the implementation
was effective or led to any improvements in quality. Another common
assessment approach is to have the team self-report whether they felt their
efforts led to improved quality. Although the collaborative format encourages
rigorous data collection, rarely do publications include any data that would
increase the reader’s confidence that teams were truly successful.
In short, the assessments of collaboratives make it difficult to quantify
what measurable improvements in quality a collaborative achieved and, further,
which actions are most directly associated with any improvements. Showing this
causal association is particularly important in healthcare as these collaboratives
typically target highly publicized quality problems. As such, any observed
improvement may be more attributable to outside events, such as continuing
education sessions and conferences, which increase awareness about the topic
and may result in small modifications to provider behavior.
This particular problem was addressed in a study analyzing whether the
CHF BTS collaborative led to improvements in care above and beyond that
which would have naturally occurred.16 The study design to achieve this aim
involved sampling 4 hospitals from the collaborative and then identifying 4 control
hospitals that did not participate in the collaborative and had similar hospital
structures, i.e. matched controls. Using a panel of 21 common metrics for CHF
care quality, the analysis identified that the collaborative sites exhibited greater
improvements on 11 of them, with the strongest improvements associated with
patient counseling and education metrics. For some of the metrics where there
was no difference between participants and controls, there were still sizable
improvements in performance. As an example, collaborative hospitals increased
by 16% the percentage of patients that had their left ventricular ejection fraction
(LVEF) measured, but the controls also increased LVEF testing by 13% leading
to a non-significant comparison (p = 0.49).16 This article highlights that observed
improvements cannot always be directly attributed to the collaborative and
careful consideration should be taken in developing program evaluations that can
best establish a causal relationship between measured improvements and
collaborative efforts.
VA was an early adopter of the BTS model and has used it to target
adverse drug events, safety in high-risk areas, home-based primary care, fall
risk, and many other patient safety areas.24-27 An example from primary care was
an effort to improve, across a system of nearly 1,300 sites of care, the average
number of days until the next available primary care appointment.24 Over a
four-year period, the Advanced Clinic Access collaborative was able to drop the
average days until the first available appointment from 42.9 to 15.7.
On the inpatient side, a review of 134 QI teams participating in 5 different
VA collaboratives found that between 51% and 68% of teams were
successful in their efforts.25 Success in this case was defined as a self-reported
reduction in at least one outcome of 20% from baseline, sustained
at that level for 2 months before the end of the collaborative. Some example
outcomes for the collaboratives were to reduce adverse drug events, reduce
infection rates, reduce caregiver stress for home-based dementia care, reduce
delays in the compensation process, and reduce patient falls. A unique feature of
this article is that it evaluated whether any organizational, systemic, and
interpersonal characteristics of hospitals and teams were associated with
performance in the collaborative. When comparing ratings at the end of the
collaborative to those at the beginning, some key findings were that low
performing teams showed reductions in their ratings of resource availability,
physician participation, and team leadership.25 In contrast, high performing teams
were more likely to rate that they had worked as a team before, were part of their
organization’s strategic goals, and had stronger team leadership.
A main takeaway from the analysis of collaboratives in VA, as well as the
study of the medication reconciliation collaborative in Massachusetts, was that
there may be challenges faced by hospitals that are not directly addressed in the
current QI collaborative structure. Two common barriers were a lack of resources
and difficulty getting support and buy-in from physicians. One consideration with
these barriers, particularly the availability of resources, is whether the presence
of such a barrier could be identified prior to a collaborative and, if identified,
whether those hospitals should participate in a collaborative at all. It may be that a
hospital needs to develop a certain baseline of behaviors before success in a QI
collaborative is likely, and if those behaviors are not present, that may be where the
hospital needs to focus first. This question will be addressed as part of the third
aim of this study; however, before it can be analyzed it is necessary to
measure and understand which hospitals succeed in a QI collaborative.
In order to measure and understand which hospitals succeed, it is
necessary to get past the current style of reporting, which relies too heavily on
pre-post analyses (assuming actual quantitative data exist) that cannot establish
which measured improvements are due to collaborative participation. The next sections
of this chapter will provide an in-depth introduction to the QI collaborative studied
throughout this research and overview the initial analyses undertaken to
establish which hospitals improved and also sustained quality as part of their
participation in the collaborative.
Flow Improvement Inpatient Initiative (FIX)
The collaborative of interest for this study was the Flow Improvement
Inpatient Initiative (FIX). This was a system redesign initiative undertaken in VA
during fiscal year 2007 (FY07) and closely followed the IHI BTS collaborative
model. The aim of the collaborative was to improve and optimize inpatient
hospital flow through the continuum of inpatient care.28, 29 The efforts focused on
addressing potential barriers to smooth flow in the emergency department,
operating suites, and on the inpatient wards. The objective was simply to identify
and eliminate bottlenecks, delays, waste and errors that may hinder a patient’s
smooth progression through the hospital. Some outcome measures associated
with the collaborative were shorter hospital length of stay (LOS) and increased
percentage of patients discharged before noon.30 The goal of these outcome
measures was to ensure that sufficient patient beds were available (particularly in
the early afternoon) for patients needing to be admitted from the emergency
department (ED) or after surgical procedures. By improving bed availability, not
only are patient care and safety improved, but VA also hopes to reduce the need for
fee-service care, where veterans are cared for at VA expense in private hospitals.
This collaborative followed the general BTS model with 3 learning
sessions and then a final wrap-up session;31 the approximate timing of these
events is outlined in Figure 2-1. In total, 130 VA hospitals participated with
approximately 500 participants attending at least one learning session.31 Given
the need for active participation and interaction during learning sessions, the
collaborative was split and implemented in five separate regions (Northeast,
Southeast, Central, Midwest, and West). During the action periods, teams met at
least weekly to work on their QI projects. Commonly reported projects focused on
efforts to reduce LOS, reduce bed turnover time, increase the percentage of
patients given a discharge appointment, increase the percentage of patients
discharged before noon, decrease the time to admission from the ED, and
decrease ED diversion time.30
Figure 2-1: Model of the IHI BTS Collaborative timeline for FY07 FIX
Despite VA's prior experience with collaboratives, only limited evaluation
plans were established for FIX. Teams likely measured their
performance as they worked to improve patient flow, but these data were never
systematically collected. An external consulting group was tasked with evaluating
success after the completion of FIX. This evaluation focused predominantly
on determining whether participants were satisfied with the collaborative and felt
that they gained knowledge or skills during the process.32 However, the
evaluation also considered whether there was a positive business impact or
return on investment based on changes in the Observed Minus Expected LOS
(OMELOS) during the collaborative. This pre-post analysis compared the FY06
OMELOS with the FY08 OMELOS for patient time in an ICU or a general acute
care floor at 10 hospitals. The process also involved querying the FIX team
leader at each of those hospitals so they could estimate what percentage of the
improvement they would attribute to FIX. The average of these values was then
extrapolated to the entire VA population and used to determine an estimated cost
savings. An overview of these results is presented in Table 2-1.32 After adjusting
for the estimated benefits attributable to FIX, the final conclusion was that
implementation of FIX saved $141 million. In order to determine a return on
investment, the analysis considered the costs at the 10 facilities related to
oversight, planning, implementation, and evaluation. The extrapolated costs
came to $5.8 million for VA, equating to an overall return on investment of
2,327%.
Table 2-1: Reported calculation of cost savings from FIX

         FY08 – FY06   Cost/day    # of Annual   Amount         % Attributed
         OMELOS                    Admissions    Saved          to FIX
Acute    0.51 days     $684.75     530,000       $185 million   40.37%
ICU      0.31 days     $3,500.00   150,000       $110 million   52.18%
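As a reading aid, the procedure the report describes can be summarized in a general form. This is a sketch of the stated calculation, not a reproduction of the report's exact arithmetic or rounding:

\[
\text{Savings}_i = \Delta\text{OMELOS}_i \times \text{cost/day}_i \times N_i,
\qquad
\text{Attributed savings} = \sum_i \text{Savings}_i \times p_i,
\qquad
\text{ROI} = \frac{\text{Attributed savings} - \text{Cost}}{\text{Cost}}
\]

where \(i\) indexes the two levels of care (acute, ICU), \(N_i\) is annual admissions, and \(p_i\) is the percentage of the improvement attributed to FIX by the queried team leaders.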
While impressive, these results have their limitations and are insufficient
for truly understanding the impact of FIX. One major concern is that the analysis
uses a pre-post study design based on an unspecified single time point, i.e. the
analysis does not report how many days or patients are averaged together. For
any number of reasons, these single time points may not accurately reflect a
hospital's performance as measured by OMELOS. Particularly noteworthy is that
OMELOS fluctuates, sometimes significantly at smaller hospitals, and yet the
report provided no indication of how much variation was associated with the
measure. Further, LOS has a documented pre-existing temporal trend, which
was not considered and could account for a considerable proportion of the
observed improvements.33, 34 Although the analysis adjusted for self-perceived
impact, that measure only considers whether the QI team felt they had targeted
activities that would impact OMELOS, not whether those activities were
responsible for a specific reduction in OMELOS.
Beyond the potentially misleading conclusions about reductions in
OMELOS, the cost savings calculations also have two important limitations. First,
the calculations assume that inpatient costs are distributed evenly throughout the
inpatient stay, which is unlikely to be true. Second, with much of the
involved costs representing fixed expenses, a reduction in LOS only represents a
savings to VA if it allows fee-based care to be avoided. Unfortunately, diversion
rates and fee-based care costs are not systematically collected or available for
analysis.
One final consideration about the analysis: the final report did not come out
until May 2010, yet there was no attempt to consider how well hospitals
maintained improvements after the completion of FIX. The sustainability of
interventions is a key component of achieving high-quality care, yet it is not
assessed in any collaborative reports. The next section shows how, even
retrospectively, it is possible to conduct an in-depth study of FIX
that provides insight into whether hospitals were able to improve outcomes
and then sustain quality after participating in FIX.
FIX Analysis Overview
There are a number of challenges in developing a study for analyzing FIX,
yet the FIX collaborative has some important characteristics that make it an ideal
collaborative to study. First, the goals of FIX make it amenable to a retrospective
analysis that uses available administrative data sets. The two primary outcomes
of FIX, LOS and discharge before noon, are easily ascertained in administrative
records of patient stays. Second, since FIX occurred in FY07 there are now two
years of data available to analyze whether initial improvements in outcomes are
sustained after FIX. Third, FIX occurred at the same time as two major surveys
that assessed organizational characteristics in VA hospitals. These two surveys
will play a major role in the second half of this study as the study attempts to
identify characteristics that distinguish sites on their ability to succeed during FIX.
As a final strength, FIX was, in effect, 5 simultaneous collaboratives, providing a
large sample (130 hospitals) and offering the possibility of some sub-group
analyses.
Given these strengths, FIX was selected to serve as a case study that
could help identify whether a QI collaborative leads to quantifiable improvements
in quality, whether hospitals sustain those improvements, whether there is
significant variation in performance, and whether organizational characteristics
might help explain success or failure in the collaborative. As discussed through
the literature reviews, the ideal study would involve an analysis that would either
establish or provide strong arguments for a cause-effect relationship between
specific improvements and changes in the outcomes. Unfortunately, no data
defined the specific improvements implemented by teams. Without this,
or other qualitative assessments from the teams, it was impossible to suggest a
causal relationship between FIX and the observations of this study. Instead, the
study strives to use a methodologically strong quasi-experimental
approach that provides some support for suggesting that any identified
improvements were attributable to FIX.
One such approach could be a case-control study such as that done to
analyze improvements in the CHF collaborative. However, this is not a possibility
since FIX involved all VA acute care hospitals, eliminating any natural controls.
Additionally, selecting private sector hospitals as controls would be unrealistic as
the unique structural characteristics (i.e. federal funding, comprehensive
electronic medical record, extensive catchment areas) of VA hospitals make
direct comparisons difficult. Instead, this study employed an interrupted time-series analysis.
The exclusion of a case-control design, combined with the use of
administrative data, leaves the options for analyzing FIX as structural equation
modeling, latent growth curve modeling, hierarchical linear modeling, and
time-series analysis.35 Of these four choices, hierarchical linear models and
time-series analysis are best suited for analyzing and understanding the changes over
time in outcomes such as LOS and discharges before noon. Since a separate
outcome model is planned for each facility, all measures are at the individual
level and the utility of hierarchical linear models would be for analyzing the data
as a repeated measures model. In comparing a repeated measures approach
with a time-series model, the trade-off is between a greater ability to model the
correlation between individuals (hierarchical model) and the correlation between
events over time (time-series).
This analysis chooses to focus on the correlation between events over
time (i.e., it uses a time-series analysis) for three reasons. First, the ability to
risk-adjust for different patient characteristics provides some protection against
correlation between individuals at a facility that may impact their outcomes.
Furthermore, since most admissions represent a unique case (rather than a
related readmission), risk adjustment better accounts for correlation between
individuals than repeated measures hierarchical models would. Second, the use of time-series
models allows more flexibility for evaluating and adjusting for auto-regressive
relationships in the data. There is a notable relationship between outcomes on
separate days, the strength of which dissipates over time. Further, there is the
potential for periodicity effects (e.g., weekly or seasonal). While these are not
commonly found in healthcare outcomes, an analysis of this type should evaluate
for their presence. Third, time-series models are considered most appropriate
when the question of interest focuses on the impact of an intervention on a
system level rather than on an individual level.36 While some individuals may
benefit more from the FIX initiative than others, the general hypothesis was that
FIX resulted in system-level changes and that benefits were essentially
uniform across individuals. A risk-adjusted time-series model provides the best
balance of adjustment for individual characteristics and correlation between data
points over time while focusing on the key underlying question of what impact
FIX had on the ability of each facility to provide high quality care.
Based on these considerations, it was determined that an interrupted time-series evaluation was the strongest study design for taking into account the pre-existing temporal trends in the data that might help explain observed
improvements as well as indicate whether facilities were able to sustain
improvements after FIX. The primary outcomes of the analysis will be LOS and
percent of patients discharged before noon in order to directly reflect the goals of
FIX. Additionally, three secondary outcomes (in-hospital mortality, 30-day
mortality, and 30-day all-cause readmission) will be evaluated. The purpose of
these secondary outcomes was to ensure that improvements in the primary
outcomes were not associated with reductions in quality for other quality
measures. The analyses of FIX address the following two specific aims:
Aim 1: Determine the impact of the FIX collaborative upon quality and efficiency
as measured by LOS, percent of patients discharged before noon, in-hospital
and 30-day mortality rates, and 30-day readmission rates.
Hypothesis 1: The FIX collaborative will shorten patient LOS and
increase the percentage of patients discharged before noon. There will be no
changes in mortality or readmission rates attributable to FIX.
Aim 2: Determine whether improvements attributable to FIX are sustained post-implementation.
Hypothesis 2a: Improvements in the outcome measures will
continue on a downward slope after completion of FIX.
Hypothesis 2b: The rate of further improvements in the outcome
measures after completion of FIX will be at or below the rate of
pre-FIX improvements.
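To make hypotheses 2a and 2b concrete, the interrupted time-series can be written in the standard segmented-regression form. This is an illustrative sketch of that general form, not the exact parameterization developed in Chapter 3:

\[
Y_t = \beta_0 + \beta_1 t + \beta_2 D_t^{FIX} + \beta_3 (t - t_0) D_t^{FIX} + \beta_4 D_t^{post} + \beta_5 (t - t_1) D_t^{post} + \varepsilon_t
\]

where \(t\) indexes time, \(D_t^{FIX}\) and \(D_t^{post}\) are indicators for the collaborative year and the post-collaborative period, and \(t_0\) and \(t_1\) mark the start of each segment. In this form the baseline trend is \(\beta_1\) and the post-FIX trend is \(\beta_1 + \beta_3 + \beta_5\); hypothesis 2a corresponds to the post-FIX trend remaining favorable (e.g., a continued downward slope for LOS), and hypothesis 2b corresponds to its magnitude being at or below that of \(\beta_1\).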
Conclusions
This chapter established the background for this analysis of FIX as a case
study representing quality improvement in healthcare. The first half of the chapter
discussed the IHI's development of the collaborative model and its utility for
supporting broad improvements in healthcare quality. This introduction was
followed by a review of the collaborative literature, which suggested that while
collaboratives do generate improvements, individual hospitals vary in their
success. Additionally, the findings were weakened because they frequently relied
on team self-report of success in implementing project components or on
improving outcomes. The second half of the chapter moved from the broad
literature to discuss the FIX collaborative and how an analysis of that
collaborative could improve the understanding of the collaborative approach as
well as begin to address the questions of this thesis. Lastly, the chapter reviewed
several potential analytic approaches and identified the reasons for selecting an
interrupted time-series model for analyzing FIX. The upcoming chapter provides
further detail on the methods used to risk-adjust the five outcomes of interest
and then develop the final time-series models for evaluating FIX.
CHAPTER 3 – TIME-SERIES METHODS
This chapter presents the methods used to address the first two specific
aims of this research, which focus on understanding the impact of the Flow
Improvement Inpatient Initiative (FIX) on five outcome measures. The initial
sections of the chapter describe the data sources used in this analysis and define
the patient cohort. Subsequently, there is a discussion of the process used to
develop the risk-adjustment models for each outcome. The risk-adjusted patient
values are then input into a time-series model, with the final parameters
calculated in this model serving to determine hospital performance on each of the
outcomes. Finally, the chapter discusses a classification scheme developed
based on potential outcomes from the time-series model that was used to group
hospitals into performance categories to facilitate future analyses.
Data Sources
Data for this study came from VA administrative discharge records. While
administrative databases were not originally intended for research, they have
played a valuable role in health services research in the Veterans Affairs (VA)
healthcare system.37, 38 Based on the 1972 Uniform Hospital Discharge Data Set
(UHDDS),39 healthcare administrative databases have a standard form that
includes patient demographics as well as the International Classification of
Diseases, 9th revision, Clinical Modification (ICD-9-CM) codes that serve as a
proxy for clinical status. The accuracy of some ICD-9-CM codes has been
challenged, but a VA study on the level of agreement between administrative and
medical records data reported kappa statistics of 0.92 for demographics, 0.75 for
principal diagnosis, and 0.53 for bed section.40 Variables to determine patient
outcomes and adjust for severity at admission will come from several existing
administrative databases compiled at the Austin Automation Center for all VA
hospitals. These files include: 1) the Patient Treatment File (PTF); 2) the
Enrollment File; and 3) the Vital Status File. All files were linked using unique
patient identifiers, which also allow a patient to be monitored over time to detect
a sequence of hospital visits.
PTF data are updated on a quarterly basis as SAS datasets and provided
the majority of descriptive variables related to patient outcomes and risk
adjustment models. Available data fields were derived from 45,000 data fields
contained within the Veterans Health Information Systems and Technology
Architecture (VISTA). Quality control protocols ensure data fields contain
appropriate numbers and types of characters. VISTA modules cover a variety of
important hospital services and functions including admission, discharge,
transfer, scheduling, pharmacy, laboratory, and radiology.
The Enrollment File contains details on basic demographic variables as well as
VA-specific measures, such as a listing of medical conditions that are considered
directly connected to military service.
The Vital Status File combines data from four sources: the VA Beneficiary
Identification and Record Locator System (BIRLS), VA Patient Treatment File
(PTF), Social Security Administration (SSA) death master file, and Medicare vital
status file. It provides date of death for VA users with a sensitivity of 98.3% and
specificity of 99.8% compared to the National Death Index.41
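As an illustration of this linkage step, a minimal SAS sketch follows. The dataset names (ptf, enrollment, vital) and the identifier variable (patient_key) are hypothetical stand-ins for the actual file structures:

/* Sort each administrative file by the unique patient identifier */
proc sort data=ptf;        by patient_key; run;
proc sort data=enrollment; by patient_key; run;
proc sort data=vital;      by patient_key; run;

/* Merge the files, keeping only patients with at least one PTF discharge record */
data linked;
  merge ptf (in=in_ptf) enrollment vital;
  by patient_key;
  if in_ptf;
run;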
Data Elements
This study analyzes five outcomes (Table 3-1), two of which are primary
outcomes while the other three are secondary outcomes. The primary outcomes,
length of stay (LOS) and percent of discharges before noon, were chosen to
reflect the stated goals of the FIX collaborative. As stated in hypothesis 1, FIX is
expected to result in improved performance on these outcomes. The secondary
outcomes, 30-day all-cause readmission, 30-day mortality, and in-hospital
mortality, serve as quality checks focused on identifying whether the efforts to
improve patient flow led to any unintended consequences. The hypothesis was
that there would be no changes in any of the secondary outcomes attributable
to FIX.
For the purpose of defining readmissions, an index admission is any new
admission that does not fall within the 30-day window of a prior index admission;
any subsequent admission within 30 days is classified as a readmission. A
readmission cannot itself count as the index admission for a later admission,
although the initial index admission could potentially have multiple associated
readmissions. Visits to an emergency department or admissions to a non-VA
hospital are not captured in these data.
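This index/readmission logic can be made concrete with a short SAS data-step sketch. The dataset and variable names are hypothetical, and the sketch assumes the 30-day window is measured from the index admission's discharge date:

/* Classify each admission as an index admission or a readmission.
   Assumes one record per admission with numeric SAS dates. */
proc sort data=admits; by patient_key admit_date; run;

data classified;
  set admits;
  by patient_key;
  length adm_type $7;
  retain index_discharge;
  if first.patient_key then index_discharge = .;
  if index_discharge = . or admit_date - index_discharge > 30 then do;
    adm_type = 'INDEX';
    index_discharge = discharge_date; /* only an index admission opens a new 30-day window */
  end;
  else adm_type = 'READMIT'; /* readmissions never serve as the index for a later admission */
run;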
Patient Cohort
The study population was all patients admitted to acute medical care in
each of 130 VA hospitals between FY05 and FY09. This includes patients directly
admitted as medical patients (as opposed to surgical patients) to an ICU as well
as those admitted and discharged under observation status. While observation
patients are billed as outpatients, they are important to include in this analysis for
three reasons. First, an ability to discharge a patient within 24 hours (the
standard set in VA to maintain observation status), may be a sign of good patient
flow, so removing these patients from analyses could inadvertently penalize
facilities for some of their improvements. Second, there is inconsistent use of
observation status (reflecting policy issues as well as patient flow) across VA. A
quick analysis identified 9 facilities that had never used observation status and
one facility that classified 50% of admissions as observation patients. With no
direct understanding of how high or low use of observation status impacts patient
outcomes, exclusion of observation patients could have severe unknown
consequences on the evaluation. Lastly, observation patients are treated on the
same wards as traditional acute admissions, meaning their presence impacts the
overall flow and provider workload on medical wards; it would therefore be
inappropriate to exclude them from these analyses.
Table 3-1: List of Outcome Measures

Variable               Type        Description
Length of Stay         Continuous  Calculated: Time of Discharge – Time of Admission
Noon Discharge         Rate        Percentage of patients discharged before noon
30-Day Readmission     Rate        Any readmission to any VA hospital
30-Day Mortality       Rate        Death recorded during the hospital stay or within 30 days of discharge
In-Hospital Mortality  Rate        Death recorded during the hospital stay
Risk Adjustment
Separate risk-adjustment models were developed for each of the outcome
measures before modeling outcomes in the time-series equations. Risk
adjustment evaluation was done in a cohort of patients discharged in FY07.
Following standard VA procedure, a cohort was identified as a stratified
sample of 10 VA hospitals representing each of the five geographic
regions (Northeast, Southeast, South, Midwest, West).42 One large (>200
medical/surgical beds) and one medium (100 – 199 medical/surgical beds) VA
hospital were randomly sampled to represent each region. Small facilities were
not included as their small volumes can lead to dramatic variation, which can
have adverse effects on the final risk adjustment coefficients. The final risk
adjustment cohort represented 42,725 discharges in FY07.
Table 3-2 provides a comparison of some basic descriptive statistics
between the risk adjustment cohort and all other FY07 discharges. While the vast
majority of these comparisons were statistically different, these differences were
attributable to the large sample sizes and do not represent meaningful clinical
differences. The only concerning difference in the table is the difference between
the two groups in terms of missing race information. This example shows why
data from small facilities can be problematic and they are not included in risk
adjustment model evaluation for VA data.
A broad collection of variables, listed in Table 3-3, measuring patient
socio-demographics, primary diagnosis, diagnosed comorbidities, and admission
and discharge characteristics was evaluated to determine the impact of each on
every outcome measure. Modeling for LOS was done on the log scale due to the
skewed nature of LOS data.43 All other outcomes were treated as rates and
modeled with binomial distributions.
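As a rough illustration of this modeling split, the sketch below fits a log-scale OLS model for LOS and a binomial GLM for in-hospital mortality. The thesis work was done in SAS; the variables, coefficients, and simulated data here are invented for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
# Hypothetical cohort: age and an ICU flag driving LOS and mortality.
df = pd.DataFrame({"age": rng.integers(25, 95, n),
                   "icu": rng.integers(0, 2, n)})
df["los"] = np.exp(1.0 + 0.003 * df["age"] + 0.4 * df["icu"]
                   + rng.normal(0, 0.5, n))          # right-skewed LOS
df["died"] = rng.binomial(1, 0.02 + 0.02 * df["icu"])

# LOS is modeled on the log scale with ordinary least squares ...
los_model = smf.ols("np.log(los) ~ age + icu", data=df).fit()
# ... while rate outcomes such as in-hospital mortality use a binomial GLM.
mort_model = smf.glm("died ~ age + icu", data=df,
                     family=sm.families.Binomial()).fit()
print(los_model.aic, mort_model.aic)
```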
Table 3-2: Comparison of risk adjustment cohort to all other FY07 discharges

                                Risk Adjustment    All Other FY07     p-value
                                (N=42,725)         (N=291,484)
Age (SD)                        65.94 (12.85)      65.49 (13.11)      <0.001
Male (%)                        41,032 (96.0%)     279,735 (96.0%)    0.50
Income (SD)                     23,275 (47,775)    22,162 (42,390)    <0.001
Race
  White (%)                     26,511 (62.1%)     146,748 (50.4%)    <0.001
  Black (%)                     7,614 (17.8%)      43,571 (15.0%)     <0.001
  Hispanic (%)                  668 (1.6%)         3,171 (1.1%)       <0.001
  Asian / Pacific Islander (%)  407 (1.0%)         2,111 (0.7%)       <0.001
  Native American (%)           161 (0.4%)         1,375 (0.5%)       0.006
  Missing (%)                   7,964 (18.6%)      97,069 (33.3%)     <0.001
ICU Direct Admit (%)            7,471 (17.5%)      53,685 (18.4%)     <0.001
Un-adjusted LOS (SD)            5.43 (8.90)        5.22 (8.08)        <0.001
Died In-hospital (%)            1,104 (2.6%)       8,311 (2.85%)      0.002
Discharge Before Noon (%)       7,082 (16.6%)      54,075 (18.6%)     <0.001
All Cause Readmit (%)           6,332 (15.3%)      42,995 (15.3%)     0.85
Table 3-3: List of potential risk adjustment variables, the number of discrete
categories, and a description of how categories were defined

Socio-demographics
  Age (10): Everyone under 45*; 5-year increments from 45 – 84; everyone 85 and older
  Sex (2): Male*, Female
  Marital Status (6): Married*, Divorced, Never Married, Separated, Unknown, Widowed
  Income: Continuous variable
  Race (4): White*, Asian / Pacific Islander, Missing, Other (includes Black, Hispanic, Native Am.)
  Service Connected (3): Percentage that admission condition is connected to military service; 0%*, 10 – 90%, 100%
  Primary Diagnosis (25): Major Diagnostic Code categories; Circulatory System*
  Comorbidities (41): Quan adjustment to Elixhauser algorithm44, 45
Admission
  Source (9): Direct*, VA Nursing Home, Community Nursing Home, Outpatient, Observation, Community Hospital, VA Hospital, Federal Hospital
  Direct to ICU (2): Yes / No
Discharge
  Place of Discharge (13): Community*, Irregular, Death, VA hospital, Federal hospital, Community hospital, VA nursing home, Community nursing home, State home nursing, Boarding house, Paid home care, Home-based primary care, Hospice
  Type of Discharge (5): Regular*, Discharge of a committed patient for a 30-day trial, Discharge of a nursing home patient due to 6-month limitation, Irregular, Transfer, Death with autopsy, Death without autopsy
  Died In-Hospital (2): Yes / No
  Transferred out of Hospital (2): Yes / No

* Reference category
The first decision in the risk adjustment process was to identify the
appropriate number of categories for some of the variables. This was done by
running univariate categorical models to determine the predictive association
between each category and LOS. The goal in this process was to maximize
model fit (as measured by the Akaike information criterion (AIC)) while working
towards a parsimonious list of categories; specifically, the aim was a collection of
categories for each variable in which every category's point estimate was
statistically significant. As an example, the field for place of discharge took on 26
different values in the administrative files, with 14 of these values having
non-significant point estimates in the initial full model. While a number of these
categories are meaningful for administrative purposes, they have no clinical
significance. Therefore, categories such as military hospital, other federal
hospital, and other government hospital were grouped together and the models
re-evaluated. For some small groups there were no ideal clinical comparisons, in
which case groups were combined based on the similarity of their initial point
estimates. This process was iterated, trialing different groupings as necessary,
until the best model (lowest AIC, all categories significant) was identified.
identified. Full details on the modeling process for Age (Table 3-4), Race (Table
3-5), Percent Service Connected (Table 3-6), Admission Source (Table 3-7), and
Place of Discharge (Table 3-8) are available in their respective tables. No
changes were necessary for Marital Status or Type of Discharge. A full
description of the individual categories has been published elsewhere.46
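This grouping search can be illustrated with a small sketch that compares candidate age groupings by AIC, in the spirit of Table 3-4. It is a simplified stand-in for the SAS modeling described above; the simulated data and candidate cut points are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
# Hypothetical FY07-style cohort: age plus a log-normal length of stay.
df = pd.DataFrame({"age": rng.integers(25, 95, n)})
df["log_los"] = 1.0 + 0.004 * df["age"] + rng.normal(0, 0.5, n)

def aic_for_grouping(data, bins, labels):
    """Fit log-LOS on one candidate categorical grouping; return its AIC."""
    d = data.assign(age_cat=pd.cut(data["age"], bins=bins,
                                   labels=labels, right=False))
    return smf.ols("log_los ~ C(age_cat)", data=d).fit().aic

# Candidate groupings in the spirit of Table 3-4: coarse to fine.
print(aic_for_grouping(df, [0, 60, 200], ["<60", ">=60"]))
print(aic_for_grouping(df, [0, 40, 60, 80, 200],
                       ["<40", "40-59", "60-79", ">=80"]))
print(aic_for_grouping(df, [0] + list(range(45, 90, 5)) + [200],
                       ["<45"] + [f"{a}-{a + 4}" for a in range(45, 85, 5)]
                       + [">=85"]))
```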
Table 3-4: Modeling of age risk adjustment categories

Model  # of Categories  AIC     Description
1      Continuous       114938
2      2                115130  < 60, ≥ 60
3      4                115010  < 40, [40, 60), [60, 80), ≥ 80
4      15               114932  < 20, 5-year increments, ≥ 90
5      8                114953  < 20, 10-year increments, ≥ 90
6      12               114932  < 25, 5-year increments, ≥ 80
7      10               114926  < 45, 5-year increments, ≥ 85
Table 3-5: Modeling of race risk adjustment categories

Model  # of Categories  AIC     Description
1      6*               115308  Native American, Hispanic were non-significant
2      4                115305  White, Asian/Pacific Islander, Missing, All others
3      4                115307  Black, Asian/Pacific Islander, Missing, All others

* Coded categories are: White, Black, Hispanic, Asian/Pacific Islander, Native American, Missing
Table 3-6: Modeling of service connected risk adjustment categories

Model  # of Categories  AIC     Description
1      11*              115318  30, 40, 50, 70, 90 all insignificant
2      3                115308  0, [10, 90], 100
3      6                115314  Grouped in increments of 20
4      4                115310  0, [10, 50], [60, 90], 100

* Service connected is recorded in increments of 10 from 0 – 100
Table 3-7: Modeling of admission source risk adjustment categories

Model  # of Categories  AIC     Description
1      19*              114579  1E, 1H, 1J, 1L, 1R, 1S, 2A, 2B, 2C, 3B, 3E were non-significant
2      9                114574  2A (p=.0573)
3      8                114576  2A paired with 1M
4      8                114573  2A paired with 1P

Groupings for Models 2 – 4: 1E, 1J, 1L, 1R, 1S all paired with 1P; 1G with 1H; 2A, 2B, 2C grouped as 2A; 3B, 3E all paired with 3C
* See VA data documentation for complete listing of fields46
Table 3-8: Modeling of place of discharge risk adjustment categories

Model  # of Categories  AIC     Description
1      26*              113126  1, 2, 3, 12, 13, 15, 16, 19, 20, 21, 27, 29, 34, 35 were non-significant
2      14               113121  21 is non-significant
3      13               113119  21 (including 29 & 35) paired with -3

Groupings for Models 2 – 3: 1, 2, 3 paired as 3; 12, 13, 15, 20 paired with 11; 16, 19 paired with 17; 27 paired with 5; 21, 29, 35 paired as 21; 34 paired with 22
* See VA data documentation for complete listing of fields46
Categories 9, 10, & 14 were not recorded for any discharges in the study
Once the final categorizations were set the next step in the risk adjustment
model development was to evaluate the univariate relationships between each
outcome and the potential risk adjustment variables. All variables having a p<0.1
association in univariate analyses were included in the initial full model for that
outcome. Reduced models were then generated by removing variables that did
not meet a determined threshold; the full sequence of steps taken to develop
each model is detailed in the SAS code available in Appendix A. The goal of
model selection was to identify the simplest model with the best AIC. In instances
where the AICs were similar (within 2 points), the model with the greater number
of variables was selected, even if some variables were only marginally significant.
The model evaluation process also assessed potential correlations between
variables. Correlation was tested between single-level variables (e.g.,
comorbidities, direct admission to ICU). Correlation between multi-level variables
was not compared, but potential correlations such as place of discharge with type
of discharge were never relevant in identifying the best model. The key
correlations that were identified, and evaluated when necessary during model
development, are listed in Table 3-9.
Table 3-9: Highly correlated risk adjustment variables

Variable 1             Variable 2                 Correlation (ρ)
Rheumatic Arthritis    Arthritis                  0.88
Paralysis              Hemiparesis                0.82
Renal Disease          Complicated Hypertension   0.81
Renal Failure          Complicated Hypertension   0.81
Mild Liver             Liver                      0.98
Nonmetastatic Cancer   Malignancy                 0.92
Ulcer No Bleed         Peptic Ulcer               0.82
Renal Disease          Renal Failure              1.00
Once final risk adjustment models were developed, a second cohort of
60,000 patients was randomly sampled from all FY07 discharges; this random
sample included patients from small facilities and discharges from the original
risk adjustment cohort. The final models were run in this cohort to verify model
performance and generate point estimates for use in risk adjustment. A listing of
these final point estimates is available in Appendix B. These risk-adjustment
point estimates were used to calculate the expected outcome for each patient,
which in turn was used to determine the indirectly adjusted outcome.
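For readers unfamiliar with indirect adjustment, the toy example below shows one standard form of the calculation: a facility's observed events are compared with the sum of its model-predicted expected events, and the observed-to-expected ratio is scaled by the overall rate. Whether the thesis used exactly this scaling is not stated, so treat the sketch as illustrative.

```python
import pandas as pd

# Hypothetical patients: observed deaths plus model-predicted probabilities.
patients = pd.DataFrame({
    "facility": ["A", "A", "A", "B", "B", "B"],
    "died":     [1, 0, 0, 0, 0, 1],
    "expected": [0.30, 0.10, 0.05, 0.02, 0.03, 0.40],  # from the risk model
})
overall_rate = patients["died"].mean()
by_fac = patients.groupby("facility").agg(observed=("died", "sum"),
                                          expected=("expected", "sum"))
# Indirect standardization: the O/E ratio scaled by the overall rate.
by_fac["adjusted_rate"] = by_fac["observed"] / by_fac["expected"] * overall_rate
print(by_fac)
```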
Time-Series Model
There are several issues, many discussed in Chapter 2, to consider in
determining how to best model and evaluate the impact of FIX. This study
employed an interrupted time-series model given the design's ability to account
for pre-existing temporal trends, to allow evaluation of the outcomes after the
intervention, and to protect against some threats to internal validity in
comparison to other quasi-experimental designs.21, 22 All outcomes were
individually modeled using a time-series analysis covering 5 years, from the start
of FY05 (October 1, 2004) through the end of FY09 (September 30, 2009). This
provided two years of data prior to FIX to establish baseline performance, a year
of data identifying whether hospitals made improvements during FIX, and two
years of data identifying whether those hospitals that improved were able to
sustain those improvements.
After determining the risk adjusted outcomes, the next step was
determining the best level of outcome aggregation. At the individual patient level,
LOS or rates of the other outcomes were highly variable, so modeling at that fine
a level would make it difficult to detect meaningful changes in any outcome due
to excessive variability, or noise, in the signal. Conversely, modeling
at a highly aggregated level, such as a 6-month mean, would potentially ignore
key fluctuations in the outcome measures. This study settled on having each
data point represent a fourteen-day average, which results in 26 data points per
year, or 130 data points over the 5 study years. This level of outcome
aggregation was based on power calculations determining the appropriate
tradeoff between variability and overall number of time points. Assuming
moderate autocorrelation (φ = 0.3), these models have a power of 0.88 to detect
a change in the outcome in response to the intervention equivalent to one
standard deviation (Power = 0.87 for detecting sustainability).47, 48
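The power figures above came from formal calculations;47, 48 as a rough stand-in, a simulation like the one below checks how often a one-standard-deviation level shift at the FY07 interruption is detected under AR(1) noise with φ = 0.3. This simplifies the thesis model to a single step change and ignores the autocorrelation correction at the inference stage, so it is only a loose analogue.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

def simulated_power(effect_sd=1.0, phi=0.3, n_sims=500, alpha=0.05):
    """Share of simulations detecting a one-SD level shift at point 52."""
    t = np.arange(130)
    step = (t >= 52).astype(float)            # FY07 interruption indicator
    X = sm.add_constant(np.column_stack([t, step]))
    hits = 0
    for _ in range(n_sims):
        e = np.zeros(130)
        for i in range(1, 130):               # AR(1) noise, phi = 0.3
            e[i] = phi * e[i - 1] + rng.normal(0, 1)
        y = effect_sd * e.std() * step + e
        hits += sm.OLS(y, X).fit().pvalues[2] < alpha
    return hits / n_sims

print(simulated_power())
```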
While a simple 14-day average works well for the LOS and discharges
before noon models, it presents a challenge for the other models, most notably in
smaller VA hospitals, where it is reasonable to expect 14-day periods without any
observed outcomes, particularly for in-hospital mortality. To avoid the
unnecessary variance introduced by this possibility, the outcome models for
readmission and mortality rates were plotted every 14 days, but each point
represents a moving average of the previous 70 days (5 data points). This did
result in the reduction of these time-series by 4 data points at the beginning of
these models.
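The two aggregation schemes are straightforward to express with pandas: a plain 14-day mean for LOS and noon discharges, and a 5-point (70-day) moving average for the rarer outcomes. The sketch below uses simulated daily data; the thesis work itself was done in SAS.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
# Simulated daily in-hospital mortality indicator over FY05-FY09.
days = pd.date_range("2004-10-01", "2009-09-30", freq="D")
daily = pd.DataFrame({"date": days,
                      "died": rng.binomial(1, 0.03, len(days))})

# Plain 14-day means: 26 points per year, 130 over the five study years.
period = daily.set_index("date").resample("14D")["died"].mean()

# Rarer outcomes: still plotted every 14 days, but each point is a 70-day
# (five-period) moving average; the first 4 points are lost to the window.
smoothed = period.rolling(window=5).mean()
print(len(period), smoothed.dropna().head())
```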
The final concern in developing this model, which supports the selection of
a time-series model, is how these data are unlikely to meet the assumptions of
standard linear regression. Most importantly, while each discharge was
essentially an independent event, it was not appropriate to assume independent
error terms. Therefore, all models were evaluated and adjusted for correlation
between error terms. The potential for autocorrelation was evaluated up to 26
time points, allowing for capture of seasonal correlation up to a year. The
second concern was that the measures may not have homoscedastic variance.
There were two potential sources of heteroscedasticity in this analysis. First,
there may be a different number of discharges averaged in each 14-day measure.
Second, as the outcomes improve, they may approach a floor at which no
further improvements are possible, and the variance around that point is
likely to tighten. All models were evaluated for autocorrelation and
heteroscedasticity and corrected when either was identified.49
With the above considerations, the following is the final form of the basic
outcome model:

$y_t = \beta_0 + \beta_1 t_{05} + \beta_2 t_{06} + \beta_3 t_{07} + \beta_4 t_{08} + \beta_5 t_{09} + \beta_6 t^2 + v_t$
In the above model, β1 – β5 represent the slope associated with the
modeled outcome during FY05 – FY09 respectively. The time component is
parameterized in order to create a continuous linear regression, so t05 counts
from 0 - 129, while t06 is 0 for the first 27 time points and then begins counting.
This parameterization continues, with each subsequent year beginning 26 points
later, thus t07 = 1 at 53, t08 at 79, and t09 at 105. The β6 term represents a
quadratic component to the overall trend. This parameter was only included in
models where it was significant (p<0.05). The final component of this model, $v_t$,
represents the autocorrelated error term, shown below:

$v_t = \sum_{i=1}^{26} \varphi_i v_{t-i} + e_t$

In this equation, $\varphi_i$ represents the degree of correlation between the error
terms of the current time point and any prior point. For these models, only those
correlations that were statistically significant (p<0.05) were included in the final
model of $v_t$. The final component of the model is the remaining error term, $e_t$:

$e_t \sim N(0, \sigma^2)$

For these models $e_t$ represents the typical assumption in linear regression
that error terms are normally distributed with mean 0 and variance $\sigma^2$. However,
as discussed, these data may not fit this assumption, so when heteroscedasticity
is detected a scaling factor $h_t$, a function of the number of discharges averaged
in each 14-day period and the level of the outcome, is used to estimate and
correct for the changing error variance:

$\sigma_t^2 = h_t\, \sigma^2$
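A compact illustration of this piecewise time parameterization and an autocorrelation-aware fit is sketched below in Python. The thesis models were fit in SAS and included the quadratic term and heteroscedasticity correction where significant; GLSAR with an AR(1) error is used here as a simple stand-in, and all data are simulated.

```python
import numpy as np
import statsmodels.api as sm

t = np.arange(130)  # 26 fourteen-day points per year, FY05-FY09
# Piecewise-linear time terms: t05 runs the whole study, and each later
# year's term switches on when that fiscal year begins, so the later betas
# capture year-specific changes in trend.
X = sm.add_constant(np.column_stack([
    t,                      # t05
    np.maximum(0, t - 26),  # t06
    np.maximum(0, t - 52),  # t07 (the FIX year)
    np.maximum(0, t - 78),  # t08
    np.maximum(0, t - 104), # t09
]))

rng = np.random.default_rng(5)
e = np.zeros(130)
for i in range(1, 130):     # AR(1) errors standing in for v_t
    e[i] = 0.3 * e[i - 1] + rng.normal(0, 0.1)
# Simulated LOS series: baseline decline plus an extra FY07 improvement.
y = 3.5 - 0.002 * t - 0.004 * np.maximum(0, t - 52) + e

# GLSAR iteratively estimates an AR error structure alongside the betas.
results = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print(results.params)
```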
Improvement and Sustainability
With the time-series equation developed, the final step was to develop a
classification approach that would identify whether hospitals improved on any
outcome measure and then which hospitals went on to sustain those
improvements. The final classification system, listed in Table 3-10, defined 11
sub-categories that collapse into 4 major categories. This approach to classifying
performance predominantly focuses on the results of parameters β3 – β5. β1 and
β2 serve to establish a baseline of performance and control for improvements
that would be expected, based on historical trends, had FIX not occurred.
The first major category is those hospitals classified as having No
Change, meaning no statistical (p<0.05) changes were observed for β1 – β4
(FY05 – FY08). The purpose of this category is to separate out those facilities
whose outcome performance was characterized by high variance, meaning any
signal was buried among a significant amount of noise. It is potentially
important to note this type of performance in quality improvement, as high
variation suggests the lack of a consistently performing process, which is a
different quality improvement challenge than a hospital simply being
unsuccessful in its efforts to improve a process. For this reason the No Change
category was kept separate from the No Benefit category. One last consideration
about the No Change category: some of these hospitals did exhibit a detectable
change in the outcome in FY09 but were still classified here, for two reasons.
First, given the high variability displayed by many of these facilities, any detected
change in FY09 was unlikely to be a true change and more likely represented a
chance occurrence. Second, any improvement observed in FY09 was too
distant from the occurrence of FIX to suggest any association.
Table 3-10: Description of full classification categories

No Change
  A.1 No changes observed from FY05 – FY09
  A.2 No changes observed from FY05 – FY08, improvement in FY09
  A.3 No changes observed from FY05 – FY08, decline in FY09
Improve, Not Sustain
  B.1 Immediate Loss: Improve in FY07, return to baseline in FY08
  B.2 Delayed Loss: Improve in FY07, return to baseline in FY09
  B.3 Delayed Impact: No change in FY07, improve in FY08
Improve and Sustain
  C.1 High Sustain: Additional improvements observed in FY08/09
  C.2 Moderate Sustain: No additional improvements in FY08/FY09
  C.3 Weak Sustain: Diminishing improvements, better than FY05/06
No Benefit
  D.1 No change in FY07, but statistical changes observed elsewhere
  D.2 Decline observed in FY07
The other three categories deal with hospitals that had observable
statistical changes during the first four years of the study. For these hospitals the
first step in the classification was to examine the performance in FY07 (β3).
Figure 3-1 is the flow chart depicting the decision process used to classify each
hospital’s performance. Starting with Part B of the figure, any facility that showed
a decline in performance during FIX was classified as D.2. While it may be
possible that facilities would show improvements in FY08 or FY09, given the lack
of directly observed data it was impossible to determine if these improvements
represented a delayed effect of FIX, the effect of a different QI project, or simple
regression to the mean. With this consideration, it was determined there was no
need to further sub-classify hospitals based on their outcomes in FY08/09 if there
was a decline in performance relative to baseline in FY07.
Next, Part C of the flow chart represented those hospitals whose
performance during FY07 was flat (i.e. performance continued on the baseline
trend established by performance in FY05 & FY06). The outcomes for these
hospitals fell into one of two categories. First, the hospital could record an
improvement in FY08 leading to classification as B.3. This was recorded as a
possible improvement attributable to FIX with the reasoning that FIX was a
yearlong effort that aimed to improve outcomes across an entire hospital. It
seemed reasonable that not all hospitals would have an immediately measurable
impact in FY07 but would instead record the biggest gains in the latter half of
FY07 and into the first half of FY08. This is certainly the weakest category for
asserting that improvements were associated with FIX and should be interpreted
appropriately.
The other possibility for hospitals that had flat performance in FY07 was
that they would continue on the pre-established baseline or exhibit some decline
in FY08. These hospitals were classified as D.1 and deemed to have had no
benefit attributable to FIX. The No Benefit category, representing hospitals with a
D.1 or D.2 classification, marks those hospitals that initially performed with low
variability allowing detection of a clear baseline trend which suggests they had
processes in place that performed with some consistency. The key feature of
these hospitals was that, as measured by the individual outcome, they were
unable to make improvements to that process as part of their participation in FIX.
The last set of hospitals is those that had an initial improvement during
FY07, which is charted in Part A. All of these hospitals are classified as
improving; it just becomes a question of whether they sustain those
improvements. Hospitals that made an improvement in FY08 or FY09 with no
declines in either time period were classified as C.1 or a high sustainer since
they not only sustained initial improvements but went on to make further
improvements. A facility that neither declined nor improved (i.e. just continued the
new baseline established in FY07) was classified as C.2, a moderate sustainer.
The last category of sustainer (C.3) was those hospitals that exhibited a
decrease in the rate of improvement in FY08 or FY09. However, their overall
performance did not decline such that their performance on the outcome returned
to pre-FIX levels. This category acknowledges that rates of improvement may
45
Figure 3-1: Decision tree used to classify hospital performance
A. Hospitals showing an initial improvement during FY07
B. Hospitals with decreased performance in FY07
C. Hospitals with non statistical (p>0.05) performance in FY07
46
level off after a QI collaborative completes, but hospitals may still maintain a high
level of performance. The final category was those hospitals that were unable to
sustain the improvements. If the hospitals returned to baseline performance in
FY08 they were classified as B.1, immediate loss. If however, they had a slower
return to baseline with it not occurring until FY09 then they were classified as
B.2, delayed loss.
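The decision logic of Figure 3-1 and Table 3-10 can be summarized in code; the sketch below is one plausible reading of the text for a single outcome, not the thesis's actual classification code, and it glosses over the magnitude comparison against baseline that separates Weak Sustain (C.3) from the loss categories.

```python
def classify(beta, sig):
    """One plausible reading of the Figure 3-1 / Table 3-10 logic.

    beta: slope-change estimates for FY07-FY09 ('b3', 'b4', 'b5');
    sig:  True where the estimate is significant (p < 0.05).
    Negative values are taken as improvement (e.g. falling LOS); the
    baseline terms (b1, b2) are assumed handled upstream.
    """
    improved = lambda k: sig[k] and beta[k] < 0
    declined = lambda k: sig[k] and beta[k] > 0
    if not (sig["b3"] or sig["b4"]):       # nothing detected through FY08
        return "A (No Change)"
    if declined("b3"):
        return "D.2 (No Benefit: decline during FIX)"
    if not sig["b3"]:                      # flat during FY07
        return "B.3 (Delayed Impact)" if improved("b4") else "D.1 (No Benefit)"
    # Improved during FY07: did the hospital sustain it?
    if improved("b4") or improved("b5"):
        return "C.1 (High Sustain)"
    if declined("b4"):
        return "B.1 (Immediate Loss)"      # or C.3 if still above baseline
    if declined("b5"):
        return "B.2 (Delayed Loss)"        # or C.3 if still above baseline
    return "C.2 (Moderate Sustain)"

print(classify({"b3": -0.02, "b4": 0.0, "b5": 0.0},
               {"b3": True, "b4": False, "b5": False}))  # C.2
```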
Sub-group Analyses
Although the later chapters of this study will provide an in-depth evaluation
of the relationship between organizational characteristics and hospital
performance, this initial evaluation did consider three sub-group comparisons.
The first comparison evaluated hospitals by size to determine if the collaborative
was effective across all size categories. Hospitals were classified as either large
(≥ 200 beds), medium (100 – 199) or small (< 100) based on the number of
approved medical/surgical beds. The second comparison evaluated whether
performance varied based on which learning session the team attended. Since
130 hospitals participated in FIX, the learning sessions were broken into five
separate regions (Northeast, Southeast, Central, Midwest, and West) to allow all
participants to actively engage.31
Lastly, the final comparison examined whether facilities that improved
(whether they sustained or not) on the primary outcomes had a different
distribution of performance on the other outcomes (particularly the secondary
outcomes) compared to the full group. This comparison ensures that these
hospitals did not have higher than expected rates of classification into No Benefit
on the secondary outcomes. All of these comparisons were done using Pearson
chi-square tests comparing the distribution of the relevant sub-group to that of
the overall group.
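Each sub-group test amounts to comparing a sub-group's category counts against expected counts formed from the overall distribution. The sketch below shows that calculation with SciPy, using the LOS major-category totals reported in Chapter 4 and a hypothetical large-hospital breakdown.

```python
import numpy as np
from scipy.stats import chisquare

# Overall LOS distribution across the four major categories (Table 4-1):
# No Change, Improve (not sustained), Sustain, No Benefit.
overall = np.array([36, 18, 27, 49])
# Hypothetical breakdown for the 16 large hospitals.
subgroup = np.array([5, 2, 4, 5])

expected = overall / overall.sum() * subgroup.sum()
stat, p = chisquare(f_obs=subgroup, f_exp=expected)
print(stat, p)
```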
Conclusions
This chapter has discussed the methods used to evaluate performance at
each hospital for each of the five outcomes of interest. Overall, the chapter
covered the data sources, defined the patient cohort and provided a detailed
description of the risk adjustment and time-series modeling processes. Although
the time-series methods used in this analysis were not novel, they have not been
applied in this manner to evaluate healthcare QI. Additionally, the classification
algorithm generated to aggregate facilities based on performance was a new
approach. This classification approach focused on understanding how facilities
may be grouped to facilitate later analyses examining how organizational
characteristics impact QI efforts. The next chapter will present and discuss the
results of this time-series evaluation and classification algorithm.
CHAPTER 4 – TIME-SERIES RESULTS AND DISCUSSION
This chapter concludes the first half of this study by presenting the results
from the analysis of the Flow Improvement Inpatient Initiative (FIX). This analysis
first considers results at the aggregate Veterans Affairs (VA) level by grouping
patient discharges across hospitals. This provides some understanding of the
overall impact of FIX. However, the real purpose of this analysis is to examine
the performance of each individual hospital using the time-series approach
outlined in Chapter 3. After presenting these results, the chapter continues with
an in-depth discussion. First, the discussion focuses on addressing the first two
specific aims for the project. Second, it considers the greater implications of the
findings for quality improvement in healthcare and whether there is support for
using large collaboratives, such as FIX, to improve quality in healthcare.
System-Wide Analysis
Although the main interest of this analysis was to understand performance
at each individual hospital, it was useful to first understand the aggregate impact
of FIX for VA as a system. Viewing the data at the aggregate level provides some
understanding of average performance and a basis for comparing high
and low performing hospitals. The five years of data in this study covered
1,690,191 discharges from 130 VA hospitals. Three of the outcome measures,
LOS, in-hospital mortality, and 30-day mortality exhibited a natural 3-4% annual
improvement in performance prior to FIX. For LOS, Figure 4-1, the time-series
model identified a subtle statistical increase in the rate of improvement during
FIX, which was sustained through the post-intervention period. This was in
contrast to in-hospital and 30-day mortality which showed no aggregate
improvements associated with FIX. In-hospital mortality, Figure 4-2, showed no
statistical changes in FY07 – FY09 from the pre-established trends. For 30-day
mortality there was a slight decline in performance in FY07, although as seen in
Figure 4-3 this decline does not mean 30-day mortality rates were rising; instead,
it only signified a leveling of 30-day mortality rates. Most likely this simply reflects
that 30-day mortality rates were reaching optimal potential performance, leaving
few achievable improvements.
The other two outcomes in this study, discharges before noon and 30-day
all-cause readmission, both were statistically flat prior to FIX. The aggregate
results for discharges before noon are perhaps the most intriguing in this study.
As shown in Figure 4-4, there is a clear improvement during and after FIX with
discharges before noon jumping to near 24% from a baseline of 17%.
Unfortunately, part way through FY08 the percentage of patients discharged
before noon began to decline, reaching a rate around 20% at the end of the
study. While this level of performance is still improved at the end of the study
compared to the baseline, it is unclear whether performance will level off at 20%
or continue to decline back to baseline. Lastly, 30-day readmissions, Figure 4-5,
showed highly variable performance with an overall worsening of performance
during FIX.
[Figure 4-1: Aggregate results for LOS (FY05 – FY09); observed 14-day values (days) with the fitted time-series model.]

[Figure 4-2: Aggregate results for in-hospital mortality (FY05 – FY09); observed percent of discharges with the fitted time-series model.]

[Figure 4-3: Aggregate results for 30-day mortality (FY05 – FY09); observed percent of discharges with the fitted time-series model.]

[Figure 4-4: Aggregate results for discharges before noon (FY05 – FY09); observed percent of discharges with the fitted time-series model.]

[Figure 4-5: Aggregate results for 30-day readmissions (FY05 – FY09); observed percent of discharges with the fitted time-series model.]
Facility Analysis
Working with this initial introduction of how FIX impacted VA performance,
the focus now shifts to classifying individual hospitals using the classification
approach outlined in Chapter 3. The breakdown in performance for all 130
hospitals across each of the 5 outcomes is listed in Table 4-1. These results
suggest there was considerable variation both within each hospital on individual
outcomes and also across each of the five outcomes. Beginning with LOS, there
were 45 (35%) hospitals that made an initial improvement with 27 (60%) able to
sustain the improvements. Further, 14 out of the 45 (31%) hospitals classified as
improvers had a delayed onset of improvements which means they were not
evaluated for whether they sustained the improvements. Of course these
successes are balanced by 36 hospitals (28%) in which there were no statistical
changes over the entire study and 49 (38%) that saw a decline or showed no
benefit associated with FIX.
Table 4-1: Hospital classification across the 5 outcome measures (N = 130)

                 LOS  Noon Discharge  30-Day Readmission  30-Day Mortality  In-Hospital Mortality
No Change
  A.1            30   25              78                  32                28
  A.2            2    3               5                   3                 2
  A.3            4    8               2                   1                 1
Improve, Not Sustain
  B.1            4    3               3                   0                 4
  B.2            0    17              0                   2                 0
  B.3            14   21              6                   4                 11
Improve and Sustain
  C.1            16   13              4                   4                 7
  C.2            8    3               1                   7                 6
  C.3            3    3               1                   8                 4
No Benefit
  D.1            36   23              19                  27                28
  D.2            13   11              11                  42                39
This breakdown between categories contrasts with how hospitals
performed on efforts to improve the percent of patients discharged before noon.
Interestingly, there was the exact same number of hospitals, 36, that showed no
statistical changes. However, they were not the same hospitals; only 13 were
categorized as No Change for both LOS and discharge before noon. For
improvements, a greater number made initial improvements, 60 (46%), but fewer
were able to sustain (19 out of 60, 32%). Once again there were a fair number of
hospitals, 21 out of 60 improvers (35%), which exhibited flat performance during
FY07 but recorded improvements in FY08. Lastly, 34 (26%) of the facilities did
not record any benefit from their participation in FIX related to increasing the rate
of discharges before noon.
The secondary outcomes were included mainly to determine whether
there were declines during FIX; there was less expectation that hospitals would
improve these outcomes in response to FIX. This expectation was supported, as
few hospitals showed improvements, with a substantial percentage recording
either no statistical changes over the study or no benefit from FIX. Of the
hospitals not recording any benefit from FIX, those classified as
D.2 would be the most concerning as that would signify a decline in performance
during FIX that might mean FIX had a negative impact. For the mortality rates, 39
(30%) had declining in-hospital mortality performance and 42 (32%) had
declining 30-day mortality performance. While these numbers are concerning,
perhaps the more telling signal would be a strong association between
improvement on the primary outcomes and a subsequent decline on the
secondary outcomes. As shown in Table 4-2 (LOS) and Table 4-3 (Discharges
before Noon), the distribution of facilities that improved on either of these
outcomes is no different than the overall distributions of all facilities, suggesting
that improvements attributable to FIX were not associated with direct declines on
the secondary outcomes. The last feature to notice in Table 4-1 was the high
proportion of hospitals (65%) that showed no statistical change on 30-day
readmission. This fits with the aggregate readmission graph (Figure 4-5), which
suggests hospital readmissions are highly variable and potentially even
associated with a random process.

Table 4-2: LOS Improvers classification (N = 45)

                 Noon Discharge  30-Day Readmission  30-Day Mortality  In-Hospital Mortality
No Change
  A.1            6               28                  9                 9
  A.2            0               2                   1                 1
  A.3            1               1                   1                 0
Improve, Not Sustain
  B.1            0               0                   0                 2
  B.2            8               0                   1                 0
  B.3            8               2                   2                 5
Improve and Sustain
  C.1            7               1                   2                 5
  C.2            1               0                   1                 1
  C.3            1               1                   4                 3
No Benefit
  D.1            10              7                   10                6
  D.2            3               3                   14                13
X2 p-value (df)  0.73 (10)       0.96 (9)            0.93 (9)          0.55 (9)
The other sub-group analyses showed that performance did not vary by
hospital size or region. Table 4-4 displays the results (p-values) of the chi-square
tests comparing the hospital size or region sub-group to the overall population
distribution. None of the comparisons were statistically significant at the p < 0.05
level, which given the number of comparisons may have been inappropriately
conservative. The full breakdown showing the number of hospitals classified into
each performance category by hospital size and region is available in Appendix
C.
Table 4-3: Discharge before noon Improvers classification (N = 60)

                 LOS  30-Day Readmission  30-Day Mortality  In-Hospital Mortality
No Change
  A.1            11   35                  16                12
  A.2            2    3                   0                 1
  A.3            3    1                   1                 1
Improve, Not Sustain
  B.1            2    2                   0                 1
  B.2            0    0                   0                 0
  B.3            8    3                   2                 5
Improve and Sustain
  C.1            8    4                   1                 4
  C.2            5    0                   3                 4
  C.3            2    1                   4                 3
No Benefit
  D.1            13   5                   13                15
  D.2            6    6                   20                14
X2 p-value (df)  0.87 (9)  0.75 (9)       0.94 (9)          0.93 (9)
Table 4-4: P-values from chi-square tests examining facility performance in sub-groups by size and regional location

Category (N)      LOS   Noon Discharge  30-Day Readmission  30-Day Mortality  In-Hospital Mortality
Size
  Small (54)      0.97  0.73            0.58                0.51              0.77
  Medium (60)     0.71  0.18            0.75                0.37              0.95
  Large (16)      0.11  0.83            0.77                0.96              0.97
Region
  Northeast (23)  0.96  0.76            0.67                0.84              0.06
  Southeast (26)  0.34  0.98            0.72                0.72              0.62
  Central (25)    0.71  0.71            0.68                0.43              0.49
  Midwest (29)    0.76  0.75            0.36                0.51              0.87
  West (27)       0.72  0.06            0.67                0.92              0.93
Evaluation of the Specific Aims
The first specific aim evaluated in this study was whether FIX positively
impacted quality and efficiency as measured by the five selected outcomes.
An evaluation of both the aggregate results and the individual facility results
leads to the conclusion that FIX did result in a reduction in LOS, an increase
in the percent of patients discharged before noon, and that these
improvements were not associated with any systematic negative impacts as
measured by mortality or readmission rates. The aggregate results for both
primary outcomes showed improvements in FY07 that were greater than
expected given pre-existing trends. Both of the mortality rates had promising
aggregate results, with in-hospital mortality showing a continuation of the
pre-existing trend during FIX. The observed leveling of 30-day
mortality rates during FIX likely reflects that there was little left to improve on that
outcome. The 30-day readmission rate did show a slight increase during FIX,
but given the high variability of this outcome (at the individual hospital level
65% had no statistical changes over the entire study) and the lack of
association between high performance on LOS or discharges before noon
and poor performance on readmissions, this increase in the rate of 30-day
readmissions was unlikely a direct effect of FIX. This conclusion is supported
by prior work which showed no increase in hospital readmissions with lower
hospital LOS.34
Although the aggregate results are impressive, the analyses at the
individual hospital level provide a more complex evaluation of FIX. At the
individual hospital level, only 35% of hospitals improved LOS and 46%
improved discharges before noon. Looking at it from another perspective, 50
hospitals (39%) did not improve on either of the primary outcomes and 30
(23%) did not improve on any of the five outcomes. These results are similar
but somewhat lower than other published reports of hospital success with
collaboratives. The most likely explanation for this difference is that most of
the other reports focused on team self-reports of success. Certainly some
teams that believed they succeeded would not have produced any
measurable improvements. Overall, the conclusion is that FIX was
successful, based on the aggregate results, but it is important to recognize
that, despite receiving the same training, individual hospital performance was
quite variable. This variation suggests that, while successful, there were
components of QI collaboratives that could be improved in order to help all
hospitals gain measurable benefits from the effort.
The second specific aim evaluated whether those hospitals that
achieved initial improvements as part of FIX sustained the improvements for
two years post-intervention. This evaluation only considers the results from
the primary outcomes given that as predicted few hospitals recorded
improvements on readmission or mortality rates during the intervention
period. The two primary outcomes paint distinctly different pictures. For LOS,
considering only those hospitals that improved in FY07 (i.e. those classified
as B.1, B.2, C.1, C.2, or C.3) 87% of them sustained improvements (27 out of
31). Further, 59% (16 out of 27) of the sustaining hospitals were classified as
high sustainers (C.1), meaning not only did they sustain a new rate of
improvement, but exhibited additional improvements after FIX. From these
results it would appear the collaborative was successful, with some individual
variation, in creating sustained quality.
In contrast, for discharges before noon, only 49% of hospitals
improving in FY07 sustained the improvements (19 out of 39). Although fewer
hospitals sustained improvements, those that did sustain were frequently high
sustainers (13 out of 19, 68%). The results of this outcome paint a less
promising picture about sustainability. With only 50% of hospitals sustaining
improvements and a clear declining trend in the aggregate results it is hard to
conclude that any solutions developed during the collaborative were
specifically designed for sustained improvements. Considering the results of
these two outcomes, the overall picture suggests that it is possible to improve
and sustain quality after a collaborative; however, there may be some
important lessons to learn from the observation that more facilities improved
and sustained for LOS compared to discharges before noon.
Discussion
This analysis found that a selection of hospitals achieved sustained
improvements as part of their participation in FIX. However, individual hospital
performance was highly variable, suggesting an opportunity to improve on the
success of hospitals participating in a QI collaborative. Since variation in
performance was consistent across all 5 regions and across hospital size
categories, it appears the collaborative was successfully implemented; it is just
that not all hospitals had measurable benefits from the experience.
Given this mixed evaluation, it is important to remember the complexity of
these outcomes and that many different factors impact the final measurement.
So while FIX strove to take a system-wide approach to improving patient flow,
there may still be factors that the framework of the collaborative did not address,
which would explain the limited success at some hospitals.
difficult to widely disseminate improvements across all medical patients in the
course of a single year. Despite these inherent limitations to the effort to evaluate
FIX, the results of the study still uncovered some interesting challenges to
achieving high quality healthcare.
These challenges are perhaps best highlighted by the overall performance
on the efforts to increase the number of discharges before noon. While not every
hospital was successful, it is evident that improvements were achieved system-wide (Figure 4-4). Yet, once the collaborative ended and focus on the
performance metric was reduced, only 49% of those that made improvements in
FY07 sustained that performance. If many QI initiatives have a similar response
profile (initial success that regresses back to the mean over time), that would
explain why there have been limited measurable improvements in quality. In
contrast, the other primary outcome, LOS, had 87% of improving hospitals go on
to sustain improvements. However, there may be inherent differences between
these two outcomes which explain the greater success in sustaining LOS
compared to discharges before noon.
Perhaps the most significant difference is the long history in healthcare of
LOS as a performance metric. As such providers generally accept the premise
that they should work to shorten LOS, recognize potential personal benefit from
shortened LOS, know their average LOS (at least physicians do), and know how
their performance compares to others. The major benefit of these features is that
providers are likely to be less resistant to change, suggesting there would be a
low barrier for teams to overcome in implementing and sustaining an intervention
designed to shorten LOS. The only real obstacle would be ensuring the
intervention was well designed and did not create unmanageable burden.
This environment stands in stark contrast to the environment around
increasing the rate of discharges before noon, which was a newly introduced
performance metric. In that case providers will not have considered it before,
know nothing or very little about current performance, and have no basis for
comparing performance. If the hospital culture is not otherwise accepting of
change, this is likely to be a change-averse situation. Successful sustainment of
improvement therefore would require a solution that not only improves outcomes
but also works to help providers accept and maintain the change. It may be this
last part, how to handle and maintain change, where implementation teams were
most likely to be unsuccessful in sustaining improvements related to discharges
before noon during FIX.
It is not surprising that teams would have difficulty achieving sustained
improvements related to discharging before noon considering that many morning
activities work in direct conflict with the process of trying to discharge patients.
The morning is when physician teams round on patients, nurses provide
medications, phlebotomy collects blood, and the labs run tests. Not only do these
activities represent a significant effort for providers, but many of them, particularly
results of morning lab tests, provide critical information for deciding whether or
not to discharge a patient. With all of these barriers, a proposed solution must not
only be effective but it must also reduce workload burden and address
information needs. If the proposed solutions did not achieve all of these
necessities, it is reasonable to predict that providers trialed the proposed
solution, found it unacceptable and then returned to the old way of caring for
patients in the morning. Such a response certainly fits with the overall observed
aggregate profile where initial improvements are quickly lost with a trend back to
pre-implementation performance levels.
Despite these concerns, it cannot be forgotten that at the aggregate level
the observed percentage of patients discharged before noon was still above the
baseline level. The final observed rate of 20% of patients discharged before noon
is a 3% absolute increase, or about a 15% relative increase, in the rate
compared to the baseline of 17%. It is worth considering the possibility that a
decline from a high around 23% of patients to a final rate of 20% may not
indicate worse care or poorer hospital flow. Instead, particularly if performance
levels off and does not continue to decline, a final rate of 20% may mean that
hospitals have achieved an appropriate balance between provider workload
burden and meeting the flow needs for their hospital. Other measures that would
better capture the flow concerns of a hospital could be emergency
department (ED) diversion rates, ED-to-medicine admission times, or the amount
of fee-basis care for medical admissions. These, however, were not considered as
measured outcomes during FIX, nor are they systematically collected. An
important lesson here is that while the metric of discharges before noon was
potentially useful for driving improvements, it needed to be evaluated in tandem
with a more clinically or business relevant metric to determine true success in
improving flow.
A secondary consideration from this analysis was the higher than
expected percentage (28%) of hospitals classified as No Change on the primary
outcomes. In fact, more hospitals recorded No Change on both primary
outcomes than recorded Sustain on both (13 compared to 5). While
discharges before noon did have a flat baseline in aggregate, to have so many
hospitals exhibit no statistical change in LOS, which has a distinct trend, was
particularly surprising to observe among hospitals participating in a QI collaborative.
These data serve as a stark reminder that improvements in quality require a
standardized process that can be analyzed and improved. QI teams should
remember that they first must understand the relevant process, or lack of
process, before trying to make change.
In the end, whether the implemented solutions were ineffective, not
needed, or unacceptable, the result is that individual performance varied
considerably despite all participants receiving the same training and having
access to national resources. Not only did just a fraction of hospitals show
sustained improvements through FY09, but 50 (39%) hospitals did not show any
improvement on the two primary outcomes. This leads to the concern that
perhaps the collaborative approach to QI does not add any additional value
compared to a more individual hospital approach to QI (i.e. a QI project not
associated with a collaborative). The main reason for making this comparison is
that a collaborative can be an expensive undertaking, the estimated cost of FIX
for VA was $5.8 million.32 When hospitals have to pay for this directly out of their
budget (it is not clear who bore the individual costs of FIX) they may not want to
participate should they recognize that anywhere from a third to a half of
participating hospitals would not improve on measured patient outcomes.
However, there are two points worth considering when evaluating the tradeoff
between in-house QI projects and QI collaboratives.
First, collaboratives likely provide many benefits beyond measurable
improvements in outcomes. A key purported benefit of a collaborative is that it
brings hospitals together to learn skills, coordinate activities, and share
knowledge. For a hospital with limited experience with QI, a collaborative may
provide many worthwhile benefits even if that collaborative cannot be directly
associated with improved quality. These sorts of benefits have been noted in
prior analyses of collaboratives which often acknowledge important cultural
changes.17 Unfortunately, no data were collected on these types of benefits
during FIX, so it was not possible to factor any benefits of this nature into the
analysis. While hospitals could achieve some of these benefits from an in-house
QI effort, if they have to bring in outside resources to provide initial training, the
cost is likely to be the same if not more than the cost of training at a
collaborative. Second, there is no good basis for understanding the individual
success rate with in-house QI efforts. Further, there is little data about the costs
of these QI projects. Considering that individual QI efforts are not uniformly
successful and have many associated costs as well, investing in a collaborative
may represent little additional risk.
Given the generally poor knowledge about QI success rates and costs,
there is no clear conclusion about whether a QI collaborative is a worthwhile
investment. However, given the potential for hospitals to work together and share
knowledge, there should be a general benefit from participating in a collaborative.
Therefore, the second half of this study works to develop an understanding of
what factors may predict an ability to succeed in a collaborative. This
understanding can help hospitals decide if they can succeed in a collaborative
and, if they cannot, identify the issues they should focus on in order to create
an environment that will support a successful QI collaborative.
Limitations
While this study generated some intriguing results, it is important to
remember that these were exploratory analyses that were subject to some key
limitations. First, these results are based on administrative data. This means
there are many unmeasured and unaccounted for variables. A key consideration
is that this meant the analyses could not be tied to specific areas of a hospital if
improvements were initially trialed on specific units before dissemination. In the
case of FIX, a hospital-wide approach is supported and in some ways most
appropriate. Since FIX aimed to improve flow throughout the entire hospital,
improvement projects should have targeted broad initiatives that improved flow
for all patients not just a small subset. This is also why the internal evaluation of
FIX considered all patients, not just those on targeted wards. So if teams did only
make small improvements during FIX, while this would be beneficial, there is
reason to argue that this would not have been a fully successful collaborative
experience.
Second, these results cannot isolate the impact of FIX. FIX was not a
one-time, isolated QI initiative, but rather the first of many systems redesign
collaboratives (some examples are Patient Flow Center, Transitioning Levels of
Care, and Bedside Care Collaborative), some of which occurred during the
two-year follow-up period. Additionally, VA hospitals have been encouraged to
conduct numerous local QI efforts each year. Some of these other projects are
likely to impact the measured outcomes (LOS and 30-day readmission in
particular) meaning the detected improvements can only weakly be attributed to
FIX. However, the impact of these other QI projects is of limited concern for two
reasons. First, the time-series analysis accounts for baseline trends in the
outcomes. So to the degree VA hospitals maintain a regular focus on QI projects,
the national focus on FIX represents a single increase in effort and all other QI
projects would be accounted for by the baseline trends. Second, it is reasonable
to expect that for complex outcomes, such as LOS and discharge before noon,
sustained quality will not come out of a single QI project. Instead the importance
of any single QI project may be the attention it brings to a topic, the training it
provides team members, and its contribution to a greater culture focused on QI.
With these considerations, sustained results generated by a continuous
cycle of improvement initiated in response to FIX would be just as meaningful.
The final limitation of this analysis was the lack of information at the
individual team level. Key metrics such as team leadership quality, support from
hospital leadership, and actual team engagement with FIX would provide critical
information for distinguishing high and low performers. FIX was a mandated QI
collaborative, thus much of the variation in performance may simply be due to
varying levels of engagement by teams or hospitals with the collaborative. Even if
this is the reason for non-success, it is telling for VA and other policy makers that
simple presence at a QI collaborative did not ensure success.
Conclusions
This chapter brings to a conclusion the first half of this study, which utilized
a five-year time-series analysis to evaluate whether a large QI collaborative led
to sustained improvements in quality as measured by two primary outcomes,
LOS and discharges before noon. The analyses found that in aggregate there
were improvements in LOS and discharges before noon. However, performance
at individual hospitals was quite variable and not all hospitals showed
improvements. For those hospitals that improved, there was a high likelihood of
sustaining LOS but a low likelihood of sustaining discharges before noon. Some
of the decline may just reflect a balance between patient flow and provider
workload. However, if many other newly introduced quality metrics see a similar
post-implementation decline, it will be difficult to achieve substantial
improvements in quality. The study also considered three secondary outcomes
which, as expected, showed little change and little impact associated with FIX.
Based on this analysis, there are four important findings. First, in
comparison to the traditional pre-post study involving team-reported success, an
analysis that accounts for pre-existing temporal trends in patient outcomes leads
to the identification of a smaller than expected group of QI teams that made initial
improvements. Second, there may be significant loss of quality, or regression to
the mean, after the completion of QI projects. Third, this novel classification
approach highlighted that many hospitals operate with processes that lead to
highly variable performance. These hospitals likely need to focus on creating a
standardized process before undertaking serious efforts to improve any of those
processes. Fourth, this analysis showed that success can be achieved across
multiple hospital settings; but given the overall variation there needs to be a
better understanding of what factors predict success in a collaborative.
CHAPTER 5 – SUPPORTING QUALITY IMPROVEMENT
This chapter begins the second half of this project which considers
another body of literature, develops an analytic framework, and analyzes survey
data in conjunction with the results from the prior analysis to meet the goals of
the study's third specific aim. This specific aim was to describe how selected
components of an organization’s structure were associated with an ability to
sustain improvements in quality. The first half of this chapter reviews the
extensive literature that evaluates the relationships between different
organizational characteristics and high quality healthcare. The second half then
works from the conclusions reached in this literature to develop a guiding analytic
framework. The goal of this analytic framework is to posit how
different classes of organizational characteristics interact to generate an
environment that may or may not support successful QI initiatives. The next
chapter in this section then discusses how the framework was applied to analyze
FIX and the methods used to generate hypotheses based on those results. The
third chapter of this section then presents and discusses the results of the
analysis as well as discussing their implications for QI and the overall framework.
Relationships with Healthcare Quality
There have been a number of studies that evaluated whether different
features or characteristics of an organization were associated with higher quality
healthcare. This literature has been nicely summarized in three systematic
reviews. The first of these systematic reviews evaluated 81 publications that
examined the relationship between a measured organizational variable and
mortality rates.50 While mortality rates were the primary outcome of interest, the
review also included studies that evaluated other adverse healthcare outcomes
such as nosocomial infections, falls, and medication errors. The review
considered structural variables (professional expertise, professionalization, nurse
to patient ratio, care team mix, not for profit status, teaching status, hospital size,
technology use, and location), organizational process variables (measures of
caregiver interaction, patient volumes), and clinical process variables (implicit
quality, explicit quality, and computer decision support). The general conclusion
of the review was that the body of evidence for each of the organizational
variable categories was equivocal at best. The only organizational variable with
consistently positive impact on mortality rates was having high levels of
technology, which at the time of these studies meant having access to equipment
such as ventilators and pacemakers.50, 51
The second review built on this first review by focusing the literature
review on how each study operationally defined the outcome of interest. The
objective was to determine whether the operational definitions for the studied
adverse events identified a mechanism through which altering an organizational
characteristic could realistically improve a care process and result in fewer
adverse events.52 Based on the lack of consistent evidence showing an
association between any single organizational characteristic and improved
quality, the authors theorized that perhaps adverse events were too broadly
defined, meaning there were too many factors impacting quality, and thus the
measured characteristics could not reasonably lead to improved quality. This
review analyzed 42 articles that provided 67 measures of different organizational
characteristics and their association with medical errors and patient safety
outcomes. The measured organizational characteristics broke down into 13
groups: team quality, implementation of standard operating procedure use,
feedback, technology, training, leadership, staffing, communication, simplification
of the work process, culture, organization structure, employee empowerment,
and group decision making. The operational definitions for adverse events in the
studies included medication errors, medication complications, diagnostic errors,
treatment-related errors, patient falls, specimen labeling errors, and other non-specified patient safety concerns. The authors noted that while most of the
studies focused on medication errors and complications, there was no
consistency across studies in how to define and measure a medication error or
complication. This made drawing any systematic conclusions about
organizational characteristics and adverse events a challenge. Additionally, the
authors noted that only 9 of the studies provided sufficient detail that would allow
the reader to identify a specific relationship between an organizational variable
and the measured adverse event. Given these limitations, as well as others, the
review concluded that there were no generalizable statements about how a
specific organizational factor could address errors or safety in healthcare.52
The third of these systematic reviews continued to refine the process, this
time by using Donabedian’s structure-process-outcome model as a framework for
structuring the analysis.53 This review identified 92 articles and analyzed them to
understand whether sequentially close Donabedian relationships (e.g. process –
outcome) had more consistent and positive findings than distant relationships
(e.g. structure – outcome).54 The review also examined whether studies
considered definitions of quality that included improving services rather than
simply defining quality as a reduction in negative events. The study evaluated 19
structure-process, 58 structure-outcome, 20 process-outcome, and 9 process-process relationships. Much like the prior reviews, this systematic review found
that the preponderance of organizational factors studied were associated with
non-significant findings.54 These non-significant findings were most frequently
found when examining the distant structure – outcome relationships, which was
the most commonly examined relationship in the literature. A general concern
with these studies was that they did not consider or evaluate any of the
intervening process variables, which would have helped illuminate why some studies identified positive impacts while others had negative or non-significant outcomes. When studies examined the sequential Donabedian
relationships of structure-process or process-outcome, cross study results were
more consistent and there were greater odds of detecting a statistically
significant relationship between an organizational variable and a measure of
improved quality.
The review of this literature highlights that components of organizational
structure and care quality have a complex relationship that was difficult to
analyze. Those components that have a direct cause-effect relationship (e.g.
certain forms of technology, nurse-patient ratios) quite frequently have positive
effects on quality. However, more peripheral factors (e.g. affiliation with a medical
university) that do not have that direct linear relationship will show contradictory
results across studies leading to a conclusion of non-significant impact when
analyzed in aggregate.
One key conclusion from this research was that multiple organizational
characteristics contribute to any single measure of quality. Therefore, any
analysis that does not appropriately model the complex relationships between
organizational characteristics and quality outcomes cannot expect to ascertain a
strong relationship between factors. This approach would likely require a multilevel analysis that could test how different variables interact and mediate each
other to support quality. Few studies have the data for this type of analysis, but
when such data was available it did help identify meaningful relationships, even
helping identify how important intervening factors could inhibit quality. For
example, an analysis of reengineering efforts across 497 hospitals initially found
that reengineering was detrimental from a cost competitive standpoint.55
However, when using a multivariable analysis that adjusted for indicators of
organizational support and quality of the implementation, the study identified
trends showing that if successfully implemented the reengineering efforts were
beneficial.55 Of course, as potentially indicated by the variability in performance
with FIX, the question of how to successfully implement QI is an important and
little examined topic.
Before addressing the literature related to the implementation of QI, there
were some key limitations associated with these reviews and the studies they
summarized. The first limitation was the difficulty associated with defining and
measuring quality. Early studies focused on efforts to reduce mortality rates; as a generally rare and complex event, mortality was difficult for any broad organizational characteristic to significantly impact.50 Later efforts identified more
modifiable targets of quality (e.g. reduce adverse events, improve patient
satisfaction) and were able to uncover some relationships. However, the
operational definition of the same outcome frequently varied between studies,
making it difficult to determine whether any relationships existed across
healthcare institutions or only in those where the studies occurred. Some of
these same problems plagued the analysis of FIX: LOS and discharges before noon represented composite outcomes that likely did not measure the true quality goals, and this limitation will impact the results of the analyses in this
study. Some recent efforts have addressed these issues and will lead to better
and more consistent measures of quality. As one example, the National
Healthcare Quality Report published annually since 2003 promotes the
systematic collection of quality measures allowing comparisons between
hospitals.2
A second limitation identified in the reviews was weak methodology. One
weakness of the early studies was that they did not adjust for patient severity.
However, now that risk adjustment is an accepted standard in health services research, the more recent studies all used appropriate risk-adjustment procedures. Even so, these studies often suffered
from methodologically weak study designs. Most of these studies employed an
observational study design and could not address any characteristics that varied
between different healthcare institutions and how those variables may confound
any observed relationships. In fact, given the number of postulated organizational factors that may impact quality, and with each individual study considering only a few organizational characteristics, these studies all potentially suffered from significant
unmeasured confounding. A few studies did use a stronger methodology and utilized an interventional design with quality measured before and after a change in the organizational characteristic. These studies, however, utilized pre-post
designs, did not consider any natural trends in the outcomes, frequently analyzed
distant structure-outcome relationships, and only reported results from a single
site. A number of biases, particularly historical bias and regression to the mean,
threaten the validity of these studies.
While not an inherent limitation of these systematic reviews, one final consideration was that the reviews focused only on how the presence or absence of different organizational characteristics was associated with quality. However, it
may be more important to evaluate how an organizational characteristic supports
the process of improving quality. This concept moves away from efforts focused
on identifying distant relationships between features and instead explores how QI
teams conduct improvement projects and how they use resources and otherwise
interact with their surrounding environment. The first step in this process was to
examine whether different organizational characteristics were associated with
successful QI initiatives.
Relationships with Quality Improvement Efforts
The relationship between organizational characteristics and quality
improvement efforts has been less studied, but there are three notable studies to
consider. The first of these studies considers the process of organizational
learning in neonatal intensive care units (NICU).56 This paper synthesizes
theories from best-practice transfer, team learning, and process change to
develop hypotheses testing the relationship between concepts such as learn-what (activities related to learning what the best practice is), learn-how (activities
related to operationalizing or implementing best practice), and psychological
safety with success in a QI initiative. The data in the study represents 1,440
survey respondents spread over 23 NICUs. The results of the survey indicated
that perceived implementation success was associated with respondents feeling
there was a greater body of evidence supporting the intervention, a greater
sense of psychological safety at the institution, and high use of learn-how
activities. They did not find any association with learn-what activities, nor did any
of the control variables measuring structural characteristics have any impact.
Some limitations of the study were that it only studied 23 NICUs that all had self-selected into the collaborative. Additionally, among all the NICUs participating in
the collaborative, few agreed to participate in this study, and there was a low response rate among providers at the NICUs that did participate in the
survey. Although this study did not examine more traditional organizational
characteristics, it did establish that certain characteristics are associated with
perceived success at implementation of a QI collaborative.
The next critical article was a systematic review that examined how organizational context was related to quality improvement success. The majority of
the 47 studies in the review examined QI projects associated with the Total
Quality Management (TQM) or Continuous Quality Improvement (CQI)
approaches.57 The analyzed studies most frequently measured success with QI
based on pre-post data. A small selection of the studies reported only team-perceived success. Factors that were associated with improvement were
management leadership, organizational culture, use of information systems, and
prior experience with QI. Additionally, there was support for physician
involvement, microsystem motivation to change, available resources and quality
of the QI team leadership. The findings of this review were difficult to interpret
since it could only measure those factors included in the reports, none of which
had the specific goal of testing the role of specific organizational characteristics.
As such, any individual factor was only mentioned in 20% of articles, leading to sample sizes too small to draw any conclusions from. The strength of the paper is
that it starts to identify a collection of variables that studies should evaluate when
working to identify which organizational characteristics best support QI.
The last article considered reported on 99 interviews conducted at 12
hospitals that participated in the Door-to-Balloon (D2B) Alliance.17 The hospitals
were recruited into this study based on their reported influence of the D2B
Alliance on improving care at their hospital, with 6 reporting a strong influence
and 6 a limited influence. Their qualitative analysis of the interviews was based
on a realistic evaluation framework focused on identifying the contextual
environment that led to the hospitals' perceived impact of the D2B Alliance. This analysis revealed that a perceived need to change, openness to external sources
of information, and a strong champion for change were all contextual factors
consistently associated with the D2B Alliance having a strong impact. While this
study only considers a small number of hospitals, the interviews provided a
wealth of information on various organizational characteristics, providing the best
assurance that the identified associations between organizational characteristics
and QI success were at least true associations at those individual hospitals.
This collection of articles suggested that a number of factors can impact a
team’s success with a QI effort. In contrast to the prior section, the supported
organizational characteristics are generally closely associated along the causal
pathway with the measured outcome of interest. The most notable exceptions to
this concept were the more broadly defined features such as psychological safety
and organizational culture. While it is important to recognize that the identified
organizational characteristics were associated with successful QI efforts, there is little available information on what constituted a good organizational culture or
supportive leadership. The next challenge for healthcare QI may be in
determining how to best create the environment and necessary support
structures to allow effective QI.
In concluding this literature synthesis around the relationship between
organizational characteristics and healthcare quality, three key concepts stand out. Future studies should focus on these concepts as they
work to overcome the limitations of this prior work as well as begin to develop an
understanding of how to best improve healthcare quality. First, there should be
consideration of how organizational features and processes interact to support
quality. Second, the overall context of an organization impacts its QI efforts. As
such, analyses need to compare across multiple organizations in order to best
understand the relationships between organizational characteristics and
outcomes. Third, longitudinal analyses related to specific interventions will help
establish a causal relationship showing how structures support quality.
This study’s analysis of the results from the FIX collaborative addresses
some of these limitations. The analyses use survey data collected during FIX to
understand how a large collection of organizational characteristics were
associated with performance during FIX. The focus was to identify whether any
modifiable organizational characteristics were part of a collection of
characteristics commonly associated with success in FIX. The identified
characteristics would then be potential targets for intervention allowing an
unsuccessful hospital to adopt changes that will help support future QI efforts.
Analytic Framework
In order to best understand how organizational characteristics related to
FIX performance, the first step was to develop an analytic framework to structure
the analyses. The starting point in this process was to identify a theoretical
approach to guide the development. Based on the literature review, there was no
established theoretical approach guiding the field. After surveying a selection of
organizational theories, realistic evaluation was selected as the approach that
best matched the purpose of this analysis. Realistic evaluation theory, originally
developed for improving the quality of evaluation for public policy interventions,
focuses on understanding the context of the situation where an intervention
occurs and how factors interact to lead to the observed result.58 A common quote
that succinctly summarizes the theory is to understand “what works for whom in
what circumstances.”58 In effect, the work argues that success in one situation
will not always translate to another situation and that it is a complex interaction of
factors that results in improvement or failure. This theory makes two important contributions to this analysis. First, it led to the decision to use a data
mining approach to analyze the data. The support for this decision will be
discussed in the next chapter. Second, it provides the superstructure for the
framework. This superstructure conceptualizes a QI effort (in this case FIX) as an
external stimulus applied to a specific organizational context. This organizational
context responds to the QI effort and produces a set of measurable outcomes. A
model of this framework is outlined in Figure 5-1.
This superstructure, however, does not address the key objective of realist
evaluation, which was to thoroughly understand the characteristics of the
organizational context and how those characteristics interact to generate the
outcomes. To understand this required developing a more detailed model of the
organizational context that shapes a QI effort. This process began with a
consideration of the organizational characteristics covered in the literature. This
consideration revealed that there was no succinct list of factors, but instead
suggested that factors may be categorized into specific classes. A further
refinement of this concept came from a review of the SQUIRE (Standards for
QUality Improvement Reporting Excellence) Guidelines.59 These publication
guidelines encourage authors to describe various aspects of the organizational
context that might impact a QI project. The consideration of these two factors led
to the identification of four classes of contextual factors that may impact success
with QI efforts: 1) Facility structure, 2) QI structure, 3) QI processes, and 4) Team character.
Figure 5-1: Analytic framework for how organizational context impacts QI
The first class, facility structure, represented factors describing the basic
structural characteristics of the healthcare institution. These factors were
conceptualized as generally unmodifiable variables (e.g. facility size). Despite
their unmodifiable nature, these factors create a critical foundation that not only
supports but also interacts with the other classes to create the environment that
responds to the QI project. So even though these factors may be unmodifiable,
their interactions were critical and necessary to include in the analytic framework.
The next class, QI structure, also represented structural components, but
these were distinguished from facility structural factors in two ways. First,
variables selected for this class should be more likely to directly impact or
support QI activities. Second, these structural variables should be more
modifiable than the innate characteristics of a hospital. Some examples of
variables that might fit into this category include nurse to patient ratios, levels of
support staff, or the availability of critical resources for providing quality care. In
total, these first two variable classes provide a general context for understanding
overall characteristics, unique challenges, and available resources for successful
QI.
The third class, QI processes, consisted of factors that measured prior
experience with QI. The goal of these factors was to understand how ubiquitous
QI is in the environment. These factors were considered important to include
based on two theories. Most directly, hospitals that consistently pursue QI should
be more likely to have determined how to best run a QI project. Further, the more
ubiquitous QI was at a facility, the greater the odds of sustaining improvements due
to a continuous cycle of improvement preventing any significant decline in quality.
Indirectly, high levels of QI activities should increase the likelihood of an overall
hospital culture that supports QI. When providers accept, support and participate
in QI, there should be a decreased prevalence of change resistance suggesting a
greater probability for successful implementation of QI solutions.
The last class, team character, consists of variables defining the QI team.
Important variables in this class would measure team make-up, team functioning,
and team organization. These variables measure the quality of the team, as it is important to recognize, particularly in the setting of a failure, whether it was poor support that a capable team could not overcome, or whether support seemed to be present but the QI team was unsuccessful because it could not function effectively.
The last step in developing the analytic framework was to consider how to
model the interaction between each of the four components. Some of the most
recent analyses of organizational structure have been based on Donabedian’s
structure-process-outcome model for quality assurance.53 A major concern with
that framework was its prescription of a tight linear interaction between
sequential components in the model. It seems more likely that for quality
improvement there is a complex interplay between components to generate the
specific organizational context. A potential example of this is the interaction
between QI process and QI structure. As QI activity becomes increasingly
common, hospitals are more likely to see benefit from QI and increase their
willingness or desire to invest in QI structure. Therefore, rather than seeing these
four components as part of a causal pathway, they are thought of as layers that
build on top of each other.
These layers are represented in a triangle as it helps to emphasize a few
concepts. First, it introduces the concept that organizations need to build up for
QI success. It seems unlikely that even the most highly functioning team can
succeed if a proper foundation does not support the efforts. Building on this
concept, the area associated with each class of variables signifies its relative importance. Although a hospital may not be able to modify its innate characteristics, it is important to understand how those characteristics impact how
the hospital functions. A successful QI effort in a large urban hospital may not
translate to a small low volume critical access hospital and it is important to
recognize and understand what characteristics result in success and failure in
these disparate settings. Lastly, the overall shape should convey the idea of
scaling or climbing a mountain. The intention is to remind people that
improving quality is not an easy task, but instead a skill that must be carefully
honed and perfected if there is a hope of achieving the ultimate summit.
Conclusions
This chapter has focused on understanding how organizational
characteristics relate to quality improvement. A review of the literature revealed
that, although the topic has been extensively studied, there were few conclusions about which
organizational characteristics can improve quality or effectively support QI efforts.
This was most likely attributable to the complex nature of healthcare and the
inability to isolate any single factor. Instead, efforts to understand how
organizational characteristics can improve or support quality need to consider
how individual factors interact, an effort that likely requires more complex
modeling and analytic approaches. In the first step along this path this chapter
introduced a new analytic framework. This framework identified 4 key classes of
factors that likely play a role in modulating the success of a QI project. This
analytic framework will be further explored in the following chapter as it is applied
to the FIX initiative to understand whether any collections of organizational
factors were commonly associated with an ability to improve and then sustain
those improvements.
CHAPTER 6 – ANALYTIC VARIABLES AND DATA MINING
This chapter defines and provides an overview of the analysis of how organizational context modifies a quality improvement (QI) collaborative to produce measured
outcomes. The first portion of this chapter continues to expound on the analytic
framework introduced in chapter 5 by discussing two additional data sources that
served as the basis for this analysis. These two surveys measured a number of
organizational variables in Veterans Affairs (VA) hospitals during FY07, the same
year as the Flow Improvement Inpatient Initiative (FIX). The measured variables
will be introduced and classified into categories based on the analytic framework
and their perceived relationships to QI activities. The second portion of this
chapter addresses the analytic methods. This involves an introduction to the data
mining process as well as the specific method used in this analysis, a decision
tree. After covering the details involved in establishing the dataset and analyzing
the dataset, the chapter concludes with a brief discussion about the process of
interpreting and evaluating the decision tree. This covers both the identification of
hypotheses to drive future research as well as determining whether the models
suggest a need to modify the analytic framework.
Organizational Characteristics in VA
While this research examined individual hospital performance in response
to participation in FIX, it cannot be forgotten that these hospitals are all part of VA
healthcare, the largest integrated healthcare system in the US. There are many
characteristics about VA healthcare that make it a useful case study for this
analysis but also may impact how well some of the findings generalize to a larger
population. VA healthcare represents one-third of the VA, a cabinet-level office in the federal government with direct congressional oversight. Currently over 5.5
million veterans receive some portion of their healthcare through VA, 3.1 million
of whom have conditions connected to their military service.60 The VA patient
population is just over 8% female and 20% minority. The healthcare network
includes over 1,000 service locations including the 130 acute care hospitals in
this study.
As part of a large integrated system, VA healthcare has developed
regional divisions that coordinate efforts in a region and promote high quality
care. The VA adopted this regional network, known as the Veterans Integrated
Service Network (VISN), in 1996.61 The 21 individual VISNs promote high quality
care through a number of mechanisms including VISN-level budget control,
adoption of VISN mandated performance measures, establishment of drug
formularies, and the promotion or coordination of QI efforts. The VISN structure
potentially impacts this study as interactions between an individual hospital, its
network hospitals, and VISN leadership could modify a hospital’s behavior in
response to a QI effort. As such, VISN membership as well as other measures of
hospital-VISN interaction were considered in the analyzed variables.
Another critical feature of the integrated VA network is the existence of a
comprehensive electronic medical record. Early versions of the VA electronic
medical record first appeared in 1978.62 Over the years the electronic medical
record, known as CPRS, has evolved into a format that is highly standardized but
also allows for flexibility. The basic standardized structure supports patient care
and uniform data collection across facilities. The flexibility in the interface allows
individual facilities to develop, test, and implement unique solutions to address
local needs. As such, the electronic medical record and its interface play a significant role in many QI solutions, and measures of CPRS use as part of quality improvement were appropriately represented in many study variables.
An additional key characteristic of VA that supports this research is its
broad survey culture. VA conducts numerous surveys each year; some are
repeated in regular intervals while others represent targeted research efforts.
This study utilized data from two surveys both of which were completed by key
hospital representatives during FY07, making these surveys an accurate
snapshot of the organizational context at the time of FIX. The first of these
surveys, Survey of Intensive Care Units & Acute Inpatient Medical & Surgical
Care in VHA (HAIG), was a biennial survey that recorded key facility attributes and that all VA hospitals completed.13 Any values reported in this survey were
considered the official record for those features.
The other survey used in this study, The VA Clinical Practice
Organizational Survey (CPOS), was a one-time survey designed to evaluate
clinical practice characteristics that may be associated with quality care and high
performance.14 The objective of the survey was to measure organizational
readiness for change, particularly in the primary care arena, since VA was shifting
its care focus away from acute care episodes to coordinated primary care efforts.
Since many VA hospitals support both acute care and primary care facilities, a
number of the survey responses are relevant to the FIX efforts to improve acute
care flow. The survey was sent to Chiefs of Staff at 160 VA facilities, with 86% of
facilities responding. A few data elements overlap between these two surveys; since the HAIG survey is more complete and considered an official VA record, it was used in any situation where data elements were duplicated.
VA Hospital Organizational Context
The process of providing inpatient care, let alone attempting to improve
the quality of that care, is exceedingly complex. As such there were a
considerable number of variables identified in these two surveys that may help
characterize how organizational context responds to a QI effort to produce
measured outcomes. This section walks through these variables and where
appropriate discusses how the variables were categorized into the various
classes established in the analytic framework.
An important first step to describing and understanding these variables is
to understand the different scales used for categorical variables in the CPOS
survey. A total of 12 response scales were used in the CPOS survey, with the
scales varying from 3 to 6 response options. Table 6-1 lists each of these scales,
with the first column listing the stem of the response. For example, the stem
“useful” with 3 response options means that the respondents selected from the
options: not useful, somewhat useful, or very useful. Those stems that are
numbered signify that there were multiple scales with the same stem. The stems
in this table serve as the “type” labels in the tables that follow in this section. For
analytic purposes, these variables were recorded with numeric values equivalent
to the scale levels in the table.
Table 6-1: Categories for different response scales in the CPOS survey

Stem        | 1            | 2           | 3             | 4              | 5               | 6
------------|--------------|-------------|---------------|----------------|-----------------|-----------
Useful      | Not          | Somewhat    | Very          |                |                 |
Barrier     | Not          | Small       | Moderate      | Large          |                 |
Monitoring  | Not          | Annually    | Quarterly     | Monthly        |                 |
Importance  | Not          | Somewhat    | Moderately    | Very           |                 |
Implement1  | No Plans     | Planned     | Partially     | Fully          |                 |
Implement2  | Not          | V. Little   | Some          | Great          | V. Great        |
Challenge   | V. Difficult | Difficult   | Neutral       | Easy           | V. Easy         |
Responsible | Mostly VISN  | More VISN   | Share equally | More Hospital  | Mostly Hospital |
Sufficient1 | Not          | Barely      | Somewhat      | Mostly         | Completely      |
Sufficient2 | Never        | Rarely      | Sometimes     | Usually        | Always          |
Cooperate   | V. Great     | Great       | Some          | V. Little      | None            |
Percentage  | None (0%)    | Few (1-20%) | Some (21-40%) | ~Half (41-60%) | Most (61-90%)   | All (≥91%)
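To make this recoding concrete, the following minimal sketch (written for this discussion, not taken from the study's analysis files; the stems and response labels follow Table 6-1) maps a text response to its 1-based numeric scale level:

    # Minimal sketch of recoding CPOS text responses to the numeric
    # scale levels in Table 6-1 (three stems shown; the others follow
    # the same pattern). Illustrative only, not the study's code.
    SCALE_LEVELS = {
        "Useful": ["Not", "Somewhat", "Very"],
        "Barrier": ["Not", "Small", "Moderate", "Large"],
        "Sufficient1": ["Not", "Barely", "Somewhat", "Mostly", "Completely"],
    }

    def recode(stem: str, response: str) -> int:
        """Return the 1-based scale level for a text response."""
        return SCALE_LEVELS[stem].index(response) + 1

    assert recode("Barrier", "Moderate") == 3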
Working from the base of the triangle that describes the organizational
context in the analytical framework, the first collection of variables to describe
were those that measure characteristics of the facility structure. These variables,
listed in Table 6-2, ideally measure facility characteristics that were not related to
specific quality improvement efforts and were generally stable from year to year.
The first six variables in the table, region, VISN, facility type, wards, ICU level,
and ICU status, were easily identified as basic hospital demographic variables
making them measures of facility structure. The next variable, academic
affiliation, was a strong candidate for this class but it was frequently tested in
early studies for its relationship with quality, suggesting it could be classified as a
QI structure. Considering that in those studies academic affiliation had no
consistent direct association with mortality rates and that the associations
between VA hospitals and academic institutions were static, the conclusion was
that any association academic affiliation has with quality was distant and more
emblematic of a measure of facility structure. The counts of operational beds
(whether ICU or acute care) were slotted into this class because while they may
change, in VA they do not change dynamically but instead only in response to
long-term facility planning, which has been driven by patient volume or building remodeling, not by a specific consideration about improving quality.
Table 6-2: Variables measuring facility structure

Variable                      | Source     | Type        | Description
------------------------------|------------|-------------|--------------------------------------------------------------
Region                        | —          | Categorical | FIX learning session region
VISN                          | —          | Categorical | Veterans Integrated Service Network
Facility Type                 | HAIG       | Categorical | Primary, secondary or tertiary
Ward                          | HAIG       | Count       | Number of 9 different types of wards in a hospital
ICU Level                     | HAIG       | Categorical | Level 1, 2, 3, 4 or No ICU
ICU Status                    | HAIG       | Categorical | Closed, Open, Open with Mandatory Consult
Academic Affiliation          | HAIG       | Yes/No      | Is hospital affiliated with an academic medical center
# Operational ICU Beds        | HAIG       | Continuous  | Total number of active ICU beds
# Operational Acute Care Beds | HAIG       | Continuous  | Total number of active medical and surgical beds
Annual Volume                 | PTF        | Count       | Number of FY07 medicine discharges
Rural                         | PTF        | Categorical | Based on % of rural patients
Total Wards                   | HAIG       | Count       | Total number of separate wards
Specialty Wards               | HAIG       | Count       | Number of telemetry, step-down or respiratory specific wards
Discharges/Bed                | Calculated | Continuous  | # FY07 discharges / # acute beds
The last individual variables in this class, annual volume and rural, were
two measures that reflect broad features (size and location) of a hospital’s patient
population. These data come from the Patient Treatment File (PTF) rather than from the surveys. The rural classification is a three-tiered classification of
urban, rural, and highly rural based on the percentage of patients discharged
from the facility in FY07 that fall into classifications of Small Rural or Isolated
using the Rural-Urban Commuting Area Code system.63, 64 These variables were
classified as facility structures because while they may change from year-to-year
based on which patients seek inpatient care, the changes were outside the
hospitals’ control and cannot be manipulated to directly impact quality.
Beyond these individual variables available in the surveys or patient
records, this class also included three composite or calculated variables. The two
composite variables dealt with the number of wards in the hospital. The first was
a count of the total number of wards, while the second was a count of the
specialty wards in the hospital. Generally, the number of wards serves as an
additional marker for facility size although they may also have potential positive
or negative associations with quality. For example, allowing specialization on
wards may improve quality for certain conditions, but this situation could also
increase the number of in-hospital care transitions which can be vulnerable
periods. The calculated variable, discharges per bed, represented the simple
division of the total annual volume over the total number of active acute care
beds. This was conceptualized as a crude measure of workload or provider
burden, with the theory that it may be a marker for provider motivation to accept
suggested changes for both of the primary FIX analysis outcomes.
The next set of variables, listed in Table 6-3, represent the individual
measures of hospital QI structure. These variables consider hospital
characteristics that exhibit increased flexibility compared to those identified as
measures of facility structure. Additionally, these variables should have more
direct theorized associations with quality. As a broad set, these variables
generally measure the structures necessary for providing care. The quality or
quantity of these structures impacts not only the ability to provide basic care but
potentially the ability to undertake improvement efforts. The underlying theory is that an effective structure for supporting QI likely requires providers,
support staff and resources that sufficiently meet basic patient needs while also
having the flexibility or additional capacity to support QI efforts. As an example,
hospitalists were classified as QI structure because one influence in adopting a
hospitalist program was the concept that physicians employed by a hospital
focusing on inpatient care will be more efficient, have a better understanding of
the inpatient care system, and can potentially justify protected time to participate
in QI.65-67 Similarly, nurse staffing and nurse to patient ratios consider whether
sufficient staffing was present to provide consistent care and whether nurses
would be able to participate in and support QI efforts. Included in this section of
variables was the set labeled as barriers to improvement, as all 3 measures ask
whether there were insufficient numbers of providers or staff for achieving
desired improvements.
The last collection of variables in this set measured basic events that were
not directly related to QI, meaning they did not qualify as QI processes, but still
potentially contribute to the development of an organizational culture that
supports QI. The first set of these variables measured the cooperation and
communication between providers and departments, a particularly important
consideration for a broad QI effort such as FIX. The last two measures,
performance monitoring and utilization review, provided information about how
much data was available for QI as well as establishing how accustomed
providers would be to interpreting performance data.
Table 6-3: Variables measuring QI structure

Variable                    | Source | Type         | Description*
----------------------------|--------|--------------|-----------------------------------------------------------
Hospitalists                | HAIG   | Categorical  | Hospitalists used on: All, Some, or No Wards
Medical FTEE Nurses         | CPOS   | Continuous   | Number of Full Time Employee Equivalents (FTEE)
ICU Nurse to Patient Ratio  | HAIG   | Categorical  | Reported for Day, Evening & Night Shifts; 1:1, 1:2 or 1:3
Sufficient Staff            | CPOS   | Sufficient 1 | Were 7 types of staff sufficient for inpatient care needs
Barriers to Improvement     | CPOS   | Barrier      | Extent to which 3 measures were a barrier to improvement
Inpatient Resources         | CPOS   | Sufficient 2 | Were 9 types of resources adequate for inpatient care
Communication & Cooperation | CPOS   | Cooperate    | 3 measures of the quality of communication or cooperation
Performance Monitoring      | CPOS   | Monitoring   | How frequently are 6 performance measures monitored
Utilization Review          | CPOS   | Percentage   | What percentage of 3 types of admissions are reviewed

* For full listing of grouped variables see Appendix D
Table 6-4: Calculated and Composite measures of QI Structure

Variable                     | Description*
-----------------------------|----------------------------------------------------------------------
Discharges / Nurse           | # FY07 discharges / # medicine nurse FTEE
Total Staff                  | Sum of all 7 sufficient staff variables
Clinical Staff               | Sum of 3 sufficient clinical staff variables
Support Staff                | Sum of 4 sufficient support staff variables
Total Barriers               | Sum of all 3 barrier variables
Total Resources              | Sum of all 9 resource variables
Space Resources              | Sum of 2 space resource variables
Technology Resources         | Sum of 6 technology resource variables
Communication                | Sum of 2 communication variables
Total Performance Monitoring | Sum of all 6 performance monitoring variables
Monitoring Level             | Level of monitoring for each of the 6 performance monitoring measures
Total Utilization Review     | Sum of all 3 utilization review variables

* For further details see description in Appendix D
In addition to these individual variables, the QI structure class also
included several calculated and composite variables, listed in Table 6-4. Much
like the calculated variable in the facility structure class, the calculated variable
here was a crude measure of workload, this time calculated as the annual
volume divided by the total medicine nurse FTEE. The composite variables
represent aggregate measures for each set of individual variables. These
variables acknowledge the likelihood that any single resource or barrier cannot
significantly and consistently impact QI, but instead it may be the collection of
these variables that has meaning. Therefore, all sets of variables have a
composite variable representing the sum of the individual variables. For two sets
of variables, sufficient staff and inpatient resources, two additional composite
variables were calculated representing distinct subsets within those variable sets.
The last variable of note was monitoring level. This variable was recorded for
each of the 6 performance measures and reflects whether the performance
monitor was measured at the facility, clinic, provider or some combination of
those three levels.
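To illustrate how such composites can be assembled, the short sketch below (pandas is used purely for illustration, and all column names are invented for the example) sums recoded items into aggregates in the spirit of Table 6-4:

    import pandas as pd

    # Hypothetical extract: one row per hospital, items already recoded
    # to numeric scale levels. Column names are invented.
    df = pd.DataFrame({
        "barrier_physicians": [1, 4], "barrier_nurses": [2, 3],
        "barrier_support": [1, 2], "space_1": [3, 2], "space_2": [4, 1],
    })

    # Composites as simple sums over item groups (e.g., Total Barriers
    # = sum of the 3 barrier items; Space Resources = sum of 2 items).
    df["total_barriers"] = df[["barrier_physicians", "barrier_nurses",
                               "barrier_support"]].sum(axis=1)
    df["space_resources"] = df[["space_1", "space_2"]].sum(axis=1)
    print(df[["total_barriers", "space_resources"]])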
The next set of variables, listed in Table 6-5, represent measures of
hospital QI processes. These variables measured prior experience with quality
improvement at the hospital and likely should have a strong relationship with
current QI performance. The first set of variables in this collection measured
products and outcomes likely associated with prior QI efforts. The first measure,
clinical order sets, evaluated whether a hospital implemented an electronic order
set or a clinical reminder for 8 common inpatient conditions. Further, if an order
set had been implemented, respondents indicated whether the clinical order set
or reminder was viewed as useful. The next variable was a similar measure
related to the implementation of evidence bundles for common ICU events. The
last variable fitting into this set measured which of 7 different approaches
hospitals used to encourage adherence to clinical practice guidelines for 3
different conditions: acute myocardial infarction (AMI), congestive heart failure
(CHF), and community-acquired pneumonia (CAP). The next set of variables in
this class considered different drivers of local QI. The first variable, QI
information, considered the role played by seven different potential sources of
information for guiding QI efforts or strategic planning. The other, driving force,
considered the split in responsibility between the VISN office and the individual
hospital for six activities that support QI.
Table 6-5: Variables measuring QI process

Variable                              | Source | Type                | Description*
--------------------------------------|--------|---------------------|-----------------------------------------------------------------
Clinical Order Sets                   | CPOS   | Implement 1; Useful | Are clinical order sets or reminders implemented in CPRS for 8 conditions; if implemented, how useful
ICU Evidence Bundles                  | HAIG   | —                   | Are 10 different evidence bundles implemented, paper or electronically
Clinical Practice Guideline Adherence | CPOS   | Yes/No              | Use of 7 methods for adhering to guidelines for 3 conditions (AMI, CHF, CAP)
QI Information                        | CPOS   | Importance          | How important are 7 sources of information for guiding QI efforts
Driving Force                         | CPOS   | Responsible         | Is the VISN or hospital primarily responsible for 6 activities
Clinical Reminders                    | CPOS   | Yes/No              | Which of 8 methods are usually used to develop reminders
Performance Improvement               | CPOS   | Implement 2         | To what extent have 11 actions been implemented
Guideline Implementation              | CPOS   | Implement 2         | Presence of 6 factors in response to guideline implementation
Clinical Champions                    | CPOS   | Challenge           | 6 challenges related to clinical champions
Facility Environment                  | CPOS   | Cooperate           | Four measures of hospital culture and support
Performance Awards                    | CPOS   | Yes/No              | Use of 4 types of incentives related to improving performance measures
Award Distribution                    | CPOS   | Percentage          | On average, percentage of awards given to groups
ED QI Teams                           | CPOS   | Yes/No; Continuous  | Has the hospital implemented a QI program in the ER; if yes, how many teams

* For full listing of grouped variables see Appendix D
The final set of variables in this class examined factors related to general
team performance and actions with QI activities. These were included in the QI
process class, as opposed to the team character class, in the framework
because these were general measures about QI at a facility and not measures of
the specific members of the team involved in FIX. The first in this section, clinical
reminders, considered whether eight different methods were typically used to
develop reminders in CPRS. The next, performance improvement, measured the
extent of implementation of eleven actions to improve VA clinical performance.
The next three, guideline implementation, clinical champions, and facility culture,
all examined different challenges or responses from hospital staff related to QI
efforts. Lastly, a collection of variables considered the use of awards to
encourage performance improvement and the number of QI teams implemented
in the emergency department (ED) the year prior to the survey.
Just as in the QI structure class, the QI process class had several
composite variables, listed in Table 6-6. Generally these composites represented
the sum of measures across a group of individual categories. However, for order
sets, guideline adherence, clinical reminders, and performance improvement
there were additional sub-groupings. For clinical order sets these sub-groups
consider the number of fully or partially implemented sets, the number of planned
sets and an average usefulness rating across the implemented sets. The
guideline adherence sub-groups reflect the number of methods used to address
each of the individual diseases (disease total) as well as for how many diseases
a method was used (method total). The sub-groups for clinical reminders
separated the five activities involved in the development of a clinical order set
from the 2 activities involved in evaluating a clinical order set after
implementation. Lastly, for performance improvement, the sub-groups considered
a collection of measures related to establishing a team as well as shifting
resources between areas in the hospital in an effort to improve performance.
Table 6-6: Calculated and Composite measures of QI Process

Variable                  | Description*
--------------------------|----------------------------------------------------
Total Clinical Order Sets | Sum of all 8 conditions
    Implemented           | Count of all partially or fully implemented
    Planned               | Count of planned order sets
    Usefulness            | Average usefulness of implemented order sets
Total Evidence Bundles    | Count of all implemented ICU evidence bundles
Total Guideline Adherence | Sum of all 21 fields (3 diseases x 7 methods)
    Disease Total         | Sum of all 7 methods for each disease
    Method Total          | Sum of method use across the 3 diseases
Total Information         | Sum of all 7 sources of information
Average Driving Force     | Average across the 6 variables
Total Clinical Reminders  | Sum of all 8 methods
    Development           | Sum of 5 measures related to development
    Post                  | Sum of 2 measures related to post-implementation
Total Performance         | Sum of all 11 activities
    Establish             | Sum of 3 activities related to establishing a team
    Shift                 | Sum of 2 activities related to resource shifting
Guideline Implementation  | Sum of 2 measures of implementation process
Guideline Resistance      | Sum of 2 measures of resistance
Total Clinical Champion   | Sum of 6 clinical champion measures
Facility Culture          | Sum of 2 measures of culture
Facility Support          | Sum of 2 measures of financial support
Total Incentives          | Sum of 4 measures of incentive use

* For further details see description in Appendix D
The final class in the analytic framework, team character, was not
represented in this analysis. The data from the two surveys reviewed thus far did
not directly pertain to FIX, but only represented the organizational environment at
the time of FIX. As part of FIX there were some surveys completed by the
participants, but this data was not available at the individual or team level as it
was aggregated at the regional level. Further, this information was unlikely to
provide much insight, as many of the questions showed greater than 95% of respondents responding positively (agree / strongly agree).
Rather than include these data, which could lead to erroneous interpretations of the final model, they were excluded. While this was a clear limitation of the
analysis, as an exploratory analysis it was not a fatal limitation. The impacts of
this limitation will be discussed during the results review in the next chapter.
Data Mining Overview
This section provides an overview and discussion of data mining as a
technique for developing an understanding of how these measures of
organizational context may relate to hospital performance during FIX. The
challenge to this task was that while the literature review in Chapter 5 identified a
number of studies that examined relationships between quality and
organizational characteristics, due to the complexity of these relationships there
were few consistent meaningful associations. This study selected data mining as
a tool that could capably analyze this data and effectively identify complex
associations and patterns that describe hospital performance during FIX. Any
identified associations would then serve a basis for developing future studies that
should involve a qualitative and quantitative analysis to better understand the
specific relationships.
While data mining was the selected analytic method, logistic regression
was also considered. The selection of data mining mainly reflected its ability to
uncover complex relationships between factors in a dataset.68 This was in
contrast to logistic regression approaches which would require a considerably
larger sample than was available to effectively analyze the potential variables of
interest. In short, there were two particular concerns that suggested logistic
regression was inappropriate for achieving the goals of this study. First, as shown in the literature review, none of these organizational factors are likely to have a clear univariate relationship with hospital performance during FIX. As such, an attempt to define the average effect of a given variable across hospitals would
generally lead to non-significant findings. Second, given the available sample
size, any efforts to model interactions between variables would lead to
underpowered analyses, once again increasing the likelihood of non-significant
findings. The next few paragraphs provide an overview of data mining,
highlighting its strengths and showing why data mining was an appropriate tool
for developing a set of hypotheses about how organizational context modifies a
QI initiative to produce a set of measured outcomes.
The term data mining in fact encompasses a large toolbox of analytic methods; this study focused on decision trees. Decision trees belong to the class
of symbolic learning and rule induction algorithms.68 Symbolic learning algorithms
aim to generate a structured set of hypotheses that can be used to understand
and classify a specific concept. In this study the concept to classify was facility
performance, with the classification representing the four-level major
classifications listed in Table 4-1. The decision tree process first begins with the
concept of an information system (IS) with four key components, S, Q, V, and f:
where IS = <S,Q,V,f>. S represents the set of examples used to develop the
hypotheses, in this case the sample of hospitals that participated in FIX and have
complete data on the HAIG and CPOS surveys. The next component, Q,
represents the collection of features that serve to characterize the sample, in this
case the collection of variables discussed in the prior section, “VA Hospital
Organizational Context”. Within the set (Q), each individual feature (F) can take
on a discrete set of values (V), for example the potential answers on the scales
listed in Table 6-1. The last component, f, represents a function encoding the
individual values for each feature for each example (individual hospital) within the
entire dataset.
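A toy rendering of this formalism may help fix the notation (all hospital names and values below are invented for illustration):

    # IS = <S, Q, V, f>: examples, features, value sets, and the
    # function assigning each example a value on each feature.
    S = ["hospital_A", "hospital_B"]                    # examples
    Q = ["academic_affiliation", "icu_level"]           # features
    V = {"academic_affiliation": {"yes", "no"},         # value sets
         "icu_level": {"1", "2", "3", "4", "none"}}
    f = {("hospital_A", "academic_affiliation"): "yes",
         ("hospital_A", "icu_level"): "1",
         ("hospital_B", "academic_affiliation"): "no",
         ("hospital_B", "icu_level"): "none"}

    # Every encoded value must come from the feature's value set.
    assert all(f[(s, q)] in V[q] for s in S for q in Q)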
The decision tree analysis begins with a root node, which constitutes the
entire set of examples, S, in the IS. Working with a selected algorithm, the
process identifies a feature (F) in the dataset that best separates the data into
smaller subsets. The definition of what constitutes the best separation varies
between algorithms. This process iterates, growing the limbs of the tree as the examples in each node are evaluated and further features are identified to create nodes with smaller numbers of examples. The process terminates when all the examples in
a node have the same classification value. These terminating nodes are called
leaves. A tree could theoretically have as many leaves as there are examples,
although that would generally be an undesirable outcome. Similarly, a limb can
have as many nodes as necessary to reach the final classification. Lastly, limbs
can be of varying nodal lengths.
In order to better clarify this process, it is useful to understand how the
decision tree development process differs from that of stepwise variable selection
in logistic regression. In stepwise variable selection, variables are sequentially added to a model based on their relationship with the entire data set, with each addition determined by the amount of variance that variable explains when added to the model. In decision tree modeling, the addition of a
variable to extend a limb is only based on its relationship with the examples in
that node. In effect, it is therefore a conditional relationship based on the factors
identified at prior points along the limb. For example, in a decision tree with 100
hospitals in the full sample, the first decision point may split into a node of 75
hospitals with an academic affiliation and a node of 25 hospitals without an
academic affiliation. The next decision point along each of these limbs would
then be separately and conditionally evaluated based on the hospital's academic affiliation. So for the hospitals with an academic affiliation, the feature that best splits them into smaller groups may be the use of hospitalists, while for the non-academic hospitals it may be ICU level. Taking this to a third level, hospitals with academic affiliations and hospitalist programs would be evaluated with no direct consideration of their ICU level, as that variable had been included in a separate limb.
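The sketch below illustrates this conditional behavior using scikit-learn rather than WEKA's J48 (a substitution of tools for illustration, not the study's code), with an invented eight-hospital dataset constructed so that the splits echo the example above:

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Invented data: academic affiliation gives the best root split;
    # within each branch a different feature (hospitalists vs. ICU
    # level) then separates the hypothetical performance classes.
    X = pd.DataFrame({
        "academic_affiliation": [1, 1, 1, 1, 0, 0, 0, 0],
        "hospitalists":         [1, 1, 1, 0, 0, 0, 1, 1],
        "icu_level":            [2, 4, 3, 2, 1, 2, 3, 4],
    })
    y = [1, 1, 1, 0, 1, 0, 0, 0]  # hypothetical performance class

    # "entropy" parallels the information-gain criterion of C4.5/J48.
    tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
    tree.fit(X, y)
    print(export_text(tree, feature_names=list(X.columns)))

In the printed tree, hospitalists appears only under the academic branch and ICU level only under the non-academic branch, mirroring the conditional evaluation described above.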
Keeping in mind that this analysis was in part driven by the realistic
evaluation framework and its goal to thoroughly describe and understand the
context that leads to a measured outcome, data mining seemed the tool best
suited to achieve this goal. The algorithms that create decision points in the
decision tree process do not have the traditional concerns about statistical power
as would be the case in regression approaches. Further, the process of
evaluating nodes and identifying key features provides critical insight into how
interactions between variables may differ based on the context in which they
interact. These strengths lead to the conclusion that data mining, and more
specifically decision trees, were the best approach for modeling the interactions
between components of facility structure, QI structure, and QI process.
A main weakness of decision trees is that they can generate long complex
limbs leading to difficult interpretations. However, there are two modifications to
the general data mining structure, which help to address this weakness. The first
and most common modification is to develop a pruned tree. Pruned trees
consider the tradeoff between 100% accurate classification and the risk of overfitting the data in a manner that leads to low interpretability and generalizability of
the findings. Pre-pruning algorithms attempt to balance this trade-off by testing
whether the addition of another feature to the limb provides sufficient additional
information. If the magnitude of information gained (as calculated by the selected
algorithm) does not meet a defined threshold the limb terminates leaving some
misclassification. There are also post-pruning processes that generate a fully
classified tree and then trim limbs back based on the information gain at each
step. These two processes generally lead to the same decision tree and mostly
differ in computational efficiency; as such, pre-pruning is the generally adopted
approach and the one used in this analysis.
The other and more recently developed technique, boosting, involves the
development and combination of multiple decision trees. 67, 68 The trees are
combined based on a weighting that reflects the misclassification at each
individual node. These approaches produce more accurate classifications, but
increase the complexity of the overall interpretation. Since the application of data
mining to this type of healthcare data is a novel technique, and the goal of this
analysis is to generate hypotheses for future studies rather than to identify
definitive associations, a boosting algorithm was not employed. The final
conclusion was that a pruned decision tree would best facilitate description of the
findings to audiences generally unfamiliar with data mining and decision trees.
Decision Tree Development
The decision tree modeling process was completed in the Waikato
Environment for Knowledge Analysis (WEKA) version 3.6.4 data mining
environment.69 The selected model was the J48 algorithm, which is an
implementation of the C4.5 pre-pruning information entropy algorithm.70 This
algorithm operates by calculating a level of information entropy (uncertainty) for
the examples within a node and then determining which feature provides the
most information gain. Entropy was defined by the following equation:71
6
,-
./012 $ 0& log 0&
&(
106
where p_i signifies the proportion of examples within the set that have a
given classification i, out of c possible classifications. After determining the
entropy of the set, the algorithm compares each of the features, determining
which feature provides the greatest information gain, as defined by the
following equation:71

    Gain(S, F) = Entropy(S) - \sum_{v \in V_F} \frac{|S_v|}{|S|} Entropy(S_v)
where F represents one of the features from the entire feature set (Q), V_F is
the set of values that F can take, and S_v represents the subset of examples in
S for which F takes the value v. After calculating the feature that provides the
maximum information
gain, sub-nodes are generated and the program continues to iterate through the
process until all limbs terminate as leaves or nodes in which the addition of
further features does not meet the information gain pruning threshold.
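As a purely illustrative companion to these two equations, the following sketch
computes entropy and information gain directly from their definitions; the toy
hospital records are hypothetical and are not drawn from the study data.

    import math
    from collections import Counter

    def entropy(examples, target="class"):
        # Entropy(S) = -sum_i p_i log2 p_i over the classification values
        counts = Counter(ex[target] for ex in examples)
        n = len(examples)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    def information_gain(examples, feature, target="class"):
        # Gain(S, F) = Entropy(S) - sum_v (|S_v| / |S|) Entropy(S_v)
        n = len(examples)
        remainder = 0.0
        for value in {ex[feature] for ex in examples}:
            subset = [ex for ex in examples if ex[feature] == value]
            remainder += (len(subset) / n) * entropy(subset, target)
        return entropy(examples, target) - remainder

    # Hypothetical data: 8 hospitals split evenly between two classifications
    hospitals = ([{"academic": "yes", "class": "sustain"}] * 3
                 + [{"academic": "yes", "class": "no change"}]
                 + [{"academic": "no", "class": "sustain"}]
                 + [{"academic": "no", "class": "no change"}] * 3)
    print(entropy(hospitals))                       # 1.0 bit of uncertainty
    print(information_gain(hospitals, "academic"))  # about 0.19 bits gained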
Although decision tree algorithms can be developed to respond as desired
to variables with either numeric or text values, in WEKA there is a clear
distinction in the natural evaluation of these two variable types.69 Numeric
variables were treated as continuous, with the evaluation process considering
the optimal point to split the group in two. As an example, for a 5-point Likert
scale recorded with 1 = very difficult, 2 = difficult, 3 = neutral, 4 = easy, and
5 = very easy, the model would consider a binary split after each scale point
(e.g., 1 vs. 2-5, i.e., very difficult vs. all others; or 1-3 vs. 4-5, i.e., neutral,
difficult, and very difficult vs. easy and very easy) to identify which split would
provide the
scale was encoded using the text descriptors, the decision tree would not be able
to identify that very difficult is similar to difficult and could thus potentially be
grouped. Instead it would treat all of these as separate values and evaluate them
for the information gained if the group was split into five different subgroups.
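The sketch below illustrates this evaluation of a numerically encoded Likert
item, searching candidate binary splits for the one offering the greatest
information gain. It is an illustration of the general idea rather than WEKA's
code, and the ratings and classifications shown are invented.

    import math
    from collections import Counter

    def entropy_of(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n)
                    for c in Counter(labels).values())

    def best_numeric_split(values, labels):
        # try a binary split at every threshold between adjacent scale points
        base, best_gain, best_t = entropy_of(labels), -1.0, None
        for t in sorted(set(values))[:-1]:      # thresholds 1|2, 2|3, 3|4, 4|5
            left = [l for v, l in zip(values, labels) if v <= t]
            right = [l for v, l in zip(values, labels) if v > t]
            remainder = (len(left) * entropy_of(left)
                         + len(right) * entropy_of(right)) / len(labels)
            if base - remainder > best_gain:
                best_gain, best_t = base - remainder, t
        return best_t, best_gain

    # 1 = very difficult ... 5 = very easy, with hypothetical classifications
    ratings = [1, 2, 2, 3, 4, 4, 5, 5]
    classes = ["no benefit", "no benefit", "no change", "no change",
               "improve", "improve", "sustain", "sustain"]
    print(best_numeric_split(ratings, classes))   # (3, 1.0): split 1-3 vs. 4-5
    # A text encoding would instead branch once per distinct response value.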
In general, there was no expectation that responses on the Likert scale
questions from the CPOS survey would provide much information gain if treated
as text variables, and these were all numerically encoded. There were however,
some variables with a limited number of categories that supported hypotheses
suggesting they may split into two groups or into multiple groups. The first of
these, ICU evidence bundles, had three response options: not used, used
without electronic orders, and used with electronic orders. These three situations
could all lead to different levels of performance with quality improvement.
However, it very well could be that only having electronic orders is associated
with quality while the other two have no direct impact suggesting a two-level split.
A similar theory applies to the hospitalist presence variable, which was
encoded with the options: no hospitalists, hospitalists used for some patients,
and hospitalists used for all patients. The other variables given both text and
numeric encodings were a few facility structure variables that were originally
encoded as text but were likely to have the most meaning as numeric variables.
These variables were nurse to patient ratios, rurality, ICU type, ICU
level, ICU management, and facility status.
There were two complementary processes, both of which were utilized in
the analysis, for developing and interpreting decision trees. The first and more
standard process was to evaluate the dataset using an n-fold cross validation.
This process involved splitting the dataset into n equal splits, most commonly
and as implemented here into 10 equal splits. The model development process
then created a decision tree based on an information set that included n-1 splits;
in a 10-fold cross validation that involved 90% of the data. Then the remaining
split (10% for this example) was a test set used to test the classification accuracy.
This process iterated n times, such that each split served as the test split once.
The classification results from the collection of tests were then used to calculate
the following performance metrics, based on traditional confusion matrix
principles:72, 73
    TP\ Rate = \frac{TP}{TP + FN} \qquad FP\ Rate = \frac{FP}{FP + TN}

    Precision = \frac{TP}{TP + FP} \qquad Recall = \frac{TP}{TP + FN}

    F\text{-}Measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}

The final reported metric, the area under the receiver operating characteristic
(ROC) curve, summarizes the tradeoff between the TP rate and the FP rate
across classification thresholds.
These metrics were calculated for each individual classification class as
well as an average for the entire classification scheme. Additionally, the
inter-rater reliability or kappa statistic was calculated comparing across each of the
test sets, with P(a) representing the observed agreement between raters (or test
sets) and P(e) representing the potential agreement due to chance.72
    \kappa = \frac{P(a) - P(e)}{1 - P(e)}
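As an illustration of these calculations, the following sketch derives the
per-class metrics and the kappa statistic from a single confusion matrix,
assuming the predictions from the ten test splits have been pooled; the matrix
values are hypothetical and do not correspond to the study results.

    def per_class_metrics(M, i):
        # M[r][c] counts examples with true class r predicted as class c
        n = sum(sum(row) for row in M)
        tp = M[i][i]
        fn = sum(M[i]) - tp                 # true class i, predicted elsewhere
        fp = sum(row[i] for row in M) - tp  # predicted i, true class elsewhere
        tn = n - tp - fn - fp
        tp_rate = recall = tp / (tp + fn) if tp + fn else 0.0
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        return tp_rate, fp_rate, precision, recall, f_measure

    def kappa(M):
        n = sum(sum(row) for row in M)
        p_a = sum(M[i][i] for i in range(len(M))) / n      # observed agreement
        p_e = sum(sum(M[i]) * sum(row[i] for row in M)
                  for i in range(len(M))) / (n * n)        # chance agreement
        return (p_a - p_e) / (1 - p_e)

    # Hypothetical 4-class confusion matrix (rows/columns: No change,
    # Improve not Sustain, Sustain, No Benefit), pooled over the test splits
    M = [[12, 3, 2, 3],
         [4, 5, 2, 1],
         [3, 2, 8, 2],
         [2, 1, 3, 10]]
    print(per_class_metrics(M, 0))
    print(round(kappa(M), 2))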
The other model development process used all of the available examples
to create a single decision tree, with each of the leaves noting the number of
correctly and incorrectly classified instances. There were no formal performance
metrics to evaluate for this tree, but it does offer a single, easy-to-interpret
presentation of the data. The results from both of these model development
processes will be presented to evaluate the decision tree models.
The last consideration during model development was which
classifications to use. The initial FIX analysis used an 11-point classification
model and applied it to 5 different outcome measures. However, given the small
number of hospitals in some of the individual categories, this final analysis will
use a simplified classification model that only considers a 4-level classification
based on the major categories: No change, Improve not Sustain, Sustain, and No
Benefit. Further, given the limited numbers of hospitals that improved or
sustained on the three quality check outcomes, only length of stay (LOS) and
discharges before noon were individually modeled.
In addition to modeling hospital performance on the two primary outcome
measures, models were created for two composite measures. The first model
combined hospital performance on LOS and discharges before noon, while the
other considered performance across all 5 outcomes. The composite
classification was created by assigning the following point values to each
performance category: 2 for sustained improvement, 1 for non-sustained
improvements, 0 for no change, and -1 for no benefit. Table 6-7 lists the point
ranges assigned to each classification category for both composite models. In
the LOS/Noon composite model both the sustained and non-sustained
classifications include hospitals with a score of two. The distinction reflects a
decision that any facility having a classification of sustain needed to have
successfully sustained one of the two outcomes. Thus a few hospitals with a
score of 2, reflecting that they improved but did not sustain on both outcomes,
had their classifications set to improve not sustain for that composite, while
hospitals that sustained on one outcome and recorded no change on the other
maintained a classification of sustain. A similar check was performed on the total
composite ranking, but all hospitals classified as sustain by point score showed
sustainment on at least one of the five outcomes.
Table 6-7: Point ranges for composite model classification

                              LOS/Noon Composite    Total Composite
    No change                          0                 -1 – 0
    Improve not Sustain              1 – 2                1 – 2
    Sustain                          2 – 4                3 – 6
    No Benefit                      -2 – -1              -5 – -2
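The following sketch illustrates how the point assignment, the Table 6-7
ranges, and the sustainment check combine for the LOS/Noon composite; the
labels and function name are illustrative rather than taken from the study's
actual code.

    POINTS = {"Sustain": 2, "Improve not Sustain": 1,
              "No change": 0, "No Benefit": -1}

    def los_noon_composite(los_class, noon_class):
        score = POINTS[los_class] + POINTS[noon_class]
        if score <= -1:
            return "No Benefit"              # Table 6-7 range: -2 to -1
        if score == 0:
            return "No change"
        # scores of 2 fall in both remaining ranges; the tie-break requires
        # that a composite "Sustain" rest on sustaining at least one outcome
        if score >= 2 and "Sustain" in (los_class, noon_class):
            return "Sustain"
        return "Improve not Sustain"

    print(los_noon_composite("Sustain", "No change"))        # Sustain
    print(los_noon_composite("Improve not Sustain",
                             "Improve not Sustain"))         # Improve not Sustain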
Decision Tree Interpretation
The overall purpose of this analysis was to generate hypotheses that
could serve as a basis for guiding future in-depth studies focusing on how
organizational characteristics support effective quality improvement efforts. As
such this evaluation focuses on the performance metrics from the 10-fold
analysis of the data as well as each of the four developed decision trees. The
performance metrics provided some insight into the potential external validity of
this analysis by determining whether the hospitals in this dataset were able to
predict each other’s performance.
After evaluating the performance metrics, the next step considered the
decision trees created from the entire dataset. This analysis examined the
different limbs of the trees aiming to identify collections of organizational
characteristics that consistently result in similar performance classifications. It is
these limbs and the identified associations that serve as the basis for future studies.
This process also considered whether the structure of the decision trees provided
any support for the guiding conceptual framework. In general, this analysis
considered whether broad facility structures were located close to the main node
of the tree suggesting they provide a basic foundation that is modified by QI
structure and QI process to result in the final performance classification.
Following this logic, after facility structure the next variables along the limb
should be QI structures, with QI processes serving to establish the final leaves.
Conclusions
This section has introduced the methods used in this study to begin
developing hypotheses on how organizational characteristics interact to create
an environment that supports sustained improvement as part of a QI
collaborative. The initial sections discussed the collection of individual, calculated
and composite variables used to measure different components of facility
structure, QI structure, and QI processes. This was followed by an overview of
data mining and efforts to establish it as an effective tool for this task.
This was followed by a description of the specific data mining steps
involved in this analysis. Lastly, a short discussion focused on how to interpret
the data mining models and established the goals for the analysis. The actual
results of the analysis appear in the next chapter.
CHAPTER 7 – DECISION TREE RESULTS AND DISCUSSION
This chapter reports and evaluates the results from the decision tree
modeling efforts. These decision trees examined how different organizational
characteristics interacted to create the organizational context that contributed to
how the hospital responded to FIX to generate the measured outcomes reported
in Chapter 4. The first section of this chapter identifies the sample of hospitals
that had complete data allowing inclusion in the study. Next the chapter
examines the performance metrics from the 10-fold decision tree analysis. The
last portion of the results examines the individual decision trees for the length of
stay (LOS), discharges before noon, LOS/Noon composite, and overall
composite classifications. The discussion of these results interprets the
performance metrics and the decision trees before exploring whether the results
suggest any modifications to the guiding analytic framework. The chapter
concludes by considering some of the key limitations of the analysis.
Decision Tree Performance Metrics
Of the 130 hospitals that participated in FIX, the chief of staff at 100 of
them completed the VA Clinical Practice Organizational Survey (CPOS) leading
to a final sample size for the data mining analysis of 100 hospitals. Table 7-1 lists
the number of hospitals classified into each of the four performance categories
for the two primary outcomes as well as the two composite measures.
Chi-square tests comparing the performance distribution between the full sample and
this sample on LOS and discharges before noon show no signs of systematic
non-response (Χ2 (df = 3), p(LOS) = 0.99; p(Noon) = 0.92). Since the full analysis
in Chapter 4 did not show any variation in performance by hospital size or region,
the distribution of these factors in the data mining sample was not compared to
the full sample.
Overall, the decision trees developed for each of these outcomes had a
difficult time identifying consistent relationships among features that combined to
create an organizational context that was consistently associated with a specific
classification of hospital performance. Even with a total of 263 individual and
composite variables to consider, the kappa statistic for all models performed
equivalently to chance (κ(LOS) = -0.03, κ(Noon) = -0.02, κ(LOS/Noon
Composite) = -0.06, κ(Overall Composite) = 0.02). The other performance
metrics from the evaluation similarly suggest that the models performed no better
than chance (see Table 7-2). Further, a review of the receiver operating
characteristic (ROC) measures showed that the only categorization level that
was consistently identified at a rate better than chance was those hospitals that
performed with no statistical change in response to FIX. One promising note was
that the decision tree with the best performance at classifying facilities was the
overall composite measure. This suggests that organizational context, particularly
as captured by the measures in this analysis, impacted the larger environment at
the hospital and that measuring a single outcome, such as LOS, cannot
adequately capture how well an organization supports QI.
Table 7-1: Data mining sample performance classifications (N = 100)

                                                LOS/Noon      Overall
                           LOS      Noon       Composite     Composite
    No Change               27        26           22            29
    Improve Not Sustain     13        34           30            20
    Improve and Sustain     22        13           19            17
    No Benefit              38        27           29            34

Table 7-2: Decision tree performance metrics

LOS
                  TP Rate  FP Rate  Precision  Recall  F-Measure   ROC
    No Change      0.407    0.288     0.344     0.407    0.373    0.55
    Improve Only   0        0.172     0         0        0        0.421
    Sustained      0.182    0.231     0.182     0.182    0.182    0.454
    No Benefit     0.263    0.339     0.323     0.263    0.29     0.443
    Average        0.25     0.28      0.255     0.25     0.251    0.471

Noon
                  TP Rate  FP Rate  Precision  Recall  F-Measure   ROC
    No Change      0.346    0.297     0.29      0.346    0.316    0.526
    Improve Only   0.294    0.318     0.323     0.294    0.308    0.496
    Sustained      0.077    0.115     0.091     0.077    0.083    0.373
    No Benefit     0.222    0.288     0.222     0.222    0.222    0.456
    Average        0.26     0.278     0.257     0.26     0.258    0.477

LOS/Noon Composite
                  TP Rate  FP Rate  Precision  Recall  F-Measure   ROC
    No Change      0.455    0.256     0.333     0.455    0.385    0.594
    Improve Only   0.133    0.314     0.154     0.133    0.143    0.375
    Sustained      0.105    0.21      0.105     0.105    0.105    0.476
    No Benefit     0.172    0.282     0.2       0.172    0.185    0.369
    Average        0.21     0.272     0.197     0.21     0.201    0.441

Overall Composite
                  TP Rate  FP Rate  Precision  Recall  F-Measure   ROC
    No Change      0.379    0.352     0.306     0.379    0.338    0.517
    Improve Only   0.25     0.188     0.25      0.25     0.25     0.518
    Sustained      0.176    0.145     0.2       0.176    0.188    0.541
    No Benefit     0.265    0.303     0.31      0.265    0.286    0.526
    Average        0.28     0.267     0.278     0.28     0.277    0.524
Individual Decision Trees
This section considers the results of the full decision trees which represent
the pruned classification of all 100 samples. Before examining the trees
individually, the first evaluation step was to examine which variable categories
were emphasized across models. Table 7-3 lists each of the major variable
categories identified in at least one model and the count of how many times a
factor from that category appeared in each of the four models. Variable
categories in the table were ordered by the total number of appearances with
separations between each of the three major classes from the analytic
framework. Surprisingly, the four decision trees all featured a similar number of
factors with the LOS tree using 28 factors to reach the pruned classification,
while the other 3 trees each used 24 factors. There were very few components of
facility structure in the models with only 4 of the potential 12 variable categories
identified in any model. In contrast the QI structure and QI process classes were
frequently observed in the models. The QI structure class played a prominent
role in the LOS model while QI process was the prominent class in the
discharges before noon and overall composite models. The LOS/Noon
composite model had an even number of features from both of these two
classes.
Most of the major variable categories in the QI structure and QI process
classes were represented in at least one model. For QI structure there were two
variable categories that did not appear in any model: ICU nurse to patient ratio
and barriers to improvement. Although the nurse to patient ratio is a factor with a strong
association to quality, nearly every hospital ICU has a 1:2 nurse to patient ratio
across all shifts. Given the lack of variation across hospitals, it was not surprising
that this factor did not appear in any models. In contrast, the hospitals did vary
in their reporting of barriers to improvement, so it was not clear why none of these
factors appeared in the models. Interestingly, the only variable category from the
QI process class that was not selected into a model was guideline
implementation, which similarly measured the presence of barriers, specifically
negative behavioral responses to efforts to implement clinical practice guidelines.
So while the exact explanation for why these variable categories were not
included was not clear, it seems that a consistent issue with the measurement or
definition of barriers to improvement impacted the overall understanding of how
the presence of any barriers affected efforts to improve quality.

Table 7-3: Count of factors in each of the decision trees

Class totals by model:
                                               LOS/Noon      Overall
                              LOS     Noon    Composite     Composite
    Total Facility Structure    5        2         2             2
    Total QI Structure         15        8        11             8
    Total QI Process            8       14        11            14
    Total Decision Points      28       24        24            24

Total appearances by variable category:
    Facility Structure: Ward 6; Academic Affiliate 2; # of ICU Beds 2; Rural 1
    QI Structure: Sufficient Staff 13; Performance Monitoring 12; Inpatient
    Resources 7; Utilization Review 5; Communication 2; Nurse FTEE 2;
    Hospitalists 1
    QI Process: Guideline Adherence 10; Performance Improvement 6; QI
    Information 5; Clinical Reminders 5; ICU Bundles 5; Driving Force 4;
    Clinical Champion 3; ER QI Teams 3; Performance Awards 3; Clinical
    Order Sets 2; Facility Environment 1
The two most frequently utilized variable categories from the QI structure
class were the measures evaluating whether certain staffing levels were sufficient
and measures considering the frequency (annual, quarterly, monthly, or never) or
level (hospital, ward, or provider) of data monitoring on a collection of
performance measures. For the QI process class, the two most frequent
variables were measures related to efforts to improve clinical guideline
adherence and measures related to implementation of actions to improve
performance.
Before considering more in-depth the individual decision trees, a second
evaluation step considered whether the selected factors represented individual
measures from the surveys or one of the composite measures created to
represent a variable category. Across the four models there were 17 composite
variables selected into the various models. These composite measures
represented 3 QI structure and 3 QI process variable categories, each listed in
Table 7-4. In general, these composite variables did not seem to play a
significant role in summarizing the individual measures. The one exception to this
was the guideline adherence variable category where a composite measure was
selected for eight of the ten times a factor from that variable category appeared
in the decision tree models. Within the decision trees, the LOS model had 7
composite variables, the overall composite model 5, the LOS/noon composite
model 3, and the discharges before noon model 2.
Table 7-4: List of individual and composite variables in the decision trees

                                  Individual   Composite   Total
    QI Structure
      Sufficient Staff                11           2         13
      Performance Monitoring           9           3         12
      Utilization Review               4           1          5
    QI Process
      Guideline Adherence              2           8         10
      Clinical Reminders               3           2          5
      Performance Awards               2           1          3
The review of the individual decision trees began with the model depicting
hospital performance on LOS, which is displayed in Figure 7-1. To ease
reference while describing the decision trees, the boxes that list selected features
were numbered. A striking feature of this decision tree upon initial review was
that it appeared less like a tree and more like a long vine with just a few small
offshoots. An additional feature of this model, which was to be expected based
on the decision tree performance metrics as well as the use of 28 measures to
classify the facilities, was that most of the final classification groups only included
a small number of hospitals. The few exceptions to this include the classification
off box 2, which led to 6 hospitals classified as sustaining performance, and off
box 28 where 15 hospitals were grouped as having no statistical changes in the
study. Across the full pruned tree a total of 10 hospitals were misclassified.
On a substantive level, there were four interesting findings to highlight.
First, higher levels of data availability or monitoring were associated with better
performance. The clearest indication of this occurs at box 2 which considers a
collection of 8 hospitals that used incentives to encourage guideline adherence
for all 3 measured disease categories (acute myocardial infarction (AMI), chronic
heart failure (CHF), and community acquired pneumonia (CAP)). These hospitals
were split into 6 with sustained performance and 2 with no-statistical changes
based on whether they had a utilization review for non-VA admissions for over
half (sustainers) or less than half (no change). The misclassified hospital in the
no change category did register improved performance. Due to pruning, the
decision tree did not distinguish between the improver and no change, but an
examination of values for the defining variable revealed that the improving
hospital reported reviewing a few (1-20%) of non-VA admissions, compared to
the hospital with no change, which reported reviewing none of its non-VA
admissions.
Other decision points based on data availability were boxes 21, 22, and
28. Box 28 does not fully support the theory as the two hospitals with sustained
performance have a low level of concurrent utilization review of acute
admissions. The decision at box 27 was also counter-intuitive, in that not having
an ICU protocol for weight-based heparin administration was labeled as leading
to sustained performance. An improved model would likely have included an
additional decision point after box 25 that split the remaining 24 hospitals more
cleanly into their appropriate sustaining and no-change categories.
The second substantive finding from the model suggested that high
ratings on staff sufficiency measures were associated with better performance
(improve or sustain) but at the same time the decision tree suggests that relying
on high staff sufficiency ratings may not be the most efficient route to effective
QI. This was best exemplified by the decision at box 16. This decision
point considered whether four hospitals reported sufficient number of clinical
pharmacists. Those that reported positively were able to improve LOS during
FIX, and those that felt they did not have sufficient clinical pharmacists showed
no statistical change. However, when considering not only this decision point, but
also the decision at box 15, there was a more complex interpretation of how
factors interacted to support QI. Box 15 revealed that the four hospitals evaluated
in box 16 had low monitoring levels of hospital readmission rates. So it may have
been more efficient to institute methods for monitoring readmission rates rather
than relying on clinical pharmacists to overcome any challenges associated with
not understanding a hospital's readmission rates. Other decision points with
similar findings were 4, 14, and 25, although box 14, where completely
sufficient staffing of laboratory technicians led to 5 hospitals exhibiting no
benefit from FIX, does not exactly fit this pattern.
The third substantive finding from this decision tree was that a lack of
proven techniques, most notably ICU evidence bundles for ventilator associated
pneumonia (VAP) and catheter related blood stream infections (CRBSI), was
associated with poor performance as exhibited in boxes 6 and 7. The
prominence of ICU evidence bundles in the LOS decision tree was not surprising
as higher levels of VAP and CRBSI due to a lack of methods to support evidence
based medicine would lead to extended LOS and present many challenges to
reducing LOS. It should also be noted that box 6 splits out the 4 hospitals in this
sample that had no ICU. So while there is no expectation that they would have a
VAP bundle, their lack of success in reducing LOS may suggest that ICUs often
play a role in the early development of QI programs.
The fourth finding was the surprising appearance of factors in boxes 21
and 22 related to monitoring emergency department (ED) visits in this decision
tree related to LOS. The presence of these factors may indirectly represent a
robust data collection and dissemination culture at those hospitals with high rates
of ED monitoring. These factors may also have a more direct relationship with
quality, as the ED does serve as one of two main entry points for hospital
admission. Since FIX focused on flow throughout the hospital experience, the
ED likely was included in many improvement projects. As such, an
understanding of ED admission rates and admission times would have helped
support many QI efforts and provided extra motivation for overcoming any
change resistance.
Figure 7-1: Full decision tree for LOS performance
The next decision tree, Figure 7-2, overviewed the classification for
hospital performance on improving rates of discharge before noon. Much like the
LOS model, this one was mostly one long vine, although it had a couple more
tree-like splits at boxes 10 and 13. It also similarly did not have many large groupings
of classifications with only a grouping of 9 improvers off box 12, 8 with no change
off box 16, and 8 improvers off box 24. This pruned tree had a slightly higher rate
of misclassification with a total of 12. There were four substantive points in this
decision tree that further clarify the association between organizational context
and hospital performance.
The first point, which echoes the findings from the LOS tree, was that high
ratings on staff sufficiency were associated with better performance on improving
and sustaining gains related to discharging patients before noon. This point is
perhaps best displayed in the facility classification related to boxes 11 and 12.
Box 11 splits off two facilities as no benefit that rated their laboratory technician
levels as insufficient. The other 15 facilities that were in this collection were then
split into improvers or sustainers based on their efforts to encourage adherence
to CHF clinical guidelines. Other decision points supporting this sentiment were
at boxes 4 and 15.
The second point from this decision tree considers the role of specific
performance activities undertaken at facilities to improve performance. The first
example of this, boxes 7 and 10, shows how performance was related to
efforts to shift staff from high performing to low performing areas in hopes of
improving performance in the low performing area. While these decision points
do not lead to direct classification of many hospitals, the presence of this measure
in the decision tree further solidifies the notion that staff sufficiency plays a critical
role in successful quality improvement. Another hospital activity that had an
impact on quality was the decision to create performance improvement teams to
address a specific performance measure (box 18). This action has clear
relevance for performance on discharges before noon as hospitals would
generally have created a team to send to the FIX learning sessions, so those
hospitals with greater experience pulling together teams to address specific
measures could have better odds of succeeding.
The third issue identified in this decision tree considered the role of
different information sources in supporting quality improvement efforts. In box 5,
five hospitals that did not rate a local hospital as an important resource for QI
information were classified as no benefit. Then in box 6, four hospitals that rated
VA newsletters as an important QI information resource also were classified as
no benefit. Both of these classification points had one hospital misclassified into
the group.
The fourth and final issue from this decision tree considered the role of
hospitalists in supporting quality improvement efforts. Of the 53 hospitals
evaluated in box 13, 15 reported no hospitalist program while 38 reported at least
some hospitalist program. None of the 15 hospitals without hospitalists were able
to sustain improvements. Seven of the hospitals did make initial improvements,
apparently as a result of using incentive programs or relying on completely
sufficient registered nurse (RN) staffing. While this suggests hospitalists have a
critical role in supporting QI, this decision tree also shows that the presence of a
hospitalist program did not guarantee success. Only four (with two
misclassifications) of the 28 hospitals with hospitalists were classified as
sustainers.

Figure 7-2: Full decision tree for discharges before noon performance
Of the four decision trees, the next one (Figure 7-3) which modeled
composite hospital performance on both primary outcomes had the most tree-like
structure. This decision tree has two classification points that successfully
classified 11 hospitals as no change (box 7) and 11 hospitals as no benefit (box
15). This full pruned model had a high rate of classification success with only 3
misclassified hospitals. The results of this decision tree provide little additional
insight, as it appears to generally be a merging or averaging of the results from
the LOS and discharges before noon trees. The major insight from this decision
tree comes from the ordering of different variable categories within the decision
tree.
The first variable category appearing in the early portions of the model
was the measures of hospital resources. These measures, which appeared in
boxes 4, 5, 6, and 8, establish several early divisions in the tree. These divisions
support the importance of a resource or point towards alternative combinations of
factors that can help overcome any limitations associated with a missing
resource. The next variable category appearing in the decision tree was the
measures of performance improvement activities. These were seen in boxes 7,
10, 12, and 16. The decision point at box 12 was particularly illustrative. Of the
hospitals evaluated at that point, 19 reported not using pilot testing and only one
of these hospitals (actually misclassified as an improver) showed sustained
improvements on the composite measure. The last two sets of variable
categories in the model serve to achieve final performance classification. Boxes
16, 17, 21, and 23 represent different measures of staff sufficiency, while boxes
9, 15, and 24 were different measures of data collection and availability.

Figure 7-3: Full decision tree for LOS/Noon composite performance
One final intriguing finding in the decision tree relates to how hospitals
were classified based on the presence of an evidence bundle in the ICU for
glycemic control. In the previous two decision trees, ICU evidence bundles were
evaluated as either present or absent in leading to performance classifications. In
this decision tree box 20 splits the 11 hospitals into 3 different categories. Most
surprising were the two hospitals classified as achieving some improvements
despite not having any evidence bundle, while three hospitals that had a
non-electronic evidence bundle showed no benefit from FIX. A further examination of
how the hospitals performed on the individual measures of LOS or discharges
before noon did not provide any insight into why two hospitals without any
evidence bundle for glycemic control achieved some initial improvements.
The last of the four decision trees (Figure 7-4) examined the composite
measure of hospital performance across all five outcomes. While this tree has a
few more splits than the individual outcome decision trees, there was clearly still
a prominent backbone running the length of the tree with no major splits. Almost
all of the classification decisions represent just a small number of facilities except
for box 18 in which 9 hospitals were classified as no change and box 23 with 19
hospitals classified as no benefit. This decision tree had eight misclassifications,
fewer than in the two individual outcome models but more than the
misclassification seen in the LOS/noon composite model. Of the four decision
trees, this one had the greatest number of unexpected or counterintuitive
findings.
As a first example, variables measuring the sufficiency of staff still appear
regularly in the model, as seen in boxes 2, 16, 21, and 24. However, quite
frequently the resulting classification decisions were opposite the expectations
created by the previous models. In box 2 a rating of insufficient registered nurse
staffing was associated with five hospitals classified as sustainers, although the
one misclassification did represent a hospital whose actual performance was no
benefit. Similarly in box 16, high ratings of radiology technician sufficiency were
associated with no benefit, while low levels were associated with non-sustained
improvements. Box 21 resulted in a fairly expected classification, while box 24
had an inverse association with completely sufficient computer application
coordinators resulting in 3 hospitals classified as no benefit.
The second noticeable counterexample occurs in boxes 11 and 12. This
series of boxes identifies two hospitals as sustainers that have a low use of
electronic reminders for supporting evidence based care for AMI, CHF and CAP.
These hospitals also did not use any techniques to review electronic reminders
after implementation. While it was logical that hospitals not using electronic
reminders would not review their non-existent reminders post-implementation, it
was surprising that this collection of hospitals would exhibit apparent success in
improving and sustaining composite quality. This example only reflects the
experience of two hospitals, once again re-affirming the importance of local
context.

Figure 7-4: Full decision tree for overall composite performance
Discussion
This data mining decision tree analysis examining how a hospital’s
organizational context modified the response to FIX, as measured by several
patient outcomes, generated several key insights into the challenges involved in
improving and sustaining quality in healthcare. The key finding to remember from
this analysis was that the decision trees had low overall performance and were
unable to classify performance levels at a rate better than chance. As a positive
finding from this research, there were variables that appeared in the full decision
trees that had relationships with hospital performance that may be useful for
determining how to improve hospital quality.
While the performance of these decision trees on the performance metrics
associated with the 10-fold analysis was disappointing, the review of the
individual decision trees suggests two mechanisms that likely created this low
level of performance. The first mechanism was the difficulty of measuring or defining
specific concepts. For example, several variable classes appear repeatedly in the
individual decision trees, but often only as individual factors; rarely were any of
the composite factors included. This suggests that better or more refined
measures of these variable classes could lead to an ability to better understand
how these factors support QI. An additional consideration for this mechanism
was that the performance classifications represented a novel approach to
evaluating QI. This analysis was subject to its own set of limitations which could
have contributed to some performance misclassification and poorer decision tree
performance.
The second mechanism was that quality improvement context may be
nothing but a local phenomenon and specific organizational features that support
QI at one hospital may not play any role at another hospital. Increasingly the QI
literature has focused on a theme which may best be summarized by the quote
“The devil might be in the details of local context and culture.”74 The simple idea
is that the difference between successful and non-successful implementation
of a QI project is related to many unmeasured, and perhaps even
non-measurable, local factors, presenting a significant challenge for efforts to identify
pathways that would support successful QI. The overall appearance of these
decision trees as vines rather than trees provides some support for this
mechanism. The vine-like structure implied that none of the evaluated factors
created a unique context that impacted hospital performance. Instead it was a
relatively unique factor that helped separate out hospitals along each step of the
decision tree.
Despite the inability of these models to reach definite findings of how
individual variables were associated with QI performance, the full decision tree
models did highlight several variable categories that should be kept in mind when
considering how to improve healthcare quality. The first of the variables that
appeared multiple times in the models was the different measures of sufficient
staff. Overall, these different measures appeared 13 times, generally (11 times)
as an individual measure of staff sufficiency. The general impression from the
decision tree models was that low ratings of staff sufficiency were unlikely to be
associated with any improvements in quality. In contrast, high ratings of staff
sufficiency did not guarantee success but certainly contributed to an environment
that could succeed. Most importantly, high levels of sufficient staff were often
depicted in tree limbs that suggested the primary role of staff in the quality
process was to provide manpower that could overcome other limitations in the
system. To improve quality, hospitals likely need to understand whether their
staffing levels meet basic needs, but beyond this there should be careful
consideration about whether the role of staff was to meet a critical need or simply
to provide manpower to overcome some other limitation that might be more
effectively addressed through a different approach. If it is the later, then the
hospital will find greater benefit in investing to correct or overcome the limitation
rather than trying to use brute force to improve quality.
The second set of variables was the performance monitoring and
utilization review variables which served as measures of data collection at each
hospital. Just like for staff sufficiency, the selected data measures in the decision
trees were most commonly individual factors and not any of the composite
factors. The relationship between these measures and hospital performance
suggested that data availability played a crucial role in distinguishing hospitals
that simply improved from those that sustained. In fact, data monitoring was one
of the few variables for which higher levels were consistently associated with
high performance, rather than the common observation that low levels of a
variable were associated with poor performance. The critical
importance of data measurement and availability to a QI team’s ability to
successfully improve and sustain quality is an important concept for hospitals to
consider as many of them work to implement an electronic medical record, or in
the case of the VA, work to develop the next-generation electronic medical record.
The third key set of variables was the measures of inpatient resource
availability. These measures examine whether space and equipment needs were
sufficient to support inpatient care, making this measure similar to the measures
of staff sufficiency. In the decision trees these measures were often towards the
top of the tree, suggesting they may be factors that establish environments
requiring different approaches to improve quality. However, since these decision
trees were generally linear they do not support this as a conclusion but rather
suggest it as an area for further study and focus. While not a surprise, the
decision trees do indicate that it was necessary to have critical resources
sufficiently available if a team was to successfully improve quality.
The fourth, and final, set of variables was those variables related to
activities to ensure adherence to clinical practice guidelines for treatment of AMI,
CHF, and CAP. In this case, 8 of the 10 times one of these variables appeared in
the decision tree it was as a composite measure. The classification from these
measures was roughly equally split between low use of techniques leading to low
performance and high use of techniques leading to at least initial and sometimes
sustained improvements. The selection of variables from this class does not
support any specific method, as the selected variables represented a number of
different approaches to supporting guideline adherence, but instead suggests a
need to focus on repetition and consistency when trying to achieve quality goals.
By repetition this simply suggests that there is support for using multiple
approaches to address a specific quality problem. From a consistency viewpoint,
there is benefit from using similar approaches (whether it be incentives or
specialized templates) when addressing slightly different quality problems. The
benefits of repetition and consistency likely contribute to the development of an
accepting culture by giving providers appropriate reminders about expectations
and also helping them see a larger picture related to the hospital’s quality
improvement efforts.
There were two additional findings from this analysis that merit some
additional discussion. First, the data from this analysis suggest that hospitalists
may contribute in a meaningful way to QI, although the overall level of evidence
was limited since the measure of hospitalist presence at a hospital only
appeared in the discharges before noon decision tree. In that model (box 13),
hospitals without hospitalists were unable to sustain improvements and often had
to use methods that may not effectively support sustainment such as
performance awards or incentives to achieve initial improvements. Of course the
discharge before noon outcome was particularly physician sensitive given the
critical role that physician’s play in the discharge process. It should be
remembered that not all hospitals with hospitalists were able to improve or
sustain improvements. So while hospitalists can be effectively used, and may
increase the chances for success, they cannot be viewed as an easy solution, as
there will always be other factors within the organizational context that will
interact to support or block efforts to improve quality.
The second additional consideration was the number of identified
associations that were unexpected. In the LOS model these unexpected
classifications were frequently at the tail end of the decision tree. It may have
been that even with the pruning process the decision tree was becoming
over-fitted, and these final decision points had limited substantive meaning.
However, these findings should not be dismissed as some of the factors, such as
sufficient laboratory technicians in box 14, could in certain local contexts have
unexpected detrimental effects on QI. The number of unexpected findings in the
composite measure decision tree was also concerning. The challenge with this
decision tree was that it considered hospital performance on the three
secondary FIX outcomes: in-hospital mortality, 30-day mortality, and 30-day
all-cause readmission rates. Since these outcomes were not specific targets for
improvement they do not specifically reflect the organizational context in
relationship to quality improvement efforts. As such the findings from this model
played a minimal role in the final project interpretations.
A final consideration for this discussion was whether there were any
differences between the LOS and discharges before noon decision trees that
mirror some of the differences identified during the hospital performance
analysis. In the original analysis, hospitals were more likely to improve on the
discharges before noon outcome, but of the improving hospitals a greater
percentage sustained improvements for LOS. This analysis uncovered two
differences between the individual decision trees for these two outcomes which
provided some additional insight into the challenges associated with sustaining
QI. First, the LOS model highlighted more QI structure components while the
discharges before noon model highlighted more QI process components.
Second, the LOS model had a greater number of composite measures selected
into the decision tree. These findings suggested that measures of QI processes
were most applicable for understanding how well a facility could come together
and mount an initial effort to address a perceived quality problem. In contrast, QI
structural components may better address the ability to monitor (i.e. data
collection) and support (i.e. sufficient staff) successful QI projects.
Interpreting the Analytic Framework
The results of this study did not provide strong support for the analytic
framework, but similarly did not generate evidence suggesting the analytic
framework was completely incorrect. Before discussing the model, it must be
remembered that this analysis did not have any measures of team quality or
performance for the QI teams that participated in FIX. This lack of information
makes a full assessment of the conceptual model impossible.
In evaluating the conceptual model, the most unexpected result from this
analysis was the lack of facility structure components appearing in the decision
trees. Of the 11 times (out of a total of 100 decision points) that a facility structure
was included, most frequently it was a count of the number of specialty wards in
the hospital. From the way hospitals split from these decision points it does not
appear that these wards conveyed a specific quality benefit; instead they may
have simply served as a convenient marker for hospital size, as the existence of
these specialty wards typically just reflects having a large enough hospital to
justify a ward that targets a specific patient population. Otherwise it seems that
broad facility characteristics did not impact each hospital's path to quality.
Although these models did not show a critical role for facility structure,
suggesting that facility structure might be removed from the analytic framework,
there were enough unique characteristics of the VA healthcare system that it
may diminish the impact of certain facility characteristics.
differences are likely most evident at small VA facilities which receive a number
of structural benefits from their association with the VA system that similarly sized
critical access hospitals in the country may not. As such the facility structure
component should remain in the analytic framework until it can be clearly
determined what, if any, role these components play in defining the
organizational context for QI.
For the other two components in the conceptual model, QI structure and
QI process, it was clear they both played significant roles in shaping the quality
environment. Additionally, there were numerous interactions between the two
classes, with the decision tree occasionally alternating between selecting a
QI process variable and a QI structure variable. What was unclear from the decision trees
was whether these two concepts built on top of each other, were equally
important factors, or whether they even represent two discrete concepts.
However, considering the differences between the LOS and discharges before
noon tree it does seem that QI structure played a distinct role in supporting
sustained quality that differs from the role of QI process components in creating
initial improvements. As such, there was no evidence for revising the analytic
framework until future research can understand the role of team character and
better understand the interaction between QI structure and QI process
components.
Limitations
While this study has generated some important findings, there were key
limitations to consider when evaluating the meaning of these findings. The key
limitation was the lack of variables that measured characteristics of the QI team
in charge of FIX implementation. These variables could have been particularly
informative, particularly if a minimally engaged team was otherwise in a
supportive organizational context. With FIX representing a nationally mandated
QI program there was a distinct possibility that the goals of FIX were not relevant
for all participating hospitals. Additionally, these features could have provided
intriguing insight as to whether high functioning teams can overcome some of the
identified barriers, or whether certain environmental barriers are too substantial
to expect a QI team to overcome. Despite lacking measures of QI team behavior
the findings identified in the decision trees provided a good foundation for future
research and can help current administrators evaluate whether their hospital has
any foundational barriers to quality that should be addressed prior to undertaking
large QI efforts.
A second limitation of the data was that the CPOS data came from chief of
staff reports and most frequently represented subjective opinions. While chief of
staff may continue to participate in clinical responsibilities they will certainly have
a less frequent interaction with direct patient care and may only see one side of
patient care (i.e. participate in outpatient clinics but not directly on inpatient care).
Given this perspective, a chief of staff’s subjective evaluation on many of these
variables may not perfectly match the evaluation from the providers that interact
with the system regularly. Additionally, these subjective evaluations may not be
comparable across sites as opinions may differ on evaluations of what would be
defined as completely sufficient staffing levels.
The third limitation was that this analysis only considered performance in
relation to a single QI project. VA hospitals conduct multiple QI projects each
year, many of which are individual projects that occur under circumstances that
can vary dramatically from those associated with a national collaborative. The
real interest of this research was to develop an understanding of what creates an
organizational context that supports all QI efforts. A broader measure that
considered the overall success of many QI efforts in the hospital could lead to
more accurate classification of hospital performance in relation to its
organizational context leading to better performing decision tree models.
The fourth limitation relates to how well any conclusions from this study
can generalize to the broader healthcare environment. This study did have
complete data for 77% of national VA hospitals, so while this sample was quite
representative and analyses did not indicate any systematic reasons for
non-response, there was the potential for some non-response bias in the sample.
While this sample may be sufficiently representative of VA, there were unique
characteristics of VA healthcare that impacted these findings suggesting limited
generalizability to private sector hospitals. First, as a large integrated healthcare
system, there are numerous interactions between different hospitals that do not
translate as well to the private sector. Secondly, the VA has a much longer history
with an electronic medical record, so the prominent role of CPRS in the decision
trees may not be mirrored in hospitals that are just adopting an electronic
medical record and have not determined how to best incorporate that tool into
their QI toolbox. This study, however, may help them understand ways to develop
and design intra-hospital networks and effective electronic medical records to
help optimize future patient safety and quality efforts.
Overall, these four limitations show why this analysis was focused on
generating hypotheses and not geared towards trying to test specific hypotheses.
The findings from this study should help in the design of future studies that will
better understand the pathways that generate quality in healthcare and how
those pathways differ across settings that vary on a number of factors.
Conclusions
This chapter covered a number of details related to the effort to model how
a hospital’s organizational context modified improvement efforts in response to
FIX leading to the performance that was evaluated and reported in Chapter 4.
The main finding from this analysis was that the characteristics of the
organizational context that either support or hinder success with a QI initiative
were highly variable across hospitals. In effect, knowing the performance at one
hospital would not facilitate predicting performance at a hospital with similar
characteristics. This likely reflects the complex nature of quality improvement and
the inability to measure and model all important factors at one time. It may also
reflect the severe limitation associated with not having any characteristics of the
teams trying to create local improvements. Despite the lack of broad statements
on how to create an organizational context to support QI, the review of individual
decision trees was able to identify some variables that can help hospitals develop
an environment to support QI efforts.
The first three sets of variables, sufficient staff, inpatient resources, and
data collection, all identify features that play a role in supporting QI efforts. For
any hospital undertaking a new QI initiative or with a sense that far too many QI
efforts fail, these may represent areas to address before investing in further QI.
The fourth set of variables measured efforts to improve adherence to clinical
practice guidelines. These measures should serve as an important reminder that
success with QI takes time and often repetitive trials. For this variable category
that identified factors were those that suggested hospitals utilized multiple
approaches, and often consistently used these approaches across different
diseases, to achieve the desired adherence to clinical guidelines. This suggests
that successful QI requires a long-term investment and commitment. Hospitals
likely need time to learn what methods work best for them and allow time for the
development of a culture that appreciates the change brought about by
successful QI.
The remainder of the chapter focused on examining the differences
between the LOS and discharges before noon decision trees, re-evaluating the
analytic framework first introduced in Chapter 5, and discussing the limitations of
this analysis. The differences between the LOS and discharges before noon
decision trees reaffirmed that sustaining improvements was a different process
than making the initial improvements. For the analytic framework, the data did
not fully support the original design, but given the study limitations there was no
clear evidence to support a refinement of the analytic framework. The limitations
of the study centered on the challenges posed by not having measures related to
team character, as well as on some unique aspects of the integrated VA healthcare
system that will not translate to other hospitals or healthcare systems. Overall,
this study has shown some of the strengths and challenges associated with the
data mining approach to examining how organizational context supports QI. This
model and the results from this study have highlighted some important areas for
future study, which will be discussed in further detail in the next section.
CHAPTER 8 – SUMMARY AND FUTURE WORK
This chapter serves to summarize and conclude this project. It begins by
reviewing the analyzed data and then highlighting the major results that
advance the understanding of quality improvement (QI) in healthcare. After
establishing the basic findings of the project, the chapter revisits the concepts of
human factors and change management first discussed in the opening chapter.
This discussion focuses on whether the data in this study support these as two
theoretical areas that could help improve how hospitals approach and perform in
their QI efforts. The summary then concludes with some final
recommendations for hospitals and quality leaders to consider as they work to
improve safety and quality for patients as well as develop robust QI programs.
Lastly, a section explores some key research questions and outlines potential
future research projects that will help expand knowledge and improve hospital
QI.
Project Summary
The first step in this project was to establish how a collection of hospitals
performed while participating in a national QI collaborative. The QI literature often
presents a rosy picture of the success rate of QI projects, so this project
aimed to identify a collection of measurable patient outcomes that could establish
whether participating hospitals in fact made improvements during a QI effort.
Additionally, the project worked to introduce and analyze the concept that QI
should not only improve quality initially, but should also ensure that new levels of
quality are sustained for an extended period after project completion.
The case study for this analysis was a national QI collaborative named
the Flow Improvement Inpatient Initiative (FIX), undertaken by all 130 Veterans
Affairs (VA) hospitals. This yearlong collaborative focused on participating
hospitals working together to improve patient flow at their individual facilities.
Two goals of the collaborative, which became the primary outcome
measures for this project, were to shorten patient length of stay (LOS) and
increase the percentage of patients discharged before noon. Additionally, the
analyses in this project considered three secondary outcome measures: 30-day
all-cause readmission rates, in-hospital mortality, and 30-day mortality. These
secondary outcomes were not a specific focus during FIX, and there was no
expectation that hospitals would make improvements in these areas. Instead,
the secondary outcomes served as safety checks ensuring that the efforts to
improve patient flow did not result in unexpected negative outcomes.
The process of evaluating hospital performance during FIX utilized an
interrupted time-series analysis. The goal of this quasi-experimental approach
was to provide the best available control for pre-existing trends in the outcomes
(two years of pre-FIX data), which would help determine whether changes in the
outcomes during FIX were likely attributable to FIX efforts. Additionally, the
interrupted time-series approach evaluated two years of post-implementation
data in order to determine whether those hospitals that improved during FIX
sustained those improvements. The consideration of how hospitals performed on
each of the five evaluated outcomes, combined with a need to develop a
framework for comparison, led to the creation of a novel classification system
that included four major performance categories.
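To make the modeling approach concrete, a segmented regression is one
common way to specify such an interrupted time-series model. The sketch below
is a minimal, hypothetical illustration of that general technique (using PROC
AUTOREG from SAS/ETS), not the code used in this study; the data set and
variable names are assumptions.
/* Minimal segmented-regression sketch of an interrupted time-series.
   Assumed variables (hypothetical, one record per hospital-month):
   time      = month index across the five-year window
   fix       = 1 during the FIX year, 0 otherwise (level change)
   fix_time  = months since FIX began, 0 beforehand (slope change)
   post      = 1 during the two post-FIX years, 0 otherwise
   post_time = months since FIX ended, 0 beforehand
   nlag=1 models first-order autocorrelation between months. */
Proc Autoreg Data=Monthly_Outcome;
model mean_los = time fix fix_time post post_time / nlag=1 method=ml;
run;
Under this specification, a hospital whose fix terms indicate improvement but
whose post terms show a return toward the baseline trend would fall into the
improver category described below, while one that holds its gains would be a
sustainer.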
The first category included those hospitals whose outcomes exhibited
high levels of variance during the initial four years of the five-year study. These
hospitals were classified as no change because they exhibited no detectable
changes in a set of outcomes that generally have clear temporal trends.
Hospitals in this group likely had highly non-standardized patient care processes,
which presents its own unique QI challenge. These hospitals likely need to work
to develop a standardized process, rather than trying to target interventions to
improve specific elements of patient care. This was in contrast to the fourth
classification category, which included those hospitals that did not benefit from
their participation in FIX. These hospitals had less variability in the outcomes,
such that the time-series models could detect changes over time; they simply
showed no improvement, or in some cases actually had declining performance,
during FIX. These hospitals were more likely to have standardized care
processes, but they were unsuccessful in implementing changes that created
measureable improvements in the process as part of their participation in FIX.
The other two classification categories considered those hospitals that
showed improvements in response to FIX. In effect, the time-series models
indicated that these hospitals changed their outcome measures during FIX in a
manner that was not predicted by any pre-implementation trends. The
distinction between the two categories was how hospitals performed in the two
years after FIX. Those hospitals whose performance on the outcomes
returned to or above levels predicted by the pre-FIX baseline were classified
as improvers, while those that maintained performance better than predicted by
the pre-FIX baseline were categorized as sustainers.
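As a concrete illustration, the four categories can be expressed as a simple
decision rule applied to each hospital's time-series results. The data step below
is a hypothetical sketch: the data set Hospital_Models and the indicator flags
are assumed names standing in for quantities derived from the fitted models,
not variables from the study's data.
/* Hypothetical sketch of the four-level classification rule; each
   record is one hospital-outcome pair with flags derived from its
   interrupted time-series model. */
Data Classified;
set Hospital_Models;
length category $10;
if high_variance then category = 'No change'; /* outcome too noisy to model */
else if not improved_during_fix then category = 'No benefit';
else if sustained_post_fix then category = 'Sustainer';
else category = 'Improver'; /* improved during FIX, then regressed */
run;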
Overall, the results of the analysis found that a number of hospitals
improved LOS (35%) and discharges before noon (46%). However, of those
facilities that improved, hospitals were more likely to sustain improvements
related to LOS. In total, only 27 (21%) hospitals showed sustained improvements
for LOS and 19 (17%) hospitals for discharges before noon. Assuming that
long-term sustained improvements were important, this analysis revealed a need
for a better understanding of how QI efforts interact with the broader
organizational context and how to create an organizational context that would
better support sustained QI results.
To begin to understand the interaction between organizational context and
QI efforts, the second half of this project considered the literature addressing the
relationship between different organizational characteristics and a variety of
measures related to quality. This literature review identified several shortcomings
and found no standard model for evaluating how organizational context modifies
QI programs. To guide further analysis, a new analytic framework was
developed to describe the interaction between four components of an
organization's context. These components, facility structure, QI structure, QI
process, and team character, were modeled as building on top of each other.
This meant that the facility structure provided a basic framework, and that
framework would itself modify how the QI structure contributed to the
organizational context, with similar relationships between QI structure and QI
process and between QI process and team character. This analytic framework
drove the selection of key variables that measured components of the
organizational context. The framework also led to the selection of data mining
decision trees as an analytic tool for modeling and understanding the complex
interactions between different variables.
The major analysis in the second half of the project still worked with the
basic FIX case study. In this analysis, the goal was to develop decision tree
models that would identify combinations of organizational factors commonly
associated with hospital classification into one of the four performance
categories. The tested organizational factors came from facility responses to two
surveys completed during the same time frame as FIX. Despite a list of 263
potential factors, the data mining analysis was unable to produce models that
predicted hospital performance better than chance. This lack of success was
likely partially due to the lack of measures of QI team character, but it also
clearly suggests that there are many challenges in effectively measuring the
unique hospital characteristics that created the context determining whether
efforts with FIX were successful.
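For readers unfamiliar with the technique, the sketch below indicates what a
decision-tree analysis of this general shape might look like using PROC
HPSPLIT. It is a hedged illustration rather than the study's actual code: PROC
HPSPLIT appeared in later SAS/STAT releases than the software available for
this project may have included, and the data set and predictor names are
hypothetical stand-ins for the 263 survey-derived factors.
/* Hypothetical decision-tree sketch: predict a hospital's performance
   category from a few stand-in organizational-context factors. */
Proc HPSplit Data=Org_Context Seed=20130501;
class performance_category region teaching_status;
model performance_category = region teaching_status staff_sufficiency
inpatient_resources data_collection_score guideline_activities;
grow entropy; /* split criterion for a categorical target */
prune costcomplexity; /* prune back to guard against overfitting */
run;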
While these findings were unfortunate, the decision trees did help
uncover a number of important insights that can guide hospital policy
considerations and future research. The first important finding was that the
decision trees consistently identified four variable categories that played critical
roles in establishing the nature of the organizational context. The first three
categories, sufficient staff, inpatient resources, and data collection, all provided
separate but complementary avenues for creating an organizational context
that can either facilitate or hinder QI projects. The fourth category focused on
different activities used to promote adherence to clinical practice guidelines.
While there can be no definitive conclusions about the relationship between
these variables and success with QI, they did provide valuable insight that local
leaders can evaluate when determining how to optimize, or at least improve, their
QI programs.
The other important finding from the decision trees related to the
differences between the trees for LOS and discharges before noon.
These two trees further confirmed that improving and sustaining quality is a
two-stage process. The discharges before noon tree, in which a greater number of
hospitals were able to improve but not sustain, emphasized QI process variables
that measured whether the hospital had the experience to pull together a QI team
and make initial improvements. In contrast, the LOS decision tree, which had a
greater number of sustainers, highlighted the critical role of QI structural
components in providing appropriate support to help maintain quality levels even
after the completion of the initial QI project.
The data mining decision tree analysis was subject to a number of
limitations. The key limitation to remember was that there were no specific
measures of the FIX QI teams and how their interactions or approaches
impacted each hospital's success with FIX. Despite these limitations, the analysis
was able to highlight the strengths of the data mining approach as a potential tool
for certain analytic, particularly hypothesis-generating, research activities.
Further, this study clearly identified that there were challenges to sustaining
healthcare quality and began to identify different approaches that may help in
meeting this challenge.
Human Factors and Change Management
The introduction to this project considered the roles that poor design, from a
human factors perspective, and resistance to change may play in the
current inability of healthcare organizations to make substantial improvements in
quality. Both of these factors were hard to examine directly because the available
data did not truly measure concepts from these theories; however, a few variables
did provide some insight. This section briefly discusses these insights and whether
they suggest that human factors and change management theories could potentially
overcome some of the observed challenges associated with creating sustained
quality.
The challenge in assessing the role of human factors was the lack of
information about the decisions QI teams made in trying to achieve the goals of
FIX. As such, it is not clear whether successful approaches had better human
factors designs than non-successful approaches. In the survey data there
was a specific factor that assessed whether hospitals typically used human
factors assessment to develop electronic reminders. This factor did not appear in
any of the models. Another potentially related factor, use of pilot testing for
electronic reminders, did appear in the discharges before noon and LOS/Noon
composite decision trees. In those models, the factor suggested that hospitals
that did not pilot test reminders would not sustain improvements and were even
unlikely to make initial improvements. Another indication that human
factors approaches could improve the likelihood of sustaining results comes from
the many appearances of the staff sufficiency variable in the decision trees.
When factors from this variable class appeared in the models, they often
separated improvers or sustainers from hospitals that did not succeed with FIX.
However, almost uniformly the factor appeared after some other resource or
activity was viewed as insufficient, suggesting that QI often relies on staff
remembering or recalling that they are responsible for completing certain
actions to ensure quality. While providers will always play a significant role in
ensuring quality, the incorporation of human factors design principles into
quality improvement projects may help identify more robust solutions that avoid
the potential for declining quality over time.
For change management, there were three factors in the guideline
implementation variable category that measured whether physicians, nurses, or
other providers had any resistance to relevant QI projects. Neither the individual
factors nor the composite resistance measure appeared in any of the decision
trees. This likely was because approximately 85% of hospitals reported very
little or some resistance for each of these three measures. With so many
hospitals reporting middle-range values, there may not have been enough
variation between sites for these measures to appear in the models. Further,
these were broad measures of resistance to efforts to promote guideline
adherence, so individual QI teams may have experienced different levels of
resistance to the changes associated with FIX. While these data make it seem
that change resistance played a minimal or negligible role in the small number
of hospitals that improved or sustained quality, there was some indirect evidence
that change resistance could be an important factor to consider. As discussed in
Chapter 7, the decision trees showed that consistent and repetitive approaches
to ensuring adherence to clinical practice guidelines were associated with a
greater likelihood of success with FIX. To the degree that consistent and
repetitive approaches help providers accept change and reduce resistance, this
can serve as an indicator that there can be meaningful resistance to change. So
while a weak finding, it is an important concept to keep in mind and to consider
should hospitals find they have difficulty achieving their quality goals.
Recommendations for Improving QI
Although this work represents only a single case series involving 130
hospitals and resulted in a collection of predictive models with little predictive
ability, the entire synthesis of literature and data has highlighted several issues.
The following are five recommendations based on this project that represent
approaches local quality leaders should consider as they work to develop an
effective QI program.
1. Make your QI efforts about quality, not about meeting a requirement.
Successful projects are those that people believe in and want to see
become successful. Far too often, the people affected by a QI project (if not the
actual QI team itself) are told they must change in order to meet some internal or
external requirement that often seems arbitrary. This is a setting where change
resistance may be maximized and the chances of project success minimized. These
situations will often be marked by initial improvements followed by rapid
degradation of those improvements after project completion. This type of effect
could explain the overall system-wide response on the discharges before noon
outcome. If enough QI teams treated the goal as something they had to do to
satisfy a request from central office, the teams would have had just enough buy-in
to generate initial improvements to report, but once no one was monitoring and
reporting rates of discharge before noon, providers returned to their original
discharge process. The key here is to encourage QI teams to identify early on,
and properly communicate, a project value (ideally for all of the stakeholders) that
goes beyond simply meeting arbitrary requirements. If done properly, the larger
healthcare community should have the necessary motivation to improve and
sustain those improvements.
2. Aim for real change, not just re-education.
While effective QI will include education, an effective QI team must work
to understand the process and what about that process allows poor quality to
occur. Then the team can identify ways to change the process that will eliminate
sources of poor quality. Education can then focus on helping providers
understand the new process and the benefits of that process. In contrast,
education that relies solely on encouraging providers to perform better, which
they will strive to do, is unlikely to sufficiently support efforts and will not lead to
lasting improvements in quality.
Another important consideration is that the QI team should make sure
there is a definable and consistent process relevant to the outcome of interest.
Quite often the issue in healthcare is that there are few standardized processes,
making it difficult to broadly implement changes when everyone performs
differently. This potential issue was the motivating factor for developing the FIX
classification approach that separated facilities with high variability (no change)
from those that had low variability but did not improve (no benefit).
3. Empower and excite.
Change is most lasting when those who provide frontline care are involved
and truly excited about the QI project. The data in this study indicated that staff
were critical to supporting QI; the real question was how to most efficiently utilize
staff in achieving goals. While it is critically important that those who formulate
the strategic plan for an organization make it clear that they value and support
QI, there is only so much that management in many healthcare systems can do
to effect change. Instead, it must be the frontline leaders who recognize a quality
problem, communicate the need for change, and motivate those around them to
overcome the challenge. Additionally, it is these people who understand how a
process truly occurs and can best identify the waste or potential sources of error.
Only when there is true energy at the front lines for supporting and making a
change is it possible to achieve long-term quality.
4. Measure and evaluate.
Measures of data collection were frequently used to separate different
performance categories in the decision tree models. In short, it is impossible to
improve quality without a clear understanding of the current state of
performance. Similarly, sustaining performance requires monitoring performance
and being prepared to respond should new sources of error emerge. This
process has its own challenges, as hospitals must carefully identify how
frequently to collect and report data, as well as how to ensure that data are
reported in a format that local quality leaders can interpret and use to develop
plans of action.
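One common way to operationalize this kind of ongoing monitoring is a
statistical process control chart. The sketch below is a hypothetical illustration
using PROC SHEWHART from SAS/QC; the data set and variable names are
assumptions (one record per month, with n_noon the count of discharges before
noon and n_total the total discharges), and it is offered as one possible approach
rather than as part of this study's analysis.
/* Hypothetical p-chart sketch: plot the monthly proportion of
   discharges before noon with control limits, so post-project drift
   back toward baseline would stand out and trigger review. */
Proc Shewhart Data=Monthly_Discharges;
pchart n_noon*month / subgroupn=n_total;
run;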
5. Start small, dream big.
All QI approaches include some level of focus on continuous improvement
and monitoring. The continuous improvement process serves many critical
purposes, but perhaps most importantly it recognizes that most processes are
subject to multiple sources of waste or error. This means that QI teams need the
ability to systematically and sequentially tackle different issues rather than feeling
that a successful project must address all problems with a single intervention. In
addition to keeping the team from taking on too large a project, this approach
helps teams meet individual goals, which can be an excellent way to maintain
interest and excitement about the project.
Future Studies
As with any hypothesis-generating study, these analyses have generated
a number of potential future research avenues. This section considers four key
areas and outlines some potential research projects. First, the clearest finding
from this research was that hospitals had varying levels of success with QI and
that it was difficult to predict hospital performance. This finding suggests it is
critical to develop a better understanding of the QI process and begin to identify
critical events that establish a greater likelihood of project success or failure.
Studies in this area could build on the analytic framework from this study, making
sure that they evaluate key characteristics of participating QI teams. Some
considerations are which individuals (by profession and position in the
organization) compose the team, how well team members interact with each
other, and how well the team interacts with others in the organization.
Additionally, there is a need for studies that can develop composite measures of
hospital QI success that consider success rates across all QI efforts and not just
a single QI project.
This avenue of research should not be limited to primarily quantitative
studies such as this one, but should also utilize mixed-method or purely qualitative
study designs for understanding the QI process. A potential structure for these
studies could be to identify a framework for improvement (such as those found in
the work of Kotter,6 Nelson et al.,75 or Langley et al.76), assess how well
improvement teams adhere to these frameworks, and determine whether their
adherence to the framework relates to their ability to improve (and even sustain)
quality. The challenge with these sorts of studies will be in determining how to
generate enough scale to create generalizable knowledge, so an important
secondary objective of these studies may be to determine how to effectively
survey QI teams to collect key measures, thus facilitating less resource-intensive
quantitative studies.
This concept introduces a second important area for future research. One
surprising finding from the data mining trees was that very few composite
measures representing different variable categories appeared in the final
decision trees. As this work moves away from considering whether individual
projects succeed to evaluating whether an organizational context can broadly
support multiple QI initiatives, it will be important to identify and validate
composite measures that have some relevancy across multiple settings. As
this study showed, individual measures will have little predictive power when
applied in different settings.
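As a hypothetical illustration of one way to construct such a composite, several
related survey items can be standardized and averaged into a single score; the
data set and item names below are assumptions, not items from the study's
surveys.
/* Hypothetical composite-measure sketch: standardize three related
   survey items to mean 0 and standard deviation 1, then average
   them into one organizational-support score. */
Proc Stdize Data=Survey_Items Out=Survey_Std Method=std;
var item_staffing item_resources item_data_use;
run;
Data Survey_Composite;
set Survey_Std;
qi_support_composite = mean(item_staffing, item_resources, item_data_use);
run;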
A third area for future research is a more in-depth examination
of the many variable categories that were commonly identified in the decision
trees. As already discussed, this can include examining the exact meaning of
sufficient staffing or inpatient resources and further understanding how
those staff or resources are being used to support improvements in quality. Other
considerations from the findings include examining sources of information about
QI, the accuracy of that information, and how well information is disseminated.
Some of the decision points in the discharge before noon decision tree
suggested that some VA teams were relying on information sources that were
ineffective in supporting their QI efforts.
The final area of research builds further on this concept of team
information. However, this research needs to focus less on where teams get their
information and more on how to develop information systems to support clinical
care and quality improvement. These efforts need to address not only how to
best collect data, but also what data are worth collecting and how to create
useful syntheses of those data. While this research is outside the typical realm of
health services research and is more of an engineering or information systems
project, this is far too critical an area for the future of healthcare quality not to
mention as a future area of study.
Conclusions
As demonstrated by the distribution of hospital performance during FIX
across the four classification categories as well as the inability of the decision
trees to utilize a set of 263 variables to create accurate predictive models for
hospital performance, there is still considerable work required to understand how
and why QI projects are successful. The four areas of future study outlined in this
chapter represent some key areas, but realistically are only a small portion of the
many questions to consider. Hopefully this research has highlighted that QI in
healthcare is a challenge and hospitals should not expect easy fixes. Instead,
hospitals and QI teams need to recognize the challenges they face as they try to
improve and then sustain quality. With reflection (particularly about the barriers
behind unsuccessful projects), commitment to a process, and investment in
appropriate structural supports, hospitals can achieve their quality goals.
APPENDIX A – RISK ADJUSTMENT MODEL SAS CODE
/* Test Correlation*/
Proc CORR Data=Risk_Adjustment OUTP=CorrMat;
var QUAN05_AIDS QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD
QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia
QUAN05_Depression
QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_Hypothyr QUAN05_Liver QUAN05_Lymphom
QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Mildliver
QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity
QUAN05_paraly QUAN05_Pepticulcer QUAN05_Psychosis QUAN05_Pulmcirc
QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail
QUAN05_Rheumatic QUAN05_Sevliver QUAN05_UlcerNoBleed QUAN05_Valve
QUAN05_WeightLoss ICU_DIRECT_ADMIT DISP_DiedHosp
DISP_Transout;
run;
/*Full Variable List
age_cat(Ref=FIRST) sex MS(Ref='M') income race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(REF='5') QUAN05_AIDS QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem QUAN05_CHF
QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem QUAN05_Dementia
QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_Hypothyr QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant
QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol
QUAN05_Nometast QUAN05_Obesity QUAN05_paraly QUAN05_Pepticulcer
QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis
QUAN05_RenalFail QUAN05_Rheumatic QUAN05_Sevliver QUAN05_UlcerNoBleed
QUAN05_Valve QUAN05_WeightLoss SOURCE_cat(REF='1M') ICU_DIRECT_ADMIT
DISTO_cat(REF='-1') DISTYPE(REF='1') DISP_DiedHosp DISP_Transout */
/* LOS Model*/
/* Includes all variables found to have a p< 0.1 association in
single variable models */
/* Removed = Diab_NC Hypothyr Obesity UlcerNoBleed*/
/* AIC = 106542.6150 */
Proc Genmod Data=Risk_Adjustment;
class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(REF='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model log_los = age_cat sex MS income race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD
QUAN05_Deficanem QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM
QUAN05_DrugAbus QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM
QUAN05_Hyper_NC QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant
QUAN05_Metastic QUAN05_MI QUAN05_Mildliver QUAN05_Neurol
QUAN05_Nometast QUAN05_paraly QUAN05_Pepticulcer QUAN05_Psychosis
QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail
QUAN05_Rheumatic QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss
SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat DISTYPE DISP_DiedHosp
DISP_Transout / DIST=NORMAL;
run;
/*Removed Variables that do not have a p< 0.1 value in the full model*/
/* Removed = Sex COPD Depression DrugAbuse Hyper_CM Lymphoma MI
Mildliver Nonmetast PepticUlcer RenalDisease RenalFailure SevLiver
DISTYPE Disp_DiedHosp Disp_Transout*/
/* AIC = 106562.6133 */
Proc Genmod Data=Risk_Adjustment;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model log_los = age_cat MS income race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_CVD
QUAN05_Deficanem QUAN05_Dementia QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hemipara QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
QUAN05_Metastic QUAN05_Neurol QUAN05_paraly QUAN05_Psychosis
QUAN05_Pulmcirc QUAN05_PVD QUAN05_Rheumatic QUAN05_Valve
QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat /
DIST=NORMAL;
run;
/* Adding back in variables, up to p < 0.2 */
/* Variables Added = COPD Depression MI RenalDisease*/
/* AIC = 106524.5677*/
Proc Genmod Data=Risk_Adjustment;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model log_los = age_cat MS income race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD
QUAN05_Deficanem QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM
QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_NC QUAN05_Liver
QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol
QUAN05_paraly QUAN05_Psychosis QUAN05_Pulmcirc QUAN05_PVD
QUAN05_RenalDis QUAN05_Rheumatic QUAN05_Valve QUAN05_WeightLoss
SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat/ DIST=NORMAL;
run;
/* The Above is the FINAL MODEL, next steps as well as correlation
testing make no significant improvement to the AIC */
/* Removing variables back to the p < 0.1 level */
/* Variables Removed = COPD Depression MI */
/* AIC = 106542.4696*/
Proc Genmod Data=Risk_Adjustment;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model log_los = age_cat MS income race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_CVD
QUAN05_Deficanem QUAN05_Dementia QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hemipara QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
QUAN05_Metastic QUAN05_Neurol QUAN05_paraly QUAN05_Psychosis
QUAN05_Pulmcirc QUAN05_PVD QUAN05_RenalDis QUAN05_Rheumatic
QUAN05_Valve QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT
DISTO_cat/ DIST=NORMAL;
run;
/* Removing variables to the p < 0.05 level */
/* Variables Removed = Hemiparesis */
/* AIC = 106525.5483*/
Proc Genmod Data=Risk_Adjustment;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model log_los = age_cat MS income race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit
QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat QUAN05_CVD
QUAN05_Deficanem QUAN05_Dementia QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic
QUAN05_Neurol QUAN05_paraly QUAN05_Psychosis QUAN05_Pulmcirc
QUAN05_PVD QUAN05_RenalDis QUAN05_Rheumatic QUAN05_Valve
QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat/
DIST=NORMAL;
run;
/* Discharge Time Model*/
/* Includes all variables found to have a p<.1 association in single
variable models */
/* Removed = Income AIDS BloodAnem Coagulation CVD Dementia
Depression FluidDis Hypothyr Liver Lymphom Malignant MildLiver
Neuro PepticUlcer Pulmcirc SevLiver UlcerNoBleed Valve
WeightLoss*/
/* AIC = 36850.3500 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model NoonDischarge = age_cat sex MS race_category SCPER_cat MDC
QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_COPD
QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Metastic
QUAN05_MI QUAN05_Nometast QUAN05_Obesity QUAN05_paraly
QUAN05_Psychosis QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail
QUAN05_Rheumatic SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat DISTYPE
DISP_DiedHosp DISP_Transout / DIST=BINOMIAL;
run;
/* Remove variables not meeting p<0.2 in full model */
/* Removed: Sex SCPER_CAT Arthrit COPD DrugAbuse Hemiparesis Metastic
MI Nonmetast Obesity Paralysis Psychosis RenalDisease Rheumatic
Distype Disp_DiedHospital Disp_Transout*/
/* AIC = 36827.7762*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') DISTO_cat(REF='-1')/
PARAM=REFERENCE;
model NoonDischarge = age_cat MS race_category MDC QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM
QUAN05_Diab_NC QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_PVD
QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat
QUAN05_Arthrit/ DIST=BINOMIAL;
run;
/* Remove variables not meeting p<0.1 in reduced model */
/* Removed: Alcohol Diab_NC Hyper_CM Hyper_NC*/
/* AIC = 36828.0531*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') DISTO_cat(REF='-1')/
PARAM=REFERENCE;
model NoonDischarge = age_cat MS race_category MDC QUAN05_Arrhyth
QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM QUAN05_PVD
QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat /
DIST=BINOMIAL;
run;
/* Add correlation variables to the p<0.2 reduced model */
/* Added: Arthritis */
/* AIC = 36823.6495*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') DISTO_cat(REF='-1')/
PARAM=REFERENCE;
model NoonDischarge = age_cat MS race_category MDC QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Deficanem
QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_PVD QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat /
DIST=BINOMIAL;
run;
/* The Above is the FINAL MODEL, next steps as well as any other
correlation testing make no significant improvement to the AIC */
/* Reduce correlation model to p<0.1 */
/* Removed: Alcohol Diab_NC Hyper_CM Hyper_NC */
/* AIC = 36823.9588*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') DISTO_cat(REF='-1')/
PARAM=REFERENCE;
model NoonDischarge = age_cat MS race_category MDC QUAN05_Arrhyth
QUAN05_Arthrit QUAN05_CHF QUAN05_Deficanem QUAN05_Diab_CM QUAN05_PVD
QUAN05_RenalFail SOURCE_cat ICU_DIRECT_ADMIT DISTO_cat /
DIST=BINOMIAL;
run;
/* 30-day Mortality Model*/
/* Includes all variables found to have a p<.1 association in single
variable models */
/* Removed = Income AIDS Arthrit Deficanem Hypothyr PepticUlcer
Psychosis PVD Rheumatic UlcerNoBleed Disto_cat DisType
DISP_DiedHosp DISP_Transout */
/* AIC = 11741.2237*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M') /
PARAM=REFERENCE;
model Died30Day = age_cat sex MS race_category SCPER_cat MDC
QUAN05_ALCOHOL QUAN05_Arrhyth QUAN05_bloodanem QUAN05_CHF
QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Dementia
QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
QUAN05_FluidDis QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_Liver QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic
QUAN05_MI QUAN05_Mildliver QUAN05_Neurol QUAN05_Nometast
QUAN05_Obesity QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis
QUAN05_RenalFail QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss
SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial;
run;
/* Includes all variables found to have a p<.2 association in full
model */
/* Removed = Sex SCPER_Cat BloodAnemia COPD Diab_NC DrugAbuse
Hemiparesis Lymphoma MildLiver RenalDisease RenalFailure Valve*/
/* AIC = 11751.0545*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') / PARAM=REFERENCE;
model Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_CVD
QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast
QUAN05_Obesity QUAN05_paraly QUAN05_Pulmcirc QUAN05_Sevliver
QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial;
run;
/* Add in Correlations to Reduced Model*/
/* Added = RenalDisease */
/* AIC = 11722.0331*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') / PARAM=REFERENCE;
model Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_CVD QUAN05_Dementia
QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis QUAN05_Hyper_CM
QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant QUAN05_Metastic
QUAN05_MI QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity QUAN05_paraly
QUAN05_Pulmcirc QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss
SOURCE_cat ICU_DIRECT_ADMIT / DIST=Binomial;
run;
/* The Above is the FINAL MODEL, next steps as well as any other
correlation testing make no significant improvement to the AIC */
/* Reducing Correlation Model to p<0.1*/
/* Removed = CVD Diab_CM Obesity*/
/* AIC = 11723.3343 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) MS(Ref='M') race_category(Ref=Last)
MDC(Ref='5') SOURCE_cat(REF='1M') / PARAM=REFERENCE;
model Died30Day = age_cat MS race_category MDC QUAN05_ALCOHOL
QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_Dementia
QUAN05_Depression QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_Neurol
QUAN05_Nometast QUAN05_paraly QUAN05_Pulmcirc QUAN05_RenalDis
QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT /
DIST=Binomial;
run;
/* In-hospital Mortality Model*/
/* Includes all variables found to have a p<.1 association in single
variable models */
/* Removed = Income AIDS Alcohol Arthrit DeficAnem Hypothyr
Psychosis PVD Rheumatic UlcerNoBleed Disto_cat Distype
Disp_Transout*/
/* AIC = 8562.6092*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model DISP_DiedHosp= age_cat sex MS race_category SCPER_cat MDC
QUAN05_Arrhyth QUAN05_bloodanem QUAN05_CHF QUAN05_Coagulat
QUAN05_COPD QUAN05_CVD QUAN05_Dementia QUAN05_Depression
QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus QUAN05_FluidDis
QUAN05_Hemipara QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver
QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI
QUAN05_Mildliver QUAN05_Neurol QUAN05_Nometast QUAN05_Obesity
QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis
QUAN05_RenalFail QUAN05_Sevliver QUAN05_Valve QUAN05_WeightLoss
SOURCE_cat ICU_DIRECT_ADMIT / DIST=BINOMIAL;
run;
/* Includes all variables found to have a p<.2 association in the full
model */
/* Removed = BloodAnemia CVD Diab_NC DrugAbuse Hemiparesis Lymphom
MildLiver Obesity RenalFail Valve */
/* AIC = 8551.4851*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex MS(Ref='M') race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model DISP_DiedHosp= age_cat sex MS race_category SCPER_cat MDC
QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD
QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast
QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis
QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT/
DIST=BINOMIAL;
run;
/* Remove MS from the p<.2 Reduced Model */
/* Removed = MS */
/* AIC = 8546.3365*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model DISP_DiedHosp= age_cat sex race_category SCPER_cat MDC
QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD
QUAN05_Dementia QUAN05_Depression QUAN05_Diab_CM QUAN05_FluidDis
QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver QUAN05_Malignant
QUAN05_Metastic QUAN05_MI QUAN05_Neurol QUAN05_Nometast
QUAN05_paraly QUAN05_Pepticulcer QUAN05_Pulmcirc QUAN05_RenalDis
QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat ICU_DIRECT_ADMIT/
DIST=BINOMIAL;
run;
/* The Above is the FINAL MODEL, next steps as well as any other
correlation testing make no significant improvement to the AIC */
/* Includes all variables found to have a p<.1 association in the
reduced model */
/* Removed = Sex Diab_CM PepticUlcer*/
/* AIC = 8547.6959*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model DISP_DiedHosp= age_cat race_category SCPER_cat MDC QUAN05_Arrhyth
QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_Dementia
QUAN05_Depression QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC
QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI
QUAN05_Neurol QUAN05_Nometast QUAN05_paraly QUAN05_Pulmcirc
QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat
ICU_DIRECT_ADMIT/ DIST=BINOMIAL;
run;
/* All Cause Readmission Model*/
/* Includes all variables found to have a p<.1 association in single
variable models */
/* Removed = MS Income Alcohol Dementia Depression Hemiparesis
Hypothyroidism Neuro Paralysis PepticUlcer Psychosis Rheumatic
UlcerNoBleed ICU_Direct_Admit Disp_DiedHosp Disp_Transout*/
/* AIC = 34500.2496 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model allcause_readmit_flag = age_cat sex race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_bloodanem
QUAN05_CHF QUAN05_Coagulat QUAN05_COPD QUAN05_CVD QUAN05_Deficanem
QUAN05_Depression QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_DrugAbus
QUAN05_FluidDis QUAN05_Hyper_CM QUAN05_Hyper_NC QUAN05_Liver
QUAN05_Lymphom QUAN05_Malignant QUAN05_Metastic QUAN05_MI
QUAN05_Mildliver QUAN05_Nometast QUAN05_Obesity QUAN05_Pulmcirc
QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail QUAN05_Sevliver
QUAN05_Valve QUAN05_WeightLoss SOURCE_cat DISTO_cat DISTYPE /
DIST=BINOMIAL;
run;
/* Includes all variables found to have a p<.2 association in Full
Model*/
/* Removed = Sex BloodAnem CVD Depression DrugAbuse Hyper_Cm Liver
Lymphoma MildLiver NonMetast Pulmcirc Valve Distype*/
/* AIC = 34489.2946 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model allcause_readmit_flag = age_cat race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Coagulat
QUAN05_COPD QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC
QUAN05_FluidDis QUAN05_Hyper_NC QUAN05_Malignant QUAN05_Metastic
QUAN05_MI QUAN05_Obesity QUAN05_PVD QUAN05_RenalDis QUAN05_RenalFail
QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat DISTO_cat /
DIST=BINOMIAL;
run;
/* Reduced to p<0.1 and Adjust for Correlation Effects*/
/* Removed = Hyper_NC RenalFail*/
/* Added = Liver*/
/* AIC = 34487.3895 */
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model allcause_readmit_flag = age_cat race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_Arrhyth QUAN05_Arthrit QUAN05_CHF QUAN05_Coagulat
QUAN05_COPD QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC
QUAN05_FluidDis QUAN05_Liver QUAN05_Malignant QUAN05_Metastic
QUAN05_MI QUAN05_Obesity QUAN05_PVD QUAN05_RenalDis QUAN05_Sevliver
QUAN05_WeightLoss SOURCE_cat DISTO_cat / DIST=BINOMIAL;
run;
/* The Above is the FINAL MODEL, next steps as well as any other
correlation testing make no significant improvement to the AIC */
/* Reduced to p<0.05 */
/* Removed = Arthrit Obesity*/
/* AIC = 34490.2071*/
Proc Genmod Data=Risk_Adjustment DESCENDING;
class age_cat(Ref=FIRST) sex race_category(Ref=Last)
SCPER_cat(REF=FIRST) MDC(Ref='5') SOURCE_cat(REF='1M')
DISTO_cat(REF='-1') DISTYPE(REF='1') / PARAM=REFERENCE;
model allcause_readmit_flag = age_cat race_category SCPER_cat MDC
QUAN05_AIDS QUAN05_Arrhyth QUAN05_CHF QUAN05_Coagulat QUAN05_COPD
QUAN05_Deficanem QUAN05_Diab_CM QUAN05_Diab_NC QUAN05_FluidDis
QUAN05_Liver QUAN05_Malignant QUAN05_Metastic QUAN05_MI QUAN05_PVD
QUAN05_RenalDis QUAN05_Sevliver QUAN05_WeightLoss SOURCE_cat
DISTO_cat / DIST=BINOMIAL;
run;
APPENDIX B – SAS OUTPUT FOR RISK ADJUSTMENT
The GENMOD Procedure

Model Information
Data Set            MYDATA.VERIFICATION
Distribution        Normal
Link Function       Identity
Dependent Variable  log_los

Number of Observations Read  60000
Number of Observations Used  60000

Criteria For Assessing Goodness Of Fit
Criterion                  DF    Value          Value/DF
Deviance                   6E4   41474.6831     0.6923
Scaled Deviance            6E4   60000.0000     1.0016
Pearson Chi-Square         6E4   41474.6831     0.6923
Scaled Pearson X2          6E4   60000.0000     1.0016
Log Likelihood                   -74058.4710
Full Log Likelihood              -74058.4710
AIC (smaller is better)          148308.9420
AICC (smaller is better)         148309.2529
BIC (smaller is better)          149173.1436
Algorithm converged.
Analysis Of Maximum Likelihood Parameter Estimates
[The parameter estimates table for the log_los model (columns: Parameter, DF, Estimate, Standard Error, Wald 95% Confidence Limits, Wald Chi-Square, Pr > ChiSq) was garbled during text extraction, and its rows could not be reliably realigned. The listed effects were the intercept, age_cat, MS, INCOME, race_category, SCPER_Cat, MDC, the QUAN05 comorbidity indicators, Source_cat, ICU_DIRECT_ADMIT, Disto_cat, and the scale parameter.]
NOTE: The scale parameter was estimated by maximum likelihood.
The GENMOD Procedure

Model Information
Data Set            MYDATA.VERIFICATION
Distribution        Binomial
Link Function       Logit
Dependent Variable  NoonDischarge

Number of Observations Read  60000
Number of Observations Used  60000
Number of Events             10911
Number of Trials             60000

PROC GENMOD is modeling the probability that NoonDischarge='1'.

Criteria For Assessing Goodness Of Fit
Criterion                  Value
Log Likelihood             -27126.7305
Full Log Likelihood        -27126.7305
AIC (smaller is better)    54401.4609
AICC (smaller is better)   54401.6461
BIC (smaller is better)    55067.6163
WARNING: Negative of Hessian not positive definite.
Analysis Of Maximum Likelihood Parameter Estimates
Parameter
Intercept
age_cat
age_cat
age_cat
age_cat
age_cat
age_cat
age_cat
age_cat
age_cat
MS
MS
MS
MS
MS
race_catego
race_catego
race_catego
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
MDC
Analysis Of Maximum Likelihood Parameter Estimates (continued)

Parameter           Level   DF   Estimate   Std Error   Wald 95% Conf Limits    Wald Chi-Sq   Pr > ChiSq
Intercept                    1    -1.5855      0.0608   (-1.7047, -1.4662)           678.95       <.0001
age_cat             2        1    -0.0997      0.0712   (-0.2392, 0.0398)              1.96       0.1614
age_cat             3        1    -0.0025      0.0622   (-0.1244, 0.1195)              0.00       0.9686
age_cat             4        1     0.0080      0.0580   (-0.1057, 0.1217)              0.02       0.8902
age_cat             5        1     0.0052      0.0590   (-0.1104, 0.1208)              0.01       0.9300
age_cat             6        1     0.0380      0.0634   (-0.0862, 0.1622)              0.36       0.5488
age_cat             7        1    -0.0171      0.0633   (-0.1412, 0.1069)              0.07       0.7867
age_cat             8        1    -0.0649      0.0638   (-0.1900, 0.0602)              1.03       0.3095
age_cat             9        1    -0.0490      0.0646   (-0.1755, 0.0775)              0.58       0.4477
age_cat             10       1    -0.0372      0.0697   (-0.1738, 0.0995)              0.28       0.5939
MS                  D        1     0.0929      0.0266   (0.0407, 0.1452)              12.17       0.0005
MS                  N        1     0.0130      0.0381   (-0.0617, 0.0877)              0.12       0.7329
MS                  S        1     0.0529      0.0544   (-0.0538, 0.1595)              0.94       0.3311
MS                  U        1     0.1667      0.1784   (-0.1829, 0.5163)              0.87       0.3500
MS                  W        1    -0.0354      0.0389   (-0.1116, 0.0408)              0.83       0.3628
race_category       1        1    -0.2107      0.1378   (-0.4808, 0.0593)              2.34       0.1262
race_category       2        1    -0.1970      0.0336   (-0.2628, -0.1313)            34.47       <.0001
race_category       3        1     0.0613      0.0243   (0.0137, 0.1089)               6.38       0.0116
MDC                 0        1   -18.0777    7079.609   (-13893.9, 13857.70)           0.00       0.9980
MDC                 1        1    -0.0398      0.0457   (-0.1294, 0.0497)              0.76       0.3830
MDC                 2        1    -0.0910      0.2112   (-0.5050, 0.3230)              0.19       0.6665
MDC                 3        1     0.0413      0.0854   (-0.1260, 0.2086)              0.23       0.6288
MDC                 4        1    -0.2520      0.0366   (-0.3237, -0.1802)            47.35       <.0001
MDC                 6        1    -0.1402      0.0408   (-0.2201, -0.0603)            11.82       0.0006
MDC                 7        1    -0.2207      0.0610   (-0.3403, -0.1012)            13.10       0.0003
MDC                 8        1     0.0003      0.0560   (-0.1096, 0.1101)              0.00       0.9961
MDC                 9        1    -0.2209      0.0653   (-0.3489, -0.0930)            11.46       0.0007
MDC                 10       1    -0.0738      0.0577   (-0.1869, 0.0392)              1.64       0.2006
MDC                 11       1    -0.2257      0.0514   (-0.3264, -0.1250)            19.30       <.0001
MDC                 12       1     0.0147      0.1241   (-0.2285, 0.2578)              0.01       0.9060
MDC                 13       1    -0.2711      0.5421   (-1.3337, 0.7915)              0.25       0.6171
MDC                 14       1   -17.9954    16037.12   (-31450.2, 31414.19)           0.00       0.9991
MDC                 16       1    -0.2482      0.0859   (-0.4165, -0.0798)             8.35       0.0039
MDC                 17       1    -0.0810      0.0999   (-0.2768, 0.1147)              0.66       0.4173
MDC                 18       1    -0.2286      0.0874   (-0.4000, -0.0573)             6.84       0.0089
MDC                 19       1     0.0435      0.0923   (-0.1374, 0.2244)              0.22       0.6375
MDC                 20       1     0.6409      0.0697   (0.5043, 0.7775)              84.54       <.0001
MDC                 21       1     0.0504      0.0922   (-0.1304, 0.2311)              0.30       0.5849
MDC                 22       1    -0.8949      0.7625   (-2.3893, 0.5995)              1.38       0.2405
MDC                 23       1     0.2751      0.0810   (0.1163, 0.4339)              11.53       0.0007
MDC                 24       1    -0.8062      1.0749   (-2.9131, 1.3006)              0.56       0.4532
MDC                 25       1    -0.2048      0.1821   (-0.5617, 0.1521)              1.26       0.2608
QUAN05_ALCOHOL               1     0.0129      0.0401   (-0.0656, 0.0915)              0.10       0.7472
QUAN05_ARRHYTH               1    -0.0739      0.0290   (-0.1307, -0.0171)             6.50       0.0108
QUAN05_ARTHRIT               1    -0.1177      0.0910   (-0.2960, 0.0605)              1.68       0.1955
QUAN05_CHF                   1    -0.1790      0.0325   (-0.2427, -0.1154)            30.38       <.0001
QUAN05_DEFICANEM             1    -0.1805      0.0567   (-0.2917, -0.0694)            10.14       0.0015
QUAN05_DIAB_CM               1    -0.1962      0.0474   (-0.2891, -0.1033)            17.14       <.0001
QUAN05_DIAB_NC               1    -0.0758      0.0260   (-0.1269, -0.0248)             8.48       0.0036
QUAN05_HYPER_CM              1    -0.1302      0.0622   (-0.2522, -0.0083)             4.38       0.0364
QUAN05_HYPER_NC              1    -0.1184      0.0236   (-0.1646, -0.0721)            25.12       <.0001
QUAN05_PVD                   1     0.0405      0.0422   (-0.0422, 0.1233)              0.92       0.3370
QUAN05_RENALFAIL             1    -0.1160      0.0562   (-0.2261, -0.0058)             4.26       0.0391
Source_cat          1D       1    -0.1479      0.0810   (-0.3066, 0.0108)              3.33       0.0678
Source_cat          1G       1    -0.1623      0.2688   (-0.6891, 0.3645)              0.36       0.5460
Source_cat          1K       1     0.2462      0.0988   (0.0526, 0.4398)               6.21       0.0127
Source_cat          1P       1     0.1627      0.0226   (0.1185, 0.2069)              52.01       <.0001
Source_cat          1T       1     1.3782      0.1555   (1.0735, 1.6829)              78.59       <.0001
Source_cat          2A       1    -0.0626      0.1554   (-0.3671, 0.2419)              0.16       0.6868
Source_cat          3A       1     0.2036      0.1682   (-0.1260, 0.5332)              1.47       0.2260
Source_cat          3B       1     0.1279      0.1287   (-0.1243, 0.3801)              0.99       0.3203
ICU_DIRECT_ADMIT             1     0.2348      0.0273   (0.1813, 0.2883)              74.00       <.0001
Disto_cat           -3       1     0.4560      0.0677   (0.3234, 0.5887)              45.39       <.0001
Disto_cat           -2       1     1.6054      0.0529   (1.5018, 1.7091)             921.99       <.0001
Disto_cat           0        1     0.9000      0.0759   (0.7513, 1.0487)             140.75       <.0001
Disto_cat           3        1     1.1959      0.2354   (0.7345, 1.6572)              25.81       <.0001
Disto_cat           4        1     0.6038      0.1021   (0.4036, 0.8039)              34.96       <.0001
Disto_cat           5        1     1.4094      0.0481   (1.3152, 1.5036)             860.19       <.0001
Disto_cat           7        1     0.3295      0.0983   (0.1369, 0.5221)              11.24       0.0008
Disto_cat           11       1     0.9143      0.1111   (0.6965, 1.1322)              67.69       <.0001
Disto_cat           17       1    -0.9306      1.0441   (-2.9771, 1.1158)              0.79       0.3728
Disto_cat           22       1    -0.2399      0.2779   (-0.7847, 0.3048)              0.75       0.3880
Disto_cat           25       1   -17.5448    6455.267   (-12669.6, 12634.55)           0.00       0.9978
Disto_cat           30       1    -1.0321      1.0347   (-3.0600, 0.9959)              0.99       0.3185
Scale                        0     1.0000      0.0000   (1.0000, 1.0000)                  .            .
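Because these are logit-link models, each estimate is a log odds ratio. Under the usual Wald construction (the standard relationship for logistic coefficients, not something reported separately in this output), an estimate and its standard error convert to an odds ratio and confidence interval as

\[ \mathrm{OR} = e^{\hat{\beta}}, \qquad 95\%\ \mathrm{CI} = \left( e^{\hat{\beta} - 1.96\,\mathrm{SE}},\; e^{\hat{\beta} + 1.96\,\mathrm{SE}} \right). \]

For example, the Source_cat 1T row above (estimate 1.3782, standard error 0.1555) corresponds to an odds ratio of e^1.3782, approximately 3.97, with a 95% CI of roughly (e^1.0735, e^1.6829), or about (2.93, 5.38).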
The GENMOD Procedure

Model Information
Data Set              MYDATA.VERIFICATION
Distribution          Binomial
Link Function         Logit
Dependent Variable    Died30Day

Number of Observations Read    60000
Number of Observations Used    60000
Number of Events                2627
Number of Trials               60000

PROC GENMOD is modeling the probability that Died30Day='1'.
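The appendix reproduces only the output; the call that generated it is not shown. A minimal sketch of a PROC GENMOD invocation of this form follows. The variable list is abbreviated, and the CLASS reference levels and the use of DESCENDING (versus an EVENT= response option) are assumptions, not recoverable from the listing:

   proc genmod data=mydata.verification descending;
      /* Categorical predictors; GENMOD expands each level into the
         indicator terms seen in the parameter estimates table */
      class age_cat MS race_category MDC Source_cat;
      /* DESCENDING makes GENMOD model Pr(Died30Day='1') */
      model Died30Day = age_cat MS race_category MDC
                        QUAN05_ALCOHOL QUAN05_ARRHYTH /* ...remaining QUAN05_ comorbidity flags... */
                        Source_cat ICU_DIRECT_ADMIT
                        / dist=binomial link=logit;
   run;

The same skeleton, with the outcome and covariate list swapped, would produce the in-hospital mortality and readmission outputs later in this appendix.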
Criteria For Assessing Goodness Of Fit
Criterion                       Value
Log Likelihood             -8576.3926
Full Log Likelihood        -8576.3926
AIC (smaller is better)    17300.7853
AICC (smaller is better)   17300.9705
BIC (smaller is better)    17966.9407
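These criteria follow the standard definitions in terms of the maximized log likelihood L-hat, the number of estimated parameters k, and the number of observations n:

\[ \mathrm{AIC} = -2\log\hat{L} + 2k, \qquad \mathrm{AICC} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}, \qquad \mathrm{BIC} = -2\log\hat{L} + k\log n. \]

As a check, -2 log L-hat = 17152.785 and AIC = 17300.785 imply k = 74 estimated parameters here; with n = 60,000, BIC = 17152.785 + 74 ln(60000), approximately 17966.94, which matches the table.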
Analysis Of Maximum Likelihood Parameter Estimates
Parameter           Level   DF   Estimate   Std Error   Wald 95% Conf Limits    Wald Chi-Sq   Pr > ChiSq
Intercept                    1    -6.0997      0.2571   (-6.6037, -5.5957)           562.72       <.0001
age_cat             2        1     0.4912      0.2930   (-0.0831, 1.0656)              2.81       0.0937
age_cat             3        1     0.7316      0.2628   (0.2166, 1.2466)               7.75       0.0054
age_cat             4        1     1.1184      0.2523   (0.6239, 1.6130)              19.65       <.0001
age_cat             5        1     1.2381      0.2526   (0.7429, 1.7332)              24.01       <.0001
age_cat             6        1     1.4199      0.2564   (0.9173, 1.9225)              30.66       <.0001
age_cat             7        1     1.5908      0.2544   (1.0923, 2.0894)              39.11       <.0001
age_cat             8        1     1.9216      0.2531   (1.4255, 2.4177)              57.64       <.0001
age_cat             9        1     1.9501      0.2537   (1.4530, 2.4473)              59.11       <.0001
age_cat             10       1     2.3331      0.2552   (1.8329, 2.8333)              83.58       <.0001
MS                  D        1     0.0113      0.0551   (-0.0968, 0.1193)              0.04       0.8381
MS                  N        1    -0.0091      0.0805   (-0.1669, 0.1487)              0.01       0.9104
MS                  S        1    -0.0114      0.1175   (-0.2418, 0.2190)              0.01       0.9229
MS                  U        1     0.3065      0.3646   (-0.4081, 1.0212)              0.71       0.4005
MS                  W        1     0.0762      0.0638   (-0.0489, 0.2013)              1.43       0.2323
race_category       1        1     0.1025      0.2511   (-0.3896, 0.5946)              0.17       0.6830
race_category       2        1     0.0905      0.0634   (-0.0337, 0.2147)              2.04       0.1532
race_category       3        1    -0.0817      0.0495   (-0.1787, 0.0153)              2.73       0.0987
MDC                 0        1   -15.5073    11452.30   (-22461.6, 22430.59)           0.00       0.9989
MDC                 1        1     0.4896      0.1074   (0.2792, 0.7001)              20.79       <.0001
MDC                 2        1     0.2475      0.5962   (-0.9210, 1.4160)              0.17       0.6780
MDC                 3        1    -0.2209      0.2308   (-0.6733, 0.2314)              0.92       0.3384
MDC                 4        1     1.0701      0.0698   (0.9333, 1.2069)             235.15       <.0001
MDC                 6        1     0.4147      0.0892   (0.2399, 0.5895)              21.62       <.0001
MDC                 7        1     0.9986      0.1099   (0.7832, 1.2140)              82.58       <.0001
MDC                 8        1     0.5621      0.1254   (0.3164, 0.8078)              20.11       <.0001
MDC                 9        1    -0.0192      0.1939   (-0.3993, 0.3610)              0.01       0.9213
MDC                 10       1    -0.0542      0.1308   (-0.3106, 0.2021)              0.17       0.6783
MDC                 11       1     0.4499      0.0961   (0.2615, 0.6383)              21.90       <.0001
MDC                 12       1    -0.2165      0.2989   (-0.8024, 0.3694)              0.52       0.4689
MDC                 13       1     1.6022      1.0584   (-0.4723, 3.6766)              2.29       0.1301
MDC                 14       1   -14.0090    26440.74   (-51836.9, 51808.90)           0.00       0.9996
MDC                 16       1    -0.0333      0.1793   (-0.3847, 0.3182)              0.03       0.8527
MDC                 17       1     1.0349      0.1571   (0.7270, 1.3427)              43.40       <.0001
MDC                 18       1     1.6278      0.1146   (1.4032, 1.8524)             201.79       <.0001
MDC                 19       1     0.3938      0.2411   (-0.0788, 0.8664)              2.67       0.1024
MDC                 20       1    -0.6304      0.3191   (-1.2559, -0.0050)             3.90       0.0482
MDC                 21       1    -1.1221      0.4263   (-1.9577, -0.2865)             6.93       0.0085
MDC                 22       1     1.3677      1.0594   (-0.7087, 3.4442)              1.67       0.1967
MDC                 23       1     0.4862      0.1886   (0.1165, 0.8560)               6.64       0.0100
MDC                 24       1   -16.3130    8088.934   (-15870.3, 15837.71)           0.00       0.9984
MDC                 25       1     2.2312      0.2524   (1.7366, 2.7259)              78.17       <.0001
QUAN05_ALCOHOL               1     0.2461      0.0834   (0.0827, 0.4095)               8.72       0.0032
QUAN05_ARRHYTH               1     0.2942      0.0510   (0.1943, 0.3941)              33.30       <.0001
QUAN05_CHF                   1     0.4632      0.0552   (0.3550, 0.5714)              70.34       <.0001
QUAN05_COAGULAT              1     0.8462      0.0812   (0.6869, 1.0054)             108.47       <.0001
QUAN05_CVD                   1     0.2827      0.0786   (0.1286, 0.4368)              12.93       0.0003
QUAN05_DEMENTIA              1     0.3465      0.1097   (0.1315, 0.5615)               9.98       0.0016
QUAN05_DEPRESSION            1    -0.3371      0.0806   (-0.4950, -0.1792)            17.50       <.0001
QUAN05_DIAB_CM               1     0.0427      0.0878   (-0.1293, 0.2147)              0.24       0.6264
QUAN05_FLUIDDIS              1     0.7946      0.0470   (0.7024, 0.8867)             285.71       <.0001
QUAN05_HYPER_CM              1    -0.3502      0.0943   (-0.5349, -0.1654)            13.80       0.0002
QUAN05_HYPER_NC              1    -0.4008      0.0471   (-0.4931, -0.3085)            72.37       <.0001
QUAN05_LIVER                 1     0.5090      0.0934   (0.3259, 0.6921)              29.69       <.0001
QUAN05_MALIGNANT             1     0.9408      0.1203   (0.7050, 1.1766)              61.15       <.0001
QUAN05_METASTIC              1     1.2994      0.0738   (1.1548, 1.4440)             310.09       <.0001
QUAN05_MI                    1     0.6754      0.0738   (0.5307, 0.8201)              83.67       <.0001
QUAN05_NEUROL                1     0.6307      0.0757   (0.4823, 0.7792)              69.33       <.0001
QUAN05_NOMETAST              1    -0.0732      0.1279   (-0.3239, 0.1775)              0.33       0.5672
QUAN05_OBESITY               1    -0.1022      0.1329   (-0.3626, 0.1582)              0.59       0.4418
QUAN05_PARALY                1     0.6545      0.1289   (0.4019, 0.9070)              25.80       <.0001
QUAN05_PULMCIRC              1     0.4784      0.1000   (0.2823, 0.6744)              22.87       <.0001
QUAN05_RENALDIS              1     0.4185      0.0842   (0.2534, 0.5835)              24.69       <.0001
QUAN05_SEVLIVER              1     0.9237      0.1313   (0.6665, 1.1810)              49.53       <.0001
QUAN05_WEIGHTLOSS            1     0.7553      0.0819   (0.5948, 0.9159)              85.03       <.0001
Source_cat          1D       1     1.2628      0.0946   (1.0774, 1.4482)             178.20       <.0001
Source_cat          1G       1     0.9146      0.3292   (0.2694, 1.5598)               7.72       0.0055
Source_cat          1K       1     0.1431      0.1797   (-0.2091, 0.4954)              0.63       0.4258
Source_cat          1P       1    -0.2452      0.0460   (-0.3353, -0.1551)            28.46       <.0001
Source_cat          1T       1    -1.6016      1.0209   (-3.6026, 0.3994)              2.46       0.1167
Source_cat          2A       1    -0.8856      0.5817   (-2.0257, 0.2545)              2.32       0.1279
Source_cat          3A       1     0.5152      0.2911   (-0.0553, 1.0857)              3.13       0.0767
Source_cat          3B       1     0.9138      0.1878   (0.5458, 1.2818)              23.68       <.0001
ICU_DIRECT_ADMIT             1     0.6837      0.0500   (0.5856, 0.7817)             186.82       <.0001
Scale                        0     1.0000      0.0000   (1.0000, 1.0000)                  .            .
The GENMOD Procedure

Model Information
Data Set              MYDATA.VERIFICATION
Distribution          Binomial
Link Function         Logit
Dependent Variable    DISP_DIEDHOSP (DISPOSITION: IN-HOSPITAL MORTALITY)

Number of Observations Read    60000
Number of Observations Used    60000
Number of Events                1691
Number of Trials               60000

PROC GENMOD is modeling the probability that DISP_DIEDHOSP='1'.
Criteria For Assessing Goodness Of Fit
Criterion                       Value
Log Likelihood             -6127.9942
Full Log Likelihood        -6127.9942
AIC (smaller is better)    12397.9885
AICC (smaller is better)   12398.1591
BIC (smaller is better)    13037.1376
Analysis Of Maximum Likelihood Parameter Estimates
Parameter           Level   DF   Estimate   Std Error   Wald 95% Conf Limits    Wald Chi-Sq   Pr > ChiSq
Intercept                    1    -6.8314      0.3533   (-7.5238, -6.1390)           373.95       <.0001
age_cat             2        1     0.7511      0.3928   (-0.0188, 1.5209)              3.66       0.0559
age_cat             3        1     1.0474      0.3592   (0.3434, 1.7514)               8.50       0.0035
age_cat             4        1     1.3446      0.3492   (0.6602, 2.0290)              14.83       0.0001
age_cat             5        1     1.5080      0.3493   (0.8235, 2.1926)              18.64       <.0001
age_cat             6        1     1.6338      0.3538   (0.9403, 2.3272)              21.33       <.0001
age_cat             7        1     1.9215      0.3506   (1.2343, 2.6087)              30.03       <.0001
age_cat             8        1     2.1898      0.3493   (1.5052, 2.8745)              39.30       <.0001
age_cat             9        1     2.1492      0.3497   (1.4637, 2.8347)              37.76       <.0001
age_cat             10       1     2.5846      0.3502   (1.8982, 3.2710)              54.47       <.0001
SEX                 F        1    -0.2450      0.1999   (-0.6369, 0.1468)              1.50       0.2204
race_category       1        1     0.0885      0.3109   (-0.5207, 0.6978)              0.08       0.7758
race_category       2        1     0.2371      0.0751   (0.0900, 0.3842)               9.98       0.0016
race_category       3        1    -0.0366      0.0605   (-0.1551, 0.0820)              0.37       0.5456
SCPER_Cat           2        1    -0.1401      0.0653   (-0.2681, -0.0121)             4.60       0.0319
SCPER_Cat           3        1     0.0639      0.0888   (-0.1101, 0.2379)              0.52       0.4715
MDC                 0        1   -14.8793    11329.29   (-22219.9, 22190.12)           0.00       0.9990
MDC                 1        1     0.5801      0.1265   (0.3321, 0.8281)              21.02       <.0001
MDC                 2        1     0.4339      0.7256   (-0.9882, 1.8560)              0.36       0.5498
MDC                 3        1    -0.0052      0.2802   (-0.5544, 0.5440)              0.00       0.9853
MDC                 4        1     1.2882      0.0862   (1.1192, 1.4572)             223.16       <.0001
MDC                 6        1     0.5039      0.1093   (0.2896, 0.7182)              21.24       <.0001
MDC                 7        1     0.9533      0.1355   (0.6878, 1.2188)              49.51       <.0001
MDC                 8        1     0.4188      0.1683   (0.0890, 0.7487)               6.19       0.0128
MDC                 9        1     0.1566      0.2336   (-0.3013, 0.6145)              0.45       0.5026
MDC                 10       1    -0.4878      0.1938   (-0.8677, -0.1079)             6.33       0.0118
MDC                 11       1     0.4772      0.1184   (0.2451, 0.7093)              16.24       <.0001
MDC                 12       1    -0.3403      0.4266   (-1.1764, 0.4958)              0.64       0.4251
MDC                 13       1   -14.5034    4634.019   (-9097.01, 9068.007)           0.00       0.9975
MDC                 14       1   -12.9909    26440.74   (-51835.9, 51809.92)           0.00       0.9996
MDC                 16       1    -0.0381      0.2338   (-0.4963, 0.4201)              0.03       0.8706
MDC                 17       1     1.1179      0.1971   (0.7316, 1.5043)              32.16       <.0001
MDC                 18       1     1.8836      0.1274   (1.6339, 2.1334)             218.51       <.0001
MDC                 19       1     0.4765      0.3066   (-0.1244, 1.0774)              2.42       0.1202
MDC                 20       1    -0.0240      0.3206   (-0.6522, 0.6043)              0.01       0.9404
MDC                 21       1    -0.9947      0.5146   (-2.0033, 0.0139)              3.74       0.0532
MDC                 22       1   -15.4449    5781.123   (-11346.2, 11315.35)           0.00       0.9979
MDC                 23       1     0.4470      0.2484   (-0.0398, 0.9338)              3.24       0.0719
MDC                 24       1   -15.7453    8026.288   (-15747.0, 15715.49)           0.00       0.9984
MDC                 25       1     1.8943      0.3382   (1.2315, 2.5572)              31.38       <.0001
QUAN05_ARRHYTH               1     0.3446      0.0611   (0.2248, 0.4644)              31.77       <.0001
QUAN05_CHF                   1     0.4420      0.0666   (0.3114, 0.5725)              44.01       <.0001
QUAN05_COAGULAT              1     0.9579      0.0909   (0.7797, 1.1362)             110.99       <.0001
QUAN05_COPD                  1    -0.1435      0.0627   (-0.2664, -0.0205)             5.23       0.0222
QUAN05_DEMENTIA              1     0.3263      0.1351   (0.0615, 0.5912)               5.83       0.0157
QUAN05_DEPRESSION            1    -0.4706      0.1067   (-0.6797, -0.2614)            19.44       <.0001
QUAN05_DIAB_CM               1    -0.0269      0.1091   (-0.2407, 0.1869)              0.06       0.8052
QUAN05_FLUIDDIS              1     0.8728      0.0561   (0.7630, 0.9827)             242.34       <.0001
QUAN05_HYPER_CM              1    -0.2746      0.1119   (-0.4939, -0.0553)             6.02       0.0141
QUAN05_HYPER_NC              1    -0.4691      0.0580   (-0.5829, -0.3553)            65.32       <.0001
QUAN05_LIVER                 1     0.6980      0.1054   (0.4915, 0.9045)              43.89       <.0001
QUAN05_MALIGNANT             1     0.8289      0.1452   (0.5443, 1.1135)              32.59       <.0001
QUAN05_METASTIC              1     1.0994      0.0941   (0.9149, 1.2838)             136.44       <.0001
QUAN05_MI                    1     0.7861      0.0852   (0.6191, 0.9530)              85.20       <.0001
QUAN05_NEUROL                1     0.7515      0.0889   (0.5772, 0.9258)              71.39       <.0001
QUAN05_NOMETAST              1    -0.2692      0.1566   (-0.5761, 0.0377)              2.96       0.0856
QUAN05_PARALY                1     0.8250      0.1445   (0.5417, 1.1083)              32.57       <.0001
QUAN05_PEPTICULCER           1     0.0925      0.1810   (-0.2623, 0.4473)              0.26       0.6094
QUAN05_PULMCIRC              1     0.5530      0.1181   (0.3216, 0.7845)              21.93       <.0001
QUAN05_RENALDIS              1     0.3938      0.1007   (0.1964, 0.5912)              15.29       <.0001
QUAN05_SEVLIVER              1     0.9330      0.1505   (0.6381, 1.2279)              38.45       <.0001
QUAN05_WEIGHTLOSS            1     0.7331      0.0982   (0.5405, 0.9256)              55.68       <.0001
Source_cat          1D       1     0.6505      0.1236   (0.4083, 0.8928)              27.70       <.0001
Source_cat          1G       1     1.0222      0.3633   (0.3101, 1.7343)               7.92       0.0049
Source_cat          1K       1     0.1565      0.2098   (-0.2548, 0.5677)              0.56       0.4559
Source_cat          1P       1    -0.2980      0.0562   (-0.4082, -0.1878)            28.09       <.0001
Source_cat          1T       1   -16.0848    1827.756   (-3598.42, 3566.251)           0.00       0.9930
Source_cat          2A       1    -0.7725      0.6927   (-2.1302, 0.5853)              1.24       0.2648
Source_cat          3A       1     0.5727      0.3261   (-0.0664, 1.2118)              3.08       0.0790
Source_cat          3B       1     0.3937      0.2605   (-0.1169, 0.9044)              2.28       0.1307
ICU_DIRECT_ADMIT             1     0.9608      0.0579   (0.8472, 1.0743)             274.99       <.0001
Scale                        0     1.0000      0.0000   (1.0000, 1.0000)                  .            .

NOTE: The scale parameter was held fixed.
The GENMOD Procedure

Model Information
Data Set              MYDATA.VERIFICATION
Distribution          Binomial
Link Function         Logit
Dependent Variable    allcause_readmit_flag

Number of Observations Read    60000
Number of Observations Used    57732
Number of Events                8904
Number of Trials               57732
Missing Values                  2268

PROC GENMOD is modeling the probability that allcause_readmit_flag='1'.
Criteria For Assessing Goodness Of Fit
Criterion                        Value
Log Likelihood             -24108.8967
Full Log Likelihood        -24108.8967
AIC (smaller is better)     48369.7934
AICC (smaller is better)    48369.9964
BIC (smaller is better)     49051.0244
Analysis Of Maximum Likelihood Parameter Estimates
Parameter           Level   DF   Estimate   Std Error   Wald 95% Conf Limits    Wald Chi-Sq   Pr > ChiSq
Intercept                    1    -2.2763      0.0709   (-2.4153, -2.1373)          1030.62       <.0001
age_cat             2        1     0.0151      0.0847   (-0.1508, 0.1811)              0.03       0.8581
age_cat             3        1     0.1057      0.0742   (-0.0397, 0.2511)              2.03       0.1542
age_cat             4        1     0.1041      0.0693   (-0.0316, 0.2399)              2.26       0.1328
age_cat             5        1     0.1144      0.0701   (-0.0230, 0.2518)              2.66       0.1028
age_cat             6        1     0.1127      0.0745   (-0.0333, 0.2586)              2.29       0.1303
age_cat             7        1     0.1803      0.0734   (0.0364, 0.3241)               6.03       0.0141
age_cat             8        1     0.1409      0.0737   (-0.0035, 0.2853)              3.66       0.0559
age_cat             9        1     0.1580      0.0736   (0.0137, 0.3022)               4.60       0.0319
age_cat             10       1     0.0732      0.0785   (-0.0806, 0.2270)              0.87       0.3508
race_category       1        1    -0.0753      0.1418   (-0.3532, 0.2027)              0.28       0.5956
race_category       2        1    -0.0332      0.0341   (-0.1000, 0.0336)              0.95       0.3299
race_category       3        1    -0.0955      0.0269   (-0.1481, -0.0428)            12.63       0.0004
SCPER_Cat           2        1    -0.0443      0.0280   (-0.0991, 0.0105)              2.51       0.1129
SCPER_Cat           3        1     0.0301      0.0405   (-0.0493, 0.1096)              0.55       0.4573
MDC                 0        1     0.8129      1.1643   (-1.4691, 3.0949)              0.49       0.4851
MDC                 1        1    -0.0632      0.0548   (-0.1706, 0.0442)              1.33       0.2488
MDC                 2        1    -0.4100      0.2814   (-0.9616, 0.1416)              2.12       0.1452
MDC                 3        1    -0.0679      0.1013   (-0.2664, 0.1306)              0.45       0.5026
MDC                 4        1     0.1347      0.0401   (0.0561, 0.2133)              11.28       0.0008
MDC                 6        1     0.1838      0.0440   (0.0976, 0.2700)              17.47       <.0001
MDC                 7        1     0.5372      0.0612   (0.4173, 0.6572)              77.04       <.0001
MDC                 8        1     0.0948      0.0644   (-0.0315, 0.2210)              2.16       0.1413
MDC                 9        1    -0.0010      0.0707   (-0.1397, 0.1377)              0.00       0.9888
MDC                 10       1     0.0483      0.0618   (-0.0729, 0.1694)              0.61       0.4349
MDC                 11       1     0.0895      0.0526   (-0.0136, 0.1926)              2.90       0.0888
MDC                 12       1    -0.1340      0.1459   (-0.4200, 0.1520)              0.84       0.3584
MDC                 13       1    -1.1449      1.0214   (-3.1467, 0.8570)              1.26       0.2623
MDC                 14       1   -17.1240    16037.12   (-31449.3, 31415.06)           0.00       0.9991
MDC                 16       1     0.3399      0.0781   (0.1869, 0.4930)              18.95       <.0001
MDC                 17       1     1.0246      0.0899   (0.8484, 1.2007)             129.93       <.0001
MDC                 18       1     0.1579      0.0986   (-0.0354, 0.3511)              2.56       0.1094
MDC                 19       1    -0.1107      0.1158   (-0.3375, 0.1162)              0.91       0.3391
MDC                 20       1     0.0253      0.0867   (-0.1447, 0.1953)              0.09       0.7706
MDC                 21       1    -0.1533      0.1166   (-0.3818, 0.0752)              1.73       0.1884
MDC                 22       1   -17.4183    3647.222   (-7165.84, 7131.005)           0.00       0.9962
MDC                 23       1     0.1557      0.0948   (-0.0302, 0.3415)              2.70       0.1006
MDC                 24       1     1.1641      0.6955   (-0.1990, 2.5273)              2.80       0.0942
MDC                 25       1     0.5650      0.2147   (0.1443, 0.9857)               6.93       0.0085
QUAN05_AIDS                  1     0.2513      0.1434   (-0.0297, 0.5324)              3.07       0.0797
QUAN05_ARRHYTH               1     0.0898      0.0303   (0.0305, 0.1492)               8.81       0.0030
QUAN05_ARTHRIT               1     0.2208      0.0880   (0.0484, 0.3932)               6.30       0.0121
QUAN05_CHF                   1     0.3378      0.0319   (0.2752, 0.4004)             111.81       <.0001
QUAN05_COAGULAT              1     0.0383      0.0662   (-0.0914, 0.1680)              0.33       0.5631
QUAN05_COPD                  1     0.1166      0.0283   (0.0612, 0.1720)              17.01       <.0001
QUAN05_DEFICANEM             1     0.1432      0.0526   (0.0401, 0.2463)               7.41       0.0065
QUAN05_DIAB_CM               1     0.2370      0.0445   (0.1498, 0.3242)              28.36       <.0001
QUAN05_DIAB_NC               1     0.0213      0.0271   (-0.0319, 0.0745)              0.62       0.4327
QUAN05_FLUIDDIS              1     0.1736      0.0307   (0.1134, 0.2338)              31.95       <.0001
QUAN05_LIVER                 1     0.2036      0.0509   (0.1038, 0.3033)              16.00       <.0001
QUAN05_MALIGNANT             1     0.4719      0.0374   (0.3986, 0.5452)             159.30       <.0001
QUAN05_METASTIC              1     0.3036      0.0572   (0.1915, 0.4156)              28.17       <.0001
QUAN05_MI                    1     0.1604      0.0459   (0.0704, 0.2504)              12.21       0.0005
QUAN05_OBESITY               1    -0.0758      0.0556   (-0.1847, 0.0332)              1.86       0.1729
QUAN05_PVD                   1     0.2026      0.0432   (0.1178, 0.2873)              21.95       <.0001
QUAN05_RENALDIS              1     0.2547      0.0333   (0.1893, 0.3200)              58.37       <.0001
QUAN05_SEVLIVER              1     0.2378      0.0926   (0.0563, 0.4192)               6.60       0.0102
QUAN05_WEIGHTLOSS            1     0.1562      0.0631   (0.0325, 0.2799)               6.13       0.0133
Source_cat          1D       1     0.4320      0.0991   (0.2377, 0.6262)              19.00       <.0001
Source_cat          1G       1    -0.0608      0.3023   (-0.6532, 0.5316)              0.04       0.8406
Source_cat          1K       1    -0.1056      0.1180   (-0.3368, 0.1256)              0.80       0.3707
Source_cat          1P       1     0.0350      0.0241   (-0.0123, 0.0822)              2.11       0.1466
Source_cat          1T       1    -0.2121      0.2410   (-0.6844, 0.2603)              0.77       0.3789
Source_cat          2A       1    -0.5704      0.2132   (-0.9881, -0.1526)             7.16       0.0075
Source_cat          3A       1     0.1394      0.1906   (-0.2342, 0.5130)              0.54       0.4645
Source_cat          3B       1    -0.0115      0.1549   (-0.3151, 0.2921)              0.01       0.9408
Disto_cat           -3       1     0.4225      0.0731   (0.2793, 0.5657)              33.42       <.0001
Disto_cat           -2       0     0.0000      0.0000   (0.0000, 0.0000)                  .            .
Disto_cat           0        1     0.7733      0.0809   (0.6148, 0.9317)              91.47       <.0001
Disto_cat           3        1     1.2723      0.9261   (-0.5427, 3.0874)              1.89       0.1695
Disto_cat           4        0     0.0000      0.0000   (0.0000, 0.0000)                  .            .
Disto_cat           5        1     0.0350      0.0614   (-0.0853, 0.1554)              0.33       0.5682
Disto_cat           7        1     0.1619      0.1012   (-0.0365, 0.3602)              2.56       0.1097
Disto_cat           11       1    -0.2720      0.1689   (-0.6031, 0.0592)              2.59       0.1074
Disto_cat           17       1     0.5254      0.6556   (-0.7596, 1.8104)              0.64       0.4229
Disto_cat           22       1     0.1129      0.2402   (-0.3579, 0.5836)              0.22       0.6384
Disto_cat           25       1     1.0131      0.8705   (-0.6931, 2.7193)              1.35       0.2445
Disto_cat           30       1    -1.7497      1.0414   (-3.7909, 0.2914)              2.82       0.0929
Scale                        0     1.0000      0.0000   (1.0000, 1.0000)                  .            .
APPENDIX C – FACILITY PERFORMANCE BY SIZE AND REGION

Each cell is the number of facilities assigned to a performance category (rows) for a given outcome (columns). Columns: LOS = length of stay; ND = discharge before noon; RA = 30-day readmission; M30 = 30-day mortality; MIH = in-hospital mortality. In the original layout the rows are grouped as No Change (A, A.1, A.2), Improve (B.1-B.3), and No Sustained Benefit (C.1-C.3); D.1 and D.2 carry no group label on these pages.

Facility performance by size

Large (N = 16)       LOS   ND   RA   M30   MIH
A                      7    4    5     2     4
A.1                    0    0    3     0     0
A.2                    2    2    0     0     0
B.1                    1    0    0     0     0
B.2                    0    4    0     0     0
B.3                    2    1    2     0     1
C.1                    0    1    1     0     1
C.2                    1    0    0     1     0
C.3                    1    1    0     3     0
D.1                    1    2    4     3     4
D.2                    1    1    1     7     6

Medium (N = 60)      LOS   ND   RA   M30   MIH
A                     11   10   34    17    11
A.1                    1    2    2     3     1
A.2                    1    3    1     0     0
B.1                    0    3    2     0     1
B.2                    0    9    0     2     0
B.3                    6   10    3     3     5
C.1                   12    7    3     1     5
C.2                    3    0    0     3     3
C.3                    1    0    1     2     1
D.1                   19    9    8    10    16
D.2                    6    7    6    19    17

Small (N = 54)       LOS   ND   RA   M30   MIH
A                     12   11   39    13    13
A.1                    1    1    0     0     1
A.2                    1    3    1     1     1
B.1                    3    0    1     0     3
B.2                    0    4    0     0     0
B.3                    6   10    1     1     5
C.1                    4    5    0     3     1
C.2                    4    3    0     3     3
C.3                    1    2    2     3     3
D.1                   16   12    7    14     8
D.2                    6    3    4    16    16

Facility performance by region

Northeast (N = 23)   LOS   ND   RA   M30   MIH
A                      6    7   12     4     7
A.1                    0    0    2     0     1
A.2                    0    1    0     0     0
B.1                    1    0    1     0     3
B.2                    0    1    0     0     0
B.3                    2    3    1     0     0
C.1                    4    4    0     1     1
C.2                    1    0    1     2     0
C.3                    1    0    0     3     2
D.1                    7    5    2     6     5
D.2                    1    2    4     7     4

Southeast (N = 26)   LOS   ND   RA   M30   MIH
A                      7    4   12     4     3
A.1                    1    1    0     1     1
A.2                    0    1    1     0     0
B.1                    0    2    1     0     0
B.2                    0    5    0     1     0
B.3                    4    2    2     1     1
C.1                    7    3    1     0     2
C.2                    1    1    0     2     3
C.3                    0    1    0     1     1
D.1                    4    4    7     9     7
D.2                    2    2    2     7     8

Central (N = 25)     LOS   ND   RA   M30   MIH
A                      2    3   14     9     6
A.1                    1    0    0     1     0
A.2                    1    1    1     0     1
B.1                    0    0    0     0     0
B.2                    0    6    0     0     0
B.3                    2    3    1     2     4
C.1                    4    1    2     0     2
C.2                    2    1    0     1     1
C.3                    1    1    1     1     0
D.1                    8    7    4     1     3
D.2                    4    2    2    10     8

Midwest (N = 29)     LOS   ND   RA   M30   MIH
A                      8    5   22     6     7
A.1                    0    2    3     1     0
A.2                    1    2    0     1     0
B.1                    1    0    1     0     1
B.2                    0    4    0     1     0
B.3                    4    6    1     0     4
C.1                    0    4    0     2     0
C.2                    2    1    0     1     2
C.3                    0    0    0     0     0
D.1                    9    2    1     7     6
D.2                    4    3    1    10     9

West (N = 27)        LOS   ND   RA   M30   MIH
A                      7    6   18     9     5
A.1                    0    0    0     0     0
A.2                    2    3    0     0     0
B.1                    2    1    0     0     0
B.2                    0    1    0     0     0
B.3                    2    7    1     1     2
C.1                    1    1    1     1     2
C.2                    2    0    0     1     0
C.3                    1    1    0     3     1
D.1                    8    5    5     4     7
D.2                    2    2    2     8    10
APPENDIX D – FULL VARIABLE LISTS
• Wards
o Telemetry
o Step Down
o Respiratory
o Medicine-ECG
o Medicine
o Surgery-ECG
o Surgery
o Combined Medicine & Surgery
• Sufficient Staff
o Registered nurses (Clinical)
o Clinical nurse specialists (Clinical)
o Radiology technologists (Support)
o Laboratory technologists (Support)
o Clinical pharmacists (Clinical)
o IRM or CPRS technical support staff (Support)
o Clinical Applications Coordinators (CACs) (Support)
• Barriers to Improvement
o Insufficient numbers of specialists in target acute care conditions
o Insufficient numbers of skilled inpatient nurses
o Insufficient numbers of administrative and support staff
• Inpatient Resources
o Clinical resources (number of beds) (Space)
o Administrative space (Space)
o Computers or workstations on the units (Technology)
o CPRS training time for basic functions (Technology)
o CPRS training time for advanced functions (Technology)
o CPRS training time for non-clinical staff (Technology)
o General (non-CPRS) (Technology)
o Access to medical informatics expertise (Technology)
o Availability of QI / performance measurement-related training
• Communication & Cooperation
o Effective communication between physicians and senior admin
o Effective communication between physicians and nurses
o Cooperation between departments
• Performance Monitoring
o Hospital admission rates
o Bed days of care (# of hospital days / 1000 uniques)
o Hospital readmission rates
o Hospital mortality rate
o Number of emergency room visits
o Subspecialty consult turnaround time
• Monitoring Level
o 1 = Facility Level only
o 2 = Clinic Level only
o 3 = Provider only
o 4 = Facility & Clinic
o 5 = Facility & Provider
o 6 = Clinic & Provider
o 7 = All 3 Levels
• Utilization Review (review for appropriateness)
o Acute care admissions
o Non-VA care admission paid by your VA
o Concurrent inpatient stays
• Clinical order sets
o Community acquired pneumonia
o Congestive heart failure exacerbations
o Gastrointestinal bleeds
o Diabetic ketoacidosis
o Gastrointestinal bleed prophylaxis
o Deep venous thrombosis prophylaxis
o Pain Management
o Heparin dosing
• ICU Evidence Bundles
o Myocardial Infarction
o Ventilator Associated Pneumonia
o Glycemic Control
o Weight Based Heparin
o Sedation
o Ventilator Weaning
o Severe Alcohol Withdrawal
o GI Prophylaxis
o Severe Sepsis
o Catheter Related Blood Stream Infection (CRBSI)
o Other
• Clinical Practice Guideline Adherence
o Disease
Acute myocardial infarction
Congestive heart failure
Community-acquired pneumonia
o Method
Computerized reminders
Specialized CPRS templates
Performance profiling and feedback to providers
Incentives
Designated local clinical champion
Delegated RN for disease-specific management
Provider Education
• QI Information
o VA central office directives
o VISN-level leadership or work groups focused
o Local healthcare system or medical center QI department
o National or regional teleconference
o VA or non-VA web-based resources
o VA newsletters or other literature
o Local VA or non-VA conferences or seminars
• Driving Force
o Overseeing task forces or work groups focused on specific VA
performance measures
o Arranging educational activities related to performance
improvement
o Arranging provider education regarding clinical practice guidelines
o Arranging staff education in QI methods
o Providing statistical analysis on VA facility performance
o Providing technical consultation and support of template development
• Clinical Reminders
o Informal discussions between providers and clinical application
coordinators (Development)
o Requests to provider experts for clinical opinion (Development)
o Formal input from relevant clinical departments (Development)
o Committees for review of the research evidence (Development)
o Test piloting reminders prior to full scale implementation
(Development)
o Post-implementation assessment of provider satisfaction (Post)
o Formal evaluation of reminder usability (human factors)
(Development)
o Analysis of reminder impact on performance improvement (Post)
• Performance Improvement
o Established teams to work broadly on VA performance measures
(Establish)
o Established teams to work on specific disease / conditions
(Establish)
o Established teams to work on specific VA performance measures
(Establish)
o Implemented a program or activities focused on enhancing a
cooperative culture
o Reallocated financial resources to focus on improving a specific
performance measure (Shift)
o Shifted staff from one part of the facility to another to improve
performance at a specific department or clinic (Shift)
o Actively partnered high- and low-performing clinics to improve one
or more performance measures
o Designated a site champion for specific clinical guidelines or
performance measures
o Monitored the pace at which guidelines were implemented
o Provided visible support for clinical guideline implementation
o Fostered collaboration among facilities in guideline implementation
• Guideline Implementation
o Teamwork exists to implement guidelines
o Key implementation steps planned (Implementation)
o Implementation steps monitored (Implementation)
o Resistance from physicians (Resistance)
o Resistance from nurses (Resistance)
o Resistance from other providers (Resistance)
• Clinical Champions
o Time constraints
o Lack of interest in the topic
o Trust and respect
o Protected time
o Maintain through the duration of a project
o Replace a departing champion
• Facility Environment
o Foster flexibility
o Emphasize participative decision-making
o Sufficient financial support
o Sufficient personnel support
• Performance Awards
o Monetary incentives
o Ceremonial awards
o Perks (e.g., parking, additional annual leave)
o Other
REFERENCES
1.
Kohn LT, Corrigan JM, Donaldson MS, eds. To Err Is Human. Washington
DC: National Academy Press; 1999.
2.
AHRQ. National Healthcare Quality Report. 2008;
http://www.ahrq.gov/qual/nhqr08/nhqr08.pdf. Accessed 23 Nov 2009,
2009.
3.
AHRQ. National Healthcare Quality Report. 2009;
http://www.ahrq.gov/qual/nhqr09/nhqr09.pdf. Accessed 19 Jul 2010, 2010.
4.
Levinson DR. Adverse Events in Hospitals: National Incidence among
Medicare Beneficiaries. In: Department of Health and Human Services, ed: Office of Inspector
General; 2010.
5.
Wickens CD, Hollands JG. Engineering Psychology and Human
Performance. 3rd ed. Upper Saddle River, NJ: Prentice-Hall; 2000.
6.
Kotter JP, Cohen DS. The Heart of Change: Real-Life Stories of How
People Change Their Organizations. Boston, MA: Harvard Business
Press; 2002.
7.
Vest JR, Gamm LD. A Critical Review of the Research Literature on Six
Sigma, Lean and Studer Group's Hardwiring Excellence in the United
States: The Need to Demonstrate and Communicate the Effectiveness of
Transformation Strategies in Healthcare. Implementation Science. Jul.
2009;4(35).
8.
DelliFraine JL, Langabeer JR, Nembhard IM. Assessing the Evidence of
Six Sigma and Lean in the Healthcare Industry. Quality management in
health care. Jul.-Sep. 2010;19(3):211-225.
9.
Weir CR, Staggers N, Phansalkar S. The State of the Evidence for
Computerized Provider Order Entry: A Systematic Review and Analysis of
the Quality of the Literature. International Journal of Medical Informatics.
Jun. 2009;78(6):365-374.
10.
Glasgow JM, Scott-Caziewell JR, Kaboli PJ. Guiding Inpatient Quality
Improvement: A Systematic Review of Lean and Six Sigma. Joint
Commission journal on quality and patient safety. 2010;36(12):533-540.
11.
Hansen BG. Reducing Nosocomial Urinary Tract Infections through
Process Improvement. Journal for Healthcare Quality. Mar.-Apr.
2006;28(2):W2-2 - W2-9.
12.
Frankel HL, Crede WB, Topal JE, Roumanis SA, Devlin MW, Foley AB.
Use of Corporate Six Sigma Performance-Improvement Strategies to
Reduce Incidence of Catheter-Related Bloodstream Infections in a
Surgical Icu. Journal of the American College of Surgeons. Sep.
2005;201(3):349-358.
13.
Kussman MJ, Vandenberg P, Almenoff P. Survey of ICUs & Acute Inpatient
Medical & Surgical Care in VHA. 2007;
http://vaww.va.gov/haig/ICU/2007ICUAcuteInptMedSurgReport.pdf.
14.
Yano EM. VHA Practice System Assessment Survey. 2007;
http://www.hsrd.research.va.gov/research/abstracts.cfm?Project_ID=2141
695109.
15.
IHI. The Breakthrough Series: IHI's Collaborative Model for Achieving
Breakthrough Improvement. Boston: Institute for Healthcare
Improvement; 2003.
16.
Asch SM, Baker DW, Keesey JW, et al. Does the Collaborative Model
Improve Care for Chronic Heart Failure? Medical Care. 2005;43(7):667-675.
17.
Bradley EH, Nembhard IM, Yuan CT, et al. What Is the Experience of
National Quality Campaigns? Views from the Field. Health Services
Research. 2010;45(6):1651-1669.
18.
Neily J, Howard K, Quigley P, Mills PD. One-Year Follow-up after a
Collaborative Breakthrough Series on Reducing Falls and Fall-Related
Injuries. Joint Commission journal on quality and patient safety.
2005;31(5):275-285.
19.
Leape LL, Rogers G, Hanna D, et al. Developing and Implementing New
Safe Practices: Voluntary Adoption through Statewide Collaboratives.
Quality & Safety in Health Care. 2006;15:289-295.
20.
Strating SMH, Nieboer AP, Zuiderent-Jerak T, Bal RA. Creating Effective
Quality-Improvement Collaboratives: A Multiple Case Study. BMJ Quality &
Safety. 2011;20(4):344-350.
21.
Toncich G, Cameron P, Virtue E, Bartlett J, Ugoni A. Institute for Health
Care Improvement Collaborative Trial to Improve Process Times in an
Australian Emergency Department. J. Qual. Clin. Practice. 2000;2000:79-86.
22.
Brandrud AS, Schreiner A, Hjortdahl P, Helljesen GS, Nyen B, Nelson EC.
Three Success Factors for Continual Improvement in Healthcare: An
Analysis of the Reports of Improvement Team Members. BMJ Quality &
Safety. 2011.
23.
Franco LM, Marquez L. Effectiveness of Collaborative Improvement:
Evidence from 27 Applications in 12 Less-Developed and Middle-Income
Countries. BMJ Quality & Safety. 2011.
24.
Armstrong B, Levesque O, Perlin JB, Rick C, Schectman G. Reinventing
Veterans Health Administration: Focus on Primary Care. Journal of
Healthcare Management. 2005;50(6):399-408.
25.
Mills PD, Weeks WB. Characteristics of Successful Quality Improvement
Teams: Lessons from Five Collaborative Projects in the VHA. Joint
Commission journal on quality and patient safety. 2004;30(3):152-162.
26.
Mills PD, Weeks WB, Surott-Kimberly BC. A Multihospital Safety
Improvement Effort and the Dissemination of New Knowledge. Joint
Commission journal on quality and safety. 2003;29(3):124-133.
27.
Jackson GL, Powell AA, Ordin DL, et al. Developing and Sustaining
Quality Improvement Partnerships in the VA: The Colorectal Cancer Care
Collaborative. Journal of general internal medicine. 2010;S1:38-43.
28.
Davies M. Fix Patient Flow Handbook. 2006;
http://srd.vssc.med.va.gov/C10/InpatientFlow/default.aspx. Accessed 10
Aug 2007, 2007.
29.
Haraden C, Resar R. Patient Flow in Hospitals: Understanding and
Controlling It Better. Frontiers of Health Services Management.
2004;20(4):3-15.
30.
Davies M. VHA System Redesign. 2007;
http://srd.vssc.med.va.gov/C10/InpatientFlow/default.aspx. Accessed 10
Aug 2007.
31.
Davies M. VHA FIX: Flow Inpatient Improvement Initiative. VHA Systems
Redesign Newsletter. 2008;2(1):7.
32.
Duncan D. Fix Collaborative (Inpatient Flow Improvement Initiative)
Evaluation Impact Study: Veterans Affairs; 2010:90.
33.
Rosenthal GE, Kaboli PJ, Barnett MJ. Differences in Length of Stay in
Veterans Health Administration and Other United States Hospitals: Is the
Gap Closing? Medical Care. Aug 2003;41(8):882-894.
34.
Kaboli PJ, Go JT, Hockenberry J, Glasgow JM, Rosenthal GE, Vaughan-Sarrazin M. Associations between Reduced Hospital Length of Stay and
30-Day Readmission Rate: 14-Year Experience in 129 Veterans
Administration Hospitals. Annals of Internal Medicine. 2011;Under Review.
35.
Hser Y-I, Shen H, Chou C-P, Messer SC, Anglin MD. Analytic Approaches
for Assessing Long-Term Treatment Effects. Evaluation Review.
2001;25(2):233-262.
36.
Biglan A, Ary D, Wagenaar AC. The Value of Interrupted Time-Series
Experiments for Community Intervention Research. Prevention Science.
2000;1(1):31-49.
37.
Mitchell JB, Bubolz T, Paul JE, Pashos CL, Escarce JJ, Muhlbaier LH.
Using Medicare Claims for Outcomes Research. Medical Care. 1994;32(7
Suppl):JS38-51.
38.
Ashton CM, Petersen NJ, Souchek J, et al. Geographic Variations in
Utilization Rates in Veterans Affairs Hospitals and Clinics. New England
Journal of Medicine. Jan 7 1999;340(1):32-39.
39.
National Committee on Vital and Health Statistics. Washington, DC; 1980.
40.
Kashner TM. Agreement between Administrative Files and Written Medical
Records: A Case of the Department of Veterans Affairs. Medical Care. Sep
1998;36(9):1324-1336.
41.
VA Information Resource Center (VIReC). BIRLS Death File. 2006;
http://www.virec.research.va.gov/DataSourcesName/BIRLS/BIRLS.
Accessed 12 June 2006, 2006.
42.
Render ML, Kim HM, Welsh DE, et al. Automated Intensive Care Unit Risk
Adjustment: Results from a National Veterans Affairs Study. Critical Care
Medicine. Jun 2003;31(6):1638-1646.
43.
Manning WG, Mullahy J. Estimating Log Models: To Transform or Not to
Transform? J Health Econ. 2001;20:461-494.
44.
Elixhauser A, Steiner C, Harris D, Coffey R. Comorbidity Measures for Use
with Administrative Data. Med Care. 1998;36:8-27.
45.
Quan H, Sundararajan V, Halfon P, et al. Coding Algorithms for Defining
Comorbidities in ICD-9-CM and ICD-10 Administrative Data. Med Care.
2005;43(11):1130-1139.
46.
VIReC Research Users Guide: FY 2006 VHA Medical SAS Inpatient
Datasets. 2007; http://www.virec.research.va.gov/References/RUG/RUGInpatient06.pdf. Accessed 26 Apr 2011, 2011.
47.
McLeod AI. Javascript for Online Power Computation in Intervention
Analysis. 2007; http://www.stats.uwo.ca/faculty/aim/2007/OnlinePower/.
Accessed 14 Oct 2009.
48.
McLeod AI, Vingilis ER. Power Computations in Time Series Analyses for
Traffic Safety. Accident Analysis and Prevention. 2008;40(3):1244-1248.
49.
SAS/STAT 9.2 User's Guide, 2nd Edition. Cary, NC: SAS Institute, Inc; 2009.
50.
Mitchell PH, Shortell SM. Adverse Outcomes and Variations in
Organization of Care Delivery. Medical Care. 1997;35(11):NS19-NS32.
51.
Shortell SM, Zimmerman JE, Rousseau DM, et al. The Performance of
Intensive Care Units: Does Good Management Make a Difference?
Medical Care. 1994;32(5):508-525.
52.
Hoff T, Jameson L, Hannan E, Flink E. A Review of the Literature
Examining Linkages between Organizational Factors, Medical Errors, and
Patient Safety. Medical Care Research and Review. 2004;61(1):3-37.
53.
Donabedian A. An Introduction to Quality Assurance in Health Care. New
York City: Oxford University Press; 2003.
54.
Hearld LR, Alexander JA, Fraser I, Jiang HJ. How Do Hospital
Organizational Structure and Processes Affect Quality of Care? A Critical
Review of Research Methods. Medical Care Research and Review.
2008;65(3):259-299.
55.
Walston SL, Burns LR, Kimberly JR. Does Reengineering Really Work?
An Examination of the Context and Outcomes of Hospital Reengineering
Initiatives. Health Services Research. Feb. 2000;34(6):1363-1388.
56.
Tucker AL, Nembhard IM, Edmondson AC. Implementing New Practices:
An Empirical Study of Organizational Learning in Hospital Intensive Care
Units. Management Science. 2007;53(6):894-907.
57.
Kaplan HC, Brady PW, Dritz MC, et al. The Influence of Context on Quality
Improvement Success in Health Care: A Systematic Review of the
Literature. The Milbank Quarterly. 2010;88(4):500-559.
58.
Pawson R, Tilley N. Realistic Evaluation. London: Sage; 1997.
59.
Stevens DP. SQUIRE: Standards for Quality Improvement Reporting
Excellence. Quality & safety in health care. 2008;17(S1):1-32.
60.
VA. National Center for Veterans Analysis and Statistics. 2011;
http://www.va.gov/vetdata/. Accessed June 5 2011.
61.
Kizer KW, Dudley RA. Extreme Makeover: Transformation of the Veterans
Health Care System. Annual Reviews of Public Health. 2009;30:313-339.
62.
Kolodner RM, ed. Computerizing Large Integrated Health Networks: The
VA Success. New York: Springer; 1997.
63.
Doescher M, Skillman S. Rural-Urban Commuting Area Codes (RUCAs).
http://depts.washington.edu/uwruca/index.php. Accessed June 7, 2011.
64.
Glasgow JM, Kaboli PJ. VAMC Facility Rurality: Comparison of Three
Classification Approaches. Washington D.C.: Department of Veterans
Affairs; 2010.
65.
Wachter RM, Flanders S. The Hospitalist Movement and the Future of
Academic General Internal Medicine. Journal of General Internal
Medicine. 1998;13(11):783-785.
66.
Wachter RM, Goldman L. The Hospitalist Movement 5 Years Later. JAMA.
2002;287(4):487-494.
67.
Meltzer DO, Arora V, Zhang JX, et al. Effects of Inpatient Experience on
Outcomes and Costs in a Multicenter Trial of Academic Hospitalists.
Journal of General Internal Medicine. April 2005;20(Supplement 1):141-142.
68.
Chen H, Fuller SS, Friedman C, Hersh W, eds. Medical Informatics:
Knowledge Management and Data Mining in Biomedicine. New York:
Springer; 2005.
69.
Witten IH, Frank E. Data Mining: Practical Machine Learning Tools and
Techniques. 2nd ed. San Francisco: Morgan Kaufmann; 2005.
70.
Quinlan JR. C4.5: Programs for Machine Learning. Los Altos, CA: Morgan
Kaufmann; 1993.
71.
Cios KJ, Pedrycz W, Swiniarski RW, Kurgan LA. Data Mining: A
Knowledge Discovery Approach. New York City: Springer; 2007.
72.
Adeyemo AB, Oriola O. Personnel Audit Using a Forensic Mining
Technique. International Journal of Computer Science Issues.
2010;7(6):222-231.
73.
McClish DK. Analyzing a Portion of the ROC Curve. Medical Decision
Making. 1989;9(3):190-195.
74.
Shen M. Computerized Physician Order Entry Decreases Hospital Wide
Mortality. The Hospitalist. 2011;15(3):10.
75.
Nelson EC, Batalden PB, Godfrey MM. Quality by Design: A Clinical
Microsystems Approach. San Francisco: Jossey-Bass; 2007.
76.
Langley GJ, Nolan KM, Nolan TW, Norman CL, Provost LP. The
Improvement Guide: A Practical Approach to Enhancing Organizational
Performance. San Francisco: Jossey-Bass Publishers; 1996.