Written Responses from RRT members

Protecting, maintaining and improving the health of all Minnesotans
Date: May 29, 2012
To: Provider Peer Grouping (PPG) Rapid Response Team (RRT) Members
From: Stefan Gildemeister
Director, Health Economics Program
Subj.: Quality Composite Measure Design, revised first hospital report
Thank you for participating in the Rapid Response Team (RRT). In preparation for our
meeting this afternoon on May 29 (1:00-2:00 p.m.), I wanted to distribute the attached
memo from Mathematica Policy Research. The memo summarizes changes to the
scoring and compositing methodologies that we developed and implemented after
additional analysis and in response to hospital stakeholder comments, particularly with
respect to concerns about the original relative scoring methodology. We would like to
discuss and receive your comments on the following changes:
• Using absolute thresholds rather than relative thresholds to assign points to each measure;
• Stricter requirements on the number of measures per subdomain that are needed to receive a score; and
• The combination of three former outcome subdomains (readmission, mortality, and inpatient complications) into one combined outcome subdomain.
We will review the memo during our meeting to ensure you have an opportunity to clarify
your understanding of the issues and to ask questions.
Response deadline: We will need your feedback on these issues by June 5 at 4:00
p.m. Comments may be provided via email to [email protected]. We
will reconvene the RRT for a conference call on June 7 (8:00-9:30 a.m.) to discuss your
comments.
Minnesota Council of Health Plans: Sue Knudson_______________________________
MDH Rapid Response Team
Peer Grouping Methodology
Quality Composite Measure Design – Round 2
Thank you for the opportunity to review and provide input on the revised hospital
total care quality scoring methodology. We appreciate the Department's effort to
improve the methodology and its attention to producing sound results.
In general, we support the revised approach of scoring hospitals on their absolute
performance at the measure level, as it is an improvement over the previous relative
scoring method. Still, several further improvements are needed.
• Rate Cutoffs for Absolute Point Assignments:
Thresholds should be made measure-specific. Using one threshold across all
measures produces different effective weights by measure, when weighting should
be an intentional decision. While a single threshold puts every provider and every
measure on the same 'ruler,' it inadvertently devalues some measures' weight within
the cluster: where average performance on a measure is comparatively lower, that
measure is worth fewer attainable points within the cluster. The net effect is
disproportionate weight on topped-out measures, as the sketch following this item
illustrates. Measure-specific absolute thresholds or normalizing techniques should
be tested and implemented to address this issue.
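To illustrate the point (with invented rates; this is only a rough sketch, not the PPG algorithm), compare one absolute cutoff scale applied to every measure against cutoffs anchored to each measure's own range:

```python
# Rough illustration only (hypothetical rates, not PPG data or MDH's scoring code):
# a single absolute cutoff scale lets a topped-out measure earn nearly full points
# while a measure with lower average performance can earn only a few, which is the
# devaluation described above. Measure-specific cutoffs put both on the same footing.

rates = {
    "measure_topped_out": [0.97, 0.98, 0.99],   # nearly all hospitals score high
    "measure_lower_avg":  [0.70, 0.78, 0.85],   # average performance is lower
}

def points_common_scale(rate, floor=0.80, ceiling=1.00, max_pts=10):
    """One cutoff scale for all measures: 0 points at or below the floor, max at the ceiling."""
    if rate <= floor:
        return 0.0
    return max_pts * (rate - floor) / (ceiling - floor)

def points_measure_specific(rate, all_rates, max_pts=10):
    """Cutoffs anchored to the measure's own observed range, so each measure can span 0 to max."""
    lo, hi = min(all_rates), max(all_rates)
    if hi == lo:
        return float(max_pts)
    return max_pts * (rate - lo) / (hi - lo)

for name, vals in rates.items():
    common = [round(points_common_scale(r), 1) for r in vals]
    specific = [round(points_measure_specific(r, vals), 1) for r in vals]
    print(name, "common scale:", common, "measure-specific:", specific)

# On the common scale the lower-average measure earns few or no points while the
# topped-out measure earns close to the maximum, so the topped-out measure dominates
# the cluster; with measure-specific cutoffs both measures span the full point range.
```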
• Use of out-of-date quality measures:
MDH should reconsider the quality reporting methodology so that it uses the most
current quality data available. Presenting outdated quality results does not reflect
hospitals' performance today and may inadvertently drive care to lower-achieving
hospitals. MDH explains that the outdated data are used to match the time period of
the cost analysis, which is largely driven by the availability of Medicare data. There
are two main reasons to seriously reconsider this approach. First, the lack of
correlation between cost and quality is well documented in the literature, which
leaves this rationale unsupported. Second, the cost results should be segmented by
commercial, Medicaid, and Medicare payers for usability and accuracy.
• Maximizing the number of providers analyzed at the expense of methodological
accuracy:
MDH has consistently cited the principle, as stated by the original advisory
committee, of including as many providers/hospitals in the comparison as possible.
We do not believe the intent of this principle is to increase the number of
providers/hospitals in the model if doing so compromises the integrity of the
methods used to derive the comparisons. We offer two examples where we think this
occurs:
1. Inclusion of small-denominator measures may lead to inaccurate results. MDH
has lowered the minimum patient thresholds for the process-of-care measures,
using patient denominators as low as 10, whereas CMS Hospital Compare and
The Joint Commission recommend minimum denominators of 25 patients to
achieve reliable results for transparency; CMS lowers the denominator to 10
patients only for quality improvement purposes (the sketch after this list
illustrates the loss of precision at smaller denominators).
2. Domain scores require a minimum of 6 measures out of 16 total. We recommend
reconsidering this decision and requiring at least half, or 8, measures. Although
page 9, paragraph 3 notes that there is little difference between requiring at
least 7, 8, or 9 measures, the fact that on average 2 measures are imputed
[Table 3] is likely what drives this finding. We would assert that these
confounding methodological decisions are not reflective of actual results and
are misleading.
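A brief sketch of the precision concern in example 1 (using a hypothetical 70% rate; this is an approximation for illustration, not the reliability analysis used by CMS or The Joint Commission):

```python
# Illustration only: approximate width of a 95% confidence interval around a
# process-of-care rate at denominators of 10 versus 25 patients, using the normal
# approximation to the binomial distribution.
from math import sqrt

def ci_half_width(rate, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * sqrt(rate * (1 - rate) / n)

for n in (10, 25):
    hw = 100 * ci_half_width(0.70, n)
    print(f"n={n}: an observed rate of 70% is consistent with true performance "
          f"between roughly {70 - hw:.0f}% and {70 + hw:.0f}%")

# At n=10 the interval spans roughly +/- 28 percentage points, versus roughly +/- 18
# points at n=25 -- the kind of imprecision behind the 25-patient minimum recommended
# for public reporting.
```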
• Lack of consideration of N-size requirements and confidence intervals issued by the
original measure/results publishers:
Previously, MDH cited the CMS Hospital Value-Based Purchasing (HVBP) program's
lack of confidence intervals as justification for not taking confidence intervals into
account in PPG. However, the purposes of these programs are very different: PPG
focuses on steerage and transparency, whereas HVBP focuses on quality
improvement.
• Transparency of peer group performance vs. ranking:
With a 10-point range in the overall composite quality scores for PPS hospitals
[page 10, Table 4a], we question whether there is any real difference in performance
among the hospitals under this methodology. Again, most variation is likely driven by
the disproportionate weight on topped-out measures as well as the other
methodological issues noted above. To that end, 'ranking' the hospitals rather than
'peer grouping' their performance is misleading and an overbroad interpretation of
the legislated task.
In closing, the timeline for reviewing this methodology change, along with the
upcoming risk-adjustment review mentioned on the call, seems rather ambitious
given the goal of producing new results for hospital review by mid-July. The
ambitious timeline leaves the perception that the Rapid Response Team's feedback
cannot be taken into account in earnest if the deadline is to be met. Given our
investment of time in providing recommendations on methods, we are hopeful they
will be considered seriously.
On the whole, our review finds that the revised methodology provides incremental
improvement. Still, it is not a credible method for use in transparency and peer
grouping (or ranking, per the current interpretation).
Minnesota Medical Association: Janet Silversmith____________________
From: Janet Silversmith <[email protected]>
Sent: Tuesday, June 05, 2012 2:12 PM
To: McCabe, Denise (MDH)
Subject: RRT Total Care Quality Composite
Denise:
On behalf of the MMA, I appreciate the opportunity to provide feedback on the total care
quality composite development for hospitals. Given the particular focus of this issue on
hospitals, the MMA’s comments are very limited.
Generally, the MMA finds the new approach (absolute thresholds) preferable to the
relative threshold approach previously used. We do have some questions about how
results will ultimately be displayed, given the small variation expected between
hospitals. It is important that consumers and other users of the data not assume
variation in performance where none actually exists.
Thanks for your consideration.
Janet
-------------------------------------------------------------
Janet Silversmith | Director of Health Policy
Minnesota Medical Association | mnmed.org
1300 Godward Street NE | Suite 2500 | Minneapolis, MN 55413
612-362-3763 office | 651-210-2275 cell | [email protected]
Minnesota Hospital Association: Mark Sonneborn____________________
From: Mark Sonneborn <[email protected]>
Sent: Tuesday, June 05, 2012 3:45 PM
To: McCabe, Denise (MDH)
Subject: RE: PPG RRT: Total Care Quality Composite Memo
Denise:
Several things:
1) Statistical significance within individual measures
A) I shared a thought about statistical significance with Stefan late last week, but I'll copy
it here:
Stefan,
I just wanted to give you a more concrete example of why using the risk-adjusted rate alone can
skew things.
In the latest run of the AHRQ measures (which is based on 4Q10-3Q11), I looked at IQI 19, hip
fracture mortality.
Fairview Ridges had a risk-adjusted rate of .0126, which is among the better scores in the state.
St. John's Maplewood was at .0185, and there were 8 hospitals between those two performance
rates. However, St. John's rate was statistically significantly lower than the expected rate while
Ridges' was not. Yet Ridges (probably) would get more points than St. John's.
An alternate step you could take would be to create a ratio of the risk-adjusted rate to the
expected rate. This would probably be fairer than the risk-adjusted rate alone, but it still does
not account for statistical significance (i.e. you could have 2 hospitals with O/E of 0.8 and one of
them is statistically significantly lower than expected and the other isn’t).
I’m not sure how to incorporate the confidence interval, but it seems unfair for a hospital that is
statistically significantly better to not get full points (or, in the reverse, for one that is worse to
get any points).
Since I sent that e-mail, I learned that the Leapfrog Group will be releasing a report tomorrow
that gives letter grades to hospitals based on their z-scores on many of the same measures that
we are looking at. I think there are problems with the Leapfrog methodology because they are
using a lot of very low-frequency patient safety measures (i.e. most hospitals have a rate of
zero), but here we have a group that releases reports for consumers that uses z-scores. This
might be an alternative worth exploring.
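To make the O/E-ratio-plus-significance idea above more concrete, here is a rough sketch (the hospitals and counts are invented, and this is not code that MDH, AHRQ, or Leapfrog actually uses):

```python
# Sketch only: contrast the observed-to-expected (O/E) ratio with a crude check of
# whether the observed count differs significantly from the expected count, using a
# normal approximation to the Poisson distribution.
from math import sqrt

def oe_with_significance(observed, expected):
    """Return the O/E ratio, an approximate z-score, and a flag for significance at ~95%."""
    oe = observed / expected
    z = (observed - expected) / sqrt(expected)
    return oe, z, abs(z) > 1.96

# Hypothetical counts: a small and a large hospital, both with O/E = 0.80.
for name, obs, exp in [("small hospital", 4, 5.0), ("large hospital", 80, 100.0)]:
    oe, z, sig = oe_with_significance(obs, exp)
    print(f"{name}: O/E = {oe:.2f}, z = {z:.2f}, significantly different from expected: {sig}")

# Both hospitals show O/E = 0.80, but only the larger one is statistically significantly
# better than expected -- the distinction that scoring on the rate or ratio alone ignores.
```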
B) I also received a comment from [Respondent A] on this topic. I'll copy it here, but I have
to admit that I'm a bit confused by it. I tried to get clarification today, but was unable to reach
anyone:
Mark:
Thanks for the opportunity to provide comments.
The updated methodology does represent a minor improvement in quality measurement but still
leaves much to be desired; no discussion was provided on the cost methodology. Frankly, it is
difficult to tell how much improvement there actually is. There is not enough information for us
to try to model it.
They report modeling this after the federal Hospital Value-Based Purchasing program, but they
do not include patient satisfaction data. We think inclusion of this data would be very helpful.
We couldn't recall why it was initially left out of the public reporting.
It seems that the scoring of process measures has improved, but now the scoring ignores
substantive differences for outcome measures. Have we gone from one extreme of forced
variation to the other extreme of ignoring variation? Specifically, at the end of the report there
is an intent to show relative ranking, ostensibly because relative ranking is easier for consumers
to understand. The relative ranking is meaningless if the real differences between hospitals
are non-existent. MDH needs to be able to demonstrate to consumers where real (statistically
significant) differences exist and where such differences do not exist. In the CMS Hospital
Compare outcome measures, CMS does this simply by saying a hospital is better than, no
different than, or worse than the national average, regardless of ranking (relative or absolute).
Can you ask Mathematica to show whether there is a way to see whether significant differences
in performance exist?
The methods proposed so far do not help consumers understand what constitutes a significant
variation or what might cause the variation. One thing that is missing, and probably far more
important in terms of value, is the utilization rate for certain procedures. We know that
utilization varies across the state, and it would be very helpful to know why care patterns differ
in these areas. The current PPG effort doesn't get us closer to this understanding. When we look
at disparities in outcomes, this representation doesn't get us there either.
So, sooner or later, we need to step back and ask why we would try to keep refining a method
that doesn't really get us where we want to be. That is, stop trying to refine the wrong thing
(even if it is perfect, it is still wrong) and instead create a method to do the right thing.
2) Other general comments
A) All of the people that I contacted felt the change in direction to using hard thresholds
was welcome. There was also a very keen curiosity about how MDH would display the very
compressed total scores; here is a representative comment:
“For instance, it would be perfectly legitimate for them to report the percentile rankings, but
incumbent upon them to highlight that the percentile rankings may contain no meaningful
differences in actual scores.”
B) There was also a comment about potentially using different thresholds for each
measure rather than an across-the-board one:
“The use of an absolute scale assumes equal variation and similar denominators, and that a 1%
variation in one metric is roughly equal to a 1% variation in another metric. One
recommendation to address this would be to use a weighted index in which each metric is
weighted by denominator and coefficient of variation. Compressed measures, and those with
very small denominators, would receive less index weight.
We would encourage them to use at least one digit of precision in the calculations of the
individual measures so as to mitigate rounding error; not rounding will change the total
composite score for the IPPS hospitals by 10-15%.”
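To illustrate what such a weighted index might look like (with made-up numbers; this is only a sketch of the suggestion above, not anything MDH or Mathematica has proposed):

```python
# Sketch only (hypothetical data): a composite in which each measure's weight is its
# denominator times its coefficient of variation, so that, other things being equal,
# compressed measures and very small-denominator measures carry less index weight.
from statistics import mean, stdev

# Per-measure data for one hypothetical hospital: (hospital rate, denominator, statewide rates).
measures = {
    "measure_a": (0.97, 250, [0.95, 0.96, 0.97, 0.98, 0.99]),   # compressed, large denominator
    "measure_b": (0.78, 40,  [0.60, 0.70, 0.78, 0.85, 0.90]),   # more spread, small denominator
}

def cv(values):
    """Coefficient of variation of the statewide distribution for a measure."""
    return stdev(values) / mean(values)

# Raw weight = denominator * coefficient of variation, then normalized to sum to 1.
raw = {m: n * cv(dist) for m, (_, n, dist) in measures.items()}
total = sum(raw.values())
weights = {m: w / total for m, w in raw.items()}

# Keep full precision until the final composite to avoid compounding rounding error.
composite = sum(weights[m] * rate for m, (rate, _, _) in measures.items())
print("normalized weights:", {m: round(w, 3) for m, w in weights.items()})
print("weighted composite:", round(composite, 3))
```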
3) Critical Access Hospital comment: “…Encourage the special emphasis on the importance
of clear, concise, easy-to-digest descriptions and presentation of results from both the PPS and
CAH perspectives, as I did during an interview by Mathematica Research in May 2011 regarding
the reporting of hospital quality scores.” This person also felt that the shifting of how many
measures are needed to be included in the report was a bit arbitrary: either you have enough or
you don't. How many hospitals end up being included should not be as big a factor as whether
the analysis is fair.
4) Time lag of data: through the course of the e-mail exchange, a side discussion arose about
the time-lag issue. I then posed the question of whether it would be preferable to have cost and
quality data from the same time period or to have the most recent data available. The group
unanimously chose the latter.
That’s it. Look forward to speaking in a couple of days.
Mark A. Sonneborn, FACHE
VP, Information Services
Minnesota Hospital Association
2550 University Av. W, Ste. 350-S
St. Paul, MN 55114
651-659-1423
Minnesota Council of Health Plans: Sue Knudson –
MCHP Response to MHA Comments________________________________
From: Knudson, Susan M <[email protected]>
Sent: Thursday, June 14, 2012 8:01 AM
To: McCabe, Denise (MDH); Mark Sonneborn; Janet Silversmith; Beth McMullen;
Castellano, Susan E (DHS); Michele Kimball
Cc: Jennifer Sanislo ([email protected]); Zimmerman, Marie L (DHS);
Wasieleski, Christine M (DHS); Lo, Sia X; Darcee Weber; Julie Brunner; Eileen Smith;
Gildemeister, Stefan (MDH)
Subject: RE: PPG RRT Comments: Total Care Quality Composite Memo
Mark,
It’s very helpful to see the other written feedback. I concur with [Respondent A's] feedback and
view it as very consistent with the MN Council of Health Plans' feedback.
Thanks.
Sue
Sue Knudson
Vice President, Health Informatics
HealthPartners, Inc.
952-883-6185 Office
952-484-6744 Cell