Protecting, maintaining and improving the health of all Minnesotans

Date: April 15, 2011
To: Provider Peer Grouping Rapid Response Team members
From: Katie Burns, Health Economics Program
Subject: Quality Composite Measure Design

Thank you for participating in the Rapid Response Team. In preparation for our next meeting, I wanted to distribute the attached memo and Excel tables from Mathematica Policy Research, Inc. The memo summarizes our planned approach for constructing the composite quality measure and outlines several issues for which we would like your input:
• What method should be used to combine individual measures into subcomposite categories?
• How should missing data be treated within the subcomposite categories?
• How should missing subcomposite category scores be treated?
• How should subcomposite categories be weighted as part of the overall composite quality score?

We will review the memo during our meeting to ensure you have an opportunity to clarify your understanding of the issues and to ask questions.

Response deadline: We will need your feedback on these issues by Tuesday, April 26 at 4:00 pm. Responses may be provided via email to [email protected].

MN Council of Health Plans: Sue Knudson
MDH Rapid Response Team Peer Grouping Methodology
Quality Composite Measure Design

Thank you for the opportunity to review and provide input on the quality composite measure design. Our feedback is organized according to the approach for constructing the composite.

1. What method should be used to combine individual measures into subcomposite categories?

The recommended approach of combining standardized scores for the measures within each sub-composite, standardizing each measure with a z-score, is thorough, but it introduces a moving target for performance rather than a stated goal. It is further confounded by the fact that the same performance level relative to the average can yield very different z-scores on two different measures; measures with smaller variation will contribute more to the overall quality score than they would if each measure contributed equally. This relative influence makes the approach complicated and is likely to produce results that are difficult to explain to providers and consumers because they are not intuitive given measure-level results. While this approach is good at highlighting breakout performers, it is likely to be perceived by providers and consumers as arbitrary because it is difficult to explain, and peer group rankings could be counterintuitive relative to results on individual performance measures. Our experience across many measurement areas is that we enjoy a high-performing network of providers in our state. We do not recommend use of this method because it would force some providers to look like relatively low performers when in actuality they are very strong performers.

Based on this, we recommend reconsideration of Option 3, which combines absolute rates. This approach aligns more closely with the other existing industry approaches you outline, and it also aligns more closely with the recently released federal ACO criteria. It deals with topped-out measures without forcing providers who are in fact strong performers to look like relatively poor performers. The point scale can be simplified to accommodate reasonable yet meaningful targeted performance by measure. Well thought out targets are the key to communicating this approach and mitigating perceptions that the measure scoring is arbitrary. This approach is more intuitive and easier to understand and communicate.
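To illustrate the relative-influence concern described in the response to question 1 above, the following minimal Python sketch uses hypothetical clinic rates (not actual PPG data) to show how the same absolute gap from the statewide average produces very different z-scores depending on how much a measure varies:

# Minimal sketch (hypothetical rates, not actual PPG data) of the z-score
# concern: the same absolute gap from the statewide average translates into
# very different z-scores when measure variation differs, so low-variation
# ("topped out") measures dominate a composite built by summing z-scores.
import statistics

measure_a = [0.90, 0.91, 0.92, 0.93, 0.94]   # little variation across clinics
measure_b = [0.55, 0.65, 0.75, 0.85, 0.95]   # wide variation across clinics

sd_a = statistics.pstdev(measure_a)          # ~0.014
sd_b = statistics.pstdev(measure_b)          # ~0.141

# A clinic 2 percentage points above the average on each measure:
gap = 0.02
print(round(gap / sd_a, 2))   # ~1.41 standard deviations on measure A
print(round(gap / sd_b, 2))   # ~0.14 standard deviations on measure B

Because composite scores are built by summing these z-scores, the low-variation measure ends up carrying far more weight for the same absolute difference in performance.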
2. How should missing data be treated within the sub-composite categories?

The current recommendation to calculate a sub-composite if at least one measure is available will result in performance on a single measure driving performance for an entire category for many providers. An unintended consequence of this approach is that asthma results alone could drive 60% of an overall score because of the significant weight placed on chronic conditions. This type of result could be very misleading because it is not representative of overall provider quality. On a similar note, in our experience it is never a good idea to "impute" a quality performance level for providers; performance is either demonstrated or absent. We also recommend the inclusion of displays that are intuitive to understand and that list individual measure performance. We recommend that each sub-composite require a majority, or at least half, of its measures to be available. We recommend use of the optimal/composite care scores for both hospital and physician care, not the component parts. As an alternative, using the component parts alone would be better than using both; however, that approach and the MDH-recommended approach will both result in overall performance being swamped by these measures alone.

3. How should missing sub-composite category scores be treated?

We agree with the recommendation to require that all sub-composite category scores be present for inclusion in physician peer grouping. We will reserve comment on hospitals until a recommendation is proposed.

4. How should sub-composite categories be weighted as part of the overall composite quality score?

Measures of health information technology (HIT) are missing. Given the emphasis on safety and care coordination, among other critical aspects, we suggest adding these measures to both the physician and hospital composites. Minnesota providers have high adoption rates and should be given credit for early adoption and capabilities. HIT could be considered a sub-composite of its own because it is more of a structure measure than a process or outcome measure. The hospital measures appear to be disproportionately representative of the Medicare population, with 50% of the score being driven by measures sourced from CMS. Adding HIT measures would provide greater balance so that results are usable by other cohorts and purchasers.

Other comments: Given verbal comments on the RRT call and the inferences made in this memo regarding a potential change to peer group physicians at the clinic level rather than the group level, we suggest revisiting the attribution rules. Physician-level enumeration is complicated by physicians practicing at multiple sites. Last, in order to fully inform this methodology it would be most helpful to also understand the final approach for stratifying results.

Thank you for the opportunity to provide input and suggest alternative methods for consideration.

MN Medical Association: Janet Silversmith

Memorandum
To: Katie Burns, Minnesota Department of Health
From: Janet Silversmith
Date: April 26, 2011
Re: MMA Feedback on Rapid Response Team Quality Measurement Issues

On behalf of the Minnesota Medical Association (MMA), I appreciate the opportunity to provide the following comments for your consideration with respect to the development and incorporation of quality metrics into Minnesota's provider peer grouping system.
Measure Selection

A total of 21 measures are proposed in three sub-composite areas: preventive services (5 measures), short-term acute care (3 measures), and chronic disease outcomes (13 measures). The MMA would note that this set of measures, although generally reasonable for inclusion (see the exception below), still represents a fairly narrow and limited picture of quality across all of the types of care that primary care clinics deliver. The MMA urges both clarity and caution when communicating the "total care" results, as these measures do not represent an overall statement on practice quality. This challenge is more significant should the department attempt to create a combined quality and cost composite result, which would imply measurement of quality associated with those areas of care included in total cost, and vice versa. Given the limited available measures, the MMA urges the department to modify its representation of "total care" in its publication and communication of results, and urges the department to avoid the development of an overall combined cost/quality ranking.

Among the 21 measures recommended for inclusion are optimal vascular care (OVC) and optimal diabetes care (ODC), both as individual measures and as composites. The MMA is very concerned that inclusion of the composites will provide additional (inappropriate) weighting such that the results are skewed for the lower and higher performers. At a minimum, the MMA urges the department to study and report back to the RRT the distribution of the chronic disease sub-composite scores with the ODC and OVC measures included and excluded, in order to better understand the likely impact.

Combining Measures

The MMA supports the recommendation to combine standardized scores for the measures within each subcomposite, using a z-score standardization approach.

Minimum Case Size

The MMA notes that the recommended minimum case sizes for clinic measures will be 30 (OVC/ODC and HEDIS) or 60 (HEDIS hybrid measures). Missing from the background memo and discussion, however, is how the department intends to address the reliability of quality measures. Research (see documents attached to email) indicates that there is no guarantee of reliable results with as few as 30 cases and that reliability can vary significantly by measure (an illustrative reliability sketch follows these comments). The MMA urges the department to explicitly address the reliability of quality measures, either through the RRT or through the Reliability Work Group, and requests follow-up with the RRT on the department's planned approach.

Combining Group and Clinic-Level Measures

The department notes that, in an attempt to create peer grouping analyses at the clinic level, it will be necessary to create clinic-level quality data where such gaps exist (i.e., all measures with the exception of the ODC and OVC). The MMA is strongly opposed to the allocation of group-level results to individual clinics within a group. Such assignment of results is both unfair and inappropriate to individual clinics; the MMA believes strongly that physicians (and clinics/groups) should not be held accountable for care they did not provide. Given the expected use of peer grouping results (e.g., to change payment policy or limit provider networks), it is critical that the department create accurate results, even if such results are limited to a smaller number of entities. Publication of clinic-level results would not fairly or accurately differentiate the cost and quality of care provided at each clinic and, as such, would suggest a higher degree of accuracy than the data will support. The MMA strongly opposes the assignment of group results to individual clinics.
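To illustrate the reliability concern raised under Minimum Case Size above, the following rough sketch applies the standard signal-to-noise definition of measure reliability for a proportion measure; the statewide rate and between-clinic standard deviation are assumed values chosen for illustration, not estimates from the PPG data:

# Rough sketch of measure reliability (signal-to-noise) for a proportion
# measure: reliability = between-clinic variance / (between-clinic variance
# + sampling variance at the clinic's case count). The rate and
# between-clinic SD below are assumed for illustration only.

def reliability(rate, between_sd, n_cases):
    between_var = between_sd ** 2
    sampling_var = rate * (1 - rate) / n_cases   # binomial sampling noise
    return between_var / (between_var + sampling_var)

# Example: a measure with a 75% statewide rate and a 5-point SD across clinics
for n in (30, 60, 120):
    print(n, round(reliability(0.75, 0.05, n), 2))
# With 30 cases, reliability is roughly 0.29, well below common thresholds
# (e.g., 0.7); at 120 cases it is still only about 0.62.

Under these assumptions, a fixed minimum of 30 cases does not by itself guarantee reliable results, and the case count needed varies with each measure's rate and between-clinic variation.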
Weighting of Subcomposite Scores

The MMA generally supports the proposed weighting for the development of the overall composite, recognizing that the weights are somewhat arbitrary. If possible, it would be interesting to analyze the data to see whether the weights reflect the actual care delivered by primary care clinics (i.e., is their mix of services approximately 20% preventive, 20% short-term acute, and 60% chronic disease?).

MN Hospital Association: Mark Sonneborn

A. Construction of Subcomposites

1. General approach for combining measures: Mathematica's recommendation is to use a standardized variation of combining absolute rates by converting each measure to a z-score (option 1a). Without judging whether this method is superior, it differs from the one being employed by CMS in its Hospital Value-Based Purchasing (HVBP) program, whose methodology is most closely aligned with option 3. Given that hospitals will become increasingly familiar with HVBP as it is implemented over the next few years, we would recommend option 3. Using a different framework is unnecessarily confusing and adds administrative burden and complexity.

We do not accept the rationale given against option 3, namely that 1) using a point system and benchmarks could be viewed as arbitrary, and 2) it results in the loss of the individual measure distribution. It is not arbitrary because it follows a methodology established by CMS. Although one could argue that the CMS methodology arbitrarily chooses the 90th percentile as its top threshold and the 50th as its bottom, and that it chooses a 10-point scale, these are at least rational choices. We do not understand the second rationale: individual measures can be displayed so that the reader sees the actual distribution of results, and for scoring, a 10-point scale yields 11 groups (including the zeros). The AHRQ measures do lend themselves more to z-score conversion, but they can be converted to benchmarks and point scores relatively easily.

For any general approach used, there is an issue with "topped out" hospital measures (discussed in section 4). The CMS proposed approach is to score these measures as pass/fail: a hospital gets either full credit or no credit. We support this approach. Using a z-score in situations with so little variation may be problematic, as is setting 50th and 90th percentile benchmarks. We would recommend setting a hard target of 95% compliance for these measures.

Sections 2, 3, and 4: We generally support the recommendations in sections A.2, 3, and 4 (see the comment on "topped out" measures above). The minimum case threshold for HCAHPS measures, however, may need amendment. Though HVBP may exclude hospitals with fewer than 100 responses, that program applies only to PPS hospitals. Hospital Compare still displays all HCAHPS-participating hospitals, including CAHs. CMS provides only the range of returned surveys (fewer than 100, 100-300, over 300), so we cannot determine how many were actually returned. We would recommend no minimum case threshold for HCAHPS.
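To make the point-scale approach discussed in section A.1 concrete, here is a simplified sketch of HVBP-style achievement scoring; the thresholds and rates are hypothetical, and the exact CMS rounding and domain-weighting rules are omitted:

# Simplified sketch of an HVBP-style point assignment: scores at or above
# the benchmark (~90th percentile) earn full points, scores below the
# achievement threshold (~50th percentile) earn zero, and scores in between
# are scaled linearly onto a 10-point scale. Exact CMS rounding conventions
# are omitted; all numbers below are hypothetical.

def achievement_points(rate, threshold, benchmark, max_points=10):
    if rate >= benchmark:
        return max_points
    if rate < threshold:
        return 0
    return round(max_points * (rate - threshold) / (benchmark - threshold))

# Hypothetical measure with a 50th percentile of 0.82 and 90th percentile of 0.95
for rate in (0.80, 0.85, 0.90, 0.96):
    print(rate, achievement_points(rate, threshold=0.82, benchmark=0.95))
# Prints 0, 2, 6, and 10 points respectively.

A topped-out measure could instead be scored pass/fail against a hard target (for example, the 95% compliance target recommended above).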
B. Construction of Overall Composite

Mathematica recommends excluding the HCAHPS scores in computing the overall composite for CAHs. Though we understand this makes the computation and comparison cleaner, it strikes us as unfortunate. HCAHPS is mandated as part of SQRMS for CAHs with more than 500 discharges per year, so it would stand to reason that those results should be used in Provider Peer Grouping. We are aware of CAHs that are very engaged in improving patient experience. We also suspect that many of the 20% of CAHs that fall out of the analysis because of the recommended treatment of missing subcomposites are the same CAHs that do not participate in HCAHPS because they have fewer than 500 annual discharges. This would leave roughly two-thirds of the remaining CAHs that do participate in HCAHPS (even if they had fewer than 100 surveys returned). Could CAHs with HCAHPS be scored with the same weighting method as PPS hospitals, with those without HCAHPS having the 15 extra percentage points redistributed to the process measures? That would violate the recommendation about a missing subcomposite, but it could be treated as an exception to that rule.

MN Business Partnership: Beth McMullen
No response submitted.

AARP: Michelle Kimball
No response submitted.

DHS: Marie Zimmerman
No response submitted.

Protecting, maintaining and improving the health of all Minnesotans

Date: May 29, 2012
To: Provider Peer Grouping (PPG) Rapid Response Team (RRT) Members
From: Stefan Gildemeister, Director, Health Economics Program
Subj.: Quality Composite Measure Design, revised first hospital report

Thank you for participating in the Rapid Response Team (RRT). In preparation for our meeting this afternoon on May 29 (1:00-2:00 p.m.), I wanted to distribute the attached memo from Mathematica Policy Research. The memo summarizes changes to the scoring and compositing methodologies, which we developed and implemented after additional analysis and in response to hospital stakeholder comments, particularly concerns about the original relative scoring methodology. We would like to discuss and receive your comments on the following changes:
• Using absolute thresholds rather than relative thresholds to assign points to each measure;
• Stricter requirements on the number of measures per subdomain needed to receive a score; and
• The combination of three former outcome subdomains (readmission, mortality, and inpatient complications) into one combined outcome subdomain.

We will review the memo during our meeting to ensure you have an opportunity to clarify your understanding of the issues and to ask questions.

Response deadline: We will need your feedback on these issues by June 5 at 4:00 p.m. Comments may be provided via email to [email protected]. We will reconvene the RRT for a conference call on June 7 (8:00-9:30 a.m.) to discuss your comments.

Minnesota Council of Health Plans: Sue Knudson
MDH Rapid Response Team Peer Grouping Methodology
Quality Composite Measure Design – Round 2

Thank you for the opportunity to review and provide input on the revised hospital total care quality scoring methodology. We appreciate the Department's effort to improve the methodology and its attention to producing sound results. In general, we support the revised approach of scoring hospitals on their absolute performance at the measure level, as it is an improvement over the previous relative scoring method. Still, there are several further improvements needed.
• Rate Cutoffs for Absolute Point Assignments: Thresholds should be modified to be measure specific. Using one threshold set across all measures methodologically produces different weights by measure, when this should be an intentional decision. While this puts every provider and every measure on the same "ruler," it inadvertently devalues some measures' weight within the cluster. Essentially, it has the effect of saying that measures where average performance is comparatively lower are worth fewer potential points within the cluster; the net effect is a disproportionate weight placed on topped-out measures (see the illustrative sketch after this list). Measure-specific absolute thresholds or normalizing techniques should be tested and implemented to address this issue.

• Use of out-of-date quality measures: MDH should reconsider the quality reports methodology so that it uses the most current quality data available. Presenting outdated quality results does not reflect quality scores today and may inadvertently drive care to lower-achieving hospitals. MDH explains the rationale for using the outdated data as a means to match the time period for the cost analysis, which is largely driven by the availability of Medicare data. There are two main reasons to seriously reconsider this approach. First, the lack of correlation between cost and quality is well documented in the literature, rendering this rationale unsupported. Second, the cost results should be segmented by commercial, Medicaid, and Medicare for usability and accuracy.

• Maximizing the number of providers analyzed at the expense of methodological accuracy: MDH has consistently cited the principle, as stated by the original advisory committee, of including as many providers/hospitals in the comparison as possible. We do not believe the intent of this principle is to increase the number of providers/hospitals in the model if doing so compromises the integrity of the methods used to derive the comparisons. We provide two examples where we think this occurs:
1. Inclusion of small-denominator measures may lead to inaccurate results. MDH has lowered the minimum patient thresholds for the process-of-care measures, using patient denominators as low as 10, whereas CMS Hospital Compare and The Joint Commission recommend minimum denominators of 25 patients to achieve reliable results for transparency; CMS lowers the denominator to 10 patients only for quality improvement purposes.
2. Domain scores require a minimum of 6 measures out of 16 total. We recommend reconsidering this decision and requiring at least half, or 8, measures. Though page 9, paragraph 3 notes that there is not much difference between requiring at least 7, 8, or 9 measures, the fact that on average 2 are imputed [Table 3] likely drives this finding. We would assert that these confounding methods decisions are not reflective of actual results and are misleading.

• Lack of consideration of N-size requirements and confidence intervals issued by the original measure/results publishers: Previously, MDH cited the CMS Hospital Value-Based Purchasing (HVBP) program's lack of confidence intervals as justification for not taking confidence intervals into account in PPG. However, the purposes of these programs are very different: PPG focuses on steerage and transparency, whereas HVBP focuses on quality improvement.

• Transparency of peer group performance vs. ranking: With a 10-point range in the overall composite quality scores for PPS hospitals [page 10, Table 4a], we question whether there is any real difference in performance among the hospitals using this methodology. Again, most of the variation is likely driven by the disproportionate weight on topped-out measures, as well as the other methods issues noted above. To that end, 'ranking' the hospitals rather than 'peer grouping' their performance is misleading and an overbroad interpretation of the legislated task.
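One reading of the single-threshold concern in the first bullet above can be sketched as follows; the cutoffs and rates are hypothetical, and the example is only meant to show how a common cutoff set limits the points realistically available on a measure whose typical rates are lower:

# Rough illustration (hypothetical rates and cutoffs) of the single-threshold
# concern: with one absolute cutoff set applied to every measure, a measure
# whose typical rates are lower maps most hospitals into the bottom point
# bands, while measure-specific cutoffs spread points across the range.

def points(rate, cutoffs):
    # cutoffs is an ascending list; each cutoff met earns one point
    return sum(rate >= c for c in cutoffs)

global_cutoffs = [0.50, 0.60, 0.70, 0.80, 0.90]

topped_out = [0.96, 0.97, 0.98]   # typical rates near the ceiling
lower_avg = [0.55, 0.62, 0.70]    # typical rates are lower

print([points(r, global_cutoffs) for r in topped_out])   # [5, 5, 5]
print([points(r, global_cutoffs) for r in lower_avg])    # [1, 2, 3]

# Cutoffs centered on the lower-average measure's own distribution
specific_cutoffs = [0.50, 0.56, 0.62, 0.68, 0.74]
print([points(r, specific_cutoffs) for r in lower_avg])  # [1, 3, 4]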
In closing, the timeline to review this methodology change, along with the upcoming risk adjustment review mentioned on the call, seems rather ambitious if the goal is to produce new results for hospital review by mid-July. The ambitious timeline leaves the perception that the rapid response team's feedback could not possibly be taken into account in earnest if the timeline is to be met. Given our time investment in providing information to improve the methods, we are hopeful it is considered seriously. On the whole, our review finds the revised methodology to be an incremental improvement; still, it is not a credible method for use in transparency and peer grouping (or ranking, per the current interpretation).

Minnesota Medical Association: Janet Silversmith

From: Janet Silversmith <[email protected]>
Sent: Tuesday, June 05, 2012 2:12 PM
To: McCabe, Denise (MDH)
Subject: RRT Total Care Quality Composite

Denise:

On behalf of the MMA, I appreciate the opportunity to provide feedback on the total care quality composite development for hospitals. Given the particular focus of this issue on hospitals, the MMA's comments are very limited. Generally, the MMA finds the new approach (absolute thresholds) preferable to the relative threshold approach previously used. We do have some questions about how results will ultimately be displayed, given the small variation between hospitals that is expected. It is important that consumers and other users of the data not assume variation in performance where none actually exists.

Thanks for your consideration,
Janet

Janet Silversmith | Director of Health Policy
Minnesota Medical Association | mnmed.org
1300 Godward Street NE | Suite 2500 | Minneapolis, MN 55413
612-362-3763 office | 651-210-2275 cell | [email protected]

Minnesota Hospital Association: Mark Sonneborn

From: Mark Sonneborn <[email protected]>
Sent: Tuesday, June 05, 2012 3:45 PM
To: McCabe, Denise (MDH)
Subject: RE: PPG RRT: Total Care Quality Composite Memo

Denise:

Several things:

1) Statistical significance within individual measures

A) I shared a thought about statistical significance with Stefan late last week, but I'll copy it here:

Stefan, I just wanted to give you a more concrete example of why using the risk-adjusted rate alone can skew things. In the latest run of the AHRQ measures (which is based on 4Q10-3Q11), I looked at IQI 19, hip fracture mortality. Fairview Ridges had a risk-adjusted rate of .0126, which is among the better scores in the state. St. John's Maplewood was at .0185, and there were 8 hospitals between those two performance rates. However, St. John's rate was statistically significantly lower than the expected rate while Ridges was not. Yet Ridges (probably) would get more points than St. John's. An alternative step you could take would be to create a ratio of the risk-adjusted rate to the expected rate. This would probably be fairer than the risk-adjusted rate alone, but it still does not account for statistical significance (i.e., you could have two hospitals with an O/E of 0.8 where one is statistically significantly lower than expected and the other is not).
I'm not sure how to incorporate the confidence interval, but it seems unfair for a hospital that is statistically significantly better not to get full points (or, in the reverse, for one that is worse to get any points).

Since I sent that e-mail, I learned that the Leapfrog Group will be releasing a report tomorrow that gives letter grades to hospitals based on their z-scores on many of the same measures we are looking at. I think there are problems with the Leapfrog methodology because it uses a lot of very low-frequency patient safety measures (i.e., most hospitals have a rate of zero), but here we have a group releasing reports for consumers that uses z-scores. This might be an alternative worth exploring.

B) I also received a comment from [Respondent A] on this topic. I'll copy it here, but I have to admit that I'm a bit confused by it. I tried to get clarification today but was unable to reach anyone:

Mark: Thanks for the opportunity to provide comments. The updated methodology does represent a minor improvement in quality measurement but still leaves much to be desired; no discussion was provided on the cost methodology. Frankly, it is difficult to tell how much improvement there actually is, and there is not enough information for us to try to model it. They report modeling this after the federal Hospital Value-Based Purchasing program, but they do not include patient satisfaction data. We think inclusion of this data would be very helpful; we could not recall why it was initially left out of the public reporting. It seems that the scoring of process measures has improved, but the scoring now ignores substantive differences for outcome measures. Have we gone from one extreme of forced variation to the other extreme of ignoring variation? At the end of the report, specifically, there is an intent to show relative ranking, ostensibly because it is easier for consumers to understand. The relative ranking is meaningless if the real differences between hospitals are non-existent. MDH needs to be able to demonstrate to consumers where real (statistically significant) differences exist and where they do not. In the CMS Hospital Compare outcome measures, CMS does this simply by saying a hospital is better than, no different than, or worse than the national average, regardless of ranking (relative or absolute). Can you ask Mathematica to show whether there is a way to see whether significant differences in performance exist? The methods proposed so far do not help consumers understand what constitutes significant variation or what might cause the variation. One thing that is missing, and probably far more important in terms of value, is the utilization rate for certain procedures. We know that utilization varies across the state, and it would be very helpful to know why care patterns differ in these areas. The current PPG effort doesn't get us closer to this understanding. When we are looking at disparities in outcomes, this representation doesn't get us there either. So, sooner or later, we need to step back and ask why we would keep refining a method that doesn't really get us where we want to be. That is, stop trying to refine the wrong thing (even if it is perfect, it is still wrong) and instead create a method to do the right thing.
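The O/E comparison with a significance check, along the lines suggested in item 1A and in the comment above, might look like the following rough sketch; the event counts are hypothetical and the interval is a simple approximation rather than the method CMS actually uses:

# Rough sketch: compute the ratio of observed to expected events and flag a
# hospital as better, worse, or no different only when an approximate 95%
# confidence interval for the ratio excludes 1. Counts are hypothetical;
# a production method would use an exact or Byar-type interval.
import math

def classify(observed, expected):
    ratio = observed / expected
    if observed == 0:
        return ratio, "no different than expected"   # too few events to distinguish
    se_log = 1 / math.sqrt(observed)                  # normal approximation on the log scale
    low = ratio * math.exp(-1.96 * se_log)
    high = ratio * math.exp(1.96 * se_log)
    if high < 1:
        return ratio, "better than expected"
    if low > 1:
        return ratio, "worse than expected"
    return ratio, "no different than expected"

print(classify(observed=4, expected=5.0))     # O/E = 0.8 but wide interval: "no different"
print(classify(observed=160, expected=200.0)) # O/E = 0.8 with a tight interval: "better"

This mirrors the hip fracture example: two hospitals can share the same O/E ratio while only one differs significantly from its expected rate.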
2) Other general comments

A) All of the people I contacted felt the change in direction to using hard thresholds was welcome. There was also keen curiosity about how MDH would display the very compressed total scores; here is a representative comment: "For instance, it would be perfectly legitimate for them to report the percentile rankings, but incumbent upon them to highlight that the percentile rankings may contain no meaningful differences in actual scores."

B) There was also a comment about potentially using different thresholds for each measure rather than a single across-the-board threshold: "The use of an absolute scale assumes equal variation and similar denominators, and that a 1% variation in one metric is roughly equal to a 1% variation in another metric. One recommendation to address this would be to use a weighted index in which each metric is weighted by denominator and coefficient of variation. Compressed measures, and those with very small denominators, would receive less index weight. We would encourage them to use at least 1 digit of precision in the calculations of the individual measures to mitigate rounding error. Rounding will cause a 10-15% variation in the total composite score for the IPPS hospitals."

3) Critical Access Hospital comment: "…Encourage the special emphasis on the importance of clear, concise, easy-to-digest descriptions and presentation of results from both the PPS and CAH perspectives, as I did during an interview by Mathematica Research in May 2011 regarding the reporting of hospital quality scores." This person also felt that the shifting around of how many measures are needed for inclusion in the report was a bit random; either you have enough or you don't. How many hospitals end up being included should not be as big a factor as whether the analysis is fair.

4) Time lag of data: Through the course of the e-mail exchange, a side discussion arose about the time lag issue. I then posed the question of whether it would be preferable to have cost and quality data from the same time period or to have the most recent data available. The group unanimously chose the latter.

That's it. Look forward to speaking in a couple of days.

Mark A. Sonneborn, FACHE
VP, Information Services
Minnesota Hospital Association
2550 University Ave. W, Ste. 350-S
St. Paul, MN 55114
651-659-1423

Minnesota Council of Health Plans: Sue Knudson – MCHP Response to MHA Comments

From: Knudson, Susan M <[email protected]>
Sent: Thursday, June 14, 2012 8:01 AM
To: McCabe, Denise (MDH); Mark Sonneborn; Janet Silversmith; Beth McMullen; Castellano, Susan E (DHS); Michele Kimball
Cc: Jennifer Sanislo ([email protected]); Zimmerman, Marie L (DHS); Wasieleski, Christine M (DHS); Lo, Sia X; Darcee Weber; Julie Brunner; Eileen Smith; Gildemeister, Stefan (MDH)
Subject: RE: PPG RRT Comments: Total Care Quality Composite Memo

Mark,

It's very helpful to see the other written feedback. I concur with [Respondent A's] feedback and view it as very consistent with the MN Council of Health Plans' feedback. Thanks.

Sue

Sue Knudson
Vice President, Health Informatics
HealthPartners, Inc.
952-883-6185 Office
952-484-6744 Cell
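The weighted-index suggestion quoted in item 2B of the MHA comments above could be sketched as follows; all measure names, rates, statewide distributions, and denominators are hypothetical:

# Rough sketch of a weighted index in which each measure is weighted by its
# denominator and coefficient of variation, so compressed ("topped out")
# measures and measures with very small denominators contribute less to the
# composite. All values below are hypothetical.
import statistics

# Each measure: (hospital's rate, statewide rates across hospitals, hospital's denominator)
measures = {
    "heart_failure_process": (0.97, [0.95, 0.96, 0.97, 0.98, 0.99], 420),
    "readmission_related":   (0.78, [0.60, 0.70, 0.80, 0.85, 0.90], 35),
}

def weighted_composite(measures):
    weights = {}
    for name, (_, statewide, denom) in measures.items():
        cv = statistics.pstdev(statewide) / statistics.mean(statewide)
        weights[name] = denom * cv                     # raw weight: denominator x CV
    weight_sum = sum(weights.values())
    score = sum(rate * weights[name] / weight_sum
                for name, (rate, _, _) in measures.items())
    return score, {name: round(w / weight_sum, 2) for name, w in weights.items()}

print(weighted_composite(measures))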