Overview of Public Use Files Developed from the MN APCD (PDF)

Public Use Files Developed from
Minnesota’s All Payer Claims
Database (MN APCD): An Overview
March 2016
HEALTH ECONOMICS PROGRAM
Contents
Introduction
  What is the MN APCD?
  What are Public Use Files?
  What is the Potential Value of MN APCD-based Public Use Files?
What Data is Available in MN APCD-based Public Use Files?
  Potential Applications for Public Use Files
    Services - What Types of Medical Care Do Insured Minnesotans Receive?
    Conditions - What primary conditions are recorded when insured Minnesotans receive medical care?
    Utilization - How Is Care for Insured Minnesotans Distributed Across Certain Settings?
  Basic Features of Public Use Files
    Data Aggregation
    File Size and Complexity
    Protection against Re-Identification
    Data Validation and Quality Assurance
    Additional Technical Information & Potential Limitations
    Data Notes to the User
How Can Potential Data Users Access the Public Use Files?
  Requesting the Public Use Files
  Questions and Feedback
Introduction
In 2008, the Minnesota Legislature directed the Minnesota Department of Health (MDH) to construct a
database of administrative health care transactions (health care claims data) from public and private
payers of health care in Minnesota.1 This dataset, called the Minnesota All Payer Claims Database (MN
APCD), has been used for a variety of purposes, including assessing trends in health care cost, quality,
utilization, and disease burden.2
In 2015, the Minnesota State Legislature directed MDH to annually prepare summary information from
the MN APCD and make it publicly available, if possible, at little or no cost.3 To inform the process of
developing the first set of summary files, MDH consulted a workgroup formed in 2014 to inform data
use decisions. The workgroup provided feedback on a range of issues, including:
• How to ensure compliance with the guardrails required in state law;
• How to stratify the information in a way that would be most useful to researchers;
• What iterative steps to take to prepare follow-up data files; and
• What documentation to create to inform potential data users.4
The first set of data files and summary tables was made available in March 2016. If resources allow,
MDH will update these data files partway through 2016 and create additional data aggregations
based on input from data users. The next legislatively required refresh of the data is due in March 2017.
This document is a companion to a range of information available online.5 It describes
the history, format and contents of the Public Use Files (PUFs). Updates to this document will reflect
changes to the available data files, as well as feedback received from potential PUF users.
What is the MN APCD?
The MN APCD is a large-scale database that collects administrative health care transaction data from
public and private payers, third party administrators (TPAs), and pharmacy benefit managers (PBMs) in
Minnesota. It was established in 2008 by the Minnesota State Legislature, and serves as a tool to
measure the cost, quality, and utilization of health care services in the state. The MN APCD includes
eligibility, pharmacy, and medical claims files, as well as information on actual transaction prices for
health care services. The database contains information on an estimated 89% of insured Minnesotans
across the state, including commercial, Medicare, and Minnesota public program enrollees. It does not
contain information on uninsured people or those covered through the Veterans Affairs Department,
the Indian Health Service, or Tricare.
The MN APCD prioritizes the privacy and protection of individuals. As such, it does not collect sensitive
information that would identify unique patients. In particular, the database does not include any of the
following information for individual patients:
• Name
• Birth date
• Address
• Social Security number

1 Minnesota Statutes, Section 62U.04
2 Additional information on the current uses of the data is available online:
http://www.health.state.mn.us/healthreform/allpayer/use_of_apcd_fact_sheet.pdf
3 https://www.revisor.mn.gov/laws/?year=2015&type=0&doctype=Chapter&id=71
4 Discussion of the 2015 workgroup and the group’s earlier deliberations is available online:
http://www.health.state.mn.us/healthreform/allpayer/allworkgroups.html
5 Additional information is available here:
http://www.health.state.mn.us/healthreform/allpayer/publicusefiles/index.html

Additionally, the identity of the health care providers included in the MN APCD is currently classified as
non-public data and, as such, will not be released in any public files or reports.
MDH currently contracts with Onpoint Health Data (“Onpoint”) for services related to constructing and
maintaining the MN APCD, including data collection, processing, quality assurance, and aggregation.
Data submitters use standardized submission guidelines when submitting their data files to the MN
APCD, and MDH’s data aggregator, Onpoint, works closely with each data submitter to ensure that all
incoming data is complete and of excellent quality, both in terms of its reliability and credibility. Several
recent projects conducted by external parties have examined portions of the MN APCD in detail and
have found that the database provides solid, evidence-supported findings.6
Additional background information about the MN APCD is available online, including in a document that
provides an overview of the database’s history, content and current uses.7
What are Public Use Files?
Most generally, Public Use Files (PUFs) provide the opportunity for researchers and the public to use the
information contained in non-public datasets in an aggregated form that protects sensitive information.
There are a number of state and federal programs that collect claims data for analysis and provide
access to that information in a variety of forms. PUFs range from detail-level, de-identified data sets
that require a formal request process and Data Use Agreements, to aggregate tables and interactive
tools that are publicly available on state and federal websites.

Five states (Maine, Oregon, Utah, New Hampshire, and Colorado) are currently releasing public use files
from their APCDs. File types vary by state, but include summary data tables (Colorado), claims-level
files (Utah), and files segmented by service type (Oregon). The Centers for Medicare and Medicaid
Services (CMS) also releases public use files with both aggregate and claims-level detail.
Maine, Oregon, Utah, and Colorado require formal applications, data use agreements, and user fees to
access their PUFs, while New Hampshire provides the data free of charge but requires a data request
form. Public use files from CMS are available for free download without an application or data use
agreement. These state and federal PUFs support a wide range of uses, including population health
research, surveillance, and prevention activities; program planning and assessment; and public
reporting.

6 See information included in: http://www.health.state.mn.us/healthreform/allpayer/mnapcdoverview.pdf
7 The home page of the MN APCD and the relevant background information can be accessed at the following
locations, respectively: http://www.health.state.mn.us/healthreform/allpayer/index.html and
http://www.health.state.mn.us/healthreform/allpayer/mnapcdoverview.pdf.

While PUFs differ in various ways, including the number and types of data elements they include, as
well as the pricing level, allowable uses, and level of restriction on accessing the files, PUFs can
generally be categorized into the following two types:
1. Claims-level, de-identified data sets
Most PUFs developed from state APCDs are available by request as detailed, claims-level files.
Although available at the claims level, these data sets are de-identified according to the HIPAA
Privacy Rule and Safe Harbor guidelines and may contain limited or no provider information.
Examples of excluded provider or payer information may include the National Provider Identifier
(NPI), provider names, and payer names. Certain other fields may also be aggregated or masked in
some way. Although they are “public use,” these PUFs often must be requested through a formal
application process, may require a Data Use Agreement, and usually have an associated fee (this fee
is lower than those charged for limited, restricted or identifiable data sets). Most states require that
projects meet statutorily defined use parameters, including for instance that the research is “in the
public interest.”
2. Aggregate/summary tables
Another PUF approach is to provide public use tables, files or reports containing aggregated data, in
place of offering access to claim-level data sets. While fewer states use this model for APCD PUFs,
the Centers for Medicare & Medicaid Services (CMS) provides several aggregated data tables as
PUFs from claims and other databases. These may be aggregated by users at different demographic,
procedural, diagnostic, and/or geographic levels. While these tables are already summarized, they
still allow the user to manipulate the data and aggregate fields to a higher level. These types of
summary data tables are usually publicly available and do not require a formal application process
or fee to access them.
What is the Potential Value of MN APCD-based Public Use Files?
Currently, the Minnesota Legislature has limited the use of the MN APCD to MDH staff and its
contractors to perform relevant analyses on variation in cost, quality, utilization, and disease burden, as
well as for certain evaluation activities. To date, the Legislature has authorized MDH to use the MN
APCD in the following ways:
• Evaluating the performance of the Health Care Homes program;
• Studying hospital readmission rates and trends, in collaboration with the Reducing Avoidable
Readmissions Effectively (RARE) campaign;
• Analyzing variations in health care costs, quality, utilization and illness burden based on
geographical areas or populations;
• Evaluating the State Innovation Model (SIM) testing grant received by the Departments of
Health and Human Services;
• Conducting a one-time study of chronic pain management procedures (completed in January
2015);
• Assessing the feasibility of conducting state-based risk adjustment in the individual and small
group health insurance markets; and
• Studying trends in health care spending for specific chronic conditions and risk factors.
Given MDH’s finite resources, as well as the specific sets of expertise among its staff, there are untapped
opportunities to use the MN APCD in ways that can more rapidly inform improvements in population
health and delivery system efficiency. As a summarized, aggregated product of the MN APCD, the PUFs
allow other users to bring their research questions and expertise to bear on a range of policy and system
redesign issues, thereby continuing to demonstrate the value of the MN APCD. Broader engagement
with the data will also help inform MDH’s continuing efforts to improve the quality and effectiveness of
the data and may help prioritize research at the agency.
What Data is Available in MN APCD-based Public Use Files?
While developing PUFs, MDH sought input from both the legislatively required workgroup, as noted
above, and potential users. Through those discussions, MDH learned that there are a number
of different types of users with different needs.
• Most researchers are accustomed to working with very technically detailed data files at the
individual health care claim level, generally with identifiers for providers and health insurance
carriers. These researchers would prefer public access to granular files to enable robust,
multivariate analysis.
• Other potential users expressed an interest in initially seeing more benchmark-level population
data that may permit organizations to compare metrics of performance against market
averages.
In developing the first set of PUFs, MDH aimed to balance these different user needs with the legislative
guardrails that protect against the identification of individuals, providers and health insurance carriers.
Thus, the first files released consist of higher-level summary data from the MN APCD. The data are
considered public data under the Minnesota Data Practices Act, as the files contain summary
information that does not present a greater chance of individuals’ re-identification if the files are linked
to other data systems.
Data in the first set of PUFs are for calendar year 2013 and include medical claims from Medicare,
Medicaid and other state public programs, as well as from commercial payers. No prescription drug
information is included in the first set of data files. The PUFs focus on three themes:
1. Health Care Services: This file is designed to analyze the volume of health care services used by
Minnesota residents. It contains data at the service code level, aggregated by 3-digit ZIP code
and three age groupings.8 Where the combination of geography, age group and service code
creates small cells with only a few cases, the categories are summarized at higher levels of
aggregation.
2. Primary Diagnoses: This file is designed to analyze the distribution of primary diagnoses among
Minnesotans by three age groupings and 3-digit ZIP code. It contains common diagnostic codes9
at the 3-digit level. Where the combination of geography, age group and primary diagnosis
creates small cells with only a few cases, the categories are summarized at higher levels of
aggregation.
3. Health Care Use: This file is designed to analyze common types of health care service use
categories, including hospital admission, use of ambulance services, and clinic visits. Data are
provided at the 3-digit ZIP code level by three age groupings. Where the combination of
geography and primary service use categories creates small cells with only a few cases, the
categories are summarized at higher levels of aggregation.
8 Service codes used in this file are derived from the American Medical Association’s Current Procedural
Terminology (CPT), the Centers for Medicare & Medicaid Services’ Healthcare Common Procedure Coding System
(HCPCS), and the National Uniform Billing Committee Revenue Codes.
9 Available diagnostic information stems from the International Classification of Diseases, Ninth
Revision, Clinical Modification (ICD-9-CM). At the 3-digit level, ICD-9 codes describe disease types or
organ systems.
As noted, MDH is required to publish and update PUFs annually. To the extent of available resources,
MDH will seek to respond to user feedback and create extracts of Public Use Files more frequently and
at additional levels of aggregation.
Potential Applications for Public Use Files
Services - What Types of Medical Care Do Insured Minnesotans Receive?
Services refer to the specific medical procedures that Minnesotans receive during a visit to a health care
provider. Service summary PUFs provide information on the frequency, cost, and distribution of specific
medical services provided to Minnesotans.
These types of files can answer the following questions:
• In the aggregate, what procedures were delivered in a calendar year to patients in certain age
and geographic combinations? How did this vary across the state?
• What is the age, gender, and geographic area of residence of patients using different types of
services?
• How much, on average and in total, was spent by insurers and patients for a particular service?
• How many times was a particular procedure performed over the course of a year?
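As a rough illustration of the spending question above, the sketch below computes an average paid amount per service from aggregated rows. The field names (`service_code`, `zip3`, `age_group`, `service_count`, `total_paid`) and the values are hypothetical and are not taken from the PUF data dictionary. The point is only that an average across cells should be computed from summed totals and counts, not by averaging cell-level averages.

```python
from collections import defaultdict

# Hypothetical aggregated rows; field names and values are illustrative only.
rows = [
    {"service_code": "99213", "zip3": "553", "age_group": "Adult (18-64)",
     "service_count": 200, "total_paid": 16000.0},
    {"service_code": "99213", "zip3": "554", "age_group": "Adult (18-64)",
     "service_count": 100, "total_paid": 11000.0},
    {"service_code": "99213", "zip3": "553", "age_group": "Older Adult (65+)",
     "service_count": 100, "total_paid": 9000.0},
]


def average_paid_per_service(rows):
    """Return {service_code: total paid / number of services} across all cells."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        counts[row["service_code"]] += row["service_count"]
        totals[row["service_code"]] += row["total_paid"]
    # Weight every cell by its service count by dividing summed totals by summed counts.
    return {code: totals[code] / counts[code] for code in counts if counts[code] > 0}


averages = average_paid_per_service(rows)
```

Medians, unlike averages, generally cannot be recombined this way from cell-level summaries.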
Conditions - What primary conditions are recorded when insured Minnesotans receive medical
care?
Condition summary files include information on the frequency, cost, and distribution of certain
conditions for which groups of patients received medical care, as reported to the MN APCD. Condition
summary files do not address the complexities of comorbidities or the severity of conditions; they are
based solely on the coded primary diagnoses.10
These types of files can provide population-level data such as frequencies and counts for groups of
patients organized by age and geography, such as:
• What is the frequency of a diagnosis reported by providers to the health insurance company?
• How does the distribution of patients with certain conditions vary by age, gender, and
geographic area of residence?
• How much was spent on care for insured patients, on average, for the visit or procedure under
a specific primary diagnosis?
• How many patients received a particular diagnosis?
• Among all patients that received a particular diagnosis, what was the average and median total
amount paid by the patient and insurance carrier?
Utilization - How Is Care for Insured Minnesotans Distributed Across Certain Settings?
Minnesotans may receive care during a visit to a physician’s office, outpatient setting, ambulatory
surgical center, inpatient hospital setting, skilled nursing facility or home health provider. Utilization
summary files allow users to explore the types of settings in which medical care is provided.11
10 To develop a more complete understanding of the prevalence and cost of certain chronic conditions in
Minnesota by region and age, analysts may consider the 2016 analysis conducted by MDH:
http://www.health.state.mn.us/divs/hpsc/hep/publications/costs/20160127_chronicconditions.pdf
11 At this first stage, the utilization summary does not have a discrete category for emergency department (ED)
visits; these visits are within the appropriate outpatient, inpatient or clinic/office categories.
These files can answer the following questions:
• In what type of settings did patients receive care?
• What are patient characteristics (counts by age, gender, and geographic area of residence) for
individual settings?
• How much total health care spending was there for health care provided in particular settings?
• How many patients received care in a particular setting?
• Among all patients that received care in a particular setting, what was the average and median
total amount paid by the patient and insurance carrier?
Basic Features of Public Use Files
MDH prepared documentation for each data file consisting of a description of the data, a data
dictionary, background on the derivation of data files and a set of summary statistics. This
documentation is available online.12 High-level technical detail on the unit of analysis, data quality,
definitions and potential limitations of claims data is provided in this section.
Data Aggregation
In order to produce meaningful information without identifying individual patients, payers, or providers,
the data in the PUFs were aggregated into groups by geography and age. The level of aggregation for
each PUF is carefully selected to balance the need for detailed information with the required privacy and
confidentiality protections. The level of aggregation may depend on the specific PUF and its intended
uses. For the first set of PUFs, geography is reported at the 3-digit ZIP code level; future PUFs with
metrics that are broader than services or diagnoses may aggregate the data at the county level or just
distinguish between rural and urban settings.
For the first set of PUFs, patient age is grouped into three categories: “Children & Youth (<18),” “Adult
(18-64),” and “Older Adult (65+).” Other age groupings are possible, particularly for PUFs
whose primary metric produces fewer categories. The first set of PUFs does not distinguish data by
gender, nor does it aggregate data by broad payer type, such as Medicare, Medicaid and commercial.
Future PUFs, some of which will provide summary data from MDH research reports, may make different
tradeoffs between demographic variables and the analytic measures.
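The grouping described above can be sketched as follows. This is a minimal illustration, not MDH's actual processing: the record layout (`zip`, `age`, `service_code`) is hypothetical, and only the three age categories and the 3-digit ZIP code level come from this document.

```python
from collections import Counter


def age_group(age: int) -> str:
    """Map an age in years to the three categories used in the first PUFs."""
    if age < 18:
        return "Children & Youth (<18)"
    if age <= 64:
        return "Adult (18-64)"
    return "Older Adult (65+)"


def zip3(zip_code: str) -> str:
    """Keep only the first three digits of a ZIP code."""
    return zip_code.strip()[:3]


# Hypothetical service-line records; the field names are assumptions.
service_lines = [
    {"zip": "55401", "age": 34, "service_code": "99213"},
    {"zip": "55455", "age": 17, "service_code": "99213"},
    {"zip": "55414", "age": 52, "service_code": "99213"},
    {"zip": "56301", "age": 70, "service_code": "99213"},
]

# Count service lines per (3-digit ZIP, age group, service code) cell.
cell_counts = Counter(
    (zip3(line["zip"]), age_group(line["age"]), line["service_code"])
    for line in service_lines
)
```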
File Size and Complexity
The size and complexity of the PUFs will vary, depending on the level of detail that is included. The level
of detail for each file is determined by considering the type of information included, the potential uses
of the information, and appropriate protection of patient, provider and payer identities. Due to licensing
restrictions, some files may require that users provide their own secondary data to translate or group
certain diagnosis or procedure codes.
Protection against Re-Identification
The underlying data from the MN APCD are de-identified. In addition, each PUF is structured to further
protect patient privacy by rolling up health care transactions for individual patients to summary levels
without the possibility for reversal using third-party data. Similarly, the PUFs mask provider and health
insurance carrier identity by aggregating the data. In some cases, the number of individuals or providers
remains too small even after the data have been rolled up. In these cases, the data are removed from
the published PUF. Each published PUF is accompanied by a set of summary statistics and control totals
to assist users in understanding the impact of redacted data.
12 http://www.health.state.mn.us/healthreform/allpayer/publicusefiles/about.html
Specifically, the first set of PUFs aggregates data to ensure that any combination of characteristics is
associated with at least 11 distinct individuals in the dataset. These characteristics include patient age,
geographic area of residence, and type of service or diagnosis. MDH maintains this policy for the PUFs
to prevent the re-identification of any individual in a particular region or demographic group that may
have a rare condition or treatment. While data for these individuals remain in the PUF dataset, they are
aggregated into larger groups in such a way that the individuals cannot be re-identified.
In order to protect the identity of individual health care providers, the first set of PUFs excludes
any record associated with fewer than 20 health care providers, determined by the number of
unique National Provider Identifiers (NPIs) associated with any combination of characteristics.
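The two minimums described above (at least 11 distinct individuals and at least 20 distinct NPIs per combination of characteristics) can be sketched as a simple cell check. The record layout and function name below are hypothetical, and the sketch only flags cells that fall short. It does not reproduce how MDH subsequently rolls up or removes them.

```python
from collections import defaultdict

MIN_PATIENTS = 11   # minimum distinct individuals per cell, as stated in this document
MIN_PROVIDERS = 20  # minimum distinct NPIs per cell, as stated in this document


def split_cells(records):
    """Separate cells that meet both minimums from cells that do not."""
    patients = defaultdict(set)
    providers = defaultdict(set)
    for rec in records:
        cell = (rec["zip3"], rec["age_group"], rec["code"])
        patients[cell].add(rec["patient_id"])
        providers[cell].add(rec["npi"])
    publishable, too_small = [], []
    for cell in patients:
        meets_minimums = (len(patients[cell]) >= MIN_PATIENTS
                          and len(providers[cell]) >= MIN_PROVIDERS)
        (publishable if meets_minimums else too_small).append(cell)
    return publishable, too_small


records = []
# Cell A: 25 patients seen by 25 providers (meets both minimums).
for i in range(25):
    records.append({"zip3": "554", "age_group": "Adult (18-64)", "code": "99213",
                    "patient_id": f"p{i}", "npi": f"n{i}"})
# Cell B: 30 patients but only 5 providers (fails the provider minimum).
for i in range(30):
    records.append({"zip3": "563", "age_group": "Adult (18-64)", "code": "99213",
                    "patient_id": f"q{i}", "npi": f"m{i % 5}"})

publishable, too_small = split_cells(records)
```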
Data Validation and Quality Assurance
Data validation and quality assurance checks on MN APCD data occur at every stage of development,
from payers’ submission of the data to the data’s inclusion in the PUF. The data are rich and complex,
and provide a comprehensive picture of the cost, delivery and utilization of health care services in
Minnesota. In order to achieve this richness, the MN APCD must collect and consolidate many different
claim records. In support of several recent research studies, including assessments of pediatric health
care and state-based risk adjustment methodologies, independent contractors employed by MDH
evaluated the quality of MN APCD data and found it to be a high-quality, reliable data source for their
projects.13
Health care claim data are submitted to the MN APCD by public programs, private health plan
companies, TPAs, and PBMs; as such, the database represents nearly all insured Minnesota residents
and health care services that are considered covered benefits. All payers in the state are required to
submit data for their membership to the MN APCD, except for health plan companies and TPAs that paid
less than $3 million in medical (institutional and professional) claims during the submission year, as well
as PBMs that paid less than $300,000 in pharmacy claims.
Additional Technical Information & Potential Limitations
What Is a Claim? A medical provider who treats a patient with insurance coverage submits a bill to the
insurance company. Most bills, or “claims,” are sent electronically to the insurance company. Upon
receipt, the insurance company’s processing systems review the bill, determine whether the claim can
be paid and if so, the amount owed by the plan and the patient’s share. In some instances, this process
may be repeated and a new or revised claim could be submitted for reasons such as:
• Changes in the patient’s insurance coverage
• Corrections or additions to the service information
• Coverage that requires more than one insurance company to pay the bill
13 See the list of existing publications for additional detail:
http://www.health.state.mn.us/healthreform/allpayer/publications.html
These normal business activities can produce multiple versions of a single claim. When insurance
companies submit data to the MN APCD, on occasion more than one version could be passed along.
Minnesota’s data aggregation vendor uses sophisticated, extensive algorithms to identify the different
versions of a claim, resolve all duplicate records, and retain a consolidated, true claim record that
provides the most complete and accurate description of the service event possible.
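The vendor's consolidation algorithms are not described in this document. Purely as an illustration of the general idea, the sketch below keeps the highest-numbered version of each service line. The fields `claim_id`, `line`, `version` and `paid` are hypothetical, and real submissions may not carry a usable version number at all.

```python
def consolidate(records):
    """Keep one record per (claim_id, line): the one with the highest version."""
    latest = {}
    for rec in records:
        key = (rec["claim_id"], rec["line"])
        if key not in latest or rec["version"] > latest[key]["version"]:
            latest[key] = rec
    return sorted(latest.values(), key=lambda r: (r["claim_id"], r["line"]))


# Hypothetical submissions: claim C1 was adjusted once after the original payment.
records = [
    {"claim_id": "C1", "line": 1, "version": 1, "paid": 100.0},
    {"claim_id": "C1", "line": 1, "version": 2, "paid": 80.0},
    {"claim_id": "C2", "line": 1, "version": 1, "paid": 50.0},
]

consolidated = consolidate(records)
total_paid = sum(rec["paid"] for rec in consolidated)
```

Without consolidation, summing all three records would overstate spending on claim C1.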
The Potential Impact of De-identified Patient Information: The MN APCD protects patient privacy by
collecting only de-identified patient information. Prior to submitting every file, each submitter masks the
name, birth date, and address of every patient. Processing claims data this way creates the potential for
some double counting of services and some inflation in the number of distinct patients recorded in the
MN APCD, because slight variations in how, for example, the name is recorded across a set of health
care payers could produce different unique patient IDs. De-identification of patient records,
implemented to ensure privacy protection, masks information that could be used to group related
records in some instances.
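The double-counting effect described above can be illustrated with a one-way hash. This document does not specify how submitters mask identifiers, so the salted SHA-256 approach, the function name, and the sample values below are all assumptions made for illustration.

```python
import hashlib


def masked_patient_id(name, birth_date, address, salt="example-salt"):
    """Derive a de-identified ID from identifying fields (illustrative only)."""
    raw = "|".join([name, birth_date, address, salt])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# The same (fictional) person as recorded by two different payers.
id_payer_a = masked_patient_id("Jon Smith", "1970-01-01", "1 Main St")
id_payer_b = masked_patient_id("Jonathan Smith", "1970-01-01", "1 Main St")

# A small difference in the recorded name yields an unrelated ID, so the two
# records cannot be recognized as belonging to one patient.
same_patient_detected = id_payer_a == id_payer_b
```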
Data Intake: The MN APCD requires payers to submit data files at least every six months (in January and
July of each year) that contain information about claims paid during that six-month time period. All data
provided to the MN APCD must pass rigorous quality checks prior to inclusion in the data warehouse
and subsequent data extract. These quality checks are part of MDH’s data aggregation vendor’s
standard validation process, which begins with verification that the data meet minimum thresholds for
completeness, adhere to standard formats and code sets, and pass quality validations that require
relationships between data elements to be consistent and logical. For any data element that does not
pass these initial checks, MDH’s vendor works with the submitter to understand and correct any
problems.
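The kinds of checks described above (completeness thresholds and logical relationships between data elements) might look like the following sketch. The required fields, the 95% threshold, and the paid-date rule are hypothetical examples, not the vendor's actual validation rules.

```python
from datetime import date

REQUIRED_FIELDS = ("claim_id", "service_date", "paid_date", "paid_amount")
MIN_COMPLETENESS = 0.95  # hypothetical threshold, not taken from this document


def completeness(records, field):
    """Share of records in which the field is populated."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if rec.get(field) not in (None, ""))
    return filled / len(records)


def validate(records):
    """Return a list of human-readable problems found in a submission."""
    problems = []
    for field in REQUIRED_FIELDS:
        rate = completeness(records, field)
        if rate < MIN_COMPLETENESS:
            problems.append(f"{field}: only {rate:.0%} populated")
    for rec in records:
        service, paid = rec.get("service_date"), rec.get("paid_date")
        # Example of a logical relationship: a claim is paid on or after the service date.
        if service and paid and paid < service:
            problems.append(f"{rec.get('claim_id')}: paid before service date")
    return problems


submission = [
    {"claim_id": "A1", "service_date": date(2013, 3, 1),
     "paid_date": date(2013, 4, 1), "paid_amount": 120.0},
    {"claim_id": "A2", "service_date": date(2013, 5, 10),
     "paid_date": date(2013, 5, 1), "paid_amount": None},
]

problems = validate(submission)
```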
PUF Data Source: MDH, with the help of the state’s data aggregation vendor, developed the PUFs
through a data extract from the MN APCD that summarizes and aggregates all the service lines in the
MN APCD. This data extract consolidates all the records in the MN APCD at a service-line level, with each
single service or procedure represented by a single record. This file creates a one-time “snapshot” of
the claims in the MN APCD. Because they are built from an extract of the full MN APCD dataset, the PUFs
retain the same rigorous data quality standards as the MN APCD.
Data Notes to the User
As users begin working with PUFs, they should consider the nature of claims data and the following
notes when drawing conclusions from their research using PUFs.
1. De-identification processes may produce multiple records for a particular service.
2. Data submitters are required to submit every paid claim and subsequent adjudications.
Submitters may or may not use rules that allow straightforward identification of prior versions
of the same claim. While every effort has been made to eliminate multiple versions of the same
service and payment record, including by flagging duplicates, data submitters’ systems may not
provide sufficient information to support consolidation in every instance.
3. Minnesota’s “Prompt Payment” law requires insurance companies to pay providers as long as
the minimum necessary information is provided.14 In some cases, data elements required by, or
useful to, the work of the MN APCD may not be available for every record, and it is not clear
that payers in all instances resubmit fully adjudicated claims after further adjustments take
place.
4. Certain categories of coverage are not reported in the MN APCD:
a. Insurers that pay less than $3,000,000 in medical claims or $300,000 in pharmacy claims
for members residing in Minnesota are not required to submit files.
b. The MN APCD does not include information on persons insured through Tricare, the
Indian Health Service, or the Veterans Affairs Department.
5. The PUFs do not provide information on Minnesota’s uninsured population.
6. Some costs (e.g. withholds and incentive payments) may not be part of the PUFs, because
they are not incorporated in the claims stream.
7. Other costs that are not considered payment for health care services (e.g. teaching and
education) are included in some instances, because they are built into the payment formulas for state
and federal public programs.
8. Claims data are only as good as the coding of medical visits. In other words, claims data can only
speak to conditions and health care services that were appropriately and completely recorded at
the time of billing. Researchers should consider that trends over time may be driven by changes
in coding practices.
14 Laws of Minnesota 2015, chapter 62Q, section 75
Users are solely responsible for analysis of this data and any conclusions or decisions made based on the
PUFs. However, MDH encourages users to contact the MN APCD team at [email protected] with
questions, and MDH is open to providing technical assistance as much as possible based on available
staff resources. In addition, MDH will work to update available user information based on its own
research, as well as research by other PUF users.
How Can Potential Data Users Access the Public Use Files?
The PUFs are available to the general public upon request. Potential data users or interested parties can
use an online form to issue a request and coordinate the logistics of obtaining data files. In order to
gather users’ input on MDH’s strategy for future PUF expansions and to assure that users are best
equipped to effectively use the data, MDH will seek to maintain contact with individuals and
organizations that obtained PUFs.
Requesting the Public Use Files
One or more PUFs can be obtained by completing the PUF Data Request Form available at:
http://www.health.state.mn.us/healthreform/allpayer/publicusefiles/request.html.
• The form collects users’ contact information so that the MN APCD team can stay connected in
order to understand users’ experience with the PUFs and offer technical assistance.
• The form also asks users to confirm that they have read and understood relevant contextual
information regarding the appropriate use of the PUFs.
Completed forms should be sent via email to the MDH MN APCD team at [email protected].
MDH will then coordinate the logistics for exchanging the requested PUF(s) with the user.
Questions and Feedback
MDH values users’ feedback, as it will help inform future iterations of the PUFs. Users are encouraged to
provide feedback on their experience accessing, obtaining, and using the PUFs by emailing MDH at
[email protected]. In addition, MDH will distribute a short survey to users after they have
obtained and used the data.