
Contract No.: 86824
MPR Reference No.: 6218-700

Estimating the Proportion of Health-Related Websites Disclosing Information That Can Be Used to Assess Their Quality

Final Report

May 9, 2006

Margaret Gerteis
Anna Katz
Davene Wright
Frank Potter
Margo Rosenbach

Submitted to:
Office of Disease Prevention and Health Promotion
Suite LL, 1101 Wootton Parkway
Rockville, MD 20852
Project Officer: Cynthia Baur

Submitted by:
Mathematica Policy Research, Inc.
955 Massachusetts Ave., Suite 801
Cambridge, MA 02139
Telephone: (617) 491-7900
Facsimile: (617) 491-8044
Project Director: Margaret Gerteis
ACKNOWLEDGEMENTS
The authors wish to acknowledge Hitwise—Real-Time Competitive Intelligence
(www.hitwise.com), which provided the data on Internet traffic to health-related
websites from which we generated the sample for our analysis, and Mark Mazzacano,
Senior Business Development Manager at Hitwise USA, Inc., for his valuable assistance.
We also wish to acknowledge the contribution of the many other members of the MPR
project team not listed on the cover page of this report: Julie Ladinsky helped us think
through the review process, draft the protocols, and conduct the pretest. Lisa Trebino
undertook the arduous task of reviewing, with care and diligence, over 50 health websites.
Mary Laschober pitched in to help us analyze the data we collected and critically reviewed
our reported findings. John Hall and Janice Ballou critically reviewed our proposed sampling
options and review protocols, and Yuhong Zheng helped generate the subsamples we used
for data collection. Eileen Curley formatted and produced all of our reports, including this
one, and Jane Retter, Leah Hackelman, and Walt Brower edited them.
Above all, we wish to thank Cynthia Baur, our project officer at the Office of Disease
Prevention and Health Promotion, for her timely help, guidance, and patience throughout
this project, as, together, we worked through difficult questions of interpretation and
judgment.
CONTENTS

EXECUTIVE SUMMARY

INTRODUCTION

METHODOLOGY
   REVIEWING BACKGROUND MATERIALS
   DEFINING THE DENOMINATOR
   SAMPLING
   DRAFTING AND TESTING PROTOCOLS
   COLLECTING AND ANALYZING DATA FOR THE BASELINE ANALYSIS
      Selecting Health Content for Review
      Collecting and Validating Baseline Data
      Scoring and Analyzing Data

FINDINGS
   SUMMARY ESTIMATES OF COMPLIANCE
   ELEMENTS OF COMPLIANCE BY CRITERIA
      Identity
      Purpose
      Content
      Privacy
      User Feedback/Evaluation
      Content Updating

DISCUSSION

APPENDIX A: REPORT OF FINDINGS FROM WEBSITES EVALUATION PRETEST
APPENDIX B: WEBSITES EVALUATION REVISED PROTOCOL
APPENDIX C: BASELINE ESTIMATES OF COMPLIANCE

TABLES

1   COMPARISON OF WEBSITES IN UNIVERSE AND BASELINE SAMPLE, BY TYPE OF SITE
2   INELIGIBLE SAMPLE WEBSITES, BY REASON FOR INELIGIBILITY
3   SAMPLE ELIGIBILITY, BY SELECTED WEBSITE CHARACTERISTICS
4   REQUIRED ELEMENTS FOR SCORING AND OPTIONAL ELEMENTS
5   ESTIMATES OF COMPLIANCE BY CRITERION AND DISCLOSURE ELEMENT
C1  ESTIMATES OF COMPLIANCE WITH CRITERIA AND ASSOCIATED ELEMENTS OF DISCLOSURE, ALL HEALTH WEBSITES
C2  ESTIMATES OF COMPLIANCE WITH CRITERIA AND ASSOCIATED ELEMENTS OF DISCLOSURE, BY STRATUM

FIGURES

1   ESTIMATES OF COMPLIANCE FOR ALL HEALTH WEBSITES AND FREQUENTLY VISITED SITES, BY NUMBER OF CRITERIA IN COMPLIANCE
2   ESTIMATES OF COMPLIANCE FOR ALL HEALTH WEBSITES AND FREQUENTLY VISITED SITES, BY CRITERION
3   COMPLIANCE OF FREQUENTLY VISITED SITES AND REMAINDER SITES WITH IDENTITY CRITERION, BY NUMBER OF DISCLOSURE ELEMENTS IN COMPLIANCE
4   COMPLIANCE OF FREQUENTLY VISITED SITES AND REMAINDER SITES WITH PURPOSE CRITERION, BY NUMBER OF DISCLOSURE ELEMENTS IN COMPLIANCE
5   COMPLIANCE OF FREQUENTLY VISITED SITES AND REMAINDER SITES WITH CONTENT CRITERION, BY NUMBER OF DISCLOSURE ELEMENTS IN COMPLIANCE
6   COMPLIANCE OF FREQUENTLY VISITED SITES AND REMAINDER SITES WITH PRIVACY CRITERION, BY NUMBER OF DISCLOSURE ELEMENTS IN COMPLIANCE
7   COMPLIANCE OF FREQUENTLY VISITED SITES AND REMAINDER SITES WITH UPDATING CRITERION, BY NUMBER OF DISCLOSURE ELEMENTS IN COMPLIANCE
EXECUTIVE SUMMARY
Widespread and growing use of the Internet as a medium for disseminating and
gathering information has raised concerns about users’ ability to assess the quality
of the health and medical information presented on Internet websites. The Office
of Disease Prevention and Health Promotion (ODPHP) has identified six types of
information that should be publicly disclosed to users of health-related websites—including
information on the identity of the website sponsors (Identity), the purpose of the site
(Purpose), the source of the information provided (Content and Content Development),
policies for protecting the confidentiality of personal information (Privacy), how the site
solicits user feedback and is evaluated (User Feedback/Evaluation), and how the content is
updated (Content Updating). As part of the Healthy People 2010 initiative, ODPHP has
established a national objective to increase the proportion of health-related websites that
disclose information consistent with these six criteria (Communication Objective 11-4).
Mathematica Policy Research, Inc. (MPR), under contract to ODPHP, has developed, tested,
and implemented a methodology for estimating the proportion of health websites that
disclose information consistent with the identified criteria.
METHODS
We defined “health-related website” to include websites associated with a wide variety
of sponsoring organizations that provide information for staying well, for preventing and
managing disease, and for making decisions about health, health care, health products, or
health services. Using information generated by Hitwise, a commercial vendor that tracks
Internet traffic, we identified 3,608 health-related websites that had been visited by Internet
users in the United States during October 2005. We then stratified the 3,608 websites into
two groups—(1) the “target stratum” of the 213 most-frequently-visited sites, which account
for 60 percent of all visits; and (2) the 3,395 sites in the “remainder”—and drew a simple
random sample from each stratum. Our final sample of 102 websites included 52 from the
target stratum and 50 from the remainder.1
We developed technical specifications for determining compliance with each of the six
disclosure criteria, enumerating the disclosure elements required for compliance as well as the requirements for their accessibility to users (in most cases, within two clicks of the home page). We then drafted
and pretested a data collection instrument on a subsample of 10 websites, and revised the
protocols based on the findings. Two reviewers then evaluated the websites from the final
sample; 24 websites were independently evaluated by both reviewers (to assess inter-rater
reliability), and 78 websites were singly reviewed. Once the data were collected, cleaned, and
validated, we coded all responses for scoring and analysis. We determined compliance at the
criterion level and for disclosure elements subsumed under each criterion.
FINDINGS
None of the 102 websites reviewed for this analysis met all six of the disclosure criteria
enumerated in Healthy People 2010 Objective 11-4, and only 6 complied with more than three
criteria. Figure A displays the frequency of compliance for the whole sample, the sites most
frequently visited, and the remainder, by the number of criteria in compliance.
Figure A. Estimates of Compliance for All Health Websites and Frequently Visited Sites,
by Number of Criteria in Compliance
[Bar chart: percent of websites in compliance, by number of criteria in compliance (none through six), shown separately for all health websites and frequently visited sites.]

Source: Computations by Mathematica Policy Research, Inc.

Note: All percentages shown are weighted.
1 We selected in each stratum a larger equal probability sample than we expected to need in order to
replace sites found to be ineligible. Of the 150 sites in the larger sample, 48 were ineligible, leaving a final
sample of 102.
Of the six criteria, Privacy was met most often, followed by User Feedback/Evaluation.
The lowest levels of compliance were on the Content and Content Development criterion
and the Content Updating criterion, both of which required specific disclosure elements on
three randomly selected items of health content. Across all six criteria, a somewhat higher
proportion of websites from the “target” stratum most frequently visited were compliant
than were those drawn from the remainder.
Figure B displays the frequency of compliance for the whole sample, the target stratum
of sites most frequently visited, and the remainder, by each of the six criteria.
Figure B. Estimates of Compliance for All Health Websites and Frequently Visited Sites,
by Criterion
[Bar chart: percent of websites in compliance with each criterion (Identity, Purpose, Content, Privacy, User Feedback, Updating), shown separately for all health websites and frequently visited sites.]

Source: Computations by Mathematica Policy Research, Inc.

Note: All percentages shown are weighted.
DISCUSSION AND IMPLICATIONS
There was a noteworthy lack of consistency in how or where websites disclosed
information relating to the criteria. The disclosure elements reported here on which
compliance was high are indicative of the few conventions in practice that have emerged to
convey information about identity, privacy, and purpose, and to differentiate advertising
content from other information. However, no such conventions govern the disclosure of
other critical pieces of information—notably, information on sources of funding, editorial
oversight, authorship, or dating of information.
The same qualities that make the Internet appealing as a medium to search for
information—the ability to navigate quickly through multiple pages, sites, and sources—also
complicate the task of disclosure. Many websites provide ready access to health information
from a variety of different sources, but very few consistently disclose information on
authorship or content updating on randomly selected items of health content. It is also
unclear whether stated policies found through links to affiliated sites are intended to apply to
the home site. While advertising may be clearly labeled, hyperlinks to what appears to be
health information sometimes take the user to commercial promotions.
The small sample size of this study limits the reliability of several of our baseline
estimates and our ability to detect statistically significant progress on later assessments.
Notwithstanding these limitations, this baseline estimate of health websites’ compliance with
the disclosure criteria clearly identifies both the areas on which some progress has been
made and those on which future improvement efforts must focus. The marginally better
performance among the websites most frequently visited suggests that some of the
conventions in practice that will improve disclosure in the future may be emerging. A
qualitative analysis of the practices used by the better-performing websites could offer useful
insights and guidance for improvement.
INTRODUCTION
Widespread and growing use of the Internet for disseminating and gathering
information has raised concerns about users’ ability to assess the quality of the
health and medical information presented on Internet websites. The Office of
Disease Prevention and Health Promotion (ODPHP) has created a national objective as part
of the Healthy People 2010 initiative to “increase the proportion of health-related World Wide
Web sites that disclose information that can be used to assess the quality of the site”
(Objective 11-4 in the Health Communication Focus Area).
In the absence of existing national data related to this objective, ODPHP has
recognized the need to create a methodology that would make the objective measurable and
to develop a baseline estimate against which progress can be measured over time. This in
turn requires (1) a reliable estimate of the total number of health-related websites (the
denominator); (2) consensus about what information should be disclosed to users to assess
website quality (disclosure criteria); and (3) a reliable estimate of the number of health-related websites that disclose this information (the numerator).
The ODPHP and the Objective 11-4 Technical Expert Workgroup have made
substantial progress on the second requirement, having identified six types of information
that should be publicly disclosed to users of health-related websites: (1) information on the
identity of the website sponsors, (2) the purpose of the site, (3) the authorship or source of
the health information provided, (4) policies for protecting the confidentiality of users’
personal information, (5) how the site is evaluated, and (6) how often the health content is
updated. The ODPHP subsequently contracted with Mathematica Policy Research, Inc.
(MPR) in September 2005 to develop, test, and implement a methodology for estimating the
proportion of health websites that disclose information consistent with the identified criteria.
This project had two main objectives:
• To develop and test a methodology for assessing website quality that (a) reflects the way lay consumers actually use the Internet to seek health information; (b) is credible, feasible, and replicable; and (c) is consistent with other recognized and established online health quality initiatives
• To establish a baseline estimate of the proportion of health websites that comply
with identified disclosure criteria, using this methodology
In this report, we (1) describe our methodological approach, (2) present baseline
estimates and other key findings, and (3) discuss implications of findings and study
limitations as they relate to Healthy People 2010 Objective 11-4. A technical manual for
conducting the website review is provided separately.
METHODOLOGY
We undertook the following six research tasks for this project:
1. Review background materials from ODPHP’s prior work on Objective 11-4 and
from related online health quality initiatives
2. Finalize a methodology for determining the denominator of health websites
3. Develop technical specifications that identify discrete elements required to
comply with the disclosure criteria
4. Create protocols and instruments for reviewing and scoring health websites
5. Conduct a preliminary test of the protocols and revise them, based on findings
6. Collect and analyze data for the baseline analysis
In the following sections, we describe our methodological approach to these tasks.
REVIEWING BACKGROUND MATERIALS
To assure that our working assumptions appropriately reflected the prior work of the
Objective 11-4 Technical Expert Workgroup, we reviewed background materials the
ODPHP Project Officer provided:
• “Measuring Healthy People 2010 Objective 11-4: Disclosure of Information to
Assess the Quality of Health Web Sites; Discussion Paper,” prepared by Carol
Cronin, Consultant to ODPHP, September 15, 2004
• “Measurement Options and Approaches; Defining the Denominator – An
Estimate of the Number of Health Web Sites,” prepared by Carol Cronin,
Consultant to ODPHP, September 23, 2004
• “Measuring Healthy People 2010 Objective 11-4: Disclosure of Information to
Assess the Quality of Health Web Sites,” September 30, 2004, Technical Expert
Workgroup Meeting Summary
We also examined the standards and protocols produced by the two organizations that
have developed processes to review and assess the quality of health websites, to assure that
our definitions of terms, identified disclosure elements and criteria, and language were
consistent with those of these established and recognized quality initiatives. These included:
• URAC Health Web Site Standards, Versions 1.0 (©2004) and 2.0 (©2006)
• Health Improvement Institute’s health information rating instrument, Version
2.0 (revised April 16, 2005), prepared for Consumer Health WebWatch
This review assured that the assumptions guiding our work and our definitions of terms
and concepts were consistent with evolving standards in the field.
DEFINING THE DENOMINATOR
Several assumptions guided our definition of the universe of health websites on the
Internet (the “denominator”) and the sampling strategy for this project. First, we assumed
that many (if not most) Internet users search for health information using standard search
engines, such as Yahoo or Google, and then visit several different websites identified.
Second, we assumed that topics of interest, and therefore search terms, will vary, depending
not only on individual interests, but also on health issues in the news. Thus, while a small
number of heavily used consumer-oriented health information websites may address
common areas of interest or serve as portals to specialized sources of information, any given
search might in fact lead the consumer to a large number of websites designed for many
different audiences and many different purposes. We therefore defined “health-related
website” in inclusive terms consistent with those suggested by URAC, Consumer Health
WebWatch, and the eHealth Code of Ethics, to include websites associated with a wide
variety of sponsoring organizations that provide information for staying well, for preventing
and managing disease, and for making decisions related to health, health care, health
products, or health services.1

1 The e-Health Code of Ethics defines “health information” as follows: “Health information includes information for staying well, preventing and managing disease, and making other decisions related to health and health care. It includes information for making decisions about health products and health services. It may be in the form of data, text, audio, and/or video. It may involve enhancements through programming and interactivity.” Accessed June 19, 2005 at [www.estrategy.com]. This definition was also adopted by Risk, A., and Dzenowagis, J. “Review of Internet Health Information Quality Initiatives.” Journal of Medical Internet Research, vol. 3, no. 4, 2001.
For the purpose of defining the denominator, we sought to enumerate an inclusive
universe of health websites visited by Internet users. We therefore relied on information
generated by a commercial vendor (Hitwise) that tracks Internet traffic, in lieu of smaller or
more selective lists available from other sources. Hitwise uses information from Internet
service providers to monitor traffic and categorizes websites with a minimum number of
visits by market type, based on their subject matter and content.2 Market share is then
reported based on the number of visits to each website within the market, the number of
pages viewed per visit, and the amount of time spent per visit.
We purchased data from Hitwise on websites classified in its Health and Medical market
category, based on Internet traffic in the United States during October 2005.3 Although
information was also available on subgroups within this market (including the Health
Information subgroup), we defined the “universe” for this project as all 3,608 websites
within the overall Health and Medical category, in order not to exclude websites that might
contain health information consistent with our working definition. Of these 3,608 websites,
213 accounted for 60 percent of all visits within this market. The dataset we purchased from
Hitwise included the following information for all websites: domain name, ranking (from 1
to 3,608) within the Health and Medical category (based on number of visits), percentage of
market share by visits, percentage of market share by number of page views, and average
length of each visit (expressed in minutes:seconds). We used market share defined by
number of visits to identify the heavily trafficked sites, since page views and the average
length of visits could reflect navigational difficulties rather than user interest. Hitwise also
provided domain names and limited information on market share of the top 100 sites within
each of 11 subcategories within the Health and Medical category. 4 We used this information
to assure diversity in our sample.
2 Hitwise monitors visits to primary domain names or sub-domain names only (that is, the initial part of a URL address up until the first slash). Such URL addresses typically direct users to a website’s homepage. Hitwise does not track visits to more specific websites within a domain or sub-domain.

3 Hitwise excluded from its listing any websites that accounted for less than 0.01 percent of Internet traffic within the Health and Medical category during October 2005.

4 Hitwise also reports information on subcategories within each of its larger market categories. The 11 subcategories in the Health and Medical category are Information, Research, Well-being, Primary and Specialist, Pharmaceutical, Pharmacies, Paramedical, Organization, Health Insurance, Hospitals, and Alternative Medicine.

SAMPLING

For practical purposes, given budgetary and time constraints, we were limited to a review of about 100 websites to develop baseline estimates of the proportion of health websites in compliance with the disclosure criteria. In collaboration with the ODPHP Project Officer, we reviewed sampling options and chose a strategy that would balance ODPHP’s interests in describing the universe of health-related websites on the one hand, and sites that account for most of the web traffic on the other. The selected strategy called for stratifying the 3,608 websites from the Hitwise database into two groups—(1) the “target
stratum” of the 213 sites most frequently visited (accounting for 60 percent of all visits), and
(2) the 3,395 sites in the “remainder”—and then drawing a simple random sample of 50
websites from each stratum. While this option would provide less precision for the sample
overall than an unstratified simple random sample, it allowed us to achieve greater precision
for the target stratum, while retaining a reasonable level of precision for the remainder.
This method was also selected because it yields a sample that is representative of the universe of all health websites in the baseline period, which better supports ODPHP’s need to track changes over time.
We also controlled the sample selection by using a sequential selection procedure and
sorted the sampling frame by two factors: (1) the number of “top 100” subcategory lists that
a website was on, and (2) the type of website (for profit, nonprofit, government, or foreign).5
We selected in each stratum a larger equal probability sample than we expected to need, in
order to replace sites found to be ineligible. We then randomly partitioned this larger sample
into subsamples of five (called waves). The random partitioning took into account the
original sorting of the sample to ensure that the sample was diverse on the two sorting
factors. We then released waves as needed throughout the data collection effort to replace
ineligible sites. A comparison of the types of sites in the “universe” and the baseline sample
of 150 (before exclusion of ineligible sites) is shown in Table 1.
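To make these selection mechanics concrete, the sketch below illustrates one way an equal-probability sequential selection from a sorted frame, followed by random partitioning into waves of five, could be implemented. This is a simplified illustration in Python, not the procedure MPR actually ran; the record fields (stratum, n_lists, site_type) are hypothetical stand-ins for the Hitwise frame and the two sorting factors.

```python
import random

def draw_waves(frame, stratum, sample_size, wave_size=5, seed=2005):
    """Draw an equal-probability sample from one stratum and partition it into
    release waves, keeping each wave diverse on the two sorting factors."""
    rng = random.Random(seed)
    # Sort the stratum by the two control factors described in the text:
    # number of "top 100" subcategory lists and type of site.
    sites = sorted(
        (s for s in frame if s["stratum"] == stratum),
        key=lambda s: (s["n_lists"], s["site_type"]),
    )
    # Sequential (systematic) selection with a random start gives every site the
    # same selection probability while spreading the sample across the sort order.
    step = len(sites) / sample_size
    start = rng.uniform(0, step)
    sample = [sites[int(start + i * step)] for i in range(sample_size)]
    # Deal the ordered sample into waves round-robin, so each wave of five spans
    # the sorted frame and stays diverse on both sorting factors.
    n_waves = -(-sample_size // wave_size)  # ceiling division
    return [sample[i::n_waves] for i in range(n_waves)]
```

Releasing the waves one at a time, as described above, allows ineligible sites to be replaced during data collection without redrawing the entire sample.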
Table 1. Comparison of Websites in Universe and Baseline Sample, by Type of Site

                             Universe               Baseline Sample
Type of Site(a)         Number     Percent       Number     Percent
Total                    3,608      100.0           150       100.0
For Profit               2,650       73.4           114        76.0
Nonprofit                  674       18.7            17        11.3
Government                 126        3.5            11         7.3
Foreign                    158        4.4             8         5.3

Source: Hitwise—Real-Time Competitive Intelligence (www.hitwise.com). Analysis by Mathematica Policy Research.

a Type of Site: "For Profit" sites include domains ending in .com, .net, and .biz. "Nonprofit" sites include domains ending in .org, .edu, and .info. "Government" sites include domains ending in .gov, .mil, .us, and .int. "Foreign" sites include domains ending in a foreign country's suffix (e.g., .fr, .uk, .au).
Prior to releasing the subsamples for baseline data collection, the MPR review supervisor examined each website in each subsample to identify sites that were inoperable, inaccessible, or otherwise not appropriate for review. Consistent with our working definition of “health-related websites,” we included all accessible sites with at least three items of health information content, as broadly defined by the eHealth Code of Ethics. Of 150 sites in the baseline sample, 48 (32 percent) were found to be ineligible, most often because they lacked sufficient health information content6 or because access to the sites or to health information on the site was restricted (Table 2).7 The final sample size was 102.

5 We used domain names as proxy indicators of website type, because it was not feasible for reviewers to make a more thorough investigation. However, we recognize that domain extensions may not accurately reflect profit status or the country of origin.
Table 2. Ineligible Sample Websites, by Reason for Ineligibility

Reason                                         Number
Total Ineligible                                   48
No health information content(a)                   23
Less than 3 items of health content(a)              4
Requires registration or subscription              18
Duplicate of another website in sample(b)           2
Inactive website                                    1

Source: Hitwise—Real-Time Competitive Intelligence (www.hitwise.com). Analysis by Mathematica Policy Research.

a We used the eHealth Code of Ethics definition of health information. Accessed June 19, 2005 at http://www.estrategy.com.

b Different URLs but the same content.
Table 3 shows how the final sample of eligible websites compared with the universe
(sample frame) and with the initial sample, by stratum and type of site. The table shows both
unweighted and weighted percentages. Because we oversampled the sites most frequently
visited, we have weighted all estimates to adjust for the complex sample design.
6 Examples of health-related websites that lacked sufficient health information for our purposes are sites
that listed job postings for health professionals or research grants available for health researchers, or that were
designed only to support wholesale or retail sellers of specific commercial products. However, if websites
designed for such purposes also included health-related information—for example, findings from research
grants or information about the therapeutic effects of products—they were considered eligible.
7 Examples of health-related websites with restricted access are those accessible only to members or paying subscribers who must enter an identifying log-in name and password. We also excluded sites that required users to “register” by providing personal information. However, sites that limited access to registered users but also provided some health information to nonregistered visitors were considered eligible.
Table 3. Sample Eligibility, by Selected Website Characteristics

                                                                  Eligible
                              Sample     Initial                Percent of        Weighted Percent
                              Frame      Sample      Number     Initial Sample    of Initial Sample
Total                          3,608       150         102          68.0               63.2

Stratum(a)
  Most frequently visited        213        70          52          74.3               74.3
  Remainder                    3,395        80          50          62.5               62.5

Type of Site(b)
  For profit                   2,650       114          77          67.5               68.1
  Other                        1,009        36          25          69.4               50.5

Source: Hitwise—Real-Time Competitive Intelligence (www.hitwise.com). Analysis by Mathematica Policy Research, Inc.

Note: "Percent of Initial Sample" is the unweighted percentage of eligible sites; the weighted percentage takes into account the disproportionate number of sampled sites among the most frequently visited sites.

a Stratum: "Most frequently visited" sites are those that account for 60 percent of total user visits. "Remainder" includes all other sites (that is, sites that account for 40 percent of total user visits).

b Type of Site: "For Profit" sites include domains ending in .com, .net, and .biz. "Other" includes "Nonprofit" sites (domains ending in .org, .edu, and .info), "Government" sites (domains ending in .gov, .mil, .us, and .int), and "Foreign" sites (domains ending in a foreign country's suffix, e.g., .fr, .uk, .au).
DRAFTING AND TESTING PROTOCOLS
Building on the work of ODPHP and the Objective 11-4 Technical Expert Workgroup,
and in collaboration with the ODPHP Project Officer, we drafted technical specifications
that further defined each of the six disclosure criteria: (1) Identity, (2) Purpose, (3) Content
and Content Development, (4) Privacy and Confidentiality, (5) User Feedback/Evaluation,
and (6) Content Updating. Draft technical specifications included the primary question that
each criterion was intended to address, the content or disclosure elements associated with
each, and guidelines for determining the accessibility of each disclosure element. Consistent
with the prior recommendations of the Technical Expert Workgroup, the technical
specifications required most disclosure elements to be accessible to users within two clicks
of the home page.8 To the extent possible, we also sought to capture critical elements from
the URAC and Consumer Health WebWatch standards and language.
8 However, the technical specifications require that disclosure elements relating to health content be
contiguous to items of health content, rather than the home page. See Appendix B.
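Although the reviews were conducted by hand, the "within two clicks of the home page" accessibility rule can be stated precisely as a breadth-limited search over a site's links. The sketch below is an illustrative formalization only, assuming the caller supplies get_links (returns the hyperlinks on a page) and found (tests whether a page discloses the element of interest); neither is part of the actual review protocol.

```python
from collections import deque

def within_two_clicks(home_url, found, get_links, max_depth=2):
    """Return True if any page reachable in at most max_depth clicks from the
    home page satisfies found(url), e.g., contains the disclosure element."""
    seen = {home_url}
    queue = deque([(home_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if found(url):
            return True
        if depth < max_depth:
            for link in get_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return False
```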
Based on the draft technical specifications, we drafted a data collection instrument and
protocols for the review of health websites. The protocols guided the reviewer through a
series of questions to determine compliance with disclosure criteria. We designed the
protocols to reflect, to the extent feasible, the average Internet user’s navigation of the
website, starting from the URL of the home page. We also aimed to minimize the need for
reviewers to make independent judgments about what does and does not “count” toward
compliance and to ensure consistency across reviewers by giving closed-ended response
options accompanied by clear directions and definitions of terms. In addition, we included
space in the instrument to record the URL where specific information was found in order to
facilitate validation of data, as well as a comments field to capture issues that might require
additional discussion or followup. The protocols also directed reviewers to review three
separate items of health content on each website in answering specific questions about
Content and Content Development as well as Content Updating.
We pretested the draft protocols on a sample of 10 websites broadly representative of
the pool from which the final sample of 100 sites would be drawn. The pretest sample
included 5 sites from the target stratum and 5 from the remainder, and a mix of nonprofit
(.org), commercial (.com), government (.gov), and educational (.edu) domains. Two
members of the MPR project team who would not be the primary reviewers of the full
sample conducted the pretest (1) to determine whether the protocol appropriately elicited
information about compliance with disclosure criteria from these sites, and (2) to identify
any adjustments needed for the full review. Although the pretest was designed primarily to
test the protocols, we also obtained preliminary feedback on the proposed mode of
administration for the full review. Findings from the pretest addressed the length of time
required to review each site, the mode of administration, sources of discrepancies between
reviewers, and the content and wording of the protocols. Findings from the pretest are
attached as Appendix A. The revised final protocols are in Appendix B.
COLLECTING AND ANALYZING DATA FOR THE BASELINE ANALYSIS
Selecting Health Content for Review
Because our protocols call for a review of three separate items of health-related content
to answer specific questions, the review supervisor randomly selected three items of health
content for review within each website deemed eligible. Our aim in selecting the items for
review ahead of time was to minimize selection bias that might result from a given reviewer’s
particular interests or from website sponsors’ efforts to direct users to featured content. For
each site, we used random numbers to select three items of health content from available
options, starting with hyperlinks on the home page. Any content thus reached that was
consistent with the eHealth Code of Ethics definition of health information was sampled,
including content reached through hyperlinks to other websites and stand-alone documents
in .pdf format. However, we did not include health content that was in audio or video
format.
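A minimal sketch of this selection step, assuming the hyperlinks reachable from the home page have been listed as simple records with a format field; the field names and the exclusion test are illustrative assumptions, not the review supervisor's actual procedure.

```python
import random

def select_health_content(candidate_links, n_items=3, seed=None):
    """Randomly choose n_items of health content from a site's candidate links,
    skipping audio and video formats, which were excluded from review."""
    rng = random.Random(seed)
    eligible = [link for link in candidate_links
                if link.get("format") not in ("audio", "video")]
    if len(eligible) < n_items:
        return eligible  # fewer than three items; the site may be ineligible
    return rng.sample(eligible, n_items)
```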
Collecting and Validating Baseline Data
We transferred the revised protocols to an Access database to facilitate data input,
scoring, and analysis. We trained two MPR reviewers on the use of the protocols and briefed
them on the nuances of interpretation that arose during the pretest. We then set up two
computer screens for each reviewer to allow them to view and navigate both the website
under review and the protocol at the same time.
Each reviewer then independently reviewed the same websites from the first wave of
five drawn from the stratified random sample. We assessed inter-rater reliability, identified
and resolved discrepancies, and revised the protocols or clarified definitions, where
indicated. We repeated this process of double-reviewing on two successive waves of five
websites drawn from the full sample, achieving on both waves a raw Kappa score of 0.80 (a
score generally accepted as demonstrating an acceptable degree of inter-rater reliability on
survey protocols). Thereafter, each reviewer conducted separate reviews of the remaining
sites, alternating between waves drawn from the stratum most frequently visited and the
remainder, such that each reviewer reviewed an equal number of websites from both strata.
The review supervisor was available throughout the data collection period to answer
questions or establish and clarify decision rules. In addition, one website randomly selected
from every two waves was reviewed independently by both reviewers to assure that inter-rater reliability remained high. Again, discrepancies were identified and resolved, in order to arrive at a single score for the doubly reviewed sites. In total, 24 websites from the final sample were doubly reviewed and 78 were singly reviewed. The raw Kappa score of inter-rater reliability for all doubly reviewed sites was 0.79 for all response items. The adjusted Kappa coefficient for disclosure elements that counted toward scoring was 0.81.9
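For reference, a raw (unweighted) Kappa statistic of the kind reported here can be computed from two reviewers' item-level responses as in the sketch below; this is a generic illustration of Cohen's kappa, not the code used for the study.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters' categorical responses."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in set(freq_a) | set(freq_b))
    if expected == 1.0:  # both raters used a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)
```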
Once reviewers had completed the initial baseline data collection, the MPR review
supervisor cleaned and validated the data by (1) reviewing all responses that the reviewers
had flagged with comments, (2) reviewing all “other” responses and reassigning them to
specific response categories, (3) reviewing and validating all “not applicable” responses,
(4) flagging missing responses and returning items to the reviewer for completion,
(5) reviewing all items with a “no” response where a URL was indicated, and (6) reviewing
and validating a subset of all items with a “yes” response where no URL was indicated.
Questions of interpretation that arose during this review were discussed with reviewers and
with the MPR project director, and adjustments were made to the data, as appropriate.
9 The difference between raw and adjusted Kappa coefficients reflects the fact that multiple response options could count as compliance for particular items. Thus, in a given case, reviewers might disagree on which response option applied but still agree that the item was in compliance. Although the Kappa coefficient is commonly used to assess inter-rater reliability, some statisticians have identified problems with it, including a tendency to produce low scores even when agreement is high. We also calculated inter-rater reliability using Lin's concordance correlation coefficient and found the sample concordance correlation coefficient (pc) = 0.8037, which similarly suggests moderate to substantial correlation.
Scoring and Analyzing Data
Once the baseline data were cleaned and validated, we coded all responses for scoring
and analysis. Because the pretest revealed a lack of consistency in the way websites describe
some elements, the protocol includes multiple response options for some items, any one or
combination of which may count as disclosure. We assigned one point for any response
option that would count as disclosure of a required element. For disclosure elements that
were associated with selected items of health information content, we assigned one point per
item of health content. We then determined compliance at the criterion level: if the total
score for that criterion equaled the number of required disclosure elements subsumed under
that criterion, then the site was determined to be in compliance on the given criterion. If the
total score for the criterion was less than the number of required disclosure elements for the
criterion, it was designated as noncompliant even if some of the elements were present. The
number of points needed for compliance varied by criterion, from one point for the User
Feedback/Evaluation criterion to six points for the Content Updating criterion (which
required two disclosure elements on each of the three selected items of health content). To
be fully compliant with all six criteria, a website needed to disclose 20 separate elements.
Some optional elements of interest to ODPHP that would not count toward disclosure were
also tracked but were not included in scoring. Table 4 shows the criteria, required (and
optional) disclosure elements, and the points assigned to each.
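The scoring rule described above is an all-or-nothing test against each criterion's point subtotal. The sketch below restates it in Python; the criterion names and required point totals follow Table 4, while the points_earned mapping is a hypothetical representation of one reviewed site.

```python
# Points required for full compliance with each criterion (see Table 4).
REQUIRED_POINTS = {
    "Identity": 3,
    "Purpose": 3,
    "Content": 5,
    "Privacy": 2,
    "User Feedback/Evaluation": 1,
    "Content Updating": 6,
}

def score_site(points_earned):
    """Return per-criterion compliance flags and the number of criteria met.
    Partial credit counts as noncompliant, matching the rule described above."""
    compliant = {
        criterion: points_earned.get(criterion, 0) >= required
        for criterion, required in REQUIRED_POINTS.items()
    }
    return compliant, sum(compliant.values())

# Example: a site disclosing Name and Address but not Funding earns 2 of the
# 3 Identity points and is therefore counted as noncompliant on Identity.
```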
We then weighted the baseline data to account for the disproportionate sampling from
the target stratum of most-frequently-visited websites and remainder websites, and to
account for ineligible sites within each stratum that were eliminated from the final sample.
We then analyzed the data using SUDAAN to produce weighted estimates of percentages of
health websites in compliance with the criteria (and with disclosure elements associated with
each criterion), as well as weighted estimates of compliance among the most frequently
visited websites and the remainder websites. We also calculated weighted standard errors,
relative standard errors, and 95 percent upper and lower confidence limits associated with all
of the estimated percentages. Finally, we tested for statistically significant differences in
compliance percentages across the two strata by criterion, at the 95 percent level of
confidence.
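The weighted percentages are, in essence, stratified estimates in which each stratum's compliance rate is weighted by that stratum's share of the universe. The sketch below shows the basic calculation only; it omits the ineligibility adjustment and the variance estimation done in SUDAAN, and the counts in the example comment are merely chosen to be consistent with the Privacy percentages in Table 5.

```python
def weighted_compliance(stratum_flags, frame_sizes):
    """Stratified estimate of the percent of websites in compliance.

    stratum_flags: dict mapping stratum name to a list of 0/1 compliance flags
                   for the eligible sampled sites in that stratum
    frame_sizes:   dict mapping stratum name to the number of frame websites
    """
    total_frame = sum(frame_sizes.values())
    estimate = 0.0
    for stratum, flags in stratum_flags.items():
        stratum_rate = sum(flags) / len(flags)          # within-stratum rate
        estimate += (frame_sizes[stratum] / total_frame) * stratum_rate
    return 100.0 * estimate

# Illustration with the two strata used in this study (213 frequently visited
# sites, 3,395 remainder sites):
#   weighted_compliance({"target": [1] * 48 + [0] * 4,
#                        "remainder": [1] * 37 + [0] * 13},
#                       {"target": 213, "remainder": 3395})
# returns roughly 75 percent, in line with the overall Privacy estimate.
```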
Table 4. Required Elements for Scoring and Optional Elements

Identity (3 points required)
  Required elements (1 point each): Name of person or organization responsible for website; Street address for person or organization responsible for website; Identified source of funding for website
  Optional element (not scored): Other contact information for person or organization responsible for website

Purpose (3 points required)
  Required elements (1 point each): Statement of purpose or mission for website; Uses and limitations of services provided; Association with commercial products or services

Content (5 points required)
  Required elements: Differentiating advertising from non-advertising content (1 point); Medical, editorial, or quality review practices or policies (1 point); Authorship of health content (per page of health content; 3 points)
  Optional element (not scored): Names/credentials of reviewers

Privacy (2 points required)
  Required elements (1 point each): Privacy policy; How personal information is protected

User Feedback/Evaluation (1 point required)
  Required element (1 point): Feedback form or mechanism
  Optional element (not scored): How information from users is used

Content Updating (6 points required)
  Required elements: Date content created (per page of health content; 3 points); Date content reviewed, updated, modified, or revised (per page of health content; 3 points)
  Optional element (not scored): Copyright date

Total points required for full compliance: 20
FINDINGS
SUMMARY ESTIMATES OF COMPLIANCE
Appendix C displays the results of our analyses.
Because none of the 102 health websites reviewed for this analysis fully met all six of
the disclosure criteria, we are unable to produce an overall estimate of the current level of
their compliance with Healthy People 2010 Objective 11-4. However, these data strongly
suggest that full compliance is likely to be extremely low. Although 90 percent of health
websites comply with one or more criteria, only 3 percent comply with more than three. Ten
percent of all health websites meet none of the six criteria. Figure 1 displays estimates of
compliance for all health websites and the most-frequently-visited sites, by the number of
criteria in compliance.
Figure 1. Estimates of Compliance for All Health Websites and Frequently Visited Sites, by
Number of Criteria in Compliance
[Bar chart: percent of websites in compliance, by number of criteria in compliance (none through six), shown separately for all health websites and frequently visited sites.]

Source: Computations by Mathematica Policy Research, Inc.

Note: All percentages shown are weighted.
Of the six criteria, Privacy is met most often (by an estimated 75 percent of all health
websites), followed by User Feedback/Evaluation (59 percent), Purpose (35 percent), and
Identity (9 percent). We should note that compliance was higher across all criteria among the
most frequently visited websites. For example, nearly all of the websites in this stratum met
the Privacy criterion (92%), the majority met the User Feedback/Evaluation criterion (69
percent), and about half met the Purpose criterion (52 percent). We found only 2 of the 102
websites we reviewed to be in compliance with the Content criterion and only 1 in
compliance with Content Updating—numbers too low, given our sample size, to yield
reliable estimates of overall compliance on these two criteria.
Figure 2 displays estimates of compliance for all health websites and the sites most
frequently visited, by each of the six criteria.
Figure 2. Estimates of Compliance for All Health Websites and Frequently Visited Sites, by
Criterion
[Bar chart: percent of websites in compliance with each criterion (Identity, Purpose, Content, Privacy, User Feedback, Updating), shown separately for all health websites and frequently visited sites.]

Source: Computations by Mathematica Policy Research, Inc.

Note: All percentages shown are weighted.
As Figures 1 and 2 illustrate, estimates of percentage compliance among the websites
most frequently visited are notably higher than for all health websites reviewed, in terms
both of the total number of criteria in compliance and of compliance on individual criteria.
However, we found the differences in percentages of compliant websites between the most-frequently-visited websites and the remainder to be statistically significant at the 95 percent
level of confidence only on the Purpose and Privacy criteria (Appendix C).
In the following section, we discuss in greater detail the level of compliance with
disclosure elements that are included within each of the six criteria.
ELEMENTS OF COMPLIANCE BY CRITERIA
Table 5 displays the estimates of website compliance, by criterion and by disclosure
elements associated with each criterion. We present estimates separately for all health
websites, the sites most frequently visited, and the remainder websites. We discuss
findings by criterion in the following sections.
Identity
Full compliance with the Identity criterion requires disclosure of three elements: (1) the
name of the person or organization responsible for the website (Name), (2) the street
address (Address), and (3) identified sources of funding for the website (Funding). As shown
in Table 5, most websites (92 percent) disclose Name, and over half (55 percent) disclose
Address. However, only one-fifth (20 percent) disclose Funding. In sum, only 9 percent of
all websites (and 15 percent of the sites most frequently visited) comply with all three of the
Identity criterion elements, although most comply with one or two. While baseline estimates
of compliance with the criterion and its subcomponent elements differed somewhat between
the strata, we did not find statistically significant differences at the 95 percent level of
confidence.
Table 5. Estimates of Compliance by Criterion and Disclosure Element

                                                 All Sites (n = 102)          Frequently Visited Sites (n = 52)    Remainder (n = 50)
Criterion/Disclosure Element                  Percent  Lower CI  Upper CI     Percent  Lower CI  Upper CI          Percent  Lower CI  Upper CI

Identity                                         8.5      3.6      18.9         15.4      8.5      26.2               8.0      3.0      19.7
  Name                                          91.6     81.2      96.5         86.5     76.0      92.9              92.0     80.3      97.0
  Street address                                54.8     41.7      67.3         65.4     53.2      75.8              54.0     40.0      67.4
  Funding sources                               20.2     11.7      32.7         23.1     14.5      34.7              20.0     11.0      33.6

Purpose                                         35.2     24.0      48.4         51.9*    40.0      63.7              34.0     22.2      48.3
  Purpose or mission                            64.3     50.9      75.7         67.3     55.2      77.5              64.0     49.7      76.2
  Uses and limitations                          71.0     57.8      81.5         84.6     73.8      91.5              70.0     55.8      81.2
  Association with commercial products          60.8     47.5      72.7         71.2     59.1      80.8              60.0     45.8      72.7

Content                                          0.3      0.1       0.9          3.9      1.1      12.3               0.0     n.a.      n.a.
  Identify advertising content                  74.9     61.8      84.6         86.5     76.0      92.9              74.0     60.0      84.4
  Describe editorial policy                      5.1      1.8      13.5         19.2*    11.4      30.5               4.0      1.0      14.9
  Authorship                                    11.4      5.6      22.0         30.8*    20.9      42.9              10.0      4.2      22.1

Privacy                                         75.3     62.1      85.0         92.3*    82.9      96.7              74.0     60.0      84.4
  Privacy policy                                79.1     66.3      88.0         94.2*    85.3      97.9              78.0     64.3      87.5
  Describe protection of personal information   75.3     62.1      85.0         92.3*    82.9      96.7              74.0     60.0      84.4

User Feedback/Evaluation                        58.8     45.5      70.9         69.2     57.1      79.2              58.0     43.9      71.0
  Feedback mechanism                            58.8     45.5      70.9         69.2     57.1      79.2              58.0     43.9      71.0

Content Updating                                 0.1     0.02       0.8          1.9      0.3      10.2               0.0     n.a.      n.a.
  Display date created(a)                        4.5      1.4      13.5         11.5      5.8      21.7               4.0      1.0      14.9
  Display date reviewed or updated(a)            3.2      1.0       9.9         19.2*    11.4      30.5               2.0      0.3      13.2

Source: Computations by Mathematica Policy Research, Inc.

Note: All percentages and confidence intervals are weighted. CI = 95 percent confidence interval bound. n.a. = not applicable due to a zero estimate and standard error. Shading in the original table indicates estimates that are unreliable, due to large standard errors relative to small percentage estimates.

a This criterion element must be fulfilled for all three pages of health content evaluated.

* Difference in estimated percentages for "most frequently visited sites" compared to "remainder sites" is statistically significant at the 95 percent level of confidence, based on Student's t-test of independent samples.
Figure 3 shows the estimated frequency of compliance with the elements of the
Identity criterion for frequently visited websites and the remainder, by the number of elements
in compliance.
Figure 3. Compliance of Frequently Visited Sites and Remainder Sites with Identity
Criterion, by Number of Disclosure Elements in Compliance
[Bar chart: percent of websites in compliance, by number of disclosure elements in compliance (none through three), shown separately for frequently visited sites and remainder sites.]

Source: Computations by Mathematica Policy Research, Inc.
Purpose
Compliance with the Purpose criterion requires disclosure of three elements: (1) a
statement of the purpose or mission of the website (Mission), (2) uses and limitations of the
services provided (Uses), and (3) a statement about any association with commercial
products or services (Commercial Products). As shown in Table 5, 35 percent of all websites
disclose all three elements, although a larger percentage (between 61 and 71 percent) provide
statements that comply with at least one of the three elements (often in legal disclaimers
included somewhere in the website).
Compliance with all three elements is substantially higher among the websites most
frequently visited (52 percent) than among the remainder websites (34 percent), and the
difference is statistically significant at the 95 percent confidence level (Figure 4). However,
we did not find statistically significant differences between the two strata in the frequency of
disclosure of the individual elements included in this criterion, nor in the mean number of
elements in compliance.
Figure 4. Compliance of Frequently Visited Sites and Remainder Sites with Purpose
Criterion, by Number of Disclosure Elements in Compliance
[Bar chart: percent of websites in compliance, by number of disclosure elements in compliance (none through three), shown separately for frequently visited sites and remainder sites.]

Source: Computations by Mathematica Policy Research, Inc.
Content
Compliance with the Content criterion requires disclosure on three distinct elements:
(1) differentiation of advertising content from non-advertising content on the website
(Identify Advertising), (2) policy statements describing editorial policy or oversight of health
content (Editorial Policy), and (3) authorship of health content on each of three randomly
selected pages of health content on the website (Authorship).
As Table 5 shows, nearly three-fourths of health websites Identify Advertising, but very
few (about 5 percent) clearly disclose their Editorial Policy. However, a significantly higher
estimated proportion of frequently visited websites (19 percent) disclose Editorial Policy
compared with the remainder websites (4 percent).
Compliance with Authorship is complicated by the fact that, unlike the first two
disclosure elements, it is required on three items of randomly selected health content.
Slightly over 11 percent of all websites consistently identify Authorship on items of health
content, although a statistically significantly higher proportion (31 percent) of the sites most
frequently visited do so compared with the remainder websites (10 percent).
In total, though, we found only two websites in our sample of 102 to be fully compliant
with the Content criterion, which suggests that health websites in general do not adhere to
this criterion as we defined it. However, the sites most frequently visited, on average,
disclosed significantly more elements related to this criterion (a mean of 2.6 out of 5)
compared to the remainder websites (with a mean of 1.7). Figure 5 shows the frequency of
compliance with the elements of the Content criterion for each of the strata, by the number of
elements in compliance.
Figure 5. Compliance of Frequently Visited Sites and Remainder Sites with Content
Criterion, by Number of Disclosure Elements in Compliance
[Bar chart: percent of websites in compliance, by number of disclosure elements in compliance (none through five), shown separately for frequently visited sites and remainder sites.]

Source: Computations by Mathematica Policy Research, Inc.
Privacy
Compliance with the Privacy criterion requires disclosure of two elements: (1) a
statement of privacy policy (Privacy Policy), and (2) a statement regarding protection of
personal information (Personal Information). About three-fourths of all websites are
estimated to be fully compliant with this criterion. Once again, a significantly higher
proportion of most-frequently-visited websites (92 percent) are in compliance compared
with the remainder websites (74 percent). Significant differences are also observed for the
two disclosure elements associated with this criterion (Table 5) and for the mean number of
elements in compliance.
Figure 6 shows the baseline estimates of compliance with the elements of the Privacy
criterion for each of the strata, by the number of elements in compliance.
Figure 6. Compliance of Frequently Visited Sites and Remainder Sites with Privacy
Criterion, by Number of Disclosure Elements in Compliance
[Bar chart: percent of websites in compliance, by number of disclosure elements in compliance (none through two), shown separately for frequently visited sites and remainder sites.]

Source: Computations by Mathematica Policy Research, Inc.
User Feedback/Evaluation
Compliance with this criterion requires only one disclosure element, a mechanism for
the user to provide feedback about the website (Feedback).10 In all, 59 percent of websites
are estimated to comply with this criterion (Table 5). Compliance is higher, but not
statistically significantly different, among the most-frequently-visited websites (69 percent)
compared with the remainder websites (58 percent).
Content Updating
Compliance with the Content Updating criterion requires disclosure of two elements on
each of three randomly selected items of health content:11 (1) the date the content was
created (Date Created); and (2) the date the content was reviewed, updated, modified, or
revised (Date Updated).12 Fewer than 5 percent disclose Date Created consistently on all
We also tracked statements disclosing how such information would be used to improve the website (an
element of interest to ODPHP) but did not count this toward overall compliance. Only four of the websites in
the total sample included this information.
10
We used the same randomly selected pages of health content to determine Authorship on the Content
criterion and to assess both disclosure elements on the Content Updating criterion.
11
Copyright date did not count as disclosure of either element, since the pretest revealed that copyright
dates were often given (either as a month and year, or a year, or range of years) without any specific reference
12
Findings
21
items, and an even smaller proportion (about 3 percent) consistently disclose Date Updated.
Although both are low, the proportion of frequently visited sites disclosing Date Updated
(an estimated 19 percent) is significantly higher than for remainder websites (2 percent) at
the 95 percent confidence level. Differences between the two website strata for Date
Created were not statistically significant.
Overall, only 1 of the 102 websites we reviewed was fully compliant with both the
required elements for this criterion on all items of health content, which suggests that health
websites in general do not comply with this criterion as we defined it. However, the sites
most frequently visited, on average, disclosed significantly more elements related to this
criterion (a mean of 2.1 out of 6) compared with the remainder websites (a mean of 0.9).
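To make this scoring rule concrete, the short sketch below (illustrative only, with hypothetical review data rather than records from the study database) tallies the Content Updating elements for a single website: two dated elements checked on each of three randomly selected items of health content, for a maximum of six elements, with full compliance requiring all six.

# Illustrative sketch (hypothetical data): scoring the Content Updating criterion
# for one website. Each of three randomly selected content items is checked for
# two disclosure elements, so a site can satisfy 0-6 elements; full compliance
# requires all six.
content_items = [
    {"date_created": True,  "date_updated": False},
    {"date_created": True,  "date_updated": False},
    {"date_created": False, "date_updated": False},
]

elements_met = sum(item["date_created"] for item in content_items) + \
               sum(item["date_updated"] for item in content_items)

fully_compliant = all(
    item["date_created"] and item["date_updated"] for item in content_items
)

print(f"Elements in compliance: {elements_met} of 6")   # -> 2 of 6
print(f"Criterion met: {fully_compliant}")              # -> False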
Figure 7 shows the estimated frequency of compliance with the elements of the Content
Updating criterion for each of the strata, by the number of elements in compliance.
Figure 7. Compliance of Frequently Visited Sites and Remainder Sites with Updating
Criterion, by Number of Disclosure Elements in Compliance
[Figure 7: bar chart of the percent of websites in compliance, for Frequently Visited Sites and Remainder sites, by number of disclosure elements in compliance (None through Six).]
Source: Computations by Mathematica Policy Research, Inc.
DISCUSSION
A number of factors complicated the review and assessment of health-related websites for this project. These factors reveal both the relative youth of the Internet as a medium of health communication and the challenges of improving it.
There was a noteworthy lack of consistency in how or where websites disclosed
information relating to the criteria. With other communications media (such as print,
broadcast, and film), conventions in practice have, over decades or centuries, emerged that
make it fairly easy to know where to look (or watch or listen) for information on ownership,
sponsorship, authorship, publication or production dates, copyright information, legal
disclaimers, and other information that can help users determine the source or credibility of
information conveyed. Few such conventions have yet emerged on Internet sites, however.
The disclosure elements reported here on which compliance was high are indicative of
the few conventions that have emerged. For example, the name of the sponsoring
organization, if it is not provided on the home page, can usually be found on a link labeled
“About Us” (although contact information is found there less often). Privacy statements are
common and usually clearly labeled, if not through a tab or link at the top of the home page,
then through a link in small print at the bottom. Legal disclaimers—although the language
used to label them varies considerably—are usually present, delimiting the purpose and uses
of the information or services provided. Advertising content is usually clearly labeled as such
and differentiated from other content through its placement on the web page and its graphic
design.
No such conventions in practice, however, govern the disclosure of other critical pieces
of information—notably, information on sources of funding, editorial oversight, authorship,
or dating of information. It was not at all obvious where to begin to look for such
information on most websites, and finding information within “two clicks” of the home
page (as the disclosure criteria usually required) was often difficult. When information that
appeared to relate to these disclosure elements was found, the wording or presentation was
such that it was often not clear whether it satisfied the intentions of the criteria. For
example, when sources of funding were identified, it was not always easy to determine
whether the information provided referred to funding for the sponsoring organization or
funding for the website. Similarly, editorial oversight policies, when present, were often
vaguely worded, making it unclear to what content they applied. There were few
conventions for identifying the authors of health content, and authorship was especially
ambiguous on websites where the content appeared to have been prepared by the site host.
The very qualities that make the Internet so appealing as a medium to search for
information—the ability to navigate very quickly to and through multiple pages, sites, and
sources—also complicate the task of finding and interpreting information relating to the
disclosure criteria. It was precisely because many of the websites provided ready access to
health information from a variety of different sources, for example, that they performed so
poorly on disclosure elements that referred to specific items of health content. Few complied
with disclosure elements relating to authorship and content updating on all three items of
health content that we reviewed.
While some hyperlinks take users to other parts of the website, others may take them to
separate websites altogether, including those of a partner, sister, or parent organization. It
was often unclear, in our review, whether stated policies found through links to related or
affiliated sites also applied to the home (sampled) site. This was especially problematic in the
case of “nested” websites (for example, websites for government programs nested within the
parent agency and/or department websites) where generic editorial or medical review
policies were sometimes found at the parent (or grandparent) site. In other cases, although
advertising may be clearly labeled on any given web page, hyperlinks to what appears to be
health information may take the user to commercial promotions.
We acknowledge that our study has several limitations that could affect the
generalizability of our findings to the universe of health websites of interest to ODPHP or
the ability to replicate the study in the future. First, some of the websites in our initial sample
did not conform to our working definition of health websites, which suggests that Hitwise’s
Health and Medical category was broader than needed. Nevertheless, we believe that the
comprehensiveness of the Hitwise database best meets the need to define the denominator
of health websites in inclusive terms that reflect actual and changing Internet use. Second,
we excluded, for practical reasons, websites that limited access to registered users or
subscribers. We were thus unable to review the disclosure practices of sites that may be an
important source of health information to some Internet users. Third, the three items of
health content we randomly selected to review may or may not have been representative of
all or most of the health content on any given website, and it is quite possible that a different
selection of material would have yielded different results. Fourth, the data that we purchased
reflected Internet traffic for only one month (October 2005) and thus did not account for
seasonal variation in the volume or patterns of traffic. Data drawn from a different time
period may yield a stratified sample with different characteristics, even with the same
sampling strategy. Fifth, as the earlier discussion suggests, there were also many gray areas of
interpretation that required judgment calls on the part of the reviewers and the project team.
While we have tried to document these as clearly as possible in the accompanying technical
manual, others looking at the same information might reach different conclusions. Sixth, our
small sample size limits both the reliability of several of our baseline estimates and the ability
to detect statistically significant progress toward meeting the Healthy People 2010 Objective
11-4 on later assessments. The negligible rates of compliance that we detected in our sample
precluded reliable estimates of compliance for some disclosure elements. The sample size,
however, does permit reliable compliance estimates for the majority of criteria and their
associated elements that can be used to track future improvements in adherence to the
standards defined under this project.
Notwithstanding these challenges and limitations, we believe that the baseline estimates
of health websites’ compliance with the disclosure criteria clearly identify the areas on
which some progress has been made, as well as those on which future improvement efforts
should focus. The consistently (and sometimes significantly) better performance of the
websites most frequently visited, across almost all criteria and disclosure elements, suggests
that the heavily trafficked health websites may be moving toward defining the conventions
in practice that can improve disclosure in the future. A qualitative analysis of the practices
used by the better-performing websites, which was beyond the scope of the current study,
could offer useful insights and guidance for improvement.
APPENDIX A
REPORT OF FINDINGS FROM WEBSITES
EVALUATION PRETEST
MEMORANDUM
TO: Cynthia Baur
FROM: Margaret Gerteis, Anna Katz, Julie Ladinsky
SUBJECT: Report of Findings from Websites Evaluation Pretest
DATE: 1/27/2006
Mathematica Policy Research, Inc. (MPR), under contract to the Office of Disease
Prevention and Health Promotion (ODPHP), will develop and test a methodology for estimating
the proportion of health websites that comply with disclosure criteria enumerated in Healthy
People 2010 Health Communication Objective No. 11-4. Consistent with the requirements of
this contract, the MPR project team has 1) finalized a methodology for determining the
denominator, 2) developed technical specifications for the assessment, 3) drafted protocols for
reviewing health websites, and 4) conducted a pretest of the protocols on a small sample of
health websites. Here we describe our approach to the preliminary testing, report key findings,
and recommend revisions to the protocols based on these findings. This memo will serve as a
basis for our pretest debriefing to be held on January 30, 2006.
Purpose
We pretested draft protocols on a sample of 10 websites that broadly represent the pool
from which we will choose 100 sites. The purpose of this test was to determine if the protocol
was able to appropriately elicit information about compliance with disclosure criteria from these
sites and to determine needed adjustments for the full review.
Pretest Methodology
Sample Selection
Our aim in selecting the 10 sites for the pretest was to mimic the sample selection procedure
that would be used in the full review by including 5 sites from the target stratum (that is, those
sites that account for 60 percent of user visits to health websites) and 5 from the remainder. We
also aimed to include the range of domains (.com, .org, .net, .gov, .edu) likely to show up in the
final sample of 100. We first reviewed the 3,608 health websites from the database provided by
Hitwise to determine the distribution of sites by stratum and domain. This distribution is shown
in Table 1:
Table 1: Distribution of Health Websites from Hitwise Database

                                 Full Sample                      Target Stratum (a)
Domain                     Number     Percent of Total     Number     Percent of Stratum     Percent of Domain
Total Hitwise Sample        3,608          100.0              214           100.0
.com                        2,496           69.2              152            71.0                   6.1
.org                          538           14.9               27            12.6                   5.0
.net                           96            2.7                6             2.8                   6.2
.gov                           81            2.2               22            10.3                  27.2
.edu                           56            1.6                2             0.9                   3.5
Other (b)                     341            9.5                5             2.3                   1.5

(a) The “target stratum” is defined as those websites that account for 60 percent of the visits to health websites from the Hitwise database.
(b) These “other” sites in the Hitwise database include an array of for-profit (commercial), non-profit, governmental, and other sites, including domains outside the United States, with less commonly-used domain indicators. These will be included in the sampling frame for the full review and will be classified according to their type of sponsorship. However, they were not included in the pretest sample.
We then used a quasi-random selection process to select 10 sites with the characteristics
shown in Table 2:
Table 2: Distribution of Pretest Sample

Domain     Number in Target Stratum     Number in Remainder
.com                  2                          1
.org                  1                          2
.net                  1                          0
.gov                  1                          1
.edu                  0                          1
Total                 5                          5
Of the 10 websites selected initially, two were found to be targeted to specialized audiences
for specialized purposes unrelated to the purpose of this study (one site was a job listing for
health professionals; the other site listed federal grants for health researchers). We replaced
these with two sites from the same domains and strata.
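The selection procedure described above can be sketched as a draw of a fixed number of sites from each stratum-by-domain cell of the Hitwise frame. The sketch below is illustrative only; the frame records, the allocation counts (which mirror Table 2), and the random seed are assumptions rather than the project's actual selection program.

import random

# Illustrative sketch (assumed structures): select a pretest sample by drawing
# a fixed number of sites from each (stratum, domain) cell of the sampling frame.
# 'frame' stands in for the Hitwise-derived list; the records below are placeholders.
frame = [
    {"url": "exampleA.com", "stratum": "target",    "domain": ".com"},
    {"url": "exampleB.org", "stratum": "remainder", "domain": ".org"},
    # ... one record per health website in the frame
]

# Desired counts per (stratum, domain) cell, mirroring Table 2.
allocation = {
    ("target", ".com"): 2, ("target", ".org"): 1, ("target", ".net"): 1,
    ("target", ".gov"): 1, ("remainder", ".com"): 1, ("remainder", ".org"): 2,
    ("remainder", ".gov"): 1, ("remainder", ".edu"): 1,
}

rng = random.Random(20060127)  # fixed seed so the draw can be reproduced
sample = []
for (stratum, domain), n_wanted in allocation.items():
    cell = [s for s in frame if s["stratum"] == stratum and s["domain"] == domain]
    sample.extend(rng.sample(cell, min(n_wanted, len(cell))))

print(f"{len(sample)} sites selected for the pretest")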
Selection of Health Content
Because our protocols call for a review of three separate items of health-related content to
answer specific questions, our next task was to select the 3 items for review for each of the 10
websites. Our aim in selecting the items for review was not only to ensure that both reviewers
looked at the same content but also to minimize selection bias that might result from a given
reviewer’s particular interests or from website sponsors’ efforts to direct users’ attention to
featured content. For each site, we traced three alternative paths from the home page to health-related content, using random numbers to identify topics or content from listed options. (You
may recall that during prior discussions we agreed that any health-related content that users
could access from the website under review would count, even if it led to content on other sites.)
Of the 30 items thus generated, 19 were items of health content residing on the website
under review, 9 were items generated through links to other sites, and two were .pdf files (one
from another website and one from the website under review).
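In code, the path-tracing step amounts to repeated random choices among the health-related options listed on each page. The sketch below uses a hypothetical site map and is illustrative only; the actual selections were made by hand against the live websites.

import random

# Illustrative sketch (hypothetical site map): trace a path from the home page to
# an item of health content by repeatedly picking a random option from those listed,
# mirroring the manual procedure of using random numbers against listed topics.
site_map = {
    "home": ["conditions", "healthy-living", "news"],
    "conditions": ["diabetes", "asthma"],
    "healthy-living": ["nutrition", "exercise"],
    # pages not listed here are treated as terminal items of health content
}

def trace_random_path(start, rng, max_depth=4):
    page, path = start, [start]
    for _ in range(max_depth):
        options = site_map.get(page)
        if not options:          # terminal page: treat as the selected content item
            break
        page = rng.choice(options)
        path.append(page)
    return path

rng = random.Random(1)
for i in range(3):               # three alternative paths per website
    print(f"Path {i + 1}:", " -> ".join(trace_random_path("home", rng)))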
Mode of Administration
Although the pretest was designed primarily to test the protocols, we also wanted to
obtain preliminary feedback on a mode of administration that we proposed to use for the full
review. First, we transferred the protocols to an Excel worksheet to facilitate data input,
scoring, and analysis. Second, we set up two computer screens to allow one reviewer to view
and navigate both the website under review and the protocol at the same time. The second
reviewer used the Excel worksheet but did not have access to two computer screens.
Review of Websites
Using the draft protocols submitted on December 19, 2005, two members of the project
team, who are not the primary reviewers of the full sample, separately reviewed each of the
10 websites and the selected pages of health content. Each reviewer documented any finding
that a particular disclosure item was present by indicating both the location (URL) and the
wording of the content. They were also asked to track difficulties or questions that arose, as
well as the time spent on each review.
Analysis and Debriefing
After both reviewers had finished reviewing the 10 websites, we conducted a simple test
of inter-rater reliability, based on a comparison of their choices of specific response options
on each question and for each website. We then debriefed reviewers, item by item and site
by site, to explore sources of the discrepancies and to identify lingering questions of
interpretation to be resolved with the project officer.
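The memo does not specify which agreement statistic was used. As one simple possibility, the sketch below computes item-level percent agreement between the two reviewers across shared question-by-website responses; the ratings shown are hypothetical.

# Illustrative sketch (hypothetical ratings): simple percent agreement between the
# two pretest reviewers, computed question by question across the reviewed websites.
# The project's own "simple measures of inter-rater reliability" may differ.
reviewer_1 = {("site01", "I.1"): "Yes", ("site01", "I.2"): "No",  ("site02", "I.1"): "Yes"}
reviewer_2 = {("site01", "I.1"): "Yes", ("site01", "I.2"): "Yes", ("site02", "I.1"): "Yes"}

shared_keys = reviewer_1.keys() & reviewer_2.keys()
agreements = sum(reviewer_1[k] == reviewer_2[k] for k in shared_keys)
percent_agreement = 100.0 * agreements / len(shared_keys)

print(f"Agreement on {agreements} of {len(shared_keys)} items "
      f"({percent_agreement:.1f} percent)")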
Key Findings: How the Review Process Worked
Site Selection
Even in a limited sample of 10 websites, the sampling method used for this pretest yielded a
diverse array of health-related websites, suggestive of what we may expect to find in the larger
universe. While our working definition of “health websites” has been intentionally inclusive, the
fact that 2 of the 10 websites selected initially from the Hitwise database were clearly
inappropriate for the purpose of this study suggests the need both to clarify exclusion criteria and
to create a sample frame large enough to accommodate a potentially large number of ineligible
websites. We propose to eliminate websites from the sample for the full review if they are
designed to provide narrowly defined services for specialized audiences and have no health
information that might be relevant to the general public. We will also design the sample frame
such that replacement sites can be selected, where needed, consistent with the stratified sampling
methodology that we have agreed upon.
Timing
Both reviewers spent well over an hour (1 hour 20 minutes to 1 hour 40 minutes) on each of
the first five website reviews. Thereafter, most reviews were completed within an hour. While
some of the time spent on earlier reviews resulted from ambiguities of meaning or interpretation
that were later clarified, there was also clearly a “learning curve” as reviewers became
accustomed to the protocols, the websites, and strategies to search for the disclosure criteria.
Mode of Administration
As noted above, we tested two aspects of the mode of administration of the review protocols: 1) the use of two computer screens, and 2) the use of an Excel worksheet, online and on paper. Having
two computer screens to work from made it easier for the reviewer to move back and forth
between the website under review and the review protocols without having to close either
window. (The reviewer who did not have two screens found it easier to work from a paper
version of the protocol than to switch between windows on a single screen.) We have therefore
arranged for reviewers to have access to two screens for the full review.
However, the online Excel worksheet was somewhat unwieldy to use, requiring excessive
scrolling (left/right and up/down) to view definitions or paste content, which made it too easy to
lose one’s place. Although the paper version was relatively easy to use, it created an extra step
of later data entry into a spreadsheet for analysis, adding time to the process and creating the
opportunity for more errors. We therefore explored alternatives, including web-based survey
applications and Access databases, and propose to use an Access database for the full review.
Sources of Discrepancies Between Reviewers
The item-by-item, site-by-site comparison of the two sets of reviews yielded a large number
of discrepancies, although simple measures of inter-rater reliability showed the two reviewers to
be in moderate agreement, overall, for all of the response items. While particular questions and
particular websites were sometimes more problematic than others, the source of the
discrepancies generally fell into one of four categories: 1) problems with the protocols, 2)
differences in reviewers’ subjective interpretations of the meaning of the criteria or what satisfies
the criteria, 3) difficulty finding or identifying some disclosure elements, and 4) reviewer entry
errors. We review each category briefly below.
Problems with the protocols often resulted from ambiguously worded questions or
overlapping response categories. In most cases, we were able to agree on the meaning of the
question and resolve ambiguities through rewording the question or the accompanying definition.
In order to help reviewers identify disclosure elements, given that the specific wording would
vary, we initially broke out questions and/or response categories to provide multiple cues and
options. These were not mutually exclusive categories, however, and reviewers often disagreed
as to which response applied even as they agreed that the criterion had been met. Disagreements
of this sort would not affect overall scoring and can readily be accommodated through scoring
algorithms. Where multiple response options were helpful to reviewers (for example, listing
separately the different terms that may be used to describe how health content is reviewed), we
propose to retain them. Where these options added to the confusion and were not necessary to
determine compliance with a given criterion (for example, distinguishing between personal
information and personal health information), we propose to combine or eliminate them.
Differences in interpreting the meaning of the criteria or identifying elements that would
satisfy the criteria most often resulted from wide variations in disclosure practices among the
websites under review. While information may have been presented that related to the criteria,
the wording or presentation was such that it was not clear whether it satisfied them. In such
cases, reviewers’ judgment calls often differed. (We discussed and resolved the most common
issues that arose in this regard when we spoke by telephone on January 19.) They are described
further in the next section, as they related to the sample websites’ performance on specific
disclosure criteria.
In a small number of cases, one reviewer was able to find specific disclosure elements that
satisfied the criteria while the other was not. One might attribute this discrepancy to differences
in the diligence or perceptive capabilities of individual reviewers. In this pretest, however, it
happened more or less equally to both reviewers. Through discussion, we determined that in
such cases the disclosure element was quite simply hard to find and often found by accident in
sections ostensibly devoted to other topics.
Finally, a small number of discrepancies were simple entry errors, often attributable to
losing one’s place in the online Excel worksheet, as noted above. Reviewer entry errors were
less common on the paper worksheet (notwithstanding the opportunity for later transcription
errors when the data is entered into a database for analysis).
Key Findings: How the Sample Websites Fared on the Disclosure Criteria
Given the nature and purpose of the pretest, we did not compute a final compliance score for
the websites in the pretest. However, our preliminary review suggests that none of the 10
websites satisfied all six of the disclosure criteria. One commercial site appeared to have
satisfied five of the six. We discuss findings related to specific criteria below.
Identity
Most of the websites reviewed clearly identified the name of the organization responsible for
the website, and most provided a street address as well as other contact information for the
organization. However, sources of funding for the website were identified less often. Moreover,
when sources were identified, it was not always easy to determine whether the information
provided referred to funding for the sponsoring organization or funding for the website. Our
initial review suggests that about half of the websites fully complied with this criterion.
Purpose
Very few of the websites reviewed included an explicit statement about the mission or
purpose of the website, although many described features or services available to website users.
Here again, where mission statements were found, it was not always easy to distinguish whether
they were intended to describe the mission of the sponsoring organization or the mission of the
website. Statements regarding the website’s association (or lack of association) with commercial
products or services were often included in legal disclaimers (for example, through links
identified in small type at the bottom of the page). Overall, about half of the pretest websites
appeared to comply with this criterion.
Content
Most of the websites that included advertising on the homepage clearly distinguished
advertising from non-advertising content. In some cases, however, advertising on other pages
was not so clearly distinguished. Moreover, it was not always clear where specific links would
take the user and which ones would link to commercial promotions. Although it would not be
feasible to pursue every link or review every page of content to determine compliance with this
disclosure element, we will direct reviewers to explore at least two links beyond the content
displayed on the home page to look for advertising content.
There was little consistency in how or where websites described their oversight of health
content. Moreover, because many of the sites included content from many different sources, it
was not always clear whether the policies that were described referred to all content or only
some. This was also problematic in the case of “nested” websites (for example, websites for
government programs nested within the parent agency and/or the department websites) for which
generic review policies may be found at the parent (or grandparent) site.
Although a few sites clearly identified individual or organizational authors of specific health
content, many did not. Sites that included many different types of health content from many
different sources were often inconsistent in this regard, identifying authors in some cases but not
in others. Authorship was especially ambiguous on websites where the content was (apparently)
prepared by the site host. For example, health content on government websites may cite sources
of information (research studies, data files) but not clearly indicate who was responsible for
synthesizing, writing, or presenting the information.
Only one of the pretest websites reviewed appeared to comply fully with this criterion.
Privacy
All but one of the pretest websites complied with this criterion by including a clearly marked
privacy statement with fairly standard legal language explaining how personal information was
used and/or protected. However, the distinction between personally identifiable information and
personal health information (or between use of information and protection of information) did
not prove useful in determining compliance with this disclosure element, because the language
used was often generic and would apply to both kinds of information. We therefore propose to
combine these elements in the protocol questions for the full review.
Evaluation
Most websites included some mechanism for website users to provide feedback (such as a
user feedback or comment form), and in some cases a pop-up survey solicited specific feedback.
However, few sites provided any explanation as to how that feedback would be used to improve
website services. Three of the 10 pretest websites appear to have been fully compliant with this
criterion.
Updating Health Content
None of the pretest websites consistently identified the date health content was created,
reviewed, and updated on specific pages of health content. In many cases, a copyright date was
indicated (either as a month and year, year, or range of years) without any specific reference to
the date the content was created. None of the sites differentiated between date reviewed and date
updated, and many used other equivalent terms, such as “modified” or “revised.” Sites that
included many different types and/or sources of health content were inconsistent in this regard,
clearly dating material in some cases and not in others. As a result, none of the pretest websites
reviewed complied with this criterion.
Recommendations
As a result of our experience with the pretest, we propose to make the following
modifications to our approach to the full review:
1. Identify sites selected from the sample frame that are ineligible because of their
specialized content and audiences and eliminate them ahead of time, before giving
the sample to the reviewers for their review. (We will do this when we review the
sites to select pages of health content to review.) We have already generated a
sample frame that will accommodate the need for replacements without
compromising the integrity of the sample.
2. Create an Access database to provide reviewers a more user-friendly interface and to
allow direct entry of data for analysis.
3. Provide each reviewer with two monitors to eliminate the need to switch between
windows during the review.
Based on our review of findings from the pretest, we have also revised the protocols in order
to clarify the meaning of questions and response categories and to provide further direction to
reviewers through the accompanying explanations. A copy of the revised protocols is attached.
When we spoke by telephone on January 19, we also discussed and resolved some of the
overarching issues that have arisen (for example, how to approach “nested” home pages, .pdf
files, health content ghostwritten by site hosts, and copyright dates). We will also incorporate
these resolutions into reviewer training and manuals.
However, the depth and breadth of questions and issues that arose during our review, and the
lack of consistency among the health websites reviewed in their approach to many of the
disclosure elements, suggest the need for an incremental approach that will allow us to resolve
new issues as they arise and continue to revise the protocols as needed. For these reasons, the
protocols attached should be regarded as “work in progress.”
As we also discussed, the amount of time required for the review of each website, especially
during the early part of any given reviewer’s learning curve, and the time that will be required to
resolve additional issues that are likely to arise also suggest the need for an alternative approach
to the baseline review of 100 websites in order to complete the project on time and within
budget. We have had preliminary discussions with you about these issues and will describe
recommended approaches and alternatives in a separate memo.
cc: Davene Wright; Frank Potter; Margo Rosenbach
APPENDIX B
WEBSITES EVALUATION REVISED
PROTOCOL
ODPHP WEBSITES EVALUATION PROTOCOL
Website Name:
Website Home Page URL:
Type of site:
Date Accessed:
Rater:
Coding Start Time:
Coding End Time:
I. IDENTITY
1. Does the website identify by name the person or organization responsible for the website, within 2
clicks of the homepage?
__ Yes
__ No
Explanation: This is intended to refer to the individual, business, corporation, association, coalition,
or group that the user would identify as the website sponsor. Note that responsible entity is distinct
from the webmaster or other contractor to whom day-to-day website functions may have been
delegated.
2. Does the website provide the following contact information for the person or organization
responsible for the website, within 2 clicks of the homepage?
__ Street address
__ Other mailing address (e.g. post office box, mailstop)
__ Telephone number
__ E-mail address
3. Does the website provide the following information on sources of funding for the website, within 2
clicks of the homepage?
__ Includes explicit statement about sources of funding for website
__ Names individual or organizational sponsors, donors, or financial partners for website
Explanation: Note that this refers to funding for the website, not for the sponsoring organization.
This information may be found in an advertising or sponsorship policy.
II. PURPOSE
1. Does the website provide information about the purpose or mission of the website, within 2 clicks
of the homepage?
__ Yes
__ No
Explanation: Note that this refers to the purpose or mission of the website, and not of the sponsoring
organization. It may include a statement of purpose or a description of services provided to website
users such as health information, discussion groups or forums, advice from professionals, support for
health services, tools for self management, or the sale of products or services.
2. Does the website describe appropriate uses and limitations of the services it provides, within 2
clicks of the homepage?
__ Yes
__ No
Explanation: This may include terms and conditions regarding the provision of services, statements
that advice or information is not intended to replace the evaluation of a health care professional,
statements about the rights and responsibilities of users or chat room participants, or other
disclaimers.
3. Does the website include a statement regarding its association with commercial products or
services, within 2 clicks of the homepage?
__ Yes
__ No
Explanation: This may include a statement that the website has no financial interest or association
with any product or service mentioned; a statement disclosing a financial interest or association with
a product or service mentioned; or a statement that it endorses no product or service mentioned on
the website.
III. CONTENT DEVELOPMENT/EDITORIAL POLICY
1. Does the website clearly differentiate between advertising and non-advertising content?
__ Yes
__ No
__ Not applicable
Explanation: Look at advertising on the home page and on at least 2 links from the homepage.
Advertising, including sponsored health content, should be clearly distinguished from non-advertising content using identifying words, design, or placement. Answer “yes” to this question
only if all advertisements found are clearly marked. “Not applicable” should be selected ONLY if no
advertising is found on the site.
2. Does the website describe how it oversees its health content in the following ways, within 2 clicks
of the homepage?
__ Describes its editorial or medical review process
Explanation: Note that this should include a description of the process, and not just a statement that
content is reviewed.
__ Provides names and credentials of medical/scientific editors, reviewers, or advisors
Explanation: Credentials may include degrees, licensure, titles, academic or clinical affiliations, or
areas of professional expertise. If the website provides the names and credentials of medical/scientific
advisors, it must clearly state that these advisors oversee health content.
__ Describes its policy for keeping health content current
Explanation: Note that this should include a description of the policy, and not just a statement that
content is kept current.
__ Describes other quality oversight practices (explain in comments)
3. Does the website disclose the author of this health-related content in the following ways, within 1
click of health content?1
__ States that the content is supplied by the website’s sponsoring organization or staff
__ States the name of an organization other than the website sponsor as supplying the content
__ Identifies individual authors of content by name
Explanation: When the health content is a .pdf file, it should be considered a stand-alone document.
Look for the disclosure items only on the .pdf file. Document the page number where the disclosure
item was found.
1 For this question the coder will visit three randomly selected pages of health content that are accessible through direct paths from the website's homepage.
IV. PRIVACY AND CONFIDENTIALITY
1. Does the website describe its privacy policy, within 2 clicks of the homepage?
__ Yes
__ No
2. Does the site explain how users’ personal information is protected, within 2 clicks of the
homepage?
__ Yes
__ No
Explanation: “Personal information” may include e-mail addresses or e-mail exchanges, personal
health information, or information derived through the use of passive tracking mechanisms
(“cookies”).
V. USER FEEDBACK/EVALUATION
1. Does the website provide the following specific mechanisms for user feedback about the website,
within 2 clicks of the homepage?
__ Feedback form
Explanation: Feedback form refers to a form that is clearly marked as a means for submitting
comments or questions about the website.
__ Pop-up user survey
__ E-mail address
__ Other feedback mechanism (explain in comments)
2. Does the website describe how it uses information from users to improve its services or operations,
within 2 clicks of the homepage?
__ Yes
__ No
VI. CONTENT UPDATING2
1. Does this page of health content display the date this content was created?
__ Yes
__ No
Explanation: The date may be indicated as a year, month and year, or month, day, and year. This
question DOES NOT refer to copyright date. If there is a date listed with no other explanation, count
this as the date created.
2. Does this page of health content display the date this content was last reviewed and/or updated in
the following ways?
__ Displays date last reviewed or verified
__ Displays date last updated, modified, or revised
Explanation: The date may be indicated as a year, month and year, or month, day, and year.
3. Does this page of health content display a copyright date?
__ Yes
__ No
Explanation: The date may be indicated as a year; a month and year; a month, day, and year; or a range of years.
2 For questions 1-3 in this section the coder will visit three randomly selected pages of health content that are accessible through direct paths from the website's homepage.
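For analysis, each completed protocol must be captured as a structured record. The sketch below shows one possible layout, with per-website responses and per-content-item fields; the field names and example values are illustrative assumptions and do not reproduce the Access database design used for the full review.

from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative sketch (assumed field names): one possible record layout for
# capturing a reviewer's responses to the protocol, one record per website,
# with per-item fields for the questions asked about each page of health content.
@dataclass
class ContentItemReview:
    url: str
    date_created_shown: bool = False
    date_reviewed_or_updated_shown: bool = False
    copyright_date_shown: bool = False
    authorship_disclosed: bool = False

@dataclass
class WebsiteReview:
    website_name: str
    home_page_url: str
    rater: str
    date_accessed: str
    responses: Dict[str, str] = field(default_factory=dict)      # e.g. {"I.1": "Yes"}
    content_items: List[ContentItemReview] = field(default_factory=list)

review = WebsiteReview(
    website_name="Example Health Site",
    home_page_url="http://www.example-health.org",
    rater="Reviewer A",
    date_accessed="2006-02-15",
)
review.responses["I.1"] = "Yes"
review.content_items.append(ContentItemReview(url="http://www.example-health.org/asthma"))
print(review.website_name, len(review.content_items), "content item(s) reviewed")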
APPENDIX C
BASELINE ESTIMATES OF COMPLIANCE
Table C1. Estimates of Compliance with Criteria and Associated Elements of Disclosure, All Health Websites

All Health Websites (n=102)

Criterion/Disclosure Element                     Number   Percent     SE   RSE (%)   Lower 95% CI   Upper 95% CI
Identity                                             12      8.5    3.59     42.19            3.6           18.9
  Name                                               91     91.6    3.59      3.92           81.2           96.5
  Street address                                     61     54.8    6.59     12.03           41.7           67.3
  Funding sources                                    22     20.2    5.29     26.18           11.7           32.7
Purpose                                              44     35.2    6.27     17.79           24.0           48.4
  Purpose or mission                                 67     64.3    6.35      9.88           50.9           75.7
  Uses and limitations                               79     71.0    6.05      8.52           57.8           81.5
  Association with commercial products               67     60.8    6.48     10.66           47.5           72.7
Content                                               2      0.3    0.16     59.26            0.1            0.9
  Identify advertising content                       82     74.9    5.80      7.75           61.8           84.6
  Describe editorial policy                          12      5.1    2.61     51.58            1.8           13.5
  Authorship (a)                                     21     11.4    3.98     34.79            5.6           22.0
Privacy                                              85     75.3    5.79      7.69           62.1           85.0
  Privacy policy                                     88     79.1    5.47      6.91           66.3           88.0
  Describe protection of personal information        85     75.3    5.79      7.69           62.1           85.0
User Feedback/Evaluation                             65     58.8    6.52     11.09           45.5           70.9
  Feedback mechanism                                 65     58.8    6.52     11.09           45.5           70.9
Content Updating                                      1      0.1    0.12     92.31           0.02            0.8
  Display date created (a)                            8      4.5    2.60     57.52            1.4           13.5
  Display date reviewed or updated (a)               11      3.2    1.88     58.75            1.0            9.9

Source: Computations by Mathematica Policy Research, Inc.

Note: Number = number of websites in sample found compliant; Percent = weighted percent of all websites estimated to be in compliance; SE = standard error of weighted percentage estimate; RSE = relative standard error (the standard error divided by the percentage estimate); CI = confidence interval of weighted percentage estimate. Shading indicates estimates that are unreliable, due to large standard error relative to small percentage estimates.

(a) This criterion element must be fulfilled for all three pages of health content evaluated.
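The quantities defined in the note above (the weighted percent, its standard error, the RSE as the SE divided by the estimate, and the 95 percent confidence interval) can be illustrated with a short sketch. The data and the with-replacement variance formula below are simplified assumptions; the report's design-based estimator, which reflects the stratified sample and its weights, may differ in detail.

import math

# Illustrative sketch (made-up data, simplified variance): a weighted compliance
# percentage with its SE, RSE, and a 95 percent confidence interval, following
# the definitions in the table note.
y = [1, 1, 1, 0, 1] * 6           # 1 = element disclosed on the reviewed website
w = [2.5] * 15 + [1.2] * 15       # hypothetical sampling weights

w_total = sum(w)
p = sum(wi * yi for wi, yi in zip(w, y)) / w_total            # weighted proportion

# Linearized (with-replacement) approximation to the variance of the weighted proportion.
var_p = sum((wi * (yi - p)) ** 2 for wi, yi in zip(w, y)) / w_total ** 2

percent = 100.0 * p
se = 100.0 * math.sqrt(var_p)
rse = 100.0 * se / percent                                    # SE divided by the estimate
ci_low, ci_high = percent - 1.96 * se, percent + 1.96 * se    # symmetric (Wald) interval

print(f"Percent {percent:.1f}, SE {se:.2f}, RSE {rse:.1f}%, "
      f"95% CI ({ci_low:.1f}, {ci_high:.1f})")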
Table C2. Estimates of Compliance with Criteria and Associated Elements of Disclosure, by Stratum

Frequently Visited Sites (n=52)

Criterion/Disclosure Element                     Number   Percent     SE   RSE (%)   Lower 95% CI   Upper 95% CI
Identity                                              8     15.4    4.39     28.54            8.5           26.2
  Name                                               45     86.5    4.16      4.81           76.0           92.9
  Street address                                     34     65.4    5.79      8.86           53.2           75.8
  Funding sources                                    12     23.1    5.13     22.23           14.5           34.7
Purpose                                              27     51.9*   6.08     11.71           40.0           63.7
  Purpose or mission                                 35     67.3    5.71      8.48           55.2           77.5
  Uses and limitations                               44     84.6    4.39      5.19           73.8           91.5
  Association with commercial products               37     71.2    5.52      7.76           59.1           80.8
Content                                               2      3.9    2.34     60.78            1.1           12.3
  Identify advertising content                       45     86.5    4.16      4.81           76.0           92.9
  Describe editorial policy                          10     19.2*   4.80     24.96           11.4           30.5
  Authorship (a)                                     16     30.8*   5.62     18.26           20.9           42.9
Privacy                                              48     92.3*   3.24      3.51           82.9           96.7
  Privacy policy                                     49     94.2*   2.84      3.01           85.3           97.9
  Describe protection of personal information        48     92.3*   6.24      6.76           82.9           96.7
User Feedback/Evaluation                             36     69.2    5.62      8.12           57.1           79.2
  Feedback mechanism                                 36     69.2    5.62      8.12           57.1           79.2
Content Updating                                      1      1.9    1.67     86.98            0.3           10.2
  Display date created (a)                            6     11.5    3.89     33.71            5.8           21.7
  Display date reviewed or updated (a)               10     19.2*   4.80     24.96           11.4           30.5

Remainder Sites (n=50)

Criterion/Disclosure Element                     Number   Percent     SE   RSE (%)   Lower 95% CI   Upper 95% CI
Identity                                              4      8.0    3.85     48.13            3.0           19.7
  Name                                               46     92.0    3.85      4.18           80.3           97.0
  Street address                                     27     54.0    7.07     13.09           40.0           67.4
  Funding sources                                    10     20.0    5.67     28.35           11.0           33.6
Purpose                                              17     34.0    6.72     19.76           22.2           48.3
  Purpose or mission                                 32     64.0    6.81     10.64           49.7           76.2
  Uses and limitations                               35     70.0    6.50      9.29           55.8           81.2
  Association with commercial products               30     60.0    6.95     11.58           45.8           72.7
Content                                               0      0.0    0.00      n.a.           n.a.           n.a.
  Identify advertising content                       37     74.0    6.22      8.41           60.0           84.4
  Describe editorial policy                           2      4.0    2.78     69.50            1.0           14.9
  Authorship (a)                                      5     10.0    4.25     42.50            4.2           22.1
Privacy                                              37     74.0    6.22      8.41           60.0           84.4
  Privacy policy                                     39     78.0    5.87      7.53           64.3           87.5
  Describe protection of personal information        37     74.0    6.22      8.41           60.0           84.4
User Feedback/Evaluation                             29     58.0    7.00     12.07           43.9           71.0
  Feedback mechanism                                 29     58.0    7.00     12.07           43.9           71.0
Content Updating                                      0      0.0    0.00      n.a.           n.a.           n.a.
  Display date created (a)                            2      4.0    2.78     69.50            1.0           14.9
  Display date reviewed or updated (a)                1      2.0    1.99     99.50            0.3           13.2

Source: Computations by Mathematica Policy Research, Inc.

Note: Number = number of websites in stratum found compliant; Percent = percent of websites in stratum estimated to be in compliance; SE = standard error of the percentage estimate; RSE = relative standard error (the standard error divided by the percentage estimate); CI = confidence interval of percentage estimate; n.a. = not applicable due to a zero estimate and standard error. Shading indicates estimates that are unreliable, due to large standard error relative to small percentage estimates.

(a) This criterion element must be fulfilled for all three pages of health content evaluated.

* Difference in estimated percentages for "frequently visited sites" compared to "remainder sites" is statistically significant at the 95 percent level of confidence based on student's t-test of independent samples.
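The asterisk convention above rests on a t-test of independent samples. As a minimal, unweighted illustration of that comparison, the sketch below uses the Privacy policy counts from the table (49 of 52 frequently visited sites versus 39 of 50 remainder sites); the report's own test may have incorporated the survey weights, and the 1.98 cutoff is only an approximation to the exact t critical value.

import math

# Illustrative sketch: an unweighted independent-samples t-test for the difference
# between the two strata on one element, using the Privacy policy counts in Table C2.
# The 1.98 cutoff approximates the two-sided 5 percent critical value of a
# t distribution with roughly 100 degrees of freedom.
def stratum_stats(compliant, n):
    p = compliant / n
    var = p * (1 - p) * n / (n - 1)          # sample variance of the 0/1 indicator
    return p, var

p1, v1 = stratum_stats(49, 52)               # frequently visited sites
p2, v2 = stratum_stats(39, 50)               # remainder sites

t = (p1 - p2) / math.sqrt(v1 / 52 + v2 / 50)
print(f"Difference {100 * (p1 - p2):.1f} points, t = {t:.2f}, "
      f"significant at the 95 percent level: {abs(t) > 1.98}")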