Critical Appraisal

Evidence-Based Medicine
Critical Appraisal In
Evidence-Based
Medicine
柯德鑫, MD, PhD.
Department of Neurology, Chi Mei Medical Center
Evidence-Based Medicine
(EBM)

The consistent use of current best
evidence derived from published
clinical and epidemiologic research
in management of patients, with
attention to the balance of risks and
benefits of diagnostic tests and
alternative treatment regimens,
taking account of each patient’s
unique circumstances, including
baseline risk, comorbid conditions
and personal preferences.
Evidence-Based Medicine
(EBM)

Meta-analysis is an overview that uses
quantitative methods to summarise the
results from different clinical trials.

Odds Ratio describes the odds of an
experimental patient suffering an adverse
event relative to a control patient.
What evidence-based
medicine is:
The practice of EBM is the integration
of
• individual clinical expertise
with the
• best available external clinical
evidence from systematic research
and
• patient’s values and expectations
What is Evidence-Based
Medicine?






See a patient
Ask a question
Seek the best evidence for that
question
Appraise that evidence
Apply the evidence
Monitor the change
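Health state
Disease
Patient
Pathophysiological changes
Feeling unwell
Diagnostic criteria:
History taking
Highly specific findings
Physician
Statistical association
Similar pathophysiological/pathological mechanisms
Physical examination
Provisional diagnostic reasoning
Form the most likely hypothesized diagnosis
Relevant laboratory tests and examinations
Confirm the hypothesized diagnosis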
Evidence-Based Medicine





Ask a question
Acquire some articles
Appraise the evidence
Apply the findings
Assess your performance
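Definition of Levels of Evidence
Level I: highest-level evidence
Sources: (1) Randomized double-blind trials with primary endpoint analysis in an adequate number of patients;
(2) Meta-analyses of properly conducted, high-quality randomized trials
Level II: intermediate-level evidence
Sources: (1) Randomized non-double-blind trials
(2) Small randomized trials
(3) Large randomized trials with prespecified secondary endpoint analysis
Level III: lower-level evidence
Sources: (1) Prospective case series with concurrent or before-and-after controls
(2) Post hoc analyses of randomized trials
Level IV: undetermined evidence
Sources: (1) Small case series without controls, or case reports
(2) General agreement among experts despite the absence of scientific evidence from controlled trials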
What is critical
appraisal?
Critical appraisal is the
process of systematically
examining research
evidence to assess its
validity, results and
relevance before using it to
inform a decision.
Critical Appraisal




The process of deciding whether a
piece of research can help you in
answering your clinical question.
Three questions you need to ask
about any kind of research:
1. Is it valid?
2. Is it important?
3. Is it applicable to the patient?
Critical Appraisal Skills

VIP
Valid: examining the research methods
Important: analyzing the conclusions
Practice: how to apply the evidence to patient care
RCT/Cohort study/Case-control
study
How to Appraise
Evidence?
·By Expertise
-Secondary journal
-CAT(critical appraisal topics)
-Cochrane database
-JAMA & BMJ Guide
-Basic Clinical Statistics
·Have an appraisal course
·EBM style journal meeting
Evaluation of A Diagnostic
Test




Sensitivity is the proportion of people with the
disease who have a positive test.
Specificity is the proportion of people free of the
disease who have a negative test.
Positive Predictive Value (+PV) is the proportion
of people with a positive test who truly have the
disease.
Negative Predictive Value (-PV) is the proportion
of people with a negative test who are free of the
disease.
Evaluation of A Diagnostic
Test

SnNout: when a sign/test has a high sensitivity, a negative
result rules out the diagnosis; e.g. the sensitivity of a
history of ankle swelling for diagnosing ascites is 92%,
therefore if a person does not have a history of ankle
swelling, it is highly unlikely that the person has ascites.

SpPin: when a sign/test has a high specificity, a positive
result rules in the diagnosis; e.g. the specificity of fluid
wave for diagnosing ascites is 92%. Therefore, if a person
has a fluid wave, it is highly likely that the person has
ascites.
Critical Appraisal for Diagnosis



Are the results of this diagnostic study
valid?
Are the valid results of this diagnostic
study important?
Can you apply this valid, important
evidence about a diagnostic test in
caring for your patient?
Are the results of this diagnostic
study valid?




Was there an independent, blind comparison
with a reference ("gold") standard of
diagnosis?
Was the diagnostic test evaluated in an
appropriate spectrum of patients (like those in
whom we would use it in practice)?
Was the reference standard applied
regardless of the test result?
Was the test (or cluster of tests) validated in a
second, independent group of patients?
Are the valid results of this diagnostic
study important?

Sensitivity = a / (a+c) = 731/809 = 90 %
Specificity = d / (b+d) = 1500/1770 = 85 %
LR+ = sens / (1-spec) = 90/15 = 6
LR- = (1-sens) / spec = 10/85 = 0.12
Positive Predictive Value = a / (a+b) = 731/1001 = 73 %
Negative Predictive Value = d / (c+d) = 1500/1578 = 95 %
Prevalence = (a+c) / (a+b+c+d) = 809/2579 = 31 %
Pre-test odds = prevalence / (1-prevalence) = 31/69 = 0.45
Post-test odds = pre-test odds * LR
Post-test probability = post-test odds / (post-test odds + 1)
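The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. The function names are hypothetical, and the cell labels (a = true positives, b = false positives, c = false negatives, d = true negatives) are an assumption inferred from the formulas. Because the slide works from rounded percentages, the unrounded likelihood ratios and pre-test odds printed here can differ slightly from the slide's figures.

```python
def diagnostic_metrics(a, b, c, d):
    """Summarise a 2x2 diagnostic table.

    Assumed cell labels (inferred from the slide's formulas):
    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sens = a / (a + c)
    spec = d / (b + d)
    prevalence = (a + c) / (a + b + c + d)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": prevalence,
        "pretest_odds": prevalence / (1 - prevalence),
    }


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (post_test_odds + 1)


# Cell counts implied by the worked figures:
# a = 731, b = 1001 - 731 = 270, c = 809 - 731 = 78, d = 1500.
m = diagnostic_metrics(a=731, b=270, c=78, d=1500)
for name, value in m.items():
    print(f"{name}: {value:.2f}")

# A positive result at the study prevalence: by construction this
# post-test probability equals the positive predictive value.
print(f"{post_test_probability(m['prevalence'], m['lr_pos']):.2f}")
```

For an individual patient, the same `post_test_probability` call can be made with that patient's own estimated pre-test probability instead of the study prevalence.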
Can you apply this valid, important
evidence about a diagnostic test in caring
for your patient?

Is the diagnostic test available, affordable,
accurate and precise in your setting?

Can you generate a clinically sensible
estimate of your patient's pre-test probability
(from practical data, from personal
experience, from the report itself, or from
clinical speculation)?
Can you apply this valid, important
evidence about a diagnostic test in caring
for your patient?

Will the resulting post-test probabilities affect
your management and help your patient?
(Could it move you across a test-treatment
threshold? Would your patient be a willing
partner in carrying it out?)

Would the consequences of the test help your
patient?
Evidence-Based Medicine
(EBM) in Clinical Trial

Number Needed to Treat (NNT) is the
number of patients who need to be treated
to prevent one bad outcome.

Number Needed to Harm (NNH) is the
number of patients who need to be treated
for one additional patient to develop a bad
outcome or adverse effect.
Evidence-Based Medicine
(EBM) in Clinical Trial


Intention-to-treat analysis: analyzing all
randomized patients in the groups to which
they were assigned, regardless of what occurs
during the clinical trial.
Adherence-to-protocol (on-treatment)
analysis: analyzing patients according to the
treatment they actually received.
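The difference between the two analyses can be illustrated with a small sketch. Everything in it is hypothetical: the `Patient` record, the `event_rate` helper, and the eight made-up patients are assumptions chosen only to show how one crossover changes the denominators.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    assigned: str   # arm chosen by randomization
    received: str   # treatment the patient actually received
    event: bool     # True if the bad outcome occurred


def event_rate(patients, arm, by):
    """Event rate in one arm, grouping by 'assigned' or 'received'."""
    group = [p for p in patients if getattr(p, by) == arm]
    return sum(p.event for p in group) / len(group)


# Hypothetical trial: one patient randomized to active treatment
# crossed over to placebo and then had the event.
patients = [
    Patient("active", "active", False),
    Patient("active", "active", False),
    Patient("active", "active", True),
    Patient("active", "placebo", True),
    Patient("placebo", "placebo", True),
    Patient("placebo", "placebo", True),
    Patient("placebo", "placebo", False),
    Patient("placebo", "placebo", False),
]

# Intention-to-treat: group by the randomized assignment.
itt_active = event_rate(patients, "active", by="assigned")
# On-treatment: group by the treatment actually received.
on_treatment_active = event_rate(patients, "active", by="received")

print(f"ITT event rate, active arm: {itt_active:.2f}")
print(f"On-treatment event rate, active arm: {on_treatment_active:.2f}")
```

In this made-up data the on-treatment analysis makes the active arm look better, because the patient who crossed over and had the event is counted with the placebo group.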
Therapies




To treat or not to treat?
Validity
Importance
Applicability
Therapies

Validity
• Was it randomised?
• Was the allocation concealed?
• Were all the subjects analysed
correctly?
• Was it blinded?
• Were the groups similar?
Therapies

Importance
• What were the results?
• Over what time period?
• With what precision?
Therapies




Number needed to treat
Relative risk reduction
Absolute risk reduction
Event rates
Therapies

Event rates
• n with event / total


Control event rate (CER)
Experimental event rate (EER)
Therapies

Absolute risk reduction
• difference in two event rates
• CER - EER = ARR

Relative risk reduction
• proportion of control rate
• (CER - EER) / CER = RRR
Therapies

Number needed to treat
• number of extra patients you need
to treat to prevent one bad
outcome
• 1 / ARR = NNT
Therapies

95% confidence interval
• range within which the true value
falls with 95% confidence
• use computer (e.g. CATMaker)
Occurrence of death, stroke, or other major complications
Number needed to treat (NNT)

Patient status at entry      Adverse events             RRR    ARR    1/ARR = NNT
                             Placebo (P)   Active (A)
Prior target organ damage    .22           .08          64%    .14    1/.14 = 7
No prior organ damage        .10           .04          60%    .06    1/.06 = 17
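The arithmetic behind this table can be sketched as follows. The function name `treatment_effect` is hypothetical. The event rates are the placebo and active rates from the table, and the NNT is rounded to the nearest whole patient, as the table does.

```python
def treatment_effect(cer, eer):
    """Return (ARR, RRR, NNT) from control and experimental event rates."""
    arr = cer - eer   # absolute risk reduction
    rrr = arr / cer   # relative risk reduction
    nnt = 1 / arr     # number needed to treat (undefined when ARR is 0)
    return arr, rrr, nnt


# Event rates from the table: (placebo, active).
strata = {
    "Prior target organ damage": (0.22, 0.08),
    "No prior organ damage": (0.10, 0.04),
}
for label, (cer, eer) in strata.items():
    arr, rrr, nnt = treatment_effect(cer, eer)
    print(f"{label}: ARR={arr:.2f}, RRR={rrr:.0%}, NNT={nnt:.0f}")
```

The two strata have similar relative risk reductions but different NNTs, because the NNT depends on the baseline (control) event rate.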
THERAPY WORKSHEET
Citation:
Are the results of this single preventive or therapeutic trial valid?
Was the assignment of patients to treatments randomized?
-and was the randomization list concealed?
Were all patients who entered the trial accounted for at
its conclusion? -and were they analyzed in the groups to
which they were randomized?
Were patients and clinicians kept “blind” to which treatment was being received?
Aside from the experimental treatment, were the groups
treated equally?
Were the groups similar at the start of the trial?
THERAPY WORKSHEET
Are the valid results of this randomized trial
important?
SAMPLE CALCULATIONS: Occurrence of diabetic neuropathy
Usual insulin, Control Event Rate (CER): 9.6%
Intensive insulin, Experimental Event Rate (EER): 2.8%
Relative Risk Reduction (RRR) = (CER - EER) / CER = (9.6% - 2.8%) / 9.6% = 71%
Absolute Risk Reduction (ARR) = CER - EER = 9.6% - 2.8% = 6.8% (4.3% to 9.3%)
Number Needed to Treat (NNT) = 1/ARR = 1/6.8% = 15 pts (11 to 23)
THERAPY WORKSHEET
SAMPLE CALCULATIONS:
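95% Confidence Interval (CI) on an NNT = 1 / (limits on the CI of its ARR), where the limits on the ARR are

$$
\pm 1.96 \sqrt{\frac{\mathrm{CER}\,(1-\mathrm{CER})}{\#\text{ of control pts.}} + \frac{\mathrm{EER}\,(1-\mathrm{EER})}{\#\text{ of exper. pts.}}}
= \pm 1.96 \sqrt{\frac{0.096 \times 0.904}{730} + \frac{0.028 \times 0.972}{711}}
\approx \pm 2.5\%
$$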
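The confidence-interval calculation on this worksheet can be checked with a short sketch. The function names are hypothetical. The inputs (CER 9.6% in 730 control patients, EER 2.8% in 711 experimental patients) are the worksheet's own figures, and the normal-approximation formula is the one the worksheet uses.

```python
import math


def arr_confidence_interval(cer, n_control, eer, n_exper, z=1.96):
    """Normal-approximation CI for the absolute risk reduction."""
    arr = cer - eer
    se = math.sqrt(cer * (1 - cer) / n_control + eer * (1 - eer) / n_exper)
    half_width = z * se
    return arr - half_width, arr + half_width


def nnt_confidence_interval(cer, n_control, eer, n_exper, z=1.96):
    """CI for the NNT, taken as the reciprocals of the ARR limits."""
    low, high = arr_confidence_interval(cer, n_control, eer, n_exper, z)
    if low <= 0:
        # If the ARR interval includes zero, the NNT interval is not finite.
        raise ValueError("ARR confidence interval includes zero")
    return 1 / high, 1 / low


arr_low, arr_high = arr_confidence_interval(0.096, 730, 0.028, 711)
nnt_low, nnt_high = nnt_confidence_interval(0.096, 730, 0.028, 711)
print(f"ARR 95% CI: {arr_low:.1%} to {arr_high:.1%}")
print(f"NNT 95% CI: {nnt_low:.0f} to {nnt_high:.0f}")
```

Note that the limits swap when inverted: the upper ARR limit gives the lower NNT limit.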
YOUR CALCULATIONS:
CER:
EER:
Relative Risk Reduction (RRR) = (CER - EER) / CER:
Absolute Risk Reduction (ARR) = CER - EER:
Number Needed to Treat (NNT) = 1/ARR:
Therapies

Application
• Can it be applied to my patient?
• Can it be done here?
• How do patient values affect the
decision?
Therapies


Is it valid?
Is it important?
• NNT for what
over how long
with what precision

Does it apply?
Six guides to distinguish useful from
useless or even harmful therapy
1. Was the assignment of patients to treatments really
randomized?
2. Were all clinically relevant outcomes reported?
3. Were the study patients recognizably similar to your own?
4. Were both clinical and statistical significance
considered?
5. Is the therapeutic maneuver feasible in your practice?
6. Were all patients who entered the study accounted for
at its conclusion?
The likelihood of help vs.
harm (LHH)
In applying a SR or RCT to an individual
patient, we need to consider:
•our patient's risk, relative to patients in the
trial, of the event we hope to prevent with
the treatment: ft
•our patient's risk, relative to patients in the
trial, of the side-effect we might cause
from the treatment: fh
•our patient's perception of the severity of
the event we're trying to prevent relative
to the side-effect we might cause: s
The likelihood of help vs. harm
is
(1/NNT) x ft x s vs. (1/NNH) x fh
For example, suppose we're applying a trial with
an NNT of 9 and an NNH of 12 and we think our
patient is at just half the risk of the event but at
twice the risk of the side-effect, then the "raw"
LHH before we adjust it for our patient's
perception of relative severity is 1/9 x 0.5 vs.
1/12 x 2 = 1/18 vs. 1/6, or three times as likely to
harm vs. help the patient. However, if our
patient regards the severity of the event that
the treatment might prevent to be six times
worse than the side-effect it might cause, then
the final LHH = 1/18 x 6 vs. 1/6 = 1/3 vs. 1/6, or two times as
likely to help vs. harm.
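The worked example can be reproduced with a short sketch. The function name and its default arguments are hypothetical. Exact fractions are used so that the intermediate values match the 1/18, 1/6 and 1/3 in the text.

```python
from fractions import Fraction


def likelihood_of_help_vs_harm(nnt, nnh, ft=1, fh=1, s=1):
    """Return the two sides of the LHH comparison as exact fractions.

    help side = (1/NNT) x ft x s
    harm side = (1/NNH) x fh
    """
    help_side = Fraction(1) / Fraction(nnt) * Fraction(ft) * Fraction(s)
    harm_side = Fraction(1) / Fraction(nnh) * Fraction(fh)
    return help_side, harm_side


# "Raw" LHH: half the risk of the event (ft), twice the risk of the side-effect (fh).
raw_help, raw_harm = likelihood_of_help_vs_harm(9, 12, ft=Fraction(1, 2), fh=2)
print(f"raw: {raw_help} vs. {raw_harm}, harm/help = {raw_harm / raw_help}")

# Final LHH: the patient rates the event six times worse than the side-effect (s).
final_help, final_harm = likelihood_of_help_vs_harm(
    9, 12, ft=Fraction(1, 2), fh=2, s=6
)
print(f"final: {final_help} vs. {final_harm}, help/harm = {final_help / final_harm}")
```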
Should these valid, potentially important
results change the treatment of our patient?
1. Is our patient so different from those included in
the study that its results don’t apply?
2. What is our patient’s risk of the adverse event?
What is our patient’s potential benefit from the
therapy?
3. What are our patient’s preferences, concerns and
expectations from this treatment?
4. What alternative treatments are available?
Are the results of this systematic
review of therapy valid?
1. Is this a systematic review of randomized trials?
2. Does it include a methods section that describes:
(a) finding and including all the relevant trials?
(b) assessing their individual validity?
3. Were the results consistent from study to study?
(4. Were individual patient data used in the analysis
(or aggregate data)? )
Is the systematic review
important?
Are the valid results of this
systematic review important?
Translating odds ratios to NNTs
Are the valid, important results of this
systematic review applicable to our
patient?
1. Is our patient so different from those in the study
that its results cannot apply?
2. Is the treatment feasible in our setting?
3. What are our patient's potential benefits and harms
from the therapy?
4. What are our patient's values and preferences for
both the outcome we are trying to prevent and the
side-effects we may cause?
Are the recommendations
in this guideline valid?
1. Did its developers carry out a
comprehensive, reproducible
literature review within the past 12
months?
2. Is each of its recommendations
both tagged by the level of
evidence upon which it is based
and linked to a specific citation?
Are the results of this
prognosis study valid?
1. Was a defined, representative sample of
patients assembled at a common (usually
early) point in the course of their
disease?
2. Was patient follow-up sufficiently long and
complete?
3. Were objective outcome criteria applied in
a “blind” fashion?
4. If subgroups with different prognoses are
identified, was there adjustment for
important prognostic factors?
5. Was there validation in an independent
group (“test set”) of patients?
Are the valid results of this
prognosis study
important?
1. How likely are the outcomes
over time?
2. How precise are the prognostic
estimates?
Can we apply this valid,
important evidence about
prognosis in caring for our
patient?
1. Were the study patients similar
to our own?
2. Will this evidence make a
clinically important impact on
our conclusions about what to
offer or tell our patient?
Are the results of this
harm/etiology study valid?
1. Were there clearly defined groups of
patients, similar in all important ways
other than exposure to the treatment
or other cause?
2. Were treatments/exposures and
clinical outcomes measured in the
same ways in both groups (was the
assessment of outcomes either
objective or blinded to exposure)?
3. Was the follow-up of study patients
complete and long enough?
Are the results of this
harm/etiology study valid?
4. Do the results satisfy some “diagnostic
tests for causation”?
• Is it clear that the exposure preceded the
onset of the outcome?
• Is there a dose-response gradient?
• Is there positive evidence from a
“dechallenge-rechallenge” study?
• Is the association consistent from study to
study?
• Does the association make biological
sense?
Are the valid results from this
harm/etiology study important?
                                  Adverse outcome
                                  Present (case)   Absent (control)   Totals
Exposed to the    Yes (cohort)    a                b                  a+b
treatment         No (cohort)     c                d                  c+d
                  Totals          a+c              b+d                a+b+c+d
In a randomized trial or cohort study:
Relative risk (RR) = [a/(a+b)]/[c/(c+d)]
In a case-control study: relative odds (RO) = ad/bc
To calculate the NNH for any OR and PEER:
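The formula itself does not survive on this slide, so the sketch below derives the NNH from first principles rather than quoting it: the patient's expected event rate (PEER) is converted to odds, multiplied by the odds ratio, and converted back to a risk, and the NNH is the reciprocal of the risk difference. For OR > 1 this is algebraically equivalent to NNH = [PEER x (OR - 1) + 1] / [PEER x (OR - 1) x (1 - PEER)]. The function names and the example numbers are hypothetical.

```python
def relative_risk(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)], for a randomized trial or cohort study."""
    return (a / (a + b)) / (c / (c + d))


def relative_odds(a, b, c, d):
    """RO (odds ratio) = ad/bc, for a case-control study."""
    return (a * d) / (b * c)


def nnh_from_odds_ratio(odds_ratio, peer):
    """NNH for an odds ratio above 1 and a patient's expected event rate."""
    control_odds = peer / (1 - peer)
    exposed_odds = odds_ratio * control_odds
    exposed_risk = exposed_odds / (1 + exposed_odds)
    return 1 / (exposed_risk - peer)


# Hypothetical 2x2 counts, for illustration only.
print(f"RR = {relative_risk(20, 80, 10, 90):.2f}")
print(f"RO = {relative_odds(20, 80, 10, 90):.2f}")

# Hypothetical odds ratio of 2 with a PEER of 5%.
print(f"NNH = {nnh_from_odds_ratio(2.0, 0.05):.1f}")
```

The same odds ratio gives a much larger NNH when the PEER is very low, which is why the NNH has to be computed for the individual patient's baseline risk.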
Are the results of this
clinical decision analysis
valid?
1. Were all the important therapeutic
alternatives (including no
treatment) and outcomes included?
2. Are the probabilities of the
outcomes valid and credible?
3. Are the utilities of the outcomes
valid and credible?
4. Was the robustness of the
conclusion tested?
Are the valid results from
this decision analysis
important?
1. Did one course of action lead
to clinically important gains?
2. Was the same course of action
preferred despite clinically
sensible changes in
probabilities and utilities?
Are the valid, important results of
this decision analysis applicable
to our patient?
1. Do the probabilities apply to
our patient?
2. Can our patient state his/her
utilities in a stable, usable
form?