class slides - MedBlog

Approaches to
pharmacogenomics studies
Russ B. Altman, MD, PhD
Professor of Genetics, Bioengineering &
Medicine
Stanford University
Stanford – South Africa Biomedical Informatics Program
How can we study PGx?
Broadly, there are two approaches:
1. Genotype variation to Phenotype
2. Phenotype variation to Genotype
Genotype to Phenotype
• From the start: Suspicion/knowledge
that a gene or gene family is likely to be
important for drug response.
• So, look for genetic variation in these
genes, and characterize the functional
significance
• E.g. Phase I oxidoreductase enzymes,
Phase II conjugating enzymes,
transporters
Genotype to Phenotype
• Screen individuals (by sequencing) from
different populations for polymorphisms
(SNPs, indels, etc…)
• Polymorphisms with high frequency are
then studied phenotypically
– First with molecular, cellular assays
– Then, with clinical studies
Genotype to Phenotype
• Example: new transporter molecule
• Sequence gene in 100 individuals from different
ethnic groups
• Find most common variations (coding)
• Put transporter in cell system (e.g. yeast) and
measure transport phenotype (e.g. uptake of
radioactive small molecule)
• If functional differences, then…
• Study clinically with hypothesis about
increased or decreased function in individuals
with polymorphism.
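The "find most common variations" step above comes down to counting alleles across the sequenced individuals. A minimal sketch, assuming diploid genotypes written as two-letter strings such as "AG"; the function names and the sample genotypes are hypothetical and not from the lecture:

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} at one site from diploid genotype strings."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)  # each genotype contributes two alleles
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def minor_allele_frequency(genotypes):
    """Frequency of the second most common allele (0.0 if the site does not vary)."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0

# Hypothetical genotypes at one coding site in 10 individuals
site = ["AA", "AG", "AA", "GG", "AG", "AA", "AA", "AG", "AA", "AA"]
print(allele_frequencies(site))
print(minor_allele_frequency(site))
```

Sites with a high minor allele frequency would be the candidates for the functional assays described in the slide.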
Problems with G to P
• How do you choose where to look for
variation (exons vs. everything)?
• How do you choose which polymorphisms
to follow up on functionally?
• How do you know which drugs may be
affected by gene and its polymorphisms?
• What if there is no significant variation
in the gene? Not much to follow up on…
Genotype to Phenotype
• Note: common polymorphisms may also be in
promoter regions, introns, synonymous coding
regions
• Then, studying protein product not directly
useful.
• Instead, must study rates of expression,
degradation.
• Still can advance to clinical hypotheses, based
on accumulated evidence.
Phenotype to Genotype
• From the start: Suspicion/knowledge
that a drug-response phenotype shows
marked variation in population. Likely
genetic.
• So, find patients with high/low
phenotype, and use knowledge of drug
pathway to find genotypic variations that
explain.
Variation in TPMT Activity
[Figure. Weinshilboum (Mayo Clinic) 2001]
Distribution of Debrisoquine 4-Hydroxylase
Activity
[Figure; axes: Number vs. Log metabolic ratio. Broly F et al: DNA and Cell Biol 10(8):545, 1991]
Distribution of FEV1 Change in
response to inhaled steroid for asthma
[Figure; axes: Patients, % vs. % Change in FEV1 from Baseline]
Microarray analysis of extreme
samples (McLeod et al)
[Figure: Population Variation at 10nM Docetaxel; axes: Number of Samples vs. Relative Viability]
Phenotype to Genotype
• Given the phenotype variation, next must
find genotypic variation:
– Candidate genes from known pathways of PK
or PD
– Whole genome assessment of variation (more
on this later)
• Often sample the tails of the distribution
to find individuals with the most
differences
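Sampling the tails of the distribution can be sketched as ranking individuals by phenotype and taking a fixed fraction from each end. The function name, the 20% fraction, and the response values are assumptions made for illustration only:

```python
def sample_tails(phenotypes, fraction=0.1):
    """Return (low_ids, high_ids): individuals in the lowest and highest
    `fraction` of the phenotype distribution."""
    ranked = sorted(phenotypes, key=phenotypes.get)  # ids ordered by phenotype value
    k = max(1, int(len(ranked) * fraction))          # at least one individual per tail
    return ranked[:k], ranked[-k:]

# Hypothetical drug-response values for 10 patients
values = [0.12, 0.55, 0.91, 0.48, 0.03, 0.67, 0.50, 0.98, 0.44, 0.59]
responses = {f"patient_{i}": v for i, v in enumerate(values)}

low, high = sample_tails(responses, fraction=0.2)
print(low)
print(high)
```

The two groups returned here would then be genotyped at the candidate genes, or genome-wide, to look for variants that differ between them.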
Problems with P to G
• Need to know something about the genes
involved in the drug
• May be multiple genes with small effects
• Need to go back and see how much of
the variation is really explained by
genetics. E.g. Warfarin work focused on
CYP2C9 for many years, recently VKORC1
found to explain much more of variation
in dosage.
Future looks good for P to G
• Whole genome scans for SNPs allow
large populations to be genotyped at
MANY positions.
• Then, becomes issue of finding the SNPs
that best predict the variability of
interest.
Whole Genome Scan Strategy
1. Find population with variability of clinical
interest.
2. [Somewhat critical: Establish that variability
is heritable]
3. Pick outliers, and perhaps some central
individuals.
4. Genotype 100,000 to 300,000 SNPs
distributed throughout genome in ~1000
individuals.
Whole Genome Scan Strategy
5. Compute correlation of individual SNPs with
phenotype.
• Need to choose appropriate measure of
correlation
• Genotype: e.g. AA, AG, GG
• Phenotype: quantitative [e.g. 0 to 1.0] vs.
categorical [e.g. X, Y or Z]
• Epistasis: interaction of SNPs. May be that
need to look at A/a & B/b to see effect
(computationally much more complex!)
6. Examine regions of genome with most
correlated SNPs. May identify numerous
regions, if multiple genes are involved.
• Single gene = strong association (unlikely)
• Multiple genes = multiple weak associations
7. Use independent sources of data to
evaluate the candidate genomic regions
for supporting evidence.
Some independent data sources
• Expression data: examine response of
cells to drug to see what genes are
up/down regulated in response to drug.
• Linkage analysis: Look for correlation of
phenotype with inherited markers in
family studies
• Proteomics: examine proteomic profiles
of cells to see if there are phenotypic
differences associated with drug
Whole Genome Scan Strategy
8. If able to focus on region that is suggested by
independent analyses, then examine genes
around the correlated SNP
9. Because of LD, the correlated SNP is likely in the right
region, but is probably NOT the functionally important SNP
10. REPLICATION: In a smaller group (~100-1000)
of separate cases, do focused
sequencing/genotyping at higher density to
replicate findings and identify SNPs likely to be
functional.
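Step 5 of the strategy above (correlating each SNP with the phenotype) can be sketched for a quantitative phenotype as follows. This assumes an additive genotype coding (0, 1 or 2 copies of one allele) and Pearson correlation as the measure; both are illustrative choices, and the SNP names and data are hypothetical:

```python
import math

def additive_code(genotype, counted_allele):
    """Code a genotype such as "AG" as the number of copies of counted_allele."""
    return genotype.count(counted_allele)

def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_snps(genotypes_by_snp, phenotype, counted_allele):
    """Return [(snp_id, r)] sorted by absolute correlation, strongest first."""
    scores = []
    for snp_id, genotypes in genotypes_by_snp.items():
        codes = [additive_code(g, counted_allele) for g in genotypes]
        scores.append((snp_id, pearson_r(codes, phenotype)))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical data: two SNPs genotyped in six individuals
snps = {
    "snp_1": ["AA", "AG", "GG", "AG", "AA", "GG"],
    "snp_2": ["AG", "AG", "AA", "GG", "AG", "AA"],
}
phenotype = [0.1, 0.5, 0.9, 0.4, 0.2, 1.0]  # quantitative, on a 0 to 1.0 scale
ranked = rank_snps(snps, phenotype, counted_allele="G")
print(ranked)
```

A categorical phenotype (X, Y or Z) would need a different measure, for example a test on the genotype-by-category contingency table.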
Will whole genome work?
Perlegen genotypes 1.6 million SNPs in 71 people.
Hinds et al, Science 307, p 1072
Linkage analysis
McLeod et al have used for PGx
1. Used CEPH panel of genotyped individuals
(family trios), with cell lines available.
2. Developed a drug assay for sensitivity to drug
on cell lines
3. Tested all cell lines in the assay
4. Performed linkage analysis to find overall
region of phenotype linkage
5. Performed microarray expression to narrow
down genes of interest
6. Used SNP data to find correlated SNPs for
phenotype
Genes associated with docetaxel sensitivity
[Figure. Watters et al PNAS 2004]
SNP associations
[Figure]
Whole Genome Scan Summary
[Figure labels: Association; Linkage; Expression; >30,000; ~300 QTL genes; 3 GENES; 2 GENES; 1 GENE]
Challenges for whole-genome
• With 500K measurements, and only a few
phenotypes, many false positive
associations.
• Phenotype needs to be carefully defined,
as low-noise as possible
• Population admixture complicates analysis
– Variable LD patterns (correlations)
• Hard to use background knowledge (e.g.
hybrid candidate gene + whole genome
association)
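The false positive problem in the first bullet can be made concrete with simple arithmetic. A minimal sketch, assuming each SNP is tested independently at a fixed significance level and using a Bonferroni correction as one common (and conservative) remedy; neither the correction nor the 0.05 level is named in the slides:

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of SNPs passing the threshold by chance alone,
    if none of them is truly associated."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold that keeps the overall false positive
    rate at or below alpha."""
    return alpha / n_tests

n_snps = 500_000
print(expected_false_positives(n_snps, alpha=0.05))
print(bonferroni_threshold(n_snps, alpha=0.05))
```

With 500K SNPs, a naive 0.05 threshold lets through on the order of 25,000 chance associations, while the corrected per-SNP threshold is about 1e-7, which is why large samples and a low-noise phenotype matter.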
6 'high priority' genes
Analogy to machine learning
• Machine learning:
– Independent features
– Dependent variables to be predicted
– Large data sets
• Web usage
• Consumer patterns of behavior
• Large database association mining
Analogy to machine learning
• Genotypes + environmental variables =
independent variables
• Phenotypes = dependent variables to be
predicted
• Phenotype = F(Genotype + Environment)
• What is the functional form of F?
F = gene(i)*weight(i) + environment(j)*weight(j) ??
F = g(i)*w(i)*e(j)*w(j) + …
F = sin[g(i)^2 * tan(e(j)*2pi)]
Issues in ML
• Nature of f(genotype + environment)?
• If f is linear = weighted combination of
genotypes = easier to detect
• If f is nonlinear = complicated function of
genotypes = much harder to detect
• Certain machine learning algorithms
better for different situations
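Why a nonlinear f is harder to detect can be seen in a toy case of pure epistasis. The sketch below assumes a hypothetical balanced population and a phenotype that is 1 only when exactly one of two loci carries the variant; it is an illustration, not data from any study:

```python
import math
from itertools import product
from statistics import mean

# Every combination of carrier status (0 or 1) at loci A and B, 25 individuals each
individuals = [(a, b) for a, b in product([0, 1], repeat=2) for _ in range(25)]
# Purely epistatic phenotype: 1 only when exactly one locus is a carrier
phenotype = [a ^ b for a, b in individuals]

def mean_by_group(keys, values):
    """Mean phenotype within each genotype group."""
    groups = {}
    for key, value in zip(keys, values):
        groups.setdefault(key, []).append(value)
    return {key: mean(vals) for key, vals in groups.items()}

by_a = mean_by_group([a for a, _ in individuals], phenotype)
by_pair = mean_by_group(individuals, phenotype)
print(by_a)     # locus A alone: same mean phenotype in both groups
print(by_pair)  # loci A and B together: phenotype fully explained

# Number of SNP pairs to examine if 300,000 SNPs are genotyped
print(math.comb(300_000, 2))
```

Each locus looks uninformative on its own, so a one-SNP-at-a-time scan would miss the effect entirely, and testing all pairs multiplies the number of comparisons enormously.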
Issues in ML
• Feature selection: which features to
include as independent variables?
• Features may be correlated or identical
• Too many features may confuse machine
learning algorithm
• Genotype/Phenotype
– SNPs that are correlated (LD) can be
removed
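Removing correlated SNPs can be sketched as a greedy filter on pairwise r². This assumes genotypes already coded as 0, 1 or 2 copies of one allele; the 0.8 cutoff, the function names, and the data are assumptions for illustration:

```python
def r_squared(xs, ys):
    """Squared Pearson correlation; 0.0 when either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def prune_correlated(snps, threshold=0.8):
    """Greedy pruning: keep a SNP only if its r^2 with every SNP kept
    so far is below the threshold."""
    kept = []
    for snp_id, codes in snps.items():
        if all(r_squared(codes, snps[k]) < threshold for k in kept):
            kept.append(snp_id)
    return kept

# Hypothetical allele counts for three SNPs in six individuals
snps = {
    "snp_a": [0, 1, 2, 1, 0, 2],
    "snp_b": [0, 1, 2, 1, 0, 2],  # identical to snp_a (perfect LD)
    "snp_c": [1, 1, 0, 2, 1, 0],
}
print(prune_correlated(snps))
```

The result depends on the order in which SNPs are visited, which is a known property of greedy pruning rather than a flaw in this particular sketch.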
WEKA
• Open-source (GNU GPL) collection of machine
learning algorithms
• Provides clustering and classification
algorithms
• Relatively easy to use
• Free to download
• Subject of laboratory on Tuesday
afternoon.
Drug discovery and validation
Russ B. Altman, MD, PhD
Professor of Genetics, Bioengineering &
Medicine
Stanford University
How have we discovered drugs?
• Average time from project inception to
drug launch: 13-14 years
• Average total investment per LAUNCHED
drug = $1 billion
• Average chance of project success:
– 1-3% at inception
– 7-8% if drug reaches preclinical testing
1. Basic science
• Generate hypotheses about potential
drug targets based on basic research.
• E.g. A studied gene is mutated in some
HIV-infected patients who never
progress to AIDS.
• Can I develop a drug that “mimics” this
mutation in other people, so that they
also will not progress?
1. Basic science
OR
• E.g. I have elucidated a series of genes
involved in the development of cancer.
• Can I interrupt the development by
blocking one (or more) of the genes?
2. Identify a “lead” compound
• Given the target, attempt to find a compound
that binds it (binding assay) or interferes with
its function (functional assay)
• Usually identified through screening:
– Using the target, develop an (ideally
inexpensive) assay for binding/function
– Create (or purchase) a large library of
compounds
– Test them in the assay, pull out the
“positives” for further study.
Lipinski’s Rules
Christopher Lipinski created rules to
predict which drugs would fail because
of poor pharmacokinetics.
• Molecular mass > 500 Da
• High lipophilicity (log P > 5)
• More than 5 hydrogen bond donors
• More than 10 hydrogen bond acceptors
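The four criteria above can be applied as a simple filter once the descriptors of a compound are known. A minimal sketch: the `Compound` class, the function name, and both example compounds are hypothetical, and the log P > 5 cutoff for "high lipophilicity" comes from Lipinski's published rule of five rather than from the slide:

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    molecular_mass_da: float
    log_p: float            # octanol-water partition coefficient (lipophilicity)
    h_bond_donors: int
    h_bond_acceptors: int

def lipinski_violations(compound):
    """Return the list of rule-of-five criteria that the compound violates."""
    violations = []
    if compound.molecular_mass_da > 500:
        violations.append("molecular mass > 500 Da")
    if compound.log_p > 5:
        violations.append("log P > 5")
    if compound.h_bond_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if compound.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations

# Hypothetical compounds with made-up descriptor values
small = Compound("small_lead", 310.0, 2.1, 2, 5)
bulky = Compound("bulky_lead", 720.5, 6.3, 4, 12)
for c in (small, bulky):
    print(c.name, lipinski_violations(c))
```

In practice the descriptors would be computed from the chemical structure by a cheminformatics toolkit; here they are typed in by hand to keep the example self-contained.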
2. Identify a “lead” compound
• Usually, the lead compound(s) will not be
ideal drug candidates
– Do not fit Lipinski rules
– High chance of toxicity (or demonstrated in
animal studies)
– Does not have desired effect
– Myriad other problems.
3. Optimize the lead
• Organic chemists create variations of
lead (using Lipinski rules, e.g.) to
eliminate problems.
– Can use “combinatorial chemistry” in which many
variations of a backbone molecule are generated by
systematically adding/removing different chemical
groups
• Develop more focused assays
– Test the desired characteristics more accurately
– Can be more expensive, since not used for screening
4. Test optimized leads in animals
[NOTE: Rats are not just “small humans”]
Nevertheless, must establish safety in
animals (mice, rats, pigs, dogs, etc…)
• Check for metabolism of drug
• Check for toxicity, adverse reactions
• Perhaps, check for signs of efficacy.
• Get indication of dosage ranges (mg/kg)
5. Phase I clinical trial
• < 100 healthy people (usually paid)
• Start low dose, increase
• Check safety
– Liver, kidney blood tests
– Other, as indicated
• Evaluate pharmacokinetics (blood levels as a
function of dose)
• Establish maximum tolerated dose (from below!)
• In parallel, work on formulation (purity,
reproducibility)
6. Phase II clinical trial
• < 1000 patients with disease
• Continue to evaluate safety
• Establish optimal dosing
• Preliminary test of efficacy
7. Phase III clinical trial
• < 10,000 patients with disease
• Use formal statistical hypothesis test to
evaluate the efficacy and safety of new
drug compared to “current best”
• Needs to at least match current best
• Fully documented trial data submitted to
the government agency that authorizes
marketing of drug.
8. Phase IV clinical study
• Post-marketing surveillance
• After the drug is released, company must
continue to monitor for safety.
• Especially important for rare (< 1/10,000)
side effects
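Why rare side effects surface only after marketing follows from a short calculation. The sketch assumes patients are independent and that each has the same chance of the event; the function name is illustrative:

```python
def prob_at_least_one(event_rate, n_patients):
    """Probability of seeing the event at least once among n_patients,
    assuming independent patients with the same event rate."""
    return 1.0 - (1.0 - event_rate) ** n_patients

rate = 1 / 10_000
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(prob_at_least_one(rate, n), 3))
```

Even a trial with 10,000 patients has only about a 63% chance of showing a single case of a 1/10,000 event, and one case is rarely enough to attribute it to the drug.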
Notes on drug development
• Cost of canceling a drug project
increases exponentially as it progresses
through steps.
• Thus, better to cancel a project early
with any indication of problems, than to
“hope” it all works out.
• These decisions currently made based on
incomplete information. Valuable drugs
may be cancelled that could be “saved.”
Phase IV withdrawals
VERY expensive to pull drug this late:
• Chloramphenicol--antibiotic with rare bone
marrow failure
• Grepafloxacin--antibiotic causes increased
cardiac arrhythmia
• Vioxx -- arthritis medicine with increased rate
of heart attacks
• Troglitazone--diabetes medicine with rare liver
failure
• Viagra--erectile dysfunction medicine with rare
heart attacks (NOT WITHDRAWN, TOO
POPULAR?)
Notes on drug development
• Drug companies are generally looking for
reasons to cancel a drug, and the pipeline of
targets is generally thought to be adequate.
• Adverse events that are 1/10,000 are not seen
until post-market, and are therefore very
expensive.
• More common adverse events (1/100-1/1000)
will lead to cancellation in phase I or II.
• What about pharmacogenomics to save these?
Can PGx save drugs?
In principle, YES, but issues:
Need pharmacogenomic information early in
development, so studies can be focused:
• Choose subset of patients who will
tolerate drug in phase I studies.
• Avoid lengthy additional studies (patents
last only 17 years)
• May need to co-develop a genetic test
(e.g. Herceptin)
Off patent drugs
• After 17 years (in US) a drug goes off
patent, and other companies can
begin producing it.
• Who is responsible for post-marketing
surveillance then?
• Who should follow up on pharmacogenomic
opportunities?
Can PGx save drugs?
• Companies prefer “one size fits all” drugs
• Unclear economic model for fractured markets
with “one size fits some”
• Orphan drug regulations exist currently to
make it attractive for companies to develop
drugs for small populations
– E.g. life-saving drug for very rare disease
• Will orphan drug laws apply to fractured
markets?
Cost/Benefit Concerns for PGx
• If cost of the test > cost of adverse
reaction, then why do it?
– E.g. Codeine & CYP2D6, 7% of whites do not
metabolize it into the active metabolite (morphine)
• Cost of information systems to support
PGx data storage and decision support
• Cost of industrial processes to create
multiple drugs vs. “one size fits all”
Ethical Issues
• Will pharmaceutical companies focus on
particular genetic polymorphisms for drug
development and ignore others?
• What if these polymorphisms are
associated with groups that are
more/less economically advantaged?
More on this later…