Primer Design

Methods in Molecular Biology™
Volume 226
PCR
Protocols
SECOND EDITION
Edited by
John M. S. Bartlett
David Stirling
19
PCR Primer Design
David L. Hyndman and Masato Mitsuhashi
1. Introduction
The selection of primers for a given polymerase chain reaction (PCR) can determine
the efficiency and specificity of the PCR. Although in many cases successful PCR
primers have been selected with little understanding of the principles involved, PCR
can often only be achieved by using primers that are designed appropriately. Here,
we give general recommendations for PCR primer selection and various aspects to be
considered when designing primers.
2. General Primer Considerations
2.1. Location
The location of PCR primers is sometimes dictated by the purpose of the experiment.
If the experiment is simply intended to identify the presence or absence of the sequence,
then the location is of no consequence, so long as the amplification works well. If,
however, the experiment is part of an assay for a particular allele of a gene, then the
amplicon would be required to contain that region of interest.
2.2. Amplicon Size
In general, amplicons are from 100 to 1000 bp in length. The lower limit is caused
by the typical need to be able to visualize the amplicon on an agarose gel. The upper
limit of 1000 bp is the result of difficulties in amplifying large sequences. If an assay
that does not require a minimum amplicon size is used, there is no theoretical minimum
amplicon size.
2.3. Guanine/Cytosine (G/C) Content
G/C content is defined as the proportion of bases in the primer that are either
G (guanine) or C (cytosine). Good PCR primers are generally selected to have a G/C
content between 40 and 60%. However, there is no well-defined reason for this range,
only that it has been considered preferable.
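As a minimal sketch of the definition above, the G/C content of a candidate primer can be computed directly from its sequence. The function names and the example sequences are hypothetical; the 40 to 60% bounds are the guideline quoted in the text, not a hard rule.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of bases in `primer` that are G or C."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_in_preferred_range(primer: str, low: float = 0.40, high: float = 0.60) -> bool:
    """Check the 40-60% guideline from the text (bounds are adjustable)."""
    return low <= gc_content(primer) <= high


if __name__ == "__main__":
    for candidate in ("ATGCATGCAT", "AAAAAAAAAT"):
        print(candidate, gc_content(candidate), gc_in_preferred_range(candidate))
```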
3. Considerations for Optimal PCR
The issues concerning PCR primer design can be divided into two categories:
efficiency and specificity. Both of these are important to consider in most applications,
but often the factors that promote one of these will adversely affect the other.
From: Methods in Molecular Biology, Vol. 226: PCR Protocols, Second Edition
Edited by: J. M. S. Bartlett and D. Stirling © Humana Press Inc., Totowa, NJ
3.1. Efficiency
Efficiency can be viewed as the proportion of templates that are used to synthesize
new strands with each round of PCR, assuming the PCR primers are in a high
abundance. A situation in which efficiency is the primary concern is the amplification
of a purified template in which there is no chance of nonspecific PCR amplifications,
so the main issue is that of primer binding to its target.
3.1.1. Melting Temperature
Melting temperature, or Tm, is defined for a given DNA duplex as the temperature
at which half of the strands are hybridized and half of the strands are not hybridized.
The original definition of Tm implied that the two complementary strands were in equal
proportions. In cases where the two strands are not in equal proportions, such as a
primer hybridizing to its target in PCR, the definition of Tm must be altered. Because
in a PCR primer concentrations will be orders of magnitude higher than template
concentrations, the interpretation is that the Tm is the temperature at which a primer is
hybridized to half of the template strands. The Tm of a given primer/template combination depends on primer concentration, template concentration, and salt concentration.
Generally, the template concentration is considered to be negligible compared with the
primer concentration, so the formula for calculating Tm is simplified such that it does
not contain a parameter for template concentration.
Tm can be expressed as follows:
$$T_m = \frac{\Delta H}{\Delta S + R \ln(c)} - 273.15 + 16.6 \log_{10}[\mathrm{Na}^+] \qquad (1,2)$$
The primer concentration is c; the ∆H and ∆S refer to the total enthalpy and entropy
of hybridization, respectively; and [Na+] is the sodium ion concentration but can
refer to the total concentration of most monovalent cations (such as K+). If there is
Mg2+ or other divalent cations such as Mn2+, the conversion is generally accepted
as follows:
$$[\mathrm{Na}^+] = 4 \times [\mathrm{Mg}^{2+}]^{1/2} \qquad (3)$$
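The Tm expression above can be evaluated numerically once ∆H and ∆S are known. The sketch below is illustrative only. It assumes ∆H is given in kcal/mol, ∆S in cal/(mol·K), concentrations in mol/L, and R = 1.987 cal/(mol·K); none of these units are stated in the chapter, and the function name and example values are hypothetical.

```python
import math

R = 1.987  # gas constant in cal/(mol*K); assumed unit system


def melting_temperature(dH_kcal: float, dS_cal: float,
                        primer_conc: float, na_conc: float) -> float:
    """Tm in deg C from Tm = dH / (dS + R ln c) - 273.15 + 16.6 log10[Na+].

    dH_kcal     total enthalpy of hybridization, kcal/mol (assumed unit)
    dS_cal      total entropy of hybridization, cal/(mol*K) (assumed unit)
    primer_conc primer concentration c, mol/L
    na_conc     monovalent cation concentration, mol/L
    """
    dH_cal = dH_kcal * 1000.0  # convert to cal/mol so units match dS and R
    kelvin = dH_cal / (dS_cal + R * math.log(primer_conc))
    return kelvin - 273.15 + 16.6 * math.log10(na_conc)


if __name__ == "__main__":
    # Hypothetical primer: dH = -150 kcal/mol, dS = -400 cal/(mol*K),
    # 0.2 uM primer, 50 mM monovalent cation.
    print(round(melting_temperature(-150.0, -400.0, 2e-7, 0.05), 1))
```

With all other inputs fixed, raising either the primer concentration or the salt concentration raises the computed Tm, which is the behavior the formula implies.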
3.1.2. Calculating Tm with the Nearest Neighbor Model
The accurate calculation of Tm from a given sequence requires determining the
∆H and ∆S of the hybridization. The most successful method for this is through the
use of the nearest neighbor model (4–7). With the nearest neighbor model, every pair
of adjacent base pairs makes a specific contribution to the overall ∆H and ∆S of the
duplex. The total ∆H and ∆S are calculated by adding all of the component values
plus a value for initiation of duplex formation. A table of ∆H and ∆S nearest neighbor
values is shown in Table 1 (5).
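The summation described above can be sketched as follows, using only the ∆H and ∆S values listed in Table 1. Several points are assumptions rather than statements from the chapter: the third row of Table 1 is read as ta/at, the garbled ∆S for gt/ca is read as –23.0, units are taken as kcal/mol for ∆H and cal/(mol·K) for ∆S, and the initiation terms are left as caller-supplied placeholders because the table does not list them. Table 1 has no entry for the cg/gc step, so this sketch raises an error for sequences that contain it.

```python
# Nearest neighbor steps keyed by the top-strand dinucleotide (5'->3').
# Values are (dH in kcal/mol, dS in cal/(mol*K)) as read from Table 1.
NN = {
    "AA": (-8.4, -23.6),
    "AT": (-6.5, -18.8),
    "TA": (-6.3, -18.5),   # assumed reading of the row printed as aa/at
    "CA": (-7.4, -19.3),
    "GT": (-8.6, -23.0),   # assumed reading of the garbled dS value
    "CT": (-6.1, -16.1),
    "GA": (-7.7, -20.3),
    "GC": (-11.1, -28.4),
    "GG": (-6.7, -15.6),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_dH_dS(primer: str, init_dH: float = 0.0, init_dS: float = 0.0):
    """Sum nearest neighbor dH and dS over every adjacent base-pair step.

    init_dH and init_dS stand in for the duplex initiation term, which the
    chapter mentions but does not tabulate (placeholders, default 0).
    """
    seq = primer.upper()
    dH, dS = init_dH, init_dS
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        # A step and its reverse complement describe the same stacked pair.
        key = step if step in NN else revcomp(step)
        if key not in NN:
            raise KeyError(f"no Table 1 entry for step {step}")
        h, s = NN[key]
        dH += h
        dS += s
    return dH, dS


if __name__ == "__main__":
    print(duplex_dH_dS("GATTACAGGT"))
```

The resulting totals are the ∆H and ∆S that enter the Tm expression of Subheading 3.1.1.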
3.1.3. Efficient PCR Primers and Tm
The most important issue for designing efficient PCR primers is that they must bind
to the target site efficiently under the conditions of the PCR. This generally means not
only that they bind at the annealing temperature but also that, if the annealing
temperature is so low that the thermostable DNA polymerase is not active, the primer
must still be bound at the temperature at which the polymerase becomes active so that
extension can begin.

Table 1
Nearest Neighbor Thermodynamic Values for DNA Base Pairs

Base pair    ∆H (kcal/mol)    ∆S (cal/mol·K)    ∆G (kcal/mol)
aa/tt        –8.4             –23.6             –1.21
at/ta        –6.5             –18.8             –0.73
ta/at        –6.3             –18.5             –0.61
ca/gt        –7.4             –19.3             –1.38
gt/ca        –8.6             –23.0             –1.43
ct/ga        –6.1             –16.1             –1.16
ga/ct        –7.7             –20.3             –1.46
gc/cg        –11.1            –28.4             –2.28
gg/cc        –6.7             –15.6             –1.77
3.1.4. Typical Three-Step PCR
As an illustration, let us examine a typical PCR cycle, where there is a dissociation
step of 30 s at 95°C, an annealing step of 1 min at 37°C, and an extension step of 3
min at 72°C. If the Tm of the primer, in the conditions of the reaction, is 65°C, then
the following will happen. As the temperature decreases from 95°C to 37°C, at some
point the primer will hybridize to the template. At this temperature, however, the
thermostable DNA polymerase may be almost totally inactive. As the temperature rises
towards 72°C, at some point the polymerase becomes active and will start to extend the
primer along the template. As the temperature rises above 65°C, some of the primers
will dissociate if they haven’t been extended. Those that have extended sufficiently,
however, will form a more stable duplex as a result of their added base pairs from the
extension, and they will not dissociate before reaching 72°C. During the extension at
72°C, the primers still hybridized will be fully extended to generate new strands.
Let’s now imagine that with this scenario the Tm of the primer is 75°C. In this
case, as the temperature rises from annealing to extension, even primers that have
not extended will not dissociate. In this scenario, there will be almost total extension
of every possible target strand. Therefore, this PCR would have a high degree of
efficiency.
3.1.5. Two-Step PCR
PCR is sometimes performed in two steps without a discrete annealing step. In this
case, the annealing takes place at the same temperature as the extension. This requires
that the primers will hybridize to some degree at the extension temperature. If the
extension temperature is 72°C, then a primer with a Tm of 72°C would be an efficient
primer. Because the definition of Tm is the point at which half of the templates are in
a duplex, a short time after reaching 72°C, half of the templates will be bound by a
primer. Shortly after that, a large percentage of those templates will be extended by
the polymerase and thereby taken out of the equilibrium between bound and unbound
templates. Because the primer concentration will essentially be unchanged, half of the
remaining templates will then be bound by primers and extended. In this way, most
templates can be extended if the extension is done at the Tm of the primer.
Fig. 1. Hairpin structures.
If, however, the Tm of the primer is a few degrees below the extension temperature,
only a small percentage of the primers will be hybridized, and the PCR will not be
efficient. Therefore, very efficient and specific PCR can be performed with two-step
cycling, but a sufficiently high primer Tm is very important.
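The sensitivity to a few degrees described above can be illustrated with a simple two-state equilibrium estimate. This is a sketch under stated assumptions, not the authors' method: the primer is taken to be in large excess at concentration c, the salt term of the Tm expression is ignored, ∆H is in kcal/mol and ∆S in cal/(mol·K), and the example ∆H and ∆S values are hypothetical. Under these assumptions the fraction of template strands bound is Kc/(1 + Kc), which equals one half exactly at Tm = ∆H/(∆S + R ln c).

```python
import math

R = 1.987  # gas constant in cal/(mol*K); assumed unit system


def tm_no_salt(dH_kcal: float, dS_cal: float, primer_conc: float) -> float:
    """Tm in deg C without the salt term: dH / (dS + R ln c) - 273.15."""
    return dH_kcal * 1000.0 / (dS_cal + R * math.log(primer_conc)) - 273.15


def fraction_bound(dH_kcal: float, dS_cal: float,
                   primer_conc: float, temp_c: float) -> float:
    """Two-state estimate of the fraction of template strands bound by primer."""
    t_k = temp_c + 273.15
    dG = dH_kcal * 1000.0 - t_k * dS_cal           # free energy in cal/mol
    kc = math.exp(-dG / (R * t_k)) * primer_conc   # equilibrium constant times c
    return kc / (1.0 + kc)


if __name__ == "__main__":
    dH, dS, c = -150.0, -400.0, 2e-7   # hypothetical primer and concentration
    tm = tm_no_salt(dH, dS, c)
    for offset in (-5, 0, 5):
        # offset is the extension temperature minus the primer Tm
        print(offset, round(fraction_bound(dH, dS, c, tm + offset), 3))
```

For these illustrative values, extending 5°C above the primer Tm leaves only a few percent of templates bound at equilibrium, whereas extending 5°C below it leaves most of them bound.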
3.1.6. Hairpins
A hairpin is a structure formed by a single DNA molecule in which a portion on
one part of the DNA hybridizes to a complementary portion within the same DNA
strand, forming a structure resembling a hairpin (Fig. 1A). When a PCR primer forms
a hairpin, it adversely affects the primer’s ability to bind and extend at the target site.
In the worst case, the hairpin includes a base pair of the 3′-end and an overhang of the
5′-end (Fig. 1B). Such a structure allows the extension by DNA polymerase along the
primer and will result in the formation of a primer that will not be complementary to
the template and will not be extended if hybridized (Fig. 1C). In addition to removing
primers from the mixture, this will also prevent native primers from binding at target
sites that are bound by the extended primers. To avoid this, primers should be selected
that have no possible hairpin structures wherever possible.
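A simple screen for candidate hairpins is sketched below. It only looks for perfectly complementary stems of a fixed minimum length separated by a minimum loop, which is a crude stand-in for a thermodynamic evaluation; the function names and the default stem and loop lengths are assumptions. A hit is flagged as self-priming when the stem includes the 3′-end and leaves a 5′ overhang, the worst case described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpin_stems(primer: str, min_stem: int = 4, min_loop: int = 3):
    """Return (start_5p, start_3p, arm) for each perfect stem of min_stem bases.

    start_5p is the index of the 5' arm, start_3p the index of the arm it
    pairs with; at least min_loop unpaired bases must separate the arms.
    """
    seq = primer.upper()
    hits = []
    for i in range(len(seq) - 2 * min_stem - min_loop + 1):
        arm = seq[i:i + min_stem]
        partner = revcomp(arm)
        j = seq.find(partner, i + min_stem + min_loop)
        while j != -1:
            hits.append((i, j, arm))
            j = seq.find(partner, j + 1)
    return hits


def is_self_priming(primer: str, hit) -> bool:
    """True if the stem pairs the 3'-end and leaves a 5' overhang to copy."""
    start_5p, start_3p, arm = hit
    return start_3p + len(arm) == len(primer) and start_5p > 0


if __name__ == "__main__":
    primer = "TTGGGGAAAACCCC"   # hypothetical primer with a 3'-terminal stem
    for hit in find_hairpin_stems(primer):
        print(hit, is_self_priming(primer, hit))
```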
3.1.7. Primer-Dimer Formation
The hybridization of two primers together is referred to as a primer-dimer (Fig. 2A).
There are two possibilities for these, homodimers and heterodimers. Homodimers are
formed from the hybridization of the same species of primer together. Heterodimers
are the duplex of two different primer sequences hybridizing together. The result of
either of these is that the primers will not be as efficient in hybridizing to the target.
Fig. 2. Dimer structures.
As with hairpins, the worst case is that in which the 3′-end of one of the primers is
base paired and there is a 5′ overhang (Fig. 2B). In this case, the primer will extend,
using the other primer as a template, rendering the extended primer unable to prime
the desired template (Fig. 2C). Even worse than with hairpins, this situation leads to
amplification of the primer dimers and rapid depletion of useable primers. To prevent
this, primer pairs should be chosen such that primer-dimer formation is minimal.
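The worst case described above, a base-paired 3′-end with a 5′ overhang on the partner, can be screened for with a short sequence check. The sketch below considers only perfect, contiguous pairing of the 3′-terminal bases and ignores thermodynamics; the function name, the default minimum run of four bases, and the example primers are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_dimer(primer_a: str, primer_b: str, min_run: int = 4):
    """Check whether the 3'-end of primer_a can pair within primer_b.

    Returns (run_length, extendable): run_length is the longest 3'-terminal
    stretch of primer_a (at least min_run bases) that pairs perfectly with
    primer_b, or 0 if there is none; extendable is True when some such site
    leaves primer_b bases 5' of it, so the polymerase has template to copy.
    """
    a, b = primer_a.upper(), primer_b.upper()
    run, extendable = 0, False
    for k in range(min_run, len(a) + 1):
        site = revcomp(a[-k:])      # sequence in primer_b that pairs with the tail
        last = b.rfind(site)
        if last == -1:
            break                   # longer tails cannot match if this one fails
        run = k
        extendable = extendable or last > 0
    return run, extendable


if __name__ == "__main__":
    fwd = "GATTACAGGATCC"           # hypothetical primer with a palindromic 3'-end
    print(three_prime_dimer(fwd, fwd))        # homodimer check
    print(three_prime_dimer(fwd, "AAAAAAAA")) # heterodimer check
```

Passing the same primer twice checks for homodimers; for heterodimers the check should be run in both directions, because either primer's 3′-end may be the one that is extended.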
3.1.8. 3′-Terminal Stability
3′-terminal stability can loosely be defined as the relative hybridization strength
of the 3′-end of the primer. If the 3′-end of the primer has a low stability, it may not
efficiently prime because of the transient fraying of the end of the duplex. Therefore, a
higher 3′-terminal stability will improve priming efficiency. As will be mentioned later,
however, this high stability can have an undesirable effect on specificity.
3.2. Specificity
Specificity can generally be defined as the tendency for a primer to hybridize to its
intended target and not to other, nonspecific, targets. There are a few ways in which
poor specificity can impair PCR. First, if primers are hybridizing to many locations
nonspecifically, they will not be available to prime the target sequence. Second, if such
nonspecific hybridization were to occur, priming could also occur at those nonspecific
locations, which would effectively remove the primers from the reaction permanently.
Finally, nonspecific priming can generate aberrant amplicons. These not only confound
any assay for successful PCR but also rapidly consume the primers, leaving fewer
available to amplify the intended target.
3.2.1. Specificity, Tm, and PCR Conditions
With respect to the annealing and extension temperatures chosen for the PCR, a balance
must be reached between considerations of efficiency and specificity. As discussed
previously, PCR is more efficient when the primer Tm is equal to or above the extension
temperature of the reaction. However, a primer with a high Tm can often have poor
specificity: partial hybridization of the primer becomes more likely, and extension can
occur from nonspecific sites. Such issues are less critical if highly specific primers
can be selected, as discussed in the following sections.
Fig. 3. Hybridization simulation data.
3.2.2. Hybridization Simulation
The most precise way to view the specificity of a PCR primer is by hybridization
simulation (8). Hybridization simulation is the computer simulation of a hybridization
of a primer with a specified database. This in silico analysis will identify all hybridization sites within the database for a candidate primer, allowing the user to select primers
that will be the most specific.
It is important to realize that hybridization simulation is qualitatively different
from a homology or similarity analysis (9). Hybridization simulation uses a thermodynamic model with nearest neighbor values to calculate the mismatch Tm of hybridization for all hybridization sites. An example of hybridization simulation data is
shown in Fig. 3.
Currently, hybridization simulation is only available from a single commercially
available program, the HYBsimulator™. HYBsimulator allows for the screening of a large
set of candidate primers and selection based on the hybridization simulation data.
3.2.3. Statistical Determination of Specificity
Various mathematical models exist in which the specificity of a given primer can
be estimated based on the frequency of its constituent smaller sequences. One such
method uses a table of frequencies of 6-mers found within a given genomic database
(10,11). The statistical frequency of the entire oligonucleotide is calculated from
its constituent 6-mers by starting with the frequency of the 5′-terminal 6-mer and
multiplying by the relative frequency with which the 5-mer at the 3′-end of that 6-mer
is followed by the next nucleotide. This is repeated until the end of the
oligonucleotide is reached.
For example, to calculate the frequency (f) of the 8-mer: CATAGCCT
$$
f(\mathrm{CATAGCCT}) = f(\mathrm{CATAGC})
\times \frac{4\, f(\mathrm{ATAGCC})}{f(\mathrm{ATAGCA}) + f(\mathrm{ATAGCG}) + f(\mathrm{ATAGCC}) + f(\mathrm{ATAGCT})}
\times \frac{4\, f(\mathrm{TAGCCT})}{f(\mathrm{TAGCCA}) + f(\mathrm{TAGCCG}) + f(\mathrm{TAGCCC}) + f(\mathrm{TAGCCT})}
$$
where f(CATAGC) denotes the frequency of the 6-mer, CATAGC.
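The chained calculation in this example can be sketched in code as follows. The function follows the formula as printed, including its factor of 4 on each step, and reads each denominator as the sum over the four possible next nucleotides (A, G, C, and T). The function name, the dictionary layout of the 6-mer table, and the uniform frequencies used in the demonstration are assumptions for illustration; a real table would come from a genomic database.

```python
from itertools import product


def oligo_frequency(oligo: str, freq6: dict) -> float:
    """Estimate the frequency of `oligo` from a table of 6-mer frequencies.

    freq6 maps every 6-mer (upper case) to its frequency. Each step multiplies
    by 4 * f(next 6-mer) / sum of f(5-mer + N) over N in A, C, G, T.
    """
    seq = oligo.upper()
    if len(seq) < 6:
        raise ValueError("oligo must be at least 6 nucleotides long")
    f = freq6[seq[:6]]                       # start with the 5'-terminal 6-mer
    for i in range(1, len(seq) - 5):
        five_mer = seq[i:i + 5]              # 3'-terminal 5-mer of the previous 6-mer
        total = sum(freq6[five_mer + base] for base in "ACGT")
        f *= 4 * freq6[seq[i:i + 6]] / total
    return f


if __name__ == "__main__":
    # Hypothetical table: every 6-mer equally frequent.
    uniform = {"".join(p): 1 / 4096 for p in product("ACGT", repeat=6)}
    print(oligo_frequency("CATAGCCT", uniform))
```

With a uniform table each step contributes a factor of 1, so the estimate equals the 6-mer frequency itself; 6-mers that are over- or under-represented in the table shift the estimate up or down accordingly.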
3.2.4. 3′-Terminal Effects
Partial hybridization of the primer at the 3′-terminus can permit extension by DNA
polymerase. This could result in depletion of primers as well as possible nonspecific
amplification; therefore, this type of partial hybridization should be minimized as
much as possible.
There are two considerations for decreasing the chance of partial hybridization of
the 3′-terminus: frequency and stability.
3.2.5. 3′-Terminal Frequency
If the 3′-terminal region has a sequence that has many occurrences in the DNA
that will be in the reaction, then the likelihood of partial hybridization is greater. To
minimize this, the primers can be selected such that the 3′-terminal region does not
have a high frequency of occurrence within the genome of interest.
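One rough way to gauge 3′-terminal frequency is to count how often the 3′-terminal bases of a candidate primer could find a perfectly complementary site in the DNA of interest. The sketch below is only an illustration: the choice of six terminal bases, the function names, and the toy sequence are assumptions, and a real screen would run against the full genomic database. Both strands are considered by searching a single-strand representation for the tail and for its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count, pos = 0, text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def three_prime_site_count(primer: str, genome: str, k: int = 6) -> int:
    """Count sites on either strand that pair perfectly with the 3'-terminal k-mer.

    genome is one strand written 5'->3'. The primer tail pairs with this strand
    wherever revcomp(tail) occurs, and with the opposite strand wherever the
    tail itself occurs.
    """
    tail = primer.upper()[-k:]
    g = genome.upper()
    return count_overlapping(g, revcomp(tail)) + count_overlapping(g, tail)


if __name__ == "__main__":
    print(three_prime_site_count("AAAAGATTAC", "CCGATTACCCGTAATCCC"))
```

Candidate primers with lower counts would be preferred under the frequency criterion described above.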
3.2.6. 3′-Terminal Stability
If the 3′-terminal region has a strong hybridization energy, then 3′-terminal partial
hybridizations will be relatively more stable. More stable 3′-terminal hybridization will
allow more false priming. Therefore, if the primers are chosen such that the 3′-terminus
has a low hybridization strength (also referred to as terminal stability), the primer is
less likely to prime as a result of such partial hybridization.
Note that in the efficiency section above, a stronger 3′-terminal stability is said
to improve efficiency. Whether one should select primers with high or low terminal
stability would depend on factors, such as the nature of the experiment (i.e., whether
there will be a large amount of other DNA) and the nature of the gene (whether highly
specific primers can be found).
3.2.7. Specificity Within the Target Sequence
If the primer is able to partially hybridize to a nonspecific region of the template,
particularly undesirable effects can occur. If the nonspecific hybridization allows
extension along the template such that a PCR product can be formed in conjunction
with one of the primers binding at one of the actual binding sites, nonspecific amplicons
can be generated. This problem is compounded if the nonspecific binding site is within
the amplicon itself. For this reason, primers should be checked for nonspecific alternate
hybridization sites within the target sequence.
4. Selecting Primers for Multiplex PCR
Multiplex PCR, in which several primer sets amplify several amplicons in the same
reaction, adds a degree of complexity to designing optimal primers. The additional
issues to consider are those of possible heterodimer formation between all of the
candidate primers and possible alternate hybridization sites within any of the target
sequences. Some of the available primer design software provides functions for these
types of designs.
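For a multiplex primer set, the heterodimer check has to cover every ordered pair of primers, since any primer's 3′-end may pair within any other primer in the tube. A minimal all-against-all screen might look like the following. It flags only perfect pairing of a fixed-length 3′-terminal stretch, and the function name, the five-base window, and the primer names and sequences are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def flag_dimer_pairs(primers: dict, k: int = 5):
    """Return (name_a, name_b) pairs where the 3'-terminal k-mer of a pairs within b.

    Every ordered pair is tested, including each primer against itself, so
    homodimers and both orientations of each heterodimer are covered.
    """
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in product(primers.items(), repeat=2):
        if revcomp(seq_a.upper()[-k:]) in seq_b.upper():
            flagged.append((name_a, name_b))
    return flagged


if __name__ == "__main__":
    primer_set = {                      # hypothetical multiplex primer set
        "F1": "GATTACAGGATCC",
        "R1": "AAAAAAAAAAAA",
        "F2": "CCCCGGATCTTT",
    }
    print(flag_dimer_pairs(primer_set))
```

The number of pairs grows with the square of the number of primers, which is one reason multiplex design is usually left to software.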
5. Primer Design Software
Several programs are available for PCR primer design. As mentioned above, we
think HYBsimulator is the most powerful such program and does provide PCR primer
selection based on all criteria mentioned in this chapter. Other popular programs
are Oligo™ and Primer Premier™, which provide a subset of these functions but are
slightly easier to use.
References
1. Breslauer, K. J., Frank, R., Blocker, H., and Marky, L. (1986) Predicting DNA duplex
stability from the base sequence. Proc. Natl. Acad. Sci. USA 83, 3746–3750.
2. Freier, S. M., Kierzek, R., Jaeger, J. A., Sugimoto, N., Caruthers, M. H., Neilson, T., and
Turner, D. H. (1986) Improved free-energy parameters for predictions of RNA duplex
stability. Proc. Natl. Acad. Sci. USA 83, 9373–9377.
3. Wetmur, J. (1991) DNA probes: Applications of the principles of nucleic acid hybridization.
Crit. Rev. Biochem. Mol. Biol. 26, 227–259.
4. Sugimoto, N., Nakano, S., Katoh, M., Matsumura, A., Nakamuta, H., Ohmichi, T., et al.
(1995) Thermodynamic parameters to predict stability of RNA/DNA hybrid duplexes.
Biochemistry 34, 11,211–11,216.
5. SantaLucia, J., Allawi, H. T., and Seneviratne, P. A. (1996) Improved nearest-neighbor
parameters for predicting DNA duplex stability. Biochemistry 35, 3555–3562.
6. SantaLucia, J., Kierzek, R., and Turner, D. H. (1990) Effects of GA mismatches on the
structure and thermodynamics of RNA internal loops. Biochemistry 29, 8813–8819.
7. Sugimoto, N., Kierzek, R., Freier, S. M., and Turner, D. H. (1986) Energetics of internal GU
mismatches in ribooligonucleotide helixes. Biochemistry 25, 5755–5759.
8. Hyndman, D., Cooper, A., Pruzinsky, S., Coad, D., and Mitsuhashi, M. (1996) Software
to determine optimal oligonucleotide sequences based on hybridization simulation data.
BioTechniques 20, 1090–1096.
9. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local
alignment search tool. J. Mol. Biol. 215, 403–410.
10. Han, J., Hsu, C., Zhu, Z., Longshore, J., and Finley, H. (1994) Over-representation of
the disease associated (CAG) and (CGG) repeats in the human genome. Nucleic Acids
Res. 22, 1735–1740.
11. Han, J., Zhu, Z., Hsu, C., and Finley, W. (1994) Selection of antisense oligonucleotides on
the basis of genomic frequency of the target sequence. Antisense Res. Devel. 4, 53–65.