Isolation of Human DNA and PCR amplification of an Alu insertion

BIO440 Genetics Laboratory Humboldt State University
Isolation of Human DNA
and PCR amplification of an Alu insertion site
and PCR amplification of a VNTR site
DNA fingerprinting is an efficient and highly accurate means of determining
identities and relationships. It has revolutionized the field of forensics, especially
concerning violent crimes such as rape cases. It has exonerated hundreds of falsely
convicted people, and removed from suspicion countless more falsely accused people,
while also implicating guilty parties where there were no other indications of their
involvement. It allows scientists to identify the remains of victims of highly disfiguring
accidents or soldiers lost in battle. It is routinely used (even anonymously, via companies
advertising on the internet) to settle child custody cases by identifying the child's true
biological parents. This amazing technology is the most powerful and accurate form of
human identification ever used.
In this laboratory exercise, you will first isolate your own DNA using a crude but
effective protocol that will yield relatively impure DNA. The DNA is pure enough,
however, for use in PCR reactions. A simple saltwater mouthwash will release cheek
cells, which you will then lyse by boiling. You will boil the cells in a mixture that
contains Chelex beads. Chelex will bind divalent cations required by nucleases. You will
then spin the solution down, and the Chelex beads and cell debris will pellet, leaving your
crude DNA prep in the supernatant.
In the first experiment, you will use the polymerase chain reaction (PCR) to amplify a
region of chromosome 16 called the pV92 Alu locus. Some versions of chromosome 16
have a 300-nucleotide insert in this region (called an Alu insert), while other versions don't.
After you amplify the Alu repeat region, you will determine whether you carry
this particular Alu sequence on both, one, or neither of your number 16 chromosomes; i.e.,
you will directly determine your genotype. This will be accomplished by
electrophoresing your PCR sample on an agarose gel and observing the sizes of the bands
that are produced. In the second experiment, we are going to use PCR to amplify a
portion of chromosome 8. The portion that we are going to amplify lies in an intron of the
gene for tissue plasminogen activator (TPA). In some copies of this gene
there is an insertion of a 300 bp Alu sequence. Since we each have two copies of
chromosome 8, the possible genotypes are +/+ (both copies have the insertion), +/- (one
copy has the insertion), or -/- (both copies lack the insertion). The different alleles of
these non-coding regions are inherited in Mendelian fashion.
Introduction: pV92 Alu
The human genome is made up of approximately 6 billion base pairs distributed on 46
chromosomes. All cells in your body, except red blood cells, sperm, and eggs, contain
these 46 chromosomes (sperm and egg cells contain only 23 chromosomes, while RBCs
don't have nuclei). Only 3 to 10 percent of this enormous amount of DNA, however, is
actually used to directly code for the proteins required for supporting cellular metabolism,
growth, and reproduction. The protein-encoding regions are scattered throughout the
genome. Genes may be separated by many thousands of base pairs. Furthermore, most
genes in the human organism are themselves broken into smaller protein-encoding
segments, called exons, with, in many cases, hundreds or thousands of base pairs
intervening between them. These intervening regions are called introns. Examination of
introns and other non-protein-coding regions has revealed the presence of unique genetic
elements that can be found in a number of different locations within the genome. One of
the first such repeating elements identified was Alu.
Alu repeats are approximately 300 base pairs in length. They carry within them
the base sequence recognition site for the Alu I restriction endonuclease. There are over
500,000 Alu repeats scattered throughout the human genome (thus making up ~5% of the
genome--equal to the protein coding portion!). On average, one can be found every 4,000
base pairs along a human DNA molecule. How they arose is still a matter of speculation
but evidence suggests that the first one may have appeared in the genome of higher
primates about 60 million years ago. Alu repeats are inherited in a stable manner; they
come intact in the DNA your mother and father contributed to your own genome at the
time you were conceived. Some Alu repeats are fixed in a population, meaning all humans
have that particular Alu repeat. Others are said to be dimorphic; different individuals may
or may not carry a particular Alu sequence at a particular chromosomal location.
Some versions of chromosome 16 have a PV92 Alu insert; some don't.
The Polymerase Chain Reaction
PCR is a method for the in vitro exponential amplification of a specific DNA
region (called the target region) that lies between two regions of known DNA
sequence (the primer-binding sites), resulting in µg quantities of DNA from fg or zg
starting masses. PCR has a wide variety of uses. The diversity of applications derives
from the ability of PCR to:
a) distinguish a specific target sequence from a large excess of background,
non-target sequences, and
b) produce a large number of copies of a specific sequence from a very small
initial amount (as low as a single molecule of the target sequence).
The extreme specificity and sensitivity of PCR have allowed it to be used to amplify and
clone DNA from such sources as mummified human tissues and samples of extinct plants
and animals. To accomplish this task, PCR utilizes DNA polymerase, an enzyme that
catalyzes the synthesis of a complementary strand of DNA from a template strand. This
synthesis always takes place in a 5’ --> 3’ direction, and DNA polymerases require a 3’-OH on the growing daughter strand in order to catalyze the formation of a new
phosphodiester bond.
This requirement for a 3’-OH is exploited to give the process the high degree of
specificity for which it is known. A target DNA sample is mixed with two different
primers, which are short (20-30 bases), single-stranded, synthetically-made
polynucleotides that are complementary to the sequences flanking the target sequence.
This mixture is heated to denature the double-stranded target DNA. When the mixture is
then cooled, the primers can ‘anneal’ to the sequences that they are complementary to,
that is, they can form a double-stranded DNA molecule consisting of the primer
hydrogen-bonded to a complementary target DNA sequence. The primer thus provides a
3’-OH group for the DNA polymerase, and a copy of the region adjacent to the primer-binding site is synthesized.
After the complementary strand is synthesized, the mixture is then heated again,
and the double-stranded molecule consisting of the target sequence and the newly synthesized molecule is denatured to single strands. When this mixture is cooled, the
newly synthesized strands and the original target strands can serve as templates for the
synthesis of more product sequences; thus, the process is exponential. In theory, the
process can double the amount of the specific target with each round of denaturation,
annealing, and synthesis. Thus, after thirty rounds, 2^29 copies of each original target
molecule would be present.
In practice, three temperatures are typically included in a PCR cycle,
corresponding to three reactions: denaturation, annealing, and extension. 94°C for 1
minute is usually sufficient for denaturation, and 72°C for 0.5 - 1 minute is usually
sufficient for extension (the DNA polymerase used is very active at this temperature).
The annealing temperature is typically set at 5°C lower than the Tm of the primers, and
thus will be dependent on the actual primer sequence (i.e., the higher the G+C content, or
the longer the primers, the higher the annealing temperature). The optimum annealing
temperature is usually determined empirically. A typical annealing temperature is 55°C.
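As a rough illustration of how primer length and G+C content drive the annealing temperature, the sketch below estimates Tm with the Wallace rule (2°C per A or T, 4°C per G or C) and subtracts 5°C. The Wallace rule is a rule of thumb for short oligonucleotides and is an assumption here; this handout does not say how Tm is calculated for the primers used in class. The function names, and the choice to use the lower Tm of the two primers, are also illustrative assumptions.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer Tm in °C with the Wallace rule: 2*(A+T) + 4*(G+C).

    This is only a rule of thumb for short oligos (an assumption, not the
    method used to design the primers in this lab).
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G, and T")
    return 2 * at + 4 * gc


def suggested_annealing_temp(forward: str, reverse: str, offset: int = 5) -> int:
    """Return a starting annealing temperature: the lower primer Tm minus offset."""
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


if __name__ == "__main__":
    # Hypothetical 20-base primers, not the primers used in this lab.
    fwd = "AGCTTAGCCTAGGCATTAGC"
    rev = "GGCATCGATCCGTAGCTAGG"
    print(wallace_tm(fwd), wallace_tm(rev), suggested_annealing_temp(fwd, rev))
```

A value obtained this way is only a starting point; as noted above, the optimum is usually found empirically.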
Clearly, at these elevated temperatures, a human DNA polymerase would rapidly
denature and cease to catalyze synthesis of complementary DNA. Therefore, PCR relies on
DNA polymerases that have been cloned from extreme thermophiles, microorganisms
(Bacteria and Archaea) that thrive at elevated temperatures. A commonly used enzyme is Taq, a
DNA polymerase isolated from Thermus aquaticus, a bacterium isolated from a hot
spring in Yellowstone National Park.
The yield of a PCR reaction is defined as the number of target molecules
produced by the PCR, and can be described by the equation:
Y = N_0 (1 + E)^{a-1}
where Y = yield
N_0 = number of starting target sequence molecules
E = efficiency
a = number of cycles
The efficiency of a PCR reaction is the fraction of target molecules that actually
serve as a template for the synthesis of new product molecules. Therefore E can have
values between 0 and 1. There are many factors which contribute to the efficiency of a
PCR reaction.
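To get a feel for how strongly the yield depends on efficiency, the yield equation above can be evaluated directly. The following is a minimal sketch; the function name and the example values for efficiency and starting copy number are illustrative assumptions, not measurements from this lab.

```python
def pcr_yield(n0: float, efficiency: float, cycles: int) -> float:
    """Evaluate Y = N0 * (1 + E)**(a - 1) from the handout's yield equation."""
    if not 0 <= efficiency <= 1:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return n0 * (1 + efficiency) ** (cycles - 1)


if __name__ == "__main__":
    # One starting molecule, 30 cycles, at a few hypothetical efficiencies.
    for e in (1.0, 0.9, 0.7):
        print(f"E = {e:.1f}: {pcr_yield(1, e, 30):.3e} copies")
```

With E = 1 and 30 cycles the function returns 2^29 copies per starting molecule, which matches the theoretical figure given earlier. Lower efficiencies reduce the final yield by orders of magnitude because the shortfall compounds every cycle.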
Another important concept is the specificity of a PCR reaction. Specificity refers
to the number of target molecules that are bound by primers relative to the number of
non-target molecules that are bound by primers. For example, decreasing the annealing
temperature of a PCR reaction typically decreases the specificity of that reaction, because
the temperature decrease allows the primer to form duplexes with molecules that are less-than-perfect complements. Typically, decreases in specificity are accompanied by
increases in efficiency. In designing a PCR reaction, you need to balance out the desire
for a large amount of product with the desire to amplify only the desired target sequence.
Some of the factors which affect specificity are listed in the table below.
Conditions Favoring Enhanced Specificity
decrease Mg++
decrease [dNTPs]
decrease [Taq polymerase]
decrease cycle times
decrease number of cycles
decrease primer concentration
decrease primer degeneracy
increase annealing temperature
optimized primer design
use of hot start
use of Touchdown PCR
use of enhancing agents to decrease 2° structure of template
Components of a typical PCR reaction
10 ng of starting template DNA
1 U of Taq polymerase
0.1 mM dNTPs
1.5 mM Mg++
1 µM each primer
buffer (Tris buffer pH 8.5/KCl/ BSA)
There is a good animation of PCR at
http://www.dnalc.org/ddnalc/resources/shockwave/pcranwhole.html
Protocol: DNA isolation from Cheek cells.
1. Use permanent marker to label your name on a 15-ml culture tube containing saline
solution. (This is commercial drinking water with table salt added to 0.8%.)
2. Pour all of the saline solution into your mouth, and vigorously swish for 10 seconds.
Save the empty 15-ml tube for reuse in the next step.
3. Expel saline mouthwash into a paper cup. Then carefully pour saline mouthwash from
paper cup back into 15-ml tube from Step 1.
4. Transfer 1 ml of the cell solution to a 1.5 ml tube labeled with your name. Securely
close cap, and place mouthwash tube in a balanced configuration with other tubes in rotor
of microcentrifuge. Centrifuge on high setting for 1 minute to pellet cells.
5. Being careful not to disturb cell pellet, pour off as much supernatant as possible into
sink or paper cup. Place tube with mouthwash cell pellet on ice.
6. Use micropipettor to add 30 µl of saline solution to cell pellet. Resuspend cell pellet by
vortexing and by pipetting up and down. It is important to resuspend the cell pellet
entirely, leaving no clumps.
7. Add 30 µl of your resuspended cell pellet to a 0.5 ml tube containing 200 µl of 10%
Chelex solution. Label the tube with your initials.
8. Incubate the 0.5 ml sample tube at 99°C in the thermal cycler for 10 minutes.
9. Following incubation, remove sample tube from thermal cycler, briefly vortex the
sample, and cool tube on ice for approximately one minute.
10. Place sample tube in a balanced configuration in a microfuge rotor, and spin for 30
seconds to pellet Chelex beads at bottom of tube.
11. Transfer 50 microliters of the supernatant to a fresh 1.5 ml tube labeled with your
name, and place tube in freezer. Avoid transferring any of the Chelex pellet.
Analysis and Interpretation
TPA-25 Alu Insertion / Pv92 Human Genotyping
Name: ______________________________________
Tape a picture of your gels below, indicating which lanes represent your samples.
Interpret the results of your amplifications, including expected results and possible
explanations of unexpected results:
Analysis of Alu TPA-25 results
For this calculation, include the values determined for your class together with those from
previous classes, as depicted on the example results gel at the end of this handout. We will
symbolize the presence of an Alu insertion with a + and the absence with a -. For Hardy-Weinberg calculations, let p be the frequency of the + allele and q the frequency of the - allele.
1. How many class members were homozygous for the insertion (+/+)?
2. How many class members were heterozygous for the insertion (+/-)?
3. How many class members were homozygous for no insertion (-/-)?
4. What is the frequency of p? of q?
5. Given the frequency of p and q, and assuming that we are in Hardy-Weinberg
equilibrium, how many students would we expect to be +/+? +/-? -/-?
6. Presumably, there will be some differences between the actual distribution of
genotypes that we observe and the distribution predicted by Hardy-Weinberg
equilibrium. To determine whether this difference appears due to chance, use chi-squared analysis to analyze the deviation. A table of p-values appears later in the
handout - use 1 degree of freedom.
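The arithmetic behind questions 4 through 6 can be sketched as follows. The genotype counts in the example are hypothetical placeholders, not class data, and the function name is an assumption made for illustration. Allele frequencies are obtained by counting alleles (each +/+ individual carries two + alleles, each +/- individual carries one), and the expected counts are p^2, 2pq, and q^2 multiplied by the number of individuals. Three genotype classes, minus one, minus one parameter estimated from the data (p), leaves the 1 degree of freedom mentioned above.

```python
def hardy_weinberg_test(n_plus_plus: int, n_plus_minus: int, n_minus_minus: int):
    """Return (p, q, expected_counts, chi2) for observed genotype counts."""
    observed = (n_plus_plus, n_plus_minus, n_minus_minus)
    n = sum(observed)
    if n == 0:
        raise ValueError("at least one individual is required")

    # Count alleles: 2 per homozygote, 1 per heterozygote, out of 2n total.
    p = (2 * n_plus_plus + n_plus_minus) / (2 * n)
    q = 1 - p

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (p * p * n, 2 * p * q * n, q * q * n)

    # Chi-squared statistic; classes with an expected count of 0 are skipped
    # (this only happens when one allele is absent from the sample).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, q, expected, chi2


if __name__ == "__main__":
    # Hypothetical counts: 6 (+/+), 8 (+/-), 11 (-/-).
    p, q, expected, chi2 = hardy_weinberg_test(6, 8, 11)
    print(f"p = {p:.2f}, q = {q:.2f}")
    print("expected counts:", [round(e, 2) for e in expected])
    print(f"chi-squared = {chi2:.3f}")
```

The chi-squared value is then compared against the 1 DOF column of the p-value table in this handout.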
Analysis of Alu Pv92 results
1. How many class members were homozygous for the insertion (+/+)?
2. How many class members were heterozygous for the insertion (+/-)?
3. How many class members were homozygous for no insertion (-/-)?
4. What is the frequency of p? of q?
5. Given the frequency of p and q, and assuming that we are in Hardy-Weinberg
equilibrium, how many students would we expect to be +/+? +/-? -/-?
6. Presumably, there will be some differences between the actual distribution of
genotypes that we observe and the distribution predicted by Hardy-Weinberg
equilibrium. To determine whether this difference appears due to chance, use chi-squared analysis to analyze the deviation.
7. For the Pv92 insertion, what is the p-value associated with the chi-squared value that
you calculated? In your own words, describe what this p-value means.
Students in a genetics class at the University of Costa Rica performed the Pv92
amplification. Of 56 students, they had allele frequencies of + = 0.22 and - = 0.78, with a
genotype distribution of +/+ = 0.02, +/- = 0.40, and -/- = 0.59. Students in a genetics
class at the University of Texas performed the Pv92 amplification. Of 31 students, they had
allele frequencies of + = 0.42 and - = 0.58, with a genotype distribution of +/+ = 0.26, +/- = 0.32, and -/- = 0.42.
8. Using a chi-squared approach, compare your results with their results to see if our class
can be distinguished as an interbreeding population from either of these groups.
Table of p-values for Chi-squared calculations
P value (%)   Chi2, 1 DOF   Chi2, 2 DOF   Chi2, 3 DOF
90            0.016         0.21          0.58
70            0.148         0.7           1.4
50            0.45          1.4           2.4
30            1.1           2.4           3.7
10            2.7           4.6           6.3
5 ***         3.8           6.0           7.8
1             6.6           9.2           11.3
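If a calculated chi-squared value falls between two rows of the table, the p-value can also be computed directly. The sketch below uses the closed-form expressions that exist for 1 and 2 degrees of freedom; the function name is an illustrative assumption, and 3 degrees of freedom is deliberately not handled here.

```python
import math


def chi2_p_value(chi2: float, dof: int) -> float:
    """Upper-tail probability of a chi-squared statistic for 1 or 2 DOF.

    1 DOF: chi-squared is the square of a standard normal variable, so
           P = erfc(sqrt(chi2 / 2)).
    2 DOF: the distribution is exponential, so P = exp(-chi2 / 2).
    """
    if chi2 < 0:
        raise ValueError("chi-squared cannot be negative")
    if dof == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if dof == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


if __name__ == "__main__":
    # Reproduce a few 1 DOF table entries as a sanity check.
    for value in (0.45, 2.7, 3.8, 6.6):
        print(f"chi2 = {value}: p = {chi2_p_value(value, 1):.3f}")
```

The p-value is the probability of seeing a deviation at least this large by chance alone if the population really is in Hardy-Weinberg equilibrium.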
9. Some populations are not in Hardy-Weinberg equilibrium. Why? What are the
possible causes of a non-H-W distribution of genotypes?
10. The following problem refers to the gel pictured below. Forensic scientists from time
to time must reconstruct the DNA profile for a missing person from analysis of DNA
profiles of close relatives. In this case, a mother of four children is missing. All children
have the same biological father. Results from a single locus probe DNA fingerprint
analysis for the four children and their father are shown in the figure. Unfortunately, the
forensic scientist forgot to label the lane with the father's DNA.
Which lane is the father's DNA? Which alleles (A-D) does the mother have? Explain.
11. A scandal has broken. It's revealed that Ashley Olsen has four children from her early
teen years, and she is claiming that Rollin Richmond is the father. He claims that she's
not his type (he's more a Mary Kate fan), that he rebuffed her advances, and that she has
been stalking him for years. Results from a single locus probe DNA fingerprint analysis
for Ashley, Rollin, and the four children are shown in the illustration below. Which child,
if any, can be excluded as being the biological offspring of Rollin? Explain.
12. It is determined that the father of the children, and Dr. Richmond, have the following
genotype:
the same as yours at the TPA-25 and pV92 loci
B/D for the locus from question 10 (frequency in population = 0.15)
A/D for the locus from question 11 (frequency in population = 0.05)
C/E at another locus (frequency in population = 0.05)
Assuming the frequencies for the TPA 25 genotype and the pV92 genotype that you
determined as part of this lab, what fraction of individuals are expected to have the same
genotype as the father and Dr. Richmond?
How many people at Humboldt State, assuming these genotype frequencies, would have
that genotype?
How many people in America, assuming these genotype frequencies, would have that
genotype?
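The calculation for this question is an application of the product rule: if the loci are inherited independently (an assumption), the frequency of the combined genotype is the product of the individual genotype frequencies. The sketch below shows the arithmetic. The TPA-25 and pV92 frequencies and the population size are hypothetical placeholders to be replaced with your own class values and the population you are asked about; the function names are illustrative.

```python
from math import prod


def combined_genotype_frequency(frequencies) -> float:
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    freqs = list(frequencies)
    if any(not 0 <= f <= 1 for f in freqs):
        raise ValueError("each frequency must be between 0 and 1")
    return prod(freqs)


def expected_matches(frequencies, population: int) -> float:
    """Expected number of people in a population sharing the full genotype."""
    return combined_genotype_frequency(frequencies) * population


if __name__ == "__main__":
    tpa25 = 0.30   # hypothetical: replace with your class genotype frequency
    pv92 = 0.45    # hypothetical: replace with your class genotype frequency
    other_loci = [0.15, 0.05, 0.05]  # frequencies given in the question

    freqs = [tpa25, pv92] + other_loci
    population = 10_000  # hypothetical population size
    print(f"combined frequency: {combined_genotype_frequency(freqs):.3e}")
    print(f"expected matches:   {expected_matches(freqs, population):.3f}")
```

Each additional locus multiplies the frequency down further, which is why profiles built from many loci become effectively unique.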
Example results TPA-25 Alu Insertion
Example results Pv92 Alu Insertion