© 1999 Oxford University Press
Human Molecular Genetics, 1999, Vol. 8, No. 5
915–922
DNA pooling identifies QTLs on chromosome 4 for
general cognitive ability in children
Paul J. Fisher1, Dragana Turic1, Nigel M. Williams1, Peter McGuffin1,2,
Philip Asherson2, David Ball2, Ian Craig2, Thalia Eley2, Linzy Hill2, Karen Chorney3,
Michael J. Chorney3, Camilla P. Benbow4, David Lubinski4, Robert Plomin2 and
Michael J. Owen1,*
1Department of Psychological Medicine, University of Wales College of Medicine, Cardiff CF4 4XN, UK, 2Social,
Genetic and Developmental Psychiatry Research Centre, Institute of Psychiatry, De Crespigny Park, London
SE5 8AF, UK, 3Department of Microbiology and Immunology, Milton S. Hershey Medical Center, Pennsylvania
State University, Hershey, PA 17033, USA and 4Department of Psychology and Human Development, Vanderbilt
University, Nashville, TN 37203, USA
Received January 13, 1999; Revised and Accepted February 15, 1999
General cognitive ability (g), which is related to many
aspects of brain functioning, is one of the most heritable traits in neuroscience. Similarly to other heritable
quantitatively distributed traits, genetic influence on g
is likely to be due to the combined action of many
genes of small effect [quantitative trait loci (QTLs)],
perhaps several on each chromosome. We used DNA
pooling for the first time to search a chromosome systematically with a dense map of DNA markers for allelic
associations with g. We screened 147 markers on
chromosome 4 such that 85% of the chromosome was
estimated to be within 1 cM of a marker. Comparing
pooled DNA from 51 children of high g and from 51 controls of average g, 11 significant QTL associations
emerged. The association with three of these 11
markers (D4S2943, MSX1 and D4S1607) replicated
using DNA pooling in independent samples of 50
children of extremely high g and 50 controls. Furthermore, all three associations were confirmed when each
individual was genotyped separately (D4S2943,
P = 0.00045; MSX1, P = 0.011; D4S1607, P = 0.019).
Identifying specific genes responsible for such QTL
associations will open new windows in cognitive neuroscience through which to observe pathways between genes and learning and memory.
INTRODUCTION
Diverse measures of cognitive ability, such as speed of processing, memory and spatial ability, intercorrelate at a modest level,
typically 0.20–0.40. The factor common to these measures is known as general
cognitive ability (g) (1). g is assessed as a total score across
diverse cognitive tests as in intelligence (IQ) tests or as an
unrotated principal component score that best reflects what is in
common among the tests (2). Although g has scarcely entered the
lexicon of cognitive neuroscience, genetic research suggests that
it may provide an important perspective on brain functions such
as learning and memory.
It is of considerable interest that variation in g is substantially
due to genetic factors. The substantial heritability of g is one of
the best documented findings in the behavioural sciences (3).
Model-fitting meta-analyses based on dozens of twin and
adoption studies estimate that ∼50% of the total population
variance in IQ can be attributed to genetic factors (4,5).
A second relevant result comes from multivariate genetic
research which analyses the covariance among traits rather than
the variance of each trait considered separately. The results of
such analyses indicate that the same genetic factors influence
different cognitive abilities, which implies that g reflects the
genetic foundation for cognitive functioning (6).
Genetic research on g will have its greatest impact on cognitive
neuroscience when specific genes responsible for its heritability are
identified. Heritability of quantitatively distributed traits such as g is
likely to be due to multiple genes of varying effect size, called
quantitative trait loci (QTLs) (7). Although traditional linkage
methods for identifying single-gene effects are not able to identify
QTLs of small effect size, allelic association studies do have the
statistical power to detect them; for example, by comparing allelic
frequencies for a selected group and controls (8,9). The major
strength of linkage is that it is systematic in the sense that a few
hundred DNA markers can be used to scan the genome. In contrast,
because allelic association with a quantitative trait can only be
detected if a DNA marker is itself the QTL or very close to it,
thousands of DNA markers would need to be genotyped in order to
scan the genome. For this reason, allelic association has been used
primarily to investigate associations with candidate genes. In our
earlier work, we genotyped 100 DNA markers in or near genes
involved in brain functioning, primarily neurotransmitters, but no
replicated associations with g were found (10). The problem with
such a candidate gene approach is that any of the tens of thousands
of genes expressed in the brain could be considered as candidate
genes for g. Such association studies can be made more systematic
by using a dense map of markers with an inter-marker interval of <1
cM so that no QTL would be more than 0.5 cM from a marker. A
first attempt to use a dense map of markers to identify QTL
associations with g reported a replicated association for insulin-like
growth factor-2 receptor (IGF2R; 11), which has been shown to be
especially active in brain regions most involved in learning and
memory (12).
*To whom correspondence should be addressed. Tel: +44 1222 743058; Fax: +44 1222 746554; Email: [email protected]
Figure 1. Allele image patterns (AIPs) generated by GENOTYPER for
D4S1607 for the original control group (middle) and the original high g group
(bottom), and their overlaid images (top). The numbers above the ∆AIP
represent the peak numbers. The numbers below and to the right of the control
and high g AIPs represent peak heights in fluorescence units. ∆AIP was
calculated from the overlaid images by measuring the total area that was not
shared by the two images irrespective of how many times the curves from the
two pools crossed. This was then expressed as a fraction of the total shared and
non-shared area (13).
The problem with this approach is the amount of genotyping
required. In order to scan the entire genome at 1 cM intervals, one
would need to genotype ∼3500 microsatellite markers, which would
require 700 000 genotypings in a study of 100 high g individuals and
100 controls. We have developed a technique based on DNA
pooling which greatly reduces the need for genotyping by pooling
DNA from all individuals in each group and comparing the pooled
groups, so that only 7000 genotypings are required to scan the
genome in the previous example (13).
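The genotyping counts in this example can be checked with a short sketch. The function names are illustrative only, and the pooled count assumes a single pooled genotyping per group per marker, ignoring the replicate pools and duplicate PCRs used in practice.

```python
def genotypings_individual(n_markers: int, n_individuals: int) -> int:
    """One genotyping per marker per individual."""
    return n_markers * n_individuals


def genotypings_pooled(n_markers: int, n_pools: int) -> int:
    """One genotyping per marker per pooled group."""
    return n_markers * n_pools


n_markers = 3500            # ~1 cM spacing across the genome (from the text)
n_individuals = 100 + 100   # high g individuals plus controls

individual = genotypings_individual(n_markers, n_individuals)
pooled = genotypings_pooled(n_markers, n_pools=2)
print(individual, pooled, individual // pooled)  # 700000 7000 100
```

Under these assumptions, pooling reduces the genotyping load by a factor equal to the number of individuals per group.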
The main purpose of the present study was to provide proof of
principle for the use of DNA pooling for systematic, large-scale
association studies by using it to search for QTLs for g on an entire
chromosome. We therefore sought replicable QTLs for g on
chromosome 4 using a three-stage strategy: (i) nominate significant
QTLs using pooled DNA samples of high g and control individuals
(referred to as ‘original’ high g and control groups); (ii) replicate
these nominated QTLs using pooled DNA from independent
samples of high g and control individuals (referred to as ‘replication’
high g and control groups); and (iii) confirm the results of these
replicated QTLs by genotyping each individual separately.
RESULTS
The samples were restricted to non-Hispanic, Caucasian children
so that differences in marker allele frequencies between the
groups were less likely to be due to ethnic differences. Two
samples were obtained that intentionally differed in sampling
frame and procedures in order to provide a ‘constructive’
replication rather than a ‘literal’ replication (14); i.e. rather than
obtaining a large sample using a single measure of g and dividing
the sample into original and replication groups, we chose to
obtain a replication sample with extremely high g scores in order
to increase our power to replicate QTLs of small effect size.
Because individuals of such high g are off the scale of standard
IQ tests, a different measure was used to select high g individuals
for the replication sample. Constructive replication is a conservative procedure that may increase the rate of false-negative results
but permits broader generalizations from positive results.
The original high g and control samples were selected from
children living in a six-county area around Cleveland, OH, who were
between 6 and 15 years of age. g was assessed by a widely used IQ
test, the Wechsler Intelligence Scale for Children (15). The high g
sample included 51 children (mean IQ = 136; SD = 9.3) and the
control group included 51 children of average g (mean IQ = 103; SD
= 5.6). A replication high g group was obtained from the Study of
Mathematically Precocious Youth (SMPY) in the USA, which
began in the 1970s as a study of mathematical talent but has, since the
late 1970s, put as much emphasis on verbal as on mathematical talent
(16). The highest-scoring SMPY individuals were selected from the
more than one million seventh and eighth graders who performed in
the top 3% on a standardized test administered in their schools and
were invited to take the Scholastic Aptitude Test (SAT) college
entrance exam 4 years early, before the age of 13 years. The SAT
correlates highly with g and with standard IQ tests in the normal
range [e.g. 0.84 for SAT-Math (M) and 0.89 for SAT-Verbal (V)
corrected for unreliability; 17]; using the SAT at 13 instead of the
usual age of 17 makes it possible to estimate IQ scores even though
standard IQ tests do not cover scores as high as these. Fifty of the
highest-scoring individuals were targeted for the high g replication
sample. These participants earned scores of SAT-V ≥ 630 and
SAT-M ≥ 630, or SAT-V ≥ 550 and SAT-M ≥ 700. They were
required to have ‘flat’ SAT profiles in the sense that their SAT-V and
SAT-M were required to be within one standard deviation of each
other. These participants represent a selection intensity of ∼1 in
30 000 as indicated by scores four standard deviations above the
mean (equivalent to an IQ score of 160) as estimated from their
composite (V + M) SAT scores. A replication control group
consisting of 50 individuals (mean IQ = 101; SD = 7.2) was selected
in the same manner (same geographical area, same age) as the
original control group. Informed consent was obtained from all
participants.
For all subjects, DNA was extracted from permanent cell lines
established from blood. Primers for 179 microsatellite markers on
chromosome 4 were purchased from MWG-Biotech in Germany.
These DNA markers were selected from the LDB summary map
(http://cedar.genetics.soton.ac.uk/public_html) (18; Materials
and Methods) with an average interval of 1.2 cM across the
chromosome. However, due to uneven distribution of available
microsatellite markers, many of them are >1.2 cM apart. For
example, there are eight gaps between adjacent markers of >3 cM
(the largest gap is 5.5 cM).
Genotyping pooled DNA using microsatellite markers
Two replicate DNA pools were constructed from individuals in
each of the four groups (Materials and Methods). Duplicate PCRs
were conducted for each of the pools. Allele image patterns
(AIPs) were generated on an ABI DNA sequencer for each
group’s four PCR products for each marker (Materials and
Methods). The four unmodified AIPs for each group (high g, or
control) were overlaid and the consensus AIP was taken to
represent the relative allele frequencies of the marker. In order to
compare the results of pooled genotyping of the original high g
and control groups, we measured the total area that was not shared
by the two superimposed consensus allele image patterns and
expressed this as a fraction of the total shared and non-shared area
according to the method of Daniels et al. (13). This test statistic
is called ∆AIP (Fig. 1).
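As a rough illustration of this statistic, the sketch below computes the non-shared area as a fraction of the total (shared plus non-shared) area for two profiles sampled at the same positions. It is a simplification, not the published image-based procedure: the function name `delta_aip`, the use of discrete heights rather than continuous traces, and the example frequencies are all assumptions.

```python
def delta_aip(profile_a, profile_b):
    """Non-shared area divided by total (shared + non-shared) area.

    Both profiles are sequences of non-negative heights sampled at the
    same positions and assumed to be scaled so that they are comparable.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be sampled at the same positions")
    # Area under exactly one of the two curves.
    non_shared = sum(abs(a - b) for a, b in zip(profile_a, profile_b))
    # Area under at least one of the two curves.
    total = sum(max(a, b) for a, b in zip(profile_a, profile_b))
    return non_shared / total if total > 0 else 0.0


# Hypothetical relative allele frequencies for a four-allele marker.
control = [0.10, 0.40, 0.30, 0.20]
high_g = [0.10, 0.30, 0.40, 0.20]
print(round(delta_aip(control, high_g), 3))  # 0.182
```

Identical profiles give 0 and completely non-overlapping profiles give 1, so the statistic is bounded between 0 and 1.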
Rather than optimizing each primer pair, standard conditions
were used for PCR amplification. Using our standard optimizing
conditions, 73% (129) of the 179 markers yielded replicable
amplification products (where at least three of each group’s four
replications gave near-identical overlays). A second amplification protocol was attempted for the 50 markers that failed to yield
amplimers in the initial PCR (see the Taq Gold procedure in
Materials and Methods). Eighteen of these markers yielded
replicable amplification products, bringing the total number of
scoreable markers to 147. As a result, ∼65% of the chromosome
was covered at the 1 cM grid (representing 0.5 cM on each side
of the marker) or, alternatively, 85% of chromosome 4 was
covered at the 2 cM grid level of resolution (within 1 cM of a
scored marker).
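Coverage figures of this kind can be estimated by merging the windows around each scored marker. The sketch below is a minimal illustration, assuming marker positions are given in cM from one end of the chromosome; the function name and the example positions are hypothetical and are not the study's map.

```python
def coverage_fraction(marker_positions_cm, chromosome_length_cm, half_window_cm):
    """Fraction of the chromosome within half_window_cm of any marker."""
    # One interval per marker, clipped to the chromosome ends.
    intervals = sorted(
        (max(0.0, p - half_window_cm), min(chromosome_length_cm, p + half_window_cm))
        for p in marker_positions_cm
    )
    covered = 0.0
    current_start, current_end = None, None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged interval.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / chromosome_length_cm


# Hypothetical marker positions (cM) on a hypothetical 10 cM segment.
positions = [1.0, 2.0, 3.5, 7.0]
print(coverage_fraction(positions, 10.0, 0.5))  # 1 cM grid
print(coverage_fraction(positions, 10.0, 1.0))  # 2 cM grid
```

Widening the window from 0.5 to 1 cM on each side raises coverage less than twofold whenever neighbouring windows overlap, which is consistent with the 65% and 85% figures quoted above.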
The ∆AIP was calculated for each marker (Table 1). An example
of overlaid and non-overlaid control and high g AIPs is shown for
marker D4S1607 (Fig. 1) with a ∆AIP of 0.22 (Table 1).
Table 1. A sample of 147 chromosome 4 markers and their ∆AIPs for the high g and control groups in the original sample

Marker       ∆AIP    Marker       ∆AIP    Marker       ∆AIP    Marker       ∆AIP
D4S3038      0.15    D4S2912      0.04    D4S1534      0.05    D4S3008      0.07
D4S127       0.04    D4S3027[a]   0.28    D4S2460      0.16    D4S1549      0.09
D4S1614      0.10    D4S2955      0.13    D4S1544      0.02    D4S1588      0.08
D4S3034      0.02    D4S3001[a]   0.25    D4S3006      0.12    D4S2999      0.07
D4S412       0.04    D4S2995      0.02    D4S3037      0.03    D4S3016      0.16
D4S2957      0.03    D4S1587      0.07    D4S423       0.29    D4S2980      0.05
MSX1[a]      0.21    D4S2950      0.06    D4S2407      0.18    D4S3033      0.16
D4S3023      0.10    D4S405       0.22    D4S1559[a]   0.23    D4S3046      0.09
D1S503       0.08    D4S2919      0.00    D4S2973      0.05    D4S2952      0.09
D4S431       0.08    259          0.21    D4S1560      0.12    D4S1636      0.15
D4S2935      0.07    D4S174       0.17    D4S2986[a]   0.24    D4S1566      0.04
D4S394       0.07    D4S1547      0.16    D4S1591      0.07    D4S1502      0.07
D4S2923      0.05    D4S1536      0.10    D4S2961      0.07    D4S2910      0.10
D4S2928      0.09    D4S2971      0.10    D4S1570      0.10    D4S243       0.15
DRD5         0.19    D4S3002      0.05    D4S3026      0.05    D4S1545      0.05
D4S3009      0.20    D4S1577      0.06    D4S2917      0.13    D4S1617      0.15
D4S1582      0.13    D4S2996      0.07    D4S2940      0.12    D4S622       0.02
D4S2906      0.13    D4S428       0.14    D4S1580      0.03    D4S2977      0.13
D4S2944      0.06    D4S2916      0.12    D4S191       0.16    D4S3030      0.01
D4S1602      0.09    D4S3000      0.20    D4S2392      0.10    D4S1529      0.22
D4S1511      0.13    D4S1592      0.13    D4S1612      0.06    D4S2967[a]   0.22
D4S3048      0.10    D4S1518      0.12    D4S430       0.24    D4S1607[a]   0.22
D4S1567      0.17    D4S1569      0.14    D4S3024      0.17    D4S3015      0.10
D4S2926      0.05    D4S1600      0.05    D4S1524[a]   0.36    D4S2951      0.17
D4S419       0.11    D4S3004      0.09    D4S1527      0.06    D4S3041      0.06
D4S3020      0.19    D4S1541      0.06    D4S1615      0.07    D4S2920      0.11
D4S3017      0.29    D4S1568      0.09    D4S2938      0.06    D4S2943[a]   0.20
D4S2953      0.14    D4S392       0.14    D4S2286      0.12    D4S1554      0.22
D4S2933      0.05    D4S2931      0.11    D4S3039      0.08    D4S2954      0.05
D4S1590      0.07    D4S1517      0.10    D4S422       0.23    D4S1535      0.12
D4S1551      0.15    D4S2990      0.02    D4S1576      0.11    D4S408       0.17
D4S3044      0.13    D4S2947      0.18    D4S175       0.25    D4S171       0.18
D4S391       0.16    D4S2361      0.12    D4S397       0.08    D4S1540      0.13
D4S1609      0.02    D4S2964      0.24    D4S1565[a]   0.21    D4S426       0.14
D4S418       0.17    D4S2922      0.15    D4S1561      0.09    D4S2921[a]   0.24
D4S1618      0.01    D4S2932      0.07    D4S424       0.10    D4S2975      0.23
D4S2408      0.15    D4S1538      0.08    D4S2998      0.13

The 147 markers and their respective ∆AIPs are listed according to their position on the genetic map (see text), reading down each column in turn.
The upper left entry represents the marker closest to the top of the telomere of 4p, and the lower right entry represents the marker closest to the bottom of the telomere
of 4q. ∆AIP was calculated as detailed in the text.
[a] P < 0.05.
Table 2. Specific alleles from markers with significant ∆AIP values in the
original pools were tested for significance in the replication pools

Marker     Original pool        Original pool        Replication pool
           ∆AIP     P-value     AST-χ2   P-value     AST-χ2   P-value
D4S2943    0.20     0.03        3.25     0.07        2.62     <0.05
D4S1565    0.21     0.03        1.52     0.22        0.11     0.37
MSX1       0.21     0.02        4.68     0.03        3.04     0.04
D4S1607    0.22     <0.05       6.34     0.02        3.29     0.03
D4S2967    0.22     0.01        1.78     0.18        0.60     0.22
D4S1559    0.23     0.01        4.50     0.03        N/A      N/A
D4S2921    0.24     0.02        4.61     0.03        0.47     0.25
D4S2986    0.24     <0.05       1.19     0.28        0.40     0.26
D4S3001    0.25     0.02        4.26     0.04        0.93     0.17
D4S3027    0.28     0.01        4.51     0.03        N/A      N/A
D4S1524    0.36     0.01        3.86     0.05        N/A      N/A

This table shows the ∆AIP and P-values for the 11 markers where ∆AIPs were
significant on pooled analysis. The allele from each of these 11 markers showing
the greatest frequency difference in the original sample was identified and tested
using an allele-specific χ2 test (AST) as described in the text. The χ2 and
P-values are given for the most significant allele (columns 4 and 5). Each of
these alleles was then tested in the replication pools. The χ2 and P-values are
given in columns 6 and 7. Significant allele-specific values (P < 0.05) are in
bold. The actual P-values for the D4S1607 and D4S2986 markers’ original pool
∆AIPs were just <0.05, but are shown as 0.05 in this table after rounding up.
N/A, not applicable, because the association between the tested high g and control peak of the original group changed direction in the replication group, i.e. a
positive association with high g in the original group was negatively associated
with high g in the replication group or vice versa.
Table 3. P-values of individual genotypings for three markers in the original
(O) and replication (R) samples

Marker     P-value      Specifically     P-value    P-value    P-value
           (CLUMP O)    tested allele    (χ2 O)     (χ2 R)     (CLUMPtotal)
D4S1607    0.049        Allele 6         0.006      0.026      0.019
D4S2943    0.010        Allele 5         0.024      0.012      0.00045
MSX1       0.12         Allele 3         0.028      0.031      0.011

CLUMP analysis of the original high g and control samples revealed two significant associations at the 0.05 level (for markers D4S1607 and D4S2943). A negative association between D4S1607 allele 6 and high g was found in the original
sample (column 4) and was tested for significance in the replication group (see
text and column 5). A positive association between D4S2943 allele 5 and high
g was found in the original sample, as was a positive association between MSX1
allele 3 and high g. These positive associations were tested for significance in
the replication group (see text and column 5). The CLUMPtotal analysis included individuals from both original and replication populations (see text and
column six).
We used a three-stage strategy that provides a better balance
between false-positive and false-negative errors by permitting a
lenient significance level (P < 0.05) in the first stage (which
reduces false negatives but increases false positives) and then
removing false positives in the second stage. In the first stage,
∆AIPs were compared for the 147 markers for DNA pooled from
the original group of children of high g and the group of controls
of average g. In the second stage, markers that yielded significant
∆AIPs in the first stage were tested using DNA pooling in an
independent sample of children of extremely high g and an
independent control group. In the third stage, markers that yielded
significant (P < 0.05) differences for DNA pools in both the
original and replication samples were genotyped individually for
all subjects in order to confirm the results of DNA pooling using
traditional methods.
For pooled comparison of the original high g and control pools,
markers were tested for significance using the simulation
program described by Daniels et al. (13) and in Materials and
Methods. The simulation program estimates the significance of
the ∆AIP for a particular marker. Eleven markers showed
significant (P < 0.05) ∆AIPs between the high g and control group
(Table 2). More markers were observed that showed a significant
∆AIP (11) than would be expected by chance (7.4) given the
lenient criterion of P < 0.05 which does not correct for multiple
testing (Discussion).
These 11 markers were selected for pooled genotyping in the
replication sample. For each marker, the individual peak (allele)
showing the greatest difference in the original high g and control
groups was identified (see below). The replication sample was
then used to test this allele-specific hypothesis for the 11 markers;
i.e. rather than accepting any significant pattern of allelic
differences in the replication sample, we required that the same
allele yielded a significant difference in the replication sample.
We also required that the replication sample yielded an allelic
difference in the same direction as in the original sample; for this
reason, we used a one-tailed test of significance in the replication
sample. The strength of the multi-stage replication design is that,
of the 11 markers significant at P < 0.05 in the original sample,
none (i.e. 0.6) would be expected to be significant by chance
alone in the replication sample with P < 0.05.
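The chance expectations quoted for the two stages follow from multiplying the number of tests by the significance level. The sketch below reproduces that arithmetic and, as an extra illustration not taken from the original analysis, computes a binomial tail probability under the simplifying assumption that the tests are independent (closely spaced markers may not be).

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests significant by chance alone."""
    return n_tests * alpha


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), assuming independent tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


stage1 = expected_false_positives(147, 0.05)  # first stage: 147 markers
stage2 = expected_false_positives(11, 0.05)   # second stage: 11 markers
# Chance of three or more of the 11 markers replicating by chance alone.
p_three_or_more = prob_at_least(3, 11, 0.05)
print(round(stage1, 2), round(stage2, 2), round(p_three_or_more, 4))
```

The first two values correspond to the 7.4 and 0.6 quoted in the text after rounding.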
The allele-specific test was significant and in the same direction
for three markers (D4S2943, MSX1 and D4S1607) in the
replication sample. These were then genotyped separately for all
individuals in order to confirm the results of DNA pooling.
Genotyping individuals
Each individual was genotyped separately for the three markers
that showed significant differences between high g and control
groups in both original and replication samples on pooled DNA
analyses. In the original sample, the CLUMP program was used
to determine whether the overall frequencies of the control
individuals’ alleles were significantly different from those of the
high g individuals using Monte Carlo simulations (19). The
analysis (Table 3) revealed significant differences for markers
D4S2943 (P = 0.01) and D4S1607 (P = 0.049), whereas MSX1 did
not reach statistical significance (P = 0.12).
For each of the three markers, the allele-specific hypothesis
(allele 5 of D4S2943, allele 3 of MSX1 and allele 6 of D4S1607)
was also tested for significance in the original groups using
Pearson χ2 analysis (Table 3). For all three markers, the difference
between the high g and control groups was significant at the 0.05
level. More importantly, the allele-specific hypothesis was also
significant for all three markers in the replication samples, with
P-values of 0.012, 0.031 and 0.026, respectively, using a
one-tailed Pearson χ2 test (Table 3). Finally, combined analysis
of individual genotyping data from both original and replication
samples using CLUMP revealed significant differences between
high g and control groups for the three markers (D4S2943, P =
0.00045; MSX1, P = 0.011; D4S1607, P = 0.019). Allele counts
for D4S2943, MSX1 and D4S1607 are shown in Figure 2 for the
original and replication samples as well as for the combined
populations.
DISCUSSION
Systematic screening of the genome for allelic association
requires genotyping a dense map of markers, which is facilitated
by DNA pooling. Application of DNA pooling to 147 markers on
chromosome 4, a larger than average chromosome, yielded three
alleles (one from each of markers D4S2943, MSX1 and D4S1607)
that showed significant associations with g in both original and
replication samples. These DNA pooling results were confirmed
when each individual was genotyped separately and the data
analysed by standard statistical procedures.
In addition to applying DNA pooling in a systematic approach
to allelic association using a dense map of markers, another novel
aspect of the present study is its use of a multi-stage replication
strategy with more lenient criteria in the first stage in an attempt
to strike a balance between false-positive and false-negative
results. In view of the large number of markers studied, we only
accepted the presence of marker–QTL associations when there
was an accumulation of evidence in their favour (20,21)
including the following: (i) an overall excess of statistically
significant results at P < 0.05 over the number expected by chance
alone; (ii) concentration of significant results in a few markers;
(iii) replications of significant results in a second independent
sample; and (iv) concordance between the original and replication samples with respect to the associated allele and the
direction of association. This issue of what is an acceptable level
of significance in studies of this kind is complex, with divergent
views being expressed (8). However, we feel that our approach of
using a multi-stage design with built-in replication offers the best
balance between type I and type II error, especially in the quest
for QTLs of small effect size.
It should be emphasized that allelic association using DNA
pooling to screen a dense map of markers can only identify some,
but not all, QTLs. For example, it will only detect old mutations
that have perfused through many generations. Haplotype analyses of even denser maps would increase the likelihood of
detecting other QTLs, leading eventually to saturation mapping
of all functional polymorphisms (8). In addition, greater power to
detect QTLs of smaller effect size can be obtained by increasing
the sample sizes or by selecting even more extreme samples.
Although it is reasonable to screen for old and relatively large
QTLs because these are likely to be most useful in terms of
understanding links between genes, brain and cognitive functioning, by no means does the approach exclude all other QTLs.
Encouraged by these first results from applying DNA
pooling to a systematic allelic association screen with
a dense map of markers, we are proceeding with a scan of 3500
markers across the genome to find other QTLs for g. Although
none of these QTLs is expected to account for a large amount of
the variance, we expect that a systematic genome scan will yield
QTLs that together account for a substantial portion of the genetic
variance for g. We are doubling our sample sizes in order to
increase the power to detect QTLs of even smaller effect size. We
are also obtaining DNA from parents of the high g subjects. These
parental data will provide a within-family replication to test our
hypothesis that by limiting the sample to non-Hispanic Caucasian
individuals we have attenuated the possibility that QTL associations are due to ethnic differences in marker allele frequencies.
As well as confirming associations with information from parents
and larger sample sizes, we intend to test more markers in the
close vicinity (within 1 cM) of those that yielded positive results.
Finally, we will target genes in close proximity to our associated
markers in an attempt to find the genes specifically responsible
for such associations.
Identifying replicable QTLs associated with g will make it
possible to address questions about development, multivariate
analysis and gene–environment interplay through the use of
measured genotypes rather than indirect inferences about heritable influence based on familial resemblance (22). In terms of
developmental questions, the present samples are children, which
raises the question of whether similar results will be found with
adult samples. Quantitative genetic research shows that the
heritability of g increases linearly throughout development (23),
which suggests that QTL associations for g may be stronger later
in life. Multivariate questions include whether QTLs found by
selecting extremes will also be correlated with the normal range
of variation as predicted by QTL theory. Also, QTLs for g
assessed using standard psychometric tests may be associated
with other types of behavioural tests such as information-processing tests and with brain-imaging measures of brain
structure and function. QTLs for g will provide discrete windows
through which to view pathways in the brain between genes and
learning and memory. As is the case with most important
advances, identifying genes for g will also raise new ethical
issues. These concerns must be taken seriously but they are
largely based on misconceptions about genetic research on
complex traits that are influenced by multiple genes as well as
multiple environmental factors (24,25).
As well as having implications for research into cognitive
ability, our results suggest that DNA pooling can be used to detect
group differences in allele frequencies and is thus of potential
importance in large-scale genome scanning for allelic association.
MATERIALS AND METHODS
Pooling of DNA for groups
Genomic DNA was extracted from permanent cell lines that were
derived from lymphocytes using a standard protocol. Each
individual DNA sample was diluted to 8 ng/µl. DNA quantification prior to pooling was performed in triplicate using the
PicoGreen fluorescent assay and a Fluoroskan Ascent fluorometer. Two sets of pools, original and replication, were constructed. Each set consisted of four separately prepared pools—
two from the control groups of average g and two from high g
individuals.
Primer selection
Markers containing di-, tri- and tetranucleotide repeats were
selected from the Location Database (LDB) composite map (18;
http://cedar.genetics.soton.ac.uk/public_html/). Ideally, the
markers were 1 cM apart, had between five and nine alleles,
heterozygosity scores between 0.5 and 0.9 and their genetic map
order was confirmed with physical map data. However, due to a
scarcity of markers in some regions, less than ideal (but still
informative) markers were chosen. Marker positions initially
were determined using an average of the LDB male and female
genetic map values assuming that the multiple recombinational
events that occurred after any marker–QTL associations were
independent of gender. In rare cases, where genetic and physical
map orders of a marker relative to its neighbours disagreed, the
physical map order was taken, and a new approximate genetic
map distance (in cM) was estimated.
Figure 2. Allelic counts for individual genotyping of markers MSX1, D4S1607 and D4S2943 for high g and control groups in the original, replication and combined
samples. Control individuals are represented by black bars, and high g individuals are represented by white bars. Note that allele 5 of D4S2943 corresponds to peak
4 of the AIP for the original group (see Fig. 1). Small differences in total number of alleles between markers reflect failed genotypes.
Amplification of pooled DNA samples
Touchdown PCR (26) was carried out to amplify pooled DNA.
Each of the two replicate pools from the original set of high g and
control groups was amplified in duplicate, resulting in eight PCR
products. For the markers that gave significant ∆AIP values,
amplification was performed on the replication set of DNA pools
using the same PCR protocol. Each PCR contained the following
reagents: 48 ng of pooled genomic DNA, dNTPs (1.2 mM each),
1× Taq DNA polymerase buffer (Qiagen, Crawley, UK; with
1.5 mM MgCl2), Taq DNA polymerase (Qiagen; 0.6 U), 1.4 pmol
of each primer and water to 12 µl. The DNA was denatured
initially at 94°C for 5 min, followed by five cycles at 94°C (30 s),
[56°C for the first cycle, then reduced by 1°C per cycle] (30 s), 72°C
(30 s), and 28 cycles at 94°C (30 s), 50°C (30 s), 72°C (30 s). A
final extension was performed at 72°C (10 min). Note that the
annealing temperatures for the above PCRs were reduced by
10°C when primer pair IFNG was used.
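The annealing schedule of this touchdown protocol can be written out cycle by cycle. The sketch below is only an illustration of the schedule as described above; the function name and the `offset_c` parameter (used here to model the 10 degree reduction for IFNG) are assumptions.

```python
def touchdown_annealing_temps(start_c=56, step_c=1, touchdown_cycles=5,
                              plateau_c=50, plateau_cycles=28, offset_c=0):
    """Annealing temperature (degrees C) for each cycle, in order."""
    # Touchdown phase: start high and drop by step_c each cycle.
    touchdown = [start_c - step_c * i - offset_c for i in range(touchdown_cycles)]
    # Plateau phase: constant annealing temperature.
    plateau = [plateau_c - offset_c] * plateau_cycles
    return touchdown + plateau


temps = touchdown_annealing_temps()
print(temps[:6], len(temps))  # [56, 55, 54, 53, 52, 50] 33

# Same schedule lowered by 10 degrees, as described for primer pair IFNG.
ifng_temps = touchdown_annealing_temps(offset_c=10)
print(ifng_temps[:6])  # [46, 45, 44, 43, 42, 40]
```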
PCR using Taq Gold (Perkin-Elmer, Norwalk, CT) included the
same reagents as above except that Qiagen buffer and enzyme
were replaced with 10× Taq Gold buffer, MgCl2 solution (to
2.5 mM final concentration) and Taq Gold polymerase. The
cycling parameters were as follows: 1 cycle at 95°C for 10 min,
followed by 35 cycles at 95°C (45 s), 50°C (45 s), and a final
cycle at 50°C for 10 min.
∆AIP analysis of PCR products from pooled samples
ABI 373A image patterns of the pooled PCR products were
overlaid using GENOTYPER software, imported into DeBabelizer and ∆AIPs calculated as described (13). The statistical
significance of a ∆AIP depends upon the number of marker
alleles and their frequency (13). We therefore obtained an
estimate of the P-value by simulating case and control samples
from the population with allele frequencies estimated from the
peak heights of the control sample as described (13).
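The simulation idea can be sketched as follows. This is a minimal illustration, not the procedure of ref. 13: the statistic used here (half the summed absolute difference in allele frequencies) is a stand-in for the published ∆AIP calculation, and the function names, the number of simulations and the seed are assumptions. The peak heights are the D4S1607 control values quoted in the next subsection.

```python
import random

# Control-pool peak heights for D4S1607, as quoted in the text.
CONTROL_PEAKS = [230, 1037, 80, 61, 231, 811, 105, 230, 269, 394, 88, 79]


def allele_freqs(peak_heights):
    """Estimate allele frequencies as each peak's share of total height."""
    total = sum(peak_heights)
    return [h / total for h in peak_heights]


def pattern_difference(freqs_a, freqs_b):
    """Stand-in statistic (NOT the published dAIP formula)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(freqs_a, freqs_b))


def sample_freqs(freqs, n_alleles, rng):
    """Draw n_alleles alleles from freqs and return sample frequencies."""
    draws = rng.choices(range(len(freqs)), weights=freqs, k=n_alleles)
    counts = [0] * len(freqs)
    for allele in draws:
        counts[allele] += 1
    return [c / n_alleles for c in counts]


def simulated_p_value(control_heights, observed_stat, n_alleles,
                      n_sims=2000, seed=0):
    """Fraction of null simulations with a statistic >= observed_stat.

    Both simulated groups are drawn from the same population, whose
    allele frequencies are estimated from the control peak heights.
    """
    rng = random.Random(seed)
    null_freqs = allele_freqs(control_heights)
    hits = 0
    for _ in range(n_sims):
        case = sample_freqs(null_freqs, n_alleles, rng)
        control = sample_freqs(null_freqs, n_alleles, rng)
        if pattern_difference(case, control) >= observed_stat:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # 0.2 is a hypothetical observed value; 102 assumes 51 children
    # contributing two alleles each.
    print(simulated_p_value(CONTROL_PEAKS, 0.2, n_alleles=102))
```

Because the null distribution is rebuilt from the observed control frequencies for each marker, the threshold adapts to the number and frequency of alleles, which is the dependence noted above.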
Allele-specific analysis of pooled samples
The height of the peak for both groups was converted to a ratio
of the total of all the AIP peak heights so that the score represented
the number of alleles in that peak. This procedure can be
illustrated using the AIPs from the original group pools of
D4S1607 shown in Figure 1. In this case, the sixth peak (from left
to right) has a height of 811 in the control group and 461 in the
high g group. The score for peak 6 in the control group is
811/(230 + 1037 + 80 + 61 + 231 + 811 + 105 + 230 + 269 + 394
+ 88 + 79) × 100 (the number of alleles in the pooled sample) =
22.4. Pearson χ2 with a 2×2 contingency table was used to
compare the high g and control values. For D4S1607 peak 6, a
two-tailed Pearson χ2 analysis gave a χ2-value of 5.91 with a
P-value of 0.02.
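The arithmetic above can be checked with a short sketch. The function names are assumptions introduced for illustration. The summed peak heights of the high g pool are not given in this section, so the example reproduces only the control score of 22.4 and supplies a generic 2×2 Pearson χ2 without recomputing the reported value of 5.91.

```python
import math

# Control-pool peak heights for D4S1607, as quoted in the text.
CONTROL_PEAKS = [230, 1037, 80, 61, 231, 811, 105, 230, 269, 394, 88, 79]


def allele_score(peak_heights, peak_index, n_alleles=100):
    """Scale one peak's share of total height to an allele count."""
    return peak_heights[peak_index] / sum(peak_heights) * n_alleles


def p_value_1df(chi2):
    """Two-tailed P-value for a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Rows are the two groups; columns are 'this allele' and
    'all other alleles'.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d))
    return chi2, p_value_1df(chi2)


if __name__ == "__main__":
    control = allele_score(CONTROL_PEAKS, 5)  # sixth peak, zero-based index 5
    print(round(control, 1))
    print(p_value_1df(5.91))
```

For the reported χ2 of 5.91 with one degree of freedom, `p_value_1df` returns about 0.015, which is consistent with the stated P-value of 0.02 after rounding.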
Individual genotyping
PCR was performed using the same protocol as described for
pooled DNA, except that only 30 ng of DNA was used per PCR.
For the original sample, the significance of differences between
the control and high g individuals was determined for the overall
pattern of allelic differences using the CLUMP program which is
based on Monte Carlo simulations (19). The significance of the
allele showing the greatest differences in the original sample was
tested using Pearson χ2 comparing the frequency of that allele
against all other alleles.
Elimination of A-overhang from PCR products
Klenow fragment (United States Biochemical, Amersham, UK)
was used to eliminate A-overhangs from PCR products. The PCR
products that were to be electrophoresed in the same gel lane (see
below) were mixed prior to Klenow treatment. Klenow (0.25 U)
was added to 12.5 µl of mixed PCR product, 1 µl of water and
1.5 µl of reaction buffer (United States Biochemical). The
mixture was incubated at 30°C for 1 h.
Gel electrophoresis
Up to four fluorescently labelled (Hex, Fam or Tet) markers were
electrophoresed in each gel lane. Only markers whose products
did not overlap (regardless of which dye they contained) were put
in the same lane. Due to differences in intensity of the three dyes,
Fam-labelled markers were diluted 10-fold, Tet-labelled markers
were diluted 5-fold and Hex-labelled markers were diluted
3-fold. Diluted pools of PCR products (1.5 µl) were mixed with
loading dye (1.5 µl) and a GS350 size ladder (0.5 µl;
Perkin-Elmer). These mixes were typically loaded on denaturing gels
containing at least 8% acrylamide and run at 12 W on ABI 373A
sequencers.
ACKNOWLEDGEMENTS
The research is supported by a grant from the US National
Institute of Child Health and Human Development (HD27694).
REFERENCES
1. Jensen, A.R. (1998) The g Factor: The Science of Mental Ability. Praeger,
London.
2. Brody, N. (1992) Intelligence. 2nd edn. Academic Press, New York.
3. Plomin, R., DeFries, J.C., McClearn, G.E. and Rutter, M. (1997) Behavioral
Genetics. 3rd edn. W.H. Freeman, New York.
4. Chipuer, H.M., Rovine, M.J. and Plomin, R. (1990) LISREL modeling:
genetic and environmental influences on IQ revisited. Intelligence, 14,
11–29.
5. Devlin, B., Daniels, M. and Roeder, K. (1997) The heritability of IQ. Nature,
388, 468–471.
6. Plomin, R. and Petrill, S.A. (1997) Genetics and intelligence: what’s new?
Intelligence, 24, 53–77.
7. Plomin, R., Owen, M.J. and McGuffin, P. (1994) The genetic basis of
complex human behaviors. Science, 264, 1733–1739.
8. Risch, N. and Merikangas, K. (1996) The future of genetic studies of complex
human diseases. Science, 273, 1516–1517.
9. Risch, N. and Teng, J. (1998) The relative power of family-based and
case–control designs for linkage disequilibrium studies of complex human
diseases. I. DNA pooling. Genome Res., 8, 1273–1288.
10. Plomin, R., McClearn, G.E., Smith, D.L., Skuder, P., Vignetti, S., Chorney,
M.J., Chorney, K., Kasarda, S., Thompson, L.A., Detterman, D.K., Petrill,
S.A., Daniels, J., Owen, M.J. and McGuffin, P. (1995) Allelic associations
between 100 DNA markers and high versus low IQ. Intelligence, 21, 31–48.
11. Chorney, M.J., Chorney, K., Seese, N., Owen, M.J., McGuffin, P., Daniels, J.,
Thompson, L.A., Detterman, D.K., Benbow, C.P., Lubinski, D., Eley, T.C.
and Plomin, R. (1998) A quantitative trait locus (QTL) associated with
cognitive ability in children. Psychol. Sci., 9, 159–166.
12. Wickelgren, I. (1998) Tracking insulin to the mind. Science, 280, 517–519.
13. Daniels, J., Holmans, P., Williams, N., Turic, D., McGuffin, P., Plomin, R. and
Owen, M.J. (1998) A simple method for analyzing microsatellite allele image
patterns generated from DNA pools and its application to allelic association
studies. Am. J. Hum. Genet., 62, 1189–1197.
14. Lykken, D. (1968) Statistical significance in psychological research. Psychol.
Bull., 70, 151–159.
15. Wechsler, D. (1974) Wechsler Intelligence Scale for Children—Revised. The
Psychological Corporation, New York.
16. Lubinski, D. and Benbow, C.P. (1994) The Study of Mathematically
Precocious Youth (SMPY): the first three decades of a planned 50-year study
of intellectual talent. In Subotnik, R. and Arnold, K. (eds), Beyond Terman:
Longitudinal Studies in Contemporary Gifted Education. Ablex, Norwood,
NJ, pp. 255–281.
17. Brodnick, R.J. and Ree, M.J. (1995) A structural model of academic
performance, socioeconomic status, and Spearman’s g. Educ. Psychol.
Measurement, 55, 583–594.
18. Collins, A., Frezal, J., Teague, J. and Morton, N.E. (1996) A metric map of
humans: 23,500 loci in 850 bands. Proc. Natl Acad. Sci. USA, 93,
14771–14775.
19. Sham, P.C. and Curtis, D. (1995) Monte Carlo tests for associations between
disease and alleles at highly polymorphic loci. Ann. Hum. Genet., 59, 97–105.
20. Lander, E. and Kruglyak, L. (1995) Genetic dissection of complex traits:
guidelines for interpreting and reporting linkage results. Nature Genet., 11,
241–247.
21. Lipkin, E., Mosig, M.O., Darvasi, A., Ezra, E., Shalom, A., Friedmann, A.
and Soller, M. (1998) Quantitative trait locus mapping in dairy cattle by
means of selective milk DNA pooling using dinucleotide microsatellite
markers: analysis of milk protein percentage. Genetics, 149, 1557–1567.
22. Plomin, R. and Rutter, M. (1999) Child development, molecular genetics, and
what to do with genes once they are found. Child Dev., in press.
23. McGue, M., Bouchard, T.J.Jr, Iacono, W.G. and Lykken, D.T. (1993)
Behavioural genetics of cognitive ability: a life-span perspective. In Plomin,
R. and McClearn, G.E. (eds), Nature, Nurture and Psychology. American
Psychological Association, Washington, DC, pp. 59–76.
24. Rutter, M. and Plomin, R. (1997) Opportunities for psychiatry from genetic
findings. Br. J. Psychiatry, 171, 209–219.
25. Sherman, S.L., DeFries, J.C., Gottesman, I.I., Loehlin, J.C., Meyer, J.M.,
Pelias, M.Z., Rice, J. and Waldman, I. (1997) Behavioral Genetics ’97:
ASHG Statement. Recent developments in human behavioral genetics: past
accomplishments and future directions. Am. J. Hum. Genet., 60, 1265–1275.
26. Rithidech, K.N., Dunn, J.J. and Gordon, C.R. (1997) Combining multiplex
and touchdown PCR to screen murine microsatellite polymorphisms.
BioTechniques, 23, 36–44.