Glycobiology, 2015, vol. 25, no. 2, 211–224
doi: 10.1093/glycob/cwu104
Advance Access Publication Date: 28 September 2014
Original Article
A glycogene mutation map for discovery
of diseases of glycosylation
Lars Hansen1,2, Allan Lind-Thomsen3, Hiren J Joshi2, Nis Borbye Pedersen2,
Christian Theil Have4, Yun Kong2, Shengjun Wang2, Thomas Sparso4,
Niels Grarup4, Malene Bech Vester-Christensen2, Katrine Schjoldager2,
Hudson H Freeze5, Torben Hansen4, Oluf Pedersen4, Bernard Henrissat2,6,
Ulla Mandel2, Henrik Clausen2, Hans H Wandall2, and Eric P Bennett1,2
2Copenhagen Center for Glycomics, Department of Cellular and Molecular Medicine, School of Dentistry, Faculty of
Health Sciences, 3Wilhelm Johannsen Center for Genome Research, Department of Cellular and Molecular Medicine,
Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3B, Copenhagen N DK-2200, Denmark, 4The Novo
Nordisk Foundation Center for Basic Metabolic Research, Metabolics Genetics, Universitetsparken, Copenhagen Ø
DK-2100, Denmark, 5Human Genetics Program, Sanford Children’s Health Research Center, Sanford Burnham Medical Research Institute, La Jolla, CA 92037, USA, and 6Architecture et Fonction des Macromolécules Biologiques,
UMR 7257, Centre National de la Recherche Scientifique, Aix-Marseille Université, Marseille 13288, France
1To whom correspondence should be addressed: Tel: +45 35335499; e-mail: [email protected] (L.H.); Tel: +45 35326630; e-mail:
[email protected] (E.P.B.)
Received 24 July 2014; Revised 15 September 2014; Accepted 24 September 2014
Abstract
Glycosylation of proteins and lipids involves over 200 known glycosyltransferases (GTs), and deleterious
defects in many of the genes encoding these enzymes cause disorders collectively classified as congenital disorders of glycosylation (CDGs). Most known CDGs are caused by defects in glycogenes that affect
glycosylation globally. Many GTs are members of homologous isoenzyme families and deficiencies in
individual isoenzymes may not affect glycosylation globally. In line with this, there appears to be an
underrepresentation of disease-causing glycogenes among these larger isoenzyme homologous families. However, genome-wide association studies have identified such isoenzyme genes as candidates
for different diseases, but validation is not straightforward without biomarkers. Large-scale whole-exome sequencing (WES) provides access to mutations in, for example, GT genes in populations,
which can be used to predict and/or analyze functional deleterious mutations. Here, we constructed a
draft of a functional mutational map of glycogenes, GlyMAP, from WES of a rather homogenous population of 2000 Danes. We cataloged all missense mutations and used prediction algorithms, manual
inspection and, in the case of carbohydrate-active enzymes family GT27, experimental analysis of mutations
to map deleterious mutations. GlyMAP (http://glymap.glycomics.ku.dk) provides a first global view of
the genetic stability of the glycogenome and should serve as a tool for discovery of novel CDGs.
Key words: damaging mutations, glycogenes, nonsynonymous mutations, nsSNV, MAF
Introduction
Glycosylation of proteins and lipids in human involves complex
nontemplate-driven processes orchestrated by hundreds of enzymes,
transporters, chaperones and lectins (Stanley and Okajima 2010;
Moremen et al. 2012). Glycosylation is by far the most diverse
and complex class of posttranslational modification (PTM) and
© The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]
includes N-linked, multiple O-linked and C-type glycans attached to
glycoproteins and proteoglycans. The diversity of glycan structures
found on proteins and lipids is enormous (Haltiwanger and Lowe
2004; Stanley 2011; Moremen et al. 2012). Overwhelming evidence
implicates significant functions for glycosylation in most biological
processes that contribute to health and disease, and in support of
this >100 congenital disorders of glycosylation (CDGs) caused by deficiencies in genes involved in glycosylation (glycogenes) have been
identified (Haeuptle and Hennet 2009; Jaeken 2011; Freeze et al.
2014). CDGs are generally recessive with two nonfunctional alleles
or one nonfunctional and one hypomorphic allele, and only very
few autosomal-dominant inherited glycosylation defects are known
(Freeze et al. 2014).
Most of these deficiencies result in severe multisystemic disorders
caused by global defects in N-glycosylation of proteins, identifiable by
analysis of the abundant serum N-glycoprotein, transferrin (Freeze
2013). The second largest group of known disease-causing glycogenes
affects the mannose O-glycosylation (O-Man) pathway that causes related phenotypes within the group of congenital muscular dystrophies
(Godfrey et al. 2007; 2011). Other CDGs are caused by deficiencies in
glycogenes controlling different O-linked glycosylation pathways including O-Fuc, O-Glc, O-GlcNAc, O-Xyl, O-GalNAc and HYL-Gal,
as well as glycolipid and glycosylphosphatidylinositol (GPI) anchors
(Freeze et al. 2014) (Figure 1). A common feature of most of the glycogenes identified as causing CDGs to date is that they represent genes
without substantial genetic backup, that is, genes which do not have
apparent potential paralogous isoenzymes that may be predicted to
provide partial functional backup. Thus, there appears to be a striking
underrepresentation of glycogenes that are part of larger homologous
gene families among known disease-causing glycogenes. A fundamental question is therefore whether the many paralogous glycogenes in
human are spared from deleterious mutations, or if we have overlooked their role because they induce subtle nonglobal changes in glycosylation and perhaps also more subtle disease phenotypes.
Glycosyltransferases (GTs) are classified in GT families in the
carbohydrate-active enzymes (CAZy) database (Bourne and Henrissat
2001; Lombard et al. 2014) (www.cazy.org) based on sequence and
structure analyses and to date 44 human GT families exist. Large homologous GT families of isoenzymes with related properties cover many
steps in glycosylation in human. This is perhaps most pronounced for
steps in elongation, branching and capping of N-acetyllactosamine-based structures with large β4Gal-Ts (CAZy family GT7) (Almeida
et al. 1997; Amado et al. 1999), β3Gal-Ts (GT31) (Amado et al.
1998), β3GlcNAc-Ts (GT31) (Sasaki et al. 1997; Isshiki et al. 2003)
and capping by α2/3/4FUTs (GT10 and 11) (Becker and Lowe
2003) or α3/6STGals (GT29) (Audry et al. 2011), but many steps
have two or more isoenzymes with potential partially redundant functions (see Figure 1). Moreover, the initiation step of O-GalNAc glycosylation is covered by up to 20 polypeptide GalNAc-transferases
(GalNAc-Ts) belonging to GT27, which provides for the highest degree of differential regulation of a single glycosidic linkage and the
O-GalNAc glycoproteome (Bennett et al. 2012). So far only a few of
the GT genes that are members of the large isoenzyme families have
been shown to cause CDGs (GT7: B4GALT1, B4GALT7; GT27:
GALNT3; GT29: ST3GAL3, ST3GAL5 and GT31: B3GALNT1,
B3GALNT2, B3GALT6, B3GALTL and LFNG). Nevertheless,
genome-wide association studies (GWAS) and other gene association
strategies increasingly point to glycogenes belonging to homologous
Fig. 1. The human glycans synthesized by the 208 GT genes depicted for the major biosynthetic pathways in endoplasmic reticulum, Golgi and cytosol PTM. The
CAZy GT families involved in each glycosylation step are denoted.
gene families as candidate genes for diseases. Examples include several
members of the GalNAc-T gene family (GT27), β3GalT family
(GT31) and the sialyltransferase family (GT29) (Bennett et al.
2012). A major obstacle in validating and discovering potential
disease-causing glycogene candidates belonging to large gene families
is the lack of simple phenotypic screening assays, because the changes
in glycosylation are expected to be subtle and not global.
Knowledge of one or more validated deleterious alleles of a glycogene implicated as a disease-causing candidate gene by, for example,
GWAS would enable rapid confirmation of a causal role of this gene in
the particular disease if the frequency of the deleterious alleles were
higher in a disease population compared with controls. Importantly,
this discovery and validation strategy does not rely on a comprehensive
set of deleterious alleles in the population because all other allelic
variants that accumulate with one or more of the validated deleterious
alleles would be candidates for fully or partially inactive alleles. Next-generation sequencing (NGS) is now providing access to massive
amounts of data based on whole-genome sequencing and whole-exome sequencing (WES) from various normal as well as disease populations (Li et al. 2010; Albrechtsen et al. 2013). These data can serve
as a discovery platform for the identification and estimation of null allele frequencies in populations, provided these can be reliably predicted or experimentally determined.
We have recently reported WES data for a rather homogenous
population of 2000 Danes (LuCamp Initiative) (Albrechtsen et al.
2013; Lohmueller et al. 2013), which provides a unique resource for
discovery of deleterious mutant alleles with allele frequencies down to
0.01%. Here, we have examined 208 GT genes for functional deleterious mutant alleles in the Danish population using prediction algorithms, manual inspection, as well as experimental validation for
one large group of isoenzymes in GT27 (GalNAc-Ts). The results of
these analyses are based on unpublished WES data extracted from the exomes of 2000 Danish individuals, and these data were
mirrored against WES data for the same 208 GTs from the NHLBI EVS server.
The combined data provide the first global visualization of mutation rates and stability of the glycogenome, which
forms the basis for a Functional Mutation Map of Glycogenes, here
coined GlyMAP. In line with others we demonstrate that known
CDG disease-causing alleles, with a few exceptions, are rare and not
represented in a small largely healthy population like the Danish
population studied here. However, importantly we demonstrate that
the approach can identify null alleles in this population with fairly
low allele frequency. Thus, we identified and experimentally validated
two mutant alleles of GALNT5 and GALNT14 with allele frequencies of 0.2 and 0.1%, respectively. In summary, GlyMAP in its current
form describes the frequency of functionally validated GT nonsynonymous single-nucleotide variations (nsSNVs) identified in a homogenous Danish cohort. Future efforts are directed at expanding the
identification and validation of nsSNVs to include alternate genes
that participate in shaping the human glycome such as hydrolases,
carbohydrate-binding proteins, nucleotide transporters and others.
Ultimately, GlyMAP has the potential to embrace ongoing WES efforts and, in a comprehensive manner, display the frequencies of functionally validated nsSNVs in both healthy and disease cohorts.
Results
Human GT genes analyzed
We collected 198 human GT genes from 44 of the 96 classified GT
families in CAZy containing human genes (Table I; Supplementary
data, Table SI). We further included 10 GT genes, which were not annotated in CAZy. These genes are predicted to encode enzymes involved in the N-glycosylation pathway (ALG1L2 in GT33 and
ALG10B in GT59) and C-mannosylation (DPY19L1, DPY19L2,
DPY19L3) (Carson et al. 2006; Buettner et al. 2013), as well as putative GT genes with roles in glycosylation of α-dystroglycan (FKRP,
FKTN and TMEM5) (Kobayashi et al. 1998; Brockington et al.
2001; Vuillaumier-Barrot et al. 2012).
The predicted biosynthetic roles of the analyzed GTs and their CAZy
families in generating the human glycome are outlined in Figure 1, and
the figure illustrates that potential functional redundancies exist for many
biosynthetic steps. The human GT families with the largest number of
isoenzymes with potential redundant functions include the GT27 (polypeptide N-acetylgalactosaminyltransferases) followed by GT29 (α2,3- and
α2,6-sialyltransferases), GT31 (β3-galactosyltransferases) and
GT14 (β6-N-acetylglucosaminyltransferases and xylosyltransferases) families. Detailed information of the
208 human GT genes is summarized in Supplementary data, Table SI.
The overall aim with building GlyMAP was to catalog nsSNVs and
use prediction strategies as well as functional evaluation to identify GT
genes that encode inactive enzymes in a well-defined population (Danish population). Such a map of inactive GT alleles in a defined population will be highly useful to validate and dissect GT genes identified
as candidate genes for diseases by large-scale gene association studies.
The strategy for building GlyMAP was divided into three phases:
(i) assembly of exome information for human GTs, (ii) prediction of
damaging nsSNVs that affect function of the encoded GTs and (iii) experimental validation of predictions for GT27 (Figure 2).
Phase 1: Assembly of exome information of GTs
WES data derived from 2000 Danish individuals sequenced in the LuCAMP initiative (Albrechtsen et al. 2013; Lohmueller et al. 2013)
were used for analysis of the 208 GT genes. The WES study population
consisted of 1000 healthy individuals and 1000 with diagnosed type 2
diabetes, overweight and hypertension but otherwise healthy; and
complete WES data from 983 controls and 982 cases, in total 1965
individuals, were extractable. Clinical and biochemical characteristics
of the 2000 individuals selected from three different Danish study
populations were described previously (Albrechtsen et al. 2013;
Lohmueller et al. 2013).
The SNVs mapped in the LuCAMP dataset for 208 GT genes were
analyzed and all nsSNVs were collected and included for phase 2 analysis. Insertions and deletions (indels) were not included in the LuCAMP WES data and are therefore not represented here. Variations
in introns including splice sites and flanking UTRs were also not included in the study because the functional consequences of these cannot be predicted. A summary of nucleotide and amino acid positions,
minor allele frequencies (MAFs) and genotype frequencies for all
nsSNVs analyzed is available in a database designated dbGlyMAP
(www.CAZy.org).
We identified a total of 6588 SNVs in the LuCAMP GT WES data;
45% of these (2977) were located in exons and 28% (1875) represented nsSNVs (Table II and Figure 3A). The majority of these nsSNVs
had MAF values <0.001 corresponding to fewer than five alleles in the
study population (Figure 3B). A number of nsSNVs were only found
once or twice in the study population, and were initially considered potentially uncertain and too infrequent to serve the purpose of GlyMAP.
The effect of excluding these rare nsSNVs is illustrated in Figure 3C and
the number of nsSNVs is reduced from 1736 to 553 when only analyzing alleles occurring three times or more in the LuCAMP dataset. The
following analysis is therefore focused on these nsSNVs.
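The relationship between the MAF cutoff and the allele-count cutoff described above can be sketched as follows. This is a minimal illustration, assuming diploid genotypes for the 1965 individuals with complete WES data; the function names and the example variant counts are hypothetical and are not part of the GlyMAP pipeline.

```python
N_INDIVIDUALS = 1965  # individuals with complete WES data in LuCAMP


def alleles_at_maf(maf, n_individuals=N_INDIVIDUALS):
    """Number of minor alleles implied by a MAF (two alleles per person)."""
    return maf * 2 * n_individuals


def keep_recurrent(allele_counts, min_alleles=3):
    """Keep only variants observed at least `min_alleles` times."""
    return {v: n for v, n in allele_counts.items() if n >= min_alleles}


# Hypothetical minor-allele counts, for illustration only.
counts = {"variant_a": 1, "variant_b": 2, "variant_c": 3, "variant_d": 40}

print(alleles_at_maf(0.001))           # roughly 3.9 alleles
print(sorted(keep_recurrent(counts)))  # singletons and doublets are dropped
```

Under these assumptions, a MAF of 0.001 corresponds to about four minor alleles among 3930 chromosomes, which is consistent with the statement that most nsSNVs are represented by fewer than five alleles.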
Table I. Human glycosyltransferase genes
CAZy family | Number of genes | Gene name (HGNC nomenclature)
GT1 | 23 | ALG13, ALG14, UGT1A1–10, UGT2A1, UGT2A2, UGT2A3, UGT2B10 (a), UGT2B11, UGT2B15, UGT2B17, UGT2B28, UGT2B4, UGT2B7, UGT3A1, UGT3A2, UGT8
GT2 | 6 | ALG5, B3GNTL1, DPM1, HAS1, HAS2, HAS3
GT3 | 2 | GYS1, GYS2
GT4 | 5 | ALG11, ALG2, GLT1D1, GTDC1, PIGA
GT6 | 3 | ABO, GBGT1, GLT6D1
GT7/31 | 15 | B4GALT1, B4GALT2, B4GALT3, B4GALT4, B4GALT5, B4GALT6, B4GALT7, CHPF (b), CHPF2 (b), CHSY1 (b), CHSY3 (b), CSGALNACT1, CSGALNACT2, B4GALNT3, B4GALNT4
GT8/49 | 9 | GLT8D1, GLT8D2, GXYLT1, GXYLT2, GYG1, GYG2, GYLTL1B (b), LARGE (b), XXYLT1
GT10 | 8 | FUT10, FUT11, FUT3, FUT4, FUT5, FUT6, FUT7, FUT9
GT11 | 2 | FUT1, FUT2
GT12 | 2 | B4GALNT1, B4GALNT2
GT13 | 2 | MGAT1, POMGNT1
GT14 | 8 | GCNT1, GCNT2, GCNT3, GCNT4, GCNT6 (a), GCNT7, XYLT1, XYLT2
GT16 | 1 | MGAT2
GT17 | 1 | MGAT3
GT18 | 2 | MGAT5, MGAT5B
GT21 | 1 | UGCG
GT22 | 4 | ALG12, ALG9, PIGB, PIGZ
GT23 | 1 | FUT8
GT24 | 2 | UGGT1, UGGT2
GT25 | 3 | CERCAM, GLT25D1, GLT25D2
GT27 | 20 | GALNT1, GALNT2, GALNT3, GALNT4, GALNT5, GALNT6, GALNT7, GALNT8, GALNT9, GALNT10, GALNT11, GALNT12, GALNT13, GALNT14, GALNT15, GALNT16, GALNT17 (L6), GALNT18, GALNT19 (WBSCR17), GALNT20 (L5)
GT29 | 20 | ST3GAL1, ST3GAL2, ST3GAL3, ST3GAL4, ST3GAL5, ST3GAL6, ST6GAL1, ST6GAL2, ST6GALNAC1, ST6GALNAC2, ST6GALNAC3, ST6GALNAC4, ST6GALNAC5, ST6GALNAC6, ST8SIA1, ST8SIA2, ST8SIA3, ST8SIA4, ST8SIA5, ST8SIA6
GT31 | 21 | B3GALNT1, B3GALNT2, B3GALT1, B3GALT2, B3GALT4, B3GALT5, B3GALT6 (a), B3GALTL, B3GNT2, B3GNT3, B3GNT4, B3GNT5, B3GNT6, B3GNT7, B3GNT8, B3GNT9 (a), C1GALT1, C1GALT1C1, LFNG, MFNG, RFNG
GT32 | 2 | A4GALT, A4GNT
GT33 | 3 | ALG1, ALG1L, ALG1L2
GT35 | 3 | PYGB, PYGL, PYGM
GT39 | 2 | POMT1, POMT2
GT41 | 1 | OGT
GT43 | 3 | B3GAT1, B3GAT2, B3GAT3
GT47/GT64 | 5 | EXT1 (b), EXT2 (b), EXTL1 (b), EXTL2, EXTL3 (b)
GT49 | 3 | B3GNT1
GT50 | 1 | PIGM
GT54 | 3 | MGAT4A, MGAT4B, MGAT4C
GT57 | 2 | ALG6, ALG8
GT58 | 1 | ALG3
GT59 | 1 | ALG10
GT61 | 2 | EOGT, POMGNT2 (GTDC2)
GT65 | 1 | POFUT1
GT66 | 2 | STT3A, STT3B
GT68 | 1 | POFUT2
GT76 | 1 | PIGV
GT90 | 2 | KDELC1, POGLUT1
GTnc | 9 | PLOD3, DPY19L1 (c), DPY19L2 (c), DPY19L3 (c), DPY19L4 (c), FKRP (c), FKTN (c), TMEM5 (c)
(a) WES data not available for B3GALT6, B3GNT9, GCNT6 and UGT2B10.
(b) Genes with a tandem of two GT domains on a single polypeptide.
(c) Polypeptides with inferred GT activities.
Comparison of the LuCAMP data with similar data publicly available from the NHLBI Exome Sequencing Project (Exome Variant
Server, EVS—ESP6500) demonstrated that the LuCAMP data are
similar to the European American (EVS_EA) SNV data, and as
expected genetically distant to the African-American SNV data
(EVS_AA). Differences in the total number of SNVs and the intron/intergenic/exon distribution (Table II and Figure 3A) reflect the different
WES platforms and/or data-filtering protocols used (see Material and
Fig. 2. The selection process for finding putative damaging nsSNVs included three phases: phase 1, assembly of an SNV database for the 208 human
glycosyltransferases from the LuCAMP and the EVS public database (dbGlyMAP); phase 2, filtering dbGlyMAP for potentially damaging nsSNVs using
knowledge-based predictions; and phase 3, experimental validation of selected damaging nsSNVs in GT27.
Table II. SNVs in dbGlyMAP
Cohort | Number of samples | Total SNVs | Intron/intergenic SNVs (a) | Exon SNVs (a) | nsSNVs (b) | Population-specific nsSNVs (c)
LuCAMP | 1965 | 6588 | 3611 (55%) | 2977 (45%) | 1875 (28%) | 881 (47%)
EVS EA | 4300 | 12,889 | 4997 (39%) | 7892 (61%) | 4586 (36%) | 3086 (67%)
EVS AA | 2203 | 11,569 | 4694 (41%) | 6875 (59%) | 3815 (33%) | 2598 (68%)
(a) The exonic regions represent both UTR and CDS regions, and the percentage SNVs represent the fraction of exonic variants out of total SNVs.
(b) Percentage nsSNVs out of total SNVs.
(c) Percentage population specific out of total number of nsSNVs.
Methods). The distribution of SNV, nsSNVs and predicted damaging
nsSNVs is similar with respect to MAF values for all three populations
with the majority having MAF values <0.001 (Figure 3B). The population occurrence of all nsSNVs including singletons and duplets and
nsSNVs represented by ≥3 and ≥5 alleles is illustrated in Figure 3C.
For the LuCAMP and the EVS_EA data, the proportion of population-specific
nsSNVs decreases from 47 to 5% and from 67 to 12%, respectively, when the required allele
occurrence is raised from ≥1 to ≥5 alleles (Figure 3C). The drastic decrease in
nsSNVs private to the LuCAMP population levels off at an occurrence of ≥3 alleles, whereas the portion of nsSNVs found in all three populations remains almost constant and represents common nsSNVs
(Figure 3D). The same distribution between population-specific and
shared nsSNVs is found for the two other populations (data not
shown).
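The Venn-type partition into population-specific and shared nsSNVs can be sketched with simple set logic. This is a minimal illustration, assuming that a variant counts as present in a cohort once it reaches the allele cutoff in that cohort; the function name, the variant identifiers and the allele counts are hypothetical.

```python
from collections import Counter


def classify_sharing(allele_counts, min_alleles=1):
    """Count variants by the set of cohorts in which they are observed.

    allele_counts maps a variant ID to {cohort: minor-allele count}.
    A variant is present in a cohort when it reaches min_alleles there.
    """
    venn = Counter()
    for per_cohort in allele_counts.values():
        present = frozenset(c for c, n in per_cohort.items() if n >= min_alleles)
        if present:  # skip variants below the cutoff in every cohort
            venn[present] += 1
    return venn


# Hypothetical allele counts, for illustration only.
example = {
    "var1": {"LuCAMP": 1, "EVS_EA": 0, "EVS_AA": 0},
    "var2": {"LuCAMP": 7, "EVS_EA": 5, "EVS_AA": 1},
    "var3": {"LuCAMP": 3, "EVS_EA": 4, "EVS_AA": 0},
}
all_variants = classify_sharing(example, min_alleles=1)
recurrent = classify_sharing(example, min_alleles=3)
```

Raising `min_alleles` removes the private singleton (`var1`) first, which mirrors the observation that population-specific nsSNVs are dominated by very rare alleles.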
Phase 2: Prediction of nsSNVs that affect function of the encoded GTs
nsSNVs with allele occurrence ≥3 in the LuCAMP dataset were analyzed with the bioinformatic prediction tools PolyPhen2, SIFT and
Provean as well as by manual evaluation using multisequence alignment of paralogous and orthologous proteins as well as domain structure of GTs and structural information when available (Figure 2).
Fig. 3. The distribution of SNVs in the LuCAMP, the EVS_EA and the EVS_AA populations. (A) The distribution of the SNVs in the intron/intergenic and the exon
regions is shown for the three datasets. (B) The distribution of all SNVs, nsSNVs and nsSNVs predicted damaging by PolyPhen2 is shown for LuCAMP,
EVS_EA and EVS_AA. The MAF values are represented by four intervals: [0.5;0.1], [0.1;0.01], [0.01;0.001] and [0.001;0.0001]. (C) The Venn diagrams show the
distribution of shared nsSNVs, nsSNVs represented in two populations and nsSNVs represented in one population; (a) all nsSNVs in the three populations, (b)
nsSNVs represented by three or more alleles and (c) by five or more alleles. (D) The nsSNV distribution in the three populations vs. cumulative number of
alleles shown for shared nsSNVs, nsSNVs found in two populations and nsSNVs found only in one population. (E) Venn diagram for the phase 2-predicted
damaging nsSNVs. (F) Distribution of CAZy GT families for the phase 2-predicted damaging nsSNVs in the LuCAMP and EVS populations; nsSNVs predicted
damaging by all prediction tools including manual curation are marked dark gray.
Phase 2 resulted in a total of 52 nsSNVs represented by three or more
alleles. These nsSNVs were found in 36 different GT genes representing 17 of the 44 human GT families, and all were predicted to affect
the function of the encoded enzymes (Supplementary data, Table SII).
Comparison with the EVS data showed that six of the potentially
damaging nsSNVs were only found in the Danish population and included genes in GT2 (DPM1: p.Met1Leu), GT27 (GALNT12:
p.Ala188Val, GALNT14: p.Lys401Asn and p.Arg91Gln), GT31
(MFNG: p.Leu308Phe) and GT54 (MGAT4B: p.Arg373Trp). Another seven nsSNVs were found both in the LuCAMP and EVS_EA
datasets but not in EVS_AA and these were in families GT4, GT27,
GT31 and GT39; the remaining 44 nsSNVs were found in all three populations (Figure 3E) and represented 16 GT families (Supplementary
data, Table SII). Collectively, the large GT27 family contained most of
the possible damaging nsSNVs occurring three or more times (14/52;
see Supplementary data, Table SII). The MAF values of the LuCAMP
nsSNVs predicted to affect enzyme functions ranged from 0.0003 to
0.12 and included one nsSNV (GALNT20: p.Cys124Arg) that can
be categorized as a common polymorphism (MAF > 0.01).
Surveying the rarer nsSNVs in the Danish population occurring
one or two times (total number 1183), we found that 61 nsSNVs introduced gain or loss of a stop codon. These likely deleterious SNVs were
most frequently found in the GT27 (6 nsSNVs) and GT14 (8 nsSNVs)
families, and within family GT27 these included GALNT1
(p.Arg368*, one allele), GALNT4 (p.Glu87*, one allele), GALNT8
(p.Gln453*, two alleles, also in EVS_EA with three alleles and
EVS_AA with one allele), GALNT14 (p.Arg315*, one allele),
GALNT15 (p.Arg639*, two alleles) and GALNT18 (p.Gln556*,
one allele). Although these are likely to affect the function of the encoded enzymes, the low frequency of occurrence suggests that validation
of the sequencing is required. Furthermore, since they, except for the
SNV identified in GALNT8, are only found in the Danish population
and at such low frequency, they are less useful for validating potential disease candidates identified by GWAS results.
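Stop-codon changes of the kind listed above can be separated from ordinary missense changes directly from the protein-level notation. The following is a minimal sketch, assuming three-letter amino acid codes with `*` for a stop codon as used in this article; the function name and the category labels are hypothetical, and the stop-loss form `p.*123Gln` is a simplified notation assumed here for illustration.

```python
import re

# p.<ref><position><alt>, where ref/alt are three-letter codes or '*'
_PROTEIN_CHANGE = re.compile(r"^p\.([A-Z][a-z]{2}|\*)(\d+)([A-Z][a-z]{2}|\*)$")


def classify_change(change):
    """Classify a protein-level change such as 'p.Arg368*'."""
    match = _PROTEIN_CHANGE.match(change.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized protein change: {change!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if ref == alt:
        return "synonymous"
    if alt == "*":
        return "stop_gained"
    if ref == "*":
        return "stop_lost"
    if ref == "Met" and pos == 1:
        return "start_lost"
    return "missense"


for change in ("p.Arg368*", "p.Met1Leu", "p.Gly697Arg"):
    print(change, classify_change(change))
```

Such a classifier only sorts variants by type; as noted above, rare truncating variants still require validation of the sequencing itself.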
Based on the fact that the large GT27 family contained most of the
possible damaging nsSNVs and due to our in-depth knowledge of the
enzymes belonging to this family (Bennett et al. 2012), we chose this
family for experimental validation of putative damaging nsSNVs. In
total, the phase 2 selection processes identified 13 potentially inactivating GT27 nsSNVs in LuCAMP data and one in the EVS_EA data
(Table III). Both GalNAc-T5 mutations reside in the Gal/GalNAc domain of the catalytic unit (Fritz et al. 2004). The p.Asp678Ala mutation affects a residue in a junction between a β-sheet and an α-helix
conserved in 11 of the 20 GalNAc-Ts, and the p.Gly697Arg mutation
affects an essential and highly conserved residue in the active site involved in UDP-GalNAc and acceptor peptide interactions (Bennett
et al. 2012). The GalNAc-T7 p.Cys325Ser mutation affects a nonessential cysteine, and the p.Pro529Thr mutation affects a nonconserved residue in the linker region interspacing the
catalytic and lectin domains (Fritz et al. 2006). The GalNAc-T12 mutation p.Ala188Val affects a nonconserved residue in an α-helix within
the Rossmann fold of the enzyme and the p.Asp261Asn mutation
affects an aspartate residue conserved in the two GalNAc subfamilies
Ic (T3 and T6) and IIa (T4 and T12) (Bennett et al. 2012), and the
p.Asp303Asn mutation affects a nonconserved residue close to the
Gal/GalNAc domain. Notably, the p.Asp261Asn and p.Asp303Asn
have been previously identified and in the latter case correlated with
colon cancer (Guda et al. 2009). The GalNAc-T13 p.Asp378Gly mutation affects a nonconserved residue in proximity of the Gal/GalNAc
domain. A total of five nsSNVs in GalNAc-T14 were selected for experimental validation. Three mutations, p.Arg82Gln,
p.Arg86Trp and p.Arg91Gln, are closely grouped in a region preceding
the catalytic domain; the first two positions are nonconserved, whereas the last affected residue is highly conserved among all GT27 members, and a mutation in this conserved position in GalNAc-T3 has
been reported as pathogenic (Ichikawa et al. 2010). The p.Lys401Asn
mutation affects a nonconserved lysine residue positioned in the linker
region interspacing the catalytic and the lectin domains. The p.Asp519Asn mutation found only in the EVS_EA dataset was included
in the experimental analysis. This mutation affects an essential residue
in the lectin domain predicted to be involved in carbohydrate binding.
Table III. Summary of GT27 SNV predictions and experimental validation
Gene | Amino acid change | Experimental | Phase 2 (a) | PolyPhen (a) | Provean (a) | SIFT (a) | LuCAMP (b) | EVS-EA (b) | EVS-AA (b)
GALNT2 | p.Gln216His | Active | N | D | N | N | 0;7;1958 | 0;9;4291 | 0;2;2201
GALNT2 | p.Asp314Ala | Active | N | N | D | D | 0;3;1962 | 0;30;4270 | 0;4;2199
GALNT2 | p.Val554Met | Active | N | D | N | N | 7;170;1788 | 9;420;3871 | 1;105;2097
GALNT5 | p.Asp678Ala | Active | D | D | D | D | 0;33;1932 | 2;90;4208 | 0;10;2193
GALNT5 | p.Gly697Arg | Inactive | D | D | D | D | 0;7;1958 | 0;5;4295 | 0;1;2202
GALNT7 | p.Cys325Ser | Active | D | D | N | N | 0;17;1948 | 0;37;4263 | 0;4;2199
GALNT7 | p.Pro529Thr | Active | D | D | D | D | 0;5;1960 | 0;22;4278 | 0;4;2199
GALNT11 | p.Val376Ala | Active | D | D | D | D | 0;54;1911 | 1;68;4231 | 0;8;2195
GALNT11 | p.Glu409Gly | Active | N | N | D | N | 0;43;1922 | 0;64;4236 | 0;9;2194
GALNT11 | p.Val495Leu | Active | N | N | N | N | 0;4;1961 | 0;4;4296 | 0;1;2202
GALNT11 | p.Val575Ala | Active | N | D | N | N | 0;14;1951 | 0;9;4291 | 0;0;2203
GALNT12 | p.Ala188Val | Active | D | D | N | D | 0;8;1957 | 0;0;0 | 0;0;0
GALNT12 | p.Asp261Asn | Active | D | D | D | N | 0;40;1925 | 0;98;4202 | 0;7;2196
GALNT12 | p.Asp303Asn | Active | D | N | N | D | 0;3;1962 | 0;11;4289 | 0;1;2202
GALNT13 | p.Asp378Gly | Active | D | N | D | N | 0;12;1953 | 0;5;4295 | 0;0;2203
GALNT14 | p.Arg82Gln | Active | D | N | N | N | 0;5;1960 | 0;4;4296 | 0;3;2200
GALNT14 | p.Arg86Trp | Active | D | D | N | N | 0;6;1959 | 0;3;4297 | 0;0;2203
GALNT14 | p.Arg91Gln | Inactive | D | D | D | D | 0;5;1960 | 0;0;0 | 0;0;0
GALNT14 | p.Lys401Asn | Active | D | D | D | D | 0;3;1962 | 0;0;0 | 0;0;0
GALNT14 | p.Asp519Asn | Active | D | D | D | D | 0;0;0 | 0;1;4299 | 0;0;2203
GALNT20 (L5) | p.Cys124Arg | not analyzed | D | D | D | D | 21;449;1495 | 90;1060;3150 | 139;832;1232
GALNT20 (L5) | p.Gly206Ala | not analyzed | D | D | D | D | 1;3;1961 | 0;24;4276 | 4;181;2018
Percentage neutral predictions | | | 30% (6/20) | 30% (6/20) | 45% (9/20) | 50% (10/20) | | |
Percentage experimentally confirmed damaging | | | 14% (2/14) | 14% (2/14) | 18% (2/11) | 20% (2/10) | | |
Percentage experimentally confirmed neutral | | | 100% (6/6) | 100% (6/6) | 100% (8/8) | 100% (10/10) | | |
(a) Predictions: damaging, D (gray in the original table); neutral, N; bold values in the original table highlight amino acid changes experimentally shown to be inactive.
(b) Number of persons with the genotypes: minor allele/minor allele; minor allele/major allele; major allele/major allele.
To validate the approach, we analyzed two frequent nsSNVs
known to be associated with CDGs. One nsSNV in the PMM2
(phosphomannomutase-2) gene (p.Arg141His) has previously been
reported to be associated with CDG-Ia (Matthijs et al. 1997, 1998),
and this was identified in the LuCAMP dataset with a similar MAF
value as previously reported for the Danish population (0.0158 for
LuCAMP and 0.0167 by Kjaergaard et al. 2001). These findings supported the fidelity of the LuCAMP WES dataset. Moreover, a total of
34 previously published nsSNVs were also found in the LuCAMP
data, and these were in GTs involved in the O-Man pathway
(POMGNT1; GT13), the N-glycosylation pathway (ALG1; GT33,
ALG6; GT57 and ALG12; GT22), and formation of blood group-related antigens (ABO; GT6, FUT1 and FUT2; GT11, FUT3 and
FUT6; GT10, GCNT2; GT14 and B3GALNT1; GT31) (Supplementary data, Table SIII). These latter findings were all in agreement with
previously reported allele frequencies for the commonly found
nsSNVs in the respective glycogenes providing further support for
the fidelity of the LuCAMP WES data.
The LuCAMP WES presented genotype data for the healthy control group as well as the type 2 diabetes case group. Analyzing these
data for variation in control vs. case group alleles, we applied an allele-based χ² test for 52 phase 2 predicted damaging nsSNVs as well as for
all 1875 nsSNVs in the 208 GT genes and we were not able to find any
statistically significant difference in allele distribution for the two subpopulations (data not shown).
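An allele-based χ² comparison of this kind can be sketched as a 2 × 2 test on minor and major allele counts in cases and controls. This is a minimal sketch, assuming a Pearson χ² statistic with one degree of freedom and no continuity correction; the source does not specify the exact implementation, and the function name and the allele counts in the usage lines are hypothetical.

```python
import math


def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df) on a 2 x 2 table of allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 20 vs. 10 minor alleles among 100 alleles per group.
chi2, p = allelic_chi2(20, 80, 10, 90)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

For very rare alleles the expected counts become small, so an exact test may be a more suitable choice than the χ² approximation; that substitution is a suggestion and not something reported in the study.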
Phase 3: Experimental validation of nsSNV predictions
for family GT27
We chose CAZy family GT27 for experimental validation, in particular
because of the relatively high number of 14 predicted damaging nsSNVs
(Table III) out of a total of 52, and readily available recombinant expression and enzyme assays for a large number of members (Schjoldager
et al. 2011; 2012). Furthermore, many of the 20 GALNT genes in
GT27 family have been identified as candidate genes for diseases by
GWAS and other association studies (Bennett et al. 2012). The 14 predicted possible damaging nsSNVs found in six GalNAc-T isoforms
(GALNT5, GALNT7, GALNT11, GALNT12, GALNT13 and
GALNT14) were analyzed experimentally.
Six additional GALNT neutral nsSNVs were included in the analysis. Three were in GALNT2; two have previously been suggested to
be associated with triglyceride clearance and high HDL cholesterol
(p.Gln216His and p.Asp314Ala) (Holleboom et al. 2011; Tietjen
et al. 2012), and one was included as a neutral control (p.Val554Met).
The three remaining nsSNVs were in GALNT11, where two were in
conserved residues in the catalytic domain (p.Glu409Gly) and the lectin domain (p.Val495Leu) and one was included as a neutral control
(p.Val575Ala).
Wild-type and variant recombinant enzymes were expressed as secreted soluble enzymes and purified to near homogeneity, and their
function assessed by time-course enzyme assays monitored by
MALDI-TOF with appropriate peptide substrates (Supplementary
data, Figure S1). All the variants could be expressed and purified albeit
with different yields. The analysis demonstrated that one nsSNV
(p.Gly697Arg) in GALNT5 and one (p.Arg91Gln) in GALNT14 resulted in inactive enzymes, while all other tested variant enzymes
appeared to exhibit normal activity (Table III). The p.Gly697Arg
nsSNV in GALNT5 replaces a semiconserved small neutral amino
acid with a charged bulky residue in the catalytic domain. This
nsSNV was found in the heterozygous state in seven LuCAMP samples, yielding
an MAF of 0.0018, and the same mutation was found with an MAF
of 0.0006 for EVS_EA and 0.0002 for EVS_AA (Table III). The
p.Arg91Gln nsSNV in GALNT14 affects a highly conserved basic
residue in the catalytic domain, and a mutation of the same Arg residue (p.Arg162Gln) conserved in the GALNT3 paralog has been identified in a patient with familial tumoral calcinosis caused by deficiency
in the enzyme function (Ichikawa et al. 2010). The p.Arg91Gln mutation was found in the heterozygous state in five LuCAMP samples, yielding an
MAF of 0.0013, and interestingly this mutation was not found in
the EVS data.
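The reported MAF values follow directly from the carrier counts. The short Python sketch below illustrates the arithmetic under the assumption that all 2000 LuCAMP individuals were successfully genotyped at each site (4000 alleles) and that no homozygous carriers were present; the function name is ours and is not part of any GlyMAP software.

```python
def maf_from_carriers(het_carriers: int, n_individuals: int, hom_carriers: int = 0) -> float:
    """Minor allele frequency at a diploid autosomal site.

    Each heterozygous carrier contributes one minor allele and each
    homozygous carrier contributes two; every individual contributes
    two alleles to the denominator.
    """
    minor_alleles = het_carriers + 2 * hom_carriers
    total_alleles = 2 * n_individuals
    return minor_alleles / total_alleles


# Carrier counts taken from the text; 2000 genotyped individuals is an assumption.
maf_galnt5 = maf_from_carriers(het_carriers=7, n_individuals=2000)
maf_galnt14 = maf_from_carriers(het_carriers=5, n_individuals=2000)

print(f"GALNT5 p.Gly697Arg: {maf_galnt5:.5f}")
print(f"GALNT14 p.Arg91Gln: {maf_galnt14:.5f}")
```

Under these assumptions the values are 0.00175 and 0.00125, which agree with the reported 0.0018 and 0.0013 after rounding to four decimals.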
These results demonstrate that our GlyMAP strategy, based on
WES data of large populations, can identify deleterious SNVs that
inactivate GT function. We were surprised to find that the majority
(12 of 14) of the predicted deleterious nsSNVs in the GT27 family
did not affect enzyme function substantially, although the assays employed in this study cannot rule out that these nsSNVs result in minor
changes in the kinetic properties of these mutant enzymes. However,
such minor changes are unlikely to cause disease. Of the six predicted neutral GalNAc-T nsSNVs included in the experimental assays,
the three GalNAc-T2 and the three GalNAc-T11 nsSNVs were all
found to be active.
Stability of the human glycogenome
Based on GlyMAP, we wanted to determine whether the genetic variation observed within the CAZy GT families was comparable and
whether the genetic variation in the glycogenes of one protein class
was similar to the genetic variation found in other protein classes
such as hydrolases, kinases and histones. For this, the number of
nsSNVs per amino acid was calculated for all the genes in the GlyMAP
and presented as a box plot showing the degree of statistical dispersion
and mean values for each GT family as a measure of the genetic stability (Figure 4A). Not surprisingly, the GT6 family including the ABO
blood group genes was found to have the highest median value and
therefore the lowest genetic stability. The GT66 (oligosaccharyltransferase, OST, dolichyl-diphosphooligosaccharide-protein subunits,
STT3A and STT3B) and GT41 (OGT) families had the lowest median
values and therefore the highest genetic stability. The GT27 family,
found to exhibit the highest number of nsSNVs, ranks in the
midrange of median values by this measure of stability (Figure 4A).
Genetic stability was also calculated from the LuCAMP WES data
for G-protein-coupled receptors (465 genes), protein
kinases (467 genes), homeobox transcription factors (210 genes),
histones (63 genes) and other PTM genes including glycoside hydrolases (113 genes) (Figure 4B). The box plot showed that the genetic stability of the human GTs is similar to the stability found for the other
selected protein classes. This suggests that the GTs are as evolutionarily
stable as other important protein classes such as protein kinases,
receptors, transcription factors and histones.
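The stability measure used here, the number of nsSNVs per amino acid, and the summary statistics behind a box plot can be sketched in a few lines of Python. The gene counts and protein lengths below are invented for illustration only and do not come from GlyMAP; the quartile method (inclusive interpolation) is likewise an assumption, since the text does not state which one was used.

```python
from statistics import quantiles


def nssnv_frequency(n_nssnv: int, protein_length: int) -> float:
    """Number of nsSNVs per amino acid for one protein."""
    return n_nssnv / protein_length


def box_stats(values):
    """Quartiles, IQR and 1.5 x IQR fences, as drawn in a standard box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


# Hypothetical (nsSNV count, protein length) pairs for one GT family.
toy_family = [(4, 400), (6, 300), (9, 300), (2, 500)]
frequencies = [nssnv_frequency(n, length) for n, length in toy_family]
stats = box_stats(frequencies)

# Values outside the fences would be plotted as outliers.
outliers = [f for f in frequencies
            if f < stats["lower_fence"] or f > stats["upper_fence"]]
print(stats["median"], outliers)
```

A family with a higher median frequency would be read as genetically less stable in the sense used above.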
Discussion
Here we constructed a draft catalog, designated GlyMAP, of
nsSNVs in 208 human GTs from WES of a rather homogeneous population of 2000 Danes with the aim of identifying deleterious mutations,
especially in GT genes with a high degree of potential genetic backup,
that is, in large homologous GT gene families. Our hypothesis was
Fig. 4. The genome stability for the human GT CAZy families is shown as a function of number of nsSNVs per amino acid. (A) A box plot for the 44 human CAZy GT
families shows the highest median for the GT6 family (the ABO gene, the Forssman synthase GBGT1 and GLT6D1) and the lowest median values for the GT66 family
(the oligosaccharyltransferase complex genes STT3A and STT3B) and the GT41 (the cytoplasmic OGT gene). The three large GT families GT27, GT29 and GT31 are
denoted by asterisks and represent average mean values. (B) In order to analyze the genome stability of the human GT genes, the nsSNV frequency for the GT
genome is compared with the frequencies for five other protein classes representing the G-protein-coupled receptors, including the olfactory receptors, the
protein kinases, the homeobox transcription factors, the histones and the PTM-involved proteins encompassing the oligosaccharyltransferases (OST), the
conserved oligomeric Golgi complex (COG), the glycoside hydrolases (GH), the carbohydrate esterases (CE) and the carbohydrate-binding modules (CBM).
The non-PTM protein classes were selected using the Panther Classification System (Panther version 8.1) and the ER/Golgi-located PTM proteins were manually
selected using the CAZy classification. The nsSNV frequency was calculated using the LuCAMP dataset and the interquartile range (IQR) spans the first to third
quartiles (25–75%) and the ±1.5 IQR intervals are marked by dotted lines. Outliers are denoted by black boxes and the number of genes in each GT family is shown in
parentheses.
that deleterious mutations in such genes would produce rather subtle
phenotypes only in the homozygous or compound heterozygous state,
and hence deleterious alleles could exist at relatively high frequency
in the general population. The inspiration for this strategy was
that most of the known CDGs are caused by deleterious mutations in
GT genes without predicted genetic and functional backup, while association studies (GWAS) have pointed to more common disease-causing roles of many of the genes in large homologous gene families
with a seemingly large degree of potential functional redundancy
(Bennett et al. 2012). The study first demonstrated that prediction of
functional consequences of mutations, despite considerable insight
into the structure and mechanisms of the GTs, was quite poor. This
calls for experimental validation of all mutations at this time.
The second important conclusion was that the strategy is viable and
that two rather frequent null alleles of GALNT5 and GALNT14
were identified and experimentally validated. In contrast, and perhaps
as expected, we found few of the deleterious mutations described for the rare CDGs known to date. GlyMAP
provides a first global view of the genetic stability of the glycogenome
and should serve as a tool for discovery of novel CDGs.
The LuCAMP dataset represents a unique source of DNA variants
identified by NGS of the exomes from 2000 individuals of the general
Danish population. Almost half of the typed SNVs in the GT genes
were located in exons and approximately one-quarter of the typed
SNVs lead to amino acid substitutions or nonsense mutations. The
majority (69% or in total 1,291) of the nsSNVs were represented by
one or two alleles and the remaining 584 nsSNVs, constituting less
than one-third, were represented by ≥3 alleles corresponding to an
MAF value of ≥0.0008. This high proportion of private low-frequency
alleles is in line with what has been reported from other population
exome studies (Li et al. 2010; Fu et al. 2013).
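The proportions quoted above can be reproduced from the two counts given in the text. The following Python sketch is only an arithmetic check; it assumes that every site was genotyped in all 2000 individuals, so that three alleles correspond to 3 of 4000 chromosomes.

```python
# Counts reported in the text for nsSNVs in the GT genes (LuCAMP).
rare_nssnvs = 1291       # represented by one or two alleles
recurrent_nssnvs = 584   # represented by three or more alleles
n_individuals = 2000     # assumed fully genotyped, i.e. 4000 alleles per site

total_nssnvs = rare_nssnvs + recurrent_nssnvs
rare_fraction = rare_nssnvs / total_nssnvs
recurrent_fraction = recurrent_nssnvs / total_nssnvs

# Smallest MAF that an nsSNV seen on three alleles can have.
min_recurrent_maf = 3 / (2 * n_individuals)

print(f"total nsSNVs: {total_nssnvs}")
print(f"one or two alleles: {rare_fraction:.1%}")
print(f"three or more alleles: {recurrent_fraction:.1%}")
print(f"MAF threshold: {min_recurrent_maf:.5f}")
```

The total is 1875 nsSNVs; 1291 of these is about 69%, 584 is just under one-third, and 3/4000 = 0.00075 rounds to the stated threshold of 0.0008.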
Using the GlyMAP approach, a total of 134 potentially damaging
nsSNVs were identified, of which 52 were found in the LuCAMP dataset and 82 only in the EVS dataset (Figure 3E; Supplementary data,
Table SIII). A large proportion of the nsSNVs were found within the
large homologous gene families such as GT14, GT27 and GT31, accounting for 53 of 134 (Figure 3F). There was 25% concordance (33
of 134) between the three algorithm-based prediction tools PolyPhen2, SIFT and Provean and the manual prediction of impaired protein function used in the phase 2 selection of nsSNVs
(Supplementary data, Table SII), which strongly highlights the limitations of the current tools in predicting the impact of amino acid substitutions
on structure and function. In the validation study of GT27 members,
we only identified mutations potentially affecting the catalytic and lectin domains, and we therefore used recombinant expression and enzyme
activity assays to probe functionality. For genes where nonconserved
mutations are found in the transmembrane and immediate juxtamembrane region, it may be necessary to also address membrane retention
of the encoded enzymes in cells.
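One way to express the concordance figure is as the fraction of variants for which all four predictions (PolyPhen2, SIFT, Provean and the manual assessment) agree that the substitution is damaging. The Python sketch below is a minimal illustration of that calculation; the variant labels and calls are invented, and treating "concordance" as unanimous damaging calls is our reading of the text rather than a documented definition.

```python
from typing import Dict, Mapping

METHODS = ("PolyPhen2", "SIFT", "Provean", "manual")


def all_agree_damaging(calls: Mapping[str, bool]) -> bool:
    """True when every method calls the variant damaging."""
    return all(calls[method] for method in METHODS)


def concordance_rate(variants: Mapping[str, Mapping[str, bool]]) -> float:
    """Fraction of variants with unanimous damaging calls."""
    n_concordant = sum(all_agree_damaging(calls) for calls in variants.values())
    return n_concordant / len(variants)


# Hypothetical calls for four made-up variants (True = predicted damaging).
toy_calls: Dict[str, Dict[str, bool]] = {
    "variant_1": {"PolyPhen2": True, "SIFT": True, "Provean": True, "manual": True},
    "variant_2": {"PolyPhen2": True, "SIFT": False, "Provean": True, "manual": True},
    "variant_3": {"PolyPhen2": False, "SIFT": False, "Provean": False, "manual": True},
    "variant_4": {"PolyPhen2": False, "SIFT": True, "Provean": False, "manual": True},
}
toy_rate = concordance_rate(toy_calls)

# The figure reported in the text: 33 concordant of 134 variants.
reported_rate = 33 / 134
print(f"toy: {toy_rate:.0%}, reported: {reported_rate:.1%}")
```

With the reported counts, 33 of 134 is 24.6%, which rounds to the 25% stated above.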
Focusing on the GT27 GALNT gene family, a total of 20 nsSNVs
were selected for experimental tests (Table III). Fourteen of these were
predicted damaging by our manual predictions based on the extensive
structural knowledge of the GalNAc-Ts with their well-defined catalytic
and lectin domains (Fritz et al. 2004, 2006; Kubota et al. 2006); the remaining six nsSNVs were included as controls or because they have been reported
to have functional consequences. Of the 14 predicted damaging nsSNVs,
7 were damaging according to PolyPhen2, SIFT and Provean, but only 2
(GalNAc-T5 p.Gly697Arg and GalNAc-T14 p.Arg91Gln) were confirmed to substantially affect enzyme function by functional validation (Table III). This result was unexpected, and the poor ability to
predict deleterious effects of nsSNVs will need to improve before the full power of our GlyMAP approach can be harnessed.
Our knowledge of the functions of GALNT5 and GALNT14, harboring the two inactivating nsSNVs, is currently very limited and
knockout animals display no overt phenotype (Ten Hagen et al.
2003). GalNAc-T5 has only been characterized in rodents with a
few peptide substrates (Ten Hagen et al. 1998), and recently we
have assessed the human GalNAc-T5 with a large panel of peptide
substrates, and showed that this isoform has very few peptide
substrates compared with other isoforms such as GalNAc-T1 and
-T2 (Kong et al. 2014). GalNAc-T14 is predominantly expressed in
the kidney and is a close paralog of GalNAc-T2 with more restricted
peptide substrate specificities (unpublished data). GalNAc-T14 has
been implicated in Apo2L/TRAIL death-receptor-mediated apoptosis
(Wagner et al. 2007) and may play a role in drug sensitivity to TRAIL
therapy (Stern et al. 2010). Interestingly, the GalNAc-T14
p.Arg91Gln nsSNV was found only in the Danish population and
not in the EVS data. The three GALNT12 nsSNVs predicted to be damaging (p.Asp261Asn,
p.Asp303Asn and p.Ala188Val) were shown to
be active in our experimental assay. Guda et al. (2009) have previously
reported the GALNT12 p.Asp303Asn mutation to be associated with
colon cancer, and demonstrated that the mutant protein had 37% of
the wild-type enzyme activity and that the p.Asp261Asn mutant had
84% of wild-type enzyme activity. In our experimental analysis,
none of the three GALNT12 nsSNVs, including the control p.Ala188Val, affected the enzyme function. The remaining predicted-damaging
nsSNVs were all found to be active experimentally (Table III).
The six nsSNVs previously reported in the literature (GALNT2:
p.Gln216His and p.Asp314Ala) or selected as control nsSNVs
(GALNT2: p.Val554Met and GALNT11: p.Glu409Gly, p.Val495Leu and p.Val575Ala), whose predicted damaging effect was questionable, were
all found to be active experimentally. The two GALNT2 nsSNVs were selected based on recent genome-wide association studies linking the locus
with dysfunctional lipid metabolism (Kathiresan et al. 2008, 2009;
Teslovich et al. 2010) and on recent studies claiming that the nsSNVs affect
the function of GalNAc-T2 (Holleboom et al. 2011; Tietjen et al. 2012).
We did confirm that both (p.Gln216His and p.Asp314Ala) were present at low allele frequencies in the Danish population (Table III);
however, we were unable to demonstrate reduced enzyme activity
using an IgA hinge acceptor substrate. Holleboom et al. (2011) reported that heterozygous carriers of GALNT2 p.Asp314Ala have reduced glycosylation of ApoC III. ApoC III has one very effective and
specific GalNAc-T2 O-glycosylation site (Schjoldager et al. 2012;
Schjoldager and Clausen 2012), and it is therefore unlikely that a
slightly reduced GalNAc-T2 activity in the heterozygous state can affect
O-glycosylation. We have additionally demonstrated that a complete
knockout of the GALNT2 gene in a liver cell line is required to affect
glycosylation (Schjoldager et al. 2012). The complete lack of damaging
nsSNVs in the coding region of GALNT2 and the association of a region in intron 1 with HDL and triglyceride metabolism (Kathiresan
et al. 2008, 2009; Teslovich et al. 2010) suggest a dysregulatory rather
than dysfunctional cause of the HDL/lipid metabolism phenotypes.
CDG mutations are predominantly rare, and we therefore expected
that only a few pathogenic glycogene mutations would be found in
the 2000 LuCAMP exomes. Analyzing the LuCAMP data for pathogenic CDG mutations reported in the literature identified nsSNVs in
four GT genes and PMM2 (Supplementary data, Table SIII). The frequent PMM2 Northern European founder mutation, p.Arg141His,
was found with an MAF value of 0.01578, corresponding to the published carrier prevalence of 1:60 in Denmark (Schollen et al. 2000).
The mutation is presumed to be lethal in the homozygous state, which has never been observed; it occurs
mainly in compound heterozygosity with the p.Phe119Leu mutation
(Kjaergaard et al. 1998; Schollen et al. 2000), which was found
in LuCAMP with an MAF value of 0.00102. The POMGNT1
p.Asp556Asn mutation, reported to cause a mild limb-girdle muscular
dystrophy phenotype (Clement et al. 2008), was found in 56 carriers
corresponding to an MAF value of 0.01374 in LuCAMP and with
similar MAF values in the EVS dataset (data not shown). A
POMGNT1 splice site mutation in intron 17 (c.1539+1G>A), reported
as a Finnish founder mutation found in 18 of 19 muscular dystrophy
patients (Diesen et al. 2004), was represented by nine carriers in LuCAMP and was found with lower MAF values in the EVS data (data
not shown). Five nsSNVs affecting N-glycosylation were found in the
genes ALG1, ALG6 and ALG12. Three of these have
been reported as pathogenic and two as polymorphisms (Supplementary data, Table SIII). In addition, several common nsSNVs were
found in the blood group genes ABO, FUT1/2/3/6, B3GALNT1 and
GCNT1 in the LuCAMP data (Supplementary data, Table SIII), but a
more detailed discussion of their allele frequencies in the Danish population is beyond the scope of this study, and we refer to recent reviews
covering this area (Storry and Olsson 2004, 2009). In general, only a
few published mutations were detected in the LuCAMP or EVS data.
Since CDGs are rare syndromes with a published estimated prevalence
of 1:20,000 for the most common subtype, PMM2-CDG, and since the
total number of identified CDG cases caused by pathogenic glycogene
mutations is very limited, these results were expected.
The completed cataloging of glycogene nsSNVs prompted us to
question whether the number of variations identified in large redundant
GT families resembled the number of variations observed in nonredundant GT families (Figure 4A). As expected, the GT6 family containing
the highly polymorphic ABO gene with its 161 reported allelic variants
(Patnaik and Blumenfeld 2011) had the highest nsSNV frequency. At the
other extreme, the GT66 and GT41 families containing the oligosaccharyltransferase complex subunits STT3A/B and OGT, respectively, possessed
the lowest nsSNV frequencies. Somewhat surprisingly, the GT1 family
with its genetically polymorphic UGT genes (Stingl et al. 2014) and
the GT10/11 families containing the polymorphic FUT1/2 genes
(Patnaik and Blumenfeld 2011) possessed intermediate nsSNV frequencies in a range with the large homologous GT families GT27
(GalNAc-Ts), GT29 (Sialyl-Ts) and GT31 (B3Gal-Ts). These observations suggest that nsSNVs have not accumulated to a greater extent in
large homologous GT families compared with nonhomologous GT families, and at a more global level, it does not seem that nsSNV frequencies
in the CAZy GT class as a whole differ from the nsSNV frequencies
observed in other protein classes (Figure 4B).
In conclusion, GlyMAP provides the first global view of the nsSNV
landscape for the human GTs mapped in the homogenous population
of 2000 Danes. The WES-based strategy proved effective in identifying a number of glycogene nsSNVs in a defined population. On the
other hand, we demonstrate that truly damaging nsSNVs are difficult
to predict using the currently available tools and cannot be determined
without extensive experimental validation. Clearly, a deeper insight
into the structures and catalytic mechanisms of the GTs is needed before the functional consequences of amino acid substitutions can
be predicted reliably. Our approach demonstrated a substantial need for better
functional knowledge of the GTs and more refined prediction tools.
Thus, assessment of the impact of each individual nsSNV on enzyme
function for GTs belonging to families other than GT27 will require
greater insight into and knowledge of the enzymes in question, and a comprehensive functional analysis will require considerable effort.
Taken together, GlyMAP may serve as a useful tool for disease
discovery in specific disease cohorts where yet unknown glycogene
dysfunction can be associated with common disease traits. A GlyMAP
database has been established hosting information for the human
glycogenes (GlyMAP: http://glymap.glycomics.ku.dk).
Materials and Methods
The human GT-encoded glycogenes
A total of 198 different human GT protein annotations were extracted
from the CAZy classification system database (CAZY.org) and
an additional 10 GTs were included based on sequence homologies or reported putative GT activities (Table I; Supplementary data, Table SI).
The study populations and exome sequencing
The GT exome data were extracted from the LuCAMP exome sequencing project (Lundbeck Foundation Centre for Applied Medical Genomics in Personalised Disease Prediction, Prevention and Care,
www.lucamp.org), and from the NHLBI Exome Sequencing Project
(evs.gs.washington.edu/EVS/). The LuCAMP project included 2000
Danish individuals, of which half had type 2 diabetes, moderate
adiposity (body mass index, BMI > 27.5 kg/m2) and hypertension
(systolic/diastolic blood pressure, BP > 140/90 mmHg or use of antihypertensive medication). The others were healthy individuals who all
had fasting plasma glucose <5.6 mmol/L, 2-h OGTT-based plasma
glucose <7.8 mmol/L, BMI < 27.5 kg/m2 and BP < 140/90 mmHg
and no antihypertensive treatment. The public NHLBI Exome Sequencing Project (http://evs.gs.washington.edu/EVS/) included data from
6500 unrelated individuals phenotyped in 15 different projects with
the goal of discovering novel genes and mechanisms contributing to
heart, lung and blood disorders (NCBI Bioproject ID:165957 and
dbGaP (https://esp.gs.washington.edu/drupal/dbGaP_Releases)).
The LuCAMP exomes were captured using the Agilent SureSelect
All Exon Kit v.2 (46 Mb target region), and sequenced using an Illumina HiSeq 2000 machine with a mean sequencing depth of 56.3×
and exome coverage ranging from 94.11 to 98.76% (average of
97.27% per sample). The WES data were aligned to GRCh37/hg19
human reference genome and annotation of sequence variants was
performed using the SeattleSeq Annotation 137 server (Lohmueller
et al. 2013). The NHLBI Exome Sequencing Project samples were
exome captured using Roche/NimbleGen capture or Agilent reagents;
all SNP data were called simultaneously using the UMAKE pipeline,
for details see EVS homepage (http://evs.gs.washington.edu/EVS/).
The study was approved by The Danish National Ethical Committee
on Health Research and is in accordance with the ethical scientific
principles of the Helsinki Declaration II.
The GlyMAP prediction strategy
The GlyMAP strategy was divided into three phases (Figure 2).
Phase 1: Assembly of exome information
The SNVs for the 208 human GTs (Table I) were extracted from the
LuCAMP WES dataset and the EVS server and merged into a database, dbGlyMAP. Because the LuCAMP data included only SNVs,
all INDEL data were excluded from the EVS dataset. In addition to
chromosomal location and nucleotide variation, dbGlyMAP includes MAF values, genotype frequencies and rs annotations if
known. For genes represented by more than one coding transcript,
the longest transcript with respect to the coding region was selected
for the SNV data.
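The phase 1 assembly steps (choosing the longest coding transcript per gene, discarding INDELs and merging the two sources) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the dbGlyMAP implementation: the record layout, transcript identifiers, coordinates and coding lengths are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple

SnvKey = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


@dataclass(frozen=True)
class Transcript:
    gene: str
    transcript_id: str  # hypothetical identifier
    cds_length: int     # coding-region length in nucleotides


def longest_coding_transcript(transcripts: Iterable[Transcript]) -> Dict[str, Transcript]:
    """Keep, for each gene, the transcript with the longest coding region."""
    best: Dict[str, Transcript] = {}
    for t in transcripts:
        current = best.get(t.gene)
        if current is None or t.cds_length > current.cds_length:
            best[t.gene] = t
    return best


def is_snv(ref: str, alt: str) -> bool:
    """True for single-base substitutions; insertions/deletions are rejected."""
    bases = {"A", "C", "G", "T"}
    return ref in bases and alt in bases and ref != alt


def merge_snvs(lucamp: Iterable[SnvKey], evs: Iterable[SnvKey]) -> Dict[SnvKey, Set[str]]:
    """Union of SNVs from both sources, recording where each was seen."""
    merged: Dict[SnvKey, Set[str]] = {}
    for source, records in (("LuCAMP", lucamp), ("EVS", evs)):
        for key in records:
            _, _, ref, alt = key
            if is_snv(ref, alt):
                merged.setdefault(key, set()).add(source)
    return merged


# Invented example data.
chosen = longest_coding_transcript([
    Transcript("GENE_A", "tx_short", 1500),
    Transcript("GENE_A", "tx_long", 1716),
])
merged = merge_snvs(
    lucamp=[("1", 1000, "G", "A")],
    evs=[("1", 1000, "G", "A"), ("1", 1010, "C", "T"), ("1", 1020, "CT", "C")],
)
print(chosen["GENE_A"].transcript_id, sorted(merged))
```

Keying the merge on chromosome, position and alleles keeps a variant seen in both cohorts as a single entry while still recording its source, which is what later allows "found only in EVS" to be distinguished from "found in LuCAMP".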
Phase 2: Prediction of functionally inactive nsSNVs
The nsSNV data for the LuCAMP and the EVS datasets were analyzed
for inactivating nsSNVs by web-based algorithms and manually curated for possibly damaging variants. Three prediction tools were
used: PolyPhen2 (Adzhubei et al. 2010), SIFT (Kumar et al. 2009)
and Provean (Choi et al. 2012). Manual prediction was based on (i)
occurrence of three or more alleles in the LuCAMP dataset, (ii) alignment of paralogous proteins based on CAZy GT family classification,
(iii) alignment of orthologous proteins, (iv) protein domain information and/or crystallographic data if available and web-based prediction
of possible impact of an amino acid substitution on the structure and
function. Additional inclusion criteria were gain or loss of stop codons
and nsSNVs affecting start methionines. Multisequence alignments
were done using MAFFT or Clustal Omega, protein domain structures
were adapted from UniProt, CDD and SMART, and crystallographic
information was retrieved from PDB (Figure 2).
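The countable parts of the phase 2 criteria (three or more alleles in LuCAMP, stop gain or loss, and loss of the start methionine) lend themselves to a simple pre-filter ahead of the alignment-based manual review. The Python sketch below shows one possible form of such a filter. How the criteria were actually combined is not specified in the text, so the logical OR used here is an assumption, as are the field names and consequence labels; the third variant is invented.

```python
from dataclasses import dataclass
from typing import List

# Consequence labels are hypothetical; these classes are always retained.
ALWAYS_INCLUDE = {"stop_gained", "stop_lost", "start_lost"}


@dataclass(frozen=True)
class Variant:
    gene: str
    protein_change: str
    lucamp_allele_count: int
    consequence: str  # e.g. "missense", "stop_gained", "stop_lost", "start_lost"


def passes_prefilter(v: Variant, min_alleles: int = 3) -> bool:
    """Retain recurrent variants and any variant hitting a start or stop codon."""
    return v.consequence in ALWAYS_INCLUDE or v.lucamp_allele_count >= min_alleles


def prefilter(variants: List[Variant]) -> List[Variant]:
    return [v for v in variants if passes_prefilter(v)]


candidates = prefilter([
    # Allele counts for the first two variants follow the carrier counts in the text.
    Variant("GALNT5", "p.Gly697Arg", 7, "missense"),
    Variant("GALNT14", "p.Arg91Gln", 5, "missense"),
    # Invented examples.
    Variant("GENE_X", "p.Ala10Val", 1, "missense"),
    Variant("GENE_Y", "p.Trp50Ter", 1, "stop_gained"),
])
print([f"{v.gene} {v.protein_change}" for v in candidates])
```

Variants passing such a filter would still require the paralog and ortholog alignments and the structural review described above before being called potentially damaging.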
Phase 3: Experimental validation of predictions
nsSNVs in the GalNAc-T genes (GT27) selected in phase 2 were chosen for experimental validation, and expression constructs encoding
wild-type and nsSNV-harboring proteins were generated (Table III).
Recombinant expression of secreted wild-type and mutant nsSNV
constructs was done essentially as described previously (Bennett
et al. 1996). Products were analyzed by matrix-assisted laser desorption ionization mass spectrometry imaging and compared with
the glycosylation capacity of wild-type enzyme. In brief, GT-encoding
sequences lacking the membrane anchoring domain were fused
N-terminally with an in-frame 6× His tag allowing for downstream
NiNTA enzyme purification. Constructs were inserted into the pAcGP67
baculovirus insect cell expression vector system and expressed in Hi5
insect cells. nsSNVs were either introduced into existing expression
constructs using QuikChange Site-Directed Mutagenesis (Agilent Technologies, Waldbronn, Germany) or synthesized (GeneWiz,
London, UK). Wild-type and mutant enzymes were purified by NiNTA
purification schemes essentially as described previously (Pedersen
et al. 2011). All enzymes were purified to homogeneity (Supplementary
data, Figure S1) and tested in in vitro product development glycosylation assays using appropriate acceptor substrates for the respective
GTs (Table III, Figure 1).
Statistics and comparative analysis
For the box plots, the nsSNV frequency of each protein was calculated
as the number of nsSNVs divided by the protein length. The nsSNV
data were extracted from the LuCAMP dataset and the protein classes
were obtained from the Panther Classification System or from CAZy.
The protein groups were G-protein-coupled receptors (465 proteins,
Panther class PC00021), protein kinases (467 proteins, Panther
PC00193), homeobox transcription factors (210 proteins, Panther
PC00119), histones (63 proteins, Panther PC00118), GTs (201 proteins) and one group consisting of glycoside hydrolases, carbohydrate esterases, carbohydrate-binding modules, OST and conserved
oligomeric Golgi complex, in total 113 proteins.
Web sources
CAZy: www.CAZy.org
GlyMAP: http://glymap.glycomics.ku.dk
CDD, Conserved Domain Database: http://www.ncbi.nlm.nih.gov/cdd/
Clustal Omega: http://www.ebi.ac.uk/Tools/msa/clustalo/
HGMD, The Human Gene Mutation Database: http://www.hgmd.cf.ac.uk/
NHLBI GO Exome Sequencing Project (ESP), Seattle, WA: http://evs.gs.washington.edu/EVS/, Sep 2013
MAFFT: http://www.ebi.ac.uk/Tools/msa/mafft/
Panther version 8.1: http://www.pantherdb.org/
PDB, Protein Data Bank: http://www.rcsb.org/pdb/home/home.do
PolyPhen2 (Polymorphism Phenotyping v2): http://genetics.bwh.harvard.edu/pph2/
Provean (Protein Variation Effect Analyzer): http://provean.jcvi.org/index.php
SIFT: http://sift.jcvi.org/
SMART: http://smart.embl-heidelberg.de/
UniProt: http://www.uniprot.org/
Supplementary Material
Supplementary Material is available at http://glycob.oxfordjournals.org/ online.
Funding
This work was supported by Kirsten og Freddy Johansen Fonden, A.P. Møller
og Hustru Chastine Mc-Kinney Møllers Fond til Almene Formaal, The Carlsberg Foundation, The Novo Nordisk Foundation, The Danish Research Councils, a program of excellence from the University of Copenhagen, The Danish
National Research Foundation (DNRF107), The Rocket Fund and
R01DK99551. The Danish whole-exome study was supported by the Lundbeck
Foundation (The Lundbeck Foundation Centre for Applied Medical Genomics
in Personalised Disease Prediction, Prevention and Care [LuCamp], www.lucamp.org)
and The Danish Council for Independent Research. The Novo Nordisk Foundation Center for Basic Metabolic Research is an independent research center
at the University of Copenhagen partially funded by an unrestricted donation
from the Novo Nordisk Foundation (www.metabol.ku.dk).
Abbreviations
BMI, body mass index; BP, blood pressure; CAZy, carbohydrate-active enzymes; CDG, congenital disorders of glycosylation; EVS, Exome Variant Server;
GalNAc-Ts, GalNAc-transferases; GPI, glycosylphosphatidylinositol; GT, glycosyltransferase; GWAS, genome-wide association study; MAF, minor allele frequency; NGS, next-generation sequencing; nsSNV, nonsynonymous SNV; OST,
oligosaccharyltransferases; PTM, posttranslational modification; SNV, single-nucleotide variation; WES, whole-exome sequencing
References
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P,
Kondrashov AS, Sunyaev SR. 2010. A method and server for predicting
damaging missense mutations. Nat Methods. 7:248–249.
Albrechtsen A, Grarup N, Li Y, Sparsø T, Tian G, Cao H, Jiang T, Kim SY,
Korneliussen T, Li Q, et al. 2013. Exome sequencing-driven discovery of
coding polymorphisms associated with common metabolic phenotypes.
Diabetologia. 56:298–310.
Almeida R, Amado M, David L, Levery SB, Holmes EH, Merkx G, van
Kessel AG, Rygaard E, Hassan H, Bennett E, et al. 1997. A family of
human beta4-galactosyltransferases. Cloning and expression of two novel
UDP-galactose:beta-n-acetylglucosamine beta1, 4-galactosyltransferases,
beta4Gal-T2 and beta4Gal-T3. J Biol Chem. 272:31979–31991.
Amado M, Almeida R, Carneiro F, Levery SB, Holmes EH, Nomoto M,
Hollingsworth MA, Hassan H, Schwientek T, Nielsen PA, et al. 1998. A
family of human beta3-galactosyltransferases. Characterization of four
members of a UDP-galactose:beta-N-acetyl-glucosamine/beta-n-acetylgalactosamine beta-1,3-galactosyltransferase family. J Biol Chem. 273:
12770–12778.
Amado M, Almeida R, Schwientek T, Clausen H. 1999. Identification and characterization of large galactosyltransferase gene families: galactosyltransferases for all functions. Biochim Biophys Acta. 1473:35–53.
Audry M, Jeanneau C, Imberty A, Harduin-Lepers A, Delannoy P, Breton C.
2011. Current trends in the structure-activity relationships of sialyltransferases. Glycobiology. 21:716–726.
Becker DJ, Lowe JB. 2003. Fucose: biosynthesis and biological function in mammals. Glycobiology. 13:41–53.
Bennett EP, Hassan H, Clausen H. 1996. cDNA cloning and expression
of a novel human UDP-N-acetyl-alpha-D-galactosamine. Polypeptide
N-acetylgalactosaminyltransferase, GalNAc-t3. J Biol Chem. 271:
17006–17012.
Bennett EP, Mandel U, Clausen H, Gerken TA, Fritz TA, Tabak LA. 2012. Control of mucin-type O-glycosylation: a classification of the polypeptide
GalNAc-transferase gene family. Glycobiology. 22:736–756.
Bourne Y, Henrissat B. 2001. Glycoside hydrolases and glycosyltransferases:
families and functional modules. Curr Opin Struct Biol. 11:593–600.
Brockington M, Blake DJ, Prandini P, Brown SC, Torelli S, Benson MA,
Ponting CP, Estournet B, Romero NB, Mercuri E, et al. 2001. Mutations
in the fukutin-related protein gene (FKRP) cause a form of congenital
muscular dystrophy with secondary laminin alpha-2 deficiency and
abnormal glycosylation of alpha-dystroglycan. Am J Hum Genet. 69:
1198–1209.
Buettner FF, Ashikov A, Tiemann B, Lehle L, Bakker H. 2013. C. elegans
DPY-19 is a C-mannosyltransferase glycosylating thrombospondin repeats.
Mol Cell. 50:295–302.
Carson AR, Cheung J, Scherer SW. 2006. Duplication and relocation of
the functional DPY19L2 gene within low copy repeats. BMC Genomics.
7:45.
Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. 2012. Predicting the Functional Effect of Amino Acid Substitutions and Indels. PLoS ONE. 7:
e46688.
Clement EM, Godfrey C, Tan J, Brockington M, Torelli S, Feng L, Brown SC,
Jimenez-Mallebrera C, Sewry CA, Longman C, et al. 2008. Mild
POMGnT1 mutations underlie a novel limb-girdle muscular dystrophy
variant. Arch Neurol. 65:137–141.
Diesen C, Saarinen A, Pihko H, Rosenlew C, Cormand B, Dobyns WB,
Dieguez J, Valanne L, Joensuu T, Lehesjoki AE. 2004. POMGnT1 mutation
and phenotypic spectrum in muscle-eye-brain disease. J Med Genet. 41:
e115.
Freeze HH. 2013. Understanding human glycosylation disorders: biochemistry
leads the charge. J Biol Chem. 288:6936–6945.
Freeze HH, Chong JX, Bamshad MJ, Ng BG. 2014. Solving Glycosylation Disorders: Fundamental Approaches Reveal Complicated Pathways. Am J
Hum Genet. 94:161–175.
Fritz TA, Hurley JH, Trinh LB, Shiloach J, Tabak LA. 2004. The beginnings of
mucin biosynthesis: The crystal structure of UDP-GalNAc:polypeptide
{alpha}-N-acetylgalactosaminyltransferase-T1. Proc Natl Acad Sci USA.
101:15307–15312.
Fritz TA, Raman J, Tabak LA. 2006. Dynamic association between the catalytic
and lectin domains of human UDP-GalNAc:polypeptide alpha-N-acetylgalactosaminyltransferase-2. J Biol Chem. 281:8613–8619.
Fu W, O’Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S,
Rieder MJ, Altshuler D, Shendure J, et al. 2013. Analysis of 6,515 exomes
reveals the recent origin of most human protein-coding variants. Nature.
493:216–220.
Godfrey C, Clement E, Mein R, Brockington M, Smith J, Talim B, Straub V,
Robb S, Quinlivan R, Feng L, et al. 2007. Refining genotype phenotype correlations in muscular dystrophies with defective glycosylation of
dystroglycan. Brain. 130:2725–2735.
Godfrey C, Foley AR, Clement E, Muntoni F. 2011. Dystroglycanopathies:
coming into focus. Curr Opin Genet Dev. 21:278–285.
Guda K, Moinova H, He J, Jamison O, Ravi L, Natale L, Lutterbaugh J,
Lawrence E, Lewis S, Willson JK, et al. 2009. Inactivating germ-line and
somatic mutations in polypeptide N-acetylgalactosaminyltransferase 12 in
human colon cancers. Proc Natl Acad Sci USA. 106:12921–12925.
Haeuptle MA, Hennet T. 2009. Congenital disorders of glycosylation: an update on defects affecting the biosynthesis of dolichol-linked oligosaccharides. Hum Mutat. 30:1628–1641.
Haltiwanger RS, Lowe JB. 2004. Role of glycosylation in development. Annu
Rev Biochem. 73:491–537.
Holleboom AG, Karlsson H, Lin RS, Beres TM, Sierts JA, Herman DS,
Stroes ES, Aerts JM, Kastelein JJ, Motazacker MM, et al. 2011.
Heterozygosity for a loss-of-function mutation in GALNT2 improves plasma triglyceride clearance in man. Cell Metab. 14:811–818.
Ichikawa S, Baujat G, Seyahi A, Garoufali AG, Imel EA, Padgett LR,
Austin AM, Sorenson AH, Pejin Z, Topouchian V, et al. 2010. Clinical variability of familial tumoral calcinosis caused by novel GALNT3 mutations.
Am J Med Genet A. 152A:896–903.
Isshiki S, Kudo T, Nishihara S, Ikehara Y, Togayachi A, Furuya A, Shitara K,
Kubota T, Watanabe M, Kitajima M, et al. 2003. Lewis type 1 antigen synthase (beta3Gal-T5) is transcriptionally regulated by homeoproteins. J Biol
Chem. 278:36611–36620.
Jaeken J. 2011. Congenital disorders of glycosylation (CDG): it’s (nearly) all in
it! J Inherit Metab Dis. 34:853–858.
Kathiresan S, Melander O, Guiducci C, Surti A, Burtt NP, Rieder MJ,
Cooper GM, Roos C, Voight BF, Havulinna AS, et al. 2008. Six new
loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat Genet.
40:189–197.
Kathiresan S, Willer CJ, Peloso GM, Demissie S, Musunuru K, Schadt EE,
Kaplan L, Bennett D, Li Y, Tanaka T, et al. 2009. Common variants at
30 loci contribute to polygenic dyslipidemia. Nat Genet. 41:56–65.
Kjaergaard S, Schwartz M, Skovby F. 2001. Congenital disorder of glycosylation type Ia (CDG-Ia): phenotypic spectrum of the R141H/F119L genotype.
Arch Dis Child. 85:236–239.
Kjaergaard S, Skovby F, Schwartz M. 1998. Absence of homozygosity for
predominant mutations in PMM2 in Danish patients with carbohydrate-deficient glycoprotein syndrome type 1. Eur J Hum Genet. 6:331–336.
Kobayashi K, Nakahori Y, Miyake M, Matsumura K, Kondo-Iida E,
Nomura Y, Segawa M, Yoshioka M, Saito K, Osawa M, et al. 1998. An ancient retrotransposal insertion causes Fukuyama-type congenital muscular
dystrophy. Nature. 394:388–392.
Kong Y, Joshi HJ, Schjoldager KT, Madsen TD, Gerken TA, Vester-Christensen MB, Wandall HH, Bennett EP, Levery SB, Vakhrushev SY,
et al. 2014. Probing polypeptide GalNAc-transferase isoform substrate specificities by in vitro analysis. Glycobiology. 25:55–65.
Kubota T, Shiba T, Sugioka S, Furukawa S, Sawaki H, Kato R, Wakatsuki S,
Narimatsu H. 2006. Structural basis of carbohydrate transfer activity by
human UDP-GalNAc: polypeptide alpha-N-acetylgalactosaminyltransferase
(pp-GalNAc-T10). J Mol Biol. 359:708–727.
Kumar P, Henikoff S, Ng PC. 2009. Predicting the effects of coding nonsynonymous variants on protein function using the SIFT algorithm. Nat
Protoc. 4:1073–1081.
Li Y, Vinckenbosch N, Tian G, Huerta-Sanchez E, Jiang T, Jiang H,
Albrechtsen A, Andersen G, Cao H, Korneliussen T, et al. 2010. Resequencing of 200 human exomes identifies an excess of low-frequency nonsynonymous coding variants. Nat Genet. 42:969–972.
Lohmueller KE, Sparsø T, Li Q, Andersson E, Korneliussen T, Albrechtsen A,
Banasik K, Grarup N, Hallgrimsdottir I, Kiil K, et al. 2013. Whole-exome
sequencing of 2,000 Danish individuals and the role of rare coding variants
in type 2 diabetes. Am J Hum Genet. 93:1072–1086.
Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B. 2014.
The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids
Res. 42:D490–D495.
Matthijs G, Schollen E, Pardon E, Veiga-Da-Cunha M, Jaeken J, Cassiman JJ,
Van Schaftingen E. 1997. Mutations in PMM2, a phosphomannomutase
gene on chromosome 16p13, in carbohydrate-deficient glycoprotein type I
syndrome (Jaeken syndrome). Nat Genet. 16:88–92.
Matthijs G, Schollen E, Van Schaftingen E, Cassiman JJ, Jaeken J. 1998. Lack of
homozygotes for the most frequent disease allele in carbohydrate-deficient
glycoprotein syndrome type 1A. Am J Hum Genet. 62:542–550.
Moremen KW, Tiemeyer M, Nairn AV. 2012. Vertebrate protein glycosylation:
diversity, synthesis and function. Nat Rev Mol Cell Biol. 13:448–462.
Patnaik SK, Blumenfeld OO. 2011. Patterns of human genetic variation inferred
from comparative analysis of allelic mutations in blood group antigen genes.
Hum Mutat. 32:263–271.
Pedersen JW, Bennett EP, Schjoldager KT, Meldal M, Holmér AP, Blixt O,
Cló E, Levery SB, Clausen H, Wandall HH. 2011. Lectin domains of
polypeptide GalNAc transferases exhibit glycopeptide binding specificity.
J Biol Chem. 286:32684–32696.
Sasaki K, Kurata-Miura K, Ujita M, Angata K, Nakagawa S, Sekine S, Nishi T,
Fukuda M. 1997. Expression cloning of cDNA encoding a human beta-1,3-N-acetylglucosaminyltransferase that is essential for poly-N-acetyllactosamine
synthesis. Proc Natl Acad Sci USA. 94:14294–14299.
Schjoldager KT, Clausen H. 2012. Site-specific protein O-glycosylation modulates proprotein processing – deciphering specific functions of the large
polypeptide GalNAc-transferase gene family. Biochim Biophys Acta.
1820:2079–2094.
Schjoldager KT, Vakhrushev SY, Kong Y, Steentoft C, Nudelman AS,
Pedersen NB, Wandall HH, Mandel U, Bennett EP, Levery SB, et al.
2012. Probing isoform-specific functions of polypeptide GalNAc-transferases using zinc finger nuclease glycoengineered SimpleCells. Proc
Natl Acad Sci USA. 109:9893–9898.
Schjoldager KT, Vester-Christensen MB, Goth CK, Petersen TN, Brunak S,
Bennett EP, Levery SB, Clausen H. 2011. A systematic study of site-specific
GalNAc-type O-glycosylation modulating proprotein convertase processing. J Biol Chem. 286:40122–40132.
Schollen E, Kjaergaard S, Legius E, Schwartz M, Matthijs G. 2000. Lack of
Hardy-Weinberg equilibrium for the most prevalent PMM2 mutation in
CDG-Ia (congenital disorders of glycosylation type Ia). Eur J Hum Genet.
8:367–371.
Stanley P. 2011. Golgi glycosylation. Cold Spring Harb Perspect Biol. 3.
Stanley P, Okajima T. 2010. Roles of glycosylation in Notch signaling. Curr Top
Dev Biol. 92:131–164.
Stern HM, Padilla M, Wagner K, Amler L, Ashkenazi A. 2010. Development of
immunohistochemistry assays to assess GALNT14 and FUT3/6 in clinical
trials of dulanermin and drozitumab. Clin Cancer Res. 16:1587–1596.
Stingl JC, Bartels H, Viviani R, Lehmann ML, Brockmöller J. 2014. Relevance
of UDP-glucuronosyltransferase polymorphisms for drug dosing: A quantitative systematic review. Pharmacol Ther. 141:92–116.
Storry JR, Olsson ML. 2004. Genetic basis of blood group diversity. Br J
Haematol. 126:759–771.
Storry JR, Olsson ML. 2009. The ABO blood group system revisited: A review
and update. Immunohematology. 25:48–59.
Ten Hagen KG, Fritz TA, Tabak LA. 2003. All in the family: the UDP-GalNAc:
polypeptide N-acetylgalactosaminyltransferases. Glycobiology. 13:1–16.
Ten Hagen KG, Hagen FK, Balys MM, Beres TM, Van Wuyckhuyse B, Tabak LA.
1998. Cloning and expression of a novel, tissue specifically expressed member
of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase family.
J Biol Chem. 273:27749–27754.
Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM,
Koseki M, Pirruccello JP, Ripatti S, Chasman DI, Willer CJ, et al. 2010. Biological, clinical and population relevance of 95 loci for blood lipids. Nature.
466:707–713.
Tietjen I, Hovingh GK, Singaraja RR, Radomski C, Barhdadi A, McEwen J,
Chan E, Mattice M, Legendre A, Franchini PL, et al. 2012. Segregation of
LIPG, CETP, and GALNT2 mutations in Caucasian families with extremely
high HDL cholesterol. PLoS ONE. 7:e37437.
Vuillaumier-Barrot S, Bouchet-Séraphin C, Chelbi M, Devisme L, Quentin S,
Gazal S, Laquerrière A, Fallet-Bianco C, Loget P, Odent S, et al. 2012.
Identification of mutations in TMEM5 and ISPD as a cause of severe cobblestone lissencephaly. Am J Hum Genet. 91:1135–1143.
Wagner KW, Punnoose EA, Januario T, Lawrence DA, Pitti RM, Lancaster K,
Lee D, von Goetz M, Yee SF, Totpal K, et al. 2007. Death-receptor
O-glycosylation controls tumor-cell sensitivity to the proapoptotic ligand
Apo2L/TRAIL. Nat Med. 13:1070–1077.