Genes and Immunity (2001) 2, 119–127 2001 Nature Publishing Group All rights reserved 1466-4879/01 $15.00 www.nature.com/gene The human gene for mannan-binding lectin-associated serine protease-2 (MASP-2), the effector component of the lectin route of complement activation, is part of a tightly linked gene cluster on chromosome 1p36.2–3 C Stover1, Y Endo2, M Takahashi2, NJ Lynch1, C Constantinescu1, T Vorup-Jensen3, S Thiel3, H Friedl4, T Hankeln4, R Hall5, S Gregory5, T Fujita2 and W Schwaeble1 1 Department of Microbiology and Immunology, University of Leicester, Leicester, UK; 2Department of Biochemistry, Fukushima Medical University, School of Medicine, Fukushima, Japan; 3Department of Medical Microbiology and Immunology, The Bartholin Building, University of Århus, Århus, Denmark; 4Genterprise, Institut für Molekulargenetik und Gentechnische Sicherheit, Johannes Gutenberg University Mainz, Mainz, Germany; 5The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK The proteases of the lectin pathway of complement activation, MASP-1 and MASP-2, are encoded by two separate genes. The MASP1 gene is located on chromosome 3q27, the MASP2 gene on chromosome 1p36.23–31. The genes for the classical complement activation pathway proteases, C1r and C1s, are linked on chromosome 12p13. We have shown that the MASP2 gene encodes two gene products, the 76 kDa MASP-2 serine protease and a plasma protein of 19 kDa, termed MAp19 or sMAP. Both gene products are components of the lectin pathway activation complex. We present the complete primary structure of the human MASP2 gene and the tight cluster that this locus forms with non-complement genes. A comparison of the MASP2 gene with the previously characterised C1s gene revealed identical positions of introns separating orthologous coding sequences, underlining the hypothesis that the C1s and MASP2 genes arose by exon shuffling from one ancestral gene. Genes and Immunity (2001) 2, 119–127. Keywords: complement; lectin pathway; serine proteases; gene cluster; MASP2 gene Introduction The complement system is an integral part of the innate antimicrobial immune defence and mediates humoral and cellular interactions within the immune response, including chemotaxis, phagocytosis, cell adhesion, and B cell differentiation.1 Complement may be activated via three different routes, the classical pathway, the alternative pathway, and the recently described lectin pathway. The classical activation pathway is initiated by binding of the hexaoligomeric recognition molecule C1q to immune complexes via its globular heads. This effects a conformational change of its collagenous stalks which in turn leads to the activation of the C1q-associated serine protease complexes C1r2 and C1s2. Of these, C1r cleaves and activates the zymogen C1s. Activated C1s subsequently cleaves C4 and C4b bound C2 to generate the C3 conCorrespondence: Dr WJ Schwaeble, Department of Microbiology and Immunology, University of Leicester, University Road, Leicester LE1 9HN, UK. E-mail: ws5얀le.ac.uk This work was supported by the Wellcome Trust, the German Research Foundation (Deutsche Forschungsgemeinschaft), the Japanese Society for the Promotion of Science, and the EMDO Foundation, Switzerland. Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession numbers AL109811, AB033742, AJ297949, AJ299718, and AJ300188. Received 4 December 2000; accepted 1 February 2001 vertase, C4b2b, and the C5 convertase, C4b2b(C3b)n. The alternative pathway forms an amplification loop of complement activation and is initiated by binding of the complement activation product C3b to Factor B. The subsequent cleavage of Factor B by Factor D forms an enzymatically active complex, C3bBb, which converts native C3 to C3b. Accumulation of multiple C3b molecules in proximity of the alternative pathway convertase C3bBb allows the complex to cleave and activate the fifth component of complement, C5. The lectin pathway can be activated in the absence of immune complexes and is initiated by the recognition of carbohydrate structures present on microbial organisms via macromolecular complexes present in body fluids. These complexes are composed of a multivalent pattern recognition subunit and associated serine proteases which initiate complement activation upon binding of the recognition subunits to microbial oligosaccharide structures. To date, two pattern recognition components of the lectin activation pathway have been described, the well characterised plasma glycoprotein mannan-binding lectin (MBL)2 and ficolin p35,3 both members of the collectin family,4,5 yet differing in their binding specificities to microbial surface oligosaccharides. As these vary considerably amongst members of the collectin family, the presence of more than one recognition subunit may result in a wider spectrum of microbial surfaces recognised by the lectin pathway activation complexes. MBL forms a Organisation of the human MASP2 gene locus C Stover et al 120 hexoidal macromolecular complex of six structural subunits each composed of three identical polypeptide chains. This associates with two distinct serine proteases, the MBL-associated serine proteases-1 and -2 (ie, MASP1, MASP-2), and a 19 kDa MASP-2 related plasma protein, termed MAp19 or sMAP.6 No detailed information is yet available on the stoichiometry of lectin pathway activation complexes formed with ficolin p35. It is, however, clear that ficolin p35 complexes associate with MASP-1, MASP-2 and MAp19/sMAP and initiate the lectin pathway via the aforementioned serine proteases.3 In vitro, purified and recombinant MASP-2 was shown to cleave the fourth and second component of complement (ie, C4 and C2) in absence of MASP-1.6–8 The sequential proteolytic cleavage of C4 and C4b bound C2 to form the C3 convertase, C4b2b, and the C5 convertase, C4b2b(C3b), respectively, is the essential step by which the lectin pathway activation complex as well as the classical pathway activation complex initiate further downstream activation of complement.7,8 The potential of MASP-2 to activate complement in absence of MASP-1 in vitro is supported by the analysis of sera of a recently established MASP-1 deficient mouse strain, which exhibits effective lectin pathway activation under physiological conditions.9 Thus, MASP-2 is the effector component of the lectin pathway of complement activation. MASP-1 and MAp19 are constitutive components of the lectin pathway activation complexes, their specific functional role(s), however, have yet to be defined. We have previously shown that a single structural MASP2 gene of approximately 20 kb on chromosome 1p36.23–31 is transcribed and processed to generate two gene products, a 2.6 kb MASP-2 mRNA, encoding the 76 kDa MASP-2 serine protease, or an abundant 1.0 kb mRNA transcript, encoding a truncated MASP-2 related plasma protein of 19 kDa devoid of a serine protease domain, termed MAp19 or sMAP.10–13 The present report amends and completes our previously presented organisational map (based solely on genomic PCR)10,11 by a comparative analysis of overlapping sequences obtained from human genomic DNA isolated as independent YAC, BAC and FixII clones. These represent approximately 222 kb of the chromosomal area 1p36.2–3. Furthermore, we present putative promoter elements responsible for the strict hepatic expression pattern of MASP-2 and MAp19 mRNA. Results Human MASP2 gene organisation The revised multi-exon structure of the human MASP2 gene (encompassing approximately 20 kb) is presented as derived from sequence analysis of six overlapping clones (Figure 1a and b). The 5⬘ boundary of exon a (62 bp) was determined by mapping the transcription initiation start site. From two independent 5⬘ RACE reactions performed on human liver cDNA, it was determined to lie 57 bp upstream of the translation initiation codon (Figure 2). In comparison to the published sequences for human MASP-2 and MAp19/sMAP mRNA species, the 5⬘ UT region of human MASP-2 mRNA (Y09926) could be extended by 37 bp and that of human MAp19/sMAP mRNA by 41 bp (Y18281) and 31 bp (AB008047), respectively. The Genes and Immunity sequence coding for the signal peptide is split by an intron of 83 bp (Figure 1a). Exon b codes for the remainder of the signal peptide (R)LLTLLGLLCGSVA and the N-terminal 63 residues of the CUB I domain. The C-terminal 59 residues of the CUB I domain are encoded by exon c, separated from exon b by 157 bp and from exon d, the single exon coding for the EGF-like domain, by 1016 bp. The alternative splice/polyadenylation exon e (which contains coding sequence for the four C-terminal amino acids of MAp19 not found in MASP-2, namely glutamic acid, glutamine, serine, and leucine, and sequence of the 3⬘ UT region specific for MAp19 mRNA) follows after 454 bp of intronic sequence. The 3⬘ boundary of exon e is determined by the position of the poly(A) addition site14 noted in the human MAp19 mRNA transcripts with the accession numbers Y18281–Y18284. 1262 bp further downstream, sequence for the second CUB structural domain is split by an intron of 316 bp (exon f, 197 bp; exon g, 148 bp, respectively). An intron of 1997 bp follows (AJ299718). The two CCP domains of MASP2 are encoded by two exons each over a stretch of 7.6 kb, exon h (119 bp) and exon i (79 bp), exon j (135 bp) and exon k (75 bp), respectively. 2527 bp further downstream, exon l encodes the serine protease domain of MASP-2 and contains the 3⬘ UT region specific for MASP-2 mRNA. The 3⬘ boundary of exon l is determined from the position of the poly (A) addition site14 found in cDNA clones Y09926, phl-1, phl-2, phl-3.15 Exons b-l are preceded by polypyrimidine rich tracts. Sequence comparison of the clones aligned in Figure 1 revealed only two variants (transitions) within the coding sequence of the MASP2 gene: the codon for the amino acid position 362 of the mature MASP-2 protein is GTG (valine) in AL109811 and GCG (alanine) in AJ300188, respectively; the codon for amino acid position 478 is TCC (serine) in AL109811 and TCT (serine) in AJ300188, respectively. The codon phases (positions of introns within the codon) are given in Table 1. They are identical for the human MASP2, MASP1, and C1s genes in the orthologous exonic composition represented in MASP2 exons a, b, c, d, f, g, h, i, j, and k.16 The codon phase at the 5⬘ end of exon l is also identical for the orthologous position in the human C1r gene.17 A GC level of 50.37% was determined for sequence encompassing the MASP2 gene (−2137 from transcription start site to 167 bp past exon l, ie a total of 19980 bp; http://ftp.genome.washington.edu/cgi-bin/RepeatMasker). In keeping with this high GC content, this area is characterised by the abundance of ‘short interspersed nuclear elements’, especially ALU repeats (36.01% of the sequence) and MIR (1.59%), and scarce ‘long interspersed nuclear elements’, LINE2 (1.46%). Within 1.2 kb 5⬘ of the determined transcription start site three potential promoter sites (with a score cut-off of 0.9, 0.99, and 1.0, respectively) are predicted by a human promoter analysis programme (http://www.fruitfly. org/seqFtools/promoter.html).18 There are two putative GC-box motifs.19 Several potential transcription factor binding sequences are revealed in their vicinities (http://bimas.dcrt.nih.gov/molbio/signal/)20,21 (Figure 2). Most importantly, among the latter, consensus sequences for liver-specific (LF-A1 and TGT3) or -enriched transcription factors (HNF4, and HNF5, respectively, http://www.cbil.upenn.edu/tess/) are contained which Organisation of the human MASP2 gene locus C Stover et al 121 Figure 1 (a) Organisation of the human MASP2 gene localised on chromosome 1p36.23–31. Exons are denoted as boxes and enumerated a through l, their products in terms of structural and functional modules conserved in the serine proteases C1r, C1s, MASP-1, MASP-2 are indicated. Introns are indicated as lines and shortened to scale; their sizes are given in brackets. 1 indicates the transcription start site determined by 5⬘ RACE analysis (see Figure 2), 17.675 kb indicates the position of the Poly(A) addition site of the MASP-2 mRNA. M, methionine; SP, signal peptide. (b) Alignment of overlapping clones. AL109811 represents a human PAC clone of approximately 150 kb which contains all of the MASP2 gene. The intronic sequence g/h (1997 bp) is deposited separately under AJ299718. ø4–12 and ø2–2 are two clones isolated from a FixII genomic library in this work which have been characterised by restriction mapping, Southern blot and partial sequence analyses. AB033742 represents sequence (2819 bp) of the 5⬘ part of the gene giving rise to the abundant plasma protein MAp19 (exons a through e). AJ300188 is a human genomic clone isolated from a RP11 BAC library with partial representation of the MASP2 gene (see Figure 3). AJ297949 (2577 bp) represents an amplification product obtained from human genomic DNA spanning the end of coding sequence for the EGF-like domain up to the sequence coding for the C-terminal part of the CUB II domain and encompasses sequence for the alternative splice and polyadenylation exon e. may play a role in determining the liver specific expression of MASP-2 and MAp19 mRNA.10,12 Clone RP11–99P18 (AJ300188) only covers sequence for the 3⬘ end of the MASP2 gene including exons l, k, and j, and 1.5 kb intronic sequence prior to exon j (Figure 1). As indicated in Figure 3, sequence encompassing the promoter area of the MASP2 gene (Figure 2) and the entire genomic sequence for exons a through i (including the alternative splice exon e) are replaced by a shortened, unrelated sequence of 1.3 kb. At its boundaries with the known MASP2 gene sequence (100% identical in overlapping parts), there is an increasing number of nucleotide mismatches towards a sequence which is composed of 48% ALU repeats and 6.3% LINE2 repeats (http:// ftp.genome.washington.edu/cgi-bin/RepeatMasker). Genetic neighbourhood of the MASP2 gene Figure 3 (a and b) gives an overview of the MASP2 gene cluster, as determined from sequence of overlapping clones characterised in this work. The MASP2 gene is located in close vicinity to non-complement genes. FRAP is a phospatidylinositol kinase conserved in yeast,22 cryptococcus neoformans,23 rat,24 and bovidae25 and is involved in G1 cell cycle progression.26 Its gene, FRAP1, is located in the vicinity of a cytogenetic breakpoint.27 It has been mapped to chromosomal location 1p36.228 and 1p36.2–3,29 respectively. The gene bank entries U88966 and NMF004958 represent two alternatively polyadenylated transcripts of this one gene differing only in sequence established within a GC rich region of 46 bp. PM-Scl 100 kDa nucleolar autoantigen30,31 has a yeast homologue, Rrp6p, which is required for processing of 5.8 S rRNA prior to ribosomal subunit assembly.32 Its gene, PMSCL2, has been located to human chromosome 1p36 and to mouse chromosome 4q27 (which is syntenic to the human gene location).33 Two gene bank entries for mRNA transcripts of this gene, X66113 and L01457, sharing the same 5⬘ and 3⬘ UT regions and differing in the presence or absence of coding sequence for 25 amino acids were analysed in comparison with the complete transcription unit of PMSCL2. The two mRNA transcripts are alternative products from the same gene arising from in-frame inclusion or skipping of the nineteenth exon (codon phase 0). TAR DNA binding protein of 43 kDa is a transcription factor that binds specifically to pyrimidine-rich motifs contained in the HIV-1 long terminal repeat and acts to repress gene expression.34 Its gene is termed TARDBP. The International Radiation Hybrid Mapping consortium has mapped the TARDBP gene to chromosome 20 (http://www.ncbi.nlm.nih.gov/omim/). Indeed, databank search (http://www.sanger.ac.uk/HGP/Chr20/) using the full-length mRNA sequence (AL050265) retrieves the following human bacterial artificial chromosome (BAC) clone, RP11–318P23 (AL359954). However, the degree of identity is between 86 and 95%, whereas the TARDBP mRNA sequence (AL050265) is 98–100% identical to corresponding sequence represented in PAC clone RP4–635E18 (AL109811), mapping to chromosome 1p36.11–31. The chromosomal localisation of the gene for ‘corneaderived transcript 6’, CDT6 (Y16132), has not previously been assigned. The single report in the literature identGenes and Immunity Organisation of the human MASP2 gene locus C Stover et al 122 Figure 2 Promoter region, transcriptional regulatory elements, and transcription initiation start site of the human MASP2 gene. Consensus sequences of putative binding sites for transcription factors are underlined and indicated and predicted promoter sequences are in bold. Putative GC boxes are marked. The transcription initiation start site as defined by 5⬘ RACE analysis is indicated as +1. Exon a (containing the 5⬘ UT region of MAp19/MASP-2 mRNA transcripts and their translation initiation codon) and the 5⬘ portion of exon b (coding for the remainder of the signal peptide and the N-terminal part of CUB I) are shown. ifies CDT6 as a novel gene whose translational product shows homology to angiopoietins 1 and 2 and is characterised by a coiled-coil structural domain and a fibrinogen-like domain.35 The gene for CDT6 lies within a 25 kb intron of the FRAP1 gene, in opposite transcriptional orientation (Figure 3). Discussion The MASP2 gene extends over approximately 20 kb and is much smaller than the MASP1 gene (about 72 kb as seen from sequence made available by the Human Genome Sequencing Project, accession number AC007920) owing to smaller introns. The human C1s gene comprises approximately 13 kb.16 Genes and Immunity Like the mRNA for the serine protease C1s, the MASP2 mRNA transcript is encoded by 12 exons. There is one additional exon in the MASP2 gene to allow generation of the alternative splice and polyadenylation product, MAp19/sMAP. In the C1s and the MASP2 gene, each of the two CUB structural domains as well as each of the two CCP modules are encoded by two exons each. The EGF-like domain is encoded by one exon only. Likewise, the enzymatically active protease domain is encoded by one exon only and clearly distinguishes the C1s and MASP2 genes from the MASP1 gene, in which the protease domain is encoded by six exons.16 A previously published comparative analysis of the genomic organisation of the serine protease domain of MASP-2 in mouse and Xenopus and of MASPs in carp, shark, and lamprey Organisation of the human MASP2 gene locus C Stover et al 123 Table 1 Codon phases within the human MASP2 gene Splice donor and acceptor sequences of all exons are compiled. The arrows indicate the alternative polyadenylation/splicing event that takes place in the hnRNA to generate either the MAp19 or MASP-2 mRNA. The position of the intron within the respective codon is given to the right; codon phase 0 and I denote a position of the intronic sequence before and after the first base, respectively. revealed that these MASP genes belong to the so-called AGY lineage of thrombin-like serine proteases. The members of this lineage are characterised by an AGY codon at the active site serine and a single exon coding for the protease domain.16 C1r was also shown to possess an intronless protease domain,36 its active site serine residue is also encoded by an AGY triplet. The close proximity of the human C1r and C1s genes, both of the AGY lineage (in 5⬘ → 3⬘ (C1r gene) and 3⬘ → 5⬘ (C1s gene) orientation, at an intergenic distance of 9.5 kb37), suggested that both genes are the result of a gene duplication event38 and have remained closely linked on chromosome 12p13.39 MASP-1, by contrast, belongs to the TCN lineage of trypsin-like serine proteases.16 Taking this into account,40 a model for the evolution of the MASP1, MASP2, C1r, and C1s genes was proposed which involves events such as mutation, retrotransposition, and duplication from an ancestral gene of the TCN type lineage showing a split gene organisation for the protease domain.16,17,41,42 The identical codon phases in the orthologous coding Genes and Immunity Organisation of the human MASP2 gene locus C Stover et al 124 Figure 3 (a) The MASP2 Gene Cluster on chromosome 1p36.2–3. The transcription units for the individual genes in the vicinity of the MASP2 gene are presented as boxes and arrows indicate orientation of transcription. Names of the gene products are included. The position of this analysed chromosomal area with respect to telomere and centromere of the short (p) arm is indicated. (b) Coverage of this chromosomal area by overlapping clones (AL109811, AJ300188, AL049653). The sequence area denoted by an asterisk represents the divergent sequence of 1340 bp solely composed of repetitive elements found in AJ300188. sequences of the MASP1, MASP2, C1s genes, and the partially characterised C1r gene underline the likelihood of such a genesis. The phenomenon of exon shuffling may have to be added to these events:43 symmetric exons, ie exons for which the phase of the 5⬘ intron is identical to the phase of the 3⬘ intron, will spread more easily by exon shuffling than nonsymmetric ones.44 Within the MASP2 gene, exons d, e, j, and k are symmetric, ie phase correlated, exons (coding for the EGF-like domain, the MAp19 specific C-terminus, and all of CCP II). By contrast, introns that lie in phase 0 (ie between codons), are said to be ancient, ie originally present in the progenote, and are probably more related to structural modules of the protein.45 In the MASP2 gene, phase 0 introns are found between the exons coding for CUB I, CUB II, and CCP I, respectively. The MASP2 gene lies in a gene-rich area of chromosome 1,46 in close proximity to non-complement genes, such as the gene coding for TAR DNA binding protein, the gene coding for the 100 kDa Polymyositis-Scleroderma autoantigen, the gene coding for FRAP kinase which harbours in one of its introns the gene for CDT6. All of these genes have a split exon architecture. In a wider genetic context, the detection on 1p35–36.2 of several and polymorphic pseudogenes of the macrophage stimulating protein gene has lead to the concept that recombination may occur frequently between these repeated sequences.47 In addition, a PAC clone from this chromosomal area (Z98257) contains sequence with 81– 96% identity to a human endogenous retrovirus type C oncovirus sequence (M74509). An adenoviral integration site has been mapped to 1p36.1.48 The cytogenetic aberration described in neuroblastomas is a deletion of the short arm of chromosome 1.49 This deletion encompasses the MASP2 gene locus (the MASP2 gene is at 5.66 cR from the marker D1S54813) as seen from an analysis of several framework maps (http://www.ncbi.nlm.nih.gov/ genemap/map.cgi for chromosome 1).28,49 The significance of this is not yet clear. Here, it is of interest to note the substitution of part of the MASP2 gene observed in clone RP11–99P18 (AJ300188), which encompasses the Genes and Immunity promoter region of the MASP2 gene, exons a through i, and includes most of the intron prior to exon j which codes for the N-terminal part of CCP II. Analysis of this sequence area reveals that there is sequence identity over a stretch of 1.5 kb for the 3⬘ part of the intron i/j of the MASP2 gene and a stretch of 4.3 kb for the 5⬘ part of the intergenic region to the PMSCL2 gene, respectively. Approaching the divergent sequence of 1.3 kb (solely composed of repetitive elements) from either side, there is an increasing number of nucleotide mismatches. The origin of the divergent organisation of the MASP2 gene locus in this BAC clone is presently investigated by sequencing two overlapping BACs of the same human BAC library from which clone RP11–99P18 was isolated. The serine proteases of the classical and lectin pathway of complement activation have a similarity score between 39 and 52%.15 Their genes belong to a multigene family. Genes coding for various types of trypsinogen are also members of this family. Trypsin, composed solely of a serine protease domain (of the TCN type and encoded by five exons, the boundaries differ from the ones found in the orthologous part of the MASP1 gene coding for the serine protease domain), is thought to be the primordial enzyme present before the emergence of complement serine proteases, and novel substrate specificities are thought to have been introduced by the evolutionary anteposition of certain modular domain structures found in, eg, MASP-2, MASP-1, C1r, and C1s. Recent work emphasises the importance of residue 225 (chymotrypsin numbering) in serine proteases.50 While degradative enyzmes such as trypsin have a proline residue at this position, P225, a tyrosine is found in MASP-1, MASP-2, C1r, C1s (and thrombin), Y225. Y225 allows Na+ binding near the primary specificity site and thus enhances catalytic activity. It is thought that the mutation P225 to Y225 occurred in prevertebrates in the face of evolutionary pressures, prior to the conversion of the codon for the active site serine (TCN to AGY) in more specialised serine proteases found in vertebrates.50 The concomitant presence of members of unrelated multigene families in the vicinity of the MASP1, MASP2, Organisation of the human MASP2 gene locus C Stover et al 125 Table 2 Compilation of positional overlap of human genes belonging to four different multigene families Presented on the left are the genes for the serine proteases MASP-2, MASP-1, C1r, C1s, and two trypsins; these are compared against members of three unrelated gene families, a glucose transporter gene family (SLC2A), a chloride channel gene family (CLCN), and the tumor necrosis factor receptor gene family (TNFR), and their respective chromosomal locations are indicated. C1r, C1s, and trypsinogen genes would lend support to the hypothesis of a joint ancestry of these genes and weaken the theoretical possibility of a convergent evolution of unrelated genes. In search of these, strikingly overlapping chromosomal positions have been found for the serine protease genes of interest to this work, two of the trypsinogen genes,51 the genes for certain isoforms of a glucose transporter52 and of a chloride channel53 as well as for the genes for the two TNF receptors (http://www. ncbi.nlm.nih.gov/omim/, http://www.ncbi.nlm.nih. gov/genemap99/)54,55 (Table 2). We propose that the consideration of such neighbouring genes which are equally members of multigene families should be included in phylogenetic analyses. Materials and methods RACE Initiation of transcription of the human MASP2 gene10,11 was mapped by RACE technology using 5⬘ RACE System for Rapid Amplification of cDNA Ends (Version 2.0) from Life Technologies (Paisley, UK). Total RNA prepared from human liver tissue56 was transcribed to cDNA using a MASP-2 mRNA specific antisense oligonucleotide (5⬘ GGAGAGTTTGGGATACGGCCGTGG GTA 3⬘, bp pos. 617–643, accession number Y09926). The single strand cDNA was purified and dC-tailed according to the manufacturer’s protocol. An anchor primer annealing to the homopolymeric tail and two other, nested MASP2/MAp19 specific oligonucleotides (5⬘ GCGTGGCCAG CACCTTGGCCCC 3⬘, bp pos. 265–286 of Y18281, 5⬘ GCGA GTAGAAAGTGTCCTTGC 3⬘, bp pos. 326–346 of Y18281) positioned 3⬘ of the above oligonucleotide were used for two independent cyclic amplification reactions. Two distinct amplification products of 320 bp and 380 bp, respectively were subcloned in pGEMTeasy vector (Promega, Southampton, UK) and analysed by double stranded cycle sequencing. Cloning of the MASP2 gene Human bacterial artificial chromosome (BAC) clones were isolated by screening a human RP11 BAC library using a radiolabeled cDNA probe specific for the coding region of the serine protease domain of MASP-2 (Research Genetics, Huntsville, Alabama, USA). The clones were verified by colony PCR using MASP-2 specific oligonucleotides (5⬘ ACCT GGTGATTTTCCTTGG 3⬘, bp. pos. 1372–1390 of Y09926 and 5⬘ GTCACTCT CGTGGTTTATG 3⬘, bp. pos. 2278–2296 of Y09926) and Southern blotting. Partial primary structure for the MASP2 gene was established from overlapping subclones obtained from a shotgun library constructed of one BAC clone, RP11–99P18. Double stranded sequencing and analysis were performed using automated sequencer ABI 377 and Sequencher 3.0 programme (Gene Codes Corporation, MI, USA), respectively. In parallel, two clones, ø4– 12 and ø2–2, were isolated from a human Fix II genomic library using a MASP-2 cDNA probe and analysed by restriction mapping and sequencing. One clone, AJ297949, was obtained by cyclic amplification of human genomic DNA using MASP-2 specific oligonucleotides (5⬘ GCGCCCACCTGCGACCACCAC 3⬘, bp pos. 461–481 of Y09926 and 5⬘ TCCTGATTCATCTGTGACAAAGG 3⬘, bp pos. 846–868 of Y09926) and characterised by restriction mapping and sequencing. Acknowledgements We wish to thank The International Institute for the Advancement of Medicine, University of Leicester for kindly providing the non-tumorous human liver specimen used in this study. We acknowledge the expertise of Cathy Summer (Research Genetics, Huntsville, Alabama, USA) in isolating the MASP-2 specific human BAC clone RP11–99P18 (AJ300188). The authors also wish to thank the Sanger Mapping and Sequence groups for the sequence production of RP4–635E18 (AL109811). Guelnihal Yueksekdag is acknowledged for excellent technical support. We especially thank Dr Jim Kaufman (Institute of Animal Health, Compton, England) for his inspiring comments. References 1 Whaley K, Schwaeble W. Complement and complement deficiencies. Semin Liver Dis 1997; 17: 297–310. Genes and Immunity Organisation of the human MASP2 gene locus C Stover et al 126 2 Lu JH, Thiel S, Wiedemann H, Timpl R, Reid KB. Binding of the pentamer/hexamer forms of mannan-binding protein to zymosan activates the proenzyme C1r2C1s2 complex, of the classical pathway of complement, without involvement of C1q. J Immunol 1990; 144: 2287–2294. 3 Matsushita M, Endo Y, Fujita T. Complement-activating complex of ficolin and mannose-binding lectin-associated serine protease. J Immunol 2000; 164: 2281–2284. 4 Hansen S, Holmskov U. Structural aspects of collectins and receptors for collectins. Immunobiology 1998; 199: 165–189. 5 Hoppe H, Reid K. Collectins--soluble proteins containing collagenous regions and lectin domains--and their roles in innate immunity. Protein Sci 1994; 3: 1143–1158. 6 Thiel S, Petersen S, Vorup-Jensen T et al. Interaction of C1q and Mannan-Binding Lectin (MBL) with C1r, C1s, MBL-associated serine proteases 1 and 2, and the MBL-associated protein MAp19. J Immunol 2000; 165: 878–887. 7 Vorup-Jensen T, Petersen S, Hansen A et al. Distinct pathways of mannan-binding lectin (MBL)- and C1 complex autoactivation revealed by reconstitution of MBL with recombinant MBLassociated serine protease-2. J Immunol 2000; 165: 2093–2100. 8 Matsushita M, Thiel S, Jensenius J, Terai I, Fujita T. Proteolytic activities of two types of mannose-binding lectin-associated serine protease. J Immunol 2000; 165: 2637–2642. 9 Takahashi M, Miura S, Ishii N et al. An essential role of MASP1 in activation of the lectin pathway. Immunopharmacology 2000; 49: 3 (Abstract 003). 10 Stover C, Thiel S, Thelen M, Lynch N, Vorup-Jensen T, Jensenius J, Schwaeble W. Two constituents of the initiation complex of the Mannan-Binding Lectin activation pathway of complement are encoded by a single structural gene. J Immunol 1999; 162: 3481–3490. 11 Takahashi M, Endo Y, Fujita T, Matsushita M. A truncated form of mannose-binding lectin-associated serine protease (MASP)-2 expressed by alternative polyadenylation is a component of the lectin complement pathway. Int Immunol 1999; 11: 859–863. 12 Stover C, Thiel S, Lynch N, Schwaeble W. The rat and mouse homologues of MASP-2 and MAp19, components of the mannan-binding lectin activation pathway of complement. J Immunol 1999; 63: 6848–6859. 13 Stover C, Schwaeble W, Lynch N, Thiel S, Speicher M. Assignment of the gene encoding Mannan-Binding Lectin-associated serine protease-2 (MASP-2) to human chromosome 1p36.2–3 by in situ hybridization and somatic cell hybrid analysis. Cytogenet Cell Genet 1999; 84: 148–149. 14 Cooke C, Alwine J. The cap and the 3⬘ splice site similarly affect polyadenylation efficiency. Mol Cell Biol 1996; 16: 2579–2584. 15 Thiel S, Vorup-Jensen T, Stover C et al. A second serine protease associated with mannan-binding lectin that activates complement. Nature 1997; 386: 506–510. 16 Endo Y, Takahashi M, Nakao M et al. Two lineages of mannosebinding lectin-associated serine protease (MASP) in vertebrates. J Immunol 1998; 161: 4924–4930. 17 Endo Y, Sato T, Matsushita M, Fujita T. Exon structure of the gene encoding the human mannose-binding protein-associated serine protease light chain: comparison with complement C1r and C1s genes. Int Immunol 1996; 8: 1355–1358. 18 Reese MG, Eeckman FH. Novel Neural Network Algorithms for Improved Eukaryotic Promoter Site Recognition. The 7th international Genome sequencing and analysis conference, Hilton Head Island, South Carolina, 16–20 September 1995. 19 Bucher P. Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J Mol Biol 1990; 212: 563–578. 20 Prestridge DS. SIGNAL SCAN: A computer program that scans DNA sequences for eukaryotic transcriptional elements. CABIOS 1991; 7: 203–206. 21 Ghosh D. New developments of a transcription factors database. Trends Biochem Sci 1991; 16: 445–447. 22 Kunz, J, Henriquez R, Schneider U, Deuter-Reinhard M, Movva N, Hall M. Target of rapamycin in yeast, TOR2, is an essential Genes and Immunity 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 phosphatidylinositol kinase homolog required for G1 progression. Cell 1993; 73: 585–596. Cruz M, Cavallo L, Gorlach J et al. Rapamycin antifungal action is mediated via conserved complexes with FKBP12 and TOR kinase homologs in cryptococcus neoformans. Mol Cell Biol 1999; 19: 4101–4112. Sabatini D, Erdjument-Bromage H, Lui M, Tempst P, Snyder S. RAFT1: a mammalian protein that binds to FKBP12 in a rapamycin-dependent fashion and is homologous to yeast TORs. Cell 1994; 78: 35–43. Brown E, Albers M, Shin T et al. A mammalian protein targeted by G1-arresting rapamycin-receptor complex. Nature 1994; 369: 756–758. Vilella-Bach M, Nuzzi P, Fang Y, Chen J. The FKBP12-rapamycin-binding domain is required for FKBP12-rapamycin-associated protein kinase activity and G1 progression. J Biol Chem 1999; 274: 4266–4272. Onyango P, Lubyova B, Gardellin P, Kurzbauer R, Weith A. Molecular cloning and expression analysis of five novel genes in chromosome 1p36. Genomics 1998; 50: 187–198 Lench N, Macadam R, Markham A. The human gene encoding FKBP-rapamycin associated protein (FRAP) maps to chromosomal band 1p36.2. Hum Genet 1997; 99: 547–549. Moore P, Rosen C, Carter K. Assignment of the human FKBP12rapamycin-associated protein (FRAP) gene to chromosome 1p36 by fluorescence in situ hybridization. Genomics 1996; 33: 331–332. Gelpi C, Algueró A, Angeles Martinez M, Vidal S, Juarez C, Rodriguez-Sanchez J. Identification of protein components reactive with anti-PM/Scl autoantibodies. Clin Exp Immunol 1990; 81: 59–64. Ge Q, Wu Y, Trieu E, Targoff I.Analysis of the specificity of antiPM-Scl autoantibodies. Arthritis Rheum 1994; 37: 1445–1452. Briggs M, Burkard K, Butler J. Rrp6p, the yeast homologue of the human PM-Scl 100-kDa autoantigen, is essential for efficient 5.8 S rRNA 3⬘ end formation. J Biol Chem 1998; 273: 13255–13263. Bliskovski V, Liddell R, Ramsay E, Miller M, Mock B. Structure and localization of mouse Pmscl1 and Pmscl2 genes. Genomics 2000; 64: 106–110. Ou SH, Wu F, Harrich D, Garcia-Martinez L, Gaynor R. Cloning and characterization of a novel cellular protein, TDP-43, that binds to human immunodeficiency virus type 1 TAR DNA sequence motifs. J Virol 1995; 69: 3584–3596. Peek R, van Gelderen BE, Bruinenberg M, Kijlstra A. Molecular cloning of a new angiopoietin-like factor from the human cornea. Invest Ophthalmol Vis Sci 1998; 39: 1782–1788. Tosi M, Duponchel C, Meo T, Couture-Tosi E. Complement genes C1r and C1s feature an intronless protease domain closely related to haptoglobin. J Mol Biol 1989; 208: 709–714. Kusumoto H, Hirosawa S, Salier J, Hagen F, Kurachi K. Human genes for complement C1r and C1s in a close tail-to-tail arrangement. Proc Natl Acad Sci 1988; 85: 7307–7311. Tosi M, Duponchel C, Meo T, Julier C. Complete cDNA sequence of human complement C1s and close physical linkage of the homologous genes C1s and C1r. Biochemistry 1987; 26: 8516–8524. Van Cong N, Tosi M, Gross M et al. Assignment of the complement serine protease genes C1r and C1s to chromosome 12 region 12p13. Hum Genet 1988; 78: 363–368. Dang Q, DiCera E. Residue 225 determines the Na+-induced allosteric regulation of catalytic activity in serine proteases. Proc Natl Acad Sci USA 1996; 93: 10653–10656. Lawson P, Reid K. A novel PCR-based technique using expressed sequence tags and gene homology for murine genetic mapping: localization of the complement genes. Int Immunology 1999; 12: 231–240. Irwin D. Evolution of an active-site codon in serine proteases. Nature 1988; 336: 429–430. Long M, Rosenberg C, Gilbert W. Intron phase correlations and the evolution of the intron/exon structure of genes. Proc Natl Acad Sci USA 1995; 92: 12495–12499. Organisation of the human MASP2 gene locus C Stover et al 44 Gilbert W, de Souza S, Long M. Origin of genes. Proc Natl Acad Sci USA 1997; 94: 7698–7703. 45 De Souza S, Long M, Klein R, Roy S, Lin S, Gilbert W. Toward a resolution of the introns early/late debate: only phase zero introns are correlated with the structure of ancient proteins. Proc Natl Acad Sci 1998; 95: 5094–5099. 46 Saccone S, De Sario A, Della Valle G, Bernardi G. The highest gene concentrations in the human genome are in telomeric bands of metaphase chromosomes. Proc Natl Acad Sci 1992; 89: 4913–4917. 47 van der Drift P, Chan A, Zehetner G, Westerveld A, Versteeg R. Multiple MSP pseudogenes in a local repeat cluster on 1p36.2: an expanding genomic graveyard? Genomics 1999; 62: 74–81. 48 Romani M, Baldini A, Volpi E, Casciano I, Nobile C, Muresu R, Siniscalco M. Concurrent mapping of an adeovirus 5/SV40 integration site and the U1 snRNA cluster (RNU1) within 400 kb of the chromosome 1p36.1. Cytogenet Cell Genet 1994; 67: 37–40. 49 White P, Maris J, Beltinger C et al. A region of consistent deletion in neuroblastoma maps within human chromosome 1p36.2–36.3. Genetics 1995; 92: 5520–5524. 50 Guinto ER, Caccia S, Rose T, Futterer K, Waksman G, Di Cera E. Unexpected crucial role of residue 225 in serine proteases. Proc Natl Acad Sci USA 1999; 96: 1852–1857. 51 Emi M, Nakamura Y, Ogawa M et al. Cloning, characterization and nucleotide sequences of two cDNAs encoding human pancreatic trypsinogens. Gene 1986; 41: 305–310. 52 Saier M. Families of transmembrane sugar transport proteins. Mol Microbiol 2000; 35: 699–710. 53 Jentsch TJ, Friedrich T, Schriever A, Yamada H. The CLC chloride channel family. Pflugers Arch 1999; 437: 783–795. 54 Fuchs P, Strehl S, Dworzak M, Himmler A, Ambros P. Structure of the human TNF receptor 1 (p60) gene (TNFR1) and localization to chromosome 12p13. Genomics 1992; 13: 219–224. 55 Kemper O, Derre J, Cherif D, Engelmann H, Wallach D, Berger R. The gene for the type II (p75) tumor necrosis factor receptor (TNF-RII) is localized on band 1p36.2-p36.3. Hum Genet 1991; 87: 623–624. 56 Chirgwin J, Przybyla A, MacDonald R, Rutter W. Isolation of biologiocally active ribonucleic acid from sources enriched in ribonucleases. Biochemistry 1979; 18: 5294–5299. 127 Genes and Immunity
© Copyright 2026 Paperzz