cDNA Cloning of a Developmentally Regulated Hemocyanin Subunit in the Crustacean Cancer magister and Phylogenetic Analysis of the Hemocyanin Gene Family

Gregor Durstewitz and Nora Barclay Terwilliger
Oregon Institute of Marine Biology, Charleston; and Department of Biology, University of Oregon

The complete cDNA sequence and protein reading frame of a developmentally regulated hemocyanin subunit in the Dungeness crab (Cancer magister) are presented. The protein sequence is aligned with 18 potentially homologous hemocyanin-type proteins displaying apparent sequence similarities. Functional domains are identified, and a comparison of predicted hydrophilicities, surface probabilities, and regional backbone flexibilities provides evidence for a remarkable degree of structural conservation among the proteins surveyed. Parsimony analysis of the protein sequence alignment identifies four monophyletic groups on the arthropodan branch of the hemocyanin gene tree: crustacean hemocyanins, insect hexamerins, chelicerate hemocyanins, and arthropodan prophenoloxidases. They form a monophyletic group relative to molluscan hemocyanins and nonarthropodan tyrosinases. Arthropodan prophenoloxidases, although functionally similar to tyrosinases, appear to belong to the arthropodan hexamer-type hemolymph proteins as opposed to molluscan hemocyanins and tyrosinases.

Introduction

Hemocyanins and related copper proteins are ancient molecules. They probably arose about 1.6 billion years ago (BYA) when the earth’s atmosphere changed from a reducing to an oxidizing environment. Prior to that time, most of the earth’s available copper (Cu) was precipitated in insoluble sulfides, CuS and Cu2S (Ochiai 1983), and was therefore virtually inaccessible to living organisms. When oxygen-producing photosynthesis began to increase about 2 BYA, significant amounts of Cu were oxidized to Cu2+.
In this form, it was readily dissolved in aquatic systems, distributed in the biosphere, and therefore became available to living organisms.

Hemocyanin is the oxygen transport protein of many arthropods and molluscs. It occurs freely dissolved in the hemolymph. Although similar in function to molluscan hemocyanins, arthropodan hemocyanin is radically different in molecular architecture. Molluscan hemocyanin subunits are multidomain polypeptides of about 350-450 kDa, containing 7-8 functional units, each of which contains two Cu-binding sites and combines reversibly with oxygen. The subunits form cylindrical macromolecules of 3,500-4,500 kDa and higher multiples depending on the species. Arthropodan hemocyanin is composed of heterogeneous subunits with molecular weights of about 75 kDa. These subunits self-assemble into 450-kDa hexamers or multiples thereof. Each subunit contains two Cu-binding sites, CuA and CuB, that together reversibly bind one molecule of oxygen.

Key words: hemocyanin, respiratory proteins, phylogeny, Cancer magister, parsimony analysis, prophenoloxidases.
Address for correspondence and reprints: Dr. Nora B. Terwilliger, Oregon Institute of Marine Biology, 4619 Boat Basin Drive, Charleston, Oregon 97420. E-mail: [email protected].
Mol. Biol. Evol. 14(3):266-276. 1997
© 1997 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

The evolutionary relationship between the hemocyanins has long been the topic of speculation: are arthropodan and molluscan hemocyanins homologous gene products or the result of convergent evolution? They are very different in sequence as well as in subunit structure and composition (van Holde and Miller 1995), but there is good evidence for a common origin of at least part of their active site (Drexel et al. 1987). On the basis of sequence comparisons, other potential members of a putative hemocyanin gene family have recently been identified. These include tyrosinases (Lerch et al.
1986), prophenoloxidases (Aspan et al. 1995), and insect storage proteins or hexamerins (Munn and Greville 1969; Telfer and Massey 1987). The latter do not bind Cu. Other Cu proteins like the plastocyanins, Cu-dependent cytochrome c oxidase, ceruloplasmin, azurins, laccase, ascorbate oxidase, or Cu-dependent superoxide dismutase show no structural or sequence similarity with the hemocyanin family (Markl and Decker 1992).

In this paper we present the complete cDNA and protein sequence of a developmentally regulated hemocyanin subunit from a brachyuran crustacean, the Dungeness crab Cancer magister. In brachyurans, hemocyanin occurs in the hemolymph predominantly as a two-hexamer molecule. Although subunit sequences of chelicerate hemocyanins and hexameric crustacean hemocyanins have been reported (Linzen et al. 1985; Beintema et al. 1994), subunit sequences of multihexameric crustacean hemocyanins have been unavailable up to now. This protein, Cmag6, is the first Cu-based respiratory protein described whose expression appears to be developmentally regulated. In adult C. magister, one-hexamer hemocyanin (Hc) is composed of subunits 1, 2, 4, 5, and 6, while the two-hexamer Hc contains subunits 1, 2, 3, 4, 5, and 6. The megalopa and early juvenile crab Hcs lack subunit 6 until about the sixth juvenile instar (Terwilliger and Terwilliger 1982). This is the same time at which Cmag6 mRNA is first detectable in hepatopancreas tissue (Durstewitz and Terwilliger 1997). The appearance of subunit Cmag6 correlates with a change in oxygen-binding properties; under the same experimental conditions, the oxygen affinity of adult purified two-hexamer Hc is about 50% higher than that of juvenile two-hexamer Hc lacking Cmag6 (Terwilliger and Brown 1993). Once expression of subunit Cmag6 is initiated, it persists for the rest of the crab’s life.
In this study, we also aligned 18 potentially homologous proteins with Cmag6 for investigation of potential conserved structural features and for parsimony analysis. The combination of both structural and sequence comparisons is used to shed light on the evolutionary relationships among hemocyanin-type proteins.

Materials and Methods

Isolation of Total RNA from Cancer magister Hepatopancreas

Fresh tissue samples (100 mg) from adult male C. magister hepatopancreas were rinsed with C. magister hemolymph buffer (50 mM Tris-HCl, 454 mM NaCl, 11.5 mM KCl, 13.5 mM CaCl2, 18 mM MgCl2, 23.5 mM Na2SO4, pH 7.6; Terwilliger and Brown 1993), frozen in liquid nitrogen, and ground to a fine powder with mortar and pestle. Total RNA was isolated with the guanidinium isothiocyanate method using a RAPID Total RNA Isolation Kit (5 Prime + 3 Prime, Inc.). Total RNA yield was quantified by measuring absorbance at 260 nm.

Reverse Transcription and Polymerase Chain Reaction (PCR) Amplification of Hemocyanin Coding Sequences

In an Eppendorf tube, 1 µl total RNA (1 µg/µl) was diluted with 10.65 µl autoclaved water, and 0.75 µl oligo dT-primer (0.27 µg/µl) was added. The mixture was incubated for 3 min at 65°C and then was allowed to cool down to room temperature. The following were then added, in this order: 4 µl 5× reverse transcription buffer (250 mM Tris-HCl, pH 8.5; 200 mM KCl; 30 mM MgCl2), 1 µl 20 mM dithiothreitol (DTT), 1 µl 25 mM dNTPs, 1 µl RNAsin (10 U/µl), and 0.6 µl AMV reverse transcriptase (17 U/µl). The reaction was incubated at 42°C for 90 min and then diluted to a total volume of 500 µl. A 10-µl aliquot of this reverse transcription reaction was added to 18.5 µl water, 5 µl 10× PCR buffer (670 mM Tris-HCl), 4 µl 2.5 mM dNTPs, 5 µl 10× bovine serum albumin (1 µg/µl), 1 µl of each primer (0.2 µg/µl), 0.5 µl Taq polymerase (5 U/µl), and 5 µl 40 mM MgCl2.
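The component volumes listed above can be checked with a short calculation. The sketch below sums the volumes of both reaction mixes and derives a few final concentrations; the dictionary layout and the function names are illustrative choices, not part of the original protocol, and the MgCl2 figure ignores any carry-over from the diluted reverse transcription reaction.

```python
# Volumes in microliters, copied from the protocol text above.
RT_MIX_UL = {
    "total RNA": 1.0,
    "water": 10.65,
    "oligo dT primer": 0.75,
    "5x RT buffer": 4.0,
    "20 mM DTT": 1.0,
    "25 mM dNTPs": 1.0,
    "RNAsin": 1.0,
    "AMV reverse transcriptase": 0.6,
}

PCR_MIX_UL = {
    "diluted RT reaction": 10.0,
    "water": 18.5,
    "10x PCR buffer": 5.0,
    "2.5 mM dNTPs": 4.0,
    "10x BSA": 5.0,
    "primer 1": 1.0,
    "primer 2": 1.0,
    "Taq polymerase": 0.5,
    "40 mM MgCl2": 5.0,
}


def total_volume(mix):
    """Sum component volumes, rounded to avoid floating-point noise."""
    return round(sum(mix.values()), 2)


def final_concentration(stock, added_ul, total_ul):
    """Concentration after dilution: stock * added volume / total volume."""
    return stock * added_ul / total_ul


rt_total = total_volume(RT_MIX_UL)    # 20.0 µl
pcr_total = total_volume(PCR_MIX_UL)  # 50.0 µl

dtt_mM = final_concentration(20.0, 1.0, rt_total)       # 1.0 mM DTT
rt_dntp_mM = final_concentration(25.0, 1.0, rt_total)   # 1.25 mM dNTPs
mgcl2_mM = final_concentration(40.0, 5.0, pcr_total)    # 4.0 mM added MgCl2

print(rt_total, pcr_total, dtt_mM, rt_dntp_mM, mgcl2_mM)
```

The totals of 20 µl and 50 µl match the conventional reaction sizes, which suggests the listed volumes are complete.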
PCR was carried out using the following protocol: (denature: 94°C for 40 s; anneal: 55°C for 40 s; polymerize: 72°C for 1 min) × 35 cycles, then 5 min at 72°C and hold at 4°C. Ten-microliter aliquots of each reaction were analyzed on 1.2% agarose Tris-acetate/EDTA (TAE) minigels.

Cloning and Sequencing of PCR-Amplified Cmag6 cDNA

Unless indicated otherwise, the following procedures were performed in accordance with Sambrook, Fritsch, and Maniatis (1989). PCR products (40 µl total) were separated by size through electrophoresis on 1.2% agarose TAE maxigels. Bands of interest were excised under UV light and purified in a glassmilk procedure (GENECLEAN II kit, Bio 101, Inc.). Ends were repaired with Klenow polymerase and 5’ ends were phosphorylated with T4 polynucleotide kinase. A Bluescript II SK+ vector (Stratagene) was cut with restriction endonuclease Sma I and dephosphorylated using calf intestinal phosphatase (CIP). In a total volume of 20 µl, the inserts were blunt-end-ligated into 50 ng vector DNA in a molar ratio insert/vector of 3:1, using 1 Weiss unit T4 DNA ligase. Ligation occurred overnight at 16°C. Competent Escherichia coli XL-1 Blue cells were transformed with 50 ng ligated DNA and plated out on LB-Amp plates. Positive clones were selected and DNA was isolated in alkaline lysis minipreps. When desired, inserts were excised and analyzed using restriction enzymes EcoRV and Xba I.

cDNA inserts were sequenced manually by standard walking, using the dideoxy method with a SEQUENASE 2.0 kit (U.S. Biochemical) and radioactive 35S-labeled nucleotides (NEN-Du Pont). T3 and T7 were used as initial sequencing primers. To provide generous overlap between the sequenced parts, further primers were designed as 17mers based on stretches of cDNA sequence located approximately at the -50 bp position relative to the end of the known region.
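The primer-walking rule described above can be sketched in a few lines. This is a minimal illustration, assuming that "the -50 bp position" means the primer starts 50 bases upstream of the 3’ end of the sequence read so far; the function name and the test sequence are hypothetical and do not come from the original work.

```python
def next_walking_primer(known_seq, primer_len=17, offset_from_end=50):
    """Pick the next sequencing primer from the end of the known region.

    Returns (start_index, primer), where start_index is 0-based and the
    primer begins `offset_from_end` bases before the end of `known_seq`.
    """
    if primer_len > offset_from_end:
        raise ValueError("primer would run past the end of the known sequence")
    if len(known_seq) < offset_from_end:
        raise ValueError("known sequence is shorter than the requested offset")
    start = len(known_seq) - offset_from_end
    return start, known_seq[start:start + primer_len]


# Hypothetical 120-nt read, used only to exercise the function.
known = "ACGT" * 30
start, primer = next_walking_primer(known)
print(start, primer)  # 70 GTACGTACGTACGTACG
```

Starting each new primer 50 bases inside the known region means consecutive reads overlap, so each junction is covered twice.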
Screening a cDNA Library of Cancer magister Hepatopancreas Tissue and Sequencing of CmagX

A cDNA library of adult C. magister hepatopancreas tissue was created as described before (Terwilliger and Durstewitz 1996) and screened with a 32P random-prime-labeled 783-bp C. magister hemocyanin-specific probe (Durstewitz and Terwilliger 1997). Positive clones were analyzed for insert size and partially sequenced to identify them as hemocyanin coding sequences. A 1,500-bp insert (CmagX) was sequenced by creating overlapping nested deletions: the clone containing the CmagX cDNA fragment was digested with DNase I in the presence of Mn2+ (Lin, Lei, and Wilcox 1985; Terwilliger and Durstewitz 1996). The resulting population of plasmids contained overlapping nested deletions and was sequenced with the dideoxy method.

Results

The complete cDNA sequence coding for C. magister hemocyanin subunit 6 (Cmag6) was amplified in two overlapping fragments by PCR. The template was first-strand cDNA derived from hepatopancreas total RNA. The four primers used to amplify hemocyanin coding sequences were (1) a degenerate primer based on the unique N-terminal amino acid sequence of C. magister hemocyanin subunit 6 (5’ TCT-GCA-GGC-GGAGCG-TIC-GAC-GCG-CA 3’, “5’ sub 6 primer”), (2) an antisense primer based on the 3’ PCR product (5’ CAC-TGC-CTG-GGG-ATC-GAA-GCC-CTC-ATG 3’, “CuA II primer”), (3) a degenerate primer based on a conserved sequence within the copper A site (CuA) of arthropodan hemocyanin (5’ GAA-C’IT-TIT-TIT-TGGGTT-CAT-CAT-CAA-CTT-AC 3’, “CuA I primer”; Bak and Beintema 1987), and (4) a universal oligo-dT primer (see Terwilliger and Durstewitz 1996, fig. 2). Using the primer combinations 1 and 2 in one PCR reaction plus 3 and 4 in another, the experiments generated two overlapping cDNA fragments. Each fragment was blunt-end-cloned into a Bluescript II SK+ vector and sequenced. One fragment coded for the 5’ end, the other one for the 3’ end of hemocyanin subunit 6. The identical overlap between both clones was 133 bp.

FIG. 1. Cancer magister hemocyanin subunit 6; cDNA sequence and correct protein reading frame. Shaded areas, Cu-binding sites CuA and CuB. Asterisk indicates conserved histidine, presumably acting as copper ligand. Boxes indicate PCR primers. Boxed region labeled “primer 3” marks annealing site for universal crustacean CuA primer based on CuA site of Panulirus interruptus hemocyanin subunit a.

The complete cDNA sequence of C. magister hemocyanin subunit 6 (GenBank accession number U48881) with the correct protein reading frame (Cmag6) is shown in figure 1. The subunit is composed of 650 amino acid residues. The six histidines marked by an asterisk in figure 1 have been implicated in Cu binding and are highly conserved among other arthropodan hemocyanin subunits (Linzen et al. 1985; Beintema et al. 1994). The molecular weight of subunit 6, calculated from the amino acid residues, was determined to be 74,903 Da, as opposed to an estimate of 67,300 Da, based on its mobility on SDS-PAGE gels, by Larson et al. (1981). The subunit is acidic (isoelectric point [pI] = 5.02), and a glycosylation site (Asn-Thr-Ser) occurs at residue 600.

The sequence of another putative hemocyanin subunit, CmagX, was obtained from a C. magister hepatopancreas cDNA library. We call it a “putative hemocyanin subunit” for three reasons: (1) it was obtained through a cDNA library screen with a C. magister hemocyanin-specific probe, (2) it shows an extremely high degree of sequence similarity with Cmag6 (85% sequence identity), and (3) all of its potential Cu ligands are conserved.
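The Asn-Thr-Ser glycosylation site reported above for Cmag6 fits the general N-linked glycosylation sequon, Asn-X-Ser/Thr with X being any residue except proline. The sketch below scans a protein string for that motif. The function name is an illustrative choice, the fragment is a short stretch of the deduced Cmag6 sequence as printed in figure 1, and the reported positions are relative to the fragment rather than to the full-length numbering.

```python
import re

# N-X-S/T with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each candidate sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Fragment of the deduced Cmag6 sequence surrounding the reported site.
fragment = "GEADAAVEGLHENTSFNHYGCPDGTYP"
sites = find_sequons(fragment)
print(sites)  # [(13, 'NTS')]
```

A sequon match only marks a potential site; whether the residue is actually glycosylated has to be established experimentally.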
It is unknown, however, which hemocyanin subunit, if any, this clone represents, whether it might even code for a crustacean storage protein or prophenoloxidase, or whether it reflects an error in reverse transcription or second-strand cDNA synthesis. Aligned with Cmag6, its 484-amino-acid open reading frame shows a 191-residue deletion between Cmag6 residues 410 and 601. This extensive deletion extends from the C-terminal part of domain 2, beyond the second Cu-binding site (CuB), well into domain 3. However, all putative Cu-binding histidine residues are preserved. In addition to that, it shows a typical signal peptide of 21 hydrophobic residues at the N-terminal end, indicating that the gene product is targeted for secretion. This CmagX sequence and the as yet unpublished sequence of Penaeus vannamei hemocyanin subunit 1 (GenBank accession number X82502) are the first examples of hemocyanins with leader sequences (fig. 2). The hemocyanin e gene in the spider Eurypelma contains no sequence coding for a signal peptide (Voll and Voit 1990); whether other arthropodan hemocyanin subunits (including Cmag6) contain a signal peptide is not known.

A protein sequence alignment of C. magister hemocyanin subunit Cmag6 as well as CmagX with other members of the hemocyanin family is shown in figure 2. Alignment was done manually.
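Percent identity between two rows of such an alignment can be computed as sketched below. This is a minimal illustration, assuming the convention that columns containing a gap in either sequence are skipped; other conventions exist and give different values. The function name is hypothetical, and the two strings are a 40-residue stretch of the Cmag6 and CmagX rows as printed in figure 2, so the result describes that stretch only and not the overall identity quoted in the text.

```python
def percent_identity(row_a, row_b, gap_chars=".-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


cmag6 = "WFSLFNTRQREEALMLYDVLEHSTDWSTFAGNAAFFRVSM"
cmagx = "WFSLFNTRQRKEALMLYDVLEHSTDWSTFAGNAAFFRVHM"
identity = percent_identity(cmag6, cmagx)
print(identity)  # 95.0 (38 of 40 positions identical)
```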
It was our goal to include in our analysis representatives of all major groups thought to be within the hemocyanin family of proteins. Among these, the 02-transporting hemocyanins of arthropods and molluscs are respiratory proteins. Tyrosinases and prophenoloxidases (Lerch et al. 1986), both binuclear copper proteins, are enzymes involved in dopa and melanin biosynthesis, catalyzing the hydroxylation of monophenols and the oxidation of diphenols. Recent studies (Aspan et al. 1995) assign prophenoloxidases a key role in the arthropodan immune system. Another group of proteins, the hexamerins, is found in insect hemolymph. One of several functions assigned to 270 Durstewitz Cmag6 CmagX Pintc Pinta Penvl LimII Euryd Eurye Anda BombA MsexA Tni_M BombM PapPO DrpPO Octoe Hpomd NeuTy HumTy Cmag6 CmagX Pintc Pinta Penvl LimII Euryd Eurye Anda BombA MsexA Tni_M BombM PapPO DrpPO Octoe Hpomd NeuTy HumTy and Terwilliger . . . .. . . . . . . . . . . . . . . . . . . .ADCQAGDSADKLLAQKQHD~L~KLYGDIRDDHLKELGETFNPQGDLLLYHDNGAS~TL~DFK~RLLQKKH MRvLwL:;;LvAA::::::::::::::. ..DALGTGNAQKQQDINHLLDKIYEPTKYPDLKEIAENFNLLEQRH .AAFQVASADVQQQKDVLYLLNKIYGDIQDGDIQ~DLLAT~SFDPVG~GSYSDGG~VQKLVQD~DGKLLEQKH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .TLHDKQIRVCHLFEQLSSATVIGD...........GD..........KHKHSDRL~GKLQPGA . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . .TIADHQARILPLFKKLTSLSP...........DPLP...........EAERDPRLKGVGFLPRGT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .PDKQKQLRVISLFEHMTSIN............TPLP...........RDQID~LHHLGRLPQGE . . . . . . . . . . . . . . . . . . . . . . . . . . ..I.... TVADKQARLMPLFKHLTALTR...........EKLP...........LDQRDERLKGVGILPRGT MKSVLILAGLVAVALSSAVPKP.. 
.STIKSKNvDAvFvEKQKKILSFFQDVSQLNTDDEWKIGK MKTWILAGLVALALSSAVPPPKYQHHYKTSPVDAIFVEKQK~SLF~QLDY~EYYKIGKDYD~~.ID~S~~DFLLL~TG.FMPKGF MRVLVLVASLGLR..GSVVKDDTTWIGKDNMVTMDIKMKELCILKLLNHILKL~ILQPT~DDIREV~E~IE~.MDKYLKTD~KFIDTF~G.~PRGE MRVLVLLACLAAASASAISGGYGTMVFTKEPMVNLDMKMKELCIMKLLDHILQPTMFEDIKEIAKEYNIEKS.CDKYMNVDVVKQFMEMYKMG.MLPRGE MQVTQKLLRRDTE.................... .MADAQKQL..LYLFERPYDPINAPRADGSFLYAVAGAXTLLG MTNTDLKALELMFQRPLEPAFT.......... ..TRDSGKTVLELPDSFYTDRYRNDTEEVGNRFSKDVDLKQFSLFN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . .. . . . . . .. . . . . . .. . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . ...................................... ...................................... ...................................... WFSLFNTRQREEALMLYDVLEHSTDWSTFAGNAAFFRVSM ........ ..NEGEFVYALYAAVIHSELTQHVVLPPLYEVTPHLFTNSEVI....QEAY. WFSLFNTRQRKEALMLYDVLEHSTDWSTFAGNAAFFRVHM ........ ..NEGEFVYALYAAVIHSELTQHWLPPLYEVTPHLFTNSEVI....QEAY. WFSLFNTRQREEALMMHRVLMNCKNWHAFVSNAAYFRTNM ........ ..NEGEYLYALYVSLIHSGLGEGVVLPPLYEVl'PHMFTNSEVI....HEAY. WYSLFNTRQRKEALMLFAVLNQCKEWYCFRSNAAYFRERM ........ ..NEGEFVYALYVSVIHSKLGDGIVLPPLYQITPHMFTNSEVI....DKAY. WFSLFNTRHRNEALMLFDVLIHCKDWASFVGNAAYFRQKM ........ ..NEGEFVYALYVAVIHSSLAEQWLPPLYEVTPHLFTNSEVI....EEAY. IFSCFHPDHLEEARHLYEVFWEAGDFNDFIEIAKEARTFV ........ ..NEGLFAFAAEVAVLHRDDCKGLYVPPVQEIFPDKFIPSAAI....NEA F LFGSFHEEHLAEAIVFIEIIHDAKNFDDFLALATNARAW ........ ..NEGLYAFAMSVALLSRDDCNGWIPPIQEVFPDRFVPAETI....N'RAL. LFSCFHEEDLEEATELYKILYTAKDFDEVINLAKQSRTFV ........ ..NEGLFVYAVSVALLHRDDCKGIWPAIQEIFPDRFVPTETI....NLAV. LFSCFHARHLAEATELYVALYGAKDFNDFIHLCEQARQIV ........ ..NEGMFVYAVSVAVLHREDCKGITVPPIQEVFPDRFVPAETI....NRAN. EFSVFYDKMRDEAIALLDLFYYAKDFETFYKSACFARVHL ........ ..NQGQFLYAFYIAVIQRPDCHGFWPAPYEVYPKMFMNMEVL....QKIY. EFSIFYERMREEAIALFELFYYAKDFETFYKTASFARVHV ........ ..NEGMFLYAYYIAVIQRMDTNGLVLPAPYEVYPQYFTNMEVL....FKVD. VFVHTNELHLEQAVKVFKIMYSAKDFDVFIRTACWLRERI ........ 
..NGGMFVYALTACVFHRTDCRGITLPAPYEIYPYVFVDSHII....NKAF. TFVHTNELQMEEAVKVFRVLYYAKDFDVFMRTACWMRERI ........ ..NGGMFVYAFTAACFHRTDCKGLYLPAPYEIYPYFFVDSHVI....SKAF. RAPSVPRGAVFSFFIRSHREDLCD~~TQNSTDLMQ~S~R~.NENLFIYALSFTILRKQELRG~LPPILE~PHKFIPMEDLTSMQ~~ NRHREIASELITLFMSAPNLRQFVSLSVYTKDRVNPVL .............. ..FQYAYAVAVAHRPDTREVPITNISQIFPSNFVEPSAFRDARQEAS V ......................................................................... EGNEYLVRKNVERLSLSEMNSLIHAFR ....................................................................... DAVTVASHVRKDLDTLTAGEIESLRSAFL .STDIKFAITGVPTTPSSNGAVP.LRRELRDLQQNYPEQFNLYLLGLRDF APL~~~F~~T~~DDRESWP~~~~~~~~~~~~~~~~~~~~~~~~~~~~~TE~LL~IFDLSAPEKDKFFAYLT~KHTISSD~IPIGTYGQM~G 230 Cmag6 CmagX Pintc Pinta Penvl LimII Euryd Eurye Anda BonibA MsexA Tni_M BombM PapPO DrpPO Octoe Hpomd NeuTy HumTy -Cmag6 CmagX Pintc Pinta Penvl LimII Euryd Eurye Anda BombA MsexA Tni_M BombM PapPO DrpPO Octoe Hpomd NeuTy HumTy I FDAERLSNYLDPVDELHW.DDVIHEGFDPQAWK.YGCYFPD.... FDAERLSNYLDPVDELHW.DDVIHEGFAPHTMYK.YGGYFPS.RPDNVHFEDVDGVARVRDMLILESRIRDAIAHGYVTGRT....GSIISISDSH.... YDAERLSNHLDPVEELSW.NIDEGFAPHTAYK.YGGYFPS.RPD~FSD~GV~~DMSMTEDRIRDAIAHGYID~D....GSHIDI~SH.... FDFERLSNWLDPVDELHW.DRIIREGFAPLTSYK.YGGEFPV.RPDNIHFEDVDGVA~DLEITESRIHEAIDHGYITDSD....GHTIDIRQPK.... FDAERLSNYLDPVGELQW.NKPIVDGFAPHTTYK.YGGQFPA.RPD~FEDVDDV~IRD~I~SRIRDAIAHGYIVDSE....GKHIDISNEK..~~ YDCERLSNGMHRMLPFNN.FDEPLAGYAPHLTHV.ASGKYERILDSIHLG~ISED....GSHKTLDELH.... YDCERLSVGLQRMLPFQN.IDDELEGYSPHLSSL.VSGLSYGSRPAG~LRD.INDCSVQ.MER~ERILDAIHTGL~DSH....GKEIKITEEN.... YDCERLSNGMRRMIPFSN.FDEKLEGYSAHLTSL.VSGLPYAFRPDGLCLHD.LKDIDLKEMFR~ERILDAIDSG~IDNE....GHQ~LDIVD.... YDSERLSNGLQRMIPFHN.FDEPLEGYAPHLTSL.VSGLQYASRPEGYSIHD.LSDVDVQD~~ERILDAI~YI~KD....~KIPLDIEH.... YYFERLTNGLGKIPEFSW.YSPIKTGWPLMLTK..FTPFAF.....GQKIDFHDPK.... WLERLTNGLGEIPEFSW.YSP~TG~P.MLYG.SYYPFAQ.RP~YDI~D~EQIRFLDMFE~FLQYLQKGHF~F.....DKEINFHD~.... LRLERLSHEMCDIKSIMW.NEPLKTGYWPKIRLH.TGDEMPV.RSNNKIIVTKENVKVKRMLDDVERMLRDGILTGKIERRD....GTIINLKKAE.... 
MRLERLSHKMCDVKPMMW.NEPLETGYWPKIRLP.SGDEMPV.RQ~ATKDNL~KQ~DDVE~IREGILTGKIE~D....GTVISL~SE.... YDWERLSVNLNR VEKLENWRVPIPDGYFSKLTANNSGRPWGT.RQDNTFIKDIHQG YMLNRNGERVPLSDNVTT YNVERFCNNLKKVQPLNNLRVEVPEGYFPKILSSTNNRTY.RVTNQKLRDVDRHDGRVE... ISDVERWRDRVLAAIDQGYVEDSSGNRIPL.DEV.. . . .VPYWDWTRPISKIPDFIASEKYSDPFTKIEVYNPFNHGHIEQTDYCDF........... . . .VPYFDWISPIQKLPDLISKATYYNSREQRFDPNPFFSGKVA..GEDA~TRDPQPELF~.......YFYEQALY~EQDNFDDF........... DFRAPYFDWASQPPKGTLAFPESLSSRTIQWDVDGKTKSI~PLHRFTFHP~PSPGDFS~WSRYPST~YP~LTGASRDERIAPI~E~SL~ .FTIPYWDWRDAEKCDICTDE~GGQHPTNPNLLSPASFFSSWQIVCSRLEE~SHQSLCNGTPEGPL~PGNHDKSRTPRLPSSAD~FCLSLTQYES 3 la FIG. 2.-Sequence alignment of hemocyanin-type proteins. Residue numbers refer to Cmag6. Other numbers indicate protein domain. Shaded areas, Cu binding sites CuA and CUB. Cmag6, Pintc, Pinta, LimII, and Anda have not yet been examined for presence of leader sequence. Carboxy-terminal regions of Tni_M, BombM, and NeuTy not shown. Cmag6 = C. magister hemocyanin subunit 6, GenBank (GB) accession number U48881; CmagX = Possible C. magister hemocyanin subunit with deletion between residues 410 and 601; Pintc = Punulirus interruptus hemocyanin subunit c, SwissProt (SP) accession number P80096; Pinta = Punulirus interruptus hemocyanin subunit a, SP number P04254; Penvl = Penueus vunnumei hemocyanin subunit 1, GB number X82502; LimII = Limulus polyphemus hemocyanin subunit II, SP number P04253; Euryd = Eurypelmu culifornicu hemocyanin subunit d, SP number P02241; Eurye = Eurypelmu culifornicu hemocyanin Phylogeny Cmag6 CmagX Pintc Pinta Penvl LimII Euryd Eurye Anda BombA MsexA Tni_M BombM PapPO DrpPO Octoe Hpomd NeuTy HumTy of Hemocyanm Gene Family 271 . . .GIDLLGDVIESSLYSP.N... . . . GIDVLGDVIESSLYSP.N... . . . GIEFLGDIIESSGYSA.N... . . .GIELLGDIIESSKYSS.N... . . .GIDILGDIIESSLYSP.N... .GTDILGALVESSYESV.N... 1:.GINVIGALIESSHDSV.N... .GINVLGALIESSFETK.N... ::.GTDILGDIIESSDESK.N... . . .AINFVGNYWQDNADLY.G... . . .AVNFVGNYWQANADLY.N... .DVEHLARLLLGGMGLV.G... 
[FIG. 2, continued: aligned sequence blocks for Cmag6, CmagX, Pintc, Pinta, Penvl, LimII, Euryd, Eurye, Anda, BombA, MsexA, Tni_M, BombM, PapPO, DrpPO, Octoe, Hpomd, NeuTy, and HumTy.]
subunit e, GB number X16650; Anda = Androctonus australis hemocyanin subunit 6, SP number P80476; BombA = Bombyx mori storage protein 2, GB numbers M24370, J04829; MsexA = Manduca sexta arylphorin subunit alpha, GB numbers M28396, J05092, J05093; TniM = Trichoplusia ni basic juvenile hormone sensitive hemolymph protein 1, GB number L03280; BombM = Bombyx mori sex-specific storage protein 1, GB numbers X12978, J03722; PapPO = Pacifastacus leniusculus prophenoloxidase, GB number X83494; DrpPO = Drosophila melanogaster prophenoloxidase, GB number D45835; Octoe = Octopus dofleini hemocyanin domain e, GB number M57288; Hpomd = Helix pomatia hemocyanin domain d, SP number P12031; NeuTy = Neurospora crassa tyrosinase, GB numbers M33271, J05052; HumTy = Human tyrosinase, GB number M74314.

these hexamerins is that of storage proteins during insect metamorphosis (Telfer and Kunkel 1991). Hexamerins include the arylphorins, proteins rich in aromatic amino acids, and the methionine-rich storage proteins. Although structurally similar to arthropodan hemocyanins (Markl and Winter 1989), hexamerins contain no copper. Indices of structural features of the arthropodan proteins aligned in figure 2 were predicted by the PEPTIDESTRUCTURE program (GCG Sequence analysis software package, Devereux, Haeberli, and Smithies 1984, data not shown). As could be expected for a globular protein occurring freely dissolved in the hemolymph, Cmag6 shows no extensive hydrophilic or hydrophobic domains. Indices for all three structural features (hydrophilicity, surface probability based on amino acid side-chain solvent accessibilities, and regional backbone flexibility) are strikingly similar among all crustacean hemocyanins as well as among the other arthropodan subgroups.
These subgroups include chelicerate hemocyanins (Euryd, Eurye, Anda6, and LimII), methionine-rich storage proteins (BombM and TniM), arylphorins (BombA and MsexA), and prophenoloxidases (PapPO and DrpPO). Some motifs appear to be conserved in all arthropodan hemocyanin-type proteins. None of the three indicators suggests any structural homology between the arthropodan proteins mentioned above on the one hand and the molluscan hemocyanins and tyrosinases on the other. The high degree of sequence similarity among arthropodan hemocyanins (30%-70% sequence identity) suggests a common tertiary structure. X-ray crystallography of hemocyanin from Panulirus and Limulus (Volbeda and Hol 1989b; Hazes et al. 1993) has shown that arthropodan hemocyanins consist of three domains. Domain 1 (residues 1-174 in C. magister subunit 6) is quite variable and mainly α-helical in structure. Domain 2 (residues 175-399 in C. magister) contains the oxygen-binding CuA and CuB sites and is the most conserved part of the protein. CuA and CuB each consist of an antiparallel helix pair containing three Cu-binding histidine residues. In C. magister subunit 6, the CuA helix pair extends from residue 186 to residue 200 (helix 2.1) and from residue 215 to residue 239 (helix 2.2). The Cu-binding histidines are located at positions 192 and 196 (helix 2.1) as well as 224 (helix 2.2). The CuB helix pair extends from residue 341 to residue 353 (helix 2.5) and from residue 378 to residue 396 (helix 2.6). Its Cu-binding histidines are located at positions 343 and 347 (helix 2.5) as well as 383 (helix 2.6). Domain 3 is rich in β-sheets and forms a β-barrel structure (Hazes and Hol 1992). Phylogenetic analysis using parsimony was performed with the PAUP program (Swofford 1991). Sequence comparison (fig. 2) revealed varying degrees of sequence similarity among taxa. Amino acid sequence identity between Cmag6 and CmagX was 85%.
Between any two crustacean hemocyanins it was approximately 60%, with chemical similarity of over 80%. Sequence identity among chelicerate hemocyanins ranged from 53% to 65%; among molluscan hemocyanins it was 42%.

FIG. 3.-Single most-parsimonious unrooted tree of the proteins aligned in figure 2. Gaps are treated as missing data. Total tree size: 5,045 substitutions. Branch lengths are proportional to number of substitutions (indicated above branches). Bootstrap values (500 replicates) are indicated below branches.

The single most-parsimonious phylogenetic tree consistent with the data set (total size = 5,045 substitutions) is shown in figure 3. It was obtained through a heuristic search algorithm treating gaps as missing data. Various search options (simple and random addition, branch and tree swapping) gave the same result. The resulting tree represents a molecular phylogeny of hemocyanin-class proteins, not a phylogeny of the involved species. Sequence alignment of the functionally important CuA and CuB sites (fig. 4) illustrates several points: (1) The histidine ligands are conserved in those proteins that bind Cu, i.e., the arthropodan and molluscan hemocyanins, the tyrosinases, and the prophenoloxidases. In the non-Cu-binding insect hexamerins these residues are not conserved, although the overall sequence similarity of the hexamerins to crustacean hemocyanins is high. (2) The CuB site is the only region that exhibits significant homologies in all taxa surveyed, including the molluscan hemocyanins and tyrosinases. This suggests a common origin for at least part of the molecule. (3) The CuA site is either of the arthropodan or the molluscan type. Sequence homology between these types is marginal at best. All arthropod proteins in this study form a monophyletic group relative to molluscan hemocyanins and tyrosinases.
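The tree length and bootstrap percentages reported for figure 3 can be illustrated with a minimal sketch. It is not the PAUP procedure used in this study, which ran a heuristic search over 19 taxa. The sketch assumes only four taxa, for which the three possible unrooted topologies can be enumerated exhaustively. It scores each topology by Fitch parsimony with gaps treated as missing data, and reports the fraction of column-resampled replicates in which each topology is uniquely shortest. All function names and the toy sequences are hypothetical.

```python
import random
from collections import Counter

GAPS = "-."  # gap symbols, treated as missing data
# The three unrooted topologies for taxa 0..3, each written as a pair of cherries.
QUARTET_SPLITS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def _join(x, y):
    """Fitch step for two child state sets; None means 'missing, any state'."""
    if x is None:
        return y, 0
    if y is None:
        return x, 0
    common = x & y
    return (common, 0) if common else (x | y, 1)


def fitch_length(column, split):
    """Minimum number of substitutions for one alignment column on a quartet tree."""
    states = [None if ch in GAPS else {ch} for ch in column]
    (a, b), (c, d) = split
    left, cost_left = _join(states[a], states[b])
    right, cost_right = _join(states[c], states[d])
    _, cost_mid = _join(left, right)
    return cost_left + cost_right + cost_mid


def tree_length(columns, split):
    """Total substitutions over all columns (the 'tree size' of a topology)."""
    return sum(fitch_length(col, split) for col in columns)


def bootstrap_support(alignment, replicates=500, seed=0):
    """Fraction of bootstrap replicates in which each topology is uniquely shortest."""
    rng = random.Random(seed)
    n_cols = len(alignment[0])
    columns = [tuple(seq[j] for seq in alignment) for j in range(n_cols)]
    wins = Counter()
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        sample = [columns[rng.randrange(n_cols)] for _ in range(n_cols)]
        lengths = [tree_length(sample, s) for s in QUARTET_SPLITS]
        shortest = min(lengths)
        if lengths.count(shortest) == 1:  # tied replicates support no topology
            wins[lengths.index(shortest)] += 1
    return [wins[i] / replicates for i in range(len(QUARTET_SPLITS))]


# Invented toy alignment of four sequences (not taken from figure 2).
toy = ["MKHLAVDEHW", "MKHLAVDDHW", "MRHIAVEDHF", "MRH-AVEDHF"]
for split, value in zip(QUARTET_SPLITS, bootstrap_support(toy)):
    print(split, f"{100 * value:.0f}%")
```

With more than a handful of taxa, exhaustive enumeration becomes impractical, which is why a heuristic search is needed for a data set of the size analyzed here.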
FIG. 4.-Sequence conservation in the copper-binding sites of hemocyanin-type proteins. Top, CuA site; bottom, CuB site. Numbering of residues is according to Cmag6. Residues conserved in more than half of the taxa or in one complete group of taxa are boxed. Asterisk indicates conserved histidine, presumably acting as copper ligand.

Discussion

The phylum Arthropoda is composed of three major taxa: the Chelicerata, the Insecta, and the Crustacea. Traditionally, the latter two have been considered to be more closely related and were grouped together as Mandibulata (Remane, Storch, and Welsch 1980, p. 227). This relationship is supported by 18S rRNA (Turbeville et al. 1991; Garey et al. 1996) and mitochondrial 12S rRNA sequence comparisons (Ballard et al. 1992).
Three minor taxa, the Myriapoda, the Onychophora, and the Tardigrada, are placed at the base of the arthropod lineage by these studies, a placement that also is supported by their greater morphological similarity to the annelids. However, this phylogeny is by no means certain, and the problem is compounded by the question of whether the arthropods form a monophyletic group at all or arose from annelid-like ancestors in several independent lineages.

FIG. 5.-Possible evolutionary relationships between respiratory proteins (based on Volbeda and Hol 1989a; van Holde and Miller 1995; Durstewitz 1996).

The most parsimonious phylogenetic tree of the 19 taxa aligned in figure 2 indicates four monophyletic groups within the arthropods: the crustacean hemocyanins, the insect hexamerins, the chelicerate hemocyanins, and the prophenoloxidases (fig. 3). These arthropodan proteins are clearly monophyletic with respect to the molluscan hemocyanins and tyrosinases. These conclusions are supported by very robust nodes in the phylogenetic tree as indicated by bootstrap values well over 80%. They are also in agreement with the comparison of predicted structural parameters (data not shown) that suggests a significant degree of structural conservation among the arthropod proteins, but not between the arthropodan and the molluscan groups. However, parsimony analysis fails to resolve the relative arrangement of crustacean and chelicerate hemocyanins, hexamerins, and prophenoloxidases within the arthropod lineage, as indicated by low bootstrap values (58% and 63%) for the two major arthropodan branches in figure 3.
These data suggest that (1) the common ancestor of all arthropodan hemocyanins, hexamerins, and prophenoloxidases was a Cu-binding arthropodan hemocyanin-type protein and (2) the insect hexamerins lost their Cu-binding capabilities after the insects diverged from the crustaceans, presumably due to the development of the tracheal system that made respiratory proteins obsolete. Aspan et al. (1995) recently discovered certain sequence similarities between arthropodan prophenoloxidases and arthropodan hemocyanins. The prophenoloxidases represent nonhemocyanin proteins that bind copper and occur in both insects (DrpPO) and crustaceans (PapPO). Although identical in function to tyrosinases (NeuTy and HumTy), their sequences show only a slight resemblance to them. Instead, prophenoloxidases appear to be most closely related to the hexamer-type family of arthropodan proteins and feature a typical arthropodan CuA site. Tyrosinases from most nonarthropod phyla of the animal kingdom as well as from plants, fungi, and procaryotes contain a CuA site of the mollusc type (van Holde and Miller 1995). It is therefore reasonable to assume that after the arthropods and molluscs diverged, an ancestral arthropod-type binuclear Cu protein evolved into four classes of proteins: crustacean hemocyanins, chelicerate hemocyanins, arthropodan prophenoloxidases, and, through loss of Cu, insect hexamerins. Prophenoloxidases, with two functional copper sites, are structurally more similar to arthropodan hemocyanins than are the hexamerins. Detection of prophenoloxidase in the tracheal cuticle of insects (which are thought not to have respiratory proteins) has suggested the fascinating possibility of a respiratory function for prophenoloxidases in that taxon (Kawabata et al. 1995).
The length of the branches leading to both the insect and crustacean prophenoloxidases illustrates the long independent evolutionary history of these proteins and suggests that they diverged from the other lineages early in arthropodan evolution. The multitude of different subunit types found in crustacean and chelicerate hemocyanins is probably the result of gene duplications that occurred independently after these taxa diverged (Neuteboom et al. 1990). This split occurred about 600 MYA, during the early Cambrian, after the arthropods and molluscs diverged. The absence of a true outgroup (hemocyanin occurs only in arthropods and molluscs) makes speculation about the relationship of arthropodan and molluscan hemocyanins difficult. The CuB sites are homologous and not the result of convergence (van Holde and Miller 1995). This means that all arthropodan hemocyanins, hexamerins, and prophenoloxidases share a common ancestor with the molluscan hemocyanins and tyrosinases. The tyrosinases in particular appear to be phylogenetically very old, because they are found in animals, plants, fungi, and even procaryotes, and the degree of sequence similarity between human and procaryotic (Streptomyces) tyrosinase, for example, is remarkable. Since there apparently are no procaryotic hemocyanins, it seems reasonable to assume that molluscan hemocyanins arose from tyrosinase-like ancestors. A speculative model of hemocyanin evolution is given in figure 5. Our analysis supports the view that both arthropodan and molluscan hemocyanins arose from a common ancestral Cu protein (Durstewitz 1996). Whether this ancestor was mono- or binuclear cannot be decided from our data. van Holde and Miller (1995) assume a common origin for the arthropodan and molluscan CuB sites and consider the arthropodan CuA site a result of gene duplication and fusion in that lineage.
This notion is supported by the fact that in arthropods, the CuA site is very similar in sequence and structure (HXXXH for the first two Cu ligands) to the CuB site (Volbeda and Hol 1989a), while in molluscs it is not. The ancestral arthropodan hemocyanin would, then, be a binuclear Cu-binding protein, corresponding roughly to domain 2 of today’s arthropodan hemocyanin. Domains 1 and 3 would have been added later, following an evolutionary trend to provide sites for allosteric regulation and multisubunit cooperativity. In this scenario, the CuA site of molluscan hemocyanins and tyrosinases is of separate origin from the CuA site of arthropods, and the molluscan hemocyanins are the fusion product of this uniquely molluscan CuA peptide and a CuB site shared with the arthropods. The weak tyrosinase activity of molluscan hemocyanins (Nakahara, Suzuki, and Kino 1983; Salvato et al. 1983; Markl and Decker 1992) is further evidence for a common origin of tyrosinases and molluscan hemocyanins. The huge multidomain hemocyanins of modern-day molluscs would have arisen from this monomeric binuclear Cu-protein through a series of gene duplication and fusion events. A recent report of residual o-diphenol oxidase activity in crustacean hemocyanins, particularly in the dissociated subunit, under nonphysiological conditions (Zlateva et al. 1996) will provide new opportunities to explore the evolutionary relationships between the molluscan hemocyanins and nonarthropodan tyrosinases on the one hand and the arthropodan hemocyanins and prophenoloxidases (tyrosinases) on the other hand. Two approaches were used in this study to investigate hemocyanin evolution. The results of both parsimony analysis of a protein sequence alignment and comparison of conserved structural features are consistent with a monophyletic origin of arthropodan and molluscan hemocyanins.
Homology between the two, however, is limited to a small portion of the molecule, and both classes of proteins, the cylindrical molluscan hemocyanins and the arthropodan hexamer-type proteins, diverged during the early stages of life on this planet (and have evolved quite differently ever since). Prophenoloxidases and hexamerins are proteins homologous to arthropodan hemocyanin that have left oxygen binding to hemocyanin and taken on a variety of other functions within the animal.

Acknowledgments

We thank Robert Hanner for his valuable ideas on phylogenetic reconstruction. This study was supported by NSF grants DCB 89-08362 and IBN 92-17530 to N.B.T.

LITERATURE CITED

ASPAN, A., T-S. HUANG, L. CERENIUS, and K. SÖDERHÄLL. 1995. cDNA cloning of prophenoloxidase from the freshwater crayfish Pacifastacus leniusculus and its activation. Proc. Natl. Acad. Sci. USA 92:939-943.
BAK, H. J., and J. J. BEINTEMA. 1987. Panulirus interruptus hemocyanin: the elucidation of the complete amino acid sequence of subunit a. Eur. J. Biochem. 169:333-348.
BALLARD, J. W. O., G. J. OLSEN, D. P. FAITH, W. A. ODGERS, D. M. ROWELL, and P. W. ATKINSON. 1992. Evidence from 12S ribosomal RNA sequences that Onychophorans are modified arthropods. Science 258:1345-1348.
BEINTEMA, J. J., W. T. STAM, B. HAZES, and M. P. SMIDT. 1994. Evolution of arthropod hemocyanins and insect storage proteins (hexamerins). Mol. Biol. Evol. 11:493-503.
DEVEREUX, J., P. HAEBERLI, and O. SMITHIES. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12:387-395.
DREXEL, R., S. SIEGMUND, H. J. SCHNEIDER, B. LINZEN, C. GIELENS, G. PREAUX, J. KELLERMANN, and F. LOTTSPEICH. 1987. Complete amino acid sequence of a functional unit from a molluscan hemocyanin (Helix pomatia). Biol. Chem. Hoppe-Seyler 368:617-635.
DURSTEWITZ, G. 1996. Molecular studies of hemocyanin expression in the Dungeness crab. Dissertation, University of Oregon, Eugene, Oreg.
DURSTEWITZ, G., and N. B. TERWILLIGER. 1997. Developmental changes in hemocyanin expression in the Dungeness crab, Cancer magister. J. Biol. Chem. (in press).
GAREY, J. R., M. KROTEC, D. R. NELSON, and J. BROOKS. 1996. Molecular analysis supports a tardigrade-arthropod association. Invertebr. Biol. 115:79-88.
HAZES, B., and W. G. J. HOL. 1992. Comparison of the hemocyanin β-barrel with other greek key β-barrels: possible importance of the “β-zipper” in protein structure and folding. Proteins 12:278-298.
HAZES, B., K. A. MAGNUS, C. BONAVENTURA, J. BONAVENTURA, Z. DAUTER, K. H. KALK, and W. G. J. HOL. 1993. Crystal structure of deoxygenated Limulus polyphemus subunit II hemocyanin at 2.18 Å resolution: clues for a mechanism for allosteric regulation. Protein Sci. 2:597-619.
KAWABATA, T., Y. YUSAHARA, M. OCHIAI, S. MATSUURA, and A. MASAAKI. 1995. Molecular cloning of insect pro-phenol oxidase: a copper-containing protein homologous to arthropod hemocyanin. Proc. Natl. Acad. Sci. USA 92:7774-7778.
KYTE, J., and R. F. DOOLITTLE. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.
LARSON, B. A., N. B. TERWILLIGER, and R. C. TERWILLIGER. 1981. Subunit heterogeneity of Cancer magister hemocyanin. Biochim. Biophys. Acta 667:294-302.
LERCH, K., M. HUBER, H.-J. SCHNEIDER, R. DREXEL, and B. LINZEN. 1986. Different origins of metal binding sites in binuclear copper proteins tyrosinase and hemocyanin. J. Inorg. Biochem. 26:213-217.
LIN, H. C., S. LEI, and G. WILCOX. 1985. An improved DNA sequencing strategy. Anal. Biochem. 147:114-119.
LINZEN, B., N. M. SOETER, A. F. RIGGS et al. (17 co-authors). 1985. The structure of arthropod hemocyanins. Science 229:519-524.
MARKL, J., and H. DECKER. 1992. Molecular structure of the arthropod hemocyanins. Pp. 325-376 in C. P. MANGUM, ed. Advances in comparative and environmental physiology. Vol. 13. Springer Verlag, Berlin, Heidelberg, New York.
MARKL, J., and S. WINTER. 1989. Subunit-specific monoclonal antibodies to tarantula hemocyanin, and a common epitope shared with calliphorin. J. Comp. Physiol. B 159:139-151.
MUNN, E. A., and G. D. GREVILLE. 1969. The soluble proteins of developing Calliphora erythrocephala, particularly calliphorin, and similar proteins in other insects. J. Insect Physiol. 15:1935-1950.
NAKAHARA, A., S. SUZUKI, and J. KINO. 1983. Tyrosinase activity of squid hemocyanin. Pp. 319-322 in E. J. WOOD, ed. Structure and function of invertebrate respiratory proteins. Life Chemistry Reports Supplement 1. Harwood, London.
NEUTEBOOM, B., P. A. JEKEL, R. M. W. HOFSTRA, S. J. SIERDSEMA, and J. J. BEINTEMA. 1990. Structure, function and evolution of crustacean hemocyanins. Pp. 85-88 in G. PREAUX and R. LONTIE, eds. Invertebrate dioxygen carriers. Leuven University Press, Leuven, Belgium.
OCHIAI, E. I. 1983. Copper and the biological evolution. Biosystems 16:81-86.
REMANE, A., V. STORCH, and U. WELSCH. 1980. Systematische Zoologie. Gustav Fischer Verlag, Stuttgart and New York.
SALVATO, B., G. JORI, A. PIAZZESE, F. GHIRETTI, M. BELTRAMINI, and K. LERCH. 1983. Enzymic activities of type-3 copper pair in Octopus vulgaris hemocyanin. Pp. 313-317 in E. J. WOOD, ed. Structure and function of invertebrate respiratory proteins. Life Chemistry Reports Supplement 1. Harwood, London.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning. 2nd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
SWOFFORD, D. L. 1991. PAUP: phylogenetic analysis using parsimony. Version 3.1. Illinois Natural History Survey, Champaign, Ill.
TELFER, W. H., and J. G. KUNKEL. 1991. The function and evolution of insect storage hexamers. Annu. Rev. Entomol. 36:205-228.
TELFER, W. H., and H. C. MASSEY. 1987. A storage hexamer from Hyalophora that binds riboflavin and resembles the apoprotein of hemocyanin. Pp. 305-314 in J. H. LAW, ed. UCLA Symposium on Molecular and Cellular Biology, New Series. Vol. 49. Liss, New York.
TERWILLIGER, N. B., and A. C. BROWN. 1993. Ontogeny of hemocyanin function in the Dungeness crab Cancer magister: the interactive effects of developmental stage and divalent cations on hemocyanin oxygenation properties. J. Exp. Biol. 183:1-13.
TERWILLIGER, N. B., and G. DURSTEWITZ. 1996. Molecular studies of the sequential expression of a respiratory protein during crustacean development. Pp. 353-368 in J. D. FERRARIS and S. R. PALUMBI, eds. Molecular zoology: advances, strategies and protocols. Wiley-Liss, New York.
TERWILLIGER, N. B., and R. C. TERWILLIGER. 1982. Changes in the subunit structure of Cancer magister hemocyanin during larval development. J. Exp. Zool. 221:181-191.
TURBEVILLE, J. M., D. M. PFEIFER, K. G. FIELD, and R. A. RAFF. 1991. The phylogenetic status of arthropods, as inferred from 18S rRNA sequences. Mol. Biol. Evol. 8:669-686.
VAN HOLDE, K. E., and K. I. MILLER. 1995. Hemocyanins. Adv. Protein Chem. 47:1-81.
VOLBEDA, A., and W. G. J. HOL. 1989a. Pseudo 2-fold symmetry in the copper-binding domain of arthropodan hemocyanins. J. Mol. Biol. 206:531-546.
-. 1989b. Crystal structure of hexameric hemocyanin from Panulirus interruptus refined at 3.2 Å resolution. J. Mol. Biol. 209:249-279.
VOLL, W., and R. VOIT. 1990. Characterization of the gene encoding the hemocyanin subunit e from the tarantula Eurypelma californicum. Proc. Natl. Acad. Sci. USA 87:5312-5316.
ZLATEVA, T., P. DI MURO, B. SALVATO, and M. BELTRAMINI. 1996. The o-diphenol oxidase activity of arthropod hemocyanin. FEBS Lett. 384:251-254.

CLAUDIA KAPPEN, reviewing editor
Accepted November 11, 1996