Molecular characterisation of arsC gene involved in accumulation of arsenics in Lysinibacillus sphaericus B1-CDA

Master Degree Project in Molecular Biology
MB701A HT15 (30 ECTS), 2015-05-11 to 2015-10-09, Version 3
Chandini Murarilal Ratnadevi
Supervisor: Prof. Abul Mandal
Examiner: Magnus Fagerlind (PhD)
School of Bioscience, University of Skövde, Box 428, SE 54128

Abstract

It has previously been reported that Lysinibacillus sphaericus B1-CDA contains several arsenic-responsive genes, such as acr3, arsC, arsR, and arsB, that confer arsenic tolerance. The objective of this study was to characterize the molecular function of the arsC gene in Lysinibacillus sphaericus B1-CDA by in silico and in vitro analysis. For the in silico studies, the Iterative Threading Assembly and Refinement (I-TASSER) server predicted the 3D model of the arsC protein. The predicted 3D structure of the protein exhibited six ligand binding site residues (Tyr-8, Cys-11, Thr-13, Arg-95, Gly-107, Phe-108) that bind to arsenic. The in silico prediction that arsC is responsive to arsenic was validated by in vitro experiments. The arsC gene present in the genome of B1-CDA was transferred to the arsC-deficient mutant strain Escherichia coli JW3470-1 in complementation studies. The presence of the arsC gene in the transgenic strain was verified by PCR and DNA sequencing. The mutant and transgenic strains were exposed to 50 mM arsenic for 96 hours and their growth rate was measured with a spectrophotometer. The results showed a statistically significant difference (p<0.05) in growth rate between the mutant and transgenic strains. Further preliminary analysis by Inductively Coupled Plasma-Mass Spectroscopy (ICP-MS) demonstrated that the level of arsenic accumulation in the cells of the transgenic strain was 1.92 times higher than in the mutant strain of Escherichia coli JW3470-1 after 24 hours of exposure to arsenic.
These results show that Lysinibacillus sphaericus B1-CDA can be utilized as a potential organism for the removal of arsenic from arsenic-contaminated environments.

Popular scientific summary

Arsenic, a colorless, odorless toxic metalloid from the nitrogen family, occurs in many allotropic forms. In its inorganic form, arsenic exists primarily in two redox states: arsenate, the oxidized form, and arsenite, the reduced form. Arsenite disrupts protein structure by interfering with the sulfhydryl groups in amino acids. Arsenate is a phosphate analog and disrupts the cellular processes that involve phosphate uptake. Exposure to arsenic occurs mainly through ingestion of contaminated groundwater, inhalation of air contaminated by sprayed arsenical pesticides, and absorption through the skin from contaminated soil and crops. Arsenic exposure causes arsenic poisoning, which has become a major global concern. Millions of people suffer from cancers, keratosis and neural system damage due to arsenic poisoning as a result of consuming arsenic-contaminated water and food. The U.S. Environmental Protection Agency (U.S.E.P.A) has classified arsenic as a potential carcinogenic agent. The World Health Organization (WHO) has set the permissible level of arsenic in drinking water at 0.01 mg/L. To overcome the hazardous effects of arsenic poisoning, researchers have suggested mitigation and remediation processes. Mitigation involves finding a water source free of arsenic, and remediation is defined as eliminating arsenic from a contaminated water source before consumption. These two processes can be achieved by two kinds of methods: 1) physicochemical methods and 2) biological methods. The physicochemical methods are expensive, ecologically unfriendly and inefficient. Hence, researchers have preferred biological methods, which use microorganisms for the bioremediation of arsenic and are inexpensive, efficient and ecologically friendly.
The microorganisms involved in the bioremediation of arsenic from contaminated environments can utilize arsenic because they carry arsenic tolerance genes that catalyze various detoxification pathways. Lysinibacillus sphaericus B1-CDA is a soil-borne bacterial strain that has been found to contain several arsenic-responsive genes, such as acr3, arsC, arsR and arsB, that confer arsenic tolerance. The arsenate reductase gene (arsC) is involved in the reduction of arsenate to arsenite. The aim of this study was to clone the arsC gene of Lysinibacillus sphaericus B1-CDA into the Escherichia coli JW3470-1 mutant strain, which lacks the arsC gene, and to characterize it. The presence of the arsC gene renders the organism tolerant to arsenic; therefore, the arsC gene could be used in the removal of arsenic from an arsenic-contaminated environment. In silico analysis, performed on a computer with the Iterative Threading Assembly and Refinement (I-TASSER) server, predicted the 3D structure and putative function of the arsC protein. In vitro (test tube) experiments were performed to confirm the in silico results. The arsC gene from B1-CDA, approximately 350 bp, was isolated and transformed into the mutant Escherichia coli JW3470-1 strain lacking the arsC gene, resulting in transgenic Escherichia coli JW3470-1 strains. The arsC gene insert in the transgenic strain was confirmed by colony PCR and DNA sequence analysis. Reverse Transcriptase-PCR (RT-PCR) was performed to check the expression of the arsC gene in the transgenic strain. Arsenic tolerance assays on the mutant and transgenic Escherichia coli JW3470-1 strains established 50 mM sodium arsenate as the working concentration. The growth of the bacterial strains was measured with a spectrophotometer. A statistical analysis by the Mann-Whitney U test indicated that the transgenic strain had a significantly higher growth rate (p<0.05) than the mutant strain.
Inductively Coupled Plasma-Mass Spectroscopy (ICP-MS) analysis of arsenic accumulation from the liquid medium was performed for the transgenic and mutant strains of Escherichia coli JW3470-1 cultured in 50 mM sodium arsenate for time intervals ranging from 24 hours to 96 hours. The transgenic strain accumulated more arsenic from the liquid medium than the mutant strain due to the presence of the arsC gene. Therefore, these results suggest that Lysinibacillus sphaericus B1-CDA can be employed as a suitable bioremediation organism for detoxifying arsenic from arsenic-contaminated environments.

Abbreviations

As          Arsenic
As (III)    Arsenite
As (V)      Arsenate
Ars         Arsenic resistance
aa          Amino acids
ATP         Adenosine triphosphate
bp          Base pair
BLAST       Basic local alignment search tool
BLASTn      Nucleotide BLAST
dNTPs       Deoxynucleotide triphosphates
DNA         Deoxyribonucleic acid
EC          Enzyme Commission
EDTA        Ethylenediaminetetraacetic acid
GO          Gene Ontology
ICP-MS      Inductively coupled plasma mass spectroscopy
kbp         Kilobase pair
kDa         Kilodalton
LB          Luria Bertani
LOMETS      Locally installed meta-threading-server
Mg2+        Magnesium
MIC         Minimum inhibitory concentration
NCBI        National Center for Biotechnology Information
NEB         New England Biolabs
OD          Optical density
PCR         Polymerase chain reaction
PDB         Protein Data Bank
PSI-BLAST   Position specific iterated BLAST
PSI-PRED    PSI-BLAST based secondary structure prediction
REMO        Reconstruct atomic model
RMSD        Root-mean-square deviation
RNA         Ribonucleic acid
RT-PCR      Reverse transcriptase - polymerase chain reaction
Sp.         Species
SVMSEQ      Support vector machine sequence
SN          Serial number
TAE         Tris acetate EDTA
Tm          Melting temperature
TM-Score    Template modeling score
U.S.E.P.A   United States Environmental Protection Agency
WHO         World Health Organization
3D          Three-dimensional

Table Of Contents

Introduction
Aim
Materials and Methods
Results
Discussion
Ethical Aspects and Impact On The Society
Future Perspectives
Acknowledgements
References
Appendices

Introduction

Arsenic, a natural component of the earth's crust, is a toxic metalloid of geochemical and anthropogenic origin that exists in soluble form as pentavalent arsenate As(V) and trivalent arsenite As(III). Depending on the environmental pH, arsenic exists as the arsenate oxyanions H₂AsO₄⁻ and HAsO₄²⁻ and as arsenite in the non-ionic form H₃AsO₃ or as H₂AsO₃⁻.
Both chemical forms of arsenic are harmful, but trivalent arsenite As(III) is 25-60 times more toxic than pentavalent arsenate As(V) (Xiong et al., 2012; Oremland et al., 2009; Silver and Phung, 2005). Arsenate predominantly occurs in oxidized waters, whereas arsenite is present in reduced environments such as hot spring biofilms (Duquesne et al., 2008; Oremland and Stolz, 2005; Santini et al., 2000). Owing to high levels of contamination of soil, crops, and water, arsenic poisoning has emerged as a concern in many parts of the globe (Matilda et al., 2010). In several countries, including India, Bangladesh, China, Chile, Argentina, Japan, Mongolia, Mexico, Nepal, Taiwan, Poland, and Vietnam, and in some areas of the USA, the groundwater has been found to contain arsenic at concentrations above the permissible levels (Nordstrom 2002). The U.S. Environmental Protection Agency (U.S.E.P.A) has classified arsenic as a potential carcinogenic agent (Smith et al., 1992). The World Health Organization (WHO) has established a guideline of 0.01 mg/L for the permissible level of arsenic in drinking water (WHO, 1993) to decrease the health risks associated with consuming arsenic-contaminated drinking water, but several third world countries follow the earlier WHO guideline of 0.05 mg/L (WHO, 1963) for economic reasons (Singh et al., 2011). The drinking water supplies in the region of the Brahmaputra in Bangladesh have been heavily contaminated with naturally occurring arsenic, and the WHO has described this as a mass poisoning of the population (Smith et al., 2006). Arsenic toxicity has been associated with inhibition of energy flow and inactivation of enzymes involved in DNA repair, DNA replication, and phospholipid and nucleic acid synthesis (Banerjee et al., 2011; Hughes, 2002; Lynn et al., 1997).
Intake of arsenic-contaminated food and water and inhalation of air contaminated by sprayed arsenical pesticides have led to a widespread increase in neural system damage and in cancers of the lung, bladder, kidney, and skin; high levels of arsenic in drinking water also cause hyperpigmentation of the skin and keratosis of the feet and hands, which rapidly advance to skin cancer (Argos et al., 2010; Marshall et al., 2007; Ng et al., 2003). Globally, an estimated 100 million people are severely exposed to arsenic through drinking water contaminated with high arsenic levels (Singh et al., 2011; Ahsan et al., 2006; Bang et al., 2005). Local atmospheric deposition and the interaction of natural drinking water with sediments, bedrock and soils influence arsenic cycling globally (Meliker et al., 2007). It has been reported that the redox characteristics of aquifer sediments and bedrock naturally influence the concentration of arsenic in the groundwater (Bhattacharya et al., 2002). To overcome the destructive effects of arsenic poisoning in areas exposed to arsenic, researchers have considered two main options: 1) mitigation, i.e. finding a water source free from arsenic, and 2) remediation, i.e. elimination of arsenic from the contaminated water before consumption (Visoottiviseth and Ahmed, 2008). Several methods have been developed for removing toxic arsenic substances from contaminated sources, such as electrochemical treatment, activated coal adsorption, precipitation, reverse osmosis, evaporation, coagulation, ion exchange (Chowdhury et al., 2000), bioremediation, rhizofiltration and phytoremediation (Shrestha and Spuhler 2012). The bioremediation process relies on bacterial biochemical mechanisms that tolerate the toxic form of arsenic or convert it into less toxic forms by enzymatic reactions.
The pathways through which microorganisms reduce or oxidize the toxic form of arsenic include biotransformation, periplasmic biosorption, and intracellular accumulation. These mechanisms have been reported in Kocuria, Proteobacterium, and Firmicutes (Banerjee et al., 2011), in Herminiimonas, Stenotrophomonas, and Alcaligenes (Bahar et al., 2012), in Lactobacillus species (Halttunen et al., 2007), in Achromobacter, Rhodococcus, and Aliihoeflea (Sattar et al., 2014), in Bacillus species (Mondal et al., 2008), in Pseudomonas species (Kao et al., 2013), and in Lysinibacillus species (Lozanzo et al., 2013). The presence of a detoxification mechanism in these microorganisms can be correlated with the occurrence of arsenite oxidation genes (aox/aro/aso), arsenic respiratory reduction genes (arr) and arsenic resistance genes (ars) (Chang et al., 2010). Arsenate is the thermodynamically more favorable form in the majority of aerobic systems and enters the bacterial cell through the low-affinity phosphate transport (Pit) system (Jackson et al., 2003), which is expressed constitutively and non-specifically (Comino et al., 2009). Because arsenate is a structural analog of phosphate, it enters the cell through the Pit system and replaces cellular phosphate during phosphorylation, causing toxicity and death of the cell (Lee et al., 2003). Through the microbiological and chemical oxidation-reduction process shown in Figure 1, arsenate is reduced to arsenite, which undergoes biomethylation and is released as arsines such as trimethylarsine [TMA (III)]. Bacteria have evolved several mechanisms to overcome arsenic toxicity, such as methylation and peroxidation reactions (Lebrun et al., 2003). Arsenite oxidase, a periplasmic membrane-bound enzyme in some Gram-negative bacteria, oxidizes arsenite to arsenate (Jackson et al., 2003).
Several microorganisms that tolerate lethal arsenic concentrations (> 60 mM sodium arsenite) have been identified, yet very little is known about the genetics of environmental bacteria resistant to arsenic (Mateos et al., 2006).

Figure 1: The biochemistry of arsenate reduction to arsenite within a cell, initiated by arsenate reductase and followed by several redox reactions involving methylation, after which the arsenic is expelled as arsines.

The microbial arsenic detoxification pathway is regulated by the ars operon (Jackson and Duggas, 2003). The isolated arsenic resistance determinants (ars) are organized into one transcriptional unit consisting of three genes (arsRBC) or five genes (arsRDABC). The ars genes have been identified both on plasmids (Owalobi and Rosen, 1990) and on chromosomes (Diorio et al., 1995) of numerous prokaryotes (Broer et al., 1993; Carlin et al., 1995) and eukaryotes (Bobrowicz et al., 1997). The chromosomes of Escherichia coli, other members of the Enterobacteriaceae, and Pseudomonas aeruginosa contain the three-gene (arsRBC) system. In this system, arsC encodes arsenate reductase, a soluble enzyme that reduces arsenate to arsenite; arsB encodes arsenite permease, a membrane protein that expels arsenite from the cell; and arsR encodes the arsenic transcriptional repressor, which regulates the expression of the ars operon (Jackson and Duggas, 2003; Rosen, 2002; Mobley et al., 1982). Some microorganisms carry additional genes in their operons, such as arsA, which encodes an oxyanion-stimulated ATPase whose ATP hydrolysis drives the extrusion of arsenicals by the ArsB protein. ArsD has been identified as a regulatory protein that controls the overexpression of arsB (Neyt et al., 1997). An ars operon consisting of four genes in the order arsR, ORF2, arsB, arsC has been reported in the skin element of Bacillus subtilis (Sato and Kobayashi, 1998), where it confers arsenic tolerance on the organism.
Arsenate reductase, the product of the arsC gene, enzymatically reduces arsenate to arsenite (Guo et al., 2005; Demel et al., 2004; Messens et al., 2002; Cohen et al., 2001). The phylogenetic tree of protein sequences shows three distinct clades of arsenate reductases, two bacterial and one from yeast (Mukhopadhyay et al., 2002). The best-studied arsenate reductase is ArsC of the E. coli R773 plasmid, which belongs to the glutaredoxin/glutathione (GrxS/GSH) clade (Kao et al., 2013). The ArsC protein has a redox-active cysteine residue in the active site (Liu et al., 1995; Gladysheva et al., 1994) which is involved in the mechanism of arsenate reduction (Mukhopadhyay and Rosen, 2002). The prototype of the second family of arsenate reductases is ArsC of the Staphylococcus aureus pI258 plasmid (Ji and Silver, 1992); this family is referred to as the thioredoxin (Trx) clade and is related to phosphotyrosine phosphatases (Demel et al., 2004; Gladysheva et al., 1994). The ArsC of the pI258 plasmid has a CX5R sequence flanked by an alpha (α) helix and a beta (β) strand (Messens et al., 2002) and uses three cysteine residues for arsenate reduction (Mukhopadhyay and Rosen, 2002). The arsenate reductase Acr2p of the yeast Saccharomyces cerevisiae (Mukhopadhyay et al., 2000) forms the third clade of arsenate reductases, which belongs to the class of tyrosine phosphatases similar to the cdc25a cell cycle control protein phosphatases found in humans (Slyemi et al., 2012; Martin et al., 2001). Studies have been published on eukaryotic arsenate reductases such as Leishmania major arsenic reductase 2 (LmACR2) and Saccharomyces cerevisiae arsenic reductase 2 (ScAcr2p) (Nahar et al., 2012; Mukhopadhyay et al., 2000). To address the problem of arsenic contamination, recent studies have detected ars genes in environmental samples and correlated their presence with bacterial isolates resistant to arsenic.
Studies of these genes can be used to design prospective molecular biomarkers for the remediation of arsenic (Jackson et al., 2005). Lysinibacillus sphaericus B1-CDA, a soil-borne arsenic-resistant bacterium isolated from arsenic-contaminated land in the southwest region of Bangladesh, grows at a concentration of 500 mM sodium arsenate in the medium (Rahman et al., 2014). In an earlier study, Rahman et al. (2015) used the bioinformatics tool InterProScan to identify an arsC gene of approximately 350 bp in the genome of B1-CDA, which was found to be similar to the arsC gene of Lysinibacillus sphaericus C3-41, along with other genes such as arsB, arsR, and arsSX. In recent years, computational structure prediction has yielded highly reliable results within a few hours for large sequences, unlike experimental procedures for structure determination, which are expensive, time-consuming and laborious (Krivov et al., 2009). The primary methods for predicting protein structure are comparative modeling and threading, which detect similarities between the query sequence and known structures. Other methods include ab initio modeling (Roy et al., 2010; Wu et al., 2007) and de novo methods (Fung et al., 2007; Floudas et al., 2006), which predict protein structure from the sequence alone, without fold-level similarity to any known structure. Bioinformatics tools thus offer molecular biology, biomedicine, and biochemistry enormous potential for predicting and analyzing protein structures within a few hours.

Aim

The long-term goal of this study is to determine the molecular function of the arsC gene in Lysinibacillus sphaericus B1-CDA and whether it is involved in the accumulation of arsenic within the bacterial cells. If so, the arsC gene could be used in the removal of arsenic from areas heavily polluted with arsenic.
The short-term goals of this study are (i) to clone the arsC gene from L. sphaericus B1-CDA, (ii) to characterize the molecular function of the arsC gene by in silico and in vitro analysis, and (iii) to perform complementation studies by transferring the arsC gene from B1-CDA to a strain lacking the arsC gene.

Materials and Methods

3.1. In silico Studies

The genome sequence of L. sphaericus B1-CDA was obtained from GenBank under accession number PRJEB7750 [http://www.ebi.ac.uk/ena/data/view/PRJEB7750] (Rahman et al., 2015). The arsC protein sequence contained 118 amino acids (Appendix I). To predict the structure and function of the arsC protein of L. sphaericus B1-CDA, the Iterative Threading Assembly Refinement (I-TASSER) (Roy et al., 2010) methodology (Figure 2) was used as follows:

3.1.1. Threading

This is the first stage of I-TASSER, in which the arsC protein query sequence was matched against a database of non-redundant sequences by position-specific iterated BLAST (PSI-BLAST) (Margelevicium et al., 2010). Multiple alignments were created from the homologous sequences and then used by PSI-BLAST based secondary structure prediction (PSI-PRED) (McGuffin et al., 2000) to predict the secondary structure of the protein. A locally installed meta-threading-server (LOMETS) (Wu and Zhang, 2008), along with seven other threading programs, SP3, PPA, PROSPECT, MUSTER, HHSEARCH, FUGUE and SPARKS (Margelevicium et al., 2010), threaded the query sequence, whose secondary structure had been predicted from the sequence profile. In each threading program, templates were ranked by structure-based and sequence-based scores. The template with the highest rank was considered further, and the quality of the template alignment was judged by its statistical significance, i.e. the Z-score of the leading threading program.

3.1.2.
Structural Assembly

In this stage, continuous fragments excised from the template structures identified by the threading programs were assembled into structural conformations, with the unaligned regions built by ab initio modeling (Bonneau and Baker, 2001). The fragments were assembled by a modified replica exchange Monte Carlo simulation technique (Zhang et al., 2004) guided by data derived from the Protein Data Bank (PDB) (Berman et al., 2000), sequence-based contact predictions from the support vector machine sequence method (SVMSEQ) (Lee et al., 2009), and spatial restraints from the threading templates. The structures from the simulations were clustered by SPICKER (Zhang and Skolnick, 2004), a simple and efficient clustering program that generates cluster centroids by averaging the clustered structural entities.

3.1.3. Model selection and refinement

The cluster centroids generated by structural assembly were selected and the assembled fragments were subjected to further simulations. External constraints were generated from the LOMETS threading alignments, and template model alignment (TM-align) (Zhang and Skolnick, 2005) identified the PDB structures closest to the SPICKER cluster centroids, which helped refine the global topology of the centroids. The structural entities from structural assembly were clustered again, and the lowest-energy cluster was used as input to REMO (Zhang, 2009), which generated the final structural model by building full-atom models from the Cα traces and side-chain interactions of the proteins (Holm and Sander, 1991) from the PDB and by optimizing the hydrogen-bonding network (McDonald and Thornton, 1994).

3.1.4. Structure-based functional annotation

The function of the arsC protein was inferred by matching the predicted 3D model against the PDB database of proteins of known structure and function.
Structural analogs of the query protein in the Gene Ontology (GO) library (Roy et al., 2010) were matched using TM-align on the basis of global topology, and a consensus was derived from the frequency with which GO terms recurred. Structural analogs from the binding site and Enzyme Commission (EC) (Bartlett et al., 2002) libraries were matched for local and global similarities in structure. The local structure similarity search complemented the global search by identifying analogs that have a dissimilar fold but perform the same function because their binding or active sites are conserved. The global structure similarity search was used to recognize proteins with an identical global fold. The functional analogs from the global search were based on structural patterns conserved in the current model, as measured by TM-score, structural alignment coverage, sequence identity and the reconstructed atomic model (REMO) (Li and Zhang, 2009).

Figure 2: Prediction of the 3D structure and function of the arsC protein by the iterative threading assembly refinement (I-TASSER) methodology, which includes five major steps.
Step 1: Submission of the query sequence to the I-TASSER server
Step 2: Threading of the submitted sequence using three programs: PSI-BLAST, PSI-PRED, and LOMETS
Step 3: Structure assembly by SVMSEQ and SPICKER
Step 4: Model selection by TM-align and refinement by REMO
Step 5: Structural and functional prediction from Enzyme Commission numbers, Gene Ontology terms, and ligand binding sites

3.2. Bacterial strains, Chemicals, and Growth

The bacterial strains used in this study were Lysinibacillus sphaericus B1-CDA (University of Skövde, Sweden) and the Escherichia coli JW3470-1 mutant strain (arsC gene knocked out) (Coli Genetic Stock Center, USA). The bacteria were cultured overnight in LB broth at 37 °C and 180 rpm in a closed shaker incubator with the selective antibiotic marker ampicillin at a concentration of 100 µg/ml.
The source of arsenic in this study was sodium arsenate from Sigma-Aldrich, and the working concentration ranged from 5 mM to 100 mM (Mergeay et al., 1995).

3.3. Nucleic acid isolation

Cloning of the arsC gene involved isolating RNA from the L. sphaericus B1-CDA strain to prepare cDNA for ligation into the pGEM-T cloning vector (Promega) and subsequent transformation. RNA was isolated from L. sphaericus B1-CDA and from the E. coli JW3470-1 mutant and transgenic strains with the MasterPure™ RNA purification kit (Epicentre). The plasmid carrying the arsC gene, which had been ligated into the pGEM-T vector and cloned in mutant E. coli JW3470-1 competent cells, was isolated from the E. coli JW3470-1 transgenic strain using the Plasmid Mini Kit (Qiagen). The arsC gene PCR product, obtained by amplifying the E. coli JW3470-1 plasmid using gene-specific primers (Appendix II), PCR reaction components (Appendix III) and the PCR program (Appendix IV), was purified with the QIAquick PCR purification kit (Qiagen). All isolation procedures were performed in triplicate following the respective kit manufacturer's protocols. A Nanodrop 2000 (Thermo Fisher Scientific, Wilmington, USA) was used to measure the concentration and purity of the isolated nucleic acids.

3.4. Primer design

The ars operon of Lysinibacillus sphaericus B1-CDA contains arsenic-metabolizing genes which confer arsenic resistance on the organism (Rahman et al., 2015). The sequence of the arsenate reductase gene arsC identified in the ars operon of the genome of L. sphaericus B1-CDA was retrieved from the GenBank database [http://www.ebi.ac.uk/ena/data/view/PRJEB7750], accession number PRJEB7750. Gene-specific primers (Appendix II) were custom designed to amplify the arsC gene using Primer3Plus software (Untergasser et al., 2007).

3.5.
RT-PCR

A one-step 50 µl reverse transcriptase-PCR reaction was set up using the isolated RNA of B1-CDA at a concentration of 100 ng/µl, following the reaction protocol of the MasterAmp™ High Fidelity RT-PCR kit (Epicentre) with the RT-PCR reaction components (Appendix V) and cycling conditions (Appendix VI). The resulting cDNA was purified with the QIAquick PCR purification kit (Qiagen) following the kit protocol, and 3 µl of the purified sample was visualized by agarose gel electrophoresis.

3.6. Cloning of arsC gene

The purified cDNA product obtained from the RT-PCR of the B1-CDA strain was ligated into the commercial cloning vector pGEM-T (Promega). An overnight ligation reaction (Appendix VII) was set up according to the kit protocol along with a negative control. The ligation mixture was transformed into competent cells of the E. coli JW3470-1 mutant strain by the calcium chloride transformation protocol. The transformation mixture was plated on LB plates with ampicillin at a concentration of 100 µg/ml and incubated overnight at 37 °C.

3.7. Colony PCR

Recombinants were screened by colony PCR using the arsC gene primers. A 50 µl PCR reaction was set up using the MasterAmp™ PCR kit (Epicentre), the PCR reaction mixture (Appendix III) and the cycling conditions (Appendix IV), followed by agarose gel electrophoresis to visualize the amplified arsC gene.

3.8. Agarose gel electrophoresis

The amplified PCR products, RT-PCR products and purified PCR products were analyzed by agarose gel electrophoresis. The electrophoresis running buffer was 1X Tris acetate EDTA (TAE) prepared from a 50X TAE stock. All PCR products were visualized on a 2% agarose gel run at 100 V. GelRed™ (Biotium) at 1X concentration was used as the intercalating dye. A 2-Log DNA ladder (NEB) was used as the molecular weight reference, together with 6X gel loading dye (NEB) at 1X concentration.

3.9. DNA sequencing

The plasmid was isolated from the transgenic strains of E.
coli obtained by cloning the arsC gene, ligated into the pGEM-T vector, into the E. coli mutant JW3470-1 strain. Using gene-specific primers (Appendix II) and about 50 ng of the plasmid as template, the arsC gene was amplified with the PCR reaction mixture (Appendix III) following the cycling conditions (Appendix IV). The amplified PCR product was purified using the QIAquick PCR purification kit (Qiagen). About 100 ng of the purified PCR product was sent for DNA sequencing, along with 10 µl of gene-specific primers, to KIGene at Karolinska University Hospital, Stockholm, Sweden. The sequencing results were compared with the existing sequence databases at NCBI using BLASTn.

3.10. Arsenic resistance assay

The minimum inhibitory concentration (MIC) of arsenic for the transgenic and mutant E. coli JW3470-1 strains was determined in order to optimize the arsenic concentration for ICP-MS and bacterial growth analysis. The transgenic and mutant E. coli JW3470-1 strains were cultured separately overnight at 37 °C in LB broth (100 µg/ml ampicillin for transgenic strains and no ampicillin for mutant strains) to be used as inoculum. For the assay, one percent inoculum was transferred to 50 ml LB broth in conical flasks supplemented with arsenate at concentrations of 5 mM, 10 mM, 50 mM and 100 mM. The samples were cultured in one parallel set at 37 °C and 180 rpm in a shaker incubator. The growth of the bacterial cultures was measured after 24 hours as optical density at 600 nm with a WPA Biowave CO8000 cell density meter, and the variation in bacterial growth between the transgenic and mutant E. coli JW3470-1 strains was observed as described by Musingarimi et al. (2010).

3.11. Growth and Inductively Coupled Plasma - Mass Spectroscopy (ICP-MS) analysis

The inoculum for the bacterial growth assay was prepared by overnight culturing of transgenic and mutant E.
coli JW3470-1 strains at 37 °C in LB broth (100 µg/ml ampicillin for transgenic strains and no ampicillin for mutant strains). One percent inoculum was added to all the samples. The cells were cultured separately in three parallel sets of 50 ml LB broth in conical flasks supplemented with 50 mM arsenate and incubated at 37 °C and 180 rpm in a shaker incubator for up to 96 hours. The turbidity of the cultures was recorded every 24 hours as absorbance at 600 nm with a cell density meter, as described by Kostal et al. (2004). After each 24-hour absorbance reading, the bacterial cultures were prepared for ICP-MS analysis following the protocol of Rahman et al. (2015). The cultures were harvested by centrifugation at 10,000 g for 10 min. The supernatant (cell-free medium) was then separated from the pellet and stored at 4 °C. The pellet was washed three times with de-ionized water and, after the water was discarded, dried until the cell dry weight was achieved. For the analysis of total arsenic, the supernatant, together with a blank (medium containing arsenate but not inoculated), was sent for ICP-MS analysis to Eurofins Environment Testing Sweden AB (Lidköping, Sweden).

3.12. Statistical analysis

The data were analyzed statistically to compare the growth rate of the mutant and transgenic E. coli JW3470-1 strains. A Mann-Whitney U test was performed with the significance level set at 0.05 (p<0.05). The results were reported as mean value and standard deviation.

Results

The I-TASSER server, an online tool, was used for the in silico analysis of the arsC protein, predicting its secondary and tertiary structure as well as its putative function.

4.1. Prediction of secondary structure

The arsC protein query sequence was aligned against 5,798 non-redundant protein structures in I-TASSER using PSI-BLAST. The secondary structure of the protein was predicted by PSI-PRED.
The predicted structures were defined in terms of β-sheets, α-helices and coils, along with confidence scores (Figure 3). LOMETS threaded the query sequence through the PDB library, generating the top ten models from ten highly efficient threading programs together with their Z-score values, sequence identity and coverage. The models were selected on the basis of a normalized Z-score greater than one, which indicates a good alignment with the query sequence; the higher the Z-score, the closer the alignment. The model ranked 1 (Table 1) was 3gkxB, with a normalized Z-score of 3.95 and sequence identities of 0.47 and 0.48, and was identified from the protein database as a putative arsC protein from Bacteroides fragilis.

Figure 3: Prediction of secondary structure of arsC protein by PSI-PRED. Blue indicates beta sheet (S), red alpha helix (H) and black coil. The confidence score (Conf. Score) of each amino acid ranges from 1 to 9.

Table 1: Prediction of a protein model by I-TASSER using the top ten threading folds. The accuracy of model 1 was predicted based on the Z-score. A normalized Z-score > 1 indicates a good alignment.

Rank  PDB Hit  Iden1 (A)  Iden2 (B)  Cov (C)  Norm. Z-score (D)
1     3gkxB    0.47       0.48       0.98     3.95
2     2m46A    0.52       0.51       0.97     3.02
3     2m46A    0.51       0.51       0.99     4.11
4     2m46A    0.51       0.51       0.99     1.94
5     2m46A    0.51       0.51       0.99     2.06
6     2m46A    0.52       0.51       0.99     3.96
7     2m46A    0.52       0.51       0.98     3.16
8     3gkxA    0.50       0.49       0.99     2.56
9     2m46A    0.52       0.51       0.99     3.82
10    3rdwA    0.23       0.24       0.97     4.08

A) Percentage of sequence identity of the templates in the threading aligned region with the query sequence.
B) Percentage of sequence identity of the whole templates with the query sequence.
C) Coverage of the threading alignment = number of aligned residues divided by the length of the query protein.
D) Normalized Z-score of the threading alignment.

4.2. Prediction of 3D structure

Based on pairwise structure similarity, I-TASSER used the SPICKER cluster centroids to generate 3D structures (Figure 4).
The quality of the model was measured by the C-score (confidence score), which ranges from -5 to 2; the higher the C-score, the higher the quality of the predicted structure. The C-score of the predicted structure is 1.27. The TM-score and RMSD were used to measure the structural similarity of the predicted structure. A TM-score < 0.17 indicates random similarity and a TM-score > 0.5 indicates a model of correct topology. The TM-score of the predicted structure is 0.89 ± 0.07 and the RMSD is 1.9 ± 1.6 Å, suggesting correct topology and good quality. The full-length model was generated by I-TASSER by cluster density using Monte Carlo simulations.

Figure 4: Predicted 3D structure of arsC protein by I-TASSER. The prediction is based on TM-score, RMSD, C-score and cluster density, which determine the quality of the model. The quality of the model increases with increasing C-score and cluster density. Refinement of the model is done by TM-score and RMSD values. The model has C-score = 1.27, TM-score = 0.89 ± 0.07 and RMSD = 1.9 ± 1.6 Å. The ribbon presentation of the arsC protein indicates the active sites for ligand binding at the red loop: Tyr-8, Cys-11, Thr-13, Arg-95, Gly-107, Phe-108.

4.3. Structural analogs to the target protein identified by TM-Align

The TM-align program was used to match the first I-TASSER model against all the structures in the PDB library, generating the ten structures with the closest structural similarity to the predicted model; these proteins are expected to have functions similar to the target protein. The proteins were ranked by TM-score, with the highest score indicating the closest structural similarity to the query across the PDB library. The PDB hit 2m46A (Table 2) is ranked number one because it has the highest TM-score, 0.965, of all the PDB hits.

Table 2: Prediction of the structural analogs of arsC protein based on TM-score.
Rank  PDB Hit  TM-score (A)  RMSD (B)  IDEN (C)  COV (D)
1     2m46A    0.965         0.68      0.513     0.992
2     3gkxB    0.896         1.46      0.483     0.983
3     3fz4A    0.844         1.82      0.422     0.975
4     1z3eA    0.840         1.73      0.243     0.975
5     3rdwA    0.839         1.69      0.219     0.966
6     1s3dA    0.817         1.96      0.202     0.966
7     3f0iA    0.807         1.88      0.214     0.949
8     1rw1A    0.777         2.13      0.264     0.932
9     3I78A    0.703         2.60      0.287     0.907
10    2kokA    0.676         3.22      0.226     0.949

A) Measure of structural similarity between the predicted model and the native structure.
B) Root mean square deviation (deviation between residues that are structurally aligned by TM-align).
C) Percentage sequence identity in the structurally aligned region.
D) Coverage of the alignment by TM-align = number of structurally aligned residues divided by the length of the model.

4.4. Structure-based functional annotation

The function of the query protein was predicted from confidence scores and functional analogs based on GO terms, EC numbers, active sites and ligand binding sites.

4.4.1. Ligand binding sites

To identify the function of the protein, the ligand binding sites of the query protein were predicted by I-TASSER. The prediction was based on global and local structural similarities between the binding sites of the predicted 3D model and those of 19,658 non-redundant proteins with known ligand binding sites in the PDB library. The template 1k0nA (Table 3) has the highest C-score, 0.43, with six ligand binding site residues and a cluster size of 19. The template 3ihqA (Table 3) has the lowest C-score, 0.04, with three ligand binding sites lacking complex structure. The ligand binding site residues identified for the template 1k0nA, which has the highest C-score of 0.43, are Tyr-8, Cys-11, Thr-13, Arg-95, Gly-107 and Phe-108.

Table 3: Prediction of the ligand binding sites of model 1 based on the measure of global and local sequence and structural similarities between the template binding site and the predicted binding site of model 1. A higher C-score and a larger cluster size were taken to indicate more significant predictions.
Rank  C-score  Cluster size  PDB Hit  Ligand name  Ligand binding site residues
1     0.43     19            1k0nA    GSH          8, 11, 13, 95, 107, 108
2     0.13     6             3mszA    GSH          13, 94, 96, 107, 108, 109
3     0.04     2             1i9dA    SO4          13, 16, 109
4     0.04     2             1j9bA    TAS          11, 12, 13
5     0.04     2             3ihgA    IMD          60, 61, 94

4.4.2. Prediction of Enzyme Commission (EC) and active sites

The predicted 3D model structure was run against a PDB library of 5,798 non-redundant entries with known EC numbers. On the basis of global structural similarity, structural analogs with known EC numbers and active sites were matched using TM-align. The analysis generated the top five models, ranked according to the TM-score. The PDB hit 1sk1A (Table 4) has the highest EC-score (0.465), with EC number 1.20.4.1 and three active site residues, His-7, Pro-9 and Val-111 (H-7, P-9, V-111). The lowest EC-score (0.278) is for PDB hit 2gsdA (Table 4), which has no active site residues. The PDB hit 1sk1A was identified through ExPASy as the enzyme arsenate reductase (glutaredoxin).

Table 4: Prediction of functional analogs of model 1 based on the Enzyme Commission (EC) score. The models were ranked by EC-score; the higher the EC-score, the higher the confidence of the prediction.

Rank  EC-score  PDB hit  TM-score  RMSD  IDEN   Cov    EC Number  Active site residues
1     0.465     1sk1A    0.807     1.99  0.195  0.958  1.20.4.1   7, 9, 11
2     0.282     2f8aA    0.568     3.43  0.080  0.822  1.11.1.9   52
3     0.280     2he3A    0.517     3.94  0.058  0.831  1.11.1.9   NA
4     0.279     2nadA    0.506     4.35  0.063  0.890  1.2.1.2    NA
5     0.278     2gsdA    0.502     4.42  0.063  0.890  1.2.1.43   NA

4.4.3. Prediction of Gene Ontology (GO) terms

The structural analogs of the predicted 3D protein were matched against a PDB library of 26,045 non-redundant entries with known GO terms, and a consensus was derived from the number of occurrences of each GO term. The consensus prediction covered molecular function, biological process and cellular component, each with its GO-score (Table 5). Higher values of GO scores indicate more confident predictions.
The molecular function, with GO term GO:0008794 and a GO-score of 0.80, is predicted as arsenate reductase (glutaredoxin) activity. The biological process, with GO term GO:0055114 and the highest GO-score of 0.85, is predicted as an oxidation-reduction process. The cellular component, with GO term GO:0043231 and a GO-score of 0.58, is predicted as cytoplasm. The predicted 3D protein therefore has a molecular function of arsenate reductase activity, involved in a biological oxidation-reduction process that occurs in the cytoplasm.

Table 5: Prediction of GO terms, i.e. molecular function, biological process and cellular component, by I-TASSER for the 3D structure of the protein based on GO-score. A GO-score > 0.5 signifies a high confidence prediction.

Gene Ontology (GO)   GO term                                                      GO score
Molecular function   GO:0008794 (arsenate reductase (glutaredoxin) activity)      0.80
Biological process   GO:0055114 (oxidation-reduction process)                     0.85
Cellular component   GO:0043231 (cytoplasm)                                       0.58

4.5. Prediction of Gene Function

On the basis of the structural and functional predictions for the arsC protein, a novel hypothetical pathway of arsenate reduction in B1-CDA was proposed (Figure 5). The first step in the biochemical pathway involves the non-covalent binding of the oxyanion As(V) to the enzyme arsC reductase, with nucleophilic attack on cysteine-11. In step two this results in the formation of thiarsahydroxy intermediates with the thiolate of activated cysteine-11, stabilized by hydrogen bonds from arginine-95. In step three the glutathionylated intermediates formed are reduced to the novel intermediate Arsc-S-AS +O, which in the final step dissociates to produce free arsenite, As(III); this undergoes biomethylation and is extruded from the cell as arsines.

Figure 5: Prediction of a novel hypothetical biochemical mechanism of arsenate reduction by the arsC gene in B1-CDA.
Step 1 involves the nucleophilic attack on Cys-11, bound non-covalently to arsenate, which releases OH- ions. In step 2 the intermediate formed is glutathionylated with the release of an H2O molecule. Step 3 involves the reduction of arsenate to arsenite by the binding of glutaredoxin (GrxSH), resulting in the formation of the intermediate compound dihydroxy monothiol arsenate and the release of GrxS-SG and GSH. In step 4 a positively charged arsenic with a monohydroxy intermediate is formed. Free arsenite is released with the addition of OH- ions and undergoes biomethylation.

4.6. Concentration and Purity of Nucleic acids

The concentration and purity of the RNA isolated from L. sphaericus B1-CDA and from the transgenic E. coli JW3470-1 strains were measured with a NanoDrop 2000. The plasmid was isolated from the transgenic E. coli JW3470-1 strain. The NanoDrop readings of the best isolated samples, which were used further in the study, are reported in Table 6.

Table 6: The concentrations and purity levels of the RNA and DNA isolated from B1-CDA and transgenic strains.

SN  Samples                  Concentration (ng/µl)  260/280  260/230
1.  RNA (B1-CDA)             945.4                  2.1      2.18
2.  RNA (transgenic strain)  1331                   2.13     2.22
3.  Plasmid                  531                    1.81     2.0

4.7. Expression of arsenate reductase gene (arsC) in B1-CDA

Having established the nature of the ars operon in the L. sphaericus B1-CDA strain (Rahman et al., 2015), expression of the arsC gene was checked with gene-specific primers by endpoint RT-PCR, as explained in materials and methods. The resulting cDNA, with an expected length of 350 bp, was visualized on a 2% agarose gel (Figure 6), purified with the PCR purification kit and used for cloning.

Figure 6: The expression level of the arsC gene of B1-CDA detected by the complete reverse transcription of RNA to cDNA by RT-PCR. The results were visualized on a 2% agarose gel with GelRed™, 6X loading dye and 1X TAE running buffer. Lane L: NEB 2-Log DNA Ladder; Lane 1: cDNA, approx. 350 bp.

4.8.
Transformation and Amplification of arsC gene

The purified cDNA obtained from the RT-PCR of L. sphaericus B1-CDA was transformed into the E. coli JW3470-1 mutant strain, which lacks the arsC gene, using the commercial cloning vector pGEM-T (Appendix IX). After transformation, several recombinant (transgenic) clones were screened for the presence of the arsC gene by colony PCR (Figure 7). The arsC gene was amplified by PCR from the plasmid of the transgenic strain, and the purified PCR product was sent for DNA sequencing.

Figure 7: Transformation of the arsC gene was confirmed by colony PCR with arsC-F/arsC-R primers. The results show the arsC gene amplified at approx. 350 bp. Lane 1: negative control; Lane 2: NEB 2-Log DNA Ladder; Lanes 3, 4, 5: colonies picked from transformed plates.

4.9. BLASTN search of arsC gene

The DNA sequencing results for the arsC gene were aligned by NCBI BLASTN search against the non-redundant GenBank, EMBL, PDB and DDBJ databases, which revealed a homolog of the arsC gene on the chromosome of Lysinibacillus sphaericus C3-41 (Rahman et al., 2015). Alignment of the nucleotide sequence of the arsC gene against the non-redundant database of NCBI showed that it was 86% identical to that of L. sphaericus C3-41 (Figure 8).

Figure 8: The DNA sequence of the 350 bp fragment transformed into E. coli JW3470-1 showed 86% identity with Lysinibacillus sphaericus C3-41, indicating that it belongs to the Lysinibacillus sphaericus family. The BLAST alignment was performed in the NCBI database. The query is the sequenced arsC gene and the subject is Lysinibacillus sphaericus C3-41.

4.10. Expression of arsC gene in transgenic strains

RNA was isolated from the transgenic E. coli JW3470-1 strains; its concentration is given in Table 6. The expression of the arsC gene in the transgenic strain was studied by RT-PCR as described in materials and methods, and the results were visualized on a 2% agarose gel (Figure 9).

Figure 9: The expression of arsC gene in transgenic E.
coli JW3470-1 detected by the complete reverse transcription of RNA to cDNA by RT-PCR. The results were visualized on a 2% agarose gel with GelRed™, 6X loading dye and 1X TAE running buffer. Lane L: NEB 2-Log DNA Ladder; Lanes 2 and 3: cDNA, approx. 350 bp.

4.11. Arsenic resistance analysis

The mutant and transgenic E. coli JW3470-1 strains, grown in LB medium containing 5 mM to 100 mM sodium arsenate and measured after 24 hours, both exhibited tolerance to arsenic. The transgenic E. coli JW3470-1 strain reached a higher growth density, measured as absorbance at 600 nm with a cell density meter, than the mutant E. coli JW3470-1 strain (Table 7).

Table 7: The bacterial growth of mutant and transgenic E. coli JW3470-1 strains grown at different concentrations of arsenic. Growth was measured as absorbance at 600 nm.

Concentration of arsenic (mM)  OD at 600 nm, mutant E. coli JW3470-1  OD at 600 nm, transgenic E. coli JW3470-1
5                              0.53                                    1.50
10                             0.21                                    1.26
25                             0.10                                    0.92
50                             0.09                                    0.74
100                            0.06                                    0.26

4.12. Statistical analysis of growth

In the arsenic tolerance assays, the optimal growth of the mutant and transgenic E. coli JW3470-1 strains was seen at 50 mM arsenate. The bacterial cultures were therefore grown in 50 mM arsenate over a range of time intervals (24 hours to 96 hours), and bacterial growth was measured as optical density at 600 nm. A Mann-Whitney U test was run to determine whether there was a difference in bacterial growth between the mutant and transgenic E. coli JW3470-1 strains at 50 mM sodium arsenate. The median growth of the mutant (median = 0.155, n = 4) and the transgenic (median = 0.595, n = 4) strains was significantly different, U = 4, p = 0.02 (Figure 10). The p-value is less than the significance level of 0.05. Hence it can be concluded that there is a significant difference in the growth pattern of the mutant and transgenic E. coli JW3470-1 strains when exposed to arsenic.
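To illustrate how a comparison of this kind is computed, the sketch below implements an exact two-sided Mann-Whitney U test in plain Python for small samples. The OD600 values are hypothetical placeholders chosen only for illustration (the raw readings are not listed here), and the function name `mann_whitney_exact` is an assumption rather than the software actually used for the analysis.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test for two small samples.

    Returns (U, p), where U is the smaller of the two U statistics and
    p is obtained by enumerating every split of the pooled values.
    """
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    centre = n_x * n_y / 2  # expected U under the null hypothesis

    def u_for(indices):
        # U = number of (x, y) pairs with x > y; ties count as half a pair
        chosen = set(indices)
        xs = [pooled[i] for i in chosen]
        ys = [pooled[i] for i in range(n_x + n_y) if i not in chosen]
        return sum(1.0 if a > b else 0.5 if a == b else 0.0
                   for a in xs for b in ys)

    observed = u_for(range(n_x))
    splits = list(combinations(range(n_x + n_y), n_x))
    # Count splits at least as far from the null expectation as the data
    as_extreme = sum(
        1 for s in splits
        if abs(u_for(s) - centre) >= abs(observed - centre) - 1e-12
    )
    return min(observed, n_x * n_y - observed), as_extreme / len(splits)


# Hypothetical OD600 readings at 24, 48, 72 and 96 hours (illustration only)
mutant = [0.09, 0.14, 0.17, 0.12]
transgenic = [0.41, 0.57, 0.62, 0.74]

u_stat, p_value = mann_whitney_exact(mutant, transgenic)
print(f"U = {u_stat:g}, exact two-sided p = {p_value:.4f}")
```

With these placeholder values the two groups do not overlap, so the sketch returns U = 0 and p = 2/70 ≈ 0.029, the smallest exact two-sided p-value attainable with four observations per group. Statistical packages may instead report an asymptotic approximation, which can differ slightly for samples this small.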
Figure 10: Comparison of the mean growth rate between the mutant and transgenic E. coli JW3470-1 strains. The blue curve indicates the mutant strain and the brown curve the transgenic strain. The curve was plotted as optical density (600 nm) against time (hours). The p-value from the Mann-Whitney U test was 0.02, which is less than the significance level of 0.05, indicating a significant difference in the growth rate between the mutant and transgenic strains. The error bars were plotted with a 95% confidence interval and ± 1 SD.

4.13. Analysis of arsenate accumulation

The amount of arsenate accumulated from the cell-free medium by the mutant and transgenic E. coli JW3470-1 strains was analyzed by ICP-MS (Table 8). The bacterial strains were cultured in the presence of 50 mM arsenate for time intervals of 24 hours to 96 hours.

Table 8: The amount of arsenic accumulated from the supernatant, analyzed by ICP-MS, for the mutant and transgenic E. coli JW3470-1 strains cultured at 50 mM arsenate for 24 hours to 96 hours (n = 1). Values are the concentration of arsenic in the cell-free broth (mM).

Time interval (hours)  Mutant E. coli JW3470-1  Transgenic E. coli JW3470-1
24                     9.29                     8.33
48                     10.89                    10.57
72                     12.78                    11.85
96                     13.16                    13.15

Discussion

As mentioned in the introduction, arsenic contamination of water and food is a severe danger to human health in many regions of the world, particularly in South-East Asia. One of the most eco-friendly and inexpensive methods for removing arsenic from contaminated water or effluents might be the use of microorganisms occurring naturally in the environment (Rahman et al., 2015; Congeevaram et al., 2007). Several researchers have been engaged in isolating and characterizing arsenic-resistant microorganisms from arsenic-contaminated regions, and in-situ bioremediation has garnered a lot of importance in recent years.
This is because it is a cost-effective, efficient and environmentally safe mechanism for the detoxification of arsenic. The method also has many advantages over conventional processes (Pous et al., 2015; Escalante et al., 2009). During the past decades, several bacterial strains have been reported to be highly resistant to arsenic (Sattar et al., 2014; Kao et al., 2013; Jenson et al., 2010). They can grow and survive in arsenic-contaminated environments. One such resistant strain is L. sphaericus B1-CDA, which was previously isolated and reported by a research team at the University of Skövde, Sweden (Rahman et al., 2014). The genome sequence of strain B1-CDA has shown that this bacterium contains several genes responsive to metal ions such as arsenic, iron, cobalt, manganese, nickel, lead, zinc, cadmium and chromium (Rahman et al., 2015). One such gene is the arsC gene, which is responsive to arsenic. In this thesis work, an attempt was made to determine the molecular function of the arsC gene in L. sphaericus B1-CDA by both in silico and in vitro studies.

The in silico analysis of the arsC protein of Lysinibacillus sphaericus B1-CDA involved analysis of the 118 amino acid sequence, obtained from the GenBank database under accession number PRJEB7750 [http://www.ebi.ac.uk/ena/data/view/PRJEB7750] (Rahman et al., 2015), by the I-TASSER server (Roy et al., 2010). I-TASSER is a consolidated automated platform for modeling protein structures and predicting function on the basis of the sequence-structure-function paradigm. By this method the secondary structure and 3D structure of the query protein were predicted, revealing the substrate-binding residues. The molecular, cellular and biological function of the query protein was predicted on the basis of the 3D structure. The results from the in silico experiments are credible, as repeated benchmark tests form the basis of bioinformatics analysis (Lee et al., 2009).
Compared with in vitro experiments, in silico analysis is inexpensive and yields significant results faster. Protein crystallography and nuclear magnetic resonance (NMR) can be employed to construct a 3D model of a protein (Pittelkow et al., 2011), but these approaches are less preferred as they are laborious, expensive and time-consuming. However, in silico analysis also has disadvantages: because it considers various selection parameters, it can be highly difficult to prioritize and choose a satisfactory candidate. In the current experiments 3gkxB (Z-score 3.95) was selected as the primary structure for further analysis, even though the PDB hits 2m46A (Z-score 4.11) and 3rdwA (Z-score 4.08) had higher Z-scores (Table 1). In this case, sequence identity (Iden1 and Iden2) was the deciding factor in ranking the top fold suggested by the algorithmic computer analysis. The selection of the target model or structure is simpler when every suggested parameter indicates increased significance. In our current experiment, the I-TASSER server selected one model and ranked it first because of its highest C-score, 1.27 (Figure 4); this model also showed the highest TM-score (0.89 ± 0.07) and an RMSD of 1.9 ± 1.6 Å. The functional similarity between a predicted structure and the analogs detected is indicated by the GO-score. As the function and structure of a protein are multifaceted, the functional similarities between the identified analogs result in the association of many GO terms. Based on the GO-scores, a GO-term consensus prediction was performed to choose reliable GO terms (Table 5). Based on the highest GO-scores, the biological, cellular and molecular function of the protein was predicted and the hypothetical biochemical mechanism was postulated (Figure 5).
Studies on arsC-mediated resistance to arsenate and on the arsC protein, an arsenate reductase, have been conducted in yeast and other bacteria (Ye et al., 2011; Demel et al., 2004), where protein crystallography and NMR were used to construct models of the arsC protein. The mechanism of arsenic binding to the cysteine residues of arsenate reductase was studied for the E. coli R773 plasmid (Shen et al., 2013; Ye et al., 2011) and was found to be similar to the cysteine-mediated binding of the arsC protein in B1-CDA. The arsC protein structure, based on molecular modeling in Herbaspirillum sp. GW103 (Govarthanan et al., 2015) and homology modelling in Cronobacter sakazakii BAA-894 (Chaturvedi et al., 2013), revealed close structural similarities with arsC family proteins of Pseudomonas aeruginosa and Escherichia coli, respectively. In those studies the arsC protein, an arsenate reductase, was shown to be involved in the reduction of arsenic from the environment, which is in accordance with our findings that the arsC protein in B1-CDA is an arsenate reductase involved in the reduction of arsenic.

In vitro laboratory experiments were performed to validate the in silico results for the arsC gene. The nucleic acid isolations were performed in triplicate and duplicate; the samples with good concentrations and high purity, which were used further in the study, are compiled with their NanoDrop readings in Table 6. The other nucleic acid samples, of lower purity and concentration than the best samples, were stored at -80 °C. They could serve as a second source of samples, used at higher concentrations, if the best samples were used up, since the isolation procedures, which involve culturing the bacterial strains, are time-consuming and laborious. To assess the concentration and purity of the isolated DNA and RNA, the absorbance ratio at 260 nm and 280 nm was used.
As a rule of thumb, a 260:280 ratio of approximately 1.8 for DNA and 2.0 for RNA is accepted as indicating pure DNA and RNA, respectively. The 260:230 ratio, used as a secondary measure of purity, generally lies in the range of 2.0 - 2.2 (Lehninger 1997). On the basis of this rule of thumb, the concentration and purity of the isolated nucleic acids were found to be good (Table 6).

The arsC gene identified in Lysinibacillus sphaericus B1-CDA has been shown to be responsive to arsenic (Rahman et al., 2015). A similar action of the arsC gene was identified on the R773 plasmid in E. coli (Ye et al., 2011; Kao et al., 2011; Demel et al., 2004), and an arsC gene with a similar phenotype was detected on the Staphylococcus plasmids pSX267 and pI258 and in chemolithotrophic Zetaproteobacteria (Jesser et al., 2015; Demel et al., 2004). The expression of the arsC gene in B1-CDA was checked by RT-PCR, with an expected product length of 350 bp, using the arsC-F/arsC-R gene-specific primers designed with the Primer3Plus software (Untergasser et al., 2007) (Figure 6). The cDNA formed was purified, ligated into the pGEM-T cloning vector and transformed into the E. coli JW3470-1 mutant strain, in which the arsC gene is knocked out. The resulting transgenic E. coli JW3470-1 strains, complemented with the arsC gene, were verified by colony PCR (Figure 7). The absence of the target arsC gene in the mutant E. coli JW3470-1 strain was confirmed by PCR amplification with the arsC-F/arsC-R primers. In the bacterial species Halomonas and Acinetobacter, a 409 bp arsC gene coding for arsenate reductase was amplified (Sunita et al., 2012). The DNA sequence alignment by NCBI BLASTN showed 86% similarity to Lysinibacillus sphaericus C3-41 (Figure 8), which is identical in function (Rahman et al., 2015); this confirms our cloning results. The sequencing results for the arsC gene in Bacillus sp. SXB reported by Wu et al. (2013) showed 99% similarity with Bacillus thuringiensis sp.
The expression of the arsC gene in the transgenic strain was verified by RT-PCR (Figure 9), which indicates that complementation has made the transgenic strain resistant to arsenic. In a similar study, arsenate reductase genes isolated from Acinetobacter species conferred arsenic resistance on E. coli JM109 and WC3110 by complementation (Sunitha et al., 2015), which supports our finding that the presence of the arsC gene confers arsenic resistance on an organism.

Arsenic resistance assays were conducted to determine the minimum inhibitory concentration for the transgenic and mutant E. coli JW3470-1 strains by varying the concentration of sodium arsenate (Musingarimi et al., 2010). The bacterial strains were grown for 24 hours in medium containing sodium arsenate at 5 mM, 10 mM, 25 mM, 50 mM and 100 mM in order to settle on a single arsenate concentration for the further growth studies. The bacterial strains were grown at 37 °C, as temperature has the maximum effect on the growth of the bacteria (Bhakoo et al., 1979). The transgenic E. coli JW3470-1 strains showed a decrease in growth rate with increasing concentration of sodium arsenate. A similar growth pattern was observed in the mutant strains. However, comparison of the growth rates indicates a higher growth rate for the transgenic strains than for the mutant strains: at 5 mM and 10 mM sodium arsenate the transgenic strain had an eight and seven times higher growth rate than the mutant strain, at 25 mM and 50 mM a five times higher growth rate, and at 100 mM a four times higher growth rate (Table 7). This difference in growth rate could be explained by the presence of the arsC gene in the transgenic strains and its absence in the mutant strains.
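The fold difference between the two strains at each arsenate concentration can be derived directly from the optical densities in Table 7. The following minimal sketch shows the calculation; the dictionary layout and the function name `fold_difference` are illustrative assumptions. A single OD600 reading after 24 hours is only a rough proxy for growth rate, so ratios obtained this way need not coincide with the rounded fold values quoted in the text.

```python
# OD600 after 24 h as listed in Table 7: arsenate (mM) -> (mutant, transgenic)
table7 = {
    5: (0.53, 1.50),
    10: (0.21, 1.26),
    25: (0.10, 0.92),
    50: (0.09, 0.74),
    100: (0.06, 0.26),
}


def fold_difference(mutant_od, transgenic_od):
    """Ratio of transgenic to mutant optical density."""
    return transgenic_od / mutant_od


# Fold difference for every tested arsenate concentration
fold = {conc: fold_difference(m, t) for conc, (m, t) in table7.items()}

for conc, ratio in fold.items():
    print(f"{conc:>3} mM arsenate: transgenic/mutant OD600 = {ratio:.1f}")
```

Recomputing the ratios from the tabulated values in this way is a simple consistency check between a table and the prose that summarizes it.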
Similar results were reported for bacterial strains deficient in the arsC gene (Pandey et al., 2012). From the results, it is evident that the cloning of the arsC gene into the mutant E. coli JW3470-1 strain was successful and that the presence of the arsC gene renders a bacterial strain more tolerant to arsenic (Sarkar et al., 2012; Liao et al., 2011; Srivastava et al., 2009).

On the basis of the arsenic resistance assays, 50 mM arsenate was considered the optimum concentration for further studies at varying time intervals (24 hours to 96 hours). The growth curve studies suggest an exponential increase in the growth of the mutant and transgenic E. coli JW3470-1 strains (Figure 10). The transgenic strain displayed an increase in growth from 24 hours until 72 hours. The active doubling of the cells during this interval could be explained by the abundance of nutrients, adequate aeration and low production of secondary metabolites. By 96 hours the growth of the bacterial strains was declining because of the exhaustion of nutrients. A similar growth pattern was seen in an Exiguobacterium strain grown in the presence of 30 mM sodium arsenate for 24 hours (Chowdhury et al., 2014). The transgenic E. coli JW3470-1 strain, complemented with the arsC gene of B1-CDA, showed a significant difference in growth rate compared with the mutant E. coli JW3470-1 strain lacking the arsC gene, as verified by the Mann-Whitney U test (Corder and Foreman 2014). The statistical results can be interpreted to mean that the complementation of the mutant E. coli JW3470-1 strain with the arsC gene from B1-CDA was accomplished successfully.

The ICP-MS analysis was performed to assess the amount of arsenic accumulated from the cell-free broth (liquid medium) by the transgenic and mutant E. coli JW3470-1 strains.
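The percentage decrease of arsenic in the cell-free broth is simple arithmetic on the ICP-MS concentrations in Table 8. A minimal sketch of that calculation follows, assuming the nominal starting concentration of 50 mM applies to every flask; the variable and function names are illustrative and not taken from the analysis itself.

```python
INITIAL_MM = 50.0  # assumed nominal arsenate concentration in the medium (mM)

# Arsenic in cell-free broth (mM), Table 8: hours -> (mutant, transgenic)
table8 = {
    24: (9.29, 8.33),
    48: (10.89, 10.57),
    72: (12.78, 11.85),
    96: (13.16, 13.15),
}


def percent_decrease(remaining_mm, initial_mm=INITIAL_MM):
    """Percentage of the initial arsenic no longer present in the broth."""
    return (initial_mm - remaining_mm) / initial_mm * 100


for hours, (mutant, transgenic) in table8.items():
    gap_mm = mutant - transgenic  # extra arsenic removed by the transgenic strain
    print(f"{hours:>2} h: mutant {percent_decrease(mutant):.2f}%, "
          f"transgenic {percent_decrease(transgenic):.2f}%, "
          f"difference {gap_mm:.2f} mM")
```

Because each time point rests on a single measurement (n = 1), such figures describe the observed values only and carry no estimate of variability.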
The bacterial strains were grown in 50 mM sodium arsenate and cultured from 24 hours to 96 hours. The preliminary ICP-MS results (Table 8) indicate that the transgenic strain, complemented with the arsC gene from B1-CDA, accumulated more arsenic than the mutant strain lacking the arsC gene. The results for the cell-free broth indicated that the maximum difference in the reduction of arsenic was observed at 24 hours, when the transgenic strain had removed 83.34% of the arsenic from the liquid medium (50 mM to 8.33 mM) compared with 81.42% for the mutant strain (50 mM to 9.29 mM) (Table 8). The difference in arsenic accumulation between the transgenic and mutant strains at 24 hours was thus 0.96 mM, or 1.92%. At 48 hours the arsenic reduction was 78.86% for the transgenic strain (50 mM to 10.57 mM) and 78.22% for the mutant strain (50 mM to 10.89 mM). The reduction rate of arsenic was observed to be lower after 48 hours because of the mechanism of arsenic efflux. Similar ICP-MS experiments carried out by Rahman et al. (2014) and Banerjee et al. (2011) in B1-CDA and other bacterial species show that the accumulation of arsenic by the bacterial strains decreased after 48 hours and increased after 72 hours. However, the current experimental result was not consistent with this, as arsenic accumulation had decreased at 72 hours (76.3% in the transgenic and 74.44% in the mutant strain) and at 96 hours (73.7% in the transgenic and 73.68% in the mutant strain). The ICP-MS results are inconclusive: the significance of the difference in arsenic accumulation from the cell-free broth between the transgenic and mutant strains could not be assessed statistically, because the samples could not be generated in triplicate.

From the current study, it can be concluded that the in silico results were confirmed by in vitro experiments. The in silico analysis by I-TASSER for the arsC protein identified in the L.
sphaericus B1-CDA predicted that the arsC protein has arsenate reductase activity as its molecular function (Table 5) and is involved in the cytoplasmic oxidation-reduction reactions of arsenate reduction (Figure 5). The in vitro experiments accomplished the complementation of the mutant E. coli JW3470-1 strain lacking the arsC gene with the arsC gene from B1-CDA, generating transgenic E. coli JW3470-1 strains (Figure 7). Growth analysis of the transgenic and mutant E. coli JW3470-1 strains cultured in 50 mM sodium arsenate for 24 to 96 hours showed a significant difference in the growth pattern of the strains (Figure 10). Together with the sequencing results (Figure 8) and the expression analysis of the arsC gene in the transgenic strains (Figure 9), this verifies the complementation studies. The arsenic tolerance assays placed the minimum inhibitory concentration of arsenate for the transgenic and mutant E. coli JW3470-1 strains at 50 mM (Table 7). The ICP-MS results for the amount of arsenic accumulated in the cell-free broth (Table 8) were inconclusive for both strains owing to the lack of replicate samples; the analysis should be repeated, together with studies of arsenic accumulation within the cell. Finally, the structural and functional predictions by I-TASSER, along with the complementation studies, suggest that the arsC gene identified on the chromosome of L. sphaericus B1-CDA does confer tolerance to arsenic. B1-CDA could therefore be employed for the bioremediation of arsenic-contaminated environments, reducing the need for physical-chemical processes.

Ethical aspects and impact of the research on the society

Ethical concerns do not apply to this study, as no pain or suffering was caused, unlike in human and animal experiments. The bacterial strains L. sphaericus B1-CDA and E.
coli mutant JW3470-1 strain employed in this study have not been reported to be biohazards, to be involved in the spread of disease or to cause infections. The selective antibiotic marker ampicillin, at a concentration of 100 µg/ml, was used only to produce the transgenic E. coli JW3470-1 strain under strictly sterile laboratory conditions. Its use was considered safe, as ampicillin has not to date been reported, at any concentration, to produce new antibiotic-resistant pathogenic strains of L. sphaericus B1-CDA or the E. coli mutant JW3470-1. The bacterial strains were disinfected with the low-level multipurpose disinfectant Virkon at 1 % concentration, which has not been reported to produce any superbugs of L. sphaericus B1-CDA or the E. coli mutant JW3470-1. The arsenic used in this study is labelled as a potential biohazard; it was therefore handled and disposed of carefully, following all the safety guidelines for handling arsenic in the laboratory laid down by the University of Skövde. It can therefore be concluded that no biotechnological risk arose from this study.

The impact of the research on society can be summarized as follows: (1) the study would help in mitigation and remediation of the environment through bioremediation techniques that use the arsC gene, in combination with other arsenic-responsive genes of L. sphaericus B1-CDA, to bioengineer arsenic-resistant crops that detoxify arsenic at contaminated sites; and (2) after a certain period the crops could be recycled for the production of biofuel, thereby creating a safer and healthier environment free from toxic pollutants.

Future Perspectives

The principal objective of this study, the validation of in silico analysis by in vitro laboratory experiments, has been successfully achieved.
In silico analysis included the structural and functional prediction of the arsC protein by I-TASSER, and the in vitro experiments comprised complementation studies involving cloning and amplification of the arsC gene identified in L. sphaericus B1-CDA. Using bioinformatics studies and various protein modelling tools, followed by X-ray crystallography, the crystal structure of the arsC protein could be determined in order to study protein-protein interactions, which would help to characterize the structure of the protein and its binding to arsenic. Protein expression analysis of the arsC gene of L. sphaericus B1-CDA could be carried out to understand the mechanism of arsenate reductase in arsenate reduction and, further, the mechanisms of biomethylation and volatilization of arsenic, which could be used for detoxification of arsenic from soil and water. Further laboratory experiments, including real-time PCR, could be performed to check the expression levels of the arsC gene. Studies on the arsC gene together with other arsenic-responsive genes of L. sphaericus B1-CDA could be used to design prospective molecular biomarkers for the remediation of arsenic. The ICP-MS studies could be repeated with triplicate samples so that the amount of arsenic accumulated in the cell-free broth, as well as within the cell, can be assessed statistically. The arsC gene, in combination with other arsenic tolerance genes identified on the chromosome of L. sphaericus B1-CDA, could be used in the construction of plant expression vectors or transgenic plants as an ecological approach to the remediation of arsenic-contaminated sites, a better and safer alternative to physical-chemical processes.

Acknowledgements

I would like to thank my supervisor Prof.
Abul Mandal (School of Bioscience, University of Skövde, Sweden) for providing me with the opportunity to work in his laboratory under his supervision. I would also like to thank Ph.D. student Md. Aminur Rahman for his suggestions and guidance. I am thankful to the teaching and non-teaching staff of the School of Bioscience, University of Skövde, Sweden, for their co-operation, which helped me to complete my master thesis work successfully. I would also like to thank my examiner Dr. Magnus Fagerlind for his insightful comments and suggestions on this report, which helped in the creation of a proper, complete scientific report. I would like to extend my thanks to Eurofins Environment Testing Sweden AB, Lidköping, Sweden and Karolinska University Hospital, Stockholm, Sweden for their timely professional services for ICP-MS analysis and DNA sequencing, respectively. Last but not least, I would like to thank my family and friends for their constant support and encouragement throughout my endeavours.

References

Ahsan H., Chen Y., Parvez F., Zablotska L., Argos M., Hussain I., Momotaj H., Levy D., Cheng Z., Slavkovich V. and Van Geen A. (2006). Arsenic exposure from drinking water and risk of premalignant skin lesions in Bangladesh: Baseline results from the health effects of arsenic longitudinal study. American Journal of Epidemiology, 163 (12), pp. 1138-1148.
Argos M., Kalra T., Rathouz P.J., Chen Y., Pierce B., Parvez F., Islam T., Ahmed A., Rakibuz-Zaman M., Hasan R. and Sarwar G. (2010). Arsenic exposure from drinking water, and all-cause and chronic-disease mortalities in Bangladesh (HEALS): a prospective cohort study. The Lancet, 376 (9737), pp. 252-258.
Bang S., Patel M., Lippincott L. and Meng X. (2005). Removal of arsenic from groundwater by granular titanium dioxide adsorbent. Chemosphere, 60 (3), pp. 389-397.
Banerjee I., Pangule R.C. and Kane R.S. (2011).
Antifouling coatings: recent developments in the design of surfaces that prevent fouling by proteins, bacteria, and marine organisms. Advanced Materials, 23 (6), pp. 690-718.
Bahar F.G., Ohura K., Ogihara T. and Imai T. (2012). Species difference of esterase expression and hydrolase activity in plasma. Journal of pharmaceutical sciences, 101 (10), pp. 3979-3988.
Bartlett G.J., Porter C.T., Borkakoti N. and Thornton J.M. (2002). Analysis of catalytic residues in enzyme active sites. Journal of molecular biology, 324 (1), pp. 105-121.
Bhattacharya P., Jacks G., Ahmed K.M., Routh J. and Khan A.A. (2002). Arsenic in groundwater of the Bengal Delta Plain aquifers in Bangladesh. Bulletin of Environmental Contamination and Toxicology, 69 (4), pp. 538-545.
Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N. and Bourne P.E. (2000). The protein data bank. Nucleic acids research, 28 (1), pp. 235-242.
Bhakoo M. and Herbert R.A. (1979). The effects of temperature on the fatty acid and phospholipid composition of four obligatory psychrophilic Vibrio spp. Archives of Microbiology, 121 (2), pp. 121-127.
Bonneau R. and Baker D. (2001). Ab initio protein structure prediction: progress and prospects. Annual review of biophysics and biomolecular structure, 30 (1), pp. 173-189.
Bobrowicz P., Wysocki R., Owsianik G., Goffeau A. and Ułaszewski S. (1997). Isolation of three contiguous genes, ACR1, ACR2 and ACR3, involved in resistance to arsenic compounds in the yeast Saccharomyces cerevisiae. Yeast, 13 (9), pp. 819-828.
Bröer S., Ji G., Bröer A. and Silver S. (1993). Arsenic efflux governed by the arsenic resistance determinant of Staphylococcus aureus plasmid pI258. Journal of bacteriology, 175 (11), pp. 3480-3485.
Carlin A., Shi W., Dey S. and Rosen B.P. (1995).
The ars operon of Escherichia coli confers arsenical and antimonial resistance. Journal of Bacteriology, 177 (4), pp. 981-986.
Corder G.W. and Foreman D.I. (2014). Nonparametric Statistics: A Step-by-Step Approach. Wiley.
Chowdhury U.K., Biswas B.K., Chowdhury T.R., Samanta G., Mandal B.K., Basu G.C., Chanda C.R., Lodh D., Saha K.C., Mukherjee S.K. and Roy S. (2000). Groundwater arsenic contamination in Bangladesh and West Bengal, India. Environmental health perspectives, 108 (5), pp. 393.
Chowdhury R., Karak P., Sen A.K., Chatterjee R. and Chaudhuri K. (2014). Arsenic Extrusion and Energy Derivation as Survival Mechanism in a Novel Exiguobacterium Isolated from Arsenic Contaminated Groundwater of West Bengal. AIJRFANS, 7 (1), pp. 19-27.
Chang F., Qu J., Liu R., Zhao X. and Lei P. (2010). Practical performance and its efficiency of arsenic removal from groundwater using Fe-Mn binary oxide. Journal of Environmental Sciences, 22 (1), pp. 1-6.
Cohen M.H., Hirschfeld S., Honig S.F., Ibrahim A., Johnson J.R., O'Leary J.J., White R.M., Williams G.A. and Pazdur R. (2001). Drug approval summaries: arsenic trioxide, tamoxifen citrate, anastrazole, paclitaxel, bexarotene. The oncologist, 6 (1), pp. 4-11.
Congeevaram S., Dhanarani S., Park J., Dexilin M. and Thamaraiselvi K. (2007). Biosorption of chromium and nickel by heavy metal resistant fungal and bacterial isolates. Journal of Hazardous Materials, 146 (1), pp. 270-277.
Comino E., Fiorucci A., Menegatti S. and Marocco C. (2009). A preliminary test of arsenic and mercury uptake by Poa annua. Ecological engineering, 35 (3), pp. 343-350.
Chaturvedi N., Singh V.K. and Pandey P.N. (2013). Computational identification and analysis of arsenate reductase protein in Cronobacter sakazakii ATCC BAA-894 suggests potential microorganism for reducing arsenate. Journal of structural and functional genomics, 14 (2), pp. 37-45.
DeMel S., Shi J., Martin P., Rosen B.P. and Edwards B.F. (2004).
Arginine 60 in the ArsC arsenate reductase of E. coli plasmid R773 determines the chemical nature of the bound As (III) product. Protein science, 13 (9), pp. 2330-2340.
Diorio C., Cai J., Marmor J., Shinder R. and DuBow M.S. (1995). An Escherichia coli chromosomal ars operon homolog is functional in arsenic detoxification and is conserved in gram-negative bacteria. Journal of bacteriology, 177 (8), pp. 2050-2056.
Escalante G., Campos V.L., Valenzuela C., Yañez J., Zaror C. and Mondaca M.A. (2009). Arsenic resistant bacteria isolated from the arsenic contaminated river in the Atacama Desert (Chile). Bulletin of environmental contamination and toxicology, 83 (5), pp. 657-661.
Fatmi Z., Zama I., Ahmed F., Kazi A., Gill A.B., Kadir M.M., Ahmed M., Ara N. and Janjua N.Z. (2009). Core group for Arsenic Mitigation in Pakistan. Health burden of skin lesions at low arsenic exposure through groundwater in Pakistan. Is river the source? Environmental Research, 109 (5), pp. 575-581.
Floudas C.A., Fung H.K., McAllister S.R., Mönnigmann M. and Rajgaria R. (2006). Advances in protein structure prediction and de novo protein design: A review. Chemical Engineering Science, 61 (3), pp. 966-988.
Fung H.K., Taylor M.S. and Floudas C.A. (2007). Novel formulations for the sequence selection problem in de novo protein design with flexible templates. Optimisation Methods and Software, 22 (1), pp. 51-71.
Gladysheva T.B., Oden K.L. and Rosen B.P. (1994). Properties of the arsenate reductase of plasmid R773. Biochemistry, 33 (23), pp. 7288-7293.
Govarthanan M., Praburaman L., Kim J.W., Oh S.G., Kamala-Kannan S. and Oh B.T. (2015). Characterization, real-time quantification and in silico modeling of arsenate reductase (arsC) genes in arsenic-resistant Herbaspirillum sp. GW103. Research in microbiology, 166 (3), pp. 196-204.
Guo W., Hou Y.L., Wang S.G. and Zhu Y.G. (2005). Effect of silicate on the growth and arsenate uptake by rice (Oryza sativa L.) seedlings in solution culture.
Plant and Soil, 272 (1-2), pp. 173-181.
Halttunen T., Salminen S. and Tahvonen R. (2007). Rapid removal of lead and cadmium from water by specific lactic acid bacteria. International journal of food microbiology, 114 (1), pp. 30-35.
Hughes M.F. (2002). Arsenic toxicity and potential mechanisms of action. Toxicology letters, 133 (1), pp. 1-16.
Holm L. and Sander C. (1991). Database algorithm for generating protein backbone and side-chain co-ordinates from a Cα trace: Application to model building and detection of co-ordinate errors. Journal of molecular biology, 218 (1), pp. 183-194.
Jackson C.R. and Dugas S.L. (2003). Phylogenetic analysis of bacterial and archaeal arsC gene sequences suggests an ancient, common origin for arsenate reductase. BMC evolutionary biology, 3 (1), pp. 18.
Jackson C.R., Dugas S.L. and Harrison K.G. (2005). Enumeration and characterization of arsenate-resistant bacteria in arsenic free soils. Soil Biology and Biochemistry, 37 (12), pp. 2319-2322.
Jensen S.O. and Lyon B.R. (2009). Genetics of antimicrobial resistance in Staphylococcus aureus. Future microbiology, 4 (5), pp. 565-582.
Ji G. and Silver S. (1992). Reduction of arsenate to arsenite by the ArsC protein of the arsenic resistance operon of Staphylococcus aureus plasmid pI258. Proceedings of the National Academy of Sciences, 89 (20), pp. 9474-9478.
Jesser K.J., Fullerton H., Hager K.W. and Moyer C.L. (2015). Quantitative PCR Analysis of Functional Genes in Iron-Rich Microbial Mats at an Active Hydrothermal Vent System. Applied and environmental microbiology, 81 (9), pp. 2976-2984.
Kamiya T., Tanaka M., Mitani N., Ma J.F., Maeshima M. and Fujiwara T. (2009). NIP1;1, an aquaporin homolog, determines the arsenite sensitivity of Arabidopsis thaliana. Journal of Biological Chemistry, 284 (4), pp. 2114-2120.
Kao A.C., Chu Y.J., Hsu F.L. and Liao V.H.C. (2013). Removal of arsenic from groundwater by using a native isolated arsenite-oxidizing bacterium.
Journal of contaminant hydrology, 155, pp. 1-8.
Krivov G.G., Shapovalov M.V. and Dunbrack R.L. (2009). Improved prediction of protein side-chain conformations with SCWRL4. Proteins: Structure, Function, and Bioinformatics, 77 (4), pp. 778-795.
Koštál V., Vambera J. and Bastl J. (2004). On the nature of pre-freeze mortality in insects: water balance, ion homeostasis and energy charge in the adults of Pyrrhocoris apterus. Journal of Experimental Biology, 207 (9), pp. 1509-1521.
Lee J., Wu S. and Zhang Y. (2009). Ab initio protein structure prediction. From protein structure to function with bioinformatics. Springer Netherlands. pp. 3-25.
Lee D.A., Chen A. and Schroeder J.I. (2003). ars1, an Arabidopsis mutant exhibiting increased tolerance to arsenate and increased phosphate uptake. The Plant Journal, 35 (5), pp. 637-646.
Lebrun E., Brugna M., Baymann F., Muller D., Lièvremont D., Lett M.C. and Nitschke W. (2003). Arsenite oxidase, an ancient bioenergetic enzyme. Molecular biology and evolution, 20 (5), pp. 686-693.
Leninger T., Stoll H., Werner H.J. and Savin A. (1997). Combining long-range configuration interaction with short-range density functionals. Chemical physics letters, 275 (3), pp. 151-160.
Liu C.P., Luo C.L., Gao Y., Li F.B., Lin L.W., Wu C.A. and Li X.D. (2010). Arsenic contamination and potential health risk implications at an abandoned tungsten mine, southern China. Environmental Pollution, 158 (3), pp. 820-826.
Liu X., Hanson B.L., Langan P. and Viola R.E. (2007). The effect of deuteration on protein structure: a high-resolution comparison of hydrogenous and perdeuterated haloalkane dehalogenase. Acta Crystallographica Section D: Biological Crystallography, 63 (9), pp. 1000-1008.
Liao V.H.C., Chu Y.J., Su Y.C., Hsiao S.Y., Wei C.C., Liu C.W., Liao C.M., Shen W.C. and Chang F.J. (2011). Arsenite-oxidizing and arsenate-reducing bacteria associated with arsenic-rich groundwater in Taiwan. Journal of contaminant hydrology, 123 (1), pp. 20-29.
Liu J., Gladysheva T.B., Lee L. and Rosen B.P. (1995). Identification of an essential cysteinyl residue in the ArsC arsenate reductase of plasmid R773. Biochemistry, 34 (41), pp. 13472-13476.
Lozano L.C. and Dussán J. (2013). Metal tolerance and larvicidal activity of Lysinibacillus sphaericus. World Journal of Microbiology and Biotechnology, 29 (8), pp. 1383-1389.
Lynn S., Lai H.T., Gurr J.R. and Jan K.Y. (1997). Arsenite retards DNA break rejoining by inhibiting DNA ligation. Mutagenesis, 12 (5), pp. 353-358.
Li Y. and Zhang Y. (2009). REMO: A new protocol to refine full atomic protein models from C-alpha traces by optimizing hydrogen-bonding networks. Proteins: Structure, Function, and Bioinformatics, 76 (3), pp. 665-676.
Marshall G., Ferreccio C., Yuan Y., Bates M.N., Steinmaus C., Selvin S., Liaw J. and Smith A.H. (2007). Fifty-year study of lung and bladder cancer mortality in Chile related to arsenic in drinking water. Journal of the National Cancer Institute, 99 (12), pp. 920-928.
Matilda O., Cohly H.H., Isokpehi R.D. and Awofolu O.R. (2010). The case for visual analytics of arsenic concentrations in foods. International journal of environmental research and public health, 7 (5), pp. 1970-1983.
Mateos L.M., Ordóñez E., Letek M. and Gil J.A. (2006). "Corynebacterium glutamicum" as a model bacterium for the bioremediation of arsenic. International microbiology: official journal of the Spanish Society for Microbiology, 9 (3), pp. 207-216.
McGuffin L.J. and Jones D.T. (2003). Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics, 19, pp. 874-881.
Margelevicium M., Laganeckas M. and Venclovas C. (2010). Bioinformatics, 26, pp. 1905-1906.
Meliker J.R., Wahl R.L., Cameron L.L. and Nriagu J.O. (2007). Arsenic in drinking water and cerebrovascular disease, diabetes mellitus, and kidney disease in Michigan: a standardized mortality ratio analysis. Environ Health, 6 (4), pp. 1-11.
Messens J., Martins J.C., Van Belle K., Brosens E., Desmyter A., De Gieter M., Wieruszeski J.M., Willem R., Wyns L. and Zegers I. (2002). All intermediates of the arsenate reductase mechanism, including an intramolecular dynamic disulphide cascade. Proceedings of the National Academy of Sciences, 99 (13), pp. 8506-8511.
Mergeay M. (1995). Heavy metal resistances in microbial ecosystems. In Molecular microbial ecology manual. Springer Netherlands. pp. 439-455.
Mondal D. and Polya D.A. (2008). Rice is a major exposure route for arsenic in Chakdaha block, Nadia district, West Bengal, India: A probabilistic risk assessment. Applied Geochemistry, 23 (11), pp. 2987-2998.
Mobley H.L., Chen C.M., Silver S. and Rosen B.P. (1983). Cloning and expression of R-factor mediated arsenate resistance in Escherichia coli. Molecular and General Genetics MGG, 191 (3), pp. 421-426.
Mukhopadhyay R. and Rosen B.P. (2002). Arsenate reductases in prokaryotes and eukaryotes. Environmental health perspectives, 110 (5), pp. 745.
Mukhopadhyay R., Shi J. and Rosen B.P. (2000). Purification and Characterization of Acr2p, the Saccharomyces cerevisiae Arsenate Reductase. Journal of Biological Chemistry, 275 (28), pp. 21149-21157.
Mukhopadhyay R., Rosen B.P., Phung L.T. and Silver S. (2002). Microbial arsenic: from geocycles to genes and enzymes. FEMS Microbiol Rev, 26, pp. 311-325.
Musingarimi W., Tuffin M. and Cowan D. (2010). Characterisation of the arsenic resistance genes in Bacillus sp. UWC isolated from maturing fly ash acid mine drainage neutralised solids. South African Journal of Science, 106 (2), pp. 59-63.
Muller D., Lièvremont D., Simeonova D.D., Hubert J.C. and Lett M.C. (2003). Arsenite oxidase aox genes from a metal-resistant β-Proteobacterium. Journal of bacteriology, 185 (1), pp. 135-141.
McDonald I.K. and Thornton J.M. (1994). J Mol Biol, 238, pp. 777-793.
Nahar N., Rahman A., Moś M., Warzecha T., Algerin M., Ghosh S., Sheila J.B. and Mandal A. (2012).
In silico and in vivo studies of an Arabidopsis thaliana gene, ACR2, putatively involved in arsenic accumulation in plants. J Mol Model, 18, pp. 4249-4262.
Pous N., Casantini B., Rosetti S., Fazi S., Puig S. and Aulenta F. (2015). Anaerobic arsenite oxidation with an electrode serving as the sole electron acceptor: A novel approach to the bioremediation of arsenic-polluted groundwater. Journal of hazardous materials, 283, pp. 617-622.
Nahar N., Rahman A., Moś M., Warzecha T., Ghosh S., Hossain K., Nawani N.N. and Mandal A. (2014). In silico and in vivo studies of molecular structures and mechanisms of AtPCS1 protein involved in binding arsenite and/or cadmium in plant cells. Journal of molecular modelling, 20 (3), pp. 1-16.
Neyt C., Iriarte M., Thi V.H. and Cornelis G.R. (1997). Virulence and arsenic resistance in Yersinia. Journal of bacteriology, 179 (3), pp. 612-619.
Ng J.C., Wang J. and Shraim A. (2003). A global health problem caused by arsenic from natural sources. Chemosphere, 52 (9), pp. 1353-1359.
Nordstrom D.K. (2002). Worldwide occurrences of arsenic in ground water. Science, 296 (5576), pp. 2143-2145.
Oremland R.S. and Stolz J.F. (2005). Arsenic, microbes and contaminated aquifers. Trends in microbiology, 13 (2), pp. 45-49.
Oremland R.S., Saltikov C.W., Wolfe-Simon F. and Stolz J.F. (2009). Arsenic in the evolution of earth and extra-terrestrial ecosystems. Geomicrobiology Journal, 26 (7), pp. 522-536.
Owalobi J.B. and Rosen B.P. (1990). Differential mRNA stability controls relative gene expression within the plasmid-encoded arsenical resistance operon. J Bacteriol, 172, pp. 2367-2371.
Pandey S., Rai R. and Rai L.C. (2012). Proteomics combines morphological, physiological and biochemical attributes to unravel the survival strategy of Anabaena sp. PCC7120 under arsenic stress. Journal of proteomics, 75 (3), pp. 921-937.
Pittelkow M. and Bremer E. (2011).
Cellular adjustments of Bacillus subtilis and other bacilli to fluctuating salinities. In Halophiles and Hypersaline Environments. Springer Berlin Heidelberg. pp. 275-302.
Quéméneur M., Heinrich-Salmeron A., Muller D., Lièvremont D., Jauzein M., Bertin P.N., Garrido F. and Joulian C. (2008). Diversity surveys and evolutionary relationships of aoxB genes in aerobic arsenite-oxidizing bacteria. Applied and environmental microbiology, 74 (14), pp. 4567-4573.
Rahman A., Nahar N., Nawani N.N., Jass J., Desale P., Kapadnis B.P., Hossain K., Saha A.K., Ghosh S., Olsson B. and Mandal A. (2014). Isolation and characterization of a Lysinibacillus strain B1-CDA showing potential for bioremediation of arsenics from contaminated water. Journal of Environmental Science and Health, Part A: Toxic/Hazardous Substances and Environmental Engineering, 49 (12).
Rahman A., Nahar N., Nawani N.N., Jass J., Ghosh S., Olsson B. and Mandal A. (2015). Data in support of the comparative genome analysis of Lysinibacillus B1-CDA, a bacterium that accumulates arsenics. Data in Brief, 5, pp. 579-585.
Rosen B.P. (2002). Biochemistry of arsenic detoxification. FEBS Letters, 529 (1), pp. 86-92.
Roy A., Kucukural A. and Zhang Y. (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols, 5 (4), pp. 725-738.
Sattar N.A., Ginsberg H., Ray K., Chapman M.J., Arca M., Averna M., Betteridge D.J., Bhatnagar D., Bilianou E., Carmena R. and Češka R. (2014). The use of statins in people at risk of developing diabetes mellitus: evidence and guidance for clinical practice. Atherosclerosis Supplements, 15 (1), pp. 1-15.
Santini J.M., Macy J.M., Pauling B.V., O'Neill A.H. and Sly L.I. (2000). Two new arsenate/sulfate-reducing bacteria: mechanisms of arsenate reduction. Archives of microbiology, 173 (1), pp. 49-57.
Sato T. and Kobayashi Y. (1998). The ars Operon in the skin element of Bacillus subtilis confers resistance to arsenate and arsenite. J. Bacteriol.
180 (7), pp. 1655-1661.
Shen S., Li X.F., Cullen W.R., Weinfeld M. and Le X.C. (2013). Arsenic binding to proteins. Chemical reviews, 113 (10), pp. 7769-7792.
Santini J.M. and vanden Hoven R.N. (2004). Molybdenum-containing arsenite oxidase of the chemolithoautotrophic arsenite oxidizer NT-26. Journal of bacteriology, 186 (6), pp. 1614-1619.
Sambrook J., Fritsch E.F. and Maniatis T. (1989). Molecular cloning: a laboratory manual. Cold Spring Harbor, N.Y., Cold Spring Harbor Laboratory.
Sunitha M.S.L., Prashanth S. and Kavi Kishor P.B. (2015). Characterization of Arsenic-Resistant Bacteria and their ars Genotype for Metal Bioremediation. International Journal of Scientific & Engineering Research, 6 (2), pp. 304-309.
Sarkar S., Basu B., Kundu C.K. and Patra P.K. (2012). Deficit irrigation: An option to mitigate arsenic load of rice grain in West Bengal, India. Agriculture, Ecosystems & Environment, 146 (1), pp. 147-152.
Silver S. and Phung L.T. (2005). Genes and enzymes involved in bacterial oxidation and reduction of inorganic arsenic. Applied and Environmental Microbiology, 71 (2), pp. 599-608.
Slyemi D. and Bonnefoy V. (2012). How prokaryotes deal with arsenic. Environmental microbiology reports, 4 (6), pp. 571-586.
Smith A.H., Marshall G., Yuan Y., Ferreccio C., Liaw J., Von Ehrenstein O., Steinmaus C., Bates M.N. and Selvin S. (2006). Increased mortality from lung cancer and bronchiectasis in young adults after exposure to arsenic in utero and in early childhood. Environmental health perspectives, pp. 1293-1296.
Srivastava S., Srivastava A.K., Suprasanna P. and D'souza S.F. (2009). Comparative biochemical and transcriptional profiling of two contrasting varieties of Brassica juncea L. in response to arsenic exposure reveals mechanisms of stress perception and tolerance. Journal of experimental botany, 60 (12), pp. 3419-3431.
Sunita M.S.L., Prashant S., Chari P.B., Rao S.N., Balaravi P. and Kishore P.K. (2012).
Molecular identification of arsenic-resistant estuarine bacteria and characterization of their ars genotype. Ecotoxicology, 21 (1), pp. 202-212.
Singh A.P., Goel R.K. and Kaur T. (2011). Mechanisms pertaining arsenic toxicity. Toxicol. Intl., 18 (2), pp. 87-93.
Shrestha R. and Spuhler D. (2012). Arsenic removal technologies. Available at http://www.sswm.info/content/arsenic-removal-technologies [Accessed on 2016.03.13].
Untergasser A., Nijveen H., Rao X., Bisseling T., Geurts R. and Leunissen J.A. (2007). Primer3Plus, an enhanced web interface to Primer3. Nucleic acids research, 35 (2), pp. W71-W74.
Wu S., Skolnick J. and Zhang Y. (2007). Ab initio modelling of small proteins by iterative TASSER simulations. BMC biology, 5 (1), pp. 17.
Wu S. and Zhang Y. (2008). Bioinformatics, 24, pp. 924-931.
Wu Q., Du J., Zhuang G. and Jing C. (2013). Bacillus sp. SXB and Pantoea sp. IMH, aerobic As (V)-reducing bacteria isolated from arsenic-contaminated soil. Journal of applied microbiology, 114 (3), pp. 713-721.
WHO (1963). International Standards for Drinking-water, second edition.
WHO (1993). Guidelines for drinking-water quality, second edition, volume 1.
Xiong J., He Z., Van Nostrand J.D., Luo G., Tu S., Zhou J. and Wang G. (2012). Assessing the microbial community and functional genes in a vertical soil profile with long-term arsenic contamination. PloS one, 7 (11), pp. 505-507.
Ye W.L., Khan M.A., McGrath S.P. and Zhao F.J. (2011). Phytoremediation of arsenic contaminated paddy soils with Pteris vittata markedly reduces arsenic uptake by rice. Environmental pollution, 159 (12), pp. 3739-3743.
Zhang Y. and Skolnick J. (2004). SPICKER: A clustering approach to identify near-native protein folds. Journal of computational chemistry, 25 (6), pp. 865-871.
Zhang X., Zwiers F.W. and Li G. (2004). Monte Carlo experiments on the detection of trends in extreme values. Journal of Climate, 17 (10), pp. 1945-1952.
Zhang Y. and Skolnick J. (2005).
TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic acids research, 33 (7), pp. 2302-2309.
Zhang Y. (2009). I-TASSER: Fully automated protein structure prediction in CASP8. Proteins: Structure, Function, and Bioinformatics, 77 (S9), pp. 100-113.

Appendices

Appendix (I) arsC FASTA sequence

MTIQFFHYPKCTTCKKAQKWLNEHEVSYEEVHIAEQPPTKEELTAIWQASGQPLKKFFNT
SGMKYRELGLKDKLAEMTEDEQLELLASDGMLMKRPIVTDGKKVTLGFKEVDFEQAWK

Appendix (II)

Table 1: Degenerate primers designed for amplification of arsC genes, with respective primer length (bp), G+C content (GC %) and annealing temperature Tm (°C)

Primer      Sequence (5'-3')              Length (bp)   GC (%)   Tm (°C)
ArsC (F)    ATGACGATTCAATTTTTCCATTATCC    26            30.8     62.4
ArsC (R)    TTATTTCCAAGCTTGCTCAAAATCT     25            32.0     61.6

Appendix (III)

Table 2: PCR reaction components and volumes for a single reaction of 50 µl. The forward primer arsC-F and the reverse primer arsC-R were used. The reaction was set up on ice and the reaction components were thoroughly mixed before being placed in the PCR.

Reaction component                                  Volume per reaction
100 pM Forward primer                               1 µl
100 pM Reverse primer                               1 µl
MasterAmp 1X PCR Mix                                25 µl
MasterAmp TAQurate Polymerase Mix (1 unit) DNA      1 µl
DNA Template (50 ng)                                1 µl
Deionized water                                     21 µl
Total volume                                        50 µl

Appendix (IV)

Table 3: PCR reaction cycling conditions.

Cycle step              Temperature   Time        Cycles
Initial denaturation    95 °C         5 min       1
Denaturation            94 °C         30 s        35
Annealing               60 °C         30 s        35
Extension               72 °C         1 min/kb    35
Final extension         72 °C         10 min      1
Hold                    4 °C          ∞           1

Appendix (V)

Table 4: RT-PCR reaction components and volumes for a single reaction of 50 µl. The forward primer arsC-F and the reverse primer arsC-R were used. The reaction was set up on ice and the reaction components were thoroughly mixed before being placed in the PCR.
Reaction component                                  Volume per reaction
12.5 pM Forward primer                              1 µl
12.5 pM Reverse primer                              1 µl
MasterAmp 1X RT-PCR PreMix                          25 µl
MMLV RT-Plus (40 units)                             1 µl
MasterAmp TAQurate Polymerase Mix (1 unit) DNA      1 µl
RNA Template (100 ng)                               1 µl
Deionized water                                     20 µl
Total volume                                        50 µl

Appendix (VI)

Table 5: RT-PCR reaction cycling conditions.

Cycle step              Temperature   Time        Cycles
Reverse transcription   37 °C         30 min      1
Initial denaturation    95 °C         5 min       1
Denaturation            94 °C         30 s        35
Annealing               60 °C         30 s        35
Extension               72 °C         1 min/kb    35
Final extension         72 °C         10 min      1
Hold                    4 °C          ∞           1

Appendix (VII)

Table 6: Ligation reaction mixture for cloning of the arsC gene with a pGEM-T vector into the E. coli JW3470-1 mutant strain, showing the reaction components of a positive control and a negative control. The positive control represents the cloning of the arsC gene. The negative control allows the verification of background colonies resulting from the vector alone.

Reaction component           Positive control   Negative control
10X T4 DNA ligase buffer     2 µl               2 µl
pGEM-T vector                1 µl               1 µl
Insert DNA (50 ng)           3 µl               -
T4 DNA ligase                1 µl               1 µl
Nuclease-free water          13 µl              15 µl
Total volume                 20 µl              20 µl

Appendix (VIII) arsC nucleotide sequence

ATGACGATTCAATTTTTCCATTATCCGAAATGTACAACATGTAAAAAGGCACAGAAGTGG
TTAAATGAGCATGAGGTTTCCTATGAAGAGGTACATATCGCAGAGCAACCTCCCACTAAA
GAAGAACTAACAGCTATTTGGCAAGCTAGTGGACAGCCTTTGAAGAAATTTTTTAATACC
TCAGGCATGAAATATCGTGAGCTTGGCTTAAAGGACAAGTTAGCGGAAATGACGGAGGAT
GAGCAGCTTGAGCTACTTGCTTCAGATGGCATGTTAATGAAACGTCCCATCGTGACAGAT
GGAAAAAAAGTTACTTTAGGTTTTAAAGAAGTAGATTTTGAGCAAGCTTGGAAATAA

Appendix (IX) pGEM-T Vector map

Appendix (X)

Table 7: Prediction of GO terms, i.e. molecular function, biological process and cellular component, by I-TASSER for the 3D structure of the protein, based on GO-score. A GO-score > 0.5 signifies a high confidence prediction.
Molecular function         Biological process         Cellular component
GO term       GO-score     GO term       GO-score     GO term       GO-score
GO:0008794    0.80         GO:0055114    0.85         GO:0043231    0.58
                           GO:0045892    0.45         GO:0044444    0.58
                           GO:0006950    0.45
                           GO:0046685    0.40
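The primer properties in Table 1 (Appendix II) can be cross-checked against the arsC nucleotide sequence in Appendix (VIII) with a few lines of code. The sketch below recomputes primer length and GC content and checks that the forward primer matches the start of the coding sequence and that the reverse primer is the reverse complement of its end. The function and variable names are illustrative assumptions; only the sequences and the tabulated values come from the appendices.

```python
# Minimal sketch: sanity checks on the arsC primers of Table 1.
# Sequences are copied from Appendix (II) and Appendix (VIII); names are illustrative.

FORWARD_PRIMER = "ATGACGATTCAATTTTTCCATTATCC"  # ArsC (F)
REVERSE_PRIMER = "TTATTTCCAAGCTTGCTCAAAATCT"   # ArsC (R)

ARSC_CDS = (
    "ATGACGATTCAATTTTTCCATTATCCGAAATGTACAACATGTAAAAAGGCACAGAAGTGG"
    "TTAAATGAGCATGAGGTTTCCTATGAAGAGGTACATATCGCAGAGCAACCTCCCACTAAA"
    "GAAGAACTAACAGCTATTTGGCAAGCTAGTGGACAGCCTTTGAAGAAATTTTTTAATACC"
    "TCAGGCATGAAATATCGTGAGCTTGGCTTAAAGGACAAGTTAGCGGAAATGACGGAGGAT"
    "GAGCAGCTTGAGCTACTTGCTTCAGATGGCATGTTAATGAAACGTCCCATCGTGACAGAT"
    "GGAAAAAAAGTTACTTTAGGTTTTAAAGAAGTAGATTTTGAGCAAGCTTGGAAATAA"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'-3'."""
    return seq.translate(COMPLEMENT)[::-1]


for name, primer in (("ArsC (F)", FORWARD_PRIMER), ("ArsC (R)", REVERSE_PRIMER)):
    print(f"{name}: length {len(primer)} bp, GC {gc_percent(primer):.1f} %")

# The forward primer should anneal at the 5' end of the coding strand,
# the reverse primer at the 3' end (as its reverse complement).
print("forward primer matches start:", ARSC_CDS.startswith(FORWARD_PRIMER))
print("reverse primer matches end:", ARSC_CDS.endswith(reverse_complement(REVERSE_PRIMER)))
```

The melting temperatures in Table 1 are not recomputed here, because they depend on the formula and salt conditions used by the primer design tool, which the appendix does not state.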