in Molecular Biology

in Molecular Biology
Molecular characterisation of arsC gene
involved in accumulation of arsenics in
Lysinibacillus sphaericus B1-CDA.
MB701A HT15 (30 ECTS)
2015-05-11 to 2015-10-09
Version 3
Chandini Murarilal Ratnadevi
[email protected]
Master Degree Project in Molecular Biology
Supervisor: Prof. Abul Mandal
[email protected]
Examiner: Magnus Fagerlind (PhD)
[email protected]
School of Bioscience,
University of Skovde,
Box 428, SE 54128
Abstract
Previously it has been reported, Lysinibacillus sphaericus B1-CDA contains several arsenicresponsive genes like acr3, arsC, arsR, and arsB conferring arsenic tolerance. The objective of this
study was to characterize the molecular function of the arsC gene in Lysinibacillus sphaericus B1-CDA
by in silico and in vitro analysis. For in silico studies, Iterative Threading Assembly and Refinement (ITASSER) server predicted the 3D model of the arsC protein. The 3D predicted structure of the
protein exhibited six ligand binding site residues (Tyr-8, Cys-11, Thr-13, Arg-95, Gly-107, Phe-108)
that bind to arsenic. The in silico results predicting arsC is responsive to arsenic was validated by in
vitro experiments. The arsC gene present in the genome of B1-CDA was transferred to arsC deficient
strain of Escherichia coli JW3470-1 mutant by complementation studies. The presence of the arsC
gene in transgenic strain was verified by PCR and DNA sequencing. The mutant and transgenic
strains were exposed to 50mM arsenic under 96hrs and their growth rate was measured by a
spectrophotometer. The results represented a statistically significant difference (p<0.05) in the
growth rate between mutant and transgenic strains. Further preliminary analysis obtained from
Inductively Coupled Plasma-Mass Spectroscopy (ICP-MS) demonstrated that the levels of arsenic
accumulation in the cells of transgenic strain were 1.92 times higher than the mutant strain of
Escherichia coli JW3470-1 at 24 hours of exposure to arsenic. These results show that Lysinibacillus
sphaericus B1-CDA can be utilized as a potential organism for the removal of arsenic from the
arsenic contaminated environment.
Popular scientific summary
Arsenic, a colorless, odorless toxic metalloid from the nitrogen family occurs in many allotropic
forms. Primarily, arsenic in its inorganic form exists in two redox states, i.e. arsenate, the oxidized
form, and the reduced form arsenite. Arsenite disrupts protein structure by interfering with the
sulfhydryl groups in amino acids. Arsenate is a phosphate analog and disrupts the various cellular
process that involves phosphate uptake. Exposure to arsenic is mainly through ingestion of
contaminated ground water, inhalation of arsenic contaminated air due to a spray of arsenical
pesticides and absorption by the skin through contaminated soil and crops. Arsenic exposure causes
arsenic poisoning, which has become a major global concern. Millions of people suffer from cancers,
keratosis and neural system damage due to arsenic poisoning as a result of consuming arsenic
contaminated water and foods. The U.S Environmental Protection Agency (U.S.E.P.A) has classified
arsenic as a potential carcinogenic agent. The World Health organization (WHO) has set the
permissible levels of arsenic in drinking water to 0.01 mg/L. To overcome the hazardous effects of
arsenic poisoning, researchers have suggested mitigation and remediation processes. The mitigation
process involves finding a water source free of arsenic and remediation process is defined as
eliminating arsenic from contaminated water source before consumption. These two processes can
be achieved by two methods 1) physical – chemical methods and 2) biological methods. The
physicochemical methods are expensive, ecologically not friendly and inefficient. Hence, researchers
have preferred biological methods which are inexpensive, efficient and made ecologically friendly by
use of microorganisms for bioremediation of arsenic. The microorganisms involved in
bioremediation of arsenic from the contaminated environments utilize arsenic due to the presence
of arsenic tolerance genes catalyzing various detoxification pathways.
Lysinibacillus sphaericus B1-CDA is a soil-borne bacterial strain which has been identified to contain
several arsenic-responsive genes like acr3, arsC, arsR and ars B conferring arsenic tolerance.
Arsenate reductase gene (arsC) is involved in the reduction of arsenate to arsenite. The aim of this
study was to clone and characterize the arsC gene occurring in Lysinibacillus sphaericus B1-CDA into
Escherichia coli JW3047-1 mutant strain lacking arsC gene. The presence of arsC gene renders the
organism tolerant to arsenic, therefore, arsC gene could be used in the removal of arsenic from an
arsenic contaminated environment. In silico analysis, performed on computer predicted the 3D
structure and putative function of arsC protein by Iterative Threading Assembly and Refinement (ITASSER) server. The invitro or test tube experiments were performed to confirm the in silico results.
The arsC gene from B1-CDA approx. 350 bp was isolated and transformed into mutant Escherichia
coli JW3047-1 strain lacking arsC gene resulting in transgenic Escherichia coli JW3047-1 strains. The
arsC gene insert in the transgenic strain was confirmed by colony PCR and DNA sequence analysis.
Reverse Transcriptase - PCR (RT-PCR) was performed to check the expression of arsC gene in
transgenic strain. The arsenic tolerance assays performed for the mutant and transgenic Escherichia
coli JW3047-1 strains optimized the concentrations of sodium arsenate at 50 mM. The growth of the
bacterial strains was measured by a spectrophotometer. A statistical analysis by Mann - Whitney U
test indicated that the transgenic strain had a significantly higher growth rate (p<0.05) when
compared to the mutant strain. Inductive Coupled Plasma-Mass Spectroscopy (ICP-MS) analysis for
the accumulation of arsenic from the liquid medium was performed for transgenic strains and
mutant strains of Escherichia coli JW3047-1 cultured in 50mM sodium arsenate for a time interval
ranging from 24 hours to 96 hours. The transgenic strain accumulated more arsenic from the liquid
medium than the mutant strain due to the presence of arsC gene. Therefore, these results suggest
that Lysinibacillus sphaericus B1-CDA can be employed as a suitable bioremediation organism for
detoxifying arsenic from the arsenic contaminated environment.
Abbreviations
As
As (III)
As (V)
Ars
aa
ATP
Bp
BLAST
BLASTn
dNTP’S
DNA
EC
EDTA
GO
ICP-MS
Kbp
KDa
LB
LOMETS
Mg2+
MIC
NCBI
NEB
OD
PCR
PDB
PSI-BLAST
PSI-PRED
REMO
RMSD
RNA
RT-PCR
Sp.
SVMSEQ
SN
TAE
Tm
TM-Score
U.S.E.P.A
WHO
3D
Arsenic
Arsenite
Arsenate
Arsenic resistance
Amino acids
Adenosine triphosphate
Base pair
Basic local alignment search tool
Nucleotide blast
Deoxynucleotide triphosphates
Deoxyribose nucleic acid
Enzyme Commission
Ethylene diamine tetraacetic acid
Gene Ontology
Inductively coupled plasma mass spectroscopy
Kilobase pair
Kilo Dalton’s
Luria Bertani
Locally installed meta-threading-server
Magnesium
Minimum inhibitory concentration
National center for biotechnology information
New England Bio labs
Optical density
Polymerase chain reaction
Protein database
Position specific iterated-blast
Psi-blast based secondary structure prediction
Reconstruct atomic model
Root-mean-square deviation
Ribose nucleic acid
Reverse transcriptase – polymerase chain reaction
Species
Support vector machine sequence
Serial Number
Tris acetate EDTA
Melting temperature
Template modeling score
United states environment protection agency
World health organization
Three-dimensional
Table Of Contents
Introduction .................................................................................................................................................. 1
Aim ................................................................................................................................................................ 4
Materials and Methods................................................................................................................................. 5
Results ........................................................................................................................................................... 9
Discussion.................................................................................................................................................... 19
Ethical Aspects and Impact On The Society ................................................................................................ 23
Future Perspectives .................................................................................................................................... 24
Acknowledgements..................................................................................................................................... 25
References .................................................................................................................................................. 26
Appendices..................................................................................................................................................... I
Introduction
Arsenic, a natural component of the earth’s crust is a toxic metalloid of geochemical and
anthropological origin existing as the soluble forms of pentavalent arsenate As [V] and trivalent
arsenite As [III]. Environmentally depending on the pH, arsenic exists as arsenate oxyanions
(H2AsO4), (H2AsO4-2) and non-ionic arsenite form (H3AsO3), (H2AsO3-). Both the chemical forms of
arsenic have disastrous effects, but the trivalent arsenite [AsIII] is 25-60 times more toxic when
compared to the pentavalent arsenate [As (V)] (Xiong et al., 2012; Oremland et al., 2009; Silver and
Phung, 2005). Arsenate predominantly occurs in oxidized waters and in reduced environments like
hot spring biofilms, arsenite is present (Duquesne et al., 2008; Oremland and Stolz, 2005; Santini et
al., 2000). Due to the compelling levels of contamination of soil, crops, and water, arsenic poisoning
has emerged as a global concern in and around many parts of the globe (Matilda et al., 2010). In
several countries like India, Bangladesh, China, Chile, Argentina, Japan, Mongolia, Mexico, Nepal,
Taiwan, Poland, and Vietnam and in some areas of the USA, the groundwater has been found to
contain higher concentrations of arsenic than the permissible levels (Nordstrom 2002). The U.S.
Environmental Protection Agency (U.S.E.P.A) have classified arsenic as a potential carcinogenic agent
(Smith et al., 1992). World Health Organization (WHO) has established guidelines for the permissible
levels of arsenic in drinking water at 0.01 mg/L (WHO, 1993) to decrease the health risks associated
with consumption of drinking water exposed to arsenic, but several third world countries follow the
previously declared WHO guideline of 0.05 mg/L (WHO, 1963) owing to economic reasons (Singh et
al., 2011). The drinking water supplies in the region of the Brahmaputra in Bangladesh have been
vastly contaminated with naturally occurring arsenic and the WHO has quoted it as the mass
poisoning of the living population (Smith et al., 2006). Inhibition of energy flow and inactivation of
enzymes involved in DNA repair, DNA replication, phospholipid and nucleic acid synthesis has been
associated with arsenic toxicity (Banerjee et al., 2011, Hughes, 2002; Lynn et al., 1997). There has
been a widespread increase in the neural system damage, cancers related to lung, bladder, kidney,
and skin as the drinking water has been exposed to high levels of arsenic leading to
hyperpigmentation of skin, keratosis of feet and hands which rapidly advance to skin cancers by the
intake of arsenic-contaminated food and water and inhaling air contaminated with spraying of
arsenical pesticides (Argos et al., 2010; Marshall et al., 2007; Ng et al., 2003). Globally, an estimated
100 million people are severely exposed to arsenic through the drinking water contaminated with
high arsenic levels (Singh et al., 2011; Ahsan et al., 2006; Bang et al., 2005).
The impact of the locally occurring atmospheric depositions and interactions of the natural drinking
water with sediments, bedrocks and soils influence the arsenic cycling globally (Meliker et al., 2007).
It has been reported that the redox characteristics of aquifers sediments and bedrocks naturally
influence the concentration of arsenic in the groundwater (Bhattacharya et al., 2002). To overcome
the destructive effects of arsenic poisoning in areas exposed to arsenic, researchers have considered
two main options. 1) Mitigation i.e. finding a water source free from arsenic and 2) Remediation i.e.
elimination of arsenic from the contaminated water before consumption (Visoottiviseth and Ahmed,
2008). For the removal of toxic arsenic substances from the contaminated sources, several methods
have been developed like electrochemical treatment, activated coal adsorption, precipitation,
reverse osmosis, evaporation, coagulation ion-exchange (Chowdhury et al., 2000), bioremediation,
rhizofiltration and phytoremediation (Shrestha and Spuhler 2012). The bioremediation process
involves the bacterial biochemical mechanism of reducing or tolerating the toxic form of arsenic and
converting it into less toxic forms by enzymatic reactions. The pathways through which
microorganisms reduce or oxidize toxic form of arsenic are by biotransformation, periplasmic
biosorption, intracellular accumulation, etc. These mechanisms were reported in organisms
belonging to the species Kocuria, Proteobacterium, Firmicutes, by Banerjee et al. (2011) and
microbes from the species Herminnimonas, Stenotrophomonas, Alcaligenes were reported by Bahar
et al. (2012) and in lactobacillus species by Halttunen et al. (2007). The microbial species belonging
1
to Achromobacter, Rhodococcus, and Aliihoeflea were reported by Sattar et al. (2014) in Bacillus
species by Mondal et al. (2008) in Pseudomonas species by Kao et al. (2013) in Lysinibacillus species
by Lozanzo et al. (2013). The presence of a detoxification mechanism in these microorganisms can
be correlated with the occurrence of arsenite oxidation gene (aox/aro/aso), arsenic respiratory
reduction genes (arr) and arsenic resistance genes (ars) (Chang et al., 2010).
Thermodynamically the arsenate form is more favorable in the majority of the aerobic systems and
enters the bacterial cell through the low-affinity phosphate nutrient transport Pit system (Jackson et
al., 2003) which is expressed constitutively and non-specifically (Comino et al., 2009). During
phosphorylation, arsenate enters the cell through the Pit system as it is a structural analog of
phosphate and replaces the cellular phosphate causing toxicity and death of the cell (Lee et al.,
2003). Microbiologically and chemically following the oxidation - reduction process as shown in
(Figure 1) arsenate is reduced to arsenite, which undergoes biomethylation and released as arsines
like trimethylarsine [TMA (III)]. Bacterial organisms have evolved several mechanisms to overcome
arsenic toxicity like methylation and peroxidation reactions (Lebrun et al., 2003). Arsenite oxidase a
periplasmic membrane-bound enzyme in some Gram-negative bacteria oxidize arsenite to arsenate
(Jackson et al., 2003). Several microorganisms have been identified which are tolerant to lethal
arsenic concentrations (> 60 mM sodium arsenite) yet very little has been known regarding the
genetics of environmental bacteria resistant to arsenic (Mateos et al., 2006).
Figure 1: The biochemistry of arsenate reduction to arsenite with in a cell initiated by arsenate reductase and
undergoes several redox reactions involving methylation reactions to be excluded out as arsines.
The microbial arsenic detoxification pathway is regulated by the ars operon (Jackson and Duggas,
2003). The isolated arsenic resistance determinants (ars) are organized into one transcriptional unit
consisting of three genes (ars RBC) or five genes (ars RDABC). The ars genes have been identified
2
both on plasmids (Owalobi and Rosen, 1990) and chromosomes (Diorio et al., 1995) of numerous
prokaryotes (Broer et al., 1993; Carlin et al., 1995) and eukaryotes (Bobrowicz et al., 1997). The
chromosome of Escherichia coli and other enterobacteriacea members and Pseudomonas
aeruginosa, contains the three ars gene (ars RBC) system, encoding arsenate reductase (ars C) a
soluble enzyme which reduces arsenate to arsenite, arsenite permease (ars B) a membrane protein,
which excludes arsenite from the cell and arsenic transcriptional repressor (arsR) codes a repressor
protein, which regulates the expression of ars operon (Jackson and Duggas, 2003; Rosen, 2002;
Mobley et al., 1982). Some of the microorganisms contain other genes in their operons like arsA an
ATPase enzyme stimulated by oxyanions involved in ATP hydrolysis for extruding arsenicals from the
action of arsB protein. Ars D is identified as a regulatory protein for regulating the overexpression of
arsB (Neyt et al., 1997). Ars operons consisting of four genes in the order of arsR, ORF2, arsB, arsC
have been reported in the skin elements of Bacillus subtilis (Sato and Kobayashi, 1998) which confers
arsenic tolerance to the organism.
Arsenate reductase, a product of the ars C gene enzymatically reduces arsenate to arsenite (Guo et
al., 2005; Demel et al., 2004; Messens et al., 2002; Cohen et al., 2001). The phylogenetic tree of
protein sequences represents three distinguished clades of arsenate reductases belonging to two
bacteria and one yeast family (Mukhopadhyay et al., 2002). The best-studied arsenate reductase is
ArsC of E. coli R773 plasmid of the clade glutaredoxin/glutathione (GrxS/GSH) (Kao et al., 2013). ArsC
protein has an active redox cysteine residue in the active site (Liu et al., 1995; Gladysheva et al.,
1994) which is involved in the mechanism of arsenate reduction (Mukhopadhyay and Rosen, 2002).
The second family of arsenate reductases has the prototype of ArsC of Staphylococcus aureus pI258
plasmid (Ji and Silver, 1992) which is referred to as thioredoxin (Trx) clade related to
phosphotyrosine phosphatases (Demel et al., 2004; Gladysheva et al., 1994). The Ars C of pI258
plasmid has CX5R sequence flanked by the alpha (α) helix and beta (β) strand (Messens et al., 2002)
which uses three cysteine residues for arsenate reduction (Mukhopadhyay and Rosen, 2002). The
arsenate reductase Arr2p in yeast Saccharomyces cerevisiae (Mukhopadhyay et al., 2000) forms the
third clade of arsenate reductases which belongs to the class of tyrosine phosphatases similar to the
cdc25a cell cycle control protein phosphatases found in humans (Slyemi et al., 2012; Martin et al.,
2001). Studies have been published on the eukaryotic arsenate reductases like Leishmania major
arsenic reductase 2 (LmACR2) and Saccharomyces cerevisiae arsenic reductase 2 proteins (ScAc2p)
(Nahar et al., 2012; Mukhopadhyay et al., 2000). To overcome the problem of arsenic contamination
recent studies have identified and detected ars genes in the environmental samples by correlating
their presence with the bacterial isolates resistant to arsenic. The studies of these genes can be used
for designing prospective molecular biomarkers employed for remediation of arsenic (Jackson et al.,
2005).
Lysinibacillus sphaericus B1-CDA, a soil-borne arsenic resistant bacterium identified from the arsenic
contaminated land from the southwest region of Bangladesh grows at 500mM concentration of
sodium arsenate in the medium (Rahman at al., 2014). In earlier studies conducted by Rahman et al.
(2015) using the bioinformatics tool, InterproScan identified arsC gene on the genome of B1-CDA
which is approximately 350bp and found similar to the arsC gene present in Lysinibacillus sphaericus
C3-41 along with other genes like arsB, arsR, and arsSX. In recent years, computational prediction of
structures has yielded highly reliable result in a span of few hours for large sequences unlike the
experimental procedures for structure prediction which are expensive, time-consuming and
laborious (Krivov et al., 2009). The preliminary methods for the prediction of the protein structure
are comparative modeling, threading and detecting similarities between model structure and single
known sequence structure. The other methods include ab-initio modeling (Roy et al., 2010; Wu et
al., 2007) or denovo methods (Fung et al., 2007; Floudas et al., 2006) which predicts protein
structure through a sequence without any similarities between any known structural sequence or
model sequence at fold level. Molecular biology, biomedicine, and biochemistry provide enormous
3
potential for prediction and analysis of protein structures within a span of a few hours by employing
bioinformatics tools.
Aim
The long-term goal of this study is to determine the molecular function of the arsC gene in
Lysinibacillus sphaericus B1-CDA, whether it is involved in the accumulation of arsenic within the
bacterial cells. If so, arsC gene could be involved in the removal of arsenic from the areas heavily
polluted with arsenic. The short-term goals of this study are (i) to clone the arsC gene from L.
sphaericus B1-CDA (ii) characterize the molecular function of arsC gene by in silico and in vitro
analysis (iii) perform complementation studies by the transfer of arsC gene from B1-CDA to a strain
lacking arsC gene.
4
Materials and Methods
3.1. In silico Studies
The genome sequence of L. sphaericus B1-CDA was obtained from the GenBank with accession
number PRJEB7750, [http://www.ebi.ac.uk/ena/data/view/PRJEB7750] (Rahman et al., 2015). The
arsC protein sequence contained 118 amino acids (Appendix I). To predict the structure and function
of arsC protein of L. sphaericus B1-CDA, the Iterative Threading Assembly Refinement (I-TASSER)
(Roy et al., 2010) methodology (Figure 2) were used as follows:
3.1.1. Threading
This is the first stage of I-TASSER where the arsC protein query sequence was matched with the
database of non-redundant sequences by position specific iterated – BLAST (PSI-BLAST)
(Margelevicium et al., 2010). From the homologous sequences, multiple alignments were created
which was then used by position specific iterated - prediction (PSI-PRED) (McGuffin et al., 2000) for
predicting the secondary structure of the protein. A locally installed meta-threading-server
(LOMETS) (Wu and Zhang, 2008) along with seven other threading programs SP3, PPA10,
PROSPECT54, MUSTER53, HHSEARCH44, FUGUE52 and SPARKS56 (Margelevicium et al., 2010)
threaded the query sequence whose secondary structure was predicted with sequence profile. In
each threading program, structure based and sequence based scores formed the basis for ranking of
templates. The template with the highest rank was further considered and based on the statistical
significance, i.e. Z – Score of the leading threading program the quality of template alignment was
judged.
3.1.2. Structural Assembly
In this stage, on the basis of ab-initio modeling, (Bonneau and Baker, 2001) continuous fragments
excised from the template structure in threading programs were assembled to structural
conformations aligning with the unaligned regions. The fragments were assembled by a modified
replica exchange technique, i.e. Monte Carlo simulation technique (Zhang et al., 2004) guided by the
data derived from the protein database (PDB) (Berman et al., 2000), Support vector machine
sequence (SVMSEQ60) (Lee et al., 2009) prediction based on sequence contact and threading
templates spatial restraints. The refinement simulations clustered by SPICKER (Zhang and Skolnick,
2004) which is a well-organized and simple clustering program generates cluster centroids by
averaging the clustered structural entices.
3.1.3. Model selection and refinement
The cluster centroids generated from structural assembly were selected and the fragments
assembled were subjected to simulations. From the threading alignments LOMETS, external
constraints were generated and Template model alignment (TM-align) (Zhang and Skolnick, 2005)
identified the PDB structures closest to cluster centroids of SPICKER helping refinement of the global
topology of the SPICKER cluster centroids. The clustered structural entices from structural assembly
were clustered again and REMO (Zhang, 2009) selected the lowest energy cluster as input generating
concluding structural model by constructing complete atom models from the traces of Cα and sidechain interactions of the proteins (Holm and Sander, 1991) from PDB by optimization of networks of
hydrogen bonding (McDonald and Thornton, 1994).
3.1.4. Structure-based functional annotation.
The function of the arsC protein was conferred by predicting the structural identical 3D model
against the PDB database of proteins of known structure and function. The query protein structural
analog from the gene ontology (GO) library (Roy et al., 2010) using TM - align were matched on the
5
basis of global topology and from the frequency of recurrence of GO terms a consensus was derived.
From the libraries of the binding site and enzyme commission (EC) (Bartlett et al., 2002), structural
analog was matched for local and global similarities in structure. The local similarity structure search
was used for a method complementary to the global similarity for identification of analogs having a
dissimilar fold, but due to the binding/active sites conservation performed the same function. The
global similarity structure search was used for the recognition of proteins with the identical global
fold. The global search functional analog results were based on structural patterns conserved in the
current model measured by TM-score, structural alignment coverage, sequence identity and
reconstructed atomic model (REMO) (Li and Zhang, 2009).
Figure 2: Prediction of the 3D structure and function of the arsC protein by iterative threading assembly
refinement (I-TASSER) methodology includes five major steps. Step 1: Submission of the query sequence in the
I-TASSER server Step 2: Threading of the submitted sequence uses three programs-PSI-BLAST, PSI-PRED, and
LOMETS Step 3: Structure assembly by SVMSEQ and SPICKER Step 4: Model selection by TM-align and
refinement by REMO Step 5: Structural and functional prediction by Enzyme Commission, Gene Ontology
terms, and ligand binding sites.
3.2. Bacterial strains, Chemicals, and Growth
The bacterial strains used in this study were Lysinibacillus sphaericus B1-CDA (University of Skövde,
Sweden) and Escherichia coli JW3470-1 mutant strain (arsC gene knocked down) (Coli Genetic stock
center, USA). The bacteria were cultured overnight in LB broth at 37 °C at 180 rpm in a closed shaker
incubator with selective antibiotic marker ampicillin at a concentration of 100 µg/ml. The source of
arsenic in this study was sodium arsenate from Sigma - Aldrich and the working concentration
ranged from 5 mM to 100 mM (Mergeay et al., 1995).
3.3. Nucleic acid isolation.
The cloning of arsC gene involved the isolation of RNA from L. sphaericus B1-CDA strain in preparing
cDNA to be used for ligation into pGEM-T cloning vector from Promega for transformation. RNA
isolation from L. sphaericus B1-CDA and E. coli JW3470-1 mutant and the transgenic strain were
performed by MasterpureTM RNA purification kit (Epicentre). The arsC gene ligated into the pGEM-T
vector and cloned in mutant E. coli JW3470-1 competent cells was isolated from E. coli JW3470-1
transgenic strain using the plasmid mini Kit (Qiagen). The arsC gene PCR product obtained by
amplification of E. coli JW3470-1 plasmid using gene specific primers (Appendix II), PCR reaction
6
components (Appendix III) and PCR program (Appendix IV) was purified by QIAquick PCR purification
kit (Qiagen). All the isolation procedures were performed in triplicates following the respective kit
manufacturer’s protocols. Nanodrop 2000 (Thermo Fisher Scientific, Wilmington, USA) was used to
measure the concentration and purity of isolated nucleic acids.
3.4. Primer design.
The ars operon of Lysinibacillus sphaericus B1-CDA contains arsenic metabolizing genes which confer
arsenic resistance to the organism (Rahman et al., 2015). The sequence of the arsenic reductase
gene arsC identified in the ars operon of the genome of L.sphaericus B1-CDA was retrieved from the
GenBank database [http://www, ebi.ac.uk/ena/data/view/PRJEB7750], accession number
PRJEB7750. Gene-specific primers (Appendix II) were custom designed to amplify an arsC gene using
Primer3Plus software (Untergasser et al., 2007).
3.5. RT – PCR
One step 50 µl reverse transcriptase - PCR reaction was set up using the isolated RNA of B1-CDA at a
concentration of 100 ng/µl following the reaction protocol of MasterAmpTM High fidelity RT – PCR kit
(Epicentre) with RT-PCR reaction components (Appendix V) and cycling conditions (Appendix VI). The
resultant cDNA was purified by QIAquick PCR purification kit (Qiagen) following the kit protocol and
3 µl of the purified sample was visualized by agarose gel electrophoresis.
3.6. Cloning of arsC gene
The purified cDNA product obtained from the RT-PCR of B1-CDA strain was ligated into a commercial
cloning vector pGEMT procured from Promega. An overnight ligation reaction (Appendix VII) was set
up according to the kit protocol along with a negative control. The ligation mixture was transformed
into E. coli JW3470-1 mutant strain competent cells by the calcium chloride transformation protocol.
The transformed reaction mixture was plated on LB plates with ampicillin at a concentration of 100
µg/ml and incubated overnight at 37 °C.
3.7. Colony PCR
The screening of recombinants was performed by colony PCR using arsC gene primers. A 50 µl PCR
reaction was set up using MasterAmpTM PCR kit (Epicentre), PCR reaction mixture (Appendix III) and
cycling conditions (Appendix IV) followed by agarose gel electrophoresis to visualize the amplified
arsC gene.
3.8. Agarose gel electrophoresis
The amplified PCR products, RT – PCR products and purified PCR products were analyzed by agarose
gel electrophoresis. The electrophoresis running buffer was 1 X Tris acetate EDTA (TAE) prepared
from the stock concentration of 50 X TAE. All the PCR products were visualized on 2% agarose gel
run at 100V. GelredTm (Biotium) at 1X concentration was used as an intercalating dye. As a reference
for a molecular weight 2log DNA ladder (NEB) and 6X gel loading dye at 1x concentration (NEB) was
used.
3.9. DNA sequencing
The plasmid was isolated from the transgenic strains of E. coli obtained from the cloning of the
ligated arsC gene with pGEM-T vector into E. coli mutant JW3470-1 strain. Using gene specific
primers (Appendix II) and about 50ng of the plasmid as template, the arsC gene was amplified by
PCR reaction mixture (Appendix III) following the cycling conditions (Appendix IV). The amplified PCR
product was purified using the QIAquick PCR purification kit (Qiagen). The purified PCR product
7
about 100 ng was sent for DNA sequencing along with 10 µl of gene-specific primers to KIGene at
Karolinska University Hospital, Stockholm, Sweden. The sequencing results were analyzed with the
existing sequences and amino acid databases in the NCBI using BlastN.
3.10. Arsenic resistance assay
The minimum inhibitory concentration (MIC) of arsenic for transgenic and mutant E. coli JW3470-1
strain was determined to optimize the arsenic concentration for ICP-MS and bacterial growth
analysis. The bacterial cultures of the transgenic and mutant E. coli JW3470-1 strains were
separately cultured overnight at 37 °C in LB broth ( 100 µg/ml ampicillin for transgenic strains and no
ampicillin for mutant strains) to be used as inoculum. For the assay, one percent inoculum was
transferred to 50 ml LB broth in conical flasks supplemented with arsenate at concentrations of 5
mM, 10 mM, 50 mM and 100 mM. The samples were cultured in one parallel set at 37 °C at 180 rpm
in a shaker incubator. The growth of the bacterial cultures was measured after 24 hours by optical
density at 600nm by WPA Bio wave CO 8000 cell density meter and the variation in the bacterial
growth between the transgenic and the mutant E. coli JW3470-1 strains were observed as illustrated
by Musingarimi et al. (2010)
3.11. Growth and Inductively Coupled Plasma - Mass Spectroscopy (ICP-MS)
analysis
The inoculum for bacterial growth assay was prepared by overnight culturing of transgenic and
mutant E. coli JW3470-1 strains at 37 °C in LB broth (100 µg/ml ampicillin for transgenic strains and
no ampicillin for mutant strains). One percent inoculum was added to all the samples. The cells were
cultured separately in three parallel sets of 50 ml LB broth in conical flasks supplemented with 50
mM arsenate and incubated at 37 °C at 180 rpm in a shaker incubator till 96 hours. The turbidity of
the bacterial cells was recorded every 24 hrs at an absorbance of 600nm by cell density
spectrophotometer meter as described by Kostal et al. (2004). After the absorbance readings were
measured in every 24 hours the bacterial cultures were prepared for ICP-MS analysis following the
protocol by Rahman et al. (2015). The bacterial cultures were harvested by centrifugation at 10000 g
for 10 min. After harvesting the samples, the supernatant or cell free media was separated from the
pellet and stored at 4 °C. The pellet was washed thrice with de-ionized water and dried until the cell
dry weight was achieved after discarding the water. For the analysis of total arsenic, supernatant
with blank, i. e. media containing arsenate but not exposed to inoculum was sent to ICP-MS analysis
at Eurofins Environment Testing Sweden AB (Lidköping, Sweden)
3.12. Statistical analysis
The data were statistically analyzed to compare the difference in the growth rate between the
mutant E. coli JW3470-1 and transgenic E. coli JW3470-1 strains. Mann-Whitney U test was
performed and the significance level was set at 0.05 (p<0.05). The results were reported in mean
value and standard deviation.
8
Results
I-TASSER server, an online tool was used for the in silico analysis of ars C protein predicting the
secondary and tertiary structure including the putative function of arsC protein.
4.1. Prediction of secondary structure
The arsC protein query sequence was aligned against 5,798 non-redundant protein structure in ITASSER using PSI-BLAST. The secondary structure of the protein was predicted by PSI-PRED. The
predicted structures were defined with the β-sheets, α-helices, and coiling structures along with
confidence scores (Figure 3). LOMETS threaded the query sequence from the PDB library generating
top ten best models from ten highly efficient threading programs depicting Z-score values and
sequence identity coverage. The models were selected based on normalized Z-score greater than
one which indicates a good alignment with the query sequence. Higher the Z - score more closely the
alignment with the query sequence. The model ranked as 1 (Table 1) was identified as 3gKxB with a
normalized Z - score of 3.95 and identity coverage of 0.47 and 0.48 percentage and was identified
from the protein database as putative arsC protein from Bacteroides fragilis.
Figure 3: Prediction of secondary structure of arsC protein by PSI-PRED. Blue indicates beta sheets(s), Red
alpha helix (H), Black coil and Confidence score (Conf. Score) of each amino acids from 1-9.
Table 1: Prediction of a protein model by I-TASSER used top ten threading folds. The accuracy of model-1 was
predicted based on the Z-score. Normalized Z-score > 1 indicates a good alignment.
Rank
PDB Hit
Iden1A
Iden2B
CovC
Norm Z-scoreD
1
3gkxB
0.47
0.48
0.98
3.95
2
2m46A
0.52
0.51
0.97
3.02
3
2m46A
0.51
0.51
0.99
4.11
4
2m46A
0.51
0.51
0.99
1.94
5
2m46A
0.51
0.51
0.99
2.06
6
2m46A
0.52
0.51
0.99
3.96
7
2m46A
0.52
0.51
0.98
3.16
8
3gkxA
0.50
0.49
0.99
2.56
9
9
2m46A
0.52
0.51
0.99
3.82
10
3rdwA
0.23
0.24
0.97
4.08
A)
Percentage of sequence identity of the templates in the threading aligned region with the query sequence
B)
Percentage of sequence identity of the whole templates with query sequence
C)
Coverage of threading alignment=number of aligned residues divided by the length of query protein
D)
Normalized Z-scores of the threading alignment
4.2. Prediction of 3D structure
Based on the pairwise structure similarity, I-TASSER used the spicker cluster centroids to generate
3D structures (Figure 4). The quality of the model was measured by C-score or confidence score. The
C-score ranges from [-5 to 2]. Higher the C-score, higher the quality of the predicted structure. The
C-score of the predicted structure is 1.27. The TM - score and RMSD was used to measure the
structural similarity of the predicted structure. The TM- score < 0.17 indicates random similarity and
a TM –score > 0.5 indicates the model of correct topology. The TM -score of the predicted structure
is 0.89 + 0.07, RMSD = 1.9 + 1.6 Å suggesting correct topology and proper quality. The full-length
model was generated by I-TASSER by cluster density using Monte Carlo simulations.
Figure 4: Predicted 3D structure of arsC protein by I-TASSER. Prediction is based on TM-score, RMSD, C-score
and cluster density, which determines the quality of the model. With the increase in C-score and cluster
density, the quality of the model is enhanced. Refinement of the model is done by TM-score and RMSD values.
The model’s C-score = 1.27, TM-score = 0.89 + 0.07, RMSD = 1.9 + 1.6 Å. The arsC protein ribbon presentation
indicates the active sites for ligand binding at the red loop Tyr-8, Cys-11, Thr-13, Arg-95, Gly-107, Phe-108.
4.3. Structural analogs to the target protein identified by TM-Align
The TM-align program was used to match the first generated I-TASSER model with all the structures
in the PDB library, thereby generating top ten models which have the close structural similarity to
the predicted I-TASSER model resulting in the proteins having similar functions to the target protein.
The ranking of the proteins was made on the basis of highest TM-score indicating closest structural
similarity to the query sequence from across the PDB library. The PDB hit 2m46A (Table 2) is ranked
model number one due to highest TM-score of 0.965 when compared to the other PDB hits.
10
Table 2: Prediction of the structural analogs of arsC protein based on TM-score.
Rank
PDB Hit
TM-scoreA
RMSDB
IDENc
COVD
1
2m46A
0.965
0.68
0.513
0.992
2
3Gkxb
0.896
1.46
0.483
0.983
3
3fz4A
0.844
1.82
0.422
0.975
4
1z3Ea
0.840
1.73
0.243
0.975
5
3rdwA
0.839
1.69
0.219
0.966
6
1s3da
0.817
1.96
0.202
0.966
7
3f0Ia
0.807
1.88
0.214
0.949
8
1rw1A
0.777
2.13
0.264
0.932
9
3I78A
0.703
2.60
0.287
0.907
10
2kokA
0.676
3.22
0.226
0.949
A)
Measure of structural similarity between the predicted model and the native structure.
B)
Root mean square deviation (deviation between residues that are structurally aligned by TM-score.
c)
Percentage sequence identity in the structurally aligned region.
D)
Coverage of the alignment by TM align=number of structurally aligned residues divided by the length of the
model.
4.4. Structure-based functional annotation
The function of the query protein was predicted by the estimation of confidence scores and
functional analogs based on GO terms, EC number and active sites and ligand binding sites.
4.4.1. Ligand binding sites
To identify the function of protein, the ligand binding sites of the query protein were predicted by I TASSER. The prediction was made on the basis of global and local structural similarities between the
binding sites on the predicted 3D model run against the 19,658 non-redundant proteins with known
ligand binding site in the PDB library. The template 1k0nA (Table 3) has the highest C-score of 0.43
with six ligand binding site residues and 19 clusters. The template 3ihqA (Table 3) has the lowest Cscore of 0.04 with three ligand binding sites lacking complex structure. The identified ligand binding
site residues in the template 1k0nA with higher C-score of 0.43 are Tyr-8, Cys-11, Thr-13, Arg-95,
Gly-107, Phe-108.
11
Table 3: Prediction of the ligand binding sites of model 1 based on the measure of global and local sequence
and structural similarities between the template binding site and predicted binding site of model 1. Higher Cscore and larger cluster size were used for significant predictions.
Rank
C-score
Cluster size PDB HIT
Lig name
Ligand Binding site residues
1
0.43
19
1K0nA
GSH
8,11,13,95,107.108
2
0.13
6
3mszA
GSH
13,94,96,107,108,109
3
0.04
2
1i9dA
SO4
13,16,109
4
0.04
2
1J9bA
TAS
11,12,13
5
0.04
2
3ihgA
IMD
60,61,94
4.4.2. Prediction of Enzyme Commission (EC) and active sites
The predicted 3D model structure was run against a PDB library of 5,798 non-redundant entries with
known EC numbers. On the basis of global and structural similarities the structural analogs of the
enzyme commission and active sites were matched using TM-align. The analysis generated top five
models ranked according to the TM-score. The PDB hit 1sk1A (Table 4) has the highest EC number of
1.20.4.1 with three active site residues indicating V-111, H-7, P-9 i.e His-7, Pro-9, Val-111. The lowest
EC number is for PDB hit 2gsdA (Table 4) with no active site residues. The PDB hit 1sk1A was
identified as the enzyme arsenate reductase (glutaredoxin) recognized from ExPASy.
Table 4: Prediction of functional analogs of model 1 based on Enzyme commission (EC) score. A ranking of the
model was based on EC score. Higher the EC score higher is the confidence of the prediction.
Rank
EC-score
PDB hit
TM-score RMSD
IDEN
Cov
EC
Number
Active Site
Residues
1
0.465
1sk1A
0.807
1.99
0.195
0.958
1.20.4.1
7,9,11
2
0.282
2f8aA
0.568
3.43
0.080
0.822
1.11.1.9
52
3
0.280
2he3A
0.517
3.94
0.058
0.831
1.11.1.9
NA
4
0.279
2nadA
0.506
4.35
0.063
0.890
1.2.1.2
NA
5
0.278
2gsdA
0.502
4.42
0.063
0.890
1.2.1.43
NA
4.4.3. Prediction of Gene Ontology (GO) terms
The structural analogs of the predicted 3D protein were matched against a PDB library of 26,045
non-redundant entries with known GO terms and a consensus was derived based on the number of
occurrences of GO terms. The consensus prediction of GO terms involved the prediction of
molecular function, biological process, and the cellular component along with the respective GO
scores (Table 5). Higher values of GO scores indicate more confident predictions. The molecular
function with GO terms, GO: 0008794 and GO – scor0e 0.80 is predicted as an arsenate reductase
(glutaredoxin) activity. The biological process predicted with GO terms, GO: 0055114 and highest GO
- score of 0.85 is an oxidation - reduction process. The cellular component GO term, GO: 0043231
with a highest GO - score of 0.58 is predicted as cytoplasm. The predicted 3D protein has a molecular
12
function of arsenate reductase activity involved in biological oxidation-reduction process which
occurs at the cytoplasmic cellular component.
Table 5: Prediction of GO - terms, i.e. molecular function, biological process and cellular component by ITASSER for the 3D structure of the protein based on GO-score. GO-score > 0.5 signifies a high confidence
prediction.
Gene Ontology (GO)
GO term
GO score
Molecular function
GO:0008794 (arsenate reductase (glutaredoxin) activity)
0.80
Biological process
GO:0055114 (oxidation – reduction process)
0.85
Cellular component
GO: 0043231 (cytoplasm)
0.58
4.5. Prediction of Gene Function
On the basis of structural and functional predictions of arsC protein a novel hypothetical pathway of
arsenate reduction in B1-CDA was proposed (Figure 5). The first step in the biochemical pathway
involves the binding of an oxyanion [As (v)] to the enzyme arsC reductase non-covalently with the
nucleophilic attack on cysteine - 11. This results in the formation of thiarsahydroxy intermediates in
step two with thiolate of activated cysteine-11 which is stabilized by hydrogen bonds of arginine-95.
In step three glutathionylated intermediates formed are reduced to novel intermediates Arsc-S-AS +O which in the final step dissociates to produce free arsenite As (III) which undergoes biomethylation
and extruded from the cell as arsines.
Figure 5: Prediction of a novel hypothetical biochemical mechanism of arsenate reduction by the arsC gene in
B1-CDA. Step 1: involves the nucleophilic attack on cys11 bound non-covalently to arsenate releases OH- ions.
This intermediate formed in step 2: is glutathionated by the release of the H2O molecule. Step 3: involves the
reduction of arsenate to arsenite by binding of glutaredoxin (GrxSH) resulting in the formation of intermediate
compound dihydroxy monothiol arsenate and releases GrxS-SG and GSH. Step 4: positively charged arsenic
with a mono hydroxy intermediate is formed. Free arsenite is released with the addition of OH- ions and
undergoes biomethylation.
13
4.6. Concentration and Purity of Nucleic acids
The concentration and purity of RNA isolated from L. sphaericus B1-CDA and transgenic strains E. coli
JW3470-1 were measured by nanodrop 2000. The plasmid was isolated from transgenic E. coli
JW3470-1 strain. The nanodrop readings of the best isolated samples (Table 6) were reported which
were further used in the study.
Table 6: The concentrations and purity level of the RNA and DNA isolated from B1-CDA and transgenic strains.
SN
Samples
Concentration (ng/µl)
260/280
260/230
1.
RNA (B1-CDA)
945.4
2.1
2.18
2.
RNA (transgenic strain)
1331
2.13
2.22
3.
Plasmid
531
1.81
2.0
4.7. Expression of arsenate reductase gene (arsC) in B1-CDA
Having established the nature of ars operon in the L. sphaericus B1 - CDA strain (Rahman et al.,
2015), expression of arsC gene was checked by using gene-specific primers by endpoint RT - PCR as
explained in materials and methods. The resultant cDNA with an expected length of 350 bp was
visualized on 2% agarose gel (Figure 6) and purified by PCR purification kit and further used for
cloning.
Figure 6: The expression level of the arsC gene of B1- CDA detected by the complete reverse transcription of
RNA to cDNA by RT-PCR. The results were visualized on 2% agarose gel by GelRedTm, 6X loading dye, and 1X
TAE running buffer. Lane L - NEB 2Log DNA Ladder, Lane 1 - cDNA approx. 350bp.
14
4.8. Transformation and Amplification of arsC gene
The purified cDNA obtained from the RT- PCR of L. sphaericus B1-CDA was transformed into E. coli
JW3470-1 mutant strain lacking arsC gene using a commercial cloning pGEM - T vector (Appendix IX).
After transformation, several recombinant clones (transgenic) were screened for the presence of
arsC genes by colony PCR (Figure 7). The arsC gene from the plasmid of transgenic strain was
amplified by PCR and PCR purified product was sent for DNA sequencing.
Figure 7: Transformation of arsC gene was confirmed by colony PCR with arsC-F/arsC-R primers. The results
show arsC gene amplified at approx. 350bp. Lane 1- negative control, Lane 2- NEB 2log DNA ladder. Lane 3, 4,
5 - colonies picked from transformed plates.
4.9. BLASTN search of arsC gene
The DNA sequencing results of arsC gene were aligned with the NCBI BLASTN search of the nonredundant GenBank, EMBL, PDB, DDBJ database which revealed a homolog of arsC gene on the
chromosome of Lysinibacillus sphaericus C3-41 (Rahman et al., 2015). Alignment of the predicted
nucleotide sequence of arsC gene in the non-redundant database of NCBI showed that it was 86%
identical to the L. sphaericus C3-41 (Figure 8).
Figure 8: The DNA sequencing of the 350 bp fragment transformed into E. coli JW3470-1 showed 86% identity
with the Lysinibacillus sphaericus C3-41 indicating that it belongs to Lysinibacillus sphaericus family. BLAST
alignment was performed in the NCBI database. The query indicates sequenced arsC gene and subject
Lysinibacillus sphaericus C3-41.
15
4.10. Expression of arsC gene in transgenic strains
The RNA was isolated from the transgenic E. coli JW3470-1 strains, the concentration of which is
described in Table 6. The expression of the arsC gene in the transgenic strain was studied by RT-PCR
as mentioned in materials and methods and the results were visualized on 2% agarose gel (Figure 9).
Figure 9: The expression of arsC gene in transgenic E. coli JW3470-1 detected by the complete reverse
transcription of RNA to cDNA by RT-PCR. The results were visualized on 2% agarose gel by GelRedTm, 6X loading
dye, and 1X TAE running buffer. Lane L- NEB 2Log DNA Ladder, Lane 2 and Lane 3 - cDNA approx. 350 bp.
4.11. Arsenic resistance analysis
The cell growth of mutant and transgenic E. coli JW3470-1 strains grown in LB medium containing 5
mM to 100 mM of sodium arsenate measured after 24 hours exhibited tolerance to arsenic. The
transgenic E. coli JW3470-1 strain had high growth density at an absorbance of 600 nm measured by
cell density spectrophotometer when compared to the mutant E. coli JW3470-1 strains as shown in
(Table 7).
Table 7 : The bacterial growth of mutant and transgenic E. coli JW3470-1 strains grown at different
concentrations of arsenic. The growth was measured at absorbance 600nm.
OD at 600 nm
Concentration of arsenic (mM)
Mutant E. coli JW3469-1
Transgenic E. coli JW3469-1
5
0.53
1.50
10
0.21
1.26
25
0.10
0.92
50
0.09
0.74
100
0.06
0.26
16
4.12. Statistical analysis of growth.
The optimal growth of mutant and transgenic E. coli JW3470-1 strains was seen at 50 mM
concentration of arsenate from arsenic tolerance assays hence bacterial cultures were grown in 50
mM arsenate over a range of time intervals (24 hours to 96 hours) and bacterial growth was
measured by optical density at 600 nm. A Mann - Whitney U test was run to determine if there was a
difference in the bacterial growth between the mutant and transgenic E. coli JW3470-1 strain at 50
mM sodium arsenate. Median growth of the mutant (median = 0.155, n = 4) and the transgenic
(median = 0.595, n = 4) strains were significantly different, U = 4, p = 0.02 (Figure 10). The p - value is
lesser than the significance level p<0.05. Hence it can be concluded that there is a significant
difference in the growth pattern of the mutant and transgenic E. coli JW3470-1 strain when exposed
to arsenic.
Figure 10: Comparison of the mean growth rate between the mutant and transgenic E. coli JW3470-1 strain.
The blue curve indicates mutant strain and brown indicates transgenic strain. The curve was plotted with
optical density (600 nm) values vs time interval (hours). The P value was found to be 0.02 from Mann Whitney U test which is less than the significance level p<0.05. Thereby indicating a significant difference in
the growth rate between mutant and transgenic strains. The error bars were plotted with 95% Confidence
interval and + 1 SD.
17
4.13. Analysis of arsenate accumulation
The amount of arsenate accumulated from the cell free medium by mutant and transgenic E. coli
JW3470-1 strains was analyzed by ICP-MS (Table 8). The bacterial strains were cultured in the
presence of 50 mM arsenate at varying time intervals of 24 hours to 96 hours.
Table 8: The amount of arsenic accumulated from the supernatant analyzed by ICP-MS for mutant and
transgenic E. coli JW3470-1 strain cultured at 50mM arsenate concentration for 24 hours to 96 hours (n = 1).
Concentration of arsenic in cell free broth (mM)
Time Interval (hours)
Mutant E. Coli JW3470-1
Transgenic E. Coli JW3470-1
24
9.29
8.33
48
10.89
10.57
72
12.78
11.85
96
13.16
13.15
18
Discussion
As mentioned in the introduction, arsenic contamination of water and foods is a severe danger to
human health in many regions of the world, particularly in the South-East Asia. One of the most ecofriendly and inexpensive methods for removal of arsenic from the contaminated water or effluents
might be the use of microorganisms (Rahman et al. 2015; Congeevaram et al., 2007) occurring
naturally in the environment. Several researchers have been engaged in isolating and characterizing
arsenic resistant microorganisms from the arsenic contaminated regions and in-situ bioremediation
has garnered a lot of importance in recent years. Because it is cost effective, most efficient and
environmentally safe mechanism for the detoxification of arsenic. Also, this method has many
advantages over the conventional processes (Pous et al., 2015; Escalante et al., 2009). During the
past decades, several bacterial strains have been reported to be highly resistant to arsenic (Sattar et
al., 2014; Kao et al., 2013; Jenson et al., 2010). They can grow and survive in the arsenic
contaminated environment. One of such resistant strains is the L. sphaericus, B1-CDA which was
previously isolated and reported by a research team at the University of Skövde, Sweden (Rahman et
al., 2014). The genome sequence of this strain B1-CDA has proven that this bacterium contains
several genes responsive to metal ions like arsenic, iron, cobalt, manganese, nickel, lead, zinc,
cadmium and chromium (Rahman et al., 2015). One such gene is the arsC gene responsive to
arsenic. In this thesis work, an attempt was made to determine the molecular function of the arsC
gene in L. sphaericus B1-CDA by both in silico and in vitro studies.
The in silico analysis of arsC protein of Lysinibacillus sphaericus B1-CDA involved the analysis of 118
amino acid sequence obtained from GenBank database with accession number PRJEB7750
[http://www.ebi.ac.uk/ena/data/view/PRJEB7750] (Rahman et al., 2015) by I-TASSER server (Roy et
al., 2010). I-TASSER is a consolidated automated platform for modeling of protein structures and
prediction of function on the basis of query sequence-structure-function prototype. By this method
the secondary structure and 3D structure of the query protein was predicted revealing the substrate
bound residues. The molecular, cellular and biological function of the query protein was predicted
on the basis of the 3D structure. The results from the in silico experiments are credible as various
repeated benchmark tests form the basis of Bioinformatics analysis (Lee et al., 2009). The in silico
analysis, when compared to in vitro experiments, yield faster, significant results, and is inexpensive.
Protein crystallography and Nuclear Magnetic resonance (NMR) can be employed in constructing a
3D model of a protein (Pittelkow et al., 2011). These approaches are less preferred as they are
laborious, expensive and time-consuming. However, there are disadvantages of performing in-silico
analysis as it becomes highly difficult to prioritize and choose a satisfactory candidate as it considers
various parameters of selection. In the current experiments 3gkxB (Z-score 3.95) was selected as a
primary structure for further analysis, even though the PDB hits 2m46A (Z score-4.11) and 3rdwA (Zscore 4.08) had higher Z-scores (Table 1). In this current case, sequence identity (Iden1 and Iden2)
was the deciding factor for ranking top fold suggested by the algorithmic computer analysis.
However, the selection of the target model or structure is simplified when every suggested
parameter indicates increased significance values. In our current experiment, I-TASSER server
selected one model and ranked it one due to highest c-score 1.27 (Figure 4) which also showed
highest TM-score (0.89+0.07) and RMSD 1.9+1.6 Å. The functional similarities of a predicted
structure and analogs detected are indicated by the GO-score. As the function and structure of a
protein are multifaceted, the similarities of functionalities between the identified analogs result in
the association of many GO terms. Based on the GO-scores, GO-term consensus prediction was
performed to choose reliable GO terms (table 5). Based on the highest GO scores the biological,
cellular and molecular function of the protein was predicted and the hypothetical biochemical
mechanism was postulated (Figure 5). Studies on resistance of arsC gene to arsenate and arsC
protein encoding arsenate reductase have been conducted in yeast and other bacteria (Ye et al.,
2011; Demel et al., 2004) where protein crystallography and NMR have been used for constructing
protein models of arsC protein where the mechanism of arsenic binding to arsenate reductase with
19
cysteine residues was studied in E. coli R773 plasmid (Shen et al., 2013; Ye et al., 2011) and found to
be similar to the arsC protein bound to cysteine residues in B1-CDA. The arsC protein structure,
based on molecular modeling in Herbaspirillum sp. GW103 (Govarthanan et al., 2015) and homology
modelling in cronobacter sakazakii BAA-894 strain (Chaturvedi et al., 2013) revealed close structural
similarities with arsC family proteins of Pseudomonas aeruginosa and Escherichia coli respectively,
where it is proven that arsC protein-coding for arsenate reductase are involved in the reduction of
arsenic from the environment which is in accordance with our novel findings of arsC protein in B1CDA which codes for arsenate reductase and involved in reduction of arsenic.
In vitro laboratory experiments were performed to validate the in silico results of arsC gene. The
nucleic acid isolations were performed in triplicates and duplicates, but the samples with good
concentrations and high purity levels, which were further used in the study were compiled together
with their nanodrop readings (Table 6). The other nucleic acid samples with lesser purity and
concentration in comparison to the best samples were stored in -80 °C. They could be used as a
second source of samples in the study in higher concentrations if the best samples got completely
used up, as the isolation procedures involving culturing of bacterial strains is time-consuming and
laborious. To assess the concentration and purity of the isolated DNA and RNA, the absorbance ratio
at 260 nm and 280 nm was used. The 260:280 ratio of DNA and RNA at approximately 1.8 and 2.0
respectively is acknowledged as a pure form of DNA and RNA and seen as a rule of thumb. The
260:230 ratio used as a secondary measure for purity, generally lies in the range of 2.0 - 2.2
(Leninger 1997). On the basis of the rule of thumb, the concentration and purity of isolated nucleic
acids were found to be good (Table 6).
The arsC gene identified on Lysinibacillus sphaericus B1-CDA shows that it is responsive to arsenic
(Rahman et al., 2015). The similar action of arsC gene was identified on R773 plasmid in E. coli (Ye et
al., 2011; Kao et al., 2011; Demel et al., 2004) and arsC gene with similar phenotype was detected on
Staphylococcus plasmids pSX267 and pI258 and chemolithotrophic Zetaproteobacteria (Jesser et al.,
2015; Demel et al., 2004). The expression of arsC gene in B1-CDA was checked by RT-PCR with an
expected 350bp length with arsC-F/arsC-R gene specific primers designed by primer3plus software
(Untergasser et al., 2007) (Figure 6). The cDNA formed was purified and ligated into a pGEM-T
cloning vector and transformed into E. coli JW3470-1 mutant strain in which the arsC gene is
knocked off. The resultant transgenic E. coli JW3470-1 strain complemented with arsC gene were
verified by colony PCR (Figure 7). The absence of target arsC gene in mutant E. coli JW3470-1 strain
was confirmed by amplification with PCR by arsC-F/arsC-R primers. In bacterial species, Halomonas
and Acinetobacter a 409 bp arsC gene coding for arsenate reductase were amplified (Sunita et al.,
2012). The DNA sequence alignment by NCBI BLASTN showed 86% similarity to Lysinibacillus
sphaericus C3-41 (Figure 8) and is identical in function (Rahman et al., 2015) which confirms our
cloning results. The sequencing results of arsC gene in Bacillus sp. SXB by Wu et al. (2013) showed
99% similarity with Bacillus thuringenesis sp. The expression of the arsC gene in transgenic strain was
verified by RT-PCR (Figure 9) which concludes that the complementation studies have made the
transgenic strain resistant to arsenic. Similar studies were performed by the isolation of arsenate
reductase genes from Acinetobacter species conferring arsenic resistance in E. coli JM109 and
WC3110 by complementation studies (Sunitha et al., 2015) and confirms our studies that the
presence of arsC gene confers an organism resistance to arsenic.
Arsenic resistance assays were conducted to analyze the minimum inhibitory concentration of
transgenic and mutant E. coli JW3470-1 strains by varying the concentrations of sodium arsenate
(Musingarimi et al., 2010). The bacterial strains were grown in medium containing sodium arsenate
at 5 mM, 10 mM, 25 mM, 50 mM and 100 mM till 24hrs to conclude a final single concentration of
arsenate to be employed for further growth studies. The bacterial strains were grown at 37 °C as the
temperature has the maximum effect on the growth of the bacteria (Bhakoo et al., 1979). The
transgenic E. coli JW3470-1 strains with an increase in the concentration of sodium arsenate showed
20
a decrease in the growth rate. A similar pattern of growth was observed in mutant strains. However,
comparison of the growth rate indicates, higher growth rate for the transgenic strains when
compared to the mutant strains which can be explained as follows: at 5 mM and 10 mM
concentration of sodium arsenate the transgenic strain had eight and seven times higher growth
rate than mutant strain, at 25 mM and 50 mM concentration, five times higher growth rate than the
mutant strain. At 100 mM the transgenic growth rate was four times higher than the mutant strain
as shown in (Table 7). This difference in the growth rate could be explained due to the presence of
arsC gene in transgenic strains and the lack of arsC gene in mutant strains. Similar results were
reported in bacterial strains deficient in arsC gene (Pandey et al., 2012). From the results, it is
evident that the cloning of arsC gene into mutant E. coli JW3470-1 strain was successful and the
presence of arsC gene in a bacterial strain renders the organism more tolerant to arsenic (Sarkar et
al., 2012; Liao et al., 2011; Srivastava et al., 2009).
From the arsenic resistance assays, the concentration of arsenate at 50 mM was considered
optimum concentration for further studies at varying time intervals (24 hours-96 hours). The growth
curve studies suggest the exponential increase in the growth of the mutant and transgenic E. coli
JW3470-1 strains (Figure 10). The transgenic strain displayed an increase in the growth from 24
hours until 72 hours. The active doubling of the cells during this time interval could be explained by
the availability of an abundance of nutrients, adequate aeration and less production of secondary
metabolites. As the time interval increased to 96 hours the growth of the bacterial strains was
declining due to the exhaustion of the nutrients resulting in decreasing the cell growth. A similar
growth pattern was seen in Exiguobacterium strain grown in presence of 30 mM of sodium arsenate
till 24 hours (Chowdhury et al., 2014). The transgenic E. coli JW3470-1 strain which was
complemented with the arsC gene of B1-CDA showed a significant difference in the growth rate
when compared to the mutant E. coli JW3470-1 strain lacking the arsC gene, which was verified by
statistical analysis by Mann - Whitney U test (Corder and Foreman 2014). The statistical results could
be interpreted that the complementation studies on arsC gene from B1-CDA to mutant E. coli
JW3470-1 strains has been successfully accomplished.
The ICP-MS analysis was performed to assess the amount of arsenic accumulated in the cell-free
broth or liquid medium by the transgenic and mutant E. coli JW3470-1 strain. The bacterial strains
were grown in 50 mM sodium arsenate and cultured from 24 hours to 96 hours. The preliminary ICPMS results (Table 8) indicates that the transgenic strain accumulated more amount of arsenic due to
the complementation of arsC gene from B1-CDA, unlike the mutant strain which was lacking arsC
gene. The results for the cell-free broth indicated that the maximum difference in the reduction of
arsenic was observed at 24 hours, where the transgenic strain decreased 83.34% of arsenic (50 mM
to 8.33 mM) from the liquid medium when compared to mutant strain which decreased 81.42% of
arsenic (50 mM to 9.29 mM) (Table 8). The difference in the arsenic accumulation at 24 hours
between the transgenic and mutant strains could be calculated as 0.96 mM or 1.92 %. At 48 hours
the arsenic reduction by transgenic strain was 78.86% (50 mM to 10.57mM) and by mutant strain
was 78.72% (50 mM to 10.89 mM). It has been observed that the reduction rate of arsenic was
lowered after 48 hours due to the mechanism of arsenic efflux. Similar ICP-MS experiments were
carried out by Rahman et al. (2014) and Banerjee et al. (2011) in B1-CDA and other bacterial species
which show that the accumulation of arsenic by the bacterial strains reduced after 48 hours and
increased after 72 hours. However, the current experimental result was not consistent as there was
a decrease in the arsenic accumulation after 72hours (76.3% in transgenic and 74.44% in mutant
strains) and 96 hours (73.7% in transgenic and 73.68% in mutant strains). The ICP-MS results are
inconclusive as the significant difference in the accumulation of arsenic in the cell-free broth
between the transgenic and mutant strain could not be assessed statistically as the samples could
not be generated in triplicates.
21
From the current study, it can be concluded that the in silico results were confirmed by in vitro
experiments. The in silico analysis by I-TASSER for the arsC protein identified in the L. sphaericus B1CDA predicted arsC protein to have a molecular arsenate reductase activity (Table 5) involved in the
cytoplasmic oxidation and reduction reactions for arsenate reduction (Figure 5). The in vitro
experiments accomplished the complementation of the arsC gene from B1-CDA to mutant E. coli
JW3470-1 strain lacking arsC gene, generating transgenic E. coli JW3470-1 strains (Figure 7). The
bacterial growth analysis between the transgenic and mutant E. coli JW3470-1 strains cultured at 50
mM sodium arsenate for 24 hours to 96 hours concluded a significant difference in the growth
pattern of the bacterial strains (Figure 10) which verifies our complementation studies along with
the sequencing results (Figure 8) followed by expression analysis of arsC gene in transgenic strains
(Figure 9). The arsenic tolerance assays concluded the minimum inhibitory concentration of arsenate
for the transgenic and mutant E. coli JW3470-1 strains at 50 mM (Table 7). The results for the
amount of arsenic accumulated in the cell free broth obtained by ICP-MS (Table 8) for the transgenic
and mutant strain were inconclusive due to lack of replicate samples and should be performed again
along with the studies for arsenic accumulation within the cell. Finally, it can be concluded that the
structural and functional predictions by I-TASSER, along with complementation studies suggest that
the arsC gene identified on the chromosome of L. Sphaericus B1-CDA does confer tolerance to
arsenic. Therefore B1-CDA can be employed for the bioremediation of arsenic from the arsenic
contaminated environment, thereby reducing the need to employ physical-chemical processes.
22
Ethical aspects and impact of the research on the society
The ethical aspects do not imply for this study as there has been no pain or suffering caused unlike
the human and animal experiments. The bacterial strains L. sphaericus B1-CDA and E. coli mutant
JW3470-1 strain employed in this study have not been reported anywhere to be a biohazard or
involved in the spread of diseases or cause of any infections. The selective antibiotic marker
ampicillin in the concentration of 100 µg/ml was used only to produce transgenic E. coli JW3470-1
strain in strict sterile laboratory conditions and labelled safe as ampicillin at any concentrations till
date has not produced any new antibiotic resistant bacterial strains of L. sphaericus B1-CDA and E.
coli mutant JW3470-1 strain which would be pathogenic in nature. The bacterial strains were
disinfected with low level multipurpose disinfectant virkon used at 1 % concentration. The 1 %
virkon has not been reported to produce any superbugs of L. sphaericus B1-CDA and E. coli mutant
JW3470-1 strain. The arsenic used in this study has been labeled as a potential biohazard, hence it
was handled and disposed of carefully following all the safety guidelines for handling arsenic in the
laboratory laid down by the University of Skövde. Therefore, it can be safely concluded that there
has been no biotechnology risk reported by this study.
The impact of the research on the society can be summarized as follows (1) the study would help in
mitigation and remediation of the environment by using the bioremediation techniques involving
the use of arsC gene in combination with other arsenic-responsive genes of L. sphaericus B1-CDA for
bioengineering of crops resistant to arsenic thereby detoxifying arsenic from the contaminated sites
(2) After a certain time period the crops could be recycled for the production of biofuel thereby
creating a safe and healthier environment free from toxic pollutants.
23
Future Perspectives
The principal objectives of this study involving the validation of in silico analysis by in vitro laboratory
experiments have been successfully achieved. In silico analysis included the structural and functional
prediction of arsC protein by I-TASSER and in vitro experiments comprised of complementation
studies involving cloning and amplification of the arsC gene identified on L. sphaericus B1-CDA.
Employing the bioinformatics studies and various protein modeling tools followed by X- ray
crystallography technique the crystal structure of arsC protein could be constructed to study the
protein-protein interactions, which would help in better understanding for characterization of the
structure and binding of protein with arsenic. The protein expression analysis of the arsC gene of L.
sphaericus B1-CDA could be carried out to understand the mechanism of arsenate reductase
involved in arsenate reduction and further understanding the mechanism of biomethylation and
volatilization of arsenic, which could be used for detoxification of arsenic from soil and water.
Further laboratory experiments, including real-time PCR could be performed to check the expression
levels of arsC gene. The studies on arsC gene along with other arsenic-responsive genes of L.
sphaericus B1-CDA could be used for designing prospective molecular biomarkers employed for
remediation of arsenic. ICP-MS studies could be repeated again with triplicate samples to analyze
and deduce the statistically significant amount of arsenic accumulated in the cell free broth as well
as within the cell. The arsC gene in combination with other arsenic tolerance genes identified on the
chromosome of L. sphaericus B1-CDA could be used in the construction of plant expression vectors
or transgenic plants which could be used as an ecological approach for remediation of arsenic, which
is a better and safer alternative to physical – chemical processes towards remediation of arsenic
contaminated sites.
24
Acknowledgements
I, would like to thank my supervisor Prof. Abul Mandal (School of Bioscience, University of Skövde,
Sweden) for providing me with the opportunity to work in his laboratory under his supervision. I,
would also like to thank Ph.D. student Md. Aminur Rahman for his suggestions and guidance. I am
thankful for the teaching and non-teaching staff of the School of Bioscience, university of Skövde,
Sweden for their co-operation in order to accomplish my master thesis work successfully. I, would
also like to thank my examiner Dr. Magnus Fagerlind for his insightful comments and suggestions on
this report which helped in the creation of a proper complete scientific report. I would like to extend
my thanks to Eurofins Environment Testing Sweden AB, Lidköping, Sweden and Karolinska University
Hospital, Stockholm, Sweden for rendering their timely professional services for ICP-MS analysis and
DNA sequencing respectively.
Last but not the least I would like to thank my family and friends for their constant support and
encouragement all throughout my endeavors.
25
References
Ahsan H., Chen Y., Parvez F., Zablotska L., Argos M., Hussain I., Momotaj H., Levy D., Cheng Z.,
Slavkovich V. and Van Geen A. (2006). Arsenic exposure from drinking water and risk of
premalignant skin lesions in Bangladesh: Baseline results from the health effects of arsenic
longitudinal study. American Journal of Epidemiology, 163 (12), pp. 1138-1148.
Argos M., Kalra T., Rathouz P.J., Chen Y., Pierce B., Parvez F., Islam T., Ahmed A., Rakibuz-Zaman M.,
Hasan R. and Sarwar G. (2010). Arsenic exposure from drinking water, and all-cause and chronicdisease mortalities in Bangladesh (HEALS): a prospective cohort study. The Lancet, 376 (9737), pp.
252-258.
Bang S., Patel M., Lippincott L. and Meng X. (2005). Removal of arsenic from groundwater by
granular titanium dioxide adsorbent. Chemosphere, 60 (3), pp. 389-397.
Banerjee I., Pangule R.C. and Kane R.S. (2011). Antifouling coatings: recent developments in the
design of surfaces that prevent fouling by proteins, bacteria, and marine organisms. Advanced
Materials, 23 (6), pp. 690-718.
Bahar F.G., Ohura K., Ogihara T. and Imai T. (2012). Species difference of esterase expression and
hydrolase activity in plasma. Journal of pharmaceutical sciences, 101 (10), pp. 3979-3988.
Bartlett G.J., Porter C.T., Borkakoti N. and Thornton J.M. (2002). Analysis of catalytic residues in
enzyme active sites. Journal of molecular biology, 324 (1), pp. 105-121.
Bhattacharya P., Jacks G., Ahmed K.M., Routh J. and Khan A.A. (2002). Arsenic in groundwater of the
Bengal Delta Plain aquifers in Bangladesh. Bulletin of Environmental Contamination and
Toxicology, 69 (4), pp. 538-545.
Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N. and Bourne
P.E. (2000). The protein data bank. Nucleic acids research, 28 (1), pp. 235-242.
Bhakoo M. and Herbert R.A. (1979). The effects of temperature on the fatty acid and phospholipid
composition of four obligatory psychrophilic Vibrio Spp. Archives of Microbiology, 121 (2), pp. 121127.
Bonneau R. and Baker D. (2001). Ab initio protein structure prediction: progress and
prospects. Annual review of biophysics and biomolecular structure, 30 (1), pp. 173-189.
Bobrowicz P., Wysocki R., Owsianik G., Goffeau A. and Ułaszewski S. (1997). Isolation of three
contiguous genes, ACR1, ACR2 and ACR3, involved in resistance to arsenic compounds in the yeast
Saccharomyces cerevisiae. Yeast, 13 (9), pp. 819-828.
Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N. and Bourne
P.E. (2000). The protein data bank. Nucleic acids research, 28 (1), pp.235-242.
Bröer S., Ji Guangyong., Bröer A. and Silver S. (1993). Arsenic efflux governed by the arsenic
resistance determinant of Staphylococcus aureus plasmid pI258. Journal of bacteriology, 175 (11),
pp. 3480-3485.
Carlin A., Shi W., Dey S. and Rosen B.P. (1995). The ars operon of Escherichia coli confers arsenical
and antimonial resistance. Journal of Bacteriology, 177 (4), pp. 981-986.
Corder G. W and Foreman D. I. (2014). Nonparametric Statistics: A Step-by-Step Approach. Wiley.
26
Chowdhury U.K., Biswas B.K., Chowdhury T.R., Samanta G., Mandal B.K., Basu G.C., Chanda C.R.,
Lodh D., Saha K.C., Mukherjee S.K. and Roy S. (2000). Groundwater arsenic contamination in
Bangladesh and West Bengal, India. Environmental health perspectives, 108 (5), pp. 393.
Chowdhury R., Karak P., Sen A.K., Chatterjee R. and Chaudhuri, K. (2014). Arsenic Extrusion and
Energy Derivation as Survival Mechanism in a Novel Exiguobacterium Isolated from Arsenic
Contaminated Groundwater of West Bengal. AIJRFANS, 7(1), pp.19-27.
Chang F., Qu J., Liu R., Zhao X. and Lei P. (2010). Practical performance and its efficiency of arsenic
removal from groundwater using Fe-Mn binary oxide. Journal of Environmental Sciences, 22 (1), pp.
1-6.
Cohen M.H., Hirschfeld S., Honig S.F., Ibrahim A., Johnson J.R., O'Leary J.J., White R.M., Williams,
G.A. and Pazdur R. (2001). Drug approval summaries: arsenic trioxide, tamoxifen citrate, anastrazole,
paclitaxel, bexarotene. The oncologist, 6 (1), pp. 4-11.
Congeevaram S., Dhanarani S., Park J., Dexilin M. and Thamaraiselvi K. (2007). Biosorption of
chromium and nickel by heavy metal resistant fungal and bacterial isolates. Journal of Hazardous
Materials, 146 (1), pp. 270-277.
Comino E., Fiorucci A., Menegatti S. and Marocco C. (2009). A preliminary test of arsenic and
mercury uptake by Poa annual. Ecological engineering, 35 (3), pp. 343-350.
Chaturvedi., Navaneet., Vinay Kumar Singh. and Paras Nath Pandey. (2013) "Computational
identification and analysis of arsenate reductase protein in Cronobacter sakazakii ATCC BAA-894
suggests potential microorganism for reducing arsenate." Journal of structural and functional
genomics 14 (2), pp. 37-45.
DeMel S., Shi J., Martin P., Rosen B.P. and Edwards B.F. (2004). Arginine 60 in the ArsC arsenate
reductase of E. coli plasmid R773 determines the chemical nature of the bound As (III) product.
Protein science, 13 (9), pp. 2330-2340.
Diorio C., Cai J., Marmor J., Shinder R. and DuBow M.S. (1995). An Escherichia coli chromosomal ars
operon homolog is functional in arsenic detoxification and is conserved in gram-negative bacteria.
Journal of bacteriology, 177 (8), pp. 2050-2056.
Escalante G., Campos V.L., Valenzuela C., Yañez J., Zaror C. and Mondaca M.A. (2009). Arsenic
resistant bacteria isolated from the arsenic contaminated river in the Atacama Desert (Chile).
Bulletin of environmental contamination and toxicology, 83 (5), pp. 657-661.
Fatmi Z., Zama I., Ahmed F., Kazi A., Gill A.B., Kadir M.M., Ahmed M., Ara N. and Janjua N.Z. (2009).
Core group for Arsenic Mitigation in Pakistan. Health burden of skin lesions at low arsenic exposure
through groundwater in Pakistan. Is river the source? Environmental Research, 109 (5), pp. 575-581.
Floudas C.A., Fung H.K., McAllister S.R., Mönnigmann M. and Rajgaria R. (2006). Advances in protein
structure prediction and de novo protein design: A review. Chemical Engineering Science, 61 (3), pp.
966-988.
Fung H.K., Taylor M.S. and Floudas C.A. (2007). Novel formulations for the sequence selection
problem in de novo protein design with flexible templates. Optimisation Methods and Software, 22
(1), pp. 51-71.
Gladysheva T.B., Oden K.L. and Rosen B.P. (1994). Properties of the arsenate reductase of plasmid
R773. Biochemistry, 33 (23), pp. 7288-7293.
27
Govarthanan M., Praburaman L., Kim J. W., Oh S. G., Kamala-Kannan S. and Oh B. T. ( 2015).
Characterization, real-time quantification and in silico modeling of arsenate reductase (arsC) genes
in arsenic-resistant Herbaspirillum sp. GW103. Research in microbiology 166 (3), pp. 196-204.
Guo W., Hou Y.L., Wang S.G. and Zhu Y.G. (2005). Effect of silicate on the growth and arsenate
uptake by rice (Oryza sativa L.) seedlings in solution culture. Plant and Soil, 272 (1-2), pp. 173-181.
Halttunen T., Salminen S. and Tahvonen R. (2007). Rapid removal of lead and cadmium from water
by specific lactic acid bacteria. International journal of food microbiology, 114 (1), pp. 30-35.
Hughes M.F. (2002). Arsenic toxicity and potential mechanisms of action. Toxicology letters, 133 (1),
pp. 1-16.
Holm L. and Sander C. (1991). Database algorithm for generating protein backbone and side-chain
co-ordinates from a C α trace: Application to model building and detection of co-ordinate
errors. Journal of molecular biology, 218 (1), pp.183-194.
Jackson C.R. and Dugas S.L. (2003). Phylogenetic analysis of bacterial and archaeal arsC gene
sequences suggests an ancient, common origin for arsenate reductase. BMC evolutionary biology, 3
(1), pp. 18.
Jackson C.R., Dugas S.L. and Harrison K.G. (2005). Enumeration and characterization of arsenateresistant bacteria in arsenic free soils. Soil Biology and Biochemistry, 37 (12), pp. 2319-2322.
Jensen S.O. and Lyon B.R. (2009). Genetics of antimicrobial resistance in Staphylococcus aureus.
Future microbiology, 4 (5), pp. 565-582.
Ji G. and Silver S. (1992). Reduction of arsenate to arsenite by the ArsC protein of the arsenic
resistance operon of Staphylococcus aureus plasmid pI258. Proceedings of the National Academy of
Sciences, 89 (20), pp. 9474-9478.
Jesser K.J., Fullerton H., Hager K.W. and Moyer C.L., (2015). Quantitative PCR Analysis of Functional
Genes in Iron-Rich Microbial Mats at an Active Hydrothermal Vent System. Applied and
environmental microbiology, 81 (9), pp. 2976-2984.
Kamiya T., Tanaka M., Mitani N., Ma J.F., Maeshima M. and Fujiwara T. (2009). NIP1; 1, an aquaporin
homolog, determines the arsenite sensitivity of Arabidopsis thaliana. Journal of Biological
Chemistry, 284 (4), pp. 2114-2120.
Kao A.C., Chu Y.J., Hsu F.L. and Liao V.H.C. (2013). Removal of arsenic from groundwater by using a
native isolated arsenite-oxidizing bacterium. Journal of contaminant hydrology, 155, pp. 1-8.
Krivov G.G., Shapovalov M.V. and Dunbrack R.L. (2009). Improved prediction of protein side‐chain
conformations with SCWRL4. Proteins: Structure, Function, and Bioinformatics, 77 (4), pp. 778-795.
Koštál V., Vambera J. and Bastl J. (2004). On the nature of pre-freeze mortality in insects: water
balance, ion homeostasis and energy charge in the adults of Pyrrhocoris apterus. Journal of
Experimental Biology, 207 (9), pp. 1509-1521.
Lee J., Wu S. and Zhang Y. (2009). Ab initio protein structure prediction. From protein structure to
function with bioinformatics. Springer Netherlands. pp. 3-25.
Lee D.A., Chen A. and Schroeder J.I. (2003). ars1, an Arabidopsis mutant exhibiting increased
tolerance to arsenate and increased phosphate uptake. The Plant Journal, 35 (5), pp. 637-646.
28
Lebrun E., Brugna M., Baymann F., Muller D., Lièvremont D., Lett M.C. and Nitschke W. (2003).
Arsenite oxidase, an ancient bioenergetic enzyme. Molecular biology and evolution, 20 (5), pp. 686693.
Leninger T., Stoll H., Werner H.J. and Savin A. (1997). Combining long-range configuration interaction
with short-range density functionals. Chemical physics letters, 275 (3), pp. 151-160.
Liu C.P., Luo C.L., Gao Y., Li F.B., Lin L.W., Wu C.A. and Li X.D. (2010). Arsenic contamination and
potential health risk implications at an abandoned tungsten mine, southern China. Environmental
Pollution, 158 (3), pp. 820-826.
Liu X., Hanson B.L., Langan P. and Viola R.E. (2007). The effect of deuteration on protein structure: a
high-resolution comparison of hydrogenous and perdeuterated haloalkane dehalogenase. Acta
Crystallographica Section D: Biological Crystallography, 63 (9), pp. 1000-1008.
Liao V.H.C., Chu Y.J., Su Y.C., Hsiao S.Y., Wei C.C., Liu C.W., Liao C.M., Shen W.C. and Chang F.J.
(2011). Arsenite-oxidizing and arsenate-reducing bacteria associated with arsenic-rich groundwater
in Taiwan. Journal of contaminant hydrology, 123 (1), pp. 20-29.
Liu J., Gladysheva T.B., Lee L. and Rosen B.P. (1995). Identification of an essential cysteinyl residue in
the ArsC arsenate reductase of plasmid R773. Biochemistry, 34 (41), pp. 13472-13476.
Lozano, L.C. and Dussán J. (2013). Metal tolerance and larvicidal activity of Lysinibacillus sphaericus.
World Journal of Microbiology and Biotechnology, 29 (8), pp. 1383-1389.
Lynn S., Lai H.T., Gurr J.R. and Jan K.Y. (1997). Arsenite retards DNA break rejoining by inhibiting DNA
ligation. Mutagenesis, 12 (5), pp. 353-358.
Li Y. and Zhang Y. (2009). REMO: A new protocol to refine full atomic protein models from C‐alpha
traces by optimizing hydrogen‐bonding networks. Proteins: Structure, Function, and
Bioinformatics, 76 (3), pp.665-676.
Marshall G., Ferreccio C., Yuan Y., Bates M.N., Steinmaus C., Selvin S., Liaw J. and Smith A.H. (2007).
Fifty-year study of lung and bladder cancer mortality in Chile related to arsenic in drinking water.
Journal of the National Cancer Institute, 99 (12), pp. 920-928.
Matilda O., Cohly H.H., Isokpehi R.D. and Awofolu O.R. (2010). The case for visual analytics of arsenic
concentrations in foods. International journal of environmental research and public health, 7(5), pp.
1970-1983.
Mateos L.M., Ordóñez E., Letek M. and Gil J.A. (2006). “Corynebacterium glutamicum" as a model
bacterium for the bioremediation of arsenic. International microbiology: official journal of the
Spanish Society for Microbiology, 9 (3), pp. 207-216.
McGuffin L.J. and Jones D.T. (2003). Improvement of the GenTHREADER method for genomic fold
recognition. Bioinformatics 19, pp. 874–881.
Margelevicium M., Laganeckas M. and Venclovas C. (2010). Bioinformatics 26, pp. 1905-1906
Meliker J.R., Wahl R.L., Cameron L.L. and Nriagu J.O. (2007). Arsenic in drinking water and
cerebrovascular disease, diabetes mellitus, and kidney disease in Michigan: a standardized mortality
ratio analysis. Environ Health, 6 (4), pp. 1-11.
Messens J., Martins J.C., Van Belle K., Brosens E., Desmyter A., De Gieter M., Wieruszeski J.M.,
Willem R., Wyns L. and Zegers I. (2002). All intermediates of the arsenate reductase mechanism,
29
including an intramolecular dynamic disulphide cascade. Proceedings of the National Academy of
Sciences, 99 (13), pp. 8506-8511.
Mergeay M. (1995). Heavy metal resistances in microbial ecosystems. In Molecular microbial ecology
manual. Springer Netherlands. pp. 439-455.
Mondal D. and Polya D.A. (2008). Rice is a major exposure route for arsenic in Chakdaha block, Nadia
district, West Bengal, India: A probabilistic risk assessment. Applied Geochemistry, 23 (11), pp. 29872998.
Mobley H.L., Chen C.M., Silver S. and Rosen B.P. (1983). Cloning and expression of R-factor mediated
arsenate resistance in Escherichia coli. Molecular and General Genetics MGG, 191 (3), pp. 421-426.
Mukhopadhyay R. and Rosen B.P. (2002). Arsenate reductases in prokaryotes and eukaryotes.
Environmental health perspectives, 110 (5), pp. 745.
Mukhopadhyay R., Shi J. and Rosen B.P. (2000). Purification and Characterization of Acr2p, the
Saccharomyces cerevisiae Arsenate Reductase. Journal of Biological Chemistry, 275 (28), pp. 2114921157.
Mukhopadhyay R., Rosen BP., Phung LT. and Silver S. (2002). Microbial arsenic: from geocycles to
genes and enzymes. FEMS Microbiol Rev, 26, pp 311-325.
Musingarimi W., Tuffin M. and Cowan D. (2010). Characterisation of the arsenic resistance genes in
Bacillus sp. UWC isolated from maturing fly ash acid mine drainage neutralised solids. South African
Journal of Science, 106 (2), pp. 59-63.
Muller D., Lièvremont D., Simeonova D.D., Hubert J.C. and Lett M.C. (2003). Arsenite oxidase aox
genes from a metal-resistant β-Proteobacterium. Journal of bacteriology, 185 (1), pp. 135-141.
McDonald IK. and Thornton JM. (1994). J Mol Biol 238, pp. 777–793
Nahar N., Rahman A., Moś M., Warzecha T., Algerin M., Ghosh S., Sheila J.B. and Mandal A. (2012).
In silico and in vivo studies of an Arabidopsis thaliana gene, ACR2, putatively involved in arsenic
accumulation in plants. J Mol Model. 18, p. 4249-4262.
Narcis Pous., Barbara Casantini., Simona Rosetti., Steffano Fazi., Sebastia Puig. and Federico Aulenta.
(2015). Anaerobic arsenite oxidation with an electrode serving as the sole electron acceptor: A novel
approach to the bioremediation of arsenic-polluted groundwater. Journal of hazardous
materials 283, pp. 617-622.
Nahar N., Rahman A., Moś M., Warzecha T., Ghosh S., Hossain K., Nawani N.N. and Mandal A. (2014).
In silico and in vivo studies of molecular structures and mechanisms of AtPCS1 protein involved in
binding arsenite and/or cadmium in plant cells. Journal of molecular modelling, 20 (3), pp. 1-16.
Neyt C., Iriarte M., Thi V.H. and Cornelis G.R. (1997). Virulence and arsenic resistance in Yersinia.
Journal of bacteriology, 179 (3), pp. 612-619.
Ng J.C., Wang J. and Shraim A. (2003). A global health problem caused by arsenic from natural
sources. Chemosphere, 52 (9), pp. 1353-1359.
Nordstrom D.K. (2002). Worldwide occurrences of arsenic in ground water. Science, 296 (5576), pp.
2143-2145.
Oremland R.S. and Stolz J.F. (2005). Arsenic, microbes and contaminated aquifers. Trends in
microbiology, 13 (2), pp. 45-49.
30
Oremland R.S., Saltikov C.W., Wolfe-Simon F. and Stolz J.F. (2009). Arsenic in the evolution of earth
and extra-terrestrial ecosystems. Geomicrobiology Journal, 26 (7), pp. 522-536.
Owalobi JB. and Rosen BP. (1990). Differential mRNA stability controls relative gene expression
within the plasmid-encoded arsenical resistance operon. J Bacteriol. 172, pp. 2367–2371.
Pandey S., Rai R. and Rai L.C. (2012). Proteomics combines morphological, physiological and
biochemical attributes to unravel the survival strategy of Anabaena sp. PCC7120 under arsenic
stress. Journal of proteomics, 75 (3), pp. 921-937.
Pittelkow M. and Bremer E. (2011). Cellular adjustments of Bacillus subtilis and other bacilli to
fluctuating salinities. In Halophiles and Hypersaline Environments. Springer Berlin Heidelberg. pp.
275-302.
Quéméneur M., Heinrich-Salmeron A., Muller D., Lièvremont D., Jauzein M., Bertin P.N., Garrido F.
and Joulian C. (2008). Diversity surveys and evolutionary relationships of aoxB genes in aerobic
arsenite-oxidizing bacteria. Applied and environmental microbiology, 74 (14), pp. 4567-4573.
Rahman A., Nahar N., Nawani N.N., Jass J., Desale P., Kapadnis B.P., Hossain K., Saha A.K., Ghosh S.,
Olsson B. and Mandal A. (2014). Isolation and characterization of a Lysinibacillus strain B1-CDA
showing potential for bioremediation of arsenics from contaminated water. Part A: Toxic/Hazardous
Substances and Environmental Engineering. Journal of Environmental Science and Health, 49 (12).
Rahman A., Nahar N., Nawani N.N., Jass J., Ghosh S., Olsson B. and Mandal A. (2015). Data in support
of the comparative genome analysis of Lysinibacillus B1-CDA, a bacterium that accumulates
arsenics. Data in Brief, 5, pp. 579-585.
Rosen B.P. (2002). Biochemistry of arsenic detoxification. Febs Letters, 529 (1), pp. 86-92.
Roy A., Kucukural A. and Zhang Y. (2010). I-TASSER: a unified platform for automated protein
structure and function prediction. Nature protocols, 5 (4), pp. 725-738.
Sattar N.A., Ginsberg H., Ray K., Chapman M.J., Arca M., Averna M., Betteridge D.J., Bhatnagar D.,
Bilianou E., Carmena R. and Češka R. (2014). The use of statins in people at risk of developing
diabetes mellitus: evidence and guidance for clinical practice. Atherosclerosis Supplements, 15 (1),
pp. 1-15.
Santini J.M., Macy J.M., Pauling B.V., O’Neill A.H. and Sly L.I. (2000). Two new arsenate/sulfatereducing bacteria: mechanisms of arsenate reduction. Archives of microbiology, 173 (1), pp. 49-57.
Sato T. and Kobayashi Y. (1998). The ars Operon in the skin element of Bacillus subtilis confers
resistance to arsenate and arsenite. J. Bacteriol. 180 (7), p. 1655-1661.
Shen S., Li X.F., Cullen W.R., Weinfeld M. and Le X.C. (2013). Arsenic binding to proteins. Chemical
reviews, 113 (10), pp. 7769-7792.
Santini J.M. and vanden Hoven R.N. (2004). Molybdenum-containing arsenite oxidase of the
chemolithoautotrophic arsenite oxidizer NT-26. Journal of bacteriology, 186 (6), pp. 1614-1619.
Sambrook J., Fritsch E. F. and Maniatis T. (1989). Molecular cloning: a laboratory manual. Cold Spring
Harbor, N.Y., Cold Spring Harbor Laboratory.
Sunitha M.S.L., S. Prashanth. and PB Kavi Kishor. (2015). Characterization of Arsenic-Resistant
Bacteria and their ars Genotype for Metal Bioremediation. International Journal of Scientific &
Engineering Research 6 (2), pp. 304-309
31
Sarkar S., Basu B., Kundu C.K. and Patra P.K. (2012). Deficit irrigation: An option to mitigate arsenic
load of rice grain in West Bengal, India. Agriculture, Ecosystems & Environment, 146(1), pp. 147-152.
Silver S. and Phung L.T. (2005). Genes and enzymes involved in bacterial oxidation and reduction of
inorganic arsenic. Applied and Environmental Microbiology, 71(2), pp. 599-608.
Slyemi D. and Bonnefoy V. (2012). How prokaryotes deal with arsenic. Environmental microbiology
reports, 4 (6), pp. 571-586.
Smith A.H., Marshall G., Yuan Y., Ferreccio C., Liaw J., Von Ehrenstein O., Steinmaus C., Bates M.N.
and Selvin S. (2006). Increased mortality from lung cancer and bronchiectasis in young adults after
exposure to arsenic in utero and in early childhood. Environmental health perspectives, pp. 12931296.
Srivastava S., Srivastava A.K., Suprasanna P. and D'souza S.F. (2009). Comparative biochemical and
transcriptional profiling of two contrasting varieties of Brassica juncea L. in response to arsenic
exposure reveals mechanisms of stress perception and tolerance. Journal of experimental
botany, 60 (12), pp. 3419-3431.
Sunita M.S.L., Prashant S., Chari P.B., Rao S.N., Balaravi P. and Kishore P.K. (2012). Molecular
identification of arsenic-resistant estuarine bacteria and characterization of their ars
genotype. Ecotoxicology, 21 (1), pp. 202-212.
Singh A.P., Goel R.K. and Kaur T. (2011). Mechanisms pertaining arsenic toxicity. Toxicol. Intl. 18 (2),
pp. 87-93.
Shrestha R and Spuhler D. (2012). Arsenic removal technologies. Available
http://www.sswm.info/content/arsenic-removal-technologies [Accessed on 2016.03.13].
at
Untergasser A., Nijveen H., Rao X., Bisseling T., Geurts R. and Leunissen J.A. (2007). Primer3Plus, an
enhanced web interface to Primer3. Nucleic acids research, 35 (2), pp. W71-W74.
Wu S., Skolnick J. and Zhang Y. (2007). Ab initio modelling of small proteins by iterative TASSER
simulations. BMC biology, 5 (1), pp. 17.
Wu S. and Zhang Y. (2008). Bioinformatics 24, pp 924–931.
Wu Q., Du J., Zhuang G. and Jing C. (2013). Bacillus sp. SXB and Pantoea sp. IMH, aerobic As
(V)‐reducing bacteria isolated from arsenic‐contaminated soil. Journal of applied microbiology, 114
(3), pp.713-721.
WHO, 1963. International Standards for Drinking-water, Second edition,
WHO, 1993. Guidelines for drinking-water quality, second edition, volume 1
Xiong J., He Z., Van Nostrand J.D., Luo G., Tu S., Zhou J. and Wang G. (2012). Assessing the microbial
community and functional genes in a vertical soil profile with long-term arsenic contamination. PloS
one, 7 (11), pp. 505-507.
Ye W.L., Khan M.A., McGrath S.P. and Zhao F.J. (2011). Phytoremediation of arsenic contaminated
paddy soils with Pteris vittata markedly reduces arsenic uptake by rice. Environmental pollution, 159
(12), pp. 3739-3743.
Zhang Y. and Skolnick J. (2004). SPICKER: A clustering approach to identify near‐native protein
folds. Journal of computational chemistry, 25 (6), pp. 865-871.
32
Zhang X., Zwiers F.W. and Li G. (2004). Monte Carlo experiments on the detection of trends in
extreme values. Journal of Climate, 17 (10), pp. 1945-1952.
Zhang Y. and Skolnick J. (2005). TM-align: a protein structure alignment algorithm based on the TMscore. Nucleic acids research, 33 (7), pp. 2302-2309.
Zhang Y. (2009). I‐TASSER: Fully automated protein structure prediction in CASP8. Proteins:
Structure, Function, and Bioinformatics, 77 (S9), pp. 100-113.
33
Appendices
Appendix (I)
arsC Fasta sequence
MTIQFFHYPKCTTCKKAQKWLNEHEVSYEEVHIAEQPPTKEELTAIWQASGQPLKKFFNT
SGMKYRELGLKDKLAEMTEDEQLELLASDGMLMKRPIVTDGKKVTLGFKEVDFEQAWK
Appendix (II)
Table 1: Degenerate primers designed for amplification of arsC genes with respective primer length (bp), G+C
content (GC %) and annealing temperature Tm (oC)
Primer
Sequence (5’-3’)
Length (bp)
GC (%)
Tm (oC)
ArsC (F)
ATGACGATTCAATTTTTCCATTATCC
26
30.8
62.4
ArsC (R)
TTATTTCCAAGCTTGCTCAAAATCT
25
32.0
61.6
Appendix (III)
Table 2: PCR reaction components and volumes for a single reaction of 50 µl. The forward primer arsC-F and
arsC-R reverse primer was used. The reaction was set up on the ice and the reaction components were
thoroughly mixed before placing in the PCR.
Reaction Components
Volume per reaction
100 pM Forward primer
1µl
100 pM Reverse primer
1µl
MasterAmp 1X PCR Mix
25µl
MasterAmp
TAQurate
Polymerase Mix (1 unit)
DNA
1µl
DNA Template (50ng)
1µl
Deionized water
21µl
Total volume
50µl
I
Appendix (IV)
Table 3: PCR reaction cycling conditions.
Cycle Step
Temperature
Time
Cycles
Initial Denaturation
95 ◦C
5 minutes
1
Denaturation
94 ◦C
30seconds
Annealing
60 ◦C
30sec
Extension
72 ◦C
1 min/kb
Final Extension
72 ◦C
10 min
1
Hold
4 ◦C
∞
1
35
Appendix (V)
Table 4: RT - PCR reaction components and volumes for a single reaction of 50 µl. The forward primer arsC-F
and arsC-R reverse primer was used. The reaction was set up on the ice and the reaction components were
thoroughly mixed before placing in the PCR.
Reaction Components
Volume per reaction
12.5 pM Forward primer
1µl
12.5 pM Reverse primer
1µl
MasterAmp 1X RT-PCR PreMix
25µl
MMLV RT-Plus (40 units)
1µl
MasterAmp
TAQurate
Polymerase Mix (1 unit)
DNA
1µl
RNA Template (100ng)
1µl
Deionized water
20µl
Total volume
50µl
II
Appendix (VI)
Table 5 : RT - PCR reaction cycling conditions.
Cycle Step
Temperature
Time
Cycles
Reverse transcription
37◦C
30 minutes
1
Initial Denaturation
95 ◦C
5 minutes
1
Denaturation
94 ◦C
30seconds
Annealing
60◦C
30sec
Extension
72 ◦C
1 min/kb
Final Extension
72 ◦C
10 min
1
Hold
4 ◦C
∞
1
35
Appendix (VII)
Table 6: Ligation reaction mixture for cloning of arsC gene with a pGEM-T vector into E. coli JW3470-1 mutant
strain. The reaction components of a positive control and negative control. Positive control represents the
cloning of arsC gene. The negative control allows the verification of background colonies resulting from vector
alone.
Reaction Component
Positive Control
Negative Control
10X T4 DNA ligase buffer
2µl
2µl
pGEM-T vector
1µl
1µl
Insert DNA (50ng)
3µl
-
T4 DNA ligase
1 µl
1µl
Nuclease-free water
13 µl
15µl
Total volume
20µl
20µl
Appendix (VIII)
arsC nucleotide sequence
ATGACGATTCAATTTTTCCATTATCCGAAATGTACAACATGTAAAAAGGCACAGAAGTGG
TTAAATGAGCATGAGGTTTCCTATGAAGAGGTACATATCGCAGAGCAACCTCCCACTAAA
GAAGAACTAACAGCTATTTGGCAAGCTAGTGGACAGCCTTTGAAGAAATTTTTTAATACC
TCAGGCATGAAATATCGTGAGCTTGGCTTAAAGGACAAGTTAGCGGAAATGACGGAGGAT
GAGCAGCTTGAGCTACTTGCTTCAGATGGCATGTTAATGAAACGTCCCATCGTGACAGAT
GGAAAAAAAGTTACTTTAGGTTTTAAAGAAGTAGATTTTGAGCAAGCTTGGAAATAA
III
Appendix (IX)
pGEM-T Vector map
Appendix (X)
Table 7: Prediction of GO – terms, i.e. molecular function, biological process and cellular component by ITASSER for the 3D structure of the protein based on GO-score. GO-score > 0.5 signifies a high confidence
prediction.
Molecular function
Biological process
Cellular component
GO term
GO term
GO term
GO:0008794
GO – score
0.80
GO –score
GO- score
GO:0055114
0.85
GO:0043231
0.58
GO:0045892
0.45
GO:0044444
0.58
GO:0006950
0.45
GO:0046685
0.40
IV
V