Open Access
UGB Journal of Plant Biology
and Biotechnology
Research Article
An Insight into the Diversity and Phylogenetic Implications of
NAC Transcription Factors across the Plant Groups
Rakhi Chakraborty¹ and Swarnendu Roy*²
¹ Department of Botany, A. P. C. Roy Government
College, Matigara, Siliguri - 734010, West Bengal,
INDIA
² Molecular & Analytical Biochemistry Laboratory,
Department of Botany, University of Gour Banga,
Mokdumpur, Malda - 732103, West Bengal, INDIA
*Corresponding author: Swarnendu Roy,
Molecular & Analytical Biochemistry Laboratory,
Department of Botany, University of Gour Banga,
Mokdumpur, Malda - 732103, West Bengal, INDIA
Email: [email protected]
Received: January 6, 2017; Accepted: January 7,
2017; Published: January 26, 2017
Abstract
NAC transcription factors (TFs) constitute one of the largest and most important TF
families involved in the regulation of plant growth and development.
Several NAC TFs have also been found to play an important part in the
regulation of other stress-related genes under biotic and abiotic stresses.
The NAC TFs are characterized by a highly conserved N-terminal domain
and a variable C-terminal domain. In the present study, the amino acid
sequences of NAC TFs from 4 plant species, viz. Arabidopsis thaliana, Picea
abies, Selaginella moellendorffii and Physcomitrella patens, were collected
from the Plant Transcription Factor Database and their phylogenetic
relationships were evaluated. In the phylogenetic tree, the
majority of the NAC members were interspersed among the major subgroups, which
indicates that the expansion of the NAC family predates the speciation
events. In total, 31, 5, 1 and 10 paralog pairs were determined for
Arabidopsis, Picea, Selaginella and Physcomitrella, respectively. The structure-function
relationships of the paralog pairs were inferred from a phylogenetic tree of
the combined set of paralogous gene pairs, from the prevalence of
flanking regions and from motif analysis of the NAC proteins. The motif analysis
revealed the presence of an N-terminal conserved domain, a characteristic
of the majority of NAC family proteins. Conserved motifs in the C-terminal
region were absent from the majority of the protein sequences, except for a few
members in Arabidopsis and Physcomitrella. The time of gene
duplication of the paralog pairs was also calculated: in
Arabidopsis, the duplication events occurred between 4.48 and 45.94 MYA; in
Picea, between 167.57 and 532.86 MYA; and in Physcomitrella, between 29.12 and 53.53 MYA.
Keywords: NAC transcription factors; Phylogenetic tree; Gene duplication;
Motif analysis; Conserved domain
INTRODUCTION
NAC transcription factors (TFs) are among the major plant-specific TFs involved in the regulation of plant growth and
development (Nuruzzaman et al., 2013; Shao et al., 2015).
These TFs derive their name from three different identified genes,
viz. NAM (no apical meristem), ATAF1/2 (Arabidopsis
transcription activation factor) and CUC2 (cup-shaped
cotyledon) (Souer et al., 1996; Aida et al., 1997; Olsen et al.,
2005). Members of the NAC superfamily share a conserved N-terminal domain and a variable C-terminal domain (Xie et al.,
2000). The DNA-binding N-terminal domain consists of
approximately 150-160 amino acid residues and is
further divided into 5 sub-domains, whereas the C-terminal domain is
highly variable in both length and amino acid composition (Ooka et
al., 2003).
The distribution of NAC TFs in a wide range of plant species has
led to extensive investigation into the identification and
UGB J Plant Biol Biotech - Volume 1 Issue 1 - 2017
ISSN: Applied for | www.ugbplantjournal.org
© All rights are reserved. Department of Botany, UGB
characterization of these genes. Complete sets of NAC TFs have been
reported in different species, viz. 151 in rice, 117 in
Arabidopsis, 152 in soybean, 180 in apple, 204 in cabbage and
so on (Nuruzzaman et al., 2010; Nuruzzaman et al., 2013; Le et
al., 2011; Su et al., 2013). Altogether, more than 80 plant species
with or without complete genome sequences have been
characterised for NAC TFs (Jin et al., 2014). However, no NAC
TFs have been reported from bacteria, algae or fungi to date
(Kikuchi, 2014).
NAC TFs are involved in a large number of functions, including
root and shoot development, floral morphogenesis, leaf
senescence, seed and embryo development and cell cycle
regulation in different plant species (Uauy et al., 2006). Apart
from these, NAC TFs also play modulatory roles in plant
responses to biotic and abiotic stresses (Nakashima et al.,
2012; Puranik et al., 2012). Upregulation of drought tolerance
genes at the transcriptional level was conferred by 3 NAC genes in
Arabidopsis thaliana, viz. ANAC019, ANAC055 and ANAC072
(Tran et al., 2004). According to Jensen et al. (2010), ABA
hypersensitivity is conferred through positive regulation of
ABA signalling, which may be due to ectopic expression of
ANAC019. Similarly, mitochondrial retrograde regulation is
mediated by ANAC013 in response to oxidative stress (De
Clercq et al., 2013). SNAC1 gene expression in rice significantly
increases crop production under severe drought
and salinity stress (Hu et al., 2006). A recent study in cassava
isolated 96 NAC genes (MeNAC), which confer a wide degree of
tolerance to several stress factors such as salinity, temperature,
ABA and H2O2 (Hu et al., 2016). Also, in soybean, six NAC genes (GmNAC1,
GmNAC2, GmNAC3, GmNAC4, GmNAC5 and GmNAC6) are
actively involved in the response to various stress conditions
(Pinheiro et al., 2009).
The objective of this study was to evaluate the phylogenetic
relationships of the members of the NAC protein family across 4
species representing major plant groups, viz. Arabidopsis thaliana, Picea
abies, Selaginella moellendorffii and Physcomitrella patens.
In addition, the expansion of the NAC protein family and
its possible implications were studied by
evaluating the paralogous gene pairs inferred from the
phylogenetic tree, using an in silico approach
with the aid of several bioinformatics tools.
MATERIALS AND METHODS
Retrieval of NAC TF sequences and construction of
phylogenetic tree: Amino acid sequences of NAC TFs
from 4 plants, viz. Arabidopsis thaliana, Picea abies, Selaginella
moellendorffii and Physcomitrella patens, were retrieved from
the Plant Transcription Factor Database v3.0 (PlantTFDB)
(http://planttfdb.cbi.pku.edu.cn/) (Jin et al., 2014). The
sequences were saved in FASTA format and renumbered for
phylogenetic analyses. The combined unrooted phylogenetic
tree of all the NAC TFs was constructed with MEGA 7.0 using
the Neighbor-Joining (NJ) method with the bootstrap test
carried out with 1000 iterations (Kumar et al., 2016). The
resulting tree was critically analysed, remodelled and displayed
using FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
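The tree-building step can be illustrated with a minimal sketch of the Neighbor-Joining criterion. This is not the MEGA implementation: it assumes a simple p-distance (proportion of differing aligned sites), omits branch lengths and the bootstrap test, and the function names `p_distance`, `distance_matrix` and `neighbor_joining` are hypothetical names chosen for illustration.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.
    Columns with a gap ('-') in either sequence are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_matrix(aligned):
    """Pairwise p-distances for a dict {name: aligned_sequence}."""
    return {frozenset((m, n)): p_distance(aligned[m], aligned[n])
            for m, n in combinations(aligned, 2)}

def neighbor_joining(names, dist):
    """Return an unrooted NJ topology as a Newick string without branch lengths.
    dist maps frozenset({a, b}) to the distance between taxa a and b."""
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {i: sum(d[frozenset((i, j))] for j in nodes if j != i) for i in nodes}
        # Join the pair that minimises the Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        new = f"({i},{j})"
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))])
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    # The last three nodes are joined at a single internal node.
    return "(" + ",".join(nodes) + ");"
```

A bootstrap test such as the one described above would repeat this procedure on alignments resampled column-wise with replacement and count how often each grouping recurs.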
Retrieval of genomic sequences and phylogenetic
divergence of corresponding paralogs: The paralogs for NAC
TFs in Arabidopsis, Picea, Selaginella and Physcomitrella were
inferred from the phylogenetic tree. The corresponding
genomic sequences for the polypeptides were retrieved from
different databases: The Arabidopsis Information Resource
(TAIR, https://www.arabidopsis.org/) for Arabidopsis,
ConGenIE (https://congenie.org/) for Picea, JGI Phytozome 11
(https://phytozome.jgi.doe.gov/pz/#!info?alias=Org_Smoellendorffii)
for Selaginella and EnsemblPlants
(http://plants.ensembl.org/Physcomitrella_patens/Info/Index)
for Physcomitrella. A phylogenetic tree was constructed
using the nucleotide sequences of the paralogs with MEGA 7.0,
and the corresponding intron-exon junctions were displayed
with the aid of the Gene Structure Display Server (GSDS 2.0,
http://gsds.cbi.pku.edu.cn/) (Hu et al., 2015).
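Intron counts and sizes of the kind displayed by a gene-structure viewer follow directly from exon coordinates. The sketch below assumes 1-based, inclusive exon coordinates on a single strand; the function name `introns_from_exons` and the coordinates in the usage note are hypothetical and are not taken from the study.

```python
def introns_from_exons(exons):
    """Return the lengths of the introns lying between consecutive exons.

    exons: iterable of (start, end) tuples, 1-based and inclusive.
    """
    ordered = sorted(exons)
    lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end - 1  # bases strictly between the two exons
        if gap > 0:
            lengths.append(gap)
    return lengths
```

For a hypothetical gene with exons at (1, 100), (201, 300) and (1801, 2000), the function reports two introns of 100 and 1500 bp.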
Estimation of synonymous and non-synonymous
substitution rates: Multiple sequence alignments of the
amino acid sequences of the inferred paralog gene pairs of
Arabidopsis, Picea and Selaginella were performed with
MUSCLE 3.8 (http://www.ebi.ac.uk/Tools/msa/muscle/). The
aligned amino acid sequences and their corresponding cDNA
sequences were submitted to the PAL2NAL server
(http://www.bork.embl.de/pal2nal/), and the
synonymous (Ks) and non-synonymous (Ka) substitution
rates were estimated using the CODEML program in the PAML
interface (Suyama et al., 2006). The time (million years ago, MYA)
of duplication and divergence of each paralog gene pair was
calculated using the formula T = Ks/2λ, where Ks refers to the
number of synonymous substitutions per synonymous site and λ refers to the clock-like
rate of synonymous substitution (per site per year), which varies from species to
species. For Arabidopsis, the value of λ was taken to be 1.5 × 10⁻⁸
(Koch et al., 2000); for Picea, 0.68 × 10⁻⁹
(Buschiazzo et al., 2012); and for Physcomitrella,
0.94 × 10⁻⁸ (Rensing et al., 2007, 2016).
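The dating formula above can be written as a short function. The λ values are those quoted in the text; the names `LAMBDA` and `duplication_time_mya` are hypothetical, and the sketch assumes λ is expressed per synonymous site per year, so the result is divided by 10⁶ to obtain MYA.

```python
# Clock-like synonymous substitution rates quoted in the text
# (assumed units: substitutions per synonymous site per year).
LAMBDA = {
    "Arabidopsis": 1.5e-8,
    "Picea": 0.68e-9,
    "Physcomitrella": 0.94e-8,
}

def duplication_time_mya(ks, species):
    """Return T = Ks / (2 * lambda), converted from years to million years."""
    lam = LAMBDA[species]
    return ks / (2.0 * lam) / 1e6
```

With the rounded Ks of 0.13 reported for Arabidopsis, this gives about 4.3 MYA, close to the reported lower bound of 4.48 MYA; the small difference presumably reflects rounding of the Ks values quoted in the text.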
Identification of conserved motifs and domains: The
program MEME version 4.11.2 was used to identify
motifs in the NAC protein sequences of the paralogous genes
(Bailey et al., 2009). MEME was run with the following
parameters: number of repetitions - zero or one; maximum
number of motifs - 20; and optimum motif width
constrained to between 6 and 50 residues. The NCBI Conserved
Domain Database search
(https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) was also performed with the same
set of sequences to analyze the presence of conserved domains
and their relative positions in the
sequences (Marchler-Bauer et al., 2015).
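For readers who wish to repeat the motif search from the command line, the parameters above map onto MEME options roughly as follows. The study does not state whether the web server or the command-line program was used, so this is only an assumed equivalent: "zero or one" repetition corresponds to the `zoops` model, and the file and directory names in the usage note are placeholders. The helper `meme_command` is a hypothetical name.

```python
def meme_command(fasta_path, out_dir, nmotifs=20, minw=6, maxw=50):
    """Build an argument list for the MEME command-line program.

    The defaults mirror the parameters quoted in the text: zero or one
    occurrence per sequence, at most 20 motifs, widths of 6 to 50 residues.
    """
    return [
        "meme", fasta_path,
        "-protein",           # input is a protein alphabet
        "-oc", out_dir,       # output directory, overwritten if it exists
        "-mod", "zoops",      # zero or one occurrence per sequence
        "-nmotifs", str(nmotifs),
        "-minw", str(minw),
        "-maxw", str(maxw),
    ]
```

The resulting list, for example `meme_command("nac_paralogs.fasta", "meme_out")`, can be passed to `subprocess.run` on a machine where MEME is installed.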
RESULTS
Collection of NAC TF sequences: The protein sequences of
NAC TFs of 4 plant genomes, Arabidopsis, Picea, Selaginella and
Physcomitrella, were downloaded from PlantTFDB v3.0 and
compiled into FASTA format for phylogenetic analysis. The
numbers of sequences retrieved and used for
phylogenetic analysis were 137, 91, 22 and 35 for
Arabidopsis, Picea, Selaginella and Physcomitrella, respectively. NAC TF
sequences ranged from 87 to 806 amino acid
residues in Arabidopsis; from 101 to 940 in Picea;
from 128 to 474 in Selaginella; and
from 243 to 710 in Physcomitrella.
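Summary figures such as the sequence counts and length ranges above can be recomputed from a FASTA file with a few lines of code. The sketch below is a generic parser, not the authors' workflow; `read_fasta` and `length_summary` are hypothetical helper names.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict {identifier: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def length_summary(records):
    """Return (number of sequences, shortest length, longest length)."""
    lengths = [len(seq) for seq in records.values()]
    return len(lengths), min(lengths), max(lengths)
```

Running `length_summary` on the sequences of one species gives the count together with the minimum and maximum lengths in residues.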
Phylogenetic relationships of NAC TF sequences in
Arabidopsis, Picea, Selaginella and Physcomitrella: The
phylogenetic relationships among the NAC TF proteins in
Arabidopsis, Picea, Selaginella and Physcomitrella were examined by
constructing an unrooted tree from alignments of the full-length NAC protein sequences (Figure 1). The phylogenetic
tree was constructed in MEGA 7.0 using the Neighbor-Joining
(NJ) method. Close inspection of the phylogenetic tree led
to the classification of the NAC TFs into 12 distinct subgroups,
which are depicted in different colours. The subgroups were
designated as A1, A2, B, C, D, E, F, G, H, I, J and K for simplicity of
analysis. Among the subgroups, B and D were the largest,
followed by A1 and A2. Within these subgroups, the NAC TFs
from all 4 species were found to be interspersed, which
indicates that the expansion of the NAC proteins predates
the divergence of the studied species. This observation also holds
true for the subgroups G and J. In contrast, the subgroups H
and I comprised only AtNAC TFs (15 and 5, respectively), which
suggests that the expansion of these subgroups occurred
most recently, after the diversification of the angiosperm stock.
Moreover, the subgroup F contained 3 AtNACs
(AtNAC13, AtNAC94, AtNAC129) alongside the Picea NAC TFs.
This indicates the expansion of this subgroup following the
divergence of the gymnosperm–angiosperm stock from the lower
groups of plants and the subsequent divergence of the 3 AtNAC
TFs from the gymnosperm stock.
Figure 1. Phylogenetic tree of NAC TFs from Arabidopsis, Picea,
Selaginella and Physcomitrella. The deduced full-length amino
acid sequences of 137, 91, 22 and 35 NAC proteins from Arabidopsis,
Picea, Selaginella and Physcomitrella, respectively,
were aligned by MUSCLE 3.8 and the phylogenetic tree was
constructed using MEGA 7.0 by the Neighbor-Joining (NJ) method
with 1000 bootstrap replicates. Each NAC subfamily has been
separated and is depicted using a different colour.
Determination of paralogs and evaluation of the phylogenetic
relationships of the paralog genes: The paralog pairs were
inferred from the phylogenetic tree of the full-length protein
sequences of the NAC TFs (Figure 1). In total, 31, 5, 1 and 10 paralog
pairs were determined for Arabidopsis, Picea,
Selaginella and Physcomitrella, respectively. Both tandem and segmental
duplication events contributed to the expansion of NAC TFs in
all the plant groups, but in Arabidopsis and Physcomitrella the
contribution of these duplication events to the expansion of the
gene family was more pronounced. A phylogenetic tree was
reconstructed from the genomic sequences of the paralog
pairs using the NJ method, and the corresponding intron-exon
junctions were displayed using GSDS 2.0 (Figure 2). The
phylogenetic tree revealed that the paralog genes in Selaginella
share a common ancestry with the paralog pairs in Arabidopsis.
Interestingly, the paralog pairs in Picea with large introns
also shared a common ancestry with the paralog genes in
Arabidopsis. The intron distribution also provided important
evidence to support the phylogenetic relationships of the paralog
pairs of the species studied. In Arabidopsis, the number of introns
ranged from 2 (viz. AtNAC109, AtNAC135) to 6 (viz. AtNAC17,
AtNAC18). The introns were smaller
than those in the other species, with the largest introns, of up to 1.5
kb, observed in AtNAC98 and AtNAC99. In Picea, the number of
introns ranged from 2 (viz. PaNAC37, PaNAC48) to 8 (viz.
PaNAC14). The largest introns were observed in Picea,
ranging from 8 to 10 kb, as seen in PaNAC3, PaNAC7 and PaNAC14.
Only one pair of paralog genes was determined in Selaginella,
which showed 1-2 very small introns with a relatively large
upstream non-coding region in SmNAC10. In contrast, in
Physcomitrella, the number of introns ranged from 1 (viz. PpNAC14,
PpNAC15) to 3 (viz. PpNAC33, PpNAC35), with relatively large
upstream regions (viz. PpNAC14, PpNAC15) and downstream
regions (viz. PpNAC6).
Evolutionary pattern of paralog genes: The presence of
highly identical and conserved flanking regions for the pairs of
paralogous genes suggests that the expansion of paralogous
NAC genes initiated from segmental duplication events, though
tandem duplication events were not rare. We used Ks as a
proxy for time to estimate the approximate dates of the
duplication events. The Ks values and the estimated dates for
all duplication events of the paralog pairs of NAC genes of
Arabidopsis, Picea and Physcomitrella are listed in Tables 1, 2 and
3, respectively. Paralog pairs with Ks values < 1e-05 were
not used for the calculation of the time of duplication events.
Also, the time of the duplication event for the only paralog pair in
Selaginella was not calculated, as no reliable records
documenting the value of λ (clock-like rate of synonymous
substitution) for Selaginella moellendorffii could be found. In
Arabidopsis, the values of Ks ranged from 0.13 to 1.37 and the
duplication events were calculated to have spanned from
4.48 to 45.94 MYA (Table 1). Similarly, in Picea, the values of Ks
ranged from 0.22 to 0.72 and the duplication events were
calculated to have spanned from 167.57 to 532.86 MYA (Table
2); and in Physcomitrella, the values of Ks ranged from 0.54 to
1.01 and the duplication events were calculated to have
spanned from 29.12 to 53.53 MYA (Table 3). The Ka/Ks ratio is an
indicator of positive or negative selection acting on
the divergence of paralog pairs. For all the paralogous pairs of
the 3 species, the Ka/Ks value was < 1.
Analysis of conserved motifs and domains: The protein
sequences of all the NAC paralogs, which ranged from 158 to
940 amino acid residues, were submitted to MEME for motif
analysis with a setting of 20 motifs to be discovered. The results
of the motif analysis, along with the conserved sequences of the
motifs, are shown in Figure 3. Among the discovered motifs,
motifs 14 and 8 were the smallest, with widths of 6 and 8 amino
acids, respectively, whereas motifs 13, 15 and 19 were the
largest, with a width of 50 amino acids (Figure 3). Also,
motifs 1-8 and 14 registered more than 80 hits among the total
of 94 sequences, which indicates the conserved nature of these
motifs across the paralogous sequences and species
(Figure 3). The motif analysis revealed the presence of an N-terminal conserved domain, a characteristic of the majority of
NAC family proteins. These findings were also validated by the
CDD search, which revealed the presence of a NAM
conserved domain in the N-terminal region of the majority of proteins
(Figure 4). Interestingly, in a few proteins, the NAC conserved
domain could not be located in the CDD search, viz. PpNAC23,
PpNAC25, AtNAC26, AtNAC57, PaNAC3, AtNAC30 and
AtNAC31. Conserved motifs in the C-terminal region were
absent from the majority of the protein sequences owing to their
enormous variability (Figure 4). However, some conserved
C-terminal motifs could be located, viz. in AtNAC17, AtNAC18, AtNAC102,
AtNAC103, AtNAC83, AtNAC124, AtNAC20, AtNAC21, AtNAC15,
AtNAC16, AtNAC98 and AtNAC99 in Arabidopsis, and in PpNAC33,
PpNAC35, PpNAC27, PpNAC28, PpNAC4, PpNAC30, PpNAC9,
PpNAC3 and PpNAC20 in Physcomitrella (Figure 4). No conserved
motifs in the C-terminal region could be identified in the members of
Selaginella and Picea.
Figure 2. Phylogenetic tree of the combined paralog gene pairs from Arabidopsis, Picea, Selaginella and Physcomitrella, constructed
using MEGA 7.0 by the Neighbor-Joining (NJ) method with 1000 bootstrap replicates. The corresponding intron-exon junctions were
displayed with GSDS 2.0.
Figure 3. Sequence logos for the motifs discovered by MEME 4.11.2 in the protein sequences of the paralogous genes, along with their
width and number of occurrences.
Figure 4. Sequence-wise distribution of the individual motifs and the corresponding results of the Conserved Domain Database
search for domain analysis.
Table 1. Inference of duplication time of NAC paralogous pairs in Arabidopsis
Table 2. Inference of duplication time of NAC paralogous pairs in Picea
Table 3. Inference of duplication time of NAC paralogous pairs in Physcomitrella
DISCUSSION
The number of NAC protein members in angiosperms and
gymnosperms exceeds that in mosses and ferns
(Soltis and Soltis, 2013). This is due to the more frequent
expansion of the gene family in higher plants through
duplication events, either tandem or segmental. The major
differences in NAC protein family size within
the angiosperms and gymnosperms could be due to common
selective pressures such as environmental stresses, which may
have guided the regulation of plant growth and development
(Hughes and Friedman, 2003). Duplication events allow
TFs to acquire functions different from those of their ancestors,
and these could be naturally selected for their novel functions (Force et
al., 1999). Interestingly, the number of paralogous pairs in
Picea was comparatively low considering the number of NAC
members present in this species.
The NAC protein family is one of the largest protein families, and the
members of this family are both structurally and functionally
diverse. Therefore, it has been difficult to assign
structure-function relationships to individual NAC
genes (Puranik et al., 2012). However, several stress-related
NACs have been reported to play regulatory roles in biotic
and abiotic stresses (Tran et al., 2004; Jeong et al., 2010; Jensen
et al., 2007). Five conserved subdomains have been identified in
the N-terminal region of the NAC TFs (designated A-E), of
which subdomains B and E are somewhat divergent, which
may be related to the diverse functions of the NAC members
(Ooka et al., 2003). Also, the transcription regulatory region
(TRR) lying in the C-terminal region is highly divergent and is
associated with the activation or repression of transcription
(Puranik et al., 2012). Although diverse, the TRR may or may
not possess specific motifs that are sometimes conserved
across the protein sub-families; if present, these motifs impart
variable dimensions to the functionality of the individual
members. For example, the TRRs of rice NAC proteins were found to
contain ten C-terminal motifs (Fang et al., 2008; Shen et al.,
2009). In our findings, the presence of variable C-terminal
motifs in a few protein members of Arabidopsis and Physcomitrella
indicates that these members may play
regulatory roles in stressed environments (Figure 4). This
inference is in accordance with the findings of Tran et al. (2004),
where the conserved motifs in the C-terminal half of the related NACs
in Arabidopsis, viz. ANAC019, ANAC055 and ANAC072, were
attributed to the regulation of the transcription of other stress-related
genes. Also, considering the time of divergence of
paralogs in Arabidopsis and Physcomitrella, there is an indication
of the accrual of stress-induced regulatory functions
during the expansion of the NAC family, as the
expansion of paralogs in these two species took place much more
recently than that in Picea.
Gene duplication events are important for the evolution of a gene
family because they are associated with the structural divergence of
new genes and facilitate the generation of novel functions
(Kong et al., 2007). From our results, we could assume that
duplication events were more frequent in Arabidopsis, followed by
Physcomitrella. Positive Darwinian selection has been reported
to be associated with gene duplication and functional
divergence (Zhang, 2003). To explore whether positive
selection drove the divergence of the paralog pairs, we
estimated the Ka/Ks ratio. The Ka/Ks ratio provides a sensitive
measure of selective pressure on the protein, which is regarded
as one of the major forces contributing to the variation of
structural patterns in a functional protein that ultimately leads
to the emergence of new motifs/functions in the protein after gene
duplication (Yang et al., 2006). A Ka/Ks value of 1 indicates
neutral evolution or no selection, whereas a Ka/Ks value < 1
indicates purifying selection. Ka/Ks values > 1 are rarely
observed; in that case, positive Darwinian selection is involved
(Li and Gojobori, 1983). The results obtained indicate
purifying selection among all the paralogous gene pairs in all
the species studied.
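The interpretation of the Ka/Ks ratio described above can be summarised as a small decision rule. In this sketch the function name `classify_selection`, the numerical tolerance for "equal to 1" and the reuse of the 1e-05 Ks cut-off as a guard against division by very small values are assumptions made for illustration.

```python
def classify_selection(ka, ks, tol=1e-9, min_ks=1e-5):
    """Classify the selective regime of a paralog pair from Ka and Ks.

    Returns "purifying" (Ka/Ks < 1), "neutral" (Ka/Ks = 1),
    "positive" (Ka/Ks > 1) or "undefined" when Ks is too small
    for the ratio to be meaningful.
    """
    if ks < min_ks:
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"
```

Applied to every paralog pair, a result of "purifying" throughout would correspond to the pattern reported here.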
CONCLUSION
In the present study, a comprehensive analysis of NAC proteins
in 4 species of the major plant groups was performed in terms of their
phylogeny, gene structure, conserved domains and motifs, and
divergence time of paralogous gene pairs. The
phylogenetic tree of all the NAC proteins revealed the presence
of 12 distinct subgroups and showed that the expansion of the
majority of NAC TFs in all the species occurred prior to the
speciation events. However, functions could not be attributed to the
subgroups owing to the large number
of members in the NAC protein family. The paralogous pairs
were inferred from the phylogenetic tree and the
times of the duplication events were calculated. These
estimates revealed that the expansion of the NAC TFs in
Picea occurred much earlier than that in Arabidopsis and
Physcomitrella. On the basis of motif analysis, the more recent expansion of the NAC members in
these 2 species could be related to the accrual of novel functions
under changing environmental conditions. Also, the expansion of the NAC protein family is
driven by purifying selection, as evident from the Ka/Ks ratios of
the paralogous pairs.
ACKNOWLEDGEMENT
The authors are grateful to Dr. Vinay Singh, Information Officer,
Centre for Bioinformatics, School of Biotechnology, BHU,
Varanasi, for guidance regarding the application and use of the
bioinformatics tools used in this study.
CONFLICT OF INTEREST
None declared.
REFERENCES
Aida M, Ishida T, Fukaki H, Fujisawa H, Tasaka M (1997) Genes involved in organ separation in Arabidopsis: an analysis of the cup-shaped cotyledon mutant. Plant Cell 9(6): 841-857.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS (2009) MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37: W202-W208.
Buschiazzo E, Ritland C, Bohlmann J, Ritland K (2012) Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms. BMC Evol Biol 12: 8.
De Clercq I, Vermeirssen V, Van Aken O, Vandepoele K, Murcha MW, Law SR, Inze A, Ng S, Ivanova A, Rombaut D, Van de Cotte B, Jaspers P, Van de Peer Y, Kangasjarvi J, Whelan J, Van Breusegem F (2013) The membrane-bound NAC transcription factor ANAC013 is a regulator of mitochondrial retrograde regulation of the oxidative stress response in Arabidopsis. Plant Cell 25(9): 3472-3490.
Hu B, Jin J, Guo A-Y, Zhang H, Luo J, Gao G (2015) GSDS 2.0: an upgraded gene feature
visualization server. Bioinformatics 31(8): 1296-1297.
Hu H, Dai M, Yao J, Xiao B, Li X, Zhang Q, Xiong L (2006) Overexpressing a NAM, ATAF, and CUC (NAC) transcription factor enhances drought resistance and salt tolerance in rice. Proc Natl Acad Sci 103: 12987-12992.
Hu W, Yang H, Yan Y, Wei Y, Tie W, Ding Z, Zuo J, Peng M, Kaimian L (2016) Genome-wide characterization and analysis of bZIP transcription factor gene family related to abiotic stress in cassava. Scientific Rep 6: 22783.
Jensen MK, Kjaersgaard T, Petersen K, Skriver K (2010) NAC genes: Time-specific regulators of hormonal signaling in Arabidopsis. Plant Signaling Behav 5(7): 907-910.
Jin JP, Zhang H, Kong L, Gao G, Luo JC (2014) PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Res 42(D1): D1182-D1187.
Kikuchi S (2014) Genome-wide view of the expression profiles of NAC-domain genes in response to infection by rice viruses. In: Benkeblia N (Ed.) Omics Technologies and Crop Improvement. CRC Press, pp 127-152.
Koch MA, Haubold B, Mitchell-Olds T (2000) Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol Biol Evol 17: 1483-1498.
Kumar S, Stecher G, Tamura K (2016) MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol Biol Evol 33: 1870-1874.
Le DT, Nishiyama R, Watanabe Y, Mochida K, Yamaguchi-Shinozaki K, Shinozaki K, Tran LS (2011) Genome-wide survey and expression analysis of the plant-specific NAC transcription factor family in soybean during development and dehydration stress. DNA Res 18: 263-276.
Nakashima K, Takasaki H, Mizoi J, Shinozaki K, Yamaguchi-Shinozaki K (2012) NAC transcription factors in plant abiotic stress responses. Biochim Biophys Acta 1819: 97-103.
Nuruzzaman M, Manimekalai R, Sharoni AM, Satoh K, Kondoh H, Ooka H, Kikuchi S (2010) Genome-wide analysis of NAC transcription factor family in rice. Gene 465(1-2): 30-44.
Nuruzzaman M, Sharoni AM, Kikuchi S (2013) Roles of NAC transcription factors in the regulation of biotic and abiotic stress responses in plants. Front Microbiol 4: 248.
Olsen AN, Ernst HA, Leggio LL, Skriver K (2005) NAC transcription factors: structurally distinct, functionally diverse. Trends Plant Sci 10(2): 79-87.
Ooka H, Satoh K, Doi K, Nagata T, Otomo Y, Murakami K, Matsubara K, Osato N, Kawai J, Carninci P, Hayashizaki Y, Suzuki K, Kojima K, Takahara Y, Yamamoto K, Kikuchi S (2003) Comprehensive analysis of NAC family genes in Oryza sativa and Arabidopsis thaliana. DNA Res 10: 239-247.
Pinheiro GL, Marques CS, Costa MDBL, Reis PAB, Alves MS, Carvalho CM, Fietto LG, Fontes EPB (2009) Complete inventory of soybean NAC transcription factors: Sequence conservation and expression analysis uncover their distinct roles in stress response. Gene 444(1-2): 10-23.
Puranik S, Sahu PP, Srivastava PS, Prasad M (2012) NAC proteins: regulation and role in stress tolerance. Trends Plant Sci 17(6): 369-381.
Shao H, Wang H, Tang X (2015) NAC transcription factors in plant multiple abiotic stress responses: progress and prospects. Front Plant Sci 6: 902.
Souer E, van Houwelingen A, Kloos D, Mol JNM, Koes R (1996) The no apical meristem gene of petunia is required for pattern formation in embryos and flowers and is expressed at meristem and primordia boundaries. Cell 85: 159-170.
Su H, Zhang S, Yuan X, Chen C, Wang XF, Hao YJ (2013) Genome-wide analysis and identification of stress-responsive genes of the NAM-ATAF1,2-CUC2 transcription factor family in apple. Plant Physiol Biochem 71: 11-21.
Suyama M, Torrents D, Bork P (2006) PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res 34: W609-W612.
Tran LS, Nakashima K, Sakuma Y, Simpson SD, Fujita Y, Maruyama K, Fujita M, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2004) Isolation and functional analysis of Arabidopsis stress-inducible NAC transcription factors that bind to a drought-responsive cis-element in the early responsive to dehydration stress 1 promoter. Plant Cell 16: 2481-2498.
Uauy C, Distelfeld A, Fahima T, Blechl A, Dubcovsky J (2006) A NAC gene regulating senescence improves grain protein, zinc, and iron content in wheat. Science 314(5803): 1298-1301.
Xie Q, Frugis G, Colgan D, Chua N (2000) Arabidopsis NAC1 transduces auxin signal downstream of TIR1 to promote lateral root development. Genes Dev 14: 3024-3036.
Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Marchler GH, Song JS, Thanki N, Wang Z, Yamashita RA, Zhang D, Zheng C, Bryant SH (2015) CDD: NCBI's conserved domain database. Nucleic Acids Res 43: D222-D226.
Soltis PS, Soltis DE (2013) A conifer genome spruces up plant phylogenomics. Genome Biol 14(6): 122.
Hughes AL, Friedman R (2003) Parallel evolution by gene duplication in the genomes of two unicellular fungi. Genome Res 13(5): 794-799.
Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151(4): 1531-1545.
Jeong JS, Kim YS, Baek KH, Jung H, Ha S-H, Choi YD, Kim M, Reuzeau C, Kim J-K (2010) Root-specific expression of OsNAC10 improves drought tolerance and grain yield in rice under field drought conditions. Plant Physiol 153(1): 185-197.
Jensen MK, Rung JH, Gregersen PL, Gjetting T, Fuglsang AT, Hansen M, Joehnk N, Lyngkjaer MF, Collinge DB (2007) The HvNAC6 transcription factor: a positive regulator of penetration resistance in barley and Arabidopsis. Plant Mol Biol 65(1-2): 137-150.
Tran LS, Nishiyama R, Yamaguchi-Shinozaki K, Shinozaki K (2010) Potential utilization of NAC transcription factors to enhance abiotic stress tolerance in plants by biotechnological approach. GM Crops 1: 32-39.
Fang Y, You J, Xie K, Xie W, Xiong L (2008) Systematic sequence analysis and identification of tissue-specific or stress-responsive genes of NAC transcription factor family in rice. Mol Genet Genomics 280: 535-546.
Shen H, Yin Y, Chen F, Xu Y, Dixon RA (2009) A bioinformatic analysis of NAC genes for plant cell wall development in relation to lignocellulosic bioenergy production. Bioenerg Res 2: 217-232.
Kong H, Landherr LL, Frohlich MW, Leebens-Mack J, Ma H, de Pamphilis CW (2007) Patterns of gene duplication in the plant SKP1 gene family in angiosperms: evidence for multiple mechanisms of rapid gene birth. Plant J 50: 873-885.
Rensing SA, Ick J, Fawcett JA, Lang D, Zimmer A, Van de Peer Y, Reski R (2007) An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens. BMC Evol Biol 7: 130.
Rensing SA, Ick J, Fawcett JA, Lang D, Zimmer A, Van de Peer Y, Reski R (2016) Erratum to: An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens. BMC Evol Biol 16: 184.
Zhang J (2003) Evolution by gene duplication: an update. Trends Ecol Evol 18(6): 292-298.
Yang X, Tuskan GA, Cheng MZ (2006) Divergence of the Dof gene families in poplar, Arabidopsis, and rice suggests multiple modes of gene evolution after duplication. Plant Physiol 142: 820-830.
Li WH, Gojobori T (1983) Rapid evolution of goat and sheep globin genes following gene duplication. Mol Biol Evol 1(1): 94-108.