Classification and expression analyses of homeobox genes from Dictyostelium discoideum HIMANSHU MISHRA and SHWETA SARAN* School of Life Sciences, Jawaharlal Nehru University, New Delhi 110 067, India *Corresponding author (Email, [email protected]; [email protected]) Homeobox genes are compared between genomes in an attempt to understand the evolution of animal development. The ability of the protist, Dictyostelium discoideum, to shift between uni- and multicellularity makes this group ideal for studying the genetic changes that may have occurred during this transition. We present here the first genome-wide classification and comparative genomic analysis of the 14 homeobox genes present in D. discoideum. Based on the structural alignment of the homeodomains, they can be broadly divided into TALE and non-TALE classes. When individual homeobox genes were compared with members of known class or family, we could further classify them into 3 groups, namely, TALE, OTHER and NOVEL classes, but no HOX family was found. The 5 members of TALE class could be further divided into PBX, PKNOX, IRX and CUP families; 4 homeobox genes classified as NOVEL did not show any similarity to any known homeobox genes; while the remaining 5 were classified as OTHERS as they did show certain degree of similarity to few known homeobox genes. No unique RNA expression pattern during development of D. discoideum emerged for members of an individual group. Putative promoter analysis revealed binding sites for few homeobox transcription factors among many probable factors. [Mishra H and Saran S 2015 Classification and expression analyses of homeobox genes from Dictyostelium discoideum. J. Biosci. 40 241–255] DOI 10.1007/s12038-015-9519-3 1. Introduction Homeobox genes play key roles in animal, plant, fungi and amoebozoans development such as regional patterning, regulation of cell proliferation, differentiation, adhesion, migration and regulation of cell fate (Gehring 1994; Derelle et al. 2007). They contain a conserved DNA motif of about 180 base pairs, which when translated encode a peptide motif 60 amino acids long, called the homeodomain. The homeodomain consists of three α-helices that form a helix-loop-helix-turn-helix DNAbinding motif that recognizes and binds to specific DNA sequences to regulate the expression of target genes (Kissinger et al. 1990). Keywords. These proteins are usually classified based on phylogenetic analyses of homeodomain sequence similarity, chromosomal location and additional conserved protein domains or motifs outside of the homeodomain. The homeodomaincontaining proteins of animal genomes are classified into 11 classes, namely, ANTP, PRD, TALE, POU, CERS, PROS, ZF, LIM, HNF, CUT and SINE. A few identified animal proteins that could not be placed readily into any of these categories have been placed in a class called OTHER (Holland et al. 2007) Homeodomain proteins from plants have been classified into 14 classes: HD-ZIP I, HD-ZIP II, HD-ZIP III, HD-ZIP IV, KNOX, BEL, PLINC, WOX, DDT, PHD, NDX, LD, PINTOX and SAWADEE (Mukherjee et al. 2009). Classification; Dictyostelium; expression; homeobox Supplementary materials pertaining to this article are available on the Journal of Biosciences Website at http://www.ias.ac.in/jbiosci/ jun2015/supp/Mishra.pdf http://www.ias.ac.in/jbiosci Published online: 27 April 2015 J. Biosci. 40(2), June 2015, 241–255, * Indian Academy of Sciences 241 242 H Mishra and S Saran Conservation among these genes provides an effective tool to study evolution of organisms. Presently, Dictyostelium discoideum, a member of mycetozoa belonging to phylum Amoebozoa and phylogenetically closer to ophisthokonta than the green plants, has been analysed. This position is supported with a combined maximum likelihood analysis of 19 proteins (Kuma et al. 1995) and concatenated alignment of 82 universally conserved eukaryotic proteins (Iyer et al. 2008). Phylogenetic analyses of the combined data set of EF-1α, HSP70, actin and β-tubulin amino acid sequences strongly support the assemblages of ophisthokonta, which includes animals and fungi with amoebozoa (Steenkamp et al. 2006). The thought that D. discoideum had originated from ophisthokonta is ruled out by its lack of an 11- to 13-amino-acid insertion found exclusively in all animal and fungal EF-1α (Baldauf and Palmer 1993). An analysis of 1,280 expressed sequence tags (EST) from M. balamuthi, D. discoideum and E. histolytica in a data set of 123 genes from 30 species provided very strong support for the position of Conosa (M. balamuthi, D. discoideum and E. histolytica) as a sister clade of ophisthokonta and also suggested that Dictyostelium diverged after the plant–animal split but before the divergence of fungi (Bapteste et al. 2002). In other words, dictyostelids occupy a position in evolution of eukaryotes as a sister clade of metazoans and fungi and are members of the eukaryote crown group. The homeobox genes are largely studied in metazoans, and in the present study, we have analysed the homeobox genes from the unicellular eukaryote, D. discoideum. On the basis of its homeodomain sequence and predicted tertiary structures, we could successfully classify them as per the known classification criteria from animals (Holland et al. 2007; Takatori et al. 2008). There are 14 predicted homeobox genes with one duplicate on chromosome 2. The functions of few of the genes have already been characterized in D. discoideum: hbx1 (wariai) regulates cell-type differentiation and proportioning, hbx3 plays a role in phagocytosis and hbx4 exhibits defects in cytokinesis and development (Han and Firtel 1998; Sillo et al. 2008; Kim et al. 2011). The homeobox genes from D. discoideum could be classified largely into two groups: TALE class and nonTALE. The non-TALE could be further subdivided into NOVEL class and OTHER. TALE class members were further divided into PBX, PKNOX, IRX and CUP families. In this phylogenetic analyses fungus were used as an out-group since these are sister to both Metazoans and Amoebas. The RNA expression analyses of all 14 during development of D. discoideum were carried out. Genome data from such additional unicellular relatives of metazoans will be useful in the analysis of evolution of multicellularity and will facilitate comparative genomics of this important gene superclass. J. Biosci. 40(2), June 2015 2. 2.1 Materials and Methods Homeobox genes from D. discoideum A genome-wide search for the homeodomain sequences was performed using BLAST server at the dictyBase using the known homeodomain sequences from the ANTP, PRD, LIM, POU, HNF, SINE, TALE, CUT, PROS, ZF and also OTHER classes of different organisms ranging from unicellular fungus to multicellular vertebrates. No arbitrary Evalue cut-off was selected but instead each list of hits was manually analysed until true homeodomain sequences ceased to be detected. Hits were analysed for the presence of a homeodomain, additional domains and motifs by using the Simple Modular Architecture Research Tool (http:// smart.embl-heidelberg.de/) (Letunic et al. 2011). The individual homeodomains from multi-homeodomain protein in Dictyostelium was considered as separate taxa in our phylogenetic analyses – lowercase letters appended to the gene name distinguish downstream homeodomain that are derived from a single protein (Ryan et al. 2006). 2.2 Database searches and retrieval of protein sequences To classify the Dictyostelium homeobox genes, sequences of known homeodomains were collected using BLASTp through NCBI and of different databases such as http:// www.tigrorg/tdb/e2k1/cna1 (Cryptococcus neoformans) and www.wormbase.org (Caenorhabditis elegans, WS203). Cut-off E-value (e−05) was selected. In case of the NOVEL class proteins, hits showed insignificant E-values and therefore we relaxed the E-value criteria for obtaining sequences showing slight similarity (E-value e0.009– e1.5) to NOVEL class members. No unnamed, predicted or hypothetical proteins were selected. Homeodomain sequences were extracted using SMART. Incomplete and redundant sequences were removed manually. 2.3 Phylogenetic analyses Phylogenetic analyses were performed with homeodomain sequences after aligning the 60 amino acids using Maximum Likelihood (ML) method. ML trees were constructed using the PhyML program (Guindon and Gascuel 2003; Guindon et al. 2005) with JTT amino acids substitution model Branch support was assessed using SH-like (Shimodaira-Hasegawa) in PhyML program. Sequences showing insignificant branch support were removed manually from the final trees constructed. Trees were re-rooted using TreeDyn (http:// www.phylogeny.fr) (Dereeper et al. 2008) and visualized by MEGA6.0 (Molecular Evlutionary Genetics Analysis) (Tamura et al. 2013). Species codes used: Gg: Gallus gallus; Homeobox genes from Dictyostelium Dm: Drosophila melanogaster; Hs: Homo sapiens; Mm: Mus musculus; Xl: Xenopus laevis; Sc: Saccharomyces cerevisiae;; Cf: Camponotus floridanus; Tc: Tribolium castaneum; Dp: Danaus plexippus; Hsa: Harpegnathos saltator; Cb: Cerapachys biroi; Ae: Acromyrmix echinatior; Dd: Dictyostelium discoideum; Cn: Cryptococcus n e of o r m a n s ; Ec : E nceph alito zoo n cunicu l i ; Nv : Nematostella vectensis; Cs: Cupiennius salei; Sp: Strongylocentrotus purpuratus; Mda: Myotis davidii; Aa: Aedes aegypti; Ta: Trichoplax adhaerens; Sk: Saccoglossus kowalevskii; Cgi: Crassostrea gigas; Dr: Danio rerio; Rn: Rattus norvegicus; Cg: Cricetulus griseus; Cl: Columba livia; Ss: Salmo salar; So: Sepia officinalis; Am: Ambystoma mexicanum; Cc: Cyprinus carpio; Bt: Bos taurus; Cgat: Cryptococcus gattii; Cq: Culex quinquefasciatus; Tas: Trichosporon asahii; Op: Ogataea polymorpha; Om: Osmerus mordax; Ggo: Gorilla gorilla; Pa: Pteropus alecto; Mf: Macaca fascicularis; Od: Oikopleura dioica; Zn: Zootermopsis nevadensis; Cj: Callithrix jacchus; Md: Microplitis demolitor; Wi: Wallemia ichthyophaga; Cmy: Chelonia mydus; He: Heliocidaris erythrogramma; Pv: Patella vulgata; Pl: Paracentrotus lividus; Clo: Convolutriloba longifissura; Hm: Hymenolepis microstoma; Xt: Xenopus tropicalis; Phc: Pediculus humanuscorporis; Po: Paralichthys olivaceus; Csa: Ciona savignyi; Bm: Brugia malayi; Pdu: Platynereis dumerilii; Lc: Lethenteroncamts chaticum; Eb: Eptatretus burgeri; Pd: Papilio dardanus; On: Ostrinia nubilalis; Ha: Haliotis asinina; Td: Thermobia domestica; Fc: Folsomia candida; Lf: Lithobius forficatus; Gm: Glomeris marginata; Sr: Symsagittifera roscoffensis; Bmu: Bos mutus; Tch: Tupaia chinensis; Rt: Rhodosporidium toruloides 2.4 Homology modelling for tertiary structure predictions Secondary and tertiary structures of Dictyostelium homeodomains were predicted by JPred (Cole et al. 2008) (http://www.compbio.dundee.ac.uk/www-jpred) implemented through the Barton Group, University of Dundee and Phyre2 Protein Homology/Analogy Recognition Engine (Kelley and Sternberg 2009) (http:// www.sbg.Bio.ac.uk.phyre2) respectively. Structures were aligned using aligner 3D algorithm in STRAP (http:// www.bioinformaticsorg/strap/) and tree was obtained by Archaeopteryx in the same program using NJ method. Alignment was modified for the present representation (figure 1) purpose where the positions of TALE class genes were changed. For homology modelling the crystal structure of the different homeodomains obtained from Protein Data Bank (PDB) was used as a template and the quality of the model was judged by Ramachandran Plot as checked by PROCHECK tool on SAVES online resource for each protein (http://nihserver.mbi.ucla.edu/SAVES/). 2.5 243 RNA detection by semi-quantitative RT-PCR RNA was isolated from cells collected at specific stages of development as mentioned in Gosain (Gosain et al. 2012). RT-PCR reactions were performed using specific primer sets and ig7 (rnlA) was used as an internal control (supplementary table 1). Three to five independent mRNA isolations were carried for each stage which was processed further for RT-PCR reactions. An average of three were taken and plotted in the given figures. 2.6 In silico analyses of putative promoter regions of homeobox gene classes 1000 bp upstream or intergenic region of the genes were taken to search for probable transcription factors binding sites using MEME (Multiple EM algorithm for Motif EliCitation) suite (http://meme.nbcr.net/meme/) (Bailey and Elkan 1994) taking a default criteria of zero or one motif per sequence. Significant presence of these motifs were found using program MAST (Bailey and Gribskov 1998) in a total of 13,495 upstream sequences of D. discoideum. Identical motifs were removed and selected motifs were matched with known transcription factor database TRANSFAC (Wingender et al. 1996) using PATCH1.0 program (http:// www.gene-regulation.com/pub/programs.html). The first 57 promoter sequences of promoter database from dictybase which did not have homeobox genes were chosen as random promoter sequences for the analyses. 3. 3.1 Results and discussion Classification of the homeobox genes from D. discoideum The purpose of comparing and classifying the homeobox genes is to determine evolutionary relationships between different genes and also to place them in a framework that would allow revealing the structural and functional relationships. A few homeobox genes such as Hox, ParaHox and NK are present in clusters which evolved by cis-duplication events (for review, see Garcia-Fernàndez 2005) and are one of the characteristic features of Hox genes. A total of 39 hox genes from humans are present in four paralogous group on four different chromosomes and in case of Drosophila these genes are present as a single cluster of eight genes. The relative position in their cluster and also the positions of their introns are conserved (Roy et al. 2003). BLAST search results confirmed 14 homeobox genes whose sequences were collected from dictyBase (Chisholm et al. 2006) but none of them could be hox genes as we did not observe any clustering in the genomic structure of D. discoideum J. Biosci. 40(2), June 2015 244 H Mishra and S Saran Figure 1. Multiple structural alignments of the identified homeodomains of D. discoideum. The online software aligner 3D of STRAP server was used for the construction of this figure as well as consensus sequences. The position of the three helices as well as consensus sequences on the aligned homeodomains is marked. Box marks the characteristic three amino acids in the loop region of TALE class members. (supplementary figure 2). These genes are present on 4 different chromosomes and show no uniformity in distribution: chromosomes 2 and 6 have 4 genes each, while chromosomes 3 and 4 have 3 genes each. This probably does not seem to be a result of recent chromosomal duplication as their positions on the chromosomes are not conserved. Most of the genes (10 out of 14) were interrupted by introns (1 or 2 in number) and their positions were also not conserved. The homeobox in hbx11 and hbx7 genes were interrupted and they fall in different classes. The intron in warA interrupts the ankyrin repeats and is a homeotic gene (Han and Firtel 1998). Homeodomain has a very conserved architecture with three helices where the first and the second helices are almost antiparallel to each other while the third helix is roughly perpendicular to the other two (Kissinger et al. 1990). The second and the third helices form a helix-turnhelix motif (Harrison and Aggarwal 1990). The predicted tertiary structure alignment of homeodomains from D. discoideum show the residues involved in the construction of the three helixes, turn and loop (figure 1). We do observe a broad division of TALE class and non-TALE. Within the non-TALE subclades exist which later on were recognized as NOVEL class and OTHER (figure 2). In OTHER group Hbx2 appeared as an outcast and come closer to TALE group of genes. A comparison of the structure based tree with the sequence based tree was also carried, but the clarity was found to be better in this case. A typical homeodomain has 60 amino acids long sequence with few exceptions whereas the atypical homeodomain is characterized by an insertion of extra residues between helix 1 and 2 or between 2 and 3 of the homeodomain (Mukherjee et al. 2009). The TALE class is J. Biosci. 40(2), June 2015 very ancient being present in wide range of eukaryotic lineages. The TALE class homeobox proteins always have three Proline-Tyrosine-Proline residues except in TGIF family where it is Alanine-Tyrosine-Proline making it a total of 63 amino acid residues (Bertolino et al. 1995). Another characteristic feature of the TALE class is that the loop is often followed by a Serine/Threonine or any other acidic residues (Bürglin 1997). The three (Hbx6, Hbx8 and Hbx14) NOVEL class proteins show the presence of double homeodomain where each of these domains are separated by six residues wherein the 1st and 4th position are occupied by Tyrosine and Asparagine residues respectively. The second homeodomain is 56–57 amino acids long. For the members of the OTHER (Hbx2, Hbx5-1, Hbx13, Hbx10 and WarA), the first helix starts from the tenth residue which is mostly Proline or any other acidic residue. In total, five TALE class and nine non-TALE class members were identified in D. discoideum. Both TALE and nonTALE classes were already present in the ancestor of eukaryotes (Derelle et al. 2007). The phylogenetic position of D. discoideum is far away from the known metazoans like Drosophila and humans, where the multicellularity is dissimilar to that originated in these metazoans. We found nearly 36–48% sequence similarity in the homeodomain proteins of D. discoideum and metazoans. The classical way to classify these proteins is based on the phylogenetic analyses of the homeodomain sequences, their chromosomal location and domain composition (Holland et al. 2007). In case of D. discoideum no specific codomains were found except the ankyrin repeats in WarA and Chromatin organization modifier domain (CHROMO) in Hbx12 (supplementary figure 3). The ML tree thus obtained (figure 3) strengthen our earlier classification as Homeobox genes from Dictyostelium 245 Figure 2. Tertiary structure based Neighbor Joining phylogenetic tree. The 3D structure of the homeodomains of all homeodomain containing proteins were predicted and based on that a tree was generated by Archaeopteryx method in the STRAP server. No root was selected for the tree. The broad division of different classes namely, TALE, NOVEL and OTHERS which would be confirmed later can also be seen here. shown in figure 1. The three groups of proteins were identified in this organism, namely: TALE class, NOVEL class and OTHER according to the similarity they share with the known homeodomain proteins from different organisms. NOVEL class proteins are unique to D. discoideum and share no similarity to any known homeodomain containing proteins and thus make a separate clade. OTHER group members show most ambiguous positions as they do not make a united class but rather behave as individual genes showing similarity (38–61%) to members of ZFH, PRD, LIM and TALE classes of animals, fungi and microsporidia. WarA show similarity to Encephalitozoon cuniculi; LIM, Hbx13 make a clade with PRD class members; Hbx10, Hbx5-1 show affinity to ZFH class members while Hbx2 shares similarity with TALE class members. Structure based tree analysis (figure 2) allows most of them to fall in a single clade with one stray member, Hbx2. It was therefore not feasible to classify them in a particular class and have thus been categorized as unclassified and placed in OTHER of homeodomain proteins according to the animal classification (Holland et al. 2007). TALE class members show most promising similarity to TALE class proteins mostly from metazoans. These fall together in a clade along with members of MEIS, PKNOX, CUP and PBX families of TALE class. Dictybase also annotates hbx4 as an orthologue of the cup gene. The phylogenetic analysis of these proteins failed to confidently assign homeodomains of D. discoideum to any of the major class of homeodomains from metazoans, and fungi except for the TALE class. To further improve the resolution, phylogenetic trees specific to TALE class and non-TALE members were constructed where D. discoideum genes were resolved into a family. We retrieved homeodomain sequences of proteins that showed E-value <e−05in BLASTp through NCBI, where individual homeodomains of D. discoideum were used as a query. Using this approach, we got good support for the position of our sequences in the phylogenetic tree assessment (figures 4, 5 and 6). As shown in figure 4, NOVEL class members form a clade but show similarity to members of SINE class of metazoans. SINE class (named after the Drosophila gene so: sine oculis) homeobox genes encode a distinct typical homeodomain and upstream to it is the highly conserved 115-amino-acid-long SIX domain. Both the SIX domain and the homeodomain are required for DNA binding (Kawakami et al. 2000). This similarity is distant as NOVEL class homeobox proteins (Hbx6, Hbx7, Hbx8 and Hbx14) do not show sequence similarity to any known homeodomain J. Biosci. 40(2), June 2015 246 H Mishra and S Saran Figure 3. Maximum likelihood (ML) tree was obtained with homeodomains of selected proteins showing homology to all homeodomain containg proteins from D. discoideum. Tree was built using SPRs tree topology searches and JTT amino acid substitution model. Trees have been rooted with TasLim. Key (●, ■, ♦) represent TALE, NOVEL and OTHER proteins of D. discoideum respectively. sequences. Irrespective of this, we have considered these homeodomain sequences to construct the tree simply to show that these proteins are unique to this organism. This is not very unusual as every organism have few proteins which are unique to them and therefore do not belong to any known class for example bix1 of Xenopus and ahbx1 of Branchiostoma (homeodb). NOVEL class proteins such Hbx6, Hbx8 and Hbx14 show distinct characteristics of having double homeodomains (supplementary figure 3). Hbx6 and Hbx14 have very similar sequence in their homeodomains. In other words, first homeodomain of Hbx6 is similar to first homeodomain of the Hbx14 (98%) and second with second (94%) but this similarity J. Biosci. 40(2), June 2015 was not observed in the first and second (49%) homeodomains of the particular protein itself. Their presence on different chromosomes suggests that they may have translocated and diverged from a common ancestor. Both Hbx8 and Hbx7 are present on the same chromosome but on different strands, possibly a result of duplication and transposition. Both human and mouse have genes with double homeodomains belonging to the family DUX of PRD class (Holland et al. 2007; Zhong and Holland 2011). Most of these genes are pseudogenes and they are derived from retrotransposition of mRNA transcript of functional DUX genes (Booth and Holland 2007). NOVEL class genes of D. discoideum do not show significant similarity to any of them. Homeobox genes from Dictyostelium 247 Figure 4. ML tree was obtained with homeodomains of selected genes which matched with NOVEL class D. discoideum homeodomain proteins. The tree was constructed using SPRs tree topology searches and JTT amino acid substitution model. Trees have been rooted with RtYox1. Key (●) represent D. discoideum proteins. D. discoideum homeodomains fall with LIM, ZFH, POU and PRD class members (figure 5). WarA shows similarity with the fungal and microsporidia (Encephalitozoon cuniculi) LIM class. It was also earlier reported that one homeodomain (EcHD-10) of Encephalitozoon cuniculi was related to the homeodomain of WarA (Bürglin 2003). WarA and Hbx10 share similarity in their protein sequences (78%) suggesting that they may have duplicated recently from a common ancestor. Surprisingly, Hbx10 shows PRD pattern signature sequence at positions 16-26 except at positions 23rd and 25th wherein this protein has aspartic acid and methionine instead of AEFHKQRV and FHY respectively (Fonseca et al. 2008). Hbx13 of D. discoideum, which falls between Hbx10 and WarA, show affinity with members of PRD class. This protein is highly divergent from the rest of the proteins represented here as it is clear from the branch length and this variability in evolutionary rates sometime may generate artificial groupings in the eukaryotic molecular tree because of LBA (Long Branch attraction) (Philippe et al. 2000). Hbx2 and Hbx5-1 show affinity with members of ZFH class and NK family homeodomain, respectively. Since our analyses failed to classify these homeodomain proteins of D. discoideum satisfactorily we grouped them under OTHER, taking note of the criteria used in human homeodomain classification (Holland et al. 2007). We thus placed the NOVEL class members in a separate class and did J. Biosci. 40(2), June 2015 248 H Mishra and S Saran Figure 5. ML tree was obtained with homeodomains of selected genes which matched with OTHER D. discoideum homeodomain proteins. The tree was constructed using SPRs tree topology searches and JTT amino acid substitution model. Trees have been rooted with TasLim. Key (●) represents D. discoideum proteins. not consider them as members of OTHER. NOVEL class proteins bear no similarity to any known class of homeodomain proteins and are unique to D. discoideum while the OTHER group members show weak similarity to many known homeodomain proteins but still do not belong to any one class as such. TALE class has following genes families: in case of animals: MEIS, PBX, TGIF, MKX, PKNOX/PREP and IRX/IRO (Iroquois); in case of plants: KNOX and BEL; and in case of fungi: M-ATYP (Atypical Mating type gene) and CUP (Bürglin 1997). D. discoideum TALE class members belong to IRX, PBX and CUP (present only in fungus) J. Biosci. 40(2), June 2015 family clades of fungus and metazoans (figure 6). In this tree, Hbx4 is placed before the animals and along with the fungus CUP family. This position is basal to IRX, MEIS and PKNOX families of animals. PKNOX, found only in animals is not orthologous to KNOX family of plants (Holland et al. 2007). It is quite possible that Hbx4 have features of MEINOX domain (the amalgam of the MEIS and KNOX domains) that was already present when plants and animals diverged (Bürglin 1997). MEIS domain also show similarity to PBX domain of animals meaning that PBX domain are also derived from MEINOX domain (Bürglin 1998). Hbx3, Hbx11 and Hbx12 TALE class members form a clade that Homeobox genes from Dictyostelium 249 Figure 6. ML tree was obtained with homeodomains of selected genes which matched with TALE class D. discoideum homeodomain proteins. The tree was constructed using SPRs tree topology searches and JTT amino acid substitution model. Trees have been rooted with ScCup9. Key (●) represents D. discoideum proteins. belongs to PBX class of metazoans. Hbx9 shows good supported similarity to IRX family proteins but if we construct a tree without IRX then Hbx9 comes with PKNOX family. These proteins do not show significant resemblance with MKX and TGIF families of vertebrates. 3.2 Temporal expression pattern of homeobox genes RNA expression patterns of the homeobox genes at specific stages of development of D. discoideum were analysed to find if there exists any unique expression pattern for each class. 3-5 independent mRNA isolations were carried for each gene and an average was taken to plot the graphs shown. Figure 7 shows the temporal expression pattern in members of TALE class namely hbx3, hbx4, hbx9, hbx11 and hbx12. We found a gradual increase in expression from freshly starved cells to slug stage and hence after, it maintained its levels in case of hbx3, hbx4 and hbx11 (figures 7A, B and D). hbx9 (figure 7C) showed near uniform expression throughout development and is the only TALE class member that express at higher levels during vegetative stage suggesting its possible requirement during growth and proliferation. These patterns are consistent with RNA-seq expression J. Biosci. 40(2), June 2015 250 H Mishra and S Saran Figure 7. Temporal expression pattern of TALE class members during development of D. discoideum. Relative abundance of the specific genes during development is shown: (A) hbx3, (B) hbx4, (C) hbx9, (D) hbx11, (E) hbx12. [FS – Freshly starved LA – Loose aggregates, M – Mound, SL – Slug, EC – Early culminant and FB – Fruiting body. N=3.] available on dictyExpress (Rot et al. 2009). hbx12 (figure 7E) transcript was at its peak at the mound stage which is different as compared to dictyExpress where it shows maxima in freshly starved cells. It is already known that hbx4 plays a role in cell patterning and cytokinesis as it regulates cadA gene which encodes CAD-1, a Ca2+-dependent cell adhesion molecule (Kim et al. 2011). Hbx3 plays a role during phagocytosis (Sillo et al. 2008). Figure 8 shows the expression pattern of members from OTHER class namely, hbx2, hbx5-1, hbx10, hbx13 and warA during development. hbx2, hbx5-1and hbx10 (figures 8A, B and C) show higher expression in freshly starved cells suggesting a role during early development hbx5-1show sharp decrease at mound stage which is not in accordance with dictyExpress where it decreases gradually. In dictyExpress hbx13 (figure 8D) shows a dip at mound stage whereas in this study expression showed a gradual increase till early culmination suggesting its involvement in differentiation. warA (figure 8E) expression increased from vegetative to mound and showed a small decrease till culmination. warA is known to be responsible for prestalkO proportioning in slugs by regulating the number of prestalkO cells differentiating from anterior-like-cells (Han and Firtel 1998). Figure 9 shows the expression patterns of NOVEL class members namely, hbx6, hbx7, hbx8 and hbx14 having double homeoboxes. The level of expression for all the members of this class was comparatively lower and no unique pattern was observed. hbx6 (figure 9A) expression is more or less uniform throughout development with a small rise at aggregation and slug stages. hbx7 (figure 9B) expression gradually decreased from starved cell to mound then remained at basal levels throughout development. hbx8 (figure 9C) show a wave like pattern having two peaks: one in freshly starved cells and other during slug and early culmination stages with a trough at aggregation and mound stages. RNAseq data from dictyExpress gives a different picture, where hbx8 show only one peak at loose aggregate stage. hbx14 (figure 9D) expression is more or less same throughout development with maximum during aggregation stage but according to dictyExpress it shows an increase only at culmination. One could possibly say that the members of NOVEL class genes show higher expression levels during freshly starved cells. Expression analyses were carried to check if the individual members of each class followed a unique pattern of expression, which could further help speculate functional significance for this organism. The results observed did not show the emergence of any such unique pattern for any class J. Biosci. 40(2), June 2015 Homeobox genes from Dictyostelium 251 Figure 8. Temporal expression pattern of OTHER members during development of D. discoideum. Relative abundance of the specific genes during development is shown: (A) hbx2, (B) hbx5, (C) hbx10, (D) hbx13, (E) warA. [FS – Freshly starved, LA – Loose aggregates, M – Mound, SL – Slug, EC – Early culminant and FB – Fruiting body. N=3. ] suggesting that each homeodomain containing protein may be having its own specific function. We could thus only speculate their probable functions during development of D. discoideum. Further experiments with each of them are necessary for showing their specific roles during growth and development. 3.3 Putative promoter analysis Homeobox genes like any other genes are regulated by various mechanism(s) including chromatin remodeling, receptor mediated regulation and regulation by other factors. Upstream sequence analyses of identified homeobox genes from D. discoideum could shed light on their functional regulation and the pathways in which they are likely involved. Hence, we searched for the conserved motifs or binding sites in their putative promoter region that could possibly regulate them. We found three significant motifs for each class or group of genes and their sequence logos are shown in figure 10. Conservation among the promoter is a necessary aspect for regulation by transcription factors of conserved pathways. Repetitive occurrence of these motifs within the class and low presence outside the particular class can be a measure of their unique properties; therefore we calculated the repetitive presence of these motifs within the class using the same program and compared the results obtained with homeobox genes with that of 57 randomly chosen genes and 9 conserved hsp70 genes from D. discoideum. TALE class motifs had 3.2 times on an average presence in TALE class putative promoter; NOVEL had 9.5 times per gene and OTHER had 4 times per gene as compared to randomly selected genes which had on an average 2.7 times whereas conserved hsp70 have 8.4 times per gene. We searched the presence of these motifs in non-homeobox containing genes using MAST program. TALE class motifs appeared in 43 genes, NOVEL class in 12 genes and OTHER group in 10 non-homeobox containing genes out of total 13,495 upstream sequences of the D. discoideum. These results revealed that the NOVEL class upstream regions were more unique than those of the TALE class and OTHER. This is also reflected in NOVEL class proteins (figure 4) where they appeared as a unique clade. Resulting motifs were further utilized to find known transcription factor binding sites using TRANSFAC database (Wingender et al. 1996) (binding sites and transcription J. Biosci. 40(2), June 2015 252 H Mishra and S Saran Figure 9. Temporal expression pattern of NOVEL class members during development of D. discoideum. Relative abundance of the specific genes during development is shown: (A) hbx6, (B) hbx7, (C) hbx8, (D) hbx14. [FS – Freshly starved, LA – Loose aggregates, M – Mound, SL – Slug, EC – Early culminant and FB – Fruiting body. N=3.] factors are shown in supplementary figure 3A, B and C). Most of them were DNA binding proteins for example homeobox genes, retinoic acid receptors, MADS box proteins, CACCC box binding factors, PUF (c-myc transcription factor) etc. Most of the binding factors identified were not present in D. discoideum, but few factors have been identified as ortholog or similar proteins. Sequence features of promoters to which these factors bind could suggest the pathways or functions of these genes. NOVEL class homeobox genes may play a role in cellular differentiation as most of them express at initial and late developmental stages where prespore and prestalk fate determination and terminal differentiation occur. Moreover their promoter analysis showed a Mef2 (Myocyte enhancer factor 2) binding site, a MADS box transcription factor among many other factors. The ortholog of Mef2A in Dictyostelium plays a role in cellular differentiation by modulating prestalk/prespore gene expression (Galardi-Castilla et al. 2013). OTHER group promoters have Puf and HSF (Heat shock factors) binding site. Puf protein had been identified as nucleoside diphosphate kinase of nm23-H2 protein (Postel et al. 1993). These proteins have role in cellular growth and differentiation through TGF-β1 signaling (Hsu et al. 1994; Gervasi et al. 1996). D. discoideum have NDPK (nucleoside diphosphate kinase) protein which might bind to these promoter sites. D. discoideum also has an HSF gene, an ortholog of Homo sapiens HSF1 and Lycopersicom HSF24. HSF have known role in cell growth in unstressed fission yeast (Gallo et al. 1993) that could also be a case in D. discoideum. TALE class promoter revealed binding site of wnt pathway transcription factors like TCF1, LEF1 and SRY. These factors act downstream of β-catenin and GSK3. D. discoideum have an ortholog of mammalian β-catenin and GSK3, Aar (Aardvark) and gskA respectively. These orthologs have roles in cell-cell adhesion and cell type proportioning in multicellular phase, under the influence of cAMP (Harwood et al. 1995; Grimson et al. 2000; Coates et al. 2002). It indicates that these attributes get materialized through TALE class genes via TCF1/LEF1 transcription factors. Clearly these motifs are conserved and some of the motifs have binding sites for the homeobox genes itself which is an indicative of cross regulation and auto-regulation. We found few homeodomain binding sites in the promoter of these homeobox genes but the binding factors identified from J. Biosci. 40(2), June 2015 Homeobox genes from Dictyostelium 253 Figure 10. Sequences logos of conserved motifs found in the putative promoter regions of different homeobox gene classes in D. discoideum. Motifs were searched by MEME suite using position specific scoring matrix: (A) TALE class motifs, (B) NOVEL class, (C) OTHER motifs. our analyses, are probably not present in D. discoideum. Therefore, there is no scope of discussing these aspects in the present study. 4. Conclusions The 14 homeobox genes identified in D. discoideum were classified, of which 5 belong to the TALE (atypical) class, and 9 to the non-TALE (typical) group. Further, the nonTALE group could be classified as NOVEL class and OTHER, according to the structure and sequence similarity. Also, we did not find any Hox proteins in D. discoideum. TALE class proteins received the much supported position in a clade of PBC, IRX and CUP families of TALE class of metazoans. This may be because TALE class members are much conserved and primitive in nature. NOVEL class is found to be unique to D. discoideum as they do not show any similarity to known homeodomains. This we think is due to the fact that these genes evolved after the Dictyostelium had diverged from animal and fungal lineage. OTHER genes show similarity to LIM, ZFH and PRD class members, but since these genes did not appear as a united class in the constructed phylogenetic tree, we were unable to classify them further. These genes have sequence features that match with signature sequences of PRD class genes. We, thus, speculate them to be the direct descendants of a common ancestor of this gene class. D. discoideum is a primitive eukaryote and holds a position before fungus and metazoans in the evolutionary tree of life. No unique RNA expression pattern during development of D. discoideum emerged for any of the class. To date only a few homeobox genes have been functionally characterized in this organism. Similar to many multigene families, many homeobox genes may have overlapping functions. Putative promoter analysis revealed that these genes have many probable protein factors binding sites upstream of transcription start site. Studying those features could suggest some pathways that function with or regulate the homeobox proteins in D. discoideum. Acknowledgements SS acknowledges the DST PURSE grant to her, and HM acknowledges UGC for fellowship. References Bailey TL and Elkan C 1994 Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2 28–36 Bailey TL and Gribskov M 1998 Combining evidence using pvalues: application to sequence homology searches. Bioinformatics 14 48–54 Baldauf SL and Palmer JD 1993 Animals and fungi are each other’s closest relatives: congruent evidence from multiple proteins. Proc. Natl. Acad. Sci. USA 90 11558–11562 Bapteste E, Brinkmann H, Lee JA, Moore DV, Sensen CW, Gordon P, Duruflé L, Gaasterland T, et al. 2002 The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba and Mastigamoeba. Proc. Natl. Acad. Sci. USA 99 1414–1419 Bertolino E, Reimund B, Wildt-Perinic D and Clerc RG 1995 A novel homeobox protein which recognizes a TGT core and J. Biosci. 40(2), June 2015 254 H Mishra and S Saran functionally interferes with a retinoid-responsive motif. J. Biol. Chem. 270 31178–31188 Booth HAF and Holland PWH 2007 Annotation nomenclature and evolution of four novel homeobox genes expressed in the human germ line. Gene 387 7–14 Bürglin TR 1997 Analysis of TALE superclass homeobox genes (MEIS PBC KNOX Iroquois TGIF) reveals a novel domain conserved between plants and animals. Nucleic Acids Res. 25 4173–4180 Bürglin TR 1998 The PBC domain contains a MEINOX domain: coevolution of Hox and TALE homeobox genes? Dev. Genes Evol. 208 113–116 Bürglin TR 2003 The homeobox genes of Encephalitozoon cuniculi (Microsporidia) reveal a putative mating-type locus. Dev. Genes Evol. 213 50–52 Chisholm RL, Gaudet P, Just EM, Pilcher KE, Fey P, Merchant SN and Kibbe WA 2006 dictyBase the model organism database for Dictyostelium discoideum. Nucleic Acids Res. 34 D423–D427 Coates JC, Grimson MJ, Williams RSB, Bergman W, Blanton RL and Harwood AJ 2002 Loss of the beta-catenin homologue aardvark causes ectopic stalk formation in Dictyostelium. Mech. Dev. 116 117–127 Cole C, Barber JD and Barton GJ 2008 The Jpred 3 secondary structure prediction server. Nucleic Acids Res. 36 W197–W201 Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, Dufayard JF, Guindon S, et al. 2008 Phylogeny fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 36 W465–W469 Derelle R, Lopez P, Le Guyader H and Manuel M 2007 Homeodomain proteins belong to the ancestral molecular toolkit of eukaryotes. Evol. Dev. 9 212–219 Fonseca NA, Vieira CP, Holland PWH and Vieira J 2008 Protein evolution of ANTP and PRD homeobox genes. BMC Evol. Biol. 8 200 Galardi-Castilla M, Fernandez-Aguado I, Suarez T and Sastre L 2013 Mef2A a homologue of animal Mef2 transcription factors regulates cell differentiation in Dictyostelium discoideum. BMC Dev. Biol. 13 12 Gallo GJ, Prentice H and Kingston RE 1993 Heat shock factor is required for growth at normal temperatures in the fission yeast Schizosaccharomyces pombe. Mol. Cell. Biol. 13 749–761 Garcia-Fernàndez J 2005 The genesis and evolution of homeobox gene clusters. Nat. Rev. Genet. 6 881–892 Gehring WJ 1994 A history of the homeobox; in Guidebook to the homeobox genes (ed) D Duboule (New York: Oxford Univ Press) Gervasi F, D’Agnano I, Vossio S, Zupi G, Sacchi A and Lombardi D 1996 nm23 influences proliferation and differentiation of PC12 cells in response to nerve growth factor. Cell Growth Differ. 7 1689–1695 Gosain A, Lohia R, Shrivastava A and Saran S 2012 Identification and characterization of peptide: N-glycanase from Dictyostelium discoideum. BMC Biochem. 13 9 Grimson MJ, Coates JC, Reynolds JP, Shipman M, Blanton RL and Harwood AJ 2000 Adherens junctions and beta-cateninmediated cell signalling in a non-metazoan organism. Nature 408 727–731 J. Biosci. 40(2), June 2015 Guindon S and Gascuel O 2003 A simple fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52 696–704 Guindon S, Lethiec F, Duroux P and Gascuel O 2005 PHYML Online–a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res. 33 W557–W559 Han Z and Firtel RA 1998 The homeobox-containing gene Wariai regulates anterior-posterior patterning and cell-type homeostasis in Dictyostelium. Development 125 313–325 Harrison SC and Aggarwal AK 1990 DNA recognition by proteins with the helix-turn-helix motif. Annu. Rev. Biochem. 59 933–969 Harwood AJ, Plyte SE, Woodgett J, Strutt H and Kay RR 1995 Glycogen synthase kinase 3 regulates cell fate in Dictyostelium. Cell 80 139–148 Holland PWH, Booth HAF and Bruford EA 2007 Classification and nomenclature of all human homeobox genes. BMC Biol. 5 47 Hsu S, Huang F, Wang L, Banerjee S, Winawer S and Friedman E 1994 The role of nm23 in transforming growth factor beta 1-mediated adherence and growth arrest. Cell Growth Differ. 5 909–917 Iyer LM, Anantharaman V, Wolf MY and Aravind L 2008 Comparative genomics of transcription factors and chromatin proteins in parasitic protists and other eukaryotes. Int. J. Parasitol. 38 1–31 Kawakami K, Sato S, Ozaki H and Ikeda K 2000 Six family genes– structure and function as transcription factors and their roles in development. Bioessays 22 616–626 Kelley LA and Sternberg MJE 2009 Protein structure prediction on the Web: a case study using the Phyre server. Nat. Protoc. 4 363–371 Kim JS, Seo JH, Yim HS and Kang SO 2011 Homeoprotein Hbx4 represses the expression of the adhesion molecule DdCAD-1 governing cytokinesis and development. FEBS Lett. 585 1864–1872 Kissinger CR, Liu BS, Martin-Blanco E, Kornberg TB and Pabo CO 1990 Crystal structure of an engrailed homeodomain-DNA complex at 2 8 A resolution: a framework for understanding homeodomain-DNA interactions. Cell 63 579–590 Kuma K, Nikoh N, Iwabe N and Miyata T 1995 Phylogenetic position of Dictyostelium inferred from multiple protein data sets. J. Mol. Evol. 41 238–246 Letunic I, Doerks T and Bork P 2011 SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res. 40 D302–D305 Mukherjee K, Brocchieri L and Bürglin TR 2009 A comprehensive classification and evolutionary analysis of plant homeobox genes. Mol. Biol. Evol. 26 2775–2794 Philippe H, Germot A and Moreira D 2000 The new phylogeny of eukaryotes. Curr. Opin. Genet. Dev. 10 596–601 Postel EH, Berberich SJ, Flint SJ and Ferrone CA 1993 Human cmyc transcription factor PuF identified as nm23-H2 nucleoside diphosphate kinase, a candidate suppressor of tumor metastasis. Science 261 478–480 Rot G, Parikh A, Curk T, Kuspa A, Shaulsky G and Zupan B 2009 dictyExpress: a Dictyostelium discoideum gene expression database with an explorative data analysis web-based interface. BMC Bioinformatics 10 265 Roy SW, Fedorov A and Gilbert W 2003 Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc. Natl. Acad. Sci. USA 100 7158–7162 Ryan JF, Burton PM, Mazza ME, Kwong GK, Mullikin JC and Finnerty JR 2006 The cnidarian-bilaterian ancestor possessed at Homeobox genes from Dictyostelium least 56 homeoboxes: evidence from the starlet sea anemone Nematostella vectensis. Genome Biol. 7 R64 Sillo A, Bloomfield G, Balest A, Balbo A, Pergolizzi B, Peracino B, Skelton J, Ivens A, et al. 2008 Genome-wide transcriptional changes induced by phagocytosis or growth on bacteria in Dictyostelium. BMC Genomics 9 291 Steenkamp ET, Wright J and Baldauf SL 2006 The protistan origins of animals and fungi. Mol. Biol. Evol. 23 93–106 Takatori N, Butts T, Candiani S, Pestarino M, Ferrier DEK, Saiga H and Holland PWH 2008 Comprehensive survey and classification of homeobox genes in the genome of 255 amphioxus Branchiostoma floridae. Dev. Genes Evol. 218 579–590 Tamura K, Stecher G, Peterson D, Filipski A and Kumar S 2013 MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol. 30 2725–2729 Wingender E, Dietze P, Karas H and Knüppel R 1996 TRAN SFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 24 238–241 Zhong Y and Holland PWH 2011 The dynamics of vertebrate homeobox gene evolution: gain and loss of genes in mouse and human lineages. BMC Evol. Biol. 11 169 MS received 02 December 2014; accepted 12 March 2015 Corresponding editor: DURGADAS P KASBEKAR J. Biosci. 40(2), June 2015
© Copyright 2026 Paperzz