SUPPLEMENTARY MATERIALS

Ecophysiology of freshwater Verrucomicrobia inferred from genomes recovered through time-series metagenomics

Shaomei He1,2, Sarah LR Stevens1, Leong-Keat Chan4, Stefan Bertilsson3, Tijana Glavina del Rio4, Susannah G Tringe4, Rex R Malmstrom4, and Katherine D McMahon1,5,*

1Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
2Department of Geoscience, University of Wisconsin-Madison, Madison, WI, USA
3Department of Ecology and Genetics, Limnology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
4DOE Joint Genome Institute, Walnut Creek, CA, USA
5Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, WI, USA
* Corresponding author

SUPPLEMENTARY TEXT

Population abundance estimated from MAG coverage depth
Abundance of populations represented by MAGs at the sampling time points was inferred by the coverage depth of these MAGs within individual metagenomes. First, coverage depth of each contig was obtained by mapping merged reads from each metagenome to all MAGs using the Burrows–Wheeler aligner (BWA)-backtrack alignment algorithm with a 95% sequence identity cutoff and n=0.05, as described in Bendall et al. (2016). Based on the number of reads mapped to each contig, we calculated the coverage depth of each contig. The contig coverage depth was then weighted by its contig length and averaged within each MAG to obtain a weighted average, so that longer contigs (which tend to have more reliable coverage estimation) weigh more in the estimate of the MAG coverage depth. The MAG coverage depth within each metagenome was finally normalized by the total number of reads in each metagenome and multiplied by the maximal number of reads from all metagenomes so that the coverage can be compared across different time points and different lakes (Figure 2).
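The weighting and normalization steps described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the input layout (per-contig length and coverage depth, and per-metagenome read totals) and the example numbers are assumptions made for illustration, and the read-mapping step itself is not shown.

```python
from typing import Dict, List, Tuple


def mag_coverage_depth(contigs: List[Tuple[int, float]]) -> float:
    """Length-weighted mean coverage depth of one MAG in one metagenome.

    `contigs` holds (contig_length_bp, contig_coverage_depth) pairs
    (hypothetical input layout). Longer contigs contribute more.
    """
    total_length = sum(length for length, _ in contigs)
    if total_length == 0:
        raise ValueError("MAG has no contig sequence")
    weighted_sum = sum(length * depth for length, depth in contigs)
    return weighted_sum / total_length


def normalize_across_metagenomes(
    raw_depth: Dict[str, float], total_reads: Dict[str, int]
) -> Dict[str, float]:
    """Scale each metagenome's MAG coverage to the largest read count.

    Each depth is divided by the total reads of its metagenome and
    multiplied by the maximal read count over all metagenomes.
    """
    max_reads = max(total_reads.values())
    return {
        sample: depth * max_reads / total_reads[sample]
        for sample, depth in raw_depth.items()
    }


if __name__ == "__main__":
    # Made-up values: two contigs of 1 kbp and 3 kbp.
    depth = mag_coverage_depth([(1000, 10.0), (3000, 20.0)])
    print(depth)
    # Made-up read totals for two hypothetical samples.
    print(normalize_across_metagenomes(
        {"sample_a": depth, "sample_b": 10.0},
        {"sample_a": 1_000_000, "sample_b": 2_000_000},
    ))
```

Under this scheme a MAG sampled in a shallowly sequenced metagenome is scaled up to the depth it would be expected to have at the largest sequencing effort, which is what makes values comparable across time points and lakes.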

Glycolate utilization
Previously, Verrucomicrobia were suggested to be among the glycolate utilizers in humic lakes, based on the retrieval of genes encoding glycolate oxidase subunit D (glcD) (Paver and Kent 2010). Glycolate is an algal exudate, which was suggested to influence bacterial community structure in lakes. The first step in bacterial glycolate utilization is the conversion of glycolate to glyoxylate by glycolate oxidase, a multi-subunit protein complex consisting of subunits D, E, and F (glcDEF). In E. coli, all three subunits are essential to its activity (Pellicer et al 1996). The glc operon of E. coli also contains glcB, encoding malate synthase G, which converts glyoxylate to malate to be utilized through the TCA cycle. Among the MAGs, only TE4605 possesses all three subunits of glycolate oxidase (glcDEF) (Figure S6). However, unlike that of E. coli, the TE4605 glc operon lacks the malate synthase G gene and instead contains genes for an alanine (or serine)-glyoxylate transaminase (AGXT) and a glycolate permease (Figure S6). Therefore, it is likely that glyoxylate generated from glycolate oxidation is converted to glycine for amino acid assimilation (Figure S6), rather than being used for energy generation through the TCA cycle as in E. coli. A similar operon containing glcDEF and AGXT is also present in a soil verrucomicrobial aerobe, Chthoniobacter flavus. Notably, C. flavus was reported to be unable to grow on glycolate as the sole carbon and energy source (Sangwan et al 2004), supporting our hypothesis that TE4605 likely uses glycolate for amino acid assimilation rather than energy generation.

Interestingly, TE4605 also contains a second copy of glcD (glcD2), which is not associated with glcEF. GlcD2 shares only 34% amino acid identity with the glcD in the glc operon mentioned earlier.
Notably, this glcD2 is 100% identical at the nucleotide level to the verrucomicrobial glcD clone OTU45 from the study by Paver and Kent (2010), and is likely derived from the same species. In fact, nearly all MAGs have glcD, some of which share >60% amino acid identity with glcD clones (OTU43, OTU44 and OTU45). However, these glcD genes, like glcD2 in TE4605, lack glcEF in their genomic vicinity, and each is either an orphan gene or located in an operon that is not apparently involved in glycolate metabolism. This raises the question of whether these genes are bona fide glcD and whether freshwater Verrucomicrobia are glycolate degraders in general.

Acetate metabolism
Transporters for monocarboxylic acids (such as pyruvate, acetate, propionate) belong to a large solute:sodium symporter (SSS) family, which can transport sugars, amino acids, nucleosides, inositols, vitamins, urea or anions. Genes belonging to the SSS family were present in all MAGs, yet the substrate specificities of most of them cannot be determined from the annotation. Several SSS genes are annotated as acetate permeases (actP). Together with the genes for acetate activation to acetyl-CoA present in these MAGs, actP would allow acetate to enter the TCA cycle for energy generation. However, these MAGs lack isocitrate lyase and malate synthase (Figure S8), key enzymes of the glyoxylate cycle, which is necessary when cells grow with a two-carbon compound, such as acetate, as the sole carbon source. Pathways alternative to the glyoxylate shunt have been proposed to replenish four-carbon intermediates during growth on acetate. Yet, among the MAGs, only TH2746 possesses key genes of the ethylmalonyl-CoA pathway for growing on acetate (Figure S8) (Schneider et al 2012). Therefore, for most MAG-represented freshwater verrucomicrobial populations, acetate might be used as a supplementary source of energy, but not as the sole energy and carbon source for growth.

Phosphorus (P) metabolism and adaptation to P-limited conditions
The high-affinity phosphate-specific transporter (PstABC) system genes were recovered in nearly all MAGs, and the low-affinity phosphate permease (PitA) genes are also present in most MAGs (Figure 6), allowing cells to efficiently take up inorganic phosphate over a wide range of concentrations. In addition, alkaline phosphatase (PhoA) genes were recovered in half of the MAGs and phosphonoacetate hydrolase (PhnA) genes are also present in some MAGs. These two enzymes enable cells to use phosphate monoesters and organophosphonates as a P source under P starvation, respectively. Further, the polyphosphate kinase (PPK) genes found in nearly all MAGs may allow cells to accumulate polyphosphate for future use when environmental P becomes scarce. Overall, the presence of genes responding to P limitation, such as the two-component regulator (phoRB), phoA, phnA, and pstABC in these Verrucomicrobia populations suggests a strategy to survive P limitation. Previously, positive correlations between freshwater Verrucomicrobia abundance and P availability were observed (Haukka et al 2006, Lindström et al 2004). However, despite the much higher P levels in Mendota, we did not observe higher population abundance of Verrucomicrobia in Mendota or underrepresentation of their genes responding to P limitation. Therefore, Verrucomicrobia populations in Mendota are probably more influenced by the availability of organic autochthonous C substrates.

Sulfur metabolism
Dissimilatory sulfate reduction genes were only found in TH4590, and genes for dimethylsulfoxide (DMSO) reduction and polysulfide reduction are absent in all MAGs, as are sulfur and thiosulfate oxidation (SOX) genes (Figure S8). These observations suggest that redox processes with sulfur-containing compounds are not important modes of energy generation for these Verrucomicrobia populations.
The ABC-type sulfate transporter or sulfate permease genes, as well as assimilatory sulfate reduction genes, were found in most MAGs (Figure S8). By contrast, sulfonate transporter genes were only found in TH4903, and genes encoding alkanesulfonate monooxygenase, which is involved in sulfur acquisition under sulfur-limiting conditions by splitting organosulfonates to sulfite and formaldehyde, are absent in all MAGs. The presence of sulfate transporter and assimilatory sulfate reduction genes, together with the absence of genes involved in sulfur acquisition under sulfur-limited conditions, is consistent with our hypothesis that the degradation of sulfated polysaccharides may serve as an abundant source of sulfur for cell biosynthesis, based on the high occurrence of sulfatase genes in these MAGs.

Oxygen tolerance
Oxygen (O2) reduction products such as superoxide (O2-) and hydrogen peroxide (H2O2) can damage cells. Superoxide dismutases (SODs) convert O2- to H2O2 and O2; H2O2 is less destructive to cells and can be subsequently eliminated by the activities of catalases or peroxidases. All MAGs have SOD genes, and the majority of them also contain catalase and/or peroxidase genes (Figure S8). The lack of catalase and peroxidase genes is not particularly associated with MAGs recovered from the anoxic hypolimnion, but rather is probably due to the incomplete coverage of their genomes. Therefore, the presence of SOD, catalase and/or peroxidase genes in most MAGs suggests that most of these Verrucomicrobia, including the ones found in the hypolimnion, are able to tolerate oxygen.

Oxidative phosphorylation and alternative complex III
Most of these MAGs possess genetic components of the oxidative phosphorylation pathway, including NADH:quinone oxidoreductase (Complex I), succinate:quinone oxidoreductase (Complex II), the low-affinity caa3-type cytochrome c oxidase and/or the high-affinity cbb3-type cytochrome c oxidase (Complex IV), and the F-ATPase (Complex V) (Figure S8). However, a bona fide cytochrome bc1 complex, a quinol:cytochrome c oxidoreductase (Complex III), is missing in all of the MAGs, and is also missing in all Verrucomicrobia isolate genomes, including the obligate aerobes (data not shown). An alternative complex III (ACIII) was proposed to perform the function traditionally provided by the cytochrome bc1 complex in the Bacteroidetes bacterium Rhodothermus marinus (Pereira et al 2007). We found ACIII genes in Verrucomicrobia isolate genomes and most MAGs, suggesting this phylum uses ACIII for electron transfer. In some cases, the ACIII genes are located immediately upstream of the cbb3-type cytochrome c oxidase complex genes in the same operon. Taken together, the presence of oxidative phosphorylation and cytochrome c oxidase genes would enable oxygen to be used as an electron acceptor for energy generation. Interestingly, the low-affinity aa3-type cytochrome c oxidase genes are not restricted to MAGs in the epilimnion, where oxygen is available in higher concentrations.

Occurrence of Planctomycete-specific cytochrome c and domains
A number of domains that were initially identified as “Planctomycete-specific” (Studholme et al 2004) are abundant in our Verrucomicrobia MAGs.
Among them are three Planctomycete-specific cytochrome c domains (PSCyt1, PSCyt2, and PSCyt3, represented by pfam07635, pfam07583, and pfam07627, respectively), five Planctomycete-specific domains (PSD1 through PSD5, represented by pfam07587, pfam07624, pfam07626, pfam07631, and pfam07637, respectively), and two domains with unknown functions (DUF1501 and DUF1552, represented by pfam07394 and pfam07586, respectively).

PSCyt-containing genes in our Verrucomicrobia MAGs encode multi-domain proteins, most of which contain both PSCyt and PSD domains and exhibit various domain architectures. Based on the combination of specific PSCyt and PSD, these domain structures can be classified into three groups. Group I contains PSCyt1, but not PSD or other PSCyt; Group II contains PSCyt2, which exclusively pairs with PSD1 and also often with PSCyt1; and Group III contains PSCyt3, which exclusively pairs with PSD4 and also often with PSD2, PSD3 and PSD5 (Figure 7). The pairing between specific PSCyt and PSD is also reflected in their domain occurrence frequencies in these MAGs (Figure S9a). Further, PSCyt2-containing genes are usually next to DUF1501-containing genes, and PSCyt3-containing genes are usually next to DUF1552-containing genes (Figure S9b). Such conserved domain architectures and gene organizations, as well as their high occurrence frequencies in some of the Verrucomicrobia MAGs, are intriguing, yet nothing is known about their functions.

Some of the PSCyt-containing genes also contain additional domains besides PSCyt and PSD. Most of these additional domains can be classified into two categories: one involved in protein-protein interactions (PPI) and the other involved in carbohydrate binding (CBM, carbohydrate-binding modules) (Figure 7), similar to previous findings in a number of PVC genomes by Kamneva et al (2012).
These authors suggested that PPI domains in these genes were responsible for protein complex assembly or substrate recognition, and that, given the presence of CBM domains, cytochromes encoded by these genes likely transfer electrons to acceptors (possibly proteins and sugars) (Kamneva et al 2012). The presence of CBM domains in redox-active proteins is indeed interesting. For example, both CBM1 and cytochrome b562 (another redox-active protein domain) are components of cellobiose dehydrogenase (CDH) in the white-rot fungus Phanerochaete chrysosporium (Yoshida et al 2005) and sugar dehydrogenase (SDH) in the mushroom Coprinopsis cinerea (Matsumura et al 2014). Therefore, it is plausible that some of the PSCyt-containing genes, especially the ones with CBMs, are involved in carbohydrate degradation.

REFERENCES
Bendall ML, Stevens SLR, Chan L-K, Malfatti S, Schwientek P, Tremblay J et al (2016). Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations. The ISME journal.
Haukka K, Kolmonen E, Hyder R, Hietala J, Vakkilainen K, Kairesalo T et al (2006). Effect of nutrient loading on bacterioplankton community composition in lake mesocosms. Microbial ecology 51: 137-146.
Kamneva OK, Knight SJ, Liberles DA, Ward NL (2012). Analysis of genome content evolution in PVC bacterial super-phylum: assessment of candidate genes associated with cellular organization and lifestyle. Genome biology and evolution 4: 1375-1390.
Lindström ES, Vrede K, Leskinen E (2004). Response of a member of the Verrucomicrobia, among the dominating bacteria in a hypolimnion, to increased phosphorus availability. Journal of Plankton Research 26: 241-246.
Matsumura H, Umezawa K, Takeda K, Sugimoto N, Ishida T, Samejima M et al (2014). Discovery of a Eukaryotic Pyrroloquinoline Quinone-Dependent Oxidoreductase Belonging to a New Auxiliary Activity Family in the Database of Carbohydrate-Active Enzymes. PloS one 9: e104851.
Paver SF, Kent AD (2010). Temporal patterns in glycolate-utilizing bacterial community composition correlate with phytoplankton population dynamics in humic lakes. Microbial ecology 60: 406-418.
Pellicer MT, Badia J, Aguilar J, Baldoma L (1996). glc locus of Escherichia coli: characterization of genes encoding the subunits of glycolate oxidase and the glc regulator protein. Journal of bacteriology 178: 2051-2059.
Pereira MM, Refojo PN, Hreggvidsson GO, Hjorleifsdottir S, Teixeira M (2007). The alternative complex III from Rhodothermus marinus - a prototype of a new family of quinol:electron acceptor oxidoreductases. FEBS letters 581: 4831-4835.
Sangwan P, Chen X, Hugenholtz P, Janssen PH (2004). Chthoniobacter flavus gen. nov., sp. nov., the first pure-culture representative of subdivision two, Spartobacteria classis nov., of the phylum Verrucomicrobia. Applied and environmental microbiology 70: 5875-5881.
Schneider K, Peyraud R, Kiefer P, Christen P, Delmotte N, Massou S et al (2012). The ethylmalonyl-CoA pathway is used in place of the glyoxylate cycle by Methylobacterium extorquens AM1 during growth on acetate. The Journal of biological chemistry 287: 757-766.
Studholme DJ, Fuerst JA, Bateman A (2004). Novel protein domains and motifs in the marine planctomycete Rhodopirellula baltica. FEMS microbiology letters 236: 333-340.
Yoshida M, Igarashi K, Wada M, Kaneko S, Suzuki N, Matsumura H et al (2005). Characterization of carbohydrate-binding cytochrome b562 from the white-rot fungus Phanerochaete chrysosporium. Applied and environmental microbiology 71: 4548-4555.

SUPPLEMENTARY FIGURE LEGENDS

Figure S1. A tiled display of an emergent self-organizing map (ESOM) based on the tetranucleotide frequency (TNF) of the 19 Verrucomicrobia MAGs. TNF was calculated with a window size of 5 kbp, with each dot on the ESOM representing a 5-kbp fragment (or a contig if its length is shorter than 5 kbp). Dots (i.e. fragments) are colored according to MAGs. A numeric ID is assigned to each MAG; IDs from Mendota are labeled in black and IDs from Trout Bog are labeled in white. A red outline was drawn to indicate the clustering of MAGs from Mendota on the ESOM.

Figure S2. Counts of GH genes among the 78 different GH families present in MAGs.

Figure S3. Heat map based on GH abundance profile patterns showing the clustering of MAGs by different lakes.

Figure S4. Counts of carbohydrate and amino acid transporter genes.

Figure S5. Comparison of glycolate oxidase gene operons in E. coli, C. flavus and TE4605.

Figure S6. Nitrogen (N) and carbon (C) utilization in the proteome and genome. N and C utilization in the proteome is indicated by the number of N and C atoms per amino-acid residue side chain (ARSC) respectively, and N utilization in the genome is indicated by genome GC content. (a and b) Quantile plots showing the number of N and C atoms per ARSC calculated from all predicted proteins in the 19 MAGs. Plots were generated according to Grzymski and Dussaq (2012). (c and d) The median number of N and C atoms per ARSC for the 19 MAGs ranked by the median number. (e) Plot showing that the median number of N per ARSC and the median number of C per ARSC are negatively correlated (r = 0.83). (f) Plot showing that the median number of N per ARSC is positively correlated with genome GC content. In all plots, genomes and proteomes from ME are in red, and those from TE and TH are in blue. The three proteomes/genomes enclosed by the dashed circle (TE1800, TH2519 and TH4093) have extremely low N- but high C-contents.

Figure S7. Summary of important metabolic genes and pathways.

Figure S8. Occurrence and gene organization of Planctomycete-specific domains, DUF1501, and DUF1552. (a) Counts of PSCyt, PSD, DUF1501, and DUF1552 domains in the MAGs. (b) Clustering of DUF1501- and PSCyt2-containing genes, and clustering of DUF1552- and PSCyt3-containing genes in the genome.
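
As an illustration of the ARSC metric described in the Figure S6 legend, the following Python sketch counts N and C atoms per amino-acid residue side chain for a protein and takes the median over a set of proteins. It is a sketch under stated assumptions, not the procedure of Grzymski and Dussaq (2012) or of this study: the function names are hypothetical, the side-chain atom counts are the standard values for the 20 common amino acids, and skipping non-standard or ambiguous residue codes is a choice made here.

```python
from statistics import median
from typing import Iterable, Optional, Tuple

# Nitrogen atoms in each side chain (residues not listed have none).
SIDE_CHAIN_N = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}

# Carbon atoms in each side chain of the 20 standard amino acids.
SIDE_CHAIN_C = {
    "A": 1, "R": 4, "N": 2, "D": 2, "C": 1, "Q": 3, "E": 3, "G": 0,
    "H": 4, "I": 4, "L": 4, "K": 4, "M": 3, "F": 7, "P": 3, "S": 1,
    "T": 2, "W": 9, "Y": 7, "V": 3,
}


def arsc(protein: str) -> Optional[Tuple[float, float]]:
    """Return (N per ARSC, C per ARSC) for one protein sequence.

    Assumption: characters outside the 20 standard residues (e.g. X, *)
    are ignored. Returns None if no standard residue remains.
    """
    residues = [aa for aa in protein.upper() if aa in SIDE_CHAIN_C]
    if not residues:
        return None
    n_atoms = sum(SIDE_CHAIN_N.get(aa, 0) for aa in residues)
    c_atoms = sum(SIDE_CHAIN_C[aa] for aa in residues)
    return n_atoms / len(residues), c_atoms / len(residues)


def median_arsc(proteins: Iterable[str]) -> Tuple[float, float]:
    """Median N-ARSC and median C-ARSC over all proteins of a proteome."""
    values = [v for v in (arsc(p) for p in proteins) if v is not None]
    if not values:
        raise ValueError("no usable protein sequences")
    return median(n for n, _ in values), median(c for _, c in values)


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(arsc("RK"))
    print(median_arsc(["RK", "GA"]))
```

Applied to all predicted proteins of a MAG, `median_arsc` would give one pair of values per MAG of the kind ranked in panels (c) and (d).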