LeBIBI PPF – step by step

RUN #1

Let’s start with an SSU rDNA sequence in FASTA format:

>unkown_sequence
GTCTTCGGACTTAGCGGCGGACGGGTGAGTAACGCGTGGGAACGTGCCCTTTGCTTCGGAATAGCCCCGG
GAAACTGGGAGTAATACCGAATGTGCCCTTTGGGGGAAAGATTTATCGGCAAAGGATCGGCCCGCGTTGG
ATTAGGTAGTTGGTGGGGTAATGGCCTACCAAGCCGACGATCCATAGCTGGTTTGAGAGGATGATCAGCC
ACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATCTTAGACAATGGGCGCA
AGCCTGATCTAGCCATGCCGCGTGATCGATGAAGGCCTTAGGGTTGTAAAGATCTTTCAGGTGGGAAGAT
AATGACGGTACCACCAGAAGAAGCCCCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGGGCTA
GCGTTATTCGGAATTACTGGGCGTAAAGCGCACGTAGGCGGATCGGAAAGTCAGAGGTGAAATCCCAGGG
CTCAACCCTGGAACTGCCTTTGAAACTCCCGATCTTGAGGTCGAGAGAGGTGAGTGGAATTCCGAGTGTA
GAGGTGAAATTCGTAGATATTCGGAGGAACACCAGTGGCGAAGGCGGCTCACTGGCTCGATACTGACGCT
GAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGCC
AGTCGTCGGGCAGCATGCTGTTCGGTGACACACCTAACGGATTAAGCATTCCGCCTGGGGAGTACGGCCG
CAAGGTTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCA
ACGCGCAGAACCTTACCAACCCTTGACATGGCGATCGCGGTTCCAGAGATGGTTCCTTCAGTTCGGCTGG
ATCGCACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTCGGTTAAGTCCGGCAACGAG
CGCAACCCACGTCCTTAGTTGCCAGCATTCAGTTGGGCACTCTAGGGAAACTGCCGGTGATAAGCCGGAG
GAAGGTGTGGATGACGTCAAGTCCTCATGGCCCTTACGGGTTGGGCTACACACGTGCTACAATGGCAGTG
ACAATGGGTTAATCCCAAAAAGCTGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGA
ATCGCTAGTAATCGCGTAACAGCATGACGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCAC
ACCATGGGAATTGGTTCTACCCGAAGGCGGTGCGCCAACCTCGCAAGAGGAGGCAGCCGACCACGGTAGG

The first step of the analysis is to get some idea of its identity.

1 Query building
Paste the sequence in the query window and set the maximum number of hits to 10 for a BLAST search of the PPFDB_SSU_rDNA-16S_superstringent database (a database holding one SSU_rDNA-16S sequence per type species). Increasing the maximum number of BLAST hits to be retained may be necessary when the input sequences are highly divergent.

2 Settings and analysis parameters
You may give a specific name to the analysis.
In the ‘Settings’ part of the window, replace ‘Query’ with a suitable name in ‘User-given Id.’. The length of the sequence names can be set to any value from 10 to 200 characters (default = 30).
Keep the default settings of the post-BLAST processing: you are analyzing a single sequence for which you want a global phylogenetic tree of all the sequences (unknown + 10 closest BLAST hits). When inputting several sequences with BLAST retrievals for all of them, you can build as many trees as there are input sequences, each tree corresponding to a single sequence.
The default settings of the sequence alignment algorithm, the sequence trimming algorithm and the phylogeny program may be modified. For the latter, if you do not wish to use the default substitution model, un-click the option in the ‘Phylogeny’ part of the window.
Once the settings are correct, click on the red ‘Proceed to the next PPF step’ button.

3 Verification of the run settings
A summary of the analysis settings and parameters appears in another window that opens automatically. If you opted to do so at the previous step, you can select the substitution model for the phylogeny reconstruction from a list at this stage (lowest part of the window).
When everything is OK, click on the red ‘Run’ button to launch the sequence analysis.

4 Results window
A separate result window opens automatically. The top of the window lists the run settings; below, the results are presented in a double frame. The left frame displays the log of the run, with the most recent information appearing at the top. The run is complete when three clickable buttons appear in this frame and the right frame shows the resulting phylogenetic tree.
You may view the run log in a separate window by clicking on the link ‘See the actual page here’ above the frame. The result of the BLAST run is accessible by clicking on the ‘BLAST results’ link in the log.
All generated files are directly accessible by clicking on the link ‘All files’ above the log frame.
[To add: description of files when list is final]

5 Tree visualization
The tree (SVG format) can be viewed in a separate window by clicking on the leftmost yellow button, ‘View the svg tree’. The ‘unknown sequence’ is highlighted in red.

6 Visualization/Download of the tree in PDF format
Clicking on the rightmost yellow button, ‘View the pdf tree’, gives you access to the tree in PDF format.

[Figure: tree in PDF format. Scale bar: 0.01. Leaf labels:
Haematobacter_massiliensis~v~TT~AF452106
Haematobacter_massiliensis~v~TT~DQ342309
Haematobacter_missouriensis~v~TT~DQ342315
Frigidibacter_albus~v~TT~KF944301
Rhodobacter_sphaeroides~v~TT~D16425
Rhodobacter_sphaeroides~v~TT~X53853
QRY_QRY_unkown_sequence
Rhodobacter_sphaeroides~v~TT~CP000143
Rhodobacter_johrii~v~TT~AM398152
Rhodobacter_megalophilus~v~TT~AM421024
Rhodobacter_azotoformans~v~TT~AB607332]

7 Tree editing
Clicking on the middle red button, ‘Edit the svg tree’, opens a tree editing interface. The tree appears in a dedicated frame with, on the right, a control panel showing the available modification options.
To modify the appearance of the tree, select the option(s) in the control sections ‘Shape’, ‘Outgroups’, ‘Branch support’, ‘Highlighting’ and ‘Scale’, then click on the red ‘modify’ button at the top of the control panel.

Scale: The tree width and length can be rescaled, as well as the font size.

Tree shape: By default, a squared representation is used, in which branch support is shown as branch width. Keep in mind that the tree is always **unrooted** until a root is selected. The circular representation does not show the branch supports.

Outgroups: Up to two outgroup sequences can be selected from the complete list of sequences.

Branch support: Branch width is used by default. Only SH branch support values above a selected threshold (0.7 to 0.95) are displayed. The branch width is calculated relative to a selected maximum support value (0.7 to 1.0).
The branch support display can be changed to numerical support values or removed altogether.

Highlighting: By default, the queried sequences are singled out in red, whilst all other, non-highlighted sequences appear in black. The sequences can be color-highlighted according to their taxonomy at the species, genus, family, order or class level. This may be used to identify, at any rank, the lineage to which the queried sequences belong.
- Species: The example Query sequence belongs to the Rhodobacter sphaeroides species
- Genus: The example Query sequence belongs to the Rhodobacter genus
- Family: The example Query sequence belongs to the Rhodobacteraceae family
- Free choice: This option allows you to highlight any sequence whose name contains a string of characters of your choosing.
- Clear: Ticking the ‘Clear’ option and clicking on ‘Modify’ will get you back to a tree devoid of any tag.
The modified tree can be viewed in a separate window in SVG format for all modification options, and in PDF format for the coloring options.

RUN #2

The unknown sequence is likely a member of the Rhodobacter sphaeroides species within the Rhodobacteraceae (Rhodobacterales, Alphaproteobacteria). The next step is to ascertain its position among all the sequences available for this species and within the Rhodobacterales. Including the Rhodobacterales will root the Rhodobacter sphaeroides species lineage.
In the Supplemental Query frame, write the following commands:

#Rhodobacter_sphaeroides @stringent %notag
#Rhodobacterales @genuslevel %notag

The first command will select all Rhodobacter sphaeroides SSU rDNA-16S sequences that are currently available in the database. The second command will select representative sequences for all Rhodobacterales genera.
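To make the structure of these commands easier to see, here is a minimal Python sketch that splits a Supplemental Query line into its three fields. It is not part of LeBIBI PPF. The field names (taxon, level, tag) and the rule that each line carries exactly one '#', one '@' and one '%' token are assumptions inferred from the two example commands above, not a documented grammar.

```python
from typing import NamedTuple


class SupplementalQuery(NamedTuple):
    """One command line; the field names are illustrative assumptions."""
    taxon: str  # token after '#', e.g. Rhodobacter_sphaeroides
    level: str  # token after '@', e.g. stringent or genuslevel
    tag: str    # token after '%', e.g. notag


def parse_supplemental_query(line: str) -> SupplementalQuery:
    """Split a command into its '#', '@' and '%' tokens (assumed syntax)."""
    fields = {"#": None, "@": None, "%": None}
    for token in line.split():
        prefix, value = token[0], token[1:]
        if prefix not in fields or not value:
            raise ValueError(f"unexpected token: {token!r}")
        if fields[prefix] is not None:
            raise ValueError(f"duplicate {prefix!r} token in: {line!r}")
        fields[prefix] = value
    missing = [prefix for prefix, value in fields.items() if value is None]
    if missing:
        raise ValueError(f"missing token(s) {missing} in: {line!r}")
    return SupplementalQuery(fields["#"], fields["@"], fields["%"])


if __name__ == "__main__":
    commands = [
        "#Rhodobacter_sphaeroides @stringent %notag",
        "#Rhodobacterales @genuslevel %notag",
    ]
    for command in commands:
        print(parse_supplemental_query(command))
```

Such a check can help catch a missing or mistyped token before a set of commands is pasted into the Supplemental Query frame.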
[Screenshots: Query; Run summary; Run log]

[Figure: Circular tree. Labels: Unknown sequence; Rhodobacter genus representative sequence]

The unknown sequence (red) groups with the Rhodobacter genus representative sequence (yellow) and with the majority of the Rhodobacter sphaeroides species sequences (pale yellow), among which are sequences for the type strain. Note that several sequences for Rhodobacter sphaeroides (black arrows) do not group with the Rhodobacter genus representative sequence. They were likely misnamed.

RUN #3

Alternatively, the position of the unknown sequence can be explored within the Rhodobacter sphaeroides species as previously (first command line in the Supplemental Query frame), and among Rhodobacterales species (second command line in the Supplemental Query frame) instead of Rhodobacterales genera:

#Rhodobacter_sphaeroides @stringent %notag
#Rhodobacterales @superstringent %notag

[Screenshots: Query; Run summary; Run log]

A larger number of sequences is included in the analysis: 660 Rhodobacterales type species, compared with the 260 Rhodobacterales genus representatives in the previous analysis. The run will thus last a little longer (about 5 min instead of 1 min for the previous run).

[Figure: Family-colored tree. Labels: RHODOBACTERACEAE; Unknown sequence; HYPHOMONADACEAE]

The Rhodobacterales are separated into the Rhodobacteraceae and the Hyphomonadaceae. The latter family can thus be used as an outgroup to root the Rhodobacteraceae, to which the unknown sequence belongs.

[Figure: Species-colored tree. Labels: Unknown sequence; HYPHOMONADACEAE]

Zooming in on the part of the tree that contains the query sequence, the affiliation to the Rhodobacter sphaeroides species is confirmed by the co-occurrence of sequences for the type strain of this species. Note, however, that representative sequences for the type strains of two other Rhodobacter species (R. megalophilus and R. johrii) branch among the Rhodobacter sphaeroides strain sequences.
Conversely, some Rhodobacter sphaeroides strain sequences group with the Rhodobacter azotoformans type strain (top of the figure below). Finally, the presence of a long branch may deserve further investigation, starting with a check of the sequence itself.

[Figure: Zoomed species-colored tree. Labels: Rhodobacter azotoformans T; Unknown sequence; Rhodobacter sphaeroides T; Rhodobacter megalophilus T; Rhodobacter sphaeroides T; Rhodobacter sphaeroides T; Rhodobacter sphaeroides T; Rhodobacter sphaeroides T; Rhodobacter johrii T. Annotation: Long branch: possible problem with the sequence]
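A first check of a suspicious sequence can be done locally before going back to the sequence record. The following Python sketch is one possible way to do it and is not a LeBIBI PPF feature: it reads FASTA text and reports the sequence length, the number of characters other than A, C, G and T (ambiguity codes, gaps or stray symbols), and the GC fraction. The function names read_fasta and summarize are hypothetical, and the choice of these three indicators is an assumption about what is worth looking at first.

```python
from collections import Counter


def read_fasta(text):
    """Return {header: sequence} parsed from FASTA-formatted text."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before any '>' header line")
        else:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def summarize(seq):
    """Report length, count of non-ACGT characters, and GC fraction."""
    counts = Counter(seq)
    unambiguous = sum(counts[base] for base in "ACGT")
    gc = counts["G"] + counts["C"]
    return {
        "length": len(seq),
        "ambiguous": len(seq) - unambiguous,
        "gc_fraction": gc / unambiguous if unambiguous else 0.0,
    }


if __name__ == "__main__":
    # Only the first line of the tutorial sequence, for illustration.
    example = """>unkown_sequence
GTCTTCGGACTTAGCGGCGGACGGGTGAGTAACGCGTGGGAACGTGCCCTTTGCTTCGGAATAGCCCCGG
"""
    for name, seq in read_fasta(example).items():
        print(name, summarize(seq))
```

An unusually short sequence or a high count of ambiguous characters in the long-branch sequence would be a reason to inspect its database record more closely; a clean summary does not rule out other problems such as chimeras or misassembly.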