NIPs are reliable phylogenetic markers

Motivation
Outline
Near intron positions are reliable phylogenetic markers:
An application to holometabolous insects
Veiko Krauss, Christian Thümmler, Franziska Georgi,
Jörg Lehmann, Peter F. Stadler and Carina Eisenhardt
Bioinformatics Group, Department of Computer Science, University of Leipzig
Bled, Slovenia, Feb 18 2008
J. Lehmann
NIPs are reliable phylogenetic markers
1/29
What are introns?
Phylogeny?
from: Charles Darwin, “On the Origin of Species” (1859)
Outline
1. Short introduction to phylogenetics/cladistics
2. A novel marker class: near intron positions (NIPs)
3. Application to holometabolous insect orders
4. Discussion and outlook
Molecular phylogenetics & cladistics
hierarchical classification of species based on evolutionary ancestry, using molecular data
input data: a set of species, a set of characters, and the corresponding character state for each species
final result: a cladogram
inference methods: parsimony, maximum likelihood, Bayesian approaches
Some definitions (cladistics)
plesiomorphy: character state already present before the last common ancestor (LCA) of a species group; ancestral (“original”)
synapomorphy: derived character state that first appeared in the LCA of a species group and is shared by its members; shared derived (“changed”)
homoplasy: character state shared by multiple species that is not due to common ancestry
→ to be avoided in cladistic analysis
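Homoplasy can be made concrete with parsimony: a binary character that needs more than one state change on a given tree is homoplastic on that tree. The following is a minimal sketch of Fitch's small-parsimony count. The nested-tuple tree encoding, the taxa A to D and both characters are hypothetical illustrations, not data from the study.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes) for a subtree.

    tree   -- a leaf name (str) or a pair (left, right) of subtrees
    states -- dict mapping each leaf name to its character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Both subtrees agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Disjoint state sets: one additional change on this branch pair.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical four-taxon tree ((A,B),(C,D)) and two toy binary characters.
toy_tree = (("A", "B"), ("C", "D"))
clean_char = {"A": 1, "B": 1, "C": 0, "D": 0}   # fits the tree with one change
noisy_char = {"A": 1, "B": 0, "C": 1, "D": 0}   # needs two changes: homoplasy

clean_changes = fitch(toy_tree, clean_char)[1]
noisy_changes = fitch(toy_tree, noisy_char)[1]
print(clean_changes, noisy_changes)
```

A character like `clean_char` behaves as a synapomorphy of the group (A, B) on this tree, whereas `noisy_char` can only be explained by convergence or reversal.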
Problems with molecular sequence comparisons
The evolutionary (eukaryotic) tree remains partly unresolved:
only a few synapomorphies among higher taxa in monophyletic groups
many cases of homoplasy (character convergence or reversal)
Genome-level characters
Higher-order features (genome-level characters)
mainly used to supplement molecular studies (to be weighted relative to morphological and DNA sequence data)
character distributions allow different (ordinal) tree hypotheses to be evaluated by counting synapomorphies
Some examples
mitochondrial gene arrangements
retrotransposon markers (transposon insertion analysis, e.g. for mammalian introns)
intron markers (variation in spliceosomal intron positions)
NIP character - synapomorphic distribution
Holometabolous insects (Endopterygota)
Main characteristics
complete metamorphosis (holometabolism)
insects with distinctive larval, pupal, and adult stages
major orders:
Coleoptera (beetles, e.g. Tribolium)
Diptera (flies, e.g. Drosophila)
Hymenoptera (ants, bees, sawflies, wasps; e.g. Apis)
Lepidoptera (butterflies, moths; e.g. Bombyx)
Relatedness of major groups
widely accepted hypothesis: (Coleoptera (Hymenoptera (Diptera+Lepidoptera)))
NIP-based analysis supports: (Hymenoptera (Coleoptera (Diptera+Lepidoptera)))
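The two hypotheses differ in a single clade, which is what a synapomorphy count has to discriminate. The sketch below parses the bracket notation used on this slide into clade sets and counts how many characters match a clade of each hypothesis. The parser, the function names and in particular the four `toy_characters` are hypothetical illustrations; they are not the NIP data of the study.

```python
import re


def parse(text):
    """Parse bracket notation such as '(A (B (C+D)))' into nested tuples."""
    tokens = re.findall(r"[()+]|[A-Za-z]+", text)
    pos = 0

    def subtree():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            children = []
            while tokens[pos] != ")":
                if tokens[pos] == "+":      # '+' only separates sister taxa
                    pos += 1
                    continue
                children.append(subtree())
            pos += 1                        # skip ')'
            return tuple(children)
        name = tokens[pos]
        pos += 1
        return name

    return subtree()


def leaf_sets(tree, out):
    """Return the leaves below `tree`; add every internal node's leaf set to `out`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def clades(text):
    out = set()
    leaf_sets(parse(text), out)
    return out


def support(hypothesis_clades, characters):
    """Count characters whose carriers are exactly one clade of the hypothesis."""
    return sum(frozenset(c) in hypothesis_clades for c in characters)


accepted = clades("(Coleoptera (Hymenoptera (Diptera+Lepidoptera)))")
nip_based = clades("(Hymenoptera (Coleoptera (Diptera+Lepidoptera)))")
only_accepted = accepted - nip_based
only_nip = nip_based - accepted

# Made-up derived characters, each given as the set of orders carrying it.
toy_characters = [
    {"Coleoptera", "Diptera", "Lepidoptera"},
    {"Diptera", "Lepidoptera"},
    {"Hymenoptera", "Diptera", "Lepidoptera"},
    {"Coleoptera", "Diptera", "Lepidoptera"},
]
print(support(accepted, toy_characters), support(nip_based, toy_characters))
```

Only characters carried by exactly (Hymenoptera, Diptera, Lepidoptera) or exactly (Coleoptera, Diptera, Lepidoptera) can separate the two hypotheses; a character shared by Diptera and Lepidoptera alone supports both equally.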
Dataset compilation
Example alignment and cladogram
A: alignment around the intron positions 331-0, 335-0 and 339-0 (first, second and third intron block; lowercase: intron sequence, //: intron interior omitted)
Aedes               TACAACCAA------------------------ATGGACTCCGGA------------------------GAGGATGCGCAG------------------------GAGGAATTC
Anopheles           TACAATCAG------------------------ATGGATTCGGGC------------------------GAGGACGCGCAG------------------------GAAGAGTTC
Drosophila          TACAATCAG------------------------ATGGACAGCGGC------------------------GAGGATGCCCAA------------------------GAGGAGTTC
Bombyx              TACAACCAA------------------------ATGGAATGTGGT------------------------GAAGATGCTCAGgtgattaa//ttatattttttcagGAGGAATTC
Tribolium           TACAACCAA------------------------ATCGACAGCGGC------------------------GAAGACGCCCAGgtccgctt//aatcatttttacagGAAGAGTTC
Nasonia             TACAATCAGgtacatag//aggcttattttcagATTGAATCAGGA------------------------GAAGATGCTCAA------------------------GAGGAATTT
Apis                TACAATCAGgtacaata//aggcttctttccagATAGAATCAGGA------------------------GAAGATGCACAT------------------------GAAGAATTT
Acyrthosiphon       TATAATCAGgtaaagtc//ttgaatttctttagATTGATGCCGGT------------------------GACGATATTAAG------------------------GAAGAATTT
Pediculus           TATAATCAA------------------------ATCGATTCAGGA------------------------GAGGATGCACAA------------------------GAGGAATTT
Daphnia             TACAATCAAgtaatttt//tatgcaattgatagATTGATTCTGGA------------------------GAAGACGCACAA------------------------GAAGAGTTT
Schistosoma         TATAATCGC------------------------AGTGATGATGTAgtaagtag//tacctaccttatagGAGGATATTTTA------------------------GAGGAATCA
Strongylocentrotus  TACAATCAA------------------------CAGGATCTG---gtaagttt//ttctgtccccgtagGAAGAGGCCCAG------------------------GAAGAGTTT
Ciona               TATAATGAC------------------------ATGGAGTCGGTGgtaagttc//tatataaattgcagGAAGACGCACAA------------------------GAGGAGTTT
Gallus              TACAATCAG------------------------ATGGATTCCACTgtaagtct//gatttcctttacagGAAGATGCGCAG------------------------GAGGAATTT
Danio               TACAACCAG------------------------CTTGATCAG---gtatcttg//tacatctttggcagGAGGAAGCCCAA------------------------GAGGAGTCA
Nematostella        TACAACCAG------------------------ATGGACTCTGGGgtaagggt//ctttctatgaacagGAGGATGCACAG------------------------GAAGAGTTT
AA consensus: Y N Q | M/I D S G | E D A Q | E E F ("|" marks the intron positions 331-0, 335-0 and 339-0)
B: cladogram of the 16 species (figure not reproduced); leaf order: Nematostella, Danio, Gallus, Ciona, Strongylocentrotus, Schistosoma, Daphnia, Pediculus, Acyrthosiphon, Apis, Nasonia, Tribolium, Bombyx, Drosophila, Anopheles, Aedes
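The intron labels of panel A can be recovered mechanically from such an alignment. The sketch below treats every column in which at least one species shows lowercase intron sequence as an intron column, counts the remaining (coding) columns, and converts that count into a "codon-phase" label. It makes several assumptions that are not stated on the slide: introns are abbreviated to `gt//ag`, the label is taken to be the last complete codon before the intron plus the phase, and the window is taken to start at codon 329 of the Apis protein, a value chosen only because it reproduces the labels 331-0, 335-0 and 339-0. The helper names are hypothetical.

```python
GAP = "------"     # intron absent in this species (abbreviated block)
INTRON = "gt//ag"  # intron present, abbreviated from the slide


def block(present):
    return INTRON if present else GAP


def row(e1, i1, e2, i2, e3, i3, e4):
    """Assemble one alignment row from four exon pieces and three intron flags."""
    return e1 + block(i1) + e2 + block(i2) + e3 + block(i3) + e4


# Exon pieces copied from four rows of panel A.
rows = {
    "Drosophila": row("TACAATCAG", False, "ATGGACAGCGGC", False, "GAGGATGCCCAA", False, "GAGGAGTTC"),
    "Bombyx": row("TACAACCAA", False, "ATGGAATGTGGT", False, "GAAGATGCTCAG", True, "GAGGAATTC"),
    "Apis": row("TACAATCAG", True, "ATAGAATCAGGA", False, "GAAGATGCACAT", False, "GAAGAATTT"),
    "Strongylocentrotus": row("TACAATCAA", False, "CAGGATCTG---", True, "GAAGAGGCCCAG", False, "GAAGAGTTT"),
}


def intron_labels(rows, first_codon):
    """Map each species to the list of intron labels found in its row."""
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError("alignment rows must have equal length")
    width = lengths.pop()
    # A column is an intron column if any species shows intron sequence there.
    is_intron = [
        any(seq[i].islower() or seq[i] == "/" for seq in rows.values())
        for i in range(width)
    ]
    labels = {}
    for name, seq in rows.items():
        found, coding, inside = [], 0, False
        for col, ch in enumerate(seq):
            if not is_intron[col]:
                coding += 1            # coding column, including coding gaps
                inside = False
            elif ch != "-" and not inside:
                codon, phase = divmod(coding, 3)
                found.append(f"{first_codon + codon - 1}-{phase}")
                inside = True          # report each intron block only once
        labels[name] = found
    return labels


labels = intron_labels(rows, first_codon=329)
print(labels)
```

Counting alignment columns rather than nucleotides matters for rows such as Strongylocentrotus, where a three-column coding gap precedes the intron; the intron is still labelled 335-0.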
Automated re-analysis
Computational steps
extraction of the coding sequence (CDS) for all 118 ortholog sets obtained
multiple alignment of the CDS (codon alignment) for each set
naming of intron positions relative to the Apis protein
NIP extraction: partial codon alignments
Result: all 135 NIPs agree with the manual analysis
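The NIP extraction step can be sketched once intron positions have been named. The code below is an illustration under stated assumptions, not the authors' pipeline: a NIP is taken to be a pair of intron positions that lie at most `max_dist` nucleotides apart in the coding sequence and never occur together in one species, and the default of 50 nt is an assumed threshold. The intron sets are read off the example alignment of this talk; the function names are hypothetical.

```python
from itertools import combinations


def to_nt(label):
    """Convert a 'codon-phase' label such as '331-0' to a nucleotide offset."""
    codon, phase = label.split("-")
    return int(codon) * 3 + int(phase)


def find_nips(introns_by_species, max_dist=50):
    """Return pairs of nearby intron positions that never co-occur in a species."""
    positions = sorted(
        {p for introns in introns_by_species.values() for p in introns},
        key=to_nt,
    )
    nips = []
    for first, second in combinations(positions, 2):
        if to_nt(second) - to_nt(first) > max_dist:
            continue
        together = any(
            first in introns and second in introns
            for introns in introns_by_species.values()
        )
        if not together:
            nips.append((first, second))
    return nips


# Intron positions per species, as shown in the example alignment.
introns_by_species = {
    "Drosophila": set(),
    "Bombyx": {"339-0"},
    "Tribolium": {"339-0"},
    "Nasonia": {"331-0"},
    "Apis": {"331-0"},
    "Gallus": {"335-0"},
}

nips = find_nips(introns_by_species)
print(nips)
```

With a very small `max_dist` no pair qualifies, which makes the threshold the main parameter to justify when applying such a filter to real data.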
Example of a partial alignment (gi66516304)
NIP character - synapomorphic distribution
Apomorphic introns mapped onto the tree
NIP distribution results
Sources of homoplasy in NIP characters
ancient small exons (differential loss of bordering introns)
intron loss and gain in parallel
Outlook
Ongoing work
Extension of the automated analysis to include:
identification of orthologous sequences for metazoan species (tblastn on assembly data)
gene structure annotation (CDS, intron positions)
NIP extraction (incl. codon alignment and intron naming)
extraction of intron (pair) matrices for phylogenetic analyses
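As an illustration of the last step, an intron-pair matrix can be built with one row per species and one column per NIP. The coding used here ("1" for the first intron of the pair, "2" for the second, "0" for neither, "?" if both are present) is a hypothetical scheme chosen for this sketch; the talk does not specify the matrix format. The species data are again taken from the example alignment.

```python
def nip_matrix(introns_by_species, nips):
    """Return {species: state string}, one character per NIP column."""
    matrix = {}
    for species, introns in introns_by_species.items():
        states = []
        for first, second in nips:
            if first in introns and second in introns:
                states.append("?")   # unexpected for a pair of near introns
            elif first in introns:
                states.append("1")
            elif second in introns:
                states.append("2")
            else:
                states.append("0")
        matrix[species] = "".join(states)
    return matrix


nip_columns = [("331-0", "335-0"), ("335-0", "339-0")]
species_introns = {
    "Apis": {"331-0"},
    "Gallus": {"335-0"},
    "Bombyx": {"339-0"},
    "Drosophila": set(),
}

matrix = nip_matrix(species_introns, nip_columns)
for species, states in matrix.items():
    print(f"{species:<12}{states}")
```

A matrix of this kind could then be passed to a parsimony program after conversion to that program's input format.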
Outlook - Proposed research
Further NIP-based hypothesis evaluations:
other large orders of the Holometabola
the competing Ecdysozoa and Coelomata hypotheses (nematodes, chordates, arthropods)
branching pattern of arthropods
Outlook - Proposed research
Additional analyses possible with intron orthology data (beyond NIP data):
classical sequence-based phylogeny reconstruction
reconstruction of phylogenies based on intron structure
mining the pool of orthologous introns for unknown structured non-coding RNAs
Summary
The NIP study demonstrates:
support for a more basal position of the hymenopterans relative to the beetles within the tree of holometabolous insects
the potential general usability of NIPs as a novel class of phylogenetic markers in metazoans/eukaryotes
Evidence for the alternative hypothesis
Acknowledgements
Many thanks to all the authors
Veiko Krauss, Peter F. Stadler, Carina Eisenhardt, Franziska Georgi, Christian Thümmler
Thank YOU!