Supplementary Figures
[Heatmap with color key and histogram (AUC). Rows: si.S, arh.S, klas.S, pac.S. Columns: the 16 scenarios, H.10.60.5 through L.30.100.15.]
Figure 1: Score-based AUC for all scenarios. Asterisks indicate the highest
value per scenario. Column names encode scenarios in the order
expression.exons.percent.samples; thus H.10.100.5 describes the scenario with high
expression, 10 exons per gene, 100 percent spliced samples in the respective
group and 5 versus 15 samples per group.
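
The column-name encoding described in the caption can be unpacked mechanically. The following is a minimal sketch, not part of the original analysis: the names `Scenario` and `parse_scenario` are hypothetical, and reading `L` as "low" expression is an assumption (the caption only spells out `H` as high).

```python
from typing import NamedTuple


class Scenario(NamedTuple):
    """One simulation scenario, decoded from a column name (hypothetical helper)."""
    expression: str  # "high" or "low" ("low" for L is an assumption)
    exons: int       # exons per gene
    percent: int     # percent spliced samples in the respective group
    samples: int     # sample-number field as given in the column name


def parse_scenario(name: str) -> Scenario:
    # Field order follows the caption: expression.exons.percent.samples
    expr, exons, percent, samples = name.split(".")
    levels = {"H": "high", "L": "low"}
    return Scenario(levels[expr], int(exons), int(percent), int(samples))


print(parse_scenario("H.10.100.5"))
```

The same function would need its field order swapped for column names written in the order expression, exons, samples, percent.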
Figure 2: Classification by method for all simulated genes. The upper half of
the genes contain differential splicing (DS) events, while the lower half
serves as a control. Correctly classified genes are shown in blue and incorrect
predictions in yellow. Genes in the first half of the DS set contain one DS
exon; those in the second half contain two. In the scenarios with a different
sample number per group and fewer than 100 percent DS samples, the group
containing the DS event is switched for half of the genes. Panels, from top
left to bottom right: ARH, KLAS, SI, MIDAS, ANOSVA, PAC, MADS’,
SplicingCompass. Column names encode scenarios in the order
expression_exons_samples_percent; thus H_10_5_100 describes the scenario with high
expression, 10 exons per gene, 5 versus 15 samples per group and 100 percent
spliced samples in the respective group.
[Eight classification panels, one per method. Columns in each panel: the 16 scenarios, H_10_5_60 through L_30_15_100.]
[Heatmap with color key and histogram. Rows: pcnt, expr:pcnt, enum:pcnt, expr:enum, enum, enum:snum, expr:snum, snum, expr, snum:pcnt. Columns: arh.S, si.S, klas.S, pac.S.]
Figure 3: Heatmap of ANOVA-based p-values. Asterisks indicate significant
values (*** < 0.001, ** < 0.01, * < 0.1). Analysis of variance reveals the
influence of the parameters, as well as of parameter combinations, on
performance. AUC from score-based evaluation (right). enum = number
of exons, snum = number of samples, pcnt = percentage of DS samples in one
condition, expr = expression intensity.
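
As an illustration of the asterisk coding stated in the caption, the thresholds can be written as a small lookup. This is a sketch only: the function name `significance_stars` is hypothetical, and the cut-offs are copied from the caption as printed.

```python
def significance_stars(p: float) -> str:
    """Map a p-value to the asterisk code given in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.1:
        return "*"
    return ""  # not significant under the caption's thresholds


# Hypothetical p-values, for illustration only
for p in (0.0005, 0.005, 0.05, 0.5):
    print(p, repr(significance_stars(p)))
```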
[Two heatmaps, each with color key and histogram: Sens and Spec. Rows: arh, klas, si, mads', midas, anosva, spComp, pac. Columns: the 16 scenarios, H.10.60.5 through L.30.100.15.]
Figure 4: Supplementary Figure. Sensitivity (left) and specificity (right)
for all scenarios (p-value-based evaluation). Asterisks indicate the highest
value per scenario. Column names encode scenarios in the order
expression.exons.percent.samples; thus H.10.100.5 describes the scenario with high
expression, 10 exons per gene, 100 percent spliced samples in the respective
group and 5 versus 15 samples per group.
[Two bar charts of the number of predicted genes per method. Methods: firma, scomp, anosva, pac, klas, mads', si, midas, arh.]
Figure 5: Number of genes predicted as DS per method for the colon
cancer data set (left) and the lung cancer data set (right).
[Two bar charts per method (firma, scomp, pac, anosva, mads', klas, arh, si, midas): Sens and Spec; number correctly classified (# TP, # TN).]
Figure 6: Sensitivity and specificity for RT-PCR validated DS events in the
colon cancer data set (left) and the number of RT-PCR validated DS events for
every method (right).
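
The relation between the counts (# TP, # TN) and the rates (Sens, Spec) shown in this figure can be sketched with the standard definitions. The supplement does not state its formulas, so the definitions below are the textbook ones and are an assumption here; the counts are hypothetical and not taken from the data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of validated DS events that a method predicts as DS."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of validated non-DS events that a method predicts as non-DS."""
    return tn / (tn + fp)


# Hypothetical counts for one method, for illustration only
tp, fn, tn, fp = 8, 2, 6, 4
print(f"Sens = {sensitivity(tp, fn):.2f}, Spec = {specificity(tn, fp):.2f}")
```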
[Bar charts per method (firma, scomp, anosva, pac, mads', klas, si, midas, arh): Sens and Spec; number correctly classified (# TP, # TN).]
Figure 7: Sensitivity and specificity for validated DS events in the lung cancer
data set.
[Heatmap with color key and histogram. Methods: scomp, midas, pac, si, firma, klas, anosva, arh.]
Figure 8: Heatmap of the predicted DS events in the colon cancer data set.
Note that MADS’ was excluded for computational reasons due to its high
number of predictions. Columns and rows are hierarchically clustered.