Biolog Decomposition R Package User Manual

Appendix to the article “Identifying multiple potential metabolic cycles from Biolog experiments”
Mikhail Shubin, Katharina Schaufler, Karsten Tedin, Minna Vehkala, Jukka Corander
1 Requirements
Biolog Decomposition is a package for the R programming language. It is part of
a pipeline for analysing Biolog PM data (PABPM for short) [1], which in turn is built
upon the opm package [2]. PABPM does not require installation; simply download the
code from www.helsinki.fi/bsg/software/R-Biolog. You will need two scripts:
PMworkflow.R
PMvis.R
2 Package content
The Biolog decomposition package can be found at
www.helsinki.fi/bsg/software/Biolog Decomposition. The folder contains the following
files:
◦ Decomposition_Functions.R: script with the decomposition functions;
◦ Example decomposition test1.R and Example decomposition test2.R: scripts
used for a performance analysis of the decomposition functions;
◦ Example run decomposition.R: script analysing the sample dataset;
◦ Example data: the sample dataset;
◦ this manual.
3 Installation
The Biolog decomposition package does not require installation; simply download the
script Decomposition_Functions.R from the package site.
4 Getting started
Load the package by sourcing the scripts:
source("PMworkflow.R")
source("Decomposition_Functions.R")
replacing the file names with the corresponding paths to the scripts on your system.
Data are loaded using opm and PABPM. For help with loading data, see the
PABPM manual or the decomposition example.
5 Data Structures
The decomposition package uses two data structures, the Decomposition List and the Component
List, both based on the R list type.
5.1 Component List
A list representing a single component of a decomposition. It has the following structure:
◦ Component List$model: type of the component: gauss, brick or slope;
◦ Component List$par: named vector with three parameters corresponding to the
model used;
◦ Component List$error: minimal value of the target function achieved while optimizing the component;
◦ Component List$pred: values of the component at each time point (i.e. C_t);
◦ Component List$weigth: sum of the component over all time points, i.e.
Component List$weigth == sum(Component List$pred).
5.2 Decomposition List
A list representing a single decomposition. It has the following structure:
◦ Decomposition List[[1]] ... Decomposition List[[n]]: the components of the decomposition (each a Component List);
◦ Decomposition List$pow: number of components in the decomposition (i.e. n).
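As a rough illustration of the two structures described above, the following sketch builds a Component List and a Decomposition List by hand. The parameter names (height, center, width), the Gaussian formula and all numeric values are assumptions made for this example only; real lists are produced by the fitting functions of the package.

```r
# Hand-built illustration of the two list structures; all values are made up.
time <- 0:95 / 4                      # 24 h, one time point every 15 minutes

# Hypothetical Gaussian-shaped component; the parameter names are assumptions.
par  <- c(height = 20, center = 8, width = 1.5)
pred <- par[["height"]] *
  exp(-(time - par[["center"]])^2 / (2 * par[["width"]]^2))

component <- list(
  model  = "gauss",     # one of "gauss", "brick", "slope"
  par    = par,         # named vector with three parameters
  error  = 0,           # placeholder for the minimal target-function value
  pred   = pred,        # value of the component at each time point
  weigth = sum(pred)    # field name spelled as in this manual
)

# A Decomposition List: components stored by position, their count in $pow.
decomposition <- list(component)
decomposition$pow <- 1

# Access the model type of the first component.
decomposition[[1]]$model
```

Note that length(decomposition) also counts the pow element, so loops over the components should run over seq_len(decomposition$pow).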
6 Main Functions
6.1 MS.decompose()
Decomposes a single PM metabolic signal.
MS.decompose(signal,
time = 0:(length(signal)-1)/4,
smooth = 0.5,
threshold = 15,
corrcoef = 2,
show = TRUE)
Arguments
◦ signal – target signal (vector S_t);
◦ time – time points corresponding to each value of the signal. Measurements are usually
taken every 15 minutes, so by default time = c(0, 0.25, 0.5, ...), in hours;
◦ smooth – smoothing coefficient (parameter b);
◦ threshold – component size threshold (parameter δ);
◦ corrcoef – penalty for correlation between components (parameter γ);
◦ show – if TRUE, every step of the decomposition is visualized.
Output
Returns a Decomposition List.
Example
D = PM.get.grow(EColi_corrected$data[[1]]$data_frame)
signal = D$Growth[ D$Substrate=='D10' ]
decompose = MS.decompose(signal)
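The default value of the time argument can be checked in isolation. The short sketch below evaluates the same expression on a made-up signal and shows how the vector could be built for a different sampling interval; the 10-minute interval is an assumption for illustration, not a setting taken from the package.

```r
# Made-up signal: one value per measurement.
signal <- c(0, 1, 3, 6, 8, 9, 9, 9, 9)

# Same expression as the default argument. In R, ':' binds tighter than '/',
# so this is (0:(n - 1)) / 4: hours at 15-minute steps.
time <- 0:(length(signal) - 1) / 4
time

# Hypothetical alternative: measurements every 10 minutes (6 per hour).
time10 <- 0:(length(signal) - 1) / 6
```

If the measurements were not taken every 15 minutes, pass the matching vector explicitly as the time argument.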
6.2 MS.frame.decompose()
Decomposes a whole Biolog PM plate.
MS.frame.decompose(data,
signalname='Signal',
smooth = 0.5,
threshold = 15,
corrcoef = 2,
show = TRUE)
Arguments
◦ data – list containing a data frame with the raw signal. The list is assumed to
be produced by PABPM and formatted accordingly.
◦ signalname – alias of the raw signal in the data frame. For example, if set to
'BgCorrected', background-corrected signals are used for the decomposition;
if set to 'Normalized', normalized signals are used. The corresponding procedures
must be run in PABPM beforehand. By default signalname='Signal',
meaning that no additional processing is taken into account.
◦ smooth – smoothing coefficient (parameter b);
◦ threshold – component size threshold (parameter δ);
◦ corrcoef – penalty for correlation between components (parameter γ);
◦ show – if TRUE, every step of the decomposition of every substrate is
visualized.
Output
Returns the input list data with the following changes:
◦ data gains an additional element, data$Decomposition, a list with a decomposition for each substrate.
◦ The data frame in data$data_frame gains an additional column,
data$data_frame['Growth'], containing the values of the target signal (vector S_t)
estimated during the decomposition.
Example
decomposition = MS.frame.decompose(EColi_corrected$data[[1]])
decomposition$Decomposition$A11
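Since the result holds one Decomposition List per substrate, it can be summarised with ordinary list tools. The sketch below works on a hand-made mock of the returned list, so the wells, models and weights shown are invented; only the element names (Decomposition, pow, model, weigth) follow this manual.

```r
# Mock of the list returned by MS.frame.decompose(); all contents are invented.
mock <- list(Decomposition = list(
  A01 = list(list(model = "brick", weigth = 30), pow = 1),
  A11 = list(list(model = "gauss", weigth = 120),
             list(model = "slope", weigth = 45), pow = 2)
))

# Number of components found for each substrate.
n.comp <- sapply(mock$Decomposition, function(d) d$pow)

# Model types of the components of a single well.
d <- mock$Decomposition$A11
models <- sapply(d[seq_len(d$pow)], function(comp) comp$model)
```

Indexing with seq_len(d$pow) skips the pow element, which is stored in the same list as the components.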
6.3 MS.similarity()
Measures the similarity between two decompositions.
MS.similarity(A,
B,
d_max=3,
d_size=100,
d_center=30)
Arguments
◦ A, B – two Decomposition Lists to be compared;
◦ d_max – scales the importance of the max summary statistic (parameter δ(max));
◦ d_size – scales the importance of the size summary statistic (parameter δ(size));
◦ d_center – scales the importance of the center summary statistic (parameter δ(center)).
Output
Returns a single value representing a similarity measure between the decompositions
A and B.
Example
# Decompose substrate A08 in each of the nine data sets
decompose = list()
substr = 'A08'
for (i in 1:9){
  D = PM.get.grow(EColi_corrected$data[[i]]$data_frame)
  decompose[[i]] = MS.decompose(D$Growth[ D$Substrate==substr ])
}
# Fill a symmetric 9 x 9 matrix of pairwise similarities
M = matrix(1,9,9)
for (i in 1:8){
  for (j in (i+1):9){
    M[i,j] = MS.similarity(decompose[[i]],decompose[[j]])
    M[j,i] = M[i,j]
  }
}
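The pattern used in the example, filling the upper triangle and mirroring it, can be tried without any Biolog data. In the sketch below, toy.similarity is a hypothetical stand-in and not the measure computed by MS.similarity(); the profiles are made-up vectors.

```r
# Stand-in similarity, NOT the package's measure: 1 / (1 + Euclidean distance).
toy.similarity <- function(a, b) 1 / (1 + sqrt(sum((a - b)^2)))

# Made-up profiles playing the role of decompositions.
profiles <- list(c(0, 0), c(3, 4), c(0, 1), c(3, 4))
n <- length(profiles)

# The diagonal is initialised to 1 (self-similarity), as in the example above.
M <- matrix(1, n, n)
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    M[i, j] <- toy.similarity(profiles[[i]], profiles[[j]])
    M[j, i] <- M[i, j]      # mirror into the lower triangle
  }
}
M
```

Only n(n-1)/2 similarity evaluations are needed this way, which matters when each call compares full decompositions.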
7 Visualization Functions
7.1 draw.decomposition()
Plots a single Decomposition List.
draw.decomposition(Components,
time,
t='overlap',
pow = Components$pow,
col=-1,
col2=-1,
lwd=1)
Arguments
◦ Components – Decomposition List to be visualized.
◦ time – time points corresponding to each value of the signal.
◦ t – visualization method. Can take the following values:
– overlap – components are overlapping;
– sum – components are stacked on top of each other;
– cumsum – cumulative sums of the components are shown.
◦ pow – number of components to be visualized. By default, all components are plotted.
◦ col – fill color for the components. By default, the color depends on the component’s type.
◦ col2 – line color for the components. By default, the color depends on the component’s type.
◦ lwd – graphic parameter, line width.
Example
D = PM.get.grow(EColi_corrected$data[[1]]$data_frame)
signal = D$Growth[ D$Substrate=='D10' ]
time = D$Time[ D$Substrate=='D10' ]
decompose = MS.decompose(signal,time,show=FALSE)
# Overlapping components on top of the target signal
plot(time,signal,t='l')
draw.decomposition(decompose, time)
# Stacked components on top of the target signal
plot(time,signal,t='l')
draw.decomposition(decompose, time, t='sum')
# Cumulative sums on top of the background-corrected raw signal
plot(time,D$BgCorrected[ D$Substrate=='D10' ],t='l')
draw.decomposition(decompose, time, t='cumsum')
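To make the three visualization methods more concrete, the sketch below computes the curves that each mode would plausibly draw for two made-up components. This is one reading of the option descriptions, not the package's plotting code; in particular, taking cumsum as a running total over time is an assumption suggested by the example, where the cumsum plot is laid over the raw signal rather than the growth signal.

```r
time <- 0:95 / 4

# Two made-up components (values per time point), one column each.
components <- cbind(
  c1 = 5 * exp(-(time - 6)^2 / 2),
  c2 = 3 * exp(-(time - 14)^2 / 8)
)

# 'overlap': every component is drawn from the zero baseline.
overlap <- components

# 'sum': each component is drawn on top of the previous ones,
# so column k holds the sum of components 1..k at each time point.
stacked <- t(apply(components, 1, cumsum))

# 'cumsum' (assumed meaning): running total of each component over time,
# which puts a difference-type signal back on the scale of the raw curve.
cumulative <- apply(components, 2, cumsum)
```

The last column of stacked equals the sum of all components, which is the curve to compare against the target signal.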
7.2 MS.plot.decomposition()
Plots the decomposition of a whole Biolog PM plate.
MS.plot.decomposition(data,
cex=0.5,
signalname='Signal',
t='sum')
Arguments
◦ data – list containing a data frame with the raw signal and the decomposition results
(i.e. the output of the MS.frame.decompose() function). The list is assumed to be
produced by PABPM and formatted accordingly.
◦ cex – graphic parameter cex, used to define the font size.
◦ t – visualization method. Can take the following values:
– overlap – components are overlapping;
– sum – components are stacked on top of each other;
– cumsum – cumulative sums of the components are shown.
◦ signalname – alias of the column with the raw signal. Only used when the t='cumsum'
option is selected.
Example
decomposition = MS.frame.decompose(EColi_corrected$data[[1]])
MS.plot.decomposition(decomposition)
8 Auxiliary Functions
These functions are used inside the main decomposition functions, but can also be called separately.
8.1 PM.get.grow()
Performs the pre-processing: extracts the smoothed lagged difference from the raw signal.
PM.get.grow(frame,
signalname='Signal',
smooth = 0.5)
Arguments
◦ frame – data frame containing the raw target signals.
◦ signalname – alias of the raw signal in the data frame. For example, if set to
'BgCorrected', background-corrected signals are used for the decomposition; if set
to 'Normalized', normalized signals are used. The corresponding
procedures must be run in PABPM beforehand. By default set to 'Signal', meaning that
no additional processing is taken into account.
◦ smooth – smoothing coefficient (parameter b).
Output
The same data frame frame, but with an additional column frame$Growth containing
the target signal.
Example
D = PM.get.grow(EColi_corrected$data[[1]]$data_frame)
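The idea of a smoothed lagged difference can be sketched on synthetic data. This manual does not specify how PM.get.grow() smooths the signal or how the smooth argument is used, so the function below, toy.growth, is a hypothetical stand-in: it pads the difference with a leading zero and applies a Gaussian kernel smoother from base R, both of which are assumptions.

```r
# Hypothetical stand-in for the pre-processing step; not the package's code.
toy.growth <- function(signal, time, bandwidth = 0.5) {
  # Lagged difference: change between consecutive measurements, padded with
  # a leading 0 so the result has the same length as the input.
  d <- c(0, diff(signal))
  # Gaussian kernel smoothing (stats::ksmooth), evaluated at the time points.
  stats::ksmooth(time, d, kernel = "normal", bandwidth = bandwidth,
                 x.points = time)$y
}

time   <- 0:95 / 4                          # 24 h sampled every 15 minutes
signal <- 100 / (1 + exp(-(time - 10)))     # synthetic sigmoid "raw signal"
growth <- toy.growth(signal, time)

# The growth signal peaks where the raw curve rises fastest.
time[which.max(growth)]
```

A sigmoid raw signal thus turns into a single peak, which is the kind of shape the decomposition then describes with components.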
8.2 fit.component()
Fits a single component.
fit.component(signal,
time,
positions=c(),
corrcoef=2,
smooth=0.5)
Arguments
◦ signal – target signal (vector S_t);
◦ time – time points corresponding to each value of the signal;
◦ positions – sum of all other components. The fitted component is penalized for correlation with this vector;
◦ corrcoef – penalty for correlation between components (parameter γ);
◦ smooth – smoothing coefficient (parameter b).
Output
Component List.
Example
D = PM.get.grow(frame,signalname='BgCorrected',smooth=0.5)
signal = D$Growth[ D$Substrate=='D10' ]
time = D$Time[ D$Substrate=='D10' ]
C = fit.component(signal, time)
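8.3 fit.gauss()
8.4 fit.slope()
8.5 fit.brick()
Three functions used to fit a single component of the corresponding type. All three take the same arguments:
fit.gauss(signal, time, d, positions=c(), corrcoef=2)
fit.slope(signal, time, d, positions=c(), corrcoef=2)
fit.brick(signal, time, d, positions=c(), corrcoef=2)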
Arguments
◦ signal – target signal (vector S_t);
◦ time – time points corresponding to each value of the signal;
◦ d – smoothing coefficient (parameter b);
◦ positions – sum of all other components. The fitted component is penalized for correlation with this vector. If positions=c(), no penalty is applied;
◦ corrcoef – penalty for correlation between components (parameter γ).
Output
Component List.
Example
D = PM.get.grow(frame,signalname='BgCorrected',smooth=0.5)
signal = D$Growth[ D$Substrate=='D10' ]
time = D$Time[ D$Substrate=='D10' ]
C1 = fit.gauss(signal, time, 0.5)
C2 = fit.brick(signal, time, 0.5)
C3 = fit.slope(signal, time, 0.5)
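For readers who want to see what fitting one component with a correlation penalty might look like, here is a minimal sketch using base R's optim(). Everything in it is an assumption made for illustration: the Gaussian parametrization (height, center, width), the form of the target function (squared error inflated by positive correlation with positions) and the names toy.gauss, toy.target and toy.fit.gauss. The actual models and target function are defined in the article and in Decomposition_Functions.R.

```r
# Hypothetical Gaussian component: par = (height, center, width).
toy.gauss <- function(par, time) {
  par[1] * exp(-(time - par[2])^2 / (2 * par[3]^2))
}

# Assumed target: squared error, inflated when the prediction correlates
# positively with 'positions' (the sum of the other components).
toy.target <- function(par, signal, time, positions = c(), corrcoef = 2) {
  pred <- toy.gauss(par, time)
  err  <- sum((signal - pred)^2)
  if (length(positions) > 0) {
    r <- suppressWarnings(cor(pred, positions))
    if (is.finite(r)) err <- err * (1 + corrcoef * max(0, r))
  }
  err
}

toy.fit.gauss <- function(signal, time, positions = c(), corrcoef = 2) {
  # Start from the highest point of the signal with unit width.
  start <- c(height = max(signal), center = time[which.max(signal)], width = 1)
  fit <- optim(start, toy.target, signal = signal, time = time,
               positions = positions, corrcoef = corrcoef)
  pred <- toy.gauss(fit$par, time)
  # Return a list shaped like the Component List described in this manual.
  list(model = "gauss", par = fit$par, error = fit$value,
       pred = pred, weigth = sum(pred))
}

time   <- 0:95 / 4
signal <- 20 * exp(-(time - 8)^2 / (2 * 1.5^2))   # synthetic single peak
C <- toy.fit.gauss(signal, time)
C$par
```

With positions left empty the penalty is skipped, mirroring the documented behaviour for positions=c().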
References
[1] M. Vehkala, M. Shubin, T. R. Connor, N. R. Thomson, and J. Corander, “Novel R pipeline for analyzing Biolog phenotypic microarray data,” PLoS ONE, vol. 10, no. 3, p. e0118392, 2015.
[2] L. A. Vaas, J. Sikorski, B. Hofner, A. Fiebig, N. Buddruhs, H. P. Klenk, and M. Göker, “opm: an R package for analysing OmniLog(R) phenotype microarray data,” Bioinformatics, vol. 29, pp. 1823–1824, Jul 2013.