Package `euroMix`

Package ‘euroMix’
December 16, 2015
Type Package
Title Calculations for DNA Mixtures
Version 1.1.1
Date 2015-12-16
Author Guro Dorum and Thore Egeland
Maintainer Guro Dorum <[email protected]>
Description Calculations for DNA mixtures accounting for possibly inbred pedigrees (simulations with conditioning, LR). Calculation of exact p-values.
Depends R (>= 3.0), paramlink (>= 0.9-7), Familias, forensim
Imports graphics, utils
License GPL (>= 2)
LazyLoad yes
NeedsCompilation yes
Repository CRAN
Date/Publication 2015-12-16 15:10:48
R topics documented:
euroMix-package
convertToFamilias
db
db2
famMix
generate
LRmoments
LRp
LRpvalue
LRstat
paraMix
pvalue.machine
q012
qkappa
R
sample
simLR
simMixMerlin
simMixParamlink
tableELRHP
euroMix-package
Forensic calculations including mixtures with pedigrees
Description
Mixtures are simulated and LR (likelihood ratio) calculations are performed. Complex pedigrees,
possibly with inbreeding, theta correction, mutation and silent alleles, are allowed. General conditioning is also accounted for in the simulation. There is also a function, pvalue.machine, that calculates
tail probabilities for LRs.
Details
Package: euroMix
Type: Package
Version: 1.1
Date: 2015-12-07
Depends: paramlink, Familias, forensim
License: GPL
The linkdat object is created using the R package paramlink. This can be done within R or by reading
files in Merlin format using the linkdat function. Some of the functions require Merlin to be
installed.
Author(s)
Guro Dorum and Thore Egeland <[email protected]>
References
See Egeland et al. (2013)
convertToFamilias
Convert genotype data to Familias format
Description
Genotype data are transformed from lines to columns. If there is only one column for each marker
and two lines for each individual, the data are transformed so that there is one line for each individual.
The peculiarities of the input format of Familias are handled.
Usage
convertToFamilias(infile, outfile = paste("out", infile, sep = ""))
Arguments
infile
File name.
outfile
File name for output file.
Details
The first column is the name of the individual, the second indicates sex (X X for females, X Y for
males) while the remaining columns are marker names (no blanks are allowed in names; usual rules
for variable names apply). There are two lines for each individual. The smallest example (below) is
for a female called 32293 with genotypes 15/16 for Marker1:
Name    Sex   Marker
32293   X     15
32293   X     16
Typically, Familias is started by loading a file containing the database (markers and allele frequencies). This file needs only be prepared once for each individual. The Case Related DNA Data can
then be read from the file produced by convertToFamilias. Note that alleles must have precisely
the same names in both files (8 and 8.0 are different alleles, for instance).
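The reshaping described above can be sketched in base R. This is only an illustration of the two-lines-to-one-line idea, not the code of convertToFamilias: the helper name to_wide, the ".1"/".2" column suffixes and the way the two Sex entries are pasted together are assumptions made here, and the exact layout that Familias expects may differ.

```r
# Toy input: two lines per individual, one column per marker (as in the example above)
long <- data.frame(Name = c("32293", "32293"),
                   Sex = c("X", "X"),
                   Marker1 = c(15, 16),
                   stringsAsFactors = FALSE)

# Hypothetical helper (not part of euroMix): one line per individual,
# two columns per marker
to_wide <- function(long) {
  markers <- setdiff(names(long), c("Name", "Sex"))
  rows <- lapply(unique(long$Name), function(id) {
    rec <- long[long$Name == id, ]
    out <- data.frame(Name = id,
                      Sex = paste(rec$Sex, collapse = " "),  # "X X" or "X Y"
                      stringsAsFactors = FALSE)
    for (m in markers) {
      out[[paste0(m, ".1")]] <- rec[[m]][1]  # allele from the first line
      out[[paste0(m, ".2")]] <- rec[[m]][2]  # allele from the second line
    }
    out
  })
  do.call(rbind, rows)
}

wide <- to_wide(long)
print(wide)
```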
Value
A file whose default name is the input file name preceded by "out". This file can be read by Familias
in the Case Related DNA Data window.
Author(s)
[email protected]
Examples
## Not run: convertToFamilias("denise3.txt")
db
Allele database
Description
Norwegian database with 17 EXS17 markers and 6 additional markers.
Usage
data(db)
Format
A data frame with 324 observations on the following 3 variables.
Marker a factor with levels corresponding to name of markers
Allel a numeric vector denoting allele
Frequency a numeric vector in (0,1)
Details
The format is convenient for R
Source
Dupuy et al. (2013), unpublished.
Examples
data(db)
#Checks that frequencies add to 1
lapply(split(db$Frequency,db$Marker),sum)
#Finds number of alleles for all markers
unlist(lapply(split(db$Frequency,db$Marker),length))
#A closer look at the marker SE33
SE33=db[db$Marker=="SE33",]
barplot(SE33$Frequency)
db2
Allele database.
Description
Norwegian database for 10 SGM Plus markers.
Usage
data(db2)
Format
A data frame with 119 observations on the following 3 variables.
Marker a factor with levels corresponding to name of markers
Allele a numeric vector denoting allele
Frequency a numeric vector in (0,1)
Details
The format is convenient for R.
Source
Andreassen et al. (2007).
Examples
data(db2)
#Checks that frequencies add to 1
lapply(split(db2$Frequency,db2$Marker),sum)
#Finds number of alleles for all markers
unlist(lapply(split(db2$Frequency,db2$Marker),length))
#A closer look at the marker TH01
TH01=db2[db2$Marker=="TH01",]
barplot(TH01$Frequency)
famMix
Likelihood for mixtures with related contributors based on Familias
Description
Likelihood for mixtures with related contributors based on Familias. For a general description of
the problem, see paraMix. As opposed to paraMix, this function uses the R version of Familias for
likelihood calculation, and therefore theta correction, mutation models and silent allele frequencies
(but not X-chromosomes or simulation) are accommodated.
Usage
famMix(x, R, id.U, id.V = NULL, partialmarker = NULL, theta = 0,
mutationRateFemale = 0, mutationRateMale = 0,
mutationModelFemale = "stable", mutationModelMale = "stable",
mutationRangeFemale = 0.1, mutationRangeMale = 0.1,
silentFrequency = 0,check=TRUE)
Arguments
x
linkdat object.
R
Integers, mixture.
id.U
List of unknown contributors (e.g.,suspect(s)).
id.V
Integers indicating typed non-contributors.
partialmarker
A marker object.
theta
Real in [0,1]
mutationRateFemale
See FamiliasLocus.
mutationRateMale
See FamiliasLocus.
mutationModelFemale
See FamiliasLocus.
mutationModelMale
See FamiliasLocus.
mutationRangeFemale
See FamiliasLocus.
mutationRangeMale
See FamiliasLocus.
silentFrequency
Real in [0,1].
check
If TRUE check of input is performed and calculations stop if they are likely to
take too much time.
Details
See paraMix.
Value
x
linkdat object updated with genotypes of missing individuals specified by id.U
likelihod
The likelihood Pr(R,T,V|H)
allLikelihoods
Terms adding to above Pr(R,T,V|H)
Author(s)
Thore Egeland <[email protected]>
References
Egeland et al (2013)
See Also
paraMix
Examples
#Example
require(paramlink)
require(Familias)
generate
Generates genotypes for unknown contributors
Description
Given a mixture, alleles for unknown contributors and the number of untyped contributors, the
genotypes of the unknown contributors are generated. The function is recursive.
Usage
generate(R, K, x = 1)
Arguments
R
Integers representing the alleles of the mixtures
K
Integers representing the alleles of the known contributors
x
The number of untyped contributors
Details
Normally x is 4 or less. Computing time may be long for larger values of x.
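To make the enumeration concrete, here is a minimal non-recursive sketch in base R. It is not the implementation of generate: the helper enumerate_unknowns and its return format are assumptions made for illustration. It lists every assignment of unordered genotypes (built from the mixture alleles) to x ordered contributors such that the known alleles K together with the unknown genotypes give exactly the mixture R.

```r
# Hypothetical helper, not part of euroMix
enumerate_unknowns <- function(R, K, x = 1) {
  # All unordered genotypes a/b (a <= b) built from the mixture alleles
  pairs <- expand.grid(a = R, b = R)
  genos <- as.matrix(pairs[pairs$a <= pairs$b, ])
  # Every way of giving one genotype (by row index) to each of x contributors
  idx <- as.matrix(expand.grid(rep(list(seq_len(nrow(genos))), x)))
  # Keep combinations whose alleles, together with K, give exactly R
  keep <- apply(idx, 1, function(i) {
    setequal(union(K, as.vector(genos[i, ])), R)
  })
  list(genotypes = genos, combos = idx[keep, , drop = FALSE])
}

# Number of compatible combinations for 1, 2 and 3 unknown contributors
sapply(1:3, function(x) nrow(enumerate_unknowns(R = 1:3, K = 1:2, x = x)$combos))
```

The number of combinations grows quickly with x (here 6^x candidate combinations before filtering), which illustrates why computing time may become long for larger values of x.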
Value
A matrix. The number of rows is x, one row corresponds to one contributor. The columns are the
alleles, the two first for first genotype and so on.
Author(s)
Thore Egeland <[email protected]>
Examples
#Given evidence R=1/2/3, known contribution K=1/2, the possible genotypes
#for 1,2 and 3 contributors are generated:
set1=generate(R=1:3,K=1:2,x=1)
set2=generate(R=1:3,K=1:2,x=2)
set3=generate(R=1:3,K=1:2,x=3)
stopifnot(all(dim(set3)==c(3,378)))
LRmoments
Calculates expectation, standard deviation and skewness of LR under
HP and HD
Description
Exact numerical calculation
Usage
LRmoments(p = c(0.5, 0.5), kappaP = c(0, 1, 0),kappaD = c(1, 0, 0),log10=FALSE)
Arguments
p
Allele frequencies
kappaP
Probabilities of 0,1 and 2 alleles IBD corresponding to pedigree for HP
kappaD
Probabilities of 0,1 and 2 alleles IBD corresponding to pedigree for HD
log10
If TRUE, LR is log10 transformed.
Value
moments
expectation, standard deviation and skewness of LR under HP and HD
LRtable
Distribution of LR under HP and HD
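As an illustration of what such moments look like, the sketch below computes mean, standard deviation and skewness from a small discrete LR distribution. The table is derived by hand here (it is not package output) for one marker with two alleles of frequency 0.5, HP: parent-child (kappaP = c(0, 1, 0)) and HD: unrelated (kappaD = c(1, 0, 0)); in that case LR takes the values 0, 1 and 2. The helper discrete_moments is hypothetical, and population (not sample) moments are assumed.

```r
# Hand-derived LR distribution for p = c(0.5, 0.5), parent-child vs unrelated
lr      <- c(0, 1, 2)
prob_HP <- c(0, 0.75, 0.25)       # Pr(LR = lr | HP)
prob_HD <- c(0.125, 0.75, 0.125)  # Pr(LR = lr | HD)

# Hypothetical helper: moments of a discrete distribution
discrete_moments <- function(x, pr) {
  m  <- sum(pr * x)                    # expectation
  s  <- sqrt(sum(pr * (x - m)^2))      # standard deviation
  sk <- sum(pr * (x - m)^3) / s^3      # skewness
  c(mean = m, sd = s, skewness = sk)
}

res <- rbind(HP = discrete_moments(lr, prob_HP),
             HD = discrete_moments(lr, prob_HD))
print(res)
```

The expectation under HD equals 1, as it must for any likelihood ratio evaluated under the denominator hypothesis.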
Author(s)
Thore Egeland [email protected]
References
Slooten and Egeland (2013, submitted)
Examples
LRmoments(kappaP=c(0,1,0)) #Motivating example
LRmoments(kappaP=c(0,0.25,0.75)) #skew(LR(HP))<0
#Appendix of Slooten and Egeland (2013, submitted)
## Not run: data(db)
p=db[db$Marker=="VWA",]$Freq
LRmoments(p=p,kappaP=c(0,1,0))
## End(Not run)
LRp
Compute the p-value corresponding to a likelihood ratio.
Description
Computes the likelihood ratio for the given hypotheses and finally calculates a p-value corresponding to the likelihood ratio. The p-value is the probability of observing a likelihood ratio at least as
large as the one observed, given that the defense hypothesis is true.
Usage
LRp( sampleData, victimData, suspectData, db, hp, hd, prD, prC )
Arguments
sampleData
Data frame or matrix with sample profile. Each column represents an allele, each
row represents a marker. Only autosomal markers. Marker names that correspond
to markers in the allele frequency database must be given as row names.
victimData
Data frame or matrix with victim profile. Each column represents an allele, each
row represents a marker. Only autosomal markers. Markers must be in the same
order as for sampleData.
suspectData
Data frame or matrix with suspect profile. Each column represents an allele, each
row represents a marker. Only autosomal markers. Markers must be in the same
order as for victimData and sampleData.
db
Data frame with allele frequencies. Data for the various markers are stacked.
First column contains marker names, each name repeated as many times as there
are alleles for the marker. Second column contains the allele names and third
column contains the frequencies.
hp
Prosecution hypothesis. A character vector of all contributors under $H_p$,
where S denotes suspect, V victim and U unknown. E.g. if the hypothesis is
that the sample is a mixture of the suspect, the victim and one unknown, this is
specified with the vector c('S','V','U').
hd
Defense hypothesis. A character vector of all contributors under $H_d$, specified like hp. E.g. if the hypothesis is that the sample is a mixture of the suspect
and two unknowns, this is specified with the vector c('S','U','U').
prD
Probability of drop-out. A number between 0 and 1.
prC
Probability of drop-in. A number between 0 and 1.
Details
The function is a wrapper for pvalue.machine. Likelihood ratios are computed with the LR function
in forensim. Use pvalue.machine for a more generic function that is independent of LR model.
Value
LR
Likelihood ratio
pvalue
P-value corresponding to the likelihood ratio
Author(s)
Guro Dorum <[email protected]>
References
Dorum et al. Exact computation of the distribution of likelihood ratios with forensic applications.
FSI: Genetics, 9, 2014, doi: http://dx.doi.org/10.1016/j.fsigen.2013.11.008
See Also
pvalue.machine,LRpvalue
Examples
data(R,S,V)
data(db2)
LRp(sampleData=R,victimData=V,suspectData=S,db=db2,hp=c('V','S'),hd=c('V','U'),prD=0.47,prC=0.05 )
LRpvalue
Compute the p-value corresponding to a likelihood ratio.
Description
Reads mixture data from files, computes the likelihood ratio for the given hypotheses and finally
calculates a p-value corresponding to the likelihood ratio. The p-value is the probability of observing a likelihood ratio at least as large as the one observed, given that the defense hypothesis is
true.
Usage
LRpvalue(samplefile, victimfile, suspectfile, freqfile, hp, hd, prD, prC)
Arguments
samplefile
CSV file with sample profile. The file can only contain data for autosomal markers, and apart from that the format is the same as required in LRmix. See the
LRmix manual for details. The file name must contain the complete path if the
file is not in the current working directory.
victimfile
CSV file with victim profile. Same format as in LRmix. Only autosomal markers.
suspectfile
CSV file with suspect profile. Same format as in LRmix. Only autosomal markers.
freqfile
CSV file with allele frequencies. Same format as in LRmix.
hp
Prosecution hypothesis. A character vector of all contributors under $H_p$,
where S denotes suspect, V victim and U unknown. E.g. if the hypothesis is
that the sample is a mixture of the suspect, the victim and one unknown, this is
specified with the vector c('S','V','U').
hd
Defense hypothesis. A character vector of all contributors under $H_d$, specified like hp. E.g. if the hypothesis is that the sample is a mixture of the suspect
and two unknowns, this is specified with the vector c('S','U','U').
prD
Probability of drop-out. A number between 0 and 1.
prC
Probability of drop-in. A number between 0 and 1.
Details
The function is a wrapper for LRp, which in turn is a wrapper for pvalue.machine. Likelihood ratios
are computed with the LR function in forensim. For more freedom in how data are read, LRp
can be used to compute p-values from already prepared data frames. For freedom also regarding the
LR model used, pvalue.machine is the most generic function to compute a p-value.
Value
LR
Likelihood ratio
pvalue
P-value corresponding to the likelihood ratio
Author(s)
Guro Dorum <[email protected]>
References
Dorum et al. Exact computation of the distribution of likelihood ratios with forensic applications.
FSI: Genetics, 9, 2014, doi: http://dx.doi.org/10.1016/j.fsigen.2013.11.008
See Also
pvalue.machine,LRp
Examples
data(sample);data(suspect);data(victim);data(freqs)
samplefile <- tempfile(); write.table(sample, samplefile, sep=",", row.names=FALSE)
victimfile <- tempfile(); write.table(victim, victimfile, sep=",", row.names=FALSE)
suspectfile <- tempfile(); write.table(suspect, suspectfile, sep=",", row.names=FALSE)
freqfile <- tempfile(); write.table(freqs, freqfile, sep=",", row.names=FALSE)
LRpvalue(samplefile, victimfile, suspectfile, freqfile, hp=c("V","S"), hd=c("V","U"),
prD=0.47, prC=0.05)
unlink(c(samplefile, victimfile, suspectfile, freqfile))
LRstat
Distribution of LR(HP) and LR(HD)
Description
Distribution of LR(HP) and LR(HD) are calculated as well as some summary statistics.
Usage
LRstat(ped_claim, ped_true, ids, alleles, afreq = NULL,
known_genotypes = list(), loop_breakers = NULL, Xchrom = F, plot = T)
Arguments
ped_claim
a linkdat object, or a list of several linkdat and/or singleton objects, describing
the claimed relationship. If a list, the sets of ID labels must be disjoint, that is,
all ID labels must be unique.
ped_true
a linkdat object, or a list of several linkdat and/or singleton objects, describing
the true relationship. ID labels must be consistent with ped_claim.
ids
individuals available for genotyping.
alleles
a numeric or character vector containing marker allele names
afreq
a numerical vector with allele frequencies. An error is given if they don’t sum
to 1 (rounded to 3 decimals).
known_genotypes
list of triplets (a, b, c), indicating that individual a has genotype b/c.
loop_breakers
a numeric containing IDs of individuals to be used as loop breakers. Relevant
only if any of the pedigrees has loops. See breakLoops.
Xchrom
a logical: Is the marker on the X chromosome?
plot
either a logical or the character "plot_only", controlling if a plot should be produced. If "plot_only", a plot is drawn, but no further computations are done
(useful for reproducing the plot in computer-intensive applications)
Details
Connected to joint work with Klaas Slooten
Value
main
Expected values, variances of LR(HP) and LR(HD). RMNE
extra
P(data|ped_claim) and P(data|ped_true)
LRdist
Distribution of LR
Author(s)
Thore Egeland [email protected]
See Also
exclusionPower
Examples
HP = nuclearPed(noffs=1, sex=2) # Specifies individual 1 as the father of 3
HD= list(singleton(id=1,sex=1), singleton(id=3, sex=2)) # Specifies 1 and 3 as unrelated
p=c(0.2,0.3,0.5);L=length(p)
available = c(1, 3)
res=LRstat(HP, HD, available, alleles = 1:L, afreq=p)
E.LR.HP=res$main[1]
stopifnot(abs(E.LR.HP-(L+3)/4)<1e-06)
res$LRdist #Distribution of LR
paraMix
Likelihood for mixtures with related contributors based on paramlink
Description
A DNA mixture (R) has been observed and some individuals may have been typed. Some of these
typed individuals are known contributors to the mixture, some are known non-contributors. In addition, there may be specified untyped individuals that have contributed to the mixture. Individuals
can be specified as members of a pedigree defined by a linkdat object x corresponding to a hypothesis H. Relevant individuals unrelated to all others are defined using singleton. The likelihood
Pr(mixture, typed contributors, typed non-contributors | H) = Pr(R,T,V|H)
is calculated; the notation on the right-hand side corresponds to that of Curran, Gill and Bill (2005).
A plot summarising the essential information is also produced. Compared to previous literature and
methods, including a series of papers by Fung and Hu, we generalise calculations to allow for general, possibly inbred, pedigrees. Typically, calculations are performed for competing hypotheses, and
the ratio of likelihoods, the likelihood ratio (LR), is calculated and reported. Previous methods have
assumed the relationships between typed contributors to be the same for the competing hypotheses.
This restriction does not apply to our approach. The calculation may also be used for identification
cases where a mixture and reference samples are available. Likelihood calculations are performed
using the likelihood function of paramlink. The function checkInput checks input to paraMix.
Usage
paraMix(x, R, id.U, id.V = NULL, alleles, afreq = NULL,
Xchrom= FALSE, known_genotypes = list(), loop_breakers =NULL,
eliminate = 0, check = TRUE, plot = TRUE, title= NULL)
checkInput(x, R, id.U, id.V, alleles, all_typed, K, R_not_masked)
Arguments
x
linkdat object, or a list of such (if disconnected), describing the claimed relationship.
R
Integers, mixture.
id.U
Integers indicating untyped contributors (e.g.,suspect(s)).
id.V
Integers indicating typed non-contributors.
alleles
Integers indicating alleles for marker.
afreq
A numerical vector with allele frequencies. An error is given if they don’t sum
to 1 (rounded to 3 decimals).
Xchrom
Logical, FALSE for autosomal marker.
known_genotypes
List, each element a triplet of integers corresponding to (id,allele1,allele2)
loop_breakers
A numeric containing IDs of individuals to be used as loop breakers. Relevant
only if the pedigree has loops. See breakLoops.
eliminate
A non-negative integer, indicating the number of iterations in the internal genotype-compatibility algorithm. Positive values can save time if partialmarker is non-empty and the number of alleles is large.
check
If TRUE check of input is performed and calculations stop if they are likely to
take too much time
plot
If TRUE a plot is produced
title
Title of the plot
all_typed
An integer vector identifying typed individuals
K
Known alleles in contrib_typed
R_not_masked
Unexplained alleles
Details
The required likelihood is Pr(R,T,V|H) = Pr(R|T,V,H) Pr(T,V|H) = sum_u Pr(U=u,T,V|H),
where the sum extends over u among persons specified by id.U so that the union of u, T, V is R. The
likelihood for each u and the sum are returned. Alleles are assumed to be numbered 1,2,...
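For simple cases the sum over u can be evaluated by hand. The sketch below does this in base R for the data of Example 2 in the Examples section (mixture 1/2/3, suspect 3/3, victim 1/2), assuming no mutation and Hardy-Weinberg proportions; it does not call paraMix, and all variable names are introduced here for illustration. Under H1 the typed suspect and victim explain the mixture completely, so each LR reduces to one over the probability that the untyped relative has a genotype compatible with the mixture, given the suspect's genotype 3/3.

```r
afreq <- c(0.044, 0.166, 0.110, 0.680)
pR <- sum(afreq[1:3])  # a random allele is one of the mixture alleles 1, 2, 3
pK <- sum(afreq[1:2])  # a random allele is one of the victim's alleles 1, 2

# H2: the father of the suspect contributed. He carries a 3 (his child is 3/3);
# his other allele is a random allele that must lie in the mixture.
prob_father <- pR

# H3: a brother of the suspect contributed. Condition on 0, 1, 2 alleles IBD
# with the suspect (full siblings: kappa = 1/4, 1/2, 1/4).
kappa <- c(0.25, 0.5, 0.25)
prob_ibd0 <- pR^2 - pK^2  # both alleles in the mixture, at least one is a 3
prob_ibd1 <- pR           # shares one 3; the other allele must lie in the mixture
prob_ibd2 <- 1            # genotype 3/3, always compatible
prob_brother <- sum(kappa * c(prob_ibd0, prob_ibd1, prob_ibd2))

LR12 <- 1 / prob_father
LR13 <- 1 / prob_brother
c(LR12 = LR12, LR13 = LR13)
```

These hand calculations agree with the values 3.125 and 2.355296 checked in Example 2.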
Value
likelihod
The likelihood Pr(R,T,V|H)
allLikelihoods
Terms adding to above Pr(R,T,V|H)
Author(s)
Magnus Dehli Vigeland and Thore Egeland <[email protected]>
See Also
famMix
Examples
#Example 1: Motivating example Egeland et al. (2013)
require(paramlink)
y1=swapSex(nuclearPed(3),c(3,4))
p=c(0.1,0.2,0.3,0.4)
alleles=1:length(p)
T1=c(1,1)
T2=c(2,2)
R=1:2
known=list(c(3,T1),c(4,T2))
l1=paraMix(y1,R,id.U=5,alleles=alleles,afreq=p,known_genotypes=known)
y2=swapSex(nuclearPed(1),3)
y2=addOffspring(y2,mother=2,noff=1,sex=2)
y2=relabel(y2,c(1:3,6,4),1:5)
l2=paraMix(y2,R,id.U=6,alleles=alleles,afreq=p,known_genotypes=known)
LR1=l1$lik/l2$lik
exact=1/(2*(p[1]+p[2]))
stopifnot(abs(LR1-exact)<10^(-6))
#Example 2. Example 1 in Egeland et al. (2013) based on Fung and Hu (2008)
#Data:
#Mixture 1/2/3
#Suspect=4, genotype 3/3
#Victim=10, genotype 1/2
#H1: Contributors were the suspect and victim (unrelated)
#H2: Contributors were the father of suspect and victim (unrelated)
#H3: Contributors were the brother of suspect and victim (unrelated)
afreq=c(0.044,0.166,0.110,0.680)
alleles=1:length(afreq)
R=1:3 #Mixture
man_ped=nuclearPed(2)
victim = singleton(id=10, sex=2)
known = list(c(4,3,3),c(10,1,2)) #individual 4 is 3/3, and 10 (the victim) is 1/2.
#The likelihoods corresponding to H1,H2 and H3
l1=paraMix(list(man_ped, victim), R, id.U=NULL, id.V=NULL,
alleles=alleles, afreq=afreq, known_genotypes=known)$lik
l2=paraMix(list(man_ped, victim), R, id.U=1, id.V=4,
alleles=alleles, afreq=afreq, known_genotypes=known)$lik
l3=paraMix(list(man_ped, victim), R, id.U=3, id.V=4,
alleles=alleles, afreq=afreq, known_genotypes=known)$lik
LR12=l1/l2
stopifnot(abs(LR12-3.125)<10^(-6))
LR13=l1/l3
stopifnot(abs(LR13- 2.355296)<10^(-6))
pvalue.machine
Computes the p-value for LR.suspect
Description
It is difficult to obtain accurate p-values based on simulation. This function provides an exact
alternative.
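The quantity being computed can be illustrated with a brute-force sketch in base R. This is not the algorithm used by pvalue.machine: the helper brute_force_pvalue is hypothetical, it assumes independent markers and the "at least as large" definition of the p-value, and it enumerates all G^M genotype combinations, which is only feasible for tiny examples.

```r
# Hypothetical helper: Pr(LR >= LR.suspect) under the defense hypothesis,
# by enumerating every combination of per-marker genotypes
brute_force_pvalue <- function(LR.suspect, LR.table, P.table) {
  M <- nrow(LR.table)
  G <- ncol(LR.table)
  grid <- as.matrix(expand.grid(rep(list(seq_len(G)), M)))
  total <- 0
  for (r in seq_len(nrow(grid))) {
    idx <- cbind(seq_len(M), grid[r, ])  # (marker, genotype) index pairs
    lr <- prod(LR.table[idx])            # overall LR is the product over markers
    pr <- prod(P.table[idx])             # probability of this combination
    if (lr >= LR.suspect) total <- total + pr
  }
  total
}

# Same toy input as in the Examples section
LR.table <- matrix(c(6, 5, 5, 4, 3, 2), 2, 3)
P.table <- rbind(c(0.2, 0.4, 0.4), c(0.1, 0.6, 0.3))
p_brute <- brute_force_pvalue(20, LR.table, P.table)
p_brute
```

Padding columns with probability 0 contribute nothing to the sum, so markers with fewer than G genotypes are handled without special treatment.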
Usage
pvalue.machine(LR.suspect, LR.table, P.table)
Arguments
LR.suspect
Numeric. Observed likelihood ratio (1x1 positive value)
LR.table
Pre-computed likelihood ratios for every genotype of every marker (MxG matrix). Each row corresponds to a marker. G is the maximum number of genotypes for any marker. Markers with fewer than G genotypes must have 0 in
redundant columns
P.table
The population probabilities for every genotype of every marker (MxG matrix).
Must correspond to the genotypes in LR.table. See description of LR.table
Value
The p-value, where a value close to 0 indicates that the suspect is a contributor.
Author(s)
Dorum, Bleka, Snipen <[email protected]>
See Also
The function is obsolete.
See dists.product and dists.product.pair for efficient computation of likelihood ratio distributions.
Examples
#Simple example, 2 markers, 3 genotypes. LR's and genotype probabilities precalculated
#The LR's for all possible genotypes for both markers. Each row corresponds to a marker
LR.table <- matrix(c(6,5,5,4,3,2),2,3)
#The population probabilities corresponding to the genotypes in LR.table
P.table <- rbind(c(0.2, 0.4, 0.4), c(0.1,0.6,0.3))
#LR observed for suspect
LR.suspect <- 20
pvalue <- pvalue.machine(LR.suspect, LR.table, P.table)
cat("p-value = ", pvalue, "\n")
q012
Probabilities for pairwise relationships
Description
Calculates the probability distribution for a pair of individuals conditionally on 0,1, and 2 IBD
alleles.
Usage
q012(p = c(0.5, 0.5))
Arguments
p
A numerical vector with allele frequencies
Details
The function calls oneMarkerDistribution for IBD=0,1 and 2.
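For intuition, the three conditional distributions can also be obtained by direct enumeration in base R. The sketch below does not use paramlink and is not how q012 is implemented; the helper ibd_joint and its genotype labels ("1/2" etc.) are assumptions made here. It assumes Hardy-Weinberg proportions and no mutation.

```r
# Hypothetical helper: joint genotype distribution of two individuals
# given that they share `ibd` (0, 1 or 2) alleles identical by descent
ibd_joint <- function(p, ibd) {
  L <- length(p)
  lab <- function(x, y) paste(pmin(x, y), pmax(x, y), sep = "/")
  # Individual 1 has alleles (a, b), individual 2 has alleles (c, d)
  g <- expand.grid(a = seq_len(L), b = seq_len(L),
                   c = seq_len(L), d = seq_len(L))
  lev <- sort(unique(lab(g$a, g$b)))  # all unordered genotypes
  if (ibd == 0) {
    w <- p[g$a] * p[g$b] * p[g$c] * p[g$d]  # four independent alleles
  } else if (ibd == 1) {
    g <- g[g$c == g$a, ]                    # allele a is shared
    w <- p[g$a] * p[g$b] * p[g$d]
  } else {
    g <- g[g$c == g$a & g$d == g$b, ]       # both alleles are shared
    w <- p[g$a] * p[g$b]
  }
  out <- tapply(w, list(factor(lab(g$a, g$b), lev),
                        factor(lab(g$c, g$d), lev)), sum)
  out[is.na(out)] <- 0                      # impossible genotype pairs
  out
}

p <- c(0.5, 0.5)
q <- lapply(0:2, function(k) ibd_joint(p, k))
names(q) <- c("q0", "q1", "q2")
sapply(q, sum)  # each distribution sums to 1
```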
Value
q0
Joint distribution given IBD=0
q1
Joint distribution given IBD=1
q2
Joint distribution given IBD=2
Author(s)
Thore Egeland <[email protected]>
References
None
Examples
require(paramlink)
q012()
qkappa
Calculates joint distribution for a pair of individuals given IBD probabilities
Description
Based on the conditional distributions given IBD from q012, the joint probability distribution for two
individuals is given for specified IBD probabilities.
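In formula form, and assuming that the function applies the law of total probability over the number of IBD alleles, the joint distribution for genotypes g_1 and g_2 would be the kappa-weighted mixture below. The symbols q_i (conditional distribution given i alleles IBD) and \kappa_i are notation introduced here, not taken from the package.

```latex
q(g_1, g_2) = \kappa_0\, q_0(g_1, g_2) + \kappa_1\, q_1(g_1, g_2) + \kappa_2\, q_2(g_1, g_2),
\qquad \kappa_0 + \kappa_1 + \kappa_2 = 1 .
```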
Usage
qkappa(kappa = c(0, 1, 0), q = NULL)
Arguments
kappa
Three reals summing to 1 giving IBD (0,1,2) probabilities
q
The joint probability distribution for two individuals
Value
A matrix giving the joint distribution
Author(s)
Thore Egeland <[email protected]>
References
To appear
Examples
require(paramlink)
#Sibs. One SNP marker with
qkappa(kappa=c(0.25,0.5,0.25),q012(p=c(0.2,0.8)))
R
R, S and V
Description
Data used for examples in LRp.
Usage
R
S
V
Format
R is a data.frame containing mixture alleles for 9 markers.
S is a data.frame containing suspect’s genotype for 9 markers.
V is a data.frame containing victim’s genotype for 9 markers.
Examples
data(R);data(S);data(V)
sample
sample, suspect, victim, freqs
Description
Data used for examples in LRpvalue.
Usage
sample
suspect
victim
freqs
Format
sample is a data frame with mixture alleles for 9 markers. suspect is a data frame with suspect’s
genotype for 9 markers. victim is a data frame with victim’s genotype for 9 markers. freqs is a data
frame with frequencies for 10 markers.
Examples
data(sample)
data(suspect)
data(victim)
data(freqs)
simLR
Likelihood for mixtures that may have related contributors and drop-in
and drop-out of alleles
Description
Likelihood for mixtures that may have related contributors and drop-in and drop-out of alleles.
For a general description of the problem, see paraMix. As opposed to paraMix, drop-in and
drop-out of alleles are allowed. The likelihood is based on simulations from an urn model. Possible
mixtures are simulated by applying drop-in and drop-out to genotypes for the assumed contributors.
Genotypes for unknown contributors are simulated conditioned on the pedigree.
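The simulation idea can be sketched in a few lines of base R for the simplest case, where all contributors are typed. This is a sketch under stated assumptions, not the model implemented in simLR: the helper sim_mixture_prob is hypothetical, each allele copy is assumed to drop out independently with probability pDO, at most one allele drops in per locus with probability pDI, and the drop-in allele is assumed to be drawn according to the allele frequencies.

```r
# Hypothetical helper: Monte Carlo estimate of Pr(observed mixture R)
# given the genotypes of the assumed contributors
sim_mixture_prob <- function(R, genotypes, afreq, pDO, pDI, N) {
  alleles_in <- unlist(genotypes)  # all allele copies of the contributors
  hits <- 0
  for (i in seq_len(N)) {
    # Drop-out: each allele copy survives with probability 1 - pDO
    kept <- alleles_in[runif(length(alleles_in)) >= pDO]
    # Drop-in: with probability pDI one extra allele appears
    if (runif(1) < pDI) {
      kept <- c(kept, sample(seq_along(afreq), 1, prob = afreq))
    }
    # Does the simulated mixture equal the observed one?
    if (setequal(kept, R)) hits <- hits + 1
  }
  hits / N
}

set.seed(1)
afreq <- c(0.044, 0.166, 0.11, 0.68)
est <- sim_mixture_prob(R = 1:3, genotypes = list(c(1, 2), c(3, 3)),
                        afreq = afreq, pDO = 0.1, pDI = 0.05, N = 5000)
est
```

Because the result is a Monte Carlo estimate, it varies with the seed and with N; a ratio of two such estimates inherits the simulation error of both.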
Usage
simLR(R, x, alleles, afreq, pDO, pDI, N, known_genotypes = NULL,
ped = NULL, id.U = NULL, id.V = NULL)
Arguments
R
Integers, mixture
x
Number of unknown contributors
alleles
Integers indicating alleles for marker
afreq
A numerical vector with allele frequencies
pDO
Probability of drop-out applied per allele
pDI
Probability of drop-in per locus
N
Number of simulations
known_genotypes
List of known genotypes. If a pedigree is specified, each element must be a triplet
of integers corresponding to (id,allele1,allele2). If no pedigree is specified, the
id can be omitted.
ped
linkdat object, or a list of such (if disconnected), describing the claimed relationship.
id.U
Integers indicating untyped contributors (e.g.,suspect(s)). Only relevant if a
pedigree is specified.
id.V
Integers indicating typed non-contributors. Only relevant if a pedigree is specified.
Value
p.R: the likelihood of the mixture R
Author(s)
Guro Dorum and Thore Egeland <[email protected]>
See Also
See paraMix.
Examples
require(paramlink)
alleles <-1:4
p <- c(0.044, 0.166, 0.11, 0.68)
names(p) <- alleles
R <- 1:3
known <- list(c(6,1,2),c(4,3,3))
x <- halfCousinPed(0)
y <- singleton(6,sex=2)
pDO <- 0.1
pDI <- 0.05
N <- 20000
lp <- simLR(R=R, x=0, alleles=alleles, afreq=p,
pDO, pDI, N, known_genotypes=known, ped=list(x,y))
ld <- simLR(R=R, x=1, alleles=alleles, afreq=p, pDO, pDI,
N, known_genotypes=known, ped=list(x,y), id.U=5,id.V=4)
lp/ld
simMixMerlin
A DNA mixture is generated from individual genotypes using
paramlink and Merlin
Description
A linkdat object is created. MERLIN files can be generated or mixtures can be generated based on
existing files. This function requires MERLIN to be installed and correctly pointed to in the PATH
environment variable.
Usage
simMixMerlin(x, aa, afreq, options=NULL, seed = 12345, generate = FALSE)
Arguments
x
linkdat object
aa
Allele list. aa[[1]] contains alleles for marker 1. MERLIN has an upper limit
on the number of alleles and therefore has problems with one marker, SE33, in db.
afreq
Frequency list
options
A character with additional options to pass on to Merlin
seed
Random seed to pass on to Merlin. If not set, Merlin will return the same simulated data each time
generate
If TRUE, Merlin files are generated
Value
y
linkdat object
comp2
list of mixtures
Author(s)
Thore Egeland <[email protected]>
Examples
## Not run:
#Example 1
require(paramlink)
data(db)
x=cousinPed(1)
x=swapSex(addOffspring(x,father=7,mother=8,noff=2),ids=10)
db2=split(db,db$Marker)
Nmarkers=5
aa=vector("list",Nmarkers)
afreq=vector("list",Nmarkers)
for (i in 1:Nmarkers){
aa[[i]]=db2[[i]]$Allel
afreq[[i]]=db2[[i]]$Frequency
m=marker(x,9,c(1,1),10,c(1,1),alleles=1:length(aa[[i]]),afreq=afreq[[i]])
x=addMarker(x,m)
}
res=simMixMerlin(x,aa,afreq,generate=TRUE)
#The map file generated default above leads to tightly linked markers. The map file
#can be edited and simMixMerlin rerun with generate=FALSE.
#Example 2
#Next we consider an example with two markers (for simplicity),
#D12 and VWA, markers 3 and 23 in db, and illustrate how the map
#file is edited to account for linkage
x=cousinPed(1)
x=swapSex(addOffspring(x,father=7,mother=8,noff=2),ids=10)
Nmarkers=2
aa=vector("list",Nmarkers)
afreq=vector("list",Nmarkers)
i=0
for (j in c(3,23)){
i=i+1
#Marker j of the split database (3 and 23) is stored at position i
aa[[i]]=db2[[j]]$Allel
afreq[[i]]=db2[[j]]$Frequency
m=marker(x,9,c(1,1),10,c(1,1),alleles=1:length(aa[[i]]),
afreq=afreq[[i]])
x=addMarker(x,m)
}
res=simMixMerlin(x,aa,afreq,generate=TRUE)
#Next edit map file, normally this is done
#simpler than below
map=read.table("merlin.map",header=FALSE)
map[,1]=c(12,13)
map[,3]=c(0.5,0.5)
write.table(map,"merlin.map",col.names=FALSE,quote=FALSE,
row.names=FALSE)
res=simMixMerlin(x,aa,afreq,generate=FALSE)
## End(Not run)
simMixParamlink
Generates DNA mixtures
Description
A DNA mixture is generated from individual genotypes using the R package paramlink
Usage
simMixParamlink(y, alleles)
Arguments
y
linkdat object from paramlink
alleles
Alleles in original form.
Details
The alleles are internally represented as consecutive integers 1,2,.... Mixtures are generated and
then translated back to the original allele values.
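The mapping between integer codes and original allele values can be illustrated as follows. This is a minimal base R sketch with made-up genotypes, not the code of simMixParamlink; the variable names are introduced here for illustration. The mixture is taken to be the set of distinct alleles carried by the contributors.

```r
# Original allele values; position i corresponds to integer code i
alleles <- c(18, 19.2, 20, 21)

# Made-up genotypes of two contributors in integer coding (one row each)
geno_int <- rbind(c(1, 2),
                  c(2, 4))

# Mixture in integer coding: distinct alleles over all contributors
mix_int <- sort(unique(as.vector(geno_int)))

# Translate back to the original allele values
mix <- alleles[mix_int]
mix
```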
Value
A list of length equal to the number of markers (or simulations) each giving the mixture
Author(s)
Thore Egeland <[email protected]>
Examples
#Example 1
require(paramlink)
x=cousinPed(1)
x=swapSex(addOffspring(x,father=7,mother=8,noff=2),ids=9)
plot(x)
data(db)
locus="FGA"
afreq1=db[db$Marker==locus,3]
alleles=db[db$Marker==locus,2]
m1=marker(x,alleles=alleles,afreq=afreq1)
y=markerSim(x,N=3,available=c(9,10),partialmarker=m1,verbose=FALSE,loop=7,seed=2)
res=simMixParamlink(y,alleles)
plot(y,marker=1:3)
#Example 2 With conditioning
x=halfCousinPed(2)
data(db)
locus="FGA"
afreq1=db[db$Marker==locus,3]
alleles=db[db$Marker==locus,2]
g.13=c(18,19.2)
m1=marker(x,13,g.13,alleles=alleles,afreq=afreq1)
y=markerSim(x,N=2,available=c(8,9,12),partialmarker=m1,verbose=FALSE,seed=2)
res=simMixParamlink(y,alleles)
plot(y,marker=1:2,cex=0.7,starred=13)
tableELRHP
Calculates E(LR(HP))
Description
Calculates E(LR(HP)), SD(LR(HP)) and SD(LR(HD)) exactly. The answers are independent of allele
frequencies, except for SD(LR(HD)).
Usage
tableELRHP(L = 4,p=rep(1/L,L))
Arguments
L
Integer, at least 1.
p
Allele frequencies, vector of length $L$
Value
A table for a set of pairwise relationships.
Note
Other pairwise relationships require simple changes in the code.
Author(s)
Thore Egeland [email protected]
References
Slooten and Egeland (to appear)
Examples
tableELRHP(L = 2)