Evolutionary Genetics: Part 5
Coalescent simulations
S. chilense, S. peruvianum
Winter Semester 2012-2013
Prof Aurélien Tellier
FG Populationsgenetik

Color code
Red = important result or definition
Purple = exercise to do
Green = some bits of maths

Population genetics: 4 evolutionary forces
Molecular diversity is shaped by:
random genomic processes (mutation, duplication, recombination, gene conversion)
natural selection
random spatial process (migration)
random demographic process (drift)

Simulating sequence data

How to simulate?

Algorithm to generate sequence data
1. Set k = n, where n is the sample size.
2. Draw an exponential waiting time with rate k(k-1+θ)/2.
3. With probability (k-1)/(k-1+θ) the event is a coalescent event; with probability θ/(k-1+θ) it is a mutation.
4. If a coalescent event occurs, choose a pair of lineages to coalesce; k then becomes k-1.
5. If a mutation event occurs, choose a lineage to mutate; k is unchanged.
6. Repeat steps 2 to 5 until k = 1.

Simulations 1
What is θ?

Simulations 1
Do you see the same numbers? Why?

Simulations 1
    … 4 -t 5 -T > treefile.tre
    pdf(file="constant_tree.pdf")
    dev.off()

Simulations 1: neutral and constant size

Simulations 2: neutral and expansion
Ancestral population size = x*N0
Time t1 of expansion, in units of 4N0 generations
Present population size = N0
Do you see a problem? What is N0?
t1 = 0.5: time in the past at which the expansion starts
x = 0.1: the population size in the past is 0.1*N0

Simulations 2: neutral and expansion
    … 4 -eN 0.5 0.1
    … 4 -eN 0.05 0.1 -T > expansion.tre
0.5 = time in the past at which the expansion starts
0.1 = the population size in the past is 0.1*N0

Simulations 2: trees of expansion
expansion.tre
    pdf(file="expansion-tree.pdf")
    dev.off()

Simulations 3: crash or bottleneck?
For a crash:
    ./ms 10 4 -t 5 -eN 0.5 5
Ancestral population size = x*N0
Time t1 of the size change, in units of 4N0 generations
Present population size = N0
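Looking back at the slide "Algorithm to generate sequence data", the event loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way ms is implemented: the function name simulate_sample is hypothetical, every mutation is assumed to hit a new site (infinite-sites model, which the slide does not state), and each lineage is tracked as the set of sampled sequences that descend from it.

```python
import random


def simulate_sample(n, theta, seed=None):
    """Simulate n sequences by drawing coalescent and mutation events one at a time.

    Returns (haplotypes, elapsed_time). haplotypes is a list of n lists of
    0/1 values (1 = mutated), with one column per mutation event.
    """
    rng = random.Random(seed)
    # Each active lineage is the set of sampled sequences it is ancestral to.
    lineages = [frozenset([i]) for i in range(n)]
    mutations = []  # for each mutation event, the set of samples carrying it
    elapsed_time = 0.0

    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next event: exponential with rate k(k-1+theta)/2.
        elapsed_time += rng.expovariate(k * (k - 1 + theta) / 2.0)
        if rng.random() < (k - 1) / (k - 1 + theta):
            # Coalescent event: merge a random pair of lineages, so k becomes k-1.
            i, j = rng.sample(range(k), 2)
            merged = lineages[i] | lineages[j]
            lineages = [lin for idx, lin in enumerate(lineages) if idx not in (i, j)]
            lineages.append(merged)
        else:
            # Mutation event: a random lineage mutates; k is unchanged.
            # Assumption: every mutation creates a new site (infinite sites).
            mutations.append(rng.choice(lineages))

    haplotypes = [[1 if s in carriers else 0 for carriers in mutations]
                  for s in range(n)]
    return haplotypes, elapsed_time


if __name__ == "__main__":
    haps, t = simulate_sample(n=10, theta=5, seed=1)
    print("segregating sites:", len(haps[0]))
    for h in haps:
        print("".join(map(str, h)))
```

Running the function several times with different seeds gives a different number of segregating sites each time, which is the point of the question "Do you see the same numbers?". Averaged over many runs, the number of mutation events approaches θ(1 + 1/2 + ... + 1/(n-1)).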
For a bottleneck:
    ./ms 10 4 -t 5 -eN 0.5 0.25 -eN 0.75 2
with t1 = 0.5, x1 = 0.25, t2 = 0.75 and x2 = 2
Ancestral population size = x2*N0
Time t2
Bottleneck population size = x1*N0
Time t1
Present population size = N0

Exercise
Summarize the ms output.
Then save the output in a file:
    > test1.out

Exercise
Now using R, load the file:
    test <- read.table("test1.out", header=FALSE)
Then draw graphs:
    pdf(file="summary_neutral_constant.pdf")
    hist(test[,2], main="Theta_Pi Tajima")
    hist(test[,4], main="Theta_Watterson")
    hist(test[,6], main="Tajima D")
    dev.off()
Then do the same for an expansion, a decline or a bottleneck.

Final simulations
Use msmsplay on your computer.
The command line is similar.
You can see the site frequency spectrum directly.
Can you compare the site frequency spectrum with the values of Tajima's D?
Let's simulate a neutral model, an expansion and a decline.
What differences do we see?

Some data analysis
Use datasets:
Use DnaSP to calculate the usual statistics:
Diversity: θW, θπ
Site frequency spectrum
Tajima's D
What do you conclude from these various data?
Do you have an idea of the past demography of these populations?
Why do you need several independent loci?
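To make the statistics named above concrete, here is a minimal Python sketch that computes the number of segregating sites S, θW, θπ and Tajima's D from a set of 0/1 sequences such as the haplotype lines printed by ms. The function name summary_statistics is hypothetical, and the sketch assumes complete biallelic data; DnaSP may treat gaps, missing data and multiple hits differently. The constants in D follow the standard definitions from Tajima (1989).

```python
import math


def summary_statistics(haplotypes):
    """Return (S, theta_W, theta_pi, tajima_d) for equal-length 0/1 sequences.

    tajima_d is None when it is undefined (no segregating sites, or a sample
    too small for the variance term to be positive).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")

    # Count of the '1' allele at each site; keep only polymorphic sites.
    counts = [sum(col) for col in zip(*haplotypes)]
    counts = [c for c in counts if 0 < c < n]
    s = len(counts)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Watterson's estimator: S divided by the harmonic number a1.
    theta_w = s / a1
    # Average number of pairwise differences, summed site by site.
    theta_pi = sum(2.0 * c * (n - c) for c in counts) / (n * (n - 1))

    if s == 0:
        return s, theta_w, theta_pi, None

    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)

    tajima_d = None
    if variance > 1e-12:
        tajima_d = (theta_pi - theta_w) / math.sqrt(variance)
    return s, theta_w, theta_pi, tajima_d


if __name__ == "__main__":
    sample = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(summary_statistics(sample))
```

A sample in which every mutation is a singleton gives θπ below θW and therefore a negative D, while an excess of intermediate-frequency mutations pushes D above zero. This is the link between the site frequency spectrum and Tajima's D that the questions above ask about.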