Genetic Diversity in Tasmanian Atlantic Salmon and Prospects for GWAS and Genomic Prediction

James Kijas 1, Peter Kube 1, Brad Evans 2, Natasha Botwright 1, Harry King 1, Craig Primmer 3, Klara Verbyla 1
1 AGRICULTURE 2 3

OUTLINE
1. Genomic Resources
   - reference assembly, SNP chips
   - opportunities
2. Tasmanian Atlantic Salmon
   - population history
   - selective breeding program
   - objectives
3. Results
   - genetic diversity, LD, imputation
4. Conclusions and Next Steps

1. Genomic Resources

Genomic Resources Include:

Reference Assembly (Lien et al Nature 2016)
• Haploid female
• 200 x ILMN, 4 x Sanger
• Genome duplication

Transcriptomes (Lien et al Nature 2016)
• 37,206 PCGs

SNP Genotyping Arrays
• 16K iSelect (2009)
• 130K, 200K Affy arrays (2014)
• Discovery panels European

Opportunities:

Genomic Prediction
• Incorporates genotypes in BV estimation
• Speed genetic gain, success in livestock
• Highly relevant to our Salmon industries

GWAS (Barson et al Nature 2015)
• Major genes

2. Tasmanian Atlantic Salmon

Population History
• Transfer of animals in 1964 / 1965
• Breeding Program from 2001
• Earliest consequences of domestication, GxE
• Founder effect
• Genome and SNP platforms from European stocks

The Tasmanian salmon breeding cycle
• DNA taken from every fish
• One-year old smolt are tagged and fin-clipped
• Fin-clips into 2 ml tubes, DNA for pedigree plus..
• Tags allow automated data capture and fail safe systems
• Any idiot can be called upon to help at spawning...

Selective Breeding Program (SBP)

Challenges
• Early maturation
• Sex determination
• All female commercial animals

Focus Traits
• Growth rate
• Amoebic Gill Disease
• Flesh quality, other minor traits

NEXT STEPS
- Genomic Prediction
- GWAS for key traits
- Impact of domestication and selection

Objectives
1: Evaluate Levels of Genetic Diversity
   - population comparison
   - across year classes since inception of SBP
2: Measure extent of linkage disequilibrium -> GWAS
3: Imputation -> ongoing GP program

3. RESULTS

Materials and Methods:
• 782 fish from the SBP
• Genotyped using custom 220K Affymetrix array (AquaGen, CiGene)
• Data QC and filtering (call and sample rate)
• 777 fish with high quality SNP data

Genetic Diversity: Polymorphism

PN    Proportion of polymorphic loci
HE1   Exp. Heterozygosity, all SNP
HE2   Exp. Heterozygosity, poly. SNP
DST   Average pairwise distance

Population  Location  Status  N    SNP     PN     HE1    HE2    DST    Source
TAS         Tasmania  Farmed  782  218132  0.537  0.119  0.222  0.200  This study
FIN_55      Finland   Wild    137  208704  0.999  0.381  0.381  0.313  Barson et al. 2015
FIN_56      Finland   Wild    326  208704  0.999  0.380  0.380  0.310  Barson et al. 2015

Very high rate of monomorphism in the TAS population

Genetic Diversity: MAF

Allele Frequency Distribution: 106 K SNP
[Figure: two histograms, TAS and FIN, of Proportion of SNP against Minor Allele Frequency Bin (0.0 to 0.05 through 0.45 to 0.50, and ≥ 0.5)]
• 55% of polymorphic SNP had MAF [?] 15%
• Likely ascertainment bias in SNP collection

Genetic Diversity: Inbreeding F
• Reduction in heterozygosity compared with HWE
• Higher inbreeding, higher F values (PLINK v1.p)
• Expect SBP to affect F

Genetic Diversity: Inbreeding F
Distribution of Individual Inbreeding Coefficient (F)
[Figure: Proportion of Animals (%) against Inbreeding Coefficient (F) Bin, for Founders 2001 - 2003 (n=131) and 2010 Year Class (n=100)]

Linkage Disequilibrium
SNP pair spacing on the chip
• LD as r2, range 0 - 1
• Min Gap: 1 bp
• Max Gap: 450,156 bp
• Average Gap: 22 Kb +/- 35 Kb
[Figure: schematic of two SNP on Ssa01, SNP1 (A/T) and SNP2 (G/C), with Allele_A and Allele_B; histogram of Number of SNP Pairs against Distance Between Adjacent SNP (Kb)]

Linkage Disequilibrium
LD Decay Comparing Two SNP sets

SNP Set Properties              All SNP   High MAF SNP
SNP Number                      106,492   21,372
MAF average                     0.167     0.337
Average SNP Pair Distance       88,906    200,082
Total SNP Pairs                 932,726   126,664
Average Pairs Per Distance Bin  1,865     254

[Figure: Linkage Disequilibrium (r2) against Marker Distance (Kb), 0 to 500 Kb, for All SNP and High MAF SNP]

Very high LD at short distances; ascertainment bias not the cause.
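
The slides report LD as r2 but do not say how it was estimated. As a minimal sketch, assuming unphased genotypes coded as 0/1/2 copies of one allele, r2 between two SNP can be taken as the squared Pearson correlation of those allele counts. This is a common estimator and not necessarily the one used for these results; the function name `genotype_r2` and the toy genotypes are illustrative only.

```python
from statistics import mean


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNP coded as 0/1/2 allele counts.

    Missing genotypes (None) are dropped pairwise. Returns NaN when fewer
    than two animals remain or when either SNP is monomorphic.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan")
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation at one SNP, r2 is undefined
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return (sxy * sxy) / (sxx * syy)


# Toy genotypes (made up): the second SNP mirrors the first, the third is unrelated.
snp1 = [0, 1, 2, 0, 1, 2]
snp2 = [2, 1, 0, 2, 1, 0]
snp3 = [0, 2, 0, 2]
snp4 = [0, 0, 2, 2]

r2_linked = genotype_r2(snp1, snp2)
r2_unlinked = genotype_r2(snp3, snp4)
```

Because r2 is a squared correlation, the mirrored pair scores the same as an identical pair, which is why the range on the slide is 0 to 1.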

Linkage Disequilibrium: Average r2 Across Distance Bins

                      Tasmanian Population                      Finnish Population
Marker Distance (kb)  106K SNP  21K SNP   106K SNP  106K SNP    106K SNP  21K SNP
                      All Fish  All Fish  Males     Females     All Fish  All Fish
0 to 10               0.540     0.441     0.541     0.541       0.037     0.032
10 to 20              0.412     0.361     0.414     0.414       0.027     0.025
20 to 30              0.363     0.327     0.366     0.366       0.024     0.022
30 to 40              0.334     0.303     0.335     0.335       0.022     0.021
40 to 50              0.312     0.285     0.314     0.314       0.021     0.020
50 to 100             0.270     0.248     0.272     0.272       0.019     0.017
100 to 200            0.211     0.215     0.212     0.213       0.017     0.016
200 to 300            0.171     0.177     0.173     0.173       0.016     0.014
300 to 500            0.131     0.153     0.133     0.133       0.015     0.014

LD dramatically higher in TAS (farmed) versus FIN (wild) population.
Should translate into high power for GWAS to tag haplotypes.
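
The distance-bin averages in the table can be sketched as a simple grouping step once r2 is available for each SNP pair. The code below is illustrative only: the bin edges are copied from the table, but the half-open bin convention, the function name `mean_r2_by_bin`, and the toy pair values are assumptions rather than details given on the slides.

```python
def mean_r2_by_bin(pair_stats, bin_edges_kb):
    """Average r2 per marker-distance bin.

    pair_stats   : iterable of (distance_bp, r2) tuples, one per SNP pair
    bin_edges_kb : increasing bin edges in Kb, e.g. [0, 10, 20, ...]
    Returns a list of (label, mean_r2_or_None, n_pairs), one entry per bin.
    Pairs beyond the last edge are ignored.
    """
    n_bins = len(bin_edges_kb) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for distance_bp, r2 in pair_stats:
        dist_kb = distance_bp / 1000.0
        for b in range(n_bins):
            # Assumed half-open bins [lower, upper): exactly 10 Kb falls in "10 to 20".
            if bin_edges_kb[b] <= dist_kb < bin_edges_kb[b + 1]:
                sums[b] += r2
                counts[b] += 1
                break
    rows = []
    for b in range(n_bins):
        label = f"{bin_edges_kb[b]} to {bin_edges_kb[b + 1]}"
        avg = sums[b] / counts[b] if counts[b] else None
        rows.append((label, avg, counts[b]))
    return rows


# Bin edges (Kb) as they appear in the table.
SLIDE_BIN_EDGES_KB = [0, 10, 20, 30, 40, 50, 100, 200, 300, 500]

# Made-up (distance_bp, r2) pairs, used only to exercise the function.
toy_pairs = [(2_000, 0.6), (8_000, 0.4), (15_000, 0.3), (450_000, 0.1), (900_000, 0.9)]
rows = mean_r2_by_bin(toy_pairs, SLIDE_BIN_EDGES_KB)
```

Returning the pair count alongside each mean makes it easy to see when a bin average rests on very few pairs.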
Average Gap between adjacent SNP on the chip: 22 Kb +/- 35 Kb

Imputation: How many SNP do we need to impute with accuracy?

Year Class  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011
N           31    42    58    12    81    107   120   123   97    100   6

Reference Panel: 78 K SNP, 574 fish
Test Panel: 203 fish, set SNP to missing to create 4 panels

IMPUTATION ACCURACY

Test Panel  Imputed To  Accuracy
0.5 K SNP   78 K SNP    89%
1 K SNP     78 K SNP    92%
3 K SNP     78 K SNP    96%
5 K SNP     78 K SNP    97%

Means we can deploy GP with low density SNP genotyping and imputation.

4. CONCLUSIONS

Genetic Diversity
• Diversity levels consistent with population history
• Genome sequencing to provide an unbiased comparative estimate

Linkage Disequilibrium
• High LD at short to medium physical distances
• Good for detecting gene effects with medium density SNP chips
• Critical intervals likely to be large

Imputation
• Accuracies high due to population history
• Good for delivery of genomic prediction using low density SNP panels ($)

Next Steps:
• Characterisation of the sex determination locus
• Biological understanding of unwanted early maturation
• Consequences of domestication on genome variation
• Completed 30 x genome sequencing of 20 SBP fish

Acknowledgements
Brad Evans, Peter Kube, Natasha Botwright, Harry King, Klara Verbyla, Craig Primmer
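
Supplement to the Imputation slide: the test described there (set SNP to missing in the test panel, impute back to 78 K, compare with the real genotypes) can be sketched as below. The slides name neither the imputation software nor the accuracy measure, so this sketch assumes accuracy means the proportion of masked genotypes recovered exactly. It also uses a deliberately naive stand-in imputer (the most common genotype in the reference panel) so that it runs on its own. All function names and the toy panels are hypothetical.

```python
from collections import Counter


def mask_to_low_density(genotypes, keep_idx):
    """Copy one animal's genotypes, setting every SNP not in keep_idx to None."""
    keep = set(keep_idx)
    return [g if i in keep else None for i, g in enumerate(genotypes)]


def modal_genotype_imputer(reference_panel):
    """Toy stand-in for real imputation software (an assumption, not the
    method used in the talk): fill each missing SNP with the most common
    genotype at that SNP in the reference panel."""
    n_snp = len(reference_panel[0])
    modes = [
        Counter(animal[s] for animal in reference_panel).most_common(1)[0][0]
        for s in range(n_snp)
    ]

    def impute(masked):
        return [modes[i] if g is None else g for i, g in enumerate(masked)]

    return impute


def imputation_concordance(test_panel, keep_idx, impute):
    """Proportion of masked genotypes in the test panel that are recovered exactly."""
    keep = set(keep_idx)
    correct = total = 0
    for truth in test_panel:
        imputed = impute(mask_to_low_density(truth, keep_idx))
        for i, (t, g) in enumerate(zip(truth, imputed)):
            if i in keep:
                continue  # SNP on the low-density panel: observed, not imputed
            total += 1
            correct += int(t == g)
    return correct / total if total else float("nan")


# Made-up panels: 3 reference animals, 2 test animals, 4 SNP coded 0/1/2.
reference = [[0, 1, 2, 0], [0, 1, 2, 1], [0, 2, 2, 0]]
test_animals = [[0, 1, 2, 0], [1, 1, 2, 1]]
low_density_snp = [0]  # only the first SNP is kept on the low-density panel

accuracy = imputation_concordance(
    test_animals, low_density_snp, modal_genotype_imputer(reference)
)
```

Passing the imputer in as a function keeps the masking and scoring steps unchanged if a real imputation tool is swapped in for the toy one.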