Lessons learnt in implementation of genomic selection in breeding dairy cattle for New Zealand. What are the implications for numerically smaller breeds such as Guernsey?

Bevin Harris, Science Leader, LIC, Hamilton, New Zealand

Introduction

LIC has been investing in DNA technology since the early 1990s. By the mid-1990s, detection of the first quantitative trait loci (QTL) in dairy cattle had begun (Georges et al., 1995). In New Zealand (NZ) the first application of DNA information in the LIC breeding scheme was parentage testing in the mid-1990s, using microsatellite markers that cost approximately $3 (NZ) per marker to analyse. LIC invested in a QTL discovery programme that yielded a small number of QTL for the milk production traits. These QTL were used via marker assisted selection (MAS) in 1998 and 1999 within the LIC breeding programme, until it was determined that the cost of undertaking MAS was greater than its economic return (Spelman, 2002).

Application of genomic information in dairy cattle breeding schemes then moved from MAS to the use of high density marker panels, which is commonly referred to as genomic selection (Meuwissen et al., 2001). Genomic selection has become a key component of most dairy cattle breeding schemes over the past 8 years (Pryce and Daetwyler, 2012), including LIC's breeding scheme, with predominantly young sires being routinely evaluated and selected on the basis of the combination of their genomic profile and pedigree information.

Genomic breeding values are now widely used for the selection of young dairy sires. In most countries, the genomic predictions are from within-breed analyses using a single-breed reference population. Unlike other countries, NZ has purebred populations as well as a large crossbred population. Crossbreeding in the NZ dairy industry has been steadily increasing since the early 1980s. The crosses are mainly between the Holstein Friesian and Jersey breeds.
In the 2014/15 season, the proportion of crossbred heifers entering the herd was 54%, compared to 35% and 10% for Holstein Friesians and Jerseys, respectively. In 2001 progeny tested crossbred sires became available to the NZ industry. The recorded NZ dairy cattle population has been evaluated using a multi-breed animal model since 1996, allowing direct breeding value comparison of all animals regardless of breed or breed mix. The incorporation of genomic data provides additional challenges in a multi-breed analysis. Genomic relationships are a function of allele frequencies, which may differ among breeds because of different origins and selection pressures.

Since the introduction of genomics in 2008 there has been a marked increase in the number of animals genotyped and in the scale of the genomics resource that is being used to improve genomic selection. This paper outlines the experiences, challenges and progress made in the LIC genomic selection programme from 2008 to the present. It also discusses the challenges of implementing genomic selection programmes in small dairy cattle populations.

The LIC Genomic Selection 2008 through to 2016

The initial implementation of genomic selection

Meuwissen et al. (2001) first proposed the theory of genomic selection. However, it was not until 2007, after the sequencing of the bovine genome had been completed (Kappes et al., 2006) and the Illumina 50K Bovine SNP chip was released, that genomic selection became a reality for commercial breeding schemes. In NZ, in 2007, the cost of genotyping dropped from $3 (NZ) per marker for a microsatellite to less than 1 cent per marker when tens of thousands of SNP were genotyped in parallel. Genomic selection had become technically and financially feasible.

In 2007, LIC genotyped approximately 2,400 Holstein Friesian, 1,500 Jersey and 650 crossbred sires on the Illumina 50K Bovine SNP chip.
These were essentially all the historical sires used in the LIC breeding scheme that had DNA available for genotyping, plus all sires that were in the current breeding scheme as at 2007. The initial genomic selection research validated the genomic breeding values on 3 years of progeny test sires that were not included in the training population. The correlations between the genomic breeding values and the breeding values from daughter information ranged from 0.50 to 0.72. It was also determined that using the Holstein-Friesian sires to predict Jersey genomic breeding values, and vice versa, produced much poorer results than using all breeds and crossbreds in one genomic selection model. At this time the NZ reference population was approximately 3500 sires, which was considered large.

In the 2008 season teams of genomically selected sires were made available to NZ farmers alongside the teams of progeny test sires. The progeny test scheme was reduced from 300 bulls per annual intake to 160 bulls, because a large number of young bulls could now be screened on their genomic breeding values. The team of genomically selected sires included both yearling and 2 year old bulls. The sire analysts aggressively used the genomic young sires as sires of sons for the future progeny test intakes.

The initial daughter results from genomic selection

By 2010-2011 the teams of genomically selected sires were getting breeding values based on their daughter information. The daughter breeding values showed that the initial genomic breeding values had been biased upwards and that the accuracies were lower than predicted. The parent average breeding values were also biased upwards, which further compounded the genomic breeding value bias. The genomic breeding value inflation (inverse of the bias) values were around 0.7 (or 30% too high).
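The two validation statistics quoted above, the correlation and the inflation value, can be illustrated with a minimal Python sketch. It assumes that the inflation value is the slope from regressing daughter-based breeding values on genomic breeding values, a common validation convention that the text does not spell out. The function name `validation_stats` and the five-bull data set are hypothetical, chosen only so that the slope works out to 0.7.

```python
import numpy as np

def validation_stats(genomic_bv, daughter_bv):
    """Return (correlation, regression slope) for a validation group.

    The slope is from regressing daughter-based breeding values on
    genomic breeding values (an assumed definition of 'inflation').
    """
    g = np.asarray(genomic_bv, dtype=float)
    d = np.asarray(daughter_bv, dtype=float)
    g_c = g - g.mean()  # centre both vectors
    d_c = d - d.mean()
    correlation = float(g_c @ d_c / np.sqrt((g_c @ g_c) * (d_c @ d_c)))
    slope = float(g_c @ d_c / (g_c @ g_c))
    return correlation, slope

# Hypothetical validation bulls: the genomic values spread over 40 units,
# the later daughter-based values spread over only about 28 units.
genomic_bv = [10.0, 20.0, 30.0, 40.0, 50.0]
daughter_bv = [8.0, 12.0, 23.0, 26.0, 36.0]

correlation, slope = validation_stats(genomic_bv, daughter_bv)
print(f"correlation = {correlation:.3f}, slope = {slope:.2f}")
```

Under this convention a slope below 1 means that differences among genomic breeding values overstate the differences later seen in the daughter-based values.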
The lower accuracies were in part due to over-fitting of the SNP markers by the genomic selection model. Over-fitting occurs when a large number of factors is estimated from a small amount of data. The genomic breeding value model was attempting to estimate approximately 44,000 factors from 3500 sires.

The performance of the reduced progeny test scheme (160 sires) depended on the ability to screen a large number of young bulls based on their genomic breeding values. The 160 bulls are selected from approximately 2000 bull calves with genomic information. The initial pre-selection was based on a custom-made 384 SNP panel in the first year and a 50K panel for the next four years. The pre-selection process is only as good as the accuracy of the genomic selection model.

A number of initiatives were undertaken to improve the accuracy of the genomic selection model.

1. Genotyping of 3000 animals on the Illumina 777K panel. It was considered that a higher density of SNP markers would improve the accuracy of genomic selection, particularly for crossbred populations, since there would be a greater probability than with the 50K panel that at least one SNP marker would be associated with each QTL in the different breeds.
2. The size of the training population was increased. Two approaches were undertaken: genotype swapping with Australia, Ireland and CRV Ambreed, and genotyping progeny test cows. By 2011, approximately 14,000 cows had been genotyped.
3. Statistical methods were developed to control the biases; however, these methods do not change the accuracy of genomic selection.

The use of the Illumina 777K panel within the genomic selection model produced similar results to the 50K panel (Harris et al., 2011). The major problem was that the training populations used in these studies were too small to provide the statistical power needed to exploit the increase in marker density.
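The over-fitting problem described above, many more marker effects than animals, can be sketched on a toy scale in Python. This is not the LIC evaluation model: the sizes (40 animals, 400 markers), the pure-noise phenotypes, the ridge penalty and the function names are all assumptions made for illustration, and ridge regression stands in here as one simple example of a shrinkage method.

```python
import numpy as np

def min_norm_fit(X, y):
    """Unpenalised least squares; with more markers than animals,
    lstsq returns the minimum-norm solution."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def ridge_fit(X, y, lam):
    """Ridge (shrinkage) marker effects, solved in the animal dimension:
    beta = X' (X X' + lam I)^-1 y."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

rng = np.random.default_rng(1)
n_animals, n_markers = 40, 400  # toy sizes, far more markers than animals
X = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)  # 0/1/2 genotypes
y_noise = rng.normal(size=n_animals)  # phenotypes with no genetic signal at all

beta_unpenalised = min_norm_fit(X, y_noise)
beta_ridge = ridge_fit(X, y_noise, lam=float(n_markers))

# Largest absolute training residual under each fit.
train_error_unpenalised = float(np.max(np.abs(X @ beta_unpenalised - y_noise)))
train_error_ridge = float(np.max(np.abs(X @ beta_ridge - y_noise)))

print(f"unpenalised training error: {train_error_unpenalised:.2e}")
print(f"ridge training error:       {train_error_ridge:.2e}")
```

In this sketch the unpenalised fit reproduces the noise phenotypes essentially exactly, which is what over-fitting looks like on training data, while the shrinkage fit does not. Neither fit can predict new animals here, because the phenotypes contain no signal.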
Inclusion of sire genotypes from Australia and Ireland produced no increase in accuracy in NZ. Most of these sires had little or no phenotype information in NZ, so their Interbull breeding values had to be used as a proxy NZ phenotype. Any increase in accuracy from the larger training population was nullified because between-country genetic correlations below unity reduced the accuracy of the sires' proxy phenotypes. However, the inclusion of genotypes from CRV Ambreed, which had NZ phenotypes, improved the accuracy by approximately 3% (Spelman et al., 2012).

The inclusion of cow genotypes resulted in improvements in accuracy and a reduction in bias (Harris et al., 2013). The accuracies from the genomic selection model validations improved by 5% and the inflation of the genomic breeding values was reduced by 20-30%.

Increasing the accuracy of genomic selection by sequencing (towards the future) and new statistical models

In 2012, LIC undertook whole-genome sequencing of 600 Holstein-Friesian, Jersey and crossbred dairy animals with the objective of increasing the rate of genetic improvement through increased accuracy of genomic prediction. At the same time increasing numbers of females were genotyped. By 2016, approximately 120,000 cows had been genotyped.

An output of the whole-genome sequencing was a pool of 18 million SNP variants. The SNP variants can be imputed into the existing 120,000 genotyped animals. The imputed genotypes can then be used in genome wide association mapping to identify causal variants, or markers in strong association with the causal variants. The aim is to use the selected markers to increase the accuracy of genomic selection. In addition, RNA-sequencing of mammary tissue from 350 lactating cows has been undertaken to augment identification of causal variants. This allows the identification of SNP variants that are located in genes which are responsible for lactation.
These variants can augment identification of causal variants within our NZ populations. Meuwissen and Goddard (2010) reported from simulations that the increase in marker density, and also the inclusion of causative mutations from the whole genome sequence, can further increase the accuracy of genomic selection.

Another advantage of the whole genome sequencing data is the ability to identify recessive deleterious genes. To date three recessive deleterious variants in the New Zealand dairy cattle population have been discovered. The discovery of these variants has enabled LIC to include the information in software to reduce the frequency of carrier-to-carrier matings and thus reduce the frequency of affected offspring.

In the 2012/13 season a yearling genomic sire whose sire had a de novo mutation in the prolactin receptor (Littlejohn et al., 2014) was used in the genomic team. The mutation resulted in changes to the animal's coat type, heat tolerance and ability to lactate. The mutation was dominant, which means the traits were exhibited if the offspring inherited a single copy of the mutation. The causative mutation was identified quickly from genome wide association mapping of affected and unaffected individuals using the whole genome sequence data set. A genetic test was built to identify affected individuals in farmers' herds. As a consequence of the mutation, LIC decided not to use yearling sires for widespread use in the genomic teams from 2013.

A considerable worldwide effort has been made in the development of new statistical models for genomic evaluation. These include the single step approach (Misztal et al., 2014) and marker model approaches (Liu et al., 2014; Fernando et al., 2014). Such approaches are becoming computationally feasible for large dairy cattle populations.
These methods should offer increased genomic breeding value accuracy by removing statistical errors that arise from approximations required in the current statistical models.

The current performance of genomic selection in NZ, where 102,000 genotyped animals are evaluated using a hybrid single step model (Winkelman et al., 2015), has improved considerably compared to the 2008 version of genomic selection. The validation correlations now range from 0.60 to 0.85 and inflation values range from 0.9 to 1.05.

Challenges for small populations

The NZ genomic selection experience has provided insight into the key factors that contribute to a successful genomic selection programme.

1. There is no substitute for training population size: the bigger the training population, the better the genomic selection accuracy. The ideal training population size will be determined by the number of chromosome segments segregating in the population. It will be larger for multi-breed data sets and for populations that have large effective population sizes than for single breed populations with a small effective population size, such as the Holstein population.
2. Cows are an important resource for genotyping. They can contribute to the training population size. Our experience suggests that 6-8 cows provide a similar improvement in genomic accuracy to 1 progeny tested sire.
3. Across country genomic evaluation is an important tool for small populations. The Brown Swiss Intergenomics project run by the Interbull Centre in Sweden has been a very successful genomic selection implementation. Provided common sires are used across the participating countries and the between country genetic correlations are close to unity, an across country genomic evaluation will provide increased accuracy for small populations.
4. The cost of sequencing is decreasing rapidly. Sequencing key sires could be a useful activity.
The sequence could be made available to the 1000 bull genomics project. This would allow the smaller populations to leverage the data and tools provided by the 1000 bull genomics project to help analyse their sequence data. The outcomes could be finding SNP markers that provide higher genomic prediction accuracies than the standard Bovine SNP panels, and the ability to detect and manage deleterious genes/variants.

Conclusions

The introduction of genomic information into the LIC dairy cattle breeding scheme has been a steep learning curve over the last nine years. Initially, dairy farmers that utilised the new technology did not benefit to the degree that was expected, which is not an uncommon situation with new technology. Increased investment by LIC was required to improve the accuracy of genomic selection in NZ. Further investment in sequencing and the implementation of new statistical methods is expected to continue to improve the accuracy of genomics, and it is expected that future breeding schemes will utilise increased levels of genomic information at the expense of progeny testing.

In the genomics era small populations have an even greater difficulty competing with large populations such as Holstein. The key is to have the largest training population possible. This may require genotyping bulls and cows, and sharing genotypes and phenotypes across countries. Providing sequence data to the 1000 bull genomics project could provide benefits by pooling sequence data across multiple breeds, and would allow small populations to be part of large across breed genome wide association studies.

References

Fernando, R. L., Dekkers, J. C. M. and Garrick, D. J. (2014). GSE 46.
Georges, M., Nielsen, D., Mackinnon, M., et al. (1995). Genetics 139.
Harris, B. L., Creagh, F., Winkelman, A. M., et al. (2011). Interbull Bull. 44.
Harris, B. L., Winkelman, A. M. and Johnson, D. L. (2013). Interbull Bull. 46.
Kappes, S. M., Green, R. D. and van Tassell, C. P. (2006). In Proc. 8th WCGALP.
Littlejohn, M. D., Henty, K. M., Tiplady, K., Johnson, T., et al. (2014). Nat. Comm.
Liu, Z., Goddard, M. E., Reinhardt, F. and Reents, R. (2014). J. Dairy Sci.
Meuwissen, T. H., Hayes, B. J. and Goddard, M. E. (2001). Genetics 157(4).
Meuwissen, T. and Goddard, M. (2010). Genetics 185(2).
Misztal, I., Legarra, A. and Aguilar, I. (2014). J. Dairy Sci.
Pryce, J. E. and Daetwyler, H. D. (2012). Animal Production Science 52(3).
Spelman, R. J. (2002). In Proc. 7th WCGALP.
Spelman, R. J., Keehan, M. D., Obolonkin, V., Winkelman, A. M., Johnson, D. L. and Harris, B. (2012). Interbull Bull. 45.