Supporting information for

Safeguarding our genetic resources with libraries of doubled-haploid lines

Albrecht E. Melchinger*, Pascal Schopp*, Dominik Müller*, Tobias A. Schrag*, Eva Bauer†, Sandra Unterseer†, Linda Homann*, Wolfgang Schipprack*, Chris-Carolin Schön†

* Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Fruwirthstraße 21, 70593 Stuttgart, Germany
† Technische Universität München, TUM School of Life Sciences Weihenstephan, Liesel-Beckmann-Straße 2, 85354 Freising, Germany

Correspondence should be sent to:
Prof. Dr. A. E. Melchinger
University of Hohenheim
Institute of Plant Breeding, Seed Science and Population Genetics
Fruwirthstraße 21
70593 Stuttgart, Germany
E-mail: [email protected]
or
Prof. Dr. Chris-Carolin Schön
Technische Universität München, Plant Breeding
Liesel-Beckmann-Straße 2
85354 Freising, Germany
E-mail: [email protected]

Supporting information Melchinger et al

CONTENTS

Supplemental Figures
Figure S1 Working steps in production of doubled-haploid lines.
Figure S2 Gene diversity in the original landraces and the DH libraries derived from them.
Figure S3 Tests for Hardy-Weinberg equilibrium in the original landraces.
Figure S4 Comparison of allele frequencies in the original landraces and doubled-haploid (DH) libraries derived from them.

Supplemental Tables
Table S1 Success rates in production of doubled-haploid lines from landraces versus elite germplasm.
Table S2 Costs in production of DH lines from individual landraces.

Supplemental Notes
File S1 Process of DH production.
File S2 Permutation tests for statistics comparing two populations across multiple markers simultaneously.
File S3 Theory for effects of hitchhiking under selection at the haploid and/or diploid stage during the development of doubled-haploid (DH) lines.
File S4 Translation of allele designation between different SNP arrays.
Figure S1 Working steps in production of doubled-haploid lines, for details see Supporting File.

Figure S2 Nei’s (1973) gene diversity Hs in (A) the original landrace (S0 generation) and (B) doubled-haploid (DH) lines (D1 generation) derived from them, averaged across all markers in a sliding window of 10 Mb width along the chromosomes for five European flint maize landraces (BU, GB, RT, SC, SF). The heat map at the bottom, calculated on the basis of the 28,133 SNPs analyzed, indicates the marker density within the window (Mb⁻¹). Centromeres are indicated by grey vertical lines. (C) Genome-wide means of Hs values for the S0 and D1 generation.

Figure S3 Fisher’s exact test for deviations from Hardy-Weinberg equilibrium in the original landrace (S0 generation), averaged across markers in a sliding window of 10 Mb width along the chromosomes for five European maize landraces (BU, GB, RT, SC, SF). The heat map at the bottom, calculated on the basis of the 28,133 SNPs analyzed, indicates the marker density within the window (Mb⁻¹). Centromeres are indicated by grey vertical lines.

Figure S4 Allele frequencies in the original landrace (S0 generation) plotted against the corresponding allele frequencies in the population of doubled-haploid (DH) lines (D1 generation) derived from it for each of five European maize landraces (BU, GB, RT, SC, SF), shown for the 28,133 SNPs analyzed in this study. Allele frequencies refer to the major allele determined in the combined data set.

Table S1 Success rates in different stages i of production of doubled-haploid (DH) lines for five European maize landraces and elite crosses from the flint germplasm pool. For definition of stages and counts Ni in each stage, see Figure S1.
Success rate in different stages† (%)

Source germplasm            N2/N1    N3/N2    N4/N3    N5/N4    N6/N5    N7/N6    N8/N7
Landraces (LR)
  Gelber Badischer (GB)     1.38d    77.2c    93.2c    90.1b    21.5c    60.4a    44.8b
  Rheintaler (RT)           2.03c    80.0c    88.3d    91.7a    46.2a    30.6c    51.8b
  Strenzfelder (SF)         3.02b    83.4b    92.2c    93.0a    29.2b    46.0b    50.4b
  Satu Mare (SM)            2.09c    79.5c    93.4c    93.7a    20.7c    55.5a    56.5b
  Walliser (WA)             3.43a    87.0a    95.1b    82.5c    22.1c    58.1a    57.1b
  Mean                      2.39*    81.4     92.4*    90.2*    27.9     50.1**   52.1**
Elite crosses (EC)          2.72a    83.3a    97.0a    93.3a    26.7a    71.0a    81.0a

† Values followed by the same letter are not significantly different at Bonferroni-corrected P < 0.05.
*, ** Means of the landraces and the elite crosses differed at P < 0.05 and P < 0.01, respectively.

Table S2 Production costs† of one doubled-haploid (DH) line for five European maize landraces (GB, RT, SF, SM, WA) on the basis of the success rates shown in Table S1.

Column groups: 10³ × costs† per unit (Labor†, Consum.); units‡ required in stage i for obtaining one propagatable D1 line (Units GB to WA); costs† of working step (Costs GB to WA).

Stage i§  Activity from stage i to i+1                                    Labor†     Consum.   Units GB   Units RT   Units SF   Units SM   Units WA   Costs GB  Costs RT  Costs SF  Costs SM  Costs WA
1         Production of induction crosses, GB                             12.02      9.45¶     2,066.23                                               41.32
1         Production of induction crosses, RT                             15.67      12.24¶               1,112.01                                              28.92
1         Production of induction crosses, SF                             8.37       6.55¶                           735.26                                               10.22
1         Production of induction crosses, SM                             6.12       4.83¶                                      1,137.77                                            11.60
1         Production of induction crosses, WA                             10.20      8.05¶                                                 626.85                                             10.66
2         Identification of haploid seeds                                 194.82     0.00      28.55      22.65      22.22      23.72      21.47      5.17      4.10      4.03      4.31      3.91
3         Germination, colchicine treatment, transplanting to jiffy pots  423.87     175.39    22.00      18.14      18.46      18.89      18.68      12.29     10.10     10.34     10.54     10.44
4         Transplanting from greenhouse to field                          339.18     85.65     20.50      15.99      17.07      17.60      17.82      8.11      6.32      6.75      6.98      7.04
5         Verification of true H/DH plants and rogueing of F1 plants      142.65     0.00      2.04       1.29       1.18       1.07       3.11       0.27      0.17      0.16      0.15      0.42
6         Isolation of H/DH plants                                        215.10     13.52     18.46      14.71      15.89      16.53      14.71      3.93      3.12      3.38      3.52      3.12
7         Pollination of fertile H/DH plants                              872.97     46.58     3.97       6.77       4.62       3.43       3.22       3.40      5.80      3.96      2.93      2.77
8         Harvest of D1 ears                                              1,395.48   3.00      2.36       2.04       2.15       1.93       1.93       3.12      2.69      2.77      2.47      2.45
9         Self-pollination of D1 lines                                    4.19¶      3.86      1.07       1.07       1.07       1.07       1.07       8.06      8.06      8.06      8.06      8.06
          Total                                                                                                                                       85.67     69.28     49.67     50.56     48.87

† Costs (in USD) are based on wages, machinery, consumables and land rent in Germany.
‡ Units refer to seeds, seedlings or plants.
§ For detailed description see Materials and Methods.
¶ Includes taxes and proportional costs for handling, travel, shipping etc. (~24% of total costs).
# Induction rate of 7.5%.

Supporting Files

File S1: Process of DH production.

The entire production process of doubled-haploid (DH) lines by the in vivo haploid method applied in our study (Prigge and Melchinger 2012) can be subdivided into the following eight steps (see also Figure S1):

1. Provision of seeds from induction crosses produced by emasculating plants from the source germplasm (female parent) and pollinating them with pollen from inducer UH400 (https://plant-breeding.uni-hohenheim.de/84531); harvesting of all seeds from each induction cross in bulk.
2. Identification of all putative haploid seeds in each induction cross by selecting seeds that show (i) purple coloration of the aleurone, which confirms expression of the R1-nj marker gene, and (ii) absence of a purple scutellum on the embryo.
3. Germination of putative haploid seeds in a growth cabinet at 28 °C and 90% humidity for 3 to 5 days; treating the seedlings with colchicine for 8 h after cutting their coleoptile tips; subsequently, transplanting the seedlings into jiffy pots filled with soil and cultivation in the greenhouse until growth stage V3 (Abendroth et al. 2011).
4. Transplanting of the surviving plants into the field.
5. Verification of genuine haploid (H) or doubled-haploid (DH) plants on the basis of visual scoring (compared with the hybrid phenotype, the H/DH phenotype is characterized by a shorter stature, erect and narrow leaves and reduced growth and fertility) and rogueing of false positives (F1 plants resulting from hybrid seeds of the induction cross that were misclassified due to absence of a purple scutellum on the embryo) before pollination.
6. Shoot bagging and self-pollination of D0 plants that produced both silks and filled anthers.
7. Harvest of D1 ears with seed set.
8. Growing the seeds of D1 ears ear-to-row; checking the D1 lines for phenotypic uniformity; elimination of off-types; line multiplication by self-pollination of individual plants in each row.
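To illustrate how losses accumulate over these steps, the following minimal sketch chains the step-wise success rates reported for the landrace Gelber Badischer (GB) in Table S1 and works backwards from one unit in the final stage to the number of units needed in each earlier stage. The function name `units_required` and the convention that exactly one unit is required in the last stage are assumptions made for illustration, so the printed values are not expected to coincide exactly with the units listed in Table S2.

```python
def units_required(success_rates):
    """Units needed in each stage so that one unit remains in the last stage.

    success_rates[k] is the fraction of units that pass from stage k+1
    to stage k+2 (i.e. N_{i+1} / N_i, given here as a fraction, not in %).
    """
    units = [1.0]  # assumption: exactly one unit in the final stage
    # Walk backwards: the previous stage needs (units in next stage) / rate.
    for rate in reversed(success_rates):
        units.append(units[-1] / rate)
    units.reverse()
    return units


# Success rates N2/N1 ... N8/N7 for GB, taken from Table S1 (as fractions).
GB_RATES = [0.0138, 0.772, 0.932, 0.901, 0.215, 0.604, 0.448]

gb_units = units_required(GB_RATES)
for stage, n_units in enumerate(gb_units, start=1):
    print(f"stage {stage}: {n_units:10.2f} units")
```

Because the rates are multiplied, the low haploid identification rate in the first step dominates the number of seeds that must be provided at the start.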
We recorded for each induction cross the number $N_i$ of units (seeds, seedlings, plants, D0 plants, D1 ears with seed set, propagated D1 lines) present in each stage ($i = 1, \ldots, 8$) and determined the success rate for each working step $i$ as the ratio $SR_i = N_{i+1} / N_i$. Together with the production costs for each step, the expected total production costs $TCosts$ per D1 line for each landrace and the elite crosses were calculated as follows:

$$TCosts = \sum_{i=1}^{8} C_i n_i$$

Here, $C_i$ refers to all variable costs per unit (plants in induction cross, seeds, seedlings, plants or lines) in stage $i$, and $n_i = N_i / N_8$ refers to the number of units required in working step $i$ in order to obtain one D1 line. In stages $i = 1$ and 2, the costs $C_i$ varied among the different source germplasm depending on the efforts required for sorting of haploid and hybrid seeds in induction crosses due to variable expression levels of the R1-nj marker. After stage 2, costs $C_i$ were identical for all source germplasm except for isolation of silks and pollination in the landraces (working step 5), but the success rates differed among landraces and the elite materials. Costs of labor per processed unit were based on long-term data gathered in the maize breeding program of the University of Hohenheim (W. Schipprack, unpublished data, 2016) and are based on current wages and costs for consumables as well as land rent in Germany. Costs of induction crosses and line multiplication by selfing in the winter nursery in Chile were taken from the price list of companies offering this service.

File S2: Permutation tests for statistics comparing two populations across multiple markers simultaneously.
When two population samples are compared for a statistic (e.g., the FST statistic or the absolute difference in allele frequencies) calculated from the allele frequencies at marker set M, the following problems can exist:

1. The allele frequencies at different loci in M are not stochastically independent. This occurs, for example, if markers are in linkage disequilibrium.
2. The two populations differ in their population structure. In our study, the S0 and D1 generations differ in their degree of homozygosity: DH lines, subsequently referred to as D1 lines, are completely homozygous and, hence, both parental gametes are identical, whereas the parental gametes of S0 genotypes can be assumed to be stochastically independent, because the S0 generation was produced by random mating within each landrace.

A solution to Problem 1 can be obtained by a permutation test, in which the test statistic is calculated as a function (in our case as the mean) of all markers in set M. M could be (i) a single marker, (ii) all markers in a given bin, or (iii) all markers over the entire genome. To obtain the distribution of the test statistic under the null hypothesis H0 (the two populations compared do not differ in the allele frequencies at all markers in set M), the test statistic is calculated for each of a large number (N = 10,000 in our study) of permutations of the genotypes from the two populations. Comparing the observed value of the test statistic with its distribution obtained for the permutations yields the corresponding P-value. An advantage of this test is that it can be applied irrespective of whether the markers in set M are stochastically independent or not.
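A minimal sketch of such a permutation test is given below. It assumes genotypes coded as counts (0, 1, 2) of a reference allele in NumPy arrays with individuals in rows and markers in columns, uses the mean absolute difference in allele frequencies across the markers in set M as the test statistic, and reports a one-sided P-value with the common "+1" correction. The function names, the genotype coding, the toy data and the P-value convention are illustrative assumptions, not the implementation used in the study, and the sketch addresses Problem 1 only.

```python
import numpy as np


def mean_abs_freq_diff(geno_a, geno_b):
    """Mean absolute allele-frequency difference across all markers."""
    freq_a = geno_a.mean(axis=0) / 2.0  # allele counts 0/1/2 -> frequency
    freq_b = geno_b.mean(axis=0) / 2.0
    return float(np.mean(np.abs(freq_a - freq_b)))


def permutation_test(geno_a, geno_b, n_perm=10_000, seed=0):
    """Permute whole individuals between the two samples.

    Shuffling complete genotype rows keeps the dependence among markers
    (e.g. linkage disequilibrium) intact within each individual.
    """
    rng = np.random.default_rng(seed)
    observed = mean_abs_freq_diff(geno_a, geno_b)
    pooled = np.vstack([geno_a, geno_b])
    n_a = geno_a.shape[0]
    exceed = 0
    for _ in range(n_perm):
        order = rng.permutation(pooled.shape[0])
        stat = mean_abs_freq_diff(pooled[order[:n_a]], pooled[order[n_a:]])
        if stat >= observed:
            exceed += 1
    # "+1" correction so that the estimated P-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 40 individuals x 50 markers per sample.
toy_rng = np.random.default_rng(1)
sample_a = toy_rng.binomial(2, 0.3, size=(40, 50))
sample_b = toy_rng.binomial(2, 0.3, size=(40, 50))        # same frequencies
sample_shifted = toy_rng.binomial(2, 0.7, size=(40, 50))  # different frequencies

obs_null, p_null = permutation_test(sample_a, sample_b, n_perm=500)
obs_alt, p_alt = permutation_test(sample_a, sample_shifted, n_perm=500)
print(f"same frequencies:    statistic={obs_null:.3f}, P={p_null:.3f}")
print(f"shifted frequencies: statistic={obs_alt:.3f}, P={p_alt:.3f}")
```

Restricting the columns of the genotype arrays to a single marker, a bin or the whole genome corresponds to the three choices of M named above.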
A solution to Problem 2 was obtained by using so-called pseudo-S0 (PS0) individuals in the permutation test instead of D1 lines when calculating the test statistic for the different permutations, in order to obtain its distribution under the null hypothesis that the S0 and D1 generations do not differ in their allele frequencies at marker set M. PS0 individuals are obtained by sampling at random, without replacement, $\lfloor 0.5 N_{D1} \rfloor$ pairs of lines from the $N_{D1}$ D1 lines, where $\lfloor 0.5 N_{D1} \rfloor$ is the largest integer $\le 0.5 N_{D1}$. The genotype of each PS0 individual is obtained from the genotypes of the two “parental” D1 lines used in its formation and corresponds exactly to the genotype that would be obtained if the two gametes from which the two D1 lines originated had been combined in the S0 generation by random mating. Thus, the PS0 genotypes have exactly the same allele frequencies as the original D1 lines (except for minor deviations if $N_{D1}$ is odd) and have the same population structure as the S0 generation, from which the D1 lines were generated. Consequently, the S0 and PS0 populations can be compared in permutation tests without complications arising from different population structure due to different degrees of homozygosity. In our study, we used in each permutation run a new set of PS0 genotypes obtained by random union of D1 lines for calculating the test statistic.

File S3: Theory for effects of hitchhiking under selection at the haploid and/or diploid stage during the development of doubled-haploid (DH) lines.

Let $p_1$ and $q_1$ be the allele frequencies of alleles A and a at the A locus and let $p_2$ and $q_2$ be the allele frequencies of alleles B and b at the B locus in the array of gametes used for production of doubled-haploid (DH) lines, i.e., before selection.
Let $p_1^*$, $q_1^*$, $p_2^*$ and $q_2^*$ be the corresponding allele frequencies in the population of doubled-haploid (DH) lines produced from them, i.e., after selection. The latter correspond to the frequencies of genotypes AA, aa, BB and bb, respectively, in the D1 generation in this study. Let $D$ denote the linkage disequilibrium before selection.

We assume that the A locus is subject to selection during the DH process either already at the haploid level (e.g. the haploid embryo does not survive) or at the diploid level (e.g. the diploid DH plant is not fertile), whereas the B locus is selectively neutral. Let $w_1$ be the “overall” fitness of genotype AB or Ab at the haploid stage and genotype AABB or AAbb at the diploid stage and $w_2$ be the “overall” fitness of genotypes aB and ab or aaBB and aabb.

Then, we get the following table for the two-locus frequencies before selection:

B locus                                        Frequencies    A locus: A (haploid), AA (diploid)    A locus: a (haploid), aa (diploid)
B (haploid), BB (diploid)                      $p_2$          $p_1 p_2 + D$                         $q_1 p_2 - D$
b (haploid), bb (diploid)                      $q_2$          $p_1 q_2 - D$                         $q_1 q_2 + D$
Frequencies of gametes or DH lines                            $p_1$                                 $q_1$
Fitness of gametes/DH genotypes                               $w_1$                                 $w_2$

Then, the average fitness of the population at the haploid or diploid homozygous state before selection is $\bar{w} = w_1 p_1 + w_2 q_1$.

Defining $v_1 = w_1 / \bar{w}$ and $v_2 = w_2 / \bar{w}$, we obtain after selection the following frequencies at the A locus:

$$p_1^* = p_1 v_1 \quad \text{and} \quad q_1^* = q_1 v_2 . \tag{1}$$

For the B locus, we get the following frequencies of DH lines after selection:

$$p_2^* = (p_1 p_2 + D) v_1 + (q_1 p_2 - D) v_2 = p_2 + D (v_1 - v_2), \tag{2}$$

$$q_2^* = (p_1 q_2 - D) v_1 + (q_1 q_2 + D) v_2 = q_2 + D (v_2 - v_1). \tag{3}$$
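As a quick plausibility check, the sketch below applies selection directly to the four two-locus frequencies of the table above and compares the resulting allele frequencies with the closed forms in Eqs. (1) to (3). All numerical values (allele frequencies, $D$ and the fitness values) and the function name are arbitrary choices made for illustration only.

```python
def select_two_locus(p1, p2, D, w1, w2):
    """Apply selection at the A locus to the two-locus frequencies.

    Returns the allele frequencies of A and B after selection and the
    mean fitness before selection.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    # Two-locus frequencies before selection (rows/columns of the table).
    freq = {
        "AB": p1 * p2 + D,
        "Ab": p1 * q2 - D,
        "aB": q1 * p2 - D,
        "ab": q1 * q2 + D,
    }
    # Fitness depends only on the allele carried at the A locus.
    fitness = {"AB": w1, "Ab": w1, "aB": w2, "ab": w2}
    w_bar = sum(freq[g] * fitness[g] for g in freq)
    after = {g: freq[g] * fitness[g] / w_bar for g in freq}
    p1_star = after["AB"] + after["Ab"]
    p2_star = after["AB"] + after["aB"]
    return p1_star, p2_star, w_bar


# Arbitrary illustrative values (not taken from the study).
p1, p2, D, w1, w2 = 0.8, 0.6, 0.05, 1.0, 0.4
p1_star, p2_star, w_bar = select_two_locus(p1, p2, D, w1, w2)
v1, v2 = w1 / w_bar, w2 / w_bar

print(f"p1* direct = {p1_star:.6f}, Eq. (1) = {p1 * v1:.6f}")
print(f"p2* direct = {p2_star:.6f}, Eq. (2) = {p2 + D * (v1 - v2):.6f}")
print(f"q2* direct = {1 - p2_star:.6f}, Eq. (3) = {(1 - p2) + D * (v2 - v1):.6f}")
```

The direct calculation and the closed forms agree up to floating-point rounding, and setting `D = 0.0` leaves the frequency at the neutral B locus unchanged.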
For the change in frequencies after selection, we get:

A locus: $$\Delta p_1 = p_1^* - p_1 = p_1 (v_1 - 1) \tag{4}$$

B locus: $$\Delta p_2 = p_2^* - p_2 = D (v_1 - v_2) \tag{5}$$

The linkage disequilibrium after selection is:

$$D^* = (p_1 p_2 + D) v_1 - p_1^* p_2^* = (p_1 p_2 + D) v_1 - p_1 v_1 \left[ p_2 + D (v_1 - v_2) \right]$$

$$D^* = D v_1 \left[ 1 - p_1 (v_1 - v_2) \right]$$

and for the change in linkage disequilibrium after selection, we get:

$$\Delta D = D^* - D = D \left\{ v_1 \left[ 1 - p_1 (v_1 - v_2) \right] - 1 \right\} = D \left[ (v_1 - 1) - v_1 p_1 (v_1 - v_2) \right] \tag{6}$$

From this result, we can draw the following conclusions:

1. If a selectively neutral locus (B locus) has linkage disequilibrium $D$ with a locus (A locus) under selection at the haploid and/or diploid homozygous state during production of DH lines, it follows from Eqn. (5) that the change in allele frequency at the B locus ($\Delta p_2$) is a linear function of $D$. Thus, if $D = 0$, i.e., both loci are in linkage equilibrium, the allele frequency at the selectively neutral locus will not change.

2. Suppose the allele a at locus A is lethal, i.e., $w_2 = v_2 = 0$, $\bar{w} = w_1 p_1$, and $v_1 = \dfrac{w_1}{w_1 p_1} = \dfrac{1}{p_1}$. Thus, $p_1^* = 1$, $p_2^* = p_2 + \dfrac{D}{p_1}$, and we get $\Delta p_2 = \dfrac{D}{p_1}$.

A lethal allele a will most likely be close to extinction, i.e., $p_1 \approx 1.0$. Hence, $\Delta p_2 \approx D$ and the change in allele frequency at this locus depends almost exclusively on $D$. In our study (see Figure 3), the decay of linkage disequilibrium, measured as $r^2$, reached $r^2 = 0.10$ (i.e., $r = 0.3162$) at a physical distance of 3 Mb between loci. Assuming without loss of generality $r \ge 0$, we get from the equation $r = \dfrac{D}{\sqrt{p_i q_i p_j q_j}}$:

$$D = r \sqrt{p_i q_i p_j q_j} \le 0.25\, r .$$

Thus, for a selectively neutral gene that is 3 Mb distant from a lethal allele, the expected change in allele frequency is $\Delta p_2 \approx 0.25 \times 0.3162 = 0.079$ at maximum, i.e., very small.

File S4: Translation of allele designation between different SNP arrays.
For the set of 36,209 SNPs in common between markers of the class “PolyHighResolution” on the 600k Affymetrix Axiom® Maize Genotyping Array (Unterseer et al. 2014) and the 50k Illumina® MaizeSNP50 BeadChip, a “translation” of allele coding was necessary because in about half of the cases, the two array platforms targeted opposite strands of the template DNA. The “translation” of the allele coding used in the manufacturer’s annotation file of the Affymetrix 600k chip to the “Forward” allele coding given in the allele report table of the Illumina 50k chip was based on data from 29 maize inbred lines from a sequence variant discovery panel (3), for which genotyping data from both arrays were available. First, we identified for each SNP the major allele in the data set of these lines for the 600k chip and identified a relationship to one of the alleles on the 50k chip using as a basis the subset of lines carrying this allele in the 600k data set. Afterwards, the second allele on the 600k chip was assigned to the second allele on the 50k chip. Second, on the basis of this “translation” rule, we translated the 600k genotype data of all 29 lines, resulting in the so-called 50kT data. Third, the 50k and 50kT genotype data were compared for each line, and only those SNPs and their “translation” were accepted for further analyses that met both of the following quality criteria:

1. Data from both the 50k and 50kT genotyping were available for at least 25 out of the 29 lines.
2. Genotyping results for the 50k and 50kT data matched for all available lines, with a maximum of one mismatch.

This yielded a set of 33,039 markers, of which 28,133 were polymorphic across the entire set of 380 genotypes and fulfilled all quality criteria described in Materials and Methods.

References

Abendroth, L. J., R. W. Elmore, M. J. Boyer, and S. K. Marlay, 2011 Corn growth and development. Iowa State University Extension, Ames.

Nei, M., 1973 Analysis of Gene Diversity in Subdivided Populations. Proc. Nat. Acad. Sci. U.S.A. 70: 3321-3323.

Prigge, V., and A. E. Melchinger, 2012 Production of Haploids and Doubled Haploids in Maize, pp. 161-172 in: Plant Cell Culture Protocols, Methods in Molecular Biology, 3rd edition, edited by V. M. Loyola-Vargas, and N. Ochoa-Alejo. Humana Press, Totowa.

Unterseer, S., E. Bauer, G. Haberer, M. Seidel, C. Knaak, et al., 2014 A powerful tool for genome analysis in maize: development and evaluation of the high density 600 k SNP genotyping array. BMC Genomics 15: 823.