Blowing in the Wind: The Quest for Accurate Crop Variety Identification in Field Research, with an Application to Maize in Uganda

TALIP KILIC
Senior Economist & Survey Methods Team Leader
Living Standards Measurement Study
Development Data Group – Survey Unit – World Bank
[email protected]

Co-Authors: JOHN ILUKOR, JAMES STEVENSON, SYDNEY GOURLAY, FREDERIC KOSMOWSKI, ANDRZEJ KILIAN, JULIUS PYTON SSERUMAGA, AND GODFREY ASEA

International Consortium on Applied Bioeconomy Research (ICABR) Conference
Berkeley, CA – May 31, 2017

Motivation
• Accurate identification of crop varieties grown by farmers is key to estimating
  – levels of improved variety cultivation
  – ensuing impacts on production, productivity, and a range of welfare and nutrition outcomes
• Empirical evidence is central to justifying investments in crop R&D and support to seed systems
• Among farmers, correct information is essential to their adoption & management decisions

Prevailing Approaches to Variety Identification
• Extent of underinvestment in methodological innovation for accurate variety identification is puzzling
• Literature on adoption & impacts has usually relied on expert estimates and/or farmer-reported survey data on
  – Variety names
  – Improved vs. traditional status of a cultivated variety
  – Hybrid vs. OPV status of a cultivated variety
• Why worry?
  – Weaknesses in extension & formal seed systems
  – Reliance on informal channels of seed acquisition
  – Variety naming systems that exhibit variation across time & space
• Limited empirical evidence on the accuracy of prevailing approaches to variety identification (& implications of measurement error in impact evaluation)

Our Contribution
• Implemented a survey experiment in Eastern Uganda to test the relative accuracy of subjective approaches to maize variety identification compared to DNA fingerprinting
• Compiled a reference library of improved varieties in Uganda that serves as a key input into DNA fingerprinting as well as the assessment of commercial seed quality

MAPS: Methodological Experiment on Measuring Maize Productivity, Soil Fertility, and Variety

Support
• LSMS Minding the (Agricultural) Data Gap Research Program, funded by UK Aid
• Global Strategy to Improve Agricultural and Rural Statistics, housed at FAO
• World Bank Innovations in Big Data and Analytics Program
• World Bank Trust Fund for Statistical Capacity Building

Primary Objectives
• Test subjective approaches to measurement vis-à-vis objective methods for maize yield measurement, soil fertility assessment & maize variety identification

Partnerships
• Uganda Bureau of Statistics (Implementing Agency), World Agroforestry Centre (Soil Fertility), CGIAR Standing Panel on Impact Assessment (Variety Identification), Stanford University & Terra Bella (Remote Sensing)

Round I (First Agricultural Season of 2015)
• Post-Planting Fieldwork: April-June 2015
• Crop Cutting Fieldwork: June-August 2015
• Post-Harvest Fieldwork: September-November 2015

Round II (First Agricultural Season of 2016)
• Identical timeline & visit structure
• Follow-up to a subset of Round I households (540 out of 900)

MAPS Sample

Round I
Enumeration Area (EA) Selection
• 45 EAs from a 400 km² remote sensing tasking area (Iganga & Mayuge)
• 15 EAs in each of Serere & Sironko districts
Household Selection
• Original Plan: 6 pure stand &
6 intercropping households selected at random in each EA following listing
  – 450 in each universe
• Result: 385 vs. 515 split
  – inadequate # of pure stand HHs in select EAs
  – 249 vs. 291 split in Iganga & Mayuge
Plot Selection
• Survey Solutions CAPI application to randomly select one plot per household

Round II
• Follow-up to 540 households in Iganga & Mayuge
• Analysis sample: 440 households with crop cuts in Round I & II
• Attrition does not have a bearing on the analysis

MAPS Remote Sensing Tasking Area

MAPS Methods
Methods Tested:

Maize Production
• Crop-cutting
  – 4m x 4m & a 2m x 2m subplot in Round I
  – 8m x 8m sub-plot in Round II
  – Full-plot crop cut in Round II (1/2 of sample)
• Remote sensing based on high-res imagery
  – First in testing the method in a smallholder production system against an objective measure
• Self-reported harvest
  – Conversion of quantities in non-standard unit-condition combos into KG-, dried grain terms (“official” methods)

Land Area
• GPS measurement (Garmin eTrex 30 handheld units)
• Self-reported area

Soil Fertility (Round I)
• Conventional Soil Analysis (subsample)
• Spectral Soil Analysis
• Self-reported soil quality & attributes

Variety Identification
• DNA fingerprinting of grain sampled from the crop-cutting subplot harvest (4x4m in Round I, 8x8m in Round II)
• Self-reported variety name, type & morphological attributes

DNA Fingerprinting
• Diversity Arrays Technology (DArTseq) method that facilitates genome-wide characterizations of large accessions sets compared to existing genotyping-by-sequencing methods using SNP markers
• Compiled a reference library of 38 maize varieties in circulation during the pre-planting period of the first rainy season of 2015, from NARO & 4 major seed companies, with revealed genotyping intention
• Genotyped each reference library and field sample to derive two variables
  – Heterogeneity: # of DNA marker variants in the genomic representation - a collection of fragments from the genome selected for
sequencing
  – Purity: Computed only for the field samples; represents the extent to which a sample's heterogeneity overlaps with that of the matched reference library variety, i.e. the variety identified initially as the one with the closest genetic distance to the field sample in question (below a distance threshold of 3).

Recursive Partitioning & Classification Tree Analysis of Morphological Attributes of 38 Reference Library Samples
• Morphological attributes for the reference library: Obtained by planting out the 38 varieties in NaCCRI fields.
• Results: Varieties are uniquely identified using 11 attributes.
• Identification of the varieties in the field: Using these attributes, the varieties that farmers plant were identified based on farmer responses on morphological attributes

Context

Descriptive Statistics
                                                                            Mean  Std. Error     Min     Max
Crop Cut Sub-Plot/Seed Attributes
Farmer Reporting Multiple Varieties in Crop Cut Sub-Plot = Yes †            0.17        0.02       0       1
Farmer Reporting Multiple Varieties in Crop Cut Sub-Plot = No †             0.58        0.02       0       1
Farmer Reporting Multiple Varieties in Crop Cut Sub-Plot = Don't Know †     0.25        0.02       0       1
Primary Variety's Recyclability Status Correctly Identified †               0.31        0.02       0       1
Farmer Says He Knows the Variety †                                          0.45        0.03       0       1
Planted Seed Source = Stockist/Market †                                     0.37        0.02       0       1
Plot Attributes
Distance from Dwelling Based on GPS Location (KMs)                          0.16        0.02    0.00    0.92
GPS-Based Plot Area in Hectares                                             0.14        0.01    0.00    1.37
Plot is Purestand †                                                         0.46        0.01       0       1
Share of Seed Would Have Been Planted Under Purestand ‡                    82.96        0.79    8.33     100
Household Labor Days                                                       49.12        1.77       0     504
Any Hired Labor on Plot †                                                   0.43        0.03       0       1
Hired Labor Days                                                            5.19        0.88       0     294
Any Organic Fertilizer Was Applied †                                        0.01        0.00       0       1
Any Inorganic Fertilizer Was Applied †                                      0.09        0.02       0       1
Any Pesticide Was Applied †                                                 0.04        0.01       0       1
Weighted Additive Soil Quality Index (Mukherjee & Lal)                      0.30        0.00       0     0.5
Household Attributes
Household Size                                                              6.36        0.14       1      22
Dependency Ratio                                                            1.44        0.06       0       7
PCA-Based Agricultural Implement & Machinery Index                         -0.27        0.04   -1.52    5.39
Any Member Received Extension Advice on Ag Production †                     0.31        0.03       0       1
Manager Attributes
Female †                                                                    0.42        0.02       0       1
Age (Years)                                                                41.07        0.71       6      92
Education (Years)                                                           6.32        0.30       0      20
Manager = Respondent †                                                      0.82        0.01       0       1
Observations                                                                 510

How Do Different Methods Perform in Unique Identification of Maize Varieties in Round I?
• 53 percent of farmers could not state the variety they have planted
• Farmer-reported morph. attributes do not uniquely identify varieties
• DNA fingerprinting performs the best for unique varietal identification

Unique Variety Identification, Irrespective of Correct Variety Identification, by Method
                                        Farmer Elicitation:       Farmer Elicitation:
                                        Variety Name Provision    Morphological Protocol    DNA Fingerprinting
                                        Observations   Percent    Observations   Percent    Observations   Percent
Uniquely Identified                              227      44.5              62      12.2             510     100.0
Not Uniquely Identified                          283      55.5             448      87.8               0       0.0
TOTAL                                            510       100             510       100             510       100
Total Number of Varieties Identified              13                        15                        12

Only 2 Percent of the Farmers Correctly Identified the Variety Based on DNA Analysis in Round I
• With the exception of LONGE 10H, the varieties stated by the farmers (i.e. right panel) are NOT among the varieties identified by DNA fingerprinting (i.e. left panel).
• Either farmers do not know or the stated names are the ones they were told
Source: Ilukor et al. (Forthcoming).

And Our Experts Were No Better!

(Selected) Incidences of Variety Cultivation, by Method
                 Farmer Elicitation:       Farmer Elicitation:       Expert         DNA
Variety Name     Variety Name Provision    Morphological Protocol    Elicitation    Fingerprinting
Unidentified                       55.5                      87.8            0.0               0.0
YARA42                              0.0                       0.0            0.0              34.3
LONGE10H                           11.8                       0.0           35.0              33.3
WE2114                              0.0                       0.4            0.0              21.2
LONGE5                             15.3                       0.2           40.0               0.0
LONGE4                              5.3                       2.4           20.0               0.0

How Do Different Methods Perform in Identifying Local/Improved & OPV/Hybrid Varieties in Round I?
• Cultivation of improved & hybrid varieties is under-estimated by farmers
• Cultivation of open pollinated varieties is over-estimated by farmers

Incidence of Local/Improved & Hybrid/OPV Variety Cultivation, by Method

Panel A: Incidence of Local/Improved Variety Cultivation (%)
               Farmer Elicitation:    Farmer Elicitation:       DNA
               Farmer Reporting       Morphological Protocol    Fingerprinting
Improved                   45.1                         -              100.0
Traditional                42.8                         -                0.0
Don't Know                 12.2                         -                0.0

Panel B: Incidence of Hybrid/OPV Variety Cultivation (%)
               Farmer Elicitation:    Farmer Elicitation:       DNA
               Farmer Reporting       Morphological Protocol    Fingerprinting
Hybrid                     30.6                      12.2               99.6
OPV                        31.2                       0.0                0.4
Don't Know                 38.2                      87.8                0.0

Purity of the Field Samples in Round I
• Farmer-planted variety according to DNA fingerprinting is the reference library variety that is genetically closest (not necessarily identical)
• Purity = Overlap between the field sample genetic heterogeneity & the genetic heterogeneity of the identified variety in the reference library
[Figure: Purity of Field Samples. Mean 63.2%, Median 62.1%, Min 46.9%, Max 98.4%]

Headline Findings from Multivariate Analyses of Variety Identification Outcomes
Purity
• Negatively correlated with farmer’s correct identification of recyclability of the seed
• NO relationship with commercial acquisition of the seed!
Farmer’s correct identification of an improved variety
• Positively correlated with farmer’s knowledge of the variety & commercial acquisition of the seed

(Unacceptable Levels of) Heterogeneity in Reference Library Samples in Round I
• Acceptable level of heterogeneity of the samples is 20% but most of the reference library samples are above the threshold.
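The threshold claim can be cross-checked against the summary statistics reported for this slide's figure (38 reference samples, median heterogeneity 24.6%). The sketch below is only a worked piece of arithmetic, not part of the study's analysis: the function name is hypothetical, and it uses nothing beyond the reported median, the sample count, and the 20% threshold.

```python
def min_samples_above(n_samples: int, median: float, threshold: float) -> int:
    """Lower bound on how many samples lie strictly above `threshold`,
    using only the sample count and the median (hypothetical helper)."""
    if median <= threshold:
        # The median alone guarantees nothing in this case.
        return 0
    # If the median exceeds the threshold, the upper-middle value and
    # every value above it must exceed the threshold as well.
    return (n_samples + 1) // 2


# Figures reported on the slide: 38 reference varieties, median 24.6%, threshold 20%.
lower_bound = min_samples_above(38, 24.6, 20.0)
print(lower_bound)  # 19
```

So the reported median by itself implies that at least 19 of the 38 reference samples exceed the 20% threshold; the per-variety values in the figure would be needed to say how many more do.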
[Figure: Heterogeneity of reference library samples, by seed variety. Mean 32.9%, Median 24.6%, Min 9.8%, Max 75.2%. Varieties shown: H628, H614D, H625, H624, H520, H6213, H513, WE2115, WE2114, WE2106, WE2104, WE2103, WE2101, WH505, KH500-43A, SC627, WH403, PAN67, FH6150, YARA42, DK8031, YARA41, LONGE11H, LONGE9, LONGE10H, LONGE8, LONGE7, UH6303, LONGE6, UH5054, UH5053, UH5052, UH5051, VP-MAX, SCDUMA, LONGE5, MM3, LONGE4]

Key Take-Away Messages
• Variety identification findings reveal:
  – High-levels of improved variety cultivation, despite popular belief
  – But… cultivated varieties are of inferior quality
  – Limited farmer knowledge about the varieties that they plant
  – Weaknesses in & potential implications for extension & seed system
• Evidence prompts us to think more critically about existing agricultural statistics & survey methods
• Support for DNA fingerprinting to be the new standard for accurate variety identification in field research
  – Further experimentation & synthesis of evidence from completed survey experiments on other countries & crops – key to formulating guidelines for scale-up
  – Additional costs require more thinking around sub-sampling approaches in existing household & farm surveys
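The matching and purity definitions from the DNA Fingerprinting slide can be read as a two-step procedure: pick the genetically closest reference variety (subject to the distance threshold of 3), then measure the overlap with it. The following is a minimal sketch under stated assumptions, not the DArTseq pipeline used in the study. The slides give neither the distance metric nor the exact overlap formula, so distances are taken as precomputed inputs, samples are represented as sets of marker-variant IDs, and purity is assumed to be the share of a field sample's marker variants also found in the matched reference. All names (`closest_reference`, `purity`, `identify`, `REF_A`, `m1`, ...) are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Threshold stated on the slide; its units are not given there.
DISTANCE_THRESHOLD = 3.0


def closest_reference(distances: Dict[str, float],
                      threshold: float = DISTANCE_THRESHOLD) -> Optional[str]:
    """Return the reference variety at the smallest distance, or None if
    even the closest one is not below the threshold."""
    if not distances:
        return None
    name = min(distances, key=distances.get)
    return name if distances[name] < threshold else None


def purity(field_markers: Set[str], reference_markers: Set[str]) -> float:
    """ASSUMED overlap measure: percent of the field sample's marker
    variants that also appear in the matched reference variety."""
    if not field_markers:
        return 0.0
    return 100.0 * len(field_markers & reference_markers) / len(field_markers)


def identify(field_markers: Set[str],
             distances: Dict[str, float],
             library: Dict[str, Set[str]]) -> Tuple[Optional[str], Optional[float]]:
    """Match a field sample to the reference library, then compute purity."""
    match = closest_reference(distances)
    if match is None:
        return None, None
    return match, purity(field_markers, library[match])


# Purely illustrative placeholder inputs, not data from the study.
library = {"REF_A": {"m1", "m2", "m3", "m4"}, "REF_B": {"m5", "m6"}}
field = {"m1", "m2", "m3", "m9"}
distances = {"REF_A": 1.2, "REF_B": 4.5}

print(identify(field, distances, library))  # ('REF_A', 75.0)
```

Keeping the distance computation outside the sketch is deliberate: it avoids guessing at a metric the slides do not name, while still showing why purity is defined only for field samples and only relative to the matched variety.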