iPlant/USDA-ARS Big Data workshop on RNAseq Workflows and Standards
Dec 7-9, 2014 at CSHL

Organizers
Jason Williams, Kapeel Chougule, Dewayne Shoemaker, Doreen Ware

Goals
• Foster competency in using iPlant, including knowing the general features of the platforms, how they work together, how to select and use platforms/tools, and knowing each platform's expectations and limitations
• Support trainers with domain knowledge at the most basic level of analysing RNA-Seq data
• Cultivate support and commitment to a sustainable network that provides access to shared training materials, help, and recommendations for future training and workshops

Agenda
• 2 pre-workshop webinars: forming teams and getting started with RNA-Seq assembly
• At the workshop
– working with data and metadata in iPlant (iDrop, iCommands, DE, Sharing, Searching)
– working with the Discovery Environment (apps/workflows/analyses) and Atmosphere (image launching, visualizing and downstream analysis)
– an introduction to XSEDE resources and iPlant support mechanisms

Total participants: 23

Data
• Insect
• Plant
• Parasite
• Fish
• Animal

Tools Utilized
Data Pre-Processing
• FastX suite: trimmer, quality trimming, quality filter
• HTProcessPipeline: includes Trimmomatic for quality trimming
Digital Normalization
• Trinity Normalization
• khmer normalization suite (based on DigiNorm)
de novo Transcriptome Assembly
• Trinity
• SOAPdenovo-Trans
Assembly Quality
• CEGMA
• Contig statistics
Conserved Domains
• Transdecoder
BLAST
• Create a BLAST database
Mapping Reads to de novo Assembly
• Bowtie (trimmed reads against the SOAPdenovo-Trans assembly)
• Bowtie Build and Map
• SAM to sorted BAM -> indexed BAM

Tools Utilized (continued)
• khmer
• genie
• RStudio
• Trinotate: BLAST, SignalP, TMHMM, HMMER, RNAmmer

Survey done after the workshop
Before the workshop, how would you rate your level of bioinformatics skills?
[Bar chart, n=20. Categories: Beginner, Advanced, Intermediate]

How helpful was it to be working in teams?
[Bar chart, n=20. Categories: Not helpful at all, Slightly helpful, Neutral/No opinion, Helpful, Very helpful]

How helpful were the webinars that preceded the workshop?
[Bar chart. Categories: Not helpful at all, Slightly helpful, Neutral/No opinion, Helpful, Very helpful]

How did the workshop impact your ability to perform bioinformatics analyses?
[Bar chart. Categories: Had no impact, Improved my ability slightly, Improved my ability immensely]

How prepared are you to help others use iPlant in the following ways? (n=20)
[Stacked bar chart. Scale: Unprepared, Somewhat prepared, Prepared, Very prepared. Items:]
• Create Atmosphere image
• Move data into/out of Atmosphere
• Connect to Atmosphere image
• Launch Atmosphere image
• Create workflow in DE
• Create App in DE
• Add/Modify App in DE
• Run an analysis in the Discovery Environment
• Input and manage metadata
• Run analyses in iPlant
• Share data within iPlant
• Upload data (iCommands)
• Upload data (iDrop/DE)

Barriers to using iPlant (indicate how much you agree with the statement) (n=19)
[Stacked bar chart. Scale: N/A, Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree. Statements:]
• I can't get publishable results using iPlant resources
• iPlant support staff are not reliable/quick
• iPlant documentation and manuals are not helpful
• iPlant services are not reliable
• I find it difficult to use iPlant tools
• iPlant tools are not user friendly

Outcomes from workshop
Groups
• Group 1: Tools and workflows. Brenda Oppert, Anna Bennett, Jamie Strange, Brian Rector, Neil Sanscrainte, and George Yocum
• Group 2: Integration of new tools. Christopher Childers, Guangtu Gao, Geoff Waldbieser, Monica Poelchau, Zaid Abdo
• Group 3: Metadata standards. Michelle Graham, Joe Hull, Pia Olafson, Lucy Stewart, Judy Chen, Deven See, and Brad Coates (Lead)
• Group 4: Adoption (training and organizing webinars). Anja Baldo, Stephen White, Linda Ballard, Brenda Oppert, Kristina Friesen, Pia Olafson

Group 1: Prioritize tools and workflows
Brenda Oppert, Anna Bennett, Jamie Strange, Brian Rector, Neil Sanscrainte, and George Yocum
1. Upload and assemble an RNA and genomic data set.
2. Process data through annotation and post-assembly quality control.
3. Downstream: report to the “integration” team.
4. Develop a mechanism for other ARS researchers and collaborators to suggest improvements.

Group 2: Tool integration ‘Install me!’
Christopher Childers, Guangtu Gao, Geoff Waldbieser, Monica Poelchau, Zaid Abdo
1. Communication with the ‘Application’ group: create a template with required metadata for requested applications (program name, version, URL, application, justification, test input files).
2. Develop a workflow for program installation, to easily train other developers. This includes pushing the finished apps to the Tester group and providing sufficient documentation/readmes for the resulting app.

Group 3: Metadata standards
Michelle Graham, Joe Hull, Pia Olafson, Lucy Stewart, Judy Chen, Deven See, and Brad Coates (Lead)
• Emphasis was on following NSF standards and NCBI annotation descriptors.
• Across project areas (insect, plant, animal), collaborate with iPlant and Big Data centers to implement standard associations with data uploads & DOIs.
• Integrate metadata, sequence, and assembly information into a searchable database to ease retrieval, find/foster collaborations, and highlight ARS outputs.

Group 4: Adoption (training and organizing webinars)
Anja Baldo, Stephen White, Linda Ballard, Brenda Oppert, Kristina Friesen, Pia Olafson
1. Identify holes in existing material; differences in standard iPlant vs. USDA practices
2. Webinars could go on the USDA YouTube channel
3. Announce releases of tools, images, workflows via webinars (other tools that are widely successful)
4. Tie trainings into IDPs
5. Downloadable materials (tutorials, videos, etc.) at the ARS website, SharePoint, or an iPlant location
6. Assess adoption
i. Track training material downloads
ii. Track iPlant signups
iii. Ask what they hope to get out of the training when they sign up; make it brief
iv. Ask after a few months if expectations were met
v. Track USDA tags in user forum
7. Pre-workshop homework (successful component of current gathering)
i. Needs to be clear and easy to complete
ii. Sub-groups of attendees to foster participation in pre-course materials
8. What about locations with poor connections, and other barriers to adoption?