Integrative Learning Science Community Report

GENOME CONSORTIUM ON ACTIVE TEACHING USING NEXT-GENERATION SEQUENCING
Vince Buonaccorsi
HHMI award to Juniata College
NSF RCN-UBE
Specific Objectives
Broad Goals
Workshop
Increase network participation and disseminate to other regions
X
Nurture sense of community among young scientists and educators
X
Support network communication, coordination, and sharing of resources
Student Presentations
Leadership and Technology Support
X
X
HHMI COMPUTER CLUSTER SPECS

One Master Node
Processors: AMD Opteron eight-core, 16 cores per master node (quantity 2)
RAM: 16 GB per master node
Hard drive: RAID Edition 3 TB, 6 Gb/s, 7200 RPM
Operating system: CentOS (quantity 1)

Four Compute Nodes
Processors: AMD Opteron, 32 cores per compute node
RAM: 8 GB, 128 GB RAM per compute node
Hard drive: RAID Edition 500 GB, 6 Gb/s

Cooling Solution
Rack: 42U rack cabinet
Managed for proper cabinet airflow
Cooling: 12,000 BTU external chilling unit
Venting: duct kit for venting into CRAC / HVAC air return system
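
The node counts and per-node figures in the spec list imply an aggregate capacity for the cluster. The short sketch below totals cores and RAM under two assumptions: the "16Gb" and "128Gb" RAM entries are read as gigabytes, and the master node is counted alongside the four compute nodes. The class and variable names are illustrative only and do not come from the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One group of identical nodes (names here are hypothetical)."""
    name: str
    count: int
    cores_per_node: int
    ram_gb_per_node: int  # assumes the slide's "Gb" means gigabytes


# Figures copied from the spec list: 1 master node, 4 compute nodes.
cluster = [
    NodeGroup("master", count=1, cores_per_node=16, ram_gb_per_node=16),
    NodeGroup("compute", count=4, cores_per_node=32, ram_gb_per_node=128),
]

# Aggregate capacity is count times per-node value, summed over groups.
total_cores = sum(g.count * g.cores_per_node for g in cluster)
total_ram_gb = sum(g.count * g.ram_gb_per_node for g in cluster)

print(f"Total cores: {total_cores}")
print(f"Total RAM: {total_ram_gb} GB")
```

Under these assumptions the compute nodes contribute 128 cores and 512 GB of RAM, with the master node adding 16 cores and 16 GB.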
Change through workshops
Research-intensive (Penn State) and PUI faculty
Educational Modules
Who is GCAT-SEEK?
MSI Institutions
MSI: 14%
Non-MSI: 86%
GCAT-SEEK: Nextgen Apps of interest
Transcriptomics: 36%
Metagenomics: 20%
Eukaryotic Genomics: 26%
Bacterial Genomics: 18%
GCAT-SEEK: Organismal Interests
Plantae: 28%
Animalia: 41%
Fungi: 13%
Bacteria: 16%
Archaea: 2%
GCAT-SEEK: Mainly Small PUIs
[Histogram: Number of undergraduates at school. x-axis: Number of Students (bins 1-1000, 1001-5000, 5001-10000, 10001-20000, 20001-30000); y-axis: Frequency.]
GCAT-SEEK: Low technical experience
[Histograms, three panels: Linux Proficiency; Number of NextGen Data Sets Analyzed; Perl or Python Proficiency. Proficiency is rated on a scale from 1 (Low) to 5 (High); y-axes: Frequency.]
GCAT-SEEK: High teaching experience
[Histograms, three panels: Undergraduate Teaching Experience (x-axis: Years Teaching, bins 0, 1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-40); Familiarity with Assessment Literature (scale from 1, Low, to 5, High); Years in GCAT (bins 1-5, 6-10, 11-15). y-axes: Frequency.]
Merits and Impacts of Network







Community of 155 enthusiastic biologists, primarily teachers from more than 110 colleges and universities
Intellectual synergies on experimental design, bioinformatics approach, pedagogy, and assessment
Discounted runs, software, and hardware
Dissemination of data, pedagogic, and assessment modules
Outreach to MSIs
Database of barcoded metagenomic primers
Student impact in Year 1: 28 research students, 95 students in labs
Workshop: Participant Profile
Workshop Facilitators
Misc. Details
Meals: here
Coffee and snacks: meals, lounges downstairs
Evening sessions: here
Socials: Pheasant lounge
Breakout Rooms: Downstairs
Questions?
EUKARYOTIC GENOMICS BREAKOUT
Vince Buonaccorsi
Eukaryotic genomics breakout

Content
Data processing
Assembly
Genome characterization
  Repeats
  Gene number
Gene annotation
Ortholog ID
Whole-genome alignment
Variant ID

Platforms
HHMI cluster
Virtual boxes
  Linux applications on personal computers
Galaxy web
Amazon EC2
GCAT-SEEK Sequencing Cores

Founding Genome Core Facility
Penn State
  Coordinated all 2011 runs
  Advised RNA-seq runs in 2013
  Advised and coordinated eukaryotic/prokaryotic genomic runs for 2013
  GCAT-SEEK workshop presenters

Affiliated Genome Core Facilities
Ohio University
  In-house prices on Ion Torrent 318 runs
Indiana University, Bloomington
  In-house prices on Illumina HiSeq runs