University of California
Governance Models for Research Computing
Western Educause Conference, San Francisco
May 2007

Copyright University of California 2007. This work is the intellectual property of the Regents of the University of California. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the Regents. To disseminate otherwise or to republish requires written permission from the Regents.

Presenters
• David Walker, UC Office of the President; Director, Advanced Technologies, Information Resources & Communications
• Heidi Schmidt, UC San Francisco; Director, Customer Support Services, Office of Academic & Administrative Information Systems
• Ann Dobson, UC Berkeley; Associate Director, Client Services, Information Services & Technology
• Jason Crane, PhD, UC San Francisco; Programmer Analyst, Radiology

Perspectives
• System-wide initiatives
• Campus models
• Central campus IT services
• Shared research computing facility

UC-Wide Activities in Support of Research and Scholarship
David Walker
Office of the President, University of California
[email protected]

Information Technology Guidance Committee (ITGC)
• Identify strategic directions for IT investments that enable campuses to meet their distinctive needs more effectively while supporting the University's broader mission, academic programs, and strategic goals.
• Promote the deployment of information technology services to support innovation and the enhancement of academic quality and institutional competitiveness.
• Leverage IT investment and expertise to fully exploit collective and campus-specific IT capabilities.

Planning Inputs
• Broad consultation with:
  • UC stakeholders
  • Campus and system-wide governing bodies
• Coordination with related academic and administrative planning processes
• Environmental scans and competitive analysis

ITGC Timetable
• Launch the ITGC: Feb 2006
• Interim work group reports: Nov 2006
• Summary report to Provost: Jun 2007
• Review and comment: Oct 2007
• Presentations to President, COC, Regents, Academic Council: Nov 2007

Areas Addressed by the ITGC
• Research and Scholarship
• Teaching, Learning, and Student Experience
• University-Wide Administrative and Business Systems
• Critical Success Factors (e.g., common architecture, end-user support, collaboration infrastructure)

Potential Recommendations for Research and Scholarship
• Advanced Network Services
• UC Grid
• Academic Cyberinfrastructure

Advanced Network Services
• Upgrade all campus routed Internet connections to 10 Gbps
• Pilot new network services
  • Non-routed interconnects
  • Lightpath-based, application-dedicated bandwidth
• End-to-end performance tools, instrumentation, and support

UC Grid
• Enable resource sharing, based on UCTrust
• Implement comprehensive storage services
  • Large-scale computation, project collaboration, (very) long-term preservation
• Explore UC-provided resources
  • Base-level compute and storage
  • Data center space
  • Support services

Academic Cyberinfrastructure
• Ubiquitous access to services critical to research, scholarship, and instruction
  • Collaboration tools and services
  • Tools for creation and dissemination of electronic information
  • Digital preservation
  • Grant application / administration tools
  • End-user support services

UCTrust
• A unified identity and access management infrastructure for the University of California
• Based on InCommon and Shibboleth
More Information
• IT Guidance Committee: www.universityofcalifornia.edu/itgc
• UCTrust: www.ucop.edu/irc/itlc/uctrust

Campus Governance Models
Heidi Schmidt
University of California, San Francisco
Office of Academic & Administrative Information Systems

University of California = Diversity
Campus governance bodies that may influence research computing include:
• Academic Senate
• IT advisory boards & committees
• Research advisory boards & committees
• Discipline-based governance groups

Campus-wide Perspectives
Ann Dobson
University of California, Berkeley
Information Services & Technology

UC Berkeley
• Desire to provide central services
• Desire to meet needs of less technical, less resource-rich researchers (e.g., social scientists)
• Tension between one-time grant funding and ongoing expenses
• Desire to optimize use of resources
• Need commodity model (one size fits all)
• If we build it, will they come?

Requirements for LBNL Clusters
Systems in the SCS Program must meet the following requirements to be eligible for support:
• IA32 or AMD64 architecture
• Participating cluster must have a minimum of 8 compute nodes
• Dedicated cluster architecture; no interactive logins on compute nodes
• Red Hat Linux operating system
• Warewulf cluster implementation toolkit
• Sun Grid Engine scheduler
• All slave nodes reachable only from the master node
Clusters that will be located in the computer room must meet the following additional requirements:
• Rack-mounted hardware required; desktop form factor hardware not allowed
• Equipment to be installed into APC NetShelter VX computer racks; prospective cluster owners should include the cost of these racks in their budget

General Purpose Cluster
• Hardware donated by Sun
• 29 Sun v20z servers (1 head node, 28 compute nodes)
• 2 CPUs, 2 GB RAM, one 73 GB hard drive per node
• Gigabit Ethernet interconnect
• NFS file storage on SAN
• Housed in central campus data center

General Purpose Cluster (cont.)
• Cluster management provided by LBNL's Cluster Team
• Operating system: CentOS 4.4 (x86_64)
• MPI version: Open MPI 1.1
• Scheduler: Torque 2.1.6 (see the job submission sketch below)
• Compilers: GCC 3.4.6 and GCC 4.1.0
• Cluster provided on a recharge basis
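To make the stack above concrete, here is a minimal sketch of how a researcher might package an Open MPI run as a Torque job and submit it from the cluster head node. The queue name ("batch"), the node and processor counts, and the "simulate" executable are illustrative assumptions, not details from the LBNL configuration.

```python
"""Minimal sketch: submitting an Open MPI run to a Torque/PBS scheduler.

Assumptions (not from the presentation): a queue named "batch", a
hypothetical MPI executable "./simulate", and a head node where qsub
is on the path.
"""
import subprocess
import textwrap


def submit_mpi_job(executable, nodes=4, ppn=2, walltime="04:00:00"):
    """Write a PBS job script and hand it to qsub; return the job id."""
    script = textwrap.dedent(f"""\
        #!/bin/bash
        #PBS -N mpi_example
        #PBS -q batch
        #PBS -l nodes={nodes}:ppn={ppn}
        #PBS -l walltime={walltime}
        cd $PBS_O_WORKDIR
        # Open MPI can read the host list Torque provides in $PBS_NODEFILE.
        mpirun -np {nodes * ppn} -machinefile $PBS_NODEFILE {executable}
        """)
    # qsub accepts the job script on standard input; it prints the job id.
    result = subprocess.run(["qsub"], input=script, text=True,
                            capture_output=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print("Submitted:", submit_mpi_job("./simulate"))
```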
Collocation and Support
• Collocation in central data center: $8/RU/month + power charge
• Cluster management support: varies based on number of nodes; about $1,500/month for a 30-node cluster
• Assistance in preparing grant requests

Audience Poll
• Does your campus provide central research computing facilities?
• Are these services provided on a recharge basis?
• Are these services centrally funded?

Departmental Clusters
• Survey revealed 24 clusters; half are in EECS
• Data center space provided at no cost
• Grant funding for support FTE
• Hardware from donations or from grants
• Charge for storage, network connections
• Others in a variety of disciplines: biology, geography, statistics, space science, optometry, seismology
• Intel or AMD, many flavors of Linux

Departmental Clusters (cont.): Chemistry Model
• Chemistry provides machine room space
• Chemistry FTE helps configure and get started
• PI must have a grad student sys admin
• 4-5 clusters owned by faculty, supporting the research of 5-10 grad students
• All Linux, running Rocks or Warewulf

Audience Poll
• Do departments on your campus provide research computing support to their PIs?
• On a recharge basis? Subsidized?

Other UC Campuses/Labs
UC San Francisco
• Completely decentralized
UC Irvine
• Research Computing Support Group (1.6 FTE)
• Data center space ($200/month/rack)
• Shared clusters for researchers and grad students
• High-speed networking
• Backup service
• System administration on recharge basis

Other UC Campuses/Labs (cont.)
UCLA
• Data center space
• High-speed networking
• Shared clusters
• Storage
• Cluster hosting and management
• Charges for in-depth consulting and long-term projects, plus a nominal one-time node charge

Other UC Campuses/Labs (cont.)
LBNL
• Data center space
• High-speed networking
• 3 FTE
• Pre-purchase consulting and procurement assistance
• Setup and configuration
• System administration and cybersecurity
• Charge for incremental costs to support clusters

Other UC Campuses/Labs (cont.)
UC Riverside
• Data center space
• Funds for seed clusters
  • Researchers without funds to buy their own
  • Researchers with the ability to purchase but who will use the central service
• Ongoing support of systems will be recharged

Other UC Campuses/Labs (cont.)
UC San Diego
• Decentralized
• Services on a recharge basis
• Central services: network infrastructure, server room space, hosting, system administration, consulting
  • Supported by "knowledge worker" fee
• San Diego Supercomputer Center

Audience Poll
• Does your campus have a "knowledge worker" fee?

Challenges
• Provide a useful central resource
• Optimize use of clusters
• Encourage PIs to use central resources even if it costs money
• Develop a funding model that works well with grants

Shared Research Computing
Jason Crane, PhD
University of California, San Francisco
Department of Radiology

Case Study: UCSF Radiology Department
Shared research computing resources
• UCSF Radiology computing
• Center for Quantitative Biomedical Research (QB3 Institute) computational cluster
• Incentives and disincentives for sharing computing resources
• Advice about building collaborations and consensus

UCSF Radiology Department Computing
[Diagram: Radiology Research Computing Recharge supporting research Groups A, B, …]
Organization and Structure
• Ownership: individual research groups (~150 desktop workstations + Linux cluster)
• Administration: Radiology Research Computing Recharge
• Cost structure: hardware + support recharged from the direct costs of PIs' research grants
Computational Needs and Problems
• Underutilized CPUs
• Some researchers have computationally demanding problems
• Serial processing on individual desktop machines takes hours to days
• Manual cycle stealing
• Embarrassingly parallel problems

UCSF Radiology Department Computing
Solution
• Deploy resource management software (RMS, Sun Grid Engine) to enable parallel computing on idle desktop machines supported by recharge.
• Group-specific queues: users submit parallel processing jobs to idle machines within their research group (see the sketch below).
[Diagram: the RMS job scheduler routes jobs from research group users, via the Radiology Research Computing Recharge, to idle machines in Groups A, B, …]
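As an illustration of the group-specific queue model, the sketch below submits an embarrassingly parallel batch of tasks as a Sun Grid Engine array job confined to one group's queue of idle desktops. The queue name ("groupA.q"), the task count, and the "recon.sh" wrapper script are hypothetical, not part of the Radiology deployment described above.

```python
"""Minimal sketch: farming an embarrassingly parallel workload out to idle
desktop machines through a group-specific Sun Grid Engine queue.

Assumptions (not from the presentation): a queue named "groupA.q" and a
hypothetical wrapper script "recon.sh" that uses $SGE_TASK_ID to pick
which of 100 independent input cases to process.
"""
import subprocess


def submit_array_job(queue="groupA.q", tasks=100):
    """Submit one SGE array job; each task handles a single input case."""
    cmd = [
        "qsub",
        "-N", "recon_sweep",   # job name
        "-q", queue,           # keep the jobs inside the group's own queue
        "-t", f"1-{tasks}",    # array job: SGE sets $SGE_TASK_ID per task
        "-cwd",                # run from the submission directory
        "-j", "y",             # merge stdout and stderr
        "recon.sh",            # hypothetical per-case wrapper script
    ]
    result = subprocess.run(cmd, text=True, capture_output=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print(submit_array_job())
```

Because each task is independent, the scheduler can spread the work across whichever group machines happen to be idle, automating the manual cycle stealing described earlier.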
UCSF Radiology Department Computing
Observations
Clustering:
• Increased intra-group CPU utilization
• Increased adoption of computationally demanding software
• Improved research capabilities and throughput
However:
• Inter-group CPU sharing was underutilized
• Higher-end storage was needed to support I/O requirements
• Recharge cost for an underutilized dedicated cluster doesn't scale well
• Time sharing is more cost-effective than a dedicated, partially utilized cluster

Interdepartmental Shared Computational Cluster
Organization and Structure
• Users: interdepartmental within the QB3 institute
• Cost structure:
  • PIs' research grant direct costs: compute nodes (time share); fits well with one-time sources of funding
  • Institute grants and endowments: shared administration, high-end shared hardware
• Governance:
  • Technical: cluster admin, technical users
  • Policy: committee of representative PIs
• Hardware: 1,200 cores (Linux), 13 TB NAS
Radiology's Requirements
• Real-time and interactive apps benefit from a large number of CPUs for short bursts
• Access to shared high-end storage for I/O-intensive apps
• Lower cost structure for cluster support by utilizing institute-supported administration
• HIPAA compliance

Interdepartmental Shared Computational Cluster
Experiences to date
• High-end, cost-effective resource for the institute's research
• Varied use patterns benefit all users
• Frees research group time for research
• Radiology's unique requirements (HIPAA, workflow, accessibility) slow to be implemented
• Evaluate requirements and consider application interoperability: use of Grid standards may have eased the transition for Radiology (cluster design/software porting)

Incentives for Sharing
• Reduce costs
  • Share administrative costs
  • Leverage bulk buying power
  • Increase hardware utilization
• Increase performance and QoS
  • Justify high-end hardware: shared cost, efficient utilization
  • Greater hardware redundancy
  • Design input from a larger expertise pool

Disincentives for Sharing
• Sharing isn't equitable
• Use cases vary from the norm
• Sharing may impact my resources

Advice for Sharing
• Establish guidelines for collaboration:
  • Equitable cost structure
  • Voting rights/governance
• Develop applications/services to support accepted Grid standards (see the sketch below)
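One way to act on that last recommendation is to code job submission against the Open Grid Forum's DRMAA standard (Distributed Resource Management Application API) rather than against a particular scheduler's command-line tools; Sun Grid Engine and several other resource managers support it. The sketch below uses the Python drmaa bindings; the executable name and argument are hypothetical, and this is an illustration of the approach rather than anything deployed at UCSF.

```python
"""Minimal sketch: scheduler-neutral job submission through DRMAA, an
Open Grid Forum standard supported by Sun Grid Engine among others.

Assumptions (not from the presentation): the Python "drmaa" bindings are
installed and configured for the local scheduler, and "./recon" is a
hypothetical executable taking one input file as its argument.
"""
import drmaa


def run_job(executable="./recon", args=("case_1.dat",)):
    """Submit a single job and block until it finishes."""
    with drmaa.Session() as session:
        template = session.createJobTemplate()
        template.remoteCommand = executable
        template.args = list(args)
        template.joinFiles = True          # merge stdout and stderr
        job_id = session.runJob(template)
        info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        session.deleteJobTemplate(template)
        return job_id, info.exitStatus


if __name__ == "__main__":
    print(run_job())
```

In principle, an application written this way can move between schedulers without changing its submission code, which is the kind of portability that might have eased Radiology's transition to the shared cluster.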