XSEDE, ISL Cloud - NCSA Open Source

LARGE SCALE DEPLOYMENT OF DAP AND DTS
Rob Kooper
Jay Alameda
Volodymyr Kindratenko
The need for scaling
• How can we scale?
• How can DAP architecture scale?
• How can DTS architecture scale?
• What options do we have to scale?
• Amazon solution for scaling
• XSEDE solution for scaling
• Cloud solution for scaling
Finite Resources
• CPU
• Memory
• Disk
• Network
Scalability
• A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
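A quick way to make this definition concrete (hypothetical throughput numbers, Python used only for the arithmetic):

# Scaling efficiency: how much of the added capacity shows up as added
# performance. 1.0 means perfect (linear) scaling; the numbers are illustrative.
def scaling_efficiency(throughput_before, throughput_after, capacity_factor):
    return (throughput_after / throughput_before) / capacity_factor

# Quadrupling the hardware:
print(scaling_efficiency(100, 380, 4))   # 0.95 -> scales well
print(scaling_efficiency(100, 150, 4))   # 0.375 -> extra hardware barely helps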
Scaling Up And Out
• Scale UP (vertically)
  • Adding resources to a single system
  • “Speed”
  • Performance
  • Moore’s Law
• Scale OUT (horizontally)
  • Adding nodes to the system
  • Nodes can be commodity hardware (vs HPC)
  • Increase software complexity
  • Increase management complexity
  • Cloud
Elasticity
• Need the ability to grow/shrink on demand
• Add or remove resources based on workload (see the sketch below)
• Keep requirements small
• If many people use one service, bring up more instances of it
• Don’t bring up services that people don’t use
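A minimal sketch of such an elasticity policy, assuming workload is measured as a count of pending requests; the function, thresholds, and per-instance capacity are illustrative, not the Brown Dog implementation:

# Decide how many instances of a service should run for the current workload.
def desired_instances(pending_requests, requests_per_instance=10,
                      min_instances=1, max_instances=20):
    needed = -(-pending_requests // requests_per_instance)   # ceiling division
    return max(min_instances, min(max_instances, needed))

print(desired_instances(0))    # 1  -> keep one warm instance, shrink otherwise
print(desired_instances(95))   # 10 -> grow to meet demand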
Software Server Architecture
• Data in an unknown format goes to Polyglot, which dispatches conversion requests to Software Servers
• Each Software Server wraps an existing tool: ImageMagick, OpenOffice, ffmpeg, …, 3D Studio
• Everything is accessed over HTTP with HTML and JSON responses; the result is usable data (see the client sketch below)
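A client-side sketch of that HTTP/JSON interface. The host name, endpoint path, form field, and response shape are assumptions for illustration, not the actual DAP/Polyglot API:

# Ask a (hypothetical) conversion endpoint to turn an unknown-format file into a PDF.
import requests

DAP_URL = "http://dap.example.org"            # assumed server location

with open("drawing.dwg", "rb") as f:
    resp = requests.post(f"{DAP_URL}/convert/pdf",
                         files={"file": f},
                         headers={"Accept": "application/json"})
resp.raise_for_status()
print(resp.json())                            # e.g. where to fetch the usable data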
Medici 2.0 Architecture
• Load Balancer in front of multiple Frontend Webapps
• Event Bus (RabbitMQ) connects the webapps to Extractors (Java, Python) and External Services
• Data stores: MongoDB, Elasticsearch, Filesystem
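A minimal sketch of a Python extractor sitting on the event bus, using the pika RabbitMQ client. The queue name and message fields are assumptions; real Medici extractors follow a richer protocol:

import json
import pika

# Connect to the event bus and declare the queue this extractor listens on.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="extractor.example", durable=True)

def on_message(ch, method, properties, body):
    event = json.loads(body)                          # event describing an uploaded file
    print("would extract metadata for", event.get("fileid"))
    ch.basic_ack(delivery_tag=method.delivery_tag)    # tell RabbitMQ we are done

channel.basic_consume(queue="extractor.example", on_message_callback=on_message)
channel.start_consuming()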
How to grow?
• More servers at ISDA
• Funding is in Brown Dog
• Not sustainable
• Commercial Clouds
• Amazon, …
• XSEDE
• NSF funded HPC computation
• NCSA
• Cloud infrastructure
AWS Web Application Reference Architecture
AWS Batch Processing Reference Architecture
Pricing
• Small machine (1 CPU, 2 GB)
• Linux $0.026 per hour
• Windows $0.036 per hour
• A server is approx. $10,000 and can hold 20 VMs
• Average lifespan 5 years (~$500 per VM)
• Equals around 2 years of continuous Amazon time
• But Amazon is cheaper if we only need it 8 hours per day (7 hours/day in the case of Windows); see the check below
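A back-of-the-envelope check of these numbers, assuming the quoted prices and the 5-year lifespan:

linux_rate, windows_rate = 0.026, 0.036        # $ per VM-hour on Amazon
cost_per_vm = 10_000 / 20                      # ~$500 per VM over the server's life
days = 5 * 365                                 # 5-year lifespan

# Continuous use: how long does $500 of Amazon time last?
print(cost_per_vm / linux_rate / 24 / 365)     # ~2.2 years of 24/7 Linux use

# Part-time use: hours/day at which Amazon and the local server cost the same.
for name, rate in [("Linux", linux_rate), ("Windows", windows_rate)]:
    print(name, cost_per_vm / (rate * days))   # ~10.5 h/day Linux, ~7.6 h/day Windows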
XSEDE Resources
Jay Alameda
National Center for Supercomputing Applications
23 July 2014
What is XSEDE
• Integrating service for a wide variety of High Performance Computing (HPC) and Visualization and Data Analysis (RDAV) resources
  – Front line support
  – Uniform documentation
  – Extended collaborative support
  – Training, education and outreach services
  – Allocations
• www.xsede.org
Variety of HPC and RDAV resources
• Dynamic list at https://www.xsede.org/web/guest/resources/overview
  – Overview, and expiration dates for each resource
  – Traditional clusters
  – Visualization and data analysis resources
  – Storage resources
  – High throughput resources
  – Testbeds
  – Services
Potentially Interesting Resources for Brown Dog
• Testbed resource “FutureGrid”
  – Production through 9/30/2014
  – Partitioned into
    • HPC
    • Infrastructure as a Service (IaaS)
      – Nimbus
      – OpenStack
      – Eucalyptus
    • Dedicated
  – Layer Platform as a Service (PaaS) (e.g., MapReduce, Hadoop) on top of these partitions
Potentially Interesting Resources for Brown Dog - 2
• Service resource “Quarry”
  – Web service hosting environment
  – Resource end date not specified
  – Available for XRAC allocations with a web-service component
    • Storage: either NFS home directories or Lustre-based storage
  – OpenVZ provides virtual hosting of RPM-based Linux distributions
  – Persistent virtual machines
New XSEDE Resource: Comet
• Long-tail science system hosted at San Diego Supercomputer Center
• Builds on experience with SDSC Gordon (flash memory, persistent storage nodes) and SDSC Trestles (long-tail science)
  – 99% of jobs in 2012 used < 2048 cores
  – These jobs consumed half of the total core hours across NSF resources.
Comet
• Partially designed to pick up FutureGrid use (virtual clusters)
• Gateway hosting nodes and virtual machine repository
• Optimized for jobs within a rack
• Continues access to flash memory (Gordon)
• Capacity computing: computing for the 99% of XSEDE jobs
Comet virtualization
• Leverage experience and expertise from FutureGrid
• Virtual machine jobs scheduled like batch jobs
• Flexible software environments for new communities and applications
• Virtual machine repository
• Virtual HPC cluster (multi-(whole)-node), minimum latency and overhead penalty
XSEDE and Brown Dog
• Premise: Brown Dog will become an integral part of a researcher’s workflow
• Question: Should Brown Dog evolve into an XSEDE resource provider, to provide data services for XSEDE?
ISL Resources
Volodymyr Kindratenko
Innovative Systems Laboratory
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Hadoop
• Management server, user portal
• Secondary management server
• 6 HDFS/MapReduce nodes
• 1Gb management switch, QDR IB switch
OpenStack Cloud
• keystone, glance, cinder, nova, horizon, heat (controller services)
• neutron (networking), cinder-volume (block storage)
• 32 compute nodes running nova-compute and open-vswitch
• 1Gb management switch, QDR IB switch
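From a user’s point of view, requesting a VM from the services above looks roughly like this sketch using the openstacksdk Python client; the cloud name, image, flavor, and network are placeholders, not the actual ISL configuration:

import openstack

# Credentials come from a clouds.yaml entry; "isl-cloud" is a placeholder name.
conn = openstack.connect(cloud="isl-cloud")

server = conn.create_server(
    name="dts-extractor-01",       # hypothetical VM name
    image="ubuntu-14.04",          # image served by glance
    flavor="m1.medium",            # sizing handled by nova
    network="private",             # networking wired up by neutron
    wait=True,
)
print(server.status, server.id)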
Virtual Lab for Advanced Design
• Management node, base nodes, storage nodes, high memory node
• 1Gb management switch, 10Gb core switch, 10Gb SDN switch
• http://www.ncsa.illinois.edu/about/org/isl
High memory node
• Dell PowerEdge R920
• CPU: 4x Intel Xeon E7-4860 v2, 2.6 GHz (QPI interconnect between sockets)
• RAM: 3 TB
• Storage:
  – 2x 300 GB 10,000 RPM SAS 6 Gbps HDD
  – 4x 800 GB SAS Read-Intensive MLC 12 Gbps SSD
  – 6x 1 TB 7,200 RPM Near-Line SAS 6 Gbps HDD
• Interconnect:
  – 6x 1 Gbps Ethernet
  – 2x 10 Gbps Ethernet
Other systems
• GPU Server
• 8 NVIDIA C2050 GPUs
• Intel Xeon Phi Server
• 2 Xeon Phi 7120 (Knights Corner) application accelerators
• HPC cluster
• 8 nodes