Risk and Security Management for
Distributed Supercomputing with Grids
Urpo Kaila <[email protected]>
Funet CERT & CSC
2006-09-22
19th TF-CSIRT Meeting, Espoo, Finland
Agenda
Grids and supercomputing
Some definitions
How do they work?
Example of Grids
Grids and Security
Risk management and Security domains
Creating baselines for Security
Case M-grid revisited
Organisation and setup
Security Working Group
Risk analysis, Security Policy & Acceptable Use Policy
User Security Guide, Administrator Security Guide
Grid Security and CSIRTs
Making Grid Security compatible
Incident handling
Some definitions
Supercomputers
The most powerful systems worldwide at a given time,
used for massively parallel processing of
advanced research tasks
Distributed computing
several inter-connected
computers share the computing
tasks assigned to the system
[IEEE]
Cluster
Similar efficient computers
coupled closely together
Grid computing
Affordable high performance
distributed computing with
interconnected clusters
Moore’s law as seen on the Top500 list
Pentium 4 = ~ 2-4 GFlops
What is the Grid?
Grid according to Ian Foster (2002) in "What is the Grid? A
Three Point Checklist":
• Computing resources are not administered centrally.
• Open standards are used.
• Non-trivial quality of service is achieved
Different types of grids
Info-grid - WWW
Data-grid - Databases
Compu-grid - Computing
Evolved from computational needs
of "big science"
Grid must have:
Virtual organisations
Middleware
Truly Distributed
How do they work?
$ grid-proxy-init
Your identity:
/O=Grid/O=NorduGrid/OU=csc.fi/CN=Urpo Kaila
Enter GRID pass phrase for this identity:
…
$ ngsub -d 1 -f mygridjob.xrsl
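The file passed to ngsub describes the job in xRSL, the extended Resource Specification Language used by NorduGrid ARC. The slides do not show the contents of mygridjob.xrsl, so the following is only a minimal sketch of how such a description could be generated. The helper name render_xrsl and all job values (the echo command, job name and output file) are illustrative assumptions, and only single-valued attributes are handled.

```python
def render_xrsl(attributes):
    """Render a dict of job attributes as a one-line xRSL description.

    Each attribute becomes a (name="value") relation. The leading '&'
    states that all relations apply to the job.
    """
    relations = []
    for name, value in attributes.items():
        text = str(value)
        if '"' in text:
            # Quoting rules are deliberately left out of this sketch.
            raise ValueError(f"embedded quote in value for {name!r}")
        relations.append(f'({name}="{text}")')
    return "&" + "".join(relations)


# Hypothetical job: run /bin/echo and collect its output in a file.
job = {
    "executable": "/bin/echo",
    "arguments": "hello",
    "stdout": "hello.txt",
    "jobName": "hello-grid",
}

if __name__ == "__main__":
    print(render_xrsl(job))
```

Writing the printed line to mygridjob.xrsl would give a file of the kind submitted above; a real job description normally carries more attributes than this sketch shows.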
The Role of Grid Middleware
NorduGrid ARC Tutorial / Arto Teräs and Juha Lento 2005-09-20
Examples of Grids and Grid resources
• TeraGrid - Open scientific discovery infrastructure financed by the US National Science Foundation
• DEISA - Distributed European Infrastructure for Supercomputing Applications
• EGEE - The Enabling Grids for E-sciencE
• LHCG - Large Hadron Collider Grid (CERN)
• e-IRG - The e-Infrastructure Reflection Group
• NorduGrid - a Grid Research and Development collaboration
• The Globus Alliance - an international collaboration that conducts research and development to create fundamental Grid technologies
Grids and Security
Threats
WARNING! When working on the Grid, you must
accept that some information on your jobs and on
your Grid identity is made public. This includes
your name, your affiliation, IP address of your
client computer, job names and duration, used
runtime environment names and other less sensitive
information (see the Grid monitor for example).
(NorduGrid)
What excites hackers? (A. Cormack, 2002)
• High profile targets – to enhance their reputation
• Powerful CPU – for password cracking etc.
• Large disk – to distribute illegal material
• High bandwidth – for denial of service attacks
Security matrix
                      Reactive Security    Proactive Security
Technical Security    Forensics            Firewalls, Cryptography, Patching vulnerabilities, IPS
Security Management   Incident handling    Security policies and guides, Training and awareness building
Risk and (proactive) security
Risk management (à la Wikipedia)
1.1 Establish the context
1.2 Identification
1.3 Assessment
1.4 Potential Risk Treatments
1.4.1 Risk avoidance
1.4.2 Risk reduction
1.4.3 Risk retention
1.4.4 Risk transfer
1.5 Create the plan
1.6 Implementation
1.7 Review of the plan
Security Domains [à la (ISC)2 CISSP CBK]
1. Access Control
2. Application Security
3. Business Continuity and Disaster Recovery Planning
4. Cryptography
5. Information Security and Risk Management
6. Legal, Regulations, Compliance and Investigations
7. Operations Security
8. Physical (Environmental) Security
9. Security Architecture and Design
10. Telecommunications and Network Security
What has already been done (examples)
Joint Security Policy Group LCG/EGEE:
The LCG Security and Availability Policy
The Grid Acceptable Usage Policy
The Virtual Organisation Security Policy
PlanetLab
Acceptable Use Policy (AUP)
e-Infrastructure Reflection Group (e-IRG)
Authentication and authorisation policies
Usage policies
Etc.
Case M-grid revisited
M-Grid - Material Sciences National
Grid Infrastructure in Finland
Joint project between CSC, seven universities and The
Helsinki Institute of Physics (HIP)
Connected to the Nordic NorduGrid network, but
access is currently limited to M-grid partners and CSC
customers
The systems are particularly suitable for high-throughput running of sequential and
easily parallelised programs
The theoretical computing capacity of the system is
approximately 2.5 Tflops.
M-grid is based on HP ProLiant DL145, DL385 and
DL585 servers equipped with 64 bit AMD Opteron
processors (642 altogether)
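As a rough sanity check, the two figures above can be combined into a theoretical peak per processor, which can then be set against the Pentium 4 figure quoted earlier. This is only a back-of-the-envelope sketch: both inputs are the approximate values from the slide, and the function name is an illustrative choice.

```python
TOTAL_TFLOPS = 2.5   # approximate theoretical capacity quoted above
PROCESSORS = 642     # number of Opteron processors quoted above


def gflops_per_processor(total_tflops, processors):
    """Convert a system total in Tflops to an average in Gflops per processor."""
    return total_tflops * 1000.0 / processors


per_cpu = gflops_per_processor(TOTAL_TFLOPS, PROCESSORS)

if __name__ == "__main__":
    print(f"{per_cpu:.1f} GFlops per processor")  # prints "3.9 GFlops per processor"
```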
The M-Grid Security Working Group
Organisation
Started January 2006, meetings once a month, except in summer
Members: CSC staff, visiting experts and M-Grid administrators:
Juha Jäykkä (UTU)
Michael Gindonis, Kalle Happonen (HIP)
Ivan Degtyarenko (HUT)
Vera Hansper (JYU)
Reports to M-Grid Administrators meeting
Collaborating with the HIP Wiki
Task
Risk analysis
To create a set of security policies and guidelines
Technical planning, implementation and supervision
Incident handling
The M-Grid Risk analysis 2006
Risk = likelihood x impact
[Figure: risk matrix plotting Likelihood against Impact. Labels on the figure: Low, Medium, High, Disaster, Problematic, Residual, “Mitigate”. Threat categories in the legend: Internal - Intentional, Internal - Accidental, External - Intentional, External - Accidental. Picture by Vera Hansper.]
Over 50 threats identified and analysed!
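The formula risk = likelihood x impact can be sketched in a few lines of code. The example below is an illustration only: the 1 to 5 scales, the score thresholds in band(), and the three sample threats are assumptions made for this sketch and are not taken from the M-grid analysis. Only the origin and intent categories and the band names come from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    origin: str      # "Internal" or "External", as in the figure legend
    intent: str      # "Intentional" or "Accidental", as in the figure legend
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self):
        """Risk score as defined on the slide: likelihood x impact."""
        return self.likelihood * self.impact


def band(score):
    """Map a score to a band name. The thresholds are hypothetical."""
    if score >= 20:
        return "Disaster"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


# Invented sample threats, for illustration only.
threats = [
    Threat("Stolen user credentials", "External", "Intentional", 3, 4),
    Threat("Misconfigured firewall rule", "Internal", "Accidental", 2, 3),
    Threat("Disk failure on a frontend", "Internal", "Accidental", 4, 1),
]

if __name__ == "__main__":
    # List the highest risks first so that mitigation effort goes there.
    for t in sorted(threats, key=lambda t: t.risk, reverse=True):
        print(f"{t.risk:>2}  {band(t.risk):<8}  {t.origin} - {t.intent}: {t.name}")
```

Sorting by score gives a simple priority list; with more than 50 threats, the same table can be grouped by the four origin and intent categories.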
M-grid Security Policy (Reviewed)
1. Introduction (scope, objectives)
2. Participants, roles and responsibilities
3. Physical security
4. User accounts and access control
Local accounts
Grid accounts
Virtual Organization management
Certificate Authorities
5. Network security
Network access and services
Additional services
Firewalls
4. Network security (contd.)
Firewalls
5. Operational security
Patches
Monitoring
6. Confidentiality and privacy
Grid users
Local users and administrators
7. Incident response
8. Compliance
Exceptions
9. Approval and review
10. Comments
M-grid Security Policy (examples)
Accounts must be protected by a good password or other method
providing equivalent security
Sites are allowed to create time-limited accounts for
persons working in documented collaboration projects
outside the site's organization
Sites may offer additional services which are open to a
large user base, but these must be approved by the M-grid
administration
Sites must not offer any additional services running on the
administration server without approval of the M-grid administration.
A node
M-grid Acceptable Use Policy
Short and intended for the user; the full security policy is to be read when
needed
Examples of content:
“By using the M-grid resources you automatically agree to
comply with this Acceptable Use Policy”
You must act in a responsible manner and must not cause
harm to other users, to M-grid or to other systems.
You may not use M-grid for illegal activities.
The M-grid services and systems are intended for professional,
academic research or education.
Your account is personal and may not be shared with other
people
Security Guides
M-grid User Security Guide
A short technical howto
Example:
Your proxy certificate is … not protected by
a password therefore it should not be valid
for longer than necessary as proxy certificates
can be easily renewed
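The advice in this example can be read as a rule of thumb: ask for a proxy that outlives the expected job by a small margin and no more. The following is a minimal sketch of that rule; the function name, the one-hour margin and the 12-hour cap are illustrative assumptions, not values from the M-grid guide.

```python
from datetime import timedelta


def suggested_proxy_lifetime(expected_job_time,
                             margin=timedelta(hours=1),
                             cap=timedelta(hours=12)):
    """Return a proxy lifetime covering the job plus a margin, limited by a cap.

    The margin and cap defaults are hypothetical; a site would choose its own.
    """
    if expected_job_time <= timedelta(0):
        raise ValueError("expected_job_time must be positive")
    return min(expected_job_time + margin, cap)


if __name__ == "__main__":
    print(suggested_proxy_lifetime(timedelta(hours=2)))
    print(suggested_proxy_lifetime(timedelta(hours=30)))
```

A job that is expected to run longer than the cap would rely on renewing the proxy rather than on a single long-lived one.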
M-grid Administrator Security Guide
A longer howto
Under construction
Examples of Technical security tasks
Implemented and on the wish list
Firewall-rpm
Log management and monitoring
Integrity check
Package signing
Availability monitoring
Automatic alerting
Backup of frontend
SSH key management
Security audits
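To make one item on this list concrete, an integrity check can be sketched as follows: record a cryptographic hash of each monitored file, then report files whose hash later differs. This is a minimal illustration using only the Python standard library, not the tool used on M-grid; the function names are illustrative, and production systems typically rely on dedicated tools such as AIDE or Tripwire.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_baseline(paths):
    """Record the current digest of every file in paths."""
    return {str(p): file_digest(p) for p in paths}


def changed_files(baseline):
    """Return the sorted paths that are missing or whose content has changed."""
    changed = []
    for path, expected in baseline.items():
        if not Path(path).is_file() or file_digest(path) != expected:
            changed.append(path)
    return sorted(changed)
```

The baseline is only useful if it is stored where an intruder cannot rewrite it, for example on read-only media or on a separate host.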
Grid Security and CSIRTs
Making Grid Security compatible
Grids tend to interconnect, so we need compatible
security
Complex new technologies and
“fuzzy” virtual organisations in
“our hosts and networks”
International cooperation needed
Technical level – Management level
Reactive security - Proactive security
The risks haven’t materialized yet
Grid Incident handling
Existing CSIRTs should be used as
professional incident handling hubs
Constant and proactive knowledge transfer
needed between Grid administration,
CSIRTs and site administrators
The M-Grid Security Policy already contains a
paragraph:
The administrator, in consultation with CSC should also inform
Funet CERT ([email protected], tel. +358-94572038) if the incident affects other M-grid sites
Finally - Finnish security terminology :)
Information – Tieto
Security – turvallisuus
Incident – poikkeama
Many incidents – poikkeamia
The interrogative form – ~ko
Also – ~kin
Have there been – oliko
Have there also been any security incidents?
Oliko tietoturvapoikkeamiakinko?