Software Provisioning Inside a Secure Environment as Docker Containers
Abdulrahman Azab
5 May
Agenda
- TSD: Services for Sensitive Data
- TSD Software Provisioning: Options and Issues
- Docker
- Docker in a Secure Environment
- Accessing the HPC Cluster from a Docker Container
- Use Case: Galaxy Portal inside the TSD as a Docker Container
TSD: Tjenester for Sensitive Data (Services for Sensitive Data)

TSD: Services for Sensitive Data: Architecture
[Architecture diagram: each project (P01 ... Pn) has its own set of virtual machines (p01-u1 ... p01-um) on the HNAS file-system, users log in with two-factor authentication, and jobs run on the Colossus cluster (SLURM scheduler with compute element nodes) on the Colossus file-system.]
TSD Services for Sensitive Data: Storage

Path                   | Purpose                       | Physical location
/tsd/shared            | Shared                        | HNAS
/tsd/pXX               | Main project directory        | HNAS
/pXX/home/pXX-userXX   | User home directory           | HNAS, mounted on Colossus
/pXX/data/durable      | Data (backup support)         | HNAS
/pXX/data/no-backup    | Data (no backup)              | HNAS
/pXX/fx/import         | File I/O to/from TSD          | HNAS
/pXX/fx/export         | File I/O to/from TSD          | HNAS
/pXX/data/colossus     | Submission to SLURM           | HNAS, mounted on Colossus
/cluster               | Shared directory between jobs | Colossus
TSD Services for Sensitive Data: File import/export
[Diagram: pXX users exchange files with a virtual sluice server for pXX over SFTP (pXX/import and pXX/export on the sluice servers tsd-fx01 and tsd-fx02); a file-lock protocol moves the files over NFS to and from pXX/fx/import and pXX/fx/export on the HNAS file-system inside TSD.]
TSD Software Provisioning
TSD Software Provisioning: Options
- Import the package through the file lock and install it in your home directory.
- Install it as a module (for shared command-line tools).
- Install it in the shared project area.
- Install it on a VM, i.e. the project Windows VM or one of the Linux VMs, depending on the associated platform.
TSD Software Provisioning: Issues
Many software packages require downloading and updating package dependencies throughout the installation, which is not possible inside the closed TSD environment. Running a user-installed VM inside the TSD is not permitted.
A workaround is to study the installation wizard/script of the desired software package, document all the package dependencies that are downloaded during the installation process, download all of those packages manually to a local path, and modify the installation wizard/script to fetch the dependencies from that local path instead of from the external Internet links.
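As a sketch of this workaround for a Python-based tool (the package name mytool and the paths are illustrative assumptions, not from the slides):

# On a machine with Internet access: download the tool and every dependency it needs
$ pip download mytool -d ./mytool-deps/
$ tar czf mytool-deps.tar.gz mytool-deps/
# Import the archive into TSD through the file lock (/pXX/fx/import), then inside TSD:
$ tar xzf mytool-deps.tar.gz
$ pip install --no-index --find-links ./mytool-deps/ mytool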
TSD Software Provisioning: Issues
We need a way to securely package software so that it can run inside the TSD without having to install an entire VM.
Docker
What is Docker?
Docker is an open-source project that automates the deployment of applications inside software containers, by providing an additional layer of abstraction and automation of operating-system-level virtualization on Linux. [www.docker.com]
Docker vs. VM
[Diagram comparing the two stacks: virtual machines run applications on a full guest OS on top of a hypervisor, while Docker containers run applications directly on the host kernel through the Docker engine.]
Docker Technology
- LXC (LinuX Containers): multiple isolated Linux systems (containers) on a single host
- Layered file system
[Source: https://docs.docker.com/terms/layer/]
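The layered file system can be inspected directly; a quick illustration (the image is an example, not from the slides):

$ docker pull ubuntu:14.04
$ docker history ubuntu:14.04
# Each line of the output is one read-only layer; every Dockerfile instruction adds a layer,
# and a running container only adds a thin writable layer on top of them.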
Docker Run Platforms
- Various Linux distributions (Ubuntu, Fedora, RHEL, CentOS, openSUSE, ...)
- Cloud (Amazon EC2, Google Compute Engine, Rackspace)
- Windows, OS X: Boot2Docker
Installing Docker
$ sudo yum -y install docker-io
$ sudo yum -y update docker-io
$ sudo service docker start
Uninstalling Docker
$ sudo service docker stop
$ sudo rm -rf /var/lib/docker
$ sudo yum erase docker-io
Terminology – Image (borrowed)
Persisted snapshot that can be run
- images: List all local images
- run: Create a container from an image and execute a command in it
- pull: Download an image from a repository
- rmi: Delete a local image
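A minimal session with these image commands (the azab/bowtie2 repository name from the later example is reused here as an assumption):

$ docker pull ubuntu:14.04              # download an image from the repository
$ docker images                         # list all local images
$ docker run ubuntu:14.04 echo hello    # create a container from the image and run a command in it
$ docker rmi azab/bowtie2               # delete a local image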
Terminology – Container (borrowed)
Runnable instance of an image
- ps: List all running containers
- ps -a: List all containers (incl. stopped)
- top: Display the processes of a container
- start: Start a stopped container
- stop: Stop a running container
- pause: Pause all processes within a container
- rm: Delete a container
- commit: Create an image from a container
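A typical container life cycle, sketched with these commands (container and image names are illustrative):

$ docker run -d --name demo ubuntu:14.04 sleep 1000   # create and start a container
$ docker ps                                           # it shows up as running
$ docker top demo                                     # processes inside the container
$ docker stop demo
$ docker ps -a                                        # stopped containers are still listed here
$ docker start demo
$ docker commit demo demo-snapshot                    # turn the container into a new image
$ docker rm -f demo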
Daemon Container (borrowed)
Open a terminal in a container:
$ docker run -it ubuntu /bin/bash
Run as a daemon:
$ docker run -d [image] command
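Following up on a daemonized container (illustrative; all commands are standard Docker CLI):

$ docker run -d ubuntu /bin/sh -c "while true; do date; sleep 5; done"
$ docker ps                   # the container keeps running in the background
$ docker logs <container-id>  # inspect the output of the daemonized command
$ docker stop <container-id>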
Building a Docker Image
There are two ways to build an image:
- Interactive building: base image (disk) -> run -> container (memory) -> run the installation procedure -> commit -> new image (disk)
- Building from a Dockerfile: base image (disk) + Dockerfile (installation script) -> build -> new image (disk)
Interactive Building Example: Bowtie2
$ docker run -t -i ubuntu:14.04
root@2a896c8cdd83:/# apt-get update -qq --fix-missing
root@2a896c8cdd83:/# apt-get install -qq -y wget unzip
root@2a896c8cdd83:/# wget -q -O bowtie2.zip http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.4/bowtie2-2.2.4-linux-x86_64.zip/download
root@2a896c8cdd83:/# unzip bowtie2.zip -d /opt/
root@2a896c8cdd83:/# ln -s /opt/bowtie2-2.2.4/ /opt/bowtie2
root@2a896c8cdd83:/# rm bowtie2.zip
root@2a896c8cdd83:/# export PATH=$PATH:/opt/bowtie2
root@2a896c8cdd83:/# exit
$ docker commit -m "bowtie2-docker test" 2a896c8cdd83 azab/bowtie2
$ docker run -t azab/bowtie2 bowtie2 --version
/bowtie2/bowtie2-align version 2.1.0
64-bit
Built on do-dmxp-mac.win.ad.jhu.edu
Tue Feb 26 13:34:02 EST 2013
...
$ docker push azab/bowtie2
Dockerfile Example: Bowtie2
FROM ubuntu:14.04
MAINTAINER Enis Afgan <[email protected]>
RUN apt-get update -qq --fix-missing; \
    apt-get install -qq -y wget unzip
RUN wget -q -O bowtie2.zip http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.4/bowtie2-2.2.4-linux-x86_64.zip/download; \
    unzip bowtie2.zip -d /opt/; \
    ln -s /opt/bowtie2-2.2.4/ /opt/bowtie2; \
    rm bowtie2.zip
ENV PATH $PATH:/opt/bowtie2
Building a Docker Image from a Dockerfile
<source-directory> contains the Dockerfile, the files it needs, and optionally a .dockerignore file.
$ docker build -t <image-name> <source-directory>
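For example, to build the Bowtie2 Dockerfile shown above (the directory name and ignore pattern are illustrative):

$ mkdir bowtie2-build && cd bowtie2-build
# place the Dockerfile from the previous slide in this directory
$ echo "*.log" > .dockerignore          # anything matching .dockerignore is excluded from the build context
$ docker build -t azab/bowtie2 .        # "." uses the current directory as <source-directory>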
Mount Volumes (borrowed)
- Log to a host file
- Adapt the script to log to /log/hello3.log
- docker run -d -v /home/docker/log:/log [image] /bin/bash /sayHello.sh
- Run a second container: the volume can be shared
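A minimal sketch of this setup, assuming /sayHello.sh is baked into the image (the script and image name are illustrative, not the original slide's files):

# /sayHello.sh inside the image: append to the mounted log directory
#!/bin/bash
echo "Hello from $(hostname) at $(date)" >> /log/hello3.log

# First container writes into the host directory through the volume:
$ docker run -d -v /home/docker/log:/log my-hello-image /bin/bash /sayHello.sh
# A second container mounting the same host directory sees what the first one wrote:
$ docker run -v /home/docker/log:/log ubuntu cat /log/hello3.log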
Docker in a Secure Environment
Docker in a Secure Environment: Advantages
- Docker creates a set of namespaces and control groups for each container. Processes running within a container cannot influence, or even see, processes running in another container or in the host system.
- Each container has a separate network stack. Unless the host system is set up to allow containers to interact with each other through their respective network interfaces, no interaction can happen between containers.
- The use of Linux control groups ensures that each container gets a fair share of memory, CPU, and disk I/O. A single container cannot bring the system down by exhausting one of those resources.
- Containers make it easier to control which data and software components are installed, through the use of scripted instructions in setup files, i.e. Dockerfiles.
Docker in a Secure Environment: Issues
- The Docker daemon/engine always requires root privileges.
- Docker allows users, when running/starting a container, to mount directories from the host into the container, and it allows them to do so without limiting the access rights of the container.
- Docker does not, so far, provide each container with a separate user namespace, which means that it offers no user ID isolation.
- The isolation provided by Docker is not as robust as the segregation established by hypervisors for virtual machines. Docker containers, so far, have no hardware isolation, which VMs do have.
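To illustrate the host-mount issue: any user who can start containers can, for example, bind-mount a sensitive host directory read-write (illustrative command):

$ docker run -it -v /etc:/host-etc ubuntu /bin/bash
root@container:/# ls /host-etc        # the container's root user can read and modify host system files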
Proposals for a Secure Use of Docker Containers
Secure building of Docker images:
- The user writes a Dockerfile.
- The security officer approves the Dockerfile.
- The system administrator builds the Docker image on the secure infrastructure and deploys it to a local Docker repository.
Proposals for a Secure Use of Docker Containers
Secure running of Docker containers: instead of calling "docker run" directly, the user calls a restricted wrapper, docker-safe, which is the only Docker command granted through sudo (see the sketch below):

$ docker run ...        (not permitted)
$ docker-safe run ...   (permitted, via sudo)

/etc/sudoers.d/docker:
Cmnd_Alias DOCKERSAFE = /docker-safe
%docker-safe-group ALL=(ALL) NOPASSWD: DOCKERSAFE
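A minimal sketch of what such a /docker-safe wrapper could look like; this implementation is an assumption for illustration (the slides do not show it), and the allowed repository prefix and mount path are hypothetical:

#!/bin/bash
# /docker-safe -- restricted front end to "docker run" (illustration only)
SUBCMD="$1"; IMAGE="$2"; shift 2
if [ "$SUBCMD" != "run" ]; then
    echo "only 'run' is permitted" >&2; exit 1
fi
case "$IMAGE" in
    localrepo/*) ;;                                   # only images from the approved local repository
    *) echo "image not allowed: $IMAGE" >&2; exit 1 ;;
esac
# fixed, read-only bind mount of the project area; no user-chosen host mounts
exec docker run --rm -v /tsd/p01/data:/data:ro "$IMAGE" "$@"

A member of docker-safe-group would then call, e.g., "sudo /docker-safe run localrepo/bowtie2 bowtie2 --version".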
Accessing the HPC Cluster from a Docker Container
Galaxy as a Docker Container
[Diagram: a Docker container running on a VM issues $sbatch and gets "sbatch: command not found"; the SLURM scheduler and its worker nodes are outside the container. This is the isolation problem.]
Accessing the HPC Cluster from a Docker Container: Issues
- Isolation problem: the container cannot reach the cluster's scheduler.
- Mount everything? Then the isolation is meaningless.
Stroll File-system
- A universal file-system-based interface for seamless task submission to one or more HPC clusters.
- Users interact with the cluster through simple read and write file-system commands.
Stroll File-system
[Architecture diagram: Stroll is a virtual file system implemented on top of the Windows/Linux kernel file-system driver via CallBack/FUSE, with a user-mode library, an FS command handler, virtual storage, a broker, and grid driver/client components. The layers are: 1. Grid architecture (back end): Condor, UNICORE, HiMan, gLite, NorduGrid, Globus, or a local model; 2. Grid client(s): Condor_schedd, UCC, HIMAN-C, glite-swat-client, ARC Client, Globus-Client; 3. Stroll; 4. File-system interface: local (command prompt/shell on the user machine) or network (NFS/Samba, CIFS/SMB, via a STROLL server); 5. Grid consumer (front end): local users or script/native clients.]
Stroll File-system
#!/bin/sh
# Example with parallelPSM.R and 20 subjobs
cd /stroll
# Create the PSM job directory inside the virtual path /stroll
mkdir PSM
# Copy the task files into the job directory
cp ~/parallelPSM.R PSM
cp ~/inp* PSM
# Set the configuration parameters:
echo 'parallelPSM.R' > config/exec
echo 'R' > config/rt
echo 'true' > config/wait
echo '20' > config/subjobs
echo 'inp$(i)' > config/in
echo 'out$(i)' > config/out
echo 'M128' > config/req
# Submit the job
echo "submit" > PSM/control
# Wait until the 'echo' command returns, then collect the output
cp PSM/out* ~/
# Remove the virtual job directory
rm -r PSM
Stroll File-system
[Diagram: the Stroll FS runs on the host and exposes a watch/virtual path; the container writes job input data to that path and reads job output data back from it, while Stroll submits the jobs to the scheduler and its worker nodes through /cluster/.]
Galaxy Portal inside the TSD as a Docker Container
What is Galaxy?
- A web-based interface to command-line tools (of any kind) and their combinations ("workflows").
- Galaxy performs analyses interactively through the web, on arbitrarily large datasets.
- Galaxy remembers what it did (history).
- Flexibility to include anybody's command-line tools, by writing wrappers whose templates are available.
- An environment for sharing tools (or their wrappers): the "Tool Shed" repository.
Galaxy Tool Wrapping
[Diagram: the SMALT binary is wrapped by Smalt_wrapper.py and described to Galaxy by Smalt_wrapper.xml.]
What it looks like
[Screenshot of the Galaxy web interface.]
Galaxy as a Docker Container
[Diagram: core Galaxy runs inside the container; the host directory /home/user/galaxy-export/ is mounted as /export/ in the container and holds the tools and datasets.]
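A run command along these lines produces that layout (the image name is a placeholder; the slides do not name the exact Galaxy image used):

$ docker run -d -p 8080:80 -v /home/user/galaxy-export/:/export/ <galaxy-image>
# Everything the container keeps under /export/ (tools, datasets, configuration) is persisted
# in /home/user/galaxy-export/ on the host and survives container restarts.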
Galaxy as a Docker Container
[Diagram: inside the container, Galaxy uses a Stroll job runner; jobs and their input data are written to the Stroll watch/virtual path, the Stroll FS submits them through /cluster/ to the scheduler and its worker nodes, and the job output data comes back the same way.]