Software Provisioning Inside a Secure Environment as Docker Containers
Abdulrahman Azab
05, May

Agenda
- TSD: Services for Sensitive Data
- TSD Software Provisioning: Options and Issues
- Docker
- Docker in a Secure Environment
- Accessing the HPC Cluster from a Docker Container
- Use Case: Galaxy Portal inside the TSD as a Docker Container

TSD: Tjenester/Services for Sensitive Data

TSD: Services for Sensitive Data: Architecture
[Figure: TSD architecture. Labels: two-factor authentication; project VMs (p01-u1, p01-u2, ..., p01-um) for projects P01, P02, ..., Pn; HNAS file system; Colossus with its own file system, SLURM, and CE nodes.]

TSD Services for Sensitive Data: Storage

Path                              Purpose                   Physical location
/tsd/shared                       Shared                    HNAS
/tsd/pXX                          Main project directory    HNAS
/pXX/home/pXX-userXX              User home directory       HNAS, mounted on colossus
/pXX/data/durable                 Data (backup support)     HNAS
/pXX/data/no-backup               Data (no backup)          HNAS
/pXX/fx/import, /pXX/fx/export    I/O to/from TSD           HNAS
/pXX/data/colossus                Submission to SLURM       HNAS, mounted on colossus
/cluster                          Shared dir between jobs   colossus

TSD Services for Sensitive Data: File import/export
[Figure: file import/export. Labels: users; SFTP; sluice servers tsd-fx01 and tsd-fx02 (pXX virtual sluice server, with pXX/import and pXX/export); File-Lock protocol; NFS; TSD (pXX/fx/import, pXX/fx/export); HNAS file system.]

TSD Software Provisioning

TSD Software Provisioning: Options
- Import the package through the file-lock and install it in your home directory.
- Install as a module (for shared command-line tools).
- Install in the shared project area.
- Install on a VM, i.e. the project Windows VM or one of the Linux VMs, depending on the associated platform.

TSD Software Provisioning: Issues
- Many software packages require downloading and updating package dependencies throughout the installation.
- Running a user-installed VM inside the TSD is not permitted.
A solution is to study the installation wizard or script of the desired software package, document every dependency that is downloaded during installation, manually download those packages to a local path, and modify the wizard or script to fetch the dependencies from that local path instead of the original external links.

TSD Software Provisioning: Issues
- Software needs to be packaged securely so that it can run inside the TSD without installing an entire VM.

Docker

What is Docker?
Docker is an open-source project that automates the deployment of applications inside software containers by providing an additional layer of abstraction and automation of operating-system-level virtualization on Linux. [www.docker.com]

Docker vs. VM
[Figure: virtual machines compared with Docker containers.]

Docker Technology
- LXC (LinuX Containers): multiple isolated Linux systems (containers) on a single host
- Layered file system [Source: https://docs.docker.com/terms/layer/]

Docker Run Platforms
- Various Linux distributions (Ubuntu, Fedora, RHEL, CentOS, openSUSE, ...)
- Cloud (Amazon EC2, Google Compute Engine, Rackspace)
- Windows, OS X: Boot2Docker

Installing Docker
    $ sudo yum -y install docker-io
    $ sudo yum -y update docker-io
    $ sudo service docker start

Uninstalling Docker
    $ sudo service docker stop
    $ sudo rm -rf /var/lib/docker
    $ sudo yum erase docker-io

Terminology: Image (borrowed)
Persisted snapshot that can be run
- images: List all local images
- run: Create a container from an image and execute a command in it
- pull: Download an image from a repository
- rmi: Delete a local image

Terminology: Container (borrowed)
Runnable instance of an image
- ps: List all running containers
- ps -a: List all containers (incl. stopped)
- top: Display processes of a container
- start: Start a stopped container
- stop: Stop a running container
- pause: Pause all processes within a container
- rm: Delete a container
- commit: Create an image from a container

Daemon Container (borrowed)
- Open a terminal in a container: docker run -it ubuntu /bin/bash
- Run as a daemon: docker run -d [image] command

Building a Docker Image
[Figure: two ways to build an image. Interactive building: base image (disk) -> run -> container (memory), run the installation procedure -> commit -> new image (disk). Building from a Dockerfile: base image (disk) plus a Dockerfile holding the installation script -> load, build -> new image (disk).]

Interactive building example: Bowtie2
    $ docker run -t -i ubuntu:14.04
    root@2a896c8cdd83:/# apt-get update -qq --fix-missing
    root@2a896c8cdd83:/# apt-get install -qq -y wget unzip
    root@2a896c8cdd83:/# wget -q -O bowtie2.zip http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.4/bowtie2-2.2.4-linux-x86_64.zip/download
    root@2a896c8cdd83:/# unzip bowtie2.zip -d /opt/
    root@2a896c8cdd83:/# ln -s /opt/bowtie2-2.2.4/ /opt/bowtie2
    root@2a896c8cdd83:/# rm bowtie2.zip
    root@2a896c8cdd83:/# export PATH=$PATH:/opt/bowtie2
    root@2a896c8cdd83:/# exit
    $ docker commit -m "bowtie2-docker test" 2a896c8cdd83 azab/bowtie2
    $ docker run -t azab/bowtie2 /opt/bowtie2/bowtie2 --version
    /bowtie2/bowtie2-align version 2.1.0
    64-bit
    Built on do-dmxp-mac.win.ad.jhu.edu
    Tue Feb 26 13:34:02 EST 2013
    ...
The export PATH line only affects that interactive shell and is not recorded by docker commit, so the committed image is tested by calling the binary with its full path.
    $ docker push azab/bowtie2

Dockerfile example: Bowtie2
    FROM ubuntu:14.04
    MAINTAINER Enis Afgan <[email protected]>
    RUN apt-get update -qq --fix-missing; \
        apt-get install -qq -y wget unzip;
    RUN wget -q -O bowtie2.zip http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.4/bowtie2-2.2.4-linux-x86_64.zip/download; \
        unzip bowtie2.zip -d /opt/; \
        ln -s /opt/bowtie2-2.2.4/ /opt/bowtie2; \
        rm bowtie2.zip
    ENV PATH $PATH:/opt/bowtie2

Building a Docker Image from a Dockerfile
    <source-directory>/
        Dockerfile
        files
        .dockerignore
    $ docker build -t <image-name> <source-directory>

Mount Volumes (borrowed)
- Log to a host file: adapt the script to log to /log/hello3.log
    docker run -d -v /home/docker/log:/log [image] /bin/bash /sayHello.sh
- Run a second container: the volume can be shared

Docker in a Secure Environment

Docker in a Secure Environment: Advantages
- Docker creates a set of namespaces and control groups for each container. Processes running within a container cannot influence, or even see, processes running in another container or in the host system.
- Each container has a separate network stack. Unless the host system is set up to allow containers to interact with each other through their respective network interfaces, no interaction can happen between containers.
- Linux control groups ensure that each container gets a fair share of memory, CPU, and disk I/O, so a single container cannot bring the system down by exhausting one of those resources.
- Containers make it easier to control which data and software components are installed, through scripted instructions in setup files, i.e. Dockerfiles.

Docker in a Secure Environment: Issues
- The Docker daemon/engine always requires root privileges.
- Docker allows users, when running or starting a container, to mount directories from the host into the container, and it does so without limiting the access rights of the container.
- Docker does not, so far, provide each container with a separate user namespace, which means that it offers no user ID isolation.
- The isolation provided by Docker is not as robust as the segregation that hypervisors establish for virtual machines.
- Docker containers so far have no hardware isolation, unlike VMs.

Proposals for a Secure Use of Docker Containers: Secure Building of Docker Images
[Figure: secure build workflow inside the secure infrastructure. Roles: User, System Administrator, Security Officer. Steps: Dockerfile -> approve Dockerfile -> build Docker image -> deploy to the local Docker repository.]

Proposals for a Secure Use of Docker Containers: Secure Running of Docker Containers
User: "$ docker run" is replaced by "$ docker-safe run".
/etc/sudoers.d/docker:
    Cmnd_Alias DOCKERSAFE = /docker-safe
    %docker-safe-group ALL=(ALL) NOPASSWD: DOCKERSAFE

Accessing the HPC Cluster from a Docker Container
[Figure: Galaxy as a Docker container on a VM. Isolation problem: inside the container, "$ sbatch" fails with "sbatch: command not found"; the scheduler and its nodes (W) sit outside the container.]

Accessing the HPC Cluster from a Docker Container: Issues
- Isolation problem
- Mount everything? Then isolation is meaningless.

Stroll File-system
- A universal file-system-based interface for seamless task submission to one or more HPC clusters.
- Users interact with the cluster through simple read and write file-system commands.

Stroll File-system
[Figure: Stroll architecture. User mode: FS command handler, user-mode library, virtual storage, VFS engine, tasks, grid clients, broker, grid drivers. Kernel mode: Windows/Linux kernel file system driver (CallBack/FUSE). Interfaces: local/network FS interface, Stroll application, command prompt, shell, CIFS/SMB.]
[Figure: Stroll layers. 1. Grid architecture (back end): Condor, UNICORE, HiMan, gLite, NorduGrid, Globus. 2. Grid client(s): Condor_schedd, UCC, HIMAN-C, glite-swat-client, ARC Client, Globus-Client. 3. Stroll. 4. File-system interface: local model and network model (user machines, script, native local users, NFS/Samba, STROLL server). 5. Grid consumer (front end).]

Stroll File-system
    #!/bin/sh
    # Example with parallelPSM.R and 20 subjobs
    cd /stroll
    # Create the PSM job directory inside the virtual path: /stroll
    mkdir PSM
    # Copy task files into the job directory
    cp ~/parallelPSM.R PSM
    cp ~/inp* PSM
    # Set the configuration parameters:
    echo 'parallelPSM.R' > config/exec
    echo 'R' > config/rt
    echo 'true' > config/wait
    echo '20' > config/subjobs
    echo 'inp$(i)' > config/in
    echo 'out$(i)' > config/out
    echo 'M128' > config/req
    # Submit the job
    echo "submit" > PSM/control
    # Wait until the 'echo' command returns, then collect the output
    cp PSM/out* ~/
    # Remove the virtual job directory
    rm -r PSM

Stroll File-system
[Figure: host and container. /cluster/ carries job input and output data; a watch/virtual path connects the container to Stroll FS, which reaches the scheduler and its nodes (W).]

Galaxy Portal inside the TSD as a Docker Container

What is Galaxy?
- A web-based interface to command-line tools (of any kind) and their combinations ("workflows")
- Flexibility to include anybody's command-line tools by writing wrappers whose templates are available
- Galaxy performs analysis interactively through the web, on arbitrarily large datasets
- Galaxy remembers what it did: history
- An environment for sharing tools (or their wrappers): the "Tool Shed" repository

Galaxy Tool Wrapping
[Figure: the SMALT tool wrapped by Smalt_wrapper.py and Smalt_wrapper.xml.]

What it looks like
[Figure]

Galaxy as a Docker Container
[Figure: host and container. The container holds core Galaxy and mounts /export/; tools and datasets live on the host under /home/user/galaxy-export/.]

Galaxy as a Docker Container
[Figure: the container runs a Stroll job runner. /cluster/ carries job input and output data; a watch/virtual path connects to Stroll FS, which reaches the scheduler and its nodes (W).]
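
Sketch: host-side watch path (illustrative only)
The watch/virtual path in the last diagram, together with the "echo submit > PSM/control" step of the Stroll script, suggests the following pattern: the container only writes files, and a process on the host reacts to them. The sketch below imitates that idea with a plain polling function. It is not Stroll, which is a FUSE-based file system, and every name in it (WATCH_DIR, SUBMIT_CMD, job.sh, process_watch_dir) is a hypothetical choice made for illustration, including the use of sbatch as the host's submit command.

```shell
#!/bin/sh
# Illustrative stand-in for a watch path; NOT the Stroll implementation.
# All names below are assumptions made for this sketch.

WATCH_DIR="${WATCH_DIR:-/cluster/watch}"   # assumed directory shared with the container
SUBMIT_CMD="${SUBMIT_CMD:-sbatch}"         # assumed scheduler command available on the host

# Scan every job directory once and submit those whose control file says "submit".
process_watch_dir() {
    for job in "$WATCH_DIR"/*/; do
        [ -d "$job" ] || continue          # pattern did not match anything
        control="${job}control"
        [ -f "$control" ] || continue      # job directory without a control file
        if [ "$(cat "$control")" = "submit" ]; then
            # Mark the job first so a later scan does not submit it twice.
            echo "submitted" > "$control"
            # Left unquoted on purpose so SUBMIT_CMD may carry options.
            $SUBMIT_CMD "${job}job.sh"     # job.sh is a hypothetical job script name
        fi
    done
}

# On the host this could run as a simple loop:
# while true; do process_watch_dir; sleep 5; done
```

With this layout the container needs only the watch directory mounted; the scheduler client stays on the host, which avoids the "mount everything" problem named earlier.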