POWER7+ Accelerated Encryption and Random
Number Generation for Linux
Kent Yoder <[email protected]>
IBM Linux Technology Center
February 22, 2013
Contents
1 Introduction
2 Hardware Architecture
2.1 Encryption
2.2 Random Bit Generation
3 Software Architecture
4 Setting Up Your Linux System
4.1 Encryption and Hashing
4.2 RNG
5 Monitoring Use of the Accelerators
6 Applications
6.1 Disk Encryption
6.1.1 dm-crypt with LUKS
6.1.2 eCryptfs
6.2 IPSec
6.2.1 StrongSwan
7 Trademarks
1 Introduction
One of the new features included in the IBM® POWER7+™ CPU is a set of co-processors providing
high speed encryption and random number generation (RNG). Support for these new hardware
features is now available in Fedora Linux and will continue to be integrated into other community
Linux distributions. This white paper walks you through the hardware and software architecture
for the co-processors, discusses their setup on a Fedora 18 system, and explains how you can best
take advantage of them in various Linux applications.
2 Hardware Architecture
2.1 Encryption
The encryption and RNG accelerators are physically resident in each POWER7+ CPU. All CPU
cores in a chip share access to their own accelerator units. Each accelerator unit contains 3
processors responsible for Advanced Encryption Standard (AES) encryption and Secure Hash
Algorithm (SHA) hashing.
The hardware contains 16 I/O queues to handle incoming and outgoing requests, which are controlled
exclusively by the PowerVM® hypervisor.
2.2 Random Bit Generation
Hardware random numbers are delivered 64
bits at a time into an 8-entry buffer also
managed by the PowerVM hypervisor. Each
bit is independently generated using 64
ring-oscillators designed to run at their own
frequencies.
3 Software Architecture
Support for the accelerators is implemented
in the Linux kernel’s crypto API, where
requests for encryption are passed through
hypervisor calls to PowerVM. PowerVM is
then responsible for queuing and submitting
requests from all logical partitions to the
hardware.
Figure 1: Software architecture
The accelerators support the following modes
for AES:
• Galois Counter Mode (GCM)
• Counter Mode (CTR)
• Cipher Block Chaining Mode (CBC)
• Electronic Code Book Mode (ECB)
They support the following hash algorithms:
• SHA-256
• SHA-512
They also support one keyed MAC:
• AES-XCBC-MAC
4 Setting Up Your Linux System
Support for the POWER7+ accelerators is now included in Fedora 17 and 18 and may be included
in other distributions after the time of this writing. Any Linux distribution that supports the
POWER® architecture and includes kernel version 3.5 or newer running on PowerVM should be
able to make use of the accelerators.
You can query your kernel version with the uname command (shown here on a Fedora 18 system).
$ uname -r
3.6.10-4.fc18.ppc64p7
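The same check can be scripted. The following is a minimal sketch in Python, assuming the release string starts with the usual major.minor prefix; the helper name kernel_at_least is hypothetical and is not part of any tool described in this paper.

```python
import re


def kernel_at_least(release, major, minor):
    """Return True if a `uname -r` style string is at least major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    # Tuples compare element by element, so (3, 6) >= (3, 5) and (4, 0) >= (3, 5).
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)
```

On a running system, the release string can be taken from Python's platform.release(), which reports the same value as uname -r, and passed in as kernel_at_least(platform.release(), 3, 5).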
4.1 Encryption and Hashing
The Linux driver supporting the accelerators is called nx-crypto and can be loaded using the
modprobe command.
# modprobe nx-crypto
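To confirm that the module was loaded, one option is to look for it in /proc/modules. The sketch below assumes the module shows up there as nx_crypto (the kernel reports hyphens in module names as underscores); the function name module_loaded is hypothetical.

```python
def module_loaded(name, modules_text):
    """Return True if `name` appears as a module in /proc/modules content."""
    # Each line of /proc/modules starts with the module name.
    wanted = name.replace("-", "_")
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False
```

A caller would read /proc/modules and pass its content in, for example module_loaded("nx-crypto", open("/proc/modules").read()). A driver that is built into the kernel rather than loaded as a module does not appear in that file, so a False result is not conclusive on its own.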
Once the driver is loaded, you can see a list of its supported algorithms in /proc/crypto. Algorithms
whose driver property has the "-nx" suffix will be accelerated by the POWER7+ CPU. Here’s a
sample algorithm listed in the output of the command cat /proc/crypto.
name         : cbc(aes)
driver       : cbc-aes-nx
module       : nx_crypto
priority     : 300
refcnt       : 1
selftest     : passed
type         : blkcipher
blocksize    : 16
min keysize  : 16
max keysize  : 32
ivsize       : 16
geniv        : <default>
An explanation of these fields:
• name: The algorithm and mode written as "mode(alg)".
• driver: An indication of the driver implementing the algorithm. The "-nx" suffix indicates
that it is implemented by the nx-crypto driver.
• module: The kernel module for the algorithm’s driver.
• priority: The priority for use of the algorithm. When multiple modules provide an implementation
of the same algorithm, the highest priority module is tried first.
• refcnt: A count of the users of this algorithm. In the case of encrypted file systems like
eCryptfs, this count is incremented whenever a new file is opened, so the count can be very
large.
• selftest: An indication of the status of the automatic self test that runs before an algorithm
is used.
• type: The type of algorithm, such as block cipher, hash, and so forth. shash indicates
a synchronous hash operation. All algorithms accelerated by the POWER7+ CPU are
implemented as synchronous operations.
• blocksize: The number of bytes encrypted or hashed at a time for an algorithm.
• min keysize: The minimum key size in bytes for the algorithm.
• max keysize: The maximum key size in bytes for the algorithm.
• ivsize: The size in bytes of the initialization vector (IV) for the algorithm.
• geniv: The method for IV generation. Some algorithm modes use specific
methods to generate their IVs; all others are marked as <default>.
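The fields described above can be read programmatically to list which algorithms are accelerated. This is a minimal sketch, assuming /proc/crypto consists of "key : value" lines with a blank line between entries; the function names parse_proc_crypto and accelerated are hypothetical.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto content into one dict per algorithm entry."""
    entries = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def accelerated(entries):
    """Keep entries whose driver name carries the "-nx" suffix."""
    return [e for e in entries if e.get("driver", "").endswith("-nx")]
```

Passing the content of /proc/crypto through parse_proc_crypto and then accelerated yields the entries served by the nx-crypto driver. All values are kept as strings, so a field such as priority needs an int() conversion before it is compared numerically.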
4.2 RNG
The pseries-rng module is the driver for the random number generator. Once loaded, the driver
creates /dev/hwrng, which provides a read-only interface to the device.
By default, most applications on a Linux system pull random numbers from /dev/urandom.
/dev/urandom is a non-blocking source of randomness and is seeded using entropy
gathered inside the Linux kernel from sources such as keyboard activity, interrupt activity, and
other device activity. On IBM Power Systems™ servers, most logical partitions are installed
without direct connections to mice or keyboards, so their opportunities to gather entropy
from devices are limited. To overcome this issue, we can use rngd to pull entropy from the POWER7+
CPU and use it to seed /dev/urandom.
In Fedora, rngd is included in the rng-tools package. Once it is installed, you can use systemctl
to enable and activate the service.
# systemctl enable rngd.service
# systemctl start rngd.service
The rngd service continually reads entropy from the POWER7+ CPU through /dev/hwrng and
writes it out to /dev/random, which seeds the /dev/urandom entropy pool. As applications drain
random bytes from /dev/urandom, it is continually reseeded with entropy from the POWER7+
CPU.
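A quick way to confirm that the hardware generator is delivering data is to read a few bytes from the device directly. The sketch below only samples the device; it does not reproduce what rngd does. The function name read_device_bytes is hypothetical, and reading /dev/hwrng typically requires root privileges.

```python
def read_device_bytes(path, count):
    """Read exactly `count` bytes from a character device or file."""
    data = b""
    # buffering=0 avoids pulling more from the device than requested.
    with open(path, "rb", buffering=0) as device:
        while len(data) < count:
            chunk = device.read(count - len(data))
            if not chunk:
                raise IOError("short read from %s" % path)
            data += chunk
    return data
```

For example, read_device_bytes("/dev/hwrng", 16).hex() returns 16 hardware-generated bytes as a hexadecimal string, provided the pseries-rng driver is loaded.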
5 Monitoring Use of the Accelerators
debugfs support is included in the nx-crypto driver, which allows for real-time monitoring of the
use of the accelerators. After the nx-crypto driver is loaded, mount debugfs.
# mount -t debugfs debugfs /sys/kernel/debug
After debugfs is mounted, several special files are exposed under /sys/kernel/debug/nx-crypto/.
These files include:
• aes_bytes: The number of bytes encrypted and decrypted by all modes of AES.
• aes_ops: The number of requests sent to the accelerator for encryption or decryption in all
modes of AES.
• errors: The total number of errors returned by the hypervisor that could not be mitigated
by the driver. An example of an error mitigated by the driver is a retry of the operation if
H_BUSY is returned.
• last_error: The last error returned by the hypervisor that could not be mitigated by the
driver in response to a request sent by last_error_pid.
• last_error_pid: The process ID of the application that received last_error.
• sha256_bytes: The number of bytes hashed using SHA-256.
• sha256_ops: The number of SHA-256 requests sent to the accelerator.
• sha512_bytes: The number of bytes hashed using SHA-512.
• sha512_ops: The number of SHA-512 requests sent to the accelerator.
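These counters are easiest to interpret as differences between two points in time. The following sketch assumes that the files carry underscore-separated names such as aes_ops and aes_bytes and that each one holds a single decimal integer; the function names read_counters and counter_deltas are hypothetical.

```python
import os


def read_counters(directory):
    """Return {file name: integer value} for every counter in `directory`."""
    counters = {}
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as handle:
            counters[name] = int(handle.read().strip())
    return counters


def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Taking one snapshot of /sys/kernel/debug/nx-crypto before a workload and one after it, then passing both to counter_deltas, shows how many requests and bytes that workload sent to the accelerators.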
As mentioned previously, last_error contains the error value returned by the hypervisor only when
that error cannot be mitigated by the kernel. The most common errors returned are:
• H_BUSY (1): The hardware is busy, try again later.
• H_RESOURCE (4294967280): Insufficient hypervisor resources to perform the operation.
• H_NOT_ENOUGH_RESOURCES (4294967252): Hypervisor memory is exhausted.
When any of these errors are encountered, the driver re-submits the request to the hypervisor.
Only after 10 retries does the driver give up and log the error into last_error.
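The two large values listed above are what small negative numbers look like when printed as unsigned 32-bit integers: 4294967280 is 2^32 - 16 and 4294967252 is 2^32 - 44. The sketch below converts such a value back to its signed form, which makes it easier to match against a negative return code. The names KNOWN_ERRORS, to_signed32, and describe_error are hypothetical, and the table holds only the three errors listed in this section.

```python
# Unsigned values and names exactly as listed in this section.
KNOWN_ERRORS = {
    1: "H_BUSY",
    4294967280: "H_RESOURCE",
    4294967252: "H_NOT_ENOUGH_RESOURCES",
}


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a two's-complement integer."""
    return value - (1 << 32) if value >= (1 << 31) else value


def describe_error(value):
    """Return a readable label for a value read from the last error file."""
    name = KNOWN_ERRORS.get(value, "unknown")
    return "%s (%d)" % (name, to_signed32(value))
```

With these definitions, describe_error(4294967280) returns "H_RESOURCE (-16)" and describe_error(4294967252) returns "H_NOT_ENOUGH_RESOURCES (-44)".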
6 Applications
6.1 Disk Encryption
6.1.1 dm-crypt with LUKS
One method of encrypting storage on a Linux system is to leverage the device mapper for encrypting
a logical volume. dm-crypt is the device mapper kernel module to use, and keys for the volume are
managed with the cryptsetup utility. cryptsetup manages the Linux Unified Key Setup (LUKS)
header for your encrypted volume.
After the nx-crypto driver is loaded, the only requirement for enabling acceleration of dm-crypt is
to be sure that you are using the AES-CBC algorithm on your logical volume. To see the default
cipher setting, use cryptsetup.
$ cryptsetup --help | tail -4
Default compiled-in device cipher parameters:
loop-AES: aes, Key 256 bits
plain: aes-cbc-essiv:sha256, Key: 256 bits, Password hashing: ripemd160
LUKS1: aes-cbc-essiv:sha256, Key: 256 bits, LUKS header hashing: sha1, RNG: /dev/urandom
As encrypted files are read and written, you should be able to see the values for
/sys/kernel/debug/nx-crypto/aes_ops and /sys/kernel/debug/nx-crypto/aes_bytes increase.
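The check of the default cipher can be automated by parsing the cryptsetup --help output shown above. This is a minimal sketch that assumes the output keeps the layout in the listing ("format: cipher, ..." lines after the "Default compiled-in device cipher parameters" header); the function names default_ciphers and uses_aes_cbc are hypothetical.

```python
def default_ciphers(help_text):
    """Map each format name (plain, LUKS1, ...) to its default cipher spec."""
    ciphers = {}
    in_section = False
    for line in help_text.splitlines():
        if line.strip().startswith("Default compiled-in device cipher parameters"):
            in_section = True
            continue
        if in_section and ":" in line:
            fmt, _, rest = line.partition(":")
            # The cipher spec is the first comma-separated item on the line.
            ciphers[fmt.strip()] = rest.split(",")[0].strip()
    return ciphers


def uses_aes_cbc(cipher_spec):
    """Return True if a spec such as aes-cbc-essiv:sha256 selects AES-CBC."""
    return cipher_spec.startswith("aes-cbc")
```

For the listing above, default_ciphers returns "aes-cbc-essiv:sha256" for both the plain and LUKS1 formats, and uses_aes_cbc reports True for that value. The defaults only describe newly created volumes; an existing volume keeps the cipher it was formatted with.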
6.1.2 eCryptfs
eCryptfs is a Virtual File System (VFS) layer encrypted file system included with most Linux
distributions. Due to the large 4K encrypt and decrypt operations it uses when reading and
writing data to disk, it is a good candidate to benefit from the POWER7+ accelerators.
After the nx-crypto driver is loaded, the only requirement for enabling acceleration of eCryptfs is
to be sure that you are using the AES algorithm on your eCryptfs mount point. This is indicated
by the ecryptfs_cipher=aes mount option.
# mount | grep ecryptfs_cipher
/mnt/lower on /mnt/upper type ecryptfs (rw,...,ecryptfs_cipher=aes,...)
As encrypted files are read and written, you should be able to see the values for
/sys/kernel/debug/nx-crypto/aes_ops and /sys/kernel/debug/nx-crypto/aes_bytes increase.
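The mount option can also be extracted from the mount output in a script. The sketch below assumes the option is spelled ecryptfs_cipher and that each line follows the "device on mountpoint type fstype (options)" layout shown above; the function name ecryptfs_cipher is hypothetical.

```python
import re


def ecryptfs_cipher(mount_line):
    """Return the ecryptfs_cipher value from one line of `mount` output."""
    if " type ecryptfs " not in mount_line:
        return None
    # The value ends at the next comma, closing parenthesis, or whitespace.
    match = re.search(r"ecryptfs_cipher=([^,)\s]+)", mount_line)
    return match.group(1) if match else None
```

A result of "aes" for a mount point indicates that its encryption requests are eligible for acceleration; any other value, or None, means the mount does not use AES or is not an eCryptfs mount.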
6.2 IPSec
Several of the AES modes accelerated by the POWER7+ CPU are specifically made for use with
IPSec. Setup is as simple as configuring your IPSec implementation to use one of these modes.
6.2.1 StrongSwan
To configure StrongSwan so that its IKE and ESP encryption is offloaded to the POWER7+
accelerators, use these options in /etc/strongswan/ipsec.conf:
ike= or esp=    POWER7+ algorithm    Key Size
aes128          AES-CBC              128
aes192          AES-CBC              192
aes256          AES-CBC              256
aes128ctr       AES-CTR              128
aes192ctr       AES-CTR              192
aes256ctr       AES-CTR              256
aes128gcm8      AES-GCM              128
aes128gcm12     AES-GCM              128
aes128gcm16     AES-GCM              128
aes192gcm8      AES-GCM              192
aes192gcm12     AES-GCM              192
aes192gcm16     AES-GCM              192
aes256gcm8      AES-GCM              256
aes256gcm12     AES-GCM              256
aes256gcm16     AES-GCM              256
Table 1: StrongSwan cipher modes
For more detailed information about configuring StrongSwan ciphers, see
http://wiki.strongswan.org/projects/strongswan/wiki/IKEv2CipherSuites.
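The naming pattern in Table 1 is regular enough to check a configured cipher keyword in a script. This is a minimal sketch that covers only the keywords listed in Table 1; the function name classify_keyword is hypothetical, and it says nothing about the integrity or key exchange parts of a full proposal string.

```python
import re

# Mode suffix in the keyword -> algorithm name used in Table 1.
_MODES = {"": "AES-CBC", "ctr": "AES-CTR", "gcm": "AES-GCM"}


def classify_keyword(keyword):
    """Return (algorithm, key size in bits) for a Table 1 keyword, else None."""
    match = re.fullmatch(r"aes(128|192|256)(?:(ctr)|(gcm)(8|12|16))?", keyword)
    if match is None:
        return None
    mode = match.group(2) or match.group(3) or ""
    return _MODES[mode], int(match.group(1))
```

For example, classify_keyword("aes256gcm12") returns ("AES-GCM", 256), while a keyword outside the table, such as "3des", returns None and would not be offloaded to the accelerators.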
7 Trademarks
IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered
in many jurisdictions worldwide. Other product and service names might be trademarks of IBM
or other companies. A current list of IBM trademarks is available on the Web at "Copyright and
trademark information" at www.ibm.com/legal/copytrade.shtml.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.