TECHNICAL REPORT

ZFS Backup, Recovery and Cloning with the HGST Active Archive System

Roye Avidor
Technical Marketing Engineer, HGST
January 2016

Contents

Overview
Installation
Configuration
ZFS snapshot backup by examples
ZFS snapshot recovery by examples
ZFS incremental snapshot backup by examples
ZFS incremental snapshot recovery by examples
ZFS file system transfer

Overview

This paper shows you how to easily back up and recover ZFS file systems, using the HGST Active Archive System as a central repository.

The backup process makes a read-only copy of a ZFS file system state, using ZFS file system snapshots. These copies are stored in the HGST Active Archive System. The stored copies can then be used to recover corrupted file systems, roll back changes in test/dev environments, or transfer file systems to other servers.

Using the s3cmd utility, you can easily transfer data to and from the HGST Active Archive System, and manage the stored ZFS snapshot repository within it.

[Figure: ZFS agents send a backup stream to, and receive a recovery stream from, the HGST Active Archive System.]

Installation

Step 1: Install the ZFS file system (not covered in this paper).
Step 2: Install the s3cmd command line utility from http://s3tools.org, or use the prebuilt binaries from http://s3tools.org/repositories.
Step 3: Get the S3 endpoint hostname and port number for your HGST Active Archive System.
Step 4: Get credentials (an account) to access the HGST Active Archive System.

In this paper, our S3 endpoint is "s3.hgst.com" with port number 7730, using the credentials:

Username: zfsUser
Password: zfsSecret

Configuration

To configure the s3cmd command line utility to access our HGST Active Archive System, we first need to initialize its configuration file and then change a few configuration parameters. Configure s3cmd as shown below:

# s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: zfsUser
Secret Key: zfsSecret
Default Region [US]:

Encryption password:
Path to GPG program [/bin/gpg]:

Use HTTPS protocol [Yes]: No

HTTP Proxy server name: s3.hgst.com
HTTP Proxy server port [3128]: 7730

New settings:
  Access Key: zfsUser
  Secret Key: zfsSecret
  Default Region: US
  Encryption password:
  Path to GPG program: /bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name: s3.hgst.com
  HTTP Proxy server port: 7730

Test access with supplied credentials? [Y/n] n
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'
#

Change the authentication version from Version 4 to Version 2 by editing the .s3cfg file.

Before the change:
signature_v2 = False

After the change:
signature_v2 = True

Change the host_base and host_bucket parameters to point to our S3 endpoint by editing the .s3cfg file.

Before the change:
host_base = s3.amazonaws.com
host_bucket = %(bucket)s.s3.amazonaws.com

After the change:
host_base = s3.hgst.com
host_bucket = %(bucket)s.s3.hgst.com

The last step is to create a bucket to be used for the backup and recovery procedures. Later, you can create any number of buckets to manage the repository. You can organize different ZFS pools into different buckets, or organize buckets by server.

[root@zfs ~]# s3cmd mb s3://zfs-bucket
Bucket 's3://zfs-bucket/' created
[root@zfs ~]# s3cmd ls
2015-11-09 21:34  s3://zfs-bucket

From now on in this document, all S3 operations use s3://zfs-bucket as the bucket name.
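If you prefer to script this setup rather than edit the file by hand, the same .s3cfg changes can be applied non-interactively. The following is a minimal sketch, assuming the configuration file was created at /root/.s3cfg as shown above; adjust the path and endpoint names for your environment.

# Apply the .s3cfg edits described above (signature version 2, HGST endpoint).
sed -i \
    -e 's/^signature_v2 = .*/signature_v2 = True/' \
    -e 's/^host_base = .*/host_base = s3.hgst.com/' \
    -e 's/^host_bucket = .*/host_bucket = %(bucket)s.s3.hgst.com/' \
    /root/.s3cfg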
ZFS snapshot backup by examples

This section shows how to create a snapshot and back it up to the HGST Active Archive System over S3.

Start by creating a new ZFS file system:

[root@zfs ~]# zfs create pool1/myDataFS
[root@zfs ~]# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
pool1           1002M  95.4G  23.5K  /pool1
pool1/myDataFS    19K  95.4G    19K  /pool1/myDataFS
[root@zfs ~]# ls -l /pool1/myDataFS
total 0
[root@zfs ~]#

Copy some data into the newly created ZFS file system:

[root@zfs ~]# cp -pr /usr/share/dict /pool1/myDataFS/
[root@zfs ~]# ls -l /pool1/myDataFS/
total 1
drwxr-xr-x. 2 root root 4 Nov  5 08:10 dict

Create a new snapshot:

[root@zfs ~]# zfs snapshot pool1/myDataFS@Snap1
[root@zfs ~]# zfs list -t snapshot
NAME                  USED  AVAIL  REFER  MOUNTPOINT
pool1/myDataFS@Snap1     0      -  4.77M  -

Now, back up the snapshot to the HGST Active Archive System:

[root@zfs ~]# zfs send pool1/myDataFS@Snap1 | s3cmd put - s3://zfs-bucket/Snap1
'<stdin>' -> 's3://zfs-bucket/Snap1'  [part 1, 4MB]
 5043928 of 5043928   100% in    0s   7.55 MB/s  done

See how we redirected the zfs data stream to the s3cmd utility using a pipe ('|')?

ZFS snapshot recovery by examples

This section shows how to recover a stored ZFS snapshot from the HGST Active Archive System.

Let's simulate a corrupted ZFS file system by removing the entire file system from its pool:

[root@zfs ~]# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
pool1           1.01G  95.4G  27.5K  /pool1
pool1/myDataFS  37.2M  95.4G  37.2M  /pool1/myDataFS
[root@zfs ~]# zfs destroy -r pool1/myDataFS
[root@zfs ~]# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
pool1  1.01G  95.4G  27.5K  /pool1
[root@zfs ~]# ls /pool1/myDataFS
ls: cannot access /pool1/myDataFS: No such file or directory

To recover the file system, first list the stored snapshots:

[root@zfs ~]# s3cmd ls
2015-11-09 21:34  s3://zfs-bucket
[root@zfs ~]# s3cmd ls s3://zfs-bucket
2015-11-09 21:47   5043928  s3://zfs-bucket/Snap1

Recovering the ZFS file system takes a single command:

# s3cmd -F --no-progress get s3://zfs-bucket/Snap1 - | zfs recv pool1/myDataFS

Pay attention to the "--no-progress" switch. It tells the s3cmd utility not to print progress information to standard output; otherwise, s3cmd would write verbose information to STDOUT and corrupt the data stream piped to the zfs command.

Check the recovered file system:

[root@zfs ~]# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
pool1           1.01G  95.4G  27.5K  /pool1
pool1/myDataFS  37.2M  95.4G  37.2M  /pool1/myDataFS
[root@zfs ~]# ls -l /pool1/myDataFS/
total 1
drwxr-xr-x. 2 root root 4 Nov  5 08:10 dict

How easy was that?
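The backup and recovery pipelines above lend themselves to small shell helpers. The sketch below is an illustration only, not part of the ZFS or s3cmd tooling; the function names are hypothetical and the bucket name is carried over from the examples, so adjust both to your environment.

#!/bin/bash
# Minimal helpers around the zfs send/recv and s3cmd pipelines shown above.
BUCKET="s3://zfs-bucket"

backup_snapshot() {            # usage: backup_snapshot pool1/myDataFS@Snap1
    local snap="$1"
    # Store the snapshot stream under its snapshot name, e.g. Snap1.
    zfs send "$snap" | s3cmd put - "$BUCKET/${snap##*@}"
}

restore_snapshot() {           # usage: restore_snapshot Snap1 pool1/myDataFS
    local object="$1" target="$2"
    # --no-progress keeps s3cmd status output from corrupting the piped stream.
    s3cmd --no-progress get "$BUCKET/$object" - | zfs recv "$target"
}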
ZFS incremental snapshot backup by examples

An incremental backup stream is defined between two specific snapshots. Before creating the second snapshot, let's add new data to the file system, so that each snapshot holds a different state of the ZFS file system:

[root@zfs ~]# ls -l /pool1/myDataFS/
total 1
drwxr-xr-x. 2 root root 4 Nov  5 08:10 dict
[root@zfs ~]# cp -pr /usr/share/man /pool1/myDataFS/
[root@zfs ~]# ls -l /pool1/myDataFS/
total 2
drwxr-xr-x.  2 root root  4 Nov  5 08:10 dict
drwxr-xr-x. 47 root root 47 Nov  9 12:58 man

Create the second snapshot:

[root@zfs ~]# zfs list -t snapshot
NAME                  USED  AVAIL  REFER  MOUNTPOINT
pool1/myDataFS@Snap1   10K      -  4.77M  -
[root@zfs ~]# zfs snapshot pool1/myDataFS@Snap2
[root@zfs ~]# zfs list -t snapshot
NAME                  USED  AVAIL  REFER  MOUNTPOINT
pool1/myDataFS@Snap1   10K      -  4.77M  -
pool1/myDataFS@Snap2     0      -  37.2M  -

Back up the incremental stream between Snap1 and Snap2, and store it:

# zfs send -i pool1/myDataFS@Snap1 pool1/myDataFS@Snap2 | s3cmd put - s3://zfs-bucket/Snap1_Snap2
'<stdin>' -> 's3://zfs-bucket/Snap1_Snap2'  [part 1, 15MB]
 15728640 of 15728640   100% in    1s   9.17 MB/s  done
'<stdin>' -> 's3://zfs-bucket/Snap1_Snap2'  [part 2, 15MB]
 15728640 of 15728640   100% in    1s   9.66 MB/s  done
'<stdin>' -> 's3://zfs-bucket/Snap1_Snap2'  [part 3, 15MB]
 15728640 of 15728640   100% in    1s   8.94 MB/s  done
'<stdin>' -> 's3://zfs-bucket/Snap1_Snap2'  [part 4, 15MB]
 15728640 of 15728640   100% in    1s   8.84 MB/s  done
'<stdin>' -> 's3://zfs-bucket/Snap1_Snap2'  [part 5, 7MB]
 7502992 of 7502992   100% in    0s   8.36 MB/s  done

Here, we used the "-i" option, which takes two snapshots and produces an incremental data stream between them.

ZFS incremental snapshot recovery by examples

To show how a ZFS file system can be recovered using incremental snapshots, we will again corrupt the local file system by destroying it. Execute the destroy command to remove the pool1/myDataFS file system:

[root@zfs ~]# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
pool1           1.01G  95.4G  27.5K  /pool1
pool1/myDataFS  37.2M  95.4G  37.2M  /pool1/myDataFS
[root@zfs ~]# zfs destroy -r pool1/myDataFS
[root@zfs ~]# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
pool1  1.01G  95.4G  27.5K  /pool1
[root@zfs ~]# ls /pool1/myDataFS
ls: cannot access /pool1/myDataFS: No such file or directory

To apply an incremental snapshot, we first need to recover the base snapshot from which the incremental stream was taken. In our example, the incremental stream was taken between snapshot Snap1 and snapshot Snap2, so we first recover Snap1:

# s3cmd ls s3://zfs-bucket
2015-11-09 22:07  70417552  s3://zfs-bucket/Snap1_Snap2
2015-11-09 21:47   5043928  s3://zfs-bucket/Snap1
# s3cmd -F --no-progress get s3://zfs-bucket/Snap1 - | zfs recv pool1/myDataFS
# ls -l /pool1/myDataFS/
total 1
drwxr-xr-x. 2 root root 4 Nov  5 08:10 dict

After recovering the base snapshot, apply the incremental snapshot:

# s3cmd -F --no-progress get s3://zfs-bucket/Snap1_Snap2 - | zfs recv pool1/myDataFS
[root@zfs ~]# ls -l /pool1/myDataFS/
total 2
drwxr-xr-x.  2 root root  4 Nov  5 08:10 dict
drwxr-xr-x. 47 root root 47 Nov  9 12:58 man
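The incremental pipeline can be wrapped the same way. The following sketch is our own illustration; the function name and the From_To object-naming convention are assumptions taken from the examples above, not part of ZFS or s3cmd.

#!/bin/bash
# Back up an incremental stream between two snapshots of the same file system.
BUCKET="s3://zfs-bucket"

backup_incremental() {         # usage: backup_incremental pool1/myDataFS Snap1 Snap2
    local fs="$1" from="$2" to="$3"
    zfs send -i "$fs@$from" "$fs@$to" | s3cmd put - "$BUCKET/${from}_${to}"
}

# To recover, replay the base snapshot first, then the incremental object:
#   s3cmd --no-progress get s3://zfs-bucket/Snap1 - | zfs recv pool1/myDataFS
#   s3cmd --no-progress get s3://zfs-bucket/Snap1_Snap2 - | zfs recv pool1/myDataFS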
ZFS file system transfer

A ZFS file system transfer is similar to the recovery steps, but this time we use two different servers. This is a simple way to clone file systems for test and development. These are the steps we take:

1. On the source server, create a snapshot and store it on the HGST Active Archive System as the object s3://zfs-bucket/SNAP-current, using the same zfs send | s3cmd put pipeline shown in the backup section.

2. On the target server, create a file system from the stored snapshot:

target# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
pool2   55K  96.4G    19K  /pool2
target# s3cmd --no-progress get s3://zfs-bucket/SNAP-current - | zfs recv pool2/MyDataFS
target# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
pool2           37.3M  96.4G    21K  /pool2
pool2/MyDataFS  37.2M  96.4G  37.2M  /pool2/MyDataFS
target# ls -l /pool2/MyDataFS/
total 2
drwxr-xr-x.  2 root root  4 Nov  5 08:10 dict
drwxr-xr-x. 47 root root 47 Nov  9 12:58 man

This process can easily be scripted or automated, so that a test and development workflow can take advantage of the quick snapshot capabilities of ZFS while storing its data in a highly durable repository; a minimal sketch of such a workflow follows.
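The sketch below shows what such automation could look like, split into a publish step on the source server and a clone step on the target. It is an illustration only; the function names, the SNAP-current object key, and the pool and file system names are assumptions carried over from this paper's examples.

#!/bin/bash
# Sketch: publish the current state of a file system from the source server,
# then clone it onto a target pool. Names follow the examples in this paper.
set -euo pipefail

BUCKET="s3://zfs-bucket"
OBJECT="SNAP-current"

# Run on the source server:
publish_current() {            # usage: publish_current pool1/myDataFS
    local fs="$1" snap
    snap="$fs@clone-$(date +%Y%m%d%H%M%S)"
    zfs snapshot "$snap"
    zfs send "$snap" | s3cmd put - "$BUCKET/$OBJECT"
}

# Run on the target server:
clone_from_archive() {         # usage: clone_from_archive pool2/MyDataFS
    local target_fs="$1"
    s3cmd --no-progress get "$BUCKET/$OBJECT" - | zfs recv "$target_fs"
}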
© 2016 HGST, Inc. 3403 Yerba Buena Road, San Jose, CA 95135 USA. Produced in the United States 01/16. All rights reserved. HGST trademarks are intended and authorized for use only in countries and jurisdictions in which HGST has obtained the rights to use, market and advertise the brand. Contact HGST for additional information. HGST shall not be liable to third parties for unauthorized use of this document or unauthorized use of its trademarks. Other trademarks are the property of their respective owners. References in this publication to HGST's products, programs, or services do not imply that HGST intends to make these available in all countries in which it operates. Product specifications provided are sample specifications and do not constitute a warranty. Information is true as of the date of publication and is subject to change. Actual specifications for unique part numbers may vary. Please visit the Support section of our website, www.hgst.com/support/systems-support, for additional information on product specifications. Photographs may show design models.

TR01-ZFS-Backup-Recovery-Cloning-EN-0116-1