
Building Disaster Recovery
Serviceguard Solutions Using
Metrocluster with Continuous
Access EVA P6000 for Linux
Part Number: 710335-006
Published: November 2016
© Copyright 2015 Hewlett Packard Enterprise Development LP
The information contained herein is subject to change without notice. The only warranties for Hewlett Packard Enterprise products and services
are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting
an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.
Confidential computer software. Valid license from Hewlett Packard Enterprise required for possession, use, or copying. Consistent with FAR
12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed
to the U.S. Government under vendor's standard commercial license.
Links to third-party websites take you outside the Hewlett Packard Enterprise website. Hewlett Packard Enterprise has no control over and is not
responsible for information outside the Hewlett Packard Enterprise website.
Acknowledgments
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
Contents
1 Introduction..........................................................................................................5
Overview of P6000/EVA and P6000 Continuous Access Concepts.....................................................5
Copy sets.........................................................................................................................................5
Data Replication Groups (DR Groups).............................................................................5
Write modes....................................................................................................................................6
DR Group write history log..............................................................................................................7
Failover............................................................................................................................................7
Failsafe mode..................................................................................................................................8
Failsafe on Link-down/Power-up.....................................................................................................8
Overview of a Metrocluster with P6000 Continuous Access configuration...........................................9
P6000 Continuous Access Management Software............................................................................10
2 Configuring an application in a Metrocluster environment ...............................11
Installing the necessary hardware and software................................................................................11
Setting up the storage hardware ..................................................................................................11
Installing the necessary software..................................................................................................11
Creating the cluster.............................................................................................................................11
Site Aware Failover Configuration.................................................................................................11
Setting up the replication....................................................................................................................12
Creating VDISKs and DR groups using P6000 Command View....................................12
Using Storage System Scripting Utility (SSSU).............................................................................13
Creating the Metrocluster with Continuous Access EVA P6000 for Linux Map for the replicated
storage ..........................................................................................................................................14
Defining Management Server and SMI-S information .............................................................14
Creating the Management Server list.................................................................................15
Creating the Management Server Mapping file..................................................................15
Setting a default Management Server................................................................................15
Displaying the list of Management Servers.........................................................................15
Adding or updating Management Server information.........................................................16
Deleting a Management Server..........................................................................................16
Defining P6000/EVA storage cells and DR groups................................................16
Creating the Storage Map file.............................................................................................17
Displaying information about storage devices....................................................................17
Copying the Map file......................................................................................................................18
Configuring volume groups.................................................................................................................18
Identifying the device special files for Vdisks in a DR group...................................................18
Identifying special device files..................................................................................................18
Configuring LVM volume group using Metrocluster with Continuous Access EVA P6000 for
Linux..............................................................................................................................................19
Creating LVM volume groups ..................................................................................................19
Configuring VMware VMFS Disk..............................................................................................20
Installing and configuring an application.............................................................................................21
Configuring a Metrocluster package .............................................................................................21
Setting up Disk Monitoring.......................................................................................................23
Configuring a Metrocluster package using Serviceguard Manager.........................................23
3 Metrocluster features.........................................................................................27
Data replication storage failover preview............................................................................................27
Rolling upgrade for Metrocluster.........................................................................................................27
Live Application Detach .....................................................................................................................28
4 Understanding failover/failback scenarios.........................................................29
Metrocluster package failover/failback scenarios ..............................................................................29
5 Administering Metrocluster................................................................................33
Adding a node to Metrocluster ...........................................................................................................33
Maintaining EVA P6000 Continuous Access replication in Metrocluster............................................33
P6000 Continuous Access Link failure scenarios.........................................................................33
Planned maintenance....................................................................................................................34
Node maintenance...................................................................................................................34
Metrocluster package maintenance.........................................................................................34
Failback.........................................................................................................................................34
Administering Metrocluster with Serviceguard Manager....................................................................34
Rolling upgrade...................................................................................................................................35
Upgrading Metrocluster replication software.................................................................................35
Limitations of the rolling upgrade for Metrocluster........................................................................35
Upgrading replication management software................................................................................35
Upgrading the OpenPegasus WBEM Services for Metrocluster with Continuous Access EVA
P6000 for Linux........................................................................................................................35
6 Troubleshooting.................................................................................................36
Troubleshooting Metrocluster.............................................................................................................36
Metrocluster log.............................................................................................................................36
P6000/EVA storage system log.....................................................................................................36
7 Support and other resources.............................................................................37
Accessing Hewlett Packard Enterprise Support.................................................................................37
Accessing updates..............................................................................................................................37
Websites.............................................................................................................................................37
Customer self repair...........................................................................................................................38
Remote support..................................................................................................................................38
Documentation feedback....................................................................................................................38
A Checklist and worksheet for configuring a Metrocluster with Continuous Access
EVA P6000 for Linux............................................................................................39
Disaster Recovery Checklist...............................................................................................................39
Cluster Configuration Worksheet........................................................................................................39
Package Configuration Worksheet.....................................................................................................40
P6000/EVA Configuration Checklist...................................................................................................41
B Package attributes for Metrocluster with Continuous Access EVA P6000 for
Linux.....................................................................................................................42
C smiseva.conf file...............................................................................................44
D mceva.conf file..................................................................................................45
E Identifying the devices to be used with packages.............................................47
Identifying the devices created in P6000/EVA....................................................................................47
Glossary...............................................................................................................48
Index.....................................................................................................................51
1 Introduction
This document describes how to configure data replication solutions using HPE P6000/EVA disk
arrays to provide disaster recovery for Serviceguard clusters over long distances. It also gives
an overview of the P6000 Continuous Access software and the additional files that integrate
P6000/EVA disk arrays with Metrocluster.
Overview of P6000/EVA and P6000 Continuous Access Concepts
P6000 Continuous Access provides remote data replication from primary P6000/EVA storage
systems to remote P6000/EVA storage systems. P6000 Continuous Access uses the remote-copy
function of the Hierarchical Storage Virtualization (HSV) controller running the controller software
(VCS or XCS) to achieve host-independent remote data replication.
Remote replication is the continuous copying of data from selected virtual disks on a source
(local) array to related virtual disks on a destination (remote) array. Virtual disks (Vdisks) are
user-defined allotments of virtual or logical data storage. Applications continue to run
while data is replicated in the background. Remote replication requires a fabric connection
between the source and destination arrays and a logical grouping between source virtual disks
and destination virtual disks.
This section describes some basic remote replication concepts. The topics discussed are:
• Copy sets
• Data Replication Groups (DR Groups)
• Write modes
• DR group write history log
• Failover
• Failsafe mode
• Failsafe on Link-down/Power-up
Copy sets
A pairing relationship can be created to automatically replicate a logical disk from the source
array to another logical disk in the destination array. A pair of source and destination virtual disks
that have a replication relationship is called a copy set. A Vdisk does not have to be part of a copy
set. Vdisks at any site can be set up for local storage and used for activities such as testing and
backup. Clones and snapclones are examples of Vdisks used in this manner. When a Vdisk is
not part of a copy set, it is not disaster tolerant, but it can use various Vraid types for failure
tolerance.
Data Replication Groups (DR Groups)
A DR group is a logical group of virtual disks in a remote replication relationship between two
arrays. DR groups operate in a paired relationship, with one DR group being a source and the
other a destination. Hence a DR group can be thought of as a collection of copy sets. The terms
source and destination are sometimes referred to as a DR mode or DR role. Hosts write data to
the virtual disks in the source array, and the array copies the data to the virtual disks in the
destination array. I/O ordering is maintained across the virtual disks in a DR group, ensuring I/O
consistency on the destination array in the event of a failure of the source array. All virtual disks
used for replication must belong to a DR group, and a DR group must contain at least one Vdisk.
When a DR group is first created, a full copy normalization occurs to copy all the data in the DR
group from the source array to the destination array, bringing the source and destination Vdisks
into synchronization. Normalizations copy data from the source array to the destination array in
128 KB blocks.
The replicating direction of a DR group is always from a source to a destination. In bidirectional
replication, an array can have both source and destination virtual disks that will reside in separate
DR groups. That is, one virtual disk cannot be both a source and destination simultaneously.
Bidirectional replication enables you to use both arrays for primary storage while they provide
disaster protection for another site.
The remote copy feature is intended not only for disaster recovery, but also to replicate data from
one storage system or physical site to another storage system or site. It also provides a method
for performing a backup at either the source or destination site.
P6000 Continuous Access has the ability to suspend and resume replication. Some versions of
controller software support auto suspend when a full copy of the DR group is required. This
feature can be used to protect the data at the destination site by delaying the full copy operation
until a snapshot or snapclone of the data has been made. See the HPE P6000 EVA Compatibility
Reference available at http://www.hpe.com/support/manuals —> storage -> Storage Software
-> Storage Replication Software -> HP P6000 Command View Software to determine if your
array supports this feature.
IMPORTANT: Metrocluster with Continuous Access EVA P6000 for Linux does not support
enabling the auto suspend on full copy feature.
Write modes
The remote replication write modes are as follows:
• Synchronous — The array acknowledges I/O completion after the data is cached on both
  the local and destination arrays.
• Asynchronous — The array acknowledges I/O completion before data is replicated on the
  destination array. Asynchronous write mode can be basic or enhanced, depending on the
  software version of the controller.
  ◦ Basic Asynchronous mode — An I/O completion acknowledgement is sent to the host
    immediately after data is written to the cache at the source controller, but before the
    data is delivered to the cache on the destination controller. There is no requirement to
    wait for the I/O completion acknowledgement from the destination controller.
  ◦ Enhanced Asynchronous mode — The host receives an I/O completion acknowledgement
    after the data is successfully written to the disk-based log/journal that queues the writes;
    this occurs after the data is written to the local cache. The asynchronous replication
    process reads the I/O from the journal and replicates it, using current methodologies,
    to the destination P6000/EVA.
You can specify the replication write mode when you create DR groups. The choice of write
mode, which is a business decision, has implications for bandwidth requirements and the
recovery point objective (RPO). Synchronous mode provides greater data currency because the
RPO will be zero. Asynchronous mode provides faster response to server I/O, but at the risk of
losing data queued at the source side if a site disaster occurs. For complete information on which
write modes are supported on each version of controller software, see the HPE P6000 EVA
Compatibility Reference available at http://www.hpe.com/support/manuals -> storage -> Storage
Software -> Storage Replication Software -> HP P6000 Command View Software.
IMPORTANT: This product supports only Synchronous replication mode and the Enhanced
Asynchronous replication mode and does not support basic asynchronous replication mode. For
more information on supported arrays, see the Disaster Tolerant Clusters Products Compatibility
Feature Matrix available at: http://www.hpe.com/info/linux-serviceguard-docs.
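As an illustration, the write mode is a property of the DR group and can be set from SSSU. The
following is a minimal sketch, assuming a DR group named \Data Replication\DRG_DB1 and a
controller software version whose set DR_GROUP command accepts a writemode parameter
(verify against the SSSU reference for your VCS/XCS version):
# /sbin/sssu
select manager <ms_hostname/ip> user=<username> pass=<password>
select system DC-1
set DR_GROUP "\Data Replication\DRG_DB1" writemode=synchronous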
DR Group write history log
The DR group write history log is a virtual disk that stores a DR group's host write data. The log
is created when you create the DR group. Once the log is created, it cannot be moved.
In synchronous mode or basic asynchronous mode, the DR group write history log stores data
when replication to the destination DR group is stopped because the destination DR group is
unavailable or suspended. This process is called logging. When replication resumes, the contents
of the log are sent to the destination virtual disks in the DR group. This process of sending I/Os
contained in the write history log to the destination array is called merging. Because the data is
written to the destination in the order that it was written to the log, merging maintains an
I/O-consistent copy of the DR group's data at the destination.
When using synchronous mode or basic asynchronous mode, if logging occurs because replication
has been suspended or the replication links have failed, the size of the log file expands in
proportion to the amount of writes. The size of the log file can increase only up to the user-specified
maximum value or to the controller's software default maximum value. You can set the maximum
size for the DR group write history log while in synchronous mode. The size of the log cannot be
changed while in basic asynchronous mode. You must change the write mode to synchronous,
change the log file size, and then return to basic asynchronous mode.
In synchronous mode and basic asynchronous mode, the log grows as needed when the DR
group is logging and it shrinks as entries in the log are merged to the remote array. The controller
considers the log disk full when one of the following occurs:
• No free space remains in the disk group that contains the log disk.
• The log disk reaches 2 TB of Vraid1 space.
• The log reaches the default or user-specified maximum log disk size.
In enhanced asynchronous mode, the DR group write history log acts as a buffer and stores the
data until it can be replicated. The consumption of the additional capacity required for the log
should not be viewed as missing capacity—it is capacity used to create the log.
The DR group write history log file size is set when you transition the DR group to enhanced
asynchronous mode. The space for the DR group write history log must be available on both the
source and destination arrays before the DR group is transitioned to enhanced asynchronous
mode. Once set, the space is reserved for the DR group write history log and cannot be reduced
in size. If necessary, you can reclaim allocated log disk space from a DR group in enhanced
asynchronous mode. You must first change the write mode to synchronous and then use the log
control feature to reduce the log size. When the log content has been drained, you can return
the DR group to enhanced asynchronous mode. Until the DR group is returned to enhanced
asynchronous mode, the DR group operates in synchronous mode, which may impact
performance. Allocated log file space is not decreased when DR group members are removed.
Log space usage will increase when members are being added to an existing DR group unless
the size of the log disk has reached the maximum of 2 TB or has been fixed to a user-defined
value. For details on maximum and default log sizes for different replication modes, see the HPE
P6000 EVA Compatibility Reference available at http://www.hpe.com/support/manuals—>
storage -> Storage Software -> Storage Replication Software -> HP P6000 Command View
Software.
When a write history log overflows, the controller invalidates the log contents and marks the DR
group for normalization to bring the source and destination arrays back into synchronization.
NOTE: When the replication mode is manually changed from asynchronous to synchronous
mode, the state is displayed as Run Down.
Failover
In P6000 Continuous Access replication, failover reverses replication direction for a DR group.
The destination array assumes the role of the source, and the source array assumes the role of
the destination. The process can be planned or unplanned. A planned failover allows an orderly
shutdown of the system before the redundant system takes over. An unplanned failover occurs
when a failure or outage occurs that may not allow an orderly transition of roles.
NOTE: Failover can take other forms:
• Controller failover — The process that occurs when one controller in a pair assumes the
  workload of a failed or redirected controller in the same array.
• Fabric or path failover — I/O operations transfer from one fabric or path to another.
Failsafe mode
Failsafe mode is only available when a DR group is being replicated in synchronous mode and
specifies how host I/O is handled if data cannot be replicated between the source and destination
array. The failsafe mode can be one of the following:
• Failsafe enabled — All host I/O to the DR group is stopped if data cannot be replicated
  between the source array and destination array. This ensures that both arrays will always
  contain the same data (RPO of zero). A failsafe-enabled DR group can be in one of two
  states:
  ◦ Locked (failsafe-locked) — Host I/O and remote replication have stopped because data
    cannot be replicated between the source and destination array.
  ◦ Unlocked (failsafe-unlocked) — Host I/O and remote replication have resumed once
    replication between the arrays is re-established.
• Failsafe disabled — If replication of data between the source and destination array is
  interrupted, the host continues writes to the source array, but all remote replication to the
  destination array stops and I/Os are put into the DR group write history log until remote
  replication is re-established.
NOTE: Failsafe mode is available only in synchronous write mode. Host I/O can be
recovered by changing affected DR groups from failsafe-enabled mode to failsafe-disabled
mode. This action will begin logging of all incoming writes to the source member of the Data
Replication group.
Metrocluster with Continuous Access EVA P6000 for Linux does not support enabling Failsafe
Mode.
Failsafe on Link-down/Power-up
Failsafe on Link-down/Power-up is a setting that specifies whether or not virtual disks in a DR
group are automatically presented to hosts after a power-up (reboot) of the source array when
the links to the destination array are down and the DR group is not suspended. This prevents a
situation where the virtual disks in a DR group are presented to servers on the destination array
following a failover and then the virtual disks on the source array are also presented when it
reboots. Values for Failsafe on Link-down/Power-up are as follows:
• Enabled — Virtual disks in a source DR group are not automatically presented to hosts. This
  is the default value assigned to a DR group when it is created. This behavior is called
  presentation blocking and provides data protection under several circumstances. Host
  presentation remains blocked until the destination array becomes available (and can
  communicate with the source array) or until the DR group is suspended.
• Disabled — Virtual disks in a source DR group are automatically presented to hosts after a
  controller reboot.
NOTE: Metrocluster with Continuous Access EVA P6000 for Linux does not support
enabling Failsafe on Link-down/Power-up.
For more information on remote data replication concepts and planning a remote replication
solution, see HPE P6000 Continuous Access Implementation Guide available at http://
www.hpe.com/support/manuals—>storage -> Storage Software -> Storage Replication Software
-> HP P6000 Continuous Access Software.
Overview of a Metrocluster with P6000 Continuous Access configuration
A Metrocluster is configured with the nodes at Site A and Site B. When Site A and Site B form a
Metrocluster, a third location is required where the Quorum Server or arbitrator nodes must be
configured. There is a P6000/EVA storage system at each site, and the two are connected to
each other through Continuous Access links.
An application is deployed in a Metrocluster by configuring it at both sites. The sites are
referred to as either DC1 or DC2 for an application, based on their role. Typically, the application
runs at the DC1 site, which is the primary site. If there is a disaster at the DC1 site, the application
automatically fails over to the recovery site, referred to as the DC2 site.
NOTE: DC1 and DC2 are application-specific roles of a site.
For each application, either synchronous or enhanced asynchronous mode replication is configured
to replicate data between the two sites using a DR group. In a typical configuration, more than one
application is configured to run in a Metrocluster. Depending on the application distribution in a
Metrocluster environment, some applications can have Site A as their DC1 while other
applications can have Site B as their DC1.
Figure 1 Sample Configuration of Metrocluster with Continuous Access EVA P6000 for
Linux
Figure 1 depicts an example of two applications distributed in a Metrocluster with Continuous
Access EVA P6000 for Linux environment balancing the server and replication load. In this
example, Site A is the primary site or DC1 for application A, and recovery site or DC2 for
application B. Site B is the primary site or DC1 for application B, and recovery site or DC2 for
application A.
P6000 Continuous Access Management Software
Metrocluster with Continuous Access EVA P6000 for Linux requires the following two software
components to be installed on the Management Server:
• P6000 Command View (earlier known as HP StorageWorks HP P6000 Command View).
  This software component allows you to configure and manage the storage and DR groups
  via a web browser interface.
• The SMI-S (Storage Management Interface Specification) EVA software, which provides the
  interface for the management of the P6000/EVA. Metrocluster with Continuous Access EVA
  P6000 for Linux software uses the OpenPegasus WBEM API to communicate with SMI-S to
  automatically manage the DR Groups that are used in the application packages.
2 Configuring an application in a Metrocluster environment
Installing the necessary hardware and software
When the following procedures are complete, an adoptive node will be able to access the data
belonging to a package after it fails over.
Setting up the storage hardware
1. Before you configure this product, you must correctly cable the P6000/EVA with redundant
   paths to each node in the cluster that will run packages accessing data on the array.
2. Install and configure the hardware components of the P6000/EVA, including HSV controllers,
   disk arrays, SAN switches, and the Management Server.
3. Install and configure P6000 Command View and SMI-S EVA on the Management Server.
   For the installation and configuration process, see the HPE P6000 Command View Installation
   Guide.
Installing the necessary software
Before you configure a Metrocluster, make sure the following software is installed on all nodes:
• Serviceguard for Linux A.12.00.00 or later
• Metrocluster with Continuous Access EVA P6000 for Linux
See the Release Notes and Compatibility and Feature Matrix available at http://www.hpe.com/
info/linux-serviceguard-docs for the latest patches available for the above mentioned software.
Creating the cluster
NOTE: The file /etc/cmcluster.conf contains the mappings that resolve symbolic
references to $SGCONF, $SGROOT, $SGLBIN, etc. used in the pathnames in the following
subsections. If the Serviceguard variables are not defined on your system, then include the file
/etc/cmcluster.conf in your login profile for the root user. For more information on these
parameters, see Understanding the Location of Serviceguard Files and Enabling Serviceguard
Command Access in Managing HPE Serviceguard A.12.00.30 for Linux available at
http://www.hpe.com/info/linux-serviceguard-docs.
Create a Serviceguard cluster with components on multiple sites according to the process
described in Managing HPE Serviceguard A.12.00.30 for Linux available at http://
www.hpe.com/info/linux-serviceguard-docs.
NOTE: A configuration with a Lock Lun is not supported in a Metrocluster environment.
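For illustration, a minimal sketch of the cluster creation commands, assuming the four hypothetical
nodes from the site sample below and a Quorum Server reachable as qs_host (verify the options
against Managing HPE Serviceguard for Linux):
# cmquerycl -v -C $SGCONF/cluster.config -n SFO_1 -n SFO_2 -n SJC_1 -n SJC_2 -q qs_host
# cmapplyconf -C $SGCONF/cluster.config
# cmruncl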
Site Aware Failover Configuration
A Serviceguard cluster allows sites to be configured in a Metrocluster environment. The Serviceguard
cluster configuration file includes the attributes shown in Table 1 (page 11) to define
sites. This enables the use of the package failover policies site_preferred_manual or
site_preferred for Metrocluster packages.
Table 1 Site Aware Failover Configuration

Attribute    Description
SITE_NAME    Defines a unique name for a site in the cluster.
SITE         The SITE keyword under a node's NODE_NAME definition assigns that node to a site.
The following is a sample of the site definition in a Serviceguard cluster configuration file:
SITE_NAME san_francisco
SITE_NAME san_jose
NODE_NAME SFO_1
SITE san_francisco
.....
NODE_NAME SFO_2
SITE san_francisco
........
NODE_NAME SJC_1
SITE san_jose
.......
NODE_NAME SJC_2
SITE san_jose
........
Use the cmviewcl command to view the list of sites that are configured in the cluster and their
associated nodes. The following is a sample of the command and its output:
# cmviewcl -l node
SITE_NAME san_francisco
NODE STATUS STATE
SFO_1 up running
SFO_2 up running
.........
SITE_NAME san_jose
NODE STATUS STATE
SJC_1 up running
SJC_2 up running
You can configure either of these failover policies for Metrocluster failover packages. To use
these policies, you must specify site_preferred or site_preferred_manual for the
failover_policy attribute in the Metrocluster package configuration file.
NOTE: For a Metrocluster package, Hewlett Packard Enterprise recommends that you set the
failover_policy parameter to site_preferred.
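For illustration, this is how the policy appears in the package configuration file (a minimal sketch
showing only the relevant line; the surrounding package attributes are omitted):
failover_policy site_preferred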
Setting up the replication
Creating VDISKs and DR groups using P6000 Command View
The P6000 Command View is a web-based tool to configure, manage, and monitor virtual disks
and DR groups as shown in Figure 2 (page 13).
Figure 2 P6000/EVA Management Console using P6000 Command View
For more information on setting up P6000 Command View for configuring, managing, and
monitoring the P6000/EVA Storage System, see the HPE P6000 Command View User Guide
available at http://www.hpe.com/support/manuals -> storage -> Storage Software -> Storage
Device Management Software -> HP StorageWorks HP P6000 Command View Software.
Using the Command View (CV) web user interface, create the VDISKS, create a DR group using
those VDISKS, and present the VDISKS to the connected hosts. After a DR group is created, set
the destination host access field to Read only using the Command View GUI.
NOTE: In the Metrocluster environment, it is required that the destination volume access mode
of the DR group be set to Read only mode.
In some earlier Command View software versions, when a DR group is created, only the source
volume (primary volume) is visible and accessible with Read/Write mode. The destination volume
(secondary volume) by default is not visible and accessible to its local hosts. The destination
volume access mode must be changed to Read only mode before the DR group can be used,
and the destination volumes must be presented to their local hosts.
Using Storage System Scripting Utility (SSSU)
HPE Storage System Scripting Utility (SSSU) is a utility to manage and monitor P6000/EVA
storage arrays. Using SSSU, you can create VDISKS and DR groups to use with Metrocluster
packages. For more information about SSSU commands, see the sample input file available at:
$SGCONF/mccaeva/Samples/sssu_sample_input
The contents of the sample input file are listed below. In the following sample file, DC-1 is the
name of the source array.
select manager 15.13.244.182 user=administrator pass=administrator
select system DC-1
set DR_GROUP "\Data Replication\DRG_DB1" accessmode=readonly
ls DR_GROUP "\Data Replication\DRG_DB1"
After you create the VDISKS and DR groups, perform the following steps to copy and edit the
sample input file:
1. Copy the sample file sssu_sample_input to the desired location.
# cp $SGCONF/mccaeva/Samples/sssu_sample_input <desired location>
2. Customize the file sssu_sample_input.
3. After you customize the sssu_input file, run the SSSU command as follows to set the
   destination Vdisk to read-only mode.
   # /sbin/sssu "FILE <sssu_input_file>"
4. After changing the access mode of the destination Vdisk, create the special device file names
   for the Vdisks on the P6000/EVA: run the /usr/bin/rescan-scsi-bus.sh command to
   detect and activate the disks, and then run the lsscsi command to display the configured
   disks.
NOTE: The lsscsi command is available in the lsscsi package in the respective
OS repository.
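For illustration, a typical sequence on a cluster node might look like the following; the device
names and output shown are hypothetical and will differ in your environment:
# /usr/bin/rescan-scsi-bus.sh
# lsscsi
[2:0:0:1]   disk    HP    HSV340   0005   /dev/sdj
[2:0:1:1]   disk    HP    HSV340   0005   /dev/sdk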
Creating the Metrocluster with Continuous Access EVA P6000 for Linux Map for
the replicated storage
The mapping file caeva.map must be present on all the cluster nodes for the Metrocluster
with Continuous Access EVA P6000 for Linux product to function. The mapping file caeva.map
contains information about the Management Servers as well as about the P6000/EVA
storage systems and DR Groups used in the Metrocluster environment.
The Metrocluster with Continuous Access EVA P6000 for Linux product provides two utility tools,
smispasswd and evadiscovery, for users to provide information about the SMI-S servers
running on the Management Servers and the DR groups that will be used in the Metrocluster
environment. These tools must be used to create or modify the map file caeva.map.
smispasswd
Metrocluster retrieves storage information from the SMI-S server for its startup and failover
operations. To contact the SMI-S server, it requires the SMI-S server's hostname/IP address,
username/password, and port number. This information must be available in the caeva.map file
before you configure any Metrocluster packages. The smispasswd utility, packaged along with
Metrocluster, must be used to create or edit this map file with the above SMI-S server login
information.
evadiscovery
When querying P6000/EVA storage states through the SMI-S, Metrocluster first needs to find
the internal device IDs. This process takes time. However, this process need not be repeated
because the IDs are static in the P6000/EVA system. This information is cached in the caeva.map
file to improve package startup time. To cache the internal device IDs, run the evadiscovery
tool after the P6000/EVA and P6000 Continuous Access are configured and the storage is
accessible from the hosts. The tool queries the active Management Server for the needed
information, and then saves it in caeva.map. Once the map file is created, it must be distributed
to all the cluster nodes so that they can communicate with the P6000/EVA units.
NOTE: It is important to set the active Management Server before executing the evadiscovery
tool. For details on setting the active Management Server, see "Setting a default Management
Server" (page 15).
Defining Management Server and SMI-S information
To define Management Server and SMI-S information, use the smispasswd tool. The following
sections describe the options for defining Management Server and SMI-S information:
Creating the Management Server list
On a host that resides in the same data center as the active Management Server, create the
Management Server list using an input file. To create the list, follow these steps:
1. Create a configuration input file. A template of this file is available at the following location
   for Red Hat and SUSE. For an example of the smiseva.conf file, see "smiseva.conf file"
   (page 44).
   The smiseva.conf file is available at:
   $SGCONF/mccaeva/smiseva.conf
2. Copy the template file smiseva.conf to the desired location.
   # cp $SGCONF/mccaeva/smiseva.conf <desired location>
3. For each Management Server in your configuration (both local and remote sites), enter the
   Management Server's hostname or IP address, the administrator login name, the type of
   connection (secure or non-secure), the SMI-S namespace, and the SMI-S port.
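As an illustration only, each Management Server entry carries the fields from step 3. The layout
below is an assumption (port 5988 is the standard non-secure WBEM port), so follow the template
in "smiseva.conf file" (page 44) for the actual syntax:
# <hostname/ip>          <username>      <secure y|n>  <namespace>  <port>
mgmtsrv1.example.com     administrator   n             root/EVA     5988
mgmtsrv2.example.com     administrator   n             root/EVA     5988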
Creating the Management Server Mapping file
Use the smispasswd command to create or modify the Management Server information stored
in the mapping file.
For each Management Server listed in the file, a password prompt is displayed. Because of the
security protocol for P6000/EVA, a username and password are created by your system
administrator when the Management Server is configured. Enter the password associated with
the username of the SMI-S, and then re-enter it (as prompted) to verify that it is correct.
# smispasswd -f <desired location>/smiseva.conf
Enter password of <hostname1/ip_address1>: **********
Re-enter password of <hostname1/ip_address1>: **********
Enter password of <hostname2/ip_address2>: **********
Re-enter password of <hostname2/ip_address2>: **********
All the Management Server information has been successfully generated.
NOTE: The desired location is where the modified smiseva.conf file is located.
For more information on configuring the username and password for SMI-S on the management
server, see the HPE P6000 Command View Installation Guide.
After the passwords are entered, the configuration is written to the caeva.map map file located
at:
$SGCONF/mccaeva
Setting a default Management Server
Use the smispasswd command to set the active Management Server that will be used by the
evadiscovery tool.
For example:
# smispasswd -d <hostname/ip_address>
The Management Server <hostname/ip_address> is set as the default active
SMI-S.
Displaying the list of Management Servers
Use the smispasswd command to display the current list of storage management servers that
are accessible by the cluster software. Example:
# smispasswd -l
MC/CAEVA Server list:
HOST         USERNAME       USE_SSL   NAMESPACE
---------------------------------------------------------------------
Host1:Port   administrator  N         root/EVA
Host2:Port   administrator  N         root/EVA
Adding or updating Management Server information
To add or update individual Management Server login information in the map file, use the
command options shown in Table 2:
smispasswd -h <hostname/ip_address> -n <namespace> -p <port> -u <user_name> -s <y|n>
Table 2 Individual Management Server information

-h <hostname/ip_address>
    Either a DNS-resolvable hostname or the IP address of the Management Server.
-n <namespace>
    The namespace configured for the SMI-S CIMOM (Common Information Model Object
    Manager, a key component that routes information between providers and clients). The
    default namespace is root/EVA.
-p <port>
    The port on which the SMI-S server listens. This attribute is optional and is used when the
    SMI-S server does not listen on the default ports.
-u <user_name>
    The user name used to connect to SMI-S. The user name and password are the same as
    those used with the sssu tool.
-s <y|n>
    Specifies the type of connection to be established between the Metrocluster software and
    the SMI-S CIMOM.
    "y" allows a secure connection to the Management Server using the HTTPS protocol
    (HTTP using Secure Socket Layer encryption).
    "n" means a secure connection is not required.
This command adds a new record if it does not find the <hostname/ip_address> in the mapping
file. Otherwise, it only updates the record.
For example:
# smispasswd -h <hostname/ip_address> -u administrator -n root/EVA -s
y
Enter password: **********
Re-enter password: ********
A new information has been successfully created
Deleting a Management Server
To delete a Management Server from the group used by the cluster, use the smispasswd
command with -r option. Example:
# smispasswd -r <hostname/ip_address>
The Management Server <hostname/ip_address> has been successfully removed
from the file
Defining P6000/EVA storage cells and DR groups
On the node where access to the SMI-S server is configured, define the P6000/EVA storage and
DR Group information to be used in the Metrocluster environment. The Metrocluster software
requires the internal device IDs to query the P6000/EVA storage states. The evadiscovery tool
is a Command Line Interface (CLI) that provides functions for defining P6000/EVA storage cells
and DR group information. The tool caches the internal device IDs in the caeva.map file by
querying and searching for a list of device information. This caching improves performance
when querying P6000/EVA storage states through the SMI-S whenever required.
To use the evadiscovery tool:
1. Create a configuration input file. This file will contain the names of storage pairs and DR
   groups. A template of the configuration input file is available at the following location for
   Red Hat and SUSE.
   $SGCONF/mccaeva/mceva.conf
2. Copy the template file to the desired location.
   # cp $SGCONF/mccaeva/mceva.conf <desired location>
3. For each pair of storage units, enter the WorldWideName (WWN). The WWN is available
   on the front panel of the P6000/EVA controller or from the P6000 Command View user
   interface.
4. For each pair of storage units, enter the names of all DR groups that are managed by that
   storage pair.
5. Save the file.
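For illustration, a completed file pairs the two storage WWNs and lists the DR groups they manage.
The layout below is an assumption (the template at $SGCONF/mccaeva/mceva.conf and
"mceva.conf file" (page 45) are authoritative); the WWNs and DR group names are the sample
values used elsewhere in this document:
# Storage pair (DC1 WWN, DC2 WWN)
50001FE15007DBA0 50001FE15007DBD0
# DR groups managed by this storage pair
\Data Replication\drg_cc
\Data Replication\drg_1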
Creating the Storage Map file
After completing the P6000/EVA storage and DR Groups configuration file, use the
evadiscovery utility to create or modify the storage map file.
# evadiscovery -f <desired location>/mceva.conf
Verifying the storage systems and DR Groups .........
Generating the mapping data .........
Adding the mapping data to the file /opt/cmcluster/conf/mccaeva/caeva.map .........
The mapping data is successfully generated.
NOTE: The desired location is where you have placed the modified mceva.conf file.
The command generates the mapping data and stores it in the caeva.map file.
The mapping file caeva.map contains information about the Management Servers and about
the P6000/EVA storage cells and DR Groups.
Displaying information about storage devices
Use the evadiscovery command to display information about the storage systems and DR
groups in your configuration. For example:
# evadiscovery -l
MC EVA Storage System and DR Groups map list:
Storage WWN: 50001FE15007DBA0
DR Group Name: \Data Replication\drg_cc
DR Group Name: \Data Replication\drg_1
Storage WWN: 50001FE15007DBD0
DR Group Name: \Data Replication\drg_cc
DR Group Name: \Data Replication\drg_1
NOTE: Run the evadiscovery tool after all the storage DR Groups are configured or when
there is any change to the storage device. For example, if the user removes and recreates a DR
group that is used by an application package, the DR Group's internal IDs are regenerated by
the P6000/EVA system. If the name of any storage system or DR group is changed, update the
external configuration file, run the evadiscovery utility, and redistribute the map file caeva.map
to all Metrocluster clustered nodes.
Starting from release B.12.00.00 and B.01.00.02 (B.01.00.00 + SGLX_00461-463) onwards, for
all changes including addition or removal of a disk, update the external configuration file, run the
evadiscovery utility, and redistribute the map file caeva.map to all Metrocluster clustered
nodes.
Before running the evadiscovery command, the Management Server configuration must be
completed using the smispasswd command; otherwise the evadiscovery command fails.
Copying the Map file
After running the smispasswd and evadiscovery commands to generate the caeva.map
file, copy this file to the same location on all cluster nodes so that the map file can be used by
this product to communicate with the storage arrays.
Copy the caeva.map file to all nodes in the cluster.
# scp $SGCONF/mccaeva/caeva.map node:$SGCONF/mccaeva/caeva.map
Similarly, whenever new Management Servers are added or the existing server credentials are
changed, regenerate and redistribute the map file to all Metrocluster clustered nodes.
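For example, a short loop can push the map file to the remaining nodes; the node names here
are the hypothetical ones from the earlier site sample:
# for node in SFO_2 SJC_1 SJC_2; do scp $SGCONF/mccaeva/caeva.map $node:$SGCONF/mccaeva/caeva.map; done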
Configuring volume groups
This section describes the required steps to create a volume group for use in a Metrocluster
environment.
To configure volume groups, you must first identify the device special files for the source and
target VDisks in the DR Group. After that, create volume groups for source volumes, export them
for access by other nodes and import them on all other cluster nodes.
Identifying the device special files for Vdisks in a DR group
To configure volume groups, you must get the device special files for the DR Group's source
and target VDisks on the nodes in both sites.
See "Identifying the devices to be used with packages" (page 47) for identifying device
filenames for Vdisks.
NOTE: While using the Cluster Device Special Files (cDSF) feature, the device special file name
is the same on all nodes for a source and target VDisk.
Identifying special device files
The following is sample output of the evainfo command in a Linux environment.
Table 3 evainfo command output

# evainfo -d /dev/sdj
Devicefile  Array                WWNN                                     Capacity  Controller/Port/Mode
/dev/sdj    5000-1FE1-5007-DBA0  6005-08B4-0010-78F1-0002-4000-0143-0000  1024MB    Ctl-B/FP-4/NonOptimized
For more information on using the evainfo tool, see HPE P6000 Evainfo Release Notes.
Use HPE P6000 Command View to identify the WWN for a Vdisk. The WWN identifier of the
Vdisk as displayed in HPE P6000 Command View is shown in Figure 3.
Figure 3 P6000 Command View for the WWN identifier
Configuring LVM volume group using Metrocluster with Continuous Access EVA
P6000 for Linux
LVM storage can be used in Metrocluster. The following section shows how to set up an LVM
volume group. Before you create volume groups, you can create partitions on the disks, and you
must enable activation protection for logical volume groups, preventing a volume group from
being activated by more than one node at the same time. For more information on creating
partitions and enabling activation protection for logical volume groups, see Managing HPE
Serviceguard A.12.00.30 for Linux available at http://www.hpe.com/info/linux-serviceguard-docs.
Creating LVM volume groups
To create volume groups:
1. Create LVM physical volumes on each LUN.
   # pvcreate -f /dev/sda1
2. Create the volume group on the source volume.
   # vgcreate --addtag $(uname -n) /dev/<vgname> /dev/sda1
3. Create the logical volume (XXXX indicates the size in MB).
   # lvcreate -L XXXX -n lvol1 /dev/<vgname>
4. Create a file system on the logical volume.
   # mke2fs -j /dev/<vgname>/lvol1
5. If required, deactivate the volume group on the primary system and remove the tag.
   # vgchange -a n <vgname>
   # vgchange --deltag $(uname -n) <vgname>
   NOTE: Use the vgchange --deltag command only if you are implementing
   volume-group activation protection. Remember that volume-group activation protection, if
   implemented, must be done on every node.
6. Run the vgscan command on all the nodes to make the LVM configuration visible, and to
   create the LVM database.
   # vgscan
7. On the source disk site, run the following commands on all the other systems that might run
   the Serviceguard package. If required, take a backup of the LVM configuration.
   # vgchange --addtag $(uname -n) <vgname>
   # vgchange -a y <vgname>
   # vgcfgbackup <vgname>
   # vgchange -a n <vgname>
   # vgchange --deltag $(uname -n) <vgname>
8. To verify the volume group configuration on the target disk site, fail over the DR group:
   a. Select the remote storage system from the HPE P6000 Command View.
   b. Select the desired destination Data Replication group, and then click Fail Over.
      This makes the destination VDisk the SOURCE.
9. On the target disk site, run the following commands on all the systems that might run the
   Serviceguard package. If required, take a backup of the LVM configuration.
   # vgchange --addtag $(uname -n) <vgname>
   # vgchange -a y <vgname>
   # vgcfgbackup <vgname>
   # vgchange -a n <vgname>
   # vgchange --deltag $(uname -n) <vgname>
10. To fail the DR group back:
    a. Select the local storage system from the HPE P6000 Command View.
    b. Select the desired destination Data Replication group, and then click Fail Over. This
       makes the destination VDisk the SOURCE.
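As a worked example of steps 1 through 4 with hypothetical names (volume group vgpkg1 and
a 1024 MB logical volume lvol1 created on /dev/sda1):
# pvcreate -f /dev/sda1
# vgcreate --addtag $(uname -n) vgpkg1 /dev/sda1
# lvcreate -L 1024 -n lvol1 vgpkg1
# mke2fs -j /dev/vgpkg1/lvol1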
Configuring VMware VMFS Disk
Metrocluster with Continuous Access EVA P6000 for Linux supports VMware Virtual Machine
File System (VMFS) based disks (VMDK) for application use. For more information about VMware
VMFS, see Managing HPE Serviceguard for Linux A.12.00.51 available at http://www.hpe.com/
info/linux-serviceguard-docs. Details about how to deploy the VMFS feature in a disaster recovery
environment are provided in the next section.
Before you apply or verify the Metrocluster package configuration file, ensure that the volume
group uses the replication group as follows:
• Each disk that is part of the replication group in the storage array must be part of the datastore
  as mentioned in the package. If one of the disks is not part of the datastore, a warning
  message is displayed.
• Each disk in the datastore must be part of the replication group in the storage array as
  mentioned in the package.
Prerequisites for configuring VMware VMFS disk for Metrocluster with Continuous Access EVA P6000
To configure a disk for Metrocluster with Continuous Access EVA P6000 using VMFS:
1. Create a replication group using the storage-specific command.
2. Create LUNs in the replication group on both the Read-Only and Read-Write sites.
3. Export these LUNs to the ESXi hosts on the Read-Write and Read-Only sites.
4. Create a datastore on the ESXi host on the Read-Write site using the exported LUN.
5. Create VMDK disks on the datastore.
For more information about storage-specific commands, see the documents available at
http://www.hpe.com/support/manuals.
Prerequisite for configuring RDM disk for Metrocluster with Continuous Access EVA P6000 for Linux
If you need to configure a disk using RDM, the datastore can reside either on a disk that is part
of the replication group or on a different disk.
In the configuration where the datastore resides on the replication group, you must follow the
steps described in the Prerequisites for configuring VMware VMFS disk for Metrocluster with
Continuous Access EVA P6000 section.
A configuration where the datastore resides on a different disk that is not part of any replication
group is not supported in a Dynamically Linked Storage (DLS) configuration. An error message
is displayed when you try to configure it:
ERROR: The disk with WWID WWID, used for the Datastore DS_NAME is not
part of the Data Replication group CAEVA_DR_GROUP_NAME.
A configuration where the datastore resides on a different disk that is not part of any replication
group is supported in a Statically Linked Storage (SLS) configuration. For more information
about VMware DLS and SLS configurations, see Managing HPE Serviceguard for Linux
A.12.00.51 available at http://www.hpe.com/info/linux-serviceguard-docs.
Installing and configuring an application
You must replicate only the disks that contain application data, and not the disks that contain
application binaries and configuration files. The following section describes how to configure a
package for the application.
Configuring a Metrocluster package
To create a Metrocluster modular package, do the following:
1. Run the following command to create a Metrocluster modular package configuration file:
   # cmmakepkg -m dts/mccaeva temp.config
   If the Metrocluster package uses the Oracle Toolkit, then add the corresponding toolkit module.
   For example, for a Metrocluster Oracle toolkit modular package, run the following command:
   # cmmakepkg -m dts/mccaeva -m tkit/oracle/oracle temp.config
   NOTE: Metrocluster is usually used with applications such as Oracle. So, the application
   toolkit module must also be included when Metrocluster is used in conjunction with an
   application. You must make sure to specify the Metrocluster module before specifying the
   toolkit module.
2. Edit the following attributes in the temp.config file:

Table 4 Temp.config file Attributes

dts_pkg_dir
    The package directory for this Metrocluster modular package. The Metrocluster environment
    file is generated for this package in this directory. This value must be unique for all packages.
DT_APPLICATION_STARTUP_POLICY
    Defines a policy for starting an application. It can be set to the Availability_Preferred
    or Data_Currency_Preferred policy.
DR_GROUP_NAME
    The name of the DR group used by the package.
DC1_STORAGE_WORLD_WIDE_NAME
    The World Wide Name of the P6000/EVA storage system that resides in Data Center 1.
DC1_SMIS_LIST
    A list of management servers that reside in Data Center 1.
DC1_HOST_LIST
    A list of clustered nodes that reside in Data Center 1.
DC2_STORAGE_WORLD_WIDE_NAME
    The World Wide Name of the P6000/EVA storage system that resides in Data Center 2.
DC2_SMIS_LIST
    A list of management servers that reside in Data Center 2.
DC2_HOST_LIST
    A list of clustered nodes that reside in Data Center 2.
NOTE: The host names specified in the DC1/DC2 host parameter lists must match the output of the hostname command. If the hostname command displays a fully qualified name, specify the fully qualified name in the DC1/DC2 host parameter list.
For example:
# hostname
host1.domain1.com
If the hostname command displays only the host name, specify the host name in the DC1/DC2 parameter list.
For example:
# hostname
host1
There are additional Metrocluster parameters available in the package configuration file. Hewlett Packard Enterprise recommends that you retain the default values of these variables unless there is a specific business requirement to change them. For more information on the Metrocluster parameters, see Appendix B (page 42).
For the failover_policy parameter, Metrocluster failover packages can be configured to use any of the Serviceguard-defined failover policies. The site_preferred and site_preferred_manual failover policies are introduced in Serviceguard specifically for Metrocluster configurations.
The site_preferred value implies that when a Metrocluster package needs to fail over, it fails over to a node in the same site as the node on which it last ran. If no other configured node is available within the same site, the package fails over to a node on another site.
The site_preferred_manual failover policy provides automatic failover of packages within a site and manual failover across sites.
Configure a cluster with sites to use either of these policies. For information on configuring the failover policy to site_preferred or site_preferred_manual, see “Site Aware Failover Configuration” (page 11). A sample set of attribute values is shown after this procedure.
3. Validate the package configuration file:
# cmcheckconf -P temp.config
4. Apply the package configuration file:
# cmapplyconf -P temp.config
NOTE: If external_pre_script is specified in a Metrocluster package configuration, it is executed after the Metrocluster module scripts during package startup. The Metrocluster module scripts are always executed first during package startup.
5. Run the package on a node in the Serviceguard cluster:
# cmrunpkg -n <node_name> <package_name>
6. Enable global switching for the package:
# cmmodpkg -e <package_name>
After the package is created, if the value of any Metrocluster parameter needs to be changed,
edit this package configuration file and re-apply it.
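The following is a minimal sketch of how the Metrocluster attributes from Table 4 might look in temp.config. All values shown (the package directory, DR group name, WWNs, and host and server names) are placeholders, not values from this document:

dts_pkg_dir                     /opt/cmcluster/conf/pkg1
DT_APPLICATION_STARTUP_POLICY   Data_Currency_Preferred
DR_GROUP_NAME                   "DR Group 1"
DC1_STORAGE_WORLD_WIDE_NAME     5000-1FE1-5000-00DF
DC1_SMIS_LIST                   smis1.dc1.example.com
DC1_HOST_LIST                   node1,node2
DC2_STORAGE_WORLD_WIDE_NAME     5000-1FE1-5000-00DE
DC2_SMIS_LIST                   smis1.dc2.example.com
DC2_HOST_LIST                   node3,node4
failover_policy                 site_preferred

After the package is applied and running, you can verify its state with the standard Serviceguard command, for example:
# cmviewcl -v -p <package_name>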
Setting up Disk Monitoring
HPE Serviceguard for Linux includes a Disk Monitor which you can use to detect problems in
disk connectivity. This lets you fail a package over from one node to another in the event of a
disk connectivity failure.
For instructions on configuring disk monitoring, see the Creating a Disk Monitor Configuration section in Managing HPE Serviceguard A.12.00.30 for Linux available at http://www.hpe.com/info/linux-serviceguard-docs.
Configuring a Metrocluster package using Serviceguard Manager
To configure a Metrocluster package using Serviceguard Manager:
1. Access one of the node’s System Management Home Page at http://<nodename>:2301. Log in using the root user’s credentials of the node.
2. Click Tools. If Serviceguard is installed, one of the widgets will have Serviceguard as an option. Click the Serviceguard Manager link within the widget.
3. On the Cluster’s Home Page, click the Configuration Tab, and then select Create A Modular
Package.
Figure 4 Creating modular package
4. If the product Metrocluster with Continuous Access EVA P6000 for Linux Toolkit is installed, you will be prompted to configure a Metrocluster package. Select the dts/mccaeva module, and then click Next.
Figure 5 Selecting Metrocluster module
5. You will be prompted next to include any other toolkit modules. If the application being configured has a Serviceguard toolkit, select the appropriate toolkit; otherwise, move to the next screen.
6. Enter the package name. Metrocluster packages can be configured only as failover packages. Make sure that this option is selected as shown in Figure 6 (page 24), and then click Next.
Figure 6 Configuring package name
7. Select additional modules if required by the application. For example, if the application uses LVM volume groups or VxVM disk groups, select the volume_group module. Click Next.
Figure 7 Selecting additional modules for the package
8. Review the node order in which the package will start, and modify other attributes, if needed. Click Next.
Figure 8 Configuring generic failover attributes
9. Configure the attributes for a Metrocluster package. All the mandatory attributes (marked with *) must be accurately filled.
a. Select the Application start up policy from the list.
b. Specify the DR Group name, and then enter values for Wait Time and Query Timeout, if required.
c. Select hosts for Data Center 1 and Data Center 2. Enter the DC1/DC2 Storage World Wide Names.
d. Specify the list of management servers for DC1 and DC2.
Figure 9 Specifying the list of management servers for DC1 and DC2
10. Enter the values for the other modules selected in step 7.
11. After you enter the values for all the modules, review all the inputs given to the various attributes on the final screen. If you want to validate the package configuration, click Check Configuration; otherwise, click Apply Configuration.
Figure 10 Configuring Metrocluster P6000/EVA parameters
3 Metrocluster features
Data replication storage failover preview
In an actual failure, packages are failed over to the standby site. During package startup, the underlying storage is failed over based on the parameters defined in the Metrocluster package. The storage failover might fail under the following conditions:
• Incorrect configuration or setup of the Metrocluster and data replication environment. The storage failover can fail if the Metrocluster package has syntax errors or invalid parameter values, if the installed Metrocluster binaries are corrupt or have incorrect file permissions, if the SMI-S server is unreachable, or if the caeva.map file is corrupted.
• Invalid data replication state. The data may not be in write order because a track copy was in progress at the time of the failover attempt, or the data is not current (lagging behind the primary) and the Metrocluster package parameters are not set to allow a failover on non-current data.
The cmdrprev command previews the failover of data replication storage. It determines whether storage failover can succeed during an actual package failover. This command can be used in both Metrocluster and Continentalclusters. If the preview fails, the cmdrprev command displays a detailed log that lists the cause of the failure in stdout. The command options are as follows:
cmdrprev -p <package>
For more information, see the cmdrprev manpage.
The command exit value indicates whether the storage failover in an actual package failover will succeed. Table 5 describes the exit values of the command.
Table 5 Command exit values and their descriptions
-1
The data replication storage failover preview failed because of invalid command usage or invalid input parameters. This indicates that in the event of an actual recovery process, the data replication storage failover will not succeed on any node in the cluster.
0
The data replication storage failover preview is successful on the node where the command is run. This indicates that data replication storage failover will be successful in the event of a package failover.
1
The data replication storage failover preview failed. This indicates that in the event of an actual recovery process, the data replication storage failover will not succeed on any node in the cluster.
2
The data replication storage failover preview failed due to node-specific error conditions or transient conditions. This indicates that in the event of an actual recovery process, the data replication storage failover will not succeed on that node. Failover may be successful if you attempt it at a later time or on a different node in the cluster.
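A minimal sketch of running the preview from the shell, assuming a Metrocluster package named pkg1 (a placeholder name) and checking the exit value against Table 5:

# cmdrprev -p pkg1
# echo $?
0

In a script, the exit value can drive a simple decision: an exit value of 2 suggests retrying later or on another node, while -1 or 1 indicates the failover will not succeed anywhere in the cluster.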
Rolling upgrade for Metrocluster
Use a rolling upgrade to upgrade the software components of the cluster with minimal downtime to the applications managed by Metrocluster. See “Rolling upgrade” (page 35) for the procedure.
Live Application Detach
There may be circumstances in which you want to do maintenance that involves halting a node, or the entire cluster, without halting or failing over the affected packages. Such maintenance might consist of anything short of rebooting the node or nodes, but a likely case is networking changes that will disrupt the heartbeat. New command options in Serviceguard for Linux A.12.00.00, collectively known as Live Application Detach (LAD), allow you to do this kind of maintenance while keeping the packages running. The packages are no longer monitored by Serviceguard, but the applications continue to run. Packages in this state are called detached packages. When you have done the necessary maintenance, you can restart the node or cluster, and normal monitoring will resume on the packages. For more information on the LAD feature, see Managing HPE Serviceguard A.12.00.30 for Linux available at http://www.hpe.com/info/linux-serviceguard-docs.
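A minimal sketch of a LAD maintenance session, assuming the detach option of cmhaltnode is available on your Serviceguard release and a node named node1 (the node name is a placeholder; check the cmhaltnode manpage for the exact option on your version):

# cmhaltnode -d node1     (halt the node; its packages are detached and keep running)
... perform the maintenance on node1 ...
# cmrunnode node1         (rejoin the cluster; normal package monitoring resumes)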
4 Understanding failover/failback scenarios
Metrocluster package failover/failback scenarios
This section discusses package startup behavior in various failure scenarios, depending on the DT_APPLICATION_STARTUP_POLICY and the replication mode. Table 6 describes the failover scenarios.
NOTE: The first failover to a node at a remote site must be done with the Management Server for the EVA array at the remote site being active.
Table 6 Replication Modes and Failover Scenarios

Scenario: Remote failover during normal operations (planned failover)
Replication mode: Synchronous or Enhanced Asynchronous.
Both policies: The DR Group fails over and the package is started. The behavior is not affected by the presence of the FORCEFLAG file.
Resolution: N/A.

Scenario: Remote failover when the CA link is down or when the primary site fails
Replication mode: Synchronous or Enhanced Asynchronous.
Availability_Preferred: The DR Group fails over and the package is started. The following log message appears in the package log: "The role of the device group on this site is "destination". The state of the replication link is down and the state of data may not be current. Because the package startup policy is AVAILABILITY_PREFERRED, the program will attempt to start up the package."
Data_Currency_Preferred: The DR Group does not fail over and the package does not start. The following log message appears in the package log: "Warning - Unable to get remote DR group state because the CA link is down. The role of the device group on this site is "destination". The state of the replication link is down and the state of data may not be current. Because the package startup policy is DATA_CURRENCY_PREFERRED, the package is NOT allowed to start up."
Resolution: To forcefully start up the package even though data currency cannot be determined, create a "FORCEFLAG" file in the package directory and restart the package.

Scenario: Remote failover during merging
Synchronous, both policies: If the merge operation completes before WAIT_TIME expires, the DR Group is failed over and the package is started. Otherwise, the DR Group does not fail over and the package is not started. There is no change in this behavior even if the FORCEFLAG file is present. The following log message appears in the package log: "The DR Group is in merging state. .... The WAIT_TIME has expired. Error - Failed to failover and swap the role of the device group. The package is NOT allowed to start up." Resolution: Restart the package when the merge is completed.
Enhanced Asynchronous, Availability_Preferred: Waits until WAIT_TIME for the merge operation to complete. The package is started even if the merge is not completed within this time.
Enhanced Asynchronous, Data_Currency_Preferred: If the merge operation completes before WAIT_TIME expires, the DR Group fails over and the package is started. Otherwise, the package is not started. The following log message appears in the package log: "The replication link state is good, the role of the device group on this site is "destination" and the data Log Copy is in progress. Because the WAIT_TIME is set to xx minutes, the program will wait for completion of the log copy ... The DR Group is in merging state. ... The WAIT_TIME has expired. Error - Failed to failover and swap the role of the device group. The package is NOT allowed to start up."
Resolution (Enhanced Asynchronous): To forcefully start up the package, create a "FORCEFLAG" file in the package directory and restart the package. The package does not wait for the merge to complete; it starts up immediately.

Scenario: Remote failover during copying
Replication mode: Synchronous or Enhanced Asynchronous.
Both policies: If the full copy operation completes before WAIT_TIME expires, the DR Group fails over and the package is started. Otherwise, the DR Group does not fail over and the package is not started. The behavior is not affected by the presence of the FORCEFLAG file. The following log message appears in the package log: "The replication link state is good, the role of the device group on this site is "destination" and the data Track Copy is in progress. Because the WAIT_TIME is set to xx minutes, the program will wait for completion of the track copy. .... The WAIT_TIME has expired. Error - Failed to failover and swap the role of the device group. The package is NOT allowed to start up."
Resolution: Restart the package when the full copy is completed.

Scenario: Remote failover when the CA link is down and a merge was in progress
Replication mode: Synchronous or Enhanced Asynchronous.
Availability_Preferred: The DR Group fails over and the package is started.
Data_Currency_Preferred: The DR Group does not fail over and the package does not start. The following log message appears in the package log: "Warning - Unable to get remote DR group state because the CA link is down. ... The role of the device group on this site is "destination". The state of the replication link is down and the state of data may not be current. Because the package startup policy is DATA_CURRENCY_PREFERRED, the package is NOT allowed to start up."
Resolution: To forcefully start up the package even though data currency cannot be determined, create a "FORCEFLAG" file in the package directory, and then restart the package.

Scenario: Remote failover when the CA link is down and a full copy was in progress
Replication mode: Synchronous or Enhanced Asynchronous.
Both policies: The DR Group does not fail over and the package does not start because data is not consistent on the destination storage. The following log message appears in the package log: "The role of the device group on this site is "destination". The state of the data may be inconsistent due to replication link is down while data copying from source to destination is still in progress. The package is NOT allowed to start up."
Resolution: The package can be manually restarted successfully on the remote site after a consistent snapclone/snapshot of the destination Vdisk is restored. HPE recommends taking a snapshot/snapclone of the destination Vdisks before the copy starts so that a consistent copy is available for recovery.

Scenario: Remote failover when the link is manually suspended
Synchronous, both policies: The DR Group does not fail over and the package does not start. The behavior is not affected by the presence of the FORCEFLAG file. The following log message appears in the package log: "Error - The replication link is in suspend mode. The DR group cannot be failed over." Resolution: Resume the link, and then start up the package.
Enhanced Asynchronous, Availability_Preferred: The DR Group fails over and the package is started.
Enhanced Asynchronous, Data_Currency_Preferred: The DR Group does not fail over and the package does not start. The following log message appears in the package log: "The replication link of this DR group is in suspend mode. The replication link state is good and the role of the device group on this site is "destination". Because the state of the data may not be current and the package startup policy is DATA_CURRENCY_PREFERRED, the package is NOT allowed to start up."
Resolution (Enhanced Asynchronous): To forcefully start up the package, create a FORCEFLAG file in the package directory, and then restart the package.

Scenario: Remote failover when the DR Group is in RUNDOWN state and the link is up
Synchronous: N/A.
Enhanced Asynchronous, Availability_Preferred: Waits until WAIT_TIME for the merge operation to complete. The package starts even if the merge is not complete within this time.
Enhanced Asynchronous, Data_Currency_Preferred: If the merge operation completes before WAIT_TIME expires, the package is started. Otherwise, the package startup fails. The following log message appears in the package log: "The replication link state is good, the role of the device group on this site is "destination" and the data Log Copy is in progress. Because the WAIT_TIME is set to <xx> minutes, the program will wait for completion of the log copy. ... The DR Group is in merging state. ... The WAIT_TIME has expired. Error - Failed to failover and swap the role of the device group. The package is NOT allowed to start up."
Resolution: To forcefully start up the package, create a "FORCEFLAG" file in the package directory and restart the package. The package does not wait for the merge to complete; it starts up immediately.

Scenario: Remote failover when the DR Group is in RUNDOWN state and the link is down
Synchronous: N/A.
Enhanced Asynchronous, Availability_Preferred: The DR Group fails over and the package is started.
Enhanced Asynchronous, Data_Currency_Preferred: The DR Group is not failed over and the package is not started. The following log message appears in the package log: "Warning - Unable to get remote DR group state because the CA link is down. The role of the device group on this site is "destination". The state of the replication link is down and the state of data may not be current. Because the package startup policy is DATA_CURRENCY_PREFERRED, the package is NOT allowed to start up."
Resolution: To forcefully start up the package even though data currency cannot be determined, create a "FORCEFLAG" file in the package directory and restart the package.
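Several of the resolutions in Table 6 involve creating a FORCEFLAG file. A minimal sketch, assuming a package named pkg1 whose dts_pkg_dir is /opt/cmcluster/conf/pkg1 (both names are placeholders):

# touch /opt/cmcluster/conf/pkg1/FORCEFLAG
# cmrunpkg -n <node_name> pkg1

Note that, as Table 6 shows, the FORCEFLAG file only overrides the data currency checks; it has no effect in the copying scenarios, where data consistency cannot be guaranteed.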
5 Administering Metrocluster
Adding a node to Metrocluster
To add a node to Metrocluster with Continuous Access EVA P6000 for Linux:
1. Add the node to the cluster by editing the Serviceguard cluster configuration file, and then apply the configuration:
# cmapplyconf -C cluster.config
2. Copy the caeva.map file to the new node.
For Red Hat:
# scp /usr/local/cmcluster/conf/mccaeva/caeva.map \
<new_node_name>:/usr/local/cmcluster/conf/mccaeva/caeva.map
For SUSE:
# scp /opt/cmcluster/conf/mccaeva/caeva.map \
<new_node_name>:/opt/cmcluster/conf/mccaeva/caeva.map
3. Add the new node to the Metrocluster package by editing the Metrocluster package configuration, and then apply the configuration:
# cmapplyconf -P <package_config_file>
Maintaining EVA P6000 Continuous Access replication in Metrocluster
While the package is running, a manual storage failover performed on P6000 Continuous Access outside of the Metrocluster software can cause the package to halt. Hewlett Packard Enterprise recommends that no manual storage failover be performed while the package is running.
A manual change of the P6000 Continuous Access link state from suspend to resume is allowed to re-establish data replication while the package is running.
P6000 Continuous Access Link failure scenarios
If all Continuous Access links fail and failsafe mode is disabled, the application package continues to run and writes new I/O to the source Vdisk. The virtual log in the P6000/EVA controller collects host write commands and data; the DR group's log state changes from normal to logging. When a DR group is in a logging state, the log grows in proportion to the amount of write I/O being sent to the source Vdisks. Upon Continuous Access link recovery, P6000 Continuous Access automatically normalizes the source Vdisk and destination Vdisk data. If the links are down for a long time and failsafe mode is disabled, the log disk may become full, and a full copy occurs automatically upon link recovery.
If the log disk is not full, when a Continuous Access connection is re-established, the contents of the log are written to the destination Vdisk to synchronize it with the source Vdisk. This process of writing the log contents, in the order that the writes occurred, is called merging. While merging is in progress, write ordering is maintained, and hence the data on the destination Vdisk is consistent.
If the log disk is full, when a Continuous Access connection is re-established, a full copy from the source Vdisk to the destination Vdisk is done. Because a full copy is done at the block level, the data on the destination Vdisk is not consistent until the copy completes.
If the primary site fails while a copy is in progress, the data on the destination Vdisk is not consistent and is not usable. The package cannot start up on the recovery site, and the application will not be online until the primary site is restored. To manage the resynchronization and to ensure that a consistent copy of data exists on the recovery site, do the following:
1. After all Continuous Access links fail, put the Continuous Access link into suspend mode by using the P6000 Command View UI. When the Continuous Access link is in the suspend state, P6000 Continuous Access does not resynchronize the source and destination Vdisks upon link recovery. This helps maintain data consistency.
2. Take a local replication copy of the destination Vdisks using P6000 Business Copy software so that a consistent copy is available for recovery.
3. Change the Continuous Access link state to resume mode. This initiates the normalization upon Continuous Access link recovery.
These steps ensure that even if the primary site fails while the copy is in progress, the destination Vdisk can be used after restoring the data from the BC volume.
Planned maintenance
Node maintenance
If you take a node down for maintenance, package failover and quorum calculation are based on the remaining nodes. Make sure that nodes are taken down evenly at each site, and that enough nodes remain online to form a quorum if a failure occurs. Planned maintenance is treated the same as a failure by the cluster.
Metrocluster package maintenance
There might be situations when the package needs to be taken down for maintenance without moving the package to another node. The following procedure is recommended for normal maintenance of Metrocluster with Continuous Access EVA P6000 for Linux:
1. Disable package switching:
# cmmodpkg -d <pkgname>
2. If SMI-S user credentials or ports are being changed, update the smis.conf file and regenerate the caeva.map file. Distribute the updated caeva.map file to all the nodes of the cluster.
3. If there are changes to the package attributes, edit the configuration file, and then apply the updated package configuration:
# cmapplyconf -P <pkgname.config>
4. Start the package with the appropriate Serviceguard command:
# cmmodpkg -e <pkgname>
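If the package configuration file is not available on the node where you perform the maintenance, it can be regenerated from the applied configuration before editing; a minimal sketch, assuming a package named pkg1 (a placeholder):

# cmgetconf -p pkg1 pkg1.config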
Failback
If the primary site is restored after a failover to the recovery site, you may want to fail back the package to the primary site. Manually resynchronize the data from the recovery site to the primary site. After resynchronization is complete, halt the package on the recovery site, and restart it on the primary site. Metrocluster performs a failover of the storage, which returns the SOURCE status to the primary Vdisks.
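A minimal sketch of the failback sequence, assuming a package named pkg1 and a primary-site node named node1 (both placeholders), run after the data has been resynchronized:

# cmhaltpkg pkg1
# cmrunpkg -n node1 pkg1
# cmmodpkg -e pkg1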
Administering Metrocluster with Serviceguard Manager
To administer Metrocluster or the packages configured under Metrocluster using Serviceguard
Manager:
1. Access one of the node’s System Management Home Page at http://<nodename>:2301. Log in using the root user’s credentials of the node.
2. Click Tools. If Serviceguard is installed, one of the widgets will have Serviceguard as an option. Click the Serviceguard Manager link within the widget.
3. On the Cluster’s Home Page, click the Administration tab, and then choose from the available options as required. Choose the appropriate packages for the required option.
Rolling upgrade
Metrocluster configurations follow the Serviceguard rolling upgrade procedure. The Serviceguard documentation includes rolling upgrade procedures to upgrade the Serviceguard version, operating environment, and other software. This Serviceguard procedure, along with its recommendations, guidelines, and limitations, is applicable to Metrocluster as well. For more information on completing a rolling upgrade of Serviceguard, see the latest edition of Managing HPE Serviceguard A.12.00.30 for Linux, available at http://www.hpe.com/info/linux-serviceguard-docs.
Upgrading Metrocluster replication software
To perform a rolling upgrade of Metrocluster software:
1. Disable package switching for all Metrocluster packages.
2. Install the new Metrocluster software on all nodes.
3. Enable package switching for all Metrocluster packages.
To upgrade the array-specific replication management software, see “Upgrading replication
management software” (page 35).
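A minimal sketch of this flow, assuming two Metrocluster packages named pkg1 and pkg2 and that the new Metrocluster software is delivered as an RPM (the package names and RPM file name are placeholders):

# cmmodpkg -d pkg1 pkg2
# rpm -Uvh <new_metrocluster_rpm>     (repeat on each node in the cluster)
# cmmodpkg -e pkg1 pkg2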
Limitations of the rolling upgrade for Metrocluster
The following are the limitations of the rolling upgrade for Metrocluster:
• The cluster or package configuration cannot be modified until the rolling upgrade is complete. If the configuration must be edited, upgrade all nodes to the new release, and then modify the configuration file and copy it to all nodes in the cluster.
• New features of the latest version of Metrocluster cannot be used until all nodes are upgraded to the latest version.
• No more than two versions of Metrocluster can run in the cluster while the rolling upgrade is in progress.
• The rolling upgrade procedure cannot be used as a means of running multiple versions of Metrocluster software within the cluster. Hewlett Packard Enterprise recommends that all cluster nodes are upgraded to the latest version immediately.
• Serviceguard cannot be deleted from any node while the rolling upgrade is in progress in the cluster.
Upgrading replication management software
Upgrade the replication management software that is used by Metrocluster. In this product, the array management software runs on a separate Windows system, so you can upgrade it independently without affecting the running Metrocluster.
Upgrading the OpenPegasus WBEM Services for Metrocluster with Continuous Access EVA
P6000 for Linux
Metrocluster with Continuous Access EVA P6000 for Linux uses OpenPegasus WBEM Services
software to communicate with the SMI-S server that manages the P6000/EVA disk arrays. You
can update OpenPegasus WBEM Services without halting the cluster or Metrocluster packages
on any of the nodes. Metrocluster communicates with this software when the Metrocluster
packages are starting up. Therefore, avoid Metrocluster package failover until all the nodes in
the Metrocluster are upgraded to the same version of OpenPegasus WBEM Services.
6 Troubleshooting
Troubleshooting Metrocluster
Analyze the Metrocluster and SMI-S/Command View log files to understand the problem in the respective environment, and follow the recommended action based on the error or warning messages.
Metrocluster log
Make sure you periodically review the following files for messages, warnings, and recommended actions. Hewlett Packard Enterprise recommends reviewing these files after each system, data center, or application failure:
• The system log at /var/log/messages.
• The package log file specified in the script_log_file parameter.
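For example, a quick way to scan a package log for problems (the log path shown is a placeholder; use the value of the package's script_log_file parameter):

# grep -iE "error|warning" /usr/local/cmcluster/log/pkg1.log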
P6000/EVA storage system log
Analyze the CIMOM or Provider logs to understand problems with the SMI-S layer. For more information on SMI-S logs, see the SMI-S EVA Provider user guide; for Command View logs, see the HPE P6000 Command View user guide.
7 Support and other resources
Accessing Hewlett Packard Enterprise Support
• For live assistance, go to the Contact Hewlett Packard Enterprise Worldwide website: www.hpe.com/assistance
• To access documentation and support services, go to the Hewlett Packard Enterprise Support Center website: www.hpe.com/support/hpesc
Information to collect
• Technical support registration number (if applicable)
• Product name, model or version, and serial number
• Operating system name and version
• Firmware version
• Error messages
• Product-specific reports and logs
• Add-on products or components
• Third-party products or components
Accessing updates
• Some software products provide a mechanism for accessing software updates through the product interface. Review your product documentation to identify the recommended software update method.
• To download product updates, go to either of the following:
◦ Hewlett Packard Enterprise Support Center Get connected with updates page: www.hpe.com/support/e-updates
◦ Software Depot website: www.hpe.com/support/softwaredepot
• To view and update your entitlements, and to link your contracts and warranties with your profile, go to the Hewlett Packard Enterprise Support Center More Information on Access to Support Materials page: www.hpe.com/support/AccessToSupportMaterials
IMPORTANT: Access to some updates might require product entitlement when accessed through the Hewlett Packard Enterprise Support Center. You must have an HP Passport set up with relevant entitlements.
Websites
Hewlett Packard Enterprise Information Library: www.hpe.com/info/enterprise/docs
Hewlett Packard Enterprise Support Center: www.hpe.com/support/hpesc
Contact Hewlett Packard Enterprise Worldwide: www.hpe.com/assistance
Subscription Service/Support Alerts: www.hpe.com/support/e-updates
Software Depot: www.hpe.com/support/softwaredepot
Customer Self Repair: www.hpe.com/support/selfrepair
Insight Remote Support: www.hpe.com/info/insightremotesupport/docs
Serviceguard Solutions for HP-UX: www.hpe.com/info/hpux-serviceguard-docs
Single Point of Connectivity Knowledge (SPOCK) Storage compatibility matrix: www.hpe.com/storage/spock
Storage white papers and analyst reports: www.hpe.com/storage/whitepapers
Customer self repair
Hewlett Packard Enterprise customer self repair (CSR) programs allow you to repair your product.
If a CSR part needs to be replaced, it will be shipped directly to you so that you can install it at
your convenience. Some parts do not qualify for CSR. Your Hewlett Packard Enterprise authorized
service provider will determine whether a repair can be accomplished by CSR.
For more information about CSR, contact your local service provider or go to the CSR website:
www.hpe.com/support/selfrepair
Remote support
Remote support is available with supported devices as part of your warranty or contractual support
agreement. It provides intelligent event diagnosis, and automatic, secure submission of hardware
event notifications to Hewlett Packard Enterprise, which will initiate a fast and accurate resolution
based on your product’s service level. Hewlett Packard Enterprise strongly recommends that
you register your device for remote support.
For more information and device support details, go to the following website:
www.hpe.com/info/insightremotesupport/docs
Documentation feedback
Hewlett Packard Enterprise is committed to providing documentation that meets your needs. To
help us improve the documentation, send any errors, suggestions, or comments to Documentation
Feedback ([email protected]). When submitting your feedback, include the document
title, part number, edition, and publication date located on the front cover of the document. For
online help content, include the product name, product version, help edition, and publication date
located on the legal notices page.
A Checklist and worksheet for configuring a Metrocluster
with Continuous Access EVA P6000 for Linux
Disaster Recovery Checklist
Use this checklist to make sure you have adhered to the disaster tolerant architecture guidelines
for two main data centers and a third location configuration.
Data centers A and B have the same number of nodes to maintain quorum in case an entire data center fails.
Arbitrator nodes or Quorum Server nodes are located in a separate location from either of the primary data centers (A or B).
The elements in each data center, including nodes, disks, network components, and climate control, are on separate power circuits.
Multipathing is configured for each disk used in Metrocluster.
Each disk array is configured with redundant replication links.
At least two networks are configured to function as the cluster heartbeat.
All redundant cabling for network, heartbeat, and replication links is routed using separate physical paths.
Cluster Configuration Worksheet
Use this cluster configuration worksheet either in place of, or in addition to the worksheet provided
in the latest version of the Managing HPE Serviceguard A.12.00.30 for Linux manual available
at http://www.hpe.com/info/linux-serviceguard-docs. If you have already completed a
Serviceguard cluster configuration worksheet, you only need to complete the first part of this
worksheet.
_______________________________________________________
Names and Nodes
_______________________________________________________
Cluster Name: ________________________________________________________
Data Center A Name and Location: _____________________________________
Site Name: __________________________________________________________
Node Names: ________________________________________________________
Data Center B Name and Location: ______________________________________
Site Name: ___________________________________________________________
Node Names: ________________________________________________________
Arbitrator/Quorum Server Third Location Name and Location: ______________
Arbitrator Node/Quorum Server Names: ________________________________
Maximum Configured Packages: _______________________________________
______________________________________________________
Subnets
_______________________________________________________
Heartbeat IP Addresses: _______________________________________________
Non-Heartbeat IP Addresses: ___________________________________________
_______________________________________________________
Timing Parameters
______________________________________________________
Member Timeout: _____________________________________________________
Network Polling Interval: ______________________________________________
AutoStart Delay: ______________________________________________________
Package Configuration Worksheet
Use this package configuration worksheet either in place of, or in addition to the worksheet
provided in the latest version of the Managing HPE Serviceguard A.12.00.30 for Linux manual
available at http://www.hpe.com/info/linux-serviceguard-docs. If you have already completed
a Serviceguard package configuration worksheet, you only need to complete the first part of this
worksheet.
_______________________________________________________
Modular package configuration worksheet
________________________________________________________
Package Configuration data
_________________________________________________________
Package Name: _________________________________________________________
Primary Node: _________________________Data Center: ______________________
First Failover Node: __________________Data Center: ________________________
Second Failover Node: _________________Data Center: ______________________
Third Failover Node: __________________Data Center: _______________________
Fourth Failover Node: _________________Data Center: _______________________
Table 7 Volume Group module information
Information Name   Entry 1        Entry 2        Entry 3        Entry 4
Volume Group       ____________   ____________   ____________   ____________
Logical Volume     ____________   ____________   ____________   ____________
File System        ____________   ____________   ____________   ____________
Mount Point        ____________   ____________   ____________   ____________
LVM/VxVM           ____________   ____________   ____________   ____________

Table 8 Package IP module information
Package IP Address   IP Subnet      IP Subnet Node
__________________   ____________   ______________

Table 9 Service module information
Service Name   Service command   Service restart   Fail fast enabled   Service Halt timeout
____________   _______________   _______________   _________________   ____________________
_____________________________________________________________
Metrocluster Continuous Access P6000/EVA Module Information
_____________________________________________________________
DR Group Name: ____________________________________________________________
DC1 Storage Array WWN: ___________________________________________________
DC1 SMIS List: ______________________________________________________________
DC1 HOST List: _____________________________________________________________
DC2 Storage Array WWN: ___________________________________________________
DC2 SMIS List: ______________________________________________________________
DC2 HOST List: _____________________________________________________________
P6000/EVA Configuration Checklist
Use the following checklist to verify the Metrocluster with Continuous Access EVA P6000 for Linux configuration:
Redundant Management Servers are configured and accessible to all nodes.
Source and destination volumes are created for use with all packages.
Management Server security configuration is complete (smispasswd command).
P6000/EVA mapping is complete (evadiscovery command).
caeva.map file is copied to all cluster nodes.
The caeva.map file is located at:
$SGCONF/mccaeva/
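As a quick check that the map file is present, you can list it on each node; a minimal sketch (the path expands from $SGCONF):

# ls -l $SGCONF/mccaeva/caeva.map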
B Package attributes for Metrocluster with Continuous
Access EVA P6000 for Linux
This appendix lists all Package Attributes for this product. Hewlett Packard Enterprise recommends
that you use the default settings for most of these variables, so exercise caution when modifying
them:
CLUSTER_TYPE
This parameter identifies the type of disaster recovery
services cluster: Metrocluster or Continentalclusters. You
must set this to “metro” if this is a Metrocluster environment
and “continental” if this is a Continentalclusters
environment. A type of “metro” is supported only when the
HPE Metrocluster product is installed. A type of
“continental” is supported only when the HPE
Continentalclusters and Metrocluster software are installed.
PKGDIR
If the package is a legacy package, this variable contains
the full path name of the package directory. If the package
is a modular package, this variable contains the full path
name of the directory where the Metrocluster caeva
environment file is located.
DT_APPLICATION_STARTUP_POLICY This parameter defines the preferred policy to start the application with respect to the state of the data in the local volumes. It must be set to one of the following two policies:
Availability_Preferred: Choose this policy if you prefer application availability. Metrocluster software allows the application to start if the data is consistent, even if the data is not current.
Data_Currency_Preferred: Choose this policy if you prefer the application to start on consistent and current data. Metrocluster software allows the application to operate only on current data. This policy focuses only on the state of the local data (with respect to the application) being consistent and current.
A package can be forced to start on a node by creating the FORCEFLAG file in the package directory.
WAIT_TIME
(0 or greater than 0 [in minutes])
This parameter defines the timeout, in minutes, to wait for completion of the data merging or copying for the DR group before startup of the package on the destination volume.
If WAIT_TIME is greater than zero, and the state of the DR group is “merging in progress” or “copying in progress”, Metrocluster software waits up to the WAIT_TIME value for the merging or copying to complete. If WAIT_TIME expires and merging or copying is still in progress, the package fails to start with an error.
If WAIT_TIME is 0 (the default value), and the state of the DR group is “merging in progress” or “copying in progress”, Metrocluster software does not wait and returns an exit 1 code to the Serviceguard package manager. The package fails to start with an error.
DR_GROUP_NAME
The name of the DR group used by this package. The DR
group name is defined when the DR group is created.
DC1_STORAGE_WORLD_WIDE_NAME The world wide name of the P6000/EVA storage system that resides in Data Center 1. This storage system name is defined when the storage is initialized.
DC1_SMIS_LIST A list of the management servers that reside in Data Center 1. Multiple names can be defined by using commas as separators. If a connection to the first management server fails, attempts are made to connect to the subsequent management servers in their order of specification.
DC1_HOST_LIST A list of the clustered nodes that reside in Data Center 1. Multiple names can be defined by using commas as separators.
DC2_STORAGE_WORLD_WIDE_NAME The world wide name of the P6000/EVA storage system that resides in Data Center 2. This storage system name is defined when the storage is initialized.
DC2_SMIS_LIST A list of the management servers that reside in Data Center 2. Multiple names can be defined by using commas as separators. If a connection to the first management server fails, attempts are made to connect to the subsequent management servers in their order of specification.
DC2_HOST_LIST A list of the clustered nodes that reside in Data Center 2. Multiple names can be defined by using commas as separators.
QUERY_TIME_OUT (Default 300 seconds) Sets the time in seconds to wait for a response from the SMI-S CIMOM on the storage management appliance. The minimum recommended value is 20 seconds. If the value is set smaller than 20 seconds, Metrocluster software may time out before getting the response from SMI-S, and the package fails to start with an error.
C smiseva.conf file
#################################################################
#                                                               #
#   smiseva.conf CONFIGURATION FILE (template)                  #
#   for use with the smispasswd utility                         #
#   in the Metrocluster CA EVA Environment                      #
#                                                               #
#   Note: This file MUST be edited before it can be used.       #
#   For complete details about SMI-S configuration for use      #
#   with Metrocluster CA EVA, consult the manual "Designing     #
#   Disaster Tolerant High Availability Clusters."              #
#                                                               #
#################################################################
#
# This file provides input to the smispasswd utility, which you
# use to set up secure access paths between cluster nodes and
# SMI-S services.
#
# Edit this file to include the appropriate information about
# the Management Server and SMI-S services that will be used in
# your Metrocluster CA EVA environment.
#
# After entering all the desired information, run the smispasswd
# command to generate the security configuration that allows
# cluster nodes to communicate with the SMI-S services.
#
# Below is an example configuration. The data is commented out.
#
# Hostname/IP_Address   User_login_name   Secure       Namespace   Port
#                                         Connection               (optional)
# --------------------  ----------------  -----------  ----------  ----------
# 15.13.244.182         administrator     y            root/EVA    5989
# 15.13.244.183         administrator     N            root/EVA    5988
# 15.13.244.192         admin12309        y            root/EVA
# SANMA04               admin             y            root/EVA
#
# The example shows a list of 4 Management Server and SMI-S data entries in
# the Metrocluster CA EVA environment. Each line represents a different
# Management Server and SMI-S data entry; fields on each line should be
# separated either by space(s) or tab(s). The order of fields is significant.
# The first field must be a hostname or IP address, the second field must be
# a user login name on the host. The third field must be 'y' or
# 'n' to use SSL connect. The next field must be the namespace
# of the SMI-S service. If the SMI-S service does not use the default port,
# then use the Port column to give the customized port number.
# For details of each field data, refer to the smispasswd
# man page, 'man smispasswd'.
#
# Note:
# Lines beginning with the pound sign (#) are comments. You cannot
# use the '#' character in your data entries.
#
# Enter your Management Server and SMI-S services data under the dashed lines:
#
# Hostname/IP_Address   User_login_name   Secure       Namespace   Port
#                                         Connection               (optional)
# --------------------  ----------------  -----------  ----------  ----------
D mceva.conf file
##############################################################
## mceva.conf CONFIGURATION FILE (template) for use with    ##
## the evadiscovery utility in the Metrocluster Continuous  ##
## Access EVA Environment.                                  ##
## Version: A.01.00                                         ##
## Note: This file MUST be edited before it can be used.    ##
## For complete details about EVA configuration for use     ##
## with Metrocluster Continuous Access EVA, consult the     ##
## manual "Designing Disaster Tolerant High Availability    ##
## Clusters".                                               ##
##############################################################
## This file provides input to the evadiscovery utility,    ##
## which you use to generate the /etc/dtsconf/caeva.map     ##
## file. During Metrocluster Continuous Access EVA          ##
## configuration, this file is copied to all cluster nodes. ##
## Edit the file to include the appropriate data about the  ##
## EVA storage systems and DR groups that will be used in   ##
## your Metrocluster Continuous Access EVA environment.     ##
## After entering all the desired information, run the      ##
## evadiscovery command to generate the mapping data and    ##
## save it in a map file.                                   ##
## Note: Before running evadiscovery, you need to use the   ##
## smispasswd command to create a SMI-S services            ##
## configuration.                                           ##
## Enter the data for storage device pairs and DR groups    ##
## after the <storage_pair_info> and <dr_group_info> tags.  ##
## The <storage_pair_info> tag represents the starting      ##
## definition of a storage pair and its DR groups. Under a  ##
## <storage_pair_info> tag, you must provide two storage    ##
## Node World Wide Names (WWN) which both contain the DR    ##
## groups defined under the <dr_group_info> tag. You can    ##
## define as many DR groups as you need, but each DR group  ##
## must belong to only one of the storage pairs. A storage  ##
## pair can have a maximum of 64 DR groups.                 ##
## Note that you can find storage Node World Wide Names     ##
## from the front panel of your P6000 controllers or from   ##
## the 'Initialized Storage Properties' page of Command     ##
## View EVA through your Web browser.                       ##
## Below is an example of a configuration with two storage  ##
## pairs (4 storage units). The first storage pair contains ##
## 2 DR groups and the second pair contains 1 DR group.     ##
##                                                          ##
## <storage_pair_info>                                      ##
## "5000-1FE1-5000-4280"   Enter first storage WWN in       ##
##                         double quotes.                   ##
## "5000-1FE1-5000-4180"   Enter second storage WWN in      ##
##                         double quotes.                   ##
## <dr_group_info>                                          ##
## "DR Group - Package1"   Enter a DR group name in double  ##
##                         quotes.                          ##
## "DR Group - OracleDB1"  Enter a DR group name in double  ##
##                         quotes.                          ##
## <storage_pair_info>                                      ##
## "5000-1FE1-5000-4081"   Enter first storage WWN in       ##
##                         double quotes.                   ##
## "5000-1FE1-5000-4084"   Enter second storage WWN in      ##
##                         double quotes.                   ##
## <dr_group_info>                                          ##
## "DR Group - Package2"   Enter a DR group name in double  ##
##                         quotes.                          ##
## Note: Since '#' means the start of a comment, you cannot ##
## include the '#' in any <storage_pair_info>,              ##
## <dr_group_info>, storage name, or DR group name.         ##
## Note: All the storage and DR Group names should be       ##
## enclosed in double quotes (""), otherwise the            ##
## evadiscovery command will not detect them.               ##
## Enter your MC EVA Storage pairs and DR Groups under the  ##
## dashed lines:                                            ##
##----------------------------------------------------------
<storage_pair_info>
"5000-1FE1-5000-00DF"
"5000-1FE1-5000-00DE"
<dr_group_info>
"DR Group 1"
"DR Group 2"
"DR Group 3"
"DR Group 4"
E Identifying the devices to be used with packages
Identifying the devices created in P6000/EVA
After the WWN of the P6000/EVA virtual volume is obtained, find the WWN of the disk using the lsscsi or scsi_id commands.
For example:
# lsscsi | grep HSV | grep disk | awk '{print $6}'
After the P6000/EVA disks are retrieved by the lsscsi command, run the scsi_id command to find the WWN of the P6000/EVA disk.
For example, on SUSE:
# /lib/udev/scsi_id --whitelisted <HSV disk path>
On Red Hat:
# /sbin/scsi_id --whitelisted <HSV disk path>
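As a minimal sketch, the two commands can be combined in a loop to print the WWN of every HSV disk (shown with the SUSE scsi_id path; on Red Hat, use /sbin/scsi_id):

# for d in $(lsscsi | grep HSV | grep disk | awk '{print $6}'); do /lib/udev/scsi_id --whitelisted "$d"; done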
You can also use the EVAinfo tool to retrieve the WWN of a P6000/EVA disk.
For example:
# evainfo -a
Devicefile  Array                WWNN                                     Controller/Port/Mode     Capacity
/dev/sdg    5000-1FE1-5007-DBD0  6005-08B4-0010-78FD-0002-4000-0018-0000  Ctl-A/FP-4/NonOptimized  1024MB
/dev/sdh    5000-1FE1-5007-DBD0  6005-08B4-0010-786B-0001-E000-0031-0000  Ctl-A/FP-4/NonOptimized  1024MB
/dev/sdp    5000-1FE1-5007-DBD0  6005-08B4-0010-78FD-0002-4000-0018-0000  Ctl-A/FP-3/NonOptimized  1024MB
/dev/sdq    5000-1FE1-5007-DBD0  6005-08B4-0010-786B-0001-E000-0031-0000  Ctl-A/FP-3/NonOptimized  1024MB
/dev/sdr    5000-1FE1-5007-DBD0  6005-08B4-0010-78FD-0002-4000-0018-0000  Ctl-B/FP-4/Optimized     1024MB
/dev/sds    5000-1FE1-5007-DBD0  6005-08B4-0010-786B-0001-E000-0031-0000  Ctl-B/FP-4/Optimized     1024MB
/dev/sdaf   5000-1FE1-5007-DBD0  6005-08B4-0010-78FD-0002-4000-0018-0000  Ctl-A/FP-4/NonOptimized  1024MB
/dev/sdag   5000-1FE1-5007-DBD0  6005-08B4-0010-786B-0001-E000-0031-0000  Ctl-A/FP-4/NonOptimized  1024MB
/dev/sdah   5000-1FE1-5007-DBD0  6005-08B4-0010-78FD-0002-4000-0018-0000  Ctl-A/FP-3/NonOptimized  1024MB
/dev/sdai   5000-1FE1-5007-DBD0  6005-08B4-0010-786B-0001-E000-0031-0000  Ctl-A/FP-3/NonOptimized  1024MB
For more information on EVAInfo tool, see HPE EVAInfo Release Notes.
Glossary
A, B
arbitrator
Nodes in a disaster tolerant architecture that act as tie-breakers in case all of the nodes in a
data center go down at the same time. These nodes are full members of the Serviceguard
cluster and must conform to the minimum requirements. The arbitrator must be located in a
third data center to ensure that the failure of an entire data center does not bring the entire
cluster down. See also quorum server.
automatic failover
Failover directed by automation scripts or software (such as Serviceguard) and requiring no human intervention.
C
campus cluster
A single cluster that is geographically dispersed within the confines of an area owned or leased
by the organization such that it has the right to run cables above or below ground between
buildings in the campus. Campus clusters are usually spread out in different rooms in a single
building, or in different adjacent or nearby buildings. See also extended distance cluster.
cluster
A Serviceguard cluster is a networked grouping of HPE 9000 and/or HPE Integrity Servers
series 800 servers (host systems known as nodes) having sufficient redundancy of software
and hardware that a single failure will not significantly disrupt service. Serviceguard software
monitors the health of nodes, networks, application services, EMS resources, and makes failover
decisions based on where the application is able to run successfully.
Continentalclusters
A group of clusters that use routed networks and/or common carrier networks for data replication
and cluster communication to support package failover between separate clusters in different
data centers. Continentalclusters are often located in different cities or different countries and
can span 100s or 1000s of kilometers.
Continuous Access
A facility provided by the Continuous Access software option available with the HPE StorageWorks P6000/EVA disk array. This facility enables physical data replication between P6000/EVA series disk arrays.
Controller Software
Controller software manages all aspects of array operations, including communication with P6000 Command View. VCS is the controller software for the EVA3000/5000 models. XCS is the controller software for all other P6000/EVA models.
D
data center
A physically proximate collection of nodes and disks, usually all in one room.
data consistency
Whether data are logically correct and immediately usable; the validity of the data after the last
write. Inconsistent data, if not recoverable to a consistent state, is corrupt.
data currency
Whether the data contain the most recent transactions, and/or whether the replica database
has all of the committed transactions that the primary database contains; speed of data replication
may cause the replica to lag behind the primary copy, and compromise data currency.
data replication
The scheme by which data is copied from one site to another for disaster tolerance. Data
replication can be either physical (see physical data replication) or logical (see logical data
replication). In a Continentalclusters environment, the process by which data that is used by
the cluster packages is transferred to the Recovery Cluster and made available for use on the
Recovery Cluster in the event of a recovery.
database replication
A software-based logical data replication scheme that is offered by most database vendors.
disaster
An event causing the failure of multiple components or entire data centers that render unavailable
all services at a single location; these include natural disasters such as earthquake, fire, or
flood, acts of terrorism or sabotage, large-scale power outages.
disaster recovery
The process of restoring access to applications and data after a disaster. Disaster recovery
can be manual, meaning human intervention is required, or it can be automated, requiring little
or no human intervention.
disaster tolerant
The characteristic of being able to recover quickly from a disaster. Components of disaster
tolerance include redundant hardware, data replication, geographic dispersion, partial or complete
recovery automation, and well-defined recovery procedures.
E, F
Environment File
Metrocluster uses a configuration file that includes variables that define the environment for the
Metrocluster to operate in a Serviceguard cluster. This configuration file is referred to as the
Metrocluster environment file. This file needs to be available on all nodes in the cluster for
Metrocluster to function successfully.
failback
Failing back from a backup node, which may or may not be remote, to the primary node that
the application normally runs on.
failover
The transfer of control of an application or service from one node to another node after a failure.
Failover can be manual, requiring human intervention, or automated, requiring little or no human
intervention.
G, H, I, J, K, L
heartbeat network
A network that provides reliable communication among nodes in a cluster, including the transmission of heartbeat messages, signals from each functioning node, which are central to the operation of the cluster, and which determine the health of the nodes in the cluster.
local failover
Failover on the same node; this is most often applied to hardware failover. For example, local LAN failover is switching to the secondary LAN card on the same node after the primary LAN card has failed.
LUN
(Logical Unit Number) A SCSI term that refers to a logical disk device composed of one or more
physical disk mechanisms, typically configured into a RAID level.
M, N
manual failover
Failover requiring human intervention to start an application or service on another node.
Metrocluster
A Hewlett Packard Enterprise product that allows a customer to configure a Serviceguard cluster
as a disaster tolerant metropolitan cluster.
metropolitan cluster
A cluster that is geographically dispersed within the confines of a metropolitan area, requiring right-of-way to lay cable for redundant network and data replication components.
mission critical application
Hardware, software, processes, and support services that must meet the uptime requirements of an organization. Examples of mission critical applications that must be able to survive regional disasters include financial trading services, e-business operations, 911 phone service, and patient record databases.
multiple points of failure (MPOF)
More than one point of failure that can bring down a Serviceguard cluster.
notification
A message that is sent following a cluster or package event.
Q
quorum server
A system, located outside the cluster, that acts as a tie-breaker in a disaster tolerant architecture
in case all of the nodes in a data center go down at the same time. See also arbitrator.
R
remote failover
Failover to a node at another data center or remote location.
resynchronization
The process of making the data between two sites consistent and current once systems are
restored following a failure. Also called data resynchronization.
S
split-brain syndrome
A condition in which a cluster reforms with equal numbers of nodes at each site and each half
of the cluster believes it is the authority, starts up the same set of applications, and tries to
modify the same data, resulting in data corruption. The Serviceguard architecture prevents
split-brain syndrome in all cases unless dual cluster locks are used.
sub-clusters
Clusterware instances that run above the Serviceguard cluster and comprise only the nodes
in a Metrocluster site. Sub-clusters have access only to the storage arrays within their site.
T
transparent failover
A failover in which the client application automatically reconnects to a new server without the
user taking any action.
transparent IP failover
Moving the IP address from one network interface card (NIC), in the same node or another
node, to another NIC that is attached to the same IP subnet so that users or applications may
always specify the same IP name/address whenever they connect, even after a failure.
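In Serviceguard, this relocation is performed by the package framework rather than by the
administrator. Purely as an illustration of the underlying operation on Linux (the address
192.0.2.10 and the interface names are assumptions), moving an address between NICs
resembles:

    ip addr del 192.0.2.10/24 dev eth0   # release the address from the failed NIC
    ip addr add 192.0.2.10/24 dev eth1   # assign it to a standby NIC on the same subnet
    arping -U -I eth1 -c 3 192.0.2.10    # send gratuitous ARP so peers update their caches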
U, V, Z
virtual array
Synonymous with disk array and storage system; a group of disks in one or more disk enclosures
combined with control software that presents disk storage capacity as one or more virtual disks.
See also virtual disk.
Virtual Controller Software (VCS)
See controller software.
virtual disk
Variable disk capacity that is defined and managed by the array controller and presented to
hosts as a disk. May be called Vdisk in the user interface.
volume group
In LVM, a set of physical volumes such that logical volumes can be defined within the volume
group for user access. A volume group can be activated by only one node at a time unless you
are using Serviceguard OPS Edition. Serviceguard can activate a volume group when it starts
a package. A given disk can belong to only one volume group. A logical volume can belong to
only one volume group.
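For example, a minimal LVM command sequence that defines a volume group and a logical
volume within it might look like the following (a sketch only; the device names /dev/sdc and
/dev/sdd and the names vgpkg and lvdata are illustrative assumptions):

    pvcreate /dev/sdc /dev/sdd          # initialize the physical volumes
    vgcreate vgpkg /dev/sdc /dev/sdd    # define the volume group
    lvcreate -L 10G -n lvdata vgpkg     # define a logical volume within the group
    vgchange -a n vgpkg                 # leave the group deactivated; Serviceguard
                                        # activates it when it starts the package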
Vraid
The level to which user data is protected. Redundancy is directly proportional to cost in terms
of storage usage; the greater the level of data protection, the more storage space is required.
See also: Vraid0, Vraid1, Vraid5, Vraid6.
Vraid0
Optimized for I/O speed and efficient use of physical disk space, but provides no data
redundancy.
Vraid1
Optimized for data redundancy and I/O speed, but uses the most physical disk space.
Vraid5
Provides a balance of data redundancy, I/O speed, and efficient use of physical disk space.
Vraid6
Offers the features of Vraid5 while providing more protection for an additional drive failure, but
uses additional physical disk space.
Index
A
accessing
    updates, 37
C
cluster
    continental, 42
    Serviceguard, 11
cmviewcl
    command, 12
configuration
    environment, 9
configure
    web-based tool, 12
Configuring
    Generic Failover Attributes, 25
    Metrocluster EVA Parameters, 26
contacting Hewlett Packard Enterprise, 37
Continentalclusters, 42
    Metrocluster, 27
customer self repair, 38
D
Disaster Recovery
    Continentalclusters worksheet, 39
    Performing, 35
documentation
    providing feedback on, 38
F
failover_policy
    site_preferred, 12
FORCEFLAG, 31
H
hardware
    software, 11
Hierarchical Storage Virtualization (HSV)
    terminology, 5
HPE P6000 Continuous Access, 5
M
Metrocluster
    configuration, 9
    Metrocluster Module, 24
    Rolling, 27
    SMI-S, 27
Metrocluster package
    Metrocluster parameters, 22
    Serviceguard Manager, 23
R
remote support, 38
Replication
    Failover Preview, 27
S
Server
    Default Management Server, 15
    Management Server, 16
Site Aware Disaster Tolerant Architecture (SADTA), 35
site_preferred_manual
    site_preferred, 11
smispasswd
    configuration, 18
    evadiscovery, 14
Storage Cells
    DR Groups, 16
storage devices
    configuration, 17
support
    Hewlett Packard Enterprise, 37
U
updates
    accessing, 37
V
VDisk
    Cluster Device Special Files (cDSF), 18
W
websites, 37
    customer self repair, 38
worksheet
    Continentalclusters, 39