STP Enhancement

STP Recovery Enhancement
“Recovery Voting”
December 08, 2011
Gregory Hutchison
[email protected]
7/28/2017
© 2015 IBM Corporation
1
An STP enhancement, described in this white paper, was made available for the z10, z196, z114
and later servers. The enhancement changes how recovery works and can protect the environment
from an outage of the entire Coordinated Timing Network (CTN). It is assumed that the reader is
somewhat familiar with STP terminology (PTS = Preferred Time Server, etc.). If the MCLs listed
below are installed on your z10 or z196, you have the enhancement. If your z196 is at Driver 93,
or if you have a zEnterprise 114, you also have the enhancement.
Server        Minimum Driver   MCL          Date Available    Bundle
z10           79F              N24406.094   Sept. 28, 2011    50
zEnterprise   86E              N29799.110   August 24, 2011   44
The enhancement focuses on STP triad configurations which include a PTS/CTS, BTS and Arbiter.
Under normal circumstances, all three roles are connected via coupling links, forming a triangle of
connectivity.
Prior to this enhancement, the STP recovery design handled recovery of single failures in an
STP-only CTN. A single failure is defined as one of the three servers with an STP role being
unable to communicate with one of the other servers with a role. However, if two servers with
roles can’t communicate with the Current Time Server (CTS), the result is potentially catastrophic.
The enhancements are required to handle planned and unplanned actions that could impact all
servers with roles in an STP-only CTN with three or more servers. However, a complete failure
could still occur if both the PTS and BTS are lost.
Before discussing the enhancement, it is worth reviewing the current recovery design.
A Review of Recovery rules for 3 or more servers in a CTN
Note: In order to keep the complex discussion that follows as understandable as possible, it will be
assumed that:
• The PTS has been assigned as the Current Time Server (CTS).
• The BTS is a Stratum 2 server capable of becoming the CTS.
Only one CTS is permitted in the CTN at a time, and it can be either the PTS or the BTS. That is,
to ensure data integrity, you cannot have two Stratum 1 (S1) servers in a timing network.
The Backup Time Server (BTS) can take over as Current Time Server (CTS) only if either:
• The Preferred Time Server (PTS) can indicate it has “failed”
• The BTS can unambiguously determine the PTS has “failed”
Recovery rules are based on “voting”, as follows.
• If the BTS and Arbiter agree they cannot communicate with the PTS, then it is safe for the BTS
to take over as S1
• If the PTS cannot communicate with the BTS and Arbiter, it can no longer remain an S1, even if
it is operational and providing a time source to the rest of the CTN. Therefore, in this condition,
the PTS must take itself out of the CTN by becoming a Stratum 0 (not synchronized)!
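The voting rules above can be sketched as a small decision function. This is only an illustrative model under the stated assumption that the PTS is the CTS; it is not STP firmware logic. The names `Links` and `original_recovery_decision` are hypothetical, and each boolean simply records whether a pair of role servers can still communicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Links:
    """Hypothetical model: True means the two role servers can communicate."""
    pts_bts: bool
    pts_arbiter: bool
    bts_arbiter: bool


def original_recovery_decision(links: Links) -> dict:
    """Apply the pre-enhancement voting rules, assuming the PTS is the CTS."""
    pts_isolated = not links.pts_bts and not links.pts_arbiter
    # The BTS and Arbiter must reach each other to agree that both lost the PTS.
    bts_takes_over = links.bts_arbiter and pts_isolated
    # A PTS that has lost both the BTS and the Arbiter must drop to Stratum 0.
    pts_surrenders = pts_isolated
    return {"bts_takes_over": bts_takes_over, "pts_surrenders": pts_surrenders}
```

In this model, a case where all three links are down yields a PTS that surrenders with no server taking over, which is the CTN-wide outage the enhancement is meant to prevent.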
Original Design
[Figure: Original design. The PTS (S0), BTS/CTS (S1) and Arbiter (S2) are connected by coupling
links; LPARs P1 through P6 run on the servers in the CTN.]
• BTS loses communication with CTS on all established paths
• BTS and Arbiter communicate to establish if Arbiter also cannot communicate with CTS
• If both BTS and Arbiter cannot communicate with CTS
o BTS takes over as CTS (S1)
• Since only 1 CTS (S1) can exist,
o PTS surrenders role of CTS
If a planned disruptive action is attempted on the PTS (assuming it is the CTS), STP SE code
blocks that action until the role of PTS is reassigned to another server in the CTN or all coupling
link CHPIDs on the PTS are configured offline.
If a planned disruptive action is attempted on the BTS or Arbiter, STP code DOES NOT block the
action. This could cause a Sysplex outage if disruptive actions on the BTS and Arbiter are performed
as part of the same task, or sequentially, without reassigning or removing roles!
A planned disruptive action initiated from the HMC to Power on Reset the BTS and Arbiter would
cause the PTS to lose communication with both the BTS and Arbiter. At that point, the PTS/CTS
must surrender its Stratum 1 role and become unsynchronized. Since no other clock source is
available in the diagram above, the PTS becomes unsynchronized (S0) and CECs with LPARs P4,
P5, P6 also become unsynchronized, causing a Sysplex-wide outage.
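The pre-enhancement blocking behavior described above can be summarized in a few lines. This is a minimal sketch, assuming the PTS is the CTS; `planned_action_blocked` and its parameters are hypothetical names used only for illustration, not an STP or HMC interface.

```python
from typing import Optional


def planned_action_blocked(role: Optional[str],
                           all_coupling_chpids_offline: bool = False) -> bool:
    """Pre-enhancement behavior, assuming the PTS is the CTS (hypothetical helper)."""
    if role == "PTS":
        # Blocked until the role is reassigned (role is then no longer "PTS")
        # or every coupling link CHPID on the PTS is configured offline.
        return not all_coupling_chpids_offline
    # The BTS, the Arbiter and servers without a role were not blocked.
    return False
```

The sketch makes the gap visible: nothing stops back-to-back disruptive actions on the BTS and the Arbiter.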
New Enhancement Design
A healthy CTN looks like this diagram.
[Figure: Full triad attachment state. The PTS/CTS, BTS and Arbiter are all attached to each
other, forming a triangle.]
In the next diagram, if there is a failure of communication between two of the three servers with
roles, a partial triad attachment state is detected by STP. While in this condition, recovery
voting is available as usual. That is, the BTS can take over as the CTS using Arbiter assisted
recovery, but as we’ve stated, the CTS must surrender its role when it loses attachment to the
remaining special role server!
[Figure: Partial triad attachment state. Three variants of the PTS/CTS, BTS and Arbiter triangle,
each with one pair of special role servers not attached.]
Due to the normal recovery voting rules, the CTS must surrender its role when it loses attachment
to both the BTS and the Arbiter (either planned or unplanned). This is why IBM has recommended
that roles be moved to another server if possible. If it is not possible to move the role to another
server, then IBM recommends that the role be completely removed from the CTN temporarily (not
configured).
Note: A further enhancement now prohibits customers from accidentally causing a Degraded
Triad state (shown next). STP was enhanced to consistently block a potential disruptive action on
any server that has an STP role (PTS, BTS or Arbiter). With specific MCLs installed on machines
with these roles, STP forces the role to be removed or reassigned prior to proceeding with a
disruptive action. See http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102019
for more information on this additional enhancement. Neither enhancement is available on z9,
z990 or z890.
The enhancement discussed in this white paper directs STP to NOT do recovery voting as usual
when in a Degraded Triad attachment state, whenever both the BTS and the Arbiter are
unavailable, as shown in the next diagram. This enhancement will prevent the Stratum 1 server
from surrendering its role in the CTN.
[Figure: Degraded triad attachment state (new: voting disabled). Three variants of the PTS/CTS,
BTS and Arbiter triangle, in which any two of the special role servers cannot communicate with
the third.]
STP design changes – when in a Degraded Triad State
When the triad state transitions to the degraded triad attachment state, either from full attachment
state or partial attachment state, triad voting is disabled by the CTS.
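The three attachment states can be told apart by counting how many pairs of role servers are still attached. The sketch below is an assumption-laden simplification: it treats three attached pairs as full, two as partial, and one or none as degraded (one role server cut off from both of the others). `TriadState` and `classify_triad` are hypothetical names, not STP interfaces.

```python
from enum import Enum


class TriadState(Enum):
    FULL = "full"
    PARTIAL = "partial"
    DEGRADED = "degraded"


def classify_triad(cts_bts: bool, cts_arbiter: bool, bts_arbiter: bool) -> TriadState:
    """Classify the triad from pairwise attachment (True = attached)."""
    attached_pairs = sum([cts_bts, cts_arbiter, bts_arbiter])
    if attached_pairs == 3:
        return TriadState.FULL      # triangle of connectivity intact
    if attached_pairs == 2:
        return TriadState.PARTIAL   # exactly one pair not attached
    # Assumption: with at most one pair attached, some role server is
    # isolated from both of the others, which is the degraded case.
    return TriadState.DEGRADED
```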
When triad voting has been disabled:
• The BTS cannot take over as CTS using Arbiter Assisted recovery
• The CTS will not surrender its role when it loses attachment to the remaining special role
server
• The BTS can still take over as CTS using either Console Assisted Recovery (CAR) or the
STP Going Away Signal (GOSIG) transmitted from the CTS.
GOSIG requires zEnterprise servers with InfiniBand (IFB) links using HCA3-O to
HCA3-O (12x IFB or 12x IFB3) or HCA3-O LR to HCA3-O LR (1x IFB).
Triad voting is re-enabled when the triad state transitions back to full attachment state.
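One detail worth noting is that voting is disabled on entry to the degraded state but re-enabled only on return to the full state, so passing through the partial state does not turn it back on. A minimal sketch of that behavior follows; the `TriadVoting` class, its method names and the state strings are hypothetical and only mirror the rules listed above.

```python
class TriadVoting:
    """Hypothetical tracker for whether triad voting is enabled on the CTS."""

    def __init__(self) -> None:
        self.enabled = True  # assumed starting point: a healthy, full triad

    def on_state_change(self, new_state: str) -> None:
        if new_state == "degraded":
            self.enabled = False   # disabled from either full or partial
        elif new_state == "full":
            self.enabled = True    # only the full state re-enables voting
        elif new_state != "partial":
            raise ValueError(f"unknown triad state: {new_state!r}")
        # "partial" leaves the current setting unchanged

    def arbiter_assisted_takeover_allowed(self) -> bool:
        return self.enabled

    def cts_surrenders_on_lost_attachment(self) -> bool:
        return self.enabled
```

Console Assisted Recovery and GOSIG are deliberately left out of the model, since they remain available whether or not voting is enabled.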
Enhanced Design - BTS takeover as CTS example – after design change
In the new enhanced scenario, if the PTS were to checkstop, the BTS takes over as the Stratum 1
and then voting is disabled (assuming the roles have not been reassigned to another server).
If the BTS subsequently fails, no other server is available to take over as the Stratum 1, so the
customer should still reassign the PTS, BTS and Arbiter roles to maintain a robust configuration.
If the Arbiter subsequently fails or is Power on Reset (planned action), the BTS/CTS does not
surrender the Stratum 1 role and thus CECs with LPARs P4, P5, P6 stay synchronized as S2
servers. Since voting is temporarily disabled, there is minimal impact to the Sysplex.
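[Figure: Enhanced design after BTS takeover. The PTS (S0), BTS/CTS (S1) and Arbiter (S2) are
connected by coupling links; LPARs P1 through P6 run on the servers in the CTN.]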