ServerNet Cluster Manual

Abstract

This manual describes the installation, configuration, and management of HP NonStop™ ServerNet Cluster hardware and software for ServerNet clusters that include the ServerNet Cluster Switch model 6770.

Product Version: N.A.

Supported Release Version Updates (RVUs)

This guide supports G06.21 and H06.03 and all subsequent G-series and H-series release version updates until otherwise indicated by its replacement publication.

Part Number: 520575-003    Published: July 2006

Document History

Part Number   Product Version   Published
520371-001    N.A.              May 2001
520440-001    N.A.              July 2001
520575-001    N.A.              November 2001
520575-002    N.A.              May 2002
520575-003    N.A.              July 2006

Contents

What's New in This Manual xvii
  Manual Information xvii
  New and Changed Information xviii
About This Manual xix
  Where to Find More Information xxi
  Notation Conventions xxii

Part I. Introduction

1. ServerNet Cluster Description
  The ServerNet Cluster Product 1-2
    Three Network Topologies Supported 1-2
    Hardware and Software Components for Clustering 1-9
    Coexistence With Expand-Based Networking Products 1-10
    Benefits of Clustering 1-12
  ServerNet Cluster Terminology 1-12
    Cluster 1-12
    Star Group 1-12
    Node 1-13
    Node Number 1-13
    X and Y Fabrics 1-16
  ServerNet Cluster Hardware Overview 1-18
    Routers 1-18
    Service Processors (SPs) 1-18
    Modular ServerNet Expansion Boards (MSEBs) 1-19
    Plug-In Cards 1-22
    Node-Numbering Agent (NNA) FPGA 1-22
    Cluster Switch 1-26
    ServerNet Cables for Each Node 1-30
    Connections Between Cluster Switches 1-30
  ServerNet Cluster Software Overview 1-33
    SNETMON and the ServerNet Cluster Subsystem 1-33
    MSGMON 1-37
    NonStop Kernel Message System 1-38
    SANMAN Subsystem and SANMAN 1-38
    Expand 1-39
    OSM and TSM Software 1-44
    SCF 1-45

Part II. Planning and Installation

2. Planning for Installation
  Using the Planning Checklist 2-1
  Planning for the Topology 2-8
    Considerations for Choosing a Topology 2-8
    Software Requirements for the Star, Split-Star, and Tri-Star Topologies 2-9
    Subsets of a Topology 2-9
  Planning for Hardware 2-9
    Cluster Switches Required for Each Topology 2-9
    Alternate Cluster Switch Packaging 2-10
    Two Fiber-Optic Cables Needed for Each Server 2-10
    Fiber-Optic Cables for Multilane Links 2-10
    Fiber-Optic Cable Information 2-11
    Two MSEBs Needed for Each Server 2-13
  Planning for Floor Space 2-18
    Locating the Servers and Cluster Switches 2-19
    Floor Space for Servicing of Cluster Switches 2-21
  Planning for Power 2-22
  Planning for Software 2-23
    Minimum Software Requirements 2-23
    SP Firmware Requirement for Systems With Tetra 8 Topology 2-26
    Verifying the System Name, Expand Node Number, and Time Zone Offset 2-26
    Planning for Compatibility With Other Expand Line Types 2-26
  Planning for Serviceability 2-27
    Planning for the System Consoles 2-27
    Planning for the Dedicated OSM or TSM LAN 2-27
3. Installing and Configuring a ServerNet Cluster
  Installing a ServerNet Cluster Using a Star Topology 3-1
    Task 1: Complete the Planning Checklist 3-2
    Task 2: Inventory Your Hardware 3-3
    Task 3: Install the Servers 3-4
    Task 4: Upgrade the Operating System and Software 3-4
    Task 5: Install MSEBs in Slots 51 and 52 of Group 01 3-5
    Task 6: Add MSGMON, SANMAN, and SNETMON to the System-Configuration Database 3-7
    Task 7: Verify That $ZEXP and $NCP Are Started 3-11
    Task 8: Install the Cluster Switches 3-12
    Task 9: Perform the Guided Procedure for Configuring a ServerNet Node 3-17
    Task 10: Check for Problems 3-23
    Task 11: Add the Remaining Nodes to the Cluster 3-24
  Installing a ServerNet Cluster Using the Split-Star Topology 3-25
    Task 1: Decide Which Nodes Will Occupy the Two Star Groups of the Split-Star Topology 3-25
    Task 2: Route the Fiber-Optic Cables for the Four-Lane Links 3-26
    Task 3: Install the Two Star Groups of the Split-Star Topology 3-26
    Task 4: Use the Guided Procedure to Prepare to Join the Clusters 3-27
    Task 5: Connect the Four-Lane Links 3-27
    Task 6: Configure Expand-Over-ServerNet Lines for the Remote Nodes 3-29
    Task 7: Verify Operation of the Cluster Switches 3-29
    Task 8: Verify Cluster Connectivity 3-29
  Installing a ServerNet Cluster Using the Tri-Star Topology 3-30
    Task 1: Decide Which Nodes Will Occupy the Three Star Groups of the Tri-Star Topology 3-30
    Task 2: Route the Fiber-Optic Cables for the Two-Lane Links 3-32
    Task 3: Install the Three Star Groups of the Tri-Star Topology 3-32
    Task 4: Use the Guided Procedure to Prepare to Merge the Clusters 3-32
      Connecting the Two-Lane Links 3-33
    Task 5: Configure and Start Expand-Over-ServerNet Lines 3-35
    Task 6: Verify Operation of the Cluster Switches 3-35
    Task 7: Verify Cluster Connectivity 3-36
4. Upgrading a ServerNet Cluster
  Benefits of Upgrading 4-2
    Benefits of Upgrading to G06.12 (Release 2) Functionality 4-2
    Benefits of Upgrading to G06.14 (Release 3) Functionality 4-3
    Benefits of Upgrading to G06.16 Functionality 4-3
  Planning Tasks for Upgrading a ServerNet Cluster 4-4
    Task 1: Identify the Current Topology 4-4
    Task 2: Choose the Topology That You Want to Upgrade To 4-6
    Task 3: Fill Out the Planning Worksheet 4-9
    Task 4: Select an Upgrade Path 4-12
  Upgrading Software to Obtain G06.12 Functionality 4-17
    Upgrading Software Without System Loads to Obtain G06.12 Functionality 4-21
    Upgrading Software With System Loads to Obtain G06.12 Functionality 4-25
  Fallback for Upgrading Software to Obtain G06.12 Functionality 4-26
    Fallback for Upgrading ServerNet Cluster Software Without a System Load to Obtain G06.12 Functionality 4-26
    Fallback for Upgrading ServerNet Cluster Software With System Loads to Obtain G06.12 Functionality 4-30
  Upgrading Software to Obtain G06.14 Functionality 4-34
    Upgrading Software Without System Loads to Obtain G06.14 Functionality 4-35
    Steps for Upgrading Software Without System Loads to Obtain G06.14 Functionality 4-38
    Upgrading Software With System Loads to Obtain G06.14 Functionality 4-42
    Steps for Upgrading Software With System Loads to Obtain G06.14 Functionality 4-45
  Fallback for Upgrading Software to Obtain G06.14 Functionality 4-50
  Merging Clusters to Create a Split-Star Topology 4-54
    Example: Merging Two Star Topologies to Create a Split-Star Topology 4-54
    Steps for Merging Two Star Topologies to Create a Split-Star Topology 4-60
    Connecting the Four-Lane Links 4-64
    Fallback for Merging Clusters to Create a Split-Star Topology 4-66
  Merging Clusters to Create a Tri-Star Topology 4-68
    Example: Merging Three Star Topologies to Create a Tri-Star Topology 4-68
    Steps for Merging Three Star Topologies to Create a Tri-Star Topology 4-73
    Example: Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-78
    Steps for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-82
    Connecting the Two-Lane Links 4-87
  Fallback for Merging Clusters to Create a Tri-Star Topology 4-89
    Fallback for Merging Three Star Topologies to Create a Tri-Star Topology 4-89
    Fallback for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-89
  Reference Information 4-93
    Considerations for Upgrading SANMAN and TSM 4-93
    Considerations for Upgrading SNETMON/MSGMON and the Operating System 4-95
    Updating the Firmware and Configuration 4-97
    Updating Service Processor (SP) Firmware 4-105
    Updating the Subsystem Control Facility (SCF) 4-105

Part III. Operations and Management
5. Managing a ServerNet Cluster
  Monitoring Tasks 5-1
    Displaying Status Information Using the TSM Service Application 5-1
    Running SCF Remotely 5-12
    Quick Reference: SCF Commands for Monitoring a ServerNet Cluster 5-12
    Displaying Information About the ServerNet Cluster Monitor Process (SNETMON) 5-13
    Checking the Status of SNETMON 5-14
    Checking the Status of the ServerNet Cluster Subsystem 5-16
    Checking ServerNet Cluster Connections 5-17
    Checking the Version of the ServerNet Cluster Subsystem 5-17
    Generating Statistics 5-18
    Monitoring Expand-Over-ServerNet Line-Handler Processes 5-20
    Monitoring Expand-Over-ServerNet Lines and Paths 5-20
  Control Tasks 5-26
    Quick Reference: SCF Commands for Controlling a ServerNet Cluster 5-27
    SCF Objects for Managing a ServerNet Cluster 5-27
    Starting the Message Monitor Process (MSGMON) 5-28
    Aborting the Message Monitor Process (MSGMON) 5-28
    Starting the External ServerNet SAN Manager Process (SANMAN) 5-29
    Aborting the External ServerNet SAN Manager Process (SANMAN) 5-29
    Restarting the External ServerNet SAN Manager Process (SANMAN) 5-30
    Starting the ServerNet Cluster Monitor Process (SNETMON) 5-30
    Aborting the ServerNet Cluster Monitor Process (SNETMON) 5-30
    Starting ServerNet Cluster Services 5-31
      When a System Joins a ServerNet Cluster 5-31
    Stopping ServerNet Cluster Services 5-33
    Switching the SNETMON or SANMAN Primary and Backup Processes 5-34

6. Adding or Removing a Node
  Adding a Node to a ServerNet Cluster 6-1
  Removing a Node From a ServerNet Cluster 6-2
  Moving a Node From One ServerNet Cluster to Another 6-4
  Moving ServerNet Cables to Different Ports on the ServerNet II Switches 6-5
  Expanding or Reducing a Node in a ServerNet Cluster 6-11
  Splitting a Large Cluster Into Multiple Smaller Clusters 6-11

7. Troubleshooting and Replacement Procedures
  Troubleshooting Procedures 7-1
    Troubleshooting Tips 7-1
    Software Problem Areas 7-3
    Hardware Problem Areas 7-6
    Troubleshooting the Cluster Tab in the TSM Service Application 7-9
    Online Help for the Guided Procedures 7-11
    Using TSM Alarms 7-12
    Troubleshooting SNETMON 7-17
    Troubleshooting MSGMON 7-19
    Troubleshooting SANMAN 7-20
    Troubleshooting Expand-Over-ServerNet Line-Handler Processes and Lines 7-21
    Checking Communications With a Remote Node 7-23
    Methods for Repairing ServerNet Connectivity Problems 7-23
    Automatic Fail-Over for Two-Lane and Four-Lane Links 7-25
    Checking the Internal ServerNet X and Y Fabrics 7-26
    Checking the External ServerNet X and Y Fabrics 7-29
    Using the Internal Loopback Test Action 7-30
    Using SCF to Check Processor-to-Processor Connections 7-31
    Finding ServerNet Cluster Event Messages in the Event Log 7-31
    MSEB and ServerNet II Switch LEDs 7-33
  Replacement Procedures 7-35
    Replacing an MSEB 7-35
    Replacing a PIC in a ServerNet II Switch 7-35
    Replacing a PIC in an MSEB 7-36
    Replacing a Fiber-Optic Cable Between an MSEB and a ServerNet II Switch 7-36
    Replacing a Fiber-Optic Cable in a Multilane Link 7-37
    Replacing a ServerNet II Switch 7-38
    Replacing an AC Transfer Switch 7-38
    Replacing a UPS 7-38
  Diagnosing Performance Problems 7-39

Part IV. SCF
8. SCF Commands for SNETMON and the ServerNet Cluster Subsystem
  ServerNet Cluster SCF Objects 8-2
  Sensitive and Nonsensitive Commands 8-2
  SCL SUBSYS Object Summary States 8-3
  ServerNet Cluster Subsystem Start State (STARTSTATE Attribute) 8-4
  ALTER Command 8-5
    Considerations 8-5
    Example 8-6
  INFO Command 8-6
    Example 8-6
  PRIMARY Command 8-7
    Consideration 8-8
    Example 8-8
  START Command 8-8
    Considerations 8-8
    Example 8-9
  STATUS Command 8-9
    Considerations 8-11
    STATUS SUBNET Command Example 8-11
    STATUS SUBNET, PROBLEMS Command Example 8-14
    STATUS SUBNET, DETAIL Command Example (Partial Display) 8-15
    STATUS SUBSYS Command Example 8-21
  STOP Command 8-21
    Considerations 8-22
    Example 8-22
  TRACE Command 8-22
    Considerations 8-24
    Examples 8-24
  VERSION Command 8-25
    Examples 8-25

9. SCF Commands for the External ServerNet SAN Manager Subsystem
  SANMAN SCF Objects 9-2
  Sensitive and Nonsensitive Commands 9-2
  ALTER Command 9-3
    Considerations 9-4
    ALTER SWITCH Command Examples 9-4
  INFO Command 9-5
    Consideration 9-5
    INFO CONNECTION Command Example 9-6
    INFO SWITCH Command Example 9-9
  LOAD Command 9-13
    Considerations 9-14
    LOAD SWITCH Command Examples 9-15
  PRIMARY Command 9-16
    Consideration 9-16
    Example 9-16
  RESET Command 9-16
    Considerations 9-17
    RESET SWITCH Command Examples 9-17
  STATUS Command 9-18
    Considerations 9-19
    STATUS CONNECTION Command Example 9-19
    STATUS CONNECTION, NNA Command Example 9-22
    STATUS SWITCH Command Example 9-25
    STATUS SWITCH, ROUTER Command Example 9-33
  TRACE Command 9-34
    Considerations 9-35
    Examples 9-35
  VERSION Command 9-36
    Examples 9-36
  Command Status Enumeration 9-37
  Status Detail Enumeration 9-37

10. SCF Error Messages
  Types of SCF Error Messages 10-1
    Command Parsing Error Messages 10-1
    SCF-Generated Numbered Error Messages 10-1
    Common Error Messages 10-1
    SCL Subsystem-Specific Error Messages 10-1
  SCF Error Messages Help 10-2
  ServerNet Cluster (SCL) Error Messages 10-2
  SANMAN (SMN) Error Messages 10-7
  If You Have to Call Your Service Provider 10-12

A. Part Numbers

B. Blank Planning Forms

C. ESD Information

D. Service Categories for Hardware Components

E. TACL Macro for Configuring MSGMON, SANMAN, and SNETMON
  Example Macro E-2

F. Common System Operations
  Logging On to the TSM Low-Level Link Application F-1
  Logging On to the TSM Service Application F-2
  Logging On to Multiple TSM Client Applications F-3
  Starting a TACL Session Using the Outside View Application F-5
    Starting a TACL From the TSM Service Application F-5
    Starting a TACL From the TSM Low-Level Link Application F-5
  Using the TSM EMS Event Viewer F-6

G. Fiber-Optic Cable Information
  Fiber-Optic Cabling Model G-1
  Optical Characteristics G-3
    Optical Fiber Connection G-3
    Insertion Loss G-3
    ServerNet MDI Optical Power Requirements G-4
  Connectors G-4
  ServerNet Cluster Connections G-5
    Node Connections G-5
    Cluster Switch Connections G-5

H. Using OSM to Manage the Star Topologies
  ServerNet Cluster Resource Appears at Top Level of Tree Pane H-1
  Some Cluster Resources Are Represented Differently in OSM H-1
  Guided Procedures Have Changed H-2
  Options for Changing Topologies H-2
  For More Information About OSM H-3
I. SCF Changes at G06.21
  Using SCF Commands for SNETMON and the ServerNet Cluster Subsystem I-1
    ALTER SUBSYS I-1
    INFO SUBSYS I-1
    STATUS SUBNET I-1
    TRACE I-1
  Using SCF Commands for SANMAN I-2
    ALTER SWITCH I-2
    STATUS CONN I-2
    STATUS SWITCH I-2
  New SCF Error Messages for SANMAN I-2

Safety and Compliance

Index

Examples

Example 5-1.   INFO PROCESS Command 5-14
Example 5-2.   LISTDEV Command 5-14
Example 5-3.   STATUS PROCESS Command 5-15
Example 5-4.   STATUS SUBSYS $ZZSCL Command 5-16
Example 5-5.   INFO SUBSYS $ZZSCL Command 5-16
Example 5-6.   STATUS SUBNET $ZZSCL Command 5-17
Example 5-7.   VERSION SUBSYS, DETAIL Command 5-17
Example 5-8.   STATUS DEVICE Command Showing STARTED Line-Handler Process 5-20
Example 5-9.   STATUS DEVICE Command Showing STOPPED Line-Handler Process 5-20
Example 5-10.  STATUS LINE, DETAIL Command 5-22
Example 5-11.  STATUS PATH, DETAIL Command 5-23
Example 5-12.  STATS LINE Command 5-23
Example 5-13.  STATS PATH Command 5-24
Example 5-14.  INFO LINE, DETAIL Command 5-24
Example 5-15.  INFO PATH, DETAIL Command 5-25
Example 5-16.  INFO PROCESS $NCP, LINESET Command 5-25
Example 5-17.  INFO PROCESS $NCP, NETMAP Command 5-26

Figures

Figure 1-1.   ServerNet Cluster Topologies (Both Fabrics Shown) 1-3
Figure 1-2.   8-Node ServerNet Cluster Using Star Topology 1-4
Figure 1-3.   16-Node ServerNet Cluster Using Split-Star Topology 1-6
Figure 1-4.   24-Node ServerNet Cluster Using Tri-Star Topology 1-8
Figure 1-5.   ServerNet Cluster Coexistence With a FOX Ring 1-11
Figure 1-6.   ServerNet Node Numbers in a Split-Star Topology (One Fabric Shown) 1-14
Figure 1-7.   ServerNet Node Numbers in a Tri-Star Topology (One Fabric Shown) 1-15
Figure 1-8.   Simplified Logical Diagram Showing Internal X and Y Fabrics 1-17
Figure 1-9.   ServerNet Expansion Board (SEB) and Modular ServerNet Expansion Board (MSEB) 1-20
Figure 1-10.  Service Side Slots for NonStop Sxx000 Processor Enclosure 1-21
Figure 1-11.  NNA and ECL PICs 1-23
Figure 1-12.  ServerNet Packet and ServerNet ID 1-24
Figure 1-13.  ServerNet Addressing Scheme for External Fabrics 1-25
Figure 1-14.  Cluster Switch Enclosure 1-27
Figure 1-15.  Cluster Switch Components 1-28
Figure 1-16.  Cluster Switch Block Diagram 1-29
Figure 1-17.  Routing Across the Four-Lane Links 1-31
Figure 1-18.  Routing Across the Two-Lane Links 1-32
Figure 1-19.  ServerNet Cluster Logical Diagram 1-35
Figure 1-20.  Line-Handler Processes in a Four-Node Cluster 1-41
Figure 1-21.  Message Passing Over ServerNet 1-43
Figure 2-1.   Duplex SC Cable Connectors With Dust Caps 2-11
Figure 2-2.   Slots 51 and 52 of Group 01 in a NonStop Sxx000 Server 2-14
Figure 2-3.   SEBs and MSEBs 2-15
Figure 2-4.   Single-Mode Fiber-Optic PIC Installed in Port 6 2-16
Figure 2-5.   SEB and MSEB Connectors (Actual Size) 2-17
Figure 2-6.   Maximum Cable Lengths for Servers and Switches 2-19
Figure 2-7.   Maximum Cable Lengths With a Multilane Link 2-19
Figure 2-8.   Placement of Cluster Switches 2-20
Figure 2-9.   ServerNet II Switch Extended for Servicing 2-22
Figure 2-10.  LAN Connections for NonStop S-Series Servers 2-28
Figure 2-11.  Recommended Configuration for Dedicated LAN 2-29
Figure 2-12.  Dedicated LAN 2-30
Figure 2-13.  Public LAN 2-31
Figure 2-14.  Ethernet LANs Serving Individual Nodes 2-34
Figure 2-15.  Ethernet LAN Serving Multiple Nodes 2-35
Figure 3-1.   Connecting a ServerNet Cable to an ECL PIC 3-6
Figure 3-2.   Inspecting Fiber-Optic Cable Connectors 3-19
Figure 3-3.   Key Positions on ServerNet II Switch Ports 3-20
Figure 3-4.   Inserting a Fiber-Optic Cable Connector Into an MSEB Receptacle 3-20
Figure 3-5.   Inserted Connector With Bad Fiber Connection 3-21
Figure 3-6.   Effect of Uneven Fiber Insertion on Link Alive 3-21
Figure 4-1.   Example of Upgrading Software for a Four-Node Cluster Without a System Load 4-19
Figure 4-2.   Example of Upgrading Software for a Four-Node Cluster With System Loads 4-20
Figure 4-3.   Example of Upgrading Software Without System Loads to Obtain G06.14 Functionality 4-37
Figure 4-4.   Example of Upgrading Software With System Loads to Obtain G06.14 Functionality 4-44
Figure 4-5.   Example of Merging Clusters Containing Pre-G06.13 Nodes 4-56
Figure 4-6.   Example of Upgrading a Cluster to a Release 2 Split-Star Topology and Adding Nodes 4-58
Figure 4-7.   Example of a Split-Star Topology With 80-Meter Four-Lane Links 4-59
Figure 4-8.   Key Positions on ServerNet II Switch Ports 4-64
Figure 4-9.   Before the Upgrade: Example of Merging Three Star Topologies to Create a Tri-Star Topology 4-71
Figure 4-10.  After the Upgrade: Example of Merging Three Star Topologies to Create a Tri-Star Topology 4-72
Figure 4-11.  Before the Upgrade: Example of Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-81
Figure 4-12.  After the Upgrade: Example of Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-82
Figure 4-13.  Key Positions on ServerNet II Switch Ports 4-87
Figure 5-1.   TSM Management Window 5-2
Figure 5-2.   Attributes for the MSEB 5-3
Figure 5-3.   Attributes for the MSEB PIC 5-3
Figure 5-4.   Tree Pane in the TSM Service Application 5-4
Figure 5-5.   Attributes for the ServerNet Cluster Resource 5-5
Figure 5-6.   Attributes for the Local Node 5-5
Figure 5-7.   Attributes for the Remote Node Resource 5-6
Figure 5-8.   Attributes for the External Fabric Resource 5-6
Figure 5-9.   Attributes for the Switch Resource 5-7
Figure 5-10.  Attributes for the Switch-to-Node Link 5-8
Figure 5-11.  Attributes for the Switch-to-Switch Link 5-8
Figure 5-12.  Attributes for the Remote Switch Object 5-9
Figure 5-13.  F1 Help for Service State Attribute 5-10
Figure 5-14.  Physical/Connection View of External Fabric in a Split-Star Topology 5-11
Figure 5-15.  Physical/Connection View of External Fabric in a Tri-Star Topology 5-11
Figure 5-16.  SNETMON Status Displayed by TSM Service Application 5-15
Figure 5-17.  Generate ServerNet Statistics Action 5-19
Figure 5-18.  Checking the Expand-Over-ServerNet Lines Using TSM 5-21
Figure 5-19.  ServerNet Cluster Connection Status Dialog Box 5-22
Figure 6-1.   ServerNet II Switch Component of Cluster Switch 6-5
Figure 7-1.   Management Window Tree Pane Showing Cluster Tab 7-9
Figure 7-2.   Fabric Alarm Example 7-12
Figure 7-3.   Alarms Tab Example 7-12
Figure 7-4.   Alarm Detail Example 7-13
Figure 7-5.   Repair Actions Example 7-14
Figure 7-6.   ServerNet Cluster Attributes Showing SNETMON and SANMAN States 7-18
Figure 7-7.   Remote Node Attributes Showing Expand Information 7-22
Figure 7-8.   MSEB LEDs 7-33
Figure 7-9.   ServerNet II Switch LEDs 7-34
Figure 8-1.   ServerNet Cluster Subsystem States 8-4
Figure C-1.   Using ESD Protection When Servicing CRUs C-2
Figure F-1.   Log On to TSM Low-Level Link Dialog Box F-1
Figure F-2.   Log On to TSM Service Connection Dialog Box F-2
Figure F-3.   Multiple TSM Client Applications F-4
Figure G-1.   Fiber-Optic Cable Model G-1
Figure G-2.   Zipcord Cable Drawing G-2
Figure G-3.   Ruggedized Cable Drawing G-2
Figure G-4.   Duplex SC Connector and Receptacle G-4

Tables

Table 1-1.   Topology Subsets 1-7
Table 1-2.   Hardware Components for Clustering 1-9
Table 1-3.   PICs Used for ServerNet Cluster Installations 1-22
Table 1-4.   Using TSM Client Applications to Manage a ServerNet Cluster 1-44
Table 1-5.   SNETMON and SANMAN SCF Commands 1-45
Table 2-1.   Planning Checklist 2-2
Table 2-2.   Maximum Cluster Size for Each Topology 2-8
Table 2-3.   Cluster Switch Requirements for Each Topology 2-9
Table 2-4.   Cable Length Requirements for Multilane Links 2-11
Table 2-5.   Fiber-Optic Cable Requirements 2-12
Table 2-6.   Switch Enclosure Dimensions 2-21
Table 2-7.   Cluster Switch Power Requirements 2-22
Table 2-8.   Minimum Software Requirements for Each Topology 2-23
Table 2-9.   Checking SPR Levels 2-24
Table 2-10.  Version Procedure Information for ServerNet Cluster Software 2-25
Table 3-1.   Task Summary for Installing a ServerNet Cluster With a Star Topology 3-2
Table 3-2.   Hardware Inventory Checklist 3-3
Table 3-3.   Decision Table for Updating the ServerNet II Switch Firmware and Configuration 3-13
Table 3-4.   Firmware and Configuration Compatibility With the NonStop Kernel 3-14
Table 3-5.   Minimum SPR levels for G06.12 and G06.14 ServerNet Cluster Functionality 3-15
Table 3-6.   Task Summary for Installing a ServerNet Cluster Using a Split-Star Topology 3-25
Table 3-7.   Planning for the Split-Star Topology 3-26
Table 3-8.   Cable Connections for the Four-Lane Links 3-28
Table 3-9.   Task Summary for Installing a ServerNet Cluster Using a Tri-Star Topology 3-30
Table 3-10.  Planning for the Tri-Star Topology 3-31
Table 3-11.  Cable Connections for the Two-Lane Links 3-34
Table 4-1.   ServerNet Cluster Releases 4-2
Table 4-2.   G06.16 SPRs 4-3
Table 4-3.   Supported Configuration Tags 4-5
Table 4-4.   Supported Topologies, Cluster Switch Positions, and ServerNet Node Numbers 4-6
Table 4-5.   Comparison of ServerNet Cluster Topologies 4-7
Table 4-6.   SPRs for G06.12 and G06.14 ServerNet Cluster Functionality 4-8
Table 4-7.   Upgrade Planning Worksheet 4-10
Table 4-8.   T0569 Firmware Revisions 4-12
Table 4-9.   SCF and TSM Display of T0569 Configuration Revisions 4-12
Table 4-10.  Supported Upgrade Paths for Clusters Using the Star Topology 4-13
Table 4-11.  Supported Upgrade Paths for Clusters Using the Split-Star Topology 4-14
Table 4-12.  Supported Upgrade Paths for Clusters Using the Tri-Star Topology 4-16
Table 4-13.  Upgrade Summary: Upgrading Software to Obtain G06.12 Functionality 4-18
Table 4-14.  Upgrade Summary: Upgrading Software Without System Loads to Obtain G06.14 Functionality 4-36
Table 4-15.  Upgrade Summary: Upgrading Software With System Loads to Obtain G06.14 Functionality 4-43
Table 4-16.  Upgrade Summary: Upgrading Software to Create a Split-Star Topology (G06.12 Functionality) 4-55
Table 4-17.  Planning for Nodes in the Split-Star Topology 4-61
Table 4-18.  Planning for Cluster Switches in the Split-Star Topology 4-61
Table 4-19.  Four-Lane Link Connections for the Split-Star Topology 4-65
Table 4-20.  Upgrade Summary: Merging Three Star Topologies to Create a Tri-Star Topology 4-69
Table 4-21.  Planning for Cluster Switches in the Tri-Star Topology 4-74
Table 4-22.  Planning for Nodes in the Tri-Star Topology 4-75
Table 4-23.  Upgrade Summary: Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology 4-79
Table 4-24.  Two-Lane Link Connections for the Tri-Star Topology 4-88
Table 4-25.  SANMAN and TSM Considerations 4-93
Table 4-26.  SANMAN and TSM SPRs 4-94
Table 4-27.  SANMAN and RVU Compatibility 4-94
Table 4-28.  SNETMON, MSGMON, and Operating System Version Compatibility 4-95
Table 4-29.  Length of Four-Lane Link and Operating System 4-96
Table 4-30.  Cluster Switch Positions and ServerNet Node Numbers 4-97
Table 4-31.  Firmware and Configuration File Names 4-98
Table 4-32.  Firmware and Configuration Compatibility With the NonStop Kernel 4-99
Table 4-33.  Upgrading T0569: Sequence for Downloading ServerNet II Switch Firmware and Configuration 4-104
Table 4-34.  Falling Back T0569: Sequence for Downloading ServerNet II Switch Firmware and Configuration 4-105
Table 5-1.   SCF Commands for Monitoring a ServerNet Cluster 5-12
Table 5-2.   SCF Commands for Controlling a ServerNet Cluster 5-27
Table 7-1.   Software Problem Areas 7-3
Table 7-2.   Hardware Problem Areas 7-6
Table 7-3.   Automatic Fail-Over of ServerNet Traffic on a Four-Lane Link (X or Y Fabric) 7-26
Table 7-4.   Scope of Node Connectivity ServerNet Path Test 7-29
Table 7-5.   Names of Associated Subsystems 7-31
Table 8-1.   ServerNet Cluster SCF Objects and Commands 8-1
Table 8-2.   SCF Features for SNETMON and the SCL Subsystem by RVU 8-2
Table 8-3.   SCF Object Summary States 8-3
Table 8-4.   Path State Values Returned by the STATUS SUBNET, DETAIL Command 8-17
Table 9-1.   External ServerNet SAN Manager (SMN) Subsystem SCF Commands 9-1
Table 9-2.   SCF Features for SANMAN by RVU 9-2
Table 9-3.   SANMAN Cluster Switch Polling Intervals 9-12
Table 9-4.   Switch Port Status Codes and Possible Values 9-31
Table D-1.   Service Categories for Hardware Components D-1
Table G-1.   Optical Fiber and Cable Characteristics G-3
Table G-2.   Single-Mode Fiber Insertion Loss G-3

What's New in This Manual
New and Changed Information

This document has been updated throughout to incorporate changes to product and company names.

This document now incorporates the ServerNet Cluster 6770 Supplement, for easier access and linking to the information. That relocated content includes:

• Appendix H, Using OSM to Manage the Star Topologies, which describes some of the differences when using OSM instead of TSM to manage your star-topology ServerNet cluster on NonStop S-series systems and directs you to other OSM documentation for additional information.
• Appendix I, SCF Changes at G06.21, which describes SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies.

References to other ServerNet cluster documentation have been added. See Other ServerNet Cluster Manuals on page xxi.

The part numbers in Appendix A, Part Numbers, have been removed. For an up-to-date list of part numbers, refer to: NTL Support and Service Library > Service Information > Part Numbers > Part Number List for NonStop S-Series Customer Replaceable Units (CRUs) > ServerNet Cluster (Model 6770).

The glossary has been removed. For definitions of common terms, see the NonStop System Glossary.

About This Manual

The following describes the sections of this manual.

Part I. Introduction
  Section 1, ServerNet Cluster Description: Introduces the ServerNet Cluster product. It describes hardware components for the 6770 ServerNet Cluster Switch, software components, and the concepts that are essential to understanding the operation of a ServerNet cluster.

Part II. Planning and Installation
  Section 2, Planning for Installation: Describes how to plan for installing ServerNet cluster hardware and software.
  Section 3, Installing and Configuring a ServerNet Cluster: Describes how to install ServerNet cluster hardware and software.
  Section 4, Upgrading a ServerNet Cluster: Describes how to upgrade a ServerNet cluster to a split-star topology that supports up to 16 nodes or a tri-star topology that supports up to 24 nodes.

Part III. Operations and Management
  Section 5, Managing a ServerNet Cluster: Describes tasks for monitoring and controlling a ServerNet cluster.
  Section 6, Adding or Removing a Node: Describes how to change the size of an already-installed ServerNet cluster (without changing the topology).
  Section 7, Troubleshooting and Replacement Procedures: Describes how to use the TSM Service Application, the Subsystem Control Facility (SCF), and guided procedures to troubleshoot and diagnose a ServerNet cluster. This section also contains replacement procedures for the main hardware components of a ServerNet cluster.

Part IV. SCF
  Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem: Describes the SCF commands that are supported specifically for SNETMON and the ServerNet cluster (SCL) subsystem.
  Section 9, SCF Commands for the External ServerNet SAN Manager Subsystem: Describes the SCF commands that are supported specifically for the external system area network manager (SANMAN) process (SMN) subsystem.
  Section 10, SCF Error Messages: Describes the error messages generated by SCF and provides the cause, effect, and recovery information for the SCF error messages specific to the ServerNet cluster (SCL) subsystem and the SANMAN (SMN) subsystem.

Appendixes
  Appendix A, Part Numbers: Directs you to NTL for the list of part numbers.
  Appendix B, Blank Planning Forms: Contains blank copies of planning forms.
  Appendix C, ESD Information: Provides guidelines for preventing damage caused by electrostatic discharge (ESD) when working with electronic components.
  Appendix D, Service Categories for Hardware Components: Lists the service categories for ServerNet cluster hardware components.
  Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON: Provides an example of an HP Tandem Advanced Command Language (TACL) macro for configuring ServerNet cluster generic processes.
  Appendix F, Common System Operations: Contains procedures for common operations used to monitor a ServerNet cluster. Procedures include logging on to TSM client applications, starting a TACL prompt, and using the TSM EMS Event Viewer Application.
  Appendix G, Fiber-Optic Cable Information: Provides additional information about the fiber-optic cables that connect a cluster switch to a node or another cluster switch.
  Appendix H, Using OSM to Manage the Star Topologies: Describes some of the differences when using OSM instead of TSM to manage your star-topology ServerNet cluster on NonStop S-series systems, and directs you to other OSM documentation for additional information.
  Appendix I, SCF Changes at G06.21: Describes changes made to the SCF product module for SNETMON that might affect management of a cluster with one of the star topologies.

Where to Find More Information

Other ServerNet Cluster Manuals

This manual describes ServerNet clusters that contain ServerNet Cluster Switches (model 6770). For other ServerNet cluster information, see these documents:

• ServerNet Cluster 6770 Installation and Support Guide
• ServerNet Cluster 6780 Planning and Installation Guide
• ServerNet Cluster 6780 Operations Guide
• ServerNet Cluster Supplement for NS-Series Servers

Support and Service Library

These NTL Support and Service library categories provide procedures, part numbers, troubleshooting tips, and tools for servicing NonStop S-series and Integrity NonStop NS-series systems:

• Hardware Service and Maintenance Publications
• Service Information
• Service Procedures
• Tools and Download Files
• Troubleshooting Tips

Within these categories, where applicable, content might be further categorized according to server or enclosure type.

Authorized service providers can also order the NTL Support and Service Library CD:

• HP employees: Subscribe at World on a Workbench (WOW). Subscribers automatically receive CD updates. Access the WOW order form at http://hps.knowledgemanagement.hp.com/wow/order.asp.
• HP Authorized Service Providers and Channel Partners: Send an inquiry to [email protected].

Notation Conventions

Hypertext Links

Blue underline is used to indicate a hypertext link within text. By clicking a passage of text with a blue underline, you are taken to the location described. For example:

This requirement is described under Backup DAM Volumes and Physical Disk Drives on page 3-2.

General Syntax Notation

The following list summarizes the notation conventions for syntax presentation in this manual.

UPPERCASE LETTERS.
Uppercase letters indicate keywords and reserved words; enter these items exactly as shown. Items not enclosed in brackets are required. For example:

  MAXATTACH

lowercase italic letters. Lowercase italic letters indicate variable items that you supply. Items not enclosed in brackets are required. For example:

  file-name

[ ] Brackets. Brackets enclose optional syntax items. For example:

  TERM [\system-name.]$terminal-name
  INT[ERRUPTS]

A group of items enclosed in brackets is a list from which you can choose one item or none. The items in the list may be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines. For example:

  FC [ num  ]
     [ -num ]
     [ text ]
  K [ X | D ] address-1

{ } Braces. A group of items enclosed in braces is a list from which you are required to choose one item. The items in the list may be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines. For example:

  LISTOPENS PROCESS { $appl-mgr-name }
                    { $process-name  }
  ALLOWSU { ON | OFF }

| Vertical Line. A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces. For example:

  INSPECT { OFF | ON | SAVEABEND }

… Ellipsis. An ellipsis immediately following a pair of brackets or braces indicates that you can repeat the enclosed sequence of syntax items any number of times. For example:

  M address [ , new-value ]…
  [ - ] {0|1|2|3|4|5|6|7|8|9}…

An ellipsis immediately following a single syntax item indicates that you can repeat that syntax item any number of times. For example:

  "s-char…"

Punctuation. Parentheses, commas, semicolons, and other symbols not previously described must be entered as shown. For example:

  error := NEXTFILENAME ( file-name ) ;
  LISTOPENS SU $process-name.#su-name

Quotation marks around a symbol such as a bracket or brace indicate the symbol is a required character that you must enter as shown. For example:

  "[" repetition-constant-list "]"

Item Spacing. Spaces shown between items are required unless one of the items is a punctuation symbol such as a parenthesis or a comma. For example:

  CALL STEPMOM ( process-id ) ;

If there is no space between two items, spaces are not permitted. In the following example, there are no spaces permitted between the period and any other items:

  $process-name.#su-name

Line Spacing. If the syntax of a command is too long to fit on a single line, each continuation line is indented three spaces and is separated from the preceding line by a blank line. This spacing distinguishes items in a continuation line from items in a vertical list of selections. For example:

  ALTER [ / OUT file-spec / ] CONTROLLER

     [ , attribute-spec ]...

!i and !o. In procedure calls, the !i notation follows an input parameter (one that passes data to the called procedure); the !o notation follows an output parameter (one that returns data to the calling program). For example:

  CALL CHECKRESIZESEGMENT ( segment-id    !i
                          , error ) ;     !o

!i,o. In procedure calls, the !i,o notation follows an input/output parameter (one that both passes data to the called procedure and returns data to the calling program). For example:

  error := COMPRESSEDIT ( filenum ) ;     !i,o
!i:i. In procedure calls, the !i:i notation follows an input string parameter that has a corresponding parameter specifying the length of the string in bytes. For example:

  error := FILENAME_COMPARE_ ( filename1:length       !i:i
                             , filename2:length ) ;   !i:i

!o:i. In procedure calls, the !o:i notation follows an output buffer parameter that has a corresponding input parameter specifying the maximum length of the output buffer in bytes. For example:

  error := FILE_GETINFO_ ( filenum                  !i
                         , [ filename:maxlen ] ) ;  !o:i

Notation for Messages

The following list summarizes the notation conventions for the presentation of displayed messages in this manual.

Bold Text. Bold text in an example indicates user input entered at the terminal. For example:

  ENTER RUN CODE
  ?123
  CODE RECEIVED: 123.00

The user must press the Return key after typing the input.

Nonitalic text. Nonitalic letters, numbers, and punctuation indicate text that is displayed or returned exactly as shown. For example:

  Backup Up.

lowercase italic letters. Lowercase italic letters indicate variable items whose values are displayed or returned. For example:

  p-register
  process-name

[ ] Brackets. Brackets enclose items that are sometimes, but not always, displayed. For example:

  Event number = number [ Subject = first-subject-value ]

A group of items enclosed in brackets is a list of all possible items that can be displayed, of which one or none might actually be displayed. The items in the list might be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines. For example:

  proc-name trapped [ in SQL | in SQL file system ]

{ } Braces. A group of items enclosed in braces is a list of all possible items that can be displayed, of which one is actually displayed. The items in the list might be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines. For example:

  obj-type obj-name state changed to state, caused by
  { Object | Operator | Service }

  process-name State changed from old-objstate to objstate
  { Operator Request. }
  { Unknown.          }

| Vertical Line. A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces. For example:

  Transfer status: { OK | Failed }

% Percent Sign. A percent sign precedes a number that is not in decimal notation. The % notation precedes an octal number. The %B notation precedes a binary number. The %H notation precedes a hexadecimal number. For example:

  %005400
  %B101111
  %H2F
  P=%p-register E=%e-register

Notation for Management Programming Interfaces

The following list summarizes the notation conventions used in the boxed descriptions of programmatic commands, event messages, and error lists in this manual.

UPPERCASE LETTERS. Uppercase letters indicate names from definition files; enter these names exactly as shown. For example:

  ZCOM-TKN-SUBJ-SERV

lowercase letters. Words in lowercase letters are words that are part of the notation, including Data Definition Language (DDL) keywords. For example:

  token-type

!r. The !r notation following a token or field name indicates that the token or field is required. For example:

  ZCOM-TKN-OBJNAME token-type ZSPI-TYP-STRING.    !r
!o. The !o notation following a token or field name indicates that the token or field is optional. For example:

  ZSPI-TKN-MANAGER token-type ZSPI-TYP-FNAME32.    !o

Change Bar Notation

Change bars are used to indicate substantive differences between this edition of the manual and the preceding edition. Change bars are vertical rules placed in the right margin of changed portions of text, figures, tables, examples, and so on. Change bars highlight new or revised information. For example:

The message types specified in the REPORT clause are different in the COBOL85 environment and the Common Run-Time Environment (CRE). The CRE has many new message types and some new message type codes for old message types. In the CRE, the message type SYSTEM includes all messages except LOGICAL-CLOSE and LOGICAL-OPEN.

Part I. Introduction

This part contains only one section: Section 1, ServerNet Cluster Description.

1. ServerNet Cluster Description

This section introduces the ServerNet Cluster product. It describes the hardware and software components and the concepts that are essential to understanding the operation of a ServerNet cluster.

Note. This manual, along with the ServerNet Cluster 6770 Installation and Support Guide, describes ServerNet clusters that contain ServerNet Cluster Switches (model 6770). The ServerNet Cluster product also includes support for:

• NonStop NS-series servers
• 6780 ServerNet Cluster Switches
• A layered topology (for 6780 ServerNet Cluster Switches only)

For information about these variations of the ServerNet Cluster product, see these documents:

• ServerNet Cluster 6780 Planning and Installation Guide
• ServerNet Cluster 6780 Operations Guide
• ServerNet Cluster Supplement for NS-Series Servers

This section assumes that you are familiar with NonStop S-series servers, the ServerNet protocol, and networking fundamentals. If you are not familiar with these concepts, refer to the NonStop S-Series Planning and Configuration Guide.

This section contains the following subsections:

  The ServerNet Cluster Product 1-2
  ServerNet Cluster Terminology 1-12
  ServerNet Cluster Hardware Overview 1-18
  ServerNet Cluster Software Overview 1-33

The ServerNet Cluster Product

The ServerNet Cluster product is a new interconnection technology for NonStop S-series servers. This technology enables up to 24 servers to be connected in a group, or ServerNet cluster, that can pass information from one server to any other server in the cluster using the ServerNet protocol. Servers using either of the currently supported system topologies (Tetra 8 and Tetra 16) can participate in a cluster. ServerNet clusters extend the ServerNet X and Y fabrics outside the system boundary and allow the ServerNet protocol to be used for intersystem messaging.

A ServerNet cluster consists of individual servers, each containing internal ServerNet fabrics (X and Y). These servers are connected to other servers through fiber-optic cables and 6770 ServerNet Cluster Switches. The cables and ServerNet switches constitute the external ServerNet X and Y fabrics. ServerNet clusters allow multiple multiprocessor systems to work together and appear to client applications as one large processing entity.
Three Network Topologies Supported

The network topology refers to the shape of a LAN or WAN and how its components are organized. The ServerNet Cluster product, when used with 6770 ServerNet Cluster Switches, supports three topologies for a cluster: a star topology, a split-star topology, and a tri-star topology.

In all three topologies, each node has two independent connections (the X and Y fabrics) to two independent cluster switches. The loss of a node does not affect the other nodes in the cluster.

Figure 1-1 shows diagrams of the supported network topologies. The shaded areas indicate a star group within a topology.

Figure 1-1. ServerNet Cluster Topologies (Both Fabrics Shown)
[Diagram of the star topology (cluster switch X1/Y1), the split-star topology (X1/Y1 and X2/Y2), and the tri-star topology (X1/Y1, X2/Y2, and X3/Y3), with each star group shaded.]

Star Topology

The star topology, introduced with the G06.09 RVU, supports up to eight nodes and requires two cluster switches—one for the X fabric and one for the Y fabric. Because there is only one cluster switch per fabric, the position ID of each cluster switch is always 1. Consequently, the cluster switches are named X1 and Y1.

Note. You can configure a single system so that the software necessary for external ServerNet communication is running and ready to communicate with other nodes. However, if no other nodes are connected to the cluster switches, no communication occurs. In this case, communication with remote nodes occurs as soon as the nodes are configured and connected.

Figure 1-2 shows both fabrics of an eight-node ServerNet cluster connected in a star topology.

Figure 1-2. 8-Node ServerNet Cluster Using Star Topology
[Diagram of eight ServerNet nodes, each connected to cluster switch X1 (external ServerNet X fabric) and cluster switch Y1 (external ServerNet Y fabric) by switch-to-node links of 80 meters maximum.]

Split-Star Topology

The split-star topology, introduced with the G06.12 RVU, supports from 2 to 16 nodes and uses up to four cluster switches—two for the X fabric and two for the Y fabric. The two cluster switches on each fabric can have a position ID of either 1 or 2. Consequently, the cluster switches on the X fabric are named X1 and X2, and the cluster switches on the Y fabric are named Y1 and Y2. The first cluster switch on a fabric (X1 or Y1) supports ServerNet nodes 1 through 8. The second cluster switch on the same fabric (X2 or Y2) supports ServerNet nodes 9 through 16.

Figure 1-3 shows both fabrics of a 16-node ServerNet cluster connected in a split-star topology.
Figure 1-3. 16-Node ServerNet Cluster Using Split-Star Topology
[Diagram of 16 ServerNet nodes. Each external fabric consists of two cluster switches (X1 and X2, or Y1 and Y2), switch-to-node links of 80 meters maximum, and a four-lane link between the two switches.*]

* The maximum length for the four-lane links depends on the level of clustering software and type of processor. For details, see Table 2-4, Cable Length Requirements for Multilane Links, on page 2-11.

Tri-Star Topology

The tri-star topology, introduced with the G06.14 RVU, supports from 2 to 24 nodes and uses up to six cluster switches—three for the X fabric and three for the Y fabric. The three cluster switches on each fabric can have a position ID of 1, 2, or 3. Consequently, the cluster switches on the X fabric are named X1, X2, and X3 and the cluster switches on the Y fabric are named Y1, Y2, and Y3. The first cluster switch on a fabric (X1 or Y1) supports ServerNet nodes 1 through 8, the second cluster switch on the same fabric (X2 or Y2) supports ServerNet nodes 9 through 16, and the third cluster switch on the same fabric (X3 or Y3) supports ServerNet nodes 17 through 24.

Figure 1-4 shows both fabrics of a 24-node ServerNet cluster connected in a tri-star topology.

Subsets of a Topology

You can build your cluster as a subset of either the split-star or the tri-star topology. For example, even though a tri-star topology can contain three cluster switches per fabric, you can build a tri-star topology that uses only one or two switches per fabric. The advantage of a topology subset is that it allows you to grow the cluster quickly online as your applications grow. Table 1-1 lists the valid subsets for each topology, as also illustrated in the sketch that follows the figure below. A subset of a topology must meet all the requirements for the topology shown in Table 2-8 on page 2-23.

Table 1-1. Topology Subsets

Topology    Full Topology   Cluster Switches   Using cluster              Provides ServerNet
            or Subset       Per Fabric         switches . . .             node numbers . . .
Star        Full            1                  X1/Y1                      1 through 8
Split-star  Full            2                  X1/Y1 and X2/Y2            1 through 16
            Subset          1                  X1/Y1                      1 through 8
            Subset          1                  X2/Y2                      9 through 16
Tri-star    Full            3                  X1/Y1, X2/Y2, and X3/Y3    1 through 24
            Subset          2                  X1/Y1 and X2/Y2            1 through 16
            Subset          2                  X1/Y1 and X3/Y3            1 through 8 and 17 through 24
            Subset          2                  X2/Y2 and X3/Y3            9 through 24
            Subset          1                  X1/Y1                      1 through 8
            Subset          1                  X2/Y2                      9 through 16
            Subset          1                  X3/Y3                      17 through 24

Figure 1-4. 24-Node ServerNet Cluster Using Tri-Star Topology
[Diagram of 24 ServerNet nodes. Each external fabric consists of up to three cluster switches (X1, X2, and X3, or Y1, Y2, and Y3), switch-to-node links of 80 meters maximum, and two-lane links between the switches.*]

* The maximum length for the two-lane links depends on the level of clustering software. For details, see Table 2-4, Cable Length Requirements for Multilane Links, on page 2-11.
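The subset rules in Table 1-1 reduce to a fixed mapping between cluster switch position IDs and blocks of ServerNet node numbers. The following Python sketch is illustrative only; the function and its names are ours, not part of the ServerNet Cluster software, and it simply restates the Table 1-1 mapping:

  # Illustrative sketch of Table 1-1 (not HP software): which ServerNet
  # node numbers a full topology or subset provides, given the cluster
  # switch positions installed on each fabric.

  SWITCH_NODE_RANGES = {
      1: range(1, 9),    # X1/Y1 provides ServerNet nodes 1 through 8
      2: range(9, 17),   # X2/Y2 provides nodes 9 through 16
      3: range(17, 25),  # X3/Y3 provides nodes 17 through 24
  }

  MAX_SWITCHES_PER_FABRIC = {"star": 1, "split-star": 2, "tri-star": 3}

  def nodes_provided(topology, switch_positions):
      """Return the ServerNet node numbers available when the given
      switch positions (per fabric) are installed for this topology."""
      limit = MAX_SWITCHES_PER_FABRIC[topology]
      if not switch_positions or max(switch_positions) > limit:
          raise ValueError(f"{topology} supports switch positions 1 to {limit}")
      return [n for p in sorted(switch_positions) for n in SWITCH_NODE_RANGES[p]]

  # A tri-star subset using only X2/Y2 and X3/Y3 provides nodes 9 through 24:
  print(nodes_provided("tri-star", {2, 3}))

The last line prints the node numbers 9 through 24, matching the corresponding tri-star subset row in Table 1-1.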
Hardware and Software Components for Clustering

For each topology, Table 1-2 lists key hardware components required to construct a ServerNet cluster.

Table 1-2. Hardware Components for Clustering

Hardware Component                               Star      Split-Star   Tri-Star
NonStop S-series servers                         1 to 8    1 to 16      1 to 24
Cluster switches                                 2         4            6
Modular ServerNet expansion boards (MSEBs)       2 to 16   2 to 32      2 to 48
NNA PICs installed in MSEBs                      2 to 16   2 to 32      2 to 48
Single-wide PICs installed in cluster switches   16        32           48
Double-wide PICs installed in cluster switches   None*     8            12
Fiber-optic cables (node to cluster switch)      2 to 16   2 to 32      2 to 48
Fiber-optic cables (between cluster switches)    NA        8            12

Notes:
• Each server can have 2 to 16 processors.
• Cluster switches: one per external fabric for the star topology, two per fabric for the split-star topology, and three per fabric for the tri-star topology. Subsets of the split-star and tri-star topologies use fewer cluster switches.
• Two MSEBs and two NNA PICs are required for each node (one per external fabric).
• Eight single-wide, single-mode fiber-optic PICs are included in each cluster switch. These PICs are not replaceable.
• Two double-wide, single-mode fiber-optic PICs are included in each cluster switch; each double-wide PIC has two ports. These PICs are not replaceable. *In a star topology, these PICs are included in each cluster switch for future use.
• Two single-mode fiber-optic cables are required per node (one per external fabric).
• Cables between cluster switches: 8 for the split-star topology (4 for each fabric) and 12 for the tri-star topology (6 for each fabric). See Fiber-Optic Cables for Multilane Links on page 2-10.

For more information about hardware, see ServerNet Cluster Hardware Overview on page 1-18.

The ServerNet Cluster product relies on many software components, described in the ServerNet Cluster Software Overview on page 1-33.
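Table 1-2 also implies a simple tally of parts as a cluster grows: two MSEBs, two NNA PICs, and two node-to-switch fiber-optic cables per node (one per external fabric), plus the per-topology counts of cluster switches and multilane-link cables. The sketch below is our own illustration of that arithmetic, not an HP planning tool:

  # Illustrative tally of Table 1-2 quantities for a full topology
  # (assumption: this helper is ours, not part of the product).
  FABRIC_COUNTS = {
      # (cluster switches per fabric, multilane-link cables per fabric)
      "star":       (1, 0),
      "split-star": (2, 4),   # one four-lane link: 4 cables per fabric
      "tri-star":   (3, 6),   # three two-lane links: 6 cables per fabric
  }

  def hardware_for(topology, nodes):
      switches, link_cables = FABRIC_COUNTS[topology]
      return {
          "cluster switches": 2 * switches,       # X and Y fabrics
          "MSEBs": 2 * nodes,                     # one per fabric per node
          "NNA PICs": 2 * nodes,
          "node-to-switch cables": 2 * nodes,
          "multilane-link cables": 2 * link_cables,
      }

  print(hardware_for("split-star", 10))
  # 4 cluster switches, 20 MSEBs, 20 NNA PICs, 20 node cables, 8 link cables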
Coexistence With Expand-Based Networking Products

Nodes in a ServerNet cluster coexist as systems belonging to an Expand network. The ServerNet cluster product introduces a new line type for Expand: the Expand-over-ServerNet line. ServerNet clustering is compatible with all other Expand-based products. For example, a node in a FOX ring can simultaneously function as a node in a ServerNet cluster.

Logical coexistence of the ServerNet cluster subsystem and the ServerNet/FX adapter subsystem is made possible by using different line-handler processes (Expand-over-ServerNet versus Expand-over-FOX) to handle incoming and outgoing Expand messages and packets. Physical coexistence is made possible by using different hardware components to route the packets:

  FOX nodes       Use ServerNet/FX adapters and fiber-optic cable for routing.
  ServerNet nodes Use MSEBs and cluster switches for routing.

Figure 1-5 shows an example of a ServerNet cluster coexisting within a FOX ring. This arrangement enables communication from NonStop:

• K-series to K-series
• K-series to S-series
• S-series to K-series
• S-series to S-series

ServerNet clusters can also coexist with ATM, Ethernet, Fast Ethernet, Token Ring, and other HP WAN and LAN products. For more information, see the Expand Configuration and Management Manual.

Note. ServerNet cluster technology permits a wide variety of network topologies to be constructed. To find out which topologies have been tested and approved, contact your HP representative.

Figure 1-5. ServerNet Cluster Coexistence With a FOX Ring
[Diagram of a FOX ring containing K-series and S-series nodes, with the S-series nodes also connected to the external ServerNet fabrics.]

Benefits of Clustering

Clustering has multiple benefits. ServerNet clusters can improve:

• Performance. For interprocessor communication, ServerNet clusters take advantage of the NonStop Kernel message system for low message latencies, low message processor costs, and high message throughput. The same message system is used for interprocessor communication within the node and between nodes. Furthermore, ServerNet clusters take advantage of the faster transmission speed of ServerNet II for increased message throughput between nodes.
• Availability. ServerNet clusters make it easier to repair or bypass problems by diverting the load to other nodes.
• Manageability. ServerNet clusters allow you to share resources easily. The ServerNet cluster quick-disconnect capability makes it easier to implement planned outages.
• Scalability. ServerNet clusters provide an avenue for growth as your network load increases.

ServerNet Cluster Terminology

This subsection introduces terms that are essential to understanding a ServerNet cluster:

• Cluster on page 1-12
• Star Group on page 1-12
• Node on page 1-13
• Node Number on page 1-13
• X and Y Fabrics on page 1-16
• Internal ServerNet Fabrics on page 1-16
• External ServerNet Fabrics on page 1-17

Cluster

A cluster is a collection of servers, or nodes, that can function either independently or collectively as a processing unit. This use of “cluster” differs from the definition of a cluster in a FOX ring. In a FOX ring, a cluster is synonymous with “server” and refers to a collection of processors and I/O devices rather than a collection of servers.

Star Group

A star group consists of an X and a Y cluster switch and up to eight connected ServerNet nodes. This term is useful in describing the segments of a split-star or tri-star topology. A split-star topology consists of two star groups connected by four-lane links. A tri-star topology consists of three star groups connected by two-lane links.

Node

When a system joins a network, the system becomes a network node. A node is a uniquely identified computer system connected to one or more other computer systems. Each system in an Expand network is an Expand node. Each system in a ServerNet cluster is a ServerNet node. In general, a ServerNet node can be any model of server that supports ServerNet fabrics. To determine if your server can be part of a ServerNet cluster, refer to the documentation for your server. Because ServerNet clusters use Expand, a ServerNet node is also an Expand node.

Node Number

NonStop servers use several types of node numbers. A node number identifies a member system in a network. The node number is unique for each system in the network.

A ServerNet node number identifies a member system in a ServerNet cluster. The ServerNet node number is a simplified expression of the six-bit node routing ID that determines the node to which a ServerNet packet is routed. The ServerNet node number, which can be viewed using the OSM Service Connection, TSM Service Application, or SCF, is unique for each node in a ServerNet cluster.
ServerNet Node-Number Assignments

The following table shows the ServerNet node numbers supported by each network topology:

This Topology...   Supports ServerNet Node Numbers...
Star               1 through 8
Split-star         1 through 16
Tri-star           1 through 24

ServerNet node numbers are assigned automatically, depending on the port to which a node is connected on the ServerNet II Switch. (The ServerNet II Switch is the main component of a cluster switch.) The cluster switches support the following ServerNet nodes:

Cluster Switches...   Support ServerNet Nodes...
X1 and Y1             1 through 8
X2 and Y2             9 through 16
X3 and Y3             17 through 24

Figure 1-6 shows the ServerNet node-number assignments in a split-star topology.

Figure 1-6. ServerNet Node Numbers in a Split-Star Topology (One Fabric Shown)

(In the figure, ServerNet nodes 1 through 8 connect to ServerNet II Switch ports 0 through 7 of the X1 or Y1 cluster switch, and ServerNet nodes 9 through 16 connect to ports 0 through 7 of the X2 or Y2 cluster switch.)

Figure 1-7 shows the ServerNet node-number assignments in a tri-star topology.

Figure 1-7. ServerNet Node Numbers in a Tri-Star Topology (One Fabric Shown)

(In the figure, ServerNet nodes 1 through 8, 9 through 16, and 17 through 24 connect to ServerNet II Switch ports 0 through 7 of the X1/Y1, X2/Y2, and X3/Y3 cluster switches, respectively.)

Expand Node Number

An Expand node number, sometimes called a “system number,” is a number that identifies a system in an Expand network. A ServerNet node has both a ServerNet node number and an Expand node number.

X and Y Fabrics

A collection of connected routers and ServerNet links is called a fabric. Two identically configured fabrics, referred to as the X fabric and the Y fabric, together provide a fault-tolerant interconnection for the server. Communication occurs on both fabrics simultaneously for increased throughput. Each processor connects to both fabrics.

The X fabric and the Y fabric are not connected to each other; therefore, a ServerNet packet cannot cross from one fabric to the other, and a failure in one fabric does not affect the other fabric. This fault tolerance at the hardware level enhances availability.

Internal ServerNet Fabrics

An internal ServerNet X or Y fabric delivers ServerNet packets between ServerNet end devices within a server. The internal ServerNet X and Y fabrics are identical. ServerNet messages are routed across the internal fabrics using server hardware such as ServerNet expansion boards (SEBs) and modular ServerNet expansion boards (MSEBs). Figure 1-8 shows a simplified logical diagram of the internal ServerNet X and Y fabrics.
Figure 1-8. Simplified Logical Diagram Showing Internal X and Y Fabrics

(The figure shows two processors, each with an X port and a Y port, connected through the internal X and Y fabrics to ServerNet adapters and disks.)

External ServerNet Fabrics

An external ServerNet X or Y fabric delivers ServerNet packets between servers that are nodes in a ServerNet cluster. See Figure 1-2 on page 1-4 and Figure 1-3 on page 1-6. External fabrics provide intersystem communication.

The external ServerNet fabrics are an extension of the internal ServerNet fabrics beyond the single-system boundary. Every ServerNet cluster has identical external X and Y fabrics. Messages are routed to the external fabrics using a special plug-in card (PIC) installed in the MSEBs in slots 51 and 52 of group 01 of each node.

The external system area network manager processes (SANMANs) in each node of a ServerNet cluster manage the external ServerNet fabrics. Every SANMAN process has equal rights to manage the external fabrics.

ServerNet Cluster Hardware Overview

ServerNet clusters use the following hardware components:

• Routers
• Service processors (SPs)
• Modular ServerNet expansion boards (MSEBs)*
• Plug-in cards (PICs)*
• Node-numbering agent (NNA) FPGAs
• Cluster switches*
• ServerNet cables*

*For service categories, refer to Appendix D, Service Categories for Hardware Components.

Routers

In a NonStop S-series server, routers route messages across ServerNet links. The following customer-replaceable units (CRUs) contain routers:

• Processor multifunction (PMF) CRUs and PMF2 CRUs
• I/O multifunction (IOMF) and IOMF 2 CRUs
• ServerNet expansion boards (SEBs)
• Modular ServerNet expansion boards (MSEBs)
• ServerNet II Switches (the main component of the cluster switch)

The MSEB and ServerNet II Switch use router-2 technology. The router-2 is a proprietary application-specific integrated circuit (ASIC) that provides wormhole routing of ServerNet packets between 12 input ports and 12 output ports. The ASIC contains a 12-way crossbar switch that receives data packets, checks them for errors, interprets their destination addresses, and then routes the packets out one of its 12 output ports.

The router-2 ASIC supports the ServerNet II protocol. Because the external ServerNet fabrics contain MSEBs and ServerNet II Switches that use the router-2, ServerNet clusters can take advantage of the faster speeds and larger packet sizes of ServerNet II.

Service Processors (SPs)

Service processors (SPs) are components of the processor multifunction (PMF), PMF2, I/O multifunction (IOMF), and IOMF 2 CRUs. SPs assign ServerNet IDs to processors and I/O devices within a NonStop S-series server. They also configure the SEBs and MSEBs to route packets between these devices. For ServerNet clusters, the SPs:

• Assign ServerNet IDs for the processors in a server in a regular fashion, such that a client can request a ServerNet ID for a given processor in a given (logical) node from any master service processor (MSP) in any node in the ServerNet cluster.

• Program the SEBs and/or MSEBs, as well as any other routers within a system, to route packets to the MSEBs that connect the system to the external ServerNet fabrics.
  Such routing occurs whenever these packets are addressed to a ServerNet ID that does not lie within the range of ServerNet IDs for the current system.

• Program the MSEBs in slots 51 and 52 of group 01 to route ServerNet packets to the external fabrics if the packets are destined for a different node.

• Control the configuration of the plug-in card (PIC) that provides the ServerNet connection between the system and the external ServerNet fabrics. This is the single-mode fiber PIC that has the node-numbering agent (NNA) field-programmable gate array (FPGA).

• Provide events to an external client, such as the OSM or TSM server software, to indicate problems encountered while configuring a system to become a member of a ServerNet cluster.

Modular ServerNet Expansion Boards (MSEBs)

A modular ServerNet expansion board (MSEB) is similar to the ServerNet expansion board (SEB) currently in use in NonStop S-series servers. MSEBs provide connections for ServerNet cables that link one system enclosure to another. However, unlike SEBs, MSEBs use plug-in cards (PICs) to provide a choice of cable media (ECL, optical fiber, or serial copper) for routing ServerNet packets.

Figure 1-9 shows an SEB and an MSEB.

Figure 1-9. ServerNet Expansion Board (SEB) and Modular ServerNet Expansion Board (MSEB)

(The figure shows the fault, power-on, and link-alive LEDs on each board and the PIC positions on the MSEB, including the NNA fiber-optic PIC in port 6.)

An MSEB connects either to the X fabric or to the Y fabric, but never to both. MSEBs allow the connection of fiber-optic cables from the server to the cluster switches. At least two MSEBs must be installed in each node of a ServerNet cluster. The MSEBs must be installed in the group 01 enclosure in slot 51 (for the X fabric) and slot 52 (for the Y fabric). MSEBs do not need to be installed in other enclosures.

These MSEBs route packets out of each server and onto the external ServerNet fabrics. All packets destined for other nodes travel through port 6 of these MSEBs. Figure 1-10 shows MSEBs installed in slots 51 and 52 of a NonStop Sxx000 server.

Figure 1-10. Service Side Slots for NonStop Sxx000 Processor Enclosure

(The figure shows slots 50 through 56 of the group 01 enclosure, with MSEBs in slot 51 (X fabric) and slot 52 (Y fabric) connected by fiber-optic cables to the cluster switches.)

For information about the use of MSEBs for ServerNet connections within a node, refer to the NonStop S-Series Hardware Installation and FastPath Guide.

Plug-In Cards

A plug-in card (PIC) allows an MSEB or a ServerNet II Switch to support a variety of ServerNet cable media. The MSEB chassis can hold six single-wide PICs. A single-wide PIC has one connector.

The ServerNet II Switch can accommodate eight single-wide PICs and two double-wide PICs. Double-wide PICs have two connectors and are installed in each ServerNet II Switch in ports 8 through 11.

Table 1-3 describes the PICs used for ServerNet cluster installations.
Other types of PICs, such as multimode fiber PICs and serial copper PICs, are available for connecting ServerNet cables for the internal fabrics.

Table 1-3. PICs Used for ServerNet Cluster Installations

ECL
  Where installed: MSEBs
  Function: Allows first-generation ServerNet cables to be connected to MSEBs.

Single-mode fiber
  Where installed: ServerNet II Switches
  Function: Allows fiber-optic cable to be connected from the MSEBs in group 01 of each node to the cluster switches.

Double-wide fiber
  Where installed: ServerNet II Switches
  Function: Allows two ServerNet II Switches to be connected together on the same fabric to support a split-star topology.

Single-mode fiber with NNA FPGA
  Where installed: Group 01 MSEBs, slots 51 and 52, port 6
  Function: Allows fiber-optic cable to be connected from an MSEB in group 01 of each node to the cluster switch and carries out the node-numbering function for the cluster.

PICs are Class 3 customer-replaceable units (CRUs). However, they are replaceable when installed in MSEBs but not when installed in ServerNet II Switches. For more information about service categories for ServerNet cluster hardware components, refer to Appendix D, Service Categories for Hardware Components.

Node-Numbering Agent (NNA) FPGA

The node-numbering agent is a field-programmable gate array (FPGA) contained within one of the single-mode, fiber-optic PICs. The NNA FPGA converts the ServerNet IDs of packets routed to the external ServerNet fabrics so that they can be directed to another server. An NNA FPGA can be installed only in port 6 of the MSEBs in slots 51 and 52 of group 01. The NNA FPGA translates the ServerNet source and destination IDs for all packets going to or coming from the external ServerNet fabrics.

Figure 1-11 shows an NNA PIC and an ECL PIC.

Figure 1-11. NNA and ECL PICs

External Routing and the NNA

To understand the role of the NNA FPGA in a ServerNet cluster, you must understand ServerNet packets and ServerNet IDs. ServerNet packets are the unit of transmission in a ServerNet network. A ServerNet packet consists of a header, a variable-size data field, and a 32-bit cyclic redundancy check (CRC) checksum covering the entire packet. Figure 1-12 shows a detail of the ServerNet ID portion of the header.

The header contains fields for control information, the destination ServerNet ID, and the source ServerNet ID, which identify the processor or I/O device transmitting and receiving the packet. Each packet has a destination ServerNet ID (DID) and a source ServerNet ID (SID). (Recall that the node number is derived from the first six bits of the 20-bit ServerNet ID.)

The SID for ServerNet packets within a server is the same for all NonStop S-series servers. Therefore, packets that must be sent to another node must be modified. For outbound packets, the SID must be replaced with an SID for external routing. For inbound packets, the DID must be replaced with a DID for internal routing.

The NNA FPGA controls ServerNet addressing on the external ServerNet fabrics. The FPGA contains circuitry that modifies the SID or the DID of each packet so that the packets are routed correctly between nodes.

Note. The NNA FPGA modifies only one of the ID fields (DID or SID) and the CRC checksum of each packet. The ServerNet address and data payload fields are not modified.
Figure 1-12. ServerNet Packet and ServerNet ID

(A ServerNet packet consists of an 8-byte header (H), a 4-byte ServerNet address (A), 0 to 64 bytes of data (D), and a 4-byte cyclic redundancy check (CRC) code (C). The header carries the 20-bit destination ServerNet ID (DID) and the 20-bit source ServerNet ID (SID); bits 0 through 5 of a ServerNet ID form the ServerNet node routing ID. The remaining header fields are a 4-bit transaction type, an 8-bit data length, a 4-bit transaction ID, a 1-bit primary/alternate path indicator, a 1-bit acknowledge/negative acknowledge indicator, a 1-bit request/response indicator, and 5 reserved bits.)

Modification of the ServerNet IDs

Figure 1-13 illustrates how the node number for a ServerNet packet is modified as the packet moves from one node to another in a cluster:

1. ServerNet packets from \A destined for \B leave the local node through the single-mode, fiber-optic PIC installed in port 6 of the MSEB in group 01. As the packets pass through the PIC, the node-numbering agent (NNA) FPGA modifies the SID of each packet, replacing the default internal-routing SID with the external-routing SID.

2. The packets travel along the fiber-optic cable until they reach the ServerNet II Switch. (The ServerNet II Switch is the main component of the cluster switch.) Because the PICs installed in the ServerNet II Switch do not contain an NNA FPGA, the packets pass through these PICs unmodified and are routed to their destination ID by the router-2 ASIC within the ServerNet II Switch.

3. When they reach the single-mode, fiber-optic PIC in the MSEB at \B, the ServerNet packets are modified again. The NNA FPGA modifies the DID of each packet, replacing the external-routing DID with the default internal-routing DID.

Figure 1-13. ServerNet Addressing Scheme for External Fabrics

(In the figure, node \A uses internal-routing node number 0 and external-routing node number 2, and node \B uses internal-routing node number 0 and external-routing node number 4. The NNA at \A modifies the SID of each outbound packet, and the NNA at \B modifies the DID of each inbound packet.)

Cluster Switch

The cluster switch is an assembly consisting of the following components:

• ServerNet II Switch
• Uninterruptible power supply (UPS)
• AC transfer switch

Depending on the type of topology, a ServerNet cluster uses from two to six cluster switches. Clusters with a star topology use one cluster switch per external fabric for a total of two cluster switches. Clusters with a split-star topology use two cluster switches per external fabric for a total of four cluster switches. Clusters with a tri-star topology use three cluster switches per external fabric for a total of six cluster switches. For details on the topologies, see Three Network Topologies Supported.

Packaging for Cluster Switches

The cluster switch can be packaged in a switch enclosure that is half the height of the standard NonStop S-series enclosure. Figure 1-14 shows the switch enclosure. The switch enclosure can be installed on top of a NonStop S-series system enclosure, but not on top of stacked system enclosures. For details, see Figure 2-8, Placement of Cluster Switches.
Figure 1-14. Cluster Switch Enclosure

The cluster switch can also be packaged in a 19-inch rack that is 24 to 26 inches (61 to 66 cm) deep.

ServerNet II Switch

The ServerNet II Switch is the main component of the cluster switch. The ServerNet II Switch is a 12-port network switch used in ServerNet networks. In a ServerNet cluster, ports 0 through 7 provide the physical junction points that enable the nodes to connect to the cluster switches. Ports 8 through 11 are used for connections between cluster switches on the same fabric in a split-star topology or tri-star topology.

Figure 1-15 shows the ServerNet II Switch in a switch enclosure with the door removed.

Note. The ServerNet I Switch product is not supported for use in a ServerNet cluster.

Figure 1-15. Cluster Switch Components

(The figure shows the ServerNet II Switch, the AC transfer switch, and the uninterruptible power supply (UPS) in the switch enclosure.)

For detailed information about the ServerNet II Switch, refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide.

Single-Wide and Double-Wide PICs

Like the MSEB, the ServerNet II Switch uses plug-in cards (PICs) to allow for a variety of ServerNet cable media. However, only single-mode fiber-optic PICs are currently supported for use in a ServerNet II Switch. Two sizes of PICs are used in the ServerNet II Switch: single-wide and double-wide PICs.

Ports          Use
0 through 7    Single-wide fiber-optic PICs
8 through 11   Double-wide fiber-optic PICs

Single-wide fiber-optic PICs connect each node to a cluster switch. Double-wide fiber-optic PICs connect cluster switches on the same fabric in the split-star and tri-star topologies.

Uninterruptible Power Supply (UPS)

Within a cluster switch, the uninterruptible power supply (UPS), AC transfer switch, and ServerNet II Switch power supply form the cluster switch power subsystem. The UPS provides backup power (about two hours) to the ServerNet II Switch during a loss of AC line current. Figure 1-15 shows the UPS in a switch enclosure. Figure 1-16 shows a block diagram of the cluster switch.

Figure 1-16. Cluster Switch Block Diagram

(The block diagram shows the ServerNet ports, the ServerNet II Switch and its AC/DC power supply, the switch monitor port (9-pin), the proprietary UPS, the power monitor cables, and the AC transfer switch with its primary and secondary AC inputs.)

AC Transfer Switch

The AC transfer switch allows the switch and UPS to draw power from either of two AC line cords. Power is drawn from the primary rail as long as power is available. If the primary rail fails, the load is connected to the secondary rail. A power-management cable connects the UPS and the AC transfer switch to the maintenance port of the ServerNet II Switch. Figure 1-15 shows the AC transfer switch in a cluster switch enclosure.

Note. Relay scrubbing is not supported for the AC transfer switch. For most installations, the UPS provides backup power for enough time for you either to replace or bypass the AC transfer switch in the event of a failure.

ServerNet Cables for Each Node

ServerNet cables provide ServerNet links between routing devices. The cables are available in 10-meter, 40-meter, and 80-meter lengths. An 80-meter plenum-rated cable is also available.
Fiber-optic ServerNet cables must meet the specifications shown in Table 2-5 on page 2-12.

In a ServerNet cluster, single-mode, fiber-optic cables link each node to a cluster switch. Two ServerNet cables are needed for each node. One cable connects the MSEB in slot 51 of group 01 to the cluster switch serving the X fabric. The other cable connects the MSEB in slot 52 of group 01 to the cluster switch serving the Y fabric.

Connections Between Cluster Switches

ServerNet clusters using either the split-star topology or the tri-star topology require additional fiber-optic ServerNet cables to connect the cluster switches on each fabric. The connections between two cluster switches are called a multilane link. The split-star topology and the tri-star topology use different types of multilane links.

Four-Lane Link Used In the Split-Star Topology

In the split-star topology, the two cluster switches on each fabric are connected by a four-lane link consisting of four fiber-optic cables. Ports 8 through 11 of the two cluster switches are used for the four-lane link. Traffic travels in both directions across a four-lane link.

Note. All four fiber-optic cables are required for a four-lane link, regardless of the number of nodes in the split-star topology.

Cluster switches X1 and Y1 send traffic from ServerNet nodes 1 through 8 across the four-lane link to cluster switches X2 and Y2 for routing to ServerNet nodes 9 through 16. Figure 1-17 shows how traffic is routed across the four-lane links.

Figure 1-17. Routing Across the Four-Lane Links

From any node connected to cluster switch X1 or Y1:
  Traffic to ServerNet nodes 9 and 10 uses port 8.
  Traffic to ServerNet nodes 11 and 12 uses port 9.
  Traffic to ServerNet nodes 13 and 14 uses port 10.
  Traffic to ServerNet nodes 15 and 16 uses port 11.

From any node connected to cluster switch X2 or Y2:
  Traffic to ServerNet nodes 1 and 2 uses port 8.
  Traffic to ServerNet nodes 3 and 4 uses port 9.
  Traffic to ServerNet nodes 5 and 6 uses port 10.
  Traffic to ServerNet nodes 7 and 8 uses port 11.

If one lane of the four-lane link is down, traffic for up to four nodes is affected on one of the fabrics. For example, if port 8 is down at either end, nodes 1 and 2 cannot communicate with ServerNet nodes 9 through 16 on the affected fabric, and nodes 9 and 10 cannot communicate with ServerNet nodes 1 through 8 on the affected fabric. To avoid this problem, split-star topologies using G06.14 software (or G06.13 and the release 3 SPRs listed in Table 3-5) support automatic fail-over of ServerNet traffic on the four-lane links. For more information, refer to Section 7, Troubleshooting and Replacement Procedures.

Two-Lane Links Used In the Tri-Star Topology

In the tri-star topology, the three cluster switches on each fabric are connected by three two-lane links consisting of six fiber-optic cables. Ports 8 through 11 of the cluster switches are used for the two-lane links. Traffic travels in both directions across the two-lane links. Figure 1-18 shows how traffic is routed across the two-lane links.

The tri-star topology features automatic fail-over of ServerNet traffic on the two-lane links. As long as at least one link between cluster switches is healthy, processor paths between ServerNet nodes can remain up on both fabrics. For more information, refer to Section 7, Troubleshooting and Replacement Procedures.
Figure 1-18. Routing Across the Two-Lane Links

From any node connected to cluster switch X1 or Y1:
  Traffic to ServerNet nodes 9 through 12 uses link L1.
  Traffic to ServerNet nodes 13 through 16 uses link L2.
  Traffic to ServerNet nodes 17 through 20 uses link L5.
  Traffic to ServerNet nodes 21 through 24 uses link L6.

From any node connected to cluster switch X2 or Y2:
  Traffic to ServerNet nodes 1 through 4 uses link L1.
  Traffic to ServerNet nodes 5 through 8 uses link L2.
  Traffic to ServerNet nodes 17 through 20 uses link L3.
  Traffic to ServerNet nodes 21 through 24 uses link L4.

From any node connected to cluster switch X3 or Y3:
  Traffic to ServerNet nodes 1 through 4 uses link L5.
  Traffic to ServerNet nodes 5 through 8 uses link L6.
  Traffic to ServerNet nodes 9 through 12 uses link L3.
  Traffic to ServerNet nodes 13 through 16 uses link L4.

ServerNet Cluster Software Overview

ServerNet clusters use the following software components:

• SNETMON and the ServerNet cluster subsystem
• MSGMON
• NonStop Kernel message system
• SANMAN
• Expand
• OSM or TSM software
• SCF

SNETMON and the ServerNet Cluster Subsystem

SNETMON is the Subsystem Programmatic Interface (SPI) server for ServerNet cluster subsystem-management commands. It monitors and responds to events relevant to ServerNet cluster operations, and it manages the ServerNet cluster subsystem.

SNETMON is a fault-tolerant process pair. An SNETMON process pair exists on every node in a functioning ServerNet cluster. If the SNETMON process fails entirely, a new SNETMON is spawned.

SNETMON Summary

The following list summarizes information about SNETMON:

Process Description: ServerNet cluster monitor process
Abbreviation: SNETMON
Generic Process Name (Recommended): $ZZKRN.#ZZSCL
Process Pair Name: $ZZSCL
Product Number: T0294
Program File Name: $SYSTEM.SYSnn.SNETMON

SNETMON Functions

SNETMON has multiple functions:

• Manages the state of the ServerNet cluster subsystem
• Detects the presence of other systems in the ServerNet cluster and establishes the ServerNet message-system connections
• Supports the Expand-over-ServerNet line handlers with the Network Access Method (NAM) protocol
• Monitors and responds to events that affect the message-system connections, such as processor failures, reloads, and path events
• Receives path-event information from the individual processors in the system and translates this information into system-connection status information and EMS events
• Responds to queries from OSM or TSM client applications using the Subsystem Programmatic Interface (SPI) protocol
• Provides ServerNet status and statistics information to SNETMON clients
• Keeps its backup process up to date

SNETMON also maintains the ServerNet cluster subsystem state. State changes are logged to $ZLOG as event ZCOM-EVT-SUMSTATE-CHG. There is no aggregate state for the ServerNet cluster subsystem. The SNETMONs on each node have a peer-to-peer relationship with each other. Each SNETMON maintains the state of objects relevant to the local system and its connection to the ServerNet cluster. SNETMON maintains state information about:

• The ServerNet cluster subsystem on each member system.
• Each individual ServerNet path from a processor on the local system to a processor on a remote system.
• The Expand-over-ServerNet line-handler processes. SNETMON does not recognize the Expand-over-ServerNet line-handler state model, but it does recognize the identities (logical devices [LDEVs]) of the line handlers that bind to it and their process states.
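You can query this state information with the SNETMON SCF commands listed in Table 1-5. The following commands are a representative sketch; the exact display format depends on the RVU, and the full syntax and sample output appear in Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem:

  -> INFO SUBSYS $ZZSCL
  -> STATUS SUBSYS $ZZSCL

INFO SUBSYS shows the configured startup state of the ServerNet cluster subsystem, and STATUS SUBSYS shows its current logical state, one of the four states listed under The ServerNet Cluster Subsystem later in this subsection.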
Figure 1-19 is a logical diagram that shows the interaction of SNETMON with other ServerNet cluster software components.

Figure 1-19. ServerNet Cluster Logical Diagram

(The figure shows the SPI, NAM, and message-system relationships among SCF, the OSM or TSM server software, SANMAN, the service processors, MSGMON, SNETMON, the EMS collectors ($0 and $ZLOG), the Expand-over-ServerNet line-handler processes (one for each remote system), NCP, the network routing table (NRT), and the application file system.)

SNETMON Fault Tolerance

For fault tolerance, the SNETMON process pair can be configured to run on a list of processors. For example, in a system having eight processors, HP recommends that you configure SNETMON to run in the following CPU list: 2, 5, 6, 3, 7, 4. The persistence manager ($ZPM) starts the primary SNETMON process on the first processor in the list (processor 2), and the primary chooses the next processor in the list (processor 5) for the backup. A failure of the primary processor (processor 2) causes a takeover. Processor 5 then becomes the primary, and a new backup SNETMON is created in the next processor in the list (processor 6). In this way, SNETMON can survive a variety of outages.

Configuring SNETMON is described in more detail in Section 3, Installing and Configuring a ServerNet Cluster. As an alternative, you can use the ZPMCONF macro to automate the process of adding MSGMON, SANMAN, and SNETMON to the system-configuration database. For a description of the macro, refer to Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON.

You add the ServerNet cluster monitor process as a generic process using the Kernel subsystem SCF ADD PROCESS command, as shown in the sketch that follows this list. See the SCF Reference Manual for the Kernel Subsystem for complete command syntax. The ServerNet cluster monitor process must be configured:

• To be persistent (set the AUTORESTART attribute to a nonzero value).
• To have the process name $ZZSCL (set the NAME attribute to $ZZSCL). The recommended symbolic name is ZZSCL.
• So that the $ZPM persistence manager stops the process by sending it an internal system message (set the STOPMODE attribute to SYSMSG).
• To run under the SUPER.SUPER user ID (by default, the USERID attribute is set to the user ID of the current SCF session).
• With at least two processors (preferably more) in the CPU list (defined by using the CPU attribute).
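The following SCF command is a sketch of such a configuration, not the authoritative procedure (see Task 6 in Section 3). It assumes the recommended eight-processor CPU list shown above; the AUTORESTART value of 10 and the FIRSTOF form of the CPU attribute are illustrative assumptions, and SYSnn must be replaced with the current system subvolume:

  -> ADD PROCESS $ZZKRN.#ZZSCL, &
       AUTORESTART 10, &
       NAME $ZZSCL, &
       PROGRAM $SYSTEM.SYSnn.SNETMON, &
       CPU FIRSTOF (2,5,6,3,7,4), &
       STOPMODE SYSMSG, &
       USERID SUPER.SUPER

The ampersand (&) continues an SCF command on the next line. After the process is added, the $ZPM persistence manager starts it and restarts it as necessary.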
SNETMON Interaction With Expand

For Expand, SNETMON interacts with the line-handler processes using the NAM protocol to do the following:

• Enable Expand line-handler processes to synchronize with each other during startup
• Enable detection that the remote Expand line-handler process is down
• Inform the appropriate local line-handler process when the ServerNet cluster connection to a remote system goes down

The ServerNet Cluster Subsystem

The ServerNet cluster subsystem is the system configuration environment that enables ServerNet communication between processors in different nodes of a ServerNet cluster. The following list summarizes naming and numbering for the ServerNet cluster subsystem:

Full Name: ServerNet cluster subsystem
Subsystem Name: SCL
Subsystem Acronym: ZSCL
Subsystem Number: 218

The ServerNet cluster subsystem can have four logical states:

• STOPPED
• STARTING
• STARTED
• STOPPING

These states and their transitions are described in detail in Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem.

MSGMON

MSGMON is a monitor process that resides in each processor of a server and executes functions required by the message system. MSGMON is a helper for SNETMON. MSGMON handles communications between SNETMON and individual processors. MSGMON also logs events from, and generates events on behalf of, the message system. MSGMON was created to relieve the system monitor subsystem.

MSGMON is a persistent process. Once it is started, it terminates only in the event of an internal failure or a termination message from the persistence manager, $ZPM. MSGMON is not a process pair.

Note. MSGMON is compatible only with G06.09 and later RVUs.

The following list summarizes information about MSGMON:

Process Description: Message monitor process
Abbreviation: MSGMON
Generic Process Name (Recommended): $ZZKRN.#MSGMON
Process Name: $ZIMnn, where nn is a processor number
Product Number: Included in T0294
Program File Name: $SYSTEM.SYSnn.MSGMON

You add the message monitor (MSGMON) process as a generic process using the SCF Kernel subsystem ADD PROCESS command. This is described in Section 3, Installing and Configuring a ServerNet Cluster. See the SCF Reference Manual for the Kernel Subsystem for complete command syntax for the ADD PROCESS command. As an alternative, you can use the ZPMCONF macro to automate the process of adding MSGMON, SANMAN, and SNETMON to the system-configuration database. For a description of the macro, refer to Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON.

Once created, MSGMON is started by the persistence manager (and restarted as necessary). MSGMON can be stopped with an SCF ABORT PROCESS command or as a result of an internal failure.

MSGMON must be configured:

• To be persistent.
• To run in every processor of a system. The CPU ALL attribute ensures that the process names are created with the proper CPU number suffix.
• To have the process name $ZIMnn, where nn is the processor number. The recommended symbolic name is MSGMON.
• To run under the super ID (255,255).
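As a sketch under the same caveats as the SNETMON example (the AUTORESTART value is an illustrative assumption, and SYSnn must be replaced with the current system subvolume), a MSGMON configuration might look like this:

  -> ADD PROCESS $ZZKRN.#MSGMON, &
       AUTORESTART 10, &
       NAME $ZIM, &
       CPU ALL, &
       PROGRAM $SYSTEM.SYSnn.MSGMON, &
       USERID SUPER.SUPER

With CPU ALL, the persistence manager creates one process in each processor, appending the processor number to the configured name to produce $ZIM00, $ZIM01, and so on.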
NonStop Kernel Message System

The NonStop Kernel message system provides the set of privileged application program interfaces (APIs) for interprocessor communication between processes residing on the same or on different systems. The message system was changed in G06.09 and later RVUs to allow interprocessor communication between distinct systems over the ServerNet fabrics.

SANMAN Subsystem and SANMAN

The external system area network manager process (SANMAN) subsystem is the software environment that enables control of the external ServerNet fabrics. The following list summarizes naming and numbering for the SANMAN subsystem:

Full Name: NonStop Kernel external SAN manager subsystem
Subsystem Name: SMN
Subsystem Acronym: ZSMN
Subsystem Number: 237

SANMAN

The external system area network manager (SANMAN) is a process pair that runs in every NonStop S-series server connected to a ServerNet cluster. SANMAN provides the services needed to manage the external ServerNet fabrics and the system’s access to the fabrics:

• Manages the external ServerNet fabrics.
• Initializes, monitors, configures, and controls the cluster switches.
• Polls the cluster switches at regular intervals. For details, see Table 9-3 on page 9-12.
• Communicates with other processes or objects that require information from or about the external fabrics. These processes or objects can include the OSM or TSM client, SNETMON, SCF, the cluster switches, and the service processors (SPs).

The following list summarizes information about SANMAN:

Process Description: External system area network manager process
Abbreviation: SANMAN
Generic Process Name (Recommended): $ZZKRN.#ZZSMN
Process Pair Name: $ZZSMN
Product Number: T0502
Program File Name: $SYSTEM.SYSnn.SANMAN

SANMAN can run in any processor. The process pair must be configured to be started by the persistence manager. You add the ServerNet SAN manager process as a generic process using the SCF Kernel subsystem ADD PROCESS command. This task is described in Section 3, Installing and Configuring a ServerNet Cluster. See the SCF Reference Manual for the Kernel Subsystem for complete command syntax for the ADD PROCESS command. As an alternative, you can use the ZPMCONF macro to automate the process of adding MSGMON, SANMAN, and SNETMON to the system-configuration database. For a description of the macro, refer to Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON.

The ServerNet SAN manager process must be configured:

• To be persistent (set the AUTORESTART attribute to a nonzero value).
• To have the process name $ZZSMN (set the NAME attribute to $ZZSMN). The recommended symbolic name is ZZSMN.
• So that the $ZPM persistence manager stops the process by sending it an internal system message (set the STOPMODE attribute to SYSMSG).
• To run under a super-group user ID (by default, the USERID attribute is set to the user ID of the current SCF session).
• To have at least two processors (preferably more) in the CPU list (defined with the CPU attribute).
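A representative command follows. It parallels the SNETMON sketch earlier in this section; the AUTORESTART value and the CPU list are illustrative assumptions, SYSnn must be replaced with the current system subvolume, and the command must be entered under a super-group user ID as noted above:

  -> ADD PROCESS $ZZKRN.#ZZSMN, &
       AUTORESTART 10, &
       NAME $ZZSMN, &
       PROGRAM $SYSTEM.SYSnn.SANMAN, &
       CPU FIRSTOF (0,1,2,3), &
       STOPMODE SYSMSG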
Expand

The Expand subsystem is the networking software that connects NonStop S-series servers and other NonStop servers. It extends the operation of the fault-tolerant operating system to dispersed nodes. In an Expand network, system resources can communicate with one another as if they were running on the same local system. This communication is achieved by using the NonStop Kernel message system as the common interface to access system resources.

The Expand subsystem supports a variety of protocols and communications methods to enable you to connect systems in local area network (LAN) and wide area network (WAN) topologies. Expand-over-FOX and Expand-over-ATM are two examples of communications methods.

Expand-Over-ServerNet Line-Handler Processes

Expand-over-ServerNet is a communications medium for the Network Access Method (NAM). When you add a node to a ServerNet cluster, you must add an Expand-over-ServerNet line-handler process for the new node in every other system in the cluster, as well as a line-handler process in the new node for every other node in the cluster. Thus, in a four-node ServerNet cluster, each node has three Expand-over-ServerNet line-handler processes.

The OSM Add Node to ServerNet Cluster and the TSM Configure ServerNet Node guided procedures assist you in configuring and starting the line-handler processes. Figure 1-20 shows the Expand-over-ServerNet line-handler processes for a four-node ServerNet cluster. The line-handler processes follow the naming convention used by the guided procedure.

Figure 1-20. Line-Handler Processes in a Four-Node Cluster

(The figure shows four nodes, \NODE1 through \NODE4, with Expand node numbers 001 through 004, connected through the X and Y fabrics. Each node has one configured single-line path for each remote node, named $SCnnn, where nnn is the Expand node number of the remote node; for example, \NODE1 has $SC002, $SC003, and $SC004.)

The following list summarizes information about the line-handler process:

Description: Expand-over-ServerNet line-handler process
Type: 63
Subtype: 4
Profile: PEXPSSN
ASSOCIATEDEV Default: $ZZSCL (SNETMON)

The Expand-over-ServerNet line-handler process manages security-related messages and forwards packets outside the ServerNet cluster. Other messages, such as incoming and outgoing data, usually bypass the Expand-over-ServerNet line-handler process and are handled directly by the ServerNet X and Y fabrics and the message system. This is known as Expand bypass mode.

The Expand-over-ServerNet line-handler process is similar to the Expand-over-FOX line-handler process. However, for Expand-over-ServerNet lines you configure a subtype code of 4 and use the PEXPSSN profile. The ASSOCIATEDEV parameter contains the name of the SNETMON process, $ZZSCL. Like the Expand-over-FOX line-handler process, the Expand-over-ServerNet line-handler process can support only a single line and a single neighbor system, so the Expand-over-FOX line-handler process and the Expand-over-ServerNet line-handler process are always separate processes with different logical devices (LDEVs).
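If you configure a line-handler process manually instead of with the guided procedure, you use the WAN subsystem SCF ADD DEVICE command. The following sketch is illustrative only: the device name #SC003 and the NEXTSYS value 3 are hypothetical, other required attributes are omitted, and the complete attribute set and syntax are documented in the Expand Configuration and Management Manual:

  -> ADD DEVICE $ZZWAN.#SC003, &
       PROFILE PEXPSSN, &
       ASSOCIATEDEV $ZZSCL, &
       NEXTSYS 3, &
       SPEEDK SNET

The PEXPSSN profile supplies the device type 63, subtype 4, and the SPEEDK SNET setting reflects the recommendation described under Super Time Factors, which follows.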
Super Time Factors

Expand time factors (TFs) are numbers assigned to a line, and reported for a path and a route, to indicate efficiency in transporting data. The lower the TF, the more efficient the line, path, or route. Previous TFs defined FOX lines as the fastest communication medium between servers. Prior to Expand-over-ServerNet, the highest line speed supported was 224 Kbps.

Super Time Factors allow Expand network routing to use a transport medium between systems that is faster than a FOX line. Super Time Factors extend the automatically calculated TFs to line speeds greater than 224 Kbps. If two systems are linked by more than one type of Expand line (for example, Expand-over-ServerNet and Expand-over-FOX), the time factor determines which line is faster. HP recommends setting the SPEEDK parameter to SNET, which makes the Expand-over-ServerNet line the fastest line. The ServerNet cluster transport is preferred because of its high bandwidth.

Expand and Message-System Traffic

In a ServerNet cluster, message-system traffic flows directly between processors by way of the message system but under the control of Expand. Figure 1-21 diagrams the traffic. Secure message-system traffic between processes on different ServerNet nodes travels through the Expand-over-ServerNet line handlers and through the local message system between the communicating processes and the line handlers. Nonsecure message-system traffic flows directly between processors through the intersystem message-system ServerNet connections as directed by the network routing table (NRT) under the control of Expand. The intersystem ServerNet message-system connections are not used if the appropriate settings are not made in the NRT.

Figure 1-21. Message Passing Over ServerNet

(The figure shows ordinary messages passing directly between the message systems of processors on \ALPHA and \BETA, while security-checked messages pass through the Expand-over-ServerNet line handlers (LH) on each system.)

For more details on the Expand subsystem, see the Expand Configuration and Management Manual.

OSM and TSM Software

Either HP Open System Management (OSM) or its predecessor, Compaq TSM, software can be used to monitor and service a ServerNet cluster (in addition to your NonStop servers). OSM supports all ServerNet cluster topologies on both G-series and H-series. TSM supports only the star topologies (star, split-star, and tri-star), not the newer layered topology, and TSM does not support H-series. For more information, see Appendix H, Using OSM to Manage the Star Topologies.

OSM client-based applications are preloaded on system consoles (formerly known as TSM workstations) shipped after G06.21. Consoles can be upgraded to the latest OSM or TSM client software from the NonStop System Console (NSC) Installer CD. System consoles communicate with NonStop S-series servers over an Ethernet LAN. LAN connections and LAN planning are described in more detail in Section 2, Planning for Installation.

Note. SCF commands also are available for gathering information. You can use SCF commands to check the status of SNETMON, SANMAN, the Expand-over-ServerNet line-handler processes, cluster switches, IPC connectivity, and more. For details, see Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem, and Section 9, SCF Commands for the External ServerNet SAN Manager Subsystem.

Table 1-4 lists the OSM and TSM client applications and provides some examples of how you can use them to manage a ServerNet cluster.

Table 1-4. Using OSM and TSM Client Applications to Manage a ServerNet Cluster

OSM Service Connection or TSM Service Application
  • View node and cluster information
  • Monitor the health of the internal and external ServerNet fabrics, including the MSEBs and the cluster switches
  • View alarms and repair actions for system and cluster resources
  • Start ServerNet cluster services
  • Generate ServerNet statistics
  • Perform cluster switch actions such as updating firmware and configuration
  • Switch the primary processor for SANMAN and SNETMON

OSM Notification Director or TSM Notification Director
  • Display or dial out incident reports (IRs)
  • Configure dial-outs to a service provider

OSM Event Viewer or TSM EMS Event Viewer Application
  • View and monitor EMS event logs

SCF

Table 1-5 lists the SCF commands that are supported by SNETMON and SANMAN.

Note. For SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies, see Appendix I, SCF Changes at G06.21.
Table 1-5. SNETMON and SANMAN SCF Commands

SNETMON SCF Commands:
  ALTER SUBSYS
  INFO SUBSYS
  PRIMARY PROCESS
  START SUBSYS
  STATUS SUBNET
  STATUS SUBSYS
  STOP SUBSYS
  TRACE PROCESS
  VERSION PROCESS
  VERSION SUBNET
  VERSION SUBSYS

SANMAN SCF Commands:
  ALTER SWITCH
  INFO CONN
  INFO SWITCH
  LOAD SWITCH
  PRIMARY PROCESS
  RESET SWITCH
  STATUS CONN
  STATUS SWITCH
  TRACE PROCESS
  VERSION PROCESS

Examples and descriptions of these SCF commands are found later in this manual:

• Section 5, Managing a ServerNet Cluster, contains examples of SCF commands.
• Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem, documents the SNETMON SCF commands.
• Section 9, SCF Commands for the External ServerNet SAN Manager Subsystem, documents the SANMAN SCF commands.

The SCF interfaces to the Kernel subsystem, WAN subsystem, and Expand subsystem also are used to manage a ServerNet cluster.

Part II. Planning and Installation

This part contains the following sections:

• Section 2, Planning for Installation
• Section 3, Installing and Configuring a ServerNet Cluster
• Section 4, Upgrading a ServerNet Cluster

2. Planning for Installation

This section describes how to plan the installation of a ServerNet cluster or the addition of a node to an already-installed cluster. This section contains the following subsections:

Heading                        Page
Using the Planning Checklist   2-1
Planning for the Topology      2-8
Planning for Hardware          2-9
Planning for Floor Space       2-18
Planning for Power             2-22
Planning for Software          2-23
Planning for Serviceability    2-27

Before installing a ServerNet cluster, complete the planning steps to ensure that the installation:

• Is successful
• Proceeds quickly
• Requires little long-term maintenance

Note. HP strongly recommends that you complete the Planning Checklist before attempting the installation. Some planning checks are optional, but others, if not performed in advance, will prevent you from completing the installation.

Using the Planning Checklist

Table 2-1 shows the planning checklist. Read through the checklist and mark off each item for which your site is prepared. If you are not prepared to check off an item, see the reference for more information about how to prepare your site.

Table 2-1. Planning Checklist (page 1 of 3)

Plan for the Topology

• Choose from one of the three supported topologies: star, split-star, or tri-star. (See Planning for the Topology on page 2-8.)
• Make sure all nodes to be added to the cluster have the software required by the chosen topology. (See Planning for the Topology on page 2-8.)

Plan for Hardware

• Make sure the correct number of cluster switches are either available or will be ordered to support the topology of the cluster you are building. (See Cluster Switches Required for Each Topology on page 2-9.)
• If the cluster switches are not purchased preinstalled in switch enclosures provided by HP, make sure two 19-inch racks meeting HP specifications are available. (See Alternate Cluster Switch Packaging on page 2-10.)
• For each server that will join the cluster, make sure two single-mode, fiber-optic cables meeting HP specifications are available. (See Two Fiber-Optic Cables Needed for Each Server on page 2-10.)
• If your cluster will use either the split-star topology or the tri-star topology, make sure enough fiber-optic cables (of the correct length) are available to connect the cluster switches on each fabric. (See Fiber-Optic Cables for Multilane Links on page 2-10.)
• Make sure infrastructure (cable-routing channels) is available to support and route the fiber-optic cables. (See Two Fiber-Optic Cables Needed for Each Server on page 2-10.)
• For each server that will join the ServerNet cluster, make sure MSEBs are installed in slots 51 and 52 of the group 01 enclosure. (See Two MSEBs Needed for Each Server on page 2-13.)
• If MSEBs are already installed in slots 51 and 52 of group 01 of each server, verify that the MSEBs contain a single-mode fiber-optic PIC with the NNA FPGA in port 6. (See PICs for MSEBs That Will Replace SEBs on page 2-16.)
• If MSEBs need to be installed, make sure the MSEBs contain the proper complement of PICs to support the ServerNet cables that must be attached to them. (See PICs for MSEBs That Will Replace SEBs on page 2-16.)
• If MSEBs need to be installed, make sure ServerNet cables with the appropriate connectors are available to connect the MSEBs to other enclosures. (See ServerNet Cables for Attaching MSEBs to Other System Enclosures on page 2-17.)

Table 2-1. Planning Checklist (page 2 of 3)

Plan for Floor Space

• Make sure the number of servers that will be connected to form a ServerNet cluster is no more than 24. (See Planning for Floor Space on page 2-18.)
• Make sure no server will participate in more than one cluster at a time. (See Planning for Floor Space on page 2-18.)

If the servers are already installed:

• Make sure each cluster switch is no more than 80 meters (measured by cable length) from the group 01 enclosure of each server to which it connects. (See Locating the Servers and Cluster Switches on page 2-19.)

If the servers are not already installed:

• Make sure preinstallation planning for the servers and peripheral equipment has been or will be conducted in accordance with HP standards. (See the NonStop S-Series Planning and Configuration Guide.)
• Make sure the servers can be installed so that the group 01 enclosure of each server is no more than 80 meters (measured by cable length) from the X-fabric and Y-fabric cluster switches to which it connects. (See Locating the Servers and Cluster Switches on page 2-19.)

Floor Space for the Cluster Switches

• Make sure there is enough floor space to accommodate the footprint of the X-fabric and Y-fabric cluster switches, whether installed in switch enclosures or in 19-inch racks. (See Floor Space for Servicing of Cluster Switches on page 2-21.)
• Make sure there is enough space in front of and in back of the switch enclosures or 19-inch racks for servicing. (See Floor Space for Servicing of Cluster Switches on page 2-21.)

Plan for Power

• Make sure power planning for the servers and peripheral equipment has been or will be conducted in accordance with HP standards. (See the NonStop S-Series Planning and Configuration Guide and the ServerNet Cluster 6770 Hardware Installation and Support Guide.)
• Make sure an AC power source or, preferably, two independent AC power sources are available within power-cable distance (8.2 feet (2.5 meters)) of each enclosure or 19-inch rack containing a cluster switch. (See Planning for Power on page 2-22.)
• Make sure the AC power sources available for use by the cluster switches provide the appropriate current and voltage levels for continuous operation of the switches. (See Planning for Power on page 2-22.)

Table 2-1. Planning Checklist (page 3 of 3)

• Make sure emergency power-off (EPO) cables, if required, can be routed to the cluster switches. (See the NonStop S-Series Planning and Configuration Guide.)

Plan to Upgrade Software

• Make sure all servers to be added to the cluster are running the required operating system RVUs or SPRs. (See Minimum Software Requirements on page 2-23.)
• Make sure the T0509 Expand/ServerNet Profile is installed on each server to be added to the cluster. (See Minimum Software Requirements on page 2-23.)
• Make sure all servers to be added to the cluster have unique system names. (See Verifying the System Name, Expand Node Number, and Time Zone Offset on page 2-26.)
• Make sure all servers to be added to the cluster have unique Expand node numbers. (See Verifying the System Name, Expand Node Number, and Time Zone Offset on page 2-26.)
• Make sure the system clocks on all servers to be added to the cluster are set to the correct time zone offset. (See Verifying the System Name, Expand Node Number, and Time Zone Offset on page 2-26.)
• If you want ServerNet nodes to coexist with other types of Expand nodes, make sure you purchase the necessary Expand profile products. (See Planning for Compatibility With Other Expand Line Types on page 2-26.)

Plan for the System Consoles

• Make sure at least one primary and one backup system console are available to manage the servers in the cluster. (See Planning for the System Consoles on page 2-27.)

Using the Cluster Planning Work Sheet, you can record the names and locations of the servers to be clustered. Appendix B, Blank Planning Forms, contains a blank copy of the Cluster Planning Work Sheet that you can photocopy. The form can accommodate information for up to eight nodes. If your cluster will contain 8 through 16 nodes, make two copies; if your cluster will contain more than 16 nodes, make three copies. See the Cluster Planning Work Sheet (Example) on page 2-5. It spans three pages because it contains information for an 18-node cluster.

Cluster Planning Work Sheet (Example)
Cluster Name: Production/Sales   Date: 17 Oct. 2001   Page 1 of 3

System Name  Serial No.  Expand Node #  Location  # of Processors  Model            X/Y Switch #  X/Y Switch Port #  ServerNet Node #
\PROD1       G34973      52             Room 3    16               NonStop S-74000  1             0                  1
\PROD2       G34970      51             Room 3    12               NonStop S-74000  1             1                  2
\TEST1       G34971      50             Room 1     8               NonStop S-74000  1             2                  3
\PROD3       G34975      53             Room 1    10               NonStop S-74000  1             3                  4
\TEST2       G34968      49             Room 2     2               NonStop S-74000  1             4                  5
\SFSALES     G34901      45             Room 2     8               NonStop S-74000  1             5                  6
\LASALES     G34902      55             Room 2    12               NonStop S-74000  1             6                  7
\NYSALES     G34905      46             Room 2    12               NonStop S-74000  1             7                  8

Cluster Planning Work Sheet (Example)
Cluster Name: Production/Sales   Date: 17 Oct. 2001   Page 2 of 3
System Name  Serial No.  Expand Node #  Location     # of Processors  Model            X/Y Switch #  X/Y Switch Port #  ServerNet Node #
\AKSALES     G34998      54             Bldg B, Rm2   2               NonStop S-74000  2             0                   9
\PROD4       G34991      47             Bldg B, Rm2   8               NonStop S-74000  2             1                  10
\DATA1       G34992      44             Bldg B, Rm2   6               NonStop S-74000  2             2                  11
\DATA2       G34995      56             Bldg B, Rm2  12               NonStop S-72000  2             3                  12
\MASALES     G34098      48             Bldg B, Rm2   4               NonStop S-72000  2             4                  13
\DATA4       G34001      40             Bldg B, Rm2  12               NonStop S-72000  2             5                  14
\PROD5       G36990      39             Bldg B, Rm2  12               NonStop S-74000  2             6                  15
\DATA5       G36078      38             Bldg B, Rm2   4               NonStop S-74000  2             7                  16

Cluster Planning Work Sheet (Example)
Cluster Name: Production/Sales   Date: 17 Oct. 2001   Page 3 of 3

System Name  Serial No.  Expand Node #  Location     # of Processors  Model            X/Y Switch #  X/Y Switch Port #  ServerNet Node #
\CHSALES     G30001      81             Bldg C, Rm1  12               NonStop S-70000  3             0                  17
\DATA6       G30200      89             Bldg C, Rm1   8               NonStop S-72000  3             1                  18

(The remaining rows of the example form are blank.)

Planning for the Topology

The topology you use determines the maximum size of the cluster. Table 2-2 compares the topologies.

Table 2-2. Maximum Cluster Size for Each Topology

Topology     Cluster Switches per Fabric  Total Cluster Switches  Maximum Number of Nodes Supported
Star         1                            2                       8
Split-star   1 or 2                       2 or 4                  16
Tri-star     1, 2, or 3                   2, 4, or 6              24

Considerations for Choosing a Topology

• If your cluster will contain more than 16 nodes or will grow to more than 16 nodes, you must use the tri-star topology. The tri-star topology supports up to 24 nodes.

• If your cluster will contain more than eight nodes but is not likely to grow beyond 16 nodes, you can build either a split-star topology or a subset of the tri-star topology that uses two cluster switches per fabric. In this case, to choose the best topology, you need to consider:

  • Performance: A split-star topology offers twice the throughput of a tri-star topology with two cluster switches per fabric.

  • Cable costs: The tri-star topology with two cluster switches per fabric requires a total of four fiber connections between cluster switches (two cables per fabric). The split-star topology requires a total of eight fiber connections (four cables per fabric).

  • Migration: The tri-star topology with two cluster switches per fabric provides fast and easy migration to a full tri-star topology supporting up to 24 nodes. Online migration from a split-star topology to a tri-star topology is more complex and requires upgrading one fabric at a time. For more information, refer to Section 4, Upgrading a ServerNet Cluster.

  • Software: The tri-star topology requires G06.13 (with SPRs) or a later G-series RVU (without SPRs). The split-star topology can be constructed using G06.09 or later RVUs. (SPRs are required for G06.09 through G06.11.)
  • Features and Functions: The tri-star topology features automatic fail-over of the two-lane links and significant defect repair. (The split-star topology also features automatic fail-over and significant defect repair when G06.14 or superseding SPRs are installed.)

If you are building a new cluster, use the split-star or tri-star topology. The star topology is not recommended because you must change the topology to grow the cluster beyond eight nodes.

Software Requirements for the Star, Split-Star, and Tri-Star Topologies
Each topology has different software requirements. You must make sure that any server added to a cluster meets the software requirements for the topology. See Table 2-8 on page 2-23.

Subsets of a Topology
Valid subsets of the split-star and tri-star topologies can be built as long as they meet the software requirements for the topology shown in Table 2-8 on page 2-23. The valid subsets of a topology are:
• A subset of the split-star topology having up to eight nodes and using only one cluster switch per fabric
• A subset of the tri-star topology having up to eight nodes with one cluster switch per fabric
• A subset of the tri-star topology having up to 16 nodes with two cluster switches per fabric
For details about each type of subset, see Table 1-1 on page 1-7.

Planning for Hardware
Hardware planning consists of ordering the ServerNet cables, MSEBs, and cluster switches needed to connect a cluster. Contact your HP representative for the most current ordering information.

Cluster Switches Required for Each Topology
The number of cluster switches you need depends on the topology you use to build the cluster. Table 2-3 lists the cluster switch requirements for each topology.

Table 2-3. Cluster Switch Requirements for Each Topology
Topology              Cluster Switches per Fabric   Total Cluster Switches   Fiber-Optic Cables for Multilane Link   Nodes Supported
Star                  1                             2                        None                                    Up to 8
Split-star (subset)   1                             2                        None                                    Up to 8
Split-star            2                             4                        8 (2 four-lane links)                   Up to 16
Tri-star (subset)     1                             2                        None                                    Up to 8
Tri-star (subset)     2                             4                        4 (2 two-lane links)                    Up to 16
Tri-star              3                             6                        12 (6 two-lane links)                   Up to 24

Alternate Cluster Switch Packaging
The cluster switches are typically packaged in a switch enclosure that is half the height of a NonStop S-series system enclosure. (See Figure 1-14 on page 1-27.) However, the components of a cluster switch can also be ordered separately and installed in a 19-inch rack that you provide. The rack must be an EIA standard rack that is 19 inches wide and 24 to 26 inches deep. For more information about the cluster switch, refer to Section 1, ServerNet Cluster Description.

Two Fiber-Optic Cables Needed for Each Server
Each server that joins a ServerNet cluster requires two single-mode fiber-optic cables. These cables connect the MSEBs in group 01 of the server to the cluster switches. The maximum length for these cables is 80 meters. Figure 2-6 on page 2-19 shows the maximum cable lengths.

Caution. Do not exceed the 80-meter maximum cable length between a ServerNet node and a cluster switch. Doing so can cause multiple, unrecoverable link failures to and from the node that is in violation of the distance limit.
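To make the cable counts concrete, consider an illustrative example (the cluster size is hypothetical): a full 16-node split-star topology uses four cluster switches (two per fabric), 2 x 16 = 32 fiber-optic cables to connect the nodes to the cluster switches, and, per Table 2-3, 8 more cables for the two four-lane links, for a total of 40 fiber-optic cables.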
Fiber-Optic Cables for Multilane Links
If your cluster uses more than one cluster switch per fabric, you need additional fiber-optic cables for the multilane links between the cluster switches. The number of fiber-optic cables you need depends on the number of cluster switches and the topology you are using. To determine how many fiber-optic cables you need for multilane links, see Table 2-3 on page 2-9.

Cables up to 80 meters in length are supported for multilane links in the split-star and tri-star topologies. However, you can use longer cables between cluster switches if the requirements in Table 2-4 are met.

Table 2-4. Cable Length Requirements for Multilane Links
Cable Length   Minimum Requirements
Up to 80 m     All nodes in the cluster must meet the requirements for the split-star topology. See Table 2-8 on page 2-23.
Up to 1 km     All nodes in the cluster must be running G06.11 or a later version of the operating system.
Up to 5 km     All of the following:
               • All nodes in the cluster must be running G06.16 or a later version of the operating system.
               • All nodes in the cluster must be S76000 or S86000.
               • All processors in all nodes must have the NSR-X or NSR-Y processor type.*

* You can use the OSM Service Connection or TSM Service Application to check the processor type. (TSM displays the Processor Name attribute; OSM displays the Processor Type attribute.)

Fiber-Optic Cable Information
Single-mode fiber-optic cables provided by HP for ServerNet cluster installations adhere to the IEEE 802.3z (Gigabit Ethernet) standard. The cables are terminated by duplex subscriber connectors (SC), as shown in Figure 2-1.

Figure 2-1. Duplex SC Cable Connectors With Dust Caps

Note the following considerations:
• Single-mode fiber-optic cables are required for connecting nodes and cluster switches in a ServerNet cluster.
• Multimode fiber-optic cables, FOX ring fiber-optic cables, and ATM cables are not supported for use in a ServerNet cluster.
• If the fiber-optic cables you use are not provided by HP, the cables must meet the specifications in Table 2-5.

Table 2-5. Fiber-Optic Cable Requirements
Description                              Requirements
Supported lengths                        10 meters; 40 meters; 80 meters; 80 meters, plenum-rated. Cables longer than 80 meters are supported for use in a multilane link if certain requirements are met. See Table 2-4 on page 2-11.
Connector/receptacle type                Duplex SC (See Figure 2-1 on page 2-11.)
Core/cladding diameter                   9/125 micrometers
Nominal fiber specification wavelength   1310 nm
Fiber                                    Corning SMF-28
Operating distance                       For connections between a node and a cluster switch, a maximum of 80 m is supported. For connections in a multilane link, see Table 2-4 on page 2-11.
Bend radius (smallest allowable bend)    1.8 in. (45 mm)
Channel insertion loss                   4.5 dB
Fiber cable attenuation (max.)           0.5 dB/km
Connector insertion loss                 0.5 dB
Maximum loss                             10.5 dB
Zero dispersion wavelength               1300 nm ≤ λ0 ≤ 1324 nm
Dispersion slope (max.) (S0)             0.093 ps/(nm²·km)
Transmitter output power                 Minimum: –9.5 dBm; Maximum: –3 dBm
Receiver input optical power             Minimum: –20 dBm; Maximum: –3 dBm

For more information about fiber-optic cables, see Appendix G, Fiber-Optic Cable Information.
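As an illustrative check only (the link length here is hypothetical, and qualification of long links is governed by Table 2-4, not by this arithmetic), you can use the values in Table 2-5 to confirm that a proposed multilane link stays within the optical budget. For a 5 km link with one connector pair at each end:

   fiber attenuation:        5 km x 0.5 dB/km = 2.5 dB
   connector insertion loss: 2 x 0.5 dB       = 1.0 dB
   total loss:               2.5 dB + 1.0 dB  = 3.5 dB, within the 4.5 dB channel insertion loss

Even at the minimum transmitter output power of –9.5 dBm, the received power would be about –9.5 dBm – 3.5 dB = –13.0 dBm, which is above the –20 dBm receiver input minimum.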
ServerNet Cluster Manual— 520575-003 2- 12 Two MSEBs Needed for Each Server Planning for Installation Two MSEBs Needed for Each Server Each server that will join a ServerNet cluster must have at least two modular ServerNet expansion boards (MSEBs). These MSEBs must be installed in slots 51 and 52 of the group 01 enclosure. (Other enclosures do not need MSEBs.) Check the CRUs installed in group 01, slots 51 and 52 to determine if they are SEBs or MSEBs. Figure 2-2 shows the location of these slots. Note. If you have a two-processor system (for example, a NonStop S700) with ServerNet adapters installed in group 01, slots 51 and 52, you must move the adapters so that MSEBs can be installed in these slots. You might have to purchase an additional system enclosure in which to install the moved adapters. For instructions on moving the ServerNet adapters, refer to the manual for each adapter. ServerNet Cluster Manual— 520575-003 2- 13 Two MSEBs Needed for Each Server Planning for Installation Figure 2-2. Slots 51 and 52 of Group 01 in a NonStop Sxx000 Server 50 55 51 52 53 54 Slots 51 and 52 56 Slot Component 50, 55 51, 52 Processor Multifunction (PMF) CRU ServerNet Expansion Board (SEB) or Modular ServerNet Expansion Board (MSEB) 53, 54 ServerNet Adapter 56 Emergency Power-Off (EPO) Connector VST007.vsd ServerNet Cluster Manual— 520575-003 2- 14 Two MSEBs Needed for Each Server Planning for Installation Figure 2-3 shows SEBs and MSEBs. You can check these components visually or use the OSM Service Connection or TSM Service Application to discover and display all of the system components on the system console. Figure 2-3. SEBs and MSEBs SEB MSEB Fault LED (amber) Ejector Power-On LED (green) Link Alive LED (green) SERVERNET 6 SERVERNET 5 ServerNet 10 Serial Copper ServerNet 6 PIC (NNA fiber-optic only) 10 Link Alive LED (green) 6 ServerNet 5 PIC ServerNet 9 Serial Copper 9 5 ServerNet 4 PIC SERVERNET 4 SERVERNET 3 4 ServerNet 8 Serial Copper ServerNet 3 PIC 8 3 ServerNet 2 PIC SERVERNET 2 SERVERNET 1 ServerNet 7 Serial Copper 7 2 ServerNet 1 PIC 1 VST002.vsd ServerNet Cluster Manual— 520575-003 2- 15 Two MSEBs Needed for Each Server Planning for Installation PICs for MSEBs That Will Replace SEBs If slots 51 and 52 of group 01 contain SEBs, you must replace them with MSEBs. The MSEBs that you install in these slots must contain a single-mode fiber-optic PIC with the node-numbering agent (NNA) field-programmable gate array (FPGA) in port 6. Port 6 is used to connect fiber-optic cables from the server to a ServerNet II switch. See Figure 2-4. Figure 2-4. Single-Mode Fiber-Optic PIC Installed in Port 6 6 10 Port 6 Containing Single-Mode Fiber-Optic Plug-In Card (PIC) With Node-Numbering Agent (NNA FPGA) 5 4 9 MSEB vst004.vsd If slots 51 and 52 of group 01 already contain MSEBs, check the PICs installed in the MSEBs in port 6. If the system is running, you can use the OSM Service Connection or TSM Service Application to determine if a PIC installed in port 6 is an NNA PIC. Check the PIC Type attribute for the PIC subcomponent of the MSEB. Section 5, Managing a ServerNet Cluster, shows a TSM display of the PIC attributes. If the system is not running, you can remove the MSEB and visually check the PIC installed on the MSEB common base board. The NNA PIC is roughly twice as long as all other PICs for the MSEB. See Figure 1-11 on page 1-23. Note. 
When installed in an MSEB and viewed on the MSEB faceplate, the single-mode fiber PIC, multimode fiber PIC, and single-mode fiber NNA PIC are identical in appearance.

Before an MSEB can replace an SEB, the MSEB must be populated with plug-in cards (PICs) to accept the ServerNet cables previously attached to the SEB. For example, if four ECL ServerNet cables are attached to an SEB, the MSEB that replaces it must contain four ECL PICs in the ports to which the cables connect. Figure 2-5 compares the supported SEB and MSEB connectors.

Figure 2-5. SEB and MSEB Connectors (Actual Size) (SEB connectors: ECL and fixed serial copper; MSEB PICs: ECL, serial copper, and multimode or single-mode fiber-optic, with or without the NNA FPGA)

ServerNet Cables for Attaching MSEBs to Other System Enclosures
Replacing SEBs with MSEBs also requires you to replace the ServerNet cables connecting the SEBs to other system enclosures. The number and type of cables you need depend on the number and type of PICs installed in the MSEBs. The following table shows the ServerNet cables required for the supported PIC connectors:

This PIC connector . . .                   Requires this cable type . . .
ECL (fixed or PIC)                         ECL ServerNet cables
Serial copper (fixed or PIC)               Serial copper ServerNet cables
Multimode fiber                            Multimode fiber-optic ServerNet cables
Single-mode fiber (with or without NNA)    Single-mode fiber-optic ServerNet cables

Appendix A, Part Numbers, lists the available ServerNet cables.

Replacing SEBs With MSEBs Using the Guided Procedure
You can use an OSM or TSM guided replacement procedure to replace an SEB with an MSEB.

Note. The guided procedure for replacing an SEB or MSEB cannot be used to replace ServerNet/FX or ServerNet/DA adapters installed in group 01, slots 51 and 52. To move the adapters to an unused slot in the same enclosure or another enclosure, refer to the manual for each adapter.

To launch the TSM guided procedure, from the system console of the system you are adding, select Start>Programs>HP TSM>Guided Replacement Tools>Replace SEB or MSEB. The OSM guided procedure is launched from within the OSM Service Connection by performing the Replace action from the SEB you want to replace. Online help is available to assist you in performing the procedures. You can replace only one SEB or MSEB at a time.

Note. Before using the guided procedure to replace an SEB or an MSEB, review the online help topic “Read This First.” This topic contains important information about software requirements and tools you need to perform the procedure.

The online help for the guided procedure tells you when to connect ServerNet cables to the PICs in the MSEB.

Installing or Replacing a PIC in an MSEB
To install or replace a PIC in an MSEB, perform each step in the guided procedure up to and including removing the MSEB from the enclosure. Then refer to the NonStop S-Series Hardware Installation and FastPath Guide. When the PIC is installed or replaced, use the guided procedure to reinstall the MSEB into the enclosure.

Planning for Floor Space
A functioning ServerNet cluster can consist of as few as two NonStop S-series servers or as many as 24. Clustered servers do not take up any more floor space than nonclustered servers. However, you must plan additional floor space for the cluster switches.
Each cluster switch enclosure occupies the same floor space as a NonStop S-series system enclosure. The number of cluster switches required depends on the topology (or topology subset) used by the cluster. To determine how many cluster switches you need, see Table 2-3 on page 2-9. To plan floor space for the servers, refer to the NonStop S-Series Planning and Configuration Guide.

Locating the Servers and Cluster Switches
The servers must be located so that they can be connected to the cluster switches with a 10-meter, 40-meter, or 80-meter cable. See Figure 2-6. This is not the straight-line distance. Bends in the cable after it is installed can significantly reduce the actual distance from the cluster switch to the server.

Figure 2-6. Maximum Cable Lengths for Servers and Switches (80-meter fiber-optic cables from group 01 of each server to the cluster switch allow a maximum of 160 meters between servers)

The cluster switches are packaged in separate enclosures or racks. HP recommends that you do not install cluster switches adjacent to each other. (When cluster switches are installed adjacent to each other, there is a higher potential for miscabling.)

Caution. Do not exceed the 80-meter maximum cable length between a ServerNet node and a cluster switch. Doing so can cause multiple, unrecoverable link failures to and from the node that is in violation of the distance limit.

For connections in a multilane link, you can use cables longer than 80 meters if certain requirements are met. For details, see Table 2-4 on page 2-11. Figure 2-7 shows the maximum cable lengths for a multilane link.

Figure 2-7. Maximum Cable Lengths With a Multilane Link (X fabric shown: 80-meter fiber-optic cables from group 01 of each server to cluster switches X1 and X2, with 1 km* fiber-optic cables between the cluster switches, allow a maximum of 1.16 km between servers; four-lane links are illustrated, but two-lane links use the same maximum cable lengths)
* 5 km cables can be used in a multilane link if the requirements in Table 2-4 on page 2-11 are met.

Note the following considerations for locating the cluster switches:
• The power cord for a cluster switch and its power subsystem measures 8.2 feet (2.5 meters).
• A switch enclosure can be installed on the floor or on top of a base system enclosure.
• To reduce cabling errors, HP recommends that the X-fabric and Y-fabric cluster switches be installed with some distance between them. Do not install the cluster switches next to or on top of each other. See Figure 2-8.

Figure 2-8. Placement of Cluster Switches (to avoid cabling errors, place the X-fabric and Y-fabric switches in different locations; do not stack cluster switches on top of a double-high stack or on top of each other, and do not locate them next to each other)

Floor Space for Servicing of Cluster Switches
You must plan floor space for the cluster switches that includes service space in front of and behind the switch enclosure (or 19-inch rack). Table 2-6 shows the dimensions of the switch enclosure.
The total weight of the switch enclosure and its components is 180 lbs.

Table 2-6. Switch Enclosure Dimensions
Measurement                  Description                                         Inches     Centimeters
Height (including casters)   1 switch enclosure                                  20.5       52.0
                             1 stackable switch enclosure and 1 base enclosure   54.6       138.7
Width                        Enclosure including two cable channels              23         58
                             Enclosure without cable channels                    19         48.3
Depth                        Enclosure with cable channels                       30.75      73.8
                             Enclosure without cable channels                    25.75      65.4
Clearance                    Appearance side                                     36         91.5
                             Service side                                        30         77
Footprint                    Enclosure                                           5 ft2      0.47 m2
                             Enclosure with clearance                            13.2 ft2   1.23 m2

The ServerNet II Switch subcomponent of the cluster switch is installed on slide rails. When pulled out for servicing (see Figure 2-9), the ServerNet II Switch extends approximately 24 inches in front of the enclosure. You must allow enough service clearance in front of the switch enclosure so that the ServerNet II Switch can be fully extended and, if necessary, removed for servicing.

Figure 2-9. ServerNet II Switch Extended for Servicing

Planning for Power
Power planning for the servers and their peripheral equipment (for example, tape drives, disk drives, and system consoles) is the same whether or not the server is a member of a ServerNet cluster. Because external ServerNet fabrics use fiber-optic cable, each server is electrically isolated from transient signals originating at another server. For power planning information, refer to the NonStop S-Series Planning and Configuration Guide.

You need to plan for the power requirements of each cluster switch. For fault tolerance, HP recommends that the dual power cords provided by the AC transfer switch be plugged into separate AC power sources. Table 2-7 describes power requirements for the cluster switch. For more information, refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide.

Table 2-7. Cluster Switch Power Requirements
Characteristic                                                            Range
AC input voltage for U.S. and Japan                                       100-120 V ac, 50 Hz/60 Hz, 4.0 A
AC input voltage for Europe                                               200-240 V ac, 50 Hz, 2.0 A
Power consumption (maximum) for cluster switch (UPS batteries charging)   400 watts

Planning for Software
Planning for software includes preparing to upgrade to a new RVU and verifying the Expand system name and number.

Minimum Software Requirements
Any node that will participate in a ServerNet cluster must have Expand (T9057) software, which is delivered on the site update tape (SUT). In addition, the Expand/ServerNet Profile (T0509) is required for clustering. This is an optional component that, if ordered, is delivered on the SUT.

Additional software requirements depend on the topology you use to construct the ServerNet cluster. See Table 2-8. If you build a subset of the split-star or tri-star topology, all nodes in the cluster must still meet the minimum software requirements for the topology.
Table 2-8. Minimum Software Requirements for Each Topology
Software Component          Star                      Split-Star                  Tri-Star
Operating system            G06.09 or later           G06.09 or later             G06.13 or later
                                                      (G06.09, G06.10, and        (G06.13 requires SPRs)
                                                      G06.11 require SPRs)
SNETMON/MSGMON              See Table 4-28            See Table 4-28              T0294AAG or superseding
                            on page 4-95              on page 4-95
SANMAN                      T0502 or superseding      T0502AAE or superseding     T0502AAG or superseding
ServerNet II Switch         T0569AAA or superseding   T0569AAB or superseding     T0569AAE or superseding
firmware and configuration
OSM client software         T0632 or later, T0633 or later, T0634 or later (all topologies)
OSM server software         T0682AAA or later (all topologies)
TSM client software         Version 10 (T8154AAS)     Version 2001B (T8154AAW)    Version 2001D (T8154AAY)
                            or superseding            or superseding              or superseding
TSM server software         T7945AAS                  T7945AAW or superseding     T7945AAY or superseding
SP firmware                 T1089AAX or superseding   T1089ABB or superseding     T1089ABC or superseding
SCF                         T9082ACN or superseding   T9082ACQ or superseding     T9082ACQ or superseding

Checking SPR Levels
Table 2-9 shows how to check the current SPR levels for ServerNet cluster software.

Table 2-9. Checking SPR Levels
Product        Software Component                    To check the current SPR level . . .
T0502          SANMAN                                At a TACL prompt: > VPROC $SYSTEM.SYSnn.SANMAN
                                                     or in SCF: -> VERSION PROCESS $ZZSMN, DETAIL
                                                     If no version is indicated, see the note below.
T0294          SNETMON/MSGMON                        At a TACL prompt: > VPROC $SYSTEM.SYSnn.SNETMON
                                                     or in SCF: -> VERSION PROCESS $ZZSCL, DETAIL
                                                     If no version is indicated, see the note below.
T0569          ServerNet II Switch firmware and      To check the firmware file, at a TACL prompt: > VPROC $SYSTEM.SYSnn.M6770
               configuration files (on the server)   To check the configuration file, at a TACL prompt: > VPROC $SYSTEM.SYSnn.M6770CL
                                                     If no version is indicated, see the note below.
T1089          Service Processor (SP) firmware       Using the TSM Service Application:
                                                     1. In the tree pane of the Management window, navigate to the SP and right-click.
                                                     2. In the pop-up menu that appears, click Attributes.
                                                     3. Check the Service Processor Firmware Version attribute value.
T9082          Subsystem Control Facility (SCF)      At a TACL prompt: > VPROC $SYSTEM.SYSTEM.SCF
T0682          OSM server software                   At a TACL prompt: > VPROC $SYSTEM.SYSnn.cimom
                                                     > VPROC $SYSTEM.SYSnn.ralprvd
T0632, T0633   OSM client software                   Launch the OSM Low-Level Link or OSM Notification Director application and, from the Help menu, click About HP OSM...
T7945          TSM server software                   At a TACL prompt: > VPROC $SYSTEM.SYSnn.SRM
T8154          TSM client software                   Log on to the TSM Service Application and, from the Help menu, click About Compaq TSM.

Note: Versions of this software earlier than G06.12 do not include the SPR level as part of their version information. See Table 2-10 on page 2-25.
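For example (an illustrative session; the SYSnn subvolume shown is hypothetical and the output is abridged), checking the SANMAN SPR level from a TACL prompt might look like this:

> VPROC $SYSTEM.SYS00.SANMAN
  ...
  Version procedure: T0502G08^02JUL01^SMN^AAG

The version procedure string T0502G08^02JUL01^SMN^AAG corresponds to the T0502AAG SPR in Table 2-10, below.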
Version Procedure Information for ServerNet Cluster Software
Some ServerNet cluster software components earlier than G06.12 omit the SPR level from their version procedure information. Table 2-10 shows the version procedure strings that identify the currently installed SPR.

Table 2-10. Version Procedure Information for ServerNet Cluster Software
Product                                       SPR                      VPROC String
SANMAN                                        T0502 (G06.09)           T0502G08^03JUL00^SMN
                                              T0502AAA                 T0502G08^21DEC00^SMN
                                              T0502AAE                 T0502G08^07FEB01^SMN^AAE
                                              T0502AAG                 T0502G08^02JUL01^SMN^AAG
                                              T0502AAH                 T0502G08^15MAY02^08APR02_AAH
ServerNet II Switch Firmware (M6770)          T0569 (G06.09)           No information (empty file)
                                              T0569AAA                 T0569G06^01APR01^M6770FM
                                              T0569AAB                 T0569G06^11APR01^FMW^AAB
                                              T0569AAE                 T0569G06^02JUL01^FMW^AAE
                                              T0569AAF                 T0569G06^06MAY02^FMW^AAF
ServerNet II Switch Configuration (M6770CL)   T0569 (G06.09)           No information (empty file)
                                              T0569AAA                 T0569G06^01APR01^M6770TL
                                              T0569AAB                 T0569G06^11APR01^CL1^AAB
                                                                       T0569G06^11APR01^CL2^AAB
                                              T0569AAE or T0569AAF     T0569G06^02JUL01^CL1^AAE
                                                                       T0569G06^02JUL01^CL2^AAE
                                                                       T0569G06^02JUL01^CL3^AAE
                                                                       T0569G06^02JUL01^CL4^AAE
                                                                       T0569G06^02JUL01^CL5^AAE
SNETMON/MSGMON                                T0294 (G06.09)           T0294G08^03JUL00^SCL
                                              T0294AAA                 T0294G08^15DEC00^SCL
                                              T0294AAB                 T0294G08^31DEC00^SCL
                                              T0294AAE                 T0294G08^07FEB01^SCL^AAE
                                              T0294AAG                 T0294G08^01JUL01^SCL^AAG
                                              T0294AAH                 T0294G08_15MAY02_13MAR02_AAH

SP Firmware Requirement for Systems With Tetra 8 Topology
When preparing a system with the Tetra 8 topology for ServerNet cluster connectivity, you must upgrade the SP firmware to a version that supports clustering (shown in Table 2-8 on page 2-23) before you perform a system load with G06.09 or higher. Otherwise, the ServerNet cluster processes $ZZSCL (SNETMON) and $ZZSMN (SANMAN) will abend repeatedly when the system load is performed. This requirement does not apply to systems with the Tetra 16 topology.

Verifying the System Name, Expand Node Number, and Time Zone Offset
Before you can add a server to a ServerNet cluster, you must ensure that the system name and system number (Expand node number) do not conflict with the name and number of another node in the cluster. You should also verify that the system clocks on all nodes in the cluster are set to the correct time zone offset.

To verify the system name and system number (Expand node number) and view the time zone offset:
1. Log on to the system.
2. Use the Kernel subsystem SCF INFO command:
-> INFO SUBSYS $ZZKRN

For each server, record the system name and Expand node number on the Cluster Planning Work Sheet. See the Cluster Planning Work Sheet (Example) on page 2-5. For a blank copy of the Planning Work Sheet, see Appendix B, Blank Planning Forms. For information about the system name, system number (Expand node number), or the time zone offset, refer to the SCF Reference Manual for the Kernel Subsystem.

Planning for Compatibility With Other Expand Line Types
If you want to create a ServerNet cluster that operates with other Expand line types, you must order individual profile products for the line types you are using. Profile products include the Expand/ServerNet Profile (T0509G06), the Expand/SWANgroup Profile (T0532G06), and the Expand/FastPipe Profile (T0533G06); together, these profiles cover the Expand-over-ServerNet, X.25, NETDIRECT, NETSATELLITE, IP, ATM, and FOX line types.

For example, if you currently are using Expand-over-IP lines and you want to add Expand-over-ServerNet lines and be able to use both lines together, then you must buy the Expand/FastPipe Profile (T0533G06) for every system that will use an Expand-over-IP line.
And you must buy the Expand/ServerNet Profile (T0509G06) for every system that will use an Expand-over-ServerNet line. You can set the SPEEDK parameter to ETHER10 (for Expand/IP over 10 Mbps Ethernet), but the lines should work even if you do not use SPEEDK. For more information, see the Expand Configuration and Management Manual.

Planning for Serviceability
Serviceability planning includes planning for the system consoles and for the dedicated OSM or TSM LAN. Planning for serviceability ensures that you can take full advantage of the features provided by the OSM or TSM management software.

Planning for the System Consoles
Typically, each server has a primary and backup system console. However, you can construct a dedicated LAN between multiple servers and share system consoles. If the system consoles are shared, the ServerNet cluster must have at least one primary and one backup system console.

Note. Primary and backup system consoles are HP-approved workstations configured as primary and backup dial-out points. System consoles have modems and can dial out incident reports to a service provider, such as the Global Customer Support Center (GCSC). System consoles must be installed on a dedicated LAN. Dedicated LANs are described later in this section.

In addition to primary and backup system consoles, a server can have additional consoles that are not dial-out points and do not have to be installed on a dedicated LAN.

Planning for the Dedicated OSM or TSM LAN
To plan for the dedicated OSM or TSM LAN, you must understand the supported LAN configurations for OSM or TSM applications.

Supported LAN Configurations for OSM and TSM
OSM and TSM applications can be used in two types of Ethernet LAN environments: a dedicated LAN or a public LAN. The type of LAN environment you use determines the ServerNet cluster operations you can perform from a particular system console. Note that you can use OSM or TSM applications to manage a server only if the server and the system console are members of the same dedicated or public LAN. In other words, if your dedicated LAN or public LAN includes all of the nodes in a ServerNet cluster, you can monitor any node in the cluster from a single system console. If your LAN includes some but not all of the nodes in a ServerNet cluster, you must use different system consoles to monitor different nodes.

Figure 2-10 shows how LAN connections are made for NonStop S-series servers.

Figure 2-10. LAN Connections for NonStop S-Series Servers (group 01 connects to a public LAN, a dedicated LAN for TSM, and a dedicated LAN for SWAN concentrators)

Dedicated LAN
The dedicated LAN is an Ethernet LAN used for secure management of a NonStop S-series server. Connection to a dedicated LAN is required for all S-series installations.

Note. A system console connected to a dedicated LAN can run any OSM or TSM application, including the Low-Level Link Application.

The dedicated LAN consists of:
• One or more NonStop S-series servers (other types of servers are not allowed)
• One or more Ethernet hubs
• One (primary) or more (backup or additional) system consoles

Figure 2-11 shows a dedicated LAN configuration.
Figure 2-11. Recommended Configuration for Dedicated LAN (a NonStop S-series server, two Ethernet hubs, and primary and backup system consoles with modems to remote service providers)

Note: Do not use this figure as a wiring diagram. Actual connections vary depending on the Ethernet hub you use.

The dedicated LAN connects to the Ethernet ports on PMF CRUs located in group 01. See Figure 2-12. This LAN includes only components specified by HP and must not include routers or bridges. A dedicated LAN can include all the nodes in a ServerNet cluster, provided all the nodes are within the reach of the Ethernet cables. Such a LAN is highly useful. You can use it to perform system console operations for every node in the ServerNet cluster from any system console on the LAN. System console operations include establishing a low-level link connection, managing incident reports (IRs), and dialing out information to a service provider.

Figure 2-12. Dedicated LAN (the group 01 PMF CRU Ethernet ports carry the MSP0 and MSP1 addresses and the operating-system access addresses; microhubs connect the primary and backup system consoles, additional TSM workstations, and modems to remote service providers)

Public LAN
A public LAN is an Ethernet LAN that can include many clients and servers and that might or might not include routers or bridges. NonStop S-series servers connect to public LANs using Ethernet 4 ServerNet Adapter (E4SA) or Fast Ethernet ServerNet Adapter (FESA) ports. Figure 2-13 shows a public LAN using an E4SA.

Figure 2-13. Public LAN (a NonStop S-series server connects to the public LAN through an E4SA and reaches a NonStop K20000 server over Expand/IP; the PMF CRU connection serves the dedicated LANs for TSM and for SWANs, with primary and backup system consoles and modems to remote service providers)

Note the following considerations regarding a public LAN:
• Connection to a public LAN is not required for NonStop S-series installations.
• Additional system consoles can be connected to a public LAN, but a primary or backup (dial-out point) system console cannot be connected to a public LAN.
• A system console connected to a public LAN can run only the Service Application and the Event Viewer Application. A system console connected to a public LAN cannot use the Low-Level Link Application or the Notification Director Application.
• A public LAN can include multiple NonStop S-series servers and other types of servers.

For More Information About LAN Planning
Refer to the following manuals for more information about LAN planning and Ethernet adapters:
To find out more about . . .                                     See this manual . . .
LAN planning for OSM, TSM, and for SWAN                          WAN Subsystem Configuration and Management Manual; TSM Configuration Guide
Installing the dedicated LAN                                     NonStop S-Series Planning and Configuration Guide; NonStop S-Series Hardware Installation and FastPath Guide
Installing or replacing an Ethernet 4 ServerNet Adapter (E4SA)   Ethernet Adapter Installation and Support Guide
Using SCF (the SLSA subsystem) to manage an Ethernet LAN         LAN Configuration and Management Manual

LAN Considerations for ServerNet Clusters
The LAN configuration you choose affects the OSM or TSM applications you will be able to use. It also affects the number of nodes you will be able to manage from a single system console. Note the following considerations:
• You can log on to a server from a system console only if the server and the system console are connected to the same Ethernet LAN.
• OSM and TSM applications allow you to log on to and manage resources on only one server at a time. However, you can launch multiple copies of an application on the same system console and log on to different servers at the same time.
• A secure, dedicated LAN is required for connecting the system consoles to all NonStop S-series servers. Any system console on a dedicated LAN can use all OSM or TSM applications. The dedicated LAN can be expanded to include all the servers in a ServerNet cluster, provided the servers are located close enough for the Ethernet cables to reach.
• A public LAN connected to the E4SAs or FESAs in each NonStop S-series server can be constructed to include all the nodes in a ServerNet cluster. There are no restrictions on the types of servers or workstations that can participate in this LAN. However, a system console connected to a public LAN can run only the OSM Service Connection, TSM Service Application, or the Event Viewer applications.

These considerations are described in more detail in the following subsections:
• Single-System Logon for OSM and TSM Applications on page 2-33
• Using Multiple Copies of an OSM or TSM Application on page 2-35
• Supported LAN Configurations for OSM and TSM on page 2-27

Single-System Logon for OSM and TSM Applications
OSM and TSM applications have an important operating restriction: you can log on to and manage resources on only one system at a time. In addition, you can log on to a system only if the system and the system console you are using belong to the same Ethernet LAN.

In Figure 2-14, each node in the ServerNet cluster has a system console, and each system console can display some generic ServerNet cluster information. For example, OSM or TSM client applications communicate with SNETMON using SPI to obtain information such as the identity of other nodes in the ServerNet cluster. But because each LAN in this configuration services only a single system, its system console can perform system management tasks (for example, viewing alarms on components or loading processors) on only a single system. For example, the system console connected to \A cannot perform management tasks on \D. The LAN configurations prevent you from managing resources on systems not connected to the same LAN as the system console you are using.
Figure 2-14. Ethernet LANs Serving Individual Nodes (four separate Ethernet LANs, each with its own hub, serve nodes \A, \B, \C, and \D of a ServerNet cluster connected by the X and Y external ServerNet fabrics)

Figure 2-15 shows the same ServerNet cluster but with an Ethernet LAN that links all the nodes in the cluster. This LAN configuration, which could be a dedicated LAN or a public LAN, allows any system console to manage resources on any node.

Figure 2-15. Ethernet LAN Serving Multiple Nodes (a single Ethernet LAN serves nodes \A, \B, \C, and \D of the ServerNet cluster)

Using Multiple Copies of an OSM or TSM Application
The Ethernet LAN configuration shown in Figure 2-15 also allows one system console to manage the resources for multiple nodes at the same time. Even though you can log on to only one system at a time, you can launch multiple copies of the same OSM or TSM application, logging on to a different system for each copy. Refer to Appendix F, Common System Operations, for more information about launching multiple copies of the same OSM or TSM application.

3 Installing and Configuring a ServerNet Cluster
This section describes how to install a new ServerNet cluster. If you are modifying an existing cluster, see Section 4, Upgrading a ServerNet Cluster.

To install a new cluster that supports . . .                See
Up to 8 nodes and has one cluster switch per fabric          Installing a ServerNet Cluster Using a Star Topology on page 3-1
Up to 16 nodes and has two cluster switches per fabric       Installing a ServerNet Cluster Using the Split-Star Topology on page 3-25
Up to 24 nodes and has three cluster switches per fabric     Installing a ServerNet Cluster Using the Tri-Star Topology on page 3-30

Installing a ServerNet Cluster Using a Star Topology
A ServerNet cluster using a star topology uses one cluster switch per fabric and can support up to eight nodes. Table 3-1 lists the tasks for installing a ServerNet cluster using a star topology. Each task can contain multiple steps. You construct a ServerNet cluster by configuring the first node and then adding nodes one at a time until the cluster is complete. However, some tasks, such as installing the servers and upgrading operating-system software, can be performed on all the servers before you add them to the cluster.

Note. The time required to complete the installation varies depending on the hardware and software to be installed.
Table 3-1. Task Summary for Installing a ServerNet Cluster With a Star Topology
√   Description                                                                       Page
    Task 1: Complete the Planning Checklist                                           3-2
    Task 2: Inventory Your Hardware                                                   3-3
    Task 3: Install the Servers                                                       3-4
    Task 4: Upgrade the Operating System and Software                                 3-4
    Task 5: Install MSEBs in Slots 51 and 52 of Group 01                              3-5
    Task 6: Add MSGMON, SANMAN, and SNETMON to the System-Configuration Database      3-7
    Task 7: Verify That $ZEXP and $NCP Are Started                                    3-11
    Task 8: Install the Cluster Switches                                              3-12
    Task 9: Perform the Guided Procedure for Configuring a ServerNet Node             3-17
    Task 10: Check for Problems                                                       3-23
    Task 11: Add the Remaining Nodes to the Cluster                                   3-24

Task 1: Complete the Planning Checklist
Before you install a ServerNet cluster, HP recommends that you complete all the tasks in the planning checklist. Section 2, Planning for Installation, contains the planning checklist and describes how to complete it. You should review the planning checklist each time you add a system to the cluster.

Note. If you do not complete the planning checklist, you might have to go back and complete some planning tasks during installation.

Task 2: Inventory Your Hardware
Check that you have all the required hardware components to build your cluster. Table 3-2 can help you inventory hardware.

Table 3-2. Hardware Inventory Checklist
√   Description                                                       Notes
    Check that you have at least two MSEBs for each node that         The two required MSEBs must be installed in group 01, slots 51 and 52. MSEBs
    will participate in the cluster.                                  for other enclosures are optional.
    Check that the MSEBs in group 01 of each node contain the         MSEBs in group 01, slots 51 and 52 must contain a single-mode fiber-optic NNA
    correct PICs.                                                     PIC in port 6. Other ports must contain the appropriate PICs to enable
                                                                      connectivity to the SEBs or MSEBs to which they connect.
    Check that enough ServerNet cables are available to replace       These replacement cables must support the same type of media (such as ECL)
    the current connections from group 01 to other enclosures         as your current ServerNet connections.
    within each system.
    Check that you have the correct number of cluster switches.       You need two cluster switches for a cluster using the star topology. The star
                                                                      topology supports up to eight nodes. If this cluster will become part of a larger
                                                                      split-star or tri-star topology, you need additional cluster switches. See Cluster
                                                                      Switch Requirements for Each Topology on page 2-9.
    Check that you have two fiber-optic cables (up to 80 meters       These cables connect group 01 to a cluster switch.
    long) per node.
    If this cluster will become part of a split-star or tri-star      The number of additional fiber-optic cables depends on the topology of the
    topology, check that you have enough fiber-optic cables for       cluster. See Cluster Switch Requirements for Each Topology on page 2-9.
    the multilane links.
    If this cluster will become part of a split-star or tri-star      Cables in a multilane link can be longer than 80 meters if the cluster meets the
    topology and the cables in a multilane link will be more than     requirements in Table 2-4 on page 2-11.
    80 meters, check that the cluster will meet the requirements
    for longer cables.

For more information on these hardware components, refer to Section 2, Planning for Installation.
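As an illustrative example (the node count is hypothetical), a new four-node cluster using the star topology would need: eight MSEBs (two per node, installed in group 01, slots 51 and 52), each with a single-mode fiber-optic NNA PIC in port 6; two cluster switches (one per external fabric); and eight single-mode fiber-optic cables (two per node). No multilane-link cables are needed, because the star topology uses only one cluster switch per fabric.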
ServerNet Cluster Manual— 520575-003 3 -3 Installing and Configuring a ServerNet Cluster Task 3: Install the Servers Task 3: Install the Servers If the individual servers are not already installed, you must install them and ensure that they function properly as stand-alone systems before adding them to a cluster. You can install all the servers now or install them individually before adding each one to the cluster. To plan for installation and install a NonStop S-series server, refer to the following guides: • • NonStop S-Series Planning and Configuration Guide NonStop S-Series Hardware Installation and FastPath Guide Task 4: Upgrade the Operating System and Software Unless a system is new, you must upgrade the operating system before adding the system to a ServerNet cluster. You can upgrade all the systems now or upgrade them individually before adding each one to the cluster. Each system must have: • The G06.09 or later RVU. In general, the different nodes in your cluster can have different versions of the operating system as long as they meet the requirements of your selected topology. For software requirements for each topology, see Table 2-8 on page 2-23. To check the versions of software currently running on your system, see Table 2-9 on page 2-24. Note. If your cluster uses a split-star topology or a tri-star topology, and the fiber-optic cables between the clusters switches exceed 80 meters, all nodes in the cluster must meet the requirements shown in Table 2-4 on page 2-11. • • • T9057, the Expand product, which is delivered on the site update tape (SUT). T0509, the Expand/ServerNet Profile required for ServerNet clustering. If ordered, this optional component is delivered on the site update tape (SUT). TSM client software version 10.0 or later. Be sure to use the most current OSM or TSM client software that supports your operating system: For NonStop Kernel Operating System Use TSM Client Version G06.16 2002B (or later) G06.15 2002A (or later) G06.14 2001D (or later) G06.13 2001C (or later) G06.12 2001B (or later) G06.11 2001A (or later) G06.10 2000A (or later) G06.09 10.0 (or later) ServerNet Cluster Manual— 520575-003 3 -4 Installing and Configuring a ServerNet Cluster Task 5: Install MSEBs in Slots 51 and 52 of Group 01 For more information about software requirements, refer to Section 2, Planning for Installation. For software installation information, refer to the following: • • • G06.xx Software Installation and Upgrade Guide NonStop System Console Installer Guide Interactive Upgrade Guide After upgrading operating system software, be sure to save a stable copy of the current configuration using the SCF SAVE command. For example: -> SAVE CONFIGURATION 01.02 This precaution will help you recover from any errors during the configuration of ServerNet cluster software. For information about the SCF SAVE CONFIGURATION command, refer to the SCF Reference Manual for G-Series Releases. Task 5: Install MSEBs in Slots 51 and 52 of Group 01 If you did not install MSEBs as part of hardware planning, you must do so now. You can install the MSEBs in all the systems now or install them as you add each system to the cluster. You can use a guided replacement procedure to replace the SEBs installed in slots 51 and 52 of group 01. This online method requires you to replace one CRU at a time. You can also use the guided procedure to install or replace a PIC in an MSEB. Note. 
The guided procedure for replacing an SEB or MSEB cannot be used to replace ServerNet/FX or ServerNet/DA adapters installed in group 01, slots 51 and 52. To move the adapters to an unused slot in the same enclosure or another enclosure, refer to the manual for each adapter. To launch the TSM guided procedure, select Start>Programs>HP TSM>Guided Replacement Tools>Replace SEB or MSEB. The OSM guided procedure is launched from within the OSM Service Connection by performing the Replace action from the SEB you want to replace. Online help is available to assist you in performing the procedures. Note. Before using the guided procedure to replace an SEB or an MSEB, be sure to review the online help topic “Read This First.” This topic contains important information about software requirements and tools you need to perform the procedure. To view the online help, choose Procedure Help from the Help menu. You can replace only one SEB or MSEB at a time. Online help is available to assist you. The online help prompts you to connect ServerNet cables to the PICs in the MSEB. ServerNet Cluster Manual— 520575-003 3 -5 Installing and Configuring a ServerNet Cluster Task 5: Install MSEBs in Slots 51 and 52 of Group 01 Considerations for Connecting ECL ServerNet Cables to ECL PICs If you are connecting ECL ServerNet cables to ECL PICs, note the following considerations: • • • The ServerNet cable connector and ECL PIC connector have standoffs that must be mated correctly. See Figure 3-1. When possible, connect the cable to the PIC on the MSEB before installing the MSEB into the slot. Placing the MSEB on its side on a flat surface and then connecting the cable makes it easier to secure the thumbscrews. Finger-tighten both thumbscrews on the cable connector at the same time so that the connector attaches squarely to the PIC. Figure 3-1. Connecting a ServerNet Cable to an ECL PIC MSEB Faceplate Connector Standoff ServerNet Cable Connector ECL PIC ECL PIC Standoff VST005.vsd ServerNet Cluster Manual— 520575-003 3 -6 Installing and Configuring a ServerNet Cluster Task 6: Add MSGMON, SANMAN, and SNETMON to the System-Configuration Database Task 6: Add MSGMON, SANMAN, and SNETMON to the SystemConfiguration Database Unless the system is new, you must add MSGMON, SANMAN, and SNETMON as generic processes to the system-configuration database. (These processes are preconfigured on new systems.) You must configure these processes to start automatically at system load and be persistent (that is, restart automatically if stopped abnormally). 1. Log on to the local system using the super ID (255, 255). (You must be logged on using the super ID when you add MSGMON, SANMAN, and SNETMON to the system-configuration database.) 2. Check to see if MSGMON, SANMAN, and SNETMON have already been added to the system-configuration database. Use the Kernel subsystem SCF INFO PROCESS command to display a list of the generic processes: -> INFO PROCESS $ZZKRN.* Under the *Name column of the display, look for the following: • • • $ZIMnn (MSGMON) $ZZSMN (SANMAN) $ZZSCL (SNETMON) If MSGMON, SANMAN, and SNETMON are . . . Then Present Go to the next step. Not present Configure them. (See Configuring MSGMON, SANMAN, and SNETMON on page 3-8.) 3. Gather more information about MSGMON, SANMAN, and SNETMON by using the SCF INFO PROCESS command. For example: -> INFO PROCESS $ZZKRN.#MSGMON, DETAIL -> INFO PROCESS $ZZKRN.#ZZSMN, DETAIL -> INFO PROCESS $ZZKRN.#ZZSCL, DETAIL Note. 
MSGMON, ZZSMN, and ZZSCL are the recommended symbolic names for these processes. They are also the names used as examples in this manual. However, these processes can be configured using other symbolic names. You can verify that MSGMON, SANMAN, and SNETMON are configured correctly by comparing the configured modifiers with the recommended modifiers shown in Adding MSGMON, SANMAN, and SNETMON Using SCF Commands on page 3-9.

4. Check that MSGMON ($ZIMnn), SANMAN ($ZZSMN), and SNETMON ($ZZSCL) are started:
-> STATUS PROCESS $ZZKRN.*

If MSGMON, SANMAN, and SNETMON are . . .   Then . . .
Started                                     Go to Task 7: Verify That $ZEXP and $NCP Are Started on page 3-11.
Not started                                 Start each process as described in Starting MSGMON, SANMAN, and SNETMON on page 3-11.

Configuring MSGMON, SANMAN, and SNETMON
You can configure MSGMON, SANMAN, and SNETMON by using an HP-supplied macro or by typing the SCF commands manually. Use one of the following procedures:
• Adding MSGMON, SANMAN, and SNETMON Using the ZPMCONF Macro on page 3-8
• Adding MSGMON, SANMAN, and SNETMON Using SCF Commands on page 3-9

Adding MSGMON, SANMAN, and SNETMON Using the ZPMCONF Macro
The ZPMCONF macro automates the process of adding MSGMON, SANMAN, and SNETMON to the system-configuration database. For a description of the macro, refer to Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON.

To obtain the ZPMCONF macro, do one of the following:
• If TSM Version 2000A or later is installed on the system console you are using, you can obtain the macro from a web page stored on the workstation:
1. Click Start>Programs>Compaq TSM>Compaq S-Series Service (CSSI) Web.
2. Click the NonStop™ ServerNet Cluster link.
3. Click Tools.
4. Click ZPMCONF.
5. Follow the instructions to download the macro.

Note. The ZPMCONF macro leaves the ServerNet cluster subsystem (also known as ServerNet cluster services) in the STOPPED state. If you use a guided procedure to configure and start Expand-over-ServerNet line-handler processes, the guided procedure starts the subsystem for you. However, if you use another method for configuring line-handler processes, you must start the subsystem manually using the SCF START SUBSYS $ZZSCL command.

• From a workstation that has Internet access, you can view a copy of the macro at http://nonstop.compaq.com/. Click Technical Documentation>Compaq S-Series Service (CSSI) Web>Extranet version of the Compaq S-Series Service (CSSI) Web>NonStop ServerNet Cluster>ZPMCONF macro. If the CSSI Web is not available, the former CSSI content is available from within the Support and Services collection of the NonStop Technical Library (NTL).
• Copy the macro from Appendix E, TACL Macro for Configuring MSGMON, SANMAN, and SNETMON, and paste the commands into an edit file on your system.

To use the ZPMCONF macro:

Note. Do not run the ZPMCONF macro on a system that is currently a member of a ServerNet cluster. MSGMON, SANMAN, and SNETMON will be aborted, and the connection to the cluster will be lost temporarily.

1. Log on using the super ID (255, 255). The macro will not run successfully if you use another user ID.
2. Copy the macro to (or create it in) the $SYSTEM.ZSUPPORT subvolume.
3. At a TACL prompt, type RUN ZPMCONF.
4. When the macro finishes, check the SCF state of $ZZSCL to make sure that it is in the STARTED state:
-> SCF STATUS PROCESS $ZZKRN.#ZZSCL
5. If $ZZSCL is not started, start it:
-> SCF START PROCESS $ZZKRN.#ZZSCL

Adding MSGMON, SANMAN, and SNETMON Using SCF Commands
If you do not have access to the ZPMCONF macro, or if you want to type the commands manually, perform the steps in this task on each system you want to add to the cluster:

1. Configure MSGMON:
-> ADD PROCESS $ZZKRN.#MSGMON, &
     AUTORESTART 10, &
     CPU ALL, &
     HOMETERM $ZHOME, &
     OUTFILE $ZHOME, &
     NAME $ZIM, &
     PRIORITY 199, &
     PROGRAM $SYSTEM.SYSTEM.MSGMON, &
     SAVEABEND ON, &
     STARTMODE SYSTEM, &
     STOPMODE SYSMSG

2. Configure SANMAN:

Note. For two-processor systems, HP recommends that you specify (00, 01) for the CPU list. For four-processor systems, specify (02, 01, 03) for the CPU list. For systems of six or more processors, specify (02, 05, 06, 03, 07, 04) for the CPU list.

-> ADD PROCESS $ZZKRN.#ZZSMN, &
     AUTORESTART 10, &
     PRIORITY 199, &
     PROGRAM $SYSTEM.SYSTEM.SANMAN, &
     CPU FIRSTOF (02, 05, 06, 03, 07, 04), &
     HOMETERM $ZHOME, &
     OUTFILE $ZHOME, &
     NAME $ZZSMN, &
     SAVEABEND ON, &
     STARTMODE SYSTEM, &
     STOPMODE SYSMSG, &
     STARTUPMSG "CPU-LIST <cpu-list>"

3. Configure SNETMON:

Note. For two-processor systems, HP recommends that you specify (00, 01) for the CPU list. For four-processor systems, specify (02, 01, 03) for the CPU list. For systems of six or more processors, specify (02, 05, 06, 03, 07, 04) for the CPU list.

-> ADD PROCESS $ZZKRN.#ZZSCL, &
     AUTORESTART 10, &
     PRIORITY 199, &
     PROGRAM $SYSTEM.SYSTEM.SNETMON, &
     CPU FIRSTOF (02, 05, 06, 03, 07, 04), &
     HOMETERM $ZHOME, &
     OUTFILE $ZHOME, &
     NAME $ZZSCL, &
     SAVEABEND ON, &
     STARTMODE SYSTEM, &
     STOPMODE SYSMSG, &
     STARTUPMSG "CPU-LIST <cpu-list>"

4. Start each process as described in Starting MSGMON, SANMAN, and SNETMON on page 3-11.

Starting MSGMON, SANMAN, and SNETMON
1. Use the SCF START PROCESS command to start MSGMON, SANMAN, and SNETMON:
-> START PROCESS $ZZKRN.#MSGMON

Note. After typing the START PROCESS $ZZKRN.#MSGMON command, it is normal to receive error messages indicating that one or more processors did not start because the CPU is down. The command attempts to start a MSGMON process in 16 processors but cannot start the process in processors that are not present in your system.

-> START PROCESS $ZZKRN.#ZZSMN
-> START PROCESS $ZZKRN.#ZZSCL

Note. You do not need to start the subsystem object for the ServerNet cluster monitor process yet. The subsystem cannot be started until the ServerNet cables are attached from the node to the cluster switch. The guided procedure for configuring a ServerNet node instructs you when to connect the cables and also brings the subsystem to the STARTED state if it is not already started.

2. Use the SCF STATUS PROCESS command to verify that MSGMON ($ZIMnn), SANMAN ($ZZSMN), and SNETMON ($ZZSCL) are started:
-> STATUS PROCESS $ZZKRN.*

Task 7: Verify That $ZEXP and $NCP Are Started
Subsequent tasks configure Expand line-handler processes that enable nodes to communicate with each other using the ServerNet protocol.
Task 7: Verify That $ZEXP and $NCP Are Started

Subsequent tasks configure Expand line-handler processes that enable nodes to communicate with each other using the ServerNet protocol. For these processes to be configured, the Expand manager process ($ZEXP) and the network-control process ($NCP) must be started.

1. At a TACL prompt, use the STATUS command to verify that $ZEXP and $NCP are started:

> STATUS $ZEXP
> STATUS $NCP

2. If $ZEXP or $NCP is not started or not present, refer to the Expand Configuration and Management Manual for information about configuring and starting these processes before continuing.

3. Verify that the $NCP ALGORITHM modifier on the system you are adding is set to the value used by all other nodes in the cluster. This modifier must be set to the same value not only for all nodes in the cluster, but also for all nodes in other networks that use Expand to communicate with the cluster. You can use the SCF INFO PROCESS command to check the modifier value:

-> SCF INFO PROCESS $NCP, DETAIL

For more information, refer to the Expand Configuration and Management Manual.
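Because the ALGORITHM value must match on every node, it can save time to check it from a single session. As a hedged sketch only: assuming remote passwords are configured between the systems, and assuming SCF accepts the same \system qualification for INFO PROCESS that this manual shows later for STATUS SUBNET, the check for a hypothetical remote node named \REMOTE would be:

-> SCF INFO PROCESS \REMOTE.$NCP, DETAIL

Compare the ALGORITHM value in each display; the values must all match before you continue.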
Task 8: Install the Cluster Switches

You must install the X-fabric and Y-fabric cluster switches before you can add a node to the cluster. This task includes installing the cluster switches, routing fiber-optic cables, and, if necessary, updating the ServerNet II Switch firmware and configuration.

1. Install the cluster switches as instructed in the ServerNet Cluster 6770 Hardware Installation and Support Guide. After the cluster switches are installed, you can skip this task when adding the remaining nodes.

2. Route the fiber-optic ServerNet cables as instructed in the ServerNet Cluster 6770 Hardware Installation and Support Guide.

Note. For cable-management purposes, you can connect the fiber-optic cables to the ServerNet II Switch ports inside the cluster switch enclosures. However, do not connect the fiber-optic cables to the MSEBs until you are instructed to do so by the guided procedure for configuring a ServerNet node.

3. Use Table 3-3 on page 3-13 to decide whether you need to update the firmware and configuration loaded on the ServerNet II Switch subcomponent of each cluster switch.

Caution. When you receive a cluster switch from HP as new equipment or as a replacement unit, the ServerNet II Switch subcomponent is preloaded with T0569AAA firmware and configuration files. T0569AAA is not compatible with some clusters. Before installing the cluster switch, you might have to update the firmware and configuration files to T0569AAB or a superseding SPR. A cluster switch running T0569AAA will not be recognized as a valid peer if connected to another cluster switch running T0569AAB or a superseding SPR.

Table 3-3. Decision Table for Updating the ServerNet II Switch Firmware and Configuration

If the cluster switch will be connected in a star topology with up to eight nodes, and all nodes are running G06.09, G06.10, or G06.11 and do not have the required SPRs listed in Table 3-5 on page 3-15:
Then the cluster switch can be installed using the preloaded T0569AAA firmware and configuration. You do not need to update the firmware or configuration.

If the cluster switch will be connected in a star topology with up to eight nodes, and some or all nodes are running G06.09, G06.10, or G06.11 with the required SPRs listed in Table 3-5 on page 3-15, or G06.12 or later:
Then the cluster switch can be connected running the preloaded T0569AAA firmware and configuration. However, any nodes that are running G06.12 or a later RVU, or that have the required SPRs listed in Table 3-5 on page 3-15, will generate OSM or TSM alarms recommending that you download the T0569AAB firmware and configuration (or a superseding firmware and configuration) to the cluster switch. You can ignore the alarms. However, HP recommends that you install the required SPRs and update the switch firmware and configuration to obtain defect repair and new functionality.

If the cluster switch will be connected in a split-star topology with up to 16 nodes:
Then, before installing the cluster switch, you must update the firmware and configuration with T0569AAB (or a superseding SPR). See Updating the Firmware and Configuration on page 3-15.

If the cluster switch will be connected in a tri-star topology with up to 24 nodes:
Then, before installing the cluster switch, you must update the firmware and configuration with T0569AAE (or a superseding SPR). See Updating the Firmware and Configuration on page 3-15.

Caution. All nodes attached to a cluster switch whose firmware and configuration will be updated must be running the compatible RVUs and their required SPRs. Any nodes that do not meet these requirements will experience permanent loss of Expand traffic across the cluster switch when the new firmware and configuration is loaded on the cluster switch. See Table 3-4 on page 3-14.

Table 3-4. Firmware and Configuration Compatibility With the NonStop Kernel

If the ServerNet II Switch will have its firmware and configuration updated with T0569AAB:
Then all nodes must be running G06.12 or a later RVU, or G06.09, G06.10, or G06.11 with the required Release 2 SPRs listed in Table 3-5 on page 3-15.

If the ServerNet II Switch will have its firmware updated with T0569AAE (or a superseding SPR) and its configuration updated with one of the split-star configuration tags from T0569AAE (or a superseding SPR), that is, Max 16 nodes, nodes 1-8 (0x10000) or Max 16 nodes, nodes 9-16 (0x10001):
Then all nodes must be running G06.14 or a later RVU, or a pre-G06.14 RVU with the required Release 3 SPRs listed in Table 3-5 on page 3-15.

If the ServerNet II Switch will have its firmware updated with T0569AAE (or a superseding SPR) and its configuration updated with one of the tri-star configuration tags from T0569AAE (or a superseding SPR), that is, Max 24 nodes, nodes 1-8 (0x10002), Max 24 nodes, nodes 9-16 (0x10003), or Max 24 nodes, nodes 17-24 (0x10004):
Then all nodes must be running G06.14 or a later RVU, or G06.13 with the required Release 3 SPRs listed in Table 3-5 on page 3-15.

Required SPRs

If T0569AAB or T0569AAE (or a superseding SPR) will be loaded on the cluster switch, you must check the software revision levels on each node to be connected to the cluster switch. Table 3-5 lists the minimum SPR levels needed to support the T0569AAB and T0569AAE (or superseding) firmware and configuration.

Note. To check the current SPR levels, see Table 2-9 on page 2-24. To check the running versions of cluster switch firmware and configuration, see Checking the Revisions of the Running Firmware and Configuration on page 4-11.
Table 3-5. Minimum SPR Levels for G06.12 and G06.14 ServerNet Cluster Functionality

For each software component, the required minimum SPR level is shown for Release 2 (G06.12 equivalent; see note 1) and Release 3 (G06.14 equivalent; see note 2):

• External ServerNet SAN manager process (SANMAN): Release 2, T0502AAE; Release 3, T0502AAG
• ServerNet cluster monitor process/message system monitor process (SNETMON/MSGMON): Release 2, T0294G08 for G06.09 or G06.10 (see note 3), T0294AAB for G06.11, or T0294AAG for G06.12; Release 3, T0294AAG
• ServerNet II Switch firmware and configuration files (on the server): Release 2, T0569AAB; Release 3, T0569AAE
• Service processor (SP) firmware: Release 2, T1089ABB; Release 3, T1089ABC
• Subsystem Control Facility (SCF): Release 2, T9082ACQ; Release 3, T9082ACR
• OSM server software: Release 2, T0682AAA; Release 3, T0682AAA
• OSM client software: Release 2, T0632, T0633, and T0634; Release 3, T0632, T0633, and T0634
• TSM server software: Release 2, T7945AAW; Release 3, T7945AAY
• TSM client software: Release 2, T8154AAW (TSM 2001B; see note 4); Release 3, T8154AAY (TSM 2001D; see note 5)

1. Release 2 supports the star and split-star topologies.
2. Release 3 supports the star, split-star, and tri-star topologies.
3. T0294G08 is the version of SNETMON/MSGMON that shipped with G06.09 and G06.10.
4. Provided on the NonStop S-series System Console Installer 13.0 CD.
5. Provided on the NonStop S-series System Console Installer 15.0 CD.

Updating the Firmware and Configuration

Use this procedure only if indicated in Table 3-3 on page 3-13. This procedure updates the firmware and configuration of a new cluster switch to prepare the cluster switch for installation.

1. Choose a node that can be used to update the firmware and configuration files. If the cluster switch will be used in a split-star topology, the node must be running G06.12 (or later) or have all of the Release 2 SPRs listed in Table 3-5. If the cluster switch will be used in a tri-star topology, the node must be running G06.14 (or later) or be running G06.13 with all of the Release 3 SPRs listed in Table 3-5.

2. If the node is currently a member of a ServerNet cluster, you must temporarily remove it from the cluster before connecting it to the cluster switch. You can do this by stopping the ServerNet cluster subsystem and then disconnecting the fiber-optic cables connected to port 6 of the MSEBs in group 01, slots 51 and 52.

Note. HP does not recommend this method for permanently removing a node from a ServerNet cluster because of the disruption in traffic across the Expand lines. For complete instructions, refer to Section 6, Adding or Removing a Node.

3. Connect a fiber-optic cable from port 6 of the MSEB located in slot 51 (X fabric) of group 01 to any port in the range 0 through 7 of the ServerNet II Switch subcomponent of the cluster switch. You do not need to connect to the Y fabric. Connecting to the X fabric is sufficient to update the firmware and configuration files.

4. Power on the cluster switch:

a. Connect the two power cords from the AC transfer switch to two independent AC power sources.

b. On the UPS front panel, press the Power ON button. The green power-on LED lights when the UPS is powered.

[Figure: UPS front panel, showing the green power-on LED and the STANDBY and ON buttons.]

c. On the ServerNet II Switch front panel, press the Power On button. (You must fully depress the button until it clicks.)

5. Using the system console for the node you selected in Step 1, log on to the TSM Service Application.

Note. You can also use the OSM Service Connection. See the online help for information on how to perform the equivalent activity.
6. Click the Cluster tab. (It can take a minute or two for the Cluster tab to become visible after you connect to a new cluster switch.)

Note. It is normal for the Compaq TSM 2001B or later client software to display down-rev firmware and down-rev configuration alarms for a new cluster switch.

7. Click the plus (+) sign next to the External ServerNet X or Y Fabric resource to display the Switch resource for the connected cluster switch.

8. Right-click the Switch resource and select Actions. The Actions dialog box appears.

9. In the Available Actions list, select Firmware Update and click Perform Action. The Update Switch guided procedures interface appears.

10. Click Start and follow the guided procedure to download the appropriate firmware. For online help, click the Help menu or click the Help button in any of the procedure dialog boxes. You can ignore local switch and peer fabric errors during this procedure. When the Perform SCF STATUS SUBNET Command and the SCF START SERVERNET Command dialog boxes appear, click Continue.

11. After the firmware update procedure is complete, return to the Actions dialog box, select Configuration Update, and click Perform Action. The Update Switch guided procedures interface appears again.

12. Click Start and follow the guided procedure to download the appropriate configuration files. For online help, click the Help menu or click the Help button in any of the procedure dialog boxes. You can ignore local switch and peer fabric errors during this procedure. When the following dialog boxes appear, select the option to continue with the procedure:

• Perform SCF STATUS SUBNET Command dialog box
• CAUTION: Possible Cluster Outage dialog box
• SCF START SERVERNET Command dialog box

13. After the Firmware Update and Configuration Update actions are complete:

a. Disconnect the node from the ServerNet II Switch.
b. Power off the ServerNet II Switch.
c. Power off the UPS.
d. Disconnect the two AC transfer switch power cords from the power sources.

14. Disconnect the cluster switch from the node you selected in Step 1 and go to the next installation task.

Task 9: Perform the Guided Procedure for Configuring a ServerNet Node

Use the guided configuration procedure to configure and add a node (system) to a ServerNet cluster (including the first node in the cluster). You perform this procedure on one system at a time.

Note. Documentation for the guided procedure is contained in the online help for the procedure. Before using the guided procedure to configure and add a node, be sure to review the online help topic “Read Before Using.” This topic contains important information about software requirements that must be met before you run the procedure.

To launch the TSM guided procedure, select Start>Programs>HP TSM>Guided Configuration Tools>Configure ServerNet Node. The OSM guided procedure is launched from within the OSM Service Connection by performing the Add Node to ServerNet Cluster action from the System object. Online help is available to assist you in performing the procedures.

What the Guided Procedure Does

The guided procedure for configuring a ServerNet node:

• Verifies that the group 01 MSEBs are installed and ready
• Tells you when to connect the fiber-optic cables
Note. For information about how to connect the fiber-optic cables, refer to Connecting a Fiber-Optic Cable to an MSEB or ServerNet II Switch on page 3-18 or to the online help for the guided procedure.

• Prompts you to start ServerNet cluster services by placing the SUBSYS object in the STARTED state, and collects cluster information
• Allows you to configure Expand-over-ServerNet line-handler processes and start Expand-over-ServerNet lines

Note. For information about using SCF to configure Expand-over-ServerNet line-handler processes manually, refer to Using SCF to Configure Expand-Over-ServerNet Line-Handler Processes on page 3-22 or the Expand Configuration and Management Manual.

When you are finished using the guided procedure, go to Task 10: Check for Problems on page 3-23.

Connecting a Fiber-Optic Cable to an MSEB or ServerNet II Switch

Do not connect fiber-optic cables to the MSEBs until instructed to do so by the guided procedure for configuring a ServerNet node. If you connect the cables before you are instructed to do so, the procedure cannot generate the Expand-over-ServerNet line-handler processes automatically. For more information about automatic configuration of line-handler processes, see the online help for the guided procedure.

Note. Handle the fiber-optic cables gently. Do not step on a fiber-optic cable or place a heavy object on top of it. If a cable must be installed with a bend, make sure the bend radius is not smaller than 1.8 inches (45 mm).

1. Label the cables as described in the ServerNet Cluster 6770 Hardware Installation and Support Guide.

2. If dust caps are installed on the ceramic ferrule tips of each connector, remove the dust caps.

3. Inspect each fiber-optic cable connector. See Figure 3-2.

• Make sure the ferrule housing and the ceramic ferrule tip are visible. The ferrule housing should be at least flush with the connector housing.
• It is normal for the ferrule housing to slide freely (approximately 2 mm) within the connector body between the stops designed into the connector-body assembly.
• If the ferrule housing and ceramic ferrule tip are pushed back inside the connector body, the connector might be defective and you should obtain another cable.

[Figure 3-2. Inspecting Fiber-Optic Cable Connectors: a good connector with the ferrule housing and ceramic ferrule tips visible, and a defective connector (do not use) with the ferrule housing pushed back into the connector body.]

4. Align the keys on the connector body with the key slots on the receptacle of the MSEB or ServerNet II Switch.

Note. The key slots on ports 8 through 11 of the ServerNet II Switch are located at the top of the receptacle. On ports 0 through 7, the key slots are located at the bottom of the receptacle. See Figure 3-3.

[Figure 3-3. Key Positions on ServerNet II Switch Ports: the keys are at the top on ports 8 through 11 and at the bottom on ports 0 through 7.]

5. Insert the connector into the receptacle, squeezing the connector body gently between your thumb and forefinger as you insert it. Push the connector straight into the receptacle until the connector clicks into place. See Figure 3-4.

[Figure 3-4. Inserting a Fiber-Optic Cable Connector Into an MSEB Receptacle, with the keys on the connector body aligned with the receptacle key slots.]
6. Verify that the connector is fully mated to the receptacle and that the fibers are evenly inserted. If the connector is defective, it is possible for one or both fibers to fail to make a solid connection even though the connector is inserted properly. Figure 3-5 shows a fully inserted connector in which one of the fibers does not make a solid connection.

[Figure 3-5. Inserted Connector With Bad Fiber Connection.]

7. Check the link-alive LED at both ends of the cable. Both LEDs should light a few seconds after the connector is inserted. You must check the link-alive LED at both ends of the cable to ensure that both the transmit and receive fibers are connected. It is possible for the link-alive LED to light on one end of the cable even though one of the fibers is not connected. See Figure 3-6.

[Figure 3-6. Effect of Uneven Fiber Insertion on Link Alive: with fibers evenly inserted, both ends show link alive; with fibers not evenly inserted, one end can show link alive while the other end does not.]

Using SCF to Configure Expand-Over-ServerNet Line-Handler Processes

HP recommends that you configure line-handler processes using the guided procedure for configuring a ServerNet node. The guided procedure can automatically generate line-handler processes for you, if desired. However, you can configure these processes manually by using SCF.

Line-Handler Processes Required for Clustering

Every system in a ServerNet cluster must have an Expand-over-ServerNet line-handler process for every other system in the ServerNet cluster. Therefore, when you add a system to a ServerNet cluster, you must configure multiple line-handler processes:

On . . .                                     You must configure . . .
The system you are adding                    Line-handler processes for all the other systems in the ServerNet cluster
All other systems in the ServerNet cluster   A line-handler process for the system you are adding

For example, if you have a ServerNet cluster consisting of three nodes and you want to add a fourth node, you must configure a total of six line-handler processes: three on the system you are adding and one on each of the three current nodes of the ServerNet cluster. In addition, every line-handler process must be given a unique name.

Note. To log on to a system other than the one where your current TACL process is running, you must first start a remote TACL process in that system. You must also set up remote passwords between the local system and the remote system. You do this using the REMOTEPASSWORD command. For information about configuring remote passwords, refer to the Guardian User’s Guide.

Example

The following SCF example shows recommended modifier values for an Expand-over-ServerNet line-handler process:

>SCF
>ADD DEVICE $ZZWAN.#SC006, CPU 2, ALTCPU 5, PROFILE PEXPSSN, &
>IOPOBJECT LHOBJ, TYPE (63, 4), ASSOCIATEDEV $ZZSCL, &
>NEXTSYS 123, RSIZE 0, SPEEDK SNET, COMPRESS_OFF, &
>PATHBLOCKBYTES 0, PATHPACKETBYTES 4095
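To make the six-process count above concrete, here is a hedged sketch of the commands for adding a hypothetical fourth node \D (Expand node number 4) to a cluster of \A, \B, and \C (Expand node numbers 1 through 3). All system names, device names, Expand node numbers, and CPU choices are invented for illustration; reuse the recommended modifiers from the example above and apply the configuration rules that follow.

On \D, add one line-handler process per existing node (shown here for \A; repeat with a unique device name and NEXTSYS 2 for \B and NEXTSYS 3 for \C):

>SCF
>ADD DEVICE $ZZWAN.#SC001, CPU 2, ALTCPU 5, PROFILE PEXPSSN, &
>IOPOBJECT LHOBJ, TYPE (63, 4), ASSOCIATEDEV $ZZSCL, &
>NEXTSYS 1, RSIZE 0, SPEEDK SNET, COMPRESS_OFF, &
>PATHBLOCKBYTES 0, PATHPACKETBYTES 4095

On each of \A, \B, and \C, add one line-handler process for \D:

>SCF
>ADD DEVICE $ZZWAN.#SC004, CPU 2, ALTCPU 5, PROFILE PEXPSSN, &
>IOPOBJECT LHOBJ, TYPE (63, 4), ASSOCIATEDEV $ZZSCL, &
>NEXTSYS 4, RSIZE 0, SPEEDK SNET, COMPRESS_OFF, &
>PATHBLOCKBYTES 0, PATHPACKETBYTES 4095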
Rules for Configuring Line-Handler Processes Using SCF

If you use SCF to configure Expand-over-ServerNet line-handler processes manually (in other words, you do not use the guided procedure), observe the following configuration rules:

Rule 1. Whenever possible, configure the primary and backup Expand-over-ServerNet line-handler processes in different processor enclosures. This rule applies to all systems except two-processor systems, which have only one processor enclosure. Rule 1 takes precedence over Rules 2 and 3.

Rule 2. Whenever possible, avoid configuring Expand-over-ServerNet line-handler processes in processors 0 and 1. Key system processes, such as $SYSTEM, can run only in processors 0 and 1. By reducing the number of processes using processors 0 and 1, you minimize the risk of key system processes undergoing a software halt. Because Rule 1 takes precedence over Rule 2, Rule 2 can be applied only in systems having three or more processor enclosures. In a system with two processor enclosures, the line-handler process pair must be configured following Rule 1; therefore, either the primary or the backup must run in processor 0 or 1.

Rule 3. Whenever possible, avoid configuring Expand-over-ServerNet line-handler processes in processors 8 through 15. These processors form the outer tetrahedron of the Tetra 16 topology. There are more ServerNet components (routers and links) along the paths from these processors to the external fabrics than along the paths that originate in the inner tetrahedron (processors 0 through 7). Consequently, these paths have a higher probability of failing as a result of hardware faults.

To Learn More About Expand-Over-ServerNet Line-Handler Processes

For more information about configuring Expand-over-ServerNet line-handler processes, refer to Section 1, ServerNet Cluster Description, or the Expand Configuration and Management Manual.

Task 10: Check for Problems

You can quickly verify the health of each node you have added to a ServerNet cluster by using the OSM Service Connection or TSM Service Application. If you find problems, HP recommends that you resolve them before adding new nodes to the cluster.

1. Using a system console connected to the system you added, log on to the system using the OSM Service Connection or TSM Service Application. For more information about logging on, refer to Appendix F, Common System Operations.

2. In the tree pane of the management window, click the Cluster tab (or the Cluster object in the OSM Service Connection).

3. Click the plus (+) sign next to the ServerNet Cluster resource so you can see each node or subcomponent of the ServerNet cluster. The tree pane displays ServerNet cluster resources.

4. Look for yellow or red icons over a resource:

• A yellow icon indicates that a resource is not in a normal operational state or contains subcomponents that are yellow or red. The resource might require operational intervention.
• A red icon indicates that a resource requires service.

5. In the tree pane, select a resource having a yellow or red icon, then check the attributes in the detail pane. For more information about displaying status information and attributes, refer to Section 5, Managing a ServerNet Cluster.

6. Check the Alarm Summary dialog box for alarms. See Section 7, Troubleshooting and Replacement Procedures, for information about using OSM or TSM alarms and troubleshooting problems.

Task 11: Add the Remaining Nodes to the Cluster

To add the remaining nodes to the cluster, repeat Tasks 1 through 10, skipping the tasks that you have already completed for all nodes.

Installing a ServerNet Cluster Using the Split-Star Topology

A ServerNet cluster using a split-star topology has up to two cluster switches per fabric and can support up to 16 nodes. You construct a split-star topology by installing the two star groups of the split-star as independent clusters and then connecting the clusters using the Add Switch guided procedure. Each of the two star groups uses one switch per fabric and supports up to eight nodes. Each star group is a valid subset of the split-star topology and can operate independently of the other star group if necessary.

Note. To install a valid subset of the split-star topology, use the following procedure, but skip the steps for constructing the unneeded star group and connecting the four-lane links.

Table 3-6 lists the tasks for installing a ServerNet cluster using a split-star topology.

Table 3-6. Task Summary for Installing a ServerNet Cluster Using a Split-Star Topology

Description                                                                          Page
Task 1: Decide Which Nodes Will Occupy the Two Star Groups of the Split-Star
Topology                                                                             3-25
Task 2: Route the Fiber-Optic Cables for the Four-Lane Links                         3-26
Task 3: Install the Two Star Groups of the Split-Star Topology                       3-26
Task 4: Use the Guided Procedure to Prepare to Join the Clusters                     3-27
Task 5: Connect the Four-Lane Links                                                  3-27
Task 6: Configure Expand-Over-ServerNet Lines for the Remote Nodes                   3-29
Task 7: Verify Operation of the Cluster Switches                                     3-29
Task 8: Verify Cluster Connectivity                                                  3-29

Task 1: Decide Which Nodes Will Occupy the Two Star Groups of the Split-Star Topology

Each star group of the split-star topology can support up to eight nodes. You can use the following form to plan which nodes will belong to each star group of the split-star topology.

Note. If the four-lane link connecting the X1/Y1 and X2/Y2 cluster switches exceeds 80 meters, all nodes in the cluster must meet the requirements in Table 2-4 on page 2-11.
Table 3-7. Planning for the Split-Star Topology

Cluster Switch  Port  ServerNet Node Number  Expand Node Number  System Name
X1/Y1           0     1                      _______             \______________
X1/Y1           1     2                      _______             \______________
X1/Y1           2     3                      _______             \______________
X1/Y1           3     4                      _______             \______________
X1/Y1           4     5                      _______             \______________
X1/Y1           5     6                      _______             \______________
X1/Y1           6     7                      _______             \______________
X1/Y1           7     8                      _______             \______________
X2/Y2           0     9                      _______             \______________
X2/Y2           1     10                     _______             \______________
X2/Y2           2     11                     _______             \______________
X2/Y2           3     12                     _______             \______________
X2/Y2           4     13                     _______             \______________
X2/Y2           5     14                     _______             \______________
X2/Y2           6     15                     _______             \______________
X2/Y2           7     16                     _______             \______________

Task 2: Route the Fiber-Optic Cables for the Four-Lane Links

The following tasks assume that you have routed (but not connected) the single-mode fiber-optic cables to be used for the four-lane links. Do not connect the four-lane links until you are instructed to do so by the guided procedure. If you have not already routed the cables, use the following steps:

1. Label both ends of each fiber-optic cable to be used for the four-lane links. On the label, include the fabric and cluster switch position (X1, for example) and the port number (8, for example) to which the cable will be connected.

2. If they are not already routed, route the fiber-optic ServerNet cables to be used for the four-lane links. For more information, refer to Section 2, Planning for Installation.

Task 3: Install the Two Star Groups of the Split-Star Topology

Use the procedure for Installing a ServerNet Cluster Using a Star Topology on page 3-1 to install two ServerNet clusters, each containing up to eight nodes. You can save time by configuring one cluster to use the X1/Y1 cluster switches and the other cluster to use the X2/Y2 cluster switches, but this practice is not required. The Add Switch guided procedure detects the cluster switch configuration and gives you an opportunity to change it before connecting the four-lane links.

Task 4: Use the Guided Procedure to Prepare to Join the Clusters

Before joining the two halves of the split-star topology, you must use the Add Switch guided procedure to verify that the hardware and software are ready. You must run the guided procedure at least once on both clusters. From the system console of any node connected to the cluster, for TSM choose Start>Programs>Compaq TSM>Guided Configuration Tools>Add Switch. For OSM, the procedure is launched from within the OSM Service Connection by performing the Replace action from the switch you want to replace. Online help is available to assist you in performing the procedure.

The guided procedure and its online help allow you to:

• Update the switch firmware files, if necessary
• Update the cluster switch configuration, if necessary
• Perform the hard reset action required to complete the configuration update
• Connect the four-lane links between the X1/Y1 and X2/Y2 cluster switches

When the guided procedure indicates that both clusters meet the requirements for connecting a remote cluster switch, you can connect the four-lane links.

Task 5: Connect the Four-Lane Links

Caution. Do not connect the four-lane links until you are instructed to do so by the guided procedure for adding a cluster switch (Add Switch).
Connecting a four-lane link between two cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster. Power cycling each node might be necessary to recover from such an outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic when cluster switches running the T0569AAA configuration are connected. Connecting two cluster switches running the T0569AAA configuration can create an invalidly configured split-star topology in which ServerNet packets sent by nodes 1 through 8 to nodes 9 through 16 loop indefinitely between the two cluster switches.

To connect the four-lane links between the X1/Y1 and X2/Y2 cluster switches:

1. Remove the black plugs from ports 8 through 11 of the double-wide PICs inside each cluster switch.

2. Remove the dust caps from the fiber-optic cable connectors.

Note. ServerNet II Switch ports 8 through 11 are keyed differently from ports 0 through 7. See Figure 3-3 on page 3-20. To connect the four-lane link cables, you must align the fiber-optic cable connector with the key on top.

3. One cable at a time, connect the cable ends. Table 3-8 shows the cable connections.

Note. To avoid generating an alarm, you must connect the four-lane links for both fabrics within four minutes. OSM or TSM incident analysis (IA) software eventually generates an alarm if one external fabric has two cluster switches while the other external fabric has only one. When a cluster switch is added to an external fabric, the IA checks the peer fabric to determine whether it also has two cluster switches. If, after four minutes, only one external fabric has two cluster switches, the IA generates a Missing Remote ServerNet Switch alarm. If, after four more minutes, the peer fabric still does not have two cluster switches, the IA dials out the Missing Remote ServerNet Switch alarm. If a second cluster switch is added to the peer fabric after the alarm is generated but before the alarm is dialed out, the alarm is deleted and is not dialed out.

Table 3-8. Cable Connections for the Four-Lane Links

Cluster Switch  Port  Connects to Cluster Switch  Port
X1              8     X2                          8
X1              9     X2                          9
X1              10    X2                          10
X1              11    X2                          11
Y1              8     Y2                          8
Y1              9     Y2                          9
Y1              10    Y2                          10
Y1              11    Y2                          11

4. Check the link-alive LED near each PIC port. The link-alive LEDs should light a few seconds after the cable is connected at both ends. If the link-alive LEDs do not light:

• Try reconnecting the cable, using care to align the key on the cable plug with the PIC connector.
• Make sure the dust caps have been removed from the cable ends.
• If possible, try connecting a different cable.

5. Continue until all of the cables are connected.

Task 6: Configure Expand-Over-ServerNet Lines for the Remote Nodes

Unless automatic line-handler configuration is enabled, the ServerNet nodes in each half of the split-star topology will not have Expand-over-ServerNet line-handler processes configured for the remote nodes in the other half of the split-star. You can use a guided configuration procedure to configure and start Expand-over-ServerNet lines between the two halves of the split-star topology. You must perform this procedure on all nodes in the cluster.
From the system console of the system you are adding, for TSM choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node. For OSM, the guided procedure is launched from within the OSM Service Connection by performing the Add Node to ServerNet Cluster action from the System object. Online help is available to assist you in performing the procedure.

Task 7: Verify Operation of the Cluster Switches

1. Log on to a node in the range of ServerNet node numbers 1 through 8, and use the OSM Service Connection or TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches.

2. Log on to a node in the range of ServerNet node numbers 9 through 16, and use the OSM Service Connection or TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches.

Task 8: Verify Cluster Connectivity

On all nodes in the newly merged cluster, use the SCF STATUS SUBNET $ZZSCL command to verify ServerNet cluster connectivity.

Note. Using the SCF STATUS SUBNET $ZZSCL command requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) on the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window on each node.
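As a minimal sketch of that shortcut: assuming remote passwords are configured, and using hypothetical system names \NODE1, \NODE2, and so on for the nodes of the cluster, the connectivity of every node can be checked from a single SCF session. Only the first few commands are shown; repeat the pattern for each remaining node and review each display for connectivity problems.

>SCF
-> STATUS SUBNET $ZZSCL
-> STATUS SUBNET \NODE1.$ZZSCL
-> STATUS SUBNET \NODE2.$ZZSCL
-> STATUS SUBNET \NODE3.$ZZSCL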
Installing a ServerNet Cluster Using the Tri-Star Topology

A ServerNet cluster using a tri-star topology has three cluster switches per fabric and can support up to 24 nodes. You construct a tri-star topology by installing the three star groups of the tri-star as independent clusters and then connecting the clusters using the Add Switch guided procedure. Each of the three star groups uses one switch per fabric and supports up to eight nodes. Each star group is a valid subset of the tri-star topology and can operate independently of the other star groups if necessary.

Note. The tri-star topology (three cluster switches per fabric) requires that all nodes run the G06.14 version of the operating system, or G06.13 with the Release 3 SPRs listed in Table 3-5 on page 3-15, because versions of the operating system earlier than G06.13 do not support the tri-star topology. In addition, the cluster switches must be loaded with the T0569AAE firmware and configuration.

Table 3-9 lists the tasks for installing a ServerNet cluster using a tri-star topology.

Table 3-9. Task Summary for Installing a ServerNet Cluster Using a Tri-Star Topology

Description                                                                          Page
Task 1: Decide Which Nodes Will Occupy the Three Star Groups of the Tri-Star
Topology                                                                             3-30
Task 2: Route the Fiber-Optic Cables for the Two-Lane Links                          3-32
Task 3: Install the Three Star Groups of the Tri-Star Topology                       3-32
Task 4: Use the Guided Procedure to Prepare to Merge the Clusters                    3-32
Task 5: Configure and Start Expand-Over-ServerNet Lines                              3-35
Task 6: Verify Operation of the Cluster Switches                                     3-35
Task 7: Verify Cluster Connectivity                                                  3-36

Task 1: Decide Which Nodes Will Occupy the Three Star Groups of the Tri-Star Topology

Each star group of the tri-star topology can support up to eight nodes. Use the following form to plan which nodes will belong to each star group.

Note. If the lengths of the two-lane links between the cluster switches are more than 1 kilometer, all nodes in the cluster must meet the requirements for cables up to 5 kilometers in Table 2-4 on page 2-11.

Table 3-10. Planning for the Tri-Star Topology

Cluster Switch  Port  ServerNet Node Number  Expand Node Number  System Name
X1/Y1           0     1                      _______             \______________
X1/Y1           1     2                      _______             \______________
X1/Y1           2     3                      _______             \______________
X1/Y1           3     4                      _______             \______________
X1/Y1           4     5                      _______             \______________
X1/Y1           5     6                      _______             \______________
X1/Y1           6     7                      _______             \______________
X1/Y1           7     8                      _______             \______________
X2/Y2           0     9                      _______             \______________
X2/Y2           1     10                     _______             \______________
X2/Y2           2     11                     _______             \______________
X2/Y2           3     12                     _______             \______________
X2/Y2           4     13                     _______             \______________
X2/Y2           5     14                     _______             \______________
X2/Y2           6     15                     _______             \______________
X2/Y2           7     16                     _______             \______________
X3/Y3           0     17                     _______             \______________
X3/Y3           1     18                     _______             \______________
X3/Y3           2     19                     _______             \______________
X3/Y3           3     20                     _______             \______________
X3/Y3           4     21                     _______             \______________
X3/Y3           5     22                     _______             \______________
X3/Y3           6     23                     _______             \______________
X3/Y3           7     24                     _______             \______________

Task 2: Route the Fiber-Optic Cables for the Two-Lane Links

The following tasks assume that you have routed (but not connected) the single-mode fiber-optic cables to be used for the two-lane links. (Do not connect the two-lane links until you are instructed to do so by the guided procedure.) If you have not already routed the cables:

1. Label both ends of each fiber-optic cable to be used for the two-lane links. On the label, include the fabric and cluster switch position (X1, for example) and the port number (8, for example) to which the cable will be connected.

2. Route the cables. For more information, refer to Section 2, Planning for Installation.

Task 3: Install the Three Star Groups of the Tri-Star Topology

Use the procedure for Installing a ServerNet Cluster Using a Star Topology on page 3-1 to install three ServerNet clusters, each containing one cluster switch per fabric and up to eight nodes. You can save time by configuring the clusters to use the X1/Y1, X2/Y2, and X3/Y3 cluster switches, but this practice is not required. The Add Switch guided procedure detects the cluster switch configuration and gives you an opportunity to change it before merging the clusters.

Task 4: Use the Guided Procedure to Prepare to Merge the Clusters

Before merging the three star groups of the tri-star topology, you must use the Add Switch guided procedure to verify that the hardware and software are ready.

1. On any node connected to one of the star groups, run the Add Switch guided procedure:

a. From the system console of that node, for TSM, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Add Switch. For OSM, the guided procedure is launched from within the OSM Service Connection by performing the Replace action from the Switch object.

b. In the Select Upgrade Topology dialog box, select the tri-star topology as the topology to which you want to upgrade.
c. In the Select Switch Configuration Tag dialog box, select one of the following configuration tags:

• Max 24 nodes, nodes 1-8 (0x10002)
• Max 24 nodes, nodes 9-16 (0x10003)
• Max 24 nodes, nodes 17-24 (0x10004)

d. Follow the dialog boxes to update the firmware and configuration for the X-fabric cluster switch if the guided procedure determines that the switch needs updating.

e. When prompted, update the firmware and configuration for the Y-fabric cluster switch. The guided procedure remembers the configuration tag you selected for the X-fabric cluster switch and uses it for the Y-fabric cluster switch.

f. When the guided procedure tells you to connect the cables for the X fabric, click Stop Task. You must not connect the two-lane link cables until all the cluster switches have been updated.

2. Repeat Step 1 on the second star group.

3. Repeat Step 1 on the third star group, but do not click Stop Task when the guided procedure tells you to connect the cables.

4. Connect the two-lane links on both fabrics using the fiber-optic cables. See Connecting the Two-Lane Links on page 3-33.

5. When the cables are connected, click Continue in the Connect the Cables dialog box.

6. In the Local Switch Information dialog box, click Test to verify the connection between the cluster switches. You must log on to nodes attached to at least two different star groups of the tri-star topology in order for the guided procedure to test all of the remote connections on both fabrics.

If you log on to a node attached to . . .   The guided procedure checks these remote connections . . .
X1/Y1                                       X1/Y1 to X2/Y2, and X1/Y1 to X3/Y3
X2/Y2                                       X2/Y2 to X1/Y1, and X2/Y2 to X3/Y3
X3/Y3                                       X3/Y3 to X1/Y1, and X3/Y3 to X2/Y2

7. Log on to a node attached to a different star group of the tri-star topology to test the other remote connections: run the Add Switch guided procedure again, and in the Local Switch Information dialog box, click the Test button to test the connectivity between the local and remote switches.

Connecting the Two-Lane Links

Do not connect the two-lane links until you are instructed to do so by the Add Switch guided procedure.

To connect the two-lane links:

1. Remove the black plugs from ports 8 through 11 on the double-wide PICs inside each switch enclosure.

2. Remove the dust caps from the fiber-optic cable connectors.

Note. Cluster switch ports 8, 9, 10, and 11 are keyed differently from ports 0 through 7. To connect the two-lane link cables, you must align the fiber-optic cable connector with the key on top. See Figure 3-3 on page 3-20.

3. One cable at a time, connect the cable ends. Table 3-11 shows the cable connections.

Note. To avoid generating an alarm, you must connect the two-lane links for both fabrics within eight minutes. The OSM or TSM incident analysis (IA) software eventually generates an alarm if both external fabrics do not have the same number of cluster switches. When a cluster switch is added to an external fabric, the IA checks the peer fabric to determine whether it has the same number of cluster switches. After eight minutes, if one external fabric has fewer cluster switches, the IA generates a Missing Remote ServerNet Switch alarm. After eight more minutes, if that fabric still has fewer cluster switches, the IA dials out the Missing Remote ServerNet Switch alarm.
If you equalize the number of cluster switches on both fabrics after the alarm is generated but before it is dialed out, the alarm is deleted and is not dialed out.

Table 3-11. Cable Connections for the Two-Lane Links

Cluster Switch  Port  Connects to Cluster Switch  Port
X1              8     X2                          10
X1              9     X2                          11
X1              10    X3                          8
X1              11    X3                          9
X2              8     X3                          10
X2              9     X3                          11
Y1              8     Y2                          10
Y1              9     Y2                          11
Y1              10    Y3                          8
Y1              11    Y3                          9
Y2              8     Y3                          10
Y2              9     Y3                          11

4. Check the link-alive LED near each PIC port. The link-alive LEDs should light a few seconds after each cable is connected at both ends. If the link-alive LEDs do not light:

• Try reconnecting the cable, using care to align the key on the cable plug with the PIC connector.
• Make sure the dust caps are removed from the cable ends.
• If possible, try connecting a different cable.

5. Continue until all of the cables are connected.

Task 5: Configure and Start Expand-Over-ServerNet Lines

To configure and start Expand-over-ServerNet lines between the three star groups of the tri-star topology, use the guided procedure for configuring a ServerNet node. Unless the automatic line-handler configuration feature is enabled, you must run the guided procedure on every node in order to configure and start Expand-over-ServerNet lines to all other nodes. From the system console of that node, for TSM, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node. For OSM, the guided procedure is launched from within the OSM Service Connection by performing the Add Node to ServerNet Cluster action from the System object. Online help is available to assist you in performing the procedure.

Task 6: Verify Operation of the Cluster Switches

Use the OSM Service Connection or TSM Service Application to verify the operation of the cluster switches:

1. Log on to a node in the range of ServerNet node numbers 1 through 8, and verify the operation of the local X-fabric and Y-fabric cluster switches (X1 and Y1).

2. Log on to a node in the range of ServerNet node numbers 9 through 16, and verify the operation of the local X-fabric and Y-fabric cluster switches (X2 and Y2).

3. Log on to a node in the range of ServerNet node numbers 17 through 24, and verify the operation of the local X-fabric and Y-fabric cluster switches (X3 and Y3).

Task 7: Verify Cluster Connectivity

Use the SCF STATUS SUBNET $ZZSCL, PROBLEMS command to make sure direct ServerNet communication is possible between all nodes connected to the cluster switches:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

Note. To obtain information about individual fabrics, you can use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window for each node.
4. Upgrading a ServerNet Cluster

This section describes how to upgrade a ServerNet cluster by installing new software, adding cluster switches, or both. You need to upgrade a ServerNet cluster when the cluster cannot accept any more ServerNet nodes. Adding cluster switches usually changes the topology of a cluster. In some cases, before changing the topology of the cluster, you must upgrade the software running on each node and the firmware and configuration loaded in each cluster switch.

Note. You can also use the OSM Service Connection instead of the TSM Service Application. For more information, see Appendix H, Using OSM to Manage the Star Topologies.

This section also includes fallback procedures in case anything goes wrong during the upgrade.

Note. This section assumes that you already have one or more ServerNet clusters installed and want to upgrade software or change the topology. If you are building a ServerNet cluster for the first time, refer to Section 3, Installing and Configuring a ServerNet Cluster. If you need to do any of the following, refer to Section 6, Adding or Removing a Node:

• Add a node to a cluster
• Remove a node from a cluster so you can add the node to another cluster
• Split a large cluster into multiple smaller clusters without installing new cluster switches

This section contains:

• Benefits of Upgrading on page 4-2
• Planning Tasks for Upgrading a ServerNet Cluster on page 4-4
• Upgrading Software to Obtain G06.12 Functionality on page 4-17
• Fallback for Upgrading Software to Obtain G06.12 Functionality on page 4-26
• Upgrading Software to Obtain G06.14 Functionality on page 4-34
• Fallback for Upgrading Software to Obtain G06.14 Functionality on page 4-50
• Merging Clusters to Create a Split-Star Topology on page 4-54
• Fallback for Merging Clusters to Create a Split-Star Topology on page 4-66
• Merging Clusters to Create a Tri-Star Topology on page 4-68
• Fallback for Merging Clusters to Create a Tri-Star Topology on page 4-89
• Reference Information on page 4-93

Benefits of Upgrading

The three major releases of the ServerNet cluster product are as follows:

Table 4-1. ServerNet Cluster Releases

ServerNet Cluster Release  Introduced With RVU  Supports
Release 1                  G06.09               Up to 8 nodes
Release 2                  G06.12               Up to 16 nodes
Release 3                  G06.14               Up to 24 nodes

Depending on your current software, you can upgrade to either ServerNet cluster release 2 or release 3 functionality.

Benefits of Upgrading to G06.12 (Release 2) Functionality

Upgrading to G06.12, or applying the SPRs listed in Table 4-6 on page 4-8 to G06.09, G06.10, or G06.11, adds functions including:

• The ability to update the ServerNet II Switch firmware and configuration
• Support for the star and split-star topologies, including up to 16 nodes in a cluster using the split-star topology
• New TSM actions
• New guided procedures
• New SCF commands for the external ServerNet SAN manager process (SANMAN)
• New SCF commands for the ServerNet cluster monitor process (SNETMON)

The new functionality enhances ServerNet cluster manageability and facilitates further upgrades. Installing new software also provides defect repair.

Note. Unless otherwise indicated in this section, information describing G06.12 software also applies to G06.13. No significant changes to ServerNet cluster software were made for G06.13; G06.12 and G06.13 provide identical functions for clustering.
However, TSM client software version 2001C is required for G06.13.

Benefits of Upgrading to G06.14 (Release 3) Functionality

Upgrading to G06.14, or upgrading to G06.13 and applying the SPRs listed in Table 4-6 on page 4-8, adds the functions listed below.

Note. Upgrading the operating system to G06.13 or a later G-series RVU is required only if you need to configure a tri-star topology. Otherwise, you might be able to apply the Release 3 (G06.14) SPRs to G06.09 through G06.12 to obtain defect repair and enhancements such as automatic fail-over for the split-star topology. For the availability of these SPRs on earlier RVUs, check with your HP representative.

• Support for the star, split-star, and tri-star topologies, including up to 24 nodes in a cluster using the tri-star topology
• Automatic link fail-over for the split-star and tri-star topologies
• New TSM actions
• New guided procedures
• New SCF command options for the external ServerNet SAN manager process (SANMAN)
• New SCF command options for the ServerNet cluster monitor process (SNETMON)

Benefits of Upgrading to G06.16 Functionality

G06.16 includes SPRs that provide significant defect repair. Table 4-2 lists the G06.16 ServerNet cluster SPRs. However, these SPRs do not add new topologies or new commands and do not constitute a new ServerNet cluster release. For more information, refer to the softdoc for each SPR.

Table 4-2. G06.16 SPRs

Software Component                                                   G06.16 SPR Level
SANMAN                                                               T0502AAH
SNETMON/MSGMON                                                       T0294AAH
ServerNet II Switch firmware and configuration files (on the server) T0569AAF
Subsystem Control Facility (SCF)                                     T9082ACT
TSM server software                                                  T7945ABB
TSM client software                                                  T8154ABB (TSM 2002B)

Planning Tasks for Upgrading a ServerNet Cluster

The four tasks in planning to upgrade a ServerNet cluster are as follows:

• Task 1: Identify the Current Topology on page 4-4
• Task 2: Choose the Topology That You Want to Upgrade To on page 4-6
• Task 3: Fill Out the Planning Worksheet on page 4-9
• Task 4: Select an Upgrade Path on page 4-12

Task 1: Identify the Current Topology

To identify the supported upgrade paths for a cluster, you must know the topology and the number of cluster switches currently used by the cluster. The topology refers to the physical layout of components in a network. In a ServerNet cluster, the topology defines the maximum size of the cluster and how cluster switches on the same fabric are interconnected. To identify the topology, use SCF or the TSM Service Application to check the configuration tag for all cluster switches on both fabrics.

Note. Counting the number of nodes or the number of cluster switches is not always an accurate way to determine the topology. Subsets of the split-star and tri-star topologies can be constructed that contain the same number of cluster switches and nodes as a cluster using the star topology. If your cluster has multiple cluster switches per fabric, it uses either the split-star or the tri-star topology. If three cluster switches per fabric are used, you can be certain the cluster uses the tri-star topology. If only one cluster switch per fabric is present, the cluster might use any topology.

• Using SCF:

1. Log on to any node in the cluster.
2. Use the SCF INFO SWITCH command:

>SCF INFO SWITCH $ZZSMN
3. Compare the Configuration Tag information with the supported configuration tags shown in Table 4-3.

• Using the TSM Service Application:

1. Log on to any node in the cluster.
2. In the Management window, click the Cluster tab.
3. Expand the External_ServerNet fabric resource.
4. Click the X or Y switch resource to select it.
5. In the Attributes pane, check the Configuration Tag attribute.
6. To identify the topology, compare the attribute value with the supported configuration tags shown in Table 4-3.

Table 4-3. Supported Configuration Tags

Topology               SCF Display              TSM/Guided Procedures Display
Star                   0x00010000*              Max 8 nodes, nodes 1-8 (0x10000)
Split-star             0x00010000*              Max 16 nodes, nodes 1-8 (0x10000)
                       0x00010001               Max 16 nodes, nodes 9-16 (0x10001)
Tri-star               0x00010002               Max 24 nodes, nodes 1-8 (0x10002)
                       0x00010003               Max 24 nodes, nodes 9-16 (0x10003)
                       0x00010004               Max 24 nodes, nodes 17-24 (0x10004)
Manufacturing Default  (Blank) or 0x00011111    Manufacturing Default (0x11111)

*A cluster using the split-star topology with only one cluster switch per fabric uses the same configuration tag as a cluster using the star topology if the cluster switch occupies position 1. In that case, you can identify the topology by checking the configuration revision: a star topology cluster switch has a configuration revision of 0_0, and a split-star topology cluster switch has a configuration revision of 1_7 or higher. See Table 4-9 on page 4-12.

Task 2: Choose the Topology That You Want to Upgrade To

To choose the upgrade topology, decide the maximum number of nodes that the upgraded cluster will need to support. This number should include anticipated future growth in the cluster. For maximum scalability, HP recommends upgrading to the tri-star topology or a subset of the tri-star topology.

Note. Section 2, Planning for Installation, includes additional considerations for choosing a topology.

If the cluster needs to support . . .   Choose this topology . . .
Up to 16 nodes                          Split-star
Up to 24 nodes                          Tri-star

Table 4-4 shows the supported topologies and topology subsets, including the ServerNet node numbers used by each combination. You can expand a topology online, but upgrading from one topology to another requires temporarily disrupting ServerNet connectivity one fabric at a time.

Table 4-4. Supported Topologies, Cluster Switch Positions, and ServerNet Node Numbers

Topology    Full Topology or Subset  Cluster Switches Per Fabric  Cluster Switches         Supports ServerNet Node Numbers
Star        Full                     1                            X1/Y1                    1 through 8
Split-star  Full                     2                            X1/Y1 and X2/Y2          1 through 16
Split-star  Subset                   1                            X1/Y1                    1 through 8
Split-star  Subset                   1                            X2/Y2                    9 through 16
Tri-star    Full                     3                            X1/Y1, X2/Y2, and X3/Y3  1 through 24
Tri-star    Subset                   2                            X1/Y1 and X2/Y2          1 through 16
Tri-star    Subset                   2                            X1/Y1 and X3/Y3          1 through 8 and 17 through 24
Tri-star    Subset                   2                            X2/Y2 and X3/Y3          9 through 24
Tri-star    Subset                   1                            X1/Y1                    1 through 8
Tri-star    Subset                   1                            X2/Y2                    9 through 16
Tri-star    Subset                   1                            X3/Y3                    17 through 24

Table 4-5 compares the topologies.
Task 2: Choose the Topology That You Want to Upgrade To

To choose the upgrade topology, decide the maximum number of nodes that the upgraded cluster will need to support. This number should include anticipated future growth in the cluster. For maximum scalability, HP recommends upgrading to the tri-star topology or a subset of the tri-star topology.

Note. Section 2, Planning for Installation includes additional considerations for choosing a topology.

If the cluster needs to support . . .   Choose this topology
Up to 16 nodes                          Split-star
Up to 24 nodes                          Tri-star

Table 4-4 shows the supported topologies and topology subsets, including the ServerNet node numbers used by each combination. You can expand a topology online, but upgrading from one topology to another requires temporarily disrupting ServerNet connectivity one fabric at a time.

Table 4-4. Supported Topologies, Cluster Switch Positions, and ServerNet Node Numbers

            Full Topology  Cluster Switches                            Supports ServerNet
Topology    or Subset      Per Fabric        Cluster Switches          node numbers . . .
Star        Full           1                 X1/Y1                     1 through 8
Split-star  Full           2                 X1/Y1 and X2/Y2           1 through 16
            Subset         1                 X1/Y1                     1 through 8
            Subset         1                 X2/Y2                     9 through 16
Tri-star    Full           3                 X1/Y1, X2/Y2, and X3/Y3   1 through 24
            Subset         2                 X1/Y1 and X2/Y2           1 through 16
            Subset         2                 X1/Y1 and X3/Y3           1 through 8 and 17 through 24
            Subset         2                 X2/Y2 and X3/Y3           9 through 24
            Subset         1                 X1/Y1                     1 through 8
            Subset         1                 X2/Y2                     9 through 16
            Subset         1                 X3/Y3                     17 through 24

Table 4-5 compares the topologies.

Table 4-5. Comparison of ServerNet Cluster Topologies

                                    Star               Split-Star            Tri-Star
Introduced With ServerNet
Cluster Release                     Release 1          Release 2             Release 3
Introduced With RVU                 G06.09             G06.12                G06.14
Supported on RVUs                   G06.09 and later   G06.09(1), G06.10(1), G06.13(1), G06.14,
                                    G-series RVUs      G06.11(1), G06.12,    G06.15, G06.16
                                                       G06.13, G06.14,
                                                       G06.15, G06.16
Maximum Number of Nodes Supported   8                  16                    24
Cluster Switches Per Fabric         1                  1 or 2                1, 2, or 3
Total Cluster Switches              2                  2 or 4                2, 4, or 6
Star Groups(2)                      1                  1 or 2                1, 2, or 3

1. SPRs are required for support on this RVU. See Table 4-6 on page 4-8.
2. A star group consists of an X and a Y cluster switch and up to eight connected ServerNet nodes. A split-star topology consists of two star groups. A tri-star topology consists of three star groups.

Table 4-6 shows the SPRs required to obtain G06.12 (release 2) and G06.14 (release 3) functionality.

Note. G06.14 functionality includes support for the tri-star topology but also supports the star and split-star topologies and provides significant defect repair.

Table 4-6. SPRs for G06.12 and G06.14 ServerNet Cluster Functionality

Software Component                    Release 2 (G06.12 equivalent)     Release 3 (G06.14 equivalent)
SANMAN                                T0502AAE                          T0502AAG
SNETMON/MSGMON                        See Table 4-28 on page 4-95.(1)   T0294AAG
ServerNet II Switch firmware and
configuration files (on the server)   T0569AAB                          T0569AAE
Service processor (SP) firmware       T1089ABB                          T1089ABC
Subsystem Control Facility (SCF)      T9082ACQ                          T9082ACR
TSM server software                   T7945AAW                          T7945AAY
TSM client software                   T8154AAW (TSM 2001B(2))           T8154AAY (TSM 2001D(3))

1. SNETMON/MSGMON (T0294) is included for completeness. Upgrading SNETMON/MSGMON is not required to obtain G06.12 functionality. See Considerations for Upgrading SNETMON/MSGMON and the Operating System on page 4-95.
2. Provided on the NonStop S-series System Console Installer 13.0 CD.
3. Provided on the NonStop S-series System Console Installer 15.0 CD.

Task 3: Fill Out the Planning Worksheet

To help you plan for upgrading software, use Table 4-7 on page 4-10 to record the SPR levels of ServerNet cluster software on all nodes in the cluster. Table 4-7 accommodates an eight-node cluster. If your cluster contains more than eight nodes, you can make copies of the worksheet.

The following tables help you gather information about each node:

Table                      Function
Table 2-9 on page 2-24     Describes how to check the SPR levels for ServerNet cluster software
Table 2-10 on page 2-25    Lists version procedure (VPROC) information for ServerNet cluster software
Table 4-8 on page 4-12     Allows you to determine the SPR level of the running firmware on a cluster switch
Table 4-9 on page 4-12     Allows you to determine the SPR level of the running configuration on a cluster switch

Table 4-7. Upgrade Planning Worksheet

                                       Node ___     Node ___     Node ___     Node ___
System Name                            \_________   \_________   \_________   \_________
Release Version Update (RVU)           G06._____    G06._____    G06._____    G06._____
SANMAN                                 T0502____    T0502____    T0502____    T0502____
SNETMON/MSGMON                         T0294____    T0294____    T0294____    T0294____
ServerNet II Switch firmware and
configuration (file version)           T0569____    T0569____    T0569____    T0569____
ServerNet II Switch firmware and
configuration (running version)        T0569____    T0569____    T0569____    T0569____
Service processor (SP) firmware        T1089____    T1089____    T1089____    T1089____
SCF                                    T9082____    T9082____    T9082____    T9082____
TSM Server                             T7945____    T7945____    T7945____    T7945____
TSM Client                             T8154____    T8154____    T8154____    T8154____

                                       Node ___     Node ___     Node ___     Node ___
System Name                            \_________   \_________   \_________   \_________
Release Version Update (RVU)           G06._____    G06._____    G06._____    G06._____
SANMAN                                 T0502____    T0502____    T0502____    T0502____
SNETMON/MSGMON                         T0294____    T0294____    T0294____    T0294____
ServerNet II Switch firmware and
configuration (file version)           T0569____    T0569____    T0569____    T0569____
ServerNet II Switch firmware and
configuration (running version)        T0569____    T0569____    T0569____    T0569____
Service processor (SP) firmware        T1089____    T1089____    T1089____    T1089____
SCF                                    T9082____    T9082____    T9082____    T9082____
TSM Server                             T7945____    T7945____    T7945____    T7945____
TSM Client                             T8154____    T8154____    T8154____    T8154____

Checking SPR Levels or Version Procedure Information for ServerNet Cluster Software

Table 2-9 on page 2-24 shows how to check the current SPR levels for ServerNet cluster software. However, some ServerNet cluster software components earlier than G06.12 omit the SPR level from their version procedure information. In these cases, see Table 2-10 on page 2-25 for the version procedure dates that identify the currently installed SPR.

Checking the Revisions of the Running Firmware and Configuration

Before you upgrade, HP recommends that you check the revisions of the firmware and configuration that are running in a cluster switch. Normally, the running firmware and configuration match the SPR level of the file versions of the firmware and configuration. If there is a mismatch, make sure you understand the reasons for the mismatch before upgrading.

• Using SCF:
  1. Log on to a node connected to the cluster switch.
  2. Type INFO SWITCH $ZZSMN. In the SCF output, check the Firmware Revision and Configuration Revision rows. For the firmware revisions, see Table 4-8. For the configuration revisions, see Table 4-9.

• Using the TSM Service Application:
  1. Log on to a node that is connected to the cluster switch.
  2. In the Management window, click the Cluster tab.
  3. Click the + sign next to the External ServerNet Fabric resource to display the Switch resource.
  4. Click the Switch resource.
  5. Click the Attributes tab to display the Switch attributes.
  6. Note the values for the Firmware Version and Configuration Version attributes. For the firmware revisions, see Table 4-8. For the configuration revisions, see Table 4-9.

Table 4-8. T0569 Firmware Revisions

T0569 SPR   Firmware Revision   Supported Topologies
T0569AAA    2_0_21              Star
T0569AAB    3_0_52              Star and split-star
T0569AAE    3_0_81              Star, split-star, and tri-star
T0569AAF    3_0_82              Star, split-star, and tri-star

Table 4-9. SCF and TSM Display of T0569 Configuration Revisions

                                 SCF displays the                  TSM and guided
T0569 SPR   Topology             configuration revision as . . .   procedures display . . .
T0569AAA    Star                 0x00000000                        0_0
T0569AAB    Star or split-star   0x00010007                        1_7
T0569AAE    Star or split-star   0x0001000b                        1_11
            Tri-star             0x00020005                        2_5
T0569AAF    Star or split-star   0x0001000b                        1_11
            Tri-star             0x00020005                        2_5
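As a worked example of this revision check (the output excerpt is hypothetical; the display layout on your system may differ), suppose INFO SWITCH reports:

   >SCF INFO SWITCH $ZZSMN

     (hypothetical excerpt)
     Firmware Revision.......... 3_0_81
     Configuration Revision..... 0x0001000b

Reading these values against Table 4-8 and Table 4-9: firmware revision 3_0_81 corresponds to T0569AAE, and configuration revision 0x0001000b (displayed by TSM as 1_11) is the T0569AAE star or split-star configuration. A mismatch between these running levels and the T0569 file version recorded in Table 4-7 is exactly the kind of discrepancy to resolve before upgrading.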
Task 4: Select an Upgrade Path

Based on your cluster's current topology and the desired upgrade topology, use one of the following tables to identify the supported upgrade paths and the upgrade steps:

For clusters currently using the . . .   See
Star topology                            Table 4-10 on page 4-13
Split-star topology                      Table 4-11 on page 4-14
Tri-star topology                        Table 4-12 on page 4-16

Table 4-10 lists the supported upgrade paths for clusters using the star topology.

Table 4-10. Supported Upgrade Paths for Clusters Using the Star Topology

A star topology using one cluster switch per fabric can be upgraded to:

• Star topology, 1 cluster switch per fabric. Refer to one of the following:
  • Upgrading Software to Obtain G06.12 Functionality on page 4-17
  • Upgrading Software to Obtain G06.14 Functionality on page 4-34
  These upgrade paths upgrade software from release 1 (G06.09) or release 2 (G06.12) to release 2 or release 3 (G06.14 or superseding SPRs) without changing the topology or the number of cluster switches per fabric.

• Split-star topology, 1 cluster switch per fabric. Refer to one of the following:
  • Upgrading Software to Obtain G06.12 Functionality on page 4-17
  • Upgrading Software to Obtain G06.14 Functionality on page 4-34

• Split-star topology, 2 cluster switches per fabric:
  1. Upgrade the cluster, referring to Upgrading Software to Obtain G06.12 Functionality on page 4-17.
  2. Merge the cluster with another cluster using the star topology. Refer to Merging Clusters to Create a Split-Star Topology on page 4-54.

• Tri-star topology, 1 cluster switch per fabric. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34.

• Tri-star topology, 2 or 3 cluster switches per fabric:
  1. Upgrade the cluster, referring to Upgrading Software to Obtain G06.14 Functionality on page 4-34.
  2. Merge the cluster with another cluster using the star topology. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

Table 4-11 lists the supported upgrade paths for clusters using the split-star topology.

Table 4-11. Supported Upgrade Paths for Clusters Using the Split-Star Topology

A split-star topology using one cluster switch per fabric can be upgraded to:

• Split-star topology, 1 cluster switch per fabric. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34. This option upgrades the software from release 2 (G06.12) to release 3 (G06.14 or superseding SPRs) without changing the topology or the number of cluster switches per fabric.

• Split-star topology, 2 cluster switches per fabric. Refer to Merging Clusters to Create a Split-Star Topology on page 4-54.

• Tri-star topology, 1 cluster switch per fabric. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34.

• Tri-star topology, 2 cluster switches per fabric:
  1. Upgrade the cluster, referring to Upgrading Software to Obtain G06.14 Functionality on page 4-34.
  2. Merge the cluster with another cluster using the star topology. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

• Tri-star topology, 3 cluster switches per fabric:
  1. Upgrade the cluster, referring to Upgrading Software to Obtain G06.14 Functionality on page 4-34.
  2. Merge the cluster with another cluster using the split-star topology. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

A split-star topology using two cluster switches per fabric can be upgraded to:

• Split-star topology, 2 cluster switches per fabric. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34. This option upgrades the software from release 2 (G06.12) to release 3 (G06.14 or superseding SPRs) without changing the topology or the number of cluster switches per fabric.

• Tri-star topology, 2 cluster switches per fabric. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34.

• Tri-star topology, 3 cluster switches per fabric:
  1. Upgrade the cluster, referring to Upgrading Software to Obtain G06.14 Functionality on page 4-34.
  2. Merge the cluster with another cluster using the star topology. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

Table 4-12 lists the supported upgrade paths for clusters using the tri-star topology.

Table 4-12. Supported Upgrade Paths for Clusters Using the Tri-Star Topology

A tri-star topology using one cluster switch per fabric can be upgraded to:

• Tri-star topology, 2 cluster switches per fabric. Merge the cluster with another cluster using one switch per fabric. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

• Tri-star topology, 3 cluster switches per fabric. Merge the cluster with two star topologies or a split-star topology. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

A tri-star topology using two cluster switches per fabric can be upgraded to:

• Tri-star topology, 3 cluster switches per fabric. Merge the cluster with another cluster using one switch per fabric. Refer to Merging Clusters to Create a Tri-Star Topology on page 4-68.

No upgrade path is currently supported for a tri-star topology having three cluster switches per fabric. You can add ServerNet nodes up to a maximum of 24, but you cannot add cluster switches.
Upgrading Software to Obtain G06.12 Functionality

This upgrade begins with a ServerNet cluster consisting of up to eight nodes running the G06.09, G06.10, or G06.11 RVU. The upgrade:

• Installs new versions of the software listed in Table 4-13 on all nodes
• Upgrades the firmware and configuration in the cluster switches
• Does not change the cluster topology
• Does not require upgrading to the G06.12 RVU unless you want to upgrade for defect repair or new functionality

Note. HP recommends upgrading to the latest software whenever possible. See Upgrading Software to Obtain G06.14 Functionality on page 4-34. For availability of the release 3 (G06.14) SPRs on G06.09 and later G-series RVUs, check with your service provider.

Following the upgrade, you can change the topology (add cluster switches) to support up to 16 nodes without upgrading the ServerNet II Switch firmware and configuration.

Table 4-13 summarizes this upgrade.

Table 4-13. Upgrade Summary: Upgrading Software to Obtain G06.12 Functionality

                              Before the Upgrade                        After the Upgrade
Max. Nodes Supported          8                                         8
Cluster Switches Per Fabric   1                                         1
NonStop Kernel Operating
System Release                G06.09, G06.10, or G06.11                 G06.09, G06.10, G06.11, or G06.12
ServerNet II Switch Firmware  Empty firmware files (T0569)              T0569AAB (G06.12 equivalent)
                              or T0569AAA
SANMAN Version                T0502 or T0502AAA (G06.09)                T0502AAE (G06.12 equivalent)
TSM Server Version            T7945AAS (G06.09), T7945AAT (G06.10),     T7945AAW (G06.12 equivalent)
                              or T7945AAV (G06.11)
TSM Client Version            10.0 (G06.09), 2000A (G06.10),            2001B (G06.12)
                              or 2001A (G06.11)
SNETMON/MSGMON Version        T0294 (G06.09 or G06.10)                  T0294* (G06.09 or G06.10),
                              or T0294AAB (G06.11)                      T0294AAB (G06.11), or
                                                                        T0294AAE (G06.12)
Service Processor (SP)        T1089AAX (G06.09) or T1089AAZ             T1089ABB (G06.12 equivalent)
Version                       (G06.09, G06.10, or G06.11)
SCF                           T9082ACN                                  T9082ACQ (G06.12 equivalent)

*T0294AAA (a Class D SPR) can also be applied to G06.09 and G06.10.

Figure 4-1 shows an example of a four-node ServerNet cluster before and after a software upgrade without a system load.

Figure 4-1. Example of Upgrading Software for a Four-Node Cluster Without a System Load

[Figure: four NonStop Himalaya S-series servers running G06.09 through G06.11 with their original SPR levels, connected to X1 and Y1 cluster switches loaded with T0569AAA. After the software upgrade, the same nodes keep their RVUs but run the G06.12-equivalent SPRs (T0502AAE, T7945AAW, T1089ABB, T9082ACQ, T0569AAB), with SNETMON/MSGMON (T0294) left at its original level, and the X1 and Y1 cluster switches are loaded with T0569AAB.]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.
Figure 4-2 shows an example of a four-node ServerNet cluster before and after a software upgrade with system loads.

Figure 4-2. Example of Upgrading Software for a Four-Node Cluster With System Loads

[Figure: four NonStop Himalaya S-series servers running G06.09 through G06.11 with their original SPR levels, connected to X1 and Y1 cluster switches loaded with T0569AAA. After the software upgrade, all four nodes run G06.12 with the matching SPR set (T0502AAE, T7945AAW, T0294AAE, T1089ABB, T9082ACQ, T0569AAB), and the X1 and Y1 cluster switches are loaded with T0569AAB.]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.

You can upgrade in two ways:

• Upgrading Software Without System Loads to Obtain G06.12 Functionality on page 4-21
• Upgrading Software With System Loads to Obtain G06.12 Functionality on page 4-25

Caution. HP recommends that you have access to a spare cluster switch before starting any upgrade procedure that includes a firmware or configuration change.

Upgrading Software Without System Loads to Obtain G06.12 Functionality

This procedure upgrades the TSM client on all system consoles and applies the following SPRs to all nodes:

• SANMAN
• ServerNet II Switch firmware
• SP firmware
• SCF
• TSM server

Then one of the nodes is used to download the T0569AAB firmware and configuration to the cluster switches. Upgrading the nodes to G06.12 is optional.

1. On the system consoles for all nodes in the cluster, upgrade the TSM client software to Compaq 2001B or a later version of the TSM client. For more information, refer to the NonStop System Console Installer Guide.

2. On all nodes in the cluster, complete steps a through h (a command sketch of steps b through g follows this step):

   a. Use DSM/SCM to apply the following SPRs (and their requisite SPRs). For more information about the SPRs, refer to Table 4-6 on page 4-8.

      Note. This procedure does not include T0294AAE, which is optional. For information about upgrading SNETMON/MSGMON, refer to the softdoc.

      • T0502AAE
      • T0569AAA (see note)
      • T0569AAB (see note)
      • T1089ABB
      • T9082ACQ
      • T7945AAW

      Note. The following considerations apply to T0569AAA and T0569AAB:

      • T0569AAA is included in addition to T0569AAB so that it is available in the archive for fallback purposes.
      • Because of time constraints unique to each production environment, installing these SPRs sometimes cannot be accomplished all at once for every node in a cluster. You might need to continue using a star topology while the SPRs are applied to some—but not all—nodes. In this scenario, any node to which the T0569AAB SPR is applied generates a TSM alarm indicating that the node has a newer version of T0569 than the version running on the cluster switch (T0569AAA). The presence of this alarm can prompt some users to download T0569AAB to the cluster switches before the SPRs have been applied to all nodes. The repair action for the alarm warns you not to download T0569AAB to the cluster switches until the SPRs are applied to all nodes, but some users might not see the warning. To avoid alarms generated on ServerNet nodes during a migration procedure that extends over a long period of time, apply all of the preceding SPRs except T0569AAB. When all of the SPRs except T0569AAB have been applied to all nodes and you are ready to download T0569AAB to the cluster switches, apply T0569AAB to all nodes.

   b. Abort the SANMAN process:
      >SCF ABORT PROCESS $ZZKRN.#ZZSMN

      Note. The TSM Service Application cannot display the Cluster tab for a node on which $ZZKRN.#ZZSMN (SANMAN) has been aborted.

   c. Abort the TSM server process:
      >SCF ABORT PROCESS $ZZKRN.#TSM-SRM

   d. Run ZPHIRNM to perform the rename step.

   e. Restart the TSM server process:
      >SCF START PROCESS $ZZKRN.#TSM-SRM

   f. Use the TSM Service Application to update the SP firmware. For the detailed steps, refer to the online help.

   g. Restart the SANMAN process:
      >SCF START PROCESS $ZZKRN.#ZZSMN

   h. Use the TSM Service Application to check the ServerNet cluster status. For information about using TSM, refer to Section 5, Managing a ServerNet Cluster.
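The following is a minimal sketch of steps 2b through 2g, using only the commands given above. It assumes the standard $ZZKRN process names used in this manual and the TACL "==" comment convention; because ZPHIRNM and the SP firmware update are interactive, treat this as an ordered checklist to issue from a TACL prompt rather than a true obey file.

   SCF ABORT PROCESS $ZZKRN.#ZZSMN     == 2b: abort SANMAN (TSM Cluster tab unavailable until restart)
   SCF ABORT PROCESS $ZZKRN.#TSM-SRM   == 2c: abort the TSM server process
   RUN ZPHIRNM                         == 2d: rename step (interactive; run from your DSM/SCM subvolume)
   SCF START PROCESS $ZZKRN.#TSM-SRM   == 2e: restart the TSM server process
   == 2f: update the SP firmware through the TSM Service Application
   SCF START PROCESS $ZZKRN.#ZZSMN     == 2g: restart SANMAN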
3. On all nodes, use SCF to make sure direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:
   >SCF STATUS SUBNET $ZZSCL

   Note. Using the SCF STATUS SUBNET $ZZSCL command requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

   SCF STATUS SUBNET \REMOTE.$ZZSCL

   This command eliminates the need to establish a logon window for each node.

4. Select a node whose system console can be used to download the ServerNet II Switch firmware and configuration. This will be the downloader node.

5. On the downloader node:

   Caution. All nodes attached to a cluster switch whose firmware and configuration will be updated with T0569AAB or T0569AAE must be running a version of the operating system that is compatible with the T0569 SPR to be downloaded. Any nodes that do not meet these requirements might experience permanent loss of Expand traffic across the cluster switch when the new firmware and configuration are loaded on the cluster switch. Table 4-32 on page 4-99 describes the NonStop Kernel requirements for ServerNet nodes connected to cluster switches that will be updated.

   a. Use the Firmware Update action of the TSM Service Application to download the T0569AAB firmware from the server to the X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

      Note. In cluster switches running the T0569AAA firmware and configuration preloaded at the factory, the fault LED can turn on when firmware is downloaded. This is normal, and the fault LED will turn off when the new configuration is loaded.

   b. Use the Configuration Update action of the TSM Service Application to download the T0569AAB configuration from the server to the X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99. This scenario does not change the ServerNet node number range for the cluster switch.

      Note. Downloading the firmware and configuration disrupts ServerNet connectivity through the X-fabric cluster switch temporarily.

   c. Use the TSM Service Application to verify that the X-fabric cluster switch is operational.

   d. Use the SCF STATUS SUBNET $ZZSCL command on all nodes to verify that direct ServerNet connectivity has been restored on the X fabric (a per-node sweep sketch follows this procedure).

      Note. Direct ServerNet connectivity is automatically restored after an interval of approximately 50 seconds times the number of nodes in the cluster (25 seconds for nodes running G06.14 or a later G-series RVU). For faster (but manual) recovery of ServerNet connectivity, use the SCF START SERVERNET \REMOTE.$ZSNET.FABRIC.* command on all affected nodes after you have verified by using the TSM Service Application that the X-fabric cluster switch is operational. The START SERVERNET command works only for nodes running G06.12 or a later G-series RVU. For nodes running G06.09 through G06.11, you can use the following commands:

      SCF STOP SUBSYS $ZZSCL
      SCF START SUBSYS $ZZSCL

      Stopping and restarting the ServerNet cluster subsystem destroys and restarts interprocessor connectivity between the node that receives the commands and all other nodes over both external fabrics.

6. Repeat Step 5 for the Y-fabric cluster switch.

7. If upgrading to a new RVU, perform the following optional steps:

   a. Perform a system load of G06.12 on the downloader node. For more information about migrating to G06.12, refer to the following manuals:
      • Interactive Upgrade Guide
      • G06.12 Software Installation and Upgrade Guide

   b. When convenient, perform a system load of G06.12 on the remaining nodes one at a time.
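Step 5d (like step 3) must be confirmed for every node. If remote passwords are configured, one way to sweep the cluster from a single TACL session is the remote form of the command shown above; the node names \N1 through \N4 are placeholders for your Expand node names.

   == Verify both external fabrics from one session (placeholder node names)
   SCF STATUS SUBNET \N1.$ZZSCL
   SCF STATUS SUBNET \N2.$ZZSCL
   SCF STATUS SUBNET \N3.$ZZSCL
   SCF STATUS SUBNET \N4.$ZZSCL
   == Manual recovery on a G06.12 or later node that has not reconnected
   == (FABRIC stands for X or Y, as in the note above):
   SCF START SERVERNET \N1.$ZSNET.FABRIC.*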
Upgrading Software With System Loads to Obtain G06.12 Functionality

This procedure instructs you to shut down the cluster and upgrade each node to the G06.12 RVU or add SPRs to obtain G06.12 functionality. Then you rebuild the cluster using the installation procedures in Section 3, Installing and Configuring a ServerNet Cluster.

1. Remove all nodes from the cluster by using the procedure for removing a node in Section 6, Adding or Removing a Node. After shutting down software processes, be sure to disconnect the cables that link the MSEBs in group 01 to the X-fabric and Y-fabric cluster switches.

2. Upgrade each node to G06.12, or apply SPRs to obtain G06.12 clustering functionality:

   • To upgrade to the G06.12 RVU, refer to the following manuals:
     • Interactive Upgrade Guide
     • G06.12 Software Installation and Upgrade Guide

   • To apply the SPRs, refer to Steps 1 through 3 of Upgrading Software Without System Loads to Obtain G06.12 Functionality on page 4-21.

3. When the software on all nodes has been upgraded, rebuild the ServerNet cluster using the procedures in Section 3, Installing and Configuring a ServerNet Cluster.

   Note. HP recommends that you update the firmware and configuration of the cluster switches to T0569AAB before connecting the upgraded nodes. The installation tasks in Section 3 include steps for updating the ServerNet II Switch firmware and configuration.

Fallback for Upgrading Software to Obtain G06.12 Functionality

The fallback procedure you use depends on whether or not your system can be stopped temporarily for a system load:

• Fallback for Upgrading ServerNet Cluster Software Without a System Load to Obtain G06.12 Functionality on page 4-26
• Fallback for Upgrading ServerNet Cluster Software With System Loads to Obtain G06.12 Functionality on page 4-30

Fallback for Upgrading ServerNet Cluster Software Without a System Load to Obtain G06.12 Functionality

Use this procedure if you need to restore the ServerNet II Switch firmware and configuration files to an earlier version (G06.09, for example). This procedure uses one of the nodes to restore the old firmware and configuration files. The other nodes can continue operating as members of the cluster.

Choose a node from which to reconfigure the cluster switches, and use the system console for that node to complete the following steps:

1. If the cluster is connected in a split-star topology, follow one of the procedures in Fallback for Merging Clusters to Create a Split-Star Topology on page 4-66 to disconnect the four-lane links.

   Caution. When the four-lane links have been disconnected, they should remain disconnected unless the cluster switch needs to be added to another cluster. In that case, do not connect the four-lane link until you are instructed to do so by the Add Switch guided procedure. Connecting the four-lane link between two cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster if the NNA Version is 5. (FCO 39746B replaced Version 5 NNAs with NNA Version 22.) Power cycling each node might be necessary to recover from this outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic if cluster switches running the T0569AAA configuration are connected. Connecting two cluster switches running the T0569AAA configuration can create an invalidly configured split-star topology in which ServerNet packets sent by nodes 1 through 8 to nodes 9 through 16 loop indefinitely between the two cluster switches.

2. Use DSM/SCM to retrieve T0569AAA from the archive so that you have access to the M6770 and M6770CL firmware and configuration files.

3. Use the Configuration Update action of the TSM Service Application to download the T0569AAA M6770CL configuration to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

   Note. Note the following considerations for downloading the T0569AAA configuration:

   • Downloading the configuration disrupts ServerNet communications across the X-fabric cluster switch.
   • The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 (0x10000) in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.
4. Use the Firmware Update action of the TSM Service Application to download the T0569AAA M6770 firmware file to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

5. Use the TSM Service Application to perform a hard reset of the nearest X-fabric cluster switch:

   a. From the Cluster tab, click the plus (+) sign next to the External ServerNet X Fabric resource to display the Switch resource.
   b. Right-click the Switch resource, and select Actions. The Actions dialog box appears.
   c. In the Actions dialog box, choose Hard Reset, and click Perform Action.

6. Use the TSM Service Application to verify the operation of the X-fabric cluster switch.

7. Use the SCF STATUS SUBNET $ZZSCL command on all nodes to verify that direct ServerNet connectivity has been restored on the X fabric:
   >SCF STATUS SUBNET $ZZSCL

   Note. Direct ServerNet connectivity is automatically restored after an interval of approximately 50 seconds times the number of nodes in the cluster (25 seconds for nodes running G06.14 or a later G-series RVU). For faster (but manual) recovery of ServerNet connectivity, use the SCF START SERVERNET \REMOTE.$ZSNET.FABRIC.* command on all affected nodes after you have verified by using the TSM Service Application that the X-fabric cluster switch is operational. The START SERVERNET command works only for nodes running G06.12 or a later G-series RVU. For nodes running G06.09 through G06.11, the following commands can be used:

   SCF STOP SUBSYS $ZZSCL
   SCF START SUBSYS $ZZSCL

   Stopping and restarting the ServerNet cluster subsystem destroys and restarts interprocessor connectivity between the node that receives the commands and all other nodes over both external fabrics.

8. Use the Configuration Update action of the TSM Service Application to download the T0569AAA M6770CL configuration to the nearest Y-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

   Note. Note the following considerations for downloading the T0569AAA configuration:

   • Downloading the configuration disrupts ServerNet communications across the Y-fabric cluster switch.
   • The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.

9. Use the Firmware Update action of the TSM Service Application to download the T0569AAA M6770 firmware file to the nearest Y-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

   Note. If the cluster switch is currently running T0569AAA firmware, updating the configuration can change the fabric setting. The fabric setting (X or Y) will change to a default setting (X) when the T0569AAA configuration is loaded in the switch. The T0569AAA configuration normally should be loaded on the cluster switch only when a configuration fallback is being performed. If the cluster switch is installed on the external Y fabric, you need to reconfigure the fabric setting in the cluster switch back to Y after the fallback. Until you perform this action, ServerNet traffic through the cluster switch will remain disabled. You can use the Set LED to { X | Y } Side action in the TSM Service Application to correct the fabric setting. If the cluster switch is currently running T0569AAB firmware, the configured fabric setting (X or Y) remains unchanged when either the T0569AAA (fallback) or the T0569AAB configuration is loaded in the cluster switch.

10. Use the TSM Service Application to perform a hard reset of the nearest Y-fabric cluster switch:

    a. From the Cluster tab, click the plus (+) sign next to the External ServerNet Y Fabric resource to display the Switch resource.
    b. Right-click the Switch resource and select Actions. The Actions dialog box appears.
    c. In the Actions dialog box, choose Hard Reset, and click Perform Action.

11. Use the TSM Service Application to verify the operation of the Y-fabric cluster switch.

12. On all nodes, use SCF to make sure direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:
    >SCF STATUS SUBNET $ZZSCL

    Note. Using SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

    SCF STATUS SUBNET \REMOTE.$ZZSCL

    This command eliminates the need to establish a logon window for each node.

13. On all nodes connected to the cluster switch, use SCF to ensure that the ServerNet node numbers used by the MSEB port and ServerNet II Switch port are consistent:
    >SCF STATUS CONN $ZZSMN

    In the SCF display, check that the "SvNet Node Number" values for the MSEB port and switch port are the same. If the values are not the same, use one of the following recovery measures (see the sketch following this procedure):

    • Use the SCF PRIMARY PROCESS $ZZSMN command to force a takeover of the SANMAN process in the problem node.
    • Disconnect and reconnect the MSEB-to-switch fiber-optic cable on the affected fabric.

14. If the requisite SPRs are to be removed, make sure T0569AAA has been downloaded to all cluster switches prior to removing the requisite SPRs. G06.12 contains T0502AAE and T0569AAB. If necessary, you can use T0569AAA with G06.12 instead of T0569AAB, but only in a star topology, which supports up to eight ServerNet nodes.

15. If falling back to an earlier version of the operating system (G06.09, for example), perform the following optional steps:

    a. Shut down any nodes for which a fallback to an earlier version of the operating system is desired.
    b. On the nodes that were shut down, load the system using the down-rev operating system.
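The node-number consistency check in step 13 and its recovery can be sketched as follows. The commands are the ones this manual uses; the interpretation of the display is as described in the step above.

   SCF STATUS CONN $ZZSMN        == compare the "SvNet Node Number" values
                                 == reported for the MSEB port and the switch port
   == If the two values differ on a node, either:
   SCF PRIMARY PROCESS $ZZSMN    == force a SANMAN takeover on that node
   == ...or disconnect and reconnect the MSEB-to-switch fiber-optic cable
   == on the affected fabric.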
Fallback for Upgrading ServerNet Cluster Software With System Loads to Obtain G06.12 Functionality

Use this procedure if you need to restore the ServerNet II Switch firmware and configuration files after upgrading the nodes to the G06.12 SUT:

1. If the cluster is connected in a split-star topology, follow one of the procedures in Fallback for Merging Clusters to Create a Split-Star Topology on page 4-66 to disconnect the four-lane links.

   Caution. When the four-lane links have been disconnected, they should remain disconnected unless the cluster switch needs to be added to another cluster. In that case, do not connect the four-lane links until you are instructed to do so by the Add Switch guided procedure. Connecting a four-lane link between two cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster if the NNA Version is 5. (FCO 39746B replaced Version 5 NNAs with NNA Version 22.) Power cycling each node might be necessary to recover from this outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic if cluster switches running the T0569AAA configuration are connected. Connecting two cluster switches running the T0569AAA configuration can create an invalidly configured split-star topology in which ServerNet packets sent by nodes 1 through 8 to nodes 9 through 16 loop indefinitely between the two cluster switches.

2. Select a node to load down-rev firmware and configuration files on the cluster switches. Use this downloader node to restore the down-rev firmware and configuration files to the X- and Y-fabric cluster switches:

   a. Use DSM/SCM to retrieve T0569AAA from the archive, replacing the M6770 and M6770CL firmware and configuration files in $SYSTEM.SYSnn.

   b. Use the Configuration Update action of the TSM Service Application to download the T0569AAA M6770CL configuration to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

      Note. Note the following considerations for downloading the T0569AAA configuration:

      • Downloading the configuration disrupts ServerNet communications across the X-fabric cluster switch.
      • The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 (0x10000) in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.

   c. Use the Firmware Update action of the TSM Service Application to download the T0569AAA M6770 firmware file to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

   d. Use the TSM Service Application to perform a hard reset of the nearest X-fabric cluster switch:

      1. From the Cluster tab, click the plus (+) sign next to the External ServerNet X Fabric resource to display the Switch resource.
      2. Right-click the Switch resource and select Actions. The Actions dialog box appears.
      3. In the Actions dialog box, choose Hard Reset, and click Perform Action.

   e. Use the TSM Service Application to verify the operation of the X-fabric cluster switch.

   f. Use SCF to make sure that direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:
      >SCF STATUS SUBNET $ZZSCL, PROBLEMS

      Note. T0294AAG (or a superseding SPR) must be applied in order for you to use the PROBLEMS option. If the PROBLEMS option is not available, use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

      SCF STATUS SUBNET \REMOTE.$ZZSCL

      This command eliminates the need to establish a logon window for each node.

   g. Use the Configuration Update action of the TSM Service Application to download the T0569AAA M6770CL configuration to the nearest Y-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

      Note. Note the following considerations for downloading the T0569AAA configuration:

      • Downloading the configuration disrupts ServerNet communications across the Y-fabric cluster switch.
      • The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.

   h. Use the Firmware Update action of the TSM Service Application to download the T0569AAA M6770 firmware file to the nearest Y-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

      Note. If the cluster switch is currently running T0569AAA firmware, updating the configuration can change the fabric setting. The fabric setting (X or Y) will change to a default setting (X) when the T0569AAA configuration is loaded in the switch. The T0569AAA configuration normally should be loaded on the cluster switch only when a configuration fallback is being performed. If the cluster switch is installed on the external Y fabric, you need to reconfigure the fabric setting in the cluster switch back to Y after the fallback. Until you perform this action, ServerNet traffic through the cluster switch remains disabled. You can use the Set LED to { X | Y } Side action in the TSM Service Application to correct the fabric setting. If the cluster switch is currently running T0569AAB firmware, the configured fabric setting (X or Y) remains unchanged when either the T0569AAA (fallback) or the T0569AAB configuration is loaded in the cluster switch.

   i. Use the TSM Service Application to perform a hard reset of the nearest Y-fabric cluster switch:

      1. From the Cluster tab, click the plus (+) sign next to the External ServerNet Y Fabric resource to display the Switch resource.
      2. Right-click the Switch resource and select Actions. The Actions dialog box appears.
      3. In the Actions dialog box, choose Hard Reset, and click Perform Action.

   j. Use the TSM Service Application to verify the operation of the Y-fabric cluster switch.

   k. On all nodes, use SCF to make sure that direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:
      >SCF STATUS SUBNET $ZZSCL

      Note. Using the SCF STATUS SUBNET $ZZSCL command requires T0294AAA or a superseding SPR.
      If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

      SCF STATUS SUBNET \REMOTE.$ZZSCL

      This command eliminates the need to establish a logon window for each node.

   l. On all nodes connected to the cluster switch, use SCF to ensure that the ServerNet node numbers used by the MSEB port and ServerNet II Switch port are consistent:
      >SCF STATUS CONN $ZZSMN

      In the SCF display, check the "SvNet Node Number" values for the MSEB port and switch port to make sure they are the same. If the values are not the same, use one of the following recovery measures:

      • Use the SCF PRIMARY PROCESS $ZZSMN command to force a takeover of the SANMAN process in the problem node.
      • Disconnect and reconnect the MSEB-to-switch fiber-optic cable on the affected fabric.

   m. Perform a system load of the down-rev operating system on the downloader node.

3. Perform a system load of the down-rev operating system on the next node, and continue until you have loaded the down-rev operating system on all desired nodes.
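Both fallback procedures end with the same connectivity verification. A minimal sketch of its two forms, with their SPR prerequisites from the notes above (\REMOTE is a placeholder node name):

   == Preferred form; requires T0294AAG or a superseding SPR:
   SCF STATUS SUBNET $ZZSCL, PROBLEMS
   == Otherwise (requires T0294AAA or a superseding SPR), check each node,
   == locally or remotely if remote passwords are configured:
   SCF STATUS SUBNET $ZZSCL
   SCF STATUS SUBNET \REMOTE.$ZZSCL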
Upgrading Software to Obtain G06.14 Functionality

This section contains the following procedures:

Procedure                                    Use this procedure if . . .
Upgrading Software Without System Loads      You want to obtain G06.14 functionality, but you do
to Obtain G06.14 Functionality on page 4-35  not need to create a tri-star topology.
Upgrading Software With System Loads to      You need to build a tri-star topology and your cluster
Obtain G06.14 Functionality on page 4-42     currently contains some G06.09 through G06.12
                                             nodes. To use the tri-star topology, all nodes in a
                                             cluster must be running either G06.13 with SPRs or
                                             G06.14 or a later G-series RVU.

Caution. HP recommends that you have a spare cluster switch on site or have ready access to a spare cluster switch before starting any upgrade procedure that includes a firmware or configuration change.

Upgrading Software Without System Loads to Obtain G06.14 Functionality

This procedure allows you to take advantage of defect repair and enhancements such as automatic fail-over for the split-star topology. However, if the cluster contains G06.09 through G06.12 nodes, this procedure does not prepare the cluster for upgrading to a tri-star topology. To prepare a cluster for upgrading to a tri-star topology, refer to Upgrading Software With System Loads to Obtain G06.14 Functionality on page 4-42.

This upgrade:

• Begins with a ServerNet cluster consisting of up to 16 nodes (star or split-star topologies) running G06.09, G06.10, G06.11, G06.12, or G06.13
• Upgrades the TSM client software on all system consoles
• Installs new versions of the software listed in Table 4-6 on page 4-8
• Uses one of the nodes (or two nodes for a split-star topology) to download the T0569AAE or superseding firmware and configuration to the cluster switches
• Does not change the cluster topology

Table 4-14 summarizes this upgrade.

Table 4-14. Upgrade Summary: Upgrading Software Without System Loads to Obtain G06.14 Functionality

                              Before the Upgrade                       After the Upgrade
Max. Nodes Supported          8 (star topology) or                     8 (star topology) or
                              16 (split-star topology)                 16 (split-star topology)
Cluster Switches Per Fabric   1 or 2                                   1 or 2
NonStop Kernel Operating
System Release                G06.09, G06.10, G06.11,                  Same (with SPRs)
                              G06.12, or G06.13
ServerNet II Switch Firmware  T0569 (empty files), T0569AAA,           T0569AAE (or superseding)
                              or T0569AAB
SANMAN Version                T0502, T0502AAA (G06.09),                T0502AAG* (or superseding)
                              or T0502AAE (G06.12)
TSM Server Version            T7945AAS (G06.09), T7945AAT (G06.10),    T7945AAY (or superseding)
                              T7945AAV (G06.11), T7945AAW (G06.12),
                              or T7945AAX (G06.13)
TSM Client Version            10.0 (G06.09), 2000A (G06.10),           2001D (or a later client version)
                              2001A (G06.11), 2001B (G06.12),
                              or 2001C (G06.13)
SNETMON/MSGMON Version        T0294 (G06.09 or G06.10),                Refer to Table 4-28 on page 4-95.
                              T0294AAB (G06.11), or
                              T0294AAE (G06.12 or G06.13)
Service Processor (SP)        T1089AAX (G06.09), T1089AAZ              T1089ABC (or superseding)
Version                       (G06.09, G06.10, or G06.11), or
                              T1089ABB (G06.12 or G06.13)
SCF                           T9082ACN or T9082ACQ                     T9082ACQ (for G06.09 through
                                                                       G06.13) or T9082ACR

*T0502AAG might not be immediately supported on all RVUs. Before upgrading, check with your HP representative to ensure that all of the SPRs you need are available.

Figure 4-3 on page 4-37 shows an example of a four-node ServerNet cluster before and after a software upgrade without system loads to obtain G06.14 functionality.

Figure 4-3. Example of Upgrading Software Without System Loads to Obtain G06.14 Functionality

[Figure: four NonStop Himalaya S-series servers running G06.09 through G06.12 with mixed SPR levels, connected to X1 and Y1 cluster switches loaded with T0569AAB. After the software upgrade, the nodes keep their RVUs but run the release 3 SPRs (T0502AAG, T7945AAY, T1089ABC, T9082ACQ, T0569AAE, with T0294AAG where the RVU supports it), and the X1 and Y1 cluster switches are loaded with T0569AAE.]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.
Steps for Upgrading Software Without System Loads to Obtain G06.14 Functionality

1. On the system consoles for all nodes in the cluster, upgrade the TSM client software to Compaq 2001D or a later version of the TSM client. For more information, refer to the NonStop System Console Installer Guide.

2. On all nodes in the cluster (steps b through p are collected into a command sketch after this step):

   a. Use DSM/SCM to apply the following SPRs (or superseding SPRs) and their requisite SPRs. For more information about the SPRs, refer to Table 4-6 on page 4-8 or the softdoc for the SPR.

      SPR        Notes
      T0502AAG   For availability of this SPR on G06.09 through G06.12 RVUs, contact your HP representative.
      T0294AAG   This SPR can be applied only to nodes running G06.12 or a later G-series RVU. For more information about T0294 compatibility, refer to Considerations for Upgrading SNETMON/MSGMON and the Operating System on page 4-95.
      T0569AAA   If you are upgrading software for a star topology (G06.09, G06.10, or G06.11 without SPRs), you must include T0569AAA so that it is available in the archive for fallback purposes. For more information, refer to T0569AAA Firmware and Configuration Files on page 4-101.
      T0569AAE   See the following note.
      T1089ABC   None.
      T7945AAY   None.
      T9082ACR   None.

      Note. Because of time constraints unique to each production environment, installing these SPRs sometimes cannot be accomplished all at once for every node in a cluster. You might need to continue using a star or split-star topology while the SPRs are applied to some—but not all—nodes. In this scenario, any node to which the T0569AAE SPR is applied generates a TSM alarm indicating that the node has a newer version of T0569 than the version running on the cluster switch (T0569AAA or T0569AAB). The presence of this alarm can prompt some users to download T0569AAE to the cluster switches before the SPRs have been applied to all nodes. The repair action for the alarm warns you not to download T0569AAE to the cluster switches until the SPRs are applied to all nodes, but some users might not see the warning. To avoid alarms generated on ServerNet nodes during a migration procedure that extends over a long period of time, apply all of the preceding SPRs except T0569AAE. When all of the SPRs except T0569AAE have been applied to all nodes and you are ready to download T0569AAE to the cluster switches, apply T0569AAE to all nodes.

   b. Shut down any applications using Expand-over-ServerNet connections between the node and the rest of the cluster.

   c. Abort all Expand-over-ServerNet lines on the node for remote nodes in the cluster. On remote nodes, abort the Expand-over-ServerNet line for the node receiving the SPRs:
      >SCF ABORT LINE $SCxxx

   d. Stop the ServerNet cluster subsystem:
      >SCF STOP SUBSYS $ZZSCL

   e. Abort the SNETMON process:
      >SCF ABORT PROCESS $ZZKRN.#ZZSCL

   f. Abort the SANMAN process:
      >SCF ABORT PROCESS $ZZKRN.#ZZSMN

      Note. The TSM Service Application cannot display the Cluster tab for a node on which $ZZKRN.#ZZSMN (SANMAN) has been aborted.

   g. Abort all MSGMON processes:
      >SCF ABORT PROCESS $ZZKRN.#MSGMON

   h. Abort the TSM server process:
      >SCF ABORT PROCESS $ZZKRN.#TSM-SRM

   i. Run ZPHIRNM to perform the rename step.

   j. Restart the TSM server process:
      >SCF START PROCESS $ZZKRN.#TSM-SRM

   k. Use the TSM Service Application to update the SP firmware. For the detailed steps, refer to the online help.

   l. Restart the MSGMON processes:
      >SCF START PROCESS $ZZKRN.#MSGMON

   m. Restart the SANMAN process:
      >SCF START PROCESS $ZZKRN.#ZZSMN

   n. Restart the SNETMON process:
      >SCF START PROCESS $ZZKRN.#ZZSCL

   o. Restart the ServerNet cluster subsystem:
      >SCF START SUBSYS $ZZSCL

   p. Restart all Expand-over-ServerNet lines:
      >SCF START LINE $SCxxx

   q. Use the TSM Service Application to check the ServerNet cluster status. For information about using TSM, refer to Section 5, Managing a ServerNet Cluster.
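Steps 2b through 2p stop and restart the clustering software in a specific order: Expand lines first, then the subsystem, then the monitor processes, with the restarts mirroring the stops. Collected as an obey-file-style sketch (ZPHIRNM and the SP firmware update are interactive, so treat this as an ordered checklist; $SCxxx is this manual's placeholder for each Expand-over-ServerNet line name):

   SCF ABORT LINE $SCxxx               == 2c: each Expand-over-ServerNet line, local and remote
   SCF STOP SUBSYS $ZZSCL              == 2d: stop the ServerNet cluster subsystem
   SCF ABORT PROCESS $ZZKRN.#ZZSCL     == 2e: abort SNETMON
   SCF ABORT PROCESS $ZZKRN.#ZZSMN     == 2f: abort SANMAN
   SCF ABORT PROCESS $ZZKRN.#MSGMON    == 2g: abort all MSGMON processes
   SCF ABORT PROCESS $ZZKRN.#TSM-SRM   == 2h: abort the TSM server process
   RUN ZPHIRNM                         == 2i: rename step (interactive)
   SCF START PROCESS $ZZKRN.#TSM-SRM   == 2j: restart the TSM server process
   == 2k: update the SP firmware through the TSM Service Application
   SCF START PROCESS $ZZKRN.#MSGMON    == 2l: restart MSGMON
   SCF START PROCESS $ZZKRN.#ZZSMN     == 2m: restart SANMAN
   SCF START PROCESS $ZZKRN.#ZZSCL     == 2n: restart SNETMON
   SCF START SUBSYS $ZZSCL             == 2o: restart the ServerNet cluster subsystem
   SCF START LINE $SCxxx               == 2p: restart each Expand-over-ServerNet line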
6. Repeat Step 5 for the Y-fabric cluster switch.

7. If you are upgrading a cluster that uses a split-star topology, repeat Step 5 and Step 6 using the X2/Y2 downloader node.

8. Restart any applications using Expand-over-ServerNet connections between the nodes in the cluster.
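For a sense of scale for the automatic recovery described in the note under step 5d: an eight-node cluster of pre-G06.14 nodes can take roughly 50 x 8 = 400 seconds (close to seven minutes) to restore direct connectivity, versus about 200 seconds if all nodes run G06.14 or later. The faster manual recovery is sketched below for a hypothetical three-node cluster \A, \B, and \C; in the command shown in the note, FABRIC stands for the external fabric being recovered (the X fabric in step 5d), and the node names here are assumptions for illustration.

>SCF START SERVERNET \A.$ZSNET.FABRIC.*
>SCF START SERVERNET \B.$ZSNET.FABRIC.*
>SCF START SERVERNET \C.$ZSNET.FABRIC.*

For nodes running G06.09 through G06.11, use the STOP SUBSYS and START SUBSYS pair shown in the note instead, keeping in mind that it interrupts interprocessor connectivity between that node and all other nodes on both external fabrics.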
Upgrading Software With System Loads to Obtain G06.14 Functionality

This upgrade:

• Begins with a ServerNet cluster consisting of up to 16 nodes (star or split-star topologies) running any of the following RVUs: G06.09, G06.10, G06.11, G06.12, or G06.13
• Upgrades the TSM client software on all system consoles
• Migrates the operating system on all nodes to G06.13 (with SPRs) or a later G-series RVU (without SPRs)

Note. Upgrading the operating system to G06.13 or a later G-series RVU is required only if you need to configure a tri-star topology. Otherwise, you might be able to apply the Release 3 (G06.14) SPRs to G06.09 through G06.12 without a system load. For availability of these SPRs on earlier RVUs, check with your HP representative.

• Installs new versions of the software listed in Table 4-6 on page 4-8 on all G06.13 nodes
• Upgrades the firmware and configuration in the cluster switches
• Does not change the cluster topology

Following the upgrade, you can change the topology (add cluster switches) to support up to 24 nodes without installing new software or performing a system load.

Table 4-15 summarizes this upgrade.

Table 4-15. Upgrade Summary: Upgrading Software With System Loads to Obtain G06.14 Functionality

Max. Nodes Supported
  Before: 8 (star topology) or 16 (split-star topology)
  After: 8 (star topology), 16 (split-star topology), or 24 (tri-star topology)
Cluster Switches Per Fabric
  Before: 1 or 2
  After: 1, 2, or 3
NonStop Kernel Operating System Release
  Before: G06.09, G06.10, G06.11, G06.12, or G06.13
  After: G06.13 (with SPRs) or G06.14*
ServerNet II Switch Firmware
  Before: T0569 (empty files), T0569AAA, or T0569AAB
  After: T0569AAE (or superseding)
SANMAN Version
  Before: T0502, T0502AAA (G06.09), or T0502AAE (G06.12)
  After: T0502AAG (or superseding)
TSM Server Version
  Before: T7945AAS (G06.09), T7945AAT (G06.10), T7945AAV (G06.11), T7945AAW (G06.12), or T7945AAX (G06.13)
  After: T7945AAY (or superseding)
TSM Client Version
  Before: 10.0 (G06.09), 2000A (G06.10), 2001A (G06.11), 2001B (G06.12), or 2001C (G06.13)
  After: 2001D (G06.14) or a later client version
SNETMON/MSGMON Version
  Before: T0294 (G06.09 or G06.10), T0294AAB (G06.11), or T0294AAE (G06.12 or G06.13)
  After: Refer to Table 4-28 on page 4-95.
Service Processor (SP) Version
  Before: T1089AAX (G06.09), T1089AAZ (G06.09, G06.10, or G06.11), or T1089ABB (G06.12 or G06.13)
  After: T1089ABC (or superseding)
SCF
  Before: T9082ACN or T9082ACQ
  After: T9082ACQ (for G06.09 through G06.13), T9082ACR (for G06.14 or G06.15), or T9082ACT (for G06.16)

*G06.13 or a later G-series RVU is required only if you need to construct a tri-star topology. The G06.14 SPRs can be applied to other RVUs. For availability of the SPRs on G06.09 through G06.12, check with your HP representative. See Upgrading Software Without System Loads to Obtain G06.14 Functionality on page 4-35.

Figure 4-4 shows an example of a four-node ServerNet cluster before and after a software upgrade to obtain G06.14 functionality. This upgrade requires a system load to migrate to G06.13 or G06.14 unless a node is already running G06.13, in which case SPRs can be applied.

Figure 4-4. Example of Upgrading Software With System Loads to Obtain G06.14 Functionality

[Figure not reproduced. It shows four NonStop Himalaya S-series servers and the X1/Y1 cluster switches before the upgrade (nodes running G06.10 through G06.12 with the corresponding SPR levels; T0569AAB loaded on the cluster switches) and after the upgrade (nodes running G06.13 or G06.14 with T0502AAG, T7945AAY, T0294AAG, T1089ABC, T9082ACR, and T0569AAE; T0569AAE loaded on the cluster switches).]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.

Steps for Upgrading Software With System Loads to Obtain G06.14 Functionality

To perform the upgrade:
1. Upgrade any nodes running G06.12 or an earlier RVU to G06.13 or a later G-series RVU. For information about migrating to a new RVU, refer to:

• Interactive Upgrade Guide
• G06.xx Software Installation Guide

Note. Upgrading the operating system to G06.13 or a later G-series RVU is required only if you need to configure a tri-star topology. Otherwise, you might be able to apply the Release 3 (G06.14 or superseding) SPRs to G06.09 through G06.12 without a system load. Check with your HP representative for availability of these SPRs on earlier RVUs. See Upgrading Software Without System Loads to Obtain G06.14 Functionality on page 4-35.

2. Unless you have already done so as part of an operating system upgrade, upgrade the TSM client software to TSM 2001D (or a later version of the client software) on the system consoles of all nodes. For more information, refer to the NonStop System Console Installer Guide.

3. On all G06.13 nodes, apply the SPRs required for G06.14 functionality:

a. Use DSM/SCM to apply the following (or superseding) SPRs and their requisite SPRs. For more information about the SPRs, refer to Table 4-6 on page 4-8 or the softdoc for each SPR.

SPR       Notes
T0502AAG  For availability of this SPR on G06.09 through G06.12 RVUs, contact your HP representative.
T0294AAG  This SPR can be applied only to nodes running G06.12 or a later G-series RVU. For more information about T0294 compatibility, refer to Considerations for Upgrading SNETMON/MSGMON and the Operating System on page 4-95.
T0569AAA  If you are upgrading software for a star topology (G06.09, G06.10, or G06.11 without SPRs), you must include T0569AAA so that it is available in the archive for fallback purposes. For more information, refer to T0569AAA Firmware and Configuration Files on page 4-101.
T0569AAE  See the following note.
T1089ABC  None.
T7945AAY  None.
T9082ACR  None.

Note. Because of time constraints unique to each production environment, installing these SPRs sometimes cannot be accomplished all at once for every node in a cluster. You might need to continue using a star or split-star topology while the SPRs are applied to some—but not all—nodes. In this scenario, any node to which the T0569AAE SPR is applied generates a TSM alarm indicating that the node has a newer version of T0569 than the version running on the cluster switch (T0569AAA or T0569AAB). The presence of this alarm can prompt some users to download T0569AAE to the cluster switches before the SPRs have been applied to all nodes. The repair action for the alarm warns you not to download T0569AAE to the cluster switches until the SPRs are applied to all nodes, but some users might not see the warning. To avoid alarms generated on ServerNet nodes during a migration procedure that extends over a long period of time, apply all of the preceding SPRs except T0569AAE. When all of the SPRs except T0569AAE have been applied to all nodes and you are ready to download T0569AAE to the cluster switches, apply T0569AAE to all nodes.

b. Shut down any applications using Expand-over-ServerNet connections between the node and the rest of the cluster.

c. Abort all Expand-over-ServerNet lines on the node for remote nodes in the cluster. On remote nodes, abort the Expand-over-ServerNet line for the node receiving the SPRs.

>SCF ABORT LINE $SCxxx
d. Stop the ServerNet cluster subsystem:

>SCF STOP SUBSYS $ZZSCL

e. Abort the SNETMON process:

>SCF ABORT PROCESS $ZZKRN.#ZZSCL

f. Abort the SANMAN process:

>SCF ABORT PROCESS $ZZKRN.#ZZSMN

Note. The TSM Service Application cannot display the Cluster tab for a node on which $ZZKRN.#ZZSMN (SANMAN) has been aborted.

g. Abort all MSGMON processes:

>SCF ABORT PROCESS $ZZKRN.#MSGMON

h. Abort the TSM server process:

>SCF ABORT PROCESS $ZZKRN.#TSM-SRM

i. Run ZPHIRNM to perform the rename step.

j. Restart the TSM server process:

>SCF START PROCESS $ZZKRN.#TSM-SRM

k. Use the TSM Service Application to update the SP firmware. For the detailed steps, refer to the online help.

l. Restart the MSGMON processes:

>SCF START PROCESS $ZZKRN.#MSGMON

m. Restart the SANMAN process:

>SCF START PROCESS $ZZKRN.#ZZSMN

n. Restart the SNETMON process:

>SCF START PROCESS $ZZKRN.#ZZSCL

o. Restart the ServerNet cluster subsystem:

>SCF START SUBSYS $ZZSCL

p. Restart all Expand-over-ServerNet lines:

>SCF START LINE $SCxxx

q. Use the TSM Service Application to check the ServerNet cluster status. For information about using TSM, refer to Section 5, Managing a ServerNet Cluster.

4. Use SCF to make sure direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

Note. In order to use the PROBLEMS option, T0294AAG (or a superseding SPR) must be applied. If the PROBLEMS option is not available, use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window for each node.

5. Select a node whose system console can be used to download the ServerNet II Switch firmware and configuration. This will be the downloader node. If you are upgrading a cluster that uses a split-star topology, you must select two downloader nodes: one for the X1/Y1 cluster switches and one for the X2/Y2 cluster switches. (You cannot download firmware or a configuration across a four-lane link to a remote cluster switch.)

Caution. All nodes attached to a cluster switch whose firmware and configuration will be updated with T0569AAB or T0569AAE must be running a version of the operating system that is compatible with the T0569 SPR to be downloaded. Any nodes that do not meet these requirements might experience permanent loss of Expand traffic across the cluster switch when the new firmware and configuration are loaded on the cluster switch. Table 4-32 on page 4-99 describes the NonStop Kernel requirements for ServerNet nodes connected to cluster switches that will be updated.

6. On the X1/Y1 downloader node:

a. Use the Firmware Update action of the TSM Service Application to download the T0569AAE firmware from the server to the X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.
Note. In cluster switches running the T0569AAA firmware and configuration preloaded at the factory, the fault LED can turn on when firmware is downloaded. This is normal, and the fault LED will turn off when the new configuration is loaded.

b. Use the Configuration Update action of the TSM Service Application to download the T0569AAE configuration from the server to the X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99. This scenario does not change the ServerNet node number range for the cluster switch.

Note. Downloading the firmware and configuration disrupts ServerNet connectivity through the X-fabric cluster switch temporarily.

c. Use the TSM Service Application to verify that the X-fabric cluster switch is operational.

d. Use SCF on all nodes to verify that direct ServerNet connectivity has been restored on the X fabric:

>SCF STATUS SUBNET $ZZSCL

Note. Direct ServerNet connectivity is automatically restored after an interval of approximately 50 seconds times the number of nodes in the cluster (25 seconds for nodes running G06.14 or a later G-series RVU). For faster (but manual) recovery of ServerNet connectivity, use the SCF START SERVERNET \REMOTE.$ZSNET.FABRIC.* command on all affected nodes after you have used the TSM Service Application to verify that the X-fabric cluster switch is operational. The START SERVERNET command works only for nodes running G06.12 or a later G-series RVU. For nodes running G06.09 through G06.11, you can use the following commands:

SCF STOP SUBSYS $ZZSCL
SCF START SUBSYS $ZZSCL

Stopping and restarting the ServerNet cluster subsystem destroys and restarts interprocessor connectivity between the node that receives the commands and all other nodes over both external fabrics.

7. Repeat Step 6 for the Y-fabric cluster switch.

8. If you are upgrading a cluster that uses a split-star topology, repeat Step 6 and Step 7 using the X2/Y2 downloader node.

Fallback for Upgrading Software to Obtain G06.14 Functionality

Use this procedure if you upgraded a cluster to G06.14 functionality and you now need to restore the ServerNet II Switch firmware and configuration files to an earlier version (G06.12, for example). This procedure uses one of the nodes to restore the old configuration files. The other nodes can continue operating as members of the cluster.

Note. HP does not recommend backing out the firmware. T0569AAE firmware supports all topologies on all RVUs and includes significant defect repair. If you must fall back to earlier firmware for any reason, you must download the T0569AAB (or T0569AAA) M6770 firmware file. In addition, you must power cycle the ServerNet II Switch subcomponent of the cluster switch if you fall back to T0569AAB (or T0569AAA).

Choose a node from which to reconfigure the cluster switches, and use the system console for that node to complete the following steps. (If you are falling back to a split-star topology, you must choose a node for each of the two star groups in the cluster.)
1. If the cluster is connected in a tri-star topology, follow one of the procedures in Fallback for Merging Clusters to Create a Tri-Star Topology on page 4-89 to disconnect the two-lane links.

2. Use DSM/SCM to retrieve T0569AAB (or T0569AAA) from the archive so that you have access to the M6770 and M6770CL firmware and configuration files.

3. Use the Configuration Update action of the TSM Service Application to download the T0569AAB (or T0569AAA) M6770CL configuration to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

Note. The following considerations apply when downloading the T0569AAA configuration:

• Downloading the configuration disrupts ServerNet communications across the X-fabric cluster switch.
• The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 (0x10000) in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.

4. If you need to fall back to earlier firmware, use the TSM Service Application to download the T0569AAB (or T0569AAA) M6770 firmware file to the nearest X-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99. Otherwise, skip this step.

5. If you fell back to earlier firmware from T0569AAE in Step 4, power cycle the ServerNet II Switch subcomponent of the nearest X-fabric cluster switch:

a. On the ServerNet II Switch front panel, press the Power On button to remove power. (You must fully depress the button until it clicks.)
b. Wait at least one minute.
c. Press the Power On button again to reapply power to the ServerNet II Switch.

6. Use the TSM Service Application to verify the operation of the X-fabric cluster switch.

7. If the cluster is connected in a split-star topology, repeat Step 2 through Step 6 on the external ServerNet X fabric of the other star group of the split-star topology.

8. Use SCF on all nodes to verify that direct ServerNet connectivity has been restored on the X fabric:

>SCF STATUS SUBNET $ZZSCL

Note. Direct ServerNet connectivity is automatically restored after an interval of approximately 50 seconds times the number of nodes in the cluster (25 seconds for nodes running G06.14 or a later G-series RVU). For faster (but manual) recovery of ServerNet connectivity, use the SCF START SERVERNET \REMOTE.$ZSNET.FABRIC.* command on all affected nodes after you use the TSM Service Application to verify that the X-fabric cluster switch is operational. The START SERVERNET command works only for nodes running G06.12 or a later G-series RVU. For nodes running G06.09 through G06.11, you can use the following commands:

SCF STOP SUBSYS $ZZSCL
SCF START SUBSYS $ZZSCL

Stopping and restarting the ServerNet cluster subsystem destroys and restarts interprocessor connectivity between the node that receives the commands and all other nodes over both external fabrics.

9. Use the Configuration Update action of the TSM Service Application to download the T0569AAB (or T0569AAA) M6770CL configuration to the nearest Y-fabric cluster switch.
For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99.

Note. The following considerations apply when downloading the T0569AAA configuration:

• Downloading the configuration disrupts ServerNet communications across the Y-fabric cluster switch.
• The T0569AAA configuration does not support Configuration 2 (0x10001). If the cluster switch uses Configuration 2, you must change the configuration tag to Configuration 1 (0x10000) in order to download the T0569AAA configuration. The guided procedure allows you to change the configuration tag.

10. If you need to fall back to earlier firmware, use the TSM Service Application to download the T0569AAB (or T0569AAA) M6770 firmware file to the nearest Y-fabric cluster switch. For detailed steps, refer to Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99. Otherwise, skip this step.

11. If you fell back to earlier switch firmware from T0569AAE in Step 10, power cycle the ServerNet II Switch subcomponent of the nearest Y-fabric cluster switch:

a. On the ServerNet II Switch front panel, press the Power On button to remove power. (You must fully depress the button until it clicks.)
b. Wait at least one minute.
c. Press the Power On button again to reapply power to the ServerNet II Switch.

12. Use the TSM Service Application to verify the operation of the Y-fabric cluster switch.

13. If the cluster is connected in a split-star topology, repeat Step 9 through Step 12 on the external ServerNet Y fabric of the other star group of the split-star topology.

14. On all nodes, use SCF to make sure that direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:

>SCF STATUS SUBNET $ZZSCL

Note. Using the SCF STATUS SUBNET $ZZSCL command requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) on the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window on each node. If T0294AAG (or a superseding SPR) is applied, use the following command:

SCF STATUS SUBNET $ZZSCL, PROBLEMS

15. On all nodes connected to the cluster switch, use SCF to ensure that the ServerNet node numbers used by the MSEB port and ServerNet II Switch port are consistent:

>SCF STATUS CONN $ZZSMN

In the SCF display, check that the “SvNet Node Number” values for the MSEB port and switch port are the same. If the values are not the same, use one of the following recovery measures (a sketch appears at the end of this procedure):

• Use the SCF PRIMARY PROCESS $ZZSMN command to force a takeover of the SANMAN process in the problem node.
• Disconnect and reconnect the MSEB-to-switch fiber-optic cable on the affected fabric.
16. If the requisite SPRs are to be removed, make sure the T0569AAB (or T0569AAA) M6770CL configuration has been downloaded to all cluster switches prior to removing the requisite SPRs.

17. If falling back to an earlier version of the operating system (G06.12, for example), perform the following optional step:

a. Shut down any nodes for which fallback to an earlier version of the operating system is desired.
b. On the nodes that were shut down, load the system using the down-rev operating system.
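The following is a minimal sketch of the node-number consistency check in step 15 and its first recovery measure, using only commands that appear in this procedure; run it on each node connected to the cluster switch.

>SCF STATUS CONN $ZZSMN
(compare the SvNet Node Number values reported for the MSEB port and the switch port)
>SCF PRIMARY PROCESS $ZZSMN
>SCF STATUS CONN $ZZSMN
(confirm that the two values now match)

If forcing a SANMAN takeover does not clear the mismatch, disconnect and reconnect the MSEB-to-switch fiber-optic cable on the affected fabric, then repeat the check.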
Merging Clusters to Create a Split-Star Topology

To create a split-star topology, you must merge two clusters that use one cluster switch per fabric. Typically, you will merge two clusters that use the star topology. However, you can also merge valid subsets of other topologies to create a split-star topology. This subsection contains the following examples and procedures:

• Example: Merging Two Star Topologies to Create a Split-Star Topology on page 4-54
• Steps for Merging Two Star Topologies to Create a Split-Star Topology on page 4-60

Example: Merging Two Star Topologies to Create a Split-Star Topology

This upgrade begins with two ServerNet clusters having up to eight nodes and running the G06.09, G06.10, or G06.11 RVU. The upgrade:

• Installs new versions of the software listed in Table 4-6 on page 4-8 on all nodes if necessary

Note. HP recommends upgrading to the latest software whenever possible. See Upgrading Software to Obtain G06.14 Functionality on page 4-34.

• Upgrades the firmware and configuration in the cluster switches if necessary
• Reconfigures the clusters if necessary so that the ServerNet node numbers do not overlap

Note. The upgrade requires that the G06.12 version of SANMAN (T0502AAE or superseding) be running in all nodes in the cluster. T0502AAE is required because versions of SANMAN earlier than G06.12 cannot support ServerNet node numbers 9 through 16. In addition, the cluster switches must be running the T0569AAB firmware and configuration. T0569AAB is required for the split-star topology (two cluster switches per fabric).

Following the upgrade, the merged cluster will use the split-star topology and support up to 16 nodes.

Note. The migration of a cluster earlier than G06.12 to a G06.12 cluster using the X2/Y2 cluster switches requires shutting down ServerNet cluster communications for all nodes in the cluster. Shutting down ServerNet cluster communications is required because the ServerNet node numbers in the cluster must be changed from numbers 1 through 8 to numbers 9 through 16 once both the X-fabric and Y-fabric cluster switches in that cluster are upgraded to the X2/Y2 cluster switch configuration.

Table 4-16 summarizes this upgrade.

Table 4-16. Upgrade Summary: Upgrading Software to Create a Split-Star Topology (G06.12 Functionality)

Max. Nodes Supported
  Before: 8
  After: 16
Cluster Switches Per Fabric
  Before: 1
  After: 2
NonStop Kernel Operating System Release
  Before: G06.09, G06.10, or G06.11
  After: If the four-lane link is less than 80 meters, the cluster can contain G06.09, G06.10, G06.11, or G06.12 nodes. If the four-lane link is more than 80 meters, G06.11 or a later G-series RVU is required on all nodes.
ServerNet II Switch Firmware
  Before: Empty firmware files (T0569) or T0569AAA
  After: T0569AAB (or superseding)
SANMAN Version
  Before: T0502 or T0502AAA (G06.09)
  After: T0502AAE (or superseding)
TSM Server Version
  Before: T7945AAS (G06.09), T7945AAT (G06.10), or T7945AAV (G06.11)
  After: T7945AAW (or superseding)
TSM Client Version
  Before: 10.0 (G06.09), 2000A (G06.10), or 2001A (G06.11)
  After: 2001B (G06.12) or a later client version
SNETMON/MSGMON Version
  Before: T0294 (G06.09 or G06.10) or T0294AAB (G06.11)
  After: T0294* (G06.09 or G06.10), T0294AAB (G06.11), or T0294AAE (G06.12)
Service Processor (SP) Version
  Before: T1089AAX (G06.09) or T1089AAZ (G06.09, G06.10, or G06.11)
  After: T1089ABB (or superseding)
SCF
  Before: T9082ACN
  After: T9082ACQ (or superseding)

*T0294AAA (a Class D SPR) can also be applied to G06.09 and G06.10.

Figure 4-5 shows the merging of two three-node ServerNet clusters into a split-star topology that can support up to 16 nodes. The cluster switches for one of the three-node clusters are reconfigured as X2 and Y2 in order to construct the new topology. In addition, the merged cluster uses 1-kilometer four-lane links. Therefore, all nodes must run G06.11 or a later G-series RVU.

Figure 4-5. Example of Merging Clusters Containing Pre-G06.13 Nodes

[Figure not reproduced. It shows two three-node clusters before the upgrade (nodes running G06.09 through G06.11 with their original SPR levels; the X1/Y1 cluster switches loaded with T0569AAA) and the merged split-star cluster after the upgrade (nodes running G06.11 or G06.12 with T0502AAE, T7945AAW, T1089ABB, T9082ACQ, and T0569AAB; the X1/Y1 and X2/Y2 cluster switches loaded with T0569AAB and joined by 1-km four-lane links).]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.

Figure 4-6 shows a four-node ServerNet cluster that has been modified to support up to 16 nodes by upgrading software and adding cluster switches. In addition, two new nodes have been added to the cluster.
Because all of the nodes in the modified cluster are running G06.11 or a later G-series RVU, up to 1-kilometer four-lane links can be used between the two halves of the split-star topology.

Figure 4-6. Example of Upgrading a Cluster to a Release 2 Split-Star Topology and Adding Nodes

[Figure not reproduced. It shows a four-node cluster before the upgrade (nodes running G06.09 through G06.11; the X1/Y1 cluster switches loaded with T0569AAA) and the expanded six-node split-star cluster after the upgrade (nodes running G06.11 or G06.12 with T0502AAE, T7945AAW, T1089ABB, T9082ACQ, and T0569AAB; the X1/Y1 and X2/Y2 cluster switches loaded with T0569AAB and joined by 1-km four-lane links).]

Note: The TSM client (T8154) IPM is not included in this example. It is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.

Figure 4-7 shows a four-node ServerNet cluster that has been modified to support up to 16 nodes by upgrading software and adding cluster switches. In addition, one new server has been added to the cluster. Because the nodes in the modified cluster are running G06.09, G06.10, and G06.11 RVUs, the four-lane links must be 80 meters or less.
Figure 4-7. Example of a Split-Star Topology With 80-Meter Four-Lane Links

[Figure not reproduced. It shows the four-node cluster before the upgrade (nodes running G06.09 through G06.11; the X1/Y1 cluster switches loaded with T0569AAA) and the expanded five-node split-star cluster after the upgrade (nodes running G06.09 through G06.11 with T0502AAE, T7945AAW, T1089ABB, T9082ACQ, and T0569AAB; the X1/Y1 and X2/Y2 cluster switches loaded with T0569AAB and joined by 80-m four-lane links).]

Note: The TSM client software IPM (T8154) is not included in this example because the TSM client is a prerequisite of the TSM server software IPM (T7945) and usually uses the same IPM identifier.

Steps for Merging Two Star Topologies to Create a Split-Star Topology

To merge two clusters that use one cluster switch per fabric to create a split-star topology:

Caution. Do not connect the four-lane link until you are instructed to do so by the Add Switch guided procedure. Connecting the four-lane link between two cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster if the NNA Version is 5. (FCO 39746B replaced Version 5 NNAs with NNA Version 22.) You might need to power cycle each node to recover from this outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic if cluster switches running the T0569AAA configuration are connected. Connecting two cluster switches running the T0569AAA configuration can create an invalidly configured split-star topology in which ServerNet packets sent by nodes 1 through 8 to nodes 9 through 16 loop indefinitely between the two cluster switches.

1. Decide which nodes will occupy the two star groups of the split-star topology. You can use Table 4-17 to record the nodes that will belong to each star group of the split-star topology. This form accommodates information about the nodes and cluster switches before and after the upgrade.

2. Record information about the cluster switches that will serve the two star groups of the split-star topology. You can use Table 4-18 to plan for the cluster switches. Make copies of this form to record before and after information.
Table 4-17. Planning for Nodes in the Split-Star Topology

This form records, for each cluster switch port, the ServerNet node number and system name before and after the upgrade. Before the upgrade, record the fabric/position (X__/Y__), and for each of ports 0 through 7, the ServerNet node number and system name (\____________). After the upgrade, the port-to-node-number assignments are fixed: on the X1/Y1 cluster switches, ports 0 through 7 serve ServerNet node numbers 1 through 8; on the X2/Y2 cluster switches, ports 0 through 7 serve ServerNet node numbers 9 through 16. Record the system name for each port.

Table 4-18. Planning for Cluster Switches in the Split-Star Topology

For each fabric and position (X1, Y1, X2, Y2), record the GUID, configuration tag, firmware revision, and configuration revision. For example, an X1 cluster switch might show GUID V0XE6Z, configuration tag 0x10000, firmware revision 2_0_21, and configuration revision 0_0; an X2 cluster switch might show GUID V0YJ4B, configuration tag 0x10001, firmware revision 2_0_21, and configuration revision 0_0.

3. Select one of the ServerNet clusters to be the X1/Y1 cluster. If necessary, upgrade the ServerNet cluster software on all nodes in the cluster by using the steps in Upgrading Software to Obtain G06.12 Functionality on page 4-17.

4. Select one of the ServerNet clusters to be the X2/Y2 cluster. If necessary, upgrade the ServerNet cluster software on all nodes in the cluster by using the steps in Upgrading Software to Obtain G06.12 Functionality on page 4-17.

5. From any node on the X1/Y1 cluster, run the Add Switch guided procedure. From the system console of any node in the cluster, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Add Switch. The guided procedure checks the cluster switch hardware and software and allows you to:

• Update the ServerNet II Switch firmware and configuration if necessary
• Change the configuration tag so that the cluster switch supports Configuration 1 (ServerNet node numbers 1 through 8) or Configuration 2 (ServerNet node numbers 9 through 16)
• Perform soft or hard resets as necessary

6. From any node on the X2/Y2 cluster, run the guided procedure for adding a cluster switch (Add Switch).

7. When the guided procedure indicates that both clusters are ready to add remote cluster switches, you can connect the four-lane links between the X1/Y1 and X2/Y2 cluster switches. Refer to Connecting the Four-Lane Links on page 4-64.

8. From any node on the X1/Y1 cluster, run the guided procedure for adding a cluster switch again. In the Local Switch Information dialog box, click the Test button to test the connectivity between the X1/Y1 and X2/Y2 cluster switches.

9. From any node on the X2/Y2 cluster, run the guided procedure for adding a cluster switch again.
In the Local Switch Information dialog box, click the Test button to test the connectivity between the X2/Y2 and X1/Y1 cluster switches.

10. Log on to a node in the range of ServerNet node numbers 1 through 8, and use the TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches.

11. Log on to a node in the range of ServerNet node numbers 9 through 16, and use the TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches.

12. On all nodes in the newly merged cluster, use SCF to verify ServerNet connectivity:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

Note. In order to use the PROBLEMS option, T0294AAG (or a superseding SPR) must be applied. If the PROBLEMS option is not available, use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR. If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window for each node.

13. To configure and start Expand-over-ServerNet lines between the two halves of the split-star topology, use the guided procedure for configuring a ServerNet node. Unless the automatic line-handler configuration feature is enabled, you must run the guided procedure on every node in order to configure and start Expand-over-ServerNet lines to all other nodes. From the system console of that node, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node. Online help can assist you in performing the procedure.

Connecting the Four-Lane Links

When the guided procedure indicates that both clusters are ready to add remote cluster switches, you can connect the four-lane links between the X1/Y1 and X2/Y2 cluster switches.

Caution. Before connecting ServerNet cables, inspect the cables as described in Connecting a Fiber-Optic Cable to an MSEB or ServerNet II Switch on page 3-18. Using defective connectors can cause ServerNet connectivity problems.

To connect the four-lane links:

1. Label both ends of each fiber-optic cable to be used for the four-lane links. On the label, include the fabric and cluster switch position (X1, for example) and the port number (8, for example) to which the cable will be connected.

2. If they are not already routed, route the fiber-optic ServerNet cables to be used for the four-lane links.

3. Remove the black plugs from ports 8 through 11 on the double-wide PICs inside each switch enclosure.

4. Remove the dust caps from the fiber-optic cable connectors.

Note. ServerNet II Switch ports 8, 9, 10, and 11 are keyed differently from ports 0 through 7. To connect the four-lane link cables, you must align the fiber-optic cable connector with the key on top. See Figure 4-8.

Figure 4-8. Key Positions on ServerNet II Switch Ports

[Figure not reproduced. It shows the key positions on ports 8 through 11 and on ports 0 through 7.]

5. One cable at a time, connect the cable ends. Table 4-19 on page 4-65 shows the cable connections.

Note. To avoid generating an alarm, you must connect the four-lane links for both fabrics within four minutes.
The TSM incident analysis (IA) software generates an alarm eventually if one external fabric has two cluster switches but the other external fabric has only one cluster switch. When a cluster switch is added to an external fabric, the IA checks the peer fabric to determine if it has two cluster switches. After four minutes, if only one external fabric has two cluster switches, the IA generates a Missing Remote ServerNet Switch alarm. After four more minutes, if the peer fabric still does not have two cluster switches, the IA dials out the Missing Remote ServerNet Switch alarm. If a second cluster switch is added to the peer fabric after the alarm is generated but before the alarm is dialed out, the alarm is deleted and is not dialed out.

Table 4-19. Four-Lane Link Connections for the Split-Star Topology

√   Cluster Switch   Port   Connects to Cluster Switch   Port
    X1               8      X2                           8
    X1               9      X2                           9
    X1               10     X2                           10
    X1               11     X2                           11
    Y1               8      Y2                           8
    Y1               9      Y2                           9
    Y1               10     Y2                           10
    Y1               11     Y2                           11

6. Check the link-alive LED near each PIC port. The link-alive LEDs should light a few seconds after the cable is connected at both ends. If the link-alive LEDs do not light:

• Make sure the dust caps are removed from the cable ends.
• Try reconnecting the cable, using care to align the key on the cable plug with the PIC connector.
• If possible, try connecting a different cable.

7. Continue until all the cables are connected.

Fallback for Merging Clusters to Create a Split-Star Topology

This fallback procedure divides a split-star topology into two ServerNet clusters that support no more than eight nodes each. In this scenario, you separate the clusters into two individual logical clusters with only one switch per fabric for each cluster. (Each star group of the split-star topology is a logical cluster.)

1. Select one of the star groups for a complete shutdown of ServerNet cluster services. You can use the star group with the fewest nodes or the star group that is least critical to your application.

2. In all nodes, stop any applications that depend on ServerNet cluster connectivity to the nodes in the other star group.

3. In all nodes of the star group selected in Step 1:

a. Stop any applications that depend on ServerNet cluster connectivity to nodes within that star group.
b. Abort the Expand-over-ServerNet lines to all nodes in the other star group.
c. Use the SCF STOP SUBSYS $ZZSCL command to ensure an orderly shutdown of ServerNet communications on that star group.

4. In all nodes of the other star group, abort all Expand-over-ServerNet lines to all nodes in the star group selected in Step 1.

5. Disconnect the four-lane links for each fabric to form two physically independent clusters.

Note. Disconnecting ports 8 through 11 prevents unnecessary neighborhood-error-check alarms generated on all system consoles. These alarms are generated because the T0569AAA firmware and configuration are not supported in a split-star topology. T0569AAB firmware and configuration are mandated on all cluster switches that are connected in a split-star topology.

6. If necessary, use one of the fallback procedures in Upgrading Software to Obtain G06.12 Functionality on page 4-17 to fall back from G06.12 to an earlier RVU in each of the two physically independent clusters.
Note. If the cluster switch is currently running T0569AAA firmware, updating the configuration can change the fabric setting. The fabric setting (X or Y) changes to a default setting (X) when the T0569AAA configuration is loaded in the switch. The T0569AAA configuration normally should be loaded on the cluster switch only when a configuration fallback is being performed. If the cluster switch is installed on the external Y fabric, you must reconfigure the fabric setting in the cluster switch back to Y after the fallback. Until you perform this action, ServerNet traffic through the cluster switch remains disabled. You can use the Set LED to { X | Y } Side action in the TSM Service Application to correct the fabric setting. If the cluster switch is currently running T0569AAB firmware, the configured fabric setting (X or Y) remains unchanged when either the T0569AAA (fallback) or the T0569AAB configuration is loaded in the cluster switch.

7. In all nodes of the star group selected in Step 1:

a. Use the SCF START SUBSYS $ZZSCL command to bring up direct ServerNet connectivity between the nodes.
b. Start the Expand-over-ServerNet lines to the other nodes in the star group.
c. If desired, start any applications that utilize ServerNet connectivity to nodes within the star group.
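The shutdown in steps 3 through 5 and the restart in step 7 follow a fixed order: Expand-over-ServerNet lines come down before the subsystem is stopped, and the subsystem comes up before the lines are restarted. The following is a minimal sketch for one node of the selected star group, assuming a hypothetical line $SC003 to a node in the other star group and a hypothetical line $SC001 to a node in the same star group (substitute your own line names).

>SCF ABORT LINE $SC003
>SCF STOP SUBSYS $ZZSCL
(disconnect the four-lane links for both fabrics; perform any RVU fallback)
>SCF START SUBSYS $ZZSCL
>SCF START LINE $SC001

The line to the other star group ($SC003) is not restarted, because the two star groups are now physically independent clusters.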
Merging Clusters to Create a Tri-Star Topology

To create a tri-star topology supporting up to 24 nodes, you can do one of the following:

• Merge three clusters that currently use one cluster switch per fabric (supporting up to eight nodes each).
• Merge a cluster that uses one cluster switch per fabric (supporting up to eight nodes) with another cluster that uses two cluster switches per fabric (supporting up to 16 nodes).

You cannot merge two clusters if both clusters currently use two cluster switches per fabric. (32-node clusters are currently not supported.)

Note. The tri-star topology (three cluster switches per fabric) requires that all nodes run G06.13 with the SPRs listed in Table 4-6 on page 4-8 or a later G-series RVU (without SPRs). G06.13 or a later G-series RVU is required because versions of the operating system earlier than G06.13 do not support the tri-star topology. In addition, the cluster switches must be loaded with the T0569AAE firmware and configuration.

This section contains the following subsections:

• Example: Merging Three Star Topologies to Create a Tri-Star Topology on page 4-68
• Steps for Merging Three Star Topologies to Create a Tri-Star Topology on page 4-73
• Example: Merging A Split-Star Topology and a Star Topology to Create a Tri-Star Topology on page 4-78
• Steps for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology on page 4-82

Example: Merging Three Star Topologies to Create a Tri-Star Topology

The following example begins with three ServerNet clusters running any of the following RVUs: G06.09, G06.10, G06.11, G06.12, or G06.13. All three clusters use the star topology (or are subsets of other topologies that include only one cluster switch per fabric and support up to eight nodes).

The upgrade:

• Installs new versions of the release 3 software listed in Table 4-6 on page 4-8 on all nodes
• Upgrades the firmware and configuration in the cluster switches
• Reconfigures the clusters to use ServerNet node numbers 1 through 8, 9 through 16, and 17 through 24

Following the upgrade, the merged cluster uses the tri-star topology and supports up to 24 nodes. Table 4-20 summarizes the upgrade.

Table 4-20. Upgrade Summary: Merging Three Star Topologies to Create a Tri-Star Topology

Max. Nodes Supported
  Before: 8
  After: 24
Cluster Switches Per Fabric
  Before: 1
  After: 3
NonStop Kernel Operating System Release
  Before: G06.09, G06.10, G06.11, G06.12, or G06.13
  After: G06.13 (with SPRs) or a later G-series RVU (without SPRs)
ServerNet II Switch Firmware
  Before: Empty firmware files (T0569), T0569AAA, or T0569AAB
  After: T0569AAE (or superseding)
SANMAN Version
  Before: T0502, T0502AAA (G06.09), or T0502AAE (G06.12 equivalent)
  After: T0502AAG (or superseding)
TSM Server Version
  Before: T7945AAS (G06.09), T7945AAT (G06.10), T7945AAV (G06.11), T7945AAW (G06.12), or T7945AAX (G06.13)
  After: T7945AAY (or superseding)
TSM Client Version
  Before: 10.0 (G06.09), 2000A (G06.10), 2001A (G06.11), 2001B (G06.12), or 2001C (G06.13)
  After: 2001D (G06.14) or a later client version
SNETMON/MSGMON Version
  Before: T0294 (G06.09 or G06.10), T0294AAA (G06.09 or G06.10), T0294AAB (G06.11), or T0294AAE (G06.12 or G06.13)
  After: T0294AAG (or superseding)
Service Processor (SP) Version
  Before: T1089AAX, T1089AAZ, or T1089ABB
  After: T1089ABC (or superseding)
SCF
  Before: T9082ACN or T9082ACQ
  After: T9082ACQ (or superseding)

Figure 4-9 and Figure 4-10 show the merging of three ServerNet clusters into a tri-star topology that can support up to 24 nodes. Figure 4-9 shows three clusters using the star topology installed and ready for merging.

Figure 4-9. Before the Upgrade: Example of Merging Three Star Topologies to Create a Tri-Star Topology

[Figure not reproduced. It shows three two-node ServerNet clusters (\A and \B, \C and \D, \E and \F), each with its own TSM workstation and its own X1 cluster switch (X fabric only).]

Figure 4-10 shows the cluster after the upgrade. The cluster switches have been reconfigured as X1, X2, and X3 in order to construct the new tri-star topology. The upgraded cluster uses 1-kilometer two-lane links.
Figure 4-10. After the Upgrade: Example of Merging Three Star Topologies to Create a Tri-Star Topology

[Figure not reproduced. It shows the six-node merged cluster (\A through \F) on the X fabric, with cluster switches X1, X2, and X3 connected through ports 8 through 11 by two-lane links.]

Steps for Merging Three Star Topologies to Create a Tri-Star Topology

The following steps describe how to use the Add Switch guided procedure to merge three clusters that use one cluster switch per fabric to create a tri-star topology:

Caution. Do not connect the two-lane links for the tri-star topology until the Add Switch guided procedure instructs you to do so. Connecting the two-lane links between cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster if the NNA Version is 5. (FCO 39746B replaced Version 5 NNAs with NNA Version 22.) You might need to power cycle each node to recover from this outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic if cluster switches running the T0569AAA configuration are connected. Connecting two cluster switches running the T0569AAA configuration can create an invalidly configured split-star topology in which ServerNet packets sent by nodes 1 through 8 to nodes 9 through 16 loop indefinitely between the two cluster switches. To check the running configuration, see Checking the Revisions of the Running Firmware and Configuration on page 4-11.

1. Start with three functioning ServerNet clusters that use one cluster switch per fabric.

Note. If you are starting with one cluster that uses the star topology and another cluster that uses the split-star topology, see Steps for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology on page 4-82.

2. Decide the configuration tags to be used by the cluster switches in each cluster when they are combined in the tri-star topology. During the upgrade, the Add Switch guided procedure prompts you for the configuration tag. You can use Table 4-21 on page 4-74 to record this information. Table 4-3 on page 4-5 shows the supported configuration tags.

Table 4-21. Planning for Cluster Switches in the Tri-Star Topology

For each fabric and position (X1, Y1, X2, Y2, X3, Y3), record the GUID, configuration tag, firmware revision, and configuration revision. For example, an X1 cluster switch might show GUID V0XE6Z, configuration tag 0x10002, firmware revision 3_0_81, and configuration revision 2_5; an X2 cluster switch might show GUID V0YF1Z and configuration tag 0x10003; an X3 cluster switch might show GUID R1PE4Y and configuration tag 0x10004.

3. Decide which nodes will belong to each cluster switch before and after the upgrade to the tri-star topology.
You can use Table 4-22 on page 4-75 to record this information.

Table 4-22. Planning for Nodes in the Tri-Star Topology

This form records, for each cluster switch port, the ServerNet node number and system name before and after the upgrade. Before the upgrade, record the fabric/position (X__/Y__), and for each of ports 0 through 7, the ServerNet node number and system name (\____________). After the upgrade, the port-to-node-number assignments are fixed: on the X1/Y1 cluster switches, ports 0 through 7 serve ServerNet node numbers 1 through 8; on the X2/Y2 cluster switches, ports 0 through 7 serve ServerNet node numbers 9 through 16; on the X3/Y3 cluster switches, ports 0 through 7 serve ServerNet node numbers 17 through 24. Record the system name for each port.

4. If you have not already done so, upgrade all nodes in all clusters to G06.13 or a later G-series RVU. For nodes upgraded to G06.13, you must apply the release 3 SPRs indicated in Table 4-6 on page 4-8. Refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34.

5. On any node connected to one of the clusters, run the Add Switch guided procedure:

a. From the system console of that node, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Add Switch.
b. In the Select Upgrade Topology dialog box, select the tri-star topology as the topology to which you want to upgrade.
c. In the Select Switch Configuration Tag dialog box, select one of the following configuration tags:

• Max 24 nodes, nodes 1-8 (0x10002)
• Max 24 nodes, nodes 9-16 (0x10003)
• Max 24 nodes, nodes 17-24 (0x10004)

d. Follow the dialog boxes to update the firmware and configuration for the X-fabric cluster switch if the guided procedure determines that the switch needs updating.
e. When prompted, update the firmware and configuration for the Y-fabric cluster switch. The guided procedure remembers the configuration tag you selected for the X-fabric cluster switch and uses it for the Y-fabric cluster switch.
f. When the guided procedure prompts you to connect the cables for the X fabric, click Stop Task. You must not connect the two-lane link cables until all the cluster switches have been updated.

6. Repeat Step 5 on the second cluster.

7. Repeat Step 5 on the third cluster, but do not click Stop Task when the guided procedure tells you to connect the cables.

8. Connect the two-lane links on both fabrics using the fiber-optic cables. See Connecting the Two-Lane Links on page 4-87.

9. When the cables are connected, click Continue in the Connect the Cables dialog box.

10. In the Local Switch Information dialog box, click Test to verify the connection between the cluster switches.
You must log on to nodes attached to at least two different star groups of the tri-star topology in order for the guided procedure to test all of the remote connections on both fabrics.

If you log on to a node attached to . . .   The guided procedure checks these remote connections . . .
X1/Y1                                       X1/Y1 to X2/Y2 and X1/Y1 to X3/Y3
X2/Y2                                       X2/Y2 to X1/Y1 and X2/Y2 to X3/Y3
X3/Y3                                       X3/Y3 to X1/Y1 and X3/Y3 to X2/Y2

11. Log on to a node attached to a different star group of the tri-star topology to test the other remote connections. Run the Add Switch guided procedure again. In the Local Switch Information dialog box, click the Test button to test the connectivity between the local and remote switches.

12. On all nodes in the newly merged cluster, use SCF to verify ServerNet connectivity:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

Note. In order to use the PROBLEMS option, T0294AAG (or a superseding SPR) must be applied. If the PROBLEMS option is not available, use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR.

If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window for each node.

13. Use the Configure ServerNet Node guided procedure to configure and start Expand-over-ServerNet lines between the star groups of the tri-star topology. Unless the automatic line-handler configuration feature is enabled, you must run the Configure ServerNet Node guided procedure on every node in order to configure and start the Expand lines to all other nodes. From the system console of a node, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node.

Example: Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology

The following example begins with two ServerNet clusters running any of the following RVUs:

• G06.09
• G06.10
• G06.11
• G06.12
• G06.13

One cluster uses the star topology and includes up to eight nodes. The other cluster uses the split-star topology and includes up to 16 nodes. (However, the same procedure applies if you are merging any cluster using one cluster switch per fabric with another cluster using two cluster switches per fabric to create a tri-star topology.)

The upgrade:

• Installs new versions of the software listed in Table 4-6 on page 4-8 on all nodes
• Upgrades the firmware and configuration in the cluster switches
• Reconfigures one of the clusters to use ServerNet node numbers 17 through 24

Note. The upgrade of a pre-G06.13 cluster to a G06.13 or G06.14 cluster using X1/Y1, X2/Y2, and X3/Y3 cluster switches requires shutting down ServerNet cluster communications for all nodes in one of the clusters. This action is required because the ServerNet node numbers in that cluster must be changed from 1 through 8 or 9 through 16 to 17 through 24 when the X-fabric and Y-fabric cluster switches in that cluster are upgraded to the new configuration.

Following the upgrade, the merged cluster uses the tri-star topology and supports up to 24 nodes.
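Because the node numbers for one cluster must change, its nodes temporarily lose ServerNet cluster communication. Before and after this change, you can confirm the state of the ServerNet cluster subsystem on each affected node with the SNETMON SCF command documented in Section 5. A minimal sketch:

>SCF STATUS SUBSYS $ZZSCL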
Table 4-23 summarizes the upgrade.

Table 4-23. Upgrade Summary: Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology

                          Before the Upgrade                    After the Upgrade
Max. Nodes Supported      8 or 16                               24
Cluster Switches          1 or 2                                3
Per Fabric
NonStop Kernel            G06.09, G06.10, G06.11,               G06.13 (with the SPRs listed
Operating System          G06.12, or G06.13                     below) or a later G-series
Release                                                         RVU (without SPRs)
ServerNet II Switch       Any of the following:                 T0569AAE (or superseding)
Firmware                  • Empty firmware files (T0569)
                          • T0569AAA
                          • T0569AAB
SANMAN Version            Any of the following:                 T0502AAG (or superseding)
                          • T0502
                          • T0502AAA (G06.09)
                          • T0502AAE (G06.12 equivalent)
TSM Server Version        Any of the following:                 T7945AAY (G06.14 equivalent)
                          • T7945AAS (G06.09)                   or superseding
                          • T7945AAT (G06.10)
                          • T7945AAV (G06.11)
                          • T7945AAW (G06.12)
                          • T7945AAX (G06.13)
TSM Client Version        10.0 (G06.09), 2000A (G06.10),        2001D (G06.14) or a later
                          2001A (G06.11), 2001B (G06.12),       client version
                          or 2001C (G06.13)
SNETMON/MSGMON            Any of the following:                 T0294AAG (or superseding)
Version                   • T0294 (G06.09 or G06.10)
                          • T0294AAB (G06.11, G06.12,
                            or G06.13)
Service Processor (SP)    Any of the following:                 T1089ABC (or superseding)
Version                   • T1089AAX
                          • T1089AAZ
                          • T1089ABB
SCF                       T9082ACN or T9082ACQ                  T9082ACR (or superseding)

Figure 4-11 and Figure 4-12 show the merging of two ServerNet clusters into a tri-star topology that can support up to 24 nodes. Figure 4-11 shows clusters using the star and split-star topologies installed and ready for merging. The split-star topology includes a four-lane link connecting the cluster switches.

[Figure 4-11. Before the Upgrade: Example of Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology (a 4-node ServerNet cluster, X fabric only, whose X1 and X2 cluster switches are joined by a four-lane link on ports 8 through 11, and a separate 2-node ServerNet cluster, X fabric only, with its own X1 cluster switch; nodes \A through \F)]

Figure 4-12 shows the cluster after the upgrade. The cluster switches from the star topology have been reconfigured as X3 and Y3 in order to construct the new tri-star topology. The upgraded cluster uses 1-kilometer two-lane links.

[Figure 4-12. After the Upgrade: Example of Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology (cluster switches X1, X2, and X3 joined by two-lane links on ports 8 through 11, forming a 6-node ServerNet cluster, X fabric only; nodes \A through \F)]
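Before starting the steps below, you can verify which ServerNet cluster subsystem (SNETMON) version each node is running and compare it against the SNETMON/MSGMON row of Table 4-23. A minimal sketch using the VERSION command from the Section 5 quick reference:

>SCF VERSION SUBSYS $ZZSCL, DETAIL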
Steps for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology

1. If you have not already done so, upgrade all nodes in all clusters to G06.13 or a later G-series RVU by using the Steps for Upgrading Software With System Loads to Obtain G06.14 Functionality on page 4-45.

Note. The TSM Service Application shows Down-rev Firmware and Down-rev Configuration alarms when the firmware or configuration running on a cluster switch is older than the file versions of $SYSTEM.SYSnn.M6770 and $SYSTEM.SYSnn.M6770CL. This situation is normal if you have upgraded a node but have not yet downloaded the new firmware or configuration to the cluster switches.

2. Decide the configuration tags to be used by the cluster switches in each cluster when they are combined in the tri-star topology. During the upgrade, the Add Switch guided procedure prompts you for the configuration tag. You can use Table 4-21 on page 4-74 to record this information. Table 4-3 on page 4-5 shows the supported configuration tags.

Note. If you are merging a split-star topology with a star topology to create a tri-star topology, HP recommends configuring the star topology to support nodes 17 through 24. If you do this, you do not have to change the ServerNet node numbers supported by the cluster switches using the split-star topology. Those cluster switches already support nodes 1 through 16.

3. Decide which nodes will belong to each cluster switch before and after the upgrade to the tri-star topology. You can use Table 4-22 on page 4-75 to record this information.

4. Log on to any node connected to the cluster using the star topology, and run the Add Switch guided procedure:

Note. A cluster using a star topology uses one cluster switch per fabric and supports up to eight ServerNet nodes. A split-star topology uses up to two cluster switches per fabric and supports up to 16 ServerNet nodes.

a. From the system console of that node, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Add Switch. Online help can assist you in performing the procedure.

The guided procedure:

• Checks the software levels of SNETMON and SANMAN on the local node
• Checks the ServerNet II Switch firmware and configuration
• Allows you to update the firmware and configuration if necessary
• Tells you when to disconnect the four-lane links (if necessary) and connect the two-lane links
• Allows you to test the remote connections (two-lane links)

b. In the Select Upgrade Topology dialog box, select the tri-star topology as the topology to which you want to upgrade.

c. In the Select Switch Configuration Tag dialog box, select the following configuration tag:

Max 24 nodes, nodes 17-24 (0x10004)

Configuring the star topology cluster to support nodes 17 through 24 means you will not have to change the ServerNet node numbers supported by the cluster switches using the split-star topology.

Note. This procedure assumes that the other two star groups already support nodes 1 through 8 and 9 through 16. You might have to select a different configuration tag if you are merging a star topology with a subset of a tri-star topology that uses two cluster switches per fabric.
d. Follow the dialog boxes to update the firmware and configuration for the X-fabric cluster switch if the guided procedure determines that the switch needs updating. The Add Switch guided procedure invokes another guided procedure, Update Switch, to update the firmware and configuration.

Caution. All nodes attached to a cluster switch whose configuration will be updated with T0569AAE must be running G06.14 or a later G-series RVU, or they must be running G06.13 and have the required SPRs listed in Table 4-6 on page 4-8 installed. Any nodes that do not meet these requirements will experience permanent loss of Expand traffic across the cluster switch when T0569AAE is loaded on the cluster switch. If you have not already upgraded the software, refer to Upgrading Software to Obtain G06.14 Functionality on page 4-34.

e. When prompted, repeat the firmware and configuration update for the Y-fabric cluster switch. The guided procedure remembers the configuration tag you selected for the X-fabric cluster switch and uses it for the Y-fabric cluster switch. Do not connect the two-lane links until you update the firmware and configuration on all of the X-fabric cluster switches to be used in the tri-star topology.

5. On any node connected to the cluster that uses the split-star topology, run the Add Switch guided procedure:

a. In the Select Upgrade Topology dialog box, select the tri-star topology as the topology to which you want to upgrade.

b. In the Select Switch Configuration Tag dialog box, select either of the following configuration tags. If possible, select a configuration tag that allows the cluster switch to continue supporting the same ServerNet node numbers:

• Max 24 nodes, nodes 1-8 (0x10002)
• Max 24 nodes, nodes 9-16 (0x10003)

Because a remote switch is present, the procedure prompts you to disconnect the four-lane links between the two cluster switches on the X fabric before updating the local cluster switch. After you disconnect the four-lane links on the X fabric, continue using the guided procedure to update the local X-fabric cluster switch.

6. On the other star group of the split-star cluster, log on to a node and run the Add Switch guided procedure to update the firmware and configuration for the X-fabric cluster switch.

7. When all of the cluster switches on the X fabric are updated, connect the two-lane links between the three cluster switches on the X fabric of the tri-star topology. See Connecting the Two-Lane Links on page 4-87.

8. Wait at least 4 to 5 minutes for the X fabric to become operational. Then use the TSM Service Application to verify that there are no alarms on the X fabric.

9. Repeat Steps 4 through 6 for the Y fabric on the cluster that uses the split-star topology. You can continue the guided procedure on the node you are currently using.

10. When the cables are connected, click Continue in the Connect the Cables dialog box.

11. In the Local Switch Information dialog box, click Test to verify the connection between the cluster switches.
You must log on to nodes attached to at least two different star groups of the tri-star topology in order for the guided procedure to test all of the remote connections on both fabrics.

If you log on to a node attached to . . .   The guided procedure checks these remote connections . . .
X1/Y1                                       X1/Y1 to X2/Y2, and X1/Y1 to X3/Y3
X2/Y2                                       X2/Y2 to X1/Y1, and X2/Y2 to X3/Y3
X3/Y3                                       X3/Y3 to X1/Y1, and X3/Y3 to X2/Y2

12. Log on to a node attached to a different star group of the tri-star topology to test the other remote connections. Run the Add Switch guided procedure again. In the Local Switch Information dialog box, click the Test button to test the connectivity between the local and remote switches.

13. Use the TSM Service Application to verify the operation of the cluster switches:

a. Log on to a node in the range of ServerNet node numbers 1 through 8, and use the TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches (X1 and Y1).

b. Log on to a node in the range of ServerNet node numbers 9 through 16, and use the TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches (X2 and Y2).

c. Log on to a node in the range of ServerNet node numbers 17 through 24, and use the TSM Service Application to verify the operation of the local X-fabric and Y-fabric cluster switches (X3 and Y3).

14. To configure and start Expand-over-ServerNet lines between the three star groups of the tri-star topology, use the guided procedure for configuring a ServerNet node. Unless the automatic line-handler configuration feature is enabled, you must run the guided procedure on every node in order to configure and start Expand-over-ServerNet lines to all other nodes. From the system console of that node, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node. Online help can assist you in performing the procedure.

15. Use SCF to make sure direct ServerNet communication is possible on both fabrics between all nodes connected to the cluster switches:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

Note. T0294AAG (or a superseding SPR) must be applied in order to use the PROBLEMS option. If the PROBLEMS option is not available, use the SCF STATUS SUBNET $ZZSCL command on all nodes. SCF STATUS SUBNET $ZZSCL requires T0294AAA or a superseding SPR.

If remote passwords are configured, you can issue the SCF STATUS SUBNET $ZZSCL command for a remote node (for example, \REMOTE) from the local node as follows:

SCF STATUS SUBNET \REMOTE.$ZZSCL

This command eliminates the need to establish a logon window for each node.
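Using the remote form, a single session can check every node in the merged cluster. The sketch below assumes hypothetical node names \NODE1 through \NODE3 and remote passwords already configured; omit the PROBLEMS option on nodes where it is not available:

>SCF STATUS SUBNET \NODE1.$ZZSCL, PROBLEMS
>SCF STATUS SUBNET \NODE2.$ZZSCL, PROBLEMS
>SCF STATUS SUBNET \NODE3.$ZZSCL, PROBLEMS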
Connecting the Two-Lane Links

When the Add Switch guided procedure indicates that the two-lane links can be connected between the cluster switches in different star groups, connect the two-lane links for the specified fabric. Use the following steps:

Caution. Do not connect the two-lane links for the tri-star topology until the Add Switch guided procedure instructs you to do so. Connecting the two-lane links between cluster switches running the T0569AAA configuration can cause an outage on every node in the cluster. Power cycling each node might be necessary to recover from this outage. Outages can occur because the cluster is unprotected by the neighbor-checking logic if cluster switches running the T0569AAA configuration are connected. Connecting cluster switches running the T0569AAA configuration can create an invalidly configured tri-star topology in which ServerNet packets loop indefinitely between the cluster switches.

1. Label both ends of each fiber-optic cable to be used for the two-lane links. On the label, include the fabric and cluster switch position (X1, for example) and the port number (8, for example) to which the cable will be connected.

Caution. Before connecting ServerNet cables, inspect the cables as described in Connecting a Fiber-Optic Cable to an MSEB or ServerNet II Switch on page 3-18. Using defective connectors can cause ServerNet connectivity problems.

2. If they are not already routed, route the fiber-optic ServerNet cables to be used for the two-lane links.

3. Remove the black plugs from ports 8 through 11 on the double-wide PICs inside each switch enclosure.

4. Remove the dust caps from the fiber-optic cable connectors.

Note. ServerNet II Switch ports 8, 9, 10, and 11 are keyed differently from ports 0 through 7. To connect the two-lane link cables, you must align the fiber-optic cable connector with the key on top. See Figure 4-13.

[Figure 4-13. Key Positions on ServerNet II Switch Ports (shows the different key positions for ports 8 through 11 and ports 0 through 7)]

5. One cable at a time, connect the cable ends. Table 4-24 shows the cable connections.

Caution. During an upgrade from a split-star topology to a tri-star topology, you must first connect all cables on the X fabric and then wait for the guided procedure to prompt you to connect the cables on the Y fabric.

Table 4-24. Two-Lane Link Connections for the Tri-Star Topology

√   Cluster Switch   Port   Connects to Cluster Switch . . .   Port
    X1               8      X2                                 10
    X1               9      X2                                 11
    X1               10     X3                                 8
    X1               11     X3                                 9
    X2               8      X3                                 10
    X2               9      X3                                 11
    Y1               8      Y2                                 10
    Y1               9      Y2                                 11
    Y1               10     Y3                                 8
    Y1               11     Y3                                 9
    Y2               8      Y3                                 10
    Y2               9      Y3                                 11

6. Check the link-alive LED near each PIC port. The link-alive LEDs should light a few seconds after the cable is connected at both ends. If the link-alive LEDs do not light:

a. Try reconnecting the cable, using care to align the key on the cable plug with the PIC connector.

b. Make sure the dust caps are removed from the cable ends.

c. If possible, try connecting a different cable.

7. Continue until all cables on a fabric are connected.
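In addition to checking the link-alive LEDs, you can confirm from SCF that the switch-to-switch ports have been enabled once the neighbor checks pass. A minimal sketch using the SANMAN display described later in this section (in the Switch Port Status section of the output, the TP column should show EN for ports 8 through 11):

>SCF STATUS SWITCH $ZZSMN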
Fallback for Merging Clusters to Create a Tri-Star Topology

Use one of the following procedures to fall back from merging clusters to create a tri-star topology:

• Fallback for Merging Three Star Topologies to Create a Tri-Star Topology on page 4-89
• Fallback for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology on page 4-89

Fallback for Merging Three Star Topologies to Create a Tri-Star Topology

This procedure:

• Starts with a tri-star topology using three cluster switches per fabric
• Aborts Expand lines between the three star groups of the cluster to prepare for splitting the tri-star topology. (A star group consists of an X and a Y switch and up to eight connected ServerNet nodes.)
• Disconnects the two-lane links to split the three star groups of the tri-star topology

1. Use the steps in Splitting a Large Cluster Into Multiple Smaller Clusters on page 6-11 to disconnect the two-lane links between the three star groups of the cluster.

2. If necessary, refer to Fallback for Upgrading Software to Obtain G06.14 Functionality on page 4-50.

Note. Unless your applications require it, you do not need to fall back to an earlier RVU on individual nodes. HP recommends using the T0569AAE or superseding firmware unless fallback is unavoidable. Each cluster can function independently as a subset of the tri-star topology, and cluster switches can be added online.

Fallback for Merging a Split-Star Topology and a Star Topology to Create a Tri-Star Topology

This procedure:

• Starts with a tri-star topology using three cluster switches per fabric.
• Aborts Expand lines for one star group of the cluster to prepare for leaving the tri-star topology. (A star group consists of an X and a Y switch and up to eight connected ServerNet nodes.)
• Downloads a split-star configuration, one fabric at a time, to the cluster switches in the other two star groups. For clusters using two cluster switches per fabric, the split-star configuration provides greater throughput than a subset of a tri-star topology, but the tri-star subset facilitates expansion to 24 nodes. For information about the topologies, refer to Planning for the Topology on page 2-8.
• Disconnects the two-lane links one fabric at a time.
• Connects four-lane links one fabric at a time to form the split-star topology.

Note. If you need to separate the star groups in a split-star or tri-star topology but do not need to change the topology or the configuration used by individual star groups, refer to Splitting a Large Cluster Into Multiple Smaller Clusters on page 6-11.

1. Identify the star group that will leave the cluster when the tri-star topology is disconnected. This star group can function as a subset of a tri-star topology using only one cluster switch per fabric. The other two star groups will function as a split-star topology when the tri-star topology is disconnected. You can use Table 4-22 on page 4-75 to record information about the star groups and nodes.

2. On the star group that will leave the cluster, abort the Expand-over-ServerNet lines:

a. On all nodes in the star group that will leave the cluster, abort the Expand-over-ServerNet lines to the nodes in the two star groups that will form the split-star topology. For example:

>SCF ABORT LINE $SC021

b. On all nodes that will populate the split-star topology, abort the Expand-over-ServerNet lines to the nodes in the star group that will leave the cluster.

c. Optionally, stop and delete the Expand-over-ServerNet line-handler processes.
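To identify the Expand-over-ServerNet lines that Steps 2a and 2b must abort, you can list the Expand lines active on each node by using the NCP display from the Section 5 quick reference. A minimal sketch; the line name $SC022 below, like $SC021 above, is only an example:

>SCF INFO PROCESS $NCP, LINESET
>SCF ABORT LINE $SC022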
Note. You do not need to download new firmware or a new configuration to the cluster switches for the star group that will leave the cluster. The star group can function as a subset of the tri-star topology. Even if you download a new configuration, HP recommends using T0569AAE (or superseding) firmware. If your applications require falling back to an earlier RVU, refer to Fallback for Upgrading Software to Obtain G06.14 Functionality on page 4-50.

3. On one of the star groups that will make up the split-star topology, use the TSM Configuration Update action to download a split-star configuration to the X-fabric cluster switch:

Note. Backward migration of the firmware is recommended only in exceptional cases. The T0569AAE firmware is backward compatible with all ServerNet cluster releases, supports all three topologies, and contains significant defect repair.

a. In the Management window, click the Cluster tab to view information about the ServerNet cluster.

b. In the tree pane, right-click the X-fabric Switch resource, and select Actions. The Actions dialog box appears.

c. In the Available Actions box, select Configuration Update, and click Perform action. The Update Switch guided procedure is launched.

d. In the Guided Procedure - Update Switch window, click Start. The Configuration Update dialog box appears.

e. In the Topology box, click the radio button to select the split-star topology.

f. In the Configuration tag field, select one of the following configuration tags:

• Max 16 nodes, nodes 1-8 (0x10000)
• Max 16 nodes, nodes 9-16 (0x10001)

g. Click Perform action.

Note. The configuration update temporarily disrupts ServerNet communication through the switch, but the external ServerNet Y fabric is still functional.

4. Repeat Step 3 on the X-fabric cluster switch for the other star group that will make up the split-star topology, but select a different configuration tag.

5. On all X-fabric cluster switches in the tri-star topology, disconnect the two-lane links (the switch-to-switch cables connected to ports 8 through 11). Do not disconnect the two-lane links for the Y-fabric cluster switches.

6. Connect the four-lane links between the X-fabric cluster switches that will be used for the split-star topology. (These are the cluster switches whose configuration tags you changed in Step 3 and Step 4.) For more information, refer to Connecting the Four-Lane Links on page 4-64.

7. Wait 4 to 5 minutes for the X fabric to come up, and then verify that the fabric is operational:

a. Use SCF to check for connectivity problems:

>SCF STATUS SUBNET $ZZSCL, PROBLEMS

b. If problems are indicated on a node, use SCF on that node to gather more information:

>SCF STATUS SUBNET $ZZSCL

c. Use SCF to verify that ports 8 through 11 are enabled:

>SCF STATUS SWITCH $ZZSMN

In the Switch Port Status section of the display, the TP (Target Port Enabled) column should show the value EN (Enabled) for ports 8 through 11.

d. Use the TSM Service Application to check for alarms on the X fabric. If you see alarms, wait several minutes to see if the alarms clear automatically.

8. When you have confirmed that the X fabric is operational, repeat Step 3 on one of the Y-fabric cluster switches that will be used for the split-star topology. You must change the configuration tags to support the split-star topology.

9. Repeat Step 3 on the other Y-fabric cluster switch that will be used for the split-star topology. You must change the configuration tags to support the split-star topology.

10. On all Y-fabric cluster switches in the tri-star topology, disconnect the two-lane links (the switch-to-switch cables connected to ports 8 through 11).

11. Connect the four-lane links between the Y-fabric cluster switches that will be used for the split-star topology. (These are the cluster switches whose configuration tags you changed in Step 8 and Step 9.) For more information, refer to Connecting the Four-Lane Links on page 4-64.

12. Wait 4 to 5 minutes for the Y fabric to come up, and then verify that the fabric is operational:

a. Use the SCF STATUS SUBNET $ZZSCL, PROBLEMS command to check for connectivity problems.

b. Use the SCF STATUS SWITCH $ZZSMN command to verify that the ports are enabled.
c. Use the TSM Service Application to check for alarms on the Y fabric.

13. Use the SCF PRIMARY PROCESS $ZZSMN command to refresh the TSM display.

14. If necessary, refer to Fallback for Upgrading Software to Obtain G06.14 Functionality on page 4-50.

Note. Unless your applications require it, you do not need to fall back to an earlier RVU on individual nodes. The TSM client and server software are backward compatible, and SNETMON (T0294AAG) and SANMAN (T0502AAG) support all topologies.
Reference Information

This section contains reference information for upgrading a ServerNet cluster:

• Considerations for Upgrading SANMAN and TSM on page 4-93
• Considerations for Upgrading SNETMON/MSGMON and the Operating System on page 4-95
• Updating the Firmware and Configuration on page 4-97
• Updating Service Processor (SP) Firmware on page 4-105
• Updating the Subsystem Control Facility (SCF) on page 4-105

Considerations for Upgrading SANMAN and TSM

The following considerations apply when you upgrade SANMAN and TSM:

• You can upgrade SANMAN and TSM online without disrupting ServerNet communication over the cluster. However, TSM cannot display information about the external fabrics while you are upgrading SANMAN.

Note. HP recommends that you upgrade SANMAN and TSM to their G06.14 or superseding versions on all nodes connected to a ServerNet cluster. Upgrading provides significant enhancements to ServerNet cluster manageability, including the ability to update the ServerNet II Switch firmware and configuration.

• The order in which you upgrade SANMAN and TSM is not critical, but HP recommends that you upgrade SANMAN first because it includes only a server component. TSM includes both client and server components.

• Different versions of SANMAN and TSM support different cluster sizes. Table 4-25 compares the functionality provided by various SANMAN and TSM combinations.

Table 4-25. SANMAN and TSM Considerations

These versions of
SANMAN and TSM      Support . . .                 Notes
G06.09 through      Up to eight nodes in a        These versions do not support commands to
G06.11              cluster                       update the ServerNet II Switch firmware and
                                                  configuration.
G06.12 and          Up to 16 nodes in a           These versions support commands to update the
G06.13              cluster                       switch firmware and configuration. In addition,
                                                  these versions are backward compatible with the
                                                  G06.09, G06.10, and G06.11 versions of the
                                                  operating system.
G06.14 and later    Up to 24 nodes in a           These versions are backward compatible with
                    cluster                       G06.09 and subsequent versions of the operating
                                                  system.

• Before the cluster switches are loaded with T0569AAB (G06.12) firmware and configuration files, all nodes connected to the switches must be running at least the G06.12 versions of SANMAN (T0502AAE) and the T7945AAW version of the TSM server software.

• Before the cluster switches are loaded with T0569AAE (G06.14) or T0569AAF (G06.16) firmware and configuration files, all nodes connected to the switches must be running at least the G06.14 versions of SANMAN (T0502AAG) and the T7945AAY version of the TSM server software.

• In addition to being included on the system update tape (SUT) for their respective RVUs, the SANMAN and TSM versions are available as SPRs. See Table 4-26.

Table 4-26. SANMAN and TSM SPRs

These versions of SANMAN and TSM   Are available as SPRs for use with . . .
G06.12                             G06.09, G06.10, and G06.11 servers
G06.14                             G06.13 servers
G06.16                             G06.13, G06.14, and G06.15 servers

• SANMAN is forward and backward compatible with the G06.09 through G06.16 RVUs. Table 4-27 lists the SANMAN versions, RVU compatibility, and supported topologies.

• The most current TSM client is usually backward compatible with all previous versions of the TSM server software and RVUs. For information about TSM compatibility, refer to the TSM Read Me.

Table 4-27. SANMAN and RVU Compatibility

SANMAN Version      Is compatible with RVUs . . .   Supported Topologies
T0502 (G06.09)      G06.09 through G06.16           Star
T0502AAA (G06.09)   G06.09 through G06.16           Star
T0502AAE (G06.12)   G06.09 through G06.16           Star and split-star
T0502AAG (G06.14)   G06.09 through G06.16           Star, split-star, and tri-star
T0502AAH (G06.16)   G06.09 through G06.16           Star, split-star, and tri-star
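To compare the SANMAN version a node is running against Table 4-27, you can display it from SCF. A hedged sketch; the exact command form can vary by RVU, so verify it against the SANMAN SCF documentation in this manual:

>SCF VERSION PROCESS $ZZSMN, DETAIL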
Considerations for Upgrading SNETMON/MSGMON and the Operating System

The following considerations apply when you upgrade SNETMON/MSGMON and the operating system:

• SNETMON/MSGMON and the operating system have supported up to 16 nodes since G06.09. This support allows ServerNet nodes running G06.09 or a later G-series RVU to coexist in a 16-node cluster. G06.13 with T0294AAG (or a superseding SPR) and later G-series RVUs (no SPRs required) support up to 24 nodes.

• A version dependency exists between SNETMON/MSGMON and the operating system running on the same node. Nodes running any compatible combination of SNETMON/MSGMON and the operating system can interoperate over the cluster. On a node running an incompatible combination, SNETMON/MSGMON abends with a version mismatch error. Version checks between SNETMON/MSGMON and the operating system are performed when the SNETMON/MSGMON process starts. Table 4-28 provides version and compatibility information for SNETMON/MSGMON and the operating system.

Table 4-28. SNETMON, MSGMON, and Operating System Version Compatibility

SNETMON/MSGMON Version   Is compatible with these operating system versions . . .
T0294G08 (G06.09)        G06.09 and G06.10
T0294AAA (G06.09)*       G06.09 and G06.10
T0294AAB (G06.11)        G06.11 and G06.12
T0294AAE (G06.12)        G06.12 through G06.14
T0294AAG (G06.14)        G06.12 through G06.14
T0294AAH (G06.16)        G06.12 through G06.16

*T0294AAA is a Class D (restricted) SPR.

• Operating system upgrades require a system load, but you can perform them at your convenience. You can upgrade SNETMON/MSGMON from any RVU. However, you must observe the version dependencies between SNETMON/MSGMON and the operating system. Upgrades of the operating system in individual nodes across the cluster are possible, but they require appropriate migration rules for other software components.

• If SNETMON/MSGMON abends due to a version mismatch error, it cannot start direct ServerNet connectivity between the local node and other remote nodes. However, this condition does not interfere with ServerNet connectivity between those other remote nodes.

• All nodes in a split-star topology containing 1-kilometer four-lane links must run G06.11 or a subsequent G-series RVU. The migration technique described in Upgrading Software With System Loads to Obtain G06.12 Functionality on page 4-25 is required for any users who migrate from G06.09 or G06.10 to G06.11 or G06.12 to achieve the longer distances provided by the split-star (16-node) topology. Changes in the G06.11 and G06.12 RVUs ensure safe usage of longer ServerNet cables. All nodes in a split-star topology containing 5-kilometer four-lane links must run G06.16 or a subsequent G-series RVU and contain only NSR-X or NSR-Y processors. Table 4-29 describes the relationship between the four-lane links and the RVU running on the nodes in a split-star topology.

Table 4-29. Length of Four-Lane Link and Operating System

If the split-star topology consists of . . .                    The four-lane link can be . . .
At least one G06.09 or G06.10 node                              Up to 80 meters
Nodes running G06.11 or a later G-series RVU, where at          Up to 1 kilometer
least one processor in the cluster is not processor type
NSR-X or NSR-Y
Nodes running G06.16 or a later G-series RVU, where all         Up to 5 kilometers
processors in all nodes are processor type NSR-X or NSR-Y

• The tri-star topology requires that all nodes run G06.14 (or a later G-series RVU) or G06.13 with SPRs. If all processors in the cluster are processor type NSR-X or NSR-Y, the two-lane links used in a tri-star topology can be up to 5 kilometers long. Otherwise, the two-lane links can be up to 1 kilometer long.
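If you suspect the version-mismatch abend described above, you can confirm whether the SNETMON process is running by using the Kernel subsystem STATUS PROCESS command from the Section 5 quick reference ($ZZKRN.#ZZSCL is the recommended generic process name used throughout this manual):

>SCF STATUS PROCESS $ZZKRN.#ZZSCL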
Updating the Firmware and Configuration

This subsection contains the following information:

• About the ServerNet II Switch Firmware and Configuration on page 4-97
• Firmware and Configuration File Names on page 4-98
• Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration on page 4-99
• Soft Reset and Hard Reset on page 4-101
• T0569AAA Firmware and Configuration Files on page 4-101
• Upgrading SANMAN Before Loading New Configuration on page 4-102
• Combinations of Firmware and Configuration Files on page 4-103

Caution. HP recommends that you have a spare cluster switch on site or have ready access to a spare cluster switch before starting any upgrade procedure that includes a firmware or configuration change.

About the ServerNet II Switch Firmware and Configuration

To understand the role of firmware and configuration in a ServerNet cluster, you must understand the ServerNet II Switch subcomponent of the cluster switch. The ServerNet II Switch contains:

• Router-2 ASIC
• Microprocessor
• Flash memory
• Random access memory (RAM)

Firmware is software that processes and replies to in-band control (IBC) manageability commands sent by the external ServerNet SAN manager process (SANMAN). The firmware configures the router-2 ASIC according to the configuration.

The configuration defines the ServerNet node numbers assigned by the cluster switch and the router-2 ASIC routing table. The configuration determines the position of the switch in the topology and the ServerNet node numbers that the switch supports. Table 4-30 shows the cluster switch positions and ServerNet node numbers.

Table 4-30. Cluster Switch Positions and ServerNet Node Numbers

Cluster Switch Position   Supported ServerNet Node Numbers
X1 and Y1                 1 through 8
X2 and Y2                 9 through 16
X3 and Y3                 17 through 24

The firmware and configuration are saved in flash memory in the ServerNet II Switch. Upon power on or a hard reset, the ServerNet II Switch starts running the firmware and configuration.

Firmware and Configuration File Names

Table 4-31 lists the firmware and configuration files provided on the SUT. These same files are available in the T0569AAB and T0569AAE SPRs.

Table 4-31. Firmware and Configuration File Names

Description                     File                      Notes
ServerNet II Switch             $SYSTEM.SYSnn.M6770       None.
firmware file
ServerNet II Switch             $SYSTEM.SYSnn.M6770CL     “CL” stands for Configuration Library.
configuration library file                                Depending on the configuration file
                                                          version, this file supports up to three
                                                          configurations (X1/Y1, X2/Y2, and X3/Y3).
ServerNet II Switch             $SYSTEM.SYSnn.M6770CR     “CR” stands for Configuration Reserved.
configuration reserved file                               This file is empty and is reserved for
                                                          future use.

Firmware and Configuration Compatibility With the NonStop Kernel

Before updating the firmware and configuration of an installed cluster switch, you must ensure that the ServerNet nodes connected to the cluster switch are running a version of the NonStop Kernel that is compatible with the new firmware and configuration.

Caution. All nodes attached to a cluster switch whose firmware and configuration will be updated with T0569AAB or T0569AAE must be running a version of the operating system that is compatible with the T0569 SPR to be downloaded. Any nodes that do not meet these requirements might experience permanent loss of Expand traffic across the cluster switch when the new firmware and configuration are loaded on the cluster switch.

Table 4-32 on page 4-99 describes the NonStop Kernel requirements for ServerNet nodes connected to cluster switches that will be updated.

Table 4-32. Firmware and Configuration Compatibility With the NonStop Kernel

If . . .                                            Then all nodes must be running one of . . .
The ServerNet II Switch will have its firmware      • G06.12 or a later G-series RVU
and configuration updated with T0569AAB             • G06.09, G06.10, or G06.11 with the required
                                                      release 2 SPRs listed in Table 4-6 on page 4-8

The ServerNet II Switch will have its firmware      • G06.14 or a later G-series RVU
updated with T0569AAE and its configuration         • A pre-G06.14 RVU with the required release 3
updated with one of the split-star configuration      SPRs listed in Table 4-6 on page 4-8
tags from T0569AAE:
• Max 16 nodes, nodes 1-8 (0x10000)
• Max 16 nodes, nodes 9-16 (0x10001)

The ServerNet II Switch will have its firmware      • G06.14 or a later G-series RVU
updated with T0569AAE and its configuration         • G06.13 with the required release 3 SPRs
updated with one of the tri-star configuration        listed in Table 4-6 on page 4-8
tags from T0569AAE:
• Max 24 nodes, nodes 1-8 (0x10002)
• Max 24 nodes, nodes 9-16 (0x10003)
• Max 24 nodes, nodes 17-24 (0x10004)

Using the TSM Service Application to Download the ServerNet II Switch Firmware or Configuration

You can use the Firmware Update action or Configuration Update action in the TSM Service Application to download firmware or configuration files to the cluster switch.
Note. It is possible to download firmware or configuration files using SCF commands. However, these procedures are reserved for service providers. (The SCF procedures are documented in the NonStop S-Series Service Provider Supplement.) Using SCF to download the firmware or configuration files can cause TSM alarms because TSM cannot detect the sensitive actions being performed in the cluster switch.

The Firmware Update and Configuration Update actions execute the Update Switch guided procedure. (There is no direct access to this guided procedure from the Start menu.) You can use the guided procedure to update either the firmware or the configuration, depending on the TSM action you used to start the guided procedure.

Caution. Upgrading the ServerNet II Switch firmware from T0569AAA or upgrading the configuration from any version causes a temporary disruption of ServerNet connectivity through the cluster switch. Before upgrading the firmware from T0569AAA or updating the configuration from any version, you must use the SCF STATUS SUBNET $ZZSCL command on all nodes to ensure that ServerNet interprocessor communication (IPC) connectivity is up on the other fabric between the nodes. If T0294AAG (or a superseding SPR) is applied, use the following command:

SCF STATUS SUBNET $ZZSCL, PROBLEMS

Detailed steps for using TSM to start the guided procedures follow. Use these steps only when instructed to do so by the procedures later in this section:

1. Using the TSM Service Application, log on to a node connected to the cluster switch to which you want to download firmware or a new configuration.

2. Click the Cluster tab.

3. Click the plus (+) sign next to the External_ServerNet_X_Fabric or the External_ServerNet_Y_Fabric to display the cluster switch.

4. Right-click the Switch resource, and select Actions. The Actions dialog box appears.

5. Do one of the following:

• To download firmware, choose the Firmware Update action.
• To download a configuration, choose the Configuration Update action.

6. Click Perform action to load the guided procedure for updating the switch.

7. When the guided procedures interface appears, click Start. Dialog boxes guide you through the update operation.

To prevent alarms during sensitive operations involving the cluster switch hardware or firmware, the TSM ServerNet cluster and cluster switch incident analysis (IA) software undergoes a three-minute rest period during:

• A firmware update action performed using TSM
• A configuration update action performed using TSM
• A hard reset action performed using TSM
• Cluster switch replacement using a guided procedure

In general, alarms are not created during the rest period. You can ignore any alarms that occur during these operations.

Note. TSM alarms are not suppressed when sensitive operations are performed on cluster switches using SCF commands.

Soft Reset and Hard Reset

You must perform a soft reset of the ServerNet II Switch subcomponent after a firmware download. After a configuration download, you must perform a hard reset. The guided procedure includes the soft reset and gives you the option to perform the hard reset function, as required. The following comparison describes the soft and hard reset functions:
Soft Reset
• Must be used after a firmware update.
• Restarts the firmware but not the router-2 ASIC.
• Does not disrupt pass-through ServerNet traffic.
• Is implemented automatically by the T0569AAA firmware after a firmware update. (This defect in the T0569AAA firmware has been corrected in T0569AAB.)

Hard Reset
• Must be used after a configuration update.
• Restarts the firmware and the router-2 ASIC.
• Is equivalent to a power-on reset.
• Disrupts pass-through ServerNet traffic.

T0569AAA Firmware and Configuration Files

The ServerNet II Switch firmware and configuration files provided with G06.09, G06.10, and G06.11 are empty. For this reason, HP strongly recommends that you apply SPR T0569AAA to all ServerNet nodes running G06.09 through G06.11. T0569AAA provides the firmware and configuration files (see Table 4-31 on page 4-98) equivalent to the files that HP preloaded on the cluster switches during shipments for G06.09 through G06.11. If T0569AAA is not present on all nodes and a problem occurs with a cluster switch during a software upgrade, you cannot fall back by downloading the original firmware and configuration files to the cluster switch.

The T0569AAA configuration (M6770CL) defines all switch ports as enabled by default for ServerNet pass-through data traffic. This behavior is acceptable for a star topology cluster supporting up to eight ServerNet nodes and using one cluster switch per fabric. However, enabling the ports by default is not allowed for the split-star or tri-star topologies. For these topologies, ports can be enabled only after neighborhood checks have been performed to validate that the cluster is correctly cabled and configured.

Before updating a cluster switch to newer firmware, you must apply the T0569AAA SPR. The procedures for upgrading software later in this section include this step.

Upgrading SANMAN Before Loading New Configuration

Beginning with the G06.12 configuration, all pass-through ServerNet data traffic is disabled by default on the ServerNet II Switch ports. The G06.12 and superseding versions of SANMAN enable switch ports after ensuring that the neighbor checks for the ports have passed. The neighbor-check logic is new functionality implemented by SANMAN in G06.12. Neighbor checking ensures that the cluster has been properly cabled, and the cluster switches properly configured, before pass-through ServerNet data traffic is enabled on ServerNet ports.

Previous versions of SANMAN do not implement the neighbor-check logic. Consequently, a node running a version of SANMAN earlier than G06.12 cannot enable ports on cluster switches running a G06.12 or later configuration. However, such a node can perform all switch manageability functions available in versions of SANMAN earlier than G06.12. Manageability traffic is always enabled on ServerNet II Switch ports, whether pass-through ServerNet data traffic is enabled or disabled.

A node running a version of SANMAN earlier than G06.12 does not know that it must enable ports on the nearest switches to allow regular (non-IBC) data traffic to and from the cluster to commence. The SNETMON/MSGMON running on that node cannot discover other remote nodes, and Expand-over-ServerNet lines for that node will not come up. To avoid this problem, you must upgrade SANMAN on all nodes to G06.12 or a later version before loading the new configuration on the switches.
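A quick way to confirm that a node's SANMAN can reach the cluster switches, and can therefore run the neighbor checks that enable the ports, is the connection display from the Section 5 quick reference. A minimal sketch:

>SCF STATUS CONN $ZZSMN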
The configuration running on the cluster switch defines the default state of the switch ports. The firmware starts the ports in the default state set by the configuration, as follows:

In this configuration . . .   Ports are . . .
T0569AAA                      Enabled by default
T0569AAB or superseding       Disabled by default

Combinations of Firmware and Configuration Files

For a ServerNet cluster production environment, the recommended combinations of firmware and configuration files are:

This combination           Is required for . . .      And recommended for . . .
T0569AAA firmware and      N/A                        A star topology having up to eight nodes
T0569AAA configuration                                when at least one of the nodes connected
                                                      to the cluster switches does not have
                                                      T0502AAA or a superseding SPR installed
T0569AAB firmware and      A split-star topology      All topologies when all of the nodes
T0569AAB configuration     having up to 16 nodes      connected to the cluster switch have at
                                                      least T0502AAA or a superseding SPR
                                                      installed
T0569AAE firmware and      A tri-star topology        All topologies if T0502AAG is available
T0569AAE configuration     having up to 24 nodes

However, other combinations are possible, particularly during migration procedures. For update or fallback operations, you must follow a required sequence when downloading any of the following:

• ServerNet II Switch firmware
• ServerNet II Switch configuration
• Both the firmware and the configuration

Upgrading T0569: Sequence for Downloading the ServerNet II Switch Firmware and Configuration

Table 4-33 shows the sequence for upgrading the T0569 firmware and configuration. Values shown in the table are relative. For example, if the running firmware is T0569AAA and the running configuration is T0569AAB, see the row for old firmware and a new configuration. Table 4-9 on page 4-12 shows how SCF and TSM display the T0569AAA and T0569AAB firmware and configuration.

Table 4-33. Upgrading T0569: Sequence for Downloading ServerNet II Switch Firmware and Configuration

Node Number     Currently Running in Cluster Switch
Range Change?   T0569 Firmware   T0569 Configuration   Use This Sequence
No              Old              Old                   1. Download the firmware.
                                                       2. Download the configuration.
                Old              New                   Download the firmware only.
                New              Old                   Download the configuration only.
                New              New                   No action needed.
Yes             Old              Old                   1. Download the firmware.
                                                       2. Download the configuration.
                Old              New                   1. Download the firmware.
                                                       2. Download the configuration.
                New              Old                   Download the configuration only.
                New              New                   Download the configuration only.
Falling Back to T0569: Sequence for Downloading the ServerNet II Switch Firmware and Configuration

Table 4-34 shows the sequence for falling back to an earlier version of the T0569 firmware and configuration. Values shown in the table are relative. For example, if the running firmware is T0569AAB and the running configuration is T0569AAA, see the row for new firmware and an old configuration. Table 4-9 on page 4-12 shows how SCF and TSM display the T0569AAA and T0569AAB firmware and configuration.

Table 4-34. Falling Back T0569: Sequence for Downloading ServerNet II Switch Firmware and Configuration

Node Number     Currently Running in Cluster Switch
Range Change?   T0569 Firmware   T0569 Configuration   Use This Sequence
No              New              New                   1. Download the configuration.
                                                       2. Download the firmware.
                New              Old                   Download the firmware only.
                Old              New                   Download the configuration only.
                Old              Old                   No action needed.
Yes             New              New                   1. Download the configuration.
                                                       2. Download the firmware.
                New              Old                   Download the firmware only.
                Old              New                   1. Download the firmware (to upgrade it to AAB).
                                                       2. Download the configuration (to fall back to AAA).
                                                       3. Download the firmware (to fall back to AAA).
                Old              Old                   No action needed.

Updating Service Processor (SP) Firmware

For defect repair, HP recommends that you update the service processor (SP) firmware on all nodes to T1089ABC or a superseding SPR.

Updating the Subsystem Control Facility (SCF)

In order to use the online help text provided with the new SNETMON and SANMAN SCF commands, you must apply T9082ACQ or a superseding SPR.

Part III. Operations and Management

This part contains the following sections:

• Section 5, Managing a ServerNet Cluster
• Section 6, Adding or Removing a Node
• Section 7, Troubleshooting and Replacement Procedures

5 Managing a ServerNet Cluster

This section describes how to monitor and control a ServerNet cluster. This section contains two subsections:

Heading            Page
Monitoring Tasks   5-1
Control Tasks      5-26

Monitoring Tasks

Monitoring tasks allow you to check the general health of the ServerNet cluster. These tasks include:

• Displaying Status Information Using the TSM Service Application on page 5-1
• Running SCF Remotely on page 5-12
• Displaying Information About the ServerNet Cluster Monitor Process (SNETMON) on page 5-13
• Checking the Status of SNETMON on page 5-14
• Checking the Status of the ServerNet Cluster Subsystem on page 5-16
• Checking ServerNet Cluster Connections on page 5-17
• Checking the Version of the ServerNet Cluster Subsystem on page 5-17
• Generating Statistics on page 5-18
• Monitoring Expand-Over-ServerNet Line-Handler Processes on page 5-20
• Monitoring Expand-Over-ServerNet Lines and Paths on page 5-20

Note. You can use OSM instead of TSM for any of the procedures described in this manual. For information on using OSM instead of TSM, see Appendix H, Using OSM to Manage the Star Topologies.
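Several of these tasks reduce to a short SCF session. As a quick health check, the following minimal sketch strings together three of the commands documented in this section; run it from a TACL prompt on the node you want to examine:

>SCF STATUS PROCESS $ZZKRN.#ZZSCL
>SCF STATUS SUBSYS $ZZSCL
>SCF STATUS SUBNET $ZZSCL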
Displaying Status Information Using the TSM Service Application

You can use the TSM Service Application to gain information about the MSEBs, the local node, the remote nodes, the external X and Y fabrics, and the ServerNet II Switches in the ServerNet cluster. In the split-star and tri-star topologies, a cluster switch appears as a remote switch in the TSM Service Application if you are logged on to a server that is not directly connected to the switch. The TSM Service Application obtains information about the local node and the fabrics from both SANMAN and SNETMON. Information about the remote nodes is obtained only from SNETMON.

To obtain ServerNet cluster-related information using the TSM Service Application:

1. Using a system console attached to any functioning node in the ServerNet cluster, log on to the TSM Service Application. (For details about logging on, refer to Appendix F, Common System Operations.) The Management Window appears, as shown in Figure 5-1.

[Figure 5-1. TSM Management Window]

2. In the tree pane, click the GRP-1 resource and the GRP-1.MOD-1 resource to display all of the components in the group 01 processor enclosure.

3. Click the MSEB resource to select it.

4. In the details pane, click the Attributes tab. Figure 5-2 shows the attributes for this resource.

[Figure 5-2. Attributes for the MSEB]

5. In the tree pane, click the plus (+) sign next to the MSEB resource so that you can see each subcomponent PIC.

6. In the tree pane, click the PIC for port 6 to select it.

7. In the details pane, click the Attributes tab. Figure 5-3 shows the attributes for this resource.

[Figure 5-3. Attributes for the MSEB PIC]

8. Click the Cluster tab. The tree pane displays the ServerNet Cluster resource preselected with the high-level cluster resources below it. See Figure 5-4.

Note. The Cluster tab appears in the Management Window if the external ServerNet SAN manager process (SANMAN) can communicate with at least one of the cluster switches. The Cluster tab does not appear if SANMAN cannot communicate with any cluster switches.

[Figure 5-4. Tree Pane in the TSM Service Application]

9. In the details pane, click the Attributes tab. Figure 5-5 shows the attributes for the ServerNet Cluster resource.

[Figure 5-5. Attributes for the ServerNet Cluster Resource]

10. In the tree pane, click the local node to select it. Figure 5-6 shows the attributes for this resource.

[Figure 5-6. Attributes for the Local Node]

11. In the tree pane, click a remote node to select it. Figure 5-7 shows the attributes for this resource.

[Figure 5-7. Attributes for the Remote Node Resource]

12. In the tree pane, click either the External_ServerNet_X_Fabric or the External_ServerNet_Y_Fabric to select it. Figure 5-8 shows the attributes for this resource.

[Figure 5-8. Attributes for the External Fabric Resource]

13. In the tree pane, click the plus sign (+) next to the external fabric object to expand it, and then click the switch resource to select it. Figure 5-9 shows the attributes for this resource.

[Figure 5-9. Attributes for the Switch Resource]

14. In the tree pane, click the plus sign (+) next to the switch resource to expand it, and then click any switch-to-node link (for example, Y_PIC_1_To_\name) to select it. The switch-to-node link represents the connection between a cluster switch and an MSEB. Figure 5-10 shows the attributes for this resource.

[Figure 5-10. Attributes for the Switch-to-Node Link]

15. In the tree pane, click any switch-to-switch link (for example, X_PIC_8_To_Switch_X_GUID_V0XY1U).
Running SCF Remotely

The Subsystem Control Facility (SCF) provides commands that display general information about the ServerNet cluster subsystem. Because the view of a ServerNet cluster can change significantly from one node to another, you should gather data at each node by using SCF and the TSM client software, and then compare the information.

In the case of SCF, you can compare information by logging on to a remote TACL process and then running SCF at the remote node. In the case of the TSM client software, you can compare information by logging on to the system console for each node in the cluster. Or, if a TSM dedicated LAN or a public LAN is available that links multiple nodes, you can log on to any member node from the same system console.

Quick Reference: SCF Commands for Monitoring a ServerNet Cluster

Table 5-1 lists SCF commands that can be used to gather information about a ServerNet cluster.

Note. For SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies, see Appendix I, SCF Changes at G06.21.

Table 5-1. SCF Commands for Monitoring a ServerNet Cluster

Use this SCF command . . .   To . . .                                                             See page
INFO PROCESS                 Display the attribute values for SNETMON, SANMAN, and MSGMON.        5-14
LISTDEV                      Display the SNETMON logical device (LDEV) number, name, and          5-14
                             device type.
STATUS PROCESS               Check the status of SNETMON, SANMAN, and MSGMON.                     5-15
STATUS SUBSYS $ZZSCL         Display status information about the ServerNet cluster subsystem.    5-16
STATUS SUBNET $ZZSCL         Check connections in a ServerNet cluster.                            5-17
INFO SUBSYS $ZZSCL           Display start-state and command-state information about the          5-16
                             ServerNet cluster subsystem.
VERSION SUBSYS, DETAIL       Display version information about the ServerNet cluster subsystem.   5-17
STATUS CONN $ZZSMN           Display dynamic status information about external fabric             9-19
                             connections.
STATUS DEVICE                Obtain information about the state of a line-handler process.        5-20
STATUS LINE, DETAIL          Check the status for the Expand-over-ServerNet line.                 5-22
STATUS PATH, DETAIL          Display detailed information about the path.                         5-23
STATUS SWITCH $ZZSMN         Display dynamic status information about the cluster switches on     9-25
                             both external fabrics.
STATS LINE                   Display statistical information about the Expand-over-ServerNet      5-23
                             line.
STATS PATH                   Display statistical information about the path.                      5-24
INFO CONN $ZZSMN             Display information about the condition of both external fabric      9-6
                             connections to the cluster switches.
INFO LINE, DETAIL            Check operational information for an Expand-over-ServerNet line.     5-24
INFO PATH, DETAIL            Display detailed information about the current or default            5-25
                             attribute values for the path.
INFO PROCESS $NCP, LINESET   Display the status of all of the Expand lines that are currently     5-25
                             active on the system.
INFO PROCESS $NCP, NETMAP    Display the status of the network as seen from a specific system.    5-26
INFO SWITCH $ZZSMN           Display information about the cluster switches on both external      9-9
                             fabrics.
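Several of these commands are often run in sequence as a quick health check. The following sketch assumes SNETMON is configured with the recommended process names used throughout this section; the output of each command is shown in the examples later in this section:

> LISTDEV $ZZSCL
> STATUS SUBSYS $ZZSCL
> STATUS SUBNET $ZZSCL

If all three commands succeed, and STATUS SUBNET shows the expected remote nodes, SNETMON is running and the node can see the rest of the cluster.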
Displaying Information About the ServerNet Cluster Monitor Process (SNETMON)

You can use the SCF interface to the Kernel subsystem and the TSM client software to display information about the ServerNet cluster subsystem components.

Note. In this section, the ServerNet cluster monitor process ($ZZSCL) is referred to by its abbreviation, SNETMON. The symbolic name for the SNETMON generic process is assumed to be ZZSCL, which is the recommended symbolic name.

Using SCF Commands for the Kernel Subsystem

You can use the Kernel subsystem SCF INFO PROCESS command to display configuration information for the ServerNet cluster monitor process (the recommended generic process name is $ZZKRN.#ZZSCL). This process is also referred to as SNETMON. Example 5-1 shows an SCF INFO PROCESS command and its output.

Example 5-1. INFO PROCESS Command

> INFO PROCESS $ZZKRN.#ZZSCL
NONSTOP KERNEL - Info PROCESS \MINDEN.$ZZKRN.#ZZSCL
Symbolic Name   *Name     *Autorestart   *Program
ZZSCL           $ZZSCL    10             $SYSTEM.SYSTEM.SNETMON

For more information about the SCF INFO PROCESS command, refer to the SCF Reference Manual for the Kernel Subsystem.

You can also use the SCF LISTDEV command to display the SNETMON logical device (LDEV) number, name, and device type. Example 5-2 shows an SCF LISTDEV command and its output.

Example 5-2. LISTDEV Command

> LISTDEV $ZZSCL
LDev  Name     PPID    BPID    Type    Rsize  Pri  Program
66    $ZZSCL   5,284   6,271   (64,0)  132    199  \KENO.$SYSTEM.SYS01.SNETMON

For more information about the SCF LISTDEV command, refer to the SCF Reference Manual for G-Series Releases.
Checking the Status of SNETMON

You can use the TSM Service Application or SCF to check the status of SNETMON.

Using TSM to Check the SNETMON Status

1. Log on using the TSM Service Application.
2. Click the Cluster tab to view information about the cluster.
3. Select the ServerNet Cluster resource.
4. In the Details pane, click the Attributes tab.
5. Check the SNETMON Process State attribute. See Figure 5-16.

Figure 5-16. SNETMON Status Displayed by TSM Service Application

Using SCF to Check the SNETMON Status

You can use the Kernel subsystem SCF STATUS PROCESS command to check the status of SNETMON. Example 5-3 shows an SCF STATUS PROCESS command and its output.

Example 5-3. STATUS PROCESS Command

> STATUS PROCESS $ZZKRN.#ZZSCL
NONSTOP KERNEL - Status Process \MINDEN.$ZZKRN.#ZZSCL
Symbolic                            Primary  Backup  Owner
Name      Name     State      Sub   PID      PID     ID
ZZSCL     $ZZSCL   STARTED          0, 11    1, 11   255,255

For more information about the SCF STATUS PROCESS command, refer to the SCF Reference Manual for the Kernel Subsystem.

Checking the Status of the ServerNet Cluster Subsystem

You can use the TSM Service Application or SCF to check the status of the ServerNet cluster subsystem.

Using TSM to Check the ServerNet Cluster Subsystem Status

1. Log on using the TSM Service Application.
2. Click the Cluster tab to view information about the cluster.
3. Select the ServerNet Cluster resource.
4. In the Details pane, click the Attributes tab.
5. Check the ServerNet Cluster State attribute. See Figure 5-16 on page 5-15.

Using SCF to Check the ServerNet Cluster Subsystem Status

You can use the ServerNet cluster subsystem SCF STATUS command to display status information about the ServerNet cluster subsystem. Example 5-4 shows the information returned by the STATUS command.

Example 5-4. STATUS SUBSYS $ZZSCL Command

> STATUS SUBSYS $ZZSCL
Servernet Cluster - Status SUBSYS \MINDEN.$ZZSCL
Subsystem....... STARTED

For more information about the STATUS command, refer to Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem.

You can also use the Kernel subsystem INFO SUBSYS command to display start-state and command-state information about the ServerNet cluster subsystem. Example 5-5 shows the information returned by the INFO SUBSYS command.

Example 5-5. INFO SUBSYS $ZZSCL Command

> INFO SUBSYS $ZZSCL
ServerNet Cluster - Info SUBSYS \TOYS.$ZZSCL
Start state..... STARTED
Command state... STARTED
Command time.... 06 Sep 2001, 9:43:18.936

Checking ServerNet Cluster Connections

You can use the SCF STATUS SUBNET command to check connections to other systems in the ServerNet cluster. Example 5-6 shows the output of a STATUS SUBNET command.

Example 5-6. STATUS SUBNET $ZZSCL Command

> STATUS SUBNET $ZZSCL
SNETMON              Remote ServerNet           SNETMON
    SysName  Num  0<--CPU States-->15   LocLH RemLH  SCL   EXPAND  State,Cause
Node-------------------------------------------------------------------------
 1| RMT \SCQA6 254 1111,0000,0000,0000   UP 119  UP   CONN  CONN             |
 2| LCL \TROLL 253 1111,1p11,1111,1111   .  .    .    .     .      STRD,UNKN |
 3| RMT \MS9   206 1111,0000,0000,0000   UP 125  UP   CONN  CONN             |
 4| RMT \MS10  207 1111,0000,0000,0000   UP 124  UP   CONN  CONN             |
 5| RMT \MS11  208 1111,0000,0000,0000   UP 123  UP   CONN  CONN             |
 6| RMT \MS12  209 XXY1,0000,0000,0000   .  .    .    .     .                |
 7| RMT \MS13  210 1111,0000,0000,0000   UP 121  UP   CONN  CONN             |
 8| RMT \MS14  211 P111,0000,0000,0000   UP 120  UP   CONN  CONN             |
 9|                                                                          |
10|                                                                          |
11|                                                                          |
12|                                                                          |
13|                                                                          |
14|                                                                          |
15|                                                                          |
16|                                                                          |
------------------------------------------------------------------------------
For more details on the output of this command, see the STATUS SUBNET Command Example on page 8-11.

You can also use the SCF STATUS SERVERNET command to check connections between processors within a system. For detailed examples, see Using SCF to Check the Internal ServerNet Fabrics on page 7-27.

Checking the Version of the ServerNet Cluster Subsystem

You can use the ServerNet cluster subsystem SCF VERSION command to display version information about the ServerNet cluster subsystem. Example 5-7 shows the information returned by the VERSION, DETAIL command.

Example 5-7. VERSION SUBSYS, DETAIL Command

> VERSION SUBSYS $ZZSCL, DETAIL
Detailed VERSION SUBSYS \SYS.$ZZSCL
SYSTEM \SYS
SCL - T0294G08 - (01JUL01) - AAG
GUARDIAN - T9050 - (Q06)
SCF KERNEL - T9082G02 - (14JAN02) (03JAN02)
SCL PM - T0294G08 - (01JUL01) - AAG

For more information about the VERSION command, refer to Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem.

Generating Statistics

Each processor in every node of a ServerNet cluster keeps a set of statistical counters for each node in the cluster, including the local node. In addition, each processor keeps a set of generic counters that are not associated with any particular node.

Each hour, SNETMON causes the statistical counters in each processor to be sent to the service log ($ZLOG). SNETMON does this by sending a request to the MSGMON process that resides in each processor. The MSGMON process sends the statistics data to the service log in the form of an event: the ZSCL-EVT-STATISTICS: Node Statistics (1200) event. This event provides a snapshot of message-system error counters for the ServerNet cluster subsystem. The statistical counters are reset after the statistics data is recorded.

Using the TSM EMS Event Viewer Application, you can view Node Statistics events as they are logged hourly. Or you can use the TSM Service Application to generate these events on demand for the server where you are logged on.

Note. Statistics should be interpreted only by a service provider trained by HP.

Use the following procedure to generate and view the statistics event for a node:

1. Right-click the local or remote node, and select Actions. The Actions dialog box appears. Figure 5-17 shows the Actions dialog box.

Figure 5-17. Generate ServerNet Statistics Action

2. From the Actions list, click Generate ServerNet Statistics.
3. Click Perform action. The Action Status window shows the progress of the action.
4. Click Close to close the Actions dialog box.
5. Use the TSM EMS Event Viewer Application to check the statistics event for the system you are validating:
   a. Start the event viewer and log on as described in Appendix F, Common System Operations. Be sure to include the service log ($ZLOG) in the event source criteria.
   b. Find the Node Statistics (1200) event.
   c. Double-click the event to show the tokens and their descriptions. Or select the event, and then choose Detail from the Display menu. The Event Viewer displays the Event Detail window for the selected event.

For more information about the Node Statistics (1200) event, see the Operator Messages Manual.

Note. The statistical counters are reset automatically each hour. You can reset the counters manually at any time by using the Reset ServerNet Statistics action for the ServerNet Cluster resource. Resetting the counters also produces a series of Node Statistics (1200) events.

Monitoring Expand-Over-ServerNet Line-Handler Processes

You can use the WAN subsystem SCF STATUS DEVICE command to obtain information about the state of a line-handler process. Example 5-8 shows a line-handler process in the STARTED state.

Example 5-8. STATUS DEVICE Command Showing STARTED Line-Handler Process

> STATUS DEVICE $ZZWAN.#SC001
WAN Manager STATUS DEVICE for DEVICE \BUZZ.$ZZWAN.#SC001
State : ........... STARTED
LDEV number ....... 97
PPIN............... 0 ,19      BPIN............. 1 ,15

Example 5-9 shows a line-handler process in the STOPPED state.

Example 5-9. STATUS DEVICE Command Showing STOPPED Line-Handler Process

> STATUS DEVICE $ZZWAN.#SC003
WAN Manager STATUS DEVICE for DEVICE \BUZZ.$ZZWAN.#SC003
State : ........... STOPPED

Monitoring Expand-Over-ServerNet Lines and Paths

You can use the following tools to obtain detailed information about Expand-over-ServerNet lines and paths:

• TSM Service Application
• Guided procedure for configuring a ServerNet node
• Expand subsystem SCF commands:
  ° STATUS LINE, DETAIL Command on page 5-22
  ° STATUS PATH, DETAIL Command on page 5-23
  ° STATS LINE Command on page 5-23
  ° STATS PATH Command on page 5-24
  ° INFO LINE, DETAIL Command on page 5-24
  ° INFO PATH, DETAIL Command on page 5-25
  ° INFO PROCESS $NCP, LINESET Command on page 5-25
  ° INFO PROCESS $NCP, NETMAP Command on page 5-26

For more information about using SCF commands for the Expand subsystem, refer to the Expand Configuration and Management Manual.

Using TSM to Monitor Expand-Over-ServerNet Lines

1. Log on using the TSM Service Application.
2. Click the Cluster tab to view information about the cluster.
3. For information about the line configured from the local node to a specific remote node, click the Remote Node resource.
4. In the Details pane, click the Attributes tab.
5. Check the Expand/ServerNet Line LDEV State attribute. See Figure 5-18.

Figure 5-18. Checking the Expand-Over-ServerNet Lines Using TSM

Using the Guided Procedure to Monitor Expand-Over-ServerNet Lines

If you run the guided procedure for configuring a ServerNet node on a system that has already been added to a ServerNet cluster, the procedure quickly cycles through its list of software checks and displays the ServerNet Cluster Connection Status dialog box. See Figure 5-19. This dialog box shows the currently configured Expand-over-ServerNet lines.

To run the guided procedure, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node.
Online help is available to assist you in interpreting the dialog boxes.

Figure 5-19. ServerNet Cluster Connection Status Dialog Box

Using the SCF STATUS LINE, DETAIL Command

Use the STATUS LINE, DETAIL command to check the status for the Expand-over-ServerNet line. Example 5-10 shows an SCF STATUS LINE, DETAIL command and output for an Expand-over-ServerNet line named $SC003.

Example 5-10. STATUS LINE, DETAIL Command

> STATUS LINE $SC003, DETAIL
EXPAND Detailed Status LINE $SC003
PPID............. ( 0, 19)      BPID............. ( 1, 16)
State............ STOPPED       Path LDEV........ 115
Trace Status..... OFF           Effective line priority 1
Detailed State... DOWN          Status........... Err 66
Detailed Info.... None

Using the SCF STATUS PATH, DETAIL Command

Use the STATUS PATH, DETAIL command to display detailed information about the path. Example 5-11 shows this command.

Example 5-11. STATUS PATH, DETAIL Command

> STATUS PATH $SC003, DETAIL
EXPAND Detailed Status PATH $SC003
PPID......... ( 0, 23)          BPID............. ( 1, 19)
State........ STARTED           Number of lines.. 1
Trace Status. OFF               Superpath........ OFF
Line LDEVs... 126

Using the SCF STATS LINE Command

Use the STATS LINE command to display statistical information about the Expand-over-ServerNet line. Example 5-12 shows this command.

Example 5-12. STATS LINE Command

> STATS LINE $SC003
EXPAND Stats LINE $SC003, PPID ( 0, 22), BPID ( 1, 18)
Resettime... MAY 12,2000 12:15:59    Sampletime... MAY 17,2000 17:36:02

         MsgSent   RepRecv   MsgTout   ErrRecv   LastErr
Bind           2         2         0         0         0
Aconn          4         4         0         3       140
Pconn         10         8         2         8       201
Query      14565     14565         0         0         0
Disc           0         0         0         0         0

         MsgSent   RepRecv   MsgTout   ErrRecv   LastErr
Unbind         1         1         0         1       201
Data      178016    178016         0         1       201

         MsgRecv   RepSent   ErrSent   LastErr
Notif          0         0         0         0
Data      221051    221051         0         0

Proc lookup failures    0      Inactivity timeouts    0

Using the SCF STATS PATH Command

Use the STATS PATH command to display statistical information about the path. Example 5-13 shows a partial listing for this command.

Example 5-13. STATS PATH Command

> STATS PATH $SC004
EXPAND Stats PATH $SC004, PPID ( 0, 20), BPID ( 1, 18)
Reset Time.... JUL 28,2000 07:25:28    Sample Time.. JUL 30,2000 13:30:33
Current Pool Pages Used        89      Curr OOS Used in Words     0
Max Pool Pages Used            89      Max OOS Used in Words      0
Pool Size in Pages           4177      Number of OOS Timeouts     0
Total Number of Pool Fails      0      Number of Known System     0
------------------------
LEVEL 4 MESSAGE HISTOGRAM
------------------------
<= 64   .. 1300    <= 128  .. 0    <= 256  .. 0    <= 512  .. 0
<= 1024 ..    0    <= 2048 .. 0    <= 4096 .. 0    >  4096 .. 0
------------------------
LEVEL 4 / LEVEL 3
                                    --Average--      --Average--
      Packets   Forwards   Links   Packets/Frame    Bytes/Frame
Sent    12109          0       0             1.0             28
Rcvd     1409          0     650             1.0             68
L4 Packets Discarded......... 0
------------------------
LEVEL 4 / LEVEL 3 DETAIL
------------------------
More text? ([Y],N)

Using the SCF INFO LINE, DETAIL Command

Use the INFO LINE, DETAIL command to check operational information for an Expand-over-ServerNet line. Example 5-14 shows an SCF INFO LINE, DETAIL command and output for an Expand-over-ServerNet line named $SC002.
Example 5-14. INFO LINE, DETAIL Command

> INFO LINE $SC002, DETAIL
EXPAND Detailed Info LINE $SC002 (LDEV 186)
L2Protocol...... Net^Nam        TimeFactor....... 10K
*SpeedK......... SNET           Framesize........ 132
-Rsize..........                -Speed...........
*LinePriority... 1              StartUp.......... OFF
Delay........... 0:00:00.10     *Rxwindow........ 7
*Timerbind...... 0:01:00.00     *L2Timeout....... 0:00:01.00
*Txwindow....... 7              *Maxreconnects... 0
*AfterMaxRetries PASSIVE        *Timerreconnect.. 0:01:00.00
*Retryprobe..... 10             *Timerprobe...... 0:00:30.00
*Associatedev... $ZZSCL         *Associatesubdev.
*Timerinactivity 0:00:00.00     *ConnectType..... ACTIVEANDPASSIVE

Using the SCF INFO PATH, DETAIL Command

Use the INFO PATH, DETAIL command to display detailed information about the current or default attribute values for the path. Example 5-15 shows this command.

Example 5-15. INFO PATH, DETAIL Command

> INFO PATH $SC043, DETAIL
EXPAND Detailed Info PATH $SC043
*Compress.... OFF           *Nextsys....... #43    *OSspace.... 32767
*OStimeout... 0:00:03:00    *L4Retries..... 3      OldTF....... 32767
*L4Timeout... 0:00:20:00    *L4SendWindow.. 254    TimeFactor.. inf
L4ExtPackets. ON            *L4CongCtrl.... OFF    *Superpath.. OFF

                    Local   Remote   Negotiated   Maximum
*PathBlockBytes       0        0          0          0
*PathPacketBytes   4095     4095       4095       4095

Using the SCF INFO PROCESS $NCP, LINESET Command

Use the INFO PROCESS $NCP, LINESET command to display the status of all the Expand lines that are currently active on the system. Example 5-16 shows this command.

Example 5-16. INFO PROCESS $NCP, LINESET Command

> INFO PROCESS $NCP, LINESET
EXPAND Info PROCESS $NCP, LINESET (3) #LINESETS=3
LINESETS AT \WOODY    TIME: JUL 29,2000 18:04:09
LINESET  NEIGHBOR      LDEV  TF   PID       LINE  LDEV  STATUS     FileErr#
1        \BUZZ (004)   114   --   ( 0, 20)  1     114   NOT READY  (140)
2        \PINK (043)   113   10K  ( 0, 21)  1     113   READY
3        \TOYS (001)   115   10K            1     115   READY

Using the SCF INFO PROCESS $NCP, NETMAP Command

Use the INFO PROCESS $NCP, NETMAP command to display the status of the network as seen from a specific system. Example 5-17 shows this command.
Example 5-17. INFO PROCESS $NCP, NETMAP Command

> INFO PROCESS $NCP, NETMAP
EXPAND Info PROCESS $NCP, NETMAP (1) #LINESETS=3
NETMAP AT \TOYS    TIME: JUL 30,2000 13:19:54

SYSTEM        TIME (DISTANCE) BY PATH
 3 \WOODY     10K(01)*   inf(--)   20K(02)
 4 \BUZZ      inf(--)    inf(--)   inf(--)
43 \PINK      20K(02)    inf(--)   10K(01)*
---------------------------------------------------------------
LINESETS AT \TOYS    (1) #LINESETS=3
LINESET  NEIGHBOR       LDEV  TF   PID       LINE  STATUS     FileErr#
1        \WOODY (003)   126   10K  ( 0, 23)  1     READY
2        \BUZZ (004)    125   --             1     NOT READY  (140)
3        \PINK (043)    124   10K  ( 0, 21)  1     READY

Control Tasks

Control tasks include:

• Starting the Message Monitor Process (MSGMON) on page 5-28
• Aborting the Message Monitor Process (MSGMON) on page 5-28
• Starting the External ServerNet SAN Manager Process (SANMAN) on page 5-29
• Aborting the External ServerNet SAN Manager Process (SANMAN) on page 5-29
• Restarting the External ServerNet SAN Manager Process (SANMAN) on page 5-30
• Starting the ServerNet Cluster Monitor Process (SNETMON) on page 5-30
• Starting ServerNet Cluster Services on page 5-31
• When a System Joins a ServerNet Cluster on page 5-31
• Stopping ServerNet Cluster Services on page 5-33
• Switching the SNETMON or SANMAN Primary and Backup Processes on page 5-34

Quick Reference: SCF Commands for Controlling a ServerNet Cluster

Table 5-2 lists SCF commands that can be used to control components of a ServerNet cluster.

Table 5-2. SCF Commands for Controlling a ServerNet Cluster

Use this SCF command . . .       To . . .                             See page
START PROCESS $ZZKRN.#MSGMON     Start MSGMON                         5-28
ABORT PROCESS $ZZKRN.#MSGMON     Abort MSGMON                         5-28
START PROCESS $ZZKRN.#ZZSMN      Start SANMAN                         5-29
ABORT PROCESS $ZZKRN.#ZZSMN      Abort SANMAN                         5-29
START PROCESS $ZZKRN.#ZZSCL      Start SNETMON                        5-30
ABORT PROCESS $ZZKRN.#ZZSCL      Stop SNETMON                         5-30
START SUBSYS $ZZSCL              Start ServerNet cluster services     5-31
STOP SUBSYS $ZZSCL               Stop ServerNet cluster services      5-33
PRIMARY PROCESS $ZZSCL           Switch the SNETMON primary process   5-35
PRIMARY PROCESS $ZZSMN           Switch the SANMAN primary process    5-35

SCF Objects for Managing a ServerNet Cluster

Before using commands to control ServerNet cluster operations, be sure to review the objects that represent components of the ServerNet cluster subsystem:

• Kernel subsystem PROCESS objects for the following processes:
  ° ServerNet cluster monitor process (SNETMON)
  ° Message monitor process (MSGMON)
  ° External ServerNet SAN manager process (SANMAN)
• ServerNet cluster subsystem (SCL) SUBSYS object

The Kernel Subsystem PROCESS Objects

The ServerNet cluster monitor processes are represented by Kernel subsystem PROCESS objects. SCF commands for configuring, starting, stopping, and displaying information about PROCESS objects are described in the SCF Reference Manual for the Kernel Subsystem.

The SCL Subsystem SUBSYS Object

The SCL subsystem SUBSYS object represents the way in which the ServerNet cluster monitor process starts ServerNet cluster services and joins systems to the ServerNet cluster. SCF commands for configuring, starting, stopping, and displaying information about the SCL subsystem SUBSYS object are described in Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem.
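Before issuing any control commands, it can be useful to capture the current state of both kinds of objects. A minimal sketch, assuming the recommended process names from Section 3 (the output formats appear in the monitoring examples earlier in this section):

> STATUS PROCESS $ZZKRN.#ZZSCL
> STATUS PROCESS $ZZKRN.#ZZSMN
> STATUS PROCESS $ZZKRN.#MSGMON
> STATUS SUBSYS $ZZSCL

Comparing the PROCESS states with the SUBSYS state tells you whether a problem lies with a stopped process or with ServerNet cluster services themselves.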
Starting the Message Monitor Process (MSGMON)

Adding the message monitor process (MSGMON) to the configuration database is described in Section 3, Installing and Configuring a ServerNet Cluster. To run MSGMON after adding it but before the next system load, or after stopping it by using the SCF ABORT PROCESS $ZZKRN.#MSGMON command, use the SCF START command:

> START PROCESS $ZZKRN.#MSGMON

Note. In this section, the message-system monitor process ($ZIMnn) is referred to as MSGMON. The symbolic name for the MSGMON generic process is assumed to be MSGMON, which also is the recommended symbolic name.

Aborting the Message Monitor Process (MSGMON)

To abort MSGMON on the local system, use the SCF ABORT PROCESS command:

> ABORT PROCESS $ZZKRN.#MSGMON

Note. Normally, you should not need to abort MSGMON—even if a system is no longer a member of a ServerNet cluster. There are two cases in which you might have to abort MSGMON:

• During installation of a T0294 SPR that includes a new version of MSGMON. In this case, follow the installation instructions in the SOFTDOC for that SPR.
• Before you can alter one or more MSGMON process configuration attributes via SCF ALTER PROCESS $ZZKRN.#MSGMON. Normally, you should not need to alter any of the MSGMON process configuration attributes, as long as you have configured MSGMON with the TACL macro documented in Appendix E.

Aborting MSGMON on a node will not change the state of ServerNet cluster IPC connectivity and Expand-over-ServerNet lines to and from that node. However, while MSGMON is not running, the node will not be able to bring up or automatically repair remote IPC connectivity. This includes bringing up remote IPC connectivity to any processor that is reloaded in the cluster, bringing up IPC connectivity to any new node that is added to the cluster, and automatically repairing any failed remote IPC connections. Expand-over-ServerNet lines to any new node that is added to the cluster while MSGMON is not running will also not be brought up.
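If you do have to abort MSGMON, for example during an SPR installation, the restart is the START command shown above. A hypothetical session, with a status check to confirm that the process has restarted (the STATUS PROCESS output format is shown in Example 5-3):

> ABORT PROCESS $ZZKRN.#MSGMON
  (install the SPR or alter the MSGMON attributes here)
> START PROCESS $ZZKRN.#MSGMON
> STATUS PROCESS $ZZKRN.#MSGMON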
Starting the External ServerNet SAN Manager Process (SANMAN)

Adding the external ServerNet SAN manager process (SANMAN) to the configuration database is described in Section 3, Installing and Configuring a ServerNet Cluster. When you add SANMAN, HP recommends that you set the STARTMODE attribute to SYSTEM. If you do so, SANMAN starts automatically after a system load or a processor reload.

To run SANMAN after adding it but before the next system load, or after stopping it with the SCF ABORT PROCESS $ZZKRN.#ZZSMN command, use the SCF START PROCESS command:

> START PROCESS $ZZKRN.#ZZSMN

Note. In this section, the external ServerNet SAN manager process ($ZZSMN) is referred to as SANMAN. The symbolic name for the SANMAN generic process is assumed to be ZZSMN, which also is the recommended symbolic name.

Aborting the External ServerNet SAN Manager Process (SANMAN)

To stop the ServerNet SAN manager process, use the SCF ABORT PROCESS command:

> ABORT PROCESS $ZZKRN.#ZZSMN

Note. Although it has no effect on data traffic, aborting SANMAN causes the Cluster tab to disappear in the TSM Service Application. Consequently, you cannot view or manage cluster resources such as external fabrics and cluster switches.

Normally, you should not need to abort SANMAN—even if a system is no longer a member of a ServerNet cluster. There are two cases in which you might have to abort SANMAN:

• During installation of a T0502 SPR that includes a new version of SANMAN. In this case, follow the installation instructions in the SOFTDOC for the SPR.
• Before you can alter one or more SANMAN process configuration attributes with SCF ALTER PROCESS $ZZKRN.#ZZSMN. Normally, you should not need to alter any of the SANMAN process configuration attributes, as long as you have configured SANMAN with the TACL macro documented in Appendix E.

Restarting the External ServerNet SAN Manager Process (SANMAN)

The external ServerNet SAN manager process can be restarted for the following reasons:

• Both processors in which the ServerNet SAN manager process pair is running are stopped. The $ZPM persistence manager automatically restarts the process pair as soon as any processor in its processor list becomes available.
• The ServerNet SAN manager process abends. The $ZPM persistence manager restarts it immediately.
• An operator issues an SCF ABORT PROCESS $ZZKRN.#ZZSMN command, followed by an SCF START PROCESS $ZZKRN.#ZZSMN command.
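The operator-initiated case in the last bullet is the same stop/start cycle shown above for MSGMON. A hypothetical session, with a status check at the end (see Example 5-3 for the STATUS PROCESS output format):

> ABORT PROCESS $ZZKRN.#ZZSMN
> START PROCESS $ZZKRN.#ZZSMN
> STATUS PROCESS $ZZKRN.#ZZSMN

Remember that while SANMAN is stopped, the Cluster tab is unavailable in the TSM Service Application.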
Starting the ServerNet Cluster Monitor Process (SNETMON)

Adding the ServerNet cluster monitor process (SNETMON) to the configuration database is described in Section 3, Installing and Configuring a ServerNet Cluster. HP recommends that you set the STARTMODE attribute to SYSTEM for the ServerNet cluster monitor process. If you do so, SNETMON starts automatically after a system load or a processor reload.

To start SNETMON after configuring it but before the next system load, or after stopping it by using the SCF ABORT PROCESS $ZZKRN.#ZZSCL command, use the SCF START command:

> START PROCESS $ZZKRN.#ZZSCL

Aborting the ServerNet Cluster Monitor Process (SNETMON)

To stop the ServerNet cluster monitor process, use the SCF ABORT PROCESS command:

> ABORT PROCESS $ZZKRN.#ZZSCL

Note. Normally, you should not need to abort SNETMON—even if a system is no longer a member of a ServerNet cluster. The ServerNet statistics provided by the Node Statistics (1200) event are not available when SNETMON is aborted.

There are two cases in which you might have to abort SNETMON:

• During installation of a T0294 SPR that includes a new version of SNETMON. In this case, follow the installation instructions in the SOFTDOC for the SPR.
• Before you can alter one or more SNETMON process configuration attributes via SCF ALTER PROCESS $ZZKRN.#ZZSCL. Normally, you should not need to alter any of the SNETMON process configuration attributes if you configured SNETMON with the TACL macro documented in Appendix E.

Aborting SNETMON on a node does not change the state of ServerNet cluster IPC connectivity to and from that node. However, while SNETMON is not running, the node will not be able to bring up or automatically repair remote IPC connectivity. This includes bringing up remote IPC connectivity to any processor that is reloaded in the cluster, bringing up IPC connectivity to any new node that is added to the cluster, and automatically repairing any failed remote IPC connections. Expand-over-ServerNet lines to any new node that is added to the cluster while SNETMON is not running cannot be brought up either.

Although ServerNet cluster IPC connectivity is not brought down when the SNETMON process is aborted, Expand-over-ServerNet lines tolerate only temporary absences of SNETMON. Typically, the Expand-over-ServerNet lines for the node are brought down if SNETMON is aborted and is not restarted within five minutes.

Starting ServerNet Cluster Services

You can use TSM or SCF to start ServerNet cluster services.

Note. You will receive an error if you try to start ServerNet cluster services before the fiber-optic cables are connected from the MSEBs in group 01 to the cluster switches.

Using TSM to Start ServerNet Cluster Services

1. Log on using the TSM Service Application.
2. Click the Cluster tab to view information about the ServerNet cluster.
3. In the tree pane, right-click the ServerNet Cluster resource, and select Actions.
4. From the Actions list, click Start ServerNet Cluster Services.
5. Click Perform action. The Action Status box shows the progress of the action.
6. Click Close to close the Actions dialog box.
7. Verify that ServerNet cluster services are started. (See Using TSM to Check the ServerNet Cluster Subsystem Status on page 5-16.)

Using SCF to Start ServerNet Cluster Services

To start ServerNet cluster services on the local system, use the SCF START SUBSYS $ZZSCL command:

> START SUBSYS $ZZSCL

When a System Joins a ServerNet Cluster

The following steps describe what happens when a system joins a ServerNet cluster:

1. First the ServerNet cluster monitor process is started (by the persistence manager $ZPM or with an SCF START PROCESS command, depending on the STARTMODE configuration).
2. Then the ServerNet cluster monitor process checks the configuration of its associated ServerNet cluster (SCL) subsystem SUBSYS object:
   • If the SUBSYS object is configured with a STARTSTATE attribute set to STOPPED—which is the default—the ServerNet cluster monitor process waits for an SCF START SUBSYS $ZZSCL command before starting ServerNet cluster services and joining the system to the ServerNet cluster.
   • If the SUBSYS object is configured with a STARTSTATE attribute set to STARTED, the ServerNet cluster monitor process automatically starts ServerNet cluster services and joins the system to the ServerNet cluster.

   Note. For detailed information about the SCL subsystem SUBSYS object summary states and the STARTSTATE attribute, see Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem. Do not confuse the PROCESS object STARTMODE attribute with the SUBSYS object STARTSTATE attribute:
   • The STARTMODE attribute controls the way the persistence manager $ZPM starts the ServerNet cluster monitor process after a system load.
   • The STARTSTATE attribute controls the way the ServerNet cluster monitor process joins the system to the ServerNet cluster.
   (A short SCF sketch for checking these attributes follows this list.)

3. Once ServerNet cluster services on the local system are started, the ServerNet cluster monitor process establishes ServerNet connections with all other systems in the ServerNet cluster that are in the STARTED or STARTING states.
4. If the Expand-over-ServerNet line-handler processes are configured and are in the STARTED state, Expand connectivity over ServerNet with the other systems in the ServerNet cluster is established.
5. When the ServerNet cluster startup is completed, the ServerNet cluster monitor process has a list of all systems known to be in the ServerNet cluster, and ServerNet connections are established with each system.
6. If ServerNet connection attempts fail or if successful connections subsequently fail, periodic attempts are made to establish or reestablish the connection. Failures and successful reconnections are logged to the event log. Failures to connect are logged as path or other failures. In addition, each ServerNet cluster subsystem state change (to STARTING, then to STARTED) is logged. If no other systems are discovered, that fact is also logged.
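A minimal sketch for checking where a node stands, using commands shown earlier in this section (see Example 5-1 and Example 5-5):

> INFO PROCESS $ZZKRN.#ZZSCL
> INFO SUBSYS $ZZSCL

The Start state field of the INFO SUBSYS output reflects the STARTSTATE attribute; if it is STOPPED, the node waits for an explicit START SUBSYS $ZZSCL after each system load instead of joining the cluster automatically. The ALTER SUBSYS command described in Section 8 changes this attribute.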
Stopping ServerNet Cluster Services

You can use TSM or SCF to stop ServerNet cluster services.

Using TSM to Stop ServerNet Cluster Services

1. Log on by using the TSM Service Application.
2. Click the Cluster tab to view information about the ServerNet cluster.
3. In the tree pane, right-click the ServerNet Cluster resource, and select Actions.
4. From the Actions list, click Stop ServerNet Cluster Services.
5. Click Perform action. The Action Status window shows the progress of the action.
6. Click Close to close the Actions dialog box.
7. Verify that ServerNet cluster services are stopped. (See Using TSM to Check the ServerNet Cluster Subsystem Status on page 5-16.)

Using SCF to Stop ServerNet Cluster Services

To stop ServerNet cluster services on the local system, use the SCF STOP SUBSYS $ZZSCL command:

> STOP SUBSYS $ZZSCL

Note. The SCF STOP SUBSYS $ZZSCL command stops ServerNet data traffic for a node on both external fabrics (X and Y). By contrast, the SCF STOP SERVERNET $ZSNET command, when used on a ServerNet node, stops internal and external ServerNet data traffic for only one fabric (X or Y). There is no SCF command to stop ServerNet data traffic on only one external fabric.

When ServerNet Cluster Services Are Stopped

When you stop ServerNet cluster services, the ServerNet cluster monitor process brings ServerNet cluster services to a STOPPED logical state. The ServerNet cluster monitor process itself does not stop, but remains an active process. Terminating access to the ServerNet cluster proceeds as follows:

1. The ServerNet cluster monitor process sets the ServerNet cluster subsystem state to STOPPING and logs the state change.
2. The ServerNet cluster monitor process informs each remote ServerNet cluster monitor process that the subsystem is stopping.
3. The ServerNet cluster monitor process instructs each local processor to terminate ServerNet connectivity.
4. When the processors have completed this, the ServerNet cluster monitor process moves the subsystem to the STOPPED state and logs the change. Only the subsystem state changes are logged. Individual path state changes are not logged.
5. On remote systems, as the ServerNet cluster monitor processes receive word that a ServerNet cluster member has departed, they instruct their local processors to bring down the ServerNet connections with the departing system. These remote ServerNet cluster monitor processes then log the node disconnection to the event log.

To fully terminate ServerNet cluster services on a system and stop the ServerNet cluster monitor process, you must issue two SCF commands:

> STOP SUBSYS $ZZSCL
> ABORT PROCESS $ZZKRN.#ZZSCL

Note. Normally, you should not need to abort SNETMON—even if a system is no longer a member of a ServerNet cluster. Stopping the ServerNet cluster monitor process by using an SCF ABORT PROCESS $ZZKRN.#ZZSCL command alone terminates the $ZZSCL process but does not bring down ServerNet cluster services on the local system. If you use only the SCF ABORT PROCESS command, the system remains joined to the ServerNet cluster for up to 30 seconds if the ServerNet cluster state for the node is STARTED (as represented by the SUBSYS $ZZSCL object for the node). After 30 seconds, the Expand-over-ServerNet line handlers, which periodically exchange status messages with the local ServerNet cluster monitor process, terminate the Expand-over-ServerNet connections.
Switching the SNETMON or SANMAN Primary and Backup Processes

You can use the TSM Service Application or SCF to cause either the SNETMON or SANMAN primary and backup processes to switch roles. For example, if the primary process is configured to run in processor 0 and the backup process is configured to run in processor 1, you can use the TSM Switch SNETMON Primary Processor action (or the Switch SANMAN Primary Processor action) to switch the primary to processor 1 and the backup to processor 0.

Switching SNETMON is recommended if you need to halt a processor. You might need to halt a processor, for example, to prepare for reloading the processor or servicing a PMF CRU.

Using TSM to Switch the Primary and Backup Processes

Use the following procedure to switch the primary and backup processes using the TSM Service Application:

1. Log on to the TSM Service Application, as described in Appendix F, Common System Operations. The Management Window appears.
2. Click the Cluster tab. See Figure 5-4 on page 5-4.
3. In the tree pane, right-click the ServerNet Cluster resource, and select Actions. The Actions dialog box appears.
4. Click the Switch SNETMON Primary Processor action or the Switch SANMAN Primary Processor action.
5. Click Perform action. A confirmation dialog box asks if you are sure you want to perform the action.
6. Click OK. The Action Status window shows the progress of the action.
7. Click Close to close the Actions dialog box.

Using SCF to Switch the Primary and Backup Processes

You use the SCF PRIMARY PROCESS command to switch the primary and backup processes for SNETMON or for SANMAN. For example:

> PRIMARY PROCESS $ZZSCL, 2
> PRIMARY PROCESS $ZZSMN, 3

For more information about SCF commands for SNETMON and for SANMAN, refer to Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem, and Section 9, SCF Commands for the External ServerNet SAN Manager Subsystem.
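To confirm that a switch took effect, you can recheck the process with the Kernel subsystem STATUS PROCESS command (Example 5-3); the Primary PID and Backup PID columns should show the swapped processors. A hypothetical check after the first command above:

> STATUS PROCESS $ZZKRN.#ZZSCL

In the output, the primary PID should now name processor 2, the processor given in the PRIMARY PROCESS command.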
6 Adding or Removing a Node

This section describes how to change the size of an already-installed ServerNet cluster or a node in a cluster. This section includes:

Heading                                                          Page
Adding a Node to a ServerNet Cluster                             6-1
Removing a Node From a ServerNet Cluster                         6-2
Moving a Node From One ServerNet Cluster to Another              6-4
Moving ServerNet Cables to Different Ports on the ServerNet II   6-5
Switches
Expanding or Reducing a Node in a ServerNet Cluster              6-11
Splitting a Large Cluster Into Multiple Smaller Clusters         6-11

Note. You can use OSM instead of TSM for any of the procedures described in this manual. For information on using OSM instead of TSM, see Appendix H, Using OSM to Manage the Star Topologies.

Adding a Node to a ServerNet Cluster

You can add nodes to a ServerNet cluster until the cluster reaches the maximum number of nodes supported for the topology. (The tri-star topology supports up to 24 nodes.) You can add only one node at a time. HP recommends that you use the guided procedure to add a node.

Note. If the topology used by your cluster will not support any more nodes, you must change the topology. See Section 4, Upgrading a ServerNet Cluster.

1. Review the Planning Checklist in Section 2, Planning for Installation, to ensure that you are ready to add a node. In particular, you must make sure that the node meets the software requirements for the topology used by the cluster.
2. Verify that $ZEXP and $NCP are started and configured properly, as described in Section 3, Installing and Configuring a ServerNet Cluster.
3. Verify that MSGMON, SANMAN, and SNETMON are configured and started as described in Section 3, Installing and Configuring a ServerNet Cluster.
4. Use the guided configuration procedure to configure and add a node to an already-installed ServerNet cluster.

Note. Before using the guided procedure, be sure to review the online help topic “Read Before Using.” This topic contains important information about software requirements that must be met before you run the procedure.

From the system console of the server you are adding, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node. Online help is available to assist you in performing the procedure.

The Configure ServerNet Node guided procedure:

• Verifies that the group 01 MSEBs are installed and ready.
• Tells you when to connect fiber-optic cables between the MSEBs and the cluster switches. Online help shows you how to make the cable connections. Section 3, Installing and Configuring a ServerNet Cluster, also contains information about how to connect cables.
• Starts ServerNet cluster services.
• Configures Expand-over-ServerNet line handlers.

Removing a Node From a ServerNet Cluster

To remove a node from a ServerNet cluster:

1. Shut down or reroute traffic for any applications using the Expand-over-ServerNet connection between the node being removed and the rest of the cluster.
2. On all nodes in the cluster, use the Expand subsystem SCF LISTDEV command to identify the currently configured Expand-over-ServerNet lines:

   -> LISTDEV TYPE 63,4

3. On the node being removed, use the Expand subsystem SCF ABORT LINE command to abort the Expand-over-ServerNet lines to other nodes. You must repeat the command for each line to be aborted. The command syntax for the SCF ABORT LINE command is:

   -> ABORT LINE $device_name

   $device_name is the device name of an Expand-over-ServerNet line-handler process. For example:

   -> ABORT LINE $SC001

4. Use the WAN subsystem SCF STOP DEVICE and DELETE DEVICE commands to stop and delete the line-handler process associated with each Expand-over-ServerNet line. You must repeat the commands for each line-handler process to be stopped and deleted. (A combined example for a single line appears after this procedure.)

   a. Stop the line-handler process:

      -> STOP DEVICE $ZZWAN.#SC001

   b. If you know the node will not rejoin the cluster later, delete the line-handler process:

      -> DELETE DEVICE $ZZWAN.#SC001
5. On all other nodes in the cluster:

   a. Use the Expand subsystem SCF ABORT LINE command to abort the Expand-over-ServerNet line for the node being removed.
   b. Use the WAN subsystem SCF STOP DEVICE command to stop the Expand-over-ServerNet line-handler process for the node being removed.
   c. If you know the line-handler process will not be needed, use the WAN subsystem SCF DELETE DEVICE command to delete the line-handler process for the node being removed.

6. On the node being removed, bring ServerNet cluster services to a STOPPED logical state. You can use TSM or SCF to stop ServerNet cluster services.

   To stop ServerNet cluster services using the TSM Service Application:

   a. Right-click the ServerNet Cluster resource, and select Actions.
   b. From the Actions list, select Stop ServerNet Cluster Services and click Perform action. The Action Status window shows the progress of the action.
   c. Click Close to close the Actions dialog box.
   d. In the Attributes tab of the details pane, verify that the ServerNet Cluster State is stopped.

   To stop ServerNet cluster services using SCF, issue the SCF STOP SUBSYS $ZZSCL command:

   -> STOP SUBSYS $ZZSCL

   The ServerNet cluster monitor process itself does not stop but remains an active process. The node being removed informs other nodes that it is leaving the cluster.

7. Disconnect the cables that link the MSEBs in group 01 to the X-fabric and Y-fabric cluster switches.

Note. Disconnect both cables at roughly the same time. If you disconnect one cable and then allow more than two minutes to pass before disconnecting the second cable, TSM alarms will be generated. If value-added diagnostics are enabled, these alarms will be dialed out to the service provider.
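Putting steps 3 through 6 together for the node being removed: the following hypothetical session assumes a single remote line named $SC001 (repeat the line commands for each configured line). All four commands appear individually in the procedure above:

-> ABORT LINE $SC001
-> STOP DEVICE $ZZWAN.#SC001
-> DELETE DEVICE $ZZWAN.#SC001
-> STOP SUBSYS $ZZSCL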
Moving a Node From One ServerNet Cluster to Another

If you have more than one ServerNet cluster, you might want to move a node from one cluster to another. Note the following considerations before moving a node from one cluster to another:

• You must ensure that software on the node being moved is compatible with the software on the cluster being joined. For software compatibility issues, see Section 2, Planning for Installation, and Section 4, Upgrading a ServerNet Cluster.
• To move a node from one cluster to another, you must have access to an unused port on the receiving cluster. If no unused ports are available, refer to Section 4, Upgrading a ServerNet Cluster, for more information about merging clusters.

As an example, the following procedure moves a node from cluster A to cluster B:

1. To remove the node from cluster A, complete the procedure for Removing a Node From a ServerNet Cluster on page 6-2.
2. To add the node to cluster B, complete the procedure for Adding a Node to a ServerNet Cluster on page 6-1. Adding the node includes:
   • Connecting the cables from the recently removed node to the X-fabric and Y-fabric cluster switches for cluster B
   • Configuring new Expand-over-ServerNet line-handler processes between the node you are adding to cluster B and the other nodes in cluster B

Even if you deleted the line-handler processes on cluster A, you can quickly reconnect the node to cluster A by repeating the preceding steps and using the automatic line-handler configuration feature of the guided procedure.

Moving ServerNet Cables to Different Ports on the ServerNet II Switches

The ServerNet II Switch is the main component of the cluster switch. Figure 6-1 shows the ServerNet II Switch extended for servicing.

Figure 6-1. ServerNet II Switch Component of Cluster Switch

If a PIC installed in a ServerNet II Switch is faulty, you might need to move the fiber-optic cables providing the external X-fabric and Y-fabric connections to different (unused) ports on the X and Y ServerNet II Switches. (Because the cables must use the same port on both ServerNet II Switches, you must move both cables.)

Note. This procedure moves the cable connections at the ServerNet II Switches only. At the server, the fiber-optic cables must always connect to port 6 of the MSEBs in slots 51 and 52 of the group 01 enclosure.

You can move the cables to unused ports on the same group of switches. For example, if the cables are currently connected to port 4 of the X1 and Y1 ServerNet II Switches, you might move them to port 7 of the X1 and Y1 ServerNet II Switches. Or you can move the cables to unused ports on the other group of switches. For example, you might move the cables in the previous example to port 6 of the X2 and Y2 ServerNet II Switches. However, make sure that both cables connect to the same group of switches (X1 and Y1 or X2 and Y2—but not X1 and Y2).

To move the fiber-optic cables at the ServerNet II Switches:

1. Make sure an unused port in the range 0 through 7 is available on the X-fabric and Y-fabric ServerNet II Switches. You must use the same port number on both switches.
2. Complete the Planning Form for Moving ServerNet Cables. See the Sample Planning Form for Moving ServerNet Cables on page 6-9. A blank copy of this form is included in Appendix B, Blank Planning Forms. To complete the form:
   a. Record the system name and Expand node number of the node whose ServerNet cables will be moved. You can use the TSM Service Application to collect this information.
   b. Record the old and new ServerNet II Switch port connections.
   c. For the node whose ServerNet cables will be moved, list the Expand-over-ServerNet lines that must be aborted before removing the node from the cluster. You can use the Expand subsystem SCF LISTDEV command to identify the currently configured Expand-over-ServerNet lines:

      -> LISTDEV TYPE 63,4

   d. For all other nodes in the cluster, list the Expand-over-ServerNet line that must be aborted for the node being removed. You can use the SCF INFO PROCESS $NCP, LINESET command to identify the line configured for a specific node:

      -> INFO PROCESS $NCP, LINESET

3. If your applications might be affected and an adequate alternate Expand line is not available, shut down all applications that depend on the Expand-over-ServerNet line-handler processes.
4. Identify and label the cables you want to move to the unused ports. Labeling the cables reduces the likelihood of moving the wrong cables.
5. Use the following commands to remove the ServerNet node (the node whose cables will change ports) temporarily from the ServerNet cluster:

   a. On the node being removed, use the Expand subsystem SCF ABORT LINE command to abort the Expand-over-ServerNet lines to all other nodes in the cluster. For example:

      -> ABORT LINE $SC012
      -> ABORT LINE $SC013
      -> ABORT LINE $SC014
      -> ABORT LINE $SC015
      -> ABORT LINE $SC016
      -> ABORT LINE $SC017

      Note. You do not have to stop or delete the line-handler processes (devices).

   b. On all other nodes in the cluster, use the SCF ABORT LINE command to abort the Expand-over-ServerNet line to the node being removed. Depending on your service LAN, you might have to log on to each node individually to do this. For example:

      -> ABORT LINE $SC011

   c. On the node being removed, bring ServerNet cluster services to a STOPPED logical state:

      -> STOP SUBSYS $ZZSCL

6. At the ServerNet II Switches, disconnect the X-fabric and Y-fabric fiber-optic cables for the node you removed from the cluster.

7. Reconnect the cables to the unused ports, making sure that:
   • The fiber-optic cables connect to the same port number on the X and Y ServerNet II Switches.
   • The fiber-optic cables connect to the same group of switches (X1 and Y1 or X2 and Y2).

8. Check for link alive at both ends of each cable. About 10 seconds after you connect a cable to an unused port on the ServerNet II Switch, the green ServerNet port LED near the PIC lights to indicate link alive. The status lights on the front panel of the ServerNet II Switch also indicate link alive. The ServerNet port LED at port 6 on the MSEB in group 01 lights to indicate link alive.

   If a ServerNet port LED does not light after 60 seconds, disconnect and reconnect the cable. If the cables are properly connected and one or both LEDs fail to light, a PIC or cable might be faulty. For procedures to replace cables and switch components, refer to Section 7, Troubleshooting and Replacement Procedures.

9. Use either the guided configuration procedure or SCF commands to add the removed node back into the cluster:

   • To use the guided configuration procedure:

     a. From the system console of the system you are adding, choose Start>Programs>Compaq TSM>Guided Configuration Tools>Configure ServerNet Node.
     b. Click Start and log on to the system.
     c. When the guided procedure prompts you:
        • Click Yes to install the server tools.
        • Click Yes to enable automatic line-handler configuration.
        • Click Yes to start the SNETMON subsystem (ServerNet cluster services).
        • Select all of the remote nodes in the ServerNet Cluster Connection Status dialog box and click Configure/Start to start the local and remote lines. If the automatic line-handler configuration feature is enabled on the remote nodes, the lines on those nodes are started automatically.
     d. If the remote lines do not start for all of the nodes, repeat Steps a through c, logging on to each node individually to add and start a line for the node being added.

   • To use SCF commands:

     a. On the node being added to the cluster, start the ServerNet cluster subsystem:

        -> START SUBSYS $ZZSCL

     b. On the node being added to the cluster, use the Expand subsystem SCF START LINE command to start the Expand-over-ServerNet lines to all other nodes in the cluster. For example:

        -> START LINE $SC012
        -> START LINE $SC013
        -> START LINE $SC014
        -> START LINE $SC015
        -> START LINE $SC016
        -> START LINE $SC017
Sample Planning Form for Moving ServerNet Cables

a. Identify the node whose ServerNet cables will be moved:

   System Name: \PROD1
   Expand Node Number: 011

b. Record the old and new cluster switch port connections. In this sample, only ports with a connection before or after the move are shown; all other ports (including all of X3/Y3) are unused and are left blank on the form:

   Cluster Switch   Port   Before Moving Cables        After Moving Cables
                           Expand Node   System Name   Expand Node   System Name
   X1/Y1            0      011           \PROD1
   X1/Y1            1      012           \PROD2        012           \PROD2
   X1/Y1            2      013           \PROD3        013           \PROD3
   X2/Y2            0      014           \DEV1         014           \DEV1
   X2/Y2            1      015           \DEV2         015           \DEV2
   X2/Y2            2      016           \DEV3         016           \DEV3
   X2/Y2            3      017           \DEV4         017           \DEV4
   X2/Y2            4                                  011           \PROD1

c. List the lines to abort on the node whose ServerNet cables will be moved and on all other nodes:

   On the node whose cables will be moved        On all other nodes
   Node     Abort Expand-Over-ServerNet Line     Node     Abort Expand-Over-ServerNet Line
   \PROD2   $SC012                               \PROD2   $SC011
   \PROD3   $SC013                               \PROD3   $SC011
   \DEV1    $SC014                               \DEV1    $SC011
   \DEV2    $SC015                               \DEV2    $SC011
   \DEV3    $SC016                               \DEV3    $SC011
   \DEV4    $SC017                               \DEV4    $SC011

Expanding or Reducing a Node in a ServerNet Cluster

Like any NonStop S-series server, a node in a ServerNet cluster can be expanded or reduced (enclosures can be added or removed) while the server is online. However, if online expansion requires changes to the MSEBs in the group 01 enclosure, the node's connections to the cluster might not be in a fault-tolerant state for a short time. To expand or reduce a node in a ServerNet cluster, refer to the NonStop S-Series System Expansion and Reduction Guide.

Splitting a Large Cluster Into Multiple Smaller Clusters

ServerNet clusters that have more than one cluster switch per fabric (clusters using the split-star or tri-star topologies) can be split into smaller clusters that are valid subsets of the split-star or tri-star topologies. Any valid subset of the split-star or tri-star topology can function independently as a cluster, if necessary. Splitting a cluster:

• Can be done online.
• Does not require installing additional cluster switches.
• Does not change the topology or the ServerNet node numbers used by the star groups that are split. If you need to change the topology, refer to Section 4, Upgrading a ServerNet Cluster.

This topology . . .   Can be split into . . .
Split-star            Two clusters having up to eight nodes each and using one cluster switch per fabric.
Tri-star              One of the following:
                      • Three clusters having up to eight nodes each and using one cluster switch per fabric.
                      • Two clusters: a cluster having up to eight nodes and using one cluster switch per fabric, and a cluster having up to 16 nodes and using two cluster switches per fabric.
Use the following steps to split a split-star topology or a tri-star topology:

1. Select one of the star groups for a complete shutdown of ServerNet cluster services. This can be the star group with the fewest nodes or the star group that is least critical to your application.

2. In all nodes of the cluster, stop any applications that depend on ServerNet cluster connectivity to the nodes in the star group that will be shut down.

3. In all nodes of the star group selected in Step 1:

   a. Stop any applications that depend on ServerNet cluster connectivity to nodes in the other star group(s).

   b. Stop Expand connectivity to the nodes in the other star group(s):

      1. Use the Expand subsystem SCF LISTDEV command to identify the currently configured Expand-over-ServerNet lines:

         -> LISTDEV TYPE 63,4

      2. Use the Expand subsystem SCF ABORT LINE command to abort the Expand-over-ServerNet lines to the nodes in the other star group(s):

         -> ABORT LINE $SCxxx

      3. Use the WAN subsystem SCF STOP DEVICE command to stop the line-handler process associated with each Expand-over-ServerNet line you aborted:

         -> STOP DEVICE $ZZWAN.#SCxxx

      4. Use the WAN subsystem SCF DELETE DEVICE command to delete the line-handler process associated with each Expand-over-ServerNet line you aborted:

         -> DELETE DEVICE $ZZWAN.#SCxxx

   c. Use the SCF STOP SUBSYS $ZZSCL command to ensure an orderly shutdown of ServerNet communications:

      -> STOP SUBSYS $ZZSCL

4. In all nodes of the other star groups, abort all Expand-over-ServerNet lines to the nodes in the star group selected in Step 1:

   -> ABORT LINE $SCxxx

5. On both fabrics, disconnect the four-lane links (or two-lane links) between the star group selected in Step 1 and the other star group(s). (The four-lane links or two-lane links are the switch-to-switch cables connected to ports 8 through 11.)

Caution. To avoid generating an alarm, you must disconnect the four-lane or two-lane links on both fabrics within four minutes if the ServerNet II Switches are running T0569AAB (eight minutes if the switches are running T0569AAE). The TSM incident analysis (IA) software eventually generates an alarm if one external fabric has a different number of cluster switches than the other external fabric.

6. In all nodes of the star group selected in Step 1:

   a. Use the SCF START SUBSYS $ZZSCL command to bring up direct ServerNet connectivity between the nodes:

      -> START SUBSYS $ZZSCL

   b. If desired, start any applications that use ServerNet connectivity to nodes within the star group.

7. If you began with a tri-star topology and one of the remaining clusters has two cluster switches per fabric, you can split that cluster by repeating the procedure.
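If several nodes are being split off, the per-node commands in Steps 3b and 3c can be kept in a single SCF command file and run on each node of the selected star group. The following is a minimal sketch only; the line name $SC021 is hypothetical (substitute one ABORT/STOP/DELETE group for each line reported by LISTDEV), and == introduces a comment line:

   == SPLITOFF: run in SCF on each node of the star group being split off
   == Repeat the next three commands for every Expand-over-ServerNet line
   == to a node in another star group (identified with LISTDEV TYPE 63,4)
   ABORT LINE $SC021
   STOP DEVICE $ZZWAN.#SC021
   DELETE DEVICE $ZZWAN.#SC021
   == After all lines to the other star group(s) are removed, shut down
   == ServerNet cluster services in an orderly way
   STOP SUBSYS $ZZSCL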
7. Troubleshooting and Replacement Procedures

This section describes how to use software tools to diagnose and troubleshoot a ServerNet cluster. This section also contains replacement procedures for the main hardware components of a ServerNet cluster. This section contains the following subsections:

Heading                      Page
Troubleshooting Procedures   7-1
Replacement Procedures       7-35

Note. You can use OSM instead of TSM for any of the procedures described in this manual. For information on using OSM instead of TSM, see Appendix H, Using OSM to Manage the Star Topologies.

Troubleshooting Procedures

See Table 7-1 and Table 7-2 for a list of problems and recovery actions for ServerNet cluster components. For a general approach to troubleshooting, read the Troubleshooting Tips on page 7-1.

This section does not include common troubleshooting information for NonStop S-series servers. Instead, refer to the NonStop S-Series Hardware Support Guide or the CSSI Web. (Click Start>Programs>Compaq TSM>Compaq S-Series Service (CSSI) Web.)

Troubleshooting Tips

As you troubleshoot ServerNet cluster problems, keep in mind:

• Sometimes it can be difficult to determine whether a problem is related to faulty hardware, faulty software, or both. In general, check the software first.
• Use the TSM Service Application as your first method for obtaining information about a problem. In many cases, the TSM Service Application not only identifies the problem component but can tell you how to fix it.
• To troubleshoot effectively, use more than one tool. For example, you can use TSM, SCF, and a guided procedure to check the status of an Expand-over-ServerNet line. Using all three methods allows you to compare the information returned by each method.
• If possible, gather information about a problem from multiple perspectives. The view of a ServerNet cluster can change significantly from one node to another, so gather data at each node using SCF and the TSM client software and then compare the information.

Software Problem Areas

Table 7-1 lists some common software problem areas, describes troubleshooting steps, and provides references for more information.

Table 7-1. Software Problem Areas

Problem Area: Compaq TSM Client Software
  Symptom: The Cluster tab does not appear.
  Recovery: See Troubleshooting the Cluster Tab in the TSM Service Application on page 7-9.

  Symptom: All other symptoms.
  Recovery: See one of the following for additional troubleshooting information:
  • TSM Read Me online help file
  • Online help for the TSM Service Application
  • TSM Online User Guide

Problem Area: Guided Procedures Interface
  Symptom: Any.
  Recovery: See the online help for the guided procedures interface or for the guided procedure you are using. If you are unable to start the guided procedures interface, you might still be able to view the online help. See Online Help for the Guided Procedures on page 7-11.

Problem Area: SNETMON
  Symptom: Any.
  Recovery: See Troubleshooting SNETMON on page 7-17.

Problem Area: MSGMON
  Symptom: Any.
  Recovery: See Troubleshooting MSGMON on page 7-19.

Problem Area: SANMAN
  Symptom: Any.
  Recovery: See Troubleshooting SANMAN on page 7-20.

Problem Area: ServerNet Communication
  Symptom: Any.
  Recovery: See Methods for Repairing ServerNet Connectivity Problems on page 7-23.

  Symptom: Communication on an internal fabric is disrupted.
  Recovery: See Using the Fabric Troubleshooting Guided Procedure to Check the Internal ServerNet Fabrics on page 7-26, or do the following:
  1. Use the TSM package to check for alarms and repair actions for the Internal Fabric resource. See Using TSM Alarms on page 7-12.
  2. Perform SCF and TSM diagnostic actions to get more information. See Checking the Internal ServerNet X and Y Fabrics on page 7-26.
  Symptom: Communication on an external fabric is disrupted.
  Recovery: See Using the Fabric Troubleshooting Guided Procedure to Check the Internal ServerNet Fabrics on page 7-26, or do the following:
  1. Use TSM to check for alarms and repair actions for the External Fabric resource. See Using TSM Alarms on page 7-12.
  2. Use the Node Connectivity ServerNet Path Test. See Checking the External ServerNet X and Y Fabrics on page 7-29.

  Symptom: Communication on an external fabric is disrupted by BTE timeouts, CRC checksum errors, or TPB errors. These symptoms appear in event messages or statistics.
  Recovery: One of the fiber-optic cable connections might be faulty. Reseat or replace the fiber-optic cable connecting the MSEB with the ServerNet II Switch. See Replacing a Fiber-Optic Cable Between an MSEB and a ServerNet II Switch on page 7-36.

  Symptom: Communication with a remote node is disrupted.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the Remote Node resource. See Using TSM Alarms on page 7-12.
  2. Verify that SNETMON is operating properly. See Troubleshooting SNETMON on page 7-17.
  3. Verify that SANMAN is operating properly. See Troubleshooting SANMAN on page 7-20.
  4. Perform the Node Responsive Test. See Checking Communications With a Remote Node on page 7-23.

  Symptom: The TSM Service Application shows the remote node name as \Remote_Node_nnn, where nnn is the Expand node number.
  Recovery: Verify that the Expand-over-ServerNet line-handler processes between the local node and the remote node are up. See Troubleshooting Expand-Over-ServerNet Line-Handler Processes and Lines on page 7-21.

  Symptom: The TSM Service Application shows the remote node name as \UNKNOWN_Remote_Node_n, where n is the ServerNet node number.
  Recovery: Verify that SNETMON is operating properly on the remote node. See Troubleshooting SNETMON on page 7-17.

  Symptom: The TSM Service Application shows the switch-to-node link as X (or Y) Fabric_to_\nnn, where nnn is the Expand node number.
  Recovery: Verify that the Expand-over-ServerNet line-handler processes between the local node and the remote node are up. See Troubleshooting Expand-Over-ServerNet Line-Handler Processes and Lines on page 7-21.

  Symptom: The TSM Service Application shows the switch-to-node link as X (or Y) Fabric_to_\Node_n, where n is the ServerNet node number.
  Recovery: Verify that SNETMON is operating properly on the remote node. See Troubleshooting SNETMON on page 7-17.

  Symptom: Communication between a cluster switch and a local or remote node is disrupted.
  Recovery: Use TSM to check for alarms and repair actions for the Switch-to-Node link. See Using TSM Alarms on page 7-12.

Problem Area: Expand-Over-ServerNet Line-Handler Processes and Lines
  Symptom: Communication between the local node and a remote node is disrupted.
  Recovery: See Troubleshooting Expand-Over-ServerNet Line-Handler Processes and Lines on page 7-21.

Hardware Problem Areas

Table 7-2 lists some common hardware problem areas, describes troubleshooting steps, and provides references for more information.

Table 7-2. Hardware Problem Areas

Problem Area: MSEB
  Symptom: Any.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the MSEB resource. See Using TSM Alarms on page 7-12.
  2. Use the TSM Service Application to perform the CRU Responsive Test action on the MSEB.
  3. Check the LEDs. See MSEB and ServerNet II Switch LEDs on page 7-33.
  4. Check for ServerNet cable and PIC problems before considering replacing the MSEB.
  5. See Replacing an MSEB on page 7-35.

Problem Area: PIC (installed in MSEB only)
  Symptom: Any.
  Recovery:
  1. Use the TSM Service Application to check the PIC Type. Make sure the PIC Type is "NNA" for port 6 of the MSEBs installed in slots 51 and 52 of group 01.
  2. Use TSM to check for alarms and repair actions for the MSEB resource. See Using TSM Alarms on page 7-12.
  3. Check the link-alive LEDs. See MSEB and ServerNet II Switch LEDs on page 7-33.
  4. If recommended by a service provider or by the repair action text for a TSM alarm, perform the Internal Loopback Test action on the PIC. See Using the Internal Loopback Test Action on page 7-30.
  5. See Replacing a PIC in a ServerNet II Switch on page 7-35.

Problem Area: ServerNet Cable (SEB to SEB, SEB to MSEB, or MSEB to MSEB)
  Symptom: Internal fabric communication problem.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the Internal Fabric resource. See Using TSM Alarms on page 7-12.
  2. Perform SCF and TSM diagnostic actions to get more information. See Checking the Internal ServerNet X and Y Fabrics on page 7-26.
  3. Check to make sure all ServerNet cables between enclosures within a system are securely connected.
  4. Refer to the NonStop S-Series Hardware Support Guide or the CSSI Web site. (Click Start>Programs>Compaq TSM>Compaq S-Series Service (CSSI) Web.)

Problem Area: Fiber-Optic ServerNet Cable (MSEB to ServerNet II Switch or four-lane link)
  Symptom: External fabric communication problem.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the External Fabric resource. These include alarms on the switch-to-node link and switch-to-switch link resources. See Using TSM Alarms on page 7-12.
  2. Perform the Node Connectivity ServerNet Path Test. See Checking the External ServerNet X and Y Fabrics on page 7-29.
  3. Check to make sure the fiber-optic ServerNet cables are securely connected between the MSEBs in group 01 and the X- and Y-fabric ServerNet II Switches.
  4. Check the link-alive LEDs. See MSEB and ServerNet II Switch LEDs on page 7-33.
  5. See Replacing a Fiber-Optic Cable Between an MSEB and a ServerNet II Switch on page 7-36.

Problem Area: ServerNet II Switch
  Symptom: Any.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the Switch resource. See Using TSM Alarms on page 7-12.
  2. Check the link-alive LEDs. See MSEB and ServerNet II Switch LEDs on page 7-33.
  3. See Replacing a ServerNet II Switch on page 7-38.
  4. Refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide for additional troubleshooting and replacement information.

Problem Area: Uninterruptible Power Supply (UPS) or AC Transfer Switch
  Symptom: Any.
  Recovery:
  1. Use TSM to check for alarms and repair actions for the Switch resource. See Using TSM Alarms on page 7-12.
  2. See Replacing an AC Transfer Switch on page 7-38.
Troubleshooting the Cluster Tab in the TSM Service Application

The Cluster tab appears in the Management Window of the TSM Service Application if the external ServerNet SAN manager process (SANMAN) can communicate with at least one cluster switch. The Cluster tab does not appear if SANMAN cannot communicate with a cluster switch.

Figure 7-1 shows the tree pane of the TSM Service Application, including resources for the local cluster switches for the external X fabric and external Y fabric.

Figure 7-1. Management Window Tree Pane Showing Cluster Tab

If the Cluster tab does not appear, try the following:

1. Check the TSM client software version:

   a. From the Help menu, select About Compaq TSM. The About Compaq TSM dialog box appears.
   b. Verify that the TSM client software version is Version 10.0 or later. Be sure to use the most current TSM client software that supports your operating system:

   For NonStop Kernel Operating System Release   Use TSM Client Version
   G06.16                                        2002B (or later)
   G06.15                                        2002A (or later)
   G06.14                                        2001D (or later)
   G06.13                                        2001C (or later)
   G06.12                                        2001B (or later)
   G06.11                                        2001A (or later)
   G06.10                                        2000A (or later)
   G06.09                                        10.0 (or later)

2. Try refreshing the display by doing one of the following:

   • From the Display menu, select Refresh.
   • From the Display menu, select New Management Window to view a new window.
   • Try logging off and logging on again to the local node.

3. Check to make sure SANMAN is running. If it is not running, the Cluster tab will not appear. Restart SANMAN, if necessary. See Troubleshooting SANMAN on page 7-20. If the Cluster tab does not appear momentarily, try displaying a new management window. Or log off of the TSM Service Application and log back on to view the Cluster tab.

4. Make sure the ServerNet cables are connected from the local node to the cluster switches.

5. Make sure the cluster switches are powered and operating normally. For more information, refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide.

Online Help for the Guided Procedures

The guided procedures interface includes the following online help files:

File Name            Contains online help for . . .
CLUSTER.CHM          The Configure ServerNet Node procedure
FABRICTS.CHM         The Troubleshoot a ServerNet Fabric procedure
GRT.CHM              All of the following guided procedures:
                     • Replace IOMF
                     • Replace PMF
                     • Replace Power Supply
                     • Replace ServerNet/DA
PDK.CHM              The guided procedures interface
SEB.CHM              The Replace SEB or MSEB procedure
SWANFFU.CHM          The SWAN Fast Firmware Update procedure
SWITCH.CHM           The Replace Switch Component procedure
SWITCHADDITION.CHM   The Add Switch guided procedure
SWITCHFIRMWARE.CHM   The Update Switch guided procedure
TSM.CHM              The connect-to-system task used by all guided procedures

These files are stored on the system console in the C:\ZSUPPORT\GUIDED PROCEDURES\HELP directory. By double-clicking an online help file, you can open each online help system and view the help independently of the guided procedures application.
Using TSM Alarms

An alarm is a message, similar to an event message, that reports detected faults or abnormal conditions for a CRU or component. The tree pane of the TSM Service Application Management window displays a colored bell icon next to a resource causing an alarm. See Figure 7-2.

Figure 7-2. Fabric Alarm Example

You can use the TSM Service Application to gather information about an alarm in order to diagnose a problem.

Viewing TSM Alarms

1. Log on to the TSM Service Application.

   Note. For a detailed description of logging on to the TSM applications, see Appendix F, Common System Operations.

2. In either the tree pane or the view pane, click a resource to select it.

3. In the details pane, click the Alarms tab to view the alarm. See Figure 7-3.

Figure 7-3. Alarms Tab Example

4. In the Alarms tab, do one of the following:

   • Double-click the alarm for more information.
   • Right-click the alarm and select Details from the menu.

   The Alarm Detail dialog box appears, showing detailed information about the alarm. See Figure 7-4. For a list of ServerNet cluster-related alarms, see ServerNet Cluster-Related Alarms on page 7-15.

Figure 7-4. Alarm Detail Example

5. In the Alarm Detail dialog box, click Repair Actions for a list of steps you can take to respond to the alarm. See Figure 7-5.

Figure 7-5. Repair Actions Example

6. Perform the repair actions to fix the problem and remove the alarm. More detailed information is provided in the TSM alarm attachment file for the alarm. TSM alarm attachment files are named ZZAL* and are attached to problem incident reports. To view the ZZAL* files, refer to Using ZZAL* (Attachment) Files on page 7-15.

Using ZZAL* (Attachment) Files

To find the ZZAL* file for the alarm you are interested in, do the following:

• List all of the ZZAL* files in the $SYSTEM.ZSERVICE subvolume using the FILEINFO command:

   TACL> FILEINFO $SYSTEM.ZSERVICE.ZZAL*

• Look for the ZZAL* file with the same timestamp as the time shown in the Alarm time field on the Alarm Detail dialog box.

To view a ZZAL* (attachment) file, refer to the TSM Notification Director Application online help for viewing an incident report attachment.

ServerNet Cluster-Related Alarms

The TSM Service Application can display the following ServerNet cluster-related alarms. These include alarms on the MSEB, ServerNet Cluster, Switch, and ServerNet Fabric resources. Repair actions are provided for each alarm.

Note. If dial-out is configured on a node, all ServerNet cluster-related alarms are dialed out to the GCSC. Alarms shown in boldface cause a dial-out on all clustered nodes that are configured for dial-out.
• Automatic Expand Line Handler Configuration Failure
• Backfeed Contact Failure
• Backup Configuration Not Same as Running Configuration
• Backup Firmware Not Same as Running Firmware
• Backup Power Rail Failure
• Battery Failure
• Both Configurations Incorrect After Load
• Cable Between AC Transfer Switch and ServerNet Switch Disconnected
• Cable Between UPS and AC Transfer Switch Disconnected
• Configuration Incompatibility Between Local ServerNet Switch and Remote ServerNet Switch
• Configuration Tag Mismatch Between X and Y ServerNet Switches
• Configuration Version Mismatch Between X and Y ServerNet Switches
• Corrupt Configuration
• Corrupt Firmware
• Different Remote ServerNet Switch Connected to All Local ServerNet Switch Ports
• Down-Rev Configuration
• Down-Rev Firmware
• Down-Rev MSEB NNA PIC
• Fabric Setting Mismatch Between Local ServerNet Switch and Remote ServerNet Switch
• Factory Default Configuration Loaded
• Firmware Version Mismatch Between X and Y ServerNet Switches
• FLASH Program Error
• FLASH Sector Test Failure
• Ground Failure
• Hardware Error
• IBC Driver Limit for ServerNet Switches Exceeded
• Insufficient Backup Time on UPS
• Invalid Fabric Parameter Error
• Invalid Fabric Setting
• Invalid FLASH ID
• Invalid GUID
• Invalid MSEB Configuration Record
• Invalid ServerNet Switch Control Block
• Invalid ServerNet Switch PIC Type
• Link Receive Disabled on ServerNet Switch Port
• Link Transmit Disabled on ServerNet Switch Port
• Local ServerNet Switch Port Not Connected to Remote ServerNet Switch Port
• Low Battery Voltage
• Low Inverter Voltage
• Lower Boot Block Section of FLASH Locked
• Missing Modular ServerNet Expansion Board
• Missing NNA Plug-in Card
• Missing Remote ServerNet Switch
• Missing ServerNet Switch
• Missing ServerNet Switch PIC
• MSEB Configuration Record Fetch Error
• New Configuration Incorrect After Load
• NNA Verify Failure
• Node Plugged Into Wrong Port of ServerNet Switch
• Overloaded Switch AC Subsystem
• Overloaded UPS
• Packet Receive Disabled on ServerNet Switch Port
• Packet Transmit Disabled on ServerNet Switch Port
• Port Number Mismatch Between Local ServerNet Switch and Remote ServerNet Switch
• Power Supply Fan Failure
• Primary Power Rail Failure
• Program Checksum Error
• Remote ServerNet Switch Connected to Wrong Port of Local ServerNet Switch
• SANMAN Running Without a Backup
• SEEPROM Checksum Error
• ServerNet Services Initialization Error
• ServerNet Switch Internal Stack Overflow
• ServerNet Switch Not Responding
• ServerNet Switch Operating on Battery Power
• ServerNet Switch Port Disabled for ServerNet Traffic
• ServerNet Switch Router Self-Check Error
• Service Processor I/O Library Routine Call Error
• SNETMON Running Without a Backup
• SNETMON Unable to Discover Remote Node
• SRAM Memory Test Failure
• Too Many ServerNet Switch Automatic Resets Because of Backpressure
• Upper Boot Block Section of FLASH Locked
• UPS Failure
• UPS Not Responding
• X Fabric Not Connected to the Same ServerNet Switch Port as Y Fabric

Troubleshooting SNETMON

For general information about SNETMON, refer to Section 1, ServerNet Cluster Description. The SNETMON process ($ZZKRN.#ZZSCL) and the ServerNet cluster subsystem ($ZZSCL) must be in the STARTED state on a system in order for the system to join a ServerNet cluster.
$ZZKRN.#ZZSCL is a persistent process that should be configured to be started at all times. Use the following steps to troubleshoot SNETMON:

1. Verify that SNETMON and the ServerNet cluster subsystem are started. Do one of the following:

   • Using the TSM Service Application, click the ServerNet Cluster resource in the tree pane to select it. In the attributes pane, check the SNETMON Process State and ServerNet Cluster State attributes. See Figure 7-6.
   • At an SCF prompt:

     -> STATUS PROCESS $ZZKRN.#ZZSCL
     -> STATUS SUBSYS $ZZSCL

   Note. If SNETMON ($ZZKRN.#ZZSCL) does not appear to be available, it might be configured using a different symbolic name. Use the SCF INFO PROCESS $ZZKRN.* command to display a list of all currently configured generic processes.

Figure 7-6. ServerNet Cluster Attributes Showing SNETMON and SANMAN States

2. If $ZZKRN.#ZZSCL is not configured, refer to Section 3, Installing and Configuring a ServerNet Cluster, for information about configuring and starting it. If $ZZKRN.#ZZSCL is configured but not started, try starting it by typing the following at an SCF prompt:

   -> START PROCESS $ZZKRN.#ZZSCL

3. If the SUBSYS object is not started, do one of the following:

   • Type the following at an SCF prompt:

     -> START SUBSYS $ZZSCL

   • Use the TSM Service Application to perform the Start ServerNet Cluster Services action on the ServerNet Cluster resource.

   For more information about starting ServerNet cluster services, refer to Section 5, Managing a ServerNet Cluster.

4. If you continue to have problems, contact your service provider.

Note. Systems using the Tetra 8 topology must have a version of SP firmware that supports clustering to participate in a ServerNet cluster. Otherwise, the ServerNet cluster processes $ZZKRN.#ZZSCL (SNETMON) and $ZZKRN.#ZZSMN (SANMAN) will abend repeatedly when a system load is performed with G06.09 or later. Installing a version of the SP firmware that supports clustering after the cold load and subsequently trying to start $ZZSCL and $ZZSMN does not correct the problem. To correct the problem, you must install a version of SP firmware that supports clustering and then perform a system load. To determine which version of SP firmware you need, see Checking SPR Levels on page 2-24. This problem does not affect systems that use the Tetra 16 topology.

Troubleshooting MSGMON

For general information about MSGMON, refer to Section 1, ServerNet Cluster Description. A MSGMON process must be running in every processor of a system. $ZZKRN.#MSGMON is a persistent process that should be configured to be started at all times.

Use the following steps to troubleshoot MSGMON:

1. Verify that MSGMON is started. At an SCF prompt, type:

   -> STATUS PROCESS $ZZKRN.#MSGMON

   Note. If $ZZKRN.#MSGMON does not appear to be available, it might be configured using a different symbolic name. Use the SCF INFO PROCESS $ZZKRN.* command to display a list of all currently configured generic processes.

2. If $ZZKRN.#MSGMON is not configured, refer to Section 3, Installing and Configuring a ServerNet Cluster, for information about configuring it.

3. If $ZZKRN.#MSGMON is configured but not started, you can start it by typing the following at an SCF prompt. This command starts a copy of MSGMON on every available processor on the system:

   -> START PROCESS $ZZKRN.#MSGMON

4. If you continue to have problems, contact your service provider.
Troubleshooting SANMAN

For general information about SANMAN, refer to Section 1, ServerNet Cluster Description. $ZZKRN.#ZZSMN is a persistent process that should be configured to be started at all times. SANMAN must be in the STARTED state on a system in order for the system to join a ServerNet cluster.

Use the following steps to troubleshoot SANMAN:

1. Verify that SANMAN is started. Do one of the following:

   • Using the TSM Service Application, click the ServerNet Cluster resource to select it, and check the SANMAN Process State attribute. See Figure 7-6 on page 7-18.
   • At an SCF prompt, type:

     -> STATUS PROCESS $ZZKRN.#ZZSMN

   Note. If $ZZKRN.#ZZSMN does not appear to be available, it might be configured using a different symbolic name. Use the SCF INFO PROCESS $ZZKRN.* command to display a list of all currently configured generic processes.

2. If $ZZKRN.#ZZSMN is not configured, refer to Section 3, Installing and Configuring a ServerNet Cluster, for information about configuring it.

3. If $ZZKRN.#ZZSMN is configured but not started, use SCF to start it:

   -> START PROCESS $ZZKRN.#ZZSMN

4. If you continue to have problems, contact your service provider.

Note. Systems using the Tetra 8 topology must have a version of SP firmware that supports clustering to participate in a ServerNet cluster. Otherwise, the ServerNet cluster processes $ZZKRN.#ZZSCL (SNETMON) and $ZZKRN.#ZZSMN (SANMAN) will abend repeatedly when a system load is performed with G06.09 or later. Installing a version of the SP firmware that supports clustering after the cold load and subsequently trying to start $ZZSCL and $ZZSMN does not correct the problem. To correct the problem, you must install a version of SP firmware that supports clustering and then perform a system load. To determine which version of SP firmware you need, see Checking SPR Levels on page 2-24. This problem does not affect systems that use the Tetra 16 topology.
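The checks in the three preceding subsections can be combined into a single SCF pass when you first suspect a clustering problem. The following is a minimal sketch that assumes the default symbolic names shown above; == introduces a comment line:

   == Check the three clustering monitor processes and the subsystem
   -> STATUS PROCESS $ZZKRN.#ZZSCL    == SNETMON
   -> STATUS PROCESS $ZZKRN.#MSGMON   == MSGMON (one per processor)
   -> STATUS PROCESS $ZZKRN.#ZZSMN    == SANMAN
   -> STATUS SUBSYS $ZZSCL            == ServerNet cluster services
   == If a process seems to be missing, it might be configured under a
   == different symbolic name; list all generic processes to find it
   -> INFO PROCESS $ZZKRN.*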
Troubleshooting Expand-Over-ServerNet Line-Handler Processes and Lines

For general information about Expand-over-ServerNet lines and line-handler processes, refer to Section 1, ServerNet Cluster Description, or the Expand Configuration and Management Manual. The Expand-over-ServerNet line-handler processes are responsible for managing security-related messages and forwarding packets outside the ServerNet cluster.

If you suspect a problem with an Expand-over-ServerNet line-handler process or line:

1. Verify the status of the line-handler processes by doing one of the following:

   • Run the guided procedure for configuring a ServerNet node (Configure ServerNet Node) to display the ServerNet Cluster Connection Status dialog box. This dialog box indicates whether line-handler processes are configured between the local node and remote nodes and shows the line states. For information about running the guided procedure, refer to Section 3, Installing and Configuring a ServerNet Cluster.
   • At an SCF prompt, type:

     -> STATUS DEVICE $ZZWAN.*

2. If the line-handler processes are in the STOPPED state, you can start them using the SCF START DEVICE command. For example:

   -> START DEVICE $ZZWAN.#SC004

3. If the line-handler processes need to be configured, do one of the following:

   • Configure line-handler processes using the guided procedure for configuring a ServerNet node. For more information, see Section 3, Installing and Configuring a ServerNet Cluster.
   • Configure the line-handler processes manually using SCF. See the Expand Configuration and Management Manual.

4. Verify the status of the Expand-over-ServerNet lines by doing one of the following:

   • Using the TSM Service Application, click a Remote Node resource to select it, and check the Expand attributes in the details pane. See Figure 7-7.
   • At an SCF prompt, type:

     -> STATUS LINE $SC004, DETAIL
     -> STATUS PATH $SC004, DETAIL

Figure 7-7. Remote Node Attributes Showing Expand Information

5. If the Expand-over-ServerNet lines are stopped, start them by doing one of the following:

   • Use the guided procedure for configuring a ServerNet node. For more information, see Section 3, Installing and Configuring a ServerNet Cluster.
   • At an SCF prompt, type:

     -> START LINE $SC004

6. Use the INFO PROCESS $NCP, NETMAP command to gain additional information about the lines:

   -> INFO PROCESS $NCP, NETMAP

7. If you continue to have problems, refer to the Expand Network Management and Troubleshooting Guide.
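The individual commands in the preceding steps can also be issued together as one quick status pass for a single remote node. The following is a minimal sketch that assumes the example line name $SC004 used above; == introduces a comment line:

   == Quick status pass for one Expand-over-ServerNet line
   -> LISTDEV TYPE 63, 4             == list all Expand-over-ServerNet lines
   -> STATUS DEVICE $ZZWAN.#SC004    == line-handler process state
   -> STATUS LINE $SC004, DETAIL     == line state
   -> STATUS PATH $SC004, DETAIL     == path state
   -> INFO PROCESS $NCP, NETMAP      == network map for additional detail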
Checking Communications With a Remote Node

Use the Node Responsive Test action in the TSM Service Application to test communications with a remote node. This action pings the remote node, verifying whether the node is connected and responding. To ping a remote node:

1. In the tree pane, right-click the X or Y fabric-to-node resource for the node that you want to ping, and select Actions. The Actions dialog box appears.
2. From the Actions list, click Node Responsive Test.
3. Click Perform action. The Action Status window shows the progress of the action. If the action status shows Completed, the action passed. If the action status shows Failed, the action failed.
4. If the action failed, click Show detail for more information.

Methods for Repairing ServerNet Connectivity Problems

ServerNet connectivity problems refer to the inability of one or more processors to communicate with one or more other processors over a ServerNet path. Clusters using the split-star or tri-star topologies with G06.14 software (or G06.13 and the release 3 SPRs that provide G06.14 functionality) support automatic fail-over of ServerNet traffic on the two-lane or four-lane links. For more information, refer to Automatic Fail-Over for Two-Lane and Four-Lane Links on page 7-25.

You can use several methods to repair ServerNet connectivity problems. The following subsections describe these methods:

Procedure                                                                              Page
Switching the SANMAN Primary and Backup Processes to Repair Connectivity Problems     7-23
Using the SCF START SERVERNET Command to Repair Connectivity Problems                 7-24
Stopping and Starting the ServerNet Cluster Subsystem to Repair Connectivity Problems 7-24
Aborting and Restarting SNETMON to Repair Connectivity Problems                       7-25

Switching the SANMAN Primary and Backup Processes to Repair Connectivity Problems

The SCF PRIMARY PROCESS $ZZSMN command forces a takeover of the SANMAN primary process. The backup process, upon becoming the new primary, queries all processors in the node to find the state of ServerNet connections to all other nodes. If it finds any connections that are down, it initiates a sequence to bring the connections to an online state.

The SCF PRIMARY PROCESS $ZZSMN command is noninvasive and is the recommended command for repairing ServerNet connectivity. It does not cause the SANMAN process to stop running, and it does not cause any ServerNet connectivity that is already up to go down. It repairs any ServerNet connectivity that is down, except for cases in which connectivity is down because of ServerNet hardware failures.

Using the SCF START SERVERNET Command to Repair Connectivity Problems

For servers running the G06.12 RVU, the SCF START SERVERNET $ZSNET.fabric.cpu command provides a noninvasive manual method for recovering ServerNet paths. SCF START SERVERNET \remotenode.$ZSNET.fabric.* is the recommended command for speeding up automatic interprocessor communication recovery after a cluster switch hard reset.

Previously, this command had no effect if the fabric or local path was already up. At G06.12, the message system behavior in response to this command was changed so that the processor that receives the command always checks and brings up any remote IPC paths that are down on the fabric. This is done regardless of whether the fabric is down or up at the processor when the command is received.

Stopping and Starting the ServerNet Cluster Subsystem to Repair Connectivity Problems

You can also use the following SCF command sequence to repair connectivity problems:

   SCF STOP SUBSYS $ZZSCL
   SCF START SUBSYS $ZZSCL

This sequence of commands does not result in a takeover. The primary and backup SNETMON processes continue to run on their respective processors.

Stopping and starting ServerNet cluster services to repair connectivity problems is less preferable than using SCF PRIMARY PROCESS $ZZSCL. This is because the SCF STOP SUBSYS $ZZSCL command stops ServerNet cluster services. The SNETMON process pair continues to run, but it tears down all ServerNet connectivity between the node on which the command is issued and all other nodes in the cluster. The SCF START SUBSYS $ZZSCL command brings up ServerNet cluster connectivity from scratch. The final effect is that connectivity comes up after this sequence of commands. However, connectivity was fully destroyed before being brought up again.

HP does not recommend stopping and starting the ServerNet cluster subsystem to repair ServerNet connectivity. The SCF STOP SUBSYS $ZZSCL command is used primarily to ensure the orderly removal of a node from the cluster. The SCF STOP SUBSYS $ZZSCL command normally is used prior to:

• Physically disconnecting a node from the cluster.
• Halting the processors on a node, possibly in preparation for a system load to a new version of the operating system.
• Aborting the SNETMON process pair for the purpose of upgrading the SNETMON software, unless recommended otherwise by HP.
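In summary, the noninvasive methods described in the first two subsections are normally tried first. The following is a minimal sketch; the remote node name \REMOTE is hypothetical, and == introduces a comment line:

   == Force a SANMAN takeover; the new primary re-checks and brings up
   == any ServerNet connections that are down
   -> PRIMARY PROCESS $ZZSMN
   == On G06.12 or later, prod the message system to bring up any remote
   == IPC paths that are still down on each fabric
   -> START SERVERNET \REMOTE.$ZSNET.X.*
   -> START SERVERNET \REMOTE.$ZSNET.Y.*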
Aborting and Restarting SNETMON to Repair Connectivity Problems

You can also use the following sequence of commands to repair connectivity problems:

   SCF ABORT PROCESS $ZZKRN.#ZZSCL
   SCF START PROCESS $ZZKRN.#ZZSCL

Using this sequence of commands aborts the SNETMON process pair (SNETMON ceases to exist). However, ServerNet cluster connectivity is left intact. When the SNETMON process is started again, it queries all processors in the node to find the state of ServerNet connections to all other nodes. If it finds any connections that are down, it initiates a sequence to bring the connections to an online state.

The outcome is therefore similar to issuing an SCF PRIMARY PROCESS $ZZSCL command, but there is a key difference. The Expand-over-ServerNet line-handler processes tolerate only temporary absences of the SNETMON process. After three minutes of absence, the Expand-over-ServerNet line-handler processes declare the lines to other nodes to be down. Consequently, aborting and starting SNETMON must be used with caution. Because of the possibility of Expand-over-ServerNet lines going down (in case the SNETMON process pair is not running for more than three minutes), HP recommends using the SCF PRIMARY PROCESS $ZZSCL command to repair ServerNet cluster connectivity.
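If you do abort and restart SNETMON, you can verify afterward that the Expand-over-ServerNet lines survived the brief SNETMON absence. The following is a minimal sketch, assuming the example line name $SC004 used earlier in this section; == introduces a comment line:

   == Restart the SNETMON process pair; cluster connectivity stays intact
   == provided SNETMON is absent for less than three minutes
   -> ABORT PROCESS $ZZKRN.#ZZSCL
   -> START PROCESS $ZZKRN.#ZZSCL
   == Confirm that the Expand-over-ServerNet lines are still up
   -> STATUS LINE $SC004, DETAIL
   -> STATUS PATH $SC004, DETAIL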
Documentation for the guided procedure is contained in the online help for the procedure. The procedure guides you through the process of selecting a plug-in card (PIC) to troubleshoot and running internal and external loopback or tests on the PIC. The procedure prompts you to replace PICs and ServerNet cables as necessary. ServerNet Cluster Manual— 520575-003 7- 26 Troubleshooting and Replacement Procedures Checking the Internal ServerNet X and Y Fabrics Using TSM to Check the Internal ServerNet Fabrics Use the Group Connectivity ServerNet Path Test action in the TSM Service Application to check the internal ServerNet X and Y fabrics for the local system. Use this test when you want to check the integrity of group-to-group connections along one ServerNet fabric at a time. This test checks the following components: • • • • ServerNet cables PMF CRUs IOMF CRUs SEBs 1. Using a system console attached to the node whose internal fabric you want to check, log on to the TSM Service Application. Logging on to the TSM Service Application is described in detail in Appendix F, Common System Operations. The Management Window appears. 2. In the tree pane, right-click either the Internal_ServerNet_X_Fabric or the Internal_ServerNet_Y_Fabric and select Actions. The Actions dialog box appears. 3. From the Actions list, click Group Connectivity ServerNet Path Test. 4. Click Perform action. The Action Status window shows the progress of the action. 5. Check for an alarm on the Internal_ServerNet_X_Fabric or the Internal_ServerNet_Y_Fabric. If no alarm is present, the action completed successfully and ServerNet messages are able to use the fabric. 6. If an alarm is present, refer to Using TSM Alarms on page 7-12 to identify the repair actions for the alarm. Using SCF to Check the Internal ServerNet Fabrics Use the SCF STATUS SERVERNET command to check processor-to-processor connectivity for both fabrics within a system: >SCF STATUS SERVERNET $ZSNET ServerNet Cluster Manual— 520575-003 7- 27 Troubleshooting and Replacement Procedures Checking the Internal ServerNet X and Y Fabrics The system displays: NONSTOP KERNEL X-FABRIC TO 0 1 FROM 00 UP UP 01 UP UP 02 UP UP 03 UP UP 04 UP UP 05 UP UP 06 UP UP 07 UP UP 08 <- DOWN 09 <- DOWN 10 <- DOWN 11 <- DOWN 12 <- DOWN 13 <- DOWN 14 <- DOWN 15 <- DOWN Y-FABRIC TO FROM 00 01 02 03 04 05 06 07 08 <09 <10 <11 <12 <13 <14 <15 <- 0 1 UP UP UP UP UP UP UP UP DN DN DN DN UP UP UP UP DOWN DOWN DOWN DOWN DOWN DOWN DOWN DOWN Status SERVERNET 2 3 4 5 6 7 8 9 10 UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA 2 3 4 5 6 7 8 9 UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP DN DN UP UP UP UP UP UP DN DN UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UP UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA 11 UNA UNA UNA UNA UNA UNA UNA UNA 10 12 UNA UNA UNA UNA UNA UNA UNA UNA 11 UNA UNA UNA UNA UNA UNA UNA UNA 13 UNA UNA UNA UNA UNA UNA UNA UNA 12 UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA 13 UNA UNA UNA UNA UNA UNA UNA UNA 14 UNA UNA UNA UNA UNA UNA UNA UNA 14 UNA UNA UNA UNA UNA UNA UNA UNA 15 UNA UNA UNA UNA UNA UNA UNA UNA 15 UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA UNA In this example, the boldface type shows that the Y-fabric connection between the processors in group 01 (processors 0 and 1) and the processors in group 03 (processors 4 and 5) is down. 
Checking the External ServerNet X and Y Fabrics

You must use the TSM Service Application to check the external ServerNet X and Y fabrics. The guided procedure for troubleshooting a ServerNet fabric cannot troubleshoot the link between an MSEB and a ServerNet II Switch. The guided procedure can troubleshoot internal fabrics only. However, you can perform internal and external loopback tests on an NNA PIC in an MSEB.

If no alarms are visible on the external fabrics but you still want to check the connectivity between the local node and the cluster switch, you can use the Node Connectivity ServerNet Path Test action. This action tests a path from the local MSEB to the ServerNet II Switch component on one fabric. If problems are found, TSM generates alarms based on the result. This action is provided for diagnostic purposes only. It has no destructive effect.

Table 7-4 describes the scope of the Node Connectivity ServerNet Path Test action.

Table 7-4. Scope of Node Connectivity ServerNet Path Test

On This Fabric   The test checks from . . .                  To the . . .
X fabric         Port 6 of the MSEB in slot 51 of group 01   X-fabric ServerNet II Switch component, including the status of the switch and the switch port.
Y fabric         Port 6 of the MSEB in slot 52 of group 01   Y-fabric ServerNet II Switch component, including the status of the switch and the switch port.

If an alarm is generated, the problem lies between the local node and the cluster switch. The problem might be in any of the following components:

• The MSEB in slot 51 or 52 of the group 01 enclosure of the local system.
• The PIC in port 6 of the MSEB in slot 51 or 52 of the group 01 enclosure of the local system.
• The cable from the PIC in port 6 of the MSEB in slot 51 or 52 of the group 01 enclosure to the ServerNet II Switch component of the cluster switch.
• The ServerNet II Switch component of the cluster switch.
• The PIC in the ServerNet II Switch to which the cable from the local system is connected. Note that PICs installed in ServerNet II Switches are not currently replaceable.

Use the following procedure to check ServerNet connectivity on the external fabrics:

Note. An error is returned if you try to run this path test when another Node Connectivity ServerNet Path Test is in progress on the same fabric. The Path Test in Progress attribute indicates whether a path test is currently being conducted on the fabric.

1. Log on to the TSM Service Application. For a detailed description of logging on to the TSM Service Application, see Appendix F, Common System Operations.
2. In the Management window tree pane, click the Cluster tab.
3. In the tree pane, right-click either the External_ServerNet_X_Fabric or the External_ServerNet_Y_Fabric and select Actions. The Actions dialog box appears.
4. From the Actions list, click Node Connectivity ServerNet Path Test.
5. Click Perform action. The Action Status window shows the progress of the action.
6. Check for an alarm on the External_ServerNet_X_Fabric or the External_ServerNet_Y_Fabric. If no alarm is present, the action completed successfully.
   If an alarm is present (an alarm bell appears next to the object), click the object to select it. See Using TSM Alarms on page 7-12 to get more information about the alarm.

Using the Internal Loopback Test Action

If you need to execute an internal loopback test, HP recommends that you use the guided procedure for troubleshooting a ServerNet fabric (Troubleshoot ServerNet Fabric). The guided procedure automates the process of performing internal and external loopback tests on a variety of PICs. See Using the Fabric Troubleshooting Guided Procedure to Check the Internal ServerNet Fabrics on page 7-26. The guided procedure provides the same function as the TSM Internal Loopback Test action.

The Internal Loopback Test action tests the circuitry of a plug-in card (PIC) installed in an MSEB to determine whether ServerNet traffic can pass through the PIC. You can use the Internal Loopback Test action on a PIC installed in any port of an MSEB. However, you can perform this action on only one PIC at a time.

Caution. Use the Internal Loopback Test action only if you have been instructed to do so by your service provider or by the repair action text for a TSM alarm. Do not use the Internal Loopback Test action on a PIC that you believe is operating normally. Doing so will shut down ServerNet traffic through the PIC.

The Internal Loopback Test action indicates with high probability whether the PIC circuitry is operational or faulty. However, it does not test the connector on the PIC. As a result, it is possible for the Internal Loopback Test action to succeed even though communication through the PIC is not possible.

Typically, you use the Internal Loopback Test action to isolate the cause of a malfunctioning ServerNet path where a PIC is part of that path. You can perform the action with a ServerNet cable connected to the PIC. The action isolates the MSEB port occupied by the PIC, preventing the port from sending or receiving ServerNet traffic during the action. However, the action tests only the PIC and not the cable connected to it.

1. Log on to the TSM Service Application. For a detailed description of logging on to the TSM Service Application, see Appendix F, Common System Operations.
2. Click the System tab of the management window.
3. In the tree pane, right-click the PIC that you want to test (PICs are subcomponents of MSEBs), and select Actions. The Actions dialog box appears.
4. From the Actions list, click Internal Loopback Test.
5. Click Perform action. The Action Status window shows the progress of the action. The action status indicates that the action completed or failed. If the action completed, there is a high probability that the PIC is functioning normally. If the action failed, click the Action detail button to get more information.
6. Click Close to close the Actions dialog box.

Using SCF to Check Processor-to-Processor Connections

You can use the SCF STATUS SUBNET command to check processor-to-processor connections. For details, see the STATUS SUBNET, DETAIL Command Example (Partial Display) on page 8-15.
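For example, the following is a minimal sketch of the two forms of the command; == introduces a comment line:

   == Summary of processor-to-processor cluster connections
   -> STATUS SUBNET $ZZSCL
   == The same information with per-processor detail (see page 8-15)
   -> STATUS SUBNET $ZZSCL, DETAIL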
Finding ServerNet Cluster Event Messages in the Event Log

Before you search the event log ($ZLOG) for ServerNet cluster subsystem event messages, you should be familiar with the subsystems listed in Table 7-5.

Table 7-5. Names of Associated Subsystems

Subsystem           Subsystem Name   Subsystem Number   What to Look For
ServerNet cluster   SCL              218                Messages that indicate a fabric or a node has been disconnected.
SANMAN              SMN              237                Messages that indicate attachment to the HLRF (external ServerNet fabric) failed, or that the HLRF (external ServerNet fabric) was not discovered.
Message system      IPC              203                Messages that indicate a path is down or a fabric is down.

When you view events using the TSM EMS Event Viewer Application, the subsystem name (or, in rare cases, the subsystem number) is shown in the SSID column. The TSM EMS Event Viewer Application allows you to specify:

• The date and time ranges of the events you want displayed
• The logs (such as $0 and $ZLOG) from which you want events displayed
• The subsystems from which you want events displayed

For cause, effect, and recovery information for event messages, refer to the Operator Messages Manual.

MSEB and ServerNet II Switch LEDs

You can use the LEDs on the MSEB and ServerNet II Switch to help diagnose problems. Figure 7-8 describes the LEDs on the MSEB. Figure 7-9 describes the LEDs on the ServerNet II Switch. For information about LEDs on the AC transfer switch or UPS, refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide.

Figure 7-8. MSEB LEDs

No.   LED Type     Color   Function
1     Fault        Amber   Lights when the MSEB or one of its PICs is not in a fully functional state. Possibly, a fault was detected on the MSEB or one of its PICs, or the MSEB or PIC has not been successfully initialized and configured for use as a system resource. When an MSEB is powered on, the amber LED lights briefly to indicate that initialization and configuration are in progress. When the MSEB has been successfully initialized and configured, the amber LED is extinguished. If the MSEB could not be configured, the amber LED remains lit.
2     Power-On     Green   Lights when the MSEB is powered on.
3     Link Alive   Green   Lights to indicate that the port is receiving a valid link-alive signal from the remote port to which it is connected. The ServerNet port LEDs (one for each PIC port) indicate the state of the ServerNet link. These LEDs are extinguished upon loss of the link-alive signal.
4    Ports 0-7      Green  Lights to indicate the ServerNet port is receiving a valid link-alive indication from a ServerNet node.
5    Ports 8-11     Green  Lights to indicate the ServerNet port is receiving a valid link-alive indication from a remote ServerNet switch.
6    PIC Ports 0-7  Green  Lights to indicate the ServerNet port is receiving a valid link-alive indication from a ServerNet node.
7    PIC Ports 8-11 Green  Lights to indicate the ServerNet port is receiving a valid link-alive indication from a remote ServerNet switch.

Replacement Procedures

This subsection includes the following replacement procedures:

Procedure                                                             Page
Replacing an MSEB                                                     7-35
Replacing a PIC in a ServerNet II Switch                              7-35
Replacing a PIC in an MSEB                                            7-36
Replacing a Fiber-Optic Cable Between an MSEB and a ServerNet II
  Switch                                                              7-36
Replacing a Fiber-Optic Cable in a Multilane Link                     7-37
Replacing a ServerNet II Switch                                       7-38
Replacing an AC Transfer Switch                                       7-38
Replacing a UPS                                                       7-38

To service or replace all other server components, refer to the NonStop S-Series Hardware Support Guide or the CSSI Web site. (Click Start>Programs>Compaq TSM>Compaq S-Series Service (CSSI) Web.) Do not attempt to replace a component unless you have exhausted all other troubleshooting techniques and are reasonably certain that replacement is the only alternative.

Replacing an MSEB

Use the guided replacement procedure to replace an MSEB. From the system console of the server containing the MSEB, choose Start>Programs>Compaq TSM>Guided Replacement Tools>Replace SEB or MSEB. Online help is available to assist you in performing the procedure. You can replace only one MSEB at a time.

Replacing a PIC in a ServerNet II Switch

PICs installed in a ServerNet II Switch cannot be replaced in the field. If a PIC is determined to be faulty, the ServerNet cable attached to the PIC must be moved to an unused port on the switch. (Remember that both fabrics must connect to the same switch port number. If you move a ServerNet cable to a different port on the switch, you must move the corresponding cable for the alternate fabric.) If no unused port is available, the switch and its power subsystem must be replaced. To move a ServerNet cable to a different port on a ServerNet II Switch, refer to Section 6, Adding or Removing a Node.

Replacing a PIC in an MSEB

PICs installed in an MSEB can be replaced, but the MSEB must be removed from the enclosure before the PIC can be replaced. To remove the MSEB safely, you must use the guided procedure. From the system console of the server containing the MSEB, choose Start>Programs>Compaq TSM>Guided Replacement Tools>Replace SEB or MSEB. Online help is available to assist you in performing the procedure.

Perform each step in the guided procedure up to and including removing the MSEB from the enclosure. Then refer to the Replace SEB or MSEB Guided Procedure online help for instructions on replacing the PIC. When the PIC is replaced, use the guided procedure to reinstall the MSEB in the enclosure.

Replacing a Fiber-Optic Cable Between an MSEB and a ServerNet II Switch

Use this procedure to replace a fiber-optic cable between an MSEB and a ServerNet II Switch:

1. Before starting, make sure that the internal and external ServerNet fabrics served by the cable opposite the one you are replacing are healthy.
   See Checking the External ServerNet X and Y Fabrics on page 7-29. For example, if you are replacing the cable for the X fabric, make sure that the internal Y fabric for all other nodes and the external Y fabric are fully operational.
2. Review the information on connecting fiber-optic cables in Section 3, Installing and Configuring a ServerNet Cluster.
3. Route the replacement cable between the MSEB and the ServerNet II Switch.
4. Disconnect the suspected bad cable from the MSEB and from the ServerNet II Switch.
5. If the replacement cable has dust caps, remove the dust caps and install them on the suspected bad cable.
6. Connect the replacement cable to the ServerNet II Switch and then to the MSEB. Make sure that you connect the replacement cable to the ports from which you removed the suspected bad cable.
7. Use the TSM Service Application to check for alarms, as described in Using TSM Alarms on page 7-12.
8. After a while, most TSM alarms should clear automatically. If the alarms do not clear, run the Node Connectivity ServerNet Path Test. See Checking the External ServerNet X and Y Fabrics on page 7-29.
9. If the alarms persist, perform the repair actions for clearing the alarms.

Replacing a Fiber-Optic Cable in a Multilane Link

Use this procedure to replace a fiber-optic cable in a multilane link between two cluster switches. Before starting, read this procedure all the way through, especially if your cluster switches are in different sites.

Note. You can determine the ServerNet nodes that are affected by the loss of a specific multilane link by referring to Connections Between Cluster Switches on page 1-30. Figure 1-17 shows the routing of ServerNet packets across four-lane links, and Figure 1-18 shows the routing of ServerNet packets across two-lane links.

1. Make sure that interprocessor communication (IPC) connectivity is up between all nodes on the peer fabric of the suspected bad cable by doing one of the following:
   • If one of the nodes is running SNETMON/MSGMON version T0294AAG (or a superseding SPR), use the SCF STATUS SUBNET $ZZSCL, PROBLEMS command to check connectivity on the peer fabric. This command reports connectivity problems on all nodes. To determine your SNETMON/MSGMON version, see Table 2-9, Checking SPR Levels.
   • If none of the nodes are running T0294AAG (or a superseding SPR), use the SCF STATUS SUBNET $ZZSCL command on all nodes to check the peer fabric.
   If connectivity is down on the peer fabric, repair the problem, if possible, before attempting to replace a multilane link. If necessary, refer to the Troubleshooting Procedures on page 7-1.
2. Label the connectors of the replacement cable with the cluster switch names and port numbers to which the suspected bad cable is connected (such as X1, port 8 at one end and X2, port 10 at the other end). If necessary, refer to Figure 1-17 for connections in a four-lane link and Figure 1-18 for connections in a two-lane link.
3. Physically route the replacement cable along the same path as the suspected bad cable.
4. Disconnect the suspected bad cable at both ends.
5. If the connectors on the replacement cable have dust caps, remove the dust caps and install them on the connectors of the suspected bad cable.
6. Connect the replacement cable as labeled in Step 2.
7. Confirm that the link-alive LED lights at both ends. The link-alive LEDs should light within a few seconds.
   If the link-alive LEDs do not light:
   • Try reconnecting the cable, using care to align the key on the cable plug with the PIC connector.
   • If possible, try connecting a different cable.
8. Direct ServerNet connectivity is automatically restored after an interval of approximately 50 seconds times the number of nodes in the cluster (25 seconds for nodes running G06.14 or later). If you do not want to wait, you can manually force recovery of ServerNet connectivity as follows:
   • On nodes running G06.12 or later RVUs, issue the SCF START SERVERNET $ZNET command.
   • On nodes running the G06.09 through G06.11 RVUs, issue the SCF STOP SUBSYS $ZZSCL command followed by the SCF START SUBSYS $ZZSCL command. These commands temporarily disrupt connectivity on both fabrics for any nodes receiving the commands.
9. Repeat Step 1 to confirm that IPC connectivity is up between all nodes on both fabrics.
10. If connectivity problems continue with the cable, contact your service provider.

Replacing a ServerNet II Switch

Refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide to prepare for replacement. Then use the guided replacement procedure to replace the ServerNet II Switch. From a system console attached to any node connected to the switch, choose Start>Programs>Compaq TSM>Guided Replacement Tools>Replace Switch Component. Online help is available to assist you in performing the procedure.

Replacing an AC Transfer Switch

Refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide for the steps to replace an AC transfer switch.

Replacing a UPS

The UPS is a field-replaceable unit (FRU) and must be replaced by a service provider. For more information about the UPS, refer to the ServerNet Cluster 6770 Hardware Installation and Support Guide. The steps for replacing a UPS are described in the NonStop S-Series Service Provider Supplement. The service provider must use that guide in addition to the guided procedure for replacing a switch component in order to replace the UPS.

Note. To promote availability, HP recommends replacing the UPS battery at least every three years. Because the battery in the UPS is not currently a replaceable unit, you must replace the UPS to replace the UPS battery. Contact your service provider for information about replacement.

Diagnosing Performance Problems

Diagnosis of performance problems in any environment involves multiple steps and requires extensive knowledge of performance fundamentals and methodologies. If there are ServerNet cluster performance issues, you might want to explore these areas:

• Throughput: How much traffic is flowing through nodes and between individual processors?
• Latency: How well does traffic flow?
• Problem isolation: What is responsible for the problem?

The Measure performance tool can assist you in collecting data for throughput and latency analysis. Measure has a SERVERNET entity that tracks all the interprocessor communication for remote nodes (remote IPC type). This information allows you to isolate and examine the flow of ServerNet cluster traffic. Another area to investigate, if you suspect performance degradation, is the health of your Expand lines.
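As a sketch of one way to collect SERVERNET entity data with Measure, a MEASCOM session might look like the following. The measurement file name is hypothetical, and the command sequence is a summary of the general Measure workflow, so verify the exact syntax in the Measure Reference Manual before use:

> MEASCOM
+ ADD SERVERNET *
+ START MEASUREMENT $DATA.MEAS.SNET01
  (run the workload while data is collected)
+ STOP MEASUREMENT $DATA.MEAS.SNET01
+ ADD MEASUREMENT $DATA.MEAS.SNET01
+ LIST SERVERNET *

The LIST SERVERNET output includes the remote IPC counters, which you can compare across processors and nodes to isolate throughput or latency problems.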
For more information on the SERVERNET, NETLINE, and PROCESS entities and the G08 counters added specifically for ServerNet cluster support, consult the Measure Reference Manual.

Part IV. SCF

This part contains the following sections:

• Section 8, SCF Commands for SNETMON and the ServerNet Cluster Subsystem
• Section 9, SCF Commands for the External ServerNet SAN Manager Subsystem
• Section 10, SCF Error Messages

8. SCF Commands for SNETMON and the ServerNet Cluster Subsystem

This section describes the SCF commands that are supported specifically for SNETMON and the ServerNet cluster (SCL) subsystem.

Note. Commands that are generally supported by SCF, such as the ASSUME and ENV commands, are documented in the SCF Reference Manual for G-Series RVUs. Kernel subsystem SCF commands such as ADD, START, and STOP for configuring generic (system-managed) processes (such as the ServerNet cluster monitor process, represented as a PROCESS object) are documented in the SCF Reference Manual for the Kernel Subsystem.

Table 8-1 lists SCF objects and commands for SNETMON and the ServerNet cluster subsystem.

Note. For SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies, see Appendix I, SCF Changes at G06.21.

Table 8-1. ServerNet Cluster SCF Objects and Commands

Command          PROCESS Object  SUBNET Object  SUBSYS Object  Sensitive Command?  Page
ALTER Command                                   X              Yes                 8-5
INFO Command                                    X              No                  8-6
PRIMARY Command  X                                             Yes                 8-7
START Command                                   X              Yes                 8-8
STATUS Command                   X              X              No                  8-9
STOP Command                                    X              Yes                 8-21
TRACE Command    X                                             Yes                 8-22
VERSION Command  X               X              X              No                  8-25

To use all the features in Table 8-1, you must be running the G06.14 RVU, or you must have installed the SNETMON/MSGMON T0294AAG SPR. To determine which features were introduced with each RVU, see Table 8-2, SCF Features for SNETMON and the SCL Subsystem by RVU.

Table 8-2. SCF Features for SNETMON and the SCL Subsystem by RVU

RVU     SNETMON/MSGMON SPR  Introduced These New SCF Features
G06.09  T0294AAA            • ALTER, INFO, START, STATUS, STOP, and VERSION
                              commands for the SUBSYS object
                            • PRIMARY, TRACE, and VERSION commands for the
                              PROCESS object
G06.12  T0294AAE            • SUBNET object for STATUS command. STATUS SUBNET
                              displays information for up to 16 nodes
G06.14  T0294AAG            • ACTIVE, SUMMARY, and PROBLEMS options for the
                              STATUS SUBNET command. STATUS SUBNET displays
                              information for up to 24 nodes

ServerNet Cluster SCF Objects

The following SCF objects are supported for SNETMON and the ServerNet cluster (SCL) subsystem:

PROCESS  Use this object to issue commands for SNETMON ($ZZKRN.#ZZSCL).
SUBNET   Use this object to gather information about connections within the ServerNet cluster subsystem ($ZZSCL).
SUBSYS   Use this object to issue commands for the ServerNet cluster subsystem itself ($ZZSCL).

Sensitive and Nonsensitive Commands

Sensitive SCF commands can have detrimental effects if improperly used. A sensitive command can be issued only by a user who has the super ID, is the owner of the subsystem, or is a member of the group of the subsystem owner.
When used in conjunction with the security features of the system services, SCF provides effective access control for sensitive commands. Commands that request information or status but that do not affect operation are called nonsensitive commands and are available to all users.

SCL SUBSYS Object Summary States

The ServerNet cluster (SCL) subsystem state is maintained by the ServerNet cluster monitor process (SNETMON). There is no aggregate ServerNet cluster subsystem state; each ServerNet cluster monitor process maintains the state of objects relevant to the local system and its connection to the ServerNet cluster. Table 8-3 lists the summary states for the SUBSYS object supported by the SCL subsystem.

Table 8-3. SCF Object Summary States

Summary State  Description
STARTING       The ServerNet cluster subsystem is attempting to establish ServerNet connectivity with the other nodes in the ServerNet cluster.
STARTED        The STARTING phase is complete. For all ServerNet nodes, either connectivity has been established or the attempt failed. It is possible to be in the STARTED state with no connectivity. When the ServerNet cluster subsystem moves to the STARTED state, it automatically detects all other nodes in the ServerNet cluster and establishes connections with those that have a ServerNet cluster monitor process in the STARTED or STARTING state.
STOPPING       Connectivity is in the process of being brought down.
STOPPED        The STOPPING phase is complete; there is no ServerNet cluster connectivity. This is also the initial state of the ServerNet cluster monitor process. When the ServerNet cluster subsystem is running but in a STOPPED summary state, it responds to SCF requests and status queries from local Expand-over-ServerNet line-handler processes. ServerNet connections with remote processors remain disabled, and discovery requests from remote ServerNet cluster monitor processes are not accepted. Status or statistics for remote connections are not available because these connections do not exist. However, statistics for the local nodes are provided.

Figure 8-1 illustrates the state transitions and the commands that trigger them.

Figure 8-1. ServerNet Cluster Subsystem States

            START
  STOPPED --------> STARTING ----------------> STARTED
     ^                  |                         |
     |                  | STOP                    | STOP
     |                  v                         |
     +------------- STOPPING <--------------------+

ServerNet Cluster Subsystem Start State (STARTSTATE Attribute)

The start state configuration (STARTSTATE attribute) of the SCL subsystem SUBSYS object controls the way in which the ServerNet cluster monitor process joins the system to the ServerNet cluster. The ServerNet cluster monitor process (SNETMON) is usually launched automatically by the Persistence Manager ($ZPM) as soon as any of the processors configured for the ServerNet cluster monitor process are reloaded (assuming the ServerNet cluster monitor process is configured with the STARTMODE attribute set to SYSTEM).

Note. Do not confuse the PROCESS object STARTMODE attribute with the SUBSYS object STARTSTATE attribute. The STARTMODE attribute controls the way the Persistence Manager launches the ServerNet cluster monitor process. (See the ADD PROCESS command description in the SCF Reference Manual for the Kernel Subsystem.) The sketch following this note illustrates the difference.
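As a sketch of the distinction, the two attributes are set with different commands against different objects. The ADD PROCESS attribute list shown here is abbreviated and illustrative; see the SCF Reference Manual for the Kernel Subsystem for the full set of attributes used when configuring SNETMON:

To tell the Persistence Manager to launch SNETMON at system load (STARTMODE):

> ADD PROCESS $ZZKRN.#ZZSCL, NAME $ZZSCL, PROGRAM $SYSTEM.SYSTEM.SNETMON, STARTMODE SYSTEM

To tell the running SNETMON to join the cluster automatically after a system load (STARTSTATE):

> ALTER SUBSYS $ZZSCL, STARTSTATE STARTED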
Once the ServerNet cluster monitor process is started, it checks the SUBSYS object start state (STARTSTATE attribute) to determine whether to automatically bring the ServerNet cluster subsystem to the STARTED logical state. Even though the ServerNet cluster monitor process is running, the system does not join the ServerNet cluster until the ServerNet cluster subsystem is in the STARTED logical state.

Set the STARTSTATE attribute by using the ALTER SUBSYS command. (See ALTER Command on page 8-5.)

If STARTSTATE is set to . . .  Then . . .
STARTED   The ServerNet cluster subsystem automatically moves into the STARTED logical state and joins the system to the ServerNet cluster.
STOPPED   The ServerNet cluster subsystem waits for a START SUBSYS command before moving into the STARTED logical state and joining the system to the ServerNet cluster.

ALTER Command

The ALTER command modifies the value of the STARTSTATE attribute for the ServerNet cluster subsystem. ALTER is a sensitive command.

The ALTER command syntax is:

ALTER [ /OUT file-spec/ ] SUBSYS $ZZSCL { attribute-spec }...

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

attribute-spec
  specifies the attribute to be modified and the value to be assigned to it. Currently, there is only one supported attribute name and value combination:

  [ , STARTSTATE { STARTED | STOPPED } ]

  The STARTSTATE attribute specifies the start state of the ServerNet cluster subsystem:
  • STARTED: the ServerNet cluster subsystem automatically moves into the STARTED logical state when the system is loaded and joins the system to the ServerNet cluster.
  • STOPPED: the ServerNet cluster subsystem must be manually moved into the STARTED logical state using the SCF START SUBSYS command (see page 8-8).

Considerations

• The STARTSTATE attribute is used only after a system load. When the ServerNet cluster subsystem is restarted on a running system, it enters the state that was last set by an operator command (including the implied operator command at system load).
• If the ALTER SUBSYS command is entered correctly, an EMS message reports the command, the time it was executed, the terminal from which the command was entered, and the group and user numbers of the user issuing the command.

Example

The following command alters the STARTSTATE attribute for the ServerNet cluster subsystem. The next time the system is loaded, the ServerNet cluster subsystem must wait for a START SUBSYS command before moving into the STARTED state and joining the system to the ServerNet cluster:

> ALTER SUBSYS $ZZSCL, STARTSTATE STOPPED

INFO Command

The INFO command returns the current values for the ServerNet cluster monitor process configuration. INFO is a nonsensitive command.

The INFO command syntax is:

INFO [ /OUT file-spec/ ] SUBSYS $ZZSCL [ , DETAIL ]

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

DETAIL
  INFO SUBSYS and INFO SUBSYS, DETAIL show the same information.

Example

The following command displays the current configuration for the ServerNet cluster monitor process:

> INFO SUBSYS $ZZSCL

Servernet Cluster - Info SUBSYS \SYS.$ZZSCL
StartState....... STARTED
CommandState..... STARTED
CommandTime...... 06 Jul 2000, 16:20:50.703
The fields returned by the INFO SUBSYS command are as follows:

StartState
  shows the current value of the STARTSTATE attribute for the ServerNet cluster subsystem. Possible values are:
  STARTED  The ServerNet cluster subsystem is configured to move into the STARTED state automatically and join the system to the ServerNet cluster after a system load.
  STOPPED  The ServerNet cluster subsystem is configured to wait for a START SUBSYS command before moving into the STARTED state and joining the system to the ServerNet cluster.

CommandState
  shows the state for the ServerNet cluster subsystem that was last set by operator command (including the implied operator command after system load), since the last system load. Possible values are:
  STARTED  The ServerNet cluster subsystem automatically performs join processing in the event of a restart. If the ServerNet cluster is already in a joined state, it remains joined with no disruption.
  STOPPED  The ServerNet cluster subsystem automatically performs disconnect processing in the event of a restart. This ensures that the system is properly disconnected from the ServerNet cluster.

CommandTime
  shows the time that the state of the ServerNet cluster subsystem was last set by operator command (including the implied operator command after system load), since the last system load (when the last START or STOP SUBSYS command was entered).

PRIMARY Command

The PRIMARY command causes a processor switch: the backup processor becomes the primary processor, and the primary processor becomes the backup processor. PRIMARY is a sensitive command.

The PRIMARY command syntax is:

PRIMARY [ /OUT file-spec/ ] PROCESS $ZZSCL [, cpunum ]

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

cpunum
  is the processor number of the current backup processor for the ServerNet cluster monitor process.

Consideration

Wild cards are not supported for the PRIMARY PROCESS command.

Example

The following command causes the previously configured backup processor for the ServerNet cluster monitor process (processor 3) to become the primary processor:

> PRIMARY PROCESS $ZZSCL, 3

START Command

The START command moves the ServerNet cluster subsystem into the logical STARTED state and joins the system to the ServerNet cluster. START is a sensitive command.

The START command syntax is:

START [ /OUT file-spec/ ] SUBSYS $ZZSCL

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

Considerations

• If the START SUBSYS command is entered correctly, an EMS message reports the command, the time it was executed, the terminal from which the command was entered, and the group and user numbers of the user issuing the command.
• ServerNet cluster startup is a two-tiered process, as the sketch after this list illustrates:
  1. First the ServerNet cluster monitor process has to be started (by the Persistence Manager or with an SCF START PROCESS command).
  2. Then the ServerNet cluster monitor process moves ServerNet cluster services into the STARTED state and joins the system to the ServerNet cluster.
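For example, the following sketch shows the two tiers as SCF commands, assuming SNETMON is configured as the generic process $ZZKRN.#ZZSCL. (If the Persistence Manager has already started SNETMON, the first command is unnecessary.)

> START PROCESS $ZZKRN.#ZZSCL
> START SUBSYS $ZZSCL
> STATUS SUBSYS $ZZSCL

The STATUS SUBSYS command at the end confirms that the subsystem has reached the STARTED state.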
When the ServerNet cluster monitor process is launched, it enters its default service state (STOPPED) and then checks the SUBSYS object STARTSTATE configuration to determine whether to automatically proceed toward the STARTED logical state. If the configured STARTSTATE is STOPPED, the ServerNet cluster monitor process must wait for a START SUBSYS command before proceeding to start ServerNet cluster services.

Once ServerNet cluster services on the local system have been started, the ServerNet cluster monitor process establishes ServerNet connections with all other systems in the ServerNet cluster that are in the STARTED or STARTING states, and moves the subsystem to the STARTED state. If the Expand-over-ServerNet line-handler processes are configured and are in the STARTED state, Expand connectivity is established with the other systems in the ServerNet cluster.

When the START SUBSYS is completed:
• The subsystem state is STARTED.
• The ServerNet cluster monitor process has a list of all systems known to be in the ServerNet cluster.
• ServerNet connections are established with each system.

If ServerNet connection attempts fail, or if successful connections subsequently fail, periodic attempts are made to establish or reestablish the connection. Failures and successful reconnections are logged to the event log. Failures to connect are logged as path or other failures. In addition, each ServerNet cluster subsystem state change (to STARTING and then to STARTED) is logged. If no other systems are discovered, that fact also is logged.

Example

The following SCF command moves the ServerNet cluster subsystem into the logical STARTED state and joins the system to the ServerNet cluster:

> START SUBSYS $ZZSCL

STATUS Command

The STATUS command returns information about the cluster connections and the state of the ServerNet cluster subsystem. STATUS is a nonsensitive command.

The STATUS command syntax is:

STATUS [ /OUT file-spec/ ] { SUBNET } $ZZSCL [, DETAIL] [, LOCAL]
                                     [, NODE ServerNet node number]
                                     [, ACTIVE] [, SUMMARY] [, PROBLEMS]
                           { SUBSYS } $ZZSCL [, DETAIL]

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

DETAIL
  if specified with STATUS SUBNET, displays detailed status information on all internal and external ServerNet paths between processors for all nodes in the cluster. Currently, STATUS SUBSYS and STATUS SUBSYS, DETAIL show the same information.

LOCAL
  if specified, displays detailed status information for the local ServerNet cluster subsystem only. Use of this option negates the specification of the DETAIL option.

NODE
  if specified, displays detailed status information only for the node (local or remote) having ServerNet node number. Use of this option negates the specification of the DETAIL option.

ACTIVE
  if specified, displays a summary table of all ServerNet cluster subsystem connections for currently active nodes only.

SUMMARY
  if specified, displays a summary table of all ServerNet cluster subsystem connections for all nodes at the beginning of the output, regardless of the DETAIL, LOCAL, or NODE options.

PROBLEMS
  if specified, automatically queries the SNETMON processes in all nodes and displays any connectivity problems.
The PROBLEMS option is a quick way to check connectivity problems without having to issue separate STATUS SUBNET commands for each node. No other options can be specified with this option. Remote passwords must be established for nodes to report problems.

Considerations

The following considerations apply to the STATUS SUBNET $ZZSCL command:

• If the DETAIL, LOCAL, and NODE parameters are not specified, a summary table of all ServerNet cluster subsystem connections appears.
• If detailed status information appears (a DETAIL, LOCAL, or NODE option was specified), the nodes known to the ServerNet cluster subsystem appear in numeric order regardless of the order requested.
• The NODE option can be specified as many times as desired to indicate more than one particular node. It can also be used with the LOCAL option.
• The PROBLEMS option must be specified by itself.

The following consideration applies to the STATUS SUBSYS $ZZSCL command:

• The ServerNet cluster subsystem state is maintained by the ServerNet cluster monitor process. There is no aggregate ServerNet cluster subsystem state. Instead, each ServerNet cluster monitor process maintains the state of objects relevant to the local system and its connection to the ServerNet cluster.

STATUS SUBNET Command Example

The following example shows the STATUS SUBNET $ZZSCL command with no options specified:

> STATUS SUBNET $ZZSCL

         SNETMON               Remote ServerNet          SNETMON
    SysName  Num  0<--CPU States-->15   LocLH RemLH SCL  EXPAND  State,Cause
Node-------------------------------------------------------------------------
 1| RMT \SCQA6 254 1111,0000,0000,0000  UP 119  UP  CONN  CONN   . . . . .   |
 2| LCL \TROLL 253 1111,1p11,1111,1111  . . . . . . . . . . . .  STRD,UNKN   |
 3| RMT \MS9   206 1111,0000,0000,0000  UP 125  UP  CONN  CONN   . . . . . . |
 4| RMT \MS10  207 1111,0000,0000,0000  UP 124  UP  CONN  CONN   . . . . . . |
 5| RMT \MS11  208 1111,0000,0000,0000  UP 123  UP  CONN  CONN   . . . . . . |
 6| RMT \MS12  209 XXY1,0000,0000,0000  . . . . . . . . . . . . . . . . . .  |
 7| RMT \MS13  210 1111,0000,0000,0000  UP 121  UP  CONN  CONN   . . . . . . |
 8| RMT \MS14  211 P111,0000,0000,0000  UP 120  UP  CONN  CONN   . . . . . . |
 9| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
10| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
11| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
12| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
13| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
14| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
15| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
16| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
17| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
18| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
19| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
20| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
21| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
22| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
23| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
24| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  |
-------------------------------------------------------------------------------
The following paragraphs explain the fields returned by the STATUS SUBNET summary table:

Node
  is the ServerNet node number (1 through 24) of the local or remote system, where:
  LCL  indicates that the node is local. This ServerNet node's SNETMON is providing the information in the display.
  RMT  indicates that the node is remote. This is any ServerNet node other than the local node.

SysName
  is the system name of the ServerNet node.

Num
  is the Expand node number of the ServerNet node.

SNETMON CPU States
  provides an individual path state summary code for the connections between the local node and a particular processor of a particular ServerNet node. The path state codes are as follows:

  Path State Code  Description
  0  The paths are down because the processor is down.
  1  The paths and processor are up without errors.
  P  The processor is up, but there are errors on both internal fabrics.
  X  The processor is up, but there is an error on the internal X fabric.
  Y  The processor is up, but there is an error on the internal Y fabric.
  p  The processor is up, but there are errors on both external fabrics.
  x  The processor is up, but there is an error on the external X fabric.
  y  The processor is up, but there is an error on the external Y fabric.

Note. If errors are present on both the internal and external paths, only an internal path error is indicated. You can use the DETAIL, LOCAL, or NODE option to obtain more detailed information.

Remote ServerNet
  describes the status of a remote node's ServerNet connection to the local node, where:
  LocLH   is the status (UP or DOWN) of the local Expand-over-ServerNet line and its LDEV number.
  RemLH   is the status (UP or DOWN) of the remote Expand-over-ServerNet line.
  SCL     indicates if the local ServerNet cluster subsystem ($ZZSCL) is connected (CONN) or not connected (NCONN) to a particular remote ServerNet cluster subsystem.
  EXPAND  indicates if the local node is connected (CONN) or not connected (NCONN) to a particular remote node using an Expand connection (even if multiple hops separate the two nodes).

SNETMON State,Cause
  is the local SNETMON state and cause status, where the possible state codes for the ServerNet cluster subsystem are:

  State Code  Description
  STRG  Currently starting
  STRD  Successfully started
  STPG  Currently stopping
  STPD  Stopped and not currently active

  The possible cause codes are:

  Cause Code  Description
  UNKN  Unknown cause
  PRTC  Protocol error registering with SANMAN
  AUTO  Automatic startup
  NNAR  NNA (Node-Numbering Agent) reprogrammed
  PWFL  Power failure
  OPRQ  Operator request
  HDWR  Hardware failure
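For example, if the summary table showed the code x for a processor of node 6, a reasonable next step is to display detailed path states for that node only (the node number here is hypothetical):

> STATUS SUBNET $ZZSCL, NODE 6

The detailed display shows the internal and external path-state matrices for that node, in the format illustrated in the STATUS SUBNET, DETAIL Command Example (Partial Display) on page 8-15.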
STATUS SUBNET, PROBLEMS Command Example

The following example shows the STATUS SUBNET $ZZSCL command with the PROBLEMS option:

> STATUS SUBNET $ZZSCL, PROBLEMS

Node SysName    Nodes With Connectivity Problems
-------------------------------------------------------------------------------
 1) \SIERRA   | ( 05 )
 2) \IGATE    | ( Error 48 returned while accessing node )
 3) \SPEEDY   | ( 05 )
 4) \.......  | ( Node is not currently active )
 5) \COMM     | ( 01, 02, 03, 05 )
 6) \TESTY    | ( 05 )
 7) \.......  | ( Node is not currently active )
 8) \.......  | ( Node is not currently active )
 9) \.......  | ( Node is not currently active )
10) \.......  | ( Node is not currently active )
11) \.......  | ( Node is not currently active )
12) \.......  | ( Node is not currently active )
13) \.......  | ( Node is not currently active )
14) \.......  | ( Node is not currently active )
15) \.......  | ( Node is not currently active )
16) \.......  | ( Node is not currently active )
17) \.......  | ( Node is not currently active )
18) \.......  | ( Node is not currently active )
19) \.......  | ( Node is not currently active )
20) \.......  | ( Node is not currently active )
21) \.......  | ( Node is not currently active )
22) \.......  | ( Node is not currently active )
23) \.......  | ( Node is not currently active )
24) \.......  | ( Node is not currently active )
-------------------------------------------------------------------------------

The following paragraphs explain the fields returned by the STATUS SUBNET, PROBLEMS summary:

Node
  is the ServerNet node number (1 through 24) of the system in the cluster.

SysName
  is the system name of the ServerNet node.

Nodes With Connectivity Problems
  lists the ServerNet node numbers of the nodes with which SysName is having connectivity problems. If there are no connectivity problems between SysName and the other nodes, No connectivity problems detected appears in the display. Other information, such as error messages, might appear in this field.

STATUS SUBNET, DETAIL Command Example (Partial Display)

The following example shows a partial display of the STATUS SUBNET $ZZSCL command with the DETAIL option. The path state values shown in the matrices (such as 4, 5, 36, and 37) are explained in Table 8-4.

> STATUS SUBNET $ZZSCL, DETAIL

Remote Node -- ServerNet Node Number: 14
System Name: \STAR2
Expand Node Number: 212
Remote Processors Up (via EXPAND): ( 0 1 2 3 )
Local LH Ldev Number: 122
Local LH Name: $SC212
Local LH Status: UP
Remote LH Status: UP
SNETMON Local/Remote Connection Status: CONNECTED

Internal Path States For X Fabric:
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|

Internal Path States For Y Fabric:
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |37 37 37 37  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
External Path States For X Fabric (OUT):
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|

External Path States For Y Fabric (OUT):
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
External Path States For X Fabric (IN):
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|

External Path States For Y Fabric (IN):
 DST   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
 SRC  ------------------------------------------------
 00  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 01  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 02  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 03  |36 36 36 36  5  5  5  5  5  5  5  5  5  5  5  5|
 04  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 05  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 06  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 07  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 08  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 09  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 10  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 11  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 12  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 13  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 14  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|
 15  | 4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4|

Node(s) not available: ( 1 3 5 6 7 8 10 12 13 15 16 )

The following descriptions explain the fields returned by the STATUS SUBNET, DETAIL command:

DST
  is the destination processor (00 through 15) for ServerNet packets originating from processors listed in the left column of the table.

SRC
  is the source processor (00 through 15) of ServerNet packets intended for processors listed in the top row of the table.

The numeric values in the command output indicate the state of the paths between the source and destination processors. Table 8-4 describes each of the 38 path state values.

If Path State Problems Are Indicated

If the output of the STATUS SUBNET, DETAIL command indicates path-state problems, see Methods for Repairing ServerNet Connectivity Problems on page 7-23. You can also use the TSM Service Application to investigate path-related problems.
If an alarm appears on a component of the path, perform any associated repair actions. For details, see Using TSM Alarms on page 7-12. If these actions do not solve the path problem, contact your service provider.

Table 8-4. Path State Values Returned by the STATUS SUBNET, DETAIL Command

No.  Path State Value          Meaning
1    Unknown state             The state is unknown.
2    No data for path          The node is not known by the local ServerNet cluster subsystem.
3    No remote path data       The state of the processors in the node is not known by the local ServerNet cluster subsystem.
4    Source CPU down           The source processor is down. Reload the processor if it is installed in the source node.
5    Destination CPU down      The destination processor is down. Reload the processor if it is installed in the destination node.
6    Src strtng SNet not allc  The source processor is starting direct ServerNet connectivity along the path but has not yet allocated the necessary ServerNet resources in the kernel.
7    Src strtng SNet allc      The source processor is starting direct ServerNet connectivity along the path and has already allocated the necessary ServerNet resources in the kernel.
8    Src brought down conn     The source processor brought down direct ServerNet connectivity along the path because it believes the destination processor is down or the destination node is unreachable via ServerNet.
9    Dst strtng SNet not allc  The destination processor is starting direct ServerNet connectivity along the path but has not yet allocated the necessary ServerNet resources in the kernel.
10   Dst strtng SNet allc      The destination processor is starting direct ServerNet connectivity along the path and has already allocated the necessary ServerNet resources in the kernel.
11   Dst brought down conn     The destination processor brought down direct ServerNet connectivity along the path because it believes the source processor is down or the source node is unreachable via ServerNet.
12   Hrd dwn sequence errors   The path is in hard down state because the source processor detected unrecoverable sequence errors along the path.
13   Hrd dwn barrier NACK      The path is in hard down state because a ServerNet barrier packet sent by the source processor along the path was not acknowledged.
14   Hrd dwn transfer NACK     The path is in hard down state because a ServerNet data packet sent by the source processor along the path was not acknowledged.
15   Hrd dwn barrier timeout   The path is in hard down state because a ServerNet barrier packet sent by the source processor along the path timed out.
16   Hrd dwn conn state bad    The path is in hard down state because of an inconsistency in ServerNet connectivity states between the source processor and its local ServerNet cluster subsystem.
17   Hrd dwn src CPU died      The path is in hard down state because the source processor is down.
18   Hrd dwn max BTE timeouts  The path is in hard down state because the source processor detected more than 40 ServerNet block transfer engine (BTE) data packet timeouts along the path.
19   Hrd dwn pwr on interrupt  The path is in hard down state because the source processor received a power-on interrupt, which indicates that the source node had a power outage.
20   Hrd dwn node left clustr  The path is in hard down state because the node that contains the source processor has left the cluster.
21   Hrd dwn fabric auto up    The path is in hard down state because the source processor automatically brought up access to a ServerNet fabric but has not yet validated that the path is usable.
22   Hrd dwn unknown reason    The path is hard down for an unknown reason.
23   Sft dwn barrier success   The path was upgraded from hard down to soft down state after the source processor successfully sent a ServerNet barrier packet along the path.
24   Sft dwn max BTE timeouts  The path was downgraded from good to soft down state because the source processor detected more than 20 ServerNet block transfer engine (BTE) data packet timeouts along the path.
25   Sft dwn unknown reason    The path is in soft down state for an unknown reason.
26   Sft dwn max rvival tries  The path is in hard down state because it failed and was automatically revived more than 10 times in the last 1-hour interval. Due to excessive failed revivals, automatic path recovery is now disabled for this path.
27   Downed by operator cmd    The path is down because an operator command brought down the path at the source processor.
28   Src fab dwnd oper cmd     The path is down because an operator command brought down access to a ServerNet fabric at the source processor.
29   Src fab dwnd interrupt    The path is down because the source processor received a hardware interrupt informing it that access to a ServerNet fabric is down.
30   Src fab dwnd unkn cause   The path is down because the source processor lost access to a ServerNet fabric for an unknown reason.
31   Dst fab dwnd oper cmd     The path is down because an operator command brought down access to a ServerNet fabric at the destination processor.
32   Dst fab dwnd interrupt    The path is down because the destination processor received a hardware interrupt informing it that access to a ServerNet fabric is down.
33   Dst fab dwnd unkn cause   The path is down because the destination processor lost access to a ServerNet fabric for an unknown reason.
34   Path up other path died   The source processor upgraded the path from soft or operator down to good state because the path along the other fabric failed.
35   Path up operator command  The path is up because an operator command brought up the path at the source processor.
36   Path up probe success     The path was upgraded from soft down to good state after the source processor successfully tested the path with automatic path recovery probe messages.
37   Path up CPU brought up    The path is up because the source processor learned that the destination processor was up and available for direct ServerNet connectivity.
38   Path up unknown cause     The path is up for an unknown reason.

About IPC Paths and Connections

All processors in a ServerNet cluster are connected over a pair of physical ServerNet X and Y fabrics. The fabric at a processor is said to be up if the processor can communicate over that fabric; otherwise, it is considered down.
A path is a unidirectional ServerNet communication conduit between a pair of processors over one ServerNet fabric. A path is defined as a source processor, a fabric, and a destination processor, in that order. Each processor in the ServerNet cluster has a pair of paths, X and Y, to every other processor in the cluster. The other processor can be in the same system or in another system.

The definition of a path includes a sense of direction. Between two processors A and B, there are four paths:

• Path X from A to B (maintained by processor A)
• Path Y from A to B (maintained by processor A)
• Path X from B to A (maintained by processor B)
• Path Y from B to A (maintained by processor B)

A message system connection between two processors is a logical bidirectional ServerNet communication conduit that contains the four paths mentioned above. The message system connection between processors A and B can be up only if there is direct ServerNet connectivity between the processors in both directions. That is, at least one of the paths from A to B must be up, and at least one of the paths from B to A must be up.

Paths between processors exist in various states for a number of reasons. For example, processor A might have put its X path to processor B in a hard down state due to a barrier timeout, while processor B has not put its X path to processor A in a down state, because it has not received any errors on it. Consequently, the state of a path from A to B is not necessarily the same as the state of the path from B to A.

An internal path is a path between a pair of processors within the same system. An external path is a path between a local and a remote processor.

Paths and fabrics in a ServerNet cluster can fail and can also be repaired. In response to a path or fabric failure, see If Path State Problems Are Indicated on page 8-17.

STATUS SUBSYS Command Example

The following example displays the current logical state of the ServerNet cluster subsystem:

> STATUS SUBSYS $ZZSCL

Servernet Cluster - Status SUBSYS \SYS.$ZZSCL
State.............STARTED

where State is one of STARTING, STARTED, STOPPING, or STOPPED. See SCL SUBSYS Object Summary States on page 8-3 for additional information.

STOP Command

The STOP command terminates access to the ServerNet cluster subsystem in an orderly manner. It stops ServerNet cluster services on the local system, terminates ServerNet connections with other systems in the ServerNet cluster, and moves the subsystem to the STOPPED state.

Note. The ServerNet cluster monitor process (SNETMON) itself does not stop. It remains running in the STARTED logical state.

STOP is a sensitive command. The STOP command syntax is:

STOP [ /OUT file-spec/ ] SUBSYS $ZZSCL [, FORCED ]

OUT file-spec
  causes any SCF output generated for this command to be directed to the specified file.

FORCED
  Currently, there is no difference between STOP SUBSYS and STOP SUBSYS, FORCED.

Considerations

• If the STOP SUBSYS command is entered correctly, the ServerNet cluster monitor process generates an EMS message that reports the command, the time it was executed, the terminal from which the command was entered, and the group and user numbers of the user issuing the command.
• Terminating access from the local system to the ServerNet cluster proceeds as follows:
  1. The ServerNet cluster monitor process sets the ServerNet cluster subsystem state to STOPPING and logs the state change.
  2. The ServerNet cluster monitor process informs each remote ServerNet cluster monitor process that it is stopping.
  3. The ServerNet cluster monitor process instructs each local processor to terminate ServerNet connectivity.
  4. When the processors have completed, this ServerNet cluster monitor process moves the subsystem to the STOPPED state and logs the change.
  5. The ServerNet cluster monitor process itself does not stop. It remains an active process in the STARTED logical state.
  6. Only the subsystem state changes are logged. The individual path state changes are not logged.
  7. On remote systems, when the ServerNet cluster monitor processes are notified of the STOP, they instruct their local processors to terminate ServerNet connectivity with the stopping system. These remote ServerNet cluster monitor processes then log the node disconnection to the event log.

Example

The following SCF command stops ServerNet cluster services on the local system, terminates ServerNet connections with other systems in the cluster, and moves the subsystem to the STOPPED state:

> STOP SUBSYS $ZZSCL

TRACE Command

The TRACE command:

• Starts a trace operation on a ServerNet cluster monitor process
• Alters trace parameters set by a previous TRACE command
• Stops a previously requested trace operation
• Is a sensitive command
WRAP | NOWRAP
WRAP specifies that when the trace disk file end-of-file mark is reached, trace data wraps around to the beginning of the file and overwrites any existing data. If you omit this option, the default is for wrapping to be turned off (NOWRAP).

STOP
ends the current trace operation.

Considerations

• Only a single ServerNet cluster monitor process trace can be running at any one time in each process. Therefore, the ServerNet cluster monitor process can have two active traces: one in the primary process and one in the backup process. Each message monitor process can also run a simultaneous trace.
• The BACKUP option takes effect immediately.
• For more information on tracing, including the ranges and defaults of the attributes listed above, see the PTrace Reference Manual.

Examples

• The following SCF command starts a trace operation on the ServerNet cluster monitor process and writes results into the file named $DATA.SNETMON.TRACE:

> TRACE PROCESS $ZZSCL, TO $DATA.SNETMON.TRACE

• The following SCF command alters the previously started trace to allow trace data to be overwritten when the end of file (EOF) is reached:

> TRACE PROCESS $ZZSCL, WRAP

• The next example stops the trace:

> TRACE PROCESS $ZZSCL, STOP

VERSION Command

The VERSION command displays version information about the ServerNet cluster monitor process. VERSION is a nonsensitive command.

The VERSION command syntax is:

VERSION [ /OUT file-spec/ ] { PROCESS } $ZZSCL [ , DETAIL ]
                            { SUBNET  }
                            { SUBSYS  }

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

DETAIL
designates that complete version information is to be returned. If DETAIL is omitted, a single line of version information is returned.

Examples

The following SCF command displays the ServerNet cluster product name, product number, and release date:

> VERSION SUBSYS $ZZSCL
VERSION SUBSYS \SYS.$ZZSCL: SCL - T0294G08 - (01JUL01) - AAG

The following SCF command shows the information returned by the VERSION, DETAIL command:

> VERSION SUBSYS $ZZSCL, DETAIL
Detailed VERSION SUBSYS \SYS.$ZZSCL
SYSTEM \SYS
SCL - T0294G08 - (01JUL01) - AAG
GUARDIAN - T9050 - (Q06)
SCF KERNEL - T9082G02 - (03OCT01) (25SEP01)
SCL PM - T0294G08 - (01JUL01) - AAG

The following descriptions explain the fields returned by the VERSION and VERSION, DETAIL commands:

SCL - T0294G08 - (01JUL01) - AAG
identifies the version of the ServerNet cluster monitor process and the release date.

GUARDIAN - T9050 - (Q06)
identifies the version of the NonStop operating system.

SCF KERNEL - T9082G02 - (03OCT01) (25SEP01)
identifies the version of the SCF Kernel (T9082G02) and the release date (03OCT01).

SCL PM - T0294G08 - (01JUL01)
identifies the version of the SCF product module (T0294G08) and the release date (01JUL01).
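Version lines of the form shown above follow a regular product - vproc - (date) pattern. The following Python sketch, offered only as an illustration (it is not a supported tool, and the helper name is invented), parses a line of this common form:

  import re

  # Parses one VERSION output line such as "SCL - T0294G08 - (01JUL01) - AAG".
  # Lines in other formats (for example, the GUARDIAN line) return None.
  VERSION_LINE = re.compile(
      r"^(?P<product>\w+) - (?P<vproc>\w+) - "
      r"\((?P<date>[0-9]{2}[A-Z]{3}[0-9]{2})\)(?: - (?P<spr>\w+))?$"
  )

  def parse_version(line):
      m = VERSION_LINE.match(line.strip())
      return m.groupdict() if m else None

  print(parse_version("SCL - T0294G08 - (01JUL01) - AAG"))
  # {'product': 'SCL', 'vproc': 'T0294G08', 'date': '01JUL01', 'spr': 'AAG'}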
9 SCF Commands for the External ServerNet SAN Manager Subsystem

This section describes the SCF commands that are supported specifically for the external ServerNet system area network (SAN) manager subsystem (SMN). The SMN subsystem is used to manage the external ServerNet SAN manager process (SANMAN).

Note. Commands that are generally supported by SCF, such as the ASSUME and ENV commands, are documented in the SCF Reference Manual for G-Series RVUs. Kernel subsystem SCF commands such as ADD, START, and ABORT for configuring generic (system-managed) processes (such as the ServerNet SAN manager process, represented as a PROCESS object) are documented in the SCF Reference Manual for the Kernel Subsystem.

Table 9-1 lists the SCF commands for the SMN subsystem.

Note. For SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies, see Appendix I, SCF Changes at G06.21.

Table 9-1. External ServerNet SAN Manager (SMN) Subsystem SCF Commands

Command          CONN Object  PROCESS Object  SWITCH Object  Sensitive Command?  See page
ALTER Command                                 X              Yes                 9-3
INFO Command     X                            X              No                  9-5
LOAD Command                                  X              Yes                 9-13
PRIMARY Command               X                              Yes                 9-16
RESET Command                                 X              Yes                 9-16
STATUS Command   X                            X              No                  9-18
TRACE Command                 X                              Yes                 9-34
VERSION Command               X                              No                  9-36

To use all the features in this section, you must be running G06.16 or a later G-series RVU, or you must have installed the SANMAN T0502AAH SPR. To determine the features that were introduced at each RVU, see Table 9-2, SCF Features for SANMAN by RVU.

Table 9-2. SCF Features for SANMAN by RVU

RVU     SANMAN SPR  Introduced These New SCF Features
G06.09  T0502       • PRIMARY PROCESS, TRACE PROCESS, and VERSION PROCESS commands
G06.12  T0502AAE    • ALTER, INFO, LOAD, RESET, and STATUS commands
                    • CONNECTION and SWITCH objects
G06.14  T0502AAG    • Firmware VPROC, PWA Number, Topology, and Config VPROC fields in the INFO SWITCH display
                    • Topology option for the LOAD SWITCH command
                    • NNA option and more complete MSEB information for the STATUS CONNECTION command
                    • ROUTER option and new Port Neighbor information for the STATUS SWITCH command
G06.16  T0502AAH    • New values for the Fabric Access, Port Status, and Status for Neighbor attributes of the STATUS CONNECTION $ZZSMN command
                    • New values for the Switch Port Status Codes of the STATUS SWITCH $ZZSMN command

SANMAN SCF Objects

The following SCF objects are supported for SANMAN:

CONN
Use this object to gather information about external fabric connections to a cluster switch.

PROCESS
Use this object to issue commands for the external ServerNet SAN manager process.

SWITCH
Use this object to issue commands for a cluster switch.

Sensitive and Nonsensitive Commands

Sensitive SCF commands can have detrimental effects if improperly used. A sensitive command can be issued only by a user with the super ID, the owner of the subsystem, or a member of the group of the subsystem owner. When used in conjunction with the security features of the system services, SCF provides effective access control for sensitive commands. Commands that request information or status but that do not affect operation are called nonsensitive commands and are available to all users.
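As a conceptual illustration of the general access rule just described, the following Python sketch models who may issue a sensitive command. It is an invented model for exposition only (individual commands, such as ALTER SWITCH, may impose stricter requirements, as noted in their Considerations):

  # General rule: the super ID, the subsystem owner, or a member of the
  # owner's group may issue a sensitive command. Users are modeled as
  # (group, member) number pairs.
  SUPER_ID = (255, 255)

  def may_issue_sensitive(user, owner):
      return user == SUPER_ID or user == owner or user[0] == owner[0]

  print(may_issue_sensitive((255, 10), (255, 1)))  # True: same group as owner
  print(may_issue_sensitive((100, 10), (255, 1)))  # False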
ALTER Command

The ALTER command is a sensitive command. It allows you to:

• Specify the fabric setting used by a cluster switch.
• Assign or change a locator string for the cluster switch.
• Command the LEDs on the ServerNet II Switch to blink or stop blinking. (The ServerNet II Switch is the principal subcomponent of a cluster switch.)

The ALTER command syntax is:

ALTER [ /OUT file-spec/ ] SWITCH $ZZSMN, NEAREST { fabric-ID }, { attribute-spec } ...

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

fabric-ID
specifies the nearest cluster switch on the specified (X or Y) external ServerNet fabric.

attribute-spec
is a specification of an attribute and a value to be assigned to it. It is one of the following attribute name and value combinations:

[ FABRIC { X | Y | NONE } ]
[ LOCATOR "{ locator-string }" ]
[ BLINK { ALL | NONE } ]

FABRIC

X
The cluster switch will be configured for the X fabric.

Y
The cluster switch will be configured for the Y fabric.

NONE
The cluster switch will be configured for neither fabric.

LOCATOR locator-string
indicates an identifier string of 0 to 32 ASCII characters that can be used to describe or help locate the cluster switch.

BLINK

ALL
Blink all switch port LEDs, including the fault LED.

NONE
Stop blinking all switch port LEDs, including the fault LED, and restore the normal operating state of the LEDs. (During normal operation, the port LED lights to indicate link alive.)

Considerations

• The ALTER command is a sensitive command and can be used only by a supergroup user (255, n) ID.
• If the FABRIC attribute is specified, the command requires confirmation by the user.
• Wild cards are not supported for the ALTER SWITCH command.
• Only one attribute specification (FABRIC, LOCATOR, or BLINK) can be specified in a single ALTER SWITCH command.
• The BLINK ALL attribute specification blinks the LEDs of all ports, including the fault LED, on the ServerNet II Switch component of the specified cluster switch.

ALTER SWITCH Command Examples

The following example changes the fabric setting of the nearest Y-fabric cluster switch to none, meaning that the cluster switch can support neither the X nor the Y fabric:

> ALTER SWITCH $ZZSMN, NEAREST Y, FABRIC NONE
This command should only be issued by Compaq trained support personnel.
Executing this command will configure the switch with the specified
fabric setting (i.e., one of X, Y, or NONE). The NONE option should
normally be used only if the switch is being removed from the cluster.
The X and Y options should be used only if the current fabric setting
of the switch does not match the external fabric the switch is on.
Do you wish to continue with this command? (Y, [N]) Y
The nearest ServerNet switch in the external ServerNet Y fabric now
has a fabric setting of NONE.

The following example sets the locator string for the nearest Y-fabric cluster switch to a new value:

> ALTER SWITCH $ZZSMN, NEAREST Y, LOCATOR "Building 3, Room 1346"
The nearest ServerNet switch in the external ServerNet Y fabric now
has a switch locator string of:
"Building 3, Room 1346"

The following example blinks the LEDs on the nearest Y-fabric cluster switch:

> ALTER SWITCH $ZZSMN, NEAREST Y, BLINK ALL
The nearest ServerNet switch in the external ServerNet Y fabric has
begun to blink the LEDs of all ports.
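Because only one attribute specification is allowed per ALTER SWITCH command and a locator string is limited to 32 ASCII characters, a script that generates these commands must enforce those limits itself. The following Python sketch is a hypothetical helper (the function name is invented, and it only builds the command text; it does not issue it through SCF):

  def build_alter_switch(fabric_id, attribute, value):
      """Builds one ALTER SWITCH command targeting the nearest cluster
      switch on the X or Y external fabric. Exactly one attribute
      (FABRIC, LOCATOR, or BLINK) is allowed per command."""
      if fabric_id not in ("X", "Y"):
          raise ValueError("fabric-ID must be X or Y")
      if attribute == "FABRIC" and value in ("X", "Y", "NONE"):
          spec = f"FABRIC {value}"
      elif attribute == "LOCATOR" and isinstance(value, str) and len(value) <= 32:
          spec = f'LOCATOR "{value}"'
      elif attribute == "BLINK" and value in ("ALL", "NONE"):
          spec = f"BLINK {value}"
      else:
          raise ValueError("invalid attribute specification")
      return f"ALTER SWITCH $ZZSMN, NEAREST {fabric_id}, {spec}"

  print(build_alter_switch("Y", "LOCATOR", "Building 3, Room 1346"))
  # ALTER SWITCH $ZZSMN, NEAREST Y, LOCATOR "Building 3, Room 1346"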
INFO Command

The INFO command obtains information about a cluster switch or about the external ServerNet fabric connection to the nearest cluster switch. Unless specified otherwise, information is displayed for both fabrics. This information is returned by the external ServerNet SAN manager process ($ZZSMN). INFO is a nonsensitive command.

The INFO command syntax is:

INFO [ /OUT file-spec/ ] { CONN[ECTION] | SWITCH } $ZZSMN [, ONLY fabric-id ]

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

ONLY fabric-id
displays switch configuration information for only the specified fabric, where fabric-id is either X or Y.

Consideration

In addition to the values described in the INFO command displays, you might see values of N/A or UNKNOWN. In general, a value of N/A means that a value is not applicable or not expected for the field. A value of UNKNOWN means that a value is expected but cannot be obtained for some reason.

INFO CONNECTION Command Example

The following example shows the INFO CONNECTION $ZZSMN command:

> INFO CONN $ZZSMN

INFO CONNECTION              X Fabric                   Y Fabric
|--------------------------------------------------------------------------|
| Command Status     | OK                        | OK                      |
| Status Detail      | No Status Detail          | No Status Detail        |
|--------------------|---------------------------|------------|------------|
| Configuration      | MSEB Port   | Switch Port | MSEB Port  | Switch Port|
|--------------------|-------------|-------------|------------|------------|
| Port Info Valid    | TRUE        | TRUE        | TRUE       | TRUE       |
| Port Number        | 6           | 1           | 6          | 1          |
| Desired Port State | N/A         | TX/RX ENBLD | N/A        | TX/RX ENBLD|
| Neighbor ID Check  | N/A         | NO QRY PASS | N/A        | NO QRY PASS|
| Node Number Mask   | 0x000fc000  | 0x000fc000  | 0x000fc000 | 0x000fc000 |
| Node Routing ID    | 0x000d8000  | 0x000d8000  | 0x000d8000 | 0x000d8000 |
| SvNet Node Number  | 2           | 2           | 2          | 2          |
| NNA Version        | 22          | N/A         | 22         | N/A        |
| PIC Functional ID  | NNA PIC     | SMF OPTICAL | NNA PIC    | SMF OPTICAL|
|--------------------|-------------|-------------|------------|------------|

In this example:

Command Status
is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37.

Status Detail
is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37.

Port Info Valid
indicates whether the port information is valid. Possible values are TRUE and FALSE.

Port Number
is the connector location on the MSEB or cluster switch to which the fiber-optic ServerNet cable is attached. Valid port numbers on the cluster switch are 0 through 11. The only valid port number on the MSEB is port 6.

Desired Port State
indicates whether transmit and receive are to be enabled. Valid states are:

TX/RX DSBLD
transmit and receive are disabled.

TX/RX ENBLD
transmit and receive are enabled.

TX/RX AUTO
no low-level neighbor checks are run. The port can still enable or disable ServerNet traffic.

N/A
does not apply.

Neighbor ID Check
indicates the type of neighbor checks to be performed to enable the port. Possible values are:

NO QRY PASS
perform no query of neighbor and assume it passes.

NO QRY FAIL
perform no query of neighbor and assume it fails.

IGNORE RSLT
query neighbor, but ignore the result.

QRY TO PASS
query neighbor and enable the port only if it passes.

Node Number Mask
is a bit-mask indicating which bits of the node routing ID are valid.

Node Routing ID
is the node number routing ID. For the MSEB, the ID is configured on the NNA PIC. For the cluster switch port, the ID is assigned by the external fabric.
SvNet Node Number is a number in the range 1 through 24 that identifies a member system in a ServerNet cluster. The ServerNet node number is a simplified expression of the six-bit node-routing ID that determines the node to which a ServerNet packet is routed. The ServerNet node number is assigned based on the port to which the node is connected on the ServerNet II Switch. ServerNet Cluster Manual— 520575-002 9 -7 SCF Commands for the External ServerNet SAN Manager Subsystem INFO CONNECTION Command Example NNA Version indicates the version of the NNA on the MSEB. A value of N/A (not applicable) appears for the cluster switch port. PIC Functional ID is the type of plug-in card (PIC) used in the MSEB or cluster switch. PIC types include: SMF OPTICAL Single-mode fiber-optic plug-in card MMF OPTICAL Multi-mode fiber-optic plug-in card ECL PIC Emitter-coupled logic plug-in card NNA PIC Node-numbering agent plug-in card Copper PIC Serial-copper plug-in card LVDS DC PIC Low-voltage differential signals plug-in card SCSI PIC Small computer system interface plug-in card PIC Absent No plug-in card present ServerNet Cluster Manual— 520575-002 9 -8 SCF Commands for the External ServerNet SAN Manager Subsystem INFO SWITCH Command Example INFO SWITCH Command Example The following example shows the INFO SWITCH $ZZSMN command: > INFO SWITCH $ZZSMN INFO SWITCH X Fabric Y Fabric |----------------------------------------------------------------------------| | Command Status | OK | OK | | Status Detail | No Status Detail | No Status Detail | |--------------------|---------------------------|---------------------------| | Configuration | ServerNet Switch | ServerNet Switch | |--------------------|---------------------------|---------------------------| | Switch Locator | Left switch in room 1205 | Room 1205 Left Side | | " " | | | | | | | | Eye Catcher | SN | SN | | Manufacturer ID | 0x0053 | 0x0053 | | Manufacturer Model | ServerNet II | ServerNet II | | Hardware Revision | 0_04 | 0_04 | | Firmware Revision | 3_0_81 | 3_0_81 | | Firmware VPROC | T0569G06^11APR01^FMW^AAE | T0569G06^11APR01^FMW^AAE | | Globally Unique ID | V0XY35 | V0XY1U | | PWA Number | 425655 | 425655 | | Conf Support Flags | Status,OS Hints,Tag Pairs | Status,OS Hints,Tag Pairs | | Number of Ports | 12 | 12 | | | | | | Capability Flag | Can Source And Sink Pkts | Can Source And Sink Pkts | | " " | Supplies Per Port Data | Supplies Per Port Data | | " " | Full Pkt Wormhole IBC | Full Pkt Wormhole IBC | | " " | Packet Level IBC | Packet Level IBC | | " " | Able To Route Packets | Able To Route Packets | | | | | | Position ID | 2 | 2 | | Topology | 16-Nodes | 16-Nodes | | Configuration Tag | 0x00010001 | 0x00010001 | | Config Revision | 0x0001000b | 0x0001000b | | Config VPROC | T0569G06^02JUL01CL2^AAE | T0569G06^02JUL01CL2^AAE | | | | | | PIC Func ID (00) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (01) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (02) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (03) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (04) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (05) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (06) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (07) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (08) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (09) | SMF 
Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (10) | SMF Optical PIC Installed | SMF Optical PIC Installed | | PIC Func ID (11) | SMF Optical PIC Installed | SMF Optical PIC Installed | | | | | | ServerNet ID | 0x000ffffe | 0x000ffffe | | Fabric Setting | X Fabric | Y Fabric | | Switch Poll Intrvl | 60 seconds | 60 seconds | |--------------------|---------------------------|---------------------------| | Configuration | UPS | UPS | |--------------------|---------------------------|---------------------------| | UPS Info Valid | TRUE | TRUE | | UPS Type | UPS 1440 VA FW -0039 | UPS 1440 VA FW -0039 | | UPS ID | TS392A0114 | TS262A0122 | | UPS Part Number | 05144717-5901 | 05144717-5901 | | UPS Poll Interval | 60 seconds | 60 seconds | |--------------------|---------------------------|---------------------------| Total Errors = 0 Total Warnings = 0 ServerNet Cluster Manual— 520575-002 9 -9 SCF Commands for the External ServerNet SAN Manager Subsystem INFO SWITCH Command Example In this example: Command Status is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37. Status Detail is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37 Switch Locator is an identifier string of 0 to 32 ASCII characters that can be used to describe or help locate the cluster switch. Eye Catcher indicates the nature of the configuration. For cluster switch configuration, the value is SN. Manufacturer ID is the HP code for the manufacturer of the ServerNet switch subcomponent of the cluster switch. Manufacturer Model is the model of the ServerNet switch subcomponent of the cluster switch. Possible values are: ServerNet I and ServerNet II. Only ServerNet II Switches can be used in ServerNet clusters. Hardware Revision is the revision of the ServerNet switch motherboard. Firmware Revision is the version of firmware running on the cluster switch Firmware VPROC is the VPROC string for the version of firmware running on the cluster switch. Globally Unique ID is a unique 6-character value used to identify the cluster switch. The GUID is stored in nonvolatile memory (SEEPROM) on the ServerNet II Switch subcomponent at the time of manufacture. The GUID is not visible on the exterior of the ServerNet II Switch. ServerNet Cluster Manual— 520575-002 9- 10 SCF Commands for the External ServerNet SAN Manager Subsystem INFO SWITCH Command Example PWA Number is the printed wiring assembly (PWA) number of the ServerNet II Switch subcomponent of the cluster switch. Conf Support Flags indicates the configuration support flags. Number of Ports is the number of ports on the ServerNet switch. This value is always 12 for the ServerNet II Switch. Capability Flag indicates the services required or provided by the node. Position ID indicates the position of the cluster switch on the fabric. Possible values are 1, 2, or 3. The X1/Y1 cluster switches have a position ID of 1, the X2/Y2 cluster switches have a position ID of 2, and the X3/Y3 cluster switches have a position ID of 3. Topology indicates the topology of the cluster. Possible values are: 16-Nodes a cluster using either the star or split-star topology with 16 or fewer nodes 24-Nodes a cluster using the tri-star topology with 24 or fewer nodes Configuration Tag is the internal representation of the configuration tag stored on the cluster switch. 
The configuration tag is determined from the specified POSITION and TOPOLOGY parameters as follows:

Position  Topology  Configuration Tag
1         16 nodes  0x00010000
2         16 nodes  0x00010001
1         24 nodes  0x00010002
2         24 nodes  0x00010003
3         24 nodes  0x00010004

The manufacturing default (blank) configuration tag is 0x00011111.

Config Revision
is the revision of the configuration data installed on the cluster switch.

Config VPROC
is the VPROC string for the version of configuration data running on the cluster switch.

PIC Func ID (xx)
is the type of plug-in card (PIC) installed in the ServerNet II Switch subcomponent of the cluster switch and the port number in which the PIC is installed. Possible values are:

• SMF Optical PIC Installed
• MMF Optical PIC Installed
• Serial Copper PIC Installed
• ECL PIC Installed
• NNA PIC Installed
• SCSI PIC Installed
• LVDS DC PIC Installed
• PIC Not Present

ServerNet ID
is the internal representation of the ServerNet ID of the ServerNet II Switch.

Fabric Setting
is the external ServerNet fabric served by the cluster switch. Possible values are X Fabric, Y Fabric, or NONE.

Switch Poll Intrvl
is the interval at which SANMAN polls the cluster switch for status. The polling interval depends on the version of SANMAN that is running on your system and the state of the link. If a problem occurs with the link, later versions of SANMAN poll the switch more often until the link is repaired.

Table 9-3. SANMAN Cluster Switch Polling Intervals

SANMAN Version     No Error     Link Error
T0502 (G06.09)     180 seconds  180 seconds
T0502AAA (G06.09)  180 seconds  180 seconds
T0502AAE (G06.12)  60 seconds   60 seconds
T0502AAG (G06.14)  180 seconds  60 seconds
T0502AAH (G06.16)  180 seconds  60 seconds

UPS Info Valid
indicates whether the uninterruptible power supply (UPS) information is valid. Possible values are TRUE or FALSE. The value is FALSE if the UPS is disconnected from the ServerNet II Switch subcomponent.

UPS Type
is the VA rating (volts multiplied by amps) and the firmware version of the UPS subcomponent of the cluster switch.

UPS ID
is the number used to identify the UPS subcomponent of the cluster switch.

UPS Part Number
is the part number of the UPS subcomponent of the cluster switch.

UPS Poll Interval
is the interval at which the firmware polls the UPS subcomponent of the cluster switch for status.
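The configuration tags listed earlier for each POSITION and TOPOLOGY combination form a small, fixed mapping. The following Python sketch is illustrative only (the function name is invented); it returns the configuration tag expected for a given combination:

  # Configuration tags keyed by (position, topology), from the table in
  # the Configuration Tag description. The manufacturing default (blank)
  # tag is 0x00011111.
  CONFIG_TAGS = {
      (1, "16NODES"): 0x00010000,
      (2, "16NODES"): 0x00010001,
      (1, "24NODES"): 0x00010002,
      (2, "24NODES"): 0x00010003,
      (3, "24NODES"): 0x00010004,
  }

  def expected_config_tag(position, topology):
      try:
          return CONFIG_TAGS[(position, topology)]
      except KeyError:
          raise ValueError("no such position/topology combination") from None

  print(hex(expected_config_tag(2, "16NODES")))  # 0x10001, as in the INFO SWITCH display

Note that position 3 is valid only with the 24-node (tri-star) topology; the lookup correctly rejects (3, "16NODES").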
LOAD Command

The LOAD command is a sensitive command. It allows you to download a configuration or firmware file from the server to the ServerNet II Switch subcomponent of the cluster switch.

Note. The LOAD command is intended for service providers only. To load firmware or configuration for a cluster switch, use the TSM Service Application.

The LOAD command syntax is:

LOAD [ /OUT file-spec/ ] SWITCH $ZZSMN, NEAREST { fabric-ID },
  { [ FIRMWARE { filename } ] |
    [ CONFIG { filename }, POSITION { position }, TOPOLOGY topology ] }

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

NEAREST fabric-ID
specifies the cluster switch on the specified (X or Y) external ServerNet fabric.

FIRMWARE filename
indicates the name of the firmware file to be downloaded. The file name should be specified in the standard file system external format. The file must be located on the local system.

CONFIG filename
indicates the name of the configuration file to be downloaded. The file name should be specified in the standard file system external format. The file must be located on the local system.

POSITION position
specifies the position (1, 2, or 3) of the cluster switch on the external fabric. The position determines which ServerNet node numbers the cluster switch supports:

Position  Cluster Switches  Support ServerNet Node Numbers
1         X1 and Y1         1 through 8
2         X2 and Y2         9 through 16
3         X3 and Y3         17 through 24

TOPOLOGY topology
specifies the topology of the cluster. If the cluster contains 16 or fewer nodes and uses either the split-star topology or the star topology, specify 16NODES. If the cluster contains 24 or fewer nodes and uses the tri-star topology, specify 24NODES.

Considerations

• The LOAD command is a sensitive command and can be used only by a supergroup user (255, n) ID.
• The command requires confirmation by the user.
• Wild cards are not supported for the LOAD SWITCH command.
• Only one attribute specification (FIRMWARE or CONFIG) can be specified in a single LOAD SWITCH command. If CONFIG is specified, POSITION and TOPOLOGY must also be specified.
• A soft reset of the cluster switch must be performed after a firmware download. For details, see the NonStop S-Series Service Provider Supplement.
• A hard reset of the cluster switch must be performed after a configuration download. For details, see the NonStop S-Series Service Provider Supplement.

LOAD SWITCH Command Examples

The following example downloads the firmware file $SYSTEM.SYS65.M6770 to the nearest X-fabric cluster switch:

> LOAD SWITCH $ZZSMN, NEAREST X, FIRMWARE $SYSTEM.SYS65.M6770
This command should only be issued by Compaq trained support personnel.
Executing this command will load a new firmware image on the switch.
Before executing this command, be sure to read the instructions for
switch firmware download in the ServerNet Cluster Manual.
Do you wish to continue with this command? (Y / [N]) Y
The nearest ServerNet switch in the external ServerNet X fabric has
received firmware file \STAR3.$SYSTEM.SYS65.M6770 which will be put
into use after the next SOFT RESET command.

Note that the LOAD SWITCH command will only load the firmware file into one bank of the switch. If this is the first LOAD SWITCH command, it will be necessary to execute the command again before performing the SOFT RESET command. To confirm that a firmware file has loaded into both banks of a switch, use the STATUS SWITCH command and make sure that the "Firmware Images" field shows "Images The Same". If the field shows "Images Different", you will need to execute the same LOAD SWITCH command once again before performing the SOFT RESET command.
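The firmware-load messages above describe a two-bank update discipline: each LOAD SWITCH writes only one bank, and the SOFT RESET should be performed only when both banks hold the same image. The following Python sketch is a hypothetical checker, not a supported tool; it simply encodes that decision from the STATUS SWITCH "Firmware Images" field:

  def ready_for_soft_reset(firmware_images_field):
      """Per the LOAD SWITCH guidance: perform the SOFT RESET only when
      the STATUS SWITCH 'Firmware Images' field shows 'Images The Same';
      otherwise repeat the same LOAD SWITCH command first."""
      return firmware_images_field == "Images The Same"

  print(ready_for_soft_reset("Images Different"))  # False: load again first
  print(ready_for_soft_reset("Images The Same"))   # True: safe to soft reset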
The following example downloads the configuration file $SYSTEM.SYS65.M6770CL to the nearest cluster switch on the Y fabric serving ServerNet node numbers 9 through 16 in a cluster using the tri-star topology:

> LOAD SWITCH $ZZSMN, NEAREST Y, CONFIG $SYSTEM.SYS65.M6770CL, &
  POSITION 2, TOPOLOGY 24NODES
This command should only be issued by Compaq trained support personnel.
Executing this command will load a new configuration image on the switch.
Before executing this command, be sure to read the instructions for
switch configuration download in the ServerNet Cluster Manual.
Do you wish to continue with this command? (Y / [N]) Y
The nearest ServerNet switch in the external ServerNet Y fabric has
received configuration file \STAR3.$SYSTEM.SYS65.M6770CL and will be
named ServerNet switch 2 (Y fabric) after the next HARD RESET command.

Note that the LOAD SWITCH command will only load the configuration file into one bank of the switch. If this is the first LOAD SWITCH command, it will be necessary to execute the command again after performing the HARD RESET command. To confirm that a configuration file has loaded into both banks of a switch, use the STATUS SWITCH command and make sure that the "Config Images" field shows "Images The Same". If the field shows "Images Different", you will need to execute the same LOAD SWITCH command once again after performing the HARD RESET command.

PRIMARY Command

The PRIMARY command is a sensitive command. It causes a processor switch: the backup processor becomes the primary processor, and the primary processor becomes the backup processor.

The PRIMARY command syntax is:

PRIMARY [ /OUT file-spec/ ] PROCESS $ZZSMN [, cpunum ]

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

cpunum
is the processor number of the current backup processor for the ServerNet SAN manager process.

Consideration

Wild cards are not supported for the PRIMARY PROCESS command.

Example

The following command causes the previously configured backup processor for the ServerNet SAN manager process (processor 3) to become the primary processor:

> PRIMARY PROCESS $ZZSMN, 3
RESET Command

The RESET command is a sensitive command. It allows you to perform a hard or soft reset of the ServerNet II Switch. (The ServerNet II Switch is the principal subcomponent of the cluster switch.)

The RESET command syntax is:

RESET [ /OUT file-spec/ ] SWITCH $ZZSMN, [ NEAREST { fabric-ID } ], [ HARD | SOFT ]

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

NEAREST
allows you to designate the ServerNet II Switch to be reset by specifying the cluster switch that is directly connected to the server.

fabric-ID
specifies the cluster switch on the specified (X or Y) external ServerNet fabric.

HARD | SOFT
allows you to designate the type of reset:

Hard Reset
Reinitializes the router-2 ASIC within the ServerNet II Switch, disrupting the routing of ServerNet packets through the switch for about 12 seconds.

Soft Reset
Restarts the firmware on the ServerNet II Switch but does not interfere with ServerNet pass-through data traffic.

Considerations

• The RESET command is a sensitive command and can be used only by a supergroup user (255, n) ID.
• Wild cards are not supported for the RESET command.

RESET SWITCH Command Examples

The following example performs a soft reset of the ServerNet II Switch subcomponent of the nearest X-fabric cluster switch:

> RESET SWITCH $ZZSMN, NEAREST X, SOFT
This command should only be issued by Compaq trained support personnel.
Executing this command will force the switch into a soft reset.
A soft reset restarts the switch firmware, but does not interfere with
ServerNet pass-through data traffic via the switch. When executing this
command, be sure that cabling changes on any links directly connected
to the switch are not performed.
Do you wish to continue with this command? (Y / [N]) Y
The nearest ServerNet switch in the external ServerNet X fabric has
been soft reset.

The following example performs a hard reset of the ServerNet II Switch subcomponent of the nearest Y-fabric cluster switch:

> RESET SWITCH $ZZSMN, NEAREST Y, HARD
This command should only be issued by Compaq trained support personnel.
Executing this command will force the switch into a hard reset, which
is functionally equivalent to a power-on reset. Before executing this
command, be sure that interprocessor connectivity is up for all nodes
via the other external ServerNet fabric. Note that the switch that is
hard reset will temporarily not be able to respond to further
operational commands.
Do you wish to continue with this command? (Y / [N]) Y
The nearest ServerNet switch in the external ServerNet Y fabric has
been hard reset.

To determine when the hard reset has completed and the switch is ready for further operational commands, use the STATUS SWITCH command, which indicates whether the switch is responding. Use the STATUS SUBNET ($ZZSCL) command to determine whether interprocessor connectivity via the switch has returned.

STATUS Command

The STATUS command obtains dynamic status information for the external ServerNet fabric connections. This information is returned by the external ServerNet SAN manager process ($ZZSMN). STATUS is a nonsensitive command.

The STATUS command syntax is:

STATUS [ /OUT file-spec/ ] { CONN[ECTION] [, NNA ] | SWITCH [, ROUTER ] } $ZZSMN [, ONLY fabric-id ]

OUT file-spec
causes any SCF output generated for this command to be directed to the specified file.

NNA
causes the STATUS CONNECTION command to display the status of the node-numbering agent (NNA) registers instead of the status of the MSEB and ServerNet II Switch ports.

ROUTER
causes the STATUS SWITCH command to display only router status codes for the ServerNet II Switch ports.

ONLY fabric-id
displays status information for only the specified fabric, where fabric-id is either X or Y.

Considerations

• The STATUS command displays connection or cluster switch information for both external ServerNet fabrics unless the ONLY option is specified.
• In addition to the values described in the STATUS command displays, you might see values of N/A or UNKNOWN. In general, a value of N/A means that a value is not applicable or not expected for the field. A value of UNKNOWN means that a value is expected but cannot be obtained for some reason.
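Because N/A is often legitimate while UNKNOWN signals a value that could not be obtained, a monitoring script may want to flag only the latter. The following Python sketch is an invented helper, not part of SCF; it assumes the STATUS display output has been captured to text by some other means:

  def flag_unknown_fields(status_lines):
      """status_lines: iterable of text lines captured from a STATUS
      CONNECTION or STATUS SWITCH display. Returns the lines containing
      UNKNOWN, which indicates an expected value could not be obtained.
      N/A is deliberately ignored because it is often legitimate."""
      return [line for line in status_lines if "UNKNOWN" in line]

  sample = ["| Port Status | LINK ALIVE | UNKNOWN |",
            "| NNA Version | N/A        | 22      |"]
  print(flag_unknown_fields(sample))  # only the first line is flagged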
STATUS CONNECTION Command Example

The following example shows the STATUS CONNECTION $ZZSMN command:

> SCF STATUS CONN $ZZSMN

STATUS CONNECTION              X Fabric                  Y Fabric
|---------------------------------------------------------------------------|
| Command Status     | OK                        | OK                       |
| Status Detail      | No Status Detail          | No Status Detail         |
|---------------------------------------------------------------------------|
| Fabric Access      | No Error                  | No Error                 |
|--------------------|---------------------------|--------------------------|
| Status Result      | MSEB Port   | Switch Port | MSEB Port   | Switch Port|
|--------------------|-------------|-------------|-------------|------------|
| Port Status Valid  | TRUE        | TRUE        | TRUE        | TRUE       |
| Port Status        | LINK ALIVE  | LINK ALIVE  | LINK ALIVE  | LINK ALIVE |
| Link Speed         | 125 MB/S    | 125 MB/S    | 125 MB/S    | 125 MB/S   |
| Link Lvl Prtcl Rcv | ENABLED     | ENABLED     | ENABLED     | ENABLED    |
| Link Lvl Prtcl Trn | ENABLED     | ENABLED     | ENABLED     | ENABLED    |
| Pckt Lvl Prtcl Rcv | ENABLED     | ENABLED     | ENABLED     | ENABLED    |
| Pckt Lvl Prtcl Trn | ENABLED     | ENABLED     | ENABLED     | ENABLED    |
| Lost Optical Signl | ENABLED     | FALSE       | ENABLED     | FALSE      |
| Status For Neighbr | N/A         | OK          | N/A         | OK         |
| Target Output Port | ENABLED     | ENABLED     | ENABLED     | ENABLED    |
| Node Number Mask   | 0x000fc000  | 0x000fc000  | 0x000fc000  | 0x000fc000 |
| Node Routing ID    | 0x000d8000  | 0x000d8000  | 0x000d8000  | 0x000d8000 |
| SvNet Node Number  | 2           | 2           | 2           | 2          |
| NNA Prog State     | ENABLED     | N/A         | ENABLED     | N/A        |
|--------------------|-------------|-------------|-------------|------------|
Total Errors = 0 Total Warnings = 0

In this example:

Command Status
is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37.

Status Detail
is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37.

Fabric Access
indicates the status of the external X and Y fabric connections. Possible values are:

• No Error
• MSEB Missing
• Link Dead
• No Response
• Processor Fabric Down
• CRU Type Not MSEB
• No NNA PIC
• Wrong Fabric
• Bad Switch Port Number
• Bad SCB Loaded
• Bad Switch PIC Type
• Bad Switch GUID
• Node Number Mismatch
• NNA Verify Fail
• SP I/O Library Call Error
• System Power Up
• No MSEB Config Record
• Bad MSEB Config Record
• MSEB Config Fetch Error
• Internal Sys Fabric Down
• Both LEDs Set
• TNET Initialization Error
• Invalid Fabric Parameter
• Too Many Switches

Port Status Valid
indicates whether the port status information is valid. Possible values are TRUE and FALSE.

Port Status
indicates the status of the MSEB or ServerNet II Switch port. Possible values are Link Alive, Link Dead, Dead, Reset, or Uninstalled.

Link Speed
indicates the speed of the link in megabytes per second (MB/s).

Link Lvl Prtcl Rcv
indicates whether Link Receive is enabled. Possible values are ENABLED or DISABLED.

Link Lvl Prtcl Trn
indicates whether Link Transmit is enabled. Possible values are ENABLED or DISABLED.

Pckt Lvl Prtcl Rcv
indicates whether Packet Receive is enabled. Possible values are ENABLED or DISABLED.

Pckt Lvl Prtcl Trn
indicates whether Packet Transmit is enabled. Possible values are ENABLED or DISABLED.
Lost Optical Signl
indicates whether a lost optical signal error occurred.

Status For Neighbr
indicates the status of the cluster switch port as reported to neighboring switches on the same external fabric. This field does not apply to MSEB ports. Possible values are:

OK           No Error
LINK DEAD    Link is dead
WRONG FAB    Wrong Fabric
INVLD NBR    Neighbor Invalid
INVLD PORT   Invalid Port
MIXED GUID   Mixed Globally Unique ID
INVLD PRTNM  Invalid Part Number
INVLD VERID  Invalid Version ID
MIXED CNFTG  Mixed Configuration Tag
INVLD CNFTG  Invalid Configuration
UNINITALZED  Uninitialized

Target Output Port
indicates whether the target output port is enabled. Possible values are ENABLED and DISABLED.

Node Number Mask
is a bit-mask indicating which bits of the node routing ID are valid.

Node Routing ID
is the node number routing ID. For the MSEB, the ID is configured on the NNA PIC. For the cluster switch port, the ID is assigned by the external fabric.

SvNet Node Number
is a number in the range 1 through 24 used to route ServerNet packets across the external ServerNet X or Y fabrics. The ServerNet node number is assigned based on the port to which the node is connected on the ServerNet II Switch.

NNA Prog State
indicates the programming state of the NNA PIC on the MSEB. This field does not apply to the ServerNet II Switch port. Possible values are ENABLED, DISABLED, RESET, EN IBC ONLY, and CONFIG ERR.

STATUS CONNECTION, NNA Command Example

The following example shows the STATUS CONNECTION command with the NNA option:

> SCF STATUS CONN $ZZSMN, NNA

STATUS CONNECTION              X Fabric                  Y Fabric
|---------------------------------------------------------------------------|
| Command Status     | OK                        | OK                       |
| Status Detail      | No Status Detail          | No Status Detail         |
|--------------------|---------------------------|--------------------------|
| NNA Registers      | Outbound    | Inbound     | Outbound    | Inbound    |
|--------------------|-------------|-------------|-------------|------------|
| NNA Reg Data Valid | TRUE        | TRUE        | TRUE        | TRUE       |
| Node Routing ID    | 0x00d8000   | 0x00e0000   | 0x00d8000   | 0x00e0000  |
| Node Number Mask   | 0x000fc000  | 0x000fc000  | 0x000fc000  | 0x000fc000 |
| Mode Control       | 0x0000000a  | 0x0000000d  | 0x0000000a  | 0x0000000d |
| Busy Scaler        | 0x00000080  | 0x00000080  | 0x00000080  | 0x00000080 |
| Ready Scaler       | 0x00000001  | 0x00000001  | 0x00000001  | 0x00000001 |
| High Threshold     | 0x00000006  | 0x00000006  | 0x00000006  | 0x00000006 |
| Low Threshold      | 0x00000001  | 0x00000001  | 0x00000001  | 0x00000001 |
| Accumulator        | 0x00000000  | 0x00000000  | 0x00000000  | 0x00000000 |
| DID Check Error    | FALSE       | TRUE        | FALSE       | FALSE      |
| CRC Error          | FALSE       | TRUE        | FALSE       | FALSE      |
| Lost Optical Signl | FALSE       | FALSE       | FALSE       | FALSE      |
| Force Idling       | FALSE       | FALSE       | FALSE       | FALSE      |
| Busy Limit         | FALSE       | FALSE       | FALSE       | FALSE      |
|--------------------|-------------|-------------|-------------|------------|

In this example:

Command Status
is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37.

Status Detail
is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37.
NNA Reg Data Valid
indicates whether the node-numbering agent (NNA) register contents shown in the following lines are valid. Possible values are TRUE and FALSE.

Node Routing ID
shows the node number routing IDs (outbound and inbound) for both external fabrics.

Node Number Mask
shows the contents of the node number mask registers (outbound and inbound) for both external fabrics. The node number mask is a bit-mask indicating which bits of the node routing ID are valid.

Mode Control
shows the contents of the mode control registers (outbound and inbound) for both external fabrics. The mode control registers are used to select the NNA mode of operation (pass-through, clock-checking, or conversion).

Busy Scaler
shows the contents of the busy scaler registers (outbound and inbound) for both external fabrics. The busy scaler registers contain data used to scale the busy symbol counting mechanism.

Ready Scaler
shows the contents of the ready scaler registers (outbound and inbound) for both external fabrics. The ready scaler registers contain data used to scale the ready symbol counting mechanism.

High Threshold
shows the contents of the high threshold registers (outbound and inbound) for both external fabrics. The high threshold registers are used to set the upper limit for comparison to the accumulator contents.

Low Threshold
shows the contents of the low threshold registers (outbound and inbound) for both external fabrics. The low threshold registers are used to set the lower limit for comparison to the accumulator contents.

Accumulator
shows the contents of the accumulators (outbound and inbound) for both external fabrics.

DID Check Error
indicates whether a Destination ServerNet ID check error occurred. Possible values are TRUE and FALSE.

CRC Error
indicates whether a Cyclic Redundancy Check error occurred. Possible values are TRUE and FALSE.

Lost Optical Signl
indicates whether a lost optical signal error occurred. Possible values are TRUE and FALSE.

Shutdown Occurred
indicates whether a shutdown occurred. Possible values are TRUE and FALSE.

Force Idling
indicates whether force idling is occurring. Possible values are TRUE and FALSE.

Busy Limit
indicates whether a busy limit error occurred. Possible values are TRUE and FALSE.
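Because the node number mask indicates which bits of the node routing ID are valid, the six-bit routing ID can be extracted by masking and shifting. The following Python sketch is illustrative only, using the register values from the display above; the mapping from routing ID to ServerNet node number is assigned by the cluster switch and is not computed here:

  def routing_id_bits(node_routing_id, node_number_mask):
      """Extracts the bits of the node routing ID selected by the mask."""
      # Shift amount = position of the lowest set bit in the mask.
      shift = (node_number_mask & -node_number_mask).bit_length() - 1
      return (node_routing_id & node_number_mask) >> shift

  # Outbound and inbound register values from the example display:
  print(hex(routing_id_bits(0x000D8000, 0x000FC000)))  # 0x36 (outbound)
  print(hex(routing_id_bits(0x000E0000, 0x000FC000)))  # 0x38 (inbound)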
ServerNet Cluster Manual— 520575-002 9- 24 SCF Commands for the External ServerNet SAN Manager Subsystem STATUS SWITCH Command Example STATUS SWITCH Command Example The following example shows the STATUS SWITCH $ZZSMN command: > SCF STATUS SWITCH $ZZSMN STATUS SWITCH X Fabric Y Fabric |---------------------------------------------------------------------------| |Command Status | OK | OK | |Status Detail | No Status Detail | No Status Detail | |-------------------|---------------------------|---------------------------| |General Status | ServerNet Switch | ServerNet Switch | |-------------------|---------------------------|---------------------------| |FW/HW Status Valid | TRUE | TRUE | |At Least One Error | FALSE | FALSE | |Unknown Error | No Error Detected | No Error Detected | |Switch Response | PASSED | PASSED | |Switch Ownership | No Owner | No Owner | |Owner ID | N/A | N/A | |-------------------|---------------------------|---------------------------| |Firmware Status | ServerNet Switch | ServerNet Switch | |-------------------|---------------------------|---------------------------| |Program Checksum | Status OK | Status OK | |Dflt Config Loaded | FALSE | FALSE | |New Conf Aftr Load | Status OK | Status OK | |Bth Cnfs Aftr Load | Status OK | Status OK | |SP Ran Ov Allc Sp | FALSE | FALSE | |Program CRC | Status OK | Status OK | |Router Config CRC | Status OK | Status OK | |Firmware Images | Images The Same | Images The Same | |Config Images | Images The Same | Images The Same | |-------------------|---------------------------|---------------------------| |Hardware Status | ServerNet Switch | ServerNet Switch | |-------------------|---------------------------|---------------------------| |SRAM Memory Test | PASSED | PASSED | |Flash Sector Test | PASSED | PASSED | |SEEPROM Checksum | Status OK | Status OK | |SBUSY Error | No Error Detected | No Error Detected | |FLASH Program Err | No Error Detected | No Error Detected | |Power Supply Fan | Status OK | Status OK | |Router Self Check | FALSE | FALSE | |Flash ID String | Status OK | Status OK | |Flash Boot Lckot 0 | No Error Detected | No Error Detected | |Flash Boot Lckot 1 | No Error Detected | No Error Detected | |-------------------|---------------------------|---------------------------| |UPS Status | UPS | UPS | |-------------------|---------------------------|---------------------------| |Cable Detected | TRUE | TRUE | |UPS Response | PASSED | PASSED | |UPS On Ok Status | NORMAL | NORMAL | |Mode Of Operation | Line Operation | Line Operation | |Battery Voltage | Status OK | Status OK | |Battery Management | FLOATING | RESTING | |Line Regulation | Step Down (Buck) | Step Down (Buck) | |Attention Required | NONE | NONE | |Immediate Attn Req | NONE | NONE | |Backup Time | 44 minutes or more | 44 minutes or more | |Nominal Input Volt | 0120.70 | 0120.60 | |-------------------|---------------------------|---------------------------| |AC Power Status | AC Transfer Switch | AC Transfer Switch | |-------------------|---------------------------|---------------------------| ServerNet Cluster Manual— 520575-002 9- 25 SCF Commands for the External ServerNet SAN Manager Subsystem |Primary Power Rail | ON |Scndary Power Rail | ON STATUS SWITCH Command Example | ON | ON | |-------------------|---------------------------|---------------------------| |Switch Port Status | ST LS LR LT PR PT OS NB TP| ST LS LR LT PR PT OS NB TP| |-------------------|---------------------------|---------------------------| |Switch Port (00) | LD S2 DS DS DS DS PR NC 
DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (01) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| |Switch Port (02) | LD S2 DS DS DS DS PR NC DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (03) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| |Switch Port (04) | LD S2 DS DS DS DS PR NC DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (05) | LD S2 DS DS DS DS PR NC DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (06) | LD S2 DS DS DS DS PR NC DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (07) | LD S2 DS DS DS DS PR NC DS| LD S2 DS DS DS DS LS NC DS| |Switch Port (08) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| |Switch Port (09) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| |Switch Port (10) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| |Switch Port (11) | LA S2 EN EN EN EN PR OK EN| LA S2 EN EN EN EN PR OK EN| | | | | |Blinking LED Ports | No Port LEDs Blinking | No Port LEDs Blinking | |-------------------|---------------------------|---------------------------| | | Nearest Switch Port Numbr | Nearest Switch Port Numbr | |-------------------|---------------------------|---------------------------| |Port Neighbor Data | 08 | 09 | 10 | 11 | 08 | 09 | 10 | 11 | |-------------------|------|------|------|------|------|------|------|------| |Nghbor Port Number | 10 | 11 | 08 | 09 | 10 | 11 | 08 | 09 | |Fabric Setting | X | X | X | X | Y | Y | Y | Y | |Globally Unique ID |VOXY36|VOXY36|VOXE6L|VOXE6L|VOXE61|VOXE61|VOXY38|VOXY38| |Config Version ID | S2C0 | S2C0 | S2C0 | S2C0 | S2C0 | S2C0 | S2C0 | S2C0 | |Configuration Tag | 10003| 10003| 10004| 10004| 10003| 10003| 10004| 10004| |Config Revision | 2_01 | 2_01 | 2_01 | 2_01 | 2_01 | 2_01 | 2_01 | 2_01 | |Firmware Revision |3_0_52|3_0_52|3_0_52|3_0_52|3_0_52|3_0_52|3_0_52|3_0_52| |---------------------------------------------------------------------------| Total Errors = 0 Total Warnings = 0 In this example: Command Status is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37. Status Detail is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37 FW/HW Status Valid indicates whether the firmware and hardware status information is valid. Possible values are TRUE or FALSE. At Least One Error indicates whether any hardware or firmware errors have been detected. Possible values are TRUE or FALSE. ServerNet Cluster Manual— 520575-002 9- 26 SCF Commands for the External ServerNet SAN Manager Subsystem STATUS SWITCH Command Example Unknown Error indicates whether an error of unknown type (hardware or firmware) occurred. Possible values are Error Detected or No Error Detected. Switch Response indicates the success of cluster switch response test. Possible values are PASSED or FAILED. Switch Ownership indicates whether there is an owner for the cluster switch. Ownership is required for some sensitive commands. Possible values are Owned, No Owner, and Locally Owned. Owner ID indicates the ID of the cluster switch owner if there is an owner. Otherwise the value is N/A (not applicable). Program Checksum indicates the status of the program checksum. Possible values are Status OK and Status Bad. Dflt Config Loaded indicates whether the factory default router configuration is loaded. Possible values are TRUE or FALSE. New Conf Aftr Load indicates the status of a new configuration after a load. Possible values are Status OK and Status Bad. 
If the status is bad, the previous configuration is used.

Bth Cnfs Aftr Load
indicates the status of both configurations stored in the ServerNet II Switch memory after a load. Possible values are Status OK and Status Bad. If the status is bad, the previous configuration is used.

SP Ran Ov Allc Sp
indicates whether the service processor ran over the allocated space. Possible values are TRUE or FALSE.

Program CRC
indicates the status of the program cyclic redundancy check (CRC). Possible values are Status OK and Status Bad.

Router Config CRC
indicates the status of the router configuration cyclic redundancy check (CRC). Possible values are Status OK and Status Bad.

Firmware Images
indicates whether the firmware images stored in FLASH memory are the same. Possible values are Images The Same and Images Different.

Config Images
indicates whether the configuration images stored in FLASH memory are the same. Possible values are Images The Same and Images Different.

SRAM Memory Test
indicates the status of the SRAM memory test. Possible values are PASSED or FAILED.

Flash Sector Test
indicates the status of the Flash Sector test. Possible values are PASSED or FAILED.

SEEPROM Checksum
indicates the status of the SEEPROM checksum. Possible values are Status OK and Status Bad.

SBUSY Error
indicates whether an SBUSY error occurred after FPGA access to the router. Possible values are No Error Detected and Error Detected.

FLASH Program Err
indicates whether a FLASH program error occurred while attempting to program the FLASH memory. Possible values are No Error Detected and Error Detected.

Power Supply Fan
indicates the status of the power supply fans. Possible values are Status OK and Status Bad.

Router Self Check
indicates whether the router self-check detected an error. Possible values are No Error Detected and Error Detected.

Flash ID String
indicates whether the manufacturer code and device code stored in FLASH memory are the expected values. Possible values are Status OK and Status Bad.

Flash Boot Lckot 0
indicates whether an error occurred because the FLASH lower boot block section is locked. Possible values are No Error Detected and Error Detected.

Flash Boot Lckot 1
indicates whether an error occurred because the FLASH upper boot block section is locked. Possible values are No Error Detected and Error Detected.

Cable Detected
indicates whether the power status cables of the UPS subcomponent are connected. Possible values are TRUE or FALSE.

UPS Response
indicates the result of the UPS response test. Possible values are PASSED or FAILED.

UPS On Ok Status
indicates the condition of the uninterruptible power supply (UPS) subcomponent of the cluster switch. Possible values are NORMAL and ABNORMAL.

Mode Of Operation
describes how the ServerNet II Switch is currently receiving power. Possible values are Battery Operation and Line Operation.

Battery Voltage
is the battery voltage. Possible values are Status OK and LOW.

Battery Management
describes the current operational state of the battery contained within the UPS.
Possible values are:

• CHARGING
• DISCHARGING
• FLOATING
• RESTING

Line Regulation
indicates the status of line regulation. Possible values are:

• Normal Straight Through
• Step Down (Buck)
• Step Up (Boost)

Attention Required
describes a condition on the UPS that requires attention. Possible values are:

• NONE
• Ground Failure
• Battery Failure
• Overloaded

Immediate Attn Req
describes a condition on the UPS that requires urgent attention. Possible values are:

• NONE
• Backfeed Contact Failure
• Overload (UPS Shut Down)
• Inverter Under Voltage

Backup Time
is the remaining backup time of the UPS subcomponent of the cluster switch.

Nominal Input Volt
is the input voltage of the UPS subcomponent of the cluster switch.

Primary Power Rail
indicates the status of the Primary Power Rail. Possible values are ON or OFF.

Scndary Power Rail
indicates the status of the Secondary (Backup) Power Rail. Possible values are ON or OFF.

Switch Port (xx)
displays nine types of status codes for each of the 12 ports on the ServerNet II Switch. The ports are numbered in the range 00 through 11. Table 9-4 describes the status variables and lists possible values.

Table 9-4. Switch Port Status Codes and Possible Values

Status Variables           Possible Values
Code  Description          Code  Description
ST    Status               LA    Link Alive
                           LD    Link Dead
                           UN    Uninstalled
                           RS    Reset
                           UK    Unknown
LS    Link Speed           S1    ServerNet 1
                           S2    ServerNet 2
                           UK    Unknown
LR    Link Receives        EN    Enabled
                           DS    Disabled
                           UK    Unknown
LT    Link Transmits       EN    Enabled
                           DS    Disabled
                           UK    Unknown
PR    Packet Receives      EN    Enabled
                           DS    Disabled
                           UK    Unknown
PT    Packet Transmits     EN    Enabled
                           DS    Disabled
                           UK    Unknown
OS    Optical Signal       PR    Present
                           LS    Lost
                           UK    Unknown
NB    Neighbor Status      OK    No Error
                           LD    Link Dead
                           WF    Wrong Fabric
                           IV    Neighbor Invalid
                           IP    Invalid Port
                           MG    Mixed Globally Unique ID
                           BP    Invalid Part Number
                           BV    Invalid Version ID
                           MC    Mixed Configuration Tag
                           IC    Invalid Configuration
                           DU    Disabled, Unknown Reason
                           UN    Uninitialized
                           UK    Unknown
TP    Target Port Enabled  EN    Enabled
                           DS    Disabled
                           UK    Unknown

Blinking LED Ports
indicates which ports on the ServerNet II Switch have their LEDs blinking. Possible values are:

• No Port LEDs Blinking
• All Port LEDs Blinking
• A horizontal listing of the port numbers whose LEDs are blinking

Nghbor Port Number
is the port number on the neighbor cluster switch to which the local cluster switch is connected.

Fabric Setting
is the external ServerNet fabric of the neighbor cluster switch. Possible values are X, Y, or NONE.

Globally Unique ID
is the Globally Unique ID (GUID) of the neighbor cluster switch.

Configuration Tag
is the last six digits of the configuration tag specified for the neighbor cluster switch. For more information, see Configuration Tag, defined earlier in this section.

Config Revision
is the version of the configuration running on the neighbor cluster switch.

Firmware Revision
is the version of the firmware running on the neighbor cluster switch.
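A Switch Port row such as "LA S2 EN EN EN EN PR OK EN" packs the nine status variables of Table 9-4 into fixed positions. The following Python sketch is an illustrative decoder, not a supported tool; its code tables are abbreviated to a subset of the values in Table 9-4:

  # Order of the nine status variables in a Switch Port row (Table 9-4).
  FIELDS = ("ST", "LS", "LR", "LT", "PR", "PT", "OS", "NB", "TP")

  # Abbreviated code meanings; see Table 9-4 for the complete lists.
  COMMON = {"EN": "Enabled", "DS": "Disabled", "UK": "Unknown"}
  CODES = {
      "ST": {"LA": "Link Alive", "LD": "Link Dead", "UN": "Uninstalled", "RS": "Reset"},
      "LS": {"S1": "ServerNet 1", "S2": "ServerNet 2"},
      "OS": {"PR": "Present", "LS": "Lost"},
      "NB": {"OK": "No Error", "LD": "Link Dead", "WF": "Wrong Fabric"},
  }

  def decode_port_row(row):
      """row: one Switch Port value string, e.g. 'LA S2 EN EN EN EN PR OK EN'.
      Codes not present in the abbreviated tables are passed through as-is."""
      result = {}
      for field, code in zip(FIELDS, row.split()):
          table = CODES.get(field, COMMON)
          result[field] = table.get(code, COMMON.get(code, code))
      return result

  print(decode_port_row("LA S2 EN EN EN EN PR OK EN"))
  # {'ST': 'Link Alive', 'LS': 'ServerNet 2', ..., 'NB': 'No Error', 'TP': 'Enabled'}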
STATUS SWITCH, ROUTER Command Example

The following example shows the STATUS SWITCH command with the router option:

> SCF STATUS SWITCH $ZZSMN, ROUTER

| STATUS SWITCH      |        X Fabric           |        Y Fabric          |
|---------------------------------------------------------------------------|
| Command Status     | OK                        | OK                       |
| Status Detail      | No Status Detail          | No Status Detail         |
|--------------------|---------------------------|--------------------------|
| Router Status      | ServerNet Switch          | ServerNet Switch         |
|--------------------|---------------------------|--------------------------|
| Router Data Valid  | TRUE                      | TRUE                     |
| Inport Routing (00)| IDLE                      | IDLE                     |
| Inport Routing (01)| IDLE                      | EBLK, ROUTING TO (07)    |
| Inport Routing (02)| IDLE                      | IDLE                     |
| Inport Routing (03)| IDLE                      | IDLE                     |
| Inport Routing (04)| NBLK, ROUTING TO (04)     | IDLE                     |
| Inport Routing (05)| IDLE                      | IDLE                     |
| Inport Routing (06)| IDLE                      | IDLE                     |
| Inport Routing (07)| IDLE                      | IDLE                     |
| Inport Routing (08)| IDLE                      | IDLE                     |
| Inport Routing (09)| IDLE                      | IDLE                     |
| Inport Routing (10)| IDLE                      | IDLE                     |
| Inport Routing (11)| IDLE                      | IBLK, ROUTING TO (11)    |
|--------------------|---------------------------|--------------------------|

In this example:

Command Status

is the general condition of the connection. For a list of possible values, see Command Status Enumeration on page 9-37.

Status Detail

is the specific condition of the connection. For a list of possible values, see Status Detail Enumeration on page 9-37.

Router Data Valid

indicates whether the routing data in the following lines is valid. Possible values are TRUE and FALSE.

Inport Routing (nn)

indicates the status of inport routing for each port, where nn is the number of the ServerNet II Switch port. Possible values are:

•  NBLK (not blocked)
•  EBLK (externally blocked)
•  IBLK (internally blocked)
•  IDLE

TRACE Command

The TRACE command is a sensitive command. The TRACE command:

•  Starts a trace operation on a ServerNet SAN manager process
•  Alters trace parameters set by a previous TRACE command
•  Stops a previously requested trace operation

The TRACE command syntax is:

TRACE [ /OUT file-spec/ ] PROCESS $ZZSMN, { TO file-ID [, trace-option ... ] | STOP [, BACKUP ] }

trace-option is

   BACKUP
   { COUNT records | PAGES pages }
   RECSIZE bytes
   SELECT tracelevel
   { WRAP | NOWRAP }

OUT file-spec

causes any SCF output generated for this command to be directed to the specified file.

TO file-ID

activates a trace and specifies the name of the file into which trace data is to be collected. This option is required when you are starting a trace operation. If the file already exists, it is purged of data before the trace is initiated. If the file does not exist, it is created with an extent size based on the value of the PAGES parameter. If TO is not specified, the existing trace is either stopped (if STOP is specified) or modified as specified in the trace-option.

BACKUP

specifies that the backup process should receive the trace request.

COUNT records

specifies the number of trace records to be captured. The trace terminates after that number of records has been collected.

PAGES pages

designates how much memory space is allocated in the extended data segment used for tracing. The trace terminates after that number of pages of trace data has been collected.

RECSIZE bytes

specifies the maximum size for any trace data record. Larger records are truncated.

SELECT tracelevel

identifies the kind of trace data to be collected. Currently, only PROCESS is supported.

WRAP | NOWRAP

WRAP specifies that when the trace disk file end-of-file mark is reached, trace data wraps around to the beginning of the file and overwrites any existing data. If you omit this option, the default is for wrapping to be turned off (NOWRAP).

STOP

ends the current trace operation.

Considerations

•  With the following exception, only a single ServerNet SAN manager process trace can be running at any one time. Exception: one trace can run on the primary and another on the backup ServerNet SAN manager process if the tracing is started immediately on the backup (as opposed to when the backup takes over).
•  The BACKUP option takes effect as the backup process takes over from the primary.
•  For more information on tracing, including the ranges and defaults of the attributes listed above, see the PTrace Reference Manual.

Examples

•  The following SCF command starts a trace operation on the ServerNet SAN manager process and writes results into the file named $DATA.SANMAN.TRACE:

   > TRACE PROCESS $ZZSMN, TO $DATA.SANMAN.TRACE

•  The following SCF command starts a trace operation on the ServerNet SAN manager process, writes results into the file named $DATA.SANMAN.TRACE, and allows trace data at the beginning of the file to be overwritten when the end of file (EOF) is reached:

   > TRACE PROCESS $ZZSMN, TO $DATA.SANMAN.TRACE, WRAP

•  The next example stops the trace:

   > TRACE PROCESS $ZZSMN, STOP

VERSION Command

The VERSION command displays version information about the ServerNet SAN manager process. VERSION is a nonsensitive command.

The VERSION command syntax is:

VERSION [ /OUT file-spec/ ] PROCESS $ZZSMN [ , DETAIL ]

OUT file-spec

causes any SCF output generated for this command to be directed to the specified file.

DETAIL

designates that complete version information is to be returned. If DETAIL is omitted, a single line of version information is returned.

Examples

The following examples display the ServerNet SAN manager product name, product number, and release date. The exact format of the output depends on the actual version of the objects involved.

> VERSION PROCESS $ZZSMN
VERSION PROCESS \SYS.$ZZSMN: SMN - T0502G08 - (02JUL01) - AAG

The following example shows the information returned by the VERSION, DETAIL command:

> VERSION PROCESS $ZZSMN, DETAIL
Detailed VERSION PROCESS \SYS.$ZZSMN
SYSTEM \SYS
SMN - T0502G08 - (02JUL01) - AAG
GUARDIAN - T9050 - (Q06)
SCF KERNEL - T9082G02 - (14JAN02) (03JAN02)
SMN PM - T0502G08 - (02JUL01) - AAG

The following descriptions explain the fields returned by the VERSION and VERSION, DETAIL commands:

SMN - T0502G08 - (02JUL01) - AAG

identifies the version of the external ServerNet SAN manager process and the release date.

GUARDIAN - T9050 - (Q06)

identifies the version of the operating system.

SCF KERNEL - T9082G02 - (14JAN02) (03JAN02)

identifies the version of the SCF Kernel and the release date.
SMN PM - T0502G08 - (02JUL01) - AAG

identifies the version of the SCF product module (T0502G08) and the release date.

Command Status Enumeration

The following list contains the values that might appear in the Command Status field of the INFO CONNECTION, INFO SWITCH, STATUS CONNECTION, and STATUS SWITCH command displays:

•  OK
•  Low Resources
•  No Response
•  NACK Response
•  Partial Response
•  Abort Due to Pfail
•  Download Failure
•  Ownership Error

Status Detail Enumeration

The following list contains the values that might appear in the Status Detail field of the INFO CONNECTION, INFO SWITCH, STATUS CONNECTION, and STATUS SWITCH commands:

•  No Detail
•  MSEB Missing
•  Link Dead
•  Processor Fabric Down
•  Type Not MSEB
•  Bad SCB Loaded
•  No NNA PIC
•  No MSEB Config Record
•  Bad MSEB Config Record
•  MSEB Config Fetch Error
•  SP I/O Library Call Error
•  Internal Sys Fabric Down
•  NNA Verify Failure
•  System Power Up
•  TNET Initialization Error
•  Invalid Fabric Parameter
•  Too Many Switches

10 SCF Error Messages

This section describes the types of error messages generated by SCF and provides the cause, effect, and recovery information for the SCF error messages specific to the ServerNet cluster subsystem and the external system area network manager process (SANMAN). This section includes the following main topics:

Heading                                      Page
Types of SCF Error Messages                  10-1
ServerNet Cluster (SCL) Error Messages       10-2
SANMAN (SMN) Error Messages                  10-7
If You Have to Call Your Service Provider    10-12

Types of SCF Error Messages

Command Parsing Error Messages

Command parsing error messages are generated when a command is being broken down into its component parts. These error messages have no associated error number and are generally self-explanatory. For example:

Expecting an existing SCF supported object name
Expecting an SCF command or a program file

SCF-Generated Numbered Error Messages

SCF-generated numbered error messages are generated by SCF and begin at 20000. For example:

SCF E20211 Invalid object type

Common Error Messages

SCF provides a pool of error messages, called common errors, that can be used by all subsystems. These errors always have negative error numbers. Each error message is preceded by the name of the subsystem in which the error is encountered and a character type code (E for critical or W for noncritical). For example:

SCL E-00005 Command is not supported by this subsystem

SCL Subsystem-Specific Error Messages

Error messages specific to the ServerNet cluster subsystem are generated by and pertain solely to the ServerNet cluster subsystem. These errors always have positive error numbers.

Like common errors, subsystem-specific error messages are divided into two classes: critical and noncritical.

•  Critical messages can be serious, such as the notification of software errors for which there is no automatic recovery. Critical messages are preceded by an “E.”
•  Noncritical messages are generally informational. Noncritical messages are preceded by a “W.”

SCF Error Messages Help

To request help for any SCF error message, type:

-> HELP subsystem error-number

For example, if the following messages appeared on your terminal:

SCL E00003 Internal error. Case value out of range.
SMN E00006 Processor switch failed.
SCF E20211 Invalid object type

You could display additional information by typing:

-> HELP SCL 3
-> HELP SMN 6
-> HELP SCF 20211

ServerNet Cluster (SCL) Error Messages

The ServerNet cluster subsystem SCF error messages are listed in numeric order.

SCL Error 00001

SCL E00001 Internal error. Call to system procedure failed.

Cause. An internal error was caused by an unexpected return code from a system procedure.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Contact your service provider. (See If You Have to Call Your Service Provider on page 10-12.)

SCL Error 00002

SCL E00002 Duplicate attribute.

Cause. You specified the same attribute more than once in the command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Remove the duplicate attribute and reissue the command.

SCL Error 00003

SCL E00003 Internal error. Case value out of range.

Cause. An invalid case value was generated with no associated case label.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Contact your service provider. (See If You Have to Call Your Service Provider on page 10-12.)

SCL Error 00004

SCL E00004 Invalid MsgMon process qualifier.

Cause. The optional MSGMON qualifier was not formatted correctly. This qualifier applies only to the TRACE PROCESS $ZZSCL command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Correct the MSGMON qualifier and reissue the command. The syntax for the qualifier is ZIMnn, where nn is in the range 00 through 15.

SCL Error 00005

SCL E00005 Invalid MsgMon qualifier range.

Cause. The MSGMON qualifier contains a numerical value representing the processor of a MSGMON process. This number is larger than 15.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Correct the MSGMON qualifier and reissue the command.

SCL Error 00006

SCL E00006 Not supported by the down-version system.

Cause. The command was rejected by the ServerNet cluster monitor process ($ZZSCL) because the information requested by the command is for an earlier RVU.

Effect. The command is not executed. SCF waits for the next command.

Recovery. It is not possible to request information from a system running an incompatible version of SCF. Contact your service provider to resolve the version mismatch. (See If You Have to Call Your Service Provider on page 10-12.)

SCL Error 00007

SCL E00007 Failure in service function. error: err-num, error detail: err-detail.

err-num is the error number returned from a system procedure.

err-detail is the error detail subcode associated with the system procedure error.

Cause. An unexpected error was returned from a system procedure that was called by the ServerNet cluster monitor process. For information on file-system errors and error detail subcodes, see the Guardian Procedure Errors and Messages Manual.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Consult the documentation on the returned file-system error to determine what to do next. Check the event logs for more information pertaining to the problem.

SCL Error 00008

SCL E00008 Unexpected error returned from $ZCNF, error err-num, error detail: filesys-err.

err-num is the error number returned from the Configuration Manager process $ZCNF.
filesys-err is the Guardian file-system error number. For information on file-system errors, see the Guardian Procedure Errors and Messages Manual.

Cause. An unexpected error was returned from the $ZCNF process. The system configuration database may be corrupted.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Verify the database record using the SCF INFO command. If necessary, reload the system using a saved version of the system configuration database. If the problem persists, contact your service provider. (See If You Have to Call Your Service Provider on page 10-12.)

SCL Error 00009

SCL E00009 Processor switch failed.

Cause. A processor switch was not performed. The process pair continues to execute in the current processor(s).

Effect. The command is not executed. SCF waits for the next command.

Recovery. Verify the processors used by the ServerNet cluster monitor process. Retry the command if necessary.

SCL Error 00010

SCL E00010 MsgMon process does not exist.

Cause. The MSGMON process does not exist. The probable cause is that the processor hosting the MSGMON process is not running.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Make sure the processor hosting the MSGMON process is loaded. Then reissue the command.

SCL Error 00011

SCL E00011 Subsystem start failure.

Cause. The ServerNet cluster subsystem failed to initiate START processing.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Make sure ServerNet cables are connected from the node to the ServerNet II Switch. The subsystem cannot be started until the cables are connected and the external system area network (SAN) manager process (SANMAN) has communicated with the ServerNet II Switch. Check the event logs for error messages. Once any problems have been resolved, reissue the command.

SCL Error 00012

SCL E00012 Subsystem shutdown failure.

Cause. The subsystem failed to initiate shutdown processing.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Check the event logs for error messages. Once any problems have been resolved, reissue the command.

SCL Error 00013

SCL E00013 Trace command error.

Cause. The subsystem failed to execute the TRACE PROCESS command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Correct the command and reissue it. You can also check the event logs for additional error messages.

SCL Error 00014

SCL E00014 PROBLEMS attribute must be specified without any other attributes.

Cause. One or more attributes, such as DETAIL, were specified with the PROBLEMS option.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command without the PROBLEMS attribute or make PROBLEMS the only attribute specified in the command.

SANMAN (SMN) Error Messages

The ServerNet SAN manager subsystem SCF error messages are listed in numeric order.

SMN Error 00001

SMN E00001 Internal error: Call to system procedure failed.

Cause. An internal error was caused by an unexpected return code from a system procedure.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Contact your service provider.
(See If You Have to Call Your Service Provider on page 10-12.)

SMN Error 00002

SMN E00002 Duplicate attribute.

Cause. You entered the same attribute more than once in the command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Remove the duplicate attribute and reissue the command.

SMN Error 00003

SMN E00003 Internal error. Case value out of range.

Cause. An invalid case value was generated with no associated case label.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Contact your service provider. (See If You Have to Call Your Service Provider on page 10-12.)

SMN Error 00004

SMN E00004 Not supported by the down-version system.

Cause. The command was rejected by the external ServerNet SAN manager process ($ZZSMN) because the information requested by the command is for an earlier RVU.

Effect. The command is not executed. SCF waits for the next command.

Recovery. It is not possible to request information from a system running an incompatible version of SCF. Contact your service provider to resolve the version mismatch. (See If You Have to Call Your Service Provider on page 10-12.)

SMN Error 00005

SMN E00005 Failure in service function. error: err-num, error detail: err-detail.

err-num is the error number returned from a system procedure.

err-detail is the error detail subcode associated with the system procedure error.

Cause. An unexpected error was returned from a system procedure that was called by the external ServerNet SAN manager process ($ZZSMN). For information on file-system errors and error detail subcodes, see the Guardian Procedure Errors and Messages Manual.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Consult the documentation on the returned file-system error to determine what to do next. Check the event logs for more information pertaining to the problem.

SMN Error 00006

SMN E00006 Processor switch failed.

Cause. The processor switch could not be completed. You might have specified a nonexistent or halted processor as the primary processor.

Effect. The command is not executed. The process pair continues to execute in the current processor(s). SCF waits for the next command.

Recovery. Verify the processors in use by the external ServerNet SAN manager process pair.

SMN Error 00007

SMN E00007 Trace command error.

Cause. The subsystem could not execute the TRACE PROCESS command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Retype the command, making sure you type it correctly.

SMN Error 00008

SMN E00008 Error returned from the external ServerNet SAN manager.

Cause. An unexpected error was returned from the external ServerNet SAN manager process ($ZZSMN) during the processing of a command. This error is often preceded by one or two lines of additional text providing more specific information about the error. For example:

***ERROR: Invalid Parameter
***ERROR DETAIL: Bad Fabric ID Setting

Effect. The command is not executed. SCF waits for the next command.

Recovery. Correct the cause of the error and then reissue the command.

SMN Error 00009

SMN E00009 Only one reset type attribute value (SOFT or HARD) is allowed.

Cause. The reset type attribute values SOFT and HARD were both specified in a single command.

Effect. The command is not executed. SCF waits for the next command.
Recovery. Reissue the command using only one reset type attribute value (SOFT or HARD).

SMN Error 00010

SMN E00010 A reset type attribute value (SOFT or HARD) is required.

Cause. A reset type attribute value (SOFT or HARD) was not specified.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify a reset type attribute value (SOFT or HARD).

SMN Error 00011

SMN E00011 A NEAREST switch fabric attribute value (X or Y) is required.

Cause. A NEAREST switch fabric attribute value (X or Y) was not specified.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify a NEAREST switch fabric attribute value (X or Y).

SMN Error 00012

SMN E00012 The command can only be executed interactively.

Cause. A sensitive command was executed noninteractively (for example, using an OBEY file).

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command from an interactive SCF session.

SMN Error 00013

SMN E00013 Only a single attribute can be specified for each ALTER command.

Cause. More than one attribute was specified in a single ALTER command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command with only one attribute (FABRIC, LOCATOR, or BLINK).

SMN Error 00014

SMN E00014 An attribute (FABRIC, LOCATOR, or BLINK) to ALTER is required.

Cause. An attribute (FABRIC, LOCATOR, or BLINK) was not specified in an ALTER command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify an attribute (FABRIC, LOCATOR, or BLINK) to ALTER.

SMN Error 00015

SMN E00015 Specifying both FIRMWARE and CONFIG attributes is not allowed.

Cause. Both load file attributes (FIRMWARE and CONFIG) were specified in a LOAD command.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify just one load file attribute (either FIRMWARE or CONFIG).

SMN Error 00016

SMN E00016 The POSITION attribute cannot be used with the FIRMWARE attribute.

Cause. The POSITION attribute was specified in a command with the FIRMWARE attribute.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command without the POSITION attribute.

SMN Error 00017

SMN E00017 A load file attribute (FIRMWARE or CONFIG) is required.

Cause. A load file attribute (FIRMWARE or CONFIG) was not specified.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify a load file attribute of either FIRMWARE or CONFIG.

SMN Error 00018

SMN E00018 The CONFIG attribute requires a POSITION attribute.

Cause. A configuration load file type attribute (CONFIG) was specified without a POSITION attribute to identify the topology position of the cluster switch.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify a POSITION attribute.

SMN Error 00019

SMN E00019 The TOPOLOGY attribute cannot be used with the FIRMWARE attribute.

Cause. The TOPOLOGY attribute was specified in a command with the FIRMWARE attribute.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command without the TOPOLOGY attribute.
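To see how the attribute rules behind SMN errors 00015 through 00019, together with errors 00020 and 00021 that follow, fit together, the following sketch shows command shapes that satisfy them: a CONFIG load carries both POSITION and TOPOLOGY, while a FIRMWARE load carries neither. The file names and the POSITION and TOPOLOGY values here are hypothetical placeholders (POSITION 3 with 24NODES is the combination cited as compatible under error 00020); see the LOAD command description and the SCF help for the exact syntax:

> LOAD SWITCH $ZZSMN, CONFIG $SYSTEM.SYS00.CONFFILE, POSITION 3, TOPOLOGY 24NODES
> LOAD SWITCH $ZZSMN, FIRMWARE $SYSTEM.SYS00.FIRMFILE

Combining FIRMWARE with CONFIG, POSITION, or TOPOLOGY in a single command produces error 00015, 00016, or 00019; omitting POSITION or TOPOLOGY from a CONFIG load produces error 00018 or 00021.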
SMN Error 00020

SMN E00020 The specified POSITION and TOPOLOGY attributes are not compatible.

Cause. The specified POSITION attribute is not compatible with the specified TOPOLOGY attribute.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify POSITION and TOPOLOGY attributes that are compatible. For example, a position of 3 is only allowed with a 24-node (24NODES) topology.

SMN Error 00021

SMN E00021 The CONFIG attribute requires a TOPOLOGY attribute.

Cause. A CONFIG attribute was specified without a TOPOLOGY attribute to identify the topology of the cluster.

Effect. The command is not executed. SCF waits for the next command.

Recovery. Reissue the command and specify a TOPOLOGY attribute.

If You Have to Call Your Service Provider

If the recovery for an error message indicates you should contact your service provider, be prepared to supply the following information. (If the error caused SCF to terminate, reenter SCF.)

1. Enter a LOG command to collect the following displays into a single file. For example:

   -> LOG PROBLEM !

2. Enter a LISTPM command to collect information about the product versions of the SCF components, a list of the product modules on your system, and information about any product modules running when the error occurred. For example:

   -> LISTPM

3. Enter an ENV command to collect information about the SCF environment that was present when the error occurred. For example:

   -> ENV

   If the error caused SCF to terminate, respecify any environmental characteristics that were present when the error occurred.

4. Enter a DETAIL CMDBUFFER command to capture the contents of the SPI buffer. For example:

   -> DETAIL CMDBUFFER, ON

5. Reproduce the sequence of commands that produced the SCF error.

A Part Numbers

For an up-to-date list of part numbers, refer to: NTL Support and Service Library > Service Information > Part Numbers > Part Number List for NonStop S-Series Customer Replaceable Units (CRUs) > ServerNet Cluster (Model 6770).
The following hardware is associated with a ServerNet cluster that contains model 6770 ServerNet Cluster Switches:

•  Modular ServerNet Expansion Board (MSEB)
•  Plug-in cards (PICs) for MSEBs:
   °  ServerNet ECL PIC
   °  ServerNet single-mode fiber NNA PIC
   °  ServerNet serial copper PIC
   °  ServerNet single-mode fiber PIC
   °  PIC blank
   °  PIC blank (with fastening hardware)
•  6770 ServerNet Cluster Switch components:
   °  ServerNet II Switch
   °  AC transfer switch
   °  Uninterruptible power supply, North America
   °  Uninterruptible power supply, International
   °  Uninterruptible power supply, Japan
   °  Power monitor cable, 36 inches
   °  Power monitor cable, 84 inches
   °  Half-height service-side door
•  ServerNet Cluster Switch (model 6770) power cords:
   °  Power cords to primary and secondary power sources:
      North America and Japan (9.1 feet, 18 AWG, 125 volts, 10 amps)
      International (2.5 meters, 250 volts, 10 amps)
   °  Power cord from ServerNet II Switch to UPS:
      North America and Japan (9.1 feet, 18 AWG, 125 volts, 10 amps)
      International (2 meters, 125/250 volts, 10 amps)
   °  Power cord from AC transfer switch to UPS (1 meter, 125/250 volts, 10 amps)
•  Rack mount kit for ServerNet Cluster Switch (model 6770):
   °  UPS, left fixed rail
   °  UPS, right fixed rail
   °  UPS, rail bracket
   °  Cable management tray
   °  ServerNet Switch slide rail bracket
   °  ServerNet Switch slide rails
   °  Cable management assembly arm
   °  Cable management assembly bracket
   °  Torx screws, M5
   °  Cage nuts, M5
   °  Nuts, 8-32 x .375
   °  Flat-head phillips screws, 8-32 x .375
•  Fiber-optic cables
•  ECL ServerNet cables, SEB to SEB
•  ECL ServerNet cables, SEB to MSEB
•  ECL ServerNet cables, MSEB to MSEB
•  Serial copper ServerNet cables

B Blank Planning Forms

This appendix contains blank copies of planning forms:

•  Cluster Planning Work Sheet
•  Planning Form for Moving ServerNet Cables

The Cluster Planning Work Sheet can accommodate information for up to eight nodes. For clusters with more than eight nodes, make additional copies of the Cluster Planning Work Sheet to accommodate the number of nodes in your cluster.

Cluster Planning Work Sheet

[The blank work sheet provides fields for the cluster name, date, and page number, and, for each node: system name, serial number, Expand node number, location, number of processors, model (NonStop S-series), X/Y switch number, X/Y switch port number, and ServerNet node number.]

Planning Form for Moving ServerNet Cables

a. Identify the node whose ServerNet cables will be moved:

   System Name: \_____________________   Expand Node Number: ________________

b. Record the old and new cluster switch port connections:

   [The blank form provides, for each cluster switch port (switches X1/Y1, X2/Y2, and X3/Y3, ports 0 through 7), columns for the system name and Expand node number before moving cables and after moving cables.]

c. List the lines to abort on the node whose ServerNet cables will be moved and on all other nodes:
   [The blank form provides two columns of entry lines, each with a node name (\) and an Expand-over-ServerNet line name ($) to abort: one column for the node whose cables will be moved, and one for all other nodes.]

C ESD Information

Observe these ESD guidelines whenever servicing electronic components:

•  Obtain an electrostatic discharge (ESD) protection kit and follow the directions that come with the kit. You can purchase an ESD kit from HP or from a local electronics store. Ensure that your ESD wriststrap has a built-in series resistor and that the kit includes an antistatic table mat.
•  Before you unpack a replacement CRU, place the CRU package on the antistatic table mat and attach the grounding clip on your wriststrap to the mat.
•  When you unpack the CRU, do not cut into the ESD protective bag surrounding the CRU. The protective bag protects the CRU and can be reused for storing a CRU.
•  Before you move the CRU from the antistatic table mat, attach the grounding clip from your ESD wriststrap to any unpainted metal surface on the CRU frame.
•  Before you bring a CRU in contact with a system enclosure, attach the grounding clip on your ESD wriststrap to any unpainted metal surface on the enclosure frame.
•  When you remove a CRU from a system enclosure, first pull the CRU partly out of the slot and then attach the grounding clip on your ESD wriststrap to any unpainted metal surface on the CRU frame.
•  Store CRUs that require ESD protection in ESD protective bags.

Figure C-1 on page C-2 shows how to use an ESD kit when servicing CRUs.

Figure C-1. Using ESD Protection When Servicing CRUs

[The figure shows a system enclosure (appearance side) with an ESD wriststrap and grounding clip clipped to a door latch stud, ESD floor mats, and an ESD antistatic table mat connected to a soft ground (1 megohm minimum to 10 megohm maximum), with a 15-foot straight ground cord clipped to a screw on a grounded outlet cover.]

D Service Categories for Hardware Components

NonStop S-series hardware components fall into the service categories shown in Table D-1.

Table D-1. Service Categories for Hardware Components (page 1 of 2)

Class 1 CRU and Class 2 CRU examples (CRU or FRU*): disk drives, power cords, fans, ECL ServerNet cables, system consoles, tape drives (some).

Class 1 CRU: A CRU that probably will not cause a partial or total system outage if the documented replacement procedure is not followed correctly. Customers replacing a Class 1 CRU do not require previous experience with replacing NonStop S-series CRUs. However, for some CRUs, customers must be able to use the tools needed for the replacement procedure (these are common tools) and must protect components from electrostatic discharge (ESD).

Class 2 CRU: A CRU that might cause a partial or total system outage if the documented replacement procedure is not followed correctly. Customers replacing Class 2 CRUs should have either three or more months’ experience with replacing NonStop S-series CRUs or equivalent training. Customers must be able to use the tools needed for the replacement procedure and must protect components from ESD.

*The CRUs and FRUs listed do not represent a complete list of CRUs and FRUs used in NonStop S-series servers.
Table D-1. Service Categories for Hardware Components (page 2 of 2)

Class 3 CRU examples (CRU or FRU*): plug-in cards (PICs) installed in MSEBs, MSEBs, PMF CRUs, ServerNet II Switch, ServerNet cables, AC transfer switch.

Class 3 CRU: A CRU that probably will cause a partial or total system outage if the documented replacement procedure is not followed correctly. Customers replacing Class 3 CRUs should have six or more months’ experience replacing NonStop S-series CRUs or equivalent training. Customers must be able to use the tools needed for the replacement procedure, must protect components from electrostatic discharge (ESD), and must understand the dependencies involved in NonStop S-series CRU-replacement procedures, such as disk path switching. Replacement by a service provider trained by HP is recommended.

Field-Replaceable Unit (FRU) examples (CRU or FRU*): batteries (system), ServerNet F/X adapters, PMCUs, uninterruptible power supply (UPS), enclosures.

Field-Replaceable Unit (FRU): A unit that can be replaced in the field only by qualified personnel trained by HP and cannot be replaced by customers. A unit is classified as a FRU because of safety hazards such as weight, size, sharp edges, or electrical potential; contractual agreements with suppliers; or national or international standards.

*The CRUs and FRUs listed do not represent a complete list of CRUs and FRUs used in NonStop S-series servers.

Note. A cluster switch is neither a CRU nor a FRU. It is a collection of three components, which are included in Table D-1: the ServerNet II Switch, the AC transfer switch, and the UPS. For more information, see the ServerNet Cluster 6770 Hardware Installation and Support Guide.

For more information about service categories, see the Replaceable Units topic in the Service Information category of the Support and Service category in NTL, or see the manual for the component you are servicing.

E TACL Macro for Configuring MSGMON, SANMAN, and SNETMON

The example macro in this appendix automates the process of adding MSGMON, SANMAN, and SNETMON to the system-configuration database. (The manual steps for adding these processes are documented in Section 3, Installing and Configuring a ServerNet Cluster.) The macro:

•  Detects the number of processors currently loaded and prompts you to confirm this number (if your system has fewer than four processors)
•  Aborts and deletes the MSGMON, SANMAN, and SNETMON processes, if they are already present
•  Returns error messages if the processes could not be successfully deleted
•  Adds the processes using a processor list that is appropriate for the system size
•  Starts MSGMON, SANMAN, and SNETMON and attempts to start the ServerNet cluster subsystem

Note. Before using the macro, note the following:

•  The macro is intended as an example and might not be appropriate for all systems.
•  You must log on using the super ID (255, 255) in order to run the macro successfully.
•  Do not run the macro on a system that is currently a member of a ServerNet cluster. MSGMON, SANMAN, and SNETMON will be aborted, and the connection to the cluster will be lost temporarily.

To use the macro:

1. Log on using the super ID (255, 255).

2. Copy the macro to (or create it in) the $SYSTEM.ZSUPPORT subvolume.

3. At a TACL prompt, type RUN ZPMCONF.

4. When the macro finishes, check the SCF state of $ZZSCL to make sure that it is in the STARTED state:

   -> SCF STATUS PROCESS $ZZKRN.#ZZSCL
5. If $ZZSCL is not started, start it:

   -> SCF START PROCESS $ZZKRN.#ZZSCL

Example Macro

?tacl macro
== This is a sample TACL macro which configures the $ZPM entries for
== Msgmon, Sanman, and Snetmon. HP recommends different
== configurations depending upon whether the system has 2 processors,
== 4 processors, or more than 4 processors. This macro starts by
== determining the number of processors currently loaded. If it is
== 5 or more, the macro configures the $ZPM entries as such. If it is
== less than 5 processors, it prompts the user for the number of
== processors to configure for.

#frame
#push listProcessors numProcessors parseTest ptrChar scf^output
#push #inlineprefix
#set #inlineprefix +

== This routine parses the user responses
[#def parseResponse routine |BODY|
  #result [#argument number /minimum 2, maximum 16/
           token /token yes/
           token /token no/
           token /token quit/
           otherwise]
]

== This is the main processing routine. It gets executed once at the
== bottom of the TACL macro.
[#def mainBody routine |BODY|
  #output Determining the total number of processors...
  #output
  #set listProcessors [#processorstatus]
  #set ptrChar [#charfindrv listProcessors [#charcount listProcessors] "-1"]
  #set numProcessors -1  == Initialize to -1 because we always overshoot by 1
  [#loop |WHILE| [ptrChar] |DO|
    #set ptrChar [#charfindrv listProcessors [#compute [ptrChar]-1] " "]
    #set numProcessors [#compute numProcessors + 1]
  ]
  [#if numProcessors <= 4 |THEN|
    #output In order to properly configure this system for Servernet
    #output Clusters we need to know how many processors it has.
    #output
    #output Based on the number of processors currently running, it appears
    #output that this system has a total of [numProcessors] processors.
    [#loop |DO|
      #set parseTest [parseResponse &
        [#input Is this the correct number of processors? (YES/NO/QUIT)]]
      [#if parseTest = 4 |THEN| #return]
    |UNTIL| parseTest = 2 or parseTest = 3]
    == Does the user want to specify the number of processors?
    [#if parseTest = 3 |THEN|
      [#loop |DO|
        #set numProcessors &
          [#input Please enter the total number of processors or QUIT->]
        #set parseTest [parseResponse [numProcessors]]
        [#if parseTest = 4 |THEN| #return]
      |UNTIL| parseTest = 1 ]
    ]
  ]
  #output
  #output Deleting the existing $ZPM entries...
  #output
  == Collect the $ZZKRN information in a variable called scf^output.
  == The next time we are in SCF, we will parse that variable for
  == configuration information.
  scf /name, outv scf^output/ ; info process $zzkrn.*
  scf /name, inline, out [#myterm]/
  + allow all errors
  [#if [#charfindv scf^output 1 "SNETMON"] |THEN|
    + abort process $zzkrn.#zzscl
    #delay 200
    + delete process $zzkrn.#zzscl]
  [#if [#charfindv scf^output 1 "SANMAN"] |THEN|
    + abort process $zzkrn.#zzsmn
    #delay 200
    + delete process $zzkrn.#zzsmn]
  [#if [#charfindv scf^output 1 "MSGMON"] |THEN|
    + abort process $zzkrn.#msgmon
    #delay 200
    + delete process $zzkrn.#msgmon]
  + exit
  #output
  #output Verifying that the entries were successfully deleted...
  #output
  == Repeat the above process to verify that we really did delete all
  == of the entries.
  scf /name, outv scf^output/ ; info process $zzkrn.*
  [#if [#charfindv scf^output 1 "MSGMON"] |THEN|
    #output ERROR! The Msgmon entry was not successfully deleted.
    #output Please delete the entry manually and then rerun this macro.
    #output
    #return ]
  [#if [#charfindv scf^output 1 "SANMAN"] |THEN|
    #output ERROR! The Sanman entry was not successfully deleted.
    #output Please delete the entry manually and then rerun this macro.
    #output
    #return ]
  [#if [#charfindv scf^output 1 "SNETMON"] |THEN|
    #output ERROR! The Snetmon entry was not successfully deleted.
    #output Please delete the entry manually and then rerun this macro.
    #output
    #return ]
  #output
  #output Adding the new $ZPM entries...
  #output
  [#case [numProcessors]
    | 2 3 | #set listProcessors (0,1)
    | 4 5 | #set listProcessors (2,1,3)
    | OTHERWISE | #set listProcessors (2,5,6,3,7,4)
  ]
  scf /name, inline, out [#myterm]/
  + allow all errors
  + add process $zzkrn.#msgmon, autorestart 10, cpu all, &
      hometerm $zhome, outfile $zhome, name $zim, &
      priority 199, program $system.system.msgmon, saveabend on, &
      startmode system, stopmode sysmsg
  + add process $zzkrn.#zzsmn, autorestart 10, priority 199, &
      program $system.system.sanman, cpu firstof [listProcessors], &
      hometerm $zhome, outfile $zhome, name $zzsmn, saveabend on, &
      startmode system, stopmode sysmsg, startupmsg "cpu-list <cpu-list>"
  + add process $zzkrn.#zzscl, autorestart 10, priority 199, &
      program $system.system.snetmon, cpu firstof [listProcessors], &
      hometerm $zhome, outfile $zhome, name $zzscl, saveabend on, &
      startmode system, stopmode sysmsg, startupmsg "cpu-list <cpu-list>"
  + start process $zzkrn.#msgmon
  + start process $zzkrn.#zzsmn
  + start process $zzkrn.#zzscl
  + alter subsys $zzscl, startstate started
  + abort process $zzkrn.#zzscl
  + start process $zzkrn.#zzscl
  #delay 500
  + info subsys $zzscl
  + status subsys $zzscl
  + exit
]

== This is the line of code that actually executes the mainBody routine
mainBody
#unframe

F Common System Operations

This appendix contains procedures for common operations used to manage the ServerNet cluster.

Procedure                                                    Page
Logging On to the TSM Low-Level Link Application             F-1
Logging On to the TSM Service Application                    F-2
Logging On to Multiple TSM Client Applications               F-3
Starting a TACL Session Using the Outside View Application   F-5
Using the TSM EMS Event Viewer                               F-6

Note. You can use OSM instead of TSM for any of the procedures described in this manual. For information on using OSM instead of TSM, see Appendix H, Using OSM to Manage the Star Topologies.

Logging On to the TSM Low-Level Link Application

1. From the Windows Start button, choose Programs>Compaq TSM>TSM Low-Level Link Application. The TSM Low-Level Link main window opens, and the Log On to TSM Low-Level Link dialog box appears. See Figure F-1.

Figure F-1. Log On to TSM Low-Level Link Dialog Box

2. On the Log On to TSM Low-Level Link dialog box, do the following:

   a. In the User name box, type your low-level link user name.
   b. In the Password box, type your password.
   c. Select the system you want to connect to.
   d. Click Log On or double-click the system name and number.

3. Click System Discovery to discover the system.

Logging On to the TSM Service Application

1. From the Windows Start button, choose Programs>Compaq TSM>TSM Service Application.
   The TSM Service Application main window opens and the Log On to TSM Service Connection dialog box is displayed. See Figure F-2.

Figure F-2. Log On to TSM Service Connection Dialog Box

2. On the Log On to TSM Service Connection dialog box, do the following:

   a. In the User name box, type your operating system user name.
   b. In the Password box, type your password.
   c. Select the system you want to connect to.
   d. Click Log On, or double-click the system name and number in the system list.

Logging On to Multiple TSM Client Applications

TSM client applications, such as the TSM Low-Level Link Application and the TSM Service Application, allow you to log on to only one system at a time. However, you can start multiple instances of each client application. Starting multiple instances of an application allows you to log on to multiple systems at the same time from one system console. This capability is useful when you want to manage more than one node in a ServerNet cluster.

Before you can log on to multiple systems, the systems must be added to the system list for the TSM application you want to use, and the systems must be connected to the same LAN. Both the dedicated TSM LAN and the public LAN support the connection of multiple systems.

Use the following procedure to log on to multiple TSM client applications:

1. Log on to the TSM client application, as described in one of the following procedures:

   •  Logging On to the TSM Low-Level Link Application on page F-1
   •  Logging On to the TSM Service Application on page F-2

2. Repeat Step 1 until you have started a TSM client application for each of the nodes you want to manage.

Figure F-3 on page F-4 shows three instances of the TSM Service Application logged on to three different systems from the same system console.

Caution. Running too many applications concurrently on one system console can degrade performance or cause the console to freeze. The number of software applications that your console can support depends on the amount of memory installed in the console.

Figure F-3. Multiple TSM Client Applications

Starting a TACL Session Using the Outside View Application

Sometimes you need to start a TACL session on the host system to perform certain system actions, such as reloading a processor. Depending on whether you are logged on to the TSM Low-Level Link Application or the TSM Service Application, you might be prompted to type an IP address.

Starting a TACL From the TSM Service Application

To start a TACL terminal emulation session from the TSM Service Application:

1. From the File menu, choose Start Terminal Emulator>For TACL. An OutsideView window appears, and a TACL prompt is displayed.

2. If a TACL prompt does not appear in the OutsideView window, do the following:

   a. From the OutsideView session menu, choose New. The New Session Properties dialog box appears.
   b. On the Session tab, in the Session Caption box, type a session caption name, such as TACL.
   c. Click IO Properties. The TCP/IP Properties dialog box appears.
   d. In the Host name or IP address and port box, type the IP address that is currently configured for your primary service connection.
   e. Click OK.
      The TCP/IP Properties dialog box is closed, and you are returned to the New Session Properties dialog box.
   f. Click OK. The OutsideView window appears and a TACL prompt is displayed.

Starting a TACL From the TSM Low-Level Link Application

To start a TACL terminal emulation session from the TSM Low-Level Link Application:

1. From the File menu, choose Start Terminal Emulator>For TACL. The Winsock dialog box appears.

2. In the Enter Telnet IP Address box, type the IP address that is currently configured for your primary service connection. Click OK. An OutsideView window appears, and a TACL prompt is displayed.

Note. The supplied IP address is not saved in the OutsideView Communication settings (under the Settings menu). If you need to reconnect, use the procedure for the connection type shown earlier in this topic.

Using the TSM EMS Event Viewer

1. To start the Event Viewer, do one of the following:

   •  From the TSM Service Application drop-down menu, choose Display and click Events to launch the TSM EMS Event Viewer Application.
   •  Click Start and select Programs>Compaq TSM>TSM Event Viewer.

2. From the File menu, choose Log On.

3. In the Choose System list, click the line containing the name of the system you want to log on to.

4. Enter your User name and Password, and click OK. The application displays an empty task window. A message at the bottom of the screen shows the status of the connection request.

5. Set up retrieval criteria for the TSM EMS Event Viewer Application:

   a. From the Setup menu, choose Time Frame Criteria to open the Setup Search Criteria dialog box.
   b. Set the start and end times of the events you want to retrieve.
   c. Click the Sources tab.
   d. Specify one or more sources of the events to be retrieved.
   e. Click the Subsystem Events tab.
   f. Specify which subsystems should be retrieved and which events within each subsystem should be selected.
   g. Click OK to begin event retrieval.

G Fiber-Optic Cable Information

This appendix provides additional information about the fiber-optic cables that connect a cluster switch to either a node or another cluster switch. These fiber-optic cables conform to the IEEE 802.3z (Gigabit Ethernet) specification. For this release of the ServerNet Cluster product, the following HP cables are supported:

Cable Length (Feet)   Cable Length (Meters)
32.8                  10
131.2                 40
262.4                 80
262.4                 80 (plenum-rated)

HP does not supply fiber-optic cables longer than 80 meters. However, longer cables are supported for connections in a multilane link if certain requirements are met. For more information, see Table 2-4, Cable Length Requirements for Multilane Links, on page 2-11.

Note. Although IEEE 802.3z has a mechanism for supporting multimode fiber-optic cables, the ServerNet cluster product does not currently support them.

Fiber-Optic Cabling Model

Figure G-1 shows the basic model for the fiber-optic cable between a cluster switch and either another cluster switch or a node. The entire path between the two Medium Dependent Interfaces (MDIs) is referred to as the channel. The intermediate fiber-optic connectors might be part of a patch panel or might be splices to a broken cable, or they might not exist if a single run of cable is used between the two MDIs.
Figure G-1. Fiber-Optic Cable Model

[The figure shows the channel between two MDIs: a cluster switch connected through a patch cable, a fiber-optic connector, the building cable, a second fiber-optic connector, and a patch cable to the other cluster switch or node.]

Figure G-2 and Figure G-3 show drawings of the fiber-optic cable. The zipcord cable depicted in Figure G-2 is a cross-over cable. The drawings are for reference only.

Figure G-2. Zipcord Cable Drawing

[The figure shows a single-mode duplex cable with an SC connector at each end and a label.]

Figure G-3. Ruggedized Cable Drawing

[The figure shows a ruggedized cable with a 2-fiber breakout and an SC connector at each end.]

Note. There must be an odd number of cross-overs used in the channel to allow transmitter-to-receiver connectivity. If all straight-through connections are used, the link will not come up.

Optical Characteristics

As specified by IEEE 802.3z, the fiber-optic cable requirements are satisfied by the fibers specified in IEC 793-2:1992 for the type B1 (10/125 µm single mode) with the exceptions noted in Table G-1.

Table G-1. Optical Fiber and Cable Characteristics

Description                               9 µm SMF
Nominal fiber specification wavelength    1310 nm
Fiber cable attenuation (max)             0.5 dB/km
Zero dispersion wavelength (λ0)           1300 nm <= λ0 <= 1324 nm
Dispersion slope (max) (S0)               0.093 ps / nm2 * km

Optical Fiber Connection

An optical connection consists of a mated pair of optical connectors. The Physical Media Dependent (PMD) is coupled into the fiber-optic cabling through a connector plug into the MDI optical receptacle, as shown in Figure G-4.

Insertion Loss

The insertion loss is specified for a connection that consists of a mated pair of optical connectors. You must not exceed the 4.5 dB channel loss specified in Table G-2. For example, assume a 1000-meter channel has six connections, each with an average of 0.5 dB loss per connection. This, combined with 0.5 dB loss per kilometer, results in a total attenuation of 3.5 dB (6 x 0.5 dB + 1 km x 0.5 dB/km), which is 1 dB less than the maximum channel loss. Table G-2 lists the insertion loss for single-mode fiber-optic cables.

Table G-2. Single-Mode Fiber Insertion Loss

Description               10 µm Single-Mode Fiber
Wavelength                1310 nm
Operating Distance        For connections between a node and a cluster switch, a maximum of 80 m is supported. For connections in a multilane link, see Table 2-4, Cable Length Requirements for Multilane Links, on page 2-11.
Channel Insertion Loss    4.5 dB typical
SC Connector Loss         0.5 dB typical

You can use connections with different loss characteristics if the requirements in Table G-1 and Table G-2 are met.

ServerNet MDI Optical Power Requirements

The ServerNet MDI optical power requirements are:

Transmitter output optical power    Minimum: -9.5 dBm    Maximum: -3 dBm
Receiver input optical power        Minimum: -20 dBm     Maximum: -3 dBm
Optical power budget                10.5 dB

(The budget follows from the worst-case limits: a transmitter at its minimum output of -9.5 dBm and a receiver minimum of -20 dBm leave 10.5 dB of allowable channel loss.)

Connectors

The cluster switch and Modular ServerNet Expansion Board (MSEB) PMDs are coupled to the fiber-optic cabling through a connector plug into the MDI optical receptacle. The optical receptacle is a duplex subscriber connector (SC). An SC is also known informally as a “Stick and Click” connector. See Figure G-4.
Figure G-4. Duplex SC Connector and Receptacle

[The figure shows the duplex SC connector and receptacle, with keys on the connector body.]

ServerNet Cluster Connections

The following sections list other requirements for the two types of ServerNet cluster connections that use single-mode fiber-optic cables.

Node Connections

The node connections use ports 0 through 7 on the cluster switch. The range of the node connections is up to 80 meters. This distance is less than the distance that single-mode fiber can support because of the hardware implementation of the ServerNet protocol.

Caution. Do not exceed the 80-meter maximum cable length between a ServerNet node and a cluster switch. Doing so can cause multiple, unrecoverable link failures to and from the node that is in violation of the distance limit.

Cluster Switch Connections

The cluster switch connections use ports 8 through 11 of the cluster switch. These ports allow multiple cluster switches on a fabric to be connected in a multilane link. Cables up to 80 meters in length are supported for multilane links in all ServerNet cluster topologies. However, you can use longer cables for a multilane link if the requirements in Table 2-4 on page 2-11 are met.

H Using OSM to Manage the Star Topologies

The contents of this section were formerly part of the ServerNet Cluster 6770 Supplement. That supplement has been incorporated into this manual for easier access and linking to the information.

OSM supports all network topologies of a ServerNet cluster: the star topologies (star, split-star, and tri-star) and the newer layered topology. If you use the OSM software to manage a cluster with one of the star topologies, this section describes the differences between the OSM Service Connection and the TSM Service Application.

ServerNet Cluster Resource Appears at Top Level of Tree Pane

The OSM Service Connection does not have a cluster tab as the TSM Service Application does. As a result, you do not need to switch between cluster resources and system resources. You can see all resources in the tree pane at the same time.

Some Cluster Resources Are Represented Differently in OSM

The OSM representation of the 6770 switch used in the star topologies is more complete than the TSM representation. In particular, the ServerNet II Switch, the AC Transfer Switch, and the UPS now appear as separate resources. To view these resources, expand the Switch Module resource.

The OSM names for some cluster resources are different from the TSM names:

TSM Name                OSM Name
Switch-to-Node Link     Switch PIC (PICs 0 through 7 connect to the nodes.)
Switch-to-Switch Link   Switch PIC (PICs 8 through 11 connect to other switches.)
Local Node              ServerNet Local Node
Remote Node             ServerNet Remote Node

Note. The full name for a Switch PIC is in the form Switch PIC $ZZSMN.X.1.2. The last digit in the name is the PIC number. In this example, the last digit is 2, indicating that this PIC is used for connections to nodes.

Guided Procedures Have Changed

Some of the TSM guided procedures used with a ServerNet cluster have been replaced by new actions in the OSM Service Connection.
Guided Procedures Have Changed

Some of the TSM guided procedures used with a ServerNet cluster have been replaced by new actions in the OSM Service Connection. Other guided procedures are now launched directly by an action in the OSM Service Connection instead of from the guided procedures interface.

Add Switch: Procedure has been replaced by the Update Topology action of the ServerNet Cluster resource. The tasks have been streamlined and are documented in the OSM Service Connection online help.

Replace MSEB: Procedure is now launched by the Replace action of the MSEB.

Replace Switch: Procedure has been enhanced to support switch types for all topologies. For the star topologies, the procedure is launched by the Replace action of the Switch Module, the ServerNet II Switch, the AC Transfer Switch, or the UPS.

Update Switch: Procedure has been replaced by separate Firmware Update and Configuration Update actions of the ServerNet II Switch.

Configure ServerNet Node: Procedure has been streamlined and renamed. It is now launched by the Add Node to ServerNet Cluster action of the System resource.

Options for Changing Topologies

To update your cluster to the split-star or tri-star topology, you can use the OSM Update Topology interactive action, which replaces the TSM Add Switch guided procedure. The tasks are described in the OSM Service Connection online help, which is accessible when you perform the Update Topology action. If you prefer, you can still use the TSM Add Switch guided procedure as described in the ServerNet Cluster Manual. If you need to migrate from one of the star topologies to the layered topology, see the ServerNet Cluster 6780 Planning and Installation Guide.

If you use the OSM Update Topology action to update to the split-star or tri-star topology, you should still refer to the ServerNet Cluster Manual for planning information. Most of the information in the ServerNet Cluster Manual still applies, but:

• References to TSM software generally apply to OSM software. To learn more about the differences, see For More Information About OSM.
• The OSM switch names are slightly different from the TSM switch names.
• The TSM client and server versions are not relevant because you should be running the OSM package on all nodes. The OSM package supports all network topologies of a ServerNet cluster.
• Follow the procedure for updating topologies presented in the OSM Service Connection online help.
• References to TSM alarms generally apply to alarms generated in the OSM Service Connection. The alarm behavior is similar. To get alarm details, follow the instructions in the OSM Service Connection online help.

After you complete the planning steps and are ready to start the update process, use the OSM Service Connection to perform the Update Topology action on the ServerNet Cluster resource. Follow the procedure in the OSM Service Connection online help.

For More Information About OSM

• The OSM User's Guide provides details on the cluster resources that exist in both types of clusters. It provides guidance for using OSM to manage clusters with the star topologies. It also describes how to perform basic functions within the OSM Service Connection, such as monitoring attributes and alarms for system and cluster resources, and performing actions on those resources as needed.
• The OSM Migration Guide compares OSM and TSM functionality and describes how to prepare for and migrate to using OSM for system and cluster management.
I SCF Changes at G06.21

This section describes SCF changes made at G06.21 to the SNETMON and SANMAN product modules that might affect management of a cluster with one of the star topologies. The contents of this section were formerly part of the ServerNet Cluster 6770 Supplement. That supplement has been incorporated into this manual for easier access and linking to the information.

Using SCF Commands for SNETMON and the ServerNet Cluster Subsystem

This section describes the changes made to SCF commands for the SNETMON product module. The changes are backward compatible and still support the star topologies and the 6770 switch. For more details, consult the SCF help text for this product module. To access SCF help for SNETMON and the ServerNet Cluster subsystem, at the SCF prompt, type:

> HELP SCL

ALTER SUBSYS

A new STATSEVENT option is available. STATSEVENT specifies whether ServerNet cluster statistics are generated as EMS events (SCL event 1200). For details, see the SCF help.

INFO SUBSYS

The output of the INFO SUBSYS command shows the current setting of the STATSEVENT attribute.

STATUS SUBNET

A new RANGE option is available. This option causes the command to display connectivity status for all nodes within a specified range of ServerNet node numbers. The output contains a line for each node in the specified range, whether active or inactive. If you issue the STATUS SUBNET command with no options specified, a summary table of all active nodes appears. Inactive nodes no longer appear. If you want the inactive nodes to appear in the display, use the RANGE option. For details, see the SCF help.
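For example, a STATUS SUBNET command with no options shows only the active nodes, while the RANGE option adds the inactive ones. The exact argument form of RANGE is documented in the SCF help; the second command below is a hypothetical illustration of the option, not confirmed syntax:

> STATUS SUBNET $ZZSCL
> STATUS SUBNET $ZZSCL, RANGE 1 TO 24

The first command displays the summary table of active nodes. The second, assuming RANGE accepts a pair of ServerNet node numbers, would display one line for each of nodes 1 through 24, active or inactive.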
TRACE

The BACKUP option is not supported for a trace operation of a message monitor process.

Using SCF Commands for SANMAN

This section describes the changes made to SCF commands for the SANMAN product module. The changes are backward compatible and still support the star topologies and the 6770 switch. The 6780 switch is also supported, but the syntax for specifying each type of switch is different. Also, the output of the commands differs depending on the switch type. For more details, consult the SCF help text for this product module. To access SCF help for SANMAN, at the SCF prompt, type:

> HELP SMN

ALTER SWITCH

The syntax for this command has changed. The FABRIC specifier does not change the setting of the 6770 switch, but designates which switch to alter. To configure a 6770 switch for use on a particular fabric, use the new FABRICSETTING attribute. For details, see the SCF help.

STATUS CONN

None of the changes to this command affect clusters with the star topologies, but the syntax presented earlier in this manual is incorrect when using this product version or later. For the correct syntax, see the SCF help.

STATUS SWITCH

None of the changes to this command affect clusters with the star topologies; however, the syntax presented earlier in this manual is incorrect when using this product version or later. For the correct syntax, see the SCF help.

New SCF Error Messages for SANMAN

Because the syntax is different for 6770 and 6780 switches, many new error messages have been added in case you use incorrect syntax for your type of switch. For a complete listing of SCF error messages for SANMAN, see the ServerNet Cluster 6780 Operations Guide.

Safety and Compliance

Regulatory Compliance Statements

The following warning and regulatory compliance statements apply to the products documented by this manual.

FCC Compliance

This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instruction manual, may cause interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case the user will be required to correct the interference at his own expense. The use of shielded cables is required in order for this equipment to meet the FCC emission limits. Exceptions: Ethernet and EPO cables. Any changes or modifications not expressly approved by Compaq Computer Corporation could void the user's authority to operate this equipment.

CISPR Compliance

This equipment complies with the requirements of CISPR 22 (EN 55 022) for Class A Information Technology Equipment (ITE). In a domestic environment this product may cause radio interference, in which case the user may be required to take adequate measures.

Canadian Compliance

This Class A digital apparatus meets all the requirements of the Canadian Interference-Causing Equipment Regulations.

Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada.

Taiwan (BSMI) Compliance

[Compliance statement provided as an image in the original manual.]

Japan (VCCI) Compliance

[Compliance statement provided as an image in the original manual.]

DECLARATION OF CONFORMITY

Supplier Name: COMPAQ COMPUTER CORPORATION
Supplier Address: Compaq Computer Corporation, Non-Stop Division, 10300 North Tantau Ave, Cupertino, CA 95014 USA
Represented in the EU By: Compaq Computer EMEA BV, P.O. Box 81 02 44, 81902 Munich, Germany

Declares under our sole responsibility that the following product

Product Name: COMPAQ ServerNet Cluster Switch
Product Model: COMPAQ Model 67xx

Conforms to the following normative European and International Standards

Product Safety: EN60950:1995 (IEC 950 2nd Edition)
Electromagnetic Compatibility: EN55022:1998 (Radiated and Conducted Emission), EN55024:1998 (EMC Immunity)

Following the provisions of the normative European Council Directives:

EMC Directive 89/336/EEC (including amendments)
Low Voltage Directive 73/23/EEC (amended by 93/68/EEC)

Supplementary Information:

Safety: Protection Class I, Pollution Degree II
Emissions: EMC Class A
Year Assessed / First Production: 2001

Product conformance to cited product specifications is based on sample (type) testing, evaluation, or assessment at Compaq's compliance laboratories in Cupertino, California or at accredited laboratories accepted by European Union Notified and Competent Bodies.
Derek Smith
Manager, Hardware Product Assurance
Cupertino, California

Consumer Safety Statements

Customer Installation and Servicing of Equipment

The following statements pertain to safety issues regarding customer installation and servicing of equipment described in this manual. Do not remove the covers of an AC transfer switch, ServerNet II Switch, or uninterruptible power supply (UPS).

Index

A
AC transfer switch: described 1-29; replacement 7-38
Add Switch guided procedure 3-25, 3-27, 3-30, 3-32, 4-26, 4-30, 4-60, 4-73, 4-87
Adding a node 3-24, 6-1
Alarm Detail dialog box 7-13
Alarms 7-12, 7-15
ALGORITHM modifier 3-11
Attributes: external fabric resource 5-6; local node resource 5-5; MSEB 5-3; online help for 5-10; PIC 5-3; remote node resource 5-6; ServerNet cluster resource 5-5; service state 5-10; switch resource 5-7
Automatic configuration of line-handler processes 3-18
Automatic fail-over of ServerNet links 7-25

B
Bend radius 2-12, 3-18
Benefits of upgrading 4-2/4-3
Blank forms B-1
Block diagram for cluster switch 1-29

C
Cable lengths 2-10, 2-11, 2-19, 4-96, G-5
Cables: bad fiber connection 3-21; considerations for connecting 3-6; disconnecting 6-3; fiber-optic, requirements 2-12; for multilane links 2-10; lengths 2-10, 2-11, 2-19, 4-96, G-5; media 1-19; moving fiber-optic 6-5; types of 2-17
Cabling considerations 2-20
Checking: cluster status 5-1; MSEB PIC status 5-3; MSEB status 5-3; SNETMON status 5-15; version information 5-16, 5-17
Checklist for installation planning 2-1
Cluster: defined 1-12; hardware components 1-9; software components 1-10, 1-33; splitting into smaller clusters 6-11; switch 1-26; tab 5-4, 7-3; tab and SANMAN 5-29; tab, troubleshooting 7-9; topologies 1-2
Cluster switch: block diagram 1-29; components 1-28; connections between 1-30, 2-10, 2-19, G-5; defined 1-26; floor space for servicing 2-21; globally unique ID 9-10; installation 3-12; location 2-19; number for clustering 1-26; packaging 1-26; placement of 2-20; polling intervals 9-12; power cord length 2-20; power requirements 2-22; powering on 3-16; remote 5-1; updating firmware and configuration 3-15
CLUSTER.CHM file 7-11
Configuration tag: manufacturing default 4-5; SCF/TSM display of 4-5
Configuration, ServerNet II Switch: checking 4-11; compatibility with NonStop Kernel 4-98; decision table for updating 3-13; downloading 4-99; file name 4-98; reference information 4-97/4-105; revisions 4-12; version procedure information 2-25
Configuring: cluster switch 3-13; Expand-over-ServerNet lines 3-29, 3-35; MSGMON, SANMAN, and SNETMON 3-8; ServerNet node 3-17
Connectivity: repairing problems 7-23; verifying 3-29, 3-36
Connectors for SEB and MSEB 2-17
Control tasks 5-26/5-35
CPU list configuration 1-35, 1-36, 1-39, 3-10
CRUs D-1

D
Dedicated LAN 2-28
Destination ServerNet ID 1-23
Dial-out 1-44, 2-27, 2-29, 6-3
DID 1-23
Double-wide PIC 1-28

E
ECL ServerNet cables, connecting 3-6
Enclosure dimensions 2-21
Error messages: SCL 10-2/10-6; SMN 10-7/10-8
ESD information C-1
Ethernet 4 ServerNet Adapter (E4SA) 2-31
Event messages 7-31
Expand: line compatibility 2-26; line-handler process 3-11, 3-22; loss of traffic 4-98; network 1-10; node 1-13; node number 2-26
Expand-over-ServerNet lines: configuring 3-29, 3-35; troubleshooting 7-21
External fabrics: defined 1-17; testing 7-29
External routing 1-23

F
F1 help 5-10
Fabrics: checking external 7-29; checking internal 7-26; stopping data traffic on 5-33
FABRICTS.CHM file 7-11
Fail-over of ServerNet traffic 1-31, 4-3, 7-25
Fallback for merging clusters 4-66/4-67, 4-89/4-92
Fallback for upgrading software 4-26/4-33, 4-50/4-53
Fiber-optic cables: bad connection 3-21; bend radius 2-12, 3-18; connecting 3-18; connectors 3-19; function 2-10; insertion loss G-3; inspecting 3-18; moving 6-5; optical characteristics G-3; plenum-rated 2-12; replacing 7-36; requirements 2-12; supported lengths 2-12, G-1
Firmware, ServerNet II Switch: checking 4-11; combinations 4-103; compatibility with NonStop Kernel 3-14, 4-98; decision table for updating 3-13; downloading 4-99; file name 4-98; reference information 4-97/4-105; revisions 4-12; running 4-11; version procedure information 2-25
Firmware, SP 4-105, 7-19, 7-20
Five kilometer cables 2-11
Floor space requirements 2-18
Four-lane link: connecting 3-27/3-28, 4-64/4-65; connections 1-31; described 1-30; length of 4-96; routing across 1-31
FPGA 1-22
FRUs D-2

G
G06.12 functionality: fallback for upgrading software to obtain 4-26/4-33; upgrading software to obtain 4-17/4-25
G06.13 versus G06.12 functions 4-2
G06.14 functionality: fallback for upgrading software to obtain 4-50/4-53; upgrading software to obtain 4-34/4-49
Globally unique ID, cluster switch 9-10
Group Connectivity ServerNet Path Test 7-27
GUID (globally unique ID) 9-10
Guided procedure: adding a node 6-1; configuring ServerNet node 3-17, 3-29, 5-21, 5-22; installing PICs 2-18; replacing an MSEB 2-18; replacing an SEB 2-18; updating a switch 4-100

H
Hard reset, cluster switch 4-101
Hardware inventory 3-3
Hardware planning 2-9
Help: F1 5-10; online 5-10, 7-11; subsystem error message 10-2

I
Installation planning 2-1
Installation tasks 3-1/3-24, 3-25
Installing: cluster switch 3-12; fiber-optic cables 3-12; MSEBs 3-5; split-star topology 3-25; star topology 3-1/3-24
Internal fabrics: defined 1-16; testing 7-27
Internal Loopback Test 7-6, 7-30
Inventory of required hardware 3-3
IPC subsystem 7-31

L
LAN: dedicated 2-28; planning for ServerNet clusters 2-32; public 2-31
LAN planning 2-27
LEDs 7-33, 7-34
Line-handler process: automatic configuration of 3-18; example 3-22; monitoring 5-20; required for clustering 3-22; rules for locating primary and backup 3-23; summary 1-42; using SCF to configure 3-22
Link to node 7-5
Long cables in multilane link 2-11
Loopback: see Internal Loopback Test

M
Managing multiple nodes F-3
Merging clusters 4-54/4-92
Message system traffic 1-43
Monitoring tasks 5-1/5-26
MSEB: attributes 5-3; connectors 2-17; defined 1-19; diagram 2-15; guided procedure for replacement 3-5; installation 3-5; installation requirements 2-13; LEDs 7-33; replacement 7-35; troubleshooting 7-6
MSGMON: aborting 5-28; creating 1-37, 3-7; functions 1-37; recommended configuration 3-9; starting 5-28; troubleshooting 7-19
Multilane link: cables for 2-10; defined 1-30; maximum length 2-11, 2-19

N
Network topology 1-2
NNA: FPGA 1-24; MSEB port 6 requirement 2-16; PIC type for MSEB port 6 7-6
Node: adding to a cluster 3-24, 6-1; Expand 1-13; expanding 6-11; moving from one cluster to another 6-4; reducing 6-11; removing from a cluster 6-2; ServerNet 1-13
Node Connectivity ServerNet Path Test 7-4, 7-7, 7-29
Node number 1-13
Node Responsive Test 7-23
Node routing ID 1-13
Node-number assignments 1-13, 1-14
Node-numbering agent 1-22, 2-16
NonStop Himalaya Cluster Switch: see cluster switch
NonStop S700 considerations 2-13

O
One kilometer cables 2-11
Online expansion 6-11
Online help 5-10, 7-11, 10-2
Operating system requirements 3-4
OSM software package 1-44
OSM, using to manage cluster H-1
Outside View F-5

P
Packets, ServerNet 1-23
PDK.CHM file 7-11
Performance problems, diagnosing 7-39
Physical view (TSM): split-star topology 5-11; tri-star topology 5-11
PIC: attributes 5-3; double-wide 1-22, 1-28; installation 2-16, 2-18; installed in port 6 2-16; NNA FPGA 1-22; replacement 7-35; single-wide 1-28; sizes 1-28; troubleshooting 7-6
Planning: checklist 2-1/2-4; for tri-star topology 3-31; hardware 2-9; power 2-22; work sheet 2-5, 2-6, 2-7; worksheet for upgrade 4-10
Planning form: cluster B-2; moving ServerNet cables B-3
Plug-in cards (PICs): attributes 5-3; connectors 2-17; double-wide 1-28; sizes 1-28
Polling intervals for cluster switches 9-12
position ID: INFO SWITCH command 9-11; split-star topology 1-5; star topology 1-4; tri-star topology 1-7
Power requirements 2-22
Public LAN 2-31, 2-33

R
Reference information 4-93/4-104
Releases, ServerNet cluster 4-2
Remote node: attributes 5-6; checking communications with 7-23
Remote passwords 3-22
Remote switch 5-1, 5-9
Removing a node 6-2
Repair actions 7-13, 7-14
Replacement procedures 7-35/7-38
Required hardware 3-3
Reset, soft and hard 4-101
Routers 1-18
Routing between switches: four-lane link 1-31; two-lane link 1-32

S
SANMAN: abending 7-19, 7-20; aborting 5-29; and the TSM cluster tab 5-29; cluster switch polling intervals 9-12; compatibility with NonStop Kernel 4-94; considerations for upgrading 4-93; creating 1-39, 3-7; functions 1-38; recommended configuration 3-10; restarting 5-30; SCF commands 1-45, 9-1; starting 5-29; switching primary and backup 5-34; troubleshooting 7-20; upgrading before loading new configuration 4-102
SAVE CONFIGURATION command 3-5
SC cable connectors 2-11
SCF: ABORT LINE command 6-6; ABORT PROCESS command 5-34; ALTER SUBSYS command 8-5; ALTER SWITCH command 9-3, 9-4; commands for SANMAN 9-1/9-37; commands for SNETMON 8-1/8-26; configuring line-handler processes using 3-22/3-23; error messages 10-1; help for subsystem error messages 10-2; INFO CONNECTION command 9-6; INFO LINE command 5-24; INFO PROCESS command 5-13; INFO PROCESS $NCP, LINESET command 5-25; INFO SUBSYS command 8-6; INFO SWITCH command 9-9; LISTDEV command 5-14; LOAD SWITCH command 9-13, 9-15; nonsensitive commands 8-2; object summary states 8-3; PRIMARY PROCESS command 7-23, 8-7, 9-16; remote 5-12; RESET SWITCH command 9-16, 9-17; SANMAN features by release 9-2; SANMAN objects 9-2; SAVE CONFIGURATION command 3-5; sensitive commands 8-2; ServerNet cluster objects 8-2; SNETMON/SCL features by release 8-2; START LINE command 6-8; START PROCESS command 5-31; START SERVERNET command 7-24; START SUBSYS command 7-24, 8-8; STATS LINE command 5-23; STATS PATH command 5-24; STATUS CONNECTION command 9-19; STATUS CONNECTION, NNA command 9-22; STATUS DEVICE command 5-20; STATUS LINE command 5-22; STATUS PATH, DETAIL command 5-23; STATUS PROCESS command 5-15; STATUS SERVERNET command 7-27; STATUS SUBNET command 5-12, 5-17, 7-31, 8-11; STATUS SUBNET, DETAIL command 8-15; STATUS SUBNET, PROBLEMS command 8-14; STATUS SWITCH command 9-25; STATUS SWITCH, ROUTER command 9-33; STOP SUBSYS command 5-31, 5-33, 6-3, 7-24, 8-21; TRACE PROCESS command 8-23, 9-34; VERSION command 5-16, 5-17, 8-25, 9-36
SCF, changes to SNETMON and SANMAN at G06.21 I-1
SCL subsystem 7-31
SEB: compared to MSEB 1-19; connector 2-17; diagram 2-15; guided procedure for replacement 3-5; replacement 2-18, 7-35
SEB.CHM file 7-11
ServerNet: cables 1-30, 2-17; IDs 1-22; node 1-13; node number 1-13; packets 1-23, 1-25; stopping data traffic 5-33
ServerNet adapters 2-13, 2-18, 3-5
ServerNet cluster: adding nodes 3-24; hardware components 1-9; joining 5-31; product 1-1; releases 4-2; software components 1-10; upgrading 4-1/4-105
ServerNet Cluster Connection Status dialog box 5-22
ServerNet cluster services: starting 5-31; stopping 5-33, 6-3
ServerNet cluster subsystem: defined 1-36; logical states 1-36, 8-3
ServerNet connectivity, repairing 7-23
ServerNet II Switch: defined 1-27; keyed ports 3-20, 4-64; LEDs 7-34; powering on 3-16; replacement 7-38; troubleshooting 7-8
Service categories D-1/D-2
Service clearance 2-21
Service processors 1-18
Service provider 10-12
SID 1-23
Single-mode fiber-optic PIC 2-16
Single-wide PIC 1-28
Slots 51 and 52 (system enclosure) 2-14
SMN subsystem 7-31
SNET value for SPEEDK 1-42
SNETMON: abending 7-19, 7-20; aborting 5-30; checking status 5-14; compatibility with operating system 4-95; creating 1-36, 3-7; displaying information about 5-13; fault tolerance 1-35; functions 1-33; interaction with Expand 1-36; recommended configuration 3-10; SCF commands 1-45, 8-1; starting 5-30; switching primary and backup 5-34; symbolic name 5-13; troubleshooting 7-17
Soft reset, cluster switch 4-101
Software: for clustering 1-10, 1-33; operating system upgrade 3-4; problem areas 7-3; VPROC information 2-25
Source ServerNet ID 1-23
SP firmware: checking version 2-24; for Tetra 8 topology 2-26, 7-19, 7-20; minimum required 2-23, 3-15, 4-8
SP functions 1-18
SPEEDK modifier 1-42
SPI 1-33
Splitting a cluster 6-11
Split-star topology: configuration tags 4-5; described 1-5; fallback for merging clusters to create 4-66/4-67; installing 3-25; merging clusters to create 4-54/4-63; upgrade paths 4-14/4-15
SPRs: checking current levels 2-24, 4-11; G06.12 and G06.14 functionality 3-15, 4-8; required for topologies 2-23, 4-7; VPROC information 2-25
SPs 1-18
Star group 1-12, 3-25, 4-7
star group 1-2
Star topology: configuration tag 4-5; described 1-4; installing 3-1/3-24; upgrade paths 4-13
STARTMODE attribute 5-32
STARTSTATE attribute 5-32, 8-4
Statistics 5-18
Status information, displaying 5-1
Stopping the cluster monitor process 6-3
Subsets of a topology 1-7
Subsystems 7-31
Super time factors 1-42
SWANFFU.CHM file 7-11
Switch: enclosure 1-26, 2-10; enclosure dimensions 2-21; packaging 2-19
SWITCHADDITION.CHM file 7-11
SWITCHFIRMWARE.CHM file 7-11
Switch-to-node link 5-8
Switch-to-switch link 5-8
SWITCH.CHM file 7-11
Symbolic names 3-7, 5-29
System: console 2-27; name 2-26; number 2-26

T
T0569: AAA IPM 4-101; AAA preloaded on ServerNet II Switch 3-12; compatibility with NonStop Kernel 4-99; configuration revisions 4-12; considerations for applying AAA and AAB 4-22; firmware revisions 4-12
TACL macro 1-35, 1-37, 1-39, 3-8, E-1/E-5
TACL session F-5
Tetra 8, SP firmware consideration 2-26
Time factors 1-42
Topology: choosing 2-8, 4-6; comparison 4-7; identifying 4-4/4-5; maximum size for each 2-8; network 1-2; split-star 1-5; star 1-4; subsets 1-7, 2-9, 4-4, 4-6; Tetra 16 1-2; Tetra 8 1-2; tri-star 1-7
Tree pane 5-4
Tri-star topology: configuration tags 4-5; described 1-7; fallback for merging clusters to create 4-89/4-92; installing 3-30; merging clusters to create 4-68/4-86; required releases 4-3; upgrade paths 4-16
Troubleshooting: Cluster tab 7-9; Expand-over-ServerNet lines 7-21; Expand-over-ServerNet line-handler processes 7-21; external fabric 7-4; fiber-optic ServerNet cable 7-7; guided procedures interface 7-3; internal fabric 7-3; MSEB 7-6; MSGMON 7-19; PIC installed in MSEB 7-6; procedures 7-1/7-35; remote node communication 7-4; SANMAN 7-20; ServerNet cable 7-7; ServerNet communications 7-3; ServerNet II Switch 7-8; SNETMON 7-17; tips 7-1; UPS 7-8
TSM: alarms 6-3, 7-12; client software versions 3-4; EMS Event Viewer Application F-6; managing multiple nodes F-3; on a public LAN 2-33; Service Application management functions 1-44
TSM client applications: example of multiple F-4; logging on to multiple F-3
TSM Low-Level Link Application, logging on to F-1
TSM management window 5-2
TSM Physical view: split-star topology 5-11; tri-star topology 5-11
TSM Service Application: displaying status information using 5-1; functions 1-44; logging on to F-2
TSM software package 1-44
TSM.CHM file 7-11
Two-lane link: connecting 3-34, 4-87/4-88; connections 1-32; described 1-31; routing across 1-32; routing cables 3-32

U
UNKNOWN remote node 7-5
Upgrading: SANMAN 4-93, 4-102; SNETMON/MSGMON 4-95; software with system loads 4-25; TSM 4-93
Upgrading a ServerNet cluster: benefits of 4-2; planning tasks 4-4/4-16; upgrade paths 4-12/4-16
UPS: defined 1-29; powering on 3-16; replacement 7-38; troubleshooting 7-8

V
Version of ServerNet cluster subsystem 5-17
Version procedure (VPROC) information 2-25, 4-11

W
Website, ServerNet cluster 3-9

X
X fabric 1-2, 1-16

Y
Y fabric 1-2, 1-16

Z
ZPMCONF macro 1-35, 1-37, 1-39, 3-8, 3-9, E-1
ZZAL attachment files 7-15

Special Characters
$NCP 3-11
$NCP ALGORITHM modifier 3-11
$ZEXP 3-11
$ZLOG 5-18
$ZPM 8-4
$ZZKRN.#MSGMON 7-19
$ZZKRN.#ZZSCL 7-17
$ZZKRN.#ZZSMN 7-20