From: sfung51

HACMP for AIX 5L



 

 
 
Published:  November 28, 2009
 
 
 
Notes:
 
Slide 1: Front cover
IBM HACMP for AIX V5.X Certification Study Guide
Includes the updated, new features for HACMP V5.2
Valuable guide for HACMP system administrators
Get ready for the HACMP V5.X certification exam
Octavian Lascu, Dharmesh Parikh, David Pontes, Liviu Rosca, Christian Allan Schmidt
ibm.com/redbooks
Slide 3: International Technical Support Organization IBM HACMP for AIX V5.X Certification Study Guide October 2004 SG24-6375-00
Slide 4: Note: Before using this information and the product it supports, read the information in “Notices” on page xi. First Edition (October 2004) This edition applies to Version 5 of HACMP for AIX (product number 5765-F62). © Copyright International Business Machines Corporation 2004. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Slide 5: Contents
Notices
Trademarks
Preface
  The team that wrote this redbook
  Become a published author
  Comments welcome
Chapter 1. Introduction
  1.1 What is HACMP?
    1.1.1 History and evolution
    1.1.2 High availability concepts
    1.1.3 High availability vs. fault tolerance
    1.1.4 High availability solutions
  1.2 HACMP concepts
    1.2.1 HACMP terminology
  1.3 HACMP/XD (extended distance)
    1.3.1 HACMP/XD: HAGEO components
    1.3.2 HACMP/XD: HAGEO basic configurations
    1.3.3 HACMP/XD PPRC integration feature
Chapter 2. Planning and design
  2.1 Planning consideration
    2.1.1 Sizing: choosing the nodes in the cluster
    2.1.2 Sizing: storage considerations
    2.1.3 Network considerations
  2.2 HACMP cluster planning
    2.2.1 Node configurations
    2.2.2 Network configuration
    2.2.3 HACMP networking terminology
    2.2.4 Network types
    2.2.5 Choosing the IP address takeover (IPAT) method
    2.2.6 Planning for network security
  2.3 HACMP heartbeat
    2.3.1 Heartbeat via disk
    2.3.2 Heartbeat over IP aliases
  2.4 Shared storage configuration
    2.4.1 Shared LVM requirements
    2.4.2 Non-Concurrent, Enhanced Concurrent, and Concurrent
© Copyright IBM Corp. 2004. All rights reserved.
Slide 6:
    2.4.3 Choosing a disk technology
  2.5 Software planning
    2.5.1 AIX level and related requirements
    2.5.2 Application compatibility
    2.5.3 Planning NFS configurations
    2.5.4 Licensing
    2.5.5 Client connections
  2.6 Operating system space requirements
  2.7 Resource group planning
    2.7.1 Cascading resource groups
    2.7.2 Rotating resource groups
    2.7.3 Concurrent resource groups
    2.7.4 Custom resource groups
    2.7.5 Application monitoring
  2.8 Disaster recovery planning
  2.9 Review
    2.9.1 Sample questions
Chapter 3. Installation and configuration
  3.1 HACMP software installation
    3.1.1 Checking for prerequisites
    3.1.2 New installation
    3.1.3 Installing HACMP
    3.1.4 Migration paths and options
    3.1.5 Converting a cluster snapshot
    3.1.6 Node-by-node migration
    3.1.7 Upgrade options
  3.2 Network configuration
    3.2.1 Types of networks
    3.2.2 TCP/IP networks
  3.3 Storage configuration
    3.3.1 Shared LVM
    3.3.2 Non-concurrent access mode
    3.3.3 Concurrent access mode
    3.3.4 Enhanced concurrent mode (ECM) VGs
    3.3.5 Fast disk takeover
  3.4 Configuring cluster topology
    3.4.1 HACMP V5.X Standard and Extended configurations
    3.4.2 Define cluster topology
    3.4.3 Defining a node
    3.4.4 Defining sites
    3.4.5 Defining network(s)
    3.4.6 Defining communication interfaces
Slide 7:
    3.4.7 Defining communication devices
    3.4.8 Boot IP labels
    3.4.9 Defining persistent IP labels
    3.4.10 Define HACMP network modules
    3.4.11 Synchronize topology
  3.5 Resource group configuration
    3.5.1 Cascading resource groups
    3.5.2 Rotating resource groups
    3.5.3 Concurrent access resource groups
    3.5.4 Custom resource groups
    3.5.5 Configuring HACMP resource groups using the standard path
    3.5.6 Configure HACMP resource group with extended path
    3.5.7 Configuring custom resource groups
    3.5.8 Verify and synchronize HACMP
  3.6 Review
    3.6.1 Sample questions
Chapter 4. Cluster verification and testing
  4.1 Will it all work?
    4.1.1 Hardware and license prerequisites
    4.1.2 Operating system settings
    4.1.3 Cluster environment
  4.2 Cluster start
    4.2.1 Verifying the cluster services
    4.2.2 IP verification
    4.2.3 Resource verification
    4.2.4 Application verification
  4.3 Monitoring cluster status
    4.3.1 Using clstat
    4.3.2 Using snmpinfo
    4.3.3 Using Tivoli
  4.4 Cluster stop
  4.5 Application monitoring
    4.5.1 Verifying application status
    4.5.2 Verifying resource group status
    4.5.3 Verifying NFS functionality
  4.6 Cluster behavior on node failover
  4.7 Testing IP networks
    4.7.1 Communication adapter failure
    4.7.2 Network failure
    4.7.3 Verifying persistent IP labels
  4.8 Testing non-IP networks
    4.8.1 Serial networks
Slide 8:
    4.8.2 SCSI networks
    4.8.3 SSA networks
    4.8.4 Heartbeat over disk networks
  4.9 Cluster behavior on other failures
    4.9.1 Hardware components failures
    4.9.2 Rootvg mirror and internal disk failure
    4.9.3 AIX and LVM level errors
    4.9.4 Forced varyon of VGs
  4.10 RSCT verification
  4.11 Review
    4.11.1 Sample questions
Chapter 5. Post implementation and administration
  5.1 Using C-SPOC
    5.1.1 C-SPOC overview
    5.1.2 C-SPOC enhancements in HACMP V5.1
    5.1.3 Configuration changes: DARE
    5.1.4 Managing users and groups
    5.1.5 Managing cluster storage using C-SPOC LVM
  5.2 Managing resource groups
    5.2.1 Resource group movement
    5.2.2 Priority Override Location (POL)
    5.2.3 Changing resource groups
    5.2.4 Creating a new resource group
    5.2.5 Bringing a resource group online
    5.2.6 Bringing a resource group offline
    5.2.7 Moving a resource group between nodes
  5.3 Problem determination
    5.3.1 HACMP logs
    5.3.2 Snapshots
  5.4 Event and error management
    5.4.1 Pre- and post-event considerations
    5.4.2 Custom events
    5.4.3 Error notification
    5.4.4 Recovery from cluster errors
    5.4.5 Recovery from failed DARE
  5.5 Review
    5.5.1 Sample questions
Chapter 6. HACMP V5.2
  6.1 Overview
  6.2 New features and changes
    6.2.1 Two node configuration assistant
Slide 9:
    6.2.2 Automated test tool
    6.2.3 Custom (only) resource groups
    6.2.4 Cluster configuration auto correction
    6.2.5 Cluster file collections
    6.2.6 Automatic cluster verification
    6.2.7 Web-based SMIT management
    6.2.8 Resource group dependencies
    6.2.9 Application monitoring changes
    6.2.10 Enhanced online planning worksheets
    6.2.11 User password management
    6.2.12 HACMP Smart Assist for WebSphere Application Server (SAW)
    6.2.13 New security features
    6.2.14 Dynamic LPARs support and CUoD support
    6.2.15 Cluster lock manager not available anymore
    6.2.16 Cross-site LVM mirroring
    6.2.17 RMC replaces Event Management (EM)
Appendix A. ITSO sample cluster
  Cluster hardware
  Cluster installed software
  Cluster storage
  Cluster networking environment
  Application scripts
  Answers to the quizzes
    Chapter 2 quiz
    Chapter 3 quiz
    Chapter 4 quiz
    Chapter 5 quiz
Abbreviations and acronyms
Related publications
  IBM Redbooks
  Other publications
  Online resources
  How to get IBM Redbooks
  Help from IBM
Index
Slide 11: Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
Slide 12: Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX 5L™, AIX®, DB2®, Enterprise Storage Server®, Eserver®, FlashCopy®, FICON®, HACMP™, IBM®, Magstar®, pSeries®, Redbooks™, Redbooks (logo), RS/6000®, Seascape®, Tivoli®, TotalStorage®

The following terms are trademarks of other companies:
Windows and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Other company, product, and service names may be trademarks or service marks of others.
Slide 13: Preface
The AIX® and IBM Eserver® pSeries® Certifications offered through the Professional Certification Program from IBM® are designed to validate the skills required of technical professionals who work in the powerful and often complex environments of AIX and IBM Eserver pSeries. A complete set of professional certifications is available. They include:
- IBM Eserver Certified Specialist - p5 and pSeries Technical Sales Support
- IBM Eserver Certified Specialist - pSeries Cluster 1600 Sales Solutions
- IBM Eserver Certified Specialist - p5 Sales Solutions
- IBM Eserver Certified Specialist - pSeries Administration and Support for AIX 5L™ V5.2
- IBM Eserver Certified Specialist - pSeries AIX System Administration
- IBM Eserver Certified Specialist - pSeries AIX System Support
- IBM Eserver Certified Systems Expert - pSeries Enterprise Technical Support
- IBM Eserver Certified Systems Expert - pSeries Cluster 1600 PSSP V5
- IBM Eserver Certified Systems Expert - pSeries HACMP™ for AIX 5L
- IBM Eserver Certified Specialist - AIX Basic Operations V5
- IBM Eserver Certified Advanced Technical Expert - pSeries and AIX 5L

Each certification is developed by following a thorough and rigorous process to ensure that the exam is applicable to the job role and is a meaningful and appropriate assessment of skill. Subject matter experts who successfully perform the job participate throughout the entire development process. These job holders bring a wealth of experience to the development process, thus making the exams much more meaningful than the typical test, which only captures classroom knowledge. These subject matter experts ensure the exams are relevant to the real world and that the test content is both useful and valid. The result is a certification of value that appropriately measures the skill required to perform the job role.

This IBM Redbook is designed as a study guide for professionals wishing to prepare for the certification exam to achieve IBM Eserver Certified Systems Expert - pSeries HACMP for AIX 5L.
Slide 14: The pSeries HACMP for AIX certification validates the skills required to successfully plan, install, configure, and support an HACMP for AIX cluster installation. The requirements for this include a working knowledge of the following:
- Hardware options supported for use in a cluster, along with the considerations that affect the choices made
- AIX parameters that are affected by an HACMP installation, and their correct settings
- The cluster and resource configuration process, including how to choose the best resource configuration for a customer requirement
- Customization of the standard HACMP facilities to satisfy special customer requirements
- Diagnosis and troubleshooting knowledge and skills

This redbook helps AIX professionals seeking a comprehensive and task-oriented guide for developing the knowledge and skills required for the certification. It is designed to provide a combination of theory and practical experience. This redbook will not replace the practical experience you should have, but, when combined with educational activities and experience, should prove to be a very useful preparation guide for the exam. Due to the practical nature of the certification content, this publication can also be used as a deskside reference. So, whether you are planning to take the pSeries HACMP for AIX certification exam, or just want to validate your HACMP skills, this redbook is for you.

For additional information about certification and instructions on how to register for an exam, contact IBM Learning Services or visit our Web site at: http://www.ibm.com/certify

The team that wrote this redbook
This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, Poughkeepsie Center.

Octavian Lascu is a Project Leader at the International Technical Support Organization, Poughkeepsie Center. He writes extensively and teaches IBM classes worldwide on all areas of pSeries clusters and Linux. Before joining the ITSO, Octavian worked in IBM Global Services Romania as a software and hardware Services Manager. He holds a Master's Degree in Electronic Engineering from the Polytechnical Institute in Bucharest and is also an IBM Certified Advanced Technical Expert in AIX/PSSP/HACMP.
Slide 15: He has worked with IBM since 1992.

Dharmesh Parikh is a pSeries Support Specialist in IBM Global Services, India. He has nine years of experience in the IT support and services field. He has worked at IBM for four years. His areas of expertise include RS/6000®, pSeries, AIX, HACMP, and storage. He is also certified on AIX 5L V5 for the pSeries.

David Pontes is an Advisory I/T Specialist in IBM Brazil. He has been working for five years with the IBM Integrated Technology Services team in customer support and services delivery for AIX and HACMP. He is a Certified AIX and HACMP Specialist. His areas of expertise include RS/6000, pSeries, AIX, HACMP, and ESS. He also has some knowledge of Tivoli® Storage Manager implementation and support.

Liviu Rosca is a pSeries Support Specialist at IBM Global Services, Romania. He has been working for two years for IBM Integrated Technology Services, providing customer support for pSeries, AIX, HACMP, and WVR. His areas of expertise include pSeries, AIX, HACMP, and networking. He is an IBM Certified AIX and HACMP System Administrator and CCNP. He teaches AIX and HACMP classes.

Christian Allan Schmidt is an Advisory IT Specialist working for the Strategic Outsourcing Division (SSO) of IBM Global Services in Denmark. He has worked for IBM for 11 years, specializing in Cluster 1600, providing support and second-level support for AIX and Cluster 1600 configurations for the IBM Software Delivery and Fulfillment in Copenhagen. His areas of expertise include designing and implementing highly available Cluster 1600 solutions using GPFS, AIX, PSSP, CSM, security, system tuning, and performance. Christian is an IBM Certified Specialist in SP and System Support. He is also the co-author of three other Redbooks™.
Thanks to the following people for their contributions to this project: Dino Quintero International Technical Support Organization, Poughkeepsie Center David Truong IBM Dallas Chris Algozzine IBM Poughkeepsie Michael K. Coffey IBM Poughkeepsie
Slide 16: Paul Moyer IBM Poughkeepsie Gheorghe Olteanu IBM Romania Darin Hartman IBM Austin Become a published author Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners and/or customers. Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability. Find out more about the residency program, browse the residency index, and apply online at: ibm.com/redbooks/residencies.html Comments welcome Your comments are important to us! We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways: Use the online Contact us review redbook form found at: ibm.com/redbooks Send your comments in an Internet note to: redbook@us.ibm.com Mail your comments to: IBM Corporation, International Technical Support Organization Dept. JN9B Mail Station P099 2455 South Road Poughkeepsie, NY 12601-5400
Slide 17: Chapter 1. Introduction This chapter contains an introduction to the IBM High Availability Cluster Multi-Processing (HACMP) for AIX product line, and the concepts on which IBM’s high availability products are based. The following topics are discussed:
– What is HACMP?
– History and evolution
– High availability concepts
– High availability vs. fault tolerance
© Copyright IBM Corp. 2004. All rights reserved.
Slide 18: 1.1 What is HACMP? Before we explain what HACMP is, we have to define the concept of high availability. High availability In today’s complex environments, providing continuous service for applications is a key component of a successful IT implementation. High availability is one of the components that contributes to providing continuous service for the application clients, by masking or eliminating both planned and unplanned systems and application downtime. This is achieved through the elimination of hardware and software single points of failure (SPOFs). A high availability solution will ensure that the failure of any component of the solution, either hardware, software, or system management, will not cause the application and its data to be unavailable to the user. High availability solutions should eliminate single points of failure (SPOFs) through appropriate design, planning, selection of hardware, configuration of software, and a carefully controlled change management discipline. Downtime Downtime is the period during which an application is not available to serve its clients. We can classify downtime as:
Planned:
– Hardware upgrades
– Repairs
– Software updates/upgrades
– Backups (offline backups)
– Testing (periodic testing is required for cluster validation)
– Development
Unplanned:
– Administrator errors
– Application failures
– Hardware failures
– Environmental disasters
Slide 19: IBM’s high availability solution for AIX, High Availability Cluster Multi-Processing, is based on IBM’s well-proven clustering technology, and consists of two components: High availability: The process of ensuring an application is available for use through the use of duplicated and/or shared resources. Cluster multi-processing: Multiple applications running on the same nodes with shared or concurrent access to the data. A high availability solution based on HACMP provides automated failure detection, diagnosis, application recovery, and node reintegration. With an appropriate application, HACMP can also provide concurrent access to the data for parallel processing applications, thus offering excellent horizontal scalability. A typical HACMP environment is shown in Figure 1-1.
[Figure 1-1 HACMP cluster: two pSeries nodes (Node A and Node B) connected by an Ethernet network and a serial network, each shown with hdisk1, hdisk2, and hdisk3; resource groups Application_01 and Application_02 each contain volume groups and file systems.]
Slide 20: 1.1.1 History and evolution IBM High Availability Cluster Multi-Processing goes back to the early 1990s. HACMP development started in 1990 to provide a high availability solution for applications running on RS/6000 servers. We do not provide information about the very early releases, since those releases were neither supported nor in use at the time this book was developed; instead, we provide highlights of the most recent versions.
HACMP V4.2.2 Along with HACMP Classic (HAS), this version introduced the enhanced scalability version (ES) based on RSCT (Reliable Scalable Cluster Technology) topology, group, and event management services, derived from PSSP (Parallel Systems Support Program).
HACMP V4.3.X This version introduced, among other aspects, 32 node support for HACMP/ES, C-SPOC enhancements, ATM network support, HACMP Task guides (GUI for simplifying cluster configuration), multiple pre- and post-event scripts, FDDI MAC address takeover, monitoring and administration support enhancements, node by node migration, and AIX fast connect support.
HACMP V4.4.X New items in this version are integration with Tivoli, application monitoring, cascading without fallback, C-SPOC enhancements, improved migration support, integration of HA-NFS functionality, and soft copy documentation (HTML and PDF).
HACMP V4.5 This version requires AIX 5L and adds an automated configuration discovery feature, multiple service labels on each network adapter (through the use of IP aliasing), persistent IP address support, 64-bit-capable APIs, and monitoring and recovery from loss of volume group quorum.
HACMP V5.1 This is the version that introduced major changes, from configuration simplification and performance enhancements to changing HACMP terminology.
Some of the important new features in HACMP V5.1 were:
– SMIT “Standard” and “Extended” configuration paths (procedures)
– Automated configuration discovery
– Custom resource groups
– Non-IP networks based on heartbeating over disks
Slide 21:
– Fast disk takeover
– Forced varyon of volume groups
– Heartbeating over IP aliases
– HACMP “classic” (HAS) has been dropped; now there is only HACMP/ES, based on IBM Reliable Scalable Cluster Technology
– Improved security, by using the cluster communication daemon (eliminates the need to use the standard AIX “r” commands, and with it the need for the /.rhosts file)
– Improved performance for cluster customization and synchronization
– Normalization of HACMP terminology
– Simplification of configuration and maintenance
– Online Planning Worksheets enhancements
– Heartbeat monitoring of service IP addresses/labels on takeover node(s)
– Various C-SPOC enhancements
– GPFS integration
– Cluster verification enhancements
– Improved resource group management
HACMP V5.2 Starting July 2004, the new HACMP V5.2 added more improvements in management, configuration simplification, automation, and performance areas. Here is a summary of the improvements in HACMP V5.2:
– Two-Node Configuration Assistant, with both SMIT menus and a Java™ interface (in addition to the SMIT “Standard” and “Extended” configuration paths).
– File collections.
– User password management.
– Classic resource groups are not used anymore, having been replaced by custom resource groups.
– Automated test procedures.
Slide 22:
– Automatic cluster verification.
– Improved Online Planning Worksheets (OLPW) can now import a configuration from an existing HACMP cluster.
– Event Management (EM) has been replaced by the Resource Monitoring and Control (RMC) subsystem (standard in AIX).
– Enhanced security.
– Resource group dependencies.
– Self-healing clusters.
Note: At the time this redbook was developed, both HACMP V5.1 and V5.2 were available. The certification exam only contains HACMP V5.1 topics.
1.1.2 High availability concepts What needs to be protected? Ultimately, the goal of any IT solution in a critical environment is to provide continuous service and data protection. High availability is just one building block in achieving the continuous operation goal. It is based on the availability of the hardware, software (operating system and its components), application, and network components. For a high availability solution you need:
– Redundant servers
– Redundant networks
– Redundant network adapters
– Monitoring
– Failure detection
– Failure diagnosis
– Automated fallover
– Automated reintegration
The main objective of HACMP is to eliminate single points of failure (SPOFs) (see Table 1-1).
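The redundancy requirements listed above lend themselves to a mechanical check. The following Python fragment is only an illustrative sketch and is not part of HACMP: the Node class, the find_spofs function, and the node and network names are hypothetical. It flags a cluster description that has fewer than two nodes, a node with a single adapter on a network, or no non-IP network.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical description of one cluster node (not an HACMP object)."""
    name: str
    # Maps an IP network name to the number of adapters this node has on it.
    adapters_per_network: dict = field(default_factory=dict)


def find_spofs(nodes, has_non_ip_network):
    """Return human-readable findings for obvious single points of failure."""
    findings = []
    if len(nodes) < 2:
        findings.append("only one node: no fallover target")

    ip_networks = set()
    for node in nodes:
        ip_networks.update(node.adapters_per_network)
        for network, count in node.adapters_per_network.items():
            if count < 2:
                findings.append(f"{node.name}: single adapter on {network}")

    # Count the non-IP network as one more path between the nodes.
    total_networks = len(ip_networks) + (1 if has_non_ip_network else 0)
    if total_networks < 2:
        findings.append("only one network connects the nodes")
    if not has_non_ip_network:
        findings.append("no non-IP network to back up TCP/IP")
    return findings


cluster = [
    Node("node_a", {"net_ether_01": 2}),
    Node("node_b", {"net_ether_01": 1}),
]
for finding in find_spofs(cluster, has_non_ip_network=True):
    print(finding)
```

A real assessment would also cover power, disk adapters, disks, and the application itself, which this sketch leaves out.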
Slide 23: Table 1-1 Single point of failure
Cluster object      Eliminated as a single point of failure by
Node (servers)      Multiple nodes
Power supply        Multiple circuits and/or power supplies
Network adapter     Redundant network adapters
Network             Multiple networks to connect nodes
TCP/IP subsystem    Non-IP networks to back up TCP/IP
Disk adapter        Redundant disk adapters
Disk                Redundant hardware and disk mirroring or RAID technology
Application         Configuring application monitoring and backup node(s) to acquire the application engine and data
Each of the items listed in the Cluster object column of Table 1-1 is a physical or logical component that, if it fails, will result in the application being unavailable for serving clients. 1.1.3 High availability vs. fault tolerance The systems for the detection and handling of hardware and software failures fall into two groups:
– Fault-tolerant systems
– High availability systems
Fault-tolerant systems The systems provided with fault tolerance are designed to operate virtually without interruption, regardless of the failure that may occur (except perhaps for a complete site down due to a natural disaster). In such systems, ALL components are at least duplicated for either software or hardware. Thus, CPUs, memory, and disks have a special design and provide continuous service, even if one sub-component fails. Such systems are very expensive and extremely specialized. Implementing a fault-tolerant solution requires a lot of effort and a high degree of customization for all system components.
Slide 24: In places where no downtime is acceptable (life support and so on), fault-tolerant equipment and solutions are required. High availability systems The systems configured for high availability are a combination of hardware and software components configured in such a way as to ensure automated recovery in case of failure with a minimal acceptable downtime. In such systems, the software involved detects problems in the environment and then transfers the application to another machine, which takes over the identity of the original machine (node). Thus, it is very important to eliminate all single points of failure (SPOFs) in the environment. For example, if the machine has only one network connection, a second network interface should be provided in the same node to take over in case the primary adapter providing the service fails. Another important issue is to protect the data by mirroring and placing it on shared disk areas accessible from any machine in the cluster. The HACMP (High Availability Cluster Multi-Processing) software provides the framework and a set of tools for integrating applications in a highly available system. Applications to be integrated in an HACMP cluster require a fair amount of customization, not at the application level, but rather at the HACMP and AIX platform level. HACMP is a flexible platform that allows integration of generic applications running on the AIX platform, providing highly available systems at a reasonable cost. 1.1.4 High availability solutions The high availability (HA) solutions can provide many advantages compared to other solutions. In Table 1-2, we describe some HA solutions and their characteristics.
Table 1-2 Types of HA solutions
Solution                      Downtime                          Data availability
Standalone                    Couple of days                    Last full backup
Enhanced standalone           Couple of hours                   Last transaction
High availability clusters    Depends (usually three minutes)   Last transaction
Fault-tolerant computers      Never stop                        No loss of data
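To put the downtime figures in Table 1-2 into perspective, they can be converted into a yearly availability percentage. The Python sketch below is illustrative only and rests on assumptions that are not in the table: exactly one outage per year, "a couple" read as two, and a 365-day year. The availability function and the dictionary are hypothetical names.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def availability(downtime_minutes_per_year):
    """Fraction of the year during which the application is available."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


# Assumption: one outage per year, with "a couple" read as two.
DOWNTIME_MINUTES = {
    "Standalone": 2 * 24 * 60,          # couple of days
    "Enhanced standalone": 2 * 60,      # couple of hours
    "High availability cluster": 3,     # usually three minutes
    "Fault-tolerant computer": 0,       # never stops
}

for solution, minutes in DOWNTIME_MINUTES.items():
    print(f"{solution}: {availability(minutes) * 100:.4f}% available")
```

Under these assumptions a standalone system is available roughly 99.45 percent of the year, while a cluster with a three-minute fallover stays above 99.999 percent. Real figures depend on how often failures actually occur.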
Slide 25: High availability solutions offer the following benefits:
– Standard components
– Can be used with the existing hardware
– Works with just about any application
– Works with a wide range of disk and network types
– Excellent availability at reasonable cost
IBM’s high availability solution for the IBM Eserver pSeries offers some distinct benefits. Such benefits include:
– Proven solution (more than 14 years of product development)
– Flexibility (virtually any application running on a standalone AIX system can be protected with HACMP)
– Use of “off-the-shelf” hardware components
– Proven commitment to supporting our customers
Considerations for providing high availability solutions include:
– Thorough design and detailed planning
– Elimination of single points of failure
– Selection of appropriate hardware
– Correct implementation (no “shortcuts”)
– Disciplined system administration practices
– Documented operational procedures
– Comprehensive testing
1.2 HACMP concepts The basic concepts of HACMP can be classified as follows: Cluster topology Contains the basic cluster components: nodes, networks, communication interfaces, communication devices, and communication adapters. Cluster resources Entities that are being made highly available (for example, file systems, raw devices, service IP labels, and applications). Resources are grouped together in resource groups (RGs), which HACMP keeps highly available as a single entity.
Slide 26: Resource groups can be available from a single node or, in the case of concurrent applications, available simultaneously from multiple nodes. Fallover Represents the movement of a resource group from one active node to another node (backup node) in response to a failure on that active node. Fallback Represents the movement of a resource group back from the backup node to the previous node, when it becomes available. This movement is typically in response to the reintegration of the previously failed node. 1.2.1 HACMP terminology To understand the correct functionality and utilization of HACMP, it is necessary to know some important terms: Cluster Loosely-coupled collection of independent systems (nodes) or LPARs organized into a network for the purpose of sharing resources and communicating with each other. HACMP defines relationships among cooperating systems where peer cluster nodes provide the services offered by a cluster node should that node be unable to do so. These individual nodes are together responsible for maintaining the functionality of one or more applications in case of a failure of any cluster component. Node An IBM Eserver pSeries machine (or LPAR) running AIX and HACMP that is defined as part of a cluster. Each node has a collection of resources (disks, file systems, IP address(es), and applications) that can be transferred to another node in the cluster in case the node fails. Resource Resources are logical components of the cluster configuration that can be moved from one node to another. All the logical resources necessary to provide a Highly Available application or service are grouped together in a resource group (RG). The components in a resource group move together from one node to another in the event of a node failure. A cluster may have more than one resource group, thus allowing for efficient use of the cluster nodes (thus the “Multi-Processing” in HACMP).
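The relationship between nodes, resource groups, fallover, and fallback can be illustrated with a small model. The Python sketch below is a teaching aid under stated assumptions, not HACMP's actual logic: the ResourceGroup and Cluster classes, the node_priority list, and the fallback_enabled flag are hypothetical, and the placement rule (highest-priority node that is up) is a simplification.

```python
from dataclasses import dataclass


@dataclass
class ResourceGroup:
    """Hypothetical model of a resource group that moves as one unit."""
    name: str
    node_priority: list            # home node first, then backup nodes
    current_node: str = None
    fallback_enabled: bool = True  # False models "without fallback" behavior


class Cluster:
    def __init__(self, nodes):
        self.up = set(nodes)
        self.groups = []

    def add_group(self, rg):
        self.groups.append(rg)
        self._place(rg)

    def _place(self, rg):
        # Pick the highest-priority node that is currently up, if any.
        rg.current_node = next(
            (node for node in rg.node_priority if node in self.up), None
        )

    def node_down(self, node):
        self.up.discard(node)
        for rg in self.groups:
            if rg.current_node == node:
                self._place(rg)  # fallover to the next available node

    def node_up(self, node):
        self.up.add(node)
        for rg in self.groups:
            # Offline groups are restarted; online groups move back only
            # if fallback is enabled for them.
            if rg.current_node is None or rg.fallback_enabled:
                self._place(rg)


cluster = Cluster(["node_a", "node_b"])
app = ResourceGroup("Application_01", ["node_a", "node_b"])
cluster.add_group(app)
cluster.node_down("node_a")
print(app.current_node)   # the group has fallen over to node_b
cluster.node_up("node_a")
print(app.current_node)   # the group has fallen back to node_a
```

Setting fallback_enabled to False keeps the group on the backup node after the home node reintegrates, which avoids a second interruption of the application.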
Slide 27: Takeover The operation of transferring resources between nodes inside the cluster. If one node fails due to a hardware problem or a crash of AIX, its resources and applications are moved to another node. Clients A client is a system that can access the application running on the cluster nodes over a local area network. Clients run a client application that connects to the server (node) where the application runs. 1.3 HACMP/XD (extended distance) The High Availability Cluster Multi-Processing for AIX (HACMP) base software product addresses part of the continuous operation problem. It addresses recovery from the failure of a computer, an adapter, or a local area network within a computing complex at a single site. A typical HACMP/XD High Availability Geographic Cluster (HAGEO) is presented in Figure 1-2.
[Figure 1-2 Typical HACMP/XD HAGEO configuration: nodes ICAR, ULISE, and AJAX distributed across the sites Paris and Bonn, connected by the networks GEO_NET1 and PUB_NET1 and by a serial network (via modem); resource group GEO_RG.]
Slide 28: For protecting an application in case of a major disaster (site failure), additional software is needed. HAGEO provides:
Ability to configure a cluster with geographically separate sites. HAGEO extends HACMP to encompass two geographically distant data centers or sites. This extension prevents an individual site from being a single point of failure within the cluster. The geo-mirroring process supplies each site with an updated copy of essential data. Either site can run key applications, ensuring that mission-critical computing resources remain continuously available at a geographically separate site if a failure or disaster disables one site.
Automatic failure detection and notification. HAGEO works with HACMP to provide automatic detection of a site or geographic network failure. It initiates the recovery process and notifies the system administrator about all failures it detects and actions it takes in response.
Automatic fallover. HAGEO includes event scripts to handle recovery from a site or geographic network failure. These scripts are integrated with the standard HACMP event scripts. You can customize the behavior for your configuration by adding pre- or post-event scripts, just as you can for HACMP.
Fast recovery from a disaster. HAGEO also provides fast recovery of data and applications at the operable site. The geo-mirroring process ensures that the data is already available at the second site when a disaster strikes. Recovery typically takes minutes, not including the application recovery time.
Automatic resynchronization of data during site recovery. HAGEO handles the resynchronization of the mirrors on each site as an integral part of the site recovery process. The nodes at the rejoining site are automatically updated with the data received while the site was in failure.
Reliable data integrity and consistency.
HAGEO’s geographic mirroring and geographic messaging components ensure that if a site fails, the surviving site’s data is consistent with the failed site’s data.
Slide 29: When the failed site reintegrates into the cluster, HAGEO updates that site with the current data from the operable site, once again ensuring data consistency. Flexible, scalable configurations. HAGEO software supports a wide range of configurations, allowing you to configure the disaster recovery solution unique to your needs. You can have up to eight nodes in an HAGEO cluster, with varying numbers of nodes at each site. HAGEO is file system and database independent, since the geo-mirroring device behaves the same as the disk devices it supports. Because the mirroring is transparent, applications configured to use geo-mirroring do not have to be modified in any way. 1.3.1 HACMP/XD: HAGEO components The software has three significant functions: GeoMirror: Consists of a logical device and a pseudo device driver that mirrors, at a second site, the data entered at one site. TCP/IP is used as a transport for mirrored data. GeoMirror can be used in synchronous or asynchronous mode, depending on the communication bandwidth between sites and the application transaction volume (which determines the amount of changed data). GeoMessage: Provides reliable delivery of data and messages between GeoMirror devices at the two sites. Geographic topology: Provides the logic for integrating the geo-mirroring facilities with HACMP facilities to provide automatic failure detection and recovery from events that affect entire sites. Recovering from disaster When a disaster causes a site failure, the Cluster Manager on nodes at the surviving site detects the situation quickly and takes action to keep geo-mirrored applications available. Likewise, if the cluster is partitioned due to global geographic network failure, then the Cluster Manager on the site configured as non-dominant will bring itself down in order to avoid data divergence.
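The two recovery cases described above (site failure versus cluster partition) can be expressed as a small decision function. This Python sketch is an illustration only and does not reproduce the Cluster Manager's logic: the site_action function, its three boolean inputs, and the returned action strings are hypothetical. In particular, it assumes a site can tell a dead peer from a partition because the peer is still visible over some backup link.

```python
def site_action(is_dominant, geo_network_up, peer_seen_on_backup_link):
    """Decide what one site does, as seen from that site (hypothetical model).

    is_dominant: whether this site is the one configured as dominant.
    geo_network_up: whether the geographic network to the peer site works.
    peer_seen_on_backup_link: assumption that the peer can still be
        detected over another path when the geographic network is down.
    """
    if geo_network_up:
        return "continue"      # normal operation, mirroring is intact
    if not peer_seen_on_backup_link:
        return "take_over"     # peer site has failed: keep applications available
    # Partition: both sites are alive but cannot mirror data to each other.
    # Only the dominant site keeps running, which avoids data divergence.
    return "continue" if is_dominant else "shut_down"


print(site_action(is_dominant=False, geo_network_up=False,
                  peer_seen_on_backup_link=True))
```

The point of the dominance setting is that both sites reach complementary decisions without being able to talk to each other over the failed network.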
Slide 30: 1.3.2 HACMP/XD: HAGEO basic configurations You can configure an HAGEO cluster in any of the configurations supported by the HACMP base software. These include standby, one-sided takeover, mutual takeover, and concurrent access configurations. Standby configurations The standby configuration is a traditional redundant hardware configuration where one or more nodes in the cluster stand idle until a server node fails. In HAGEO, this translates to having an idle site. A site is not completely idle, since it may also be involved in the geo-mirroring process, but nodes at this site do not perform application work. Takeover configurations In a takeover configuration, all nodes are processing; no idle nodes exist. Configurations include:
– Intrasite (local) takeover
– Remote one-sided takeover
– Remote mutual takeover
Concurrent configurations In a concurrent access configuration, all nodes at one site have simultaneous access to the concurrent volume group and own the same disk resources. The other site is set up the same way. If a node leaves the site, availability of the resources is not affected, since other nodes have the concurrent volume group varied on. If a site fails, the other site offers concurrent access on nodes at that site. A concurrent application can be accessed by all nodes in the cluster. The HACMP Cluster Lock Manager must be running on all nodes in the cluster. Not all databases can be used for concurrent access that involves nodes across the geography. 1.3.3 HACMP/XD PPRC integration feature This feature, introduced simultaneously in HACMP V4.5 PTF5 and HACMP V5.1, provides automated site fallover and activation of remote copies of application data in an environment where the IBM Enterprise Storage Server® (ESS) is used in both sites and the Peer to Peer Remote Copy (PPRC) facility provides storage volume mirroring.
Slide 31: In case of primary site failure, data should be available for use at the secondary site (replicated via PPRC). The data copy in the secondary site must be activated in order to be used for processing. The HACMP/XD PPRC integration feature provides automated copy split in case of primary site failure and automated reintegration when the primary site becomes available. For detailed information, see High Availability Cluster Multi-Processing XD (Extended Distance) V5.1: Concepts and Facilities for HAGEO Technology, SA22-7955.
Slide 33: Chapter 2. Planning and design When planning and designing a high availability cluster, you must follow all customer requirements. You should have a good understanding of the hardware and networking configuration and of the applications to be made highly available. You should also be able to control the behavior of the applications in a failure situation. Knowing the behavior of the application in a failure situation is important to controlling how the cluster will react in such a situation. The information needed for planning and implementing a cluster should cover applications, environment, hardware, networks, storage, and also support and change procedures. This chapter describes the following HACMP cluster topics:
– Node sizing considerations
– Cluster hardware planning
– Software planning
– Storage planning
– Disaster recovery planning
Slide 34: 2.1 Planning considerations When planning a high availability cluster, you should consider the sizing of the nodes, storage, network, and so on, to provide the necessary resources for the applications to run properly, even in a takeover situation. 2.1.1 Sizing: choosing the nodes in the cluster Before you start the implementation of the cluster, you should know how many nodes are required, and the type of the nodes that should be used. The type of nodes to be used is important in terms of the resources required by the applications. Sizing of the nodes should cover the following aspects:
– CPU (number of CPUs and speed)
– Amount of random access memory (RAM) in each node
– Disk storage (internal)
– Number of communication and disk adapters in each node
– Node reliability
The number of nodes in the cluster depends on the number of applications to be made highly available, and also on the degree of availability desired. Having more than one spare node for each application in the cluster increases the overall availability of the applications. Note: The maximum number of nodes in an HACMP V5.1 cluster is 32. HACMP V5.1 supports a variety of nodes, ranging from desktop systems to high-end servers. SP nodes and Logical Partitions (LPARs) are supported as well. For further information, refer to the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02. The cluster resource sharing is based on the application requirements. Nodes that perform tasks that are not directly related to the applications to be made highly available and do not need to share resources with the application nodes should be configured in separate clusters for easier implementation and administration. All nodes should provide sufficient resources (CPU, memory, and adapters) to sustain execution of all the designated applications in a fail-over situation (to take over the resources from a failing node).
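The requirement that every node can sustain its own applications plus those it takes over can be checked with simple arithmetic. The Python sketch below is a planning aid under stated assumptions, not an IBM sizing tool: the NodeSizing class, the undersized_takeovers function, the single designated takeover node per node, and all capacity figures are hypothetical, and adapters are left out.

```python
from dataclasses import dataclass


@dataclass
class NodeSizing:
    """Hypothetical sizing record for one cluster node."""
    name: str
    cpus: int
    ram_gb: float
    cpu_load: float      # CPUs needed by the applications normally hosted here
    ram_load_gb: float   # RAM needed by the applications normally hosted here
    takeover_node: str   # node that acquires this node's applications on failure


def undersized_takeovers(nodes):
    """Return (failed_node, takeover_node) pairs where the takeover node
    cannot carry its own load plus the failed node's load."""
    by_name = {node.name: node for node in nodes}
    problems = []
    for failed in nodes:
        target = by_name[failed.takeover_node]
        cpu_needed = target.cpu_load + failed.cpu_load
        ram_needed = target.ram_load_gb + failed.ram_load_gb
        if cpu_needed > target.cpus or ram_needed > target.ram_gb:
            problems.append((failed.name, target.name))
    return problems


plan = [
    NodeSizing("node_a", cpus=4, ram_gb=16, cpu_load=2, ram_load_gb=8,
               takeover_node="node_b"),
    NodeSizing("node_b", cpus=4, ram_gb=8, cpu_load=2, ram_load_gb=6,
               takeover_node="node_a"),
]
print(undersized_takeovers(plan))
```

In this made-up plan, node_b has enough CPUs but too little memory to absorb node_a's applications, so the pair (node_a, node_b) is reported.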
Slide 35: If possible, you should include additional nodes in the cluster, to increase the availability of the cluster; this also provides greater flexibility when performing node failover, reintegration, and maintenance operations. We recommend using cluster nodes with a similar hardware configuration, especially when implementing clusters with applications in mutual takeover or concurrent configurations. This makes it easier to distribute resources and to perform administrative operations (software maintenance and so on). 2.1.2 Sizing: storage considerations In the most commonly used configurations, applications to be made highly available require a shared storage space for application data. The shared storage space is used either for concurrent access, or for making the data available to the application on the takeover node (in a fail-over situation). The storage to be used in a cluster should provide shared access from all designated nodes for each application. The technologies currently supported for HACMP shared storage are SCSI, SSA, and Fibre Channel. The storage configuration should be defined according to application requirements as non-shared (“private”) or shared storage. The private storage may reside on internal disks and is not involved in any takeover activity. Shared storage should provide mechanisms for controlled access, considering the following reasons: Data placed in shared storage must be accessible from whichever node the application may be running at a point in time. In certain cases, the application is running on only one node at a time (non-concurrent), but in some cases, concurrent access to the data must be provided. In a non-concurrent environment, if the shared data is updated by the wrong node, this could result in data corruption. In a concurrent environment, the application should provide its own data access mechanism, since the storage controlled access mechanisms are by-passed by the platform concurrent software (AIX/HACMP). 
2.1.3 Network considerations When you plan the HACMP cluster, the following aspects should be considered:
– IP network topology (routing, switches, and so on)
– IP network performance (speed/bandwidth, latency, and redundancy)
– ATM and/or X.25 network configuration
Slide 36: The IP networks are used to provide client access to the applications running on the nodes in the cluster, as well as for exchanging heartbeat messages between the cluster nodes. In an HACMP cluster, the heartbeat messages are exchanged via IP networks and point-to-point (non-IP) networks. HACMP is designed to provide client access through TCP/IP-based networks, X.25, and ATM networks. 2.2 HACMP cluster planning The cluster planning is perhaps the most important step in implementing a successful configuration. HACMP planning should include the following aspects:
Hardware planning
– Nodes
– Network
– Storage
Software planning
– OS version
– HACMP version
– Application compatibility
Test and maintenance planning
– The test procedures
– Change management
– Administrative operations
Hardware planning The goal in implementing a high availability configuration is to provide highly available service by eliminating single points of failure (hardware, software, and network) and also by masking service interruptions, either planned or unplanned. The decision factors for node planning are: Supported nodes: Machine types, features, supported adapters, power supply (AC, DC, dual power supply vs. single power supply, and so on). Connectivity and cables: Types of cables, length, connectors, model numbers, conduit routing, cable tray capacity requirements, and availability.
Slide 37: 2.2.1 Node configurations HACMP V5.1 supports IBM Eserver pSeries (stand-alone and LPAR mode), IBM SP nodes, as well as legacy RS/6000 servers, in any combination of nodes within a cluster. Nodes must meet the minimum requirements for internal memory, internal disk, number of available I/O slots, and operating system compatibility (AIX version). Items to be considered:
– Internal disk (number of disks, capacities, and whether LVM mirroring is used)
– Shared disk capacity and storage data protection method (RAID and LVM mirroring)
– I/O slot limitations and their effect on creating a single point of failure (SPOF)
– Client access to the cluster (network adapters)
– Other LAN devices (switches, routers, and bridges)
– Redundancy of I/O adapters and subsystems
– Redundancy of power supplies
2.2.2 Network configuration The main objective when planning the cluster networks is to assess the degree of redundancy you need to eliminate network components as potential single points of failure. The following aspects should be considered:
– Network: Nodes connected to multiple physical networks
– For TCP/IP subsystem failure: Non-IP network to help with the decision process
– Network interfaces: Redundant network adapters on each network (to prevent resource group failover in case a single network interface fails)
When planning the cluster network configuration, you must choose the proper combination for the node connectivity:
– Cluster network topology (switches, routers, and so on).
– The combination of IP and non-IP (point-to-point) networks that connect your cluster nodes, and the number of connections from each node to all networks.
– The method for providing highly available service IP addresses: IP address takeover (IPAT) via IP aliases, or IPAT via IP replacement.
Slide 38: For a complete list of nodes and adapters supported in HACMP configuration, refer to the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02; also, check the IBM support Web site at: http://www-1.ibm.com/servers/eserver/pseries/ha/
2.2.3 HACMP networking terminology
Starting with HACMP V5.1, the terminology used to describe HACMP configuration and operation has changed dramatically. The reason for this change is to simplify the overall usage and maintenance of HACMP, and also to align the terminology with the IBM product line. For example, in previous HACMP versions, the term “Adapter”, depending on the context, could have different meanings, which made configuration confusing and difficult.
IP label
The term IP label represents the name associated with a specific IP address, as defined in the name resolution method used on the cluster nodes (DNS or static /etc/hosts). This replaces the host name, which may be confused with the output of the hostname command and may not be associated with any IP address.
In HACMP V5.1, the term Adapter has been replaced as follows:
– Service IP Label / Address: An IP label/address over which a service is provided. It may be bound to a single node or shared by multiple nodes, and is kept highly available by HACMP.
– Communication Interface: A physical interface that supports the TCP/IP protocol, represented by its base IP address.
– Communication Device: A physical device representing one end of a point-to-point non-IP network connection, such as /dev/tty1, /dev/tmssa1, /dev/tmscsi1, and /dev/hdisk1.
– Communication Adapter: An X.25 adapter used to provide a highly available communication link.
Service IP address/label
The service IP address is an IP address used for client access. This service IP address (and its associated label) is monitored by HACMP and is part of a resource group.
Slide 39: There are two types of service IP address (label):
– Shared service IP address (label): An IP address that can be configured on multiple nodes and is part of a resource group that can be active only on one node at a time.
– Node-bound service IP address (label): An IP address that can be configured on only one node (it is not shared by multiple nodes). Typically, this type of service IP address is associated with concurrent resource groups.
The service IP addresses become available when HACMP is started and their associated resource group has an online status.
HACMP communication interfaces
The communication interface definition in HACMP is a logical grouping of the following:
– A logical network interface is the name to which AIX resolves a port (for example, en0) of a physical network adapter.
– A service IP address is an IP address over which services, such as an application, are provided, and over which client nodes communicate.
– A service IP label is a label that maps to the service IP address.
A communication interface refers to IP-based networks and network adapters. The network adapters that are connected to a common physical network are combined into logical networks that are used by HACMP. Each network adapter is capable of hosting several TCP/IP addresses. When configuring a cluster, you define to HACMP the IP addresses that it will monitor (base or boot IP addresses) and the IP addresses that it will keep highly available (the service IP addresses).
Heartbeating in HACMP occurs over communication interfaces. HACMP uses the heartbeating facility of the RSCT subsystem (using UDP) to monitor its network interfaces and IP addresses. HACMP passes the network topology defined and stored in the ODM to RSCT whenever HACMP services are started on that node, and RSCT provides failure notifications to HACMP.
HACMP communication devices
HACMP also provides monitoring of point-to-point non-IP networks. Both ends of a point-to-point network are AIX devices (as defined in the /dev directory). These are the communication devices, and they include serial RS232 connections, target mode SCSI, target mode SSA, and disk heartbeat connections.
Slide 40: The point-to-point networks are also monitored by RSCT, and their status information is used by HACMP to distinguish between node failure and IP network failure. For example, a heartbeat over disk uses the disk device name (for example, /dev/hdisk2) as the device configured to HACMP at each end of the connection. The recommendation for such networks is to have at least one non-IP network defined between any two nodes in the cluster. In the case of disk heartbeat, the recommendation is to have one point-to-point network consisting of one disk per pair of nodes per physical enclosure. One physical disk cannot be used for two point-to-point networks.
Communication adapters and links
You can define the following communication links as resources in HACMP:
– SNA configured over LAN network adapters (ent*)
– SNA configured over X.25 adapter
– Native X.25 links
HACMP manages these links as part of resource groups, thus ensuring highly available communication links. In the event of a physical network interface failure, an X.25 link failure, or a node failure, the highly available communication link is migrated to another available adapter on the same node, or on a takeover node (together with all the resources in the same resource group).
IP aliases
An IP alias is an IP address that is configured on a communication (network) interface in addition to the base IP address. An IP alias is an AIX function that is supported by HACMP. AIX supports multiple IP aliases on each communication interface. Each IP alias on the adapter can be on a separate subnet. AIX also allows IP aliases with different subnet masks to be configured for an interface; this functionality is not yet supported by HACMP. IP aliases are used in HACMP both as service and non-service addresses for IP address takeover, as well as for heartbeat configuration.
Network interface functions
For IP networks, it is recommended that you configure more than one communication interface per node per network. Each communication interface has a specific role, depending on the HACMP cluster status.
Slide 41: Service Interface
A service interface is a communications interface configured with one or more service IP addresses (labels). Depending on the IP address takeover (IPAT) method defined for each network, the service IP address will be added on top of the base IP address (IPAT via aliasing), or will replace the base (boot) IP address of the communication interface. This interface is used for providing access to the application(s) running on that node. The service IP address is monitored by HACMP via RSCT heartbeat.
Boot Interface
This is a communication interface represented by its base (boot) IP address, as defined in an AIX configuration. If heartbeating over IP aliases is used, this IP address will not be monitored by HACMP, but the communication interface will be monitored via the IP alias assigned by HACMP at startup time. No client traffic is carried over the boot interface; however, if a service interface fails, HACMP will move the service IP address(es) onto a non-service interface. If a node fails, another interface on the takeover node will configure the service IP address when performing a resource group fallover.
Note: A node can have from zero to seven non-service interfaces for each network. Using multiple non-service interfaces on the same network eliminates the communication interface as a single point of failure.
Persistent Node IP Label
A persistent node IP label is an IP alias that can be assigned to a specific node on a cluster network. A persistent node IP label:
– Is node-bound (always stays on the same node).
– Can coexist on a network adapter that already has a service or non-service IP label defined.
– Does not require installation of an additional physical network adapter on that node.
– Is not part of any resource group.
Assigning a persistent node IP label provides a node-bound IP address, and is useful for administrative purposes, since connecting to a persistent node IP label will always identify that particular cluster node, even if HACMP services are not started on that node.
Slide 42: Note: It is possible to configure one persistent node IP label (address) per network per node. For example, if you have a node connected to two networks defined in HACMP, that node can be identified via two persistent IP labels (addresses), one for each network.
The persistent IP labels are defined in the HACMP configuration, and they become available the first time HACMP is started on each node. Once configured, the persistent IP labels (addresses) will remain available on the adapter they have been configured on, even if HACMP is stopped on the node(s), or the nodes are rebooted.
The persistent node IP labels can be created on the following types of IP-based networks:
– Ethernet
– Token Ring
– FDDI
– ATM LAN Emulator
Restriction: It is not possible to configure a persistent node IP label on the SP Switch, on ATM Classical IP, or on non-IP networks.
The persistent IP label behavior is the following:
– If a network adapter that has a service IP label configured fails, and there is also a persistent label defined on this network adapter, then the persistent IP label (address) is moved together with the service IP label (address) over to the same non-service interface.
– If all network adapters on the cluster network on a specified node fail, then the persistent node IP label becomes unavailable. A persistent node IP label always remains on the same network, and on the same node; it does not move between the nodes in the cluster.
For more information, see 3.4, “Configuring cluster topology”.
IP aliases used for heartbeat
These IP addresses are allocated from a pool of private, non-routable addresses, and are used to monitor the communication interfaces without the need to change their base (boot) IP address. This is useful when it is not desirable to change the base IP addresses (as they are defined in AIX) of the network adapters on each node, and those addresses do not conform to the HACMP requirements (they are in the same subnet, so the network adapters cannot be monitored). For this purpose, HACMP provides heartbeat over IP aliases.
Slide 43: 2.2.4 Network types
In HACMP, the term “network” is used to define a logical entity that groups the communication interfaces and devices used for communication between the nodes in the cluster, and for client access. The networks in HACMP can be defined as IP networks and non-IP networks. Both IP and non-IP networks are used to exchange heartbeat (“keep alive”) messages between the nodes. In this way, HACMP maintains information about the status of the cluster nodes and their respective communication interfaces and devices.
The IP network types supported in HACMP V5.1 are:
– Ethernet (ether)
– Token ring (token)
– FDDI (fddi)
– SP Switch and SP Switch2 (hps)
– ATM (atm)
The following IP network types are not supported:
– Serial Optical Channel Converter (SOCC)
– Serial Line IP (SLIP)
– Fibre Channel Switch (FCS)
– 802.3
– IBM High Performance Switch (HPS)
The non-IP networks are point-to-point connections between two cluster nodes, and are used by HACMP for control messages and heartbeat traffic. These networks provide an additional protection level for the HACMP cluster, in case the IP networks (or the TCP/IP subsystem on the nodes) fail.
The following devices are supported for non-IP (device-based) networks in HACMP:
– Target mode SCSI (tmscsi)
– Target mode SSA (tmssa)
– Disk heartbeat (diskhb)
– Serial RS232
Slide 44: Note: HACMP now supports Ethernet aggregated (Etherchannel) communication interfaces for IP address takeover in both AIX 5L V5.1 and AIX 5L V5.2. Etherchannel is not supported for:
– Hardware address takeover
– PCI hot plug
Also, in its current release, HACMP does not support the AIX Virtual IP facility (VIPA) or IPv6.
2.2.5 Choosing the IP address takeover (IPAT) method
One of the key decisions to be made when implementing a cluster is the behavior of the resource groups and the service IP address(es) associated with them. Since HACMP is most often used to protect stand-alone, non-concurrent applications, you must choose the method to be used for providing highly available service IP addresses. When an application is started or moved to another node together with its associated resource group, the service IP address can be configured in two ways:
– By replacing the base (boot-time) IP address of a communication interface; this method is known as IP address takeover (IPAT) via IP replacement.
– By configuring one communication interface with an additional IP address on top of the existing one; this method is known as IP address takeover via IP aliasing.
The default IPAT method in HACMP V5.1 is via aliasing (IPAT via aliasing). To change this default behavior, the network properties must be changed using HACMP extended configuration menus.
IP address takeover
IP address takeover is a mechanism for recovering a service IP label by moving it to another physical network adapter on another node, when the initial physical network adapter fails. IPAT ensures that an IP address (label) over which services are provided to the client nodes remains available.
Slide 45: IPAT and service IP labels
We can explain the two methods of IPAT and how they will control the service IP label as follows:
IP address takeover via IP aliases
The service IP address/label is aliased onto an existing communication interface, without changing (replacing) the base address of the interface. HACMP uses the ifconfig command to perform this operation.
Note: In this configuration, all base (boot) IP addresses/labels defined on the nodes must be configured on different subnets, and also different from the service IP addresses (labels). This method also saves hardware, but requires additional subnets. See Figure 2-1.
[Figure 2-1 IPAT via IP aliases. Two nodes (node1, node2), each with interfaces en0 and en1, netmask 255.255.255.0. en0 boot IP labels: 192.168.100.31 and 192.168.100.32; en1 boot IP labels: 172.16.100.31 and 172.16.100.32; service IP labels: 192.168.11.131 and 192.168.11.132. Note: All IP addresses/labels must be on different subnets.]
HACMP supports IP Address Takeover on different types of network using the IP aliasing network capabilities of AIX. IPAT via IP Aliases can use the gratuitous ARP capabilities on certain types of networks. IPAT via IP aliasing allows a single network adapter to support more than one service IP address (label). Therefore, the same node can host multiple resource groups at the same time, without limiting the number of resource groups to the number of available communication interfaces.
IPAT via IP aliases provides the following advantages over IPAT via IP replacement:
– IP address takeover via IP aliases is faster than IPAT via IP replacement, because replacing the IP address takes considerably longer than adding an IP alias onto the same interface.
– IP aliasing allows the co-existence of multiple service labels on the same network interface, so you can use fewer physical network interface cards in your cluster.
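The subnet rule stated in the note for IPAT via IP aliases can be checked mechanically. The following is a minimal Python sketch, not an HACMP utility: the function names are hypothetical, and the per-node reading of the rule (on each node, every boot address is on its own subnet, and no service address shares a subnet with a boot address) is an assumption made for illustration. The sample addresses are the ones that appear in Figure 2-1.

```python
import ipaddress
from itertools import combinations


def subnet_of(address, netmask):
    """Return the IPv4 network that contains address under netmask."""
    return ipaddress.ip_network(f"{address}/{netmask}", strict=False)


def check_ipat_aliasing(boot_by_node, service_addrs, netmask):
    """Return a list of rule violations (empty when the plan is consistent).

    Assumed reading of the rule: on each node every boot address is on its
    own subnet, and no service address shares a subnet with a boot address.
    """
    problems = []
    for node, boots in boot_by_node.items():
        # Boot addresses on the same node must not share a subnet.
        for a, b in combinations(boots, 2):
            if subnet_of(a, netmask) == subnet_of(b, netmask):
                problems.append(f"{node}: boot addresses {a} and {b} share a subnet")
        # Service addresses must stay off every boot subnet of this node.
        boot_nets = {subnet_of(a, netmask) for a in boots}
        for svc in service_addrs:
            if subnet_of(svc, netmask) in boot_nets:
                problems.append(f"{node}: service address {svc} is on a boot subnet")
    return problems


# Addresses as they appear in Figure 2-1 (netmask 255.255.255.0).
figure_2_1 = {
    "node1": ["192.168.100.31", "172.16.100.31"],
    "node2": ["192.168.100.32", "172.16.100.32"],
}
services = ["192.168.11.131", "192.168.11.132"]
print(check_ipat_aliasing(figure_2_1, services, "255.255.255.0"))
```

With the Figure 2-1 addresses the returned list is empty; moving a service address into 192.168.100.0/24 would be reported for each node.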
Slide 46: Note: In HACMP V5.1, IPAT via IP aliases is the default mechanism for keeping a service IP label highly available.
IP address takeover via IP replacement
The service IP address replaces the existing (boot/base) IP address on the network interface. With this method, only one IP address/label is configured on the same network interface at a time.
Note: In this configuration, the service IP address must be in the same subnet as the boot address of one of the node’s communication interfaces, while the base IP address of a backup communication interface must be on a different subnet. This method may save subnets, but requires additional hardware. See Figure 2-2.
[Figure 2-2 IPAT via IP replacement. Two nodes (node1, node2), each with interfaces en0 and en1, netmask 255.255.255.0. en0 boot IP labels: 192.168.100.31 and 192.168.100.32; en0 service IP labels: 192.168.100.131 and 192.168.100.132; en1 boot IP labels: 172.16.100.31 and 172.16.100.32. Note: The Service IP Label and Boot IP Label must be in the same subnet.]
If the communication interface holding the service IP address fails when using IPAT via IP replacement, HACMP moves the service IP address to another available interface on the same node and on the same network; in this case, the associated resource group is not affected. If there is no available interface on the same node, the resource group is moved together with the service IP label to another node with an available communication interface.
When using IPAT via IP replacement (also known as “classic” IPAT), it is also possible to configure hardware address takeover (HWAT). This is achieved by masking the native MAC address of the communication interface with a locally administered address (LAA), thus ensuring that the mappings in the ARP cache on the client side remain unchanged.
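The subnet requirement for IPAT via IP replacement is the mirror image of the aliasing rule, and it can be sketched the same way. This is an illustrative Python check under stated assumptions, not an HACMP tool: the function name is hypothetical, and the rule is read literally from the note above (at least one boot address on the service subnet, and at least one backup interface on a different subnet). The sample values are the node1 addresses from Figure 2-2.

```python
import ipaddress


def subnet_of(address, netmask):
    """Return the IPv4 network that contains address under netmask."""
    return ipaddress.ip_network(f"{address}/{netmask}", strict=False)


def check_ipat_replacement(boot_addrs, service_addr, netmask):
    """Return a list of rule violations for one node (empty when consistent)."""
    service_net = subnet_of(service_addr, netmask)
    same = [a for a in boot_addrs if subnet_of(a, netmask) == service_net]
    other = [a for a in boot_addrs if subnet_of(a, netmask) != service_net]
    problems = []
    if not same:
        # The service address has no interface whose boot address it can replace.
        problems.append("no boot address is on the service subnet")
    if not other:
        # Every interface is on the service subnet, so there is no backup subnet.
        problems.append("no backup interface is on a different subnet")
    return problems


# node1 in Figure 2-2: en0 boot, en1 boot, and the service address.
print(check_ipat_replacement(
    ["192.168.100.31", "172.16.100.31"], "192.168.100.131", "255.255.255.0"))
```

Running the same check with the Figure 2-1 service address (192.168.11.131) reports that no boot address is on the service subnet, which is why the two IPAT methods need different address plans.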
Slide 47: 2.2.6 Planning for network security
Planning network security is also important to avoid unauthorized access to the cluster nodes. Starting with HACMP V5.1, a new security mechanism has been introduced that provides a common communication infrastructure (daemon) for all HACMP configuration related communications between nodes. The new cluster communication daemon (clcomdES) provides enhanced security in an HACMP cluster and also speeds up the configuration related operations.
There are three levels of communication security:
Standard
– Default security level.
– Implemented directly by the cluster communication daemon (clcomdES).
– Uses HACMP ODM classes and the /usr/es/sbin/cluster/etc/rhosts file to determine legitimate partners.
Enhanced
– Used in SP clusters.
– Takes advantage of the enhanced, third-party authentication method provided by Kerberos.
Virtual Private Networks (VPN)
– VPNs are configured within AIX.
– HACMP is then configured to use VPNs for all inter-node configuration related communication operations.
By using the cluster secure communication subsystem, HACMP eliminates the need for either /.rhosts files or a Kerberos configuration on each cluster node. However, the /.rhosts file may still be needed to support operations for applications that require this remote communication mechanism.
Slide 48: Note: Not all cluster communication is secured via clcomdES; other daemons have their own communication mechanism (not based on “r” commands):
– Cluster Manager (clstrmgrES)
– Cluster Lock Daemon (cllockdES)
– Cluster Multi Peer Extension Communication Daemon (clsmuxpdES)
The clcomdES is used for cluster configuration operations such as cluster synchronization, cluster management (C-SPOC), and dynamic reconfiguration (DARE) operations.
The Cluster Communication Daemon, clcomdES, provides secure remote command execution and HACMP ODM configuration file updates by using the principle of “least privilege”. Thus, only the programs found in /usr/es/sbin/cluster/ will run as root; everything else will run as “nobody”. In addition to clcomdES, the following programs are also used:
– cl_rsh is the cluster remote shell execution program.
– clrexec is used to run specific, dangerous commands as root, such as altering files in the /etc directory.
– cl_rcp is used to copy AIX configuration files.
These commands are hardcoded in clcomdES and are not supported for direct use by users.
The cluster communication daemon (clcomdES) has the following characteristics:
– Since cluster communication does not require the standard AIX “r” commands, the dependency on the /.rhosts file has been removed. Thus, even in “standard” security mode, the cluster security has been enhanced.
– Provides a reliable caching mechanism for other nodes’ ODM copies on the local node (the node from which the configuration changes and synchronization are performed).
– Limits the commands which can be executed as root on remote nodes (only the commands in /usr/es/sbin/cluster run as root).
– clcomdES is started from /etc/inittab and is managed by the system resource controller (SRC) subsystem.
– Provides its own heartbeat mechanism, and discovers active cluster nodes (even if cluster manager or RSCT is not running).
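The “least privilege” rule described above (programs under /usr/es/sbin/cluster run as root, everything else as “nobody”) can be illustrated with a small decision function. This is only a sketch of the rule as stated in the text, not the clcomdES implementation: the function name, the treatment of subdirectories as trusted, and the example program name some_tool are all assumptions made for illustration.

```python
import posixpath

# Directory named in the text as the only location whose programs run as root.
TRUSTED_DIR = "/usr/es/sbin/cluster"


def effective_user(command_path):
    """Return the user a requested command would run as under the stated rule."""
    if not posixpath.isabs(command_path):
        # Relative paths cannot be shown to live in the trusted directory.
        return "nobody"
    # Collapse "." and ".." so a path cannot escape the trusted directory.
    normalized = posixpath.normpath(command_path)
    if normalized.startswith(TRUSTED_DIR + "/"):
        return "root"
    return "nobody"


print(effective_user("/usr/es/sbin/cluster/some_tool"))
print(effective_user("/usr/bin/ls"))
print(effective_user("/usr/es/sbin/cluster/../../bin/sh"))
```

The path is normalized before the prefix comparison so that a request such as the third one, which only appears to be inside the trusted directory, is still treated as unprivileged.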
Slide 49: Note: clcomdES provides a transport mechanism for various HACMP services, such as clverify, godm, rsh, and rexec.
The clcomdES authentication process for incoming connections is based on checking the node identity against the following files:
– HACMPadapter ODM class (IP labels defined in this class)
– HACMPnode ODM class (the IP addresses/labels used as communication path for the nodes in the cluster)
– The /usr/es/sbin/cluster/etc/rhosts file
Incoming connections are not allowed if the /usr/es/sbin/cluster/etc/rhosts file is missing or does not contain an entry for the remote initiating node (either IP address or resolvable IP label).
If the HACMPnode and HACMPadapter ODM classes and the /usr/es/sbin/cluster/etc/rhosts file are empty, then clcomdES assumes the cluster is being configured and accepts incoming connections, then adds the peer node IP label (address) to the /usr/es/sbin/cluster/etc/rhosts file once the initial configuration is completed.
If the IP address requesting connection matches a label in the above locations (HACMPadapter, HACMPnode, and /usr/es/sbin/cluster/etc/rhosts), then clcomdES connects back to the requesting node and asks for the IP label (host name); if the returned IP label (host name) matches the requesting IP address, the authentication is completed successfully.
Note: If there is an unresolvable label in the /usr/es/sbin/cluster/etc/rhosts file, then all clcomdES connections from remote nodes will be denied.
2.3 HACMP heartbeat
As in many other types of clusters, heartbeating is used to monitor the availability of network interfaces, communication devices, and IP labels (service, non-service, and persistent), and thus the availability of the nodes. Starting with HACMP V5.1, heartbeating is exclusively based on RSCT topology services (thus HACMP V5.1 is only “Enhanced Scalability”; classic heartbeating with network interface modules (NIMs), monitored directly by the cluster manager daemon, is not used anymore).
Slide 50: Heartbeating is performed by exchanging messages (keep alive packets) between the nodes in the cluster over each communication interface or device. Each cluster node sends heartbeat messages at specific intervals to other cluster nodes, and expects to receive heartbeat messages from the corresponding nodes at specific intervals. If messages stop being received, RSCT recognizes this as a failure and tells HACMP, which takes the appropriate action for recovery.
The heartbeat messages can be sent over:
– TCP/IP networks
– Point-to-point non-IP networks
To prevent cluster partitioning (split brain), HACMP must be able to distinguish between a node failure and a TCP/IP network failure. TCP/IP network failures can be caused by faulty network elements (switches, hubs, and cables); in this case, the nodes in the cluster are not able to send and receive heartbeat (keep alive, KA) messages over IP, so each node considers its peers down and will try to acquire the resources. This is a potential data corruption exposure, especially when using concurrent resources.
The non-IP networks are direct (point-to-point) connections between nodes. They do not use IP to exchange heartbeat messages, and are therefore less prone to failures of IP network elements. If these network types are used, then in case of an IP network failure the nodes will still be able to exchange messages, so the decision is to consider the network down, and no resource group activity will take place.
To avoid partitioning in an HACMP cluster, we recommend:
– Configure redundant networks (IP and non-IP)
– Use both IP and non-IP networks.
For a recommended two-node cluster configuration, see Figure 2-3.
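The reasoning above, in which a surviving non-IP heartbeat lets the cluster tell a failed IP network apart from a failed node, can be summarized as a small decision function. This is a simplified Python sketch of that reasoning only; it is not how RSCT or the cluster manager is implemented, and the function name and the returned strings are hypothetical.

```python
from typing import Optional


def classify_peer(ip_heartbeat_ok: bool, non_ip_heartbeat_ok: Optional[bool]) -> str:
    """Classify a peer node from the state of its heartbeat paths.

    non_ip_heartbeat_ok is None when no non-IP network is configured.
    """
    if ip_heartbeat_ok:
        # Heartbeats still arrive over IP, so the peer is alive.
        return "peer up"
    if non_ip_heartbeat_ok is None:
        # Without a non-IP path the two failure types look identical;
        # this is the situation that can lead to a partitioned cluster.
        return "ambiguous: node failure or IP network failure"
    if non_ip_heartbeat_ok:
        # The peer still answers over the point-to-point link, so only the
        # IP network is considered down and no resources are acquired.
        return "IP network down, peer up"
    # Every configured path is silent: treat the peer as failed.
    return "peer down"


for ip_ok, non_ip_ok in [(True, True), (False, True), (False, False), (False, None)]:
    print(ip_ok, non_ip_ok, "->", classify_peer(ip_ok, non_ip_ok))
```

The ambiguous case is the one the recommendation to configure both IP and non-IP networks is meant to remove.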
Slide 51: [Figure 2-3 Heartbeating in an HACMP cluster. Two pSeries nodes (Node A and Node B) exchange KA messages over two Ethernet networks (net_ether_01 and net_ether_02), a serial network (net_rs232), and a disk heartbeat network (net_diskhb) that uses an enhanced concurrent volume group (hdisk1, hdisk2, hdisk3) on shared storage (FAStT, ESS, SSA, SCSI).]
2.3.1 Heartbeat via disk
The heartbeat via disk (diskhb) is a new feature introduced in HACMP V5.1, intended to provide additional protection against cluster partitioning and simplified non-IP network configuration, especially for environments where the RS232, target mode SSA, or target mode SCSI connections are too complex or impossible to implement. This type of network can use any type of shared disk storage (Fibre Channel, SCSI, or SSA), as long as the disk used for exchanging KA messages is part of an AIX enhanced concurrent volume group. The disks used for heartbeat networks are not exclusively dedicated to this purpose; they can also be used to store application shared data (see Figure 2-3 for more information).
Customers have requested a target mode Fibre Channel connection, but due to the heterogeneous (non-standard initiator and target functions) FC environments (adapters, storage subsystems, SAN switches, and hubs), this is difficult to implement and support.
Slide 52: By using the shared disks for exchanging messages, the implementation of a non-IP network is more reliable, and does not depend on the type of hardware used. Moreover, in a SAN environment, when using optic fiber to connect devices, the length of this non-IP connection has the same distance limitations as the SAN, thus allowing very long point-to-point networks.
When a disk is defined as part of an enhanced concurrent volume group, a portion of the disk (a sector) is not used for any LVM operations; this portion is used to exchange messages between the two nodes.
The specifications for using the heartbeat via disk are:
– One disk can be used for one network between two nodes. The disk to be used is uniquely identified on both nodes by its LVM assigned physical volume ID (PVID).
– The recommended configuration for disk heartbeat networks is one disk per pair of nodes per storage enclosure.
– The disk to be used must be part of an enhanced concurrent volume group, though it is not necessary for the volume group to be either active or part of a resource group (concurrent or non-concurrent). The only restriction is that the VG must be defined on both nodes.
Note: The cluster locking mechanism for enhanced concurrent volume groups does not use the reserved disk space for communication (as the “classic” clvmd does); it uses the RSCT group services instead.
2.3.2 Heartbeat over IP aliases
For IP networks, a new heartbeat feature has been introduced: heartbeat over IP aliases. This feature is provided for clusters where changing the base IP addresses of the communication interfaces is not possible or desired. The IP aliases used for heartbeat are configured on top of the existing IP addresses when HACMP services are started. The IP addresses used for this purpose must be in totally different subnets from the existing ones, and should not be defined for any name resolution (/etc/hosts, BIND, and so on). This configuration does not require any additional routable subnets. Instead of using the base/boot IP addresses for exchanging heartbeat messages, RSCT uses the HACMP defined IP aliases to establish the communication groups (heartbeat rings) for each communication interface.
Slide 53: Attention: When using heartbeat over IP aliases, the base/boot IP addresses of the communication interfaces are not monitored by RSCT topology services (and, as a consequence, by HACMP). The communication interfaces are monitored via the assigned IP aliases.
Even with this technique, HACMP still requires that all the interfaces on a network (from all nodes) be able to communicate with each other (can see each other’s MAC address). The subnet mask used for IP aliases is the same as the one used for the service IP addresses. When defining the IP address to be used for heartbeat, you have to specify the start address to be used for heartbeating, and must ensure that you have enough subnets available (one for each physical communication interface in a node) that do not conflict with the existing subnets used on the networks.
For example, in a three-node cluster where all the nodes have three communication interfaces defined on the same network, you need three non-routable subnets. Assuming that all nodes have three Ethernet adapters (en0, en1, and en2), a class C netmask (255.255.255.0), and the starting IP address to be used for heartbeat over IP aliases is 172.16.100.1, the aliases assigned for each Ethernet adapter (communication interface) will be as shown in Table 2-1. See also Figure 2-4 and Figure 2-5.
Table 2-1 IP aliases for heartbeat
Adapter / Node   en0            en1            en2
Node 1           172.16.100.1   172.16.101.1   172.16.102.1
Node 2           172.16.100.2   172.16.101.2   172.16.102.2
Node 3           172.16.100.3   172.16.101.3   172.16.102.3
The addresses used for heartbeat over IP aliases are stored in the HACMPadapter ODM class during the cluster synchronization.
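The pattern in Table 2-1 follows directly from the start address and the netmask: each interface gets its own subnet, counted up from the start address, and each node gets the next host number within that subnet. The Python sketch below reproduces the table under that assumption. It is an illustration of the arithmetic only; the function name is hypothetical, and the text does not state that HACMP computes the aliases in exactly this way.

```python
import ipaddress


def heartbeat_aliases(start, netmask, nodes, interfaces):
    """Build a node -> interface -> alias table from a heartbeat start address.

    Assumption: interface i uses the i-th subnet counted from the start
    address, and node n uses the n-th host address within that subnet.
    """
    start_ip = ipaddress.IPv4Address(start)
    # Number of addresses per subnet (256 for a 255.255.255.0 netmask).
    subnet_size = ipaddress.ip_network(f"{start}/{netmask}", strict=False).num_addresses
    table = {}
    for n, node in enumerate(nodes):
        table[node] = {
            iface: str(start_ip + i * subnet_size + n)
            for i, iface in enumerate(interfaces)
        }
    return table


aliases = heartbeat_aliases(
    "172.16.100.1", "255.255.255.0",
    ["Node 1", "Node 2", "Node 3"], ["en0", "en1", "en2"],
)
for node, row in aliases.items():
    print(node, row["en0"], row["en1"], row["en2"])
```

For the three-node example this yields 172.16.100.1 through 172.16.102.3, matching Table 2-1, and it makes the subnet requirement visible: three interfaces per node consume three consecutive subnets.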
Slide 54: [Figure 2-4 Heartbeat alias address assignment. Three nodes, each with en0, en1, and en2; address offset 172.16.100.1, netmask 255.255.255.0. RING1 connects the en0 aliases (172.16.100.1, 172.16.100.2, 172.16.100.3), RING2 the en1 aliases (172.16.101.1, 172.16.101.2, 172.16.101.3), and RING3 the en2 aliases (172.16.102.1, 172.16.102.2, 172.16.102.3).]
In HACMP V5.1, heartbeating over IP aliases can be configured to establish IP-based heartbeat rings for networks using either type of IPAT (via IP aliasing or via IP replacement). The type of IPAT configured determines how HACMP handles the service IP address (label):
– With IPAT via IP replacement, the service label replaces the base (boot) address of the communication interface, not the heartbeat alias.
– With IPAT via IP aliasing, the service label is aliased to the communication interface along with the base address and the heartbeat alias.
Heartbeating over IP aliases is defined as a network (HACMP) characteristic, and is part of the HACMP topology definition. To enable this facility, users must specify the start address in the HACMP network definition. To set this characteristic, you have to use the extended SMIT menu (for cluster topology). This can be defined when you define the network, or it can be changed later.
Slide 55: [Figure 2-5 IP aliases management. One pSeries node with interfaces en0, en1, and en2. Heartbeat aliases (monitored): 172.16.100.1, 172.16.101.1, 172.16.102.1. Boot addresses (not monitored): 192.168.100.31, 192.168.50.31, 172.16.10.31. Service alias (monitored): 192.168.11.131.]
For more information on this topic, refer to Chapter 3, “Planning Cluster Network Connectivity”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.
2.4 Shared storage configuration
Most HACMP configurations require shared storage. The IBM disk subsystems that support access from multiple hosts include SCSI, SSA, ESS, and FAStT. There are also third-party (OEM) storage devices and subsystems that may be used, although most of these are not directly certified by IBM for HACMP usage. For these devices, check the respective manufacturers’ Web sites.
Table 2-2 lists a subset of IBM storage devices (the most commonly used) that can be used for shared access in an HACMP cluster.
Table 2-2 External storage subsystems
– IBM 7133 SSA Disk Subsystem Models D40 and T40 (up to 72.8 GB disk modules, and up to eight nodes per SSA loop).
– IBM Enterprise Storage Server (ESS) Models E10, E20, F10, and F20 (supports up to eight nodes using SCSI and Fibre Channel interfaces via IBM FC/FICON®, Feature Code: 3021, 3022, and 3023)
– IBM 2105-800 (ESS) Total Storage Enterprise Storage Server (FS and SCSI)
Slide 56: Table 2-2 (continued)
IBM Total Storage FAStT 200, 500, 600, 700, and 900 models
HACMP also supports shared tape drives, which can be connected via SCSI or FC. Concurrent mode tape access is not supported. See Table 2-3 for some of the supported tape subsystems.
Table 2-3 Tape drive support
IBM 3583 Ultrium Scalable Tape Library Model L18, L32 and L72
IBM 3584 Ultra Scalable Tape Library Model L32 and D32
IBM Total Storage Enterprise Tape Drive 3590 Model H11
IBM Magstar® 3590 Tape Drive Model E11 & B11
IBM 3581 Ultrium Tape Autoloader Model H17 & L17
IBM 3580 Ultrium Tape Drive Model H11 & L11
For an updated list of supported storage and tape drives, check the IBM Web site at: http://www-1.ibm.com/servers/eserver/pseries/ha/ HACMP may also be configured with non-IBM shared storage subsystems (disk and tape subsystems). For a list of non-IBM storage, refer to the respective manufacturers’ Web sites and to the Availant Web site: http://www.availant.com/ 2.4.1 Shared LVM requirements Planning shared LVM for an HACMP cluster depends on the method of shared disk access and the type of shared disk device. The elements that should be considered for shared LVM are: data protection method, storage access method, and storage hardware redundancy.
Slide 57: Note: HACMP itself does not provide storage protection. Storage protection is provided via AIX (LVM mirroring) or hardware RAID. In this section, we provide information about data protection methods at the storage level, and also talk about the LVM shared disk access modes: Non-concurrent; Concurrent “classic” (HACMP concurrent logical volume manager - clvm); Enhanced concurrent mode (ECM), a new option in AIX 5L V5.1 and higher. 2.4.2 Non-Concurrent, Enhanced Concurrent, and Concurrent In a non-concurrent access configuration, only one cluster node can access the shared data at a time. If the resource group containing the shared disk space moves to another node, the new node will activate the disks, and check the current state of the volume groups, logical volumes, and file systems. In non-concurrent configurations, the disks can be shared as: raw physical volumes, raw logical volumes, or file systems. In a concurrent access configuration, data on the disks is available to all nodes concurrently. This mode does not support file systems (either JFS or JFS2). Fast disk takeover: HACMP V5.1 exploits the new AIX enhanced concurrent LVM. In AIX 5L V5.2, any new concurrent volume group must be created in enhanced concurrent mode. In AIX 5L V5.2 only, the enhanced concurrent volume groups can also be used for file systems (shared or non-shared). This is exploited by the fast disk takeover option to speed up the process of taking over the shared file systems in a fail-over situation. The enhanced concurrent volume groups are varied on all nodes in the resource group, and the data access is coordinated by HACMP. Only the node that has the resource group active will vary on the volume group in “concurrent active” mode; the other nodes will vary on the volume group in “passive” mode. In “passive” mode, no high level operations are permitted on that volume group.
Slide 58: Attention: When using resource groups with the fast disk takeover option, it is extremely important to have redundant networks and non-IP networks. This will avoid data corruption (after all, the volume groups are in concurrent mode) in a “split brain” situation. RAID and SSA concurrent mode: RAID concurrent mode volume groups are functionally obsolete, since enhanced concurrent mode provides additional capabilities, but RAID concurrent VGs will continue to be supported for some time. Both RAID and SSA concurrent mode volume groups are supported by HACMP V5.1 with some important limitations: A concurrent resource group that includes a node running a 64-bit kernel requires enhanced concurrent mode for any volume groups. SSA concurrent mode is not supported on 64-bit kernels. SSA disks with the 32-bit kernel can still use SSA concurrent mode. The C-SPOC utility cannot be used with RAID concurrent volume groups. You have to convert these volume groups to enhanced concurrent mode (otherwise, AIX sees them as non-concurrent). In AIX 5L V5.1, it is still possible to create SSA concurrent VGs (with a 32-bit kernel), but in AIX 5L V5.2, it is not possible to create a new HACMP concurrent VG; all new VGs must be created in enhanced concurrent mode. LVM requirements: The Logical Volume Manager (LVM) component of AIX manages the storage by coordinating data mapping between physical and logical storage. Logical storage can be expanded and replicated, and can span multiple physical disks and enclosures. The main LVM components are: Physical volume: A physical volume (PV) represents a single physical disk as it is seen by AIX (hdisk*). The physical volume is partitioned into physical partitions (PPs), which represent the physical allocation units used by LVM. Volume group: A volume group (VG) is a set of physical volumes that AIX treats as a contiguous, addressable disk region. In HACMP, the volume group and all its logical volumes can be part of a shared resource group.
A volume group cannot be part of multiple resource groups (RGs).
Slide 59: Physical partition: A physical partition (PP) is the allocation unit in a VG. The PVs are divided into PPs (when the PV is added to a VG), and the PPs are used for LVs (one, two, or three PPs per logical partition (LP)). Volume group descriptor area (VGDA): The VGDA is a zone on the disk that contains information about the storage allocation in that volume group. For a single disk volume group, there are two copies of the VGDA. For a two disk VG, there are three copies of the VGDA: two on one disk and one on the other. For a VG consisting of three or more PVs, there is one VGDA copy on each disk in the volume group. Quorum: For an active VG to be maintained as active, a “quorum” of VGDAs must be available (50% + 1). Also, if a VG has the quorum option set to “off”, it cannot be activated (without the “force” option) if one VGDA copy is missing. If quorum is turned off, the system administrator must know the mapping of that VG to ensure data integrity. Logical volume: A logical volume (LV) is a set of logical partitions that AIX makes available as a single storage entity. Logical volumes can be used as raw storage space or as file system storage. In HACMP, a logical volume that is part of a VG is already part of a resource group, and cannot be part of another resource group. Logical partition: A logical partition (LP) is the space allocation unit for logical volumes, and is a logical view of a physical partition. With AIX LVM, the logical partitions may be mapped to one, two, or three physical partitions to implement LV mirroring. Note: Although LVM mirroring can be used with any type of disk, when using IBM 2105 Enterprise Storage Servers or FAStT storage servers, you may skip this option. These storage subsystems (as well as some non-IBM ones) provide their own data redundancy by using various levels of RAID. File systems: A file system (FS) is in fact a simple database for storing files and directories.
A file system in AIX is stored on a single logical volume. The main components of the file system (JFS or JFS2) are the logical volume that holds the data, the file system log, and the file system device driver. HACMP supports both JFS and JFS2 as shared file systems, with the remark that the
Slide 60: log must be on a separate logical volume (JFS2 also may have inline logs, but this is not supported in HACMP). Forced varyon of volume groups: HACMP V5.1 provides a new facility: the option to force the varyon of a volume group on a node. If, during the takeover process, the normal varyon command fails on that volume group (lack of quorum), HACMP will ensure that at least one valid copy of each logical partition for every logical volume in that VG is available before varying on that VG on the takeover node. Forcing a volume group to varyon lets you bring and keep a volume group online (as part of a resource group) as long as there is one valid copy of the data available. You should use the forced varyon option only for volume groups that have mirrored logical volumes, and use caution when using this facility to avoid creating a partitioned cluster. Note: You should specify the super strict allocation policy for the logical volumes in volume groups used with the forced varyon option. In this way, the LVM makes sure that the copies of a logical volume are always on separate disks, and increases the chances that forced varyon will be successful after a failure of one or more disks. This option is useful in a takeover situation in case a VG that is part of that resource group loses one or more disks (VGDAs). If this option is not used, the resource group will not be activated on the takeover node, thus rendering the application unavailable. When using the forced varyon of volume groups option in a takeover situation, HACMP first tries a normal varyonvg. If this attempt fails due to lack of quorum, HACMP checks the integrity of the data to ensure that there is at least one available copy of all data in the volume group before trying to force the volume group online. If there is, it runs varyonvg -f; if not, the volume group remains offline and the resource group enters an error state.
Note: The users can still use quorum buster disks or custom scripts to force varyon a volume group, but the new forced varyon attribute in HACMP automates this action, and customer enforced procedures may now be relaxed. For more information see Chapter 5, “Planning Shared LVM Components”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.
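The VGDA placement rules, the quorum check, and the forced varyon decision flow described in this section can be sketched as follows. This is a simplified model, not HACMP or LVM code: the function names are hypothetical, quorum is interpreted as a strict majority of VGDA copies on a volume group with quorum enabled, and the returned strings only label the outcome rather than running any command.

```python
def vgda_copies(disk_count):
    """VGDA copies per disk, per the rules for 1, 2, and 3 or more disks."""
    if disk_count < 1:
        raise ValueError("a volume group needs at least one disk")
    if disk_count == 1:
        return [2]
    if disk_count == 2:
        return [2, 1]  # two copies on one disk, one on the other
    return [1] * disk_count


def has_quorum(copies, failed_disks):
    """True when more than half of all VGDA copies are still reachable.

    Assumption: "50% + 1" is read as a strict majority.
    """
    total = sum(copies)
    available = sum(c for i, c in enumerate(copies) if i not in failed_disks)
    return 2 * available > total


def takeover_action(copies, failed_disks, every_lp_has_valid_copy, forced_varyon):
    """Outcome of the takeover flow: normal varyon, forced varyon, or offline."""
    if has_quorum(copies, failed_disks):
        return "varyonvg"
    # Normal varyon failed for lack of quorum; force only if the option is
    # set and each logical partition still has at least one valid copy.
    if forced_varyon and every_lp_has_valid_copy:
        return "varyonvg -f"
    return "offline"  # the resource group ends up in an error state


two_disk_vg = vgda_copies(2)
print(takeover_action(two_disk_vg, {1}, True, True))
print(takeover_action(two_disk_vg, {0}, True, True))
print(takeover_action(two_disk_vg, {0}, False, True))
```

In the two-disk case, losing the disk that holds a single VGDA copy keeps quorum, while losing the disk that holds two copies does not. This is why mirrored logical volumes and the super strict allocation policy matter for the forced varyon path.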
Slide 61: 2.4.3 Choosing a disk technology HACMP V5.1 supports the following storage technologies: SCSI, SSA, and Fibre Channel (like FAStT and ESS disk subsystems). The complete list of supported external storage subsystems (manufactured by IBM) can be found at the following IBM Web site: http://www-1.ibm.com/servers/eserver/pseries/ha/ HACMP supports the following IBM disk technologies as shared external disks in a high availability cluster. IBM 2105 Enterprise Storage Server: IBM 2105 Enterprise Storage Server provides concurrent attachment and disk storage sharing for a variety of open systems servers. Besides IBM Eserver pSeries machines, a variety of other platforms are supported. Because so many platforms are supported in a shared storage environment, it is very important to avoid interference by configuring secure access to storage through appropriate LUN masking and zoning configurations. The ESS uses IBM SSA disk technology. ESS provides built-in availability and data protection. RAID technology is used to protect data. Also, the disks have intrinsic predictive failure analysis features to predict errors before they affect data availability. The ESS has virtually all components duplicated and provides protection if any internal component fails. The ESS manages the internal storage (SSA disks) with a cluster of two nodes connected through a high speed internal bus, each of the nodes providing the exact same functionality. Thus, if one of the internal nodes fails, the storage remains available to the client systems. For more information on planning and using the 2105-800 Enterprise Storage Server (including attachment diagrams, and more), see the following Web site: http://www.storage.ibm.com/disk/ess/index.html An example of a typical HACMP cluster using ESS as shared storage is shown in Figure 2-6 on page 46.
Slide 62: [Figure 2-6 ESS Storage: nodes AIX1 and AIX2, connected by an IP network and a non-IP network (heartbeat), attach through FC Switch 1 and FC Switch 2 (Zone1, Zone2) to the ESS.] IBM FAStT 700 and 900 midrange Storage Servers: IBM FAStT 900 and 700 Storage Servers deliver breakthrough disk performance and outstanding reliability for demanding applications in compute intensive environments. IBM FAStT Series Storage subsystems are the choice for implementing midrange solutions, by providing good scalability, performance, and data protection. The FAStT architecture, although not as sophisticated as the one implemented in the ESS, is also based on redundant elements (storage controllers, power supplies, and storage attachment adapters). The FAStT 700 and 900 architecture implements native Fibre Channel protocol on both host side and storage side. It does not offer SCSI support, and does not accommodate a dedicated high speed bus between the two controllers, but it provides controller fail-over capability for uninterrupted operations, and host side data caching. For complete information about IBM Storage Solutions, see the following Web site: http://www.storage.ibm.com/disk/fastt/index.html
Slide 63: For a typical FAStT connection to an HACMP cluster, see Figure 2-7. [Figure 2-7 FAStT Storage: nodes AIX1 and AIX2, connected by an IP network and a non-IP network (heartbeat), attach through FC Switch 1 and FC Switch 2 (Zone1, Zone2) to the host side of a FAStT 900 with controllers A and B; the disks attach on the drive side.] IBM Serial Storage Architecture disk subsystem: Serial Storage Architecture (SSA) storage subsystems provide a more “discrete components” solution, offering features for reducing the number of single points of failure. SSA storage provides high availability in an HACMP environment through the use of redundant hardware (power supplies and storage connections) and hot swap capability (concurrent maintenance) for power supplies and disks. SSA storage also offers RAID capability at the adapter (Host Bus Adapter - HBA) level. Note: By using the SSA RAID option, the number of HACMP nodes able to share the same data is limited to two. IBM 7133 SSA disk subsystems can be used as shared external disk storage devices to provide concurrent access in an HACMP cluster configuration.
Slide 64: SSA storage provides a flexible, fairly simple, more “custom” approach for configuring HACMP clusters with “legacy” applications and a limited number of nodes. We recommend that all new configurations be implemented using the new technologies (FC storage). For an example of a two node HACMP cluster, see Figure 2-8. [Figure 2-8 SSA storage: Node 1 and Node 2, each with SSA adapter(s) and connected by an IP network and a non-IP network, share SSA Enclosure 1 and Enclosure 2.] 2.5 Software planning In the process of planning an HACMP cluster, one of the most important steps is to choose the software levels that will be running on the cluster nodes. The decision factors in node software planning are: Operating system requirements: AIX version and recommended levels. Application compatibility: Ensure that all requirements for the applications are met, and supported in cluster environments. Resources: Types of resources that may be used (IP addresses, storage configuration, if NFS is required, and so on).
Slide 65: 2.5.1 AIX level and related requirements Before you install HACMP, you must check the OS level requirements. Table 2-4 shows the recommended HACMP and OS levels at the time this redbook was written.
Table 2-4 OS level requirements for HACMP V5.1 and V5.2
HACMP Version | AIX OS Level | AIX APARs | RSCT Level
HACMP V5.1 | 5100-05 | IY50579, IY48331 | 2.2.1.30 or higher
HACMP V5.1 | 5200-02 | IY48180, IY44290 | 2.3.1.0 or higher
HACMP V5.2 | 5100-06 | IY54018, IY53707, IY54140, IY55017 | 2.2.1.30 or higher
HACMP V5.2 | 5200-03 | IY56213 | 2.3.3.0 or higher
For the latest list of recommended maintenance levels for HACMP V5.1 and V5.2, please access the IBM Web site at: http://www-912.ibm.com/eserver/support/fixes/fcgui.jsp Note: To use C-SPOC with VPATH disks, Subsystem Device Driver (SDD) 1.3.1.3 or later is required. To use HACMP Online Planning Worksheets, AIX 5L Java Runtime Environment 1.3.1 or later and a graphics display (local or remote) are required. HACMP V5.1 and V5.2 support the use of AIX 5L V5.2 Multi-path I/O (MPIO) device drivers for accessing disk subsystems. The following AIX optional base operating system (BOS) components are prerequisites for HACMP: bos.adt.lib, bos.adt.libm, bos.adt.syscalls, bos.net.tcp.client, bos.net.tcp.server, bos.rte.SRC, bos.rte.libc, bos.rte.libcfg, bos.rte.libcur, bos.rte.libpthreads,
Slide 66: bos.rte.odm, bos.data. When using the (enhanced) concurrent resource manager access, the following components are also required: bos.rte.lvm.5.1.0.25 or higher (for AIX 5L V5.1), bos.clvm.enh. For the complete list of recommended maintenance levels for AIX 5L V5.1 and V5.2, see the following IBM Web page: http://www-912.ibm.com/eserver/support/fixes/fcgui.jsp 2.5.2 Application compatibility HACMP is a flexible, high availability solution, in the sense that virtually any application running on a stand-alone AIX server can be protected through the use of an HACMP cluster. When starting cluster application planning, you should consider the following aspects: Application compatibility with the version of AIX used. Application compatibility with the storage method to be implemented for high availability. You also must know all the interdependencies between the application and platform, that is, all the locations where all the application files are stored (permanent data, temporary files, sockets, and pipes, if applicable). You should be able to provide an unattended application start/stop method (scripts) and the application must be able to recover from errors (for example, in case the node running the application crashes) when restarted. Important: Do not proceed to HACMP implementation if your application does not run correctly on a stand-alone node, or if you are not sure about all application dependencies. If you plan to use application monitoring, you should also provide application monitoring tools (methods, behavior, and scripts). Application client dependencies (client behavior when the server is restarted). Application network dependencies (sockets, routes, and so on). Licensing issues, that is, if your application is dependent on the CPU ID, you should consider purchasing a standby license for each node that can host the application. Also, if the application is licensed based on the number of
Slide 67: processors, make sure, in a fail-over situation, that the licensing is not breached. Application servers: According to the HACMP definition, an application server is represented by a collection of scripts that are used by HACMP to start an application when activating a resource group and to stop the same application when bringing the resource group offline. Once the application has been started, HACMP can also monitor this application, and take action in case the application does not run properly. The application monitoring can be performed at process level, and also by using a custom method (for example, for a multi-process application like database engines and so on). Note: Application monitoring was introduced in HACMP/ES V4.4, based on the event management function (EM) of RSCT. Starting with HACMP V5.2, event management has been replaced by Resource Monitoring and Control (RMC), which is functionally equivalent, but provides more flexibility. Starting with HACMP V5.2, it is also possible to monitor application startup. HACMP also provides the application availability analysis tool, which is useful for auditing the overall application availability, and for assessing the cluster environment. For information about application servers and other resources, see 3.5, “Resource group configuration” on page 128. 2.5.3 Planning NFS configurations One of the typical applications of HACMP is to provide high availability network file systems (HA-NFS) for client machines and applications. This is especially useful in clusters running applications in a mutual takeover configuration with cross-mounted network file systems. Starting with HACMP V4.4, the HA-NFS function has been integrated in HACMP, so there is no separate product anymore. Some considerations when using NFS: For the shared volume groups that will be exported via NFS, the volume group Major Number must be the same on all cluster nodes that can serve the file system(s) in that VG.
Slide 68: In AIX, when you export files and directories, the mknfsexp command is used, so the /etc/exports file is created/updated. In HACMP, on the other hand, the file systems and directories to be exported and NFS mounted must be specified in the resource group configuration. If you need any optional configuration for these file systems, you should create the /usr/es/sbin/cluster/etc/exports file. For all resource groups that have file systems to export, the “File systems Mounted before IP Address Configured” attribute must be set to “true”. The HACMP scripts contain the default NFS behavior. You may need to modify these scripts to handle your particular configuration. In HACMP V5.1, in addition to cascading resource groups, you can configure high availability NFS in either rotating or custom resource groups. Note: The NFS locking functionality is limited to a cluster with two nodes. This functionality provides a reliable NFS server capability that allows a backup processor to recover current NFS activity should the primary NFS server fail, preserving the locks on NFS file systems and dupcache. For more information, see the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02. 2.5.4 Licensing Most software vendors require that you have a unique license for each application for each physical machine or per processor in a multi-processor (SMP) machine. Usually, the license activation code is entered at installation time. However, in an HACMP environment, in a takeover situation, if the application is restarted on a different node, you must make sure that you have the necessary activation codes (licenses) for the new machine; otherwise the application may not start properly. The application may also require a unique node-bound license (a separate license file on each node). Some applications also have restrictions on the number of floating licenses available within the cluster for that application.
To avoid this problem, be sure that you have enough licenses for each cluster node machine, so the application can run simultaneously on multiple nodes (especially for concurrent applications).
Slide 69: 2.5.5 Client connections During resource group takeover, the application is started on another node, so clients must be aware of the action. In certain cases, the application’s client uses the ARP cache on the client machine to reconnect to the server. In this case, there are two possible situations: The network holding the service IP for that application uses IPAT via IP replacement with locally administered MAC address takeover (thus, the client machine ARP cache does not have to be updated). HACMP uses the clinfo program that calls the /usr/es/sbin/cluster/etc/clinfo.rc script whenever a network or node event occurs. By default, this action updates the system’s ARP cache and the ARP caches of specified clients to reflect changes to network addresses. You can customize this script if further action is desired. Clients running the clinfo daemon will be able to reconnect to the cluster quickly after a cluster event. Note: If you are using IPAT via IP Aliases, make sure all your clients support TCP/IP gratuitous ARP functionality. If the HACMP nodes and the clients are on the same subnet, and clients are not running the clinfo daemon, you may have to update the local ARP cache indirectly by pinging the client from the cluster node. You can achieve this by adding, on the cluster nodes, the IP labels or IP addresses of the client hosts you want to notify to the PING_CLIENT_LIST variable in the clinfo.rc script. Whenever a cluster event occurs, the clinfo.rc script executes the following command for each host specified in PING_CLIENT_LIST: # ping -c1 $host In case the clients are on a different subnet, make sure that the router ARP cache is updated when an IPAT occurs; otherwise, the clients will experience delays in reconnecting.
Slide 70: 2.6 Operating system space requirements In HACMP V5.1, both the cluster verification program (clverify) and the new cluster communication daemon (clcomdES) need additional space in the /var file system. Due to verbose messaging and additional debugging information, the following requirements must be satisfied for the free space in the /var file system on every node in the cluster: 20 MB for the clcomd logs, where: – /var/hacmp/clcomd/clcomd.log requires 2 MB. – /var/hacmp/clcomd/clcomddiag.log requires 18 MB. Additional (1 MB x number of nodes in the cluster) space for the files stored in the /var/hacmp/odmcache directory. 4 MB for each cluster node for cluster verification data. 2 MB for the cluster verification log (clverify.log[0-9]). For example, for a four-node cluster, it is recommended to have at least 42 MB of free space in the /var file system, where: 2 MB should be free for writing the clverify.log[0-9] files. 16 MB (4 MB per node) should be free for writing the verification data from the nodes. 20 MB should be free for writing the clcomd log information. 4 MB (1 MB per node) should be free for writing the ODM cache data. For each node in the cluster, the clverify utility requires up to 4 MB of free space in the /var file system. The clverify utility can keep up to four different copies of a node's verification data at a time (on the node that has initiated the verification): /var/hacmp/clverify/current/<nodename>/* contains logs from a current execution of clverify. /var/hacmp/clverify/pass/<nodename>/* contains logs from the last passed verification. /var/hacmp/clverify/pass.prev/<nodename>/* contains logs from the second-to-last passed verification. /var/hacmp/clverify/fail/<nodename>/* contains information about the last failed verification process. Also, the /var/hacmp/clverify/clverify.log and its copies [0-9] typically consume 1-2 MB of disk space.
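The /var sizing rule above reduces to a small formula. The sketch below encodes it with the per-item figures quoted in this section; the function name var_space_mb is hypothetical and the result is only the recommended minimum free space per node, not a measured value.

```python
def var_space_mb(node_count):
    """Recommended free space in /var (MB) on each node of the cluster."""
    if node_count < 1:
        raise ValueError("a cluster has at least one node")
    clcomd_logs = 2 + 18                 # clcomd.log + clcomddiag.log
    odm_cache = 1 * node_count           # /var/hacmp/odmcache, 1 MB per node
    verification_data = 4 * node_count   # clverify data, 4 MB per node
    clverify_log = 2                     # clverify.log[0-9]
    return clcomd_logs + odm_cache + verification_data + clverify_log


for nodes in (2, 4, 8):
    print(nodes, "nodes:", var_space_mb(nodes), "MB")
```

For a four-node cluster this gives 20 + 4 + 16 + 2 = 42 MB, matching the worked example in the text.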
Slide 71: 2.7 Resource group planning A resource group is a logical entity containing the resources to be made highly available by HACMP. The resources can be: Storage space (application code and data) – File systems – Network File Systems – Raw logical volumes – Raw physical disks Service IP addresses/labels (used by the clients to access application data) Application servers – Application start script – Application stop script To be made highly available by HACMP, each resource must be included in a resource group. HACMP ensures the availability of cluster resources by moving resource groups from one node to another whenever a cluster event occurs and conditions in the cluster change. HACMP controls the behavior of the resource groups in the following situations: cluster startup, node failure, node reintegration, and cluster shutdown. During each of these cluster stages, the behavior of resource groups in HACMP is defined by: Which node, or nodes, acquire the resource group at cluster startup. Which node takes over the resource group when the owner node fails. Whether a resource group falls back to the node that has just recovered from a failure that occurred earlier, or stays on the node that currently owns it. The priority relationships among cluster nodes determine which cluster node originally controls a resource group and which node takes over control of that resource group when the original node re-joins the cluster after a failure.
Slide 72: The resource group takeover relationship can be defined as: cascading, rotating, concurrent, or custom. The cascading, rotating and concurrent resource groups are the “classic”, pre-HACMP V5.1 types. Since the definition of these types may be difficult to understand, the new “custom” type of resource group has been introduced in HACMP V5.1. This is just one step in normalizing HACMP terminology and making HACMP concepts easier to understand. Starting with HACMP V5.2, the “classic” resource group types have been replaced by custom only resource groups. 2.7.1 Cascading resource groups A cascading resource group defines a list of all the nodes that can control the resource group and each node’s priority in taking over the resource group. A cascading resource group behaves as follows: At cluster startup, a cascading resource group is activated on its home node by default (the node with the highest priority in the node group). In addition, another attribute named “Inactive Takeover” may be used to specify that the resource group can be activated on a lower priority node if the highest priority node (also known as the home node) is not available at cluster startup. Upon node failure, a cascading resource group falls over to the available node with the next priority in the RG node priority list. In addition, by specifying a “Dynamic Node Priority” policy for a resource group, the fail over process will determine the node that will take over that resource group based on some dynamic parameters (the node with the most free CPU, for example). Upon node reintegration into the cluster, a cascading resource group falls back to its home node by default. In addition, by specifying the “Cascading without Fallback” attribute for the resource group, the resource group will remain on the takeover node even if a node with a higher priority becomes available.
Slide 73: To summarize, cascading resource groups have the following attributes: Inactive Takeover (IT) is an attribute that allows you to fine tune the startup (initial acquisition) of a resource group in case the home node is not available. When a failure occurs on a node that currently owns one of these groups, the group will fall over to the next available node in the node priority list. The fall-over priority can be configured in one of two ways: using the default node priority list (which is the order the nodes are listed when configuring the RG), or by setting a Dynamic Node Priority (DNP) policy. Cascading without Fallback (CWOF) is an attribute that modifies the fall-back behavior. By using the CWOF attribute, you can avoid unnecessary RG fallback (and thus client interruption) whenever a node with a higher priority becomes available. In this mode, you can move the RG to its home node manually at a convenient time, without disturbing the clients. 2.7.2 Rotating resource groups For a rotating resource group, the node priority list only determines which node will take over the resource group, in case the owner node fails. At cluster startup, the first available node in the node priority list will activate the resource group. If the resource group is on the takeover node, it will never fall back to a higher priority node if one becomes available. There is no Dynamic Node Priority (DNP) calculation for rotating RGs. When configuring multiple rotating RGs over the same node set in order to control the preferred location of rotating resource groups, each group should be assigned a different highest priority node from the list of participating nodes. When the cluster starts, each node will attempt to acquire the rotating resource group for which it is the highest priority. If all rotating resource groups are up, new nodes joining the cluster will join only as backup nodes for these resource groups.
If not all rotating groups are up, a node joining the cluster will generally acquire only one of these inactive resource groups. The remaining resource groups will stay inactive. However, if multiple networks exist on which the resource groups can move, a node may acquire multiple rotating groups, one per network.
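The startup, fall-over, and fall-back rules for cascading and rotating resource groups can be summarized in a small model. This is a sketch under stated assumptions, not HACMP logic: the function names are hypothetical, Dynamic Node Priority is ignored, Inactive Takeover is assumed to pick the first available node in priority order, and fall-over is simplified to the highest priority node that is still available.

```python
def startup_node(priority_list, available, rg_type, inactive_takeover=False):
    """Node that acquires the RG at cluster startup, or None if it stays down."""
    home = priority_list[0]
    if rg_type == "cascading" and not inactive_takeover:
        # Cascading without Inactive Takeover starts only on the home node.
        return home if home in available else None
    # Rotating, or cascading with Inactive Takeover (assumed behavior):
    # the first available node in the priority list.
    return next((node for node in priority_list if node in available), None)


def fallover_node(priority_list, available, failed_node):
    """Takeover node using the default priority list (no DNP policy)."""
    candidates = [n for n in priority_list if n in available and n != failed_node]
    return candidates[0] if candidates else None


def falls_back(rg_type, cwof=False):
    """True if the RG returns to a higher priority node that rejoins."""
    # Rotating RGs never fall back; cascading RGs do unless CWOF is set.
    return rg_type == "cascading" and not cwof


nodes = ["nodeA", "nodeB", "nodeC"]  # hypothetical node names, highest priority first
print(startup_node(nodes, {"nodeB", "nodeC"}, "cascading"))
print(startup_node(nodes, {"nodeB", "nodeC"}, "rotating"))
print(fallover_node(nodes, {"nodeA", "nodeB", "nodeC"}, "nodeA"))
```

The model covers a single resource group; the one-group-per-node behavior of multiple rotating groups on a shared network is not represented.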
Slide 74: 2.7.3 Concurrent resource groups As the name suggests, a concurrent RG can be active on multiple nodes at the same time. At cluster startup, the RG will be activated on all nodes in the list, in no preferred startup order. For concurrent resource groups, there is no priority among the nodes; they are all equal owner-nodes. If one node fails, the other nodes continue to offer the service; the group does not move. Additional concurrent software may be required to manage concurrent access to application data. 2.7.4 Custom resource groups This new RG type has been introduced in HACMP V5.1 to simplify resource group management and understanding. The resource group designations (cascading, rotating, and concurrent) can be confusing for new users, because: They do not clearly indicate the underlying RG behaviors. Additional RG parameters can further complicate the RG definition: Cascading without Fallback and Inactive Takeover. Also, in some cases, users require combinations of behaviors that are not provided by the standard RG definitions. HACMP V5.1 introduces Custom Resource Groups. – Users have to explicitly specify the desired startup, fall-over, and fall-back behaviors. – RG Startup and Fallback can be controlled through the use of Settling and Fallback Timers. – RG Fallover can also be influenced through the use of Dynamic Node Priority (DNP). Limitations (HACMP V5.1 only): – Custom RGs support only IPAT-via-Aliasing service IP addresses/labels. – There is no site or replicated resource support (for HACMP-XD). Startup preferences Online On Home Node Only: At node startup, the RG will only be brought online on the highest priority node. This behavior is equivalent to cascading RG behavior. Online On First Available Node: At node startup, the RG will be brought online on the first node activated. This behavior is equivalent to that of a rotating RG 58 IBM HACMP for AIX V5.X Certification Study Guide
Slide 75: or a cascading RG with inactive takeover. If a settling time is configured, it will affect RGs with this behavior. Online On All Available Nodes: The RG should be online on all nodes in the RG. This behavior is equivalent to concurrent RG behavior. This startup preference will override certain fall-over and fall-back preferences. Fallover preferences Fallover To Next Priority Node In The List: The RG will fall over to the next available node in the node list. This behavior is equivalent to that of cascading and rotating RGs. Fallover Using Dynamic Node Priority: The RG will fall over based on DNP calculations. The resource group must specify a DNP policy. Bring Offline (On Error Node Only): The RG will not fall over on error; it will simply be brought offline. This behavior is most appropriate for concurrent-like RGs. Fallback preferences Fallback To Higher Priority Node: The RG will fall back to a higher priority node if one becomes available. This behavior is equivalent to cascading RG behavior. A fall-back timer will influence this behavior. Never Fallback: The resource group will stay where it is, even if a higher priority node comes online. This behavior is equivalent to rotating RG behavior. 2.7.5 Application monitoring In addition to resource group management, HACMP can also monitor applications in one of the following two ways: Application process monitoring: Detects the death of a process, using RSCT event management capability. Application custom monitoring: Monitors the health of an application based on a monitoring method (program or script) that you define. Note: You cannot use application process monitoring for applications launched via a shell script, or for applications where monitoring just the process may not be relevant for application sanity. For monitoring shell script applications, you have to use custom monitoring methods (for example, Apache Web server). Chapter 2. Planning and design 59
Slide 76: When application monitoring is active, HACMP behaves as follows:
For application process monitoring, a kernel hook informs the HACMP cluster manager that the monitored process has died, and HACMP initiates the application recovery process. For the recovery action to take place, you must provide a method to clean up and restart the application (the application start/stop scripts provided for the application server definition may be used). HACMP tries to restart the application and waits for it to stabilize, up to a specified number of times, before sending a notification message and/or actually moving the entire RG to a different node (the next node in the node priority list).
For custom application monitoring (custom method), besides the application cleanup and restart methods, you must also provide a program/script to be used for performing periodic application tests.
To plan the configuration of a process monitor, check the following:
Verify whether this application can be monitored with a process monitor.
Check the name(s) of the process(es) to be monitored. It is mandatory to use the exact process names to configure the application monitor.
Specify the user name that owns the processes, for example, root. Note that the process owner must own all processes to be monitored.
Specify the number of instances of the application to monitor (number of processes). The default is one instance.
Specify the time (in seconds) to wait before beginning monitoring. Note: In most circumstances, this value should not be zero. For example, with a database application, you may wish to delay monitoring until after the start script and initial database search have been completed.
The restart count, denoting the number of times to attempt to restart the application before taking any other actions.
The interval (in seconds) that the application must remain stable before resetting the restart count.
The action to be taken if the application cannot be restarted within the restart count. The default choice is notify, which runs an event to inform the cluster of the failure. You can also specify fallover, in which case the resource group containing the failed application moves over to the cluster node with the next-highest priority for that resource group.
The restart method, if desired. (This is required if “Restart Count” is not zero.)
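To see how the restart count, the stabilization interval, and the failure action listed above interact, consider the following minimal sketch. It is our own model of the planning parameters, not the HACMP implementation: the MonitorState class and its field names are hypothetical, and the source does not state how HACMP keeps these counters internally.

```python
from dataclasses import dataclass


@dataclass
class MonitorState:
    # Hypothetical names; they mirror the planning items, not HACMP internals.
    restart_count: int              # restart attempts allowed before escalation
    restart_interval: float         # seconds of stability that reset the counter
    failure_action: str = "notify"  # "notify" (default) or "fallover"
    restarts_used: int = 0
    last_restart_at: float = float("-inf")

    def on_failure(self, now: float) -> str:
        """Return the action for a failure detected at time `now` (seconds)."""
        # If the application stayed up long enough, start counting from zero.
        if now - self.last_restart_at >= self.restart_interval:
            self.restarts_used = 0
        if self.restarts_used < self.restart_count:
            self.restarts_used += 1
            self.last_restart_at = now
            return "restart"
        # Restart attempts are exhausted: escalate as configured.
        return self.failure_action
```

For example, with a restart count of 2 and a 100-second interval, two failures in quick succession are answered with a restart, and a third one within the interval triggers the configured action (notify or fallover). A failure that occurs after a long stable period is again answered with a restart.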
Slide 77: If you plan to set up a custom monitor method, also check: Whether you have specified a program/script to be used for checking the specified application. The polling interval (in seconds) for how often the monitor method is to be run. If the monitor does not respond within this interval, the application is considered in error and the recovery process is started. The signal to kill the user-defined monitor method if it does not return within the polling interval. The default is SIGKILL. The time (in seconds) to wait before beginning monitoring. For example, with a database application, it is recommended to delay monitoring until after the start script and initial database search has been completed (otherwise, the application may be considered in an error state and the recovery process will be initiated). The restart count, that is the number of times to attempt to restart the application before taking any other actions. The interval (in seconds) that the application must remain stable before resetting the restart count. The action to be taken if the application cannot be restarted within the restart count. For more information, see the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02. 2.8 Disaster recovery planning Starting with HACMP V5.1, HAGEO and GeoRM have been integrated into HACMP as the IBM HACMP/XD (extended distance) feature. HAGEO software product provides a flexible, reliable platform for building disaster-tolerant computing environments. HAGEO components can mirror data across TCP/IP point-to-point networks over an unlimited distance from one geographic site to another. HAGEO works with HACMP to provide automatic detection, notification, and recovery of an entire geographic site from failures. Chapter 2. Planning and design 61
Slide 78: The disaster recovery strategies discussed in this book use two sites: the original and the recovery or backup site. Data recovery strategies must address the following issues: Data readiness levels. – Level 0: None. No provision for disaster recovery. – Level 1: Periodic backup. Data required for recovery up to a given date is backed up and sent to another location. – Level 2: Ready to roll forward. In addition to periodic backups, data update logs are also sent to another location. Transport can be manual or electronic. Recovery is to the last log data set stored at the recovery site. – Level 3: Roll forward or forward recover. A shadow copy of the data is maintained on disks at the recovery site. Data update logs are received and periodically applied to the shadow copy using recovery utilities. – Level 4: Real time roll forward. Like roll forward, except updates are transmitted and applied at the same time as they are being logged in the original site. This real-time transmission and application of log data does not impact transaction response time at the original site. – Level 5: Real time remote update. Both the original and the recovery copies of data are updated before sending the transaction response or completing a task. Site interconnection options. – Level 0: None. There is no interconnection or transport of data between sites. – Level 1: Manual transport. There is no interconnection. For transport of data between sites, dispatch, tracking, and receipt of data is managed manually. – Level 2: Remote tape. Data is transported electronically to a remote tape. Dispatch and receipt are automatic. Tracking can be either automatic or manual. – Level 3: Remote disk. Data is transported electronically to a remote disk. Dispatch, receipt, and tracking are all automatic. Recovery site readiness. – Cold: A cold site typically is an environment with the proper infrastructure, but little or no data processing equipment. 
This equipment must be installed as the first step in the data recovery process. Both periodic backup and ready to roll forward data can be shipped from a storage location to this site when a disaster occurs. 62 IBM HACMP for AIX V5.X Certification Study Guide
Slide 79: – Warm: A warm site has data processing equipment installed and operational. This equipment is used for other data processing tasks until a disaster occurs. Data processing resources can be used to store data, such as logs. Recovery begins after the regular work of the site is shut down and backed up. Both periodic backup and ready to roll forward data can be stored at this site to expedite disaster recovery. – Hot: A hot site has data processing equipment installed and operational and data can be restored either continually or regularly to reduce recovery time. All levels from roll forward to real-time remote update can be implemented. HAGEO software provides the highest level of disaster recovery: Level 5: HAGEO provides real-time remote update data readiness by updating both the original and the recovery copies of data prior to sending a transaction response or completing a task. Level 3: HAGEO also provides remote disk site interconnectivity by transmitting data electronically to a geographically distant site where the disks are updated and all bookkeeping is automatic. HAGEO provides hot site readiness. Since recovery site contains operational data processing equipment along with current data, this keeps recovery time to a minimum. Moreover, with HAGEO, the recovery site can be actively processing data and performing useful work. In fact, each site can be a backup for the other, thereby minimizing the cost of setting up a recovery site for each original production site. HACMP contribution to disaster recovery The HACMP base software lays the foundation of the loosely coupled clustering technology to prevent individual system components like processors, networks, and network adapters from being single points of failure within a cluster. This software ensures that the computing environment within a site remains highly available. 
You just have to define the system components within your site in terms of HACMP cluster components, and the HACMP base software facilities help keep the system components highly available within that site. For more information, see the High Availability Clusters Multi-Processing XD (Extended Distance) for HAGEO Technology: Planning and Administration Guide, SA22-7956. Chapter 2. Planning and design 63
Slide 80: Figure 2-9 presents a diagram of a geographic cluster with the remote mirroring (GeoRM) option.
Figure 2-9 HAGEO components (diagram not reproduced; labels in the figure: networks GEO_NET1 and PUB_NET1, a serial network via modem, nodes ICAR, ULISE, and AJAX with interfaces icar_geo1, ulise_geo1, and ajax_geo1, sites Paris and Bonn, resource group GEO_RG (Paris_rg), volume group fkmvg1, GMD fkmgeolv1 and fkmgeolog, StatMap fkmsmlv1 and fkmsmlog, file systems /fs1 and /fkm)
2.9 Review
In this section, we provide a quiz about the topics covered earlier in this chapter. The questions are multiple choice, with one or more correct answers. The questions are NOT the actual certification exam questions; they are just provided for testing your knowledge and understanding of the matters discussed in this chapter.
2.9.1 Sample questions
1. What is the maximum number of nodes supported in an HACMP V5.X cluster?
a. 16
b. 32
c. 48
d. 12
Slide 81: 2. Which are the most important characteristics of an AIX node to be considered while sizing the cluster?
a. CPU, amount of internal memory, amount of internal storage, and number of PCI I/O slots
b. CPU, amount of internal memory, number of power supplies, and size of internal storage
c. CPU, amount of internal memory, number of I/O slots, and type of external storage
d. CPU, amount of internal memory, type of external storage, and number of fans
3. Which are the most important characteristics to be considered while planning cluster shared storage?
a. Number of disks and power supplies
b. Number of nodes supported for shared access and data protection technology (RAID, JBOD, and so on)
c. Number of nodes and disks supported
4. What is the purpose of the non-IP network in an HACMP V5.X cluster?
a. To help avoid data corruption in case of IP network failure
b. Client access
c. Service network
d. Exchange heartbeat message
5. What is the definition of “communication interface” in HACMP V5.X?
a. An end of a point-to-point serial connection
b. A network interface capable of communicating via IP protocol
c. One physical interface used to provide node-to-node communication
6. What is the meaning of “IP alias” in HACMP V5.X?
a. A name of a communication interface
b. An IP address added to a communication interface on top of existing IP address(es)
c. An alternate hardware address
Slide 82: 7. How many persistent IP labels/addresses can be configured for each node in an HACMP cluster?
a. One per node per network
b. Two
c. Four
d. Eight
8. Select one non-supported type of non-IP network in HACMP V5.X:
a. RS232
b. Target mode SSA
c. 802_ether
d. Disk heartbeat
9. Which IP address takeover method requires that the service IP address(es) be in a different subnet from any of the boot IP addresses of the node’s communication interfaces?
a. IPAT via replacement
b. IPAT via aliasing with heartbeat over IP aliases
c. Hardware address takeover
d. IPAT via aliasing
10. Which is the default takeover mechanism used by HACMP V5.X for service IP addresses?
a. IPAT via replacement
b. IPAT via aliasing
c. Hardware address takeover
d. Heartbeat over IP aliases
11. Which is the default authentication mechanism for configuring HACMP V5.X cluster communication?
a. Standard
b. VPN
c. Enhanced
d. Kerberos V4
Slide 83: 12. What is the name of the new cluster communication daemon?
a. clinfoES
b. clstmgrES
c. cllockdES
d. clcomdES
13. Which base operating system (AIX) file is updated to provide automatic start for the cluster communication daemon?
a. /etc/hosts
b. /etc/services
c. /etc/rc.net
d. /etc/inittab
14. What type of volume group is used for disk heartbeat networks?
a. Concurrent
b. Non-concurrent
c. Enhanced concurrent
d. Shared concurrent
15. How many disks are required to configure the heartbeat over disk network between two cluster nodes?
a. Two
b. Three
c. One
d. One disk per pair of nodes, per enclosure
16. How many nodes can be connected (configured) on a single heartbeat over disk network?
a. Two
b. Three
c. Five
d. All cluster nodes
Slide 84: 17. Which filesets are required to be installed for using the concurrent resource manager with enhanced concurrent VGs?
a. bos.lvm and cluster.es.clvm
b. bos.clvm.enh and cluster.es.lvm
c. bos.clvm.enh and cluster.es.clvm
d. All of the above
18. When configuring the resource group attributes via SMIT, which option must be set to True to guarantee that IP address takeover will be performed after exporting the file systems that are part of that resource group?
a. File systems mounted after IP address configured
b. File systems mounted before IP configured
c. File systems automatically mounted
d. NFS hard mount
19. What is the amount of storage space required in the /var file system to accommodate HACMP dynamic reconfiguration (DARE) operation logs?
a. 10 MB
b. 4 MB per cluster node
c. 20 MB
d. 1 MB per cluster node
20. What is the new resource group type that provides for configuring settling timers?
a. Cascading
b. Rotating
c. Concurrent
d. Custom
21. Which IPAT method is supported for custom resource groups in HACMP V5.1?
a. IPAT via replacement
b. IPAT via aliasing
c. Hardware address takeover
d. Heartbeat over disk network
Answers to the quiz can be found in Appendix A, “ITSO sample cluster” on page 279.
Slide 85: Chapter 3. Installation and configuration
In this chapter, we cover some of the basic HACMP installation issues and various installation procedures. The topics covered in this chapter are:
HACMP software installation
Network configuration
Storage configuration
HACMP cluster configuration
– Topology configuration
– Resource configuration (standard)
– Custom resource configuration
Planning is one half of a successful implementation, and with HACMP we cannot emphasize enough how much proper planning is needed. If planning is not done properly, you might find yourself entangled in restrictions at a later point, and recovering from these restrictions can be a painful experience. Take your time and use the planning worksheets that come with the product; they are invaluable in any migration or problem determination situation, and for documenting the plan.
Slide 86: 3.1 HACMP software installation
The HACMP software provides a series of facilities that you can use to make your applications highly available. Keep in mind that not all system or application components are protected by HACMP. For example, if all the data for a critical application resides on a single disk, and that specific disk fails, then that disk is a single point of failure for the entire cluster, and is not protected by HACMP. AIX logical volume manager or storage subsystem protection must be used in this case. HACMP only provides takeover for the disk on the backup node, to make the data available for use.
This is why HACMP planning is so important: your major goal throughout the planning process is to eliminate single points of failure. A single point of failure exists when a critical cluster function is provided by a single component. If that component fails, the cluster has no other way of providing that function, and the application or service dependent on that component becomes unavailable. Also keep in mind that a well-planned cluster is easy to install, provides higher application availability, performs as expected, and requires less maintenance than a poorly planned cluster.
3.1.1 Checking for prerequisites
Once you have finished your planning worksheets, verify that your system meets the HACMP requirements; many potential errors can be eliminated if you make this extra effort. HACMP V5.1 requires one of the following operating system components:
AIX 5L V5.1 ML5 with RSCT V2.2.1.30 or higher.
AIX 5L V5.2 ML2 with RSCT V2.3.1.0 or higher (recommended 2.3.1.1).
C-SPOC vpath support requires SDD 1.3.1.3 or higher.
For the latest information about prerequisites and APARs, refer to the README file that comes with the product and the following IBM Web page: http://techsupport.services.ibm.com/server/cluster/
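The RSCT levels listed in 3.1.1 are dotted version numbers, and comparing them as plain strings gives wrong answers (for example, "2.2.1.5" sorts after "2.2.1.30" as text). The following is a minimal sketch of a numeric comparison, assuming you have already obtained the AIX release and the installed RSCT level by your own means. The MIN_RSCT table and the function names are ours, and the sketch does not check the AIX maintenance level (ML5 or ML2).

```python
from typing import Tuple

# Minimum RSCT level per AIX release, taken from the HACMP V5.1 list above.
MIN_RSCT = {
    "5.1": (2, 2, 1, 30),
    "5.2": (2, 3, 1, 0),
}


def parse_level(text: str) -> Tuple[int, ...]:
    """Turn a dotted level such as '2.3.1.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def rsct_ok(aix_release: str, rsct_level: str) -> bool:
    """True if the RSCT level meets the listed minimum for this AIX release."""
    minimum = MIN_RSCT.get(aix_release)
    if minimum is None:
        return False  # AIX release is not in the list above
    # Tuples compare element by element, so 2.2.1.30 ranks above 2.2.1.5.
    return parse_level(rsct_level) >= minimum
```

For instance, rsct_ok("5.2", "2.3.1.1") is true, while rsct_ok("5.1", "2.2.1.5") is false.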
Slide 87: 3.1.2 New installation
HACMP supports the Network Installation Management (NIM) program, including the Alternate Disk Migration option. You must install the HACMP filesets on each cluster node. You can install HACMP filesets either by using NIM or from a local software repository.
Installation via a NIM server
We recommend using NIM, simply because it allows you to load the HACMP software onto other nodes faster from the server than from other media. Furthermore, it is a flexible way of distributing, updating, and administering your nodes. It allows you to install multiple nodes in parallel and provides an environment for maintaining software updates. This is very useful and a time saver in large environments; for smaller environments, a local repository might be sufficient. If you choose NIM, you need to copy all the HACMP filesets onto the NIM server and define an lpp_source resource before proceeding with the installation.
Installation from CD-ROM or hard disk
If your environment has only a few nodes, or if the use of NIM is more than you need, you can use a simple CD-ROM installation, or make a local repository by copying the HACMP filesets locally and then using the exportfs command; this allows other nodes to access the data using NFS. For other installation examples, such as installations on SP systems, and for instructions on how to create an installation server, please refer to Part 3, “Network Installation”, in the AIX 5L Version 5.2 Installation Guide and Reference, SC23-4389.
3.1.3 Installing HACMP
Before installing HACMP, make sure you read the HACMP V5.1 release notes in the /usr/es/lpp/cluster/doc directory for the latest information on requirements or known issues. To install the HACMP software on a server node, do the following steps:
1. If you are installing directly from the installation media, such as a CD-ROM or from a local repository, enter the smitty install_all fast path. SMIT displays the Install and Update from ALL Available Software screen.
2. Enter the device name of the installation medium or install directory in the INPUT device/directory for software field and press Enter.
Slide 88: 3. Enter the corresponding field values. To select the software to install, press F4 for a software listing, or enter all to install all server and client images. Select the packages you want to install according to your cluster configuration. Some of the packages may require prerequisites that are not available in your environment (for example, Tivoli Monitoring). The cluster.es and cluster.cspoc images (which contain the HACMP run-time executables) are required and must be installed on all servers.
Note: If you are installing the Concurrent Resource Manager feature, you must install the cluster.es.clvm LPPs, and if you choose cluster.es and cluster.cspoc, you must also select the associated message packages.
Make sure you select Yes in the Accept new license agreements field. You must choose Yes for this item to proceed with installation. If you choose No, the installation may stop with a warning that one or more filesets require the software license agreements. You accept the license agreement only once for each node.
4. Press Enter to start the installation process.
Post-installation steps
To complete the installation after the HACMP software is installed, do the following steps:
1. Verify the software installation by using the AIX command lppchk, and check the installed directories to see if the expected files are present.
2. Run the commands lppchk -v and lppchk -c cluster*. Both commands run clean if the installation is OK; if not, use the proper problem determination techniques to fix any problems.
3. Although not mandatory, we recommend you reboot each cluster node in your HACMP environment. If you do not want to reboot, make sure you start the cluster communication daemon (clcomdES) on all cluster nodes with the following command:
# startsrc -s clcomdES
3.1.4 Migration paths and options
If you are in the process of upgrading or converting your HACMP cluster, the following options are available: node-by-node migration and snapshot conversion.
Slide 89: Node-by-node migration
The node-by-node migration path is used if you need to keep the application available during the migration process. The steps for a node-by-node migration are:
1. Stop the cluster services on one cluster node.
2. Upgrade the HACMP software.
3. Reintegrate the node into the cluster.
This process has also been referred to as “rolling migration”. This migration option has certain restrictions; for more details, see 3.1.6, “Node-by-node migration” on page 77.
If you can afford a maintenance window for the application, the steps for migration are:
1. Stop cluster services on all cluster nodes.
2. Upgrade the HACMP software on each node.
3. Start cluster services on one node at a time.
Snapshot migration
You can also convert the entire cluster to HACMP V5.1 by using a cluster snapshot facility. However, the cluster will be unavailable during the entire process, and all nodes must be upgraded before the cluster is activated again. For more details, see 3.1.5, “Converting a cluster snapshot” on page 73.
3.1.5 Converting a cluster snapshot
This migration method has been provided for cases where both AIX and HACMP must be upgraded/migrated at once (for example, AIX V4.3.3 and HACMP V4.4.1 to AIX 5L V5.1 and HACMP V5.1).
Important: Do not leave your cluster in a mixed-version state for long periods of time, since high availability cannot be guaranteed.
If you are migrating from an earlier supported version of HACMP (HAS) to HACMP V5.X, you can migrate the cluster without taking a snapshot. Save the planning worksheet and configuration files from the current configuration for future reference if you want to configure the HACMP cluster in the same way as it was configured in the previous installation. Uninstall the HACMP software components, reinstall them with the latest HACMP version, and configure them according to the saved planning and configuration files.
Slide 90: Note: After a migration or upgrade, none of the new HACMP V5.X features are active. To activate the new features (enhancements), you need to configure the options and synchronize the cluster.
To convert from a supported version of HAS to HACMP, do the following steps:
1. Make sure that the current software is committed (not in applied status).
2. Save your HAS cluster configuration in a snapshot and save any customized event scripts you want to retain.
3. Remove the HAS software on all nodes in the cluster.
4. Install the HACMP V5.1 software.
5. Verify the installed software.
6. Convert and apply the saved snapshot. The cluster snapshot utility allows you to save the cluster configuration to a file.
7. Reinstall any saved customized event scripts, if needed.
8. Reboot each node.
9. Synchronize and verify the HACMP V5.1 configuration.
The following sections explain each of these steps.
Check for previous HACMP versions
To see if HACMP Classic (HAS) software exists on your system, enter the following command:
# lslpp -h "cluster*"
If the output of the lslpp command reveals that HACMP is installed, but at a level earlier than V4.5, you must upgrade to V4.5 at a minimum before continuing with the snapshot conversion utility. For more information, please refer to the HACMP for AIX 5L V5.1 Administration and Troubleshooting Guide, SC23-4862-02.
Saving your cluster configuration and customized event scripts
To save your HACMP (HAS) (V4.5 or greater) cluster configuration, create a snapshot in HACMP (HAS). If you have customized event scripts, they must also be saved.
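When you pick a location for the snapshot and the saved event scripts, it can help to check the path against the directories that the HACMP installation deletes and recreates (/usr/sbin/cluster, /usr/es/sbin/cluster, and /usr/lpp/cluster, as named in the Attention note in this section). The following is a minimal sketch of such a check; the function name is ours, and it assumes an absolute path that contains no ".." components or symbolic links.

```python
from pathlib import PurePosixPath

# Directories that are deleted and recreated when HACMP packages are installed.
UNSAFE_DIRS = [
    PurePosixPath("/usr/sbin/cluster"),
    PurePosixPath("/usr/es/sbin/cluster"),
    PurePosixPath("/usr/lpp/cluster"),
]


def is_safe_save_path(path: str) -> bool:
    """True if `path` is neither one of the unsafe directories nor below one."""
    p = PurePosixPath(path)
    return not any(p == d or d in p.parents for d in UNSAFE_DIRS)
```

The comparison works on whole path components, so a directory such as /usr/sbin/cluster_backup is not mistaken for /usr/sbin/cluster.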
Slide 91: Attention: Do not save your cluster configuration or customized event scripts in any of the following directory paths: /usr/sbin/cluster, /usr/es/sbin/cluster, or /usr/lpp/cluster. These directories are deleted and recreated during the installation of new HACMP packages.
How to remove the HACMP (HAS) software
To remove the HACMP software and your cluster configuration on cluster nodes and clients, do the following steps:
1. Enter the smitty install_remove fast path. You should get the screen shown in Example 3-1.
Example 3-1 Remove installed software
                         Remove Installed Software
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                      [Entry Fields]
* SOFTWARE name                                       [cluster*]     +
  PREVIEW only? (remove operation will NOT occur)     yes            +
  REMOVE dependent software?                          no             +
  EXTEND file systems if space needed?                no             +
  DETAILED output?                                    no             +
F1=Help      F2=Refresh    F3=Cancel    F4=List
F5=Reset     F6=Command    F7=Edit      F8=Image
F9=Shell     F10=Exit      Enter=Do
Installing HACMP V5.1
Follow the instructions for installing the HACMP software in 3.1.3, “Installing HACMP” on page 71.
Note: Do not reboot until you have converted and applied the saved snapshot.
Verify the installed software
After installing HACMP, verify that the expected files are there using lppchk. For more information, see “Post-installation steps” on page 72.
Convert and apply the saved snapshot
After you have installed HACMP V5.1 on the cluster nodes, you need to convert and apply the snapshot you saved from your previous configuration.
Slide 92: Important: Converting the snapshot must be performed before rebooting the cluster nodes. To convert and apply the saved snapshot: 1. Use the clconvert_snapshot utility, specifying the HACMP (HAS) version number and snapshot file name to be converted. The -C flag converts an HACMP (HAS) snapshot to an HACMP V5.1 snapshot format: clconvert_snapshot -C -v version -s <filename> 2. Apply the snapshot. Reinstall saved customized event scripts Reinstall any customized event scripts that you saved from your previous configuration. Note: Some pre- and post-event scripts used in previous versions may not be useful in HACMP V5.1, especially in resource groups using parallel processing. Reboot cluster nodes Rebooting the cluster nodes is necessary to activate the new cluster communication daemon (clcomdES). Verify and synchronize the cluster configuration After applying the HACMP software and rebooting each node, you must verify and synchronize the cluster topology. Verification provides errors and/or warnings to ensure that the cluster definition is the same on all nodes. In the following section, we briefly go through the cluster verification process. Run smitty hacmp and select Extended Configuration → Extended Verification and Synchronization, select Verify changes only, and press Enter (see Example 3-2 on page 77). 76 IBM HACMP for AIX V5.X Certification Study Guide
Slide 93: Example 3-2 HACMP Verification and Synchronization
        HACMP Verification and Synchronization (Active Cluster on a Local Node)
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                      [Entry Fields]
* Emulate or Actual                                   [Actual]       +
  Force synchronization if verification fails?        [No]           +
* Verify changes only?                                [No]           +
* Logging                                             [Standard]     +
F1=Help      F2=Refresh    F3=Cancel    F4=List
F5=Reset     F6=Command    F7=Edit      F8=Image
F9=Shell     F10=Exit      Enter=Do
Important: You cannot synchronize the configuration in a mixed-version cluster. While upgrading, you should not leave the cluster with mixed versions of HACMP for long periods of time. New functionality supplied with V5.1 is only available when all nodes have been upgraded and the cluster has been synchronized.
3.1.6 Node-by-node migration
You must consider the following items in order to perform a node-by-node (“rolling”) migration:
All nodes in the cluster must have HACMP V4.5 installed and committed. Node-by-node migration functions only for HACMP (HAS) V4.5 to HACMP V5.1 migrations.
All nodes in the cluster must be up and running the HAS V4.5 software.
The cluster must be in a stable state.
There must be enough disk space to hold both HAS and HACMP software during the migration process:
– Approximately 120 MB in the /usr directory
– Approximately 1.2 MB in the / (root) directory
When the migration is complete, the space requirements are reduced to the normal amount necessary for HACMP V5.1 alone.
Nodes must have enough memory to run both HACMP (HAS) and HACMP daemons simultaneously. This is a minimum of 64 MB of RAM. 128 MB of RAM is recommended.
Slide 94: – Check that you do not have network types unsupported in HACMP. You cannot make configuration changes once migration is started. You must remove or change unsupported types beforehand. See Chapter 3, “Planning Cluster Network Connectivity”, of the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02 for details.

Important: As in any migration, once you have started the migration process, do not attempt to make any changes to the cluster topology or resources.

– If any nodes in the cluster are currently set to start cluster services automatically on reboot, change this setting before beginning the migration process. The following procedures describe how to turn off automatic startup for a cluster.
   – Use C-SPOC to disable automatic starting of cluster services on system restart.
   – Use the SMIT fastpath smitty clstop, and select the options shown in Example 3-3.

Example 3-3 Stop cluster services

  Stop Cluster Services

  Type or select values in entry fields.
  Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  * Stop now, on system restart or both                 on system restart   +
    Stop Cluster Services on these nodes                [p630n01]           +
    BROADCAST cluster shutdown?                         true                +
  * Shutdown mode                                       graceful            +

  F1=Help      F2=Refresh    F3=Cancel    F4=List
  F5=Reset     F6=Command    F7=Edit      F8=Image
  F9=Shell     F10=Exit      Enter=Do

If you do not use C-SPOC, you must change the setting on each cluster node individually.
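The disk-space and memory prerequisites listed for node-by-node migration lend themselves to a quick scripted check before the migration starts. The following is a minimal sketch, not an IBM-supplied tool: the function names (free_mb, check_prerequisites) are hypothetical, the thresholds are the approximate figures quoted in the text, and the amount of installed RAM is assumed to be supplied by the caller.

```python
import shutil

# Approximate figures quoted in the prerequisites, treated here as lower bounds.
REQUIRED_FREE_MB = {"/usr": 120.0, "/": 1.2}
MIN_RAM_MB = 64
RECOMMENDED_RAM_MB = 128


def free_mb(path):
    """Return the free space, in MB, of the file system that holds `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def check_prerequisites(free_by_path, ram_mb):
    """Compare measured values against the documented minimums.

    free_by_path maps a directory to its free space in MB; ram_mb is the
    installed memory in MB. Both are supplied by the caller (hypothetical
    interface, chosen so the check can be tested without touching a node).
    """
    errors, warnings = [], []
    for path, needed in REQUIRED_FREE_MB.items():
        available = free_by_path.get(path, 0.0)
        if available < needed:
            errors.append(f"{path}: {available:.1f} MB free, about {needed} MB needed")
    if ram_mb < MIN_RAM_MB:
        errors.append(f"RAM: {ram_mb} MB installed, {MIN_RAM_MB} MB is the minimum")
    elif ram_mb < RECOMMENDED_RAM_MB:
        warnings.append(f"RAM: {ram_mb} MB installed, {RECOMMENDED_RAM_MB} MB is recommended")
    return {"errors": errors, "warnings": warnings}
```

On a node, the measured values could be gathered with something like check_prerequisites({p: free_mb(p) for p in REQUIRED_FREE_MB}, ram_mb=512), where 512 stands in for the real memory size.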
Slide 95: How to perform a node-by-node migration
To perform a node-by-node migration from HACMP V4.5 to HACMP V5.1, do the following steps:
1. Save the current configuration in a snapshot (as a precautionary measure). Place it in a safe directory (one that is not touched by the installation procedures). Do not use /usr/sbin/cluster.
2. Stop cluster services on one of the nodes running HAS V4.5 using the graceful with takeover method. To stop cluster services from the command line, run:
   # /usr/es/sbin/cluster/utilities/clstop -gr
3. Verify that the cluster services are stopped on the node and that its cluster resources have been transferred to takeover nodes before proceeding.
4. Install HACMP V5.1 on the node. For instructions, see 3.1, “HACMP software installation” on page 70.
5. Check the installed software using the AIX command lppchk. See “Post-installation steps” on page 72.
6. Reboot the node.
7. Restart the HACMP software:
   a. Enter the fast path smitty hacmp.
   b. Go to System Management (C-SPOC).
   c. Select Manage HACMP Services.
   d. Select Start Cluster Services.
   When you restart Cluster Services:
   – The HACMP software is also started.
   – HACMP cluster services run on the node and the node rejoins the cluster.
   – The node reacquires the cascading resources for which it is the primary node (depending on your Inactive Takeover setting).
   Both the old and new versions of HACMP (that is, HACMP V4.5 and Enhanced Scalability HACMP V5.1) are now running on the node, but only HACMP Classic (HAS) controls the cluster events and resources. If you list the daemons controlled by the system resource controller (SRC), you will see the following daemons listed on this hybrid node (see Table 3-1 on page 80).
Slide 96: Table 3-1 List of daemons used by HACMP

  HACMP                 HACMP/ES                RSCT
  clstrmgr              clstrmgrES              grpsvcs
  cllockd (optional)    cllockdES (optional)    topsvcs
  clsmuxpd              clsmuxpdES              emsvcs
  clinfo (optional)     clinfoES (optional)     grpglsm
                        clcomdES                emaixos

8. Repeat steps 2 through 6 for all the other nodes in the cluster.

Attention: Starting the cluster services on the last node is the point of no return. Once you have restarted HACMP (which restarts both versions of HACMP) on the last node, and the migration has commenced, you cannot reverse the migration. If you wish to return to the HACMP V4.5 configuration after this point, you will have to reinstall the HACMP V4.5 software and apply the saved snapshot. Up to this point, you can back out of the installation of HACMP V5.1 and return to your running HACMP V4.5 cluster. If you need to do this, see “Backout procedure” on page 82.

During the installation and migration process, when you restart each node, the node is running both products, with the HACMP clstrmgr in control of handling cluster events and the clstrmgrES in passive mode.
After you start the cluster services on the last node, the migration to HACMP V5.1 proceeds automatically. Full control of the cluster transfers automatically to the HACMP V5.1 daemons.
Messages documenting the migration process are logged to the /tmp/hacmp.out file as well as to the /tmp/cm.log and /tmp/clstrmgr.debug log files.
When the migration is complete, and all cluster nodes are up and running HACMP V5.1, the HACMP (HAS) software is uninstalled.

9. After all nodes have been upgraded and rebooted, and the cluster is stable, synchronize and verify the configuration. For more information, see 3.5.8, “Verify and synchronize HACMP” on page 151. You should also test the cluster’s proper fall-over and recovery behavior after any migration.
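The hand-off rule described above (the classic cluster manager stays in control until cluster services have been restarted on the last node, after which backing out is no longer possible) can be captured in a few lines. This is only an illustrative model under that stated rule; the function names are hypothetical and nothing here queries a real cluster.

```python
def controlling_manager(nodes_restarted, total_nodes):
    """Return the daemon that handles cluster events in this simplified model.

    nodes_restarted counts the nodes on which cluster services have been
    restarted after installing HACMP V5.1 (hybrid nodes).
    """
    if total_nodes < 1 or not 0 <= nodes_restarted <= total_nodes:
        raise ValueError("node counts are out of range")
    # Control moves to the V5.1 daemon only once the last node has restarted.
    if nodes_restarted == total_nodes:
        return "clstrmgrES"
    return "clstrmgr"


def can_back_out(nodes_restarted, total_nodes):
    """Backing out is possible only before the last node has been restarted."""
    return controlling_manager(nodes_restarted, total_nodes) == "clstrmgr"
```

For a three-node cluster, the model reports clstrmgr in control after two restarts and clstrmgrES after the third, which is the point of no return.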
Slide 97: Note: During the process of node-by-node migration from HAS 4.5 to HACMP V5.1, you will see the following warnings:
   sysck: 3001-036 WARNING: File /etc/cluster/lunreset.lst is also owned by fileset cluster.base.server.events.
   sysck: 3001-036 WARNING: File /etc/cluster/disktype.lst is also owned by fileset cluster.base.server.events.
You may safely ignore these warnings and proceed with the installation.

config_too_long message
When the migration process has completed and the HACMP filesets are being deinstalled, you may see a config_too_long message. This message appears when the cluster manager detects that an event has been processing for more than the specified time. The config_too_long messages continue to be appended to the hacmp.out log until the event completes. If you observe these messages, you should periodically check that the event is indeed still running and has not failed.
You can avoid these messages by increasing the time to wait before HACMP calls the config_too_long event (use SMIT). To change the interval allocated for an event to process, do the following steps:
1. Enter the fast path smitty hacmp.
2. Go to Extended Configuration.
3. Select Extended Event Configuration.
4. Select Change/Show Time Until Warning.
You must do this on every node. It takes effect after restarting cluster services.

How the node-by-node migration process works
When you have installed HACMP on all cluster nodes (all nodes are now in a hybrid state), starting Cluster Services on the last cluster node automatically triggers the transfer of control to HACMP V5.1 as follows:
1. Installing HACMP V5.1 installs a recovery file called firstboot in a holding directory on the cluster node, and creates a migration file (.mig) to be used as a flag during the migration process.
2. The HACMP recovery driver sends a message to the HACMP Cluster Manager telling it to run the waiting and waiting_complete events.
– HACMP uses the RSCT Group Services to verify cluster stability and membership.
Slide 98: – The firstboot file on each cluster node is moved to an active directory (/etc).
– The migration flag (.mig file) created during installation is transferred from the HACMP V5.1 directory to the HACMP V4.5 directory on all nodes.
When the firstboot file is moved to the active directory and the .mig file transfer is complete on all nodes, transfer of control to HACMP continues with the HACMP migrate event.
3. The HACMP recovery driver issues the migrate event.
   – HACMP V5.1 stops the HACMP V4.5 daemons using the forced option.
   – The HACMP V5.1 clinfoES and clsmuxpdES daemons are all activated, reusing the ports previously used by the HACMP V4.5 versions of those daemons.
4. HACMP V5.1 recovery driver runs the migrate_complete event.
   – HACMP V4.5 is deinstalled. Configuration files common to both products are left untouched.
   – Base directories are relinked.
   – The /etc/firstboot files are removed.
   – The migration flag (.mig file) in the HACMP /usr/sbin/cluster directory is removed.
5. Migration is now complete.

Cluster snapshots saved during migration
Pre-existing HACMP snapshots are saved in the /usr/es/sbin/cluster/snapshots directory.

Handling node failure during the migration process
If a node fails during the migration process after its firstboot file has been moved to an active directory, it completes the migration process during node reboot. However, the failed node may have an HACMP ODM that is not in sync when it reintegrates into the cluster. In this case, synchronize the topology and resources of the cluster before reintegrating the failed node into the cluster. To synchronize the cluster, see 3.5.8, “Verify and synchronize HACMP” on page 151.

Backout procedure
If for some reason you decide not to complete the migration process, you can uninstall the HACMP V5.1 software on the nodes where you have installed it at any point in the process before starting HACMP on the last node.
Slide 99: Note: Deinstall the HACMP software only on the local node. During a migration, do not select the option to deinstall the software from multiple nodes.

To deinstall the HACMP software:
1. On each node, one by one, stop cluster services. To stop cluster services, see Example 3-3 on page 78. Check that the cluster services are stopped on the node and that its cluster resources have been transferred to takeover nodes before proceeding.
2. When you are sure the resources on the node have been properly transferred to a takeover node, remove the HACMP V5.1 software. See “How to remove the HACMP (HAS) software” on page 75.
3. Start HACMP on this node. When you are certain the resources have transferred properly (if necessary) back to this node, repeat these steps on the next node.
4. Continue this process until HACMP has been removed from all nodes in the cluster.

Handling synchronization failures during node-by-node migration
If you try to make a change to the cluster topology or resources when migration is incomplete, the synchronization process will fail. You will receive the following message:
   cldare: Migration from HACMP V4.5 to HACMP V5.1 Detected. cldare cannot be run until migration has completed.
To back out from the change, you must restore the active ODM. Do the following steps:
1. Enter smitty hacmp.
2. Go to Problem Determination Tools.
3. Select Restore HACMP Configuration Database from Active Configuration.
Slide 100: 3.1.7 Upgrade options
Here we discuss upgrades to HACMP.

Supported upgrades to HACMP V5.1
HACMP conversion utilities provide an easy upgrade path from the versions listed here to V5.1:
– HACMP/ES V4.4.1 to HACMP V5.1
– HACMP/ES V4.5 to HACMP V5.1
If you wish to convert to HACMP V5.1 from versions earlier than those listed here, you must first upgrade to one of the supported versions. You will then be able to convert to HACMP V5.1. For example, to convert from HACMP/ES 4.2.2 to HACMP V5.1, you must first perform an installation upgrade to HACMP/ES 4.4.1 or higher and then upgrade to HACMP V5.1.

To upgrade to HACMP V5.1, do the following steps:
1. Upgrade to AIX 5L V5.1 Maintenance Level 5 or higher if needed.
2. Check and verify the AIX installation, if needed.
3. Commit your current HACMP software on all nodes.
4. Stop HACMP/ES on one node (gracefully with takeover) using the clstop command.
5. After the resources have moved successfully from the stopped node to a takeover node, install the new HACMP software. For instructions on installing the HACMP V5.1 software, see 3.1, “HACMP software installation” on page 70. Verify the software installation by using the AIX command lppchk, and check the installed directories to see that expected files are present:
   lppchk -v
   or
   lppchk -c "cluster.*"
   Both commands should run clean if the installation is OK.
6. Reboot the first node.
7. Start the HACMP software on the first node using smitty clstart and verify that the first node successfully joins the cluster.
8. Repeat the preceding steps on remaining cluster nodes, one at a time.
9. Check that the tty device is configured as a serial network.
10. Check that all external disks are available on the first node (use lspv to check the PVIDs for each disk). If PVIDs are not displayed for the disks, you may need to remove the disks and reconfigure them.
Slide 101: 11. After all nodes have been upgraded, synchronize the node configuration and the cluster topology from Node A to all nodes, as described in “Verifying the upgraded cluster definition” on page 85. Do not skip verification during synchronization.
    Important: When upgrading, never synchronize the cluster definition from an upgraded node while a node that has not been upgraded remains in a mixed-version cluster. The cl_convert utility assigns node IDs that are consistent across all nodes in the cluster. These new IDs may conflict with the already existing ones.
12. Restore the HACMP event ODM object class to save any pre- and post-events you have configured for your cluster.
13. Make additional changes to the cluster if needed.
14. Complete a test phase on the cluster before putting it into production.

Verifying the upgraded cluster definition
To verify the cluster, see 3.5.8, “Verify and synchronize HACMP” on page 151.

cl_convert and clconvert_snapshot
The HACMP conversion utilities are cl_convert and clconvert_snapshot. Upgrading HACMP/ES software to the newest version of HACMP involves converting the ODM from a previous release to that of the current release. When you install HACMP, cl_convert is run automatically. However, if installation fails, you must run cl_convert from the command line. In a failed conversion, run cl_convert using the -F flag. For example, to convert from HACMP/ES V4.5 to HACMP V5.1, use the -F and -v (version) flags as follows (note the “0” added for V4.5):
   # /usr/es/sbin/cluster/conversion/cl_convert -F -v 4.5.0
Running a conversion utility requires:
– Root user privileges
– The HACMP version from which you are converting
The cl_convert utility logs the conversion progress to the /tmp/clconvert.log file so that you can gauge conversion success. This log file is generated (overwritten) each time cl_convert or clconvert_snapshot is executed.
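The cl_convert example above appends a “0” to the version, turning V4.5 into 4.5.0. A small helper can apply that padding before the command line is assembled. This is a sketch under an assumption: that the -v flag expects three dot-separated numeric fields, which the text shows only for V4.5. The helper names are hypothetical; the utility path and the -F and -v flags are the ones quoted in the text, and nothing is executed.

```python
# Path of the conversion utility as quoted in the text.
CL_CONVERT = "/usr/es/sbin/cluster/conversion/cl_convert"


def pad_version(version, parts=3):
    """Pad a dotted version with trailing zeros, for example 4.5 becomes 4.5.0.

    Assumption: the utility wants `parts` numeric fields; the source only
    demonstrates this for V4.5.
    """
    fields = version.strip().split(".")
    if len(fields) > parts or not all(field.isdigit() for field in fields):
        raise ValueError(f"unexpected version string: {version!r}")
    fields += ["0"] * (parts - len(fields))
    return ".".join(fields)


def forced_convert_command(version):
    """Build (but do not run) the argument list for a forced conversion."""
    return [CL_CONVERT, "-F", "-v", pad_version(version)]
```

Returning an argument list rather than a string keeps the sketch free of shell quoting concerns if it is later handed to a process launcher.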
The clconvert_snapshot utility is not run automatically during installation, and must be run from the command line. Run clconvert_snapshot to upgrade cluster
Slide 102: snapshots when migrating from HACMP (HAS) to HACMP, as described in “cl_convert and clconvert_snapshot” on page 85.

Upgrading the concurrent resource manager
To install the concurrent access feature on cluster nodes, install the Concurrent Resource Manager (CRM) using the procedure outlined in 3.1, “HACMP software installation” on page 70.
AIX 5L V5.1 supports enhanced concurrent mode (ECM). If you are installing HACMP with the Concurrent Resource Manager feature, see Chapter 2, “Initial Cluster Planning”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.
See Chapter 5, “Planning Shared LVM Components”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02 for information on enhanced concurrent mode and on supported IBM shared disk devices. In addition, if you want to use disks from other manufacturers, see Appendix D, “OEM Disk Accommodation”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.

Problems during the installation
If you experience problems during the installation, the installation program automatically performs a cleanup process. If, for some reason, the cleanup is not performed after an unsuccessful installation, do the following steps:
1. Enter smitty install.
2. Select Software Maintenance and Utilities.
3. Select Clean Up After a Interrupted Installation.
4. Review the SMIT output (or examine the /smit.log file) for the interruption’s cause.
5. Fix any problems by using AIX problem determination techniques and repeat the installation process.

3.2 Network configuration
Cluster nodes communicate with each other over communication networks. If one of the physical network interface cards (NIC) on a node on a network fails, HACMP preserves the communication to the node by transferring the traffic to another physical network interface card on the same node. If a “connection” to the node fails, HACMP transfers resources to another available node.
Slide 103: In addition, HACMP (via RSCT topology services) uses heartbeat messages between the nodes (over the cluster networks) to periodically check availability of the cluster nodes and communication interfaces. If HACMP detects no heartbeat from a node, the node is considered failed, and its resources are automatically transferred to another node.
Configuring multiple communication paths between the cluster nodes is highly recommended. Having multiple networks prevents cluster partitioning (“split brain”). In a partitioned cluster, the danger is that the nodes in each partition could simultaneously, without coordination, access the same data, which results in data corruption.

3.2.1 Types of networks
Here we discuss the types of networks.

Physical and logical networks
A physical network connects two or more physical network interfaces. There are many types of physical networks, and HACMP broadly categorizes them as IP-based and non-IP networks:
– TCP/IP-based, such as Ethernet or Token Ring
– Device-based, such as RS-232 or target mode SSA (tmssa)
In HACMP, all network interfaces that can communicate with each other directly are grouped in a logical network. HACMP assigns a name for each HACMP logical network (for example, net_ether_01). A logical network in HACMP may contain one or more IP subnets. RSCT manages the heartbeat packets in each logical subnet.

Global network
A global network is a combination of multiple HACMP networks. The HACMP networks may be composed of any combination of physically different networks, and/or different logical networks (subnets), as long as they share the same “collision domain”, for example, Ethernet. HACMP treats the combined global network as a single network. RSCT handles the routing between the networks defined in a global network.

3.2.2 TCP/IP networks
The IP based networks supported by HACMP are:
– ether (Ethernet)
– atm (Asynchronous Transfer Mode - ATM)
Slide 104: – fddi (Fiber Distributed Data Interface - FDDI)
– hps (SP Switch)
– token (Token Ring)
These types of IP based networks are monitored by HACMP via RSCT topology services.

Heartbeat over IP aliases
In prior releases of HACMP, heartbeats were exchanged over the service and non-service IP addresses/labels (base or boot IP addresses/labels). In HACMP V5.1, you can configure heartbeat over IP aliases. With this configuration, the communication interfaces’ IP boot addresses can reside on the same subnet or different ones. RSCT sets up separate heartbeat rings for each communication interface group, using automatically assigned IP aliases grouped in different subnets. You can use non-routable subnets for the heartbeat rings, preserving your other subnets for routable (client) traffic.
For more information on configuration of heartbeat over IP based network, see 3.4.6, “Defining communication interfaces” on page 122.

Persistent IP addresses/labels
A persistent node IP label is an IP alias that can be assigned to a network for a specified node. A persistent node IP label is a label that:
– Always stays on the same node (is node-bound)
– Co-exists with other IP labels present on the same interface
– Does not require the installation of an additional physical interface on that node
– Is not part of any resource group
Assigning a persistent node IP label for a network on a node allows you to have a node-bound address on a cluster network that you can use for administrative purposes to access a specific node in the cluster.
For more information, see 3.4.9, “Defining persistent IP labels” on page 126.

Non-IP networks
Non-IP networks in HACMP are used as an independent path for exchanging messages between cluster nodes. In case of IP subsystem failure, HACMP can still differentiate between a network failure and a node failure when an independent path is available and functional.
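The reasoning behind the independent non-IP path can be made concrete with a small decision function: if heartbeats stop on the IP networks but still arrive over the non-IP path, the peer node is alive and the fault lies in the network. The sketch below is a deliberately simplified illustration of that reasoning, not the RSCT algorithm; the function name and the result strings are hypothetical.

```python
def classify_peer(ip_heartbeat_seen, non_ip_heartbeat_seen=None):
    """Classify a peer from which heartbeat paths are still delivering.

    non_ip_heartbeat_seen is None when no non-IP network is configured.
    """
    if ip_heartbeat_seen:
        if non_ip_heartbeat_seen is False:
            return "non-IP path failure"
        return "peer reachable"
    if non_ip_heartbeat_seen is None:
        # Without an independent path the two cases cannot be told apart.
        return "undetermined"
    if non_ip_heartbeat_seen:
        return "network failure"
    return "node failure"
```

The "undetermined" branch is the situation that can lead to a partitioned cluster, which is why at least one non-IP connection is recommended.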
Below is a short description of the four currently available non-IP network types and their characteristics. Even though it is possible to configure an HACMP cluster without non-IP networks, it is
Slide 105: strongly recommended that you use at least one non-IP connection between the cluster nodes.
Currently HACMP supports the following types of networks for non-TCP/IP heartbeat exchange between cluster nodes:
– Serial (RS232)
– Disk heartbeat network (diskhb)
– Target-mode SSA (tmssa)
– Target-mode SCSI (tmscsi)

Serial (RS232)
A serial (RS232) network needs at least one available serial port per cluster node. In a cluster consisting of more than two nodes, a ring of nodes is established through serial connections, which requires two serial ports per node. If the number of native serial ports does not match your HACMP cluster configuration needs, you can extend it by adding an eight-port asynchronous adapter.
For more information, see 3.4.7, “Defining communication devices” on page 124.

Disk heartbeat network
In certain situations RS232, tmssa, and tmscsi connections are considered too costly or complex to set up. Heartbeating via disk (diskhb) provides users with:
– A point-to-point network type that is very easy to configure.
– Additional protection against cluster partitioning.
– A point-to-point network type that can use any disk type to form a data path.
– A setup that does not require additional hardware; it can use a disk that is also used for data and included in a resource group.
In order to support SSA concurrent VGs, there is a small space reserved on every disk for use in clvmd communication. Enhanced concurrent VGs do not use the reserved space for communication; instead, they use the RSCT group services.
Disk heartbeating uses a reserved disk sector (that has been reserved for SSA concurrent mode VGs) as a zone where nodes can exchange keepalive messages.
Any disk that is part of an enhanced concurrent VG can be used for a diskhb network, including those used for data storage. Moreover, the VG that contains the disk used for a diskhb network does not have to be varied on.
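Returning to the serial (RS232) network described above: a two-node cluster needs one point-to-point link, while larger clusters are cabled as a ring, which is why each node then needs two serial ports. The following sketch enumerates the links and counts the ports per node. The function names are hypothetical, and apart from p630n01 (which appears in Example 3-3) the node names are made up for illustration.

```python
def serial_links(nodes):
    """Return the point-to-point serial links needed for the given nodes."""
    count = len(nodes)
    if count < 2:
        return []
    if count == 2:
        # Two nodes need a single link, not a ring.
        return [(nodes[0], nodes[1])]
    # More than two nodes: each node is linked to its successor, closing a ring.
    return [(nodes[i], nodes[(i + 1) % count]) for i in range(count)]


def ports_per_node(nodes):
    """Count how many serial ports each node needs for those links."""
    ports = {node: 0 for node in nodes}
    for first, second in serial_links(nodes):
        ports[first] += 1
        ports[second] += 1
    return ports
```

For example, ports_per_node(["p630n01", "p630n02", "p630n03"]) reports two ports for every node, matching the requirement stated for clusters with more than two nodes.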
Slide 106: Any disk type may be configured as part of an enhanced concurrent VG, making this network type extremely flexible. For more information on configuring a disk heartbeat network, see Chapter 3, “Planning Cluster Network Connectivity”, in the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.

Target mode SSA
If you are using shared SSA devices, target mode SSA can be used for non-IP communication in HACMP. This relies on the built-in capabilities of the SSA adapters (using the SCSI communication protocol). The SSA devices in an SSA loop (disks and adapters) use the communication between “initiator” and “target”; SSA disks are “targets”, but the SSA adapter has both capabilities (“initiator” and “target”); thus, a tmssa connection uses these capabilities for establishing a serial-like link between HACMP nodes. This is a point-to-point communication network, which can connect only two nodes.
To configure a tmssa network between cluster nodes, the SSA adapter (one or more) in each node must be part of an SSA loop containing shared disks. In this case, each node must be assigned a unique node number for the SSA router device (ssar).
To change the SSA node number of the system, do the following steps:
1. Run the smitty ssa fast path.
2. Select Change/Show SSA Node Number of this System.
3. Change the node number to a unique number in your cluster environment.
For more information on configuring a tmssa network in a cluster, see 3.4.7, “Defining communication devices” on page 124.

Attention: In a cluster that uses concurrent disk access, it is mandatory that the SSA router number matches (is the same as) the HACMP node number; otherwise, you cannot vary on the shared volume groups in concurrent mode.

Target mode SCSI
Another possibility for a non-IP network is a target mode SCSI connection. Whenever you use a shared SCSI device, you can also use the SCSI bus for exchanging heartbeats.
Target mode SCSI (tmscsi) is only supported with SCSI-2 Differential or SCSI-2 Differential Fast/Wide devices. SCSI-1 Single-Ended and SCSI-2 Single-Ended do not support serial networks in an HACMP cluster. We do not recommend that you use this type of network in any future configurations (since the disk heartbeat network works with any type of supported shared SCSI disk).
Slide 107: 3.3 Storage configuration
Storage configuration is one of the most important tasks you have to perform before starting the HACMP cluster configuration. Storage configuration can be considered a part of HACMP configuration.
Depending on the application needs, and on the type of storage, you have to decide how many nodes in a cluster will have shared storage access, and which resource groups will use which disks.
Most of the IBM storage subsystems are supported with HACMP. To find more information on storage server support, see the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.
The most commonly used shared storage subsystems are:
– Fiber Attach Storage Server (FAStT)
– Enterprise Storage Servers (ESS/Shark)
– Serial Storage Architecture (SSA)
Storage protection (data or otherwise) is independent of HACMP; for high availability of storage, you must use storage that has proper redundancy and fault tolerance levels. HACMP does not have any control over storage availability.
For data protection, you can use either RAID technology (at storage or adapter level) or AIX LVM mirroring.

Redundant Array of Independent Disks (RAID)
Disk arrays are groups of disk drives that work together to achieve data transfer rates higher than those provided by single (independent) drives. Arrays can also provide data redundancy so that no data is lost if one drive (physical disk) in the array fails. Depending on the RAID level, data is either mirrored, striped, or both. For the characteristics of some widely used RAID levels, see Table 3-2 on page 94.

RAID 0
RAID 0 is also known as data striping. Conventionally, a file is written out sequentially to a single disk. With striping, the information is split into chunks (fixed amounts of data usually called blocks) and the chunks are written to (or read from) a series of disks in parallel.
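The striping idea can be illustrated by mapping logical chunk numbers to disks. The sketch below assumes the simplest round-robin placement, which is one common way to stripe but is not stated in the text; real controllers may place chunks differently, and the function names are hypothetical.

```python
def stripe_location(chunk_index, num_disks):
    """Map a logical chunk number to (disk number, chunk slot on that disk)."""
    if num_disks < 1 or chunk_index < 0:
        raise ValueError("need at least one disk and a non-negative chunk index")
    return chunk_index % num_disks, chunk_index // num_disks


def layout(num_chunks, num_disks):
    """List, per disk, the logical chunks that land on it."""
    disks = [[] for _ in range(num_disks)]
    for chunk in range(num_chunks):
        disk, _slot = stripe_location(chunk, num_disks)
        disks[disk].append(chunk)
    return disks
```

With eight chunks on four disks, consecutive chunks fall on different disks, so a sequential transfer can keep all four drives busy at once.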
There are two performance advantages to this:
– Data transfer rates are higher for sequential operations due to the overlapping of multiple I/O streams.
– Random access throughput is higher because access pattern skew is eliminated due to the distribution of the data. This means that with data
Slide 108: distributed evenly across a number of disks, random accesses will most likely find the required information spread across multiple disks and thus benefit from the increased throughput of more than one drive.
RAID 0 is only designed to increase performance. There is no redundancy, so any disk failures will require reloading from backups.

RAID 1
RAID 1 is also known as disk mirroring. In this implementation, identical copies of each chunk of data are kept on separate disks, or more commonly, each disk has a “twin” that contains an exact replica (or mirror image) of the information. If any disk in the array fails, then the mirror disk maintains data availability. Read performance can be enhanced because the disk that has the actuator (disk head) closest to the required data is always used, thereby minimizing seek times. The response time for writes can be somewhat slower than for a single disk, depending on the write policy; the writes can either be executed in parallel (for faster response) or sequentially (for safety).
RAID Level 1 has data redundancy, but data should still be backed up regularly. This is the only way to recover data in the event that a file or directory is accidentally corrupted or deleted.

RAID 2 and RAID 3
RAID 2 and RAID 3 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel). When a read occurs, simultaneous requests for the data can be sent to each disk. This architecture requires parity information to be written for each stripe of data; the difference between RAID 2 and RAID 3 is that RAID 2 can utilize multiple disk drives for parity, while RAID 3 can use only one. If a drive should fail, the system can reconstruct the missing data from the parity and remaining drives.
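Reconstruction from parity is easiest to see with a single parity block computed as the bytewise XOR of the data blocks in a stripe. The sketch below assumes that single-parity scheme (the one-parity-drive case described for RAID 3); it illustrates the principle only and is not how any particular controller is implemented. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

Because XOR is its own inverse, combining the parity with every surviving block cancels their contribution and leaves exactly the block that was lost. This works for one failed drive per stripe; a second failure cannot be recovered with a single parity block.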
Performance is very good for large amounts of data, but poor for small requests, since every drive is always involved, and there can be no overlapped or independent operation.

RAID 4
RAID 4 addresses some of the disadvantages of RAID 3 by using larger chunks of data and striping the data across all of the drives except the one reserved for parity. Using disk striping means that I/O requests need only reference the drive that the required data is actually on. This means that simultaneous, independent reads are possible. Write requests, however, require a read/modify/update cycle that creates a bottleneck at the single parity drive. Each stripe must be read, the new data inserted, and the new parity then calculated before writing the stripe back to the disk. The parity disk is then
Slide 109: updated with the new parity, but cannot be used for other writes until this has completed. This bottleneck means that RAID 4 is not used as often as RAID 5, which implements the same process but without the bottleneck.

RAID 5
RAID 5 is very similar to RAID 4. The difference is that the parity information is also distributed across the same disks used for the data, thereby eliminating the bottleneck. Parity data is never stored on the same drive as the chunks that it protects. This means that concurrent read and write operations can now be performed, and there are performance increases due to the availability of an extra disk (the disk previously used for parity). There are other possible enhancements to further increase data transfer rates, such as caching simultaneous reads from the disks and transferring that information while reading the next blocks. This can generate data transfer rates that approach the adapter speed.
As with RAID 3, in the event of disk failure, the information can be rebuilt from the remaining drives. A RAID 5 array also uses parity information, though it is still important to make regular backups of the data in the array. RAID 5 arrays stripe data across all of the drives in the array, one segment at a time (a segment can contain multiple blocks). In an array with n drives, a stripe consists of data segments written to “n-1” of the drives and a parity segment written to the “n-th” drive. This mechanism also means that not all of the disk space is available for data. For example, in an array with five 72 GB disks, although the total storage is 360 GB, only 288 GB are available for data.

RAID 0+1 (RAID 10)
RAID 0+1, also known as IBM RAID-1 Enhanced, or RAID 10, is a combination of RAID 0 (data striping) and RAID 1 (data mirroring). RAID 10 provides the performance advantages of RAID 0 while maintaining the data availability of RAID 1.
In a RAID 10 configuration, both the data and its mirror are striped across all the disks in the array. The first stripe is the data stripe, and the second stripe is the mirror, with the mirror being placed on a different physical drive than the data. RAID 10 implementations provide excellent write performance, as they do not have to calculate or write parity data. RAID 10 can be implemented via software (AIX LVM), hardware (storage subsystem level), or a combination of hardware and software. The appropriate solution for an implementation depends on the overall requirements. RAID 10 has the same cost characteristics as RAID 1.

The most common RAID levels used in today’s IT implementations are listed in Table 3-2.
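The capacity figures quoted for the RAID levels above follow from simple arithmetic. The following is a minimal sketch, assuming an array of equal-sized disks; raid_usable_gb is a hypothetical helper written for this illustration and is not an AIX or storage subsystem command.

```shell
#!/bin/sh
# Usable capacity (in GB) for an array of equal-sized disks.
# Hypothetical helper for illustration; not part of AIX or HACMP.
# usage: raid_usable_gb LEVEL NUM_DISKS DISK_SIZE_GB
raid_usable_gb() {
    level=$1
    disks=$2
    size=$3
    case "$level" in
        0)    echo $(( disks * size )) ;;          # striping only, no redundancy
        1|10) echo $(( disks * size / 2 )) ;;      # every block is stored twice
        5)    echo $(( (disks - 1) * size )) ;;    # one disk's worth of parity
        *)    echo "unsupported RAID level: $level" >&2
              return 1 ;;
    esac
}

raid_usable_gb 5 5 72     # five 72 GB disks in RAID 5
raid_usable_gb 10 4 72    # four 72 GB disks in RAID 10
```

The first call reproduces the five-disk example from the text (288 GB usable out of 360 GB, or 80%), and the second prints 144, which is half of the raw 288 GB.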
Slide 110: Table 3-2 Characteristics of RAID levels widely used

RAID level   Available disk capacity   Performance in read/write operations   Cost     Data protection
RAID 0       100%                      High both read/write                   Low      No
RAID 1       50%                       Medium/High read, Medium write         High     Yes
RAID 5       80%                       High read, Medium write                Medium   Yes
RAID 10      50%                       High both read/write                   High     Yes

Fiber Attach Storage Server (FAStT)
There are different models of FAStT storage available and supported in HACMP. Covering all models of FAStT is not within the scope of this book. To understand how to configure the FAStT storage, we present an example of the FAStT900 Storage Server.

FAStT900 Storage Server
The FAStT900 Storage Server supports direct attachment of up to four hosts that contain two host adapters each, and is designed to provide maximum host-side and drive-side redundancy. By using external Fibre Channel switches in conjunction with the FAStT900 Storage Server, you can attach up to 64 hosts (each with two host bus adapters) to a FAStT900 Storage Server. Before configuring the FAStT storage, you must make sure all hardware and cabling connections are complete, as per the required configuration. For more information on FAStT cabling, see IBM TotalStorage FAStT900 Fibre Channel Storage Server Installation Guide, GC26-7530.

FAStT Storage Manager software
The only way to configure FAStT Storage is to use the FAStT Storage Manager software. The FAStT Storage Manager software is available on most popular operating systems, such as AIX, Linux, and Windows® XP/2000. With FAStT Storage Manager, you can configure supported RAID levels, logical drives, and partitions. Supported RAID levels are RAID 0, RAID 1, RAID 5, and RAID 0+1. There is no option to configure RAID 10 in FAStT Storage Manager. When you select RAID 1 with multiple disks, FAStT Storage Manager takes care of striping and mirroring the data.
Slide 111: It allows a user to format the logical drives as required by the host operating systems. There are different versions of Storage Manager. Storage Manager V8.4 is the version supported by the FAStT900, and it adds some features that are not in previous versions. Some of the new features supported by FAStT900 with Storage Manager V8.4 are:

FlashCopy®
A FlashCopy logical drive is a logical point-in-time image of another logical drive, called a base logical drive, that is in the storage subsystem. A FlashCopy is the logical equivalent of a complete physical copy, but you create it much more quickly and it requires less disk space (20% of the original logical drive).

Remote mirror option
The remote mirror option is used for online, real-time replication of data between storage subsystems over a remote distance.

Volumecopy
The volumecopy option is a firmware-based mechanism for replicating logical drive data within a storage array. Users submit volumecopy requests by specifying two compatible drives. One drive is designated as the source and the other as the target. The volumecopy request is persistent so that any relevant result of the copy process can be communicated to the user.

Storage partitioning
Storage partitioning allows the user to present all storage volumes to a SAN through several different partitions by mapping storage volumes to a LUN number, each partition presenting LUNs 0-255. This volume or LUN mapping applies only to the host port or ports that have been configured to access that LUN. This feature also allows multiple hosts using different operating systems, each with its own disk storage subsystem settings, to be connected to the same FAStT storage server at the same time.

For more information on installation and configuration of Storage Manager V8.4, refer to the IBM TotalStorage FAStT Storage Manager 8.4 Installation & Support Guide for Intel-based Operating Environments, GC26-7589.
Enterprise Storage Server (ESS/Shark)
The IBM Enterprise Storage Server (ESS) is a second-generation Seascape® disk storage system that provides industry-leading availability, performance, manageability, and scalability. RAID levels in ESS are predefined in certain configurations and have limited modification capabilities. Available RAID levels are RAID 1, RAID 5, and RAID 0+1.
Slide 112: The IBM Enterprise Storage Server (ESS) does more than simply enable shared storage across enterprise platforms; it can improve the performance, availability, scalability, and manageability of enterprise-wide storage resources through a variety of powerful features. Some of the features are similar in name to those available in FAStT Storage, but the technical concepts differ to a great extent. Some of those features are:

FlashCopy
FlashCopy provides fast data duplication capability. This option helps eliminate the need to stop applications for extended periods of time in order to perform backups and restores.

Peer-to-peer remote copy
This feature maintains a synchronous copy (always up-to-date with the primary copy) of data in a remote location. This backup copy of data can be used to quickly recover from a failure in the primary system without losing any transactions. It is an optional capability that can keep your e-business applications running.

Extended remote copy (XRC)
This feature provides a copy of data at a remote location (which can be connected using telecommunications lines at unlimited distances) to be used in case the primary storage system fails. The ESS enhances XRC with full support for unplanned outages. In the event of a telecommunications link failure, this optional function enables the secondary remote copy to be resynchronized quickly without requiring duplication of all data from the primary location for full disaster recovery protection.

Custom volumes
Custom volumes enable volumes of various sizes to be defined for high-end servers, enabling administrators to configure systems for optimal performance.

Storage partitioning
Storage partitioning uses storage devices more efficiently by providing each server access to its own pool of storage capacity. Storage pools can be shared among multiple servers.
For more information on the configuration of the Enterprise Storage Server, refer to the IBM TotalStorage Enterprise Storage Server Service Guide 2105 Model 750/800 and Expansion Enclosure, Volume 1, SY27-7635.
Slide 113: Serial Storage Architecture (SSA)
Serial storage architecture is an industry-standard interface that provides high-performance fault-tolerant attachment of I/O storage devices. In SSA subsystems, transmissions to several destinations are multiplexed; the effective bandwidth is further increased by spatial reuse of the individual links. Commands are forwarded automatically from device to device along a loop until the target device is reached. Multiple commands can be travelling around the loop simultaneously. SSA supports RAID 0, RAID 1, RAID 5, and RAID 0+1.

To use any of the RAID setups, it is necessary to follow the looping instructions for SSA enclosures. Specific looping across the disks is required to create RAID arrays. For more information on IBM SSA RAID configuration, refer to IBM Advanced SerialRAID Adapters Installation Guide, SA33-3287.

3.3.1 Shared LVM
For an HACMP cluster, the key element is the data used by the highly available applications. This data is stored on AIX Logical Volume Manager (LVM) entities. HACMP clusters use the capabilities of the LVM to make this data accessible to multiple nodes. AIX Logical Volume Manager provides shared data access from multiple nodes. Some of the components of the shared logical volume manager are:
– A shared volume group is a volume group that resides entirely on the external disks shared by cluster nodes.
– A shared physical volume is a disk that resides in a shared volume group.
– A shared logical volume is a logical volume that resides entirely in a shared volume group.
– A shared file system is a file system that resides entirely in a shared logical volume.

If you are a system administrator of an HACMP cluster, you may be called upon to perform any of the following LVM-related tasks:
– Create a new shared volume group.
– Extend, reduce, change, or remove an existing volume group.
– Create a new shared logical volume.
– Extend, reduce, change, or remove an existing logical volume.
– Create a new shared file system.
– Extend, change, or remove an existing file system.
– Add and remove physical volumes.
Slide 114: When performing any of these maintenance tasks on shared LVM components, be aware that ownership and permissions are reset when a volume group is exported and then re-imported. After exporting and importing, a volume group is owned by root and accessible by the system group.

Note: Applications, such as some database servers, that use raw logical volumes may be affected by this change if they change the ownership of the raw logical volume device. You must restore the ownership and permissions back to what is needed after this sequence.

Shared logical volume access can be made available in any of the following data accessing modes:
– Non-concurrent access mode
– Concurrent access mode
– Enhanced concurrent access mode

3.3.2 Non-concurrent access mode
HACMP in a non-concurrent access environment typically uses journaled file systems to manage data, though some database applications may bypass the journaled file system and access the logical volume directly. Both mirrored and non-mirrored configurations are supported by non-concurrent access of LVM. For more information on creating mirrored and non-mirrored logical volumes, refer to the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.

To create a non-concurrent shared volume group on a node, do the following steps:
1. Use the fast path smitty mkvg.
2. Use the default field values unless your site has other specific requirements.
– VOLUME GROUP name
  The name of the shared volume group should be unique within the cluster.
– Activate volume group AUTOMATICALLY at system restart?
  Set to No so that the volume group can be activated as appropriate by the cluster event scripts.
– ACTIVATE volume group after it is created?
  Set to Yes.
Slide 115: – Volume Group MAJOR NUMBER
  Make sure to use the same major number on all nodes. Use the lvlstmajor command on each node to determine a free major number common to all nodes.

To create a non-concurrent shared file system on a node, do the following steps:
1. Use the fast path smitty crjfs.
2. Rename both the logical volume and the log logical volume for the file system and volume group. AIX assigns a logical volume name to each logical volume it creates. Examples of logical volume names are /dev/lv00 and /dev/lv01. Within an HACMP cluster, the name of any shared logical volume must be unique. Also, the journaled file system log (jfslog) is a logical volume that requires a unique name in the cluster.
3. Review the settings for the following fields:
– Mount automatically at system restart?
  Make sure this field is set to No.
– Start Disk Accounting
  Set this field to No unless you want disk accounting.
4. Test the newly created file system by mounting and unmounting it.

Importing a volume group to a fall-over node
Before you import the volume group, make sure the volume group is varied off on the primary node. You can then run the discovery process of HACMP, which will collect the information about all volume groups available across all nodes. Importing the volume group onto the fall-over node synchronizes the ODM definition of the volume group on each node on which it is imported. When adding a volume group to the resource group, you may choose to manually import the volume group onto the fall-over node, or you may choose to automatically import it onto all the fall-over nodes in the resource group. For more information on importing volume groups, see the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.
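The major number step above asks for a number that is free on every node. A minimal sketch of that selection follows, assuming the free major numbers reported on each node have already been collected and written out as plain space-separated lists. The function name and the sample lists are hypothetical, and the sketch does not parse lvlstmajor output itself.

```shell
#!/bin/sh
# Pick the lowest major number that is free on every node.
# Each argument is one node's list of free major numbers (space separated).
# Hypothetical helper for illustration; the lists are assumed to have been
# gathered and expanded by hand from each node beforehand.
# usage: lowest_common_major "LIST_NODE1" "LIST_NODE2" ...
lowest_common_major() {
    first=$1
    shift
    for candidate in $(printf '%s\n' $first | sort -n); do
        ok=1
        for list in "$@"; do
            case " $list " in
                *" $candidate "*) ;;        # free on this node as well
                *) ok=0; break ;;           # taken on this node, try the next one
            esac
        done
        if [ "$ok" -eq 1 ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1                                # no number is free on all nodes
}

# Sample lists for three nodes (made-up values).
lowest_common_major "43 45 46 50" "44 45 46 50" "45 50 51"
```

With the sample lists, the call prints 45, the first number present in all three lists. That value would then be entered in the Volume Group MAJOR NUMBER field on every node.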
Slide 116: Note: After importing a volume group on the fall-over node, it is necessary to change the volume group startup status. Run the following command to change the volume group status, as required by HACMP:
    # chvg -an -Qn <vgname>
This disables automatic varyon when the system restarts and also disables the quorum of the volume group.

3.3.3 Concurrent access mode
Using concurrent access with HACMP requires installing an additional fileset. For additional information, see Chapter 2, “Planning and design”. Concurrent access mode is not supported for file systems; instead, you must use raw logical volumes or physical disks.

Creating a concurrent access volume group
1. The physical volumes (hdisk*) should be installed, configured, and available. You can verify the disks’ status using the following command:
    # lsdev -Cc disk
2. To use a concurrent access volume group, you must create it as a concurrent capable volume group. A concurrent capable volume group can be activated (varied on) in either non-concurrent mode or concurrent access mode. To create a concurrent access volume group, do the following steps:
   a. Enter smit cl_convg.
   b. Select Create a Concurrent Volume Group.
   c. Enter the field values as desired.
   d. Press Enter.

Import the concurrent capable volume group
Importing the concurrent capable volume group is done by running the following command:
    # importvg -C -y vg_name physical_volume_name
Specify the name of any disk in the volume group as an argument to the importvg command. By default, AIX automatically varies on non-concurrent capable volume groups when they are imported. AIX does not automatically vary on concurrent capable volume groups when they are imported.
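The import commands in this section can be collected into a small dry-run helper that only prints the commands for review. This is a sketch under stated assumptions: plan_import_vg, the volume group name sharedvg, and the disk hdisk3 are hypothetical, and the closing varyoffvg in the non-concurrent case is an assumption based on the statement that AIX varies on non-concurrent volume groups at import, while HACMP expects to activate them itself.

```shell
#!/bin/sh
# Print (do not run) the commands for importing a shared volume group
# on a fall-over node. Hypothetical helper for illustration only.
# usage: plan_import_vg MODE VG_NAME HDISK
plan_import_vg() {
    mode=$1
    vg=$2
    disk=$3
    case "$mode" in
        non-concurrent)
            echo "importvg -y $vg $disk"
            echo "chvg -an -Qn $vg"    # no automatic varyon at restart, quorum disabled
            echo "varyoffvg $vg"       # assumption: leave activation to the HACMP scripts
            ;;
        concurrent)
            echo "importvg -C -y $vg $disk"    # not varied on automatically after import
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

plan_import_vg non-concurrent sharedvg hdisk3
```

Printing the commands first lets an administrator check the volume group and disk names against the primary node before running anything on the fall-over node.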
Slide 117: Varyon the concurrent capable VGs in non-concurrent mode
It is necessary to vary on the concurrent capable volume group in non-concurrent mode to create logical volumes. Use the varyonvg command to activate a volume group in non-concurrent mode:
    # varyonvg <vgname>

Create logical volumes on the concurrent capable volume group
You can create logical volumes on the volume group, specifying the logical volume mirrors to provide data redundancy. To create logical volumes on a concurrent capable volume group on a source node, do the following steps:
1. Use the SMIT fast path smit cl_conlv.
2. Specify the size of the logical volume as the number of logical partitions.
3. Specify the desired values for the other available options.
4. Press Enter.

Varyoff a volume group
After creating the logical volume, vary off the volume group using the varyoffvg command so that it can be varied on by the HACMP scripts. Enter:
    # varyoffvg <vgname>

Define a concurrent volume group in an HACMP resource group
To start the concurrent volume group simultaneously on all the nodes, specify the volume group name in the startup scripts of HACMP. On cluster startup, you should find that the concurrent volume group is activated on all the configured nodes. For more information on configuring HACMP resource groups, refer to 3.5, “Resource group configuration”.

3.3.4 Enhanced concurrent mode (ECM) VGs
With HACMP V5.1, you now have the ability to create and use enhanced concurrent VGs. These can be used for both concurrent and non-concurrent access. You can also convert existing concurrent (classic) volume groups to enhanced concurrent mode using C-SPOC. For enhanced concurrent volume groups that are used in a non-concurrent environment, rather than using the SCSI reservation mechanism, HACMP V5.1 uses the fast disk takeover mechanism to ensure fast takeover and data integrity.
Slide 118: Note: Fast disk takeover in HACMP V5.1 is available only in AIX 5L V5.2.

The ECM volume group is varied on on all nodes in the cluster that are part of that resource group. However, access for modifying data is granted only to the node that has the resource group active (online).

Active and passive varyon in ECM
An enhanced concurrent volume group can be made active on the node, or varied on, in two modes: active or passive.

Active varyon
In the active state, all high level operations are permitted. When an enhanced concurrent volume group is varied on in the active state on a node, it allows the following:
– Operations on file systems, such as file system mounts
– Operations on applications
– Operations on logical volumes, such as creating logical volumes
– Synchronizing volume groups

Passive varyon
When an enhanced concurrent volume group is varied on in the passive state, the LVM provides the equivalent of fencing for the volume group at the LVM level. The node that has the VG varied on in passive mode is allowed only a limited number of read-only operations on the volume group:
– LVM read-only access to the volume group’s special file
– LVM read-only access to the first 4 KB of all logical volumes that are owned by the volume group

The following operations are not allowed when a volume group is varied on in the passive state:
– Operations on file systems, such as mount
– Any open or write operation on logical volumes
– Synchronizing volume groups
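The two lists above amount to a small permission table. The following sketch models it as a lookup; the function name and the operation labels (mount_fs, open_lv, and so on) are hypothetical names invented for this illustration, and the function does not query the LVM.

```shell
#!/bin/sh
# Model of which operations an ECM volume group permits in each varyon mode.
# Exit status 0 means permitted. Operation names are made up for this sketch.
# usage: ecm_allows MODE OPERATION
ecm_allows() {
    case "$1:$2" in
        # Active varyon: all high level operations are permitted.
        active:mount_fs|active:open_lv|active:write_lv|active:create_lv|active:sync_vg)
            return 0 ;;
        # The limited read-only accesses are available in both modes.
        active:read_vg_special_file|active:read_lv_first_4k)
            return 0 ;;
        passive:read_vg_special_file|passive:read_lv_first_4k)
            return 0 ;;
        # Everything else, including any passive mount, open, write, or sync.
        *)
            return 1 ;;
    esac
}

for op in mount_fs write_lv sync_vg read_lv_first_4k; do
    if ecm_allows passive "$op"; then
        echo "passive: $op allowed"
    else
        echo "passive: $op denied"
    fi
done
```

Only the read of the first 4 KB is reported as allowed in the loop, which matches the fencing behavior described for passive varyon.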
Slide 119: Creating an enhanced concurrent access volume group
1. When concurrent volume groups are created on AIX 5L 5.1 and later, they are automatically created in enhanced concurrent mode.
2. To create a concurrent capable volume group from the AIX command line, use the mkvg command. For example:
    # mkvg -n -s 32 -C -y myvg hdisk11 hdisk12
   will create an enhanced concurrent VG on hdisk11 and hdisk12. The flags do the following:
    -n      Do not vary on VG at boot.
    -s 32   Gives a partition size of 32 MB.
    -C      Creates an enhanced concurrent VG.
    -y      Specifies the VG name.

3.3.5 Fast disk takeover
This is a new feature of HACMP V5.1, which has the following main purposes and characteristics:
– Decreases the application downtime, with faster resource group fallover (and movement)
– Concurrent access to a volume group (preserving the data integrity)
– Uses AIX Enhanced Concurrent VGs (ECM)
– Uses RSCT for communications

The enhanced concurrent volume group supports active and passive mode varyon, and can be included in a non-concurrent resource group. Fast disk takeover is set up automatically by the HACMP software. For all shared volume groups that have been created in enhanced concurrent mode and contain file systems, HACMP will activate the fast disk takeover feature. When HACMP starts, all nodes in a Resource Group that share the same enhanced Volume Group will vary on that Volume Group in passive mode. When the Resource Group is brought online, the node that acquires the resources will vary on the Volume Group in active mode. The other nodes will keep the Volume Group varied on in passive mode. In this case, all the changes to the Volume Group will be propagated automatically to all the nodes in that Volume Group. The change from active to passive mode and the reverse are coordinated by HACMP at cluster startup, Resource Group activation and fallover, and when a failing node rejoins the cluster.
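The varyon behavior described above can be pictured as a simple rule: the node that holds the resource group has the volume group in active mode, and every other participating node has it in passive mode. The sketch below only models that rule; varyon_modes is a hypothetical helper, the node names are sample values, and nothing here talks to HACMP or the LVM.

```shell
#!/bin/sh
# Print the expected varyon mode of an ECM volume group on each node.
# OWNER is the node holding the resource group, or "none" before the
# resource group is online. Hypothetical model for illustration only.
# usage: varyon_modes OWNER NODE...
varyon_modes() {
    owner=$1
    shift
    for node in "$@"; do
        if [ "$node" = "$owner" ]; then
            echo "$node active"
        else
            echo "$node passive"
        fi
    done
}

echo "HACMP started, resource group not yet online:"
varyon_modes none p630n01 p630n02

echo "Resource group online on p630n01:"
varyon_modes p630n01 p630n01 p630n02

echo "Resource group moved to p630n02:"
varyon_modes p630n02 p630n01 p630n02
```

Because the takeover node already has the volume group varied on in passive mode, a move only involves a mode change on each node, which is the reason the text gives for the shorter downtime.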
Slide 120: The prerequisites for this functionality are:
– HACMP V5.1
– AIX 5L 5.2 or higher
– bos.clvm.5.2.0.11 or higher
– APAR IY44237

For more information on fast disk takeover, see the HACMP for AIX 5L V5.1 Planning and Installation Guide, SC23-4861-02.

3.4 Configuring cluster topology
The cluster topology represents the physical components of the cluster and how they are connected via networks (IP and non-IP).

3.4.1 HACMP V5.X Standard and Extended configurations
HACMP V5.1 has introduced the Standard and the Extended SMIT configuration paths.

Standard configuration path
The Standard path allows users to easily configure the most common options, such as:
– IPAT via Aliasing Networks
– Shared Service IP Labels
– Volume Groups and File systems
– Application Servers

When using the options under the Initialization and Standard Configuration SMIT menu, you can add the basic components of a cluster to the HACMP configuration in a few steps. HACMP configuration is automatically discovered and used for defining a cluster with the most common options (see Example 3-4 and Example 3-5).
Slide 121: Example 3-4 HACMP for AIX

    HACMP for AIX

    Move cursor to desired item and press Enter.

      Initialization and Standard Configuration
      Extended Configuration
      System Management (C-SPOC)
      Problem Determination Tools

    F1=Help     F2=Refresh   F3=Cancel   F8=Image
    F9=Shell    F10=Exit     Enter=Do

Example 3-5 Initialization and Standard Configuration

    Initialization and Standard Configuration

    Move cursor to desired item and press Enter.

      Add Nodes to an HACMP Cluster
      Configure Resources to Make Highly Available
      Configure HACMP Resource Groups
      Verify and Synchronize HACMP Configuration
      Display HACMP Configuration

    F1=Help     F2=Refresh   F3=Cancel   F8=Image
    F9=Shell    F10=Exit     Enter=Do

Prerequisite tasks for using the standard path
To use the standard configuration path, HACMP V5.1 must be installed on all the nodes, and connectivity must exist between the node where you are performing the configuration and all other nodes to be included in the cluster. At least one network interface on each node must be both physically (VLAN and switch) and logically (subnets) configured so that you can successfully communicate from one node to each of the other nodes.

Once you have configured and powered on all disks, communication devices, and serial networks, and configured communication paths to other nodes in AIX, HACMP automatically collects information about the physical and logical configuration and displays it in corresponding SMIT picklists, to aid you in the HACMP configuration process.
Slide 122: With the connectivity path established, HACMP can discover cluster information and you are able to access all of the nodes needed to perform any necessary AIX administrative tasks. You do not need to open additional windows or physically move to other nodes' consoles and manually log in to each node individually. SMIT fast paths to the relevant HACMP and/or AIX SMIT screens on the remote nodes are available within the HACMP SMIT screens.

Assumptions and defaults for the standard path
The software makes some assumptions regarding the environment, such as assuming all network interfaces on a physical network belong to the same HACMP network. Intelligent and/or default parameters are supplied or automatically configured. This helps to minimize the number of steps needed to configure the cluster. Basic assumptions are:
– The host names (as revealed by the hostname command) are used as node names.
– All network interfaces that can “see” each other’s MAC address are automatically configured in HACMP. Those network interfaces that can ping each other without going through a router are placed on the same logical network. HACMP names each logical network.
– IP aliasing is used as the default mechanism for service IP label/address assignment to a network interface.
– IP Address Takeover via IP Aliases is configured for any logical network capable of taking over a service IP label as an alias.

Note: To configure the IP Address Takeover via IP replacement mechanism, you have to use the Extended configuration path to change the HACMP network properties (you turn off IP aliasing).

– You can configure the resource groups with basic predefined management policies (fall-over and fall-back preferences: cascading, rotating, or concurrent) or the new custom resource groups.
– Adding, changing, or deleting serial (non-IP) networks and devices is done via the Extended path, since you must manually define the desired communication devices (end-points) for each point-to-point network.
For further configuration, or to add more details or customization to the cluster configuration, use the HACMP Extended Configuration SMIT path. For further reference, see “Extended configuration path”.
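The assumption that interfaces able to reach each other without a router belong to the same logical network corresponds, in IP terms, to the interfaces sharing a subnet. The sketch below approximates that check with address arithmetic. It is not how HACMP discovery works internally; the function names are hypothetical, and the addresses and the 255.255.255.0 netmask are sample values chosen for illustration.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1              # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when both addresses fall in the same subnet for the given netmask.
# Hypothetical helper; an approximation of "reachable without a router".
# usage: same_subnet ADDR_A ADDR_B NETMASK
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( $a & $m )) -eq $(( $b & $m )) ]
}

# Sample addresses and netmask (assumed values).
if same_subnet 192.168.11.131 192.168.11.132 255.255.255.0; then
    echo "192.168.11.131 and 192.168.11.132: same logical network"
fi
if ! same_subnet 192.168.11.131 10.1.1.1 255.255.255.0; then
    echo "192.168.11.131 and 10.1.1.1: different logical networks"
fi
```

A check like this can help when planning which interfaces are expected to end up together before running discovery, since interfaces with mismatched subnets would not be grouped as intended.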
Slide 123: Note: When using the standard HACMP configuration SMIT path, if any information needed for configuration resides on remote nodes, then discovery of cluster information will automatically be performed.

Steps for configuring a cluster using the standard path
1. Configure the cluster topology.
Identify the cluster nodes and establish communication paths between them using the Configure Nodes to an HACMP Cluster menu options. Here you name the cluster and select the nodes (listed in /etc/hosts) either by their names or their IP addresses. This gives HACMP the base knowledge it needs to communicate with the nodes that are participating in the cluster.
Once each of the nodes is properly identified and working communication paths exist, HACMP automatically runs a discovery operation that identifies the basic components within the cluster. The discovered host names are used as the node names and added to the HACMP node ODM. The networks (and the associated interfaces) that share physical connectivity with two or more nodes in the cluster are automatically added to the HACMP network and HACMP adapter ODMs. Other discovered shared resource information includes PVIDs and volume groups (see Example 3-6).

Example 3-6 Configuring nodes in a cluster (standard)

    Configure Nodes to an HACMP Cluster (standard)

    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
    * Cluster Name                                      [cluster5x]
      New Nodes (via selected communication paths)      []                      +
      Currently Configured Node(s)                      p630n01 p630n02 p630n>

    F1=Help     F2=Refresh    F3=Cancel   F4=List
    F5=Reset    F6=Command    F7=Edit     F8=Image
    F9=Shell    F10=Exit      Enter=Do

2. Configure the cluster resources.
Configure the resources to be made highly available. Use the Configure Resources to Make Highly Available menu to configure resources that are to be shared among the nodes in the cluster (see Example 3-7).
Slide 124: You can configure these resources:
– IP address/IP label
– Application server
– Volume groups (shared and concurrent)
– Logical volumes
– File systems

Example 3-7 Configure resources to make highly available

    Configure Resources to Make Highly Available

    Move cursor to desired item and press Enter.

      Configure Service IP Labels/Addresses
      Configure Application Servers
      Configure Volume Groups, Logical Volumes and Filesystems
      Configure Concurrent Volume Groups and Logical Volumes

    F1=Help     F2=Refresh   F3=Cancel   F8=Image
    F9=Shell    F10=Exit     Enter=Do

3. Configure the resource groups.
Use the Configure HACMP Resource Groups menu to create the resource groups you have planned for each set of related or dependent resources. You can choose to configure cascading, rotating, concurrent, or custom resource groups (see Example 3-8). Note that you cannot specify the fallback timer policies for the custom resource groups in the Standard menu.

Example 3-8 Configure an HACMP resource group (Standard)

    Configure HACMP Resource Groups

    Move cursor to desired item and press Enter.

      Add a Resource Group
      Change/Show a Resource Group
      Remove a Resource Group
      Change/Show Resources for a Resource Group (standard)

    F1=Help     F2=Refresh   F3=Cancel   F8=Image
    F9=Shell    F10=Exit     Enter=Do
Slide 125: 4. Group the resources to be managed together into the previously defined resource group(s).
Select Configure HACMP Resource Groups → Change/Show Resources for a Resource Group to assign the resources to each resource group (see Example 3-9).

Example 3-9 Change/Show resource for cascading resource group (Standard)

    Change/Show Resources for a Cascading Resource Group

    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.

                                                              [Entry Fields]
      Resource Group Name                                     rg01
      Participating Node Names (Default Node Priority)        p630n01 p630n02 p630n>
    * Service IP Labels/Addresses                             [n01a1]             +
      Volume Groups                                           []                  +
      Filesystems (empty is ALL for VGs specified)            []                  +
      Application Servers                                     []                  +

    F1=Help     F2=Refresh    F3=Cancel   F4=List
    F5=Reset    F6=Command    F7=Edit     F8=Image
    F9=Shell    F10=Exit      Enter=Do

5. Adjust log viewing and management (optional).
Adjust the settings for the debug level and the hacmp.out log file formatting options per node.

6. Verify and synchronize the cluster configuration.
Use the Verify and Synchronize HACMP Configuration menu to confirm that the desired configuration is feasible given the physical connections and devices, and to ensure that all nodes in the cluster have the same view of the configuration (see Example 3-10).
Slide 126: Example 3-10 Verification and synchronization of cluster

    Command: running       stdout: yes       stderr: no

    Before command completion, additional instructions may appear below.

    Verification to be performed on the following:
            Cluster Topology
            Cluster Resources

    Retreiving data from available cluster nodes. This could take a few minutes...

    Verifying Cluster Topology...

7. Display the cluster configuration (optional).
Review the cluster topology and resources configuration. See Example 3-11.

Example 3-11 Display HACMP information

    Command: OK       stdout: yes       stderr: no

    Before command completion, additional instructions may appear below.

    Cluster Description of Cluster: cluster5x
    Cluster Security Level: Standard
    There are 6 node(s) and 2 network(s) defined
    NODE p630n01:
            Network net_ether_01
                    gp01     10.1.1.1
            Network net_ether_02
                    n06a1    192.168.11.136
                    n05a1    192.168.11.135
                    n04a1    192.168.11.134
                    n02a1    192.168.11.132
                    n01a1    192.168.11.131

    F1=Help     F2=Refresh   F3=Cancel   F6=Command
    F8=Image    F9=Shell     F10=Exit    /=Find
    n=Find Next

8. Make further additions or adjustments to the cluster configuration (optional).
You may want to use some options available on the HACMP Extended Configuration path. Such additions or adjustments include (for example):
– Adding non-IP networks for heartbeating
– Adding other resources, such as SNA communication interfaces and links or tape resources
– Configuring resource group run-time policies, including Workload Manager
– Adding custom resource group timers
Slide 127: – Configuring cluster security
– Customizing cluster events
– Configuring site policies

Extended configuration path
The Extended path allows users fine control over the configuration:
– Configuration discovery is optional.
– Node names do not need to match host names.
– IP network topology may be configured according to a specific user's environment.

Using the options under the Extended Configuration menu, you can add the basic components of a cluster to the HACMP configuration database (ODM), as well as all additional types of resources.

Steps for configuring the cluster using the extended path
1. Run discovery (optional).
Run discovery if you have already configured some or all of the cluster components. Running discovery retrieves current AIX configuration information from all cluster nodes. This information is displayed in picklists to help you make accurate selections of existing components. The discovered components are highlighted as predefined components and made available for selection (see Example 3-12).

Example 3-12 Discovering information from other nodes (Extended)

    Command: OK       stdout: yes       stderr: no

    Before command completion, additional instructions may appear below.

    [TOP]
    Discovering IP Network Connectivity

    Retreiving data from available cluster nodes. This could take a few minutes...

    Discovered [24] interfaces
    IP Network Discovery completed normally

    Discovering Volume Group Configuration

    clharvest_vg: Initializing....
    Gathering cluster information, which may take a few minutes...
    clharvest_vg: Processing...
    Storing the following information in file
    /usr/es/sbin/cluster/etc/config/clvg_config

    p630n01:
Slide 128: Hdisk: PVID: VGname: VGmajor: hdisk0 0006856f612dab6e rootvg active [MORE...1761] F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next 2. Configure, change, or customize the cluster topology. Under the Extended Topology Configuration menu, you can: Identify the nodes and establish communication paths between them using the Configure Nodes to an HACMP Cluster menu. In this case, you name the cluster and select the nodes (listed in /etc/hosts) either by their names or their IP addresses. This gives HACMP the information needed to communicate with the nodes that are participating in the cluster. You can also: – Select the PVIDs and the existing volume groups. – Configure, change, or show sites (optional). – Configure, change, or show predefined or discovered IP-based networks, and predefined or discovered serial devices. – Configure, change, show and update HACMP communication interfaces and devices with AIX settings. – Configure previously defined, or previously discovered, communication interfaces and devices. – Configure, change, and show persistent Node IP Labels. See Example 3-13 on page 113 for more details.
Slide 129: Example 3-13 Extended topology configuration Extended Topology Configuration Move cursor to desired item and press Enter. Configure an HACMP Cluster Configure HACMP Nodes Configure HACMP Sites Configure HACMP Networks Configure HACMP Communication Interfaces/Devices Configure HACMP Persistent Node IP Label/Addresses Configure HACMP Global Networks Configure HACMP Network Modules Configure Topology Services and Group Services Show HACMP Topology F1=Help F2=Refresh F3=Cancel F8=Image F9=Shell F10=Exit Enter=Do 3. Configure or customize the resources to be made highly available. Use the Configure Resources to Make Highly Available menu to configure resources that are to be shared among the nodes in the cluster, so that if one component fails, another component will automatically take its place (see Example 3-14 on page 114). You can configure the following resources: – Service IP address (labels) – Application servers – Volume groups – Concurrent volume groups – Logical volumes – File systems – Application monitoring – Tape resources – Communication adapters and links for the operating system – HACMP communication interfaces and links – Custom disk methods
Slide 130: Example 3-14 Extended resource configuration Extended Resource Configuration Move cursor to desired item and press Enter. HACMP Extended Resources Configuration Configure Resource Group Run-Time Policies HACMP Extended Resource Group Configuration F1=Help F2=Refresh F3=Cancel F8=Image F9=Shell F10=Exit Enter=Do 4. Configure the resource groups. You can choose to configure custom, cascading, concurrent, or rotating resource groups. Using the “Extended” path, you can also configure resource group run-time policies, customize the fall-over/fall-back behavior of cascading resource groups, and define custom resource group policies and parameters. 5. Assign the resources that are to be managed together with resource groups. Place related or dependent resources into resource groups. 6. Make any further additions or adjustments to the cluster configuration. – Configure cluster security. – Customize cluster events. – Configure performance tuning. – Change attributes of nodes, communication interfaces and devices, networks, resources, or resource groups. 7. Verify and synchronize the cluster configuration. Use the Verify and Synchronize HACMP Configuration menu to guarantee the desired configuration is feasible given the physical connections and devices, and ensure that all nodes in the cluster have the same view of the configuration. 8. Display the cluster configuration (optional). View the cluster topology and resources configuration.
Slide 131: 3.4.2 Define cluster topology Here we define the cluster topology. Standard topology configuration Complete the following procedures to define the cluster topology. You only need to perform these steps on one node. When you verify and synchronize the cluster topology, its definition is copied to all other nodes. 1. Enter smitty hacmp. 2. Select Initialization and Standard Configuration → Configure Nodes to an HACMP Cluster and press Enter. 3. In the Configure Nodes to an HACMP Cluster screen, enter the values as follows: – Cluster name Enter an ASCII text string that identifies the cluster. The cluster name can include alpha and numeric characters and underscores, but cannot have a leading numeric. Use no more than 31 characters. It can be different from the host name. Do not use reserved names. – New nodes via selected communication paths Enter (or add) one resolvable IP Label (this may be the host name), IP address, or Fully Qualified Domain Name for each new node in the cluster, separated by spaces. This path will be taken to initiate communication with the node. Use F4 to see the picklist display of the host names and/or addresses in /etc/hosts that are not already configured in HACMP (see Figure 3-1 on page 116).
Slide 132: Figure 3-1 Configure nodes to an HACMP Cluster (Standard) – Currently configured node(s) If nodes are already configured, they are displayed here. You can add node names or IP addresses in any order. 4. Press Enter. The HACMP software uses this information to create the cluster communication paths for the ODM. Once communication paths are established, HACMP runs the discovery operation and prints results to the SMIT screen. 5. Verify that the results are reasonable for your cluster. Extended topology configuration Before you start the configuration of HACMP via the extended topology configuration, we recommend gathering the HACMP information via the Discover HACMP Related information from Configured Nodes menu (see Example 3-12 on page 111). Complete the following procedures to define the cluster topology. You only need to perform these steps on one node. When you verify and synchronize the cluster topology, its definition is copied to the other nodes.
Slide 133: The Extended topology configuration screens include: Configuring an HACMP cluster Configuring HACMP nodes Configuring HACMP sites Configuring HACMP networks Configuring communication interfaces Configuring communication devices Configuring HACMP persistent IP labels Configuring HACMP network modules Configuring an HACMP cluster The only step necessary to configure the cluster is to assign the cluster name. Do the following steps: 1. Enter smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure an HACMP Cluster. 5. Select Add/Change/Show an HACMP Cluster and press Enter. 6. Enter the following values: – Cluster name Enter an ASCII text string that identifies the cluster. The cluster name can include alpha and numeric characters and underscores, but cannot have a leading numeric. Use no more than 31 characters. The node name can be different from the host name. Do not use reserved names. For a list of reserved names, see the HACMP List of Reserved Words, which is found in Chapter 6, “Verifying and Synchronizing a Cluster Configuration”, of the HACMP for AIX 5L V5.1 Administration and Troubleshooting Guide, SC23-4862-00. 7. Press Enter. 8. Press F3 until you return to the Extended Topology SMIT screen.
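The cluster name rules stated above (letters, digits, and underscores only, no leading digit, at most 31 characters, no reserved names) can be checked before typing a name into SMIT. The following is a minimal sketch, not part of HACMP: the function name is hypothetical, and the reserved-word set must be supplied by the caller from the HACMP List of Reserved Words, since that list is not reproduced here.

```python
import re

# One leading letter or underscore, then up to 30 more letters, digits,
# or underscores: 31 characters in total, never starting with a digit.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,30}")


def is_valid_cluster_name(name, reserved=frozenset()):
    """Return True if `name` satisfies the naming rules quoted in the text.

    `reserved` is a caller-supplied set of lowercase reserved words; it
    defaults to empty because the real HACMP list is not included here.
    """
    if name.lower() in reserved:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_cluster_name("cluster5x")` accepts the cluster name shown in Example 3-11, while a name such as `5cluster` is rejected for its leading digit.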
Slide 134: 3.4.3 Defining a node Defining a node using the standard path is not an available option. HACMP automatically takes the host name of the discovered IP network as a node name. If you want to define nodes manually, you need to use the extended SMIT path of HACMP. To define the cluster nodes using the extended path, do the following steps: 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Nodes. 5. Select Add a Node to the HACMP Cluster and press Enter. 6. Enter field values as follows: – Node name Enter (or add) one resolvable IP Label (this may be the host name), IP address, or Fully Qualified Domain Name for each new node in the cluster, up to 31, separated by spaces. This path will be taken to initiate communication with the node. Use F4 to see the picklist display of the contents of /etc/hosts that are not already HACMP-configured IP Labels/Addresses. – Communication path to node If nodes are already configured, they are displayed here. The HACMP software uses this information to create the cluster communication paths for the ODM. Once communication paths are established, HACMP updates the authentication information for the cluster communication daemon. 3.4.4 Defining sites Site definitions are optional. They are supplied to provide easier integration with the HAGEO product. If you define sites to be used outside of HAGEO, appropriate methods or customization must be provided to handle site operations. If sites are defined, site events run during node_up and node_down events. To add a site definition to an HACMP cluster, do the following steps: 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration.
Slide 135: 3. Select Extended Topology Configuration. 4. Select Configure HACMP Sites. 5. Select Cluster Configuration. 6. Select Cluster Topology. 7. Select Configure Sites. 8. Select Add Site Definition and press Enter. 9. Enter the field values as follows: – Site name Enter a name for this site using no more than 32 alphanumeric characters and underscores. – Site nodes Enter the names of the cluster nodes that belong to the site. Leave a space between names. A node can belong to only one site. – Dominance (If HAGEO is installed.) Choose yes or no to indicate whether the current site is dominant or not. – Backup communications (If HAGEO is installed.) Select the type of backup communication for your HAGEO cluster (DBFS for telephone line, SGN for a Geo_Secondary network). You can also select None. • Press Enter to add the site definition to the ODM. • Repeat the steps to add the second site. 3.4.5 Defining network(s) The cluster should have more than one network, to avoid a single point of failure. Often the cluster has both IP and non-IP based networks in order to use different heartbeat paths. Use the Add a Network to the HACMP Cluster SMIT screen to define HACMP IP and non-IP networks. To speed up the process, we recommend that you run discovery before network configuration. Running discovery may also reveal any “strange” networking configurations at your site. You can use any or all of these methods for heartbeat paths: Serial networks IP-based networks, including heartbeat using IP aliases
Slide 136: Heartbeat over disk IP-based networks To configure IP-based networks, do the following steps: 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Networks. 5. Select Add a Network to the HACMP Cluster and press Enter. 6. Select the type of network to configure and press Enter. 7. Enter the information as follows: – Network name If you do not enter a name, HACMP will give the network a default network name made up of the type of network with a number appended (for example, ether1). If you change the name for this network, use no more than 31 alphanumeric characters and underscores. – Network type This field is filled in depending on the type of network you selected. – Netmask The network mask, for example, 255.255.0.0. – Enable IP takeover via IP aliases The default is True. If the network does not support IP aliases, then IP Replacement will be used. IP Replacement is the mechanism whereby one IP address is removed from an interface, and another IP address is added to that interface. If you want to use IP Replacement on a network that does support aliases, change the default to False. – IP address offset for heartbeating over IP aliases Enter the base address of a private address range for heartbeat addresses, for example, 10.10.10.1. HACMP will use this address to automatically generate IP addresses for heartbeat for each boot interface in the configuration. This address range must be unique and must not conflict with any other subnets on the network. 8. Press Enter to configure this network. 9. Press F3 to return to configure more networks.
Slide 137: Configuring IPAT via IP replacement If you do not have extra subnets to use in the HACMP cluster, you may need to configure IPAT via IP Replacement. In HACMP V5.1, IPAT via IP Aliases is the default method for binding an IP label to a network interface, and for ensuring the IP label recovery. IPAT via IP aliases saves hardware, but requires multiple subnets. The steps for configuring IPAT via IP Replacement are: 1. In the Add a Service IP Label/Address SMIT screen, specify that the IP label that you add as a resource to a resource group is Configurable on Multiple Nodes. 2. In the same screen, configure hardware address takeover (HWAT) by specifying the Alternate Hardware Address to Accompany IP Label/Address. 3. In the Add a Network to the HACMP Cluster screen, specify False in the Enable IP Takeover via IP Aliases SMIT field. Configuring serial networks to HACMP Do the following steps to configure a serial network: 1. Enter smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Networks. 5. Select Add a Network to the HACMP Cluster and press Enter. SMIT displays a choice of types of networks. 6. Select the type of network to configure and press Enter. 7. Fill in the fields on the Add a non IP-based Network screen as follows: – Network name Name the network, using no more than 31 alphanumeric characters and underscores; do not begin the name with a numeric. Do not use reserved names to name the network. – Network type Valid types are RS232, tmssa, tmscsi, and diskhb 8. Press Enter to configure this network. 9. Press F3 to return to configure more networks.
Slide 138: 3.4.6 Defining communication interfaces Communication interfaces are already configured to AIX, and you run the HACMP discovery program to add them to HACMP picklists to aid in the HACMP configuration process. Adding discovered communication interfaces to the HACMP cluster 1. In SMIT, go to Extended Configuration. 2. Select Extended Topology Configuration. 3. Select Configure HACMP Communication Interfaces/Devices. 4. Select the discovered option. SMIT displays a selector screen for the Discovered Communications Type. 5. Select Communication Interfaces and press Enter. 6. Select one or more discovered communication interfaces to add and press Enter. HACMP either uses HACMP ODM defaults, or automatically generates values, if you did not specifically define them earlier. For example, the physical network name is automatically generated by combining the string “net” with the network type (for example, ether) plus the next available integer, as in net_ether_03 (see Example 3-15). Example 3-15 Configure HACMP communication interface Configure HACMP Communication Interfaces/Devices Move cursor to desired item and press Enter. Add Communication Interfaces/Devices Change/Show Communication Interfaces/Devices Remove Communication Interfaces/Devices Update HACMP Communication Interface with Operating System Settings +--------------------------------------------------------------------------+ ¦ Select a Network ¦ ¦ ¦ ¦ Move cursor to desired item and press Enter. ¦ ¦ ¦ ¦ net_ether_01 (10.1.1.0/24) ¦ ¦ net_ether_02 (192.168.11.0/24 172.16.100.0/24 192.168.100.0/24) ¦ ¦ ¦ ¦ F1=Help F2=Refresh F3=Cancel ¦ ¦ F8=Image F10=Exit Enter=Do ¦ ¦ /=Find n=Find Next ¦ +--------------------------------------------------------------------------+
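The automatic network naming described above can be illustrated with a short sketch. It is not HACMP code, and it rests on two assumptions that the text does not confirm: the name has the form net_<type>_<NN> with a two-digit number (inferred from net_ether_01 to net_ether_03 in the examples), and "next available integer" means the smallest positive integer not yet used for that network type. The function name is hypothetical.

```python
import re


def next_network_name(net_type, existing_names):
    """Suggest a default network name such as net_ether_03.

    Assumptions (not confirmed by the source): two-digit zero padding,
    and the smallest unused positive integer counts as "next available".
    """
    pattern = re.compile(r"net_" + re.escape(net_type) + r"_(\d+)")
    used = set()
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            used.add(int(match.group(1)))
    number = 1
    while number in used:
        number += 1
    return "net_{}_{:02d}".format(net_type, number)
```

With the two networks listed in Example 3-15 already defined, `next_network_name("ether", ["net_ether_01", "net_ether_02"])` yields the net_ether_03 name mentioned in the text.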
Slide 139: Adding predefined communication interfaces to the HACMP cluster 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Communication Interfaces/Devices and press Enter. 5. Select the predefined option. SMIT displays a selector screen for the Predefined Communications Type. 6. Select Communication Interfaces and press Enter. The Add a Communication Interface screen appears. 7. Fill in the fields as follows: – Node name The name of the node on which this network interface physically exists. – Network name A unique name for this logical network. – Network interface Enter the network interface associated with the communication interface (for example, en0). – IP label/Address The IP label/address associated with this communication interface, which will be configured on the network interface when the node boots. The picklist filters out IP labels/addresses already configured to HACMP. – Network type The type of network media/protocol (for example, Ethernet, token ring, fddi, and so on). Select the type from the predefined list of network types. Note: The network interface that you are adding has the base or service function by default. You do not specify the function of the network interface as in releases prior to HACMP V5.1, but further configuration defines the function of the interface.
Slide 140: 3.4.7 Defining communication devices Communication devices are already configured to AIX, and you have run the HACMP discovery program to add them to the HACMP picklists to aid in the HACMP configuration process. Configuring discovered serial devices for the cluster 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Communication Interfaces/Devices and press Enter. 5. Select the discovered option and press Enter. 6. Select the Communications Devices type from the selector screen. The screen Select Point-to-Point Pair of Discovered Communication Devices to Add appears. It displays a picklist that contains multiple communication devices, which, when you select one or more, are added to the HACMPadapter ODM class. Devices that are already added to the cluster are filtered from the picklist (see Example 3-16). Example 3-16 Configure HACMP communication devices Configure HACMP Communication Interfaces/Devices Move cursor to desired item and press Enter. Add Communication Interfaces/Devices +--------------------------------------------------------------------------+ ¦ Select Point-to-Point Pair of Discovered Communication Devices to Add ¦ ¦ ¦ ¦ Move cursor to desired item and press F7. Use arrow keys to scroll. ¦ ¦ ONE OR MORE items can be selected. ¦ ¦ Press Enter AFTER making all selections. ¦ ¦ ¦ ¦ # Node Device Device Path Pvid ¦ ¦ p630n01 tty0 /dev/tty0 ¦ ¦ p630n02 tty0 /dev/tty0 ¦ ¦ p630n03 tty0 /dev/tty0 ¦ ¦ p630n04 tty0 /dev/tty0 ¦ ¦ p630n05 tty0 /dev/tty0 ¦ ¦ p630n06 tty0 /dev/tty0 ¦ ¦ ¦ ¦ F1=Help F2=Refresh F3=Cancel ¦ ¦ F7=Select F8=Image F10=Exit ¦ ¦ Enter=Do /=Find n=Find Next ¦
Slide 141: +--------------------------------------------------------------------------+ 7. Select only two devices in this screen. It is assumed that these devices are physically connected; you are responsible for making sure this is true. 8. Continue defining devices as needed. Configuring predefined communication devices for the cluster 1. Enter the fast path smitty hacmp. 2. Go to Extended Configuration. 3. Select Extended Topology Configuration. 4. Select Configure HACMP Communication Interfaces/Devices and press Enter. 5. Select the predefined option and press Enter. 6. Select the Communications Devices type from the selector screen and press Enter. SMIT displays the Add a Communications Device screen. 7. Select the non IP-based network to which you want to add the devices and press Enter. 8. Enter the field values as follows: – Node name The node name for the serial device. – Device name A device file name. RS232 serial devices must have the device file name /dev/ttyn. Target mode SCSI serial devices must have the device file name /dev/tmscsin. Target mode SSA devices must have the device file name /dev/tmssan. Disk heartbeat serial devices have the name /dev/hdiskn. n is the number assigned in each device file name. – Device path For an RS232, for example, /dev/tty0. – Network type This field is automatically filled in (RS232, tmssa, tmscsi, or diskhb) when you enter the device name. – Network name This field is automatically filled in. 9. Press Enter after filling in all the required fields. HACMP now checks the validity of the device configuration. You may receive warnings if a node cannot be reached.
Slide 142: 10. Repeat until each node has all the appropriate communication devices defined. 3.4.8 Boot IP labels Every node in a cluster is configured with an IP address on each of the available adapters. These IP addresses are labeled as boot IP labels for an HACMP configuration. These IP labels are monitored by the cluster for adapter alive status. If you have heartbeat over IP alias configured on the nodes, adapter availability is monitored via heartbeat IP labels. To see the boot IP labels on a node, you can run the following command: # netstat -in 3.4.9 Defining persistent IP labels A persistent node IP label is an IP alias that can be assigned to a network for a specified node. A persistent node IP label is a label that: Always stays on the same node (is node-bound) Co-exists with other IP labels present on an interface Does not require installing an additional physical interface on that node Is not part of any resource group. Assigning a persistent node IP label for a network on a node allows you to have a node-bound address on a cluster network that you can use for administrative purposes to access a specific node in the cluster. The prerequisites to use persistent IP labels are: You can define only one persistent IP label on each node per cluster network. Persistent IP labels become available at a node’s boot time. On a non-aliased network, a persistent label may be defined on the same subnet as the service labels, or it may be placed on an entirely different subnet. However, the persistent label must be placed on a different subnet than all non-service IP labels on the network. On an aliased network, a persistent label may be placed on the same subnet as the aliased service label, or it may be configured on an entirely different subnet. However, it must be placed on a different subnet than all boot IP labels on the network.
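The subnet prerequisite above (a persistent label must not share a subnet with the boot labels on an aliased network, or with the non-service labels on a non-aliased network) can be checked in advance. The sketch below uses Python's standard ipaddress module; the function name and the "address/prefix" string format are illustrative assumptions, and HACMP performs its own checks during verification.

```python
import ipaddress


def persistent_label_subnet_ok(persistent, labels_to_avoid):
    """Return True if `persistent` shares no subnet with `labels_to_avoid`.

    Both arguments use "address/prefix" strings, for example
    "192.168.11.131/24". Pass the boot IP labels for an aliased network,
    or the non-service IP labels for a non-aliased network.
    """
    persistent_net = ipaddress.ip_interface(persistent).network
    for label in labels_to_avoid:
        other_net = ipaddress.ip_interface(label).network
        # overlaps() also catches subnets defined with different masks.
        if persistent_net.overlaps(other_net):
            return False
    return True
```

As a usage example with a made-up persistent address, `persistent_label_subnet_ok("192.168.100.31/24", ["10.1.1.1/24", "192.168.11.131/24"])` passes, whereas a persistent address inside 192.168.11.0/24 would be rejected against the same boot labels.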
Slide 143: Once a persistent IP label is configured for a network interface on a particular network on a particular node, it becomes available on that node on a boot interface at the operating system boot time and remains configured on that network when HACMP is shut down on that node. You can remove a persistent IP label from the cluster configuration using the Delete a Persistent Node IP Label/Address SMIT screen. However, after the persistent IP label has been removed from the cluster configuration, it is not automatically deleted from the interface on which it was aliased. In order to completely remove the persistent IP label from the node, you should manually remove the alias with the ifconfig delete command or reboot the cluster node. Persistent node IP labels must be defined individually, without using the discovery process. Do the following steps: a. Enter the fast path smitty hacmp. b. Go to Extended Configuration. c. Select Extended Topology Configuration. d. Select Configure HACMP Persistent Node IP Labels/Addresses. e. Add a Persistent Node IP Label and press Enter. f. Enter the field values as follows: • Node name The name of the node on which the IP label/address will be bound. • Network name The name of the network on which the IP label/address will be bound. • Node IP label/Address The IP label/address to keep bound to the specified node. g. Press Enter. 3.4.10 Define HACMP network modules As explained before, HACMP has predefined networks with specific values. While configuring clusters for different types of networks, you must, at a certain point of time, change the predefined values of the network modules. If you want to see and change the network module values, HACMP V5.X’s SMIT menu can take you directly to the network modules menu. If you want to see the current values of a network module, use the following procedure: 1. Enter smitty hacmp.
Slide 144: 2. Go to Extended Configuration. 3. Select Configure HACMP Network Modules. 4. Select Change a Network Module and press Enter. SMIT displays a list of defined network modules (see Example 3-17). 5. Select the name of the network module for which you want to see current settings and press Enter. Example 3-17 Change network module using predefined values Change a Cluster Network Module using Pre-defined Values Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Network Module Name ether Description Ethernet Protocol Failure Detection Rate Slow + F1=Help F2=Refresh F3=Cancel F4=List F5=Reset F6=Command F7=Edit F8=Image F9=Shell F10=Exit Enter=Do After the command completes, a screen appears that shows the current settings for the specified network module. 3.4.11 Synchronize topology You must synchronize the cluster configuration before you proceed with resource group creation. To synchronize a cluster, refer to 3.5.8, “Verify and synchronize HACMP” on page 151. 3.5 Resource group configuration HACMP provides the following types of resource groups: Cascading resource groups Rotating resource groups Concurrent access resource groups Custom resource groups
Slide 145: 3.5.1 Cascading resource groups A cascading resource group defines a list of all the nodes that can control the resource group and then, by assigning a takeover priority to each node, specifies a preference for which cluster node controls the resource group. When a fallover occurs, the active node with the highest priority acquires the resource group. If that node is unavailable, the node with the next-highest priority acquires the resource group, and so on (see Figure 3-2, Figure 3-3, and Figure 3-4 on page 130). Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: Active NODE B Priority=2 Status: Standby NODE C Priority=3 Status: Standby Figure 3-2 Cascading resource group in initial configuration Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: FAILED NODE B Priority=2 Status: Active NODE C Priority=3 Status: Standby Figure 3-3 Cascading resource group in fall-over condition
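The takeover rule for cascading resource groups (the active node with the highest priority acquires the group) can be sketched in a few lines. This is an illustration only, not HACMP logic: the function name is hypothetical, Priority=1 is treated as the highest priority as in Figures 3-2 and 3-3, and the special attributes of cascading groups (cascading without fallback, inactive takeover, dynamic node priority) are deliberately not modeled.

```python
def cascading_owner(priorities, active_nodes):
    """Pick the node that controls a cascading resource group.

    priorities: dict mapping node name to priority number, where 1 is
        the highest priority (assumed from the figures).
    active_nodes: set of node names currently up in the cluster.
    Returns the owning node, or None if no participating node is active.
    """
    candidates = [node for node in priorities if node in active_nodes]
    if not candidates:
        return None
    # The lowest priority number wins, matching Priority=1 for NODE A.
    return min(candidates, key=lambda node: priorities[node])
```

With `priorities = {"A": 1, "B": 2, "C": 3}`, all nodes active gives node A (Figure 3-2), and removing A from the active set gives node B (Figure 3-3).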
Slide 146: Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: Active NODE B Priority=2 Status: Standby NODE C Priority=3 Status: Standby Figure 3-4 Cascading resource group in fall-back condition The list of participating nodes establishes the resource chain for that resource group. When a node with a higher priority for that resource group joins or reintegrates into the cluster, it takes control of the resource group, that is, the resource group falls back from nodes with lesser priorities to the higher priority node. Special cascading resource group attributes Cascading resource groups support the following attributes: Cascading without fallback Inactive takeover Dynamic node priority Cascading without fallback (CWOF) is a cascading resource group attribute that allows you to refine fall-back behavior. When the Cascading Without Fallback flag is set to false, this indicates traditional cascading resource group behavior: when a node of higher priority than that on which the resource group currently resides joins or reintegrates into the cluster, and interfaces are available, the resource group falls back to the higher priority node. When the flag is set to true, the resource group will not fall back to any node joining or reintegrating into the cluster, even if that node is a higher priority node. A resource group with CWOF configured does not require IP Address Takeover. Inactive takeover is a cascading resource group attribute that allows you to fine tune the initial acquisition of a resource group by a node. If inactive takeover is true, then the first node in the resource group to join the cluster acquires the resource group, regardless of the node’s designated priority. If Inactive Takeover
Slide 147: is false, each node to join the cluster acquires only those resource groups for which it has been designated the highest priority node. The default is false. Dynamic node priority lets you use the state of the cluster at the time of the event to determine the order of the takeover node list. 3.5.2 Rotating resource groups A rotating resource group, like a cascading resource group, defines the list of nodes that can take over control of a resource group and uses priorities to determine the order in which other nodes can take control of the resource. Like cascading resource groups with CWOF set to true, control of the resource group does not automatically revert to the node with the highest priority when it reintegrates into the cluster. Use rotating resource groups to avoid the interruption in service caused by a fallback and when it is important that resources remain distributed across a number of nodes (see Figure 3-5, Figure 3-6 on page 132, and Figure 3-7 on page 132). Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: Active NODE B Priority=2 Status: Standby NODE C Priority=3 Status: Standby Figure 3-5 Rotating resource group in initial configuration
Slide 148: Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: FAILED NODE B Priority=2 Status: Active NODE C Priority=3 Status: Standby Figure 3-6 Rotating resource group in fall-over condition Resource Group 1 IP_1,VG_1 FS_1 APP_SRV_1 NODE A Priority=1 Status: Standby NODE B Priority=2 Status: Active NODE C Priority=3 Status: Standby Figure 3-7 Rotating resource group after reintegration of failed node For rotating resource groups, the node with the highest priority for a resource group and the available connectivity (network, network interface, and address) acquires that resource group from a failing node, unless dynamic node priority has been set up. The HACMP software assumes that the node that has the rotating resource group's associated service address controls the resource group. Rotating groups share some similarities with Cascading without Fallback groups. However, there are important differences. Unlike cascading groups, rotating groups interact with one another. Because rotating resource groups require the use of IP address takeover, the nodes in the resource chain must all share the same network connection to the resource group. If several rotating groups share a network, only one of these resource groups can be up on a given node at any time. Thus, rotating groups
Slide 149: distribute themselves. Cascading without Fallback groups, however, may clump together - that is, multiple groups will end up residing on the same node. CWOF does not require an IP address to be associated with the group. 3.5.3 Concurrent access resource groups A concurrent access resource group may be shared simultaneously by multiple nodes. All nodes concurrently accessing a resource group acquire that resource group when they join the cluster. There are no priorities among nodes. Concurrent access resource groups are supported in clusters with up to 32 nodes. Note that all nodes in a cluster must be members of a concurrent resource group. The only resources included in a concurrent resource group are volume groups with raw logical volumes, raw disks, and application servers that use the disks. The device on which these logical storage entities are defined must support concurrent access. 3.5.4 Custom resource groups In HACMP V5.1, in addition to cascading, rotating, and concurrent resource groups, you can configure custom resource groups. Parameters for custom resource groups let you precisely describe the resource group’s behavior at startup, fallover, and fallback. The regular cascading, rotating, and concurrent resource groups have predefined startup, fall-over, and fall-back behaviors. The policies for custom resource groups are easier to understand than the CWOF or Inactive Takeover (IT) attributes. They are not restricted to the predefined policies of regular resource groups, and can be tailored to your needs. Table 3-3 on page 134 presents the resource group behavior equivalency between “classic” resource groups (pre-HACMP 5.1) and custom resource groups.
Slide 150: Table 3-3 Custom resource group behavior Columns: RG Mapping; Startup (OHNO, OFAN, OAAN); Fallover (FOPN, FDNP, BOEN); Fallback (FBHP, NFB). Cascading: OHNO, FOPN, FBHP. CWOF+IT+DNP: OFAN, FDNP, NFB. Rotating: OFAN, FOPN, NFB. Concurrent: OAAN, BOEN, NFB. Custom resource group attributes You can configure parameters specific to custom resource groups that define how the resource group behaves at startup, fallover and fallback. Configuration for custom resource groups uses: Default node priority list List of nodes that can host a particular resource group, as defined in the “Participating Node Names” for a resource group. Home node The first node that is listed in the default node list for any non-concurrent resource group, including custom resource groups that behave like non-concurrent. Custom resource group parameters Settling time The settling time is the interval that HACMP waits before bringing online a resource group that is currently offline. When the settling time is not configured, the resource group will start on the first available higher priority node that joins the cluster. You can configure a custom resource group’s startup behavior by specifying the settling time. The settling time is used to ensure that a resource group does not bounce among nodes as nodes with increasing priority for the resource group are brought online. It lets HACMP wait for a given amount of time before activating a resource group, and then activates it on the highest priority node available. If you set the settling time, HACMP will bring the resource group online immediately, if the highest priority node for the resource group is up; otherwise, it waits for the duration of the settling time interval before determining the node on which to place the resource group.
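The settling time behavior described above can be made concrete with a small model. This is a sketch under stated assumptions rather than the HACMP implementation: the function name and arguments are hypothetical, times are plain seconds, and the settling interval is assumed to start when the first participating node joins the cluster (the text does not say when the timer starts).

```python
def place_with_settling_time(node_list, join_times, settling_time):
    """Model where and when an offline resource group is brought online.

    node_list: participating nodes in priority order, home node first.
    join_times: dict mapping node name to the time (seconds) it joined.
    settling_time: seconds to wait, counted from the first join
        (an assumption; the source does not state the starting point).
    Returns (node, activation_time), or None if no listed node joined.
    """
    joined = {n: t for n, t in join_times.items() if n in node_list}
    if not joined:
        return None
    deadline = min(joined.values()) + settling_time
    home = node_list[0]
    # If the highest priority node is up in time, activate immediately.
    if home in joined and joined[home] <= deadline:
        return home, joined[home]
    # Otherwise wait out the interval, then take the best node available.
    available = [n for n in node_list
                 if n in joined and joined[n] <= deadline]
    return available[0], deadline
```

For instance, with nodes A, B, C in priority order and a 60-second settling time, if C joins at 0 and B at 30, the group is placed on B at the 60-second mark instead of starting on C and then moving; if A joins at 20, the group starts on A right away.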

   