I received the following question from an AIX administrator in Germany.

“Hi Chris, on your blog, you explain how to find out the active value of num_cmd_elems of an fc-adapter by using the kdb. So you can decide, if the value of lsattr is active or not ... I wonder if you can find out the values fc_err_recov and dyntrk of the fscsiX device?

    melih[/etc]# lsattr -El fscsi0
    attach       switch       How this adapter is CONNECTED         False
    dyntrk       yes          Dynamic Tracking of FC Devices        True
    fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
    scsi_id      0x1021f      Adapter SCSI ID                       False
    sw_fc_class  3            FC Class for Fabric                   True

I try to use echo efscsi fscsi0 | kdb .. but I can't figure it out.. Can you help me please?”

I did a little research on his behalf and came up with an answer. However, I’m not at all surprised he had trouble finding the right information. It's not easy, clear or documented! I received the following information from my IBM AIX contacts.

“The following relies on internal structures that are subject to change. The procedure was tested on 6100-06, 6100-07, and 7100-01. I don't have a lab system with physical HBAs and 5.3 at the moment. Hopefully the same steps should work for 5.3. You may need to first run efscsi without arguments to load the kdb module before running efscsi fscsiX.

    # kdb
    (0)> efscsi fscsi1 | grep efscsi_ddi
        struct efscsi_ddi ddi = 0xF1000A060084A080
    (0)> dd 0xF1000A060084A080+20 2
    F1000A060084A0A0: 0101020202010200 000000B400000028  ...............(
                              FFDD     NNNNNNNN

    FF       = fc_err_recov: 01=delayed_fail 02=fast_fail
    DD       = dyntrk: 00=disabled 01=enabled
    NNNNNNNN = num_cmd_elems - 20 (20 reserved), e.g. 200 - 20 = 180 = B4

So in this example, fc_err_recov is set to fast_fail (02), dyntrk is set to yes (01) and num_cmd_elems is set to 200.”

I tested this on lab systems running AIX 6.1 TL6 and AIX 7.1 TL1. I started with an FC adapter with dyntrk disabled (set to no), fast fail disabled (fc_err_recov set to delayed_fail) and num_cmd_elems set to 500:
    # lsattr -El fscsi1
    attach       none         How this adapter is CONNECTED         False
    dyntrk       no           Dynamic Tracking of FC Devices        True
    fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
    scsi_id                   Adapter SCSI ID                       False
    sw_fc_class  3            FC Class for Fabric                   True

    # lsattr -El fcs1 -a num_cmd_elems
    num_cmd_elems 500 Maximum number of COMMANDS to queue to the adapter True

    # kdb
    (0)> efscsi fscsi1 | grep efscsi_ddi
        struct efscsi_ddi ddi = 0xF1000A060096E080
    (0)> dd 0xF1000A060096E080+20 2
    F1000A060096E0A0: 0101020201000100 000001E000000028  ...............(
                              FFDD     NNNNNNNN

OK, let’s break it down. From the kdb output we can determine the following:

• fc_err_recov is currently set to delayed_fail (FF=01 = fc_err_recov = delayed_fail).
• dyntrk is currently set to no (DD=00 = dyntrk = disabled).
• num_cmd_elems is currently set to 500 (NNNNNNNN=1E0 = 480, and 480 + 20 = 500).

If I set dyntrk to yes, the value changes immediately in the running kernel configuration. I was able to make this change without a reboot because the device was not in use. Note that the ddi structure now sits at a new address.

    # chdev -l fscsi1 -a dyntrk=yes
    # kdb
    (0)> efscsi fscsi1 | grep efscsi_ddi
        struct efscsi_ddi ddi = 0xF1000A0800CB6080
    (0)> dd 0xF1000A0800CB6080+20 2
    F1000A0800CB60A0: 0101020201010200 000001E000000028  ...............(
                              FFDD     NNNNNNNN

And now dynamic tracking is enabled (DD=01 = dyntrk = enabled, set to yes).

Poor old AIX 5.3 struggled to provide me with any information using the steps provided.

So what about max_xfer_size? For a physical FC adapter we can find the current value using the following kdb commands:

    (0)> efcs fcs1 |grep ddi
        struct efc_ddi ddi = 0xF1000A06006D0080
    (0)> dd 0xF1000A06006D0080+60 4
    F1000A06006D00E0: 00000000000000C8 0000012C900000C1  ...........,....
    F1000A06006D00F0: 900000C1000FFC00 0010000000800000  ................

Based on the output, num_cmd_elems is set to 200 (0xC8) and max_xfer_size is set to 1048576 (0x100000).
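The fscsi decoding walked through above can be sketched as a small shell function. This is only an illustration, not an IBM-supplied tool: the function name decode_fscsi_dd is made up, and the byte positions are simply the FF, DD and NNNNNNNN positions from the IBM note, which relies on internal structures that may change between levels. It takes the two 64-bit words printed by dd and reports the three values.

```shell
#!/bin/sh
# Hypothetical helper: decode the two words printed by
#   dd <efscsi_ddi address>+20 2
# Assumed layout (from the IBM note, undocumented and subject to change):
#   word 1, hex chars 9-10  = fc_err_recov (01=delayed_fail, 02=fast_fail)
#   word 1, hex chars 11-12 = dyntrk       (00=disabled,     01=enabled)
#   word 2, hex chars 1-8   = num_cmd_elems minus the 20 reserved elements
decode_fscsi_dd() {
    word1=$1
    word2=$2

    ff=$(printf '%s' "$word1" | cut -c9-10)
    dt=$(printf '%s' "$word1" | cut -c11-12)
    nn=$(printf '%s' "$word2" | cut -c1-8)

    case $ff in
        01) recov=delayed_fail ;;
        02) recov=fast_fail ;;
        *)  recov="unknown($ff)" ;;
    esac

    case $dt in
        00) trk=no ;;
        01) trk=yes ;;
        *)  trk="unknown($dt)" ;;
    esac

    # Hex to decimal, then add back the 20 reserved command elements.
    elems=$(( 0x$nn + 20 ))

    echo "fc_err_recov=$recov dyntrk=$trk num_cmd_elems=$elems"
}

# The two dd lines from the examples in this post:
decode_fscsi_dd 0101020202010200 000000B400000028
decode_fscsi_dd 0101020201000100 000001E000000028
```

The first call corresponds to the IBM example (fast_fail, yes, 200) and the second to my lab adapter before the chdev (delayed_fail, no, 500). Unknown byte values are printed rather than guessed at, since the layout is not guaranteed.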
The max_xfer_size for VFC is tricky because it is contained in a structure that can and does change between SPs and TLs. In 6100-06-01, max_xfer_size is offset 3932 bytes into the structure, so we get the value like this:

    (0)> vfcs
    NAME ADDRESS STATE HOST HOST_ADAP OPENED NUM_ACTIVE
    fcs2 0xF100010100B38000 0xFFFF nimlab102-vfchost0 0x00 0x0000
    (0)> dcal 3932
    Value decimal: 3932 Value hexa: 00000F5C
    (0)> dd 0xF100010100B38000+F50
    F100010100B38F50: 0000002800000002 000000C800100000  ...(............

Offset 0xF5C is the last four bytes of that line, so max_xfer_size is 0x00100000 (1048576). The four bytes before it, at 0xF58, hold num_cmd_elems (0xC8 = 200).

Perhaps the easiest way to handle changes between versions is to use the fact that max_xfer_size comes immediately after num_cmd_elems, which is very unlikely to change. So, knowing that the structure size does not change by very much, you can grep in the general area:

    (0)> vfcs fcs2 | grep elems
    num_cmd_elems: 0xC8
    (0)> dd 0xF100010100B38000 200 | grep 000000C8
    F100010100B38F50: 0000002800000002 000000C800100000  ...(............

Here are the links to my previous posts on kdb:

https://www.ibm.com/developerworks/mydeveloperworks/blogs/cgaix/entry/checking_num_cmd_elems_for_vfc_adapters_with_kdb1?lang=en
https://www.ibm.com/developerworks/mydeveloperworks/blogs/cgaix/entry/checking_your_queue_depth_with_kdb?lang=en

Enjoy, kdb fans!

Attention: just a note about max_xfer_size and virtual FC adapters. In my experience, if the values for this attribute on the VIO client do not match those on the VIO server, then you will have trouble configuring the virtual FC adapters. Possible side effects may include your system never booting again! For example, if I change the value to 0x200000 on the client without mirroring this value on the VIO server, I may encounter the following effects:

    # rmdev -Rl fcs1
    sfwcomm1 Defined
    fscsi1 Defined
    fcnet1 Defined
    fcs1 Defined
    # chdev -l fcs1 -a max_xfer_size=0x200000
    fcs1 changed

The cfgmgr command will report errors for the FC adapter.

    # cfgmgr
    Method error (/usr/lib/methods/cfgefscsi -l fscsi1 ):
            0514-061 Cannot find a child device.
    Method error (/usr/lib/methods/cfgstorfworkcom -l sfwcomm1 ):
            0514-040 Error initializing a device into the kernel.

Errors similar to the following may appear in the AIX error report.

    # errpt | grep fcs
    0E0C5B31   0726123812 U S fcs1           Undefined error
    8C9E9221   0726123812 I S fcs1           Informational message

You’ll observe messages in the error report that claim a request from the client was rejected by the VIOS.

    ...
    Request was rejected by VIOS
    Response was rejected by the client
    ...

    # errpt -aN fcs1
    ---------------------------------------------------------------------------
    LABEL:           VFC_ERR8
    IDENTIFIER:      0E0C5B31

    Date/Time:       Thu Jul 26 12:38:29 EETDT 2012
    Sequence Number: 1040
    Machine Id:      00C123C64C00
    Node Id:         aixlpar1
    Class:           S
    Type:            UNKN
    WPAR:            Global
    Resource Name:   fcs1

    Description
    Undefined error

    Probable Causes
    PROCESSOR

    Failure Causes
    PROCESSOR

            Recommended Actions
            PERFORM PROBLEM DETERMINATION PROCEDURES

    Detail Data
    Error Location
    0000 00E0
    Error Type
    00
    RC
    FFFF FFFF FFFF FFFF
    VIO Server Partition Name
    vio2
    Physical Adapter Instance Name
    vfchost50
    Physical Adapter Location Code
    U5873.001.8SS0071-P2-C6-T1
    Physical Adapter DRC Name
    U9119.FHB.87654C6-V7-C1100
    Adapter N Port ID
    0000 0000 0000 0000
    Adapter State
    0000 FFFF
    Additional Information
    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
    ---------------------------------------------------------------------------
    LABEL:           VFC_ERR7
    IDENTIFIER:      8C9E9221

    Date/Time:       Thu Jul 26 12:38:29 EETDT 2012
    Sequence Number: 1039
    Machine Id:      00C123C64C00
    Node Id:         aixlpar1
    Class:           S
    Type:            INFO
    WPAR:            Global
    Resource Name:   fcs1

    Description
    Informational message

    Probable Causes
    Request was rejected by VIOS
    Response was rejected by the client

    Failure Causes
    PROCESSOR

            Recommended Actions
            PERFORM PROBLEM DETERMINATION PROCEDURES

    Detail Data
    Error Location
    0000 0088
    Error Type
    00
    RC
    0000 0000 0010 0000
    VIO Server Partition Name
    vio2
    Physical Adapter Instance Name
    vfchost50
    Physical Adapter Location Code
    U5873.001.8SS0071-P2-C6-T1
    Physical Adapter DRC Name
    U9119.FHB.87654C6-V7-C1100
    Adapter N Port ID
    0000 0000 0000 0000
    Adapter State
    0000 0004

If you encounter this problem, restore the client's FC adapter attributes to their previous values before restarting the system. If you don’t, your LPAR may no longer boot and may hang on LED 554. Change your VIOS first, then update your VIO clients.
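As a rough illustration of the "VIOS first, then clients" rule, here is a minimal sketch of a pre-check you could run before touching the client. The function name check_max_xfer_size is hypothetical, and the VIOS value of 0x100000 in the example calls is an assumed value for illustration only. On a live system the two arguments might come from something like lsattr -El fcsX -a max_xfer_size -F value, run on the VIOS and on the client respectively.

```shell
#!/bin/sh
# Hypothetical pre-check: compare the max_xfer_size currently set on the
# VIOS adapter with the value you intend to set on the VIO client.
# Both arguments are 0x-prefixed hex strings, as lsattr prints them.
check_max_xfer_size() {
    vios=$(( $1 ))      # e.g. 0x100000 -> 1048576
    client=$(( $2 ))    # e.g. 0x200000 -> 2097152

    if [ "$vios" -eq "$client" ]; then
        echo "match: client and VIOS both use $client bytes"
        return 0
    fi

    echo "MISMATCH: VIOS=$vios client=$client; change the VIOS first"
    return 1
}

# Assumed example values; only 0x200000 comes from the scenario above.
check_max_xfer_size 0x100000 0x100000
check_max_xfer_size 0x100000 0x200000 || echo "do not run chdev on the client yet"
```

The non-zero return code on a mismatch makes it easy to stop a change script before it reaches the chdev on the client.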