MAP4F10 Recovery from I/O enclosure repair with one CEC enclosure fenced
This procedure addresses a situation in which a failure in an I/O enclosure
causes a CEC enclosure to become fenced and, in some configurations, adapters
in another I/O enclosure to become unavailable. Use this procedure after the
original I/O enclosure failure has been successfully repaired. The procedure
returns the fenced CEC enclosure to dual-cluster operation and, if needed,
recovers the other I/O enclosure whose adapters might be unavailable.
MAP4F10 Section-1
About this task
- You completed replacing one of the following I/O enclosure FRUs:
- I/O enclosure backplane
- RIO card
- RIO cable
- The serviceable event that you just repaired for the I/O enclosure has been automatically closed.
- A new serviceable event was created with SRC BE400100 (CEC in fenced state during I/O enclosure
repair). A CEC enclosure is fenced, quiesced, or powered off.
Procedure
-
Are there any open serviceable events with SRCs BE1E2167, BE1E2543, or BE1E2551?
- Yes, close these serviceable events. These are expected when an I/O enclosure has serviceable
events and a CEC enclosure is fenced.
- No, go to the next step.
-
Are there any open serviceable events with CEC FRUs?
- Yes, repair them and then go to step 4.
- No, go to the next step.
- Use the following special pseudo repair procedure to reset the fenced CEC enclosure. The
procedure quiesces, powers off, powers on, and resumes the CEC enclosure:
- Use the Display Storage Facility State (End of Call) task
to determine which CEC enclosure is fenced.
- From the navigation area, click
- From the bottom Task area, click The View Storage Facility State (end of call) window
opens.
- Click the Fenced Resources option at the bottom of the list. Then click Details
to display the fenced LPAR information.
For example:
Server Not Good
lparName SF75FW820ESS11
state=4(Fenced),PartitionState=Running
Note: SFsssssssssESS0x (x = 1 or 2) is in CEC0 (upper)
SFsssssssssESS1x (x = 1 or 2) is in CEC1 (lower)
- Return to the Task area, and click Service Utilities > View Hardware Topology
to identify the location code of the fenced CEC enclosure.
For example:
Current Hardware Topology
CEC 0 MTMS = 9117-MMA*10D5242
CEC 0 Unit ID = U787D.001.DQD53K3
CEC 1 MTMS = 9117-MMA*10D5272
CEC 1 Unit ID = U787D.001.DQD17BM
- Use the Exchange Parts procedure to select the
CEC enclosure that was displayed as fenced:
- From the navigation area, click
- From the bottom Task area, click . The Exchange CEC Components window opens.
- Select a CEC enclosure and click Show FRUs.
The Show CEC FRUs window opens.
- Select the FRU Location Code, and then click Exchange FRU.
- Select the System Processor Card FRU and continue the guided repair:
- From the Show CEC FRUs window, select System Processor Card and click Exchange FRU.
Notes:
- If the System Processor Card is not displayed, you might need to
maximize the window and manually scroll down the list.
- When the guided repair directs you to disconnect the black power cables
from the CEC enclosure power supplies, do not disconnect them.
- When the guided repair directs you to replace the system processor card,
do not replace it. Leave it installed and continue the repair.
- After the pseudo repair of the CEC enclosure is complete, go to the next step.
-
Determine whether another I/O enclosure needs to be recovered. Are there any open serviceable
events with the following SRCs?
- BE340012 (Device Area Resource Manager detected device adapter card objects missing in the
current harvest)
- BE360012 (IBM.EssSAHARM detected host adapter card objects missing in the current harvest)
- Yes, go to the next step.
- No, go to step 6.
-
Identify the I/O enclosure for the serviceable events with the SRCs BE340012 (Device adapter
card missing) or BE360012 (Host adapter card missing):
-
Use the location codes in the FRU list to determine the I/O enclosure.
Note: Each device adapter card and host adapter card in the I/O enclosure will have an open serviceable event.
-
Do not repair any of these open serviceable events.
-
Use the Exchange Parts procedure to perform a pseudo repair of the I/O enclosure RIO card. The
pseudo repair makes the I/O enclosure adapters available.
-
After the I/O enclosure pseudo repair is successful, manually close each serviceable event with SRC BE340012 or SRC BE360012.
-
Go to the next step.
-
Display and repair any open serviceable events containing I/O enclosure FRUs.
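
The decision flow of this procedure, together with the LPAR naming note in step 3, can be sketched in code as a memory aid. The following Python is a minimal illustration only and is not part of the product or its service tooling. The function names cec_for_lpar and plan_recovery, their inputs, and the action strings are hypothetical. The LPAR name pattern is an assumption inferred from the note and the single example in step 3.

```python
import re

# SRCs named in this procedure.
EXPECTED_SRCS = {"BE1E2167", "BE1E2543", "BE1E2551"}  # step 1: close, no repair
MISSING_ADAPTER_SRCS = {"BE340012", "BE360012"}       # steps 4 and 5

# Assumed pattern, taken from the note in step 3:
#   SFsssssssssESS0x is in CEC0 (upper), SFsssssssssESS1x is in CEC1 (lower),
#   where x is 1 or 2.
_LPAR_RE = re.compile(r"^SF\w+ESS([01])([12])$")


def cec_for_lpar(lpar_name):
    """Return 0 (CEC0, upper) or 1 (CEC1, lower) for a fenced LPAR name."""
    match = _LPAR_RE.match(lpar_name)
    if match is None:
        raise ValueError("unrecognized LPAR name: " + lpar_name)
    return int(match.group(1))


def plan_recovery(open_srcs, has_open_cec_fru_events):
    """List the actions of this procedure, in order, for the given open events.

    open_srcs: iterable of SRC strings from the open serviceable events.
    has_open_cec_fru_events: True if any open event lists CEC FRUs (step 2).
    """
    open_srcs = set(open_srcs)
    actions = []

    # Step 1: these SRCs are expected while a CEC enclosure is fenced.
    expected = sorted(open_srcs & EXPECTED_SRCS)
    if expected:
        actions.append("close expected events: " + ", ".join(expected))

    # Steps 2 and 3: repair real CEC FRU events, or else reset the fenced
    # CEC enclosure with the pseudo repair.
    if has_open_cec_fru_events:
        actions.append("repair open CEC FRU events")
    else:
        actions.append("pseudo repair fenced CEC enclosure")

    # Steps 4 and 5: recover the other I/O enclosure only if adapters are
    # reported missing.
    missing = sorted(open_srcs & MISSING_ADAPTER_SRCS)
    if missing:
        actions.append("pseudo repair I/O enclosure RIO card")
        actions.append("manually close events: " + ", ".join(missing))

    # Step 6: always finish by checking for I/O enclosure FRU events.
    actions.append("display and repair open I/O enclosure FRU events")
    return actions


if __name__ == "__main__":
    print(cec_for_lpar("SF75FW820ESS11"))  # LPAR name from the example in step 3
    for action in plan_recovery({"BE1E2167", "BE340012"}, False):
        print("-", action)
```

With the LPAR name from the example in step 3, cec_for_lpar returns 1, which corresponds to CEC1 (lower). The sketch only orders the actions. Each action must still be performed through the management console as described in the steps above.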