Cisco ACI: RMA procedure for one node in APIC cluster


A question we are often asked: if one or two nodes in an APIC cluster fail, what is the procedure to replace them?

An APIC cluster has a minimum of three nodes. If one node fails, you can still read, write, and push policies to the spine and leaf switches. If two nodes fail, the cluster offers read-only access. This article walks through the RMA procedure for one node in an APIC cluster.
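The one-failure versus two-failure behaviour described above can be sketched as a simple majority rule. This is a simplified model for the three-node case only, not the APIC implementation: the real APIC database is sharded and replicated, so larger clusters need per-shard reasoning. The function name `cluster_access_mode` is hypothetical.

```python
def cluster_access_mode(cluster_size: int, failed_nodes: int) -> str:
    """Expected access mode under a simplified majority-quorum model."""
    if cluster_size < 1 or not 0 <= failed_nodes <= cluster_size:
        raise ValueError("failed_nodes must be between 0 and cluster_size")

    healthy = cluster_size - failed_nodes
    if healthy == 0:
        return "unavailable"
    # Writes are only accepted while a strict majority of nodes is healthy.
    if healthy > cluster_size // 2:
        return "read-write"
    return "read-only"


# Walk through every failure count for a three-node cluster.
for failed in range(4):
    print(failed, "failed ->", cluster_access_mode(3, failed))
```

For a three-node cluster this gives read-write with zero or one failed node and read-only with two, which matches the behaviour described in this article.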

First, identify the failed APIC and the current fabric settings. Start by logging in to an operational APIC.

Fig 1.1- APIC Login Screen

Step 1: From the Web interface of an operational APIC, choose System > Controllers.

Step 2: On the left-hand side of the screen, choose Controllers > (any APIC) > Cluster as Seen by Node.

Step 3: Under Operational State, locate the failed APIC: it is the controller that fails to load and is typically shown as Unavailable.

Note⭐: Record the fabric name, target size, and node ID of the failed APIC, as well as the Tunnel Endpoint (TEP) address space. You will need these values when setting up the replacement APIC.
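One way to keep the values from this note together, and to compare them later with what is entered on the replacement, is a small record like the sketch below. The class `ApicSettings`, the helper `mismatched_fields`, and the sample values are all hypothetical and exist only for illustration; they are not part of any Cisco tool.

```python
from dataclasses import asdict, dataclass
import ipaddress


@dataclass(frozen=True)
class ApicSettings:
    """Values to record from the failed APIC before replacing it."""
    fabric_name: str
    target_size: int
    node_id: int
    tep_pool: str  # TEP address space in CIDR notation

    def __post_init__(self) -> None:
        if not self.fabric_name:
            raise ValueError("fabric_name must not be empty")
        if self.target_size < 1 or self.node_id < 1:
            raise ValueError("target_size and node_id must be positive")
        # Raises ValueError if the string is not a valid network.
        ipaddress.ip_network(self.tep_pool)


def mismatched_fields(recorded: ApicSettings, replacement: ApicSettings) -> list[str]:
    """Names of the settings that differ between the two records."""
    old, new = asdict(recorded), asdict(replacement)
    return [name for name in old if old[name] != new[name]]


# Hypothetical example values, not taken from a real fabric.
recorded = ApicSettings("ACI-Fabric1", 3, 2, "10.0.0.0/16")
planned = ApicSettings("ACI-Fabric1", 3, 3, "10.0.0.0/16")
print(mismatched_fields(recorded, planned))
```

An empty list means the planned setup values agree with what was recorded from the failed node.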

Fig 1.2- Failed Node in a cluster
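The same check can be scripted against a saved response from the APIC REST API instead of reading the GUI table. The sketch below only parses JSON; it does not contact an APIC. It assumes the response comes from a class query on `infraWiNode` and that each object carries `id`, `nodeName`, `operSt`, and `health` attributes. These names are recalled from the APIC object model rather than taken from this article, so verify them on your release. The sample payload is invented.

```python
import json


def failed_controllers(payload: dict) -> list[dict]:
    """Return controllers whose operational state is not 'available'."""
    failed = []
    for item in payload.get("imdata", []):
        # Assumed object shape: {"infraWiNode": {"attributes": {...}}}
        attrs = item.get("infraWiNode", {}).get("attributes", {})
        if attrs and attrs.get("operSt") != "available":
            failed.append({
                "id": attrs.get("id"),
                "name": attrs.get("nodeName"),
                "operSt": attrs.get("operSt"),
            })
    return failed


# Invented sample: a three-node cluster in which node 2 has failed.
SAMPLE = json.loads("""
{"imdata": [
  {"infraWiNode": {"attributes": {"id": "1", "nodeName": "apic1",
                                  "operSt": "available", "health": "fully-fit"}}},
  {"infraWiNode": {"attributes": {"id": "2", "nodeName": "apic2",
                                  "operSt": "unavailable", "health": "unknown"}}},
  {"infraWiNode": {"attributes": {"id": "3", "nodeName": "apic3",
                                  "operSt": "available", "health": "fully-fit"}}}
]}
""")

for node in failed_controllers(SAMPLE):
    print(node)
```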

Step 4: Decommission the failed node. In the Work pane, on the APIC Cluster tab, verify that the Health State in the Active Controllers summary table indicates the cluster is Fully Fit before continuing. Then, in the Active Controllers table on the same tab, right-click the failed controller and choose Decommission. When the confirmation dialog box appears, click Yes.

Fig 1.3- Decommission Failed Node in a cluster

Step 5: The decommissioned controller displays Unregistered in the Operational State column. The controller is then taken out of service and is no longer visible in the Work pane.

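Step 6: Commission the replacement node in the APIC cluster. Once the replacement APIC has been installed and set up with the fabric name, target size, node ID, and TEP address space noted in Step 3, commission it from an operational APIC. Allow a few minutes for fabric discovery to propagate information about the replaced APIC to the other cluster members.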

Fig 1.4- Commission Failed Node in a cluster

Log in to the fabric using the CLI of the new APIC to verify that it has joined the fabric. Make sure the authentication credentials are configured accordingly.
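Because the cluster needs a few minutes to converge after commissioning, the verification can be wrapped in a polling loop with a timeout. The sketch below is a generic pattern, not a Cisco utility: `fetch_health` is a hypothetical callable that you would implement yourself (for example from REST or CLI output) to return one health string per controller, and the `"fully-fit"` value is an assumed spelling of the Fully Fit state shown in the GUI.

```python
import time
from typing import Callable, Iterable


def wait_until_fully_fit(
    fetch_health: Callable[[], Iterable[str]],
    expected_size: int,
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every expected controller reports 'fully-fit' or time runs out."""
    deadline = clock() + timeout_s
    while True:
        states = list(fetch_health())
        # Converged only when all expected members are present and healthy.
        if len(states) == expected_size and all(s == "fully-fit" for s in states):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated convergence: the third controller joins, then becomes healthy.
snapshots = iter([
    ["fully-fit", "fully-fit"],
    ["fully-fit", "fully-fit", "data-layer-partially-diverged"],
    ["fully-fit", "fully-fit", "fully-fit"],
])
print(wait_until_fully_fit(lambda: next(snapshots), 3, sleep=lambda _: None))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply.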

Similarly, if two nodes fail, you need to RMA both nodes and follow the same step-by-step procedure for each.
