VXLAN in Cisco ACI - The Network DNA: Networking, Cloud, and Security Technology Blog

VXLAN in Cisco ACI

Cisco ACI Technical Deep-Dive Series

In-Depth Technical Guide

How ACI uses VXLAN as its universal packet transport — covering TEP types, VNID roles, the Overlay-1 VRF, IS-IS underlay, COOP mapping, and a full packet walk from ingress leaf to egress leaf.

At a glance: 24-bit VNID space · 16M+ L2 segments · 50 B VXLAN overhead · 3 ACI VNID types

Topics: VXLAN (RFC 7348) · TEP / VTEP / PTEP · IS-IS underlay · COOP · GIPo

1. Why VXLAN? The Problem ACI Was Built to Solve

Traditional data center networks suffered from two structural limitations that became painful at scale. First, spanning-tree protocol blocked half the available links to prevent forwarding loops, leaving expensive bandwidth sitting idle. Second, the 802.1Q VLAN tag field is only 12 bits wide — which caps a network at 4,094 VLANs. For a multi-tenant data center running thousands of isolated application environments, that ceiling is hit quickly.
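
The scale gap is plain arithmetic, sketched here in Python:

```python
# 802.1Q VLAN ID is 12 bits; values 0 and 4095 are reserved.
usable_vlans = 2**12 - 2        # 4094 usable VLANs
# VXLAN VNID is 24 bits.
vnid_space = 2**24              # 16,777,216 segment IDs

print(f"{usable_vlans} VLANs vs {vnid_space} VNIDs "
      f"({vnid_space // usable_vlans}x more segments)")
```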

Cisco ACI's answer is to treat every packet in the fabric as a VXLAN packet — full stop. The moment a frame enters the fabric at an ingress leaf switch, it gets wrapped in a VXLAN header regardless of whether it arrived as a plain Ethernet frame, an 802.1Q-tagged frame, or even an NVGRE packet. This normalization means the fabric itself speaks one universal language internally, while still connecting seamlessly to any external encapsulation on the edges.

Figure 1 — Traditional Network Problems vs ACI VXLAN Solution

 Traditional Network — Pain Points

STP Blocked Ports: 50% of links idle to prevent loops. No true multipath forwarding.

4,094 VLAN Limit: 12-bit 802.1Q tag cannot scale for large multi-tenant environments.

L2/L3 Boundary Constraints: Endpoints locked to subnets. Moving VMs across racks breaks IP.

No Policy Carry: Bare Ethernet carries no information about which security policy applies.


✅ ACI VXLAN — What Changes

Full Mesh, No STP: All links active via ECMP over IP underlay. Loop-free by design.

16 Million Segments: 24-bit VNID field in VXLAN header gives 16,777,216 unique segments.

Flexible Endpoint Placement: Endpoints move anywhere while keeping their IP. Ingress leaf does routing.

Policy in Every Packet: ACI embeds policy class tags (pcTag/sclass) inside the VXLAN header for distributed enforcement.

2. VXLAN Basics — RFC 7348 Fundamentals

VXLAN, defined in RFC 7348, is a MAC-in-UDP encapsulation scheme. It wraps an original Layer 2 Ethernet frame inside a UDP packet, which rides over a standard IP network. The genius of this approach is that the IP network in the middle — the underlay — has no awareness of the overlay topology. It simply routes UDP packets from one IP address to another based on the outer headers.

The device that performs the encapsulation and decapsulation is called a VTEP (VXLAN Tunnel Endpoint). In standard VXLAN deployments, VTEPs can be physical switches, hypervisors, or any device with an IP address in the underlay. In ACI, every leaf switch is a VTEP — but ACI adds several proprietary extensions to the standard VXLAN header that make the fabric do things a generic VXLAN implementation cannot.
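
As a concrete view of what a VTEP prepends, here is a minimal sketch of the standard RFC 7348 VXLAN header layout in Python. This models only the generic header (flags byte with the I flag, 24-bit VNID); ACI's proprietary policy fields in the reserved bits are not modeled:

```python
import struct

VXLAN_I_FLAG = 0x08  # "valid VNI" flag per RFC 7348

def pack_vxlan_header(vnid: int) -> bytes:
    """8-byte VXLAN header: flags(1) + reserved(3) + VNID(3) + reserved(1)."""
    if not 0 <= vnid < 2**24:
        raise ValueError("VNID must fit in 24 bits")
    # Flags sit in the top byte of the first word; VNID in bytes 4-6.
    return struct.pack("!II", VXLAN_I_FLAG << 24, vnid << 8)

def unpack_vnid(header: bytes) -> int:
    """Extract the 24-bit VNID; verifies the I flag is set."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_I_FLAG:
        raise ValueError("I flag not set: no valid VNI")
    return vni_word >> 8

hdr = pack_vxlan_header(15761386)   # the Web-BD VNID used later in this post
assert len(hdr) == 8 and unpack_vnid(hdr) == 15761386
```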

 Figure 2 — VXLAN Encapsulation Frame Structure

Outer Headers — Added by VTEP (Leaf Switch)

Outer Ethernet Header · Dst MAC: next-hop router | Src MAC: leaf uplink · 14 bytes · L2 underlay

Outer IP Header · Src IP: ingress VTEP (PTEP) | Dst IP: egress VTEP (PTEP) · 20 bytes · L3 underlay

Outer UDP Header · Dst port: 4789 (IANA) | Src port: flow entropy hash · 8 bytes · ECMP entropy

VXLAN Header (ACI-Extended) · 24-bit VNID + flags (incl. ACI policy bit) + reserved · 8 bytes · ACI overlay

Original (Inner) Frame — Untouched

Inner Ethernet Header · Original src/dst MAC of the communicating endpoints · 14 bytes · Inner L2

Inner IP + Payload · Original IP packet (TCP/UDP/etc.), application data · Variable · App data

Total VXLAN overhead = 50 bytes (14 outer ETH + 20 outer IP + 8 UDP + 8 VXLAN header); it grows to 54 bytes if the 802.1Q tag of the original frame is preserved. Configure MTU ≥ 1600 bytes on all fabric links to avoid fragmentation.

The UDP source port is deliberately varied based on a hash of the inner frame's flow fields (src IP, dst IP, protocol, src port, dst port). This per-flow entropy means that different traffic flows take different ECMP paths through the spine — giving you true load balancing without any per-packet reordering issues.
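
A sketch of how such entropy could be derived. The real hash runs in the leaf ASIC and its exact inputs and algorithm differ; this just shows the idea of a stable, per-flow source port in the dynamic range:

```python
import zlib

def vxlan_source_port(src_ip: str, dst_ip: str, proto: int,
                      sport: int, dport: int) -> int:
    """Map the inner 5-tuple to a UDP source port in 49152-65535.
    Same flow -> same port -> same ECMP path; different flows spread out."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    return 49152 + zlib.crc32(key) % 16384

p1 = vxlan_source_port("10.10.1.10", "10.10.2.10", 6, 33000, 443)
p2 = vxlan_source_port("10.10.1.10", "10.10.2.10", 6, 33000, 443)
assert p1 == p2 and 49152 <= p1 <= 65535   # per-flow stability, valid range
```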

3. ACI Spine-Leaf and the Underlay

ACI mandates a spine-leaf topology. Every leaf connects to every spine, and no two leaves connect directly to each other. This creates a two-hop maximum path between any two endpoints in the fabric — traffic always traverses at most one spine switch. The regularity of this topology is what makes ACI's VXLAN forwarding model predictable and scalable.

The underlay IP network between leaf and spine uses IS-IS (Intermediate System to Intermediate System) to distribute reachability to each VTEP's loopback address. This is a critical distinction from traditional data center IS-IS — in ACI, IS-IS runs only on the point-to-point links between leaf and spine switches. It never touches servers or external routers. All it does is make sure every leaf knows how to reach the loopback IP address (PTEP) of every other leaf through the spine switches.
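
What IS-IS leaves behind on each leaf can be pictured as a small host-route table. A toy model, using the PTEP and Proxy-TEP addresses from Figure 3:

```python
# Toy underlay RIB on Leaf-1 after IS-IS convergence: a /32 route to every
# other TEP, with both spines installed as equal-cost next hops.
underlay_rib = {
    "10.0.96.64":  ["Spine-1", "Spine-2"],   # Leaf-2 PTEP
    "10.0.128.64": ["Spine-1", "Spine-2"],   # Leaf-3 PTEP
    "10.0.0.128":  ["Spine-1", "Spine-2"],   # spine Proxy-TEP (anycast)
}

def pick_next_hop(dst_ptep: str, flow_hash: int) -> str:
    """ECMP: a per-flow hash selects one of the equal-cost spine uplinks."""
    paths = underlay_rib[dst_ptep]
    return paths[flow_hash % len(paths)]

assert pick_next_hop("10.0.96.64", 7) == "Spine-2"
```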

Figure 3 — ACI Spine-Leaf VXLAN Topology

Spine Layer

IS-IS routes

Proxy-TEP

COOP DB

Spine-1

Proxy-TEP: 10.0.0.128

IS-IS · COOP Oracle · anycast

Spine-2

Proxy-TEP: 10.0.0.128

IS-IS · COOP Oracle · anycast

IS-IS Underlay Links — All Active (ECMP)

Leaf Layer

PTEP (VTEP)

Policy

Enforcement

Leaf-1

PTEP: 10.0.64.64

BD/VRF VNID assignment

Leaf-2

PTEP: 10.0.96.64

Encap/decap VXLAN

Leaf-3

PTEP: 10.0.128.64

Default GW for endpoints

Server A

10.10.1.10

Server B

10.10.1.11

Server C

10.10.2.10 (different subnet)

Server D

10.10.1.12 (same subnet as A/B)

All leaf-to-leaf traffic travels via exactly one spine switch — maximum two hops. Spines forward on outer VXLAN IP only.

4. TEP Types — PTEP, Proxy-TEP, FTEP, and vPC-TEP

One of the first things that trips up engineers new to ACI is the proliferation of TEP types. In standard VXLAN, a VTEP is simply "the IP address I use to send VXLAN traffic." In ACI, the fabric carves out several distinct TEP roles, each serving a specific forwarding purpose. All of these addresses live inside the Overlay-1 VRF — a dedicated, non-tenant infrastructure VRF that ACI uses exclusively for fabric communication.

 Figure 4 — ACI TEP Types in the Overlay-1 VRF

PTEP — Physical Tunnel Endpoint

The unique loopback IP address assigned to each individual leaf and spine by APIC from the infrastructure TEP pool (configured at fabric init time, e.g. 10.0.0.0/16). This address is allocated as a /32 loopback on Overlay-1.

Used for: Non-vPC data plane, APIC-to-leaf communication, traceroute, MP-BGP peering (for L3Out), ping between fabric nodes.

Proxy-TEP — Spine Anycast Address

An anycast IP address shared by all spines. When a leaf cannot find the destination endpoint's VTEP in its local mapping table, it sends the VXLAN packet to this anycast address. Any spine can receive it and look up the correct destination in the COOP mapping database.

Used for: Unknown unicast forwarding when the leaf doesn't have a local mapping for the destination endpoint.

FTEP — Fabric Loopback TEP

A special anycast address identical on all leaf nodes, used when a VMM domain (VMware vSphere / ESXi) is integrated. The hypervisor hosts its own VTEP (vSwitch VTEP), and the leaf uses the FTEP as the source to encapsulate VXLAN traffic destined for the vSwitch. This lets virtual machine VTEPs "see" a consistent fabric-side address regardless of which leaf they connect through.

Used for: VMM domain integration with vSwitch VTEPs.

vPC-TEP — Virtual Port Channel TEP

When two leaf switches form a vPC pair, they share a virtual IP address called the vPC-TEP (sometimes called VPC VIP or VTEP). Traffic destined to endpoints connected across both vPC members uses this shared address as the tunnel destination, allowing either leaf to receive and forward the VXLAN packet correctly.

Used for: Dual-homed server connectivity via vPC for high availability.

 Verify TEP Addresses — CLI

leaf101# acidiag fnvread                      (shows PTEP of all nodes)

leaf101# show ip interface vrf overlay-1      (shows all TEP loopbacks)

leaf101# show interface tunnel <id>           (shows VXLAN tunnel state and destination VTEP)

5. VNID Types — BD, VRF, and EPG

The 24-bit VNID field in the VXLAN header carries 16 million possible values, but ACI doesn't use them all the same way. The fabric assigns VNIDs to three different construct types, and the value in that 24-bit field changes based on what kind of forwarding is happening at any given moment. Getting this right is the key to understanding how ACI actually forwards packets.

 Figure 5 — The Three VNID Types in ACI

BD VNID

Bridge Domain Identifier

Assigned to each Bridge Domain. Represents a Layer 2 flooding domain. Used when:

✓ Multicast/BUM traffic is forwarded within the BD

✓ ARP flooding (when proxy ARP is off)

✓ IPv6 with NDP (Neighbor Discovery Protocol)

Example: VNID 15761386 = BD "Web-BD"

VRF VNID

(L3 VNID / Private Network ID)

Assigned to each VRF (Context). Represents a Layer 3 routing domain. Used when:

✓ Traffic is being routed between subnets within the same VRF

✓ Ingress leaf sends inter-subnet traffic through the fabric

✓ Spine-proxy forwarding for unknown unicast lookups

One L3 VNID per VRF — shared across all BDs in that VRF

EPG VNID

Endpoint Group VLAN mapping

Assigned per EPG. Maps the external VLAN tag (on the access port) to the ACI policy domain. Less commonly discussed but essential at the access edge:

✓ Maps customer VLAN (e.g. VLAN 100) to EPG on a port

✓ Leaf translates the access VLAN to a VNID for fabric forwarding

✓ Infra VLAN 4093 is used between leaf and APIC for fabric management

Access VLAN → VNID normalization at leaf ingress

Key Insight: A single traffic flow might use different VNIDs as it traverses the fabric. An L3 (routed) packet from Web-BD to App-BD uses the VRF VNID while transiting the fabric, but the destination BD VNID at the egress leaf for the final lookup.
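
That VNID switch boils down to a single decision at the ingress leaf, sketched below. The gateway MAC is ACI's well-known anycast gateway MAC; the VRF VNID value is hypothetical, the BD VNID is the Web-BD example from Figure 5:

```python
GATEWAY_MAC = "00:22:bd:f8:19:ff"   # ACI distributed anycast gateway MAC

def transit_vnid(dst_mac: str, src_bd_vnid: int, vrf_vnid: int) -> tuple:
    """VNID carried across the fabric: frames addressed to the distributed
    gateway MAC are routed and ride the VRF VNID; everything else is
    bridged inside the source BD and keeps that BD's VNID."""
    if dst_mac == GATEWAY_MAC:
        return ("VRF", vrf_vnid)
    return ("BD", src_bd_vnid)

WEB_BD_VNID = 15761386   # BD "Web-BD" from Figure 5
VRF_VNID    = 2097154    # hypothetical L3 VNID for the tenant VRF

assert transit_vnid(GATEWAY_MAC, WEB_BD_VNID, VRF_VNID) == ("VRF", VRF_VNID)
assert transit_vnid("00:50:56:aa:bb:cc", WEB_BD_VNID, VRF_VNID) == ("BD", WEB_BD_VNID)
```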

7. Underlay Protocols — IS-IS, COOP, and Overlay-1 VRF

ACI's underlay isn't a blank IP network — it runs two protocols that are critical to how VXLAN forwarding actually works. Understanding both IS-IS and COOP, and how they interact with the Overlay-1 VRF, is what separates engineers who truly understand ACI from those who only know the GUI.

 Figure 7 — ACI Control Plane Protocol Stack

IS-IS — Underlay Reachability

Runs on the point-to-point links between leaf and spine switches within the Overlay-1 VRF. Its sole job is to ensure every node in the fabric knows the /32 loopback address (PTEP) of every other node.

▶ Distributes /32 PTEP routes for each leaf and spine

▶ Distributes vPC-TEP (VPC VIP) addresses

▶ Distributes the Proxy-TEP anycast address on all spines

▶ ECMP load balancing across spine uplinks — all paths active

Does NOT carry: Tenant routes, endpoint MAC/IP mappings, or any overlay information — only underlay reachability.

COOP — Endpoint Mapping Database

Council of Oracle Protocol. Runs between leaf switches and spine switches over PTEP loopbacks (inside Overlay-1). Each spine acts as a "COOP Oracle" — a distributed mapping database that knows the VTEP location of every endpoint in the fabric.

▶ When a leaf learns a new endpoint (MAC/IP), it registers the mapping with all spine COOP Oracles

▶ Spines replicate the mapping to each other — ensuring all spines have full fabric endpoint visibility

▶ When a leaf needs to forward to an unknown destination, it sends the VXLAN packet to the Proxy-TEP (anycast spine IP)

▶ The spine looks up the real VTEP (PTEP) of the destination and re-encapsulates the packet to that leaf

Verify COOP: spine101# show coop internal info repo ep
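
The register-and-proxy cycle can be modeled in a few lines. This is a toy sketch of the behavior, not the COOP wire protocol; addresses come from Figure 3:

```python
class CoopOracle:
    """Toy spine COOP repo: endpoint (MAC or IP) -> owning leaf PTEP."""
    def __init__(self):
        self.repo = {}

    def register(self, endpoint: str, ptep: str) -> None:
        # A leaf announces a locally learned endpoint; spines sync this.
        self.repo[endpoint] = ptep

    def lookup(self, endpoint: str):
        return self.repo.get(endpoint)

PROXY_TEP = "10.0.0.128"  # spine anycast Proxy-TEP from Figure 3

def ingress_tunnel_dest(local_endpoints: dict, endpoint: str) -> str:
    """Ingress leaf's outer destination IP: the known PTEP if the endpoint
    is in the local table, else the spine Proxy-TEP anycast address."""
    return local_endpoints.get(endpoint, PROXY_TEP)

oracle = CoopOracle()
oracle.register("10.10.1.12", "10.0.128.64")                # Server D on Leaf-3
assert ingress_tunnel_dest({}, "10.10.1.12") == PROXY_TEP   # leaf doesn't know yet
assert oracle.lookup("10.10.1.12") == "10.0.128.64"         # the spine oracle does
```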

 Overlay-1 VRF — The Fabric's Own Network

ACI reserves a dedicated VRF called Overlay-1 for all fabric control plane and data plane communication. Tenant VRFs never share the same forwarding table. Overlay-1 contains:

/32 routes to every PTEP

vPC VIP (TEP) addresses

Spine Proxy-TEP anycast

APIC management address

FTEP address (VMM)

8. L2 Forwarding — Same Leaf and Different Leaf

Layer 2 (switched) traffic in ACI carries the Bridge Domain VNID inside the VXLAN header. The forwarding behavior splits into two scenarios: same-leaf and different-leaf, and each follows a distinct path.

 Figure 8 — L2 Traffic Forwarding Scenarios

Scenario A — Same Leaf

1

Server A sends Ethernet frame to Server B. Both are connected to the same Leaf-1.

2

Leaf-1 checks its local endpoint table. Server B's MAC is locally known — no VXLAN needed.

3

Policy lookup: leaf checks that Server A's EPG is permitted to communicate with Server B's EPG via contract.

4

Frame forwarded locally to Server B's port. Traffic never left Leaf-1. No spine involved.

✅ Zero fabric hops — most efficient L2 path possible

Scenario B — Different Leaf (Known Endpoint)

1

Server A (Leaf-1) sends frame to Server D (Leaf-3, same BD).

2

Leaf-1 checks local table — knows Server D's VTEP is 10.0.128.64 (Leaf-3 PTEP). Uses BD VNID in VXLAN header.

3

Leaf-1 encapsulates: Outer Dst IP = 10.0.128.64, VNID = BD's 24-bit ID, P bit = 1 (policy applied at ingress).

4

Spine receives packet, routes on outer IP header only — forwards to Leaf-3's PTEP via IS-IS routes.

5

Leaf-3 decapsulates VXLAN, delivers original frame to Server D on its local port.

Two fabric hops max: Leaf-1 → Spine → Leaf-3

Unknown Unicast (Destination Unknown): If Leaf-1 doesn't know Server D's VTEP, it sends the VXLAN packet to the Proxy-TEP anycast address on the spine. The spine queries COOP, finds the correct PTEP for Server D (Leaf-3), and rewrites the outer destination IP before forwarding the packet on. This spine-proxy detour lasts only until Leaf-1 learns Server D's location from the return traffic; after that, traffic flows directly leaf-to-leaf.

9. L3 Forwarding — Inter-Subnet Routing

One of ACI's most elegant architectural decisions is where it places the default gateway for tenant subnets. Rather than centralizing routing at a pair of core switches, ACI distributes the default gateway function to every leaf switch simultaneously. The same subnet gateway IP and MAC address exist on every leaf in the fabric — APIC programs them identically across all assigned leaves.

This means that when Server A (10.10.1.10) wants to send traffic to Server C (10.10.2.10 in a different subnet), it sends the packet to its default gateway — and that gateway is physically present on Leaf-1 right next to Server A. There is no round trip to a core router. The routing happens at ingress, and then the traffic is sent across the fabric in the VRF VNID (not the BD VNID) to reach the egress leaf where it gets re-encapsulated in the destination BD VNID.

 Figure 9 — L3 Inter-Subnet Packet Walk (A → C)

Step 1 · Server A: Sends the packet with Dst IP = 10.10.2.10 and Dst MAC = default gateway MAC (shared across all leaves). VNID: none (access port).

Step 2 · Leaf-1 (Ingress): The default gateway catches the packet and routes it at L3. Looks up 10.10.2.10 in the VRF and finds VTEP = Leaf-2 (10.0.96.64). Encapsulates with the VRF VNID, sets P = 1 and sclass = source EPG pcTag. VNID: VRF VNID.

Step 3 · Spine Switch: Sees only the outer IP header and routes to Leaf-2's PTEP (10.0.96.64) via IS-IS. Does NOT open the VXLAN packet. VNID: transparent.

Step 4 · Leaf-2 (Egress): Decapsulates VXLAN. The VRF VNID marks this as a routed packet for App-BD. Looks up Server C's MAC in the App-BD local table and delivers to Server C's port, re-encapsulating with the destination BD VNID if needed. VNID: BD VNID (egress).

Step 5 · Server C: Receives the original IP packet, completely unaware it traversed a VXLAN fabric. Source IP = 10.10.1.10, from Server A. VNID: none (access port).

Distributed Gateway Advantage: Routing happens at ingress — never at a centralized core. This eliminates the "tromboning" problem where traffic had to travel to a central router before reaching its destination, even when source and destination were physically adjacent.

10. Multidestination Traffic — GIPo Multicast and Head-End Replication

Not all traffic in a data center is unicast. ARP broadcasts, multicast streams, and unknown unicast flooding all need to reach multiple destinations simultaneously. In ACI, this is handled using a concept called GIPo (Group IP Outer) — a multicast IP address assigned to each Bridge Domain that serves as the destination for all BUM (Broadcast, Unknown unicast, Multicast) traffic within that BD.

When a leaf needs to flood traffic for a given BD, it encapsulates the frame in VXLAN and sends it to that BD's GIPo multicast address. Every leaf that has endpoints in that Bridge Domain is subscribed to the GIPo group, so they all receive the flooded packet through the underlay multicast tree. Spines act as multicast rendezvous points (or forward the multicast traffic) — they do not need to know anything about the overlay content, just the outer multicast IP address.

 Figure 10 — GIPo Multicast BUM Traffic Forwarding

Leaf-1 (Source)

Server A sends ARP broadcast

Encapsulates in VXLAN → Outer Dst = GIPo: 225.1.45.64

VXLAN UDP to GIPo multicast address

Spine-1

Multicast RP — replicates to all subscribers

Spine-2

Forwards to subscribed leaves

Delivered to all leaves subscribed to GIPo 225.1.45.64

Leaf-1 (Self)

Receives own copy — discards (loop prevention)

Leaf-2 ✅

Has endpoints in this BD — decapsulates and delivers to Server C's port

Leaf-3 ✅

Has endpoints in this BD — decapsulates and delivers to Server D's port

Leaf-4 ✗

No endpoints in this BD — not subscribed to GIPo, never receives the packet

GIPo Assignment: Each Bridge Domain receives a unique multicast IP address (GIPo) from the fabric's configured multicast range (225.0.0.0/15 by default). APIC assigns these automatically. When a leaf has no endpoint in a BD, it never joins that GIPo group and never wastes fabric resources processing floods for that BD.

Multicast-Free Option — Head-End Replication: When a multicast underlay is not available (such as across the inter-site network in ACI Multi-Site), ACI can use head-end replication instead of GIPo multicast. The replicating node sends a separate unicast VXLAN copy to each remote node that has endpoints in the BD. This costs more replication work at the source but runs over any IP underlay without multicast support.
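
The two replication modes reduce to one branch, sketched here as a toy model using the GIPo and leaf PTEPs from Figure 10:

```python
def bum_copies(gipo: str, subscribed_pteps: list, use_her: bool) -> list:
    """Destinations for one BUM frame. With a multicast underlay, a single
    copy goes to the BD's GIPo group and the fabric replicates it; with
    head-end replication, the replicating node unicasts one copy per
    remote node that has endpoints in the BD."""
    if use_her:
        return [("unicast", ptep) for ptep in subscribed_pteps]
    return [("multicast", gipo)]

subs = ["10.0.96.64", "10.0.128.64"]   # Leaf-2 and Leaf-3 have endpoints in the BD
assert bum_copies("225.1.45.64", subs, False) == [("multicast", "225.1.45.64")]
assert len(bum_copies("225.1.45.64", subs, True)) == 2   # one copy per leaf
```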

 Essential VXLAN in ACI — CLI Quick Reference

leaf# show endpoint mac <mac-addr>            (find endpoint location, VTEP IP, VNID)

leaf# show endpoint ip <ip-addr>              (find endpoint by IP)

leaf# show vxlan                              (summary of VXLAN config and tunnel interfaces)

leaf# show interface tunnel <id>              (tunnel src/dst VTEP, encap stats)

leaf# show ip route vrf overlay-1             (underlay IS-IS routes to all PTEPs)

leaf# acidiag fnvread                         (PTEP/node ID mapping for all fabric nodes)

spine# show coop internal info repo ep        (full endpoint database on spine COOP Oracle)

✅ Key Takeaways — VXLAN in Cisco ACI

Every packet in the ACI fabric is normalized to VXLAN at ingress — external VLAN, NVGRE, or untagged frames all become VXLAN inside the fabric.
ACI uses four distinct TEP types — PTEP (per-node), Proxy-TEP (spine anycast), FTEP (VMM), and vPC-TEP (dual-homed) — each playing a specific role in packet delivery.
The VNID field carries either a BD VNID (L2 forwarding) or a VRF VNID (L3 routing) depending on the packet type and forwarding path.
IS-IS builds underlay reachability (PTEP /32 routes). COOP maintains the endpoint mapping database on spine oracles. Together they enable spine-proxy forwarding for unknown unicast.
The Policy Bit and sclass in the ACI-extended VXLAN header carry security policy context in every packet, enabling distributed contract enforcement at any leaf in the fabric.
BUM traffic uses GIPo multicast addresses per Bridge Domain — only leaves with active endpoints in a BD receive flooded traffic, preventing fabric-wide broadcast storms at scale.

Go Deeper into Cisco ACI

Explore Cisco's official documentation on ACI forwarding, VXLAN design guides, and the full ACI architecture white papers.

Tags

VXLAN Cisco ACI ACI VXLAN Forwarding PTEP VTEP ACI VNID Types Overlay-1 VRF ACI Policy Bit sclass IS-IS ACI Underlay COOP Protocol ACI GIPo Multicast ACI Spine Leaf Cisco ACI Deep Dive RFC 7348 VXLAN