GNS3 Cisco LAB : Nexus 9000v NxOS VxLAN Lab and concept.

I have written a few blogs on VxLAN earlier. You can find them at the links below.

ASR1000 : VxLAN Unicast mode Configuration and Verification

ASR1000: Cisco Vxlan Multicast Mode Configuration and Verification

VxLAN has changed a lot since then, and I am planning to write a series of articles on VxLAN and EVPN on Nexus devices. For this, I have set up a GNS3 lab with Cisco NxOSv (Nexus 9000v).

You can find more details about the virtual router here:

https://www.cisco.com/c/en/us/support/switches/nexus-9000v-switch/model.html

Let's start digging into VxLAN and explore it from scratch.

What is an Underlay?

VxLAN introduces several new terms that we need to understand before the overall picture makes sense, and the underlay network is one of them. In VxLAN you may consider the IGP (interior gateway protocol) plus the multicast network as the underlay. Before setting up VxLAN we need to (i) provide a way for VTEPs to reach each other, and (ii) identify a method for VTEPs to learn unknown hosts.

For the first point we can use any IGP; in this case I am using OSPF. For the second point we need a mechanism that supports the flood and learn technique: if a node doesn't know the destination, it should be able to flood the ARP/BUM (Broadcast, Unknown unicast and Multicast) traffic and learn the MAC addresses from the response. The network between the VTEPs is a routed network, not a broadcast segment, so it doesn't allow this kind of flooding on its own. The usual answer is multicast (the alternative, ingress replication, is not covered here), so we resort to a multicast protocol like PIM. In this case, I have used PIM SM (sparse-mode).

What is an Overlay?

An overlay network is a network built on top of an existing, well-connected network to provide characteristics that the underlay doesn't offer. For example, in the case of VxLAN the underlying IP routed network cannot provide layer-2 features like flood and learn. If we need to connect two geographically separated data centers at layer 2, without an overlay we have no option but to rely on a BGP-based ISP or an MPLS-based network. Either way we (i) lose control of the traffic, and (ii) add latency through IP routing, which limits modern DCs that require low-latency networks. An overlay network addresses these problems for us.

How Does the Connection Initiate?

Suppose we have two devices (VTEPs, or virtual tunnel endpoints) configured for VxLAN. Once IP connectivity and the multicast network are set up between them, they start to form the overlay network. Below is the packet exchange at the beginning of the connection.

I have kept the network as simple as I can: there are two Nexus 9000v routers connected by a direct link, with OSPF and PIM running over that link (Ethernet 1/1). I have enabled a packet capture on this link to show the kind of packets exchanged between the two VxLAN-speaking nodes.

Let's see what we found in the packet capture.

We already know that once connectivity is established in this network there will be an OSPF packet exchange, but you are not on this blog to study OSPF. The major concept to understand from the packet captures is the formation of the multicast tree. So let's check what kind of PIM packets are exchanged and how both routers join the multicast group (230.1.1.1 in this case).

Here are the three important packets that we need to understand.

STEP-1 : A PIM Join is sent by the NxOS-1 router to the RP (rendezvous point) router, NxOS-2. This simply means that the router wants to join the multicast group and receive all the traffic sent to this group (230.1.1.1).

STEP-2 : The NxOS-2 router also expresses its interest in joining the multicast group 230.1.1.1.

STEP-3 : The non-RP router (NxOS-1) sends a PIM Register message to the RP, informing it that it has a source sending multicast traffic to the group 230.1.1.1. In a larger network, every router with a directly connected source would do the same.

With the above three packets, both routers have shown interest in joining the multicast group 230.1.1.1, and the non-RP router has informed the RP that it has a stream for 230.1.1.1, so the RP can forward it to any receiver that joins this group. We now have a group of routers that want to receive the traffic sent to the mcast group 230.1.1.1, and this group will be used to send and receive flooded traffic such as ARP, as shown below.

Flood and learn:

Above we have seen the multicast setup, which ensures that all VxLAN-speaking routers join a multicast group, so that any traffic sent to that group is seen by all of them. How can we use this in our case? Data plane learning depends on the well-known, tried and tested method of flood and learn, just as ARP or any other BUM traffic is flooded in an Ethernet segment to find the destination MAC. Here we take a similar approach, but with multicast instead of broadcast. The router that wants to send traffic to an unknown destination encapsulates the ARP packet coming from the source with the VxLAN headers (VxLAN + UDP, plus an outer IP header) and multicasts it to the neighbors that have joined the configured group. In our case we have defined 230.1.1.1 as that group, which is why the packet in the above Wireshark capture is sent to the destination 230.1.1.1.

This multicast packet is received by all the routers that have joined the group, decapsulated, and then forwarded to the LAN corresponding to the VNI in the VxLAN header. This VNI maps to a VLAN on the router.

While decapsulating, the receiving router also learns the source MAC address and the VTEP it came from, so the response is sent as unicast to the router where the ARP originated.
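The flood and learn behaviour described above can be sketched in a few lines of Python. This is only a toy model, not how NX-OS implements it: the `Vtep` class, its method names, and the host MAC addresses are hypothetical, while the VTEP addresses and the group 230.1.1.1 come from this lab.

```python
from dataclasses import dataclass, field

MCAST_GROUP = "230.1.1.1"            # group configured for VNI 5000 in this lab
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass
class Vtep:
    """Toy VTEP: remembers which remote VTEP each remote MAC sits behind."""
    ip: str
    mac_table: dict = field(default_factory=dict)  # remote MAC -> remote VTEP IP

    def outer_destination(self, dst_mac: str) -> str:
        """Choose the outer IP destination for a frame addressed to dst_mac."""
        if dst_mac == BROADCAST_MAC:
            return MCAST_GROUP                      # broadcast is always flooded
        # Known MAC: unicast to its VTEP. Unknown MAC: flood to the group.
        return self.mac_table.get(dst_mac, MCAST_GROUP)

    def learn(self, src_mac: str, src_vtep_ip: str) -> None:
        """Data-plane learning from a decapsulated packet."""
        self.mac_table[src_mac] = src_vtep_ip


# Hypothetical host MAC addresses, for illustration only.
PC1_MAC = "00:50:79:66:68:01"
PC2_MAC = "00:50:79:66:68:02"

nxos1 = Vtep("1.1.1.1")
nxos2 = Vtep("4.4.4.4")

# PC1 sends an ARP request (broadcast): NxOS-1 floods it to the group.
arp_request_dst = nxos1.outer_destination(BROADCAST_MAC)
nxos2.learn(PC1_MAC, nxos1.ip)       # NxOS-2 learns PC1 behind 1.1.1.1

# PC2 replies to PC1: NxOS-2 already knows where PC1 lives, so it unicasts.
arp_reply_dst = nxos2.outer_destination(PC1_MAC)
nxos1.learn(PC2_MAC, nxos2.ip)       # NxOS-1 learns PC2 behind 4.4.4.4

print(arp_request_dst, arp_reply_dst, nxos1.outer_destination(PC2_MAC))
```

After one ARP exchange both tables are populated, so the pings that follow travel as unicast between 1.1.1.1 and 4.4.4.4 and the multicast group is only needed again for new or aged-out destinations.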

Headers in VxLAN packet

You must have already noticed the headers in the previous Wireshark captures. Here is a look at a VxLAN encapsulated packet.

The UDP header is very important because it is how the receiving router recognizes that the packet belongs to VxLAN (the destination UDP port is 4789, the IANA-assigned VxLAN port). The VxLAN header carries the VNI, which is needed to determine the destination VLAN of the packet.
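To make the header layout concrete, here is a minimal Python sketch that builds and parses the 8-byte VxLAN header as laid out in RFC 7348 (8 flag bits, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names and the `VNI_TO_VLAN` dictionary are my own for illustration; the VNI 5000 to VLAN 1000 mapping is the one used in this lab.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned VxLAN destination port
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VxLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VxLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


# Mirrors "vlan 1000 / vn-segment 5000" from the lab configuration.
VNI_TO_VLAN = {5000: 1000}

header = pack_vxlan_header(5000)
print(header.hex())                      # the raw header bytes
print(VNI_TO_VLAN[unpack_vni(header)])   # VLAN the decapsulated frame belongs to
```

A receiving VTEP does the equivalent of `unpack_vni` followed by the dictionary lookup: the UDP port tells it the payload is VxLAN, and the VNI tells it which VLAN to forward the inner frame into.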

Configurations.

Now that we have covered a good set of basics, let's check out the configuration on both switches, NxOS-1 and NxOS-2.

NxOS-1 Switch :
!
feature ospf
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay
!
ip pim rp-address 4.4.4.4 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8
!
vlan 1,1000
vlan 1000
  vn-segment 5000
!
interface nve1
  no shutdown
  source-interface loopback1
  member vni 5000 mcast-group 230.1.1.1
!
interface Ethernet1/1
  ip address 10.10.10.1/24
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown
!
interface Ethernet1/3
  switchport
  switchport access vlan 1000
  no shutdown
!
interface loopback1
  ip address 1.1.1.1/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
!
router ospf 1
  router-id 1.1.1.1
!

NxOS-2 Switch :

!
feature ospf
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay
!
ip pim rp-address 4.4.4.4 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8
!
vlan 1,1000
vlan 1000
  vn-segment 5000
!
interface nve1
  no shutdown
  source-interface loopback1
  member vni 5000 mcast-group 230.1.1.1
!
interface Ethernet1/1
  ip address 10.10.10.2/24
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown
!
interface Ethernet1/3
  switchport
  switchport access vlan 1000
  no shutdown
!
interface loopback1
  ip address 4.4.4.4/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
!
interface loopback2
  ip address 3.3.3.3/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
!
router ospf 1
  router-id 4.4.4.4
!
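
Both switches point at the same RP (4.4.4.4) for 224.0.0.0/4 and reserve 232.0.0.0/8 for SSM, so the VxLAN group has to fall inside the first range and outside the second for the RP-based join and register exchange to apply. A quick way to sanity-check a candidate group is a small Python sketch like the one below; the helper name `served_by_rp` is hypothetical, and the ranges are copied from the configuration above.

```python
import ipaddress

# Ranges taken from the "ip pim" lines in the configuration above.
RP_GROUP_LIST = ipaddress.ip_network("224.0.0.0/4")
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def served_by_rp(group: str) -> bool:
    """True if the group is covered by the RP group-list and is not an SSM group."""
    addr = ipaddress.ip_address(group)
    return addr in RP_GROUP_LIST and addr not in SSM_RANGE


print(served_by_rp("230.1.1.1"))   # the group used for VNI 5000
print(served_by_rp("232.1.1.1"))   # a group inside the SSM range
```

The first call returns True and the second returns False, which is why a group such as 230.1.1.1 is a reasonable choice for this flood and learn setup.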

Verification:

NxOS-1# show nve vni
Codes: CP - Control Plane        DP - Data Plane
       UC - Unconfigured         SA - Suppress ARP
       SU - Suppress Unknown Unicast
       Xconn - Crossconnect
       MS-IR - Multisite Ingress Replication

Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     230.1.1.1         Up    DP   L2 [1000]



NxOS-2# show nve vni
Codes: CP - Control Plane        DP - Data Plane
       UC - Unconfigured         SA - Suppress ARP
       SU - Suppress Unknown Unicast
       Xconn - Crossconnect
       MS-IR - Multisite Ingress Replication

Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     230.1.1.1         Up    DP   L2 [1000]

NxOS-1# show nve interface
Interface: nve1, State: Up, encapsulation: VXLAN
 VPC Capability: VPC-VIP-Only [not-notified]
 Local Router MAC: 0c56.3a00.1b08
 Host Learning Mode: Data-Plane
 Source-Interface: loopback1 (primary: 1.1.1.1, secondary: 0.0.0.0)

NxOS-2# show nve interface
Interface: nve1, State: Up, encapsulation: VXLAN
 VPC Capability: VPC-VIP-Only [not-notified]
 Local Router MAC: 0c31.7000.1b08
 Host Learning Mode: Data-Plane
 Source-Interface: loopback1 (primary: 4.4.4.4, secondary: 0.0.0.0)

The other verification step is a ping between the VPCS hosts (PC1 and PC2). Let's check that as well.

PC1> ping 192.168.10.2

84 bytes from 192.168.10.2 icmp_seq=1 ttl=64 time=2.351 ms
84 bytes from 192.168.10.2 icmp_seq=2 ttl=64 time=2.502 ms
84 bytes from 192.168.10.2 icmp_seq=3 ttl=64 time=2.338 ms
84 bytes from 192.168.10.2 icmp_seq=4 ttl=64 time=6.568 ms
84 bytes from 192.168.10.2 icmp_seq=5 ttl=64 time=2.398 ms



PC2> ping 192.168.10.1

84 bytes from 192.168.10.1 icmp_seq=1 ttl=64 time=2.176 ms
84 bytes from 192.168.10.1 icmp_seq=2 ttl=64 time=2.254 ms
84 bytes from 192.168.10.1 icmp_seq=3 ttl=64 time=2.417 ms
84 bytes from 192.168.10.1 icmp_seq=4 ttl=64 time=5.774 ms
84 bytes from 192.168.10.1 icmp_seq=5 ttl=64 time=2.547 ms

Conclusion:

So we made it: we are able to ping end to end within the same network, and it certainly feels like layer 2. This was my first blog on the Nexus 9000v device, and I was not sure it would work, so I am glad I completed it. Next I am planning to build on this lab and cover EVPN, which brings control plane learning; in that case the flood and learn task is offloaded to BGP. Stay tuned for the exciting journey, and thanks for reading.
