With the release of NSX-T 3.1, improvements to inter-TEP communication within the same host have been implemented. The Edge TEP IP can now be on the same subnet as the local hypervisor TEP. This feature reduces the complexity for collapsed setups where the Edge VM runs on an ESXi host that is also part of the Geneve overlay transport zone.
The following tunnel configuration is now possible:
In previous NSX-T versions, this tunnel setup was not possible because Geneve encapsulation only happened on the physical uplink. When a packet arrived on the network adapter, the system could not differentiate between the ESXi host's TEP and the Edge's TEP. To achieve a collapsed setup, you had to use either different physical network adapters or external routing.
A collapsed setup is a configuration where the Edge VM runs on an ESXi host that is part of the overlay itself. While this configuration seems convenient, in the real world it is more common to have dedicated ESXi hosts for Edge VMs.
For a collapsed design in NSX-T 3.0 and earlier, you had two options:
Option 1 - Use different physical NICs. The Geneve tunnel is established over the physical switch, allowing the ESXi host to differentiate between the ESXi host's TEP and the Edge node's TEP.
Option 2 - Use different VLANs. In this configuration, the ESXi host is able to differentiate between the ESXi host's TEP and the Edge node's TEP by the VLAN tag. This setup requires an external router that routes packets between VLAN 200 and VLAN 201.
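The decision between these designs can be written down as a small function. This is only a sketch of the rules described in this post, not anything NSX-T ships; the function name, parameters, and return labels are made up for illustration.

```python
from typing import Tuple


def collapsed_tep_design(
    nsx_version: Tuple[int, int],
    separate_pnics: bool,
    host_tep_vlan: int,
    edge_tep_vlan: int,
) -> str:
    """Classify a collapsed Edge/host TEP design (hypothetical helper)."""
    same_vlan = host_tep_vlan == edge_tep_vlan

    # NSX-T 3.1 and later: Edge and host TEPs may share VLAN and uplinks.
    if nsx_version >= (3, 1) and same_vlan and not separate_pnics:
        return "shared-vlan"

    # Option 1: the tunnel goes out over the physical switch.
    if separate_pnics:
        return "separate-pnics"

    # Option 2: the VLAN tag separates the TEPs, an external router connects them.
    if not same_vlan:
        return "separate-vlans-routed"

    # Same uplinks and same VLAN before 3.1: tunnels do not come up.
    return "unsupported"


if __name__ == "__main__":
    print(collapsed_tep_design((3, 0), False, 200, 201))
    print(collapsed_tep_design((3, 1), False, 200, 200))
```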
How to migrate to a shared transport VLAN?
When you already have Host and Edge TEPs on the same VLAN, the migration is very simple. You just have to configure a Trunk VLAN Segment in NSX-T and change the Edge virtual machine's network interface in vCenter.
Important: Trunk Portgroups configured on the VDS do not work! It has to be a segment configured in NSX-T.
- Configure a Trunk VLAN Segment (NSX-T > Networking > Segments)
- Change the Edge VM's network interface to the Trunk VLAN Segment. Please keep in mind that "Network Adapter 1" is the management interface and must not be changed.
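If you prefer to script the segment instead of clicking through the UI, the request could look roughly like the sketch below. It assumes the NSX-T Policy API segment resource with its `vlan_ids` and `transport_zone_path` fields; please verify both against the API reference for your version. The manager hostname, segment ID, and transport zone path are placeholders, and the code only builds the request without sending it.

```python
import json
from typing import Dict, Tuple


def trunk_segment_request(
    manager: str,
    segment_id: str,
    transport_zone_path: str,
    vlan_range: str = "0-4094",
) -> Tuple[str, Dict[str, object]]:
    """Build URL and body for a trunk VLAN segment (assumed Policy API shape)."""
    url = f"https://{manager}/policy/api/v1/infra/segments/{segment_id}"
    body = {
        "display_name": segment_id,
        # A VLAN range instead of a single ID makes the segment a trunk.
        "vlan_ids": [vlan_range],
        # Path of the VLAN transport zone the segment belongs to (placeholder).
        "transport_zone_path": transport_zone_path,
    }
    return url, body


if __name__ == "__main__":
    url, body = trunk_segment_request(
        "nsx.example.com",  # placeholder manager address
        "edge-trunk",  # placeholder segment ID
        "/infra/sites/default/enforcement-points/default/transport-zones/TZ-ID",
    )
    print("PATCH", url)
    print(json.dumps(body, indent=2))
```

The printed body can then be sent with the HTTP client of your choice, using whatever authentication your NSX Manager requires.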
When you are using the second option and the Edge VM is on another VLAN, you additionally have to reconfigure the Edge Node in NSX-T to use the shared VLAN and IP Pool.
- Change the Transport VLAN in the Uplink Profile (NSX-T > System > Configuration > Fabric > Profiles > Uplink Profiles)
- Reconfigure the Edge Transport Node to use the proper Uplink Profile and IP Pool (NSX-T > System > Configuration > Fabric > Nodes > Edge Transport Nodes > [Edge VM] > Edit)
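Before moving the Edge to the shared VLAN and IP Pool, it can help to confirm that the new Edge TEP address really falls into the same subnet as the host TEPs. The following is a minimal sketch using Python's standard `ipaddress` module; the addresses and prefix length are made-up examples, not values from my setup.

```python
import ipaddress


def teps_share_subnet(host_tep: str, edge_tep: str, prefix_len: int) -> bool:
    """Return True if both TEP addresses belong to the same IP network."""
    host_net = ipaddress.ip_interface(f"{host_tep}/{prefix_len}").network
    edge_net = ipaddress.ip_interface(f"{edge_tep}/{prefix_len}").network
    return host_net == edge_net


if __name__ == "__main__":
    # Example addresses only; replace them with the TEPs from your IP Pool.
    print(teps_share_subnet("192.168.200.11", "192.168.200.21", 24))
    print(teps_share_subnet("192.168.200.11", "192.168.201.21", 24))
```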
When I first reconfigured the edge virtual machine, all tunnels went down immediately with the error message "1 - Control Detection Time Expired".
The ESXi hosts' and Edge VM's TEP addresses could still ping each other, and the MTU was configured correctly. This is exactly the symptom you would see with a shared transport VLAN prior to NSX-T 3.1.
After double-checking the configuration, it turned out that I had placed the Edge VM in a Distributed Portgroup instead of an NSX-T Segment.
After changing the Edge VM to the Trunk VLAN Segment, all tunnels could be established.
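A simple pre-check would have caught this mistake. The sketch below assumes you have already collected the Edge VM's network adapters and their backing type from your inventory; the dictionary layout and the backing labels are hypothetical and do not come from any vSphere or NSX-T API.

```python
from typing import Dict, List

# Hypothetical labels for the type of network a vNIC is attached to.
NSX_SEGMENT = "nsx-segment"
VDS_PORTGROUP = "vds-portgroup"

# "Network Adapter 1" is the Edge management interface and is left alone.
MANAGEMENT_NIC = "Network Adapter 1"


def misplaced_edge_nics(nics: List[Dict[str, str]]) -> List[str]:
    """Return the names of data NICs that are not backed by an NSX-T segment."""
    return [
        nic["name"]
        for nic in nics
        if nic["name"] != MANAGEMENT_NIC and nic["backing"] != NSX_SEGMENT
    ]


if __name__ == "__main__":
    edge_nics = [
        {"name": "Network Adapter 1", "backing": VDS_PORTGROUP},
        {"name": "Network Adapter 2", "backing": VDS_PORTGROUP},
        {"name": "Network Adapter 3", "backing": NSX_SEGMENT},
    ]
    for name in misplaced_edge_nics(edge_nics):
        print(f"{name} must be moved to the Trunk VLAN Segment")
```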