Instead you can create multiple WireGuard interfaces and use policy routing / ECMP / BGP / all the layer 3 tricks; that way you can achieve similar things to what VXLAN could give you, but at layer 3 (rough sketch below).
There's a performance benefit to doing it this way too: in some testing I found a single WireGuard interface can be a bottleneck (Linux has various offload and multi-core support for it, but there's still some overhead), so spreading traffic across several interfaces helps.
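To make that concrete, here's a rough sketch of the multi-interface approach: two tunnels to the same remote prefix with an ECMP route across them, plus a policy-routing rule to pin one subnet to a specific tunnel. All the names, subnets, ports and endpoints below are made up for illustration:

  # two WireGuard tunnels to the same remote site (hypothetical names/addresses)
  ip link add wg0 type wireguard
  ip link add wg1 type wireguard
  wg set wg0 private-key /etc/wireguard/wg0.key listen-port 51820
  wg set wg1 private-key /etc/wireguard/wg1.key listen-port 51821
  wg set wg0 peer <peer-pubkey> endpoint remote.example:51820 allowed-ips 10.20.0.0/16
  wg set wg1 peer <peer-pubkey> endpoint remote.example:51821 allowed-ips 10.20.0.0/16
  ip link set wg0 up && ip link set wg1 up

  # ECMP: hash flows across both tunnels
  ip route add 10.20.0.0/16 nexthop dev wg0 weight 1 nexthop dev wg1 weight 1
  # optional: include L4 ports in the hash so flows spread out better
  sysctl -w net.ipv4.fib_multipath_hash_policy=1

  # policy routing: pin one source subnet to wg1 via its own table
  ip rule add from 10.1.2.0/24 lookup 100
  ip route add 10.20.0.0/16 dev wg1 table 100

At scale you'd typically run BGP (FRR, bird, etc.) over the tunnels and let it install the routes instead of the static ones above.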
This is the correct answer: routing between subnets is how it's supposed to work. I think there are some edge cases like DR where stretching L2 might sound like a good idea, but in practice it gets messy fast.
Agreed. They've also been extremely finicky in my experience - I've had cases where large EVPN deployments just blackholed some arbitrary destination MAC until GARPs were sent out for it.
Also IME EVPN is mostly deployed/pushed when clueless app developers expect to have arbitrary L2 reachability across any two points in a (cross DC!) fabric [1], or when they want IP addresses that can follow them around the DC or other dumb shit that they just assumed they can do.
[1] "What do you mean I can't just use UDP broadcast as a pub sub in my application? It works in the office, fix your network!" and the like.
The good clouds don't support L2: they use a centralized control plane instead of brittle EVPN, and they virtualize in the hypervisor instead of in the switches. People are being sold EVPN as "we have cloud at home", and it's not really true.
AWS/GCE/Azure's network implementations pre-date EVPN and are proprietary to their clouds. EVPN is for on-premises. You don't exactly have the opportunity to use their implementations unless you are on their cloud, so I'm not sure comparing the merits of the two is productive.
> Also IME EVPN is mostly deployed/pushed when clueless app developers expect to have arbitrary L2 reachability across any two points in a (cross DC!) fabric [1], or when they want IP addresses that can follow them around the DC or other dumb shit that they just assumed they can do.
Sorry, but that's really reductive and backwards. It's usually pushed by requirements from the lower layers of the stack: operators don't want VMs to have downtime, so they live-migrate them to other places in the DC, and it's not a weird requirement for those VMs to keep the same IP once migrated. I've never had a developer ask me for L2 reachability.