The folks on the wormhole-rs fork (who appear to share your Github organization? [1]) already have NAT punching working 95+% of the time in my testing, so maybe what they're doing could be ported over to the Python implementation.
The Rust implementation on Tailscale worked well for me. Except on a layer 7 firewall have to be quick to permit the connection or else it tries fallback.
[1] https://github.com/magic-wormhole