ZeroTier Operation Behind NAT - Comprehensive Research¶
Executive Summary¶
ZeroTier is specifically designed to work behind NAT with no port forwarding required. It uses UDP hole punching and automatic relay fallback to provide connectivity in nearly all NAT scenarios. For our use case (OPNsense 26.1 as a KVM guest behind libvirt NAT), ZeroTier should work reliably with minimal configuration.
Key Findings:
- ✅ Works behind NAT (including double NAT in many cases)
- ✅ No port forwarding required
- ✅ Only requires outbound UDP 9993 + replies
- ✅ Automatic relay fallback ensures connectivity always works
- ⚠️ Symmetric NAT causes relay usage (affects 4-8% of deployments)
- ⚠️ OPNsense uses symmetric NAT by default (affects clients, not gateway itself)
1. ZeroTier NAT Traversal Architecture¶
How It Works¶
ZeroTier uses UDP hole punching combined with a lazy NAT traversal approach:
- Initial Connection: When a peer wants to communicate, traffic immediately begins relaying through ZeroTier's infrastructure (root servers)
- Rendezvous Messages: While relaying, ZeroTier's root servers periodically send VERB_RENDEZVOUS messages to both peers containing connection hints
- Hole Punching Attempts: Both peers simultaneously send test packets to each other based on the rendezvous information, attempting to create NAT mappings
- Direct Connection: If hole punching succeeds, peers switch to direct peer-to-peer communication and stop relaying
- Continuous Retry: If traversal fails, traffic continues relaying indefinitely while retry attempts happen periodically in the background
This "lazy" approach means:
- Connections start working instantly (no setup delay)
- They always work for everyone (relay fallback)
- Direct optimization happens automatically when possible
- Connection setup is stateless (no complex NAT characterization step)
NAT Traversal Mechanisms¶
ZeroTier employs multiple strategies:
| Mechanism | Purpose | Availability |
|---|---|---|
| UDP Hole Punching | Primary NAT traversal | Always active |
| UPnP/NAT-PMP | Automatic port mapping | When router supports it |
| Port Prediction | Symmetric NAT traversal | Experimental |
| TCP Relay (port 443) | Last-resort fallback | When UDP blocked |
| IPv6 | Bypass NAT entirely | When available |
NAT Type Compatibility¶
| NAT Type | Traversal Success | Notes |
|---|---|---|
| Full Cone NAT | ✅ Direct connection | Best performance |
| Address-Restricted Cone | ✅ Direct connection | Works reliably |
| Port-Restricted Cone | ✅ Direct connection | Works reliably |
| Symmetric NAT | ⚠️ Relay only | 4-8% of deployments |
Success Rate: ZeroTier reports that 92-96% of NAT scenarios allow direct peer-to-peer connections, despite 99% of users being behind NAT.
2. Port Requirements¶
Essential Ports¶
| Port | Protocol | Direction | Required? | Purpose |
|---|---|---|---|---|
| 9993 | UDP | Outbound + replies | YES | Primary ZeroTier communication |
| Random high port | UDP | Outbound + replies | Auto | Derived from ZeroTier address |
| Random UPnP port | UDP | Outbound + replies | Auto | UPnP/NAT-PMP mapping |
| 443 | TCP | Outbound | Fallback | Last-resort relay if UDP blocked |
Do You Need Port Forwarding?¶
NO. ZeroTier explicitly does not require port forwarding.
Minimal firewall requirements:
# On the ZeroTier host's local firewall
allow UDP port 9993 (inbound and outbound)
# On upstream NAT/firewall
allow outbound UDP port 9993 + stateful replies
TCP Fallback (Port 443)¶
If UDP is completely blocked:
- ZeroTier falls back to TCP tunneling through port 443 (HTTPS impersonation)
- Higher latency than UDP
- Still functional, just slower
- Works in extremely restrictive corporate environments
3. Libvirt NAT Considerations¶
How Libvirt NAT Works¶
Libvirt's default NAT network (virbr0, 192.168.122.0/24):
- Virtual switch operates in NAT mode (IP masquerading)
- VMs get private IPs (192.168.122.x)
- Outbound traffic appears to come from host's IP
- Standard Linux iptables-based NAT
ZeroTier Compatibility¶
Good news: ZeroTier behind libvirt NAT is a standard NAT scenario - no special handling needed.
Configuration steps:
- Install ZeroTier inside the KVM guest (OPNsense)
- Ensure guest allows outbound UDP 9993
- No port forwarding needed on libvirt NAT
- ZeroTier will handle traversal automatically
Is Libvirt NAT "Symmetric NAT"?¶
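Standard libvirt NAT uses Linux netfilter masquerading, which keeps the guest's source port where it can and accepts replies only from the remote address and port the guest contacted. In practice this behaves like a port-restricted cone NAT, with symmetric-style port remapping only when two flows collide on the same port. This is ZeroTier-friendly.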
MTU/MSS Considerations¶
Nested NAT overhead:
- Physical interface MTU: 1500
- ZeroTier interface default: 2800
- Libvirt NAT: 1500 (inherits from physical)
- OPNsense guest NICs: 1500
Recommendation:
- ZeroTier will auto-negotiate MTU
- Default MTU 2800 works for direct connections
- May drop to 1280-1400 for relayed/fragmented paths
- No manual tuning needed for basic operation
- If seeing packet loss with large transfers, lower ZeroTier MTU to 1400
4. OPNsense-Specific Considerations¶
FreeBSD ZeroTier Support¶
- ZeroTier has native FreeBSD support
- Available via the os-zerotier plugin in OPNsense
- Well-tested on OPNsense/pfSense platforms
Critical Issue: Symmetric NAT for LAN Clients¶
Problem: OPNsense (like pfSense) uses symmetric NAT (endpoint-dependent NAT) by default.
Impact:
- Gateway itself: ✅ No problem (ZeroTier on router works fine)
- LAN clients behind gateway: ⚠️ Will relay, not direct peer connection
Our use case: ZeroTier runs on the OPNsense gateway, not on clients behind it, so this is not a concern.
ZeroTier-over-ZeroTier Issue¶
Problem: If using multiple OPNsense routers with ZeroTier for site-to-site VPN, ZeroTier traffic may try to route over ZeroTier tunnels recursively.
Symptoms:
- High CPU usage
- Dropped packets
- Slow performance
Solution: Add to local.conf in OPNsense UI:
{
"physical": {
"10.0.0.0/8": {"blacklist": true},
"172.16.0.0/12": {"blacklist": true},
"192.168.0.0/16": {"blacklist": true}
}
}
This prevents ZeroTier from using private IP addresses as physical paths, forcing it to use public internet routes.
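To make the scope of that blacklist concrete, here is a minimal sketch of a membership test for the three RFC 1918 blocks. The function name is_rfc1918 is a hypothetical helper for illustration only; it is not part of ZeroTier or the os-zerotier plugin.

```shell
#!/bin/sh
# is_rfc1918 ADDR: succeed if ADDR is an IPv4 address inside one of the
# three RFC 1918 blocks named in the "physical" blacklist.
# Hypothetical helper for illustration; not a ZeroTier command.
is_rfc1918() {
    # Split the dotted quad on "." into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 2                        # not a dotted quad
    case "$1" in
        10)  return 0 ;;                              # 10.0.0.0/8
        172) [ "$2" -ge 16 ] && [ "$2" -le 31 ] ;;    # 172.16.0.0/12
        192) [ "$2" -eq 168 ] ;;                      # 192.168.0.0/16
        *)   return 1 ;;
    esac
}
```

For example, is_rfc1918 192.168.122.10 succeeds (the libvirt default network falls inside 192.168.0.0/16), while is_rfc1918 172.32.0.1 fails because 172.32.x.x lies just outside the /12.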
Enhanced local.conf Configuration¶
For more control (disable UPnP, secondary ports, TCP fallback):
{
"physical": {
"10.0.0.0/8": {"blacklist": true},
"172.16.0.0/12": {"blacklist": true},
"192.168.0.0/16": {"blacklist": true}
},
"settings": {
"primaryPort": 9993,
"portMappingEnabled": false,
"allowSecondaryPort": false,
"allowTcpFallbackRelay": false
}
}
Note: local.conf must be valid JSON or the service will fail to start.
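Since a malformed file stops the service, a pre-flight check is cheap insurance. The sketch below assumes that either jq or python3 is installed on the machine where the file is edited; validate_json is a hypothetical helper name, not something the plugin provides.

```shell
#!/bin/sh
# validate_json FILE: exit 0 if FILE parses as JSON, non-zero otherwise.
# Assumption: jq or python3 is available; neither is provided by ZeroTier.
validate_json() {
    if command -v jq >/dev/null 2>&1; then
        # "empty" parses the input and prints nothing.
        jq empty "$1" >/dev/null 2>&1
    elif command -v python3 >/dev/null 2>&1; then
        python3 -m json.tool "$1" >/dev/null 2>&1
    else
        echo "validate_json: need jq or python3" >&2
        return 2
    fi
}
```

A possible use is to keep the edited text in a scratch file (say candidate-local.conf, a made-up name) and run validate_json candidate-local.conf before pasting it into the OPNsense UI. A trailing comma, the most common slip, is rejected by both parsers.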
OPNsense Firewall Rules¶
Interface with ZeroTier: Usually WAN (vtnet0)
Required rules (likely auto-created by os-zerotier plugin):
# Allow ZeroTier on WAN interface
Protocol: UDP
Source: any
Destination: This Firewall (WAN address)
Destination Port: 9993
Action: Pass
Verify in Firewall → Rules → WAN after enabling ZeroTier.
FreeBSD-Specific Tunables¶
No ZeroTier-specific sysctls documented. Standard FreeBSD network tunables apply:
# General network performance (optional, not ZeroTier-specific)
sysctl net.inet.tcp.recvspace=65536
sysctl net.inet.tcp.sendspace=65536
For our use case (management traffic, not high throughput), default settings are fine.
5. Common Problems and Solutions¶
Problem 1: ZeroTier Shows "RELAY" Status¶
Symptoms:
- zerotier-cli peers reports RELAY instead of DIRECT for one or more peers
Causes:
- Symmetric NAT (both endpoints)
- UDP 9993 blocked
- Multiple NAT layers (>1)
- Very restrictive firewall
Diagnosis:
# Check if UDP is blocked
zerotier-cli info
# Status: TUNNELED = UDP likely blocked (TCP fallback relay in use)
# Status: ONLINE = UDP working
# Per-peer RELAY/DIRECT state is shown by: zerotier-cli peers
# Check surface addresses (symmetric NAT detection)
zerotier-cli info -j | grep -A 10 surfaceAddresses
# Growing list = symmetric NAT
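The surface-address check can be made a little more quantitative by counting how many distinct external ports appear per public IP. The sketch below is a rough aid, not an official diagnostic: ports_per_ip is a hypothetical helper, it expects one IP/PORT entry per line, and it deliberately reports counts without applying any threshold.

```shell
#!/bin/sh
# ports_per_ip: read surface addresses ("IP/PORT", one per line) on stdin
# and print "IP COUNT", where COUNT is the number of distinct ports seen
# for that IP. Many ports on one IP matches the "growing list" symptom.
ports_per_ip() {
    sort -u |
        awk -F/ 'NF == 2 { n[$1]++ } END { for (ip in n) print ip, n[ip] }' |
        sort
}

# Possible usage (assumes jq is installed and that the list lives at
# .config.settings.surfaceAddresses in the JSON output):
#   zerotier-cli info -j | jq -r '.config.settings.surfaceAddresses[]' | ports_per_ip
```

A single port per IP is consistent with a cone-style NAT; a count that keeps climbing between runs points toward symmetric NAT.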
Solutions:
- If it's working: RELAY connections are functional, just higher latency. If acceptable, no action needed.
- Enable UPnP/NAT-PMP on upstream router
- Enable Static Port in OPNsense NAT settings (Firewall → NAT → Outbound → Static Port)
- Use IPv6 if available (bypasses NAT)
Problem 2: ZeroTier Service Won't Start (OPNsense)¶
Cause: Invalid local.conf JSON
Solution:
- Validate JSON: https://jsonlint.com/
- Check logs: /var/log/zerotier-one.log
- Remove local.conf content and test
- Gradually add configuration back
Problem 3: High CPU / Packet Loss (ZeroTier-over-ZeroTier)¶
Symptoms:
- CPU spikes when ZeroTier active
- Dropped packets
- Multiple OPNsense gateways on same ZeroTier network
Solution: Add RFC1918 blacklist to local.conf (see section 4 above)
Problem 4: Connection Works Then Breaks¶
Causes:
- NAT mapping timeout (< 60 seconds)
- ZeroTier keepalive interval (120 seconds) exceeds NAT timeout
Solution:
- Check NAT/firewall connection timeout settings
- Increase timeout to minimum 120 seconds (180+ recommended)
- On Linux NAT: sysctl net.netfilter.nf_conntrack_udp_timeout=180
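The thresholds in this section (120 seconds minimum, 180 or more recommended) can be encoded in a tiny check. This is a sketch using only the numbers quoted in this document; classify_nat_timeout is a hypothetical helper name.

```shell
#!/bin/sh
# classify_nat_timeout SECONDS: compare a NAT UDP timeout against the
# figures quoted in this document (120 s keepalive, 180 s recommended).
classify_nat_timeout() {
    if [ "$1" -lt 120 ]; then
        echo "too-short"      # mapping can expire between keepalives
    elif [ "$1" -lt 180 ]; then
        echo "minimum"        # meets the keepalive interval, little margin
    else
        echo "recommended"
    fi
}

# Possible usage on a Linux NAT host:
#   classify_nat_timeout "$(sysctl -n net.netfilter.nf_conntrack_udp_timeout)"
```

A result of too-short would fit the "works, then breaks" pattern described here.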
Problem 5: MTU Issues (SSH Works, Large Transfers Hang)¶
Symptoms:
- Ping works
- SSH login works
- File transfers freeze
Diagnosis:
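# Test with specific packet sizes (Linux ping syntax; -M do sets Don't Fragment)
ping -M do -s 1400 <zerotier-peer-ip>
ping -M do -s 1450 <zerotier-peer-ip>
ping -M do -s 1472 <zerotier-peer-ip>
# On FreeBSD/OPNsense the equivalent is: ping -D -s 1400 <zerotier-peer-ip>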
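The probe sizes are ICMP payload sizes, not packet sizes. Over IPv4 each echo request adds a 20-byte IP header (without options) and an 8-byte ICMP header, so the arithmetic can be sketched as follows. The function names are made up for illustration.

```shell
#!/bin/sh
# IPv4 header without options (20 bytes) + ICMP echo header (8 bytes).
IPV4_ICMP_OVERHEAD=28

# payload_for_mtu MTU: largest "ping -s" payload that fits in MTU unfragmented.
payload_for_mtu() { echo $(( $1 - $IPV4_ICMP_OVERHEAD )); }

# packet_for_payload SIZE: on-the-wire IPv4 packet size for "ping -s SIZE".
packet_for_payload() { echo $(( $1 + $IPV4_ICMP_OVERHEAD )); }
```

So payload_for_mtu 1500 gives 1472, which is why 1472 is the largest probe in the list, and the 1400 and 1450 probes correspond to 1428-byte and 1478-byte packets.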
Solution: Lower the ZeroTier network MTU to 1400 (see section 3), for example via API:
curl -X POST https://my.zerotier.com/api/v1/network/<network-id> \
-H "Authorization: Bearer <api-token>" \
-H "Content-Type: application/json" \
-d '{"config": {"mtu": 1400}}'
6. Performance and Reliability¶
Direct vs Relay Performance¶
| Connection Type | Latency | Throughput | Packet Loss |
|---|---|---|---|
| Direct P2P | Native (LAN-like) | Full bandwidth | Minimal |
| UDP Relay | +10-50ms | Reduced | Low |
| TCP Relay | +20-100ms | Significantly reduced | Low-Medium |
Real-world expectations:
- 92-96% of deployments achieve direct connections
- 4-8% use relay (still functional)
- Management traffic (SSH, monitoring) works fine even on relay
Long-Term Stability¶
ZeroTier is designed for "always-on" operation:
- Automatic reconnection after network changes
- Survives IP changes, reboots, interface flaps
- Built-in keepalives (every ~120 seconds)
- Continuous retry of direct connection attempts
Best practices for stability:
- Allow continuous outbound UDP (no aggressive timeout rules)
- Ensure NAT timeout ≥ 120 seconds (180+ recommended)
- Monitor connection status with zerotier-cli peers
- Use IPv6 where available for better stability
Resource Usage¶
- CPU: Negligible (< 1% on modern hardware)
- Memory: ~10-20 MB per ZeroTier instance
- Bandwidth: Minimal overhead (mostly keepalives when idle)
Exception: ZeroTier-over-ZeroTier routing loop causes high CPU - use local.conf blacklist to prevent.
7. Configuration Best Practices¶
For OPNsense Behind Libvirt NAT¶
Step 1: Install os-zerotier plugin
Step 2: Configure ZeroTier
Step 3: Add local.conf (prevent ZeroTier-over-ZeroTier)
{
"physical": {
"10.0.0.0/8": {"blacklist": true},
"172.16.0.0/12": {"blacklist": true},
"192.168.0.0/16": {"blacklist": true}
}
}
Step 4: Verify firewall rules
Step 5: Check status
ssh opnsense-dev
zerotier-cli info
# Should show: ONLINE
zerotier-cli peers
# Check peer connectivity (DIRECT preferred over RELAY where possible)
Firewall Rules (OPNsense)¶
WAN interface (vtnet0):
Action: Pass
Protocol: UDP
Source: any
Destination: This Firewall
Destination Port: 9993
Description: ZeroTier
ZeroTier interface (zt0):
Action: Pass
Protocol: any
Source: <ZeroTier network subnet>
Destination: any
Description: Allow ZeroTier network traffic
Monitoring Connection Quality¶
# Check overall status
zerotier-cli info
# Check peer connections
zerotier-cli peers
# Detailed JSON output
zerotier-cli info -j
# Watch for changes
watch -n 5 'zerotier-cli peers'
Interpreting output:
- DIRECT = Direct P2P connection (optimal)
- RELAY = Relaying through ZeroTier infrastructure (functional)
- Latency shown in output (e.g., 123 ms)
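For a quick tally rather than a visual scan, the DIRECT and RELAY markers can be counted. The sketch below relies only on those two keywords appearing as separate words in the peers output; count_links is a hypothetical helper name, and the exact column layout of the output is not assumed.

```shell
#!/bin/sh
# count_links: read "zerotier-cli peers" text on stdin and print
# "direct=N relay=M", counting each whitespace-separated DIRECT or RELAY word.
count_links() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "DIRECT") d++
                else if ($i == "RELAY") r++
            }
        }
        END { printf "direct=%d relay=%d\n", d, r }
    '
}

# Possible usage:
#   zerotier-cli peers | count_links
```

Matching on whole fields avoids counting header text or addresses that merely contain those letters.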
8. Troubleshooting Checklist¶
Initial Setup Issues¶
- Is os-zerotier plugin installed?
- Is ZeroTier service running? (service zerotier status)
- Is network ID correct?
- Is device authorized in ZeroTier Central?
- Does zerotier-cli info show "ONLINE"?
- Is UDP 9993 allowed outbound?
Connectivity Issues¶
- Run zerotier-cli peers on both endpoints
- Check if showing RELAY vs DIRECT
- Verify firewall rules allow ZeroTier interface traffic
- Test with ping over ZeroTier IP
- Check surfaceAddresses for symmetric NAT (growing list)
- Review /var/log/zerotier-one.log for errors
Performance Issues¶
- Is CPU spiking? (check for ZeroTier-over-ZeroTier loop)
- Add RFC1918 blacklist to local.conf
- Check MTU with large ping tests
- Verify NAT timeout ≥ 120 seconds
- Consider enabling UPnP/NAT-PMP on upstream router
FreeBSD/OPNsense Specific¶
- Is local.conf valid JSON?
- Check OPNsense logs: System → Log Files → General
- Verify interface assignment: Interfaces → Assignments
- Check routing: Diagnostics → Routes
- Test from CLI: ssh admin@opnsense-wan-ip
9. Diagnostic Commands Reference¶
Essential Commands¶
# Service status
service zerotier status
# Basic info
zerotier-cli info
# Output: 200 info <node-id> <version> ONLINE
# Peer connections
zerotier-cli peers
# Shows DIRECT/RELAY status, latency per peer
# Network list
zerotier-cli listnetworks
# Shows joined networks and assigned IPs
# Detailed JSON output
zerotier-cli info -j
zerotier-cli listnetworks -j
# Generate debug dump
zerotier-cli dump
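When these commands feed a script, it helps to extract the status word instead of grepping the whole line. The sketch below assumes the "200 info <node-id> <version> <status>" layout shown in the comment above; node_status is a hypothetical helper name.

```shell
#!/bin/sh
# node_status: read "zerotier-cli info" text on stdin and print the status
# word (fifth field). Exits non-zero if no "200 info" line is seen.
# Assumes the "200 info <node-id> <version> <status>" layout.
node_status() {
    awk '
        $1 == "200" && $2 == "info" { print $5; found = 1 }
        END { exit !found }
    '
}

# Possible usage:
#   zerotier-cli info | node_status
```

Returning a non-zero exit code on unexpected input lets a caller tell "service answered with a status" apart from "no usable answer".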
Log Locations¶
FreeBSD/OPNsense:
/var/log/zerotier-one.log # Main ZeroTier log
/var/db/zerotier-one/ # ZeroTier data directory
/var/db/zerotier-one/local.conf # Node-wide local configuration
/var/db/zerotier-one/networks.d/<network-id>.local.conf # Network-specific config
Viewing logs:
Network Testing¶
# Ping over ZeroTier
ping <zerotier-peer-ip>
# MTU testing (Linux syntax; on FreeBSD/OPNsense use: ping -D -s 1400 <zerotier-peer-ip>)
ping -M do -s 1400 <zerotier-peer-ip>
# TCP connection test
nc -vz <zerotier-peer-ip> 22
# Traceroute over ZeroTier
traceroute <zerotier-peer-ip>
10. Known Gotchas and How to Avoid Them¶
Gotcha 1: ZeroTier-over-ZeroTier Routing Loop¶
Scenario: Multiple gateways on same ZeroTier network routing traffic between sites
Impact: High CPU, packet loss, slow performance
Prevention: Always add RFC1918 blacklist to local.conf on all gateway nodes:
{
"physical": {
"10.0.0.0/8": {"blacklist": true},
"172.16.0.0/12": {"blacklist": true},
"192.168.0.0/16": {"blacklist": true}
}
}
Gotcha 2: Invalid local.conf Breaks Service¶
Scenario: Typo in local.conf JSON
Impact: ZeroTier service fails to start silently
Prevention:
- Always validate JSON before applying
- Test config with zerotier-cli info after changes
- Keep backup of working config
Gotcha 3: NAT Timeout Too Short¶
Scenario: NAT/firewall drops UDP mapping before ZeroTier keepalive (120s)
Impact: Connections break intermittently
Prevention: Set NAT UDP timeout ≥ 180 seconds
Gotcha 4: Assuming RELAY = Broken¶
Scenario: Seeing RELAY status and thinking it's not working
Reality: RELAY connections are fully functional, just higher latency
Prevention: Test actual connectivity (ping, SSH) before troubleshooting
Gotcha 5: Double NAT Panic¶
Scenario: Thinking double NAT breaks ZeroTier
Reality: ZeroTier often works fine through double NAT (may relay, but functional)
Prevention: Test first, optimize later
Gotcha 6: Forgetting to Authorize in Central¶
Scenario: Device shows in ZeroTier Central but can't communicate
Impact: Network inaccessible despite ZeroTier running
Prevention: Always check ZeroTier Central → Network → Members → Authorize device
11. Monitoring and Health Checks¶
Prometheus Metrics¶
ZeroTier can export metrics for Prometheus monitoring:
Metrics available:
- zt_packet - Packet flow statistics
- zt_peer - Peer connection status
- zt_network - Network membership status
Collection methods:
- ZeroTier Central API (for network-level stats)
- Node exporter textfile collector (for local node stats)
Example textfile collector script:
#!/bin/sh
# /usr/local/bin/zerotier-metrics.sh
# POSIX sh so it also runs where bash is not installed; requires jq
OUTPUT=/var/lib/node_exporter/textfile_collector/zerotier.prom
TMP="$OUTPUT.$$"
# Get status
STATUS=$(zerotier-cli info -j) || exit 1
ONLINE=$(printf '%s\n' "$STATUS" | jq -r '.online')
NODE_ID=$(printf '%s\n' "$STATUS" | jq -r '.address')
[ "$ONLINE" = "true" ] && ONLINE_VALUE=1 || ONLINE_VALUE=0
# Get peers (queried once so both counts come from the same snapshot)
PEERS_JSON=$(zerotier-cli peers -j) || exit 1
PEERS=$(printf '%s\n' "$PEERS_JSON" | jq '[.[] | select(.role == "LEAF")] | length')
# A peer is counted as direct when its first listed path is active
DIRECT=$(printf '%s\n' "$PEERS_JSON" | jq '[.[] | select(.role == "LEAF" and .paths[0].active == true)] | length')
# Write to a temp file and rename, so the collector never reads a partial file
cat > "$TMP" << EOF
# HELP zerotier_online ZeroTier online status (1=online, 0=offline)
# TYPE zerotier_online gauge
zerotier_online{node_id="$NODE_ID"} $ONLINE_VALUE
# HELP zerotier_peers_total Total number of ZeroTier peers
# TYPE zerotier_peers_total gauge
zerotier_peers_total $PEERS
# HELP zerotier_peers_direct Number of direct peer connections
# TYPE zerotier_peers_direct gauge
zerotier_peers_direct $DIRECT
EOF
mv "$TMP" "$OUTPUT"
Health Check Script¶
#!/bin/sh
# /usr/local/bin/zerotier-healthcheck.sh
# Check if service running
if ! service zerotier status > /dev/null 2>&1; then
    echo "ERROR: ZeroTier service not running"
    exit 1
fi
# Check if online
STATUS=$(zerotier-cli info 2>/dev/null)
if ! printf '%s\n' "$STATUS" | grep -q "ONLINE"; then
    echo "WARNING: ZeroTier not online: $STATUS"
    exit 1
fi
# Check network membership
NETWORKS=$(zerotier-cli listnetworks -j)
if [ "$(printf '%s\n' "$NETWORKS" | jq 'length')" -eq 0 ]; then
    echo "WARNING: No ZeroTier networks joined"
    exit 1
fi
# Check if assigned IP (first joined network only)
if ! printf '%s\n' "$NETWORKS" | jq -e '.[0].assignedAddresses | length > 0' > /dev/null; then
    echo "WARNING: No IP assigned on ZeroTier network"
    exit 1
fi
echo "OK: ZeroTier healthy"
exit 0
Cron schedule:
12. Configuration for Our Specific Use Case¶
Environment Details¶
- Host: GCE n2-standard-4 (opnsense-dev VM host)
- Hypervisor: KVM with nested virtualization
- Network: libvirt NAT (virbr0, 192.168.122.0/24)
- Guest: OPNsense 26.1 (FreeBSD-based)
- Guest NICs:
- vtnet0 (WAN) - libvirt NAT network
- vtnet1 (LAN) - isolated network
- vtnet2 (OPT1) - isolated network
- vtnet3 (OPT2) - isolated network
- Access: IAP SSH tunnel to host, port forwarding to guest
- Use case: Management and monitoring over ZeroTier
Recommended Configuration¶
1. Install ZeroTier on OPNsense guest:
2. Configure ZeroTier:
3. Add to local.conf:
{
"physical": {
"10.0.0.0/8": {"blacklist": true},
"172.16.0.0/12": {"blacklist": true},
"192.168.0.0/16": {"blacklist": true}
},
"settings": {
"primaryPort": 9993
}
}
4. Verify firewall rules:
Firewall → Rules → WAN
Action: Pass
Protocol: UDP
Source: any
Destination: This Firewall
Destination Port: 9993
5. Authorize in ZeroTier Central:
- Go to https://my.zerotier.com/network/6ab565387a4b9177
- Find new member (short name: opnsense-dev)
- Check "Authorized"
- Assign IP: 192.168.194.199 (or auto)
6. Test connectivity:
# From opnsense-dev
zerotier-cli info
# Expect: 200 info <node-id> <version> ONLINE
ping 192.168.194.131 # dumbo
ping 192.168.194.10 # owl
# From remote host (e.g., dumbo)
ping 192.168.194.199 # opnsense-dev
ssh joe@192.168.194.199
Expected Behavior¶
Connection type: Likely RELAY initially, may transition to DIRECT depending on:
- GCE network configuration
- Libvirt NAT implementation
- ZeroTier network topology
Performance: Even if relayed:
- Latency: +20-50ms (acceptable for management)
- Throughput: Sufficient for SSH, API calls, monitoring
- Reliability: High (relay provides stable connection)
Verification:
If all peers show DIRECT except planets (which may relay), configuration is optimal. If some/all show RELAY, it's still functional - verify actual connectivity with ping/SSH.
13. References and Sources¶
Official Documentation¶
- ZeroTier Protocol Documentation - Core NAT traversal mechanisms
- OPNsense Configuration Guide - OPNsense-specific setup and local.conf
- Router Configuration Tips - NAT type recommendations and best practices
- Corporate Firewalls Guide - Firewall rules and port requirements
- Connection Issues Troubleshooting - Diagnostic commands and solutions
- Troubleshooting & FAQ - General troubleshooting resources
- Metrics and Monitoring - Prometheus integration
- TCP Relay Documentation - Relay vs direct connection performance
Technical Deep Dives¶
- The State of NAT Traversal - Technical analysis of NAT traversal
- Inside ZeroTier: Building a Network that Works - Architecture overview
Community Resources¶
- ZeroTier Discussions - Community support forum
- ZeroTier GitHub Issues - Bug reports and feature requests
- OPNsense Forums - ZeroTier - OPNsense-specific discussions
Related Documentation¶
- Libvirt Networking Documentation - NAT forwarding details
- FreeBSD Network Performance Tuning - FreeBSD-specific tuning
- OPNsense Documentation - Official OPNsense ZeroTier guide
Conclusion¶
ZeroTier is well-suited for our use case (OPNsense behind libvirt NAT):
- ✅ No port forwarding required - works with default libvirt NAT
- ✅ Automatic relay fallback - always provides connectivity
- ✅ Native FreeBSD support - well-tested on OPNsense
- ✅ Simple configuration - minimal setup required
- ✅ Battle-tested - widely deployed in NAT scenarios
Key success factors:
- Install on the gateway itself (not LAN clients)
- Add RFC1918 blacklist to local.conf
- Verify UDP 9993 allowed outbound
- Authorize in ZeroTier Central
- Test connectivity, don't assume RELAY = broken
Next steps:
- Deploy via Ansible to opnsense-dev
- Monitor with zerotier-cli peers
- Add to Prometheus monitoring
- Validate connectivity from remote hosts
- Document in runbooks for future reference
Last Updated: 2026-02-13
Author: Research compilation for scandora.net infrastructure
Related Docs:
- /Users/joe/src/scandora.net/gateways/owl/docs/DEV-WORKFLOW.md
- /Users/joe/src/scandora.net/scripts/opnsense-dev/dev-up.sh