Troubleshooting¶
Quick Diagnostics¶
Can't SSH to Host¶
# 1. Check if host is reachable
ping HOST_IP
# 2. Check if SSH port is open
nc -zv HOST_IP 22
# 3. Try verbose SSH
ssh -vvv joe@HOST_IP
# 4. Check your IP isn't banned
# Connect via emergency access (SSM/IAP), then run on the host:
sudo fail2ban-client status sshd
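The final error line from step 3 usually shows which of the SSH sections below applies. A minimal sketch, assuming the standard OpenSSH client messages; the function name `classify_ssh_error` and its labels are illustrative and not part of any existing tooling:

```shell
# Map an SSH client error message to the matching troubleshooting section.
# $1: error text captured from ssh (stderr).
classify_ssh_error() {
  case "$1" in
    *"Permission denied"*)  echo "permission-denied" ;;
    *"Connection refused"*) echo "connection-refused" ;;
    *"timed out"*)          echo "timeout" ;;
    *)                      echo "unknown" ;;
  esac
}

# Usage (HOST_IP is a placeholder):
#   classify_ssh_error "$(ssh -o BatchMode=yes -o ConnectTimeout=5 joe@HOST_IP true 2>&1)"
```

`BatchMode=yes` stops ssh from prompting for a password, so the call fails quickly instead of waiting for input.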
ZeroTier Issues¶
# Check local status
zerotier-cli listnetworks
zerotier-cli listpeers
# Check specific peer connection
zerotier-cli listpeers | grep PEER_ID
# Restart service
sudo systemctl restart zerotier-one # Linux
sudo configctl zerotier restart # OPNsense
DNS Not Resolving¶
# Test PowerDNS directly
dig @10.10.10.10 hostname.scandora.net
# Test through gateway
dig @10.7.0.1 hostname.scandora.net
# Check gateway forwarding
ssh joe@192.168.194.10 "cat /usr/local/etc/unbound.opnsense.d/dot.conf"
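Comparing the two `dig` results above narrows the fault to PowerDNS or to the gateway. A rough sketch, assuming `dig +short` output is passed in as arguments; `dns_fault` is a hypothetical helper name. An empty answer can also mean the record simply does not exist, so treat the result as a hint:

```shell
# Decide which DNS section applies from two `dig +short` results.
# $1: output when querying PowerDNS, $2: output when querying the gateway.
dns_fault() {
  # Drop blank lines and dig's ";;" diagnostics (e.g. timeout messages).
  df_pdns=$(printf '%s\n' "$1" | sed -e '/^;/d' -e '/^$/d')
  df_gw=$(printf '%s\n' "$2" | sed -e '/^;/d' -e '/^$/d')
  if [ -z "$df_pdns" ]; then
    echo "powerdns"     # see "PowerDNS Unreachable"
  elif [ -z "$df_gw" ]; then
    echo "gateway"      # see "Gateway Not Forwarding"
  else
    echo "ok"
  fi
}

# Usage (hostname is a placeholder):
#   dns_fault "$(dig +short @10.10.10.10 hostname.scandora.net)" \
#             "$(dig +short @10.7.0.1 hostname.scandora.net)"
```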
SSH Problems¶
Permission Denied¶
Symptoms: Permission denied (publickey)
Causes:
- Wrong username (using root instead of joe)
- SSH key not loaded
- Key not in authorized_keys
Solutions:
# Check loaded keys
ssh-add -l
# Add key if missing
ssh-add ~/.ssh/id_ed25519
# Verify correct user
ssh joe@HOST # NOT root@HOST
Connection Refused¶
Symptoms: Connection refused
Causes:
- SSH service not running
- Firewall blocking port 22
- Wrong IP address
Solutions:
# Verify IP is correct
host hostname.scandora.net
# Use emergency access to check SSH
aws ssm start-session --target INSTANCE_ID --region us-west-2
sudo systemctl status sshd
Connection Timeout¶
Symptoms: Connection hangs then times out
Causes:
- Network unreachable
- Firewall dropping packets
- ZeroTier not connected
Solutions:
# Try different access methods
ssh joe@192.168.194.x # ZeroTier
ssh joe@PUBLIC_IP # Direct
gcloud compute ssh HOST --tunnel-through-iap # IAP
# Check ZeroTier status
zerotier-cli listnetworks
Banned by fail2ban¶
Symptoms: Connection refused after multiple failed attempts
Solution:
# Use emergency access
aws ssm start-session --target INSTANCE_ID --region us-west-2
# Check if banned
sudo fail2ban-client status sshd
# Unban IP
sudo fail2ban-client set sshd unbanip YOUR_IP
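The check and unban steps above can be combined so that the unban only runs when the address is actually listed. A minimal sketch, assuming the usual `fail2ban-client status` layout with a `Banned IP list:` line; `is_banned` is a hypothetical helper name:

```shell
# Succeeds (exit 0) if the given IP appears in the "Banned IP list" line
# of `fail2ban-client status JAIL` output read from stdin.
is_banned() {
  awk -v ip="$1" '
    /Banned IP list:/ {
      sub(/.*Banned IP list:/, "")   # keep only the addresses
      for (i = 1; i <= NF; i++) if ($i == ip) found = 1
    }
    END { exit !found }
  '
}

# Usage on the host (YOUR_IP is a placeholder):
#   sudo fail2ban-client status sshd | is_banned YOUR_IP \
#     && sudo fail2ban-client set sshd unbanip YOUR_IP
```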
ZeroTier Problems¶
Network Not Joined¶
Symptoms: zerotier-cli listnetworks shows nothing
# Join network
sudo zerotier-cli join 6ab565387a4b9177
# Check authorization in ZeroTier Central
# https://my.zerotier.com/network/6ab565387a4b9177
ACCESS_DENIED Status¶
Symptoms: Network shows ACCESS_DENIED
Solution: Authorize node in ZeroTier Central:
- Go to https://my.zerotier.com/
- Find network 6ab565387a4b9177
- Scroll to Members
- Check the authorization box for the node
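To confirm the authorization took effect, the network status can be read back from `zerotier-cli listnetworks`. A sketch under the assumption that each network line starts with `200 listnetworks <nwid>` and contains one of ZeroTier's status keywords; `zt_network_status` is a hypothetical helper name:

```shell
# Print the status of one network from `zerotier-cli listnetworks` output
# read on stdin, or NOT_JOINED if the network ID is not listed.
zt_network_status() {
  awk -v id="$1" '
    $1 == "200" && $3 == id {
      seen = 1
      for (i = 4; i <= NF; i++)
        if ($i ~ /^(OK|ACCESS_DENIED|NOT_FOUND|PORT_ERROR|REQUESTING_CONFIGURATION|CLIENT_TOO_OLD)$/) {
          print $i; found = 1; exit
        }
    }
    END { if (!found) print (seen ? "UNKNOWN" : "NOT_JOINED") }
  '
}

# Usage:
#   zerotier-cli listnetworks | zt_network_status 6ab565387a4b9177
```

The helper scans for a status keyword instead of relying on a fixed column, because a network name containing spaces would shift the column positions.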
RELAY Instead of DIRECT¶
Symptoms: zerotier-cli listpeers shows RELAY
Causes:
- Firewall blocking UDP 9993
- Strict NAT
Solutions:
# Check firewall allows ZeroTier
sudo ufw status | grep 9993
# Allow ZeroTier-managed global (publicly routable) addresses and routes on this network
sudo zerotier-cli set 6ab565387a4b9177 allowGlobal=true
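To see which peers are affected, the `listpeers` output can be filtered for relayed entries. A small sketch, assuming each peer line begins with `200 listpeers <ztaddr>` and contains the word `RELAY` when relayed (the remaining column order differs between ZeroTier versions); `relayed_peers` is a hypothetical helper name:

```shell
# Print the ZeroTier address of every relayed peer found in
# `zerotier-cli listpeers` output read from stdin.
relayed_peers() {
  awk '$1 == "200" && $2 == "listpeers" && $3 != "<ztaddr>" && /RELAY/ { print $3 }'
}

# Usage:
#   zerotier-cli listpeers | relayed_peers
```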
DNS Problems¶
PowerDNS Unreachable¶
Symptoms: dig @10.10.10.10 times out
# Check ZeroTier connectivity to bogart
ping 10.10.10.10
ping 192.168.194.x # bogart's ZT IP
# SSH to bogart and check service
ssh joe@bogart
sudo systemctl status pdns
Gateway Not Forwarding¶
Symptoms: dig against PowerDNS (10.10.10.10) works, but dig against the gateway (10.7.0.1) fails
# Check Unbound forward config
ssh joe@192.168.194.10 # gateway
cat /usr/local/etc/unbound.opnsense.d/dot.conf
# Should contain:
# forward-zone:
# name: "scandora.net."
# forward-addr: 10.10.10.10
# Restart Unbound
sudo configctl unbound restart
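The "should contain" check above can be scripted instead of read by eye. A minimal sketch that parses the Unbound config from stdin and only understands the `forward-zone:` clause shown above; `has_forward_zone` is a hypothetical helper name:

```shell
# Succeeds (exit 0) if stdin contains a forward-zone for ZONE that
# forwards to ADDR. Usage: has_forward_zone ZONE ADDR < config
has_forward_zone() {
  awk -v zone="$1" -v addr="$2" '
    # Close the current clause and remember whether it matched.
    function flush() { if (name_ok && addr_ok) found = 1; name_ok = 0; addr_ok = 0 }
    /^[[:space:]]*forward-zone:/ { flush(); in_zone = 1; next }
    /^[[:space:]]*(server|stub-zone|auth-zone|remote-control):/ { flush(); in_zone = 0; next }
    in_zone && $1 == "name:" { n = $2; gsub(/"/, "", n); if (n == zone) name_ok = 1 }
    # Ignore an optional @port suffix on the forward address.
    in_zone && $1 == "forward-addr:" { split($2, a, "@"); if (a[1] == addr) addr_ok = 1 }
    END { flush(); exit !found }
  '
}

# Usage from a workstation:
#   ssh joe@192.168.194.10 "cat /usr/local/etc/unbound.opnsense.d/dot.conf" \
#     | has_forward_zone scandora.net. 10.10.10.10 && echo "forwarding configured"
```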
DHCP Names Not Resolving¶
Symptoms: New DHCP clients not in DNS
# Check watcher service on gateway
ssh joe@192.168.194.10
sudo service pdns_dhcp_watcher status
# Check lease file
cat /var/dhcpd/var/db/dhcpd.leases
# Restart watcher
sudo service pdns_dhcp_watcher restart
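The lease file is verbose, so it helps to reduce it to the IP and hostname pairs the watcher should be publishing. A sketch assuming the standard ISC dhcpd lease format (`lease IP {` blocks with an optional `client-hostname "NAME";` line); `lease_hostnames` is a hypothetical helper name:

```shell
# Print "IP HOSTNAME" for every lease on stdin that carries a client-hostname.
lease_hostnames() {
  awk '
    $1 == "lease" { ip = $2 }            # start of a lease block
    $1 == "client-hostname" {
      name = $2
      gsub(/[";]/, "", name)             # strip quotes and trailing semicolon
      print ip, name
    }
  '
}

# Usage on the gateway (the file keeps history, so de-duplicate):
#   lease_hostnames < /var/dhcpd/var/db/dhcpd.leases | sort -u
```

Leases without a `client-hostname` line are skipped; such clients sent no hostname, so there is no name to look up in DNS.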
Service Problems¶
Service Won't Start¶
# Check status
sudo systemctl status SERVICE
# View recent logs
sudo journalctl -u SERVICE -n 50
# Check configuration
sudo SERVICE --check-config # if supported
Service Crashing¶
# View crash logs
sudo journalctl -u SERVICE --since "1 hour ago"
# Check for core dumps
ls -la /var/crash/
# Monitor in real-time
sudo journalctl -u SERVICE -f
Gateway Problems (OPNsense)¶
Service Commands¶
OPNsense is FreeBSD-based, so it uses configctl and service instead of systemctl:
# Service control
sudo configctl service status
sudo configctl SERVICE restart
# View logs
clog -f /var/log/system.log
clog -f /var/log/zerotier.log
# Restart entire gateway (careful!)
sudo reboot
Web Interface Unreachable¶
# SSH in and restart the web GUI (lighttpd)
ssh joe@192.168.194.10
sudo configctl webgui restart
# Check if port 443 is listening
sockstat -4l | grep 443
Config Corruption¶
If config.xml is corrupted:
# Restore from Git backup
git clone https://github.com/scandora/opnsense-owl.git
scp config.xml joe@192.168.194.10:/tmp/
ssh joe@192.168.194.10
sudo cp /tmp/config.xml /conf/config.xml
sudo reboot
Cloud Provider Issues¶
AWS Instance Unreachable¶
# Check instance status
aws ec2 describe-instance-status --instance-ids i-xxx
# Get console output
aws ec2 get-console-output --instance-id i-xxx --output text
# Reboot if needed
aws ec2 reboot-instances --instance-ids i-xxx
GCE Instance Unreachable¶
# Check instance status
gcloud compute instances describe INSTANCE --zone=ZONE
# Get serial console
gcloud compute instances get-serial-port-output INSTANCE --zone=ZONE
# Reset instance
gcloud compute instances reset INSTANCE --zone=ZONE
Getting Help¶
- Check logs first - Most issues leave traces
- Try emergency access - SSM/IAP can usually reach a running instance even when SSH is down
- Check ZeroTier Central - Network status and member authorization
- Review recent changes - git log for what changed recently