Troubleshooting

Quick Diagnostics

Can't SSH to Host

# 1. Check if host is reachable
ping HOST_IP

# 2. Check if SSH port is open
nc -zv HOST_IP 22

# 3. Try verbose SSH
ssh -vvv joe@HOST_IP

# 4. Check your IP isn't banned
# Use emergency access (SSM/IAP) then:
sudo fail2ban-client status sshd
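
The first two checks above can be chained so the first failing step is reported. This is a minimal sketch, not existing tooling: `run_checks`, `check_ping` and `check_ssh_port` are hypothetical names, and the `ping -W` timeout assumes Linux `ping` syntax.

```shell
# Run the named checks in order against a host; stop at the first failure.
# Each check is a shell function that takes the host as its only argument.
run_checks() {
  host=$1
  shift
  for check in "$@"; do
    if ! "$check" "$host" >/dev/null 2>&1; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "OK"
}

# Step 1: one ICMP echo with a 2-second wait (Linux ping syntax).
check_ping() { ping -c 1 -W 2 "$1"; }

# Step 2: TCP connect to port 22 with a 3-second timeout.
check_ssh_port() { nc -z -w 3 "$1" 22; }
```

Usage: `run_checks HOST_IP check_ping check_ssh_port`. A host that drops ICMP fails step 1 even when SSH works, so a ping failure alone is not conclusive.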

ZeroTier Issues

# Check local status
zerotier-cli listnetworks
zerotier-cli listpeers

# Check specific peer connection
zerotier-cli listpeers | grep PEER_ID

# Restart service
sudo systemctl restart zerotier-one   # Linux
sudo configctl zerotier restart       # OPNsense

DNS Not Resolving

# Test PowerDNS directly
dig @10.10.10.10 hostname.scandora.net

# Test through gateway
dig @10.7.0.1 hostname.scandora.net

# Check gateway forwarding
ssh joe@192.168.194.10 "cat /usr/local/etc/unbound.opnsense.d/dot.conf"
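
Comparing the two dig results shows which hop is at fault. The sketch below assumes the resolver addresses used above; `resolve_via`, `dns_fault` and `dns_check` are hypothetical helper names.

```shell
# Ask one resolver for a name and print only answer lines.
# dig can print ";;" diagnostics on stdout when a server times out,
# so those lines are filtered out.
resolve_via() {
  dig +short +time=2 +tries=1 "@$1" "$2" 2>/dev/null | grep -v '^;'
}

# Decide where resolution breaks, given the (possibly empty) answers
# from PowerDNS directly and from the gateway.
dns_fault() {
  direct=$1
  via_gateway=$2
  if [ -z "$direct" ]; then
    echo "powerdns"   # no answer from PowerDNS itself
  elif [ -z "$via_gateway" ]; then
    echo "gateway"    # PowerDNS answers, the gateway does not
  elif [ "$direct" != "$via_gateway" ]; then
    echo "mismatch"   # both answer, but with different data
  else
    echo "ok"
  fi
}

# Query both resolvers for one name and print the verdict.
dns_check() {
  dns_fault "$(resolve_via 10.10.10.10 "$1")" "$(resolve_via 10.7.0.1 "$1")"
}
```

Usage: `dns_check hostname.scandora.net`. A verdict of `powerdns` points to the PowerDNS Unreachable section, `gateway` to Gateway Not Forwarding.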

SSH Problems

Permission Denied

Symptoms: Permission denied (publickey)

Causes:

  1. Wrong username (using root instead of joe)
  2. SSH key not loaded
  3. Key not in authorized_keys

Solutions:

# Check loaded keys
ssh-add -l

# Add key if missing
ssh-add ~/.ssh/id_ed25519

# Verify correct user
ssh joe@HOST   # NOT root@HOST

Connection Refused

Symptoms: Connection refused

Causes:

  1. SSH service not running
  2. Firewall blocking port 22
  3. Wrong IP address

Solutions:

# Verify IP is correct
host hostname.scandora.net

# Use emergency access to check SSH
aws ssm start-session --target INSTANCE_ID --region us-west-2
sudo systemctl status sshd

Connection Timeout

Symptoms: Connection hangs then times out

Causes:

  1. Network unreachable
  2. Firewall dropping packets
  3. ZeroTier not connected

Solutions:

# Try different access methods
ssh joe@192.168.194.x      # ZeroTier
ssh joe@PUBLIC_IP          # Direct
gcloud compute ssh HOST --tunnel-through-iap  # IAP

# Check ZeroTier status
zerotier-cli listnetworks
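
The three symptoms above can be told apart from the client's error text. A minimal sketch, assuming the usual OpenSSH client messages; `classify_ssh_error` is a hypothetical helper name.

```shell
# Map an ssh client error message to the matching symptom.
# The patterns are the common OpenSSH messages; other clients may differ.
classify_ssh_error() {
  case $1 in
    *"Permission denied"*)  echo "permission-denied" ;;
    *"Connection refused"*) echo "connection-refused" ;;
    *"timed out"*)          echo "connection-timeout" ;;
    *)                      echo "unknown" ;;
  esac
}
```

Usage: `classify_ssh_error "$(ssh -o BatchMode=yes -o ConnectTimeout=5 joe@HOST true 2>&1)"`. `BatchMode` stops ssh from prompting, so a missing key shows up as an error instead of a password prompt.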

Banned by fail2ban

Symptoms: Connection refused after multiple failed attempts

Solution:

# Use emergency access
aws ssm start-session --target INSTANCE_ID --region us-west-2

# Check if banned
sudo fail2ban-client status sshd

# Unban IP
sudo fail2ban-client set sshd unbanip YOUR_IP
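
To confirm a ban before unbanning, the status output can be checked for a specific address. A sketch, assuming the status output contains a "Banned IP list:" line followed by whitespace-separated addresses; `ip_is_banned` is a hypothetical helper name.

```shell
# Succeed if the given IP appears on the "Banned IP list" line of
# `fail2ban-client status sshd` output read from stdin.
# Each token is put on its own line and matched whole (-x), so
# 203.0.113.7 does not match 203.0.113.70.
ip_is_banned() {
  grep 'Banned IP list' | tr -s ' \t' '\n\n' | grep -Fxq "$1"
}
```

Usage: `sudo fail2ban-client status sshd | ip_is_banned YOUR_IP && sudo fail2ban-client set sshd unbanip YOUR_IP`.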

ZeroTier Problems

Network Not Joined

Symptoms: zerotier-cli listnetworks lists no networks

# Join network
sudo zerotier-cli join 6ab565387a4b9177

# Check authorization in ZeroTier Central
# https://my.zerotier.com/network/6ab565387a4b9177

ACCESS_DENIED Status

Symptoms: Network shows ACCESS_DENIED

Solution: Authorize node in ZeroTier Central:

  1. Go to https://my.zerotier.com/
  2. Find network 6ab565387a4b9177
  3. Scroll to Members
  4. Check the authorization box for the node
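
The status of one network can be pulled out of the listnetworks output to tell "not joined" from "joined but not authorized". This is a sketch: `zt_network_status` is a hypothetical helper name, and the list of status words is an assumption that may not cover every ZeroTier version.

```shell
# Print the status word for one network from `zerotier-cli listnetworks`
# output on stdin. Prints NOT_JOINED if the network ID is absent and
# UNKNOWN if the line has no recognized status word.
zt_network_status() {
  awk -v nwid="$1" '
    $1 == "200" && $2 == "listnetworks" && $3 == nwid {
      status = "UNKNOWN"
      # Scan for a known status word instead of using a fixed column,
      # because a network name containing spaces shifts the columns.
      for (i = 4; i <= NF; i++)
        if ($i ~ /^(OK|ACCESS_DENIED|REQUESTING_CONFIGURATION|NOT_FOUND|PORT_ERROR)$/) {
          status = $i
          break
        }
    }
    END { print (status == "" ? "NOT_JOINED" : status) }
  '
}
```

Usage: `zerotier-cli listnetworks | zt_network_status 6ab565387a4b9177`.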

RELAY Instead of DIRECT

Symptoms: zerotier-cli listpeers shows RELAY

Causes:

  1. Firewall blocking UDP 9993
  2. Strict NAT

Solutions:

# Check firewall allows ZeroTier
sudo ufw status | grep 9993

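# Allow ZeroTier-managed addresses and routes in global (public) IP space.
# Note: allowGlobal governs managed IPs and routes, not the transport path.
sudo zerotier-cli set 6ab565387a4b9177 allowGlobal=true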
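
To list only the relayed peers, the listpeers output can be filtered. A sketch, assuming each peer line starts with `200 listpeers <address>` and carries the word RELAY somewhere after it (column order differs between ZeroTier versions); `relayed_peers` is a hypothetical helper name.

```shell
# Print the ZeroTier address of every peer whose link is RELAY,
# reading `zerotier-cli listpeers` output from stdin.
relayed_peers() {
  awk '$1 == "200" && $2 == "listpeers" {
    # Look for the RELAY token in any column after the address.
    for (i = 4; i <= NF; i++)
      if ($i == "RELAY") { print $3; break }
  }'
}
```

Usage: `zerotier-cli listpeers | relayed_peers`. An empty result means no peer is currently relayed.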

DNS Problems

PowerDNS Unreachable

Symptoms: dig @10.10.10.10 times out

# Check ZeroTier connectivity to bogart
ping 10.10.10.10
ping 192.168.194.x  # bogart's ZT IP

# SSH to bogart and check service
ssh joe@bogart
sudo systemctl status pdns

Gateway Not Forwarding

Symptoms: Local dig works, gateway dig fails

# Check Unbound forward config
ssh joe@192.168.194.10  # gateway
cat /usr/local/etc/unbound.opnsense.d/dot.conf

# Should contain:
# forward-zone:
#   name: "scandora.net."
#   forward-addr: 10.10.10.10

# Restart Unbound
sudo configctl unbound restart
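
The expected forward-zone entry can be checked without reading the file by eye. A minimal sketch that handles only the simple layout shown above (one directive per line); `has_forward` is a hypothetical helper name.

```shell
# Succeed if the Unbound config on stdin has a forward-zone whose name
# is the given zone and which lists the given forward-addr.
has_forward() {
  awk -v zone="$1" -v addr="$2" '
    # A new forward-zone clause starts: reset state.
    /^[ \t]*forward-zone:/ { in_fz = 1; in_zone = 0; next }
    # Any other bare "clause:" line (e.g. server:) ends the forward-zone.
    /^[ \t]*[a-z-]+:[ \t]*$/ { in_fz = 0; in_zone = 0 }
    in_fz && $1 == "name:" { gsub(/"/, "", $2); in_zone = ($2 == zone) }
    in_zone && $1 == "forward-addr:" && $2 == addr { ok = 1 }
    END { exit !ok }
  '
}
```

Usage: `ssh joe@192.168.194.10 "cat /usr/local/etc/unbound.opnsense.d/dot.conf" | has_forward scandora.net. 10.10.10.10 || echo "forward-zone missing"`.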

DHCP Names Not Resolving

Symptoms: New DHCP clients not in DNS

# Check watcher service on gateway
ssh joe@192.168.194.10
sudo service pdns_dhcp_watcher status

# Check lease file
cat /var/dhcpd/var/db/dhcpd.leases

# Restart watcher
sudo service pdns_dhcp_watcher restart
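
To see which hostnames the lease file actually holds, the leases can be reduced to name and address pairs and then compared against DNS. A sketch, assuming the standard ISC dhcpd lease format; `lease_hostnames` is a hypothetical helper name.

```shell
# Print "hostname ip" for each lease that recorded a client hostname,
# reading dhcpd.leases content from stdin. The lease file keeps history,
# so duplicates are collapsed with sort -u.
lease_hostnames() {
  awk '
    $1 == "lease" { ip = $2 }
    $1 == "client-hostname" {
      name = $2
      gsub(/[";]/, "", name)   # strip the quotes and trailing semicolon
      print name, ip
    }
  ' | sort -u
}
```

Usage: `ssh joe@192.168.194.10 "cat /var/dhcpd/var/db/dhcpd.leases" | lease_hostnames`, then test a listed name with `dig @10.10.10.10 NAME.scandora.net`.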

Service Problems

Service Won't Start

# Check status
sudo systemctl status SERVICE

# View recent logs
sudo journalctl -u SERVICE -n 50

# Check configuration
sudo SERVICE --check-config  # if supported; the flag varies by service (e.g. nginx -t, sshd -t)

Service Crashing

# View crash logs
sudo journalctl -u SERVICE --since "1 hour ago"

# Check for core dumps
ls -la /var/crash/

# Monitor in real-time
sudo journalctl -u SERVICE -f

Gateway Problems (OPNsense)

Service Commands

OPNsense is FreeBSD-based, so it uses configctl and service rather than systemctl:

# Service control
sudo configctl service status
sudo configctl SERVICE restart

# View logs
clog -f /var/log/system.log
clog -f /var/log/zerotier.log

# Restart entire gateway (careful!)
sudo reboot

Web Interface Unreachable

# SSH in and restart the web GUI (lighttpd)
ssh joe@192.168.194.10
sudo configctl webgui restart

# Check if port 443 is listening
sockstat -4l | grep 443

Config Corruption

If config.xml is corrupted:

# Restore from Git backup
# (git needs a full URL; assumes config.xml sits at the repository root)
git clone https://github.com/scandora/opnsense-owl.git
scp opnsense-owl/config.xml joe@192.168.194.10:/tmp/
ssh joe@192.168.194.10
sudo cp /tmp/config.xml /conf/config.xml
sudo reboot
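
Before overwriting /conf/config.xml, it is worth checking that the replacement file is not itself truncated. The sketch below is only a heuristic, not XML validation: it assumes the OPNsense root element is `<opnsense>`, and `looks_like_opnsense_config` is a hypothetical helper name.

```shell
# Heuristic pre-flight check for a restored config.xml: the file must be
# non-empty and contain both the opening and closing <opnsense> root tags.
# This catches empty or truncated files, not malformed XML in general.
looks_like_opnsense_config() {
  [ -s "$1" ] &&
    grep -q '<opnsense>' "$1" &&
    grep -q '</opnsense>' "$1"
}
```

Usage: `looks_like_opnsense_config /tmp/config.xml && sudo cp /tmp/config.xml /conf/config.xml`.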

Cloud Provider Issues

AWS Instance Unreachable

# Check instance status
aws ec2 describe-instance-status --instance-ids i-xxx

# Get console output
aws ec2 get-console-output --instance-id i-xxx --output text

# Reboot if needed
aws ec2 reboot-instances --instance-ids i-xxx

GCE Instance Unreachable

# Check instance status
gcloud compute instances describe INSTANCE --zone=ZONE

# Get serial console
gcloud compute instances get-serial-port-output INSTANCE --zone=ZONE

# Reset instance
gcloud compute instances reset INSTANCE --zone=ZONE

Getting Help

  1. Check logs first - Most issues leave traces
  2. Try emergency access - SSM/IAP can usually reach a running instance even when SSH is down, provided the SSM agent or IAP firewall rule is in place
  3. Check ZeroTier Central - Network status and member authorization
  4. Review recent changes - git log for what changed recently