
Ansible Roles

Overview

Ansible handles configuration management for all cloud instances after Terraform provisioning.

Available Roles

| Role | Description | Hosts |
|------|-------------|-------|
| base | OS setup, packages, users, SSH hardening, fail2ban | All |
| dotfiles | Standard home directory layout | All |
| zerotier | ZeroTier VPN client | All |
| docker | Docker installation | Selected |
| internal-dns | DNS routing to PowerDNS | Selected |
| ddns | Cloudflare + PowerDNS DDNS | All |
| cloudflared | Cloudflare Zero Trust tunnels | Selected |
| cloudsql-client | Cloud SQL Auth Proxy | GCE with Cloud SQL |
| opnsense | OPNsense gateway config via REST API | Gateways (owl, blue) |
| cloudwatch | AWS CloudWatch agent | AWS only |
| powerdns | PowerDNS server | bogart only |

Base Role

The base role configures:

Security

  • SSH hardening (key-only, no root, limited attempts)
  • fail2ban with sshd jail
  • MaxAuthTries: 3
  • LoginGraceTime: 30

Users

  • Creates joe user with passwordless sudo
  • Deploys SSH authorized keys

Packages

  • Common utilities (vim, htop, curl, etc.)
  • 1Password CLI (on trusted hosts)

Dotfiles Role

Standardizes home directory layout:

/home/joe/
├── .aws/                    # AWS credentials (if needed)
├── .bashrc                  # Standard bashrc
├── .bash_aliases            # Custom aliases
├── .gitconfig               # Git configuration
├── .ssh/
│   ├── authorized_keys      # SSH public keys
│   └── config               # SSH client config
├── .config/
│   └── gcloud/              # GCE credentials (if needed)
├── src/                     # Source code directory
│   └── scandora.net/        # This repo (cloned)
└── bin/                     # Personal scripts
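
The directories in this layout could be created with a single looped task. The sketch below assumes a `joe` group exists and picks conventional permission modes; neither detail is stated in the layout above.

```yaml
# Hypothetical task: create the directories from the layout above.
- name: Create standard directories under /home/joe
  become: true
  ansible.builtin.file:
    path: "/home/joe/{{ item.path }}"
    state: directory
    owner: joe
    group: joe            # assumed group name
    mode: "{{ item.mode }}"
  loop:
    - { path: .ssh, mode: "0700" }     # sshd rejects group- or world-writable .ssh
    - { path: .config, mode: "0755" }
    - { path: src, mode: "0755" }
    - { path: bin, mode: "0755" }
```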

Cloudflared Role

Installs a Cloudflare Zero Trust tunnel for secure SSH access without exposed ports.

See Cloudflare Zero Trust for details.

Cloud SQL Client Role

Installs the Cloud SQL Auth Proxy for secure PostgreSQL connectivity.

Features

  • Downloads Cloud SQL Auth Proxy binary
  • Creates systemd service with security hardening
  • Listens on localhost:5432
  • Health check on port 9090
  • IAM authentication (no database password in config)
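
After a deployment, the two listeners above can be verified from Ansible itself. This is a sketch under assumptions: it presumes the proxy runs with its HTTP health checks enabled and serves a `/readiness` endpoint on port 9090, and the retry counts are arbitrary.

```yaml
# Hypothetical post-deploy checks for the Cloud SQL Auth Proxy.
- name: Confirm the proxy is listening on the PostgreSQL port
  ansible.builtin.wait_for:
    host: 127.0.0.1
    port: "{{ cloudsql_proxy_port | default(5432) }}"
    timeout: 30

- name: Wait for the health check endpoint to report ready
  ansible.builtin.uri:
    url: http://localhost:9090/readiness  # assumed endpoint path
    status_code: 200
  register: cloudsql_health
  until: cloudsql_health.status == 200
  retries: 10
  delay: 3
```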

Variables

# Required
cloudsql_connection_name: "scandoraproject:us-central1:scandora-postgres"

# Optional
cloudsql_enabled: true
cloudsql_proxy_port: 5432
cloudsql_proxy_version: "2.14.3"

Usage

# Deploy to dumbo
ansible-playbook -i inventory/dumbo.yml playbooks/site.yml --tags cloudsql

See Cloud SQL (PostgreSQL) for details.

OPNsense Role

Manages OPNsense gateways via the REST API using the oxlorg.opnsense and puzzle.opnsense Ansible collections.

Subsystems (~78% of Owl config managed)

| Tag | Task File | Description |
|-----|-----------|-------------|
| system | system.yml | Security reminders, interface verification |
| interfaces | interfaces.yml | Interface assignment verification (read-only) |
| packages | packages.yml | Plugin installation (ZeroTier, SNMP, themes) |
| firewall | firewall.yml | Rules, aliases, GeoIP blocking, savepoint pattern |
| dhcp | dhcp.yml | Kea DHCP4 subnet + static MAC-to-IP reservations |
| dns | dns.yml | Unbound: DNSSEC, forwarding, host overrides, DoT |
| zerotier | zerotier.yml | Overlay network join + local config |
| ipv6-tunnel | ipv6-tunnel.yml | Hurricane Electric 6in4 GIF interface |
| ids | ids.yml | Suricata IDS/IPS config + ruleset management |
| syslog | syslog.yml | Remote syslog destinations |
| monitoring | monitoring.yml | SNMP + node_exporter on ZeroTier IP |
| gateways | gateways.yml | WAN and IPv6 tunnel gateway definitions |
| users | users.yml | User accounts and group membership |
| monit | monit.yml | Monit alerts, tests, and service monitoring |
| system-identity | system-identity.yml | Hostname, domain, timezone (via SSH; disabled on 26.1) |
| sysctl | config-xml-sysctl.yml | Sysctl tunables (via SSH, config.xml) |
| ssh-hardening | config-xml-ssh-hardening.yml | SSH KEX, ciphers, MACs, root login (via SSH, config.xml) |
| sudo | config-xml-sudo.yml | Sudo group + wheel config (via SSH, config.xml) |
| webgui | config-xml-webgui.yml | Web GUI listen interfaces (via SSH, config.xml) |
| gdrive-cleanup | config-xml-gdrive-cleanup.yml | Remove stale Google Drive backup config |
| fw-cleanup | config-xml-fw-cleanup.yml | Remove legacy firewall rules from config.xml |
| git-backup | config-xml-git-backup.yml | Git backup plugin config (via SSH, config.xml) |

Usage

OPNsense uses a dedicated credential helper that retrieves API keys from 1Password (production) or directly from the GCE host (dev):

cd /Users/joe/src/scandora.net/cloud/ansible

# Full deployment
./scripts/run-opnsense.sh owl --prod

# Specific subsystems
./scripts/run-opnsense.sh owl --prod --tags dhcp,dns

# Dry run
./scripts/run-opnsense.sh owl --prod --check

# Deploy via WAN IP (when ZeroTier is down)
./scripts/run-opnsense.sh owl --prod --wan

# Dev VM testing (4-NIC GCE/KVM, mirrors Owl production layout) — no --prod
# First bring up the dev VM: ./scripts/opnsense-dev/dev-up.sh
./scripts/run-opnsense.sh opnsense-dev
./scripts/run-opnsense.sh opnsense-dev --tags firewall,dhcp,dns,ids
# Tear down when done: ./scripts/opnsense-dev/dev-down.sh

Production Deployment Status

| Target | Status | Date | Notes |
|--------|--------|------|-------|
| Owl | Deployed | 2026-02-10 | 18 tags, ~78% coverage, 15 FW rules, 28 DHCP reservations |
| Blue | Planned | | Needs inventory + Starlink-specific vars |

API Limitations (OPNsense 26.1)

Some settings have no REST API. These are managed via SSH using two patterns:

  • puzzle.opnsense modules — for hostname/domain/timezone (system-identity tag) — disabled on 26.1, pending Rosa-Luxemburg conversion
  • Rosa-Luxemburg pattern — fetch config.xml, edit locally with community.general.xml, push back. Used for sysctl, ssh-hardening, sudo, webgui, gdrive-cleanup, fw-cleanup, and git-backup tags
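
The fetch, edit, push cycle of the Rosa-Luxemburg pattern can be sketched as a short play. Everything specific here is an assumption for illustration: the `local_config` variable, the placeholder XPath and value, and the omission of whatever reload step the real task files run after the push. `community.general.xml` also needs `lxml` installed on the control node.

```yaml
# Hypothetical play showing the fetch / edit locally / push back cycle.
- name: Edit config.xml out of band (sketch)
  hosts: owl
  gather_facts: false
  vars:
    local_config: "/tmp/{{ inventory_hostname }}-config.xml"  # assumed scratch path
  tasks:
    - name: Fetch config.xml from the gateway
      ansible.builtin.fetch:
        src: /conf/config.xml
        dest: "{{ local_config }}"
        flat: true

    - name: Edit the local copy
      community.general.xml:
        path: "{{ local_config }}"
        xpath: /opnsense/system/example_setting  # placeholder element, not a real key
        value: "1"
      delegate_to: localhost
      become: false
      register: config_edit

    - name: Push the file back only when the edit changed it
      ansible.builtin.copy:
        src: "{{ local_config }}"
        dest: /conf/config.xml
        backup: true
      when: config_edit is changed
```

Gating the push on `config_edit is changed` keeps reruns idempotent, since an unchanged local copy never overwrites the live configuration.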

Remaining unmanaged (no API, no config.xml mapping):

  • Interface assignments (creation/modification)

Known Quirks

  • oxlorg.opnsense.raw POST tasks always report changed — can't compare state, so idempotency reruns show ~12 changed tasks
  • ids_general module broken on 26.1 — maps blockips but API now uses mode; IDS config uses raw API instead
  • DNSBL model restructured in 26.1 — old type names invalid; configure manually via web UI
  • puzzle.opnsense doesn't support 26.1: system-identity tag disabled; hostname/domain/timezone must be set manually or via Rosa-Luxemburg conversion
  • ZeroTier service/reconfigure returns 404 — no ServiceController.php; handled with failed_when: false
  • Kea HA must be disabled — enabled by default with no peers, crashes kea-dhcp4; dhcp.yml sets ha.enabled: "0"
  • Extra-vars override inventory — 1Password credentials from run-opnsense.sh have highest precedence; use opn_skip_packages guards for plugin-dependent tasks on dev
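
The `failed_when: false` handling for the ZeroTier 404 can be illustrated as follows. This sketch swaps in `ansible.builtin.uri` for the role's own `oxlorg.opnsense.raw` calls, and the endpoint path and the `opn_api_key` / `opn_api_secret` variable names are assumptions, not taken from the role.

```yaml
# Hypothetical task: tolerate the 404 from the ZeroTier reconfigure endpoint.
- name: Ask ZeroTier to reconfigure (may return 404 on 26.1)
  ansible.builtin.uri:
    url: "https://{{ ansible_host }}/api/zerotier/service/reconfigure"  # assumed path
    method: POST
    user: "{{ opn_api_key }}"          # assumed variable name
    password: "{{ opn_api_secret }}"   # assumed variable name
    force_basic_auth: true
  register: zt_reconfigure
  failed_when: false                   # a 404 here must not abort the play
  changed_when: zt_reconfigure.status == 200
```

Setting `changed_when` explicitly also avoids the always-changed reporting described in the first quirk, at least for calls whose response status is meaningful.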

Prerequisites

ansible-galaxy collection install puzzle.opnsense oxlorg.opnsense
pip install httpx

Playbooks

| Playbook | Description |
|----------|-------------|
| site.yml | Full site deployment (all roles) |
| base.yml | Base configuration only |
| opnsense.yml | OPNsense gateway configuration |
| dns-server.yml | PowerDNS server setup (bogart) |

Inventory

Production

# inventory/production.yml
all:
  children:
    aws:
      hosts:
        pluto:
          ansible_host: 52.32.80.62
        mickey:
          ansible_host: "{{ lookup('env', 'MICKEY_IP') }}"
    gce:
      hosts:
        dumbo:
          ansible_host: 34.44.33.3
        bogart:
          ansible_host: 35.209.219.216

Common Commands

Full Deployment

# Deploy everything to a host
ansible-playbook -i inventory/production.yml playbooks/site.yml --limit pluto

Specific Role

# Run only base role
ansible-playbook -i inventory/production.yml playbooks/base.yml --limit pluto

# Run with specific tags
ansible-playbook -i inventory/production.yml playbooks/site.yml \
  --limit pluto --tags cloudflared

With Extra Variables

# Pass secrets at runtime (never commit)
TOKEN=$(op item get "Cloudflare Tunnel Token - pluto" --fields credential --reveal)
ansible-playbook -i inventory/production.yml playbooks/site.yml \
  --limit pluto --tags cloudflared \
  -e cloudflared_tunnel_token="$TOKEN"

Dry Run

# Check what would change
ansible-playbook -i inventory/production.yml playbooks/site.yml \
  --limit pluto --check --diff

Secret Management

Guidelines

  1. Ansible playbooks: Pass secrets via --extra-vars or Ansible Vault at runtime
  2. Runtime scripts: Use 1Password service account on trusted hosts
  3. Bogart exception: No secrets stored - receives only non-sensitive config
  4. Never commit secrets: All credentials in 1Password, referenced by item name
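
On the playbook side, guideline 1 pairs well with an early check that the secret was supplied and with `no_log` on any task that handles it. The play below is a sketch, not the cloudflared role itself: the `creates` path is an assumption about where the service unit is installed.

```yaml
# Hypothetical play: consume a secret passed with -e without logging it.
- name: Cloudflared with a runtime secret (sketch)
  hosts: all
  tasks:
    - name: Fail early if the token was not supplied
      ansible.builtin.assert:
        that:
          - cloudflared_tunnel_token is defined
          - cloudflared_tunnel_token | length > 0
        fail_msg: "Pass cloudflared_tunnel_token with -e; it is never stored in the repo."
        quiet: true

    - name: Install the tunnel service with the token
      become: true
      ansible.builtin.command:
        cmd: "cloudflared service install {{ cloudflared_tunnel_token }}"
        creates: /etc/systemd/system/cloudflared.service  # assumed unit path
      no_log: true  # keep the token out of task output and logs
```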

1Password Service Account

Trusted hosts have access to the scandora-full service account:

# On trusted host
export OP_SERVICE_ACCOUNT_TOKEN=$(sudo cat /etc/op-service-account.token)
op item get "Item Name" --vault scandora.net --fields credential

| Host | Token Location | Status |
|------|----------------|--------|
| Pluto | /etc/op-service-account.token | ✅ Active |
| Dumbo | /etc/op-service-account.token | ✅ Active |

Implementation Status

Phase 2: Ansible Configuration

  • Create base role (packages, users, sudo)
  • Create dotfiles role
  • Create zerotier role (skeleton exists)
  • Create internal-dns role (skeleton exists)
  • Create ddns role (skeleton exists)
  • Test full provisioning: terraform + ansible

Phase 3: OPNsense IaC

  • Create opnsense role (9 subsystem tags)
  • Validate against dev VM (8 bugs found and fixed)
  • Deploy to production Owl (6 additional bugs found and fixed)
  • Expand to ~70% coverage (14 tags, 5 more bugs fixed)
  • Harden dev workflow (Terraform startup_script, API pre-check)
  • Config.xml SSH play (sysctl, SSH hardening, sudo via Rosa-Luxemburg pattern)
  • Golden image bootstrap (joe user + sudo + SSH pre-baked, ~35s quickstart)
  • Golden image rebuild + full 18-tag validation (2026-02-11)
  • Deploy to Blue gateway
  • Convert system-identity to Rosa-Luxemburg pattern (puzzle.opnsense 26.1 incompatible)
  • MCP server validation (vespo92/OPNSenseMCP)