Ansible Roles¶
Overview¶
Ansible handles configuration management for all cloud instances after Terraform provisioning.
Available Roles¶
| Role | Description | Hosts |
|---|---|---|
| `base` | OS setup, packages, users, SSH hardening, fail2ban | All |
| `dotfiles` | Standard home directory layout | All |
| `zerotier` | ZeroTier VPN client | All |
| `docker` | Docker installation | Selected |
| `internal-dns` | DNS routing to PowerDNS | Selected |
| `ddns` | Cloudflare + PowerDNS DDNS | All |
| `cloudflared` | Cloudflare Zero Trust tunnels | Selected |
| `cloudsql-client` | Cloud SQL Auth Proxy | GCE with Cloud SQL |
| `opnsense` | OPNsense gateway config via REST API | Gateways (owl, blue) |
| `cloudwatch` | AWS CloudWatch agent | AWS only |
| `powerdns` | PowerDNS server | bogart only |
Base Role¶
The base role configures:
Security¶
- SSH hardening (key-only, no root, limited attempts)
- fail2ban with sshd jail
- MaxAuthTries: 3
- LoginGraceTime: 30
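As a rough illustration of the SSH settings listed above, the following play sets them with `lineinfile`. It is a minimal sketch, not the actual base role: the play layout, the handler name, and the `sshd` service name (it is `ssh` on some distributions) are assumptions.

```yaml
# Illustrative sketch only; the real base role may use a template instead.
- name: Harden sshd
  hosts: all
  become: true
  tasks:
    - name: Set sshd options (key-only, no root, limited attempts)
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?\s*{{ item.key }}\s'
        line: "{{ item.key }} {{ item.value }}"
        # Refuse to write a config that sshd cannot parse.
        validate: "/usr/sbin/sshd -t -f %s"
      loop:
        - { key: PasswordAuthentication, value: "no" }
        - { key: PermitRootLogin, value: "no" }
        - { key: MaxAuthTries, value: "3" }
        - { key: LoginGraceTime, value: "30" }
      notify: Restart sshd

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd        # assumption: service name varies by distribution
        state: restarted
```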
Users¶
- Creates `joe` user with passwordless sudo
- Deploys SSH authorized keys
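The user setup could look roughly like the tasks below. This is a sketch under assumptions: the sudoers drop-in path and the public key file `files/joe.pub` are hypothetical, and the tasks are expected to run with `become: true`.

```yaml
# Illustrative tasks; file names and paths are assumptions.
- name: Create the joe user
  ansible.builtin.user:
    name: joe
    shell: /bin/bash

- name: Grant passwordless sudo via a validated drop-in
  ansible.builtin.copy:
    dest: /etc/sudoers.d/joe          # hypothetical drop-in name
    content: "joe ALL=(ALL) NOPASSWD:ALL\n"
    owner: root
    group: root
    mode: "0440"
    # visudo rejects a broken file before it is installed.
    validate: "visudo -cf %s"

- name: Deploy SSH authorized keys
  ansible.posix.authorized_key:
    user: joe
    key: "{{ lookup('file', 'files/joe.pub') }}"   # hypothetical key file
```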
Packages¶
- Common utilities (vim, htop, curl, etc.)
- 1Password CLI (on trusted hosts)
Dotfiles Role¶
Standardizes home directory layout:
/home/joe/
├── .aws/ # AWS credentials (if needed)
├── .bashrc # Standard bashrc
├── .bash_aliases # Custom aliases
├── .gitconfig # Git configuration
├── .ssh/
│ ├── authorized_keys # SSH public keys
│ └── config # SSH client config
├── .config/
│ └── gcloud/ # GCE credentials (if needed)
├── src/ # Source code directory
│ └── scandora.net/ # This repo (cloned)
└── bin/ # Personal scripts
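A minimal sketch of how the directory part of this layout might be enforced. The modes and the `joe` group are assumptions, not values taken from the role.

```yaml
# Illustrative task; modes and group name are assumptions.
- name: Create standard home directories
  ansible.builtin.file:
    path: "/home/joe/{{ item.path }}"
    state: directory
    owner: joe
    group: joe
    mode: "{{ item.mode }}"
  loop:
    - { path: .ssh, mode: "0700" }
    - { path: .config, mode: "0755" }
    - { path: src, mode: "0755" }
    - { path: bin, mode: "0755" }
```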
Cloudflared Role¶
Installs a Cloudflare Zero Trust tunnel for secure SSH access without exposed ports.
See Cloudflare Zero Trust for details.
Cloud SQL Client Role¶
Installs the Cloud SQL Auth Proxy for secure PostgreSQL connectivity.
Features¶
- Downloads Cloud SQL Auth Proxy binary
- Creates systemd service with security hardening
- Listens on `localhost:5432`
- Health check on port 9090
- IAM authentication (no database password in config)
Variables¶
# Required
cloudsql_connection_name: "scandoraproject:us-central1:scandora-postgres"
# Optional
cloudsql_enabled: true
cloudsql_proxy_port: 5432
cloudsql_proxy_version: "2.14.3"
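To show how these variables might feed the service, here is a sketch of tasks that install a hardened systemd unit and wait for the proxy's readiness endpoint. The binary path, unit name, and the specific hardening directives are assumptions; the role's real unit may differ.

```yaml
# Illustrative tasks; binary path, unit name, and hardening options are assumptions.
- name: Install a systemd unit for the Cloud SQL Auth Proxy
  ansible.builtin.copy:
    dest: /etc/systemd/system/cloud-sql-proxy.service
    mode: "0644"
    content: |
      [Unit]
      Description=Cloud SQL Auth Proxy
      After=network-online.target
      Wants=network-online.target

      [Service]
      DynamicUser=true
      ExecStart=/usr/local/bin/cloud-sql-proxy \
        --address 127.0.0.1 --port {{ cloudsql_proxy_port }} \
        --auto-iam-authn --health-check --http-port 9090 \
        {{ cloudsql_connection_name }}
      Restart=on-failure
      NoNewPrivileges=true
      ProtectSystem=strict
      PrivateTmp=true

      [Install]
      WantedBy=multi-user.target

- name: Enable and start the proxy
  ansible.builtin.systemd:
    name: cloud-sql-proxy
    enabled: true
    state: started
    daemon_reload: true

- name: Wait until the health check reports ready
  ansible.builtin.uri:
    url: http://127.0.0.1:9090/readiness
  register: proxy_ready
  retries: 5
  delay: 3
  until: proxy_ready.status == 200
```

With `--auto-iam-authn`, the proxy authenticates as the instance's IAM identity, which is why no database password appears in the unit.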
Usage¶
See Cloud SQL (PostgreSQL) for details.
OPNsense Role¶
Manages OPNsense gateways via the REST API using oxlorg.opnsense and puzzle.opnsense Ansible collections.
Subsystems (~78% of Owl config managed)¶
| Tag | Task File | Description |
|---|---|---|
| `system` | `system.yml` | Security reminders, interface verification |
| `interfaces` | `interfaces.yml` | Interface assignment verification (read-only) |
| `packages` | `packages.yml` | Plugin installation (ZeroTier, SNMP, themes) |
| `firewall` | `firewall.yml` | Rules, aliases, GeoIP blocking, savepoint pattern |
| `dhcp` | `dhcp.yml` | Kea DHCP4 subnet + static MAC-to-IP reservations |
| `dns` | `dns.yml` | Unbound: DNSSEC, forwarding, host overrides, DoT |
| `zerotier` | `zerotier.yml` | Overlay network join + local config |
| `ipv6-tunnel` | `ipv6-tunnel.yml` | Hurricane Electric 6in4 GIF interface |
| `ids` | `ids.yml` | Suricata IDS/IPS config + ruleset management |
| `syslog` | `syslog.yml` | Remote syslog destinations |
| `monitoring` | `monitoring.yml` | SNMP + node_exporter on ZeroTier IP |
| `gateways` | `gateways.yml` | WAN and IPv6 tunnel gateway definitions |
| `users` | `users.yml` | User accounts and group membership |
| `monit` | `monit.yml` | Monit alerts, tests, and service monitoring |
| `system-identity` | `system-identity.yml` | Hostname, domain, timezone (via SSH; disabled on 26.1) |
| `sysctl` | `config-xml-sysctl.yml` | Sysctl tunables (via SSH, config.xml) |
| `ssh-hardening` | `config-xml-ssh-hardening.yml` | SSH KEX, ciphers, MACs, root login (via SSH, config.xml) |
| `sudo` | `config-xml-sudo.yml` | Sudo group + wheel config (via SSH, config.xml) |
| `webgui` | `config-xml-webgui.yml` | Web GUI listen interfaces (via SSH, config.xml) |
| `gdrive-cleanup` | `config-xml-gdrive-cleanup.yml` | Remove stale Google Drive backup config |
| `fw-cleanup` | `config-xml-fw-cleanup.yml` | Remove legacy firewall rules from config.xml |
| `git-backup` | `config-xml-git-backup.yml` | Git backup plugin config (via SSH, config.xml) |
Usage¶
OPNsense uses a dedicated credential helper that retrieves API keys from 1Password (production) or directly from the GCE host (dev):
cd /Users/joe/src/scandora.net/cloud/ansible
# Full deployment
./scripts/run-opnsense.sh owl --prod
# Specific subsystems
./scripts/run-opnsense.sh owl --prod --tags dhcp,dns
# Dry run
./scripts/run-opnsense.sh owl --prod --check
# Deploy via WAN IP (when ZeroTier is down)
./scripts/run-opnsense.sh owl --prod --wan
# Dev VM testing (4-NIC GCE/KVM, mirrors Owl production layout) — no --prod
# First bring up the dev VM: ./scripts/opnsense-dev/dev-up.sh
./scripts/run-opnsense.sh opnsense-dev
./scripts/run-opnsense.sh opnsense-dev --tags firewall,dhcp,dns,ids
# Tear down when done: ./scripts/opnsense-dev/dev-down.sh
Production Deployment Status¶
| Target | Status | Date | Notes |
|---|---|---|---|
| Owl | Deployed | 2026-02-10 | 18 tags, ~78% coverage, 15 FW rules, 28 DHCP reservations |
| Blue | Planned | — | Needs inventory + Starlink-specific vars |
API Limitations (OPNsense 26.1)¶
Some settings have no REST API. These are managed via SSH using two patterns:
- `puzzle.opnsense` modules: used for hostname/domain/timezone (`system-identity` tag). Disabled on 26.1, pending Rosa-Luxemburg conversion.
- Rosa-Luxemburg pattern: fetch config.xml, edit locally with `community.general.xml`, push back. Used for the `sysctl`, `ssh-hardening`, `sudo`, `webgui`, `gdrive-cleanup`, `fw-cleanup`, and `git-backup` tags.
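The fetch, edit, push cycle of the Rosa-Luxemburg pattern might be sketched as follows. The host group name, the local path, and the XPath and value being edited are chosen purely for illustration and are not taken from the role; `community.general.xml` needs `lxml` on the control node.

```yaml
# Illustrative play; group name, local path, XPath, and value are assumptions.
- name: Edit one config.xml value via fetch, local edit, push
  hosts: opnsense                     # hypothetical inventory group
  gather_facts: false
  vars:
    local_copy: "/tmp/{{ inventory_hostname }}-config.xml"
  tasks:
    - name: Fetch config.xml from the gateway
      ansible.builtin.fetch:
        src: /conf/config.xml
        dest: "{{ local_copy }}"
        flat: true

    - name: Edit the local copy
      community.general.xml:
        path: "{{ local_copy }}"
        xpath: /opnsense/system/timezone
        value: Etc/UTC
      delegate_to: localhost
      register: xml_edit

    - name: Push the file back only when the edit changed something
      ansible.builtin.copy:
        src: "{{ local_copy }}"
        dest: /conf/config.xml
        backup: true
      when: xml_edit is changed
```

Gating the push on `xml_edit is changed` keeps reruns idempotent. Any step needed to make OPNsense reload the modified configuration is not shown here.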
Remaining unmanaged (no API, no config.xml mapping):
- Interface assignments (creation/modification)
Known Quirks¶
- `oxlorg.opnsense.raw` POST tasks always report `changed`: the module can't compare state, so idempotency reruns show ~12 changed tasks
- `ids_general` module broken on 26.1: it maps `block` → `ips` but the API now uses `mode`; IDS config uses the `raw` API instead
- DNSBL model restructured in 26.1: old type names are invalid; configure manually via web UI
- `puzzle.opnsense` doesn't support 26.1: `system-identity` tag disabled; hostname/domain/timezone must be set manually or via Rosa-Luxemburg conversion
- ZeroTier `service/reconfigure` returns 404: no ServiceController.php; handled with `failed_when: false`
- Kea HA must be disabled: it is enabled by default with no peers, which crashes kea-dhcp4; `dhcp.yml` sets `ha.enabled: "0"`
- Extra-vars override inventory: 1Password credentials from `run-opnsense.sh` have highest precedence; use `opn_skip_packages` guards for plugin-dependent tasks on dev
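The `failed_when: false` handling for the ZeroTier 404 can be illustrated with a plain `ansible.builtin.uri` call. This is a sketch, not the role's task: the full endpoint path and the `opn_host`, `opn_api_key`, and `opn_api_secret` variable names are assumptions, and the role itself uses the `oxlorg.opnsense` collection rather than `uri`.

```yaml
# Illustrative task; endpoint path and variable names are assumptions.
- name: Ask OPNsense to reconfigure ZeroTier, tolerating the known 404
  ansible.builtin.uri:
    url: "https://{{ opn_host }}/api/zerotier/service/reconfigure"
    method: POST
    # OPNsense API credentials are sent as HTTP basic auth (key:secret).
    user: "{{ opn_api_key }}"
    password: "{{ opn_api_secret }}"
    force_basic_auth: true
  register: zt_reconfigure
  failed_when: false

- name: Report what the endpoint returned
  ansible.builtin.debug:
    msg: "ZeroTier reconfigure returned HTTP {{ zt_reconfigure.status }}"
```

A narrower alternative would be `status_code: [200, 404]`, which still fails on unexpected errors such as 401 or 500.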
Prerequisites¶
Playbooks¶
| Playbook | Description |
|---|---|
| `site.yml` | Full site deployment (all roles) |
| `base.yml` | Base configuration only |
| `opnsense.yml` | OPNsense gateway configuration |
| `dns-server.yml` | PowerDNS server setup (bogart) |
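For orientation, a site playbook that maps roles to host groups and exposes per-role tags might be laid out like this. It is a hypothetical sketch: the actual `site.yml` is not shown in this document, and the `cloudflared_hosts` group is an invented name for the "Selected" hosts.

```yaml
# Hypothetical layout; not the repository's actual site.yml.
- name: Roles applied to every host
  hosts: all
  become: true
  roles:
    - role: base
      tags: [base]
    - role: dotfiles
      tags: [dotfiles]
    - role: zerotier
      tags: [zerotier]
    - role: ddns
      tags: [ddns]

- name: Roles for selected hosts
  hosts: cloudflared_hosts            # hypothetical group name
  become: true
  roles:
    - role: cloudflared
      tags: [cloudflared]

- name: AWS-only roles
  hosts: aws
  become: true
  roles:
    - role: cloudwatch
      tags: [cloudwatch]
```

Tagging each role this way is what makes a `--tags cloudflared` run against a single host with `--limit` practical.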
Inventory¶
Production¶
# inventory/production.yml
all:
  children:
    aws:
      hosts:
        pluto:
          ansible_host: 52.32.80.62
        mickey:
          ansible_host: "{{ lookup('env', 'MICKEY_IP') }}"
    gce:
      hosts:
        dumbo:
          ansible_host: 34.44.33.3
        bogart:
          ansible_host: 35.209.219.216
Common Commands¶
Full Deployment¶
# Deploy everything to a host
ansible-playbook -i inventory/production.yml playbooks/site.yml --limit pluto
Specific Role¶
# Run only base role
ansible-playbook -i inventory/production.yml playbooks/base.yml --limit pluto
# Run with specific tags
ansible-playbook -i inventory/production.yml playbooks/site.yml \
--limit pluto --tags cloudflared
With Extra Variables¶
# Pass secrets at runtime (never commit)
TOKEN=$(op item get "Cloudflare Tunnel Token - pluto" --fields credential --reveal)
ansible-playbook -i inventory/production.yml playbooks/site.yml \
--limit pluto --tags cloudflared \
-e cloudflared_tunnel_token="$TOKEN"
Dry Run¶
# Check what would change
ansible-playbook -i inventory/production.yml playbooks/site.yml \
--limit pluto --check --diff
Secret Management¶
Guidelines¶
- Ansible playbooks: Pass secrets via `--extra-vars` or Ansible Vault at runtime
- Runtime scripts: Use 1Password service account on trusted hosts
- Bogart exception: No secrets stored - receives only non-sensitive config
- Never commit secrets: All credentials in 1Password, referenced by item name
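On the consuming side, a role can guard against a missing runtime secret and keep it out of logs. The tasks below are a sketch assuming the `cloudflared_tunnel_token` variable passed with `-e`; they are not the actual cloudflared role.

```yaml
# Illustrative tasks; not the actual cloudflared role.
- name: Fail early when the token was not passed at runtime
  ansible.builtin.assert:
    that:
      - cloudflared_tunnel_token is defined
      - cloudflared_tunnel_token | length > 0
    fail_msg: "Pass cloudflared_tunnel_token with -e at runtime"
    quiet: true

- name: Install the cloudflared service with the tunnel token
  ansible.builtin.command:
    cmd: "cloudflared service install {{ cloudflared_tunnel_token }}"
    # Skip on reruns once the unit file exists.
    creates: /etc/systemd/system/cloudflared.service
  no_log: true                        # keep the token out of task output
```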
1Password Service Account¶
Trusted hosts have access to the `scandora-full` service account:
# On trusted host
export OP_SERVICE_ACCOUNT_TOKEN=$(sudo cat /etc/op-service-account.token)
op item get "Item Name" --vault scandora.net --fields credential
| Host | Token Location | Status |
|---|---|---|
| Pluto | `/etc/op-service-account.token` | ✅ Active |
| Dumbo | `/etc/op-service-account.token` | ✅ Active |
Implementation Status¶
Phase 2: Ansible Configuration¶
- Create base role (packages, users, sudo)
- Create dotfiles role
- Create zerotier role (skeleton exists)
- Create internal-dns role (skeleton exists)
- Create ddns role (skeleton exists)
- Test full provisioning: terraform + ansible
Phase 3: OPNsense IaC¶
- Create opnsense role (9 subsystem tags)
- Validate against dev VM (8 bugs found and fixed)
- Deploy to production Owl (6 additional bugs found and fixed)
- Expand to ~70% coverage (14 tags, 5 more bugs fixed)
- Harden dev workflow (Terraform startup_script, API pre-check)
- Config.xml SSH play (sysctl, SSH hardening, sudo via Rosa-Luxemburg pattern)
- Golden image bootstrap (joe user + sudo + SSH pre-baked, ~35s quickstart)
- Golden image rebuild + full 18-tag validation (2026-02-11)
- Deploy to Blue gateway
- Convert system-identity to Rosa-Luxemburg pattern (puzzle.opnsense 26.1 incompatible)
- MCP server validation (vespo92/OPNSenseMCP)