Slack Notifications Integration¶
Complete guide to Slack notifications for scandora.net infrastructure.
Overview¶
The #scandora-notifications Slack channel receives automated notifications from:
- Prometheus AlertManager - Infrastructure monitoring alerts
- GitHub - Repository events (commits, PRs, releases, issues)
Architecture¶
┌─────────────────────┐
│ Prometheus/Alert │
│ Manager │
│ (192.168.194.131) │
└──────────┬──────────┘
│
│ HTTPS POST
▼
┌─────────────────────┐ ┌─────────────────────┐
│ Slack Incoming │ │ GitHub │
│ Webhook │◀─────│ Repository │
│ (hooks.slack.com) │ │ (scandora/...net) │
└──────────┬──────────┘ └─────────────────────┘
│
▼
┌─────────────────────┐
│ #scandora- │
│ notifications │
│ (Slack) │
└─────────────────────┘
1. Prometheus/AlertManager → Slack¶
Configuration Status¶
✅ Deployed and Active (as of 2026-02-13)
- AlertManager: Running on dumbo (192.168.194.131:9093)
- Webhook URL: Stored in 1Password (slack_webhook_scandora_notifications)
- Channel: #scandora-notifications
- Ansible Role: monitoring-stack
Alert Types¶
Warning Alerts (4-hour repeat):
- Host down (InstanceDown)
- High CPU/memory/disk usage
- Network issues
- ZeroTier connectivity problems
- Dev VM running too long (1 hour)
Critical Alerts (1-hour repeat):
- Multiple hosts down
- Critical resource exhaustion
- Gateway failures
- Dev VM running critically long (4 hours)
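The severity-based repeat intervals above map onto AlertManager routing roughly as follows. This is a minimal sketch, not the deployed configuration (which is rendered from alertmanager.yml.j2); the receiver name is illustrative:

```yaml
route:
  receiver: slack-notifications
  group_by: ['alertname', 'instance']
  routes:
    - matchers: ['severity = warning']
      receiver: slack-notifications
      repeat_interval: 4h    # warnings re-fire every 4 hours
    - matchers: ['severity = critical']
      receiver: slack-notifications
      repeat_interval: 1h    # criticals re-fire every hour
```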
Message Format¶
Warning Example:
Critical Example:
Resolved Example:
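The message examples above were not captured in this export. A Slack receiver producing messages in that spirit would look roughly like this; titles and template fields are illustrative, not the deployed values:

```yaml
receivers:
  - name: slack-notifications
    slack_configs:
      - api_url_file: /etc/alertmanager/slack_webhook  # URL from 1Password, never hardcoded
        channel: '#scandora-notifications'
        send_resolved: true                            # produces the "Resolved" messages
        title: '[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }}'
        text: '{{ .CommonAnnotations.summary }} on {{ .CommonLabels.instance }}'
```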
Testing Alerts¶
# Stop node_exporter on a host to trigger alert
ssh dumbo "sudo systemctl stop node-exporter"
# Check #scandora-notifications for alert (within 30s)
# Restore service
ssh dumbo "sudo systemctl start node-exporter"
# Check for resolved message (within 5m)
Configuration Files¶
- Ansible defaults: cloud/ansible/roles/monitoring-stack/defaults/main.yml
- AlertManager template: cloud/ansible/roles/monitoring-stack/templates/alertmanager.yml.j2
- Deployment script: cloud/ansible/scripts/run-monitoring.sh
- 1Password item: slack_webhook_scandora_notifications (scandora-automation vault)
Redeployment¶
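Redeployment uses the deployment script listed under Configuration Files:

```shell
cd cloud/ansible
./scripts/run-monitoring.sh dumbo deploy
```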
2. GitHub → Slack¶
Configuration Options¶
Three options available (choose one):
Option A: Native Slack GitHub App (Easiest)¶
Setup:
# In Slack
/github subscribe scandora/scandora.net
/github subscribe scandora/scandora.net commits:all pulls issues
Pros:
- Quick setup (one command)
- Rich formatting with previews
- Built-in features (unfurl, threading)
Cons:
- Not managed via IaC
- Requires Slack app permission
- Less control over formatting
Option B: Manual GitHub Webhook¶
⚠️ Does not work with Slack incoming webhooks - requires transformation middleware
GitHub's webhook payload format is incompatible with Slack's incoming webhook format. This option only works if you have a custom endpoint (Lambda, Cloud Function) that transforms the payload.
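As an illustration of the transformation a middleware must perform (this is not an existing component), the mapping can be sketched with jq: take fields from GitHub's push event payload and emit the `{"text": ...}` body Slack incoming webhooks expect. The filter and sample payload are hypothetical:

```shell
# Abridged sample of GitHub's push event payload
cat > /tmp/github-push.json <<'EOF'
{"pusher": {"name": "alice"}, "ref": "refs/heads/main",
 "commits": [{"message": "fix: typo"}]}
EOF

# Transform into the {"text": ...} shape Slack incoming webhooks accept
jq -c '{text: "\(.pusher.name) pushed \(.commits | length) commit(s) to \(.ref | sub("refs/heads/"; ""))"}' \
  /tmp/github-push.json
# → {"text":"alice pushed 1 commit(s) to main"}
```

A Lambda or Cloud Function doing this same mapping, then POSTing the result to the Slack webhook URL, is what Options B and C require.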
Option C: Terraform-Managed Webhook (Requires Middleware)¶
Setup:
# 1. Create GitHub PAT (admin:repo_hook scope)
# https://github.com/settings/tokens
# 2. Store in 1Password
op item create \
--category="API Credential" \
--title="GitHub Personal Access Token - Terraform" \
--vault="scandora.net" \
credential[password]="<token>"
# 3. Load credentials
source scripts/terraform/tf-github-slack-env.sh
# 4. Deploy
cd cloud/terraform/environments/production/network/github-slack
terraform init
terraform plan
terraform apply
Pros:
- ✅ Infrastructure as Code
- ✅ Version controlled
- ✅ Auditable changes
- ✅ Reproducible
Cons:
- More initial setup
- Requires GitHub PAT management
Events Sent to Slack¶
When using webhooks (Options B or C):
- push - Commits to any branch
- pull_request - PRs created, merged, closed
- release - New releases published
- issues - Issues opened, closed
- issue_comment - PR/issue comments
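Under Option C, these events are wired up via Terraform's github_repository_webhook resource. A minimal sketch (attribute values are illustrative; the real module lives in cloud/terraform/modules/github-slack-webhook/):

```hcl
resource "github_repository_webhook" "slack" {
  repository = "scandora.net"
  active     = true
  events     = ["push", "pull_request", "release", "issues", "issue_comment"]

  configuration {
    url          = var.middleware_url  # transformation endpoint, not the raw Slack URL
    content_type = "json"
    insecure_ssl = false
  }
}
```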
Testing GitHub Integration¶
# After setup, test with empty commit
git commit --allow-empty -m "test: verify Slack notifications"
git push origin main
# Check #scandora-notifications within seconds
Webhook Management¶
Credentials Storage¶
All webhook credentials stored in 1Password:
| Item | Vault | Field | Purpose |
|---|---|---|---|
| slack_webhook_scandora_notifications | scandora-automation | webhook_url | AlertManager + GitHub |
| GitHub Personal Access Token - Terraform | scandora.net | credential | Terraform GitHub provider |
Rotating Webhook URL¶
If webhook URL needs to change (compromised, new app, etc.):
# 1. Create new webhook in Slack (see Setup section)
# 2. Update 1Password
op item edit "slack_webhook_scandora_notifications" \
--vault scandora-automation \
webhook_url[password]="<new-webhook-url>"
# 3. Redeploy AlertManager
cd cloud/ansible
./scripts/run-monitoring.sh dumbo deploy
# 4. Update GitHub webhook (if using Terraform)
source scripts/terraform/tf-github-slack-env.sh
cd cloud/terraform/environments/production/network/github-slack
terraform apply
# 5. Test both integrations
Webhook Security¶
Best Practices:
- ✅ Never commit webhook URLs to git
- ✅ Store in 1Password with limited access
- ✅ Use separate webhook per environment (dev/prod) if needed
- ✅ Rotate periodically (annually recommended)
- ✅ Monitor Slack audit logs for unusual activity
Access Control:
- Webhook URL allows anyone to post to channel
- Treat as sensitive credential
- If exposed, rotate immediately
Troubleshooting¶
AlertManager Not Sending Notifications¶
Check AlertManager status:
Check configuration:
Check logs:
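The check commands were not captured in this export. Equivalents, assuming AlertManager runs as a systemd unit named alertmanager (adjust to the actual deployment):

```shell
# Service status (assumed unit name)
ssh dumbo "sudo systemctl status alertmanager"
# Validate the rendered configuration with amtool
ssh dumbo "amtool check-config /opt/monitoring/alertmanager/alertmanager.yml"
# Recent logs
ssh dumbo "sudo journalctl -u alertmanager --since '1 hour ago'"
```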
Test webhook manually:
WEBHOOK_URL=$(op item get "slack_webhook_scandora_notifications" \
--vault scandora-automation --fields webhook_url --reveal)
curl -X POST -H 'Content-type: application/json' \
--data '{"text":"Test from curl"}' \
"$WEBHOOK_URL"
GitHub Webhook Not Firing¶
Check webhook deliveries:
- Go to: https://github.com/scandora/scandora.net/settings/hooks
- Click on webhook
- View "Recent Deliveries"
- Check for 200 OK responses
Common issues:
- Webhook URL incorrect (verify in 1Password)
- Events not selected (push, pulls, etc.)
- Webhook disabled/inactive
- SSL verification failing (leave verification enabled; hooks.slack.com serves a valid certificate)
Re-trigger webhook:
- Find failed delivery in Recent Deliveries
- Click "Redeliver"
- Check Slack for message
Monitoring and Alerts¶
Notification Volume¶
Expected notification frequency:
- GitHub: 5-20/day (depending on development activity)
- Prometheus: 0-5/day normally (more during incidents)
Alert Fatigue Prevention¶
- Alerts grouped by severity (warning vs critical)
- Repeat intervals: 4hr warning, 1hr critical
- Inhibition rules (host down suppresses other alerts)
- Dev VM alerts separate from production
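The inhibition behavior described above corresponds to a rule of roughly this shape (a sketch; matcher labels are illustrative):

```yaml
inhibit_rules:
  # When a host is down, suppress its other warning-level alerts
  - source_matchers: ['alertname = InstanceDown']
    target_matchers: ['severity = warning']
    equal: ['instance']
```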
Notification Review¶
Periodically review:
- Are alerts actionable?
- Too many false positives?
- Missing important events?
- Need different channels per severity?
Future Enhancements¶
Potential improvements:
- Separate channels by severity
  - scandora-critical (critical only)
  - scandora-info (GitHub, warnings)
- Enhanced formatting
  - Custom Slack blocks for richer display
  - Thread replies for alert updates
  - Buttons for common actions (acknowledge, silence)
- Integration with other services
  - PagerDuty for on-call rotation
  - Opsgenie for escalation
  - Slack slash commands for querying status
- Custom webhooks for specific events
  - Backup completions/failures
  - Cost threshold alerts
  - Security scan results
- Bi-directional integration
  - Slack → Prometheus (silence alerts via command)
  - Slack → GitHub (create issues from alerts)
Related Documentation¶
- AlertManager Template: cloud/ansible/roles/monitoring-stack/templates/alertmanager.yml.j2
- GitHub Terraform Module: cloud/terraform/modules/github-slack-webhook/
- Terraform Environment: cloud/terraform/environments/production/network/github-slack/
- Credential Helper: scripts/terraform/tf-github-slack-env.sh
Quick Reference¶
Useful Commands¶
# View AlertManager config
ssh dumbo "cat /opt/monitoring/alertmanager/alertmanager.yml"
# Check alert status (the v1 API is removed in current AlertManager releases; use v2)
curl -s http://192.168.194.131:9093/api/v2/alerts | jq
# Silence an alert (1 hour; the v2 API requires createdBy)
curl -X POST http://192.168.194.131:9093/api/v2/silences \
  -H 'Content-Type: application/json' \
  -d '{"matchers":[{"name":"alertname","value":"InstanceDown","isRegex":false}],"startsAt":"2026-02-13T00:00:00Z","endsAt":"2026-02-13T01:00:00Z","createdBy":"infra-team","comment":"Maintenance window"}'
# List GitHub webhooks (via Terraform)
cd cloud/terraform/environments/production/network/github-slack
terraform state list
terraform show
# Test Slack webhook
WEBHOOK_URL=$(op item get "slack_webhook_scandora_notifications" \
--vault scandora-automation --fields webhook_url --reveal)
curl -X POST "$WEBHOOK_URL" \
-H 'Content-type: application/json' \
-d '{"text":"Test notification"}'
Last updated: 2026-02-13 Status: Active and operational Owner: Infrastructure team Contact: #scandora-notifications (Slack)