HOMELAB-547: fix(monitoring): adjust memory headroom alert thresholds for homelab #145
Open: aaron wants to merge 1 commit from plane/HOMELAB-547-memory-headroom-alerts into live
Reference
aaron/infra-core!145
Summary
Root Cause
The previous thresholds were too conservative for the resource-constrained homelab environment and caused alert fatigue. The homelab normally runs at high memory utilization (80-90% or more), as evidenced by recent memory-pressure fixes for multiple services.
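To illustrate the kind of change described above, here is a minimal sketch of a Prometheus alerting rule whose threshold sits above the normal operating band. It is not the rule from this PR: the group name, alert name, 5% headroom threshold, and 15m duration are all hypothetical, and it assumes node_exporter-style metrics (`node_memory_MemAvailable_bytes`, `node_memory_MemTotal_bytes`) are being scraped.

```yaml
# Hypothetical example only: names, threshold, and duration are illustrative
# assumptions, not the values changed in this PR.
groups:
  - name: memory-headroom-example
    rules:
      - alert: NodeMemoryHeadroomLow
        # Fires when available memory is below 5% of total (utilization
        # above 95%), which is outside the 80-90% band the homelab
        # normally runs in.
        expr: |
          node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.05
        # The condition must hold continuously for 15 minutes before firing,
        # so short spikes do not page anyone.
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Memory headroom below 5% on {{ $labels.instance }}"
```

Both the threshold and the `for` duration affect alert volume: raising the threshold moves the alert out of the normal operating range, while a longer `for` window filters out transient pressure.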
Recent Context
Test Plan
Impact
✅ Self-merge eligible - Simple configuration change, Helm values adjustment
🤖 Generated with Claude Code