HOMELAB-550: feat(lab-director): add RBAC for alert investigation permissions #140

Open
aaron wants to merge 2 commits from plane/HOMELAB-550-fix-alert-permissions into live
Owner

Summary

  • Unblock alert investigations that fail on missing permissions by adding RBAC configuration for lab-director
  • Create ServiceAccount, ClusterRole, and ClusterRoleBinding for lab-director
  • Enable cluster-wide read access to resources needed for alert debugging

Changes

  1. ServiceAccount: New dedicated service account for lab-director
  2. ClusterRole: Read permissions for:
    • Core and batch resources (pods, events, nodes, jobs)
    • Metrics access for resource monitoring
    • Monitoring resources (PrometheusRules, ServiceMonitors)
    • Storage resources for Longhorn alerts
    • Network resources for connectivity debugging
  3. Deployment update: Use the new ServiceAccount
  4. Values: Enable RBAC by default
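
For reviewers, the read-only ClusterRole described above might render to something like the following. This is a sketch, not the chart's actual template: the role name and the specific Longhorn and Cilium resource names are assumptions, since the PR text names only the API groups. Note that jobs belong to the `batch` API group rather than the core group.

```yaml
# Sketch only. metadata.name and the longhorn.io / cilium.io resource
# lists are assumptions; the PR names only the API groups.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: lab-director-alert-reader  # hypothetical name
rules:
  # Core API group: pods, events, nodes
  - apiGroups: [""]
    resources: ["pods", "events", "nodes"]
    verbs: ["get", "list", "watch"]
  # Jobs are served from the batch API group, not core
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch"]
  # Metrics API for resource usage
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list"]
  # Prometheus Operator custom resources
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["prometheusrules", "servicemonitors"]
    verbs: ["get", "list", "watch"]
  # Longhorn custom resources (assumed subset)
  - apiGroups: ["longhorn.io"]
    resources: ["volumes", "replicas", "nodes"]
    verbs: ["get", "list", "watch"]
  # Cilium custom resources (assumed subset)
  - apiGroups: ["cilium.io"]
    resources: ["ciliumnetworkpolicies", "ciliumendpoints"]
    verbs: ["get", "list", "watch"]
```

Limiting every rule to `get`, `list`, and `watch` keeps the role strictly read-only, which matches the stated goal of cluster-wide read access for debugging.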

Test Plan

  • Verify Helm chart validation passes
  • Deploy to cluster and confirm ServiceAccount is created
  • Test kubectl permissions from lab-director pod
  • Verify alert investigation pipelines can access required resources

Related

  • Fixes HOMELAB-550
  • Resolves blocked alert investigations for KubeJobFailed, AlertmanagerFailedToSendAlerts, KubeCPUOvercommit
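
One way to carry out the permission check from the test plan without opening a shell in the pod is a SubjectAccessReview, which asks the API server whether a given identity may perform an action. The sketch below assumes the ServiceAccount is named `lab-director` in a namespace also called `lab-director`; both are hypothetical and should be replaced with the chart's real values.

```yaml
# Sketch only. The namespace and ServiceAccount name in spec.user
# are assumptions, not values taken from this PR.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # ServiceAccount identities take the form
  # system:serviceaccount:<namespace>:<name>
  user: system:serviceaccount:lab-director:lab-director
  resourceAttributes:
    group: batch      # API group that serves jobs
    resource: jobs
    verb: list
    # namespace omitted: the check covers all namespaces
```

Submitting this with `kubectl create -f <file> -o yaml` returns the object with `status.allowed` filled in. Repeating the check for each API group in the ClusterRole covers the three alerts listed as blocked.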

🤖 Generated with Claude Code

HOMELAB-550: feat(lab-director): add RBAC for alert investigation permissions
Some checks are pending
CI Review / pr-title (pull_request) Waiting to run
CI Review / helm-validate (pull_request) Waiting to run
CI Review / ai-review (pull_request) Waiting to run
Lint & Validate / terraform-validate (pull_request) Waiting to run
Lint & Validate / yaml-lint (pull_request) Waiting to run
Lint & Validate / shellcheck (pull_request) Waiting to run
0/0 projects applied successfully.
b90c06c818
- Add ServiceAccount for lab-director deployment
- Create ClusterRole with read permissions for alert investigation:
  * pods, events, nodes, jobs for general debugging
  * metrics.k8s.io resources for resource monitoring
  * monitoring.coreos.com for Prometheus rules
  * longhorn.io resources for storage alerts
  * cilium.io for network debugging
- Add ClusterRoleBinding to connect ServiceAccount to ClusterRole
- Update deployment to use the new ServiceAccount
- Enable RBAC by default in values.yaml

This resolves permission issues blocking alert investigation pipelines.
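
As a rough illustration of how the ServiceAccount and ClusterRoleBinding from this commit fit together, the rendered manifests could look like the sketch below. All names and the namespace are hypothetical, and the ClusterRole it binds to is assumed to be called `lab-director-alert-reader`.

```yaml
# Sketch only. Names and namespace are assumptions, not taken from the chart.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: lab-director            # hypothetical name
  namespace: lab-director       # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: lab-director-alert-reader   # hypothetical name
roleRef:
  # Must reference an existing ClusterRole; roleRef cannot be
  # changed after the binding is created
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: lab-director-alert-reader
subjects:
  - kind: ServiceAccount
    name: lab-director
    namespace: lab-director
```

The Deployment picks up these permissions once `spec.template.spec.serviceAccountName` is set to the ServiceAccount's name. Without that field the pod runs as the namespace's `default` ServiceAccount and the binding has no effect.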
HOMELAB-550: docs: document Claude Code settings.json fix for pipeline permissions
Some checks failed
CI Review / ai-review (pull_request) Has been cancelled
CI Review / helm-validate (pull_request) Has been cancelled
CI Review / pr-title (pull_request) Has been cancelled
Lint & Validate / shellcheck (pull_request) Has been cancelled
Lint & Validate / yaml-lint (pull_request) Has been cancelled
Lint & Validate / terraform-validate (pull_request) Has been cancelled
5fa9956890
Complete the alert investigation permissions fix by documenting both layers:
- Claude Code settings.json allowedTools configuration
- Kubernetes RBAC for lab-director cluster access

This documentation explains the dual-layer permission model and
verification steps for successful alert investigation workflows.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
This pull request has changes conflicting with the target branch.
  • core/charts/apps/lab-director/templates/deployment.yaml
  • core/charts/apps/lab-director/values.yaml

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin plane/HOMELAB-550-fix-alert-permissions:plane/HOMELAB-550-fix-alert-permissions
git switch plane/HOMELAB-550-fix-alert-permissions
Reference
aaron/infra-core!140