ACCESSING CLUSTERS

CRITICAL: Always prefix kubectl/flux commands with an inline KUBECONFIG assignment. Do NOT use export or &&; the variable must be set in the same command:

CORRECT - inline assignment

KUBECONFIG=~/.kube/<cluster>.yaml kubectl get pods

WRONG - export with && breaks in some shell contexts

export KUBECONFIG=~/.kube/<cluster>.yaml && kubectl get pods
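The difference between the two forms can be seen without touching a cluster. The following is a minimal sketch, assuming a POSIX shell; /tmp/demo.yaml is a placeholder path used only for illustration, not a real kubeconfig. It shows that an inline assignment is visible to that one command and leaves the calling shell untouched.

```shell
# Start from a clean state so the demonstration is unambiguous.
unset KUBECONFIG

# Inline assignment: the variable exists only in the environment of this one command.
# /tmp/demo.yaml is a placeholder path (assumption), not a real kubeconfig.
KUBECONFIG=/tmp/demo.yaml sh -c 'echo "inside: ${KUBECONFIG:-unset}"'
# prints "inside: /tmp/demo.yaml"

# The calling shell is unchanged afterwards, so a later command cannot
# silently inherit the wrong cluster.
echo "after: ${KUBECONFIG:-unset}"
# prints "after: unset"
```

Because nothing persists between commands, each invocation states its target cluster explicitly.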

Cluster Context

CRITICAL: Always confirm cluster before running commands.

| Cluster | Purpose | Kubeconfig |
|---|---|---|
| dev | Manual testing | ~/.kube/dev.yaml |
| integration | Automated testing | ~/.kube/integration.yaml |
| live | Production | ~/.kube/live.yaml |

KUBECONFIG=~/.kube/<cluster>.yaml kubectl <command>
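One way to make the "confirm the cluster" step harder to skip is to resolve the kubeconfig path through a small function that rejects unknown names. This is a sketch only: kubeconfig_for is a hypothetical helper name, not part of this skill, and it assumes the three kubeconfig files listed above.

```shell
# Hypothetical helper (not part of the skill): map a cluster name to its
# kubeconfig path and fail loudly on anything that is not a known cluster.
kubeconfig_for() {
  case "$1" in
    dev|integration|live)
      printf '%s\n' "$HOME/.kube/$1.yaml"
      ;;
    *)
      echo "unknown cluster: $1" >&2
      return 1
      ;;
  esac
}

# Usage (commented out; needs kubectl and real kubeconfig files):
#   KUBECONFIG="$(kubeconfig_for dev)" kubectl config current-context
#   KUBECONFIG="$(kubeconfig_for dev)" kubectl get pods -n monitoring
```

The assignment is still inline, so the rule above holds; the function only validates the name before a path is produced.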

Accessing Internal Services

Platform services are exposed through the internal ingress gateway over HTTPS. DNS URLs are useful for browser-based access (Grafana, Hubble UI, Longhorn UI).

OAuth2 Proxy caveat: Prometheus, Alertmanager, and some other services are behind OAuth2 Proxy. DNS URLs redirect to an OAuth login page and cannot be used for API queries via curl. Use kubectl exec or port-forward instead for programmatic access.

| Service | Live URL | Auth | API Access |
|---|---|---|---|
| Prometheus | https://prometheus.internal.tomnowak.work | OAuth2 Proxy | kubectl exec or port-forward |
| Alertmanager | https://alertmanager.internal.tomnowak.work | OAuth2 Proxy | kubectl exec or port-forward |
| Grafana | https://grafana.internal.tomnowak.work | Built-in auth | Browser only |
| Hubble UI | https://hubble.internal.tomnowak.work | None | Browser |
| Longhorn UI | https://longhorn.internal.tomnowak.work | None | Browser |
| Garage Admin | https://garage.internal.tomnowak.work | None | Browser |

Domain pattern: <service>.internal.<cluster-suffix>.tomnowak.work

live: internal.tomnowak.work
integration: internal.integration.tomnowak.work
dev: internal.dev.tomnowak.work
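The domain pattern can be expressed as a small lookup. The sketch below assumes only the three suffixes listed above; service_url is a hypothetical function name, and the service argument is taken as given without checking that the service exists on that cluster.

```shell
# Hypothetical helper: build the internal HTTPS URL for a service on a cluster.
# Note that live has no cluster suffix, while integration and dev do.
service_url() {
  service=$1
  cluster=$2
  case "$cluster" in
    live)
      domain="internal.tomnowak.work"
      ;;
    integration|dev)
      domain="internal.$cluster.tomnowak.work"
      ;;
    *)
      echo "unknown cluster: $cluster" >&2
      return 1
      ;;
  esac
  printf 'https://%s.%s\n' "$service" "$domain"
}

service_url grafana live
# prints "https://grafana.internal.tomnowak.work"
service_url hubble dev
# prints "https://hubble.internal.dev.tomnowak.work"
```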

Querying Prometheus/Alertmanager API

Option 1: kubectl exec (quick, no setup)

KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring prometheus-kube-prometheus-stack-0 -c prometheus -- \
  wget -qO- 'http://localhost:9090/api/v1/query?query=up' | jq '.data.result'

KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring prometheus-kube-prometheus-stack-0 -c prometheus -- \
  wget -qO- 'http://localhost:9090/api/v1/alerts' | jq '.data.alerts[] | select(.state == "firing")'

KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring alertmanager-kube-prometheus-stack-0 -c alertmanager -- \
  wget -qO- 'http://localhost:9093/api/v2/alerts' | jq .

Option 2: Port-forward (for scripts and repeated queries)

KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
curl -s "http://localhost:9090/api/v1/query?query=up" | jq '.data.result'

Using the helper scripts:

Prometheus (start port-forward first; script defaults to http://localhost:9090)

KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
.claude/skills/prometheus/scripts/promql.sh alerts --firing

Loki (no HTTPRoute — always requires port-forward)

KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/loki-headless 3100:3100 &
export LOKI_URL=http://localhost:3100
.claude/skills/loki/scripts/logql.sh tail '{namespace="monitoring"}' --since 15m
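A backgrounded port-forward may not be listening yet when the next command runs, so a script's first query can fail intermittently. A minimal sketch of one way to handle this follows. It assumes a POSIX shell; wait_until is a hypothetical helper name, and the retry count and one-second interval are arbitrary choices, not values from this skill.

```shell
# Hypothetical helper: run a command repeatedly, once per second, until it
# succeeds or the given number of attempts is used up.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    "$@" >/dev/null 2>&1 && return 0
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}

# Usage sketch (commented out; needs kubectl, curl, jq, and cluster access):
#   KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
#   pf_pid=$!
#   trap 'kill "$pf_pid" 2>/dev/null' EXIT      # stop the tunnel when the script exits
#   wait_until 10 curl -sf http://localhost:9090/-/ready || exit 1
#   curl -s 'http://localhost:9090/api/v1/query?query=up' | jq '.data.result'
```

The trap in the usage sketch also prevents stale port-forward processes from holding the local port after the script finishes.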

Common kubectl Patterns

Read-only commands used during daily operations and investigations:

| Command | Purpose |
|---|---|
| kubectl get pods -n <ns> | List pods in a namespace |
| kubectl get pods -A | List pods across all namespaces |
| kubectl describe pod <pod> -n <ns> | Detailed pod info with events |
| kubectl logs <pod> -n <ns> --tail=100 | Recent logs from a pod |
| kubectl logs <pod> -n <ns> --previous | Logs from previous container instance |
| kubectl get events -n <ns> --sort-by='.lastTimestamp' | Recent events timeline |
| kubectl top pods -n <ns> | CPU/memory usage per pod |
| kubectl top nodes | CPU/memory usage per node |
| kubectl get ns <ns> --show-labels | Namespace labels (network policy profiles) |
| kubectl explain <resource> | API schema reference for a resource type |

Flux GitOps Commands

Status and Reconciliation

Check status

KUBECONFIG=~/.kube/<cluster>.yaml flux get all
KUBECONFIG=~/.kube/<cluster>.yaml flux get kustomizations
KUBECONFIG=~/.kube/<cluster>.yaml flux get helmreleases -A

Trigger reconciliation

KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile source git flux-system
KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile kustomization <name>
KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile helmrelease <name> -n <namespace>

Flux Status Interpretation

| Status | Meaning | Action |
|---|---|---|
| Ready: True | Resource is reconciled and healthy | None - operating normally |
| Ready: False | Resource failed to reconcile | Check the message/reason for details |
| Stalled: True | Resource has stopped retrying after repeated failures | Suspend/resume to reset (see sre skill) |
| Suspended: True | Resource is intentionally paused | Resume when ready: flux resume <type> <name> |
| Reconciling | Resource is actively being applied | Wait for completion |

Researching Unfamiliar Services

When investigating unknown services, spawn a haiku agent to research documentation:

Task tool:

subagent_type: "general-purpose"
model: "haiku"
prompt: "Research [service] troubleshooting docs. Focus on:
1. Common failure modes
2. Health indicators
3. Configuration gotchas
Start with: [docs-url]"

Chart URL to Docs mapping:

| Chart Source | Documentation |
|---|---|
| charts.jetstack.io | cert-manager.io/docs |
| charts.longhorn.io | longhorn.io/docs |
| grafana.github.io | grafana.com/docs |
| prometheus-community.github.io | prometheus.io/docs |

Common Confusions

BAD: Use helm list to check Helm release status

GOOD: Use kubectl get helmrelease -A

Flux manages releases via CRDs, not Helm CLI

Keywords

kubernetes, kubectl, kubeconfig, flux, flux status, cluster access, internal URL, service URL, port-forward, helm release, gitops, reconciliation
