Common Issues

Solutions to frequently encountered issues.

Pods Not Starting

Check Pod Status

kubectl get pods -n optimal-system
kubectl describe pod <pod-name> -n optimal-system
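
The STATUS column alone can hide pods that are Running but not ready. As a minimal sketch, the hypothetical helper `unhealthy_pods` (not part of the platform) narrows the `kubectl get pods` output to pods that are neither Completed nor fully ready. It assumes the default column layout (NAME, READY, STATUS).

```shell
# Hypothetical helper: list pods in a namespace that look unhealthy.
# Assumes the default `kubectl get pods` columns: NAME READY STATUS ...
unhealthy_pods() {
  ns="${1:-optimal-system}"
  kubectl get pods -n "$ns" --no-headers | awk '
    {
      split($2, ready, "/")            # READY looks like "1/2"
      if ($3 == "Completed") next      # finished Jobs are fine
      if ($3 != "Running" || ready[1] != ready[2])
        print $1 "\t" $3 "\t" $2
    }'
}
```

Calling `unhealthy_pods optimal-system` prints one line per suspect pod with its name, status, and ready count, which gives a short list to pass to `kubectl describe pod`.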

Common Causes

| Issue | Cause | Solution |
|---|---|---|
| ImagePullBackOff | Can't pull image | Check registry credentials, image exists |
| CrashLoopBackOff | App crashing | Check logs: kubectl logs <pod> |
| Pending | No resources | Check node resources, PVC binding |
| Init:Error | Init container failed | Check init container logs |

Check Logs

# Current logs
kubectl logs <pod-name> -n optimal-system

# Previous container logs (if restarting)
kubectl logs <pod-name> -n optimal-system --previous

# All containers in pod
kubectl logs <pod-name> -n optimal-system --all-containers
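
For a pod stuck in Init:Error, the logs of interest belong to its init containers, which have to be selected by name with `-c`. The sketch below reads the init container names from the pod spec and prints each log in turn. The function name `init_logs` is hypothetical.

```shell
# Hypothetical helper: print the logs of every init container in a pod.
init_logs() {
  pod="$1"
  ns="${2:-optimal-system}"
  # The jsonpath expression returns the init container names separated by spaces
  for c in $(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.initContainers[*].name}'); do
    echo "--- init container: $c"
    kubectl logs "$pod" -n "$ns" -c "$c"
  done
}
```

Usage: `init_logs <pod-name> optimal-system`. If the pod has no init containers, the function prints nothing.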

Database Connection Issues

Test Connectivity

# From API pod
kubectl exec -it <api-pod> -n optimal-system -- nc -zv postgresql 5432

# Check database service
kubectl get svc -n optimal-system | grep postgresql

Common Fixes

  1. Check credentials: Verify secret values

    kubectl get secret optimal-db-credentials -n optimal-system -o yaml
  2. Check PostgreSQL pod: Ensure database is running

    kubectl get pods -n optimal-system | grep postgresql
  3. Check network policies: Ensure traffic is allowed

    kubectl get networkpolicies -n optimal-system
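
For the credentials check in step 1, note that `-o yaml` shows Secret values base64-encoded. As a rough sketch, the hypothetical helper `secret_value` decodes a single key. The key name used in the usage line (`password`) is an assumption. `kubectl describe secret optimal-db-credentials -n optimal-system` lists the real key names without revealing their values.

```shell
# Hypothetical helper: print one decoded key from a Secret.
# Works for simple key names; keys containing dots need jsonpath escaping.
secret_value() {
  name="$1"
  key="$2"
  ns="${3:-optimal-system}"
  kubectl get secret "$name" -n "$ns" -o jsonpath="{.data.$key}" | base64 --decode
}
```

Usage: `secret_value optimal-db-credentials password`. The decoded value is printed to the terminal, so avoid running it in shared sessions or recorded shells.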

Ingress Issues

Check Ingress Controller

kubectl get pods -n ingress-nginx
kubectl logs -n ingress-nginx -l app.kubernetes.io/component=controller

Check Ingress Configuration

kubectl get ingress -n optimal-system
kubectl describe ingress <ingress-name> -n optimal-system

Common Issues

| Symptom | Cause | Solution |
|---|---|---|
| 503 error | Backend not ready | Check pod readiness |
| 404 error | Path not configured | Check ingress paths |
| SSL error | Certificate issue | Check cert-manager |
| Connection refused | Service misconfigured | Check service selector |
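
One quick way to test both the "backend not ready" and "service misconfigured" cases is to look at the Service's endpoints: a Service whose selector matches no ready pods has none. The sketch below wraps that check in a hypothetical `service_has_endpoints` function. The Service name is whatever backend the ingress points at.

```shell
# Hypothetical helper: report whether a Service currently has ready endpoints.
service_has_endpoints() {
  svc="$1"
  ns="${2:-optimal-system}"
  ips=$(kubectl get endpoints "$svc" -n "$ns" -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -n "$ips" ]; then
    echo "$svc has ready endpoints: $ips"
  else
    echo "$svc has no ready endpoints; compare its selector with the pod labels"
    return 1
  fi
}
```

If the function reports no endpoints, compare `kubectl get svc <service-name> -n optimal-system -o yaml` with `kubectl get pods -n optimal-system --show-labels` to find the mismatch, or check why the matching pods are not ready.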

Kyverno Policy Violations

View Policy Reports

kubectl get policyreport -n optimal-system
kubectl describe policyreport -n optimal-system

Check Specific Violations

# See which policies are failing
kubectl get clusterpolicy
kubectl describe clusterpolicy <policy-name>
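
To narrow the policy reports down to failures only, the results can be filtered with a jsonpath expression. This is a sketch that assumes the standard PolicyReport layout, where each entry under `results` carries `policy`, `rule`, `result`, and `message` fields. The function name `failed_policy_results` is hypothetical.

```shell
# Hypothetical helper: print policy, rule, and message for every failed
# result in the PolicyReports of a namespace.
failed_policy_results() {
  ns="${1:-optimal-system}"
  kubectl get policyreport -n "$ns" \
    -o jsonpath='{range .items[*].results[?(@.result=="fail")]}{.policy}{"\t"}{.rule}{"\t"}{.message}{"\n"}{end}'
}
```

Each output line names a policy and rule that can then be inspected with `kubectl describe clusterpolicy <policy-name>`.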

Exempt Resources

If needed, create a PolicyException:

apiVersion: kyverno.io/v2beta1
kind: PolicyException
metadata:
  name: allow-specific-workload
spec:
  exceptions:
  - policyName: require-run-as-non-root
    ruleNames:
    - run-as-non-root
  match:
    any:
    - resources:
        namespaces:
        - your-namespace

Observability Issues

Prometheus Not Scraping

# Check targets
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090 -n monitoring
# Then visit http://localhost:9090/targets

Loki Not Receiving Logs

# Check Promtail
kubectl get pods -n logging | grep promtail
kubectl logs -n logging -l app=promtail

# Check Loki
kubectl logs -n logging -l app=loki

Grafana Dashboard Empty

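  1. Check the data source configuration: confirm that the data source used by the dashboard exists and that its connection test succeeds
  2. Verify that the selected time range covers a period with data
  3. Check the query syntax in the panel editor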

Velero Backup Issues

Check Backup Status

kubectl get backups -n velero
kubectl describe backup <backup-name> -n velero

Check Velero Logs

kubectl logs -n velero -l app.kubernetes.io/name=velero

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Backup stuck | Storage issue | Check BackupStorageLocation |
| Partial failure | PV snapshot failed | Check VolumeSnapshotLocation |
| Restore failed | Namespace conflict | Delete existing resources first |
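
For the storage check in the table above, the phase of each BackupStorageLocation is usually the fastest signal. The sketch below prints it for every location in the `velero` namespace. The function name `bsl_status` is hypothetical.

```shell
# Hypothetical helper: print each BackupStorageLocation with its phase.
# A phase other than "Available" usually points at credentials, bucket,
# or network problems.
bsl_status() {
  kubectl get backupstoragelocations -n velero \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}
```

If the `velero` CLI is installed, `velero backup describe <backup-name> --details` and `velero backup logs <backup-name>` give per-backup detail once the storage location reports as available.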

Getting Help

If you can't resolve an issue:

  1. Collect diagnostics:

    kubectl get events -n optimal-system --sort-by='.lastTimestamp'
    kubectl describe pods -n optimal-system > pods.txt
  2. Check documentation: https://launchpad.gooptimal.io

  3. Open an issue: https://github.com/optimal-platform/optimal-platform/issues

  4. Contact support: support@gooptimal.io
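
The diagnostics from step 1 can be gathered into a single directory before opening an issue. This is a minimal sketch, not an official tool: the function name `collect_diagnostics` and the output layout are assumptions.

```shell
# Hypothetical helper: gather events, pod descriptions, and pod logs
# for a namespace into one directory.
collect_diagnostics() {
  ns="${1:-optimal-system}"
  dir="${2:-diagnostics}"
  mkdir -p "$dir/logs"
  kubectl get events -n "$ns" --sort-by='.lastTimestamp' > "$dir/events.txt"
  kubectl describe pods -n "$ns" > "$dir/pods.txt"
  for pod in $(kubectl get pods -n "$ns" -o name); do
    # `-o name` yields "pod/<name>"; strip the prefix for the file name
    kubectl logs "$pod" -n "$ns" --all-containers > "$dir/logs/${pod#pod/}.log" 2>&1
  done
}
```

Usage: `collect_diagnostics optimal-system ./diagnostics`. Review the collected files for credentials or other sensitive data before attaching them to a public issue.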