How to Secure Kubernetes Clusters for Production

About us: Personal website of Timofey Bugaevsky and the company Zetka Interactive

Problem Statement

You need to harden your Kubernetes cluster against security threats, implement proper access controls, encrypt sensitive data, and protect network communications between services.

Security Layers

┌──────────────────────────────────────────────────────────────────┐
│ External Security │
│ • WAF, DDoS Protection, SSL/TLS Termination │
└──────────────────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────────────┐
│ Cluster Security │
│ • RBAC, Network Policies, Pod Security │
└──────────────────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────────────┐
│ Application Security │
│ • Secret Management, Container Scanning, Runtime Security │
└──────────────────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────────────┐
│ Data Security │
│ • Encryption at Rest, Encryption in Transit, Backup Security │
└──────────────────────────────────────────────────────────────────┘

1. Role-Based Access Control (RBAC)

Principle of Least Privilege

Create specific roles for different teams and use cases.

Developer Role (Namespace-Scoped)

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: development
rules:
  # Read-only access to most resources
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "events"]
    verbs: ["get", "list", "watch"]
  # Can manage deployments
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  # Can view logs and exec into pods
  - apiGroups: [""]
    resources: ["pods/log", "pods/exec"]
    verbs: ["get", "create"]
  # Cannot access secrets
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: development
subjects:
  - kind: Group
    name: developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io

CI/CD Pipeline Role

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd-deployer
  namespace: production
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab-deployer
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cicd-deployer-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: gitlab-deployer
    namespace: production
roleRef:
  kind: Role
  name: cicd-deployer
  apiGroup: rbac.authorization.k8s.io
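
The pipeline role above can be narrowed further when the objects it touches are known in advance. The following is a minimal sketch, not the author's configuration: the role name, the Deployment name `backend`, and the ConfigMap name `backend-config` are assumptions made for illustration.

```yaml
# Hypothetical, tighter variant of the CI/CD role: access is limited to named objects
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd-deployer-scoped   # hypothetical name
  namespace: production
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    resourceNames: ["backend"]          # assumed Deployment name
    verbs: ["get", "update", "patch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["backend-config"]   # assumed ConfigMap name
    verbs: ["get", "update", "patch"]
```

RBAC cannot restrict `create` requests by resource name, so this variant assumes the objects already exist and the pipeline only updates them.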

Audit RBAC Permissions

# Check what a user can do (replace the placeholder with the user to impersonate)
kubectl auth can-i --list --as="<user-email>"
# Check specific permission
kubectl auth can-i delete pods --namespace=production --as="<user-email>"
# List all cluster role bindings
kubectl get clusterrolebindings -o wide

2. Pod Security Standards

Restricted Pod Security Standard (Namespace Labels)

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
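
Enforcing `restricted` immediately can reject workloads that have not been hardened yet. One way to stage the rollout, sketched here for a hypothetical `staging` namespace, is to enforce the looser `baseline` level while auditing and warning against `restricted`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging   # hypothetical namespace, not from the original article
  labels:
    # Reject only pods that violate the baseline level
    pod-security.kubernetes.io/enforce: baseline
    # Record and surface violations of the restricted level without blocking
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Once the audit log and kubectl warnings are clean, the `enforce` label can be raised to `restricted`.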

Secure Pod Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  # apps/v1 requires a selector that matches the pod template labels
  # (the label value is an assumption)
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      # Don't use the default service account
      serviceAccountName: app-service-account
      automountServiceAccountToken: false
      # Security context at the pod level
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: myapp:v1.0
          # Security context at the container level
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          # Resource limits prevent DoS
          resources:
            limits:
              cpu: "1"
              memory: "512Mi"
            requests:
              cpu: "100m"
              memory: "128Mi"
          # Writable directories are mounted as volumes because the root filesystem is read-only
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /app/cache
      volumes:
        # tmpfs-backed scratch space
        - name: tmp
          emptyDir:
            medium: Memory
            sizeLimit: 100Mi
        - name: cache
          emptyDir: {}
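
The Deployment above references `app-service-account`, which has to exist before the pods can start. A minimal sketch of that ServiceAccount follows; the `production` namespace is an assumption, since the Deployment does not set one.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production   # assumed; use the namespace the Deployment runs in
# Disable token mounting at the account level as well, so pods that forget
# the pod-level setting still do not receive an API token
automountServiceAccountToken: false
```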

3. Network Policies

Default Deny All

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
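
Because the policy above also denies egress, every pod in the namespace loses DNS resolution until it is explicitly allowed again. The following sketch re-opens DNS for all pods; it assumes the cluster DNS pods run in `kube-system` and carry the label `k8s-app: kube-dns`, which is common but should be verified on your cluster. The policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress   # hypothetical name
  namespace: production
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        # Assumption: DNS pods live in kube-system with this label
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```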

Allow Specific Traffic

# Allow frontend to backend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
---
# Allow backend to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-allow-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 5432
---
# Allow egress to external APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress
  egress:
    # HTTPS to public addresses only; private ranges are excluded
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 443
    # Allow DNS (UDP, plus TCP for responses too large for UDP)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

4. Secret Management

Using External Secrets Operator with HashiCorp Vault

# Install External Secrets Operator first
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
          serviceAccountRef:
            name: external-secrets
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: app-secrets
    creationPolicy: Owner
  data:
    - secretKey: database-password
      remoteRef:
        key: production/database
        property: password
    - secretKey: api-key
      remoteRef:
        key: production/api
        property: key
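
The ExternalSecret above produces a regular Kubernetes Secret named `app-secrets`. One way to consume it without exposing the values as environment variables is a read-only volume mount. This is a sketch only: the pod name and the mount path are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-consumer   # hypothetical name
  namespace: production
spec:
  containers:
    - name: app
      image: myapp:v1.0
      volumeMounts:
        # Each key becomes a file, e.g. database-password and api-key
        - name: app-secrets
          mountPath: /etc/app/secrets   # assumed path
          readOnly: true
  volumes:
    - name: app-secrets
      secret:
        secretName: app-secrets
```

Mounted secrets are refreshed in the pod when the Secret changes, whereas environment variables keep their old values until the pod restarts.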

Encrypting Secrets at Rest

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}

Apply to kube-apiserver:

# Add to kube-apiserver manifest
spec:
  containers:
    - command:
        - kube-apiserver
        - --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
      volumeMounts:
        - mountPath: /etc/kubernetes/encryption-config.yaml
          name: encryption-config
          readOnly: true
  volumes:
    - hostPath:
        path: /etc/kubernetes/encryption-config.yaml
        type: FileOrCreate
      name: encryption-config
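
Encryption keys need rotating as well. The API server encrypts new writes with the first key in the list and tries every listed key when decrypting, so a rotation can be sketched as below. The key name `key2` is hypothetical, and both secrets remain placeholders.

```yaml
# /etc/kubernetes/encryption-config.yaml during a key rotation (sketch)
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            # New key listed first: used to encrypt all new writes
            - name: key2   # hypothetical key name
              secret: <base64-encoded-32-byte-key>
            # Old key kept so existing secrets can still be decrypted
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}
```

Existing secrets stay encrypted with `key1` until they are rewritten. Only remove the old key after every secret has been re-saved under the new one and the API server has been restarted with the updated file.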

5. Container Image Security

Image Scanning in CI/CD

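# .gitlab-ci.yml
container-scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    # Fail the job on CRITICAL findings
    - trivy image --exit-code 1 --severity CRITICAL "$IMAGE_TAG"
    # Report HIGH and MEDIUM findings without failing the job. GitLab's
    # container_scanning report expects GitLab's own schema, so render it with the
    # template shipped in the Trivy image (verify the path for your Trivy version)
    - trivy image --exit-code 0 --severity HIGH,MEDIUM --format template --template "@/contrib/gitlab.tpl" -o trivy-report.json "$IMAGE_TAG"
  artifacts:
    reports:
      container_scanning: trivy-report.json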

Image Policy with Kyverno

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "myregistry.com/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |
                      -----BEGIN PUBLIC KEY-----
                      ...
                      -----END PUBLIC KEY-----
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Using 'latest' tag is not allowed"
        pattern:
          spec:
            containers:
              - image: "!*:latest"

6. Service Mesh Security (Istio)

Enable mTLS

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/production/sa/frontend"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/*"]
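
The `backend-policy` above only restricts the backend workload. Other workloads in the namespace that have no AuthorizationPolicy still accept every request. A namespace-wide default deny, sketched here with a hypothetical name, closes that gap in the same way the default-deny NetworkPolicy does at the network layer:

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing   # hypothetical name
  namespace: production
# An empty spec has no selector and no rules: it applies to every workload in
# the namespace and matches no request, so everything is denied unless
# another ALLOW policy permits it
spec: {}
```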

7. Audit Logging

Enable Audit Logs

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log every anonymous request at the Metadata level (omitting verbs matches all verbs)
  - level: Metadata
    users: ["system:anonymous"]
  # Log secrets access at the Metadata level only;
  # RequestResponse would write secret values into the audit log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Log all changes to deployments
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "apps"
        resources: ["deployments"]
  # Default: log at the Metadata level
  - level: Metadata
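
The policy file has no effect until the API server is pointed at it. Below is a sketch of the matching kube-apiserver manifest fragment. The log directory is chosen to match the Fluentd `path` used later in this guide, and the retention values are assumptions to adjust.

```yaml
# Fragment of the kube-apiserver static pod manifest (sketch)
spec:
  containers:
    - command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
        - --audit-log-maxage=30      # days to keep old files (assumed value)
        - --audit-log-maxbackup=10   # rotated files to keep (assumed value)
        - --audit-log-maxsize=100    # MB before rotation (assumed value)
      volumeMounts:
        - mountPath: /etc/kubernetes/audit-policy.yaml
          name: audit-policy
          readOnly: true
        - mountPath: /var/log/kubernetes/audit/
          name: audit-log
  volumes:
    - name: audit-policy
      hostPath:
        path: /etc/kubernetes/audit-policy.yaml
        type: File
    - name: audit-log
      hostPath:
        path: /var/log/kubernetes/audit/
        type: DirectoryOrCreate
```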

Forward to SIEM

# Fluentd ConfigMap for audit logs
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/kubernetes/audit/*.log
      pos_file /var/log/fluentd-audit.pos
      tag kubernetes.audit
      <parse>
        @type json
      </parse>
    </source>
    <filter kubernetes.audit>
      @type record_transformer
      <record>
        cluster_name production
      </record>
    </filter>
    <match kubernetes.audit>
      @type elasticsearch
      host elasticsearch.logging.svc
      port 9200
      index_name kubernetes-audit
    </match>

8. etcd Security

Encrypt etcd Communication

# Verify that etcd answers over TLS with client certificate authentication
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Restrict etcd Access

# Only the API server should access etcd
firewall-cmd --zone=trusted --add-source=<api-server-ip>
firewall-cmd --zone=public --remove-port=2379/tcp
firewall-cmd --zone=public --remove-port=2380/tcp

Regular etcd Backups with Encryption

Encrypted, offsite backups protect against both data loss and security incidents:

# Create a snapshot (etcdctl itself does not encrypt it)
SNAPSHOT="/backup/etcd-$(date +%Y%m%d).db"
ETCDCTL_API=3 etcdctl snapshot save "$SNAPSHOT" \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key
# Encrypt the snapshot before storing it (writes "$SNAPSHOT.gpg");
# replace the placeholder with the recipient's key ID or email
gpg --encrypt --recipient "<recipient-email>" "$SNAPSHOT"

The Senior Security Mindset

Senior engineers understand that security breaches often result from misconfigurations rather than sophisticated attacks. A single exposed secret, an overprivileged container, or a missing network policy can compromise your entire cluster.

Defense in Depth: Multiple overlapping controls ensure that no single failure compromises the system:

Container image security (minimal base images, vulnerability scanning)
Pod security standards (non-root, read-only filesystem)
Network policies (default deny, explicit allow)
RBAC (least privilege, regular access reviews)
Secrets management (encryption at rest, external stores)

Common Attack Vectors:

Kubernetes Secrets are base64-encoded, not encrypted, so additional protection is required
With no NetworkPolicy in place, all pods can communicate with each other; start with deny-all
Service account tokens are auto-mounted by default; disable them when not needed
Cluster-admin access is often granted too broadly; audit it regularly

Security Audit Checklist

Cluster Level

[ ] RBAC enabled and properly configured
[ ] Anonymous authentication disabled
[ ] API server audit logging enabled
[ ] etcd encrypted and access restricted
[ ] Cluster certificates rotated regularly
[ ] Kubernetes version up to date

Network Level

[ ] Default deny network policies in place
[ ] Service mesh with mTLS enabled
[ ] Ingress uses TLS/SSL
[ ] No services exposed with NodePort unnecessarily
[ ] External access through ingress controller only
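
The last three items can be illustrated with a single Ingress that terminates TLS in front of a ClusterIP Service. This is a sketch under stated assumptions: the ingress class, hostname, TLS Secret name, and Service name are all placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend   # hypothetical name
  namespace: production
spec:
  ingressClassName: nginx          # assumed ingress controller class
  tls:
    - hosts:
        - app.example.com          # placeholder hostname
      secretName: frontend-tls     # assumed Secret of type kubernetes.io/tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend     # assumed ClusterIP Service, not NodePort
                port:
                  number: 80
```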

Pod Level

[ ] Pods run as non-root
[ ] Read-only root filesystem
[ ] No privileged containers
[ ] Resource limits defined
[ ] Security contexts configured
[ ] Service account tokens not auto-mounted
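
For the resource limits item, a LimitRange gives containers default values when a manifest omits them. A minimal sketch follows; the object name and the numbers are assumptions to tune per workload.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits   # hypothetical name
  namespace: production
spec:
  limits:
    - type: Container
      default:            # applied as limits when a container sets none
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:     # applied as requests when a container sets none
        cpu: "100m"
        memory: "128Mi"
```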

Image Level

[ ] Images scanned for vulnerabilities
[ ] Base images from trusted sources
[ ] No latest tags in production
[ ] Image pull policy set to Always
[ ] Image signing verified

Secret Management

[ ] Secrets encrypted at rest
[ ] External secret management (Vault, etc.)
[ ] Secrets not in environment variables
[ ] Regular secret rotation
[ ] No secrets in container images
