Kubernetes Network Security: Why Default-Deny, Service Meshes, and Zero Trust Belong Together


Most teams lock the front door of their Kubernetes cluster and leave every internal window open. Network policies get bolted on at the end, if at all. Service-to-service traffic goes unencrypted. “Zero trust” remains a single slide in a security deck.
Breaches happen in exactly that gap between intention and action.

This article walks through how Kubernetes network security works in practice and how it fits into a real zero-trust architecture: default-deny policies, Istio-backed encryption, ingress hardening, and more. Not theory. Concrete configuration decisions that actually make a difference.

The Default-Deny Problem Nobody Talks About Enough

What surprises many people when they first dig into Kubernetes is that, by default, every pod can talk to every other pod. No restrictions. East-west traffic is completely unrestricted.

That’s fine on a laptop. In production, it’s a liability.

With default-allow networking, a compromised logging pod can probe your payment service, reach your database, and exfiltrate data without crossing a single firewall rule. The blast radius of any one hole is the entire cluster.

The fix sounds simple: switch to default-deny. In practice, teams avoid it because it will break things unless you first work out, carefully, what belongs on the allowlist.

Writing a Default-Deny Policy That Doesn’t Break Your Cluster

A NetworkPolicy applies only to the pods matched by its podSelector. A deny-all policy looks like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

The empty podSelector matches every pod in the namespace. Ingress and Egress are both listed as policyTypes, but no rules are defined, so nothing is allowed in or out.

From there you build up gradually. Allow only what’s essential: frontend to backend, backend to database, on specific ports only.
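As a sketch of what one of those incremental allow rules might look like, assume a backend deployment labeled app: backend that should accept traffic only from pods labeled app: frontend on port 8080 (the labels and port are illustrative, not from any particular setup):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend       # the pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend  # only frontend pods may connect
    ports:
    - protocol: TCP
      port: 8080         # and only on this port

Because policies are additive, this rule layers cleanly on top of the deny-all policy above: the deny-all stays in place, and each allow rule carves out one deliberate exception.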

Having run this process on medium-sized microservices portfolios, the most important lesson I can pass on is: map your namespaces first. If you don’t know your traffic patterns before writing policy, you’ll waste hours chasing the wrong services.

Here are some points to bear in mind:

  • NetworkPolicy requires a CNI plugin that actually enforces it: Calico, Cilium, or Weave. The default kubenet plugin does not.
  • Policies are additive. All policies selecting the same pod combine their allow rules.
  • Egress rules are commonly overlooked, but they matter, particularly for controlling which external endpoints your pods can reach (a sketch follows below).
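To illustrate that last point, here is a minimal egress sketch: it lets the selected pods resolve DNS and reach one external HTTPS endpoint, and nothing else. The workload label and the CIDR are hypothetical placeholders:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: reporting-worker      # hypothetical workload
  policyTypes:
  - Egress
  egress:
  - ports:                       # allow DNS lookups to any resolver
    - protocol: UDP
      port: 53
  - to:
    - ipBlock:
        cidr: 203.0.113.10/32    # one approved external API (example address)
    ports:
    - protocol: TCP
      port: 443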

This is part of a solid Kubernetes Security posture, but it isn’t all of it.

What Istio Actually Does for Your Network (And What It Doesn’t)

NetworkPolicies operate at the level of IPs and ports. They understand nothing about HTTP methods, JWTs, or service identity. That’s where a service mesh comes in.

The most popular mesh is Istio, and its main network-security strength is mutual TLS (mTLS) between all services. The sidecar proxies handle certificate rotation automatically. Pods encrypt traffic in transit and verify each other’s identity before a connection is established.

Enabling mTLS Cluster-Wide Without Breaking Everything

Istio’s PeerAuthentication resource controls mTLS behavior. In STRICT mode, unencrypted traffic is rejected outright.

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT

The goal is to apply this mesh-wide. But strict mode breaks anything not yet in the mesh: legacy applications, third-party tools.

The migration path: start PERMISSIVE (accept both encrypted and plaintext traffic), move services into the mesh gradually, then switch to STRICT.

In my experience, the migration phase is where most teams get stuck. Permissive mode lingers for months because nobody owns the cutover. It is worth budgeting an explicit rollout window.
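One way to make that rollout concrete, as a sketch: keep the mesh-wide default STRICT but carve out a PERMISSIVE exception for a namespace that still hosts legacy workloads, then delete the exception when migration finishes. The namespace name here is illustrative:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: legacy-permissive
  namespace: legacy-apps    # hypothetical namespace still migrating
spec:
  mtls:
    mode: PERMISSIVE        # accepts both mTLS and plaintext

The namespace-scoped resource takes precedence over the mesh-wide default, so the exception is visible, auditable, and trivially deletable, which gives the cutover a concrete artifact someone can own.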

Authorization Policies – The Part Most Tutorials Skip

mTLS gives you encryption and identity. You still have to define what the services you’ve authenticated are allowed to do.

That’s the job of Istio’s AuthorizationPolicy resource:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payment-service-policy
  namespace: production
spec:
  selector:
    matchLabels:
      app: payment-service
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/production/sa/checkout-service"]
    to:
    - operation:
        methods: ["POST"]
        paths: ["/api/v1/charge"]

This is fine-grained control. The payment service accepts POST requests on one specific path, and only from the checkout service’s service account. Everything else is denied, even when it’s encrypted.

Policy at this level, combined with Container Security practices at the pod level, is the beginning of real defense in depth.

Ingress Is Still Your Most Exposed Surface

East-west traffic inside the cluster gets plenty of attention. Your real exposure is often north-south: ingress traffic arriving from the outside.

A Kubernetes Ingress routes external HTTP/HTTPS traffic to services. But most of the security configuration lives not in the Ingress resource itself but in the ingress controller, which might be NGINX, Traefik, or the native AWS/GCP/Azure controllers.

A few things need to be in place at the edge:

Terminate TLS at the controller – Don’t let unencrypted traffic flow from the ingress point into your cluster. Terminate TLS at the controller, and re-encrypt if you run mTLS internally.

Rate limiting – Most ingress controllers support it natively. Without it, a misconfigured client or a bot can saturate your services.

Web Application Firewall integration – For public-facing APIs, a WAF in front of your API layer can intercept common attack patterns before they reach your application code.

Source-IP allowlisting (where it fits) – Use nginx.ingress.kubernetes.io/whitelist-source-range or a similar annotation for ingresses that should only accept traffic from known source IPs (internal tools, or traffic from a particular region). A combined sketch follows below.
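Pulling those controls together, here’s a hedged sketch of an NGINX ingress with TLS termination, a basic rate limit, and a source-IP allowlist. The hostname, secret name, CIDRs, and backing service are placeholders, and annotation support varies by controller version, so verify against your NGINX ingress controller’s documentation:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "20"   # requests per second per client IP
    nginx.ingress.kubernetes.io/whitelist-source-range: "10.0.0.0/8,192.168.0.0/16"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.example.com
    secretName: api-tls-cert    # TLS terminates here, at the controller
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-gateway   # hypothetical backing service
            port:
              number: 8080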

I’ve noticed that teams using cloud-managed ingress controllers tend to assume the provider handles the security config. Generally, it doesn’t. These controls have to be added explicitly.

My Take on Zero Trust – It’s an Architecture, Not a Feature

You see the phrase “zero trust” everywhere these days. It boils down to one idea: never trust based on network location. Not because a request comes from inside the cluster. Not because it comes from the same namespace. Trust has to be established deliberately at every interaction.

Kubernetes gives you the building blocks. Zero trust is how you combine them.
That framing maps neatly onto the four Cs of Kubernetes security: Cloud, Cluster, Container, Code. Zero-trust thinking applies at every layer:

  • Cloud: IAM roles, access control to nodes, KSPM tools monitoring your cloud config
  • Cluster: RBAC, NetworkPolicies, audit logging, admission control
  • Container: Pod Security Admission, read-only filesystems, dropping Linux capabilities
  • Code: Dependency scanning and image signing / SBOM generation

You don’t need all of this on day one of adopting zero trust. It’s a process of making access decisions explicit, logged, and revocable.

Where Kubernetes Network Security Fits in a Zero-Trust Model

The network layer is where enforcement happens. Policies mean nothing unless they’re enforced there.

In a zero-trust Kubernetes design, enforcement looks like this:

  • Identity per workload – access is granted only where required, through narrowly scoped service accounts; workload identity comes from SPIFFE/SPIRE or a cloud identity system.
  • Encryption in transit – everything inside the cluster runs over mTLS via Istio or Linkerd.
  • Policy enforcement – NetworkPolicies at the CNI layer, AuthorizationPolicies at the mesh layer.
  • Continuous verification – audit logs, Falco for runtime anomaly detection, alerts on policy drift.
  • Least-privilege egress – pods reach only the external endpoints they require. Nothing else.

Kubernetes Cluster Hardening supplies the cluster-level controls underneath all of this: API server flags, etcd encryption, kubeconfig security, and node hardening.

Secrets Management – The Gap That Undermines Everything Else

All your network policies and airtight mTLS can be undone by a single database password hardcoded in a ConfigMap.

Secret Management in Kubernetes is where most teams fall short. By default, Secret values are merely base64-encoded, which is not encryption. Anyone with read access to the namespace can decode them in seconds.
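You can see this for yourself in one line. Assuming a hypothetical Secret named db-credentials with a password key, anyone who can read the namespace can recover the value:

# base64 is encoding, not encryption: this "decrypts" nothing, it just decodes
kubectl get secret db-credentials -n production \
  -o jsonpath='{.data.password}' | base64 -d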

Better options:

  • Sealed Secrets – encrypt secrets before they reach the cluster; only the in-cluster controller can decrypt them.
  • External Secrets Operator – fetch secrets at runtime from an external store (AWS Secrets Manager, GCP Secret Manager, or HashiCorp Vault); sketched below.
  • Vault Agent Injector – inject secrets directly into pod filesystems, bypassing environment variables entirely.
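As a sketch of the External Secrets Operator pattern (the store name, key paths, and resource names are assumptions; check the operator’s docs for the API version your installation uses):

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h            # re-sync from the external store hourly
  secretStoreRef:
    name: aws-secrets-manager    # hypothetical SecretStore pointing at AWS
    kind: SecretStore
  target:
    name: db-credentials         # the Kubernetes Secret the operator creates
  data:
  - secretKey: password
    remoteRef:
      key: prod/db               # path in the external store (example)
      property: password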

The pattern that matters: secrets shouldn’t live in the cluster at rest. They’re pulled from an external store at startup, rotated on a schedule, and never checked into version control.

Two Unique Angles Most Articles Don’t Cover

eBPF-based policy enforcement is changing the game

Traditional CNI plugins enforce NetworkPolicies with iptables rules. On large systems that becomes a performance bottleneck and a debugging nightmare.

Cilium takes a different approach, enforcing policy through eBPF, a Linux kernel technology, with significantly lower overhead and far better observability: you can watch connections being allowed or dropped, with identity context, in real time.
On large clusters, or for latency-sensitive workloads, the difference is obvious. I’ve seen iptables rule counts in mid-size clusters reach the tens of thousands. eBPF doesn’t have that problem.
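For instance, with Cilium’s Hubble CLI installed, you can stream policy verdicts live (the namespace here is a placeholder):

# stream flows that Cilium dropped, with source/destination identity context
hubble observe --verdict DROPPED --namespace production --follow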

NetworkPolicy testing is almost never done

Writing a policy is one thing. Verifying it actually enforces what you intended is another. Most teams skip that step.

Use netassert, kubectl-netpol, or simply a test pod running curl inside the cluster to confirm that deny rules actually deny. Policy drift is real: a new deployment, a namespace label change, or a CNI upgrade can silently break enforcement.
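The simplest version of that test is a throwaway pod: run curl from somewhere the policy should block and confirm the connection times out. The pod name, namespaces, and target service below are illustrative:

# run a disposable curl pod in a namespace that should NOT reach payment-service
kubectl run netpol-probe --rm -it --restart=Never -n staging \
  --image=curlimages/curl -- \
  curl -m 5 http://payment-service.production.svc.cluster.local:8080/healthz
# under a working deny policy, expect a timeout, not a 200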

Wrapping Up: Who Needs This and Where to Start

None of these controls is optional once you’re running anything beyond a dev cluster. They’re the difference between a security incident contained to one namespace and a full cluster compromise.

Start with default-deny NetworkPolicies in your most sensitive namespaces. Roll out Istio mTLS in permissive mode and work toward strict. Audit your ingress configuration. Then layer in the remaining zero-trust controls (RBAC, workload identity, runtime detection) one at a time.
The goal isn’t perfection on day one. It’s a cluster where trust is explicit, not assumed.
