Kubernetes Default-Deny Egress Stops Pod Exfiltration

Run a default-deny egress NetworkPolicy to stop a compromised pod from phoning home. Two manifests, one DNS gotcha that breaks it, and the CNI fix.

Every Kubernetes cluster ships with the same quiet default: a pod can reach anything on the internet the second it starts. Ingress gets all the scrutiny, because that is where an attacker knocks. Egress, the traffic leaving your pods, is treated as plumbing. That treatment is the bug.

The default nobody audits

The upstream NetworkPolicy documentation is blunt about it. A pod is non-isolated for egress until some policy that lists Egress in its policyTypes selects it, and until that happens every outbound connection is allowed. No egress policy, no limit.

Play that out. One pod, freshly scheduled, can open a socket to your database, to the cloud metadata endpoint at 169.254.169.254, to your secrets store, and to any host on the public internet, with nothing standing in the path. Most teams write an ingress policy, feel covered, and never notice that the outbound direction is wide open. Ingress tells you who may reach the pod. It says nothing about where the pod may reach.
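
To make that concrete, here is a minimal sketch of the kind of ingress-only policy many teams stop at. The policy name, the app: web and app: frontend labels, and port 8080 are hypothetical. Because policyTypes lists only Ingress, the selected pods stay non-isolated for egress and can still connect anywhere.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-ingress   # hypothetical name
  namespace: build
spec:
  podSelector:
    matchLabels:
      app: web                   # hypothetical label
  policyTypes:
  - Ingress                      # Egress is not listed, so outbound stays unrestricted
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend          # hypothetical label
    ports:
    - protocol: TCP
      port: 8080
```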

Why the 2026 worms need that open door

The supply-chain worms this year made the cost concrete, and they did it without a single clever exploit. The Shai-Hulud family and the copycats trailing it arrive as a trusted dependency. An npm or PyPI package you already pull. It runs at install time, inside your build pod, harvests whatever credentials are in reach, and then opens an outbound connection to carry them out.

Per Microsoft's May 20, 2026 writeup of the Mini Shai-Hulud variant that hit the @antv packages, the payload scraped CI/CD credentials off the runner and exfiltrated them, then propagated through publishing workflows. Datadog Security Labs documented the 2.0 variant going further: multi-platform credential theft spanning GitHub, AWS, Vault, npm, Kubernetes, and 1Password; GitHub Actions runner memory scraping; and dual-channel exfiltration that included writes to public GitHub dead-drop repositories.

Look at the shape of it. Code execution is the entry. But the theft only becomes a breach at the moment of exfiltration, when the payload dials out. And that dial-out runs, by default, completely unblocked.

Containment, not prevention. Say it out loud.

Default-deny egress will not stop the malicious package from executing. It will not stop the token from being read out of memory. Anyone who sells it as prevention is lying to you.

What it does is turn "compromised pod" into "compromised pod that cannot phone home." It shrinks the blast radius. The stolen token exists, sitting in a process, with nowhere to go. That is a containment control, and containment is worth a great deal when prevention has already failed at the dependency layer, which is exactly where these worms operate. Honest framing matters here because the wrong framing gets the control ripped out the first time someone points out it "didn't stop the malware."

DNS is the first thing you break

Here is where most rollouts die. You apply default-deny egress, and within seconds pods cannot resolve names. Nearly everything fails, and it fails in a way that is genuinely miserable to debug, because the app logs say "connection timeout," not "your NetworkPolicy ate my DNS lookup."

The deny itself is two lines of intent:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: build
spec:
  podSelector: {}
  policyTypes:
  - Egress

An empty podSelector: {} selects every pod in the namespace, policyTypes: [Egress] isolates them for outbound, and the absence of any egress: rule means nothing is permitted. Everything out is denied.

That is why the DNS allow has to ship in the same change, never as a follow-up:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: build
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

Deploy the deny and the DNS allow together, in one commit. Split them across two commits and you buy yourself a phantom outage and a panicked revert.
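
The allow above opens port 53 to every pod in kube-system. A tighter replacement, sketched here on the assumption that your cluster DNS pods carry the conventional k8s-app: kube-dns label (CoreDNS deployments usually do, but verify on your cluster), pairs the namespace selector with a pod selector. If you run NodeLocal DNSCache, pods resolve against a node-local address instead, and you would need an ipBlock rule for that address rather than this selector.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: build
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    # namespaceSelector and podSelector in the SAME list item are ANDed:
    # only DNS pods inside kube-system match.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns      # assumed label; verify on your cluster
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Watch the indentation: putting podSelector in its own list item (with a leading dash) would turn the AND into an OR and allow port 53 to any pod in the build namespace that carries that label.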

Native NetworkPolicy cannot match a domain name

Once DNS works, the next wall is real and it is a limitation, not a preference. The built-in policy ipBlock selector matches CIDR ranges only. There is no FQDN support in upstream NetworkPolicy. None.

So if the honest egress list for your build pod is "must reach api.github.com and one S3 bucket," you cannot express that in native policy. You are reduced to enumerating IP ranges that GitHub and AWS rotate without telling you. That does not scale, and a stale CIDR allowlist fails in both directions: it blocks legitimate traffic when the ranges shift, and it silently permits whatever new tenant moved into an old range.
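
For illustration, this is roughly what the native workaround looks like. The CIDRs are reserved documentation ranges standing in for whatever the provider publishes today, not real GitHub or AWS addresses, and the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-cidrs     # hypothetical name
  namespace: build
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 192.0.2.0/24       # placeholder (TEST-NET-1), not a real provider range
    - ipBlock:
        cidr: 198.51.100.0/24    # placeholder (TEST-NET-2), not a real provider range
    ports:
    - protocol: TCP
      port: 443
```

Every entry is a snapshot of someone else's network, and nothing in the manifest records which service it was meant to allow.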

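This is the point where the CNI stops being an implementation detail and becomes part of the security control. Cilium's DNS-aware policy runs a DNS proxy that watches lookups and programs egress rules for the resolved addresses, so you write intent instead of arithmetic. The proxy only sees lookups that a DNS rule sends through it, so the policy carries that rule alongside the toFQDNs entry:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-github-egress
  namespace: build
spec:
  endpointSelector: {}
  egress:
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - {port: "53", protocol: ANY}
      rules:
        dns:
        - matchPattern: "*"
  - toFQDNs:
    - matchName: "api.github.com"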

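Calico offers domain-based egress rules as well, but check your edition before you count on it: domain matching has historically been a Calico Enterprise and Calico Cloud feature, not part of open-source Calico. If your CNI cannot do FQDN egress, that is now a security decision you are making, not just a networking one, and it should be on the record as such.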
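
A fuller sketch for the "api.github.com and one S3 bucket" case might look like the following. The policy name and the bucket hostname are hypothetical, and the kube-dns labels assume a conventional CoreDNS deployment. As with any toFQDNs policy, DNS has to pass through Cilium's proxy for the names to resolve into rules, hence the first egress entry. The toPorts block on the second entry narrows the allow to HTTPS.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-build-egress       # hypothetical name
  namespace: build
spec:
  endpointSelector: {}
  egress:
  # DNS lookups must pass through Cilium's DNS proxy for toFQDNs to learn IPs.
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # HTTPS only, and only to the two named hosts.
  - toFQDNs:
    - matchName: "api.github.com"
    - matchName: "example-build-artifacts.s3.us-east-1.amazonaws.com"  # hypothetical bucket
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
```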

The strongest objection, and the answer

The best counterargument against all of this is real, so I will state it properly rather than knock down a strawman. A determined attacker can tunnel exfiltration through a domain you have already allowlisted. GitHub, for instance. And the Shai-Hulud 2.0 dead-drop-to-public-repo channel is precisely that: the data leaves through github.com, a host your build pod is supposed to reach.

True. It does not stop that path. But "does not stop everything" is not "does not help," and the gap between those two is the whole value. Forcing exfiltration through a narrow, logged, allowlisted set of destinations does two things at once. It kills the long tail of arbitrary attacker-controlled endpoints outright, and it converts the remaining attempts into something you can alert on. A control that turns invisible theft into a policy-violation log entry has earned its place, even though it cannot cover the case where the thief hides inside your own approved traffic.
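
As one possible shape for that alert, here is a sketch of a Prometheus Operator rule over Cilium's Hubble drop metric. It rests on several assumptions: that Hubble metrics are enabled with the drop metric, that source_namespace has been added to its labelsContext (it is not a default label), and that the Prometheus Operator CRDs are installed. The rule name and the monitoring namespace are hypothetical. The metric counts drops in both directions, so this fires on any policy-denied flow originating from the build namespace.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: build-policy-denies      # hypothetical name
  namespace: monitoring          # hypothetical namespace
spec:
  groups:
  - name: egress-policy
    rules:
    - alert: BuildNamespacePolicyDenied
      # Assumes hubble_drop_total carries a source_namespace label (labelsContext).
      # The regex tolerates both "Policy denied" and "POLICY_DENIED" reason values.
      expr: |
        sum by (source_namespace) (
          rate(hubble_drop_total{source_namespace="build", reason=~"(?i)policy.denied"}[5m])
        ) > 0
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "Pods in the build namespace are hitting network policy denies"
```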

Where to start, on your build namespaces first

Do these in order. Each step ties to something specific above.

1. Pick one build namespace in staging, not the whole cluster. Egress breaks apps in ways ingress never does, so apply default-deny to a single workload's namespace and let it soak for a few days before you touch anything production-shaped.
2. Ship the deny and the DNS allow in the same commit. Your first two objects are default-deny-egress and allow-dns-egress on UDP/TCP 53 to kube-system. Deploy them as one change. Separating them is the single most common reason a rollout gets reverted in a panic.
3. Lock down the pods that run untrusted code before your app pods. CI runners, dependency-update bots like Renovate, and AI-agent sandboxes execute third-party code on every build, and their legitimate egress list is short, which makes them both the highest-value target and the easiest to scope. Start there.
4. Reach for FQDN policy the moment you need a real external service. Use Cilium toFQDNs or Calico domain rules instead of hand-built CIDR lists. If your CNI cannot do it, record that as an accepted security gap; do not paper over it with a static IP allowlist that will rot.
5. Alert on the deny, do not just enforce it. Native NetworkPolicy emits no logs of its own, so the signal has to come from your CNI's flow logs or drop metrics. Set the trigger concretely: a pod that has never made an outbound connection suddenly generating egress denies to an unknown host is a possible install-time payload. Pull it for inspection. That denied connection is your earliest and cheapest signal that a dependency went bad, and it costs very little to watch for once the policy and flow logging are live.
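
When runners share a namespace with other workloads, the runner-first step above can be sketched as a label-scoped deny rather than a namespace-wide one. The policy name and the role: ci-runner label are hypothetical; substitute whatever labels your runner pods actually carry.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ci-runner-deny-egress    # hypothetical name
  namespace: build
spec:
  podSelector:
    matchLabels:
      role: ci-runner            # hypothetical label; use whatever your runners carry
  policyTypes:
  - Egress
  # No egress rules: selected pods may open no outbound connections
  # until separate allow policies (DNS first) are added.
```

Policies are additive, so the DNS allow and any FQDN allows still apply to these pods as long as their selectors match them.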

Egress default-deny is not new and it is not clever. It is the boring control that the 2026 worms quietly bet you never enabled. Prove them wrong on your build namespaces, this week, and work outward from there.