Configuration
The agent is configured in two layers: a YAML filter file (what internal
traffic to suppress) and a set of environment variables (inspection depth,
output, metrics, enforcement). With the Helm chart these are set via
helm/ebfw/values.yaml.
Filtering
Filters are loaded from a YAML file (-config / EBFW_CONFIG, e.g. a mounted
ConfigMap). Lists left unset fall back to built-in defaults.
cgroup: /sys/fs/cgroup
exclude:
  cidrs:            # suppress these destination IPs (cluster/private/loopback)
    - 10.0.0.0/8
    - 172.16.0.0/12
    - 192.168.0.0/16
    - 127.0.0.0/8
  domainSuffixes:   # suppress DNS/SNI/HTTP for these suffixes
    - cluster.local
    - svc
    - in-addr.arpa
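The suppression rules above can be illustrated with a short Python sketch. It is not the agent's code: the function names are hypothetical, and label-aligned suffix matching (so that svc matches kubernetes.default.svc but not notsvc) is an assumption, since this section does not say how suffixes are compared.

```python
import ipaddress

# Values copied from the example filter file above.
EXCLUDE_CIDRS = [ipaddress.ip_network(c) for c in
                 ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")]
EXCLUDE_SUFFIXES = ("cluster.local", "svc", "in-addr.arpa")


def ip_excluded(ip: str) -> bool:
    """True if the destination IP falls inside any excluded CIDR."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in EXCLUDE_CIDRS)


def domain_excluded(name: str) -> bool:
    """True if the name equals a suffix or ends with '.<suffix>' (assumed semantics)."""
    name = name.rstrip(".").lower()
    return any(name == s or name.endswith("." + s) for s in EXCLUDE_SUFFIXES)
```

With these lists, a connection to 10.1.2.3 or a lookup of kubernetes.default.svc.cluster.local would be suppressed, while 203.0.113.5 and api.example.com would still be reported.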
Environment variables
| Env var | Default | Effect |
|---|---|---|
| EBFW_INSPECT_PATHS | true | capture HTTPS paths via the SSL_write uprobe |
| EBFW_INSPECT_HEADERS | false | also report HTTP request headers |
| EBFW_INSPECT_BODY | false | stub: request-body capture is not implemented |
| EBFW_OUTPUT | text | event format: text or json (one object per line) |
| EBFW_METRICS_ADDR | :9090 | Prometheus /metrics listen address (empty disables) |
| EBFW_NODE_NAME | (unset) | this node’s name (set via the downward API); scopes the pod informer |
| EBFW_ENFORCE_MODE | off | egress enforcement: off / log / enforce (see below) |
| EBFW_POLICY_SOURCE | file | where policy comes from: file (the EBFW_POLICY YAML) or crd (watch the EgressPolicy + ClusterEgressPolicy CRDs) |
| EBFW_POLICY | (unset) | path to the egress policy YAML (used when EBFW_POLICY_SOURCE=file) |
| EBFW_ENFORCE_DRY_RUN | false | in enforce mode, program the datapath but suppress drops (canary) |
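As a pre-deploy sanity check, the documented defaults and value sets can be encoded in a few lines of Python. This is only a sketch: check_env is a hypothetical helper, not part of ebfw, and the cross-check that a file-sourced policy needs EBFW_POLICY once the mode is not off is an assumption drawn from the table, not confirmed agent behavior.

```python
import os

# Defaults copied from the table above; None stands for "(unset)".
DEFAULTS = {
    "EBFW_INSPECT_PATHS": "true",
    "EBFW_INSPECT_HEADERS": "false",
    "EBFW_INSPECT_BODY": "false",
    "EBFW_OUTPUT": "text",
    "EBFW_METRICS_ADDR": ":9090",
    "EBFW_NODE_NAME": None,
    "EBFW_ENFORCE_MODE": "off",
    "EBFW_POLICY_SOURCE": "file",
    "EBFW_POLICY": None,
    "EBFW_ENFORCE_DRY_RUN": "false",
}

# Documented value sets for the enumerated variables.
ALLOWED = {
    "EBFW_OUTPUT": {"text", "json"},
    "EBFW_ENFORCE_MODE": {"off", "log", "enforce"},
    "EBFW_POLICY_SOURCE": {"file", "crd"},
}


def check_env(environ=None):
    """Return (effective settings, list of problems) for a proposed environment."""
    environ = os.environ if environ is None else environ
    settings = {key: environ.get(key, default) for key, default in DEFAULTS.items()}
    problems = []
    for key, allowed in ALLOWED.items():
        if settings[key] not in allowed:
            problems.append(f"{key}={settings[key]!r} is not one of {sorted(allowed)}")
    # Assumption: with the file source, a policy path is needed once mode != off.
    if (settings["EBFW_ENFORCE_MODE"] != "off"
            and settings["EBFW_POLICY_SOURCE"] == "file"
            and not settings["EBFW_POLICY"]):
        problems.append("EBFW_POLICY is unset but the policy source is file")
    return settings, problems
```

Calling check_env({}) returns the defaults with no problems; passing {"EBFW_ENFORCE_MODE": "enforce"} alone reports the missing policy path.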
Enforcement
The agent evaluates an egress policy (allow/deny per pod by
domain/IP/CIDR/port) — either a YAML file (EBFW_POLICY, see
examples/policy.yaml) or the
EgressPolicy CRDs. EBFW_ENFORCE_MODE selects the behavior:
- off (default): observe-only; policy ignored.
- log: evaluate each connection-level event and annotate it with the verdict (action=deny rule=… in text, "action"/"rule" in JSON) without dropping anything. A safe dry-run to validate a policy against live traffic.
- enforce: drop denied egress. Denied IPv4 TCP connect() fails fast with EPERM (the cgroup/connect4 hook); anything else denied is dropped at the cgroup_skb/egress hook (the SYN is dropped, so the connection times out). EBFW_ENFORCE_DRY_RUN=true programs the datapath and stamps verdicts but suppresses the drop, as a canary.
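When validating a policy in log mode with EBFW_OUTPUT=json, the annotated verdicts can be tallied per rule. The sketch below relies only on the "action" and "rule" keys named above; it assumes the JSON verdict value is the string deny, as in the text format, and counts events without a rule under a placeholder. The function name is hypothetical.

```python
import json
from collections import Counter


def tally_denies(lines):
    """Count deny verdicts per rule from JSON event lines (one object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event line
        if not isinstance(event, dict):
            continue
        # Assumption: the verdict is spelled "deny", matching the text output.
        if str(event.get("action", "")).lower() == "deny":
            counts[event.get("rule", "<no rule>")] += 1
    return counts
```

Feeding it the agent's output (for example, by iterating over sys.stdin) gives a quick view of which rules would start dropping traffic if the mode were switched to enforce.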
What enforce drops today: per-pod (or node-global) IP/CIDR and CIDR+port
rules, the default posture, and domain rules. For domain rules, a
cgroup_skb/ingress hook captures DNS answers and the agent programs the
resolved IPs into the verdict map, so a domain-blocked connection’s SYN is
dropped. (A domain-blocked flow shows only a CONNECT with action=deny; the SYN
never completes, so there’s no TLS event carrying the SNI/rule name.)
Port-only, L7 (method, path), and IPv6 rules are evaluated for log/metrics but
not yet dropped; the agent logs how many dimensions it couldn’t program.
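The way a domain rule becomes an IP-level drop can be modeled in user space. The class below is a simplified stand-in, not the agent's datapath: VerdictMap and its methods are hypothetical names, only Deny domain rules are modeled, and fnmatch-style globbing is assumed for domain patterns.

```python
from fnmatch import fnmatchcase


class VerdictMap:
    """User-space stand-in for the datapath's IP verdict map (hypothetical)."""

    def __init__(self, deny_domains):
        # deny_domains: {glob: rule name}, e.g. {"evil.com": "block-evil"}
        self.deny_domains = deny_domains
        self.by_ip = {}  # destination IP -> name of the rule that denies it

    def on_dns_answer(self, qname, ips):
        """Program the resolved IPs of a denied name; return the rule name or None."""
        name = qname.rstrip(".").lower()
        for glob, rule in self.deny_domains.items():
            if fnmatchcase(name, glob.lower()):
                for ip in ips:
                    self.by_ip[ip] = rule
                return rule
        return None

    def on_connect(self, dst_ip):
        """Return ('deny', rule) if the destination was learned from a denied name."""
        rule = self.by_ip.get(dst_ip)
        return ("deny", rule) if rule else ("allow", None)
```

In this model a destination is denied only after a DNS answer for the blocked name has been observed, which matches the description of verdicts being programmed from captured answers.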
Policy is hot-reloaded on file change (a bad reload is logged and ignored, keeping the last good policy). Evaluate a policy offline, no kernel needed:
ebfw policy test --policy examples/policy.yaml \
  --flow 'pod=payments/web dst=203.0.113.5 port=443 domain=api.example.com' \
  --flow 'domain=evil.com port=443'
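The keep-last-good reload behavior described above can be sketched as a small holder object. This is an illustration under stated assumptions: PolicyHolder is a hypothetical name, the parser is injected, and json.loads stands in for a YAML parser only because the Python standard library does not ship one.

```python
import json


class PolicyHolder:
    """Keeps the last successfully parsed policy across reload attempts."""

    def __init__(self, parse):
        self._parse = parse      # callable: text -> policy; raises on bad input
        self.current = None      # last good policy, or None before the first load
        self.last_error = None   # error from the most recent failed reload

    def reload(self, text):
        """Swap in the new policy if it parses; otherwise keep the current one."""
        try:
            candidate = self._parse(text)
        except Exception as exc:  # a bad reload is recorded and ignored
            self.last_error = exc
            return False
        self.current = candidate
        self.last_error = None
        return True


holder = PolicyHolder(json.loads)
holder.reload('{"defaultAction": "Deny"}')  # accepted
holder.reload('{broken')                    # rejected; previous policy stays active
```

The key design point is that the candidate is fully parsed before the swap, so a failed reload never leaves a half-applied policy.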
Modify rules (header injection / path rewrite) are accepted and shown by
policy test, but are not enforced — that datapath (a terminating proxy + TLS
MITM) is a deferred, opt-in feature; the cgroup/connect datapath treats Modify
as Allow.
Policy file format
The EBFW_POLICY file is the policy model written at the top level — there is
no apiVersion / kind / metadata envelope (that is CRD-only). The same
examples/policy.yaml
fixture is what policy test uses.
defaultAction: Deny            # posture when a flow matches no rule; OPTIONAL,
                               # default Allow (blocklist). Deny = allowlist.
podSelector:                   # OPTIONAL: scope the WHOLE file to these pods
  matchLabels: { app: web }    # (label-based; needs the Pods informer, so it
                               # no-ops off-cluster; see Pod attribution)
rules:                         # evaluated in order; FIRST match wins
  - name: allow-github         # label for logs/metrics
    action: Allow              # Allow | Deny | Modify
    match:                     # AND across dimensions; an absent one matches any,
                               # a list is OR within (any domain / any port)
      pod:                     # per-rule source selector (file source only)
        namespace: payments
        labels: { team: platform }
        matchExpressions:      # In / NotIn / Exists / DoesNotExist
          - { key: tier, operator: In, values: ["web"] }
      domains: ["github.com", "*.githubusercontent.com"]   # DNS qname / SNI / Host globs
      cidrs: ["203.0.113.0/24"]   # destination IP ranges (IPv4 enforced; IPv6 logged)
      ports: [443]                # destination ports, 1..65535
      methods: ["GET"]            # L7: evaluated for log/metrics, not dropped yet
      pathPrefix: "/api"          # L7: evaluated for log/metrics, not dropped yet
    mutations:                 # required iff action: Modify (modeled, not enforced)
      - { type: SetHeader, header: X-Egress-Checked, value: ebfw }
      # type: SetHeader | AddHeader | RemoveHeader | RewritePath
      # header/value for the header types; pathReplace for RewritePath
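The matching rules stated in the comments above (first match wins, AND across dimensions, OR within a list, an absent dimension matches anything, default Allow) can be sketched in Python. This is a reading of the documented semantics, not the agent's evaluator: the pod selector is left out, fnmatch-style globbing (where * may span dots) is assumed for domains, and the flow keys domain, dst, port, method, and path are hypothetical names.

```python
import ipaddress
from fnmatch import fnmatchcase


def _matches(match, flow):
    """AND across the dimensions present in match; OR within each list."""
    if "domains" in match:
        domain = (flow.get("domain") or "").lower()
        if not any(fnmatchcase(domain, g.lower()) for g in match["domains"]):
            return False
    if "cidrs" in match:
        dst = flow.get("dst")
        if dst is None:
            return False
        addr = ipaddress.ip_address(dst)
        if not any(addr in ipaddress.ip_network(c) for c in match["cidrs"]):
            return False
    if "ports" in match and flow.get("port") not in match["ports"]:
        return False
    if "methods" in match and flow.get("method") not in match["methods"]:
        return False
    if "pathPrefix" in match and not (flow.get("path") or "").startswith(match["pathPrefix"]):
        return False
    return True


def evaluate(policy, flow):
    """Return (action, rule name); the name is None when defaultAction applies."""
    for rule in policy.get("rules", []):
        if _matches(rule.get("match", {}), flow):
            return rule["action"], rule.get("name")
    return policy.get("defaultAction", "Allow"), None


# A two-rule allowlist; the second rule name is made up for the example.
policy = {
    "defaultAction": "Deny",
    "rules": [
        {"name": "allow-github", "action": "Allow",
         "match": {"domains": ["github.com", "*.githubusercontent.com"], "ports": [443]}},
        {"name": "allow-docs-net", "action": "Allow",
         "match": {"cidrs": ["203.0.113.0/24"]}},
    ],
}
```

Under this policy, a flow to raw.githubusercontent.com on port 443 hits allow-github, the same domain on port 80 falls through to the Deny default, and a flow to 203.0.113.5 matches allow-docs-net regardless of its domain.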
It is the same rule/match model as the CRD spec (see egresspolicy.md for full
field semantics and the enforced-vs-logged breakdown), with two differences:
- No envelope: fields sit at the document root, not under spec:.
- Per-rule match.pod: the file source lets each rule carry its own source pod selector (namespace / name / uid / labels / matchExpressions). The CRD has no per-rule pod field; it uses the single top-level spec.podSelector instead.
defaultAction and podSelector are optional in the file source (the pure model
is permissive); on the CRDs podSelector is required by the schema.
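The differences listed above imply a simple check for whether a file-source policy could be placed under a CRD spec unchanged. The helper below is a hypothetical sketch that encodes only what this section states (podSelector is required on the CRDs, and the CRD has no per-rule match.pod); it does not validate against the real CRD schema.

```python
def crd_blockers(file_policy):
    """List reasons a file-source policy cannot be moved under a CRD spec as-is."""
    reasons = []
    if "podSelector" not in file_policy:
        reasons.append("podSelector is required by the CRD schema")
    for index, rule in enumerate(file_policy.get("rules", [])):
        if "pod" in rule.get("match", {}):
            name = rule.get("name", f"#{index}")
            reasons.append(f"rule {name} uses match.pod, which the CRD does not have")
    return reasons
```

An empty result means the document root could be nested under spec: without losing information, at least as far as these two differences are concerned.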
Pod attribution
When the agent runs in-cluster it watches Pods on its own node (a spec.nodeName
field selector) to map pod UID → namespace/name, so it needs RBAC to
get/list/watch pods (granted by the chart) and EBFW_NODE_NAME from the
downward API. This enrichment is best-effort: with no in-cluster config, or
before the informer has synced, events still carry the node-local identity (pod
UID, container id, QoS) — only the human-readable namespace/name is absent.
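The best-effort enrichment described above amounts to a lookup that never blocks or discards an event. A minimal sketch, assuming a hypothetical pod_uid event key and an in-memory UID table standing in for the informer cache:

```python
from collections import Counter

# Mirrors the hit/miss split reported by ebfw_attribution_total{result}.
attribution = Counter()


def enrich(event, pods_by_uid):
    """Add namespace/name when the pod UID is known; otherwise return the event as-is."""
    out = dict(event)
    pod = pods_by_uid.get(event.get("pod_uid"))
    if pod is None:
        attribution["miss"] += 1  # informer not synced, or running off-cluster
        return out
    out["namespace"], out["name"] = pod
    attribution["hit"] += 1
    return out
```

An event whose UID is missing from the table keeps its node-local identity and simply lacks the namespace and name fields.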
Metrics
/metrics (default :9090, on hostNetwork so reachable on the node) exposes:
ebfw_events_total{kind}, ebfw_filtered_total{kind},
ebfw_attribution_total{result} (hit/miss), ebfw_uprobe_attached, and — when
enforcement is enabled — ebfw_enforcement_decisions_total{action,mode} and
ebfw_policy_rules. Labels are deliberately low-cardinality — pod identity lives
in the event lines, not in metric labels.
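For quick checks without a full Prometheus setup, the text exposition served at /metrics can be parsed with a few lines of Python. The sketch handles only simple samples (no escaped quotes or braces inside label values), and the label values in the usage note, such as kind="dns" or mode="log", are illustrative assumptions rather than a documented list.

```python
import re
from collections import defaultdict

_SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse simple exposition text into {metric name: {label pairs: value}}."""
    out = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        m = _SAMPLE.match(line)
        if not m:
            continue
        labels = tuple(sorted(_LABEL.findall(m.group("labels") or "")))
        out[m.group("name")][labels] = float(m.group("value"))
    return out


def select(metrics, name, **want):
    """Sum the samples of one metric whose labels include all requested pairs."""
    return sum(value for labels, value in metrics.get(name, {}).items()
               if all((k, v) in labels for k, v in want.items()))
```

For example, select(m, "ebfw_enforcement_decisions_total", action="deny") sums deny decisions across modes, and select(m, "ebfw_events_total") gives the total across all kinds.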
Limitations
- IPv6 extension headers are not parsed. The packet monitor handles both IPv4 and IPv6, but an IPv6 packet whose next-header is not TCP/UDP directly (a hop-by-hop, routing, fragment, or destination-options header) is skipped rather than walked. These are rare on normal egress.
- HTTPS paths need OpenSSL-dynamic. Statically-linked TLS has no libssl.so to hook, notably Go (crypto/tls, often stripped), Java, rustls.
- Single segment: DNS/TLS/HTTP are parsed from the first packet/segment only; larger ClientHellos/requests are truncated. Reassembly is future work.
- Request bodies are a stub (EBFW_INSPECT_BODY does nothing yet).
- TLS 1.3 ECH encrypts the SNI; such connections show only the dst IP.
- De-dup of connections is in-memory and unbounded — fine for a node agent, not tuned for very long runs.
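The IPv6 extension-header limitation in the first bullet can be made concrete with a sketch that inspects only the fixed 40-byte header, which is the behavior described there. The function is hypothetical and written in Python for illustration; the protocol numbers are the standard IANA values.

```python
TCP, UDP = 6, 17

# IANA protocol numbers for the extension headers named in the limitation.
EXTENSION_HEADERS = {0: "hop-by-hop", 43: "routing", 44: "fragment",
                     60: "destination-options"}


def classify_ipv6(packet: bytes) -> str:
    """Return 'tcp', 'udp', or 'skipped:<reason>' for a raw IPv6 packet."""
    if len(packet) < 40 or packet[0] >> 4 != 6:
        return "skipped:not-ipv6"
    next_header = packet[6]  # Next Header field of the fixed IPv6 header
    if next_header == TCP:
        return "tcp"
    if next_header == UDP:
        return "udp"
    # No walking of the extension-header chain: anything else is skipped.
    return "skipped:" + EXTENSION_HEADERS.get(next_header, f"protocol-{next_header}")
```

A packet whose TCP segment sits behind a fragment header (next-header 44) is therefore classified as skipped even though it carries TCP further down the chain.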