Fix bugbunny findings#5282
Open
geyslan wants to merge 9 commits into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens Tracee against multiple security issues reported by bugbunny.ai, focusing on safer filesystem operations (symlink/TOCTOU resistance), safer defaults for network listeners, and robustness against resource-exhaustion and detector-tampering vectors.
Changes:
- Switch artifact/output filesystem operations to traversal-resistant patterns (Go 1.24
os.Root,O_NOFOLLOW, symlink checks) and update call sites accordingly. - Harden network exposure by defaulting HTTP and gRPC TCP listeners to loopback and documenting migration paths (including Helm defaults + NetworkPolicy).
- Add/adjust guards for detector loading (duplicate list names fatal, reject non-regular YAML inputs) and cap argv/envp array decoding to prevent OOM.
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/testutils/tracee_integrated.go | Updates test artifacts output path to new default artifacts dir constant. |
| tests/integration/capture_test.go | Updates integration test capture directory base to new artifacts dir default. |
| pkg/signatures/signature/signature.go | Skips symlinked Go plugins before plugin.Open to reduce symlink attack surface. |
| pkg/server/http/server_test.go | Trivial formatting change (trailing newline). |
| pkg/server/grpc/server_test.go | Updates expected gRPC TCP address formatting to host:port form. |
| pkg/server/grpc/server.go | Adds loopback warning for TCP binds; uses fd-based chmod for unix sockets; simplifies Address(). |
| pkg/pcaps/pcaps.go | Updates pcaps constructor to accept *os.Root output directory. |
| pkg/pcaps/common.go | Migrates global output directory to *os.Root and adjusts related helpers/errors. |
| pkg/ebpf/tracee.go | Migrates Tracee.OutDir to *os.Root and opens via fileutil.OpenRootDir; adapts artifacts map writer. |
| pkg/ebpf/capture.go | Adjusts RenameAt call sites to new single-root signature. |
| pkg/ebpf/c/common/buffer.h | Caps elem_num in ARGS_ARR_T to avoid unbounded allocations downstream. |
| pkg/detectors/yaml/parser.go | Uses Lstat + regular-file enforcement before reading YAML to prevent FIFO/device hangs. |
| pkg/detectors/yaml/loader_test.go | Adds regression test ensuring duplicate shared-list names abort directory load. |
| pkg/detectors/yaml/loader.go | Skips non-regular files, aborts on duplicate list names, adds Lstat guard in type peeker. |
| pkg/detectors/yaml/list_loader.go | Uses Lstat + regular-file enforcement for list files before reading. |
| pkg/detectors/events.go | Escalates YAML load errors from WARN to ERROR. |
| pkg/cmd/initialize/bpfobject.go | Uses MkdirTemp for BTF unpack directory and SafeOpenFile for BTF output creation. |
| pkg/cmd/flags/server_test.go | Updates expectations for loopback-default server bindings (HTTP + gRPC). |
| pkg/cmd/flags/server.go | Defaults HTTP to 127.0.0.1; parses gRPC tcp as host:port (bare port => loopback); guards unix socket symlinks. |
| pkg/cmd/flags/output.go | Tightens output dir perms and uses SafeOpenFile for output file creation. |
| pkg/cmd/flags/artifacts_test.go | Updates expected default artifacts output path to /var/lib/tracee/out. |
| pkg/cmd/flags/artifacts.go | Exports DefaultArtifactsDir as /var/lib/tracee; uses SafeRemoveAll for artifact cleanup. |
| pkg/bufferdecoder/eventsreader.go | Adds Go-side cap for ARGS_ARR_T element count as defense-in-depth. |
| docs/man/server.1 | Documents loopback-default behavior for HTTP/gRPC and migration/security guidance. |
| docs/man/artifacts.1 | Documents new default artifacts directory (/var/lib/tracee). |
| docs/docs/install/config/monitoring.md | Updates monitoring docs for loopback default and adds migration guidance. |
| docs/docs/flags/server.1.md | Updates server flag docs to reflect new TCP binding semantics and security notes. |
| docs/docs/flags/artifacts.1.md | Updates artifacts docs to reflect new default artifacts directory. |
| deploy/helm/tracee/values.yaml | Sets Helm default HTTP bind to 0.0.0.0:3366 and adds NetworkPolicy values. |
| deploy/helm/tracee/templates/networkpolicy.yaml | Adds optional NetworkPolicy restricting ingress to Tracee HTTP port. |
| deploy/helm/tracee/templates/daemonset.yaml | Uses helper to derive HTTP port for containerPort/probes. |
| deploy/helm/tracee/templates/_helpers.tpl | Adds helper to extract HTTP port from configured address. |
| common/fileutil/fileutil_test.go | Updates tests to os.Root APIs and adds symlink/path traversal security tests. |
| common/fileutil/fileutil.go | Replaces openat-based helpers with Go 1.24 os.Root; adds SafeOpenFile and SafeRemoveAll. |
| cmd/traceectl/pkg/cmd/flags/output.go | Tightens output dir perms and uses O_NOFOLLOW when creating output file. |
Comments suppressed due to low confidence (1)
pkg/cmd/initialize/bpfobject.go:55
- BpfObject now unpacks embedded BTF into a new os.MkdirTemp directory, but that temp directory is never cleaned up. Since the daemon may start repeatedly (and tests start Tracee many times), this can leave accumulating /tmp/tracee-btf-* directories on disk. Consider tracking the tempDir and removing it after the eBPF module has loaded the BTF (or copying the BTF into a deterministic cache directory under the artifacts dir).
if !environment.OSBTFEnabled() && btfFilePath == "" {
tempDir, mkErr := os.MkdirTemp("", "tracee-btf-*")
if mkErr != nil {
return errfmt.Errorf("could not create temp dir for BTF: %v", mkErr)
}
unpackBTFFile := filepath.Join(tempDir, "tracee.btf")
err = unpackBTFHub(unpackBTFFile, osInfo)
if err == nil {
logger.Debugw("BTF: btfhub embedded BTF file", "file", unpackBTFFile)
cfg.BTFObjPath = unpackBTFFile
} else {
logger.Debugw("BTF: error unpacking embedded BTFHUB file", "error", err)
}
}
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 40 out of 40 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
pkg/detectors/yaml/loader.go:213
- peekFileType uses os.ReadFile after the new Lstat check, which reads the entire YAML file into memory. This bypasses the MaxYAMLFileSize guard used in ParseFile/loadListFile and can reintroduce memory-exhaustion risk for large files in detector directories. Consider enforcing MaxYAMLFileSize here (or reading only a bounded prefix / using an io.LimitReader + yaml decoder) before unmarshalling.
func peekFileType(path string) (string, error) {
info, err := os.Lstat(path)
if err != nil {
return "", err
}
if !info.Mode().IsRegular() {
return "", fmt.Errorf("not a regular file: %s", path)
}
data, err := os.ReadFile(path)
if err != nil {
return "", err
}
Member
Author
|
@josedonizetti tks for reviewing it, NBD I'll return to that. Cheers. |
f3af0e2 to
0a89929
Compare
A FIFO or symlink-to-FIFO placed in a detector directory causes os.ReadFile to block indefinitely, hanging the loader goroutine and preventing detector ingestion (local DoS). Guard LoadFromDirectory with entry.Type().IsRegular() so non-regular files never reach any read path. Add Lstat-based checks in peekFileType, ParseFile, and loadListFile as defense-in-depth for direct callers. Detected by: bugbunny.ai
The default HTTP address was ":3366" (all interfaces), exposing metrics, healthz, and pprof endpoints to the network without authentication. This contradicted the documented default of "localhost:3366". Change the default to "127.0.0.1:3366" so the CLI matches the documented behavior. Update the Helm chart to explicitly bind to "0.0.0.0:3366" (required for kubelet probes and Prometheus scraping) and add an opt-in NetworkPolicy (disabled by default) that operators can enable to restrict HTTP ingress to specific pods. Log a startup warning when using the loopback default so operators notice the change immediately. Detected by: bugbunny.ai BREAKING CHANGE: The default HTTP server address changed from ":3366" (all interfaces) to "127.0.0.1:3366" (loopback only). Remote Prometheus scraping or health checks will stop working unless --server http-address=0.0.0.0:3366 is set explicitly. Kubernetes Helm deployments are unaffected (chart sets 0.0.0.0 explicitly).
TCP gRPC always bound to 0.0.0.0 (all interfaces) with no way to restrict it, no TLS, and no auth interceptors (CWE-306). Any network-reachable client could invoke privileged RPCs such as DisableEvent, StreamEvents, and ChangeLogLevel. Default bare-port TCP addresses to 127.0.0.1, accept host:port syntax (e.g. tcp:0.0.0.0:4466) for explicit wildcard bind, and warn when binding to a non-loopback address. Detected by: bugbunny.ai BREAKING CHANGE: --server grpc-address=tcp:4466 now binds to 127.0.0.1:4466 instead of 0.0.0.0:4466. Use tcp:0.0.0.0:4466 to restore the old behavior.
Replace custom openat/mkdirat/renameat syscall wrappers with Go 1.24 os.Root, which confines all file operations to a directory tree. Symlinks escaping the root and ".." traversals are rejected by the kernel. OpenRootDir adds O_NOFOLLOW so a symlinked output directory itself is also rejected (ELOOP). Migrate Tracee.OutDir and pcaps.outputDirectory from *os.File to *os.Root, updating all callers in capture.go, processor_funcs.go, tracee.go, and the pcaps package. Detected by: bugbunny.ai BREAKING CHANGE: fileutil directory-relative functions now take *os.Root instead of *os.File. OpenExistingDir, MkdirAt, and ValidateRelativePath are removed. RenameAt takes a single root instead of two directory handles.
Add SafeOpenFile (O_NOFOLLOW) and SafeRemoveAll (Lstat guard) to fileutil and migrate all remaining file operation call sites to resist symlink attacks and TOCTOU races: - Move default artifacts dir from /tmp/tracee to /var/lib/tracee - Replace os.RemoveAll with SafeRemoveAll for artifact cleanup - Replace os.Create/os.OpenFile with SafeOpenFile in output, traceectl, and BTF unpack paths - Tighten MkdirAll permissions from 0755 to 0750 - Use os.Lstat to reject symlinked unix socket paths - Replace os.Chmod with fd-based Fchmod on unix socket listener - Add os.Lstat guard before plugin.Open to skip symlinked .so - Replace predictable /tmp/tracee-btf with os.MkdirTemp - Export DefaultArtifactsDir constant for test/caller use BREAKING CHANGE: default artifacts directory moved from /tmp/tracee to /var/lib/tracee; symlinked unix socket paths and plugin .so files are now rejected; output directory permissions tightened from 0755 to 0750.
An attacker can exec processes with huge argv/envp arrays, causing the Go decoder to allocate unbounded slices from the uncapped elem_num field -- leading to OOM kills under memory pressure. Cap elem_num to MAX_ARR_LEN (8192) in the eBPF helper before it reaches the perf buffer, and add a matching Go-side cap as defense-in-depth. Detected by: bugbunny.ai
Duplicate shared-list names were handled fail-open: the loader kept the first-seen value and logged a warning. Because os.ReadDir returns files in lexical order, an attacker who can write to the detector directory deterministically overrides any legitimate list by choosing a lexically-earlier filename -- silently tampering detection rules without modifying the original file. Treat duplicate list names as fatal (matching the existing fatal handling for duplicate event names) and escalate loader errors from WARN to ERROR. Detected by: bugbunny.ai
The Makefile hardcoded origin/main as the base ref, breaking workflows that name the upstream remote differently (e.g. upstream, private). checkpatch.sh now probes remotes in preference order (upstream, private, origin, then any other) and all Make targets forward BASE_REF only when explicitly set. Remove the unused LOGFROM variable.
Consolidate BTF path, object bytes, and uprobe binary path under BPFConfig. Track the temp directory explicitly via BTFTempDir so Cleanup only removes directories we created, never user-supplied BTF paths. Clear object bytes after module creation so memory can be reclaimed. Remove the BTFHub temp tree when embedded unpack fails.
0a89929 to
15cb88e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
1. Explain what the PR does
15cb88e refactor(config): group BPF state and clean temp BTF
--
6e51ba1 fix(build): auto-detect base remote in PR targets
--
5d7cf1c fix(detectors): abort load on duplicate list names
--
4e4a165 fix(ebpf): cap ARGS_ARR_T element count
--
4edb89f fix(security)!: harden file ops against symlinks
--
3dcd173 fix(fileutil)!: use os.Root for symlink-safe I/O
--
67a0d1b fix(grpc)!: bind TCP listener to loopback by default
--
69721dd fix(server)!: bind HTTP to loopback by default
--
6edb6c4 fix(detectors): reject non-regular files in YAML loader
--
2. Explain how to test it
3. Other comments