CEP: SBOM for package contents #127

@xhochy

While we strive to avoid vendored dependencies in conda packages, they still happen under certain circumstances, e.g.

  • If a specific commit of a C/C++ library is required
  • For languages where vendoring/static linkage is the norm: Haskell, Rust, Go, JavaScript

In these situations, the conda package metadata does not sufficiently cover the actual package contents. This situation is often described as having Phantom Dependencies. While binaries in some languages (e.g. Go) already contain sufficient dependency information, extracting it is costly. Thus, we should precompute it during the build, where we have the full dependency information.

The generated SBOMs should be part of the files installed into the environment so that SBOM generation tools can generate high-quality SBOMs from the environment itself without the need to parse repodata or have a specialized tool for conda.

Implementation Idea (for rattler)

I would really like it if rattler-build had a flag to generate an SBOM from package contents. It could be as simple as running syft over the directory. Having the metadata computed once from the binaries would already save a lot of time. E.g. for a typical Go binary, it took me 8s to do binary inspection with syft. If an SBOM were already present, this would be a matter of milliseconds.
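As a rough illustration of the "run syft over the directory" step, here is a minimal Python sketch. It is not how rattler-build works today; the function names, the choice of CycloneDX JSON as the output format, and the idea of calling this as a post-build step are all assumptions made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def syft_command(package_dir: Path, output: Path) -> list[str]:
    """Build a syft invocation that scans a directory and writes CycloneDX JSON.

    Hypothetical helper: the output format is an assumption, not a decision
    made in this proposal.
    """
    return ["syft", f"dir:{package_dir}", "-o", f"cyclonedx-json={output}"]


def generate_sbom(package_dir: Path, output: Path) -> bool:
    """Run syft over package_dir; return False if syft is not on PATH."""
    if shutil.which("syft") is None:
        return False
    subprocess.run(syft_command(package_dir, output), check=True)
    return True
```

A build tool could call something like generate_sbom(prefix, sbom_path) once the package contents are staged, so the expensive binary inspection happens once at build time rather than on every scan of an environment.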

Once we have decided where we want to put the SBOM in the package, pixi could have a pixi sbom command that extracts the SBOMs from the packages and combines them with the pixi.lock information to generate an SBOM for the whole environment.

My initial idea:

  • Compute SBOMs as part of (or after) the package build
  • Store them as part of the package, but ensure they end up in conda-meta/ on installation
  • If someone wants to generate an SBOM of a conda environment, it should be sufficient to run the tool on top of the conda-meta folder (e.g. in the case of syft, using only the conda and SBOM cataloguers).
  • Only generate SBOMs for packages that vendor something, not for all.
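
To make the "run the tool on top of conda-meta" idea concrete, here is a minimal Python sketch of how a consumer might merge pre-computed SBOMs. The layout conda-meta/sbom/*.cdx.json and the use of CycloneDX JSON are purely hypothetical, since the location and format are exactly what this proposal still has to decide; the function names are made up for the example.

```python
import json
from pathlib import Path


def collect_components(conda_meta: Path) -> list[dict]:
    """Gather components from every pre-computed SBOM under conda-meta/sbom/.

    The sbom/ subfolder and the .cdx.json suffix are assumptions.
    """
    components = []
    seen = set()
    for sbom_file in sorted((conda_meta / "sbom").glob("*.cdx.json")):
        document = json.loads(sbom_file.read_text(encoding="utf-8"))
        for component in document.get("components", []):
            # Prefer the package URL as identity; fall back to name and version.
            key = component.get("purl") or (
                component.get("name"),
                component.get("version"),
            )
            if key not in seen:
                seen.add(key)
                components.append(component)
    return components


def environment_sbom(conda_meta: Path) -> dict:
    """Wrap the merged components in a minimal CycloneDX-style document."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": collect_components(conda_meta),
    }
```

This only covers the vendored components recorded in the stored SBOMs. The conda packages themselves would still come from the regular conda-meta records (or from pixi.lock), and packages that vendor nothing would simply contribute no SBOM file.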
