diff --git a/cep-sigstore-predicate.md b/cep-sigstore-predicate.md
index 88f62596..21758f1d 100644
--- a/cep-sigstore-predicate.md
+++ b/cep-sigstore-predicate.md
@@ -1,7 +1,7 @@
-# CEP - Standardizing the v1 predicate for sigstore attestations
+# CEP - Standardizing a publish attestation for the conda ecosystem
-| Title | Standardizing the v1 predicate for sigstore attestations |
+| Title | Standardizing a publish attestation for the conda ecosystem |
| Status | Proposed |
| Author(s) | Wolf Vollprecht <wolf@prefix.dev> |
| Created | Feb 18, 2025 |
@@ -12,61 +12,250 @@
## Abstract
-We want to standardize attestations for the conda ecosystem.
+This CEP proposes a standard attestation layout for the conda ecosystem.
+This attestation layout is based on the [in-toto] framework
+and will enable further integration with signing schemes like
+[Sigstore].
-### Sigstore Attestations
+## Definitions and Concepts
-Sigstore attestations are cryptographic statements about software artifacts that provide:
+- An **attestation** is a machine-readable cryptographically signed statement.
+ When an attestation's signature is verified against a trusted key, that
+ verification provides integrity and authenticity guarantees about the
+ attestation's subject. For example:
-- Authenticity: Proof of who created/signed the artifact
-- Integrity: Verification that the artifact hasn't been tampered with
-- Transparency: Public record of signatures in a tamper-evident log
+ - Alice is the maintainer of the `widgets` package.
+ - Alice signs a machine-readable statement equivalent to the following
+ English sentence, producing her attestation:
-### Key Components
+ > Alice published the `widgets` package at version v1.2.3 with
+ > hash `sha256:abcd...` to the `conda-forge` channel.
-- Predicates: JSON documents containing metadata about the signing event, using the `in-toto` format
-- Signatures: Cryptographic proofs made using ephemeral keys
-- Rekor: A tamper-evident log that stores attestations
-- Fulcio: A certificate authority that issues short-lived certificates
+ - Bob establishes trust in Alice's public key.
+ - Bob can verify the attestation's signature against Alice's public key,
+ giving him confidence that the statement is true.
+ - Correspondingly, Bob can reject any statement for `widgets` that is not
+ signed by Alice's public key.
-In this document, we want to standardize the sigstore predicate for conda packages. The bundle format to be used for sigstore attestations is the `v0.3` bundle format.
+- [in-toto] is a framework and standard for defining attestations.
+
+ - Within in-toto, an attestation's statement is composed of a
+ **subject** and a **predicate**. The subject is the resource
+ (or resources) being attested to, and the predicate is an
+ arbitrary collection of metadata about the subject.
+ The predicate is identified by a **predicate type**,
+ which defines the predicate's expected schema.
+
+- [Sigstore] is a project that enables misuse-resistant software signing
+ and verification via short-lived certificates and a tamper-evident log.
+ Sigstore composes with attestation frameworks like in-toto to provide
+ transparency and misuse-resistance properties on top of the integrity
+ and authenticity properties of attestations.
+
+ One of Sigstore's major misuse-resistance contributions is
+ the use of *ephemeral keys* for signing. Modifying the example above:
+
+ - Instead of maintaining a long-lived signing key, Alice generates an
+ *ephemeral key* and binds it to her *identity*
+ ("`alice@trustme.example.com`").
+
+ This binding is done via a certificate issued by [Fulcio], which verifies a
+ *proof of possession* (such as from [OpenID Connect]) from Alice for her
+ claimed identity. The certificate issued by Fulcio is, in turn, auditable
+ via [RFC 6962] Certificate Transparency (CT) logs.
+
+ - Alice signs her attestation with her ephemeral key, and distributes a
+ "bundle" containing both her attestation and her signing certificate.
+
+ - Instead of establishing trust in a long-lived key from Alice, Bob establishes
+ trust in Alice's identity.
+
+ - Bob can verify the attestation's signature against Alice's ephemeral key,
+ which in turn can be verified as authentically Alice's via the Fulcio-
+ issued certificate.
+
+ With this flow, neither Alice nor Bob needs to maintain long-lived signing
+ or verifying keyrings, in turn reducing the attack surface for key
+ compromise.
+
+ Another key misuse-resistance contribution within Sigstore is *machine
+ identities*. A machine identity behaves similarly to a human identity
+ (Alice or Bob), but identifies a machine instead of a human. For example,
+ `github.com/example/example/.github/workflows/release.yml@refs/tags/v1.2.3`
+ could be the machine identity of a GitHub Actions workflow that ran from
+ `release.yml` within `example/example` against the `v1.2.3` tag.
+
+## Motivation
+
+The conda ecosystem contains metadata that answers the following questions,
+in part or in full:
+
+* _Who_ (or _what_) published this package?
+* _What_ is the package's hash?
+* _Where_ was this package _published from_, and where _to_?
+* _When_ was this package published?
+
+However, this metadata is not currently **cryptographically verifiable**:
+the consuming party must either trust it as presented, or verify it manually
+against independent sources of truth (such as a project's release history).
+
+Attestations that present this metadata in a cryptographically
+verifiable manner are desirable for a number of reasons:
+
+* Package maintainers wish to demonstrate the integrity and authenticity
+ of their package uploads;
+* Individual downstream users wish to verify the integrity and authenticity of
+ packages they consume, without placing additional trust in the
+ channel or channel's hosting server;
+* Attestations change the sophistication and risk profile for attackers in
+ defenders' favor: the attacker must be sufficiently sophisticated
+ to access private key material, *and* have a risk tolerance profile that
+ accepts exposure via auditable transparency logs.
+
+More broadly, attestation schemes like the one proposed in this CEP have
+seen adoption in similar and related ecosystems:
+
+* Python (PyPA/PyPI): [PEP 740] and [PyPI - Attestations]
+* NodeJS (npm): [npm - Generating provenance statements]
+* Ruby (RubyGems): [rubygems/release-gem]
## Specification
-The in-toto predicate should contain the following fields:
+### Attestation format
+
+This CEP proposes the following attestation statement layout, using the
+[in-toto Statement schema]:
+
+- `predicateType` **MUST** be `https://schemas.conda.org/attestations/publish/v1`
+- `subject` **MUST** be an array containing exactly one [`ResourceDescriptor`],
+ with the following constraints:
+ - `subject[0].name` **MUST** be the full filename of the conda package,
+ exactly as it will be listed in `repodata.json` and as it will appear on
+ the server.
+ - `subject[0].digest` **MUST** be a [`DigestSet`], and it **MUST** contain
+ a single `sha256` entry with the SHA256 hash of the conda package.
+- `predicate` **MAY** be present. If present and not `null`, it **MUST** be a
+ JSON object with the following fields:
+ - `targetChannel` **MUST** be a string, indicating where the package
+ is being uploaded to. This field **MUST** be a valid URL with no
+ trailing slashes.
+
+An example of a compliant statement is provided below:
```json
{
"_type": "https://in-toto.io/Statement/v1",
"subject": [{
"name": "file-name-0.0.1-h123456_5.conda",
- "digest": {"sha256": "..."}, ...
+ "digest": {"sha256": "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b"}
}],
- // Schema URL
- "predicateType": "https://schemas.conda.org/predicate-v1.json",
+ "predicateType": "https://schemas.conda.org/attestations/publish/v1",
"predicate": {
- // Canonical URL of the target channel
"targetChannel": "https://prefix.dev/conda-forge"
}
}
```
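
The layout above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper name `build_publish_statement` and its parameters are hypothetical, and the `_type` value is the one given by the in-toto v1 Statement schema.

```python
import hashlib
from urllib.parse import urlparse

PREDICATE_TYPE = "https://schemas.conda.org/attestations/publish/v1"
# Value defined by the in-toto v1 Statement schema.
STATEMENT_TYPE = "https://in-toto.io/Statement/v1"


def build_publish_statement(filename: str, package_bytes: bytes,
                            target_channel: str) -> dict:
    """Build a publish statement for one conda package (hypothetical helper)."""
    parsed = urlparse(target_channel)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError("targetChannel must be a valid URL")
    if target_channel.endswith("/"):
        raise ValueError("targetChannel must not end with a slash")

    return {
        "_type": STATEMENT_TYPE,
        # Exactly one subject, identified by filename and SHA256 digest.
        "subject": [{
            "name": filename,
            "digest": {"sha256": hashlib.sha256(package_bytes).hexdigest()},
        }],
        "predicateType": PREDICATE_TYPE,
        "predicate": {"targetChannel": target_channel},
    }
```

Calling `build_publish_statement("file-name-0.0.1-h123456_5.conda", data, "https://prefix.dev/conda-forge")` with the package contents in `data` yields a dictionary of the same shape as the JSON example, ready to be serialized and signed.
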
-The `subject` field is already defined in the in-toto specification and contains the name of the package and its digest.
-For conda packages a SHA256 hash MUST be used.
-The subject MUST be the full filename of the conda package that will be part of the repodata.json and under which it will appear on the server.
+### Signing and distributing
+
+This CEP recommends the following signing process:
+
+1. The signer (i.e. Alice or Alice's trusted machine identity) uses a
+ [Sigstore]-compatible client to generate an ephemeral keypair and bind it to
+ their identity via a public certificate.
+2. The signer generates an in-toto statement as described above, and
+ produces an attestation by signing that statement with their ephemeral
+ private key.
+3. The signer uploads their attestation to the Sigstore transparency log
+ as a [DSSE] envelope.
+4. The signer produces a [Sigstore bundle] containing their certificate,
+ attestation, and transparency log inclusion proof.
+
+Each of these steps is performed transparently by a Sigstore client like
+[sigstore-python], except for step (2) as it concerns the specific
+layout of the signed-over statement.
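
For readers unfamiliar with [DSSE], steps 2 and 3 amount to signing a pre-authentication encoding (PAE) of the serialized statement and wrapping the result in an envelope. The sketch below illustrates this with the standard library only; a Sigstore client normally does this internally. The `sign` callable is a hypothetical stand-in for signing with the ephemeral private key, and the helper names are illustrative.

```python
import base64
import json
from typing import Callable

# Payload type used for in-toto statements inside DSSE envelopes.
IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload
    )


def dsse_envelope(statement: dict, sign: Callable[[bytes], bytes]) -> dict:
    """Wrap a statement in a DSSE envelope; `sign` is a placeholder signer."""
    payload = json.dumps(statement).encode("utf-8")
    signature = sign(pae(IN_TOTO_PAYLOAD_TYPE, payload))
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": IN_TOTO_PAYLOAD_TYPE,
        "signatures": [{"sig": base64.b64encode(signature).decode("ascii")}],
    }
```

Because the signature covers the PAE of the exact payload bytes, a verifier checks it against the decoded `payload` as transmitted and does not need to re-serialize the JSON.
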
+
+The result of this process is a single Sigstore bundle, which can be
+distributed alongside the conda package or otherwise made discoverable.
+
+This CEP does not prescribe a distribution mechanism. Prior art for distribution
+mechanisms can be found in the PyPI and RubyGems ecosystems, e.g.
+[PyPI's Integrity API].
-The `predicateType` field is used to specify the schema of the predicate. The `predicate` field contains the actual predicate data.
-We propose to publish a schema to validate the `predicate` field. The schema will be available at `https://schemas.conda.org/predicate-v1.json`.
+### Verifying
-The predicate MUST contain the `targetChannel` field, to indicate where the package is being uploaded to. This field MUST be validated by the receiving server. The channel MUST be in canonical form (full URL, no trailing slashes).
+This CEP recommends the following verification process:
+
+1. The verifier retrieves Alice's conda package and associated
+ Sigstore bundle.
+1. The verifier performs a standard Sigstore verification process against
+ the bundle, using Alice's identity (or machine identity) as the
+ signing identity. This process produces a verified in-toto statement.
+
+ This step requires the verifier to establish trust in the identity
+ being verified against.
+
+ Exact mechanisms for establishing this trust are
+ outside the scope of this CEP. However, one option is a TOFU (trust on first
+ use) scheme with an attestation-aware conda channel, where package names
+ are "locked" to attesting identities on first use, with subsequent updates
+ being verified against that identity.
+
+1. The verifier checks the in-toto statement for consistency against their
+ ground truth:
+
+ - The `predicateType` field **MUST** be `https://schemas.conda.org/attestations/publish/v1`.
+ - The `subject[0].name` field **MUST** match the filename of the conda package.
+ - The `subject[0].digest` field **MUST** match the SHA256 hash of the conda
+ package.
+ - The `predicate.targetChannel` field **SHOULD** match the channel that
+ the package was retrieved from, if `predicate` is present. However, the
+ verifier **MAY** choose to allow a channel mismatch, e.g. if the known
+ context is a mirroring context (where the conda package was originally
+ published to a different channel, but is now being consumed from
+ a mirror).
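
The consistency checks in the last step could look roughly like the following. This is a sketch under stated assumptions: it runs only after Sigstore verification has already produced a trusted statement, and the function name `check_statement` and the `allow_mirror` flag are hypothetical.

```python
import hashlib

PREDICATE_TYPE = "https://schemas.conda.org/attestations/publish/v1"


def check_statement(statement: dict, filename: str, package_bytes: bytes,
                    channel: str, allow_mirror: bool = False) -> None:
    """Raise ValueError if the statement is inconsistent with the package."""
    if statement.get("predicateType") != PREDICATE_TYPE:
        raise ValueError("unexpected predicateType")

    subjects = statement.get("subject")
    if not isinstance(subjects, list) or len(subjects) != 1:
        raise ValueError("subject must contain exactly one entry")
    subject = subjects[0]

    if subject.get("name") != filename:
        raise ValueError("subject name does not match the package filename")

    # The digest set must hold a single sha256 entry for the package.
    expected = {"sha256": hashlib.sha256(package_bytes).hexdigest()}
    if subject.get("digest") != expected:
        raise ValueError("subject digest does not match the package hash")

    # `predicate` may be absent or null; the channel check applies only if present.
    predicate = statement.get("predicate")
    if predicate is not None:
        if not isinstance(predicate, dict):
            raise ValueError("predicate must be a JSON object")
        mismatch = predicate.get("targetChannel") != channel.rstrip("/")
        if mismatch and not allow_mirror:
            raise ValueError("targetChannel does not match the retrieval channel")
```

Passing `allow_mirror=True` models the mirroring case described above, where a channel mismatch is tolerated but every other check still applies.
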
+
+At the end of this process, the verifier is confident in the following facts:
+
+- The package was published by the signer (Alice or Alice's machine identity).
+ - If the publisher is a machine identity, this further establishes source
+ provenance via the machine identity's claims. See [Sigstore OID information]
+ for additional information on these claims.
+- The package is authentic and unmodified, provided that the signer is trusted.
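
The TOFU option mentioned in the verification steps can be illustrated with a small in-memory sketch. It assumes a simple mapping from package name to the first attesting identity seen; the class name `TofuStore` is hypothetical, and a real channel would persist this state and define how a locked identity can be changed.

```python
class TofuStore:
    """Trust-on-first-use lock of package names to attesting identities."""

    def __init__(self) -> None:
        self._locked: dict[str, str] = {}

    def check(self, package_name: str, identity: str) -> bool:
        # On first use the identity is recorded; afterwards it must match.
        pinned = self._locked.setdefault(package_name, identity)
        return pinned == identity
```

With this store, the first attestation for `widgets` locks the name to its signing identity, and a later attestation from a different identity is rejected.
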
## Discussion
-This predicate adds basic verifiable facts about the package. It will tie the producer of the package to the target channel.
-This is similar to what PyPI has implemented with the [PyPI publish attestation](https://docs.pypi.org/attestations/publish/v1/). Since there is no single authoritative index in the Conda world, we add the `targetChannel` field to reach parity.
+This predicate adds basic verifiable facts about the package. It will tie the
+producer of the package to the target channel. This is similar to what PyPI has
+implemented with the [PyPI publish
+attestation](https://docs.pypi.org/attestations/publish/v1/). Since there is no
+single authoritative index in the conda ecosystem, we add the `targetChannel` field
+to reach parity.
-On the server, the certificate should be tested against the Trusted Publisher used to upload the certificate to establish a chain of trust.
+On the server, the certificate should be checked against the Trusted Publisher
+used to upload the package, in order to establish a chain of trust.
## Future work
-Once sigstore attestations are established and more research has been done, we might want to use the [SLSA (Supply-chain Levels for Software Artifacts)](https://slsa.dev) spec as base for predicates in the conda ecosystem.
\ No newline at end of file
+Once Sigstore attestations are established and more research has been done, we
+might want to use the [SLSA (Supply-chain Levels for Software
+Artifacts)](https://slsa.dev) spec as a base for predicates in the conda
+ecosystem.
+
+[in-toto]: https://in-toto.io
+[Sigstore]: https://sigstore.dev
+[Fulcio]: https://github.com/sigstore/fulcio
+[RFC 6962]: https://datatracker.ietf.org/doc/html/rfc6962
+[OpenID Connect]: https://openid.net/connect/
+[PEP 740]: https://peps.python.org/pep-0740/
+[PyPI - Attestations]: https://docs.pypi.org/attestations/
+[npm - Generating provenance statements]: https://docs.npmjs.com/generating-provenance-statements
+[rubygems/release-gem]: https://github.com/rubygems/release-gem
+[in-toto Statement schema]: https://github.com/in-toto/attestation/blob/main/spec/v1/statement.md
+[`ResourceDescriptor`]: https://github.com/in-toto/attestation/blob/main/spec/v1/resource_descriptor.md
+[`DigestSet`]: https://github.com/in-toto/attestation/blob/main/spec/v1/digest_set.md
+[DSSE]: https://github.com/secure-systems-lab/dsse/blob/master/envelope.md
+[Sigstore bundle]: https://docs.sigstore.dev/about/bundle/
+[sigstore-python]: https://github.com/sigstore/sigstore-python
+[Sigstore OID information]: https://github.com/sigstore/fulcio/blob/main/docs/oid-info.md
+[PyPI's Integrity API]: https://docs.pypi.org/api/integrity/