feat(marketplace): community pipeline marketplace (GitHub-backed registry) #555

@staging-devin-ai-integration

Summary

Extend the marketplace concept from plugins to sample pipelines, so the community can discover, share, and install good example pipelines the same way they install plugins today. Pipelines are tiny YAML recipes (not binaries), so the work is fundamentally about sharing recipes and resolving their dependencies. Distribution goes through a GitHub-backed static registry that reuses the existing plugin-marketplace machinery.

This is the umbrella/design issue. The first concrete deliverable is tracked separately (contribution flow + generated registry); server install + UI surfaces are phase-2 follow-ups spun off here.

Why GitHub-backed (Option A)

The plugin marketplace is already a static, signed JSON registry: index.json → per-version manifest.json + minisign (.minisig, Ed25519) signature, hosted on GitHub Pages (metadata) + GitHub Releases (bundles). The relevant pieces:

  • apps/skit/src/marketplace.rs — registry client + minisign verification; RegistryIndex { schema_version, #[serde(default)] plugins }.
  • apps/skit/src/marketplace_security.rs — https/host/origin URL policy + allowlist.
  • apps/skit/src/marketplace_installer.rs — install job queue with progress steps.
  • Endpoints under /api/v1/marketplace/*; UI in ui/src/views/PluginsView.tsx.
  • Official registry auto-generated from each plugins/native/<id>/plugin.yml via scripts/marketplace/ and published at https://streamkit.dev/registry/index.json.

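Reusing this gets us signing, security policy, the registry client, the install job queue, and a UI pattern essentially for free. GitHub additionally gives review/moderation/identity/versioning at zero infra cost. RegistryIndex.plugins already carries #[serde(default)]; giving a sibling pipelines[] array the same attribute keeps existing registries (which lack the field) parseable, and older servers should simply ignore the new field, since serde skips unknown fields unless deny_unknown_fields is set. Adding pipelines[] is therefore backward-compatible with existing registries and older servers.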

Rejected for now:

  • Option B — hosted community service (DB + API): richer UX (in-app publish, ratings, search) but someone must run/secure/moderate it (auth, spam, abuse, DMCA, cost) and it abandons the static-registry trust model.
  • Option C — hybrid: ship A, then later layer a thin read-only discovery/search service on top of the same GH-sourced signed data. This is the intended long-term end state (see Scaling), not the starting point.

How sample pipelines work today

The key wrinkle: dependencies, not code

A pipeline is small YAML, but it references node kinds, and those can depend on native plugins (plugin::… kinds), feature-gated codecs, hardware backends, models, and assets (fonts/images/audio). A pipeline marketplace must therefore surface and resolve dependencies, not just hand over text. Proposed manifest shape (mirrors PluginManifest):

{
  "schema_version": 1,
  "id": "voice-translate-en-es",
  "name": "Voice translate (EN→ES)",
  "mode": "dynamic",
  "description": "...",
  "group": "...", "variant": "...", "category": "...", "tags": ["..."],
  "yaml": "<inline pipeline yaml>",        // or "yaml_url" for larger recipes
  "requires": {
    "plugins":  ["nllb", "whisper", "kokoro"],   // resolvable via the plugin registry
    "models":   ["..."],
    "features": ["svt_av1"],                       // cargo features / hardware backends
    "assets":   ["fonts/...", "images/..."]
  },
  "license": "...", "author": "...", "homepage": "..."
}
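
One way a client might model the manifest's two recipe sources and its requires block is sketched below. The names (YamlSource, Requires, yaml_source) are hypothetical, and treating yaml and yaml_url as mutually exclusive is an assumption; the proposal only says yaml_url is an alternative for larger recipes.

```rust
/// Where the pipeline YAML comes from (hypothetical type, not from the codebase).
#[derive(Debug, PartialEq)]
pub enum YamlSource {
    Inline(String),
    Url(String),
}

/// Mirrors the proposed `requires` object; every list may be empty.
#[derive(Debug, Default, PartialEq)]
pub struct Requires {
    pub plugins: Vec<String>,
    pub models: Vec<String>,
    pub features: Vec<String>,
    pub assets: Vec<String>,
}

impl Requires {
    /// True when the recipe declares no plugin, model, feature, or asset requirements.
    pub fn is_empty(&self) -> bool {
        self.plugins.is_empty()
            && self.models.is_empty()
            && self.features.is_empty()
            && self.assets.is_empty()
    }
}

/// Assumption: exactly one of `yaml` / `yaml_url` must be present in a manifest.
pub fn yaml_source(yaml: Option<&str>, yaml_url: Option<&str>) -> Result<YamlSource, String> {
    match (yaml, yaml_url) {
        (Some(y), None) => Ok(YamlSource::Inline(y.to_string())),
        (None, Some(u)) => Ok(YamlSource::Url(u.to_string())),
        (Some(_), Some(_)) => Err("manifest sets both yaml and yaml_url".to_string()),
        (None, None) => Err("manifest sets neither yaml nor yaml_url".to_string()),
    }
}
```

Rejecting manifests that set both fields keeps the signed manifest unambiguous about which YAML is actually installed; a real implementation could just as well define a precedence rule instead.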

index.json gains a sibling array:

{
  "schema_version": 1,
  "plugins": [ ... ],
  "pipelines": [
    {
      "id": "...",
      "latest": "...",
      "versions": [
        {
          "version": "...",
          "manifest_url": "...",
          "signature_url": "...",
          "published_at": "..."
        }
      ]
    }
  ]
}
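
A minimal sketch of how a client could represent one pipelines[] entry and look up the record that latest points at. The struct names are hypothetical and the fields are kept as plain strings; the real types in marketplace.rs may differ.

```rust
/// One published version of a pipeline, as listed in `index.json`.
#[derive(Debug, Clone, PartialEq)]
pub struct PipelineVersion {
    pub version: String,
    pub manifest_url: String,
    pub signature_url: String,
    pub published_at: String,
}

/// One entry of the proposed `pipelines[]` array (hypothetical type name).
#[derive(Debug, Clone, PartialEq)]
pub struct PipelineIndexEntry {
    pub id: String,
    pub latest: String,
    pub versions: Vec<PipelineVersion>,
}

impl PipelineIndexEntry {
    /// Returns the version record named by `latest`, or `None` when the
    /// index is inconsistent (no version with that string exists).
    pub fn latest_version(&self) -> Option<&PipelineVersion> {
        self.versions.iter().find(|v| v.version == self.latest)
    }
}
```

Returning an Option rather than indexing lets the caller treat a dangling latest pointer as a registry error instead of a panic.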

Cross-cutting design decisions to nail

  1. Dependency surfacing / install UX. Reuse the node-availability logic behind #543 ("test(samples): guard dynamic samples with parse/compile + node-availability checks") to compute, before install, which plugin::… kinds / features / models a pipeline needs and whether they're present. Offer to chain-install missing plugins via the existing marketplace_installer; clearly warn (or block) when a dependency can't be satisfied (e.g. hardware/feature-gated codec).
  2. Trust model. Pipelines are data, but they pull native plugins and embed URLs (web_capture, MoQ gateways, S3). Keep minisign signing for curated/official registries. Decide whether unsigned community entries are install-with-warning vs blocked, driven by the existing marketplace_* config knobs (marketplace_enabled, scheme/host policy, origin allowlist). Installing a community pipeline must never silently install native code without consent.
  3. Where installed pipelines live. Add a distinct source (e.g. samples/pipelines/installed/) so installed-from-marketplace pipelines are distinguishable from user-authored (user/) and system samples, with its own is_system=false semantics and delete support.
  4. Secrets / portability. A shared pipeline must declare inputs, not embed secrets. Define how env-specific bits (S3 creds, gateway URLs, API keys) are parameterized/stripped on contribution and prompted-for on install, so recipes are portable and safe to publish.
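
The pre-install check in decision 1, combined with the consent requirement in decision 2, could be sketched roughly as follows. Everything here is hypothetical: InstallPlan, plan_install, and the "feature:" / "plugin:" labels are illustrative names, and the inputs are assumed to be plain sets of ids rather than whatever the node-availability logic actually returns.

```rust
use std::collections::BTreeSet;

/// Outcome of a pre-install dependency check (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum InstallPlan {
    /// Everything the pipeline needs is already present.
    Ready,
    /// Missing plugins the registry can supply; install only after explicit user consent.
    NeedsPlugins(Vec<String>),
    /// Requirements that cannot be installed: disabled features or plugins unknown to the registry.
    Blocked(Vec<String>),
}

pub fn plan_install(
    required_plugins: &[&str],
    required_features: &[&str],
    installed_plugins: &BTreeSet<&str>,
    registry_plugins: &BTreeSet<&str>,
    enabled_features: &BTreeSet<&str>,
) -> InstallPlan {
    let mut installable = Vec::new();
    let mut unsatisfiable = Vec::new();

    // Features (cargo features / hardware backends) cannot be installed at runtime.
    for &feature in required_features {
        if !enabled_features.contains(feature) {
            unsatisfiable.push(format!("feature:{}", feature));
        }
    }
    // Plugins are either present, installable from the registry, or unknown.
    for &plugin in required_plugins {
        if installed_plugins.contains(plugin) {
            continue;
        }
        if registry_plugins.contains(plugin) {
            installable.push(plugin.to_string());
        } else {
            unsatisfiable.push(format!("plugin:{}", plugin));
        }
    }

    if !unsatisfiable.is_empty() {
        InstallPlan::Blocked(unsatisfiable)
    } else if !installable.is_empty() {
        InstallPlan::NeedsPlugins(installable)
    } else {
        InstallPlan::Ready
    }
}
```

The function only computes a plan and never triggers an install itself, so the UI can present NeedsPlugins as a consent prompt before handing anything to marketplace_installer. Models and assets are left out of this sketch.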

Scaling path (deliberate, not a surprise)

The catalog stays thin (metadata only); YAML + requires{} live in per-version manifests fetched on demand — same lazy split plugins use. Rough sizing: ~1k pipelines ≈ 0.5 MB raw / ~50–80 KB gzipped; ~10k ≈ ~5 MB / ~0.5 MB gzipped, single-digit-ms parse. GitHub Pages serves gzip/brotli + ETag/Cache-Control, so unchanged indices are conditional-GET 304s.
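
The sizing figures above imply roughly 500 bytes of raw JSON per catalog entry (0.5 MB / 1,000) and a gzip ratio of about 10–16%. A small sketch of that arithmetic, with the per-entry size as an assumed constant derived from those figures rather than a measured value:

```rust
/// Assumption: ~500 bytes of raw JSON per catalog entry (0.5 MB / 1,000 entries).
pub const ASSUMED_BYTES_PER_ENTRY: u64 = 500;

/// Estimated raw index size in bytes for `entries` pipelines.
pub fn estimated_raw_bytes(entries: u64) -> u64 {
    entries * ASSUMED_BYTES_PER_ENTRY
}

/// Estimated gzipped size, with the compression ratio given as a percentage of raw size.
pub fn estimated_gzip_bytes(entries: u64, gzip_percent: u64) -> u64 {
    estimated_raw_bytes(entries) * gzip_percent / 100
}
```

With these assumptions, 1,000 entries at 10–16% gives 50,000–80,000 bytes and 10,000 entries at 10% gives 500,000 bytes, matching the rough numbers quoted above.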

Escalation order as the catalog grows:

  1. Thin catalog + gzip + conditional GET (the default design) — comfortable to ~1–10k entries.
  2. Shard/paginate the index (index.json + pages/*.json, or by mode/category) — still 100% static, good to tens of thousands.
  3. Option C — a thin read-only discovery/search service (or a prebuilt SQLite/FTS artifact served as a file) built from the same GH-sourced signed manifests, when we want server-side search/ranking/popularity. Discovery becomes dynamic; the publish + trust model is unchanged.
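
Step 2 could be implemented in the registry generator along these lines. This is a sketch only: shard_index is a hypothetical helper, and the zero-padded pages/NNNN.json naming is an assumption (the proposal only says pages/*.json, or sharding by mode/category).

```rust
/// Splits catalog ids into fixed-size pages and names each page file.
/// Returns (file name, ids on that page) pairs in catalog order.
pub fn shard_index(ids: &[String], page_size: usize) -> Vec<(String, Vec<String>)> {
    assert!(page_size > 0, "page_size must be non-zero");
    ids.chunks(page_size)
        .enumerate()
        .map(|(i, chunk)| (format!("pages/{:04}.json", i + 1), chunk.to_vec()))
        .collect()
}
```

Because the output is still a set of static files, each page can be signed and cached with the same conditional-GET behaviour as the single index.json.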

Phasing

  • Phase 1 (separate issue): community/ contribution dir + CI validation reuse + registry generator + GitHub Pages publish + pipelines[] schema. No server/UI changes — just produces and validates a consumable registry.
  • Phase 2 (follow-ups off this epic): server-side support to list/fetch/install pipelines (marketplace.rs + marketplace_installer.rs + /api/v1/marketplace/pipelines*), including dependency resolution; UI surface for browsing/installing pipelines.
  • Phase 3 (later): Option C discovery/search layer if/when catalog size warrants it.

Open questions

  • Single combined registry (plugins[] + pipelines[] in one index.json) vs a dedicated pipeline registry URL? (Leaning combined for reuse.)
  • Same-repo community/ dir vs a separate streamkit/pipelines repo for community submissions?
  • Unsigned community entries: warn-and-install vs require signing, and what the default policy ships as.
