Backfill explicit discovery metadata on sample YAMLs; thin the derivation heuristics #551

@staging-devin-ai-integration

Description

Background

PR #544 introduced presentation-only discovery metadata (group/variant/category/tags) for sample pipelines. These are first-class optional YAML fields, but in practice almost every sample (≈85/88) relies on the derivation fallback in apps/skit/src/sample_discovery.rs, which infers grouping/variants/category/tags from:

  • hardcoded filename token tables (COMPOUND_TOKENS, SINGLE_TOKENS, LANGUAGE_TOKENS) in Rust, and
  • a parallel SYNONYM_GROUPS table in TS (ui/src/utils/samplePipelineOrdering.ts).

This re-encodes codec/hardware/language knowledge that already exists structurally in each YAML's node kinds, and splits it across two languages.

Problem

Samples that use a new codec, hardware backend, or language are silently mis-grouped until both token tables (Rust + TS) are edited. The heuristics have become the default path rather than a thin safety net.
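
To make the failure mode concrete, here is a minimal sketch of token-table derivation. It is not the code in sample_discovery.rs: the table name CODEC_TOKENS, its entries, the function derive_group_variant, and the sample file stems are all hypothetical and chosen only for illustration.

```rust
/// Hypothetical stand-in for a hardcoded filename token table.
/// The entries are illustrative; they are not the real table contents.
const CODEC_TOKENS: &[&str] = &["h264", "vp9"];

/// Splits a file stem into (group, variant) by peeling off a trailing
/// token, but only when that token appears in the table.
pub fn derive_group_variant(stem: &str) -> (String, Option<String>) {
    if let Some((head, tail)) = stem.rsplit_once('_') {
        if CODEC_TOKENS.contains(&tail) {
            return (head.to_string(), Some(tail.to_string()));
        }
    }
    // Unknown trailing token: the whole stem becomes its own group,
    // with no error or warning to signal the mis-grouping.
    (stem.to_string(), None)
}
```

Under these assumptions, a stem such as transcode_h264 resolves to group transcode with variant h264, while transcode_av1 falls through and becomes a separate group named transcode_av1 until someone adds av1 to the table.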

Proposal

Backfill explicit group/variant/category/tags on the sample YAMLs (a co-located, reviewable change), then shrink the derivation in sample_discovery.rs to a minimal fallback for samples that omit them.
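
One possible shape for the thinned fallback is sketched below, assuming explicit fields always win and the fallback only fills in neutral defaults. The types DiscoveryMeta and ResolvedMeta, the function resolve, the "uncategorized" default, and the file paths are hypothetical; the real field types and defaults in sample_discovery.rs may differ.

```rust
use std::path::Path;

/// Presentation-only discovery metadata as it might be parsed from a
/// sample YAML. Hypothetical type; every field is optional.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct DiscoveryMeta {
    pub group: Option<String>,
    pub variant: Option<String>,
    pub category: Option<String>,
    pub tags: Vec<String>,
}

/// Metadata after the fallback has filled in anything missing.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedMeta {
    pub group: String,
    pub variant: Option<String>,
    pub category: String,
    pub tags: Vec<String>,
}

/// Explicit values are used as-is. The fallback consults no token tables:
/// the group defaults to the file stem and the category to a neutral
/// placeholder (an assumed default, not taken from the codebase).
pub fn resolve(path: &Path, explicit: &DiscoveryMeta) -> ResolvedMeta {
    let stem = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("unknown")
        .to_string();
    ResolvedMeta {
        group: explicit.group.clone().unwrap_or(stem),
        variant: explicit.variant.clone(),
        category: explicit
            .category
            .clone()
            .unwrap_or_else(|| "uncategorized".to_string()),
        tags: explicit.tags.clone(),
    }
}
```

With this shape, a sample that omits its metadata still appears in the UI, ungrouped and under a placeholder category, which makes the omission visible instead of producing a plausible but wrong grouping.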

Why this was deferred from #544

S4 was explicitly scoped to require minimal or zero edits to sample YAML files, to avoid merge conflicts with the parallel sample-testing work (merge order S2 → S3 → S4). Bulk-editing ~85 samples during #544 would have collided with those branches. Now that they have merged, this backfill is safe to do as a focused follow-up.

Source: Devin Review observation on PR #544 (sample_discovery.rs "altitude" finding).