Rating v1: turn metered token counts into money (the revenue path) by hhuuggoo · Pull Request #6 · saturncloud/phoebe

hhuuggoo · 2026-06-08T16:47:16Z

What

The rating system — turns raw billing_event token counts into money. This is the piece between "we metered tokens" and "a neocloud can send an invoice." Per Wendy's PM assessment, this was the #1 revenue blocker (Phoebe metered but did not bill).

Mechanism only — Hugo sets the actual prices later, as data, not code. The rating system is reversible and buildable now; the prices are the one-way commercial door, decoupled.

Money correctness (treated like auth — go deep, it's expensive to be wrong)

Integer micro-USD everywhere, never float. cost_micro_usd int64 (1e-6 USD). 1 Atlas hourly_usage_record unit = 100 micro-USD.
The billable-prompt formula (the highest-risk line): vLLM's prompt_tokens is the TOTAL, cached_tokens the cache-hit SUBSET. Each prompt token is charged exactly once: billable_prompt = prompt_tokens - cached_tokens at the prompt rate, cached_tokens at the (discounted) cached rate. Charging the naive prompt*prompt_rate + cached*cached_rate would overbill every cache hit — guarded + tested by name.
Effective-dated prices — an event rates at the price effective at its event_ts; rating "now" never retroactively reprices old traffic.
Idempotent — rollups upsert on (auth_id, model, window_start); re-rating a window reconciles, never doubles.
Fail loud, never $0 — a model with no price → ErrNoPrice, counted, logged ERROR, dropped from rollups (not silently $0-billed). Rater exits 2 so a CronJob alerts.
Unattributable rows are counted, not silently dropped — billing_event rows with NULL auth_id/model are counted + surfaced + trigger exit-2 (a nonzero count means an upstream billing-gate leak). This closed a silent-loss gap found in review.
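A minimal sketch of the billable-prompt rule above, in integer micro-USD. The type names, field names, and the `ErrBadUsage` validation are illustrative assumptions, not the PR's actual `internal/rating` API; the sketch also does not guard against int64 overflow.

```go
package main

import (
	"errors"
	"fmt"
)

// Price holds per-token rates in integer micro-USD (1e-6 USD).
// Hypothetical shape; the real model_price columns may differ.
type Price struct {
	PromptMicroUSD     int64
	CachedMicroUSD     int64
	CompletionMicroUSD int64
}

// Usage mirrors the vLLM counters: PromptTokens is the TOTAL,
// CachedTokens is the cache-hit SUBSET of PromptTokens.
type Usage struct {
	PromptTokens     int64
	CachedTokens     int64
	CompletionTokens int64
}

// ErrBadUsage is an assumed guard for counters that cannot be rated safely.
var ErrBadUsage = errors.New("token counts negative or cached_tokens > prompt_tokens")

// Rate charges every prompt token exactly once: the uncached part at the
// prompt rate, the cached part at the cached rate.
func Rate(u Usage, p Price) (int64, error) {
	if u.PromptTokens < 0 || u.CachedTokens < 0 || u.CompletionTokens < 0 ||
		u.CachedTokens > u.PromptTokens {
		return 0, ErrBadUsage
	}
	billablePrompt := u.PromptTokens - u.CachedTokens
	cost := billablePrompt*p.PromptMicroUSD +
		u.CachedTokens*p.CachedMicroUSD +
		u.CompletionTokens*p.CompletionMicroUSD
	return cost, nil
}

func main() {
	p := Price{PromptMicroUSD: 3, CachedMicroUSD: 1, CompletionMicroUSD: 15}
	u := Usage{PromptTokens: 1000, CachedTokens: 400, CompletionTokens: 200}
	cost, err := Rate(u, p)
	fmt.Println(cost, err) // 600*3 + 400*1 + 200*15 = 5200 micro-USD
}
```

With these placeholder rates, the naive formula would give 1000*3 + 400*1 + 200*15 = 6400, overbilling the 400 cached tokens by 400*3 = 1200 micro-USD.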

Schema

model_price — effective-dated price book (prompt/cached/completion micro-USD per token). Tables only; no prices in the migration.
rated_usage — hourly rollups per (auth_id, model, window_start), unique-keyed for idempotent upsert.
migrations/seed_example_prices.sql — clearly NON-BINDING placeholder prices (do not ship to prod).
Alembic chained after billing_event (down_revision = b1f0c2d3e4a5), following the drainer's migration-ownership pattern.
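The effective-dated lookup against model_price could look roughly like the following in-memory sketch. `PriceRow`, `EffectiveFrom`, `NewPriceBook`, `At`, and `mustTime` are hypothetical names chosen for illustration (the source only says the table is effective-dated), and treating the effective timestamp as inclusive is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"time"
)

// ErrNoPrice means no price row was in effect for the model at that time.
var ErrNoPrice = errors.New("no price in effect")

// PriceRow is an assumed shape for one model_price row.
type PriceRow struct {
	Model          string
	EffectiveFrom  time.Time // assumed column name
	PromptMicroUSD int64
}

// PriceBook keeps each model's rows sorted ascending by EffectiveFrom.
type PriceBook struct {
	byModel map[string][]PriceRow
}

func NewPriceBook(rows []PriceRow) *PriceBook {
	pb := &PriceBook{byModel: make(map[string][]PriceRow)}
	for _, r := range rows {
		pb.byModel[r.Model] = append(pb.byModel[r.Model], r)
	}
	for _, rs := range pb.byModel {
		sort.Slice(rs, func(i, j int) bool {
			return rs[i].EffectiveFrom.Before(rs[j].EffectiveFrom)
		})
	}
	return pb
}

// At returns the latest row whose EffectiveFrom is <= ts, or ErrNoPrice.
func (pb *PriceBook) At(model string, ts time.Time) (PriceRow, error) {
	rows := pb.byModel[model]
	// Index of the first row that takes effect strictly after ts.
	i := sort.Search(len(rows), func(i int) bool {
		return rows[i].EffectiveFrom.After(ts)
	})
	if i == 0 {
		return PriceRow{}, ErrNoPrice
	}
	return rows[i-1], nil
}

// mustTime parses an RFC 3339 timestamp; demo helper only.
func mustTime(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	pb := NewPriceBook([]PriceRow{
		{Model: "example-model", EffectiveFrom: mustTime("2026-01-01T00:00:00Z"), PromptMicroUSD: 1},
		{Model: "example-model", EffectiveFrom: mustTime("2026-06-01T00:00:00Z"), PromptMicroUSD: 2},
	})
	old, _ := pb.At("example-model", mustTime("2026-03-15T12:00:00Z"))
	cur, _ := pb.At("example-model", mustTime("2026-06-08T16:00:00Z"))
	fmt.Println(old.PromptMicroUSD, cur.PromptMicroUSD) // March traffic keeps the January price
}
```

Because the lookup is keyed on the event's own timestamp, adding a new price row later does not change what older events rate at.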

Code

internal/rating: pure Rate() money core, effective-dated PriceBook, Rater.Run(window), Store (+ Postgres impl).
cmd/rater: one-shot batch job (run by a k8s CronJob), --since/--until, default = last complete hour. Exit 0 / 1 (fatal) / 2 (rated but anomaly: unpriced or unattributable).
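A rough sketch of the two cmd/rater behaviours described above: the default window and the exit-code mapping. `exitAnomaly`, `HasUnpriced`, `HasUnattributable`, and `HasAnomaly` follow names used in this PR; `exitOK`, `exitFatal`, `exitCode`, `lastCompleteHour`, and the demo variables are assumptions, as is computing the hour boundary in UTC.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	exitOK      = 0 // rated cleanly
	exitFatal   = 1 // could not rate at all
	exitAnomaly = 2 // rated, but unpriced or unattributable events were seen
)

// Result carries only the anomaly counters needed for this sketch.
type Result struct {
	UnpricedEvents       int
	UnattributableEvents int
}

func (r Result) HasUnpriced() bool       { return r.UnpricedEvents > 0 }
func (r Result) HasUnattributable() bool { return r.UnattributableEvents > 0 }
func (r Result) HasAnomaly() bool        { return r.HasUnpriced() || r.HasUnattributable() }

// exitCode maps a run outcome to the process exit status.
func exitCode(res Result, err error) int {
	if err != nil {
		return exitFatal
	}
	if res.HasAnomaly() {
		return exitAnomaly
	}
	return exitOK
}

// lastCompleteHour returns the most recent full hour before now,
// as a (since, until) pair aligned to UTC hour boundaries.
func lastCompleteHour(now time.Time) (since, until time.Time) {
	until = now.UTC().Truncate(time.Hour)
	return until.Add(-time.Hour), until
}

// Demo values, not part of any real API.
var (
	demoNow = time.Date(2026, 6, 8, 16, 47, 16, 0, time.UTC)
	errDemo = errors.New("database unreachable")
)

func main() {
	since, until := lastCompleteHour(demoNow)
	fmt.Println(since.Format(time.RFC3339), until.Format(time.RFC3339))
	fmt.Println(exitCode(Result{}, nil),
		exitCode(Result{UnpricedEvents: 1}, nil),
		exitCode(Result{}, errDemo))
}
```

A fatal error takes precedence over an anomaly here, so a CronJob alert on exit 2 always means the window was rated but something upstream needs attention.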

Base

Targets postgres-drainer (reads its billing_event table). ~36 tests, every money rule named, race/lint/fmt clean, zero new deps.

The revenue path. A batch job (cmd/rater) reads billing_event over a time window, joins an effective-dated price book, computes cost in integer micro-USD, and upserts per-(auth_id, model, hour) cost rollups into rated_usage.

Money correctness is the product; the non-negotiable invariants, each tested by name:

- INTEGER micro-USD (1e-6 USD) everywhere, never float. 1 Atlas hourly_usage_record unit (1e-4 USD) == 100 micro-USD; the finer base avoids rounding tiny per-token prices to zero before the multiply.
- Billable-prompt formula (the highest-risk line): vLLM's prompt_tokens is the TOTAL and cached_tokens is the cache-hit SUBSET, so cost = (prompt-cached)*prompt_rate + cached*cached_rate + completion*comp_rate. Cached tokens are charged ONCE (at the discounted rate), never double-counted.
- Effective-dated prices: an event is rated with the price in effect at its event_ts (fallback created_at); rating now never retroactively reprices old traffic.
- Idempotent re-runs: rollups upsert ON CONFLICT (auth_id, model, window_start) DO UPDATE, recomputing totals from scratch; re-rating a window reconciles, never doubles.
- Fail-closed on missing price: a model with no price-book entry at the event's time is counted as unpriced and logged loudly, NEVER silently billed $0 (that is lost revenue). The rater exits 2 so a CronJob can alert.

Schema (migrations + ready-to-copy Alembic, chained after billing_event's b1f0c2d3e4a5): model_price (effective-dated per-token price book) and rated_usage (the hourly rollup). Prices are DATA, not code: a clearly-labelled non-binding seed lives in seed_example_prices.sql; no prices ship in the schema.

No new dependencies. go build/vet/test -race/golangci-lint/gofmt all clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
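The idempotent-rollup invariant from this commit message can be illustrated with an in-memory stand-in for rated_usage. This is a sketch only: the map replaces the Postgres `ON CONFLICT ... DO UPDATE` upsert, and `Key`, `Event`, `Store`, `Rollup`, and `Upsert` are hypothetical names. Timestamps are Unix seconds, assumed nonnegative, to keep the example small.

```go
package main

import "fmt"

// Key is the unique rollup key: (auth_id, model, window_start).
type Key struct {
	AuthID      string
	Model       string
	WindowStart int64 // Unix seconds at the start of the hour
}

// Event is an already-rated billing event (cost in micro-USD).
type Event struct {
	AuthID       string
	Model        string
	TSUnix       int64
	CostMicroUSD int64
}

// Store stands in for the rated_usage table.
type Store struct {
	rows map[Key]int64
}

func NewStore() *Store { return &Store{rows: make(map[Key]int64)} }

// Rollup recomputes hourly totals from scratch for the given events.
func Rollup(events []Event) map[Key]int64 {
	out := make(map[Key]int64)
	for _, e := range events {
		k := Key{AuthID: e.AuthID, Model: e.Model, WindowStart: e.TSUnix - e.TSUnix%3600}
		out[k] += e.CostMicroUSD
	}
	return out
}

// Upsert overwrites the stored total for each key. It never adds to the
// existing value, which is what makes a re-run reconcile instead of double.
func (s *Store) Upsert(rollups map[Key]int64) {
	for k, total := range rollups {
		s.rows[k] = total
	}
}

func main() {
	events := []Event{
		{AuthID: "auth-1", Model: "example-model", TSUnix: 36005, CostMicroUSD: 100},
		{AuthID: "auth-1", Model: "example-model", TSUnix: 39000, CostMicroUSD: 50},
	}
	s := NewStore()
	s.Upsert(Rollup(events))
	s.Upsert(Rollup(events)) // re-rating the same window
	fmt.Println(s.rows[Key{AuthID: "auth-1", Model: "example-model", WindowStart: 36000}])
}
```

Both events fall in the hour starting at 36000, so the stored total is 150 micro-USD after either one or two runs.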

ReadWindow previously filtered out billing_event rows with NULL auth_id or model via `AND auth_id IS NOT NULL AND model IS NOT NULL`, with a comment claiming they were "surfaced elsewhere." That elsewhere did not exist, so such a row vanished from rating with zero count and zero alert: a silent revenue-loss / data-loss path, exactly what we fail LOUD on for missing prices.

Make the exclusion loud, mirroring UnpricedEvents:

- store: drop the NULL filter from the WHERE; scan every in-window row with sql.NullString and, when auth_id/model is NULL, count it and skip it (atomic single-scan count). ReadWindow now returns (events, unattributable, error).
- rater: add Result.UnattributableEvents, HasUnattributable(), and a combined HasAnomaly() = HasUnpriced() || HasUnattributable(); thread the count through Run(); log it loudly (ERROR) and fold it into the loud summary.
- cmd/rater: exit 2 on HasAnomaly() (was HasUnpriced()); rename exitUnpriced -> exitAnomaly and update the exit-code doc. A nonzero unattributable count now alerts a CronJob the same as unpriced events.
- replace the stale "surfaced elsewhere" comment with the truth.

This should never be nonzero (the interceptor's fail-closed billing gate rejects requests missing auth_id before they are metered), which is precisely why a nonzero count must be loudly surfaced: it means something upstream is broken and revenue is leaking.

Tests: TestRater_UnattributableEventsCountedNotSilent (not rated, counted, triggers anomaly/exit-2) and TestPostgresStore_ReadWindowCountsUnattributable (NULL rows skipped+counted in the scan). Existing money-rule and ReadWindow tests updated for the new signature; all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
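The count-and-skip step inside the scan might be sketched as below, without a live database. `rawRow`, `Event`, `splitAttributable`, `str`, and `null` are hypothetical names; only `sql.NullString` and its `Valid`/`String` fields come from Go's `database/sql` package. The real ReadWindow presumably does this while iterating the query result.

```go
package main

import (
	"database/sql"
	"fmt"
)

// rawRow is one scanned billing_event row; auth_id and model may be NULL.
type rawRow struct {
	AuthID       sql.NullString
	Model        sql.NullString
	PromptTokens int64
}

// Event is a row that can be attributed and therefore rated.
type Event struct {
	AuthID       string
	Model        string
	PromptTokens int64
}

// splitAttributable keeps rows with both auth_id and model set, and counts
// the rest in the same pass so nothing disappears without being counted.
func splitAttributable(rows []rawRow) (events []Event, unattributable int) {
	for _, r := range rows {
		if !r.AuthID.Valid || !r.Model.Valid {
			unattributable++
			continue
		}
		events = append(events, Event{
			AuthID:       r.AuthID.String,
			Model:        r.Model.String,
			PromptTokens: r.PromptTokens,
		})
	}
	return events, unattributable
}

// str and null build NullString values for the demo.
func str(s string) sql.NullString { return sql.NullString{String: s, Valid: true} }
func null() sql.NullString        { return sql.NullString{} }

func main() {
	rows := []rawRow{
		{AuthID: str("auth-1"), Model: str("example-model"), PromptTokens: 10},
		{AuthID: null(), Model: str("example-model"), PromptTokens: 20},
	}
	events, unattributable := splitAttributable(rows)
	fmt.Println(len(events), unattributable) // one rated, one counted as unattributable
}
```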

hhuuggoo and others added 2 commits June 8, 2026 16:41

This was referenced Jun 8, 2026

Dockerfile: build all three binaries (add rater) #8

Closed

Fix lost metering event on aborted requests (real abort-path race) #9

Closed

Capture X-Saturn-Auth-Id (token identity) into metering events #1

Closed

hhuuggoo closed this Jun 9, 2026

hhuuggoo wants to merge 2 commits into postgres-drainer from rating-v1
