Rating v1: turn metered token counts into money (the revenue path) #6
Closed
hhuuggoo wants to merge 2 commits into
The revenue path. A batch job (cmd/rater) reads billing_event over a time window, joins an effective-dated price book, computes cost in integer micro-USD, and upserts per-(auth_id, model, hour) cost rollups into rated_usage.

Money correctness is the product; the non-negotiable invariants, each tested by name:

- INTEGER micro-USD (1e-6 USD) everywhere, never float. 1 Atlas hourly_usage_record unit (1e-4 USD) == 100 micro-USD; the finer base avoids rounding tiny per-token prices to zero before the multiply.
- Billable-prompt formula (the highest-risk line): vLLM's prompt_tokens is the TOTAL and cached_tokens is the cache-hit SUBSET, so cost = (prompt-cached)*prompt_rate + cached*cached_rate + completion*comp_rate. Cached tokens are charged ONCE (at the discounted rate), never double-counted.
- Effective-dated prices: an event is rated with the price in effect at its event_ts (fallback created_at); rating "now" never retroactively reprices old traffic.
- Idempotent re-runs: rollups upsert ON CONFLICT (auth_id, model, window_start) DO UPDATE, recomputing totals from scratch, so re-rating a window reconciles, never doubles.
- Fail-closed on missing price: a model with no price-book entry at the event's time is counted as unpriced and logged loudly, NEVER silently billed $0 (that is lost revenue). The rater exits 2 so a CronJob can alert.

Schema (migrations + ready-to-copy Alembic, chained after billing_event's b1f0c2d3e4a5): model_price (effective-dated per-token price book) and rated_usage (the hourly rollup). Prices are DATA, not code: a clearly-labelled non-binding seed lives in seed_example_prices.sql; no prices ship in the schema.

No new dependencies. go build/vet/test -race/golangci-lint/gofmt all clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
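The billable-prompt formula above can be illustrated with a minimal Go sketch. This is not the PR's `Rate()` implementation: the type names, field names, `RateSketch`, and the decision to reject `cached_tokens > prompt_tokens` are all assumptions made for illustration, and overflow checking is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Price holds per-token rates in integer micro-USD (1e-6 USD).
// Hypothetical type; the PR's real price type is not shown in the source.
type Price struct {
	PromptMicroUSD     int64
	CachedMicroUSD     int64
	CompletionMicroUSD int64
}

// Usage holds the metered token counts for one event (hypothetical type).
type Usage struct {
	PromptTokens     int64 // TOTAL prompt tokens, cache hits included
	CachedTokens     int64 // cache-hit SUBSET of PromptTokens
	CompletionTokens int64
}

// ErrBadUsage is an assumed guard: the source does not say how bad counts are handled.
var ErrBadUsage = errors.New("negative count or cached_tokens > prompt_tokens")

// RateSketch returns the cost in micro-USD, charging each prompt token exactly once:
// uncached prompt tokens at the prompt rate, cached tokens at the cached rate.
func RateSketch(u Usage, p Price) (int64, error) {
	if u.PromptTokens < 0 || u.CachedTokens < 0 || u.CompletionTokens < 0 ||
		u.CachedTokens > u.PromptTokens {
		return 0, ErrBadUsage
	}
	billablePrompt := u.PromptTokens - u.CachedTokens
	cost := billablePrompt*p.PromptMicroUSD +
		u.CachedTokens*p.CachedMicroUSD +
		u.CompletionTokens*p.CompletionMicroUSD
	return cost, nil
}

func main() {
	// Made-up rates, purely for the arithmetic.
	p := Price{PromptMicroUSD: 3, CachedMicroUSD: 1, CompletionMicroUSD: 15}
	u := Usage{PromptTokens: 1000, CachedTokens: 400, CompletionTokens: 200}
	cost, err := RateSketch(u, p)
	// 600*3 + 400*1 + 200*15 = 5200 micro-USD
	fmt.Println(cost, err)
}
```

With these made-up rates, the naive `prompt*prompt_rate + cached*cached_rate` form would give 6400 micro-USD for the same event, charging the 400 cached tokens twice.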
ReadWindow previously filtered out billing_event rows with NULL auth_id or model via `AND auth_id IS NOT NULL AND model IS NOT NULL`, with a comment claiming they were "surfaced elsewhere." That elsewhere did not exist, so such a row vanished from rating with zero count and zero alert: a silent revenue-loss / data-loss path, exactly what we fail LOUD on for missing prices.

Make the exclusion loud, mirroring UnpricedEvents:

- store: drop the NULL filter from the WHERE; scan every in-window row with sql.NullString and, when auth_id/model is NULL, count it and skip it (atomic single-scan count). ReadWindow now returns (events, unattributable, error).
- rater: add Result.UnattributableEvents, HasUnattributable(), and a combined HasAnomaly() = HasUnpriced() || HasUnattributable(); thread the count through Run(); log it loudly (ERROR) and fold it into the loud summary.
- cmd/rater: exit 2 on HasAnomaly() (was HasUnpriced()); rename exitUnpriced -> exitAnomaly and update the exit-code doc. A nonzero unattributable count now alerts a CronJob the same as unpriced events.
- replace the stale "surfaced elsewhere" comment with the truth.

This should never be nonzero (the interceptor's fail-closed billing gate rejects requests missing auth_id before they are metered), which is precisely why a nonzero count must be loudly surfaced: it means something upstream is broken and revenue is leaking.

Tests: TestRater_UnattributableEventsCountedNotSilent (not rated, counted, triggers anomaly/exit-2) and TestPostgresStore_ReadWindowCountsUnattributable (NULL rows skipped+counted in the scan). Existing money-rule and ReadWindow tests updated for the new signature; all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced Jun 8, 2026
What
The rating system turns raw `billing_event` token counts into money. This is the piece between "we metered tokens" and "a neocloud can send an invoice." Per Wendy's PM assessment, this was the #1 revenue blocker (Phoebe metered but did not bill).

Mechanism only: Hugo sets the actual prices later, as data, not code. The rating system is reversible and buildable now; the prices are the one-way commercial door, decoupled.
Money correctness (treated like auth — go deep, it's expensive to be wrong)
- Integer micro-USD: `cost_micro_usd` is an `int64` in units of 1e-6 USD. 1 Atlas `hourly_usage_record` unit = 100 micro-USD.
- Billable-prompt formula: `prompt_tokens` is the TOTAL, `cached_tokens` the cache-hit SUBSET. Each prompt token is charged exactly once: `billable_prompt = prompt_tokens - cached_tokens` at the prompt rate, `cached_tokens` at the (discounted) cached rate. Charging the naive `prompt*prompt_rate + cached*cached_rate` would overbill every cache hit; this is guarded and tested by name.
- Effective-dated prices: an event is rated with the price in effect at its `event_ts`; rating "now" never retroactively reprices old traffic.
- Idempotent re-runs: rollups upsert on `(auth_id, model, window_start)`; re-rating a window reconciles, never doubles.
- Fail-closed on missing price: a missing price yields `ErrNoPrice`; the event is counted, logged at ERROR, and dropped from rollups (not silently $0-billed). The rater exits 2 so a CronJob alerts.

Schema
- `model_price`: effective-dated price book (prompt/cached/completion micro-USD per token). Tables only; no prices in the migration.
- `rated_usage`: hourly rollups per `(auth_id, model, window_start)`, unique-keyed for idempotent upsert.
- `migrations/seed_example_prices.sql`: clearly NON-BINDING placeholder prices (do not ship to prod).
- Ready-to-copy Alembic migration chained after `billing_event` (`down_revision = b1f0c2d3e4a5`), following the drainer's migration-ownership pattern.

Code
- `internal/rating`: pure `Rate()` money core, effective-dated `PriceBook`, `Rater.Run(window)`, `Store` (+ Postgres impl).
- `cmd/rater`: one-shot batch job (run by a k8s CronJob), `--since`/`--until`, default = last complete hour. Exit 0 / 1 (fatal) / 2 (rated but anomaly: unpriced or unattributable).

Base
Targets `postgres-drainer` (reads its `billing_event` table).

~36 tests, every money rule named, race/lint/fmt clean, zero new deps.