feat: manage ClickHouse named collections from connector lifecycle#9335
Draft
When a connector resource of an applicable driver type (s3, gcs, azure, mysql, postgres) reconciles in a project whose OLAP engine is ClickHouse, create or replace a named collection `rill_<connector_name>` on the CH server with the connector's resolved credentials. Drop the named collection on connector delete. The behavior is analogous to the TEMPORARY SECRET creation in the DuckDB driver (`connectorsForSecrets`) but is driven by connector reconcile rather than model execution.

- Add `runtime/drivers/clickhouse/named_collections.go` with the driver -> field-name mapping registry, CREATE/DROP SQL builders (with `ON CLUSTER` support), the `named_collection_admin` permission probe, and the auto-detection regex used by the model executor.
- Hook the create/drop into `runtime/reconcilers/connector.go`, gated on the project's resolved OLAP being ClickHouse. Surface a clear error if the CH user lacks `named_collection_admin`.
- Auto-detect `s3(rill_<conn>, ...)`, `postgresql(rill_<conn>, ...)`, etc. references in `runtime/drivers/clickhouse/model_executor_self.go` and emit warnings if the referenced collection is missing.
- Add unit tests for the field mapping, SQL builders, and detection regex, plus an end-to-end test sketch that exercises the full lifecycle against a real CH cluster via testcontainers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
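For readers who want a concrete picture of the CREATE/DROP builders described in the commit message, here is a minimal sketch. It is not the code in `named_collections.go`: the function names (`collectionName`, `buildCreateSQL`, `buildDropSQL`) are hypothetical, the quoting and escaping rules are assumptions, and it assumes "create or replace" is done by issuing a `DROP ... IF EXISTS` followed by a `CREATE`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// collectionName applies the rill_<connector_name> convention from this PR.
func collectionName(connector string) string {
	return "rill_" + connector
}

// quoteIdent wraps an identifier in backticks, escaping backslashes and backticks.
// (Assumed escaping; the real builder may differ.)
func quoteIdent(s string) string {
	r := strings.NewReplacer("\\", "\\\\", "`", "\\`")
	return "`" + r.Replace(s) + "`"
}

// quoteString renders a single-quoted string literal with backslash escaping.
func quoteString(s string) string {
	r := strings.NewReplacer("\\", "\\\\", "'", "\\'")
	return "'" + r.Replace(s) + "'"
}

// onCluster returns the optional ON CLUSTER clause; empty cluster means single node.
func onCluster(cluster string) string {
	if cluster == "" {
		return ""
	}
	return " ON CLUSTER " + quoteIdent(cluster)
}

// buildDropSQL builds the statement used on connector delete (and, in this
// sketch, before every create so that create behaves like "create or replace").
func buildDropSQL(connector, cluster string) string {
	return "DROP NAMED COLLECTION IF EXISTS " + quoteIdent(collectionName(connector)) + onCluster(cluster)
}

// buildCreateSQL builds the CREATE statement from already-mapped fields.
// Keys are sorted so the output is deterministic. The caller must pass at
// least one field, since an empty key list is not valid SQL.
func buildCreateSQL(connector, cluster string, fields map[string]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, quoteIdent(k)+" = "+quoteString(fields[k]))
	}
	return "CREATE NAMED COLLECTION " + quoteIdent(collectionName(connector)) +
		onCluster(cluster) + " AS " + strings.Join(pairs, ", ")
}

func main() {
	// Placeholder credentials for illustration only.
	fields := map[string]string{
		"access_key_id":     "EXAMPLE_KEY",
		"secret_access_key": "example-secret",
	}
	fmt.Println(buildDropSQL("my_s3", "prod"))
	fmt.Println(buildCreateSQL("my_s3", "prod", fields))
}
```

Sorting the keys keeps the generated statement stable, which makes builder output easy to assert on in unit tests.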
Backend half of the CH named-collections feature. The frontend half (CH source templates rewritten to reference `s3(rill_<connector>, url='...')` and similar) lives on `royendo/template-connectors-v2-frontend` (#9329); that branch's templates emit SQL that depends on this PR's named collections existing on the CH server. Backend should land first.

What this does
When a Rill connector resource of an applicable driver type (s3, gcs, azure, mysql, postgres) reconciles in a project that uses ClickHouse as the OLAP engine, this code creates a CH named collection called `rill_<connector_name>` populated from the connector's resolved config. On connector delete it drops the named collection. DuckDB-OLAP projects are unaffected.

This is analogous to how the DuckDB driver creates `TEMPORARY SECRET`s from connectors, but the lifecycle differs: CH named collections are cluster-wide and persistent, so creation is bound to the connector resource rather than per model execution.

Files
- `runtime/drivers/clickhouse/named_collections.go` (370 lines): driver field-mapping registry, `CREATE`/`DROP NAMED COLLECTION` SQL builders, permission probe, model-SQL reference detection.
- `runtime/drivers/clickhouse/named_collections_test.go`: unit tests for the registry, builders, and reference detection.
- `runtime/drivers/clickhouse/model_executor_self.go`: auto-detection hook in `selfToSelfExecutor.Execute`; warns when a model references `rill_<conn>` but no matching named collection exists.
- `runtime/reconcilers/connector.go`: `syncClickHouseNamedCollection` wired into the connector create + delete paths.
- `runtime/testruntime/testruntime.go`: new `NewInstanceWithClickhouseFiles` helper for tests that mutate project files at runtime.
- `runtime/connector_named_collection_test.go`: end-to-end test covering create → update → model query → delete, gated by `testmode.Expensive` and using the existing `ch_cluster_2S_2R/` cluster fixture.

Naming convention
The named collection identifier is `rill_<connector_name>`, fixed by cross-PR convention with the frontend templates on #9329.

Things to look at first
- `syncClickHouseNamedCollection` in `runtime/reconcilers/connector.go` acquires the OLAP handle on every connector reconcile. Worth confirming this is the right architectural seam vs hooking closer to the existing `UpdateInstanceConnector` plumbing.
- `CheckNamedCollectionAdmin` does a `CREATE` + `DROP` on a probe collection rather than reading `system.users`, since granted permissions in CH are role-dependent and not portably queryable. Adds ~2 round-trips per connector reconcile; could memoize per `*Connection` if it becomes hot-path noise.
- … `access_key_id`, `secret_access_key`, `host`, `database`, `user`, etc.). Worth a sanity check against the template SQL in #9329, "feat: rewire add-data modal to use template RPCs (template-connectors v2, PR 3)" (`s3(rill_foo, url='...')`, `postgresql(rill_foo, table='...')`, etc.) before merge.

Open items flagged by implementation
- … `google_application_credentials` is set (no HMAC). Could be a warning + skip instead; open to either.
- `url` table-function detection: the regex includes `url(...)`, but `https`-typed connectors aren't currently in the supported-driver registry, so a `url(rill_https, ...)` reference would auto-detect-warn without a corresponding NC. Either remove `url` from detection or extend the registry to handle https.
- … `rill_probe_*` collection is left behind. Currently surfaced as an error. Periodic GC would be nice but out of scope here.

Test instructions
Test is gated by `testmode.Expensive`. Uses real CH via testcontainers. Covers connector create → NC appears with expected fields, edit → NC updated, model with `s3(rill_…, …)` reference resolves cleanly, connector delete → NC dropped. Cluster path (`ON CLUSTER`) exercised via the cluster fixture.

Checklist:
Developed in collaboration with Claude Code
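
To make the `url` question under "Open items flagged by implementation" easier to discuss, here is a rough sketch of what reference detection could look like. It is not the regex from `named_collections.go`: the function list (`s3`, `gcs`, `azureBlobStorage`, `mysql`, `postgresql`), the helper names, and the `existing` lookup set are all assumptions, and the sketch takes the "remove `url` from detection" option so that `url(rill_https, ...)` produces no warning.

```go
package main

import (
	"fmt"
	"regexp"
)

// namedCollectionRef matches a supported table function whose first argument
// is a bare rill_<conn> identifier, e.g. s3(rill_my_s3, url='...').
// The function list is an assumption; url is deliberately left out.
var namedCollectionRef = regexp.MustCompile(
	`\b(?:s3|gcs|azureBlobStorage|mysql|postgresql)\s*\(\s*(rill_[A-Za-z0-9_]+)`)

// referencedCollections returns the distinct rill_* collection names
// referenced in the model SQL, in order of first appearance.
func referencedCollections(sql string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range namedCollectionRef.FindAllStringSubmatch(sql, -1) {
		name := m[1]
		if !seen[name] {
			seen[name] = true
			out = append(out, name)
		}
	}
	return out
}

// missingCollections returns the referenced names that are absent from the
// set of collections known to exist; these are the ones to warn about.
func missingCollections(sql string, existing map[string]bool) []string {
	var missing []string
	for _, name := range referencedCollections(sql) {
		if !existing[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	sql := "SELECT * FROM s3(rill_my_s3, url='https://bucket/x.parquet') " +
		"JOIN postgresql(rill_pg, table='t') USING id"
	existing := map[string]bool{"rill_my_s3": true}
	for _, name := range missingCollections(sql, existing) {
		fmt.Printf("warning: model references %s but no such named collection exists\n", name)
	}
}
```

A regex scan like this does not parse SQL, so a matching pattern inside a comment or string literal would also be reported. That is acceptable for a warning but would be too loose for a hard error.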