
Conversation

@DZakh (Member) commented Nov 11, 2025

Summary by CodeRabbit

  • New Features

    • Optional ClickHouse mirroring with automatic history-table setup and mirroring metrics.
    • Native timestamp fields added to generated entity schemas.
    • New storage metrics: batch write count and cumulative write time.
  • Chores

    • New ClickHouse environment configuration (host, database, credentials).
    • Normalized internal types and DB column rename to envio_checkpoint_id.
    • Persistence API updated to emit an explicit "change" history shape and new writeBatch payloads.
  • Tests

    • Added tests validating ClickHouse SQL/history table generation.

@coderabbitai bot (Contributor) commented Nov 11, 2025

Walkthrough

Adds ClickHouse mirror support and bindings; threads an optional Mirror through PgStorage initialization and batch writes; refactors Postgres/field-type system and many table schemas (including checkpoint renaming); changes in-memory/history payloads to Change.t; updates templates, env config, Prometheus metrics, and tests.

Changes

Cohort / File(s) / Change Summary

• ClickHouse integration & Mirror
  Files: codegenerator/cli/npm/envio/src/bindings/ClickHouse.res, codegenerator/cli/npm/envio/src/Mirror.res, codegenerator/cli/npm/envio/package.json, codegenerator/cli/npm/envio/package.json.tmpl
  Added ClickHouse FFI, SQL generation for history tables, client lifecycle management, and a Mirror abstraction (initialize, writeBatch); added the @clickhouse/client dependency. (A sketch of the abstraction follows this table.)

• PgStorage mirror plumbing & Postgres types
  Files: codegenerator/cli/npm/envio/src/PgStorage.res, codegenerator/cli/npm/envio/src/bindings/Postgres.res
  Threaded an optional ~mirror: option<Mirror.t>=? through make/initialize/writeBatch; mirror.initialize is invoked during init, and writeBatch mirrors history writes in parallel. Added Postgres.columnType variants.

• Field-type system refactor
  Files: codegenerator/cli/src/config_parsing/field_types.rs, codegenerator/cli/src/config_parsing/mod.rs, codegenerator/cli/src/config_parsing/entity_parsing.rs, codegenerator/cli/npm/envio/src/db/Table.res, codegenerator/cli/src/hbs_templating/codegen_templates.rs
  Replaced the legacy postgres_types with new field_types primitives (String, Int32, BigInt, BigDecimal, Number, Json, Date, Entity, Enum); updated mapping, getPgFieldType, SQL/type generation, and imports.

• Internal schema & history model changes
  Files: codegenerator/cli/npm/envio/src/db/InternalTable.res, codegenerator/cli/npm/envio/src/db/EntityHistory.res, codegenerator/cli/npm/envio/src/Internal.res, codegenerator/cli/npm/envio/src/Change.res, codegenerator/cli/npm/envio/src/db/Table.res
  Normalized many table fields to String/Int32/Json/BigInt/Date; renamed checkpoint_id → envio_checkpoint_id; introduced Change.t<'entity> (Set/Delete) and switched history schemas and usages to setChangeSchema.

• In-memory store/table and persistence API changes
  Files: codegenerator/cli/npm/envio/src/InMemoryStore.res, codegenerator/cli/npm/envio/src/InMemoryTable.res, codegenerator/cli/npm/envio/src/Persistence.res, codegenerator/cli/npm/envio/src/IO.res, codegenerator/cli/npm/envio/src/TestHelpers_MockDb.res.hbs, scenarios/test_codegen/test/helpers/Mock.res
  Replaced the previous entityUpdate wrapper with Change.t payloads; changed InMemoryStore/InMemoryTable APIs and signatures; expanded the Persistence.writeBatch signature (removed inMemoryStore; added rawEvents, rollbackTargetCheckpointId, updatedEntities, updatedEffectsCache).

• Templates, env config, and generated code changes
  Files: codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs, codegenerator/cli/templates/static/codegen/src/Env.res, codegenerator/cli/templates/static/codegen/src/EventProcessing.res, codegenerator/cli/templates/static/codegen/src/UserContext.res, codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs, codegenerator/cli/src/hbs_templating/codegen_templates.rs
  Template adjustments: platform/ecosystem naming, an EventRegister parameter change, conditional Mirror creation in makeStorage, and removal of ~sql=initialSql from Persistence.make; added Env.ClickHouse config; updated generated event processing to call the storage write and the new Prometheus increments.

• Prometheus metrics & wiring
  Files: codegenerator/cli/npm/envio/src/Prometheus.res, codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  Added storage write counters/timers and Mirror-specific counters; wired increments into the write path.

• Tests & scenarios (timestamps, ClickHouse SQL test, rollbacks)
  Files: scenarios/test_codegen/schema.graphql, scenarios/test_codegen/src/EventHandlers.ts, scenarios/test_codegen/test/WriteRead_test.res, scenarios/test_codegen/test/lib_tests/PgStorage_test.res, scenarios/test_codegen/test/lib_tests/ClickHouse_test.res, multiple scenario tests
  Added timestamp fields to the schema and tests; updated many tests and SQL to use envio_checkpoint_id and the new Change-based history shapes; added a ClickHouse CREATE TABLE generation unit test.

• Utilities & minor additions
  Files: codegenerator/cli/npm/envio/src/Utils.res
  Added the Array.immutableEmpty constant.
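To make the Mirror cohort concrete, here is a minimal TypeScript sketch of the initialize/writeBatch shape described above. The argument names are assumptions inferred from this summary, not the actual ReScript signatures in Mirror.res:

```typescript
// Hypothetical interface; the real Mirror.t is defined in ReScript and may differ.
interface Mirror {
  // Create the mirror database and per-entity history tables before indexing starts.
  initialize(args: {
    entities: { name: string }[];
    enums: { name: string; variants: string[] }[];
  }): Promise<void>;
  // Mirror one batch of entity-history rows alongside the Postgres write.
  writeBatch(args: {
    updatedEntities: { entityName: string; rows: unknown[] }[];
  }): Promise<void>;
}
```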

Sequence Diagram(s)

sequenceDiagram
    participant App
    participant PgStorage
    participant Mirror
    participant ClickHouse as CH
    participant Postgres as PG

    App->>PgStorage: make(..., ~mirror=Some(m)?)
    App->>PgStorage: initialize(chainConfigs, entities, enums)
    PgStorage->>Mirror: initialize(~chainConfigs, ~entities, ~enums)
    Mirror->>CH: createClient / exec(drop/create/use DB)
    loop per entity
      Mirror->>CH: exec(CREATE TABLE IF NOT EXISTS ...envio_history...)
    end
    CH-->>Mirror: close / success
    Mirror-->>PgStorage: initialized
    PgStorage->>PG: run initialization SQL
    PG-->>PgStorage: success

    App->>PgStorage: writeBatch(~batch, ~rawEvents, ~updatedEntities, ...)
    par Parallel writes
        PgStorage->>PG: INSERT/UPDATE batch
        PgStorage->>Mirror: writeBatch(~updatedEntities)
        Mirror->>CH: insert(...) / exec
    end
    PG-->>PgStorage: write result
    Mirror-->>PgStorage: mirror result
    PgStorage-->>App: completed
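The par block in the diagram is the key design point: the Postgres write and the mirror write start together, and the batch completes only when both settle. A minimal TypeScript sketch of that fan-out, with hypothetical names standing in for the PgStorage internals:

```typescript
// Illustrative only: fan out the Postgres write and the optional mirror write.
// `writePostgres` and `mirror` are hypothetical stand-ins, not the real API.
async function writeBatchWithMirror(
  writePostgres: () => Promise<void>,
  mirror: { writeBatch: (updatedEntities: unknown[]) => Promise<void> } | undefined,
  updatedEntities: unknown[],
): Promise<void> {
  const writes: Promise<void>[] = [writePostgres()];
  if (mirror !== undefined) {
    writes.push(mirror.writeBatch(updatedEntities));
  }
  // Promise.all rejects on the first failure, so a mirror error also fails the
  // batch; this is exactly the error-handling question flagged for review below.
  await Promise.all(writes);
}
```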

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Areas to inspect closely:

  • ClickHouse SQL generation: type mapping, Nullable/Array formatting, ORDER BY and ENGINE correctness.
  • PgStorage concurrency and error handling when Postgres writes run in parallel with mirror writes.
  • Field-type refactor ripple: config parsing, templates, generated code, and tests.
  • EntityHistory SQL changes (envio_checkpoint_id) and backfill/prune/rollback correctness.
  • Persistence.writeBatch signature and call sites to ensure callers updated consistently.

Suggested reviewers

  • JonoPrest
  • moose-code

Poem

🐇
I hopped through types and tiny SQL nights,
Mirrors now hum with history's lights,
Int32s and Strings in tidy rows align,
ClickHouse listens while metrics chime,
A rabbit cheers — the pipeline's fine!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
• Docstring Coverage: ⚠️ Warning. Docstring coverage is 50.00%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
• Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
• Title Check: ✅ Passed. The PR title 'ClickHouse mirror' directly describes the main feature introduced: a new Mirror system with ClickHouse support as a data backend mirror.


Base automatically changed from dz/platform-abstraction to main on November 12, 2025 at 10:19
@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
codegenerator/cli/npm/envio/package.json.tmpl (1)

34-34: Same version verification needed as in package.json.

This is the template version of the same dependency added in codegenerator/cli/npm/envio/package.json line 23. Please refer to that review comment for verification steps.

🧹 Nitpick comments (1)
codegenerator/cli/npm/envio/src/PgStorage.res (1)

554-868: LGTM: Comprehensive batch write implementation.

The writeBatch function consolidates multiple storage operations into an efficient batch:

  • Entity history management
  • Rollback operations
  • Batch entity updates
  • Checkpoint tracking
  • Effect cache operations

The implementation handles edge cases including UTF8 encoding errors with retry logic. While the function is complex, the structure is clear and the complexity is warranted given the scope.

Optional refactor suggestion: Consider extracting the UTF8 encoding error handling (lines 847-867) into a separate helper function in a future refactor for improved maintainability.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f34a2b and d673b9d.

⛔ Files ignored due to path filters (2)
  • codegenerator/cli/npm/envio/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
  • scenarios/test_codegen/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (31)
  • codegenerator/cli/npm/envio/package.json (1 hunks)
  • codegenerator/cli/npm/envio/package.json.tmpl (1 hunks)
  • codegenerator/cli/npm/envio/src/Config.res (3 hunks)
  • codegenerator/cli/npm/envio/src/EventRegister.res (3 hunks)
  • codegenerator/cli/npm/envio/src/EventRegister.resi (1 hunks)
  • codegenerator/cli/npm/envio/src/InMemoryStore.gen.ts (1 hunks)
  • codegenerator/cli/npm/envio/src/Mirror.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Persistence.res (5 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (6 hunks)
  • codegenerator/cli/npm/envio/src/Platform.res (1 hunks)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
  • codegenerator/cli/src/hbs_templating/codegen_templates.rs (0 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs (3 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (3 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/Types.res.hbs (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/Env.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (5 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/UserContext.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/Db.res (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsEntities.res (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (8 hunks)
  • codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res (1 hunks)
  • scenarios/test_codegen/test/helpers/DbHelpers.res (1 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (1 hunks)
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res (2 hunks)
  • scenarios/test_codegen/test/schema_types/BigDecimal_test.res (1 hunks)
  • scenarios/test_codegen/test/schema_types/Timestamp_test.res (1 hunks)
💤 Files with no reviewable changes (5)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js
  • codegenerator/cli/src/hbs_templating/codegen_templates.rs
  • codegenerator/cli/templates/dynamic/codegen/src/Types.res.hbs
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsEntities.res
  • codegenerator/cli/templates/static/codegen/src/db/Db.res
🧰 Additional context used
📓 Path-based instructions (6)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/Config.res
  • codegenerator/cli/npm/envio/package.json.tmpl
  • codegenerator/cli/npm/envio/src/EventRegister.resi
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • codegenerator/cli/npm/envio/src/InMemoryStore.gen.ts
  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Platform.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/package.json
  • codegenerator/cli/npm/envio/src/EventRegister.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/npm/envio/src/Persistence.res
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/Config.res
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • scenarios/test_codegen/test/schema_types/Timestamp_test.res
  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Platform.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/src/EventRegister.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/npm/envio/src/Persistence.res
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • scenarios/test_codegen/test/helpers/DbHelpers.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/test_codegen/test/schema_types/BigDecimal_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/Config.res
  • codegenerator/cli/npm/envio/package.json.tmpl
  • codegenerator/cli/npm/envio/src/EventRegister.resi
  • codegenerator/cli/npm/envio/src/InMemoryStore.gen.ts
  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Platform.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/package.json
  • codegenerator/cli/npm/envio/src/EventRegister.res
  • codegenerator/cli/npm/envio/src/Persistence.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
codegenerator/cli/templates/static/**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Static templates are raw ReScript files copied verbatim under codegenerator/cli/templates/static

Files:

  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
{codegenerator/cli/templates/static/codegen/src/**,codegenerator/cli/templates/dynamic/codegen/src/**}

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Template versions of runtime live under templates/{static,dynamic}/codegen/src and are the recommended editing targets

Files:

  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
codegenerator/cli/templates/dynamic/**/*.hbs

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Dynamic templates are Handlebars files under codegenerator/cli/templates/dynamic

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
🧠 Learnings (11)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Config.res : In generated runtime, Config.res maps env to typed config and sets up persistence

Applied to files:

  • codegenerator/cli/npm/envio/src/Config.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
📚 Learning: 2025-05-27T17:07:04.699Z
Learnt from: JonoPrest
Repo: enviodev/hyperindex PR: 555
File: scenarios/test_codegen/test/lib_tests/EntityHistory_test.res:582-629
Timestamp: 2025-05-27T17:07:04.699Z
Learning: In ReScript test files using DbHelpers.resetPostgresClient(), this function is synchronous with signature `unit => unit` and should not be awaited. It performs a raw JavaScript reassignment of the postgres client and does not return a promise.

Applied to files:

  • scenarios/test_codegen/test/schema_types/Timestamp_test.res
  • codegenerator/cli/templates/static/codegen/src/db/Migrations.res
  • scenarios/test_codegen/test/helpers/DbHelpers.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
📚 Learning: 2025-05-27T17:07:12.878Z
Learnt from: JonoPrest
Repo: enviodev/hyperindex PR: 555
File: scenarios/test_codegen/test/lib_tests/EntityHistory_test.res:631-659
Timestamp: 2025-05-27T17:07:12.878Z
Learning: In isolated tests for EntityHistory rollbacks with controlled mock data setup, the ordering of getRollbackDiff results is deterministic and doesn't require sorting before comparison with deepStrictEqual.

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
  • scenarios/test_codegen/test/helpers/Mock.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/IO.res : In generated runtime, IO.res is used by persistence and processing

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to codegenerator/cli/templates/dynamic/**/*.hbs : Dynamic templates are Handlebars files under codegenerator/cli/templates/dynamic

Applied to files:

  • codegenerator/cli/templates/dynamic/codegen/src/Generated.res.hbs
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/ChainManager.res : In generated runtime, ChainManager.res is part of the fetch side

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to codegenerator/cli/npm/envio/** : Shared library-fied runtime lives in codegenerator/cli/npm/envio

Applied to files:

  • codegenerator/cli/npm/envio/package.json
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/EventProcessing.res : In generated runtime, EventProcessing.res handles processing

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • scenarios/test_codegen/test/helpers/DbHelpers.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Persistence.res : In generated runtime, Persistence.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Hasura.res : In generated runtime, Hasura.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/Persistence.res
🔇 Additional comments (27)
codegenerator/cli/templates/static/codegen/src/ink/components/CustomHooks.res (1)

37-37: No issues found - types are compatible.

The Platform.name type definition contains only Evm and Fuel variants, which exactly matches the ecosystem type. The upcast operator at line 37 is safe and type-correct. No additional variants exist that could cause runtime issues.

scenarios/test_codegen/test/schema_types/BigDecimal_test.res (1)

16-16: LGTM!

The switch from Generated.codegenPersistence.sql to Db.makeClient() correctly aligns with the architectural refactor moving away from exposing SQL clients through the persistence layer.

scenarios/test_codegen/test/helpers/DbHelpers.res (1)

2-6: LGTM!

The removal of the direct SQL client assignment and the updated comment correctly reflect the architectural shift toward storage-based APIs. The function still properly resets the storage layer for tests.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

51-51: LGTM!

The migration from persistence.sql->DbFunctions.EntityHistory.hasRows to persistence.storage.hasEntityHistoryRows() provides better abstraction and aligns with the storage-based persistence API refactor.

codegenerator/cli/npm/envio/src/EventRegister.resi (1)

6-10: LGTM!

The parameter rename from ecosystem to platform is a clear improvement in terminology and aligns with the broader architectural refactor introducing the Platform.t type throughout the codebase.

scenarios/test_codegen/test/lib_tests/Persistence_test.res (2)

7-7: LGTM!

The removal of the ~sql parameter from Persistence.make correctly reflects the API change that internalizes SQL client management within the storage layer, simplifying the public interface.


133-133: LGTM!

Consistent with the signature change at line 7, correctly removing the ~sql parameter.

codegenerator/cli/templates/static/codegen/src/db/Migrations.res (1)

12-12: LGTM!

The migration from sql->unsafe(query) to storage.executeUnsafe(query)->Promise.ignoreValue correctly aligns with the storage-based API refactor. The explicit Promise.ignoreValue appropriately discards the return value for this DDL operation.

codegenerator/cli/npm/envio/package.json (1)

23-23: No issues found with @clickhouse/client version 1.12.1.

The pinned version is the latest available on npm and has no known security vulnerabilities.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (5)

190-190: LGTM: Storage abstraction applied correctly.

The change from direct SQL calls to storage-based API is consistent with the broader persistence layer refactor introduced in this PR.


198-198: LGTM: Function signature updated for platform-based operations.

The additional ~indexer parameter enables access to platform-specific operations (like getTimestamp on line 231). All call sites within this file have been updated consistently.


231-231: LGTM: Platform-based timestamp extraction.

The change from direct Block.getTimestamp to platform.getTimestamp aligns with the platform abstraction introduced in this PR to support multiple blockchain platforms.


821-855: LGTM: Rollback and maintenance operations migrated to storage interface.

The changes consistently migrate rollback and maintenance operations (pruneStaleCheckpoints, pruneStaleEntityHistory) to the storage-based API, aligning with the broader persistence layer abstraction.


1020-1037: LGTM: Rollback data retrieval migrated to storage interface.

The rollback target checkpoint and progress diff queries are now accessed through the storage interface, maintaining consistency with the storage abstraction pattern.

codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (3)

46-47: LGTM: Platform parameter updated in registration.

The change from ecosystem to platform is consistent with the broader platform abstraction introduced in this PR.


373-379: LGTM: Platform-based block field extraction.

The changes route timestamp and block number extraction through the platform abstraction layer, enabling support for different blockchain platforms.


590-606: Placeholder mock implementations acceptable; note test coverage gap for rollback/checkpoint features.

The mock storage methods at lines 590-605 are placeholder implementations that return no-ops or empty results. This is acceptable for the current test suite, as the EntityHistory_test.res file shows all tests calling these methods (such as pruneStaleEntityHistory, getRollbackTargetCheckpoint) are commented out. However, this reveals a test coverage gap: production code in GlobalState.res actively uses these methods for critical rollback and checkpoint maintenance operations. The mock placeholders won't exercise these code paths until the disabled tests are re-enabled.

codegenerator/cli/npm/envio/src/Persistence.res (3)

39-43: LGTM: New type for batch cache operations.

The effectBatchCache type enables collecting effect cache operations before the batch write, allowing all storage operations to be executed together efficiently.


86-127: LGTM: Storage interface expanded for rollback and maintenance operations.

The storage interface has been significantly expanded to include:

  • Rollback operations (getRollbackTargetCheckpoint, getRollbackProgressDiff, getRollbackData)
  • Maintenance operations (pruneStaleCheckpoints, pruneStaleEntityHistory)
  • Batch writing with cache support (writeBatch with ~batchCache parameter)

Note that the writeBatch signature is a breaking change that requires all call sites to provide the ~batchCache parameter.


242-263: LGTM: Effect cache now batched for optimal writes.

The function has been renamed to getEffectBatchCache and now returns a batch cache object instead of writing immediately. This enables batching effect cache operations with other storage operations in writeBatch for better performance.

The in-memory cache counter is still updated immediately to maintain consistency for metrics.
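As a rough illustration of that flow, the batch cache can be modeled as a plain snapshot that is flushed later in one round trip. All names below are illustrative, not the real Persistence API:

```typescript
// Illustrative batch-cache pattern: capture pending effect results now,
// write them together with the rest of the batch later.
type EffectBatchCache = { table: string; rows: unknown[] }[];

function getEffectBatchCache(pending: Map<string, unknown[]>): EffectBatchCache {
  // No storage call happens here; we only snapshot what needs to be written.
  return [...pending.entries()].map(([table, rows]) => ({ table, rows }));
}

async function flushBatch(
  batchCache: EffectBatchCache,
  insert: (table: string, rows: unknown[]) => Promise<void>,
): Promise<void> {
  // A single batched flush replaces many immediate per-effect writes.
  await Promise.all(batchCache.map(({ table, rows }) => insert(table, rows)));
}
```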

codegenerator/cli/npm/envio/src/PgStorage.res (7)

179-185: LGTM: Delete query builders added.

Clean implementation of delete query builders following the established pattern for load queries.


516-540: LGTM: Delete operation implementation.

The deleteByIdsOrThrow function properly handles both single and batch delete cases with consistent error handling.


542-552: LGTM: Helper for conditional set execution.

Simple and effective optimization to avoid unnecessary database calls when there are no items to process.


879-879: LGTM: Optional mirror parameter added.

The optional ~mirror parameter enables ClickHouse mirroring support while maintaining backward compatibility.


1011-1015: LGTM: Mirror initialization integrated.

The mirror initialization is properly invoked before PostgreSQL schema creation, ensuring the mirror is ready to receive data when indexing begins.


1276-1358: LGTM: Storage interface methods implemented.

The concrete PostgreSQL implementations for the storage interface are well-structured:

  • executeUnsafe provides raw query capability
  • hasEntityHistoryRows comprehensively checks all history tables
  • Maintenance and rollback operations properly delegate to internal table functions
  • Error handling is appropriate throughout

All implementations follow consistent patterns and integrate cleanly with the existing codebase.


1359-1398: LGTM: Public storage interface properly exported.

The writeBatchMethod wrapper and the returned storage object properly expose all new storage methods, completing the storage abstraction layer. All methods from the Persistence.storage interface are implemented and exported.

Comment on lines 187 to 205 of scenarios/test_codegen/test/helpers/Mock.res:

        executeUnsafe: _ => Promise.resolve(%raw(`undefined`)),
        hasEntityHistoryRows: () => Promise.resolve(false),
        setChainMeta: _ => Promise.resolve(%raw(`undefined`)),
        pruneStaleCheckpoints: _ => Promise.resolve(),
        pruneStaleEntityHistory: (~entityName as _, ~entityIndex as _, ~safeCheckpointId as _) =>
          Promise.resolve(),
        getRollbackTargetCheckpoint: (~reorgChainId as _, ~lastKnownValidBlockNumber as _) =>
          Promise.resolve([]),
        getRollbackProgressDiff: _ => Promise.resolve([]),
        getRollbackData: (~entityConfig as _, ~rollbackTargetCheckpointId as _) =>
          Promise.resolve(([], [])),
        writeBatch: (
          ~batch as _,
          ~inMemoryStore as _,
          ~isInReorgThreshold as _,
          ~config as _,
          ~allEntities as _,
          ~batchCache as _,
        ) => Promise.resolve(),

⚠️ Potential issue | 🔴 Critical

Return a list from executeUnsafe.
Line 187 currently resolves the mock promise to undefined, but downstream code (and the real storage implementation) expects executeUnsafe to yield an array of rows. Any test that iterates or spreads the result will crash instead of exercising the handler. Please have the mock mirror the production contract by returning an empty array (or a configurable array).

-        executeUnsafe: _ => Promise.resolve(%raw(`undefined`)),
+        executeUnsafe: _ => Promise.resolve([]),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        executeUnsafe: _ => Promise.resolve(%raw(`undefined`)),
+        executeUnsafe: _ => Promise.resolve([]),
         hasEntityHistoryRows: () => Promise.resolve(false),
         setChainMeta: _ => Promise.resolve(%raw(`undefined`)),
         pruneStaleCheckpoints: _ => Promise.resolve(),
         pruneStaleEntityHistory: (~entityName as _, ~entityIndex as _, ~safeCheckpointId as _) =>
           Promise.resolve(),
         getRollbackTargetCheckpoint: (~reorgChainId as _, ~lastKnownValidBlockNumber as _) =>
           Promise.resolve([]),
         getRollbackProgressDiff: _ => Promise.resolve([]),
         getRollbackData: (~entityConfig as _, ~rollbackTargetCheckpointId as _) =>
           Promise.resolve(([], [])),
         writeBatch: (
           ~batch as _,
           ~inMemoryStore as _,
           ~isInReorgThreshold as _,
           ~config as _,
           ~allEntities as _,
           ~batchCache as _,
         ) => Promise.resolve(),
🤖 Prompt for AI Agents
In scenarios/test_codegen/test/helpers/Mock.res around lines 187 to 205, the
mock implementation of executeUnsafe currently returns
Promise.resolve(%raw(`undefined`)) which breaks callers expecting an array;
change executeUnsafe to return an array (e.g. Promise.resolve([])) to mirror the
real storage contract — optionally allow configuring the returned array for
tests so callers can iterate/spread safely.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1)

25-48: Fix NUMERIC/BigInt ClickHouse mappings before shipping

Mapping Table.BigInt to Int64 will overflow anything beyond 2⁶³‑1, and the Custom("NUMERIC(p,s)") branch rewrites every numeric as Decimal128(18), treating 18 as the scale while ignoring the original precision/scale. This was already flagged earlier and still breaks big integers and decimals. Parse the NUMERIC definition and pick the appropriate ClickHouse Decimal* type, falling back only when ClickHouse can’t represent the schema.

-  | BigInt => "Int64" // FIXME: This is not correct, we need to use higher precision
+  | BigInt => "Decimal256(0)"
@@
-    if name->Js.String2.startsWith("NUMERIC(") {
-      // Extract precision from NUMERIC(p, s) or use default
-      name
-      ->Js.String2.replace("NUMERIC", "Decimal128")
-      ->Js.String2.replaceByRe(%re("/\((\d+),\s*(\d+)\)/"), "(18)")
-    } else {
+    if name->Js.String2.startsWith("NUMERIC(") {
+      switch name->Js.String2.match(%re("/^NUMERIC\\((\\d+)(?:,\\s*(\\d+))?\\)$/i")) {
+      | Some(groups) => {
+          let precisionOpt = groups->Js.Array2.getUnsafe(1)->Belt.Int.fromString
+          let scale =
+            groups
+            ->Js.Array2.get(2)
+            ->Belt.Option.flatMap(value => value->Belt.Int.fromString)
+            ->Belt.Option.getWithDefault(0)
+          switch precisionOpt {
+          | Some(precision) when scale <= precision && precision <= 76 => {
+              let target =
+                if precision <= 9 {
+                  "Decimal32"
+                } else if precision <= 18 {
+                  "Decimal64"
+                } else if precision <= 38 {
+                  "Decimal128"
+                } else {
+                  "Decimal256"
+                }
+              `${target}(${Belt.Int.toString(scale)})`
+            }
+          | _ => "String"
+          }
+        }
+      | None => "String"
+      }
+    } else {
       // For enums and other custom types, return String as fallback
       "String"
     }
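Expressed outside the diff, the proposed rule is small enough to state as one function. This TypeScript sketch mirrors the logic suggested above (the smallest Decimal type that holds the precision, String as the fallback); it is illustrative, not the code in ClickHouse.res:

```typescript
// Map a Postgres NUMERIC(p[, s]) definition to a ClickHouse Decimal type.
// Decimal32/64/128/256 hold up to 9/18/38/76 digits of precision respectively.
function numericToClickHouse(pgType: string): string {
  const match = /^NUMERIC\((\d+)(?:,\s*(\d+))?\)$/i.exec(pgType);
  if (match === null) return "String";
  const precision = Number(match[1]);
  const scale = match[2] === undefined ? 0 : Number(match[2]);
  if (scale > precision || precision > 76) return "String";
  if (precision <= 9) return `Decimal32(${scale})`;
  if (precision <= 18) return `Decimal64(${scale})`;
  if (precision <= 38) return `Decimal128(${scale})`;
  return `Decimal256(${scale})`;
}

// numericToClickHouse("NUMERIC(38, 18)") === "Decimal128(18)"
```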
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d673b9d and e25caff.

⛔ Files ignored due to path filters (1)
  • scenarios/test_codegen/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (10)
  • codegenerator/cli/npm/envio/src/Internal.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (3 hunks)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res (2 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (5 hunks)
  • codegenerator/cli/npm/envio/src/db/Table.res (1 hunks)
  • codegenerator/cli/src/config_parsing/entity_parsing.rs (10 hunks)
  • codegenerator/cli/src/config_parsing/field_types.rs (1 hunks)
  • codegenerator/cli/src/config_parsing/mod.rs (1 hunks)
  • codegenerator/cli/src/hbs_templating/codegen_templates.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • codegenerator/cli/src/hbs_templating/codegen_templates.rs
  • codegenerator/cli/npm/envio/src/PgStorage.res
🧰 Additional context used
📓 Path-based instructions (3)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/src/config_parsing/mod.rs
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/src/config_parsing/entity_parsing.rs
  • codegenerator/cli/src/config_parsing/field_types.rs
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to codegenerator/cli/src/system_config.rs : system_config.rs converts to internal structs

Applied to files:

  • codegenerator/cli/src/config_parsing/mod.rs
🧬 Code graph analysis (1)
codegenerator/cli/src/config_parsing/field_types.rs (1)
codegenerator/cli/templates/static/codegen/index.js (1)
  • BigDecimal (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (10)
codegenerator/cli/src/config_parsing/entity_parsing.rs (5)

2-2: LGTM: Import path refactored to field_types module.

This change aligns with the PR's objective to introduce a field type abstraction that works across multiple database backends (Postgres and ClickHouse).


1009-1009: LGTM: Improved error message with documentation link.

The updated error message provides clearer guidance by explicitly mentioning @derivedFrom and linking to documentation, making it more actionable for developers.


1243-1243: LGTM: More descriptive placeholder text.

The change from <MY_ENTITY> to <ENTITY_FIELD_NAME> better clarifies what developers should substitute in the example.


1473-1473: LGTM: Test imports and assertions updated consistently.

All test code has been properly updated to use the new field_types module and corresponding primitive types (String, Int32, etc.), ensuring test coverage for the refactored type system.

Also applies to: 1845-1845, 1858-1858, 1936-1936, 1965-1965, 1990-1990


1413-1429: Verify the Timestamp → Date mapping before merging.

The mapping of GqlScalar::Timestamp to PGPrimitive::Date at line 1426 shows inconsistent naming: the ReScript side uses RescriptTypeIdent::Timestamp (line 1441) while the PostgreSQL side uses PGPrimitive::Date. Database systems typically distinguish between DATE (date-only) and TIMESTAMP (with time component), so this inconsistency warrants verification.

Please confirm: Does PGPrimitive::Date actually generate a TIMESTAMP or TIMESTAMP WITH TIME ZONE column in the database schema, or does it generate a DATE column? If it generates a DATE column, rename the variant to PGPrimitive::Timestamp for clarity and consistency.

codegenerator/cli/npm/envio/src/db/InternalTable.res (5)

49-58: LGTM: DynamicContractRegistry table fields updated consistently.

The field type updates correctly map:

  • Text fields → String
  • Numeric fields → Int32

This aligns with the broader field_types refactoring described in the PR.


134-165: LGTM: Chains table fields updated consistently.

The field type updates are systematic:

  • Numeric fields → Int32
  • Timestamp field → DateNull (nullable date/timestamp)
  • Boolean field properly typed

These changes align with the field_types refactoring.


370-377: LGTM: PersistedState table fields updated appropriately.

The text fields are correctly mapped to String, while the auto-increment primary key remains as Serial.


406-411: LGTM: Checkpoints table fields updated consistently.

The field types are correctly updated with Int32 for numeric fields and String for text fields, including proper nullable handling.


587-601: LGTM: RawEvents table fields updated with proper types.

The field types are correctly mapped:

  • BigInt for large event IDs (line 588)
  • Int32 for standard numeric fields
  • Json for JSON columns (lines 596-598)
  • String for text fields

All changes align with the field_types refactoring.

Comment on lines +114 to +118 of codegenerator/cli/npm/envio/src/bindings/ClickHouse.res:

  try {
    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
    await client->exec({query: `CREATE DATABASE ${database}`})
    await client->exec({query: `USE ${database}`})


⚠️ Potential issue | 🔴 Critical

Stop dropping the ClickHouse database on init

initialize currently runs DROP DATABASE IF EXISTS ${database} before every bootstrap, erasing all mirrored data on restart. That’s catastrophic in production. Make the schema setup idempotent by only creating the database when absent instead of dropping it.

-    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-    await client->exec({query: `CREATE DATABASE ${database}`})
-    await client->exec({query: `USE ${database}`})
+    await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})
+    await client->exec({query: `USE ${database}`})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
   try {
-    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-    await client->exec({query: `CREATE DATABASE ${database}`})
+    await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})
     await client->exec({query: `USE ${database}`})
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/bindings/ClickHouse.res around lines 114 to
118, the initializer currently executes "DROP DATABASE IF EXISTS ${database}"
which destroys mirrored data on every bootstrap; instead make schema setup
idempotent by removing the DROP and creating the database only if it doesn't
exist. Replace the DROP + CREATE sequence with a single idempotent operation
(for example use "CREATE DATABASE IF NOT EXISTS ${database}" or first check for
existence and only create when absent), then continue with "USE ${database}" so
rest of initialization runs against the existing or newly-created database.
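For comparison, an idempotent bootstrap written directly against the @clickhouse/client API looks like the following. The environment variable names and the table definition are placeholders, not the generated ones:

```typescript
import { createClient } from "@clickhouse/client";

async function initializeMirror(database: string): Promise<void> {
  const client = createClient({
    url: process.env.CLICKHOUSE_HOST,
    username: process.env.CLICKHOUSE_USER,
    password: process.env.CLICKHOUSE_PASSWORD,
  });
  try {
    // CREATE ... IF NOT EXISTS keeps restarts from destroying mirrored data.
    await client.exec({ query: `CREATE DATABASE IF NOT EXISTS ${database}` });
    await client.exec({
      query: `CREATE TABLE IF NOT EXISTS ${database}.example_history
              (id String) ENGINE = MergeTree ORDER BY id`,
    });
  } finally {
    await client.close();
  }
}
```

Qualifying table names with the database also sidesteps USE not persisting across requests on the HTTP interface unless a session is configured.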

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1)

94-96: Do not drop the ClickHouse database during initialization.

DROP DATABASE IF EXISTS blows away every mirrored table on each restart—fatal once this runs against production. Create the database only when absent (and then USE it) so init becomes idempotent instead of destructive.

Apply this diff to make the setup safe:

-    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-    await client->exec({query: `CREATE DATABASE ${database}`})
+    await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})
     await client->exec({query: `USE ${database}`})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e25caff and 6e93e4b.

⛔ Files ignored due to path filters (1)
  • scenarios/test_codegen/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (11)
  • codegenerator/cli/npm/envio/src/PgStorage.res (3 hunks)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
  • codegenerator/cli/npm/envio/src/bindings/Postgres.res (1 hunks)
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res (2 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (6 hunks)
  • codegenerator/cli/npm/envio/src/db/Table.res (3 hunks)
  • codegenerator/cli/src/config_parsing/field_types.rs (1 hunks)
  • scenarios/test_codegen/schema.graphql (2 hunks)
  • scenarios/test_codegen/src/EventHandlers.ts (2 hunks)
  • scenarios/test_codegen/test/WriteRead_test.res (2 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
🧰 Additional context used
📓 Path-based instructions (3)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/Postgres.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/src/config_parsing/field_types.rs
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/Postgres.res
  • scenarios/test_codegen/test/WriteRead_test.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/db/Table.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/Postgres.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/db/Table.res
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
🧬 Code graph analysis (1)
codegenerator/cli/src/config_parsing/field_types.rs (1)
codegenerator/cli/templates/static/codegen/index.js (1)
  • BigDecimal (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (7)
codegenerator/cli/src/config_parsing/field_types.rs (2)

6-18: LGTM! Clean type system refactor.

The new Primitive enum variants represent a well-designed shift from PostgreSQL-specific types to database-agnostic field types. The addition of optional precision parameters for BigInt and BigDecimal, along with the new Entity variant for entity references, provides good flexibility for the type system.
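In TypeScript terms, the variant set described here resembles a discriminated union; this is a paraphrase for orientation, not the Rust definition:

```typescript
// Paraphrase of the described Primitive variants; the real type is a Rust enum,
// and the parameter names here are assumptions.
type Primitive =
  | { kind: "String" }
  | { kind: "Int32" }
  | { kind: "Number" }
  | { kind: "Json" }
  | { kind: "Date" }
  | { kind: "BigInt"; precision?: number }
  | { kind: "BigDecimal"; precision?: number; scale?: number }
  | { kind: "Entity"; entity: string }
  | { kind: "Enum"; name: string };
```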


21-44: LGTM! Correct ReScript type mappings.

The get_res_field_type_variant implementation correctly maps all the new Primitive variants to their ReScript string representations. The formatting for parametric types like BigInt and BigDecimal is handled appropriately.

codegenerator/cli/npm/envio/src/db/InternalTable.res (5)

49-58: LGTM! Consistent type migration.

The DynamicContractRegistry field types have been correctly migrated from legacy types to the new type system. All text fields use String, numeric fields use Int32, and the primary key is properly configured.


134-164: LGTM! Appropriate type updates including Date for timestamps.

The Chains table field types have been correctly updated. The change of ready_at to Date type (line 154) is particularly appropriate for timestamp fields, aligning with the database-agnostic type system.


371-375: LGTM! Correct String type for version and hash fields.

The PersistedState table fields have been appropriately updated to use String for all version and hash fields.


406-410: LGTM! Consistent type updates.

The Checkpoints table field types have been correctly migrated to Int32 for numeric fields and String for the hash field, with proper nullable handling.


587-599: LGTM! Comprehensive type migration with appropriate types.

The RawEvents table demonstrates excellent type choices:

  • BigInt({}) for event_id (line 588) correctly uses the new parameterized BigInt variant
  • Int32 for block numbers and indices
  • String for names, addresses, and hashes
  • Json for structured block/transaction/param fields

All field types are consistent with the new type system.

Comment on lines +115 to +129
| Date =>
(isNullable ? Postgres.TimestampWithTimezoneNull : Postgres.TimestampWithTimezone :> string)
| Enum({name}) => `"${pgSchema}".${name}`
| Entity(_) => (Postgres.Text :> string) // FIXME: Will it work correctly if id is not a text column?
}

// Workaround for Hasura bug https://github.com/enviodev/hyperindex/issues/788
let isNumericAsText = isArray && isNumericArrayAsText
let columnType = if columnType == (Postgres.Numeric :> string) && isNumericAsText {
(Postgres.Text :> string)
} else {
columnType
}

columnType ++ (isArray ? "[]" : "")

⚠️ Potential issue | 🔴 Critical

Remove “NULL” from the generated timestamp type.

getPgFieldType now emits TIMESTAMP WITH TIME ZONE NULL, so the UNNEST path builds casts like ::TIMESTAMP WITH TIME ZONE NULL[]. Postgres rejects that syntax, meaning every insert into a nullable timestamp column will fail at runtime. Let the column definition control nullability (NOT NULL vs default) and keep the type literal as plain TIMESTAMP WITH TIME ZONE.

Apply this diff to restore valid SQL (and adjust the tests back to expect TIMESTAMP WITH TIME ZONE[]):

-  | Date =>
-    (isNullable ? Postgres.TimestampWithTimezoneNull : Postgres.TimestampWithTimezone :> string)
+  | Date => (Postgres.TimestampWithTimezone :> string)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  | Date =>
-    (isNullable ? Postgres.TimestampWithTimezoneNull : Postgres.TimestampWithTimezone :> string)
+  | Date => (Postgres.TimestampWithTimezone :> string)
   | Enum({name}) => `"${pgSchema}".${name}`
   | Entity(_) => (Postgres.Text :> string) // FIXME: Will it work correctly if id is not a text column?
   }

   // Workaround for Hasura bug https://github.com/enviodev/hyperindex/issues/788
   let isNumericAsText = isArray && isNumericArrayAsText
   let columnType = if columnType == (Postgres.Numeric :> string) && isNumericAsText {
     (Postgres.Text :> string)
   } else {
     columnType
   }

   columnType ++ (isArray ? "[]" : "")
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/db/Table.res around lines 115 to 129, the
generator emits the literal "TIMESTAMP WITH TIME ZONE NULL" which produces
invalid casts like ::TIMESTAMP WITH TIME ZONE NULL[]; change the logic so the
timestamp type literal does not include "NULL" (emit plain TIMESTAMP WITH TIME
ZONE) and let column nullability be handled by column constraints (NOT NULL vs
default) rather than the type; update any branching that maps
TimestampWithTimezoneNull to use the non-null timestamp string, ensure array
wrapping still appends "[]" correctly, and adjust tests to expect TIMESTAMP WITH
TIME ZONE[] instead of TIMESTAMP WITH TIME ZONE NULL[].

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1)

110-113: CRITICAL: Stop dropping the ClickHouse database on init.

This exact issue was flagged in a previous review: the DROP DATABASE IF EXISTS command erases all mirrored data on every restart, which is catastrophic in production. The database should be created idempotently instead.

Apply this diff to make schema setup idempotent:

-   await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-   await client->exec({query: `CREATE DATABASE ${database}`})
+   await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})
    await client->exec({query: `USE ${database}`})

Based on learnings

codegenerator/cli/npm/envio/src/db/Table.res (1)

116-117: Critical: Remove "NULL" from the timestamp type literal.

As flagged in the previous review, appending "NULL" to the timestamp type produces invalid SQL syntax for array casts. PostgreSQL rejects ::TIMESTAMP WITH TIME ZONE NULL[]. The type literal should be plain TIMESTAMP WITH TIME ZONE, and nullability should be handled by column constraints.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e93e4b and 92c3a53.

📒 Files selected for processing (6)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res (5 hunks)
  • codegenerator/cli/npm/envio/src/db/Table.res (3 hunks)
  • codegenerator/cli/src/config_parsing/entity_parsing.rs (10 hunks)
  • codegenerator/cli/src/config_parsing/field_types.rs (1 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (3 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/src/config_parsing/field_types.rs
  • codegenerator/cli/src/config_parsing/entity_parsing.rs
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/db/Table.res
🧠 Learnings (2)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to codegenerator/cli/src/system_config.rs : system_config.rs converts to internal structs

Applied to files:

  • codegenerator/cli/src/config_parsing/field_types.rs
  • codegenerator/cli/src/config_parsing/entity_parsing.rs
🧬 Code graph analysis (1)
codegenerator/cli/src/config_parsing/field_types.rs (1)
codegenerator/cli/templates/static/codegen/index.js (1)
  • BigDecimal (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (15)
scenarios/test_codegen/test/lib_tests/PgStorage_test.res (3)

173-179: LGTM: Consistent checkpoint column naming.

The renaming to envio_checkpoint_id across all history tables improves clarity and maintains consistent naming conventions throughout the schema.

Also applies to: 334-334


180-181: LGTM: Timestamp columns added correctly.

The new optTimestamp and timestamp columns are properly defined. Note that timestamp is NOT NULL in the main table but nullable in the history table—this is intentional and correct, as history tables typically make non-key fields nullable to support partial snapshots.


456-457: LGTM: INSERT query correctly updated.

The unnest-based INSERT query properly includes the new timestamp fields with correct types (TIMESTAMP WITH TIME ZONE NULL[] for optTimestamp and TIMESTAMP WITH TIME ZONE[] for timestamp) and updates the EXCLUDED clause accordingly.
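For readers unfamiliar with the pattern, a generic sketch of such an unnest-based batch INSERT (table and column names are illustrative, not the generated schema):

INSERT INTO "public"."my_entity" ("id", "timestamp", "optTimestamp")
SELECT * FROM unnest(
  $1::TEXT[],
  $2::TIMESTAMP WITH TIME ZONE[],
  $3::TIMESTAMP WITH TIME ZONE[]  -- NULL elements carry the optional values
)
ON CONFLICT ("id") DO UPDATE SET
  "timestamp" = EXCLUDED."timestamp",
  "optTimestamp" = EXCLUDED."optTimestamp";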

codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (3)

3-21: LGTM! Clean FFI bindings.

The ClickHouse client bindings are well-structured with appropriate type definitions and external declarations.


123-128: LGTM! Proper error handling.

The error handling correctly logs the exception, ensures the client is closed, and propagates the failure.


64-93: Verify entity "id" field requirement is enforced by framework validation.

The code in EntityHistory.res (line 66) explicitly matches only entities with an "id" field and marks it as primary key. The ClickHouse.res query generator (line 92) then assumes this "id" column always exists in the ORDER BY clause.

If an entity configuration omits the "id" field, the ClickHouse CREATE TABLE statement would reference a non-existent column, causing a runtime SQL error. I could not locate validation logic in the bindings that enforces the "id" field requirement—this validation may exist in the schema parsing layer or code generation phase outside these modules.

Please confirm that all entities are guaranteed to have an "id" field (either required by schema validation or auto-generated by the framework), or document this as a required constraint.

codegenerator/cli/npm/envio/src/db/Table.res (1)

6-19: LGTM! Well-structured tagged union for field types.

The new @tag("type") fieldType definition provides a clear, type-safe representation of PostgreSQL field types with appropriate metadata (precision for BigInt, config for BigDecimal, name for Enum/Entity).

codegenerator/cli/npm/envio/src/db/EntityHistory.res (3)

22-22: LGTM! Consistent field type and naming updates.

The rename to envio_checkpoint_id aligns with the envio_ prefix convention, and the type changes (Enum for action field, Uint32 for checkpoint) correctly use the new field type system.

Also applies to: 81-81, 85-85


106-119: LGTM! SQL generation correctly uses renamed field.

The query construction consistently references envio_checkpoint_id in both the field matching logic and the UNNEST alias.


232-232: LGTM! Correct type cast reference.

The change from Table.fieldType to Postgres.columnType appropriately uses the column type literal for the SQL cast.

codegenerator/cli/src/config_parsing/field_types.rs (1)

6-18: LGTM! Well-defined primitive type system.

The Primitive enum variants correctly mirror the ReScript fieldType definitions with appropriate precision/scale options for numeric types.

codegenerator/cli/src/config_parsing/entity_parsing.rs (4)

2-2: LGTM! Import path correctly updated to field_types module.


1411-1430: LGTM! GraphQL to PostgreSQL type mappings are correct.

The scalar type conversions appropriately map GraphQL types to their PostgreSQL equivalents:

  • ID/String → String (TEXT)
  • Int → Int32 (INTEGER)
  • Float → Number (DOUBLE PRECISION)
  • Timestamp → Date (TIMESTAMP WITH TIME ZONE)
  • BigInt/BigDecimal preserve precision/scale metadata

1009-1010: LGTM! Improved error message with updated documentation link.

The error message now includes a direct link to the relationships documentation section, making it easier for developers to understand how to properly define one-to-many relationships.


1473-1473: LGTM! Test updates correctly reflect the new type system.

Test imports and assertions have been appropriately updated to use the new primitive types (String, Int32, etc.) from the field_types module.

Also applies to: 1845-1845, 1858-1858, 1936-1936, 1965-1965, 1990-1990

Comment on lines 32 to 44
| BigInt({?precision}) =>
switch precision {
| None => `UInt256` // FIXME: Should use String here?
| Some(precision) => `Decimal(${precision->Js.Int.toString},0)`
}
| BigDecimal({?config}) =>
switch config {
| None =>
Js.Exn.raiseError(
"Please provide a @config(precision: <precision>, scale: <scale>) directive on the BigDecimal field for ClickHouse to work correctly",
)
| Some((precision, scale)) => `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
}

⚠️ Potential issue | 🔴 Critical

Validate ClickHouse Decimal limits before emitting precision/scale.

Past reviews flagged that precision/scale must be validated against ClickHouse limits. The current code uses precision/scale from config but doesn't check if they exceed ClickHouse's Decimal limits (precision ≤ 38, scale ≤ precision). Values that violate these constraints will be rejected by ClickHouse at runtime.

Apply this diff to add validation:

  | BigInt({?precision}) =>
    switch precision {
-   | None => `UInt256` // FIXME: Should use String here?
+   | None => "String" // Fallback for unbounded BigInt
    | Some(precision) =>
+     if precision > 38 {
+       "String"
+     } else {
        `Decimal(${precision->Js.Int.toString},0)`
+     }
    }
  | BigDecimal({?config}) =>
    switch config {
    | None =>
      Js.Exn.raiseError(
        "Please provide a @config(precision: <precision>, scale: <scale>) directive on the BigDecimal field for ClickHouse to work correctly",
      )
    | Some((precision, scale)) =>
+     if precision > 38 || scale > precision {
+       "String"
+     } else {
        `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
+     }
    }

Based on learnings

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-| BigInt({?precision}) =>
-  switch precision {
-  | None => `UInt256` // FIXME: Should use String here?
-  | Some(precision) => `Decimal(${precision->Js.Int.toString},0)`
-  }
-| BigDecimal({?config}) =>
-  switch config {
-  | None =>
-    Js.Exn.raiseError(
-      "Please provide a @config(precision: <precision>, scale: <scale>) directive on the BigDecimal field for ClickHouse to work correctly",
-    )
-  | Some((precision, scale)) => `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
-  }
+| BigInt({?precision}) =>
+  switch precision {
+  | None => "String" // Fallback for unbounded BigInt
+  | Some(precision) =>
+    if precision > 38 {
+      "String"
+    } else {
+      `Decimal(${precision->Js.Int.toString},0)`
+    }
+  }
+| BigDecimal({?config}) =>
+  switch config {
+  | None =>
+    Js.Exn.raiseError(
+      "Please provide a @config(precision: <precision>, scale: <scale>) directive on the BigDecimal field for ClickHouse to work correctly",
+    )
+  | Some((precision, scale)) =>
+    if precision > 38 || scale > precision {
+      "String"
+    } else {
+      `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
+    }
+  }
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/bindings/ClickHouse.res around lines 32 to
44, the BigDecimal branch emits Decimal(precision,scale) without checking
ClickHouse limits; validate that precision is <= 38 and scale is <= precision
(and both non-negative) after extracting the @config tuple, and if invalid throw
a Js.Exn.raiseError with a clear message; keep the existing error for missing
config, and only produce the
`Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})` output when
the values pass validation.
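As a point of reference, ClickHouse's documented bounds for Decimal(P, S) are 1 ≤ P ≤ 76 and 0 ≤ S ≤ P; the 38 used above is the stricter Decimal128 boundary. A quick sketch against a running ClickHouse server:

CREATE TABLE ok_decimals (a Decimal(38, 10), b Decimal(76, 0)) ENGINE = Memory;  -- accepted
CREATE TABLE bad_precision (a Decimal(77, 0)) ENGINE = Memory;  -- rejected: precision above 76
CREATE TABLE bad_scale (a Decimal(10, 11)) ENGINE = Memory;     -- rejected: scale exceeds precision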

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codegenerator/cli/npm/envio/src/PgStorage.res (1)

612-705: Run mirror writes only after the PostgreSQL work succeeds.

mirror.setOrThrow is scheduled alongside the PostgreSQL promises. If the PG transaction later aborts, ClickHouse has already committed those rows and the mirror diverges. Collect the mirror writes and execute them after Promise.all resolves for the PG work (still inside the transaction so a mirror failure aborts the commit).

-          let promises = []
+          let promises = []
+          let mirrorWrites = ref([])
@@
-              switch mirror {
-              | Some(mirror) =>
-                promises
-                ->Js.Array2.push(
-                  mirror.setOrThrow(
-                    ~items=batchSetUpdates,
-                    ~itemSchema=entityConfig.entityHistory.setUpdateSchema,
-                    ~table=entityConfig.entityHistory.table,
-                  ),
-                )
-                ->ignore
-              | None => ()
-              }
+              switch mirror {
+              | Some(mirror) =>
+                mirrorWrites.contents
+                ->Js.Array2.push(() =>
+                  mirror.setOrThrow(
+                    ~items=batchSetUpdates,
+                    ~itemSchema=entityConfig.entityHistory.setUpdateSchema,
+                    ~table=entityConfig.entityHistory.table,
+                  )
+                )
+                ->ignore
+              | None => ()
+              }
@@
-          let _ = await promises->Promise.all
+          let _ = await promises->Promise.all
+          let _ =
+            await mirrorWrites.contents
+            ->Js.Array2.map(run => run())
+            ->Promise.all
♻️ Duplicate comments (3)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (3)

46-53: Validate Decimal precision/scale before emission.

This was previously flagged: forwarding unchecked (precision, scale) straight into Decimal(…, …) lets invalid metadata through, and ClickHouse rejects the CREATE TABLE at runtime (e.g. precision <= 0, scale < 0, scale > precision, or precision > 76). Please validate and raise a descriptive error (or fall back) before emitting the type.

-    | Some((precision, scale)) => `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
+    | Some((precision, scale)) =>
+      if precision <= 0 || scale < 0 || scale > precision || precision > 76 {
+        Js.Exn.raiseError(
+          `ClickHouse Decimal precision/scale out of bounds: precision=${precision->Js.Int.toString}, scale=${scale->Js.Int.toString}`,
+        )
+      } else {
+        `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
+      }

145-147: Do not drop the ClickHouse database on init.

This still erases the entire mirror every time the process starts. Use an idempotent create instead so existing mirrored data survives.

-    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-    await client->exec({query: `CREATE DATABASE ${database}`})
+    await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})

41-45: Stop mapping unbounded BigInt to UInt256.

Table.fieldType.BigInt without a declared precision can still hold negative values. Emitting UInt256 rejects any negative row at insert time and breaks mirroring. Fall back to a safe representation (e.g. String) unless a bounded precision is provided.

-    | None => `UInt256` // FIXME: Should use String here?
+    | None => "String"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92c3a53 and 35c16f7.

📒 Files selected for processing (3)
  • codegenerator/cli/npm/envio/src/Mirror.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (7 hunks)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/Mirror.res
🧰 Additional context used
📓 Path-based instructions (3)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test

Comment on lines +63 to +69
let baseType = if isArray {
`Array(${baseType})`
} else {
baseType
}

isNullable ? `Nullable(${baseType})` : baseType

⚠️ Potential issue | 🔴 Critical

Don't emit Nullable(Array(...)) columns.

ClickHouse forbids wrapping composite types in Nullable(...). As soon as you mirror a nullable array field this generates Nullable(Array(_)) and the history table DDL fails. Push the nullability into the element type (or drop it entirely) before wrapping with Array.

-  let baseType = if isArray {
-    `Array(${baseType})`
-  } else {
-    baseType
-  }
-
-  isNullable ? `Nullable(${baseType})` : baseType
+  if isArray {
+    let elementType = if isNullable { `Nullable(${baseType})` } else { baseType }
+    `Array(${elementType})`
+  } else {
+    isNullable ? `Nullable(${baseType})` : baseType
+  }
📝 Committable suggestion


Suggested change

-  let baseType = if isArray {
-    `Array(${baseType})`
-  } else {
-    baseType
-  }
-
-  isNullable ? `Nullable(${baseType})` : baseType
+  if isArray {
+    let elementType = if isNullable { `Nullable(${baseType})` } else { baseType }
+    `Array(${elementType})`
+  } else {
+    isNullable ? `Nullable(${baseType})` : baseType
+  }
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/bindings/ClickHouse.res around lines 63-69,
the code currently emits Nullable(Array(...)) when a field is both nullable and
an array; ClickHouse forbids Nullable wrapped around composite types. Change the
construction so nullability is applied to the array element type (or omitted)
before wrapping with Array: compute the element type first as either
Nullable(baseType) or baseType depending on isNullable, then if isArray wrap
that element type with Array(...), otherwise return the element type.
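A two-line reproduction against a ClickHouse server makes the constraint concrete:

CREATE TABLE bad_history (xs Nullable(Array(String))) ENGINE = Memory;
-- fails: ClickHouse rejects Nullable wrapped around a composite type
CREATE TABLE ok_history (xs Array(Nullable(String))) ENGINE = Memory;
-- accepted: nullability lives on the element type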

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
scenarios/test_codegen/test/lib_tests/ClickHouse_test.res (1)

3-24: Consider expanding test coverage.

The current test validates the happy path for a single entity. To improve confidence in the SQL generation logic, consider adding test cases for:

  • Entities with different field type combinations (numeric, boolean, timestamps, etc.)
  • Entities with all-nullable or all-required fields
  • Entities with special characters in field names
  • Multiple entities to ensure consistency
codegenerator/cli/npm/envio/src/Prometheus.res (1)

19-29: Consider aligning metric naming conventions.

The new storage write metrics use a different naming pattern compared to existing counters:

  • Existing: execute_batch_processing_time_spent (no prefix, verbose suffix)
  • New: envio_storage_write_time (with envio_ prefix, concise suffix)

While this may indicate a migration to a new naming convention, it creates inconsistency in the metric namespace. Consider either:

  1. Keeping the existing pattern: storage_write_processing_time_spent and storage_write_count
  2. Or documenting that the envio_ prefix is the new standard going forward
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35c16f7 and b09515d.

📒 Files selected for processing (4)
  • codegenerator/cli/npm/envio/src/Prometheus.res (2 hunks)
  • codegenerator/cli/src/config_parsing/entity_parsing.rs (10 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (1 hunks)
  • scenarios/test_codegen/test/lib_tests/ClickHouse_test.res (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/src/config_parsing/entity_parsing.rs
  • codegenerator/cli/npm/envio/src/Prometheus.res
codegenerator/cli/templates/static/**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Static templates are raw ReScript files copied verbatim under codegenerator/cli/templates/static

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
{codegenerator/cli/templates/static/codegen/src/**,codegenerator/cli/templates/dynamic/codegen/src/**}

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Template versions of runtime live under templates/{static,dynamic}/codegen/src and are the recommended editing targets

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • scenarios/test_codegen/test/lib_tests/ClickHouse_test.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/Prometheus.res
🧠 Learnings (2)
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Prometheus.res : In generated runtime, Prometheus.res handles metrics

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to codegenerator/cli/src/system_config.rs : system_config.rs converts to internal structs

Applied to files:

  • codegenerator/cli/src/config_parsing/entity_parsing.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (8)
scenarios/test_codegen/test/lib_tests/ClickHouse_test.res (1)

1-21: LGTM! Clean and well-structured test.

The test correctly verifies SQL generation for ClickHouse history tables. The expected query format is valid, with proper use of ClickHouse-specific syntax (backticks for identifiers, MergeTree engine, appropriate data types). The exact string matching approach is appropriate here to ensure consistency in generated SQL.

codegenerator/cli/templates/static/codegen/src/EventProcessing.res (1)

370-371: LGTM! New storage write metrics added.

The new metrics storageWriteTimeCounter and storageWriteCounter are properly integrated and will help track storage write operations separately. Note that both executeBatchDurationCounter (line 369) and storageWriteTimeCounter (line 370) currently track the same dbWriteDuration, which appears intentional for the ClickHouse mirror feature to enable separate tracking of storage backends.

codegenerator/cli/npm/envio/src/Prometheus.res (1)

228-234: LGTM!

The increment functions are correctly implemented and follow the established patterns:

  • incrementStorageWriteTimeCounter uses incMany for duration tracking
  • incrementStorageWriteCounter uses inc for simple count tracking
codegenerator/cli/src/config_parsing/entity_parsing.rs (5)

2-2: LGTM: Import path updated for field_types refactor.

The import path change from postgres_types to field_types aligns with the PR's refactoring objectives.


1009-1010: LGTM: Improved error message with actionable guidance.

The updated error message provides clearer guidance by explicitly requiring @derivedFrom and including a documentation link.


1473-1473: LGTM: Test imports updated for field_types refactor.

The test import path correctly references the new field_types module.


1845-1845: Test assertions updated consistently with primitive mappings.

The test expectations have been updated to reflect the new primitive type names (e.g., Text → String, Integer → Int32, Text → Entity(...)). These changes are mechanically consistent with the primitive mapping updates.

Note: These test assertions depend on the semantic correctness of the primitive mapping changes flagged in the previous comment.

Also applies to: 1858-1858, 1936-1936, 1965-1965, 1990-1990


1411-1432: Timestamp mapping to Date requires manual verification of generated code behavior.

The review comment identified a critical concern: line 1426 maps GqlScalar::Timestamp => PGPrimitive::Date. Verification reveals that:

  • PGPrimitive is a type alias for Primitive from field_types.rs
  • Primitive::Date is a simple enum variant with no documentation explaining whether it preserves time information
  • Primitive types are serialized into Handlebars templates for ReScript/TypeScript code generation (not direct SQL generation)
  • The field_type renders as the string "Date" in generated code

The actual impact on the database schema cannot be determined from the Rust CLI code alone, as database schema generation occurs in the generated TypeScript/JavaScript handlers at runtime. The name "Date" traditionally implies date-only (no time component), but without tracing to the generated code and database initialization, the actual behavior is unclear.

Action required: Verify that the generated code correctly handles Timestamp fields as timestamp-with-time in the database (not as date-only), and confirm this change does not cause data loss for existing schemas.
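The distinction matters because PostgreSQL's date-only type silently drops the time component; a quick sketch of what the verification should rule out:

SELECT '2024-05-01T12:34:56Z'::TIMESTAMP WITH TIME ZONE;  -- 2024-05-01 12:34:56+00, time preserved
SELECT '2024-05-01T12:34:56Z'::DATE;                      -- 2024-05-01, time component lost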

None => {
    let example_str = Self::DerivedFromField {
-       entity_name: "<MY_ENTITY>".to_string(),
+       entity_name: "<ENTITY_FIELD_NAME>".to_string(),

⚠️ Potential issue | 🟡 Minor

Incorrect placeholder name for entity_name field.

The placeholder "<ENTITY_FIELD_NAME>" is misleading since it's assigned to the entity_name field, which represents an entity name, not a field name. The previous "<MY_ENTITY>" was semantically clearer; "<ENTITY_NAME>" would be clearer still.

Apply this diff to fix the placeholder:

-                        entity_name: "<ENTITY_FIELD_NAME>".to_string(),
+                        entity_name: "<ENTITY_NAME>".to_string(),
📝 Committable suggestion


Suggested change

-entity_name: "<ENTITY_FIELD_NAME>".to_string(),
+entity_name: "<ENTITY_NAME>".to_string(),
🤖 Prompt for AI Agents
In codegenerator/cli/src/config_parsing/entity_parsing.rs around line 1243, the
placeholder assigned to entity_name is misleading ("<ENTITY_FIELD_NAME>");
change it to a semantically correct placeholder such as "<MY_ENTITY>" or
"<ENTITY_NAME>" so the variable reflects an entity name rather than a field
name; update the string literal to the chosen placeholder and run
tests/formatting to ensure consistency.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (2)

70-76: ClickHouse forbids Nullable(Array(...)) – push nullability into element type.

The current logic produces Nullable(Array(...)) when both isArray and isNullable are true, but ClickHouse rejects this syntax. History table DDL will fail for any nullable array field. Apply nullability to the element type before wrapping with Array.

Apply this diff:

-  let baseType = if isArray {
-    `Array(${baseType})`
-  } else {
-    baseType
-  }
-
-  isNullable ? `Nullable(${baseType})` : baseType
+  if isArray {
+    let elementType = isNullable ? `Nullable(${baseType})` : baseType
+    `Array(${elementType})`
+  } else {
+    isNullable ? `Nullable(${baseType})` : baseType
+  }

152-154: Remove DROP DATABASE – it destroys all mirrored data on every restart.

Line 152 executes DROP DATABASE IF EXISTS on every initialization, which is catastrophic in production. All historical mirror data is lost whenever the indexer restarts. Make the schema setup idempotent instead.

Apply this diff to make initialization idempotent:

-    await client->exec({query: `DROP DATABASE IF EXISTS ${database}`})
-    await client->exec({query: `CREATE DATABASE ${database}`})
+    await client->exec({query: `CREATE DATABASE IF NOT EXISTS ${database}`})
     await client->exec({query: `USE ${database}`})

Since makeCreateHistoryTableQuery already uses CREATE TABLE IF NOT EXISTS, the rest of the initialization is already idempotent.
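To illustrate the difference (the database name here is a placeholder):

-- Idempotent: safe on every restart, existing tables and data survive.
CREATE DATABASE IF NOT EXISTS my_mirror;

-- Destructive: the variant flagged above erases all mirrored history first.
DROP DATABASE IF EXISTS my_mirror;
CREATE DATABASE my_mirror;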

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b09515d and 196b19f.

📒 Files selected for processing (1)
  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/bindings/ClickHouse.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (3)
codegenerator/cli/npm/envio/src/bindings/ClickHouse.res (3)

1-31: LGTM: External bindings are well-structured.

The ClickHouse client bindings are clean and properly typed, with appropriate use of ReScript's external binding syntax.


79-110: LGTM: Insert logic is clean and properly error-handled.

The function correctly handles empty arrays, performs schema conversion, and wraps errors appropriately.


112-142: LGTM: History table DDL generation is well-structured.

The query builder correctly filters derived fields, uses idempotent CREATE TABLE IF NOT EXISTS, and sets up an appropriate MergeTree engine with ordering by id and checkpoint.
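For orientation, the generated DDL has roughly this shape; the entity column is a placeholder, with only id and envio_checkpoint_id taken from the review itself:

CREATE TABLE IF NOT EXISTS `envio`.`MyEntity_history` (
  `id` String,
  `envio_checkpoint_id` UInt32,
  `amount` Nullable(Decimal(38, 18))
) ENGINE = MergeTree
ORDER BY (`id`, `envio_checkpoint_id`)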

Comment on lines +51 to +60
| BigDecimal({?config}) =>
switch config {
| None => "String" // Fallback for unbounded BigInt
| Some((precision, scale)) =>
if precision > 38 || scale > precision {
"String"
} else {
`Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
}
}

⚠️ Potential issue | 🟠 Major

Validate that precision and scale are non-negative.

The BigDecimal branch correctly checks ClickHouse's upper limits (precision ≤ 38, scale ≤ precision), but doesn't verify that both values are non-negative. Negative values would bypass validation and produce invalid Decimal types.

Apply this diff to add non-negative validation:

  | BigDecimal({?config}) =>
    switch config {
    | None => "String" // Fallback for unbounded BigInt
    | Some((precision, scale)) =>
-     if precision > 38 || scale > precision {
+     if precision < 0 || scale < 0 || precision > 38 || scale > precision {
        "String"
      } else {
        `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
      }
    }
📝 Committable suggestion


Suggested change

 | BigDecimal({?config}) =>
   switch config {
   | None => "String" // Fallback for unbounded BigInt
   | Some((precision, scale)) =>
-    if precision > 38 || scale > precision {
+    if precision < 0 || scale < 0 || precision > 38 || scale > precision {
       "String"
     } else {
       `Decimal(${precision->Js.Int.toString},${scale->Js.Int.toString})`
     }
   }
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/bindings/ClickHouse.res around lines 51 to
60, the BigDecimal branch validates upper bounds but not that precision and
scale are non-negative; update the switch's Some((precision, scale)) path to
first check precision >= 0 and scale >= 0 and treat any negative values as
invalid (return the "String" fallback), then keep the existing upper-bound
checks (precision <= 38 and scale <= precision) and only emit the Decimal(...)
string when all validations pass.

}
| BigDecimal({?config}) =>
switch config {
| None => "String" // Fallback for unbounded BigInt

⚠️ Potential issue | 🟡 Minor

Fix incorrect comment.

The comment says "Fallback for unbounded BigInt" but this is the BigDecimal branch, not BigInt.

Apply this diff:

-    | None => "String" // Fallback for unbounded BigInt
+    | None => "String" // Fallback for BigDecimal without config
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/bindings/ClickHouse.res at line 53, the
inline comment for the "None" branch incorrectly references "unbounded BigInt";
update the comment to reference "BigDecimal" (e.g., "Fallback for unbounded
BigDecimal") so it matches the BigDecimal branch semantics and avoids misleading
documentation.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
scenarios/test_codegen/test/helpers/Mock.res (1)

184-190: Mock executeUnsafe should return an array, not undefined

Storage.storage.executeUnsafe still resolves to %raw("undefined"), while real storage returns an array of rows. Any test that iterates or spreads the result can then fail on the mock itself rather than exercising the handlers.

Consider updating it to mirror the real contract:

-        executeUnsafe: _ => Promise.resolve(%raw(`undefined`)),
+        executeUnsafe: _ => Promise.resolve([]),

or optionally making the returned array configurable per test.

🧹 Nitpick comments (5)
codegenerator/cli/npm/envio/src/InMemoryTable.res (1)

213-220: Minor: unify Updated constructor usage for clarity

rowToEntity patterns mix Internal.Updated and bare Updated:

| Internal.Updated({latest: Set({entity})}) => Some(entity)
| Updated({latest: Delete(_)}) => None

For readability, consider consistently using either the qualified or unqualified constructor (whichever matches the rest of the module), e.g.:

-    | Internal.Updated({latest: Set({entity})}) => Some(entity)
-    | Updated({latest: Delete(_)}) => None
+    | Updated({latest: Set({entity})}) => Some(entity)
+    | Updated({latest: Delete(_)}) => None

assuming the unqualified Updated is already in scope.

codegenerator/cli/npm/envio/src/PgStorage.res (4)

25-58: Confirm Table.getPgFieldType does not duplicate NULL/NOT NULL decorations

makeCreateTableQuery now uses:

`"${fieldName}" ${Table.getPgFieldType(...)}${switch defaultValue {
  | Some(defaultValue) => ` DEFAULT ${defaultValue}`
  | None => isNullable ? `` : ` NOT NULL`
}}`

Since Table.getPgFieldType receives ~isNullable, please ensure it only returns the raw PostgreSQL type (including any array / domain suffixes) and does not itself append NULL/NOT NULL, or you'll end up with duplicated / conflicting nullability clauses in the generated DDL.


619-703: Mirror sees only Set history updates; Delete handling is still Postgres-only

Within the history branch:

  • Delete changes are translated to batchDeleteEntityIds/batchDeleteCheckpointIds and written via EntityHistory.insertDeleteUpdates (Postgres).
  • Set changes go into batchSetUpdates and are written to both Postgres (sql->setOrThrow with setChangeSchema) and, when configured, to the mirror via mirror.setOrThrow.

If the ClickHouse mirror is expected to reflect the full entity history (including Delete actions), you’ll need a corresponding mirror path for deletes, otherwise mirror history will miss those entries. If the mirror intentionally tracks only Set updates, this asymmetry is fine but should be a conscious decision.


669-703: Mirror errors are surfaced after commit and may leave PG and ClickHouse out of sync

Inside each entity’s async sql => { ... }:

  • mirror.setOrThrow(...) is added to promises.
  • Any exception (PG or mirror) during await promises->Promise.all is caught, normalized, and stored in specificError.contents.
  • The error is not rethrown from the per-entity function, so the PG transaction can still commit successfully.
  • After beginSql completes, specificError is checked and re-raised outside the transaction.

Effectively, a mirror failure after Postgres succeeds will:

  • Commit the PG batch.
  • Then throw from writeBatch, leaving ClickHouse behind while the indexer perceives the batch as failed.

If you want mirror writes to be best-effort and not fail the batch, you might instead log mirror errors and avoid re-raising them. If you want strong failure semantics, consider documenting that PG and mirror are not updated atomically, and decide whether you need compensating logic (e.g., retrying mirror writes) rather than just surfacing the error post-commit.


1001-1074: Initialization correctly calls mirror.initialize before PG migrations

initialize now invokes mirror.initialize(~chainConfigs, ~entities, ~enums) (when provided) before constructing and executing the PG initialization transaction. That ordering is reasonable and keeps mirror init outside of the DB transaction.

One thing to be aware of: if mirror initialization succeeds but the PG migration fails, the two systems can diverge until the next run. If this matters operationally, consider making mirror.initialize idempotent and/or tolerant of partial retries, and documenting that PG remains the source of truth.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 196b19f and 46cf29a.

📒 Files selected for processing (12)
  • codegenerator/cli/npm/envio/src/Change.res (1 hunks)
  • codegenerator/cli/npm/envio/src/InMemoryStore.res (4 hunks)
  • codegenerator/cli/npm/envio/src/InMemoryTable.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Internal.res (2 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (10 hunks)
  • codegenerator/cli/npm/envio/src/db/EntityHistory.res (6 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (2 hunks)
  • scenarios/test_codegen/test/E2E_test.res (1 hunks)
  • scenarios/test_codegen/test/WriteRead_test.res (6 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (3 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/Internal.res
🧰 Additional context used
📓 Path-based instructions (6)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/Change.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryStore.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/Change.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • scenarios/test_codegen/test/WriteRead_test.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • scenarios/test_codegen/test/E2E_test.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryStore.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/db/EntityHistory.res
  • codegenerator/cli/npm/envio/src/Change.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryStore.res
codegenerator/cli/templates/dynamic/**/*.hbs

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Dynamic templates are Handlebars files under codegenerator/cli/templates/dynamic

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
{codegenerator/cli/templates/static/codegen/src/**,codegenerator/cli/templates/dynamic/codegen/src/**}

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Template versions of runtime live under templates/{static,dynamic}/codegen/src and are the recommended editing targets

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/IO.res
codegenerator/cli/templates/static/**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Static templates are raw ReScript files copied verbatim under codegenerator/cli/templates/static

Files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
🧠 Learnings (2)
📚 Learning: 2025-05-27T17:07:12.878Z
Learnt from: JonoPrest
Repo: enviodev/hyperindex PR: 555
File: scenarios/test_codegen/test/lib_tests/EntityHistory_test.res:631-659
Timestamp: 2025-05-27T17:07:12.878Z
Learning: In isolated tests for EntityHistory rollbacks with controlled mock data setup, the ordering of getRollbackDiff results is deterministic and doesn't require sorting before comparison with deepStrictEqual.

Applied to files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • scenarios/test_codegen/test/WriteRead_test.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (16)
scenarios/test_codegen/test/WriteRead_test.res (2)

51-52: LGTM! Timestamp fields added correctly.

The addition of timestamp and optTimestamp fields to test entities aligns with the schema changes and provides proper test coverage for timestamp handling.

Also applies to: 76-77


107-112: LGTM! History representation updated consistently.

The migration from entityUpdateAction: Set(entity) to Set({checkpointId, entityId, entity}) correctly reflects the new Change.t-based history encoding introduced across the codebase. The flatter structure improves clarity and aligns with the broader refactoring.

Also applies to: 121-126, 144-150, 168-174

codegenerator/cli/templates/static/codegen/src/IO.res (1)

21-24: LGTM! Cleaner Change.t payload construction.

The removal of the entityUpdateAction wrapper in favor of direct Delete({entityId, checkpointId}) and Set({entityId, checkpointId, entity}) construction simplifies the code and aligns with the new Change.t type structure.

Also applies to: 36-40

codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

121-124: LGTM! Test helper template updated correctly.

The template now generates mock database operations using the new Change.t-based payloads (Delete({entityId, checkpointId}) and Set({entityId, checkpointId, entity})), and pattern matching is updated accordingly. This ensures generated test helpers align with the refactored data structures.

Also applies to: 137-141, 223-223, 231-231

scenarios/test_codegen/test/E2E_test.res (1)

286-293: LGTM! E2E test assertions updated correctly.

The history entries now use the new Set({checkpointId, entityId, entity: {...}}) structure instead of the old entityUpdateAction-based format. This correctly validates the Change.t-based history encoding in end-to-end scenarios.

Also applies to: 294-301, 302-309

codegenerator/cli/npm/envio/src/Change.res (1)

1-9: LGTM! Well-designed Change.t type.

The new Change.t<'entity> type provides a clean, type-safe abstraction for entity updates with:

  • Explicit Set/Delete variants that directly contain relevant data
  • Generic type parameter for entity flexibility
  • Convenient accessors for common fields (entityId, checkpointId)
  • Proper tagging for JSON serialization

This serves as a solid foundation for the history/change representation refactoring across the codebase.

scenarios/test_codegen/test/rollback/Rollback_test.res (1)

135-142: LGTM! Rollback tests updated comprehensively.

All history entries in rollback scenarios have been migrated to the new Change.t format:

  • Set entries now use Set({checkpointId, entityId, entity: {...}})
  • Delete entries now use Delete({checkpointId, entityId})

The updates are consistent throughout the file and correctly validate rollback behavior with the new data structure.

Also applies to: 143-150, 151-158, 159-166, 167-170, 278-285, 286-293

codegenerator/cli/npm/envio/src/db/EntityHistory.res (2)

12-12: LGTM! Improved field naming and types.

Key improvements:

  • envio_checkpoint_id prefix avoids collisions with user-defined entity fields
  • Enum({name: RowAction.name}) is more semantically appropriate than Custom for the action field
  • Uint32 better represents checkpoint IDs (non-negative, bounded integers)

Also applies to: 71-71, 75-75


14-23: LGTM! Schema refactoring aligns with Change.t.

The renaming from setUpdateSchema to setChangeSchema and the use of Change.Set constructor properly integrate the new Change.t type into the history schema, providing clearer terminology and structure.

Also applies to: 27-29, 88-88, 154-155

codegenerator/cli/npm/envio/src/InMemoryStore.res (2)

136-140: LGTM! InMemoryStore updated to use Change.t.

The setBatchDcs function now constructs entity updates using the new Set({entityId, checkpointId, entity}) format, removing the entityUpdateAction wrapper. This aligns with the Change.t-based refactoring and maintains consistency across the storage layer.


57-57: LGTM! Cleaner function signature.

Removing the extra parentheses grouping improves code readability while maintaining the same functionality.

scenarios/test_codegen/test/helpers/Mock.res (2)

4-20: In-memory Set payload now correctly uses Change.Set shape

InMemoryStore.setEntity now passes Set({entityId, checkpointId: 0, entity}) directly into InMemoryTable.Entity.set, which matches the new Change.t<'entity>-based API and aligns the mock’s in-memory state with the runtime representation. No issues spotted.


225-245: History query wiring matches new Change.t<'entity> representation

The Indexer.t.queryHistory signature and its implementation now return array<Change.t<'entity>>, decoding via:

  • entityConfig.entityHistory.setChangeSchema for Set changes
  • An explicit S.object branch that tags RowAction.DELETE and constructs Delete({entityId, checkpointId})

The final Utils.magic cast from Internal.entity to 'entity matches how query is already typed. This looks consistent with the broader Change-based refactor.

Also applies to: 377-405

codegenerator/cli/npm/envio/src/InMemoryTable.res (1)

166-211: Change-based set logic and history handling look correct

  • set now keys rows by change->Change.getEntityId, and stores latest: change plus a history array of Change.t<'entity>, with deduping when the checkpoint ID matches the previous latest.
  • Index maintenance correctly distinguishes Set({entity}) (update indices) vs Delete({entityId}) (remove from indices), so secondary indexes stay in sync with change actions.

This matches the new Change.t-based API and semantics.

codegenerator/cli/npm/envio/src/PgStorage.res (2)

555-607: New ~mirror plumbing into writeBatch looks structurally sound

The writeBatch signature now accepts ~mirror: option<Mirror.t>=?, and callers thread it via writeBatchMethod and the PgEncodingError retry path using ~mirror?. This ensures mirror configuration is preserved across retries and higher-level calls without changing external storage API types. No functional issues spotted in the argument wiring.


1375-1395: writeBatchMethod correctly threads mirror into internal writeBatch

The public writeBatch method now simply forwards ~mirror? into the internal writeBatch helper alongside the existing arguments. This keeps the Persistence.storage surface unchanged while enabling mirror integration. Looks good.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
codegenerator/cli/npm/envio/src/Persistence.res (2)

124-133: writeBatch storage API changes are coherent but slightly redundant

Threading rawEvents, rollbackTargetCheckpointId, updatedEffectsCache, and updatedEntities through the storage interface is consistent with the new Persistence.writeBatch implementation and should make storage layers more self-contained. The only minor nit is that allEntities is now both stored in persistence and passed in on every call; if this API continues to grow, consider having storage capture allEntities at initialization instead of on each batch.

Based on learnings


249-323: Effect cache accounting and construction: consider guarding count and extracting a helper

The overall structure (guarding on storageStatus, deriving updatedEntities from in-memory tables, and collecting updatedEffectsCache from inMemoryStore.effects) looks correct and matches the new storage API. Two things to consider:

  • effectCacheRecord.count is updated as count + items.length - invalidationsCount. If for any reason invalidationsCount can exceed count + items.length, this will drive the counter negative while it’s documented as “Number of rows in the table” and also exported as a Prometheus gauge. If that invariant is not strictly guaranteed, it’s safer to clamp or assert:
-                let shouldInitialize = effectCacheRecord.count === 0
-                effectCacheRecord.count =
-                  effectCacheRecord.count + items->Js.Array2.length - invalidationsCount
+                let shouldInitialize = effectCacheRecord.count === 0
+                let newCount =
+                  effectCacheRecord.count + items->Js.Array2.length - invalidationsCount
+                effectCacheRecord.count = if newCount > 0 { newCount } else { 0 }
                 Prometheus.EffectCacheCount.set(~count=effectCacheRecord.count, ~effectName)
  • The updatedEffectsCache builder is quite dense (dict iteration, unsafe gets, metrics, and shouldInitialize logic mixed together). Extracting this into a small helper (e.g. makeUpdatedEffectsCache(~effects, ~cache)) would improve readability and testability without changing behavior.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46cf29a and 91255b7.

📒 Files selected for processing (11)
  • codegenerator/cli/npm/envio/src/InMemoryTable.res (3 hunks)
  • codegenerator/cli/npm/envio/src/Internal.res (3 hunks)
  • codegenerator/cli/npm/envio/src/Mirror.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Persistence.res (3 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (13 hunks)
  • codegenerator/cli/npm/envio/src/Prometheus.res (3 hunks)
  • codegenerator/cli/npm/envio/src/Utils.res (1 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (5 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/UserContext.res (2 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • scenarios/test_codegen/test/helpers/Mock.res
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/Prometheus.res
🧰 Additional context used
📓 Path-based instructions (6)
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • codegenerator/cli/npm/envio/src/Persistence.res
codegenerator/cli/templates/dynamic/**/*.hbs

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Dynamic templates are Handlebars files under codegenerator/cli/templates/dynamic

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
{codegenerator/cli/templates/static/codegen/src/**,codegenerator/cli/templates/dynamic/codegen/src/**}

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Template versions of runtime live under templates/{static,dynamic}/codegen/src and are the recommended editing targets

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
codegenerator/cli/templates/static/**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Static templates are raw ReScript files copied verbatim under codegenerator/cli/templates/static

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/templates/static/codegen/src/UserContext.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • codegenerator/cli/npm/envio/src/Persistence.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/Mirror.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/InMemoryTable.res
  • codegenerator/cli/npm/envio/src/Persistence.res
🧠 Learnings (3)
📚 Learning: 2025-05-27T17:07:12.878Z
Learnt from: JonoPrest
Repo: enviodev/hyperindex PR: 555
File: scenarios/test_codegen/test/lib_tests/EntityHistory_test.res:631-659
Timestamp: 2025-05-27T17:07:12.878Z
Learning: In isolated tests for EntityHistory rollbacks with controlled mock data setup, the ordering of getRollbackDiff results is deterministic and doesn't require sorting before comparison with deepStrictEqual.

Applied to files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/PgStorage.res : In generated runtime, PgStorage.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/Persistence.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
Repo: enviodev/hyperindex PR: 0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Persistence.res : In generated runtime, Persistence.res is part of the persistence stack

Applied to files:

  • codegenerator/cli/npm/envio/src/Persistence.res
🔇 Additional comments (19)
codegenerator/cli/npm/envio/src/Utils.res (1)

202-203: Verified: the usage pattern shows the constant is safe despite shared state.

The original concern about shared mutable state can be dismissed. Analysis of Utils.Array.setIndexImmutable (lines 249-253) shows that it explicitly creates a shallow copy via Belt.Array.copy before modifying the array. This means the shared immutableEmpty constant is never actually mutated, even when used as a history field initial value.

The usage pattern in InMemoryTable.res (line 182) shows immutableEmpty is used as an optimization when shouldSaveHistory is false. When later updates call setIndexImmutable on that history reference, a copy is created first, protecting the original shared array.

This is a safe copy-on-modify optimization pattern. No Object.freeze is needed, and the constant name is appropriate.
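
For illustration, the copy-on-modify helper described above amounts to roughly the following (a sketch, not the exact Utils.Array source):

// Returns a new array with `value` written at `index`; the input (possibly the
// shared immutableEmpty constant) is shallow-copied first and never mutated.
let setIndexImmutable = (arr: array<'a>, index: int, value: 'a): array<'a> => {
  let copy = arr->Belt.Array.copy
  copy->Belt.Array.setUnsafe(index, value)
  copy
}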

codegenerator/cli/templates/static/codegen/src/EventProcessing.res (3)

175-196: Block handler now consistently uses Platform.makeBlockEvent

Using Platform.makeBlockEvent(~blockNumber, ~chainId) in both runHandlerOrThrow and preloadBatchOrThrow keeps EVM/Fuel block handler construction centralized and consistent; the change looks correct given chainId comes from onBlockConfig and blockNumber from the item.

Also applies to: 284-301


348-366: Additional storage write Prometheus metrics look well-integrated

incrementStorageWriteTimeCounter and incrementStorageWriteCounter are wired right after existing batch timing metrics and use dbWriteDuration, so they should accurately track storage write latency and count without altering control flow.


428-435: Updated Persistence.writeBatch call matches new storage API

Passing ~batch, ~config=indexer.config, ~inMemoryStore, and ~isInReorgThreshold aligns with the new Persistence.writeBatch surface that internally derives raw events, updated entities, and mirror plumbing. No issues spotted here.
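
The call shape, per the review, is roughly this sketch (the receiver and async context are assumed, only the labeled arguments are quoted from the diff):

let _ = await indexer.persistence->Persistence.writeBatch(
  ~batch,
  ~config=indexer.config,
  ~inMemoryStore,
  ~isInReorgThreshold,
)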

codegenerator/cli/templates/static/codegen/src/UserContext.res (1)

148-161: Entity context now correctly emits Change.t for set/delete

Switching context.<Entity>.set and deleteUnsafe to use Set({entityId, checkpointId, entity}) and Delete({entityId, checkpointId}) matches the new Change.t<'entity> and InMemoryTable.Entity.set API, with checkpoint and id wiring preserved. This keeps user-facing context in sync with the new history/update model.

Also applies to: 223-238
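
For context, the Change.t shape these constructors come from, sketched from the PR summary (field types and order are assumptions; see Change.res for the actual definition):

type t<'entity> =
  | Set({entityId: string, checkpointId: int, entity: 'entity})
  | Delete({entityId: string, checkpointId: int})

let getEntityId = (change: t<'entity>) =>
  switch change {
  | Set({entityId}) => entityId
  | Delete({entityId}) => entityId
  }

let getCheckpointId = (change: t<'entity>) =>
  switch change {
  | Set({checkpointId}) => checkpointId
  | Delete({checkpointId}) => checkpointId
  }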

codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (3)

116-127: Mock store operators emit Change.t consistently

The entity mock set/delete helpers now use Delete({entityId, checkpointId: 0}) and Set({entityId, checkpointId: 0, entity}) with ~shouldSaveHistory=false, which lines up with the new Change.t and InMemoryTable.Entity.set contract while keeping history disabled in tests. Looks good.

Also applies to: 131-144
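
Concretely, the mock helpers reduce to calls of this shape (a sketch; mockTable is a placeholder for the template's table binding):

// Mock set: record the entity at checkpoint 0 with history disabled.
mockTable->InMemoryTable.Entity.set(
  Set({entityId: entity.id, checkpointId: 0, entity}),
  ~shouldSaveHistory=false,
)
// Mock delete: same call with a Delete change.
mockTable->InMemoryTable.Entity.set(
  Delete({entityId, checkpointId: 0}),
  ~shouldSaveHistory=false,
)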


213-237: executeRowsEntity adaptation to status/latest looks correct

Matching on

  • Updated({latestChange: Set({entity})}) and Loaded with latest: Some(entity) to (re)initialize rows, and
  • Updated({latestChange: Delete({entityId})}) / Loaded with latest: None to delete or ignore,

fits the new entityWithIndices and inMemoryStoreEntityStatus model and ensures the test MockDb reflects final in‑memory state. Given that downstream load paths use InMemoryTable.Entity.values, deleting via dict->deleteDictKey is sufficient here.
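
Rough shape of that dispatch (names like rows and deleteDictKey follow the review's wording; the template's actual identifiers may differ):

switch (row.status, row.latest) {
| (Updated({latestChange: Set({entityId, entity})}), _) => rows->Js.Dict.set(entityId, entity)
| (Updated({latestChange: Delete({entityId})}), _) => rows->deleteDictKey(entityId)
| (Loaded, Some(entity)) => rows->Js.Dict.set(entity.id, entity) // `entity.id` assumed
| (Loaded, None) => () // row was only read and never existed; nothing to do
}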


600-609: Mock storage writeBatch signature kept up-to-date with runtime

The mock writeBatch in makeMockStorage now accepts the full parameter set (~batch, ~rawEvents, ~rollbackTargetCheckpointId, ~config, ~allEntities, ~updatedEffectsCache, ~updatedEntities) but still resolves immediately, which is appropriate for tests that only exercise in‑memory behavior.

codegenerator/cli/npm/envio/src/Mirror.res (1)

1-39: Mirror abstraction and ClickHouse implementation are coherent

Mirror.t cleanly captures the mirror surface (initialize, writeBatch), and makeClickHouse correctly:

  • wraps the ClickHouse client,
  • ignores chainConfigs (not needed) while still accepting them,
  • initializes schemas via ClickHouse.initialize(~database, ~entities, ~enums), and
  • mirrors history by sending each entity’s updates.latestChange into the corresponding entityHistory.table using setChangeSchema.

Errors from initialize or writeBatch will propagate to the caller, which PgStorage treats as storage failures—this is a sensible default for a first version.

If you expect mirror write failures to be non-fatal, we can adjust PgStorage to treat mirror.writeBatch as best-effort rather than transactional.
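
For orientation, the surface described above amounts to roughly this record (the argument types here are stand-ins for the real config types, not the verified Mirror.res signatures):

type t = {
  // Create mirror-side schemas (e.g. ClickHouse history tables) up front.
  initialize: (
    ~chainConfigs: array<chainConfig>,
    ~entities: array<entityConfig>,
    ~enums: array<enumConfig>,
  ) => promise<unit>,
  // Mirror one batch of entity-history changes.
  writeBatch: (~updatedEntities: array<updatedEntity>) => promise<unit>,
}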

codegenerator/cli/npm/envio/src/Internal.res (2)

319-326: Effect cache table fields switched to String/Json

Using String for id and Json for output (nullable) simplifies type mapping and aligns with the generic Table.getPgFieldType path. Migration-wise this recreates cache tables on init, which is acceptable given they are derived data.


342-358: New in-memory entity update/status types match Change-based model

inMemoryStoreEntityUpdate<'entity> with {latestChange, history, containsRollbackDiffChange} and the unboxed inMemoryStoreEntityStatus<'entity> = Updated(...) | Loaded provide the right structure for:

  • tracking the latest Change.t<'entity> and its history per entity, and
  • distinguishing untouched (Loaded) from modified (Updated) rows.

These definitions line up with how InMemoryTable.Entity.set and the persistence layer now consume updates.
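
Spelled out, the shapes described above read roughly as follows (placement of @unboxed per the review; exact annotations may differ in Internal.res):

type inMemoryStoreEntityUpdate<'entity> = {
  latestChange: Change.t<'entity>,
  history: array<Change.t<'entity>>,
  containsRollbackDiffChange: bool,
}

@unboxed
type inMemoryStoreEntityStatus<'entity> =
  | Updated(inMemoryStoreEntityUpdate<'entity>)
  | Loaded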

codegenerator/cli/npm/envio/src/InMemoryTable.res (2)

41-45: Row shape (latest, status) and basic accessors look consistent

Updating entityWithIndices<'entity> to hold latest: option<'entity> plus status: Internal.inMemoryStoreEntityStatus<'entity> and initializing rows as {latest: entity, status: Loaded, entityIndices} in initValue, with rowToEntity = row.latest, keeps:

  • the “current view” of the entity easily accessible, and
  • the status flag orthogonal for update/history logic.

The getUnsafe and index-building code correctly delegate via rowToEntity, so this structural change looks sound.

Also applies to: 151-167, 229-241


344-353: updates helper cleanly exposes only modified entities

Collecting Updated(update) rows via Array.keepMapU and discarding Loaded rows gives persistence a concise array<inMemoryStoreEntityUpdate<'entity>> for each table. This matches the new updatedEntities concept and avoids unnecessary noise from untouched entities.
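
Sketch of the helper; rows stands in for however the table's values are enumerated in InMemoryTable.res:

let updates = (rows: array<entityWithIndices<'entity>>): array<
  Internal.inMemoryStoreEntityUpdate<'entity>,
> =>
  rows->Belt.Array.keepMapU(row =>
    switch row.status {
    | Internal.Updated(update) => Some(update)
    | Internal.Loaded => None
    }
  )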

codegenerator/cli/npm/envio/src/PgStorage.res (5)

25-45: Delegating field SQL types to Table.getPgFieldType is a good centralization

Replacing ad‑hoc field type rendering with Table.getPgFieldType(~fieldType, ~pgSchema, ~isArray, ~isNullable, ~isNumericArrayAsText) reduces duplication and should keep Postgres/Hasura-related type quirks in one place. The surrounding primary key logic is unchanged.


555-587: writeBatch now cleanly separates raw events and Change-based entity updates

The expanded writeBatch signature and body do the right thing:

  • Raw events are written via setOrThrow against InternalTable.RawEvents.table using the existing schema.
  • Per-entity updates are supplied as updatedEntities: array<Persistence.updatedEntity> and then split into:
    • entitiesToSet/idsToDelete for the main entity table, and
    • history per entity with Change.t values, plus containsRollbackDiffChange to gate backfill behavior.
  • When history is enabled, it:
    • backfills baseline history rows via EntityHistory.backfillHistory for entities not marked as rollback-affected,
    • accumulates batched delete updates (entityId + checkpointId), and
    • writes set updates using entityHistory.setChangeSchema and the history table.

The conversion from Change.t to history rows and the split between Set/Delete cases look consistent with the new Change-based model and with InMemoryTable.Entity.set’s history semantics.

If you haven’t already, please run rollback/history tests that touch entities with multiple changes in the same checkpoint to confirm history rows still align with expectations.

Also applies to: 588-687
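
The Set/Delete split itself is mechanical; roughly (accumulator style assumed):

let entitiesToSet = []
let idsToDelete = []
updates->Belt.Array.forEach(update =>
  switch update.latestChange {
  | Set({entity}) => entitiesToSet->Js.Array2.push(entity)->ignore
  | Delete({entityId}) => idsToDelete->Js.Array2.push(entityId)->ignore
  }
)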


515-523: Mirror initialization is correctly integrated into initialize

PgStorage.make now accepts ~mirror: option<Mirror.t>=?, and initialize calls mirror.initialize(~chainConfigs, ~entities, ~enums) before executing Postgres schema migrations. This lets the mirror create its own schema up front while keeping the call optional and backwards compatible.

Also applies to: 989-1029
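
That is, the optional hook resolves to something like this inside initialize (a sketch; error handling is discussed in the next comment):

switch mirror {
| Some(mirror) => await mirror.initialize(~chainConfigs, ~entities, ~enums)
| None => ()
}
// ...then run the Postgres schema migrations as before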


1363-1402: Confirm whether mirror failures should block the batch or be logged and ignored

The verification confirms your concern is technically valid: mirror failures will block the entire batch because:

  • mirror.writeBatch(~updatedEntities)->Promise.thenResolve(...) at line 1378 chains only the success path; rejections are unhandled
  • The await mirrorPromise at line 821 is inside the transaction block
  • Any rejection propagates and aborts the transaction (caught and re-raised at lines 839–848)

However, the codebase provides no design documentation indicating whether this fail-fast behavior is intentional (strong coupling for atomicity) or an oversight. The Mirror module interface defines no error-handling mechanism, and the initialization at line 1017 is similarly unguarded but at least outside the transaction.

Please confirm: Should mirror failures be treated as critical blocking errors (current), or non-critical best-effort operations (requiring catch + log pattern)?
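
If the best-effort option is preferred, the catch + log pattern could look like this sketch (the Promise.catch binding is assumed to follow the usual rescript-promise shape):

let mirrorPromise =
  mirror.writeBatch(~updatedEntities)
  ->Promise.catch(exn => {
    // Log and swallow: Postgres stays authoritative, the mirror may lag.
    Js.Console.error2("Mirror writeBatch failed (best-effort):", exn)
    Promise.resolve()
  })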


615-653: Verify mirror data consistency on UTF-8 retry—may not be "raw vs sanitized" as described

The UTF-8 error handling with retry looks solid in structure, but there's a potential data consistency issue worth clarifying:

  • writeBatchMethod creates mirrorPromise once from mirror.writeBatch(~updatedEntities) before calling writeBatch
  • On first writeBatch call (no escapeTables), shouldRemoveInvalidUtf8 is false, so entities are not mutated
  • If a UTF-8 error occurs during SQL operations, the retry adds the failing table to escapeTables and calls writeBatch again
  • On retry, removeInvalidUtf8InPlace mutates the same updatedEntities array (line 688)
  • The same mirrorPromise is then awaited inside the second writeBatch call (line 820–822), executing with already-mutated entities

This means the mirror likely sees sanitized data (from the mutated array), while your review suggested it sees raw data. If data consistency between mirror and Postgres is required, confirm whether this is intentional—or consider capturing and mirroring the final state after sanitization explicitly.

codegenerator/cli/npm/envio/src/Persistence.res (1)

39-48: New cache/update record types look sound

updatedEffectCache and updatedEntity cleanly model per-batch deltas and line up with how they’re constructed in writeBatch (effect + items + shouldInitialize, and entityConfig + updates). I don’t see typing or ownership issues here.
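
As a reference point, the shapes as described read roughly like this (type paths are paraphrased, not verified against Persistence.res):

type updatedEffectCache = {
  effect: Internal.effect,
  items: array<Internal.effectCacheItem>, // item type name assumed
  shouldInitialize: bool,
}

type updatedEntity = {
  entityConfig: Internal.entityConfig,
  updates: array<Internal.inMemoryStoreEntityUpdate<Internal.entity>>,
}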

Comment on lines 169 to 227
  let setRow = set
  let set = (
    inMemTable: t<'entity>,
-   entityUpdate: EntityHistory.entityUpdate<'entity>,
+   change: Change.t<'entity>,
    ~shouldSaveHistory,
    ~containsRollbackDiffChange=false,
  ) => {
    //New entity row with only the latest update
    @inline
-   let newEntityRow = () => Internal.Updated({
-     latest: entityUpdate,
-     history: shouldSaveHistory ? [entityUpdate] : [],
+   let newStatus = () => Internal.Updated({
+     latestChange: change,
+     history: shouldSaveHistory
+       ? [change]
+       : Utils.Array.immutableEmpty->(Utils.magic: array<unknown> => array<Change.t<'entity>>),
      containsRollbackDiffChange,
    })
+   let latest = switch change {
+   | Set({entity}) => Some(entity)
+   | Delete(_) => None
+   }

-   let {entityRow, entityIndices} = switch inMemTable.table->get(entityUpdate.entityId) {
-   | None => {entityRow: newEntityRow(), entityIndices: Utils.Set.make()}
-   | Some({entityRow: InitialReadFromDb(_), entityIndices}) => {
-       entityRow: newEntityRow(),
+   let updatedEntityRecord = switch inMemTable.table->get(change->Change.getEntityId) {
+   | None => {latest, status: newStatus(), entityIndices: Utils.Set.make()}
+   | Some({status: Loaded, entityIndices}) => {
+       latest,
+       status: newStatus(),
        entityIndices,
      }
-   | Some({entityRow: Updated(previous_values), entityIndices}) =>
-     let entityRow = Internal.Updated({
-       latest: entityUpdate,
+   | Some({status: Updated(previous_values), entityIndices}) =>
+     let newStatus = Internal.Updated({
+       latestChange: change,
        history: switch shouldSaveHistory {
        // This prevents two db actions in the same event on the same entity from being recorded to the history table.
-       | true if previous_values.latest.checkpointId === entityUpdate.checkpointId =>
+       | true
+         if previous_values.latestChange->Change.getCheckpointId ===
+         change->Change.getCheckpointId =>
          previous_values.history->Utils.Array.setIndexImmutable(
            previous_values.history->Array.length - 1,
-           entityUpdate,
+           change,
          )
-       | true => [...previous_values.history, entityUpdate]
+       | true => [...previous_values.history, change]
        | false => previous_values.history
        },
        containsRollbackDiffChange: previous_values.containsRollbackDiffChange,
      })
-     {entityRow, entityIndices}
+     {latest, status: newStatus, entityIndices}
    }

-   switch entityUpdate.entityUpdateAction {
-   | Set(entity) => inMemTable->updateIndices(~entity, ~entityIndices)
-   | Delete => inMemTable->deleteEntityFromIndices(~entityId=entityUpdate.entityId, ~entityIndices)
+   switch change {
+   | Set({entity}) =>
+     inMemTable->updateIndices(~entity, ~entityIndices=updatedEntityRecord.entityIndices)
+   | Delete({entityId}) =>
+     inMemTable->deleteEntityFromIndices(
+       ~entityId,
+       ~entityIndices=updatedEntityRecord.entityIndices,
+     )
    }
-   inMemTable.table->setRow(entityUpdate.entityId, {entityRow, entityIndices})
+   inMemTable.table->setRow(change->Change.getEntityId, updatedEntityRecord)
  }

⚠️ Potential issue | 🟠 Major

set correctly adopts Change.t, but rollback flag aggregation may be lossy

The new set implementation does the right things overall:

  • Derives latest from the incoming Change.t (Some(entity) for Set, None for Delete).
  • Initializes or upgrades rows from None/Loaded to Updated with a single-element or empty history based on shouldSaveHistory.
  • For already Updated rows, appends to history or replaces the last entry when the checkpoint id matches, ensuring at most one history row per entity per checkpoint.
  • Updates or clears indices depending on whether the change is Set or Delete.

One concern: in the Some({status: Updated(previous_values), ...}) branch, containsRollbackDiffChange is set to previous_values.containsRollbackDiffChange, ignoring the containsRollbackDiffChange argument for the new change. If an entity first receives a “normal” change (flag=false) and later in the same in-memory aggregation gets a change affected by rollback (flag=true), the aggregated update will still have containsRollbackDiffChange=false. That can cause PgStorage.writeBatch to incorrectly backfill history for entities whose later changes were actually rollback-affected.

Consider OR-ing the flag when extending an existing Updated row:

-      let newStatus = Internal.Updated({
-        latestChange: change,
-        history: /* existing logic */,
-        containsRollbackDiffChange: previous_values.containsRollbackDiffChange,
-      })
+      let newStatus = Internal.Updated({
+        latestChange: change,
+        history: /* existing logic */,
+        containsRollbackDiffChange:
+          previous_values.containsRollbackDiffChange || containsRollbackDiffChange,
+      })

This preserves previous behavior while ensuring any rollback-affected change marks the aggregated update accordingly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let setRow = set
let set = (
  inMemTable: t<'entity>,
  change: Change.t<'entity>,
  ~shouldSaveHistory,
  ~containsRollbackDiffChange=false,
) => {
  //New entity row with only the latest update
  @inline
  let newStatus = () => Internal.Updated({
    latestChange: change,
    history: shouldSaveHistory
      ? [change]
      : Utils.Array.immutableEmpty->(Utils.magic: array<unknown> => array<Change.t<'entity>>),
    containsRollbackDiffChange,
  })
  let latest = switch change {
  | Set({entity}) => Some(entity)
  | Delete(_) => None
  }
  let updatedEntityRecord = switch inMemTable.table->get(change->Change.getEntityId) {
  | None => {latest, status: newStatus(), entityIndices: Utils.Set.make()}
  | Some({status: Loaded, entityIndices}) => {
      latest,
      status: newStatus(),
      entityIndices,
    }
  | Some({status: Updated(previous_values), entityIndices}) =>
    let newStatus = Internal.Updated({
      latestChange: change,
      history: switch shouldSaveHistory {
      // This prevents two db actions in the same event on the same entity from being recorded to the history table.
      | true
        if previous_values.latestChange->Change.getCheckpointId ===
        change->Change.getCheckpointId =>
        previous_values.history->Utils.Array.setIndexImmutable(
          previous_values.history->Array.length - 1,
          change,
        )
      | true => [...previous_values.history, change]
      | false => previous_values.history
      },
      containsRollbackDiffChange:
        previous_values.containsRollbackDiffChange || containsRollbackDiffChange,
    })
    {latest, status: newStatus, entityIndices}
  }
  switch change {
  | Set({entity}) =>
    inMemTable->updateIndices(~entity, ~entityIndices=updatedEntityRecord.entityIndices)
  | Delete({entityId}) =>
    inMemTable->deleteEntityFromIndices(
      ~entityId,
      ~entityIndices=updatedEntityRecord.entityIndices,
    )
  }
  inMemTable.table->setRow(change->Change.getEntityId, updatedEntityRecord)
}
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/InMemoryTable.res around lines 169 to 227,
the Updated branch that builds newStatus uses
previous_values.containsRollbackDiffChange and ignores the incoming
containsRollbackDiffChange argument, so a later rollback-affected change can be
lost; fix by combining (OR-ing) the previous flag with the incoming
containsRollbackDiffChange when constructing the newStatus for the
Updated(previous_values) branch (i.e. set containsRollbackDiffChange:
previous_values.containsRollbackDiffChange || containsRollbackDiffChange) so any
rollback-affected change marks the aggregated update.

