Skip to content

Conversation

@neekolas
Copy link
Contributor

@neekolas neekolas commented Nov 14, 2025

Add DB chaos to fork tests by exposing CLI flags and workers.WorkerClient.lockDB with defaults 100–2000 ms and 5000 ms interval

Introduce CLI flags to enable DB file locking chaos, pass settings via environment variables, and implement workers.WorkerClient.lockDB to acquire a write lock for a duration; add utilities to start/clear periodic locks and integrate them into the concurrent forks test while increasing group count to 10.

📍Where to Start

Start with CLI parsing and env wiring in cli.ts, then follow DB chaos setup in db-chaos-utils.ts and the test integration in forks.test.ts.


📊 Macroscope summarized c78f6d1. 5 files reviewed, 20 issues evaluated, 19 issues filtered, 0 comments posted

🗂️ Filtered Issues

forks/cli.ts — 0 comments posted, 1 evaluated, 1 filtered
  • line 104: runForkTest unconditionally calls .toString() on options.dbLockTimeMin, options.dbLockTimeMax, and options.dbLockInterval. If a caller omits any of these newly added fields (they are not runtime-enforced), options.<field> will be undefined and .toString() will throw TypeError: Cannot read properties of undefined. Add defaults or guard these before use, or only set the env vars when dbChaosEnabled is true and the values exist. [ Low confidence ]
forks/config.ts — 0 comments posted, 2 evaluated, 2 filtered
  • line 94: dbLockTimeMin, dbLockTimeMax, and dbLockInterval are parsed using parseInt(process.env.* , 10) with string fallbacks only when the env var is undefined or empty (due to ||). If an env var is set to a non-numeric value (e.g., "abc"), parseInt returns NaN. These exported values can then propagate as NaN, leading to unpredictable behavior in timing/locking logic. Validate and reject non-numeric values or coerce with safe defaults. [ Low confidence ]
  • line 99: No validation ensures dbLockTimeMin <= dbLockTimeMax and that both and dbLockInterval are positive. Misordered or negative values from env can lead to nonsensical or harmful timing (e.g., negative durations or inverted ranges) in downstream logic. Add range checks and fail fast with a clear error when constraints are violated. [ Low confidence ]
forks/db-chaos-utils.ts — 0 comments posted, 1 evaluated, 1 filtered
  • line 21: In applyDbChaos (code object 2) the randomness threshold is hard-coded to 50% (Math.random() < 0.5) (line 21) and there is no way to scope selection or ensure that at-most-one worker per installation is locked. While not a crash, this can cause many workers to be locked concurrently. This is a behavioral issue with potential wide impact on tests. [ Code style ]
workers/main.ts — 0 comments posted, 16 evaluated, 15 filtered
  • line 131: Event naming mismatch inside workerThreadCode (code object 6): the worker thread listens on parentPort.on("worker_message", ...) (line 131) but worker thread/parent port standard message event name is 'message'. Using a custom event name prevents receiving parent messages in the worker thread. This appears unused/incorrect. [ Low confidence ]
  • line 312: startSync launches an infinite loop (while(true)) with no cancellation or tie to worker lifecycle (lines ~312-324). Once started it cannot be stopped leading to runaway background tasks and resource leaks. [ Low confidence ]
  • line 312: startSync spawns an infinite loop with no cancellation or lifecycle tie-in (lines ~312-325 in code object 6). There is no stored AbortController or check that the worker is terminating — calling terminate() on WorkerClient only stops streams but does not stop sync tasks started by startSync, so background sync loops may continue after termination, leaking resources. [ Low confidence ]
  • line 336: startStream intentionally does not return when a stream is already active: the early if (this.activeStreamTypes.has(streamType)) logs but continues and may initialize the same stream twice (lines ~336-342). This can create duplicate streaming loops and race conditions. [ Low confidence ]
  • line 337: Duplicate stream start allowed: in WorkerClient.startStream the code intentionally does not return when a stream is already active (lines 337-343 in code object 6), so calling startStream for the same streamType multiple times will create additional concurrent loops/stream handlers and duplicate behavior (memory leak, duplicated emits, multiple responses). The commented-out return indicates the intended guard was disabled. [ Low confidence ]
  • line 427: In getSQLiteFileSizes (code object 6) the code calls fs.readdirSync(dbDir) without checking that dbDir exists (line ~427). If dbDir does not exist, this will throw synchronously and bubble up causing the caller to crash. [ Low confidence ]
  • line 442: getSQLiteFileSizes calls await this.client.conversations.list() inside a loop over filesystem entries (lines ~442-445). This queries the API repeatedly for each file instead of once, causing unnecessary IO, potential rate limits and inconsistent results if conversations change during the loop. [ Low confidence ]
  • line 582: Inconsistent awaiting of streams leading to potential runtime errors: initMessageStream uses const stream = await this.client.conversations.streamAllMessages() then for await (const message of stream), while initConversationStream does const stream = this.client.conversations.stream(); followed by for await (const conversation of await stream) (lines ~582 and ~790 in code object 6). If the SDK returns either a Promise or AsyncIterable directly inconsistently between APIs, one of these patterns can throw or iterate incorrectly. [ Low confidence ]
  • line 596: Unsafe property access on potentially undefined message fields: in initMessageStream the check message?.senderInboxId.toLowerCase() === this.client.inboxId.toLowerCase() calls .toLowerCase() without guarding senderInboxId or this.client.inboxId (lines ~595-603 in code object 6), which will throw if either side is undefined/null. [ Low confidence ]
  • line 599: In initMessageStream the code calls .toLowerCase() on values that may be undefined: message?.senderInboxId.toLowerCase() and this.client.inboxId.toLowerCase() (lines ~599-602). If message.senderInboxId is undefined or not a string this will throw a runtime TypeError. [ Low confidence ]
  • line 727: In handleResponse the code does (message.content as string).toLowerCase() (line ~727) without verifying message.content is a string, which will throw if content is undefined or an object. [ Already posted ]
  • line 728: Potential thrown error in handleResponse when calling (message.content as string).toLowerCase() without ensuring message.content is a string (lines ~728-731 in code object 6). If content is undefined or an object (non-string), this will throw and be caught by outer try/catch, but may cause unnecessary stack noise; more importantly, subsequent content checks rely on content being a string. [ Low confidence ]
  • line 1074: collectConversations's filter function ignores the fromInboxId parameter — when fromInboxId is truthy it returns msg.conversation.id !== undefined instead of checking the message's originating inbox ID (lines ~1074-1078). This likely causes incorrect filtering semantics and can return irrelevant conversations. [ Low confidence ]
  • line 1075: In collectConversations the filter uses fromInboxId ? (msg) => { if (msg.type !== StreamCollectorType.Conversation) return false; return msg.conversation.id !== undefined; } : undefined (lines ~1075-1080 in code object 6) — the filter ignores fromInboxId value (does not check that the conversation originated from fromInboxId). This causes the collector to accept any conversation that has an id when fromInboxId is truthy, violating the caller's expected filter-by-sender semantics. [ Low confidence ]
  • line 1130: In lockDB (code object 7) the implementation calls this.client.disconnectDatabase() synchronously and then schedules a setTimeout to reconnectDatabase. If disconnectDatabase is asynchronous in the SDK (returns a Promise) or requires awaiting, calling it without awaiting could leave the DB in an inconsistent state or race with reconnect. The method wraps this in a Promise but does not await a possible Promise from disconnectDatabase. [ Low confidence ]

Copy link
Contributor Author

neekolas commented Nov 14, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue on line in workers/main.ts:815:

Rethrowing in stream handlers causes unhandled promise rejections and leaked state. Consider catching and handling: abort/cancel the controller, clear streamControllers/streamReferences, update activeStreamTypes, then break/restart the loop instead of rethrowing.

-              throw error;
+              break;
@@
-          throw error;
+          continue;

🚀 Reply to ask Macroscope to explain or update this suggestion.

👍 Helpful? React to give us feedback.

@neekolas neekolas force-pushed the 11-13-add_streams_to_fork_test branch from 1985c18 to 289c08c Compare November 14, 2025 02:01
@neekolas neekolas force-pushed the 11-13-add_db_chaos_by_locking_db_file branch from 8153581 to 9587ba2 Compare November 14, 2025 02:01
--streams Enable message streams on all workers [default: false]
--db-chaos-enabled Enable database locking chaos [default: false]
--db-lock-time-min <ms> Minimum DB lock duration in ms [default: 100]
--db-lock-time-max <ms> Maximum DB lock duration in ms [default: 2000]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

showHelp() lists defaults that don’t match main() (db-lock-time-max 2000 vs 6000, db-lock-interval 15000 vs 5000). Consider updating the help text to reflect the actual runtime defaults.

Suggested change
--db-lock-time-max <ms> Maximum DB lock duration in ms [default: 2000]
--db-lock-time-max <ms> Maximum DB lock duration in ms [default: 6000]
--db-lock-interval <ms> How often to apply DB locks in ms [default: 5000]

🚀 Reply to ask Macroscope to explain or update this suggestion.

👍 Helpful? React to give us feedback.

@neekolas neekolas force-pushed the 11-13-add_db_chaos_by_locking_db_file branch 3 times, most recently from d8691e2 to 6aaafdf Compare November 14, 2025 22:39
@neekolas neekolas force-pushed the 11-13-add_streams_to_fork_test branch from 289c08c to 24c219c Compare November 17, 2025 16:21
@neekolas neekolas force-pushed the 11-13-add_db_chaos_by_locking_db_file branch from 6aaafdf to c78f6d1 Compare November 17, 2025 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants