Skip to content

[Feat(tests)] multi-worker freestyle mock#1430

Merged
nams1570 merged 2 commits into
devfrom
refactor-flaky-email-tests
May 13, 2026
Merged

[Feat(tests)] multi-worker freestyle mock#1430
nams1570 merged 2 commits into
devfrom
refactor-flaky-email-tests

Conversation

@nams1570
Copy link
Copy Markdown
Collaborator

@nams1570 nams1570 commented May 13, 2026

Context

Lots of flakiness comes from email polling leading to timeouts. This usually happens when freestyle mock cannot service requests in time. Old mock was single threaded and so clogged up by a lot of requests.

Summary of Changes

A multiworker system should be better.

Summary by CodeRabbit

  • Tests

    • Replaced fixed delays with polling-based verification for failed-emails digest tests, improving reliability and ensuring assertions wait for expected email batches.
    • Updated password-update E2E tests to create users without waiting for email delivery, keeping behavior and assertions unchanged.
  • Chores

    • Reworked mock execution service to use a pooled worker model with per-job timeouts, job recycling, bounded dependency installs, improved logging, and robust per-request cleanup.

Review Change Stack

Copilot AI review requested due to automatic review settings May 13, 2026 21:47
@vercel
Copy link
Copy Markdown

vercel Bot commented May 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
stack-auth-hosted-components Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-auth-mcp Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-backend Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-dashboard Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-demo Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-docs Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-preview-backend Ready Ready Preview, Comment May 13, 2026 11:27pm
stack-preview-dashboard Ready Ready Preview, Comment May 13, 2026 11:27pm

@nams1570 nams1570 changed the title feat(tests): multi-worker freestyle mock [Feat(tests)] multi-worker freestyle mock May 13, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 49158f09-ec75-4621-8e21-f580ed4f4047

📥 Commits

Reviewing files that changed from the base of the PR and between 30853e8 and f0fbf74.

📒 Files selected for processing (1)
  • apps/e2e/tests/backend/endpoints/api/v1/auth/password/update.test.ts

📝 Walkthrough

Walkthrough

Tests for failed-emails digest now poll the dry-run endpoint and filter mailbox-specific batches; the freestyle-mock Dockerfile adds a worker thread script and a queued WorkerPool with npm-install orchestration and safer HTTP request handling.

Changes

Test infrastructure and worker pool architecture

Layer / File(s) Summary
Digest polling helpers and test updates
apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts
Adds types and dry-run-only helpers that poll the digest endpoint, select mailbox-owner failed-email batches, and wait for expected counts; updates test helper signature and tests to use polling.
Password-update signup parameter tweaks
apps/e2e/tests/backend/endpoints/api/v1/auth/password/update.test.ts
Updates multiple password-update E2E tests to create users with noWaitForEmail: true during signup.
Worker thread implementation
docker/dependencies/freestyle-mock/Dockerfile
Adds worker.mjs which runs user scripts inside worker_threads, captures console logs, applies env overrides, executes the script module, and posts structured `{status, result
Server header: worker integration
docker/dependencies/freestyle-mock/Dockerfile
Updates server.mjs header to import worker_threads and compute the workerScript path.
Pool sizing and timeouts
docker/dependencies/freestyle-mock/Dockerfile
Introduces POOL_SIZE, MAX_JOBS_PER_WORKER, JOB_TIMEOUT_MS, and NPM_INSTALL_TIMEOUT_MS constants used by the pool and server.
WorkerPool, queuing, and lifecycle
docker/dependencies/freestyle-mock/Dockerfile
Implements a queued WorkerPool with per-slot busy/draining state, job dispatch to idle slots, worker recycling after a job-count threshold, and JOB_TIMEOUT_MS enforcement.
npm install orchestration
docker/dependencies/freestyle-mock/Dockerfile
Adds runNpmInstall(workDir) with an NPM_INSTALL_TIMEOUT_MS, child-process spawn error handling, and SIGKILL on timeout.
HTTP server handler and startup
docker/dependencies/freestyle-mock/Dockerfile
Rewrites request handling to create per-request workdirs, write script.mjs and generated package.json, conditionally run npm install based on a preinstalled-version map, execute via the WorkerPool, serialize responses before writeHead, ensure cleanup, and log computed pool size on startup.

Sequence Diagram

sequenceDiagram
  participant HTTPServer
  participant WorkerPool
  participant Worker
  HTTPServer->>WorkerPool: run(jobConfig)
  WorkerPool->>WorkerPool: select idle slot or enqueue
  WorkerPool->>Worker: dispatch(job)
  Worker->>Worker: apply envVars and capture console and execute script
  Worker->>WorkerPool: postMessage {status,result or error,logs}
  WorkerPool->>HTTPServer: resolve or reject with {result|error,logs}
Loading

🎯 4 (Complex) | ⏱️ ~45 minutes

"I polled and waited, patient paws in tune,
Workers hum softly under the moon,
Logs gathered like clover in a stack,
Jobs queue and finish, then head back,
A rabbit hops — the tests pass soon."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly reflects the main change: implementing a multi-worker system for the freestyle mock to improve test reliability.
Description check ✅ Passed The description provides context about test flakiness and explains the solution, though it lacks specific implementation details about the worker pool system.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor-flaky-email-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce e2e flakiness caused by timeouts during email polling by increasing the throughput of the freestyle-mock execution service, and by making the failed-emails digest test poll deterministically instead of using a fixed sleep.

Changes:

  • Replace the inline single-threaded freestyle-mock execution model with a worker_threads pool to run multiple scripts in parallel.
  • Add a polling helper to failed-emails-digest e2e tests to wait until expected failures are recorded, removing a fixed sleep.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
docker/dependencies/freestyle-mock/Dockerfile Adds worker.mjs + a worker pool in server.mjs to parallelize script execution and recycle workers periodically.
apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts Adds a digest polling helper and updates the dry-run test to wait for expected failures instead of sleeping.
Comments suppressed due to low confidence (1)

apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts:167

  • Indentation in this block is inconsistent (e.g., expect(response.status)... and the following if/else are indented deeper than the surrounding statements). Since apps/e2e runs ESLint with the indent: ["error", 2, ...] rule, this will likely fail pnpm lint for the e2e package—please reformat this section to match the established 2-space indentation levels.
    const { response, batches: mockProjectFailedEmails } = await waitForFailedEmailsDigest(2);
      expect(response.status).toBe(200);

      if (process.env.STACK_TEST_SOURCE_OF_TRUTH === "true") {
      expect(mockProjectFailedEmails).toMatchInlineSnapshot(`[]`);
      } else {
      expect(mockProjectFailedEmails).toMatchInlineSnapshot(`

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docker/dependencies/freestyle-mock/Dockerfile
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR tackles test flakiness caused by the freestyle mock being unable to service concurrent email-rendering requests. The single-threaded server is replaced with a WorkerPool backed by one worker_threads worker per CPU core, each handling at most 50 jobs before recycling, with per-job (30 s) and npm install (60 s) timeouts. On the test side, the brittle fixed wait(10_000) is replaced with a polling helper that throws a descriptive timeout error instead of silently returning partial state.

  • Dockerfile: New worker.mjs thread handles script execution with full process.env / console isolation per job; server.mjs manages the pool, job dispatch, worker recycling, and guaranteed HTTP responses on all error paths.
  • Test file: waitForFailedEmailsDigest polls the digest endpoint until the expected email count appears, and testFailedEmails no longer accepts an isDryRun argument (always dry-run); the non-dry-run test is left as .todo with an updated comment.

Confidence Score: 5/5

Safe to merge — the worker pool replaces a well-understood single-threaded bottleneck, all error paths send an HTTP response, and previously flagged concerns are now addressed.

The Dockerfile changes are a self-contained testing mock with no production data paths; the multi-worker logic is well-commented and correctly handles worker recycling, timeouts, and error/exit races. The test change removes a fragile fixed delay in favour of a polling helper that throws a clear diagnostic on timeout.

No files require special attention beyond the minor polling-condition note in the test helper.

Important Files Changed

Filename Overview
apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts Replaced fixed 10 s wait() with a polling helper that throws a clear timeout error; isDryRun parameter removed in favour of always using dry_run=true. Minor: polling exit condition checks only batches[0], which could prematurely exit when multiple projects share the same owner mailbox.
docker/dependencies/freestyle-mock/Dockerfile Extracted script execution into a worker.mjs thread; server.mjs now manages a WorkerPool with per-job timeouts (30 s), per-install timeouts (60 s), worker recycling after 50 jobs, and stdio ignore on npm install to prevent pipe-buffer deadlock. Addresses previously flagged concerns.

Sequence Diagram

sequenceDiagram
    participant Client as HTTP Client
    participant Server as server.mjs
    participant Pool as WorkerPool
    participant Worker as worker.mjs (thread)
    participant FS as Filesystem

    Client->>Server: POST /execute/v2/script
    Server->>FS: mkdir(workDir), writeFile(script.mjs), writeFile(package.json)
    alt needs extra modules
        Server->>Server: runNpmInstall(workDir) [60s timeout]
    end
    Server->>Pool: "pool.run({ scriptFile, envVars }) [30s timeout]"
    alt idle slot available
        Pool->>Worker: "postMessage({ scriptFile, envVars })"
        Worker->>FS: import file://scriptFile
        Worker->>Worker: execute, capture logs
        Worker->>Pool: "postMessage({ status, result/error, logs })"
    else all slots busy
        Pool->>Pool: enqueue job, wait for slot
    end
    Pool->>Server: resolve/reject
    Server->>Client: 200/500 JSON response
    Server->>FS: rm(workDir) [finally]

    note over Pool,Worker: After MAX_JOBS_PER_WORKER=50 jobs, slot.draining=true, terminate, exit, spawnSlot()
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts:53
The polling exit condition only inspects `batches[0]?.emails.length`, but `selectBatchesForCurrentMailbox` can return multiple batches when the same owner mailbox has failed emails across more than one project (e.g. across the 10 repeats of the outer test). In that case the loop exits as soon as *any* first batch reaches the threshold — which may not be the current test's project — and the returned `batches` array then contains extra entries, causing the inline snapshot assertion to fail with a confusing multi-batch diff instead of a clear "wrong project" error. Scoping the check to the total across all current-mailbox batches would avoid this.

```suggestion
  while (batches.reduce((sum, b) => sum + b.emails.length, 0) < expectedFailedEmailCount) {
```

### Issue 2 of 2
docker/dependencies/freestyle-mock/Dockerfile:57
**Stale console patch leaks between jobs on uncaught async rejection**

The `finally` block restores `console[method]` only when the `try` block resolves or throws synchronously. If user code spawns a detached `Promise` that rejects *after* `parentPort.postMessage(response)` fires (i.e. after the `finally` block has already run and the next job has started patching `console`), that rejection will execute inside the next job's patched console — so the next job's `logs` array picks up log lines that belong to the previous job. This is unlikely in practice for email-rendering scripts, but the issue exists structurally because the `parentPort.on("message", async …)` handler is one long async function and a stale floating promise can outlive the `finally`.

Reviews (2): Last reviewed commit: "feat(tests): multi-worker freestyle mock" | Re-trigger Greptile

Comment thread apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts Outdated
Comment thread docker/dependencies/freestyle-mock/Dockerfile
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts (1)

38-61: 💤 Low value

Consider clarifying the single-batch assumption.

Line 53 checks only batches[0]?.emails.length, which assumes at most one batch per mailbox will be returned. While this appears correct given the test expectations (lines 167-188 show a single batch), the function comment could explicitly state this assumption for future maintainers.

📝 Optional comment enhancement
 // Polls the failed-emails-digest until the current mailbox's batch contains
 // at least `expectedFailedEmailCount` emails, or `timeoutMs` elapses. Replaces
 // a fixed sleep that flaked when the SMTP failure pipeline (DNS lookup +
 // connection attempt + outbox status update) hadn't recorded all failures yet.
+// Note: Checks only the first batch, assuming one batch per mailbox owner.
 async function waitForFailedEmailsDigest(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts`
around lines 38 - 61, Update the comment above waitForFailedEmailsDigest to
explicitly state the single-batch assumption: note that the loop checks
batches[0]?.emails.length and therefore expects at most one batch per mailbox
(as used by selectBatchesForCurrentMailbox and the tests). Mention that if
multiple batches could be returned in the future the logic should be revised
(e.g., aggregate across batches or handle multiple indices) so maintainers know
why batches[0] is used.
docker/dependencies/freestyle-mock/Dockerfile (1)

38-43: ⚡ Quick win

Log capture stringifies objects to "[object Object]".

args.map(String).join(" ") collapses any non-primitive argument (objects, errors, arrays) to [object Object] / Error: ... with no fields, which makes the captured logs substantially less useful when tests assert on script output. Node's default console uses util.format, which preserves this detail and matches behavior of the un-patched console.

♻️ Proposed fix
 import { parentPort } from "worker_threads";
+import { format } from "util";
@@
   logMethods.forEach((method) => {
     console[method] = (...args) => {
-      logs.push({ message: args.map(String).join(" "), type: method });
+      logs.push({ message: format(...args), type: method });
       originalConsole[method](...args);
     };
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker/dependencies/freestyle-mock/Dockerfile` around lines 38 - 43, The
captured logs currently stringify arguments with args.map(String).join(" "),
losing object/ Error detail; change the console patch inside the
logMethods.forEach to use Node's util.format so captured messages match the
unpatched console: require or import util.format (e.g. const { format } =
require('util') or import { format } from 'util') and replace
args.map(String).join(" ") with format(...args) when calling logs.push({
message: ..., type: method }), keeping the originalConsole[method](...args) call
intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docker/dependencies/freestyle-mock/Dockerfile`:
- Around line 136-181: The `failInFlight` path clears slot.busy but doesn't mark
the slot draining, so a concurrent dispatch can claim a worker that's about to
exit; update the code so that `slot.draining = true` is set when a worker fails
(either inside `failInFlight` or immediately in the `worker.on("error", ...)`
handler before calling `failInFlight`), ensuring `dispatchNext()` will skip the
slot until the `exit` handler replaces it; refer to the `failInFlight` function,
the `worker.on("error", (err) => { ... })` handler, and the `slot.draining` flag
when making the change.
- Around line 258-261: The call to pool.run({ scriptFile, envVars }) can hang
forever if the user script never posts a message, so wrap/race the pool.run
promise with a timeout and on timeout forcefully terminate the associated worker
so the worker 'exit' handler can respawn it and clear slot.busy; specifically,
implement a timedRace around pool.run that rejects after a configurable TIMEOUT,
and on that rejection call the worker termination method used by the pool (so
the exit/cleanup logic that checks MAX_JOBS_PER_WORKER and clears slot.busy
runs), ensuring pool.run's rejection path also marks the slot free and surfaces
a clear timeout error to the caller.
- Around line 244-256: The npm install spawn block can hang because it lacks an
'error' listener, doesn't consume stdio pipes, and has no timeout; update the
logic around spawn("npm", ["install"], { cwd: workDir, stdio: "pipe" }) (used
when needsInstall && Object.keys(requestedNodeModules).length) to add
installProcess.on("error", ...) to reject immediately on spawn failures, pipe or
forward installProcess.stdout and installProcess.stderr to a logger or to
process.stdout/stderr to avoid filling OS buffers, implement a configurable
timeout timer that kills installProcess and rejects if install takes too long,
and ensure you clean up listeners and timer in the installProcess.on("close",
...) handler so the promise always resolves or rejects and doesn't leak
resources.

---

Nitpick comments:
In
`@apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts`:
- Around line 38-61: Update the comment above waitForFailedEmailsDigest to
explicitly state the single-batch assumption: note that the loop checks
batches[0]?.emails.length and therefore expects at most one batch per mailbox
(as used by selectBatchesForCurrentMailbox and the tests). Mention that if
multiple batches could be returned in the future the logic should be revised
(e.g., aggregate across batches or handle multiple indices) so maintainers know
why batches[0] is used.

In `@docker/dependencies/freestyle-mock/Dockerfile`:
- Around line 38-43: The captured logs currently stringify arguments with
args.map(String).join(" "), losing object/ Error detail; change the console
patch inside the logMethods.forEach to use Node's util.format so captured
messages match the unpatched console: require or import util.format (e.g. const
{ format } = require('util') or import { format } from 'util') and replace
args.map(String).join(" ") with format(...args) when calling logs.push({
message: ..., type: method }), keeping the originalConsole[method](...args) call
intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d6ad83eb-3f40-47f5-97dc-08abff6e76de

📥 Commits

Reviewing files that changed from the base of the PR and between 748d708 and 0187a5d.

📒 Files selected for processing (2)
  • apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts
  • docker/dependencies/freestyle-mock/Dockerfile

Comment thread docker/dependencies/freestyle-mock/Dockerfile
Comment thread docker/dependencies/freestyle-mock/Dockerfile
Comment thread docker/dependencies/freestyle-mock/Dockerfile
Comment thread docker/dependencies/freestyle-mock/Dockerfile Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docker/dependencies/freestyle-mock/Dockerfile (1)

45-66: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the full worker environment after each job.

This only snapshots keys from envVars. If a script mutates or deletes any other process.env entry, that state leaks into later requests handled by the same recycled worker, which can reintroduce cross-test flakiness.

🛡️ Proposed fix
-  const previousEnv = new Map();
-  Object.entries(envVars || {}).forEach(([key, value]) => {
-    previousEnv.set(key, process.env[key]);
+  const previousEnv = { ...process.env };
+  Object.entries(envVars || {}).forEach(([key, value]) => {
     process.env[key] = value;
   });
@@
-    previousEnv.forEach((value, key) => {
-      if (value === undefined) {
-        delete process.env[key];
-      } else {
-        process.env[key] = value;
-      }
-    });
+    Object.keys(process.env).forEach((key) => {
+      if (!(key in previousEnv)) delete process.env[key];
+    });
+    Object.entries(previousEnv).forEach(([key, value]) => {
+      process.env[key] = value;
+    });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker/dependencies/freestyle-mock/Dockerfile` around lines 45 - 66, The
current snapshot logic only saves keys listed in envVars (previousEnv) so any
mutations/deletions to other process.env entries by the user code leak across
recycled workers; instead, capture a full copy of process.env before
importing/executing the user script (e.g., create a complete snapshot object of
process.env before the userModule import/execute around scriptFile/userFunction)
and in the finally block fully restore the environment by removing keys added
during execution and restoring original values from that full snapshot (clear
current process.env then reassign original keys), ensuring the worker's
environment is identical after each job.
🧹 Nitpick comments (1)
docker/dependencies/freestyle-mock/Dockerfile (1)

87-103: ⚡ Quick win

Use container-aware parallelism for POOL_SIZE.

cpus().length returns the host system's total logical cores regardless of cgroup limits, which can cause worker pool oversubscription in CI containers. Node.js 22 provides availableParallelism() for this—it respects Docker --cpus, Kubernetes cpu limits, and cgroup constraints (both v1 and v2).

♻️ Proposed fix
-import { cpus } from "os";
+import { availableParallelism } from "os";
@@
-const POOL_SIZE = Math.max(2, cpus().length);
+const POOL_SIZE = Math.max(2, availableParallelism());
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker/dependencies/freestyle-mock/Dockerfile` around lines 87 - 103, The
POOL_SIZE calculation uses cpus().length which ignores cgroup limits; change it
to prefer os.availableParallelism() when present and fall back to cpus().length,
then clamp with Math.max(2, ...). Update the code that defines POOL_SIZE (the
POOL_SIZE constant) to call availableParallelism() safely (e.g., typeof
availableParallelism === "function" ? availableParallelism() : cpus().length)
and keep the existing Math.max(2, ...) clamp so the pool remains at least 2;
reference the os import and the POOL_SIZE constant to locate where to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@docker/dependencies/freestyle-mock/Dockerfile`:
- Around line 45-66: The current snapshot logic only saves keys listed in
envVars (previousEnv) so any mutations/deletions to other process.env entries by
the user code leak across recycled workers; instead, capture a full copy of
process.env before importing/executing the user script (e.g., create a complete
snapshot object of process.env before the userModule import/execute around
scriptFile/userFunction) and in the finally block fully restore the environment
by removing keys added during execution and restoring original values from that
full snapshot (clear current process.env then reassign original keys), ensuring
the worker's environment is identical after each job.

---

Nitpick comments:
In `@docker/dependencies/freestyle-mock/Dockerfile`:
- Around line 87-103: The POOL_SIZE calculation uses cpus().length which ignores
cgroup limits; change it to prefer os.availableParallelism() when present and
fall back to cpus().length, then clamp with Math.max(2, ...). Update the code
that defines POOL_SIZE (the POOL_SIZE constant) to call availableParallelism()
safely (e.g., typeof availableParallelism === "function" ?
availableParallelism() : cpus().length) and keep the existing Math.max(2, ...)
clamp so the pool remains at least 2; reference the os import and the POOL_SIZE
constant to locate where to modify.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3d7b77fb-7493-469d-b369-d95928ec4329

📥 Commits

Reviewing files that changed from the base of the PR and between 0187a5d and 39db261.

📒 Files selected for processing (2)
  • apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts
  • docker/dependencies/freestyle-mock/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/e2e/tests/backend/endpoints/api/v1/internal/failed-emails-digest.test.ts

Lots of flakiness comes from email polling leading to timeouts.
This usually happens when freestyle mock cannot service requests in time.
Old mock was single threaded and so clogged up by a lot of requests.
A multiworker system should be better.
@nams1570 nams1570 force-pushed the refactor-flaky-email-tests branch from 39db261 to 30853e8 Compare May 13, 2026 22:34
@nams1570
Copy link
Copy Markdown
Collaborator Author

@greptileai could you please rereview this PR? Do an exhaustive review pass over the changes.

signing up with an email is well tested elsewhere. Having the assertion here doesn't actually give us any extra coverage, but it does increase the load on the mock
@nams1570 nams1570 merged commit c0871e6 into dev May 13, 2026
33 checks passed
@nams1570 nams1570 deleted the refactor-flaky-email-tests branch May 13, 2026 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants