Skip to content

Conversation

@margaretjgu
Copy link
Member

@margaretjgu margaretjgu commented Oct 22, 2025

@margaretjgu margaretjgu requested a review from JoshMock as a code owner October 22, 2025 20:20
@margaretjgu margaretjgu marked this pull request as draft October 22, 2025 20:27
@margaretjgu
Copy link
Member Author

margaretjgu commented Oct 23, 2025

Logic mapping/reference:

Constructor (lines 281-286, 315-317):

  • // MIDDLEWARE: HeaderManagementMiddleware - user-agent, client-meta, accept-encoding
  • // MIDDLEWARE: OpenTelemetryMiddleware - distributed tracing setup

_request method:

  • Lines 474-481: // MIDDLEWARE: HeaderManagementMiddleware - header merging and opaque-id
  • Lines 483-520: // MIDDLEWARE: ContentTypeMiddleware - JSON/NDJSON serialization
  • Lines 531-553: // MIDDLEWARE: CompressionMiddleware - gzip compression
  • Lines 555-562: // MIDDLEWARE: ContentTypeMiddleware - default headers
  • Lines 579-595: // MIDDLEWARE: OpenTelemetryMiddleware - URL and server attributes
  • Lines 617-626: // MIDDLEWARE: OpenTelemetryMiddleware - response status attributes
  • Lines 628-634: // MIDDLEWARE: ProductCheckMiddleware - product header validation
  • Lines 671-682: // MIDDLEWARE: RetryMiddleware - retry attempts tracking
  • Lines 699-770: // MIDDLEWARE: ErrorRedactionMiddleware and RetryMiddleware - error handling

request method:
Lines 781-818: // MIDDLEWARE: OpenTelemetryMiddleware - span creation
Lines 820-833: // MIDDLEWARE: OpenTelemetryMiddleware - span lifecycle

Copy link
Member

@JoshMock JoshMock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts and questions on my first thorough pass through this PR:

  • What is the mechanism for respecting the TransportRequestOptions object that is passed to every Transport.request call? The MiddlewareContext object includes it as its options value, so each middleware may need to check ctx.options every time to decide whether it needs to do anything.
  • We need to add unit tests for each middleware to prove that we're able to test them each in isolation, which is a big perk of this refactor.
  • Just based on the size of this change, we might want to consider releasing it in phases. Like, leave all the new middleware modules in here, but only automatically enable a couple of the simpler ones for the next minor release of the transport, just to validate our work in the wild.

And one little nit: we can remove Middleware from the filenames of each middleware, since it's already in the directory name.

- Legacy + retries: 0.003ms (faster than baseline?)
- Middleware + retries: 0.005ms

This warrants investigation but may be measurement variance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue it might not be a significant concern if the difference is two thousandths of a millisecond.

additionalArg?: TransportResult | Error
): Promise<{ context: MiddlewareContext, error?: Error }> {
let currentContext = initialContext
const syncPhase = (phase + 'Sync') as keyof Middleware
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding Sync to the function name, we can detect a Promise at runtime:

let result = handler(currentContext)
if (result != null && typeof result.then === 'function') {
  result = await result
}

Simplifies the code a bit!

retryBackoff: (min: number, max: number, attempt: number) => number
}

export class RetryMiddleware implements Middleware {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's a way to encapsulate all this retry logic in the middleware so we can pull it out of Transport. Improves the readability of Transport.request and probably makes testing retry logic a lot easier, too.

currentContext = this.mergeContext(currentContext, result.context)
}
} catch (error) {
console.warn(`Middleware ${middleware.name} failed in ${phase}:`, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should drop the console.warn and just throw a MiddlewareException that uses Error.cause to wrap the original error.

For a refactor like this, I want our changes to fail as loudly as possible, to make sure we're catching any issues it introduces early, and to make debugging issues from users more straightforward. Instead of someone noticing compression suddenly stopped working, for example, they'll see they got a MiddlewareException, which will help both them and us to diagnose the underlying cause faster.

}
}

private async _requestWithMiddleware (params: TransportRequestParams, options: TransportRequestOptions = {}): Promise<any> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than break this out into a totally separate function, I think we should add all the middleware handling to Transport.request directly. It will make the code longer and uglier in the short term, until we can confidently turn on middleware by default. Duplicating the code could make it easy to overlook that a bug in Transport.request might exist in more than one place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand that you mean to suggest this:

async request(params, options) {
  if (this[kUseMiddleware]) {
    // inline all the middlware code
    const meta = ...
    middlewareContext = await this[kMiddlewareEngine].executePhase('onBeforeRequest', ...)
  } else {

I understand the problem of handing bugs/fixes and needing to remember to modify both of the code paths -
But I also would like to argue in keeping the _requestWithMiddleware as a separate method not only for readability.

Once middleware becomes the default (after rollout), we can merge the methods and remove the old code path. This keeps the migration cleaner and makes it easier to AB test or to rollback.

@margaretjgu
Copy link
Member Author

margaretjgu commented Oct 28, 2025

  • What is the mechanism for respecting the TransportRequestOptions object that is passed to every Transport.request call? The MiddlewareContext object includes it as its options value, so each middleware may need to check ctx.options every time to decide whether it needs to do anything.

Yes, middleware should check ctx.options for "per request" overrides. I've documented this pattern in docs/middleware.md.

Eg. HeaderManagement checks ctx.options.headers to merge request specific headers:

  1. Check middleware's own config (transport level)
  2. Check ctx.options for per-request override
  3. Apply the appropriate value

Just based on the size of this change, we might want to consider releasing it in phases. Like, leave all the new middleware modules in here, but only automatically enable a couple of the simpler ones for the next minor release of the transport, just to validate our work in the wild.
Yes, I agree. I feel that the current implementation allows for this.

  1. Middleware is opt-in via useMiddleware: true option (default: false)
  2. When disabled, the original _request code path is used
  3. We can start by enabling it only in elasticsearch-js tests initially
  4. Then roll out to a percentage of users gradually (?how tbd)
    This gives us confidence before making it the default...overtime we can deprecate the old one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants