Description
We noticed that some requests were generating HTTP 502 errors. The cause is that the origin server closed an HTTP/1 keep-alive connection. It happens as follows.
1. An origin connection is released from the session pool.
2. ATS buffers a request.
3. Immediately after, ATS enters `state_read_server_response_header`.
4. Somewhere between steps 1 and 5, the server decides to close the connection, probably due to a keep-alive timeout, before the request arrives.
5. ATS reads from the socket and gets an EOS.
6. The EOS is handled as follows:
trafficserver/src/proxy/http/HttpSM.cc
Line 1942 in 2e244e5
t_state.current.retry_attempts.maximize(t_state.configured_connect_attempts_max_retries());
Note that EOS falls through, and because the retry counter is raised to the configured maximum, retries are disabled from this point onwards. This behavior was introduced in PR #9366 (Http2 to origin). The issue can be mitigated by setting the keep-alive timeout in ATS lower than the origin's keep-alive timeout, so that ATS closes an idle connection before the origin does.
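To illustrate why the `maximize` call quoted above rules out any further retry, here is a minimal sketch. `RetryCounter`, `can_retry`, and `retry_allowed_after_eos` are hypothetical stand-ins written for this issue, not the real ATS types, and the sketch assumes that `maximize` raises the counter to at least the given limit.

```cpp
#include <algorithm>

// Hypothetical stand-in for the retry bookkeeping; not the real ATS type.
struct RetryCounter {
  int attempts = 0;

  // Assumption: maximize() raises the counter to at least `limit`.
  void maximize(int limit) { attempts = std::max(attempts, limit); }

  // A retry is permitted only while the counter is below the limit.
  bool can_retry(int limit) const { return attempts < limit; }
};

// Returns whether a retry is still possible once the EOS path has run.
bool retry_allowed_after_eos(int max_retries) {
  RetryCounter counter;
  counter.maximize(max_retries); // what the EOS path does in the snippet above
  return counter.can_retry(max_retries);
}
```

Under that assumption, `retry_allowed_after_eos` returns false for any limit, even though no attempt has failed for a reason the origin is responsible for.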
I think ATS should provide an option to retry a request if an invalid response has been received.
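As a rough sketch of what such an option could check, the following decision function retries only when the EOS arrived on a reused connection before any response bytes were read, and only for idempotent methods. Every name here (`OriginAttempt`, `should_retry_on_eos`, `option_enabled`) is hypothetical and not an existing ATS API or setting; the conditions are one possible policy, not a proposal for the exact implementation.

```cpp
#include <string>

// Hypothetical description of one origin attempt; not an ATS structure.
struct OriginAttempt {
  bool connection_reused;   // connection came from the session pool
  long response_bytes_read; // response bytes received before the EOS
  std::string method;       // request method, e.g. "GET"
};

// Methods that HTTP defines as idempotent, so repeating them is safe.
bool is_idempotent(const std::string &method) {
  return method == "GET" || method == "HEAD" || method == "PUT" ||
         method == "DELETE" || method == "OPTIONS" || method == "TRACE";
}

// Decide whether an EOS on this attempt should trigger a retry.
bool should_retry_on_eos(const OriginAttempt &attempt, bool option_enabled,
                         int attempts, int max_retries) {
  if (!option_enabled || attempts >= max_retries) {
    return false; // feature off, or retry budget exhausted
  }
  if (!attempt.connection_reused) {
    return false; // a fresh connection closing is not the keep-alive race
  }
  if (attempt.response_bytes_read > 0) {
    return false; // a partial response was already received
  }
  return is_idempotent(attempt.method);
}
```

Restricting the retry to reused connections with zero response bytes targets the keep-alive race described above while leaving other EOS cases unchanged.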