Skip to content

Conversation

@JakeChampion
Copy link
Contributor

@JakeChampion JakeChampion commented Apr 18, 2025

This pull request adds full support for the zstd (Zstandard) compression algorithm throughout Apache Traffic Server, including build system integration, compression plugin support, Accept-Encoding header normalization, and test coverage.

Key Features

Build system and dependencies:

  • Add CMake support for finding zstd library with new Findzstd.cmake
  • Update Docker build files to include libzstd-dev package
  • Add TS_HAS_ZSTD feature flag for conditional compilation

Core compression support:

  • Extend compress plugin to support zstd compression alongside gzip and brotli
  • Add zstd stream handling structures and functions
  • Update compression configuration to include zstd in supported algorithms list

Accept-Encoding header normalization:

  • Extend proxy.config.http.normalize_ae configuration to support values 4 and 5 for zstd normalization
  • Update HTTP transaction cache matching to handle zstd encoding

API and infrastructure:

  • Add TS_HTTP_VALUE_ZSTD and TS_HTTP_LEN_ZSTD constants
  • Update MIME field handling to recognize zstd encoding
  • Add zstd support to traffic_layout feature detection

Improved cache matching:

  • Fix Accept-Encoding quality calculation logic in HTTP transaction cache
  • Replace multiplicative quality combining with minimum quality selection
  • Remove problematic gzip-specific fallback logic that caused cache inconsistencies
  • Ensure proper cache behavior for all compression algorithms

Test Coverage

  • Comprehensive compress plugin tests covering zstd compression scenarios
  • Accept-Encoding normalization tests for all zstd combinations
  • Updated golden files to include zstd compression test results
  • New compress3.config for zstd-specific plugin configuration
  • Test all combinations of zstd, br, and gzip in various scenarios
  • Cache matching tests demonstrate correct behavior without workarounds

Standards Compliance

The implementation follows RFC 8878 standards for zstd compression and maintains backward compatibility with existing gzip and brotli compression functionality. All tests pass and the feature is properly integrated with existing caching and content negotiation mechanisms.

Configuration

New normalization modes:

  • proxy.config.http.normalize_ae = 4: Prioritize zstd, fallback to br then gzip
  • proxy.config.http.normalize_ae = 5: Support all combinations of zstd, br, and gzip

Benefits

  • Improved compression ratios compared to gzip
  • Better performance than brotli for compression speed
  • Reduced bandwidth usage and faster page loads
  • Seamless integration with existing ATS infrastructure

@JakeChampion JakeChampion marked this pull request as ready for review April 18, 2025 18:34
@masaori335 masaori335 self-requested a review April 18, 2025 23:25
@masaori335 masaori335 added the compress compress plugin label Apr 18, 2025
@masaori335 masaori335 added this to the 10.2.0 milestone Apr 18, 2025
@masaori335
Copy link
Contributor

[approve ci]

@JakeChampion
Copy link
Contributor Author

@masaori335 do you know which files I need to modify to install zstd on the osx used in the Jenkins job?

@masaori335
Copy link
Contributor

We need to ask @ezelkow1 to install the package in the osx env, I guess.

However, this PR has #if HAVE_ZSTD_F check, why it's failing?

@cmcfarlen cmcfarlen self-requested a review April 21, 2025 22:23
@JakeChampion JakeChampion force-pushed the zstd branch 2 times, most recently from 0167bf5 to 74cd8b5 Compare April 22, 2025 07:51
@JakeChampion
Copy link
Contributor Author

We need to ask @ezelkow1 to install the package in the osx env, I guess.

However, this PR has #if HAVE_ZSTD_F check, why it's failing?

thank you, _F was a mistake, it was meant to be _H

@JakeChampion JakeChampion force-pushed the zstd branch 4 times, most recently from 308032c to c598282 Compare April 22, 2025 10:12
@JakeChampion
Copy link
Contributor Author

Looks like all the FreeBSD machines for CI are offline

Copy link
Contributor

@shukitchan shukitchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate it if you can please add some documentation to the changes.

Thanks.

@cmcfarlen
Copy link
Contributor

Hi! This doesn't compile for me. I added a line to include/tscore/ink_config.cmake.in so HAVE_ZSTD_H would be truthy for the CPP macros. This lit up that code, but then there were several compiler errors:

/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/compress.cc:203:9: error: no member named 'zstd_ctx' in 'Data'; did you mean 'zstd_cctx'?
  203 |   data->zstd_ctx = nullptr;
      |         ^~~~~~~~
      |         zstd_cctx
/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/misc.h:93:14: note: 'zstd_cctx' declared here
   93 |   ZSTD_CCtx *zstd_cctx;
      |              ^
/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/compress.cc:206:11: error: no member named 'zstd_ctx' in 'Data'; did you mean 'zstd_cctx'?
  206 |     data->zstd_ctx = ZSTD_createCCtx();
      |           ^~~~~~~~
      |           zstd_cctx
/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/misc.h:93:14: note: 'zstd_cctx' declared here
   93 |   ZSTD_CCtx *zstd_cctx;
      |              ^
/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/compress.cc:207:16: error: no member named 'zstd_ctx' in 'Data'; did you mean 'zstd_cctx'?
  207 |     if (!data->zstd_ctx) {
      |                ^~~~~~~~
      |                zstd_cctx
/Users/cmcfarlen/projects/oss/trafficserver/plugins/compress/misc.h:93:14: note: 'zstd_cctx' declared here
   93 |   ZSTD_CCtx *zstd_cctx;
      |              ^

Please include this patch in your PR to enable the ZSTD code. Thanks!

diff --git a/include/tscore/ink_config.h.cmake.in b/include/tscore/ink_config.h.cmake.in
index 634ed94c3..7da0771fd 100644
--- a/include/tscore/ink_config.h.cmake.in
+++ b/include/tscore/ink_config.h.cmake.in
@@ -184,3 +184,5 @@ const int DEFAULT_STACKSIZE = @DEFAULT_STACK_SIZE@;
 #cmakedefine YAMLCPP_LIB_VERSION "@YAMLCPP_LIB_VERSION@"

 #cmakedefine01 TS_HAS_CRIPTS
+
+#cmakedefine HAVE_ZSTD_H 1

Copy link
Contributor

@cmcfarlen cmcfarlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please patch ink_config.h and resolve compiler issues.

@JakeChampion JakeChampion force-pushed the zstd branch 3 times, most recently from 4b8603e to 72c0dbf Compare May 28, 2025 19:48
@bryancall
Copy link
Contributor

Code Review - ZSTD Compression Implementation

Thank you for this excellent work on adding ZSTD compression support! The testing shows the feature works well and achieves great compression ratios (98.6%). I've completed a thorough code review and found a few issues that should be addressed before merging.


🔴 Critical Issue

Resource Management in zstd_transform_init()

Location: plugins/compress/compress.cc:558-580

Problem: If ZSTD_CCtx_setParameter fails during initialization, the function returns early but leaves data->zstrm_zstd.cctx in a valid (non-NULL) state. The caller at line 395-398 only checks if (!data->zstrm_zstd.cctx), which won't catch these initialization failures. This could lead to compression proceeding with a misconfigured context.

Current Code:

static void
zstd_transform_init(Data *data)
{
  if (!data->zstrm_zstd.cctx) {
    error("Failed to initialize Zstd compression context");
    return;
  }

  size_t result = ZSTD_CCtx_setParameter(data->zstrm_zstd.cctx, ZSTD_c_compressionLevel, ...);
  if (ZSTD_isError(result)) {
    error("Failed to set Zstd compression level: %s", ZSTD_getErrorName(result));
    return;  // ⚠️ Context is valid but misconfigured
  }
  
  result = ZSTD_CCtx_setParameter(data->zstrm_zstd.cctx, ZSTD_c_checksumFlag, 1);
  if (ZSTD_isError(result)) {
    error("Failed to enable Zstd checksum: %s", ZSTD_getErrorName(result));
    return;  // ⚠️ Context exists but checksum not enabled
  }
}

Recommended Fix:

static void
zstd_transform_init(Data *data)
{
  if (!data->zstrm_zstd.cctx) {
    error("Failed to initialize Zstd compression context");
    return;
  }

  size_t result = ZSTD_CCtx_setParameter(data->zstrm_zstd.cctx, ZSTD_c_compressionLevel, data->hc->zstd_compression_level());
  if (ZSTD_isError(result)) {
    error("Failed to set Zstd compression level: %s", ZSTD_getErrorName(result));
    ZSTD_freeCCtx(data->zstrm_zstd.cctx);  // ✅ Clean up
    data->zstrm_zstd.cctx = nullptr;        // ✅ Mark as invalid
    return;
  }
  
  result = ZSTD_CCtx_setParameter(data->zstrm_zstd.cctx, ZSTD_c_checksumFlag, 1);
  if (ZSTD_isError(result)) {
    error("Failed to enable Zstd checksum: %s", ZSTD_getErrorName(result));
    ZSTD_freeCCtx(data->zstrm_zstd.cctx);  // ✅ Clean up
    data->zstrm_zstd.cctx = nullptr;        // ✅ Mark as invalid
    return;
  }

  debug("zstd compression context initialized with level %d", data->hc->zstd_compression_level());
}

⚠️ Medium Issues

1. Integer Overflow in zstd_compress_operation()

Location: plugins/compress/compress.cc:514

Problem: Casting int64_t upstream_length to size_t without bounds checking. If upstream_length is negative (malicious input or bug), the cast wraps to a huge positive value.

Current Code:

ZSTD_inBuffer input = {upstream_buffer, static_cast<size_t>(upstream_length), 0};

Recommended Fix:

if (upstream_length < 0) {
  error("Invalid upstream_length: %" PRId64 " (negative value)", upstream_length);
  return false;
}

if (upstream_buffer == nullptr && upstream_length > 0) {
  error("upstream_buffer is NULL with non-zero length");
  return false;
}

ZSTD_inBuffer input = {upstream_buffer, static_cast<size_t>(upstream_length), 0};

2. Similar Issue with downstream_length

Location: plugins/compress/compress.cc:520

Problem: Same cast issue, though less critical since it comes from ATS API.

Recommended: Add validation:

if (downstream_length < 0) {
  error("Invalid downstream_length from TSIOBufferBlockWriteStart");
  return false;
}

ℹ️ Minor Issues

1. Typo in Comment

Location: plugins/compress/compress.cc:236

Issue: Comment says "brotlidestory" instead of "brotli destroy"

Fix: Change to // Brotli destroy

2. Inconsistent Error Logging

Location: plugins/compress/compress.cc:395-398

Issue: Mix of TSError() and error() macro.

Current:

if (!data->zstrm_zstd.cctx) {
  TSError("Failed to create Zstandard compression context");  // Uses TSError
  return;
}

Elsewhere:

error("Failed to initialize Zstd compression context");  // Uses error()

Recommendation: Use consistent error logging throughout the ZSTD code - prefer error() macro for consistency with the rest of the file.


✅ Excellent Practices Observed

  1. Proper error checking: All ZSTD errors are checked with ZSTD_isError() and error names are logged
  2. Resource cleanup: ZSTD_freeCCtx() is properly called in data_destroy()
  3. Data integrity: Checksum is enabled with ZSTD_c_checksumFlag
  4. Infinite loop prevention: Code checks for no progress in compression loop (line 542-544)
  5. Proper compression modes: Uses ZSTD_e_continue and ZSTD_e_flush appropriately
  6. Backward compatibility: Gracefully handles missing ZSTD libraries

Summary

  • Critical Issues: 1 (must fix before merge)
  • Medium Issues: 2 (should fix for robustness)
  • Minor Issues: 2 (nice to fix)

Overall Assessment: The ZSTD implementation is well-structured and follows the ZSTD API correctly. The critical initialization issue should be addressed, and the integer overflow checks would significantly improve robustness. Great work on achieving excellent compression ratios and maintaining backward compatibility!

Testing: All three test phases passed successfully ✅

  • Test 1 (Master baseline): GZIP and Brotli working
  • Test 2 (PR without ZSTD): No regressions, graceful fallback
  • Test 3 (PR with ZSTD): ZSTD achieves 98.6% compression (4701 → 68 bytes)

bryancall
bryancall previously approved these changes Oct 31, 2025
Copy link
Contributor

@bryancall bryancall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with making the changes in this PR or create another PR, tests worked on my Fedora 42 server.

@JakeChampion
Copy link
Contributor Author

I'll make the changes in this pr if that's okay 👍

@bryancall bryancall requested a review from cmcfarlen October 31, 2025 20:22
bryancall
bryancall previously approved these changes Nov 6, 2025
Copy link
Contributor

@bryancall bryancall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary - LGTM ✅

I've completed a thorough code review and functional testing of the zstd compression implementation.

Code Review

Implementation Quality:

  • Clean separation of concerns with zstd_compress.cc/h following the existing compression plugin architecture
  • Proper resource management with context creation/destruction lifecycle
  • Good error handling with input validation (null checks, size validation, progress detection)
  • Uses modern zstd streaming API (ZSTD_compressStream2) with appropriate flush handling
  • Checksum enabled for data integrity verification
  • Configurable compression levels per host (1-22, default 12)

API & Integration:

  • Proper addition of TS_HTTP_VALUE_ZSTD and TS_HTTP_LEN_ZSTD constants
  • Extends normalize_ae configuration appropriately (values 4 and 5)
  • Consistent with existing gzip/brotli patterns

Build System:

  • The CMake integration works well. Note: I needed to add a pkg-config fallback for Fedora/RHEL systems that don't ship zstd CMake configs, which you may want to consider for broader distribution compatibility.

Testing Results

Built and tested with dev-asan preset:

  • ✅ Zstd compression confirmed working (98.6% reduction on 5KB text content)
  • ✅ Content-Encoding header correctly set
  • ✅ Compressed data verified as valid Zstandard format (v0.8+)
  • ✅ Decompressed content matches original exactly
  • ✅ Cache disabled mode works correctly
  • ✅ No ASAN errors detected

Test Configuration:

  • Origin serving 5,100 bytes of compressible text
  • Compressed to 71 bytes on wire
  • Level 12 compression
  • Cache disabled for testing

Observations

  1. The Accept-Encoding cache quality calculation fix (commit facf384) is a nice improvement that fixes the multiplicative quality issue
  2. Documentation updates are comprehensive and include good examples
  3. All CI checks passing

Recommendation: This is a solid implementation that adds valuable functionality while maintaining code quality and backward compatibility. Ready to merge.

@bryancall
Copy link
Contributor

Fedora 43 Build Support

For Fedora 43 (and potentially other RHEL-based systems), the zstd package doesn't ship CMake config files, only pkg-config files. Here's a suggested enhancement to add pkg-config fallback support:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 692b11985a..0e1d26e08f 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -365,7 +365,20 @@ if(zstd_FOUND)
     endif()
   endif()
 else()
-  set(HAVE_ZSTD_H FALSE)
+  # Try pkg-config as fallback
+  find_package(PkgConfig QUIET)
+  if(PKG_CONFIG_FOUND)
+    pkg_check_modules(ZSTD QUIET IMPORTED_TARGET libzstd)
+    if(ZSTD_FOUND)
+      add_library(zstd::zstd ALIAS PkgConfig::ZSTD)
+      set(HAVE_ZSTD_H TRUE)
+      message(STATUS "Found zstd via pkg-config: ${ZSTD_VERSION}")
+    else()
+      set(HAVE_ZSTD_H FALSE)
+    endif()
+  else()
+    set(HAVE_ZSTD_H FALSE)
+  endif()
 endif()
 
 # ncurses is used in traffic_top

This allows the build to find zstd on distributions that only provide libzstd.pc without CMake package config files. Tested successfully on Fedora 43 with libzstd-devel 1.5.7.

@JakeChampion
Copy link
Contributor Author

Fedora 43 Build Support

For Fedora 43 (and potentially other RHEL-based systems), the zstd package doesn't ship CMake config files, only pkg-config files. Here's a suggested enhancement to add pkg-config fallback support:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 692b11985a..0e1d26e08f 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -365,7 +365,20 @@ if(zstd_FOUND)
     endif()
   endif()
 else()
-  set(HAVE_ZSTD_H FALSE)
+  # Try pkg-config as fallback
+  find_package(PkgConfig QUIET)
+  if(PKG_CONFIG_FOUND)
+    pkg_check_modules(ZSTD QUIET IMPORTED_TARGET libzstd)
+    if(ZSTD_FOUND)
+      add_library(zstd::zstd ALIAS PkgConfig::ZSTD)
+      set(HAVE_ZSTD_H TRUE)
+      message(STATUS "Found zstd via pkg-config: ${ZSTD_VERSION}")
+    else()
+      set(HAVE_ZSTD_H FALSE)
+    endif()
+  else()
+    set(HAVE_ZSTD_H FALSE)
+  endif()
 endif()
 
 # ncurses is used in traffic_top

This allows the build to find zstd on distributions that only provide libzstd.pc without CMake package config files. Tested successfully on Fedora 43 with libzstd-devel 1.5.7.

@bryancall could this be done in a follow up or would we want this as part of the current pull request?

@bryancall
Copy link
Contributor

bryancall commented Nov 7, 2025

@JakeChampion It can be done in a followup PR. I am thinking about merging this in today.

bneradt
bneradt previously approved these changes Nov 8, 2025
cmcfarlen
cmcfarlen previously approved these changes Nov 8, 2025
Copy link
Contributor

@cmcfarlen cmcfarlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is looking good. Thanks for hanging in there!

@JakeChampion JakeChampion dismissed stale reviews from cmcfarlen, bneradt, and bryancall via 6d440f1 November 8, 2025 08:08
@JakeChampion
Copy link
Contributor Author

Apologies for the final commit here which voided the reviews, I saw a code refactor I previously did and realised doesn't need to happen at all so I reverted it, I'll not touch the pr again now

@zwoop zwoop self-requested a review November 10, 2025 18:50
// Disable Vary mismatch checking for Accept-Encoding. This is only safe to
// set if you are promising to fix any Accept-Encoding/Content-Encoding mismatches.
if (http_config_params->get_ignore_accept_encoding_mismatch() &&
// Only suppress variability checks when the operator explicitly set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a bug that's been here for a long time. @cmcfarlen @ezelkow1 I wonder if we should consider this for back ports to 10.x and even 9.2 ? Only this portion though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a pr open for this specific change which we can backport
#12618

}

if (strncasecmp(value, "br", sizeof("br") - 1) == 0) {
info("Accept-Encoding value [%.*s]", len, value);
Copy link
Contributor

@zwoop zwoop Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit picking (and I know it's the same), but this feels more like a debug() than an info(). I'm honestly not sure why this plugin has debug / info / warning all going to the same Dbg() tag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

9 participants