Skip to content

Conversation

@Zettroke
Copy link
Contributor

Summary

This is a paired request for vectordotdev/vrl#1349 adding support for VRL with Copy-on-Write.
I an opening it to make testing easier as we will have working vector version using new VRL.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

References

@Zettroke Zettroke requested a review from a team as a code owner March 19, 2025 16:48
@github-actions github-actions bot added domain: sources Anything related to the Vector's sources domain: transforms Anything related to Vector's transform components domain: sinks Anything related to the Vector's sinks domain: codecs Anything related to Vector's codecs (encoding/decoding) domain: core Anything related to core crates i.e. vector-core, core-common, etc labels Mar 19, 2025
@bruceg
Copy link
Member

bruceg commented Mar 26, 2025

I would want to see if this regresses any of the existing soak test benchmarks before approving this to merge. @pront can we run those on a single PR on demand any more?

@bruceg
Copy link
Member

bruceg commented Mar 26, 2025

Also, @Zettroke, do you have any opinion on if this obviates the need for the CoW optimization specific to LogEvent introduced in #11166?

@Zettroke
Copy link
Contributor Author

@bruceg I think so. All variants of the Value that are not trivially copyable are reference counted now. So there should be no need to wrap Value in Arc

@pront
Copy link
Member

pront commented Mar 26, 2025

I would want to see if this regresses any of the existing soak test benchmarks before approving this to merge. @pront can we run those on a single PR on demand any more?

Yes, for branches on vectordotdev/vector but not for external contributors. The https://github.com/vectordotdev/vector/blob/master/.github/workflows/regression.yml will need to be enhancement to handle different comparison repos.

@pront
Copy link
Member

pront commented Mar 26, 2025

I copied the branch into vectordotdev and rebased on the latest master.
Here is the job: https://github.com/vectordotdev/vector/actions/runs/14090098596

@pront
Copy link
Member

pront commented Mar 26, 2025

@Zettroke
Copy link
Contributor Author

Zettroke commented Mar 26, 2025

I see regression in splunk_hec_route_s3, can it be a fluke? I can't see any obvious cause for it😅

@pront
Copy link
Member

pront commented Mar 27, 2025

I see regression in splunk_hec_route_s3, can it be a fluke? I can't see any obvious cause for it😅

It can be a flake but it can also be a legitimate issue, it's never obvious with such a wide range of use cases. I will kickoff a couple more runs.

@Zettroke
Copy link
Contributor Author

@pront Do I understand correctly that this is the reverse view? The baseline is my changes, and the comparison is master. So that means there is only 1 test case showing improvement?

@pront
Copy link
Member

pront commented Mar 31, 2025

@pront Do I understand correctly that this is the reverse view? The baseline is my changes, and the comparison is master. So that means there is only 1 test case showing improvement?

Are you referring to this summary? If yes, it means only experiment was slower compared to master.

@Zettroke Zettroke force-pushed the CoW-value-optimization branch from 15c6489 to 5ac5c88 Compare April 1, 2025 17:05
@bruceg
Copy link
Member

bruceg commented Apr 1, 2025

I too had set up a PR that dropped the inner Arc from LogEvent, and saw the same mixed results, a combination of significant performance wins and significant regressions. I'd like to do this, but am uneasy about the regressions.

@Zettroke
Copy link
Contributor Author

Zettroke commented Apr 1, 2025

@pront Sorry, but I'm having hard time understanding what you mean. Regression test result shows my commit as the baseline and master as the comparison, so I assumed results are inverse (1 significant performance gain and 7+ regressions). Or am I missing something?

@Zettroke
Copy link
Contributor Author

Zettroke commented Apr 1, 2025

@bruceg Oh, I see your test summary. I'll look into it. My wild guess is that it might be connected with codecs. While constructing the value, each insertion goes through Arc::make_mut. While this overhead is easily outweighed by avoiding a copy in VRL execution, in codecs there are no copies to avoid, so there is just overhead left. Though I was hoping the branch predictor would do all the hard work ;)

Is there a way to access test events used in regression tests? I see some magic lading is used to generate test data :)

@pront
Copy link
Member

pront commented Apr 1, 2025

Kicked off another run here: https://github.com/vectordotdev/vector/actions/runs/14203583436

Just to manage expectations here, if the results mixed we might not accept this PR. But let's see the summary first.

@bruceg
Copy link
Member

bruceg commented Apr 1, 2025

My wild guess is that it might be connected with codecs. While constructing the value, each insertion goes through Arc::make_mut.

I think you're onto something. Anything where the value is created incrementally instead of using into would suffer this kind of hit.

Though I was hoping the branch predictor would do all the hard work ;)

The Arc uses atomics to handle the reference counting, and I wouldn't be surprised if that disables the branch prediction.

Is there a way to access test events used in regression tests? I see some magic lading is used to generate test data :)

lading is an open-source component, so you can try it out yourself.

@bruceg
Copy link
Member

bruceg commented Apr 1, 2025

Just to manage expectations here, if the results mixed we might not accept this PR. But let's see the summary first.

Let's see if @Zettroke can do something about improving the codec efficiency. If we can manage that, it could turn the numbers around.

@Zettroke
Copy link
Contributor Author

Zettroke commented Apr 1, 2025

Results are ready! Looks way better now 🎉 . But honestly feels a bit weird. In @bruceg regression test syslog_loki had 6% drop, but latest shows no change (-0.29%). What do you think?

@pront pront force-pushed the master branch 2 times, most recently from f13b711 to 3161400 Compare July 10, 2025 15:40
@pront pront force-pushed the master branch 2 times, most recently from 1720078 to ffe54be Compare July 10, 2025 15:43
@thomasqueirozb
Copy link
Contributor

Since the VRL PR hasn't been merged I'm converting this to a draft

@thomasqueirozb thomasqueirozb marked this pull request as draft October 24, 2025 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: codecs Anything related to Vector's codecs (encoding/decoding) domain: core Anything related to core crates i.e. vector-core, core-common, etc domain: sinks Anything related to the Vector's sinks domain: sources Anything related to the Vector's sources domain: transforms Anything related to Vector's transform components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants