Add KVCompose #137

Akulyat · 2025-09-19T13:29:56Z

PR description

KVComposePress (source, paper) is a structured KV cache compression method that uses attention-guided composite tokens. Our method aggregates per-head importance scores, aligns them into composite tokens to preserve cache structure, and allocates retention budgets across layers.
For the RULER benchmark, KVCompose achieves the best performance among structured methods and competitive results vs KVZip for unstructured settings.

Checklist

~~Tests are working (make test)~~
- (couldn't make it work locally, I have certain limitations, hope they run and work here in the PR)
Code is formatted correctly (make style, on errors try fix with make format)
Copyright header is included
All commits are signed-off using git commit -s
(new press) mypress_press.py is in the presses directory
(new press) MyPress is in __init__.py
(new press) README.md is updated with a 1 liner about the new press in the Available presses section
(new press) New press is in the default_presses list in tests/default_presses.py
(new press) A docstring is provided that follows the same structure as the existing ones

copy-pr-bot · 2025-09-19T13:30:00Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

alessiodevoto · 2025-09-22T13:15:17Z

/ok to test 84f63f2

alessiodevoto · 2025-09-22T13:38:36Z

Hi @Akulyat, thanks for opening this PR! 🙂 I had a first glance, we will provide a more detailed review in the coming days. Some initial comments:

Please make sure the commits are signed off (git commit -s) and the style checks work (you can refer to the PR template). You can fix this by running git rebase --exec 'git commit --amend --no-edit -s' HEAD~2 or by starting an interactive rebase
Thanks for sharing the results as well! However, they should not go in KVPress but rather in our HuggingFace Leaderboard. Please remove them from this PR and feel free to submit them by opening a PR on the Hugging Face Sapce, once the code here is stable!
Could you add the new press is in the default_presses list in tests/default_presses.py ? This way the press will tested against the library and make sure it is compatible
In general, some comments about the method might help readability.

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Akulyat · 2025-09-23T11:01:32Z

Hi @alessiodevoto, thank you for the quick feedback!

Addressing your comments:

Commits are signed off, style checks work.
The results are removed from this PR.
Added the press to tests/default_presses.py.
I couldn’t run the tests locally due to some limitations, so I’m relying on them being validated here.
Main functions in the press are commented. I can update or expand the comments after the detailed review.

I have a question regarding the results: should I wait for this PR to be merged, or should I already open a PR to the Leaderboard?

alessiodevoto · 2025-09-29T13:16:06Z

/ok to test b0b3715

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Akulyat · 2025-09-30T12:44:07Z

Following yesterday’s testing:

Fixed style errors.
Concerning two failed tests:
- One was a real bug, fixed.
- The other was the ComposedPress compatibility test. Our implementation requires attentions and is not currently compatible with ComposedPress. I haven't properly thought about whether it can be fixed, this limitation is open for discussion.
  Yet, as for KVzip, I added an exception for KVComposePress.

kvpress/presses/kvcompose_press.py

maxjeblick

Thanks a lot for your code, I added some additional comments w.r.t. the review above.

The most important part is to avoid the potential copying of the cache, as this is in direct conflict with memoery reduction.

kvpress/presses/kvcompose_press.py

SimJeg · 2025-11-04T13:32:57Z

@Akulyat thank you for citing our work on kvpress in https://arxiv.org/abs/2509.05165. We recently published a reference paper for kvpress and updated the citation section of our README accordingly. We’d appreciate it if you could update your paper

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Akulyat · 2025-11-15T06:04:46Z

Thank you very much for all the suggestions.
I’ve fixed the issues, and I left a few replies where I need clarification(statelessness, hack with eager attention).

maxjeblick

Sorry for the delayed response and thanks a lot for your changes!
We've done a second round of review and have the following comments:

Please reset the press to it's default state after compression.
There are some merge conflicts.

Your method will create a kv-cache of size 2x of the context length. As this creates a significant memory overhead, we need this to be apparent both in the press and in the README. I'd thus kindly ask you to

Add a logger.warning in the __call__ method, informing the user about it.
Adding it as a note to the main README.

As a remark: I obtained mean score of 57.2% on 50% CR on ruler, using the default settings in the repo, otherwise. IDK if this is expected with your method.

Details:

2025-12-02 01:39:17,814 - INFO - Metrics saved to results/ruler__4096__meta-llama--Meta-Llama-3.1-8B-Instruct__kvcompose_unstructured__0.50/2/metrics.json
2025-12-02 01:39:17,814 - INFO - Average compression ratio: 0.50
2025-12-02 01:39:17,814 - INFO - Metrics:
{
"cwe": {
"string_match": 17.3
},
"fwe": {
"string_match": 52.0
},
"niah_multikey_1": {
"string_match": 93.0
},
"niah_multikey_2": {
"string_match": 78.6
},
"niah_multikey_3": {
"string_match": 3.0
},
"niah_multiquery": {
"string_match": 55.1
},
"niah_multivalue": {
"string_match": 86.25
},
"niah_single_1": {
"string_match": 98.2
},
"niah_single_2": {
"string_match": 95.4
},
"niah_single_3": {
"string_match": 62.6
},
"qa_1": {
"string_match": 56.0
},
"qa_2": {
"string_match": 25.0
},
"vt": {
"string_match": 21.92
}
}

kvpress/presses/kvcompose_press.py

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Akulyat · 2026-01-16T15:40:40Z

Thank you for the review and the detailed comments.

I have addressed the requested changes and left a separate reply regarding the last point on eager attention.

Additionally, the metrics you provided do not match what we got and plan to submit to the leaderboard. Could you please share the full config file you used to run this experiment?

maxjeblick · 2026-01-20T09:15:25Z

Thanks @Akulyat a lot for the updates and the fixes!

Before merging, I kindly ask you to:

Reset state after compression - Add a _reset_state() method (similar to KVzipPress._reset_internal_parameters()) and call it in the finally block
Resolve merge conflicts
There's probably a type in logger.warning( "KVComposePress temporarily creates a KV cache of ~2x the context length during prefill; " )

I will rerun the experiment and double check there wasn't an issue on my side. I'll update you in this thread.

Akulyat added 2 commits September 23, 2025 11:54

Add KVCompose

effad92

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Add KVCompose to tests/default_presses.py and README description

b0b3715

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Akulyat force-pushed the main branch from 84f63f2 to b0b3715 Compare September 23, 2025 10:29

Akulyat changed the title ~~Add KVCompose results~~ Add KVCompose Sep 23, 2025

Akulyat added 2 commits September 29, 2025 18:05

KVCompose fix style

21a9220

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

KVCompose fix test_presses_run

a2824be

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

maxjeblick requested a review from Jack-Yu-815 October 1, 2025 07:12

Jack-Yu-815 reviewed Oct 9, 2025

View reviewed changes

kvpress/presses/kvcompose_press.py Outdated Show resolved Hide resolved

kvpress/presses/kvcompose_press.py Outdated Show resolved Hide resolved

kvpress/presses/kvcompose_press.py Show resolved Hide resolved

maxjeblick requested changes Oct 9, 2025

View reviewed changes

Akulyat added 3 commits November 15, 2025 03:02

Get rid of doubling the cache

7ff0457

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Style and documentation fix

21d3d2b

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Adapt test_head_compression for KVCompose

a2b81bd

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

maxjeblick requested changes Dec 3, 2025

View reviewed changes

kvpress/presses/kvcompose_press.py Outdated Show resolved Hide resolved

kvpress/presses/kvcompose_press.py Show resolved Hide resolved

maxjeblick reviewed Dec 3, 2025

View reviewed changes

kvpress/presses/kvcompose_press.py Show resolved Hide resolved

KVCompose memory warning

1cc445e

Signed-off-by: Dmitry Akulov <akulyats@gmail.com>

Add KVCompose #137

Are you sure you want to change the base?

Add KVCompose #137

Uh oh!

Conversation

Akulyat commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR description

Checklist

Uh oh!

copy-pr-bot bot commented Sep 19, 2025

Uh oh!

alessiodevoto commented Sep 22, 2025

Uh oh!

alessiodevoto commented Sep 22, 2025

Uh oh!

Akulyat commented Sep 23, 2025

Uh oh!

alessiodevoto commented Sep 29, 2025

Uh oh!

Akulyat commented Sep 30, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

maxjeblick left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

SimJeg commented Nov 4, 2025

Uh oh!

Akulyat commented Nov 15, 2025

Uh oh!

maxjeblick left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Akulyat commented Jan 16, 2026

Uh oh!

maxjeblick commented Jan 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Akulyat commented Sep 19, 2025 •

edited

Loading

maxjeblick left a comment •

edited

Loading

maxjeblick left a comment •

edited

Loading