Skip to content

Avoid unnecessary Vec allocations in greedy proof selection#254

Merged
pablodeymo merged 1 commit intomainfrom
optimize-greedy-proof-selection
Apr 1, 2026
Merged

Avoid unnecessary Vec allocations in greedy proof selection#254
pablodeymo merged 1 commit intomainfrom
optimize-greedy-proof-selection

Conversation

@pablodeymo
Copy link
Copy Markdown
Collaborator

Motivation

The extend_proofs_greedily function (introduced in #251) uses a greedy set-cover algorithm to select aggregated signature proofs that maximize validator coverage. On each iteration of the selection loop, it was allocating a Vec<u64> for every remaining candidate proof just to compare coverage counts via .len(). Only the winning proof's coverage Vec is actually needed -- to extend the covered set and record the attestation count.

With N remaining proofs per iteration, this means N-1 unnecessary heap allocations that are immediately discarded. Since this runs on the block production hot path (every 4 seconds), reducing throwaway allocations is worthwhile.

Description

Split the greedy selection into two phases:

  1. Selection pass -- computes only a .count() (no allocation) for each candidate proof to find the one covering the most uncovered validators.
  2. Collection pass -- collects the actual Vec<u64> of newly covered validator IDs only for the winning proof.

This trades one extra iteration over the best proof's participant indices for avoiding N-1 Vec allocations per loop iteration.

How to Test

  • make test -- all existing tests pass (no behavioral change)
  • make lint -- clippy clean

  In extend_proofs_greedily, the greedy set-cover loop was allocating a
  Vec<u64> for every candidate proof on each iteration just to compare
  coverage counts via .len(). Only the winning proof's coverage Vec is
  actually needed (to extend the covered set and count attestations).

  Now the selection pass computes only a count (no allocation) for each
  candidate, and collects the actual Vec only for the winner. This avoids
  N-1 unnecessary allocations per iteration, where N is the number of
  remaining proofs.
@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

Review: Optimization in extend_proofs_greedily

Verdict: Correct optimization, no issues. This is a straightforward performance improvement that reduces allocations without changing consensus logic.

Analysis:

  1. Correctness (Lines 978-997): The greedy selection logic is preserved. The change from eager collection (allocating Vec<u64> for every candidate) to counting (iterator consumption only) then collecting only for the winner maintains identical behavior.

  2. Performance: Significant improvement. Previously, every iteration of the while-loop allocated O(remaining_indices.len()) vectors. Now only one allocation occurs per outer loop iteration (for the winning proof). This reduces heap pressure in the attestation aggregation hot path.

  3. Idiomatic Rust:

    • Line 985: Using .count() instead of .collect::<Vec<_>>().len() is the correct idiom.
    • Line 987: The max_by_key closure |&(_, count)| count properly handles the Copy type by value.
  4. Minor note: The winning proof's participant_indices() is iterated twice (once for counting at line 982, once for collection at line 994). This is intentional and correct—the redundant work for the single winner is negligible compared to avoiding allocations for all losers.

No changes required.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

No findings.

The change in store.rs looks like a safe micro-optimization: it preserves the greedy selection logic in extend_proofs_greedily, still breaks when no proof adds uncovered validators, and keeps the resulting AggregatedAttestation/metrics behavior unchanged. For consensus correctness, this does not alter fork choice, justification/finalization, attestation validation, state transition, XMSS verification, or SSZ handling; it only avoids allocating a Vec<u64> for every candidate proof and materializes coverage once for the selected winner at store.rs.

Residual risk is low. I could not run cargo test in this sandbox because rustup is blocked from creating temp files under /home/runner/.rustup, so this review is based on patch and source inspection only.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Review: Avoid unnecessary Vec allocations in greedy proof selection

Summary

A focused, single-function optimization to extend_proofs_greedily in crates/blockchain/src/store.rs. The change is small, well-described, and the motivation is sound.


Correctness

The critical invariant is that covered is not mutated between the selection pass and the collection pass. Both passes therefore observe the same state, so new_covered collected in the second pass is identical to what new_coverage would have been in the original code. The optimization is semantically equivalent.

Edge cases are all handled correctly:

  • best == None is unreachable (loop guard ensures remaining_indices is non-empty), but the defensive break is harmless.
  • best_count == 0 correctly short-circuits when all remaining proofs are subsets of covered.
  • The metrics call inc_pq_sig_attestations_in_aggregated_signatures(new_covered.len() as u64) still receives the right count since the collection pass produces the same filtered slice.

Performance

  • Before: N heap allocations per outer loop iteration (one Vec<u64> per candidate).
  • After: 1 heap allocation per outer loop iteration (only for the winner) + one extra iterator pass over the winner's participant indices.

The tradeoff is favorable at any N ≥ 2, and particularly so on the block-production hot path where proof sets can grow with validator count.

One alternative worth noting: it isn't possible to fold the collection into the selection pass to avoid the second iterator scan, because that would require a simultaneous mutable borrow of covered (for extend) and a shared borrow (for filter(|vid| !covered.contains(vid))). The two-pass design is therefore the cleanest correct approach.


Style / Idioms

  • max_by_key(|&(_, count)| count) — correctly uses copy semantics since usize: Copy. Good.
  • Comment // Collect coverage only for the winning proof is appropriately concise.
  • Variable rename from new_covered (in the selection arm) to best_count is cleaner and avoids the misleading new_covered name that previously referred to a coverage Vec but only existed to call .len().

Minor Observation

new_covered is collected into a Vec<u64> solely to:

  1. Measure new_covered.len() for metrics.
  2. covered.extend(new_covered).

These two uses are unavoidable given that covered must be borrowed mutably for extend after the immutable borrow in filter. The current implementation is the correct minimal approach — no further allocation savings are possible here without unsafe code.


Verdict

Approve. The change is correct, focused, and a clear improvement. No behavioral change, no new allocations introduced, and the hot-path allocation count drops from O(N) to O(1) per greedy iteration. Tests and lint passing confirms no regressions.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR optimizes the greedy set-cover loop inside extend_proofs_greedily in crates/blockchain/src/store.rs by splitting the inner body into two passes: a count-only pass (no heap allocation) over all remaining proofs to find the winner, followed by a single collect() only for the winning proof. This eliminates N-1 throwaway Vec<u64> allocations per outer loop iteration — a meaningful saving on the block-production hot path that runs every 4 seconds.

Key changes:

  • The map inside the max_by_key chain now emits (idx, count: usize) instead of (idx, Vec<u64>).
  • The break condition changes from new_covered.is_empty()best_count == 0 (semantically identical).
  • A second pass over proofs[best_idx].participant_indices() collects the actual Vec<u64> only for the winner. This is correct because covered is not mutated between the two passes.
  • Metrics and covered.extend use the collected new_covered Vec exactly as before.

Correctness: The optimization is sound. participant_indices() is a pure, deterministic iterator over a BitList, so calling it twice for the winner while covered is unchanged produces identical results. No behavioral differences.

Confidence Score: 5/5

Safe to merge — pure performance optimization with no behavioral change and correct two-pass semantics.

The change is mechanically correct: participant_indices() is a pure deterministic iterator over an immutable bitfield, and covered is not mutated between the count and collect passes for the same winner. All existing invariants are preserved, the break condition is semantically equivalent, and metrics reporting is unaffected. No P0 or P1 issues found.

No files require special attention.

Important Files Changed

Filename Overview
crates/blockchain/src/store.rs Splits greedy proof selection into a count-only pass (no allocation) and a collect pass for the winner only, eliminating N-1 unnecessary Vec allocations per loop iteration with no behavioral change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Start: remaining_indices non-empty] --> B[Selection pass: for each remaining idx, call participant_indices + filter + count]
    B --> C{best is Some?}
    C -- No --> Z[Break]
    C -- Yes --> D{best_count == 0?}
    D -- Yes --> Z
    D -- No --> E[Collection pass: call participant_indices + filter + collect for best_idx only]
    E --> F[Push AggregatedAttestation]
    E --> G[Push selected_proof]
    E --> H[Emit metrics with new_covered.len]
    F & G & H --> I[covered.extend new_covered]
    I --> J[remaining_indices.remove best_idx]
    J --> A
Loading

Reviews (1): Last reviewed commit: "Avoid unnecessary Vec allocations in gre..." | Re-trigger Greptile

Copy link
Copy Markdown
Collaborator

@MegaRedHand MegaRedHand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! 🐎

@pablodeymo pablodeymo merged commit d55ae59 into main Apr 1, 2026
7 checks passed
@pablodeymo pablodeymo deleted the optimize-greedy-proof-selection branch April 1, 2026 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants