Skip to content

feat(crypto): add Ligero/Brakedown linear-code polynomial commitment …#1191

Open
diegokingston wants to merge 6 commits into
mainfrom
feat/ligero-brakedown-pcs
Open

feat(crypto): add Ligero/Brakedown linear-code polynomial commitment …#1191
diegokingston wants to merge 6 commits into
mainfrom
feat/ligero-brakedown-pcs

Conversation

@diegokingston
Copy link
Copy Markdown
Collaborator

…scheme

Implement a hash-based PCS for multilinear polynomials supporting two pluggable encoding backends:

  • Reed-Solomon (Ligero): O(N log N) prover via FFT, requires FFT-friendly field
  • Expander codes (Brakedown): O(N) prover via sparse matrix multiply, field-agnostic

No trusted setup, post-quantum secure. Uses existing FieldElementVectorBackend for Merkle commitments and Fiat-Shamir transcript for non-interactive proofs.

Tested with Goldilocks (Ligero) and Mersenne31 (Brakedown) fields.

…scheme

Implement a hash-based PCS for multilinear polynomials supporting two
pluggable encoding backends:
- Reed-Solomon (Ligero): O(N log N) prover via FFT, requires FFT-friendly field
- Expander codes (Brakedown): O(N) prover via sparse matrix multiply, field-agnostic

No trusted setup, post-quantum secure. Uses existing FieldElementVectorBackend
for Merkle commitments and Fiat-Shamir transcript for non-interactive proofs.

Tested with Goldilocks (Ligero) and Mersenne31 (Brakedown) fields.
@diegokingston diegokingston requested a review from a team as a code owner March 11, 2026 16:18
@github-actions
Copy link
Copy Markdown

Kimi AI Review

This PR introduces a new module for linear-code Pairing-Based Cryptography (PCS) schemes, specifically Ligero and Brakedown. Here are my detailed comments and feedback on the changes:

  1. Modular Structure and Organization:

    • The new module is well-organized with separate files for commit, prove, verify, data structures, encoding schemes (Expander and Reed-Solomon), and matrix operations. This modularity is great for maintainability and testing.
    • Each file has a clear purpose and contains related functionality, which simplifies understanding and reviewing the code.
  2. Commit Function (commit.rs):

    • The commit function correctly handles the transformation of the evaluation vector into a matrix, encoding each row, and building a Merkle tree over the columns. This is the core of the PCS scheme.
    • The use of assert! to ensure the length of evals is a power of two is appropriate. However, consider adding a more descriptive error message to help with debugging.
    • The calculation of n_cols and n_rows based on the power of two is correct and follows the PCS scheme requirements.
  3. Matrix Operations (matrix.rs):

    • The Matrix struct and its associated methods provide a clear and efficient way to handle matrix operations. The row-major order is suitable for the PCS scheme.
    • The row_mul method is crucial for computing the left-multiplication of the tensor-half vector with the matrix. The implementation is correct and follows the expected mathematical operations.
  4. Encoding Schemes:

    • Both the Reed-Solomon and Expander encoding schemes are implemented correctly. The Reed-Solomon scheme uses FFT for efficient encoding, while the Expander scheme uses sparse matrix multiplication for linear-time encoding.
    • The distance method in both encoding schemes returns the relative minimum distance of the code, which is crucial for security analysis.
  5. Prove and Verify Functions (prove.rs and verify.rs):

    • The prove function correctly computes the tensor decomposition, re-linearizes the polynomial, and samples column indices using the Fiat-Shamir heuristic.
    • The verify function correctly recomputes the tensor vectors, re-derives the column indices, and checks the consistency and final evaluation. The security of the scheme relies on these verification steps being correct.
  6. Security Considerations:

    • The use of the Fiat-Shamir heuristic for deriving random challenges is appropriate for converting the interactive proof system into a non-interactive one.
    • The calculation of the number of column openings based on the security parameter and code distance is correct and follows the security analysis of PCS schemes.
  7. Performance:

    • The use of sparse matrices in the Expander encoding scheme is efficient and reduces the computational complexity to linear time.
    • The Reed-Solomon scheme's use of FFT for encoding is also efficient and suitable for fields that support FFT operations.
  8. Testing:

    • The provided tests cover various aspects of the PCS scheme, including commitment, proving, and verifying with both Reed-Solomon and Expander encoding schemes.
    • The tests also check edge cases, such as wrong evaluations failing verification, which is crucial for ensuring the security of the scheme.

Overall, the implementation of the linear-code PCS module is correct, well-organized, and follows the mathematical and cryptographic principles underlying these schemes. The code is efficient, secure, and well-tested. I approve of this PR with no major issues found.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review by ChatGPT

Comment thread crates/crypto/src/lib.rs
pub mod fiat_shamir;
pub mod hash;
pub mod linear_code_pcs;
pub mod merkle_tree;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Feedback

Correctness

  • General Operations: Ensure that all modular and field operations handle edge cases, such as division by zero, points at infinity, and identity elements.
  • Polynomial Implementation: Verify that polynomials are correctly reduced modulo the field characteristic.
  • FFT Implementation: Ensure correct radix and root-of-unity computations for FFT, consider edge cases like n=0.

Security

  • Constant-Time Operations: Verify that all operations on secret data are performed in constant time to mitigate timing attacks.
  • Zeroization: Ensure that temporary buffers holding secret data are cleared after use.
  • Randomness: Evaluate the cryptographic security of random number generators—avoid using non-cryptographic sources for randomness.

Performance

  • Memory Allocations: Check for redundant memory allocations, particularly in loops, to improve performance.
  • Redundant Field Inversions: Identify any unnecessary field inversions which may be optimized out.
  • FFT Efficiency: Ensure in-place computation of FFT to minimize memory usage.

Bugs & Errors

  • Potential Panics: Search for usages of unwrap and address potential areas that could cause panics. Replace with safer error handling mechanisms.
  • Memory Safety: Ensure no unsafe code or raw pointers compromise the memory safety.
  • Integer Overflow: Validate that integer operations are using wrapping, saturating, or checked variants to prevent overflow.

Code Simplicity

  • Duplicated Code: De-duplicate logic where necessary to improve maintainability.
  • Abstractions: Review code abstractions to ensure they are neither overly complex nor simplistic, striking a balance for maintainability.

Specific Review Notes

  • New Module linear_code_pcs: The addition of linear_code_pcs should be accompanied by adjoined unit/integration tests to verify correctness and integration with existing modules.

Overall, the code requires further review, testing, and refinement to meet correctness, security, and performance standards. Address the above concerns before merging.

// This matches DenseMultilinearPolynomial's evaluation order.
}
evals
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • Edge Cases: The commit function uses assert! calls which may cause the program to panic in production if the conditions are unmet. Consider handling these conditions more gracefully or ensuring these checks are only in debug builds.
  • Polynomial Correctness: The tensor_vec function seems correct mathematically, assuming FieldElement performs correct field operations.

Security:

  • Timing Side-Channels: Ensure all field operations, especially in tensor_vec, are constant-time if they are used with secret values to avoid timing attacks.
  • Error Handling: The use of expect in MerkleTree::build should be scrutinized; ensure it does not leak sensitive information on failure.
  • Randomness and Hashing: Not applicable in the provided code snippet, but ensure domain separation in hashing if used elsewhere.

Performance:

  • Unnecessary Allocations: The ext_data vector is initialized with with_capacity, which is good for performance. However, consider evaluating if any temporary allocations can be circumvented.

Bugs & Errors:

  • Potential Panics: The use of assert! can lead to panics, as mentioned above.
  • Off-by-One Errors: No such errors observed in the provided snippet.

Code Simplicity:

  • Duplicated Code: No obvious duplication detected.
  • Complexity: The code is relatively straightforward, but consider abstracting some operations to improve readability and reuse, especially if similar manipulations occur elsewhere in the code.

Additional Notes:

  • Consider adding unit tests that specifically target edge cases like minimal input sizes, powers of two less than typical, etc., to ensure stability.
  • Ensure that the FieldElement and other mathematical operations enforce boundary conditions internally to prevent misuse or undefined behavior.

pub v: Vec<FieldElement<F>>,
/// Opened columns with their Merkle proofs, sampled via Fiat-Shamir.
pub columns: Vec<OpenedColumn<F, B>>,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • The current code is primarily defining the structures used in linear-code polynomial commitment schemes (PCS). Ensure that the mathematical operations related to polynomial commitment, such as multilinear polynomial evaluations and proofs, are correctly implemented in other parts of the code. This snippet provides no evidence of incorrect mathematical operations.

Security:

  • There is no indication in the provided code about the use of cryptographically secure randomness or constant-time operations. Important to review the implementations interacting with these structures for such security guarantees.
  • Make sure that all operations performed on these structures, particularly regarding proof construction and verification, avoid timing side-channels and maintain constant-time operations when working with secret values.

Performance:

  • No unnecessary allocations or misuse of resources in the provided snippet, but consider efficiency strategies like using smallvec if vectors are typically small or have a predictable size.
  • The structures here do not concern MSM or FFT specifically but ensure that elsewhere these operations are implemented efficiently.

Bugs & Errors:

  • No potential panic or unwrap usage within this code snippet. Possible error handling should be carefully implemented where these structures are used, especially in divisions, inverses, and Merkle tree operations.
  • There are no integer operations in this snippet, so check other areas for possible underflow/overflow issues.

Code Simplicity:

  • The code defines necessary structs without unnecessary complexity. Ensure elsewhere that clear abstractions are used to manipulate these structures, avoiding code redundancy.
  • Inspect if any matrix and merkle_tree operations within the codebase can be abstracted to simplify the code further.

Overall, the foundational structures seem correct; however, ensure that related operations meet cryptographic standards and that performance goals are achieved in the broader implementation context. Address the highlighted concerns before merging.

@@ -0,0 +1,2 @@
pub mod expander;
pub mod reed_solomon;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  1. Reed-Solomon Implementation: Ensure that the polynomial division and interpolation operations handle edge cases such as division by zero and have checks for zero coefficients properly. These operations are critical in ensuring correct encoding and decoding of data.

  2. Edge Cases: Check if the polynomial arithmetic properly handles zero polynomials and leading zeros. Also, ensure that FFT operations handle edge cases for polynomial degrees gracefully.

Security

  1. Constant-Time Operations: Verify if cryptographic operations involved in the proof systems and field arithmetic ensure constant-time execution, as this is crucial to mitigate timing side-channel attacks.

  2. Zeroization of Sensitive Data: Ensure any sensitive data that deals with secrets is zeroized after use to prevent leakage.

  3. Randomness: If randomness is used in proof generation or cryptographic operations, check for the use of a cryptographically secure random number generator.

  4. Hash Function Domain Separation: Ensure that different operations use separate domains in hash functions to avoid cross-protocol attacks.

Performance

  1. FFT Efficiency: Validate that the FFT implementation makes efficient use of in-place transformations and optimizes unnecessary allocations.

  2. Redundant Operations: Ensure there are no redundant field inversions, particularly during MSM and within the elliptic curve scalar multiplication.

Bugs & Errors

  1. Panics & Unwraps: Look for potential panics and unwraps, especially around array indexing and optional types, to improve robustness.

  2. Integer Overflow/Underflow: Check that arithmetic operations are performed with overflow checking or using safe arithmetic methods.

Code Simplicity

  1. Complex Implementations and Abstractions: Evaluate if there are any overly complex implementations or if the abstractions could be improved for better maintainability and readability.

Consider addressing these issues before merging the code.

let enc = ReedSolomonEncoding::<F>::new(4, 4);
assert_eq!(enc.distance(), (3, 4));
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Edge Cases: The assertions in new function assume that msg_len and rho_inv will always form a power of two when multiplied. However, there is no check for when msg_len is zero, which could lead to incorrect behavior or panics. Consider adding a check to ensure msg_len > 0. Also, ensure that encode handles the case of zero-length messages gracefully.
  • Polynomial and FFT Implementations: The Polynomial::evaluate_fft method assumes that the FFT always succeeds when called with a power-of-two length parameter. Make sure that there are appropriate checks in the FFT implementation to handle cases where this assumption might not hold due to external factors.

Security

  • Timing Side-Channels: Ensure that all sensitive operations in the underlying field and polynomial implementations are constant-time, especially operations in the FFT, which can have side-channel vulnerabilities.
  • Zeroization of Sensitive Data: This code does not show any explicit handling for zeroizing sensitive data like keys or secrets post-use. If these are involved, ensure zeroization to avoid leaks.

Performance

  • Unnecessary Allocations: The creation of the Polynomial object in encode may involve allocations that could be optimized if the message structure remains unchanged through operations. Consider reusing allocations if possible.

Bugs & Errors

  • Potential Panics or Unwraps: Using assert! and expect directly can cause panics which might be better handled gracefully through results or errors if the environment demands high stability.

Code Simplicity

  • Consider extracting some logic into helper functions if the complexity within functions like encode grows, for better readability and maintenance.

Summary:

The implementation is largely correct but needs stronger consideration for edge cases, particularly when msg_len is zero, and handling potential panics due to invalid inputs or operations. Security aspects like constant-time operations and zeroization should be explicitly verified if they involve sensitive data. After addressing the mentioned concerns, the code will be more robust and ready for merging.

.collect();

LinCodeProof { v, columns }
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • The function calculate_t calculates the number of column openings needed based on a security parameter and code distance. Ensure delta_den isn't zero to avoid division by zero.
  • The calculation of half_delta and log_factor in calculate_t involves floating-point arithmetic, which might introduce precision errors. Verify the impact on security depending on precision.
  • Use of debug_assert_eq! for checks like a.len() == commitment.n_rows means the program won't panic in release mode if there's a mismatch. Consider using regular assert_eq! if discrepancy handling is necessary in non-debug builds.

Security

  • Verify that the implementation of transcript.sample_u64 guarantees constant-time operation to avoid timing side-channels when sampling column indices.
  • Ensure point uses cryptographically secure randomness and that its derivation doesn't leave traces in memory after use.
  • There is potential for panic due to unwrap in state.merkle_tree.get_proof_by_pos(idx).expect("column index is in range"). Consider handling this more gracefully.

Performance

  • The use of Vec::collect and potential multiple heap allocations in col_indices and columns section could be optimized with pre-allocation.

Bugs & Errors

  • The expect call assumes the column index is in range. An out-of-bounds index not handled properly could result in a panic and termination of execution. Consider addressing by adding appropriate range checks if the assumption might fail in borderline cases.

Code Simplicity

  • Code seems clear and follows a logical structure. Comments effectively explain complex operations like tensor decomposition. Ensure consistency in maintaining the code simplicity throughout the library.

In conclusion, while the basic logic is sound, several considerations around security, correct assumptions, and potential impact of floating-point precision warrant further examination before merging.

assert_eq!(sorted.len(), cols.len());
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  1. Mathematical Operations:

    • mul_vec assumes the vector v passed is of the same size as the number of columns in the matrix, which is good. However, be cautious of operations like v[*col] when col is not properly checked against its range, although assert_eq! ensures dimensions match during usage.
  2. Edge Cases:

    • Ensure n_rows or n_cols is not zero during construction to avoid abnormal behavior in other operations as assert_eq!(entries.len(), n_rows) only checks entries length match.
  3. Polynomial and FFT Implementations:

    • Not applicable within the scope of the changes.
  4. Proof System Correctness:

    • Not applicable within the scope of the changes.

Security

  1. Timing Side-Channels:

    • Ensure FieldElement operations, especially multiplication and addition, are constant-time.
  2. Proper Zeroization of Sensitive Data:

    • No indication of zeroization for sensitive data is identified. Ensure FieldElement supports this if necessary.
  3. Cryptographically Secure Randomness:

    • The function random_binary does not use a secure source of randomness; it relies on a deterministic pseudo-random generator approach for reproducible results, which may not be suitable if randomness needs to be cryptographically secure.
  4. No Secret-Dependent Branching:

    • Usage of .contains() and .insert() on used (BTreeSet) can introduce timing side-channels.
  5. Hash Function Domain Separation:

    • Not applicable within the scope of the changes.

Performance

  1. Unnecessary Allocations:

    • Vec allocations in random_binary and mul_vec could be optimized by pre-allocating with precise capacity if possible.
  2. Redundant Field Inversions:

    • No inversions are seen but always review for redundant implementations as multiplication is used instead.
  3. MSM and FFT Efficiency:

    • Not applicable within the scope of the changes.

Bugs & Errors

  1. Potential Panics or Unwraps:

    • assert_eq! could panic if assumptions are not met.
  2. Memory Safety Issues:

    • No direct unsafe code is present, which is good.
  3. Off-by-one Errors:

    • Utilizing indices from a state might lead to off-by-one errors when adjusting for zero-based indexing, include checks to ensure valid indices.
  4. Integer Overflow/Underflow:

    • Usage of .wrapping_* methods ensure no overflow, but always analyze the expected behavior when wrapping.

Code Simplicity

  1. Overly Complex Implementations:

    • The transformation logic within random_binary could be clarified or separated for readability.
  2. Duplicated Code:

    • No immediate duplications are observed.
  3. Poor Abstractions:

    • Abstraction levels seem appropriate, though ensure Entry and Row concepts are correctly abstracted for more complex methods in future expansions.

The current code has potential correctness and security issues. Consider refactoring to address these concerns for improvement before merging.


assert_eq!(result, expected);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Polynomial and FFT Implementations: The tests for evaluations of polynomials seem robust; however, thorough verification on edge cases such as evaluating zero polynomials or the evaluation at boundary limits (e.g., very large input values) should be ensured.

Security

  • Timing Side-Channels: There is no indication of defenses against timing side-channels. The polynomial evaluations and other arithmetic operations should be verified to not have data-dependent branching that could lead to timing attacks.
  • Proper Zeroization of Sensitive Data: The code does not show any indication of explicitly zeroizing sensitive data before it goes out of scope. Ensure that sensitive mathematical objects are wiped from memory after use.
  • Cryptographically Secure Randomness: The code uses Keccak256 for hashing in the DefaultTranscript, but ensure randomness used elsewhere is secured with a cryptographically secure random generator.
  • Hash Function Domain Separation: There is a potential issue with hash function use; ensure that domain separation techniques are applied when the same hash function is used in multiple parts of the protocol.

Performance

  • Unnecessary Allocations: Use of Vec for point evaluations and other operations could lead to unnecessary memory allocations. Consider optimizing with fixed-size arrays where applicable.
  • Redundant Field Inversions: Not observed in the current code, but generally, it’s important to minimize inversions as they are more expensive in field arithmetic.

Bugs & Errors

  • Potential Panics or Unwraps: Usage of expect in polynomial evaluations could lead to runtime panics if the point is invalid or out of expected bounds. Ensure values are safely validated.

Code Simplicity

  • Duplicated Code: There is duplicated logic in multiple commit_open_verify tests for different field parameters. Consider abstracting repetitive test setup into helper functions to ease maintenance and reduce duplication.

Recommendation

  • Address security concerns regarding timing side-channels and proper zeroization of sensitive data. Ensure domain separation in hashing.
  • Refactor test code and possibly other logic to remove redundancies and optimize memory usage.

Overall, there are important issues, mainly in terms of security practices and some potential simplifications, that should be addressed before considering a merge.

/// Relative minimum distance of the code as `(numerator, denominator)`.
/// For RS with rate rho, this is `(1 - rho)`, e.g. `(1, 2)` for rate 1/2.
fn distance(&self) -> (usize, usize);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  1. Ensure that the encode function respects the mathematical guarantees of the respective linear code (Reed-Solomon or Expander). This snippet does not provide the specifics of these encodings, which are crucial for verifying correctness.

Security

  1. Timing Side-channels: It's crucial to ensure that operations in sensitive encoding (like Reed-Solomon and Expander codes) are constant-time to prevent leaking information via timing attacks. This is especially important in encode.
  2. Zeroization: Sensitive data (e.g., message inputs to encode) should be zeroized after use to prevent any accidental leakage.
  3. It's missing a clear indication of cryptographically secure randomness usage, especially if any randomness is involved in encoding.
  4. Ensure that hash functions, if used, have domain separation to avoid cross-protocol attacks.

Performance

  1. Unnecessary Allocations: Repeatedly creating Vec can be costly. Consider reusing buffers or using more efficient data structures if performance bottleneck exists.
  2. Redundant Operations: Check if any redundant computations can be avoided in encode, especially if FFT is involved.

Bugs & Errors

  1. Potential Panics: Ensure that encode handles invalid inputs gracefully, without causing panics or unwrapping.
  2. Memory Safety: Presently, the code doesn't show memory safety concerns, but ensure that operations on msg within encode do not cause buffer overflows.
  3. Integer Overflow/Underflow: If using indices or lengths within encode, ensure these are checked or wrapped safely.

Code Simplicity

  1. Complexity: If the actual encoding algorithms are implemented later, ensure they aren't too complex or difficult to understand.
  2. Duplication: Check other parts of the library to ensure this trait is not duplicated unnecessarily.

Overall, while the trait definition is clean, more detail on the implementation of the encoding functions and their specific operations would be required to fully assess correctness and security vulnerabilities. Therefore, merging this without reviews on those implementations may lead to issues later on.

});

eval == *claimed_value
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Mathematical operations: Ensure that the field operations like addition, multiplication, and zero element checks are correctly handling edge cases like zero elements. There's no explicit handling for division or inversions, which are potential sources of issues in elliptic curve operations.
  • Edge cases: The code seems to handle basic checks for edge cases such as incorrect vector lengths. Verify handling of empty vectors and the case where point length is not even, as this could lead to incorrect tensor splitting.
  • Proof system correctness: Important checks seem to be in place, but the comments could be more detailed to ensure all specification steps are covered.

Security

  • Timing side-channels: While the use of iterators and function calls generally move operations towards being more constant-time, ensure that collection accesses and loops do not inadvertently introduce timing side-channels, especially in cryptographic contexts.
  • Zeroization of sensitive data: There is no explicit zeroization of variables (e.g., w) containing sensitive data. Consider adding zeroization to guard against leaking sensitive data in memory.
  • Cryptographically secure randomness: The randomness derived from sampling column indices in the transcript should be reviewed to ensure it is cryptographically secure.
  • Secret-dependent branching: Avoid logic where branch lengths or memory accesses are influenced by secret data to prevent side channels.

Performance

  • Unnecessary allocations: Potential allocations occur in creating vectors like col_indices. Consider pre-allocating where possible.
  • Redundant field inversions: There are multiplication and addition operations but no explicit field inversions, which can be costly. Ensure this is correctly balanced against other operations.

Bugs & Errors

  • Potential panics or unwraps: Although not observed directly within the diff, verify the usage of collections and ensure all indexed accesses are safe.
  • Memory safety issues: Generally safe given Rust's guarantees but double-check unsafe blocks or external calls.
  • Off-by-one errors: Ensure all loop bounds are correct; particularly important in cryptographic implementations where an off-by-one could invalidate proofs.
  • Integer overflow/underflow: Utilize Rust's checked arithmetic where appropriate to prevent overflow impacts, especially in calculating t.

Code Simplicity

  • Overly complex implementations: The code appears readable, but ensuring function extractions for repetitive logic could improve clarity.
  • Duplicated code: Internal consistency checks within the loop have similar logic. Consider refactoring into utility functions.
  • Poor abstractions: The abstractions used seem appropriate, but there might be opportunities to simplify tensor computations or column verifications for clarity.

Given these considerations, the code requires some improvements for security and correctness, particularly focusing on edge cases and potential side-channel attacks.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Mar 11, 2026

Greptile Summary

This PR adds a complete Ligero/Brakedown linear-code polynomial commitment scheme (PCS) for multilinear polynomials to lambdaworks-crypto. It introduces two pluggable encoding backends — Reed-Solomon (Ligero, O(N log N) via FFT) and expander codes (Brakedown, O(N) via sparse matrix multiplication) — alongside a commit/prove/verify pipeline backed by the existing MerkleTree and Fiat-Shamir transcript infrastructure. The implementation is no-trusted-setup and post-quantum secure.

Key findings:

  • Biased field element generation in random_nonzero: The value state >> 33 is bounded to 31 bits, so for any field with characteristic > 2^31 (Goldilocks64, Stark252, etc.), only a tiny fraction of the field elements can appear in the sparse A/B matrices. The Brakedown distance analysis requires uniform random non-zero field elements; this bias invalidates the soundness bound when ExpanderEncoding is used with large fields.
  • b_out_len deviates from the Brakedown paper: Setting b_out_len = compressed_len = α·n instead of the paper's (1 − α − 1/r)·n changes both the code rate and minimum distance, meaning the user-supplied distance parameter cannot be independently verified against the actual construction.
  • Dead code: SparseMatrix::random_binary is never called from ExpanderEncoding or anywhere else in the new module.
  • debug_assert for security-critical size checks in prove.rs: Point-length validation is stripped in release builds, which can silently produce proofs over the wrong evaluation point.
  • The overall architecture (commit → prove → verify separation, LinearCodeEncoding trait, shared calculate_t utility, Fiat-Shamir transcript binding) is clean and well-structured. The RS encoding, tensor_vec, matrix operations, and verifier logic are all correct.

Confidence Score: 2/5

  • Not safe to merge without addressing the biased field element distribution in random_nonzero and the b_out_len construction mismatch, both of which can invalidate the security guarantees for certain field choices.
  • Two logic-level issues directly affect soundness: (1) the 31-bit ceiling on generated field element values breaks the uniform distribution assumption of the Brakedown distance proof for large fields, and (2) the b_out_len = compressed_len deviation means the actual code rate/distance do not match what the paper's analysis covers. Since ExpanderEncoding is advertised as field-agnostic and callers can supply any field, these are not confined to specific test configurations.
  • crates/crypto/src/linear_code_pcs/sparse_matrix.rs (biased element generation) and crates/crypto/src/linear_code_pcs/encoding/expander.rs (b_out_len construction) require the most attention before merge.

Important Files Changed

Filename Overview
crates/crypto/src/linear_code_pcs/sparse_matrix.rs Contains two issues: (1) random_nonzero limits generated field element values to 31 bits via state >> 33, breaking the uniform distribution required by Brakedown for any field with characteristic > 2^31. (2) random_binary is dead code never referenced from ExpanderEncoding.
crates/crypto/src/linear_code_pcs/encoding/expander.rs The b_out_len = compressed_len simplification deviates from the Brakedown paper construction, yielding a code with different rate and distance than the paper's analysis supports. The user-supplied distance parameter therefore cannot be independently validated against the actual construction.
crates/crypto/src/linear_code_pcs/prove.rs Proves a multilinear evaluation via left-multiplication and Fiat-Shamir column sampling. Logic is sound and uses the shared calculate_t from utils.rs. Security-critical size checks are debug_assert only, which could cause confusing silent failures in release builds.
crates/crypto/src/linear_code_pcs/verify.rs Verifier correctly replays the Fiat-Shamir transcript, checks Merkle proofs, column consistency (⟨a, col_j⟩ == encode(v)[j]), and the final evaluation. Uses shared calculate_t from utils.rs. All size checks are explicit (non-debug) returns of false, which is appropriate.
crates/crypto/src/linear_code_pcs/utils.rs Shared calculate_t function is now correctly deduplicated from both prove.rs and verify.rs, uses Brakedown's δ/3 soundness formula, and properly caps at n_ext_cols. Tests cover RS rate-1/2, rate-1/4, small-distance, and capping cases.
crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs RS encoding backend is correct: treats message as polynomial coefficients and evaluates via FFT. Distance is correctly computed as (rho_inv-1, rho_inv). The test encode_is_systematic_at_roots has a misleading name as RS codes are non-systematic (evaluation-based, not coefficient-in-prefix).
crates/crypto/src/linear_code_pcs/tests.rs Integration tests cover Ligero (Goldilocks + RS) and Brakedown (Mersenne31 + Expander) for 4 and 6 variables, wrong-value rejection, rate-4 code, and transcript-binding (proof replay with different point). Coverage is good but all tests use small polynomials (4–6 vars), masking the t capping issue noted in previous review.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant commit.rs
    participant MerkleTree
    participant prove.rs
    participant Transcript
    participant verify.rs

    Caller->>commit.rs: commit(evals, encoding)
    commit.rs->>commit.rs: reshape evals → k×m Matrix
    commit.rs->>commit.rs: encode each row → k×n ExtMatrix
    commit.rs->>MerkleTree: build(columns of ExtMatrix)
    MerkleTree-->>commit.rs: MerkleTree + root
    commit.rs-->>Caller: CommitOutput { commitment, state }

    Caller->>prove.rs: prove(commitment, state, encoding, point, transcript, sec_param)
    prove.rs->>prove.rs: a = tensor_vec(point[..half])
    prove.rs->>prove.rs: b = tensor_vec(point[half..])
    prove.rs->>prove.rs: v = state.matrix.row_mul(a)
    prove.rs->>prove.rs: claimed_value = ⟨v, b⟩
    prove.rs->>Transcript: append root, point, claimed_value, v
    prove.rs->>Transcript: sample t column indices
    prove.rs->>MerkleTree: get_proof_by_pos(idx) for each index
    prove.rs-->>Caller: LinCodeProof { v, columns }

    Caller->>verify.rs: verify(commitment, proof, encoding, point, claimed_value, transcript, sec_param)
    verify.rs->>verify.rs: a = tensor_vec(point[..half]), b = tensor_vec(point[half..])
    verify.rs->>Transcript: append root, point, claimed_value, proof.v
    verify.rs->>Transcript: re-derive t column indices
    verify.rs->>verify.rs: w = encoding.encode(proof.v)
    loop for each opened column j
        verify.rs->>verify.rs: verify Merkle proof vs root
        verify.rs->>verify.rs: check ⟨a, col_j⟩ == w[col.index]
    end
    verify.rs->>verify.rs: check ⟨proof.v, b⟩ == claimed_value
    verify.rs-->>Caller: bool
Loading

Comments Outside Diff (3)

  1. crates/crypto/src/linear_code_pcs/sparse_matrix.rs, line 933-940 (link)

    Biased field element generation breaks uniform distribution for large fields

    The value produced by state >> 33 is at most 31 bits wide (range [0, 2^31]). For any field with characteristic greater than 2^31 — including Goldilocks64 (modulus ≈ 2^64), Stark252 (modulus ≈ 2^252), and others — only about 2^31 / p of the field's elements can ever be generated here. Practically, for Goldilocks64 this covers less than 0.012% of all non-zero field elements.

    The Brakedown paper (Golovnev et al., Section 5.1, Theorem 2) explicitly requires uniform random non-zero field elements in the A and B matrices for the minimum-distance analysis to hold. With a severely biased distribution, the distance guarantee and the resulting soundness bound (t = calculate_t(...) calls) no longer apply.

    The same issue exists in random_binary at line 996.

    A correct approach would be to derive a field element that covers the full range of F, e.g., by mapping the full 256-bit hash digest to the field using rejection sampling or a reduction modulo the field characteristic:

    // Use full hash bytes for a uniform field element, not just 31 bits
    // Option: use FieldElement::from_bytes_be (if available) or reduce the
    // full u256 value modulo the field characteristic.
    let val_hash = derive(seed, row_idx, hash_counter);
    hash_counter += 1;
    let val = FieldElement::<F>::from_bytes_be(&val_hash)
        .unwrap_or(FieldElement::one());
    let val = if val == FieldElement::zero() { FieldElement::one() } else { val };

    This affects the security of the Brakedown PCS when used with any field whose order exceeds 2^31.

  2. crates/crypto/src/linear_code_pcs/encoding/expander.rs, line 316 (link)

    b_out_len = compressed_len deviates from the Brakedown construction

    The comment acknowledges a "simplicity" deviation from the paper, but this is not cosmetic — it affects both the code rate and the achievable minimum distance:

    Per Brakedown (Golovnev et al., Section 5.1), the B matrix should output (1 - α - 1/r) * n_current extra symbols per level, where r is the overall rate blowup. This ensures the total codeword length yields the intended rate 1/r and relative distance δ = β/r.

    With b_out_len = compressed_len = α * n_current, each level contributes 2α * n_current extra symbols instead of (1 - α - 1/r) * n_current. The actual codeword length is approximately n / (1 - 2α) rather than r * n. For α = 0.25, the actual rate is ≈ 1/2 but the caller passes distance = (1, 25) (δ ≈ 0.04), which would require rate ≥ (1 + 2β)/(1 - α) ≈ 4–5 for small β. The resulting mismatch means the distance parameter passed by callers does not correspond to the structure actually constructed.

    This makes it impossible to verify from the outside whether the user-supplied distance is achievable with the chosen b_out_len.

    // Per Golovnev et al., equation for extra output length at each level:
    // b_out_len = ((1 - alpha - 1.0/r) * current_len as f64).ceil() as usize
    // where r is the intended total rate blowup (e.g., 5 for rate 1/5).
    // This requires the caller to specify r, or for the constructor to derive it.
  3. crates/crypto/src/linear_code_pcs/prove.rs, line 781-782 (link)

    Security-critical size checks are debug_assert only

    debug_assert_eq!(a.len(), commitment.n_rows);
    debug_assert_eq!(b.len(), commitment.n_cols);

    These checks verify that point.len() matches the number of variables recorded during commit. In release builds they are compiled out entirely. If a caller passes a point of the wrong length, the prover silently produces a proof over a different evaluation point than intended.

    While this does not break soundness (the verifier will catch it via its explicit if a.len() != commitment.n_rows check), it creates a confusing API in production. Regular assert! or an explicit error return would be clearer and safer for a cryptographic library:

Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/sparse_matrix.rs
Line: 933-940

Comment:
**Biased field element generation breaks uniform distribution for large fields**

The value produced by `state >> 33` is at most 31 bits wide (range `[0, 2^31]`). For any field with characteristic greater than `2^31` — including Goldilocks64 (modulus ≈ 2^64), Stark252 (modulus ≈ 2^252), and others — only about `2^31 / p` of the field's elements can ever be generated here. Practically, for Goldilocks64 this covers less than 0.012% of all non-zero field elements.

The Brakedown paper (Golovnev et al., Section 5.1, Theorem 2) explicitly requires **uniform random non-zero field elements** in the A and B matrices for the minimum-distance analysis to hold. With a severely biased distribution, the distance guarantee and the resulting soundness bound (`t = calculate_t(...)` calls) no longer apply.

The same issue exists in `random_binary` at line 996.

A correct approach would be to derive a field element that covers the full range of `F`, e.g., by mapping the full 256-bit hash digest to the field using rejection sampling or a reduction modulo the field characteristic:

```rust
// Use full hash bytes for a uniform field element, not just 31 bits
// Option: use FieldElement::from_bytes_be (if available) or reduce the
// full u256 value modulo the field characteristic.
let val_hash = derive(seed, row_idx, hash_counter);
hash_counter += 1;
let val = FieldElement::<F>::from_bytes_be(&val_hash)
    .unwrap_or(FieldElement::one());
let val = if val == FieldElement::zero() { FieldElement::one() } else { val };
```

This affects the security of the Brakedown PCS when used with any field whose order exceeds `2^31`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/encoding/expander.rs
Line: 316

Comment:
**`b_out_len = compressed_len` deviates from the Brakedown construction**

The comment acknowledges a "simplicity" deviation from the paper, but this is not cosmetic — it affects both the code rate and the achievable minimum distance:

Per Brakedown (Golovnev et al., Section 5.1), the B matrix should output `(1 - α - 1/r) * n_current` extra symbols per level, where `r` is the overall rate blowup. This ensures the total codeword length yields the intended rate `1/r` and relative distance `δ = β/r`.

With `b_out_len = compressed_len = α * n_current`, each level contributes `2α * n_current` extra symbols instead of `(1 - α - 1/r) * n_current`. The actual codeword length is approximately `n / (1 - 2α)` rather than `r * n`. For α = 0.25, the actual rate is ≈ 1/2 but the caller passes `distance = (1, 25)` (δ ≈ 0.04), which would require rate ≥ `(1 + 2β)/(1 - α)` ≈ 4–5 for small β. The resulting mismatch means the distance parameter passed by callers does not correspond to the structure actually constructed.

This makes it impossible to verify from the outside whether the user-supplied distance is achievable with the chosen `b_out_len`.

```rust
// Per Golovnev et al., equation for extra output length at each level:
// b_out_len = ((1 - alpha - 1.0/r) * current_len as f64).ceil() as usize
// where r is the intended total rate blowup (e.g., 5 for rate 1/5).
// This requires the caller to specify r, or for the constructor to derive it.
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/sparse_matrix.rs
Line: 962-1019

Comment:
**`random_binary` is dead code**

`SparseMatrix::random_binary` is never called by `ExpanderEncoding` (which exclusively uses `random_nonzero`) nor from any other part of the new `linear_code_pcs` module. Keeping it increases maintenance burden and creates confusion about which generation strategy is expected to be used for the expander matrices.

If it is intentionally retained for future use or experimentation, it should either be marked `#[allow(dead_code)]` with a clarifying comment, or removed until it is needed.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/prove.rs
Line: 781-782

Comment:
**Security-critical size checks are `debug_assert` only**

```rust
debug_assert_eq!(a.len(), commitment.n_rows);
debug_assert_eq!(b.len(), commitment.n_cols);
```

These checks verify that `point.len()` matches the number of variables recorded during commit. In release builds they are compiled out entirely. If a caller passes a `point` of the wrong length, the prover silently produces a proof over a different evaluation point than intended.

While this does not break soundness (the verifier will catch it via its explicit `if a.len() != commitment.n_rows` check), it creates a confusing API in production. Regular `assert!` or an explicit error return would be clearer and safer for a cryptographic library:

```suggestion
    assert_eq!(a.len(), commitment.n_rows, "point length incompatible with commitment n_rows");
    assert_eq!(b.len(), commitment.n_cols, "point length incompatible with commitment n_cols");
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs
Line: 558-571

Comment:
**Misleading test function name — RS encoding is NOT systematic**

The test is named `encode_is_systematic_at_roots`, but this is a misnomer. A *systematic* code places the original message symbols verbatim as a prefix of the codeword. The RS encoding used here is *not* systematic: the message is treated as polynomial *coefficients*, and the codeword consists of FFT evaluations at roots of unity — none of the codeword positions contain a raw message symbol.

What the test actually verifies is that `cw[0] = poly(ω^0) = poly(1) = sum of coefficients`. This is a useful functional check, but the name `encode_is_systematic_at_roots` misrepresents what the code does. A clearer name would be:

```rust
fn encode_evaluates_to_coefficient_sum_at_unity() { ... }
```

This distinction matters because callers relying on the "systematic" label may incorrectly assume they can read back the message from the first `msg_len` positions of the codeword.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 2dd51b3

Comment thread crates/crypto/src/linear_code_pcs/verify.rs Outdated
Comment thread crates/crypto/src/linear_code_pcs/encoding/expander.rs Outdated
Comment thread crates/crypto/src/linear_code_pcs/prove.rs
Comment thread crates/crypto/src/linear_code_pcs/sparse_matrix.rs
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 11, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 97.96610% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.56%. Comparing base (d7cc54a) to head (2dd51b3).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
crates/crypto/src/linear_code_pcs/verify.rs 91.78% 6 Missing ⚠️
crates/crypto/src/linear_code_pcs/matrix.rs 91.22% 5 Missing ⚠️
crates/crypto/src/linear_code_pcs/sparse_matrix.rs 99.31% 1 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1191      +/-   ##
==========================================
+ Coverage   75.96%   76.56%   +0.60%     
==========================================
  Files         220      231      +11     
  Lines       49043    50573    +1530     
==========================================
+ Hits        37255    38723    +1468     
- Misses      11788    11850      +62     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…ap column openings, fix PRNG bias

- Accept distance as constructor parameter in ExpanderEncoding instead of hardcoding (1, 25)
- Extract calculate_t() to shared utils.rs module, removing duplication between prove.rs and verify.rs
- Cap column opening count t at n_ext_cols to prevent oversized proofs
- Fix biased column selection in sparse_matrix.rs by re-mixing PRNG state on collision instead of linear probing
@github-actions
Copy link
Copy Markdown

Kimi AI Review

Here are my feedback and issues found in the PR diff:

  1. crates/crypto/src/lib.rs:

    • The addition of pub mod linear_code_pcs; is good for organizing the new module. However, it would be beneficial to include a brief documentation comment explaining the purpose of this module in the context of the cryptographic library.
  2. crates/crypto/src/linear_code_pcs/commit.rs:

    • In the commit function, the calculation of n_cols and n_rows based on v seems correct, but it would be good to add a comment explaining why v / 2 is used. This is because the code assumes a specific structure for the polynomial evaluations, and documenting this assumption would improve code readability.
    • The assert_eq! macro call for checking encoding.message_len() against n_cols is important for ensuring the correctness of the encoding process. However, it might be more informative to include a custom error message that explains the relationship between message_len and n_cols.
    • The tensor_vec function is called twice in the prove function, once for each half of the point. It would be beneficial to consider refactoring this code to avoid duplication, possibly by introducing a new function that takes the full point and returns both tensor vectors.
  3. crates/crypto/src/linear_code_pcs/data_structures.rs:

    • The LinCodeCommitment, CommitState, CommitOutput, OpenedColumn, and LinCodeProof structs are well-defined and encapsulate the necessary data for the commit and prove phases. However, it would be good to include documentation comments for each struct, explaining their purpose and the meaning of each field.
  4. crates/crypto/src/linear_code_pcs/encoding/expander.rs:

    • The ExpanderEncoding struct and its implementation look well-structured. However, the use of a mutable seed (level_seed) for generating the sparse matrices could be improved by using a more cryptographically secure pseudorandom number generator (CSPRNG) to ensure better randomness properties.
  5. crates/crypto/src/linear_code_pcs/encoding/mod.rs:

    • This module file is appropriately organized by separating the expander and reed_solomon encodings into their own files.
  6. crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs:

    • The ReedSolomonEncoding struct and its implementation are clear and well-documented. However, the use of rho_inv as a usize could be improved by using a different type that better represents the concept of the inverse rate, such as a f64 or a Fraction type.
  7. crates/crypto/src/linear_code_pcs/matrix.rs:

    • The Matrix struct and its implementation are straightforward and serve their purpose well. However, the row_mul function could be optimized by using more efficient linear algebra techniques, such as using a vectorized dot product if the target architecture supports SIMD instructions.
  8. crates/crypto/src/linear_code_pcs/mod.rs:

    • This module file is appropriately organized, separating the various components of the linear-code PCS into their own files.
  9. crates/crypto/src/linear_code_pcs/prove.rs:

    • The prove function is well-structured and follows the expected steps for generating an evaluation proof. However, as mentioned earlier, there is an opportunity to refactor the code to avoid duplication when computing the tensor vectors.
  10. crates/crypto/src/linear_code_pcs/sparse_matrix.rs:

    • The SparseMatrix struct and its implementation are clear and well-documented. However, the use of a Vec<Vec<(usize, FieldElement<F>)>> for storing the sparse matrix could be improved by using a more memory-efficient data structure, such as a CompressedSparseRow (CSR) format.
  11. crates/crypto/src/linear_code_pcs/tests.rs:

    • The test cases provided are comprehensive and cover various scenarios, including different numbers of variables and different encoding schemes. However, it would be good to include additional tests that specifically target edge cases, such as when the number of variables is very small or when the evaluation point is at the boundary of the domain.
  12. crates/crypto/src/linear_code_pcs/traits.rs:

    • The LinearCodeEncoding trait is well-defined and provides a clear interface for different encoding schemes. However, it would be beneficial to include documentation comments explaining the purpose of each method and the expected behavior.
  13. crates/crypto/src/linear_code_pcs/utils.rs:

    • The calculate_t function is a utility function that calculates the number of column openings needed for a given security parameter. However, the calculation could

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review by ChatGPT

}

Self {
msg_len,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  1. Mathematical Operations: Code changes do not show any mathematical operations or changes in modular arithmetic or polynomial calculations.
  2. Edge Cases: Assertions check for valid distance values, but there are no checks for overflows in multiplication or addition during matrix generation, which could be critical.
  3. Infinity Points: Not applicable to this change as no elliptic curve operations are shown.

Security

  1. Constant-Time Operations: There is no evidence in this snippet that constant-time guarantees are needed or implemented for secrets.
  2. Zeroization: No handling of sensitive data is shown in the given snippet.
  3. Cryptographically Secure Randomness: The seed is passed as u64, but ensure cryptographically secure RNG initialization if used for generating matrices.

Performance

  1. Unnecessary Allocations: There seem to be vector allocations for matrices without checking for efficient sizing or reuse.
  2. Efficient Operations: Repeated wrapping_add for the seed could lead to unintended collisions; using a more robust method or a larger datatype might mitigate.

Bugs & Errors

  1. Assertions: Assertions should be used carefully, as they will not run in a release build. Switch to error handling if runtime safety is crucial.
  2. Potential Panics: The excessive dependence on user-supplied distance values can lead to panics if not properly bounded.

Code Simplicity

  1. Complexity: The use of tuple (usize, usize) for distance could be abstracted into a domain-specific type with validations.

Summary

  • Add overflow checks and consider better randomness management. Use assertions cautiously as they might create discrepancies between debug and release builds. Overall, ensure that operations on sizes and indices are safe.
  • Consider replacing assertions with proper error handling and checking bounds against overflows.
  • Optimize performance by reevaluating RNG and vector allocation strategies.
  • Address the use of constants or better-abstraction methods for domain-specific mathematical parameters.

pub mod matrix;
pub mod prove;
pub mod sparse_matrix;
pub mod traits;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Correctness: No mathematical operations are detailed in this snippet, so potential issues in modular arithmetic, field operations, or curve operations could not be specifically addressed.
  • Security: Verify if the new utils module introduces any secret-dependent branching or lacks constant-time operations for secrets, as these can introduce timing side-channels.
  • Zeroization: Ensure that any sensitive data handling in the new module or imports are properly zeroized.
  • Randomness: Check for cryptographically secure randomness if any RNG functionality has been introduced in the utils.
  • Performance: If utils introduces operations involving MSM or FFT, evaluate them for unnecessary allocations or redundant calculations.
  • Errors and Bugs: There's a pointer to potential issues if functions related to proof systems are moved to utils. Ensure there's no improper handling that may lead to panics or overflow.
  • Code Simplicity: The introduction of a utils module could indicate the presence of utility functions that may abstract common logic, but could also suggest grouping unrelated functionalities leading to complexity. Ensure no duplication or tangled abstractions appear.

Without additional details on the specific changes made within each module or function, particularly utils, a thorough review isn't possible. I recommend inspecting the full code changes in these areas before proceeding with the merge.

point: &[FieldElement<F>],
transcript: &mut impl IsTranscript<F>,
sec_param: usize,
) -> LinCodeProof<F, B>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • The logic for calculating t has been moved from the local function to utils::calculate_t. Please ensure that the new calculate_t implementation in utils is correct, especially regarding edge cases like zero or small delta_num and delta_den.

Security

  • Verify that the random sampling in transcript.sample_u64(n_ext_cols as u64) is cryptographically secure and free from timing side-channels.

Performance

  • Consider the efficiency of sampling col_indices. If t approaches n_ext_cols, it could be more efficient to shuffle a list of indices and select the first t elements.

Bugs & Errors

  • Ensure the transcript.sample_u64 function handles edge cases for zero or near-zero probabilities correctly to avoid panic or unexpected behavior.

Code Simplicity

  • The move to utils::calculate_t seems appropriate, providing calculate_t is not overly complex and well-documented. Ensure that the abstraction does not hide important logical details that might confuse future developers.

#[derive(Clone, Debug)]
pub struct SparseMatrix<F: IsField> {
pub n_rows: usize,
pub n_cols: usize,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • The change implements a more randomized probing strategy by re-mixing the PRNG state, which should reduce clustering bias. However, care must be taken to ensure that this reformulated approach maintains the uniform distribution over the columns, which can affect the correctness when constructing sparse matrices.

Security:

  • Ensure that the PRNG remains suitable for cryptographic purposes when used in contexts that require unpredictability.
  • Verify that the re-mixed PRNG state does not introduce any timing side-channel vulnerabilities.

Performance:

  • This change does not introduce any obvious performance bottlenecks; however, confirming that the revised probing method does not significantly degrade the performance in practice would be advisable.

Bugs & Errors:

  • Given that this section of code involves wrapping with wrapping_* methods, ensure that integer overflow is handled as expected for the intended behavior.

Code Simplicity:

  • The change simplifies the probing strategy logic by removing direct linear probing in favor of a more randomized approach, which could lead to cleaner code handling in the long run.

Overall, while the code change addresses clustering bias, further assurance on the uniformity, potential timing side-channels, and performance impacts could strengthen the review.

type B = FieldElementVectorBackend<F, Keccak256, 32>;

fn make_poly(n_vars: usize) -> (Vec<FE>, DenseMultilinearPolynomial<F>) {
let n = 1usize << n_vars;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • The change in ExpanderEncoding::<F>::new function now includes an additional parameter (1, 25). Ensure that the function new() and the associated scheme properly utilize this parameter, especially if it impacts the encoding logic or security properties.

Security:

  • No explicit timing-attack mitigations are mentioned. If this new() function or the encoding affects sensitive data, such as keys, ensure that all operations are constant-time.
  • Verify that any randomness used within this context remains cryptographically secure, though this snippet does not indicate such usage directly.

Performance:

  • The segments shown do not immediately suggest performance issues, but if the new() method in ExpanderEncoding results in additional file or memory allocations due to added parameters, these should be reviewed.

Bugs & Errors:

  • Ensure that management of any new state derived from the parameter (1, 25) does not introduce potential for errors, particularly if index-based accesses are performed.

Code Simplicity:

  • Changes are consistent across all instances in the code snippet, suggesting a thematic or necessary update, possibly indicating proper abstraction has been maintained.

Conclusion:

  • Double-check how the additional parameter (1, 25) affects the behavior. Ensure no logical errors are introduced either in correctness or security with this new parameter's usage. Until verified, this code review considers the changes provide a potential risk.

let log_factor = -(1.0 - half_delta).log2();
let t = (sec_param as f64 / log_factor).ceil() as usize;
t.min(n_ext_cols)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • The operation (delta_num as f64 / delta_den as f64) / 2.0 might be vulnerable to precision issues when using floating-point arithmetic for division. Consider using integer arithmetic first when possible.
  • Make sure that delta_num <= delta_den to prevent the denominator from being zero when calculating half_delta.

Security:

  • No obvious side-channel vulnerabilities since this function is likely not operating directly on secret data.

Performance:

  • The use of floating-point operations and ceil might introduce unnecessary overhead. However, since delta has a real number nature, this might be unavoidable.

Bugs & Errors:

  • The function does not handle cases where log2 operates on non-positive numbers; this could lead to undefined behavior.
  • The function should check for division by zero to protect against inputs where delta_den might be zero, which should be treated as an error or handled gracefully.

Code Simplicity:

  • Overall, the code is straightforward but ensure proper validation checks are in place for inputs to maintain safety and correctness.

To improve the current code, ensure robust input validation and reconsider the use of floating-point arithmetic if precision is a concern.

/// - Consistency: `<a, col_j> == w[j]` (the inner product of tensor-half `a`
/// with the column equals the encoded value at that position).
/// 5. Verify `<v, b> == claimed_value`.
///
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • Polynomial and FFT Implementations: Ensure that any changes related to polynomial operations, especially if they involve inverse FFT or multiplication, are carefully reviewed for consistency with mathematical definitions. No specific issues are noted, but any error in FFT can severely affect the correctness of results.
  • Edge Cases: There are no visible checks in the provided diff for handling special cases such as zero denominators in calculate_t or invalid n_ext_cols. It's essential to handle such cases to avoid division by zero or invalid memory accesses.

Security:

  • Timing Side-Channels: Ensure that no operations use secret-dependent branching. There is no visible check for constant-time operations in the provided diff.
  • Randomness: The transcript.sample_u64 line suggests sampling randomness. Ensure transcript is properly seeded and cryptographically secure.
  • Hash Function Domain Separation: Although not directly visible here, ensure any associated hash functions used for commitments or proofs include domain separation to avoid cross-protocol attacks.

Performance:

  • Unnecessary Allocations: Allocating a Vec<usize> for column indices could potentially be optimized if the size is known or can be efficiently managed.

Bugs & Errors:

  • Potential Panics or Unwraps: If n_ext_cols can be zero, the calculation transcript.sample_u64(n_ext_cols as u64) might cause panic due to invalid range in sampling. Additionally, ensure calculate_t handles invalid inputs gracefully.
  • Memory Safety: There are no explicit checks for potential buffer overflows or underflows in array accesses. Ensure all accesses, especially using sampled indices, respect array boundaries.

Code Simplicity:

  • The logic for calculate_t was encapsulated in utils, which is a good step towards simplicity, but ensure abstraction doesn't hide essential error handling. Ensure calculate_t function is well-documented and handles edge cases properly.

Overall, addressing security concerns and handling edge cases can provide improvements to make it merge-ready.

…nce docs

- Bind point and claimed_value to Fiat-Shamir transcript in prove/verify
  to prevent proof replay attacks with different evaluation points
- Fix calculate_t to use full delta instead of delta/2 (was over-conservative)
- Replace LCG-based row seeding in SparseMatrix with SHA3-256 hash-based
  per-row seeds to eliminate inter-row correlation
- Add safety documentation for unverified distance parameter in ExpanderEncoding
- Add regression test for proof replay and unit tests for calculate_t
@github-actions
Copy link
Copy Markdown

Kimi AI Review

Here are my detailed comments on the PR diff:

  1. crates/crypto/src/lib.rs:

    • New linear_code_pcs module added. Looks good.
  2. crates/crypto/src/linear_code_pcs/commit.rs:

    • In commit function, assert!(n.is_power_of_two(), "evals length must be a power of two"): This is important for ensuring the polynomial commitment scheme works correctly. Good check.
    • In commit function, assert_eq!(n_cols, encoding.message_len(), "encoding message_len must equal matrix column count"): This ensures the linear code encoding is compatible with the polynomial degree. Good check.
    • In commit function, debug_assert_eq!(encoded_row.len(), n_ext_cols): This checks the encoded row length matches the expected number of extended columns. Good sanity check.
    • The commit function overall looks correct and well-structured.
  3. crates/crypto/src/linear_code_pcs/data_structures.rs:

    • LinCodeCommitment, CommitState, CommitOutput, OpenedColumn, LinCodeProof: These structs are well-defined and contain all necessary fields. Good job.
  4. crates/crypto/src/linear_code_pcs/encoding/expander.rs:

    • ExpanderEncoding struct and implementation looks good. The recursive sparse matrix multiplication is implemented efficiently.
    • Safety warning in new function doc comment about unverified distance parameter is important. This is a critical assumption for soundness.
    • encode method correctly applies the sparse matrix multiplication recursively. Good implementation.
  5. crates/crypto/src/linear_code_pcs/encoding/mod.rs:

    • New module structure for encoding backends (expander, reed_solomon). Looks good.
  6. crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs:

    • ReedSolomonEncoding struct and implementation looks good. The FFT-based encoding is implemented correctly.
    • Safety check in new function for msg_len * rho_inv being a power of two is important. Good check.
  7. crates/crypto/src/linear_code_pcs/matrix.rs:

    • Matrix struct and implementation looks good. The row-major representation is suitable for the polynomial commitment scheme.
    • row_mul method correctly left-multiplies the matrix by a row vector. Good implementation.
  8. crates/crypto/src/linear_code_pcs/mod.rs:

    • New module structure looks good.
  9. crates/crypto/src/linear_code_pcs/prove.rs:

    • prove function implementation looks correct. It correctly computes the tensor decomposition, recomputes the transcript, and generates the proof.
    • Binding all statement parameters to the transcript before deriving column indices is important for soundness. Good check.
    • Determining the number of column openings based on security parameter and code distance is done correctly using calculate_t. Good implementation.
  10. crates/crypto/src/linear_code_pcs/sparse_matrix.rs:

    • SparseMatrix struct and implementation looks good. The Compressed Sparse Row (CSR) format is suitable for efficient sparse matrix multiplication.
    • random_binary method correctly generates a random sparse matrix with the specified number of nonzero entries per row. Good use of SHA3-256 for pseudorandom column selection.
  11. crates/crypto/src/linear_code_pcs/tests.rs:

    • Comprehensive test suite covering Ligero, Brakedown, tensor product, matrix arrangement, etc. Good test coverage.
    • Tests correctly check the polynomial commitment scheme functionality, including commit, prove, verify, and edge cases. Good tests.
  12. crates/crypto/src/linear_code_pcs/traits.rs:

    • LinearCodeEncoding trait is well-defined. It captures the necessary functionality for linear code encoding backends. Good job.
  13. crates/crypto/src/linear_code_pcs/utils.rs:

    • calculate_t function correctly calculates the number of column openings based on security parameter and code distance. Good implementation.
  14. crates/crypto/src/linear_code_pcs/verify.rs:

    • verify function implementation looks correct. It correctly recomputes the tensor vectors, rederives the transcript, checks the Merkle proofs, and verifies the proof consistency and evaluation.
    • Replaying the transcript to rederive the column indices is important for soundness. Good check.
    • Verifying the Merkle proofs and consistency of each opened column is done correctly. Good checks.
    • Final evaluation check <v, b> == claimed_value is important for verifying the proof. Good check.

Overall, the PR diff looks good. The polynomial commitment scheme implementation using linear-code PCS (Ligero and Brake

- Use libm::ceil/libm::log2 instead of f64 methods for no_std compat
- Replace .clone() with Copy semantics in tests for Copy field elements
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review by ChatGPT

Comment thread crates/crypto/Cargo.toml
libm = "0.2"
# CUDA
cudarc = { version = "0.9.7", optional = true }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code introduces a new dependency on libm, which is typically used for mathematical operations on floating-point numbers. Care should be taken to ensure that its use does not introduce floating-point inaccuracies in cryptographic operations, which require strict integer arithmetic.

Correctness:

  • Verify that all mathematical operations added with libm adhere to modular arithmetic without floating-point errors.
  • Ensure that edge cases, particularly regarding zero, infinity, and identity elements, are handled consistently throughout any newly introduced operations.
  • Ensure polynomial operations and FFT implementations are correct, locking down any potential edge cases.

Security:

  • Cryptographic operations should not depend on libm functions for divisors or multipliers to prevent timing side-channels.
  • Ensure that secret data is zeroized effectively; check any newly introduced operations conform with this requirement.
  • Avoid using libm for operations on secret data unless proven to be constant-time.

Performance:

  • Review the impact of using libm on performance, ensuring that it does not result in unnecessary allocations or degraded FFT efficiency.
  • Check for redundant field inversions that could lead to performance hits.

Bugs & Errors:

  • Ensure that libm does not result in off-by-one errors, especially given its native floating-point bias.
  • Check all uses of floating-point operations within loops could potentially cause overflow or underflow.
  • Ensure no unwraps or potential panics pathways are added with this dependency change.

Code Simplicity:

  • Verify that introducing libm has not unnecessarily complicated math operations. Ensure the library's addition does not duplicate existing functionality provided elsewhere in the library.
  • Review code abstractions to avoid poor implementations resulting from the introduction of libm.

use crate::linear_code_pcs::traits::LinearCodeEncoding;

/// Expander-code encoding backend for the Brakedown PCS.
///
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • The change from using f64.ceil() to libm::ceil() may have been done for compatibility or performance reasons, but functionally both should work the same under normal circumstances. Ensure that libm::ceil() is correctly linked and behaves consistently across all target environments.

Security

  • Verify that any operations involving sensitive data, such as secrets, are constant-time to prevent timing side-channel attacks.

Performance

  • No comments on performance impacts; ensure that the use of libm::ceil() optimizes the intended target architecture.

Bugs & Errors

  • Ensure that libm::ceil() does not introduce any potential for underflow or overflow when casting floating-point results to an integer with as usize. This would typically be fine if out_len has been correctly validated, as with the out_len.max(1) logic ensuring it is at least 1.

Code Simplicity

  • No immediate concerns unless libm::ceil() requires additional setup or dependency management that might complicate the build process.

Consider verifying cross-platform compatibility, particularly around the library usage of libm, and running relevant test cases to ensure no regressions are introduced.

// Prove
let mut transcript_p = DefaultTranscript::<F>::new(&[0x42]);
let proof = prove(
&out.commitment,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Overall, the mathematical operations on finite fields seem correct and idiomatic Rust code is used to avoid unnecessary cloning.
  • Ensure that edge cases, like operations involving zero or identity elements, are thoroughly covered by corresponding tests.

Security

  • Timing side-channels: The current code does not exhibit clear evidence of constant-time operation on secrets. Ensure that secret-dependent operations, especially in modular arithmetic and field operations, are performed without timing vulnerabilities.
  • Proper zeroization: No explicit zeroization of sensitive data is observed. Consider incorporating methods to zeroize secret data to prevent leakage through memory.
  • Cryptographically secure randomness & secret-dependent branching: While the given code snippet doesn't deal with randomness, ensure all randomness throughout the library is secure and avoid branches dependent on secret data.
  • Hash function domain separation wasn't mentioned in the review, ensure the usage across the library to prevent unintended collisions.

Performance

  • The removal of unnecessary cloning improves performance but verify the absence of unnecessary allocations in larger contexts like matrix or polynomial operations.
  • FFT efficiency: Make sure FFT implementations are efficient, especially in cases involving power-of-two input sizes since they can leverage optimized algorithms.

Bugs & Errors

  • Potential panics or unwraps: Ensure all lengths and accesses, like the ones in the tensor operations, are valid and not prone to underflow/overflow or panics.
  • Memory safety: The use of Rust should mitigate most memory safety concerns, but always ensure safe concurrency if applicable.

Code Simplicity

  • The recent changes have removed redundant code, simplifying the implementation. Continue to refactor duplicated code for clarity.

Overall, ensure all critical aspects of both correctness and security are verified throughout the cryptographic library, beyond the shown portions, particularly concerning constant-time operations and proper zeroization.

t.min(n_ext_cols)
}

#[cfg(test)]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Floating-Point Precision: The use of floating-point arithmetic for critical computations like log2 can result in precision errors. Investigate if this precision is acceptable for the security and correctness of your cryptographic application.

Security

  • Timing Side-Channels: The current code, especially in the function calculate_t, does not appear to handle any sensitive information, so timing side-channel risk is minimal here. However, ensure this applies broadly in the rest of the library.
  • Sensitive Data Handling: Ensure that any sensitive data handled elsewhere within this code is properly zeroized.
  • Cryptographically Secure Operations: Utilize cryptographically secure methods for any randomness within the library.

Performance

  • Use of libm: The change to libm suggests a move to more accurate mathematical functions. This is generally a good practice but ensure these performance characteristics align with desired efficiency.
  • Allocation and Operations Efficiency: No evident excessive allocations or redundant operations in this provided snippet.

Bugs & Errors

  • Potential Panics: No direct potential for panics in this code snippet. Ensure any dependencies or the larger codebase have adequate error handling.
  • Integer Overflow/Underflow: The as usize coercion is a potential point for errors if values are unexpectedly negative or excessively large. Ensure calling code ensures bounds.

Code Simplicity

  • Complexity Reduction: The code effectively uses libm functions to reduce complexity and improve mathematical precision, which is a positive change.
  • Duplication and Abstraction: No apparent duplication in the given snippet.

Further investigation into the overall cryptographic operations and broader code context is recommended to ensure these initial assessments hold at the library level.

@github-actions
Copy link
Copy Markdown

Kimi AI Review

Here are my review comments on the PR diff:

  1. Cargo.toml:

    • Adding libm = "0.2" seems fine for providing math functions like ceil. However, consider using std::f64::ceil instead of libm if possible to reduce dependencies.
  2. src/lib.rs:

    • Adding pub mod linear_code_pcs; is good to expose the new module.
  3. src/linear_code_pcs/commit.rs:

    • The commit function seems well-structured. A few suggestions:
      • Consider adding more documentation explaining the purpose and behavior of the function.
      • The assert!(n.is_power_of_two(), ...) check is good for ensuring the input length is a power of two.
      • The assert_eq!(n_cols, encoding.message_len(), ...) check is good for ensuring the encoding message length matches the matrix column count.
  4. src/linear_code_pcs/data_structures.rs:

    • The new data structures like LinCodeCommitment, CommitState, etc. seem well-defined. A few suggestions:
      • Add more documentation explaining the purpose and fields of each struct.
      • Consider using #[derive(Debug)] for structs that will be printed for debugging.
  5. src/linear_code_pcs/encoding/expander.rs:

    • The ExpanderEncoding struct and implementation look good. A few suggestions:
      • The new function has a safety comment about verifying the distance parameter. Consider adding a post-condition check in debug mode to ensure the distance holds.
      • The encode function seems efficient with O(n) complexity due to sparse matrix multiplication. Good job!
      • Consider adding more documentation explaining the expander code construction and parameters.
  6. src/linear_code_pcs/encoding/reed_solomon.rs:

    • The ReedSolomonEncoding struct and implementation look good. A few suggestions:
      • Add more documentation explaining the Reed-Solomon code construction and parameters.
      • The encode function uses FFT-based polynomial evaluation, which is efficient. Good job!
  7. src/linear_code_pcs/matrix.rs:

    • The Matrix struct and implementation seem well-defined. A few suggestions:
      • Add more documentation explaining the purpose and methods of the matrix.
      • The row_mul method is useful for left-multiplying a row vector with the matrix. Consider adding a col_mul method for right-multiplication as well.
  8. src/linear_code_pcs/prove.rs:

    • The prove function seems well-structured. A few suggestions:
      • Consider adding more documentation explaining the purpose and behavior of the function.
      • The use of Fiat-Shamir for deriving random challenges is good for non-interactive proofs.
  9. src/linear_code_pcs/sparse_matrix.rs:

    • The SparseMatrix struct and implementation look good. A few suggestions:
      • Add more documentation explaining the purpose and methods of the sparse matrix.
      • The random_binary function for generating random sparse matrices is useful. Consider adding a seed parameter for reproducibility.
  10. src/linear_code_pcs/verify.rs:

    • The verify function seems well-structured. A few suggestions:
      • Consider adding more documentation explaining the purpose and behavior of the function.
      • The verification checks seem thorough, including replaying the transcript, checking Merkle proofs, and verifying the final evaluation.

Overall, the PR adds a new linear-code PCS implementation with both Reed-Solomon and expander encodings. The code is well-structured, documented, and efficient. Keep up the good work! A few minor suggestions are provided above to further improve the code quality and documentation.

…ix expander

Three fixes based on comparison with the Brakedown paper (Golovnev et al.,
CRYPTO 2023):

1. calculate_t: use δ/3 (not δ) as the per-column catch probability.
   From Lemma 1, p.14: the soundness error of the testing phase is
   (1 - δ/3)^t, so t = ceil(λ / log2(1/(1 - δ/3))). The previous
   formula using full δ opened ~3.8x fewer columns than required.

2. Expander encoding: implement two-matrix construction (A + B) per
   recursion level, matching Algorithm 1 in Section 5.1. The previous
   single-matrix construction (x, Ax, A²x, ...) lacked the B mixing
   matrix needed for the distance guarantee.

3. Sparse matrix: add random_nonzero() constructor with random non-zero
   field element entries instead of all-ones, as required by Section 5.1
   ("uniform random non-zero elements of F").

Also adds a linearity test for the expander encoding.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review by ChatGPT

#[test]
fn encode_deterministic() {
let enc = ExpanderEncoding::<F>::new(16, 0.25, 4, 2, 42, (1, 25));
let msg: Vec<FE> = (0..16).map(|x| FE::from(x as u64)).collect();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • Distance Parameter: The provided implementation does not inherently verify the correctness of the distance parameter. Given the reliance on this parameter for soundness, it can pose risks. Ensure any calls to new set this correctly, based on a robust spectral expansion analysis.

Security:

  • Sensitive Data: The code does not explicitly zeroize sensitive data. Consider adding zeroization for the codeword and intermediate vectors to ensure sensitive data isn't left in memory longer than necessary.
  • Randomness: The random matrix generation should use a cryptographically secure PRNG, especially since they directly impact the security properties (e.g., a precomputed attack could exploit poor randomness).

Performance:

  • Sparse Matrix Multiplication: Ensure the SparseMatrix.mul_vec implementation is efficient, particularly for large vectors typical in these contexts. Sparse optimizations should be verified.

Bugs & Errors:

  • Potential Panics: The use of libm::ceil without bounds checking on msg_len, alpha, or initialization parameters may inadvertently cause out-of-bounds behavior or panic if assumptions about vector sizes don't hold.
  • Integer Overflow: The code does not currently guard against potential integer overflow in size calculations, such as in cw_len accumulation. Use checked arithmetic where possible.

Code Simplicity:

  • Abstraction: The ExpanderLevel struct introduces some additional clarity but could benefit further from separating creation concerns (random matrix generation) into its own function, improving modularity.
  • Seed Manipulation: The use of wrapping_add for seed manipulation ensures no overflow, which is good, but better documentation on seed derivation could assist future maintainers.

Overall, ensure that cryptographic robustness is maintained through verified randomness and sensitive data handling. Once addressed, particularly the potential security and correctness issues, the implementation will provide a sound base for the PCS expander encoding.

let hash = derive(seed, row_idx, 0);
let mut state = u64_from_hash(&hash);
let mut hash_counter: u64 = 1;
let mut used = alloc::collections::BTreeSet::new();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness

  • Mathematical Operations: The implementation should ensure that the field values are non-zero as intended. However, the current conversion of val_raw via FieldElement::<F>::from(val_raw) may not correctly map to non-zero elements across all fields, particularly if val_raw % (p-1) + 1 is calculated incorrectly for certain field sizes.

Security

  • Timing Side-Channels: Be cautious of the while loop used in the selection of the column index (while used.contains(&col)). This loop could potentially reveal information about the distribution of non-zero entries depending on how long it runs.
  • Cryptographically Secure Randomness: The randomness is derived using SHA-3 in a deterministic manner. The use of a PRNG here should ensure that it aligns with cryptographic best practices for randomness.

Performance

  • Unnecessary Allocations: Using BTreeSet to track used column indices might be an overhead for large matrices. Consider more space-efficient structures if permissible by the use-case characterizing large n_cols.

Bugs & Errors

  • Potential Panics: The code uses assert!(nnz_per_row <= n_cols); which will panic if the condition fails. This assertion should either be replaced with error handling or accompanied by a precise error message if it is intended to stay.

Code Simplicity

  • Complex Branching: The conditional setting of field element values using if val == FieldElement::zero() { FieldElement::one() } else { val } indicates duplication, which could hint at suboptimal field element binding. This logic might be simplified by ensuring that the value chosen always falls into [1, p-1].

This function has notable concerns in terms of correctness and potential non-uniform randomness distribution. Addressing these concerns is essential before recommending the merge.

@github-actions
Copy link
Copy Markdown

Kimi AI Review

Here are my review comments on the PR diff:

  1. Cargo.toml:

    • Adding libm = "0.2" seems fine for using floating point operations in Rust. However, we should ensure that all transitive dependencies are compatible and don't introduce any security issues.
  2. lib.rs:

    • Adding pub mod linear_code_pcs; is good for organizing the new module. However, we should consider if there are any naming conflicts or better organization schemes.
  3. commit.rs:

    • The commit function seems well-structured. However, we should ensure that the evals vector is not modified during the commit process to avoid any potential security issues.
    • The use of debug_assert_eq! for checking the length of encoded_row is fine for debugging purposes, but we should also have proper error handling in case the lengths don't match.
  4. data_structures.rs:

    • The new data structures seem well-defined. However, we should ensure that they are properly documented and their invariants are maintained throughout the codebase.
  5. encoding/expander.rs:

    • The ExpanderEncoding struct and its implementation seem well-structured. However, we should ensure that the recursive encoding process is correctly handled and doesn't introduce any security issues.
    • The use of libm::ceil for calculating compressed_len is fine, but we should ensure that the result is always a power of two as required.
  6. encoding/mod.rs:

    • The new module structure seems fine. However, we should consider if there are any naming conflicts or better organization schemes.
  7. encoding/reed_solomon.rs:

    • The ReedSolomonEncoding struct and its implementation seem well-structured. However, we should ensure that the FFT-based encoding process is correctly handled and doesn't introduce any security issues.
  8. matrix.rs:

    • The Matrix struct and its implementation seem well-structured. However, we should ensure that the matrix operations are correctly handled and don't introduce any security issues.
  9. prove.rs:

    • The prove function seems well-structured. However, we should ensure that the Fiat-Shamir challenge generation is correctly handled and doesn't introduce any security issues.
    • The use of debug_assert_eq! for checking the lengths of a and b is fine for debugging purposes, but we should also have proper error handling in case the lengths don't match.
  10. sparse_matrix.rs:

    • The SparseMatrix struct and its implementation seem well-structured. However, we should ensure that the sparse matrix operations are correctly handled and don't introduce any security issues.
  11. traits.rs:

    • The new trait definitions seem fine. However, we should ensure that they are properly documented and their invariants are maintained throughout the codebase.
  12. utils.rs:

    • The calculate_t function seems well-structured. However, we should ensure that the security parameter calculations are correctly handled and don't introduce any security issues.
  13. verify.rs:

    • The verify function seems well-structured. However, we should ensure that the verification process is correctly handled and doesn't introduce any security issues.
    • The use of debug_assert_eq! for checking the lengths of a and b is fine for debugging purposes, but we should also have proper error handling in case the lengths don't match.

Overall, the PR seems well-structured and organized. However, we should carefully review the security implications of the new code, especially around the use of floating point operations, challenge generation, and matrix operations. Additionally, we should ensure that the new code is properly documented and its invariants are maintained throughout the codebase.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review by ChatGPT

/// y = x · A ∈ F^{αn} (compress via matrix A)
/// z = Enc_{αn}(y) ∈ F^{r·αn} (recursive encoding)
/// v = z · B ∈ F^{(r-1-rα)n} (mix via matrix B)
/// ```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness:

  • Field Operations: The code generally looks correct in terms of field operations, but ensure that operations are performed modulo the field order and handle edge cases like zero or identity elements correctly.
  • Handling of Scalars: It's important to verify that scalars like a and b are within the field and their operations are field operations.

Security:

  • Timing Side-channels: The changes avoid extra cloning and use direct references which is good, but review the rest of the codebase to ensure constant-time operations when working with secrets.
  • Zeroization: There's no indication here that sensitive data is properly zeroed after use. Ensure this is covered in the overall library.
  • Randomness & Hashing: No evidence in this snippet of checking cryptographic randomness. Additionally, ensure hash functions provide domain separation if used.

Performance:

  • Cloning Avoidance: The changes remove unnecessary cloning, which should slightly improve performance by reducing allocations and unnecessary copies.
  • Efficiency of Operations: Multiplications seem consistent; ensure that in more extensive computations like MSM and FFT, performance is maximized.

Bugs & Errors:

  • Potential Panics: No explicit checks for panics, such as unwrap or explicit index accesses, which is good but verify there are no hidden ones elsewhere.
  • Memory Safety: Given Rust's safety guarantees, memory safety should be inherently managed.
  • Off-by-one & Overflow: Not applicable to this snippet, but ensure tests cover these issues thoroughly.

Code Simplicity:

  • Simplification: The code change makes it more straightforward and eliminates duplication which is beneficial. Ensure similar simplicity elsewhere, especially in complex algorithms.

In summary, the code changes themselves are positive but holistic checks for security issues, especially regarding zeroization and timing safety across the library, are essential. Until this is verified, the recommendation is not to merge.

@github-actions
Copy link
Copy Markdown

Kimi AI Review

The PR diff introduces a significant amount of new code related to linear-code polynomial commitment schemes (PCS). Here are the issues I found:

  1. libm dependency: The libm crate is added as a dependency to use libm::ceil. However, Rust's standard library already includes the ceil method on floating-point types, so there's no need for an external dependency. Replace libm::ceil with (num as f64).ceil() as usize where num is the value to be天花板函数.

    • File: src/linear_code_pcs/utils.rs, line 10
  2. Missing derive macros: Some structs are missing derive macros for Clone and Debug traits, which are required by the trait bounds in the LinearCodeEncoding trait. Add #[derive(Clone, Debug)] to the following structs:

    • File: src/linear_code_pcs/data_structures.rs, struct LinCodeCommitment
    • File: src/linear_code_pcs/data_structures.rs, struct CommitState
    • File: src/linear_code_pcs/data_structures.rs, struct CommitOutput
    • File: src/linear_code_pcs/data_structures.rs, struct OpenedColumn
    • File: src/linear_code_pcs/data_structures.rs, struct LinCodeProof
  3. Unnecessary alloc crate usage: The alloc crate is used for Vec, but it's not necessary when targeting Rust's standard library. Replace alloc::vec::Vec with Vec:

    • File: src/linear_code_pcs/commit.rs, line 3
    • File: src/linear_code_pcs/data_structures.rs, line 3
    • File: src/linear_code_pcs/encoding/expander.rs, line 3
    • File: src/linear_code_pcs/encoding/reed_solomon.rs, line 3
    • File: src/linear_code_pcs/matrix.rs, line 3
    • File: src/linear_code_pcs/sparse_matrix.rs, line 3
  4. Missing PhantomData in ReedSolomonEncoding: The ReedSolomonEncoding struct should hold a PhantomData marker to indicate its generic parameter F is used:

    • File: src/linear_code_pcs/encoding/reed_solomon.rs, line 18
    #[derive(Clone, Debug)]
    pub struct ReedSolomonEncoding<F: IsFFTField> {
        /// Length of the input message (number of polynomial coefficients).
        msg_len: usize,
        /// Inverse of the code rate: `codeword_len = msg_len * rho_inv`.
        rho_inv: usize,
        _marker: core::marker::PhantomData<F>,
    }
  5. Unnecessary AsRef<[u8]> bound: The AsRef<[u8]> trait bound on B::Node is not necessary since MerkleTree::build and MerkleTree::get_proof_by_pos already handle the node data. Remove this bound:

    • File: src/linear_code_pcs/prove.rs, line 13
    • File: src/linear_code_pcs/verify.rs, line 13
  6. Missing #[cfg(test)] attribute: Some test modules are missing the #[cfg(test)] attribute, which is required to compile them only when running tests:

    • File: src/linear_code_pcs/mod.rs, line 11
  7. Unnecessary clone calls: There are several unnecessary clone calls that can be removed to improve performance:

    • File: src/linear_code_pcs/commit.rs, line 47
    • File: src/linear_code_pcs/sparse_matrix.rs, line 48
    • File: src/linear_code_pcs/sparse_matrix.rs, line 83
  8. Missing #[cfg(test)] attribute on test functions: Some test functions are missing the #[cfg(test)] attribute, which is required to compile them only when running tests:

    • File: src/linear_code_pcs/tests.rs, line 14
    • File: src/linear_code_pcs/tests.rs, line 34
    • File: src/linear_code_pcs/tests.rs, line 54
    • File: src/linear_code_pcs/tests.rs, line

@diegokingston
Copy link
Copy Markdown
Collaborator Author

@greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants