feat(crypto): add Ligero/Brakedown linear-code polynomial commitment …#1191
feat(crypto): add Ligero/Brakedown linear-code polynomial commitment …#1191diegokingston wants to merge 6 commits into
Conversation
…scheme Implement a hash-based PCS for multilinear polynomials supporting two pluggable encoding backends: - Reed-Solomon (Ligero): O(N log N) prover via FFT, requires FFT-friendly field - Expander codes (Brakedown): O(N) prover via sparse matrix multiply, field-agnostic No trusted setup, post-quantum secure. Uses existing FieldElementVectorBackend for Merkle commitments and Fiat-Shamir transcript for non-interactive proofs. Tested with Goldilocks (Ligero) and Mersenne31 (Brakedown) fields.
Kimi AI ReviewThis PR introduces a new module for linear-code Pairing-Based Cryptography (PCS) schemes, specifically Ligero and Brakedown. Here are my detailed comments and feedback on the changes:
Overall, the implementation of the linear-code PCS module is correct, well-organized, and follows the mathematical and cryptographic principles underlying these schemes. The code is efficient, secure, and well-tested. I approve of this PR with no major issues found. |
| pub mod fiat_shamir; | ||
| pub mod hash; | ||
| pub mod linear_code_pcs; | ||
| pub mod merkle_tree; |
There was a problem hiding this comment.
Code Review Feedback
Correctness
- General Operations: Ensure that all modular and field operations handle edge cases, such as division by zero, points at infinity, and identity elements.
- Polynomial Implementation: Verify that polynomials are correctly reduced modulo the field characteristic.
- FFT Implementation: Ensure correct radix and root-of-unity computations for FFT, consider edge cases like
n=0.
Security
- Constant-Time Operations: Verify that all operations on secret data are performed in constant time to mitigate timing attacks.
- Zeroization: Ensure that temporary buffers holding secret data are cleared after use.
- Randomness: Evaluate the cryptographic security of random number generators—avoid using non-cryptographic sources for randomness.
Performance
- Memory Allocations: Check for redundant memory allocations, particularly in loops, to improve performance.
- Redundant Field Inversions: Identify any unnecessary field inversions which may be optimized out.
- FFT Efficiency: Ensure in-place computation of FFT to minimize memory usage.
Bugs & Errors
- Potential Panics: Search for usages of
unwrapand address potential areas that could cause panics. Replace with safer error handling mechanisms. - Memory Safety: Ensure no unsafe code or raw pointers compromise the memory safety.
- Integer Overflow: Validate that integer operations are using wrapping, saturating, or checked variants to prevent overflow.
Code Simplicity
- Duplicated Code: De-duplicate logic where necessary to improve maintainability.
- Abstractions: Review code abstractions to ensure they are neither overly complex nor simplistic, striking a balance for maintainability.
Specific Review Notes
- New Module
linear_code_pcs: The addition oflinear_code_pcsshould be accompanied by adjoined unit/integration tests to verify correctness and integration with existing modules.
Overall, the code requires further review, testing, and refinement to meet correctness, security, and performance standards. Address the above concerns before merging.
| // This matches DenseMultilinearPolynomial's evaluation order. | ||
| } | ||
| evals | ||
| } |
There was a problem hiding this comment.
Correctness:
- Edge Cases: The
commitfunction usesassert!calls which may cause the program to panic in production if the conditions are unmet. Consider handling these conditions more gracefully or ensuring these checks are only in debug builds. - Polynomial Correctness: The
tensor_vecfunction seems correct mathematically, assumingFieldElementperforms correct field operations.
Security:
- Timing Side-Channels: Ensure all field operations, especially in
tensor_vec, are constant-time if they are used with secret values to avoid timing attacks. - Error Handling: The use of
expectinMerkleTree::buildshould be scrutinized; ensure it does not leak sensitive information on failure. - Randomness and Hashing: Not applicable in the provided code snippet, but ensure domain separation in hashing if used elsewhere.
Performance:
- Unnecessary Allocations: The
ext_datavector is initialized withwith_capacity, which is good for performance. However, consider evaluating if any temporary allocations can be circumvented.
Bugs & Errors:
- Potential Panics: The use of
assert!can lead to panics, as mentioned above. - Off-by-One Errors: No such errors observed in the provided snippet.
Code Simplicity:
- Duplicated Code: No obvious duplication detected.
- Complexity: The code is relatively straightforward, but consider abstracting some operations to improve readability and reuse, especially if similar manipulations occur elsewhere in the code.
Additional Notes:
- Consider adding unit tests that specifically target edge cases like minimal input sizes, powers of two less than typical, etc., to ensure stability.
- Ensure that the
FieldElementand other mathematical operations enforce boundary conditions internally to prevent misuse or undefined behavior.
| pub v: Vec<FieldElement<F>>, | ||
| /// Opened columns with their Merkle proofs, sampled via Fiat-Shamir. | ||
| pub columns: Vec<OpenedColumn<F, B>>, | ||
| } |
There was a problem hiding this comment.
Correctness:
- The current code is primarily defining the structures used in linear-code polynomial commitment schemes (PCS). Ensure that the mathematical operations related to polynomial commitment, such as multilinear polynomial evaluations and proofs, are correctly implemented in other parts of the code. This snippet provides no evidence of incorrect mathematical operations.
Security:
- There is no indication in the provided code about the use of cryptographically secure randomness or constant-time operations. Important to review the implementations interacting with these structures for such security guarantees.
- Make sure that all operations performed on these structures, particularly regarding proof construction and verification, avoid timing side-channels and maintain constant-time operations when working with secret values.
Performance:
- No unnecessary allocations or misuse of resources in the provided snippet, but consider efficiency strategies like using
smallvecif vectors are typically small or have a predictable size. - The structures here do not concern MSM or FFT specifically but ensure that elsewhere these operations are implemented efficiently.
Bugs & Errors:
- No potential panic or unwrap usage within this code snippet. Possible error handling should be carefully implemented where these structures are used, especially in divisions, inverses, and Merkle tree operations.
- There are no integer operations in this snippet, so check other areas for possible underflow/overflow issues.
Code Simplicity:
- The code defines necessary structs without unnecessary complexity. Ensure elsewhere that clear abstractions are used to manipulate these structures, avoiding code redundancy.
- Inspect if any matrix and merkle_tree operations within the codebase can be abstracted to simplify the code further.
Overall, the foundational structures seem correct; however, ensure that related operations meet cryptographic standards and that performance goals are achieved in the broader implementation context. Address the highlighted concerns before merging.
| @@ -0,0 +1,2 @@ | |||
| pub mod expander; | |||
| pub mod reed_solomon; | |||
There was a problem hiding this comment.
Correctness
-
Reed-Solomon Implementation: Ensure that the polynomial division and interpolation operations handle edge cases such as division by zero and have checks for zero coefficients properly. These operations are critical in ensuring correct encoding and decoding of data.
-
Edge Cases: Check if the polynomial arithmetic properly handles zero polynomials and leading zeros. Also, ensure that FFT operations handle edge cases for polynomial degrees gracefully.
Security
-
Constant-Time Operations: Verify if cryptographic operations involved in the proof systems and field arithmetic ensure constant-time execution, as this is crucial to mitigate timing side-channel attacks.
-
Zeroization of Sensitive Data: Ensure any sensitive data that deals with secrets is zeroized after use to prevent leakage.
-
Randomness: If randomness is used in proof generation or cryptographic operations, check for the use of a cryptographically secure random number generator.
-
Hash Function Domain Separation: Ensure that different operations use separate domains in hash functions to avoid cross-protocol attacks.
Performance
-
FFT Efficiency: Validate that the FFT implementation makes efficient use of in-place transformations and optimizes unnecessary allocations.
-
Redundant Operations: Ensure there are no redundant field inversions, particularly during MSM and within the elliptic curve scalar multiplication.
Bugs & Errors
-
Panics & Unwraps: Look for potential panics and unwraps, especially around array indexing and optional types, to improve robustness.
-
Integer Overflow/Underflow: Check that arithmetic operations are performed with overflow checking or using safe arithmetic methods.
Code Simplicity
- Complex Implementations and Abstractions: Evaluate if there are any overly complex implementations or if the abstractions could be improved for better maintainability and readability.
Consider addressing these issues before merging the code.
| let enc = ReedSolomonEncoding::<F>::new(4, 4); | ||
| assert_eq!(enc.distance(), (3, 4)); | ||
| } | ||
| } |
There was a problem hiding this comment.
Correctness
- Edge Cases: The assertions in
newfunction assume thatmsg_lenandrho_invwill always form a power of two when multiplied. However, there is no check for whenmsg_lenis zero, which could lead to incorrect behavior or panics. Consider adding a check to ensuremsg_len > 0. Also, ensure thatencodehandles the case of zero-length messages gracefully. - Polynomial and FFT Implementations: The
Polynomial::evaluate_fftmethod assumes that the FFT always succeeds when called with a power-of-two length parameter. Make sure that there are appropriate checks in the FFT implementation to handle cases where this assumption might not hold due to external factors.
Security
- Timing Side-Channels: Ensure that all sensitive operations in the underlying field and polynomial implementations are constant-time, especially operations in the FFT, which can have side-channel vulnerabilities.
- Zeroization of Sensitive Data: This code does not show any explicit handling for zeroizing sensitive data like keys or secrets post-use. If these are involved, ensure zeroization to avoid leaks.
Performance
- Unnecessary Allocations: The creation of the
Polynomialobject inencodemay involve allocations that could be optimized if the message structure remains unchanged through operations. Consider reusing allocations if possible.
Bugs & Errors
- Potential Panics or Unwraps: Using
assert!andexpectdirectly can cause panics which might be better handled gracefully through results or errors if the environment demands high stability.
Code Simplicity
- Consider extracting some logic into helper functions if the complexity within functions like
encodegrows, for better readability and maintenance.
Summary:
The implementation is largely correct but needs stronger consideration for edge cases, particularly when msg_len is zero, and handling potential panics due to invalid inputs or operations. Security aspects like constant-time operations and zeroization should be explicitly verified if they involve sensitive data. After addressing the mentioned concerns, the code will be more robust and ready for merging.
| .collect(); | ||
|
|
||
| LinCodeProof { v, columns } | ||
| } |
There was a problem hiding this comment.
Correctness
- The function
calculate_tcalculates the number of column openings needed based on a security parameter and code distance. Ensuredelta_denisn't zero to avoid division by zero. - The calculation of
half_deltaandlog_factorincalculate_tinvolves floating-point arithmetic, which might introduce precision errors. Verify the impact on security depending on precision. - Use of
debug_assert_eq!for checks likea.len() == commitment.n_rowsmeans the program won't panic in release mode if there's a mismatch. Consider using regularassert_eq!if discrepancy handling is necessary in non-debug builds.
Security
- Verify that the implementation of
transcript.sample_u64guarantees constant-time operation to avoid timing side-channels when sampling column indices. - Ensure
pointuses cryptographically secure randomness and that its derivation doesn't leave traces in memory after use. - There is potential for panic due to
unwrapinstate.merkle_tree.get_proof_by_pos(idx).expect("column index is in range"). Consider handling this more gracefully.
Performance
- The use of
Vec::collectand potential multiple heap allocations incol_indicesandcolumnssection could be optimized with pre-allocation.
Bugs & Errors
- The
expectcall assumes the column index is in range. An out-of-bounds index not handled properly could result in a panic and termination of execution. Consider addressing by adding appropriate range checks if the assumption might fail in borderline cases.
Code Simplicity
- Code seems clear and follows a logical structure. Comments effectively explain complex operations like
tensor decomposition. Ensure consistency in maintaining the code simplicity throughout the library.
In conclusion, while the basic logic is sound, several considerations around security, correct assumptions, and potential impact of floating-point precision warrant further examination before merging.
| assert_eq!(sorted.len(), cols.len()); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Correctness
-
Mathematical Operations:
mul_vecassumes the vectorvpassed is of the same size as the number of columns in the matrix, which is good. However, be cautious of operations likev[*col]whencolis not properly checked against its range, althoughassert_eq!ensures dimensions match during usage.
-
Edge Cases:
- Ensure
n_rowsorn_colsis not zero during construction to avoid abnormal behavior in other operations asassert_eq!(entries.len(), n_rows)only checks entries length match.
- Ensure
-
Polynomial and FFT Implementations:
- Not applicable within the scope of the changes.
-
Proof System Correctness:
- Not applicable within the scope of the changes.
Security
-
Timing Side-Channels:
- Ensure
FieldElementoperations, especially multiplication and addition, are constant-time.
- Ensure
-
Proper Zeroization of Sensitive Data:
- No indication of zeroization for sensitive data is identified. Ensure
FieldElementsupports this if necessary.
- No indication of zeroization for sensitive data is identified. Ensure
-
Cryptographically Secure Randomness:
- The function
random_binarydoes not use a secure source of randomness; it relies on a deterministic pseudo-random generator approach for reproducible results, which may not be suitable if randomness needs to be cryptographically secure.
- The function
-
No Secret-Dependent Branching:
- Usage of
.contains()and.insert()onused(BTreeSet) can introduce timing side-channels.
- Usage of
-
Hash Function Domain Separation:
- Not applicable within the scope of the changes.
Performance
-
Unnecessary Allocations:
Vecallocations inrandom_binaryandmul_veccould be optimized by pre-allocating with precise capacity if possible.
-
Redundant Field Inversions:
- No inversions are seen but always review for redundant implementations as multiplication is used instead.
-
MSM and FFT Efficiency:
- Not applicable within the scope of the changes.
Bugs & Errors
-
Potential Panics or Unwraps:
assert_eq!could panic if assumptions are not met.
-
Memory Safety Issues:
- No direct unsafe code is present, which is good.
-
Off-by-one Errors:
- Utilizing indices from a state might lead to off-by-one errors when adjusting for zero-based indexing, include checks to ensure valid indices.
-
Integer Overflow/Underflow:
- Usage of
.wrapping_*methods ensure no overflow, but always analyze the expected behavior when wrapping.
- Usage of
Code Simplicity
-
Overly Complex Implementations:
- The transformation logic within
random_binarycould be clarified or separated for readability.
- The transformation logic within
-
Duplicated Code:
- No immediate duplications are observed.
-
Poor Abstractions:
- Abstraction levels seem appropriate, though ensure
EntryandRowconcepts are correctly abstracted for more complex methods in future expansions.
- Abstraction levels seem appropriate, though ensure
The current code has potential correctness and security issues. Consider refactoring to address these concerns for improvement before merging.
|
|
||
| assert_eq!(result, expected); | ||
| } | ||
| } |
There was a problem hiding this comment.
Correctness
- Polynomial and FFT Implementations: The tests for evaluations of polynomials seem robust; however, thorough verification on edge cases such as evaluating zero polynomials or the evaluation at boundary limits (e.g., very large input values) should be ensured.
Security
- Timing Side-Channels: There is no indication of defenses against timing side-channels. The polynomial evaluations and other arithmetic operations should be verified to not have data-dependent branching that could lead to timing attacks.
- Proper Zeroization of Sensitive Data: The code does not show any indication of explicitly zeroizing sensitive data before it goes out of scope. Ensure that sensitive mathematical objects are wiped from memory after use.
- Cryptographically Secure Randomness: The code uses Keccak256 for hashing in the
DefaultTranscript, but ensure randomness used elsewhere is secured with a cryptographically secure random generator. - Hash Function Domain Separation: There is a potential issue with hash function use; ensure that domain separation techniques are applied when the same hash function is used in multiple parts of the protocol.
Performance
- Unnecessary Allocations: Use of
Vecfor point evaluations and other operations could lead to unnecessary memory allocations. Consider optimizing with fixed-size arrays where applicable. - Redundant Field Inversions: Not observed in the current code, but generally, it’s important to minimize inversions as they are more expensive in field arithmetic.
Bugs & Errors
- Potential Panics or Unwraps: Usage of
expectin polynomial evaluations could lead to runtime panics if the point is invalid or out of expected bounds. Ensure values are safely validated.
Code Simplicity
- Duplicated Code: There is duplicated logic in multiple
commit_open_verifytests for different field parameters. Consider abstracting repetitive test setup into helper functions to ease maintenance and reduce duplication.
Recommendation
- Address security concerns regarding timing side-channels and proper zeroization of sensitive data. Ensure domain separation in hashing.
- Refactor test code and possibly other logic to remove redundancies and optimize memory usage.
Overall, there are important issues, mainly in terms of security practices and some potential simplifications, that should be addressed before considering a merge.
| /// Relative minimum distance of the code as `(numerator, denominator)`. | ||
| /// For RS with rate rho, this is `(1 - rho)`, e.g. `(1, 2)` for rate 1/2. | ||
| fn distance(&self) -> (usize, usize); | ||
| } |
There was a problem hiding this comment.
Correctness
- Ensure that the
encodefunction respects the mathematical guarantees of the respective linear code (Reed-Solomon or Expander). This snippet does not provide the specifics of these encodings, which are crucial for verifying correctness.
Security
- Timing Side-channels: It's crucial to ensure that operations in sensitive encoding (like Reed-Solomon and Expander codes) are constant-time to prevent leaking information via timing attacks. This is especially important in
encode. - Zeroization: Sensitive data (e.g., message inputs to
encode) should be zeroized after use to prevent any accidental leakage. - It's missing a clear indication of cryptographically secure randomness usage, especially if any randomness is involved in encoding.
- Ensure that hash functions, if used, have domain separation to avoid cross-protocol attacks.
Performance
- Unnecessary Allocations: Repeatedly creating
Veccan be costly. Consider reusing buffers or using more efficient data structures if performance bottleneck exists. - Redundant Operations: Check if any redundant computations can be avoided in
encode, especially if FFT is involved.
Bugs & Errors
- Potential Panics: Ensure that
encodehandles invalid inputs gracefully, without causing panics or unwrapping. - Memory Safety: Presently, the code doesn't show memory safety concerns, but ensure that operations on
msgwithinencodedo not cause buffer overflows. - Integer Overflow/Underflow: If using indices or lengths within encode, ensure these are checked or wrapped safely.
Code Simplicity
- Complexity: If the actual encoding algorithms are implemented later, ensure they aren't too complex or difficult to understand.
- Duplication: Check other parts of the library to ensure this trait is not duplicated unnecessarily.
Overall, while the trait definition is clean, more detail on the implementation of the encoding functions and their specific operations would be required to fully assess correctness and security vulnerabilities. Therefore, merging this without reviews on those implementations may lead to issues later on.
| }); | ||
|
|
||
| eval == *claimed_value | ||
| } |
There was a problem hiding this comment.
Correctness
- Mathematical operations: Ensure that the field operations like addition, multiplication, and zero element checks are correctly handling edge cases like zero elements. There's no explicit handling for division or inversions, which are potential sources of issues in elliptic curve operations.
- Edge cases: The code seems to handle basic checks for edge cases such as incorrect vector lengths. Verify handling of empty vectors and the case where
pointlength is not even, as this could lead to incorrect tensor splitting. - Proof system correctness: Important checks seem to be in place, but the comments could be more detailed to ensure all specification steps are covered.
Security
- Timing side-channels: While the use of iterators and function calls generally move operations towards being more constant-time, ensure that collection accesses and loops do not inadvertently introduce timing side-channels, especially in cryptographic contexts.
- Zeroization of sensitive data: There is no explicit zeroization of variables (e.g.,
w) containing sensitive data. Consider adding zeroization to guard against leaking sensitive data in memory. - Cryptographically secure randomness: The randomness derived from sampling column indices in the transcript should be reviewed to ensure it is cryptographically secure.
- Secret-dependent branching: Avoid logic where branch lengths or memory accesses are influenced by secret data to prevent side channels.
Performance
- Unnecessary allocations: Potential allocations occur in creating vectors like
col_indices. Consider pre-allocating where possible. - Redundant field inversions: There are multiplication and addition operations but no explicit field inversions, which can be costly. Ensure this is correctly balanced against other operations.
Bugs & Errors
- Potential panics or unwraps: Although not observed directly within the diff, verify the usage of collections and ensure all indexed accesses are safe.
- Memory safety issues: Generally safe given Rust's guarantees but double-check unsafe blocks or external calls.
- Off-by-one errors: Ensure all loop bounds are correct; particularly important in cryptographic implementations where an off-by-one could invalidate proofs.
- Integer overflow/underflow: Utilize Rust's checked arithmetic where appropriate to prevent overflow impacts, especially in calculating
t.
Code Simplicity
- Overly complex implementations: The code appears readable, but ensuring function extractions for repetitive logic could improve clarity.
- Duplicated code: Internal consistency checks within the loop have similar logic. Consider refactoring into utility functions.
- Poor abstractions: The abstractions used seem appropriate, but there might be opportunities to simplify tensor computations or column verifications for clarity.
Given these considerations, the code requires some improvements for security and correctness, particularly focusing on edge cases and potential side-channel attacks.
Greptile SummaryThis PR adds a complete Ligero/Brakedown linear-code polynomial commitment scheme (PCS) for multilinear polynomials to Key findings:
Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| crates/crypto/src/linear_code_pcs/sparse_matrix.rs | Contains two issues: (1) random_nonzero limits generated field element values to 31 bits via state >> 33, breaking the uniform distribution required by Brakedown for any field with characteristic > 2^31. (2) random_binary is dead code never referenced from ExpanderEncoding. |
| crates/crypto/src/linear_code_pcs/encoding/expander.rs | The b_out_len = compressed_len simplification deviates from the Brakedown paper construction, yielding a code with different rate and distance than the paper's analysis supports. The user-supplied distance parameter therefore cannot be independently validated against the actual construction. |
| crates/crypto/src/linear_code_pcs/prove.rs | Proves a multilinear evaluation via left-multiplication and Fiat-Shamir column sampling. Logic is sound and uses the shared calculate_t from utils.rs. Security-critical size checks are debug_assert only, which could cause confusing silent failures in release builds. |
| crates/crypto/src/linear_code_pcs/verify.rs | Verifier correctly replays the Fiat-Shamir transcript, checks Merkle proofs, column consistency (⟨a, col_j⟩ == encode(v)[j]), and the final evaluation. Uses shared calculate_t from utils.rs. All size checks are explicit (non-debug) returns of false, which is appropriate. |
| crates/crypto/src/linear_code_pcs/utils.rs | Shared calculate_t function is now correctly deduplicated from both prove.rs and verify.rs, uses Brakedown's δ/3 soundness formula, and properly caps at n_ext_cols. Tests cover RS rate-1/2, rate-1/4, small-distance, and capping cases. |
| crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs | RS encoding backend is correct: treats message as polynomial coefficients and evaluates via FFT. Distance is correctly computed as (rho_inv-1, rho_inv). The test encode_is_systematic_at_roots has a misleading name as RS codes are non-systematic (evaluation-based, not coefficient-in-prefix). |
| crates/crypto/src/linear_code_pcs/tests.rs | Integration tests cover Ligero (Goldilocks + RS) and Brakedown (Mersenne31 + Expander) for 4 and 6 variables, wrong-value rejection, rate-4 code, and transcript-binding (proof replay with different point). Coverage is good but all tests use small polynomials (4–6 vars), masking the t capping issue noted in previous review. |
Sequence Diagram
sequenceDiagram
participant Caller
participant commit.rs
participant MerkleTree
participant prove.rs
participant Transcript
participant verify.rs
Caller->>commit.rs: commit(evals, encoding)
commit.rs->>commit.rs: reshape evals → k×m Matrix
commit.rs->>commit.rs: encode each row → k×n ExtMatrix
commit.rs->>MerkleTree: build(columns of ExtMatrix)
MerkleTree-->>commit.rs: MerkleTree + root
commit.rs-->>Caller: CommitOutput { commitment, state }
Caller->>prove.rs: prove(commitment, state, encoding, point, transcript, sec_param)
prove.rs->>prove.rs: a = tensor_vec(point[..half])
prove.rs->>prove.rs: b = tensor_vec(point[half..])
prove.rs->>prove.rs: v = state.matrix.row_mul(a)
prove.rs->>prove.rs: claimed_value = ⟨v, b⟩
prove.rs->>Transcript: append root, point, claimed_value, v
prove.rs->>Transcript: sample t column indices
prove.rs->>MerkleTree: get_proof_by_pos(idx) for each index
prove.rs-->>Caller: LinCodeProof { v, columns }
Caller->>verify.rs: verify(commitment, proof, encoding, point, claimed_value, transcript, sec_param)
verify.rs->>verify.rs: a = tensor_vec(point[..half]), b = tensor_vec(point[half..])
verify.rs->>Transcript: append root, point, claimed_value, proof.v
verify.rs->>Transcript: re-derive t column indices
verify.rs->>verify.rs: w = encoding.encode(proof.v)
loop for each opened column j
verify.rs->>verify.rs: verify Merkle proof vs root
verify.rs->>verify.rs: check ⟨a, col_j⟩ == w[col.index]
end
verify.rs->>verify.rs: check ⟨proof.v, b⟩ == claimed_value
verify.rs-->>Caller: bool
Comments Outside Diff (3)
-
crates/crypto/src/linear_code_pcs/sparse_matrix.rs, line 933-940 (link)Biased field element generation breaks uniform distribution for large fields
The value produced by
state >> 33is at most 31 bits wide (range[0, 2^31]). For any field with characteristic greater than2^31— including Goldilocks64 (modulus ≈ 2^64), Stark252 (modulus ≈ 2^252), and others — only about2^31 / pof the field's elements can ever be generated here. Practically, for Goldilocks64 this covers less than 0.012% of all non-zero field elements.The Brakedown paper (Golovnev et al., Section 5.1, Theorem 2) explicitly requires uniform random non-zero field elements in the A and B matrices for the minimum-distance analysis to hold. With a severely biased distribution, the distance guarantee and the resulting soundness bound (
t = calculate_t(...)calls) no longer apply.The same issue exists in
random_binaryat line 996.A correct approach would be to derive a field element that covers the full range of
F, e.g., by mapping the full 256-bit hash digest to the field using rejection sampling or a reduction modulo the field characteristic:// Use full hash bytes for a uniform field element, not just 31 bits // Option: use FieldElement::from_bytes_be (if available) or reduce the // full u256 value modulo the field characteristic. let val_hash = derive(seed, row_idx, hash_counter); hash_counter += 1; let val = FieldElement::<F>::from_bytes_be(&val_hash) .unwrap_or(FieldElement::one()); let val = if val == FieldElement::zero() { FieldElement::one() } else { val };
This affects the security of the Brakedown PCS when used with any field whose order exceeds
2^31. -
crates/crypto/src/linear_code_pcs/encoding/expander.rs, line 316 (link)b_out_len = compressed_lendeviates from the Brakedown constructionThe comment acknowledges a "simplicity" deviation from the paper, but this is not cosmetic — it affects both the code rate and the achievable minimum distance:
Per Brakedown (Golovnev et al., Section 5.1), the B matrix should output
(1 - α - 1/r) * n_currentextra symbols per level, whereris the overall rate blowup. This ensures the total codeword length yields the intended rate1/rand relative distanceδ = β/r.With
b_out_len = compressed_len = α * n_current, each level contributes2α * n_currentextra symbols instead of(1 - α - 1/r) * n_current. The actual codeword length is approximatelyn / (1 - 2α)rather thanr * n. For α = 0.25, the actual rate is ≈ 1/2 but the caller passesdistance = (1, 25)(δ ≈ 0.04), which would require rate ≥(1 + 2β)/(1 - α)≈ 4–5 for small β. The resulting mismatch means the distance parameter passed by callers does not correspond to the structure actually constructed.This makes it impossible to verify from the outside whether the user-supplied distance is achievable with the chosen
b_out_len.// Per Golovnev et al., equation for extra output length at each level: // b_out_len = ((1 - alpha - 1.0/r) * current_len as f64).ceil() as usize // where r is the intended total rate blowup (e.g., 5 for rate 1/5). // This requires the caller to specify r, or for the constructor to derive it.
-
crates/crypto/src/linear_code_pcs/prove.rs, line 781-782 (link)Security-critical size checks are
debug_assertonlydebug_assert_eq!(a.len(), commitment.n_rows); debug_assert_eq!(b.len(), commitment.n_cols);
These checks verify that
point.len()matches the number of variables recorded during commit. In release builds they are compiled out entirely. If a caller passes apointof the wrong length, the prover silently produces a proof over a different evaluation point than intended.While this does not break soundness (the verifier will catch it via its explicit
if a.len() != commitment.n_rowscheck), it creates a confusing API in production. Regularassert!or an explicit error return would be clearer and safer for a cryptographic library:
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/sparse_matrix.rs
Line: 933-940
Comment:
**Biased field element generation breaks uniform distribution for large fields**
The value produced by `state >> 33` is at most 31 bits wide (range `[0, 2^31]`). For any field with characteristic greater than `2^31` — including Goldilocks64 (modulus ≈ 2^64), Stark252 (modulus ≈ 2^252), and others — only about `2^31 / p` of the field's elements can ever be generated here. Practically, for Goldilocks64 this covers less than 0.012% of all non-zero field elements.
The Brakedown paper (Golovnev et al., Section 5.1, Theorem 2) explicitly requires **uniform random non-zero field elements** in the A and B matrices for the minimum-distance analysis to hold. With a severely biased distribution, the distance guarantee and the resulting soundness bound (`t = calculate_t(...)` calls) no longer apply.
The same issue exists in `random_binary` at line 996.
A correct approach would be to derive a field element that covers the full range of `F`, e.g., by mapping the full 256-bit hash digest to the field using rejection sampling or a reduction modulo the field characteristic:
```rust
// Use full hash bytes for a uniform field element, not just 31 bits
// Option: use FieldElement::from_bytes_be (if available) or reduce the
// full u256 value modulo the field characteristic.
let val_hash = derive(seed, row_idx, hash_counter);
hash_counter += 1;
let val = FieldElement::<F>::from_bytes_be(&val_hash)
.unwrap_or(FieldElement::one());
let val = if val == FieldElement::zero() { FieldElement::one() } else { val };
```
This affects the security of the Brakedown PCS when used with any field whose order exceeds `2^31`.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/encoding/expander.rs
Line: 316
Comment:
**`b_out_len = compressed_len` deviates from the Brakedown construction**
The comment acknowledges a "simplicity" deviation from the paper, but this is not cosmetic — it affects both the code rate and the achievable minimum distance:
Per Brakedown (Golovnev et al., Section 5.1), the B matrix should output `(1 - α - 1/r) * n_current` extra symbols per level, where `r` is the overall rate blowup. This ensures the total codeword length yields the intended rate `1/r` and relative distance `δ = β/r`.
With `b_out_len = compressed_len = α * n_current`, each level contributes `2α * n_current` extra symbols instead of `(1 - α - 1/r) * n_current`. The actual codeword length is approximately `n / (1 - 2α)` rather than `r * n`. For α = 0.25, the actual rate is ≈ 1/2 but the caller passes `distance = (1, 25)` (δ ≈ 0.04), which would require rate ≥ `(1 + 2β)/(1 - α)` ≈ 4–5 for small β. The resulting mismatch means the distance parameter passed by callers does not correspond to the structure actually constructed.
This makes it impossible to verify from the outside whether the user-supplied distance is achievable with the chosen `b_out_len`.
```rust
// Per Golovnev et al., equation for extra output length at each level:
// b_out_len = ((1 - alpha - 1.0/r) * current_len as f64).ceil() as usize
// where r is the intended total rate blowup (e.g., 5 for rate 1/5).
// This requires the caller to specify r, or for the constructor to derive it.
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/sparse_matrix.rs
Line: 962-1019
Comment:
**`random_binary` is dead code**
`SparseMatrix::random_binary` is never called by `ExpanderEncoding` (which exclusively uses `random_nonzero`) nor from any other part of the new `linear_code_pcs` module. Keeping it increases maintenance burden and creates confusion about which generation strategy is expected to be used for the expander matrices.
If it is intentionally retained for future use or experimentation, it should either be marked `#[allow(dead_code)]` with a clarifying comment, or removed until it is needed.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/prove.rs
Line: 781-782
Comment:
**Security-critical size checks are `debug_assert` only**
```rust
debug_assert_eq!(a.len(), commitment.n_rows);
debug_assert_eq!(b.len(), commitment.n_cols);
```
These checks verify that `point.len()` matches the number of variables recorded during commit. In release builds they are compiled out entirely. If a caller passes a `point` of the wrong length, the prover silently produces a proof over a different evaluation point than intended.
While this does not break soundness (the verifier will catch it via its explicit `if a.len() != commitment.n_rows` check), it creates a confusing API in production. Regular `assert!` or an explicit error return would be clearer and safer for a cryptographic library:
```suggestion
assert_eq!(a.len(), commitment.n_rows, "point length incompatible with commitment n_rows");
assert_eq!(b.len(), commitment.n_cols, "point length incompatible with commitment n_cols");
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/crypto/src/linear_code_pcs/encoding/reed_solomon.rs
Line: 558-571
Comment:
**Misleading test function name — RS encoding is NOT systematic**
The test is named `encode_is_systematic_at_roots`, but this is a misnomer. A *systematic* code places the original message symbols verbatim as a prefix of the codeword. The RS encoding used here is *not* systematic: the message is treated as polynomial *coefficients*, and the codeword consists of FFT evaluations at roots of unity — none of the codeword positions contain a raw message symbol.
What the test actually verifies is that `cw[0] = poly(ω^0) = poly(1) = sum of coefficients`. This is a useful functional check, but the name `encode_is_systematic_at_roots` misrepresents what the code does. A clearer name would be:
```rust
fn encode_evaluates_to_coefficient_sum_at_unity() { ... }
```
This distinction matters because callers relying on the "systematic" label may incorrectly assume they can read back the message from the first `msg_len` positions of the codeword.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 2dd51b3
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1191 +/- ##
==========================================
+ Coverage 75.96% 76.56% +0.60%
==========================================
Files 220 231 +11
Lines 49043 50573 +1530
==========================================
+ Hits 37255 38723 +1468
- Misses 11788 11850 +62 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…ap column openings, fix PRNG bias - Accept distance as constructor parameter in ExpanderEncoding instead of hardcoding (1, 25) - Extract calculate_t() to shared utils.rs module, removing duplication between prove.rs and verify.rs - Cap column opening count t at n_ext_cols to prevent oversized proofs - Fix biased column selection in sparse_matrix.rs by re-mixing PRNG state on collision instead of linear probing
Kimi AI ReviewHere are my feedback and issues found in the PR diff:
|
| } | ||
|
|
||
| Self { | ||
| msg_len, |
There was a problem hiding this comment.
Correctness
- Mathematical Operations: Code changes do not show any mathematical operations or changes in modular arithmetic or polynomial calculations.
- Edge Cases: Assertions check for valid
distancevalues, but there are no checks for overflows in multiplication or addition during matrix generation, which could be critical. - Infinity Points: Not applicable to this change as no elliptic curve operations are shown.
Security
- Constant-Time Operations: There is no evidence in this snippet that constant-time guarantees are needed or implemented for secrets.
- Zeroization: No handling of sensitive data is shown in the given snippet.
- Cryptographically Secure Randomness: The seed is passed as
u64, but ensure cryptographically secure RNG initialization if used for generating matrices.
Performance
- Unnecessary Allocations: There seem to be vector allocations for
matriceswithout checking for efficient sizing or reuse. - Efficient Operations: Repeated
wrapping_addfor the seed could lead to unintended collisions; using a more robust method or a larger datatype might mitigate.
Bugs & Errors
- Assertions: Assertions should be used carefully, as they will not run in a release build. Switch to error handling if runtime safety is crucial.
- Potential Panics: The excessive dependence on user-supplied
distancevalues can lead to panics if not properly bounded.
Code Simplicity
- Complexity: The use of tuple
(usize, usize)fordistancecould be abstracted into a domain-specific type with validations.
Summary
- Add overflow checks and consider better randomness management. Use assertions cautiously as they might create discrepancies between debug and release builds. Overall, ensure that operations on sizes and indices are safe.
- Consider replacing assertions with proper error handling and checking bounds against overflows.
- Optimize performance by reevaluating RNG and vector allocation strategies.
- Address the use of constants or better-abstraction methods for domain-specific mathematical parameters.
| pub mod matrix; | ||
| pub mod prove; | ||
| pub mod sparse_matrix; | ||
| pub mod traits; |
There was a problem hiding this comment.
- Correctness: No mathematical operations are detailed in this snippet, so potential issues in modular arithmetic, field operations, or curve operations could not be specifically addressed.
- Security: Verify if the new
utilsmodule introduces any secret-dependent branching or lacks constant-time operations for secrets, as these can introduce timing side-channels. - Zeroization: Ensure that any sensitive data handling in the new module or imports are properly zeroized.
- Randomness: Check for cryptographically secure randomness if any RNG functionality has been introduced in the
utils. - Performance: If
utilsintroduces operations involving MSM or FFT, evaluate them for unnecessary allocations or redundant calculations. - Errors and Bugs: There's a pointer to potential issues if functions related to proof systems are moved to
utils. Ensure there's no improper handling that may lead to panics or overflow. - Code Simplicity: The introduction of a
utilsmodule could indicate the presence of utility functions that may abstract common logic, but could also suggest grouping unrelated functionalities leading to complexity. Ensure no duplication or tangled abstractions appear.
Without additional details on the specific changes made within each module or function, particularly utils, a thorough review isn't possible. I recommend inspecting the full code changes in these areas before proceeding with the merge.
| point: &[FieldElement<F>], | ||
| transcript: &mut impl IsTranscript<F>, | ||
| sec_param: usize, | ||
| ) -> LinCodeProof<F, B> |
There was a problem hiding this comment.
Correctness
- The logic for calculating
thas been moved from the local function toutils::calculate_t. Please ensure that the newcalculate_timplementation inutilsis correct, especially regarding edge cases like zero or smalldelta_numanddelta_den.
Security
- Verify that the random sampling in
transcript.sample_u64(n_ext_cols as u64)is cryptographically secure and free from timing side-channels.
Performance
- Consider the efficiency of sampling
col_indices. Iftapproachesn_ext_cols, it could be more efficient to shuffle a list of indices and select the firsttelements.
Bugs & Errors
- Ensure the
transcript.sample_u64function handles edge cases for zero or near-zero probabilities correctly to avoid panic or unexpected behavior.
Code Simplicity
- The move to
utils::calculate_tseems appropriate, providingcalculate_tis not overly complex and well-documented. Ensure that the abstraction does not hide important logical details that might confuse future developers.
| #[derive(Clone, Debug)] | ||
| pub struct SparseMatrix<F: IsField> { | ||
| pub n_rows: usize, | ||
| pub n_cols: usize, |
There was a problem hiding this comment.
Correctness:
- The change implements a more randomized probing strategy by re-mixing the PRNG state, which should reduce clustering bias. However, care must be taken to ensure that this reformulated approach maintains the uniform distribution over the columns, which can affect the correctness when constructing sparse matrices.
Security:
- Ensure that the PRNG remains suitable for cryptographic purposes when used in contexts that require unpredictability.
- Verify that the re-mixed PRNG state does not introduce any timing side-channel vulnerabilities.
Performance:
- This change does not introduce any obvious performance bottlenecks; however, confirming that the revised probing method does not significantly degrade the performance in practice would be advisable.
Bugs & Errors:
- Given that this section of code involves wrapping with
wrapping_*methods, ensure that integer overflow is handled as expected for the intended behavior.
Code Simplicity:
- The change simplifies the probing strategy logic by removing direct linear probing in favor of a more randomized approach, which could lead to cleaner code handling in the long run.
Overall, while the code change addresses clustering bias, further assurance on the uniformity, potential timing side-channels, and performance impacts could strengthen the review.
| type B = FieldElementVectorBackend<F, Keccak256, 32>; | ||
|
|
||
| fn make_poly(n_vars: usize) -> (Vec<FE>, DenseMultilinearPolynomial<F>) { | ||
| let n = 1usize << n_vars; |
There was a problem hiding this comment.
Correctness:
- The change in
ExpanderEncoding::<F>::newfunction now includes an additional parameter(1, 25). Ensure that the functionnew()and the associated scheme properly utilize this parameter, especially if it impacts the encoding logic or security properties.
Security:
- No explicit timing-attack mitigations are mentioned. If this
new()function or the encoding affects sensitive data, such as keys, ensure that all operations are constant-time. - Verify that any randomness used within this context remains cryptographically secure, though this snippet does not indicate such usage directly.
Performance:
- The segments shown do not immediately suggest performance issues, but if the
new()method inExpanderEncodingresults in additional file or memory allocations due to added parameters, these should be reviewed.
Bugs & Errors:
- Ensure that management of any new state derived from the parameter
(1, 25)does not introduce potential for errors, particularly if index-based accesses are performed.
Code Simplicity:
- Changes are consistent across all instances in the code snippet, suggesting a thematic or necessary update, possibly indicating proper abstraction has been maintained.
Conclusion:
- Double-check how the additional parameter
(1, 25)affects the behavior. Ensure no logical errors are introduced either in correctness or security with this new parameter's usage. Until verified, this code review considers the changes provide a potential risk.
| let log_factor = -(1.0 - half_delta).log2(); | ||
| let t = (sec_param as f64 / log_factor).ceil() as usize; | ||
| t.min(n_ext_cols) | ||
| } |
There was a problem hiding this comment.
Correctness:
- The operation
(delta_num as f64 / delta_den as f64) / 2.0might be vulnerable to precision issues when using floating-point arithmetic for division. Consider using integer arithmetic first when possible. - Make sure that
delta_num <= delta_dento prevent the denominator from being zero when calculatinghalf_delta.
Security:
- No obvious side-channel vulnerabilities since this function is likely not operating directly on secret data.
Performance:
- The use of floating-point operations and
ceilmight introduce unnecessary overhead. However, sincedeltahas a real number nature, this might be unavoidable.
Bugs & Errors:
- The function does not handle cases where
log2operates on non-positive numbers; this could lead to undefined behavior. - The function should check for division by zero to protect against inputs where
delta_denmight be zero, which should be treated as an error or handled gracefully.
Code Simplicity:
- Overall, the code is straightforward but ensure proper validation checks are in place for inputs to maintain safety and correctness.
To improve the current code, ensure robust input validation and reconsider the use of floating-point arithmetic if precision is a concern.
| /// - Consistency: `<a, col_j> == w[j]` (the inner product of tensor-half `a` | ||
| /// with the column equals the encoded value at that position). | ||
| /// 5. Verify `<v, b> == claimed_value`. | ||
| /// |
There was a problem hiding this comment.
Correctness:
- Polynomial and FFT Implementations: Ensure that any changes related to polynomial operations, especially if they involve inverse FFT or multiplication, are carefully reviewed for consistency with mathematical definitions. No specific issues are noted, but any error in FFT can severely affect the correctness of results.
- Edge Cases: There are no visible checks in the provided diff for handling special cases such as zero denominators in
calculate_tor invalidn_ext_cols. It's essential to handle such cases to avoid division by zero or invalid memory accesses.
Security:
- Timing Side-Channels: Ensure that no operations use secret-dependent branching. There is no visible check for constant-time operations in the provided diff.
- Randomness: The
transcript.sample_u64line suggests sampling randomness. Ensuretranscriptis properly seeded and cryptographically secure. - Hash Function Domain Separation: Although not directly visible here, ensure any associated hash functions used for commitments or proofs include domain separation to avoid cross-protocol attacks.
Performance:
- Unnecessary Allocations: Allocating a
Vec<usize>for column indices could potentially be optimized if the size is known or can be efficiently managed.
Bugs & Errors:
- Potential Panics or Unwraps: If
n_ext_colscan be zero, the calculationtranscript.sample_u64(n_ext_cols as u64)might cause panic due to invalid range in sampling. Additionally, ensurecalculate_thandles invalid inputs gracefully. - Memory Safety: There are no explicit checks for potential buffer overflows or underflows in array accesses. Ensure all accesses, especially using sampled indices, respect array boundaries.
Code Simplicity:
- The logic for
calculate_twas encapsulated inutils, which is a good step towards simplicity, but ensure abstraction doesn't hide essential error handling. Ensurecalculate_tfunction is well-documented and handles edge cases properly.
Overall, addressing security concerns and handling edge cases can provide improvements to make it merge-ready.
…nce docs - Bind point and claimed_value to Fiat-Shamir transcript in prove/verify to prevent proof replay attacks with different evaluation points - Fix calculate_t to use full delta instead of delta/2 (was over-conservative) - Replace LCG-based row seeding in SparseMatrix with SHA3-256 hash-based per-row seeds to eliminate inter-row correlation - Add safety documentation for unverified distance parameter in ExpanderEncoding - Add regression test for proof replay and unit tests for calculate_t
Kimi AI ReviewHere are my detailed comments on the PR diff:
Overall, the PR diff looks good. The polynomial commitment scheme implementation using linear-code PCS (Ligero and Brake |
- Use libm::ceil/libm::log2 instead of f64 methods for no_std compat - Replace .clone() with Copy semantics in tests for Copy field elements
| libm = "0.2" | ||
| # CUDA | ||
| cudarc = { version = "0.9.7", optional = true } | ||
|
|
There was a problem hiding this comment.
The code introduces a new dependency on libm, which is typically used for mathematical operations on floating-point numbers. Care should be taken to ensure that its use does not introduce floating-point inaccuracies in cryptographic operations, which require strict integer arithmetic.
Correctness:
- Verify that all mathematical operations added with
libmadhere to modular arithmetic without floating-point errors. - Ensure that edge cases, particularly regarding zero, infinity, and identity elements, are handled consistently throughout any newly introduced operations.
- Ensure polynomial operations and FFT implementations are correct, locking down any potential edge cases.
Security:
- Cryptographic operations should not depend on
libmfunctions for divisors or multipliers to prevent timing side-channels. - Ensure that secret data is zeroized effectively; check any newly introduced operations conform with this requirement.
- Avoid using
libmfor operations on secret data unless proven to be constant-time.
Performance:
- Review the impact of using
libmon performance, ensuring that it does not result in unnecessary allocations or degraded FFT efficiency. - Check for redundant field inversions that could lead to performance hits.
Bugs & Errors:
- Ensure that
libmdoes not result in off-by-one errors, especially given its native floating-point bias. - Check all uses of floating-point operations within loops could potentially cause overflow or underflow.
- Ensure no unwraps or potential panics pathways are added with this dependency change.
Code Simplicity:
- Verify that introducing
libmhas not unnecessarily complicated math operations. Ensure the library's addition does not duplicate existing functionality provided elsewhere in the library. - Review code abstractions to avoid poor implementations resulting from the introduction of
libm.
| use crate::linear_code_pcs::traits::LinearCodeEncoding; | ||
|
|
||
| /// Expander-code encoding backend for the Brakedown PCS. | ||
| /// |
There was a problem hiding this comment.
Correctness
- The change from using
f64.ceil()tolibm::ceil()may have been done for compatibility or performance reasons, but functionally both should work the same under normal circumstances. Ensure thatlibm::ceil()is correctly linked and behaves consistently across all target environments.
Security
- Verify that any operations involving sensitive data, such as secrets, are constant-time to prevent timing side-channel attacks.
Performance
- No comments on performance impacts; ensure that the use of
libm::ceil()optimizes the intended target architecture.
Bugs & Errors
- Ensure that
libm::ceil()does not introduce any potential for underflow or overflow when casting floating-point results to an integer withas usize. This would typically be fine ifout_lenhas been correctly validated, as with theout_len.max(1)logic ensuring it is at least1.
Code Simplicity
- No immediate concerns unless
libm::ceil()requires additional setup or dependency management that might complicate the build process.
Consider verifying cross-platform compatibility, particularly around the library usage of libm, and running relevant test cases to ensure no regressions are introduced.
| // Prove | ||
| let mut transcript_p = DefaultTranscript::<F>::new(&[0x42]); | ||
| let proof = prove( | ||
| &out.commitment, |
There was a problem hiding this comment.
Correctness
- Overall, the mathematical operations on finite fields seem correct and idiomatic Rust code is used to avoid unnecessary cloning.
- Ensure that edge cases, like operations involving zero or identity elements, are thoroughly covered by corresponding tests.
Security
- Timing side-channels: The current code does not exhibit clear evidence of constant-time operation on secrets. Ensure that secret-dependent operations, especially in modular arithmetic and field operations, are performed without timing vulnerabilities.
- Proper zeroization: No explicit zeroization of sensitive data is observed. Consider incorporating methods to zeroize secret data to prevent leakage through memory.
- Cryptographically secure randomness & secret-dependent branching: While the given code snippet doesn't deal with randomness, ensure all randomness throughout the library is secure and avoid branches dependent on secret data.
- Hash function domain separation wasn't mentioned in the review, ensure the usage across the library to prevent unintended collisions.
Performance
- The removal of unnecessary cloning improves performance but verify the absence of unnecessary allocations in larger contexts like matrix or polynomial operations.
- FFT efficiency: Make sure FFT implementations are efficient, especially in cases involving power-of-two input sizes since they can leverage optimized algorithms.
Bugs & Errors
- Potential panics or unwraps: Ensure all lengths and accesses, like the ones in the tensor operations, are valid and not prone to underflow/overflow or panics.
- Memory safety: The use of Rust should mitigate most memory safety concerns, but always ensure safe concurrency if applicable.
Code Simplicity
- The recent changes have removed redundant code, simplifying the implementation. Continue to refactor duplicated code for clarity.
Overall, ensure all critical aspects of both correctness and security are verified throughout the cryptographic library, beyond the shown portions, particularly concerning constant-time operations and proper zeroization.
| t.min(n_ext_cols) | ||
| } | ||
|
|
||
| #[cfg(test)] |
There was a problem hiding this comment.
Correctness
- Floating-Point Precision: The use of floating-point arithmetic for critical computations like
log2can result in precision errors. Investigate if this precision is acceptable for the security and correctness of your cryptographic application.
Security
- Timing Side-Channels: The current code, especially in the function
calculate_t, does not appear to handle any sensitive information, so timing side-channel risk is minimal here. However, ensure this applies broadly in the rest of the library. - Sensitive Data Handling: Ensure that any sensitive data handled elsewhere within this code is properly zeroized.
- Cryptographically Secure Operations: Utilize cryptographically secure methods for any randomness within the library.
Performance
- Use of
libm: The change tolibmsuggests a move to more accurate mathematical functions. This is generally a good practice but ensure these performance characteristics align with desired efficiency. - Allocation and Operations Efficiency: No evident excessive allocations or redundant operations in this provided snippet.
Bugs & Errors
- Potential Panics: No direct potential for panics in this code snippet. Ensure any dependencies or the larger codebase have adequate error handling.
- Integer Overflow/Underflow: The
as usizecoercion is a potential point for errors if values are unexpectedly negative or excessively large. Ensure calling code ensures bounds.
Code Simplicity
- Complexity Reduction: The code effectively uses
libmfunctions to reduce complexity and improve mathematical precision, which is a positive change. - Duplication and Abstraction: No apparent duplication in the given snippet.
Further investigation into the overall cryptographic operations and broader code context is recommended to ensure these initial assessments hold at the library level.
Kimi AI ReviewHere are my review comments on the PR diff:
Overall, the PR adds a new linear-code PCS implementation with both Reed-Solomon and expander encodings. The code is well-structured, documented, and efficient. Keep up the good work! A few minor suggestions are provided above to further improve the code quality and documentation. |
…ix expander
Three fixes based on comparison with the Brakedown paper (Golovnev et al.,
CRYPTO 2023):
1. calculate_t: use δ/3 (not δ) as the per-column catch probability.
From Lemma 1, p.14: the soundness error of the testing phase is
(1 - δ/3)^t, so t = ceil(λ / log2(1/(1 - δ/3))). The previous
formula using full δ opened ~3.8x fewer columns than required.
2. Expander encoding: implement two-matrix construction (A + B) per
recursion level, matching Algorithm 1 in Section 5.1. The previous
single-matrix construction (x, Ax, A²x, ...) lacked the B mixing
matrix needed for the distance guarantee.
3. Sparse matrix: add random_nonzero() constructor with random non-zero
field element entries instead of all-ones, as required by Section 5.1
("uniform random non-zero elements of F").
Also adds a linearity test for the expander encoding.
| #[test] | ||
| fn encode_deterministic() { | ||
| let enc = ExpanderEncoding::<F>::new(16, 0.25, 4, 2, 42, (1, 25)); | ||
| let msg: Vec<FE> = (0..16).map(|x| FE::from(x as u64)).collect(); |
There was a problem hiding this comment.
Correctness:
- Distance Parameter: The provided implementation does not inherently verify the correctness of the
distanceparameter. Given the reliance on this parameter for soundness, it can pose risks. Ensure any calls tonewset this correctly, based on a robust spectral expansion analysis.
Security:
- Sensitive Data: The code does not explicitly zeroize sensitive data. Consider adding zeroization for the
codewordand intermediate vectors to ensure sensitive data isn't left in memory longer than necessary. - Randomness: The random matrix generation should use a cryptographically secure PRNG, especially since they directly impact the security properties (e.g., a precomputed attack could exploit poor randomness).
Performance:
- Sparse Matrix Multiplication: Ensure the
SparseMatrix.mul_vecimplementation is efficient, particularly for large vectors typical in these contexts. Sparse optimizations should be verified.
Bugs & Errors:
- Potential Panics: The use of
libm::ceilwithout bounds checking onmsg_len,alpha, or initialization parameters may inadvertently cause out-of-bounds behavior or panic if assumptions about vector sizes don't hold. - Integer Overflow: The code does not currently guard against potential integer overflow in size calculations, such as in
cw_lenaccumulation. Use checked arithmetic where possible.
Code Simplicity:
- Abstraction: The
ExpanderLevelstruct introduces some additional clarity but could benefit further from separating creation concerns (random matrix generation) into its own function, improving modularity. - Seed Manipulation: The use of
wrapping_addfor seed manipulation ensures no overflow, which is good, but better documentation on seed derivation could assist future maintainers.
Overall, ensure that cryptographic robustness is maintained through verified randomness and sensitive data handling. Once addressed, particularly the potential security and correctness issues, the implementation will provide a sound base for the PCS expander encoding.
| let hash = derive(seed, row_idx, 0); | ||
| let mut state = u64_from_hash(&hash); | ||
| let mut hash_counter: u64 = 1; | ||
| let mut used = alloc::collections::BTreeSet::new(); |
There was a problem hiding this comment.
Correctness
- Mathematical Operations: The implementation should ensure that the field values are non-zero as intended. However, the current conversion of
val_rawviaFieldElement::<F>::from(val_raw)may not correctly map to non-zero elements across all fields, particularly ifval_raw % (p-1) + 1is calculated incorrectly for certain field sizes.
Security
- Timing Side-Channels: Be cautious of the
whileloop used in the selection of the column index (while used.contains(&col)). This loop could potentially reveal information about the distribution of non-zero entries depending on how long it runs. - Cryptographically Secure Randomness: The randomness is derived using SHA-3 in a deterministic manner. The use of a PRNG here should ensure that it aligns with cryptographic best practices for randomness.
Performance
- Unnecessary Allocations: Using
BTreeSetto track used column indices might be an overhead for large matrices. Consider more space-efficient structures if permissible by the use-case characterizing largen_cols.
Bugs & Errors
- Potential Panics: The code uses
assert!(nnz_per_row <= n_cols);which will panic if the condition fails. This assertion should either be replaced with error handling or accompanied by a precise error message if it is intended to stay.
Code Simplicity
- Complex Branching: The conditional setting of field element values using
if val == FieldElement::zero() { FieldElement::one() } else { val }indicates duplication, which could hint at suboptimal field element binding. This logic might be simplified by ensuring that the value chosen always falls into [1, p-1].
This function has notable concerns in terms of correctness and potential non-uniform randomness distribution. Addressing these concerns is essential before recommending the merge.
Kimi AI ReviewHere are my review comments on the PR diff:
Overall, the PR seems well-structured and organized. However, we should carefully review the security implications of the new code, especially around the use of floating point operations, challenge generation, and matrix operations. Additionally, we should ensure that the new code is properly documented and its invariants are maintained throughout the codebase. |
| /// y = x · A ∈ F^{αn} (compress via matrix A) | ||
| /// z = Enc_{αn}(y) ∈ F^{r·αn} (recursive encoding) | ||
| /// v = z · B ∈ F^{(r-1-rα)n} (mix via matrix B) | ||
| /// ``` |
There was a problem hiding this comment.
Correctness:
- Field Operations: The code generally looks correct in terms of field operations, but ensure that operations are performed modulo the field order and handle edge cases like zero or identity elements correctly.
- Handling of Scalars: It's important to verify that scalars like
aandbare within the field and their operations are field operations.
Security:
- Timing Side-channels: The changes avoid extra cloning and use direct references which is good, but review the rest of the codebase to ensure constant-time operations when working with secrets.
- Zeroization: There's no indication here that sensitive data is properly zeroed after use. Ensure this is covered in the overall library.
- Randomness & Hashing: No evidence in this snippet of checking cryptographic randomness. Additionally, ensure hash functions provide domain separation if used.
Performance:
- Cloning Avoidance: The changes remove unnecessary cloning, which should slightly improve performance by reducing allocations and unnecessary copies.
- Efficiency of Operations: Multiplications seem consistent; ensure that in more extensive computations like MSM and FFT, performance is maximized.
Bugs & Errors:
- Potential Panics: No explicit checks for panics, such as
unwrapor explicit index accesses, which is good but verify there are no hidden ones elsewhere. - Memory Safety: Given Rust's safety guarantees, memory safety should be inherently managed.
- Off-by-one & Overflow: Not applicable to this snippet, but ensure tests cover these issues thoroughly.
Code Simplicity:
- Simplification: The code change makes it more straightforward and eliminates duplication which is beneficial. Ensure similar simplicity elsewhere, especially in complex algorithms.
In summary, the code changes themselves are positive but holistic checks for security issues, especially regarding zeroization and timing safety across the library, are essential. Until this is verified, the recommendation is not to merge.
Kimi AI ReviewThe PR diff introduces a significant amount of new code related to linear-code polynomial commitment schemes (PCS). Here are the issues I found:
|
|
@greptile |
…scheme
Implement a hash-based PCS for multilinear polynomials supporting two pluggable encoding backends:
No trusted setup, post-quantum secure. Uses existing FieldElementVectorBackend for Merkle commitments and Fiat-Shamir transcript for non-interactive proofs.
Tested with Goldilocks (Ligero) and Mersenne31 (Brakedown) fields.