Skip to content

Add portable scalar fallback for packed Goldilocks#1198

Open
MavenRain wants to merge 2 commits into
lambdaclass:mainfrom
MavenRain:feat/goldilocks-scalar-fallback
Open

Add portable scalar fallback for packed Goldilocks#1198
MavenRain wants to merge 2 commits into
lambdaclass:mainfrom
MavenRain:feat/goldilocks-scalar-fallback

Conversation

@MavenRain
Copy link
Copy Markdown

Add portable scalar fallback for packed Goldilocks

Description

The goldilocks_avx module was unavailable without AVX2. Add a scalar fallback (PackedGoldilocksScalar) with the same API that works on all platforms, and a PackedGoldilocks type alias that auto-selects AVX2 on x86_64 or scalar elsewhere.

Closes #1153

  • New feature
  • Bug fix
  • Optimization

Checklist

  • Linked to Github Issue
  • Unit tests added
  • This change requires new documentation.
    • Documentation has been added/updated.
  • This change is an Optimization
    • Benchmarks added/run

  The goldilocks_avx module was unavailable without AVX2. Add a scalar
  fallback (PackedGoldilocksScalar) with the same API that works on all
  platforms, and a PackedGoldilocks type alias that auto-selects AVX2
  on x86_64 or scalar elsewhere.

  Closes lambdaclass#1153
@MavenRain MavenRain requested a review from a team as a code owner April 2, 2026 14:51
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR adds a portable PackedGoldilocksScalar type to the goldilocks_avx module — a scalar fallback that provides the same packed-arithmetic API as PackedGoldilocksAVX2 but uses element-wise FieldElement operations instead of SIMD intrinsics. A PackedGoldilocks type alias is added that auto-selects PackedGoldilocksAVX2 on x86_64+AVX2 and PackedGoldilocksScalar elsewhere, and the #[cfg] gate that previously hid the goldilocks_avx module from non-x86_64 targets is removed.

  • The core arithmetic is correct: all operations delegate to the well-tested FieldElement<Goldilocks64Field> impls.
  • The new PackedGoldilocks alias cleanly unifies the two implementations behind one name for downstream users.
  • The PartialEq implementation uses a verbose hand-rolled comparison (IsField::eq on raw .value()) that is inconsistent with PackedGoldilocksAVX2, which simply compares self.0 == other.0; deriving PartialEq (or using the simpler form) would align the two types and also allow Hash to be derived, avoiding a future clippy lint.
  • The unit tests for the scalar path only exercise small, in-range values and miss arithmetic near the field boundary (overflow, underflow, near-modulus multiplication) — the AVX2 tests cover these cases and equivalent tests would increase confidence in the scalar path.
  • The module-level documentation and the name goldilocks_avx no longer fully describe the module's scope now that it compiles unconditionally and contains non-AVX code.

Confidence Score: 4/5

Safe to merge; all correctness concerns are style/test-coverage suggestions and no logic bugs were found.

The implementation is straightforward — arithmetic is entirely delegated to the existing, well-tested FieldElement type, so the risk of a correctness regression is low. The outstanding items (verbose PartialEq, missing Hash, sparse boundary tests, stale docs) are all P2 suggestions that do not affect correctness or runtime safety.

crates/math/src/field/fields/goldilocks_avx/scalar.rs — PartialEq and test coverage could be improved before the file grows further.

Important Files Changed

Filename Overview
crates/math/src/field/fields/goldilocks_avx/scalar.rs New file implementing the portable scalar fallback; API mirrors AVX2 but PartialEq is unnecessarily verbose and tests don't cover field-boundary edge cases.
crates/math/src/field/fields/goldilocks_avx/mod.rs Wires in the scalar module and adds the PackedGoldilocks type alias; unconditional pub mod scalar is correct, but the module-level doc comment no longer reflects the scalar path.
crates/math/src/field/fields/mod.rs Removes the x86_64/avx2/avx512 cfg gate so goldilocks_avx is always exposed; intentional and correct given the new scalar fallback.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Consumer imports goldilocks_avx] --> B{Target platform?}
    B -- "x86_64 + avx2 feature" --> C[PackedGoldilocks = PackedGoldilocksAVX2]
    B -- "All other platforms" --> D[PackedGoldilocks = PackedGoldilocksScalar]
    C --> E[AVX2 SIMD: 4 elements x 256-bit registers\navx2.rs]
    D --> F[Scalar fallback: 4 elements x element-wise ops\nscalar.rs NEW]
    E --> G[Shared API\nnew / from_u64_array / zero / one\nto_array / from_slice / store_to_slice\nsquare / Add / Sub / Mul / Neg / *Assign]
    F --> G
    H[avx512.rs] -- "x86_64 + avx512f" --> I[PackedGoldilocksAVX512\n8 elements x 512-bit]
    J[avx2_ext.rs] -- "x86_64 + avx2" --> K[PackedGoldilocksFp2AVX2\nPackedGoldilocksFp3AVX2]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 79-83

Comment:
**Overly verbose `PartialEq`, inconsistent with `PackedGoldilocksAVX2`**

The scalar `PartialEq` calls `<Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value())` on raw field values, while the AVX2 sibling uses the much simpler `self.0 == other.0`. Since `FieldElement` already implements `PartialEq` correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types.

Consider aligning with the AVX2 pattern:

```suggestion
impl PartialEq for PackedGoldilocksScalar {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}
```

Additionally, since `FieldElement` derives `Hash`, and `PackedGoldilocksScalar` now has a manual `PartialEq`, `Hash` cannot be automatically derived. If someone tries to use `PackedGoldilocksScalar` as a hash-map key they'll get a compile error. Clippy will also fire the `derived_hash_with_manual_eq`-related lint. Either derive both (`#[derive(PartialEq, Eq)]` — which would make the manual impls unnecessary), or also add a manual `Hash` impl to keep the `a == b ⟹ hash(a) == hash(b)` contract explicit.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 140-182

Comment:
**Tests only cover small values; no near-modulus boundary tests**

All five unit tests use small, comfortably-in-range integers (`[1,2,3,4]`, `[10,20,30,40]`, etc.). The field modulus for Goldilocks is `p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001`. The scalar implementation delegates to `FieldElement` arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for:

- Addition overflow: `(p-1) + 1` should equal `0`
- Subtraction underflow: `0 - 1` should equal `p-1`
- Multiplication near-modulus: `(p-1) * (p-1)` should equal `1`
- Negation: `-(p-1)` should equal `1`

The AVX2 module includes exactly these cases (`test_avx2_overflow_handling`, `test_avx2_sub_underflow`, `test_avx2_neg`, `test_avx2_mul_near_modulus`). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/mod.rs
Line: 55-56

Comment:
**Module name `goldilocks_avx` is now misleading**

The containing module (`crates/math/src/field/fields/goldilocks_avx/`) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing `goldilocks_avx::PackedGoldilocks` on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware.

While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of `mod.rs` should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the `#![doc]` block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Add portable scalar fallback for packed ..." | Re-trigger Greptile

Comment on lines +79 to +83
impl PartialEq for PackedGoldilocksScalar {
fn eq(&self, other: &Self) -> bool {
(0..WIDTH).all(|i| <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()))
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Overly verbose PartialEq, inconsistent with PackedGoldilocksAVX2

The scalar PartialEq calls <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()) on raw field values, while the AVX2 sibling uses the much simpler self.0 == other.0. Since FieldElement already implements PartialEq correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types.

Consider aligning with the AVX2 pattern:

Suggested change
impl PartialEq for PackedGoldilocksScalar {
fn eq(&self, other: &Self) -> bool {
(0..WIDTH).all(|i| <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()))
}
}
impl PartialEq for PackedGoldilocksScalar {
fn eq(&self, other: &Self) -> bool {
self.0 == other.0
}
}

Additionally, since FieldElement derives Hash, and PackedGoldilocksScalar now has a manual PartialEq, Hash cannot be automatically derived. If someone tries to use PackedGoldilocksScalar as a hash-map key they'll get a compile error. Clippy will also fire the derived_hash_with_manual_eq-related lint. Either derive both (#[derive(PartialEq, Eq)] — which would make the manual impls unnecessary), or also add a manual Hash impl to keep the a == b ⟹ hash(a) == hash(b) contract explicit.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 79-83

Comment:
**Overly verbose `PartialEq`, inconsistent with `PackedGoldilocksAVX2`**

The scalar `PartialEq` calls `<Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value())` on raw field values, while the AVX2 sibling uses the much simpler `self.0 == other.0`. Since `FieldElement` already implements `PartialEq` correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types.

Consider aligning with the AVX2 pattern:

```suggestion
impl PartialEq for PackedGoldilocksScalar {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}
```

Additionally, since `FieldElement` derives `Hash`, and `PackedGoldilocksScalar` now has a manual `PartialEq`, `Hash` cannot be automatically derived. If someone tries to use `PackedGoldilocksScalar` as a hash-map key they'll get a compile error. Clippy will also fire the `derived_hash_with_manual_eq`-related lint. Either derive both (`#[derive(PartialEq, Eq)]` — which would make the manual impls unnecessary), or also add a manual `Hash` impl to keep the `a == b ⟹ hash(a) == hash(b)` contract explicit.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +140 to +182
#[cfg(test)]
mod tests {
use super::*;

#[test]
fn scalar_add_works() {
let a = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
let b = PackedGoldilocksScalar::from_u64_array([10, 20, 30, 40]);
let c = a + b;
assert_eq!(c, PackedGoldilocksScalar::from_u64_array([11, 22, 33, 44]));
}

#[test]
fn scalar_mul_works() {
let a = PackedGoldilocksScalar::from_u64_array([2, 3, 4, 5]);
let b = PackedGoldilocksScalar::from_u64_array([10, 10, 10, 10]);
let c = a * b;
assert_eq!(c, PackedGoldilocksScalar::from_u64_array([20, 30, 40, 50]));
}

#[test]
fn scalar_sub_works() {
let a = PackedGoldilocksScalar::from_u64_array([10, 20, 30, 40]);
let b = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
let c = a - b;
assert_eq!(c, PackedGoldilocksScalar::from_u64_array([9, 18, 27, 36]));
}

#[test]
fn scalar_square_works() {
let a = PackedGoldilocksScalar::from_u64_array([3, 4, 5, 6]);
let b = a.square();
assert_eq!(b, PackedGoldilocksScalar::from_u64_array([9, 16, 25, 36]));
}

#[test]
fn scalar_neg_works() {
let a = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
let b = -a;
let c = a + b;
assert_eq!(c, PackedGoldilocksScalar::zero());
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Tests only cover small values; no near-modulus boundary tests

All five unit tests use small, comfortably-in-range integers ([1,2,3,4], [10,20,30,40], etc.). The field modulus for Goldilocks is p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001. The scalar implementation delegates to FieldElement arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for:

  • Addition overflow: (p-1) + 1 should equal 0
  • Subtraction underflow: 0 - 1 should equal p-1
  • Multiplication near-modulus: (p-1) * (p-1) should equal 1
  • Negation: -(p-1) should equal 1

The AVX2 module includes exactly these cases (test_avx2_overflow_handling, test_avx2_sub_underflow, test_avx2_neg, test_avx2_mul_near_modulus). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 140-182

Comment:
**Tests only cover small values; no near-modulus boundary tests**

All five unit tests use small, comfortably-in-range integers (`[1,2,3,4]`, `[10,20,30,40]`, etc.). The field modulus for Goldilocks is `p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001`. The scalar implementation delegates to `FieldElement` arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for:

- Addition overflow: `(p-1) + 1` should equal `0`
- Subtraction underflow: `0 - 1` should equal `p-1`
- Multiplication near-modulus: `(p-1) * (p-1)` should equal `1`
- Negation: `-(p-1)` should equal `1`

The AVX2 module includes exactly these cases (`test_avx2_overflow_handling`, `test_avx2_sub_underflow`, `test_avx2_neg`, `test_avx2_mul_near_modulus`). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +55 to +56
// Portable scalar fallback (always available)
pub mod scalar;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Module name goldilocks_avx is now misleading

The containing module (crates/math/src/field/fields/goldilocks_avx/) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing goldilocks_avx::PackedGoldilocks on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware.

While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of mod.rs should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the #![doc] block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/mod.rs
Line: 55-56

Comment:
**Module name `goldilocks_avx` is now misleading**

The containing module (`crates/math/src/field/fields/goldilocks_avx/`) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing `goldilocks_avx::PackedGoldilocks` on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware.

While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of `mod.rs` should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the `#![doc]` block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scalar fallback for packed Goldilocks

1 participant