Add portable scalar fallback for packed Goldilocks by MavenRain · Pull Request #1198 · lambdaclass/lambdaworks

MavenRain · 2026-04-02T14:51:07Z

Add portable scalar fallback for packed Goldilocks

Description

The goldilocks_avx module was unavailable without AVX2. Add a scalar fallback (PackedGoldilocksScalar) with the same API that works on all platforms, and a PackedGoldilocks type alias that auto-selects AVX2 on x86_64 or scalar elsewhere.

Closes #1153

New feature
Bug fix
Optimization

Checklist

Linked to Github Issue
Unit tests added
This change requires new documentation.
- Documentation has been added/updated.
This change is an Optimization
- Benchmarks added/run

The goldilocks_avx module was unavailable without AVX2. Add a scalar fallback (PackedGoldilocksScalar) with the same API that works on all platforms, and a PackedGoldilocks type alias that auto-selects AVX2 on x86_64 or scalar elsewhere. Closes lambdaclass#1153

greptile-apps · 2026-04-02T14:54:34Z

Greptile Summary

This PR adds a portable PackedGoldilocksScalar type to the goldilocks_avx module — a scalar fallback that provides the same packed-arithmetic API as PackedGoldilocksAVX2 but uses element-wise FieldElement operations instead of SIMD intrinsics. A PackedGoldilocks type alias is added that auto-selects PackedGoldilocksAVX2 on x86_64+AVX2 and PackedGoldilocksScalar elsewhere, and the #[cfg] gate that previously hid the goldilocks_avx module from non-x86_64 targets is removed.

The core arithmetic is correct: all operations delegate to the well-tested FieldElement<Goldilocks64Field> impls.
The new PackedGoldilocks alias cleanly unifies the two implementations behind one name for downstream users.
The PartialEq implementation uses a verbose hand-rolled comparison (IsField::eq on raw .value()) that is inconsistent with PackedGoldilocksAVX2, which simply compares self.0 == other.0; deriving PartialEq (or using the simpler form) would align the two types and also allow Hash to be derived, avoiding a future clippy lint.
The unit tests for the scalar path only exercise small, in-range values and miss arithmetic near the field boundary (overflow, underflow, near-modulus multiplication) — the AVX2 tests cover these cases and equivalent tests would increase confidence in the scalar path.
The module-level documentation and the name goldilocks_avx no longer fully describe the module's scope now that it compiles unconditionally and contains non-AVX code.

Confidence Score: 4/5

Safe to merge; all correctness concerns are style/test-coverage suggestions and no logic bugs were found.

The implementation is straightforward — arithmetic is entirely delegated to the existing, well-tested FieldElement type, so the risk of a correctness regression is low. The outstanding items (verbose PartialEq, missing Hash, sparse boundary tests, stale docs) are all P2 suggestions that do not affect correctness or runtime safety.

crates/math/src/field/fields/goldilocks_avx/scalar.rs — PartialEq and test coverage could be improved before the file grows further.

Important Files Changed

Filename	Overview
crates/math/src/field/fields/goldilocks_avx/scalar.rs	New file implementing the portable scalar fallback; API mirrors AVX2 but PartialEq is unnecessarily verbose and tests don't cover field-boundary edge cases.
crates/math/src/field/fields/goldilocks_avx/mod.rs	Wires in the scalar module and adds the PackedGoldilocks type alias; unconditional pub mod scalar is correct, but the module-level doc comment no longer reflects the scalar path.
crates/math/src/field/fields/mod.rs	Removes the x86_64/avx2/avx512 cfg gate so goldilocks_avx is always exposed; intentional and correct given the new scalar fallback.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Consumer imports goldilocks_avx] --> B{Target platform?}
    B -- "x86_64 + avx2 feature" --> C[PackedGoldilocks = PackedGoldilocksAVX2]
    B -- "All other platforms" --> D[PackedGoldilocks = PackedGoldilocksScalar]
    C --> E[AVX2 SIMD: 4 elements x 256-bit registers\navx2.rs]
    D --> F[Scalar fallback: 4 elements x element-wise ops\nscalar.rs NEW]
    E --> G[Shared API\nnew / from_u64_array / zero / one\nto_array / from_slice / store_to_slice\nsquare / Add / Sub / Mul / Neg / *Assign]
    F --> G
    H[avx512.rs] -- "x86_64 + avx512f" --> I[PackedGoldilocksAVX512\n8 elements x 512-bit]
    J[avx2_ext.rs] -- "x86_64 + avx2" --> K[PackedGoldilocksFp2AVX2\nPackedGoldilocksFp3AVX2]

Prompt To Fix All With AI

This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 79-83

Comment:
**Overly verbose `PartialEq`, inconsistent with `PackedGoldilocksAVX2`**

The scalar `PartialEq` calls `<Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value())` on raw field values, while the AVX2 sibling uses the much simpler `self.0 == other.0`. Since `FieldElement` already implements `PartialEq` correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types.

Consider aligning with the AVX2 pattern:

```suggestion
impl PartialEq for PackedGoldilocksScalar {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}
```

Additionally, since `FieldElement` derives `Hash`, and `PackedGoldilocksScalar` now has a manual `PartialEq`, `Hash` cannot be automatically derived. If someone tries to use `PackedGoldilocksScalar` as a hash-map key they'll get a compile error. Clippy will also fire the `derived_hash_with_manual_eq`-related lint. Either derive both (`#[derive(PartialEq, Eq)]` — which would make the manual impls unnecessary), or also add a manual `Hash` impl to keep the `a == b ⟹ hash(a) == hash(b)` contract explicit.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs
Line: 140-182

Comment:
**Tests only cover small values; no near-modulus boundary tests**

All five unit tests use small, comfortably-in-range integers (`[1,2,3,4]`, `[10,20,30,40]`, etc.). The field modulus for Goldilocks is `p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001`. The scalar implementation delegates to `FieldElement` arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for:

- Addition overflow: `(p-1) + 1` should equal `0`
- Subtraction underflow: `0 - 1` should equal `p-1`
- Multiplication near-modulus: `(p-1) * (p-1)` should equal `1`
- Negation: `-(p-1)` should equal `1`

The AVX2 module includes exactly these cases (`test_avx2_overflow_handling`, `test_avx2_sub_underflow`, `test_avx2_neg`, `test_avx2_mul_near_modulus`). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/math/src/field/fields/goldilocks_avx/mod.rs
Line: 55-56

Comment:
**Module name `goldilocks_avx` is now misleading**

The containing module (`crates/math/src/field/fields/goldilocks_avx/`) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing `goldilocks_avx::PackedGoldilocks` on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware.

While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of `mod.rs` should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the `#![doc]` block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "Add portable scalar fallback for packed ..." | Re-trigger Greptile}

greptile-apps · 2026-04-02T14:54:38Z

+impl PartialEq for PackedGoldilocksScalar {
+    fn eq(&self, other: &Self) -> bool {
+        (0..WIDTH).all(|i| <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()))
+    }
+}


Overly verbose PartialEq, inconsistent with PackedGoldilocksAVX2

The scalar PartialEq calls <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()) on raw field values, while the AVX2 sibling uses the much simpler self.0 == other.0. Since FieldElement already implements PartialEq correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types.

Consider aligning with the AVX2 pattern:

Suggested change

impl PartialEq for PackedGoldilocksScalar {

fn eq(&self, other: &Self) -> bool {

(0..WIDTH).all(|i| <Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value()))

}

}

impl PartialEq for PackedGoldilocksScalar {

fn eq(&self, other: &Self) -> bool {

self.0 == other.0

}

}

Additionally, since FieldElement derives Hash, and PackedGoldilocksScalar now has a manual PartialEq, Hash cannot be automatically derived. If someone tries to use PackedGoldilocksScalar as a hash-map key they'll get a compile error. Clippy will also fire the derived_hash_with_manual_eq-related lint. Either derive both (#[derive(PartialEq, Eq)] — which would make the manual impls unnecessary), or also add a manual Hash impl to keep the a == b ⟹ hash(a) == hash(b) contract explicit.

Prompt To Fix With AI

This is a comment left during a code review. Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs Line: 79-83 Comment: **Overly verbose `PartialEq`, inconsistent with `PackedGoldilocksAVX2`** The scalar `PartialEq` calls `<Goldilocks64Field as IsField>::eq(self.0[i].value(), other.0[i].value())` on raw field values, while the AVX2 sibling uses the much simpler `self.0 == other.0`. Since `FieldElement` already implements `PartialEq` correctly (using canonical comparison), both paths are equivalent for well-formed elements — but the verbose form is harder to read and creates a maintenance gap between the two types. Consider aligning with the AVX2 pattern: ```suggestion impl PartialEq for PackedGoldilocksScalar { fn eq(&self, other: &Self) -> bool { self.0 == other.0 } } ``` Additionally, since `FieldElement` derives `Hash`, and `PackedGoldilocksScalar` now has a manual `PartialEq`, `Hash` cannot be automatically derived. If someone tries to use `PackedGoldilocksScalar` as a hash-map key they'll get a compile error. Clippy will also fire the `derived_hash_with_manual_eq`-related lint. Either derive both (`#[derive(PartialEq, Eq)]` — which would make the manual impls unnecessary), or also add a manual `Hash` impl to keep the `a == b ⟹ hash(a) == hash(b)` contract explicit. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-04-02T14:54:39Z

+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn scalar_add_works() {
+        let a = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
+        let b = PackedGoldilocksScalar::from_u64_array([10, 20, 30, 40]);
+        let c = a + b;
+        assert_eq!(c, PackedGoldilocksScalar::from_u64_array([11, 22, 33, 44]));
+    }
+
+    #[test]
+    fn scalar_mul_works() {
+        let a = PackedGoldilocksScalar::from_u64_array([2, 3, 4, 5]);
+        let b = PackedGoldilocksScalar::from_u64_array([10, 10, 10, 10]);
+        let c = a * b;
+        assert_eq!(c, PackedGoldilocksScalar::from_u64_array([20, 30, 40, 50]));
+    }
+
+    #[test]
+    fn scalar_sub_works() {
+        let a = PackedGoldilocksScalar::from_u64_array([10, 20, 30, 40]);
+        let b = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
+        let c = a - b;
+        assert_eq!(c, PackedGoldilocksScalar::from_u64_array([9, 18, 27, 36]));
+    }
+
+    #[test]
+    fn scalar_square_works() {
+        let a = PackedGoldilocksScalar::from_u64_array([3, 4, 5, 6]);
+        let b = a.square();
+        assert_eq!(b, PackedGoldilocksScalar::from_u64_array([9, 16, 25, 36]));
+    }
+
+    #[test]
+    fn scalar_neg_works() {
+        let a = PackedGoldilocksScalar::from_u64_array([1, 2, 3, 4]);
+        let b = -a;
+        let c = a + b;
+        assert_eq!(c, PackedGoldilocksScalar::zero());
+    }
+}


Tests only cover small values; no near-modulus boundary tests

All five unit tests use small, comfortably-in-range integers ([1,2,3,4], [10,20,30,40], etc.). The field modulus for Goldilocks is p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001. The scalar implementation delegates to FieldElement arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for:

Addition overflow: (p-1) + 1 should equal 0

Subtraction underflow: 0 - 1 should equal p-1

Multiplication near-modulus: (p-1) * (p-1) should equal 1

Negation: -(p-1) should equal 1

The AVX2 module includes exactly these cases (test_avx2_overflow_handling, test_avx2_sub_underflow, test_avx2_neg, test_avx2_mul_near_modulus). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path.

Prompt To Fix With AI

This is a comment left during a code review. Path: crates/math/src/field/fields/goldilocks_avx/scalar.rs Line: 140-182 Comment: **Tests only cover small values; no near-modulus boundary tests** All five unit tests use small, comfortably-in-range integers (`[1,2,3,4]`, `[10,20,30,40]`, etc.). The field modulus for Goldilocks is `p = 2^64 − 2^32 + 1 = 0xFFFF_FFFF_0000_0001`. The scalar implementation delegates to `FieldElement` arithmetic which handles wrapping, but there are no tests verifying that wrapping behaviour is correct for: - Addition overflow: `(p-1) + 1` should equal `0` - Subtraction underflow: `0 - 1` should equal `p-1` - Multiplication near-modulus: `(p-1) * (p-1)` should equal `1` - Negation: `-(p-1)` should equal `1` The AVX2 module includes exactly these cases (`test_avx2_overflow_handling`, `test_avx2_sub_underflow`, `test_avx2_neg`, `test_avx2_mul_near_modulus`). Adding equivalent tests to the scalar module — or a cross-check test comparing scalar against AVX2 results for boundary inputs — would give stronger correctness confidence for the fallback path. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-04-02T14:54:40Z

+// Portable scalar fallback (always available)
+pub mod scalar;


Module name goldilocks_avx is now misleading

The containing module (crates/math/src/field/fields/goldilocks_avx/) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing goldilocks_avx::PackedGoldilocks on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware.

While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of mod.rs should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the #![doc] block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback.

Prompt To Fix With AI

This is a comment left during a code review. Path: crates/math/src/field/fields/goldilocks_avx/mod.rs Line: 55-56 Comment: **Module name `goldilocks_avx` is now misleading** The containing module (`crates/math/src/field/fields/goldilocks_avx/`) was previously only compiled on x86_64 with AVX features and its name accurately described its contents. Now that it is unconditionally compiled and includes a portable scalar fallback, a consumer importing `goldilocks_avx::PackedGoldilocks` on an ARM or RISC-V target will encounter a name that implies x86 SIMD hardware. While renaming the module is a separate, potentially larger change, at minimum the module-level doc comment at the top of `mod.rs` should be updated to reflect that it now covers both SIMD and scalar code paths — especially since the `#![doc]` block on lines 1–42 only describes AVX2/AVX-512 and says nothing about the scalar fallback. How can I resolve this? If you propose a fix, please make it concise.

…pdate module docs

MavenRain requested a review from a team as a code owner April 2, 2026 14:51

greptile-apps Bot reviewed Apr 2, 2026

View reviewed changes

Address P2 review feedback: simplify PartialEq, add boundary tests, u…

cf56b10

…pdate module docs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add portable scalar fallback for packed Goldilocks#1198

Add portable scalar fallback for packed Goldilocks#1198
MavenRain wants to merge 2 commits into
lambdaclass:mainfrom
MavenRain:feat/goldilocks-scalar-fallback

MavenRain commented Apr 2, 2026

Uh oh!

greptile-apps Bot commented Apr 2, 2026

Important Files Changed

Uh oh!

greptile-apps Bot Apr 2, 2026

Uh oh!

greptile-apps Bot Apr 2, 2026

Uh oh!

greptile-apps Bot Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		// Portable scalar fallback (always available)
		pub mod scalar;

Conversation

MavenRain commented Apr 2, 2026

Add portable scalar fallback for packed Goldilocks

Description

Checklist

Uh oh!

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant