implement pedantic codec validations via minicbor context #707

rrruko · 2025-10-15T17:17:40Z

This keeps the default permissive behavior of the codecs while providing the option to enable pedantically node-conformant validations by using minicbor::decode_with(..., BasicStrictContext::new()).

Summary by CodeRabbit

New Features
- Stricter, schema-validated decoding for values and transaction bodies, rejecting zero or empty amounts and duplicate/invalid fields.
- Non-empty collections for multi-asset, mint and withdrawals to ensure meaningful data and clearer validation errors.
- Field-based transaction body validation with helpers to build and validate core transaction fields.
Refactor
- Mint handling and value encoding/decoding rewritten to enforce non-empty, non-zero assets and improve robustness.

coderabbitai · 2025-10-15T17:17:51Z

Walkthrough

Adds strict decoding/validation for PositiveCoin; introduces generic Multiasset and non-empty wrappers (NonEmptyMap, NonEmptyMultiasset) and rewrites Value/TransactionBody encoding/decoding for Conway types; updates txbuilder and validation call sites to use the new multiasset/non-empty types.

Changes

Cohort / File(s)	Summary of Changes
PositiveCoin decoding `pallas-codec/src/utils.rs`	Removed `Decode` derive for `PositiveCoin`; added manual `minicbor::Decode` impl that rejects zero and returns a decode error.
Conway primitives refactor `pallas-primitives/src/conway/model.rs`	Added `Multiasset<A>` wrapper around maps and `NonEmptyMap` / `NonEmptyMultiasset<T>` types with Encode/Decode and Deref; replaced `Mint`, `Withdrawals`, and `VotingProcedures` to use non-empty wrappers; replaced macro-based Value/TxOutput codecs with explicit `Encode`/`Decode`; introduced `ValidationContext`, `Strict<T>`/`StrictVerbose<T>` validation wrappers, `TxBodyField`/`TxBodyFields` staging and field-by-field decode logic, and various helpers enforcing non-empty/non-zero constraints and duplicate-field checks.
TxBuilder mint handling `pallas-txbuilder/src/conway.rs`	Construct mint directly with `NonEmptyMultiasset::from_multiasset(...)`; updated imports and iteration to use `NonEmptyMultiasset<NonZeroInt>`.
Validation mint handling `pallas-validate/src/phase1/conway.rs`	When minted value exists, pass `m.clone().to_multiasset()` (a `Multiasset`) into `conway_add_minted_non_zero` instead of the previous mint type.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CBOR as CBOR
  participant Dec as minicbor::Decoder
  participant PC as PositiveCoin
  participant MA as Multiasset/Strict
  participant TB as TxBodyFields
  CBOR->>Dec: bytes
  Dec->>PC: decode PositiveCoin
  alt decoded == 0
    PC-->>Dec: Err(Decode) (invalid zero)
  else
    PC-->>Dec: PositiveCoin(value>0)
  end
  Dec->>MA: decode Multiasset/Value via Strict wrapper
  MA->>MA: validate non-empty/non-zero/size
  Dec->>TB: decode TransactionBody fields (indexed)
  TB->>TB: stage fields, check duplicates, apply strict validation
  TB-->>Dec: validated TransactionBody

sequenceDiagram
  autonumber
  participant App as TxBuilder
  participant MA as Multiasset<NonZeroInt>
  participant NE as NonEmptyMultiasset<NonZeroInt>
  App->>MA: collect mint entries
  MA-->>App: Multiasset map
  App->>NE: NonEmptyMultiasset::from_multiasset(MA.into_iter().collect())
  NE-->>App: NonEmptyMultiasset (or error if empty)
  App->>App: attach mint to TransactionBody

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Areas to pay extra attention:

TxBodyFields staged decoding and duplicate-field handling.
NonEmptyMap / NonEmptyMultiasset Encode/Decode and conversion helpers.
Manual Decode for PositiveCoin and any call sites assuming derived behavior.
Value and TransactionOutput explicit codecs replacing macro routing.

Possibly related PRs

fix(validate): contemplate burns in value preservation checks #688 — Changes to conway value-preservation paths and handling of minted multiasset data; overlaps call sites and argument shapes.
fix(validate): preservation of value fails when output asset goes to zero #698 — Changes to multiasset representation and zero-value cleanup logic that are directly related to NonEmptyMultiasset/ Multiasset handling.

Suggested reviewers

scarmuega

Poem

I thump my paws at zeros’ fate—
No empty bags shall cross this gate.
Coins hop high, assets align,
Maps packed tight in ordered line.
Mint made stout, the checks now sing— 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 47.69% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "implement pedantic codec validations via minicbor context" accurately captures the primary changes in the pull request. The changeset introduces a ValidationContext trait with concrete implementations (AccumulatingContext, TerminatingContext), strict validation wrappers (Strict, StrictVerbose), non-empty constraint types (NonEmptyMap, NonEmptyMultiasset), and explicit Decode implementations with validation logic across multiple modules. The title is specific, clear, and directly reflects the main objective of enabling stricter, node-conformant codec validations through minicbor's context mechanism while maintaining backward-compatible permissive defaults.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (5)

pallas-validate/src/phase1/conway.rs (1)
368-369: Avoid cloning mint just to borrow Multiasset

m.clone().to_multiasset() allocates a full clone. Prefer borrowing the inner Multiasset to avoid copying.

Recommended:

Add as_multiasset(&self) -> &Multiasset<T> on NonEmptyMultiasset<T>.

Use it here to pass a borrowed &Multiasset<_>.

Apply this diff locally:
-        input = conway_add_minted_non_zero(&input, &m.clone().to_multiasset(), &PostAlonzo(NegativeValue))?;
+        // borrow mint instead of cloning
+        input = conway_add_minted_non_zero(&input, m.as_multiasset(), &PostAlonzo(NegativeValue))?;
And add this helper in pallas-primitives/src/conway/model.rs within impl<A> NonEmptyMultiasset<A>:
 impl<A> NonEmptyMultiasset<A> {
+    pub fn as_multiasset(&self) -> &Multiasset<A> {
+        &self.asset
+    }
     pub fn from_multiasset(ma: Multiasset<A>) -> Option<Self> {
         if ma.is_empty() {
             None
         } else {
             Some(NonEmptyMultiasset {
                 asset: ma,
             })
         }
     }

     pub fn to_multiasset(self) -> Multiasset<A> {
         self.asset
     }
 }
pallas-txbuilder/src/conway.rs (1)
157-161: Minor: simplify iterator type annotation

The explicit type on x isn’t needed; inference is clear from mint.iter(). Consider:
-            .flat_map(|x: &pallas_primitives::conway::NonEmptyMultiasset<NonZeroInt>| x.iter())
+            .flat_map(|x| x.iter())
pallas-primitives/src/conway/model.rs (3)
30-65: StrictContext scaffolding is solid; consider Default

The context/error accumulator is straightforward. Optionally derive Default for BasicStrictContext to ease construction alongside new().

945-957: NonEmptyMultiasset: add a borrow accessor

To avoid cloning in downstream callers (e.g., validation), expose a borrow:
pub fn as_multiasset(&self) -> &Multiasset<T> { &self.asset }
This enables borrowing the underlying Multiasset where a reference is expected.

Also applies to: 967-981

1358-1394: context_bound on Block/Tx: consider Strict wrappers

Adding #[cbor(context_bound = "StrictContext")] is useful but won’t surface errors from ctx.push_error unless a Strict wrapper is used. Consider adding StrictBlock/StrictTx wrappers mirroring StrictTransactionBody/StrictWitnessSet to fail fast in pedantic decodes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d34b143 and 937f2cf.

📒 Files selected for processing (4)

pallas-codec/src/utils.rs (3 hunks)
pallas-primitives/src/conway/model.rs (12 hunks)
pallas-txbuilder/src/conway.rs (3 hunks)
pallas-validate/src/phase1/conway.rs (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (4)

pallas-validate/src/phase1/conway.rs (2)

pallas-txbuilder/src/transaction/model.rs (1)

input (58-63)

pallas-validate/src/utils.rs (1)

conway_add_minted_non_zero (334-355)

pallas-primitives/src/conway/model.rs (1)

pallas-codec/src/utils.rs (19)

decode (22-33)

decode (148-161)

decode (326-345)

decode (420-430)

decode (495-501)

decode (537-543)

decode (593-597)

decode (631-637)

decode (672-685)

decode (758-769)

decode (849-866)

decode (896-917)

decode (1020-1026)

decode (1076-1084)

decode (1173-1183)

decode (1284-1296)

d (151-151)

d (329-329)

map (1326-1336)

pallas-txbuilder/src/conway.rs (2)

pallas-validate/src/phase2/script_context.rs (1)

mint (405-411)

pallas-primitives/src/conway/model.rs (1)

from_multiasset (968-976)

pallas-codec/src/utils.rs (1)

pallas-primitives/src/conway/model.rs (13)

decode (94-111)

decode (123-135)

decode (149-161)

decode (194-210)

decode (222-234)

decode (365-388)

decode (690-717)

decode (818-854)

decode (866-878)

decode (909-915)

decode (950-956)

decode (1123-1137)

decode (1264-1276)

🔇 Additional comments (10)

pallas-codec/src/utils.rs (2)

861-865: Good: enforce NonEmptySet invariant at decode

Rejecting empty sets here is correct and aligns with NonEmpty semantics used across Conway types and tests.

1001-1005: PositiveCoin: manual Decode with non‑zero validation

Removing the derived Decode and introducing a custom Decode that errors on zero is the right call. It guarantees Zero is rejected at the codec boundary and integrates cleanly with stricter Multiasset decoders.

Also applies to: 1019-1027

pallas-txbuilder/src/conway.rs (1)

71-72: Mint construction via NonEmptyMultiasset looks correct

Collecting into Multiasset and wrapping with NonEmptyMultiasset::from_multiasset matches the new primitives. No issues.

pallas-primitives/src/conway/model.rs (7)

66-112: Multiasset strict decode + size check

Decoding into a BTreeMap and pushing:

empty policy errors, and

a size‑cap error via is_multiasset_small_enough
works well with the Strict wrappers. Relying on the A decoder (e.g., PositiveCoin/NonZeroInt) to enforce non‑zero amounts is appropriate and keeps this generic.

190-211: Value encode/decode aligns with node semantics

Encoding as coin or 2‑tuple and supporting both definite/indef arrays during decode matches Conway’s representation. Tests cover zero‑coin and empty MA cases. LGTM.

505-567: TransactionBody: custom decode validates required fields

Requiring inputs, outputs, and fee; capturing duplicates via ctx.push_error; and wiring strict types (NonEmptyMap/Multiasset) provides the intended pedantic behavior when used with StrictTransactionBody. Good direction.

678-679: Unknown txbody fields: confirm intended strictness

Decoding errors on unknown keys unconditionally (not just via the Strict wrappers). If backward compatibility with future fields is needed in permissive mode, consider pushing to ctx instead of returning Err, leaving the hard error to StrictTransactionBody.

903-916: NonEmptyMap: push error (not Err) is consistent with Strict wrappers

This lets permissive decodes proceed while StrictTransactionBody turns it into a hard error. Good balance.

1100-1117: TransactionOutput: datatype‑based decode is correct

Switching by CBOR datatype (array→legacy, map→post‑alonzo) is robust and mirrors long‑standing traversal logic.

Also applies to: 1119-1138

1398-1411: is_multiasset_small_enough: matches node behavior

The size heuristic and limit are in line with the Haskell node. Tests exercise the failure path. LGTM.

Quantumplation · 2025-10-17T06:53:46Z

pallas-primitives/src/conway/model.rs

+use pallas_codec::minicbor::data::Type;

-pub type Mint = Multiasset<NonZeroInt>;
+trait StrictContext {


Some suggestions:

If we name this something more general, like ValidationContext,

and have the push_error should return a Result that we can ?;

Then we can have a NoopContext, which does nothing; an AccumulatingContext which accumulates things as errors, and a TerminatingContext, which returns an error.

Then, our Strict<T> implements minicbor::Decode, constructs a TerminatingContext, and deserializes T;

And we could have a StrictVerbose<T> that implements minicbor::Decode, constructs an AccumulatingContext, and then concatenates all the errors to return it's own at the end.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (5)

pallas-primitives/src/conway/model.rs (5)
46-76: Consider implementing Default for context types.

The AccumulatingContext and TerminatingContext types would benefit from Default implementations to make them more ergonomic to construct.

Apply this diff:
 impl AccumulatingContext {
     pub fn new() -> Self {
         AccumulatingContext {
             errors: vec![]
         }
     }
 }
+
+impl Default for AccumulatingContext {
+    fn default() -> Self {
+        Self::new()
+    }
+}

 impl TerminatingContext {
     pub fn new() -> Self {
         TerminatingContext {
         }
     }
 }
+
+impl Default for TerminatingContext {
+    fn default() -> Self {
+        Self::new()
+    }
+}
686-722: Consider performance impact of cloning TxBodyField entries.

The TxBodyFields decoder clones TxBodyField values on lines 707 and 714, and these are cloned again on lines 830 and 842 in the main decode logic. Since TxBodyField::Outputs contains Vec<TransactionOutput<'a>>, this could be expensive for transactions with many outputs.

Consider refactoring to avoid the intermediate clones, perhaps by:

Using references in the intermediate representation

Extracting fields without cloning

Using Option::take() to move values out

861-903: Consider providing public constructors for NonEmptyMap.

NonEmptyMap<K, V> can only be constructed via decoding, which limits its usefulness in code that constructs values programmatically (e.g., transaction builders).

Consider adding:
impl<K, V> NonEmptyMap<K, V> {
    pub fn new(map: BTreeMap<K, V>) -> Option<Self> {
        if map.is_empty() {
            None
        } else {
            Some(NonEmptyMap { map })
        }
    }
}
Similar constructors would be useful for NonEmptyMultiasset (which has from_multiasset but it could be made public if it isn't already).

1353-1365: Extract magic numbers to named constants.

The hardcoded values 44, 28, and 65535 should be extracted as named constants with documentation explaining their origin (likely CBOR encoding sizes).

Apply this diff:
+/// Size in bytes of a single asset entry in CBOR encoding
+const PER_ASSET_SIZE: usize = 44;
+/// Size in bytes of a policy entry in CBOR encoding  
+const PER_POLICY_SIZE: usize = 28;
+/// Maximum allowed size for a multiasset value in bytes
+const MAX_MULTIASSET_SIZE: usize = 65535;
+
 fn is_multiasset_small_enough<T>(ma: &Multiasset<T>) -> bool {
-    let per_asset_size = 44;
-    let per_policy_size = 28;
-
     let policy_count = ma.0.len();
     let mut asset_count = 0;
     for assets in ma.0.values() {
         asset_count += assets.len();
     }
 
-    let size = per_asset_size * asset_count + per_policy_size * policy_count;
-    size <= 65535
+    let size = PER_ASSET_SIZE * asset_count + PER_POLICY_SIZE * policy_count;
+    size <= MAX_MULTIASSET_SIZE
 }
1298-1305: Track the FIXME for AuxiliaryData type inconsistency.

The FIXME comment identifies a legitimate issue where the re-exported AuxiliaryData from the alonzo module doesn't support Plutus V2/V3 scripts, despite PostAlonzoAuxiliaryData being defined in this file with full support.

This technical debt should be tracked.

Would you like me to open an issue to track this work? The fix would involve either:

Making AuxiliaryData a type parameter in upstream types

Defining separate Block/Tx types for Conway that use the correct AuxiliaryData variant

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 937f2cf and 11de009.

📒 Files selected for processing (1)

pallas-primitives/src/conway/model.rs (11 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

pallas-primitives/src/conway/model.rs (1)

pallas-codec/src/utils.rs (18)

decode (22-33)

decode (148-161)

decode (326-345)

decode (420-430)

decode (495-501)

decode (537-543)

decode (593-597)

decode (631-637)

decode (672-685)

decode (758-769)

decode (849-866)

decode (896-917)

decode (1020-1026)

decode (1076-1084)

decode (1173-1183)

decode (1284-1296)

d (151-151)

d (329-329)

🔇 Additional comments (3)

pallas-primitives/src/conway/model.rs (3)

98-109: Verify that ignoring the input context is intentional.

The Strict<T> decoder creates its own TerminatingContext and completely ignores the input context _ctx. This means validation context is not threaded through when decoding nested Strict<T> types.

Is this the intended behavior? If so, consider adding a doc comment explaining that Strict<T> establishes a new validation scope.

The same applies to StrictVerbose<T> at lines 122-139.

166-188: LGTM on Multiasset validation logic.

The custom decoder properly enforces Conway-era constraints:

Non-empty policies (lines 176-180)

Size limits via is_multiasset_small_enough (lines 183-185)

The comment on lines 173-175 raises a good point about potentially making Multiasset monomorphic rather than generic, but that's a design decision that can be deferred.

1374-1745: Excellent test coverage for Conway validation rules.

The test suite comprehensively covers:

Value encoding variants and edge cases

Empty collection rejection for all relevant fields

Zero value rejection for PositiveCoin and NonZeroInt

Duplicate field detection in transaction bodies

Size limits for multiassets

Set tagging variants for witness sets

This provides strong confidence in the strict validation implementation.

yHSJ

I feel like I'd have to try out this API to really get an opinion on whether or not I love it.

My immediate concerns are:

If we unwrap a Strict wrapper (I don't like using that function name, feels dirty to me), I lose any "proof" that I can trust the data inside it instead of revalidating stuff.
Currently, the pallas decoders fail sometimes (when the data is in the wrong format, for example, but not "business logic" validations). How would this change differentiate between business logic and fundamental issues with the data?

But really, I think maybe we should just give it a try and see how it feels as an API.

Note: I have not reviewed the actual logic of the decoders, I am simply commenting on the usability of the API

cc: @scarmuega @KtorZ

rrruko added 10 commits August 28, 2025 16:59

add some tests for cbor decoding

bdb3281

add witness set decoder tests

4198d79

add aux data tests

a7e5547

use correct AuxiliaryData type

78adb47

fix some tests

c5ba3cd

transaction body decoder tests

deecbc1

delete commented code

7a1a9a4

use is_some_and

4f85d34

make multiasset wrapper transparent to fix pallas-traverse build

8d2fbe2

implement pedantic validations as minicbor context

937f2cf

coderabbitai bot reviewed Oct 15, 2025

View reviewed changes

Quantumplation reviewed Oct 17, 2025

View reviewed changes

implement generic Strict/StrictVerbose wrappers

11de009

coderabbitai bot reviewed Oct 29, 2025

View reviewed changes

yHSJ reviewed Nov 7, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

implement pedantic codec validations via minicbor context #707

implement pedantic codec validations via minicbor context #707

Uh oh!

rrruko commented Oct 15, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 15, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Quantumplation Oct 17, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

yHSJ left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

implement pedantic codec validations via minicbor context #707

Are you sure you want to change the base?

implement pedantic codec validations via minicbor context #707

Uh oh!

Conversation

rrruko commented Oct 15, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Quantumplation Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

yHSJ left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rrruko commented Oct 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 15, 2025 •

edited

Loading