-
Notifications
You must be signed in to change notification settings - Fork 84
implement pedantic codec validations via minicbor context #707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds strict decoding/validation for PositiveCoin; introduces generic Multiasset and non-empty wrappers (NonEmptyMap, NonEmptyMultiasset) and rewrites Value/TransactionBody encoding/decoding for Conway types; updates txbuilder and validation call sites to use the new multiasset/non-empty types. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CBOR as CBOR
participant Dec as minicbor::Decoder
participant PC as PositiveCoin
participant MA as Multiasset/Strict
participant TB as TxBodyFields
CBOR->>Dec: bytes
Dec->>PC: decode PositiveCoin
alt decoded == 0
PC-->>Dec: Err(Decode) (invalid zero)
else
PC-->>Dec: PositiveCoin(value>0)
end
Dec->>MA: decode Multiasset/Value via Strict wrapper
MA->>MA: validate non-empty/non-zero/size
Dec->>TB: decode TransactionBody fields (indexed)
TB->>TB: stage fields, check duplicates, apply strict validation
TB-->>Dec: validated TransactionBody
sequenceDiagram
autonumber
participant App as TxBuilder
participant MA as Multiasset<NonZeroInt>
participant NE as NonEmptyMultiasset<NonZeroInt>
App->>MA: collect mint entries
MA-->>App: Multiasset map
App->>NE: NonEmptyMultiasset::from_multiasset(MA.into_iter().collect())
NE-->>App: NonEmptyMultiasset (or error if empty)
App->>App: attach mint to TransactionBody
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Areas to pay extra attention:
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
pallas-validate/src/phase1/conway.rs (1)
368-369: Avoid cloning mint just to borrow Multiasset
m.clone().to_multiasset()allocates a full clone. Prefer borrowing the innerMultiassetto avoid copying.Recommended:
- Add
as_multiasset(&self) -> &Multiasset<T>onNonEmptyMultiasset<T>.- Use it here to pass a borrowed
&Multiasset<_>.Apply this diff locally:
- input = conway_add_minted_non_zero(&input, &m.clone().to_multiasset(), &PostAlonzo(NegativeValue))?; + // borrow mint instead of cloning + input = conway_add_minted_non_zero(&input, m.as_multiasset(), &PostAlonzo(NegativeValue))?;And add this helper in
pallas-primitives/src/conway/model.rswithinimpl<A> NonEmptyMultiasset<A>:impl<A> NonEmptyMultiasset<A> { + pub fn as_multiasset(&self) -> &Multiasset<A> { + &self.asset + } pub fn from_multiasset(ma: Multiasset<A>) -> Option<Self> { if ma.is_empty() { None } else { Some(NonEmptyMultiasset { asset: ma, }) } } pub fn to_multiasset(self) -> Multiasset<A> { self.asset } }pallas-txbuilder/src/conway.rs (1)
157-161: Minor: simplify iterator type annotationThe explicit type on
xisn’t needed; inference is clear frommint.iter(). Consider:- .flat_map(|x: &pallas_primitives::conway::NonEmptyMultiasset<NonZeroInt>| x.iter()) + .flat_map(|x| x.iter())pallas-primitives/src/conway/model.rs (3)
30-65: StrictContext scaffolding is solid; consider DefaultThe context/error accumulator is straightforward. Optionally derive
DefaultforBasicStrictContextto ease construction alongsidenew().
945-957: NonEmptyMultiasset: add a borrow accessorTo avoid cloning in downstream callers (e.g., validation), expose a borrow:
pub fn as_multiasset(&self) -> &Multiasset<T> { &self.asset }This enables borrowing the underlying
Multiassetwhere a reference is expected.Also applies to: 967-981
1358-1394: context_bound on Block/Tx: consider Strict wrappersAdding
#[cbor(context_bound = "StrictContext")]is useful but won’t surface errors fromctx.push_errorunless a Strict wrapper is used. Consider addingStrictBlock/StrictTxwrappers mirroringStrictTransactionBody/StrictWitnessSetto fail fast in pedantic decodes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
pallas-codec/src/utils.rs(3 hunks)pallas-primitives/src/conway/model.rs(12 hunks)pallas-txbuilder/src/conway.rs(3 hunks)pallas-validate/src/phase1/conway.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
pallas-validate/src/phase1/conway.rs (2)
pallas-txbuilder/src/transaction/model.rs (1)
input(58-63)pallas-validate/src/utils.rs (1)
conway_add_minted_non_zero(334-355)
pallas-primitives/src/conway/model.rs (1)
pallas-codec/src/utils.rs (19)
decode(22-33)decode(148-161)decode(326-345)decode(420-430)decode(495-501)decode(537-543)decode(593-597)decode(631-637)decode(672-685)decode(758-769)decode(849-866)decode(896-917)decode(1020-1026)decode(1076-1084)decode(1173-1183)decode(1284-1296)d(151-151)d(329-329)map(1326-1336)
pallas-txbuilder/src/conway.rs (2)
pallas-validate/src/phase2/script_context.rs (1)
mint(405-411)pallas-primitives/src/conway/model.rs (1)
from_multiasset(968-976)
pallas-codec/src/utils.rs (1)
pallas-primitives/src/conway/model.rs (13)
decode(94-111)decode(123-135)decode(149-161)decode(194-210)decode(222-234)decode(365-388)decode(690-717)decode(818-854)decode(866-878)decode(909-915)decode(950-956)decode(1123-1137)decode(1264-1276)
🔇 Additional comments (10)
pallas-codec/src/utils.rs (2)
861-865: Good: enforce NonEmptySet invariant at decodeRejecting empty sets here is correct and aligns with NonEmpty semantics used across Conway types and tests.
1001-1005: PositiveCoin: manual Decode with non‑zero validationRemoving the derived Decode and introducing a custom Decode that errors on zero is the right call. It guarantees Zero is rejected at the codec boundary and integrates cleanly with stricter Multiasset decoders.
Also applies to: 1019-1027
pallas-txbuilder/src/conway.rs (1)
71-72: Mint construction via NonEmptyMultiasset looks correctCollecting into
Multiassetand wrapping withNonEmptyMultiasset::from_multiassetmatches the new primitives. No issues.pallas-primitives/src/conway/model.rs (7)
66-112: Multiasset strict decode + size checkDecoding into a
BTreeMapand pushing:
- empty policy errors, and
- a size‑cap error via
is_multiasset_small_enough
works well with the Strict wrappers. Relying on theAdecoder (e.g.,PositiveCoin/NonZeroInt) to enforce non‑zero amounts is appropriate and keeps this generic.
190-211: Value encode/decode aligns with node semanticsEncoding as coin or 2‑tuple and supporting both definite/indef arrays during decode matches Conway’s representation. Tests cover zero‑coin and empty MA cases. LGTM.
505-567: TransactionBody: custom decode validates required fieldsRequiring inputs, outputs, and fee; capturing duplicates via
ctx.push_error; and wiring strict types (NonEmptyMap/Multiasset) provides the intended pedantic behavior when used withStrictTransactionBody. Good direction.
678-679: Unknown txbody fields: confirm intended strictnessDecoding errors on unknown keys unconditionally (not just via the Strict wrappers). If backward compatibility with future fields is needed in permissive mode, consider pushing to
ctxinstead of returningErr, leaving the hard error toStrictTransactionBody.
903-916: NonEmptyMap: push error (not Err) is consistent with Strict wrappersThis lets permissive decodes proceed while
StrictTransactionBodyturns it into a hard error. Good balance.
1100-1117: TransactionOutput: datatype‑based decode is correctSwitching by CBOR datatype (array→legacy, map→post‑alonzo) is robust and mirrors long‑standing traversal logic.
Also applies to: 1119-1138
1398-1411: is_multiasset_small_enough: matches node behaviorThe size heuristic and limit are in line with the Haskell node. Tests exercise the failure path. LGTM.
| use pallas_codec::minicbor::data::Type; | ||
|
|
||
| pub type Mint = Multiasset<NonZeroInt>; | ||
| trait StrictContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions:
- If we name this something more general, like
ValidationContext, - and have the push_error should return a Result that we can
?;
Then we can have a NoopContext, which does nothing; an AccumulatingContext which accumulates things as errors, and a TerminatingContext, which returns an error.
Then, our Strict<T> implements minicbor::Decode, constructs a TerminatingContext, and deserializes T;
And we could have a StrictVerbose<T> that implements minicbor::Decode, constructs an AccumulatingContext, and then concatenates all the errors to return it's own at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
pallas-primitives/src/conway/model.rs (5)
46-76: Consider implementingDefaultfor context types.The
AccumulatingContextandTerminatingContexttypes would benefit fromDefaultimplementations to make them more ergonomic to construct.Apply this diff:
impl AccumulatingContext { pub fn new() -> Self { AccumulatingContext { errors: vec![] } } } + +impl Default for AccumulatingContext { + fn default() -> Self { + Self::new() + } +} impl TerminatingContext { pub fn new() -> Self { TerminatingContext { } } } + +impl Default for TerminatingContext { + fn default() -> Self { + Self::new() + } +}
686-722: Consider performance impact of cloningTxBodyFieldentries.The
TxBodyFieldsdecoder clonesTxBodyFieldvalues on lines 707 and 714, and these are cloned again on lines 830 and 842 in the main decode logic. SinceTxBodyField::OutputscontainsVec<TransactionOutput<'a>>, this could be expensive for transactions with many outputs.Consider refactoring to avoid the intermediate clones, perhaps by:
- Using references in the intermediate representation
- Extracting fields without cloning
- Using
Option::take()to move values out
861-903: Consider providing public constructors forNonEmptyMap.
NonEmptyMap<K, V>can only be constructed via decoding, which limits its usefulness in code that constructs values programmatically (e.g., transaction builders).Consider adding:
impl<K, V> NonEmptyMap<K, V> { pub fn new(map: BTreeMap<K, V>) -> Option<Self> { if map.is_empty() { None } else { Some(NonEmptyMap { map }) } } }Similar constructors would be useful for
NonEmptyMultiasset(which hasfrom_multiassetbut it could be made public if it isn't already).
1353-1365: Extract magic numbers to named constants.The hardcoded values
44,28, and65535should be extracted as named constants with documentation explaining their origin (likely CBOR encoding sizes).Apply this diff:
+/// Size in bytes of a single asset entry in CBOR encoding +const PER_ASSET_SIZE: usize = 44; +/// Size in bytes of a policy entry in CBOR encoding +const PER_POLICY_SIZE: usize = 28; +/// Maximum allowed size for a multiasset value in bytes +const MAX_MULTIASSET_SIZE: usize = 65535; + fn is_multiasset_small_enough<T>(ma: &Multiasset<T>) -> bool { - let per_asset_size = 44; - let per_policy_size = 28; - let policy_count = ma.0.len(); let mut asset_count = 0; for assets in ma.0.values() { asset_count += assets.len(); } - let size = per_asset_size * asset_count + per_policy_size * policy_count; - size <= 65535 + let size = PER_ASSET_SIZE * asset_count + PER_POLICY_SIZE * policy_count; + size <= MAX_MULTIASSET_SIZE }
1298-1305: Track the FIXME for AuxiliaryData type inconsistency.The FIXME comment identifies a legitimate issue where the re-exported
AuxiliaryDatafrom the alonzo module doesn't support Plutus V2/V3 scripts, despitePostAlonzoAuxiliaryDatabeing defined in this file with full support.This technical debt should be tracked.
Would you like me to open an issue to track this work? The fix would involve either:
- Making AuxiliaryData a type parameter in upstream types
- Defining separate Block/Tx types for Conway that use the correct AuxiliaryData variant
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pallas-primitives/src/conway/model.rs(11 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pallas-primitives/src/conway/model.rs (1)
pallas-codec/src/utils.rs (18)
decode(22-33)decode(148-161)decode(326-345)decode(420-430)decode(495-501)decode(537-543)decode(593-597)decode(631-637)decode(672-685)decode(758-769)decode(849-866)decode(896-917)decode(1020-1026)decode(1076-1084)decode(1173-1183)decode(1284-1296)d(151-151)d(329-329)
🔇 Additional comments (3)
pallas-primitives/src/conway/model.rs (3)
98-109: Verify that ignoring the input context is intentional.The
Strict<T>decoder creates its ownTerminatingContextand completely ignores the input context_ctx. This means validation context is not threaded through when decoding nestedStrict<T>types.Is this the intended behavior? If so, consider adding a doc comment explaining that
Strict<T>establishes a new validation scope.The same applies to
StrictVerbose<T>at lines 122-139.
166-188: LGTM on Multiasset validation logic.The custom decoder properly enforces Conway-era constraints:
- Non-empty policies (lines 176-180)
- Size limits via
is_multiasset_small_enough(lines 183-185)The comment on lines 173-175 raises a good point about potentially making Multiasset monomorphic rather than generic, but that's a design decision that can be deferred.
1374-1745: Excellent test coverage for Conway validation rules.The test suite comprehensively covers:
- Value encoding variants and edge cases
- Empty collection rejection for all relevant fields
- Zero value rejection for PositiveCoin and NonZeroInt
- Duplicate field detection in transaction bodies
- Size limits for multiassets
- Set tagging variants for witness sets
This provides strong confidence in the strict validation implementation.
yHSJ
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like I'd have to try out this API to really get an opinion on whether or not I love it.
My immediate concerns are:
- If we
unwrapa Strict wrapper (I don't like using that function name, feels dirty to me), I lose any "proof" that I can trust the data inside it instead of revalidating stuff. - Currently, the pallas decoders fail sometimes (when the data is in the wrong format, for example, but not "business logic" validations). How would this change differentiate between business logic and fundamental issues with the data?
But really, I think maybe we should just give it a try and see how it feels as an API.
Note: I have not reviewed the actual logic of the decoders, I am simply commenting on the usability of the API
cc: @scarmuega @KtorZ
This keeps the default permissive behavior of the codecs while providing the option to enable pedantically node-conformant validations by using
minicbor::decode_with(..., BasicStrictContext::new()).See also #697, pragma-org/amaru#494
Summary by CodeRabbit
New Features
Refactor