IDL Support for Custom Vector Length Discriminators #4076

@mgild

Problem

Anchor IDLs currently cannot describe custom vectors that use a 1-byte or 2-byte length prefix (length discriminator). The IDL specification assumes that every vector uses the standard 4-byte (u32) length prefix, which is the default Borsh encoding for Vec<T>.

However, many on-chain programs optimize for account size by using custom vector implementations with smaller length discriminators:

  • 1-byte length (u8): for vectors with max 255 elements
  • 2-byte length (u16): for vectors with max 65,535 elements

This creates a mismatch between the program's actual serialization format and what the IDL describes, breaking client-side deserialization.
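To make the mismatch concrete, here is a minimal sketch in plain Rust (standard library only). The helper names `encode_u8_prefixed` and `read_len_as_u32` are hypothetical and not part of Anchor or Borsh; the sketch assumes little-endian u64 elements, as Borsh uses.

```rust
/// Encode `items` as a 1-byte (u8) length prefix followed by little-endian u64s.
/// Hypothetical helper for illustration; panics if there are more than 255 items.
fn encode_u8_prefixed(items: &[u64]) -> Vec<u8> {
    assert!(items.len() <= u8::MAX as usize, "too many items for a u8 prefix");
    let mut out = Vec::with_capacity(1 + items.len() * 8);
    out.push(items.len() as u8);
    for item in items {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}

/// Read the first four bytes as a little-endian u32, which is what a decoder
/// following `"vec": "u64"` in the current IDL would treat as the length.
fn read_len_as_u32(data: &[u8]) -> Option<u32> {
    let b = data.get(..4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    let bytes = encode_u8_prefixed(&[1, 2, 3]);
    println!("encoded size: {} bytes", bytes.len());
    println!("length misread as u32: {:?}", read_len_as_u32(&bytes));
}
```

For the input `[1, 2, 3]` the payload is 25 bytes (1 prefix byte plus 3 × 8 element bytes). A decoder that assumes a u32 prefix combines the real length byte with the first three bytes of the first element and reads a length of 259, so every later field is decoded from the wrong offset.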

Current Workaround

Developers currently resort to one or more of the following:

  1. Manually implementing custom deserializers in client code
  2. Using byte arrays instead of vectors in the IDL (losing type information)
  3. Adding documentation comments that explain the actual format
  4. Leaving the IDL as-is and risking client-side deserialization errors

Proposed Solution

Add an optional `length` field inside the `vec` type object to specify the integer type of the length prefix:

IDL Schema Addition

Current IDL format:

{
  "name": "items",
  "type": {
    "vec": "u64"
  }
}

Proposed format with custom length:

{
  "name": "items",
  "type": {
    "vec": {
      "type": "u64",
      "length": "u8"
    }
  }
}

The `length` field would accept `"u8"`, `"u16"`, or `"u32"`. If omitted, it defaults to `"u32"`.

For backward compatibility, when `vec` has a simple string value (the current format), it defaults to a u32 length prefix.
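The defaulting rule could be modelled on the parsing side roughly as follows. This is only a sketch: `LengthPrefix` and `IdlVec` are hypothetical names, not types from the Anchor IDL crates, and the element type is simplified to a string even though real IDL types can be nested.

```rust
/// Integer type used for a vector's length prefix (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LengthPrefix {
    U8,
    U16,
    U32,
}

impl LengthPrefix {
    /// Parse the value of the proposed `length` field.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "u8" => Some(LengthPrefix::U8),
            "u16" => Some(LengthPrefix::U16),
            "u32" => Some(LengthPrefix::U32),
            _ => None,
        }
    }

    /// Number of bytes the prefix occupies on the wire.
    fn width(self) -> usize {
        match self {
            LengthPrefix::U8 => 1,
            LengthPrefix::U16 => 2,
            LengthPrefix::U32 => 4,
        }
    }
}

/// The two accepted shapes of `vec` (element type simplified to a string).
#[derive(Debug, Clone, PartialEq, Eq)]
enum IdlVec {
    /// Current format: `"vec": "u64"`
    Short { elem: String },
    /// Proposed format: `"vec": { "type": "u64", "length": "u8" }`
    Full { elem: String, length: Option<LengthPrefix> },
}

impl IdlVec {
    /// Both the string form and an omitted `length` fall back to u32.
    fn length_prefix(&self) -> LengthPrefix {
        match self {
            IdlVec::Short { .. } => LengthPrefix::U32,
            IdlVec::Full { length, .. } => length.unwrap_or(LengthPrefix::U32),
        }
    }
}

fn main() {
    let old = IdlVec::Short { elem: "u64".to_string() };
    let new = IdlVec::Full { elem: "u64".to_string(), length: LengthPrefix::parse("u8") };
    println!("{:?} -> {} prefix byte(s)", old, old.length_prefix().width());
    println!("{:?} -> {} prefix byte(s)", new, new.length_prefix().width());
}
```

Keeping the default in a single accessor means the rest of a decoder never needs to know which of the two shapes the IDL used.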

Example Use Case

Rust Program:

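use anchor_lang::prelude::*;
use std::marker::PhantomData;

#[account]
pub struct CompactData {
    pub items: CompactVec<u64, u8>,  // Custom vec with u8 length prefix
}

// Custom vec type: serialized as a length prefix of type L followed by the
// elements. It needs hand-written Borsh serialize/deserialize impls (not shown).
pub struct CompactVec<T, L> {
    _length_type: PhantomData<L>,  // Zero-sized marker for the prefix type
    pub inner: Vec<T>,
}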

Current IDL (broken):

{
  "name": "items",
  "type": {
    "vec": "u64"
  }
}

This implies a 4-byte length prefix, but the program actually writes a 1-byte prefix.

Proposed IDL (correct):

{
  "name": "items",
  "type": {
    "vec": {
      "type": "u64",
      "length": "u8"
    }
  }
}
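As a rough illustration of what `CompactVec<u64, u8>` could put on the wire, the sketch below serializes the vector with a prefix type chosen at compile time. The `LenPrefix` trait, the `new` constructor, and the `to_bytes` method are assumptions made for this example; a real program would implement the Borsh traits instead, and the sketch covers only `u64` elements.

```rust
use std::marker::PhantomData;

/// Hypothetical trait for integer types usable as a length prefix.
trait LenPrefix {
    /// Append `len` to `out` as a little-endian integer of this type,
    /// or fail if `len` does not fit.
    fn write_len(len: usize, out: &mut Vec<u8>) -> Result<(), String>;
}

macro_rules! impl_len_prefix {
    ($($t:ty),*) => {$(
        impl LenPrefix for $t {
            fn write_len(len: usize, out: &mut Vec<u8>) -> Result<(), String> {
                if len > <$t>::MAX as usize {
                    return Err(format!("length {} does not fit in {}", len, stringify!($t)));
                }
                out.extend_from_slice(&(len as $t).to_le_bytes());
                Ok(())
            }
        }
    )*};
}
impl_len_prefix!(u8, u16, u32);

/// Same shape as the struct in the issue.
pub struct CompactVec<T, L> {
    _length_type: PhantomData<L>,
    pub inner: Vec<T>,
}

impl<L: LenPrefix> CompactVec<u64, L> {
    pub fn new(inner: Vec<u64>) -> Self {
        CompactVec { _length_type: PhantomData, inner }
    }

    /// Length prefix of type `L`, then each element as a little-endian u64.
    pub fn to_bytes(&self) -> Result<Vec<u8>, String> {
        let mut out = Vec::new();
        L::write_len(self.inner.len(), &mut out)?;
        for item in &self.inner {
            out.extend_from_slice(&item.to_le_bytes());
        }
        Ok(out)
    }
}

fn main() {
    let compact = CompactVec::<u64, u8>::new(vec![7, 9]);
    let standard = CompactVec::<u64, u32>::new(vec![7, 9]);
    println!("u8 prefix:  {} bytes", compact.to_bytes().unwrap().len());
    println!("u32 prefix: {} bytes", standard.to_bytes().unwrap().len());
}
```

With two elements, the u8 variant takes 17 bytes and the u32 variant takes 20. The length check matters: a 256-element vector cannot be represented with a u8 prefix, so the sketch returns an error rather than truncating the count.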

Example with a Defined Element Type

{
  "name": "prices",
  "type": {
    "vec": {
      "type": {
        "defined": "PriceFeed"
      },
      "length": "u16"
    }
  }
}

Benefits

  1. Accurate type representation: IDL correctly reflects on-chain data layout
  2. Automatic client generation: Generated clients can correctly deserialize compact vectors
  3. Better DX: Developers don't need manual deserialization workarounds
  4. Account size optimization: Explicitly documents and supports space-efficient patterns
  5. Type safety: Maintains full type information instead of falling back to byte arrays
  6. Consistent schema: The `length` field lives within the `vec` object, matching the existing IDL structure

Alternatives Considered

  1. Use byte arrays: Loses type information and requires manual parsing
  2. Custom IDL types: Too verbose, doesn't integrate with existing tooling
  3. Documentation only: Doesn't solve client-side deserialization issues
  4. Top-level field: Putting `length` outside `vec` breaks the existing schema pattern

Implementation Notes

This would require updates to:

  • IDL specification/schema
  • IDL code generation (anchor-lang)
  • Client libraries (Anchor TS, Rust client)
  • Serialization/deserialization logic
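On the deserialization side, the change might amount to reading the prefix with a width taken from the IDL instead of a hard-coded 4 bytes. The sketch below is an assumption about how such a decoder could look, not Anchor's actual client code; `read_len` and `decode_u64_vec` are hypothetical names, and only `u64` elements are handled.

```rust
/// Read a little-endian length prefix of `width` bytes (1, 2, or 4).
/// Returns the length and the remaining bytes, or None on bad input.
fn read_len(data: &[u8], width: usize) -> Option<(usize, &[u8])> {
    if !matches!(width, 1 | 2 | 4) || data.len() < width {
        return None;
    }
    let (prefix, rest) = data.split_at(width);
    // Zero-extend the prefix into 8 bytes so one conversion covers all widths.
    let mut buf = [0u8; 8];
    buf[..width].copy_from_slice(prefix);
    Some((u64::from_le_bytes(buf) as usize, rest))
}

/// Decode a vector of u64 whose prefix width comes from the IDL.
/// Returns the elements and the total number of bytes consumed.
fn decode_u64_vec(data: &[u8], width: usize) -> Option<(Vec<u64>, usize)> {
    let (len, rest) = read_len(data, width)?;
    let byte_len = len.checked_mul(8)?;
    let body = rest.get(..byte_len)?;
    let items = body
        .chunks_exact(8)
        .map(|chunk| {
            let mut b = [0u8; 8];
            b.copy_from_slice(chunk);
            u64::from_le_bytes(b)
        })
        .collect();
    Some((items, width + byte_len))
}

fn main() {
    // A u8-prefixed vector holding 5 and 6.
    let mut data = vec![2u8];
    data.extend_from_slice(&5u64.to_le_bytes());
    data.extend_from_slice(&6u64.to_le_bytes());

    println!("width 1: {:?}", decode_u64_vec(&data, 1));
    println!("width 4: {:?}", decode_u64_vec(&data, 4));
}
```

Returning the number of bytes consumed lets a caller continue decoding the fields that follow the vector. In this example, decoding the same bytes with a width of 4 fails because the misread length exceeds the available data, which is the failure mode the issue describes.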

Real-World Impact

This is particularly relevant for:

  • Oracle programs with compact price feeds
  • Gaming programs with large inventories
  • Any program optimizing for Solana's account size constraints
  • Programs handling lists where max size is known and small

Backward Compatibility

The proposal maintains backward compatibility by:

  • Allowing `vec` to remain a simple string (the current format), which defaults to a u32 length prefix
  • Making the `length` field optional within the `vec` object
  • Leaving existing IDLs valid without changes
  • Keeping the addition opt-in: only IDLs that use `length` need parser support for it (a parser that ignored the field would decode with the wrong prefix width)

Would appreciate community feedback on the proposed API design!

Labels: feature, idl (related to the IDL, either program or client side)
