[FEAT] Recursive DLPack container conversion for auto torch.Tensor return #517
Conversation
Code Review
This pull request implements recursive conversion of FFI containers—specifically Array, List, Map, and Dict—into native Python lists and dictionaries when the DLPack exchange API is active. This allows for the seamless return of nested structures containing tensors (e.g., PyTorch tensors) through the FFI. The changes include new Cython conversion logic, updated return value handling, and comprehensive tests for various nested and mixed-type scenarios. Feedback was provided to include an early return for empty sequences in the list conversion function to improve consistency and performance.
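The eager recursive conversion the review summary describes can be sketched as follows. This is an illustrative stand-in, not the PR's Cython code: `FFITensor`, `FFIArray`, and `FFIMap` are hypothetical stand-ins for the real `ffi.Tensor`, `ffi.Array`, and `ffi.Map` classes, and `convert_tensor` stands in for the DLPack import step (e.g. into `torch.Tensor`).

```python
# Illustrative sketch of eager recursive container conversion.
# FFITensor/FFIArray/FFIMap are stand-ins for the real ffi classes.

class FFITensor:
    def __init__(self, data):
        self.data = data

class FFIArray(list):
    pass

class FFIMap(dict):
    pass

def convert_ffi_value(value, convert_tensor):
    """Recursively turn FFI containers into native Python lists/dicts,
    applying convert_tensor (e.g. DLPack -> torch.Tensor) to each leaf."""
    if isinstance(value, FFITensor):
        return convert_tensor(value)
    if isinstance(value, FFIArray):
        return [convert_ffi_value(v, convert_tensor) for v in value]
    if isinstance(value, FFIMap):
        return {k: convert_ffi_value(v, convert_tensor) for k, v in value.items()}
    return value
```

Note that the eager variant walks the entire structure up front, which is exactly the overhead the discussion below turns on.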
The PR looks good functionally, but I'm slightly worried about the semantics and runtime overhead due to eager recursive conversion. Is there a way to opt out of this behavior?
@junrushao I'd like to make sure I address the right concern. Are you more worried about:
I do worry about both a) and b) separately, but b) over a).
I do worry about it in a use case where we may want to return a mutable List.
I don't know how we could make it lazy, but a flag could definitely work. I'd love to hear from you and @tqchen what makes the most sense.
Agreed, the List mutation case is something we want to think about; it wasn't on the radar before.
Force-pushed 0e91025 to 90f066d
Force-pushed 90f066d to 9a0f26e
/gemini review
Code Review
This pull request implements lazy DLPack conversion for FFI containers (Array, List, Map, and Dict) by introducing a CContainerBase class. This change enables the propagation of the DLPack exchange API and includes logic to scan containers for non-CPU tensors to automatically configure the stream context. Review feedback highlights that the current scanning mechanism is shallow and does not account for nested containers, which could lead to synchronization issues. Additionally, there are concerns regarding the performance overhead of scanning large containers during FFI calls.
```
getitem_args[0].type_index = type_index
getitem_args[0].v_obj = <TVMFFIObject*>chandle
...
for i in range(n):
```
Scanning a large container (e.g., an Array with thousands of elements) for a tensor can introduce significant overhead during FFI calls. Since this scan happens on every FFI call where a container is passed as an argument and the stream context isn't already set, consider adding a limit to the number of elements scanned or optimizing this check if possible.
Yes, the O(n) scan is needed for correctness here.
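The scan under discussion can be sketched as below. This is an illustrative model, not the actual Cython implementation: `FakeTensor` and the `device` attribute are assumed names. It also makes the reviewer's point concrete: the scan is shallow, so tensors inside nested containers would be missed.

```python
# Sketch of a shallow O(n) scan for a non-CPU tensor, used to decide
# whether the stream context must be set before the FFI call.
# FakeTensor and its `device` attribute are illustrative stand-ins.

class FakeTensor:
    def __init__(self, device):
        self.device = device  # e.g. "cpu" or "cuda:0"

def find_first_non_cpu_tensor(container):
    """Shallow scan over the top-level elements only; nested containers
    are NOT descended into, which is the limitation noted in review."""
    for item in container:
        if isinstance(item, FakeTensor) and item.device != "cpu":
            return item
    return None
```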
```diff
 @register_object("ffi.Array")
-class Array(core.Object, Sequence[T]):
+class Array(core.CContainerBase, core.Object, Sequence[T]):
```
Looks like we are inheriting from both core.CContainerBase and core.Object, which both inherit from core.CObject. Should be fine :)
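A tiny illustration of why this diamond is fine in Python: the C3 linearization gives the derived class a single consistent MRO. The class names here mimic the `core.CObject`/`core.CContainerBase`/`core.Object` relationship but are illustrative, not the real classes.

```python
# Diamond inheritance like Array(CContainerBase, Object), where both
# bases share a common ancestor; Python's C3 MRO linearizes it cleanly.
# All class names below are illustrative stand-ins.

class CObjectLike: ...
class ContainerBase(CObjectLike): ...
class ObjectBase(CObjectLike): ...

class ArrayLike(ContainerBase, ObjectBase):
    pass
```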
junrushao left a comment
Overall looks good! I left two minor comments.
Problem
When a packed FFI function receives `torch.Tensor` inputs, the return value is automatically converted back to `torch.Tensor` via DLPack, but only for bare `ffi.Tensor` returns. When the return is a container (Array, List, Map, Dict) containing tensors, the tensors inside remain as `ffi.Tensor`, requiring manual per-element conversion.
Solution
We perform a lazy conversion of each element in the container from ffi::Tensor to torch::Tensor when it is retrieved. Lazy conversion preserves container semantics and reduces runtime overhead compared with eager conversion.
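The lazy strategy can be sketched as below. This is an assumed-name model, not the PR's Cython implementation: `LazyArray`, `LeafTensor`, and the `conversions` counter are illustrative; the real container converts on `__getitem__` into the DLPack consumer's tensor type.

```python
# Sketch of lazy per-element conversion: leaves stay as FFI tensors
# until retrieved, so unread elements pay no conversion cost.
# LazyArray/LeafTensor are illustrative stand-ins.

class LeafTensor:
    def __init__(self, data):
        self.data = data

class LazyArray:
    def __init__(self, items, convert_leaf):
        self._items = items
        self._convert = convert_leaf
        self.conversions = 0  # instrumentation for this sketch only

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        item = self._items[i]
        if isinstance(item, LeafTensor):
            self.conversions += 1  # converted only on access
            return self._convert(item)
        return item
```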
Stream Propagation