Skip to content

Conversation

@fzi-hielscher
Copy link
Contributor

@fzi-hielscher fzi-hielscher commented Nov 9, 2025

Spill hw.array values on the stack at their definition site instead of their use sites to reduce the number of redundant alloca + store ops created. This should mitigate the negative effects of #9172, fixing #9211.

This PR adds the HWToLLVMArraySpillCache helper to the HWToLLVM conversion patterns. Certain operations on !hw.array typed values need to "spill" (i.e., copy into a buffer, usually stack allocated) their input arrays in order to access dynamically indexed elements. The cache is used to avoid spilling the array for each such user separately. Instead,operations that create an array value which needs spilling allocate a buffer that can be shared by the value's users. HW Dialect operations that inherently allocate a buffer for their output arrays (AggregateConstantOp, ArrayInjectOp, ArraySliceOp) use the cache to associate their output value with a pointer to the buffer, so it does not need to be spilled again. Non-HW Dialect producers of HW array values (e.g., arc.state_load) need to be handled pre conversion. This is done by invoking spillNonHWOps.

Use of the cache is optional and can be disabled by passing spill-arrays-early=false to convert-hw-to-llvm or using a nullopt value for the cache argument of populateHWToLLVMConversionPatterns. In fact, conversions passes that do not disable allowPatternRollback must not use the cache since a buffer added to the cache once cannot be removed in case of a rollback.

@fzi-hielscher fzi-hielscher force-pushed the hwtollvm-early-array-spill branch from 3c40229 to 1e486ed Compare November 12, 2025 23:56
@fzi-hielscher fzi-hielscher changed the title [WIP][HWToLLVM][ArcToLLVM] Spill array values early [HWToLLVM][ArcToLLVM] Spill array values early Nov 14, 2025
@fzi-hielscher fzi-hielscher marked this pull request as ready for review November 14, 2025 11:40
@fzi-hielscher
Copy link
Contributor Author

@SimonEbner, could you check if this actually fixes the compile time regression in your example?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant