Description:
QueueSlotLight currently pads each slot to 64 bytes to prevent false sharing between adjacent slots. In an SPSC queue, the producer and consumer never access the same slot concurrently, so this padding is unnecessary.
For small types (e.g. uint32_t, 4 bytes), the padding inflates each slot by 16×. As queue depth grows, the working set spills from L1 into L2 and then L3 far sooner than the payload alone would require, and only a fraction of every cache line the hardware prefetcher pulls in holds useful data.
Acceptance criteria:
- QueueSlotLight contains only data, no padding member
- QueueSlotFull padding is preserved unchanged
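
The criteria above could be checked at compile time along these lines. This is only a sketch under stated assumptions: the template shape, the `data` member name, the `kCacheLine` constant, and the `footprint` helper are hypothetical, and `alignas` stands in for whatever padding mechanism QueueSlotFull actually uses.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes matches the padding described in this ticket.
constexpr std::size_t kCacheLine = 64;

// Sketch of the light slot after the change: payload only, no padding member.
template <typename T>
struct QueueSlotLight {
    T data;
};

// Sketch of the full slot: cache-line aligned, so sizeof rounds up to a
// multiple of 64. The real type may use an explicit padding member instead.
template <typename T>
struct alignas(kCacheLine) QueueSlotFull {
    T data;
};

// Hypothetical helper: bytes occupied by `depth` contiguous slots.
template <typename Slot>
constexpr std::size_t footprint(std::size_t depth) {
    return sizeof(Slot) * depth;
}

// Light slot is exactly as large as its payload.
static_assert(sizeof(QueueSlotLight<std::uint32_t>) == sizeof(std::uint32_t),
              "QueueSlotLight must not carry padding");
// Full slot still occupies a whole cache line.
static_assert(sizeof(QueueSlotFull<std::uint32_t>) == kCacheLine,
              "QueueSlotFull must stay cache-line sized");
// The 16x inflation for uint32_t, at an example depth of 1024 slots.
static_assert(footprint<QueueSlotFull<std::uint32_t>>(1024) ==
                  16 * footprint<QueueSlotLight<std::uint32_t>>(1024),
              "padded uint32_t slots are 16x larger");
```

At the example depth of 1024 slots, the unpadded uint32_t buffer occupies 4 KiB versus 64 KiB for the padded one.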