Skip to content

Commit 2f27bf8

Browse files
committed
add txt
1 parent 7f24ec8 commit 2f27bf8

1 file changed

Lines changed: 243 additions & 0 deletions

File tree

txt.txt

Lines changed: 243 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,243 @@
1+
Issues in Metal RT Implementation
2+
3+
### 1. **Hardcoded TLAB Layout (Critical)**
4+
5+
```plume/plume_metal.cpp#L2926-2932
6+
// Build the TLAB: array of GPU addresses for each root parameter.
7+
uint64_t tlab[3] = {
8+
descriptorSet->rtUAVTableBuffer ? descriptorSet->rtUAVTableBuffer->gpuAddress() : 0,
9+
descriptorSet->rtASHeaderBuffer ? descriptorSet->rtASHeaderBuffer->gpuAddress() : 0,
10+
cbvAddress
11+
};
12+
```
13+
14+
The TLAB assumes a fixed 3-element layout `[UAV, AS, CBV]`. This only works for shaders with exactly this root signature. Any shader with a different layout (more textures, different order, SRV textures, etc.) will break silently.
15+
16+
**Fix:** Either generate the TLAB based on the actual pipeline layout/descriptor set layout, or document this limitation clearly.
17+
18+
---
19+
20+
### 2. **Fragile Resource Discovery in `traceRays`**
21+
22+
```plume/plume_metal.cpp#L2850-2872
23+
for (size_t i = 0; i < descriptorSet->resourceEntries.size(); i++) {
24+
const auto& entry = descriptorSet->resourceEntries[i];
25+
if (entry.resource == nullptr) continue;
26+
27+
switch (entry.type) {
28+
case RenderDescriptorRangeType::READ_WRITE_TEXTURE:
29+
case RenderDescriptorRangeType::TEXTURE:
30+
outputTexture = static_cast<MTL::Texture*>(entry.resource);
31+
break;
32+
// ...
33+
}
34+
}
35+
```
36+
37+
This picks the **first** texture/AS/buffer it finds. If a shader has multiple textures, only the first one ends up ;in the TLAB. Related to issue #1.
38+
39+
---
40+
41+
### 3. **Dangling Pointer Risk in `instanceContributions`**
42+
43+
```plume/plume_metal.cpp#L2117-2118
44+
resourceEntries[descriptorIndex].instanceContributions = metalAS->instanceContributions.empty()
45+
? nullptr : metalAS->instanceContributions.data();
46+
```
47+
48+
This stores a raw pointer to the internal data of `MetalAccelerationStructure::instanceContributions`. If the AS is destroyed or the vector reallocates, this becomes dangling.
49+
50+
**Fix:** Copy the data instead of storing a pointer, or use shaIssues in Metal RT Implementation
51+
52+
### 1. **Hardcoded TLAB Layout (Critical)**
53+
54+
```plume/plume_metal.cpp#L2926-2932
55+
// Build the TLAB: array of GPU addresses for each root parameter.
56+
uint64_t tlab[3] = {
57+
descriptorSet->rtUAVTableBuffer ? descriptorSet->rtUAVTableBuffer->gpuAddress() : 0,
58+
descriptorSet->rtASHeaderBuffer ? descriptorSet->rtASHeaderBuffer->gpuAddress() : 0,
59+
cbvAddress
60+
};
61+
```
62+
63+
The TLAB assumes a fixed 3-element layout `[UAV, AS, CBV]`. This only works for shaders with exactly this root signature. Any shader with a different layout (more textures, different order, SRV textures, etc.) will break silently.
64+
65+
**Fix:** Either generate the TLAB based on the actual pipeline layout/descriptor set layout, or document this limitation clearly.
66+
67+
---
68+
69+
### 2. **Fragile Resource Discovery in `traceRays`**
70+
71+
```plume/plume_metal.cpp#L2850-2872
72+
for (size_t i = 0; i < descriptorSet->resourceEntries.size(); i++) {
73+
const auto& entry = descriptorSet->resourceEntries[i];
74+
if (entry.resource == nullptr) continue;
75+
76+
switch (entry.type) {
77+
case RenderDescriptorRangeType::READ_WRITE_TEXTURE:
78+
case RenderDescriptorRangeType::TEXTURE:
79+
outputTexture = static_cast<MTL::Texture*>(entry.resource);
80+
break;
81+
// ...
82+
}
83+
}
84+
```
85+
86+
This picks the **first** texture/AS/buffer it finds. If a shader has multiple textures, only the first one ends up in the TLAB. Related to issue #1.
87+
88+
---
89+
90+
### 3. **Dangling Pointer Risk in `instanceContributions`**
91+
92+
```plume/plume_metal.cpp#L2117-2118
93+
resourceEntries[descriptorIndex].instanceContributions = metalAS->instanceContributions.empty()
94+
? nullptr : metalAS->instanceContributions.data();
95+
```
96+
97+
This stores a raw pointer to the internal data of `MetalAccelerationStructure::instanceContributions`. If the AS is destroyed or the vector reallocates, this becomes dangling.
98+
99+
**Fix:** Copy the data instead of storing a pointer, or use shared ownership.
100+
101+
---
102+
103+
### 4. **Missing `useResource` for BLAS Referenced by TLAS**
104+
105+
In `traceRays`, only the TLAS is marked:
106+
```plume/plume_metal.cpp#L2944-2946
107+
if (tlas != nullptr) {
108+
activeComputeEncoder->useResource(tlas, MTL::ResourceUsageRead);
109+
}
110+
```
111+
112+
But the BLASes that the TLAS references are not marked with `useResource`. Metal requires all resources accessed by the GPU to be marked on the encoder.
113+
114+
**Fix:** Track BLAS references in the TLAS and call `useResource` on each.
115+
116+
---
117+
118+
### 5. **Memory Leak if VFT Creation Fails**
119+
120+
```plume/plume_metal.cpp#L1867-1870
121+
visibleFunctionTable = computePipeline->newVisibleFunctionTable(vftDesc);
122+
if (visibleFunctionTable == nullptr) {
123+
fprintf(stderr, "Failed to create visible function table.\n");
124+
} // computePipeline is not released here, but the pipeline is left in a broken state
125+
```
126+
127+
If VFT creation fails, the pipeline is partially constructed but `computePipeline` is not cleaned up. The destructor will release it, but the object is in an invalid state without proper error propagation.
128+
129+
---
130+
131+
### 6. **No Intersection Shader Support**
132+
133+
```plume/plume_metal.h#L710
134+
MTL::IntersectionFunctionTable *intersectionFunctionTable = nullptr;
135+
```
136+
137+
This is declared but never populated. Custom intersection shaders (for procedural geometry) won't work.
138+
139+
---
140+
141+
### 7. **Only Descriptor Set 0 is Used**
142+
143+
```plume/plume_metal.cpp#L2822
144+
MetalDescriptorSet* descriptorSet = raytracingDescriptorSets[0];
145+
```
146+
147+
Multiple descriptor sets aren't supported for raytracing, even though `raytracingDescriptorSets` is an array.
148+
149+
---
150+
151+
### 8. **Potential Race Condition on `rtBuffersDirty`**
152+
153+
`rtBuffersDirty` is written in `setTexture()`/`setAccelerationStructure()` (called from app thread potentially) and read/written in `traceRays()` (called from command list recording). No synchronization.
154+
155+
---
156+
157+
### Summary
158+
159+
The implementation works for the **specific raytracing example** (single UAV output, single TLAS, single CBV in that exact order). For a production-quality implementation, the TLAB construction needs to be generalized to work with arbitrary root signatures/descriptor layouts.
160+
161+
Would you like me to help fix any of these issues?red ownership.
162+
163+
---
164+
165+
### 4. **Missing `useResource` for BLAS Referenced by TLAS**
166+
167+
In `traceRays`, only the TLAS is marked:
168+
```plume/plume_metal.cpp#L2944-2946
169+
if (tlas != nullptr) {
170+
activeComputeEncoder->useResource(tlas, MTL::ResourceUsageRead);
171+
}
172+
```
173+
174+
But the BLASes that the TLAS references are not marked with `useResource`. Metal requires all resources accessed by the GPU to be marked on the encoder.
175+
176+
**Fix:** Track BLAS references in the TLAS and call `useResource` on each.
177+
178+
---
179+
180+
### 5. **Memory Leak if VFT Creation Fails**
181+
182+
```plume/plume_metal.cpp#L1867-1870
183+
visibleFunctionTable = computePipeline->newVisibleFunctionTable(vftDesc);
184+
if (visibleFunctionTable == nullptr) {
185+
fprintf(stderr, "Failed to create visible function table.\n");
186+
} // computePipeline is not released here, but the pipeline is left in a broken state
187+
```
188+
189+
If VFT creation fails, the pipeline is partially constructed but `computePipeline` is not cleaned up. The destructor will release it, but the object is in an invalid state without proper error propagation.
190+
191+
---
192+
193+
### 6. **No Intersection Shader Support**
194+
195+
```plume/plume_metal.h#L710
196+
MTL::IntersectionFunctionTable *intersectionFunctionTable = nullptr;
197+
```
198+
199+
This is declared but never populated. Custom intersection shaders (for procedural geometry) won't work.
200+
201+
---
202+
203+
### 7. **Only Descriptor Set 0 is Used**
204+
205+
```plume/plume_metal.cpp#L2822
206+
MetalDescriptorSet* descriptorSet = raytracingDescriptorSets[0];
207+
```
208+
209+
Multiple descriptor sets aren't supported for raytracing, even though `raytracingDescriptorSets` is an array.
210+
211+
---
212+
213+
### 8. **Potential Race Condition on `rtBuffersDirty`**
214+
215+
`rtBuffersDirty` is written in `setTexture()`/`setAccelerationStructure()` (called from app thread potentially) and read/written in `traceRays()` (called from command list recording). No synchronization.
216+
217+
---
218+
219+
### Summary
220+
221+
The implementation works for the **specific raytracing example** (single UAV output, single TLAS, single CBV in that exact order). For a production-quality implementation, the TLAB construction needs to be generalized to work with arbitrary root signatures/descriptor layouts.
222+
223+
Would you like me to help fix any of these issues?
224+
225+
-------------
226+
227+
Here’s the status versus the 8 issues you listed:
228+
229+
Fixed
230+
- **#1 Hardcoded TLAB layout**: now built from the parsed Metal Shader Converter root signature JSON, so it follows the shader’s actual root layout.
231+
- **#2 Fragile resource discovery**: removed the “first texture/AS/CBV” scan; descriptor tables are populated by register + space across the bound descriptor sets.
232+
- **#3 Dangling instanceContributions pointer**: now stored as a value copy in `MetalDescriptorSet::ResourceEntry`, so no dangling.
233+
- **#7 Only descriptor set 0**: root signature `RegisterSpace` maps to descriptor set index, so multiple sets are supported.
234+
235+
Partially/changed behavior
236+
- **#8 rtBuffersDirty race**: `traceRays` no longer reads this flag, so the original read/write race is gone. The flag is still set but unused; we can delete it or make it atomic if you prefer.
237+
238+
Still open
239+
- **#4 Missing `useResource` for BLAS referenced by TLAS**: still not handled.
240+
- **#5 VFT creation failure cleanup/state**: unchanged.
241+
- **#6 Intersection shader support**: still not wired up.
242+
243+
If you want, I can tackle the remaining three in the next pass.

0 commit comments

Comments
 (0)