Commit c7fcab1
[FIX] Prevent TypeError in text-only Gemma3CausalLM and improve generate_step defaults
1. Bug Fix: TypeError during generation for text-only models
The Problem:
When using a Gemma3CausalLM model configured for text-only processing (i.e., with vision_encoder=None and preprocessor=None), a call to causal_lm.generate() fails with a TypeError.
The root cause is that the internal generate_step method returns a dictionary containing an 'images': None key-value pair. This None value is eventually passed to ops.concatenate during the output normalization step, which does not accept None as a valid input. This workflow is common when pretraining a model from scratch.
The Fix:
The generate_step method has been modified to only include the 'images' key in its returned dictionary if an image tensor is actually present. This ensures that a None value is never passed to downstream functions, resolving the TypeError.
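A minimal sketch of the fixed pattern (function and argument names here are illustrative, not the exact KerasHub internals): the output dictionary only carries an "images" key when an image tensor actually exists, so downstream normalization never receives None.

```python
def build_generate_outputs(token_ids, padding_mask, images=None):
    # Only include the "images" key when an image tensor is present, so
    # a text-only model never hands None to ops.concatenate downstream.
    outputs = {"token_ids": token_ids, "padding_mask": padding_mask}
    if images is not None:
        outputs["images"] = images
    return outputs

# Text-only call: no "images" key is emitted at all.
text_only = build_generate_outputs([1, 2, 3], [1, 1, 1])
# Multimodal call: the key appears only because a tensor was passed.
multimodal = build_generate_outputs([1, 2, 3], [1, 1, 1], images=[[0.0]])
```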
Proof of Bug and Fix:
The following Colab notebook demonstrates the bug with the original code and shows the successful execution after applying this fix:
https://colab.research.google.com/drive/1QVk2idB6fcdYYJb1cBQGaKHe5QSGjCti?usp=sharing
2. Refactoring: Remove Hardcoded Stop Token
The Problem:
The internal generate_step method has a hardcoded default stop_token_ids=[106], which corresponds to the <end_of_turn> token. This is conceptually incorrect for a base architectural model, as the model itself should not have opinions about instruction-following or conversational tokens. This hardcoded value can interfere with pretraining or sampling raw text.
The Fix:
The method signature has been changed from stop_token_ids=[106] to stop_token_ids=None.
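A minimal sketch of the resulting division of responsibility (the function bodies and the generate() wiring below are assumptions for illustration, not the actual KerasHub source):

```python
def generate_step(inputs, stop_token_ids=None):
    # The internal step no longer hardcodes [106] (<end_of_turn>);
    # None simply means "sample until max length", which suits
    # pretraining and raw-text sampling.
    return {"inputs": inputs, "stop_token_ids": stop_token_ids}

def generate(inputs, stop_token_ids="auto", end_of_turn_id=106):
    # The public API layer resolves "auto" to the conversational stop
    # token before delegating, so the base model stays opinion-free.
    if stop_token_ids == "auto":
        stop_token_ids = [end_of_turn_id]
    return generate_step(inputs, stop_token_ids=stop_token_ids)
```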
This is a safe, non-breaking change because the public-facing Gemma3CausalLM.generate() method is already responsible for setting the appropriate stop tokens when a user specifies stop_token_ids="auto".

1 parent 93d89d8
1 file changed: +6, -3 lines changed