Description
When I try to fine-tune the CellViT-256-x40.pth model using the ViT256 backbone and the provided pre-trained encoder (vit256_small_dino.pth), I encounter a RuntimeError due to mismatched keys in the model head.
backbone: ViT256
pretrained_encoder: /mnt/raid/zanzhuheng/working/ESCC_CELLVIT/vit256_small_dino.pth
pretrained: ../CellViT-256-x40.pth
Error Traceback:
Loading checkpoint: _IncompatibleKeys(missing_keys=['head.weight', 'head.bias'], unexpected_keys=['head.mlp.0.weight', 'head.mlp.0.bias', 'head.mlp.2.weight', 'head.mlp.2.bias', 'head.mlp.4.weight', 'head.mlp.4.bias', 'head.last_layer.weight_g', 'head.last_layer.weight_v'])
Expected behavior
I expected the model to load the pre-trained CellViT-256-x40.pth checkpoint successfully and continue training (fine-tuning) with the ViT256 backbone.
It seems like the current model definition uses a single-layer head (e.g., nn.Linear), while the checkpoint uses a more complex multi-layer MLP head. The structure mismatch causes the weight loading to fail.
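To make the mismatch concrete, here is a minimal sketch of how the two key sets can be compared. The function name `diff_keys` is hypothetical and not part of CellViT; the key lists are copied from the log message above. With a real model, the inputs would be the keys of the model's and the checkpoint's state dicts.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not define
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Head keys taken from the _IncompatibleKeys message in this issue.
model_keys = ["head.weight", "head.bias"]
checkpoint_keys = [
    "head.mlp.0.weight", "head.mlp.0.bias",
    "head.mlp.2.weight", "head.mlp.2.bias",
    "head.mlp.4.weight", "head.mlp.4.bias",
    "head.last_layer.weight_g", "head.last_layer.weight_v",
]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

None of the head keys overlap, which is consistent with a single linear head on the model side and an MLP head on the checkpoint side.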
Should the model config or code be updated to match the checkpoint structure (with MLP layers)? I also tried initializing only part of the model by loading with strict=False, but then all predictions came out as NaN.
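One way to narrow down the NaN predictions would be to check whether any parameters are already non-finite right after loading. The sketch below is only an illustration: `non_finite_entries` is a hypothetical helper, and the values are made-up plain Python floats standing in for tensors. With PyTorch tensors, the equivalent per-tensor check would be `torch.isfinite(t).all()`.

```python
import math


def non_finite_entries(named_values):
    """Return the names whose values contain NaN or +/-inf."""
    bad = []
    for name, values in named_values.items():
        if any(not math.isfinite(v) for v in values):
            bad.append(name)
    return bad


# Made-up example values, NOT taken from the real checkpoint.
example = {
    "head.weight": [0.1, -0.2, 0.3],
    "head.bias": [float("nan"), 0.0],
}

print(non_finite_entries(example))
```

If every loaded parameter is finite, the NaNs more likely arise later, for example in the forward pass or the loss, rather than from the partial load itself.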
I'm happy to provide more details if needed. Thank you!