setup_wandb() fails with "Object of type function is not JSON serializable" when logging args #1401

@Pragalbhv

Bug description

When running mace_run_train with --wandb, training fails during setup_wandb() because the code tries to JSON-serialize the full args namespace. If any attribute is a function (or other callable) and the custom JSON encoder does not handle callables, json.dumps(args_dict, cls=CustomEncoder) raises a TypeError: "Object of type function is not JSON serializable".
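The failure mode can be shown in isolation with a minimal sketch. The `CustomEncoder` below is a hypothetical stand-in, not MACE's actual encoder, and `some_callable` and the `transform` attribute are made-up names; the only assumption is that the real encoder, like this one, has no branch for callables.

```python
import argparse
import json


class CustomEncoder(json.JSONEncoder):
    """Hypothetical stand-in encoder: handles sets, but not callables."""

    def default(self, o):
        if isinstance(o, set):
            return sorted(o)
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(o)


def some_callable(x):
    """Placeholder for whatever function ends up stored on args."""
    return x


# A namespace with ordinary values plus one function-valued attribute.
args = argparse.Namespace(lr=0.01, batch_size=32, transform=some_callable)
args_dict = vars(args)

try:
    json.dumps(args_dict, cls=CustomEncoder)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
# prints: Object of type function is not JSON serializable
```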

To Reproduce
Steps to reproduce the behavior:

  1. Install MACE with wandb support (e.g. pip install mace-torch[wandb], or install wandb separately).
  2. Run training with wandb flags, for example:
    ```bash
    mace_run_train \
        --name="$exp_name" \
        --default_dtype="float32" \
        --energy_key='energy' \
        --forces_key='forces' \
        --stress_key='stress' \
        --model='MACELES' \
        --train_file="train.extxyz" \
        --valid_fraction=0.05 \
        --test_file="test.extxyz" \
        --E0s="{xx: xxx, yy: yyy, zz: zzz}" \
        --loss='universal' \
        --energy_weight=100 \
        --forces_weight=1000 \
        --eval_interval=1 \
        --error_table='PerAtomMAE' \
        --r_max=6.0 \
        --num_radial_basis=10 \
        --pair_repulsion \
        --distance_transform="Agnesi" \
        --num_channels=32 \
        --max_L=0 \
        --num_interactions=2 \
        --correlation=3 \
        --max_ell=3 \
        --scaling='rms_forces_scaling' \
        --num_workers=8 \
        --lr=0.01 \
        --weight_decay=1e-8 \
        --ema \
        --ema_decay=0.99 \
        --scheduler_patience=5 \
        --batch_size=32 \
        --valid_batch_size=32 \
        --max_num_epochs=1 \
        --patience=50 \
        --amsgrad \
        --distributed \
        --device=cuda \
        --seed=1 \
        --clip_grad=10 \
        --keep_checkpoints \
        --save_cpu \
        --wandb --wandb_project irp --wandb_entity astagroup --wandb_name "$exp_name" \
        --restart_latest
    ```

Wandb probably takes the full config and serializes it to a JSON string, and that is where the failure occurs: MACE tries to log the full args to wandb as JSON, but args contains at least one function, which JSON cannot represent.
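One possible workaround, sketched here under assumptions rather than taken from the MACE code base, is to make the encoder fall back to a callable's name so the config becomes plain JSON before it is logged. `CallableSafeEncoder` and `to_loggable_config` are hypothetical names introduced for illustration.

```python
import argparse
import json


class CallableSafeEncoder(json.JSONEncoder):
    """Hypothetical encoder: represents callables by their qualified name."""

    def default(self, o):
        if callable(o):
            # Fall back to repr() for callables without a __qualname__.
            return getattr(o, "__qualname__", repr(o))
        return super().default(o)


def to_loggable_config(args: argparse.Namespace) -> dict:
    """Round-trip the namespace through JSON so only plain types remain."""
    return json.loads(json.dumps(vars(args), cls=CallableSafeEncoder))


# Usage: a namespace holding a builtin function alongside ordinary values.
args = argparse.Namespace(lr=0.01, batch_size=32, transform=len)
config = to_loggable_config(args)
print(config)
```

The resulting dict contains only strings and numbers, so it could be handed to a logger that expects a JSON-serializable config. Whether a name string is the right representation for the function-valued argument is a design choice for the maintainers; dropping such keys before logging would be an alternative.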
