There's no mechanism to ensure that the pretrained data writing done by mace_run_train in distributed mode happens on only one process. As a result, different processes can overwrite each other's mp_finetuning...xyz files, and the run then fails when those files are read back in.
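One possible way to avoid the race is to let only rank 0 write the file and make every process wait at a barrier before reading it. The sketch below is illustrative only and is not taken from the MACE code base: `write_once`, `write_fn`, `rank`, and `barrier` are hypothetical names. In a real distributed run, `rank` would presumably come from `torch.distributed.get_rank()` and `barrier` would be `torch.distributed.barrier`; they are passed in as plain arguments here so the example runs without a process group.

```python
import os
import tempfile
from typing import Callable


def write_once(
    path: str,
    write_fn: Callable[[str], None],
    rank: int,
    barrier: Callable[[], None],
) -> str:
    """Write `path` on rank 0 only, then synchronize all ranks.

    Hypothetical helper: `write_fn(target)` is assumed to write the full
    file contents to `target`.
    """
    if rank == 0:
        # Write to a temporary file in the same directory, then rename it
        # into place, so readers never see a partially written file.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        os.close(fd)
        try:
            write_fn(tmp_path)
            os.replace(tmp_path, path)
        finally:
            # Clean up the temporary file if the write or rename failed.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
    # Every rank waits here, so no rank reads before rank 0 has finished.
    barrier()
    return path
```

Note that if `write_fn` raises on rank 0, the barrier is never reached on that rank and the other ranks would wait indefinitely; a production fix would need to handle that case, for example by broadcasting a success flag.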