Torch deterministic #17

@HaoxiangYou

Thank you for providing this awesome repo!

I am trying to make results consistent across runs by calling seeding(seed, torch_deterministic=True).
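For readers unfamiliar with this kind of helper, here is a minimal sketch of what a seeding function with a determinism flag typically does. It is not the repository's `seeding` implementation: the name `seed_everything` and its exact steps are assumptions for illustration only.

```python
import os
import random

import torch


def seed_everything(seed: int, torch_deterministic: bool = False) -> int:
    """Hypothetical helper, NOT the repo's `seeding` function."""
    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    if torch_deterministic:
        # cuBLAS needs this for deterministic matmuls; it should be set
        # before the first CUDA call to take effect.
        os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
        # Ops without a deterministic implementation now raise RuntimeError.
        torch.use_deterministic_algorithms(True)
    return seed


# Re-seeding with the same value reproduces the same random draws.
seed_everything(0, torch_deterministic=True)
a = torch.rand(3)
seed_everything(0, torch_deterministic=True)
b = torch.rand(3)
print(torch.equal(a, b))
```

Seeding alone only fixes the random streams; the `torch.use_deterministic_algorithms(True)` call is what changes which kernels are allowed to run, so it is the part most likely to alter behavior.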

PyTorch has a known broadcasting issue with deterministic algorithms: pytorch/pytorch#79987

So I manually fixed the broadcasting in each environment. For example, in envs/ant, lines 204-206, I changed the code to:

    self.state.joint_q.view(self.num_envs, -1)[env_ids, 3:7] = self.start_rotation.clone().unsqueeze(0).expand(len(env_ids), -1)
    self.state.joint_q.view(self.num_envs, -1)[env_ids, 7:] = self.start_joint_q.clone().unsqueeze(0).expand(len(env_ids), -1)
    self.state.joint_qd.view(self.num_envs, -1)[env_ids, :] = torch.zeros(size=(len(env_ids), self.num_joint_qd), device=self.device)
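The idea behind this change can be shown in isolation. The sketch below uses small made-up tensors (the sizes, `joint_q`, `start_rotation`, and `env_ids` here are stand-ins, not the environment's real state) and demonstrates only the shape handling: the right-hand side is expanded to the exact shape of the indexed slice instead of relying on broadcasting during the indexed assignment.

```python
import torch

num_envs, num_joint_q = 4, 9  # illustrative sizes, not the ant env's
joint_q = torch.zeros(num_envs * num_joint_q)
start_rotation = torch.tensor([0.0, 0.0, 0.0, 1.0])
env_ids = torch.tensor([0, 2])

# Broadcasting form: a (4,) value is broadcast over len(env_ids) rows.
# joint_q.view(num_envs, -1)[env_ids, 3:7] = start_rotation

# Explicit form: the value already has shape (len(env_ids), 4),
# so the indexed assignment does not need to broadcast.
value = start_rotation.unsqueeze(0).expand(len(env_ids), -1)
joint_q.view(num_envs, -1)[env_ids, 3:7] = value

print(joint_q.view(num_envs, -1))
```

Because `view` shares storage with `joint_q`, the assignment writes through to the flat tensor; only the rows listed in `env_ids` are modified.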

After these changes, I ran the experiments with and without torch_deterministic=True. Below is the ant test: the blue curve is without torch_deterministic=True and the orange curve is with torch_deterministic=True.

image

The non-deterministic run is similar to the paper's results. In the deterministic setting, however, the rewards remain unchanged throughout the run.

Does anyone have an idea what other issues torch_deterministic=True might introduce? Thank you very much!
