Hi, thank you for such great work and open-sourcing the cleaned-up code!
I'd like to use FPO++ for the G1 motion tracking task, since motion tracking results are presented in your paper. However, I couldn't find a motion tracking pipeline in the current code, so I wonder if you plan to open-source the motion tracking part as well?
I've tried to modify the locomotion pipeline for motion tracking, following the implementation details given in your paper. To be more specific, I:
- Used the MDP parts of BeyondMimic.
- Changed both `actor_hidden_dims` and `critic_hidden_dims` to `[1024, 512, 256]`.
- Changed `sampling_steps` to `50`.
- Changed `learning_rate` to `3e-4`.
- Changed `clip_param` to `0.01`.
- Changed `num_learning_epochs` to `5`.
The full runner config is:
@configclass
class G1FlatFpoRunnerCfg(FpoRslRlOnPolicyRunnerCfg):
    max_iterations = 30000
    save_interval = 200
    experiment_name = "g1_flat_flow"
    num_steps_per_env = 24
    empirical_normalization = True
    flow_eval_modes = ["zero"]
    policy = FpoRslRlPpoActorCriticCfg(
        init_noise_std=1.0,
        actor_hidden_dims=[1024, 512, 256],
        critic_hidden_dims=[1024, 512, 256],
        activation="elu",
        sampling_steps=50,
    )
    algorithm = FpoRslRlPpoAlgorithmCfg(
        num_learning_epochs=5,
        n_samples_per_action=32,
        learning_rate=3e-4,
        clip_param=0.01,
    )
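For reference, here is a rough sketch of the data budget this config implies with 4096 envs. The helper `rollout_stats` is hypothetical and not part of the repo, and I am assuming that `n_samples_per_action` is the number of Monte Carlo samples drawn per collected transition, which may not match the actual implementation.

```python
# Hypothetical helper (not part of the FPO++ code base): computes the
# rollout size implied by the runner config above.

def rollout_stats(num_envs, num_steps_per_env, n_samples_per_action, max_iterations):
    # Transitions collected per training iteration across all envs.
    transitions_per_iter = num_envs * num_steps_per_env
    # ASSUMPTION: each transition is paired with n_samples_per_action
    # Monte Carlo samples when the loss is evaluated.
    samples_per_iter = transitions_per_iter * n_samples_per_action
    # Total environment steps if training runs for max_iterations.
    total_env_steps = transitions_per_iter * max_iterations
    return {
        "transitions_per_iter": transitions_per_iter,
        "samples_per_iter": samples_per_iter,
        "total_env_steps": total_env_steps,
    }


# Values taken from the config and the training setup in this issue.
stats = rollout_stats(
    num_envs=4096,
    num_steps_per_env=24,
    n_samples_per_action=32,
    max_iterations=30000,
)
print(stats)
```

So each iteration collects 98,304 transitions, in case the batch size is relevant to the result below.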
I used 4096 envs to train it but the reward curve looks like this:
Am I missing something, or do you plan to open-source the motion tracking part? Thanks a lot!!