MuJoCo (gym)
We use mujoco==2.2.1 as the codebase. See https://github.com/deepmind/mujoco/tree/2.2.1
The implementation follows the OpenAI gym *-v4 environments; see reference.
You can set post_constraint to False to disable the bug fix for this issue; with the fix disabled, behavior matches the standard approach of the *-v3 environments.
Ant-v3/v4
Observation space (v3): (111), first 13 elements for qpos[2:], next 14 elements for qvel, other elements for clipped cfrc_ext (com-based external force on body, a.k.a. contact force);
Observation space (v4): (27), first 13 elements for qpos[2:], next 14 elements for qvel;
Action space: (8), with range [-1, 1];
frame_skip: 5;
max_episode_steps: 1000;
reward_threshold: 6000.0;
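
The Ant layout above can be turned into a small helper that splits a flat observation into its named segments. This is only a sketch in plain Python: the names ANT_LAYOUT and split_ant_obs are hypothetical, and the 84-element size of the cfrc_ext segment is inferred from 111 - 13 - 14 rather than stated directly.

```python
# Segment sizes taken from the Ant description above. The names ANT_LAYOUT
# and split_ant_obs are illustrative and not part of any library.
ANT_LAYOUT = {
    "v3": [("qpos[2:]", 13), ("qvel", 14), ("cfrc_ext", 84)],  # 84 = 111 - 27
    "v4": [("qpos[2:]", 13), ("qvel", 14)],
}


def split_ant_obs(obs, version):
    """Split a flat Ant observation into a dict of named segments."""
    layout = ANT_LAYOUT[version]
    expected = sum(size for _, size in layout)
    if len(obs) != expected:
        raise ValueError(f"expected {expected} elements, got {len(obs)}")
    parts, start = {}, 0
    for name, size in layout:
        parts[name] = obs[start:start + size]
        start += size
    return parts


# Usage with a dummy observation of the v3 size.
parts = split_ant_obs(list(range(111)), "v3")
print({name: len(segment) for name, segment in parts.items()})
```

The same function works on the 27-element v4 observation by passing "v4" as the version.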
HalfCheetah-v3/v4
gym HalfCheetah-v3 source code
gym HalfCheetah-v4 source code
Observation space: (17), first 8 elements for qpos[1:], next 9 elements for qvel;
Action space: (6), with range [-1, 1];
frame_skip: 5;
max_episode_steps: 1000;
reward_threshold: 4800.0;
Hopper-v3/v4
Observation space: (11), first 5 elements for qpos[1:], next 6 elements for qvel;
Action space: (3), with range [-1, 1];
frame_skip: 4;
max_episode_steps: 1000;
reward_threshold: 6000.0;
Humanoid-v3/v4, HumanoidStandup-v2/v4
gym HumanoidStandup-v2 source code
gym HumanoidStandup-v4 source code
Observation space: (376), first 22 elements for qpos[2:], next 23 elements for qvel, next 140 elements for cinert (com-based body inertia and mass), next 84 elements for cvel (com-based velocity [3D rot; 3D tran]), next 23 elements for qfrc_actuator (actuator force), next 84 elements for cfrc_ext (com-based external force on body);
Action space: (17), with range [-0.4, 0.4];
frame_skip: 5;
max_episode_steps: 1000;
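
Because the Humanoid observation packs six fields into 376 numbers, it can help to compute each field's offsets and to view the per-body fields as one row per body. The sketch below is illustrative only: the names HUMANOID_FIELDS, humanoid_slices and per_body are hypothetical, and the per-body widths (10 values for cinert, 6 for cvel and cfrc_ext) are assumptions based on MuJoCo's usual array shapes, not something stated above.

```python
# (name, size, per-body width or None). Sizes come from the list above;
# the widths 10 and 6 are assumptions about MuJoCo's per-body array shapes.
HUMANOID_FIELDS = [
    ("qpos[2:]", 22, None),
    ("qvel", 23, None),
    ("cinert", 140, 10),
    ("cvel", 84, 6),
    ("qfrc_actuator", 23, None),
    ("cfrc_ext", 84, 6),
]


def humanoid_slices():
    """Return {field: (start, stop)} offsets into the flat observation."""
    slices, start = {}, 0
    for name, size, _ in HUMANOID_FIELDS:
        slices[name] = (start, start + size)
        start += size
    return slices


def per_body(obs, name):
    """Reshape a per-body field into a list with one row per body."""
    width = {n: w for n, _, w in HUMANOID_FIELDS}[name]
    if width is None:
        raise ValueError(f"{name} is not a per-body field")
    start, stop = humanoid_slices()[name]
    flat = obs[start:stop]
    return [flat[i:i + width] for i in range(0, len(flat), width)]


slices = humanoid_slices()
print(slices["cinert"])                               # (45, 185)
print(len(per_body(list(range(376)), "cvel")))        # 14
```

Under these assumed widths, both 140 / 10 and 84 / 6 give 14 rows, which is consistent with a single body count for the model.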
InvertedDoublePendulum-v2/v4
gym InvertedDoublePendulum-v2 source code
gym InvertedDoublePendulum-v4 source code
Observation space: (11), first 1 element for qpos[0], next 2 elements for sin(qpos[1:]), next 2 elements for cos(qpos[1:]), next 3 elements for qvel, next 3 elements for qfrc_constraint;
Action space: (1), with range [-1, 1];
frame_skip: 5;
max_episode_steps: 1000;
reward_threshold: 9100.0;
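
To make the ordering of the 11 elements concrete, the following sketch assembles an observation in the order listed above from three 3-element vectors. The function name double_pendulum_obs is hypothetical, and any clipping or scaling the actual environment may apply to these quantities is deliberately left out.

```python
import math


def double_pendulum_obs(qpos, qvel, qfrc_constraint):
    """Assemble the 11-element observation in the order listed above.

    Illustrative only: any clipping the real environment may apply is omitted.
    """
    if not (len(qpos) == len(qvel) == len(qfrc_constraint) == 3):
        raise ValueError("qpos, qvel and qfrc_constraint must each have 3 elements")
    return (
        [qpos[0]]                              # 1 element: qpos[0]
        + [math.sin(q) for q in qpos[1:]]      # 2 elements: sin(qpos[1:])
        + [math.cos(q) for q in qpos[1:]]      # 2 elements: cos(qpos[1:])
        + list(qvel)                           # 3 elements: qvel
        + list(qfrc_constraint)                # 3 elements: qfrc_constraint
    )


obs = double_pendulum_obs([0.5, 0.0, 0.0], [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(len(obs))  # 11
```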
InvertedPendulum-v2/v4
gym InvertedPendulum-v2 source code
gym InvertedPendulum-v4 source code
Observation space: (4), first 2 elements for qpos, next 2 elements for qvel;
Action space: (1), with range [-3, 3];
frame_skip: 2;
max_episode_steps: 1000;
reward_threshold: 950.0;
Pusher-v2/v4
Observation space: (23), first 7 elements for qpos[:7], next 7 elements for qvel[:7], next 3 elements for tips_arm, next 3 elements for object, next 3 elements for goal;
Action space: (7), with range [-2, 2];
frame_skip: 5;
max_episode_steps: 100;
reward_threshold: 0.0;
Reacher-v2/v4
Observation space: (11), first 2 elements for cos(qpos[:2]), next 2 elements for sin(qpos[:2]), next 2 elements for qpos[2:], next 2 elements for qvel[:2], next 3 elements for dist, a.k.a. fingertip - target;
Action space: (2), with range [-1, 1];
frame_skip: 2;
max_episode_steps: 50;
reward_threshold: -3.75;
Swimmer-v3/v4
Observation space: (8), first 3 elements for qpos[2:], next 5 elements for qvel;
Action space: (2), with range [-1, 1];
frame_skip: 4;
max_episode_steps: 1000;
reward_threshold: 360.0;
Walker2d-v3/v4
Observation space: (17), first 8 elements for qpos[1:], next 9 elements for qvel;
Action space: (6), with range [-1, 1];
frame_skip: 4;
max_episode_steps: 1000;
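
The action-space entries from the sections above can be collected into one table, which is handy when a policy emits values in [-1, 1] and they need to be mapped onto each environment's range. This is a sketch under assumptions: the names ACTION_SPACES and scale_action are hypothetical, the linear mapping is one common choice rather than something this page prescribes, and the "Humanoid" entry is meant to cover HumanoidStandup as well.

```python
# (dimension, low, high) per environment family, copied from the sections
# above. ACTION_SPACES and scale_action are illustrative names only.
ACTION_SPACES = {
    "Ant": (8, -1.0, 1.0),
    "HalfCheetah": (6, -1.0, 1.0),
    "Hopper": (3, -1.0, 1.0),
    "Humanoid": (17, -0.4, 0.4),
    "InvertedDoublePendulum": (1, -1.0, 1.0),
    "InvertedPendulum": (1, -3.0, 3.0),
    "Pusher": (7, -2.0, 2.0),
    "Reacher": (2, -1.0, 1.0),
    "Swimmer": (2, -1.0, 1.0),
    "Walker2d": (6, -1.0, 1.0),
}


def scale_action(env, unit_action):
    """Clip each value to [-1, 1], then map it linearly onto [low, high]."""
    dim, low, high = ACTION_SPACES[env]
    if len(unit_action) != dim:
        raise ValueError(f"{env} expects {dim} action values, got {len(unit_action)}")
    scaled = []
    for a in unit_action:
        a = max(-1.0, min(1.0, a))                      # clip to the unit range
        scaled.append(low + (a + 1.0) * 0.5 * (high - low))
    return scaled


print(scale_action("InvertedPendulum", [0.5]))  # [1.5]
```

For environments whose range is already [-1, 1], the mapping reduces to plain clipping.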