MuJoCo Playground

EnvPool provides native C++ implementations for the non-DM-Control tasks from google-deepmind/mujoco_playground tag v0.2.0. This covers all 19 Playground locomotion tasks and all 10 Playground manipulation tasks in that release. The implementation uses the pinned Playground XMLs together with google-deepmind/mujoco_menagerie commit 1b86ece576591213e2b666ebf59508454200ca97 for robot assets.

MuJoCo Playground also vendors DM Control Suite tasks, but EnvPool already ships those through the existing DeepMind Control Suite family. They are not registered again here.

Each task has both the direct task ID and a MuJoCoPlayground/ alias, for example Go1Getup-v1 and MuJoCoPlayground/Go1Getup-v1.

Task Coverage

All action spaces are Box(-1, 1, dtype=float64). All tasks support render_mode="rgb_array" and pixel-only observations through from_pixels=True.

EnvPool task ID

Upstream task

Observation

Action

Render

AlohaHandOver-v1

AlohaHandOver

obs shape (83,)

(14,)

yes

AlohaSinglePegInsertion-v1

AlohaSinglePegInsertion

obs shape (82,)

(14,)

yes

ApolloJoystickFlatTerrain-v1

ApolloJoystickFlatTerrain

state (112,); privileged_state (224,)

(32,)

yes

BarkourJoystick-v1

BarkourJoystick

obs shape (465,)

(12,)

yes

BerkeleyHumanoidJoystickFlatTerrain-v1

BerkeleyHumanoidJoystickFlatTerrain

state (52,); privileged_state (114,)

(12,)

yes

BerkeleyHumanoidJoystickRoughTerrain-v1

BerkeleyHumanoidJoystickRoughTerrain

state (52,); privileged_state (114,)

(12,)

yes

G1JoystickFlatTerrain-v1

G1JoystickFlatTerrain

state (103,); privileged_state (216,)

(29,)

yes

G1JoystickRoughTerrain-v1

G1JoystickRoughTerrain

state (103,); privileged_state (216,)

(29,)

yes

Go1JoystickFlatTerrain-v1

Go1JoystickFlatTerrain

state (48,); privileged_state (123,)

(12,)

yes

Go1JoystickRoughTerrain-v1

Go1JoystickRoughTerrain

state (48,); privileged_state (123,)

(12,)

yes

Go1Getup-v1

Go1Getup

state (42,); privileged_state (91,)

(12,)

yes

Go1Handstand-v1

Go1Handstand

state (45,); privileged_state (94,)

(12,)

yes

Go1Footstand-v1

Go1Footstand

state (45,); privileged_state (94,)

(12,)

yes

H1InplaceGaitTracking-v1

H1InplaceGaitTracking

obs shape (186,)

(19,)

yes

H1JoystickGaitTracking-v1

H1JoystickGaitTracking

obs shape (113,)

(19,)

yes

LeapCubeReorient-v1

LeapCubeReorient

state (57,); privileged_state (128,)

(16,)

yes

LeapCubeRotateZAxis-v1

LeapCubeRotateZAxis

state (32,); privileged_state (105,)

(16,)

yes

Op3Joystick-v1

Op3Joystick

obs shape (147,)

(20,)

yes

PandaPickCube-v1

PandaPickCube

obs shape (66,)

(8,)

yes

PandaPickCubeCartesian-v1

PandaPickCubeCartesian

obs shape (70,)

(3,)

yes

PandaPickCubeOrientation-v1

PandaPickCubeOrientation

obs shape (66,)

(8,)

yes

PandaOpenCabinet-v1

PandaOpenCabinet

obs shape (55,)

(8,)

yes

PandaRobotiqPushCube-v1

PandaRobotiqPushCube

obs shape (48,)

(7,)

yes

AeroCubeRotateZAxis-v1

AeroCubeRotateZAxis

state (14,); privileged_state (81,)

(7,)

yes

SpotFlatTerrainJoystick-v1

SpotFlatTerrainJoystick

state (81,); privileged_state (167,)

(12,)

yes

SpotGetup-v1

SpotGetup

obs shape (30,)

(12,)

yes

SpotJoystickGaitTracking-v1

SpotJoystickGaitTracking

obs shape (69,)

(12,)

yes

T1JoystickFlatTerrain-v1

T1JoystickFlatTerrain

state (85,); privileged_state (180,)

(23,)

yes

T1JoystickRoughTerrain-v1

T1JoystickRoughTerrain

state (85,); privileged_state (180,)

(23,)

yes

Render

Rendering is implemented in C++ through EnvPool’s MuJoCo OffscreenRenderer. The Playground env owns the same mjModel and mjData used by stepping; env.render() draws directly from that native state. Pixel-observation variants render once per reset or step, update the frame stack, and cache that same frame so a same-step env.render() call returns the identical image.

The render API supports render_mode="rgb_array", render_width, render_height, and render_camera_id. If no explicit render size is requested, env.render() uses 480 by 480 pixels. Pixel observations default to 84 by 84 pixels.

Like other native MuJoCo environments in EnvPool, MuJoCo Playground also supports pixel-only observations through from_pixels=True. In that mode the public observation is obs["pixels"] with channel-first shape (3 * frame_stack, render_height, render_width) and dtype uint8; the state and privileged-state vectors are not returned as observations.

import envpool

env = envpool.make_gymnasium(
    "Go1JoystickFlatTerrain-v1",
    num_envs=8,
    from_pixels=True,
    frame_stack=3,
    render_width=84,
    render_height=84,
    render_mode="rgb_array",
)
obs, info = env.reset()

The reset-frame comparison below places EnvPool on the left and the official MuJoCo Playground renderer on the right. The documentation image is generated by syncing the official renderer from EnvPool’s reset qpos and qvel debug fields, so both sides render the same MuJoCo state. The comparison ignores RGB channel deltas up to 3/255 when counting mismatched pixels, while still enforcing a bounded raw mean absolute difference.

For Op3Joystick-v1 the official render-side model is loaded from the filesystem XML/assets instead of MuJoCo Playground’s in-memory asset dict. OP3’s visual meshes and simplified collision meshes share STL basenames, and MuJoCo’s asset dict cannot represent both at once without collapsing the visual model to the simplified collision mesh. The filesystem path keeps the same pinned XML and menagerie assets while preserving the intended visual meshes.

../_images/mujoco_playground_official_compare.png

Regenerate the image with:

bazel run --config=debug //scripts:render_compare -- \
  --family=mujoco_playground \
  --columns=4 \
  --source-width=480 \
  --source-height=360 \
  --tile-width=144 \
  --tile-height=108 \
  --max-mean-abs-diff=4.2 \
  --max-mismatch-ratio=0.15 \
  --max-ignored-abs-diff=3

Validation

The native implementation is checked against the official MuJoCo Playground Python oracle. The coverage test compares EnvPool’s Playground registry against the pinned upstream locomotion.ALL_ENVS and manipulation.ALL_ENVS lists; the vendored DM Control Suite registry is intentionally excluded.

The alignment test reset-syncs MuJoCo state once, then drives both implementations with the same external actions and compares observations, rewards, termination flags, truncation flags, and exposed info fields.

Rendering is checked separately on reset and for the first three control steps against the official MuJoCo renderer using the same synchronized state. The test keeps per-task pixel budgets narrow because OpenGL rasterization can leave small backend-dependent edge differences even when the MuJoCo state is aligned.