MuJoCo Playground
EnvPool provides native C++ implementations for the non-DM-Control tasks from
google-deepmind/mujoco_playground tag v0.2.0. This covers all 19
Playground locomotion tasks and all 10 Playground manipulation tasks in that
release. The implementation uses the pinned Playground XMLs together with
google-deepmind/mujoco_menagerie commit
1b86ece576591213e2b666ebf59508454200ca97 for robot assets.
MuJoCo Playground also vendors DM Control Suite tasks, but EnvPool already ships those through the existing DeepMind Control Suite family. They are not registered again here.
Each task is registered under both its direct task ID and a
MuJoCoPlayground/ alias, for example Go1Getup-v1 and
MuJoCoPlayground/Go1Getup-v1.
Task Coverage
All action spaces are Box(-1, 1, dtype=float64). All tasks support
render_mode="rgb_array" and pixel-only observations through
from_pixels=True.
| EnvPool task ID | Upstream task | Observation | Action | Render |
|---|---|---|---|---|
| (29 task rows: 19 locomotion, 10 manipulation; the ID, upstream task, observation, and action cells are not recoverable here) | | | | yes |
Render
Rendering is implemented in C++ through EnvPool’s MuJoCo
OffscreenRenderer. The Playground env owns the same mjModel and
mjData used by stepping; env.render() draws directly from that native
state. Pixel-observation variants render once per reset or step, update the
frame stack, and cache that same frame so a same-step env.render() call
returns the identical image.
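The render-once-and-cache behavior described above can be illustrated with a toy Python stand-in. This is only a sketch: the class `CachedPixelEnv`, its one-pixel placeholder frames, and the choice to pad the frame stack with the reset frame are all hypothetical and do not describe EnvPool's C++ internals.

```python
from collections import deque


class CachedPixelEnv:
    """Toy stand-in that draws one frame per reset/step and reuses it."""

    def __init__(self, frame_stack=3):
        self.frame_stack = frame_stack
        self._frames = deque(maxlen=frame_stack)
        self._cached = None
        self._t = 0
        self.render_calls = 0  # counts how often the "renderer" runs

    def _draw(self):
        # Placeholder for the offscreen renderer: a 1-pixel RGB frame
        # whose value depends on the step counter (hypothetical).
        self.render_calls += 1
        return bytes([self._t % 256] * 3)

    def _refresh(self):
        # Render exactly once, push into the stack, and cache the frame.
        self._cached = self._draw()
        self._frames.append(self._cached)

    def reset(self):
        self._t = 0
        self._frames.clear()
        self._refresh()
        # Assumption: the stack is padded with the reset frame.
        while len(self._frames) < self.frame_stack:
            self._frames.append(self._cached)
        return list(self._frames)

    def step(self):
        self._t += 1
        self._refresh()
        return list(self._frames)

    def render(self):
        # No new draw: return the frame cached by the last reset/step.
        return self._cached


env = CachedPixelEnv(frame_stack=3)
env.reset()
stacked = env.step()
frame = env.render()
print(frame is stacked[-1], env.render_calls)  # True 2
```

The point of the sketch is that `render()` never triggers a second draw for the same step, so the newest stacked frame and the rendered image are the same object.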
The render API supports render_mode="rgb_array", render_width,
render_height, and render_camera_id. If no explicit render size is
requested, env.render() uses 480 by 480 pixels. Pixel observations default
to 84 by 84 pixels.
Like other native MuJoCo environments in EnvPool, MuJoCo Playground also
supports pixel-only observations through from_pixels=True. In that mode the
public observation is obs["pixels"] with channel-first shape
(3 * frame_stack, render_height, render_width) and dtype uint8; the
state and privileged-state vectors are not returned as observations.

    import envpool

    env = envpool.make_gymnasium(
        "Go1JoystickFlatTerrain-v1",
        num_envs=8,
        from_pixels=True,
        frame_stack=3,
        render_width=84,
        render_height=84,
        render_mode="rgb_array",
    )
    obs, info = env.reset()

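The shape rule stated above, (3 * frame_stack, render_height, render_width), can be written as a small helper. The function name `pixel_obs_shape` is hypothetical and used only for illustration; it computes the per-environment shape, and any leading batch dimension for num_envs is not modeled here.

```python
def pixel_obs_shape(
    frame_stack: int, render_height: int, render_width: int
) -> tuple[int, int, int]:
    """Channel-first shape of obs["pixels"]: 3 RGB channels per stacked frame."""
    if frame_stack < 1 or render_height < 1 or render_width < 1:
        raise ValueError("frame_stack and render size must be positive")
    return (3 * frame_stack, render_height, render_width)


# Settings from the example above: three stacked 84 x 84 RGB frames.
shape = pixel_obs_shape(frame_stack=3, render_height=84, render_width=84)
print(shape)  # (9, 84, 84)
```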
The reset-frame comparison below places EnvPool on the left and the official
MuJoCo Playground renderer on the right. The documentation image is generated
by syncing the official renderer from EnvPool’s reset qpos and qvel
debug fields, so both sides render the same MuJoCo state. The comparison
ignores RGB channel deltas up to 3/255 when counting mismatched pixels, while
still enforcing a bounded raw mean absolute difference.
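One way to read the two metrics described above is sketched below in plain Python. It assumes that a pixel counts as mismatched when any of its RGB channels differs by more than the ignored delta, and that the mean absolute difference is taken over all raw channel values; the comparison script's exact definitions may differ. The function name `compare_frames` is hypothetical, and the threshold values are taken from the flags of the regeneration command in this section.

```python
def compare_frames(
    a: bytes, b: bytes, ignored_abs_diff: int = 3
) -> tuple[float, float]:
    """Return (mean_abs_diff, mismatch_ratio) for two packed RGB frames."""
    if len(a) != len(b) or not a or len(a) % 3 != 0:
        raise ValueError("frames must be non-empty, equal-sized packed RGB")
    total = 0
    mismatched = 0
    for i in range(0, len(a), 3):
        deltas = [abs(a[i + c] - b[i + c]) for c in range(3)]
        total += sum(deltas)  # raw difference, no tolerance applied
        if max(deltas) > ignored_abs_diff:  # tolerance only affects counting
            mismatched += 1
    return total / len(a), mismatched / (len(a) // 3)


# Four pixels: identical, off by 3 (ignored), off by 4 (mismatch), identical.
left = bytes([10, 10, 10] * 4)
right = bytes([10, 10, 10, 13, 10, 10, 10, 14, 10, 10, 10, 10])
mean_abs, ratio = compare_frames(left, right, ignored_abs_diff=3)
passes = mean_abs <= 4.2 and ratio <= 0.15
print(round(mean_abs, 4), ratio, passes)  # 0.5833 0.25 False
```

Keeping the mean absolute difference on raw values means many small sub-tolerance deltas can still fail the comparison even when no single pixel is counted as mismatched.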
For Op3Joystick-v1 the official render-side model is loaded from the
filesystem XML/assets instead of MuJoCo Playground’s in-memory asset dict.
OP3’s visual meshes and simplified collision meshes share STL basenames, and
MuJoCo’s asset dict cannot represent both at once without collapsing the visual
model to the simplified collision mesh. The filesystem path keeps the same
pinned XML and menagerie assets while preserving the intended visual meshes.
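The basename collision can be illustrated with a minimal sketch. It assumes an asset mapping keyed by file basename, which is what the description above implies; the paths and the helper `build_asset_dict` are hypothetical and are not the actual OP3 asset names or Playground code.

```python
import posixpath


def build_asset_dict(paths):
    """Key each asset by basename; a later entry overwrites an earlier one."""
    return {posixpath.basename(p): p for p in paths}


# Hypothetical layout: a visual mesh and a simplified collision mesh
# that share the same STL basename in different directories.
paths = ["visual/body.stl", "collision/simplified/body.stl"]
assets = build_asset_dict(paths)
print(len(assets), assets["body.stl"])  # 1 collision/simplified/body.stl
```

Loading from the filesystem avoids this because each mesh is resolved by its full path rather than by basename alone.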
Regenerate the image with:

    bazel run --config=debug //scripts:render_compare -- \
      --family=mujoco_playground \
      --columns=4 \
      --source-width=480 \
      --source-height=360 \
      --tile-width=144 \
      --tile-height=108 \
      --max-mean-abs-diff=4.2 \
      --max-mismatch-ratio=0.15 \
      --max-ignored-abs-diff=3

Validation
The native implementation is checked against the official MuJoCo Playground
Python oracle. The coverage test compares EnvPool’s Playground registry against
the pinned upstream locomotion.ALL_ENVS and manipulation.ALL_ENVS lists;
the vendored DM Control Suite registry is intentionally excluded.
The alignment test reset-syncs MuJoCo state once, then drives both implementations with the same external actions and compares observations, rewards, termination flags, truncation flags, and exposed info fields.
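The structure of such an alignment check can be sketched with a toy harness. Everything here is hypothetical: `CounterEnv` is a stand-in for both the native and the oracle implementation, and `check_alignment` is not the project's actual test. It only shows the pattern of syncing state once at reset, feeding identical actions to both sides, and comparing every output.

```python
import random


class CounterEnv:
    """Toy deterministic environment standing in for either implementation."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.state = 0.0
        self.t = 0

    def reset(self, state=0.0):
        self.state = state
        self.t = 0
        return self.state

    def step(self, action):
        self.state += action
        self.t += 1
        reward = -abs(self.state)
        terminated = abs(self.state) > 10.0
        truncated = self.t >= self.horizon
        return self.state, reward, terminated, truncated


def check_alignment(native, oracle, actions, start_state, atol=1e-9):
    """Sync once at reset, then compare outputs step by step."""
    assert abs(native.reset(start_state) - oracle.reset(start_state)) <= atol
    steps = 0
    for action in actions:
        obs_n, rew_n, term_n, trunc_n = native.step(action)
        obs_o, rew_o, term_o, trunc_o = oracle.step(action)
        assert abs(obs_n - obs_o) <= atol, f"observation differs at step {steps}"
        assert abs(rew_n - rew_o) <= atol, f"reward differs at step {steps}"
        assert term_n == term_o, f"termination differs at step {steps}"
        assert trunc_n == trunc_o, f"truncation differs at step {steps}"
        steps += 1
        if term_n or trunc_n:
            break
    return steps


rng = random.Random(0)
actions = [rng.uniform(-1.0, 1.0) for _ in range(5)]
compared = check_alignment(CounterEnv(), CounterEnv(), actions, start_state=0.5)
print(compared)  # 5
```

Driving both sides from externally generated actions keeps the comparison independent of either implementation's own random number stream.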
Rendering is checked separately on reset and for the first three control steps against the official MuJoCo renderer using the same synchronized state. The test allows a per-task pixel budget because OpenGL rasterization can leave small backend-dependent edge differences even when the MuJoCo state is aligned, and it keeps that budget narrow so that genuine rendering regressions still fail.