This is a codebase that the PRPL lab is using for multiple projects related to programmatic policy learning.
Requirements: Python >=3.10 and <=3.12.
We strongly recommend uv. The steps below assume that you have uv installed; if you do not, just remove `uv` from the commands and the installation should still work.

```shell
uv pip install -e ".[develop]"
```

Check the installation:

```shell
./run_ci_checks.sh
```
If you want to use an OpenAI LLM, make sure you have an `OPENAI_API_KEY` environment variable set (e.g., see here).
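For example, you can set the key for your current shell session like this (the key value is a placeholder; use your own, or add the line to your shell profile to make it persistent):

```shell
# Set the API key for the current shell session only.
# Replace the placeholder with your real key.
export OPENAI_API_KEY="sk-..."
```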
```python
from pathlib import Path

import gymnasium

from prpl_llm_utils.cache import SQLite3PretrainedLargeModelCache
from prpl_llm_utils.models import OpenAIModel
from programmatic_policy_learning.approaches.ppl_approach import (
    LLMPPLApproach,
)

env = gymnasium.make("LunarLander-v3")
env.action_space.seed(123)
environment_description = (
    "The well-known LunarLander in gymnasium, i.e., "
    'env = gymnasium.make("LunarLander-v3")'
)
cache = SQLite3PretrainedLargeModelCache(Path("llm_cache.db"))
llm = OpenAIModel("gpt-4o-mini", cache)
approach = LLMPPLApproach(
    environment_description,
    env.observation_space,
    env.action_space,
    seed=123,
    llm=llm,
)
obs, info = env.reset()
approach.reset(obs, info)
print(approach._policy)
for _ in range(5):
    action = approach.step()
    assert env.action_space.contains(action)
    obs, reward, terminated, _, info = env.step(action)
    approach.update(obs, reward, terminated, info)
```

We use hydra to run experiments at scale. See `experiments/run_experiment.py`. For example:

```shell
python experiments/run_experiment.py -m env=lunar_lander llm=openai seed='range(0,2)'
```
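A multirun command like this typically resolves against a top-level Hydra config with config groups for `env` and `llm`. A hypothetical sketch of what such a file could look like (the actual config in this repo may differ):

```yaml
# Hypothetical conf/config.yaml sketch -- illustrative only.
defaults:
  - env: lunar_lander   # resolved from conf/env/
  - llm: openai         # resolved from conf/llm/
  - _self_

seed: 0
```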
If you encounter an error when installing dependencies (e.g., `box2d-py`) that looks like this:

```
Box2D/Box2D_wrap.cpp:3378:10: fatal error: 'string' file not found
 3378 | #include <string>
      |          ^~~~~~~~
1 error generated.
error: command '/usr/bin/clang++' failed with exit code 1
```
This likely means that your macOS Command Line Tools (CLT) or SDK is not installed or selected correctly, so the compiler (`clang++`) cannot find the C++ standard library headers.
To fix this issue, try these steps:
- Reinstall or point to the correct Command Line Tools (CLT):
  - Remove any broken or partial CLT installations:

    ```shell
    sudo rm -rf /Library/Developer/CommandLineTools
    ```

  - Reinstall the CLT (a GUI prompt will appear):

    ```shell
    xcode-select --install
    ```

- After completing the installation, try installing the dependencies again:

  ```shell
  uv pip install -e ".[develop]"
  ```

If you are using uv to manage your virtual environment, you can also try installing `box2d-py` directly to verify the fix:

```shell
uv pip install box2d-py
```

You can add environments in two ways:
- Plain Gymnasium env (already registered via `gymnasium.make`)
- Provider-based env (the env lives in a separate repo and needs a small adapter)

If the env is already registered with Gymnasium, just add a YAML under `conf/env/` and you’re done.

Example: `conf/env/lunarlander.yaml`
```yaml
# Passed into gymnasium.make() to create the environment.
make_kwargs:
  id: "LunarLander-v3"
  render_mode: null  # "human", "rgb_array", or null
# Optional, purely descriptive.
description: "The well-known LunarLander in gymnasium, i.e., env = gymnasium.make('LunarLander-v3')"
```

How it’s used in code:
```python
from programmatic_policy_learning.env.registry import EnvRegistry

registry = EnvRegistry()
env = registry.load(cfg.env)  # default fallback is gymnasium.make(**make_kwargs)
```

If you don’t specify a `provider`, `EnvRegistry` falls back to `gymnasium.make(**make_kwargs)`.
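To make the routing concrete, here is a minimal sketch of that logic. This is not the actual `EnvRegistry` implementation (`SketchRegistry` is a made-up name); it only illustrates the "provider if present, else `gymnasium.make`" behavior described above:

```python
from typing import Any, Callable


class SketchRegistry:
    """Illustrative only: route to a provider if one is named, else fall back."""

    def __init__(self) -> None:
        # Maps provider names to factory functions, as in the real registry.
        self._providers: dict[str, Callable[[Any], Any]] = {}

    def load(self, env_cfg: Any):
        provider = getattr(env_cfg, "provider", None)
        if provider is not None:
            # Provider-based env: delegate to the registered factory.
            return self._providers[provider](env_cfg)
        # Plain Gymnasium env: the default fallback.
        import gymnasium  # imported lazily in this sketch

        return gymnasium.make(**env_cfg.make_kwargs)
```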
Use this when your env lives in another repo (e.g., PRBench, GGG, custom maze env).
You’ll: (a) create a YAML with a provider, (b) add a provider function, and (c) if needed, pin the external repo in `pyproject.toml`.

Example: `conf/env/prbench_motion2d_p1.yaml`
```yaml
make_kwargs:
  id: "prbench/Motion2D-p1-v0"
  render_mode: null
provider: prbench  # <--- important
description: "PRBench Motion2D-p1. Gymnasium-style env registered by PRBench"
```

Edit: `programmatic_policy_learning/env/registry.py`
Add an entry to the provider map:

```python
self._providers: dict[str, Callable[[Any], Any]] = {
    "ggg": create_ggg_env,
    "prbench": create_prbench_env,
    # "gym_maze": create_maze_env,  # example for your own provider
}
```

File structure:
```
programmatic_policy_learning/
  env/
    providers/
      prbench_provider.py  # define create_prbench_env(cfg)
      ggg_provider.py      # define create_ggg_env(cfg)
      maze_provider.py     # define create_maze_env(cfg) (example)
```

Example: `programmatic_policy_learning/env/providers/prbench_provider.py`
```python
from __future__ import annotations

from typing import Any

import gymnasium as gym


def create_prbench_env(cfg: Any):
    """Create and return a PRBench env using cfg.env.make_kwargs."""
    make_kwargs = dict(cfg.env.make_kwargs)
    env = gym.make(**make_kwargs)
    return env
```

Your provider can do anything needed (import the external package, wrap the env, set seeds, apply wrappers, etc.). Just return the final env.
If your provider imports an external repo, put it in `pyproject.toml` under `dependencies = [...]`, so CI and collaborators get the same version.

Example (GGG):

```toml
dependencies = [
    "generalization_grid_games@git+https://github.com/zahraabashir/generalization_grid_games.git@ee0a559",
]
```

Example (your own repo):

```toml
dependencies = [
    "my_cool_env_pkg@git+https://github.com/your-org/my_cool_env_pkg.git@<commit-hash>",
]
```

After this, you only need to run the following command to install that dependency:

```shell
uv pip install -e ".[develop]"
```

Same pattern for both plain and provider-based envs:
```python
from programmatic_policy_learning.env.registry import EnvRegistry

registry = EnvRegistry()
env = registry.load(cfg.env)  # uses provider if present, else gymnasium.make
```

- If your YAML has `provider: ...`, `EnvRegistry` routes to the matching provider function.
- If there’s no `provider`, it calls `gymnasium.make(**make_kwargs)`.
- Add `conf/env/<your_env>.yaml`
- If the env comes from an external repo:
  - Add a dependency pin in `pyproject.toml` under `[project.optional-dependencies]`
  - Add a provider entry in `EnvRegistry` (provider name → function)
  - Implement `create_<provider>_env(cfg)` in `env/providers/<provider>_provider.py`
- Instantiate with `EnvRegistry().load(cfg.env)`

That's it!
If you want to implement an environment yourself (instead of importing it from another repo), you can follow the same provider-based structure:

- Create a new provider file under `programmatic_policy_learning/env/providers/x_provider.py`.
- Inside this file, implement your custom environment class (e.g., `MyCustomEnv`).
- At the end of the file, also implement a factory function (as before) like `def create_x_env(cfg: Any):`.
- Add a YAML under `conf/env/` (as in the examples above), and register your provider in `EnvRegistry`.

This way, whether your env comes from an external repo or is defined locally, the process looks the same: your provider file is the single place that keeps both the environment definition and the factory function.
- Ask an owner of the repository to add your GitHub username to the collaborators list
- All checks must pass before code is merged (see `./run_ci_checks.sh`)
- All code goes through the pull request review process on GitHub