Wrappers

Wrappers in powerzoo/wrappers/ adapt the PowerZoo Python contract to the API that a specific RL algorithm expects. They never change physics or task semantics; they only translate API shapes.

This page is the practical reference for what each wrapper does. Per-class signatures live in API · Wrappers.

What every wrapper preserves

All PowerZoo wrappers respect the contract documented in Python contract:

  • The reward channel still carries only the economic objective.
  • Core CMDP envs expose fixed-order vector costs through env.constraint_names() and info['constraint_costs']; scalar info['cost'] exists only in compatibility wrappers.
  • The 5 observation modes (global / local / local_plus_forecast / local_plus_voltage / ders_local) are not modified; only the container changes (Dict → flat Box, single-agent → PettingZoo Parallel, …).
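The vector-cost convention above can be illustrated with a stub (the class, names, and values below are made up for illustration; they are not the real PowerZoo env):

```python
class StubCMDPEnv:
    """Illustrative stub, not the real PowerZoo GridEnv: it only shows the
    fixed-order vector-cost convention that every wrapper must preserve."""

    def constraint_names(self):
        # Index i of info['constraint_costs'] always corresponds to
        # constraint_names()[i]; these names are hypothetical.
        return ["voltage", "line_flow", "ramp"]

    def step(self, action):
        obs = [0.0]
        reward = -1.0  # reward carries only the economic objective
        info = {"constraint_costs": [0.0, 0.3, 0.1]}  # same order as the names
        return obs, reward, False, False, info


env = StubCMDPEnv()
_, reward, _, _, info = env.step(None)
# pair each cost with its constraint name via the fixed ordering
costs = dict(zip(env.constraint_names(), info["constraint_costs"]))
print(costs["line_flow"])  # 0.3
```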

You can stack wrappers on each other; the canonical orderings are documented below.

GymnasiumWrapper — adapt a raw GridEnv

Adapts a raw GridEnv (state-dict return) to the standard Gymnasium 5-tuple:

from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper

env = GymnasiumWrapper(TransGridEnv())
obs, info = env.reset(seed=42)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

Use this when you want to drive a GridEnv directly (no task wrapping); for benchmark experiments, make_task_env(...) already returns a Gymnasium-style env.

NormalizationWrapper — running statistics

Normalises observations (and optionally actions) to [-1, 1] using running statistics. Stack on top of GymnasiumWrapper:

from powerzoo.wrappers import GymnasiumWrapper, NormalizationWrapper

env = NormalizationWrapper(GymnasiumWrapper(TransGridEnv()))

For task envs, powerzoo.rl.make_env(name, normalize=True) does the same thing.
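One common way to implement running-statistics normalisation is a Welford-style online mean/variance with z-score clipping. This is a sketch of the general technique, not NormalizationWrapper's actual internals:

```python
import math


class RunningNorm:
    """Welford-style running mean/variance; maps values into [-1, 1] by
    clipping the z-score. A sketch of the technique, not PowerZoo's code."""

    def __init__(self, eps=1e-8, clip=1.0):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        self.eps, self.clip = eps, clip

    def update(self, x):
        # Welford's online update: numerically stable mean and M2
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        std = math.sqrt(self.m2 / max(self.n, 1)) + self.eps
        z = (x - self.mean) / std
        return max(-self.clip, min(self.clip, z))


norm = RunningNorm()
for obs in [10.0, 12.0, 8.0, 11.0]:
    norm.update(obs)
print(norm.normalize(12.0))  # 1.0 (z-score above 1, clipped)
```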

ForecastWrapper — append a load-forecast window

Appends a horizon-length demand forecast to every observation. Extends observation_space automatically (base dim + horizon).

| Parameter | Default | Description |
|---|---|---|
| horizon | 6 | Number of future steps to append. |
| mode | 'perfect' | 'perfect' (ground truth), 'noisy' (Gaussian noise), or 'none' (zeros). |
| noise_std | 0.02 | Fractional noise std for mode='noisy' (e.g. 0.02 = 2 %). |
| normalize | True | Divide forecast values by the dataset maximum. |

from powerzoo.wrappers import GymnasiumWrapper, ForecastWrapper

env = ForecastWrapper(GymnasiumWrapper(TransGridEnv()), horizon=6, mode='noisy')
obs, info = env.reset(seed=0)
# obs[-6:] contains the next 6 half-hour demand values

The three forecast modes let you measure forecast-quality sensitivity on the same task, without changing the underlying physics.
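The assumed semantics of the three modes can be sketched as a plain function over a demand series (this mirrors the parameter table above but is not the exact ForecastWrapper implementation):

```python
import random


def make_forecast(demand, t, horizon=6, mode="perfect", noise_std=0.02, rng=None):
    """Sketch of the three forecast modes described above (assumed
    semantics, not ForecastWrapper's exact code)."""
    window = demand[t + 1 : t + 1 + horizon]
    window += [0.0] * (horizon - len(window))  # pad past the end of the series
    if mode == "none":
        return [0.0] * horizon  # uninformative forecast
    if mode == "noisy":
        rng = rng or random.Random(0)
        # fractional Gaussian noise: std is relative to the value itself
        return [v * (1.0 + rng.gauss(0.0, noise_std)) for v in window]
    return window  # 'perfect': ground-truth future demand
```

Sweeping mode over 'perfect' / 'noisy' / 'none' on the same series is the forecast-sensitivity experiment this section describes.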

TaskCMDPWrapper — attach task-selected CMDP metadata

Keeps the standard Gymnasium 5-tuple, but annotates info with:

  • constraint_names / constraint_costs — the core env's full vector
  • selected_constraint_names / selected_constraint_costs — the benchmark task's chosen subset
  • selected_cost_sum — scalar compatibility alias for the selected subset

This is how the DSO, TSO and DC microgrid tasks attach a benchmark CMDP spec without mutating the core env's semantics.
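The bookkeeping amounts to projecting the full cost vector onto a task-selected subset of names. A sketch with plain dicts (the info keys are from this page; the selection mechanics are assumed):

```python
def annotate_info(info, constraint_names, selected):
    """Sketch of the TaskCMDPWrapper annotation: keep the full vector and
    add the task-selected projection plus its scalar alias.
    (Assumed mechanics, not the real class.)"""
    costs = info["constraint_costs"]
    idx = [constraint_names.index(n) for n in selected]
    info["constraint_names"] = constraint_names
    info["selected_constraint_names"] = selected
    info["selected_constraint_costs"] = [costs[i] for i in idx]
    # scalar compatibility alias: sum over the selected subset only
    info["selected_cost_sum"] = sum(info["selected_constraint_costs"])
    return info


info = annotate_info(
    {"constraint_costs": [0.0, 0.3, 0.1]},
    ["voltage", "line_flow", "ramp"],   # hypothetical constraint names
    selected=["line_flow", "ramp"],
)
print(info["selected_constraint_costs"])  # [0.3, 0.1]
```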

CMDPWrapper — 6-tuple with vector costs

Returns (obs, reward, costs, terminated, truncated, info), where costs is the task-selected vector from info['selected_constraint_costs'] (or the full vector when no task spec is present).
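The 6-tuple convention can be sketched as a thin shim over a 5-tuple step (a function here for brevity; the real CMDPWrapper is a class, and the stub env below is hypothetical):

```python
def cmdp_step(env, action):
    """Sketch of the 6-tuple convention: pull the cost vector out of info so
    CMDP algorithms can consume (obs, reward, costs, ...) directly."""
    obs, reward, terminated, truncated, info = env.step(action)
    # prefer the task-selected subset; fall back to the full vector
    costs = info.get("selected_constraint_costs", info["constraint_costs"])
    return obs, reward, costs, terminated, truncated, info


class _Stub:
    """Hypothetical 5-tuple env with vector costs in info."""
    def step(self, action):
        return [0.0], -1.0, False, False, {"constraint_costs": [0.2, 0.1]}


obs, r, costs, term, trunc, info = cmdp_step(_Stub(), None)
print(costs)  # [0.2, 0.1] — full vector, since no task spec is present
```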

SafeRLWrapper — 6-tuple for OmniSafe / Safety-Gymnasium

Returns a 6-tuple (obs, reward, cost, terminated, truncated, info). Cost extraction priority:

  1. info['selected_cost_sum'] — task-selected scalar projection of the CMDP vector.
  2. info['cost_sum'] — full-vector sum (legacy scalar alias).
  3. info['cost'] — compatibility-only fallback.

from powerzoo.wrappers import GymnasiumWrapper, SafeRLWrapper, TaskCMDPWrapper

env = TaskCMDPWrapper(GymnasiumWrapper(TransGridEnv()), constraint_spec=...)
env = SafeRLWrapper(env, cost_threshold=25.0)
obs, info = env.reset(seed=0)
obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())

| Parameter | Default | Description |
|---|---|---|
| cost_threshold | 25.0 | Scalar threshold exposed as env.cost_threshold for OmniSafe. |
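The three-level extraction priority can be written as a plain function (a sketch; the real SafeRLWrapper does this internally):

```python
def extract_scalar_cost(info):
    """Walk the priority chain documented above: task-selected scalar first,
    then the legacy full-vector sum, then the compatibility-only fallback."""
    for key in ("selected_cost_sum", "cost_sum", "cost"):
        if key in info:
            return info[key]
    raise KeyError("no scalar cost channel found in info")


print(extract_scalar_cost({"cost": 1.0, "selected_cost_sum": 0.4}))  # 0.4
```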

GymnasiumSafeWrapper — 5-tuple, scalar compatibility in info

Returns the standard Gymnasium 5-tuple but injects the selected scalar projection into info['cost']. Use this when your algorithm reads cost from info rather than as a separate return value.

from powerzoo.wrappers import GymnasiumWrapper, GymnasiumSafeWrapper

env = GymnasiumSafeWrapper(GymnasiumWrapper(TransGridEnv()))
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print(info['cost'])  # non-negative scalar

MARLWrapper — single-agent → PettingZoo Parallel

Converts a single-agent PowerZoo env to the PettingZoo Parallel API. Each resource registered in the underlying GridEnv becomes an independent agent:

from powerzoo.wrappers import MARLWrapper

env = MARLWrapper(TransGridEnv(), agent_type='generators')
obs, info = env.reset(seed=0)
actions = {a: env.action_space(a).sample() for a in env.agents}
obs, rewards, terminations, truncations, info = env.step(actions)

For task envs, the task-aware TaskPettingZooWrapper (defined in powerzoo/tasks/interfaces/pettingzoo.py, re-exported from powerzoo.wrappers for compatibility) is what make_task_env(..., framework='pettingzoo') uses under the hood. It preserves the reward / cost / observation / __all__ semantics of the underlying task adapter.
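The Parallel return shape these wrappers target can be illustrated with plain dicts; the agent names, stub dynamics, and helper below are hypothetical, and only the per-agent dicts plus the '__all__' aggregate key follow the conventions named above:

```python
def parallel_step(agents, joint_dynamics, actions):
    """Sketch of the PettingZoo-Parallel-style return shape: per-agent dicts
    for obs/rewards/terminations, with an '__all__' aggregate entry."""
    obs, rewards, done = joint_dynamics(actions)
    terminations = {a: done for a in agents}
    terminations["__all__"] = all(terminations[a] for a in agents)
    return ({a: obs[a] for a in agents},
            {a: rewards[a] for a in agents},
            terminations)


agents = ["gen_0", "gen_1"]  # hypothetical generator agents

def dyn(acts):
    # stub joint dynamics: zero observations, cost-like negative rewards
    return ({a: [0.0] for a in agents},
            {a: -sum(acts[a]) for a in agents},
            False)

obs, rewards, terms = parallel_step(agents, dyn, {"gen_0": [1.0], "gen_1": [2.0]})
print(terms["__all__"])  # False
```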

FlattenWrapper — Dict → flat Box

Flattens dict / nested observation_space and action_space into 1-D Box spaces. Useful for algorithms that require flat vector inputs:

from powerzoo.wrappers import GymnasiumWrapper, FlattenWrapper

env = FlattenWrapper(GymnasiumWrapper(TransGridEnv()))

PowerEnv returns Dict observations and accepts Dict actions; FlattenWrapper is what makes the single-agent battery_arbitrage, dc_scheduling and dc_microgrid* tasks return a flat Box.
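The essence of Dict→flat-Box flattening is concatenating sub-observations in a fixed key order so the layout is identical on every step. A minimal sketch (assumed mechanics and made-up keys, not FlattenWrapper's exact code):

```python
def flatten_obs(obs_dict, key_order):
    """Sketch of Dict-to-flat flattening: concatenate sub-observations in a
    fixed, pre-agreed key order so index i always means the same feature."""
    flat = []
    for key in key_order:
        flat.extend(obs_dict[key])
    return flat


# hypothetical Dict observation for a storage task
obs = {"soc": [0.5], "prices": [30.0, 28.0], "demand": [1.2]}
print(flatten_obs(obs, key_order=sorted(obs)))  # [1.2, 30.0, 28.0, 0.5]
```

Fixing key_order once (here, sorted keys) is what makes the flat layout reproducible; the inverse mapping for actions splits the flat vector back along the same boundaries.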

Standard stacks

The most common stacking patterns:

from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import (
    GymnasiumWrapper, NormalizationWrapper, ForecastWrapper, SafeRLWrapper,
)

# Single-agent vanilla
env = GymnasiumWrapper(TransGridEnv())

# Single-agent with normalisation and forecast
env = ForecastWrapper(NormalizationWrapper(GymnasiumWrapper(TransGridEnv())))

# Single-agent Safe RL
env = SafeRLWrapper(NormalizationWrapper(GymnasiumWrapper(TransGridEnv())),
                    cost_threshold=25.0)

For task envs (the recommended path), use powerzoo.rl.make_env(...) and pass normalize=True, forecast_horizon=N, safe_rl=True, cost_threshold=...; it builds the appropriate stack for you.

See also