
Wrappers

All wrappers can be imported from powerzoo.wrappers.

The benchmark-oriented, task-aware PettingZoo wrapper now lives at powerzoo.tasks.interfaces.TaskPettingZooWrapper; it is still re-exported from powerzoo.wrappers for backward compatibility.

from powerzoo.wrappers import (
    GymnasiumWrapper,
    NormalizationWrapper,
    SafeRLWrapper,
    GymnasiumSafeWrapper,
    ForecastWrapper,
    MARLWrapper,
    FlattenWrapper,
)

powerzoo.wrappers.gym_wrappers.GymnasiumWrapper(env)

Bases: Wrapper

Wrap a PowerZoo GridEnv to produce a standard Gymnasium interface.

The inner env's step() returns a state dict as observation; this wrapper calls env.obs(state) to obtain a flat numpy array and passes it through as the Gymnasium observation.

It also exposes:

- env.observation_space / env.action_space (from inner env)
- env.obs_names / env.action_names (human-readable labels)
- Correct (obs, info) return from reset()

Parameters:

env : GridEnv
    Any GridEnv subclass (TransGridEnv, DistGridEnv, …). Required.

Example::

from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper

raw = TransGridEnv()
env = GymnasiumWrapper(raw)

obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

reset(*, seed=None, options=None, **kwargs)

Reset inner env and return (obs_array, info).

step(action)

Step inner env; convert state dict → flat obs array.

Numpy array actions are auto-converted to the dict format expected by GridEnv (e.g. a unit-dispatch vector → {'unit_power_mw': action}).
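The array-to-dict conversion described above can be sketched as follows. The 'unit_power_mw' key comes from the docstring; the helper name and pass-through logic are assumptions, not the wrapper's actual implementation:

```python
import numpy as np

# Hypothetical sketch of the auto-conversion: a bare numpy dispatch
# vector becomes the dict format GridEnv expects; dicts pass through.
def to_grid_action(action):
    if isinstance(action, np.ndarray):
        return {'unit_power_mw': action}  # key per the docstring
    return action
```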

Adapts any PowerZoo GridEnv to the standard Gymnasium five-tuple API:

from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper

env = GymnasiumWrapper(TransGridEnv())
obs, info = env.reset(seed=42)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

powerzoo.wrappers.gym_wrappers.NormalizationWrapper(env, clip=True)

Bases: ObservationWrapper

Normalise observations to [-1, 1] using fixed bounds.

Bounds are derived automatically from the inner environment's case data (physical limits), and are stable across episodes — safe to use during training.

Parameters:

env : Env
    A GymnasiumWrapper or any gym.Env with a Box obs space. Required.
clip : bool
    Whether to clip the normalised observation to [-1, 1]. Default True.

The raw bounds can be inspected via env.obs_low and env.obs_high.

Example::

from powerzoo.wrappers import GymnasiumWrapper, NormalizationWrapper

env = NormalizationWrapper(GymnasiumWrapper(TransGridEnv()))
obs, info = env.reset(seed=0)   # obs ∈ [-1, 1]

Normalises observations to [−1, 1] using fixed bounds derived from the case data. Stack it on top of GymnasiumWrapper:

from powerzoo.wrappers import GymnasiumWrapper, NormalizationWrapper

env = NormalizationWrapper(GymnasiumWrapper(TransGridEnv()))
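Given the fixed bounds exposed as env.obs_low / env.obs_high, the mapping to [−1, 1] is presumably an affine rescale. A minimal sketch (the exact formula is an assumption, not the wrapper's verified implementation):

```python
import numpy as np

# Sketch: affine map from [low, high] to [-1, 1], with optional clipping.
def normalise(obs, low, high, clip=True):
    scaled = 2.0 * (obs - low) / (high - low) - 1.0
    return np.clip(scaled, -1.0, 1.0) if clip else scaled
```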

powerzoo.wrappers.safe_rl_wrapper.SafeRLWrapper(env, cost_threshold=None)

Bases: Wrapper

Wrap a Gymnasium env to emit (obs, reward, cost, terminated, truncated, info).

step(action)

Returns a 6-tuple (obs, reward, cost, terminated, truncated, info) compatible with OmniSafe and Safety-Gymnasium.

Cost extraction priority:

  1. info['selected_constraint_costs'] — CMDP vector selected by the task
  2. info['constraint_costs'] — full vector from the core env
  3. info['cost_sum'], or info['cost'] for compatibility
  4. 0.0 — safe fallback
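The priority order above can be sketched as a lookup. Key names come from the list; reducing a constraint vector to a scalar by summation is an assumption:

```python
import numpy as np

# Hypothetical sketch of the documented cost-extraction priority.
def extract_cost(info):
    for key in ('selected_constraint_costs', 'constraint_costs'):
        if key in info:
            return float(np.sum(info[key]))  # reduce vector to a scalar
    if 'cost_sum' in info:
        return float(info['cost_sum'])
    if 'cost' in info:
        return float(info['cost'])
    return 0.0  # safe fallback
```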
from powerzoo.wrappers import GymnasiumWrapper, SafeRLWrapper

env = SafeRLWrapper(GymnasiumWrapper(TransGridEnv()), cost_threshold=25.0)
obs, info = env.reset(seed=0)
obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())

Parameter       Default  Description
cost_threshold  25.0     Scalar threshold, exposed to OmniSafe as env.cost_threshold

powerzoo.wrappers.safe_rl_wrapper.GymnasiumSafeWrapper(env, cost_threshold=None)

Bases: Wrapper

Keep Gymnasium's 5-tuple API while projecting vector costs to info['cost'].

step(action)

Returns the standard Gymnasium 5-tuple and injects the cost into info['cost']. Use it when the algorithm reads the cost from info rather than receiving it as a separate return value.

from powerzoo.wrappers import GymnasiumWrapper, GymnasiumSafeWrapper

env = GymnasiumSafeWrapper(GymnasiumWrapper(TransGridEnv()))
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print(info['cost'])  # non-negative scalar

powerzoo.wrappers.forecast_wrapper.ForecastWrapper(env, horizon=6, mode='perfect', noise_std=0.02, normalize=True)

Bases: ObservationWrapper

Augment observations with a look-ahead demand forecast.

Parameters

env : gym.Env
    A Gymnasium-wrapped PowerZoo GridEnv.
horizon : int
    Number of future time steps to append to the observation. Default 6.
mode : str
    'perfect', 'noisy', or 'none'. Default 'perfect'.
noise_std : float
    Fractional Gaussian noise std for mode='noisy' (e.g. 0.02 = 2 %). Ignored for other modes. Default 0.02.
normalize : bool
    Whether to normalise forecast values by the maximum demand in the dataset (so each value is in [0, ~1]). Default True.

reset(*, seed=None, options=None, **kwargs)

observation(obs)

Append forecast to base observation.

Appends a horizon-length demand forecast to the end of each observation and automatically extends observation_space (base dim + horizon).

Parameter  Default    Description
horizon    6          Number of future steps appended
mode       'perfect'  'perfect' (ground truth), 'noisy' (Gaussian noise), or 'none' (zeros)
noise_std  0.02       Fractional noise std for mode='noisy' (e.g. 0.02 = 2 %)
normalize  True       Divide forecast values by the dataset maximum
from powerzoo.wrappers import GymnasiumWrapper, ForecastWrapper

env = ForecastWrapper(GymnasiumWrapper(TransGridEnv()), horizon=6, mode='noisy')
obs, info = env.reset(seed=0)
# obs[-6:] contains the next 6 half-hour demand values
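Under mode='noisy', the forecast is presumably the true future demand perturbed by multiplicative fractional Gaussian noise. A sketch under that assumption (function name and RNG handling are hypothetical):

```python
import numpy as np

# Hypothetical sketch of the 'noisy' forecast mode: multiplicative
# fractional Gaussian noise (noise_std=0.02 means ~2 % perturbation).
def noisy_forecast(true_demand, noise_std=0.02, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.normal(0.0, noise_std, size=np.shape(true_demand))
    return np.asarray(true_demand) * (1.0 + noise)
```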

powerzoo.wrappers.marl_wrapper.MARLWrapper(env, agent_type='generators', render_mode=None)

Bases: ParallelEnv

PettingZoo Parallel wrapper for PowerZoo GridEnv.

Parameters

env : GridEnv
    An initialised (and optionally resource-populated) PowerZoo grid env.
agent_type : str
    'generators' or 'resources'.
render_mode : str or None
    Passed through to env.render_mode if set.

reset(seed=None, options=None)

step(actions)

Converts a single-agent PowerZoo env to the PettingZoo Parallel API. Each resource registered in the underlying GridEnv becomes an independent agent.

from powerzoo.wrappers import MARLWrapper

env = MARLWrapper(TransGridEnv(), agent_type='generators')
obs, infos = env.reset(seed=0)
actions = {a: env.action_space(a).sample() for a in env.agents}
obs, rewards, terminations, truncations, infos = env.step(actions)

powerzoo.wrappers.flatten.FlattenWrapper(env, resource_names=None, obs_keys=None, custom_obs_fn=None, custom_action_fn=None)

Bases: Wrapper

Combined wrapper for flattening both observation and action spaces.

Intelligently flattens dict spaces based on controlled resources.

Parameters:

env : Env
    Environment to wrap (should be a PowerEnv instance). Required.
resource_names : Optional[List[str]]
    Resource names to control (e.g. ['bat0', 'wind_0']). If None, controls all resources. Default None.
obs_keys : Optional[List[str]]
    Observation keys to include (e.g. ['grid', 'resources', 'time']). If None, includes all observations. Default None.
custom_obs_fn : Optional[Callable]
    Optional custom observation flattening function. Default None.
custom_action_fn : Optional[Callable]
    Optional custom action mapping function. Default None.
Example

from powerzoo.envs.power_env import PowerEnv
from powerzoo.wrappers.flatten import FlattenWrapper

env = PowerEnv(config)
env = FlattenWrapper(env, resource_names=['bat0'])

from stable_baselines3 import PPO
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

reset(**kwargs)

Reset and flatten observation

step(action)

Convert flat action to dict and step

Flattens dict / nested observation_space and action_space into a 1-D Box. Useful for algorithms that require flat vector inputs.

from powerzoo.wrappers import GymnasiumWrapper, FlattenWrapper

env = FlattenWrapper(GymnasiumWrapper(TransGridEnv()))
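A minimal sketch of what the dict-to-vector flattening might look like. Key ordering, dtype, and recursion are assumptions; FlattenWrapper's real logic is resource-aware and driven by resource_names / obs_keys:

```python
import numpy as np

# Hypothetical sketch: concatenate nested dict values into one 1-D vector.
def flatten_obs(obs_dict):
    parts = []
    for key in sorted(obs_dict):            # deterministic key order (assumption)
        val = obs_dict[key]
        if isinstance(val, dict):
            parts.append(flatten_obs(val))  # recurse into nested dicts
        else:
            parts.append(np.asarray(val, dtype=np.float32).ravel())
    return np.concatenate(parts)
```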