Wrappers¶
All wrappers can be imported from powerzoo.wrappers.
The benchmark-oriented, task-aware PettingZoo wrapper now lives at powerzoo.tasks.interfaces.TaskPettingZooWrapper; it is still re-exported from powerzoo.wrappers for backward compatibility.
```python
from powerzoo.wrappers import (
    GymnasiumWrapper,
    NormalizationWrapper,
    SafeRLWrapper,
    GymnasiumSafeWrapper,
    ForecastWrapper,
    MARLWrapper,
    FlattenWrapper,
)
```
powerzoo.wrappers.gym_wrappers.GymnasiumWrapper(env)
¶
Bases: Wrapper
Wrap a PowerZoo GridEnv to produce a standard Gymnasium interface.
The inner env's step() returns a state dict as observation; this
wrapper calls env.obs(state) to obtain a flat numpy array and passes
it through as the Gymnasium observation.
It also exposes:
- env.observation_space / env.action_space (from inner env)
- env.obs_names / env.action_names (human-readable labels)
- Correct (obs, info) return from reset()
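The obs(state) flattening and the human-readable labels can be sketched with a toy stand-in for the inner env (the class names and state fields below are invented for illustration; only the pattern matches the docs):

```python
import numpy as np

class ToyGridEnv:
    """Toy stand-in for a PowerZoo GridEnv whose reset() returns a state dict."""
    obs_names = ["voltage", "load"]

    def reset(self, seed=None):
        return {"voltage": 1.0, "load": 0.4}, {}

    def obs(self, state):
        # Flatten the state dict into a numpy array, in obs_names order.
        return np.array([state[k] for k in self.obs_names], dtype=np.float32)

class DictToArrayWrapper:
    """Sketch of the pattern: call env.obs(state) so downstream code
    always sees a flat array instead of the raw state dict."""
    def __init__(self, env):
        self.env = env
        self.obs_names = env.obs_names

    def reset(self, seed=None):
        state, info = self.env.reset(seed=seed)
        return self.env.obs(state), info

env = DictToArrayWrapper(ToyGridEnv())
obs, info = env.reset(seed=0)
print(dict(zip(env.obs_names, obs)))  # label each entry of the flat vector
```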
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| env | Any | The PowerZoo GridEnv to wrap. | required |
Example:

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper

raw = TransGridEnv()
env = GymnasiumWrapper(raw)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```
Adapts any PowerZoo GridEnv to the standard Gymnasium five-tuple API:

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper

env = GymnasiumWrapper(TransGridEnv())
obs, info = env.reset(seed=42)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```
powerzoo.wrappers.gym_wrappers.NormalizationWrapper(env, clip=True)
¶
Bases: ObservationWrapper
Normalise observations to [-1, 1] using fixed bounds.
Bounds are derived automatically from the inner environment's case data (physical limits), and are stable across episodes — safe to use during training.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| env | Env | A Gymnasium-wrapped PowerZoo GridEnv. | required |
| clip | bool | Whether to clip the normalised observation to [-1, 1]. | True |
The raw bounds can be inspected via env.obs_low and env.obs_high.
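A minimal sketch of the fixed-bounds scheme, assuming example bounds (the real obs_low / obs_high come from the case data):

```python
import numpy as np

# Example fixed bounds; the real ones are derived once from the case data.
obs_low = np.array([0.9, 0.0])
obs_high = np.array([1.1, 5.0])

def normalise(obs, low=obs_low, high=obs_high, clip=True):
    # Affine map [low, high] -> [-1, 1]; stable across episodes because
    # the bounds never change during training.
    scaled = 2.0 * (obs - low) / (high - low) - 1.0
    return np.clip(scaled, -1.0, 1.0) if clip else scaled

print(normalise(np.array([1.0, 2.5])))  # -> [0. 0.]
```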
Example:

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper, NormalizationWrapper

env = NormalizationWrapper(GymnasiumWrapper(TransGridEnv()))
obs, info = env.reset(seed=0)  # obs ∈ [-1, 1]
```
Normalises observations to [-1, 1] using fixed bounds derived from the case data. Stack it on top of GymnasiumWrapper:

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper, NormalizationWrapper

env = NormalizationWrapper(GymnasiumWrapper(TransGridEnv()))
```
powerzoo.wrappers.safe_rl_wrapper.SafeRLWrapper(env, cost_threshold=None)
¶
Bases: Wrapper
Wrap a Gymnasium env to emit (obs, reward, cost, terminated, truncated, info).
step(action)
¶
Returns a 6-tuple (obs, reward, cost, terminated, truncated, info) compatible with OmniSafe and Safety-Gymnasium.
Cost extraction priority:
- info['selected_constraint_costs'] — the task-selected CMDP vector
- info['constraint_costs'] — the core env's full vector
- info['cost_sum'], or info['cost'] for compatibility
- 0.0 — safe fallback
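The priority order above can be sketched as a standalone function (the key names are from the docs; summing each cost vector to a scalar is an assumption, not confirmed API):

```python
import numpy as np

def extract_cost(info):
    """Sketch of the documented priority order; summing each
    cost vector to a scalar is an assumption, not confirmed API."""
    if "selected_constraint_costs" in info:   # task-selected CMDP vector
        return float(np.sum(info["selected_constraint_costs"]))
    if "constraint_costs" in info:            # core env's full vector
        return float(np.sum(info["constraint_costs"]))
    if "cost_sum" in info:
        return float(info["cost_sum"])
    if "cost" in info:                        # compatibility key
        return float(info["cost"])
    return 0.0                                # safe fallback

print(extract_cost({"constraint_costs": [0.5, 1.5]}))  # -> 2.0
```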
```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper, SafeRLWrapper

env = SafeRLWrapper(GymnasiumWrapper(TransGridEnv()), cost_threshold=25.0)
obs, info = env.reset(seed=0)
obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())
```
| Parameter | Default | Description |
|---|---|---|
| cost_threshold | 25.0 | Scalar threshold, exposed to OmniSafe as env.cost_threshold |
powerzoo.wrappers.safe_rl_wrapper.GymnasiumSafeWrapper(env, cost_threshold=None)
¶
Bases: Wrapper
Keep Gymnasium's 5-tuple API while projecting vector costs to info['cost'].
step(action)
¶
Returns the standard Gymnasium 5-tuple and injects the cost into info['cost']. Use it when the algorithm reads the cost from info instead of receiving it as a separate return value.

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper, GymnasiumSafeWrapper

env = GymnasiumSafeWrapper(GymnasiumWrapper(TransGridEnv()))
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print(info['cost'])  # non-negative scalar
```
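The projection pattern can be sketched with a toy step function (the inner env and its constraint_costs key are stand-ins, not the library's actual implementation):

```python
def safe_step(inner_step, action):
    """Sketch: keep the 5-tuple but project a cost vector to info['cost']."""
    obs, reward, terminated, truncated, info = inner_step(action)
    costs = info.get("constraint_costs", [])  # vector from the inner env (assumed key)
    info["cost"] = float(sum(costs))          # non-negative scalar projection
    return obs, reward, terminated, truncated, info

def toy_inner_step(action):
    # Toy inner env step: fixed reward and a two-entry cost vector.
    return [0.0], 1.0, False, False, {"constraint_costs": [0.25, 0.25]}

obs, reward, terminated, truncated, info = safe_step(toy_inner_step, None)
print(info["cost"])  # -> 0.5
```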
powerzoo.wrappers.forecast_wrapper.ForecastWrapper(env, horizon=6, mode='perfect', noise_std=0.02, normalize=True)
¶
Bases: ObservationWrapper
Augment observations with a look-ahead demand forecast.
Parameters¶
env : gym.Env
A Gymnasium-wrapped PowerZoo GridEnv.
horizon : int
Number of future time steps to append to the observation. Default 6.
mode : str
'perfect', 'noisy', or 'none'. Default 'perfect'.
noise_std : float
Fractional Gaussian noise std for mode='noisy' (e.g. 0.02 = 2 %).
Ignored for other modes. Default 0.02.
normalize : bool
Whether to normalise forecast values by the maximum demand in the
dataset (so each value is in [0, ~1]). Default True.
Appends a demand forecast of length horizon to the end of each observation and automatically extends the observation_space (base dim + horizon).
| Parameter | Default | Description |
|---|---|---|
| horizon | 6 | Number of future steps appended |
| mode | 'perfect' | 'perfect' (ground truth), 'noisy' (Gaussian noise), or 'none' (zeros) |
| noise_std | 0.02 | Fractional noise std for mode='noisy' (e.g. 0.02 = 2 %) |
| normalize | True | Divide forecast values by the dataset maximum |
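The three modes can be sketched on a toy demand series (the function below is an illustration of the documented behaviour, not the wrapper's actual implementation):

```python
import numpy as np

def make_forecast(demand, t, horizon=6, mode="perfect", noise_std=0.02,
                  normalize=True, rng=None):
    """Illustrative forecast for the `horizon` steps after time `t`."""
    rng = np.random.default_rng(0) if rng is None else rng
    future = demand[t + 1 : t + 1 + horizon].astype(float)
    if mode == "none":
        future = np.zeros(horizon)
    elif mode == "noisy":
        # Fractional Gaussian noise: std is a fraction of the true value.
        future = future * (1.0 + rng.normal(0.0, noise_std, size=future.shape))
    if normalize:
        future = future / demand.max()  # each value roughly in [0, 1]
    return future

demand = np.array([10.0, 12.0, 11.0, 13.0, 14.0, 12.0, 15.0, 16.0])
print(make_forecast(demand, t=0, horizon=3, mode="perfect"))  # -> [0.75 0.6875 0.8125]
```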
```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import GymnasiumWrapper, ForecastWrapper

env = ForecastWrapper(GymnasiumWrapper(TransGridEnv()), horizon=6, mode='noisy')
obs, info = env.reset(seed=0)
# obs[-6:] contains the next 6 half-hour demand values
```
powerzoo.wrappers.marl_wrapper.MARLWrapper(env, agent_type='generators', render_mode=None)
¶
Bases: ParallelEnv
PettingZoo Parallel wrapper for PowerZoo GridEnv.
Parameters¶
env : GridEnv
An initialised (and optionally resource-populated) PowerZoo grid env.
agent_type : str
'generators' or 'resources'.
render_mode : str or None
Passed through to env.render_mode if set.
Converts a single-agent PowerZoo env to the PettingZoo Parallel API. Every resource registered in the underlying GridEnv becomes an independent agent.

```python
from powerzoo.envs.grid.trans import TransGridEnv
from powerzoo.wrappers import MARLWrapper

env = MARLWrapper(TransGridEnv(), agent_type='generators')
obs, infos = env.reset(seed=0)
actions = {a: env.action_space(a).sample() for a in env.agents}
obs, rewards, terminations, truncations, infos = env.step(actions)
```
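The dict-keyed shape of the Parallel API can be sketched with a toy two-agent env (the agent names and dynamics below are invented; only the return signatures match the API):

```python
class ToyParallelEnv:
    """Toy two-agent env with the PettingZoo Parallel dict-keyed API shape."""
    agents = ["gen_0", "gen_1"]

    def reset(self, seed=None):
        return ({a: 0.0 for a in self.agents},
                {a: {} for a in self.agents})

    def step(self, actions):
        obs = {a: actions[a] for a in self.agents}          # echo actions back
        rewards = {a: 1.0 for a in self.agents}
        terminations = {a: False for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return obs, rewards, terminations, truncations, infos

env = ToyParallelEnv()
obs, infos = env.reset(seed=0)
actions = {a: 0.5 for a in env.agents}                      # one action per agent
obs, rewards, terminations, truncations, infos = env.step(actions)
print(rewards)  # per-agent reward dict
```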
powerzoo.wrappers.flatten.FlattenWrapper(env, resource_names=None, obs_keys=None, custom_obs_fn=None, custom_action_fn=None)
¶
Bases: Wrapper
Combined wrapper for flattening both observation and action spaces. Intelligently flattens dict spaces based on the controlled resources.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| env | Env | Environment to wrap (should be a PowerEnv instance) | required |
| resource_names | Optional[List[str]] | List of resource names to control (e.g. ['bat0', 'wind_0']); if None, controls all resources | None |
| obs_keys | Optional[List[str]] | List of observation keys to include (e.g. ['grid', 'resources', 'time']); if None, includes all observations | None |
| custom_obs_fn | Optional[Callable] | Optional custom observation flattening function | None |
| custom_action_fn | Optional[Callable] | Optional custom action mapping function | None |
Example:

```python
from powerzoo.envs.power_env import PowerEnv
from powerzoo.wrappers.flatten import FlattenWrapper

env = PowerEnv(config)
env = FlattenWrapper(env, resource_names=['bat0'])

from stable_baselines3 import PPO
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
```
Flattens dict / nested observation_space and action_space into a 1-D Box. Use it with algorithms that require flat vector inputs.
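The flattening idea can be sketched with plain numpy (the function below mirrors the obs_keys parameter for illustration; it is not the wrapper's actual code):

```python
import numpy as np

def flatten_obs(obs_dict, keys=None):
    """Sketch: concatenate selected dict entries into one 1-D vector.
    `keys` mirrors the obs_keys idea; None means include everything."""
    keys = sorted(obs_dict) if keys is None else keys
    return np.concatenate([np.ravel(np.asarray(obs_dict[k], dtype=np.float32))
                           for k in keys])

obs = {"grid": np.array([0.1, 0.2, 0.3]),
       "time": np.array([0.5]),
       "resources": np.array([[1.0, 2.0]])}
flat = flatten_obs(obs, keys=["grid", "time"])
print(flat.shape)  # -> (4,)
```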