Tasks

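PowerZoo provides ready-made benchmark tasks that bundle a grid case, data split, agent design, and evaluation protocol into a single object. Use make_task_env to instantiate any task by name: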

from powerzoo.tasks import make_task_env, list_public_tasks

print(list_public_tasks())
env = make_task_env("marl_opf", split="train")

# Use PettingZoo Parallel API (no RLlib required)
env = make_task_env("marl_opf", split="train", framework="pettingzoo")

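Multi-agent tasks always default to a dedicated task adapter (TaskOPFMultiAgentEnv, TaskUCMultiAgentEnv, TaskEVMultiAgentEnv, and so on). These adapters work whether or not RLlib is installed; when ray[rllib] is installed, the returned object also satisfies the RLlib MultiAgentEnv interface. To get a task-aware PettingZoo Parallel API wrapper on top of the same adapter semantics, pass framework='pettingzoo'.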

The explicit public benchmark surface is available through powerzoo.tasks.public or the helpers it re-exports:

from powerzoo.tasks import PUBLIC_TASKS, list_public_tasks, get_public_task_catalog

print(PUBLIC_TASKS)
print(list_public_tasks())
print(get_public_task_catalog()[0]["task_id"])
print(get_public_task_catalog()[0]["default_episode_horizon_steps"])

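Only tasks that satisfy the benchmark contract stay in PUBLIC_TASKS: documented, registered, instantiable, and passing a smoke test. Tasks that are registered but not yet complete remain accessible through list_tasks() / make_task_env(...), but they are not part of the public benchmark surface.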

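The framework argument selects which multi-agent interface is used:

| Value | Description |
|---|---|
| 'auto' (default) | Dedicated task adapter (also RLlib-compatible when ray is installed) |
| 'pettingzoo' | Task-aware PettingZoo Parallel API wrapped by powerzoo.tasks.interfaces.TaskPettingZooWrapper (lightweight, no RLlib required) |
| 'rllib' | Same as 'auto', but raises an error immediately if ray[rllib] is missing |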
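
The three framework values can be read as a small decision rule. The sketch below is a hypothetical illustration, not PowerZoo's actual implementation: the helper names `rllib_available` and `resolve_framework`, and the string labels they return, are invented here. Only the three `framework` values and their described behavior come from the table.

```python
import importlib.util
from typing import Optional


def rllib_available() -> bool:
    """Return True if the `ray` package can be found in this environment."""
    return importlib.util.find_spec("ray") is not None


def resolve_framework(framework: str = "auto",
                      has_rllib: Optional[bool] = None) -> str:
    """Hypothetical helper mirroring the table; not part of PowerZoo."""
    if has_rllib is None:
        has_rllib = rllib_available()
    if framework == "pettingzoo":
        # Lightweight path: never needs RLlib.
        return "pettingzoo_wrapper"
    if framework == "rllib" and not has_rllib:
        # 'rllib' is strict: fail fast instead of silently degrading.
        raise ImportError("framework='rllib' requires ray[rllib]")
    if framework in ("auto", "rllib"):
        # Same adapter either way; RLlib compatibility depends on ray.
        return "adapter_rllib_compatible" if has_rllib else "adapter"
    raise ValueError(f"unknown framework: {framework!r}")
```

Passing `has_rllib` explicitly makes the rule easy to exercise without installing or uninstalling ray.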

Registered task routing

| Task name | Returned by make_task_env() / create_env() |
|---|---|
| battery_arbitrage | FlattenWrapper around PowerEnv (single-agent Gymnasium) |
| marl_opf | TaskOPFMultiAgentEnv |
| marl_der_arbitrage, marl_ders_benchmark | TaskResourceMultiAgentEnv |
| marl_ev_v2g | TaskEVMultiAgentEnv |
| dc_scheduling | FlattenWrapper around PowerEnv (single-agent Gymnasium) |
| dc_microgrid, dc_microgrid_safe | DCMicrogridEnv (single-agent Gymnasium, self-contained) |
| gencos_bidding | GenCosMARLEnv (PettingZoo Parallel API; competitive 5-agent market) |
| marl_uc | TaskUCMultiAgentEnv |
| opf_118 / opf_118_7d | TaskOPFMultiAgentEnv |
| joint_trans_dist / joint_trans_dist_7d | Experimental only: still registered but not part of PUBLIC_TASKS; instantiation currently fails until the joint adapter / reward path officially lands |

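Public benchmark task cards

get_public_task_catalog() returns stable task card metadata for the current public benchmark surface. Treat it as the authoritative data source when building experiment menus, benchmark summaries, or documentation that must stay in sync with the actual public tasks.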

from powerzoo.tasks import get_public_task_catalog

for card in get_public_task_catalog():
    print(card["task_id"], card["grid_case"], card["default_episode_horizon_steps"])

| Task | Grid | Agent mode | Default observation | Reward / cost contract | Horizon | Frameworks |
|---|---|---|---|---|---|---|
| battery_arbitrage | distribution / Case33bw | single | flattened | Objective-only peak / off-peak arbitrage profit with SOC-target shaping; SOC violations in info['cost'] | 48 | gymnasium |
| marl_opf | transmission / Case5 | multi | global | Shared economic dispatch reward; physical violations in info['cost'] | 48 | auto, rllib, pettingzoo |
| marl_der_arbitrage | distribution / Case33bw | multi | local_plus_forecast | Shared battery arbitrage reward; voltage / SOC violations in info['cost'] | 48 | auto, rllib, pettingzoo |
| marl_ev_v2g | distribution / Case33bw | multi | local_plus_forecast | Shared EV arbitrage and departure-readiness reward; grid / EV violations in info['cost'] | 168 | auto, rllib, pettingzoo |
| dc_scheduling | distribution / Case33bw | single | flattened | Objective-only single-agent energy-SLA-PUE reward; grid and datacenter thermal violations in info['cost_sum'] | 48 | gymnasium |
| dc_microgrid | self-contained DC microgrid | single | flattened | Scalarized r_energy + w_cost·r_cost + w_carbon·r_carbon; vector in info['reward_vector']; SLA / overtemp / power-deficit in info['cost'] | 288 | gymnasium |
| dc_microgrid_safe | self-contained DC microgrid | single | flattened | Same as dc_microgrid, with CMDP cost_threshold = 0.5 | 288 | gymnasium |
| marl_uc | transmission / Case5 | multi | global | Shared UC economic reward; physical violations in info['cost'] | 48 | auto, rllib, pettingzoo |
| opf_118 | transmission / Case118 | multi | global | Shared large-scale economic dispatch reward; physical violations in info['cost'] | 48 | auto, rllib, pettingzoo |
| opf_118_7d | transmission / Case118 | multi | global | Shared large-scale economic dispatch reward; physical violations in info['cost'] | 336 | auto, rllib, pettingzoo |
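
Since each card is a metadata record, an experiment menu can be built by filtering the cards. The sketch below uses a small hand-copied stand-in for the catalog so that it runs without PowerZoo installed. The `task_id`, `grid_case`, and `default_episode_horizon_steps` keys appear in the catalog example; the `frameworks` key, the exact value formats, and the helpers `tasks_for_framework` and `longest_horizon` are assumptions made for illustration.

```python
# Stand-in for get_public_task_catalog(); values copied by hand from the
# task card table. The real card schema may differ (assumption).
CARDS = [
    {"task_id": "battery_arbitrage", "grid_case": "Case33bw",
     "default_episode_horizon_steps": 48, "frameworks": ["gymnasium"]},
    {"task_id": "marl_opf", "grid_case": "Case5",
     "default_episode_horizon_steps": 48,
     "frameworks": ["auto", "rllib", "pettingzoo"]},
    {"task_id": "marl_ev_v2g", "grid_case": "Case33bw",
     "default_episode_horizon_steps": 168,
     "frameworks": ["auto", "rllib", "pettingzoo"]},
    {"task_id": "opf_118_7d", "grid_case": "Case118",
     "default_episode_horizon_steps": 336,
     "frameworks": ["auto", "rllib", "pettingzoo"]},
]


def tasks_for_framework(cards, framework):
    """Return the ids of all tasks that list `framework` as supported."""
    return [c["task_id"] for c in cards if framework in c["frameworks"]]


def longest_horizon(cards):
    """Return the id of the task with the largest default horizon."""
    return max(cards, key=lambda c: c["default_episode_horizon_steps"])["task_id"]


print(tasks_for_framework(CARDS, "pettingzoo"))
print(longest_horizon(CARDS))
```

In real use, `CARDS` would be replaced by the list returned from get_public_task_catalog(), after checking which keys the cards actually expose.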

Internal atomic validation presets live under powerzoo.tasks.atomic but are intentionally excluded from the public benchmark surface.


Simple Tasks

powerzoo.tasks.simple.MARLOPFTask(case='Case5', split='train', start_date=None, end_date=None, delta_t_minutes=30, max_load_ratio=None, max_steps=48, action_mode='score', observation_mode='global', forecast_horizon_steps=4, constraint_tightness='standard', **kwargs)

Bases: MultiAgentTask

Multi-Agent OPF Control Task

Multi-agent economic dispatch on the IEEE 5-bus system. Each generator learns to coordinate its output as an independent agent.

Initialize the MARL OPF task.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| case | str | Grid case ('Case5', 'Case118', etc.) | 'Case5' |
| split | Optional[str] | Data split: 'train', 'val', or 'test'. Sets the date range for episode sampling. Ignored when start_date/end_date are given explicitly. | 'train' |
| start_date | Optional[str] | Explicit start date (overrides split). | None |
| end_date | Optional[str] | Explicit end date (overrides split). | None |
| delta_t_minutes | int | Time step in minutes. | 30 |
| max_load_ratio | float | Maximum load ratio. | None |
| max_steps | int | Max steps per episode (default 48 = 1 day @ 30 min). | 48 |
| action_mode | str | 'score' for softmax allocation, 'direct' for MW. | 'score' |
| observation_mode | str | One of 'global', 'local', 'local_plus_forecast'. | 'global' |
| forecast_horizon_steps | int | Forecast horizon used in local_plus_forecast mode. | 4 |
| **kwargs | | Other override parameters passed to Task base. | {} |

get_scenario_config()

Return scenario configuration for PowerEnv.

get_agents_config()

Return multi-agent configuration.

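Key parameters

| Parameter | Default | Description |
|---|---|---|
| case | 'Case5' | Grid case name |
| split | 'train' | Data split: 'train', 'val', or 'test' |
| action_mode | 'score' | 'score' (softmax allocation) or 'direct' (MW output given directly) |
| max_load_ratio | 0.9 | Maximum load as a fraction of total generation capacity |
| max_steps | 48 | Steps per episode (48 = 1 day at 30-minute resolution) |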

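Agent design

  • Action: score ∈ [0, 1]; a softmax over the scores allocates the net load across generators
  • Observation: global features (total load, line flows, time) + local features (unit index, p_min, p_max, cost coefficients)
  • Reward: −(generation cost) / 1000 (shared, cooperative)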
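
The score action and shared reward can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the environment's actual code: the helper names are hypothetical, the cost model is a simple linear cost per MW, and generator limits (p_min, p_max) are ignored for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-generator scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allocate_net_load(scores, net_load_mw):
    """Split the net load across generators in proportion to softmax(scores).

    Assumption: no clipping to p_min / p_max is applied here.
    """
    return [w * net_load_mw for w in softmax(scores)]


def shared_reward(dispatch_mw, cost_per_mw):
    """Shared cooperative reward: -(generation cost) / 1000.

    Assumption: linear cost, one coefficient per generator.
    """
    cost = sum(p * c for p, c in zip(dispatch_mw, cost_per_mw))
    return -cost / 1000.0


dispatch = allocate_net_load([0.1, 0.9, 0.5], net_load_mw=300.0)
print(dispatch, shared_reward(dispatch, [10.0, 20.0, 15.0]))
```

Because the softmax weights sum to one, the allocated outputs always add up to the net load, which is why a score in [0, 1] per agent is enough to define a full dispatch.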

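Data splits (non-overlapping and fixed so that the benchmark stays reproducible)

| Split | Date range |
|---|---|
| train | 2023-07-05 to 2024-12-31 |
| val | 2025-01-01 to 2025-06-30 |
| test | 2025-07-01 to 2025-12-15 |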
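
The split boundaries can be checked with a few lines of standard-library Python. The sketch below copies the date ranges from the table; the names `SPLITS`, `split_for`, and `splits_overlap` are hypothetical, and treating both end dates as inclusive is an assumption.

```python
from datetime import date

# Date ranges copied from the split table; end dates assumed inclusive.
SPLITS = {
    "train": (date(2023, 7, 5), date(2024, 12, 31)),
    "val": (date(2025, 1, 1), date(2025, 6, 30)),
    "test": (date(2025, 7, 1), date(2025, 12, 15)),
}


def split_for(day):
    """Return the split name containing `day`, or None if it is outside all."""
    for name, (start, end) in SPLITS.items():
        if start <= day <= end:
            return name
    return None


def splits_overlap():
    """Return True if any two consecutive date ranges share a day."""
    ranges = sorted(SPLITS.values())
    return any(prev_end >= next_start
               for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]))


print(split_for(date(2025, 3, 14)), splits_overlap())
```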

Middle Tasks

powerzoo.tasks.middle.MARLUCTask(case='Case5', split='train', start_date=None, end_date=None, delta_t_minutes=30, max_load_ratio=None, max_steps=48, observation_mode='global', forecast_horizon_steps=4, constraint_tightness='standard', **kwargs)

Bases: MultiAgentTask

Multi-Agent Unit Commitment Task on the IEEE 5-bus system.

Initialise the MARL UC task.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| case | str | Grid case. Default 'Case5'. | 'Case5' |
| split | Optional[str] | Data split ('train', 'val', 'test'). | 'train' |
| start_date | Optional[str] | Explicit start date (overrides split). | None |
| end_date | Optional[str] | Explicit end date (overrides split). | None |
| delta_t_minutes | int | Time step in minutes. Default 30. | 30 |
| max_load_ratio | float | Max load as fraction of capacity. Defaults to tightness preset (0.9 for standard). | None |
| max_steps | int | Steps per episode. Default 48. | 48 |
| observation_mode | str | One of 'global', 'local', 'local_plus_forecast'. | 'global' |
| forecast_horizon_steps | int | Forecast horizon used in local_plus_forecast mode. | 4 |
| constraint_tightness | str | One of 'loose', 'standard', 'strict'. | 'standard' |
| **kwargs | | Passed to Task base. | {} |

get_scenario_config()

get_agents_config()

create_env()

Create UC-specific multi-agent env.

Extends MARLOPFTask with unit commitment decisions. Each generator agent decides both how much to produce and whether to be online.

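Key parameters

| Parameter | Default | Description |
|---|---|---|
| case | 'Case5' | Grid case name |
| split | 'train' | Data split |
| max_load_ratio | 0.9 | Maximum load ratio |
| max_steps | 48 | Steps per episode |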

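UC defaults (used when the case.units columns do not provide a value)

| Parameter | Default | Unit |
|---|---|---|
| startup_cost | 500 | $/start |
| shutdown_cost | 200 | $/stop |
| ramp_rate | 999 | MW/step |
| min_up_time | 1 | steps |
| min_down_time | 1 | steps |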

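Agent design

  • Action: [score, on_off], a 2-element vector; the unit is committed when on_off ≥ 0.5
  • Observation: global + local + commitment vector (current on/off status of all units)
  • Reward: −(generation cost + startup cost + shutdown cost) / 1000 (economic objective only)
  • Cost signal: physical constraint violations → info['cost'] (CMDP separation)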
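
The commitment threshold and the economic reward can be illustrated as follows. This is a minimal sketch under stated assumptions rather than the environment's implementation: the helper names are hypothetical, the startup and shutdown costs are the documented defaults, the generation cost is passed in as a precomputed number, and ramp-rate and minimum up/down time constraints are not modeled.

```python
STARTUP_COST = 500.0   # $/start, documented default
SHUTDOWN_COST = 200.0  # $/stop, documented default


def decode_commitment(on_off):
    """A unit is committed (online) when its on_off action is >= 0.5."""
    return on_off >= 0.5


def uc_reward(generation_cost, prev_on, actions):
    """Return (reward, new commitment vector) for one step.

    `actions` holds one [score, on_off] pair per generator and `prev_on`
    holds the previous on/off state of each unit. Constraint violations
    are not part of the reward; they would be reported via info['cost'].
    """
    now_on = [decode_commitment(a[1]) for a in actions]
    starts = sum(1 for p, n in zip(prev_on, now_on) if n and not p)
    stops = sum(1 for p, n in zip(prev_on, now_on) if p and not n)
    total = generation_cost + starts * STARTUP_COST + stops * SHUTDOWN_COST
    return -total / 1000.0, now_on


reward, commitment = uc_reward(
    generation_cost=1300.0,
    prev_on=[True, False, True],
    actions=[[0.2, 0.9], [0.5, 0.5], [0.3, 0.1]],
)
print(reward, commitment)
```

Here one unit starts and one stops, so the step adds 500 + 200 to the 1300 generation cost before scaling by 1/1000.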

Complex Tasks

powerzoo.tasks.complex.OPF118Task(split='train', start_date=None, end_date=None, delta_t_minutes=30, max_load_ratio=None, max_steps=48, action_mode='score', observation_mode='global', forecast_horizon_steps=4, constraint_tightness='standard', **kwargs)

Bases: MultiAgentTask

Multi-Agent OPF Control on the IEEE 118-bus system.

A complex-difficulty benchmark for large-scale cooperative dispatch.

Initialize the 118-bus OPF task.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| split | Optional[str] | Data split: 'train', 'val', or 'test'. | 'train' |
| start_date | Optional[str] | Explicit start date (overrides split). | None |
| end_date | Optional[str] | Explicit end date (overrides split). | None |
| delta_t_minutes | int | Time step in minutes. Default 30. | 30 |
| max_load_ratio | float | Maximum load as fraction of total capacity. Defaults to tightness preset (0.85 for standard). | None |
| max_steps | int | Max steps per episode. Default 48 (1 day). | 48 |
| action_mode | str | 'score' (softmax allocation) or 'direct' (MW). | 'score' |
| constraint_tightness | str | One of 'loose', 'standard', 'strict'. | 'standard' |
| **kwargs | | Passed to Task base. | {} |

get_scenario_config()

get_agents_config()

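Large-scale cooperative OPF on the IEEE 118-bus system: 54 generators and 186 transmission lines.

Key parameters

| Parameter | Default | Description |
|---|---|---|
| split | 'train' | Data split |
| max_load_ratio | 0.85 | Maximum load ratio (slightly lower than for the 5-bus case because the system is larger) |
| max_steps | 48 | Steps per episode |
| action_mode | 'score' | 'score' or 'direct' |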

Agent design

  • 54 cooperative agents (one per generator)
  • Uses the same score action / OPF observation protocol as MARLOPFTask
  • Shared cooperative reward

OPF118Task7Days

The 7-day (336-step) variant of OPF118Task. All parameters are inherited from OPF118Task; max_steps defaults to 336.

env = make_task_env("opf_118_7d", split="train")
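
The step counts quoted for the 1-day and 7-day variants follow directly from the 30-minute time step. The helper below is a hypothetical convenience for checking that arithmetic; it is not part of PowerZoo.

```python
def steps_per_episode(days, delta_t_minutes=30):
    """Number of environment steps covering `days` at the given resolution."""
    minutes = days * 24 * 60
    if minutes % delta_t_minutes != 0:
        raise ValueError("episode length is not a whole number of steps")
    return minutes // delta_t_minutes


print(steps_per_episode(1))  # 1 day at 30 min
print(steps_per_episode(7))  # 7 days at 30 min
```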