Resources¶
A resource is a controllable physical asset attached to a grid. PowerZoo's resource layer (powerzoo/envs/resource/) covers six asset types: battery, electric vehicle (EV), solar, wind, flexible load and data center.
The shared design rule is that a resource is not a standalone RL env. Its step() updates internal state (SOC, queue, temperature, …) but does not return a reward or termination signal. The Gymnasium (obs, reward, terminated, truncated, info) contract plus the CMDP cost vector in info is assembled by PowerEnv plus a Task. To train an RL agent on a single resource, use a task (e.g. battery_arbitrage for a single battery) or write a custom PowerEnv config — never call BatteryEnv directly as a gymnasium.Env.
Vocabulary check. SOC (State Of Charge) — battery fill, 0–1. G2V / V2G — Grid-to-Vehicle / Vehicle-to-Grid. DR (Demand Response) — curtailing or shifting load in response to grid signals. PUE (Power Usage Effectiveness) — total facility power / IT equipment power, lower is better. COP (Coefficient of Performance) — heat removed per unit electrical input.
BatteryEnv — energy storage¶
Battery state of charge evolves as:
subject to SOC_min ≤ SOC ≤ SOC_max and P_min ≤ P_t ≤ P_max.
Defaults: one-way eta_charge = eta_discharge = 0.95 (hence achieved round-trip ≈ 0.9025). The optional eta_roundtrip parameter is a sqrt shorthand: setting eta_roundtrip = 0.9 is equivalent to eta_charge = eta_discharge = sqrt(0.9) ≈ 0.949. The legacy efficiency kwarg is deprecated.
Action space is 1D [P_norm] ∈ [-1, 1], mapped to [-power_mw, +power_mw]. With enable_q_control=True the action becomes 2D [P_norm, Q_norm]. Cost component: cost_clipped_power = |desired − feasible| after SOC / power clipping.
VehicleEnv — electric vehicle (G2V / V2G)¶
Same SOC dynamics as BatteryEnv, plus three EV-specific constraints:
- Availability: the EV can charge or discharge only when parked at home. During commute hours the action is masked to zero.
- Departure SOC: the EV must reach
SOC ≥ SOC_departurebefore leaving. Missing this deadline is a hard violation. - Stochastic schedule: departure / arrival times can vary per episode.
Action: 1D [P_norm] ∈ [-1, 1]. Observation is 9D and includes home / away flag, departure-readiness, time-to-departure and the underlying SOC. Cost contributions: cost_clipped_power (also non-zero when an action is issued while away), plus EV-specific departure and home-availability costs reported through the task adapter.
SolarEnv and WindEnv — renewables¶
Both subclass RenewableEnv. Output is profile-driven (SOLAR_AVAILABLE_MW / WIND_AVAILABLE_MW from the data pipeline) and capped by nameplate capacity_mw. The agent's only control is curtailment: a 1D action [curtail_frac] ∈ [0, 1] reduces output below the available level (1.0 = no curtailment, 0.0 = full curtailment). With enable_q_control=True the action becomes 2D [curtail_frac, Q_norm].
Observation is 4D (or 5D with Q control) and includes the underlying capacity factor, current power and a curtailment-cost penalty. There is no SOC integrator; the only state is the time-series index.
FlexLoad — demand response¶
FlexLoad is the demand-response resource. Each device has two independent levers:
- Curtailment — permanent demand reduction, capped by
curtail_cap_mw. - Demand shifting — defer consumption over a
shift_horizonwindow, capped byshift_cap_mw. Shifted energy enters a buffer that must be repaid within the horizon.
Action is 2D [curtail_mw, shift_out_mw], with three scaling modes (physical, unit, tanh). Observation is 8D — [curtail_norm, shift_out_norm, shift_in_norm, buffer_fill_ratio, buffer_energy_norm, time_sin, time_cos, price_norm]. Cost components:
| Cost field | Unit | Meaning |
|---|---|---|
cost_buffer_overflow |
MWh | Deferred demand exceeding the shift horizon. |
cost_curtailment |
$ | Discomfort / compensation for curtailed energy. |
cost_shift_discomfort |
$ | Holding cost for buffered deferred demand. |
cost_simultaneous |
$ | Complementarity violation (curtailing and shifting at the same step). |
FlexLoad also exposes an LMP injection interface (set_lmp, get_bid) that is SCUC / SCED-compatible — see Markets.
DataCenterEnv — AI data center as a controllable load¶
DataCenterEnv is more elaborate than the other resources. It models:
- GPU-level IT power: per-GPU idle and active draw, queue of training and finetuning jobs, exogenous inference load.
- Cooling: COP-based cooling power that scales with the cooling setpoint and outdoor temperature.
- Thermal dynamics: first-order zone-temperature model with a critical threshold.
- Workload: an EDF (Earliest-Deadline-First) scheduler that consumes the queue.
Action is 3D [r_train, r_finetune, T_cool_setpoint_norm]:
r_train,r_finetune∈[0, 1]— fraction of available GPUs allocated to each workload.T_cool_setpoint_norm∈[0, 1]— normalised cooling thermostat.
Observation is 11D (utilisation, queues, temperature, COP, prices, time). Inference workload is exogenous (diurnal). Cost component: cost_overtemp = max(t_zone − t_critical, 0).
DataCenterEnv is used by two tasks: dc_scheduling (where it sits on a distribution grid as a flexible load) and dc_microgrid (where it sits inside a self-contained DC microgrid).
Cost summary¶
Every cost field exposed by a resource flows automatically into info['cost_resource'] and then into the fixed-order info['constraint_costs'] vector. The legacy alias info['cost_resource_violation'] remains available during the transition. The full table:
| Resource | Cost field | Unit |
|---|---|---|
BatteryEnv |
cost_clipped_power |
MW |
VehicleEnv |
cost_clipped_power (+ EV-specific via adapter) |
MW |
DataCenterEnv |
cost_overtemp |
°C |
FlexLoad |
cost_buffer_overflow |
MWh |
FlexLoad |
cost_curtailment |
$ |
FlexLoad |
cost_shift_discomfort |
$ |
FlexLoad |
cost_simultaneous |
$ |
Any cost_* key in status() is collected automatically; the full convention is documented in Reward and cost split.
Attaching a resource to a grid¶
from powerzoo.envs.grid import TransGridEnv
from powerzoo.envs.resource import BatteryEnv
grid = TransGridEnv()
battery = BatteryEnv(capacity_mwh=50.0, power_mw=20.0, parent=grid, bus_id=2)
print(battery.resource_id) # 'battery_0'
The resource is registered automatically via parent / bus_id. Changing bus_id mid-episode triggers a map rebuild via the property setter. See Architecture · Environment stack §3 for the registration data flow.
See also¶
- Transmission physics, Distribution physics — the grid envs a resource attaches to.
- Markets —
FlexLoadandBatteryEnvin market envs. - Microgrid —
DataCenterEnv+BatteryEnvin a self-contained DC microgrid. - API · Resources — per-class signatures and parameter tables.