Market Environments
Base Class
powerzoo.envs.market.base.MarketEnv()
Bases: ABC
Market environment base class, independent of the physical grid.
Defines the interfaces for market clearing, settlement, and revenue calculation.
Standard call sequence per time step::
1. market.step(bidding) — receive bids (overwrites ``last_bids``)
2. market.clear() — run clearing
3. market.settle() — settle trades
4. market.revenue() — compute revenues
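The call sequence above can be sketched with a toy single-bus market. This is a minimal, self-contained illustration: `SketchMarket`, `ToyMarket`, the uniform-price clearing rule, and every method signature here are assumptions made for the example, not the actual `MarketEnv` interface.

```python
from abc import ABC, abstractmethod


class SketchMarket(ABC):
    """Hypothetical stand-in for MarketEnv; the real signatures may differ."""

    def __init__(self):
        self.last_bids = None

    def step(self, bidding):
        # Receiving new bids overwrites the previous ones.
        self.last_bids = list(bidding)

    @abstractmethod
    def clear(self): ...

    @abstractmethod
    def settle(self): ...

    @abstractmethod
    def revenue(self): ...


class ToyMarket(SketchMarket):
    """Single-bus market: accept the cheapest offers until demand is met."""

    def __init__(self, demand_mw):
        super().__init__()
        self.demand_mw = demand_mw
        self.price = 0.0
        self.accepted = {}
        self.payments = {}

    def clear(self):
        remaining = self.demand_mw
        self.accepted = {}
        # Bids are (name, price in $/MWh, quantity in MW), served by price.
        for name, price, qty in sorted(self.last_bids, key=lambda b: b[1]):
            take = min(qty, remaining)
            if take <= 0:
                break
            self.accepted[name] = take
            self.price = price  # last accepted offer sets the uniform price
            remaining -= take

    def settle(self):
        self.payments = {n: q * self.price for n, q in self.accepted.items()}

    def revenue(self):
        return dict(self.payments)


market = ToyMarket(demand_mw=120.0)
market.step([("g1", 20.0, 100.0), ("g2", 35.0, 50.0), ("g3", 50.0, 80.0)])
market.clear()
market.settle()
revenues = market.revenue()
```

Keeping the four phases as separate methods means a caller can inspect the cleared quantities before settlement, which is why the sketch stores intermediate results on the instance.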
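CostBasedMarketEnv (Cost-Based Dispatch)
Cost-based LMP arbitrage environment. Generators are dispatched by a linear-cost DC-OPF (mc_c @ p); there is no separation between offers and costs.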
powerzoo.envs.market.cost_based_market.CostBasedMarketEnv(case=None, battery_bus_id=2, battery_capacity_mwh=200.0, battery_power_mw=50.0, lmp_scale=100.0, difficulty=None, normalize_actions=True, **grid_kwargs)
Bases: Env
Battery arbitrage on LMPs from TransGridEnv marginal-cost DC-OPF.
For offer-based clearing, see BidBasedMarketEnv.
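For orientation, a linear-cost DC-OPF of the kind referred to by `mc_c @ p` is usually written as the linear program below. This is the textbook form, stated as an assumption: only the cost vector (`mc_c`, written $c$) and the dispatch $p$ come from this page, while $d$ (bus loads), $M$ (generator-to-bus incidence), $H$ (PTDF matrix), $f^{\max}$ (line limits) and $C^{*}$ (optimal cost) are notation introduced here. The exact constraint set used by `TransGridEnv` may differ.

```latex
\begin{aligned}
\min_{p}\quad & c^{\top} p \\
\text{s.t.}\quad & \mathbf{1}^{\top} p = \mathbf{1}^{\top} d, \\
& -f^{\max} \le H\,(M p - d) \le f^{\max}, \\
& p^{\min} \le p \le p^{\max},
\end{aligned}
\qquad
\mathrm{LMP}_i = \frac{\partial C^{*}}{\partial d_i}
```

Read this way, the LMP at bus $i$ is the marginal cost of serving one more MW of load at that bus, which is what the battery buys or sells against.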
Parameters
case : ClearCase, optional
Power system case. Defaults to Case5.
battery_bus_id : int, optional
Bus to attach the default battery (default: 2).
Set to None to skip auto-creating a battery (attach your own).
battery_capacity_mwh : float
Battery energy capacity. Default 200 MWh.
battery_power_mw : float
Battery power rating. Default 50 MW.
lmp_scale : float
Raw LMP values are divided by this factor to normalise them in the observation.
Default 100 $/MWh.
difficulty : str or None
Passed to TransGridEnv. One of 'easy', 'medium', or 'hard'.
**grid_kwargs :
Any remaining kwargs forwarded to TransGridEnv.__init__.
Example::

    from powerzoo import CostBasedMarketEnv
    env = CostBasedMarketEnv(difficulty='medium')
    obs, info = env.reset(seed=42)
    while True:
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            break
steps_per_day (property)
reset(*, seed=None, options=None)
Reset the grid and battery; return the initial observation.
self.grid.reset() cascades to all registered sub-resources
(including self._battery): GridEnv.reset() calls each resource's
reset(), which restores the battery SOC to initial_soc.
No separate battery reset is required here.
step(action)
Step the market environment.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| action | Any | Battery power setpoint. When … | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Standard Gymnasium 5-tuple. |
| float | |
| bool | |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If no battery is attached (…) |
render()
close()
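BidBasedMarketEnv (Piecewise-Linear Offers)
Competitive market environment with explicit piecewise-linear offer curves. LMPs are derived from offer-based dispatch rather than from true costs, which allows genuine offer-cost separation and the study of strategic bidding.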
powerzoo.envs.market.bid_based_market.BidBasedMarketEnv(case=None, battery_bus_id=2, battery_capacity_mwh=200.0, battery_power_mw=50.0, n_segments=5, markup_std=0.05, lmp_scale=100.0, difficulty=None, normalize_actions=True, skip_grid_opf=True, degradation_cost_per_mwh=0.0, action_smooth_cost=0.0, infeasible_penalty=1000.0, **grid_kwargs)
Bases: Env
Competitive electricity market with piecewise-linear offer curves.
Parameters
case : ClearCase, optional
Power system case. Defaults to Case5.
battery_bus_id : int, optional
Bus to attach the default battery (default: 2).
battery_capacity_mwh : float
Battery energy capacity (default 200 MWh).
battery_power_mw : float
Battery power rating (default 50 MW).
n_segments : int
Number of offer-curve segments per generator (default 5).
markup_std : float
Standard deviation of random markup (fraction) applied to
cost-based offer prices each episode. 0 = truthful bidding.
Default 0.05 (5 % noise).
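One way to picture `n_segments` and `markup_std` together: split a generator's capacity into equal blocks, price the blocks around its marginal cost, and scale the prices by a random markup. The helper `build_offer_curve`, the equal block sizes, the 90 % to 110 % price ramp, and the choice of a single multiplicative markup per generator are all illustrative assumptions; how the environment actually builds its curves is not described on this page.

```python
import random


def build_offer_curve(marginal_cost, p_max, n_segments=5, markup_std=0.05, rng=None):
    """Return a list of (block_mw, offer_price) pairs for one generator."""
    rng = rng or random.Random()
    # Assumption: one multiplicative markup per generator, drawn per episode.
    markup = 1.0 + rng.gauss(0.0, markup_std) if markup_std > 0 else 1.0
    block_mw = p_max / n_segments
    curve = []
    for k in range(n_segments):
        # Assumption: base prices ramp from 90 % to 110 % of marginal cost,
        # so the offer curve is non-decreasing.
        ramp = 0.9 + 0.2 * k / max(n_segments - 1, 1)
        curve.append((block_mw, marginal_cost * ramp * markup))
    return curve


# markup_std=0 reproduces the cost-based (truthful) curve.
truthful = build_offer_curve(20.0, 100.0, markup_std=0.0)
# A seeded generator makes the noisy curve reproducible.
noisy = build_offer_curve(20.0, 100.0, markup_std=0.05, rng=random.Random(0))
```

With a shared markup, the noisy curve is a scaled copy of the truthful one, so the ordering of blocks within a generator is preserved.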
lmp_scale : float
Divide raw LMP values by this for observation normalisation
(default 100 $/MWh).
difficulty : str or None
Passed to TransGridEnv.
normalize_actions : bool
Whether to normalise the battery action to [-1, 1].
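As a rough illustration of action normalisation, the sketch below maps a value in [-1, 1] to a power setpoint in MW. It assumes linear scaling by `battery_power_mw` and clipping of out-of-range inputs; `denormalize_action` is a hypothetical helper, and the sign convention (charge versus discharge) is not specified on this page.

```python
def denormalize_action(action, battery_power_mw=50.0):
    """Map a normalised action in [-1, 1] to a power setpoint in MW."""
    # Out-of-range inputs are clipped before scaling (an assumption).
    clipped = max(-1.0, min(1.0, float(action)))
    return clipped * battery_power_mw


setpoints = [denormalize_action(a) for a in (-1.0, 0.5, 2.0)]
```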
skip_grid_opf : bool
When True (the default, intended for RL training), the underlying grid
skips its internal OPF solve and uses the market-cleared dispatch
computed by solve_piecewise_ed_opf instead. This eliminates a redundant
LP solve per step: the grid's OPF and the market's piecewise ED-OPF
would otherwise solve very similar problems with the same net-load
input. Set to False when debugging or when the grid OPF result is
needed separately.
degradation_cost_per_mwh : float
Monetary penalty per MWh of battery throughput, added to the
reward as -degradation_cost_per_mwh * |power| * dt_h.
Models battery wear; suppresses high-frequency charge/discharge
cycles. Default 0 (no degradation penalty).
action_smooth_cost : float
Penalty coefficient for rapid changes in battery setpoint.
Added as -action_smooth_cost * |power - prev_power|.
Helps prevent oscillation between charge and discharge.
Default 0 (disabled).
infeasible_penalty : float
Penalty subtracted from the reward when market clearing is infeasible
(OPF success == False). Replaces the previously hard-coded value.
Default 1000.0. Tune together with reward_scale so that the penalty
stays in a reasonable range relative to the LMP revenue.
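The three penalty terms described above can be combined as in the following sketch. The degradation and smoothing formulas are taken from the parameter descriptions; the revenue term `lmp * power_mw * dt_h`, the convention that positive power means discharging, and the function name `shaped_reward` are assumptions for illustration.

```python
def shaped_reward(lmp, power_mw, prev_power_mw, dt_h, opf_success,
                  degradation_cost_per_mwh=0.0, action_smooth_cost=0.0,
                  infeasible_penalty=1000.0):
    """Combine LMP revenue with the three penalty terms.

    Assumption: power_mw > 0 means discharging (selling at the LMP).
    """
    revenue = lmp * power_mw * dt_h
    # -degradation_cost_per_mwh * |power| * dt_h
    degradation = degradation_cost_per_mwh * abs(power_mw) * dt_h
    # -action_smooth_cost * |power - prev_power|
    smoothing = action_smooth_cost * abs(power_mw - prev_power_mw)
    # Applied only when market clearing fails.
    penalty = 0.0 if opf_success else infeasible_penalty
    return revenue - degradation - smoothing - penalty


feasible = shaped_reward(lmp=40.0, power_mw=50.0, prev_power_mw=-50.0,
                         dt_h=0.25, opf_success=True,
                         degradation_cost_per_mwh=2.0, action_smooth_cost=0.1)
infeasible = shaped_reward(lmp=40.0, power_mw=50.0, prev_power_mw=-50.0,
                           dt_h=0.25, opf_success=False,
                           degradation_cost_per_mwh=2.0, action_smooth_cost=0.1)
```

Comparing the two calls shows why the penalty needs tuning: with the default of 1000.0 it is larger than the entire revenue of this step.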
**grid_kwargs
Forwarded to TransGridEnv.__init__.
steps_per_day (property)
reset(*, seed=None, options=None)
step(action)
Step the market environment.
Two internal paths:

skip_grid_opf=True (default, recommended for RL training):
    The underlying grid's internal OPF solve is bypassed.
    The sequence per step is:
    1. Step the battery (update SOC, current_p_mw).
    2. Advance the grid's time step counter.
    3. Compute net load = gross load − battery injection at its bus.
    4. Run piecewise-linear SCED → market dispatch + LMP.
    5. Update the grid's cached line/node state directly from the
       market-cleared OPF result (DC power flow for line flows).
    6. Compute the LMP-based settlement reward.
    This avoids one redundant LP solve per step.

skip_grid_opf=False (legacy / debug):
    The grid runs its own OPF, then the market overrides the LMP
    with its own piecewise ED-OPF result. This runs two OPFs per step.

Returns the standard Gymnasium 5-tuple.
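The skip_grid_opf=True sequence can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: `step_once`, the `state` dictionary, and the caller-supplied `clear_fn` are hypothetical; positive power is taken to mean discharging; battery efficiency is ignored; and step 5 (updating cached line and node state) is omitted because it depends on grid internals.

```python
def step_once(state, power_mw, gross_load_mw, clear_fn, dt_h=1.0):
    """One simplified step: battery, clock, net load, clearing, reward.

    Assumption: power_mw > 0 discharges the battery into the grid.
    """
    # 1. Step the battery: clip so the SOC stays within [0, capacity].
    max_discharge = state["soc_mwh"] / dt_h
    max_charge = (state["capacity_mwh"] - state["soc_mwh"]) / dt_h
    power_mw = max(-max_charge, min(max_discharge, power_mw))
    state["soc_mwh"] -= power_mw * dt_h
    # 2. Advance the time-step counter.
    state["t"] += 1
    # 3. Net load = gross load minus battery injection at its bus.
    net = dict(gross_load_mw)
    net[state["battery_bus"]] -= power_mw
    # 4. Market clearing (supplied by the caller) returns an LMP per bus.
    lmp = clear_fn(net)
    # 6. LMP-based settlement at the battery bus.
    reward = lmp[state["battery_bus"]] * power_mw * dt_h
    return net, reward


def toy_clearing(net):
    """Stand-in for the market solve: price rises with total net load."""
    total = sum(net.values())
    return {bus: 20.0 + total / 20.0 for bus in net}


state = {"soc_mwh": 100.0, "capacity_mwh": 200.0, "battery_bus": 2, "t": 0}
gross = {1: 100.0, 2: 150.0, 3: 80.0}  # MW per bus, hypothetical
net, reward = step_once(state, 30.0, gross, toy_clearing)
```

Passing the clearing routine in as a callable mirrors the point of this path: the market solve is the only optimisation per step, and everything else is bookkeeping around it.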