
Market Environments

Base Class

powerzoo.envs.market.base.MarketEnv()

Bases: ABC

Base class for market environments, independent of the physical grid.

Defines interfaces for market clearing, settlement and revenue calculation.

Standard call sequence per time step:

1. market.step(bidding): receive bids (overwrites last_bids)
2. market.clear(): run clearing
3. market.settle(): settle trades
4. market.revenue(): compute revenues

step(bidding) abstractmethod

Accept bids for current time step.

clear() abstractmethod

Run market clearing.

settle() abstractmethod

Settle transactions.

revenue() abstractmethod

Compute revenues per participant.
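
The call sequence above can be illustrated with a toy subclass. This is a minimal sketch, not PowerZoo's implementation: MarketEnvSketch is a local stand-in that mirrors the four documented abstract methods, and UniformPriceMarket, the bid format (participant name mapped to a (quantity_mw, price) pair), the fixed demand, and the uniform-price rule are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class MarketEnvSketch(ABC):
    """Local stand-in mirroring the documented MarketEnv interface (assumption)."""

    @abstractmethod
    def step(self, bidding): ...

    @abstractmethod
    def clear(self): ...

    @abstractmethod
    def settle(self): ...

    @abstractmethod
    def revenue(self): ...


class UniformPriceMarket(MarketEnvSketch):
    """Hypothetical single-node market: cheapest offers are accepted first."""

    def __init__(self, demand_mw):
        self.demand_mw = demand_mw
        self.last_bids = {}
        self.accepted = {}
        self.price = 0.0
        self.payments = {}

    def step(self, bidding):
        # Each call overwrites the previous bids, as the docs describe.
        self.last_bids = dict(bidding)

    def clear(self):
        remaining = self.demand_mw
        self.accepted = {}
        self.price = 0.0
        # Merit order: sort offers by price, fill demand from the cheapest.
        offers = sorted(self.last_bids.items(), key=lambda kv: kv[1][1])
        for name, (qty, price) in offers:
            take = min(qty, remaining)
            if take > 0:
                self.accepted[name] = take
                self.price = price  # the last accepted offer sets the price
                remaining -= take

    def settle(self):
        # Uniform pricing: every accepted MW is paid the clearing price.
        self.payments = {n: q * self.price for n, q in self.accepted.items()}

    def revenue(self):
        return dict(self.payments)


market = UniformPriceMarket(demand_mw=120.0)
market.step({"gen_a": (100.0, 20.0), "gen_b": (50.0, 35.0)})
market.clear()
market.settle()
revenues = market.revenue()
```

Keeping the four phases as separate methods lets a caller inspect the cleared quantities and price before any money changes hands, which is the separation the base class interface suggests.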


CostBasedMarketEnv (cost-based dispatch)

Cost-based LMP arbitrage environment. Generators are dispatched by a linear-cost DC-OPF (mc_c @ p); there is no bid–cost separation.

powerzoo.envs.market.cost_based_market.CostBasedMarketEnv(case=None, battery_bus_id=2, battery_capacity_mwh=200.0, battery_power_mw=50.0, lmp_scale=100.0, difficulty=None, normalize_actions=True, **grid_kwargs)

Bases: Env

Battery arbitrage on LMPs produced by the TransGridEnv marginal-cost DC-OPF.

For offer-based clearing, see BidBasedMarketEnv.

Parameters

case : ClearCase, optional
    Power system case. Defaults to Case5.
battery_bus_id : int, optional
    Bus to attach the default battery (default: 2). Set to None to skip auto-creating a battery (attach your own).
battery_capacity_mwh : float
    Battery energy capacity. Default 200 MWh.
battery_power_mw : float
    Battery power rating. Default 50 MW.
lmp_scale : float
    Divide raw LMP values by this factor for normalisation in the obs. Default 100 $/MWh.
difficulty : str or None
    Passed to TransGridEnv. 'easy', 'medium', 'hard'.
**grid_kwargs
    Any remaining kwargs forwarded to TransGridEnv.__init__.

Example:

from powerzoo import CostBasedMarketEnv

env = CostBasedMarketEnv(difficulty='medium')
obs, info = env.reset(seed=42)
while True:
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break

steps_per_day property

reset(*, seed=None, options=None)

Reset grid and battery; return initial observation.

self.grid.reset() cascades to all registered sub-resources, including self._battery: GridEnv.reset() calls each resource's reset(), which restores the battery SOC to initial_soc. No separate battery reset is needed here.
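
The reset cascade can be sketched with stand-in classes. GridSketch, BatterySketch, and register() are hypothetical names used only for illustration; the real GridEnv and battery resource may be structured differently.

```python
class BatterySketch:
    """Hypothetical battery resource that tracks state of charge (SOC)."""

    def __init__(self, initial_soc=0.5):
        self.initial_soc = initial_soc
        self.soc = initial_soc

    def reset(self):
        # Restore SOC to its configured starting value.
        self.soc = self.initial_soc


class GridSketch:
    """Hypothetical grid that owns a list of registered sub-resources."""

    def __init__(self):
        self.resources = []

    def register(self, resource):
        self.resources.append(resource)

    def reset(self):
        # One call on the grid resets every registered resource.
        for resource in self.resources:
            resource.reset()


grid = GridSketch()
battery = BatterySketch(initial_soc=0.5)
grid.register(battery)

battery.soc = 0.9  # pretend an episode charged the battery
grid.reset()       # cascades to battery.reset()
```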

step(action)

Step the market environment.

Parameters:

action : Any, required
    Battery power setpoint. When normalize_actions=True (default), in [-1, 1] where -1 = max charge, 0 = idle, 1 = max discharge. When False, in [-power_mw, power_mw] MW.

Returns:

Standard Gymnasium 5-tuple (obs: ndarray, reward: float, terminated: bool, truncated: bool, info: dict). info includes:

requested_p_mw (float): physical MW requested by the agent after denormalization and rated-power clipping.

realized_p_mw (float): physical MW actually dispatched by the battery after SOC constraints. Differs from requested_p_mw when the battery is near a SOC limit. Useful for diagnosing policy saturation and SOC boundary effects.

Raises:

RuntimeError
    If no battery is attached (battery_bus_id=None). Without a physical battery, the environment cannot settle actions and would compute reward on an unconstrained phantom power value.
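
To make the difference between requested_p_mw and realized_p_mw concrete, the sketch below denormalizes an action, clips it to rated power, and then limits it by the energy available in the battery. It is an illustration under stated assumptions, not the environment's code: the function names, the one-hour step (dt_h), the 0 to 1 SOC bounds, and lossless charging are all assumed.

```python
def requested_power_mw(action, power_mw, normalize_actions=True):
    """Map an agent action to MW (positive = discharge), clipped to rated power."""
    p = action * power_mw if normalize_actions else action
    return max(-power_mw, min(power_mw, p))


def realized_power_mw(p_req, soc, capacity_mwh, dt_h=1.0, soc_min=0.0, soc_max=1.0):
    """Limit the requested MW by the energy the battery can release or absorb."""
    max_discharge = (soc - soc_min) * capacity_mwh / dt_h  # MW until empty
    max_charge = (soc_max - soc) * capacity_mwh / dt_h     # MW until full
    return max(-max_charge, min(max_discharge, p_req))


# Defaults from the signature: 200 MWh capacity, 50 MW rating.
p_req = requested_power_mw(1.0, power_mw=50.0)                 # full discharge
p_mid = realized_power_mw(p_req, soc=0.5, capacity_mwh=200.0)  # plenty of energy
p_low = realized_power_mw(p_req, soc=0.1, capacity_mwh=200.0)  # near empty
```

With ample stored energy the realized value equals the request; near the lower SOC bound the realized value is smaller, which is the gap the two info fields expose.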

render()

close()


BidBasedMarketEnv (piecewise-linear offers)

Competitive market environment with explicit piecewise-linear offer curves. LMP is derived from offer-based dispatch (not true costs), enabling realistic bid–cost separation and strategic bidding research.

powerzoo.envs.market.bid_based_market.BidBasedMarketEnv(case=None, battery_bus_id=2, battery_capacity_mwh=200.0, battery_power_mw=50.0, n_segments=5, markup_std=0.05, lmp_scale=100.0, difficulty=None, normalize_actions=True, skip_grid_opf=True, degradation_cost_per_mwh=0.0, action_smooth_cost=0.0, infeasible_penalty=1000.0, **grid_kwargs)

Bases: Env

Competitive electricity market with piecewise-linear offer curves.

Parameters

case : ClearCase, optional
    Power system case. Defaults to Case5.
battery_bus_id : int, optional
    Bus to attach the default battery (default: 2).
battery_capacity_mwh : float
    Battery energy capacity (default 200 MWh).
battery_power_mw : float
    Battery power rating (default 50 MW).
n_segments : int
    Number of offer-curve segments per generator (default 5).
markup_std : float
    Standard deviation of random markup (fraction) applied to cost-based offer prices each episode. 0 = truthful bidding. Default 0.05 (5 % noise).
lmp_scale : float
    Divide raw LMP values by this for observation normalisation (default 100 $/MWh).
difficulty : str or None
    Passed to TransGridEnv.
normalize_actions : bool
    Whether to normalise the battery action to [-1, 1].
skip_grid_opf : bool
    When True (default for RL training), the underlying grid skips its internal OPF solve and instead uses the market-cleared dispatch computed by solve_piecewise_ed_opf. This eliminates a redundant LP solve per step (the grid's OPF and the market's piecewise ED-OPF would otherwise be solving very similar problems with the same net-load input). Set to False when debugging or when the grid OPF result is needed separately.
degradation_cost_per_mwh : float
    Monetary penalty per MWh of battery throughput, added to the reward as -degradation_cost_per_mwh * |power| * dt_h. Models battery wear; suppresses high-frequency charge/discharge cycles. Default 0 (no degradation penalty).
action_smooth_cost : float
    Penalty coefficient for rapid changes in battery setpoint. Added as -action_smooth_cost * |power - prev_power|. Helps prevent oscillation between charge and discharge. Default 0 (disabled).
infeasible_penalty : float
    Penalty subtracted from reward when market clearing is infeasible (opf success == False). Replaces the hard-coded value. Default 1000.0. Tune together with reward_scale to keep the penalty in a reasonable range relative to the LMP revenue.
**grid_kwargs
    Forwarded to TransGridEnv.__init__.
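
The three reward-shaping parameters can be combined as in the sketch below. The degradation and smoothing terms follow the formulas given in the parameter descriptions; the base revenue term lmp * power * dt_h (positive power = discharge), the function name, and the argument names are assumptions, and the actual environment may scale or order these terms differently.

```python
def shaped_reward(lmp, power_mw, prev_power_mw, dt_h=1.0, opf_success=True,
                  degradation_cost_per_mwh=0.0, action_smooth_cost=0.0,
                  infeasible_penalty=1000.0):
    """Illustrative reward: assumed LMP revenue minus the documented penalties."""
    # Assumed settlement: discharging (power > 0) earns the LMP, charging pays it.
    revenue = lmp * power_mw * dt_h
    # Wear penalty per MWh of throughput, regardless of direction.
    degradation = degradation_cost_per_mwh * abs(power_mw) * dt_h
    # Penalty on the change in setpoint between consecutive steps.
    smoothing = action_smooth_cost * abs(power_mw - prev_power_mw)
    # Flat penalty when market clearing fails.
    penalty = 0.0 if opf_success else infeasible_penalty
    return revenue - degradation - smoothing - penalty


r = shaped_reward(lmp=30.0, power_mw=40.0, prev_power_mw=-10.0,
                  degradation_cost_per_mwh=2.0, action_smooth_cost=0.5)
r_infeasible = shaped_reward(lmp=30.0, power_mw=40.0, prev_power_mw=-10.0,
                             opf_success=False)
```

Comparing the size of each term against typical per-step revenue is one way to check that infeasible_penalty neither dominates the reward nor vanishes next to it.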

steps_per_day property

reset(*, seed=None, options=None)

step(action)

Step the market environment.

Two internal paths:

skip_grid_opf=True (default, recommended for RL training): The underlying grid's internal OPF solve is bypassed. The sequence per step is:

1. Step the battery (update SOC, current_p_mw).
2. Advance the grid's time step counter.
3. Compute net load = gross load − battery injection at its bus.
4. Run piecewise-linear SCED → market dispatch + LMP.
5. Update the grid's cached line/node state directly from the market-cleared OPF result (DC power flow for line flows).
6. Compute LMP-based settlement reward.

This avoids one redundant LP solve per step.
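
Step 3 of the sequence above, the net-load calculation, can be sketched as follows. The dictionary-based bus representation, the function name, and the load values are assumptions for illustration; the battery injection is taken as positive when the battery discharges into the grid.

```python
def net_load_by_bus(gross_load_mw, battery_bus_id, battery_injection_mw):
    """Return per-bus net load after subtracting the battery injection at its bus."""
    net = dict(gross_load_mw)  # copy so the caller's data is not modified
    net[battery_bus_id] = net.get(battery_bus_id, 0.0) - battery_injection_mw
    return net


# Made-up loads in MW, keyed by bus id.
gross = {1: 0.0, 2: 300.0, 3: 300.0, 4: 400.0}

# Discharging 50 MW at bus 2 lowers the load the market must serve there.
net_discharge = net_load_by_bus(gross, battery_bus_id=2, battery_injection_mw=50.0)

# Charging (negative injection) raises it.
net_charge = net_load_by_bus(gross, battery_bus_id=2, battery_injection_mw=-50.0)
```

Only the battery's bus changes; every other bus keeps its gross load, so the market-clearing solve sees the battery purely as a shift in demand at one node.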

skip_grid_opf=False (legacy / debug): The grid runs its own OPF, then the market overrides the LMP with its own piecewise ED-OPF result. This path runs two OPFs per step.

Returns the standard Gymnasium 5-tuple.

render()

close()