Markets¶

PowerZoo's market layer wraps TransGridEnv with a clearing / settlement loop and exposes Locational Marginal Prices (LMPs) as the main agent-facing signal. Three envs cover three different research questions:

| Env / task | Who decides the offer? | What the LMP reflects | Typical research target |
| --- | --- | --- | --- |
| `CostBasedMarketEnv` | nobody (flat true marginal cost `mc_c · p`) | true system marginal cost | clean DER arbitrage on uncongested LMPs |
| `BidBasedMarketEnv` | static piecewise offers (cost + optional markup, frozen per episode) | offer-based dispatch (decoupled from cost) | DER arbitrage with realistic LMPs |
| `GenCosMARLEnv` (`gencos_bidding`) | each agent, every step, via a 3-segment markup vector | offer-based dispatch under MARL bidding | strategic bidding, market power, learned price-making |

Vocabulary check. LMP (Locational Marginal Price) is the dual variable of the nodal power-balance constraint: the amount by which total system cost would rise if 1 extra MW of demand appeared at that bus. With no congested line (and losses ignored), every bus has the same LMP, equal to the system marginal cost; under congestion, LMPs diverge and are typically higher on the import-constrained side of the binding line. SCED (Security-Constrained Economic Dispatch) is an OPF whose LP objective is built from the submitted offers.
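To make the LMP definition concrete, here is a minimal sketch that does not use PowerZoo at all. It prices a hypothetical two-bus system by finite differences: the LMP at a bus is approximated as the cost increase caused by 1 extra MW of demand there. The generator costs, the line limit and the function names are illustrative assumptions, not values from the library.

```python
# Hypothetical two-bus system (all numbers are illustrative, not PowerZoo data).
# Cheap generator A sits at bus 1, expensive generator B at bus 2,
# and a single line carries at most LINE_LIMIT MW from bus 1 to bus 2.
MC_A, MC_B = 20.0, 50.0   # marginal costs in $/MWh
CAP_A = 100.0             # capacity of generator A in MW
LINE_LIMIT = 30.0         # line limit in MW


def dispatch_cost(load1, load2):
    """Least-cost hourly dispatch cost for loads (MW) at bus 1 and bus 2.

    Assumes load1 <= CAP_A and that B is large enough to cover the rest.
    """
    # A serves bus 1 first, then exports as much as line and capacity allow.
    export = min(load2, LINE_LIMIT, CAP_A - load1)
    gen_a = load1 + export
    gen_b = load2 - export
    return MC_A * gen_a + MC_B * gen_b


def lmp(load1, load2, bus):
    """Finite-difference LMP: cost increase from 1 extra MW at `bus`."""
    base = dispatch_cost(load1, load2)
    if bus == 1:
        return dispatch_cost(load1 + 1.0, load2) - base
    return dispatch_cost(load1, load2 + 1.0) - base


# Uncongested: 20 MW at bus 2 fits on the line, so both buses see A's cost.
uncongested = (lmp(0.0, 20.0, 1), lmp(0.0, 20.0, 2))
# Congested: 50 MW at bus 2 exceeds the line limit, so bus 2 is priced by B.
congested = (lmp(0.0, 50.0, 1), lmp(0.0, 50.0, 2))
print(uncongested, congested)  # (20.0, 20.0) (20.0, 50.0)
```

In an actual SCED the same numbers come out of the LP as duals of the nodal balance constraints; the finite difference is only a way to see what those duals mean.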

The shared dispatch loop is the same across all three:

```mermaid
flowchart LR
    O[Offer / cost curves] --> S[SCED LP\nclear net injections]
    S --> L[LMP\nnodal duals]
    L --> R["Settlement\nrevenue = LMP × P × Δt"]
    R --> A["Agent profit\n= revenue − true cost"]
```

What changes between the three envs is only who supplies the offer curve and how often. The grid solve, LMP computation and settlement formulas are shared.
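The settlement step of the loop can be sketched in a few lines. This is an illustration of the two formulas in the diagram (revenue = LMP × P × Δt, profit = revenue − true cost), not PowerZoo's implementation. The function name `settle`, the flat marginal cost used for the true cost, and the sign convention (positive P means selling) are assumptions.

```python
def settle(lmp, p_mw, dt_h, marginal_cost=0.0):
    """Return (revenue, profit) for one settlement interval.

    lmp            nodal price in $/MWh
    p_mw           net injection in MW (assumed positive = selling)
    dt_h           interval length in hours
    marginal_cost  true marginal cost in $/MWh (0 for a battery)
    """
    revenue = lmp * p_mw * dt_h
    true_cost = marginal_cost * p_mw * dt_h
    return revenue, revenue - true_cost


# A generator selling 10 MW for 30 minutes at 40 $/MWh, true cost 25 $/MWh.
revenue, profit = settle(lmp=40.0, p_mw=10.0, dt_h=0.5, marginal_cost=25.0)
print(revenue, profit)  # 200.0 75.0
```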

`CostBasedMarketEnv` — clean LMP arbitrage¶

This is the simplest market. Each generator has a constant marginal cost, so its cost curve is linear, \(C_i(P) = mc_{c,i} \cdot P\); no offers are submitted, and the LMP is the dual of the resulting cost-based DC-OPF. A battery is attached at a chosen bus, and the agent decides its power setpoint at each step.

Action. Box(1) battery setpoint in [-power_mw, +power_mw].
Observation. [soc, lmp_norm, time_sin, time_cos, total_demand_norm].
Reward. LMP × P_net × Δt. Safety violations are not folded into the reward; they are reported separately in info['cost_*'].

```python
from powerzoo import CostBasedMarketEnv

env = CostBasedMarketEnv(difficulty='medium')
obs, info = env.reset(seed=42)
# Take one random battery action and read back the settlement reward.
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```

Use this env to study temporal arbitrage on a clean price signal: the LMP equals the system's true marginal value of energy at each bus.
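As a starting point for such a study, a rule-based baseline can be written directly against the documented observation layout. The sketch below is a hypothetical threshold policy, not something shipped with PowerZoo. It assumes that `obs[0]` is the state of charge and `obs[1]` the normalized LMP, that both lie in [0, 1], and that a positive setpoint means discharging; verify these against the env before relying on it.

```python
def threshold_policy(obs, power_mw, low=0.3, high=0.7):
    """Map an observation to a battery setpoint in [-power_mw, +power_mw].

    Assumes obs follows the documented order
    [soc, lmp_norm, time_sin, time_cos, total_demand_norm], that soc and
    lmp_norm lie in [0, 1], and that a positive setpoint means discharging.
    """
    soc, lmp_norm = obs[0], obs[1]
    if lmp_norm >= high and soc > 0.0:
        return [power_mw]       # price is high: sell stored energy
    if lmp_norm <= low and soc < 1.0:
        return [-power_mw]      # price is low: buy energy
    return [0.0]                # otherwise stay idle


# Example observations (made-up values in the documented order).
print(threshold_policy([0.5, 0.9, 0.0, 1.0, 0.5], power_mw=2.0))  # [2.0]
print(threshold_policy([0.5, 0.1, 0.0, 1.0, 0.5], power_mw=2.0))  # [-2.0]
```

If the env expects a NumPy array as its action, wrap the returned list accordingly before passing it to `env.step`.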

`BidBasedMarketEnv` — realistic LMP, static offers¶

BidBasedMarketEnv adds piecewise-linear offer curves. By default, offers are derived from true costs with an optional random markup; you can also supply them externally. Offers are frozen for the duration of an episode: generated once at reset() and kept until the next reset(). The market then clears via a network-constrained SCED on those offers, and LMPs come from the LP duals of the offer-based dispatch.
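The "cost plus optional random markup, frozen per episode" idea can be sketched as follows. This is not the library's offer generator: `make_offer_curve` is a hypothetical helper, the equal-sized segments and the uniform markup distribution are assumptions, and the curve is expressed as price steps (which corresponds to a piecewise-linear cost).

```python
import random


def make_offer_curve(mc, capacity_mw, max_markup=0.2, n_segments=3, rng=None):
    """Build a monotone step offer curve as a list of (quantity_mw, price).

    Hypothetical helper: splits capacity into equal segments and prices each
    at true marginal cost times (1 + markup), markup ~ U[0, max_markup].
    """
    if rng is None:
        rng = random.Random()
    seg_mw = capacity_mw / n_segments
    # Sort the prices so that later segments are never cheaper.
    prices = sorted(mc * (1.0 + rng.uniform(0.0, max_markup))
                    for _ in range(n_segments))
    return [(seg_mw, price) for price in prices]


# Offers are drawn once (as at reset()) and then reused for every step.
rng = random.Random(42)
frozen_offers = {
    "gen_a": make_offer_curve(mc=20.0, capacity_mw=90.0, rng=rng),
    "gen_b": make_offer_curve(mc=50.0, capacity_mw=60.0, rng=rng),
}
```

With `max_markup=0.0` every segment is priced at true marginal cost, which recovers the cost-based case.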

The battery here is a prosumer: it does not submit offers, but its net injection enters the SCED as a nodal load offset and therefore does influence the LMP. Discharging at a bus reduces local net load and can lower the LMP there; charging does the opposite.

Action. Same as cost-based — battery setpoint.
Observation. [soc, lmp_norm, time_sin, time_cos, demand_norm, mean_offer_price_norm].
Reward. LMP × P_net × Δt. The reward is pure settlement revenue, because the battery has no "true cost" to subtract.

Use this env when you want a more realistic LMP series for DER arbitrage research — one that decouples LMP from true cost without yet introducing strategic bidding.

`GenCosMARLEnv` — strategic bidding (`gencos_bidding`)¶

GenCosMARLEnv is the only market env where multiple independent agents submit offers every step. There is one agent per generator on Case5 (5 agents), and each one outputs a Box(3) markup vector that is sorted into a 3-segment monotone offer curve. The market clears with solve_piecewise_ed_opf; ramp constraints couple consecutive steps so dispatch decisions cannot be reset between rounds.

Action. Box(3) markup vector per agent, each entry in [-1, 1]. The entries are sorted so that the resulting offer curve is monotone.
Observation. 12-D private vector — own cost / capacity / last dispatch / last profit / ramp headroom, demand forecast, time, and a 4-step LMP history.
Reward. Per-agent dispatch profit LMP[node_i] · P_i · Δt - TC_i(P_i) · Δt.
Episode. 48 steps × 30 min (rolling market). The dispatch at step t, together with the ramp limits, sets the feasible range [p_min_rt, p_max_rt] at step t+1.
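Two mechanics from the list above can be illustrated in isolation: sorting a markup vector into a monotone offer, and deriving the next step's dispatch window from ramp limits. Both functions are hypothetical sketches. The mapping from a markup in [-1, 1] to a price (`mc * (1 + max_markup * a)`, with positive `mc`), the equal segment sizes and the symmetric ramp limit are assumptions; the env's own conventions may differ.

```python
def markup_to_offer(action, mc, capacity_mw, max_markup=0.5):
    """Turn a 3-value action in [-1, 1] into a monotone 3-segment offer.

    Hypothetical mapping: each sorted value a scales the true marginal cost
    to mc * (1 + max_markup * a); each segment covers a third of capacity.
    """
    clipped = [max(-1.0, min(1.0, a)) for a in action]
    markups = sorted(clipped)  # ascending order gives non-decreasing prices
    seg_mw = capacity_mw / 3.0
    return [(seg_mw, mc * (1.0 + max_markup * m)) for m in markups]


def ramp_window(p_prev, ramp_mw, p_min, p_max):
    """Feasible dispatch range at step t+1 given dispatch p_prev at step t."""
    p_min_rt = max(p_min, p_prev - ramp_mw)
    p_max_rt = min(p_max, p_prev + ramp_mw)
    return p_min_rt, p_max_rt


# Unsorted markups are reordered so that segment prices never decrease.
offer = markup_to_offer([0.4, -0.2, 1.0], mc=20.0, capacity_mw=90.0)
# A unit at 80 MW with a 25 MW ramp limit can reach 55 to 90 MW next step.
window = ramp_window(p_prev=80.0, ramp_mw=25.0, p_min=10.0, p_max=90.0)
print(window)  # (55.0, 90.0)
```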

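```python
from powerzoo.envs.market import make_gencos_env

env = make_gencos_env()
obs, info = env.reset(seed=0)
# The loop ends once no active agents remain in env.agents.
while env.agents:
    # Sample a random markup vector for each active agent.
    actions = {ag: env.action_spaces[ag].sample() for ag in env.agents}
    obs, rewards, terms, truncs, info = env.step(actions)
```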

Or via the task registry:

```python
from powerzoo.tasks import make_task_env

env = make_task_env('gencos_bidding', framework='pettingzoo')
```

The full benchmark card for this task — including baselines, OOD splits and metrics — is in Benchmarks · GenCos.

Choosing between the three¶

```mermaid
flowchart TB
    Q1{Do you want to learn\nthe offer curve itself?}
    Q1 -->|yes| GC["GenCosMARLEnv\n(gencos_bidding)"]
    Q1 -->|no| Q2{"Do you want LMP to be\noffer-based (realistic)?"}
    Q2 -->|yes| BB[BidBasedMarketEnv]
    Q2 -->|no, prefer true cost| CB[CostBasedMarketEnv]
```

In short: cost-based for clean arbitrage research, bid-based for realistic LMPs without bidding agents, GenCos for strategic bidding MARL.
