Towards an Electricity Market Simulator

This paper describes a multi-agent based simulation (MABS) framework to construct an artificial electric power market populated with learning agents. The artificial market, named TEMMAS (The Electricity Market Multi-Agent Simulator), explores the integration of two design constructs: i) the specification of the environmental physical market properties, and ii) the modeling of the decision-making (deliberative) and reactive agents. TEMMAS is materialized in an experimental setup involving distinct power generator companies which operate in the market and search for the trading strategies that best exploit their generating units’ resources. The experimental results show a coherent market behavior that emerges from the overall simulated environment.


I. INTRODUCTION
The start-up of nation-wide electricity markets, along with their recent expansion to inter-country markets, aims at providing competitive electricity service to consumers. The new market-based power industry calls for human decision-making in order to settle the energy assets' trading strategies. The interactions and influences among the market participants are usually described by game-theoretic approaches, which are based on the determination of equilibrium points against which the actual market performance is compared [1], [2]. However, those approaches find it difficult to incorporate the ability of market participants to repeatedly probe markets and adapt their strategies. Usually, the problem of finding the equilibrium strategies is relaxed (simplified) in terms of both: i) the human agents' bidding policies, and ii) the technical and economic operation of the power system. As an alternative to the equilibrium approaches, multi-agent based simulation (MABS) comes forth as being particularly well suited to analyzing dynamic and adaptive systems with complex interactions among constituents [3], [4].
In this paper we describe TEMMAS (The Electricity Market Multi-Agent Simulator), a MABS approach to the electricity market, aimed at simulating the interactions of agents and studying the macro-scale effects of those interactions. TEMMAS agents exhibit bounded rationality, i.e., they make decisions based on local information (partial knowledge) of the system and of other agents, while learning and adapting their strategies during a simulation. The aim is to reveal, and help understand, the complex and aggregate system behaviors that emerge from the interactions of the market agents.

II. TEMMAS MODELING FRAMEWORK
We describe the structural TEMMAS constituents by means of two concepts: i) the environmental entity, which has a distinct existence in the real environment, e.g. a resource such as an electricity producer, or a decision-making agent such as a market-bidding generator company, and ii) the environmental property, which is a measurable aspect of the real environment, e.g. the price of a bid or the demand for electricity. Hence, we define the environmental entity set, $E_T = \{ e_1, \ldots, e_n \}$, and the environmental property set, $E_Y = \{ p_1, \ldots, p_m \}$. The whole environment is the union of its entities and properties: $E = E_T \cup E_Y$.
The environmental entities, $E_T$, are often clustered into different classes, or types, thus partitioning $E_T$ into a set, $P_{E_T}$, of disjoint subsets, $P^i_{E_T}$, each containing entities that belong to the same class. Formally, $\bigcup_i P^i_{E_T} = E_T$ and $P^i_{E_T} \cap P^j_{E_T} = \emptyset$ for $i \neq j$. The partitioning may be used to distinguish between decision-making agents and available resources, e.g. a company that decides the bidding strategy to pursue or a plant that provides the demanded power.
The environmental properties, $E_Y$, can be clustered in the same way as the environmental entities, thus grouping related properties. The partitioning may be used to express distinct categories, e.g. economic, electrical, ecological or social aspects. Another, more technical, usage is to separate constant parameters from dynamic state variables.
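The clustering of entities into classes can be illustrated with a small Python sketch. It only checks the two partition conditions stated above (the classes are pairwise disjoint and together cover the whole set); the entity names, the class labels and the helper `is_partition` are hypothetical and are not part of TEMMAS.

```python
from typing import Dict, Set


def is_partition(whole: Set[str], classes: Dict[str, Set[str]]) -> bool:
    """Return True if the classes are pairwise disjoint and their union is `whole`."""
    seen: Set[str] = set()
    for members in classes.values():
        if seen & members:  # an entity already belongs to an earlier class
            return False
        seen |= members
    return seen == whole


# Hypothetical entities: two decision-making agents and two resources.
entities = {"GenCo_A", "GenCo_B", "Plant_1", "Plant_2"}
classes = {
    "agents": {"GenCo_A", "GenCo_B"},
    "resources": {"Plant_1", "Plant_2"},
}
print(is_partition(entities, classes))  # True
```

The same check applies unchanged to a clustering of the environmental properties.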
The factored state space representation. The state of the simulated environment is implicitly defined by the state of all its environmental entities and properties. We follow a factored representation that describes the state space as a set, $V$, of discrete state variables [5]. Each state variable, $v_i \in V$, takes on values in its domain $D(v_i)$, and the global (i.e., over $E$) state space, $S \subseteq \times_{v_i \in V} D(v_i)$, is a subset of the Cartesian product of the state variable domains. A state $s \in S$ is an assignment of values to the set of state variables $V$. We define $f_C$, $C \subseteq V$, as a projection such that if $s$ is an assignment to $V$, $f_C(s)$ is the assignment of $s$ to $C$; we define a context $c$ as an assignment to the subset $C \subseteq V$; the initial state variables of each entity and property are defined, respectively, by corresponding init functions.
The decision-making approach. Each agent perceives (the market) and acts (sells or buys), and there are two main approaches to developing its reasoning and decision-making capabilities: i) the qualitative mental-state based reasoning, such as the belief-desire-intention (BDI) architecture [6], which is founded on logic theories, and ii) the quantitative, decision-theoretic evaluation of causal effects, such as the Markov decision process (MDP) support for sequential decision-making in stochastic environments. There are also hybrid approaches that combine the qualitative and quantitative formulations [7], [8].
The qualitative mental-state approaches capture the relation between high-level components (e.g. beliefs, desires, intentions) and tend to follow heuristic (or rule-based) decision-making strategies, thus being better suited to tackling large-scale problems and worse suited to dealing with stochastic environments.
The quantitative decision-theoretic approaches deal with low-level components (e.g., primitive actions and immediate rewards) and search for long-term policies that maximize some utility function, thus being worse suited to tackling large-scale problems and better suited to dealing with stochastic environments.
The electric power market is a stochastic environment and we currently formulate medium-scale problems that can fit a decision-theoretic agent model. Therefore, TEMMAS adaptive agents (e.g., market bidders) follow an MDP-based approach and resort to experience (sampled sequences of states, actions and rewards from simulated interaction) to search for optimal, or near-optimal, policies using reinforcement learning methods such as Q-learning [9] or SARSA [10].
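To make the learning step concrete, the following is a minimal tabular Q-learning sketch in Python. It is not the TEMMAS implementation: the function names, the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon`, and the example states and actions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: states and actions are placeholders, and the reward
# stands for a profit observed after one simulated step.
q = defaultdict(float)          # unseen (state, action) pairs start at 0.0
actions = ["raise", "hold", "lower"]
new_value = q_update(q, state=50, action="lower", reward=10.0,
                     next_state=55, actions=actions)
print(new_value)
```

SARSA differs only in the target: it uses the value of the action actually taken in the next state instead of the maximum over all actions.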

III. TEMMAS DESIGN
Within the current design model of TEMMAS, the electricity asset is traded through a spot market (no bilateral agreements), which is operated via a Pool institutional power entity. Each generator company, GenCo, submits (to Pool) how much energy each of its generating units, $GenUnit_{GenCo}$, is willing to produce and at what price. Thus, we have: i) the power supply system comprises a set, $E_{GenCo}$, of generator companies, ii) each generator company, GenCo, contains its own set, $E_{GenUnit_{GenCo}}$, of generating units, iii) each generating unit, $GenUnit_{GenCo}$, of a GenCo has constant marginal costs, and iv) the market operator, Pool, trades all the GenCos' submitted energy.
The bidding procedure conforms to the so-called "block bids" approach [11], where a block represents a quantity of energy being bid at a certain price; also, GenCos are not allowed to bid higher than a predefined price ceiling. Thus, the essential measurable aspects of the market supply are the energy price, quantity and production cost. The consumer side of the market is mainly described by the quantity of demanded energy; we assume that there is no price elasticity of demand (i.e., no demand-side market bidding).
Therefore, the environmental properties considered are the price, the quantity and the production cost. The quantity refers to both the supply and demand sides of the market. The price refers both to the supply bid values and to the market value settled by Pool.
The set $E_{GenCo}$ contains the decision-making agents. The Pool is a reactive agent that always applies the same predefined auction rules in order to determine the market price and hence the block bids that clear the market. Each $E_{GenUnit_{GenCo}}$ represents the GenCo's set of available resources.
The resources' specification. Each generating unit, $GenUnit_{GenCo}$, defines its marginal costs and constructs the block bids according to the strategy indicated by its generator company, GenCo. Each $GenUnit_{GenCo}$ calculates its marginal costs according to either the "WithHeatRate" [12] or the "WithCO$_2$" [13] formulation.
The "WithHeatRate" formulation estimates the marginal cost, $MC$, by combining the variable operations and maintenance costs, $vO\&M$, the number of heat rate intervals, $nPat$, each interval's capacity, $cap_i$, and the corresponding heat rate value, $hr_i$, and the price of the fuel being used, $fPrice$; the marginal cost for a given interval $i \in [1, nPat]$ is given by expression (1), where each block's capacity is given by $blockCap_{i+1} = cap_{i+1} - cap_i$.
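The block capacity rule $blockCap_{i+1} = cap_{i+1} - cap_i$ can be sketched in Python as follows. The helper name `block_capacities` is hypothetical; the capacity points 250 and 350 are the CO values quoted in the experimental example of this paper, while 400 is an invented third point used only to show a second block.

```python
from typing import List


def block_capacities(cap: List[float]) -> List[float]:
    """Apply blockCap_{i+1} = cap_{i+1} - cap_i to consecutive capacity points."""
    if any(b <= a for a, b in zip(cap, cap[1:])):
        raise ValueError("capacity points must be strictly increasing")
    return [b - a for a, b in zip(cap, cap[1:])]


# 250 and 350 MW come from the paper's CO example; 400 MW is hypothetical.
print(block_capacities([250, 350, 400]))  # [100, 50]
```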
The "WithCO$_2$" marginal cost, $MC$, combines the variable operations and maintenance costs, $vO\&M$, the price of the fuel, $fPrice$, the CO$_2$ cost, $CO_2cost$, and the unit's productivity, $\eta$, through an expression in which $K$ is a fuel-dependent constant factor and $CO_2cost$ is derived from $CO_2emit$, the fuel's CO$_2$ emissions. Here all blocks have the same capacity; given a unit's maximum capacity, $maxCap$, and a number of blocks to sell, $nBlocks$, each block's capacity is given by $blockCap = maxCap / nBlocks$.
The decision-making strategies. Each generator company defines the bidding strategy for each of its generating units. We designed two types of strategies: a) the basic-adjustment, which chooses among a set of basic rigid options, and b) the heuristic-adjustment, which selects and follows a predefined well-known heuristic.
There are several basic-adjustment strategies already defined in TEMMAS. Here we outline seven of those strategies, $sttg_i$ where $i \in \{ 1, \ldots, 7 \}$, available for a GenCo to apply: i) $sttg_1$, bid according to the marginal production cost of each $GenUnit_{GenCo}$ (follow heat rate curves, e.g., cf. Tables II and III), ii) $sttg_2$, make a "small" increment in the prices of all the previous day's block bids, iii) $sttg_3$, similar to $sttg_2$, but make a "large" increment, iv) $sttg_4$, make a "small" decrement in the prices of all the previous day's block bids, v) $sttg_5$, similar to $sttg_4$, but make a "large" decrement, vi) $sttg_6$, hold the prices of all the previous day's block bids, and vii) $sttg_7$, set the price to zero.
There are two heuristic-adjustment strategies defined: a) "Fixed Increment Price Probing" (FIPP), which uses a percentage to increment the price of the last day's transacted energy blocks and to decrement the price of the non-transacted blocks, and b) "Physical Withholding based on System Reserve" (PWSR), which reduces the block's capacity so as to decrease the next day's estimated system reserve (the difference between total capacity and total demand), and then bids the remaining energy at the maximum market price.
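The price adjustments described above can be sketched in Python as follows. This is an illustrative reading, not the TEMMAS code: the text does not say how large a "small" or "large" step is, nor whether steps are relative or absolute, so `SMALL_STEP`, `LARGE_STEP` and the relative (percentage) interpretation are assumptions, as are the function names. Prices are clipped to the price ceiling mentioned earlier and to a floor of zero.

```python
from typing import List, Sequence

SMALL_STEP = 0.02  # hypothetical "small" relative step
LARGE_STEP = 0.10  # hypothetical "large" relative step


def basic_adjustment(strategy: int, marginal_costs: Sequence[float],
                     previous_prices: Sequence[float],
                     price_ceiling: float) -> List[float]:
    """Return the new block prices for basic-adjustment strategy 1..7."""
    factors = {2: 1 + SMALL_STEP, 3: 1 + LARGE_STEP,
               4: 1 - SMALL_STEP, 5: 1 - LARGE_STEP}
    if strategy == 1:            # bid at marginal cost
        prices = list(marginal_costs)
    elif strategy in factors:    # small/large increment or decrement
        prices = [p * factors[strategy] for p in previous_prices]
    elif strategy == 6:          # hold the previous day's prices
        prices = list(previous_prices)
    elif strategy == 7:          # bid at zero
        prices = [0.0] * len(previous_prices)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [min(max(p, 0.0), price_ceiling) for p in prices]


def fipp(previous_prices: Sequence[float], transacted: Sequence[bool],
         pct: float, price_ceiling: float) -> List[float]:
    """FIPP: raise the price of transacted blocks by pct, lower the others."""
    return [min(p * (1 + pct), price_ceiling) if sold else p * (1 - pct)
            for p, sold in zip(previous_prices, transacted)]


# Hypothetical data: two blocks, the first one sold yesterday.
print(basic_adjustment(3, [10.0, 12.0], [11.0, 95.0], price_ceiling=100.0))
print(fipp([11.0, 95.0], [True, False], pct=0.05, price_ceiling=100.0))
```

PWSR is omitted because it changes block capacities rather than prices and depends on a reserve estimate that the text does not specify.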
The agents' decision process. The above strategies correspond to the GenCo agent's primary actions. The GenCo has a set, $E_{GenUnit_{GenCo}}$, of generating units and, at each decision epoch, it decides the strategy to apply to each generating unit, thus choosing a vector of strategies, $\overrightarrow{sttg}$, where the $i$-th component refers to the generating unit $GenUnit^i_{GenCo}$; each component of its action therefore takes a value in $\{ sttg_1, \ldots, sttg_7 \} \cup \{ FIPP, PWSR \}$. The GenCo's perceived market share, $mShare$, is used to characterize the agent's internal memory, so its state space is given by $mShare \in [0..100]$. Each GenCo is an MDP decision-making agent such that the decision process period represents a daily market. At each decision epoch, each agent computes its daily profit (which is regarded as an internal reward function), and the Pool agent receives all the GenCos' block bids for the 24 hours of the day and settles the hourly market price by matching offers at a classic supply and demand equilibrium price (we assume a constant hourly demand).
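The Pool's hourly settlement can be sketched in Python as follows. The text only says that offers are matched at a supply and demand equilibrium price; the merit-order rule used here (accept the cheapest blocks first and pay every accepted block the price of the last one) is a common reading of that statement and is an assumption, as are the names `BlockBid`, `clear_market` and `market_share` and all bid values. The 2000 MW demand is the constant hourly demand used in the paper's experiment.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class BlockBid(NamedTuple):
    genco: str        # name of the bidding company
    price: float      # offered price per unit of energy
    quantity: float   # offered quantity (MW)


def clear_market(bids: List[BlockBid], demand: float
                 ) -> Tuple[Optional[float], List[Tuple[BlockBid, float]]]:
    """Accept the cheapest blocks until demand is met; return (price, accepted)."""
    accepted: List[Tuple[BlockBid, float]] = []
    remaining = demand
    price: Optional[float] = None
    for bid in sorted(bids, key=lambda b: b.price):
        if remaining <= 0:
            break
        taken = min(bid.quantity, remaining)
        accepted.append((bid, taken))
        remaining -= taken
        price = bid.price  # the last accepted block sets the market price
    if remaining > 0:
        raise ValueError("supply is insufficient to cover demand")
    return price, accepted


def market_share(accepted: List[Tuple[BlockBid, float]],
                 demand: float) -> Dict[str, float]:
    """Percentage of demand served by each GenCo (the mShare state variable)."""
    shares: Dict[str, float] = {}
    for bid, taken in accepted:
        shares[bid.genco] = shares.get(bid.genco, 0.0) + 100.0 * taken / demand
    return shares


# Hypothetical bids for one hour.
bids = [BlockBid("major", 15.0, 1500.0), BlockBid("major", 25.0, 1000.0),
        BlockBid("minor", 20.0, 600.0)]
price, accepted = clear_market(bids, demand=2000.0)
print(price, market_share(accepted, 2000.0))  # 20.0 {'major': 75.0, 'minor': 25.0}
```

Under this rule the partially accepted block of "minor" sets the price, and the resulting shares are the kind of value a GenCo would observe as its $mShare$ state.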
TEMMAS architecture and construction. The TEMMAS agents, along with the major inter-agent communication paths, are represented in the bottom region of Figure 1; the top region represents the user interface, which allows each of the resources' and agents' configurable parameters to be specified. The TEMMAS architecture followed the INGENIAS [14] methodology and used its supporting development platform. Figure 2 presents the general "agent's perspective", where the tasks and the goals are clustered into individual and social perspectives. Figure 3 gives additional detail on the construction of tasks and goals using INGENIAS.
… resulting behavior, e.g. the learnt bidding policies, in light of the market-specific dynamics; cf. [15] for our extended experimental setup. We considered three types of generating units: i) one base load coal plant, CO, ii) one combined cycle plant, CC, to cover intermediate load, and iii) one gas turbine, GT, peaking unit. Table I shows the essential properties of each plant type and Tables II and III show the heat rate curves used to define the bidding blocks. The marginal cost was computed using expression (1); the bidding block's quantity is the capacity increment, e.g. for CO, the quantity of the bidding block with marginal cost 11.9 is 350 − 250 = 100 MW.
We designed a simple experimental scenario; Table IV shows each GenCo's name and its production capacity according to the respective GenUnits (cf. Table I). The "active" suffix (cf. Table IV, name column) means that the GenCo searches for its GenUnits' best bidding strategies; i.e. "active" is a policy … We considered a constant hourly demand for electricity of 2000 MW. Figure 4 shows the market share evolution while GenCo minor&active learns to play in the market with GenCo major, which is a larger company with a fixed strategy: "bid each block 5 € higher than its marginal cost". We see that GenCo minor&active gains around 18% (75 − 57) of the market from GenCo major. To earn that share, GenCo minor&active learnt to lower its prices in order to exploit the "5 € space" offered by GenCo major's fixed strategy.

V. CONCLUSIONS AND FUTURE WORK
This paper describes our preliminary work in the construction of the TEMMAS agent-based electricity market simulator. Our contribution includes a comprehensive formulation of the simulated electric power market environment along with the inhabiting decision-making and learning agents. Our initial results reveal an emerging and coherent market behavior, thus encouraging us to extend TEMMAS with additional bidding strategies and to incorporate specific market rules, such as congestion management and pricing regulation mechanisms.

TABLE I: PROPERTIES OF GENERATING UNITS; THE UNIT TYPES ARE COAL (CO), COMBINED CYCLE (CC) AND GAS TURBINE (GT); O&M INDICATES "OPERATION AND MAINTENANCE" COST.

TABLE II: CO AND CC UNITS' CAPACITY BLOCK (MW) AND HEAT RATE (BTU/KWH) AND THE CORRESPONDING MARGINAL COST (€/MWH).

TABLE IV: THE EXPERIMENT'S GenCoS AND GenUnitS.