A Matrix Game Model for Analyzing FTR Bidding Strategies in Deregulated Electric Power Markets

Tapas K. Das, Patricio Rocha, and Cihan Babayigit

This work was supported in part by the National Science Foundation through Grant # ECS-0400268. T. K. Das ([email protected]) and P. Rocha ([email protected]) are with the Department of Industrial & Management Systems Engineering, University of South Florida, Tampa, Florida 33620; C. Babayigit ([email protected]) is with Revenue Management Solutions, LLC, 777 South Harbour Island Boulevard Suite 890 Tampa, FL 33602.

Abstract— Suppliers in deregulated electric power markets compete for financial transmission rights (FTRs) to hedge against congestion charges. The system operator receives the bids for FTRs submitted by the suppliers and develops an allocation strategy by solving an optimization model. Each FTR bid is defined by a path, a quantity indicating the amount of FTRs the supplier is bidding for in that path, and the price that the supplier is willing to pay for each FTR. The FTR revenue is calculated only after the electricity market has cleared, by computing the difference in the locational marginal prices (LMPs) at the pair of nodes that define each path. Thus, suppliers rely on forecasts of LMPs to develop their FTR bids. In this paper, we present a game theoretic modeling approach to develop FTR bidding strategies for power suppliers assuming that they have forecasts of LMPs. The game theoretic model considers multiple participants as well as network contingencies. We apply the game theoretic model on a sample network to assess impacts of variations of bid and network parameters on the FTR market outcome.

Index Terms— Deregulated Electricity Markets, Financial Transmission Rights, Matrix Game, FTR Settlement

I. INTRODUCTION

Congestion in a power network is caused by transmission capacity limitations. Its effect on suppliers and consumers is primarily reflected by the different locational marginal prices (LMPs) observed in a network. For the system operator (ISO), this price difference results in higher revenues collected from the consumers than payments made to the suppliers. Financial transmission rights (FTRs) help to redistribute this excess revenue among the market participants [1] with the objective of mitigating the price uncertainty caused by the differences in the LMPs. Thus, FTRs can be seen as a hedging instrument for the market participants.

An FTR is a contract between a market participant and the ISO. It is designated by a MW amount from a source node to a sink node in the network, and is valid over a defined period of time. As opposed to physical transmission rights (PTRs) [2], FTRs do not provide exclusive rights over a transmission line. FTRs can be acquired through auctions or in the secondary market. In this paper, we consider that market participants acquire FTRs only via periodic auctions.

FTRs can be further classified as obligations or options. The holder of an FTR obligation will receive a payment equal to the amount of FTR (in MW) times the difference in LMPs between the source and sink nodes (ΔLMP), when the ΔLMP has a positive value; if the ΔLMP is negative, then the holder of an FTR obligation will have to make a payment to the ISO (computed in the same fashion). An FTR obligation can thus be a benefit but also a liability. Conversely, for an FTR option, if the ΔLMP is negative, the option holder does not have to make a payment to the ISO. In this paper we consider that a market participant has a choice to bid for any combination of FTR obligations and options.

Once the market participants have submitted their FTR bids to the auction, the ISO allocates the FTRs by solving an optimization problem that maximizes the FTR sales revenue [3] subject to network capacity constraints.
The ISO performs a simultaneous feasibility test (SFT) to ensure that the FTRs allocated are within the capability of the existing transmission system[1]. The SFT is required in order to check for revenue adequacy, i.e., to ensure that the excess revenue collected by the ISO in the electricity market will be greater than or equal to the payments made to the FTR holders. Despite the use of SFT, there are circumstances where the market might not be revenue adequate. Such circumstances include when the loop flow conditions are different in the electricity market compared to the assumptions made in the FTR model, and when there are emergency outages not included in the contingency scenarios in the FTR model [4]. Market participants attempt to maximize their FTR revenue based on their FTR bids submitted in anticipation of certain combination of system operating conditions, network contingency scenarios, and bidding strategies of other competitors. Among the operating conditions, forecasted LMPs play a critical role in determining the FTR bid price and path. In the literature, various methods have been used to forecast LMPs including simulation [5], artificial neural network [5], [6], and time series [7]. In this paper, we present a matrix-game theoretic modeling approach to develop FTR equilibrium bidding strategies for multiple market participants assuming that they have knowledge of LMP forecasts. The FTR bidding strategies are composed of price, quantity, path, and a parameter determining the proportion of options and obligations. The modeling approach presented here can be used by 1) market participants, to obtain FTR bids that maximize their profits and 2) system operators and market designers, to analyze a wide spectrum of market scenarios given by system contingencies and bid combinations. 
In the literature, a bi-level optimization method to obtain FTR bidding equilibrium strategies is presented in [3], where each bidder considers multiple bidding strategies and models the bidding behavior of its opponents in the upper level problem. The lower level problem finds the FTR market clearing price and the respective FTR allocation. The solution of the bi-level problem is obtained by iteratively updating the bidding strategies of each bidder, one at a time, while keeping the opponents' bidding strategies fixed. This procedure ends when the bidding strategies of all participants cease to change. The matrix-game theoretic model we present in this paper differs from [3] in that we obtain the FTR bidding strategies using a value iteration based reinforcement learning (RL) algorithm. Moreover, we consider an additional bid element that describes the proportion of FTR obligations and FTR options for each market participant. A common assumption in both papers is that bidders have knowledge of LMP forecasts.

Another paper by O'Neill et al. [8] presents an auction-based process that jointly considers FTR bids and forward energy contracts. Their mathematical formulation is a generalization of DC power dispatch models that accommodates transmission rights. The formulation iteratively allocates FTRs to the bidders, and in the last iteration the energy dispatch problem is solved and the LMPs are obtained.

The paper is organized as follows. Section II describes the matrix game theoretic approach and the mathematical formulations included in it. The value iteration based RL algorithm to solve the model is explained in Section III. Numerical experiments and discussion of the results are presented in Section IV. Section V outlines the implementation steps, and Section VI provides the conclusions.

II. A MATRIX GAME MODEL FORMULATION FOR FTR ALLOCATION

Let $\mathcal{I} = \{1, 2, \dots, I\}$ denote the set of paths of source and sink locations for which FTRs can be obtained.
Also, let $\mathcal{N} = \{1, 2, \dots, N\}$ denote the set of participants bidding for the available FTRs. A bidder $n \in \mathcal{N}$ is considered to bid on a subset of paths $\mathcal{I}_n \subset \mathcal{I}$. The bidders submit different price and quantity bids for obligation and option type FTRs. Since the expected maximum FTR revenue on path $i$ is determined by the forecasted $\Delta LMP_i$, the bid prices of bidder $n$ are considered to be $k_i^n$ times $\Delta LMP_i^n$, where $k_i^n \ge 0$. Let $Q_i^n$ denote the maximum amount of FTR (options and/or obligations) that bidder $n$ bids in path $i$. The quantity $Q_i^n$ is split between FTR obligations and FTR options via a factor $m_i^n$, $0 \le m_i^n \le 1$. If $m_i^n = 0$ the bid is only for FTR options, while if $m_i^n = 1$ the bid on path $i$ is only for FTR obligations. Hence, a bid vector for bidder $n$ for path $i \in \mathcal{I}_n$ can be denoted as

\[
a_i^n = \left( k_i^{n,ob}\,\Delta LMP_i^n ;\; k_i^{n,op}\,\Delta LMP_i^n ;\; Q_i^n ;\; m_i^n \right),
\]

where the price multiplying factors ($k_i^n$) are indexed by $ob$ and $op$ for obligation and option, respectively. $\Delta LMP_i^n$ denotes the forecasted LMP difference between sink and source buses considered by bidder $n$ in path $i$. The cardinality of the bid vector $a_i^n$ is given as follows,

\[
|a_i^n| = |k_i^{n,ob}| \times |k_i^{n,op}| \times |Q_i^n| \times |m_i^n|,
\]

with $|k_i^{n,ob}|$, $|k_i^{n,op}|$, $|Q_i^n|$, and $|m_i^n|$ the number of levels used to discretize each element of the bidding vector. Then the cardinality of the entire action space for bidder $n$, $|F^n|$, is given as,

\[
|F^n| = \prod_{i \in \mathcal{I}_n} |a_i^n| \tag{1}
\]

The competition of the $N$ bidders in the FTR market can be modeled as an $N$-player matrix game consisting of $N$ payoff matrices of size $|F^1| \times |F^2| \times \dots \times |F^N|$. The computation of the elements in the payoff matrices is presented next.

A. Computation of Payoff Matrix Elements

1) FTR Allocation Model: Upon receiving the FTR bids from each bidder, the ISO solves an optimization problem (2)-(6) to develop the FTR allocations and obtain the FTR prices. In this paper, we assume the ISO runs a uniform price auction. We adopt a formulation similar to what is presented in [3], [8].
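As a rough illustration of the action-space size in Eq. (1), the following Python sketch enumerates the discretized bid vectors of one bidder on two paths. The function names (`path_bids`, `action_space`) and all level values are hypothetical and chosen only to make the count easy to follow; they are not the values used in the paper.

```python
from itertools import product
from math import prod

def path_bids(k_ob_levels, k_op_levels, q_levels, m_levels, delta_lmp):
    """Enumerate every discretized bid vector a_i^n for a single path."""
    return [
        (k_ob * delta_lmp, k_op * delta_lmp, q, m)
        for k_ob, k_op, q, m in product(k_ob_levels, k_op_levels, q_levels, m_levels)
    ]

def action_space(per_path_bids):
    """Joint action space F^n: one bid vector is chosen on every path in I_n."""
    return list(product(*per_path_bids))

# Hypothetical discretization: 2 obligation-price levels, 2 option-price
# levels, 3 quantity levels, and 2 type-mix levels on each path.
bids_path1 = path_bids([0.5, 1.0], [0.5, 1.0], [100, 200, 300], [0.0, 1.0], delta_lmp=20.0)
bids_path2 = path_bids([0.5, 1.0], [0.5, 1.0], [100, 200, 300], [0.0, 1.0], delta_lmp=10.5)

space = action_space([bids_path1, bids_path2])
# Eq. (1): |F^n| is the product of the per-path cardinalities |a_i^n|.
expected_size = prod(len(b) for b in (bids_path1, bids_path2))
print(len(bids_path1), len(space), expected_size)
```

With these assumed levels each path has 2 × 2 × 3 × 2 = 24 bid vectors, so the two-path action space has 24 × 24 = 576 actions, which shows how quickly the payoff matrices grow with the number of paths and discretization levels.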
The objective function in (2) maximizes the FTR auction revenue subject to network capacity constraints and upper bounds for the FTRs requested,

\[
\max \sum_{n=1}^{N} \sum_{i \in \mathcal{I}_n} \left( \rho_i^{n,ob}\, FTR_i^{n,ob} + \rho_i^{n,op}\, FTR_i^{n,op} \right) \tag{2}
\]

s.t.

\[
\sum_{n=1}^{N} \sum_{i \in \mathcal{I}_n} \left[ D_{i,l}^{n,c}\, FTR_i^{n,ob} + \max(0, D_{i,l}^{n,c})\, FTR_i^{n,op} \right] \le B_l^c \quad \forall\, l, c \tag{3}
\]

\[
\sum_{n=1}^{N} \sum_{i \in \mathcal{I}_n} \left[ -D_{i,l}^{n,c}\, FTR_i^{n,ob} + \max(0, -D_{i,l}^{n,c})\, FTR_i^{n,op} \right] \le B_l^c \quad \forall\, l, c \tag{4}
\]

\[
FTR_i^{n,ob} \le m_i^n\, Q_i^n \quad \forall\, n, i \tag{5}
\]

\[
FTR_i^{n,op} \le (1 - m_i^n)\, Q_i^n \quad \forall\, n, i \tag{6}
\]

where
$FTR_i^{n,ob}$: quantity of obligation FTR allocated to the $n$th bidder on path $i$ (decision variable)
$FTR_i^{n,op}$: quantity of option FTR allocated to the $n$th bidder on path $i$ (decision variable)
$\rho_i^{n,ob}$: obligation bid price of the $n$th bidder on path $i$
$\rho_i^{n,op}$: option bid price of the $n$th bidder on path $i$
$D_{i,l}^{n,c}$: PTDF of the $n$th bidder's $i$th path on line $l$ under contingency $c$
$B_l^c$: capacity limit of line $l$ under contingency $c$
$Q_i^n$: upper bidding quantity of bidder $n$ for path $i$

Constraints (3) and (4) account for the fact that counterflows are not considered for FTR options. From the above mathematical formulation we also obtain the market clearing price (MCP) for an FTR option and an FTR obligation in each path $i$. Let $w_{lc}^{+}$ and $w_{lc}^{-}$ be the shadow prices of constraints (3) and (4), respectively, for line $l$ under contingency $c$. Then,

\[
MCP_i^{ob} = \sum_{c=0}^{C} \sum_{l=1}^{L} D_{i,l}^{c} \left( w_{lc}^{+} - w_{lc}^{-} \right), \tag{7}
\]

\[
MCP_i^{op} = \sum_{c=0}^{C} \sum_{l=1}^{L} \max(0, D_{i,l}^{c})\, w_{lc}^{+} + \sum_{c=0}^{C} \sum_{l=1}^{L} \max(0, -D_{i,l}^{c})\, w_{lc}^{-}. \tag{8}
\]

2) Expected FTR Profit: Since the bidders are trying to maximize their individual FTR profit based on forecasts for $\Delta LMP$s in the network, the expected value $R^n$ of the FTR profit for each bidder $n$ is as follows,

\[
R^n = \sum_{i \in \mathcal{I}_n} \left[ \Delta LMP_i^n\, FTR_i^{n,ob} + \max(\Delta LMP_i^n, 0)\, FTR_i^{n,op} - \left( MCP_i^{ob}\, FTR_i^{n,ob} + MCP_i^{op}\, FTR_i^{n,op} \right) \right] \tag{9}
\]

where $R^n$ corresponds to the total FTR profit for all the paths for which bidder $n$ submitted FTR bids.
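To make Eq. (9) concrete, the following Python sketch evaluates the expected FTR profit of one bidder from given allocations and clearing prices. The function name `expected_ftr_profit`, the dictionary keys, and all numbers are hypothetical; in the model the allocations and MCPs would come from solving (2)-(6), which is not done here.

```python
def expected_ftr_profit(paths):
    """Eq. (9): expected FTR profit R^n summed over the bidder's paths.

    Each entry of `paths` holds the forecasted LMP difference (dlmp), the
    allocated obligation and option quantities, and the two clearing prices.
    """
    total = 0.0
    for p in paths:
        # Obligations settle at dLMP (positive or negative); options only pay out
        # when dLMP is positive.
        revenue = p["dlmp"] * p["ftr_ob"] + max(p["dlmp"], 0.0) * p["ftr_op"]
        # Purchase cost at the market clearing prices of Eqs. (7) and (8).
        cost = p["mcp_ob"] * p["ftr_ob"] + p["mcp_op"] * p["ftr_op"]
        total += revenue - cost
    return total

# Hypothetical allocations and prices for a bidder holding FTRs on two paths.
example_paths = [
    {"dlmp": 20.0, "ftr_ob": 100.0, "ftr_op": 50.0, "mcp_ob": 12.0, "mcp_op": 15.0},
    {"dlmp": -5.0, "ftr_ob": 40.0, "ftr_op": 10.0, "mcp_ob": -3.0, "mcp_op": 0.5},
]
print(expected_ftr_profit(example_paths))  # 1050.0 on the first path, -85.0 on the second
```

The second path illustrates the asymmetry between the two FTR types: with a negative forecasted $\Delta LMP$ the obligation becomes a liability, while the option simply earns nothing.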
3) Risk-Constrained Expected FTR Profit: Bidders use forecasted $\Delta LMP$s as the basis for selecting bid prices in the FTR auction. Due to the variability in the LMPs, we incorporate the variance of the profit, $\mathrm{var}(R^n)$, as a measure of risk in the calculation of the risk-constrained expected profit $U^n$,

\[
U^n = R^n - \zeta^n \, \mathrm{var}(R^n), \tag{10}
\]

where, for risk neutral bidders, the risk coefficient $\zeta^n = 0$. For risk prone and risk averse bidders, $\zeta^n$ takes negative and positive values, respectively. The value of $U^n$ is computed for each bidder, for each of the FTR bid combinations, determining the individual payoff matrices in the matrix game. The matrix game is then solved with the reinforcement learning algorithm presented in the following section.

III. SOLUTION ALGORITHM FOR THE FTR MATRIX GAME

We solve the resulting FTR matrix game using a reinforcement learning (RL) algorithm [9]. The RL algorithm has been used to solve matrix games resulting from other problems in electricity markets [10]. The RL algorithm has the following steps.

1) Let iteration count $t = 0$. Initialize R-values for each bidder $n$ to an identical small positive number. There are $|F^n|$ R-values, $R_0(n, 1), \dots, R_0(n, |F^n|)$, for each bidder. Also initialize the learning parameter $\gamma_0$, exploration parameter $\phi_0$, and parameters $\gamma_\tau$ and $\phi_\tau$ needed to obtain suitable decay rates of learning and exploration. Let $T$ denote the maximum iteration count.

2) If $t < T$, continue learning the R-values through the following steps:

• Action Selection: Greedy action selection: Each bidder $n$, with probability $(1 - \phi_t)$, chooses an action $a^n$ for which $R_t(n, a^n) \ge R_t(n, \bar{a}^n)$, where $\bar{a}^n$ stands for all the other FTR bid strategies except $a^n$. A tie is broken arbitrarily. Exploratory action selection: With probability $\phi_t$, a bidder chooses an action $a^n$ from the remaining possible FTR bid strategies (excluding the greedy bid), where each of these candidate bids has an equal probability of being chosen.
• R-Value Updating: Update the specific R-values for each bidder $n$ corresponding to the chosen bid strategy $a^n$ using the learning scheme given below.

\[
R_{t+1}(n, a^n) \leftarrow (1 - \gamma_t)\, R_t(n, a^n) + \gamma_t\, U^n(a^n, a^{-n}), \tag{11}
\]

where $U^n(a^n, a^{-n})$ is the $n$th bidder's payoff when the bidder selects action $a^n$ and the rest of the bidders select action combination $a^{-n}$.

• Set $t \leftarrow t + 1$.

• Update the learning parameter $\gamma_t$ and exploration parameter $\phi_t$ following the DCM scheme given below [11]:

\[
\Theta_t = \frac{\Theta_0}{1 + u}, \quad \text{where } u = \frac{t^2}{\Theta_\tau + t}, \tag{12}
\]

where $\Theta_0$ denotes the initial value of a learning/exploration rate, and $\Theta_\tau$ is a large value (e.g., $10^9$) chosen to obtain a suitable decay rate for the learning/exploration parameters. The exploration rate generally has a large starting value (e.g., 0.8) and a quicker decay, whereas the learning rate has a small starting value (e.g., 0.1) and a very slow decay rate. The exact choice of these values depends on the application [11], [12].

• If $t < T$, go back to the beginning of Step 2, else go to Step 3.

3) From the final set of R-values, the best-response FTR bid strategy $a^{n*}$ for each bidder $n$ is found as follows,

\[
a^{n*} = \arg\max_{a^n} R(n, a^n) \tag{13}
\]

The RL algorithm always provides a pure strategy solution. Mixed strategy equilibria, which always exist in a finite matrix game, are not considered since their implementation in real-life power networks is impractical.

Remarks: Mixed strategy solutions imply that players choose multiple actions with different probabilities. Thus, to accomplish a mixed strategy implementation, a game must be repeated many times to achieve, on average, the percentages with which the bids (actions) should be chosen. This condition of identical repeated play of a game is difficult to satisfy since a power network is unlikely to remain completely unchanged for a large number of FTR allocation periods. If any of the network conditions changes, the matrix game changes as well, requiring a new best-response solution.
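The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`dcm`, `solve_matrix_game`), the default parameter values, the fixed random seed, and the tiny two-bidder payoff table `U` are all hypothetical. In the paper the payoffs would be the risk-constrained profits $U^n$ computed from the FTR allocation model.

```python
import random

def dcm(theta0, theta_tau, t):
    """Decay rule of Eq. (12): theta_t = theta0 / (1 + u), u = t^2 / (theta_tau + t)."""
    u = t * t / (theta_tau + t)
    return theta0 / (1.0 + u)

def solve_matrix_game(payoff, n_actions, T=5000, gamma0=0.1, phi0=0.8,
                      gamma_tau=1e9, phi_tau=1e3, seed=0):
    """Learn one R-value per (bidder, action) and return the greedy joint action."""
    rng = random.Random(seed)
    players = range(len(n_actions))
    R = [[0.01] * k for k in n_actions]      # Step 1: identical small positive R-values
    for t in range(T):                       # Step 2: learning loop
        gamma = dcm(gamma0, gamma_tau, t)    # slowly decaying learning rate
        phi = dcm(phi0, phi_tau, t)          # faster decaying exploration rate
        joint = []
        for n in players:
            greedy = max(range(n_actions[n]), key=lambda a: R[n][a])
            others = [a for a in range(n_actions[n]) if a != greedy]
            if others and rng.random() < phi:
                joint.append(rng.choice(others))   # exploratory action
            else:
                joint.append(greedy)               # greedy action
        for n in players:                    # Eq. (11): update the chosen action only
            a = joint[n]
            R[n][a] = (1 - gamma) * R[n][a] + gamma * payoff(n, tuple(joint))
    # Step 3, Eq. (13): each bidder's best response is its highest R-value action.
    best = tuple(max(range(n_actions[n]), key=lambda a: R[n][a]) for n in players)
    return best, R

# Hypothetical payoffs U[n][a1][a2]; action 1 strictly dominates for both bidders,
# so (1, 1) is the unique pure strategy Nash equilibrium of this toy game.
U = [
    [[1.0, 2.0], [10.0, 12.0]],   # bidder 0, indexed by its own action a1 first
    [[1.0, 10.0], [2.0, 12.0]],   # bidder 1, whose own action is a2
]
best, R = solve_matrix_game(lambda n, joint: U[n][joint[0]][joint[1]], [2, 2])
print(best)
```

Note that each bidder keeps R-values over its own actions only, not over joint actions, which is what keeps the memory requirement linear in the size of each bidder's action space.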
The pure strategy best-response solution found by the value-based RL algorithm almost always coincides with the pure strategy Nash equilibrium, if the matrix game has one [9]. If the matrix game has multiple Nash equilibria, the RL algorithm finds the best-response action with the highest R-values. If no Nash equilibrium exists in a matrix game, the RL algorithm finds an out-of-equilibrium solution [13] that provides a practical alternative. For such games, the greedy action selection approach of the RL algorithm (that prevails after the exploration period ends) drives each participant to choose its highest R-value action. The resulting action combination of the participants and the corresponding payoffs constitute the out-of-equilibrium solution for the game.

A. Computational Issues

Computational difficulties can arise if the action space for the bidders (as presented in equation (1)) is too large. This can be due to a high number of paths available for FTR bidding and a high level of discretization of the bid parameters. Large action spaces increase the size of the matrix game and the number of times the FTR allocation model (described by equations (2)-(6)) is solved. It may be noted that the RL algorithm has a computational complexity that depends on the number of bidders $N$, the number of iterations $T$, and the maximum number of bid choices any player possesses, $|F^n_{max}|$, and is given by $O(N\,T\,|F^n_{max}|)$.

Fig. 1. FTR Bidders in a 3-Bus Power Network

TABLE I. NETWORK AND BID VALUES

IV. NUMERICAL EXAMPLE

In order to demonstrate the matrix game theoretic approach to obtain best-response bidding strategies for an FTR market, we consider a sample power network similar to the one studied in [3]. By varying the network parameters such as contingencies and LMP differences between the nodes, we created sixteen different network scenarios for which best-response FTR bidding strategies are obtained.
Since the continuous bid parameters (obligation price, option price, quantity, and type mix) are discretized in the matrix game formulation, the effect of the extent of discretization is examined. Thereafter, we study the impact of individual bid parameters of the bidders under the assumption that the other bidders choose their actions uniformly from the available sets. Finally, we investigate the impact of the network parameters on the best-response FTR bidding strategies through an analysis of variance (ANOVA) via a $2^4$ factorial experiment.

A. The Sample Network

We considered a sample network consisting of three buses and four bidders, which is depicted in Figure 1. Bidders 3 and 4 are considered non-strategic, hence only bidders 1 and 2 are considered strategic bidders in the matrix game. The paths between source and sink buses on which the bidders bid are shown in Figure 1, which also indicates the reactance values and flow limits of each line.

B. Best-Response Bidding Strategies for Different Network Scenarios

Four key network related parameters that were considered in this study are contingency (c), $\Delta LMP$s (l), variances of the $\Delta LMP$ estimates (v), and the risk coefficient (r). Sixteen different network scenarios were created by varying each of the four network parameters at two levels. The parameters l, v, and r (which could be varied for both strategic bidders) were varied only for bidder 2. In order to simplify the numerical exposition, we considered the obligation and the option price bids to be identical, which reduced the size of the bid vector from four to three dimensions. We note, however, that an obligation FTR may become a liability, whereas an option FTR does not carry such a risk, and hence the bid prices could be different. Our model is general and accommodates this characteristic.
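The sixteen scenarios of the two-level, four-factor design can be enumerated as in the Python sketch below. It assumes the usual factorial labeling convention, in which a scenario is named by the factors set to their high level (for example vr or clvr) and "(1)" denotes the run with all factors at the low level. The function name `factorial_scenarios` is hypothetical, and the actual low and high values of each factor are those of Table I, which are not reproduced here.

```python
from itertools import product

# Factor symbols used in the text: contingency, dLMP, variance, risk coefficient.
FACTORS = ("c", "l", "v", "r")

def factorial_scenarios(factors=FACTORS):
    """Enumerate the 2^k runs of a two-level full factorial design.

    Returns (label, levels) pairs, where levels maps each factor to 0 (low)
    or 1 (high) and the label lists the factors that are at the high level.
    """
    scenarios = []
    for levels in product((0, 1), repeat=len(factors)):
        label = "".join(f for f, high in zip(factors, levels) if high) or "(1)"
        scenarios.append((label, dict(zip(factors, levels))))
    return scenarios

scenarios = factorial_scenarios()
labels = [label for label, _ in scenarios]
print(len(scenarios), labels)
```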
For each of the sixteen scenarios, the possible number of bid choices of the two players was kept constant at 125, with five levels of discretization for each of price, quantity, and the type mix. Table I shows the values of the network and the bid parameters. For each scenario, the payoff matrices were constructed and the value iteration based learning algorithm was implemented. The network scenarios and the corresponding pure best-response strategy as obtained by the RL algorithm are presented in Table II. As indicated in the last column of Table II, in ten out of the thirteen scenarios having pure strategy Nash equilibria, the RL algorithm converged to a Nash equilibrium point. Among the multiple Nash equilibria that exist for scenarios vr and clvr, the strategies that the RL algorithm converged to have higher payoffs for both bidders compared to the other Nash equilibrium points. In the other three of these thirteen scenarios (with 'No' in the last column), the RL algorithm converged to best-response (non-NE) strategies yielding higher payoffs for both of the bidders compared to the NE payoffs. For these scenarios, Table III shows a comparison of the payoffs from the Nash equilibrium strategies and the corresponding best-response strategies obtained by the RL algorithm. The remaining three scenarios (with a '-' in the last column) do not have a pure strategy Nash equilibrium. The RL algorithm converged to best-response strategies with a high payoff distribution for the bidders.

TABLE II. BEST-RESPONSE BIDDING STRATEGIES FOR SIXTEEN NETWORK SCENARIOS

TABLE III. BEST-RESPONSE STRATEGIES WITH HIGHER PAYOFFS THAN NASH EQUILIBRIUM

C. Impact of Bid Parameter Discretization

As discussed earlier, discretization of the bid parameters is essential to formulating the non-cooperative behavior of the bidders as a matrix game. A finer discretization of the continuous parameters is required to minimize the deviation from the actual problem scenario. At the same time, finer discretization of the parameters of a multidimensional bid vector expands the action space, which increases the dimensions of the payoff matrices and the resulting computational requirements. In order to expose the significance of discretization, we studied the impact of price parameter discretization on the best-response bidding strategies. Five different levels of discretization of the price parameter (3, 5, 10, 15, 20) were considered while the discretization of the quantity and type mix parameters was kept constant at 5 levels each. This resulted in payoff matrix sizes varying from 75 × 75 (3 × 5 × 5) to 500 × 500 (20 × 5 × 5). The best-response payoffs of the players are given in Table IV. As evident from the payoffs, the best-response strategies varied quite significantly with the level of discretization. It also appears that with finer price discretization the payoffs of the bidders increased. This is due to the fact that the algorithm always looks for a solution with high values, and as discretization increases, the algorithm has more candidates to choose from.

TABLE IV. IMPACT OF BID PARAMETER DISCRETIZATION

D. Impact of Bid Parameter Variations

The solution of a matrix game is a resultant of the parameter values of the participants' bid vectors. Though it is difficult, it is desirable to extract insight into the impact of each individual bid parameter on the payoffs of the best-response strategy. Therefore, we conducted an experiment in which the impact of each bid parameter was graphically analyzed as follows. We acknowledge that the observations made in this section have problem specific interpretations with some potential for generalization. In the experiment, the network parameter values were maintained as follows.
For bidder 1: ∆LM P = $20, variance = 0.2, risk coefficient = 0.003, and for bidder 2: ∆LM P = $10.5, variance = 0.2, risk coefficient = 0.002. Maximum quantity (Q) was considered to be 300, and the network was assumed to have no contingency. The price factor of bidders 1 and 2 were varied in ten steps between 0.1 and 0.95 in steps of 0.1. Figure 2 shows the impact of price variations by bidder 2 on bidder 1 payoffs. The payoffs of bidder 1, as plotted, were averaged over all possible combinations (80 × 80) of quantity and type mix parameters of the two bidders, where each bidder has 10 × 8 possible bid choices. For all bidder 1 price factor values up to 0.7, the payoff was zero. For bid price factor beyond 0.7, bidder 1’s payoffs were identical for all bid price factors less than or equal to 0.7 by bidder 2. Hence, only the bid price factor scenarios with both bids greater than or equal to 0.7 are critical as shown in Figure 2. As bidder 2 changes its price factor, the optimal price bid for bidder 1 also changes. For example, as bidder 2 changes price factor from 0.7 to 0.8, the optimal price bid for bidder 1 changes from 0.8 to 0.9. Similarly, Figure 3 shows the impact of price bid variations of bidder 1 on the bidder 2’s payoffs (utility). A general conclusion that can be drawn from the above is that a significant interaction exists between the bidder prices in how they impact the bidder utilities. The exact level of interactions will depend on the network parameter values. Analyses, similar to that of price, were also conducted with quantity and type mix parameters. The results from the investigation of the quantity parameter are presented in Figures 4 and 5. For both bidders, the quantity effect appears to be somewhat identical. The bidder payoffs increase with increase in the quantity bid, and they level off after 0.5 for bidder 1 and 0.7 for bidder 2 irrespective of the competitor’s bid. 
This indicates that for the given problem parameters, the quantity 6 Fig. 2. Price Effect on Bidder 1’s Average Utility Fig. 3. Price Effect on Bidder 2’s Average Utility bid should be kept at the maximum possible value. However, it was our conjecture that in the presence of high values of variance and/or risk coefficient, the choice of the quantity parameter could become strategic. To test this conjecture, we studied the sample network under a new scenario with the following network parameters. For bidder 1: ∆LM P = $20, variance = 0.2, risk coefficient = 0.003, and for bidder 2: ∆LM P = $13, variance = 2, risk coefficient = 0.01. The strategic impact of bidder 2’s quantity bid on her payoff, which starts to decline beyond a certain value of quantity bid, is shown in Figure 6. This is in clear contrast to the higher the better behavior seen earlier. We can state a general conclusion that FTR quantity could be a significant parameter and should be considered in the bidding process. Fig. 5. Quantity Effect on Bidder 2’s Average Utility Fig. 6. Strategic Impact of Quantity Parameter with the choice of higher values of the type mix factor (i.e., higher proportion of obligation). Table V depicts, for a sample scenario, how the total FTR allocation as well as its obligation and option components change for bidder 2, as the bidder varies its type mix bid. This supports the trend observed in Figure 8, since bidder 2 wins the most FTR when the type mix factor is set at zero (i.e., all option), and the FTR allocation decreases as more obligations are added to the mix. We conclude that type mix parameter could play a significant role in a multi-bidder FTR settlement process and thus should be adequately investigated. Fig. 7. Fig. 4. Type Mix Effect on Bidder 1’s Average Utility Quantity Effect on Bidder 1’s Average Utility E. 
Impact of the Network Parameter Variations The results of the investigation on the impact of type mix parameter on the bidder payoffs are given in Figures 7 and 8. It appears from Figure 7 that bidder 1’s payoff is not affected by its choice of the type mix parameter, and is only minimally affected by the choice of bidder 2’s type mix parameter. On the other hand, bidder 2s payoff is completely independent of bidder 1’s strategy, as evident from the overlapping curves in Figure 8. Bidder 2 suffers a significant decrease in utility The impact of the network parameters on the best-response payoffs of the bidders was studied through an analysis of variance (ANOVA) via a 4-factor designed experiment. The factors, their levels, and the sixteen (24 ) experiments were presented in Table I and II. Two sets of ANOVA were performed using payoffs of bidder 1 and bidder 2 (given in Table II) as experimental outcomes. Since each outcome is a single replicate, normal probability plots of the factor and 7 TABLE VII ANOVA WITH B IDDER 1’ S PAYOFFS Fig. 8. Type Mix Effect on Bidder 2’s Average Utility TABLE V I MPACT OF TYPE MIX PARAMETER interaction effects were constructed to obtain error sum of square (SS) estimates. The ANOVA results are given in Tables VI and VII. It appears from Table VI that bidder 2’s payoff is affected by all four of the factors and is insensitive to any of the factor interactions. Among the significant factors, the ∆LM P appears to be the most critical with a p-value of 0.0001. Table VII shows that, for the given network, bidder 1’s payoff is affected only by the ∆LM P estimate of bidder 2 and the contingency in the network. As expected, variance and risk coefficient parameters of bidder 2 (which are the other two factors considered in the experiment) have no significant impact on the payoff of bidder 1. V. 
I MPLEMENTATION S TEPS In this section we briefly outline the steps that a market participant needs to take in determining his/her FTR bidding strategy for a particular bidding period using the methodology presented in this paper. TABLE VI ANOVA WITH B IDDER 2’ S PAYOFFS Obtain a forecast for the LMPs in the network using a methodology from those cited in the Introduction section ([5], [6], [7]). • Based on the LMP forecasts, develop a set of alternative FTR bidding strategy vectors using different combinations of 1) the network paths, 2) discrete values of prices within the acceptable price range, 3) discrete values of quantities within the feasible range based on the network conditions, and 4) the parameter indicating the proportion of options and obligations. • Develop anticipatory bids for all other market participants. • For each FTR bid combination of the market participants, obtain the FTR allocations and the market clearing prices by solving the optimization problem (Equations 2 through 6)presented in Section II. • Determine the risk constrained profits for each participant for all FTR bid combinations. • Construct the game matrices using the profits and then solve the game with the RL algorithm (in Section III) to find the set of best-response bids for the participants. If the ISO or a market designer is using the methodology, the only change in the above steps will be that all the participants’ bids will be anticipatory. • VI. C ONCLUSIONS Financial transmission right is considered an important mechanism for power market participants to hedge against price uncertainties resulting from transmission congestion. Since the introduction of the framework for FTR allocation in [14], research dealing with modeling of FTR market behavior has been limited. In this paper, a game theoretic model for examining noncooperative bidding strategies for acquiring FTRs in a deregulated power market is presented. 
The matrix game theoretic model presents a significant departure from the commonly used bi-level optimization approach found in the literature, and it allows consideration of multidimensional bids as well as bidding on multiple FTR paths. A value iteration based RL algorithm is used as a solution tool for the matrix game model. A sample power network is used to demonstrate the matrix game model. Sixteen different numerical scenarios are constructed from the sample network for which best-response FTR bidding solutions are presented. The quality of the best-response solutions in terms of their Nash property and bidder payoffs is discussed. It is observed that in 10 out of the 13 network scenarios for which pure strategy Nash equilibrium solutions exist, the best-response strategies coincide with the highest value NE solutions.

Additional experiments were also conducted to study the impact of bid parameters on best-response solutions. The numerical results show that price is an important factor and could significantly alter the FTR allocation outcome. The quantity bid is a function of risk and variance parameters of the network. When risk and variance are low, the quantity bid parameter becomes nonstrategic and all bidders select the highest possible amount. The proportion of obligation and option may have significant impact on the payoffs of the bidders, and hence should be considered while bidding. The statistically designed 2-level factorial experiment provided an ideal means for investigating impacts of four different network related parameters (contingency, ΔLMP, variance of ΔLMP estimates, and risk coefficient of the bidders) on the market outcome. The results show that all four factors significantly impact FTR settlement, but their interactions were not significant. It was found that some contingencies in the network can create favorable bidding positions for some of the bidders.
The results indicate that an accurate consideration of the network parameters is crucial in determining an effective bidding strategy. We believe that the model and the solution approach presented here will help market participants to better evaluate their FTR bidding strategies, and thus aid the FTR market in reaching a best-response state, reducing uncertainty for the participants.
As an extension of the model presented here, we are currently developing a model for obtaining joint bidding strategies for FTR and energy markets. We formulate the problem as a two-tier matrix game [15]. In this model we do not assume that the LMPs are known (via forecasts) to the FTR bidders. Instead, LMPs are obtained from the energy market settlement, which is impacted by the bidding behavior in the FTR market.
REFERENCES
[1] W. Hogan, “Financial transmission rights formulations,” tech. rep., Harvard Electricity Policy Group, 2002.
[2] T. Joskow and J. Tirole, “Transmission rights and market power on electric power markets,” RAND Journal of Economics, vol. 31, no. 3, pp. 450–487, 2000.
[3] T. Li and M. Shahidehpour, “Risk-constrained FTR bidding strategy in transmission markets,” IEEE Transactions on Power Systems, vol. 20, no. 2, pp. 1014–1021, 2005.
[4] “PJM financial transmission rights FAQs,” http://www.pjm.com/faqs/ftrmarket/ftr-ftr.aspx.
[5] M. Shahidehpour, H. Yamin, and Z. Li, Market Operations in Electric Power Systems. 2002.
[6] Y. Hong and C. Hsiao, “Locational marginal price forecasting in deregulated electric markets using a recurrent neural network,” in Power Eng. Soc. Winter Meeting, 2001.
[7] J. Contreras, R. Espinola, F. Nogales, and A. Conejo, “ARIMA models to predict next-day electricity prices,” IEEE Trans. Power Syst., vol. 18, pp. 1014–1020, 2003.
[8] R. O’Neill, U. Helman, B. Hobbs, W. Stewart, and M. Rothkopf, “A joint energy and transmission rights auction: Proposal and properties,” IEEE Transactions on Power Systems, vol. 17, no. 4, pp. 1058–1067, 2002.
[9] V. Nanduri and T. K. Das, “A reinforcement learning approach to obtaining the Nash equilibrium of multi-player matrix games,” IIE Transactions on Operations Engineering, vol. 41, no. 2, pp. 158–167, 2009.
[10] V. Nanduri, T. K. Das, and P. Rocha, “Generation capacity expansion in energy markets using a two-level game-theoretic model,” IEEE Transactions on Power Systems, vol. 24, no. 3, pp. 1165–1172, 2009.
[11] T. K. Das, A. Gosavi, S. Mahadevan, and N. Marchalleck, “Solving semi-Markov decision problems using average reward reinforcement learning,” Management Science, vol. 45, no. 4, 1999.
[12] A. Gosavi, N. Bandla, and T. K. Das, “A reinforcement learning approach to airline seat allocation for multiple fare classes with overbooking,” IIE Transactions, Special issue on advances on large-scale optimization for logistics, production and manufacturing systems, 2002.
[13] W. B. Arthur, “Out-of-equilibrium economics and agent-based modeling,” in Handbook of Computational Economics, Vol. 2: Agent-Based Computational Economics (K. Judd and L. Tesfatsion, eds.), North Holland: Elsevier, 2005.
[14] W. Hogan, “Competitive electricity market design: A wholesale primer,” tech. rep., Harvard Electricity Policy Group, 1998.
[15] C. Babayigit, P. Rocha, and T. K. Das, “A two-tier matrix game approach for obtaining joint bidding strategies in FTR and energy markets,” To appear in IEEE Transactions on Power Systems, 2010.
Tapas K. Das is a Professor of Industrial and Management Systems Engineering and an Associate Provost for Policy Analysis, Planning, and Performance at the University of South Florida. His research interest is in applied stochastic optimization for decision making problems involving single and multiple (noncooperative) players in a variety of interdisciplinary fields including policy making in deregulated power markets and containment and mitigation of large scale pandemic outbreaks.
He is also involved in developing bedside decision making tools for better disease diagnosis and treatment planning with specific applications to cancer care. He currently directs an NSF-funded GK-12 project aimed at infusing engineering and science into the K-12 curriculum. Dr. Das is a Fellow of the Institute of Industrial Engineers (IIE), a member of INFORMS and IEEE, and Chair of the ENRE Section of INFORMS.
Patricio Rocha is a Ph.D. candidate in Industrial and Management Systems Engineering at the University of South Florida (USF), Tampa. He received the Master's degree in Industrial Engineering in 2007 from USF. His current research interests include energy and environmental policies, in particular the effects of emission control policies on the energy market. He is a student member of INFORMS and IEEE, and served as the president of the INFORMS student chapter at USF.
Cihan Babayiǧit received his M.S. in Industrial and Manufacturing Systems Engineering in 2003 from Ohio University and a Ph.D. in Industrial Engineering in 2008 from the University of South Florida, Tampa, Florida. His research interest is in the field of stochastic game theoretic modeling and analysis of deregulated electricity markets. He currently serves as a Senior Statistical Analyst at Revenue Management Solutions, LLC, 777 South Harbour Island Boulevard Suite 890 Tampa, FL 33602, USA.