Game Theoretic Approaches for Spectrum Sharing in Cognitive Radio Networks: A Survey

Manoj C∗ and Amrita Mishra†
Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, Uttar Pradesh, India.
Email: ∗ [email protected], † [email protected]

Abstract—Cognitive radio is gaining popularity and acceptance all over the world as an efficient way to utilise the limited wireless spectrum resources. Among the many design requirements of cognitive radio, such as spectrum sensing, spectrum allocation and power control, spectrum sharing is one of the main challenges in ensuring peaceful coexistence of the primary and secondary users. Since multiple users compete to obtain the maximum spectrum resources for themselves, game theory is an efficient framework for designing robust, stable and scalable spectrum sharing schemes. In this paper, we discuss how the concepts of game theory can be exploited to design spectrum sharing protocols. Game theoretic models such as the Cournot model, the Bertrand model and the auction-based approach, which model the spectrum sharing scenario from different perspectives, are also presented. Further, the paper examines the scenario in which users competing for spectrum resources may have no incentive to cooperate and may even exchange false information about the channel conditions to gain more access to the spectrum. Cheat-proof strategies are thus developed to maintain the efficiency of spectrum usage. Finally, this paper presents a comparative study of the various models, along with research challenges and future directions in game theoretic modelling.

Index Terms—Cognitive Radio, Spectrum Sharing, Game Theory.

I. INTRODUCTION

Cognitive radio is gaining acceptance worldwide as the next breakthrough technology in wireless services, enabling efficient utilisation of the available spectrum and thus faster and more reliable communication [1], [2].
It differs from conventional wireless networks in that the users must intelligently and dynamically change their operating parameters as their immediate surroundings change. This ability of the radio transceiver enables the frequency spectrum to be shared among primary (licensed) and secondary (unlicensed) users. Among the various design requirements of a cognitive radio setup, dynamic spectrum sharing among the users is an important aspect. It requires that the performance of the primary user should not degrade due to the opportunistic and selfish behaviour of the secondary users. The final objective is to design a robust spectrum sharing scheme that ensures peaceful functioning of the primary as well as the secondary users.

Users in a cognitive radio network are intelligent: they observe and learn in order to enhance their performance. Traditional spectrum sharing policies assume that all users cooperate unconditionally in a static environment. However, in a cognitive radio scenario, if the users work towards different goals, e.g., competing for an open unlicensed band, fully cooperative behaviour cannot be taken for granted. Instead, users will cooperate with others only if cooperation brings them more benefit. The need for users to change and adapt is further reinforced by the various changes in the radio environment. This intelligent behaviour of the users compels researchers to use game theory to analyse the cognitive interaction processes [3]. Game theory is a mathematical framework used to analyse the strategic decisions made by multiple decision makers. Different game models (e.g., non-cooperative/cooperative, static/dynamic, and complete/incomplete information games) have been used to model and study user behaviour in different scenarios [4].
The common aim of these models is to improve network performance measures such as throughput, resource consumption and QoS (Quality of Service), given the self-interest of the participating users.

There are many advantages to studying cognitive radio networks in a game theoretic framework. First, by modelling dynamic spectrum sharing among network users (primary and secondary users) as games, the users' behaviours and actions can be analysed in a formalized game structure, and the theoretical advancements in game theory can be used to derive upper bounds. Second, the optimization of spectrum usage is generally a multi-objective optimization problem, which is very difficult to analyse and solve. Game theory provides well-defined equilibrium criteria to measure game optimality under various game settings. Third, non-cooperative game theory, one of the most important branches of game theory, enables us to derive efficient approaches for dynamic spectrum sharing using only local information. Such approaches become highly desirable when centralized control is not available [5].

The paper is organised as follows. We discuss the basic concepts of game theory in Section II. We then present the various game theoretic models for spectrum sharing along with their simulation results: the Cournot model, the auction-based model, the Bertrand model and the model incorporating cheat-proof strategies in Sections III, IV, V and VI, respectively. Finally, we conclude in Section VII by putting forth the research challenges and future directions in game theoretic modelling, along with a comparison of the various models for spectrum sharing.

II. BASICS OF GAME THEORY

Game theory is a bag of analytical tools designed to help us understand the phenomena that we observe when decision-makers interact [6]. It provides a mathematical model that describes the interaction between agents who each try to maximize their own payoff.
A game is a description of strategic interaction that includes the constraints on the actions that the players can take and the players' interests, but does not specify the actions that the players do take. The three basic components of a game are the set of players, the set of actions available to each player and the set of preferences of each player.

A. Terminologies

The players are the decision makers in the game. In a cognitive radio scenario, the players are the wireless nodes. The set of actions is the set of alternatives available to each player. At any instant, the player must choose an element from this set. In general, the set of actions can differ from player to player. In cognitive radio, the actions can be the choice of modulation scheme, coding rate, protocol, flow control parameter, transmit power level, or any other factor that is under the control of the node [7]. When each player chooses an action, the resulting "action profile" determines the outcome of the game. We also assume that the player, when presented with any pair of actions, either knows which of the two he/she prefers or regards both actions as equally desirable [8].

Usually, preferences are specified by defining a utility function. A higher value of the utility function means that the outcome is more desirable than an outcome with a lower value. The utility is also called the payoff of the user. Game theory assumes that the players are rational. This means that each player tries to maximize his/her own payoff, without regard for the payoffs of the other players.

B. Strategic form game and Nash Equilibrium

A strategic game is a model of interactive decision-making in which each decision-maker chooses his/her plan of action once and for all. Also, these choices are made simultaneously. The Nash equilibrium is a joint strategy where no player can increase his/her utility by unilaterally deviating.
At a Nash equilibrium, each player's strategy is a best response to the strategies of all the other players. The best response of a player is the action that gives the maximum payoff, given that the strategies of all other players remain unchanged. The mathematical definition of the Nash equilibrium is given below.

A game consists of a finite set of players $\mathcal{N} = \{1, 2, \ldots, N\}$. Each player $i \in \mathcal{N}$ selects a strategy $s_i \in S_i$ with the objective of maximizing the utility $u_i$. The strategy profile $s$ is the vector containing the strategies of all players, $s = (s_i)_{i \in \mathcal{N}} = (s_1, s_2, \ldots, s_N)$. The collective strategies of all players except player $i$ are denoted by $s_{-i}$. A strategy profile $s \in S$ is a Nash equilibrium if

$$u_i(s) \ge u_i(s_i', s_{-i}) \quad \forall s_i' \in S_i,\ \forall i \in \mathcal{N}.$$

The Nash equilibrium corresponds to a steady state. If, whenever the game is played, the action profile is the same Nash equilibrium, then no player has a reason to choose any action different from the current one; there is no incentive for the player to change.

C. Cooperative and Non-Cooperative Games

A cooperative game is one where players form groups or coalitions, and these coalitions enforce cooperative behaviour. Here, the game is a competition between coalitions of players, rather than between individual players. A non-cooperative game is one in which players make decisions independently, without coordinating with other players, and each player has its own objectives. In this case, the players see only their own payoff and do not consider the payoff of others or of the whole system. Thus, while they may be able to cooperate, any cooperation must be self-enforcing.

In a cognitive radio framework, each user usually makes its own decisions (possibly relying also on information collected from other users). These decisions may be governed by the rules of the operating protocol, but ultimately each user has some freedom in setting parameters or changing its own mode of operation.
These users are autonomous agents, taking their own decisions about transmit power, packet forwarding, etc. The users can exhibit three kinds of behaviour:
1) Users may work towards the overall good of the entire network community as a whole.
2) In some cases, the same users may behave selfishly, looking out only for their own interests.
3) Finally, users may behave maliciously, seeking to ruin the network performance of other users.
Game theory can be applied in all three cases.

III. COURNOT MODEL OF SPECTRUM SHARING

A. What is a Cournot game

The spectrum sharing problem is formulated as an oligopoly market [9]. An oligopoly market is one in which a few firms compete with each other in terms of the amount of commodity supplied to the market in order to maximise their profit. In the spectrum sharing context, the secondary users (SUs) are analogous to the firms, competing for the spectrum offered by the primary user (PU). The cost of the spectrum is determined by a pricing function set by the PU. A Cournot game is used to analyse this situation, and the Nash equilibrium (NE) is considered as the solution of this game. The main objective of this Cournot game formulation is to maximize the profit of all SUs based on the equilibrium adopted by all SUs.

B. System Model

We consider a wireless system with one PU and $N$ SUs who want to share the spectrum allocated to the PU. The PU shares a portion $b_i$ of the spectrum with secondary user $i$. The primary user charges the secondary users for the spectrum at a rate of $c(b)$ per unit bandwidth, where $b$ is the amount of available bandwidth that can be shared. The SUs transmit in the allocated spectrum using adaptive modulation to enhance their transmission performance. The revenue of secondary user $i$ is denoted by $r_i$ per unit of achievable transmission rate.
The spectral efficiency of the transmission for secondary user $i$ can be obtained from

$$k_i = \log_2(1 + K\gamma_i) \tag{1}$$

where

$$K = \frac{1.5}{\ln\left(0.2/\mathrm{BER}_i^{tar}\right)} \tag{2}$$

Here $\gamma_i$ is the received SNR and $\mathrm{BER}_i^{tar}$ is the target bit error rate of SU $i$. We assume that the received SNR information is available at the transmitting end through channel estimation.

C. Spectrum Sharing Scheme

We discuss the static and the dynamic Cournot game. The static game models an ideal case in which every SU can observe the strategies and payoffs of all other SUs. The dynamic game is a more practical version in which the information of the other SUs is not known to a particular SU. The SU observes the change in payoff due to the different prices charged by the PU and adapts its strategy accordingly.

1) Static Cournot Game: The players (i.e., the firms in the oligopoly market) in this game are the SUs. The strategy of each player is the allocated spectrum size (denoted by $b_i$ for SU $i$), which is non-negative. The payoff of each player is the profit (i.e., revenue minus cost) of secondary user $i$ (denoted by $p_i$). The pricing function used by the PU for charging is given by

$$c(B) = x + y\Big(\sum_j b_j\Big)^{\tau} \tag{3}$$

where $x$ and $y$ are non-negative constants, $\tau \ge 1$ (so that this pricing function is convex), and $B$ denotes the set of strategies of all secondary users (i.e., $B = \{b_1, \ldots, b_N\}$). Let $w$ denote the worth of the spectrum for the PU. The condition $c(B) > w \times \sum_j b_j$ is necessary to ensure that the PU is willing to share spectrum of size $b$ with the SUs. The PU charges all of the SUs the same price. The revenue of SU $i$ is $r_i \times k_i \times b_i$, while the cost of spectrum allocation is $b_i c(B)$. The profit of user $i$ is therefore

$$p_i(B) = r_i k_i b_i - b_i c(B) \tag{4}$$

We assume that the guard band used to separate the spectrum allocated to different users is fixed and small. Then, the profit can be rewritten as

$$p_i(B) = r_i k_i b_i - b_i\Big(x + y\Big(\sum_j b_j\Big)^{\tau}\Big) \tag{5}$$

Let $B_{-i} = \{b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_N\}$ denote the set of strategies adopted by all secondary users except user $i$, such that $B = B_{-i} \cup \{b_i\}$. As the optimal allocated spectrum of one user depends on the strategies of all other users, the NE is considered to be the solution of the game, ensuring that all the secondary users are satisfied with the solution. The NE is obtained by using the best response function, which gives the best strategy of one player given the others' strategies [8]. The best response function of secondary user $i$, given the allocated spectrum sizes $b_j$ of the other secondary users, $j \ne i$, is defined as

$$BR_i(B_{-i}) = \arg\max_{b_i}\ p_i(B_{-i} \cup \{b_i\}) \tag{6}$$

The set $B^* = \{b_1^*, \ldots, b_N^*\}$ denotes the Nash equilibrium of this game if and only if

$$b_i^* = BR_i(B_{-i}^*), \quad \forall i \tag{7}$$

where $B_{-i}^*$ denotes the set of best responses of secondary users $j$ for $j \ne i$. We formulate an optimization problem with the objective

$$\text{Minimize: } \sum_{i=1}^{N} \left| b_i - BR_i(B_{-i}) \right| \tag{8}$$

i.e., we want to minimize the difference between the decision variables $b_i$ and the corresponding best responses. The minimum value of the objective function is zero, and it is attained when the algorithm reaches the NE.

2) Dynamic Cournot Game: Here each SU reaches the NE by interacting with the PU only. Each SU communicates with the PU to obtain the price resulting from different strategies. The adjustment of the allocated spectrum size can be modelled as a repeated Cournot game:

$$b_i(t+1) = b_i(t) + \alpha_i b_i(t) \frac{\partial p_i(B)}{\partial b_i(t)} \tag{9}$$

where $b_i(t)$ is the allocated spectrum size at time $t$ and $\alpha_i$ is the speed adjustment parameter (i.e., learning rate) of SU $i$.

D. Simulation Results

We consider a cognitive radio environment with one PU and two SUs sharing a spectrum of 15 MHz. The target BER for both SUs is $\mathrm{BER}_i^{tar} = 10^{-4}$.
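The static and dynamic Cournot games of Section III-C can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: it assumes $\tau = 1$ (so that the best response has a closed form), and the SNR values (11 dB and 12 dB), the learning rate of 0.01 and the starting point are hypothetical choices made only for the example.

```python
import math

def spectral_efficiency(snr_db, ber_target):
    """k = log2(1 + K * gamma) with K = 1.5 / ln(0.2 / BER_tar), as in (1)-(2)."""
    gamma = 10 ** (snr_db / 10)          # SNR converted from dB to linear scale
    K = 1.5 / math.log(0.2 / ber_target)
    return math.log2(1 + K * gamma)

def best_response(i, b, r, k, x, y):
    """Best response (6) for tau = 1: root of d p_i / d b_i = 0, kept non-negative."""
    others = sum(b) - b[i]
    return max(0.0, (r[i] * k[i] - x - y * others) / (2 * y))

def static_nash(r, k, x, y, sweeps=200):
    """Iterate best responses until b_i = BR_i(B_-i) for all i, i.e. condition (7)."""
    b = [1.0] * len(k)                   # arbitrary (assumed) starting point
    for _ in range(sweeps):
        for i in range(len(b)):
            b[i] = best_response(i, b, r, k, x, y)
    return b

def dynamic_cournot(r, k, x, y, alpha, b0, steps):
    """Repeated game (9): b_i(t+1) = b_i(t) + alpha_i * b_i(t) * d p_i / d b_i."""
    b = list(b0)
    for _ in range(steps):
        total = sum(b)
        # For tau = 1, d p_i / d b_i = r_i k_i - x - y * sum_j b_j - y * b_i
        grad = [r[i] * k[i] - x - y * total - y * b[i] for i in range(len(b))]
        b = [b[i] + alpha[i] * b[i] * grad[i] for i in range(len(b))]
    return b

# Example: two SUs, target BER 1e-4, r_i = 10, pricing constants x = 0, y = 1.
k = [spectral_efficiency(11, 1e-4), spectral_efficiency(12, 1e-4)]
b_static = static_nash([10, 10], k, 0.0, 1.0)
b_dynamic = dynamic_cournot([10, 10], k, 0.0, 1.0, [0.01, 0.01], [1.0, 1.0], 5000)
```

In this example the dynamic update settles at the same allocation as the static best-response iteration, and the SU with the better channel obtains the larger share.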
For the pricing function of the PU, we use $x = 0$ and $y = 1$, while $\tau$ is adjusted based on the evaluation scenario (e.g., $\tau = 1.0$), and the worth of the spectrum for the PU is $w = 1$. The revenue of an SU per unit transmission rate is $r_i = 10\ \forall i$. We also assume that the SNR information $\gamma_i$ is available to all SUs through channel estimation [10].

Fig. 2 shows the best responses of both SUs in the static Cournot game. The best response of each SU is a linear function of the other user's strategy. The Nash equilibrium is located at the point where the best responses intersect. We also observe that under different channel qualities, the Nash equilibrium is located at different places. The trajectory of spectrum sharing in the dynamic Cournot game is also shown for the case $\alpha_1 = \alpha_2 = 0.14$. We observe that, with the same speed adjustment parameter, better channel quality results in more fluctuations in the trajectory to the NE.

Fig. 2. Best responses and trajectories of both SUs to the NE.

IV. SPECTRUM SHARING USING AUCTION BASED APPROACH

A. System Model

Let us consider a system with only one primary user (PU) and a group $\mathcal{I} = \{1, \ldots, I\}$ of secondary users (SUs) who want to share the spectrum $B_{tot}$ allocated to the primary user (as shown in Figure 1). The primary user retains a given amount of bandwidth $B_{rem} > B_{req}$, where $B_{req}$ is the bandwidth required to meet its quality of service requirement. The primary user charges secondary users a price of $p$ per unit bandwidth. After the allocation, the secondary users may transmit in the allocated spectrum using adaptive modulation to enhance the transmission performance. The revenue of secondary user $i$ is denoted by $r_i$ per unit of achievable transmission rate. The spectral efficiency of transmission for user $i$ is denoted by $k_i$. We assume that, through channel estimation, the secondary users can obtain the received SNR of the channel. For secondary user $i$, given the received SNR, the target BER $\mathrm{BER}_i^{tar}$ and the assigned spectrum $B_i$, the transmission rate (in bits per second) can be obtained.

Fig. 1. System Model for Spectrum Sharing.

B. Bandwidth Auction

The problem of spectrum sharing is formulated as an auction in which the secondary users (SUs) bid for the bandwidth allocated to the primary user (PU). An auction with relatively simple rules is proposed below to characterize the interaction between the primary user and multiple secondary users.

1) Information: Each SU $i$ knows its revenue $r_i$ per unit of achievable transmission rate, and it also knows its spectral efficiency $k_i$. In a real network, $r_i$ relates to the QoS. The PU announces a positive reserve bid $\beta > 0$ and the price $p > 0$ to all SUs before the auction starts.

2) Bids: SU $i$ submits a bid $b_i$ ($0 \le b_i \le B_{tot}$), which generally represents the maximum bandwidth that the SU desires for data transmission.

3) Allocation: The PU allocates bandwidth according to (10) (here we consider only the FDM scheme):

$$B_i = \frac{b_i}{\sum_{j \in \mathcal{I}} b_j + \beta}\, B_{tot} \tag{10}$$

Once the PU has allocated the bandwidth, there is no contention among the SUs.

4) Payments: SU $i$ pays the PU

$$C_i = p\,\theta_i\, b_i \tag{11}$$

where $\theta_i$ is a user-dependent priority parameter. We adopt a 'prepay' mechanism in which the SU pays for the bandwidth it bids rather than for the bandwidth assigned by the PU. The prepay mechanism is a crucial part of the auction rules, as it prevents the SUs from over-bidding: they pay for their own bids.

A bidding profile is defined as the vector containing the SUs' bids, $b = (b_1, \ldots, b_I)$.
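The allocation rule (10) and the prepay rule (11) can be illustrated with a short sketch. The function names and the numerical values below (bids, reserve bid, price, priority parameters) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def allocate(bids, beta, b_tot):
    """Allocation rule (10): B_i = b_i / (sum_j b_j + beta) * B_tot."""
    denom = sum(bids) + beta
    return [b / denom * b_tot for b in bids]

def payments(bids, price, theta):
    """Prepay rule (11): C_i = p * theta_i * b_i, charged on the bid, not the allocation."""
    return [price * th * b for b, th in zip(bids, theta)]

# Assumed example values: two SUs bidding 2 and 3 units out of B_tot = 10.
bids = [2.0, 3.0]
alloc = allocate(bids, beta=0.2, b_tot=10.0)
paid = payments(bids, price=10.0, theta=[1.0, 1.0])
unallocated = 10.0 - sum(alloc)   # share left over because of the reserve bid beta
```

Because each SU pays for what it bids, raising a bid always raises the payment, while the extra bandwidth obtained shrinks as the total of all bids grows.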
The bidding profile of SU $i$'s opponents is defined as $b_{-i} = (b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_I)$, such that $b = (b_i; b_{-i})$. Under the rules of this auction, $b_i \in [0, B_{tot}]$, and the bidding profile $b$ is constrained by

$$b \in b^R \triangleq \{\, b \mid 0 \le b_i \le B_{tot}\ \ \forall i \in \mathcal{I} \,\} \tag{12}$$

In this auction, a positive reserve bid $\beta$ is used by the PU to control the remaining portion of the spectrum for its own usage. The PU sets $\beta$ such that $\beta \ge B_{req}$ is satisfied. Given the allocated bandwidth, SU $i$'s revenue is given by

$$R_i = r_i k_i B_i \tag{13}$$

SU $i$ chooses the bid $b_i$ that maximises its payoff

$$U_i(b_i; b_{-i}; p) = R_i[B_i(b_i; b_{-i})] - C_i(b_i, p) \tag{14}$$

The desirable outcome of an auction is a Nash equilibrium (NE), which is a bidding profile $b^*$ such that no user wants to deviate from it, i.e.,

$$U_i(b_i^*; b_{-i}^*; p) \ge U_i(b_i; b_{-i}^*; p) \quad \forall i \in \mathcal{I},\ b_i \in [0, B_{tot}] \tag{15}$$

We define SU $i$'s best response as

$$\mathcal{B}(b_{-i}; p) = \Big\{\, b_i \ \Big|\ b_i = \arg\max_{b_i \in [0, B_{tot}]} U_i(b_i; b_{-i}; p) \,\Big\} \tag{16}$$

In general this could be a set. An NE is also a fixed point of the best responses of all the SUs. We state certain properties of the NE, along with a dynamic updating algorithm to reach the NE in a distributed fashion.

Theorem 1: There are two extreme prices $\underline{p}_i$ and $\overline{p}_i$, defined as

$$\underline{p}_i = \frac{r_i k_i B_{tot}\{(I-1)B_{tot} + \beta\}}{\theta_i (I B_{tot} + \beta)^2} \tag{17}$$

$$\overline{p}_i = \frac{r_i k_i B_{tot}}{\theta_i \beta} \tag{18}$$

If $p < \underline{p}_i$ for all $i$, all the SUs would bid for the maximum bandwidth allocated to the PU (i.e., $b_i = B_{tot}\ \forall i \in \mathcal{I}$); if $p > \overline{p}_i$ for all $i$, no SU would be willing to use any of the spectrum offered by the PU (i.e., $b_i = 0\ \forall i \in \mathcal{I}$).

Theorem 2: There is a unique NE for the bids of the SUs. In addition, if $p \in (\underline{p}_i, \overline{p}_i)$, SU $i$'s unique best response function is given by

$$\mathcal{B}(b_{-i}, p) = \left[ \sqrt{\frac{r_i k_i B_{tot}\Big(\sum_{j \ne i} b_j + \beta\Big)}{p\,\theta_i}} - \Big(\sum_{j \ne i} b_j + \beta\Big) \right]_0^{B_{tot}} \tag{19}$$

where $[x]_a^b$ is defined as

$$[x]_a^b = \max\{\min\{x, b\}, a\} \tag{20}$$

Theorem 3: If the unique NE is interior (an interior NE is one in which none of the participating users selects a strategy on the boundary of its strategy space), then the bandwidth allocation is fair.

C. Dynamic Updating Algorithm

In a practical cognitive radio scenario, the SUs may only be able to observe the pricing and assignment information from the primary user (PU), but not the strategies and payoffs of other secondary users. Hence, we also investigate a distributed algorithm with which each SU reaches the Nash equilibrium based on its interaction with the PU only. Here, each SU communicates with the PU to obtain the price and the assignments for different bids, and updates its bid as follows:

$$b_i(t+1) = b_i(t) + \alpha_i b_i(t) \frac{\partial U_i(b)}{\partial b_i(t)} \tag{21}$$

where $b_i(t)$ is the bid (in terms of bandwidth) at time $t$ and $\alpha_i$ is the speed adjustment parameter of SU $i$.

D. Simulation Results

A cognitive radio environment with one PU and two SUs sharing a spectrum of $B_{tot} = 10$ MHz is considered. The target BER for both SUs is $\mathrm{BER}_i^{tar} = 10^{-4}$. The revenue of an SU per unit transmission rate is $r_i = 10\ \forall i \in \mathcal{I}$. We also assume that the SNR information $\gamma_i$ is available to all SUs through channel estimation. The PU sets the price $p = 10$ per unit bandwidth and the reserve bid $\beta = 0.2$ [11].

In Fig. 3, the regions indicated by arrows are those for which the spectrum sharing is stable and the NE is reached; elsewhere the sharing is unstable and fluctuations occur. In Fig. 4, we observe the adaptation of the SUs' bids under different channel qualities. As expected, SU 2 bids for more bandwidth and achieves higher revenue when its channel quality improves. We also observe that the bid of one user depends on the channel quality and the bid of the other user.

Fig. 3. Region of values for stable Nash Equilibrium.

Fig. 4. Nash Equilibrium of bid under different channel qualities.

V. BERTRAND GAME MODEL

An oligopoly market can also be modelled as a Bertrand game, in which the firms compete by setting their prices.
Here at least two sellers producing homogeneous products compete by setting prices simultaneously; buyers buy everything from the firm with the lower price. The solution of this Bertrand game is the Nash equilibrium. Here we apply the Bertrand game model to the problem of competitive spectrum pricing for dynamic spectrum access [12]. In this model, a few primary services compete to offer spectrum to a secondary service.

A. System Model

Consider a wireless environment with $N$ primary services operating on frequency bands $F_i$ and a secondary service with a group of secondary users. Primary service $i$, serving $M_i$ local connections, wants to sell part of the spectrum $F_i$ at price $p_i$ per unit bandwidth to the secondary user. The spectrum demand depends on the data rate achievable in the spectrum and on the price charged. The spectral efficiency of transmission by a secondary user is given by $k = \log_2(1 + K\gamma)$, where

$$K = \frac{1.5}{\ln\left(0.2/\mathrm{BER}^{tar}\right)} \tag{22}$$

where $\gamma$ is the SNR at the receiver and $\mathrm{BER}^{tar}$ is the target bit error rate.

B. Spectrum Pricing Competition

To quantify the spectrum demand, we consider the quadratic utility function given by

$$U(b) = \sum_{i=1}^{N} b_i k_i^{(s)} - \frac{1}{2}\Big(\sum_{i=1}^{N} b_i^2 + 2\nu \sum_{i \ne j} b_i b_j\Big) - \sum_{i=1}^{N} p_i b_i \tag{23}$$

where $b$ is the set of sizes of spectrum shared by all primary services, i.e., $b = \{b_1, \ldots, b_i, \ldots, b_N\}$, $p_i$ is the price offered by primary service $i$, and $k_i^{(s)}$ denotes the spectral efficiency of transmission by a secondary user using the spectrum $F_i$ owned by primary service $i$. The spectrum substitutability parameter $\nu$ represents the ability of the secondary user to switch among the frequencies offered by the primary services: $\nu = 0$ means that the secondary user cannot switch among the frequency spectra, while $\nu = 1$ means that the secondary user can switch among the spectra freely. The demand function $D_i(p)$ for spectrum $F_i$ at the secondary service is obtained by differentiating $U(b)$ with respect to $b_i$ and equating the derivative to zero. It is given by

$$D_i(p) = \frac{(k_i^{(s)} - p_i)(\nu(N-2) + 1) - \nu \sum_{j \ne i} (k_j^{(s)} - p_j)}{(1 - \nu)(\nu(N-1) + 1)} \tag{24}$$

The cost function of the primary service is developed by considering the degradation of QoS of the primary users. The revenue function $R_i$ and the cost function $C_i$ are defined as

$$R_i = c_1 M_i, \qquad C_i(b_i) = c_2 M_i \left( B_i^{req} - k_i^{(p)} \frac{W_i - b_i}{M_i} \right)^2 \tag{25}$$

where $c_1$ and $c_2$ denote the constant weights of the revenue and cost functions respectively, $B_i^{req}$ is the bandwidth requirement of a primary connection, $W_i$ is the size of the spectrum, $M_i$ is the number of primary connections and $k_i^{(p)}$ is the spectral efficiency of wireless transmission for primary service $i$. Based on this model, a Bertrand game is formulated as follows:
Players: Primary services.
Strategy: Price per unit spectrum $p_i$ (non-negative).
Payoff: Profit $P_i$ (revenue minus cost) realized by selling the spectrum to the secondary user.

Based on the spectrum demand, revenue and cost functions, the profit of each primary service is given by

$$P_i(p) = b_i p_i + R_i - C_i(b_i)$$

where $p = \{p_1, \ldots, p_i, \ldots, p_N\}$ is the set of prices offered by all players in the game. The NE is obtained by using the fact that it is the best strategy of each player, given the others' strategies. The best response of primary service $i$, given the prices $p_{-i}$ offered by the other primary services ($p = p_{-i} \cup \{p_i\}$), is defined as

$$\mathcal{B}_i(p_{-i}) = \arg\max_{p_i}\ P_i(p_{-i} \cup \{p_i\}) \tag{26}$$

The set $p^* = \{p_1^*, \ldots, p_N^*\}$ denotes the NE of this game if and only if

$$p_i^* = \mathcal{B}_i(p_{-i}^*), \quad \forall i \tag{27}$$

where $p_{-i}^*$ denotes the best responses of all players except player $i$. We can obtain the NE by solving the equations $\partial P_i(p)/\partial p_i = 0$ for all $i$.

In a cognitive radio setting, a primary service will not be able to observe the profit gained and the strategy adopted by the other primary services. It therefore has to decide its strategy from the observed history.
We therefore use a distributed price adjustment algorithm that progressively reaches the NE. Let $p_i[t]$ denote the price offered by primary service $i$ at time $t$; $p[t]$ and $p_{-i}[t]$ are defined similarly. We consider two cases: the first, in which the strategies of the other primary services in the previous iteration are known to all, and the second, in which they are not observable. In the first case, the price offered by the primary service can be obtained from

$$p_i[t+1] = \mathcal{B}_i(p_{-i}[t]) \quad \forall i \tag{28}$$

In the second case, the primary service has only local information and the spectrum demand. Using this, it adjusts its price in the direction that maximizes its profit, as given by

$$p_i[t+1] = p_i[t] + \alpha_i \frac{\partial P_i(p)}{\partial p_i} \tag{29}$$

where $\alpha_i$ is the learning rate. The first case has no control parameters and is proved to be stable [12] by considering the eigenvalues of the Jacobian matrix. In the second case, the algorithm can be either stable or unstable depending on the learning rate $\alpha_i$, the number of local connections $M_i$ and the spectrum substitutability factor $\nu$.

C. Inefficiency of Nash Equilibrium

The total profit of all primary services is given by $\sum_{j=1}^{N} P_j(p)$. The optimal price for all primary services can be obtained from

$$\frac{\partial \sum_{j=1}^{N} P_j(p)}{\partial p_i} = 0 \tag{30}$$

The optimal values of $p_i$ obtained from this equation differ from those at the NE. So, primary services may cooperate to achieve higher profit. In a repeated game, the game is played multiple times and the users can observe the outcomes of previous games, so they can learn to cooperate. Since the optimal price is not the NE, some primary services may deviate unilaterally to increase their own profit. So, the optimal pricing is not a stable equilibrium. As the optimal pricing is desirable, it can be sustained by a punishment mechanism that punishes any user that deviates from the optimal price.
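The gap between the NE prices and the jointly optimal prices can be made concrete with a small numerical sketch. It uses the demand function (24), the revenue and cost functions (25), and the gradient update (29), applied once to each service's own profit and once to the total profit of (30). This is only an illustration under stated assumptions: the secondary-user spectral efficiencies `K_S`, the substitutability `NU = 0.4` and the primary spectral efficiency `k_p = 1` are hypothetical, the partial derivatives are approximated by central differences, and the helper names are not from the source.

```python
def demand(p, k_s, nu):
    """Spectrum demand D_i(p) from (24)."""
    n = len(p)
    denom = (1 - nu) * (nu * (n - 1) + 1)
    surplus = [k_s[i] - p[i] for i in range(n)]
    total = sum(surplus)
    return [(surplus[i] * (nu * (n - 2) + 1) - nu * (total - surplus[i])) / denom
            for i in range(n)]

def profits(p, k_s, nu, prm):
    """P_i(p) = b_i p_i + R_i - C_i(b_i), with R_i and C_i as in (25) and b_i = D_i(p)."""
    b = demand(p, k_s, nu)
    out = []
    for i in range(len(p)):
        shortfall = prm["b_req"] - prm["k_p"] * (prm["w"] - b[i]) / prm["m"]
        cost = prm["c2"] * prm["m"] * shortfall ** 2
        out.append(b[i] * p[i] + prm["c1"] * prm["m"] - cost)
    return out

def partial(f, p, i, h=1e-6):
    """Central-difference estimate of d f / d p_i."""
    up, dn = list(p), list(p)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)

def ascend(objective_for, p0, alpha, steps):
    """Price update of the form (29): p_i <- p_i + alpha * d objective_i / d p_i."""
    p = list(p0)
    for _ in range(steps):
        p = [p[i] + alpha * partial(objective_for(i), p, i) for i in range(len(p))]
    return p

# Assumed parameters: two primary services, 10 connections and 20 units of spectrum each.
PRM = {"c1": 2.0, "c2": 2.0, "m": 10, "w": 20.0, "b_req": 2.0, "k_p": 1.0}
K_S, NU = [2.5, 2.0], 0.4   # hypothetical secondary spectral efficiencies and nu

def own_profit(i):
    return lambda q: profits(q, K_S, NU, PRM)[i]

def total_profit(_i):
    return lambda q: sum(profits(q, K_S, NU, PRM))

p_ne = ascend(own_profit, [1.0, 1.0], alpha=0.1, steps=300)     # non-cooperative prices
p_opt = ascend(total_profit, [1.0, 1.0], alpha=0.1, steps=300)  # jointly optimal prices
```

In this example the total profit at `p_opt` exceeds the total profit at `p_ne`, which is the inefficiency that motivates the collusion and punishment mechanism discussed here.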
When a user deviates from the optimal pricing, the punishment action is triggered and all the users switch to the NE state, from which no player will deviate. We consider a trigger strategy in which any primary service maintains the collusion as long as the other services agree to do so. But if a primary service deviates, a punishment action is triggered.

A primary service usually gives a smaller weight to the profit in future stages than to the profit in the current stage. If the current profit is $P_i$, the profit in the next stage is worth $\delta_i P_i$, where $\delta_i$ is the weight. Let $P_i^o$, $P_i^n$ and $P_i^d$ denote the profits of primary service $i$ at the optimal price, at the NE price, and when deviating, respectively. The collusion will be maintained if the long-term profit from adopting collusion is higher than that obtained by deviation. Mathematically [12],

$$\frac{1}{1 - \delta_i} P_i^o \ge P_i^d + \frac{\delta_i}{1 - \delta_i} P_i^n \tag{31}$$

A lower bound on $\delta_i$ can be obtained from this as

$$\delta_i \ge \frac{P_i^d - P_i^o}{P_i^d - P_i^n} \tag{32}$$

Collusion will be maintained only when $\delta_i$ satisfies (32).

D. Performance Evaluation

We consider a cognitive radio environment with two primary services and a secondary service. A frequency spectrum of 20 MHz is available to each primary service. The number of local connections at each primary service is 10. The target BER of the secondary service is $\mathrm{BER}^{tar} = 10^{-4}$. The bandwidth requirement of the connections at each primary service is 2 Mbps ($B^{req} = 2$), and $c_1 = c_2 = 2$. The channel quality for the secondary service varies between 9 and 22 dB. The spectrum substitutability factor lies between 0.1 and 0.6. For the dynamic price adaptation algorithms, the initial prices are set as $p_1[0] = p_2[0] = 1$.

If the primary services can observe each other's strategies (Case 1), the price converges to the equilibrium price in a few iterations. But if only the spectrum demand from the secondary service is observable (Case 2), and the price is adjusted based on this information, the speed of convergence depends on the learning rate $\alpha$. An optimum learning rate makes the algorithm converge as fast as in Case 1. But a larger learning rate causes fluctuations in the price adaptation, and the algorithm requires a large number of iterations to converge.

The profit of both primary services at the Nash equilibrium is shown in Fig. 5. When the channel quality of the spectrum offered by primary service one is better, the spectrum demand becomes higher. So, primary service one can increase the price as well as the size of the offered spectrum share to gain higher revenue. When primary service one gains a higher profit due to larger demand, primary service two gains only a lower profit due to smaller demand.

Fig. 5. Profit of each primary service at equilibrium under different channel qualities of the frequency spectrum offered by primary service one.

The spectrum substitutability factor $\nu$ also affects the prices under the different channel qualities. A larger $\nu$ only slightly affects the price offered by primary service one, while the price offered by primary service two decreases at a higher rate for a larger value of $\nu$. A smaller value of $\nu$ lowers the price offered by primary service one, and the rate of decrease in the price offered by service two has to be higher to attract the secondary service. This is required to achieve the highest profit given the channel qualities corresponding to the spectrum offered by primary service one.

VI. CHEAT-PROOF STRATEGIES

In a cognitive radio environment, users competing for the open spectrum may have no incentive to cooperate with each other. They may even exchange false private information, such as channel conditions, to get more access to the spectrum. So, cheat-proof spectrum sharing schemes should be developed to maintain the efficiency of the spectrum usage.
We therefore use mechanism design theory to provide incentives for players to be honest [13]. We also make cheating unprofitable through statistical approaches.

A. System Model

We consider a situation where $K$ pairs of unlicensed users coexist in the same area and compete for an unlicensed spectrum band. Each user communicating with its partner causes interference to the other pairs. At time slot $n$, all pairs try to occupy the spectrum, and the received signal $y_i[n]$ at the $i$-th receiver can be expressed as

$$y_i[n] = \sum_{j=1}^{K} h_{ji}[n]\, x_j[n] + w_i[n], \quad i = 1, 2, \ldots, K \tag{33}$$

where $x_j[n]$ is the transmitted information of the $j$-th pair, $h_{ji}[n]$ ($j = 1, 2, \ldots, K$; $i = 1, 2, \ldots, K$) represents the channel gain from the $j$-th transmitter to the $i$-th receiver, and $w_i[n]$ is the white noise at the $i$-th receiver. The transmission power of the $i$-th user is bounded by $P_i^M$, i.e., $|x_i[n]|^2 \le P_i^M$ at all $n$. Each user is selfish and tries to maximize its own profit. This spectrum sharing game can be modelled as:

Players: $K$ transmitter-receiver pairs.
Strategy: Transmission power $p_i$ of each user, chosen in $[0, P_i^M]$.
Payoff: $R_i(p_1, p_2, \ldots, p_K)$, the gain of transmission achieved by the $i$-th player after the players have chosen the power levels $p_1, p_2, \ldots, p_K$.

The averaged payoff of the $i$-th player is given by

$$R_i(p_1, p_2, \ldots, p_K) = \log_2\left(1 + \frac{p_i |h_{ii}|^2}{N_0 + \sum_{j \ne i} p_j |h_{ji}|^2}\right). \tag{34}$$

First, we consider a one-shot game in which the players consider only the current payoff. It is proved in [13] that the only Nash equilibrium of this game is $(P_1^M, P_2^M, \ldots, P_K^M)$. The payoff at the NE is given by

$$R_i^S(h_{1i}, h_{2i}, \ldots, h_{Ki}) = \log_2\left(1 + \frac{P_i^M |h_{ii}|^2}{N_0 + \sum_{j \ne i} P_j^M |h_{ji}|^2}\right) \tag{35}$$

The superscript 'S' stands for selfish. This is the only possible outcome of a one-shot game. When all users transmit at maximum power, every user suffers strong mutual interference. Spectrum sharing lasts over a long period of time. So, everyone will be better off if they take turns and transmit.
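The gap between the selfish outcome in (35) and turn-taking can be illustrated numerically. The sketch below is only an illustration under stated assumptions and is not the scheme of [13]: it assumes a symmetric setting with unit direct gains and a cross gain `gamma`, and a hypothetical round-robin rule in which each player transmits alone, at maximum power, in 1/K of the slots. The function and variable names are chosen here for illustration.

```python
import math


def selfish_payoffs(p_max, gain_sq, noise):
    """Payoff (35): every player transmits at its maximum power.

    gain_sq[j][i] holds |h_ji|^2, the power gain from transmitter j
    to receiver i.
    """
    k = len(p_max)
    payoffs = []
    for i in range(k):
        interference = sum(p_max[j] * gain_sq[j][i] for j in range(k) if j != i)
        sinr = p_max[i] * gain_sq[i][i] / (noise + interference)
        payoffs.append(math.log2(1.0 + sinr))
    return payoffs


def round_robin_payoffs(p_max, gain_sq, noise):
    """Assumed turn-taking rule: each player transmits alone in 1/K of
    the slots at maximum power and stays silent in the other slots."""
    k = len(p_max)
    return [
        math.log2(1.0 + p_max[i] * gain_sq[i][i] / noise) / k
        for i in range(k)
    ]


def symmetric_gains(k, gamma):
    """Unit direct gains and cross gain gamma (an illustrative assumption)."""
    return [[1.0 if i == j else gamma for i in range(k)] for j in range(k)]


if __name__ == "__main__":
    p_max, noise = [10.0, 10.0], 1.0
    for gamma in (0.01, 0.5):
        gains = symmetric_gains(2, gamma)
        print(
            gamma,
            selfish_payoffs(p_max, gains, noise),
            round_robin_payoffs(p_max, gains, noise),
        )
```

With these illustrative numbers, turn-taking gives the higher payoff when the cross gain is large, while the selfish outcome is better when interference is negligible.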
Such a cooperation must be self-enforced. Now we consider a repeated game which lasts over several turns. The players view these rounds as a whole. The payoff is given by

$$U_i = (1 - \delta) \sum_{n=0}^{+\infty} \delta^n R_i[n] \tag{36}$$

where $R_i[n]$ is the payoff of player $i$ at time slot $n$ and $\delta$ is the discount factor as defined earlier. If all the players follow some predetermined rules to share the spectrum, a higher expected one-slot payoff $R_i^C$ ('C' stands for cooperation, $R_i^C > R_i^S \ \forall i = 1, 2, \ldots, K$) can be achieved. But selfish players can take advantage of others by transmitting in the time slots not allotted to them. This gives a payoff $R_i^D$ ('D' stands for deviation). Cooperation is not a stable equilibrium in the one-shot game, but it can be enforced in a repeated game by the threat of punishment. We denote the discounted payoff with deviation as $U_i^D$ and that without deviation as $U_i^C$. As $\delta \to 1$, $U_i^D$ converges to $R_i^S$ almost surely and $U_i^C$ converges to $R_i^C$ almost surely. Hence, cooperation is sustained only if $U_i^C (= R_i^C) > U_i^D (= R_i^S)$, i.e., all players are self-enforced to cooperate because of the punishment that follows deviation. We use a "punish-and-forgive" strategy where the punishment state lasts only for $T - 1$ time slots and cooperation resumes from the $T$-th time slot. The parameter $T$ can be determined by analyzing the incentive of the players. If the tendency to deviate is stronger, the punishment should also be harsher to prevent deviation.

B. Cooperation with Optimal Detection

We assume that there is a common control channel over which players can exchange information. Based on the information transmitted, the players decide who should transmit in a given slot. Only one player transmits at a time. Each slot is divided into three phases: in the first phase, each player exchanges channel information with the others; in the second phase, players decide whether to access the spectrum or not, according to the cooperation rule; in the third phase, the eligible player transmits data.
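The role of the parameter $T$ in the punish-and-forgive strategy of Section VI-A can be illustrated with a small numerical sketch. It rests on simplifying assumptions that are not taken from [13]: payoffs are deterministic per-slot values, a deviation is detected immediately, and the deviator earns `r_dev` in the deviation slot, `r_selfish` in each of the $T - 1$ punishment slots, and `r_coop` afterwards. Because both paths coincide once cooperation resumes, only the first $T$ slots are compared, and the common factor $(1 - \delta)$ from (36) is dropped. All function names are hypothetical.

```python
def window_utility(payoffs, delta):
    """Discounted sum of a finite payoff sequence: sum of delta**n * payoffs[n]."""
    return sum((delta ** n) * r for n, r in enumerate(payoffs))


def deviation_is_deterred(r_coop, r_dev, r_selfish, delta, t):
    """Compare t slots of cooperation with one deviation slot followed
    by t - 1 punishment slots at the selfish payoff."""
    cooperate = window_utility([r_coop] * t, delta)
    deviate = window_utility([r_dev] + [r_selfish] * (t - 1), delta)
    return cooperate >= deviate


def shortest_punishment(r_coop, r_dev, r_selfish, delta, t_max=1000):
    """Smallest T (punishment lasts T - 1 slots) that makes deviation
    unprofitable, or None if no T up to t_max is sufficient."""
    for t in range(1, t_max + 1):
        if deviation_is_deterred(r_coop, r_dev, r_selfish, delta, t):
            return t
    return None


if __name__ == "__main__":
    # Illustrative payoffs only; they are not results from the paper.
    print(shortest_punishment(r_coop=1.7, r_dev=3.5, r_selfish=1.4, delta=0.9))
    print(shortest_punishment(r_coop=1.7, r_dev=3.0, r_selfish=1.4, delta=0.9))
    print(shortest_punishment(r_coop=1.7, r_dev=3.5, r_selfish=1.4, delta=0.5))
```

In this sketch a larger one-slot gain from deviation requires a longer punishment, and for a small discount factor no finite punishment is enough, which is consistent with the qualitative statements above.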
During the third phase, the eligible player pauses transmission and ‘listens’ to the channel for some time to catch the deviators. If the player finds any other player deviating, the system is alerted and enters punishment mode. We consider two cooperation rules: the maximum total throughput (MTT) criterion, which maximizes the sum of the individual payoffs, and the approximate proportional fairness (APF) criterion, which maximizes their product. The punishment-based spectrum sharing game provides an incentive for players to cooperate, as deviation is deterred by the threat of punishment. Detection of deviating behavior is necessary for the threat to be credible.

C. Cheat-Proof Strategies

The repeated game discussed above inherently assumes that complete and perfect information is available. But information such as the power constraints and channel gains is private to each player. So, selfish players may provide false information to get a higher payoff. Therefore, enforcing truth-telling is a crucial problem.

1) Mechanism-design-based strategy: Mechanism design provides incentives for players to be honest. The players claiming high values are asked to pay a tax, and the amount of the tax increases as the claimed value increases. Some monetary compensation is given to the players reporting low values. The spectrum sharing game thus becomes a new game in which the original payoffs are replaced by overall payoffs that include the monetary transfers. The transfer function can be designed such that the players get the highest payoff only when they claim their true private values. With such a transfer function, the payments and incomes of all players add up to 0 at any time slot. This means that the monetary transfer is exchanged only within the community of cooperative players at any time. This property is suitable for the open spectrum sharing scenario.
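The zero-sum property of the monetary transfers can be illustrated with a small sketch. The construction below is an assumption chosen for illustration and is not the transfer function of [13]: a hypothetical increasing tax `linear_tax` is charged on each claimed value, and every player receives an equal share of the taxes paid by the others. It shows only that high claims end up paying, low claims are compensated, and the transfers sum to zero; it does not by itself establish truth-telling.

```python
def linear_tax(claim, rate=0.5):
    """Hypothetical tax that grows with the claimed value."""
    return rate * claim


def net_transfers(claims, tax=linear_tax):
    """Net monetary transfer of each player (positive means income).

    Each player pays tax(claim) and receives an equal share of the
    taxes paid by the other K - 1 players, so the total is zero.
    """
    k = len(claims)
    if k < 2:
        raise ValueError("at least two players are needed")
    taxes = [tax(c) for c in claims]
    total = sum(taxes)
    return [(total - t) / (k - 1) - t for t in taxes]


if __name__ == "__main__":
    claims = [0.2, 1.0, 3.0]  # illustrative claimed values
    nets = net_transfers(claims)
    print(nets, sum(nets))
```

A player's net transfer is positive exactly when its tax is below the average tax, which mirrors the description above of taxing high claims and compensating low ones.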
2) Statistics-based strategy: For the APF rule, every player reports the normalized channel gain, and the player with the highest reported value gets access to the spectrum. The normalized gains are exponentially distributed with mean 1. So, in the long run, each player will have access to the spectrum in $1/K$ of the total slots. If player $i$ occupies the spectrum for more than $(1/K + \varepsilon)$ of the total time, where $\varepsilon$ is a pre-determined threshold, it is highly probable that the player has cheated. Such a player is marked as a cheater and punished. In this way, the profit of cheating is greatly limited.

D. Simulation Results

We consider a scenario with two players ($K = 2$) with the same maximum power constraint and the same relative interference $\gamma$. The players can gain more by cooperating than by being selfish. But cooperation is unnecessary when the interference is very low ($\gamma \approx 0$). It was observed that the payoff of cooperation is higher than that of non-cooperation for $\gamma > 0.15$.

Now we consider a heterogeneous environment where players have different power constraints. We fix the power constraint of player 1 and increase the power constraint of player 2, $P_2^M$. The payoffs with the MTT and APF rules are shown in Fig. 6, where ‘1’ and ‘2’ refer to the payoffs of player 1 and player 2, respectively. The payoffs without cooperation and the payoffs using the max-min fairness criterion (denoted by “NOC” and “MMF” respectively) are also shown for comparison. It can be seen from the figure that both the MTT and APF rules outperform the non-cooperation case. This means players have an incentive to cooperate under both rules. It was also seen that the payoff is maximized only if the player honestly claims his/her true information. Therefore, players are self-enforced to tell the truth with this mechanism.

Fig. 6. The payoffs under a heterogeneous setting with different cooperation rules.

VII. CONCLUSION

Although game theory has been extensively used to model the interactions between users in a cognitive radio network, it also faces certain challenges. Choosing a proper payoff function does not always result in a simple analysis of the game theoretic model. As cognitive radio networks benefit from technology evolution, the same technologies can also be used by malicious users to launch more complicated and unpredictable attacks. It is therefore wise to use the framework of game theory judiciously.

Dynamic spectrum sharing is one of the key functions of cognitive radio networks. In this paper, we first discussed the basics of game theory. Then we presented and elaborated on the various game theoretic models, namely Cournot, auction-based, Bertrand and cheat-proof, which can be applied to the spectrum sharing scenario. Each model has been treated separately, describing how the problem is formulated, which governing conditions are imposed and, ultimately, which equilibrium is attained. The Cournot model is the most basic way of modelling a spectrum sharing problem, and it concludes that the NE is the most desirable solution. The auction-based model is a novel way of modelling the spectrum sharing problem against an auction theory background. The Bertrand game moves a step ahead and shows the inefficiency of the NE and how to improve upon it. Cheat-proof strategies throw light on mechanism design, which gives the players an incentive to be honest. We have exhaustively analysed and presented the existing game theoretic models for the spectrum sharing scenario.

REFERENCES

[1] S. Haykin. Cognitive radio: brain-empowered wireless communications. Selected Areas in Communications, IEEE Journal on, 23(2):201–220, Feb. 2005.
[2] Ian F. Akyildiz, Won-Yeol Lee, Mehmet C. Vuran, and Shantidev Mohanty. Next generation/dynamic spectrum access/cognitive radio wireless networks: A survey. Computer Networks, 50(13):2127–2159, 2006.
[3] Magnús M. Halldórsson, Joseph Y.
Halpern, Li (Erran) Li, and Vahab S. Mirrokni. On spectrum sharing games. In Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, PODC ’04, pages 107–114, New York, NY, USA, 2004. ACM.
[4] Jane Wei Huang and Vikram Krishnamurthy. Game theoretic issues in cognitive radio systems (invited paper). Journal of Communications, 4(10), November 2009.
[5] Beibei Wang, Yongle Wu, and K.J. Ray Liu. Game theory for cognitive radio networks: An overview. Computer Networks, 54(14):2537–2561, 2010.
[6] Martin J. Osborne and Ariel Rubinstein. A course in game theory. MIT Press, 1994.
[7] Allen B. MacKenzie. Game Theory for Wireless Engineers. Morgan & Claypool Publishers, 2006.
[8] Martin J. Osborne. An introduction to game theory. Oxford University Press, 2003.
[9] Li Yan-bin, Wang Li-feng, and Li Ying. An improved game-theoretic spectrum sharing algorithm in cognitive radio networks. In Computer Research and Development (ICCRD), 2011 3rd International Conference on, volume 2, pages 499–503, March 2011.
[10] D. Niyato and E. Hossain. A game-theoretic approach to competitive spectrum sharing in cognitive radio networks. In Wireless Communications and Networking Conference, 2007. WCNC 2007. IEEE, pages 16–20, March 2007.
[11] Xinbing Wang, Zheng Li, Pengchao Xu, Youyun Xu, Xinbo Gao, and Hsiao-Hwa Chen. Spectrum sharing in cognitive radio networks – an auction-based approach. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 40(3):587–596, June 2010.
[12] D. Niyato and E. Hossain. Competitive pricing for spectrum sharing in cognitive radio networks: Dynamic game, inefficiency of Nash equilibrium, and collusion. Selected Areas in Communications, IEEE Journal on, 26(1):192–202, Jan. 2008.
[13] Yongle Wu, B. Wang, K.J.R. Liu, and T.C. Clancy. Repeated open spectrum sharing game with cheat-proof strategies. Wireless Communications, IEEE Transactions on, 8(4):1922–1933, April 2009.