Progress In Electromagnetics Research Symposium Proceedings, Stockholm, Sweden, Aug. 12-15, 2013 759
Enhanced Spectrum Planning in Cognitive System Based on
Reinforcement Learning
R. Urban, E. Hutova, and D. Nespor
Department of Theoretical and Experimental Electrical Engineering
Brno University of Technology, Technicka 12, Brno 612 00, Czech Republic
Abstract— This paper presents preliminary results of interference-free spectrum planning
performed by reinforcement learning. Frequency planning for wireless services is very
difficult in the currently overfilled spectrum, since it is nearly impossible to find a spectral hole
that is available worldwide for a new wireless service. A possible solution is open dynamic spectrum
access, which could be implemented as a part of cognitive radio. Moreover, modern wireless
standards such as LTE-A partly implement cognitive radio improvements, e.g., carrier aggregation,
which enables the use of unused parts of the frequency spectrum to decrease interference and
increase data throughput. It can be realised as either an intra-band or an inter-band solution.
Using measured data of the spectrum situation in various environments, we prepared
a best-case channel switching scheme for LTE-A and WI-FI systems, which is based on reinforcement
learning and minimizes interference with primary users represented by the measured data. Using this
technique, we are able to obtain a very low misdetection probability and a large variety in channel
switching.
1. INTRODUCTION
The currently accepted approach to spectrum allocation is very inefficient. Frequency spectrum
is allocated according to global frequency plans for large geographical areas. This method
blocks some parts of the spectrum over large areas. On the other hand, some parts of the spectrum
can be overutilized when several base stations are aggregated at the same place. Given
these facts, deploying a new worldwide wireless service is now nearly impossible because the
frequency spectrum plans are completely full. One possible solution is
dynamic spectrum sharing or resource management, which could be
operated by a cognitive system [1].
Cognitive radio is an intelligent autonomous system capable of adapting to the current location
and time [1, 2]. It can change transmitter parameters such as the operating frequency
and radiated power, which are crucial parameters for dynamic spectrum sharing. These systems
use white spaces in the frequency spectrum (unused bandwidth) or parts of the
spectrum that are rarely used in time, so-called grey spaces. To find these sharing opportunities,
it is necessary to sense the environment and be aware of spectrum holes. Real-time sensing, however,
costs a lot of energy and processing time. For this reason, radio environment maps and spectrum usage
models based on off-line spectrum surveys [3] should be used instead of sensing [4, 5]. Spectrum
data are managed by a cognitive engine that controls the available resources. In a cognitive system,
we define primary users (wireless devices using the spectrum according to the assigned licence)
and secondary or cognitive users (new entrants to the shared frequency spectrum whose traffic is
coordinated by the cognitive engine). In this paper, we focus on LTE and WI-FI cognitive
users, to whom the cognitive engine dynamically assigns channels to maximize SINR and minimize
interference caused by or to other systems. Spectrum sensing is a crucial
feature for the correct operation of cognitive radio, and real-time wideband spectrum sensing is still
one of the main challenges for system developers. Therefore, cognitive radio is currently fully
operational only in limited bands [6], while large-scale studies rely on simulations.
This paper introduces a new approach to finding the best available channel for secondary users,
which is based on reinforcement learning from the spectrum survey data introduced in [3]. The
reinforcement learning algorithm returns a vector of channel scores based on the possible interference
in each channel. Based on this statistical information, we are able to plan the frequency allocation
of secondary users and minimize the chance of interference between primary and secondary users.
Needless to say, better spectrum planning also reduces the radiated power and thereby
the microwave smog that affects biological systems. The effects of excessive electromagnetic
radiation are widely discussed in [4].
2. SYSTEM DEFINITION
In the simulation, we deploy LTE-A and WI-FI services over the real measured data partly
presented in [3]. We assume a distribution of primary users based on a 24-hour measurement
sample of the frequency spectrum background from an urban area. The measurement was performed
in the frequency band from 800 MHz up to 2800 MHz, which covers the most common LTE bands
and also the WI-FI band. The simulation process has several steps. Firstly, the measured data
are processed and potential primary users are marked as an “interference possibility”, i.e., an occupied
part of the spectrum. Secondly, the reinforcement algorithm (see below) is periodically applied to
obtain current scores for each channel of the specific services working under LTE and WI-FI technology
(see below). The channel with the lowest score is chosen and used until another channel obtains a lower
score. Finally, interference counts for the selected channel combination are calculated as a comparison
parameter.
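The first step above (marking occupied parts of the spectrum) can be illustrated with a minimal sketch. The paper does not state its detection rule, so the fixed power threshold, the function names and the `bins_per_channel` grouping are all assumptions made here for illustration only.

```python
from typing import List, Sequence


def mark_occupied(power_dbm: Sequence[Sequence[float]],
                  threshold_dbm: float) -> List[List[bool]]:
    """Flag each frequency bin of each sweep as occupied (True) or free.

    Assumption: a bin counts as an "interference possibility" when its
    measured power exceeds a fixed threshold. The paper does not give
    its actual criterion.
    """
    return [[p > threshold_dbm for p in sweep] for sweep in power_dbm]


def channel_occupancy(bin_flags: Sequence[bool],
                      bins_per_channel: int) -> List[bool]:
    """Collapse per-bin flags of one sweep into per-channel flags.

    Assumption: a channel is occupied if any bin inside it is occupied.
    Leftover bins that do not fill a whole channel are ignored.
    """
    n_channels = len(bin_flags) // bins_per_channel
    return [
        any(bin_flags[c * bins_per_channel:(c + 1) * bins_per_channel])
        for c in range(n_channels)
    ]


# One sweep with four bins, grouped into two channels of two bins each.
flags = mark_occupied([[-100.0, -60.0, -95.0, -99.0]], threshold_dbm=-90.0)
channels = channel_occupancy(flags[0], bins_per_channel=2)
```

In this toy sweep only the second bin exceeds the threshold, so the first channel is marked occupied and the second one free.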
2.1. Long Term Evolution (LTE)
Long Term Evolution (LTE) [7] is a modern standard for cellular telecommunications. It was
designed to provide 100 Mbit/s in the downlink and 50 Mbit/s in the uplink for mobile user equipment
(UE). It also supports MIMO (4 × 4). It commonly uses 64QAM modulation for the downlink and QPSK
modulation for the uplink stream. The next step in the evolution of cellular
networks is LTE-A, a true 4G system, which is capable of 10 times higher transmission
speeds. For our simulations we chose a 3 MHz channel bandwidth.
2.2. WI-FI
One of the most common wireless standards worldwide is WI-FI [8]. It was first introduced
in 1985 and nowadays it is included in nearly all mobile devices (smart phones, cameras,
cars, TVs, etc.). This technology mainly uses OFDM in the shared open ISM
(Industrial, Scientific and Medical) bands at 2.4 GHz and 5 GHz. The massive spread of
WI-FI systems has overfilled the shared spectrum, and it is very difficult to find
an “interference-free” channel.
2.3. Reinforcement Learning
Reinforcement learning is a machine learning technique [9] that provides a fast estimate
of channel behaviour. The crucial parameter of reinforcement learning is the weight function
$W_t$, which is defined as:
$$W_t = C\,W_{t-1} + I(t, t-1), \qquad (1)$$
where $W_{t-1}$ is the value of the weight function in the previous time step, $C$ is a constant (in
this paper $C = 1$) and $I$ is the increment of the weight function for the current time step with
a memory of one time step.
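With C = 1, as used in this paper, Eq. (1) can be unrolled to make the accumulation explicit. The initial score $W_0$ is a symbol introduced here for illustration; the paper does not name it.

```latex
W_t = W_{t-1} + I(t, t-1) = W_0 + \sum_{k=1}^{t} I(k, k-1)
```

In other words, the score at time t is the initial score plus every increment collected so far, and no past increment is discounted.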
In general, the scores of reinforcement learning can be set in many ways. We decided to
use a dominant punishment algorithm together with a memory in which the channel situation from
the previous time step (t − 1) is stored. If we detect a repeated interference possibility, the score
is increased by 1000, whereas a single interference count adds 100 to the total channel score. Finally,
when the channel is empty in two consecutive time steps, we decrease the score by 1. This logic
minimizes the interference possibility. Unfortunately, it also limits how quickly channel scores can improve.
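The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function and constant names are hypothetical, and the zero increment for a channel that was occupied at t − 1 but is free at t is an assumption, since the text does not cover that case.

```python
from typing import List, Sequence

# Increment values quoted in the text.
REPEATED_PENALTY = 1000  # channel occupied at t and at t-1
SINGLE_PENALTY = 100     # channel occupied at t only
REWARD = -1              # channel empty at t and at t-1
C = 1                    # constant from Eq. (1)


def increment(occupied_now: bool, occupied_prev: bool) -> int:
    """Return I(t, t-1) for a single channel."""
    if occupied_now and occupied_prev:
        return REPEATED_PENALTY
    if occupied_now:
        return SINGLE_PENALTY
    if not occupied_prev:
        return REWARD
    # Assumption: occupied at t-1 but free at t leaves the score unchanged.
    return 0


def update_scores(scores: Sequence[int],
                  occupied_now: Sequence[bool],
                  occupied_prev: Sequence[bool]) -> List[int]:
    """Apply W_t = C * W_{t-1} + I(t, t-1) to every channel."""
    return [
        C * w + increment(now, prev)
        for w, now, prev in zip(scores, occupied_now, occupied_prev)
    ]


# Four channels covering all four occupancy combinations.
new_scores = update_scores(
    [0, 0, 0, 0],
    occupied_now=[True, True, False, False],
    occupied_prev=[True, False, False, True],
)
```

Because a penalty is 100 to 1000 times larger than a reward, a channel needs a long interference-free run to recover after one detection, which is the slow improvement noted above.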
Figure 1: Simulation flowchart.
3. SIMULATION RESULTS
Intelligent spectrum utilization is a great opportunity to increase the performance
of a wireless system. Each part of our simulation (see Fig. 1) is described below.
Firstly, the input parameters such as learning duration, learning repetition, etc. need to be
selected. Afterwards, the band selection is crucial. In this paper we are limited to intra-band
channel switching, but in further work we would like to extend the scope
to inter-band switching as well. Learning duration defines how many measured samples are used by
the reinforcement learning algorithm. Learning repetition, on the other hand, describes
how long the selected “best” channel is used. After setting the parameters of the simulation, we are
able to start the initial learning. Based on the reinforcement learning (Section 2.3), we obtain the initial
[Figure 2 panels: (a) After 1st iteration, LTE band #3, DOWN, channel BW: 3 MHz; (b) After last iteration, LTE band #3, DOWN, channel BW: 3 MHz. Axes: Channel [#] vs. Score [-].]
Figure 2: Reward/punishment score after the 1st iteration and the last iteration, respectively, for one LTE band.
[Figure 3 panels: (a) After 1st iteration, WIFI, channel BW: 21 MHz; (b) After last iteration, WIFI, channel BW: 21 MHz. Axes: Channel [#] (1 to 13) vs. Score [-].]
Figure 3: Reward/punishment score after the 1st iteration and the last iteration, respectively, for WI-FI service.
[Figure 4 legend: LTE band #3, UP; LTE band #3, DOWN; LTE band #1, UP; LTE band #1, DOWN. Axes: Time [hours] vs. Misdetection probability [%].]
Figure 4: The best channel interference probabilities for selected LTE bands.
score table (Fig. 2(a)) for all channels in the selected band. In the second step, we choose the channel
with the lowest score. When several channels have equal scores, we choose the lowest channel. In
the following example we start with channel #12 for the LTE service and channel #8 for WI-FI (Fig. 3).
We use the selected channel until the next learning sequence. The channel preference changes
according to the updated scores from the weight function (1). The final scores after all possible learning
sequences for the current data set are presented in Fig. 2(b) and Fig. 3(b).
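The selection rule described above (lowest score wins, ties go to the lowest channel number) can be sketched as follows. The function name and the 1-based channel numbering are choices made here for illustration and are not taken from the authors' code.

```python
from typing import Sequence


def select_channel(scores: Sequence[float]) -> int:
    """Return the 1-based number of the channel with the lowest score.

    Ties are resolved in favour of the lowest channel number, which the
    (score, index) sort key makes explicit.
    """
    if not scores:
        raise ValueError("at least one channel score is required")
    best_index = min(range(len(scores)), key=lambda i: (scores[i], i))
    return best_index + 1


# Channels 2 and 3 share the lowest score, so channel 2 is selected.
chosen = select_channel([300, 100, 100, 1000])
```

The chosen channel would then be held until the next learning sequence produces a new score vector.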
Finally, the interference count is calculated as the number of detections of primary user radiation
(in both the frequency and the time domain) in the selected channel and is expressed as a
misdetection probability. The results are presented in Fig. 4 and Fig. 5.
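One plausible way to turn the interference count into a percentage is sketched below. The paper does not give the exact normalisation, so dividing by the number of time samples is an assumption, and the function name and data layout are hypothetical.

```python
from typing import Sequence


def misdetection_probability(occupancy: Sequence[Sequence[bool]],
                             selected: Sequence[int]) -> float:
    """Return the share of time samples, in percent, with interference.

    occupancy[t][c] is True when a primary user was detected in channel c
    (0-based) at time sample t; selected[t] is the channel in use at t.
    Assumption: the probability is the interference count divided by the
    number of time samples.
    """
    if len(occupancy) != len(selected):
        raise ValueError("occupancy and selected must have equal length")
    if not occupancy:
        return 0.0
    hits = sum(1 for row, ch in zip(occupancy, selected) if row[ch])
    return 100.0 * hits / len(occupancy)


# Four time samples, two channels; the selected channel is hit once.
occupancy = [
    [False, True],
    [False, False],
    [True, False],
    [False, False],
]
probability = misdetection_probability(occupancy, selected=[0, 0, 0, 1])
```

Here one of four samples collides with a primary user, giving 25 %.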
[Figure 5 axes: Time [hours] vs. Misdetection probability [%].]
Figure 5: The best channel for WI-FI technology.
4. CONCLUSION
The first results of enhanced spectrum planning using reinforcement learning were presented. Based
on the measured results, algorithms for spectrum diagnosis were prepared and presented. It was
shown that this technique enables nearly interference-free channel planning in various systems.
It takes into account changes in the environment in both the time domain and the frequency domain.
Preliminary results indicate that the proposed technique nearly eliminates interference in narrow
LTE bands (less than 0.5%). The misdetection probability is less than 0.2% for the LTE service
and less than 2% for the WI-FI service. Intra- and inter-channel aggregation provides additional
bandwidth for fast data transfers. The same system was also tested for the WI-FI band, where
a sufficiently low misdetection probability was obtained in various environments.
Further work will focus not only on the physical layer; more factors will be taken into
account, such as data transmission, carrier aggregation for LTE systems, channel aggregation for
WI-FI service and environment simulation via game theory.
REFERENCES
1. Mitola, J., Cognitive Radio Architecture: The Engineering Foundations of Radio XML, Wiley-Interscience, Hoboken, NJ, 2006.
2. Mitola, J., “Cognitive radio architecture evolution,” Proceedings of the IEEE, Vol. 97, 626–641,
2009.
3. Urban, R., T. Korinek, and P. Pechac, “Broadband spectrum survey measurements for cognitive radio applications,” Radioengineering, Vol. 21, 2012.
4. Urban, R., P. Fiala, T. Kriz, and J. Mikulka, “Stochastic description of wireless channel for
cognitive radio,” PIERS Proceedings, 38–42, Taipei, Taiwan, March 25–28, 2013.
5. Urban, R., T. Kriz, and M. Cap, “Indoor broadband spectrum survey measurements for the
improvement of wireless systems,” PIERS Proceedings, 376–280, Taipei, Taiwan, 2013.
6. “Cognitive radio experimentation platform,” 2011, Available: http://asgard.lab.es.aau.dk/joomla/index.php/home.
7. Kumar, A., J. Sengupta, and Y.-F. Liu, “3GPP LTE: The future of mobile broadband,” Wirel.
Pers. Commun., Vol. 62, 671–686, 2012.
8. IEEE, “Part 11: Wireless LAN medium access control (MAC) and physical layer (PHY)
specifications,” Amendment 4: Further Higher Data Rate Extension in the 2.4 GHz Band, Ed.:
The Institute of Electrical and Electronics Engineers, Inc., 2003.
9. Bublin, M., J. Pan, I. Kambourov, and P. Slanina, “Distributed spectrum sharing by reinforcement and game theory,” Fifth Karlsruhe Workshop on Software Radio, Karlsruhe, Germany,
2008.