
Predicting ForeX values using Reservoirs
Bas van Stein
LIACS s0800279
University of Leiden
[email protected]
Tom Groentjes
LIACS s0231347
University of Leiden
[email protected]
December 7, 2011
Abstract

In this paper we present the results of a reservoir computing algorithm [1] that we implemented in Python [2] using the Oger toolbox [3]. We tested the algorithm with different parameters and data sets, and we show and explain the results.
1 Introduction

For the assignment we had to implement a reservoir with the Oger toolbox in Python. As a starting point we used an example program from the toolbox, the Mackey-Glass example. Reservoirs are a kind of artificial neural network that can learn time series; they are very good at learning signals with a rather fixed period.
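To make this concrete, the following minimal sketch shows the standard leaky-integrator state update that a node such as Oger's LeakyReservoirNode is built around. The weight matrices, network size and leak rate below are toy values chosen for illustration; they are not taken from our program.

import numpy as np

def leaky_reservoir_step(x, u, W, W_in, leak_rate):
    # One step of the standard leaky-integrator reservoir update: the new
    # state blends the old state with a nonlinear function of the recurrent
    # drive plus the scaled input.
    return (1 - leak_rate) * x + leak_rate * np.tanh(np.dot(W, x) + np.dot(W_in, u))

# Toy example: a 5-neuron reservoir driven by a 1-dimensional input signal.
rng = np.random.RandomState(0)
W = rng.uniform(-0.5, 0.5, (5, 5))     # fixed random recurrent weights
W_in = rng.uniform(-0.1, 0.1, (5, 1))  # fixed random input weights
x = np.zeros(5)
for u in [1.0, 0.8, 0.6]:
    x = leaky_reservoir_step(x, np.array([u]), W, W_in, leak_rate=0.4)

Only the readout on top of these states is trained; the reservoir weights stay fixed.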
2 Implementation

The first big challenge was to understand what the Python example was doing and where and how it obtained its data. After we understood the example program, we began creating our FX data set. We took the EUR/USD set from the Forex website with the time step set at one minute and 2000 points per set. Because we wanted sets larger than 2000 points, we merged a few sets with adjacent time slices. We also tried a day and an hour as time step, but then we were not able to get much more than 2000 points, with the disadvantage that some gaps occurred in the time series.

We pruned the data by deleting the time information, preserving only the closing values. We finally chose to feed the algorithm a time series of 4000 points. The data ranges from 6-15-2011 to 6-18-2011, and we used a test set of 400 points that comes directly after the time slice of the training set.
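As an illustration, the pruning step could look like the sketch below. The raw file name and the position of the closing value in the export are assumptions made for this example; this is not our exact preprocessing code.

import csv

# Keep only the closing values from the raw Forex export, dropping the
# date/time columns. The raw file name and column position are assumed.
with open('EURUSD-raw.csv', 'r') as raw, open('2011-testset.csv', 'w') as out:
    writer = csv.writer(out)
    for row in csv.reader(raw):
        writer.writerow([row[-1]])  # assume the closing price is the last column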
The results then looked reasonable, although far from optimal. We also tried to tune many of the algorithm's variables, such as the grid-search parameters and the size of the reservoir; however, the grid-search optimization takes some time, so we could not try too large values.
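Concretely, the grid search in our program only sweeps the ridge parameter of the readout over a logarithmic range. The small sketch below reproduces that grid (using numpy.arange in place of the scipy.arange call in the appendix) to show how many candidate values a single search already has to evaluate.

import numpy as np

# The same grid as in the appendix: logarithmically spaced candidates
# from 1e-5 up to about 6.3e2, in steps of 10**0.3.
ridge_grid = 10 ** np.arange(-5, 3, .3)
print(len(ridge_grid))  # 27 candidate ridge parameters per grid search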
We also tried to feed the reservoir with data from time series from July 2008, 2009 and 2010 and then run the test on 2011, but that did not seem to work since the values differ too much from each other.

Our program can be run with the command below in the terminal, provided that the data sets are in the same folder:

> python FXreservoirBT.py
3 Results
[Figure 1: Results. (a) Mackey-Glass data set; (b) 4000 minute data with 50 nodes; (c) 4000 minute data with 500 nodes.]

[Figure 2: Results, continued. (a) 4000 day data with 500 nodes.]
We also ran some tests in which we altered the leak parameter and the grid-search parameters, but the results were all so poor that we did not include them in this report.
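For reference, altering the leak parameter simply means constructing the reservoir node with a different leak_rate. The sweep below is an illustrative sketch; the other constructor arguments match the ones used in the program in the appendix.

import Oger

# Illustrative sweep over the leak parameter; each value yields a reservoir
# whose state reacts faster (high leak rate) or slower (low leak rate) to
# the input. The remaining arguments are the ones from our program.
for leak_rate in [0.1, 0.4, 0.8]:
    reservoir = Oger.nodes.LeakyReservoirNode(output_dim=500,
                                              leak_rate=leak_rate,
                                              input_scaling=.1,
                                              bias_scaling=.2,
                                              reset_states=False)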
4 Conclusion

We can conclude that the reservoir algorithm we used is not very suitable for stock predictions. The number of nodes in the network did not seem to matter much for the results; the amount of data and the size of the time steps, on the other hand, are very important for the outcome. When using one-minute time steps the algorithm cannot really detect any periodic behavior. The program performed best on our daily test set from 2011: the prediction roughly follows the actual target signal, but only as an approximation of the global direction of the target.
5 Appendix

In this section you will find the code of our reservoir.
import Oger
import pylab
import scipy
import csv
import sys
import mdp
import random


def forex_euro(sample_len=2000, test=0):
    '''
    forex_euro(sample_len=2000, test=0) -> samples

    Get the Forex Euro/Dollar time-series. Parameters are:
    - sample_len: length of the time-series in timesteps. Default is 2000.
    - test: 0 returns the training sample plus two noisy copies,
      1 returns the test sample.
    '''
    samples = []
    # NOTE: as written, both branches read the same CSV file.
    if test > 0:
        f = open('2011-testset.csv', 'r')
    if test < 1:
        f = open('2011-testset.csv', 'r')
    teller = 1   # row counter ("teller" is Dutch for counter)
    teller2 = 0  # index into the output arrays
    inp0 = mdp.numx.zeros((sample_len, 1))
    inp1 = mdp.numx.zeros((sample_len, 1))
    inp2 = mdp.numx.zeros((sample_len, 1))
    inp3 = mdp.numx.zeros((sample_len, 1))
    try:
        reader = csv.reader(f)
        for row in reader:
            if teller > 0 and teller < sample_len + 1:
                inp0[teller2] = float(row[0])
                if test < 1:
                    # For training, add two copies of the signal with a
                    # little noise to obtain extra training samples.
                    #inp0[teller2] = (float(row[0]) - 0.72) * 10
                    inp1[teller2] = inp0[teller2] + (random.random() - 0.5) * 0.001
                    inp2[teller2] = inp0[teller2] + (random.random() - 0.5) * 0.0001
                    #inp3[teller2] = float(row[0])
                teller2 = teller2 + 1
            teller = teller + 1
        samples.append([inp0])
        if test < 1:
            samples.append([inp1])
            samples.append([inp2])
            #samples.append([inp3])
    finally:
        f.close()
    return samples


if __name__ == "__main__":
    freerun_steps = 700
    begin_training = 0
    begin_test = 0
    training_sample_length = 4000
    test_sample_length = 4700
    n_training_samples = 3

    print 'Creating training set...'
    train_signals = forex_euro(sample_len=training_sample_length, test=0)
    print 'done.'
    print 'Creating testing set...'
    #train_signals = Oger.datasets.mackey_glass(sample_len=training_sample_length, n_samples=n_training_samples)
    #test_signals = Oger.datasets.mackey_glass(sample_len=test_sample_length, n_samples=1)
    test_signals = forex_euro(sample_len=test_sample_length, test=1)
    print 'done.'

    print 'Creating reservoir...'
    reservoir = Oger.nodes.LeakyReservoirNode(output_dim=500, leak_rate=0.4,
                                              input_scaling=.1, bias_scaling=.2,
                                              reset_states=False)
    readout = Oger.nodes.RidgeRegressionNode()
    # Discard the first 400 timesteps before training the readout.
    Oger.utils.enable_washout(Oger.nodes.RidgeRegressionNode, 400)
    print 'done.'

    #readout.ridge_param = 0.0125892541179 # 316.227766017
    flow = Oger.nodes.FreerunFlow([reservoir, readout], freerun_steps=freerun_steps)

    gridsearch_parameters = {readout: {'ridge_param': 10 ** scipy.arange(-5, 3, .3)}}

    # Instantiate an optimizer; the loss is the NRMSE over the freerun
    # part of the training signal.
    loss_function = Oger.utils.timeslice(range(training_sample_length - freerun_steps,
                                               training_sample_length),
                                         Oger.utils.nrmse)
    opt = Oger.evaluation.Optimizer(gridsearch_parameters, loss_function)

    print 'optimizing...'
    # Do the grid search
    opt.grid_search([[], train_signals], flow,
                    cross_validate_function=Oger.evaluation.leave_one_out)
    print 'grid search optimizing'

    # Get the optimal flow and run cross-validation with it
    opt_flow = flow #opt.get_optimal_flow(verbose=True)

    print 'Freerun on test_signals signal with the optimal flow...'
    opt_flow.train([[], train_signals])
    freerun_output = opt_flow.execute(test_signals[0][0])

    pylab.plot(scipy.concatenate((test_signals[0][0][-2 * freerun_steps:])))
    pylab.plot(scipy.concatenate((freerun_output[-2 * freerun_steps:])))
    pylab.xlabel('Timestep')
    pylab.legend(['Target signal', 'Predicted signal'])
    pylab.axvline(pylab.xlim()[1] - freerun_steps + 1,
                  pylab.ylim()[0], pylab.ylim()[1], color='r')
    print opt_flow[1].ridge_param
    pylab.show()
References
[1] Mantas Lukoševičius, Herbert Jaeger: Reservoir Computing Approaches to Recurrent Neural Network Training. Computer Science Review, 2009, 127-149.

[2] Python Programming Language, 2009. Available from: http://www.python.org

[3] Reservoir Computing - Oger toolbox, 2011. Available from: http://organic.elis.ugent.be/printpdf/book/export/html/265