Can social media and the options market predict the stock
market behavior?
Patrick Houlihan, Germán G. Creamer
Stevens Institute of Technology
{phouliha, gcreamer} @stevens.edu
___________________________________________________________
Abstract
This paper evaluates whether sentiment extracted from social media and the options market is correlated with
future asset prices. Through a series of rigorous simulations using both text-based data and a call-put ratio
derived from market data, covering July 2009 to September 2012, this research shows that: 1) features derived from
market data and a call-put ratio improve model performance; 2) including features derived from a unique
dictionary further improves model performance; and 3) a particular machine-learning algorithm outperforms a
variety of algorithms. This research suggests model performance is superior when including behavior-related
features from both market data and social media.
"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body
of data." -- John Tukey
“Prediction is very difficult, especially about the future.” -- Niels Bohr
Keywords: Social Media, Investors Sentiment, Behavioral Finance, Machine Learning
___________________________________________________________
I. INTRODUCTION
In today’s society, much human interaction takes place online, through blogs, emails and chat
boards, to name a few. Social media such as Twitter have gained mass popularity and serve as a
medium for communicating in a few sentences. Because microblog posts such as tweets are to
the point, on topic and concise (Twitter posts are limited to 140 characters), microblogs are a
prime candidate for extracting sentiment for use in predictive analytics, as shown by
Bermingham et al. (2010). Furthermore, Schapire et al. (2000) showed that text can be classified
as negative, positive or neutral using machine learning algorithms with robust features. The
main question is how to translate text analytics features into appropriate sentiments that capture
market trends.
Gruhl et al. (2005) showed that blogs and other online chat media are a precursor to
‘real-world’ behavior, and that the volume of postings related to various products on Amazon’s
website is highly correlated with actual purchase decisions. This was one of the first studies to
validate social media as a mechanism for predictive analytics. Pang et al. (2008) provided
further support for social media as a viable source of data for predictive analytics, noting that
people are inclined to share their opinions with complete strangers over social media. Features
extracted from social media messages have proven robust for a variety of different labels. Asur
et al. (2010) used the volume of Twitter messages (tweets) related to a specific movie prior to its
release as a feature and showed a positive correlation with box-office revenue. Choi et al. (2012)
showed that Google search query volume is a strong predictor of future economic activity in
various industries, further reinforcing the internet as a source of robust predictive data and
behavior patterns.
Related to the capital markets, Bollen et al. (2010) extracted the mood state of a large number of
Twitter users and obtained highly predictive signals for directional moves in the Dow Jones
Industrial Average: sentiment and mood were shown to predict market directional moves 2 days
out with 87.6% accuracy. In addition, Houlihan and Creamer (2014) used message volume and
sentiment from StockTwits messages as features and showed that these features help explain the
diffusion of price information and future directional moves. Sentiment is one of the features
used in this research.
Another way to capture market sentiment is through the options market. Anthony (1988)
showed that increased trading in call options leads to next-day gains in the underlying stocks
that experienced a spike in call volume the day before. This result warrants using call option
volume as a feature in a model that predicts a label such as future directional moves.
Cao et al. (2003) find that option volume imbalances, specifically short-term out-of-the-money
call option volumes, are predictors of pending takeovers. This finding points to a somewhat
inefficient market, one in which only informed traders have access to insider information prior to
an announcement. However, this inefficiency can be leveraged as an indicator in a model that
attempts to predict a label such as the next-day directional move. Billingsley et al. (1988) showed
that one such indicator, the put-call ratio, yields abnormal gains when used in a trading strategy.
The put-call ratio (PCR) is simply the total daily put volume (behavior) divided by the daily call
volume (behavior) for a particular equity. Intuitively, a ratio below 1.0 is a bullish indicator,
whereas a ratio above 1.0 is a bearish indicator. However, Bandopadhyaya et al. (2011) and
Billingsley et al. (1988) show that a ratio of 0.7 is a better threshold. Additionally, the PCR seems
to be more of a contrarian indicator than a conformist one; in fact, several other indicators are
contrarian in nature, including short-term interest and the VIX. The PCR is thought to capture
short-term sentiment about the future move of a particular stock or index. Hu (2013) shows that
imbalances between option volume and underlying volume predict future stock returns. Pan et
al. (2006) also show that volume from specific trader types contains information about future
prices. Their study had access to a unique data set that showed new buyer volume (behavior),
broken out by trader type. Unique put-call ratios were derived for each trader type. The data
(1990-2001) were analyzed using a univariate regression, where the independent variables are
the corresponding put-call ratios and the dependent variable is the next-day risk-adjusted return.
The results showed that stocks with low put-call ratios, derived from a particular trader type
(full-service), outperformed stocks with high put-call ratios by 40 basis points (bps) on the next
day and by 1% over the following week. The premise here is that informed, full-service investors
trading the underlying stock in lieu of index options have firm-specific news rather than
market-wide news. Also, stocks that went through periods of higher breadth (advancing issues
relative to declining issues) rewarded investors with abnormal returns of 2.92% in 6 months and
4.95% in a 12-month period, as shown by Chen et al. (2002). These studies are in stark contrast
to index options, where Han (2004) shows the opposite effect. Market-wide and firm-wide
sentiment seem to be contrarian and conformist in nature, respectively. In addition, Houlihan
and Creamer (2014) formulated trader-specific call-put ratios based on option contract volume
and determined that specific traders have superior information relative to other traders, as shown
by the higher Sharpe ratios obtained with specific trader call-put ratios.
The injection of news into the marketplace, in conjunction with the behavior of various traders in
the options market, helps explain both the volatility and the evolution of asset prices. This
research combines social media sentiment and investors’ sentiment captured through the call-put
ratio with several predictive models to forecast market price direction.
II. DATA
All data, including price data and micro-blogging messages, were drawn from the period between
July 2009 and September 2012. Additionally, time series were formed for all the various features
and labels to create a matrix for each stock used in the analysis.
Social Media:
- Roughly 4.1 million messages were provided by StockTwits (see note 1), a social media platform
  for the financial community consisting of 230,000 active members who can discuss and
  exchange trading ideas. StockTwits users follow the convention of prefixing tickers (CashTags)
  with a $, e.g. $ALTR, when discussing specific assets in messages, allowing for a simple
  regex match.
  o The messages were provided in JSON (see note 2) format. Aside from the raw count of
    messages (both aggregate and ticker-specific), twenty-eight fields are provided;
    for the purposes of this paper, the focus was on the following fields:
      body - the message text
      created_at - datetime stamp of when the message was posted
      symbols - list of tickers mentioned in the message, i.e. cashtags
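The cashtag convention described above can be illustrated with a short sketch. The regular expression (a $ followed by one to five letters) and the sample message are assumptions made for illustration; the exact pattern and parsing code used in the study are not given.

```python
import json
import re

# Assumed cashtag pattern: "$" followed by 1-5 letters. The paper only says
# that a simple regex match was used, not which expression.
CASHTAG = re.compile(r"\$([A-Za-z]{1,5})\b")


def extract_cashtags(body):
    """Return the upper-cased tickers mentioned in a message body, in order."""
    return [ticker.upper() for ticker in CASHTAG.findall(body)]


def parse_message(raw):
    """Parse one JSON-encoded message and keep only the fields used here."""
    msg = json.loads(raw)
    return {
        "body": msg["body"],
        "created_at": msg["created_at"],
        "tickers": extract_cashtags(msg["body"]),
    }


# Hypothetical message, for illustration only.
raw = '{"body": "Watching $ALTR and $aapl into earnings", "created_at": "2012-01-03T14:30:00Z"}'
parsed = parse_message(raw)
print(parsed["tickers"])
```

Dollar amounts such as $100 are not matched because the pattern requires letters after the $ sign.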
Market Data:
- Asset price data used is from the University of Chicago’s Center for Research in Security
  Prices (CRSP) database.
- Also used is a unique dataset provided by ISE Holdings (see note 3), which consists of firm-wide
  daily option volume data broken out by various trader types, and their respective
  percentage breakdown (table I, below):
  o Customer - Option trade volume for traders acting on behalf of discount and full-service
    customers.
  o Broker Dealer - Option trade volume for traders acting on behalf of institutional
    clients.
  o Proprietary - Option trade volume for proprietary traders acting on behalf of their
    own firm.
  o Professional - Option trade volume for Non-Registered Broker Dealer traders
    whose daily average is at least 390 trades (high frequency traders).
Table I. Trader Composition
This table shows each trader type's average put and call volume (option contracts) for all four order types. In addition, we break the data out
further into percentages to show which trader type is driving the majority of the volume.

May 2005 to December 2012
                  Open Buy           Open Sell          Close Buy          Close Sell
                  Call     Put       Put      Call      Put      Call      Put      Call
Average volume    41.28    45.56     38.42    39.16     14.46    21.44     15.33    16.63
Proprietary       22.97%   21.46%    25.09%   22.50%    18.77%   15.84%    13.89%   12.52%
Customer          62.99%   66.69%    58.70%   64.21%    74.20%   78.64%    80.54%   83.09%
Broker Dealer     14.03%   11.85%    16.21%   13.29%    7.02%    5.52%     5.57%    4.39%
III. METHODOLOGY
We use four different machine learning algorithms, chosen for their different methodological
approaches to classification:
• Logistic regression (LR) with a ridge penalty: a very well-known linear classification
  algorithm used as the baseline algorithm.
• Gaussian Naive Bayes (GNB): a Bayesian parameter estimation problem based on some
  known prior distribution.
• Support vector machine (SVM): classifier based on a linear discriminant function.
• AdaBoost (AB): ensemble method that minimizes bias.
We used Python’s sklearn package; the parameters used can be found in table II below:
Table II. Parameters for Machine Learning Algorithms
The table below shows the parameter settings of the four different machine learning algorithms used in the analysis.

Learning Algorithm      Model                 Parameters
SVM                     SVC                   kernel='rbf'; class_weight=None
Ridge Regression        LogisticRegression    C=1e5
Gaussian Naïve Bayes    GaussianNB            Defaults
AdaBoost                AdaBoostClassifier    weak learner = DecisionTreeClassifier; learning_rate = 1;
                                              max_depth = 1; n_estimators = 300; algorithm = "SAMME"

Notes
SVM: Radial basis function, rbf, is the default kernel.
Ridge Regression: Chose a large C parameter to reduce regularization.
Gaussian Naïve Bayes: None.
AdaBoost: Implemented an ensemble learning approach with a decision stump as the weak learner. A few quick
experiments were run to determine a robust learning rate and a value of 1 was selected. We saw little
improvement, and in some cases worse performance, for lower learning rates. Considering this, and in lieu
of iteratively adjusting the learning rate, it was decided not to adjust the learning rate dynamically, as
research (Forrest 2001) has shown little statistically significant advantage from reducing the learning rate. The
maximum depth was set to 1 to avoid a slow error rate improvement for each boosting iteration
(Friedman et al. 2000). In addition, the SAMME algorithm (Zhu et al. 2009) was used to take advantage of
multiclass exponential loss.
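As a rough illustration, the Table II settings could be expressed with scikit-learn as follows. This is a sketch under assumptions rather than the authors' code: the helper name make_models is hypothetical, and the algorithm = "SAMME" setting is omitted because that argument is deprecated in recent scikit-learn releases.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_models():
    """Build one fresh instance of each classifier with the Table II settings."""
    return {
        "SVM": SVC(kernel="rbf", class_weight=None),
        # A large C means weak regularization in scikit-learn's parameterization.
        "LR": LogisticRegression(C=1e5),
        "GNB": GaussianNB(),
        # The weak learner (a depth-1 decision stump) is passed positionally
        # because its keyword name differs between scikit-learn releases.
        # Table II also lists algorithm="SAMME"; it is left out here (assumption)
        # because recent releases deprecate that argument.
        "AB": AdaBoostClassifier(
            DecisionTreeClassifier(max_depth=1),
            n_estimators=300,
            learning_rate=1.0,
        ),
    }


models = make_models()
```

Creating the models through a function makes it easy to obtain a fresh, unfitted set for every stock and forecast horizon.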
Models will be trained for each individual stock with the first 80% of observations and tested with
the remaining 20% of observations. Splitting the data set chronologically in this way will prevent
data snooping. All labels are the directional moves, up (1) or down (-1), of the asset price 1 to 5
days out into the future.
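The labeling and splitting scheme can be sketched in a few lines of plain Python. The function names and the price series are hypothetical, and treating an unchanged price as a down move is an assumption, since the text does not say how ties are handled.

```python
def direction_labels(prices, horizon):
    """Label day t with the direction of the price `horizon` days ahead:
    1 if P[t + horizon] > P[t], otherwise -1. Treating a flat price as a
    down move is an assumption. The last `horizon` days get no label."""
    return [
        1 if prices[t + horizon] > prices[t] else -1
        for t in range(len(prices) - horizon)
    ]


def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into the earliest 80% and the latest 20%."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical daily closing prices, for illustration only.
prices = [100, 101, 99, 102, 103, 101, 104, 105, 103, 106, 107]
labels = direction_labels(prices, horizon=1)
train, test = chronological_split(labels)
print(len(train), len(test))
```

Because the split is by position rather than at random, every test observation lies strictly after every training observation.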
The models for every individual stock are evaluated with the Matthews Correlation Coefficient
(MCC). MCC helps determine whether the model is a robust predictor of the price direction
(Matthews 1975). Not only is MCC well suited to a binary label, it also overcomes the bias
inherent in an unbalanced label count. Because markets tend to go up in the long run, moves in
the positive direction outnumber moves in the negative direction, creating a class label
imbalance: in the data used here, about 55% of the moves are upticks. To avoid results biased by
this unfair Bernoulli trial, the MCC measure was chosen because it handles the label imbalance
effectively. MCC is calculated by the following formula:
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
where,
TP – true positive, forecasted positive and actual positive
TN – true negative, forecasted negative and actual negative
FP – false positive, forecasted positive and actual negative
FN – false negative, forecasted negative and actual positive
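A small pure-Python sketch shows how the coefficient behaves on labels coded 1 (up) and -1 (down). Returning 0 when the denominator vanishes is a common convention and an assumption here; the text does not say how that case is handled.

```python
import math


def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN for labels coded 1 (up) and -1 (down)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == -1 and p == -1)
    fp = sum(1 for a, p in pairs if a == -1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == -1)
    return tp, tn, fp, fn


def mcc(actual, predicted):
    """Matthews correlation coefficient; 0.0 if the denominator is zero (assumed convention)."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# A model that always predicts "up" is right 75% of the time on this
# imbalanced sample, yet its MCC is 0.
always_up = mcc([1, 1, 1, -1], [1, 1, 1, 1])
perfect = mcc([1, 1, -1, -1], [1, 1, -1, -1])
inverted = mcc([1, 1, -1, -1], [-1, -1, 1, 1])
print(always_up, perfect, inverted)
```

The always-up case illustrates why plain accuracy would be misleading when upticks outnumber downticks.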
In order to determine the effect of certain features on model performance, we incrementally add
features to the data set, starting with a baseline feature set. The list below outlines the
various combinations of features and labels that were run through the training and testing
procedure.
1. Baseline features and labels: Form a log return time series individually for each stock and
use the log returns at t-1 and t, represented in continuous form, as features, with the
corresponding label given by the sign of the log returns over the next 5 days.
2. Baseline features, call-put ratio and labels: Using the same baseline features and labels
from 1 above, we include three additional features by calculating the ISE call-put ratios for
every trader type (customer, broker dealer and proprietary) using only the open buy order
transaction types, through the following formula (see note 4):
$$\mathrm{ISE} = \frac{\text{LONG CALLS}_{T_c}\ \text{(Opening Position)}}{\text{LONG PUTS}_{T_p}\ \text{(Opening Position)}}$$
where,
$T_c$ = trader-specific call volume
$T_p$ = trader-specific put volume
3. Baseline features, call-put ratio, sentiment and labels:
Using the same baseline features and labels from 2 above, we include three additional
features by extracting sentiment from messages using the Dictionary of Affect in Language
(DAL). Agarwal et al. (2011) showed that DAL accurately captures binary (positive or
negative) sentiment from tweets, and Xie et al. (2013) also successfully used DAL to
capture positive and negative polarity in news articles. The DAL parameters are known as
Pleasantness (Pleas), Activation (Act) and Imagery (Img). These parameters, the three
additional features, capture human emotion in a manner similar to the Google-Profile of
Mood States (six emotional states in total) that was successfully used by Bollen et al. (2009)
to predict future directional moves in stocks. We score all messages using the DAL
parameter scores by tokenizing each message and taking the average of each parameter for
every message.
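The two added feature groups can be sketched as follows. The three-word lexicon and its scores are invented stand-ins for the DAL, the function names are hypothetical, and averaging only over tokens found in the lexicon is an assumption; the handling of uncovered words and of zero put volume is not described in the text.

```python
import re

# Hypothetical three-word excerpt standing in for the DAL lexicon. The scores
# are made up for illustration and are not real DAL values.
TOY_DAL = {
    "gain": {"pleas": 2.6, "act": 2.1, "img": 1.8},
    "loss": {"pleas": 1.2, "act": 1.9, "img": 1.6},
    "stock": {"pleas": 1.9, "act": 1.7, "img": 2.4},
}


def call_put_ratio(open_buy_calls, open_buy_puts):
    """Opening long call volume divided by opening long put volume for one
    trader type. Returns None when put volume is zero (ratio undefined)."""
    if open_buy_puts == 0:
        return None
    return open_buy_calls / open_buy_puts


def dal_scores(message, lexicon=TOY_DAL):
    """Average each DAL dimension over the message tokens found in the lexicon.
    Returns None when no token is covered (assumed handling)."""
    tokens = re.findall(r"[a-z']+", message.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return None
    return {
        dim: sum(h[dim] for h in hits) / len(hits)
        for dim in ("pleas", "act", "img")
    }


ratio = call_put_ratio(open_buy_calls=150, open_buy_puts=100)
scores = dal_scores("Big gain on this stock")
print(ratio, scores)
```

In a full pipeline, the per-message scores would still need to be aggregated to the same daily frequency as the returns and call-put ratios before joining them into one feature matrix per stock.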
Each stock matrix will be run through all four machine learning algorithms by training each with
the first 80% of data and tested using the remaining data. The label is cycled from t+0 to t+5,
making 6 iterations for each stock per algorithm, totaling 24 simulations per stock.
IV. RESULTS
The first step taken was to include the log returns of t-1 and t as the features with the sign of the
next 5 day returns as the label, results summarized in table III below.
Table III. Average values of Matthews correlation coefficient for price direction by learning algorithm using
lagged values
MCC represents the average Matthews correlation coefficient for the forecasted day for 800 stocks, across all 5 days for each algorithm (ALGO).
FREQ represents the number of stocks included in the analysis.

ALGO       MCC1      MCC2      MCC3      MCC4      MCC5      MCC_AVG   FREQ
GNB        -0.006    0.011     0.003     -0.012    0.012     0.001     800
AB         -0.013    0.003     0.017     0.004     -0.008    0.001     800
LR         -0.002    0.013     -0.004    -0.008    0.012     0.002     800
SVM        -0.002    0.005     0.007     0.000     0.011     0.004     800
AVERAGE    -0.006    0.008     0.006     -0.004    0.007     0.002     800
The AdaBoost algorithm exhibited the largest MCC in absolute value, -0.013, for t+1 (MCC1) among
all models using the baseline features. Having found that AdaBoost has a performance advantage
over the other three models on this measure, we move forward with AdaBoost for the remainder
of the methodology steps and add the call-put ratio of each trader type individually to determine
whether it adds any predictive value; results are in table IV below.
Table IV. Average values of Matthews correlation coefficient for price direction prediction by learning
algorithm using lagged values and ISE call-put ratio by trader
MCC represents the average Matthews correlation coefficient for the forecasted day for a number of stocks, across all 5 days for each algorithm
(ALGO). FREQ is the number of stocks in the simulation and TRADER represents the trader type. P-values of ANOVA for comparison to
MCC=0.

ALGO       MCC1        MCC2        MCC3        MCC4        MCC5        MCC_AVG   FREQ   TRADER
AB         0.007       0.003       0.020       -0.004      0.009       0.007     704    CUST
AB         -0.008      0.001       0.021       0.000       0.029       0.009     105    BD
AB         -0.020      0.008       0.006       0.006       -0.019      -0.004    500    PROP
AB         0.016       -0.028      -0.020      -0.017      -0.009      -0.012    90     ALL
AVERAGE    -0.001      -0.004      0.007       -0.004      0.002       0.000     350
p-value    >> 0.100    >> 0.100    >> 0.100    >> 0.100    >> 0.100
There was an overall increase in the magnitude of the average (MCC_AVG) of the MCC1-MCC5
coefficients for every trader type compared to the baseline AdaBoost (AB) value of 0.001 after
adding the ISE call-put ratio features. Both the proprietary (PROP) and all-trader (ALL) feature
sets exhibited stronger MCC coefficients in magnitude, -0.020 and 0.016, for t+1 (MCC1)
compared to the AdaBoost baseline result of -0.013. In addition, the overall ALL average was
-0.012 compared to the baseline AdaBoost average value of 0.001. These results suggest the entire
basket of call-put ratios (ALL) contains the most information of all prior feature combinations.
Note that the varying counts (FREQ) for each trader type are consistent with the proportions
presented in table I: not all stocks had a call-put ratio to calculate for every trader type.
Combining all features (baseline, call-put ratios and DAL) further improved the average first-day
MCC (MCC1), which rose from -0.001 to 0.026 (see table V).
Table V. Average values of Matthews correlation coefficient for price direction prediction by learning
algorithm using lagged values, ISE call-put ratio and sentiment (DAL) by trader.
MCC represents the average Matthews correlation coefficient for the forecasted day for a number of stocks, across all 5 days for each algorithm
(ALGO). FREQ is the number of stocks in the simulation and TRADER represents the trader type. P-values of ANOVA for comparison to
MCC=0.

ALGO       MCC1        MCC2        MCC3        MCC4        MCC5        MCC_AVG   FREQ   TRADER
AB         0.021       0.006       -0.003      -0.013      0.001       0.002     704    CUST
AB         0.010       0.019       -0.024      0.019       -0.008      0.003     105    BD
AB         -0.004      -0.007      0.011       0.002       -0.007      -0.001    500    PROP
AB         0.077       -0.004      -0.001      -0.049      0.009       0.007     90     ALL
AVERAGE    0.026       0.004       -0.004      -0.010      -0.001      0.003     350
p-value    0.020       >> 0.100    >> 0.100    >> 0.100    >> 0.100
In addition, there was a marked increase in the MCC coefficient for t+1 (MCC1) for all trader
types combined (ALL), 0.077, compared to the value of 0.016 obtained without the sentiment
features (table IV). In order to determine whether there is a statistically significant difference
between trader type contributions to the t+1 label, ANOVA tests were run for all traders found in
tables IV and V. The baseline features with ISE resulted in no statistical significance (table IV).
However, the ANOVA test run for the baseline features with both ISE and DAL yielded a p-value
of 0.020, which is significant at the 5% level (table V). Our analysis shows that the ISE call-put
ratio and the sentiment features improve predictive performance, suggesting that there is more
information present when using both text-based features (sentiment) and features derived from
market data (call-put ratios).
V. CONCLUSION
We build a hybrid feature matrix composed of both sentiment extracted from social media
messages and a signal derived from market data, the call-put ratio. Both of these features contain
a sentiment and a behavioral aspect. Social media sentiment is an aggregated opinion of the
general investing community, and the call-put ratios capture sentiment for various trader types
beyond what would be found on social media platforms. Social media provides information about
the opinions and moods of the masses and a profile of the more herd-like traders. The signal
derived from market data covers customer, broker dealer and proprietary traders who, apart from
customers, are not on social media outlets broadcasting their opinions about stocks to the world,
as there are strict SEC rules preventing them from doing so; however, we are able to capture their
behavior through the option volume data. This research suggests that combining both feature
types, sentiment from both the masses and professional trader types, in two forms, text and
market data, yields statistically significant contributions to model performance.
Notes
1. “StockTwits® is a social media platform designed for sharing ideas between investors, traders, and
entrepreneurs.” Wikipedia Contributors, “StockTwits,” Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/StockTwits (accessed September 22, 2014).
2. Wikipedia Contributors, “JSON,” Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/JSON
(accessed September 22, 2014).
3. ISE Holdings is the International Securities Exchange Holding company that operates two U.S. option
exchanges: International Securities Exchange, LLC and Topaz Exchange, LLC
4. “The ISE Sentiment Index® (ISEE®) is a unique put/call value calculated using only opening long customer
transactions to determine bullish/bearish market direction.”
References
Agarwal, Apoorv, Boyi Xie, Ilia Vovsha, Owen Rambow, and Rebecca Passonneau. “Sentiment
Analysis of Twitter Data.” In Proceedings of the Workshop on Languages in Social Media,
30–38. LSM ’11. Stroudsburg, PA, USA: Association for Computational Linguistics, 2011.
http://dl.acm.org/citation.cfm?id=2021109.2021114.
Anthony, Joseph H. “The Interrelation of Stock and Options Market Trading-Volume Data.” The
Journal of Finance 43, no. 4 (September 1, 1988): 949–64. doi:10.1111/j.1540-6261.1988.tb02614.x.
Asur, Sitaram, and Bernardo A. Huberman. “Predicting the Future with Social Media.”
arXiv:1003.5699 (March 29, 2010). http://arxiv.org/abs/1003.5699.
Bermingham, Adam, and Alan F. Smeaton. “Classifying Sentiment in Microblogs: Is Brevity an
Advantage?” In Proceedings of the 19th ACM International Conference on Information and
Knowledge Management, 1833–1836. CIKM ’10. New York, NY, USA: ACM, 2010.
http://doi.acm.org/10.1145/1871437.1871741.
Billingsley, Randall S, and Don M Chance. “Put-Call Ratios and Market Timing Effectiveness.”
The Journal of Portfolio Management 15, no. 1 (January 1, 1988): 25–28.
doi:10.3905/jpm.1988.409184.
Bollen, Johan, Huina Mao, and Xiao-Jun Zeng. “Twitter Mood Predicts the Stock Market.”
arXiv:1010.3003 (October 14, 2010). http://arxiv.org/abs/1010.3003.
Bollen, Johan, Alberto Pepe, and Huina Mao. “Modeling Public Mood and Emotion: Twitter
Sentiment and Socio-economic Phenomena.” arXiv:0911.1583 (November 8, 2009).
http://arxiv.org/abs/0911.1583.
Cao, Charles, John M. Griffin, and Zhiwu Chen. Informational Content of Option Volume Prior
to Takeovers. SSRN Scholarly Paper. Rochester, NY: Social Science Research Network,
September 1, 2003. http://papers.ssrn.com/abstract=445320.
Choi, Hyunyoung, and Hal R. Varian. “Predicting the Present with Google Trends.” SSRN
eLibrary (June 2012). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2094858.
Gruhl, Daniel, R. Guha, Ravi Kumar, Jasmine Novak, and Andrew Tomkins. “The Predictive
Power of Online Chatter.” In Proceedings of the Eleventh ACM SIGKDD International
Conference on Knowledge Discovery in Data Mining, 78–87. KDD ’05. New York, NY,
USA: ACM, 2005. http://doi.acm.org/10.1145/1081870.1081883.
Han, Bin. “Limits of Arbitrage, Sentiment and Pricing Kernel: Evidences from Index Options,”
2004.
Houlihan, Patrick, and Germán G. Creamer. Leveraging a Call-Put Ratio as a Trading Signal.
SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, December 1,
2014. http://papers.ssrn.com/abstract=2363475.
Houlihan, Patrick, and Germán G. Creamer. Diffusion of Price Information through Message
Volume and Sentiment. SSRN Scholarly Paper. Rochester, NY: Social Science Research
Network, December 1, 2014. http://papers.ssrn.com/abstract=2527968.
Hu, Jianfeng. “Does Option Trading Convey Stock Price Information?” Journal of Financial
Economics 111, no. 3 (March 2014): 625–45. doi:10.1016/j.jfineco.2013.12.004.
Matthews, B. W. “Comparison of the Predicted and Observed Secondary Structure of T4 Phage
Lysozyme.” Biochimica et Biophysica Acta (BBA) - Protein Structure 405, no. 2 (October
20, 1975): 442–51. doi:10.1016/0005-2795(75)90109-9.
Pan, Jun, and Allen M. Poteshman. “The Information in Option Volume for Future Stock
Prices.” Review of Financial Studies 19, no. 3 (September 21, 2006): 871–908.
Pang, Bo, and Lillian Lee. “A Sentimental Education: Sentiment Analysis Using Subjectivity
Summarization Based on Minimum Cuts.” In Proceedings of the 42nd Annual Meeting on
Association for Computational Linguistics. ACL ’04. Stroudsburg, PA, USA: Association
for Computational Linguistics, 2004. http://dx.doi.org/10.3115/1218955.1218990.
Pang, Bo, and Lillian Lee. Opinion Mining and Sentiment Analysis. Now Publishers Inc, 2008.
Schapire, Robert E., and Yoram Singer. “BoosTexter: A Boosting-based System for Text
Categorization.” Machine Learning 39, no. 2 (2000): 135–168.