On the Effects of Stock Spam E-mails∗

Michael Hanke†, Florian Hauser‡

August 30, 2006

Abstract

A rising number of unsolicited e-mails recommend buying certain stocks, pretending that the sender has private information that will boost these stocks’ prices when it becomes publicly available. We first describe the common characteristics of stocks pushed by such e-mails. Then, we investigate the effect of stock spam e-mails on returns, volatility, intraday spread, and volume. We find a significant impact of spam mails on all of these variables. As a second contribution, we characterize features of stocks that are particularly easy to manipulate, and we investigate dependencies between these features.

JEL classification: D82, G14, G24
Keywords: stock spam, market manipulation

1 Introduction

Almost everyone with an e-mail account has received e-mails recommending the purchase of certain stocks. The principal purpose of such mails obviously is to manipulate the prices of the stocks they push. The senders of such e-mails spread forged “news” about the stocks, trying to artificially increase demand, which in turn leads to price increases. Activity of this kind has recently received increasing attention from the media (Wall Street Journal (2000, 2005a,b)).

Manipulation of stock markets is a phenomenon that has been around for a long time, possibly for as long as stock markets themselves. There is a host of analytical literature on this topic (e.g., van Bommel (2003), Jianga et al. (2005)). In earlier days, manipulators had to work much more subtly. Manipulation attempts camouflaged as analysts’ reports were quite common. In contrast to their modern counterparts initiated via the internet, their success was highly dependent on the analysts’ reputation, which was put on the line by repeatedly issuing “cooked” reports.
Precise timing was much more difficult, if not impossible, with traditional media such as newspapers, which are read by some people in the morning, by others in the evening, and might be accessed by a few “insiders” as early as the day before they go to press.

∗ We thank participants of the 20th workshop of the Austrian Working Group on Banking and Finance (Graz 2005) for helpful comments and discussions.
† Innsbruck University School of Management, Department of Banking and Finance, Universitaetsstrasse 15, 6020 Innsbruck, Austria.
‡ Corresponding author. Innsbruck University School of Management, Department of Banking and Finance, Universitaetsstrasse 15, 6020 Innsbruck, Austria.

For would-be manipulators, the internet offers the ideal framework, allowing simultaneous contact with large numbers of potential investors. Apart from e-mail, electronic newsletters and internet discussion forums are popular tools used by manipulators. While electronic newsletters can be viewed as just the good old print newsletters transferred to a new medium, internet discussion forums are a new framework for exchanging information on stocks. Several studies already confirm the explanatory power of information spread in such forums for stock price changes (see, e.g., Tumarkin and Whitelaw (2001), Dewally (2003), Antweiler and Frank (2004)).

Our goal in this paper is to explore the effects of stock spam e-mails that can be found in market data. Using panel regression, we investigate their effects on return, volatility, volume, and intraday spread measures of the target stocks. Our results indicate that spam e-mails show marked effects on all of these variables. Some limitations of our study are caused by the unavailability of intra-day data, which would have been necessary to provide a clearer picture of the potential profits of trading strategies. Nevertheless, we can learn a lot from the daily data we use here.
When we started this project, to the best of our knowledge, there had been no empirical study yet on stock spam e-mails. Only very recently did we become aware of a competing paper (Boehme and Holz, 2006). Although they retrieve their data from the same database, there are several major differences between their paper and ours (more details on these differences will be provided in the corresponding sections below):

• Our sample is much more comprehensive,¹ covering more stocks (235 compared to 93) and more spam events (1241 compared to 526 (volume) and 152 (returns) in Boehme and Holz (2006)).
• Our market data come from Datastream, whereas Boehme and Holz (2006) use data from Yahoo Finance. We strongly believe that our data are more reliable.
• We use panel regression instead of pooled OLS.
• Whereas Boehme and Holz (2006) focus only on returns and volume, we also describe effects on volatility and intraday spread.
• Our study provides a more comprehensive analysis of the dependencies between spam success and characteristics of target stocks.

The paper is organized as follows: Section 2 provides additional information about stock spam e-mails and discusses possible motivations and trading strategies followed by spammers. Moreover, we formulate a number of questions that will be answered through our empirical analysis. Section 3 presents our data set in detail. In Section 4, we describe the design of our study, providing all the regression equations estimated. Section 5 presents and discusses the results, and Section 6 concludes.

¹ This is despite the fact that we use only data from 2005, whereas Boehme and Holz (2006) cover the period from November 2004 to February 2006.

2 Stock Spam E-mails

We define stock spam e-mails as unsolicited e-mails that

• explicitly recommend buying a certain stock (the target) and/or
• contain “information” about events that investors will most certainly interpret as a buy recommendation.
Stock spam e-mails represent one of the contemporary variants of strategies known under the names “pump-and-dump” or “scalping”. The goal of such strategies is to artificially increase the demand for a certain stock by spreading false positive news in the hope of liquidating one’s own existing position (which may have been built up exclusively for this reason) at an inflated price. Before the advent of the technological possibilities brought about by the internet, the execution of such strategies was fraught with problems. The strategies only work well if the forged information can be spread quite quickly among a larger audience. Ideally, all recipients should get the information at the same time. The fact that such practices are illegal in many jurisdictions leads to a strong desire of the manipulator to remain anonymous. Among all e-mails pretending to contain information relevant for a stock’s price that we have seen, there was not a single one with either negative information or an outright sell recommendation. Another common feature of stock spam e-mails is that the information provided is incorrect, be it because it is outdated, exaggerated, or – in most cases – purely fictitious. The only purpose of such e-mails seems to be the manipulation of the target’s stock price. Assuming this is the case, the question naturally arises who benefits from spam e-mails, and therefore has an incentive to initiate them. A related question is what strategies stock spammers use in order to benefit financially from their activities. The stocks that are commonly targeted by spammers share several typical characteristics (details will be provided in Section 3). Here, it suffices to note that popular target stocks are (see, e.g., Luft and Levine (2004, p. 
2))

• not traded on organized exchanges, but quoted on platforms like OTCBB or Pink Sheets² and traded OTC via market makers,
• very cheap, often trading markedly below one dollar a share (so-called penny stocks),
• micro-caps (i.e., have a small total market value),
• thinly traded, with average daily turnover at five-to-six-digit figures or smaller,
• traded quite irregularly, with frequent gaps in daily price series,
• very difficult or impossible to sell short.

These features, together with the observation that stock spam e-mails always indicate an imminent rise in the target’s stock price, suggest two main possible motives for spamming a stock:

1. The spammer intends to sell a certain stock he already owns and wants to boost the selling price.
2. The spammer buys the target stock, initiates the spam shortly afterwards, and sells the stock again once the price has risen.

² These platforms will be described in more detail in Section 3.2.

The large number of spam e-mails suggests that there are professional spammers around, initiating stock spam e-mails on a regular basis. The negative average return of target stocks (see Section 3.1) provides a strong disincentive against holding such stocks for longer periods of time. Thus, we assume that professional spammers build up the positions in their target stocks as late as possible before initiating the spam mails. Putting ourselves in the position of a spammer of the latter type, many of the characteristics of popular target stocks described above make good sense. It certainly is much easier to manipulate the price of a stock with low turnover. Small movements in absolute terms translate into large percentage changes for so-called penny stocks, which makes cheap stocks attractive targets. Given that stocks with these characteristics usually cannot be shorted,³ we should indeed expect manipulation attempts via stock spam e-mails to contain only information which is positive for the target’s stock price.
At the same time, this raises a number of questions:

• When stocks are too thinly traded, a spammer might have too large a price impact: he drives the price upwards when building up his position and drives it downwards again when liquidating the position, thus reducing his returns. From the spammers’ point of view, is high or low turnover better, i.e., are stocks within a certain turnover (liquidity) range more desirable targets than others?
• The spammers’ success depends crucially on other investors’ willingness to invest in their target stocks. Given that stocks trading at very low stock prices usually carry a negative image, is the spammers’ rule really “the cheaper, the better”?
• Certain stocks are frequently targeted during our observation period, while others receive only one or two spam mails during the year 2005. A possible reason for this is that stocks that are easier to manipulate are targeted more often. Do stocks that are targeted frequently react differently to such manipulation attempts?
• At times, we observe stock spam e-mails targeting the same stocks on successive days. Is sequential spamming more effective than one-shot spams? Do the other effects observed differ between one-off spams and sequential spamming?

³ At least, not at reasonable transaction costs.

3 Data

We define a spam event for a certain stock as a day on which at least one incident of a recorded stock spam e-mail occurred before the closing of business. Spam events are retrieved from a publicly available database.⁴ The database allows an exact matching of spam events to trading days. Daily stock prices were gathered from Datastream. Our sample covers the year 2005.
Due to the “unusual” nature of our data, we describe the data collection process underlying the construction of the Crummy database in more detail. The author of the database maintains a large number of so-called trap accounts, i.e., e-mail accounts whose sole purpose is the reception of spam e-mails. Every day in the afternoon, when trading in the U.S. closes, automatic scripts evaluate the e-mails received for all trap accounts, classify the subset of stock spam e-mails according to the target stock, and time-stamp them. We checked a large sample of e-mails from the database for possible errors in the automated classification scripts. About 20 ticker symbols that turned out to be incorrectly classified in the database have been excluded from our sample.

We should point out that there is absolutely no way to ensure that the database is exhaustive. There may (and will!) be stock spam e-mails during 2005 which have not been recorded in the database. We strongly believe, however, that this in no way casts doubt on the validity of our results. On the contrary, a more complete record of spam events should rather improve the statistical significance of our results.

To be included in our sample, stocks in the database had to meet the following criteria:

• Availability of daily price data from Datastream.
• At least one spam event between January 1 and December 31, 2005.
• Trading activity resulting in at least 50 recorded daily closing prices in 2005.
• Trading at a U.S. marketplace (ensuring adequate matching of spam events to trading days).

3.1 Descriptive Statistics

Out of about 360 stocks in the database, 235 matched our criteria. For the stocks included in the sample, we have 1241 spam events in 2005. Figure 1 shows that the stocks are affected quite differently by spam e-mails: While 130 stocks were subject to no more than five spam events, the most popular target was spammed on almost 60 days in 2005!
Considering the large sample of our study, an analysis of the number of spams associated with certain trading days provides a good indicator for the spam activity over time. Figure 2 shows that the spam events in our sample are quite evenly distributed over time, with a maximum of 12 spam events for a single trading day. We found no evidence of calendar effects.

⁴ The Crummy database, http://www.crummy.com/features/StockSpam/reports/

Figure 1: Number of spam events per stock (stocks sorted by spam event frequency)

Figure 2: Number of spam events per trading day

A preliminary analysis of our data confirms the characterization of target stocks provided above:

• Trading in those stocks was quite illiquid: For 59220 stock trading days in total (252 trading days per year times 235 stocks), we recorded only 41764 quotes.
• Penny stocks are the predominant targets: 36309 of the 41764 unadjusted quotes were below $1. Figure 3 gives detailed information about the distribution of quotes in our sample.

Figure 3: Number of quotes in categories ($0.01 steps)

After several warnings issued by the SEC,⁵ investors should already be aware that investing in penny stocks is quite a risky business. Nevertheless, it is remarkable that even a portfolio of 235 stocks targeted by spam mails in 2005 does not provide protection against huge losses. Figure 4 shows the value of a hypothetical portfolio, starting at 100 on January 2 and growing at compounded average daily returns of the stocks in our sample (this can roughly be thought of as the performance of a naive diversification strategy). The portfolio loses almost 90% over one year. During the same period, the S&P 500 gained about 3.8%.

Figure 5 shows that daily returns of the stocks in our sample are leptokurtic. The red line shows the frequency for values from a normal distribution with mean and standard deviation equal to those of the returns in our sample.
The fat tails are more pronounced for returns on spam days (blue) than for those on unaffected days (for the exact definition of unaffected days see Section 4).

⁵ http://www.sec.gov/investor/pubs/microcapstock.htm

Figure 4: Value of a hypothetical portfolio growing at compounded average daily returns of all stocks in our sample

Figure 5: Cumulated frequency of standardized absolute returns of the stocks in our sample. Blue line: returns on spam days, black line: returns on unaffected days, red line: normal distribution.

3.2 Electronic Quotation Platforms for OTC-traded Stocks

Most of the stocks in our sample are not traded on organized exchanges, but quoted on platforms like Pink Sheets and OTCBB (Over-the-Counter Bulletin Board). In contrast to many stock exchanges, OTC markets are dealer markets, meaning that securities are traded by market makers who provide bid and ask prices for those stocks. To inform potential traders about current quotes, such platforms process quotes of eligible market makers who ensure the liquidity of the listed stocks. Companies quoted on those platforms usually cannot meet the requirements of reputable exchanges like NYSE or NASDAQ. Listing requirements on OTC platforms are very lenient: There are no minimum financial standards, and in some cases not even any action on the part of the issuing company itself is necessary for a stock to be listed on those platforms. A minor difference in “regulation” is the requirement for companies quoted on OTCBB to publish periodic financial reports (this is not necessary for stocks quoted on Pink Sheets). Table 1 compares the two most important platforms, OTCBB and Pink Sheets. More information on the market structure of OTCBB as well as the impact of liquidity and size on returns, volatility, and bid/ask spreads can be found in Larson et al. (2001); Luft and Levine (2004).
                                          OTCBB    Pink Sheets
Minimum Financial Requirements            No       No
Issuer Application Required               No       No
Periodic Financial Reporting Required     Yes      No
Eligible Market Makers                    225      189
Dollar Volume (bill.)                     4.2      8.7
Exclusive Securities                      3269     4788

Table 1: Comparison of OTCBB and Pink Sheets for January 2006 (data obtained from http://www.otcqx.com/docs/OTCQXBrochure.pdf)

4 Method

4.1 Basic Setup

We look for answers to the questions formulated in Section 2 using panel regression with dummy variables. At first sight, our problem seems to lend itself well to the standard event study methodology. However, there is one major difference: In classical event studies, the exact timing of the event is usually unknown. With our data, this is not a problem: As described above in Section 3, each spam mail is time-stamped and easily associated with the corresponding trading day. We take advantage of our large cross-section to mitigate the problem of observing only a small number of spam days for many of the stocks in our sample by using panel regression with dummies.

We have both a large cross-section and a large number of observations in time, so many econometric problems discussed in the panel regression literature which are due to a small number of observations in time are not a concern for us. Unobserved effects may be present both in the time and in the cross-section domain. Our data constitute an unbalanced panel, which restricts us to allowing for unobserved effects in only one of the domains, but not both. A preliminary analysis allowing for unobserved time effects showed that there are no such effects. There is, however, good reason to assume a fixed unobserved cross-sectional effect, e.g., average returns vary systematically across stocks. We account for this by using the standard fixed effects transformation (also known as the within transformation, see e.g. Wooldridge (2006, p. 461)).
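As a minimal sketch of the within transformation mentioned above, the following snippet subtracts each stock's own time mean from its observations. The row layout and the names `within_transform`, `stock`, `y`, and `d_spam` are illustrative assumptions and the numbers are made up; this is not the estimation code used for the paper.

```python
from collections import defaultdict

def within_transform(rows, key, fields):
    """Subtract each stock's own time mean from every listed field."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
        for field in fields:
            sums[row[key]][field] += row[field]
    demeaned_rows = []
    for row in rows:
        new_row = dict(row)
        for field in fields:
            new_row[field] = row[field] - sums[row[key]][field] / counts[row[key]]
        demeaned_rows.append(new_row)
    return demeaned_rows

# Hypothetical unbalanced toy panel: stock A has three days, stock B two.
panel = [
    {"stock": "A", "y": 0.10, "d_spam": 0},
    {"stock": "A", "y": -0.05, "d_spam": 1},
    {"stock": "A", "y": 0.25, "d_spam": 0},
    {"stock": "B", "y": 0.02, "d_spam": 0},
    {"stock": "B", "y": 0.04, "d_spam": 1},
]
demeaned = within_transform(panel, "stock", ["y", "d_spam"])
```

After the transformation, every variable has mean zero within each stock, so any stock-specific constant drops out of the regression. Dummies are demeaned in the same way as the dependent variable.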
Trying to remove any correlation from the residuals, we find that including AR-terms up to order four yields coefficients that are both economically and statistically significant. Sometimes, AR-coefficients of higher order are statistically significant (due to the size of our sample), but very small (close to zero).

In the basic model, we include different dummies for the days preceding a spam day ($D_{i,p,t}$, where $i = 1, \ldots, 235$ is the index for the stocks in our sample), spam days ($D_{i,s,t}$), and the days following a spam day ($D_{i,f,t}$). A further condition we imposed for a day to qualify as a preceding day was that at least two non-spam-event days must lie between this day and the most recent following day. All days that are neither spam days, nor within a one-day time interval around a spam day, will be called unaffected days (returns on such days will be referred to as unaffected returns). Thus, the basic regression equation reads as follows:

\[ \ddot{y}_{i,t} = \sum_{j=1}^{4} \beta_{AR_j} \ddot{y}_{i,t-j} + \sum_{k \in \{p,s,f\}} \beta_k \ddot{D}_{i,k,t} + \ddot{u}_{i,t}, \qquad (1) \]

where $\ddot{y}_{i,t} = y_{i,t} - \bar{y}_i$, the time-demeaned data on $y$ (this is usually called the fixed-effects transformation, similarly for $D$ and $u$). Analyses of residuals from our regressions show no significant autocorrelation, but some (time-) heteroskedasticity within cross-sections. We account for this by using a period SUR specification (Wooldridge, 2002, p. 144) for the covariances.

Boehme and Holz (2006) use a log-linearized version of a multiplicative regression model, estimated by standard pooled OLS. Our approach allows for greater flexibility in the variable transformations described in the following subsections.

4.2 Dependent Variables

The following variables are used as dependent variables:

• RET, representing returns,
• SIG, a measure of volatility,
• SPR, a measure of intraday volatility, and
• VOL, measuring daily turnover.

Various standardization procedures are applied to the raw data to facilitate a meaningful aggregation across stocks.
To calculate RET, we start from the price series in Datastream, which is already corrected for dividends and stock splits. Denoting these prices by $S_{i,t}$, we calculate raw returns $R_{i,t}$ using

\[ R_{i,t} = \ln S_{i,t} - \ln S_{i,t-1}. \qquad (2) \]

In many studies, the returns are also corrected for systematic effects, e.g., by calculating excess returns relative to a stock index. For the stocks in our sample, however, idiosyncratic risk clearly dominates systematic risk. For this reason, we employ a constant mean return correction which is handled implicitly within our model through the fixed-effects transformation (as described in Section 4.1).

SIG is calculated from the daily variance (squared raw return $R_{i,t}$) by standardizing over the average variance on unaffected days and then taking the square root:

\[ \mathrm{SIG}_{i,t} = \sqrt{\frac{R_{i,t}^2}{\overline{R_i^2}}}, \qquad (3) \]

where $\overline{R_i^2}$ denotes the arithmetic mean of $R_{i,t}^2$ for unaffected days.

To generate a measure for intra-day volatility, we start by calculating

\[ \varsigma_{i,t} = \ln\left[\mathrm{high}_t(S_{i,t})\right] - \ln\left[\mathrm{low}_t(S_{i,t})\right], \qquad (4) \]

where $\mathrm{high}_t(\cdot)$ and $\mathrm{low}_t(\cdot)$ denote the intra-day high and low of the respective stock on day $t$ (from Datastream). SPR is then calculated from

\[ \mathrm{SPR}_{i,t} = \frac{\varsigma_{i,t}}{\bar{\varsigma}_i}, \qquad (5) \]

where $\bar{\varsigma}_i$ denotes the arithmetic average of $\varsigma_{i,t}$ for unaffected days.

VOL is calculated from the daily turnover by standardizing over the turnover on unaffected days, and then applying a log transformation:

\[ \mathrm{VOL}_{i,t} = \ln\left(1 + \frac{to_{i,t}}{\overline{to}_i}\right), \qquad (6) \]

where $to_{i,t}$ denotes the turnover of share $i$ on trading day $t$, and $\overline{to}_i$ is the average daily turnover of share $i$ on unaffected days.

4.3 Explanatory Variables

In the base case, we use $D_{i,k,t}$, $k \in \{p, s, f\}$, together with AR($j$)-terms ($j = 1, \ldots, 4$) as independent variables. For analyzing many of the questions discussed in Section 2, the dummy methodology is not applicable together with the fixed effects specification, because for many cross-sections the dummies would be constant in time.
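The standardizations in equations (2) to (6) of Section 4.2 can be sketched for a single stock as follows. This is a hedged illustration only: the function names, the list-based data layout, and the toy numbers are assumptions, and "within a one-day interval around a spam day" is interpreted here as adjacency in the sequence of recorded trading days, ignoring gaps in the price series.

```python
import math

def unaffected_days(spam):
    """True for days that are neither spam days nor adjacent to one."""
    n = len(spam)
    return [not any(spam[max(0, t - 1):min(n, t + 2)]) for t in range(n)]

def mean_on(values, mask):
    """Arithmetic mean of the non-missing values on days where mask is True."""
    picked = [v for v, keep in zip(values, mask) if keep and v is not None]
    return sum(picked) / len(picked)

def dependent_variables(close, high, low, turnover, spam):
    unaff = unaffected_days(spam)
    # Eq. (2): raw log returns; the first day has no predecessor.
    ret = [None] + [math.log(close[t] / close[t - 1]) for t in range(1, len(close))]
    # Eq. (3): squared return scaled by its unaffected-day mean, then square root.
    sq = [None if r is None else r * r for r in ret]
    mean_sq = mean_on(sq, unaff)
    sig = [None if s is None else math.sqrt(s / mean_sq) for s in sq]
    # Eqs. (4)-(5): log high-low range scaled by its unaffected-day mean.
    rng = [math.log(h) - math.log(l) for h, l in zip(high, low)]
    mean_rng = mean_on(rng, unaff)
    spr = [x / mean_rng for x in rng]
    # Eq. (6): log of one plus turnover relative to its unaffected-day mean.
    mean_to = mean_on(turnover, unaff)
    vol = [math.log(1 + x / mean_to) for x in turnover]
    return {"RET": ret, "SIG": sig, "SPR": spr, "VOL": vol, "unaffected": unaff}

# Made-up data for one hypothetical stock: six trading days, spam on day 3.
close = [0.100, 0.102, 0.101, 0.130, 0.118, 0.117]
high = [0.101, 0.104, 0.103, 0.140, 0.125, 0.119]
low = [0.099, 0.100, 0.100, 0.100, 0.115, 0.116]
turnover = [8000.0, 12000.0, 30000.0, 90000.0, 40000.0, 10000.0]
spam = [False, False, False, True, False, False]
result = dependent_variables(close, high, low, turnover, spam)
```

By construction, SPR and the squared SIG average to one over a stock's unaffected days, and a day whose turnover equals the unaffected-day average has VOL equal to ln 2. A stock without any unaffected days, or with zero returns on all of them, would need separate handling that this sketch omits.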
For these questions, we split our sample into sub-samples and repeat the regressions on each of the sub-samples.

To investigate a possible dependence on turnover, we split our sample into two: one contains all stocks with “high” average daily turnover (more than $20,000 per day), and the other all stocks with “low” turnover (less than or equal to $20,000 per day).

To distinguish between stocks that are frequently targeted by spammers and those that receive only a few spam mails per year, we split our sample according to the number of spam events recorded in 2005. One sub-sample contains all stocks with five or more spam events during the observation period, and the other all stocks with four spam events or fewer.

The question whether repeated spamming on successive days shows greater or smaller effects is investigated by defining separate spam day dummies for the first, second, third, and fourth spam day in a row. These dummies will be denoted by $D_{i,s,t,1}, D_{i,s,t,2}, \ldots, D_{i,s,t,4}$. The regression equation then reads

\[ \ddot{y}_{i,t} = \sum_{j=1}^{4} \beta_{AR_j} \ddot{y}_{i,t-j} + \sum_{k \in \{p,f\}} \beta_k \ddot{D}_{i,k,t} + \sum_{l=1}^{4} \beta_{s,l} \ddot{D}_{i,s,t,l} + \ddot{u}_{i,t}. \qquad (7) \]

In order to find out whether the effects of spam e-mails differ depending on the price level of the target stock, we subdivide our sample into two, using a stock price of 10 cents as the boundary.

5 Results

In the following subsections, we present and discuss the results for the regressions described in Section 4. In preliminary runs, we also included a coefficient for linear time dependence, which was zero in all the regressions and has therefore been dropped. This is in contrast to the results of Boehme and Holz (2006), who find a positive linear trend in their (smaller) sample.

5.1 Base Case

5.1.1 Return

The base case panel regression is given by equation (1). Results for the first regression, using RET as the dependent variable, are given in Table 2. As discussed in Section 4, α denotes the constant implied by the fixed-effects transformation.
Coeff.       Est.    p-val.
α          -0.015     0.000
βAR1       -0.260     0.000
βAR2       -0.082     0.000
βAR3       -0.053     0.000
βAR4       -0.035     0.000
βp          0.010     0.273
βs          0.019     0.000
βf         -0.008     0.235
R2         0.0744
n           28469

Table 2: Results for the base case, dependent variable: RET. p-values for two-sided test with period SUR corrections.

We find significant negative autocorrelation up to lag 4, which is a common sign of illiquidity (bid-ask bounce) and therefore to be expected given the characteristics of our stocks. The spam day dummy coefficient is highly significant and positive: On spam days, returns are indeed markedly higher than on unaffected days. In contrast, there is no significant return effect on days preceding or following spam days. The corresponding coefficients are only half the size of the spam day dummy coefficient. Their signs, however, support our economic story.

One problem with the interpretation of our returns is that they are based on closing prices. Unfortunately, we do not have information on intra-day price developments (apart from spreads). For relatively liquid stocks, it might well be the case that prices rise after spreading the spam mail, the spammer takes his profits, and prices are already coming back to “normal” levels before the close of trading on the day of the spam event. In these cases, our results would be affected twofold: Returns on spam days would underestimate the true return impact of the spam mails, and returns on following days would underestimate the market’s correction after the spam event. Therefore, our coefficients should provide rather conservative estimates of the respective return impacts.

5.1.2 Volatility

Results for the base case with SIG as the dependent variable are given in Table 3.

Coeff.       Est.    p-val.
α           0.404     0.000
βAR1        0.205     0.000
βAR2        0.063     0.000
βAR3        0.041     0.000
βAR4        0.047     0.000
βp          0.037     0.351
βs          0.082     0.000
βf          0.007     0.818
R2          0.087
n           28469

Table 3: Results for the base case, dependent variable: SIG. p-values for two-sided test with period SUR corrections.

We find significant positive autocorrelation in SIG up to lag 4 with the first-order term clearly dominating in magnitude, which points to the well-known phenomenon of volatility clustering. On spam days, SIG is significantly larger than usual.

5.1.3 Intraday Spread

Table 4 shows the results for SPR used as the dependent variable.

Coeff.       Est.    p-val.
α           0.614     0.000
βAR1        0.212     0.000
βAR2        0.079     0.000
βAR3        0.057     0.000
βAR4        0.048     0.000
βp          0.279     0.000
βs          0.164     0.000
βf          0.033     0.457
R2         0.0977
n           29782

Table 4: Results for the base case, dependent variable: SPR. p-values for two-sided test with period SUR corrections.

Here, again, autocorrelation is positive and highly significant up to lag 4 with the first lag clearly dominating. Days with large intraday price changes tend to cluster. βp and βs are highly significant with positive sign, indicating larger spreads on spam days and days preceding spam days compared to unaffected days. Interestingly, βp is about two thirds larger than βs. This may be viewed as supporting our story of the spammer building up his position shortly before he initiates the spam. Taking this result together with the corresponding coefficient of the volatility regression implies that the spammer’s build-up orders are issued on the day preceding the spam event.

5.1.4 Volume

Table 5 shows the results using VOL as the dependent variable.

Coeff.       Est.    p-val.
α           0.110     0.000
βAR1        0.400     0.000
βAR2        0.151     0.000
βAR3        0.095     0.000
βAR4        0.089     0.000
βp          0.228     0.000
βs          0.448     0.000
βf          0.165     0.000
R2          0.439
n           55028

Table 5: Results for the base case, dependent variable: VOL. p-values for two-sided test with period SUR corrections.

All AR coefficients up to lag 4 are highly significant. Strong positive autocorrelation indicates a clustering of high- and low-volume days.
The effect on VOL is about twice as large on spam days compared to preceding days and following days. Again, this indicates that the spammers’ activities leave clear traces in market data. Regardless of whether the manipulation attempts really cause prices to move in the desired direction, trading volume in spammed stocks is clearly higher on and around spam days. The fact that volume is already up on the preceding day indicates that it is not uncommon for the spammer to build up his position quite shortly before he spreads his e-mails. Boehme and Holz (2006) also find a positive dependence between volume and spam events, but they do not consider effects on preceding and following days.

5.2 Dependence on Turnover

Next, we examine a possible dependence of our results on turnover. Table 6 shows the results for the sub-samples of high-turnover and low-turnover stocks, respectively.

As far as RET is concerned, spam day dummies are significant for both high- and low-turnover stocks. The coefficient for low-turnover stocks, however, is more than twice as large as that for high-turnover stocks. The coefficient for days following spam days is zero for high-turnover stocks, but significantly negative for low-turnover stocks. A possible explanation is that much of the return impact of the spam is already digested during the spam day for high-turnover stocks, depressing return effects (visible in the daily data) on both the spam day and the following day, while the same process might take longer for low-turnover (low-liquidity) stocks.

For SIG, we find an interesting result. Volatility is only significantly increased on spam days for high-turnover stocks, while the corresponding coefficient for low-turnover stocks shows a p-value of around 0.11 (with the expected positive sign).

For the intraday spread, the same coefficients are significant as in the base case (spam day and preceding day).
The two groups differ, however, as far as the magnitude of the effect is concerned: The total effect of a dummy (calculated for preceding days as βp/(1 − βAR1 − βAR2 − βAR3 − βAR4), and analogously with βs for spam days) is about twice as large on preceding days and about 50% larger on spam days for low-turnover stocks than for high-turnover stocks. This makes sense from an economic point of view: The lower the liquidity, the larger the price impact.

                        RET                                 SIG
           high turnover      low turnover     high turnover      low turnover
Coeff.      Est.  p-val.      Est.  p-val.      Est.  p-val.      Est.  p-val.
α         -0.011   0.000    -0.023   0.000     0.413   0.000     0.386   0.000
βAR1      -0.159   0.000    -0.345   0.000     0.205   0.000     0.203   0.000
βAR2      -0.056   0.000    -0.118   0.000     0.070   0.000     0.049   0.000
βAR3      -0.048   0.000    -0.063   0.000     0.034   0.000     0.055   0.000
βAR4      -0.029   0.003    -0.043   0.000     0.044   0.000     0.053   0.000
βp         0.013   0.185     0.006   0.738     0.075   0.128    -0.038   0.553
βs         0.013   0.028     0.031   0.004     0.092   0.001     0.062   0.109
βf         0.005   0.528    -0.031   0.028     0.017   0.664    -0.010   0.831
R2        0.0328             0.121            0.0825             0.097
n          18417             10052             18417             10052

                        SPR                                 VOL
           high turnover      low turnover     high turnover      low turnover
Coeff.      Est.  p-val.      Est.  p-val.      Est.  p-val.      Est.  p-val.
α          0.570   0.000     0.681   0.000     0.091   0.000     0.124   0.000
βAR1       0.227   0.000     0.191   0.000     0.479   0.000     0.349   0.000
βAR2       0.093   0.000     0.058   0.000     0.139   0.000     0.149   0.000
βAR3       0.059   0.000     0.052   0.000     0.081   0.000     0.098   0.000
βAR4       0.058   0.000     0.034   0.000     0.096   0.000     0.082   0.000
βp         0.190   0.004     0.441   0.000     0.116   0.000     0.342   0.000
βs         0.126   0.001     0.229   0.000     0.328   0.000     0.598   0.000
βf         0.018   0.739     0.056   0.464     0.119   0.000     0.214   0.000
R2          0.11            0.0837             0.535              0.37
n          18886             10896             26291             28737

Table 6: Dependence on turnover. Top line shows dependent variables, second line indicates sub-sample. p-values for two-sided test with period SUR corrections.

Volume shows similar characteristics to spreads: The same coefficients are significant as in the base case (here, coefficients for spam days, preceding and following days are highly significant).
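The total-effect comparison for the intraday spread can be checked with a few lines of arithmetic. The sketch below assumes the SPR coefficients are read from Table 6 as listed in the code comments; the helper names `total_effect` and `cumulative_response` are illustrative. The second helper sums the day-by-day responses to a one-day dummy pulse propagated through the four AR terms, which for a stationary AR process converges to the closed-form ratio.

```python
def total_effect(beta_dummy, ar_coeffs):
    """Closed form: beta / (1 - sum of AR coefficients)."""
    return beta_dummy / (1.0 - sum(ar_coeffs))

def cumulative_response(beta_dummy, ar_coeffs, horizon=200):
    """Sum of the responses to a one-day dummy pulse, propagated via the AR terms."""
    path = []
    for t in range(horizon):
        value = beta_dummy if t == 0 else 0.0
        for j, phi in enumerate(ar_coeffs, start=1):
            if t - j >= 0:
                value += phi * path[t - j]
        path.append(value)
    return sum(path)

# SPR estimates as read from Table 6 (AR1..AR4, then the dummy coefficients).
spr_high_ar = [0.227, 0.093, 0.059, 0.058]
spr_low_ar = [0.191, 0.058, 0.052, 0.034]

high_preceding = total_effect(0.190, spr_high_ar)  # beta_p, high turnover
low_preceding = total_effect(0.441, spr_low_ar)    # beta_p, low turnover
high_spam = total_effect(0.126, spr_high_ar)       # beta_s, high turnover
low_spam = total_effect(0.229, spr_low_ar)         # beta_s, low turnover

ratio_preceding = low_preceding / high_preceding
ratio_spam = low_spam / high_spam
```

With these inputs the ratio is a little under 2 for preceding days and a little above 1.5 for spam days, in line with the "twice as large" and "50% larger" statements in the text.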
The impact on volume is markedly higher for low-turnover stocks, which can again be attributed to liquidity.

5.3 Popularity as a Target

Interestingly, some stocks are spammed very frequently, while others are targeted only once or twice during the year 2005. This raises the question of why some targets are so popular. Maybe the stocks that are more popular among spammers react “better” (from the viewpoint of the spammer) to manipulation attempts than others? To answer this question, we divide our sample into two groups: One sub-sample contains all stocks with 5 spam events or more during 2005, the other sub-sample all stocks with fewer than 5 spam events. We run the regression described in equation (1). The results are shown in Table 7. The effects on returns are about the same in magnitude for both groups. For frequently spammed stocks, the effect is highly significant on spam days, while for less frequently spammed stocks, it has a p-value of 6.8%.

          RET                               SIG
          ≥ 5 spam events  < 5 spam events  ≥ 5 spam events  < 5 spam events
Coeff.    Est.    p-val.   Est.    p-val.   Est.    p-val.   Est.    p-val.
α        -0.016   0.000   -0.015   0.000    0.396   0.000    0.410   0.000
βAR1     -0.231   0.000   -0.279   0.000    0.214   0.000    0.198   0.000
βAR2     -0.102   0.000   -0.070   0.000    0.067   0.000    0.061   0.000
βAR3     -0.069   0.000   -0.040   0.000    0.044   0.000    0.040   0.000
βAR4     -0.048   0.000   -0.026   0.001    0.044   0.000    0.048   0.000
βp        0.008   0.478    0.014   0.347   -0.018   0.714    0.127   0.053
βs        0.019   0.002    0.020   0.068    0.049   0.054    0.189   0.000
βf       -0.003   0.704   -0.020   0.120    0.014   0.691   -0.017   0.749
R²        0.0628           0.0843           0.0916           0.0859
n         11501            16968            11501            16968

          SPR                               VOL
          ≥ 5 spam events  < 5 spam events  ≥ 5 spam events  < 5 spam events
Coeff.    Est.    p-val.   Est.    p-val.   Est.    p-val.   Est.    p-val.
α         0.595   0.000    0.626   0.000    0.104   0.000    0.113   0.000
βAR1      0.211   0.000    0.212   0.000    0.433   0.000    0.381   0.000
βAR2      0.096   0.000    0.070   0.000    0.145   0.000    0.152   0.000
βAR3      0.058   0.000    0.056   0.000    0.103   0.000    0.091   0.000
βAR4      0.047   0.000    0.049   0.000    0.072   0.000    0.099   0.000
βp        0.244   0.001    0.330   0.000    0.108   0.000    0.384   0.000
βs        0.099   0.004    0.362   0.000    0.391   0.000    0.566   0.000
βf        0.010   0.835    0.067   0.404    0.103   0.000    0.248   0.000
R²        0.101            0.0966           0.498            0.403
n         11936            17846            19759            35269

Table 7: Dependence on target popularity. Top line shows dependent variables, second line indicates sub-sample. p-values for two-sided test with period SUR corrections.

When looking at our volatility measure SIG, we find that the volatility of frequently spammed stocks is much less affected than that of infrequently spammed stocks. The popularity of a target is inversely related to its volatility reaction on spam days. Unaffected volatility, however, does not differ between our sub-samples. Spreads behave similarly to volatility: Stocks with high spread increases on spam days and preceding days are less frequently spammed. VOL also reacts less sensitively for frequently spammed stocks than for the other sub-sample. The difference in effects is particularly large on preceding days.

A closely related question is whether market participants learn from previous spams, i.e., do observed effects on the dependent variables decrease with the number of times a particular stock is targeted? Unfortunately, this question is very difficult to investigate, at least based on the information we have.
Two requirements that we consider crucial for the feasibility of the analysis are not met in practice: First, we do not know who actually received a particular spam e-mail (whether it was sent to a comparatively large group of recipients, or rather a smaller one). An investor who receives only, say, every fifth spam e-mail concerning a certain stock cannot be expected to learn anything from experience with the four spams in between. Second, we would need information about the absolute number of spam e-mails by which each stock in our sample has been targeted. Recording a spam e-mail in January 2005 does not tell us anything about how many spams occurred prior to that point in time (in 2004, 2003, ...). Therefore, we cannot provide an answer to this question based on our data. Boehme and Holz (2006) investigate whether the “stock spam trick is wearing out over time”, conducting the analysis described above across all stocks, independently of how many spams the stocks received prior to the sample period. They find no evidence of decreasing effects of spam mails.

5.4 Successive Spam Days

From time to time, we observe a clustering of spam events, meaning that a certain stock is targeted on successive trading days. We can only speculate about possible motives of spammers for that: Maybe the manipulation attempt on the first day was not successful, and the spammer desperately tries to repeat his attempt on the following day. Or it was successful, but due to liquidity problems the spammer could not liquidate his entire position, and he wants to make sure that demand for the stock does not decrease while he is trying to liquidate the remainder of his position. To investigate this question, we define new dummy variables based on the number of spam days in a row, and run the regression described in equation (7). Table 8 shows the results of this regression.

          RET              SIG              SPR              VOL
Coeff.    Est.    p-val.   Est.    p-val.   Est.    p-val.   Est.    p-val.
α        -0.006   0.000    0.395   0.000    0.575   0.000    0.270   0.000
βAR1     -0.210   0.000    0.203   0.000    0.219   0.000    0.399   0.000
βAR2     -0.065   0.000    0.057   0.000    0.098   0.000    0.121   0.000
βAR3     -0.067   0.000    0.040   0.000    0.062   0.000    0.066   0.000
βAR4     -0.023   0.003    0.049   0.000    0.048   0.000    0.057   0.000
βp        0.009   0.330    0.045   0.331    0.228   0.000    0.190   0.000
βs,1      0.019   0.008    0.087   0.022    0.148   0.004    0.392   0.000
βs,2      0.017   0.102    0.037   0.512    0.038   0.595    0.417   0.000
βs,3      0.017   0.260   -0.087   0.255    0.101   0.299    0.257   0.000
βs,4      0.001   0.977    0.184   0.119    0.080   0.603    0.302   0.001
βs,5     -0.052   0.072    0.135   0.369    0.405   0.053    0.343   0.004
βf       -0.008   0.283    0.052   0.161    0.072   0.159    0.142   0.000
R²        0.0629           0.105            0.127            0.42
n         16106            16106            16837            22609

Table 8: Successive spam days. Results for the regression from equation (7). Top line shows dependent variables. p-values for two-sided test with period SUR corrections.

For RET, the spam day coefficient is significantly positive only for the first day in a row of successive spam days. Repeating spam e-mails on successive days does not seem to be a profitable strategy. For SIG and SPR, we also find a significantly positive coefficient for the first spam day only. While intraday spreads are significantly larger than usual on preceding days, no such effect can be detected for volatility. Neither volatility nor intraday spreads are significantly affected on following days. Volume is significantly higher on all spam days, regardless of whether they occur in a row or not. The volume effect is also significantly positive for preceding and following days. The effect is much larger in magnitude on spam days than on preceding or following days.

5.5 Price Level of Target Stock

Given that almost all of the stocks in our sample trade at very low prices, we want to find out whether “cheaper” means “better for the spammer”, in the sense that cheaper stocks are easier to manipulate (react more sensitively to manipulation attempts).
To this end, we split our sample into two, taking a price of ten cents as the boundary. The results of the regression described in equation (1) applied to these sub-samples are shown in Table 9. For returns, we find a marked influence of price level, independently of spam events. Stocks in the lower price group show much more negative returns than the higher-priced stocks. The increase in volatility on spam days is significantly larger for lower-priced stocks. Spreads are generally higher for lower-priced stocks, as is the effect on spam days and preceding days.

          RET                               SIG
          price ≥ 10 cents price < 10 cents price ≥ 10 cents price < 10 cents
Coeff.    Est.    p-val.   Est.    p-val.   Est.    p-val.   Est.    p-val.
α        -0.006   0.000   -0.028   0.000    0.395   0.000    0.440   0.000
βAR1     -0.210   0.000   -0.307   0.000    0.203   0.000    0.193   0.000
βAR2     -0.065   0.000   -0.109   0.000    0.057   0.000    0.059   0.000
βAR3     -0.067   0.000   -0.055   0.000    0.040   0.000    0.036   0.000
βAR4     -0.023   0.002   -0.049   0.001    0.049   0.000    0.035   0.000
βp        0.009   0.336    0.006   0.753    0.045   0.332    0.043   0.522
βs        0.014   0.012    0.022   0.038    0.058   0.041    0.140   0.000
βf       -0.008   0.276   -0.012   0.385    0.052   0.162   -0.065   0.233
R²        0.0625           0.105            0.104            0.085
n         16106            12363            16106            12363

          SPR                               VOL
          price ≥ 10 cents price < 10 cents price ≥ 10 cents price < 10 cents
Coeff.    Est.    p-val.   Est.    p-val.   Est.    p-val.   Est.    p-val.
α         0.575   0.000    0.700   0.000    0.270   0.000    0.071   0.000
βAR1      0.219   0.000    0.192   0.000    0.399   0.000    0.327   0.000
βAR2      0.098   0.000    0.049   0.000    0.120   0.000    0.141   0.000
βAR3      0.062   0.000    0.042   0.000    0.066   0.000    0.093   0.000
βAR4      0.048   0.000    0.040   0.000    0.057   0.000    0.085   0.000
βp        0.228   0.000    0.398   0.000    0.190   0.000    0.213   0.000
βs        0.109   0.004    0.296   0.000    0.370   0.000    0.509   0.000
βf        0.072   0.162    0.007   0.933    0.143   0.000    0.155   0.000
R²        0.127            0.0856           0.42             0.442
n         16837            12945            22609            32419

Table 9: Dependence on the price level of the target. Top line shows dependent variables, second line indicates sub-sample. p-values for two-sided test with period SUR corrections.
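The data preparation behind such a split-sample regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the results above: it assumes that the dummies behind βp, βs and βf flag the trading day before, on, and after a spam event, and that a stock is assigned to a price group by its mean closing price. The function names, the tickers and the prices are hypothetical.

```python
from statistics import mean

PRICE_BOUNDARY = 0.10  # ten cents, the boundary named in the text


def event_dummies(n_days, spam_days):
    """Return (preceding, spam, following) 0/1 lists for one stock.

    n_days    -- number of trading days in the sample
    spam_days -- indices of trading days on which spam was recorded
    """
    spam = [1 if t in spam_days else 0 for t in range(n_days)]
    # Assumption: a day counts as 'preceding' if the next trading day is a
    # spam day, and as 'following' if the previous trading day is one.
    # On clustered spam days these flags can overlap with the spam flag.
    preceding = [spam[t + 1] if t + 1 < n_days else 0 for t in range(n_days)]
    following = [spam[t - 1] if t >= 1 else 0 for t in range(n_days)]
    return preceding, spam, following


def split_by_price(closing_prices, boundary=PRICE_BOUNDARY):
    """Split tickers into (at_or_above, below) by mean closing price.

    Grouping by the mean price is an assumption made for this sketch.
    """
    at_or_above, below = [], []
    for ticker, prices in closing_prices.items():
        group = at_or_above if mean(prices) >= boundary else below
        group.append(ticker)
    return sorted(at_or_above), sorted(below)


# Hypothetical example data: two made-up tickers, one spam event on day 2.
prices = {"AAAA": [0.04, 0.05, 0.06], "BBBB": [0.50, 0.45, 0.55]}
print(split_by_price(prices))
print(event_dummies(5, {2}))
```

Each group would then be passed separately to the regression, with the three dummy series entering next to the autoregressive terms.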
Volume is markedly higher for higher-priced stocks, indicating generally higher liquidity. The impact on volume on spam days is significantly higher for low-priced stocks, which is consistent with our interpretation of the effects being mainly driven by liquidity.

6 Conclusions and Directions for Further Research

Using a large sample of spam e-mails, we find significant effects of stock spam on a number of market statistics (price, volatility, intraday spread, trading volume). This, in itself, does not provide evidence for the success of spammers. However, there is still a lot of information that we can extract from the analysis of our data: We can characterize stocks that are particularly vulnerable (i.e., easy to manipulate). Stocks that are frequently targeted react less sensitively to stock spam than others (based on closing prices!). Sequential spamming on successive days has no additional effect. Returns react most strongly for stocks trading below 10 cents at low volumes.

Our database is insufficient to analyze the success of specific trading strategies. Instead of open/high/low/close data, intraday transactions data would be required. The same holds for an analysis of possible trading gains of other investors trying to “ride the wave”. As soon as we obtain intraday price data for the stocks in our sample, we will also investigate the economic significance of our results by exploring the success of various trading strategies based on stock spam e-mails.

References

Antweiler, Werner, Murray Z. Frank. 2004. Is all that talk just noise? The information content of internet stock message boards. The Journal of Finance 59/3 1259–1294.

Boehme, Rainer, Thorsten Holz. 2006. The effect of stock spam on financial markets.

Dewally, Michael. 2003. Internet investment advice: Investing with a rock of salt. Financial Analysts Journal 59/4 65.

Jiang, Guolin, Paul G. Mahoney, Jianping Mei. 2005. Market manipulation: A comprehensive study of stock pools. Journal of Financial Economics 77/1 147–170.

Larson, Scott, Carl Luft, Lawrence M. Levine. 2001. Over the counter bulletin board exchange: Market structure, risk, and return. Journal of Alternative Investments 4/2 33–42.

Luft, Carl, Lawrence M. Levine. 2004. Over the counter bulletin board exchange: The impact of liquidity and size to return, volatility, and bid/ask spread. Journal of Alternative Investments 7/3 95–106.

Tumarkin, Robert, Robert F. Whitelaw. 2001. News or noise? Internet postings and stock prices. Financial Analysts Journal 57/3 41–51.

van Bommel, J. 2003. Rumors. The Journal of Finance LVIII/4 1499–1519.

Wall Street Journal. 2000. Teenage trader runs afoul of the SEC as stock touting draws charges of fraud. 21.09.2000.

Wall Street Journal. 2005a. Scam artists tap text messaging to lure folks seeking hot stocks. 27.09.2005.

Wall Street Journal. 2005b. SEC steps up effort to fight stock fraud; agency begins to halt trading when it suspects spammers are touting shares illegally. 02.02.2005.

Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. The MIT Press.

Wooldridge, Jeffrey M. 2006. Introductory Econometrics: A Modern Approach. 3rd ed. Thomson.