These are my notes on the PhD thesis 'Four Essays in Statistical Arbitrage in Equity Markets' by Jozef Rudy. Hoping to implement some of these eventually.
Ch 1 - Introduction
This is just a summary chapter. The work is mostly about pairs trading and its modifications, concentrating on daily trading but also applying high-frequency data and other modifications. There is also a chapter on mean reversion strategies, which also fall under statistical arbitrage.
The standard market approach is daily sampling (Gatev 2006). In its standard form, the edge such strategies provide seems to be dissipating. Going to higher frequency can potentially capture more information (Aldridge 2009). A nonstandard half-daily sampling frequency and the use of ETFs can further help performance.
Ch 2 - Literature Review
Nunzio Tartaglia is credited with developing pairs trading at Morgan Stanley in the 1980s. It was hugely successful, but profits have come down recently. That is why one needs to go into higher frequencies (Marshall et al. 2010). Similarly, Schulmeister (2007) finds that technical trading rules are profitable, but only at higher frequencies. That motivates the half-daily timeframe.
Engle and Granger (1987) brought cointegration to the limelight. Johansen (1988) developed the test used for the multivariate case. For a pair, the simpler method is to first calculate the beta using $P_{1t}=\beta P_{2t}+\epsilon_t$. Then check the residuals using the Augmented Dickey-Fuller (ADF) unit root test at 95% confidence:
$$\Delta \epsilon_t = \phi+\gamma\epsilon_{t-1}+\sum_{i=1}^{p}\alpha_i\Delta \epsilon_{t-i}+u_t.$$
We include the most significant lags in an iterative fashion and then test the null hypothesis of no cointegration, $\gamma=0$, against the alternative $\gamma<0$.
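As a minimal sketch of this two-step check (function and variable names are mine; note that standard ADF critical values are only approximate when applied to residuals of an estimated regression):

```python
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def engle_granger_check(p1, p2, alpha=0.05):
    """Step 1: OLS of P1 on P2 without a constant (as in the thesis).
    Step 2: ADF test (with constant, per the equation above) on the residuals."""
    beta = sm.OLS(p1, p2).fit().params[0]
    resid = p1 - beta * p2
    adf_stat, p_value = adfuller(resid, regression="c", autolag="AIC")[:2]
    return beta, adf_stat, p_value, p_value < alpha
```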
For more than two
assets one need to use Johansen method. Non-parametric distance method
(Gatev 2006) and stochastic approach (Mudchanatongsuk 2008) has also
been used.
Time-adaptive models like the Kalman filter have been shown to be superior to rolling-window OLS-based methods due to the forward-looking methodology of the former. Double exponential smoothing-based prediction models can give results comparable to the Kalman filter but run an order of magnitude faster.
'Market neutral' hedge funds are generally pairs-trading type funds.
Ch 3 - Stats Arb. and HF data
The main innovation is to apply the statistical arbitrage technique of pairs trading to high-frequency equity data (Eurostoxx 50 stocks). This is done for frequencies from 5-minute intervals (IR ~ 3) to daily (IR ~ 1). Pairs are chosen based on the best in-sample IR and the highest in-sample t-stats of the ADF test on the residuals of the cointegrating regression sampled at daily frequency. The 5 best pairs are chosen. The simplest method is the Engle and Granger (1987) cointegration approach. To make the beta parameter adaptive, the following techniques can be used: rolling OLS, the DESP model and the Kalman filter.
Cointegration model
Take pairs from the same industry based on economic reasoning and apply an OLS regression to them:
$$Y_t=\beta X_t + \epsilon_t$$
Then test the residuals of the OLS regression for stationarity using the Augmented Dickey-Fuller unit root test.
Rolling OLS
Similarly, we can calculate a rolling beta using rolling OLS. This approach suffers from the 'ghost effect', 'lagging effect' and 'drop-out effect'. The window can be optimized for maximum in-sample IR; the optimum was around 200 periods, which was then used out of sample.
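A compact way to compute this rolling beta, consistent with the no-constant regression used throughout (the 200-period window is the in-sample optimum reported above; the closed form below is the no-intercept OLS solution):

```python
import pandas as pd

def rolling_beta(y: pd.Series, x: pd.Series, window: int = 200) -> pd.Series:
    """Rolling no-intercept OLS beta over the window: sum(x*y) / sum(x*x)."""
    return (x * y).rolling(window).sum() / (x * x).rolling(window).sum()
```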
Double Exponential smoothing prediction model
We first calculate $\beta_t=Y_t/X_t$. We then do double smoothing by:
$$S_t = \alpha \beta_t+(1-\alpha)S_{t-1}$$
$$T_t=\alpha S_t + (1-\alpha)T_{t-1}$$
Using these the prediction of beta at time period $t+1$ is
$$\hat{\beta}_{t+1} = \Bigg[2S_t-T_t\Bigg] + k \Bigg[\frac{\alpha}{1-\alpha}(S_t-T_t)\Bigg].$$
$k$ is the number of look-back periods. The optimized values of $\alpha$ and $k$ are 0.8126 and 30 respectively.
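A sketch of the DESP forecast exactly as written above (the initialization of the smoothed series is my choice):

```python
import numpy as np

def desp_beta_forecast(y, x, alpha=0.8126, k=30):
    """Double exponential smoothing prediction of beta_{t+1} from beta_t = y_t / x_t."""
    beta = np.asarray(y, float) / np.asarray(x, float)
    s = np.empty_like(beta)
    t = np.empty_like(beta)
    s[0] = t[0] = beta[0]                      # seed both smoothers with the first beta
    for i in range(1, len(beta)):
        s[i] = alpha * beta[i] + (1 - alpha) * s[i - 1]
        t[i] = alpha * s[i] + (1 - alpha) * t[i - 1]
    # one-step-ahead forecast per the formula above
    return (2 * s - t) + k * (alpha / (1 - alpha)) * (s - t)
```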
Time-varying parameter model with Kalman filter
This is better suited than OLS to adaptive parameter estimation. The measurement equation is
$$Y_t=\beta_t X_t+\epsilon_t$$
and the state equation is
$$\beta_t=\beta_{t-1}+\eta_t.$$
The idea behind adding the second equation is the intuition that beta has some structure, i.e. autocorrelation, which can be exploited as additional information for better estimation. The noise ratio is optimized, yielding $3\times10^{-7}$.
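A minimal scalar Kalman filter for this state-space model could look as follows (the initial state, its variance and the normalization of the measurement noise to 1 are my assumptions; the beta path depends essentially on the noise ratio):

```python
import numpy as np

def kalman_beta(y, x, noise_ratio=3e-7):
    """Time-varying beta: state beta_t = beta_{t-1} + eta_t, measurement y_t = beta_t * x_t + eps_t.
    Measurement noise variance is normalized to 1, so the state noise variance equals the noise ratio."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    r, q = 1.0, noise_ratio
    beta, p = 0.0, 1e6                           # diffuse-ish initial state and variance (my choice)
    betas = np.empty(len(y))
    for t in range(len(y)):
        p += q                                   # predict step (random-walk state)
        k = p * x[t] / (x[t] * p * x[t] + r)     # Kalman gain
        beta += k * (y[t] - beta * x[t])         # update with the new observation
        p *= (1.0 - k * x[t])
        betas[t] = beta
    return betas
```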
The pair trading model
Choosing the pairs within an industry makes us immune to industry-wide shocks. The spread between the pair is calculated as $z_t=P_{Y_t}-\beta_{t}P_{X_t}$. We did not include a constant in any of the models. This spread is normalized by subtracting the mean and dividing by the standard deviation. Entry is at 2 standard deviations and exit near 0.5 standard deviations. Once the entry is triggered, we wait one period before we enter. We choose a money-neutral investment by putting equal money in the two sides (irrespective of $\beta$). There is no re-balancing. When the normalized spread returns to its long-term mean, it is caused by a combination of two things: real reversal of the spread and adaptation of beta to a new equilibrium value - so the dollar value may not fully revert even when the spread has.
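A sketch of the signal generation under these rules (the normalization would use in-sample mean and standard deviation in practice, and the one-period shift delays entries and exits alike as a simplification):

```python
import pandas as pd

def pair_signals(spread: pd.Series, entry: float = 2.0, exit_: float = 0.5):
    """Position in the normalized spread: -1 short, +1 long, 0 flat."""
    z = (spread - spread.mean()) / spread.std()   # in practice: in-sample mean/std
    pos, state = pd.Series(0.0, index=spread.index), 0
    for t in range(len(z)):
        if state == 0:
            if z.iloc[t] > entry:
                state = -1        # spread too high: short Y, long X
            elif z.iloc[t] < -entry:
                state = +1        # spread too low: long Y, short X
        elif abs(z.iloc[t]) < exit_:
            state = 0             # exit near 0.5 standard deviations
        pos.iloc[t] = state
    return z, pos.shift(1).fillna(0.0)  # wait one period before acting on a signal
```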
In-sample indicators are used with the objective of predicting out-of-sample performance:
1) t-stat from ADF test on the residuals of the OLS regression.
2) the information ratio
3) half life of mean-reversion.
The half-life is given by $\ln(2)/k$, where $k$ is the median-unbiased estimate of the strength of mean reversion in the OU equation
$$dz_t = k(\mu-z_t)dt+\sigma dW_t$$
where $z_t$ is the value of the spread and $\sigma$ is the standard deviation. The higher the $k$, the faster the spread tends to revert to its long-term mean. In-sample IR is also used as a metric (an IR of 2 roughly means the strategy is profitable every month, an IR of 3 every day). The IR is overestimated if the returns are auto-correlated.
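The half-life can be estimated from the fitted spread; a simple stand-in for the median-unbiased estimator is an OLS regression on the discretized OU process:

```python
import numpy as np
import statsmodels.api as sm

def half_life(spread) -> float:
    """Half-life from the discretized OU process: regress dz_t on z_{t-1},
    dz_t = a + b*z_{t-1} + u_t, and use half-life = -ln(2)/b (b < 0 for a mean-reverting spread)."""
    z = np.asarray(spread, float)
    b = sm.OLS(np.diff(z), sm.add_constant(z[:-1])).fit().params[1]
    return -np.log(2.0) / b
```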
Out of sample performance
A trading cost of 30 bps one way is assumed. The best results come out at the 30-minute interval. The Kalman filter is the best among fixed beta, rolling OLS, DESP and Kalman, and produces the smoothest beta (Table 3-3).
Further investigations
Relationship between the in-sample t-stats and the out-of-sample information ratio
The in-sample t-stat of the fit is positively correlated with the out-of-sample information ratio for frequencies up to 10 minutes. Beyond this, the correlation is statistically indistinguishable from 0.
Relationship between t-stats for different high-frequency data and pairs
That trading pairs have similar t-stats across all frequencies is ascertained by the first principal component explaining almost all of the variance (after standardizing the t-stats of the ADF test for all pairs). This has the following implication: once a pair has been found to be cointegrated at a certain frequency, it tends to be cointegrated across all frequencies.
Does cointegration in daily data imply higher frequency cointegration
The bootstrapped confidence interval for the correlation between the t-stats (of the ADF test) on daily data and on 5-minute data is [-0.03, 0.33]. Hence, cointegration found at daily frequency implies there is cointegration at the 5-minute interval as well.
Does in-sample information ratio and the half-life of mean reversion indicate what the out-of-sample information ratio will be?
Using bootstrapped confidence bounds, the in-sample information ratio can positively predict the out-of-sample information ratio to a certain extent. Also, there is a negative relation between the half-life of mean reversion and the subsequent out-of-sample information ratio.
A diversified pair trading strategy
Using the indicators presented above, the best 5 pairs are selected. Best in-sample IR gives attractive out-of-sample performance. Half-life of mean reversion does not work out. The in-sample t-stat of the ADF test of the cointegrating regression as an indicator only works for 5- to 10-minute strategies. A combination is worse than the individual indicators. Finally, a daily IR of 1.34 and a high-frequency IR of 3.24 come out better than a simple long position.
Ch 4 - Profitable Pair Trading: A comparison using the S&P 100 constituent stocks and the 100 Most liquid ETFs
The greatest known risk to pairs trading is a stock going bankrupt. ETFs can avoid that. But are they equally profitable? It turns out they are more profitable than stocks, based on an adaptive long-short strategy (IR of 1 vs 0), extending the in-sample period (1.7 vs 0.2) and preselecting pairs based on in-sample IR (2.93 vs 0.46). The hedge ratio can be made time-adaptive via the Kalman filter. The pairs trading strategy in its basic form might be becoming unprofitable.
Datastream is used to get data for the 100 most liquid ETFs and the S&P 100 stocks. In-sample periods of 3/4 and 5/6 of the data are used. Based on whether there is cointegration or not, 428 ETF pairs and 693 stock pairs are evaluated.
Methodology
Bollinger bands are used, in general with a 20-day moving window and 2-standard-deviation bands for entry/exit triggers. These parameters are optimized for maximum in-sample IR and differ from one pair to another.
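A sketch of the band construction (the exact exit rule and the optimized window/width differ per pair, as noted above):

```python
import pandas as pd

def bollinger_bands(spread: pd.Series, window: int = 20, width: float = 2.0):
    """Moving average and upper/lower bands of the spread; a move outside a band
    triggers an entry, and reversion back (e.g. to the moving average) an exit."""
    ma = spread.rolling(window).mean()
    sd = spread.rolling(window).std()
    return ma, ma + width * sd, ma - width * sd
```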
Model
The spread is calculated using an adaptive beta from the Kalman filter, based on prices. The noise ratio $Q/H$ is optimized: increasing the ratio makes the beta more adaptive, decreasing it makes the beta smoother. A constant level is not used, to reduce the number of parameters. We invest the same amount of dollars on each side of the trade. Once invested, we wait for the spread to revert back. The initial money-neutral positions are not dynamically rebalanced.
Out of sample results
With 75% in-sample, the IRs for ETFs and stocks are 1.06 and 0.08 respectively. These increase to 1.71 and 0.22 respectively for 83% in-sample. The ETFs used are index trackers, so they carry less idiosyncratic risk than shares. An index divergence is more likely to reverse than a stock divergence, where the reason could be more fundamental. The much better results for ETFs could also stem from stronger autocorrelations of ETF pairs compared to shares. Lower traded volumes (only marginally lower) also make the ETF market less competitive.
Results for the best 50 pairs
The correlation between in-sample and out-of-sample IR is 0.24 for ETFs and 0.14 for stocks. This motivates using the better-performing in-sample pairs out of sample. Doing so increases the IR to 1.58 and 0.13 for the 75% in-sample case for ETFs and stocks respectively, and to 2.93 and 0.46 for the 83% in-sample case.
Conclusions
- ETFs are better than stocks because of the near absence of idiosyncratic risk in ETFs.
- Decreasing the out-of-sample period improves performance. Hence, re-estimating the model once per week should improve the results.
- In-sample IR predicts out-of-sample IR.
Ch 5 - Mean Reversion based on Autocorrelation: A comparison using the S&P 100 constituents and the 100 most liquid ETFs
A simple strategy based on the normalized previous period's return and the actual conditional autocorrelation can give traders an edge. ETFs are more suitable than stocks, and half-daily frequency improves the performance.
Introduction
- Form pairs with a 30-day trailing conditional correlation above the threshold of 0.8.
- Eliminate pairs with a previous day's normalized spread return smaller than 1.
- Select pairs with first-order autocorrelation within certain bounds.
Two different samplings - daily and half-daily - are used, with 4-year in-sample and out-of-sample periods.
Contrarian profits, explained by the overreaction hypothesis causing negative autocorrelation, have decreased in recent periods (Khandani and Lo 2007). Higher frequencies still have some juice (Dunis et al. 2010). Market neutral strategies have been shown to be exposed to general market factors. S&P 100 stocks and 100 ETFs are used, with each investment held for exactly one trading period.
Methodology
The JPMorgan (1996) RiskMetrics method is used to calculate conditional (time-varying) volatility and conditional correlation (cutoff 0.8) over a period of 30 days:
$$cov(r_A, r_B)_t=\lambda\, cov(r_A,r_B)_{t-1}+(1-\lambda)r_{A,t} r_{B,t},$$
where $\lambda$ is the constant 0.94, corresponding to roughly 30 days. The return of the spread is simply the difference of the returns of the constituents. The conditional autocorrelation of the pair is calculated as
$$\rho_t=\frac{cov(r_t,r_{t-1})_t}{\sigma_t \sigma_{t-1}},$$
where $r_t$ is the return of the spread pair. The conditional covariance of the pair is calculated as
$$cov(r_t,r_{t-1})_t=\lambda\, cov(r_t,r_{t-1})_{t-1}+(1-\lambda)r_t r_{t-1}.$$
The normalized return of the spread is simply
$$R_t=\frac{r_t}{\sigma_t}.$$
We only trade pairs with normalized returns above 1. If the autocorrelation is negative we bet on a reversal; otherwise we bet the pair will continue to move in the same direction as in the current period, with each pair held only for one period. The 5 pairs with the highest normalized returns are chosen.
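A sketch of these calculations under the stated recursions (the seeding of the EWMA and the treatment of the first lag are my choices):

```python
import numpy as np
import pandas as pd

LAMBDA = 0.94  # RiskMetrics decay constant used in the chapter

def ewma_cov(x: pd.Series, y: pd.Series, lam: float = LAMBDA) -> pd.Series:
    """cov_t = lam * cov_{t-1} + (1 - lam) * x_t * y_t, seeded with the first cross-product."""
    prod = (x * y).to_numpy()
    out = np.empty(len(prod))
    out[0] = prod[0]
    for t in range(1, len(prod)):
        out[t] = lam * out[t - 1] + (1 - lam) * prod[t]
    return pd.Series(out, index=x.index)

def pair_stats(r_a: pd.Series, r_b: pd.Series):
    """Spread return, conditional autocorrelation and normalized return of a pair."""
    r = r_a - r_b                                       # return of the spread
    sigma = np.sqrt(ewma_cov(r, r))                     # conditional volatility
    rho = ewma_cov(r, r.shift(1).fillna(0.0)) / (sigma * sigma.shift(1))
    return r, rho, r / sigma                            # r_t, rho_t, normalized return R_t
```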
Trading results
A trading cost of 20 bps per pair trade is assumed. Net-of-cost IRs for the in-sample and out-of-sample top 1, 5, 10 and 20 best pairs across the different autocorrelation ranges are all negative for stocks. The results are positive both in-sample and out-of-sample for ETFs (5, 10, 20 pairs) for the range -0.4 to 0 (but not -1 to -0.4).
At half-daily frequency the results are better but still not good enough for shares. For ETFs the results are stupendous over the full negative autocorrelation range. The positive autocorrelation range is not as productive.
The out-of-sample results are consistent until 2009, after which the equity curve is flat. Adding more pairs makes the equity curve more consistent.
Ch 6 - Profitable Mean Reversion after large price drops: A story of Day
and Night in the S&P500, 400 Mid Cap and 600 Small Cap Indices
Open-to-close (day) and close-to-open (night) returns both carry information. The worst performing shares during the day (resp. night) are bought and held during the night (resp. day). The alpha is not explained by the Fama-French 3-factor and the adjusted Carhart 5-factor models.
Literature review
Contrarian returns have been shrinking (Khandani and Lo 2007). Most strategies use close-to-close information and do not take the opening prices into account. The existence of contrarian profits can be explained by the overreaction hypothesis (Lo and MacKinlay 1990), which assumes negative autocorrelation. De Bondt (1985) shows that, with 3-year rebalancing, past losers beat past winners, with the outperformance continuing as late as 5 years after the portfolios have been formed. Predictability of short-term returns is exploited either by momentum or by reversion. Serletis and Rosenberg (2009) show that the Hurst exponents for the four major US stock market indices during 1971-2006 display mean-reverting behavior. Bali (2008) finds that the speed of mean reversion is higher during periods of large falls in prices.
De Gooijer et al. (2009) find a non-linear relationship between the overnight price and the opening price. Cliff et al. (2008) show that night returns are positive while day returns are close to 0. The effect is partly driven by higher opening prices which decline during the first trading hour of the session.
Financial Data
Constituent stocks of the S&P 500, S&P 400 MidCap and S&P 600 SmallCap are used, with adjusted price data from 2000-2010 and a trading cost of 5 bps one way. We calculate open-to-close (day) returns and close-to-open (night) returns. The average return of holding the shares during the day and during the night is very similar for the constituent stocks of the S&P 500 index, and slightly positive for both. For the S&P 400 MidCap the day returns are positive and the overnight returns negative, similar to the S&P 600 SmallCap. These differences are not profitable after trading costs.
Trading Strategy
The strategy exploits the mean-reverting behavior of the largest losers, either during the day or the night. Version 1 (day holding) buys the n worst performing shares of the close-to-open period (decision period), with shares bought at the market open and sold at the market close, equally weighted. Version 2 (night holding) buys the n worst performing shares of the open-to-close period (decision period). The benchmark strategy buys the n worst losers based on full-day (close-to-close) returns.
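A sketch of Version 1 under these definitions (the input DataFrames of open and close prices, the value of n and the tie-breaking rule are assumptions; trading costs are ignored):

```python
import pandas as pd

def day_night_returns(open_px: pd.DataFrame, close_px: pd.DataFrame):
    """Open-to-close (day) and close-to-open (night) returns; rows are dates, columns are tickers."""
    day = close_px / open_px - 1.0
    night = open_px / close_px.shift(1) - 1.0
    return day, night

def version1_day_holding(open_px: pd.DataFrame, close_px: pd.DataFrame, n: int = 10) -> pd.Series:
    """Version 1: buy the n worst night (close-to-open) performers at the open,
    hold until the close of the same day, equally weighted."""
    day, night = day_night_returns(open_px, close_px)
    losers = night.rank(axis=1, method="first") <= n     # decision period: the preceding night
    weights = losers.astype(float)
    weights = weights.div(weights.sum(axis=1), axis=0)   # equal weights among the picks
    return (day * weights).sum(axis=1)                   # holding-period (day) portfolio return
```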
Strategy Performance
For the S&P 600 SmallCap, the first two deciles (stocks with the largest declines during the decision period) produce high IRs and the last two negative ones (a short strategy would work there, but is not examined here). This holds true for both the day and night strategies. There is a clear structure going from the top to the bottom deciles. Overreaction is not as strong for mid-cap stocks as it is for small caps, but the pattern is similar and the extreme deciles are profitable.
The benchmark strategy (close-to-close decision period with the subsequent close-to-close as the holding period) has been unprofitable more recently for the Small, Mid and S&P 500 cross sections. Version 1 and Version 2 have been more profitable.
Park (1995) claims that the profitability of mean reversion strategies disappears once the average bid-ask price is used instead of the closing price, i.e. the most significant part of the close-to-close contrarian strategy is caused by the bid-ask bounce and is not achievable in practice. The two versions shown here are better than the benchmark (close-to-close), and hence this strategy is immune to the bid-ask bounce.
Multi-factor Models
Style factors:
- CAPM by Sharpe (1964) - market returns.
- Fama and French 3-factor model (1992) - Mkt, small-minus-big, value-minus-growth.
- Adjusted Carhart 5-factor model (1997) - Mkt, small-minus-big, value-minus-growth, momentum (high returns minus low returns, M2 to M12) and reversal (low returns minus high returns, M1).
$\alpha$ comes out positive in each case. The momentum factor loading turns out to be negative while the reversal factor loading comes out positive, as expected.
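A sketch of how such an alpha check might be run, assuming a DataFrame of factor returns (column names are illustrative) and a series of strategy returns in excess of the risk-free rate:

```python
import statsmodels.api as sm

def factor_alpha(excess_ret, factors):
    """Regress strategy excess returns on style factor returns; the intercept is the per-period alpha.
    `factors` might have columns like ['Mkt-RF', 'SMB', 'HML', 'MOM', 'REV'] (names are illustrative)."""
    res = sm.OLS(excess_ret, sm.add_constant(factors), missing="drop").fit()
    return res.params["const"], res.tvalues["const"], res
```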
Ch 7 - General Conclusions
Two ways to improve trading results:
- Using more data - higher frequency, a bigger universe. Even including opening prices can be hugely beneficial. Getting the opening price and processing it instantly is a challenge.
- Using more advanced modelling - the Kalman filter can be fast and efficient compared with OLS. Factor-neutralizing the pairs ratio (not only industry-neutral, as done here) could further improve the results. Neural networks and SVMs could be used to predict the future direction of spreads instead of using a fixed standard-deviation level for the spread entry rule.
Delving more into model complexity, as opposed to data complexity, would be more beneficial.