Sunday, June 18, 2017

paired cross-validation for better model selection

Source: Ensemble Methods: Foundations and Algorithms. Section 1.3

We model to generalise. Hence, test error is the main criterion of expected performance. If one is fitting many models, their performance should be compared on the test set. For selecting the parameters of a specific model, one needs a validation set (apart from the training and test sets). When data is scarce, we use k-fold cross-validation for validation and testing. A simple comparison of average generalisation error estimates is not reliable, since the winning algorithm may occasionally perform well due to the randomness in the data split. That is where the paired cross-validation t-test comes into play.

Let's take the example of $5 \times 2$ cv paired t-test. We run 5-times 2-fold cross-validation.

1) student-t hypothesis test (to compare a few models): Run 2-fold cross-validation 5 times. In each run, two algorithms a and b are trained on each half and tested on the other. The error difference on fold $i$ is $d^{(i)}  = err_a^{(i)} - err_b^{(i)}$, for $i \in \{1,2\}$. The mean and variance of this error difference can be calculated as $\mu = (d^{(1)} + d^{(2)})/2$ and $s^2 = (d^{(1)} - \mu)^2 +(d^{(2)} - \mu) ^2$. Let $s_i^2$ denote the variance in the $i^{th}$ run of 2-fold cross-validation, and $\mu^{(i)}$ the mean error difference in that run. One can take the mean error difference over the 5 runs and get the t-statistic as
\[t = \frac{\frac{1}{5}\sum_{i=1}^{5} \mu^{(i)}}{\sqrt{\frac{1}{5}\sum_{i=1}^{5} s_i^2}}\]
Under the null hypothesis that the two algorithms perform equally, this should be approximately distributed according to a t-distribution with 5 degrees of freedom. (Dietterich's original 5x2 cv statistic uses the single error difference from the first fold of the first run in the numerator, rather than the averaged mean.)
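The statistic above can be sketched in a few lines of Python. This is a minimal illustration, assuming the per-fold error rates have already been collected; the function name `five_by_two_cv_t` and the error rates are hypothetical, and the numerator follows the averaged form written above.

```python
import math

def five_by_two_cv_t(err_a, err_b):
    """err_a, err_b: five (fold1, fold2) error pairs, one pair per 2-fold run."""
    means, variances = [], []
    for (a1, a2), (b1, b2) in zip(err_a, err_b):
        d1, d2 = a1 - b1, a2 - b2            # error differences on the two folds
        mu = (d1 + d2) / 2.0                 # mean difference in this run
        means.append(mu)
        variances.append((d1 - mu) ** 2 + (d2 - mu) ** 2)
    pooled = sum(variances) / len(variances)
    if pooled == 0.0:
        raise ValueError("zero variance: the statistic is undefined")
    return (sum(means) / len(means)) / math.sqrt(pooled)

# Hypothetical error rates, for illustration only.
err_a = [(0.20, 0.22), (0.19, 0.23), (0.21, 0.20), (0.22, 0.24), (0.20, 0.21)]
err_b = [(0.18, 0.17), (0.16, 0.19), (0.17, 0.18), (0.18, 0.19), (0.17, 0.16)]

t_stat = five_by_two_cv_t(err_a, err_b)
T_CRIT_5DF = 2.571  # two-sided 5% critical value of Student's t with 5 d.o.f.
print(t_stat, abs(t_stat) > T_CRIT_5DF)
```

The null hypothesis of equal performance is rejected at the 5% level when the absolute value of the statistic exceeds the critical value.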

2) McNemar's test (when the model is expensive): We calculate $err_{01}$, the number of instances where the first algorithm is wrong while the second is correct, and similarly $err_{10}$. Under the null hypothesis that the two algorithms have the same error rate, the quantity
\[\frac{(|err_{01}-err_{10}|-1)^2}{err_{01}+err_{10}}\]
will have a $\chi^2_1$ distribution.
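As a small sketch of the computation, the statistic can be evaluated directly from the two disagreement counts. The counts below are hypothetical, and the helper name `mcnemar_statistic` is an assumption for illustration.

```python
def mcnemar_statistic(err_01, err_10):
    """Continuity-corrected McNemar statistic from the two disagreement counts."""
    total = err_01 + err_10
    if total == 0:
        raise ValueError("the two classifiers never disagree")
    return (abs(err_01 - err_10) - 1) ** 2 / total

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi-square with 1 d.o.f.

stat = mcnemar_statistic(err_01=10, err_10=25)  # hypothetical counts
print(stat, stat > CHI2_1_CRIT_5PCT)
```

Only a single train/test pass per model is needed, which is why the test suits expensive models.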

3) Friedman test (compare many models): First, we rank the algorithms on each dataset according to their average errors. We then average the ranks of each algorithm over all the datasets and use the critical difference value (Nemenyi post-hoc test) to get the confidence interval. Two algorithms whose average ranks differ by more than the critical difference are inferred to be statistically different.
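The ranking step and the critical difference can be sketched as follows. This is an illustration under stated assumptions: the error table is made up, the helper names are hypothetical, and the critical difference uses the commonly quoted form $CD = q_\alpha\sqrt{k(k+1)/(6N)}$ for $k$ algorithms and $N$ datasets, with $q_\alpha$ read from a table of the Studentized range statistic.

```python
import math

def average_ranks(errors):
    """errors[d][a]: error of algorithm a on dataset d; rank 1 = lowest error."""
    n_algos = len(errors[0])
    totals = [0.0] * n_algos
    for row in errors:
        order = sorted(range(n_algos), key=lambda a: row[a])
        pos = 0
        while pos < n_algos:
            end = pos
            while end + 1 < n_algos and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2.0 + 1.0   # tied algorithms share the mean rank
            for j in range(pos, end + 1):
                totals[order[j]] += shared
            pos = end + 1
    return [t / len(errors) for t in totals]

def nemenyi_cd(q_alpha, n_algos, n_datasets):
    """Critical difference between average ranks."""
    return q_alpha * math.sqrt(n_algos * (n_algos + 1) / (6.0 * n_datasets))

# Hypothetical errors of 3 algorithms on 4 datasets.
errors = [[0.10, 0.12, 0.20],
          [0.15, 0.14, 0.25],
          [0.08, 0.11, 0.19],
          [0.20, 0.22, 0.30]]
ranks = average_ranks(errors)
cd = nemenyi_cd(q_alpha=2.343, n_algos=3, n_datasets=4)  # q for k=3 at the 5% level
print(ranks, cd)
```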


Saturday, December 17, 2016

Independence and Exchangeability


Bayesian statistics differs from frequentist statistics in its treatment of unknown values. Bayesian statistics regards probability as an epistemic concept. Under this approach, unknown parameters are given a prior probability distribution. This contrasts with the frequentist approach, where parameters are regarded as unknown constants. Indeed, under the epistemic interpretation, the notion of an unknown constant is a contradiction in terms.

In classical frequentist statistics, the samples are often assumed to be independent and identically distributed (iid) random variables, while in Bayesian statistics they can only be considered as such conditional on the parameter value, which is based on the notion of exchangeability. For example, coin tosses are independent given the numerical value of the probability of Heads, p. Without knowledge of the numerical value of p, the trials are exchangeable but not independent; they become independent only once we condition on the value of p. This is the essence of Bruno De Finetti's celebrated Representation Theorem from 1937.

This theorem asserts that if $\mathbf{x}$ is exchangeable, then it can be represented as a naive Bayes model with the latent parent variable representing some meta-parameter, i.e. the $x_i$s are independent given the value of the parameter. In other words, the elements of $\mathbf{x}$ are IID, conditional on the meta-parameter indexing the distribution of $\mathbf{x}$. Hence, this Representation Theorem shows how statistical models emerge in a Bayesian context: under the hypothesis of exchangeability of the observables $\{X_i\}^{\infty}_{i=1}$, there is a parameter $\Theta$ such that, given the value of $\Theta$, the observables are conditionally independent and identically distributed. Moreover, De Finetti's strong law shows that our opinion about the unobservable $\Theta$ is the opinion about the limit of $\bar{X}_n$ as $n$ tends to $\infty$.
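A small exact calculation makes the coin-toss example concrete. As an assumption for illustration only, take a uniform prior on p; the marginal probability of a particular sequence with $s$ heads in $n$ tosses is then $\int_0^1 p^s(1-p)^{n-s}\,dp = \frac{s!\,(n-s)!}{(n+1)!}$. The function name `seq_prob` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def seq_prob(seq):
    """Marginal probability of a 0/1 sequence when p has a uniform prior (assumption)."""
    n, s = len(seq), sum(seq)
    return Fraction(factorial(s) * factorial(n - s), factorial(n + 1))

# Exchangeable: the probability depends only on the number of heads, not their order.
p_101 = seq_prob([1, 0, 1])
p_110 = seq_prob([1, 1, 0])

# Not independent: P(H, H) differs from P(H) * P(H) once p is integrated out.
p_h = seq_prob([1])
p_hh = seq_prob([1, 1])
print(p_101, p_110, p_h, p_hh, p_h * p_h)
```

Conditional on a fixed p, the same tosses would factorise exactly, which is the conditional independence the theorem describes.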

Saturday, October 3, 2015

Momentum signals in the term structure of commodity futures - Boons, Prado 2015

Basis-momentum (the difference between the momentum of nearby and next nearby contracts) strongly predicts spot returns. It also predicts the spread return. These returns are beyond the classical momentum and carry returns for commodity futures. This does not depend on the presence of institutional investors in commodity markets.

Introduction

The literature states that cross-sectional variation in commodity futures returns is largely driven by the characteristics basis (carry) and momentum. A portfolio sorted on basis-momentum predicts both outright and spread returns with an IR of around 1. This is a 12-1 kind of momentum on the cross-section. Basis-momentum effectively captures the interaction effect between basis and momentum. The motivation for looking at basis-momentum is that there should be additional information in the decision of producers, consumers, and speculators as to where in the futures curve they take their positions, due to seasonality in production and demand.

Methodology

Continuous contracts are rolled on the last day of the month before expiry. The basis is defined as $B(t)=\frac{F_{T_1}(t)}{F_{T_2}(t)}-1$. The momentum is defined as $M(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1}(s))-1$. Finally, the basis-momentum is $BM(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1}(s))-\prod_{s=t-11}^{t-1}(1+r_{T_2}(s))$ and the spread return momentum is $SM(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1-T_2}(s))-1$. Spread returns are defined as $r_{T_1-T_2}(t)=\frac{(F_{T_1}(t)-F_{T_2}(t))-(F_{T_1}(t-1)-F_{T_2}(t-1))}{F_{T_1}(t-1)}$.

We see that $$r_{T_1-T_2}(t) = r_{T_1}(t)-r_{T_2}(t)  + r_{T_2}(t)\frac{B(t-1)}{1+B(t-1)},$$ which, to first order in the monthly returns, translates to $$ SM(t) \approx BM(t) + \sum_{s=t-11}^{t-1}\left(r_{T_2}(s)\frac{B(s-1)}{1+B(s-1)}\right).$$ The second term is the interaction effect, which consists of next nearby momentum and carry momentum.
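The one-period identity can be checked numerically from the definitions. The sketch below uses made-up month-end prices for the nearby and next-nearby contracts and a short window rather than the 11-month window of the definitions; all helper names are hypothetical.

```python
def simple_return(prices, t):
    return prices[t] / prices[t - 1] - 1.0

def basis(f1, f2, t):
    return f1[t] / f2[t] - 1.0

def spread_return(f1, f2, t):
    change = (f1[t] - f2[t]) - (f1[t - 1] - f2[t - 1])
    return change / f1[t - 1]

def compounded(returns):
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total

# Hypothetical nearby (f1) and next-nearby (f2) month-end prices.
f1 = [100.0, 102.0, 101.0, 104.0, 103.0]
f2 = [98.0, 99.5, 99.0, 101.0, 100.5]

# Gap between the spread return and its decomposition, month by month.
gaps = []
for t in range(1, len(f1)):
    b_prev = basis(f1, f2, t - 1)
    r1, r2 = simple_return(f1, t), simple_return(f2, t)
    rhs = r1 - r2 + r2 * b_prev / (1.0 + b_prev)
    gaps.append(abs(spread_return(f1, f2, t) - rhs))

# Basis-momentum over the (short, illustrative) window.
r1_path = [simple_return(f1, t) for t in range(1, len(f1))]
r2_path = [simple_return(f2, t) for t in range(1, len(f2))]
basis_momentum = compounded(r1_path) - compounded(r2_path)
print(max(gaps), basis_momentum)
```

The gaps are zero up to floating-point error because the one-period relation is exact; only the compounded version over several months is an approximation.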

A large literature shows that sorting commodities on the basis (carry) leads to large spot returns. Szymanowska et al. (2014) show that basis also predicts spreading returns. Similarly, a large literature shows that sorting commodities on momentum leads to large spot returns as well. Szymanowska et al. (2014) show that momentum does not predict spreading returns. This paper shows that sorting commodities on basis-momentum outperforms the previous two. Persistence in the tilting of the term structure is what basis-momentum tries to capture.

Tests and results

  1. Does basis-momentum predict returns in the cross-section? We regress the spot and spread returns on the three factors basis, momentum and basis-momentum in two regressions. All three signals have predictive power, but basis-momentum beats the other two. Basis-momentum is the only factor predicting cross-sectional spreading returns.
  2. Is basis-momentum a priced risk factor? We run time-series regressions to determine whether the basis-momentum factors are spanned by the basis and momentum factors. Then we conduct Fama-MacBeth cross-sectional regressions for commodity factor pricing models containing basis, momentum and basis-momentum. Basis-momentum provides the best Sharpe ratio: 0.93 for spot and 0.99 for spreading returns.

Currency Momentum Strategies

Menkhoff, Sarno, Schmeling, Schrimpf 2011

Abstract

Significant cross-sectional spread gives excess returns of 10% pa, not explained by traditional risk factors but explained by under and over reactions of investors. Different from carry trade.

Introduction

Momentum in stocks poses a challenge to standard finance theory. Apart from conventional risk factors, factors like credit risk/bankruptcy risk, limits to arbitrage, under-reaction, or high transaction costs have been proposed.

FX time series momentum strategies like moving average cross-overs, filter rules and channel breakouts deteriorate over time. FX cross-sectional strategies are less examined. We study 1976 - 2010 with 48 currencies. We decompose these momentum returns into systematic and unsystematic risk components, compare momentum strategies to carry and trading rules, quantify the importance of transaction costs, and investigate non-standard sources of momentum returns like under- and over-reaction and limits to arbitrage.

We find evidence of return continuation and subsequent reversal over 36 months. These are different from carry returns and technical trading rules. Momentum profits are skewed towards currencies with high transaction costs. But these returns are not systematically related to standard proxies for business cycle risk, liquidity risk, the carry trade risk factor, volatility risk, the three Fama-French factors, or the Carhart four-factor model. These profits vary significantly over time, suggesting limits to arbitrage. Momentum in countries with a higher risk rating tends to yield significantly positive excess returns. A similar effect is found for a measure of exchange rate stability risk.

Related Literature

Stock market momentum - Well established empirically, explained by

  1. risk-based and characteristic-based explanations: not linked to macroeconomic risk, but firm-specific risks, e.g. stronger in smaller firms, firms with lower credit rating, firms with higher revenue growth volatility, firms with higher likelihood to go bankrupt.
  2. behavioral biases: investors' under-reaction to news; weak analyst coverage causes stronger momentum.
  3. Transaction costs or limits to arbitrage: reasonably high transaction costs may wipe out momentum profits.
Bonds and commodities momentum - Momentum strategies don't work for investment grade bonds or bonds at the country level, but yield positive returns for non-investment grade corporate bonds. Momentum returns are not related to liquidity but seem to reflect default risk in the winner and loser portfolios. High momentum returns in commodities are related to low levels of inventories.

Currency momentum - Mostly time series momentum has been analyzed. 
  1. Technical trading in FX  markets: highly correlated to trend following. Filter rules (like go long if moving returns are >1%) and moving average cross-over rules seem to work. This has slowed down recently.
  2. Contribution of this paper: cross-sectional momentum of FX and its analysis.

Data and currency portfolio

Spot and 1-month forward rates from 1976-2010, end-of-month data, for 48 countries. The interest rate differential (forward discount) contributes a significant share of the excess return of currency investments. We track pure spot returns as well to identify the source of momentum. The long-short portfolio is dollar neutral.
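A dollar-neutral winner-minus-loser portfolio can be sketched as below. This is not the paper's exact portfolio construction: the equal weighting, the choice of `n_side` currencies per leg, the function name and the lagged returns are all assumptions made for illustration.

```python
def momentum_weights(lagged_returns, n_side):
    """Equal-weight long the n_side best and short the n_side worst currencies."""
    if 2 * n_side > len(lagged_returns):
        raise ValueError("not enough currencies for two disjoint legs")
    ranked = sorted(lagged_returns, key=lagged_returns.get)  # worst to best
    losers, winners = ranked[:n_side], ranked[-n_side:]
    weights = {ccy: 0.0 for ccy in lagged_returns}
    for ccy in winners:
        weights[ccy] += 1.0 / n_side
    for ccy in losers:
        weights[ccy] -= 1.0 / n_side
    return weights

# Hypothetical lagged one-month excess returns against the dollar.
lagged = {"AUD": 0.03, "CHF": -0.01, "JPY": -0.02,
          "NOK": 0.01, "GBP": 0.00, "CAD": 0.02}
weights = momentum_weights(lagged, n_side=2)
print(weights, sum(weights.values()))
```

The weights sum to zero, so the long and short legs offset each other in dollar terms.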

Characterizing Currency Momentum Returns


  1. Returns to momentum strategies in currency markets - Returns are driven mostly by spot rate momentum and not by interest rate differentials (unlike carry trades), especially for 1-year momentum with a 1-month holding period. The (1,1) strategy is the best of all. Though the cross-section of currencies is small relative to equities, the performance is still good because correlations among currencies are much lower than among equities.
  2. Out of sample perspective - do specific momentum strategies identified to be attractive in-sample continue to do well? Out of the universe of 144 strategies, we look for momentum in the lagged momentum returns! We find that 1 month lagged best portfolio is equally good (0.94) and hence can be seen as an out of sample test. These strategies have been stable over time.
  3. Comparing momentum and technical trading rules - Moving average cross-overs of 1-20, 1-50 and 1-200 are used as a proxy for technical trading strategies (IR from 0.88 to 0.77). These are correlated to momentum but there is significant economic alpha. Similarly, the cross-sectional momentum strategy has alpha over time series momentum strategies as well.
  4. Comparing currency momentum and the carry trade - Interest rate differentials are strongly auto-correlated and spot rate changes do not seem to adjust to compensate for this interest rate differential (forward rate puzzle). Hence, it may be the case that lagged high returns simply proxy for lagged high interest rate differentials and that cross-sectional momentum is simply carry. We show that this is not the case. The carry trade has negative skewness while momentum has slightly positive skewness. The high-low momentum strategies are uncorrelated with high-low carry strategies. Double sorting (divide currencies into two portfolios based on the median lagged forward discount and then divide each into three portfolios based on lagged returns) shows no material difference in long-short momentum returns among high vs low interest rate currencies. Cross-sectional Fama-MacBeth regressions of currency excess returns on lagged excess returns over the last $l$ months, lagged forward discounts and lagged spot rate changes for each month show that lagged spot returns carry the explanatory power.
  5. Post-formation momentum returns - Initial under-reaction is accompanied by over-reaction which gets corrected over the long run. This causes reversal over longer periods. There is a clear pattern of increasing returns which peaks after 8-12 months across strategies and a subsequent period of declining excess returns, more pronounced for momentum strategies with longer formation periods, suggesting equity and currency momentum have similar origins.
Currency momentum seems similar to equity momentum. But the highly liquid FX markets are dominated by professional traders, where irrationality should be quickly arbitraged away. Hence we examine possible limits to arbitrage activity which could explain the persistence of momentum profits in FX markets.

Understanding the results

  1. Transaction cost - The full bid-ask spread is used. The (1,1) momentum returns fall from 10 to 4 percent. FX momentum strategies are much more profitable in the later part of the sample, but they do not always deliver high returns. There is much variation in profitability. Transaction costs can be decomposed into turnover across portfolios and bid-ask spreads across portfolios. Turnover can be extremely high for the (1,1) momentum strategy, up to 70% per month. Winner and loser currencies do have higher transaction costs than the average exchange rate and the markup ranges from about 2.5 to 7 basis points per month. Transaction costs have declined over time due to more efficient trading technologies. This could imply (i) higher momentum returns due to lower trading costs or (ii) lower momentum returns, since lower cost facilitates more capital being deployed for arbitrage activity. Looking at the (1,1) strategy for 1992 to 2010, we find it profitable. Thus, lower bid-ask spreads do not necessarily lead to lower excess returns, which further indicates that trading costs are not the sole driving force behind momentum returns. It also suggests that momentum returns are a phenomenon which is still exploitable.
  2. Momentum returns and business cycle risk - Various univariate regressions on business cycle state variables - real growth in non-durables and service consumption expenditures, nonfarm employment growth, ISM manufacturing index, real industrial production, inflation rate, real money balances, growth in real disposable personal income, TED spread (3m libor - t-bill rate), term spread (20y - 3m tbill rate), carry trade long-short portfolio, global FX volatility - yield no explanatory power. Regression on the Fama-French three factors is also not explanatory.
  3. Limit to Arbitrage: Time-variation in momentum profitability - A plot of 36-month moving window returns shows that there is time variation in performance. Hence, an investor seeking to profit from momentum returns has to have a long enough investment horizon. Since the bulk of currency speculation is accounted for by professional market participants with rather short horizons, this time variation can act as a limit to arbitrage.
  4. Limit to Arbitrage: Idiosyncratic volatility - We investigate whether momentum returns are different between currencies with high or low idiosyncratic volatility (relative to an FX asset pricing model). When we double sort with respect to lagged idiosyncratic volatility and returns we find high idiosyncratic volatility explain higher returns.
  5. Limit to Arbitrage: Country risk - We sort on a measure of country risk and a measure of exchange rate stability risk. Data is based on the International Country Risk Guide (ICRG) database from the Political Risk Services group. We use values relative to the US. Momentum returns are significantly positive and always larger in high-risk countries than in low-risk countries. Hence country risk should be an important limit to arbitrage activity in FX markets. These risk ratings are not simple proxies for interest rate differentials, because the country risk and exchange rate risk are high both for winner and loser momentum currencies. Sorting based on forward discount shows that country risk is highest for carry trade target countries and lowest for carry trade funding currencies. For the top 15 developed countries, the momentum returns are non-existent after transaction costs.

Robustness and additional tests


  1. Capital account restrictions and tradability - 

Thursday, September 17, 2015

Diversified Statistical Arbitrage: Dynamically combining mean reversion and momentum investment strategies - James Velissaris 2010

Abstract

A dynamically adjusted strategy between mean-reversion and momentum (2008, 2009). Stocks are grouped together using PCA. The idiosyncratic return is calculated by comparing the returns of the stock to the returns of the entire group. This residual return often oscillates around a long-term mean. This strategy is dollar neutral and has high turnover. The medium-term momentum strategy trades the 9 sector ETFs, based on technical trading rules. Dynamic allocation was done between the 11 strategies, with rebalancing at the end of each month. Out-of-sample IR of 2.27, with beta of 35%.

Equity mean reversion model

The decomposition of the stock returns is given by $$r_t = \alpha + \sum_{j=1}^n \beta_j F_{jt} + \epsilon_t.$$ PCA of the normalized returns (after data centering and normalization in a 252-day moving window) is used and the first 12 factors are retained. The eigenportfolio returns $F_{jt}$ are given by $\sum_i \frac{v^{(j)}_i}{\sigma_i}R_{it}$. We further neglect the drift in returns. The model we implement for the residual process is $dX_t=k(m-X_t)dt+\sigma dW_t$. The mean reversion time is $\tau = 1/k$. Use stocks with a mean reversion time within 20 days. For the s-score $s=\frac{X_t-m}{\sigma_{eq}}$, go short at +1.25 and get out at +0.75 (similarly for long). Trading cost of 10 bps. The model is two-times levered per side or four-times levered gross (industry standard).
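The s-score rule can be sketched as a small state machine. The sketch assumes $\sigma_{eq} = \sigma/\sqrt{2k}$, the stationary standard deviation of the Ornstein-Uhlenbeck process above, and mirrors the short-side thresholds for the long side (enter at -1.25, exit at -0.75); the function names and parameter values are hypothetical.

```python
import math

def s_score(x, m, sigma, k):
    """Distance of the residual from its mean, in units of equilibrium std."""
    sigma_eq = sigma / math.sqrt(2.0 * k)   # stationary std of the OU process
    return (x - m) / sigma_eq

def next_position(position, s, entry=1.25, exit_=0.75):
    """position: -1 short, 0 flat, +1 long. Returns the position after seeing s."""
    if position == 0:
        if s >= entry:
            return -1          # residual rich: open short
        if s <= -entry:
            return 1           # residual cheap: open long
        return 0
    if position == -1 and s <= exit_:
        return 0               # close short once the s-score falls back to +0.75
    if position == 1 and s >= -exit_:
        return 0               # close long once the s-score rises back to -0.75
    return position

# Hypothetical parameters: k is per day, so tau = 1/k = 10 days passes the 20-day filter.
k, m, sigma = 0.1, 0.0, 0.02
tradable = (1.0 / k) <= 20.0
s = s_score(x=0.06, m=m, sigma=sigma, k=k)
print(tradable, s, next_position(0, s))
```

Using a wider entry than exit threshold avoids flipping in and out of a position when the s-score hovers around a single level.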

Momentum strategy

S&P500 industry sector ETFs, S&P500 ETF and SPY. 60-day and 5-day exponential moving averages are used. The signal is long if the 5d EMA has been above the 60d EMA for the previous 4 or more trading days. In all other scenarios the signal is short. There is no rebalancing of the trade and a 10 bps cost is assumed.
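A minimal sketch of this signal follows. The smoothing constant $\alpha = 2/(\text{span}+1)$ and seeding the EMA with the first price are assumptions, since the notes do not specify them; the function names and price paths are hypothetical.

```python
def ema(prices, span):
    """Exponential moving average, seeded with the first price (assumption)."""
    alpha = 2.0 / (span + 1.0)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1.0 - alpha) * out[-1])
    return out

def crossover_signal(prices, fast=5, slow=60, confirm=4):
    """+1 if the fast EMA was above the slow EMA on each of the previous
    `confirm` days, otherwise -1."""
    fast_ema, slow_ema = ema(prices, fast), ema(prices, slow)
    signals = []
    for t in range(len(prices)):
        window = range(t - confirm, t)      # previous days only, today excluded
        ok = t >= confirm and all(fast_ema[s] > slow_ema[s] for s in window)
        signals.append(1 if ok else -1)
    return signals

rising = [100.0 + i for i in range(30)]     # hypothetical price paths
falling = [100.0 - i for i in range(30)]
print(crossover_signal(rising)[-1], crossover_signal(falling)[-1])
```

Because the signal uses only the previous days, it could be traded at the open of day t without look-ahead.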

In-sample analysis

The 2005-2007 in-sample period shows the mean-reversion strategy being much better than momentum, with an IR of 1.28. The equally weighted strategy has an IR of 0.49.

Optimization and out-of-sample results

There are returns to be made by dynamically optimizing the weights of different strategies. We can use quadratic programming with the objective function and constraints as $$\min_x \frac{1}{2}x^THx+f^Tx \quad \text{s.t.} \quad Ax \le b, \quad A_{eq}x=b_{eq}, \quad lb \le x \le ub.$$
An important input into the process is the lower and upper bounds for each variable. Using expected returns and allocation targets, we can customize the optimization process to best suit our portfolio specifications. The goal of this optimization is to maximize the Sharpe ratio of the diversified portfolio with a penalty for marginal risk contribution. The portfolio was optimized at the end of each month using the returns from the previous 252 trading days. There was no transaction cost used, except a flat 10 bps per trade. The diversified strategy IR is 2.27 vs a static allocation IR of 1.56, out-of-sample. The mean reversion strategy has a beta exposure. Optimization can be used to control beta, volatility and leverage, as well as to control drawdowns.

Conclusion

  • Potential benefit of including both mean-reversion and momentum in portfolio.
  • Did not hedge the beta risk using SPY, but can be done.
  • Momentum signal using PCA eigen-portfolios is not apparent at individual stock level.
  • Potentially greater alpha at finer time scales.
  • Varying time-scales with signal decay for both momentum and mean reversion can be useful.

Wednesday, September 16, 2015

Scaling by correlation matrix



We analyze the effect of scaling a signal by the inverse of the correlation matrix here. We start by assuming that the two assets $A_1$ and $A_2$ have unit variance. This reduces the covariance matrix to the correlation matrix. We assume a simple correlation matrix of the form $$\begin{bmatrix} 1 & c \\ c & 1 \end{bmatrix}.$$ Now let's say we have generated signals $\mu_1$ and $\mu_2$ for the two assets before scaling. This means that the unscaled portfolio can be written as $$\mu_1 A_1 + \mu_2 A_2.$$ Now the inverse of the correlation matrix is $$\frac{1}{1-c^2}\begin{bmatrix} 1 & -c \\ -c & 1\end{bmatrix}.$$ This makes the scaled portfolio (with weights $\Sigma^{-1}\mu$) $$\frac{\mu_1-c\mu_2}{1-c^2}A_1+\frac{\mu_2-c\mu_1}{1-c^2}A_2.$$ We can see that the 'scaled signal' depends on both the 'original signal' ($\mu_1$ and $\mu_2$) and the correlation value ($c$). Another way to look at the 'scaled signal' is to write the portfolio as $$\mu_1\left[\frac{1}{1-c^2}A_1-\frac{c}{1-c^2}A_2\right] + \mu_2\left[\frac{1}{1-c^2}A_2-\frac{c}{1-c^2}A_1\right].$$ This is another way of saying that we trade the same original signal but replace the assets $A_1$ and $A_2$ with the spreads $\left[\frac{1}{1-c^2}A_1-\frac{c}{1-c^2}A_2\right]$ and $\left[\frac{1}{1-c^2}A_2-\frac{c}{1-c^2}A_1\right]$. In the table below we look at this 'spread' for different values of the correlation coefficient $c$. We also show the 'altered' signal value for the assets $A_1$ and $A_2$.
$$
\begin{array}{c|cc|cc}
c & \text{spread traded for } \mu_1 & \text{spread traded for } \mu_2 & \text{scaled signal on } A_1 & \text{scaled signal on } A_2  \\
\hline
+0.9 & 5.3A_1-4.7A_2 & 5.3A_2-4.7A_1 & 5.3\mu_1-4.7\mu_2 & 5.3\mu_2-4.7\mu_1  \\
+0.5 & 1.3A_1-0.7A_2 & 1.3A_2-0.7A_1 & 1.3\mu_1-0.7\mu_2 & 1.3\mu_2-0.7\mu_1 \\
+0.1 & 1.0A_1-0.1A_2 & 1.0A_2-0.1A_1 & 1.0\mu_1-0.1\mu_2 & 1.0\mu_2-0.1\mu_1 \\
0.0 & A_1 & A_2 & \mu_1 & \mu_2\\
-0.1 & 1.0A_1+0.1A_2 & 1.0A_2+0.1A_1 & 1.0\mu_1+0.1\mu_2 & 1.0\mu_2+0.1\mu_1  \\
-0.5 & 1.3A_1+0.7A_2 & 1.3A_2+0.7A_1  & 1.3\mu_1+0.7\mu_2 & 1.3\mu_2+0.7\mu_1   \\
-0.9 & 5.3A_1+4.7A_2 & 5.3A_2+4.7A_1  & 5.3\mu_1+4.7\mu_2 & 5.3\mu_2+4.7\mu_1
\end{array}
$$
For the case of high absolute correlations, as long as $\mu_1$ and $\mu_2$ are comparable, the total portfolio values stay within limits. But if $\mu_1$ and $\mu_2$ differ substantially, huge positive and negative positions can be created, which may be undesirable. This is a likely scenario, as signals are based on recently updated information while the correlations are estimated over a slow-moving window.
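The two-asset case can be reproduced numerically. The sketch below applies the closed-form inverse from above; the function name and the signal values are hypothetical.

```python
def scale_by_inverse_correlation(mu1, mu2, c):
    """Scaled signals for two unit-variance assets with correlation c."""
    if abs(c) >= 1.0:
        raise ValueError("correlation matrix is singular for |c| = 1")
    det = 1.0 - c * c
    return (mu1 - c * mu2) / det, (mu2 - c * mu1) / det

# Coefficients of the table: own weight 1/(1-c^2) and cross weight c/(1-c^2).
for c in (0.9, 0.5, 0.1, 0.0, -0.1, -0.5, -0.9):
    own, cross = 1.0 / (1.0 - c * c), c / (1.0 - c * c)
    print(f"c={c:+.1f}  own={own:.1f}  cross={cross:.1f}")

# Comparable signals versus opposite signals at high positive correlation.
similar = scale_by_inverse_correlation(1.0, 0.9, 0.9)
opposite = scale_by_inverse_correlation(1.0, -1.0, 0.9)
print(similar, opposite)
```

With comparable signals the scaled positions stay small, while opposite signals of the same size are amplified roughly tenfold at $c = 0.9$.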

What if we add a third asset $A_3$ with signal $\mu_3$ which is uncorrelated to the first two assets? We have the correlation matrix as $$\begin{bmatrix} 1 & c & 0 \\ c & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$ the inverse of this matrix is $$\frac{1}{1-c^2}\begin{bmatrix}1 & -c & 0\\ -c & 1 & 0 \\ 0 & 0 & 1-c^2\end{bmatrix}.$$ This results in the following 'altered' portfolio $$\frac{\mu_1-c\mu_2}{1-c^2}A_1+\frac{\mu_2-c\mu_1}{1-c^2}A_2+\mu_3A_3.$$ This shows that the signal of the uncorrelated asset is not changed.

Pairs trading the commodity futures curve - Antti Nikkanen

Notes on Antti Nikkanen Master's thesis Aug 2012

Ch1. Introduction

A commodity futures trading strategy which exploits the roll returns of commodity futures as its main driver of excess return. To minimize the volatility of returns, pairs trading methodology is used to trade the futures curve, with a Sharpe of 3. Liquidity is taken into account with a trading cost of 3.3 bps. Commodities are still not well understood because of a lack of good data, and because a commodity future is a derivative security, is a short-maturity claim on a real asset, and has pronounced seasonality in price levels and volatility.

Ch2. Literature Review

Hong and Yogo (2012) show that aggregate basis (ratio of futures price to commodity price) is the most important predictor of commodity returns. The main factor behind the fluctuation of the aggregate basis is hedging pressure (how much producers short commodity futures to hedge their long positions in the underlying spot).

Erb and Harvey (2006) show that roll returns explain more than 90% of long-run cross-sectional variation of commodity futures returns over 1982-2004. The time-series variation of future returns is mostly explained by spot price movement. To become spot neutral the author creates spreads.

Fuertes and Miffre (2010) show tactical positioning of shorting contangoed and going long backwardated futures. They also include momentum.

Gorton and Rouwenhorst (2005) state that commodity futures returns are negatively correlated with equity and bond returns. But this low correlation exists only in 'normal' markets. The spread strategy reduces correlation even in 'abnormal' markets.

Ch3. Theory

Commodity markets do not fit the CAPM (Bodie and Rosansky 1980) because it is difficult to make a distinction between systematic risk/return and unsystematic risk/return. Also, the price is dependent on demand and supply factors, not perceived adequate risk premiums.

Stocks (like the Finnish mining company Talvivaara) closely follow the price of the underlying commodity (nickel). But many companies, especially the oil companies, have hedged away their oil exposure, e.g. ExxonMobil. With commodity ETFs there may be a large tracking error, e.g. USO is an oil ETF but massively lagged the movements in oil prices after the 2008 crash due to rolling the portfolio in times of negative roll returns. GLD, on the other hand, tracks spot gold quite closely.

Less than 1% of futures contracts result in a delivery of the underlying asset. Commodity futures do not represent direct exposures to actual commodities. They are bets on expected future spot prices (Gorton and Rouwenhorst 2005). The relationship between the futures and spot price is $F=Se^{(r+c-y)(T-t)}$, where $r$ is the risk free rate, $c$ is the storage cost (storage facilities, insurance, inspections, transportation and maintenance, spoilage and financing), $y$ is the convenience yield (ability to profit from local supply demand imbalances, leasing of gold to jewelry manufacturers).
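The cost-of-carry relation can be turned around to back out the convenience yield from observed prices. A minimal sketch with made-up inputs, taking rates as continuously compounded and annualised and time to maturity in years (both are assumptions); the function names are hypothetical.

```python
import math

def futures_price(spot, r, c, y, tau):
    """F = S * exp((r + c - y) * tau), with tau = T - t in years."""
    return spot * math.exp((r + c - y) * tau)

def implied_convenience_yield(spot, fut, r, c, tau):
    """Solve the cost-of-carry relation for y given an observed futures price."""
    return r + c - math.log(fut / spot) / tau

# Hypothetical inputs: 2% risk-free rate, 3% storage cost, 8% convenience yield.
fut = futures_price(spot=100.0, r=0.02, c=0.03, y=0.08, tau=0.5)
y_back = implied_convenience_yield(spot=100.0, fut=fut, r=0.02, c=0.03, tau=0.5)
print(fut, y_back)
```

When the convenience yield exceeds the financing and storage costs, the futures price sits below spot, which is the backwardated case.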

Economics of backwardation and contango

Upward sloping (contango) and downward sloping (backwardation) curves are determined by demand, supply and seasonal changes. For a hedger who is inherently long (a petroleum producer long crude through exposure to oil exploration, development, refining and marketing), speculators are going to take the long risk only if the futures price is sufficiently discounted versus the spot price, i.e. the curve is in backwardation (Anson 2009). Contango occurs for commodities in which the hedger is inherently short the commodity (e.g. an aircraft manufacturer that does not own aluminum mines and is willing to purchase futures contracts for future aluminum delivery). Hence, profits for the speculator are determined by how much demand the hedgers have for risk capital, not by the long-term price trends of the commodity markets (Anson 2009).


Hicks' rational expectations hypothesis states that the price of an asset for delivery in the future must be the market's current forecast of the spot price on the future delivery date (spot does not move in the absence of any further information). This has not proven useful in practice. Storage models have been better at explaining observed behaviour; they state that the relationship between the spot and futures price depends on storage levels and expected storage levels in the future (i.e. inventory). This means there is an expectation of the spot price to move as well through maturity. A difficult-to-store commodity (NG) has a steep forward curve. When inventories are high relative to demand, the curve will be upward-sloping, and when they are tight, downward-sloping (Till, Feldman 2006). These difficult-to-store commodities (HO, HG, LC, LH) have the highest average excess returns versus easy-to-store commodities.

Commodity futures returns composition

Commodity futures returns are the sum of the spot return, the risk-free rate and the roll return. Commodity markets are prone to sudden spot price rises but show a mean-reverting tendency over longer periods.

CTAs

Generally trend following, in contrast to market timing strategies where statistical techniques are used to predict the trends before they become apparent. Managed futures strategies are either technical or fundamental in either systematic or discretionary manner. Most do technical systematically. Bridgewater, an exception, does fundamental systematically, e.g. in 2008 they spotted the possibility for either an inflationary or a deflationary deleveraging through contraction in private credit growth, declining stock market and a widening credit spread and adjusted their positions based on 1920s Germany, 1980s Latin American inflationary deleveraging and the deflationary deleveraging of Great depression in the 1930s and Japan in 1990s (Schwager 2012).

A hedge against inflation

In inflationary periods, usually long commodity future positions benefit and stock and bond returns are negatively impacted, because the purchasing power of the money declines and earning power of the corporation erodes.

Pairs Trading

Johansen test can check the cointegration of multiple time series at a time. It is a relative strategy and does not care about absolute value of the assets. With stocks, it is more common that just one of the assets is over or under priced (Gatev, Goetzmann, Rouwenhorst 2006). For futures curve, even the underpriced contracts when in contango, usually have a negative expected return.

The main reason to pairs trade the futures curve is to hedge price movement risk and capture only the roll return part of the commodity futures return. This strategy could be made more profitable by adjusting it dynamically.

For two time series to move together there needs to be something called the error correction, which causes correction of prices and hence mean reversion. Usually the order of integration is first determined with a unit root test before running an actual cointegration test (crucial to check with common sense and graphics). Augmented Dickey-Fuller test takes care of the autocorrelation in the difference variable series. Johansen test is based on the error-correction representation of the VAR equation and testing for reduced rank and then using Granger's representation theorem to get the cointegration vector.

Ch4. Empirical work

1991 to 2012. Daily frequency of 12 nearest contracts of 20 commodities. Transaction cost of 3.3 bps per leg per trade and contracts with open interest less than 20000 not traded.

Methodology

  1. Determine the shape (contango vs backwardation) by taking the difference of the first five contracts, and taking an average of them. $$\frac{1}{5}\sum_{i=1}^5(f_i-f_{i+1}).$$
  2. If the result is positive (backwardation), go long the 'most' backwardated contract (maximum absolute slope), which is equivalently the one furthest from its path regarding its cointegration with the other data points in the curve. The position is taken in the further contract.
  3. The short position is determined by taking the smallest value of differenced contracts and going short on the further contract.
  4. The pair is chosen only if both have open interest more than 20000.
  5. If contango, the process is the same but reversed. Take a short position in the largest absolute difference and a long position in the smallest absolute difference.
  6. At the start of each month the portfolio is set up for next 30 days, with equal weights.
All the commodity curves are found to be cointegrated. The information ratio is 3.1 for monthly rebalancing. All assets show positive returns. This can be split between roll returns (alpha generation) and hedged returns (to reduce volatility). Feeder cattle is invested only 3% of the time period while CL is invested 100%. The daily traded strategy is similar, with more trading cost but good returns.
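The selection steps above can be sketched as follows. This is one reading of the description rather than the thesis's exact implementation: contracts are zero-indexed from the nearest, the steepest and flattest segments are chosen by absolute difference, both legs are taken in the further contract of their segment, and a flat curve is treated as contango. The function names, prices and open interest figures are hypothetical.

```python
def curve_slope(curve, n=5):
    """Average of the first n adjacent differences f_i - f_{i+1}."""
    diffs = [curve[i] - curve[i + 1] for i in range(n)]
    return sum(diffs) / n, diffs

def choose_pair(curve, open_interest, min_oi=20000, n=5):
    """Return the long and short contract indices, or None if liquidity fails."""
    slope, diffs = curve_slope(curve, n)
    steep = max(range(n), key=lambda i: abs(diffs[i]))   # steepest segment
    flat = min(range(n), key=lambda i: abs(diffs[i]))    # flattest segment
    far_steep, far_flat = steep + 1, flat + 1            # further contract of each
    if min(open_interest[far_steep], open_interest[far_flat]) <= min_oi:
        return None
    if slope > 0:   # backwardation: long the steep leg, hedge with the flat leg
        return {"shape": "backwardation", "long": far_steep, "short": far_flat}
    return {"shape": "contango", "long": far_flat, "short": far_steep}

# Hypothetical backwardated curve (nearest contract first) and open interest.
curve = [100.0, 97.0, 95.5, 95.0, 94.0, 93.5]
open_interest = [90000, 60000, 50000, 40000, 30000, 25000]
pair = choose_pair(curve, open_interest)
print(pair)
```

The long leg earns the roll along the steep part of the curve while the short leg hedges most of the spot price movement.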

Improvements

  1. The current strategy is suboptimal in terms of when to trade.
  2. Entry should be based on price deviations from the equilibrium level.
  3. Best 5 instead of all would produce better results.
  4. Choose the 'hedging pair' from the real (signed) difference of the futures prices and not the absolute price difference. This would capture the rare instances where the futures curve has elements of both backwardation and contango.