Introduction
Valuation ratios (dividend-price, earnings-price) and yields on short- and long-term Treasury and corporate bonds appear to possess some predictive power for timing the market at short horizons. Newer variables with incremental predictive power, such as the share of equity issues in total new equity and debt issues, the consumption-wealth ratio, relative valuations of high- and low-beta stocks, and factors estimated from large macroeconomic datasets, can also be used (see the Lettau and Ludvigson 2008 review).
This paper, instead of trying to identify better predictors, looks for better ways of using existing predictors. It does so by decomposing returns into sign and magnitude, with the sign being the more predictable component. The aim is to predict expected returns, and the following decomposition model is proposed:
$$E[r_t|F_{t-1}]=E[|r_t|\,\mathrm{sign}(r_t)\,|\,F_{t-1}]=f(|r_t|)\times g(\mathrm{sign}(r_t))\times \text{(interaction via copula)}.$$
The magnitude is modeled using a multiplicative error model, the sign by a dynamic binary choice model, and their interaction by a copula. This way we are able to capture hidden nonlinearities absent from the standard regression setup. Magnitudes and signs each show substantial dependence over time, while returns themselves show hardly any; e.g., the magnitude behaves like volatility, which is strongly persistent. One important aspect of the bivariate analysis is that, in spite of a large unconditional correlation between the multiplicative components, they appear to be only very weakly dependent conditionally.
This also opens avenues for trading strategies (Anatolyev and Gerko 2005). The decomposition model beats the predictive regression, which in turn beats the buy-and-hold strategy, both in and out of sample. The decomposition model also produces unbiased forecasts.
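As a quick illustration of why the decomposition is attractive, here is a minimal sketch (my own, using simulated GARCH(1,1) data rather than the paper's sample) showing that magnitudes are strongly serially dependent while raw returns are not. Signs are i.i.d. here by construction (symmetric, zero-mean), but they gain predictability once $c-\mu_t\ne0$, as discussed below.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
omega, alpha, beta = 0.05, 0.08, 0.90

# Simulate a GARCH(1,1) return series with Gaussian innovations.
r = np.empty(T)
sigma2 = omega / (1 - alpha - beta)   # start at the unconditional variance
for t in range(T):
    r[t] = np.sqrt(sigma2) * rng.standard_normal()
    sigma2 = omega + alpha * r[t] ** 2 + beta * sigma2

def acf1(x):
    """First-order sample autocorrelation."""
    x = x - x.mean()
    return (x[1:] * x[:-1]).sum() / (x ** 2).sum()

print(f"acf1(r)      = {acf1(r):+.3f}")           # near 0: raw returns look unpredictable
print(f"acf1(|r|)    = {acf1(np.abs(r)):+.3f}")   # positive: volatility clustering
print(f"acf1(sign r) = {acf1(np.sign(r)):+.3f}")  # ~0 here, by construction of the simulation
```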
Methodological Framework
The key identity is
$$r_t=c+|r_t-c|sign(r_t-c)=c+|r_t-c|(2\mathbb{I}[r_t>c]-1)$$
and hence,
$$E[r_t|F_{t-1}]=c-E[|r_t-c| | F_{t-1}]+2E[|r_t-c|\mathbb{I}[r_t>c]|F_{t-1}],$$
where $c$ is a user-defined constant that can capture transaction costs or different dynamics of small versus large positive and negative returns. For example, $c$ would be 0 for modeling recessions and expansions using GDP growth, around 3% for modeling the output gap, and 2% for forecasting inflation. $F_{t-1}$ is the information set at time $t-1$, which in practice consists of lagged returns, volatility, volume and other predictive variables available at time $t-1$. Consider a toy example, where the predictive variable is the lagged realized volatility $RV_{t-1}$:
a) Direct regression model: $E[r_t]=\alpha+\beta RV_{t-1}$ gives an $R^2$ of 0.39%.
b) Decomposition model: $E[|r_t|]=\alpha_{|r|}+\beta_{|r|}RV_{t-1}$ and $Pr[r_t>0]=\alpha_{\mathbb{I}}+\beta_{\mathbb{I}}RV_{t-1}$. Assuming the two components are stochastically independent, the implied mean is quadratic, $E[r_t]=\alpha_r+\beta_rRV_{t-1}+\gamma_r RV^2_{t-1}$, showing that nonlinearities are captured by the decomposition model; this raises the $R^2$ to 0.72%.
c) Further adding $\mathbb{I}[r_{t-1}>0]$ and $RV_{t-1}\mathbb{I}[r_{t-1}>0]$ to the regressor list increases the $R^2$ to 1.21%.
It is important to note that it is the augmentation with the sign component that delivers the nonlinear dependence and improves the prediction. The driving force behind the predictive ability of the decomposition model is the predictability of the two components; the interaction term is less important. This is the main theme of this work. A sketch of the three regressions follows.
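To make the toy example concrete, here is a minimal sketch of the three regressions on simulated data; the DGP and all coefficients are illustrative assumptions of mine, so the resulting $R^2$ values will not match the paper's figures.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5000
rv = rng.gamma(2.0, 0.5, T)               # stand-in for lagged realized volatility
s = rng.integers(0, 2, T).astype(float)   # stand-in for I[r_{t-1} > 0]
# Assumed DGP with quadratic and sign-interaction effects, purely for illustration:
r = 0.02 + 0.05 * rv - 0.02 * rv**2 + 0.06 * s * rv + rng.standard_normal(T)

def r2(y, X):
    """R^2 of an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - ((y - X @ b) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print(f"(a) direct:    R2 = {r2(r, rv):.4f}")
print(f"(b) quadratic: R2 = {r2(r, np.column_stack([rv, rv**2])):.4f}")
print(f"(c) augmented: R2 = {r2(r, np.column_stack([rv, rv**2, s, rv * s])):.4f}")
```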
Marginal distributions and Copula model
a) Volatility model: The absolute return $|r_t-c|$ is a positive-valued variable and is modeled using the multiplicative error framework of Engle (2002): $$|r_t-c|=\psi_t\eta_t,$$ where $\psi_t=E[|r_t-c|\,|F_{t-1}]$ and $\eta_t$ is a positive multiplicative error with $E[\eta_t|F_{t-1}]=1$ and conditional distribution $\mathbb{D}$. $\psi_t$ can be modeled using the logarithmic autoregressive conditional duration (LACD) specification as
$$\ln\psi_t=\omega_v+\beta_v\ln\psi_{t-1}+\gamma_v\ln|r_{t-1}-c|+\rho_v\mathbb{I}[r_{t-1}>c]+\pmb{x}^T_{t-1}\pmb{\delta}_v.$$ The second-to-last term allows for regime-specific volatility dependence, while the last term represents macroeconomic predictors of volatility. $\mathbb{D}$ can be modeled as a constant-parameter Weibull distribution (or other distributions, possibly with the shape parameter vector $\varsigma$ a function of the past).
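Below is a minimal sketch of the LACD filter and the Weibull multiplicative-error log-likelihood. The initialization, the omission of the macro predictors $\pmb{x}_{t-1}$, and the function names are my own simplifications, not the paper's code.

```python
import numpy as np
from scipy.special import gamma as gamma_fn

def lacd_filter(abs_r, ind_lagged, params):
    """ln psi_t = omega + beta*ln psi_{t-1} + g*ln|r_{t-1}-c| + rho*I[r_{t-1}>c].
    Macroeconomic predictors x_{t-1} are omitted here for brevity."""
    omega, beta, g, rho = params
    T = len(abs_r)
    ln_psi = np.empty(T)
    ln_psi[0] = np.log(abs_r.mean())              # crude initialization
    for t in range(1, T):
        ln_psi[t] = (omega + beta * ln_psi[t - 1]
                     + g * np.log(abs_r[t - 1]) + rho * ind_lagged[t - 1])
    return np.exp(ln_psi)

def weibull_mem_loglik(abs_r, psi, k):
    """Log-likelihood of |r_t-c| = psi_t * eta_t with eta_t ~ Weibull(shape k),
    scaled so that E[eta_t] = 1."""
    lam = 1.0 / gamma_fn(1.0 + 1.0 / k)           # unit-mean scale
    eta = abs_r / psi
    z = (eta / lam) ** k
    return np.sum(np.log(k / (lam * psi)) + (k - 1.0) * np.log(eta / lam) - z)
```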
b) Direction model: The indicator $\mathbb{I}[r_t>c]$ has a conditional Bernoulli distribution $\mathbb{B}(p_t)$ with probability mass function $f_{\mathbb{I}[r_t>c]}(v)=p^v_t(1-p_t)^{1-v}$, $v\in\{0,1\}$, where $p_t$ denotes the conditional 'success' probability $Pr[r_t>c|F_{t-1}]=E[\mathbb{I}[r_t>c]|F_{t-1}]$. Christoffersen and Diebold (2006) show a remarkable result: if the data are generated by $r_t=\mu_t+\sigma_t\epsilon_t$, where $\mu_t=E[r_t|F_{t-1}]$, $\sigma^2_t=Var[r_t|F_{t-1}]$ (e.g. a GARCH process), and $\epsilon_t$ is a martingale difference with unit variance and distribution function $\mathbb{F}_{\epsilon}$, then
$$Pr[r_t>c|F_{t-1}]=1-\mathbb{F}_{\epsilon}\left(\frac{c-\mu_t}{\sigma_t}\right).$$
This suggests that time-varying volatility can generate sign predictability as long as $c-\mu_t\ne0$. Furthermore, Christoffersen et al. (2007) derive a Gram-Charlier expansion of this distribution and show that $Pr[r_t>c|F_{t-1}]$ depends on the third and fourth conditional cumulants of the standardized errors $\epsilon_t$. Hence, sign predictability can arise from time variation in second and higher-order moments. This leads us to parametrize $p_t$ as a dynamic logit model:
$$p_t=\frac{e^{\theta_t}}{1+e^{\theta_t}}\quad\text{with}\quad\theta_t=\omega_d+\phi_d\mathbb{I}[r_{t-1}>c]+\pmb{y}^T_{t-1}\pmb{\delta}_d,$$
where the last term includes macroeconomic variables (valuation ratios, interest rates) and realized measures (realized variance, bipower variation, realized third and fourth moments of returns).
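A sketch of how the dynamic logit could be fit in practice, assuming statsmodels is available and using only the lagged sign indicator and one lagged predictor (here realized volatility) as regressors:

```python
import numpy as np
import statsmodels.api as sm

def fit_direction_model(r, rv, c=0.0):
    """Dynamic logit: p_t = Lambda(omega + phi*I[r_{t-1}>c] + delta*RV_{t-1})."""
    ind = (r > c).astype(float)
    X = sm.add_constant(np.column_stack([ind[:-1], rv[:-1]]))  # lagged regressors
    res = sm.Logit(ind[1:], X).fit(disp=0)
    return res.params, res.predict(X)   # coefficients and fitted p_t
```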
c) Copula model: To construct the bivariate conditional distribution of $R_t=[|r_t-c|, \mathbb{I}[r_t>c]]^T$, copula theory is used. In particular, $$F_{R_t}(u,v)=C(F_{|r_t-c|}(u), F_{\mathbb{I}[r_t>c]}(v)),$$ where $F$ denotes the corresponding CDF and $C(u,v)$ is a copula. The most common choices are the Frank, Clayton or Farlie-Gumbel-Morgenstern (FGM) copulas. Once the three ingredients of the joint distribution of $R_t$, i.e. the volatility model, the direction model and the copula, are specified, the parameter vector can be estimated by maximum likelihood.
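For concreteness, here is a sketch (my own construction, not the paper's code) of the FGM copula and the per-observation log-likelihood it implies when one margin is continuous and the other Bernoulli; the key ingredient is the partial derivative $\partial C/\partial u$, which gives $Pr[\mathbb{I}=0\,\big|\,|r_t-c|]$.

```python
import numpy as np

def fgm(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula C(u, v), with |theta| <= 1."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def fgm_du(u, v, theta):
    """dC/du, needed because the direction margin is discrete."""
    return v * (1.0 + theta * (1.0 - 2.0 * u) * (1.0 - v))

def loglik_obs(f_abs, F_abs, ind, p, theta):
    """Log-likelihood of one observation of R_t = (|r_t-c|, I[r_t>c]):
    f_abs, F_abs = density and CDF of the magnitude at the observed value,
    p = Pr[r_t > c | F_{t-1}], ind in {0, 1} the observed sign indicator."""
    pr0 = fgm_du(F_abs, 1.0 - p, theta)   # Pr[I = 0 | magnitude]
    prob = pr0 if ind == 0 else 1.0 - pr0
    return np.log(f_abs) + np.log(prob)
```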
Conditional mean prediction in decomposition model
The main interest is the mean forecast of returns: $$E[r_t|F_{t-1}] = c - E[|r_t-c|\,|F_{t-1}]+2E[|r_t-c|\mathbb{I}[r_t>c]\,|F_{t-1}].$$
In terms of inference, the point forecast is
$$\hat{r}_t=c-\hat{\psi}_t+2\hat{\xi}_t.$$
Under conditional independence, or if conditional dependence is weak, we have
$$\xi_t=E[|r_t-c||F_{t-1}]E[\mathbb{I}[r_t>c]|F_{t-1}]=\psi_tp_t.$$
so,
$$\hat{r}_t=c+(2\hat{p}_t-1)\hat{\psi}_t.$$
In the general case of dependence, estimation of the copula is essential.
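Under the weak-dependence shortcut, combining the fitted components into a return forecast is a one-liner; a minimal sketch with hypothetical fitted values:

```python
import numpy as np

def decomposition_forecast(psi_hat, p_hat, c=0.0):
    """r_hat_t = c + (2*p_hat_t - 1)*psi_hat_t under weak conditional dependence."""
    return c + (2.0 * np.asarray(p_hat) - 1.0) * np.asarray(psi_hat)

# Hypothetical fitted values: the forecast is long when p_hat > 1/2, short otherwise.
print(decomposition_forecast(psi_hat=np.array([0.8, 1.1]), p_hat=np.array([0.55, 0.48])))
# -> [ 0.08  -0.044]
```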