Slides for Financial Risk Forecasting

Holding periods

An evaluation of the Basel III market risk proposals

The BCBS has proposed using overlapping data to calculate risk forecasts. In other words, suppose a particular method requires N=300 observations to produce a risk forecast where the holding period, n, is 10 days. A non-overlapping method would then require 3,000 days of data, but an overlapping method only 310.
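The difference in data requirements can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes daily log returns, so that an n-day return is the sum of n daily returns. Note that 309 daily returns correspond to 310 daily prices.

```python
from itertools import accumulate

def nonoverlapping_returns(daily, n):
    """Sum daily log returns over disjoint blocks of n days."""
    blocks = len(daily) // n  # an incomplete trailing block is dropped
    return [sum(daily[i * n:(i + 1) * n]) for i in range(blocks)]

def overlapping_returns(daily, n):
    """Sum daily log returns over every window of n consecutive days."""
    csum = [0.0] + list(accumulate(daily))  # running sum with a leading zero
    return [csum[i + n] - csum[i] for i in range(len(daily) - n + 1)]

n, N = 10, 300
# 3,000 daily returns give 300 disjoint 10-day returns
print(len(nonoverlapping_returns([0.0] * (N * n), n)))
# 309 daily returns (310 daily prices) give 300 overlapping 10-day returns
print(len(overlapping_returns([0.0] * (N + n - 1), n)))
```

Consecutive overlapping windows share n-1 of their n days, which is why the 300 overlapping observations carry far less independent information than 300 disjoint ones.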

Why?

Assuming no structural breaks, more data will always deliver better results, and overlapping sample periods will always give inferior results compared to a non-overlapping sample.

That leaves the question of whether it is advisable to use overlapping data compared to non-overlapping data drawn from the same population size. In some cases, it is obvious that overlapping data is better, for example for samples with sparse trading. However, for highly liquid assets, the case is not as clear. In particular, the use of overlapping data has the potential to introduce particular biases in the estimation. This problem is well understood, noted for example by Hansen and Hodrick (1980) in a different context.

Below, we examine how risk forecasts are affected, both for overlapping and non-overlapping intervals, for the case of n=10, which applies to most liquid large stocks as identified by the Committee. We consider two distributions, the normal and the Student-t(3). The latter was chosen because its tail fatness corresponds to the most common fatness in equity returns. Two sample sizes are used, 10,000 and 300, whilst the number of simulations is 100,000. The holding period is 10. In the case of the Student-t, we estimate the degrees of freedom simultaneously with the standard deviation using a maximum likelihood procedure.
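A minimal Python sketch of this kind of experiment, for the normal case only, might look as follows. It is an illustration under stated assumptions, not the code behind the tables: the function names are hypothetical, the 99% level and unit daily volatility are assumed, the parametric estimator ignores the mean, and the number of simulations is kept small so that it runs quickly. The level of the output will therefore not match the units of the tables.

```python
import random
from statistics import NormalDist, fmean, stdev

P = 0.99  # assumed confidence level; not stated in the text

def hs_var_es(returns, p=P):
    """Historical simulation: empirical loss quantile and mean of the tail."""
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    k = max(1, round(len(losses) * (1 - p)))               # tail observations
    return losses[k - 1], fmean(losses[:k])

def normal_var_es(returns, p=P):
    """Parametric normal VaR and ES from the sample standard deviation
    (the mean is assumed to be zero)."""
    z = NormalDist().inv_cdf(p)
    sigma = stdev(returns)
    return sigma * z, sigma * NormalDist().pdf(z) / (1 - p)

def simulate(N, n, S, overlapping, seed=0):
    """Return (mean, se) across S simulations for HS VaR, HS ES,
    parametric VaR and parametric ES, each based on N n-day returns."""
    rng = random.Random(seed)
    days = N + n - 1 if overlapping else N * n
    step = 1 if overlapping else n
    results = []
    for _ in range(S):
        daily = [rng.gauss(0.0, 1.0) for _ in range(days)]
        rets = [sum(daily[i:i + n]) for i in range(0, days - n + 1, step)]
        results.append(hs_var_es(rets) + normal_var_es(rets))
    return [(fmean(col), stdev(col)) for col in zip(*results)]

non = simulate(N=300, n=10, S=200, overlapping=False)
ovl = simulate(N=300, n=10, S=200, overlapping=True)
names = ["HS VaR", "HS ES", "Parametric VaR", "Parametric ES"]
for name, (m0, s0), (m1, s1) in zip(names, non, ovl):
    print(f"{name:15s} non-overlapping {m0:6.2f} ({s0:.2f})  "
          f"overlapping {m1:6.2f} ({s1:.2f})")
```

The "se" reported here is the standard deviation of each estimate across simulations, which is how the sketch interprets the "se" rows in the tables.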

We look at three cases. First, a naive, straightforward comparison between non-overlapping and overlapping data with different population sizes; here, we expect the non-overlapping data to deliver better results. Second, a like-for-like comparison. Finally, we show results from observed data.

Naive comparison

This is not a fair comparison between the overlapping and non-overlapping approaches. After all, a bigger sample will always provide better results, given the distributional assumptions made here.

T= 10000

Normal. N=10000, S=100000, n=10 non-overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean     73.62    84.23            73.56           84.28
se        1.19     1.45             0.52            0.60

Normal. N=10000, S=100000, n=10 overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean     73.56    84.04            73.49           84.19
se        2.44     3.16             1.34            1.54

Student-t(3). N=10000, S=100000, n=10 non-overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean    134.83   183.19           137.09          175.56
se        3.47     9.76             2.12            4.18

Student-t(3). N=10000, S=100000, n=10 overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean    135.13   182.30           136.79          175.16
se        9.08    29.76             6.26           12.28

T= 300

Normal. N=300, S=100000, n=10 non-overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean     75.35    82.40            73.48           84.19
se        7.16     8.13             3.00            3.44

Normal. N=300, S=100000, n=10 overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean     72.20    76.58            71.02           81.37
se       13.26    13.80             7.50            8.60

Student-t(3). N=300, S=100000, n=10 non-overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean    142.55   177.93           136.20          174.49
se       24.71    54.50            12.26           24.49

Student-t(3). N=300, S=100000, n=10 overlapping

        HS VaR    HS ES   Parametric VaR   Parametric ES
mean    133.90   141.40           132.69          174.18
se       60.81    61.81            32.41           71.66

These results can be compared to those in "Monte Carlo analysis of VaR vs. ES and 99% vs. 97.5%".


Consider first the large sample size, 10,000.

For the Gaussian parametric case, we would expect the VaR and ES to scale up by $\sqrt{10}$=3.1622777, and that is the result we get in the non-overlapping case. The historical simulation result is similarly close. For the two Student-t cases, the scaling law is less than the square root of time, and the forecasts scale up as expected. The means are similar for both ES and VaR across overlapping and non-overlapping data. However, the standard errors in the overlapping case are more than double, and sometimes triple, those in the non-overlapping case, indicating that the estimation error is much larger. This is not surprising; after all, the effective sample size is much smaller in the overlapping case.
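The scaling argument can be written out as a short sketch. It assumes i.i.d. daily returns with zero mean, and the symbols $\sigma$ (daily standard deviation), $z_p$ (standard normal quantile at probability $p$), $\phi$ (standard normal density) and $\alpha$ (tail index) are notation introduced here for illustration. In the normal case the n-day return is normal with variance $n\sigma^2$, so

$$
\mathrm{VaR}_p(n) = \sqrt{n}\,\sigma z_p = \sqrt{n}\,\mathrm{VaR}_p(1), \qquad
\mathrm{ES}_p(n) = \sqrt{n}\,\sigma\,\frac{\phi(z_p)}{1-p} = \sqrt{n}\,\mathrm{ES}_p(1).
$$

For fat-tailed returns with tail index $\alpha > 2$ (for the Student-t(3), $\alpha = 3$), the tail of a sum of $n$ i.i.d. returns behaves, far enough out, like $n$ times the tail of a single return, which gives the approximation

$$
\mathrm{VaR}_p(n) \approx n^{1/\alpha}\,\mathrm{VaR}_p(1), \qquad
10^{1/3} \approx 2.15 < \sqrt{10} \approx 3.16 .
$$

This second relation holds only in the limit as $p \to 1$; at a finite confidence level the effective scaling factor can differ from $n^{1/\alpha}$, but the argument indicates why scaling below the square root of time is expected.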

Consider next the small sample size, 300.

The larger sample size was chosen to demonstrate the asymptotics. In practice, much smaller sample sizes will be used, perhaps 300 days as used here. We start seeing divergence between the parametric and the nonparametric methods, which is again not surprising given the sample size. Here the results change significantly. The uncertainty in the estimation increases, not surprisingly. The impact is not so large for the Gaussian case, but the standard errors are still much larger in the overlapping case and, more importantly, there is a downward bias in the overlapping risk forecast for both VaR and ES.

The impact is larger for the Student-t. We start seeing a large divergence between the parametric and nonparametric methods, and again the standard error is much larger in the overlapping case. It is so large that in some cases the risk estimate is not statistically significantly different from zero.

In other words, the fat-tailed method does not seem to work for overlapping data and small observation windows.

Like for like comparison

A better test is to take a given number of days of data and then test whether it is better to use overlapping or disjoint (non-overlapping) intervals. Here n=10, and we use $\sqrt{n}$ to adjust the non-overlapping results.

Note the standard errors (se), which are much higher in the overlapping case.

                   HS VaR    HS ES   Parametric VaR   Parametric ES
Non-overlapping
  mean              75.24    84.62            73.52           84.23
  se                 3.98     4.77             1.65            1.89
Overlapping
  mean              74.51    82.60            73.04           83.68
  se                 7.99     9.55             4.25            4.87
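A like-for-like comparison of this kind can be sketched in Python for the parametric normal VaR. Several things here are assumptions rather than details from the slides: the function name is hypothetical, the number of days (1,000) and the 99% level are illustrative, daily volatility is set to one, and "non-overlapping" is read as using the daily returns directly and scaling the result by the square root of n.

```python
import random
from statistics import NormalDist, fmean, stdev

def like_for_like(T=1000, n=10, S=200, p=0.99, seed=1):
    """Mean and se of two parametric normal n-day VaR estimates, both
    computed from the same T daily returns in each of S simulations."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(p)
    scaled, overlapped = [], []
    for _ in range(S):
        daily = [rng.gauss(0.0, 1.0) for _ in range(T)]
        # (a) daily, non-overlapping returns, scaled by the square root of n
        scaled.append(z * stdev(daily) * n ** 0.5)
        # (b) overlapping n-day returns built from the same T days
        windows = [sum(daily[i:i + n]) for i in range(T - n + 1)]
        overlapped.append(z * stdev(windows))
    return ((fmean(scaled), stdev(scaled)),
            (fmean(overlapped), stdev(overlapped)))

(scaled_mean, scaled_se), (over_mean, over_se) = like_for_like()
print(f"sqrt(n)-scaled: mean {scaled_mean:.2f}, se {scaled_se:.2f}")
print(f"overlapping:    mean {over_mean:.2f}, se {over_se:.2f}")
```

Both estimators see exactly the same days, so any difference in the standard errors comes from how the data are used rather than from how much data there is.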

Data

This really speaks for itself and shows why overlapping makes no sense.

[Figure: results from observed data]