On the ideal properties of risk measures
November 8, 2024
What are the ideal properties of risk measures? Should we disregard VaR because it is not subadditive? Did the Basel Committee make a mistake by moving to ES? And what is the difference between theory and practice?
One of the most surprising things about risk measurement is how differently people seem to view the same mathematical facts.
A good example is subadditivity. The mathematics of it is clear, as is the benefit. What is controversial is whether a particular risk measure is subadditive in practice and whether we should actively want one that is subadditive in theory.
It is surprising since the math is straightforward. The benefit of mathematical analysis is that it is correct, given the assumptions made. If the mathematical analysis does not correspond to what is observed in reality, the place to look is the underlying assumptions.
A case in point is the reaction to one of my papers, published over a decade ago, titled Fat Tails, VaR and Subadditivity. A post I made about it on LinkedIn last week generated a surprising amount of controversy, and when I posted about this piece a week later, the same thing happened.
What I said was:
Some argue that subadditivity is important. The Basel Committee even changed the Basel Accords to expected shortfall. But is it so? Only in theory. Not practice.
I was a little careless, and a better statement would have been:
Some argue that subadditivity is an important reason for picking ES over VaR. The Basel Committee even changed the Basel Accords to expected shortfall. But is it so? Only in theory. Not practice.
The problem arises because risk is not measurable in the same way temperature or prices are. There is no single, unambiguous notion of risk. The reason is that risk is a latent variable, and to quantify it, we need a model, what I like to call a riskometer.
Coherent measures of risk
The background to this is that Artzner and co-authors wrote an influential paper in 1999 called Coherent Measures of Risk. A risk measure is coherent if it has monotonicity, translation invariance, positive homogeneity and subadditivity.
I discuss these on page 51 in my slides. Monotonicity and translation invariance are not controversial and should always hold.
Positive homogeneity and subadditivity are more controversial, especially subadditivity.
Suppose we have two assets, \(X_1\) and \(X_2\) , a risk measure \(\varphi(\cdot)\) and a constant \(c\) .
Positive homogeneity
If \(c>0\) and $$ \varphi(cX_1)=c\varphi(X_1) $$ then risk measure \(\varphi\) satisfies positive homogeneity.
Subadditivity
If $$\varphi(X_1+X_2) \leq \varphi(X_1)+\varphi(X_2) $$ then risk measure \(\varphi\) satisfies subadditivity. It respects diversification.
Where does it matter?
Positive homogeneity
Positive homogeneity says that the risk of a portfolio scales linearly with the size of the portfolio.
That is not something that generally holds in practice. The reason is liquidity. If we have a small position in a large asset, yes, we expect positive homogeneity to hold. But if our trading affects prices, which it does for large positions, then positive homogeneity is violated.
So, we expect violations of positive homogeneity, and therefore, we should not impose coherence on a risk measure.
Subadditivity
What about subadditivity? It relates to something profound: diversification reduces risk. Therefore, it is a highly desirable property. Or is it?
Super fat tails
Subadditivity is not desirable if the tails are super fat, such as when the first moment is unbounded, that is, when the integral $$ E[X]=\int_{-\infty}^\infty x f(x)\, dx $$ does not converge.
The example I like to give is if you go to a buffet restaurant and fear food poisoning. Then, the optimal strategy is only to eat one dish, because that minimises your chance of getting sick.
Diversification is not desirable in that case.
Here is an example of how such returns might look: a thousand simulations from a student-t with 0.9 degrees of freedom.
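For readers who want to reproduce such a sample, here is a minimal R sketch; the seed and plotting choices are my own and not taken from the original figure.
# Simulate a thousand returns from a student-t with 0.9 degrees of freedom,
# a distribution whose mean does not exist
set.seed(1)
x = rt(1000, df = 0.9)
plot(x, type = "h", xlab = "observation", ylab = "return",
     main = "Simulated student-t(0.9) returns")
# A handful of draws dwarf all the others, the signature of super fat tails
summary(abs(x))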
It is hard to think of traded financial assets with such fat tails. Possibly electricity prices and certain derivatives and fixed income assets, but even that is a stretch.
We are more likely to find super fat tails in bespoke assets like catastrophe insurance, such as on hurricanes, cyber, and the like. However, in such cases, one should not use market risk measures such as Value-at-Risk (VaR) and expected shortfall (ES).
VaR and subadditivity
The most cited takeaway of the Artzner paper is that VaR is not subadditive, and instead, we should use a subadditive risk measure, perhaps ES.
Gaussian tails
If the underlying marginal distributions are Gaussian, VaR is subadditive, regardless of the dependence structure. Admittedly, one can come up with contrived examples where a normal is mapped onto a binary outcome, some weird copula is imposed, or a normal is transformed into something super fat.
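A quick way to see the Gaussian case numerically is the following R sketch, assuming zero-mean returns; the standard deviations and correlations below are arbitrary choices for illustration.
# For zero-mean Gaussian returns, VaR is proportional to the standard
# deviation, and sd(X1+X2) <= sd(X1)+sd(X2) for any correlation,
# so VaR is subadditive
s1 = 1
s2 = 2
for (r in c(-0.9, 0, 0.9)) {
  sp = sqrt(s1^2 + s2^2 + 2 * r * s1 * s2)
  VaR1 = -qnorm(0.05) * s1
  VaR2 = -qnorm(0.05) * s2
  VaRp = -qnorm(0.05) * sp
  cat("correlation", r, ": VaR(X1+X2) =", round(VaRp, 3),
      " VaR(X1)+VaR(X2) =", round(VaR1 + VaR2, 3), "\n")
}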
The canonical example
Suppose we have two assets, \(X_1\) and \(X_2\) , that are independent of each other, and for each of them, there is a 4.9% probability of a loss of 100 and a 95.1% probability of no loss. In this case, VaR(95%) violates subadditivity. For details, see page 63 in my slides.
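The numbers behind this can be checked directly; here is a minimal R rendering of the calculation (my own sketch, not the version in the slides).
# VaR(95%) is the smallest loss level exceeded with probability of no more than 5%
p_loss = 0.049
# Each asset alone: P(loss > 0) = 4.9% <= 5%, so VaR(95%) = 0
VaR_single = ifelse(p_loss > 0.05, 100, 0)
# The portfolio: P(loss >= 100) = 1 - (1 - p_loss)^2 = 9.56% > 5%,
# while P(loss = 200) = p_loss^2 = 0.24%, so VaR(95%) = 100
VaR_port = ifelse(1 - (1 - p_loss)^2 > 0.05, 100, 0)
cat("VaR(X1) + VaR(X2) =", 2 * VaR_single, "\n")  # 0
cat("VaR(X1 + X2)      =", VaR_port, "\n")        # 100, subadditivity is violated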
That might apply to certain insurance contracts, structured credit and electricity prices.
Fat Tails, VaR and Subadditivity
This takes me to my paper Fat Tails, VaR and Subadditivity which generated the controversy.
This is the sticking point.
Proposition 1
Suppose that \(X_1\) and \(X_2\) are two asset returns with jointly regularly varying non-degenerate tails with tail index \(\alpha >1\) . Then VaR is subadditive sufficiently deep in the tail region.
The tail index captures the fatness of the tail. It is infinite for the normal and equal to the degrees of freedom for a student-t. It indicates the number of finite moments: if it is below one, the mean is not defined, and if it is below two, the variance is not defined. When I refer to super fat tails, I mean a variable with a tail index below one.
Proposition 1 guarantees that at sufficiently low probability levels, the VaR of a portfolio is lower than the sum of the VaRs of the individual positions if the return distribution exhibits fat tails with the same tail index.
Remark 1
From the proof to Proposition 1, we see that even without the non-degeneracy assumptions and, in particular, if the two assets have different tail indices, we still have $$ \limsup_{p\rightarrow 0}\frac{VaR_{p}(X_1+X_2)}{VaR_{p}(X_1)+VaR_{p}(X_2)}\leq 1 $$ which is a weaker form of subadditivity in the tails.
Proposition 1 and Remark 1 establish that VaR does not violate subadditivity for fat-tailed data, deep in the tail area, regardless of the dependence structure.
And it is straightforward to show that, deep in the tail, the relationship between VaR and ES is $$ ES=\frac{ \alpha }{ \alpha-1 }VaR $$
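To see where this comes from, take the simplest regularly varying case, a pure Pareto loss tail; this is only an illustration of the formula, not the general argument in the paper, and it needs \(\alpha>1\) for the integral to converge. $$ P(L>x)=x^{-\alpha}\ \Rightarrow\ VaR_p=p^{-1/\alpha}, \qquad ES_p=\frac{1}{p}\int_{VaR_p}^{\infty} x\,\alpha x^{-\alpha -1}\,dx=\frac{\alpha }{\alpha -1}VaR_p $$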
The controversy
The math is clear: if the tails are regularly varying and the tail index is bigger than one, VaR is subadditive sufficiently deep in the tail, regardless of the dependence structure.
It is important to note that it will not always hold in practice, which is easy to demonstrate with simulations; the code at the end of this blog shows that. We can generate a variety of fat-tailed marginals and impose different copulas designed to emphasise co-crashes and the like. As expected, we find a few cases where VaR is not subadditive, as must happen by random chance. The subadditivity of VaR is asymptotic and therefore not guaranteed to hold in finite samples. For example, when simulating from a very fat-tailed distribution, such as a student-t(2), we can get a draw so large that the sample is indistinguishable from something super fat.
What matters is that most of the time, we observe subadditivity and especially that the deviations from it are small.
You can try it for yourself; the code is at the end.
Takeaway
We need two assumptions: regularly varying tails and being sufficiently far out in the tail. Let us take them in turn.
Regular variation
Definition 1
A cumulative distribution function \(F(x)\) varies regularly at minus infinity with tail index \(\alpha >0\) if $$ \lim_{t\rightarrow \infty }\frac{F(-tx)}{F(-t)}=x^{-\alpha }~~~\forall x>0 $$ For example, the Student-t distributions vary regularly, with the tail index equal to the degrees of freedom, so they satisfy the above limit. The same holds for the stationary distribution of the GARCH(1,1) process and the non-normal stable distributions.
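As an illustration, the limit can be checked numerically; here is a small R sketch for the student-t(3) lower tail, where the values of x and t are arbitrary choices.
# The ratio F(-t*x)/F(-t) should approach x^(-alpha) as t grows
alpha = 3
x = 2
for (t in c(5, 20, 100)) {
  ratio = pt(-t * x, df = alpha) / pt(-t, df = alpha)
  cat("t =", t, " ratio =", round(ratio, 4),
      " limit x^-alpha =", x^(-alpha), "\n")
}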
Sufficiently far out in the tail
This is an asymptotic result, and as we move from minus infinity towards the centre of the distribution, it can no longer be expected to hold in all cases.
When is ES preferred?
However, that does not mean VaR is inherently preferable to ES, and there are many practical reasons to prefer ES.
One important case is manipulation: it is much easier to manipulate VaR than ES. I discussed this in a piece called How to manipulate risk forecasts and not get caught.
When is VaR preferred?
Recall the definition of VaR, $$ p=\int_{-\infty}^{-VaR(p)} f(x)\, dx $$ and ES, $$ ES= -\frac{1}{p} \int_{-\infty}^{-VaR(p)} x f(x)\, dx $$ The calculation of ES has two steps, while the VaR calculation has one: to calculate ES, first calculate VaR and then do the integration. In other words, there are two sources of estimation error, the VaR calculation and the integration.
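As an illustration of the two steps, here is a minimal historical-simulation sketch in R; the returns are simulated from a student-t(3.5) purely as a placeholder for real data.
p = 0.01
n = 10000
y = sort(rt(n, df = 3.5))   # placeholder returns, sorted from worst to best
VaR = -y[p * n]             # step 1: the empirical p-quantile gives the VaR
ES = -mean(y[1:(p * n)])    # step 2: average the losses beyond the VaR
cat("VaR:", VaR, " ES:", ES, "\n")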
Because ES compounds estimation error, it is by definition less accurate than VaR, as we argue in Why Risk Is So Hard to Measure. In an extensive battery of tests, we find that in all cases, ES is estimated more inaccurately than VaR.
Rama Cont and co-authors find a related result in a 2009 paper called Robustness and Sensitivity Analysis of Risk Measurement Procedures.
When to use market risk measures such as VaR and ES
Market risk measures are designed to capture day-to-day, non-serious risk: the chance of an event that occurs roughly once every five months (99% daily, one day in a hundred) or once a month (95% daily, one day in twenty). That means routine risk management, monitoring of traders (human and algorithmic), and other similar cases.
When to use neither VaR nor ES
With all of that said, ultimately, market risk measures like VaR and ES are fair weather risk measures. They are designed to measure risk in innocuous day-to-day events, failing to capture the risk of more serious events. That is why the vast majority of practical applications of such market risk measures explicitly confine themselves to events that are not important to most of us.
Therefore, they should not be used for pension funds, any long-term investment, insurance contracts, or other such cases—anything serious and not daily.
Risk model hallucination
Risk model hallucination happens when models are forced to forecast the likelihood of extreme events in cases where they have not been trained with such extreme outcomes. This is surprisingly common in applications such as financial regulations, pension funds, and reinsurance, as well as the containment of market upheaval and financial crises.
I argue in Risk model hallucination that measuring systemic financial risk needs to acknowledge the changing behaviour of agents and institutions during periods of market stress. This approach is more challenging to formalise but crucial for learning how to develop financial resilience.
There is simply no reliable way to measure the risk of extreme outcomes. Doing that is one of the worst mistakes one can make in this field. It happens all the time.
Specious arguments for preferring ES over VaR
A counterargument I often get is that because ES catches the tails, it also informs on more extreme outcomes.
I have a hard disagree.
For the reason given above: because ES is not trained on extreme tail events, it is not informative about them.
How accurate are risk measures?
Risk measures are surprisingly inaccurate, as we document in Why Risk Is So Hard to Measure.
Here is one figure showing the confidence bounds for VaR as the sample size increases. And here are the 99% confidence bounds for 99% ES and VaR across all CRSP stocks, normalised so that the true VaR and ES equal one.
| Sample size | ES | VaR |
|---|---|---|
| 300 days | [0.65, 1.49] | [0.63, 1.28] |
| 4 years | [0.74, 1.35] | [0.69, 1.35] |
| 20 years | [0.84, 1.20] | [0.80, 1.24] |
One might think a random number generator performs just as well!
What about regulations?
The Basel Committee introduced 99% VaR in the 1996 market risk amendment to the Basel 1 Capital Accord, only to replace it with 97.5% ES in Basel 3.
Because of the result from above, $$ ES=\frac{ \alpha }{ \alpha-1 }VaR, $$ we have $$ VaR(99\%) \approx ES(97.5\%), $$ where the ES is less accurately estimated.
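The near-equivalence is easy to verify numerically. Here is a quick check under an assumed standard normal distribution, used only as an illustration and not part of the fat-tailed result above.
VaR99 = qnorm(0.99)                  # about 2.33
ES975 = dnorm(qnorm(0.975)) / 0.025  # about 2.34
cat("VaR(99%):", VaR99, " ES(97.5%):", ES975, "\n")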
I suspect they deliberately picked 97.5% to ensure nothing would change. Pure theatre.
That is why the Basel Committee made a fundamental mistake by switching to ES.
My blog on that, Why risk is hard to measure, concluded with:
To us, it is a concern that important public policy regulations such as bank capital calculations are based on numbers that in practice might be almost double or half the recorded number, just based on a random draw.
What about the practical objections?
The mathematics and the underlying assumptions of the subadditivity of VaR are clear.
These, of course, could be violated in practice. The mathematics suggests that this could happen in four cases:
- When the tails are super fat;
- When we are looking at outcomes that are not very extreme;
- When samples are finite;
- Extreme dependence, such as when continuous variables are mapped onto binary outcomes.
The second case covers the innocuous day-to-day outcomes that practical risk models so commonly focus on, but which afford no protection of relevance to investors.
If someone criticises this analysis by merely saying it dismisses practical reality, that is not criticism worth paying attention to. The math is clear, and such criticism is fluffy in the extreme.
If someone criticises the analysis by giving concrete examples of how VaR violates subadditivity, either with a practical example or mathematical analysis, I am all ears. Please give it to me, and I would be delighted to add it to this blog as a credible exception.
End fin
Appendix: Simulation code
library(copula)
# Simulation check of the subadditivity of VaR for fat-tailed marginals
# joined by copulas with lower tail dependence
#set.seed(123)
S = 1000       # number of simulation replications
n = 10000      # observations per replication
df = 3.5       # degrees of freedom of the student-t marginals
df_tcop = 4    # degrees of freedom of the t copula
rho = 0.7      # correlation parameter for the normal and t copulas
theta = 2      # parameter for the Gumbel and Clayton copulas
p = 0.01       # VaR probability level
tf = 0         # number of replications where subadditivity is violated
ts = 0         # cumulative size of the violations
mv = 0         # cumulative portfolio quantile, used for scaling
for(i in 1:S){
  # Alternative copulas, uncomment to try them
  # cop = normalCopula(param = rho, dim = 2)
  # cop = gumbelCopula(param = theta, dim = 2)
  # cop = tCopula(param = rho, dim = 2, df = df_tcop)
  # The Clayton has the strongest lower tail dependence
  # and is hence best to explore this
  cop = claytonCopula(theta, dim = 2)
  U = rCopula(n, cop)
  #plot(U[,1],U[,2])
  # Alternative marginals, uncomment to try them
  # x = qnorm(U[, 1])
  # y = qnorm(U[, 2])
  # or a log normal (upper tail, hence the -)
  # x = -qlnorm(U[, 1])
  # y = -qlnorm(U[, 2])
  # The t is probably the best marginal. df=3.5 is typical for a stock
  x = qt(U[, 1], df = df)
  y = qt(U[, 2], df = df)
  #plot(x,y)
  # Empirical p-quantiles, negative numbers, i.e. minus the VaR
  VaRx = sort(x)[p * n]
  VaRy = sort(y)[p * n]
  VaRxy = sort(x + y)[p * n]
  # Count a violation when the portfolio quantile is at or below the sum of
  # the individual quantiles, i.e. the portfolio VaR is at least the sum of
  # the individual VaRs
  tf = tf + ifelse(VaRx + VaRy < VaRxy, FALSE, TRUE)
  ts = ts + ifelse(VaRx + VaRy < VaRxy, FALSE, VaRx + VaRy - VaRxy)
  mv = mv + VaRxy
  cat(VaRx, VaRy, VaRx + VaRy, VaRxy, ifelse(VaRx + VaRy < VaRxy, FALSE, TRUE), "\n")
}
cat("\n\n")
# violations, violation frequency, average violation size,
# average portfolio quantile, violation size relative to the portfolio quantile
cat(tf, tf / S, ts / tf, mv / S, (ts / tf) / (mv / S), "\n")