On the ideal properties of risk measures. Jon Danielsson. Modelsandrisk.org

On the ideal properties of risk measures

November 8, 2024
What are the ideal properties of risk measures? Should we disregard VaR because it is not subadditive? Did the Basel Committee make a mistake by moving to ES? And what is the difference between theory and practice?

One of the most surprising things about risk measurement is how differently people seem to view the same mathematical facts.

A good example is subadditivity. The mathematics of it is clear, as is why we might benefit from it. What is controversial is whether a given risk measure is subadditive in practice and whether we should actively prefer one that is theoretically subadditive.

That is surprising, since the math is straightforward. The benefit of mathematical analysis is that it is correct, given the assumptions made. If the analysis does not correspond to what we observe in reality, we should question the underlying assumptions.

A case in point is the reaction to one of my papers, published over a decade ago, titled Fat Tails, VaR and Subadditivity. I made a post about it last week on LinkedIn that generated a surprising amount of controversy. When I then posted about this piece on LinkedIn a week later, the same happened.

What I said was:

Some argue that subadditivity is important. The Basel Committee even changed the Basel Accords to expected shortfall. But is it so? Only in theory. Not practice.

I was a little careless, and a better statement would have been

Some argue that subadditivity is an important reason for picking ES over VaR. The Basel Committee even changed the Basel Accords to expected shortfall. But is it so? Only in theory. Not practice.

The problem arises because risk is not measurable in the same way temperature and prices are. There is no single, unambiguous notion of risk. Risk is a latent variable, and to quantify it we need a model, what I like to call a riskometer.

Coherent measures of risk

The background to this is that Artzner and co-authors wrote an influential paper in 1999 called Coherent Measures of Risk. A risk measure is coherent if it has monotonicity, translation invariance, positive homogeneity and subadditivity.

I discuss these on page 51 in my slides. Monotonicity and translation invariance are not controversial and should always hold.

Positive homogeneity and subadditivity are more controversial, especially subadditivity.

Suppose we have two assets, \(X_1\) and \(X_2\), a risk measure \(\varphi(\cdot)\) and a constant \(c\).

Positive homogeneity

If \(c>0\) and $$ \varphi(cX_1)=c\varphi(X_1), $$ then the risk measure \(\varphi(\cdot)\) satisfies positive homogeneity.

Subadditivity

If $$ \varphi(X_1+X_2) \leq \varphi(X_1)+\varphi(X_2), $$ then the risk measure \(\varphi(\cdot)\) satisfies subadditivity. It respects diversification.

Where does it matter?

Positive homogeneity

Positive homogeneity says that the risk of a portfolio scales linearly with the size of the portfolio.

That is not something that generally holds in practice. The reason is liquidity. If we have a small position in a large asset, yes, we expect positive homogeneity to hold. But if our trading affects prices, which it does for large positions, then positive homogeneity is violated.
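To see why, here is a stylised sketch with an assumed linear price-impact parameter \(\lambda\) (my own illustration, not a formal result): if unwinding a position of size \(c\) moves the price against us in proportion to the amount traded, the loss from liquidation is roughly $$ \text{loss}(c) \approx c L + \lambda c^2, \qquad \lambda>0, $$ where \(L\) is the per-unit loss without price impact. The second term grows faster than linearly, so \(\varphi(cX_1) > c\,\varphi(X_1)\) for large enough positions.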

So, we expect violations of positive homogeneity, and therefore, should not impose coherence on a risk measure.

Subadditivity

What about subadditivity? It relates to something profound: diversification reduces risk. It is therefore a highly desirable property. Or is it?

Super fat tails

Subadditivity is not desirable if the tails are super fat, for example when the first moment is unbounded, that is, when the integral $$ E[X]=\int_{-\infty}^\infty x f(x)\, dx $$ does not converge.

The example I like to give is going to a buffet restaurant when you fear food poisoning. The optimal strategy is then to eat only one dish, because that minimises your chance of getting sick.

Diversification is not desirable in that case.

Here is an example of what such returns might look like: a thousand simulations from a Student-t with 0.9 degrees of freedom.
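A minimal sketch in R of that example, with the seed and plotting choices my own:

# a thousand draws from a Student-t with 0.9 degrees of freedom,
# a distribution whose mean does not exist
set.seed(1)
x = rt(1000, df = 0.9)
summary(x)                                    # a handful of draws dwarf all the others
plot(x, type = "h", main = "Simulated returns, Student-t(0.9)")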

It is hard to think of traded financial assets with super fat tails. Possibly electricity prices and certain derivatives and fixed income assets, but even that is a stretch.

We are more likely to find super fat tails in bespoke assets like catastrophe insurance, such as on hurricanes, cyber, and the like. But in such cases, one should not use market risk measures such as Value-at-Risk (VaR) and expected shortfall (ES).

VaR and subadditivity

The most cited takeaway of the Artzner paper is that VaR is not subadditive, and instead we should use a subadditive risk measure, perhaps ES.

The canonical example

Suppose we have two assets, \(X_1\) and \(X_2\) that are independent of each other, and for each of them, there is a 4.9% probability of a loss of 100 and a 95.1% probability of zero as the outcome. In this case, VaR(95%) violates subadditivity. For details, see page 63 in my slides.
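The arithmetic behind the example, spelled out as a quick check:

# each asset alone: P(loss) = 4.9% < 5%, so VaR(95%) = 0
# the two together: P(at least one loss) = 1 - 0.951^2
1 - 0.951^2                                   # about 9.56% > 5%, so portfolio VaR(95%) = 100
# 100 > 0 + 0, and subadditivity is violated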

That might apply to certain insurance contracts, fixed income, structured credit and electricity prices.

Gaussian tails

If the underlying marginal distributions are Gaussian, VaR is subadditive almost regardless of the dependence structure. The exceptions are when holding both assets amplifies potential losses, or when the dependence structure transforms the normal into something super fat. I cannot think of a practical financial example of either.
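For the narrower special case of jointly Gaussian returns with correlation \(\rho\) (an illustration, not the general dependence structures meant above), the inequality is immediate: for standard normal \(X_1\) and \(X_2\) and \(p<0.5\), $$ VaR_p(X_1+X_2)=\sqrt{2+2\rho}\,z_{1-p} \leq 2 z_{1-p} = VaR_p(X_1)+VaR_p(X_2), $$ since \(\rho\leq 1\), where \(z_{1-p}\) is the standard normal quantile.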

Fat Tails, VaR and Subadditivity

This takes me to my paper Fat Tails, VaR and Subadditivity which generated the controversy.

This is the sticking point.

Proposition 1

Suppose that \(X_1\) and \(X_2\) are two asset returns with jointly regularly varying non-degenerate tails with tail index \(\alpha >1\) . Then VaR is subadditive sufficiently deep in the tail region.

The tail index captures the fatness of the tail. It is infinite for the normal and equal to the degrees of freedom for the Student-t. It indicates the number of bounded moments: if it is below one, the mean is not defined, and if it is below two, the variance is not defined. When I refer to super fat tails, I mean variables where the tail index is below one.

Proposition 1 guarantees that at sufficiently low probability levels, the VaR of a portfolio is lower than the sum of the VaRs of the individual positions if the return distribution exhibits fat tails with the same tail index.

Note that Proposition 1 differs from Axiom S in Artzner et al. (1999), $$ \forall X_1 \text{ and } X_2 \in G, \quad \varphi(X_1+X_2) \leq \varphi(X_1)+\varphi(X_2). $$ The axiom is a stronger statement covering all outcomes, while Proposition 1 addresses only the relevant tails. That difference explains the simulation results below.

Remark 1

From the proof of Proposition 1, we see that even without the non-degeneracy assumptions and, in particular, if the two assets have different tail indices, we still have $$ \limsup_{p\rightarrow 0}\frac{VaR_{p}(X_1+X_2)}{VaR_{p}(X_1)+VaR_{p}(X_2)}\leq 1, $$ which is a weaker form of subadditivity in the tails.

Proposition 1 and Remark 1 establish that VaR does not violate subadditivity for fat-tailed data, deep in the tail area.

And it is straightforward to show that the relationship between VaR and ES is $$ ES=\frac{ \alpha }{ \alpha-1 }VaR $$
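The relation is asymptotic, holding exactly only in the limit. A quick numerical check in R, with the distribution and probabilities my own choices:

# ratio of ES to VaR for a Student-t with tail index alpha = 3.5;
# it should approach alpha/(alpha-1) = 1.4 deep in the tail
alpha = 3.5
es_over_var = function(p){
  q  = qt(p, alpha)                                       # the VaR quantile
  ES = -integrate(function(x) x * dt(x, alpha), -Inf, q)$value / p
  ES / (-q)
}
sapply(c(0.05, 0.01, 0.001, 1e-5), es_over_var)           # tends towards 1.4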

The controversy

The math is clear: if the tails are regularly varying and the tail index is above one, then sufficiently far out in the tails VaR is subadditive regardless of the dependence structure, except in the same cases as with the normal above.

That is, when holding more than one asset amplifies potential losses, or when the dependence structure transforms the returns into something super fat. I cannot think of a practical financial example of either.

A key issue here is the sample size and whether we are sufficiently far out in the tails.

The subadditivity of VaR from Proposition 1 is asymptotic and therefore not guaranteed to hold in finite samples, which is easy to show with simulations; the code at the end of this blog demonstrates it. We can generate a variety of fat-tailed marginals and impose different copulas designed to emphasise co-crashes and the like. As expected, we find a few cases where VaR is not subadditive, as must happen by random chance. For example, when simulating from a very fat distribution, perhaps a Student-t(2), we can get a very large draw that makes the sample indistinguishable from something super fat.

What matters is that most of the time we observe subadditivity and, especially, that the deviations from it are small.

You can try it for yourself; the code is at the end.

Takeaway

We need two assumptions: regularly varying tails and being sufficiently far out in the tail. Here is the former.

Regular variation
Definition 1

A cumulative distribution function \(F(x)\) varies regularly at minus infinity with tail index \(\alpha >0\) if $$ \lim_{t\rightarrow \infty }\frac{F(-tx)}{F(-t)}=x^{-\alpha }~~~\forall x>0. $$ For example, the Student-t distributions vary regularly at minus infinity, with the tail index equal to the degrees of freedom. So do the stationary distribution of the GARCH(1,1) process and the non-normal stable distributions.
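A rough numerical illustration of the definition for the Student-t, with the degrees of freedom and the values of \(x\) and \(t\) chosen arbitrarily:

# F(-t*x)/F(-t) should approach x^(-alpha) as t grows,
# with alpha equal to the degrees of freedom
alpha = 3
x = 2
t = c(5, 20, 100, 500)
cbind(t = t, ratio = pt(-t * x, alpha) / pt(-t, alpha), limit = x^(-alpha))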

Sufficiently far out in the tail

This is an asymptotic result, and as we move from minus infinity towards the centre of the distribution, it can no longer be expected to hold in all cases.

A small sample close to the centre is not very relevant in most practical applications.

When is ES preferred?

However, the theoretical result that VaR is subadditive if the tail index exceeds one does not mean VaR is inherently preferable to ES. There are many practical reasons to prefer ES.

One important consideration is that it is much easier to manipulate VaR than ES. I discussed this in a piece called How to manipulate risk forecasts and not get caught.

Robert MacRae has a better example for preferring ES over VaR. A risk-maximising trader who faces a known distribution of returns and optimises against a 99% VaR constraint will bankrupt the company 1 day in 100, while if they optimise against an ES constraint they will not.
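Here is a stylised sketch of that idea in R, my own construction rather than Robert's: a position that earns a small premium most days and loses everything with probability just under 1% is invisible to a 99% VaR constraint but not to a 99% ES constraint.

# earn 0.1 with probability 99.1%, lose the entire capital of 100 with probability 0.9%
set.seed(1)
n = 1e6
loss = ifelse(runif(n) < 0.009, -100, 0.1)
sort(loss)[0.01 * n]                          # 99% VaR quantile: about +0.1, no apparent risk
mean(sort(loss)[1:(0.01 * n)])                # mean of the worst 1%: about -90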

We may prefer ES because it tries to measure something more relevant than VaR, provided we are prepared to accept that it may in some ways fail even worse than VaR.

When is VaR preferred?

Recall the definitions of VaR, $$ p=\int_{-\infty}^{-VaR(p)} f(x)\, dx, $$ and ES, $$ ES= -\frac{1}{p} \int_{-\infty}^{-VaR(p)} x f(x)\, dx. $$ The calculation of ES has two steps, while the VaR calculation has one. To calculate ES, first calculate VaR and then do the integration. In other words, there are two sources of estimation error: the VaR calculation and the integration.
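In estimation terms, the two steps might look like this, a historical-simulation sketch with my own choice of sample and distribution:

# step 1: estimate VaR as the empirical p-quantile
# step 2: average the observations beyond it to approximate the integral
set.seed(1)
n = 10000; p = 0.01
x = sort(rt(n, df = 3.5))
VaR = -x[p * n]                               # step 1
ES  = -mean(x[1:(p * n)])                     # step 2
c(VaR = VaR, ES = ES)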

Because ES can compound estimation error, it tends to be less accurate than VaR, as we argue in Why Risk Is So Hard to Measure. In an extensive battery of tests, we find that in all cases ES is estimated less accurately than VaR.

The integration for ES multiplies by \(x\) and hence places linearly higher weight on observations further out in the tail, and those are always the worst estimated. One reason is that the most extreme observations, such as the sample minimum or the second-smallest observation, have infinite variance (an unbounded second moment) as estimators.
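A small Monte Carlo sketch of that claim, with sample size and distribution my own assumptions: re-estimate both measures many times and compare the relative spread of the estimates.

# repeated historical-simulation estimates of 1% VaR and 1% ES
# from Student-t(3.5) samples
set.seed(1)
n = 2000; p = 0.01
est = replicate(2000, {
  x = sort(rt(n, df = 3.5))
  c(VaR = -x[p * n], ES = -mean(x[1:(p * n)]))
})
apply(est, 1, sd) / apply(est, 1, mean)       # relative dispersion; ES is typically the larger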

Rama Cont and co-authors find a related result in a 2009 paper called Robustness and Sensitivity Analysis of Risk Measurement Procedures.

When to use market risk measures such as VaR and ES

Market risk measures are designed to capture day-to-day, non-serious risk: the chance of something happening once every five months (99% daily) or once every month (95% daily). That means routine risk management, monitoring of traders (human and algorithmic), and other similar cases.
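The arithmetic behind those frequencies, assuming roughly 21 trading days a month: $$ \frac{1}{1-0.99}=100 \text{ trading days} \approx 5 \text{ months}, \qquad \frac{1}{1-0.95}=20 \text{ trading days} \approx 1 \text{ month}. $$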

When to use neither VaR nor ES

With all of that said, market risk measures like VaR and ES are ultimately fair-weather risk measures. They are designed to measure the risk of innocuous day-to-day events and fail to capture the risk of more serious ones. That is why the vast majority of practical applications of such market risk measures explicitly confine themselves to events that are not important to most of us.

The daily 99% risk is serious only to highly leveraged entities such as banks.

Neither VaR nor ES should be used for pension funds, any long-term investment, most insurance contracts and other such cases.

Basically one should not use VaR or ES for anything that is serious and not daily.

Risk model hallucination

Risk model hallucination happens when models are forced to forecast the likelihood of extreme events in cases where they have not been trained with such extreme outcomes. This is surprisingly common in applications such as financial regulations, pension funds, reinsurance, as well as the containment of market upheaval and financial crises.

I argue in risk model hallucination that the measurement of systemic financial risk needs to acknowledge the changing behaviour of agents and institutions during periods of market stress. This approach is more challenging and difficult to formalise, but crucial for developing financial resilience.

There is simply no reliable way to measure the risk of extreme outcomes. Attempting to do so is one of the worst mistakes one can make in this field. It happens all the time.

Specious arguments for preferring ES over VaR

A counterargument I often get is that because ES catches the tails, it also informs on more extreme outcomes.

I disagree.

Because of the argument above, in finite samples ES is not trained on extreme tail events, and given the high uncertainty of the most extreme observations, it is not very informative about more extreme outcomes.

What about regulations?

The Basel Committee introduced 99% VaR in the 1996 market risk amendment to the Basel 1 Capital Accord, only to replace it with 97.5% ES in Basel 3.

Because of the result from above, $$ ES=\frac{ \alpha }{ \alpha-1 }VaR, $$ we have $$ VaR(99\%) \approx ES(97.5\%), $$ where the ES is less accurately estimated.
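A quick numerical comparison, with the normal and a Student-t(3.5), the marginal used in the appendix code, as my assumed examples:

# compare 99% VaR with 97.5% ES
es_t = function(p, df) -integrate(function(x) x * dt(x, df), -Inf, qt(p, df))$value / p
es_n = function(p)     -integrate(function(x) x * dnorm(x),  -Inf, qnorm(p))$value / p
c(VaR99 = -qnorm(0.01),    ES975 = es_n(0.025))        # normal: about 2.33 versus 2.34
c(VaR99 = -qt(0.01, 3.5),  ES975 = es_t(0.025, 3.5))   # Student-t(3.5): of similar magnitude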

I suspect they picked the 97.5% so that there would be no change in the overall level of reported risk, while banks that were exposed further out in the tail would (correctly) appear as riskier relative to those that were exposed closer in. Within the limitations of any riskometer approach this makes obvious sense.

The higher estimation error, coupled with ES giving mostly the same answer as VaR, is why the Basel Committee made a mistake by switching to ES.

What about the practical objections?

The mathematics and the underlying assumptions of the subadditivity of VaR are clear.

These, of course, could be violated in practice. The mathematics suggests that this could happen in four cases:

  1. When the tails are super fat;
  2. When we are looking at outcomes that are not very extreme;
  3. When samples are finite;
  4. When there is extreme and risk-amplifying dependence.

The second is the risk of innocuous day-to-day events, which is so common in practical risk models but affords no protection of relevance to investors.

A criticism claiming that the subadditivity of VaR dismisses practical reality does have to indicate which of these four cases dominates in practice, because the maths is clear that subadditivity is violated only within them. It should also make clear why subadditivity is a dominant consideration.

If someone criticises the analysis by giving a concrete case where VaR violates subadditivity, either with a practical example or a mathematical analysis, I am all ears. Please send it to me, and I would be delighted to add it to this blog as a credible exception.

Fin

p.s. I thank Robert MacRae for valuable comments and Carlo Acerbi for challenging me to find the relevant nuance.

All errors are mine.

Appendix: Simulation code

library(copula)
#set.seed(123)

S = 1000          # number of simulation replications
tf = 0            # count of subadditivity violations
ts = 0            # cumulative size of the violations
mv = 0            # cumulative portfolio VaR quantile
n = 10000         # sample size per replication
df = 3.5          # degrees of freedom of the Student-t marginals
df_tcop = 4       # degrees of freedom of the t copula
rho = 0.7         # correlation parameter for the normal and t copulas
theta = 2         # dependence parameter for the Gumbel and Clayton copulas
p = 0.01          # VaR probability level

for(i in 1:S){
    # Alternative copulas to try:
    # cop = normalCopula(param = rho, dim = 2)
    # cop = gumbelCopula(param = theta, dim = 2)
    # cop = tCopula(param = rho, dim = 2, df = df_tcop)

    # The Clayton has the strongest lower tail dependence
    # and is hence best to explore this
    cop = claytonCopula(theta, dim = 2)

    U = rCopula(n, cop)
    #plot(U[,1],U[,2])

    # Alternative Gaussian marginals:
    # x = qnorm(U[, 1])
    # y = qnorm(U[, 2])

    # the t is probably the best marginal. df=3.5 is typical for stocks
    x = qt(U[, 1], df = df)
    y = qt(U[, 2], df = df)
    #plot(x,y)

    # empirical p-quantiles (negative numbers, the VaR quantiles)
    VaRx = sort(x)[p*n]
    VaRy = sort(y)[p*n]
    VaRxy = sort(x+y)[p*n]

    # subadditivity is violated when the portfolio quantile is at least as
    # negative as the sum of the individual quantiles, i.e. the portfolio VaR
    # is at least the sum of the individual VaRs
    violation = !(VaRx + VaRy < VaRxy)
    tf = tf + violation
    ts = ts + ifelse(violation, VaRx + VaRy - VaRxy, 0)
    mv = mv + VaRxy
    cat(VaRx, VaRy, VaRx + VaRy, VaRxy, violation, "\n")
}
cat("\n\n")
# violations, violation frequency, average violation size,
# average portfolio quantile, relative violation size
cat(tf, tf/S, ts/tf, mv/S, (ts/tf)/(mv/S), "\n")

