
Risk and scientific socialism

January 25, 2025
Modern financial risk forecasting has much in common with scientific socialism, relying on pseudoscientific models that can create a false sense of security. Effective risk management requires acknowledging these limitations.

One of the things that fascinates me about the financial system is that so many who work in it profess to believe in free markets — indeed, many view themselves as libertarians — while going about their business in an apparently very socialistic manner, with no irony. I could talk about bailouts, but not today.

We don’t hear much about scientific socialism these days, but it used to be very popular: a scientific blueprint for avoiding capitalism’s worst characteristics.

Friedrich Engels coined the term "scientific socialism" in his 1880 book Socialism: Utopian and Scientific when describing Karl Marx’s political, economic and social approach for improving society. The idea was to use scientific methodologies to predict what happens when experimenting with socialism, so as to achieve the best of all possible worlds. Come to think of it, scientific socialism is still very popular, even if the actual term has gone out of fashion.

Scientific socialism is based on empirically testable hypotheses about society. When reality does not follow the theory’s predictions, ad hoc assumptions are added to the initial theory to ensure it fits the actual facts.

Karl Popper, who taught at the LSE from 1946 to 1969, argued that scientific socialism is "pseudoscientific" in his 1945 book The Open Society and Its Enemies — one of my all-time favourite books.

The study of the philosophy of science dates back to Ancient Greece, where Aristotle analysed logic and empirical inquiry. That work was revived in the 17th century by Francis Bacon, whose scientific method emphasised empirical observation and inductive reasoning over pure logic or tradition, with the likes of Descartes, Hume and Kant further advancing the field.

Karl Popper revolutionised the philosophy of science in the 20th century by emphasising falsifiability as the key criterion for scientific theories. He challenged the traditional view that science progresses through verification or induction and argued that the scientific method is based on falsification, not verification. 

He maintained that no number of observations can prove a theory true, but a single contradictory observation can prove it false. A scientific theory must make testable predictions that can be refuted by empirical evidence. This separates science from pseudoscience — scientific theories must be open to falsification through rigorous testing.

Pseudoscience refers to any system of thought or theory that fails the criterion of falsifiability. A scientific theory must make testable predictions that could potentially be proven false through empirical observation. If a theory is structured in a way that makes it impossible to refute, it is not science — it is pseudoscience.

Modern philosophers see Popper's account of science as too simplistic, since no scientific theory can be tested in isolation. When a prediction fails, it is unclear whether the theory itself is wrong or whether an auxiliary assumption, like a measurement error or a piece of background knowledge, is to blame. LSE's Lakatos, who overlapped and collaborated with Popper, argued that science also depends on confirmatory evidence, not just falsification. Some theories are not easily falsifiable, as in cosmology, but that does not mean they are not scientific. In addition, the way science progresses is more complex than Popper's account suggests, as when theories are adjusted incrementally rather than rejected outright when anomalies arise.

None of that more recent nuance invalidates Popper's contention that scientific socialism is pseudoscientific. 

So, what does this have to do with financial risk management?

We quantify financial risk using the riskometer, a term I coined many years ago in a VoxEU blog. It is plunged into the City of London’s guts to extract risk measurements.

For the technically minded, the riskometer combines a concept of risk (such as Value-at-Risk), a probability (e.g. 99% daily) and an empirical approach (perhaps GARCH).
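To make this concrete, here is a minimal sketch of such a riskometer in Python, assuming nothing more than a series of daily returns: the concept of risk is Value-at-Risk, the probability is 99% daily, and two simple empirical approaches are shown, historical simulation and a RiskMetrics-style EWMA with a normal tail. The simulated returns, the decay factor of 0.94 and the normality assumption are illustrative choices, not a recommended model.

```python
# Minimal "riskometer" sketch: risk concept (Value-at-Risk),
# probability (99% daily) and two simple empirical approaches.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
returns = rng.standard_t(df=4, size=1000) * 0.01   # stand-in for daily returns

p = 0.99                                           # the riskometer's probability

# Empirical approach 1: historical simulation (empirical quantile of the returns)
var_hs = -np.quantile(returns, 1 - p)

# Empirical approach 2: RiskMetrics-style EWMA volatility with a normal tail
lam = 0.94
sigma2 = returns.var()
for r in returns:
    sigma2 = lam * sigma2 + (1 - lam) * r**2
var_ewma = -stats.norm.ppf(1 - p) * np.sqrt(sigma2)

print(f"99% daily VaR: historical simulation {var_hs:.4f}, EWMA-normal {var_ewma:.4f}")
```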

We can directly validate the predictions of a risk measure like Value-at-Risk by looking at ex-post outcomes, since a violation is directly observable. That means if all we are concerned with is 99% daily Value-at-Risk, its estimation and ex-post validation conform to the scientific method. Forecasts from a risk measure like expected shortfall cannot be validated in the same way: expected shortfall describes the tail, and since the underlying distribution, and hence the shape of the tail, is unknown, its forecasts cannot be checked directly against ex-post outcomes. However, we have reasonable approaches for doing so indirectly.
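As an illustration of that kind of ex-post validation, here is a sketch of a Value-at-Risk violation count with Kupiec's unconditional coverage test; the function, the simulated returns and the flat VaR forecast are hypothetical and purely for illustration.

```python
# Sketch of ex-post Value-at-Risk validation: count violations and apply
# Kupiec's unconditional coverage test (is the violation rate equal to 1 - p?).
import numpy as np
from scipy import stats

def kupiec_test(returns, var_forecasts, p=0.99):
    violations = returns < -var_forecasts              # loss exceeded the VaR forecast
    T, x = len(returns), int(violations.sum())
    pi_hat = x / T
    # Log-likelihood under the null coverage p and under the observed violation rate
    ll_null = (T - x) * np.log(p) + x * np.log(1 - p)
    ll_alt = (T - x) * np.log(1 - pi_hat) + (x * np.log(pi_hat) if x > 0 else 0.0)
    lr = -2 * (ll_null - ll_alt)
    return x, pi_hat, stats.chi2.sf(lr, df=1)          # violations, rate, p-value

rng = np.random.default_rng(0)
r = rng.normal(0.0, 0.01, 1000)                        # stand-in daily returns
v = np.full(1000, 2.33 * 0.01)                         # flat 99% VaR under normality
print(kupiec_test(r, v))
```

Passing such a test says nothing about the problems discussed next: it only checks the violation frequency of one model on one sample.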

But that is not sufficient. It neither conforms to the way we do this in practice nor to what is important.

Start with practice. The only practical way to validate a risk forecast model is by backtesting it on historical data. And the only "correct" way to do that is for the model designer not to know that historical data.

That is rarely, if ever, how it is done in practice. Instead, the person developing the model knows the historical data and is implicitly or explicitly biased towards models that work well with that historical data. And there is no way to test for that. For example, if we are backtesting a model with data that covers March 2020, we know what happened then, and we know how that causes models to fail. I, and most risk quants, can then make a model that passes any arbitrary historical scenario, without having any idea whether it will work in the future. Why not create a model that performs well on the past? After all, there are strong incentives to do exactly that.

Quoting my book Illusion of Control: "When I was in graduate school, I took a course in Soviet economics, and the professor told us a Russian joke: the best Soviet historians are experts in forecasting history because government policies were justified with reference to history. Thus history had to be "correct" if the historians were to survive.

When it comes to riskometers, if I try out one and one only, I will get the correct confidence intervals for my risk measurements. If, however, I arrive at the same riskometer as a result of trying out a large number of explanatory variables and model specifications, these confidence intervals will no longer be accurate. Any riskometer that performs poorly in forecasting history will be changed, leaving us with those that explain history well, a bit like the Soviet historians. The problem is that such riskometers will tend to do poorly in forecasting the future."

This is called data snooping in statistics, and there are ways to deal with it, like bootstrapping. What we know from data snooping is that the confidence bounds around the forecast become much wider when we actively search for a model that best fits the data. 
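Here is a toy simulation of that effect, assuming fifty equally mediocre candidate riskometers and a bootstrap of the best-of-K backtest statistic, in the spirit of White's reality check; every number is illustrative rather than calibrated to anything real.

```python
# Toy illustration of data snooping: pick the candidate model with the best
# in-sample violation record, then bootstrap the best-of-K statistic to see
# how optimistic and how variable that selected number really is.
import numpy as np

rng = np.random.default_rng(1)
T, K, true_rate = 1000, 50, 0.015          # days, candidate models, true violation rate

# Each column is the violation indicator series of one candidate model
violations = rng.random((T, K)) < true_rate

naive_best = violations.mean(axis=0).min() # best model, judged on the data used to pick it

# Bootstrap the distribution of the best-of-K violation rate (resampling days)
B = 2000
best_rates = np.empty(B)
for b in range(B):
    idx = rng.integers(0, T, T)
    best_rates[b] = violations[idx].mean(axis=0).min()

print(f"true violation rate:           {true_rate:.3f}")
print(f"best model's in-sample rate:   {naive_best:.3f}")
print(f"bootstrap 95% band for 'best': [{np.quantile(best_rates, 0.025):.3f}, "
      f"{np.quantile(best_rates, 0.975):.3f}]")
```

The selected model looks better than any of the candidates actually is, and the bootstrap band shows how far the selected backtest statistic can drift from the truth.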

So, if someone were to provide proper confidence bounds around risk forecasts validated by backtesting, bounds that incorporate data snooping, then yes, that methodology would be scientific. I have never seen that done in practice. I tried it once, and the bounds were so wide as to reject every model. I guess doing things right is costly, and certainly not conducive to career progression or bonuses.

Actually, you practically never see confidence bounds around risk forecasts, correctly calculated or not. Rather surprising given that most people producing forecasts are highly educated in statistics, and every stats course, from 101 onwards, emphasises the importance of testing the significance of predictions. OK, not that surprising when considering the incentives.

That makes risk forecasting not too different from scientific socialism. Both are based on the development of testable hypotheses validated by empirical observations. When reality does not agree with what is predicted by the model, ad hoc assumptions are added to the initial theory to ensure it fits the actual facts. If a theory is structured in a way that makes it impossible to refute, it is not science — it is pseudoscience. That is how risk forecast models are often made. That by itself is enough to make the riskometer pseudoscientific.

There is a more fundamental reason why risk management and forecasting are not scientific.

Nobody (or practically nobody) cares about risk forecasts by themselves. Instead, the risk forecast is an input into some practical decision: whether to buy an asset, determining the level of capital, compensating an asset manager, evaluating a potential vendor, and the like. Maybe the single main exception is compliance, either internal or with regulations. Which makes it all worse. A better way to state the initial sentence is: "Nobody should care about risk forecasts by themselves".

From the point of view of scientific methodology, we should also examine whether the concept of risk and the probability are appropriate for the intended use, not limit ourselves to merely testing whether a model itself is acceptable.

I like 1994 as the date when the riskometer was first proposed because that is when JP Morgan introduced Value-at-Risk through its RiskMetrics system.

That was a major positive shock to the empirical finance industry, giving lucrative careers to legions of quants. The riskometer solved an important problem: it promised an objective and scientific method of risk assessment, so that by employing it we would move away from subjectivity, incompetence and even corruption.

I was always sceptical (having studied Popper), and finally convinced myself the riskometer was not scientific when I wrote The Emperor Has No Clothes: Limits to Risk Modelling in 2002.

Fundamentally, the reason is that risk is a latent variable. Unlike in typical forecasting, such as of temperature, where what matters is a directly observable numeric measurement, with riskometers we never know what is correct, nor even what is important.

Not only is an actual risk forecast subject to significant estimation errors; if that were all, the riskometer might still be scientific. There are more factors. The operators of the riskometer actively search for a model that performs best on historical data, knowing the history, and then almost never provide confidence bounds, correct or otherwise. And we never see any analysis of whether a particular risk measure is appropriate for the intended use.

All we can do is validate model output against an assumption about what is important.

Ultimately, this means that in risk management, we can make an infinite number of statements about what is important, most of which are contradictory.

We have to choose. We have to make these choices based in part on subjective assessments of whether methods are fit for purpose or rest on fragile assumptions, and of how well they reflect the kinds of processes that drive the problems we should guard against. These choices cannot be wholly objective, because the assumption that the past contains every kind of problem that could happen in the future is heroic, and always wrong.

Taken together, this means we cannot falsify a risk forecast in the sense of Karl Popper, and risk management and forecasting are consequently not scientific.

For risk management, risk modelling and riskometers generally to be more scientific than scientific socialism, their users need to be realistic about data snooping and the difficulties of backtesting; about the wide bounds that surround the "best guess" estimates we habitually use; and, most of all, about the many fragile assumptions that underlie even the best of models.

However, I don't want to come across as nihilistic. I am not arguing risk management or backtesting or riskometers are useless, far from it, witness my website Financial Risk Forecasting. All I am saying is that the riskometer is not scientific in the Popperian sense. Of course, we need risk forecasting. Just recognise its limits. And that risk management is an art as well as a science.

What does this mean for riskometer practitioners?

p.s. I thank Robert Macrae for valuable comments.

