Now that we have explored two regulatory frameworks for deriving interest rate risk (Solvency II for the insurance sector in the EU and Wtp for the pension sector in the Netherlands), let us perform a model validation on a statistical model that calculates interest rate risk purely from historical data; we call this the BASE method.

The BASE method explores the full available distribution of changes in the data series and makes an inference about the likelihood of extreme observations over multiple projection horizons^{1}. The structure resembles the percentile calculations in the Wtp model, where the distribution of projections into the future is used to say something about the risk. The difference is that the BASE method uses only observed, historical data points. The relevance and reliability of the calculated risks are therefore influenced by the length and appropriateness of the historical data. For the current analyses we have combined two datasets: long-maturity Dutch sovereign bond rates (source: KEF, from 1900) and 12-year maturity zero-coupon bond rates (source: DNB, from 2004). Where the two datasets overlap, we assume that the DNB dataset is the more accurate. We have analyzed two periods: 1900-2023 (full period) and 1995-2023 (EU period).
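The percentile idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes additive, overlapping changes over the projection horizon, adds the empirical percentiles of those changes to the latest observed rate, and uses linear interpolation between order statistics. The function names (`empirical_quantile`, `base_projection`) and the yearly rates are hypothetical and are not the KEF/DNB data.

```python
from typing import Dict, List, Sequence


def empirical_quantile(values: Sequence[float], q: float) -> float:
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one observation")
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    weight = pos - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def base_projection(
    rates: Sequence[float],
    horizon: int,
    quantiles: Sequence[float] = (0.005, 0.5, 0.995),
) -> Dict[float, float]:
    """Percentile levels `horizon` steps ahead, built from historical changes.

    Assumption: changes are additive and taken over overlapping windows.
    """
    changes: List[float] = [
        rates[i + horizon] - rates[i] for i in range(len(rates) - horizon)
    ]
    current = rates[-1]
    return {q: current + empirical_quantile(changes, q) for q in quantiles}


# Hypothetical yearly rates in percent, for illustration only.
rates = [4.0, 4.5, 5.0, 4.0, 3.5, 3.0, 2.5, 3.0, 2.0, 1.5, 1.0, 0.5, 2.5, 3.1]

one_year = base_projection(rates, horizon=1)
three_year = base_projection(rates, horizon=3)
print(one_year)
print(three_year)
```

With so few observations the 0.5% and 99.5% levels sit almost on top of the most extreme observed changes, which is the same sensitivity to the length of the history that the rest of this article examines.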

**How would our BASE interest rate risk model look if we have limited data points (EU period)?**

First, we explore interest rate projections created using a restricted period starting from 1995, close to the initiation of the European Union. This period is characterized mainly by declining interest rates, with a few short periods of interest rate increases.

Given that we observe only relatively short periods of interest rate increases, it is not surprising that when we use the BASE method to calculate the interest rate risks for the subsequent 10 years, the negative (VaR99.5%) scenario mainly increases in the first 2 years, flattens at year 4 and follows a downward trajectory afterwards (see *Figure 2*). The negative scenario reaches a maximum of 6.1% in year 3 and then decreases to 3.8% in year 10 (somewhat similar to the 2023 level of 3.1%). Both the VaR50% and the VaR0.5% slope downward for most of the projection period. This is not surprising either, as the interest rate over the restricted period since 1995 also shows a downward trend.

When we compare the BASE predictions with Solvency II (2020), we note that the predictions up to 2 years ahead roughly align. However, as we go further into the future, the BASE predictions are much less conservative than the Solvency II model.

**How accurate is the BASE interest risk estimation based on the shorter period?**

We use the model validation method introduced in previous articles to backtest the reliability of the BASE interest risk method. In this case we use 2005-2023 as the validation period over which we estimate performance metrics (correlation and validation). The graph below plots monthly data of 1-year-ahead projections for the VaR99.5%, VaR50% and VaR0.5% scenarios relative to the rate actually observed in that period. For each data point in the validation timeframe, we use the realized data from 1995 until the preceding year to derive the percentiles. For example, for the 2008 percentile calculation we used realized interest rate data from 1995-2007.

With a correlation of 85% between the VaR50% and the realized DNB rate, the BASE method is relatively good at capturing the expected trend. Since we are interested in capturing risk, however, we need to focus on the tails. The validation shows that only 91% of the observations fall below the estimated 99.5% VaR level, which is 8.5 percentage points short of the 99.5% we would expect; in other words, we are not capturing a substantial share of the extreme observations. Furthermore, 4% of the observed data lies below the VaR0.5% level, indicating that we are not capturing the lower tail accurately either. Although these results are better than those of the current Solvency II (2016 version) model, we note that the overall volatility, as measured by the distance between the upper and lower tails, was also too low during the COVID and post-COVID period, which does not align well with reality.
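The backtest logic described above can be sketched as an expanding-window loop. This is a minimal illustration under assumptions, not the validation code behind the figures: at each origin only data up to that point is used, percentiles come from additive overlapping changes, and the helper names (`quantile`, `pearson`, `backtest`) and the yearly rates are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    weight = pos - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def backtest(rates: Sequence[float], first_origin: int, horizon: int = 1) -> Dict[str, float]:
    """Expanding-window backtest: at origin t only rates[: t + 1] is used."""
    medians: List[float] = []
    realized: List[float] = []
    under_upper = 0
    under_lower = 0
    for t in range(first_origin, len(rates) - horizon):
        history = rates[: t + 1]
        changes = [
            history[i + horizon] - history[i]
            for i in range(len(history) - horizon)
        ]
        actual = rates[t + horizon]
        upper = history[-1] + quantile(changes, 0.995)
        lower = history[-1] + quantile(changes, 0.005)
        medians.append(history[-1] + quantile(changes, 0.5))
        realized.append(actual)
        under_upper += actual <= upper  # covered by the VaR99.5% level
        under_lower += actual < lower   # breach of the VaR0.5% level
    n = len(realized)
    return {
        "share_below_var995": under_upper / n,
        "share_below_var005": under_lower / n,
        "correlation_median": pearson(medians, realized),
    }


# Hypothetical yearly rates in percent, for illustration only.
rates = [4.0, 4.5, 5.0, 4.0, 3.5, 3.0, 2.5, 3.0, 2.0, 1.5, 1.0, 0.5, 2.5, 3.1]
result = backtest(rates, first_origin=5)
print(result)
```

For a well-calibrated model, `share_below_var995` would be close to 0.995 and `share_below_var005` close to 0.005. In this toy series the jump from 0.5 to 2.5 is larger than any increase seen before it, so the upper level is breached at that point, much like a history with only short periods of rising rates underestimates a sudden increase.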

**Can we improve the BASE interest risk accuracy by extending the historical dataset?**

If we go further back in time and look at the development of the interest rate from 1900 onwards, we see that, except for the peaks around the two World Wars in the 1920s and 1940s, the period until 1960 is relatively stable. What follows is a highly dynamic period characterized by prolonged periods of rising interest rates followed by steep but relatively shorter drops.

Calculating the projected percentile levels using interest rate changes over the full dataset leads to more stable outcomes that more closely resemble the projections of the extended Solvency II (2020) model. As the distribution of the changes in interest rates over the full dataset is closer to normal, the projected VaR50% level fluctuates slightly around the current level of 3.1% over the full prediction period. As in the prediction based on the shorter period (see *Figure 2*), the VaR99.5% level drops slightly after year 4; however, this is short-lived, and it increases to 7.4% by year 10. In contrast to the continuous decrease in the VaR0.5% level seen for the shorter period, we observe an increase after year 6.

Performing backtesting of the BASE method on the full data series over the same model validation period leads to better results. The correlation between the realized DNB rate and the VaR50% remains strong at 85%, and we see improvements in the tail estimation.

The percentage of observations over the test period that lie below the 99.5% level is now 96%, while no realized data points fall below the 0.5% level. Furthermore, the volatility of the estimated values tends to be larger, and the higher upper tail risk observed recently is captured a lot sooner (around mid-2022, compared with the beginning of 2023 for the shorter dataset; see *Figure 3*). Overall, over the test period, we are now much better able to capture the extremes by using the longer dataset from 1900 rather than the shorter one from 1995.

**How accurate are 10-year ahead interest risk projections?**

When we perform the model validation on 10-year-ahead interest rate risk projections, the BASE model does not perform that well. The graph below (see Figure 7) indicates that as we try to predict the interest rate further into the future, we end up with less reliable estimates. The correlation between the VaR50% and the realized rate is negative (-48%), and all observations fall below the 99.5% scenario (or even the 95%), which indicates that we are overly conservative with respect to the upper tail risk when we predict 10 years into the future.

In addition, 20% of the observations lie below the 0.5% scenario, which shows that the lower risk estimate was not low enough. This also tells us that the size of the declines in the 2015-2020 period was quite exceptional in comparison to all other periods of decline observed in the full dataset. We note that the realised rates did not cross the lower percentile aggressively, but rather stayed close to that lower band. This shows us that, although not very accurate, the model still provided a useful risk estimation.

This also serves as a reminder that estimates need to be updated frequently (at least yearly) and that one should be more careful when utilising risks calculated for the more distant future (e.g. by using more conservative risk percentiles).

We have also performed model validation on other horizons and noted that performance statistics remain high for projections up until 4 years into the future for the BASE interest risk model.

**Do the metrics above indicate that we should always use the full dataset when we calculate interest rate risk?**

Figure 6 shows that the full dataset leads to better results compared to using only the short EU period. Indeed, having sufficient observations is very important for every statistical analysis, otherwise the validity of the results is compromised.

Nevertheless, using the full dataset is not necessarily better if the current environment differs from most of the periods in the available dataset. For instance, if the available data mainly covers periods of relatively stable inflation, economic growth and employment, the model would produce quite mild risk estimates that would not be appropriate if the macro-economic environment changes to what we saw in 2022.

Therefore, what matters more than using the full dataset is using the “correct data points”. That is, we might need to restrict our estimation to include only periods that resemble the current economic situation to get a proper risk estimation. Evidently, the EU dataset (since 1995) failed to provide us with the context that is most similar to the current, post-COVID, environment.

**In a nutshell…**

Utilizing regulatory frameworks such as Solvency II or Wtp can be a good way to start to obtain an interest rate risk estimation. However, this model validation shows us that a pure statistical approach, such as the BASE method, can lead us to a more accurate interest risk estimation.

We also see that the length of the historical data series is of paramount importance for an accurate risk estimation. When using a shorter dataset starting from 1995 we see that, although better than the current Solvency II model (2016 version), the method also does not fully capture the interest rate dynamics of the post-COVID period. When we use the full dataset since 1900, we see a substantial improvement in the backtest results, which proved to be more accurate than the updated Solvency II model (2020 version).

The BASE method has some limitations. The first stems from its inability to adequately capture different maturities. The dataset before 2004 only contains 10-15-year maturities. Therefore, if we want to use the BASE method for different maturities, we would have to make additional model assumptions, which would create additional model uncertainty.

Secondly, the BASE method does not say much about the context we are in. If we do not properly understand the current dynamics that affect the interest rate, we cannot be sure that either the full (1900-) or the EU-era (1995-) period is better at estimating the future interest rate risk. Also, when we choose one of these periods, we would display the same risk estimate independent of the period that we are in, also in periods of relative stability. In those periods many would find this too conservative and not needed. Therefore we cannot consider those predictions ‘better’ if the current context is very different from the period that was used to come to the risk estimate.

We will address the limitation of independence of context by performing a model validation on the REGIME method, which we will discuss in the following article.

**Footnotes**

- *This model can be seen as a simplified Vasicek model where no mean reversion correction is applied. This would make the model slightly more conservative in case this assumption does not hold, for example when interest rate levels were to deviate far from an unobserved long-term mean. This approach is consistent with findings by Jan Willem van den End, who found a mean reversion parameter close to 0 for long-term interest rates in his research paper ‘Statistical evidence on the mean reversion of interest rates’.*