Ratio Mathematica Volume 43, 2022

An adaptive Neural Network approach to predict the Capital Adequacy Ratio

Giacomo Di Tollo* Gerarda Fattoruso† Bartolomeo Toffano‡

Abstract

Financial institutions, policy makers and regulatory authorities need to implement stress tests in order to assess both resilience and the consequences of adverse shocks. The European Central Bank and the European Banking Authority regularly conduct these tests, whose importance has become increasingly evident since the financial crisis of 2007-2008. The non-linear features of the variables and scenarios involved in stress tests triggered the need for general and robust strategies to perform this task. In this paper we introduce an adaptive Neural Network approach to predict the Capital Adequacy Ratio (CAR), which is one of the main ratios monitored to retrieve useful information in many stress test procedures. The Neural Network approach is based on a comparison between feed-forward and recurrent networks, and is run after defining meaningful pre-processing operations. Results show that our approach is able to successfully predict CAR by using both feed-forward and recurrent networks.

Keywords: Capital Adequacy Ratio; Stress Tests; Neural Network Approach.^1

*Department of Law, Economics, Management and Quantitative Methods (DEMM), University of Sannio, Benevento, Italy; giditollo@unisannio.it
†Corresponding Author. Department of Law, Economics, Management and Quantitative Methods (DEMM), University of Sannio, Benevento, Italy, and NEOMA BS, Rouen, France; fattoruso@unisannio.it
‡Department of Economics, Ca' Foscari University, Venice, Italy; bartolomeo.toffano@unive.it
^1 Received on July 20, 2022. Accepted on September 20, 2022. Published on September 25, 2022. doi: 10.23755/rm.v43i0.841. ISSN: 1592-7415. eISSN: 2282-8214. ©Di Tollo et al. This paper is published under the CC-BY licence agreement.

1 Introduction

Banks' bankruptcy may have catastrophic effects on the overall economy, since the contagion effect it may trigger could lead to a generalised crisis [47]. To this extent, banking supervision and the role and authority of bank regulation play a big role in preventing (or reducing the effect of) banks' bankruptcy [11, 44]. Although aimed at different targets, many of these supervision exercises are designed to maintain a sufficient level of capital adequacy in banks, so that they allocate specific reserves to face expected losses and protect themselves against excessive credit expansion. Regulations impose constraints on these reserves, even though banks often impose on themselves reserves higher than the ones required by the regulations. These constraints are defined by a minimum capital adequacy ratio that measures the level of capital as a function of the risk borne by the bank [27].

In this framework, systemic risk represents the risk of breakdown of the entire financial system: this can be triggered by the misbehavior of a single component, which has negative impacts on the overall system. This scenario could be observed in the years 2007-2008, when some cracks in the subprime mortgage markets led to a worldwide financial crisis: this confirmed the idea that the more complex and non-linear a system is, the higher its probability of failure [23, 28]. In order to prevent these failures (and/or to quantify financial (un)stability), financial institutions resort to stress tests, which are non-linear tools used to assess the magnitude of an exogenous shock and to determine a collapse threshold. These tools operate by investigating both the local stress level (i.e., a single bank) and how the shock spreads globally.
Back in the 1990s, stress tests were intended for testing the resilience and the stability of a financial institution in order to reach a certain credibility. Later, stress tests have been used to check the stability and vulnerability of both single financial institutions and overall banking systems [16]. Based on the evaluation aim and on the implications of the findings, stress tests can be classified into two major groups: microprudential stress tests, which are forward-looking supervisory instruments for determining the liquidity adequacy of individual banks in relation to their portfolio risks [4]; and macroprudential stress tests, which consist of two different types of approaches, the bottom-up and the top-down approach [16]. In the bottom-up approach, the effect is measured using data on individual portfolios. In the top-down approach, on the other hand, the impact is estimated by using aggregated data.

Many computational methods have been introduced to perform stress tests and to predict bank failures: Discriminant Analysis [37], Logit and Probit analysis [8], Neural Networks [53], just to name a few. There are also contributions that compare different methods: for instance, [2] investigates Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forests (RF), Support Vector Machines (SVM), Neural Networks (NN) and Random Forests of Conditional Inference Trees (CRF); [18] compared Generalized Linear Models and Generalized Additive Models, and concluded that Generalized Linear Mixed Models have a better ability to predict troubled businesses.

During a stress-test procedure many variables are taken into account to monitor the financial institutions, and several studies aim to assess the relative importance of these variables, in order to select which variables to monitor to get the most accurate information possible.
Many contributions focus on determining the relationships between Capital Ratios and bank failures [1], and on understanding whether these ratios are useful to assess the regulatory capital adequacy [42]. In this context, a careful investigation of Capital Ratios is crucial for both the regulatory authority and the bank itself, since it has been shown that the familiar banking characteristics for identifying a distress-prone bank identified fragile banks effectively during the global crisis without new information and are likely to continue to work well in the future [40]. Amongst many variables (e.g., profitability, liquidity, solvency, productivity, asset quality, see [54]), the Capital Adequacy Ratio (CAR) has a prominent place, and has been used in the predictor set by many works [39, 31, 14, 38, 33, 5, 48]. Recently, some contributions proposed to predict it as an indicator of financial health [49]. In our contribution we want to expand this framework by implementing an adaptive Neural Network approach to predict CAR from a well established set of indicators, and to provide banks and regulators with useful information about their stress-test activity.

Our contribution is organised as follows: Section 2 reports the main literature on the topic; Sections 3 and 4 outline the set of data and the pre-processing operations performed on it; Section 5 introduces the methods used in this paper; Section 6 comments on the main results and Section 7 concludes the paper.

2 Literature review

CAR represents a particularly relevant topic for assessing the risks to which banks are exposed [6]. In fact, the construction of the CAR index considers credit risk, market risk, interest rate risk and exchange rate risk. In this sense, the regulatory authorities define the CAR as a significant indicator of safety and stability, as it considers capital as a useful element to absorb losses [34].
Currently, the Capital Adequacy Ratios (CARs), defined as the minimum ratio of capital to risk-weighted assets, are 8% under Basel II and 10.5% under Basel III [3, 12]. Based on this, CAR represents a factor analysed by regulators to determine capital adequacy for banks and to perform stress tests [29]. In order to aggregate the information coming from the literature, we performed an analysis on bibliographic data using the software VOSviewer (Figure 1) to create a keyword co-occurrence map and analyze the main CAR literature in our field of analysis.

Figure 1: VOSviewer: Capital Adequacy Ratio

From the analysis of the data, it emerges that several authors analyze the required minimum levels of the CAR by evaluating macroeconomic indicators [46], [7], financial indicators [50], and multi credit rating indicators [45]. Furthermore, many authors carry out stress tests on CAR to verify the effects of economic crises [58], stability [25] and resilience [15] of banks, along with macro stress tests for resilience assessment [20]. Recent studies are moving towards identifying the most important variables for future projections of the CAR. In particular, [50] carry out a study on South Korean national banks using Random Forest Boruta algorithms, Random Forest Recursive Feature Elimination, and Bayesian Regularization Neural Networks. Other contributions use CAR to benchmark the performances of banks in stress tests [3, 12, 24, 27, 29, 32, 59].

The goal of our contribution is to assess whether we can use stress-testing to effectively benchmark the performance of a bank in a precise scenario, and to this extent we need to choose a metric that can precisely fill that role.
CAR is apt to measure the financial soundness of banks in absorbing a reasonable amount of loss. On the basis of the central role that the CAR assumes in the assessment of banks, and of the guidelines of the literature on the analysis of the minimum levels of the CAR, our work aims to accurately predict the CAR by using quantitative methods. In this framework the quantitative research about stress-testing has been twofold: on one side, to predict the banks' bankruptcy; on the other side, to assess the features of the different variables and their capability to explain the default. According to [54], Neural Networks are widely used in contributions related to the first side, while their applications to the second are still limited.

We can start our discussion by pointing out that, along with stress testing, a key topic is the prediction of various risks. This prediction was based on traditional probability and statistical theories [9], which could lead to non-linear formulations or to taking into account just a few variables, hence triggering the need for complex and non-linear models, also due to the needs of a more interconnected world, not only in financial terms. For this reason researchers and risk managers resort to Artificial Neural Networks and Deep Learning to perform stress testing activities and to predict high volatility periods. More hidden layers in a Neural Network allow more complex interaction effects to be modelled; this matters in financial forecasting, where large collections of data often involve dynamic relationships that are difficult or impossible to specify under a complete model [9]. On the other hand, deep learning models can identify and manipulate dynamic non-linear data connections that are invisible to any current financial economic theory and may deliver more reliable predictive outcomes than traditional approaches [30].
Although Artificial Neural Networks and Deep Learning have several applications in the financial field, such as credit scoring and the prediction of financial crises and bankruptcy, we want to focus on how Artificial Neural Networks and Deep Learning methods are related to stress testing. Financial stability is essential to the economic growth of countries and individuals. Regulatory agencies and foreign organisations carried out stress testing activities to determine the stability of the financial system even earlier than 2007, but failed to anticipate the unprecedented economic implications of the crisis. For this reason, more and more stress testing exercises have been created and used by the authorities, with an eye to the consequences of an interconnected financial system in the macroeconomic environment. For example, the European Banking Authority (EBA) approach uses simplified assumptions that cover only particular risks to individual bank balance sheets depending on the macro-economic scenario. One of the major drawbacks of the EBA approach is the static financial statement assumption, under which assets and liabilities stay stable over the horizon considered, without any account of management decisions or new loans. Macroeconomic feedback effects, such as the influence of large insolvent firms on the global economy, are generally not taken into account in these systems. This kind of test is aimed at planning the post-crisis recovery behaviour. However, [52] show that the main problem of the EBA approach is that this mechanism does not provide an early warning, which leaves institutions unprepared in the face of a shock. [52] also provide a solution to this weakness of the model.
They propose a neural approach in which financial or macroeconomic disturbances extend to the banks' balance sheets while simultaneously building a large Neural Network with macro and financial factors. The model is capable of gathering more of the knowledge concealed in a large data set and allows for complex non-linear interactions that materialize under adverse macroeconomic conditions and financial strain. This methodology examines the financial system independently, without relying on the forecasts of the single banks. The results of the cited paper, obtained by comparing static stress test models with dynamic ones, show that the deep learning framework can improve the early warning mechanism's signalling ability. The authors, finally, compare the performance of the Deep Learning technique with the classic stress test models, such as the constant balance sheet approach and the dynamic balance sheet approach to satellite modelling. They reveal that the prediction error of the CAR dropped significantly under the Deep Learning method, due to its improved performance in simulating the one-year gains and losses of financial institutions. For this reason, Deep Learning architectures may become a useful tool for macroprudential stress testing and can improve the ability of early warning mechanisms to anticipate future financial crises and failures of individual banks.

Now that we have a measure by which we can benchmark banks, we need to find a way to predict the CAR of banks based on a set of factors, which is what we need in order to stress test banks. The more factors we can incorporate in our predictions the better, since the predictions will better reflect a real-world situation and make our stress testing much more realistic.
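For reference, the measure discussed in this section can be written compactly. The following is only a sketch based on the definition quoted above (capital over risk-weighted assets, with the 8% and 10.5% minima); the symbols $C$ and $\mathrm{RWA}$ and the numbers in the worked example are illustrative assumptions, not values taken from our data set.

```latex
\[
\mathrm{CAR} = \frac{C}{\mathrm{RWA}} \times 100\%,
\qquad
\mathrm{CAR} \ge 8\% \ \text{(Basel II)},
\qquad
\mathrm{CAR} \ge 10.5\% \ \text{(Basel III)}.
\]
% Illustrative example (assumed numbers):
%   C = 12, RWA = 100  gives  CAR = 12%, above both minima;
%   C = 9,  RWA = 100  gives  CAR = 9%,  above the Basel II minimum only.
```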
3 Data set

Our data set consists of worldwide banks' financial indicators; along with stress financial indicators, we have also considered macro-economic indicators, in order to identify the propagation of systemic shocks into the financial institutions. We have retrieved quarterly observations covering the period from 2007 to 2019.

Data was collected from different sources: stress financial indicators have been collected from the Federal Deposit Insurance Corporation^2 website^3; macro-economic indicators were collected from the Federal Reserve Economic Data^4 website^5. We have collected quarterly data referring to 672 banks and financial institutions between 2007 and 2019 (52 quarters each), hence we have 34944 observations. The sample includes missing and noisy values. Please notice that we have chosen a sample period that does not contain sub-periods denoting the emergence of a crisis, since we want to develop a methodology for ordinary periods, in which systemic shocks are more difficult to detect.

^2 FDIC is an independent agency created by the Congress to maintain stability and public confidence in the nation's (USA) financial system. The FDIC insures deposits; examines and supervises financial institutions for safety, soundness, and consumer protection; makes large and complex financial institutions resolvable; and manages receiverships.
^3 https://www.fdic.gov/, referred to as FDIC in what follows.
^4 https://fred.stlouisfed.org/, referred to as FRED in what follows.
^5 Researchers at the St. Louis Fed contribute to monetary policy discussions by advising on a range of topics, especially in preparation for Federal Open Market Committee meetings (from the FDIC website, https://www.fdic.gov/, accessed on January 29th, 2021).

Table 1: Variables used in the experimental phase and the category they belong to: the label FIN denotes financial indicators and MAC denotes macro-economic indicators.

Name           Description                                            FIN / MAC
net loan       Net loans and leases exposure                          FIN
loss allow     Loss allowance to loans                                FIN
dep            Total deposits                                         FIN
yield ea       Yield on earning assets                                FIN
fundc ea       Cost of funding earning assets                         FIN
inc aa         Noninterest income to average assets                   FIN
CAR            Total risk-based capital ratio                         FIN
tot asst       Average total assets                                   FIN
tot eq         Average total equity                                   FIN
tot loan       Average total loans                                    FIN
risk dens      Risk weight density                                    FIN
GDP growth     Gross Domestic Product growth                          MAC
export growth  US real exports of goods and services growth           MAC
debt GDP       US public debt to GDP                                  MAC
govex GDP      US government expenditure to GDP                       MAC
inflat         Implicit price deflator as a measure of US inflation   MAC
HPI growth     House Price Index growth                               MAC
unemp          Unemployment rate (age 15-64)                          MAC
Yield 10Y      10-year US sovereign bonds yields                      MAC
SP500 ret      S&P 500 quarterly returns                              MAC

Collected data show a number of correct entries which is smaller than its theoretical value: this is due to missing and noisy data, and could lead to misbehavior of the Neural Network approach, hence we had to devise pre-processing operations, which are outlined in what follows.

4 Data pre-processing

Data analysis is a key point in all experimental settings, and it is always performed in order to understand the features of the data, to detect anomalies (if any), and to represent data without losing useful information. Based on the observations outlined by [13, 21], we apply the following data pre-processing operations.

Table 2: Variables used in the experimental phase: overall main statistics before pre-processing operations.

Name           Mean        STD         Kurt.     Skewn.    Min       Max
net loan       684062.10   2845178     128.86    10.17     0         75190000
loss allow     10811.40    92866.23    1170.43   29.11     0         5752000
dep            848100.51   4672760     498.48    18.76     68        2180000
yield ea       4.61        1.21        20.01     2.38      0.07      26.96
fundc ea       1.24        0.88        2.67      1.32      0         16.59
inc aa         1.47        18.11       722.73    26.40     -15.95    601.27
CAR            23.68       15.42       113.18    6.08      0.75      725.80
tot asst       1114221     5587407     302.58    14.46     2816      2110361
tot eq         125720.10   566772.30   129.17    10.33     539.75    14389800
tot loan       683913.22   2852402     129.57    10.19     0         71201027
risk dens      60.09       14.28       0.63      0.06      8.43      192.24
GDP growth     1.73        2.35        4.05      -1.53     -8.45     5.51
exp. growth    3.63        8.17        5.14      -1.19     -28.65    25.84
debt GDP       93.95       11.27       1.38      -1.49     61.65     105.18
govex GDP      0.34        0.01        -1.09     0.37      0.31      0.37
inflat         100.43      4.51        -0.94     0.12      91.70     111.25
HPI growth     201.26      20.39       0.22      0.92      176.86    264.31
unemp          7.22        1.89        -1.40     -0.07     3.78      10.05
Yield 10Y      2.64        0.72        -0.02     0.70      1.56      4.84
SP500 ret      0.02        0.06        6.76      -1.47     -0.27     0.17

4.1 Removal and replacement

When collecting data, one may run into missing and incorrect values. Previous contributions related to Neural Network approaches [21] suggested removing indicators containing more than 30% of missing and wrong values. Our set of data does not contain such indicators, so we are using the whole set of variables in our experimental phase. Nevertheless, many indicators show missing and wrong values, so we replace missing values (due to computational errors) with the upper limit of the normalization (see what follows), and wrong values with the indicator's average over time.

4.2 Normalization

Normalization is a general procedure performed in order to feed the Neural Network with data belonging to the same range: many contributions stress the importance of performing meaningful normalization, and many formulas are suggested [35].
In our case we used the logarithmic transformation that has already been introduced by [21], defined as follows:

\[
\bar{x}_i = \log_u\left(\left|\min(0, x_{\min})\right| + x_i + 1\right), \tag{1}
\]

where $x_i$ represents the value before normalisation of input $x$ for firm $i$, and $\bar{x}_i$ represents its normalised value. Please notice that we have defined $u$ such that $u = x_{\max} + 1$; this has been imposed in order to have $\bar{x}_i \in [0, 1]$.

Table 3: Main statistics of overall financial and macro-economic indicators after pre-processing operations.

Name           Mean    STD     Kurtosis  Skewness  Min   Max
net loan       0.67    0.08    1.41      0.42      0     1
loss allow     0.48    0.10    1.29      0.25      0     1
dep            0.64    0.07    1.38      0.58      0.22  1
yield ea       0.53    0.06    5.19      -0.07     0.02  1
fundc ea       0.30    0.14    -0.28     0.53      0     1
inc aa         0.41    0.06    6.68      0.05      0     1
CAR            0.50    0.07    1.23      0.86      0.08  1
tot asst       0.65    0.07    1.38      0.66      0.42  1
tot eq         0.63    0.08    1.36      0.71      0.38  1
tot loan       0.67    0.08    1.43      0.43      0     1
risk dens      0.79    0.05    1.48      -0.74     0.44  1
GDP growth     1.06    0.31    2.58      -1.52     0     1
export growth  0.99    0.2     10.45     -3.09     0     1
debt GDP       0.97    0.02    2.11      -1.71     0.89  1
govex GDP      0.93    0.03    -1.09     0.36      0.86  1
inflat         0.98    < 0.01  -1.17     0.03      0.96  1
HPI growth     0.96    0.01    -0.83     0.01      0.92  1
unemp          0.86    0.10    -1.27     -0.28     0.65  1
Yield 10Y      0.79    0.11    -0.93     0.10      0.55  1
SP500 ret      1.12    0.57    -1.08     -0.09     0     1

4.3 Correlation analysis

We have performed a correlation analysis in order to understand whether some kind of correlation arises amongst the variables defined in Section 3, and to avoid feeding the network with highly-correlated indicators. We have tested Pearson's, Kendall's, and Spearman's correlation, leading to similar trends in the obtained correlations. In what follows we will refer to Spearman's rank-based correlation. We have decided to remove from the predictor set the indicators showing a correlation greater than a given threshold (i.e., $h$) with a given portion (i.e., $j$) of the other indicators. In order to determine the values of $j$ and $h$ we have defined a parameter-tuning procedure via REVAC (see [43]).
The values found have been $j = 1/3$ and $h = 0.70$. On the basis of these values, we have decided to remove indicators showing a correlation greater than 0.7 with 30% of the other indicators. These indicators are: net loan, loss allow, dep, tot asst, tot eq, tot loan. They will not be considered in what follows: 6 indicators have been removed from the predictor set, corresponding to 24 quarterly indicators that will not be used to feed the Neural Networks' nodes.

As for the Neural Network experiments, in what follows we outline the results of the experiments run by using data before the pre-processing operations (referred to as full model) and data after the pre-processing operations (referred to as reduced model).

5 Experimental analysis

In this section we introduce the methods used to perform our experimental analysis: the Neural Network approach is detailed in Section 5.1, along with the main components needed to define its use, i.e., the network topologies (Section 5.1.1) and the partitioning of the set of data to enforce generalisation (Section 5.1.2). Then, we introduce the methods we are comparing our approach with (Linear Regression in Section 5.2, and Generalised Linear Models in Section 5.3), along with the metrics used for our comparisons in Section 5.4.

5.1 Neural networks

In this section we introduce our Neural Network approach: Artificial Neural Networks [22] can be referred to as algorithms that mimic the behavior of the human brain to perform complex tasks, and they are used to grasp non-functional relationships over the data. They are composed of elementary units (neurons) which are connected to each other via weighted and oriented links (synapses). Neurons may have different functions: the input neurons receive data from external sources; the output neurons show the computed output values; the hidden neurons are used to perform computations.
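To make the roles of input, hidden and output neurons concrete, the following minimal sketch propagates one input pattern through a small fully connected network. It is only an illustration under stated assumptions: the sigmoid hidden activation, the linear output unit, the bias terms and the toy weights are our own choices for the example and are not taken from the experiments.

```python
import math


def forward(inputs, layers):
    """Propagate one input pattern through fully connected layers.

    `layers` is a list of (weights, biases) pairs; weights[k][m] is the
    weight of the synapse going from neuron m of the previous layer to
    neuron k of the current layer.
    """
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum of the incoming synapses for every neuron of the layer.
        sums = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output = index == len(layers) - 1
        # Assumption: sigmoid hidden neurons and a linear output neuron.
        if is_output:
            activations = sums
        else:
            activations = [1.0 / (1.0 + math.exp(-s)) for s in sums]
    return activations


# Toy network (assumed weights): 2 input, 2 hidden and 1 output neuron.
hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0])
output = ([[1.0, -1.0]], [0.5])
prediction = forward([0.0, 0.0], [hidden, output])
```

With a zero input pattern both hidden neurons output 0.5, so the two output synapses cancel and the prediction equals the output bias.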
During the learning phase, the weights associated with the synapses iteratively change over time, according to a specific algorithm. Several algorithms have been proposed for this learning phase: Back-Propagation [56], Quasi-Newton methods [55], the Levenberg-Marquardt algorithm [26], just to name a few. The learning phase may be organised following three different paradigms: supervised learning [36], unsupervised learning [10] and reinforcement learning [57]. In what follows, we train the networks introduced in Section 5.1.1 by using the back-propagation algorithm, in order to minimize the network's root mean square error (RMSE) on the test set, defined as

\[
\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(e_i - a_i\right)^2}, \tag{2}
\]

where $n$ is the test set size, $e_i$ is the expected output value corresponding to pattern $i$, and $a_i$ is the actual network output corresponding to input pattern $i$.

5.1.1 Network topologies

For our experiments we are using two different neural topologies: a feed-forward^6 architecture with 80 input nodes, referred to as standard network (see Figure 2), and a variant in which input neurons corresponding to the same indicator are grouped by 4 before feeding the first feed-forward layer^7 (see Figure 3). Please notice that, as for the cardinality of hidden neurons, many rules of thumb exist, suggesting different formulas to compute the number of hidden layers and the number of hidden neurons [19]. We have decided not to use any of these rules, resorting to an adaptive method to determine the optimal hidden neurons structure: this procedure has been proposed by [17], and it aims to minimize the network's error (in our scenario, Eq. 2) calculated for each data set at hand. This adaptive procedure starts with a single hidden neuron and iteratively adds one neuron until no improvement on Eq. 2 is found over the last user-defined K iterations, and is outlined in Algorithm 1.

^6 A feed-forward network features neurons grouped into layers $(1, 2, \ldots, l_{\max})$: each neuron belonging to layer $i$ ($i < l_{\max}$) is associated with synapses that connect it to all neurons belonging to layer $i + 1$.
^7 These four values correspond to past observations spreading over one year, since for each indicator $i$ corresponding to time $t$ the input pattern contains the value of the indicator collected at time $t$, together with the 3 previous quarterly collected values.

Figure 2: Standard network.

Figure 3: Ad-hoc network: input neurons are grouped by four, indicating the observations over a year of the same variable.

Algorithm 1: Adaptive hidden neurons computation for Neural Networks

    Initialization: observational data set
    Result: the optimal network topology w.r.t. the error defined in Eq. 2
    for b in 1, ..., # sub-sampling runs do
        X_b ← data set at b-th sub-sample
        Xtrain_b ← training set at b-th sub-sample run
        Xtest_b ← test set at b-th sub-sample run
        for i in 1, ..., # of hidden layers do
            k ← 0
            j ← 1
            while k ≤ K do
                train net Net_ij on Xtrain_b
                compute RMSE_ij error with Eq. 2 on Xtest_b
                if Overall_ij ≤ all RMSE_ij then
                    BestNet ← Net_ij
                    Neurons ← (i, j)
                end
                if Overall_ij > RMSE_Neurons then
                    k++
                else
                    k ← 0
                end
                j++
            end
        end
        BestNet_b ← BestNet
        Neurons_b ← Neurons
    end
    return BestNet, RMSE = Eq. 2, Neurons

5.1.2 Training and test set

In our experiments we are exploiting supervised learning, meaning that during the learning phase, for each input pattern, we also provide the desired output value, which in our scenario corresponds to CAR: all other indicators considered in Table 3 define the input pattern for each financial institution. In order to grasp the time dynamics, for each input indicator $i$ we provide the network with the value of the indicator collected at time $t$, together with the values collected at time $(t - 1)$, $(t - 2)$, and $(t - 3)$.
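The construction of the lagged input patterns can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function name `build_patterns`, the dictionary layout and the toy series are hypothetical, and we assume here that the target is the CAR value of the same quarter $t$ as the most recent input.

```python
def build_patterns(series_by_indicator, target, n_lags=4):
    """Build (input pattern, desired output) pairs for one institution.

    series_by_indicator: dict mapping an indicator name to its list of
    quarterly values (all lists share the same length as `target`).
    Each input pattern holds, for every indicator, the values observed
    at t, t-1, t-2 and t-3, so 4 input neurons per indicator.
    """
    names = sorted(series_by_indicator)
    patterns = []
    # The first usable quarter is the one with n_lags - 1 predecessors.
    for t in range(n_lags - 1, len(target)):
        row = []
        for name in names:
            series = series_by_indicator[name]
            row.extend(series[t - lag] for lag in range(n_lags))
        # Assumption: the desired output is the CAR of quarter t.
        patterns.append((row, target[t]))
    return patterns


# Toy data (assumed values): two indicators over five quarters.
indicators = {
    "unemp": [1, 2, 3, 4, 5],
    "inflat": [10, 20, 30, 40, 50],
}
car = [0.1, 0.2, 0.3, 0.4, 0.5]
patterns = build_patterns(indicators, car)
```

With five quarters and four lags only two patterns can be built, each with 2 × 4 = 8 input values; the same grouping by four is what the Ad-hoc topology exploits.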
During the Neural Network learning we have to identify two disjoint sets of observations out of the overall 34944 observations: the training set, which is used to determine the synapses' weights, and the test set, which is used to determine the network performance and to stop the learning. According to [19], we have decided to split the overall data by randomly allocating 70% of its observations to the training set, and the remaining 30% to the test set. This random allocation has been repeated 50 times, each time determining a different train-test partition. We have then run our Neural Network approaches on all obtained partitions, and in what follows we report, for each neural approach, the average and standard deviation over the 50 partitions.

5.2 Linear Regression

Linear Regression is used to model the relationship between two or more variables by fitting a linear equation to the observed data. Usually, this is done using least-squares regression, which minimizes the sum of the squares of the vertical deviations of each data point from the line. The algorithm aims to reduce this sum by selecting the most appropriate constants in the equation of the regression line. A Linear Regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0) [60].

5.3 Generalised Linear Models (GLMs)

The basic Linear Regression predicts a certain value as a linear combination of a specific set of observed values, meaning that a change in one or multiple predictors affects the response variable.
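As a small worked example of the least-squares fit of a line Y = a + bX, the sketch below uses the closed-form solution for a single explanatory variable. The helper `fit_line` and the toy data are hypothetical and only meant to illustrate the formula; they are not the regression code used in the experiments.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X and cross-deviations of X and Y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Toy data (assumed values) lying exactly on the line Y = 1 + 2X.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
```

Since the toy points lie exactly on a line, the vertical deviations are all zero and the fit recovers the intercept and the slope of that line.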
However, for complex data, Linear Regression is not very effective, and in these cases one may resort to Generalized Linear Models [41], which allow response variables to have arbitrary distributions (rather than normal distributions), and define an arbitrary function of the response variable (the link function) to vary linearly with the predictors, rather than assuming that the response itself must vary linearly. A Generalized Linear Model (GLM) consists of three elements: the linear predictor; the link function; the probability distribution (or exponential family). The linear predictor is the linear combination of the parameter b and the explanatory variable x. The link function is what links the linear predictor and the probability distribution: there are many link functions, and they are usually chosen depending on the features of the data we are trying to predict and on the range in which we expect it to lie.

5.4 Evaluation of the model

Once we build a model through Linear Regression (or GLMs), we need to measure the correctness of this model. This can be done via different statistical measures, which are used to benchmark the performance of predictive models.

5.4.1 Root mean squared error

The root mean squared error represents the standard deviation of the prediction errors: it tells us how concentrated the data is around the line that we predict. To calculate it we take the square root of the mean squared error:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2},
\]

where $n$ is the number of predictions, $Y_i$ are the observed values, and $\hat{Y}_i$ are the predicted values.

5.4.2 R-Squared

The R-Squared represents how well the data fits the regression line. More generally, it is used to analyze how much of the variation in one variable can be explained by other variables.
In the case of regression, we can reason in percentages and say that the closer the measure is to 1, the closer the points are to the regression line, up to the value 1, where 100% of the points are on the line. Generally, this means that the higher the R-squared, the better the results, but this can be false in some edge cases. In simple linear regression it equals the square of the correlation coefficient; in general it is calculated with the formula

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}},$$

where $SS_{res}$ is the sum of squares of residuals, and $SS_{tot}$ is the total sum of squares.

5.4.3 F-statistic

The F-statistic in a regression is a value that represents how much the regression line improves on a model in which all the coefficients are equal to 0: if the model significantly improves the fit, the F-statistic is larger. Before taking the F-value into account, however, one must first look at the P-value that is computed together with the F-statistic. If the P-value is lower than the alpha level, then we can reject the null hypothesis and consider the F-value; otherwise the F-value is not informative.

6 Results and discussion

In this section we report the principal results obtained in building a model that performs well on our benchmarks.

All Neural approaches have been implemented in Python, exploiting the TensorFlow library. Experiments have been run on a cluster with AMD Opteron 2216 dual core CPUs running at 2.4 GHz with 2x1 MB L2 cache and 4 GB of RAM under Cluster Rocks distribution built on top of CentOS 5.3 Linux.

Table 4 reports the RMSE of the experiments run with both the Standard and Ad-hoc networks. We have performed 50 runs of the adaptive procedure devised in Algorithm 2 and reported the minimum, maximum, mean, median, and standard deviation of the RMSE distribution, for each possible instantiation of the pair [network, model].
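The five summary statistics mentioned above can be obtained from a list of per-run RMSE values as in the following sketch. The input values are hypothetical and are not taken from our experiments; the helper name `summarise_runs` and the use of the sample (n − 1) standard deviation are assumptions made for illustration.

```python
import statistics


def summarise_runs(rmse_values):
    """Min, max, mean, sample standard deviation and median of per-run RMSEs."""
    return {
        "min": min(rmse_values),
        "max": max(rmse_values),
        "mean": statistics.fmean(rmse_values),
        "std": statistics.stdev(rmse_values),   # sample std (n - 1), an assumption
        "median": statistics.median(rmse_values),
    }


# Hypothetical RMSE values from five runs (illustrative, not the paper's data).
example_runs = [0.040, 0.042, 0.041, 0.045, 0.043]
summary = summarise_runs(example_runs)
for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```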
| | Standard, Reduced model | Standard, Full model | Ad-hoc, Reduced model | Ad-hoc, Full model |
|---|---|---|---|---|
| Min | 0.070 | 0.071 | 0.040 | 0.040 |
| Max | 0.074 | 0.076 | 0.045 | 0.051 |
| Mean | 0.072 | 0.073 | 0.042 | 0.045 |
| Std | 0.001 | 0.001 | 0.001 | 0.003 |
| Median | 0.072 | 0.073 | 0.042 | 0.046 |

Table 4: RMSE of the experiments run with the Standard and Ad-hoc networks. For each column, statistics over 50 runs of the proposed adaptive procedure are reported.

As a first remark, we can see that the pre-processing operations play a useful role in improving the networks' performance, in both Standard and Ad-hoc networks. Then, we can see that the Ad-hoc networks' error is lower than that of the Standard ones. This confirms the results found by [17] and [21], in which the authors exploit the fact that the Ad-hoc network is able to grasp the temporal dependence of inputs.

We now present the results obtained with Linear Regression and with GLMs (both developed using the scikit-learn library in Python [51]), and then compare them with the Neural Network approach devised in Section 5.1. As a first experiment, we have implemented a regression approach over the whole dataset, and the results are shown in Table 5: in this table we report the $R^2$ relative to experiments performed with Linear Regression and GLMs, along with two variants: the Adjusted R-Squared (which takes into account the number of predictors) and the Predicted R-Squared (which takes into account overfitting). Please notice that for GLM we have used the pseudo R-squared (for sake of definition) and that we also report the P-value of the F-test (i.e., the probability of obtaining an F-statistic value that is greater than the model's F-value, under the null hypothesis that the regression model is not significant: low positive values identify a good fit).
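To make these variants concrete, the sketch below computes the R-squared, the Adjusted R-squared, a Predicted R-squared and the F-statistic for a one-predictor least-squares line. The text does not spell out the formula used for the Predicted R-squared, so the common PRESS-based (leave-one-out) definition is assumed here; the function name `r2_variants` and the data are illustrative. The P-value of the F-test is omitted because it requires the F distribution, which is not part of the Python standard library.

```python
import statistics


def r2_variants(xs, ys):
    """R2, adjusted R2, PRESS-based predicted R2 and F-statistic
    for the one-predictor model Y = a + b*X (so p = 1)."""
    n, p = len(xs), 1
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x

    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalises the number of predictors p.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

    # Leverage in simple regression: h_i = 1/n + (x_i - mean_x)^2 / Sxx.
    leverages = [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]
    # PRESS: sum of squared leave-one-out errors, e_i / (1 - h_i).
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(residuals, leverages))
    pred_r2 = 1.0 - press / ss_tot

    # F-statistic against the intercept-only model.
    f_stat = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))
    return r2, adj_r2, pred_r2, f_stat


# Illustrative data, close to the line y = x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.2, 2.9, 4.1]
r2, adj_r2, pred_r2, f_stat = r2_variants(xs, ys)
print(f"R2={r2:.4f}  adjusted={adj_r2:.4f}  predicted={pred_r2:.4f}  F={f_stat:.1f}")
```

Under this definition the Predicted R-squared never exceeds the R-squared, and a large gap between the two is the symptom of overfitting that the comparison in Table 5 relies on.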
We have performed experiments by using as predictors both the total set of variables identified after the pre-processing phase (identified by the entry Whole set of data in the table), and a limited set of predictors composed of all predictors that are significant for the regression according to their P-value (identified by the entry All observations, limited set of predictors). We see that the reduction of predictors does not improve the goodness of the fit according to the different metrics used, so in what follows we use the whole set of predictors. In this direction, we remark that Generalised Linear Models lead to a better (pseudo) R-squared (0.91 against 0.78), but this comes at the cost of higher overfitting, as witnessed by the lower value of the Predicted R-squared (0.71 against 0.77). Linear Regression, instead, leads to a worse (but still acceptable) R-squared, but its difference from the Predicted R-squared is lower, showing a better robustness of the approach. Please notice that all regressions are significant according to the P-value of the F-test. In what follows we describe experiments performed on different partitions of training/test sets, in order to compare these approaches with our Neural Network approach.

Table 5: Linear Regression and GLMs over the whole set of data: measures of the goodness of the fit.

| Data used | Model | Statistic | Value |
|---|---|---|---|
| All observations, limited set of predictors | Linear Regression | Predicted R-squared | 0.77 |
| | | R-squared | 0.78 |
| | | Adj R-squared | 0.78 |
| Whole set of data | Linear Regression | Predicted R-squared | 0.77 |
| | | R-squared | 0.78 |
| | | Adj R-squared | 0.78 |
| | | P-value of the F-test | 0.0 |
| Whole set of data | GLM | Predicted R-squared | 0.71 |
| | | Pseudo R-squared | 0.91 |
| | | P-value of the F-test | 0.0 |

The previous results have been obtained on the whole set of data.
In order to test the generalization capability of our approach, we have split the whole set of data into 50 different training/testing partitions (i.e., the same partitioning generated in Section 5.1.2), built our model to fit the training set, and assessed the goodness of the fit on the test set. This has been done for all regressions detailed in Table 5, along with the Neural approaches reported in Table 4, and the goodness of fit (assessed by the R-squared) has been reported in Table 6, as computed on the error distributions displayed in Table 4. Looking at the results, we see that the best results on average are offered by the Ad-hoc Neural Networks (mean R-squared 0.84), but the Standard network also performs fairly well (mean 0.81). This is due to the generalisation skill of the network, which is able to prevent the overfitting we have found in the aforementioned regression approaches. This confirms the goodness of our adaptive procedure, which was proposed by [17] in a different context, but which can be tailored to different application scenarios.

Table 6: Experiments with Linear Regression, Generalised Linear Models, and Neural Networks (Standard and Ad-hoc). Statistics of the R-Squared computed on the test sets of 50 different training/testing partitions.

| Model | Statistic | Value |
|---|---|---|
| Linear Regression | min | 0.76 |
| | max | 0.80 |
| | mean | 0.77 |
| | std | < 0.01 |
| GLM | min | 0.69 |
| | max | 0.72 |
| | mean | 0.70 |
| | std | < 0.01 |
| Standard NN | min | 0.71 |
| | max | 0.90 |
| | mean | 0.81 |
| | std | 0.15 |
| Ad-hoc NN | min | 0.64 |
| | max | 0.92 |
| | mean | 0.84 |
| | std | 0.13 |

7 Concluding remarks

The bankruptcy of banks may have catastrophic effects on the overall economy, since the contagion effect it may trigger could lead to a generalised crisis. To this extent, the activity of banking supervision, and the role and the authority of bank regulation, play a big role, since they may prevent (or reduce the effect of) banks' bankruptcy.
Although aimed at different targets, many of these supervision exercises are designed to maintain a sufficient level of capital adequacy: in a way, banks have to allocate specific reserves to face expected losses and to protect themselves from excessive credit expansion. Bank regulations impose constraints on these reserves, even though banks themselves act preventively against unexpected crises, and their reserves are often higher than the ones imposed by the regulations. The CAR is one of the main indicators monitored by the banks themselves and by the supervising authorities in order to assess bank health, and in our contribution we have devised an adaptive Neural Network approach to predict the CAR, and compared the obtained results with standard approaches such as Linear Regression and Generalised Linear Models. Results show that Neural Networks may be successfully used to predict the CAR, and that their outcomes compare favourably with standard methods when used jointly with meaningful pre-processing operations. In future research, modern recurrent neural networks (Long Short-Term Memory or Gated Recurrent Units) and 1D-convolutional Neural Networks could be used to exploit the time dependency of the data.

References

[1] A. Estrella, S. Park, and S. Peristiani. Capital ratios as predictors of bank failure. Economic Policy Review, 6:33–52, 02 2000.
[2] A. Petropoulos, V. Siakoulis, E. Stavroulakis, and N. E. Vlachogiannakis. Predicting bank insolvencies using machine learning techniques. International Journal of Forecasting, 36(3):1092–1113, 2020.
[3] V. V. Acharya, D. Pierret, and S. Steffen. Introducing the "leverage ratio" in assessing the capital adequacy of european banks. ZEW Discussion Und Working Paper, 49(621):460–482, 2016.
[4] T. Adrian, J. Morsink, and L. B. Schumacher. Stress testing at the imf. Technical report, International Monetary Fund, 2020.
[5] Z. Affes and R. Hentati-Kaffel.
Forecast bankruptcy using a blend of clustering and mars model: Case of us banks. Annals of Operations Research, 281(1):27–64, 2019.
[6] N. M. Al-Sabbagh. Determinants of capital adequacy ratio in Jordanian banks. PhD thesis, Yarmouk University, 2004.
[7] A. Alfadli and H. Rjoub. The impacts of bank-specific, industry-specific and macroeconomic variables on commercial bank financial performance: evidence from the gulf cooperation council countries. Applied Economics Letters, 27(15):1284–1288, 2020.
[8] F. Audrino, A. Kostrov, and J.P. Ortega. Predicting u.s. bank failures with midas logit models. Journal of Financial and Quantitative Analysis, 54(6):2575–2603, 2019.
[9] A. Bahrammirzaee. A comparative survey of artificial intelligence applications in finance: artificial neural networks, expert system and hybrid intelligent systems. Neural Computing and Applications, 19(8):1165–1195, 2010.
[10] H. B. Barlow. Unsupervised learning: introduction. In G. E. Hinton and T. J. Sejnowski, editors, Unsupervised Learning: Foundations of Neural Computation, pages 1–17. Bradford Company Scituate, MA, USA, 1999.
[11] J. R. Barth and G. Caprio. Approaches to bank supervision. Political institutions and financial development, page 156, 2008.
[12] L. Bateni, H. Vakilifard, and F. Asghari. The influential factors on capital adequacy ratio in iranian banks. International Journal of Economics and Finance, 6(11):108–116, 2014.
[13] C. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 2005.
[14] K. Bourkhis and M. S. Nabi. Islamic and conventional banks' soundness during the 2007–2008 financial crisis. Review of Financial Economics, 22(2):68–77, 2013.
[15] J. A. Chattha and S. Archer. Solvency stress testing of islamic commercial banks: assessing the stability and resilience. Journal of Islamic Accounting and Business Research, 2016.
[16] M. Čihák. Introduction to applied stress testing. IMF Working Papers, pages 1–74, 2007.
[17] M. Corazza, D. De March, and G. di Tollo. Design of adaptive elman networks for credit risk assessment. Quantitative Finance, 21(2):323–340, 2021.
[18] R. Dakovic, C. Czado, and D. Berg. Bankruptcy prediction in norway: a comparison study. Applied Economics Letters, 17(17):1739–1746, 2010.
[19] G. di Tollo. Reti neurali e rischio di credito: stato dell'arte e analisi sperimentale. Technical Report R-2005-003, Dipartimento di Scienze, Università "G. D'Annunzio" Chieti–Pescara, 2005.
[20] P. Dua and H. Kapur. Macro stress testing and resilience assessment of indian banking. Journal of Policy Modeling, 40(2):452–475, 2018.
[21] E. Angelini, G. di Tollo, and A. Roli. A neural net approach for credit-scoring. Quarterly Review of Economics and Finance, 48:733–755, 2008.
[22] L. M. Fu. Neural Networks in Computer Intelligence. McGraw-Hill, Inc., USA, 1994.
[23] P. Gai, A. Haldane, and S. Kapadia. Complexity, concentration and contagion. Journal of Monetary Economics, 58(5):453–470, 2011.
[24] N. Gambetta, M. A. García-Benau, and A. Zorio-Grima. Stress test impact and bank risk profile: Evidence from macro stress testing in europe. International Review of Economics & Finance, 61:347–354, 2019.
[25] M. G. Gulaliyev, N. P. Ashurbayli-Huseynova, A. A. Gubadova, B. N. Ahmedov, G. M. Mammadova, and R. T. Jafarova. Stability of the banking sector: deriving stability indicators and stress-testing. Polish Journal of Management Studies, 19, 2019.
[26] H. F. Zhang, L. H. Wang, J. P. Yin, P. H. Chen, and H. F. Zhang. Performance of the levenberg–marquardt neural network approach in nuclear mass prediction. Journal of Physics G: Nuclear and Particle Physics, 44(4):045110, mar 2017.
[27] A. Hadjixenophontos and C. Christodoulou-Volos. Financial crisis and capital adequacy ratio: A case study for cypriot commercial banks. Journal of Applied Finance and Banking, 8(3):87–109, 2018.
[28] A. Haldane. Constraining discretion in bank regulation. Central Banking at a Crossroads, page 15, 2013.
[29] M. K. Hassan, O. Unsal, and H. E. Tamer. Risk management and capital adequacy in turkish participation and conventional banks: A comparative stress testing analysis. Borsa Istanbul Review, 16(2):72–81, 2016.
[30] J. B. Heaton, N. G. Polson, and J. H. Witte. Deep learning in finance. arXiv preprint arXiv:1602.06561, 2016.
[31] H. Husna and R. Rahman. Financial distress–detection model for islamic banks. International Journal of Trade, Economics and Finance, pages 158–163, 01 2012.
[32] A. Jamali. Modeling effects of banking regulations and supervisory practices on capital adequacy state transition in developing countries. Journal of Financial Regulation and Compliance, 2019.
[33] K. Halteh, K. Kumar, and A. Gepp. Financial-distress prediction of islamic banks using tree-based stochastic techniques. Managerial Finance, Special Issue in the Role of Islamic Finance in Mainstream Finance, 08 2017.
[34] R. A. A. Karim. The impact of the basle capital adequacy ratio regulation on the financial and marketing strategies of islamic banks. International Journal of Bank Marketing, 1996.
[35] A. Khashman. Neural networks for credit risk evaluation: Investigation of different neural models and learning schemes. Expert Systems with Applications, 37(9):6233–6239, 2010.
[36] S. B. Kotsiantis. Supervised machine learning: A review of classification techniques. Informatica, 31(3):249–268, 2007.
[37] K. Kočišová and M. Mišanková. Discriminant analysis as a tool for forecasting company's financial health. Procedia - Social and Behavioral Sciences, 110:1148–1157, 2014. The 2nd International Scientific Conference "Contemporary Issues in Business, Management and Education 2013".
[38] N. Laila and F. Widihadnanto. Financial distress prediction using bankometer model on islamic and conventional banks: Evidence from indonesia. International Journal of Economics and Management, 11:169–181, 01 2017.
[39] D. Martin. Early warning of bank failure: A logit regression approach. Journal of Banking & Finance, 1(3):249–276, November 1977.
[40] D. Mayes and H. Stremmel. The effectiveness of capital adequacy measures in predicting bank distress. SUERF, 2014/1, 02 2014.
[41] P. McCullagh and J.A. Nelder. Generalized Linear Models, Second Edition. Chapman and Hall/CRC Monographs on Statistics and Applied Probability Series. Chapman & Hall, 1989.
[42] M. Mehreen, Maran M., S. Ariffin A. Karim, and Amin J. Proposing a multidimensional bankruptcy prediction model: An approach for sustainable islamic banking. Sustainability, 12:3226, 04 2020.
[43] E. Montero, M. C. Riff, and B. Neveu. A beginner's guide to tuning methods. Appl. Soft Comput., 17:39–51, April 2014.
[44] G. E. Morgan. On the adequacy of bank capital regulation. Journal of Financial and Quantitative Analysis, 19(2):141–162, 1984.
[45] D. M. Nachane and S. Ghosh. Credit rating and bank behaviour in india: Possible implications of the new basel accord. The Singapore Economic Review, 49(01):37–54, 2004.
[46] A. K. Novokmet and A. Banović. Why do the minimum capital adequacy ratios vary across europe? Journal of Applied Economic Sciences, 11(3):41, 2016.
[47] H. Oloo, M. Wanjiru, and K. Newell-Jones. Female genital mutilation practices in kenya: the role of alternative rites of passage. a case study of kisii and kuria districts. 2011.
[48] P. Gandhi, T. Loughran, and B. McDonald. Using annual report sentiment as a proxy for financial distress in u.s. banks. Journal of Behavioral Finance, 20(4):424–436, 2019.
[49] J. Park, M. Shin, and W. Heo. Estimating the bis capital adequacy ratio for korean banks using machine learning: Predicting by variable selection using random forest algorithms. Risks, 9(2), 2021.
[50] J. Park, M. Shin, and W. Heo. Estimating the bis capital adequacy ratio for korean banks using machine learning: Predicting by variable selection using random forest algorithms. Risks, 9(2):32, 2021.
[51] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[52] A. Petropoulos, V. Siakoulis, K. Panousis, T. Christophides, and S. Chatzis. A Deep Learning Approach for Dynamic Balance Sheet Stress Testing. 09 2020.
[53] V. Ravi and C. Pramodh. Threshold accepting trained principal component neural network and feature subset selection: Application to bankruptcy prediction in banks. Applied Soft Computing, 8(4):1539–1548, 2008. Soft Computing for Dynamic Data Mining.
[54] P. Ravi Kumar and V. Ravi. Bankruptcy prediction in banks and firms via statistical and intelligent techniques – a review. European Journal of Operational Research, 180(1):1–28, 2007.
[55] B. Robitaille, B. Marcos, M. Veillette, and G. Payre. Modified quasi-newton methods for training neural networks. Computers Chemical Engineering, 20(9):1133–1140, 1996.
[56] R. Rojas. The backpropagation algorithm. In Neural networks, pages 149–182. Springer, 1996.
[57] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. The MIT Press, second edition, 2018.
[58] N. Vunjak, N. Milenković, J. Andrašić, and M. Pjanić. Stress test model for measuring the effects of the economic crisis on the capital adequacy ratio.
[59] D. Worrell. Stressing to breaking point: Interpreting stress test results. 2008.
[60] X. Yan and X. G. Su. Linear Regression Analysis: Theory and Computing. World Scientific Publishing Co., Inc., USA, 2009.