Engineering, Technology & Applied Science Research, Vol. 13, No. 4, 2023, 11472-11483 | www.etasr.com | Kassem et al.: Prediction of Solar Irradiation in Africa using Linear-Nonlinear Hybrid Models

Prediction of Solar Irradiation in Africa using Linear-Nonlinear Hybrid Models

Youssef Kassem
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus | Energy, Environment, and Water Research Center, Near East University, Cyprus
yousseuf.kassem@neu.edu.tr (corresponding author)

Huseyin Camur
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
huseyin.camur@neu.edu.tr

Mustapha Tanimu Adamu
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20215363@std.neu.edu.tr

Takudzwa Chikowero
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20215146@std.neu.edu.tr

Terry Apreala
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20224420@std.neu.edu.tr

Received: 18 June 2023 | Revised: 4 July 2023 | Accepted: 5 July 2023
Licensed under a CC-BY 4.0 license | Copyright (c) by the authors | DOI: https://doi.org/10.48084/etasr.6131

ABSTRACT

Solar irradiation prediction, including Global Horizontal Irradiation (GHI) and Direct Normal Irradiation (DNI), is a useful technique for assessing the solar energy potential at specific locations. This study used five Artificial Neural Network (ANN) models and Multiple Linear Regression (MLR) to predict GHI and DNI in Africa. Additionally, a hybrid model combining MLR and ANNs was proposed to predict both GHI and DNI and improve the accuracy of individual ANN models. Solar radiation (GHI and DNI) and global meteorological data from 85 cities with different climatic conditions over Africa during 2001-2020 were used to train and test the models developed.
The Pearson correlation coefficient was used to identify the most influential input variables to predict GHI and DNI. Two scenarios were proposed to achieve the goal, each with different input variables. The first scenario used influential input parameters, while the second incorporated geographical coordinates to assess their impact on solar radiation prediction accuracy. The results revealed that the suggested linear-nonlinear hybrid models outperformed all other models in terms of prediction accuracy. Moreover, the investigation revealed that geographical coordinates have a minimal impact on the prediction of solar radiation.

Keywords-global horizontal irradiation; direct normal irradiation; multiple linear regression; artificial neural networks; hybrid model

I. INTRODUCTION

The exploitation of fossil fuels faces increasing political and environmental challenges [1]. The use of renewable energy is one solution to address these issues and meet the growing global demand for electricity. Renewable energy offers a solution to the pollution and environmental damage caused by fossil and nuclear energy [2]. The potential of renewable energy has inspired numerous researchers to explore clean technologies, intending to generate clean energy and minimize the effects of climate change [3-5]. Among the various renewable resources, solar energy is particularly promising, with applications in electricity generation, as well as air and water heating/cooling [6]. Solar photovoltaic (PV) energy generation uses solar modules that consist of multiple solar cells containing a PV material.

In general, assessing the energy generation potential for various solar technologies is based on two crucial parameters: Global Horizontal Irradiation (GHI) and Direct Normal Irradiation (DNI) [7].
Accurate measurement and prediction of GHI and DNI are crucial in assessing the energy generation potential of various solar technologies. These parameters are key inputs for designing, operating, and optimizing the performance of solar power plants. According to [8], GHI is the total amount of solar radiation received on a horizontal surface, including both direct and diffuse radiation. On the other hand, DNI is the amount of solar radiation received directly from the sun's rays, perpendicular to a surface. This type of radiation is particularly important for Concentrated Solar Power (CSP) plants that use mirrors or lenses to concentrate sunlight onto a receiver to produce high-temperature heat, which is then used to generate electricity [8]. Therefore, accurate and reliable measurements of GHI and DNI are essential for the effective operation and management of solar power plants in the future.

Recently, soft-computing approaches have emerged as particularly effective techniques for modeling global solar radiation in many regions around the world. Soft-computing techniques enable efficient identification of relationships between dependent and independent variables, even for nonlinear natural processes. Various models have been developed, such as Multilayer Feed-Forward Neural Networks, Support Vector Machines, Autoregressive Integrated Moving Averages, etc., that use different meteorological and geographical elements to estimate the total amount of solar radiation in terms of GHI and DNI [9-37]. Based on previous scientific studies [9-37], the most relevant input parameters used to predict solar radiation are average temperature, pressure, relative humidity, wind speed, wind direction, sunshine hours, minimum and maximum temperatures, wet-bulb temperature, atmospheric temperature, cloudiness, and evaporation.
Based on the above, various empirical models are used to estimate the annual amount of GHI and DNI in Africa, which is currently experiencing a major electricity crisis, with approximately 600 million people without access to electricity. Rural areas are particularly affected, with electrification rates as low as 10%. This energy poverty has significant negative impacts on the economy, society, and health, as communities rely on unsafe and inefficient energy sources. However, the abundant sunshine in Africa provides a unique opportunity for the development of solar energy systems, which have the potential to meet the energy needs of millions of people in the region.

In this study, five ANN models (Feed-Forward Neural Network, Cascade Forward Neural Network, Elman Neural Network, Layer Recurrent Neural Network, and NARX Neural Network) and Multiple Linear Regression (MLR) were used to predict solar radiation data. Moreover, this study proposed linear-nonlinear hybrid models that integrate ANNs and MLR for GHI and DNI prediction. GHI, DNI, and global meteorological data from 85 cities in Africa with various climatic conditions were used to train and test the developed models. The Pearson correlation coefficient was used to identify the most influential input variables for predicting GHI and DNI. Two scenarios were used for this purpose: the first one was created using the most influential input parameters, while the second incorporated geographical coordinates (latitude, longitude, and altitude) along with the influential input data to assess the impact of geographical coordinates on the accuracy of solar radiation prediction. Data were obtained from the NASA POWER dataset for the period 2000-2021.

II. MATERIAL AND METHODS

A. Study Area

Africa is a vast continent that spans the equator, and its climate varies greatly depending on the region. The continent includes several climatic zones, including tropical rainforest, savanna, and desert regions.
The latitude and longitude of a region greatly influence its climate, which in turn affects weather patterns. The equator runs through the center of the continent, passing through several countries. The weather in Africa can also be affected by various natural phenomena, such as the El Niño Southern Oscillation (ENSO), which is a climate cycle in the Pacific Ocean that affects global weather patterns. During El Niño, the Pacific Ocean warms, leading to changes in atmospheric pressure and wind patterns that affect rainfall patterns in Africa.

B. Data Used

The NASA POWER (Prediction Of Worldwide Energy Resource) dataset is a comprehensive collection of solar and meteorological data that provides information on parameters crucial for studying and analyzing renewable energy resources and their potential. The dataset covers locations around the world, allowing researchers and analysts to access solar and meteorological data for virtually any location on Earth. The NASA POWER dataset includes a wide range of parameters related to solar radiation and meteorological conditions. These comprise solar radiation data, including GHI, DNI, Diffuse Horizontal Irradiance (DHI), and Clear Sky GHI, as well as meteorological data, including temperature, relative humidity, wind speed, wind direction, precipitation, cloud cover, atmospheric pressure, and more. The dataset offers both hourly and daily temporal resolutions. Hourly data are available for certain parameters, allowing for a more detailed analysis of solar and meteorological conditions throughout the day. Daily data provide aggregated values for each parameter. The spatial resolution of the NASA POWER dataset varies depending on the specific parameter and the data source used. In general, the dataset provides information at a spatial resolution of approximately 1 km. The dataset integrates data from various sources, including satellite observations, ground measurements, and atmospheric models.
NASA incorporates data from multiple sensors and instruments to provide accurate and reliable information. The NASA POWER dataset is freely accessible to the public through the NASA POWER web portal. Therefore, data including GHI, DNI, surface pressure, average, maximum, and minimum temperature, relative humidity, wind speed at 2 m height, average, maximum, and minimum wind speed at 10 m height, wind direction at 10 m height, frost point temperature, wet bulb temperature, cloud amount, and precipitation were collected for all the selected cities in Africa shown in Table I.

TABLE I. INFORMATION REGARDING THE SELECTED LOCATIONS

| Location | Latitude [°N] | Longitude [°E] | Altitude [m] | Location | Latitude [°N] | Longitude [°E] | Altitude [m] |
|---|---|---|---|---|---|---|---|
| Cairo | 30.0 | 31.6 | 350.0 | Cabinda | -5.1 | 12.3 | 103.0 |
| Kinshasa | -4.3 | 15.3 | 277.0 | Fez | 33.8 | -4.9 | 971.0 |
| Vereeniging | -26.6 | 27.9 | 1526.0 | Uyo | 5.0 | 7.9 | 71.0 |
| Giza | 30.0 | 31.2 | 19.0 | Mwanza | -2.5 | 32.7 | 1134.0 |
| Luanda | -9.5 | 13.5 | 201.0 | Lilongwe | -14.0 | 33.7 | 1071.0 |
| Dar es Salaam | -6.8 | 39.3 | 15.0 | Kigali | -1.9 | 30.1 | 1575.0 |
| Khartoum | 15.6 | 32.5 | 387.0 | Bukavu | -2.5 | 28.9 | 1533.0 |
| Johannesburg | -26.2 | 28.1 | 1746.0 | Abomey | 6.4 | 2.3 | 30.0 |
| Abidjan | 5.4 | -4.0 | 105.0 | Nnewi | 6.0 | 7.0 | 163.0 |
| Alexandria | 30.9 | 29.8 | 18.0 | Tripoli | 32.8 | 13.3 | 31.0 |
| Addis Ababa | 9.0 | 38.8 | 2315.0 | Kaduna | 10.4 | 7.9 | 661.0 |
| Nairobi | -1.3 | 36.8 | 1657.0 | Aba | 5.1 | 7.4 | 64.0 |
| Cape Town | -33.3 | 18.4 | 35.0 | Bujumbura | -3.3 | 29.4 | 798.0 |
| Yaoundé | 3.9 | 11.5 | 715.0 | Maputo | -26.0 | 32.6 | 14.0 |
| Kano | 12.0 | 8.5 | 454.0 | Hargeisa | 9.6 | 44.1 | 1267.0 |
| East Rand | -26.4 | 27.4 | 1590.0 | Bobo-Dioulasso | 11.2 | -4.3 | 420.0 |
| Umuahia | 5.5 | 7.5 | 154.0 | Shubra el-Kheima | 30.1 | 31.2 | 28.0 |
| Douala | 36.6 | 4.1 | 614.0 | Ikorodu | 6.6 | 3.5 | 36.0 |
| Casablanca | 33.3 | -8.0 | 189.0 | Asmara | 15.3 | 38.9 | 2342.0 |
| Ibadan | 7.4 | 3.9 | 223.0 | Marrakesh | 31.6 | -8.0 | 468.0 |
| Antananarivo | 19.0 | 46.7 | 1205.0 | Tshikapa | -3.0 | 23.8 | 505.0 |
| Abuja | 9.1 | 7.5 | 473.0 | Ilorin | 8.5 | 4.5 | 318.0 |
| Kampala | 0.3 | 32.6 | 1237.0 | Blantyre | -15.8 | 35.0 | 698.0 |
| Kumasi | 6.7 | -1.6 | 260.0 | Agadir | 30.7 | -9.6 | 454.0 |
| Dakar | 14.7 | -17.3 | 6.0 | Misratah | 32.4 | 15.1 | 9.0 |
| Port Harcourt | 4.8 | 7.0 | 18.0 | Lubumbashi | -11.7 | 27.5 | 1262.0 |
| Durban | -29.9 | 31.0 | 13.0 | Accra | 5.8 | 0.1 | 39.0 |
| Ouagadougou | 12.4 | -1.5 | 299.0 | Brazzaville | -3.0 | 23.8 | 505.0 |
| Lusaka | -15.4 | 29.2 | 1149.0 | Monrovia | 6.3 | -10.8 | 6.0 |
| Algiers | 36.8 | 3.1 | 31.0 | Tunis | 33.8 | 9.4 | 43.0 |
| Bamako | 12.6 | -8.0 | 335.0 | Rabat | 34.0 | -6.8 | 87.0 |
| Omdurman | 15.6 | 32.5 | 391.0 | Lomé | 6.1 | 1.2 | 14.0 |
| Mbuji-Mayi | -6.1 | 23.6 | 678.0 | Benin City | 6.3 | 5.6 | 90.0 |
| Pretoria | -25.7 | 28.2 | 1338.0 | Owerri | 5.5 | 7.0 | 74.0 |
| Kananga | -5.9 | 22.4 | 636.0 | Warri | 5.5 | 5.8 | 5.0 |
| Harare | -17.9 | 31.1 | 1483.0 | Jos | 9.9 | 8.9 | 1182.0 |
| Onitsha | 6.1 | 6.8 | 51.0 | Bangui | 4.4 | 18.6 | 355.0 |
| N'Djamena | 12.1 | 15.1 | 297.0 | Nampula | -15.1 | 39.3 | 430.0 |
| Nouakchott | 18.1 | -16.0 | 8.0 | Oran | 35.6 | -0.7 | 162.0 |
| Mombasa | -4.0 | 39.7 | 10.0 | West Rand | -26.2 | 27.5 | 1589.0 |
| Niamey | 13.5 | 2.1 | 207.0 | Lubango | -14.9 | 13.5 | 1774.0 |
| Pointe-Noire | -4.8 | 11.9 | 16.0 | Gqeberha | -34.0 | 25.6 | 52.0 |

C. Artificial Neural Networks (ANNs)

ANNs are a class of machine learning algorithms inspired by the structure and functioning of biological neural networks, such as the human brain [38]. An ANN consists of interconnected artificial neurons, also called nodes, organized into layers. The three main layer types in an ANN are the input, hidden, and output layers. The connections between neurons in an ANN are represented by weights. During the training process, the weights are adjusted based on a mathematical optimization algorithm, such as gradient descent, to minimize the difference between the predicted and desired outputs. This adjustment is performed through a process called backpropagation, in which the error is propagated backward through the network to update the weights. Activation functions play a crucial role in determining the output of a neuron based on the weighted inputs.
The most common activation functions are the logistic sigmoid (logsig), whose output lies between 0 and 1, and the tangent sigmoid (tansig), whose output lies between -1 and 1. They are defined as [38]:

$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}} \tag{1}$$

$$\mathrm{tansig}(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{2}$$

In addition, a trial-and-error approach is typically employed to determine the optimal number of nodes in the hidden layer. This study used the TRAINLM training function, which updates the weights and biases of neuron connections based on the Levenberg-Marquardt (LM) optimization algorithm. The backpropagation algorithm, a type of gradient descent algorithm, serves as the learning algorithm for this purpose. The training process of an ANN is crucial, involving the adjustment of weights and biases to minimize the disparity between the ANN's output and the desired values. The Mean Squared Error (MSE) is used to optimize the performance of the trained ANN model. It quantifies the average squared difference between the predicted and actual values, serving as a measure to guide the training procedure toward better accuracy. Figure 1 illustrates the schematic representation of the ANN model developed to predict the GHI and DNI.

Fig. 1. Schematic representation of the ANN model used.

1) Feed-Forward Neural Network (FFNN)

FFNN is widely used in various domains to analyze different types of problems in different scenarios [38-40]. The Levenberg-Marquardt algorithm and the backpropagation method are commonly used techniques [40]. The trial-and-error method is used to determine the appropriate number of hidden layers and neurons, and MSE is used to assess the performance of the training algorithm. It is important to note that the data were normalized within the range of 0-1. This study used the backpropagation algorithm for the training process.
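As a minimal sketch of the two transfer functions and of the 0-1 normalization mentioned above, the following Python snippet may help. The function names mirror the MATLAB-style names used in the text, `min_max_normalize` is a hypothetical helper, and the sample temperatures are illustrative values only. This is not the authors' implementation.

```python
import math


def logsig(x: float) -> float:
    """Logistic sigmoid: maps any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig(x: float) -> float:
    """Tangent sigmoid: maps any real input into (-1, 1); equals tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def min_max_normalize(values):
    """Linearly rescale a list of numbers onto [0, 1] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative average temperatures in degrees Celsius (assumed values).
temperatures = [8.4, 23.6, 30.1]
scaled = min_max_normalize(temperatures)
print(scaled)
print(logsig(0.0), tansig(0.0))
```

Scaling the inputs to a common range before training keeps variables with large magnitudes, such as altitude in meters, from dominating variables with small magnitudes, such as wind speed.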
2) Cascade Feed-Forward Neural Network (CFNN)

CFNN is conceptually similar to the FFNN [40-42] and consists of three types of layers: an input layer, one or more hidden layers, and an output layer. The input layer receives weights from the input data [38-40]. Each subsequent layer receives weights from the input layer and all preceding layers [38-40]. Biases are present in all layers, contributing to the network's functionality. The final layer corresponds to the output layer. The configuration of weights and biases is necessary for each layer. During the training phase, MSE is computed to assess the model's performance.

3) Elman Neural Network (ENN)

ENN is a feedback neural network known for its exceptional computational capabilities [39-40], and consists of four layers, namely, the input, hidden, context, and output layers [39-40]. The input layer functions as the signal transmission component, while the output layer has a linear weight effect. The distinguishing feature of ENN compared to backpropagation neural networks is the inclusion of the context layer [39-40].

4) Layer Recurrent Neural Network (LRNN)

LRNN incorporates recurrent connections at the layer level [41]. In traditional RNNs, such as the Elman or Jordan architectures, the recurrent connections are typically at the neuron level. However, in LRNN, recurrent connections are established between entire layers of neurons, and each layer is associated with a recurrent connection that allows information to flow from the previous to the current time step within the same layer [41]. This enables the network to capture and utilize temporal dependencies in sequential data. Recurrent connections in LRNN can improve the model's ability to process and analyze time series or sequential data, making it particularly suitable for tasks such as speech recognition, language modeling, and music generation, where capturing long-term dependencies is crucial.
By incorporating layer-level recurrent connections, LRNN provides an alternative approach to modeling sequential data compared to traditional recurrent architectures. Its unique structure allows the efficient processing of temporal information and can lead to improved performance in tasks that involve sequential data analysis.

5) Nonlinear Autoregressive Network with Exogenous Input (NARX)

NARX combines autoregressive elements with exogenous input to predict future values of a time series [42]. It is designed to capture nonlinear dependencies and patterns in time series data, incorporating both the past values of the target series (autoregressive component) and external factors or inputs that may influence the target series (exogenous component). NARX typically consists of an input layer, one or more hidden layers, and an output layer. The input layer receives both the past values of the target series (autoregressive inputs) and any exogenous inputs that may be available. The hidden layers process the input information and learn to capture the nonlinear relationships and dynamics of the data. Finally, the output layer generates the predicted values of the target series. An important aspect of the NARX model is the use of time-delayed inputs, where past values of the target series and exogenous inputs are fed as input features with a time delay. The model can consider historical information and dependencies between past and future observations by including these time-delayed inputs. Training the NARX model typically involves using optimization algorithms, such as gradient descent, to adjust its weights and biases to minimize prediction errors. The performance of the NARX model can be evaluated using MSE or Root Mean Squared Error (RMSE). NARX has been used in various domains, including finance, economics, weather forecasting, and time series prediction tasks in general.
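The time-delayed inputs described for NARX can be illustrated with a short Python sketch. It only builds the input/output pairs that a NARX-style network would be trained on; it does not train a network. The function name `make_narx_rows`, the single exogenous series, the shared delay for both series, and the sample numbers are all assumptions made for illustration.

```python
def make_narx_rows(target, exogenous, delay):
    """Build (inputs, output) pairs from two aligned time series.

    Each input row holds the `delay` previous target values followed by the
    `delay` previous exogenous values; the output is the current target value.
    """
    if len(target) != len(exogenous):
        raise ValueError("target and exogenous series must have equal length")
    if delay < 1:
        raise ValueError("delay must be at least 1")
    rows = []
    for t in range(delay, len(target)):
        past_target = list(target[t - delay:t])        # autoregressive part
        past_exogenous = list(exogenous[t - delay:t])  # exogenous part
        rows.append((past_target + past_exogenous, target[t]))
    return rows


# Illustrative series (assumed values): a target and one exogenous driver.
irradiation = [5.1, 5.3, 5.0, 5.4, 5.6]
cloud_amount = [60.0, 55.0, 62.0, 50.0, 48.0]
for inputs, output in make_narx_rows(irradiation, cloud_amount, delay=2):
    print(inputs, "->", output)
```

With a delay of 2, the first two time steps serve only as history, so a series of length 5 yields three training pairs.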
The ability of NARX to capture nonlinear relationships and incorporate exogenous factors makes it a powerful tool for modeling and predicting complex time-series data.

D. Multiple Linear Regression (MLR)

MLR is a statistical method to analyze the relationship between a dependent variable and multiple independent variables. It extends the concept of simple linear regression by considering multiple predictors simultaneously. In MLR, the goal is to create a linear equation that best fits the relationship between the dependent and independent variables.

E. Hybrid Modeling (HM)

HM is a valuable technique for capturing different parts of the underlying patterns by combining several models [43]. In this study, an HM was developed by combining the values predicted by MLR with the residuals (errors) estimated by a nonlinear (computational) model. Three steps were taken in the development of HM:

• Step 1: Estimate the target values (GHI and DNI) using the mathematical model (MLR) and determine the residuals.
• Step 2: Pass the residuals through the computational models to capture the nonlinearity of the data.
• Step 3: Combine the outputs of the mathematical and computational models to predict GHI and DNI.

F. Statistical Indices

The performance evaluation of the developed models involves the utilization of several statistical metrics. This study used the Coefficient of Determination (R²), RMSE, and Mean Absolute Error (MAE). Equations (3)-(5) present the mathematical expressions for these metrics.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(a_{a,i} - a_{p,i}\right)^2}{\sum_{i=1}^{n} \left(a_{a,i} - a_{a,ave}\right)^2} \tag{3}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(a_{a,i} - a_{p,i}\right)^2} \tag{4}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left|a_{a,i} - a_{p,i}\right| \tag{5}$$

where $n$ is the number of data points, $a_{a,i}$ is the actual value, $a_{p,i}$ is the predicted value, and $a_{a,ave}$ is the average of the actual values.

TABLE II.
DESCRIPTIVE STATISTICS OF THE USED DATA

| City | DNI (kWh/m²) | Class | GHI (kWh/m²) | Class | City | DNI (kWh/m²) | Class | GHI (kWh/m²) | Class |
|---|---|---|---|---|---|---|---|---|---|
| Aba | 929 | 1 (poor) | 1650 | 4 (good) | Kigali | 1142 | 2 (marginal) | 1804 | 4 (good) |
| Abidjan | 1004 | 2 (marginal) | 1707 | 4 (good) | Kinshasa | 1002 | 2 (marginal) | 1640 | 4 (good) |
| Abomey Calavi | 1013 | 2 (marginal) | 1692 | 4 (good) | Kumasi | 1023 | 2 (marginal) | 1713 | 4 (good) |
| Abuja | 1321 | 3 (fair) | 1918 | 5 (excellent) | Libreville | 847 | 1 (poor) | 1568 | 4 (good) |
| Accra | 1254 | 2 (marginal) | 1849 | 5 (excellent) | Lilongwe | 1718 | 4 (good) | 2010 | 5 (excellent) |
| Addis Ababa | 2111 | 5 (excellent) | 2128 | 5 (excellent) | Lokoja | 1080 | 2 (marginal) | 1762 | 4 (good) |
| Agadir | 2196 | 6 (outstanding) | 2072 | 5 (excellent) | Lomé | 1091 | 2 (marginal) | 1764 | 4 (good) |
| Alexandria | 2111 | 5 (excellent) | 2128 | 5 (excellent) | Luanda | 1215 | 2 (marginal) | 1752 | 4 (good) |
| Algiers | 1756 | 4 (good) | 1749 | 4 (good) | Lubango | 2206 | 6 (outstanding) | 2206 | 6 (outstanding) |
| Antananarivo | 2086 | 5 (excellent) | 2146 | 5 (excellent) | Lubumbashi | 1842 | 5 (excellent) | 2082 | 5 (excellent) |
| Asmara | 2027 | 5 (excellent) | 2236 | 6 (outstanding) | Lusaka | 2401 | 6 (outstanding) | 2223 | 6 (outstanding) |
| Bamako | 1754 | 4 (good) | 2131 | 5 (excellent) | Maiduguri | 1707 | 4 (good) | 2153 | 6 (outstanding) |
| Bangui | 1261 | 3 (fair) | 1860 | 5 (excellent) | Maputo | 1806 | 4 (good) | 1833 | 4 (good) |
| Benguela | 1866 | 5 (excellent) | 2051 | 5 (excellent) | Marrakesh | 2367 | 6 (outstanding) | 2062 | 5 (excellent) |
| Benin City | 914 | 1 (poor) | 1625 | 4 (good) | Mbuji-Mayi | 1330 | 3 (fair) | 1865 | 5 (excellent) |
| Blantyre | 1716 | 4 (good) | 1994 | 5 (excellent) | Misratah | 1777 | 4 (good) | 1863 | 5 (excellent) |
| Bobo-Dioulasso | 1658 | 4 (good) | 2156 | 6 (outstanding) | Mombasa | 1717 | 4 (good) | 2067 | 5 (excellent) |
| Brazzaville | 1023 | 2 (marginal) | 1710 | 4 (good) | Monrovia | 992 | 2 (marginal) | 1665 | 4 (good) |
| Bujumbura | 1184 | 2 (marginal) | 1802 | 4 (good) | Mwanza | 1577 | 4 (good) | 1999 | 5 (excellent) |
| Bukavu | 1075 | 2 (marginal) | 1740 | 4 (good) | Nairobi | 1761 | 4 (good) | 2116 | 5 (excellent) |
| Cabinda | 841 | 1 (poor) | 1507 | 3 (fair) | Nampula | 1745 | 4 (good) | 2053 | 5 (excellent) |
| Cairo | 2084 | 5 (excellent) | 2083 | 5 (excellent) | N'Djamena | 1849 | 5 (excellent) | 2227 | 6 (outstanding) |
| Cape Town | 2516 | 6 (outstanding) | 2025 | 5 (excellent) | Niamey | 1806 | 4 (good) | 2204 | 6 (outstanding) |
| Casablanca | 1897 | 5 (excellent) | 1891 | 5 (excellent) | Nnewi | 899 | 1 (poor) | 1613 | 4 (good) |
| Dakar | 1589 | 4 (good) | 2099 | 5 (excellent) | Nouakchott | 1917 | 5 (excellent) | 2289 | 6 (outstanding) |
| Dar es Salaam | 1750 | 4 (good) | 2062 | 5 (excellent) | Omdurman | 2423 | 6 (outstanding) | 2405 | 6 (outstanding) |
| Douala | 852 | 1 (poor) | 1554 | 4 (good) | Onitsha | 955 | 2 (marginal) | 1665 | 4 (good) |
| Durban | 1679 | 4 (good) | 1706 | 4 (good) | Oran | 1813 | 4 (good) | 1807 | 4 (good) |
| East Rand | 992 | 2 (marginal) | 1016 | 1 (poor) | Ouagadougou | 1690 | 4 (good) | 2119 | 5 (excellent) |
| Enugu | 1019 | 2 (marginal) | 1722 | 4 (good) | Owerri | 929 | 1 (poor) | 1650 | 4 (good) |
| Fez | 2094 | 5 (excellent) | 1956 | 5 (excellent) | Pointe-Noire | 902 | 1 (poor) | 1555 | 4 (good) |
| Giza | 1005 | 2 (marginal) | 1676 | 4 (good) | Port Harcourt | 834 | 1 (poor) | 1504 | 3 (fair) |
| Gqeberha | 2124 | 5 (excellent) | 1827 | 4 (good) | Pretoria | 2323 | 6 (outstanding) | 2070 | 5 (excellent) |
| Harare | 2029 | 5 (excellent) | 2117 | 5 (excellent) | Rabat | 2197 | 6 (outstanding) | 1965 | 5 (excellent) |
| Hargeisa | 2460 | 6 (outstanding) | 2442 | 6 (outstanding) | Shubra el-Kheima | 2084 | 5 (excellent) | 2083 | 5 (excellent) |
| Ibadan | 982 | 2 (marginal) | 1676 | 4 (good) | Tangier | 1935 | 5 (excellent) | 1801 | 4 (good) |
| Ikorodu | 1005 | 2 (marginal) | 1676 | 4 (good) | Tripoli | 1856 | 5 (excellent) | 1970 | 5 (excellent) |
| Ilorin | 1182 | 2 (marginal) | 1824 | 4 (good) | Tshikapa | 1022 | 2 (marginal) | 1681 | 4 (good) |
| Johannesburg | 2224 | 6 (outstanding) | 2030 | 5 (excellent) | Tunis | 2174 | 5 (excellent) | 2038 | 5 (excellent) |
| Jos | 1335 | 3 (fair) | 1930 | 5 (excellent) | Umuahia | 929 | 1 (poor) | 1650 | 4 (good) |
| Kaduna | 1510 | 3 (fair) | 2038 | 5 (excellent) | Uyo | 929 | 1 (poor) | 1650 | 4 (good) |
| Kampala | 1309 | 3 (fair) | 1932 | 5 (excellent) | Vereeniging | 2303 | 6 (outstanding) | 2057 | 5 (excellent) |
| Kananga | 1164 | 2 (marginal) | 1779 | 4 (good) | Warri | 809 | 1 (poor) | 1555 | 4 (good) |
| Kano | 1615 | 4 (good) | 2126 | 5 (excellent) | West Rand | 2303 | 6 (outstanding) | 2057 | 5 (excellent) |
| Khartoum | 2423 | 6 (outstanding) | 2405 | 6 (outstanding) | Yaoundé | 856 | 1 (poor) | 1633 | 4 (good) |
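As a minimal sketch of the hybrid modeling steps and the statistical indices described in the Methods section, the following Python snippet shows how a hybrid prediction can be assembled from a linear estimate plus a modeled residual, and how R², RMSE, and MAE are computed. The function names and all numeric values are illustrative assumptions, not the authors' implementation or data. The residual predictions stand in for the output of any nonlinear model, such as an ANN.

```python
import math


def residuals(actual, linear_pred):
    """Step 1: residuals left over by the linear (MLR) estimate."""
    return [a - p for a, p in zip(actual, linear_pred)]


def combine_hybrid(linear_pred, residual_pred):
    """Step 3: hybrid prediction = linear estimate + modeled residual."""
    return [p + r for p, r in zip(linear_pred, residual_pred)]


def r2_score(actual, predicted):
    """Coefficient of determination; undefined if `actual` is constant."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only (not taken from the study's data).
actual = [5.3, 4.9, 6.1, 5.6]
mlr_pred = [5.0, 5.1, 5.8, 5.7]
# Step 2 would train a nonlinear model on residuals(actual, mlr_pred);
# here its output is simply assumed.
residual_pred = [0.2, -0.1, 0.2, 0.0]
hybrid_pred = combine_hybrid(mlr_pred, residual_pred)

for name, pred in (("MLR", mlr_pred), ("hybrid", hybrid_pred)):
    print(name, r2_score(actual, pred), rmse(actual, pred), mae(actual, pred))
```

Because the residual model only has to explain what the linear model missed, the hybrid prediction can be no worse than the linear one whenever the residual model's output is closer to the true residuals than zero.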
III. RESULTS AND DISCUSSION

A. Solar Energy Characteristics

The classification of the solar energy potential was determined considering the annual GHI and DNI values. The classification of solar resources can be found in [44]. Table II presents the classification of solar resources in each selected city based on GHI and DNI values. Based on the annual value of GHI, most of the selected regions exhibit abundant solar resources and are classified into the good, excellent, and outstanding categories. The exceptions are East Rand, where solar resources are classified as poor (class 1), and Cabinda and Port Harcourt, where they are categorized as fair (class 3). Consequently, the regions in the good to outstanding classes emerge as the most favorable locations for the future installation of PV systems, primarily due to their high GHI values. Based on the annual value of the DNI, solar resources in 14% of the selected regions were classified as outstanding (class 6). These regions are Agadir, Rabat, Lubango, Johannesburg, Vereeniging, West Rand, Pretoria, Marrakesh, Lusaka, Khartoum, Omdurman, Hargeisa, and Cape Town. Furthermore, solar resources in 14% of the selected regions (Warri, Port Harcourt, Cabinda, Libreville, Douala, Yaoundé, Nnewi, Pointe-Noire, Benin City, Aba, Owerri, Umuahia, and Uyo) were classified as poor (class 1). Consequently, based on the high values of GHI and DNI, it can be concluded that most of the selected locations are well-suited for the installation of both large- and small-scale PV systems. Moreover, these regions are also highly suitable for implementing flat-plate PV systems and CSP systems.

B. Selecting Relevant Parameters

The evaluation of the solar potential of a specific location is a crucial initial step in the effective planning of solar energy systems.
Additionally, the prediction of solar radiation is influenced by various meteorological and geographical variables, making the identification of appropriate factors for accurate solar radiation prediction a significant area of research. According to [45], accurate information about the specific amount of solar energy available at a particular geographical location during a given period is essential and plays a vital role in the design process of PV systems. Moreover, meteorological parameters play a pivotal role in influencing the amount of solar radiation [46-47]. Furthermore, the orientation angles of a PV system have a significant impact on its performance [48-49].

TABLE III. PEARSON CORRELATION MATRIX FOR INPUT AND OUTPUT PARAMETERS

The correlations among the input parameters are the same in the GHI and DNI panels of the table, so they are listed once. The last two rows give the correlation of each input with GHI and with DNI.

| | Sl | Az | SP | Tav | RH | WS-2 | WD | WS | FPT | WPT | Tmax | Tmin | CA | WSmax | WSmin | PC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sl | 1 |
| Az | -0.1 | 1 |
| SP | -0.13 | 0.49 | 1 |
| Tav | -0.26 | 0.457 | 0.364 | 1 |
| RH | -0.27 | 0.076 | 0.315 | -0.01 | 1 |
| WS-2 | 0.161 | -0.15 | 0.219 | -0.29 | -0.26 | 1 |
| WD | 0.023 | 0.24 | 0.312 | -0.12 | 0.091 | 0.116 | 1 |
| WS | 0.185 | -0.19 | 0.171 | -0.33 | -0.33 | 0.988 | 0.084 | 1 |
| FPT | -0.36 | 0.283 | 0.451 | 0.514 | 0.845 | -0.36 | 0.018 | -0.44 | 1 |
| WPT | -0.37 | 0.398 | 0.476 | 0.797 | 0.589 | -0.38 | -0.04 | -0.45 | 0.928 | 1 |
| Tmax | 0.225 | 0.244 | 0.062 | 0.218 | -0.76 | 0.034 | 0.032 | 0.09 | -0.53 | -0.28 | 1 |
| Tmin | -0.38 | 0.277 | 0.396 | 0.732 | 0.557 | -0.23 | -0.1 | -0.32 | 0.869 | 0.93 | -0.38 | 1 |
| CA | -0.24 | 0.089 | 0.014 | 0.278 | 0.677 | -0.63 | -0.19 | -0.67 | 0.701 | 0.614 | -0.56 | 0.587 | 1 |
| WSmax | 0.287 | -0.22 | 0.038 | -0.57 | -0.37 | 0.822 | 0.079 | 0.852 | -0.6 | -0.67 | 0.178 | -0.6 | -0.67 | 1 |
| WSmin | -0.03 | -0.09 | 0.158 | 0.011 | -0.02 | 0.377 | 0.043 | 0.374 | 0 | 0.005 | -0.04 | 0.069 | -0.19 | 0.173 | 1 |
| PC | -0.27 | 0.104 | 0.03 | 0.266 | 0.642 | -0.58 | -0.15 | -0.61 | 0.661 | 0.581 | -0.48 | 0.502 | 0.732 | -0.601 | -0.159 | 1 |
| GHI | 0.162 | -0.11 | -0.36 | 0.013 | -0.79 | 0.402 | -0.08 | 0.444 | -0.64 | -0.45 | 0.557 | -0.38 | -0.76 | 0.385 | 0.123 | -0.56 |
| DNI | 0.289 | -0.29 | -0.27 | -0.42 | -0.72 | 0.586 | 0.083 | 0.637 | -0.81 | -0.76 | 0.477 | -0.69 | -0.92 | 0.7 | 0.129 | -0.69 |

This study used Pearson's correlation to identify the most influential input among potential variables. Based on the absolute value of the Pearson coefficient, the strength of the relationships can be categorized as follows [50]: 0.00-0.25 indicates a very weak relationship, 0.26-0.49 represents a weak relationship, 0.50-0.69 corresponds to a moderate relationship, 0.70-0.89 signifies a strong relationship, and 0.90-1.0 denotes a very strong relationship. Table III lists the Pearson correlation matrix depicting the relationships between the potential input parameters and the outputs (GHI and DNI). The matrix provides an overview of the correlation coefficients between these variables.

This study investigated the influence of geographical coordinates on the accuracy of the prediction of GHI and DNI. To achieve this objective, the proposed models were implemented and evaluated in two different scenarios, as shown in Figure 2. Different empirical models were used to predict the annual value of GHI and DNI. In general, data partitioning can influence model performance [51].
Moreover, in [51], it was concluded that empirical models achieve optimal performance when approximately 70-80% of the data are allocated for training and the remaining 20-30% are set aside for testing purposes. Consequently, the data were divided into training and testing sets using an arbitrary approach, with 80% of the total data assigned to the training set and the remaining 20% designated for the testing set. Table IV displays the descriptive statistics for the selected data. C. Results of ANN models An iterative algorithm was used to find the best neural network model and determine the optimal combination of input variables and hidden layer neurons. The study considered a range of 1-10 hidden layers and 10000-1000000 trial iterations. The Levenberg-Marquardt training algorithm was selected for its speed and reliability. Each network was trained multiple times to prevent inaccurate estimates. The model with the lowest MSE was chosen as the best-trained model. Table V presents the optimal network structure and activation function. The performance evaluation of the developed models was performed using R2, RMSE, and MAE. Table VI presents the values of these statistical indexes for all proposed ANN models, and the following can be concluded:  For GHI prediction, it is noticed that the FFNN model had the highest R2 value compared to other models. On the other hand, the LRNN model had the lowest RMSE and MAE values, indicating superior performance compared to the other models. As shown in Table VI, the accuracy of GHI prediction was reduced when the geographical coordinates Lat, Long, and Alt were used as input variables for the models.  For DNI prediction, it was found that the FFNN model exhibited the highest R2 value among all models, indicating its superiority. Besides, the ENN model demonstrated the lowest RMSE and MAE values, suggesting superior performance compared to the others. 
The results showed that the accuracy of the DNI prediction increased when geographic coordinates were used as input variables.

TABLE IV. DESCRIPTIVE STATISTICS OF THE USED DATA

| Data | Variable | Unit | Mean | SD | Min. | Max. |
|---|---|---|---|---|---|---|
| Training | Lat. | ° | 3.6 | 17.0 | -34.0 | 36.8 |
| | Long | ° | 16.4 | 15.8 | -17.3 | 46.7 |
| | Alt. | m | 555.7 | 597.6 | 6.0 | 2342.0 |
| | Sl. | ° | 17.7 | 15.7 | -1.0 | 90.0 |
| | Az. | ° | -54.4 | 81.8 | -180.0 | 47.0 |
| | SP | kPa | 96.1 | 5.6 | 81.4 | 101.7 |
| | Tav | ℃ | 23.6 | 3.6 | 8.4 | 30.1 |
| | RH | % | 68.8 | 15.2 | 23.8 | 89.2 |
| | WS-2m | m/s | 2.4 | 1.1 | 0.5 | 6.1 |
| | WD | ° | 195.2 | 94.2 | 0.4 | 359.5 |
| | WS | m/s | 3.4 | 1.2 | 1.0 | 7.2 |
| | FPT | ℃ | 16.0 | 5.6 | 4.1 | 24.3 |
| | WBT | ℃ | 19.8 | 3.9 | 7.2 | 25.7 |
| | Tmax | ℃ | 36.5 | 4.9 | 25.3 | 46.9 |
| | Tmin | ℃ | 11.9 | 5.9 | -8.7 | 23.5 |
| | CA | % | 54.4 | 15.9 | 11.4 | 86.9 |
| | WSmax | m/s | 10.1 | 3.6 | 2.8 | 22.9 |
| | WSmin | m/s | 0.1 | 0.1 | 0.0 | 1.4 |
| | PC | mm/day | 2.8 | 1.9 | 0.0 | 20.3 |
| | DNI | kWh/m²/day | 4.3 | 1.4 | 2.0 | 7.7 |
| | GHI | kWh/m²/day | 5.3 | 0.7 | 2.6 | 6.8 |
| Testing | Lat. | ° | 8.8 | 20.9 | -26.6 | 35.6 |
| | Long | ° | 11.4 | 11.6 | -6.8 | 31.2 |
| | Alt. | m | 403.9 | 528.1 | 5.0 | 1589.0 |
| | Sl. | ° | 19.8 | 11.4 | 0.0 | 34.0 |
| | Az. | ° | -27.2 | 67.2 | -179.0 | 18.0 |
| | SP | kPa | 97.2 | 5.2 | 84.7 | 101.6 |
| | Tav | ℃ | 22.4 | 3.4 | 15.8 | 28.6 |
| | RH | % | 70.8 | 15.6 | 39.3 | 90.3 |
| | WS-2m | m/s | 2.0 | 1.0 | 0.1 | 4.4 |
| | WD | ° | 228.7 | 101.0 | 0.9 | 360.0 |
| | Wsav | m/s | 3.0 | 1.1 | 0.7 | 5.3 |
| | FPT | ℃ | 15.6 | 6.8 | 4.5 | 24.1 |
| | WPT | ℃ | 19.0 | 4.9 | 10.8 | 25.0 |
| | Tmax | ℃ | 37.0 | 4.8 | 29.1 | 47.8 |
| | Tmin | ℃ | 8.9 | 7.8 | -6.2 | 21.2 |
| | CA | % | 54.3 | 18.9 | 22.8 | 84.6 |
| | WSmax | m/s | 9.9 | 4.4 | 1.7 | 20.9 |
| | WSmin | m/s | 0.1 | 0.1 | 0.0 | 0.5 |
| | PC | mm/day | 3.0 | 2.3 | 0.0 | 11.0 |
| | DNI | kWh/m²/day | 4.2 | 1.7 | 1.8 | 7.1 |
| | GHI | kWh/m²/day | 5.0 | 0.6 | 4.0 | 6.0 |

SD: Standard deviation; Min.: Minimum; Max.: Maximum

TABLE V. BEST NETWORK STRUCTURE BASED ON THE TRAINING SET FOR HORIZONTAL SOLAR RADIATION

| Model | Scenario | Number of hidden layers | Number of neurons | Transfer function |
|---|---|---|---|---|
| FFNN | 1 | 2 | 5 | tansig |
| | 2 | 1 | 15 | tansig |
| ENN | 1 | 2 | 10 | logsig |
| | 2 | 2 | 10 | tansig |
| CFNN | 1 | 1 | 15 | logsig |
| | 2 | 1 | 5 | tansig |
| LRNN | 1 | 2 | 5 | tansig |
| | 2 | 2 | 15 | tansig |
| NARX | 1 | 2 | 15 | logsig |
| | 2 | 1 | 5 | tansig |

Fig. 2.
Model development for predicting GHI and DNI considering the two scenarios.

TABLE VI. STATISTICAL INDEXES AND SCENARIO USED FOR ALL PROPOSED ANN MODELS IN TESTING

| Output | Scenario | Variable | FFNN | LRNN | CFNN | ENN | NARX |
|---|---|---|---|---|---|---|---|
| GHI | SGHI#1 | R² | 0.8287 | 0.5404 | 0.8278 | 0.7986 | 0.7858 |
| | | RMSE | 0.4386 | 0.3976 | 0.4125 | 0.4144 | 0.3537 |
| | | MAE | 0.3826 | 0.3073 | 0.3438 | 0.3276 | 0.2472 |
| | SGHI#2 | R² | 0.8154 | 0.7657 | 0.5637 | 0.8263 | 0.6443 |
| | | RMSE | 0.3226 | 0.4102 | 0.3920 | 0.6104 | 0.4080 |
| | | MAE | 0.2711 | 0.3414 | 0.3283 | 0.5434 | 0.3029 |
| DNI | SDNI#1 | R² | 0.9232 | 0.4533 | 0.9095 | 0.9047 | 0.9105 |
| | | RMSE | 0.5975 | 1.5089 | 0.6266 | 0.5340 | 0.6725 |
| | | MAE | 0.4354 | 1.0055 | 0.4458 | 0.3837 | 0.4945 |
| | SDNI#2 | R² | 0.8862 | 0.4739 | 0.9249 | 0.8987 | 0.8650 |
| | | RMSE | 0.6694 | 1.4296 | 0.7061 | 0.5902 | 0.6720 |
| | | MAE | 0.5243 | 0.9555 | 0.5563 | 0.4505 | 0.5075 |

RMSE and MAE are in kWh/m²

D. Results of MLR

MLR was used to predict the GHI and DNI in Africa. The training data were used to derive mathematical equations, represented by the following equations for the two scenarios:

/01 = 8.562 − 0.035 ∙ �0 : 0.021 ∙ ;<= −0.018 ∙ => ? − 0.014 ∙ A+ : 0.012 ∙ ? − 0.017 ∙ A+ : 0.018 ∙ �� : 0.061 ∙ G$IJK :0.034 ∙ �� : 0.038 ∙ G$IJK :0.06 ∙