Engineering, Technology & Applied Science Research Vol. 13, No. 4, 2023, 11472-11483 11472
www.etasr.com Kassem et al.: Prediction of Solar Irradiation in Africa using Linear-Nonlinear Hybrid Models
Prediction of Solar Irradiation in Africa using
Linear-Nonlinear Hybrid Models
Youssef Kassem
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus | Energy,
Environment, and Water Research Center, Near East University, Cyprus
yousseuf.kassem@neu.edu.tr (corresponding author)
Huseyin Camur
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
huseyin.camur@neu.edu.tr
Mustapha Tanimu Adamu
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20215363@std.neu.edu.tr
Takudzwa Chikowero
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20215146@std.neu.edu.tr
Terry Apreala
Department of Mechanical Engineering, Engineering Faculty, Near East University, Cyprus
20224420@std.neu.edu.tr
Received: 18 June 2023 | Revised: 4 July 2023 | Accepted: 5 July 2023
Licensed under a CC-BY 4.0 license | Copyright (c) by the authors | DOI: https://doi.org/10.48084/etasr.6131
ABSTRACT
Solar irradiation prediction including Global Horizontal Irradiation (GHI) and Direct Normal Irradiation
(DNI) is a useful technique for assessing the solar energy potential at specific locations. This study used five
Artificial Neural Network (ANN) models and Multiple Linear Regression (MLR) to predict GHI and DNI
in Africa. Additionally, a hybrid model combining MLR and ANNs was proposed to predict both GHI and
DNI and improve the accuracy of individual ANN models. Solar radiation (GHI and DNI) and global
meteorological data from 85 cities with different climatic conditions over Africa during 2001-2020 were
used to train and test the models developed. The Pearson correlation coefficient was used to identify the
most influential input variables to predict GHI and DNI. Two scenarios were proposed to achieve the goal,
each with different input variables. The first scenario used influential input parameters, while the second
incorporated geographical coordinates to assess their impact on solar radiation prediction accuracy. The
results revealed that the suggested linear-nonlinear hybrid models outperformed all other models in terms
of prediction accuracy. Moreover, the investigation revealed that geographical coordinates have a minimal
impact on the prediction of solar radiation.
Keywords-global horizontal irradiation; direct normal irradiation; multiple linear regression; artificial neural
networks; hybrid model
I. INTRODUCTION
The exploitation of fossil fuels faces increasing political
and environmental challenges [1]. The use of renewable energy
is one solution to address these issues and meet the growing
global demand for electricity. Renewable energy offers a
solution to the pollution and environmental damage caused by
fossil and nuclear energy [2]. The potential of renewable
energy has inspired numerous researchers to explore clean
technologies, intending to generate clean energy and minimize
the effects of climate change [3-5]. Among the various
renewable resources, solar energy is particularly promising,
with applications in electricity generation, as well as air and
water heating/cooling [6]. Solar photovoltaic (PV) energy
generation uses solar modules that consist of multiple solar
cells containing a PV material.
In general, assessing the energy generation potential for
various solar technologies is based on two crucial parameters:
Global Horizontal Irradiation (GHI) and Direct Normal
Irradiation (DNI) [7]. Accurate measurement and prediction of
GHI and DNI are crucial in assessing the energy generation
potential of various solar technologies. These parameters are
key inputs for designing, operating, and optimizing the
performance of solar power plants. According to [8], GHI is the
total amount of solar radiation received on a horizontal surface,
including both direct and diffuse radiation. On the other hand,
DNI is the amount of solar radiation received directly from the
sun's rays, perpendicular to a surface. This type of radiation is
particularly important for Concentrated Solar Power (CSP)
plants that use mirrors or lenses to concentrate sunlight onto a
receiver to produce high-temperature heat, which is then used
to generate electricity [8]. Therefore, accurate and reliable
measurements of GHI and DNI are essential for the effective
operation and management of solar power plants in the future.
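The two quantities are linked through the diffuse component. As a standard closure relation (not stated in the source; DHI denotes the diffuse horizontal component and \theta_z the solar zenith angle, both symbols introduced here for illustration), the instantaneous horizontal total can be written as:

```latex
\mathrm{GHI} = \mathrm{DNI}\,\cos\theta_z + \mathrm{DHI}
```

The relation holds instant by instant; annual totals such as those used in this study follow from integrating each term over time.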
Recently, soft-computing approaches have emerged as
particularly effective techniques for modeling global solar
radiation in many regions around the world. Soft-computing
techniques enable efficient identification of relationships
between dependent and independent variables, even for nonlinear
natural processes. Various models have been
developed, such as Multilayer Feed-Forward Neural Networks,
Support Vector Machines, Autoregressive Integrated Moving
Averages, etc., that use different meteorological and
geographical elements to estimate the total amount of solar
radiation in terms of GHI and DNI [9-37]. Based on previous
scientific studies [9-37], the most relevant input parameters
used to predict solar radiation are average temperature,
pressure, relative humidity, wind speed, wind direction,
sunshine hours, minimum and maximum temperatures, wet-
bulb temperature, atmospheric temperature, cloudiness, and
evaporation.
Based on the above, various empirical models are used to
estimate the annual amount of GHI and DNI in Africa, which is
currently experiencing a major electricity crisis, with
approximately 600 million people without access to electricity.
Rural areas are particularly affected, with electrification rates
as low as 10%. This energy poverty has significant negative
impacts on the economy, society, and health, as communities
rely on unsafe and inefficient energy sources. However, the
abundant sunshine in Africa provides a unique opportunity for
the development of solar energy systems, which have the
potential to meet the energy needs of millions of people in the
region. In this study, five ANN models (feed-forward neural
network, cascade forward neural network, Elman neural
network, layer recurrent neural network, and NARX neural
network) and Multiple Linear Regression (MLR) were used to
predict solar radiation data. Moreover, this study proposed
linear-nonlinear hybrid models that integrate ANNs and MLR
for GHI and DNI prediction. GHI, DNI, and global
meteorological data from 85 cities in Africa with various
climatic conditions were used to train and test the developed
models. The Pearson correlation coefficient was used to
identify the most influential input variables for predicting GHI
and DNI. Two scenarios were used for this purpose: the first
one was created using the most influential input parameters,
while the second incorporated geographical coordinates
(latitude, longitude, and altitude) along with the influential
input data to assess the impact of geographical coordinates on
the accuracy of solar radiation prediction. Data were obtained
from the NASA POWER dataset for the period 2000-2021.
II. MATERIAL AND METHODS
A. Study Area
Africa is a vast continent that spans the equator, and its
climate varies greatly depending on the region. The continent
includes several climatic zones, including tropical rainforest,
savanna, and desert regions. The latitude and longitude of a
region greatly influence its climate, which in turn affects
weather patterns. The equator runs through the center of the
continent, passing through several countries. The weather in Africa can
also be affected by various natural phenomena, such as the El
Nino Southern Oscillation (ENSO), which is a climate cycle in
the Pacific Ocean that affects global weather patterns. During
El Nino, the Pacific Ocean warms, leading to changes in
atmospheric pressure and wind patterns that affect rainfall
patterns in Africa.
B. Data Used
The NASA POWER (Prediction Of Worldwide Energy
Resource) dataset is a comprehensive collection of solar and
meteorological data that provides information on various
parameters crucial for studying and analyzing
renewable energy resources and their potential. The dataset
covers locations around the world, allowing researchers and
analysts to access solar and meteorological data for virtually
any location on Earth. The NASA POWER dataset includes a
wide range of parameters related to solar radiation and
meteorological conditions. These data include solar radiation
including GHI, DNI, Diffuse Horizontal Irradiance (DHI), and
Clear Sky GHI, as well as meteorological data including
temperature, relative humidity, wind speed, wind direction,
precipitation, cloud cover, atmospheric pressure, and more. The
dataset offers both hourly and daily temporal resolutions.
Hourly data are available for certain parameters, allowing for a
more detailed analysis of solar and meteorological conditions
throughout the day. Daily data provide aggregated values for
each parameter. The spatial resolution of the NASA POWER
dataset varies depending on the specific parameter and the data
source used. In general, the dataset provides information at a
spatial resolution of approximately 1 km. The dataset integrates
data from various sources, including satellite observations,
ground measurements, and atmospheric models. NASA
incorporates data from multiple sensors and instruments to
provide accurate and reliable information. The NASA POWER
dataset is freely accessible to the public through the NASA
POWER web portal. Therefore, data including GHI, DNI,
surface pressure, average, maximum, and minimum
temperature, relative humidity, wind speed at 2 m height,
average, maximum, and minimum wind speed at 10 m height,
wind direction at 10 m height, frost point temperature, wet bulb
temperature, cloud amount, and precipitation were collected for
all the selected cities in Africa shown in Table I.
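As an illustration of how such data could be requested, the sketch below only builds a request URL and performs no network access. The endpoint and the parameter codes (`ALLSKY_SFC_SW_DWN`, `ALLSKY_SFC_SW_DNI`, `T2M`, `RH2M`) are assumptions based on the public NASA POWER daily point API and should be checked against its current documentation; the source does not state how the data were downloaded, and `build_power_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public NASA POWER daily point API (verify before use).
POWER_DAILY_URL = "https://power.larc.nasa.gov/api/temporal/daily/point"


def build_power_url(lat, lon, start, end, parameters):
    """Return a request URL for daily point data; no request is sent."""
    query = {
        "parameters": ",".join(parameters),
        "community": "RE",   # assumed code for the renewable-energy community
        "latitude": lat,
        "longitude": lon,
        "start": start,      # YYYYMMDD
        "end": end,          # YYYYMMDD
        "format": "JSON",
    }
    return POWER_DAILY_URL + "?" + urlencode(query)


# Example: Cairo coordinates from Table I; parameter codes are assumptions.
url = build_power_url(
    30.0, 31.6, "20010101", "20201231",
    ["ALLSKY_SFC_SW_DWN", "ALLSKY_SFC_SW_DNI", "T2M", "RH2M"],
)
```

Keeping URL construction separate from the download step makes it easy to loop over the locations of Table I and to inspect each request before sending it.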
TABLE I. INFORMATION REGARDING THE SELECTED LOCATIONS
Location Latitude [N°] Longitude [E°] Altitude [m] Location Latitude [N°] Longitude [E°] Altitude [m]
Cairo 30.0 31.6 350.0 Cabinda -5.1 12.3 103.0
Kinshasa -4.3 15.3 277.0 Fez 33.8 -4.9 971.0
Vereeniging -26.6 27.9 1526.0 Uyo 5.0 7.9 71.0
Giza 30.0 31.2 19.0 Mwanza -2.5 32.7 1134.0
Luanda -9.5 13.5 201.0 Lilongwe -14.0 33.7 1071.0
Dar es Salaam -6.8 39.3 15.0 Kigali -1.9 30.1 1575.0
Khartoum 15.6 32.5 387.0 Bukavu -2.5 28.9 1533.0
Johannesburg -26.2 28.1 1746.0 Abomey 6.4 2.3 30.0
Abidjan 5.4 -4.0 105.0 Nnewi 6.0 7.0 163.0
Alexandria 30.9 29.8 18.0 Tripoli 32.8 13.3 31.0
Addis Ababa 9.0 38.8 2315.0 Kaduna 10.4 7.9 661.0
Nairobi -1.3 36.8 1657.0 Aba 5.1 7.4 64.0
Cape Town -33.3 18.4 35.0 Bujumbura -3.3 29.4 798.0
Yaoundé 3.9 11.5 715.0 Maputo -26.0 32.6 14.0
Kano 12.0 8.5 454.0 Hargeisa 9.6 44.1 1267.0
East Rand -26.4 27.4 1590.0 Bobo Dioulasso 11.2 -4.3 420.0
Umuahia 5.5 7.5 154.0 Shubra el-Kheima 30.1 31.2 28.0
Douala 36.6 4.1 614.0 Ikorodu 6.6 3.5 36.0
Casablanca 33.3 -8.0 189.0 Asmara 15.3 38.9 2342.0
Ibadan 7.4 3.9 223.0 Marrakesh 31.6 -8.0 468.0
Antananarivo 19.0 46.7 1205.0 Tshikapa -3.0 23.8 505.0
Abuja 9.1 7.5 473.0 Ilorin 8.5 4.5 318.0
Kampala 0.3 32.6 1237.0 Blantyre -15.8 35.0 698.0
Kumasi 6.7 -1.6 260.0 Agadir 30.7 -9.6 454.0
Dakar 14.7 -17.3 6.0 Misratah 32.4 15.1 9.0
Port Harcourt 4.8 7.0 18.0 Lubumbashi -11.7 27.5 1262.0
Durban -29.9 31.0 13.0 Accra 5.8 0.1 39.0
Ouagadougou 12.4 -1.5 299.0 Brazzaville -3.0 23.8 505.0
Lusaka -15.4 29.2 1149.0 Monrovia 6.3 -10.8 6.0
Algiers 36.8 3.1 31.0 Tunis 33.8 9.4 43.0
Bamako 12.6 -8.0 335.0 Rabat 34.0 -6.8 87.0
Omdurman 15.6 32.5 391.0 Lomé 6.1 1.2 14.0
Mbuji-Mayi -6.1 23.6 678.0 Benin City 6.3 5.6 90.0
Pretoria -25.7 28.2 1338.0 Owerri 5.5 7.0 74.0
Kananga -5.9 22.4 636.0 Warri 5.5 5.8 5.0
Harare -17.9 31.1 1483.0 Jos 9.9 8.9 1182.0
Onitsha 6.1 6.8 51.0 Bangui 4.4 18.6 355.0
N'Djamena 12.1 15.1 297.0 Nampula -15.1 39.3 430.0
Nouakchott 18.1 -16.0 8.0 Oran 35.6 -0.7 162.0
Mombasa -4.0 39.7 10.0 West Rand -26.2 27.5 1589.0
Niamey 13.5 2.1 207.0 Lubango -14.9 13.5 1774.0
Pointe-Noire -4.8 11.9 16.0 Gqeberha -34.0 25.6 52.0
C. Artificial Neural Networks (ANNs)
ANNs are a class of machine learning algorithms inspired
by the structure and functioning of biological neural networks,
such as the human brain [38]. An ANN consists of
interconnected units, called artificial neurons or nodes,
organized into layers. The three main layer types in an ANN are the
input, hidden, and output layers. The connections between neurons in
an ANN are represented by weights. During the training
process, the weights are adjusted based on a mathematical
optimization algorithm, such as gradient descent, to minimize
the difference between the predicted and desired outputs. This
adjustment is performed through a process called
backpropagation, in which the error is propagated backward
through the network to update the weights.
Activation functions play a crucial role in determining the
output of a neuron based on the weighted inputs. The most
common activation functions are logistic-sigmoid (logsig) and
tangent-sigmoid (tansig), whose outputs lie in (0, 1) and
(-1, 1), respectively, and which are defined as [38]:
$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}} \tag{1}$$
$$\mathrm{tansig}(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{2}$$
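A minimal sketch of the two activation functions, assuming the standard definitions logsig(x) = 1 / (1 + exp(-x)) and tansig(x) = 2 / (1 + exp(-2x)) - 1; the function names mirror the abbreviations in the text, and the code is illustrative rather than the implementation used in the study.

```python
import math


def logsig(x):
    """Logistic sigmoid: maps any real x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig(x):
    """Tangent sigmoid: maps any real x into (-1, 1); equals tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0
```

For moderate arguments `tansig(x)` agrees with `math.tanh(x)`; for very large negative `x` the exponential overflows, so production code would normally call `math.tanh` directly.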
In addition, a trial-and-error approach is typically employed
to determine the optimal number of nodes in the hidden layer.
This study used the TRAINLM training function, which
updates the weights and biases of neuron connections based on
the Levenberg-Marquardt (LM) optimization algorithm. The
backpropagation algorithm, a type of gradient descent
algorithm, serves as the learning algorithm for this purpose.
The training process of an ANN is crucial, involving the
adjustment of weights and biases to minimize the disparity
between the ANN's output and the desired values. The Mean
Squared Error (MSE), which quantifies the average squared
difference between the predicted and actual values, is used as
the performance measure that guides training toward better accuracy.
Figure 1 illustrates the schematic representation of the ANN
model developed to predict the GHI and DNI.
Fig. 1. Schematic representation of the ANN model used.
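The forward computation and the MSE criterion described above can be sketched as follows. This is a minimal illustration assuming one hidden layer with tansig units and a linear output; the layer sizes, random weights, and inputs are placeholders, not the trained networks of the study.

```python
import numpy as np


def forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh (tansig) units and a linear output layer."""
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2


def mse(y_true, y_pred):
    """Mean squared error between desired and predicted outputs."""
    return float(np.mean((y_true - y_pred) ** 2))


rng = np.random.default_rng(0)
x = rng.random((5, 3))                           # 5 samples, 3 hypothetical inputs
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)    # 4 hidden neurons (arbitrary)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
y_hat = forward(x, W1, b1, W2, b2)
```

Training then amounts to adjusting `W1`, `b1`, `W2`, and `b2` so that `mse` between the targets and `y_hat` decreases.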
1) Feed-Forward Neural Network (FFNN)
FFNN is widely used in various domains to analyze
different types of problems in different scenarios [38-40]. The
Levenberg-Marquardt algorithm and the backpropagation
method are commonly used techniques [40]. The trial and error
method is used to determine the appropriate number of hidden
layers and neurons, and MSE is used to assess the performance
of the training algorithm. It is important to note that the data
were normalized within the range of 0-1. This study used the
backpropagation algorithm for the training process.
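The normalization to the range 0-1 mentioned above is commonly done with min-max scaling. The sketch below assumes that scheme (the source does not state the exact formula); `minmax_scale` and `minmax_inverse` are hypothetical helper names.

```python
def minmax_scale(values):
    """Rescale a sequence linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


ghi = [1504.0, 1650.0, 2442.0]   # three annual GHI values taken from Table II
scaled = minmax_scale(ghi)
restored = minmax_inverse(scaled, min(ghi), max(ghi))
```

The minimum and maximum should be taken from the training set only and reused for the testing set, so that no information from the test data leaks into training.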
2) Cascade Feed-Forward Neural Network (CFNN)
CFNN is conceptually similar to the FFNN [40-42] and
consists of three types of layers: an input layer, one or more
hidden layers, and an output layer. The input layer receives
weights from the input data [38-40]. Each subsequent layer
receives weights from the input layer and all preceding layers
[38-40]. Biases are present in all layers, contributing to the
network's functionality. The final layer corresponds to the
output layer. The configuration of weights and biases is
necessary for each layer. During the training phase, MSE is
computed to assess the model's performance.
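The difference from a plain feed-forward network can be sketched as follows: the output layer receives a direct weighted connection from the inputs in addition to the hidden-layer signal. This is a minimal single-hidden-layer illustration with placeholder sizes and random weights, not the configuration used in the study.

```python
import numpy as np


def cascade_forward(x, W_in_hid, b_hid, W_hid_out, W_in_out, b_out):
    """Single-hidden-layer cascade network with a linear output layer."""
    hidden = np.tanh(x @ W_in_hid + b_hid)
    # Cascade link: the output layer also sees the raw inputs directly.
    return hidden @ W_hid_out + x @ W_in_out + b_out


rng = np.random.default_rng(0)
x = rng.random((4, 3))                                  # 4 samples, 3 inputs
W_in_hid, b_hid = rng.normal(size=(3, 5)), np.zeros(5)
W_hid_out, W_in_out = rng.normal(size=(5, 1)), rng.normal(size=(3, 1))
b_out = np.zeros(1)

y_cascade = cascade_forward(x, W_in_hid, b_hid, W_hid_out, W_in_out, b_out)
# Setting the direct input-to-output weights to zero recovers a plain FFNN.
y_ffnn = cascade_forward(x, W_in_hid, b_hid, W_hid_out, np.zeros((3, 1)), b_out)
```

The two outputs differ exactly by the direct term `x @ W_in_out`, which is what lets the cascade structure represent a linear input-output relationship alongside the nonlinear one.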
3) Elman Neural Network (ENN)
ENN is a feedback neural network known for its
exceptional computational capabilities [39-40], and consists of
four layers, namely, the input, hidden, context, and output
layers [39-40]. The input layer functions as the signal
transmission component, while the output layer has a linear
weight effect. The distinguishing feature of ENN compared to
backpropagation neural networks is the inclusion of the context
layer [39-40].
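The role of the context layer can be sketched as below, assuming the usual Elman formulation in which the context layer holds a copy of the previous hidden state and feeds it back into the hidden layer. Sizes and weights are placeholders for illustration.

```python
import numpy as np


def elman_forward(inputs, W_x, W_c, b_h, W_y, b_y):
    """Run an Elman network over a sequence and return one output per step."""
    context = np.zeros(W_c.shape[0])        # context layer starts at zero
    outputs = []
    for x_t in inputs:
        hidden = np.tanh(x_t @ W_x + context @ W_c + b_h)
        outputs.append(hidden @ W_y + b_y)  # linear output layer
        context = hidden                    # context keeps a copy of the hidden state
    return np.array(outputs)


rng = np.random.default_rng(1)
inputs = rng.random((6, 3))                 # 6 time steps, 3 hypothetical inputs
W_x, W_c = rng.normal(size=(3, 4)), rng.normal(size=(4, 4))
b_h, W_y, b_y = np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1)
out = elman_forward(inputs, W_x, W_c, b_h, W_y, b_y)
```

With `W_c` set to zero the context has no effect and the network reduces to a static feed-forward mapping applied to each step independently.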
4) Layer Recurrent Neural Network (LRNN)
LRNN incorporates recurrent connections at the layer level
[41]. In traditional RNNs, such as the Elman or Jordan
architectures, the recurrent connections are typically at the
neuron level. However, in LRNN, recurrent connections are
established between entire layers of neurons, and each layer is
associated with a recurrent connection that allows information
to flow from the previous to the current time step within the
same layer [41]. This enables the network to capture and utilize
temporal dependencies in sequential data. Recurrent
connections in LRNN can improve the model's ability to
process and analyze time series or sequential data, making it
particularly suitable for tasks such as speech recognition,
language modeling, and music generation, where capturing
long-term dependencies is crucial. By incorporating layer-level
recurrent connections, LRNN provides an alternative approach
to modeling sequential data compared to traditional recurrent
architectures. Its unique structure allows the efficient
processing of temporal information and can lead to improved
performance in tasks that involve sequential data analysis.
5) Nonlinear Autoregressive Network with Exogenous Input
(NARX)
NARX combines autoregressive elements with exogenous
input to predict future values of a time series [42]. It is
designed to capture nonlinear dependencies and patterns in
time series data, incorporating both the past values of the target
series (autoregressive component) and external factors or
inputs that may influence the target series (exogenous
component). NARX typically consists of an input layer, one or
more hidden layers, and an output layer. The input layer
receives both the past values of the target series (autoregressive
inputs) and any exogenous inputs that may be available. The
hidden layers process the input information and learn to capture
the nonlinear relationships and dynamics of the data. Finally,
the output layer generates the predicted values of the target
series. An important aspect of the NARX model is the use of
time-delayed inputs, where past values of the target series and
exogenous inputs are fed as input features with a time delay.
The model can consider historical information and
dependencies between past and future observations by
including these time-delayed inputs. Training the NARX model
typically involves using optimization algorithms, such as
gradient descent, to adjust its weights and biases to minimize
prediction errors. The performance of the NARX model can be
evaluated using MSE or Root Mean Squared Error (RMSE).
NARX has been used in various domains, including finance,
economics, weather forecasting, and time series prediction
tasks in general. Its ability to capture nonlinear relationships
and incorporate exogenous factors makes it a powerful tool for
modeling and predicting complex time-series data.
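The time-delayed inputs described above can be assembled as in the following sketch. It builds the feature rows that a NARX-style model would receive; the helper name `narx_design`, the delay value, and the toy series are assumptions made for illustration.

```python
def narx_design(y, u, delay):
    """Build (features, targets) for a NARX-style model.

    Each feature row holds the last `delay` values of the target series y
    followed by the last `delay` values of the exogenous series u; the
    target is the next value of y.
    """
    if len(y) != len(u):
        raise ValueError("y and u must have equal length")
    rows, targets = [], []
    for t in range(delay, len(y)):
        rows.append(list(y[t - delay:t]) + list(u[t - delay:t]))
        targets.append(y[t])
    return rows, targets


y = [1, 2, 3, 4, 5]            # toy target series
u = [10, 20, 30, 40, 50]       # toy exogenous series
X, target = narx_design(y, u, delay=2)
```

Any nonlinear regressor, such as the hidden layers described above, can then be trained to map each row of `X` to the corresponding entry of `target`.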
D. Multiple Linear Regression (MLR)
MLR is a statistical method to analyze the relationship
between a dependent variable and multiple independent
variables. It extends the concept of simple linear regression by
considering multiple predictors simultaneously. In MLR, the
goal is to create a linear equation that best fits the relationship
between the dependent and independent variables.
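A minimal least-squares sketch of MLR is given below. The data are synthetic and noise-free so that the fitted coefficients can be checked by eye; `fit_mlr` and `predict_mlr` are hypothetical helper names, and the study's own regression setup is not specified in the source.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Evaluate the fitted linear equation b0 + b1*x1 + ... + bk*xk."""
    return coef[0] + X @ coef[1:]


# Synthetic example: y = 1 + 2*x1 - 0.5*x2 (not study data).
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 3.0]])
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]
coef = fit_mlr(X, y)
```

Here `coef` recovers the intercept and the two slopes used to generate `y`, since the data contain no noise.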
E. Hybrid Modeling (HM)
HM is a valuable technique for capturing different parts of
the underlying patterns by combining several models [43]. In
this study, an HM was developed by combining the predicted
values of MLR and estimated residuals (error) by a nonlinear
model (computational models). Three steps were taken in the
development of HM:
Step 1: Estimate GHI and DNI using the mathematical (MLR)
model and determine the residuals.
Step 2: Pass the residuals through the computational (ANN) models to
capture the nonlinearity of the data.
Step 3: Combine the outputs of the mathematical
and computational models to predict GHI and DNI.
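The three steps can be sketched as follows on synthetic data. Note the assumptions: the nonlinear stage is a fixed random tanh layer with a least-squares output, used here only as a lightweight stand-in for the trained ANNs of the study, and all variable names and data are illustrative.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear model with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def hidden_features(X, W, b):
    """Fixed random tanh layer plus a bias column (stand-in for an ANN)."""
    return np.column_stack([np.ones(len(X)), np.tanh(X @ W + b)])


rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))                 # synthetic inputs
y = 1.5 * X[:, 0] - X[:, 1] + np.sin(3 * X[:, 0])     # linear + nonlinear part

# Step 1: linear model and its residuals.
coef = fit_linear(X, y)
y_lin = predict_linear(coef, X)
residual = y - y_lin

# Step 2: nonlinear model fitted to the residuals.
W, b = rng.normal(size=(2, 20)), rng.normal(size=20)
H = hidden_features(X, W, b)
beta, *_ = np.linalg.lstsq(H, residual, rcond=None)

# Step 3: hybrid prediction = linear prediction + predicted residual.
y_hybrid = y_lin + H @ beta

rmse_lin = float(np.sqrt(np.mean((y - y_lin) ** 2)))
rmse_hybrid = float(np.sqrt(np.mean((y - y_hybrid) ** 2)))
```

On the training data the hybrid error cannot exceed the linear error, because the residual model may always fall back to predicting zero; whether the gain carries over to unseen data has to be checked on a separate test set.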
F. Statistical Indices
The performance evaluation of the developed models
involves the utilization of several statistical metrics. This study
used the Coefficient of Determination (R2), RMSE, and Mean
Absolute Error (MAE). Equations (3)-(5) present the
mathematical expressions for these metrics.
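$$R^2 = 1 - \frac{\sum_{i=1}^{n} (a_{a,i} - a_{p,i})^2}{\sum_{i=1}^{n} (a_{a,i} - a_{a,ave})^2} \tag{3}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (a_{a,i} - a_{p,i})^2} \tag{4}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| a_{a,i} - a_{p,i} \right| \tag{5}$$
where a_{a,i} is the actual value, a_{p,i} is the predicted value, a_{a,ave} is the mean of the actual values, and n is the number of samples.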
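A minimal sketch of the three indices, assuming the standard definitions: R2 as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the actual values, RMSE as the root of the mean squared difference, and MAE as the mean absolute difference. The numbers are made up for the example.

```python
import math


def r2_score(actual, predicted):
    """Coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [2.0, 4.0, 6.0, 8.0]        # illustrative values only
predicted = [2.5, 3.5, 6.5, 7.5]
```

RMSE and MAE are expressed in the units of the predicted quantity, whereas R2 is dimensionless, which is why the three are usually reported together.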
TABLE II. DESCRIPTIVE STATISTICS OF THE USED DATA
City DNI (kWh/m2) Class GHI (kWh/m2) Class City DNI (kWh/m2) Class GHI (kWh/m2) Class
Aba 929 1 (poor) 1650 4 (good) Kigali 1142 2 (marginal) 1804 4 (good)
Abidjan 1004 2 (marginal) 1707 4 (good) Kinshasa 1002 2 (marginal) 1640 4 (good)
Abomey Calavi 1013 2 (marginal) 1692 4 (good) Kumasi 1023 2 (marginal) 1713 4 (good)
Abuja 1321 3 (fair) 1918 5 (excellent) Libreville 847 1 (poor) 1568 4 (good)
Accra 1254 2 (marginal) 1849 5 (excellent) Lilongwe 1718 4 (good) 2010 5 (excellent)
Addis Ababa 2111 5 (excellent) 2128 5 (excellent) Lokoja 1080 2 (marginal) 1762 4 (good)
Agadir 2196 6 (outstanding) 2072 5 (excellent) Lome 1091 2 (marginal) 1764 4 (good)
Alexandria 2111 5 (excellent) 2128 5 (excellent) Luanda 1215 2 (marginal) 1752 4 (good)
Algiers 1756 4 (good) 1749 4 (good) Lubango 2206 6 (outstanding) 2206 6 (outstanding)
Antananarivo 2086 5 (excellent) 2146 5 (excellent) Lubumbashi 1842 5 (excellent) 2082 5 (excellent)
Asmara 2027 5 (excellent) 2236 6 (outstanding) Lusaka 2401 6 (outstanding) 2223 6 (outstanding)
Bamako 1754 4 (good) 2131 5 (excellent) Maiduguri 1707 4 (good) 2153 6 (outstanding)
Bangui 1261 3 (fair) 1860 5 (excellent) Maputo 1806 4 (good) 1833 4 (good)
Benguela 1866 5 (excellent) 2051 5 (excellent) Marrakesh 2367 6 (outstanding) 2062 5 (excellent)
Benin City 914 1 (poor) 1625 4 (good) Mbuji-Mayi 1330 3 (fair) 1865 5 (excellent)
Blantyre 1716 4 (good) 1994 5 (excellent) Misratah 1777 4 (good) 1863 5 (excellent)
Bobo Dioulasso 1658 4 (good) 2156 6 (outstanding) Mombasa 1717 4 (good) 2067 5 (excellent)
Brazzaville 1023 2 (marginal) 1710 4 (good) Monrovia 992 2 (marginal) 1665 4 (good)
Bujumbura 1184 2 (marginal) 1802 4 (good) Mwanza 1577 4 (good) 1999 5 (excellent)
Bukavu 1075 2 (marginal) 1740 4 (good) Nairobi 1761 4 (good) 2116 5 (excellent)
Cabinda 841 1 (poor) 1507 3 (fair) Nampula 1745 4 (good) 2053 5 (excellent)
Cairo 2084 5 (excellent) 2083 5 (excellent) Ndjamena 1849 5 (excellent) 2227 6 (outstanding)
Cape Town 2516 6 (outstanding) 2025 5 (excellent) Niamey 1806 4 (good) 2204 6 (outstanding)
Casablanca 1897 5 (excellent) 1891 5 (excellent) Nnewi 899 1 (poor) 1613 4 (good)
Dakar 1589 4 (good) 2099 5 (excellent) Nouakchott 1917 5 (excellent) 2289 6 (outstanding)
Dar es Salaam 1750 4 (good) 2062 5 (excellent) Omdurman 2423 6 (outstanding) 2405 6 (outstanding)
Douala 852 1 (poor) 1554 4 (good) Onitsha 955 2 (marginal) 1665 4 (good)
Durban 1679 4 (good) 1706 4 (good) Oran 1813 4 (good) 1807 4 (good)
East Rand 992 2 (marginal) 1016 1 (poor) Ouagadougou 1690 4 (good) 2119 5 (excellent)
Enugu 1019 2 (marginal) 1722 4 (good) Owerri 929 1 (poor) 1650 4 (good)
Fez 2094 5 (excellent) 1956 5 (excellent) Pointe-Noire 902 1 (poor) 1555 4 (good)
Giza 1005 2 (marginal) 1676 4 (good) Port Harcourt 834 1 (poor) 1504 3 (fair)
Gqeberha 2124 5 (excellent) 1827 4 (good) Pretoria 2323 6 (outstanding) 2070 5 (excellent)
Harare 2029 5 (excellent) 2117 5 (excellent) Rabat 2197 6 (outstanding) 1965 5 (excellent)
Hargeisa 2460 6 (outstanding) 2442 6 (outstanding) Shubra el-Kheima 2084 5 (excellent) 2083 5 (excellent)
Ibadan 982 2 (marginal) 1676 4 (good) Tangier 1935 5 (excellent) 1801 4 (good)
Ikorodu 1005 2 (marginal) 1676 4 (good) Tripoli 1856 5 (excellent) 1970 5 (excellent)
Ilorin 1182 2 (marginal) 1824 4 (good) Tshikapa 1022 2 (marginal) 1681 4 (good)
Johannesburg 2224 6 (outstanding) 2030 5 (excellent) Tunis 2174 5 (excellent) 2038 5 (excellent)
Jos 1335 3 (fair) 1930 5 (excellent) Umuahia 929 1 (poor) 1650 4 (good)
Kaduna 1510 3 (fair) 2038 5 (excellent) Uyo 929 1 (poor) 1650 4 (good)
Kampala 1309 3 (fair) 1932 5 (excellent) Vereeniging 2303 6 (outstanding) 2057 5 (excellent)
Kananga 1164 2 (marginal) 1779 4 (good) Warri 809 1 (poor) 1555 4 (good)
Kano 1615 4 (good) 2126 5 (excellent) West Rand 2303 6 (outstanding) 2057 5 (excellent)
Khartoum 2423 6 (outstanding) 2405 6 (outstanding) Yaounde 856 1 (poor) 1633 4 (good)
III. RESULTS AND DISCUSSION
A. Solar Energy Characteristics
The classification of the solar energy potential was
determined considering the annual GHI and DNI values. The
classification of solar resources can be found in [44]. Table II
presents the classification of solar resources in each selected city
based on GHI and DNI values. Based on the annual value of
GHI, it is observed that most of the selected regions exhibit
abundant solar resources and are classified into the good, excellent,
and outstanding categories. In contrast, solar resources in
East Rand are classified as poor (class 1), and those in
Cabinda and Port Harcourt are
categorized as fair (class 3). Consequently, the regions in the
good, excellent, and outstanding classes emerge as the most
favorable locations for the future installation of PV systems,
primarily due to their high GHI values.
Based on the annual value of the DNI, solar resources in
14% of the selected regions were classified as outstanding
(Class 6). These regions are Agadir, Rabat, Lubango,
Johannesburg, Vereeniging, West Rand, Pretoria, Marrakesh,
Lusaka, Khartoum, Omdurman, Hargeisa, and Cape Town.
Furthermore, solar resources in 14% of the selected regions
(Warri, Port Harcourt, Cabinda, Libreville, Douala, Yaounde,
Nnewi, Pointe-Noire, Benin City, Aba, Owerri, Umuahia, and
Uyo) were classified as poor (Class 1). Consequently, based on
the high values of GHI and DNI, it can be concluded that most
of the selected locations are well-suited for the installation of
both large- and small-scale PV systems. Moreover, these
regions are also highly suitable for implementing flat-plate PV
systems and CSP systems.
B. Selecting Relevant Parameters
The evaluation of the solar potential of a specific location is
a crucial initial step in the effective planning of solar energy
systems. Additionally, the prediction of solar radiation is
influenced by various meteorological and geographical
variables, making the identification of appropriate factors for
accurate solar radiation prediction a significant area of
research. According to [45], accurate information about the
specific amount of solar energy available at a particular
geographical location during a given period is essential and
plays a vital role in the design process of PV systems.
Moreover, meteorological parameters play a pivotal role in
influencing the amount of solar radiation [46-47]. Furthermore,
the orientation angles of a PV system have a significant impact
on its performance [48-49].
TABLE III. PEARSON CORRELATION MATRIX FOR INPUT AND OUTPUT PARAMETERS
GHI
Sl Az SP Tav RH WS-2 WD WS FPT WBT Tmax Tmin CA WSmax WSmin PC GHI
Sl 1
Az -0.1 1
SP -0.13 0.49 1
Tav -0.26 0.457 0.364 1
RH -0.27 0.076 0.315 -0.01 1
WS-2 0.161 -0.15 0.219 -0.29 -0.26 1
WD 0.023 0.24 0.312 -0.12 0.091 0.116 1
WS 0.185 -0.19 0.171 -0.33 -0.33 0.988 0.084 1
FPT -0.36 0.283 0.451 0.514 0.845 -0.36 0.018 -0.44 1
WBT -0.37 0.398 0.476 0.797 0.589 -0.38 -0.04 -0.45 0.928 1
Tmax 0.225 0.244 0.062 0.218 -0.76 0.034 0.032 0.09 -0.53 -0.28 1
Tmin -0.38 0.277 0.396 0.732 0.557 -0.23 -0.1 -0.32 0.869 0.93 -0.38 1
CA -0.24 0.089 0.014 0.278 0.677 -0.63 -0.19 -0.67 0.701 0.614 -0.56 0.587 1
WSmax 0.287 -0.22 0.038 -0.57 -0.37 0.822 0.079 0.852 -0.6 -0.67 0.178 -0.6 -0.67 1
WSmin -0.03 -0.09 0.158 0.011 -0.02 0.377 0.043 0.374 0 0.005 -0.04 0.069 -0.19 0.173 1
PC -0.27 0.104 0.03 0.266 0.642 -0.58 -0.15 -0.61 0.661 0.581 -0.48 0.502 0.732 -0.601 -0.159 1
GHI 0.162 -0.11 -0.36 0.013 -0.79 0.402 -0.08 0.444 -0.64 -0.45 0.557 -0.38 -0.76 0.385 0.123 -0.56 1
DNI
Sl Az SP Tav RH WS-2 WD WS FPT WBT Tmax Tmin CA WSmax WSmin PC DNI
Sl 1
Az -0.1 1
SP -0.13 0.49 1
Tav -0.26 0.457 0.364 1
RH -0.27 0.076 0.315 -0.01 1
WS-2 0.161 -0.15 0.219 -0.29 -0.26 1
WD 0.023 0.24 0.312 -0.12 0.091 0.116 1
WS 0.185 -0.19 0.171 -0.33 -0.33 0.988 0.084 1
FPT -0.36 0.283 0.451 0.514 0.845 -0.36 0.018 -0.44 1
WBT -0.37 0.398 0.476 0.797 0.589 -0.38 -0.04 -0.45 0.928 1
Tmax 0.225 0.244 0.062 0.218 -0.76 0.034 0.032 0.09 -0.53 -0.28 1
Tmin -0.38 0.277 0.396 0.732 0.557 -0.23 -0.1 -0.32 0.869 0.93 -0.38 1
CA -0.24 0.089 0.014 0.278 0.677 -0.63 -0.19 -0.67 0.701 0.614 -0.56 0.587 1
WSmax 0.287 -0.22 0.038 -0.57 -0.37 0.822 0.079 0.852 -0.6 -0.67 0.178 -0.6 -0.67 1
WSmin -0.03 -0.09 0.158 0.011 -0.02 0.377 0.043 0.374 0 0.005 -0.04 0.069 -0.19 0.173 1
PC -0.27 0.104 0.03 0.266 0.642 -0.58 -0.15 -0.61 0.661 0.581 -0.48 0.502 0.732 -0.601 -0.159 1
DNI 0.289 -0.29 -0.27 -0.42 -0.72 0.586 0.083 0.637 -0.81 -0.76 0.477 -0.69 -0.92 0.7 0.129 -0.69 1
This study used Pearson’s correlation to identify the most
influential input among potential variables. Based on the
Pearson coefficient, the strength of the relationships can be
categorized as follows [50]: 0.00–0.25 indicates a very weak
relationship, 0.26–0.49 represents a weak relationship, 0.50–
0.69 corresponds to a moderate relationship, 0.70–0.89
signifies a strong relationship and 0.90–1.0 denotes a very
strong relationship. Table III lists the Pearson correlation
matrix depicting the relationships between the potential input
parameters and the outputs (GHI and DNI). The matrix provides an
overview of the correlation coefficients between these
variables. This study investigated the influence of geographical
coordinates on the accuracy of the prediction of GHI and DNI.
To achieve this objective, the proposed models were
implemented and evaluated in two different scenarios, as
shown in Figure 2. Different empirical models were used to
predict the annual value of GHI and DNI. In general, data
partitioning can influence model performance [51]. Moreover,
in [51], it was concluded that empirical models achieve optimal
performance when approximately 70-80% of the data are
allocated for training and the remaining 20-30% are set aside
for testing purposes. Consequently, the data were divided into
training and testing sets using an arbitrary approach, with 80%
of the total data assigned to the training set and the remaining
20% designated for the testing set. Table IV displays the
descriptive statistics for the selected data.
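The screening and partitioning described above can be sketched as follows. The band labels follow the ranges quoted from [50]; assigning values that fall between two quoted bands (for example 0.255) to the lower band is an assumption, as are the helper names and the use of a seeded random shuffle for the 80/20 split. Whether the study split by city or by individual records is not stated in the source.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def strength(r):
    """Label |r| using the bands quoted from [50]."""
    a = abs(r)
    if a < 0.26:
        return "very weak"
    if a < 0.50:
        return "weak"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "strong"
    return "very strong"


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle reproducibly and cut into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


train, test = train_test_split(range(85))   # 85 items, as in the 85 cities
```

For example, the coefficient of -0.92 between CA and DNI in Table III falls in the "very strong" band, while 0.013 between Tav and GHI is "very weak".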
C. Results of ANN models
An iterative algorithm was used to find the best neural
network model and determine the optimal combination of input
variables and hidden layer neurons. The study considered a
range of 1-10 hidden layers and 10,000-1,000,000 trial
iterations. The Levenberg-Marquardt training algorithm was
selected for its speed and reliability. Each network was trained
multiple times to prevent inaccurate estimates, and the model
with the lowest MSE was chosen as the best-trained model.
Table V presents the optimal network structure and transfer
(activation) function.
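The selection procedure described above (repeated training of each candidate structure, keeping the run with the lowest MSE) can be sketched as follows. This is only an outline under stated assumptions: `select_best`, `GRID`, and `toy_train_and_score` are hypothetical names, the grid values merely echo those appearing in Table V, and the toy scoring function stands in for actual Levenberg-Marquardt training, which is not implemented here.

```python
from itertools import product


def select_best(configs, train_and_score, repeats=3):
    """Return (mse, config, run) with the lowest MSE over all trained runs."""
    best = None
    for config in configs:
        for run in range(repeats):
            mse = train_and_score(config, run)
            if best is None or mse < best[0]:
                best = (mse, config, run)
    return best


# Illustrative grid only: (hidden layers, neurons, transfer function).
GRID = list(product((1, 2), (5, 10, 15), ("tansig", "logsig")))


def toy_train_and_score(config, run):
    """Hypothetical stand-in for training a network and returning its MSE."""
    layers, neurons, transfer = config
    penalty = 0.0 if transfer == "tansig" else 0.0005
    return abs(neurons - 10) * 0.01 + layers * 0.001 + run * 0.0001 + penalty


best_mse, best_config, best_run = select_best(GRID, toy_train_and_score)
```

In practice, `toy_train_and_score` would be replaced by a routine that trains one network with the given structure and returns its MSE; the repeated runs reflect the multiple trainings per network mentioned in the text.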
The developed models were evaluated using R2, RMSE, and
MAE. Table VI presents the values of these statistical indexes
for all proposed ANN models on the testing set, and the
following can be concluded:
For GHI prediction, the FFNN model had the highest R2
value among the models. On the other hand, the LRNN
model had the lowest RMSE and MAE values, indicating
superior performance compared to the other models. As
shown in Table VI, the accuracy of GHI prediction was
reduced when the geographical coordinates (Lat, Long, and
Alt) were used as input variables for the models.
For DNI prediction, the FFNN model exhibited the highest
R2 value among all models, whereas the ENN model
demonstrated the lowest RMSE and MAE values,
suggesting superior performance compared to the others.
The results showed that the accuracy of the DNI prediction
increased when geographic coordinates were used as input
variables.
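For reference, the three indexes used in the comparison above can be computed as in the following sketch. It assumes the standard definitions (R2 as the coefficient of determination, 1 minus the ratio of residual to total sum of squares); the definitions given earlier in the paper take precedence if they differ. The function names and the sample values are illustrative and are not data from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# Made-up values in kWh/m2/day, used only to exercise the functions.
observed = [5.0, 5.5, 6.0, 4.5]
predicted = [5.1, 5.4, 6.2, 4.4]
scores = (r_squared(observed, predicted),
          rmse(observed, predicted),
          mae(observed, predicted))
```

A higher R2 and lower RMSE and MAE indicate a better fit, which is the reading applied to Table VI.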
TABLE IV. DESCRIPTIVE STATISTICS OF THE USED DATA
Data      Variable  Unit        Mean    SD      Min.    Max.
Training  Lat.      °           3.6     17.0    -34.0   36.8
Training  Long      °           16.4    15.8    -17.3   46.7
Training  Alt.      m           555.7   597.6   6.0     2342.0
Training  Sl.       °           17.7    15.7    -1.0    90.0
Training  Az.       °           -54.4   81.8    -180.0  47.0
Training  SP        kPa         96.1    5.6     81.4    101.7
Training  Tav       ℃           23.6    3.6     8.4     30.1
Training  RH        %           68.8    15.2    23.8    89.2
Training  WS-2m     m/s         2.4     1.1     0.5     6.1
Training  WD        °           195.2   94.2    0.4     359.5
Training  WS        m/s         3.4     1.2     1.0     7.2
Training  FPT       ℃           16.0    5.6     4.1     24.3
Training  WBT       ℃           19.8    3.9     7.2     25.7
Training  Tmax      ℃           36.5    4.9     25.3    46.9
Training  Tmin      ℃           11.9    5.9     -8.7    23.5
Training  CA        %           54.4    15.9    11.4    86.9
Training  WSmax     m/s         10.1    3.6     2.8     22.9
Training  WSmin     m/s         0.1     0.1     0.0     1.4
Training  PC        mm/day      2.8     1.9     0.0     20.3
Training  DNI       kWh/m2/day  4.3     1.4     2.0     7.7
Training  GHI       kWh/m2/day  5.3     0.7     2.6     6.8
Testing   Lat.      °           8.8     20.9    -26.6   35.6
Testing   Long      °           11.4    11.6    -6.8    31.2
Testing   Alt.      m           403.9   528.1   5.0     1589.0
Testing   Sl.       °           19.8    11.4    0.0     34.0
Testing   Az.       °           -27.2   67.2    -179.0  18.0
Testing   SP        kPa         97.2    5.2     84.7    101.6
Testing   Tav       ℃           22.4    3.4     15.8    28.6
Testing   RH        %           70.8    15.6    39.3    90.3
Testing   WS-2m     m/s         2.0     1.0     0.1     4.4
Testing   WD        °           228.7   101.0   0.9     360.0
Testing   Wsav      m/s         3.0     1.1     0.7     5.3
Testing   FPT       ℃           15.6    6.8     4.5     24.1
Testing   WBT       ℃           19.0    4.9     10.8    25.0
Testing   Tmax      ℃           37.0    4.8     29.1    47.8
Testing   Tmin      ℃           8.9     7.8     -6.2    21.2
Testing   CA        %           54.3    18.9    22.8    84.6
Testing   WSmax     m/s         9.9     4.4     1.7     20.9
Testing   WSmin     m/s         0.1     0.1     0.0     0.5
Testing   PC        mm/day      3.0     2.3     0.0     11.0
Testing   DNI       kWh/m2/day  4.2     1.7     1.8     7.1
Testing   GHI       kWh/m2/day  5.0     0.6     4.0     6.0
SD: Standard deviation; Min.: Minimum; Max.: Maximum
TABLE V. BEST NETWORK STRUCTURE BASED ON THE TRAINING SET FOR HORIZONTAL SOLAR RADIATION
Model  Scenario  Number of hidden layers  Number of neurons  Transfer function
FFNN   1         2                        5                  tansig
FFNN   2         1                        15                 tansig
ENN    1         2                        10                 logsig
ENN    2         2                        10                 tansig
CFNN   1         1                        15                 logsig
CFNN   2         1                        5                  tansig
LRNN   1         2                        5                  tansig
LRNN   2         2                        15                 tansig
NARX   1         2                        15                 logsig
NARX   2         1                        5                  tansig
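The transfer-function names in Table V match the `tansig` and `logsig` functions of MATLAB's neural network tooling. Assuming that is what is meant, they correspond to the hyperbolic tangent sigmoid and the logistic sigmoid, which can be written in Python as in the sketch below (the Python function names are chosen here only to mirror the table).

```python
import math


def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2n)) - 1, output in (-1, 1).

    Algebraically equal to tanh(n), which is used here for numerical safety.
    """
    return math.tanh(n)


def logsig(n):
    """Logistic sigmoid: 1 / (1 + exp(-n)), output in (0, 1).

    Written directly from the formula; adequate for moderate inputs.
    """
    return 1.0 / (1.0 + math.exp(-n))
```

The main practical difference is the output range: `tansig` is centered on zero, while `logsig` is strictly positive.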
Fig. 2. Model development for predicting GHI and DNI considering the two scenarios.
TABLE VI. STATISTICAL INDEXES AND SCENARIO USED FOR ALL PROPOSED ANN MODELS IN TESTING
Output  Scenario  Variable  FFNN    LRNN    CFNN    ENN     NARX
GHI     SGHI#1    R2        0.8287  0.5404  0.8278  0.7986  0.7858
GHI     SGHI#1    RMSE      0.4386  0.3976  0.4125  0.4144  0.3537
GHI     SGHI#1    MAE       0.3826  0.3073  0.3438  0.3276  0.2472
GHI     SGHI#2    R2        0.8154  0.7657  0.5637  0.8263  0.6443
GHI     SGHI#2    RMSE      0.3226  0.4102  0.3920  0.6104  0.4080
GHI     SGHI#2    MAE       0.2711  0.3414  0.3283  0.5434  0.3029
DNI     SDNI#1    R2        0.9232  0.4533  0.9095  0.9047  0.9105
DNI     SDNI#1    RMSE      0.5975  1.5089  0.6266  0.5340  0.6725
DNI     SDNI#1    MAE       0.4354  1.0055  0.4458  0.3837  0.4945
DNI     SDNI#2    R2        0.8862  0.4739  0.9249  0.8987  0.8650
DNI     SDNI#2    RMSE      0.6694  1.4296  0.7061  0.5902  0.6720
DNI     SDNI#2    MAE       0.5243  0.9555  0.5563  0.4505  0.5075
RMSE and MAE are in kWh/m2
D. Results of MLR
MLR was used to predict GHI and DNI in Africa. The
training data were used to derive the following regression
equations for the two scenarios:
/01 = 8.562 − 0.035 ∙ �0 : 0.021 ∙ ;<=
−0.018 ∙ =>
? − 0.014 ∙ A+ : 0.012 ∙
? − 0.017 ∙ A+ : 0.018 ∙ �� : 0.061 ∙ G$IJK
:0.034 ∙ �� : 0.038 ∙ G$IJK
:0.06 ∙