

 CHEMICAL ENGINEERING TRANSACTIONS  
 

VOL. 83, 2021 

A publication of 

 
The Italian Association 

of Chemical Engineering 
Online at www.cetjournal.it 

Guest Editors: Jeng Shiun Lim, Nor Alafiza Yunus, Jiří Jaromír Klemeš 
Copyright © 2021, AIDIC Servizi S.r.l. 
ISBN 978-88-95608-81-5; ISSN 2283-9216 

Convolution Recurrent Neural Network for Daily Forecast of 
PM10 Concentrations in Brunei Darussalam 

Effa Nabilla Azizᵃ,*, Asem Kasemᵃ, Wida Susanty Haji Suhailiᵃ, Peijiang Zhaoᵇ 
ᵃ School of Computing and Informatics, Universiti Teknologi Brunei, BE1410, Brunei Darussalam 
ᵇ Big Data Analytics Laboratory, National Institute of Information & Communications Technology, 184-8795, Tokyo, Japan 
 p20190005@student.utb.edu.bn  

PM10 is particulate matter with an aerodynamic diameter less than or equal to 10 µm. It is one of the primary 
pollutants contributing to the ambient air quality level. Air quality monitoring in Brunei Darussalam uses only 
PM10 concentrations to determine the nation’s daily Pollutant Standard Index (PSI). This study sheds light 
on the data-centric landscape of air pollution prediction in Brunei Darussalam, highlights potential uses of 
forecasting daily PM10 concentrations, and compares prediction models built using several 
methods, namely: moving average, linear regression, recurrent neural network (RNN), long short-term 
memory (LSTM), LSTM with 1-D convolutions, and convolutional recurrent neural network (CRNN). The study 
uses daily PM10 concentrations obtained from the air quality monitoring stations located in each district of 
Brunei Darussalam over a period of 15 y (2005–2019). The analysis of daily prediction 
performance shows that the CRNN approach provides the most accurate predictions among the compared 
methods. The mean values of RMSE, MAE and SMAPE for the CRNN model are 3.414, 2.293 and 0.125. The 
results from the CRNN model can be used as part of an early-warning application that provides health 
advisories, such as wearing face masks or limiting outdoor activities, or shows safer routes to school or work that 
avoid heavily polluted areas, to mitigate the negative impacts of haze pollution on citizens. Future work would 
include multi-day predictions, the inclusion of air quality data from neighbouring countries in Southeast 
Asia to account for transboundary air pollution, and the inclusion of other pollutant concentrations: carbon 
monoxide (CO), fine particulate matter (PM2.5), nitrogen dioxide (NO2), sulphur dioxide (SO2) and ozone (O3). 

1. Introduction 
Haze pollution has been a recurrent issue in the Southeast Asia region, and its sources may be localized 
or transboundary. Transboundary haze episodes are mostly caused by long-range transport of smoke from 
biomass fires from slash-and-burn activities in Indonesia (Dotse et al., 2017) during dry seasons, with the 
prevailing southern monsoon wind carrying it northwards and affecting several countries in the region, including Brunei 
Darussalam, Malaysia, Singapore, Indonesia, and Southern Thailand. In Brunei Darussalam, most 
localized haze episodes have resulted from peat fires in areas where peatland is being drained or 
deforested. Drainage and deforestation release carbon that has accumulated from dead plant matter and lead to 
peatland fires, which emit smoke haze and greenhouse gases into the atmosphere. Particulate matter 
(PM10) is the only pollutant used in determining the severity of haze and the air quality index in Brunei 
Darussalam.  
Several studies have shown a significant relationship between cardiovascular and respiratory morbidity rates and 
the level of PM10 concentrations. Newell et al. (2017) reported increases in cardiovascular (0.27 %) 
and respiratory (0.56 %) morbidity in China associated with increases in PM10 concentrations. 
During the catastrophic 1997–1998 haze episodes in Brunei Darussalam, PM10 concentrations were linked 
to increased cases of respiratory morbidity, such as asthma, influenza, acute upper respiratory 
infections, pneumonia, bronchitis, emphysema, and conjunctivitis (Anaman and Ibrahim, 2003), and to a drop 
in visibility (Yadav et al., 2003). Brunei also lost 3.75 % of its tourists, with an estimated 
economic loss of BND 1 million (Anaman and Looi, 2000), and Brunei Airport was forced to close due to 
poor visibility, with most flights delayed or cancelled (Limin et al., 2006). Anaman (2001) surveyed several 

 
                                                                                                                                                                 DOI: 10.3303/CET2183060 
 
 
Paper Received: 15/07/2020; Revised: 30/08/2020; Accepted: 01/09/2020 
Please cite this article as: Aziz E.N., Kasem A., Haji Suhaili W.S., Zhao P., 2021, Convolution Recurrent Neural Network for Daily Forecast of 
PM10 Concentrations in Brunei Darussalam, Chemical Engineering Transactions, 83, 355-360  DOI:10.3303/CET2183060 
  



households during the 1998 haze episodes and found that the average household spent about BND 15 
on face masks to reduce the negative effects of the haze; for the 65,000 households in Brunei 
Darussalam, this implies an estimated cost of about BND 1 million (15 × 65,000 = 975,000) for daily usage of face masks alone.  
Prediction of PM10 concentrations has been widely studied because of the potential negative impacts of PM10 on human 
well-being, the economy, and biodiversity, and its important role in climate change (IPCC, 2013). This has led 
government agencies and stakeholders in Brunei Darussalam to monitor PM10 concentrations closely and to focus 
research on PM10 prediction. Many studies of particulate matter have used mathematical and 
statistical models (Donkelaar et al., 2010) to make long-term predictions using climate models or satellite 
remote sensing. Several machine learning methods (Saeed et al., 2017) have been used to improve the accuracy of 
short-term prediction by including meteorological data, which can influence the predicted values. 
Another way to predict particulate matter concentrations is to use deep learning techniques; for 
example, a hybrid of convolutional neural network and long short-term memory methods (Yang et al., 2020) was 
used to predict hourly particulate matter concentrations in Seoul, South Korea. To date, prediction 
of daily PM10 concentrations in Brunei has been studied using a hybrid framework of genetic algorithm, random forests 
and back-propagation neural networks (Dotse et al., 2017) with 5 y of air quality data (2009–2013).  
The literature review above identifies several research gaps. Firstly, PM10 prediction 
using a hybrid of convolutional and recurrent neural networks, and a comparative evaluation of 
several neural network-based methods, have not been considered in the literature 
on the data-centric landscape of air pollution prediction in Brunei Darussalam. Secondly, prediction methods have not 
been applied to recent PM10 data from Brunei; recent data are essential 
in forecasting studies because they may improve the accuracy and applicability of the prediction models. 
Lastly, there are few studies on the negative impacts of recent haze episodes in Brunei, as 
most studies are based on the catastrophic 1997–1998 haze episodes (Anaman and Ibrahim, 2003). 
In summary, the main gap in the literature is a thorough analysis of PM10 concentration 
prediction using recent air quality data from Brunei that compares model performance while considering the 
spatial and temporal distribution of the data, which adds a significant contribution to the framework. 
This study aims to utilize the convolution recurrent neural network proposed in a PM2.5 prediction study in 
Japan (Zhao and Zettsu, 2018), together with a longer PM10 dataset of 15 y (2005–2019) provided by the Department of 
Environment, Park and Recreation (JASTRE) and gathered from four stations, one in each district, to predict daily 
PM10 concentrations in Brunei Darussalam. The study also applies other prediction methods, such as linear 
regression, recurrent neural network, and long short-term memory, and evaluates all methods using 
performance metrics. Results from the prediction model can aid the government and organizations in 
monitoring air quality and can act as an early-warning advisory to the public. Table 1 
summarizes the total number of days on which PM10 exceeded 50 μg m⁻³ (above the level for good air quality in the 
guidelines issued by the Ministry of Health in its 2013 health advisory; MOH, 2013). Table 1 shows that 2015 had 
the highest number of exceedance days (74 d), followed by 2019 (55 d) and 2009 (53 d); 2013 recorded 36 d. 

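Table 1: Number of days on which PM10 concentrations exceeded 50 μg m⁻³ (above the MOH guideline for good air quality) 

| Year | Anggerek (Brunei-Muara) | Bukit Bendera (Tutong) | Mumong (Belait) | Taman Batang Duri (Temburong) | Sum of number of days exceeding 50 μg m⁻³ |
|---|---|---|---|---|---|
| 2005 | 7 | 1 | 6 | 1 | 15 |
| 2006 | 8 | - | 20 | 20 | 48 |
| 2007 | - | - | 3 | - | 3 |
| 2008 | 2 | 1 | 2 | - | 5 |
| 2009 | 8 | 13 | 20 | 12 | 53 |
| 2010 | - | - | 2 | - | 2 |
| 2011 | 4 | 5 | 8 | 3 | 20 |
| 2012 | - | 7 | 5 | - | 12 |
| 2013 | 3 | 7 | 21 | 5 | 36 |
| 2014 | - | - | 6 | - | 6 |
| 2015 | 13 | 26 | 32 | 3 | 74 |
| 2016 | - | 3 | 6 | - | 9 |
| 2017 | - | - | - | - | - |
| 2018 | - | 1 | 1 | - | 2 |
| 2019 | 13 | 14 | 25 | 3 | 55 |
| Total | 58 | 78 | 157 | 47 | |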
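
Exceedance counts of the kind reported in Table 1 can in principle be derived from the daily station series. The following is a minimal sketch, assuming a hypothetical long-format table with columns `date`, `station` and `pm10`; these column names and the toy values are illustrative only and do not reflect the JASTRE data layout.

```python
import pandas as pd

THRESHOLD = 50.0  # µg m-3, the good-air-quality level referred to in Table 1


def exceedance_days(df: pd.DataFrame, threshold: float = THRESHOLD) -> pd.DataFrame:
    """Count, per year and station, the days whose daily PM10 exceeds `threshold`."""
    exceeded = df[df["pm10"] > threshold]
    counts = (
        exceeded.groupby([exceeded["date"].dt.year, "station"])
        .size()
        .unstack(fill_value=0)  # stations become columns, missing pairs become 0
    )
    counts.index.name = "year"
    counts["total"] = counts.sum(axis=1)
    return counts


# Toy data (hypothetical values, not measurements).
toy = pd.DataFrame(
    {
        "date": pd.to_datetime(["2015-09-01", "2015-09-02", "2015-09-02", "2016-03-01"]),
        "station": ["Mumong", "Mumong", "Anggerek", "Mumong"],
        "pm10": [62.0, 48.0, 55.0, 71.0],
    }
)
table = exceedance_days(toy)
print(table)
```

Note that in this sketch a station or year with no exceedance at all simply does not appear in the result, which corresponds to the "-" entries in Table 1.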
 
 


The contributions and novelty of this study are: 
1. It utilizes the recently proposed prediction method that combines convolution and recurrent neural 
networks (Zhao and Zettsu, 2018), incorporating the spatial and temporal distribution of the 
dataset, to predict daily PM10 concentrations. 

2. All prediction methods used in this study, notably the convolution recurrent neural network model, can be used 
for real-time daily forecasting, which may reduce the expense of using tapered element oscillating microbalance 
instruments to monitor the pollutant concentration. 

3. It uses a larger dataset of air quality concentrations from Brunei (2005–2019) and is 
the first prediction study to develop prediction methods using 15 y of recent air quality data.  

2. Data and methods 
2.1 Study area and data 

Brunei Darussalam (latitude 4.5353° N, longitude 114.7277° E) and its four administrative districts are shown 
in Figure 1, which indicates the locations of the Anggerek air quality monitoring station in Brunei-Muara district (latitude 
4.9329° N, longitude 114.9415° E), the Bukit Bendera air quality monitoring station in Tutong district (latitude 
4.8102° N, longitude 114.6601° E), the Mumong air quality monitoring station in Belait district (latitude 4.5751° N, 
longitude 114.2330° E) and the Taman Batang Duri air quality monitoring station in Temburong district (latitude 
4.5786° N, longitude 115.1215° E). The PM10 data used in this study were gathered from four air quality 
monitoring stations, one in each district, and provided by JASTRE.  

Figure 1: Map showing the locations of the air quality monitoring stations in each district in Brunei Darussalam. 

The mean daily PM10 concentrations and meteorological datasets were pre-processed and restructured into the 
following format: year, month, day, latitude and longitude of the air quality monitoring station, and the mean 
daily PM10 concentration at the station. The available data were divided into training and test 
sets: 13 y of data (2005–2017) were used for training the models, and 2 y of data (2018–2019) were used for 
testing the trained models.  
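
The restructuring and the chronological split described above can be sketched as follows. This is an illustration only: the helper names `restructure` and `split_by_year`, the column names, and the synthetic PM10 values are assumptions, not the authors' pipeline; the coordinates are those given for the Mumong station.

```python
import numpy as np
import pandas as pd


def restructure(daily: pd.DataFrame, latitude: float, longitude: float) -> pd.DataFrame:
    """Reshape a (date, pm10) series into year, month, day, latitude, longitude, pm10."""
    return pd.DataFrame(
        {
            "year": daily["date"].dt.year,
            "month": daily["date"].dt.month,
            "day": daily["date"].dt.day,
            "latitude": latitude,
            "longitude": longitude,
            "pm10": daily["pm10"],
        }
    )


def split_by_year(records: pd.DataFrame, last_train_year: int = 2017):
    """Chronological split: years up to `last_train_year` train, later years test."""
    train = records[records["year"] <= last_train_year]
    test = records[records["year"] > last_train_year]
    return train, test


# Synthetic daily series for 2005-2019 (random values, for illustration only).
dates = pd.date_range("2005-01-01", "2019-12-31", freq="D")
rng = np.random.default_rng(0)
daily = pd.DataFrame({"date": dates, "pm10": rng.uniform(5.0, 60.0, size=len(dates))})

records = restructure(daily, latitude=4.5751, longitude=114.2330)
train, test = split_by_year(records)
print(len(train), len(test))
```

Splitting by calendar year rather than at random keeps all test days strictly after the training period, which is what a forecasting evaluation needs.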

2.2 Methods 

This study experimented with several methods to forecast PM10, including common mathematical methods 
and neural network-based methods, and also leveraged the Convolutional Recurrent 
Neural Network (CRNN) approach proposed by Zhao and Zettsu (2018). A naïve forecast is used as the baseline model: 
the PM10 value of the previous timestep (i.e. day) is used as the next-day forecast, without any 
attempt to model the data. The results of the naïve forecast serve as a reference against which the forecasts 
of the other models are compared. The moving average forecast takes the average PM10 value over a 
number of previous timesteps, and the averaging window is then shifted forward through the PM10 data. 
Linear regression is also used, estimating a linear function of the values in a previous time window of 
PM10 data (e.g. a window size of 7 d). Recurrent neural network (RNN) models were also used, including a simple 
RNN, a bidirectional long short-term memory (LSTM) network, and an LSTM with a 1-D convolutional layer. These methods 
were applied to each station’s data individually, as multiple time-series prediction tasks. The approach 
proposed by Zhao and Zettsu (2018) makes it possible to combine spatio-temporal data from multiple locations and 



model them using a CRNN model. Figure 2 shows an example of how convolutional networks process the spatial 
features of the data.  

 
Figure 2: The process of feeding spatial environmental data, based on coordinates, to the convolutional neural 
network to generate a spatial feature map.  
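
The three non-neural baselines described in Section 2.2 (naïve forecast, moving average, and linear regression on a sliding window) can be sketched with NumPy as below. The function names and the toy series are hypothetical, and the regression is fitted here by ordinary least squares as a stand-in, since the paper does not specify the fitting procedure.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def naive_forecast(series: np.ndarray) -> np.ndarray:
    """Forecast for day t is the value of day t-1; aligns with series[1:]."""
    return series[:-1]


def moving_average_forecast(series: np.ndarray, window: int) -> np.ndarray:
    """Forecast for day t is the mean of the `window` preceding days; aligns with series[window:]."""
    return sliding_window_view(series, window)[:-1].mean(axis=1)


def make_windows(series: np.ndarray, window: int):
    """Build (X, y): each row of X holds `window` past days, y is the next day."""
    X = sliding_window_view(series, window)[:-1]
    y = series[window:]
    return X, y


def fit_linear_regression(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least squares with an intercept (an assumed fitting method)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear_regression(coef: np.ndarray, X: np.ndarray) -> np.ndarray:
    return X @ coef[:-1] + coef[-1]


# Toy daily PM10 values (hypothetical).
pm10 = np.array([18.0, 20.0, 25.0, 23.0, 30.0, 28.0, 26.0, 24.0, 27.0, 31.0])

naive = naive_forecast(pm10)
ma2 = moving_average_forecast(pm10, window=2)
X, y = make_windows(pm10, window=3)
coef = fit_linear_regression(X, y)
lr = predict_linear_regression(coef, X)
```

Each forecast array is aligned with the tail of the series it predicts (`pm10[1:]`, `pm10[2:]` and `pm10[3:]` here), so errors must be computed against the matching slice.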

Values at grid cells where there are no stations are approximated using inverse distance 
weighting (IDW), chosen for its simplicity. Other methods that take terrain into account (Wang et al., 2019) may also be 
used.  
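
Inverse distance weighting onto a regular grid can be illustrated as follows. This is a sketch under stated assumptions: the exponent (`power=2`), the grid extent and resolution, the use of plain Euclidean distance in degrees, and the PM10 values are all illustrative choices not specified in the paper; only the station coordinates come from Section 2.1.

```python
import numpy as np


def idw_grid(station_xy, station_values, grid_x, grid_y, power=2.0, eps=1e-12):
    """Interpolate station values onto a grid with inverse distance weighting.

    station_xy: array of shape (n_stations, 2) holding (longitude, latitude).
    Returns an array of shape (len(grid_y), len(grid_x)).
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)
    # Distance from every grid cell to every station.
    dist = np.linalg.norm(cells[:, None, :] - station_xy[None, :, :], axis=2)
    # Clamp tiny distances so a cell on top of a station takes (almost exactly) its value.
    weights = 1.0 / np.maximum(dist, eps) ** power
    estimate = (weights @ station_values) / weights.sum(axis=1)
    return estimate.reshape(gx.shape)


# (longitude, latitude) of Anggerek, Bukit Bendera, Mumong, Taman Batang Duri.
station_xy = np.array(
    [[114.9415, 4.9329], [114.6601, 4.8102], [114.2330, 4.5751], [115.1215, 4.5786]]
)
pm10_today = np.array([22.0, 18.0, 30.0, 15.0])  # hypothetical daily means

grid = idw_grid(
    station_xy,
    pm10_today,
    grid_x=np.linspace(114.0, 115.4, 15),
    grid_y=np.linspace(4.0, 5.1, 12),
)
print(grid.shape)
```

Because every estimate is a weighted average of the station values, the interpolated field never leaves the range spanned by the stations, which is one reason IDW is simple but cannot represent peaks between stations.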

2.3 Model performance evaluation 

Performance evaluation quantifies the error between the predicted and the observed data on the 
test set, which can be measured using different performance metrics. In this study, the following 
metrics were used: mean absolute error (MAE), root mean squared error (RMSE), and symmetric mean 
absolute percentage error (SMAPE). 
MAE, defined in Eq(1), measures the average of the absolute differences between the predicted and the observed data. An MAE 
value close to zero indicates good agreement between the predicted and observed data: 

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_p - y_o \right| \qquad (1)$$

RMSE measures the standard deviation of the prediction errors, i.e. of the differences between the predicted and the observed data, as shown in Eq(2): 

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_p - y_o \right)^2}{n}} \qquad (2)$$

SMAPE averages the absolute differences between the predicted and the observed data, each normalized by the 
sum of the absolute predicted and observed values, as shown in Eq(3): 

$$\mathrm{SMAPE} = \frac{100\,\%}{n} \sum_{i=1}^{n} \frac{\left| y_p - y_o \right|}{\left| y_o \right| + \left| y_p \right|} \qquad (3)$$

where, in Eq(1) to Eq(3), y_p is the predicted data, y_o is the observed data, and n is the total number of 
samples. 
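
The three metrics translate directly into NumPy. The sketch below follows Eq(1) to Eq(3) as written; the function names and the toy arrays are illustrative. The `smape` function returns a percentage, as in Eq(3), and is undefined for a sample where both the observed and predicted values are zero.

```python
import numpy as np


def mae(y_obs: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error, Eq(1)."""
    return float(np.mean(np.abs(y_pred - y_obs)))


def rmse(y_obs: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error, Eq(2)."""
    return float(np.sqrt(np.mean((y_pred - y_obs) ** 2)))


def smape(y_obs: np.ndarray, y_pred: np.ndarray) -> float:
    """Symmetric mean absolute percentage error in percent, Eq(3)."""
    ratio = np.abs(y_pred - y_obs) / (np.abs(y_obs) + np.abs(y_pred))
    return float(100.0 * np.mean(ratio))


# Toy observed and predicted PM10 values (hypothetical).
y_obs = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 18.0, 33.0])

print(mae(y_obs, y_pred), rmse(y_obs, y_pred), smape(y_obs, y_pred))
```

RMSE is never smaller than MAE on the same data, because squaring gives large errors more weight; a wide gap between the two points to a few large misses rather than uniformly moderate errors.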

3. Results and Discussions 
The computational experiments for this study were conducted using Python and TensorFlow. The parameters 
of the predictive models were learnt by training on the 13 y (2005–2017) of mean daily PM10 
concentrations (80 % of the data), and the remaining 20 % of the data (2018–2019) were used for evaluation, 
to avoid an optimistically biased evaluation due to over-fitting.  
For the moving average method, a two-day averaging window gave the best performance on 
the test set among the window sizes tried (2–10 d). For the other methods, model fitting was run 
individually on each station’s training dataset, using a variety of parameters. All experiments, except for 
CRNN, used 500 epochs (very similar results were obtained with 80 and 300 epochs). For the CRNN 
method, the CNN part ran for 800 epochs and the RNN part ran for 700 epochs.  
The simple RNN network used two layers of 40 neurons each; the LSTM network used two bidirectional layers 
of 32 LSTM units each; and the LSTM with 1D-Conv used two unidirectional layers of 32 LSTM units each, 
preceded by a Conv1D layer with 32 filters (kernel size 5).  
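
The three recurrent architectures described above could be expressed in Keras roughly as follows. Only the layer types, layer counts, unit counts and kernel size come from the text; the input window of 7 d, the single-unit dense output, the causal padding, the ReLU activation, and the Adam optimizer with mean squared error loss are assumptions made for this sketch, and the builder function names are hypothetical.

```python
import tensorflow as tf

WINDOW = 7  # assumed number of past days fed to each model


def build_simple_rnn(window: int = WINDOW) -> tf.keras.Model:
    """Two stacked SimpleRNN layers of 40 units each."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(window, 1)),
        tf.keras.layers.SimpleRNN(40, return_sequences=True),
        tf.keras.layers.SimpleRNN(40),
        tf.keras.layers.Dense(1),  # assumed output layer: next-day PM10
    ])


def build_bidirectional_lstm(window: int = WINDOW) -> tf.keras.Model:
    """Two bidirectional LSTM layers of 32 units each."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(window, 1)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(1),
    ])


def build_conv_lstm(window: int = WINDOW) -> tf.keras.Model:
    """Conv1D (32 filters, kernel size 5) followed by two unidirectional LSTM layers."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(window, 1)),
        tf.keras.layers.Conv1D(filters=32, kernel_size=5, padding="causal", activation="relu"),
        tf.keras.layers.LSTM(32, return_sequences=True),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])


models = {
    "simple_rnn": build_simple_rnn(),
    "bidirectional_lstm": build_bidirectional_lstm(),
    "conv_lstm": build_conv_lstm(),
}
for model in models.values():
    model.compile(optimizer="adam", loss="mse")  # assumed training configuration
```

Each model maps a window of past daily values for one station to a single next-day value, matching the per-station time-series setup described in Section 2.2.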
Table 2 presents the performance metrics of each method, averaged over all stations; the CRNN method 
shows the lowest prediction error across all metrics.  



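Table 2: Mean performance metrics of the models at the four stations 

| Method | RMSE (80 epochs) | RMSE (300 epochs) | RMSE (500 epochs) | MAE (80 epochs) | MAE (300 epochs) | MAE (500 epochs) | SMAPE (80 epochs) | SMAPE (300 epochs) | SMAPE (500 epochs) |
|---|---|---|---|---|---|---|---|---|---|
| Naïve Forecast | 5.294 | | | 3.241 | | | 0.156 | | |
| Moving Average | 5.867 | | | 3.662 | | | 0.172 | | |
| Linear Regression | 5.217 | 5.180 | 5.169 | 3.245 | 3.226 | 3.216 | 0.167 | 0.155 | 0.154 |
| Simple RNN | 5.419 | 5.168 | 5.195 | 3.471 | 3.228 | 3.228 | 0.164 | 0.155 | 0.155 |
| LSTM | 5.393 | 5.353 | 5.270 | 3.293 | 3.349 | 3.330 | 0.155 | 0.160 | 0.159 |
| LSTM 1D-Conv | 5.575 | 5.784 | 5.917 | 3.386 | 3.494 | 3.548 | 0.160 | 0.165 | 0.165 |
| CRNN | 3.414 | | | 2.293 | | | 0.125 | | |

Rows with a single value per metric (naïve forecast, moving average, CRNN) are not tied to the 80, 300 and 500 epoch settings. 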
 
Figure 3 shows an example of how one model fits the training data and is also able to generalize to 
the test set for the station in Belait district. 
 

Figure 3: Example of a time-series graph of the observed and predicted PM10 concentrations at 
Mumong station, Belait district, for (a) the test set (2018–2019) and (b) the training set (2005–2017). 

Overall, based on the metrics obtained from all methods, all models have a relatively small error 
for one-day prediction, which suggests that any of them could probably be used for practical forecasting. It is also 
interesting to note that the methods other than CRNN achieved errors close to, and in several cases slightly worse than, those of the naïve 
forecast. For practical applications, multi-day forecasts are believed to be necessary; prediction accuracy 
is expected to degrade considerably compared with the single-timestep case, and the 
CRNN method is expected to prevail as the most suitable method. 
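
To put the gap between CRNN and the naïve baseline in relative terms, the values reported in Table 2 can be compared directly. The short calculation below uses only those reported numbers; the helper name `relative_reduction` is illustrative.

```python
# Mean metrics reported in Table 2.
naive = {"RMSE": 5.294, "MAE": 3.241, "SMAPE": 0.156}
crnn = {"RMSE": 3.414, "MAE": 2.293, "SMAPE": 0.125}


def relative_reduction(baseline: float, model: float) -> float:
    """Fraction by which `model` error is lower than `baseline` error."""
    return (baseline - model) / baseline


reductions = {metric: relative_reduction(naive[metric], crnn[metric]) for metric in naive}
for metric, value in reductions.items():
    print(f"{metric}: {100 * value:.1f} % lower than the naïve forecast")
```

On these figures the CRNN errors are roughly 36 % (RMSE), 29 % (MAE) and 20 % (SMAPE) lower than those of the naïve forecast, whereas the other methods differ from it by only a few percent in either direction.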

4. Conclusions 
This study applied several prediction methods to forecast daily PM10 concentrations in the four districts of 
Brunei Darussalam and evaluated their performance using the Mean Absolute Error (MAE), Root 
Mean Squared Error (RMSE), and Symmetric Mean Absolute Percentage Error (SMAPE) metrics. The Convolution 
Recurrent Neural Network (CRNN) model showed the most satisfactory forecasting results across all 
metrics, with RMSE, MAE, and SMAPE values of 3.414, 2.293, and 0.125. Since air pollution is a recurrent 



issue in Brunei Darussalam, especially during the dry seasons, the results from the forecasting model can be 
used as part of an early-warning application that provides health advisories for citizens, such as wearing 
face masks or limiting outdoor activities, or shows safer routes to school or work that avoid heavily polluted areas, 
to mitigate the negative impacts of haze pollution on residents. Such an application would aid the government, 
especially the Ministry of Health, as many studies have shown a significant relationship between PM10 
concentrations and cardiovascular and respiratory diseases. Future work should include running the CRNN model 
for multi-timestep predictions and using air quality data from neighbouring countries in the Southeast Asia region to 
account for transboundary air pollution. One possible approach is that of Zhao and Zettsu (2019), who used 
convolution recurrent neural networks for dynamic transboundary air pollution prediction. In addition, 
including meteorological parameters such as rainfall, humidity, and 
temperature, and other pollutant concentrations, such as carbon monoxide (CO), fine particulate matter 
(PM2.5), nitrogen dioxide (NO2), sulphur dioxide (SO2) and ozone (O3), is expected to improve the accuracy of 
the predictions and the applicability of the results. 

Acknowledgments 

The authors acknowledge the support from the JASTIP-Net 2019 Grant and Universiti Teknologi Brunei's Internal 
Grant "UTB/GSR/1/2019(10)" and thank them for supporting the research activities required to complete this 
study. Great appreciation is extended to the Department of Environment, Park and Recreation (JASTRE) for 
providing the daily PM10 concentration data. 

References 

Anaman K.A., 2001, Urban householders’ assessment of the causes, responses, and economic impact of the 
1998 haze-related air pollution episode in Brunei Darussalam, ASEAN Economic Bulletin, 18(2), 193–205. 

Anaman K.A., Ibrahim N., 2003, Statistical estimation of dose-response functions of respiratory diseases and 
societal costs of haze-related air pollution in Brunei Darussalam, Pure and Applied Geophysics, 160(1), 
279–293. 

Anaman K.A., Looi C.N., 2000, Economic impact of haze related air pollution of the tourism industry in Brunei 
Darussalam, Economic Analysis and Policy, 30(2), 133–143.  

Donkelaar A.V., Martin R., Brauer M., Kahn R., Levy R.C., Verduzco C., Villeneuve P.J., 2010, Global 
estimates of ambient fine particulate matter concentrations from satellite-based aerosol optical depth: 
development and application, Environmental Health Perspectives, 118(6), 847–855. 

Dotse S.Q., Petra M.I., Dagar L., Silva L.C.D., 2017, Application of computational intelligence techniques to 
forecast daily PM10 exceedances in Brunei Darussalam, Atmospheric Pollution Research, 9(2), 358–368. 

IPCC, 2013, Fifth assessment report of the intergovernmental panel on climate change, Cambridge University 
Press, Cambridge, United Kingdom. 

Limin S.H., Rieley J.O., Jaya S., Gumiri S., 2006, The impact of forest fires and resultant haze on terrestrial 
ecosystems and human health in central Kalimantan, Indonesia, Tropics, 15(3), 321–326. 

MOH, 2013, Health advisory during haze episodes, Ministry of Health Brunei Darussalam 
<www.moh.gov.bn/SiteCollectionDocuments/Haze/health-advisory-2013.pdf> accessed 19.05.2020. 

Newell K., Kartsonaki C., Lam K.B.H., Kurmi O.P., 2017, Cardiorespiratory health effects of particulate 
ambient air pollution exposure in low-income and middle-income countries: a systematic review and meta-
analysis, Lancet Planetary Health, 1(9), 368–380. 

Saeed S., Hussain L., Awan I.A., Idris A., 2017, Comparative analysis of different statistical methods for 
prediction of PM2.5 and PM10 concentrations in advance for several hours, International Journal of 
Computer Science and Network Security, 17(11), 45–52. 

Wang X., Klemeš J.J., Fan W., Dong X., 2019, An overview of air-pollution terrain nexus, Chemical 
Engineering Transactions, 72, 31–36. 

Yadav A.K., Kumar K., Kasim A.M., Singh M.P., 2003, Visibility and incidence of respiratory diseases during 
the 1998 haze episode in Brunei Darussalam, Pure and Applied Geophysics, 160(1), 265–277. 

Yang G., Lee H., Lee G., 2020, A hybrid deep learning model to forecast particulate matter concentration 
levels in Seoul, South Korea, Atmosphere, 11(4), 348–367. 

Zhao P., Zettsu K., 2018, Convolution recurrent neural networks for short-term prediction of atmospheric 
sensing data, 2018 IEEE International Conference on Internet of Things and IEEE Green Computing and 
Communications and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data 
(SmartData), 30th July–3rd August 2018, Halifax, NS, Canada. 

Zhao P., Zettsu K., 2019, Convolution recurrent neural networks based dynamic transboundary air pollution 
prediction, 2019 IEEE 4th International Conference on Big Data Analytics (ICBDA), 15th March–18th 
March 2019, Suzhou, China. 
