IIUM Engineering Journal, Vol. 18, No. 1, 2017 Ahmad et al.

AIR POLLUTION INDEX PREDICTION USING MULTIPLE NEURAL NETWORKS

ZAINAL AHMAD1*, NAZIRA ANIZA RAHIM2, ALIREZA BAHADORI3 AND JIE ZHANG4

1School of Chemical Engineering, Engineering Campus, Universiti Sains Malaysia, 14300, Nibong Tebal, Penang, Malaysia.
2River Basin Research Centre, National Hydraulic Research Institute of Malaysia, Lot 5377, Jalan Putra Permai, 43300 Seri Kembangan, Selangor, Malaysia.
3School of Environment, Science and Engineering, Southern Cross University, Lismore, New South Wales, Australia.
4School of Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom.

*Corresponding author: chzahmad@usm.my

(Received: 25th May 2016; Accepted: 27th Sept. 2016; Published online: 30th May 2017)

ABSTRACT: Air quality monitoring and forecasting tools are necessary for taking precautionary measures against air pollution, such as reducing the effect of a predicted air pollution peak on the surrounding population and ecosystem. In this study a single Feed-forward Artificial Neural Network (FANN) is shown to be able to predict the Air Pollution Index (API) with a Mean Squared Error (MSE) and coefficient of determination, R2, of 0.1856 and 0.7950 respectively. However, due to the non-robust nature of a single FANN, a selective combination of Multiple Neural Networks (MNN) is introduced using backward elimination and forward selection methods. The results show that both selective combination methods can improve the robustness and performance of the API prediction, with an MSE and R2 of 0.1614 and 0.8210 respectively. This clearly shows that it is possible to reduce the number of networks combined in the MNN for API prediction without any loss in the performance of the final API prediction model.
KEYWORDS: air pollution index; artificial neural networks; multiple neural networks; forward selection; backward elimination

1. INTRODUCTION

Air quality is monitored continuously and manually to detect any changes in the ambient air quality status that may cause harm to human health or the environment. The Malaysian Department of Environment (DOE) monitors the ambient air quality via a network of 51 monitoring stations across Malaysia [1]. These monitoring stations are strategically located in residential, traffic, and industrial areas to detect any significant changes in the air quality which could be harmful to human health and the environment.
The ambient air quality measurement in Malaysia is described in terms of the Air Pollutant Index (API), which is a simple way to describe and report the air quality instead of using the actual concentration of air pollutants. The API also reflects effects on human health, ranging from good to hazardous, and can be categorized according to its action criteria as specified in the National Haze Action Plan Malaysia. Efficient methods for the assessment of air quality are needed in order to establish mechanisms for managing pollutant concentration and preventing illness in health-sensitive people. The criterion for good air quality varies with the kind of ecosystem and is established at different levels. Several methodologies for the assessment and monitoring of air pollutants have been implemented by organizations such as the Department of Environment (DOE) of Malaysia, which has developed indexes for air quality.

In response to this concern, several studies on air quality prediction using artificial neural networks have been carried out [2, 3]. Unlike other modelling techniques, artificial neural networks (ANN) make no prior assumptions concerning the data distribution and require no mechanistic knowledge. An ANN is capable of modelling highly nonlinear relationships and can be trained to generalize accurately when presented with a new data set. Air quality prediction models based on neural networks have also been applied on both a short-term and a long-term basis. Viotti et al. [4] applied such a prediction model to predict the vehicular air pollutant levels in the city of Perugia, Italy, while Sabri and Tarek [5] applied it in the region of Annaba, Algeria. The latter, however, combined a radial basis function (RBF) network and a multilayer perceptron (MLP) in their model to predict the air pollutant concentrations.
In addition to the emission sources, meteorological factors (wind speed and direction, temperature, precipitation and boundary layer heights) can also govern the variability of atmospheric PM10 [6, 7]. In fact, urban and industrialized areas tend to record their highest PM10 concentrations under stable meteorological conditions coupled with thermal inversions or during long range transport events [8, 9], while the lowest readings tend to occur during windy and rainy periods [10]. Many researchers have studied the prediction of particulate matter concentration in the environment. Perez et al. [11] and Yan and Jian [12] focused their studies on the prediction of the PM2.5 (particulate matter with a diameter smaller than 2.5 micrometers) concentration using an ANN model.

Some researchers have developed air quality prediction models based on neural networks with a multilayer perceptron structure. Gardner and Dorling [19] and Perez and Trier [14] adopted this model to predict the NO and NO2 concentrations based on meteorological data in Central London and at traffic junctions in Santiago, Chile, respectively. They also concluded that the MLP performs better than their previously developed regression models. Feed-forward artificial neural networks (FANN) have also been applied by Sousa et al. [15] to predict hourly ozone concentration based on meteorological data, while Ul-Saufie et al. [16] combined FANN with PCA to predict PM10 concentration in Negeri Sembilan, Malaysia. Cigizoglu and Kisi [17] have also applied FANN, in their case to suspended sediment estimation. Chelani et al. [18] predicted SO2 values at three sites in Delhi, India, using neural networks and compared the results with those of multivariate regression models. Wind speed, wind direction index, relative humidity and temperature were used as inputs for their recurrent neural network.
Although ANNs have been applied successfully in many areas and place considerably fewer restrictions on the environmental input data, large training data sets are usually required to improve the accuracy and minimize the uncertainty in the output data, which up to now has been a significant disadvantage of these models. Gardner and Dorling [13] have reviewed the limitations and problems associated with the training of ANNs and emphasized that a fundamental understanding of the basic theory is the key to developing ANNs. It is well known that a neural network can approximate any smooth nonlinear function between model inputs and outputs by selecting a suitable set of connecting weights and transfer functions [19]. Therefore in this paper, a selective combination of multiple neural networks (MNN) is introduced to improve the single feed-forward artificial neural network (FANN) prediction for the API model, as shown in Fig. 1 [20].

This paper is organized as follows: Section 2 presents the case study concerning the API sampling area and location in Malaysia. The concept of the single FANN and the MNN combination using the FS and BE methods is presented in Section 3. The results and discussion of the proposed MNN with selective combination are presented in Section 4. Finally, the last section concludes this paper.

Fig. 1: Combining multiple neural networks.

2. CASE STUDY: PERAK AIR POLLUTION INDEX MONITORING STATIONS, MALAYSIA

Most air quality data are obtained from air quality monitoring stations directly or through remote sensing instruments. Here, the air quality data were collected by the Department of Environment (DOE), Malaysia, from 4 monitoring stations around Perak State (CA0020, CA0041, CA0045 and CA0046), as illustrated in Fig. 2.
These Continuous Air Quality Monitoring (CAQM)-type monitoring stations are strategically located in residential, traffic, and industrial areas to detect any significant changes in the air quality that may be harmful to human health and the environment [1]. The air quality data were recorded for 4 years, from 2006 to 2009, for eight variables. For the API modelling, the variables involved are the concentrations of the air pollutants and the meteorological variables, and they are divided into groups of input and output variables for the FANN model. For this study, however, 6 input variables are selected for model development, as shown in Table 1. A total of 1388 samples were used for the modelling and analysis in this study, and the raw data for the modelling of year 2006 are shown in Fig. 3.

Fig. 2: Perak air monitoring stations [1].

Table 1: Air quality variables for API modelling

Input variables (average hourly)                                  Output variable
Ozone, O3 (mg/l)                                                  API
Particulate matter with size less than 10 microns, PM10 (mg/l)
Carbon monoxide, CO (mg/l)
Wind speed (km/hr)
Air temperature (°C)
Relative humidity (%)

3. FEED-FORWARD ARTIFICIAL NEURAL NETWORK MODEL DEVELOPMENT

In this case study, 1388 hourly average data samples were taken from the database of Malaysia's Department of Environment for the years 2006 to 2009. All the data were normalized to zero mean and unit standard deviation to cope with the different magnitudes in the input and output data. Then, the data were divided randomly using the MATLAB(TM) divideint command into three sets, namely 70% (972 samples) as training data, 15% (208 samples) as testing data, and 15% (208 samples) as unseen validation data. The individual networks were then trained using the Levenberg-Marquardt optimization algorithm with regularization and "early stopping". The networks are single hidden layer feed-forward neural networks (FANN).
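The data preparation described above can be illustrated with a minimal sketch. It is not the MATLAB code used in the study: the helper names `zscore` and `split_indices`, the use of the population standard deviation, the fixed seed and the shuffle-based split are all assumptions made for illustration. Only the 70/15/15 proportions and the 1388-sample total are taken from the text.

```python
import random
import statistics


def zscore(values):
    """Scale a list of numbers to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used; the paper does
    not say which estimator was applied.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition range(n_samples) into three disjoint index lists.

    Hypothetical stand-in for the data division step: indices are shuffled
    and cut into training, testing and validation blocks.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n_samples)
    n_test = round(fractions[1] * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


# With the 1388 samples of the case study the split sizes match the text.
train_idx, test_idx, val_idx = split_indices(1388)
print(len(train_idx), len(test_idx), len(val_idx))  # 972 208 208
```

In such a sketch each input variable and the API output would be passed through `zscore` separately, so that every variable contributes on a comparable scale during training.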
Hidden layer neurons use the logarithmic sigmoid activation function whereas output layer neurons use the linear activation function.

Fig. 3: Raw data for input and output for FANN model prediction for year 2006, plotted against sample number. (a) Average wind speed (km/hr), (b) Average ambient temperature (Celsius), (c) Average relative humidity (%), (d) Average CO (mg/l), (e) Average PM10 (mg/l), (f) Average O3 (mg/l), (g) API.

In this study, 20 networks with fixed identical structure were developed from bootstrap re-samples of the original training and testing data. If the number of networks for combination is too small, we might not get the optimum reduction of the MSE in the combination. In re-sampling the training and testing data, a bootstrap re-sampling technique was applied where the training and testing data were first transformed into discrete time functions, so that re-sampling the discrete time data does not affect the input-output mapping of the models. The FANN is developed based on the discrete time of the process, as the prediction output at time t, $\hat{y}(t)$, is predicted based on the process inputs at time t, u(t), as follows:

$\hat{y}(t) = f[u_1(t), u_2(t), \ldots, u_m(t)]$    (1)

where u(t) is the process input at time t, $\hat{y}(t)$ is the predicted process output at time t, which is the API, and m is the number of process inputs, which for this case study is 6 as shown in Table 1.
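As an illustration of Eqn. (1) and of the bootstrap re-sampling step, the following sketch evaluates a single hidden layer network with log-sigmoid hidden neurons and a linear output neuron, and draws a bootstrap re-sample of input-output pairs. The function names, the random weights and the choice of 9 hidden nodes (the topology selected later in Section 4) are illustrative assumptions; the trained weights of the study are not available here, and no training is performed.

```python
import math
import random


def logsig(x):
    """Logarithmic sigmoid activation, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def fann_predict(u, hidden_weights, hidden_biases, output_weights, output_bias):
    """Evaluate y_hat(t) = f[u_1(t), ..., u_m(t)] for one input vector u."""
    hidden = [
        logsig(sum(w * x for w, x in zip(row, u)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Linear output neuron: weighted sum of hidden activations plus a bias.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


def bootstrap_resample(samples, rng):
    """Draw len(samples) items with replacement.

    Each item is a complete (u(t), y(t)) pair, so the input-output mapping
    is preserved by the re-sampling.
    """
    return [rng.choice(samples) for _ in samples]


# Hypothetical 6-9-1 network with random (untrained) weights.
rng = random.Random(0)
m, n_hidden = 6, 9
W1 = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b2 = rng.uniform(-1, 1)

u_t = [0.0] * m  # one normalized input vector (all inputs at their mean)
y_hat = fann_predict(u_t, W1, b1, w2, b2)
```

Under these assumptions, each of the 20 networks would be fitted to its own `bootstrap_resample` of the training and testing pairs, which is what makes the individual networks differ from one another.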
Then the forward selection (FS) and backward elimination (BE) approaches, combined with the simple averaging method, were applied. The FS and BE methods were developed in our previous paper for a different prediction application [21]. In FS, the individual networks are added one at a time to the aggregated network, and at each step the network added is the one whose inclusion produces the greatest decrease in the model prediction MSE. The process starts with an empty aggregated model, and the first network chosen is the single network with the lowest MSE on the training and testing data, which we call the best individual network. The second network added is the one that, when combined with the first, produces the largest reduction in MSE on the original training and testing data. This procedure is repeated until the MSE on the training and testing data cannot be further reduced by adding more networks.

In BE, on the other hand, the aggregated network begins by combining all the individual networks in the pool and removes one network at a time until the MSE on the training and testing data cannot be further reduced. The network deleted at each step is selected such that its deletion results in the largest reduction in the aggregated network MSE on the training and testing data. The detailed procedures for the FS and BE methods can be found in [21]. The simple average method is used to combine the selected networks in both approaches, as shown in Eqn. (2) where, if all n networks are combined, the aggregated network output is:

$\hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_i$    (2)

The performance of the models is assessed by comparing the actual and predicted values using the MSE and the coefficient of determination, R2. The advantages of using the MSE include its easy calculation and that it penalizes large errors in each observation.
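The two selection procedures and the simple average of Eqn. (2) can be sketched as follows. This is a minimal illustration written from the description above, not the implementation of [21]: the function names, the stopping rule (stop as soon as no candidate lowers the MSE) and the toy predictions are assumptions. Here `preds[i]` holds the predictions of network i on the data used for selection.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def aggregate(preds, members):
    """Simple average of the selected networks' predictions, Eqn. (2)."""
    n = len(members)
    return [sum(preds[i][k] for i in members) / n for k in range(len(preds[0]))]


def forward_selection(preds, y_true):
    """Add one network at a time while the aggregated MSE keeps decreasing."""
    selected, remaining = [], list(range(len(preds)))
    best = float("inf")
    while remaining:
        # Candidate whose inclusion gives the lowest aggregated MSE.
        score, cand = min(
            (mse(y_true, aggregate(preds, selected + [i])), i) for i in remaining
        )
        if score >= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best


def backward_elimination(preds, y_true):
    """Start from all networks and delete one at a time while the MSE decreases."""
    selected = list(range(len(preds)))
    best = mse(y_true, aggregate(preds, selected))
    while len(selected) > 1:
        # Candidate whose deletion gives the lowest aggregated MSE.
        score, cand = min(
            (mse(y_true, aggregate(preds, [j for j in selected if j != i])), i)
            for i in selected
        )
        if score >= best:
            break
        selected.remove(cand)
        best = score
    return selected, best


# Toy data (made up): networks 0 and 1 have opposite biases that cancel.
y_true = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.2, 2.2, 3.2, 4.2],
    [0.8, 1.8, 2.8, 3.8],
    [2.0, 3.0, 4.0, 5.0],
    [1.1, 1.9, 3.3, 3.7],
]
fs_members, fs_mse = forward_selection(preds, y_true)
be_members, be_mse = backward_elimination(preds, y_true)
```

On this toy data both procedures end with the same pair of networks (0 and 1), whose errors cancel under averaging. Both procedures are greedy, so in general they need not agree and neither is guaranteed to find the best possible subset.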
The squared error averaged over the sample observations therefore indicates the quality of the model prediction. R2, on the other hand, measures how consistently the model reproduces the observed data, and hence how well the model captures the actual process. The higher the value of R2 (the closer to 1) and the smaller the MSE (the closer to zero), the better the model.

4. RESULTS AND DISCUSSION

The inputs of these network models are the hourly averages of carbon monoxide, wind speed, air temperature, relative humidity, PM10 and O3, and their output is the API value, as shown in Eqn. (1). The single FANN with a single hidden layer was trained with the Levenberg-Marquardt algorithm, with a sigmoid activation function in the hidden layer and a linear activation function in the output layer. The structure of the single FANN is represented by the number of nodes in each layer. The number of nodes in the input layer is 6, which represents the input variables, while the output layer has only one node representing the single model output variable. The quality of the fitted model, however, depends on the number of nodes in the hidden layer. Therefore, the number of nodes in the hidden layer was determined by calculating the MSE for the combination of training and testing data. The number of nodes in the hidden layer was varied between 1 and 20 in order to find the "best" number of nodes for the model. Figure 4 shows the performance of the model prediction with different numbers of nodes in the hidden layer. The lowest MSE value on the combination of the training and testing data was 0.1652, recorded by the model with 9 nodes in the hidden layer. Thus, the network with 9 hidden nodes was selected as the final model structure or network architecture, i.e. the topology of the network is 6-9-1.

Fig. 4: MSE on the training and testing data with different numbers of hidden nodes.

Figure 5 shows the neural network model prediction performance on the training and testing data. In Figure 5, the solid lines represent the scaled true values of API and the dotted lines represent the model predictions. It can be seen that the predicted values are very close to the actual values for both sets of data. The MSE values for the training and testing data are 0.1988 and 0.1613, and the corresponding R2 values are 0.7962 and 0.8257, respectively. Figure 6 shows the model prediction performance of the single FANN on the unseen validation data. In Fig. 6, the scaled true values of API are represented by the solid line while the model predictions are represented by the dotted line. The single FANN model clearly follows the pattern of the process on the unseen validation data. The MSE and the R2 values on the unseen validation data are 0.1856 and 0.7950 respectively. Figure 6 clearly shows that the predicted and the actual values are close to each other. This shows that the API process can be modelled and generalized quite well using a single FANN.

Fig. 5: Actual and predicted values for training and testing data.

Fig. 6: Actual and predicted values on the unseen validation data from single FANN.

However, even though a single FANN is shown to be able to predict the API quite accurately, single FANN models sometimes lack robustness, as shown in Fig. 7a and 7b. Single FANNs sometimes perform badly when applied to unseen data, where a network might fail to deliver the correct result because its training converged to an undesired local minimum or because it over-fitted the noise in the data. In Fig. 7a, one of the best single FANNs on the training and testing data was network number 14, but its performance on the unseen data was not among the best.
Figure 7b shows that the best network on the unseen validation data is network number 7, but its performance on the training and testing data is not among the best at all. There is no guarantee that the best model on the training and testing data will be the best on the unseen data. Therefore the combination of multiple neural networks is proposed in this study with the aim of enhancing the neural network robustness on unseen data.

Fig. 7: MSE for single FANN. (a) MSE for Training and Testing data, (b) MSE for Validation data.

Fig. 8: MSE for aggregated multiple neural networks on the unseen validation data for BE and FS approaches.

Figure 8 shows the multiple neural network performance using selective combinations with the BE and FS methods. The performance of the aggregated networks on the training and testing data is consistent with the performance on the unseen validation data for both selective combination methods. The reduction of MSE on the training and testing data for the BE and FS combinations is consistent with the reduction of MSE on the unseen validation data. This shows the robustness of the proposed modelling techniques as compared to the single FANN, where the best network performance on the training and testing data does not guarantee the best performance on the unseen validation data. For both methods the final combination is reduced to 3 networks, which gives the minimum MSE on the training and testing data and correspondingly on the validation data. The final result analysis is shown in Table 2. In this particular case, the FS and BE approaches led to the same individual networks being combined.
Even though the number of networks combined was quite small for both selective methods, the most important point is that both combination approaches perform better than the single FANN.

Table 2: Statistical analysis of MNN performance on the unseen validation data

Model                 Number of networks combined   MSE      R2
Single FANN           1                             0.1856   0.7950
Combined all MNN      20                            0.1649   0.8170
FS aggregated MNN     3 (12, 15, 20)                0.1635   0.8200
BE aggregated MNN     3 (12, 15, 20)                0.1635   0.8200

For comparison, Azid et al. [22, 23] carried out API modelling for the Southern region of Malaysia with 2 different sets of data containing 202,050 and 232,505 observations respectively. In [22] the number of inputs was reduced to 10 from 12 possible inputs, with an R2 and RMSE of 0.724 and 7.562 for unseen validation data respectively. In [23], on the other hand, the number of inputs was reduced to 5 from a possible 8, with an R2 and RMSE of 0.618 and 10.017 for unseen data respectively. Therefore, the MNN performed better than the API models for Malaysia in [22] and [23], as shown in Table 2, with an R2 and RMSE of 0.8200 and 0.160 for unseen validation data respectively. This performance was obtained with fewer sample data (1388 observations) than in [22] and [23].

5. CONCLUSION

This study proposes single FANN and multiple neural networks to model the API based on environmental monitoring data, in order to obtain reliable and fast API predictions and so mitigate the problems related to air pollution. The single FANN models the API quite well, with a relatively small MSE and high R2 values on the unseen data. However, in order to overcome the non-robust nature of the single FANN, multiple neural networks are proposed with two selective combination methods. Both selective combination methods further improve the model prediction as compared to the single FANN and to combining all networks. This clearly shows that it is possible to reduce the number of networks combined for the API prediction without losses in performance.
ACKNOWLEDGEMENT

The authors would like to acknowledge the support from the Universiti Sains Malaysia (USM) and Newcastle University, United Kingdom, and to express special gratitude to the Department of Environment (DOE) Malaysia for providing their air quality data and giving permission to use it for this study.

REFERENCES

[1] ASMA. "Air Pollutant Index (API)," Retrieved on July, 2012. Available from http://www.doe.gov.my/portalv1/en/info-umum/english-air-pollutant-index-api/100
[2] Akkoyunlu A, Yetilmezsoy K, Erturk F, Oztemel E. (2010) A neural network-based approach for the prediction of urban SO2 concentrations in the Istanbul metropolitan area. International Journal of Environment and Pollution, 40(4):301-315.
[3] Wang W, Lu W, Wang X, Leung YT. (2003) Prediction of maximum daily ozone level using combined neural network and statistical characteristics. Environment International, 29(5):555-562.
[4] Viotti P, Liuti G, Di Genova P. (2002) Atmospheric urban pollution: applications of an artificial neural network (ANN) to the city of Perugia. Ecol. Modell., 148(1):27-46.
[5] Sabri G, Tarek KM. (2012) Combination of artificial neural network models for air quality predictions for the region of Annaba, Algeria. Int. J. Environ. Stud., 69(1):79-89.
[6] Amodio M, Andriani E, Cafagna I, Caselli M, Daresta BE, de Gennaro G, Tutino M. (2010) A statistical investigation about sources of PM in South Italy. Atmos. Res., 98:207-218.
[7] Rodriguez S, Querol X, Alastuey A, Kallos G, Kakaliagou O. (2001) Saharan dust contributions to PM10 and TSP levels in Southern and Eastern Spain. Atmos. Environ., 35:2433-2447.
[8] Pey J, Pérez N, Querol X, Alastuey A, Cusack M, Reche C. (2010) Intense winter atmospheric pollution episodes affecting the Western Mediterranean. Sci. Total Environ., 408(8):1951-1959.
[9] Pohjola MA, Rantamäki M, Kukkonen J, Karppinen A, Berge E. (2004) Meteorological evaluation of a severe air pollution episode in Helsinki on 27-29 December 1995. Boreal Environ. Res., 9(1):75-87.
[10] De Gennaro G, Trizio L, Di Gilio A, Pey J, Pérez N, Cusack M, Querol X. (2013) Neural network model for the prediction of PM10 daily concentrations in two sites in the Western Mediterranean. Sci. Total Environ., 463-464:875-883.
[11] Perez P, Trier A, Reyes J. (2000) Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile. Atmos. Environ., 34:1189-1196.
[12] Yan CK, Jian L. (2013) Identification of significant factors for air pollution levels using a neural network based knowledge discovery system. Neurocomputing, 99:564-569.
[13] Gardner MW, Dorling SR. (1998) Artificial neural networks (the multilayer perceptron) - A review of applications in the atmospheric sciences. Atmos. Environ., 32(14-15):2627-2636.
[14] Perez P, Trier A. (2001) Prediction of NO and NO2 concentrations near a street with heavy traffic in Santiago, Chile. Atmos. Environ., 35:1783-1789.
[15] Sousa S, Martins F, Alvim-Ferraz M, Pereira M. (2007) Multiple linear regression and artificial neural networks based on principal components to predict ozone concentrations. Environ. Model. Softw., 22(1):97-103.
[16] Ul-Saufie AZ, Yahaya AS, Ramli NA, Rosaida N, Hamid HA. (2013) Future daily PM10 concentrations prediction by combining regression models and feedforward backpropagation models with principle component analysis (PCA). Atmos. Environ., 77:621-630.
[17] Cigizoglu KH, Kisi Ö. (2006) Methods to improve the neural network performance in suspended sediment estimation. J. Hydrol., 317:221-238.
[18] Chelani AB, Chalapati RC, Phadke K, Hasan M. (2002) Prediction of sulfur dioxide concentration using artificial neural networks. Environ. Model. Softw., 17:161-168.
[19] Gardner MW, Dorling SR. (1999) Neural network modelling and prediction of hourly NOx and NO2 concentrations in urban air in London. Atmos. Environ., 33:709-719.
[20] Zhang J. (1999) Developing robust non-linear models through bootstrap aggregated neural networks. Neurocomputing, 25:93-113.
[21] Ahmad Z, Zhang J. (2009) Selective combination of multiple neural networks for improving model prediction in nonlinear systems modelling through forward selection and backward elimination. Neurocomputing, 72:1198-1204.
[22] Azid A, Juahir H, Latif MT, Mohd Zain S, Osman MR. (2013) Feed-forward artificial neural network model for Air Pollutant Index prediction in the southern region of Peninsular Malaysia. Journal of Environmental Protection, 4:1-10.
[23] Azid A, Juahir H, Toriman ME, Kamarudin MKA, Mohd Saudi AS, Che Hasnam CN, Abdul Aziz NA, Zaman F, Latif MT, Mohamed Zainuddin SF, Osman MR, Yamin M. (2014) Prediction of the level of air pollution using principal component analysis and artificial neural network techniques: a case study in Malaysia. Water Air Soil Pollut., 225:2063-2077.

NOMENCLATURE

Symbol   Description                                                               Unit
ANN      Artificial Neural Network                                                 -
API      Air Pollution Index                                                       -
BE       Backward Elimination                                                      -
CO       Carbon monoxide                                                           mg/l
FANN     Feed-forward Artificial Neural Network                                    -
FS       Forward Selection                                                         -
MLP      Multi-Layer Perceptron                                                    -
MNN      Multiple Neural Networks                                                  -
MSE      Mean squared error                                                        -
n        Number of networks combined                                               -
NO       Nitrogen monoxide                                                         mg/l
NO2      Nitrogen dioxide                                                          mg/l
O3       Ozone                                                                     mg/l
PCA      Principal Component Analysis                                              -
PM10     Concentration of particulate matter with a size less than 10 microns      mg/l
PM2.5    Concentration of particulate matter with a size less than 2.5 microns     mg/l
R2       Coefficient of determination                                              -
X        Input data                                                                -
X̂        Input data after resampling                                               -
Y        Output data                                                               -
Ŷ        Network prediction output                                                 -

Subscript
i        Network number