

                                                                                                                                                                 DOI: 10.3303/CET2295005 
 
 
Paper Received: 23 April 2022; Revised: 6 June 2022; Accepted: 2 June 2022 
Please cite this article as: Cangialosi F., Bruno E., Fornaro A., 2022, Integrating Citizen Science and Machine Learning Algorithms for the 
Recognition of Odour Classes Nearby a Wastewater Treatment Plant, Chemical Engineering Transactions, 95, 25-30  
DOI:10.3303/CET2295005 
  

 CHEMICAL ENGINEERING TRANSACTIONS  
 

VOL. 95, 2022 

A publication of 

 
The Italian Association 
of Chemical Engineering 
Online at www.cetjournal.it 

Guest Editors: Selena Sironi, Laura Capelli 

Copyright © 2022, AIDIC Servizi S.r.l. 
ISBN 978-88-95608-94-5; ISSN 2283-9216 

 
Integrating Citizen Science and Machine Learning Algorithms for the Recognition of Odour Classes nearby a Wastewater Treatment Plant

Federico Cangialosi (a,*), Edoardo Bruno (a), Antonio Fornaro (b)

(a) Tecnologia e Ambiente (T&A), Putignano (BA), Italy
(b) Labservice Analytica, Anzola dell’Emilia (BO), Italy

(*) federico.cangialosi@icloud.com

Odour nuisance is an increasingly topical problem, especially in newly developed urban areas. Machine learning
algorithms are increasingly used to classify and quantify odour sources from the measurements of instrumental
odour monitoring systems (IOMS). In this context, citizens can play a fundamental role in controlling the
environment, as several studies have already stressed: citizen science is now considered an additional tool for
the smart management of environmental monitoring, since it supports an in-depth analysis of the pollution
problem. This paper presents a continuous monitoring study at the fenceline of three urban wastewater
treatment plants. The study was based on two distinct elements. Firstly, continuous monitoring was carried out
at the plant fenceline using IOMS. Once a database with all IOMS data was obtained, odour classification and
quantification algorithms were developed via machine learning techniques, such as Artificial Neural Networks
(ANNs) and random forest, and used to set up a system capable of automatically recognizing both the odour
class and the odour concentration. Secondly, citizen science was used, by employing the data derived from an
app available to citizens: the app allowed citizens to enter the type and intensity of the smell they detected, and
each report was recorded with GPS location, date, time and weather data, allowing comprehensive data
mapping across space and time. We carried out a monitoring campaign over a period of five months and then
compared the data obtained from the algorithms with the citizens’ reports, studying the actual causes of the
nuisances and verifying whether they were related to the monitored plant. At first, we analysed the results
provided by the IOMS to identify the most frequent odour classes and the related odour concentrations:
different ranges of odour concentration were investigated to verify which sources were most influential in the
most intense episodes of nuisance. Then, we correlated this information with the weather data and citizens’
reports, to find out whether the reports were related to the plant. The description of the odours perceived by the
population, alongside the identification of the appropriate wind cone influencing the receptors from the plant,
allowed us to identify the events that could be attributed to the known sources. The results obtained from the
joint analysis of IOMS and citizens’ data were therefore useful for establishing to what extent the unpleasant
odours perceived by the citizens came from the monitored plant.

1. Introduction 

In the context of atmospheric pollution, odours are a relevant component and an important indicator of the
health of urban areas close to urban wastewater treatment plants (WWTPs) (Oliva et al., 2021).

Reports of bad smells in the vicinity of WWTPs make this a complex and socially sensitive context. Here, citizen
science for the identification of odour emissions (Brattoli et al., 2016; Lotesoriere et al., 2021; Hsu et al., 2017;
Zheng et al., 2017) and the electronic nose, or instrumental odour monitoring system (IOMS) (Karakaya et al.,
2020), are powerful tools to help competent authorities and environmental protection agencies define
appropriate strategies to identify, measure and reduce the impact of odours on receptors.

As regards the IOMS, recent years have seen significant developments in both hardware and software, also
thanks to the increasing use of algorithms that process the signals by extracting the most significant features
(Zarra et al., 2021) and of machine learning techniques (Men et al., 2018; Cangialosi et al., 2021; Choi et al.,
2022). The main goal of this study is to evaluate the potential of a monitoring system that combines the most
recent instrumental techniques with citizen science to assess the odour impact of three wastewater treatment
plants characterized by multiple emission sources.

2. Materials and methods 

2.1 Plants description 

The three urban wastewater treatment plants (WWTPs) considered in the study are located in the industrial area
of Monopoli (Bari, Italy), the city of Polignano a Mare (Bari, Italy) and the industrial area of Putignano (Bari,
Italy), respectively. Studies on the olfactory characterization of urban wastewater treatment plants (Naddeo et
al., 2016) have established that the most critical sections in terms of odour emissions are the pre-treatments,
primary sedimentation and sludge treatment. Based on these studies and on the evidence gathered in the field,
the odours of the plant were mapped: the first phase involved a complete characterization of the emission
sources, after which a sampling program for the most critical sources was defined. The sampling program,
which collected the samples for training and testing, was designed to also cover environmental variations
(temperature, relative humidity) and lasted several months.

2.2 IOMS training equipment and procedure 

The IOMS (MSEM32® by Sensigent, Baldwin Park, CA, USA and Labservice Analytica, Anzola dell’Emilia,
Italy) was positioned close to the fenceline of the Monopoli WWTP. Samples were collected in duplicate: one
sample was fed to the IOMS on the same day, and the replicate was analyzed by dynamic olfactometry (DO) at
the T&A Laboratory within 24 hours, using the LEO dynamic olfactometer (ARCO Solutions srl, Trieste, Italy) to
measure the odour concentration (Cod), expressed in European odour units per cubic metre (uoE/m3). A total
of 51 samples were collected, with odour concentrations ranging from 20 to 2435 uoE/m3. After a preliminary
characterization, the odour classes selected were Class 1 (pretreatments), Class 2 (sludge conditioning), Class
3 (biogas) and Class 0 (unknown source). The dataset used for training the machine learning (ML) algorithms
was obtained from the signals acquired by the IOMS, which has an array of 32 sensors.

2.3 Data pretreatment, algorithms for the classification and quantification of odours and App description

The response curves, which represent the variation of the sensor signals, were analyzed to extract the
characteristics of the signal. The classification of odours and the prediction of the odour concentration were
carried out using two machine learning algorithms: Random Forest (RF) and a multilayer neural network
(Multi-Layer Perceptron, MLP). All the data collected by the IOMS, representing sensor responses, were then
extracted for training. After data pre-treatment, several tests were performed to choose an appropriate subset of
the input variables, using the Recursive Feature Elimination with Cross Validation (RFECV) algorithm
(Demarchi et al., 2020), to obtain the set of the most significant sensors, which were subsequently used to build
both algorithms. The overall dataset with the selected features was then divided into a training set and a test
set with an 80:20 ratio, thus using 600 data points for training and 150 for performance evaluation. For the
classification, a confusion matrix was calculated for both the neural network and the Random Forest; the
accuracy, calculated both for each class and overall, and Cohen’s kappa coefficient were used as scoring
parameters. For the regression, the absolute differences between the measured and the predicted odour
concentrations were calculated for both the MLP and the RF.
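The scoring parameters named above can be illustrated with a minimal Python sketch. It is not the code used in the study: the labels are made up, and reading "accuracy for each class" as the share of each true class that is predicted correctly is an assumption, since the text does not define it.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count outcomes; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_accuracy(matrix):
    """Share of each true class predicted correctly (an assumed definition)."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(matrix)]


def cohen_kappa(matrix):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up labels for the four odour classes (0 = unknown, 1 = pretreatments,
# 2 = sludge conditioning, 3 = biogas). These are not data from the study.
labels = [0, 1, 2, 3]
y_true = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3]
y_pred = [0, 0, 1, 1, 2, 2, 2, 3, 3, 0]

matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)
print(overall_accuracy(matrix), per_class_accuracy(matrix), cohen_kappa(matrix))
```

Cohen's kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance from the class frequencies alone.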

Citizen reports for the Monopoli WWTP were collected with the app “Signal App-Odori”, developed by the
Municipality of Monopoli for both Android and iOS. Once logged in, users can report an odour nuisance when
they perceive it. They can indicate the level of odour annoyance (weak, easily detectable or very intense) and
the type of smell perceived, and can also enter a brief description of the odour to help classify it. The citizens of
Monopoli were informed of the app and the project through a press conference chaired by the mayor, who
illustrated the objectives and operating methods of the program. For the plants in Polignano a Mare and
Putignano, a Telegram bot (Odor-bot by Labservice Analytica) with the same features as the Signal App was
employed. For these two plants, the population was informed of the project through social media and online
presentations, as public events were prohibited during the COVID period. Both applications allowed citizens to
classify the odour nuisances as, among other options, “wastewater treatment” or “sludges”.

3. Results and discussion 

3.1 On-site training phase and testing for classification and regression 

The instrumental signals were processed through a feature selection procedure which identifies the most 

suitable variables to be used in subsequent classification and regression models. Once the models (MLP and 

RF) were selected, the classification accuracy rates for each class and the overall accuracy rate for the best 

models were calculated. The results for the training set showed an accuracy for each class of not less than 0.99 

for MLP and equal to 1 for RF. After analysing the results of the classification with the data of the test-set, 

consisting of 150 samples, it was found that only three elements were not correctly classified and both the MLP 

and the RF scored 0.98 on global accuracy and 0.97 on Cohen’s kappa coefficient. The root mean square error

(RMSE) is 130 uo/m3 for the MLP and 97 uo/m3 for the RF. The results of the training

and testing phase are discussed in detail elsewhere (Cangialosi et al., 2021). 
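As a worked illustration of the regression scores, the sketch below computes the RMSE and the mean absolute difference on made-up concentrations, and checks the test-set accuracy implied by the counts reported above (3 misclassified samples out of 150). The function names and the numbers in the lists are illustrative assumptions, not values from the study.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted concentrations."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_difference(measured, predicted):
    """Mean of the absolute differences between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Made-up odour concentrations in uo/m3, not values from the study.
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 430.0]
print(rmse(measured, predicted), mean_absolute_difference(measured, predicted))

# Accuracy implied by the counts in the text: 3 errors in 150 test samples.
test_accuracy = (150 - 3) / 150
print(round(test_accuracy, 2))
```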

3.2 Joint analysis of class-concentration data 

Once the IOMS training was completed, the data were collected on each site in the monitoring periods indicated 

in Table 1, in which the most representative statistical indices obtained from the univariate analysis of the 

concentration distribution are also shown. Figure 1 shows the cumulative distributions of odour concentrations

for all the WWTPs.

Table 1: Univariate analysis of the data from the three plants.

                     Monopoli WWTP            Polignano WWTP           Putignano WWTP
Monitoring period    10/02/2021-11/05/2021    01/07/2021-05/10/2021    01/10/2021-10/01/2022
Number of data       258,059                  258,409                  286,231
Median               109 uo/m3                7 uo/m3                  35 uo/m3
95th percentile      382 uo/m3                140 uo/m3                224 uo/m3

 
Figure 1: Cumulative distribution of odor concentrations. 
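The univariate indices of Table 1 and the cumulative distribution of Figure 1 can be computed along the following lines. This is a sketch under stated assumptions: the percentile uses linear interpolation between order statistics, which the text does not specify, and the concentrations are invented.

```python
import math
from statistics import median


def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100
    lower, upper = math.floor(rank), math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def empirical_cdf(values):
    """Return (value, cumulative share) pairs, ready to be plotted as a step curve."""
    ordered = sorted(values)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


# Invented odour concentrations in uo/m3, not data from the study.
concentrations = [5, 7, 20, 35, 60, 110, 140, 220, 380, 900]

median_value = median(concentrations)
p95 = percentile(concentrations, 95)
cdf = empirical_cdf(concentrations)
print(median_value, p95, cdf[0], cdf[-1])
```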

 
Odour concentrations for the Polignano a Mare WWTP are very low, thus odour class analysis was not 

meaningful. As regards the other two plants, having the data on concentration and odour classes allowed us to 

jointly examine the data: it was decided to divide all the data from the IOMS into concentration classes with 

respect to the odour concentration. The lower bound of the first class was set to 100 uo/m3 as below 100 uo/m3 

the results of the classification may not be relevant; the width of each class was chosen to be 100 uo/m3. 
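The division into concentration classes described above can be sketched as follows. The handling of values that fall exactly on a boundary and the open-ended top band (the `top_bound` parameter) are assumptions made for illustration, and the records are invented.

```python
from collections import Counter, defaultdict

BIN_WIDTH = 100          # width of each concentration class, uo/m3
FIRST_LOWER_BOUND = 100  # data below this value are not analysed by class


def concentration_band(concentration, top_bound=1000):
    """Lower bound of the band containing the value, or None if below 100 uo/m3.

    Values at or above top_bound share one open-ended band (an assumption).
    """
    if concentration < FIRST_LOWER_BOUND:
        return None
    if concentration >= top_bound:
        return top_bound
    return int(concentration // BIN_WIDTH) * BIN_WIDTH


def class_shares_by_band(records, top_bound=1000):
    """Percentage of each odour class within every concentration band."""
    counts = defaultdict(Counter)
    for concentration, odour_class in records:
        band = concentration_band(concentration, top_bound)
        if band is not None:
            counts[band][odour_class] += 1
    shares = {}
    for band, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[band] = {cls: 100 * n / total for cls, n in sorted(counter.items())}
    return shares


# Invented (concentration in uo/m3, odour class) pairs, not data from the study.
records = [(50, 1), (120, 1), (150, 0), (180, 1), (199, 1),
           (250, 2), (250, 2), (1200, 0), (1500, 0), (1100, 1)]

shares = class_shares_by_band(records)
print(shares)
```

With a different `top_bound`, for example 400, the same function yields the coarser banding used when few data exceed that value.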

For the Monopoli plant, as can be seen from Figure 2a, the higher the concentration value, the more relevant

the contribution of class 0 (other or unknown) is: the lowest contribution of Class 0 (15%) is in the 100-200 uo/m3 

range and it reaches 69% in the range with values above 1000 uo/m3.  

On the other hand, for the Putignano plant (Figure 2b), class 1 (pretreatment) is dominant (98%) throughout the 

concentration classes, except for the class with odour concentration above 400 uo/m3, where class 2 (sludge 

conditioning) is detected with a frequency of 20%. 


Figure 2: Percentages of odour classes detected among the different odour intensity bands in the Monopoli 

plant (a) and the Putignano plant (b) 

Since the highest concentrations are likely to be more responsible for odour nuisance, it is important to analyse
the events reported by citizens and to verify, using the IOMS data, how many of them may be related to an
internal source or to an external, unknown source (Class 0), as discussed in the following section.

 
3.3 Selection of citizen reports and joint analysis with IOMS data 

 
In the monitoring period (February 2021-January 2022), a total of 298 citizen reports were collected, 268 from
Monopoli and 30 from Putignano. No reports were received from the municipality of Polignano a Mare,
confirming that the plant odour emissions were not of concern.

A two-step selection of citizen reports was adopted. Firstly, only the reports for which the wind direction at the
time of reporting was aligned with the plant-receptor direction were considered, plus the reports made during
wind calms. Secondly, among these, we selected the reports for which a description of the type of odour was
also available and referred to “wastewater treatment” or “sludge”. The citizens’ reports that can be ascribed to
the plants by wind direction and type of odour are 11 out of 268 for Monopoli and 18 out of 30 for Putignano.
The remaining reports were not considered because they did not match either the wind direction or the type of
odour, and therefore the related odour nuisance could not have come from the plant. In the municipality of
Monopoli there are several other possible sources (another WWTP next to the monitored plant, a waste storage
plant and a power plant powered by biofuel). The selected reports are shown in Figure 3.
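The two-step selection can be sketched in Python as below. Several details are assumptions, because the text does not give them: the half-width of the wind cone (22.5 degrees), the wind speed below which a calm is declared (0.5 m/s), the report fields, the "smoke" category and all coordinates are hypothetical.

```python
import math

PLANT_ODOUR_TYPES = {"wastewater treatment", "sludges"}


def bearing_deg(from_point, to_point):
    """Initial great-circle bearing, in degrees clockwise from north."""
    lat1, lon1 = (math.radians(v) for v in from_point)
    lat2, lon2 = (math.radians(v) for v in to_point)
    dlon = lon2 - lon1
    east = math.sin(dlon) * math.cos(lat2)
    north = (math.cos(lat1) * math.sin(lat2)
             - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360


def angular_difference(a, b):
    """Smallest absolute difference between two directions, in degrees."""
    return abs((a - b + 180) % 360 - 180)


def is_downwind(plant, receptor, wind_from_deg, wind_speed,
                half_cone_deg=22.5, calm_speed=0.5):
    """True for calms, or when the wind blows from the plant towards the receptor."""
    if wind_speed < calm_speed:
        return True
    # Meteorological wind direction is where the wind comes FROM, so it is
    # compared with the direction of the plant as seen from the receptor.
    plant_direction = bearing_deg(receptor, plant)
    return angular_difference(wind_from_deg, plant_direction) <= half_cone_deg


def select_reports(reports, plant):
    """Step 1: keep downwind or calm reports. Step 2: keep plant-like odour types."""
    aligned = [r for r in reports
               if is_downwind(plant, r["position"], r["wind_from_deg"], r["wind_speed"])]
    return [r for r in aligned if r["odour_type"] in PLANT_ODOUR_TYPES]


# Invented coordinates and reports, for illustration only.
plant = (40.950, 17.300)
north_of_plant = (40.960, 17.300)
reports = [
    {"position": north_of_plant, "wind_from_deg": 180, "wind_speed": 3.0, "odour_type": "sludges"},
    {"position": north_of_plant, "wind_from_deg": 0, "wind_speed": 3.0, "odour_type": "sludges"},
    {"position": north_of_plant, "wind_from_deg": 0, "wind_speed": 0.2, "odour_type": "wastewater treatment"},
    {"position": north_of_plant, "wind_from_deg": 175, "wind_speed": 2.0, "odour_type": "smoke"},
]

selected = select_reports(reports, plant)
print(len(selected))
```

In this toy case the first report is kept because the wind blows from the plant towards the receptor, the third because of the calm, while the second fails the wind test and the fourth fails the odour-type test.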

 

Figure 3: Reports in the period of interest, selected by wind direction and type of odour, in Monopoli (a) and
Putignano (b). The WWTPs are indicated in black.

 
The selected reports were then correlated with the IOMS data. Since each report refers to a specific time, we
considered, for each report, all the IOMS data within a one-hour time window centered on the instant of the
report. The wider window takes into account two aspects of human reporting: firstly, a report may be delayed
with respect to the moment of perception; secondly, a report may be made at the initial perception of an odour
whose intensity subsequently increases.
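A minimal sketch of the one-hour window follows: the IOMS records kept for a report are those within 30 minutes before or after it. The record layout and the inclusion of the window boundaries are assumptions, and the timestamps are invented.

```python
from datetime import datetime, timedelta

HALF_WINDOW = timedelta(minutes=30)  # one-hour window centered on the report


def records_in_window(report_time, records, half_window=HALF_WINDOW):
    """IOMS records whose timestamp lies within report_time +/- half_window."""
    return [rec for rec in records if abs(rec["time"] - report_time) <= half_window]


# Invented IOMS records (time, concentration in uo/m3, odour class).
records = [
    {"time": datetime(2021, 3, 1, 9, 20), "concentration": 80, "odour_class": 1},
    {"time": datetime(2021, 3, 1, 9, 30), "concentration": 150, "odour_class": 1},
    {"time": datetime(2021, 3, 1, 9, 45), "concentration": 310, "odour_class": 2},
    {"time": datetime(2021, 3, 1, 10, 15), "concentration": 520, "odour_class": 0},
    {"time": datetime(2021, 3, 1, 10, 30), "concentration": 260, "odour_class": 2},
    {"time": datetime(2021, 3, 1, 10, 31), "concentration": 90, "odour_class": 1},
]
report_time = datetime(2021, 3, 1, 10, 0)

window = records_in_window(report_time, records)
print([rec["time"].strftime("%H:%M") for rec in window])
```

For datasets of the size reported in Table 1, sorting the records once and locating the window limits with the standard `bisect` module would avoid scanning all records for every report.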

Figure 4 shows the daily temporal distribution of the reports, compared with the daily temporal distribution of the 

events recorded by the IOMS with a concentration greater than 500 uo/m3 for the city of Monopoli, and greater 

than 300 uo/m3 for the city of Putignano, both normalized with respect to the maximum number of events. 

As can be seen, for both WWTPs there are some periods of strong correlation between the number of reports

and the IOMS data with high concentration. 
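The normalized daily distributions compared in Figure 4 can be obtained as sketched below. It is assumed here that each series is divided by its own daily maximum; the timestamps and concentrations are invented, and the 500 uo/m3 threshold is the one quoted for Monopoli.

```python
from collections import Counter
from datetime import date, datetime


def normalised_daily_counts(timestamps):
    """Number of events per day divided by the maximum daily count of the series."""
    counts = Counter(ts.date() for ts in timestamps)
    if not counts:
        return {}
    peak = max(counts.values())
    return {day: n / peak for day, n in sorted(counts.items())}


def high_concentration_times(ioms_records, threshold):
    """Timestamps of IOMS records with a concentration above the threshold."""
    return [ts for ts, concentration in ioms_records if concentration > threshold]


# Invented report times and IOMS records (time, concentration in uo/m3).
report_times = [datetime(2021, 3, 1, 9), datetime(2021, 3, 1, 18), datetime(2021, 3, 2, 7)]
ioms_records = [
    (datetime(2021, 3, 1, 9, 0), 620),
    (datetime(2021, 3, 1, 9, 1), 480),
    (datetime(2021, 3, 2, 7, 0), 510),
    (datetime(2021, 3, 2, 7, 1), 700),
]

daily_reports = normalised_daily_counts(report_times)
daily_events = normalised_daily_counts(high_concentration_times(ioms_records, 500))
print(daily_reports)
print(daily_events)
```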

 

Figure 4: Comparison between the temporal distribution of all the reports, the selected reports and IOMS data

with high concentrations in Monopoli (a) and in Putignano (b) 

 
For the Putignano plant, we verified whether there was a significant difference in odour concentrations between
the IOMS data aligned with the citizen reports and the remaining data. As shown in Figure 5a, 60% of the data
have a low concentration (<100 uo/m3) when no reports were recorded; on the other hand, 70% of the data
have a higher concentration (>100 uo/m3) during the time windows of citizen reports, thus confirming the
consistency between IOMS detection at the WWTP fenceline and the nuisance reported by citizens. For the
Monopoli WWTP, where several odour classes were detected (see Figure 2a), the IOMS data directly
correlated with the reports were then analyzed with reference to the odour classes.

As shown in Figure 5b, the share attributed to class 0, i.e. unknown, was 40%, while 60% of the time in which
there were reports from citizens is directly attributable to the two main plant odour sources: 27.24% to
pretreatments (class 1) and 32.51% to sludge conditioning (class 2).
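Both comparisons reduce to simple shares, which can be sketched as follows: the percentage of IOMS data above 100 uo/m3 inside and outside the report windows, and the percentage of each odour class inside the windows. The values below are toy numbers chosen for illustration and do not reproduce the results of the study.

```python
from collections import Counter


def share_above(concentrations, threshold=100):
    """Percentage of values above the threshold (uo/m3)."""
    if not concentrations:
        raise ValueError("concentrations must not be empty")
    return 100 * sum(c > threshold for c in concentrations) / len(concentrations)


def class_percentages(classes):
    """Percentage of each odour class among the given IOMS records."""
    counts = Counter(classes)
    total = sum(counts.values())
    return {cls: 100 * n / total for cls, n in sorted(counts.items())}


# Toy values, not data from the study.
inside_report_windows = [150, 220, 90, 310]
outside_report_windows = [20, 40, 150, 60]
classes_inside_windows = [0, 1, 1, 2, 2, 2, 0, 1]

print(share_above(inside_report_windows), share_above(outside_report_windows))
print(class_percentages(classes_inside_windows))
```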

 

Figure 5: Correlation between odour concentrations and selected citizen reports for Putignano (a) and
percentages of odour classes detected during selected reports for Monopoli (b)

 
Therefore, for the Monopoli plant, the IOMS data and the selected citizens’ reports clearly show that a
significant part of the reported nuisances (40%) may derive not from specific sources within the WWTP but from
other nearby plants whose emissions were classified by citizens as “wastewater treatment” or “sludges”.

4. Conclusions 

The present work describes the integration of citizen science tools with IOMS equipped with artificial
intelligence algorithms for monitoring odour emissions from civil wastewater treatment plants.

The study investigated whether the integration of IOMS data and citizens’ reports can ascertain whether odour
nuisance is directly related to a specific WWTP or to other unknown odour sources, and can identify the critical
sources within a WWTP. In particular, the analysis of the field data, recorded after careful training of the IOMS
for both odour concentrations and classes, made it possible to identify the classes of odours responsible for the
highest concentration values during the various months of monitoring. The reports, an average of more than 20
per month over the 11 months of monitoring, were analyzed by wind direction at the time of the report and by the
description of the type of odour, to highlight those related to wastewater treatment plants. For the Polignano a
Mare WWTP, the odour classes were not analyzed, as the odour concentrations were very low and the absence
of citizen reports confirmed that odour emissions were not of concern. For the Putignano WWTP, the analysis of
the IOMS data during the hours in which the reports were made allowed us to quantify the contribution of the
emission sources of the plant, identifying the pretreatments as the most relevant source for low-medium
concentrations, whereas sludge treatment gives a non-negligible contribution (20%) at high concentrations. It
was also verified that all the perceived nuisances classified by citizens as “wastewater treatment” or “sludges”
actually derived from the plant.

On the other hand, for the Monopoli WWTP, the joint analysis of odour concentrations, odour classes and citizen
reports showed that 60% of the odour nuisances are directly related to the plant (27.24% pretreatments and
32.51% sludge conditioning), while 40% of them are not attributable to any source within the plant and may be
related to other similar sources, such as a WWTP located nearby.

The combined use of the instrumental approach and of the citizens’ reports collected via the App has proven to
be useful and effective, especially in the presence of multiple odour emission sources.

References 

Brattoli, M.; Mazzone, A.; Giua, R.; Assennato, G.; de Gennaro, G., 2016, Automated Collection of Real-Time Alerts of Citizens as a Useful Tool to Continuously Monitor Malodorous Emissions. Int. J. Env. Res. Pub. Health, 13, 263, doi:10.3390/ijerph13030263.

Cangialosi, F.; Bruno, E.; De Santis, G., 2021, Application of Machine Learning for Fenceline Monitoring of Odor Classes and Concentrations at a Wastewater Treatment Plant. Sensors, 21, 4716, https://doi.org/10.3390/s21144716.

Demarchi, L.; Kania, A.; Ciężkowski, W.; Piórkowski, H.; Oświecimska-Piasko, Z.; Chormański, J., 2020, Recursive Feature Elimination and Random Forest Classification of Natura 2000 Grasslands in Lowland River Valleys of Poland Based on Airborne Hyperspectral and LiDAR Data Fusion. Remote Sens., 12, 1842, doi:10.3390/rs12111842.

Shepherd, G.M., 2004, The human sense of smell: Are we better than we think? PLoS Biol., 2, 5, e146.

Karakaya, D.; Ulucan, O.; Turkan, M., 2020, Electronic Nose and Its Applications: A Survey. Int. J. Aut. Comp., 17, 179–209, doi:10.1007/s11633-019-1212-9.

Lotesoriere, B.; Giacomello, A.; Bax, C.; Capelli, L., 2021, The Italian Pilot Study of the D-NOSES Project: An Integrated Approach Involving Citizen Science and Olfactometry to Identify Odour Sources in the Area of Castellanza (VA). Chem. Eng. Trans., 85, 145–150.

Men, H.; Fu, S.; Yang, J.; Cheng, M.; Shi, Y.; Liu, J., 2018, Comparison of SVM, RF and ELM on an Electronic Nose for the Intelligent Evaluation of Paraffin Samples. Sensors, 18, 285, doi:10.3390/s18010285.

Naddeo, V.; Zarra, T.; Oliva, G.; Kubo, A.; Ukida, N.; Higuchi, T., 2016, Odour measurement in wastewater treatment plant by a new prototype of e.Nose: Correlation and comparison study with reference to both European and Japanese approaches. Chem. Eng. Trans., 54, 85–90.

Oliva, G.; Zarra, T.; Massimo, R.; Senatore, V.; Buonerba, A.; Belgiorno, V.; Naddeo, V., 2021, Optimization of Classification Prediction Performances of an Instrumental Odour Monitoring System by Using Temperature Correction Approach. Chemosensors, 9, 147, doi:10.3390/chemosensors9060147.

Choi, Y.; Kim, K.; Kim, S.; Kim, D., 2022, Identification of odor emission sources in urban areas using machine learning-based classification models. Atmospheric Environment: X, 13, 100156, ISSN 2590-1621, https://doi.org/10.1016/j.aeaoa.2022.100156.

Hsu, Y.; Dille, P.; Cross, J.; Dias, B.; Sargent, R.; Nourbakhsh, I., 2017, Community-Empowered Air Quality Monitoring System. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, Association for Computing Machinery, New York, NY, USA, 1607–1619, https://doi.org/10.1145/3025453.3025853.

Zarra, T.; Galang, M.G.K.; Ballesteros, F.C. Jr.; Belgiorno, V.; Naddeo, V., 2021, Instrumental Odour Monitoring System Classification Performance Optimization by Analysis of Different Pattern-Recognition and Feature Extraction Techniques. Sensors, 21, 114, doi:10.3390/s21010114.

Zheng, H.; Hong, Y.; Long, D.; Jing, H., 2017, Monitoring surface water quality using social media in the context of citizen science. Hydrol. Earth Syst. Sci., 21, 949–961, https://doi.org/10.5194/hess-21-949-2017.
