Fire SM: new dataset for anomaly detection of fire in video surveillance

ACTA IMEKO
ISSN: 2221-870X
March 2022, Volume 11, Number 1, 1 - 6

ACTA IMEKO | www.imeko.org March 2022 | Volume 11 | Number 1 | 1

Shital Mali1, Uday Khot1

1 Department of Electronics and Telecommunication, St. Francis Institute of Technology, Mumbai University, Mumbai, India

Section: RESEARCH PAPER

Keywords: Anomalous; convolutional neural network; dataset; fire; smoke

Citation: Shital Mali, Uday Khot, Fire SM: new dataset for anomaly detection of fire in video surveillance, Acta IMEKO, vol. 11, no. 1, article 25, March 2022, identifier: IMEKO-ACTA-11 (2022)-01-25

Section Editor: Md Zia Ur Rahman, Koneru Lakshmaiah Education Foundation, Guntur, India

Received November 29, 2021; In final form March 6, 2022; Published March 2022

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Corresponding author: Shital Mali, e-mail: shital.mali@rait.ac.in

1. INTRODUCTION

Surveillance cameras are widespread, and it is not feasible to have people actively monitoring all of them. Nearly all surveillance footage is unimportant; only rare segments of video are of real concern. The key motivation for video and image anomaly detection is therefore to locate irregular regions of video or images automatically and mark them for human inspection. Recent research on video/image anomaly detection has been characterised by two aspects. The training videos are recorded during normal (safe) events, and anomalous events are then identified by examining the video under test.
To define what is usual for a specific scene, training footage of normal behaviour is required. An anomalous event is then a localized section of video that differs substantially from anything that happens in the training video. It is more difficult to choose which distinguishing attributes matter for the application of interest. Such differences can have many causes, most commonly the conspicuous presence of unusual objects in the video. Most researchers have reported on anomaly detection after experimentation [1], [2], [3], [4], and some have published findings obtained with different techniques [5], [6], [7]. Few studies have discussed ordinary anomaly videos that come from only one or two similar scenes. One attribute that can help identification is the spatial location within a video frame. In some cases, a detection algorithm may flag as anomalous a scene that would not be anomalous in another instance. Such short-lived cases need care in research of this kind. An issue that is qualitatively distinct in one scene may cause difficulty when multiple scenes are superimposed. Identifying and analysing a single instance requires features to be handled very specifically in a surveillance system, and this study focuses on that aspect. Measurement technology therefore plays a vital role in anomaly detection and surveillance applications. The formulation describes differences that ultimately act spatially. The detection of anomalous activity in video/image is directly related to the performance and accuracy of the detection algorithm, and there is always scope for improving such algorithms.

ABSTRACT

Tiny datasets covering a restricted range of activities, as well as flawed assessment criteria, are currently stifling progress in video anomaly detection research.
This paper aims to assist progress on this research topic by introducing a large and diverse new dataset known as Fire SM. Further, additional information can be derived from a precise estimation in early fire detection using an indicator, Average Precision. In addition to the proposed dataset, the investigations under anomaly situations are supported by results. In this paper, different anomaly detection methods that offer an efficient way to detect fire incidents are compared with two existing popular techniques. The findings were analysed using Average Precision (AP) as a performance measure. It indicates about 78 % accuracy on the proposed dataset, compared to 71 % and 61 % on the Foggia dataset, for the InceptionNet and FireNet algorithms, respectively. The proposed dataset can be useful in a variety of cases. Findings show that its crucial advantage is its diversity.

A number of challenges arise when dealing with anomaly detection in fire-related datasets. The shortcomings include the lack of datasets devoted solely to fire anomaly instances, the low resolution of existing datasets, and variations among anomalies. In a few more cases, uncertainty, inconsistencies, and loss of quality have been identified. The main focus of this paper is the detection of anomalies and the analysis of the results obtained on the experimental dataset with the help of a few assessment indices. The introduction of a new dataset of early fire and smoke (refer to Table 1) would be helpful in many applications. Maintaining diversity in the dataset is a key point, since it tests anomaly detection in different and more complex ways.

2. EXISTING DATASET

Fires are man-made hazards that harm people and cause social and economic damage. Early fire alarms and automatic detection are important and useful to emergency recovery services in minimizing these losses.
Existing fire alarm devices have been shown to be unreliable in numerous real-world situations. The main disadvantage of the sensor-based framework is that the sensor must be situated close to a fire or heat source. This makes such devices impractical in a variety of frequently occurring scenarios, such as the long-distance fire occurrences seen in Figure 1. Because of this, the traditional approach has failed to prevent a number of fire deaths. Such solutions usually require a substantial amount of fire or heat to trigger the alarm. In addition, the fire or smoke regions are not precisely located. Owing to these shortcomings, researchers have investigated computer-vision approaches as alternatives for improving fire and smoke detection. Existing vision-based approaches focus solely on colour-space transformations for fire area detection [1], [2]. Rule-based methodology combined with colour space is promising for delivering improved results. However, such methods are also vulnerable to other lit items such as streetlights. Other methods add features such as location, boundary and motion cues to the colour-based decision-making algorithm [3], [4]. Classifiers such as the Bayes Classifier, Dual Optical Flow and Multi-Expert Scheme have been used to minimize false detection or misclassification. However, these strategies are error-prone and fail in many complex real-world scenarios, as seen in Figure 2. Fire detection remains a challenging task because of the complexity of the phenomenon: fire has no definite form or area of incidence and shows complex temporal behaviour, which makes feature extraction difficult. A hand-crafted collection of features requires a considerable amount of domain knowledge. Table 2 lists details of the existing dataset. The Foggia video dataset [8] and the Chino dataset [9] are the two basic datasets.
The first dataset consists of 31 indoor and outdoor videos, of which seventeen are not fire-related and fourteen contain fire. As a result, colour-based methods are incapable of distinguishing genuine fire from scenes with red-shaded parts. Additionally, motion-based strategies may misclassify a mountain scene containing smoke, fog, or haze. These videos make the dataset more challenging, allowing us to stress our architecture and assess its performance in different real-world settings.

Table 1. Fire Instances in Fire SM Dataset.

| No. | Anomaly Class | Instances |
|---|---|---|
| 1 | Outside Offices | 78 |
| 2 | Outside Apartment | 88 |
| 3 | In Bushes | 26 |
| 4 | Outside Light | 15 |
| 5 | Street Light | 13 |
| 6 | Decorative Lighting | 11 |
| 7 | Bon-fire | 9 |
| 8 | Cooking Gas | 25 |

Table 2. Existing Datasets.

| Video | Size | Frame rate | No. of Frames | Fire | Remarks or Observations |
|---|---|---|---|---|---|
| Fire1 | 320x240 | 15 | 705 | yes | See [10] |
| Fire2 | 320x240 | 29 | 116 | yes | Refer [10] and [11] |
| Fire3 | 400x256 | 15 | 255 | yes | |
| Fire4 | 400x256 | 15 | 240 | yes | |
| Fire5 | 400x256 | 15 | 195 | yes | |
| Fire6 | 320x240 | 10 | 1200 | yes | |
| Fire7 | 400x256 | 15 | 195 | yes | |
| Fire8 | 400x256 | 15 | 240 | yes | |
| Fire9 | 400x256 | 15 | 240 | yes | |
| Fire10 | 400x256 | 15 | 210 | yes | |
| Fire11 | 400x256 | 15 | 210 | yes | |
| Fire12 | 400x256 | 15 | 210 | yes | |
| Fire13 | 320x240 | 25 | 1650 | yes | |
| Fire14 | 320x240 | 15 | 5535 | yes | Foggia et al. [8] |
| Fire15 | 320x240 | 15 | 240 | no | Refer [11] and [10] |
| Fire16 | 320x240 | 10 | 900 | no | |
| Fire17 | 320x240 | 25 | 1725 | no | |
| Fire18 | 352x288 | 10 | 600 | no | |
| Fire19 | 320x240 | 10 | 630 | no | |
| Fire20 | 320x240 | 9 | 5958 | no | |
| Fire21 | 720x480 | 10 | 80 | no | |
| Fire22 | 480x272 | 25 | 22500 | no | Foggia et al. [8] |
| Fire23 | 720x576 | 7 | 6097 | no | Refer [11] and [10] |
| Fire24 | 320x240 | 10 | 342 | no | |
| Fire25 | 352x288 | 10 | 140 | no | |
| Fire26 | 720x576 | 7 | 847 | no | |
| Fire27 | 320x240 | 10 | 1400 | no | |
| Fire28 | 352x288 | 25 | 6025 | no | |
| Fire29 | 720x576 | 10 | 600 | no | |
| Fire30 | 800x600 | 15 | 1920 | no | Foggia et al. [8] |
| Fire31 | 800x600 | 15 | 1485 | no | |

Figure 1. Test Images of Training Data.

Figure 2. Sample confusing images which look like fire or smoke.
Another issue that arises during data processing is the distinction between fire and non-fire. For example, the Fire2 [10] video shows only a very small fire at a great distance. On the other hand, the Fire13 [10] video shows no fire apart from a very small region. Red objects and backgrounds such as a billboard (Fire14) and reddish grass (Fire6) appear in numerous frames, making the dataset hard to interpret. The second dataset is relatively small but very challenging. It contains a total of 226 images, 119 of which contain fire, while the other 107 are fire-like pictures including nightfall scenes, fire-like stars, daylight coming through windows and so on.

An enormous amount of data is required to train Convolutional Neural Networks (CNNs). However, the current image/video fire collections are insufficient to meet this demand. Table 3 lists some small-scale fire image/video repositories. The data collection includes 13,400 fire images in all, taken both outdoors and indoors. There are 9695 "fire" and 7442 "smoke" instances in the data collection. In addition, the dataset includes 15,780 images without flames. These data were acquired from 16 separate user environments and include 49,614 distorted images. Each picture usually involves some distortion, for example from surrounding noise or weather conditions. For this investigation, half of the pictures in the collection are used as the training/validation set and the remaining half as the test set.

3. EXPERIMENTAL DETAILS

Experiments were carried out by applying a deep neural network technique to the proposed dataset. The system used an NVidia RTX 2080 GPU with 10 GB of on-board memory, an Intel Core i5 CPU and 64 GB of RAM, running Ubuntu 16.04. The analyses used 68,457 pictures acquired from well-known fire datasets. This includes Foggia et al.
[8] with 62,690 pictures. The training and testing phases followed the experimental protocol in which 20 % and 80 % of the data were used for training and testing, respectively. The technique was applied with a trained, modified EfficientDet [13]. In the modified EfficientDet algorithm, the Hardswish activation function has been replaced by Leaky ReLU. A training set of 2717 pictures was generated using 2529 fire pictures and 190 non-fire pictures. The network was designed, however, with only two classes, i.e. fire and non-fire.

Datasets are one of the essential components for evaluating the output of any given system. Evaluating an algorithm against a standard dataset is one of the most difficult activities. In the proposed dataset, all photographs are original and taken by real people. This dataset is therefore the most demanding and diversified dataset ever produced. This hand-crafted research dataset was designed to assess the generalization of a trained model. It contains an average of 2 boxes per picture, of varying size and aspect ratio. Activation mapping was used to obtain approximate bounding boxes. Binary cross-entropy was used as the loss function during the study of this dataset. In addition, the optimizer was RMSProp with an initial learning rate of 0.001, and training ran for 300 epochs. The following sections present details of the results obtained using different fire datasets and their comparison with state-of-the-art fire detection approaches.

4. FIRE SM DATASET DESCRIPTION AND RESULTS

The proposed Fire SM dataset was checked for the density of actual fire locations within an image. The dataset contains images in which the fire is located not only at the centre of the image but also at the corners, top and bottom. The density of fire locations in an image is shown in Figure 3.
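A location density of the kind summarised in Figure 3 could be derived from bounding-box annotations along the following lines. This is a minimal sketch and not the authors' procedure: the 3x3 grid, the pixel box format `(x1, y1, x2, y2)`, and all function names are assumptions made for illustration.

```python
from collections import Counter

def relative_centre(box, width, height):
    """Return the box centre in relative coordinates (0..1 on each axis)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2 / width, (y1 + y2) / 2 / height

def grid_cell(cx, cy, n=3):
    """Map a relative centre to a (row, col) cell of an n x n grid."""
    col = min(int(cx * n), n - 1)  # clamp so that cx == 1.0 stays in the grid
    row = min(int(cy * n), n - 1)
    return row, col

def position_label(row, col, n=3):
    """Label a cell as 'middle', 'corner' or 'other' (assumes odd n)."""
    mid = n // 2
    if (row, col) == (mid, mid):
        return "middle"
    if row in (0, n - 1) and col in (0, n - 1):
        return "corner"
    return "other"

def location_histogram(annotations, n=3):
    """Count fire boxes per position label.

    annotations: iterable of (box, (image_width, image_height)) pairs.
    """
    counts = Counter()
    for box, (width, height) in annotations:
        cx, cy = relative_centre(box, width, height)
        counts[position_label(*grid_cell(cx, cy, n), n)] += 1
    return counts
```

For instance, `location_histogram([((140, 100, 180, 140), (320, 240))])` counts one box in the "middle" cell, since its centre lies at (0.5, 0.5) in relative coordinates. A finer grid would give a smoother density map at the cost of fewer samples per cell.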
Figure 3 shows the fire location in a relative coordinate plane. Red signifies fire at the middle, orange and yellow signify fire at a corner, while sky blue and dark blue signify fire at neither the middle nor a corner. The figure supports the diversified distribution of fire in the dataset proposed for anomaly fire detection.

Figure 3. Fire location in images with distribution in relative coordinate in Fire SM dataset (Red: at middle; Orange, Yellow: at corner; Sky Blue, Dark Blue: not at middle and corner).

Table 3. Number of fire image/video datasets present [12].

| Institution | Format | Object | Website |
|---|---|---|---|
| Bilkent University | Video | Fire, smoke, disturbance | http://signal.ee.bilkent.edu.tr/visitfire/index.html |
| CVPR Lab, at Keimyung University | Video | Fire, smoke, disturbance | https://cvpr.kmu.ac.kr |
| UMRCNRS 6134 SPE, Corsica University | Dataset | Fire | http://cfdb.univ-crse.fr/index.php?menu=1 |
| Faculty of Electrical Engineering, Split University | Image, video | Smoke | http://wildfire.fesb.hr/ |
| Institute of Microelectronics, Seville, Spain | Image, video | Smoke | https://www2.imse-cnm.csic.es/vmote/english_version/ |
| National Fire Research Laboratory, NIST | Video | Fire | https://www.nist.gov/topics/fire |
| State Key Laboratory of Fire Science, University of Science and Technology of China | Image, video | Smoke | http://smoke.ustc.edu.en/datasets.htm |

In reality, with a phenomenon that appears over several frames, it is necessary to detect the irregularity in at least a portion of the frames. However, confirming the area in every frame of the track is typically needless.
This is especially true where there is uncertainty about when the phenomenon starts and finishes, as well as when the anomalous activity is heavily occluded for a few frames. The indices below are measures of classification quality.

4.1. Feature Indices

4.1.1. Localized Detection Index/Rate

The Localized Detection Rate (LDR) is defined as

LDR = (number of true regions detected) / (total number of regions).

A true region in an image is detected if the intersection of the true area and the recognized local portion is greater than or equal to β, as shown in Figure 4.

4.1.2. Region based Detection Rate

The Region based Detection Rate (RBDR) is defined as

RBDR = (number of positive images detected) / (total number of regions).

β ranges between 0 and 1; by default, β = 0.1.

The Negative Region Rate (NRR) is defined as

NRR = (total non-positive regions) / (total frames or images).

The average detection rate for the negative region rate, NRR, ranges from 0 to 1. As with any detection criterion, there is a trade-off between the detection rate (true positive rate) and the false positive rate. This can be captured in the ROC curve computed by varying the anomaly-score threshold that determines which regions are regarded as anomalous. Figure 5 and Figure 6 show characteristic curves for the InceptionNet method on the Foggia and Fire SM datasets. The shape of the curves indicates values favourable to the proposed Fire SM dataset compared with Foggia. Khan et al. [14] applied the InceptionNet method to a fire instance dataset that was less diversified than the proposed Fire SM dataset. The approaches of Khan et al. [14] and FireNet [15] focused on classification, with a leave-one-out strategy used at each level. In comparison with these algorithms, the updated EfficientDet [16]-[19] relies more on the degree of detection. This paper considers the Average Precision (AP) indicator for quantitative analysis.
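The region-level indices defined in Section 4.1 can be illustrated with a short sketch. It rests on several assumptions that the text does not state: the overlap between a true region and a detected region is measured here as intersection over union, the threshold is written `beta` with the default of 0.1, a "non-positive region" is taken to mean a detected region that matches no true region, and all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Degenerate (empty) boxes have zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0

def localized_detection_rate(frames, beta=0.1):
    """LDR: detected true regions divided by all true regions.

    frames: list of (true_boxes, detected_boxes) pairs, one pair per frame.
    """
    total = sum(len(truths) for truths, _ in frames)
    detected = sum(
        any(iou(truth, det) >= beta for det in dets)
        for truths, dets in frames
        for truth in truths
    )
    return detected / total if total else 0.0

def negative_region_rate(frames, beta=0.1):
    """NRR: detected regions matching no true region, per frame."""
    unmatched = sum(
        not any(iou(truth, det) >= beta for truth in truths)
        for truths, dets in frames
        for det in dets
    )
    return unmatched / len(frames) if frames else 0.0
```

Sweeping the anomaly-score threshold that decides which detections are kept, and recomputing both values at each setting, would trace out the kind of trade-off curve described above.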
Results were collected and are shown in Table 4 for the proposed dataset relative to the Foggia dataset. Activation mapping was used to obtain an estimated bounding box. Table 4 shows that the updated EfficientDet performs better than the other algorithms. Both Average Precision (AP) at 50 and AP at 75 were compared. On the proposed Early Fire dataset, EfficientDet achieved approximately 73 % and 71 %, compared with approximately 53 % and 51 % for InceptionNet and approximately 68 % and 58 % for FireNet. On the Foggia dataset, the results were approximately 82 % and 78 %, compared with about 65 % and 61 % for InceptionNet and around 73 % and 71 % for FireNet, for AP@50 and AP@75, respectively. Figure 7 shows the detection of fire and smoke.

Figure 4. Representation of the framework of areas partitioning a frame. '1' denotes a true region. Gives an idea of the detection method.

Figure 5. a), b) NRR per frame characteristic curves for different datasets.

Figure 6. a), b) NRR frame level characteristic curves for different datasets.

5. CONCLUSIONS

This paper introduces a new Fire SM database of fire anomaly scenarios. The database is the most demanding and diversified dataset ever produced. This hand-crafted research dataset was designed to assess the generalization of a trained model. This study proposes a novel lightweight, real-time approach for detecting smoke and fire in videos or photographs. Existing datasets are either restricted or produced synthetically for testing purposes. In this study, validation was carried out on a challenging real-world proposed dataset that includes the majority of fire and smoke event scenarios.
Further, the weighted bi-directional Feature Pyramid Network (BiFPN) and compound scaling consistently achieve better efficiency in EfficientDet. Experimental findings show that Google's newest model, EfficientDet, outperforms Foggia on the proposed dataset. These results were obtained using Average Precision (AP) as an indicator; on the proposed dataset, it shows around 78 %, compared to 71 % and 61 % for InceptionNet and FireNet, respectively, on the Foggia dataset. The new assessment criteria address the shortcomings of the traditional criteria in this field and provide a more accurate picture of how well an algorithm performs in a real environment. Furthermore, in this study, two variants of a recent fire anomaly detection algorithm were used as a benchmark against which future work can be measured. The new database should help encourage novel techniques in this research field.

REFERENCES

[1] T. Celik, H. Demirel, H. Ozkaramanli, M. Uyguroglu, Fire detection using statistical color model in video sequences, Journal of Visual Communication and Image Representation, vol. 18, no. 2, April 2007, pp. 176-185. DOI: 10.1016/j.jvcir.2006.12.003
[2] B. C. Ko, S. J. Ham, J. Y. Nam, Modeling and formalization of fuzzy finite automata for detection of irregular fire flames, IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 12, 2011, pp. 1903-1912. DOI: 10.1109/TCSVT.2011.2157190
[3] J. Choi, J. Y. Choi, Patch-based fire detection with online outlier learning, 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 25 August 2015, pp. 1-6. DOI: 10.1109/AVSS.2015.7301763
[4] T. Wang, L. Shi, P. Yuan, L. Bu, X. Hou, A new fire detection method based on flame color dispersion and similarity in consecutive frames, 2017 Chinese Automation Congress (CAC), 20 October 2017, pp. 151-156. DOI: 10.1109/CAC.2017.8242754
[5] K. Muhammad, J. Ahmad, Z. Lv, P. Bellavista, P. Yang, S. W.
Baik, Efficient Deep CNN-Based Fire Detection and Localization in Video Surveillance Applications, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 7, July 2019, pp. 1419-1434. DOI: 10.1109/TSMC.2018.2830099
[6] A. Jadon, M. Omama, A. Varshney, M. S. Ansari, R. Sharma, FireNet: a specialized lightweight fire & smoke detection model for real-time IoT applications, arXiv preprint arXiv:1905.11922, 28 May 2019. DOI: 10.48550/arXiv.1905.11922
[7] K. Muhammad, J. Ahmad, I. Mehmood, S. Rho, S. W. Baik, Convolutional Neural Networks Based Fire Detection in Surveillance Videos, IEEE Access, vol. 6, 2018, pp. 18174-18183. DOI: 10.1109/ACCESS.2018.2812835

Figure 7. Detection of Fire and Smoke in Proposed Dataset.

Table 4. Comparison of Updated EfficientDet to InceptionNet and FireNet (*AP@50: 50 % above overlap, AP@75: 75 % above overlap).

| Method | Early Fire and Smoke (proposed), AP@50 | Early Fire and Smoke (proposed), AP@75 | Foggia dataset, AP@50 | Foggia dataset, AP@75 |
|---|---|---|---|---|
| Khan et al. [14] InceptionNet | 53.41 | 50.63 | 65.23 | 61.28 |
| FireNet [15] | 68.46 | 57.94 | 73.23 | 70.65 |
| Modified EfficientDet D0 | 73.35 | 70.78 | 81.92 | 78.23 |

[8] P. Foggia, A. Saggese, M. Vento, Real-time fire detection for video-surveillance applications using a combination of experts based on color, shape, and motion, IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 9, 2015, pp. 1545-1556. DOI: 10.1109/TCSVT.2015.2392531
[9] D. Y. Chino, L. P. Avalhais, J. F. Rodrigues, A. J. Traina, BoWFire: detection of fire in still images by integrating pixel color and texture analysis, 28th SIBGRAPI Conference on Graphics, Patterns and Images, 2015, pp. 95-102.
DOI: 10.1109/SIBGRAPI.2015.19
[10] E. Cetin, Computer vision-based fire detection dataset. Online [Accessed 17 March 2022] http://signal.ee.bilkent.edu.tr/VisiFire/
Ultimate chase. Online [Accessed 17 March 2022] http://ultimatechase.com/
[11] Li Pu, W. Zhao, Image fire detection algorithms based on convolutional neural networks, Case Studies in Thermal Engineering, vol. 19, June 2020, 100625. DOI: 10.1016/j.csite.2020.100625
[12] M. Tan, R. Pang, Q. V. Le, EfficientDet: Scalable and efficient object detection, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10781-10790.
[13] K. Muhammad, J. Ahmad, Z. Lv, P. Bellavista, P. Yang, S. W. Baik, Efficient Deep CNN-Based Fire Detection and Localization in Video Surveillance Applications, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 7, July 2019, pp. 1419-1434. DOI: 10.1109/TSMC.2018.2830099
[14] A. Jadon, M. Omama, A. Varshney, M. S. Ansari, R. Sharma, FireNet: a specialized lightweight fire & smoke detection model for real-time IoT applications, arXiv preprint arXiv:1905.11922, 28 May 2019. DOI: 10.48550/arXiv.1905.11922
[15] Fire SM dataset. Online [Accessed 17 March 2022] https://tinyurl.com/83exdz6d
[16] Federica Vurchio, Giorgia Fiori, Andrea Scorza, Salvatore Andrea Sciuto, Comparative evaluation of three image analysis methods for angular displacement measurement in a MEMS microgripper prototype: a preliminary study, Acta IMEKO, vol. 10, no. 2, 2021, pp. 119-125. DOI: 10.21014/acta_imeko.v10i2.1047
[17] Henrik Ingerslev, Soren Andresen, Jacob Holm Winther, Digital signal processing functions for ultra-low frequency calibrations, Acta IMEKO, vol. 9, no. 5, 2020, pp. 374-378. DOI: 10.21014/acta_imeko.v9i5.1004
[18] Lorenzo Ciani, Alessandro Bartolini, Giulia Guidi, Gabriele Patrizi, A hybrid tree sensor network for a condition monitoring system to optimise maintenance policy, Acta IMEKO, vol. 9, no. 1, 2020, pp. 3-9.
DOI: 10.21014/acta_imeko.v9i1.732
[19] András Kalapos, Csaba Gór, Róbert Moni, István Harmati, Vision-based reinforcement learning for lane-tracking control, Acta IMEKO, vol. 10, no. 3, 2021, pp. 7-14. DOI: 10.21014/acta_imeko.v10i3.1020