Performance evaluation of underwater image pre-processing algorithms for the improvement of multi-view 3D reconstruction

ACTA IMEKO | ISSN: 2221-870X | September 2019 | Volume 8 | Number 3 | pp. 69-77

Alessandro Gallo1, Fabio Bruno1, Loris Barbieri1, Antonio Lagudi1, Maurizio Muzzupappa1

1 Department of Mechanical, Energy and Management Engineering (DIMEG), University of Calabria, Via P. Bucci 46C, 87036 Rende, Italy

Section: RESEARCH PAPER

Keywords: 3D reconstruction; Underwater Cultural Heritage; Image enhancement; Underwater imaging

Citation: Alessandro Gallo, Fabio Bruno, Loris Barbieri, Antonio Lagudi, Maurizio Muzzupappa, Performance evaluation of underwater image pre-processing algorithms for the improvement of multi-view 3D reconstruction, Acta IMEKO, vol. 8, no. 3, article 11, September 2019, identifier: IMEKO-ACTA-08 (2019)-03-11

Section Editor: Egidio De Benedetto, University of Salento, Italy

Received November 12, 2018; In final form May 28, 2019; Published September 2019

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Corresponding author: Loris Barbieri, e-mail: loris.barbieri@unical.it

ABSTRACT
3D models of submerged structures and underwater archaeological finds are widely used in various applications, such as monitoring, analysis, dissemination, and inspection. Underwater environments are characterised by poor visibility conditions and the presence of marine flora and fauna. Consequently, the adoption of passive optical techniques for the 3D reconstruction of underwater scenarios is a highly challenging task. This article presents a performance analysis conducted on a multi-view technique that is commonly used in air, in order to highlight its limits in the underwater environment and then provide guidelines for the accurate modelling of a submerged site in poor visibility conditions. The performance analysis has been carried out by comparing different image enhancement algorithms, and the results have been adopted to reconstruct an area of 40 m2 at a depth of about 5 m at the underwater archaeological site of Baiae (Italy).

1. INTRODUCTION

The 3D reconstruction of submerged structures or archaeological finds has achieved notable popularity in Underwater Cultural Heritage (UCH) preservation, as the method allows for the exploration of sites located in inaccessible and hostile environments. In the last decade, techniques and tools for 3D reconstruction have been widely employed in the underwater archaeology field according to the guidelines of UNESCO, which suggest the in-situ preservation of underwater heritage [1]. Among the different 3D imaging techniques that are suitable for underwater applications, photogrammetry represents a valid method of reconstructing 3D scenes from a set of images taken from different viewpoints [2]. The popularity of this technique is also due to the acquisition devices (still or movie cameras with appropriate waterproof casings), which are affordable and easy to use compared to dedicated devices such as LIDAR, multi-beam sonar, etc. [3]. Furthermore, these devices can be handled by scuba divers or mounted on underwater robots [4]. Unfortunately, image-based acquisition suffers from the poor environmental conditions. The depth of the water, flora, fauna, weather conditions, and sea currents are all factors that affect visibility, refraction, and lighting. Consequently, these factors limit underwater photogrammetry's scope to close-range applications, and further efforts are required to improve the radiometric quality of the images. For these reasons, the enhancement of underwater images is still a necessary step in improving the accuracy of 3D reconstruction and creating realistic textures.

This article presents a performance evaluation of underwater image pre-processing algorithms for the improvement of multi-view 3D reconstruction. Two existing colour enhancement models, i.e. the ACE (Automatic Colour Equalisation) and PCA (Principal Component Analysis) algorithms, have been tested to compare their results with those provided by a new method based on histogram stretching and manual retouching (HIST).
To this end, an experimental campaign has been planned using Design of Experiments (DOE) [5] criteria to investigate the factors that affect reconstruction accuracy. The experimental campaign has been carried out in the underwater archaeological site of Baiae (Naples, Italy), where the seafloor, with a water depth ranging between 2.5 and 20 m, offers a particularly interesting environment, as it encompasses a submerged area of many hectares and presents a wide range of different architectural structures with a number of decorations that are still preserved. The research has been conducted in the context of the iMARECulture project [6], [7], [8], which aims to develop new tools and technologies for improving public awareness of UCH.

The article is organised as follows. Section 2 presents related works about underwater image pre-processing methods. Section 3 describes the image acquisition and pre-processing stage. In section 4, the results of the statistical analysis are detailed. Section 5 presents the 3D reconstruction obtained by means of the enhanced images, and finally, conclusions are presented in section 6.

2. UNDERWATER IMAGE PRE-PROCESSING ALGORITHMS

Underwater pictures generally suffer from light absorption, which causes defects mostly on the red channel (the first component of the light spectrum to be absorbed), an effect that is already noticeable at a depth of only a few metres. The pre-processing of underwater images can be conducted with two different approaches: image restoration techniques or image enhancement methods [9], [10]. Image restoration techniques need some environmental parameters to be entered, such as scattering and attenuation coefficients, while image enhancement methods do not require a priori knowledge of the underwater environment. The physical effects of visibility degradation have been analysed in [11], showing that the degradation effects can be associated mainly with the partial polarisation of light. The developed algorithm is based on a pair of images taken through a polariser at different orientations, improving contrast and colour and doubling the underwater visibility range. The work of [12] presents an image restoration filter based on a simplified version of the Jaffe-McGlamery underwater image formation model, which can be used for images with limited backscatter in diffuse lighting.
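To make the distinction between restoration and enhancement concrete, the following minimal Python sketch simulates the attenuation-plus-backscatter structure of a simplified image formation model of the kind discussed above and inverts it when the coefficients are known. The attenuation coefficients and veiling colour used here are illustrative placeholders, not values from this study; needing such values is precisely why restoration techniques require a priori environmental knowledge.

```python
import numpy as np

# Illustrative per-channel attenuation coefficients (1/m) and veiling
# light colour (assumed values, not measured at the Baiae site).
BETA = np.array([0.60, 0.10, 0.08])   # red attenuates fastest
VEIL = np.array([0.10, 0.35, 0.45])   # blue-green background light

def degrade(image: np.ndarray, distance_m: float) -> np.ndarray:
    """Simplified formation model: I = J*t + B*(1 - t), t = exp(-beta*d)."""
    t = np.exp(-BETA * distance_m)            # per-channel transmission
    return image * t + VEIL * (1.0 - t)

def restore(observed: np.ndarray, distance_m: float) -> np.ndarray:
    """Invert the model; only possible when beta, B, and d are known."""
    t = np.exp(-BETA * distance_m)
    return np.clip((observed - VEIL * (1.0 - t)) / t, 0.0, 1.0)

# Round trip on a random scene at 2.5 m (the survey's working distance).
scene = np.random.rand(4, 4, 3)
recovered = restore(degrade(scene, 2.5), 2.5)
assert np.allclose(scene, recovered, atol=1e-6)
```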
The ACE algorithm [13] is inspired by human vision, which is able to adapt to highly variable lighting conditions and to extract visual information from the environment [14]. The algorithm combines the White Patch algorithm with the Gray World algorithm, taking into account the spatial distribution of colour information. In the first stages of the ACE method, chromatic data and pixels are processed and adjusted according to the information contained in the image. Subsequently, the colours in the output image are restored and enhanced [15].

An alternative to the ACE algorithm is the Principal Component Analysis (PCA) algorithm, which makes it possible to reduce the number of variables considerably while still retaining much of the information in the original dataset. PCA is one of the most popular multivariate statistical techniques: it analyses a data table representing observations described by several dependent variables and extracts the important information in the form of a set of new orthogonal variables called principal components. In this specific application, the PCA algorithm allows for the extraction of the dominant colour of the image, which in most cases is the water colour, thus providing good results in terms of colour enhancement.
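Since ACE blends the behaviour of the two classical colour constancy approaches named above, a minimal sketch of Gray World and White Patch corrections may help clarify the building blocks. This is not the full ACE algorithm, which additionally weighs the spatial distribution of colour information; it only illustrates the two assumptions ACE combines.

```python
import numpy as np

def gray_world(image: np.ndarray) -> np.ndarray:
    """Scale each channel so its mean matches the global mean
    (Gray World assumption: the scene averages to grey)."""
    means = image.reshape(-1, 3).mean(axis=0)
    return np.clip(image * (means.mean() / means), 0.0, 1.0)

def white_patch(image: np.ndarray, percentile: float = 99.0) -> np.ndarray:
    """Scale each channel so its brightest values map to white
    (White Patch assumption; a percentile makes it robust to noise)."""
    maxima = np.percentile(image.reshape(-1, 3), percentile, axis=0)
    return np.clip(image / np.maximum(maxima, 1e-6), 0.0, 1.0)

# A greenish, underwater-looking test image in [0, 1], RGB.
rng = np.random.default_rng(0)
img = rng.random((64, 64, 3)) * np.array([0.3, 0.9, 0.7])
balanced = white_patch(gray_world(img))
```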
An automatic enhancement algorithm that does not require any correction parameter has been proposed in [16], where each source of error is corrected sequentially. The first step removes the moiré effect; then a homomorphic (frequency-domain) filter is applied to equalise brightness and enhance contrast. To address acquisition noise, a wavelet denoising filter followed by anisotropic filtering is applied. Finally, dynamic expansion is applied to increase contrast, followed by colour equalisation. The process is performed on one channel only, the luminance of the YCbCr colour space, in order to optimise computation time. Even if this last step speeds up all the following processes by avoiding the need to process each RGB channel, it is important to point out that the use of a homomorphic filter affects the geometry and could generate errors in the reconstructed scene.

The effectiveness of using different colour spaces for the enhancement of underwater images has been demonstrated in [17], where a slide stretching algorithm has been applied both to the RGB and HSI colour spaces. After a contrast stretching in the RGB colour space has been performed, the resulting images are converted to the HSI colour space and processed through saturation and intensity stretching in order to increase the true colour and solve the lighting problem.

The aim of underwater colour correction is not only to obtain better quality images, but also to improve the performance of feature extraction algorithms in terms of the detection of feature points. The effects of different image pre-processing methods on the performance of the SURF (Speeded Up Robust Features) detector [18] have been investigated in [19], and the IACE (Image Adaptive Contrast Enhancement) method has been proposed. In particular, the IACE method enhances the intrinsic features in images, like corners, edges, and blobs, while maintaining the relative contrast between the pixels. Thanks to this capability, the IACE method has proven better than other techniques, like Histogram Equalisation and the Multiscale Retinex algorithm for Colour Enhancement, in terms of the repeatability of the SURF detector and the robustness and distinctiveness of the SURF descriptor.

Different to previous works [20] that focus on the comparison of different image enhancement algorithms, with all other conditions being equal, this article presents a performance analysis based on a DOE approach, which takes into account the main influential factors that affect 3D reconstruction quality.

3. EXPERIMENTATION

The experiment was undertaken in the underwater archaeological site of Baiae, which is located a few kilometres north of Naples (Italy). The submerged environment of Baiae is characterised by highly critical visibility conditions due to water turbidity and the heavy presence of flora and fauna. The area selected for the experimentation is the thermal room of 'Villa Protiro', with a size of 5 × 8 m at an average depth from the sea level of 5 m. This area was chosen because of the presence of different building materials (bricks, mortar, tile floors, etc.) and a strong colonisation by various bio-fouling agents. Because of the critical visibility conditions, 3D reconstruction techniques based on the multi-view stereo method are not sufficient on their own for performing an accurate 3D reconstruction of the submerged archaeological area. The underwater images therefore require a pre-processing stage involving the adoption of image enhancement algorithms, which can have a relevant or mediocre impact on the quality of the final 3D reconstructed model.

3.1. Experimental setup

The experimental setup consists of a camera, its underwater housing, and two underwater strobes. The camera is a Nikon D7000 reflex device equipped with a CMOS (Complementary Metal-Oxide Semiconductor) sensor with a size of 23.6 × 15.8 mm and a resolution of 4928 × 3264 pixels (16.2 effective megapixels), as well as an AF Nikkor 20 mm lens. The underwater housing, manufactured by Ikelite, is equipped with a spherical port. The flashguns are connected to the camera housing with a pair of articulated arms at a distance of 45 cm. The two strobes have been fixed at a distance of 5 cm behind the dome, pointing outwards to illuminate the object with the 'edge' of the light beam. A calibration panel, produced by Lastolite, has been used at the beginning of the survey to acquire a colour calibration image for performing an in-situ white balance correction, while a digital depth gauge has been used to maintain a constant depth from the seabed.

3.2. Image acquisition

The photogrammetric survey of the submerged area has been carried out in two different dive sessions, covering the north and south parts of the site. The survey follows a standard aerial photography layout: the diver swims at a distance of about 2.5 m from the submerged structures, taking overlapping pictures along straight lines that cover the whole area in the north-south direction. Another set of images has been acquired in the east-west direction. The occluded areas have been acquired using oblique photographs. At the end of the survey activity, the dataset included a total of around 700 images.
3.3. Colour enhancement of underwater images

The original images (OR) have been enhanced by means of three algorithms (ACE, PCA, HIST) and corrected through an in-situ white balance procedure. The ACE (Automatic Colour Equalisation) algorithm proposed in [9] and the PCA algorithm proposed in [21] have been adopted. The HIST (Histogram Stretching and Manual Retouching) algorithm is a semiautomatic enhancement methodology that has been developed for this study. It is based on histogram stretching and a manual colour retouching procedure and has been implemented using batch actions in a graphics editor in order to rescue the maximum amount of information from a set of defective and noisy pictures. In particular, the HIST method consists of the following three-step procedure: preliminary histogram stretching to improve the contrast; mixing of the colour channels to balance the missing information on the red channel; and creation of a set of adjustment layers, including saturation enhancement for some missing hues, contrast masks, colour balancing, and equalising. In addition to the enhancement algorithms, the images have been processed with an in-situ white balance correction procedure (WB) performed by means of a Lastolite waterproof panel. Figure 1 shows an original uncorrected image (Figure 1a) and those enhanced with the WB procedure (Figure 1b) and the ACE (Figure 1c), HIST (Figure 1d), and PCA (Figure 1e) algorithms.

Figure 1. Sample original image (a) corrected with the in-situ white balance measurement (b), enhanced with the ACE method (c), HIST method (d), and PCA method (e).
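The automatic part of the HIST procedure can be illustrated with a short sketch. The stretching percentiles and channel-mixing weight below are illustrative assumptions, since the article describes the steps qualitatively and the third step (adjustment layers and retouching) was performed manually in a graphics editor.

```python
import numpy as np

def stretch(channel: np.ndarray, low: float = 1.0, high: float = 99.0) -> np.ndarray:
    """Step 1: histogram stretching between two percentiles."""
    lo, hi = np.percentile(channel, [low, high])
    return np.clip((channel - lo) / max(hi - lo, 1e-6), 0.0, 1.0)

def mix_red(image: np.ndarray, g_weight: float = 0.3) -> np.ndarray:
    """Step 2: rebuild the weak red channel by blending in some green,
    compensating for the absorption of red light at depth."""
    out = image.copy()
    out[..., 0] = np.clip((1 - g_weight) * image[..., 0]
                          + g_weight * image[..., 1], 0.0, 1.0)
    return out

def hist_enhance(image: np.ndarray) -> np.ndarray:
    """Automatic portion of HIST; the manual third step (adjustment
    layers, contrast masks, colour balancing) is not reproduced here."""
    stretched = np.dstack([stretch(image[..., c]) for c in range(3)])
    return mix_red(stretched)
```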
3.4. Design of the experimental campaign

The experimental campaign has been planned according to the DOE criteria with the purpose of identifying the most influential factors affecting the results of the 3D reconstruction in the underwater environment. Particular attention has been given to the effect of the image enhancement methods on the camera orientation and the self-calibration bundle adjustment process. The measured data have been compared and analysed by means of standard statistical tools to verify whether a particular factor (or a combination of factors) has an impact on a parameter with a certain confidence level. On the basis of the results of this analysis, it is possible to find the best combination of factors that should be used for an accurate and dense 3D reconstruction with a multi-view stereo technique.

3.5. Influencing factors

The first step is the selection and identification of the factors that could have an influence on the quality of a 3D reconstruction performed with a multi-view technique in the underwater environment. The influencing factors have been chosen among those that can be controlled, thus excluding factors such as the presence of marine organisms in motion and the level of turbidity of the water, which cannot be controlled in situ. Furthermore, factors that can be considered a direct consequence of others have not been considered separately: for instance, focus settings depend on the distance from the subject, and the focal length is set according to the required field of view and the working distance. The factors selected for the experiment are reported in Table 1.

Table 1. Influential factors and related symbols and levels.

Influential factor        | Symbol | Levels
Colour enhancement method | EN     | OR, HIST, ACE, PCA, WB
Image pyramid level       | PYR    | 1, 2, 3
Colour channel            | CH     | RGB, R, G, B
Image set                 | SET    | SET 1, SET 2, ..., SET 7

The first factor (EN) is related to the original images (OR) and to the colour enhancement algorithms (WB, HIST, ACE, and PCA) adopted to improve the underwater images. The second factor refers to the image resolution (PYR). The full-resolution images have not been used, in order to save computational time. The raw images (4928 × 3264 pixels) have been resized by means of the Mitchell-Netravali cubic filter [22] in order to create the following levels: level 1 for images of 2464 × 1632 pixels; level 2 for images of 1232 × 816 pixels; and level 3 for images of 616 × 408 pixels. The third factor (CH) is represented by the composite RGB image and its three R (Red), G (Green), and B (Blue) components. This factor has been taken into consideration in order to investigate the influence of a single colour channel on the reconstruction quality with respect to the grayscale image obtained by combining the RGB components. In order to perform a quantitative analysis of the impact of the camera layout on the processing results, the last influencing factor refers to the type of image set (SET). Seven subsets have been created, which differ from each other according to the type of shot (aerial vs. oblique), working distance, and picture overlap. In particular, the first two sets include photos that have been taken with a standard aerial layout. The third set includes pictures characterised by high overlap and good visibility due to the reduced distance from the submerged structures. The fourth and fifth sets cover the masonry structures of the outer walls, while the sixth and seventh sets group oblique pictures with variable working distances.

3.6. Measured parameters

The 3D reconstruction quality has been evaluated by means of four different parameters: the mean number of extracted features; the percentage of matched features; the percentage of oriented cameras; and the bundle adjustment mean re-projection error. The mean number of extracted features ($\overline{numF}$) has been calculated for the features detected by the SIFT (Scale Invariant Feature Transform) operator [23] through the following relationship:

$$\overline{numF}_{EN,PYR,CH,SET} = \frac{numF_{EN,PYR,CH,SET}}{numIm_{SET}} \quad (1)$$

where $numF$ is the total number of extracted features for each configuration, and $numIm$ is the number of images included in each set. The percentage of matched features ($matchedF\%$) and the percentage of oriented cameras ($cam\%$) have been evaluated using Bundler [24], through the following relationships:

$$matchedF\%_{EN,PYR,CH,SET} = \frac{matchedF_{EN,PYR,CH,SET}}{numF_{EN,PYR,CH,SET}} \quad (2)$$

$$cam\%_{EN,PYR,CH,SET} = \frac{cam_{EN,PYR,CH,SET}}{numIm_{SET}} \quad (3)$$

where $matchedF$ represents the number of matched 3D points in the sparse scene reconstruction resulting at the end of the bundle adjustment process, and $cam$ is the number of oriented images for each configuration. The bundle adjustment mean re-projection error (available as output in the Bundler log file and measured in pixels) is the result of a minimisation problem applied to the sum of distances between the projections of each track (a connected set of matching key points across multiple images) and its corresponding image features.
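As an illustration of how the counts entering equation (1) can be gathered, the following sketch uses OpenCV's SIFT implementation to count extracted features per image. The directory layout and file pattern are hypothetical; the study itself used the SIFT operator together with Bundler rather than this exact script.

```python
from pathlib import Path

import cv2

def mean_extracted_features(image_dir: str) -> float:
    """Equation (1): total SIFT features over number of images in a set."""
    sift = cv2.SIFT_create()
    paths = sorted(Path(image_dir).glob("*.jpg"))  # hypothetical layout
    total = 0
    for path in paths:
        gray = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
        keypoints = sift.detect(gray, None)  # detection only, no descriptors
        total += len(keypoints)
    return total / len(paths)

# e.g. mean_extracted_features("SET3/HIST/PYR2/RGB") for one configuration
```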
3.7. Dataset generation

As mentioned in section 3.5, the whole dataset has been grouped into seven subsets according to camera orientation, distance from the subject, pictures taken with flash, and the heavy presence of dark and bright areas. The grouping procedure has entailed a selection and reduction of the number of images to 196 pictures. A Matlab script has been programmed in order to manage the selected images and apply the different image enhancement algorithms. Firstly, the enhancement methods (ACE, PCA, HIST) and the WB correction technique have been applied to the original full-resolution images. Secondly, the red, green, and blue colour components have been extracted, but only from the images enhanced with the WB correction, ACE, and HIST methods; no action has been taken on the images enhanced with the PCA method, because this method produces a single-channel output. Lastly, all the images have been resized according to the pyramid levels.

4. STATISTICAL ANALYSIS

Table 2 shows the mean values of the measured parameters, described in section 3.6, computed for each influential factor. The data have been analysed by means of statistical instruments and compared by performing an ANOVA test with a 95 % confidence level. The summary of the results is presented in Table 3, in which the measured parameters have been computed considering the main effects of the factors only. The Tukey post hoc test has been performed in order to find significant differences between the groups of each influential factor.

Table 2. Mean values of the measured parameters computed for each influential factor.

Factor | Level | Mean extracted features (1) | % matched features (2) | % oriented cameras (3) | Mean re-projection error (pixels)
EN  | HIST | 5344.9 | 2.41 | 43.24 | 0.19
EN  | ACE  | 6984.8 | 1.45 | 37.72 | 0.22
EN  | PCA  | 1841.2 | 2.07 | 33.38 | 0.18
EN  | WB   | 2452.1 | 1.21 | 21.23 | 0.22
EN  | OR   |  260.1 | 1.14 |  9.26 | 0.10
PYR | 1    | 8806.0 | 1.61 | 37.22 | 0.24
PYR | 2    | 2829.6 | 2.04 | 32.58 | 0.16
PYR | 3    |  954.2 | 1.35 | 20.42 | 0.13
CH  | RGB  | 3848.9 | 2.20 | 35.98 | 0.18
CH  | R    | 4315.0 | 1.36 | 30.51 | 0.17
CH  | G    | 3300.1 | 2.19 | 34.82 | 0.18
CH  | B    | 5322.4 | 0.91 | 18.97 | 0.16
SET | 1    | 2813.7 | 1.71 | 28.44 | 0.14
SET | 2    | 4120.1 | 1.81 | 25.97 | 0.15
SET | 3    | 5986.0 | 2.39 | 48.46 | 0.13
SET | 4    | 3647.2 | 2.56 | 47.02 | 0.22
SET | 5    | 4654.6 | 0.96 | 22.62 | 0.24
SET | 6    | 3040.7 | 1.43 | 26.59 | 0.24
SET | 7    | 5022.6 | 0.98 | 20.90 | 0.22

Table 3. Summary of the results of the ANOVA analysis (F statistic and p-value per measured parameter).

Factor | Mean extracted features (1) | % matched features (2) | % oriented cameras (3) | Mean re-projection error
EN  | F = 1065.41, p = 0 | F = 21.07, p = 0      | F = 91.08, p = 0 | F = 9.17, p = 0.0002
PYR | F = 1457.10, p = 0 | F = 5.93, p = 0.005   | F = 20.61, p = 0 | F = 13.82, p = 0
CH  | F = 47.80, p = 0   | F = 14.54, p = 0      | F = 12.39, p = 0 | F = 0.18, p = 0.9129
SET | F = 92.79, p = 0   | F = 7.76, p = 0.0001  | F = 18.64, p = 0 | F = 6.11, p = 0.0002

4.1. Mean extracted features

Figure 2 reports the mean values of the extracted features for all the factors of influence (summarised in Table 2). The results show that the number of extracted points strongly depends on image resolution: a greater number of features is obtained from images with a higher resolution. In this regard, the data summarised in Table 3 show a statistically significant difference among the three levels of the PYR factor, confirming that the number of features has a strong relationship with image resolution. Nevertheless, the images at level 1 (four times more pixels than level 2) produce only three times more features than the images at level 2, and nine times more features than the images at level 3 (which contain 16 times fewer pixels than level 1). The image enhancement methods ACE and HIST have returned the best results. On the contrary, the colour channel appears to be less influential than the enhancement algorithms, because the latter mix the various colour channels in different ways. The results related to the factor 'image set' show a difference in behaviour within the seven datasets. In particular, the highest number of features is extracted from the images belonging to set 3, which includes pictures taken at a reduced distance from the subject. The first two sets are related to the same area: the second set has been acquired after removing the sand that covered the tiled floor, in order to improve the reliability of point detection. Set 6 shows a lower number of features, as the oblique pictures have been taken from a greater distance, and the presence of the blue background is more evident.

Figure 2. The mean values of extracted features for all the influential factors.
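The per-factor statistical procedure described above (one-way ANOVA at a 95 % confidence level followed by a Tukey post hoc test) can be sketched in a few lines using SciPy and statsmodels. The data-frame layout, one row per configuration with one column per measured parameter, is an assumption about how the results could be organised, not the authors' actual Matlab workflow.

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def analyse_factor(df: pd.DataFrame, factor: str, parameter: str,
                   alpha: float = 0.05) -> None:
    """One-way ANOVA on `parameter` grouped by `factor`, followed by a
    Tukey post hoc test when the factor is significant at level alpha."""
    groups = [g[parameter].values for _, g in df.groupby(factor)]
    f_stat, p_value = stats.f_oneway(*groups)
    print(f"{factor} -> {parameter}: F = {f_stat:.2f}, p = {p_value:.4f}")
    if p_value < alpha:
        print(pairwise_tukeyhsd(df[parameter], df[factor], alpha=alpha))

# df would hold one row per (EN, PYR, CH, SET) configuration, e.g.:
# analyse_factor(df, "EN", "matched_features_pct")
```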
4.2. Percentage of matched features

The most influential factor for the parameter 'percentage of matched features' (Table 2) is the image enhancement algorithm used. As depicted in Figure 3, the HIST algorithm allows for matching a higher number of features. The second factor in terms of influence is the colour channel: RGB images and the green channel allow for matching the maximum number of features. The results related to the factor 'image pyramid level' reflect the low influence deduced from the ANOVA analysis (Table 3), but it must be pointed out that resized images lead to a higher percentage of matched points. For all three enhancement algorithms used in the experimentation, PYR levels 1 and 2 lead to a higher performance of the feature-matching algorithm. One of the reasons for this behaviour is that, by reducing the resolution, it is possible to find more robust features, as they are extracted from the more evident details. The images at level 3 have not shown good results due to the lack of reliable details. Regarding the 'image set' factor, the matching of the images included in sets 5, 6, and 7 leads to poor results, since these sets are composed of oblique photographs only. The best results have been obtained for sets 3 and 4, which include images taken with a reduced working distance and a greater picture overlap.

Figure 3. The mean values of the percentage of the matched features for all the influential factors.

4.3. Percentage of oriented cameras

The image enhancement algorithm is the most influential factor for the parameter 'percentage of oriented cameras' (Table 2). The performances obtained using each method are clearly better than the results obtained with the original images. For the latter, only 9 % of the cameras have been oriented, while about 43 %, 37 %, and 33 % of the cameras have been successfully oriented using the HIST, ACE, and PCA enhancement methods, respectively. The second most influential factor is image resolution. By analysing the results in Figure 4 and the outcomes of the Tukey post hoc test, it is noticeable that there are no statistically relevant differences between the values for the first and second levels of the image pyramid. This means that it is possible to obtain the maximum number of oriented cameras with a lower resolution, also saving computational time.

Figure 4. Mean values of the percentage of oriented cameras for all the influential factors.

4.4. Bundle adjustment mean re-projection error

The ANOVA results (Table 3) show that the most influential factor for the parameter 'mean re-projection error' is image resolution. Considering the pixel size of the subsampled images and the mean distance from the subject of 2.5 m, the first and second pyramid levels correspond to errors measured on the ground of 0.29 and 0.38 mm, respectively. By analysing the results presented in Table 2 and depicted in Figure 5, it is noticeable that a higher error has been measured on the image sets containing oblique photographs only. The presence of the blue background in almost all of these pictures reduces the accuracy of the bundle adjustment process. Furthermore, the ANOVA analysis results (Table 3) reveal that there is not a statistically significant difference among the different colour channels (CH).

Figure 5. Bundle adjustment mean re-projection error for all the influential factors.
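The ground values quoted above follow from simple geometry: the ground sample distance (GSD) of a subsampled image is pixel pitch × distance / focal length, and multiplying it by the mean re-projection error in pixels gives the error on the ground. A short sketch reproducing the computation from the figures given in the article (23.6 mm sensor width, 4928 original pixels, 20 mm lens, 2.5 m working distance):

```python
SENSOR_WIDTH_MM = 23.6
ORIGINAL_WIDTH_PX = 4928
FOCAL_MM = 20.0
DISTANCE_MM = 2500.0

def ground_error_mm(pyramid_level: int, reprojection_error_px: float) -> float:
    """Error on the ground = GSD at the given pyramid level times the
    mean re-projection error in pixels."""
    width_px = ORIGINAL_WIDTH_PX / (2 ** pyramid_level)  # halved per level
    pixel_pitch_mm = SENSOR_WIDTH_MM / width_px
    gsd_mm = pixel_pitch_mm * DISTANCE_MM / FOCAL_MM
    return gsd_mm * reprojection_error_px

print(ground_error_mm(1, 0.24))   # ~0.29 mm for PYR level 1
print(ground_error_mm(2, 0.16))   # ~0.38 mm for PYR level 2
```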
4.5. Discussion

The statistical analysis allowed for choosing the best combination of factors to be used to perform the 3D reconstruction of the site. The accuracy of the SfM (Structure from Motion) procedure (namely the mean re-projection error) is mainly related to the camera network orientation. Sets 1, 2, and 3 are characterised by convergent images with a high overlap, forming a more robust network. Moreover, as reported in the previous section, both levels 1 and 2 led to errors below the acceptable value of 0.5 mm. For these reasons, it is possible to save computational time by using subsampled images, which also result in a higher percentage of matched features. In fact, under the same conditions and varying only the image resolution, the average reconstruction time for the datasets considered in the study shows a time saving of 81 % for PYR level 2 compared to PYR level 1, and of 92 % for PYR level 3 compared to PYR level 1.

The HIST and ACE methods considerably increase the performance of image matching. The first method returns better results in terms of the percentage of oriented cameras and matched features, increasing the performance by about 150 % and 50 %, respectively. The analysis of the effects of white balance correction has shown good results for the parameters 'mean number of extracted features' and 'percentage of oriented cameras'. In particular, white balance corrected images have shown a higher number of extracted features and better performances compared with the PCA method. The images obtained using the custom white balance adjustment, however, lead to less stable results, since the correction is performed only at the beginning of the survey. The results in terms of the number of oriented cameras have shown similar values for the first and second image pyramid levels. Concerning the re-projection error, as described in section 4.4, in this case the best choice is also to use a lower resolution in order to save computational time without affecting reconstruction accuracy. Regarding the colour channel, the data have demonstrated the better performance of the RGB images and, in particular, of the green channel, which outperforms the other channels in terms of matched features and oriented images. Considering the results related to the factor 'image set', it is evident that the highest number of features is extracted from the sequences of images that have been taken using a standard aerial photography layout with a reduced distance to the subject. On the contrary, oblique pictures, where the presence of the blue background is more evident, returned poor results in terms of the percentage of matched features and the percentage of oriented cameras.
5. RESULTS

The 3D reconstruction pipeline starts with the orientation of the whole dataset of 722 pictures by means of the Bundler software [25]. In the first instance, the 3D reconstruction has been performed on the dataset composed of the original images. The image orientation process failed to orient all the pictures in a single block: the dataset had to be divided into two non-overlapping groups, the north and south parts, which have been reconstructed separately. In particular, 384 images have been oriented for the north block and 116 for the south block. This failure is mainly due to the sandy seabed in the central part of the room, which makes the extraction and matching of features difficult as a consequence of the low contrast. Furthermore, the lack of overlapping areas in the reconstructed model prevented the alignment of the two blocks.

The results of the statistical analysis allow for choosing the best combination of factors to be used to improve the reconstruction process: RGB images resized to 25 % (second pyramid level) and enhanced with the HIST method. The enhanced dataset has been processed with Bundler, and a subset of 533 images covering the whole area has been aligned, allowing for the generation of a complete 3D point cloud without the need to register different meshes (Figure 6). This result shows that colour correction considerably improves the matching process.

Figure 6. Results of the camera orientation process (pictures enhanced with the HIST method): sparse point cloud and 533 oriented cameras.

The data returned by Bundler (camera positions and camera parameters computed by a self-calibration procedure) and the undistorted images have been processed with PMVS2 (Patch-Based Multi-View Stereo) [26] in order to create a dense cloud of about 10 million points covering the whole site. This algorithm estimates the surface orientation while enforcing local photometric consistency, which is important for obtaining accurate models of low-textured objects or from images affected by blur due to the turbidity of the underwater environment. Furthermore, PMVS2 automatically rejects moving objects such as fishes and algae. The dense stereo matching algorithm implemented in PMVS2 receives, as inputs, an undistorted set of images and the 3 × 4 camera projection matrices computed by Bundler. The output is a coloured dense 3D point cloud. The PMVS2 parameters used to fine-tune the 3D reconstruction are the size of the correlation window and the level of the internal image pyramid used for the computation. In our experiment, a fixed correlation window with a size of 7 × 7 pixels was adopted, while the image resolution (image pyramid level) was chosen according to the results obtained through the variance analysis. Moreover, image triplets instead of pairs were used to increase the robustness of the reconstruction.

The 3D point cloud has been elaborated with Meshlab tools. The first operation was the manual selection and deletion of unwanted areas and outliers caused by the presence of underwater flora and fauna and by the bad visibility conditions. Then, a watertight surface with about 25 million triangles (Figure 7) was obtained through the Poisson Surface Reconstruction algorithm. The resulting surface has subsequently been decimated to a mesh of 6.5 million triangles and 3 million points in order to be handled more efficiently without losing details. Since the camera orientation procedure has been carried out with an unknown scale factor, it is necessary to scale the model by selecting two points with a known distance: in this experimentation, a scale bar has been placed in the scene and reconstructed in order to evaluate the scale factor.

Figure 7. Reconstructed surface.
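The point-cloud post-processing chain just described (outlier removal, Poisson reconstruction, decimation, scaling by a known distance) was carried out with Meshlab; the following sketch shows an equivalent scriptable version using the Open3D library. The file names, Poisson depth, and scale-bar measurements are illustrative assumptions, not values from the study.

```python
import open3d as o3d

# Load the dense cloud produced by the multi-view stereo step
# (hypothetical file name).
pcd = o3d.io.read_point_cloud("dense_pmvs2.ply")

# Statistical outlier removal stands in for the manual cleaning step.
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Poisson surface reconstruction needs oriented normals.
pcd.estimate_normals()
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=11)

# Decimate the watertight surface to a more manageable size.
mesh = mesh.simplify_quadric_decimation(target_number_of_triangles=6_500_000)

# Scale using a known distance measured on the reconstructed scale bar
# (example values: model units vs. metres).
measured, actual = 0.182, 0.500
mesh.scale(actual / measured, center=mesh.get_center())
o3d.io.write_triangle_mesh("villa_protiro_mesh.ply", mesh)
```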
The last step consists in the application of the texture to the 3D surface. Colour information can be extracted directly from the coloured point cloud, but this method does not allow for the creation of a high-quality texture, because its resolution depends on the point cloud density. Moreover, since the enhancement procedure is often performed to improve the feature extraction process (by increasing the contrast without taking into account the fidelity of the colours; usually single-component or greyscale images are used), the colour information stored in the pixels cannot be used. Since the camera positions are known, the texture mapping has been carried out by means of the projection and blending of high-resolution images directly onto the 3D surface. In particular, an image subset has been selected, because the averaging among neighbourhood values during blending works better if only a small overlapping area is present. This subset has been extracted from the images enhanced with the HIST method, which also gave the best results in terms of texture quality. This is mainly due to the manual retouching step, performed on a sample image and then exported to the whole dataset. The result of this procedure is a texture with a resolution comparable to the original images (Figure 8).

Figure 8. Final textured 3D model.

6. CONCLUSIONS

This paper has presented a performance analysis, based on a DOE approach, of the main influential factors that affect 3D reconstruction quality. The performance of three different colour enhancement algorithms, ACE, PCA, and HIST, has been evaluated by means of a variance analysis, including the effects of image resolution and colour channels. The results of the ANOVA analysis show that the factors EN (image enhancement method), PYR (image pyramid level), and SET (image set) are influential, with a confidence level of 95 %, for all the parameters, while the factor CH (colour channel) has shown a limited influence, since each enhancement method performs mixing operations among the channels. The ANOVA data allowed for choosing the best combination of factors to optimise the SfM bundle adjustment mean re-projection error, the number of extracted features, the number of oriented cameras, and the number of matched features, also taking the processing time into account. More precisely, the best combination is characterised by RGB images resized to 25 % and enhanced with the HIST method, which returns the most stable results. By using the results of the statistical analysis to correct and process the underwater images, it has been possible to align an unordered sequence of more than 500 images covering the entire site, whereas the original images could not be used to align all the cameras. Moreover, the corrected images allowed for the creation of a model mapped with a high-quality texture, comparable with the original images in terms of resolution and with a fair colour balance, since the whole dataset shares the same colour statistics. Even if these techniques have been used in other works related to underwater archaeology, this experiment represents a significant case study for verifying their robustness in the presence of strong turbidity and poor environmental conditions, providing useful guidelines for the accurate modelling of a submerged site.
ACKNOWLEDGEMENT

This work has been supported by the iMARECulture project, which has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 727153.

REFERENCES

[1] UNESCO, Convention on the Protection of the Underwater Cultural Heritage, 2 November 2001, http://www.unesco.org.
[2] G. Telem, S. Filin, Photogrammetric modelling of underwater environments, ISPRS Journal of Photogrammetry and Remote Sensing 65, 5 (2010), pp. 433-444.
[3] F. Menna, P. Agrafiotis, A. Georgopoulos, State of the art and applications in archaeological underwater 3D recording and mapping, Journal of Cultural Heritage 33 (2018), pp. 231-248.
[4] F. Bruno, A. Lagudi, L. Barbieri, D. Rizzo, M. Muzzupappa, L. De Napoli, Augmented Reality visualization of scene depth for aiding ROV pilots in underwater manipulation, Ocean Engineering 168C (2018), pp. 140-154.
[5] D. C. Montgomery, Design and Analysis of Experiments, John Wiley & Sons, New York, 2017, ISBN: 978-1-119-11347-8.
[6] iMARECulture project, http://www.iMARECulture.eu
[7] F. Bruno, A. Lagudi, G. Ritacco, J. Cejka, P. Kouril, F. Liarokapis, P. Agrafiotis, D. Skarlatos, O. Philpin-Briscoe, E. C. Poullis, Development and integration of digital technologies addressed to raise awareness and access to European underwater cultural heritage. An overview of the H2020 iMARECulture project, Proc. of the MTS/IEEE Conference Oceans'17, Aberdeen, UK, 19-22 June 2017.
[8] D. Skarlatos, P. Agrafiotis, T. Balogh, F. Bruno, F. Castro, B. D. Petriaggi, S. Demesticha, A. Doulamis, P. Drap, A. Georgopoulos, Project iMARECulture: Advanced VR, iMmersive serious games and augmented reality as tools to raise awareness and access to European underwater cultural heritage, Proc. of the International Conference on Cultural Heritage, Nicosia, Cyprus, 1-5 November 2016.
[9] A. Mahiddine, J. Seinturier, D. Peloso, J. M. Boï, P. Drap, D. Merad, Underwater image pre-processing for automated photogrammetry in high turbidity water, Proc. of VSMM 2012, pp. 189-194.
[10] R. Schettini, S. Corchs, Imaging for underwater archaeology, American Journal of Field Archaeology 27, 3 (2000), pp. 319-328.
[11] Y. Y. Schechner, N. Karpel, Recovery of underwater visibility and structure by polarization analysis, IEEE Journal of Oceanic Engineering 30, 3 (2005), pp. 570-587.
[12] E. Trucco, A. T. Olmos-Antillon, Self-tuning underwater image restoration, IEEE Journal of Oceanic Engineering 31, 2 (2006), pp. 511-519.
[13] A. Rizzi, C. Gatta, From Retinex to Automatic Color Equalization: issues in developing a new algorithm for unsupervised color equalization, Journal of Electronic Imaging 13 (2004), pp. 75-84.
[14] M. Chambah, D. Semani, A. Renouf, P. Courtellemont, A. Rizzi, Underwater color constancy: enhancement of automatic live fish recognition, Proc. of the 16th Annual Symposium on Electronic Imaging, United States, 2003, 5293, pp. 157-169.
[15] F. Petit, Traitement et analyse d'images couleur sous-marines: modèles physiques et représentation quaternionique [Processing and analysis of underwater colour images: physical models and quaternionic representation], PhD thesis, Sciences et Ingénierie pour l'Information, Poitiers, 2010.
[16] S. Bazeille, I. Quidu, L. Jaulin, J. P. Malkasse, Automatic underwater image pre-processing, Proc. of CMM'06 - Caractérisation du Milieu Marin, 2006.
[17] K. Iqbal, R. Abdul Salam, A. Osman, A. Z. Talib, Underwater image enhancement using an integrated colour model, IAENG International Journal of Computer Science 32, 2 (2007), pp. 239-244.
[18] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool, Speeded-up robust features (SURF), Computer Vision and Image Understanding 110 (2008), pp. 346-359.
[19] R. Kalia, K.-D. Lee, B. V. R. Samir, S.-K. Je, W.-G. Oh, An analysis of the effect of different image pre-processing techniques on the performance of SURF: Speeded Up Robust Feature, Proc. of the 17th Korea-Japan Joint Workshop on Frontiers of Computer Vision (FCV), Ulsan, South Korea, 9-11 February 2011, pp. 1-6.
[20] M. Mangeruga, F. Bruno, M. Cozza, P. Agrafiotis, D. Skarlatos, Guidelines for underwater image enhancement based on benchmarking of different methods, Remote Sensing 10, 10 (2018), article 1652, pp. 1-27.
[21] A. Tonazzini, E. Salerno, M. Mochi, L. Bedini, Blind source separation techniques for detecting hidden texts and textures in document images, Image Analysis and Recognition, Lecture Notes in Computer Science 3212 (2004), pp. 241-248.
[22] D. P. Mitchell, A. N. Netravali, Reconstruction filters in computer graphics, Computer Graphics 22, 4 (1988), pp. 221-228.
[23] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60, 2 (2004), pp. 91-110.
[24] Z. Zhang, A flexible new technique for camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 11 (2000), pp. 1330-1334.
[25] Bundler software, http://www.cs.cornell.edu/~snavely/bundler
[26] Y. Furukawa, J. Ponce, Accurate, dense, and robust multi-view stereopsis, IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 8 (2010), pp. 1362-1376.