AP_06_4.vp 1 Introduction Endoscopy is a minimally invasive diagnostic procedure that is used to give the physician a realistic impression of almost any part of the gastrointestinal tract or other organs inside the human body. During an endoscopic examination a flexible tube, which contains among other things a video cam- era and a lamp, is introduced into the human body. It has a steerable tip that enables the physician to change perspective and to ease navigation, e. g. through the colon. Another system for imaging the gastointestinal tract is a so-called „Camera in a Pill“ or „PillCam“ [1]. This is swallowed by the patient and travels through the gastrointestinal tract in a natural way. The images that are recorded by the camera are sent via a radio frequency transmitter to a data recorder, which is worn on a belt around the patient’s waist. Due to the frontal illumination, there are often light re- flections that are disturbing for the physician. In the case of classic endoscopy, the physician may change the perspective by turning the tip of the endoscope to avoid these reflections. In the case of a camera-in-pill examination, however, there is no way to force the pill to move to a better position. There- fore, these reflections need to be removed in a different manner. The algorithm we are going to present may also be applied to classic endoscopy for the physician’s convenience. 2 Methods The algorithm that we apply is used in image communica- tions to conceal image data corrupted by transmission errors [2] or in the restoration of x-ray images with defective detec- tor elements [3]. It models the defective image as a pointwise product of the undisturbed image and a known binary defect map that contains ones where the image is correct and zeros in the case of a defective pixel. This pointwise product in the spatial domain corresponds to a convolution in the spectral domain. To restore the image, a spectral deconvolution is performed. In case of endoscopy, we first perform a segmentation by thresholding in the YUV color space and then we treat the pixels that belong to light reflections as defective pixels, setting them to zero, and then we apply the defect interpola- tion algorithm. 2.1 Segmentation of reflections Before the spectral deconvolution algorithm is applicable, a map has to be generated that contains ones where the image is undisturbed and zeros where specular reflexions occur. These reflections are very bright, so the affected pixels show high luminance in the YUV color space. Thus, the im- age is transferred from RGB into the YUV color space by applying the linear transform Y U V � � � � � � � � � � � 0 299 0 587 0114 0147 0 289 0 436 0 61 . . . . . . . 5 0 515 0100 � � � � � � � � � � � � � � � � � � � �. . R G B (1) where the Y component contains luminance and U and V contain chrominance. Fig. 1 shows a typical histogram of the luminance channel of an endoscopic image. On the right hand side, there is a small peak that corresponds to pixels that contain specular reflections. We now segment these reflections by thresholding with a value that is just lower than the left edge of this hill. Therefore, we low pass filter the histogram and find the posi- tion beginning from the right hand side where the derivative changes from a significant positive value to a value near zero. Since there are often dark rings around the affected pixels in the reconstructed image, and these are also very disturb- ing, we enlarge the segmented areas by an erosion operation. Note that we here use erosion instead of dilation to en- large the segments, since we have a black segment on a white background. 32 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 46 No. 4/2006 Removal of Specular Reflections in Endoscopic Images T. Stehle During an endoscopic examination, pictures from the inside of the human body are displayed on a computer monitor. Disturbing light reflections are often visible in these images. In this paper, we present an approach for removing these reflections and replacing them by an estimate obtained using a spectral deconvolution algorithm. Keywords: endoscopy, light reflection removal, spectral deconvolution, defect interpolation. Fig. 1: Histogram of the luminance channel of an endoscopic image containing specular reflections. The x axis shows the luminance value, the y axis shows the number of occurrences 2.2 Spectral deconvolution For ease of notation, we describe the univariate case. The results can be easily generalized to the bivariate case. The observed image g(n) is modeled as g n f n w n( ) ( ) ( )� G k N F k W k N F l W k l l N ( ) ( ) ( ) ( ) ( ) � � � � � 1 1 0 1 where 0 �n k N, , f(n) is the undisturbed image, and w(n) a binary window, with w(n) � 0 if the corresponding pixel n in g(n) belongs to a reflection and w(n) � 1 otherwise. G(k), F(k) and W(k) are the DFT spectra of g(n), f(n) and w(n), respectively. Both f(n) and g(n) are real valued, hence, G(k) � G*(N k) and F(k) � F*(N k). Our goal is to estimate F k f n( ) ( )DFT� ��� for 0 �n k N, . Let us select a spectral line pair G(s) and G(N s) of G(k). If F(k) consisted only of two lines at s and N s, i.e. F k F k F s k s F N s k N s( ) �( ) �( ) ( ) �( ) ( )� � � �� � (3) convolution with the window spectrum W(k) would yield for the observed pair � �G s N F s W F s W s G s N F s W ( ) �( ) ( ) � ( ) ( ) ( ) � ( ) ( ) * * * * � � � 1 0 2 1 0� �� �( ) ( ) ,*F s W s2 (4) where �( )F s and �( ) � ( )*F N s F s � are the estimated coeffi- cients. From (4), �( )F s can be found to �( ) ( ) ( ) ( ) ( ) ( ) ( ) * F s N G s W G s W s W W s � 0 2 0 22 2 . (5) The estimated spectrum �( )F k would then be given by (3). Generally, F(k) consists of more than two spectral lines. The error after deconvolution (5) at s and N s in the spectral do- main is given by G k G k N F k W k G k N F s W k s F ( ) * ( ) ( ) �( ) ( ) ( ) �( ) ( ) � 1 1 1 � � � �� �( ) ( )s W k s � (6) where G(1)(k) is the spectrum of the window internal differ- ence image � �g n g n f n w n w n f n f n( )( ) ( ) �( ) ( ) ( ) ( ) �( )1 � � . (7) Clearly, the spectral error G k( )( )1 0� at the selected line pair k � s and k � N s. Given s and N s, the estimates �( )F s and �( )F N s are optimal in the minimum mean square error (MMSE) sense if the energy Eg of g (1)(n) is minimized. Using Parseval’s theorem, Eg can directly be evaluated in the spectral domain according to � �E g n N G k N E g n N k N G � � � � � � �( ) ( )( ) ( ) . 1 2 0 1 1 2 0 1 1 1 (8) This provides MMSE estimates for the sought frequency coefficients. If the selected line pair �( )F s and �( )F N s is dominant in the sense specified below, its convolution with W(k) tends to “hide” other, less dominant spectral coefficients of F(k). This influence is removed by (6), so that another line pair can be selected from G(1)(k), estimated and subtracted. This leads to the following iteration for spectral deconvolution: � Initialization: � ( )( )F k0 0� , G k G k( )( ) ( )0 � , i � 1. � i-th iteration step: Select a pair of spectral coefficients G si i( ) ( )( ) 1 , G N si i( ) ( )( ) 1 out of G ki( )( ) 1 . � Estimate �( )( )F s i , �( )( )F N s i according to (5) such that G s G N si i i i( ) ( ) ( ) ( )( ) ( )� � 0, i.e. � �G s N F s W F s W s G i i i i i i ( ) ( ) ( ) * ( ) ( ) *( ( ) �( ) ( ) � ( ) ( ) � �1 1 0 2 � �1 1 0 2) ( ) * ( ) ( ) * ( )( ) � ( ) ( ) �( ) ( )s N F s W F s W si i i i� � (9) � � � �( ) ( ) ( ) ( ) ( ) ( ( ) ( ) ( ) ( ) ( ) * ( ) F s N G s W G s W s W i i i i i i � 1 10 2 0 22 2 ) ( ) . ( ) W s i (10) Since we seek to minimize the error (8), the line pair we should select in the ith iteration is that which maximizes the energy reduction � �� E iF s�( )( ) , which can be calculated to � �� E i i i i i F s N F s W k s F s W k s �( ) �( ) ( ) � ( ) ( ) ( ) ( ) ( ) * ( ) ( ) � � � 1 2 2 0 1 k N � � (11) which can be rewritten to �E i i i i i N F s F s W k s F s W k s � � � � 1 2 2 � ( ) �( ) ( ) � ( ) ( * ( ) ( ) ( ) * ( ) ( ) * ( ) ( ) * ( ) ) ( ) �( ) � ( ) ( W k s N F s F s W k i k N i i � �� � �� � � � 0 1 2 1 s F s W k s W k si i i i k N ( ) ( ) ( ) * ( )) �( ) ( ) ( ) . 2 0 1 � � � �� � ��� � (12) For a binary window, we have w n w n2( ) ( )� . Inserting this into Parseval’s theorem, we obtain W k s W k s W k N i k N i k N k N ( ) ( ) ( ) ( ) ( )� � � � � � � � � � 2 0 1 2 0 1 2 0 1 w n N W n N ( ) ( ) � � � 0 1 0 (13) © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 33 Acta Polytechnica Vol. 46 No. 4/2006 D F T � � � � and W k s W k s N w n j p s N n i i k N i ( ) ( ) ( ) exp ( ) * ( ) ( ) � � � � � � � � 0 1 2 2 � � � � � n N iN W s 0 1 2( ).( ) (14) Inserting (13) and (14) into (12) yields � �� E i i i i i iF s G s F s G s� � � ( ) ( ) �( ) ( )* ( ) ( ) ( ) ( ) *( ) ( )1 1 . (15) Since we know how to estimate �( )( )F s i optimally (10), � ( )* ( )F s i can be eliminated, thus expressing the error energy reduction depending on the available error spectrum line G ki s i( ) ( )( ) 1 as � E i i i i N W W s G s W G s � ( ) ( ) ( ) ( ) Re ( ( ) ( ) ( ) ( ) 0 2 2 0 2 2 2 1 2 1� �( ) ( )) ( ) .i iW s 2 2 �� � � � � � � � � ! (16) Selecting the best line pair in each iteration hence implies finding s such that �E according to (16) is maximum. Because of the symmetry of the DFT spectra of real valued signals, it suffices to search over only half the coefficients of G ki( )( ) 1 , i.e. from k � 0 to k � N/2, to find s(i). �( ) ( ) ( )( ) ( ) ( )F s N W G si i i� 0 1 . (17) Similarly, the calculation of the error energy reduction �E according to (16) modifies to � E i iN W G s� 2 0 1 2 ( ) ( )( ) ( ) . (18) Clearly, when selecting only a single line, the error en- ergy reduction depends only on the modulus spectrum | ( )|( ) ( )G si i 1 , apart from a constant factor. To save computa- tional expense in the selection step, a simplified approach is to always select s(i) such that| ( )|( ) ( )G si i 1 is maximum, regard- less of whether a single line is tested, that is, s i( ) � 0 or s Ni( ) � 2, or a line pair. Practically, the outcome of the itera- tion remains almost unchanged. When the estimated spectrum � ( )* ( )F s i contains as many lines as there are samples of f(n) inside the window w(n), the remaining error energy EG vanishes. The backtransformed estimate � ( )( )f ni is then identical to g(n) inside the observation window, and contains extrapolated information outside. In practice, the iteration is stopped when �E falls below a pre- specified level �, or when a maximum number of iterations is reached. To achieve high spectral resolution for the interpolated signal, it is often reasonable to apply zero-padding to g(n) and w(n) before transforming and starting the iteration. An illus- tration of the entire recursion is given in Fig. 2. 3 Results The algorithm as described above was applied to several endoscopic images. In our experiments, we used a 9×9 circu- lar structure element for erosion of the binary mask image, and we independently applied 100 iterations of the deconvo- lution algorithm to each channel of the YUV color space. Fig. 3 and Fig. 4 show two examples with the original image and a processed version of the image. The specular reflections are removed and the holes are filled with texture that does not disturb the overall impression of the image. However, the results are not perfect. In Fig. 3, several ves- sels are affected by specular reflections in the lower right corner, and in the processed image some of these vessels ap- pear interrupted. In Fig. 4, block artifacts were added to the image in areas where a large number of specular reflections occurred. In some cases, the specular reflections itselves were surrounded by green, pink and yellow circles. These circles were correctly identified not being specular reflections and therefore they were not removed completely. This is not se- vere if the physician knows that they are caused by reflections. In the processed image these colors are visible, too, but the reflections have been removed, so the physician might think it is the real color of the tissue. The processed images have not yet been presented to a physician because of the drawbacks described above. How- ever, the results achieved so far are already very promising. 4 Discussion We have presented an approach for removing specular re- flections from endoscopic images by segmentation of the af- fected pixels and interpolation via a spectral deconvolution algorithm. The algorithm, however, needs further improvement. We are going to modify the segmentation of the specular reflec- tions such that the typical elliptical shape is incorporated into 34 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 46 No. 4/2006 Fig. 2: Illustration of the recursive spectral deconvolution algo- rithm. From the final spectral estimate � ( )( )F ki , the interpo- lated signal estimate is obtained by an inverse DFT. the segmentation process. A rejection criterion will be inte- grated so that the quantitiy of false positive segmentation re- sults is reduced. In addition, we will address the color disturbances in the processed images. 5 Acknowledgments I would like to thank Prof. Dr. T. Aach, Institute of Im- aging and Computer Vision, RWTH Aachen University, Germany, who supervised this project. I would also like to thank Prof. Dr. med. C. Trautwein, University Hospital Aachen, Germany, who kindly provided the endoscopic im- ages used in this paper. References [1] Iddan, G., Meron, G., Glukhovsky, A. Swain P.: Wireless Capsule Endoscopy, Nature, 2000, p. 405–417. [2] Kaup, A., Meisinger, K., Aach, T.: Frequency Selec- tive Signal Extrapolation with Applications to Error © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 35 Acta Polytechnica Vol. 46 No. 4/2006 Fig. 3: Endoscopic image of the colon with specular reflections (left) and processed image (right). The overall impression of the pro- cessed image is better, but some artifacs have been added to the image. The vessels in the lower right corner appear blurred, and some of them interrupted Fig. 4: Endoscopic image of the esophagus with specular reflections (left) and the processed image (right). In areas with a large number of specular reflections block artifacts have been added to the image. Concealment in Image Communications. International Journal of Electronic Communication (AE), Vol. 59 (2005), p. 147–156. [3] Aach, T., Metzler, V.: Defect Interpolation in Digital Ra- diography – How Object-Oriented Transform Coding helps. SPIE Medical Imaging, Proceedings of SPIE 4322, 2001, p. 824–835. Thomas Stehle e-mail: Thomas.Stehle@lfb.rwth-aachen.de Institute of Imaging & Computer Vision RWTH Aachen University D-52056 Aachen, Germany 36 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 46 No. 4/2006