Acta Polytechnica CTU Proceedings 2:8–14, 2015, doi:10.14311/APP.2015.1.0008
© Czech Technical University in Prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/app

ON FPGA BASED ACCELERATION OF IMAGE PROCESSING IN MOBILE ROBOTICS

Petr Čížek∗, Jan Faigl
Department of Computer Science, Faculty of Electrical Engineering, CTU in Prague, Technická 2, 166 27 Prague, Czech Republic
∗ corresponding author: petr.cizek@fel.cvut.cz

Abstract. In visual navigation tasks, the lack of computational resources is one of the main limitations of micro-robotic platforms deployed in autonomous missions. This is because most of the current visual navigation techniques rely on the detection of salient points, which is computationally very demanding. In this paper, an FPGA-assisted acceleration of image processing is considered to overcome the limitations of the computational resources available on board and to enable high processing speeds while lowering the power consumption of the system. The paper reports on a performance evaluation of the CPU-based and FPGA-based implementations of a visual teach-and-repeat navigation system based on the detection and tracking of FAST image salient points. The results indicate that even the computationally efficient FAST algorithm can benefit from a parallel (low-cost) FPGA-based implementation that has a competitive processing time but, more importantly, is more power efficient.

Keywords: FPGA, system-on-chip, image processing, FAST feature detector, visual navigation.

1. Introduction

One of the key indicators of the intelligent behaviour of a mobile robot is its ability to react promptly in complex situations and make fast and correct decisions regarding the robot's goals. On micro-robotic platforms, this is a very challenging task due to the limited computational resources. Typical examples of micro-robotic platforms are micro aerial vehicles (MAVs) [1], small legged robots [2], or robots used to study swarm intelligence [3]. They can be characterized by constrained dimensions and payload, which is directly related to the limited battery capacity. Altogether, these parameters determine the computational power available on board the mobile robot. A direct trade-off between the computational capabilities and the power consumption can be identified, characterizing the maximum time for which the robot can perform its mission.

This trade-off between computationally demanding and power-efficient decision-making techniques is especially recognizable in navigation algorithms that rely on computer vision methods based on processing a large amount of visual data. Salient point extractors are popular techniques of machine perception in mobile robot navigation tasks; however, not all available approaches are currently suitable for micro-robotic platforms due to the limited on-board computational resources. Moreover, a practical deployment of a mobile robot always demands real-time performance of the decision-making algorithms, which imposes further restrictions on the applicable approaches regarding the available resources.

Two fundamental approaches to dealing with the limited computational resources of micro-robotic platforms can be considered. The first approach is a simplification of the computationally demanding processing, e.g., by introducing approximations of the demanding method and simplifying the whole principle, which can be a daunting task as it may not always be possible.
The second approach is a utilization of optimized implementations for newly available features of modern processors, e.g., a dedicated optimization for special (typically SIMD, single instruction multiple data) instructions of new CPUs. Besides, dedicated co-processors can be utilized to speed up the computationally demanding calculations by massively parallel processing available on conventional graphics cards, e.g., using CUDA or OpenCL. Although modern graphics cards are able to provide superior peak performance, they also have high power consumption requirements (on the order of hundreds of Watts), and thus they are not suitable for micro-robotic platforms.

Therefore, as an alternative solution for parallel and power-efficient computations, it is more suitable to consider the Field-Programmable Gate Array (FPGA) technology to develop a custom architecture specifically designed for the particular computational task, which, in the end, can be very power and computationally efficient while also being small in dimensions and cheap to develop. However, FPGA-based solutions need a significantly different approach to an efficient implementation of the particular algorithms than a conventional CPU. Thus, the deployment time can be high compared to the gain from the implementation of a custom computational architecture for the FPGA.

In this paper, we report our results on a comparison of selected image processing techniques suitable for embedded computation of the visual navigation task implemented on a conventional on-board platform and a dedicated solution based on FPGA processing of the image data. In particular, we consider the detection of salient points in the image. The salient points are image patterns which differ from their local neighbourhood and are expected to be reliably and repeatably detectable, preferably invariant to camera viewpoint changes. Such features provide a mobile robot with a limited set of environmental anchor points which are then utilized for vision-based navigation.

Feature-based methods consist of three stages. The first stage is the detection of features to identify salient points in the image. Then, for each detected feature, a descriptor is calculated to describe the local image surroundings of the feature in order to distinguish individual features. The third stage is the process of establishing feature correspondences, which is called feature matching. The matching is based on a comparison of the feature descriptors to determine whether the features correspond to the same salient object already detected in the environment.

Regarding the computational complexity of the feature detection, we can classify the features into artificial ones, like blobs or patterns, and those naturally occurring in the environment. For field robotics, natural landmarks are more important; however, artificial landmarks are much easier to detect. One of the foundational feature detectors is the Scale-invariant feature transform (SIFT) algorithm [4], which is unfortunately computationally very demanding, and therefore, the Speeded-up robust features (SURF) algorithm has been proposed [5] to provide salient points with lower computational requirements. Although SURF is less demanding than SIFT, it can still be computationally restrictive on small platforms.
That is why researchers investigate other methods to detect and describe salient objects in the environment by a computationally efficient algorithm such as the FAST feature detector [6]. FAST is widely adopted by the robotics community for its low computational complexity. It has been deployed in several visual navigation methods, e.g., [7, 8], and it also provides a base for other feature extractors [9, 10].

Following our intention to have power- and computationally efficient image processing available on a mobile robot, we consider FAST a suitable algorithm for the visual navigation task computed on board. Therefore, we compare the performance of the FAST feature extraction implemented on a conventional embedded platform with a CPU based on the ARM architecture and a custom implementation of the FAST detector on an FPGA-based dedicated co-processor. The comparison indicates that even a computationally efficient feature extraction based on the FAST algorithm can benefit from a parallel FPGA-based implementation that has a competitive processing time and is more power efficient.

Figure 1. Hexapod walking robot platform.

The paper is organized as follows. An overview of the most related reports on computationally and power-efficient vision-based navigation approaches based on an FPGA co-processor is presented in Section 2. The proposed evaluation of the FPGA-based image processing is considered in the context of a monocular vision-based teach-and-repeat autonomous navigation [11] that is briefly described in Section 4 together with the main idea of the FAST feature detection and the selected BRIEF feature descriptor [12]. Results on the evaluation of the computational burden reduction using the embedded platform of the hexapod walking robot (see Figure 1) and the FPGA-based solution are presented in Section 5. Concluding remarks and future work are presented in Section 6.

2. Related work

This paper is focused on FPGA-based implementations of computer vision techniques that are utilized in mobile robot navigation tasks. Several approaches have been published whose authors reported improvements of the computational and power efficiency in vision-based navigation tasks by a dedicated implementation on an FPGA.

Probably one of the first FPGA-based implementations of the SURF feature extraction was introduced in [13], where an embedded module capable of image processing suitable for mobile robotics is presented. The module implements a SURF feature extraction [14] from images with the resolution of 1024×768 pixels. The reported frame rate is 15 fps, which can be considered low in comparison to the 24 fps GPU-based implementation of SURF; however, the power consumption is only 6 Watts. Notice that the FPGA implementation outperforms a pure CPU implementation running on an Intel Atom dual-core processor, which is able to process only a single frame per second.

Another FPGA-based module for stereo image processing has recently been introduced by the ETH Computer Vision and Geometry group [15]. It is based on an FPGA and an ARM Cortex A9 quad-core CPU and calculates dense disparity images with the resolution of 752×480 pixels at 60 frames per second. The disparity estimation relies on the semi-global matching approach, which is a computationally very intensive task. The reported power consumption of the module is 5 Watts.
In [16], the same authors extended their module by implementing a reactive collision-avoidance method for MAVs and tested it in an outdoor environment.

The authors of [17] presented a miniature module with a low-cost FPGA and MCU for visual navigation of MAVs. They utilize a reduced version of the PTAM algorithm [8] (mapping a maximum of 200 features due to memory limitations) for the estimation of the visual odometry. Their approach is based on the FPGA implementation of the FAST feature detection and the BRIEF [12] feature description. The MCU is utilized for the visual odometry calculation and the whole system operates at 30 frames per second on images with the resolution of 160×120 pixels.

In [18], the authors use the FPGA to synchronize data from the IMU and the FAST feature extraction from a camera stereo pair for robust inertial-assisted real-time visual SLAM. The reported update rate is about 20 frames per second for images with the resolution of 752×480 pixels. Similar results with a slightly different approach are presented in [19].

An embedded FPGA-based computer vision module has been presented in [20]. The proposed FPGA architecture uses efficient pipelining for the FAST feature detection in a video stream with the resolution of 752×480 pixels at the frame rate of 60 frames per second, which is utilized in the visual teach-and-repeat navigation method.

Regarding the physical dimensions of the aforementioned modules, they are all based on system-on-chip (SoC) solutions on FPGA development boards which (in all cases) do not exceed 12 cm × 8 cm. The power consumption of these modules is less than 10 Watts while all of them exhibit real-time performance in the particular robotic navigation task. The SoC architecture combines the advantages of the FPGA with a regular CPU in order to reduce deployment costs and improve the performance of the system. Its main aspects are described in Section 3.

All of the discussed approaches seem to be suitable choices for deployment on a micro-robotic platform. However, the most promising are the approaches ([17–20]) based on the FAST feature detector [6], because this detector is widely adopted by the robotics community due to its low computational complexity. Hence, it represents a base for further methods. For example, it underlies the parallel tracking and mapping (PTAM) method [8], which can be considered a visual monocular odometry (or SLAM – simultaneous localization and mapping) technique also suitable for MAVs, albeit the original PTAM method was designed to be used in small augmented reality workspaces. The method relies on tracking and mapping of FAST features for the 6DOF position estimation in the environment.

FAST has recently been utilized in the semi-direct monocular visual odometry [7], in which FAST features are tracked and used to build a map that is further used for the visual odometry estimation. The algorithm was tested on an MAV equipped with the Odroid-U2 ARM quad-core computer, achieving a processing speed of 50 frames per second. Despite the CPU implementation, the reported power consumption of the system is 10 Watts, which is competitive with the first FPGA module for the SURF detection introduced in [13].

Therefore, we consider the FAST feature detection on the ARM-based computer and the FPGA-based implementation [20] in this evaluation study of the potential benefits of the parallel (FPGA-based) implementation of the image processing for visual navigation.
3. Processor centric design

A combination of an FPGA and a CPU makes it possible to accelerate a computer vision application and also to carry out more complex tasks with the improved system. The main idea of combining these two types of computational architectures is the processor-centric system-on-chip co-design, which exploits the benefits of both architectures. Generally, a CPU is more suitable for general-purpose computations and it is also more easily programmable. However, the parallel nature of the FPGA fabric makes it well suited for accelerating demanding or repetitive computational operations performed by the CPU. Besides, it is also suitable for online signal processing and sensor interfacing [21].

There are two principal ways to incorporate both an FPGA and a CPU in a single embedded design. It is possible to have both of them soldered on the printed circuit board and utilize a communication channel like SPI for fast data transmission between them. However, it is much more efficient to utilize the system-on-chip (SoC) solution, where the CPU is a part of the FPGA chip. The CPU can be either softcore or hardcore. The FPGA fabric is extremely versatile, and it can therefore implement even a CPU, which is then called a softcore CPU. On the other hand, manufacturers have started to incorporate the CPU into the design of the chip to further exploit the FPGA–CPU co-design possibilities. In that case, the CPU is etched into the silicon of the chip together with the FPGA fabric, and such a solution is then referred to as a hardcore CPU.

Nowadays, hardcore processors are usually based on a multi-core ARM architecture and run at clock frequencies on the order of gigahertz. In the case of softcore processors, their speed is limited to around 200 MHz due to the FPGA fabric limitations. The main advantage of both hardcore and softcore processors is that they can be directly programmed in the C programming language. In addition, they can host a real-time operating system, which makes their programming for time-critical applications even simpler.

From the processor point of view, the FPGA fabric can act either as a memory-mapped slave device, or it can be directly connected to the CPU core to accelerate the processor instructions. In that case, the instruction set of the CPU is extended by custom-designed instructions.

The main advantage of FPGA-based SoC systems is their versatility and reconfigurability, which may be of particular use in the field of mobile robotics. The programmer can choose the components that assemble the system according to the needs of the particular application. The FPGA resources allow building both internal computational units and I/O communication interfaces, which can be directly connected to the main CPU. Moreover, FPGAs always possess many general-purpose I/O pins which can be used for any communication not exceeding the maximum frequency of the given FPGA chip (usually about 300–350 MHz), regardless of whether the communication is serial, like SPI and I2C, or parallel. In addition, specialized hardcore bus endpoints for high-speed communication are common integral parts of today's FPGAs, e.g., gigahertz serial endpoints like PCIe and SATA, or parallel DRAM interfaces.
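As a concrete illustration of the memory-mapped slave arrangement, the following minimal sketch shows how the CPU side might poll an FPGA co-processor through volatile pointers. The base address, register offsets, and bit layout are hypothetical placeholders chosen for the example; in a real design they are given by the particular interconnect (e.g., Qsys or AXI) configuration.

```cpp
#include <cstdint>

// Hypothetical register map of an FPGA feature-detector slave.
// The base address and offsets are illustrative only; they depend on
// how the slave is attached to the CPU bus in the concrete design.
static volatile uint32_t* const REG_STATUS =
    reinterpret_cast<volatile uint32_t*>(0x40000000u);
static volatile uint32_t* const REG_FEATURE =
    reinterpret_cast<volatile uint32_t*>(0x40000004u);

// Read one detected feature (packed x/y image coordinates) once the
// co-processor signals that its output FIFO is non-empty.
bool readFeature(uint16_t& x, uint16_t& y) {
    if ((*REG_STATUS & 0x1u) == 0)      // bit 0: FIFO non-empty (assumed)
        return false;
    const uint32_t packed = *REG_FEATURE;
    x = static_cast<uint16_t>(packed & 0xFFFFu);
    y = static_cast<uint16_t>(packed >> 16);
    return true;
}
```

In the custom-instruction alternative mentioned above, the same data exchange would instead appear as an additional CPU instruction, avoiding the explicit load/store round trips.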
4. Bearing-only Visual Navigation

The navigation algorithm used for the evaluation of the implementations of the FAST feature detector on the CPU and FPGA is based on the teach-and-repeat navigation algorithm proposed in [11]. The navigation is based on tracking previously mapped visual features that are used for bearing-only navigation. The approach relies on the relative localization provided by the odometry, which would eventually integrate an unbounded position error over time. However, the approach considers the odometry only locally for traversing a short straight line segment, along which detected visual landmarks are utilized to correct the robot heading and thus suppress the odometric error. The corrections of the heading are based on the mode of the horizontal displacements of the tentative correspondences between the previously mapped and currently perceived visual features. The mode is found by histogram voting, as visualized in Figure 2.

Figure 2. SURFnav navigation algorithm. Matched features from the current and previously learned image. The corresponding navigation histogram is depicted at the bottom.

The tentative correspondences are established using the FAST feature detector and the BRIEF feature descriptor [12], which are briefly described in the following subsections.

4.1. Features from Accelerated Segment Test

The FAST feature detector [6] belongs to the family of the so-called appearance-based detectors because it searches the image for corner-like structures. The detector is optimized with respect to the computational complexity. The algorithm uses a set of comparisons between the center pixel p and the pixels on a circle within its 7×7 neighbourhood, determined by Bresenham's circle algorithm, which is shown in Figure 3.

Figure 3. FAST feature detection. Courtesy of [6].

At first, the pixels on the circle are labelled dark or bright depending on their brightness relative to the central pixel. The candidate pixel is considered a feature if there are n contiguous bright or dark pixels on the circle. For n ≥ 12, it is possible to use a rapid rejection method for faster outlier rejection, which leads to the common setting of n = 12. However, the original paper [6] showed that the best repeatability is exhibited for the setting of n = 9. The corner score, which is necessary for the non-maxima suppression, is calculated as the sum of the absolute differences between the pixels in the contiguous arc and the central pixel.

The FPGA architecture presented in [20] uses efficient pipelining for the FAST feature detection acceleration. The image data are processed as an online stream and the architecture performs all the steps of the algorithm, i.e., the corner detection, corner score calculation, and non-maxima suppression, in parallel as the data are transmitted from the image sensor. The architecture also allows using different values of n without an impact on the computational speed or the used resources.
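To make the segment test concrete, the following minimal sketch checks one candidate pixel against the 16 pixels of the Bresenham circle. It is a plain CPU-side illustration under simplifying assumptions (8-bit grayscale image, candidate at least 3 pixels from the border); it omits the rapid rejection pre-test, the corner score, and the non-maxima suppression described above.

```cpp
#include <cstdint>

// Offsets of the 16 pixels on the Bresenham circle of radius 3 around
// the candidate pixel (its 7x7 neighbourhood), listed clockwise.
static const int CIRCLE[16][2] = {
    { 0,-3},{ 1,-3},{ 2,-2},{ 3,-1},{ 3, 0},{ 3, 1},{ 2, 2},{ 1, 3},
    { 0, 3},{-1, 3},{-2, 2},{-3, 1},{-3, 0},{-3,-1},{-2,-2},{-1,-3}
};

// Segment test: pixel (x, y) is a FAST corner if at least n contiguous
// circle pixels are all brighter than I(x,y) + t or all darker than
// I(x,y) - t.  The image is 8-bit grayscale, stored row by row with
// the given stride, and (x, y) is at least 3 pixels from the border.
bool isFastCorner(const uint8_t* img, int stride, int x, int y, int t, int n) {
    const int center = img[y * stride + x];
    int brightRun = 0, darkRun = 0;
    // Walk the circle twice so that runs wrapping past index 15 are found.
    for (int i = 0; i < 32; ++i) {
        const int* o = CIRCLE[i % 16];
        const int v = img[(y + o[1]) * stride + (x + o[0])];
        brightRun = (v > center + t) ? brightRun + 1 : 0;
        darkRun   = (v < center - t) ? darkRun + 1 : 0;
        if (brightRun >= n || darkRun >= n)
            return true;
    }
    return false;
}
```

In the streaming architecture of [20], these comparisons are evaluated in parallel within a sliding 7×7 window as the data arrive from the sensor, so the sequential loop above collapses into combinational logic.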
4.2. Binary robust independent elementary features

Once features are detected, the individual salient points are characterized by descriptors. The purpose of the descriptor is to describe the image neighbourhood of the point and thus characterize the salient object of the environment. In the proposed evaluation, the selected descriptor is the BRIEF binary feature descriptor [12]. It is based on pairwise intensity comparisons of pixels inside an image patch surrounding the located feature. These comparisons form a set of unique binary tests whose results are subsequently stored in a q-dimensional bit vector. The pairwise comparisons can be chosen either randomly or evolutionarily, e.g., they can be trained by methods of reinforcement learning [22], which optimize the descriptor for the particular environment. Typically used values of q are 64, 128, and 256 bits, which correspond to the BRIEF8, BRIEF16, and BRIEF32 variants of the feature descriptor, respectively. The descriptor similarity is evaluated using the Hamming distance, which counts how many bits of two given feature vectors differ.

5. Evaluation

The main motivation behind the utilization of FPGA-based systems in visual navigation tasks is the latency reduction in the robotic navigation system, which is especially noticeable in experiments performed in dynamic or confined environments that impose imminent threats to the mobile robotic platform [1, 23], and the reduction of the computational load, and thus the power consumption, of the mobile robotic platform.

This section presents the results of the experimental evaluation of the impact of the FPGA-based co-processor utilization on the reduction of the computational burden of the feature extraction process in the monocular vision-based teach-and-repeat autonomous navigation. The particular parts of the whole navigation system under evaluation are the feature extraction chain and the computationally most demanding parts of the navigation algorithm: the feature detection and description; the feature matching to the map previously created in the teach mode; the construction of the navigational histogram; and the determination of the robot heading correction based on the maximum peak in the histogram voting method.

For the feature extraction, the FAST feature detector [6] is used with the setting of n = 12, while the 256-bit BRIEF [12] (BRIEF32) is utilized as the feature descriptor. The number of features is set to approx. 200 features per image and the pre-learned map in the navigation pipeline evaluation contains 200 features.

The purely CPU-based implementation running on the Odroid U3 board [24] was compared to the FPGA-based SoC implementation on the Terasic DE0-nano board [25].

Figure 4. Testing environments: (a) outdoor urban environment, (b) indoor lab environment.

In particular, the hardware and software used in the evaluation are as follows:

Odroid U3 – A small embedded micro-computer suitable for robotic applications. It features a 1.7 GHz ARM Cortex A9 quad-core microprocessor (Samsung Exynos4412 Prime) running the Ubuntu 12.04 Linux operating system. The attached camera is the Mobius ActionCam with the resolution of 640×480 pixels at 60 fps. The implementation of the navigation algorithm was based on the Open Computer Vision library [26] (OpenCV 2.4.9) implementations of the FAST feature detector and the BRIEF32 descriptor.

DE0-nano board – An embedded module with the Altera Cyclone IV EP4CE22 FPGA equipped with the APTINA MT9V034 grayscale camera sensor with a global shutter and the resolution of 752×480 pixels, providing images at 60 fps. The implementation of the navigation algorithm follows the approach described in [20]. It utilizes the bare-metal programmed NIOS IIe softcore processor clocked at 150 MHz for the feature description, matching, and histogram voting, and a dedicated FPGA-based parallel co-processor for the FAST feature detection.
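For reference, the CPU-side processing chain can be outlined in a few lines against the OpenCV 2.4 C++ API used on the Odroid board. The sketch below is illustrative only: the function name, the FAST threshold, and the histogram bin width are assumptions, while the overall structure (FAST detection, BRIEF32 description, Hamming-distance matching, and histogram voting over horizontal displacements) follows the pipeline described in Sections 4 and 5.

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <cmath>
#include <map>
#include <vector>

// One navigation step on the CPU: detect FAST features in the current
// (grayscale) frame, compute BRIEF32 descriptors, match them to the
// pre-learned map by Hamming distance, and vote for the most frequent
// horizontal displacement, which serves as the heading correction.
float headingCorrection(const cv::Mat& frame,
                        const std::vector<cv::KeyPoint>& mapKeypoints,
                        const cv::Mat& mapDescriptors) {
    // The FAST threshold (30) and histogram bin width (10 px) are
    // illustrative values, not the ones used in the experiments.
    cv::FastFeatureDetector detector(30, true);
    cv::BriefDescriptorExtractor extractor(32);  // 32 bytes = 256 bits

    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;
    detector.detect(frame, keypoints);
    extractor.compute(frame, keypoints, descriptors);

    // Tentative correspondences by the Hamming distance on binary descriptors.
    cv::BFMatcher matcher(cv::NORM_HAMMING);
    std::vector<cv::DMatch> matches;
    matcher.match(descriptors, mapDescriptors, matches);

    // Histogram voting over the horizontal displacements of the matches.
    const float binWidth = 10.0f;
    std::map<int, int> histogram;
    for (size_t i = 0; i < matches.size(); ++i) {
        const float dx = keypoints[matches[i].queryIdx].pt.x
                       - mapKeypoints[matches[i].trainIdx].pt.x;
        histogram[static_cast<int>(std::floor(dx / binWidth))]++;
    }
    int bestBin = 0, bestCount = -1;
    for (std::map<int, int>::const_iterator it = histogram.begin();
         it != histogram.end(); ++it) {
        if (it->second > bestCount) { bestCount = it->second; bestBin = it->first; }
    }
    return (bestBin + 0.5f) * binWidth;  // mode of the displacements (bin centre)
}
```

On the DE0-nano, the same chain is split differently: the detection runs in the FPGA fabric, while the description, matching, and voting are performed by the NIOS IIe softcore.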
The proposed evaluation consists of a focused examination of the feature detector and descriptor, and an examination of the performance of the whole navigation system. The tests were performed in field conditions both in the outdoor urban (Fig. 4a) and indoor lab (Fig. 4b) environments.

First, we evaluated the performance of the FPGA-based and CPU-based implementations of the FAST feature detector and the computation of the BRIEF descriptor. Two criteria are considered in this evaluation: the correctness of the FPGA-based implementation and the computational requirements.

Regarding the correctness of the FPGA-based implementation, it provides results identical to the CPU-based counterpart for the detected features and also the same descriptors. This test was performed on artificially generated data in order to ensure the correctness of the results. The data consist of a checkerboard pattern with a grayscale gradient, which allows detecting a varying number of corners depending on the predefined threshold.

Concerning the computational requirements of the feature extractor, the Odroid implementation needs (on average) 17.07 ms for approximately 200 FAST feature detections with BRIEF32 descriptors in an image with the resolution of 640×480 pixels. The DE0-nano implementation exhibits similar performance with the processing time of 16.91 ms for 200 feature detections and descriptions.

The navigation stack performance benchmarking consists of the feature extraction, the feature matching against the pre-learned map, and the histogram voting method to establish the heading correction of the robot. The achieved results are summarized in Table 1.

Platform            | Odroid U3 | DE0-nano
CPU cores           | 4         | 1
CPU clock           | 1.7 GHz   | 150 MHz
FPGA                | –         | 22320**
FPGA usage          | –         | 28%
Feature extraction  | 17.07 ms  | 16.91 ms
Navigation pipeline | 35.89 ms  | 24.38 ms
Power consumption   | 8.13 W    | 1.82 W

** The number of available logic elements.

Table 1. Navigation pipeline processing results of the Odroid U3 and DE0-nano SoC platforms.

5.1. Discussion

The results presented in Table 1 indicate that the FPGA-based SoC approach does not significantly reduce the time required for the feature extraction; however, the power consumption is reduced by a factor of about 4.5 in comparison to the purely CPU-based implementation.

Regarding the update rate of the individual methods, the FPGA-based implementation is capable of processing every other image in the 60 fps camera stream, which gives about 30 processed images per second, while the CPU-based implementation can process 20 frames per second if we consider a uniform image retrieval; otherwise, the update rate of the CPU-based implementation is 27 fps, but images are dropped in a non-deterministic way, which leads to an uneven distribution of information in time.

The FAST feature detector of the DE0-nano implementation performs the feature detection online on the stream of camera data, which implies that the image coordinates of all salient points in the image are known during the readout of a single image from the camera sensor. The feature description and matching to the pre-learned map are performed at the instant the feature is detected in the camera stream. However, the 256-bit BRIEF32 descriptor calculation together with the matching to approx. 200 features from the pre-learned map on the 150 MHz single-core CPU makes it impossible to finish all the computations during the readout of a single image.
Therefore, every other frame in the 60 Hz camera stream is skipped, which allows the algorithm to finish all the calculations in less than 32 ms during the readout of the next image.

6. Conclusion

In this paper, we consider the problem of decreasing the computational load of the embedded computational resources utilized in vision-based navigation tasks. The presented results show that although the processing times of the FPGA-based and CPU-based implementations are almost similar, most likely due to the overall low complexity of the used feature extraction, the FPGA-based system-on-chip design can significantly reduce the power consumption in comparison to a purely CPU-based solution.

In order to further verify and quantify the results, we plan to thoroughly benchmark each part of the visual navigation algorithm stack and incorporate the FPGA-based image processing in the more complex navigation task of a visual SLAM, where the computational power of the FPGA platform can be of potential benefit.

Acknowledgements

The presented work has been supported by the Czech Science Foundation (GAČR) under research project No. 13-18316P.

References

[1] N. Michael, D. Mellinger, Q. Lindsey, V. Kumar. The GRASP multiple micro-UAV testbed. IEEE Robotics & Automation Magazine 17(3):56–65, 2010. doi:10.1109/MRA.2010.937855.

[2] D. Belter, P. Skrzypczyński, K. Walas, D. Wlodkowic. Affordable Multi-legged Robots for Research and STEM Education: A Case Study of Design and Technological Aspects. In Progress in Automation, Robotics and Measuring Techniques, vol. 351 of Advances in Intelligent Systems and Computing, pp. 23–34. Springer International Publishing, 2015. doi:10.1007/978-3-319-15847-1_3.

[3] F. Arvin, J. Murray, C. Zhang, et al. Colias: an autonomous micro robot for swarm robotic applications. International Journal of Advanced Robotic Systems 11(113):1–10, 2014. doi:10.5772/58730.

[4] D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60(2):91–110, 2004. doi:10.1023/B:VISI.0000029664.99615.94.

[5] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool. Speeded-up robust features (SURF). Computer Vision and Image Understanding 110(3):346–359, 2008. doi:10.1016/j.cviu.2007.09.014.

[6] E. Rosten, T. Drummond. Machine learning for high-speed corner detection. In ECCV 2006, vol. 1, pp. 430–443. doi:10.1007/11744023_34.

[7] C. Forster, M. Pizzoli, D. Scaramuzza. SVO: Fast semi-direct monocular visual odometry. In ICRA 2014, pp. 15–22. doi:10.1109/ICRA.2014.6906584.

[8] G. Klein, D. Murray. Parallel Tracking and Mapping for Small AR Workspaces. In Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 225–234. doi:10.1109/ISMAR.2007.4538852.

[9] S. Leutenegger, M. Chli, R. Siegwart. BRISK: Binary Robust invariant scalable keypoints. In ICCV 2011, pp. 2548–2555. doi:10.1109/ICCV.2011.6126542.

[10] E. Rublee, V. Rabaud, K. Konolige, G. Bradski. ORB: An efficient alternative to SIFT or SURF. In ICCV 2011, pp. 2564–2571. doi:10.1109/ICCV.2011.6126544.

[11] T. Krajník, J. Faigl, V. Vonásek, M. Kulich, et al. Simple yet Stable Bearing-only Navigation. Journal of Field Robotics, 2010. doi:10.1002/rob.20354.
[12] M. Calonder, V. Lepetit, C. Strecha, P. Fua. BRIEF: Binary robust independent elementary features. In ECCV 2010, pp. 778–792. Springer. doi:10.1007/978-3-642-15561-1_56.

[13] T. Krajník, J. Šváb, S. Pedre, et al. FPGA-based module for SURF extraction. Machine Vision and Applications 25(3):787–800, 2014. doi:10.1007/s00138-014-0599-0.

[14] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool. Speeded-up robust features (SURF). Computer Vision and Image Understanding 110(3):346–359, 2008. doi:10.1016/j.cviu.2007.09.014.

[15] D. Honegger, H. Oleynikova, M. Pollefeys. Real-time and Low Latency Embedded Computer Vision Hardware Based on a Combination of FPGA and Mobile CPU. In IROS 2014. doi:10.1109/IROS.2014.6943263.

[16] H. Oleynikova, D. Honegger, M. Pollefeys. Reactive Avoidance Using Embedded Stereo Vision for MAV Flight. In ICRA 2015, pp. 50–56. IEEE. doi:10.1109/ICRA.2015.7138979.

[17] R. Konomura, K. Hori. Visual 3D self localization with 8 gram circuit board for very compact and fully autonomous unmanned aerial vehicles. In ICRA 2014, pp. 5215–5220. doi:10.1109/ICRA.2014.6907625.

[18] J. Nikolic, J. Rehder, M. Burri, et al. A synchronized visual-inertial sensor system with FPGA pre-processing for accurate real-time SLAM. In ICRA 2014, pp. 431–437. doi:10.1109/ICRA.2014.6906892.

[19] G. Zhou, J. Ye, W. Ren, et al. On-board inertial-assisted visual odometer on an embedded system. In ICRA 2014, pp. 2602–2608. doi:10.1109/ICRA.2014.6907232.

[20] P. Čížek. Embedded module for image processing. Master's thesis, CTU, Faculty of Electrical Engineering, 2015.

[21] G. J. García, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, F. Torres. A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing. Sensors 14(4):6247–6278, 2014. doi:10.3390/s140406247.

[22] T. Krajník, P. De Cristóforis, M. Nitsche, et al. Image features and seasons revisited. In ECMR 2015, to appear.

[23] S. Lupashin, M. Hehn, M. W. Mueller, et al. A platform for aerial robotics research and demonstration: The flying machine arena. Mechatronics 24(1):41–54, 2014. doi:10.1016/j.mechatronics.2013.11.006.

[24] Hardkernel Co., Ltd. [online]. http://www.hardkernel.com, cited on 2015/08/17.

[25] Terasic, Inc. [online]. http://www.terasic.com.tw, cited on 2015/08/17.

[26] G. Bradski, A. Kaehler. Computer Vision with the OpenCV Library. O'Reilly Media, 2008.