Polymorphic Protocols for Fighting Bots


Electronic Communications of the EASST
Volume 080 (2021)

Conference on Networked Systems 2021
(NetSys 2021)

Polymorphic Protocols for Fighting Bots

August See

5 pages

Guest Editors: Andreas Blenk, Mathias Fischer, Stefan Fischer, Horst Hellbrueck, Oliver
Hohlfeld, Andreas Kassler, Koojana Kuladinithi, Winfried Lamersdorf, Olaf Landsiedel, Andreas
Timm-Giel, Alexey Vinel

ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122

http://www.easst.org/eceasst/


ECEASST

Polymorphic Protocols for Fighting Bots

August See

richard.august.see@uni-hamburg.de
Universität Hamburg, Germany

Abstract: Web Robots (bots) that automate communication with a service on the
Internet via their API are efficient and easy to scale. A large number of bots leads
to significant losses for providers and can frustrate users of social media, games or
online stores. Existing solutions such as CAPTCHAs or complex registrations either
frustrate users or are easy to circumvent. Current solutions that make it difficult to
create bots are only effective for the first bot. Once the first bot is created, it can be
easily duplicated to build an army of bots. This paper presents an approach inspired
by polymorphic malware and censorship resistance to change this. Each client that
communicates with a service does so by using its own application protocol that is
syntactically different but not semantically. Thus, a bot creator is forced to either
find a way to automatically extract the whole application protocol from a client or
to reverse engineer a new protocol for each bot that is created.

Keywords: bots, protocol obfuscation, reverse engineering

1 Introduction

Services on the Internet are more and more interconnected. They communicate and exchange
data. For example, price comparison portals automatically retrieve data from a number of dif-
ferent third party services. For other services, like social media, or games, an automated usage
is often not wanted, since this might affect the satisfaction of human users. In online dating for
example, probably no one wants to chat and meet with an automated program.

There are several ways to automate [ASLN20]. One approach is to automate the use of the
user interface where a bot presses buttons or writes in text fields just like the user. This approach
is inefficient because each new bot needs its own user interface. This is especially expensive for
applications that run natively (desktop or mobile) and do not have a web application. When the
applications can not be run in parallel even a new instance of the operating system needs to be
launched. A much more efficient way is to communicate directly with the service via its API,
sticking to the application protocol. Here a simple script is enough. Neither a user interface is
needed nor a specific operating system or a browser instance. The script can simply be executed
multiple times to spawn a multitude of bots e.g., for nation wide spamming and influencing
voting opinions in social media [BF16]. This is why this paper focuses on automation via the
API of a service.

The main problem of internet services is that automation cannot be prevented, it can only be
made more difficult. If a human user can use a service, it can also be automated. Currently mainly
CAPTCHAs coupled with risk assessment approaches are used to combat bots. Depending on
the calculated risk that a user is a bot, a more difficult CAPTCHA to no CAPTCHA at all is

1 / 5 Volume 080 (2021)

mailto:richard.august.see@uni-hamburg.de


Polymorphic Protocols for Fighting Bots

presented [Liu18]. However, the effectiveness of CAPTCHAs in general decreases with the ad-
vancement in machine learning [SPK16, AA20]. Another method to combat bots is to make the
creation as difficult as possible. This can be done by using obfuscation or anti-reverse engineer-
ing techniques. These techniques increase the cost required to create one initial bot. However
they do not affect the cost required to simply duplicate the first initial bot. Simultaneously, this
is what makes bots so dangerous: the ability to build many bots quickly and cost-effectively.

The main contribution of this paper is a novel approach that increases the labor and compu-
tational cost of duplicating a bot. This is done by using so-called polymorphic protocols which
are inspired by polymorphic malware and censorship resistance. The idea is that each client that
communicates with a service does so with its own application protocol. The protocol can be con-
sidered as an identifier to distinguish between clients. This effectively eliminates the convenient
way to simply duplicate a bot. The duplicated bots would all share the same protocol and can
therefore all be detected together and excluded from further communication with the service.
This increases the cost of bot creators to create multiple different bots. Bot creators must now
either:

1. Manually extract the respective protocol from clients again and again. Ideally the effort
required for this is close to the effort to build the first bot.

2. Find a way to automatically extract the respective protocol from clients.

It is important to note that the approach is not about making it harder to create a bot or even to
make an application or service bot secure. It is a proactive obfuscation approach that restricts the
cost-effective duplication of bots. Interoperability between different services or legitimate bots
can still be ensured e.g., by giving the creators an API key after thorough verification.

The rest of the paper is structured as follows. Section 2 discusses other approaches that make it
harder to create bots. Section 3 explains how polymorphic protocols can be created and applied,
and proposes possible research questions. Finally, Section 4 concludes the paper.

2 Related Work

The word polymorphism comes from biology and refers to different forms (morphs) within the
population of one species [HGE76]. In the field of computer science this term is often used in
the area of computer malware that evades signature detection due to that each instance of the
same malware is different in its form [YY10]. Polymorphic protocols are mostly mentioned
in literature in connection with censorship resistance and privacy [WDA+15]. A similar idea
to polymorphic protocols is used by the authors of [RS10] which aim to achieve less shared
vulnerabilities in server replicas by transforming the code while keeping the semantic.

Depending on the application there are different ways to make the creation of a bot more diffi-
cult. A common way is that some sort of identification is required for interacting with the service
e.g., by registering. Depending on the information required for registering, this process is more
or less hindering the creation of bots. A registration where only some username and password
is required is less secure than a form where in addition a valid Email is required, which is less
secure than when a valid phone number is required or even a valid document e.g., a passport.
However, a registration needs time and may disturb user experience [GN16]. Furthermore, not

NetSys 2021 2 / 5


ECEASST

Proctocol Generator

Fixed Secret SeedDynamic Identifier

Formal Protocol Specification Custom Protocol

Public

Private

Could be public or private

Figure 1: Polymorphic Protocol Generation Overview

every user might be willing to share a phone number or even documents with some service. An-
other way is to make the protocol itself more resilient to analysis. A common way is to encrypt
and authenticate the protocol in addition to TLS1. When reconstructing the protocol one does
not only need to know the specification of the protocol, but also the application key(s) to en-
crypt and authenticate messages. Those application keys are then protected, using anti-reverse
engineering techniques, like obfuscation [GTG07]. However, the protection is not ideal. On a
controlled device those techniques only increase the cost of the bot creator in obtaining the key
but they do not prevent it. Even if the key would be different and hidden in different parts for
each application, at some point the key has to be used. One of these points is when a message
needs to be encrypted or authenticated. There the key can be extracted automatically e.g., by
using hooks. Than a bot can be duplicated again without much cost.

3 Polymorphic Protocols

The objective is to limit the cost-effective duplication of bots. This is achieved by using a so-
called polymorphic protocol. The creation of one bot is as difficult as without a polymorphic
protocol. However, the creation of further bots should then be similarly difficult. The main as-
sumption is that compared to automatically extracting application keys, the extraction of a whole
communication protocol is a much greater effort. This assumption is supported by literature
[NSC15, NBFS06, LJLW13] but not confirmed. A polymorphic protocol could also encrypt
messages. As a result the automatic extraction of such a protocol can become at least as difficult
as the extraction of an application key. The challenge is to estimate how much more demanding
it is.

3.1 Functionality and Approach

The basic functionality is shown in Figure 1. The Protocol Generator takes a Formal Protocol
Specification as well as a Dynamic Identifier and Fixed Secret Seed as input, and outputs a Cus-
tom Protocol, i.e., bound to the Dynamic Identifier. The Dynamic Identifier can be a device id,
a username or some nonce, depending on the application. The Fixed Secret Seed is there so that
if all components are public, it is not possible to create a valid custom protocol without knowing
this Secret Seed. The Formal Protocol Specification is needed for the Protocol Generator to ran-
domize the protocol, while making sure that any mandatory semantic dependencies of messages

1 https://github.com/see-aestas/SINoALICE-API, https://github.com/see-aestas/JodelApi

3 / 5 Volume 080 (2021)

https://github.com/see-aestas/SINoALICE-API
https://github.com/see-aestas/JodelApi


Polymorphic Protocols for Fighting Bots

are preserved. The Protocol Generator can randomize the order, transmission method, encoding
and other features of the protocol or even inject extra dummy bytes, creating a custom protocol.
The Custom Protocol is then merged with a client application e.g., website, smartphone app, or
desktop app. This means that each client has a different protocol for communicating with the
same service. A challenge for this usage is to ensure that a custom protocol can not easily be ex-
tracted from a binary, in contrast to the usage of polymorphic protocols in censorship resistance.
A combination with additional encryption on the custom protocol and the use of application
packers like UPX2 can amplify the effort of extracting a protocol from an application.

Depending on the application, a custom protocol can be assigned once to each client, or the
same protocol to a group, or a new protocol every other day. However, this requires significant
effort in terms of computing time and transmission.

3.2 Arising Research Questions

Possible research questions are:

1. How much additional cost does the use of polymorphic protocols cause for an attacker
who wants to set up multiple bots for a service?

2. How much additional cost does the use of polymorphic protocols cause for a service
provider and how does the approach scale?

The evaluation should include a comparison with methods where only an application key is
used to secure the protocol against bots. However, evaluating the first research question is chal-
lenging in some aspects. It is hard to quantify the cost of extracting a protocol (automatically)
for a generic attacker who could use any method. One possibility is to examine this question
argumentatively on the basis of empirical research in this area. Another possibility would be to
exemplarily apply different automatic protocol extraction techniques on a client with polymor-
phic protocol.
The second question is more straightforward to evaluate. Aspects such as application size, data
consumption, time and resources to create a custom protocol in comparison to non-polymorphic
protocol can be quantified. It is also possible to investigate which common technologies can only
be used to a limited extent due to the use of polymorphic protocols e.g., CDNs.

4 Conclusion

This paper presents an approach to make it harder to duplicate bots. It adopts the technique of
syntactically modification while maintaining the semantics of polymorphic malware and cen-
sorship resistance. The basic idea of the approach is to equip each application with a syntactic
different protocol. This obfuscation makes it very expensive for a bot creator to assemble an
army of bots because it is either necessary to find a way which extracts the protocol automati-
cally or to extract each protocol manually.
In future work such a polymorphic protocol will be implemented and evaluated. In addition, it
will be tested whether the findings can be applied to censorship resistance.
2 https://upx.github.io/

NetSys 2021 4 / 5

https://upx.github.io/


ECEASST

Bibliography

[AA20] F. H. Alqahtani, F. A. Alsulaiman. Is image-based CAPTCHA secure against at-
tacks based on machine learning? An experimental study. Computers & Security
88:101635, 2020.

[ASLN20] B. Amin Azad, O. Starov, P. Laperdrix, N. Nikiforakis. Web runner 2049: Evaluat-
ing third-party anti-bot services. In Proceedings of the 17th DIMVA. 2020.

[BF16] A. Bessi, E. Ferrara. Social bots distort the 2016 US Presidential election online
discussion. First monday 21(11-7), 2016.

[GN16] R. Gafni, I. Nagar. CAPTCHA–Security affecting user experience. IISIT 13:063–
077, 2016.

[GTG07] M. N. Gagnon, S. Taylor, A. K. Ghosh. Software protection through anti-debugging.
IEEE Security & Privacy 5(3):82–84, 2007.

[HGE76] P. W. Hedrick, M. E. Ginevan, E. P. Ewing. Genetic polymorphism in heterogeneous
environments. Annual review of Ecology and Systematics 7(1):1–32, 1976.

[Liu18] W. Liu. Introducing reCAPTCHA v3: the new way to stop bots. https://developers.
google.com/search/blog/2018/10/introducing-recaptcha-v3-new-way-to, 2018. Ac-
cessed: 2021-05-20.

[LJLW13] M. Liu, C. Jia, L. Liu, Z. Wang. Extracting sent message formats from executa-
bles using backward slicing. In 2013 Fourth International Conference on Emerging
Intelligent Data and Web Technologies. Pp. 377–384. 2013.

[NBFS06] J. Newsome, D. Brumley, J. Franklin, D. Song. Replayer: Automatic protocol replay
by binary analysis. In Proceedings of the 13th ACM conference on Computer and
communications security. Pp. 311–321. 2006.

[NSC15] J. Narayan, S. K. Shukla, T. C. Clancy. A survey of automatic protocol reverse
engineering tools. CSUR 48(3):1–26, 2015.

[RS10] T. Roeder, F. B. Schneider. Proactive obfuscation. TOCS 28(2):1–54, 2010.

[SPK16] S. Sivakorn, I. Polakis, A. D. Keromytis. I am robot:(deep) learning to break seman-
tic image captchas. In 2016 IEEE EuroS&P. Pp. 388–403. 2016.

[WDA+15] L. Wang, K. P. Dyer, A. Akella, T. Ristenpart, T. Shrimpton. Seeing through
network-protocol obfuscation. In Proceedings of the 22nd ACM SIGSAC Confer-
ence on Computer and Communications Security. Pp. 57–69. 2015.

[YY10] I. You, K. Yim. Malware obfuscation techniques: A brief survey. In 2010 Interna-
tional conference on broadband, wireless computing, communication and applica-
tions. Pp. 297–300. 2010.

5 / 5 Volume 080 (2021)

https://developers.google.com/search/blog/2018/10/introducing-recaptcha-v3-new-way-to
https://developers.google.com/search/blog/2018/10/introducing-recaptcha-v3-new-way-to

	Introduction
	Related Work
	Polymorphic Protocols
	Functionality and Approach
	Arising Research Questions

	Conclusion