Salminen, Jung & Jansen

 

 
ARE DATA-DRIVEN PERSONAS CONSIDERED HARMFUL? DIVERSIFYING USER UNDERSTANDINGS WITH MORE THAN ALGORITHMS
 

JONI SALMINEN, Hamad Bin Khalifa University & The University of Turku; SOON-GYO JUNG, Hamad Bin Khalifa University; and BERNARD J. JANSEN, Hamad Bin Khalifa University

 
ABSTRACT 

  In this work, we build on research on data-driven personas to present what 
might be “wrong with them”. From wrong assumptions by the client and wrong 
applications of methods to imbalanced, messy, or superficial data; a lack of 
communication regarding how these personas are created; and issues with usability, 
there are a plethora of issues that plague data-driven personas. We conclude by 
contemplating whether data-driven personas are even worthwhile and, if they are, 
what immediate remedies are required from the human-computer 
interaction community to make data-driven personas a viable tool for user 
understanding. 

KEY WORDS 

Personas, Data-Driven Personas, Harm 

 
INTRODUCTION 

The persona is a user-centred design (UCD) technique applied in human-computer interaction (HCI), user experience (UX), and 

other fields such as business and marketing. Personas, in these contexts, are defined as 

archetypes of user groups that share similar traits or behaviours (e.g., goals), and they are 

typically portrayed in the form of user profiles (Cooper 1999; Nielsen et al. 2015). 

Data-driven personas, supposedly, elevate manually created personas from low-tech 

design artefacts to high-tech user representations (Jansen et al. 2020). While data-driven 

personas can be defined in several ways (Jansen et al. 2020; McGinn & Kotamraju 2008; 

Miaskiewicz & Luxmoore 2017; Mijač et al. 2018; Zhang et al. 2016), we crystallize the 

contemporary definition as follows: a data-driven persona is a complete persona profile, created 

in a persona template using quantitative data about a given user population which is analysed 

using statistical techniques, including data science, and machine learning algorithms. This 

definition, as can be seen, is heavily rooted in “quantitative data” and “algorithms,” to which we 

will return later. 

It is standard that the research articles on data-driven personas (An, Kwak, Jung, et al. 

2018; An, Kwak, Salminen, et al. 2018; Chapman et al. 2008; Goodman-Deane et al. 2018; 

Miaskiewicz & Luxmoore 2017; Mijač et al. 2018; Zhu et al. 2019) start out by criticizing manual 


Persona Studies 2021, vol. 7, no. 1  

 
 

(“traditional”) persona creation methods, proposing data-driven personas as a remedy to 

shortcomings that include, inter alia, slowness, unreliability or risk of human analyst bias, small 

sample sizes and lack of representativeness, as well as un-reactiveness of the static personas 

created to the constant and dynamic changes in user behaviours, preferences, and 

characteristics (Chapman & Milham 2006; Howard 2015; Jansen et al. 2020; Salminen, Jansen, 

An, Kwak, et al. 2018).  

Therefore, advocates of data-driven personas see these personas as superior to 

manually created personas, and as offering a solution to the problems of manual persona 

creation. Their argument is that personas are “fixed” when using “data” and “algorithms”. This 

kind of thinking poses hidden dangers. First, the goal of personas as “data-driven” (i.e., based on 

real user data, albeit often qualitative) was always present in the original conceptualization of 

Cooper (1999) and the further development of other HCI scholars (Anvari & Tran 2013; Nielsen 

2019b; Pruitt & Grudin 2003; Turner & Turner 2011). Second, it is becoming increasingly clear 

from gained experience in the field as well as from the accumulated knowledge on over-reliance 

on “data” and “algorithms” that both the use and development of data-driven personas are 

marred by challenges. These challenges logically evoke a variation of Dijkstra’s classic 

(Dijkstra 1968) question: Are data-driven personas considered harmful? In other words, should 

the research community pursue data-driven personas or are data-driven personas a dead end? 

The purpose of this article is to highlight the variety and scope of challenges pertaining 

to data-driven personas. We not only consider the challenges with the development and 

evaluation of data-driven personas, as typically done in previous works (An, Kwak, Jung, et al. 

2018; An, Kwak, Salminen, et al. 2018; Chapman et al. 2008) – where it is presumed that data-

driven personas, as statistical creations, are primarily technical artefacts that should be judged 

by technical merits and metrics – but we also consider the larger schemes and ramifications of 

data-driven persona lifecycles, including how they fit into the HCI research community (which is 

predominantly qualitatively oriented when it comes to personas) and how they fit into 

organizations that, on one hand, tend to admire “data” and “algorithms” and, on the other hand, 

struggle to capture real value using data-driven personas. 

We adopt the lifecycle view of personas (Adlin & Pruitt 2010), discussing challenges 

pertaining to different stages of the data-driven persona project, including its initiation, persona 

adoption, support, and general impediments that data-driven personas “inherit” from the 

essential nature of personas. Many observations in this manuscript are based on the authors’ 

encounters with HCI reviewers, especially those expressing scepticism towards data-driven 

personas. Due to anonymity, we cannot know the backgrounds of those reviewers, but based on 

their comments, it is often logical to assume that they come from a different tradition of creating 

and using personas and might, in some cases, be threatened by “data” and “algorithms” (we are 

basing this argument on the tone of some of the reviews we have received over the years). In our 

opinion, it is important to make these points of criticism visible and put them forward to 

critically assess the current state and the future of the research on data-driven personas. 

RELATED LITERATURE 

Conceptually, persona criticism can be divided into three types. First, there is criticism towards 

personas as a design technique in general. This form of criticism applies to all types of personas, 

including manually created personas, data-driven personas, and mixed-method personas. 

Second, there is approach-specific criticism, e.g., that manually created personas are often based 

on low sample sizes (Chapman & Milham 2006). Third, there is method-specific criticism within 



 
a specific approach, e.g., that K-means clustering would not be optimal for data-driven personas 

because it assigns each demographic group only to a single cluster (Kwak et al. 2017). 

The literature does not always explicitly state the level of criticism it is engaged in. 

Hence, when reviewing the challenges related to data-driven personas, we need to consider if 

the specific points of criticism are valid for data-driven personas. In this brief literature review, 

we summarize well-cited articles presenting persona criticism, and relate that criticism to data-

driven personas. The following section will dig deeper into this criticism. Additional resources 

for the reader include a literature review of quantitative persona creation (Salminen, Guan, 

Jung, et al. 2020) and a textbook focused on data-driven personas (Jansen et al. 2021). 

One of the most cited persona critiques was put forth by Chapman and Milham (2006). 

While their criticism focused on several aspects that data-driven personas claim as benefits over 

manual personas (e.g., low sample size, staling), some of the concerns apply to data-driven 

personas as well. These include, at least, the inconsistency problem (one part of persona profile 

information can be from Source A and another from Source B and these may or may not refer to 

the same users) and the granularity problem (increasing the number of persona attributes 

requires more personas to be created in order to cover all possible segments).  

Salminen, Jung, and Jansen (2020) mentioned ‘three Es’ as general challenges of 

personas: Envision (personas have no direct relationship to real user data), Execution (quality of 

the generated personas is low or unknown), and Evaluation (the success of personas is based on 

anecdotal feedback). The latter two can be considered as relevant concerns for data-driven and 

manual personas alike. In addition, Salminen, Guan, Jung, et al. (2020) mention the following 

challenges of quantitative persona creation: (1) lack of standards and best practices, (2) lack of 

ethical considerations, and (3) loss of immersion. These are critical issues that we expand on in 

the following section. 

Multiple authors discuss the political challenges of persona use, particularly 

stereotyping (Marsden & Haag 2016; Rönkkö et al. 2004; Turner & Turner 2011). These 

concerns do not magically disappear with “data” and “algorithms” but can, in fact, become even 

more accentuated if the persona generation is done carelessly or the results are taken at face 

value. Therefore, these issues remain relevant for data-driven personas. 

Howard (2015) posed the overarching question, “Are personas really usable?”. The crux 

of his criticism is that, although personas were originally introduced to facilitate communication 

among team members in UCD, in reality, personas do not solve the communication problems, 

but may even lead to further misunderstandings. A similar conclusion was made by Friess 

(2012) who, based on an ethnographic study, reported that designers rarely evoke or mention 

personas in their daily jobs, and by Matthews et al. (2012) whose participants found personas 

abstract and misleading. Finally, De Voil (2010) raises several key issues regarding the 

concept of personas, proposing that personas are artificial thinking aids with severe limitations. 

These concerns remain topical for data-driven personas, and we will expand on them in the 

following section. 

None of the criticism of data-driven personas in the past literature is comprehensive. 

Instead, the levels of criticism are often mixed, and most often the criticism focuses either on 

manually created personas or on personas as a user-centric design technique. While we do not 

claim to unveil all the challenges of data-driven personas in this article, we nonetheless make an 

analysis of how the central challenges regarding personas in general manifest in data-driven 

personas.  


 

The challenges of data-driven personas are both theoretical and painstakingly practical, 

and the HCI community should be aware of them. It is our purpose to summarize the main ones 

and raise some novel ones that touch upon the transition of data-driven personas from “one-off 

exercises” into interactive persona systems (Jung et al. 2017, 2020; Jung, Salminen, Kwak, et al. 

2018; Jung, Salminen, An, et al. 2018). This transition of personas moving from “static” 

layouts/templates/posters to interactive persona systems imposes some novel challenges 

regarding user experience (UX), user interfaces (UI), and interaction techniques that the 

previous literature has not made explicit. This article opens discussion on those challenges, 

further expanding the list of “grand challenges” faced by data-driven personas. 

APPROACH 

We organize this work by themes, which represent central issues with data-driven personas. 
These are derived from authors’ prior experience with data-driven persona research (more 

than a dozen published papers, including several CHI publications) and with helping more than 

half a dozen client organizations in industries such as news and media, telecommunications, 

airline, e-commerce, and design/UX research, to apply data-driven personas. Drawing from this 

experience-based knowledge, we formulate central arguments as to why data-driven personas 

could be considered harmful. In deference to alternative views, we also solicited comments 

from two external researchers known to have extensive experience in persona research (one in 

qualitative personas and the other in quantitative personas). Their comments were 

incorporated in honing the arguments made in this manuscript. Finally, our work is based on 

perusing a large body of literature on data-driven personas (e.g., Adlin & Pruitt 2009; Goodman-

Deane et al. 2018; Guo & Razikin 2015; McGinn & Kotamraju 2008; Miaskiewicz & Luxmoore 

2017; Mijač et al. 2018; Watanabe et al. 2017; Zhang et al. 2016; Zhu et al. 2019 and many 

others) over a period of multiple years. 

WHY ARE DATA-DRIVEN PERSONAS HARMFUL? 

Starting from project definition and extending to persona data collection, creation, evaluation, 

and their eventual adoption in organizations, many things can go wrong with data-driven 

personas. We adopt a “lifecycle view” (cf. Adlin & Pruitt 2009, 2010) to inspect these challenges 

that have an adverse effect on the decision to undertake a data-driven persona project. In other 

words, the following challenges consider the data-driven persona project’s (a) initiation 

(expectations, objectivity, standards), (b) adoption (user perceptions and use), and (c) support 

(training, maintenance), as well as (d) general impediments (superficiality, aggregation, 

averages, and relevance). 

Challenge 1: Inflated Expectations from Stakeholders 

A common yet often overlooked aspect of applying algorithms for persona creation is that the 

average stakeholder in a company attributes mythical properties and capabilities to these 

technological inventions, so that the mere mention of “data”, “algorithms”, or “non-negative 

matrix factorization” evokes positive qualities such as trustworthiness, efficiency, and 

transcendence of human capabilities. This effect, dubbed the “mystique of numbers” by Siegel 

(2010, p. 4721), refers to the phenomenon wherein stakeholders have unrealistic expectations 

of data-driven personas. As soon as stakeholders are informed that data-driven personas are 

based on “real data” and “millions of user interactions” which are analysed by (the implicitly 

objective) “algorithms”, they abandon their critical attitudes and become willing to take 

personas seriously. While this effect is beneficial for persona adoption, which is typically 

hindered by the lack of credibility, stakeholder commitment, and trust in the personas (Friess 



 
2012; Jensen et al. 2017; Matthews et al. 2012; Nielsen 2019a; Rönkkö 2005; Rönkkö et al. 

2004; Seidelin et al. 2014), it comes with the negative side-effect of hyperbolic expectations. 

In the long run, the stakeholders’ unrealistic expectations may result in various adverse 

outcomes. These include, e.g., disappointment in the fact that data-driven personas did not solve 

all analytics problems despite the superhuman capabilities of the algorithms. Similarly, there is 

a risk that hidden errors in the data, algorithms, or simply the misunderstandings of what 

certain data is and how it is created in the persona profile, skew the stakeholders’ decision-

making process thus defeating the original purpose of data-driven personas which is to provide 

valid, correct, and accurate information for stakeholders to consider real user needs, wants, 

interests, and goals. 

Stakeholders may believe that statistical methods can simply be selected and applied to 

get “the answer,” i.e., the immutable truth of their users (whereas, in reality, the truth is much 

more nuanced than what algorithms reveal). This is paired with a strong, but almost always 

unstated assumption that distinct types of people must exist. The contrary assumption, that 

people approximately follow a multivariate normal distribution and do not fall into neatly 

separable groups, is typically rejected from the outset. An analogy is slicing a pizza: there are 

infinite ways to do it, and none is correct or incorrect; it all depends on one’s goal (Chapman & 

Feit 2019). As such, the data analysis efforts involved in data-driven persona creation can be 

more or less successful, but none of them is the only and perfect solution. 

Furthermore, a predominant focus on statistical significance in data-driven persona 

creation may overlook the personas’ practical significance, as these two concepts do not always 

equate in the real world. For example, there may be a statistically significant difference between 

two user groups that is small in magnitude (Jansen et al. 2019), rendering this difference 

unimportant for decision making. Technically-oriented persona creators may want to optimize 

the accuracy or validity of the personas based on some metric to minimize or maximize, 

whereas stakeholders would want to optimize the usefulness of the personas, regardless of how 

they are created. A crucial question for the purpose of usefulness maximization is: are the 

similarities and differences among the personas truly so important that they matter for decision 

making? Such considerations are often omitted when reporting data-driven personas in 

academic literature. Consequently, data-driven personas may end up being abstract and esoteric, 

i.e., technically complex and difficult to communicate to stakeholders in ways that are both 

truthful and easy to understand (Salminen, Jung, & Jansen 2020).  
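
The gap between statistical and practical significance can be made concrete with a minimal sketch. It assumes two hypothetical user groups with synthetic engagement scores; the group sizes, the 0.1 shift between the groups, and the function name `compare_groups` are illustrative assumptions and are not taken from any cited study. A large-sample z-test is paired with Cohen's d as one possible effect-size measure.

```python
import math
import random
import statistics


def compare_groups(a, b):
    """Return (two-sided p-value, Cohen's d) for two independent samples."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    n_a, n_b = len(a), len(b)

    # Large-sample z-test on the difference in means (normal approximation).
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    z = (mean_b - mean_a) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Cohen's d: mean difference in units of the pooled standard deviation.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    cohens_d = (mean_b - mean_a) / math.sqrt(pooled_var)
    return p_value, cohens_d


# Synthetic, hypothetical engagement scores; the seed is fixed for reproducibility.
rng = random.Random(0)
group_a = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
group_b = [rng.gauss(0.1, 1.0) for _ in range(50_000)]

p_value, cohens_d = compare_groups(group_a, group_b)
print(f"p-value: {p_value:.2e}, Cohen's d: {cohens_d:.3f}")
```

With samples of this size, the p-value should fall far below conventional significance thresholds, while Cohen's d stays near 0.1, below the 0.2 that is conventionally labelled a small effect. Two personas built on such a difference would be statistically distinct yet hard to tell apart in any decision that matters.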

In summary, the present ability to generate data-driven personas does not match the 

expected perfection, meaning that there may be a gap between what the stakeholders think they get and 

what they actually get with data-driven personas. 

Challenge 2: Algorithms are Biased Too 

Data-driven personas can be seen as design artefacts created by algorithms. As such, they are 

susceptible to what is known as algorithmic bias (Friedman & Nissenbaum 1996; Hajian et al. 

2016), a tendency of algorithms to accentuate the properties of the data while ignoring fairness 

or legality of the outcomes. An example would be an algorithm repeatedly picking African-

American names for criminal personas created from the data (Salminen, Froneman, Jung, et al. 

2020).  

In general, there are three sources of bias in algorithmic systems: (1) 

imbalanced/skewed datasets that “favour” one user group over another; (2) mathematics of the 

algorithm that accentuate the differences among the groups by “picking” certain groups over 

others; and (3) cultural assumptions that are encoded in datasets and systems, leading to 


 

systematic discrimination by structural design (Hajian et al. 2016). Data-driven personas are 

not immune to these concerns. In fact, “data-driven,” when blindly applied, can unintentionally 

become “bias-driven”. Therefore, following the on-going research in the ethical analysis of 

algorithms (Eslami et al. 2018), an ethical review of data-driven persona development is 

necessary.  

While research papers may claim that data-driven persona development increases 

objectivity (Jansen et al. 2020; Mijač et al. 2018), the deployment of algorithms for data analysis 

may present new sources of prejudice and lack of transparency (Salminen, Santos, Jung, et al. 

2019). Additional ethical challenges include safeguarding the privacy of online users and giving 

stakeholders information and tools to assess how reliable and trustworthy the data-driven 

personas are (which is a non-trivial problem as the technical sophistication of end-users of 

personas greatly varies). Thus far, research on ethics in data-driven personas is scarce, with the 

exception of a couple of studies (Goodman-Deane et al. 2018; Salminen, Froneman, Jung, et al. 

2020). It is uncertain if data-driven persona advocates recognise these ethical issues in their 

work, as most studies simply lack the discussion. For example, replacing the persona generation 

algorithm can have a drastic effect on the generated personas, even when the underlying data is 

the same (Brickey et al. 2012) and yet, there is virtually no work comparing what kind of 

personas different algorithms generate from the same user data. Thus, it remains uncertain to what 

extent data-driven personas become biased and, if they do, how the issue can be effectively addressed. 

Challenge 3: Where are the Standards? 

Data-driven personas paradoxically suffer from a lack of standards. The lack of standards is 

paradoxical because, being the result of quantitative data and objective/replicable processes, 

data-driven personas are, in theory, in a perfect position for standards to emerge. Yet, there are 

no standards or metrics even for measuring such a basic concept as persona quality, which 

would be fundamental for comparing and ranking different data-driven methods.  

Unlike in computer science where researchers run experiments on baseline datasets 

that are the same for everyone, no baseline datasets exist for persona creation. Unlike in fields 

like psychology, where there are studies on norms of perception – e.g., how certain groups by 

age, gender, or culture view the world (Gosling et al. 2003) – data-driven persona studies 

propose no such norms or even discuss them. Hence, it is difficult to understand what features 

and expectations users have for data-driven persona systems. The lack of standardization also 

makes it difficult to obtain strong guidelines for persona creation that would be derived from 

empirical research. There are no empirically validated guidelines, for example, as to how many 

personas should be created, what metrics should be used to evaluate the personas, what ethical 

considerations should be made when collecting and processing data for persona generation, and 

so on.  

Also, although data-driven personas could be generated from many alternative metrics 
to describe different behaviours (e.g., clicking behaviour, viewing behaviour, purchase 

behaviour), typically, studies use only one behavioural interaction metric at a time (An, Kwak, 

Salminen, et al. 2018). Which metric(s) to choose, then? This issue is akin to that in the field of 

analytics, where stakeholders need to define their questions well to avoid getting lost in the 

dozens of reports afforded by the analytics systems. For data-driven personas, there exists 

virtually no guidance for this metrics selection problem, but researchers and practitioners carry 

out the selection in an ad-hoc manner.  

The lack of standards hinders data-driven persona creation (the choice of methods is 

unclear, as is the mutual comparison of methods), use (what are the standard use cases for data-



 
driven personas?), and understanding of the data-driven persona user behaviour (how many 

personas do users view? How long do they spend, on average, on persona profiles? What 

information is the most crucial for decision making?). Apart from limited exploratory work on 

these matters (Salminen, Guan, Jung, Chowdhury, et al. 2020; Salminen, Nielsen, Jung, 

An, et al. 2018; Salminen, Froneman, Jung, Chowdhury, et al. 2020; Salminen, 

Liu, Sengun, Santos, et al. 2020), no convincing standards for data-driven persona user 

behaviour have been developed to date.  

Challenge 4: Messy, Confusing, and Difficult to Use 

User studies report many issues with data-driven persona UX and UI (Salminen, Jung, An, et al. 

2019; Salminen, Jansen, An, Jung, et al. 2018; Salminen, Jung, An, Kwak, et al. 2018; Salminen, 

Sengun, Jung, et al. 2019). These issues include, at least, confusion over what the information in 

persona profiles is and how it is generated (lack of transparency), how to get more information 

about a specific persona, questions about the reliability and trustworthiness of the information, 

and – the most vital question of all – “Now what? How can I use this persona?”.  

According to our experience, stakeholders struggle to make use of persona systems, 

even when they are provided with multiple features, such as interest prediction, gap analysis, 

and search and navigation (Jansen et al. 2020). These features may appear unfamiliar to 

persona users, and it may be that it is more important for design outcomes that personas are 

inspirational and memorable rather than numerical and accurate. In this light, the definite proof 

of value for data-driven personas remains elusive. Moreover, at this stage, research on effective 

UIs for data-driven personas is still in its infancy (Salminen, Liu, Sengun, et al. 2020), and there 

is little empirical evidence about how stakeholders interact with these systems, what features 

are requested, and so on. It is fairly easy to generate proof-of-concepts (Mijač et al. 2018), but 

the leap from these prototypes into full-fledged production systems with an active user base is 

still on the horizon. Therefore, making data-driven personas user friendly and useful remains an 

obstacle for their wider application. 

Challenge 5: Superficial and Unsurprising 

It can be said there is a consensus among qualitative persona researchers that, even if not 

always obtained, the goal and purpose of personas is to provide in-depth understanding of 

different user types, that is to facilitate the sense of empathy (Blomquist & Arvola 2002; Haag & 

Marsden 2019; Nielsen et al. 2017; Nielsen & Storgaard Hansen 2014; Wright & McCarthy 

2008). These insights are, on one hand, the result of the creation process itself; by immersing 

oneself into the user data, one achieves a thorough understanding of the user’s circumstances. 

On the other hand, gaining such insights relies on the innate ability of humans to understand 

other humans (Grudin 2006). Algorithms cannot think and, hence, they cannot compete with 

this ability.  

Thus, a major concern with data-driven personas is that the algorithms behind their 

generation often lack the ability to interpret, decipher, and encode common sense meanings. 

Cultural meanings and (tacit) distinctions are difficult even for untrained humans, and a data-

driven persona algorithm is completely oblivious to them unless – with some method that has 

not been created yet – trained to classify information based on its cultural meaning. Cultural 

factors are lacking in the data-driven persona literature, despite an extensive body of literature 

on culture-cognizant application of manually created personas (Anvari et al. 2019; Jensen et al. 

2017; Nielsen 2010). While data-driven persona profiles can include social media comments 

(Salminen, Şengün, Kwak, Jansen, et al. 2018), they cannot disambiguate their meanings or 

make any complex interpretations from these comments.  


 

Finally, persona enrichment poses an issue since it tends to require using independent 

datasets (Mijač et al. 2018), evoking the inconsistency problem (see Related Literature). Yet, without in-depth 

insights, data-driven personas risk remaining shallow alternative UIs for website analytics 

data, having little actionable information. 

Challenge 6: Aggregation Makes Things Worse 

Chapman and Milham (2006) were the first to articulate the aggregation problem of personas. 

Later, other researchers have observed this issue (Bødker et al. 2012; Matthews et al. 2012; 

Salminen, Jansen, An, Kwak, et al. 2018); yet, no definitive solutions have been proposed. 

Personas are, by definition, aggregates: they group individual users into one user representation. 

Yet, each user is unique and different from others (sometimes referred to as a segment of one 

[Lingel 2012]).  

Chapman et al. (2008) analysed data-driven personas and found that the more granular 

representations of users we want, the more personas are needed. For example, if we want to 

represent users by gender, two personas (male and female) may be sufficient. However, if we 

want to represent both gender and age, assuming two age categories, we now need twice the 

number of personas: male-young, male-mature, female-young, and female-mature. The issue is 

that the selection of granularity of personas is arbitrary and there are no rules for deciding this 

granularity. 
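
The multiplicative growth described above can be sketched in a few lines. The helper `persona_segments` and the third attribute (`device`) are hypothetical and only serve to show how the number of required personas grows with each added attribute.

```python
from itertools import product


def persona_segments(attributes):
    """Enumerate every combination of attribute levels (one persona per combination)."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


by_gender = persona_segments({"gender": ["male", "female"]})
by_gender_age = persona_segments(
    {"gender": ["male", "female"], "age": ["young", "mature"]}
)
# Hypothetical third attribute, added only to illustrate further growth.
by_gender_age_device = persona_segments(
    {
        "gender": ["male", "female"],
        "age": ["young", "mature"],
        "device": ["mobile", "desktop", "tablet"],
    }
)

print(len(by_gender), len(by_gender_age), len(by_gender_age_device))  # 2 4 12
```

The count is the product of the number of levels per attribute, so each added attribute multiplies rather than adds to the persona set. Nothing in the data itself says where to stop, which is the arbitrariness the text points to.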

Another issue of data-driven personas is their potential weakness against the argument 

often mentioned by practitioners, “With individualized data, I can target the individual users, so 

why would I need personas/segments/clusters/etc.?”. This question is valid, as in use cases 

such as personalization and recommendation systems, the unit of analysis is the individual and 

the decision-maker is an algorithm (although it is also true that many such algorithms rely on 

dimensionality reduction, which is a form of grouping [Huang et al. 2019]). As the world moves 

towards automated decision-making, is there room for data-driven personas? 

Challenge 7: The Average Persona Does Not Exist 

The typical definition of personas is that they describe typical users (Marsden & Haag 2016; 

Sakata et al. 2014). Therefore, creating an average persona is the conceptual and practical 

default of many persona-creation projects. Its challenges relate, firstly, to stereotyping when 

focusing on the mean/average user (Marsden & Haag 2016; Marsden & Pröbster 2019; Turner 

& Turner 2011) and, secondly, to the focus of data-driven algorithms on the central tendency in 

the data.  

What we mean by this can be illustrated with a simple example. Assume two datapoints 

about users, with numerical values of “1” and “5”. Their average is “3” which is equally far from 

both observations and thus does not well represent either datapoint. This “flaw of averages” is 

well documented in a classic study conducted by the United States Air Force in 1950, finding 

that, among 4,000 measured pilots, no pilots matched all the average attributes of height, 

weight, etc. (Hertzberg et al. 1954).  The problem with mean-centred personas (i.e., those that 

describe average, typical users) is the general problem with the mean: if half of your users are 

right-handed and half are left-handed, should your persona be middle-handed? Obviously not. 

Instead, you need personas for both left- and right-handed users. This is what we mean when we talk 

about diversity of personas – a good persona set is one that covers various user types, not only 

their hypothetical amalgamation. 
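
The "flaw of averages" can be reproduced with a small sketch. The first part uses the two datapoints from the example above. The second part uses four hypothetical users with made-up height and weight values (not the Air Force data) and an assumed 5% tolerance band around each mean; the function name `near_average_counts` is likewise an illustrative choice.

```python
import statistics

# The two-datapoint example from the text.
observations = [1, 5]
mean_value = statistics.mean(observations)
gaps = [abs(x - mean_value) for x in observations]
print(mean_value, gaps)  # 3 [2, 2]

# Hypothetical users: (height in cm, weight in kg).
users = [(170, 90), (180, 50), (150, 70), (200, 70)]


def near_average_counts(rows, tolerance=0.05):
    """Count rows close to the mean, per attribute and on all attributes at once."""
    means = [statistics.mean(column) for column in zip(*rows)]

    def close(value, mean):
        return abs(value - mean) <= tolerance * mean

    per_attribute = [
        sum(close(row[i], mean) for row in rows) for i, mean in enumerate(means)
    ]
    on_all = sum(all(close(v, m) for v, m in zip(row, means)) for row in rows)
    return per_attribute, on_all


per_attribute, on_all = near_average_counts(users)
print(per_attribute, on_all)  # [2, 2] 0
```

In this toy dataset, two users are close to the average height and two are close to the average weight, yet no user is close to the average on both. A persona built from the attribute means would therefore describe none of the four users.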

Yet, by picking “representative” behaviours and characteristics for the data-driven 

personas, we tend to overlook the extremes. These extremes, anomalies, deviations, minorities, 



 
and fringe groups are, therefore, not considered by the stakeholders using the data-driven 

personas, as these segments are hidden; they do not exist, as far as the stakeholder is 

concerned (Salminen, Froneman, Jung, et al. 2020). This rounding up of characteristics may end 

up eliminating everything that makes a user unique, resulting in bland and unimaginative 

user profiles that feed rather than curb stereotypes. Therefore, representativeness comes at the 

cost of diversity. 
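
A minimal sketch of this trade-off follows, under the simplifying assumption that a persona is assigned the single most frequent demographic group within each behavioural segment. The segment names, group labels, and counts are hypothetical, and actual persona generation pipelines may select representative attributes differently.

```python
from collections import Counter

# Hypothetical audience counts per behavioural segment.
segments = {
    "sports videos": Counter({"male, 25-34": 520, "female, 25-34": 480}),
    "cooking videos": Counter({"female, 35-44": 700, "male, 35-44": 300}),
}


def representative_personas(segments):
    """Pick the most frequent group per segment and count the users left out."""
    summary = {}
    for name, counts in segments.items():
        group, size = counts.most_common(1)[0]
        summary[name] = {
            "persona": group,
            "hidden_users": sum(counts.values()) - size,
        }
    return summary


summary = representative_personas(segments)
hidden_total = sum(entry["hidden_users"] for entry in summary.values())
total_users = sum(sum(counts.values()) for counts in segments.values())
print(summary)
print(f"{hidden_total} of {total_users} users are not reflected in any persona")
```

In this toy case, 780 of 2,000 users belong to groups that never surface in a persona, even though each persona is the "most representative" choice for its segment. Reporting the hidden share alongside each persona is one possible way to keep that cost visible to stakeholders.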

Challenge 8: Maintenance Cost 

Unlike traditional personas that are created once and then used for some time, data-driven 

personas require constant nurturing, care, and maintenance. This maintenance is costly and 

time-consuming. The reason for maintenance stems from the reliance on live datasets (Jung, 

Salminen, An, et al. 2018). As platforms such as YouTube and Facebook repeatedly change their 

terms of service and APIs, often without proper documentation or notifications for developers, 

persona systems reliant on these data sources are “broken” until the necessary updates are 

made. 

Similarly, software packages and algorithms are frequently updated, requiring the data-

driven persona developers to monitor and implement these updates to ensure the continued 

functionality of the system. Thus, unlike traditional personas that are independent and 

contained, data-driven personas tend to have complex linkages to sub-systems, data science 

libraries, and Web technologies that come with built-in technical debt (Thomas et al. 2018).

Related problems are missing data, unknown measurement errors in data exports, 

sampling/thresholds that limit the data collection speed and may skew the data distributions, 

and the adding/removing of data variables and classes by the online platforms without

giving researchers any say in these decisions. When personas are built around data

sources owned by multi-national corporations such as Facebook and Google, the dependence on 

the goodwill of these organizations to continue sharing user data is high. If these platforms were 

to consider it strategically unwise to continue sharing data via their APIs, data-driven personas 

would be quickly broken.  
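One modest way to notice such changes early is to compare each data export against the fields the persona pipeline expects. The sketch below assumes a hypothetical export format (a list of dictionaries) and a hypothetical expected schema; the field names and the function check_export are invented for illustration and do not correspond to any real platform API.

```python
# Hypothetical schema that a persona pipeline might rely on.
EXPECTED_FIELDS = {"age_group", "gender", "country", "view_count"}


def check_export(records, expected_fields=EXPECTED_FIELDS):
    """Report removed fields, newly added fields, and gaps per expected field."""
    seen = set()
    for record in records:
        seen.update(record.keys())
    gaps = {
        field: sum(1 for record in records if record.get(field) is None)
        for field in sorted(expected_fields)
    }
    return {
        "missing_fields": sorted(expected_fields - seen),
        "unexpected_fields": sorted(seen - expected_fields),
        "gaps_per_field": gaps,
    }


# Invented export in which the platform has dropped "gender" and added "device_type".
export = [
    {"age_group": "25-34", "country": "FI", "view_count": 120, "device_type": "mobile"},
    {"age_group": "18-24", "country": "QA", "view_count": None, "device_type": "desktop"},
    {"age_group": "35-44", "country": "US", "view_count": 300, "device_type": "mobile"},
]

report = check_export(export)
print(report)
```

A check of this kind does not remove the dependence on the platform, but it turns a silent change in the data into an explicit maintenance task.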

Finally, privacy regulations such as the General Data Protection Regulation (GDPR) in the

European Union and similar legislative initiatives in other economic areas may further limit the 

availability of user data for applications such as data-driven persona generation in the future. 

While the benefits of data-driven personas can be seen in the abundance of online user data and 

the effect of democratizing personas for even smaller organizations that can have access to this 

data, a future scenario where the data becomes less accessible and perhaps only accessible for 

large corporations against payment can be envisioned. Future developments bring forth this 

cloud of uncertainty for data-driven applications such as online user personas. 

Challenge 9: Personas are Passé! 

A compelling argument against data-driven personas that one often reads in reviews and

online discussions among UX professionals is that personas are not relevant anymore and 

organisations are using other methods. Blažica (2014) surveyed start-up companies about their 

use of UX techniques and observed that personas ranked fourth last (out of eight

techniques) in terms of stakeholder familiarity and also fourth last in terms of regular use. 

While ten respondents indicated that they had used personas “a few times”, only one 

respondent reported regular use. This, indeed, warrants concern as personas did relatively 

poorly compared to other methods.  


If personas are generally seen as redundant among HCI professionals, then this

sentiment is inherited by data-driven personas, which, after all, are personas. Indeed, multiple

alternative techniques for UCD exist, e.g., interviews, focus groups, surveys, participant

observation, user narratives, jobs-to-be-done, scenarios, customer journeys, and so on (Blažica 

2014; Carroll 1997; Goodman et al. 2013; Kliman-Silver et al. 2020). Similarly, there is a

plethora of analytics tools that provide numerical information about users in the form of charts,

numbers, and tables (e.g., YouTube Analytics, Google Analytics, Facebook Insights, and IBM

Analytics). So, why are data-driven personas needed?

 
Challenge                                     Applies to…
                                              Data-driven personas    All personas (also ones created manually)
C1: Inflated expectations                     x
C2: Algorithmic bias                          x
C3: Lack of standards                         x                       x
C4: User perceptions and difficulty of use    x                       x
C5: Superficiality                            x
C6: Problem of aggregation                    x                       x
C7: Problem of averages                       x                       x
C8: Maintenance cost                          x
C9: Irrelevance                               x                       x

Table 1: Data-driven personas inherit challenges from the concept of personas but there are 

also challenges unique to them. 

DISCUSSION 

We presented nine challenges (see Table 1) that might imply that data-driven personas are 

harmful. These challenges are far more serious than generally believed. They start with wrong 

assumptions from the stakeholders, and extend to precarious application of methods, 

imbalanced or messy data, access to superficial data only, lack of communication about how they were

created, overly complex UIs, unclear or lacking definitions of persona content, and omission of

ethical considerations. The crucial message of our work is that the state-of-the-art of data-

driven personas does not create perfect personas, despite the somewhat illusionary and 

impressive use of technical jargon such as “data”, “algorithms”, and so on. 

Are data-driven personas worthwhile, then? Indeed, at first glance, the challenges may 

seem overwhelming. It is, in any case, certain that no single paper or research project can solve 

them. For the discovery of solutions, researchers within the persona domain need to work in 

unison. Probabilistic methods can assist with the aggregation problem (Chapman et al. 2015). 

Other potential solutions involve developing standards for the choice process for 

hyperparameters (most importantly, the number of personas) and for evaluation metrics that 

need to be consistent among the numerous data-driven persona methodologies. 


It can also be said that ‘no persona is an island’, meaning that data-driven personas co-

exist and co-evolve together with computer science, HCI, and other related fields. For example, 

studies in algorithmic bias (Diaz et al. 2018; Hajian et al. 2016; Salminen, Jung, & Jansen 2019) 

apply to data-driven personas through the process of selecting the persona’s name and

demographic traits, and interpreting the persona’s sentiment by using tools of Natural

Language Processing (NLP). Data-driven personas rest on a foundation of algorithmic and

technological work, implying that their future is intertwined with the progress in these fields 

that support and enable the technical back-end of data-driven personas. 

CONCLUSION 

Data-driven personas may provide many benefits relative to manually created personas. 

However, the implicit assumption that data-driven personas would be “perfect” or “easy” is not 
correct. On the other hand, organizations show a continuous interest in data-driven personas, 

albeit often with unrealistic expectations. Hence, the research efforts in this space are valuable 

and worthwhile. The future will reveal if data-driven personas remain at the level of perpetual 

promise or if, at some point, they fulfil the high expectations that their (small but persistent)

group of advocates holds.

 
WORKS CITED 

Adlin T & Pruitt J 2009, ‘Putting personas to work: Using data-driven personas to focus product 
planning, design, and development’, in Sears A and Jacko JA (eds) Human-Computer
Interaction: Development Process, CRC Press, New York, pp. 95–120. 

— 2010, The Essential Persona Lifecycle: Your Guide to Building and Using Personas. 1st ed. 
Morgan Kaufmann Publishers Inc. San Francisco.  

An J, Kwak H, Jung S-G, Salminen J & Jansen BJ 2018, ‘Customer segmentation using online 
platforms: isolating behavioral and demographic segments for persona creation via 
aggregated user data’, Social Network Analysis and Mining, vol. 8, no. 1: 54. DOI: 
10.1007/s13278-018-0531-0. 

An J, Kwak H, Salminen J, Jung S-G & Jansen BJ 2018, ‘Imaginary People Representing Real 
Numbers: Generating Personas from Online Social Media Data’, ACM Transactions on the 
Web (TWEB), vol. 12, no. 4: 27. DOI: 10.1145/3265986. 

Anvari F & Tran HMT 2013, ‘Persona ontology for user centred design professionals’, in The 
ICIME 4th International Conference on Information Management and Evaluation, Ho Chi 
Minh City, Vietnam, 2013, pp. 35–44. 

Anvari F, Richards D, Hitchens M & Tran H 2019, ‘Teaching user centered conceptual design 
using cross-cultural personas and peer reviews for a large cohort of students’, in 
Proceedings of the 41st International Conference on Software Engineering: Software 
Engineering Education and Training, IEEE Press, Piscataway, NJ, 2019, pp. 62–73.  

Blažica B 2014, Use of UX and HCI tools among start-ups. Working paper. Ljubljana, Slovenia: 
XLAB Research. 

Blomquist A & Arvola M 2002, ‘Personas in action: Ethnography in an interaction design team’, 
in Proceedings of the second Nordic conference on Human-computer interaction, ACM 
Press, New York, pp. 197–200.  

Bødker S, Christiansen E, Nyvang T & Zander P 2012, ‘Personas, people and participation: 
challenges from the trenches of local government’, in Proceedings of the 12th 
Participatory Design Conference on Research Papers: Volume 1 - PDC ’12, ACM Press, 
Roskilde, p. 91. DOI: 10.1145/2347635.2347649.


Brickey J, Walczak S & Burgess T 2012, ‘Comparing Semi-Automated Clustering Methods for 
Persona Development’, IEEE Transactions on Software Engineering, vol. 38, no. 3, pp. 
537–546. DOI: 10.1109/TSE.2011.60. 

Carroll JM 1997, ‘Chapter 17 - Scenario-Based Design’, in Helander MG, Landauer TK, and 
Prabhu PV (eds) Handbook of Human-Computer Interaction (Second Edition). 
Amsterdam: North-Holland, pp. 383–406. DOI: 10.1016/B978-044481862-1.50083-2. 

Chapman C & Feit EM 2019, R For Marketing Research and Analytics. Springer.

Chapman C & Milham RP 2006, ‘The Personas’ New Clothes: Methodological and Practical
Arguments against a Popular Method’, in Proceedings of the Human Factors and
Ergonomics Society Annual Meeting, 1 October 2006, pp. 634–636. DOI:
10.1177/154193120605000503.

Chapman C, Love E, Milham RP, ElRif P & Alford J 2008, ‘Quantitative Evaluation of Personas as 
Information’, in Proceedings of the Human Factors and Ergonomics Society Annual 
Meeting, 1 September 2008, pp. 1107–1111. DOI: 10.1177/154193120805201602. 

Chapman C, Krontiris K & Webb J 2015, ‘Profile CBC: Using Conjoint Analysis for Consumer 
Profiles’, in Sawtooth Software Conference Proceedings, Google Research. Available at: 
https://research.google.com/pubs/archive/44167.pdf. 

Cooper A 1999, The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and 
How to Restore the Sanity. 1st ed. Sams-Pearson Education, Indianapolis. 

De Voil N 2010, Personas considered harmful. Industry report. London, UK: De Voil Consulting. 
Available at: http://www.devoil.com/papers/PersonasConsideredHarmful.pdf. 
Retrieved November 2019. 

Diaz M, Johnson I, Lazar A, Piper A & Gergle D 2018, ‘Addressing Age-Related Bias in Sentiment 
Analysis’, in Proceedings of the 2018 CHI Conference on Human Factors in Computing 
Systems, New York, pp. 1–14. CHI ’18. Association for Computing Machinery. DOI: 
10.1145/3173574.3173986. 

Dijkstra EW 1968, ‘Letters to the editor: go to statement considered harmful’, Communications 
of the ACM, vol. 11, no. 3, pp. 147–148.

Eslami M, Krishna Kumaran SR, Sandvig C & Karahalios K 2018, ‘Communicating Algorithmic 
Process in Online Behavioral Advertising’, in Proceedings of the 2018 CHI Conference on 
Human Factors in Computing Systems, ACM, Montréal, p. 432. 

Friedman B & Nissenbaum H 1996, ‘Bias in computer systems’, ACM Transactions on Information 
Systems (TOIS), vol. 14, no. 3, pp. 330–347.

Friess E 2012, ‘Personas and decision making in the design process: an ethnographic case 
study’, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems,
pp. 1209–1218. DOI: https://doi.org/10.1145/2207676.2208572. 

Goodman E, Kuniavsky M & Moed A 2013, Observing the User Experience: A Practitioner’s Guide 
to User Research. Morgan Kaufmann. 

Goodman-Deane J, Waller S, Demin D, González de Heredia A, Bradley M & Clarkson J 2018, 
‘Evaluating Inclusivity using Quantitative Personas’, in Proceedings of Design
Research Society Conference 2018, Limerick. DOI: 10.21606/drs.2018.400. 

Gosling SD, Rentfrow PJ & Swann WB 2003, ‘A very brief measure of the Big-Five personality 
domains’, Journal of Research in Personality, vol. 37, no. 6, pp. 504–528.

Grudin J 2006, ‘Why Personas Work: The Psychological Evidence’, in Pruitt J and Adlin T (eds) 
The Persona Lifecycle. Elsevier, pp. 642–663. DOI: 10.1016/B978-012566251-2/50013-7.

Guo H & Razikin KB 2015, ‘Anthropological User Research: A Data-Driven Approach to Personas 
Development’, in Proceedings of the Annual Meeting of the Australian Special Interest 
Group for Computer Human Interaction, OzCHI ’15. ACM, New York, pp. 417–421. DOI: 
10.1145/2838739.2838816. 

Haag M & Marsden N 2019, ‘Exploring personas as a method to foster empathy in student IT 
design teams’, International Journal of Technology and Design Education, vol. 29, no. 3, 
pp. 565–582. DOI: 10.1007/s10798-018-9452-5.


Hajian S, Bonchi F & Castillo C 2016, ‘Algorithmic Bias: From Discrimination Discovery to 
Fairness-aware Data Mining’, in Proceedings of the 22nd ACM SIGKDD International 
Conference on Knowledge Discovery and Data Mining, KDD ’16. Association for 
Computing Machinery San Francisco, pp. 2125–2126. DOI: 10.1145/2939672.2945386. 

Hertzberg HT, Daniels GS & Churchill E 1954, Anthropometry of Flying Personnel-1950. Antioch, 
Yellow Springs. 

Howard TW 2015, ‘Are Personas Really Usable?’, Communication Design Quarterly Review, vol. 3, 
no. 2, pp. 20–26. DOI: https://doi.org/10.1145/2752853.2752856. 

Huang X, Wu L & Ye Y 2019, ‘A Review on Dimensionality Reduction Techniques’, International 
Journal of Pattern Recognition and Artificial Intelligence, vol. 33, no. 10. DOI: 
10.1142/S0218001419500174. 

Jansen BJ, Jung S-G & Salminen J 2019, ‘Creating Manageable Persona Sets from Large User 
Populations’, in Extended Abstracts of the 2019 CHI Conference on Human Factors in 
Computing Systems, ACM, Glasgow, pp. 1–6.  DOI: 10.1145/3290607.3313006. 

Jansen BJ, Salminen J, Jung S-G & Guan K 2021, Data-Driven Personas. 1st ed. Synthesis Lectures
on Human-Centered Informatics. Morgan & Claypool Publishers. Available at: 
https://www.morganclaypool.com/doi/abs/10.2200/S01072ED1V01Y202101HCI048 
(accessed 10 February 2021). 

Jansen BJ, Salminen J & Jung S-G 2020, ‘Data-Driven Personas for Enhanced User Understanding: 
Combining Empathy with Rationality for Better Insights to Analytics’, Data and 
Information Management, vol. 4, no. 1, pp. 1–17. DOI: https://doi.org/10.2478/dim-2020-0005.

Jensen I, Hautopp H, Nielsen L & Madsen S 2017, ‘Developing international personas: A new 
intercultural communication practice in globalized societies’, Journal of Intercultural 
Communication, vol. 43. 

Jung S-G, An J, Kwak H, Ahmad M, Nielsen L & Jansen BJ 2017, ‘Persona Generation from 
Aggregated Social Media Data’, in Proceedings of the 2017 CHI Conference Extended
Abstracts on Human Factors in Computing Systems, CHI EA ’17. ACM, Denver, pp. 1748–1755.

Jung S-G, Salminen J, Kwak H & Jansen BJ 2018, ‘Automatic Persona Generation (APG): A 
Rationale and Demonstration’, in CHIIR ’18: Proceedings of the 2018 Conference on 
Human Information Interaction & Retrieval, ACM, New Jersey, pp. 321–324. DOI:
https://doi.org/10.1145/3176349.3176893. 

Jung S-G, Salminen J, An J & Jansen BJ 2018, ‘Automatically Conceptualizing Social Media 
Analytics Data via Personas’, in Proceedings of the International AAAI Conference on Web 
and Social Media (ICWSM 2018), San Francisco, p. 2. 

Jung S-G, Salminen J & Jansen BJ 2020, ‘Giving Faces to Data: Creating Data-Driven Personas 
from Personified Big Data’, in Proceedings of the 25th International Conference on 
Intelligent User Interfaces Companion, IUI ’20. Association for Computing Machinery, 
Cagliari, pp. 132–133. DOI: 10.1145/3379336.3381465. 

Kliman-Silver C, Siy O, Awadalla K, Lentz A, Convertino G & Churchill E 2020, ‘Adapting User 
Experience Research Methods for AI-Driven Experiences’, in Extended Abstracts of the 
2020 CHI Conference on Human Factors in Computing Systems, pp. 1–8. 

Kwak H, An J & Jansen BJ 2017, ‘Automatic Generation of Personas Using YouTube Social Media 
Data’, in Proceedings of the Hawaii International Conference on System Sciences (HICSS-50),
Waikoloa, pp. 833–842. 

Lingel J 2012, ‘Ethics and dilemmas of online ethnography’, in CHI’12 Extended Abstracts on 
Human Factors in Computing Systems, pp. 41–50. 

Marsden N & Haag M 2016, ‘Stereotypes and politics: reflections on personas’, in Proceedings of 
the 2016 CHI Conference on Human Factors in Computing Systems, pp. 4017–4031. 

Marsden N & Pröbster M 2019, ‘Personas and Identity: Looking at Multiple Identities to Inform 
the Construction of Personas’, in Proceedings of the 2019 CHI Conference on Human 
Factors in Computing Systems - CHI ’19, ACM Press, Glasgow, pp. 1–14. DOI:
10.1145/3290605.3300565. 


Matthews T, Judge T & Whittaker S 2012, ‘How do designers and user experience professionals 
actually perceive and use personas?’, in Proceedings of the 2012 ACM annual conference 
on Human Factors in Computing Systems - CHI ’12, ACM Press, Austin, p. 1219. DOI: 
10.1145/2207676.2208573. 

McGinn JJ & Kotamraju N 2008, ‘Data-driven persona development’, in Proceedings of the SIGCHI 
Conference on Human Factors in Computing Systems, ACM, Florence, pp. 1521–1524. DOI: 
10.1145/1357054.1357292. 

Miaskiewicz T & Luxmoore C 2017, ‘The Use of Data-Driven Personas to Facilitate 
Organizational Adoption–A Case Study’, The Design Journal, vol. 20, no. 3, pp. 357–374. 

Mijač T, Jadrić M & Ćukušić M 2018, ‘The potential and issues in data-driven development of 
web personas’, in 2018 41st International Convention on Information and Communication 
Technology, Electronics and Microelectronics (MIPRO), pp. 1237–1242. DOI: 
10.23919/MIPRO.2018.8400224. 

Nielsen L 2010, ‘Personas in Cross-Cultural Projects’, in Katre D, Orngreen R, Yammiyavar P, et 
al. (eds) Human Work Interaction Design: Usability in Social, Cultural and Organizational 
Contexts. IFIP Advances in Information and Communication Technology. Springer, 
Berlin, pp. 76–82. DOI: 10.1007/978-3-642-11762-6_7. 

— 2019a, ‘Going Global—International Personas’, in Nielsen L (ed.) Personas - User Focused 
Design. Human–Computer Interaction Series. Springer, London, pp. 123–133. DOI: 
10.1007/978-1-4471-7427-1_7. 

— 2019b, Personas - User Focused Design. 2nd ed. 2019 edition. Springer, New York. 

Nielsen L & Storgaard Hansen K 2014, ‘Personas is applicable: a study on the use of personas in
Denmark’, in Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems, ACM, Toronto, pp. 1665–1674.

Nielsen L, Hansen KS, Stage J & Billestrup J 2015, ‘A Template for Design Personas: Analysis of 
47 Persona Descriptions from Danish Industries and Organizations’, International 
Journal of Sociotechnology and Knowledge Development, vol. 7, no. 1, pp. 45–61. DOI: 
10.4018/ijskd.2015010104. 

Nielsen L, Jung S-G, An J, Salminen J, Kwak H & Jansen BJ 2017, ‘Who Are Your Users?: 
Comparing Media Professionals’ Preconception of Users to Data-driven Personas’, in 
Proceedings of the 29th Australian Conference on Computer-Human Interaction, Brisbane, 
ACM, Queensland, pp. 602–606. OZCHI ’17. DOI: 10.1145/3152771.3156178.

Pruitt J & Grudin J 2003, ‘Personas: Practice and Theory’, in Proceedings of the 2003 Conference 
on Designing for User Experiences, DUX ’03. ACM, San Francisco, pp. 1–15. DOI: 
10.1145/997078.997089. 

Rönkkö K 2005, ‘An Empirical Study Demonstrating How Different Design Constraints, Project 
Organization and Contexts Limited the Utility of Personas’, in Proceedings of the
38th Annual Hawaii International Conference on System Sciences -
Volume 08, HICSS ’05. IEEE Computer Society, Washington. DOI: 
10.1109/HICSS.2005.85. 

Rönkkö K, Hellman M, Kilander B & Dittrich Y 2004, ‘Personas is Not Applicable: Local Remedies 
Interpreted in a Wider Context’, in Proceedings of the Eighth Conference on Participatory 
Design: Artful Integration: Interweaving Media, Materials and Practices - Volume 1, PDC
’04. ACM, Toronto, pp. 112–120. DOI: 10.1145/1011870.1011884.

Sakata J, Zhang M, Pu S, Xing J & Versha K 2014, ‘Beam: a mobile application to improve
happiness and mental health’, in CHI’14 Extended Abstracts on Human Factors in 
Computing Systems, pp. 221–226. 

Salminen J, Jansen BJ, An J, Kwak H & Jung S-G 2018, ‘Are personas done? Evaluating their 
usefulness in the age of digital analytics’, Persona Studies, vol. 4, no. 2, pp. 47–65. DOI: 
10.21153/psj2018vol4no2art737. 

Salminen J, Jansen BJ, An J, Jung S-G, Nielsen L & Kwak H 2018, ‘Fixation and Confusion – 
Investigating Eye-tracking Participants’ Exposure to Information in Personas’, in 
Proceedings of the ACM SIGIR Conference on Human Information Interaction and Retrieval 
(CHIIR 2018), ACM, New Jersey, pp. 110–119. DOI: 10.1145/3176349.3176391. 


Salminen J, Jung S-G, An J, Kwak H & Jansen BJ 2018, ‘Findings of a User Study of Automatically 
Generated Personas’, in Extended Abstracts of the 2018 CHI Conference on Human Factors 
in Computing Systems – CHI ’18, ACM Press, Montreal, pp. 1–6. DOI: 
10.1145/3170427.3188470. 

Salminen J, Jung S-G, An J, Kwak H, Nielsen L & Jansen BJ 2019, ‘Confusion and information 
triggered by photos in persona profiles’, International Journal of Human-Computer 
Studies, vol. 129, pp. 1–14. DOI: 10.1016/j.ijhcs.2019.03.005. 

Salminen J, Jung S-G & Jansen BJ 2019, ‘Detecting Demographic Bias in Automatically Generated 
Personas’, in Extended Abstracts of the 2019 CHI Conference on Human Factors in 
Computing Systems, CHI EA ’19. ACM, New York, pp. LBW0122:1-LBW0122:6. DOI: 
10.1145/3290607.3313034. 

Salminen J, Nielsen L, Jung S-G, An J, Kwak H & Jansen BJ 2018, ‘“Is More Better?”: Impact of 
Multiple Photos on Perception of Persona Profiles’, in Proceedings of ACM CHI Conference 
on Human Factors in Computing Systems (CHI2018), ACM, Montréal, DOI: 
10.1145/3173574.3173891. 

Salminen J, Sengün S, Jung S-G & Jansen BJ 2019, ‘Design Issues in Automatically Generated 
Persona Profiles: A Qualitative Analysis from 38 Think-Aloud Transcripts’, in 
Proceedings of the ACM SIGIR Conference on Human Information Interaction and Retrieval 
(CHIIR), ACM, Glasgow, pp. 225–229. DOI: 10.1145/3295750.3298942. 

Salminen J, Şengün S, Kwak H, Jansen BJ, An J, Jung S-G, Vieweg S & Harrell DF 2018, ‘From 2,772 
segments to five personas: Summarizing a diverse online audience by generating 
culturally adapted personas’, First Monday, vol. 23, no. 6. DOI: 10.5210/fm.v23i6.8415.

Salminen J, Santos JM, Jung S-G, Eslami M & Jansen BJ 2019, ‘Persona Transparency: Analyzing
the Impact of Explanations on Perceptions of Data-Driven Personas’, International
Journal of Human–Computer Interaction, vol. 0, no. 0, pp. 1–13. DOI:
10.1080/10447318.2019.1688946. 

Salminen J, Guan K, Jung S-G, Chowdhury SA & Jansen BJ 2020, ‘A Literature Review of
Quantitative Persona Creation’, in CHI ’20: Proceedings of the 2020 CHI Conference on 
Human Factors in Computing Systems, ACM, Honolulu, pp. 1–14. DOI: 
https://doi.org/10.1145/3313831.3376502. 

Salminen J, Jung S-G & Jansen BJ 2020, ‘Explaining Data-Driven Personas’, in Proceedings of the 
Workshop on Explainable Smart Systems for Algorithmic Transparency in Emerging 
Technologies co-located with 25th International Conference on Intelligent User Interfaces 
(IUI 2020), CEUR Workshop Proceedings, Cagliari, p. 7. DOI: urn:nbn:de:0074-2582-4. 

Salminen J, Jung S-G, Chowdhury SA, Sengün S & Jansen BJ 2020, ‘Personas and Analytics: A 
Comparative User Study of Efficiency and Effectiveness for a User Identification Task’, in 
Proceedings of the ACM Conference of Human Factors in Computing Systems (CHI’20), 
ACM, Honolulu, DOI: https://doi.org/10.1145/3313831.3376770. 

Salminen J, Liu Y-H, Sengün S, Santos JM, Jung S-G & Jansen BJ 2020, ‘The Effect of
Numerical and Textual Information on Visual Engagement and Perceptions of AI-Driven 
Persona Interfaces’, in IUI ’20: Proceedings of the 25th International Conference on 
Intelligent User Interfaces, ACM, Cagliari, pp. 357–368. DOI:
https://doi.org/10.1145/3377325.3377492. 

Salminen J, Froneman W, Jung S-G, Chowdhury SA & Jansen BJ 2020, ‘The Ethics of Data-
Driven Personas’, in Extended Abstracts of the 2020 CHI Conference on Human Factors in
Computing Systems, CHI ’20. Association for Computing Machinery,
Honolulu, pp. 1–9. DOI: 10.1145/3334480.3382790. 

Seidelin C, Jonsson A, Høgild M, Rømer J & Diekmann P 2014, ‘Implementing personas for 
international markets: a question of UX maturity’, in Proceedings at SIDER. 

Siegel DA 2010, ‘The Mystique of Numbers: Belief in Quantitative Approaches to Segmentation 
and Persona Development’, in CHI ’10 Extended Abstracts on Human Factors in 
Computing Systems, CHI EA ’10. ACM, New York, pp. 4721–4732. DOI: 
10.1145/1753846.1754221. 


Thomas TW, Tabassum M, Chu B & Lipford H 2018, ‘Security during application development: 
An application security expert perspective’, in Proceedings of the 2018 CHI Conference on 
Human Factors in Computing Systems, pp. 1–12. 

Turner P & Turner S 2011, ‘Is stereotyping inevitable when designing with personas?’, Design
Studies, vol. 32, no. 1, pp. 30–44.

Watanabe Y, Washizaki H, Honda K, Noyori Y, Fukazawa Y, Morizuki A, Shibata H, Ogawa K, 
Ishigaki M, Shiizaki S, Yamaguchi T & Yagi T 2017, ‘ID3P: Iterative Data-driven 
Development of Persona Based on Quantitative Evaluation and Revision’, in Proceedings 
of the 10th International Workshop on Cooperative and Human Aspects of Software 
Engineering, CHASE ’17. IEEE Press, Piscataway, pp. 49–55. DOI: 
10.1109/CHASE.2017.9. 

Wright P & McCarthy J 2008, ‘Empathy and Experience in HCI’, in Proceedings of the SIGCHI 
Conference on Human Factors in Computing Systems, CHI ’08. ACM, Florence, pp. 637–646.
DOI: 10.1145/1357054.1357156.

Zhang X, Brown H-F & Shankar A 2016, ‘Data-driven Personas: Constructing Archetypal Users 
with Clickstreams and User Telemetry’, in Proceedings of the 2016 CHI Conference on 
Human Factors in Computing Systems, CHI ’16. ACM, San Jose, pp. 5350–5359.  

Zhu H, Wang H & Carroll JM 2019, ‘Creating Persona Skeletons from Imbalanced Datasets - A 
Case Study using U.S. Older Adults’ Health Data’, in Proceedings of the 2019 on Designing 
Interactive Systems Conference - DIS ’19, ACM Press, San Diego, pp. 61–70. DOI: 
10.1145/3322276.3322285.