Avatar Web-Based Self-Report Survey System Technology for Public Health Research: Technical Outcome 
Results and Lessons Learned  
 
 
Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org * 8(2):e189, 2016 

 
OJPHI 

Avatar Web-Based Self-Report Survey System Technology for 
Public Health Research: Technical Outcome Results and 
Lessons Learned 
Craig Savel1; Stan Mierzwa1; Pamina M. Gorbach (Dr.P.H.)2; Samir Souidi1; Michelle Lally 
(MD)3; Gregory Zimet (Ph.D.)4;  Adolescent Medicine Trials Network for HIV/AIDS 
Interventions 

1. Information Technology, Population Council, New York, NY 
2. Department of Epidemiology, University of California, Los Angeles (UCLA), CA 
3. Alpert Medical School of Brown University, Lifespan Hospital System, and VA Medical Center,
Providence, RI
4. Indiana University School of Medicine, Indianapolis, IN 

 
Abstract 

This paper reports on a specific Web-based self-report data collection system that was developed for 
a public health research study in the United States. Our focus is on technical outcome results and 
lessons learned that may be useful to other projects requiring such a solution. The system was 
accessible from any device that had a browser that supported HTML5. Report findings include: which 
hardware devices, Web browsers, and operating systems were used; the rate of survey completion; 
and key considerations for employing Web-based surveys in a clinical trial setting. 

Keywords: Self-Report Data Collection; Electronic Data Collection; CASI; Avatars; HTML5; Smartphones; 
Web Browsers; Web-Based Survey; Clinical Trials 

Correspondence: smierzwa@popcouncil.org 

DOI: 10.5210/ojphi.v8i2.6719 

Copyright ©2016 the author(s) 
 
This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. 
Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the 
copy and the copy is used for educational, not-for-profit purposes. 

Introduction 

Given the challenges associated with collecting accurate self-reported data in research studies, new 
approaches using customizable avatars and online questionnaires are being developed in an 
attempt to improve the frequency and accuracy of self-reports. In looking for ways to better collect 
survey data, we developed a technology solution consisting of a Web-based self-report data
collection system that used customizable avatars to collect data. Participants were instructed to 
take two surveys at specific time periods. Self-created avatars “traveled” with participants such 
that they would appear during the first and second surveys and also appear if a participant restarted 
a survey that was not completed on the first attempt. The survey website was HTML5 compatible, 
but we elected not to use HTML5 local storage because of confidentiality concerns and 
requirements for data security in this public health research study. The survey was designed to
work on any tablet, smartphone, or computer with an HTML5-compatible browser. Although we
recognized that participants who had older browsers
(Internet Explorer 8 or earlier, old versions of Firefox and Chrome) might not be able to access 
and complete the self-report survey, it was felt that most of the target audience would have little 
trouble doing so. To the best of our knowledge, no other study has specifically examined the 
devices, Internet browsers, and operating systems used to complete a Web-based self-report survey 
for a public health research project. 

Methods 

Many self-report electronic data collection systems in HIV and/or other public health research 
studies use technology that exists in a controlled environment. The study protocol generally 
dictates the type of computer or device to be used and the method for presenting the study’s survey. 
This study was an ancillary study to a large clinical trial of pre-exposure prophylaxis use by HIV-
negative adolescent males 15–17 years of age conducted at 12 sites in the United States through 
the NICHD-funded Adolescent Trials Network. Most of these sites were adolescent HIV clinics 
(ATN 110/113). After completing procedures for the clinical trial, adolescent males enrolled in the 
trial were offered participation in this ancillary study. If they agreed to participate, they were given 
choices on how to complete the study questionnaire. Participants were able to access the Web-
based survey either from inside a study clinic (using clinic computers) or from a device of the 
participant’s choice (either inside or outside the clinic setting). This meant that the self-report 
survey system needed to be built such that participants could access the survey from computers, 
tablets, or smartphones on a variety of operating systems using many different browsers. The 
system needed to allow participants to create their own customized avatars that would follow them 
through the questionnaire. It also had to allow them to edit or use the same avatar in a follow-up 
survey. The customized avatars would appear on each question screen, and they would move to 
different locations on the screen in order to present the survey questions within a text bubble [1]. 

During the Web-based self-report survey data collection, information was collected on several
technical measures, such as which Web browser and operating system were used. These measures were
drawn from the log files that Web servers routinely keep; our study used Microsoft Internet
Information Services (IIS) to capture this information. Several elements are reported in the
Findings section, including the preference for using the interactive Web-based survey system, the
amount of time taken to complete the electronic survey, and the percentage of participants who
completed the survey. Many qualities of the very simple end-user screen design, as well as the
recorded elements of start time, end time, and computer name, were adopted from the Population
Council ACASI technology solution [2].
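
As an illustration of how browser information can be read from Web server log files, the following is a minimal sketch, not the parser used in the study. It assumes a log in the W3C extended format, in which a "#Fields:" directive names the space-separated columns and spaces inside a field value are written as "+". The sample lines, field selection, and the function name parse_w3c_log are hypothetical.

```python
from typing import Dict, Iterator, List

# Hypothetical excerpt in W3C extended log format; all values are invented.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.5
#Fields: date time cs-method cs-uri-stem c-ip cs(User-Agent) sc-status
2014-03-02 15:04:11 GET /survey/q1 10.0.0.5 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+Trident/5.0) 200
2014-03-02 15:09:47 GET /survey/q2 10.0.0.5 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+Trident/5.0) 200
"""


def parse_w3c_log(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per request line, keyed by the names in '#Fields:'."""
    fields: List[str] = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            # The directive lists the column names for the lines that follow.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            if len(values) != len(fields):
                continue  # skip malformed or truncated lines
            record = dict(zip(fields, values))
            ua = record.get("cs(User-Agent)")
            if ua is not None:
                # Spaces were logged as '+'; a literal '+' in the original
                # string cannot be told apart, which this sketch accepts.
                record["cs(User-Agent)"] = ua.replace("+", " ")
            yield record


records = list(parse_w3c_log(SAMPLE_LOG))
```

Each resulting record carries the user agent text alongside the request time, which is the raw material for the browser and operating system counts reported in the Findings section.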

Findings (Results) 

The study enrolled its first participants in July 2013 and completed enrollment in July 2015. Since 
each of the 167 study participants should have made at least two visits to the assigned clinic during 
the study, and may have accessed the survey additional times as needed to complete it, we were 
able to collect sufficient data. Because of potential confidentiality issues, we were not able to use 
cookies to track visitors, even anonymously. We were therefore also not able to use Google 
Analytics. The preferred strategy was to parse the USER AGENT string, glean as much as possible
from it, and write that information into a secure database. The browser sent a USER AGENT text
string to the server with every request; this occurred automatically whenever the browser on the
participant’s device communicated with the server. A USER AGENT string indicates which browser was
used, its version number, and details about the user’s system, such as the operating system and
version. For various reasons, including incomplete or corrupted USER AGENT strings, some visits
to the survey (that is, communications between the browser and the server) may not have been
logged.
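
The parsing step can be sketched as follows. This is an illustrative classifier under stated assumptions, not the authors' implementation: the patterns, category labels, and the names Client and classify are hypothetical, and real USER AGENT strings vary far more than this handles. Following the convention used in Table 1, the sketch reports version 0.0 when no version can be found in the string.

```python
import re
from typing import NamedTuple


class Client(NamedTuple):
    browser: str
    version: str
    os: str


# Order matters: Chrome's string also contains the token "Safari".
_BROWSER_PATTERNS = [
    ("Internet Explorer", re.compile(r"MSIE (\d+\.\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+\.\d+)")),
    ("Chrome", re.compile(r"Chrome(?:/(\d+\.\d+))?")),
    ("Safari", re.compile(r"Safari")),
]

# Order matters: iOS strings contain "like Mac OS X"; Android strings contain "Linux".
_OS_PATTERNS = [
    ("Windows XP", re.compile(r"Windows NT 5\.1")),
    ("Windows Vista", re.compile(r"Windows NT 6\.0")),
    ("Windows 7", re.compile(r"Windows NT 6\.1")),
    ("Windows 8", re.compile(r"Windows NT 6\.2")),
    ("iOS", re.compile(r"iPhone|iPad|iPod")),
    ("Android", re.compile(r"Android")),
    ("Mac OSX", re.compile(r"Mac OS X")),
    ("Linux", re.compile(r"Linux")),
]


def classify(user_agent: str) -> Client:
    """Best-effort guess at browser, version, and OS from a USER AGENT string."""
    ua = user_agent or ""
    browser, version = "Other or unknown", "0.0"
    for name, pattern in _BROWSER_PATTERNS:
        match = pattern.search(ua)
        if not match:
            continue
        browser = name
        if name == "Safari":
            # Safari reports its version in a separate "Version/x.y" token.
            found = re.search(r"Version/(\d+\.\d+)", ua)
            version = found.group(1) if found else "0.0"
        else:
            version = match.group(1) or "0.0"
        break
    os_name = next((n for n, p in _OS_PATTERNS if p.search(ua)), "Unknown")
    return Client(browser, version, os_name)
```

For example, classify("Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)") yields Internet Explorer, version 9.0, on Windows 7, while an empty or corrupted string falls into the "Other or unknown" and "Unknown" categories.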

Table 1. Basic result data on utilized Internet browsers and versions: June 2013–July 2015

BROWSER and VERSION        # VISITS    % OF VISITS
Internet Explorer             196         34%
    7.0                        28
    8.0                        19
    9.0                       126
    10.0                       23
Safari                        149         26%
    0.0 (see note)            110
    4.0                         1
    5.0                         3
    5.1                         3
    6.0                         1
    6.1                        31
Chrome                        128         21%
    0.0 (see note)            118
    18.0                        8
    27.0                        2
Firefox                         4         < 1%
    16.0                        1
    21.0                        3
Android 4.0                    13          2%
Other or unknown               89         16%

NOTE: Version 0.0 for both Safari and Chrome browsers occurs when the browser version is not
part of the string sent to the server.

These data show some notable differences from general statistics for US users, as collected by the
website Statcounter (http://gs.statcounter.com/#all-browser-US-monthly-201306-201504-bar). Among
participants taking the survey, the most popular browser was Internet Explorer, by a margin of 8
percentage points. The second most popular was Safari, and the third most popular was Chrome. (See
Table 1.) In contrast, Statcounter shows that for the time period the survey was running, Chrome
was the most popular browser at 31%, Internet Explorer was second at 24%, and Safari was third at
23%. Firefox had 11% usage in the Statcounter data but accounted for less than 1% of survey visits.
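
The comparison can be restated as simple arithmetic. The short sketch below uses only the percentages quoted in the text and in Table 1; the variable names are illustrative, and Firefox is left out because its survey share is reported only as "less than 1%".

```python
# Percentages as reported in Table 1 (survey) and quoted from Statcounter.
survey_pct = {"Internet Explorer": 34, "Safari": 26, "Chrome": 21}
statcounter_pct = {"Internet Explorer": 24, "Safari": 23, "Chrome": 31}

# Lead of the top survey browser over the runner-up, in percentage points.
ranked = sorted(survey_pct.values(), reverse=True)
lead_points = ranked[0] - ranked[1]

# Survey share minus general US share, in percentage points, per browser.
difference = {name: survey_pct[name] - statcounter_pct[name] for name in survey_pct}

for name, points in difference.items():
    print(f"{name}: {points:+d} percentage points relative to Statcounter")
```

On these figures, Internet Explorer was over-represented in the survey by 10 percentage points and Chrome under-represented by the same amount.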

What can account for this difference? It is impossible to know, but we hypothesize that more users
opted to fill out the surveys at the clinic than we had expected. Businesses, governments, and
social service organizations are often “late adopters” of technology, so if users filled out the
surveys on clinic computers at the time of their visit, that could account for the difference.
Since survey participants were young people, we also expected mobile devices, especially iPhones
and iPads, to be a factor. Apple mobile products use Safari as the default browser, which could
account for the relatively greater use of Safari in our survey as compared with general statistics.

Table 2. Basic result data on operating systems usage: ATN 123* June 2013–July 2015

OPERATING SYSTEM           # VISITS    % OF VISITS
Windows                       302         53%
    Windows 7                 264
    Windows 8                   4
    Windows XP                 31
    Windows Vista               3
Mac OSX                        99         17%
Unknown                        89         16%
Linux                          39          7%
iOS                            24          4%
Android                        20          3%

*ATN – Adolescent Medicine Trials Network

Striking differences were also found between survey respondents’ operating system usage (see
Table 2) and the general statistics for US users during that time period, as reported in
Statcounter data on operating systems (http://gs.statcounter.com/#all-os-US-monthly-201306-201504-bar).
In our survey, more than 50% of visits were from a Windows-based computer, whereas 17% were from
a Mac OSX-based system.

Although it is difficult to determine the device type from webserver log files, one can infer a
likely device from a log entry. For instance, if a log entry shows a Windows operating system,
the device is most likely a Windows PC or laptop. If a log entry shows iOS as the operating system,
then it is an Apple mobile device. It is much more difficult to determine, however, whether that
device is an iPhone or an iPad. Another problem is that for many versions of mobile operating
systems, the USER AGENT header contains no accurate information. This is especially true of
Android devices.
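
The inference described above amounts to a small lookup from operating system to likely device class. The sketch below is a heuristic only; the function name infer_device and the category wording are hypothetical, and the operating system labels follow Table 2.

```python
def infer_device(os_name: str) -> str:
    """Map an operating system label to a likely device class (heuristic only)."""
    name = os_name.lower()
    if name.startswith("windows") or name.startswith("mac os"):
        return "desktop or laptop (likely)"
    if name == "ios":
        # The log entry alone does not say which Apple mobile device it was.
        return "Apple mobile device (iPhone or iPad, not distinguishable)"
    if name == "android":
        return "Android phone or tablet"
    # Linux and unknown systems give no reliable hint about the device.
    return "undetermined"
```

The "undetermined" category is deliberate: forcing every log entry into a device class would overstate what the log files can support.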

How successful was the survey? To gauge effectiveness, some of the questions we might ask are: 
What percentage of respondents finished the self-report avatar survey? How many finished on the 
first attempt? How long did it take to finish the survey? 

The number of respondents who started Survey 1 was 154; 96 completed it. The number of 
respondents who started Survey 2 was 106; 89 completed it. There was only one user who 
attempted both surveys and completed neither. 

Table 3. Totals of users who started and completed surveys 1 and 2

Started Survey 1                     154
Started Survey 2                     106
Completed Survey 1                    96
Completed Survey 2                    89
Completed 1; did not complete 2        9
Completed 2; did not complete 1        2
Completed both surveys                87

Among those who started the first survey, 62% finished it. Almost 84% of those who started the 
second survey completed it. (See Table 3.) 
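
These rates follow directly from the counts in Table 3, as the short check below shows. The variable names are illustrative; the numbers are those reported in the table.

```python
# Counts as reported in Table 3.
started = {1: 154, 2: 106}
completed = {1: 96, 2: 89}
completed_both = 87
completed_1_only = 9
completed_2_only = 2

# Completion rate per survey, as a percentage of those who started it.
rate_pct = {n: round(100 * completed[n] / started[n], 1) for n in started}

# Internal consistency: the breakdown columns should add up to the totals.
consistent = (
    completed_both + completed_1_only == completed[1]
    and completed_both + completed_2_only == completed[2]
)

print(rate_pct, consistent)
```

The breakdown columns are internally consistent: 87 + 9 = 96 completions of Survey 1 and 87 + 2 = 89 completions of Survey 2.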

Participants were allowed multiple attempts to complete the survey. Most who completed the 
survey did so on one try, although a few required multiple tries. 

We cannot know, of course, why a participant needed more than one attempt to complete a survey, 
but it is instructive to compare survey attempts. For example, what were the browsers and 
operating systems used for each attempt? Were they the same or different? Can we observe patterns? 

A quick and preliminary look at survey statistics shows that most of those who logged in more 
than one time for a given survey logged in using the same browser and operating system as they 
had used previously. There were a few trends though. Most users who switched went from a 
Windows machine using Internet Explorer to either Windows using Chrome or a Mac. The second
largest group of switchers went from Safari on iOS to Safari on Mac OS. Those who started on 
Android tended to stay on Android. 
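
One way to tabulate such switches is to compare the browser and operating system of consecutive attempts by the same participant on the same survey. The sketch below uses invented rows and hypothetical names (attempts, count_switches); it assumes the rows are already in chronological order and is not the analysis code used in the study.

```python
from collections import Counter, defaultdict

# Hypothetical attempt log: (participant, survey, browser, operating system),
# assumed to be listed in chronological order.
attempts = [
    ("p01", 1, "Internet Explorer", "Windows"),
    ("p01", 1, "Chrome", "Windows"),
    ("p02", 1, "Safari", "iOS"),
    ("p02", 1, "Safari", "Mac OSX"),
    ("p03", 2, "Chrome", "Android"),
    ("p03", 2, "Chrome", "Android"),
]


def count_switches(rows):
    """Count (previous client, next client) pairs that differ between attempts."""
    by_key = defaultdict(list)
    for participant, survey, browser, os_name in rows:
        by_key[(participant, survey)].append((browser, os_name))
    switches = Counter()
    for clients in by_key.values():
        # Compare each attempt with the one immediately before it.
        for previous, current in zip(clients, clients[1:]):
            if previous != current:
                switches[(previous, current)] += 1
    return switches


switches = count_switches(attempts)
```

In this invented data, two participants switched and one (on Android) did not, so the counter holds two distinct transitions.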

Some participants did not complete the surveys. This includes those who did not complete the 
surveys at all and those who did not complete the surveys in the allotted time but returned to restart 
from the beginning. 

For participants who did not complete either survey on the first visit, including those who returned 
and completed the survey later and those who did not, certain questions were “stoppers.” In other 
words, many users who stopped the survey stopped at the same questions. The one that caused the 
most users to stop was a complicated type of question that was presented in a calendar. Users were 
required to answer two questions for each calendar date, accessible via a pop-up. Users were not 
able to advance to the next question until both questions were answered for all dates. Eighteen 
participants left the survey without completing the question that was presented in that format. Six 
users logged in, did not complete any questions, and did not return to complete the survey. Other 
than that, no more than two users were stopped at any other particular question. 
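
Identifying "stopper" questions of this kind reduces to counting, across incomplete sessions, the question at which each session ended. The following sketch shows the idea with invented session records; the field names and question identifiers are hypothetical.

```python
from collections import Counter

# Hypothetical session records; "last_question" is where the session ended.
sessions = [
    {"participant": "p01", "last_question": "calendar", "completed": False},
    {"participant": "p02", "last_question": "calendar", "completed": False},
    {"participant": "p03", "last_question": "q07", "completed": False},
    {"participant": "p04", "last_question": "end", "completed": True},
    {"participant": "p05", "last_question": "calendar", "completed": False},
]


def stopper_counts(records):
    """Rank questions by how many incomplete sessions ended on them."""
    counts = Counter(r["last_question"] for r in records if not r["completed"])
    return counts.most_common()


ranking = stopper_counts(sessions)
```

A question that dominates such a ranking, as the calendar question did in this study, is a candidate for redesign or for relaxing the requirement that every part be answered before advancing.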

Discussion 

The Web-Based Self-Report Survey was made available to participants with the option of taking 
it in a controlled clinic environment or on their own outside the clinic using whatever device they 
had, wherever they were. Because of confidentiality concerns it is not possible to verify that the 
surveys were more often completed in a clinic. In a future similarly designed project, it would be 
beneficial to consider adding logic to the survey to record whether it was actually taken in one of 
the original clinic sites on a computer belonging to the study. 
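
One possible form of such logic, offered only as a sketch, is to compare the address of each request against the known network ranges of the study clinics and store nothing but the resulting yes/no flag, so that the address itself is never retained. The ranges below are reserved documentation addresses, not real clinic networks, and the names CLINIC_NETWORKS and taken_in_clinic are hypothetical. Whether this approach satisfies a given study's confidentiality requirements would need to be reviewed separately.

```python
import ipaddress

# Placeholder ranges from the blocks reserved for documentation;
# a real deployment would list each clinic's actual network ranges.
CLINIC_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def taken_in_clinic(client_ip: str) -> bool:
    """Return True if the request address falls inside a known clinic range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CLINIC_NETWORKS)


# Only the boolean would be written to the study database, not the address.
flag = taken_in_clinic("192.0.2.17")
```

This check cannot distinguish a study computer from a participant's own device connected to clinic Wi-Fi, so it answers "in the clinic" rather than "on a computer belonging to the study".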

The Web-Based Self-Report Survey included many of the assumed benefits of electronic survey-
taking via the Web: consistency in survey presentation, minimization of errors in data collected 
because of edit and range checks, and the ability to know when surveys are completed and 
consequently prevent users from taking a survey they had already completed. The number of 
studies collecting self-reported data via the Web continues to increase rapidly [3]. The quality of 
anthropometric data collected using a Web-based questionnaire, with regard to missing and 
plausible answers, has been shown to be equal to, or better than, that of data collected using a paper 
version of the questionnaire [4]. 

To complete the Web-based survey, participants were provided with the secure link to the site as 
well as a user ID and password to use when logging in. When surveys were taken in the clinic, it
was much easier to verify that the actual participant was taking the survey; when the survey was
taken away from the clinic, it was possible that the participant was aided or that someone else
completed the survey on the participant’s behalf.

Limitations 

For 16% of the surveys, it was not possible to get information on the browser or operating system. 
These data come from text strings sent from a user’s browser to a server. A large percentage of
data were unclassifiable, and this could have had an impact on the outcome of top browser and 
operating system used, given that the separation between the top three browsers was small: 34%, 
26%, and 21%. It is important to remember that browsers were under no absolute requirement to send
this string and that there was no correction mechanism if the data were incomplete or missing.
Karl Groves writes in the online design journal boxesandarrows:

“Server log files are inappropriate for gathering usability data. They are meant to provide server 
administrators with data about the behavior of the server, not the behavior of the user. The log file 
is a flat file containing technical information about requests for files on the server.” [5] 

Since the Avatar-Based Self-Report Survey was administered in the United States, it was assumed 
that Internet access would be widely available; it was therefore anticipated that, in most cases, 
participants would perform the survey at home. However, we cannot be sure that the participants 
had Internet access via their mobile devices and/or home computers. The requirement of Internet 
access may lead to a higher rate of completion in clinics if studies such as this are conducted in 
the developing world where many such public health research projects are likely to take place. In 
addition, for those who did not have Internet access, it was not assessed if this was associated with 
any sociodemographic factors. If future similar projects are to include self-report Web-based 
surveys in the context of a clinical trial, we would recommend reviewing data on Internet 
broadband access availability. The National Broadband Map (NBM) is an available resource that 
is created and maintained by the National Telecommunications & Information Administration in 
collaboration with the FCC, 50 US states, 5 territories, and the District of Columbia 
(www.broadband.gov). By reviewing the Internet access data available to households in the United 
States, one could do a scan to ensure that adequate coverage is available. Household broadband 
adoption rates have increased dramatically over the past decade, from about 4% in 2000 to nearly 
70% in 2011 [6]. Although this is quite an increase, the latest US Census Population Surveys do
suggest there is still a gap in home Internet access. Current Population Survey data from 2003 to
2011 demonstrate a persistent 12–13 percentage point gap in broadband adoption rates between
metropolitan and nonmetropolitan households [6]. Depending on the demographic
characteristics of the survey participants in particular studies, it could also be useful to focus more 
specifically on the Internet access that is available to particular age groups as well as the race 
and/or ethnic background of the householder. Such data is available in the US Census American 
Community Survey Reports. As of 2014, it was reported that for individuals in the age range of 
15–34 years, Internet access was available to 77.4% of households. In addition, Internet access 
was available to 76.2% of White-only households, 60.6% of Black-only households, 86% of 
Asian-only households, and 65.9% of Hispanic of any race households [6]. These data could be 
helpful in scanning a prospective survey participant population to determine the probability of 
using a Web-based self-report survey outside the clinic. In a study that compared telephone and
Web-based administration of an adolescent survey, 41.5% completed the survey online, compared with
59.8% by telephone interview [7]. This finding suggests that the method of administering an
adolescent survey deserves deliberate consideration, rather than an assumption that a Web-based
approach is optimal. Finally, the surveys were conducted at 12 locations across the
United States. The sites may have implemented the study differently and it may have been easier 
for them to manage reimbursement and retention if participants completed the survey in the clinic. 
This may have had an effect on the survey data discussed above. 

Conclusion 

Public health researchers, particularly in the social science and epidemiology arenas, continue to
consider technologies that would aid in obtaining more accurate responses in self-report surveys.
There is ample research on conducting Web-based surveys, but more knowledge and research into
self-report public health surveys is needed.

We found that just over half of survey visits came from Windows-based computers and that Internet
Explorer was the most frequently used browser. For participants who needed more than one visit to
complete a survey, the largest percentage went from a Windows-based computer running Internet
Explorer to an Apple OS (either Mac OS or iOS) running Safari.

For future projects requiring the use of a self-report survey that allows respondents to use their 
own personal devices (BYOD), we would consider adding several new elements to the solution. 
These changes or additions would include: better logging of the exact browser, version, and 
operating system used; the time it took to respond to each question; access to the Internet and 
device access away from the clinic for participants; and individual participant demographics. 
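
To make the first two of these elements concrete, the sketch below shows one hypothetical record layout that stores the browser, version, and operating system for each visit together with per-question timings. The class and field names are assumptions for illustration, not a description of any existing system, and the example values are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class QuestionTiming:
    question_id: str
    shown_at: datetime
    answered_at: datetime

    @property
    def seconds(self) -> float:
        """Time between showing the question and receiving the answer."""
        return (self.answered_at - self.shown_at).total_seconds()


@dataclass
class SurveyVisit:
    participant_code: str  # study-assigned code, not a personal identifier
    survey_number: int
    browser: str
    browser_version: str
    operating_system: str
    in_clinic: bool
    timings: List[QuestionTiming] = field(default_factory=list)

    def total_seconds(self) -> float:
        return sum(t.seconds for t in self.timings)


visit = SurveyVisit("P-0001", 1, "Chrome", "27.0", "Windows 7", in_clinic=True)
visit.timings.append(
    QuestionTiming("q1", datetime(2015, 1, 5, 10, 0, 0), datetime(2015, 1, 5, 10, 0, 42))
)
visit.timings.append(
    QuestionTiming("q2", datetime(2015, 1, 5, 10, 0, 42), datetime(2015, 1, 5, 10, 1, 30))
)
```

Recording these fields at the application level, rather than reconstructing them from server logs afterward, would avoid the unclassifiable entries described in the Limitations section.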

Acknowledgments 

We thank Sarah Thornton and the ATN Data and Operations Center at Westat for their 
collaboration in setting up the operational data collection process at the many sites involved in this 
research study. We acknowledge the contribution of the investigators and staff at the following 
sites that participated in this study: University of South Florida, Tampa (Emmanuel, Straub, 
Enriquez-Bruce), Children's Hospital of Los Angeles (Belzer, Tucker), Children's National 
Medical Center (D'Angelo, Trexler), Children's Hospital of Philadelphia (Douglas, Tanney), John 
H. Stroger Jr. Hospital of Cook County and the Ruth M. Rothstein CORE Center (Martinez, Henry-
Reid, Bojan), Tulane University Health Sciences Center (Abdalian, Kozina), University of Miami 
School of Medicine (Friedman, Maturo), St. Jude Children's Research Hospital (Flynn, Dillard),
Baylor College of Medicine, Texas Children’s Hospital (Paul, Head); Wayne State University 
(Secord, Outlaw, Cromer); Johns Hopkins University School of Medicine (Agwu, Sanders, 
Anderson); The Fenway Institute (Mayer, Dormitzer); and University of Colorado (Reirden, 
Chambers). We would like to acknowledge Irene Friedland, at the Population Council, for the 
thorough edit of the paper she provided. The comments and views of the authors do not necessarily 
represent the views of the Eunice Kennedy Shriver National Institute of Child Health and Human 
Development. The study was scientifically reviewed by the ATN’s Community Prevention 
Leadership Group. Network, scientific and logistical support was provided by the ATN 
Coordinating Center (Wilson, Partlow) at The University of Alabama at Birmingham. The 
investigators are grateful to the members of the local youth Community Advisory Boards for their 
insight and counsel and are indebted to the youth who participated in this study. This work was 
supported by the Adolescent Medicine Trials Network for HIV/AIDS Interventions (ATN) and 
NIH support, Bill Kapogiannis and with supplemental funding from NIDA and NIMH Grant 
NICHD 5 U01 HD 40533 and 5 U01 HD 40474. 

References 

1. Savel, C., Mierzwa, S., Gorbach, P., et al. 2014. Web-based, mobile-device friendly, self-
report survey system incorporating avatars and gaming console techniques. Online J Public 
Health Inform. 6(2), •••. PubMed 

2. Mierzwa, S., Souidi, S., Friedland, I., et al. 2013. “Approaches that will yield greater success 
when implementing self-administered electronic data capture ICT systems in the developing 
world with an illiterate or semi-literate population.” New York: Population Council. 

3. Bonn, S.E., Trolle Lagerros, Y., and Bälter, K. 2013. How valid are web-based self-reports of 
weight? J Med Internet Res. 15(4), e52. doi:http://dx.doi.org/10.2196/jmir.2393. PubMed 

4. Touvier M., Méjean, C., Kesse-Guyot, E., et al. 2010. Comparison between web-based and 
paper versions of a self-administered anthropometric questionnaire. Eur J Epidemiol. 25(5), 
287-96. doi:http://dx.doi.org/10.1007/s10654-010-9433-9. PubMed 

5. Groves, Karl. 2007. “The limitations of server log files for usability analysis,” 
boxesandarrows. http://boxesandarrows.com/the-limitations-of-server-log-files-for-usability-
analysis/ 

6. File, Thom and Ryan, Camille. 2014. “Computer and Internet use in the United States: 2013,”
American Community Survey Reports. United States Census Bureau, ACS-28. 

7. Rivara, Frederick P., Koepsell, Thomas D., Wang, Jin, et al. 2011. Comparison of telephone 
with world wide web-based responses by parents and teens to a follow-up survey after injury. 
Health Serv Res. doi:10.1111/j.1475-6773.2010.01236.x. 

8. Whitacre, Brian, Strover, Sharon, and Gallardo, Roberto. 2015. How much does broadband 
infrastructure matter? Decomposing the metro-non-metro adoption gap with the help of the 
National Broadband Map. Gov Inf Q. 32, 261-69.  http://dx.doi.org/10.1016/j.giq.2015.03.002 

 
