Correspondence Address: Jacquelyn Burkell, Faculty of Information & Media Studies, Western University, London, ON, N6A 5B9; Email: jburkell@uwo.ca
ISSN: 1911-4788
Studies in Social Justice, Volume 15, Issue 3, 397-413, 2021
Expression in the Virtual Public: Social Justice Considerations in Harvesting Youth Online Discussions for Research Purposes
JACQUELYN BURKELL Western University, Canada
PRISCILLA REGAN George Mason University, USA
ABSTRACT Information posted by youth in online social media contexts is regularly accessed, downloaded, integrated, and analyzed by academic researchers. The practice raises significant social justice considerations for researchers including issues of representation and equitable distribution of risks and benefits. Use of this type of data for research purposes helps to ensure representation in research of the voices of (sometimes marginalized) youth who participate in these online contexts, at times discussing issues that are also under-represented. At the same time, youth whose data are harvested are subject (often without notice or consent) to the risks associated with this research, while receiving little if any direct benefit from the work. These risks include the potential loss of online social community as well as threats to participant rights and wellbeing. This paper explores the tension between the social justice benefit of representation and considerations that would suggest caution, the latter including inequitable distribution of research-related costs and benefits, and the traditional ethics concerns of participant autonomy and privacy in the context of youth participation in online discussions. In the final section, we propose guidelines and considerations for the conduct of online social media research to help researchers balance and respect representational and participant rights or wellbeing considerations, especially with youth.
KEYWORDS youth; social media; social justice; research
Introduction
Online discussion groups, especially those involving youth, provide a rich source of naturalistic data for research (Jowett, 2015). Research use of these data raises social justice considerations including issues of representation and equitable distribution of risks, benefits, and harms. Research ethics addresses some of these issues, albeit with particular emphasis on the protection of individual research participants; other social justice considerations fall outside of research ethics issues or are in tension with them. In this paper, we consider the social justice implications of research that uses data harvested from youth social media. We begin by analyzing the risks and benefits of this type of research from a social justice perspective, highlighting the tension between representation and respect for participant rights and wellbeing. In the next section of the paper, we discuss the guidance provided in research ethics guidelines for researchers considering the harvesting of social media data for research purposes, focusing on the ethical considerations that are raised in these guidelines. We follow this analysis with discussion of specific examples of research that has used harvested social media data, with specific attention to the social justice issues that are raised by these studies. Finally, we outline best practices and considerations that support ethical decisions about research involving the harvesting of social media information that are informed by social justice principles.
A quick Google search for discussion or support groups reveals support groups for transgender teens, pro-anorexia discussion groups, and support groups for those living with depression. 
There are also myriad online social spaces for group discussions of a more prosaic nature, such as reactions to and recommendations for books. Some groups require registration and sign-in to access conversations, while others do not. Some groups are moderated, while in other cases content and interactions are not moderated. Some discussion group interactions are archived and potentially searchable on the open web, while in other cases content is accessible only in real time. Participants in online discussion groups typically use nicknames or usernames and do not provide other personal (particularly identifying) information; however, in the case of groups that require registration, profile information may be available. Online discussion groups are a valuable social resource, particularly for individuals who are members of marginalized populations. These groups provide venues for personal expression, exploration, and support that may not otherwise be available. They often involve discussions among individuals who lack support from those immediately present in their lives as they deal with sensitive issues that often lead to shame or bullying. In these discussions, narratives of personal stories provide not only a means of personal recovery but also a route to social and political change (Costa et al., 2012). Often referred to as support groups, these venues for online discussion empower members through sharing similar experiences, fears, and hopes and potentially revealing the social and structural factors that constrain their lives and the inequities that need to be addressed. For example, Morrow and Weiser (2012) point out that “the social and structural aspects of mental health continue to be marginalized as do the voices of people with lived experiences of mental distress” (p. 30). 
Online social spaces are also important venues for political organization and exploration of political attitudes (Middaugh et al., 2017), enabling quick organization and real-world political action (Yang, 2007). These spaces, and the interactions within them, are particularly important for youth who live under conditions of political oppression (Bowe & Blom, 2010, 2011; Tufekci & Wilson, 2012) and for whom exposure could entail significant risk. Research that engages with the voices and stories (Costa et al., 2012; De Ridder & Van Bauwel, 2015; Trevisan & Reilly, 2014) of members of marginalized groups is important as a means of achieving social justice for these groups, but the research methods used also need to respect and protect both the group as a communal unit and the individuals within it. It is also important to note that the goal of representation cannot effectively be reached if researchers do not fully understand the content they are analyzing or the context in which it is produced. Thus, researchers must guard against engaging in “helicopter research” (Minasny & Fiantis, 2018) in which researchers “come in to small communities, take their samples, and leave” (Evans, 2018), with little understanding of the communities they study and without intention or action to ensure that the results of the research benefit the community members under observation (Evans, 2018). Although the term helicopter research is typically applied to international research with vulnerable communities, the same principles are relevant to outsider academic researchers who harvest data from online communities without benefit of extended engagement, negotiation, and deep attention to community values and rights. 
Although concerns about helicopter research are increasingly present in the research ethics context, the question that more commonly determines the acceptability of harvesting social media discussions is whether the group discussions are public or private in nature. Research ethics guidelines typically allow observation of behaviour in public spaces, so long as those being observed are not identified, and have no reasonable expectation of privacy. If these conditions are not met, research may be allowed under stricter conditions that include notice to participants and securing of consent. The question of whether online spaces are public or private is often contested, and online discussions occupy an apparently liminal position between public and private exchange. As a result it is often unclear whether participants hold an expectation of privacy in their group communications. From a social justice perspective, the implications for the conduct of research are complex. Representation is a key social justice principle, and from that perspective participation in research, and therefore representation in the results of research and the knowledge that flows from these results, is of significant positive value. Allowing covert research undermines the autonomy of participants, but may increase participation, and thus representation. By contrast, when research is acknowledged and participant consent is required, autonomy is supported but research participation, and thus representation, may suffer. Moreover, requiring notice and consent (which by definition makes research overt) can undermine the perceived safety of online social spaces, thus reducing the social value of these spaces that these individuals and groups have developed for their own purposes. 
The ethical issues regarding research using online discussion groups are even more problematic when youth may be participants in the groups. One issue is that minors are in many cases deemed unable to consent for themselves (e.g., Health Canada, 2019), in some cases because they are presumed to lack capacity (Canadian Institutes of Health Research et al., 2018, Article 3.3), and research ethics guidelines typically require parental consent for participation, often in conjunction with youth assent for data collection (Canadian Institutes of Health Research et al., 2018, Article 3.3). Seeking parental consent for participation, however, may raise privacy issues for youth, especially as online groups offer a venue to discuss topics about which youth may not be comfortable speaking with parents or teachers (see e.g., Taylor, 2008). If data collection is publicly acknowledged and thus could require consent, that may have the effect of limiting youth participation in these important discussions. A second issue is that youth are very sensitive to the perceived privateness of seemingly public spaces (boyd & Marwick, 2011), and as participants in those spaces may have different conceptions of their interactions, and different standards and practices with respect to appropriate practices within those spaces, compared to adults (Berriman & Thompson, 2015). De Ridder and Van Bauwel (2015) refer to social network sites as “extended places… strongly connected to particular offline local places of which they are often an extension” (p. 782). Relatedly, youth often employ online discussions as a forum for trying on personas, presenting different selves, and discovering the self with which they are most comfortable (Regan & Steeves, 2010; Steeves & Regan, 2014), and observation within those spaces could compromise this important developmental exploration. 
For example, Selfridge and Mitchell (2020) found that youth living on or close to the street experimented on social media with expressing responses to the death of a friend or family member (particularly in terms of demonstrating grief or hope), with navigating difficult relationships, and with supporting each other. A third issue involves youth’s somewhat limited understanding of the role of researchers and the implications of being observed, especially in settings such as online discussions where researchers somewhat disappear into the background. In analyzing three co-research settings with youth, Collier (2019) found that the presence of researchers became normalized, that “children did not appear to imagine audiences beyond me” (p. 48), and that “negotiating ongoing consent was tricky, primarily because they did not see the need for it” (p. 51). This process of normalization is also likely in studies such as Hung’s (2020), in which the author was invited to join a group of 16- to 18-year-old boys on Xbox Live’s party chat as they discussed a range of issues, such as abortion rights, as well as “intimate and personal issues, such as romantic relationships, quarrels with parents, and expectations for the future” (p. 600).[1] Finally, there is the risk of compromised anonymity and confidentiality for individuals whose online information is harvested, with or without explicit consent. If there is the possibility that youth can be reidentified, then that digital record could affect psychological and social development, future life options (Burkell & Regan, 2018; Regan & Steeves, 2010), and even personal safety. These traditional research ethics concerns address some but not all of the relevant social justice issues, and there are some differences between the two sets of concerns. 
Hoffman and Jonas (2017) argue that “justice occupies a precarious position in the history of research policy and ethics …[and] has received comparatively less explicit attention than other values, especially informed consent and beneficence” (p. 4). They point out that, oftentimes, social justice concerns in the research context are framed in terms of the selection of subjects, distribution of risks and benefits, and inclusion of vulnerable or marginal groups (p. 5). Research accessing online discussions broadens representation and inclusion of hard-to-reach or otherwise invisible subjects, and may be able partially to correct for a “lack of attention to the needs and issues of populations currently marginalized in society” (Fassinger & Morrow, 2013, p. 69). In analyzing research involving online disability group interactions, Trevisan and Reilly (2014) found that the analysis of Facebook content was instrumental in providing “disabled users with a lens to interpret the effects of policy measures and participate in relevant conversations,” that sharing experiences was “a fundamental step in the creation of group identity and collective agency” (p. 1135), and that the omission of discussions on semi-public Facebook pages from the research record “would have in fact equated to the ‘silencing’ of disabled people’s voices” (p. 1137). Lyons et al. (2013), borrowing from earlier work by Crethar et al. (2008), argue that research can contribute to social justice when it promotes the principles of equity, access, participation, and harmony for culturally diverse populations, and the use of online social media data for research purposes can meet these considerations. To some extent, online discussion groups may select populations “because they are available, are in compromised positions or are manipulable,” but at the same time selection is “for reasons directly related to the problem being studied” (Pieper & Thomson, 2013, p. 102). 
Research ethics requirements for notice and consent support participant choice with respect to data harvesting. At the same time, however, if participants are aware that data are being harvested, this awareness can limit participation in the research or in the online contexts from which discussions are harvested. Researchers must pay careful attention to the autonomy of the participants whose data they wish to access so that they do not become “willing participants – even instigators – in reinforcing posthumanist systems of surveillance on populations we wish to support or observe” (Luka et al., 2017, pp. 4-5). In order that research using social media data can contribute to the social justice aim of representation while at the same time protecting participant rights, researchers should not engage in helicopter science, but instead be sensitive to the culture of the community, enter the community with an open and inquiring attitude, and allow the research direction to emerge from the concerns of participants (Fassinger & Morrow, 2013, pp. 72-74). This requires a reflexive perspective that flexibly responds to the situation with attention to both research questions and the larger cultural context (Luka et al., 2017, p. 30). Again, however, this prolonged and more intense engagement with research participants will increase the intrusiveness of the research and thus its impact on the social ecology of the online environment. 
[1] Hung (2020) became acquainted with members of the group while working on a curriculum project at their high school (p. 599). He notes, “I received IRB approval from my research institution and was given the participants’ consent to record their XBL chats” (p. 600) but does not provide further details.
From a social justice position, an ethical framing of online research is not simply a private matter between the researcher and the subject monitored by a research ethics board, but necessitates a nuanced understanding of the users’ expectations of the online site and their experiences on the site, as well as the broader social and political context of the site and its purpose.
Research Ethics Guidance
As research began to incorporate data from online discussion groups, research ethics boards (REBs) were confronted with the questions of whether these projects required ethical review, and if so whether they should receive ethics approval. Not surprisingly, REBs at different institutions came to varying conclusions for seemingly similar projects. National policy bodies responsible for ethical research standards began to review and revise their guidelines and requirements to account for Internet research generally and particularly research involving online discussion groups. Other guidelines emerged directly from the research community. The Association of Internet Researchers (AOIR), for example, issued guidelines in 2002 (Ess & Association of Internet Researchers Ethics Working Committee, 2002), and provided updates and expansions in 2012 (Markham & Buchanan, 2012) and 2019 (Franzke et al., 2019). 
The AOIR guidelines identify the following six fundamental ethical guidelines:
• The greater the vulnerability of the community and participants, the greater the obligation of the researcher to protect the community and participants;
• Because harm is determined based on the context, ethical decision making requires practical judgment attentive to the specific context;
• All digital information involves individual persons even if that is not immediately apparent;
• The rights of subjects must be balanced with the social benefits of the research;
• Ethical issues arise in all stages of the research process; and
• Ethical decision-making is a deliberative process requiring consultation with many people and resources.
(paraphrased from Markham & Buchanan, 2012, pp. 3-4).
These guidelines raise key questions that researchers and ethics boards should consider when determining the ethicality of online research. They also address broader social justice considerations. REBs are informed by ethical guidelines in making their decisions; these guidelines, however, are not prescriptive but instead provide an interpretive framework for temporally and geographically situated individual ethical decisions. Social justice considerations, especially those informed by virtue ethics and feminist ethics of care, “stress context and situation rather than abstract principles, and dialogue and negotiation rather than rules and autonomy” (Edwards & Mauthner, 2002, p. 20). Thus, different REBs can reach different decisions regarding the ethicality of research when they are working under different ethics guidelines, and even when working under the same guidelines if local interpretations of the guidelines differ. Despite the possibilities for different interpretations, all ethics guidelines recognize that vulnerable populations should be approached with special care. 
Our current focus on online discussion groups involves a number of generally recognized vulnerabilities including those related to minors, politically or socially sensitive subjects, women, groups with special needs, illnesses, or emotional states (Franzke et al., 2019, p. 17), which heightens the importance of social justice considerations during all stages of the research process. As Luka et al. (2017) point out, with an ethics of care perspective, “being deeply aware of our own identity and agency is critical to being able to understand marginalized subjects without romanticizing or appropriating their experiences” (p. 31). A similar concern, discussed above as a concern about helicopter research, is the responsibility of researchers to avoid “inserting themselves into vulnerable communities to collect data, and abruptly leaving without returning findings to, or meaningfully impacting the lives of intended and unintended participants” (Guishard et al., 2018, p. 9).
Research ethics guidelines such as those put forward by the Association of Internet Researchers (Franzke et al., 2019) offer insight into specific characteristics that signal a private, rather than public, online discussion. These include:
• a closed discussion group that requires membership to join the discussion;
• a sensitive topic of discussion;
• terms of reference or privacy policies that limit research use of the data, or that specifically state the content will not be used for purposes beyond the immediate interaction or discussion.
Circumstances that are seen to militate against an expectation of privacy include:
• terms of reference or privacy policies that explicitly allow the use of the data for research purposes;
• open discussion groups that do not require membership to join;
• searchable archives of the discussions, particularly if these are available on the open web.
The nature of public and private contexts is increasingly contested in a number of domains, and the questions arising in online discussion groups are not unique. Spicker (2011) points out that disputes about the ethics and social justice of covert research (i.e., research that is not declared to participants) focus on whether the circumstances are truly public, questioning the assumption that “doing things in a public place is public because it is visible” (p. 125). Instead, Spicker argues, “what defines something as public is not the geographical location where it happens, but the nature of the act” (2011, p. 125).
One important research ethics consideration is participant anonymity. Ethics guidelines highlight the need to ensure that participants cannot be identified. In the context of research use of online social media data, these guidelines are often interpreted to require anonymization of pseudonyms, in recognition of the facts that enduring pseudonyms can be meaningful identities in their own right and, when attached to a significant amount of personal information including extended communications, have the potential to lead to real-world identification. Another concern, less often addressed in research ethics considerations, is that online searches could locate material quoted from online discussion groups, and thus identify the source (both where the material appeared, and who produced the material), thereby compromising participant anonymity.
In order to provide a more concrete understanding of how these ethical issues and social justice considerations arise in different research projects and how researchers account for these issues, in the next section we explore several examples. We selected these as illustrative of the types of research that rely on harvested social media data, and we focused on similar examples that provide contrasting approaches to research ethics and social justice considerations. 
All studies, by virtue of their publication, offer voice to participants, and thus meet at least minimally the social justice goal of representation. The studies differ, however, in how other social justice and research ethics considerations were addressed. All received ethics approval, although the details of such approval are vague given article length restrictions. This discussion is not intended to reflect on the validity of that approval or the ethical conduct of the research or researchers. Our goal here is to highlight the types of research using online discussion groups as data that have been considered to satisfy ethical and social justice guidelines (often at different points in time), and to review these decisions in light of current guidelines and practices.
Examples
Siriaraya et al. (2011) examined empathetic communication in one online discussion group: Teenhelp.org. Their study is in many ways typical of research involving online discussion group interactions. They analyzed posts to the public and anonymous moderated discussion group, accessible without a login requirement. Posts were archived in a searchable database, and the researchers downloaded messages from the earliest threads in the archived discussion for analysis. Although they do not identify the dates for these interactions, this suggests that these posts occurred in the relatively distant past, since archiving would generally be reserved for older and inactive discussions. In order to protect the anonymity of those whose postings were analyzed, usernames were not reported. Moreover, the published research did not use direct quotes from individual posts, but instead reported the frequency of various coded characteristics (e.g., empathetic communication) in the downloaded posts. Subrahmanyam et al. 
(2004) conducted research involving online discussion groups taking place much earlier than those examined by Siriaraya et al. (2011). It is also possible that Subrahmanyam et al. (2004) collected their data before the AoIR released their 2002 guidelines, and as a result their work could not have been informed by those guidelines. These researchers analyzed a 30-minute transcript from a moderated teen chat room to gain insight into the processes through which sexuality and identity are constructed. The researchers used a participant-observer approach to record chat room interactions. One researcher gained access to the group through an Internet provider, following a process identical to that recommended to parents who were providing access to the group for their children; thus, it appears that some type of membership or sign-on was required to access the group. The group was moderated. No mention is made of seeking consent from the participants, nor did the researchers approach the moderator(s) to obtain approval for data collection. As required by the local REB, the researcher collecting the data did not contribute to the online discussion, nor did she respond to any messages directed to her. Also, as required by the REB, user names were replaced by pseudonyms in the transcript used for analysis, and in reports of the data. The analysis was qualitative, and the published results reproduced individual statements and extended interactions between two or more discussion group participants. In neither example were participants informed about the research, thus there was no risk that the research disturbed the value of the online communities for participants. In terms of public versus private space, the research conducted by Siriaraya et al. (2011) presents fewer concerns, as the data analyzed were drawn from a publicly available historical archive. Subrahmanyam et al. 
(2004), by contrast, had a much better opportunity to gain a contextualized understanding of the participant discussions by virtue of the researcher’s extended engagement in the discussion group, and the collection of data in real time as opposed to being downloaded from archives. At the same time, these advantages with respect to a fuller understanding seem to be disadvantages from the perspective of participant autonomy, because the research process involved greater (albeit covert) incursion into the online social environment. This is especially concerning given that access to the group required a sign-on of some description, which would tend to increase participant privacy expectations. Both studies protected participants from the risk of re-identification through substitution of user names with pseudonyms. The analysis conducted by Siriaraya et al. (2011) offers further protection in this respect because no direct quotes were included in the report. The direct quotes reported by Subrahmanyam et al. (2004) seem to present a greater risk of re-identification, and thus a greater threat to participant anonymity and confidentiality; however, if the discussions were not archived (as the research report suggests, but does not confirm), then this risk can be discounted. Covert observation of online interaction without participant consent or even notice to participants would seem in many cases to undermine autonomy. As indicated above, the practice is allowed under research ethics guidelines if it is deemed that the venue is public in nature and the observation and report do not present any threat to participant anonymity. Conducting research under these conditions potentially increases the value of the research by minimizing artificiality and reactivity (Calvey, 2008). 
If participants are not aware that they are being observed, such observation cannot disturb the value of online social interactions, nor restrict participation or speech, unless the covert researcher participates in those interactions, a practice which raises other ethical issues (see e.g., Brotsky & Giles, 2007). Some data suggest that young people may easily become accustomed to observation by researchers, and express limited interest in negotiating ongoing consent once they have agreed to being observed (see e.g., Collier, 2019). If such accommodation is easily achieved, then even overt research observation might only minimally disturb the social environment. However, there are practical difficulties with obtaining consent. In online discussion groups, users are typically identified by pseudonym and no other identifying information is available. In these circumstances it is difficult to contact participants off-discussion for consent. If the analysis uses archived discussion records, the individuals included in the discussion may not be participants at the time the research is conducted, so the researchers might not be able to reach them for consent. Additionally, the nature of discussion is interactive, and participants often quote one another in responses, making it difficult or impossible to collect data if consent is received from some but not all participants. As a result of these considerations, researchers using online discussion groups as data rarely seek consent from participants. A study by Høybye et al. (2005) demonstrates just how intrusive the process of seeking informed consent can be. Their research involved participant observation of a small online cancer support group. Existing and incoming members were approached for consent to collect data, and participation in the group was restricted to those who consented. 
Fewer than half of the original members consented to the research, and many potential new members were also unable to participate because they would not consent to the research. In this case, participants had full autonomy with respect to the research use of their online discussions. There are downsides to this approach, however, including the concern that participants may change their behaviour in response to the knowledge that they are being observed, the potential for coercion affecting participants who wanted to join the group after the research was started, and the reality that the nature and conduct of the research effectively limited participation in the group for the duration of the research to those who consented, thus excluding others from any benefits associated with participation in a group that had, until the research began, imposed no consent requirements on participants. A potentially less intrusive approach is to seek consent from group moderators rather than individual participants. Evans et al. (2012) used this approach in a study examining social support available from archived online discussion groups. Various moderated discussion groups were considered as sites of data collection. Written permission to collect the data was sought from group moderators. In this case, moderators were viewed as community gatekeepers, and by gaining their consent the researcher achieved a measure of access to and participation from the community consistent with social justice claims (Lyons et al., 2013). While this approach goes some way to respecting the autonomy of participants with respect to the research use of their data, it effectively substitutes one decision proxy (the moderator) for another (the researcher/REB; although see Pullman, 2002, for an argument in support of REB proxy consent). Another approach is to establish whether participants similar to those being observed would in theory consent to the use of their data for research purposes. 
Stevens et al. (2015) consulted about their planned research with members of a different group focused on similar issues rather than seeking consent from the group they were studying. Seeking input on research use of data from a similarly situated community (Lyons et al., 2013) is a creative way of ensuring a measure of autonomy while also allowing research participation from a potentially marginalized community, assuming the parallel community endorses it. Another approach is to seek consent only for the use of potentially identifiable data in publications arising from the research (e.g., quotations), rather than seeking consent for data collection (see e.g., Mulveen & Hepworth, 2006). Whether or not consent is sought from participants for the harvesting of data, it is important from a research ethics and social justice perspective that participant anonymity and confidentiality be protected. A variety of strategies should be used to protect data sources (i.e., individuals and the groups in which they are participating). These include providing pseudonyms for participant user names, and not providing specific information that identifies the group from which the data were harvested. Further protection for individual participants can be offered by eliminating all attributions for quotes, including pseudonyms (Brotsky & Giles, 2007). These measures may be sufficient if interactions are not available in online archives, but additional protection can be provided through a process of “fabrication, involving creative, bricolage-style transfiguration of original data into composite accounts or representational interactions” (Markham, 2012, p. 334). Where data are collected directly from online archives, researchers should consider conducting searches for quoted material to ensure that quotations cannot be traced back to the website and from there to individual participants.
Mulveen and Hepworth (2006) employed this technique by entering random phrases taken from discussion on the site into standard search queries, ensuring that the website was not among the returned results (see also Stevens et al., 2015).

Best Practices and Further Considerations

A scan of published studies that use online discussion groups as sources of data reveals a range of practices designed to preserve participant autonomy in the absence of full informed consent, and also allow representation of often marginalized voices. These include:

• examination of the group terms and conditions, and any privacy policy associated with the group, to ensure that these do not suggest that research use of data is restricted or precluded;
• limiting observation to those groups that do not require membership (i.e., groups that are open to the public);
• removal of identifying information, including usernames and user IDs, from published reports;
• seeking approval from discussion group moderators for data collection;
• canvassing participants in other online discussion groups addressing the same issue to determine if they would be comfortable having their own discussions mined for research purposes;
• seeking post-hoc consent from participants who would be directly quoted in research reports;
• conducting searches to ensure that quoted material cannot be linked back to a specific discussion group through search results.

The social justice consideration of representing more groups and authentic voices in research and doing so with respect for the value of the group is a legitimate argument for undertaking research involving youth in online discussion groups, if the research actually benefits the group being studied. Researchers should understand that there are costs to this type of research, and that the individuals and groups, who are often marginalized, bear these costs.
These costs include undermining the social ecosystem and value of the online communities to individuals and communities as a whole, loss of autonomy for group members, and risk of compromised privacy. These costs must be mitigated to the greatest extent possible, and weighed against benefits to determine whether research should be carried out. Our analysis of the literature on research ethics and social justice, as well as of research involving online discussion groups, indicates that several aspects of using data from online discussion groups, especially those involving youth, need more thoughtful discussion and guidance. We discuss four of these below. First, researchers should respect the communities they are researching. There should be no helicopter research but instead engagement with the community and attention to the value of the research for the community. The research should not merely add to knowledge about the group researched, or to understanding online communities more generally, but should contribute to the greater good of the community being studied or the interests of that community, and these interests should be explicitly noted on an ethics application. Additionally, researchers should take active responsibility for protecting the integrity of the community by minimizing intrusiveness. Researchers should not participate in community discussions unless they are actually a member of the community and the community is cognizant of the research activities. Covert research should only be considered if the research is contributing to the community in an explicit way and if an ethics board is convinced that both individual privacy and the integrity of the community will not be compromised. If research is overt, then the researcher must anticipate and minimize the impact on the community.
For example, if the researcher seeks consent, what happens to those who do not consent? Are they precluded from participating in the online discussion? If that is the case, the value of the research results is diminished and the costs to the individuals and groups are increased. The question of consent and protection of autonomy with respect to access to online discussions is a second issue in need of more thoughtful discussion and guidance. Researchers should first examine the online group’s terms of service or any other expressions of community values that might indicate the space or interactions are private, and respect these. However, these are often silent on academic research or vague and confusing. As we discuss above, securing direct and a priori consent from all participants is difficult, especially as some members may be hard to find, and the process of seeking it may undermine community and participation. The appropriateness of moderator approval also needs to be clarified. In some cases, moderators may indeed know group members well enough to speak on their behalf, but in other cases, especially where participation in the group is fluid, moderators may not have sufficient insight to give what might approximate informed consent. Consent from moderators should not be a one-time action, but should involve moderators as active members of the research team with an ongoing responsibility to revoke approval of the research or to point out the limits of this form of approval. A third issue in need of clarification in using data from online discussion groups involves determining participants’ likely expectation of privacy. In making this determination, intuitions and reasoning based on the understanding of public versus private in the offline world do not always translate well or directly to the online environment.
In the offline world, people traditionally retain a degree of anonymity or at least a degree of practical obscurity as they traverse and interact in public spaces. In the online environment, the public/private distinction is blurred. We know that participants can have an impression of privacy online, which would likely apply to online discussion groups in which participants assume they are opening themselves to a sympathetic community. This would certainly seem to indicate that they did not expect that their comments and interactions would be used for research purposes. Protecting privacy in an online research environment is more difficult because of the increasing ease of re-identifiability, since an internet search could turn up the source of a specific quote if the discussion group is archived and searchable on the web. With the amount of available data and the analytical tools associated with big data, it is increasingly difficult to de-identify data in any meaningful way. Four steps seem to mitigate the risks to privacy and re-identification: use data in aggregate rather than quotes; provide pseudonyms for user names; if quotes are to be used, especially if harvesting from an online archive, carry out searches to ensure that the source cannot be identified; and consider a bricolage-style qualitative report that brings together multiple quotes in an overall picture (Markham, 2012). Finally, the use of archived group materials needs clarification. If researchers seek to download and analyze the archived content of members-only online discussion groups then it is feasible to ask for consent from members for use of the data. Members can see what they have said, evaluate whether they are willing to have that information used for research purposes, and give meaningful informed consent.
If some members refuse, the removal of their data would change the nature of the sample but still allow the research to be conducted as long as direct statements of those who refuse, and references to those statements, are removed. If many members refuse, then it is obvious that the research should not proceed. If researchers seek to download and analyze archived content of open online discussion groups that are searchable on the open web then there may be a general perception that the discussion is public, and as a result participant anonymity and confidentiality are moot. In this case, REBs will often allow the research to go forward or require moderator approval. However, we believe that these practices may not be ethically appropriate, as the individuals in the group likely do not have an expectation that the discussions have been archived and are searchable. The current discussions about the right to be forgotten would come into play here, and to be consistent with the spirit of that right REBs should look at these cases more critically than they have in the past, and require additional protections so that members of the group are not easily identified, that pseudonyms are used, and that other measures are taken consistent with respect for persons and beneficence.

Conclusion

The use of online social media data produced by youth offers tremendous potential for insight, and researchers must be careful to attend to social justice issues as well as research ethics considerations in using these data for research purposes. In examining the research use of such data, two social justice issues arise, often in tension: the issue of representation, and the issue of respect for participant rights and wellbeing. Researchers must be careful to balance these considerations in designing and conducting online research.
When all is said and done, our final impressions are threefold: first, the use of these data can offer voice, in the research context, to populations that are typically not represented, and about issues that are not typically discussed in the research context; second, the use of data from online discussion groups without consent can undercut the respect for participant autonomy that is fundamental to research ethics; and third, for many participants, online discussion groups are important spaces for protected and valuable social interaction that may be disturbed by the presence of outsiders, including researchers. It is the responsibility of researchers to balance these considerations, working to offer voice to marginalized communities while at the same time ensuring respect for participant communities, so that no harm comes to those communities, and that the cultural integrity of the communities is preserved. With respect to the issue of harm to the community, researchers must balance the push toward engagement and overtness in research (including securing informed consent) that is implicit in research ethics guidelines with the social justice consideration of the cost of interfering with the social ecosystem of the online community under study. In navigating online research where boundaries between public and private may seem to be blurred, we agree with Guishard et al. (2018) that a stronger connection to social justice considerations is needed to replace the largely individualistic focus on rights, benefits and harms, which does not sufficiently consider cultural, social and structural contexts or the integrity of on-line communities (pp. 16-24).

References

Berriman, L., & Thomson, R. (2015). Spectacles of intimacy? Mapping the moral landscape of teenage social media. Journal of Youth Studies, 18(5), 583-597.
Bowe, B. J., & Blom, R. (2010). Facilitating dissent: The ethical implications of political organizing via social media. Politics, Culture & Socialization, 1(4), 323-336.
Bowe, B. J., & Blom, R. (2011). Cosmopolitanism and suppression of cyber-dissent in the Caucasus: Obstacles and opportunities for social media and the web. Journal of Media Sociology, 3(1-4), 5-19.
boyd, d., & Marwick, A. (2011). How teens understand privacy [Unpublished manuscript]. http://www.danah.org/papers/2011/SocialPrivacyPLSC-Draft.pdf
Brotsky, S. R., & Giles, D. (2007). Inside the “pro-ana” community: A covert online participant observation. Eating Disorders, 15(2), 93-109.
Burkell, J., & Regan, P. M. (2018). The right to be forgotten and youth: Philosophical and psychological contexts. In J. Packer (Ed.), The Canadian human rights yearbook (pp. 203-216). Human Rights Research and Education Center & University of Ottawa Press.
Calvey, D. (2008). The art and politics of covert research: Doing ‘situated ethics’ in the field. Sociology, 42(5), 905-918.
Collier, D. R. (2019). Re-imagining research partnerships: Thinking through “co-research” and ethical practice with children and youth. Studies in Social Justice, 13(1), 40-58.
Costa, L., Voronka, J., Landry, D., Reid, J., McFarlane, B., Reville, D., & Church, K. (2012). Recovering our stories: A small act of resistance. Studies in Social Justice, 6(1), 85-101.
Crethar, H. C., Rivera, E. T., & Nash, S. (2008). In search of common threads: Linking multicultural, feminist, and social justice counseling paradigms. Journal of Counseling and Development, 86, 269-278.
De Ridder, S., & Van Bauwel, S. (2015). The discursive construction of gay teenagers in times of mediatization: Youths’ reflections on intimate storytelling, queer shame and realness in popular social media places. Journal of Youth Studies, 18(6), 777-793.
Edwards, R., & Mauthner, M. (2002). Ethics and feminist research: Theory and practice. In M. Mauthner, M. Birch, J. Jessop & T. Miller (Eds.), Ethics in qualitative research (pp. 14-31). Sage.
Ess, C., & Association of Internet Researchers Ethics Working Committee. (2002). Ethical decision-making and Internet research: Recommendations from the AoIR ethics working committee. Association of Internet Researchers. http://aoir.org/reports/ethics.pdf
Evans, M., Donelle, L., & Hume-Loveland, L. (2012). Social support and online postpartum depression discussion groups: A content analysis. Patient Education and Counseling, 87(3), 405-410.
Evans, T. (2018, June 4). Helicopter science. Lateral Magazine, 27. http://www.lateralmag.com/articles/issue-27-helicopter-science
Fassinger, R., & Morrow, S. L. (2013). Toward best practices in quantitative, qualitative, and mixed method research: A social justice perspective. Journal for Social Action in Counseling and Psychology, 5(2), 69-83.
Franzke, A. S., Bechmann, A., Zimmer, M., Ess, C. M., & Association of Internet Researchers. (2019). Internet research: Ethical guidelines 3.0: Association of Internet Researchers. https://aoir.org/reports/ethics3.pdf
Guishard, M. A., Halkovic, A., Galletta, A., & Li, P. (2018). Toward epistemological ethics: Centering communities and social justice in qualitative research. Forum: Qualitative Social Research, 19(3), 1-24.
Health Canada (2019). Requirements for informed consent documents. https://www.canada.ca/en/health-canada/services/science-research/science-advice-decision-making/research-ethics-board/requirements-informed-consent-documents.html
Hoffman, A. L., & Jonas, A. (2017). Recasting justice for internet and online industry research ethics. In M. Zimmer & K. Kinder-Kurlanda (Eds.), Internet research ethics for the social age: New challenges, cases, and contexts (pp. 3-18). Peter Lang.
Høybye, M. T., Johansen, C., & Tjørnhøj‐Thomsen, T. (2005). Online interaction. Effects of storytelling in an internet breast cancer support group. Psycho‐Oncology, 14(3), 211-220.
Hung, A. C. Y. (2020). Political socialization on Xbox Live: A sociocultural linguistic approach to adolescent identity. Journal of Youth Studies, 23(5), 596-612.
Jowett, A. (2015). A case for using online discussion forums in critical psychological research. Qualitative Research in Psychology, 12(3), 287-297.
Luka, M. E., Millette, M., & Wallace, J. (2017). A feminist perspective on ethical digital methods. In M. Zimmer & K. Kinder-Kurlanda (Eds.), Internet research ethics for the social age: New challenges, cases, and contexts (pp. 21-36). Peter Lang.
Lyons, H. Z., Bike, D. H., Ojeda, L., Johnson, A., Rosales, R., & Flores, L. Y. (2013). Qualitative research as social justice practice with culturally diverse populations. Journal for Social Action in Counseling and Psychology, 5(2), 10-25.
Markham, A. (2012). Fabrication as ethical practice: Qualitative inquiry in ambiguous internet contexts. Information, Communication & Society, 15(3), 334-353.
Markham, A., & Buchanan, E. (2012). Ethical decision-making and internet research: Recommendations from the AoIR Ethics Research Committee (Version 2.0). http://aoir.org/reports/ethics2.pdf
Middaugh, E., Clark, L. S., & Ballard, P. J. (2017). Digital media, participatory politics, and positive youth development. Pediatrics, 140(Supplement 2), S127-S131.
Minansy, B., & Fiantis, D. (2018, Aug. 29). ‘Helicopter research’: Who benefits from international studies in Indonesia. The Conversation. https://theconversation.com/helicopter-research-who-benefits-from-international-studies-in-indonesia-102165
Morrow, M., & Weisser, J. (2012). Towards a social justice framework of mental health recovery. Studies in Social Justice, 6(1), 27-43.
Mulveen, R., & Hepworth, J. (2006). An interpretative phenomenological analysis of participation in a pro-anorexia internet site and its relationship with disordered eating. Journal of Health Psychology, 11(2), 283-296.
Pieper, I., & Thomson, C. J. H. (2013). Justice in human research ethics: A conceptual and practical guide. Monash Bioethics Review, 31(1), 99-116.
Pullman, D. (2002). Conflicting interests, social justice and proxy consent to research. The Journal of Medicine and Philosophy, 27(5), 523-545.
Regan, P. M., & Steeves, V. (2010). Kids r us: Online social networking and the potential for empowerment. Surveillance and Society, 8(2), 151-165.
Selfridge, M., & Mitchell, L. M. (2020). Social media as moral laboratory: street involved youth, death and grief. Journal of Youth Studies. DOI: 10.1080/13676261.2020.1746758
Siriaraya, P., Tang, C., Ang, C. S., Pfeil, U., & Zaphiris, P. (2011). A comparison of empathetic communication pattern for teenagers and older people in online support communities. Behaviour & Information Technology, 30(5), 617-628.
Spicker, P. (2011). Ethical covert research. Sociology, 45(1), 118-133.
Steeves, V., & Regan, P. (2014). Young people and the social value of privacy. Journal of Information, Communication and Ethics in Society, 12(4), 298-313.
Stevens, G., O’Donnell, V. L., & Williams, L. (2015). Public domain or private data? Developing an ethical approach to social media research in an inter-disciplinary project. Educational Research and Evaluation, 21(2), 154-167.
Subrahmanyam, K., Greenfield, P. M., & Tynes, B. (2004). Constructing sexuality and identity in an online teen chat room. Journal of Applied Developmental Psychology, 25(6), 651-666.
Taylor, C. G. (2008). Counterproductive effects of parental consent in research involving LGBTTIQ youth: International research ethics and a study of a transgender and Two-Spirit community in Canada. Journal of LGBT Youth, 5(3), 34-56.
Trevisan, F., & Reilly, P. (2014). Ethical dilemmas in researching sensitive issues online: Lessons from the study of British disability dissent networks. Information, Communication & Society, 17(9), 1131-1146.
Tufekci, Z., & Wilson, C. (2012). Social media and the decision to participate in political protest: Observations from Tahrir Square. Journal of Communication, 62(2), 363-379.
Yang, K. W. (2007). Organizing MySpace: Youth walkouts, pleasure, politics, and new media. Educational Foundations, 21, 9-28.