1 Introduction

Political attention towards science communication, public engagement, citizen science, or open science has increased tremendously over the past decades. These labels symbolize a sociopolitical program that pursues the opening of science to society [ Weingart, Joubert & Connoway, 2021 ]. The underlying assumption is that it is important to give non-scientific actors broad access to scientific findings and insights into scientific knowledge production to inform social and political decision-making and to influence scientific agenda setting but also to generate interest in science to legitimize its public funding [cf. Bauer, 2017 ]. As both a condition and a consequence of these developments, an increasing scientism, i.e. an increasing orientation and reference towards science in society has been observed [ Collins, 2014 ]. Kahan, Scheufele and Hall Jamieson [ 2017 ] refer to “science in context” to describe the increasing penetration of public discourse with scientific content in which science (communication) has turned into a part of everyday life. This can be illustrated by different developments. For example, Albæk, Christiansen and Togeby [ 2003 ] and Summ and Volpers [ 2016 ] show that journalistic reporting increasingly refers to scientific topics and experts over time. Moreover, scientific organizations and scientists themselves use online and social media to engage with publics beyond the scientific community [ Entradas et al., 2020 ; Jünger & Fähnrich, 2020 ]. In addition, publicly visible actors such as NGOs or influencers relate to science to strengthen their argument and increase public attention to their goals [ Yearley, 2014 ]. Finally, user-generated communication related to science takes place on various social media platforms — Reddit’s science related forum r/science has over 29 million subscribers and can be considered the world’s largest science communication forum. 
Overall, the described developments and related changes in the science-society interface have been enabled and shaped by the digital transformation of public communication [ Friedland, Hove & Rojas, 2006 ; Friemel & Neuberger, 2021 ]. As a consequence, potentially anyone can now access scientific findings and work and contribute to conversations about science. Crowd formats such as Wikipedia or Plagiarism Wikis indicate that the boundaries between scientific and lay communication are becoming blurred [ Arroyo-Machado, Torres-Salinas, Herrera-Viedma & Romero-Frías, 2020 ]. Whereas science communication by users on social media platforms such as Reddit could be regarded as a strong trend toward the fulfillment of the political aspiration for the “democratization” of science [ Chilvers & Kearnes, 2020 ], there are also concerns regarding a shift in the societal order of knowledge [ Neuberger et al., 2021 ]. In particular, the waning influence of traditional intermediaries such as journalists has often been regarded as a threat to the quality of public communication, including science communication [ Fähnrich, Weitkamp & Kupper, 2023 ; Massarani, Entradas, Fernandes Neves & Bauer, 2021 ]. These concerns are rooted in normative ideas of democratic discourse and the functional role that is attributed to journalism, such as ensuring diversity of content, applying standardized routines and norms to assure discourse quality (e.g. the use of reliable sources), and presenting multiple perspectives and critique to enable qualified opinion formation [ Neuberger, 2018 ]. Against this backdrop, it has been doubted that user-generated content, i.e. published content produced by persons outside the realm of professional routines of journalism and/or science [ Naab & Sehl, 2016 ], can compensate for the decline of science journalism and professional science communication. 
However, there is little evidence on the actual nature of user communication in science-related discourse. It is thus an empirical question which patterns user-generated science communication exhibits, and to what extent it approaches normative discourse standards [ Jönsson & Örnebring, 2011 ].

The online platform Reddit is well suited to explore this question, because the forum offers interaction opportunities that illustrate the formation and spread of popular and controversial content [ Jasser, Garibay, Scheinert & Mantzaris, 2022 ]. r/science can be regarded as a well-established example of the politically fostered and digitally enabled “democratization” of science communication. Founded in 2005, Reddit allows users to post and link content in a broad range of thematic forums (so-called subreddits). While science is being discussed in many different subreddits (such as r/Covid19 for pandemic-related scientific publications), we will focus on Reddit’s r/science which has over 27 million subscribers and covers science more generally. On Reddit, posts can be commented on and rated by fellow users; content is moderated based on subreddit guidelines. Reddit is thus a social “micro-cosmos” that allows for some dynamics of user-generated science communication to be traced. Against this backdrop, we are particularly interested in the topics that are negotiated and receive special attention, in the sources that users refer to, and in the level of controversy that is observable in the discussions. In detail, our study aims at answering the following three research questions:

  • How does science communication on r/science develop over time?

  • From which sources does the content on r/science stem and how are these sources evaluated?

  • Which topics receive attention on r/science, and how much do different topics initiate debates?

To respond to these questions, the paper starts with a literature review that summarizes the limited state of the art on user-generated science communication. We will show that this research has focused on different aspects of online science communication but less on communication patterns on a more general level. Moreover, Reddit, although it attracts a large audience, has hardly been the focus of science communication research so far. We will therefore outline the specifics of our study in more detail. We will then describe our research design and methodology. Our empirical study draws on data comprising 694,147 Reddit posts from October 2007 to May 2018, which we explore by means of computational analyses (esp. topic modeling and content analysis). We will present our results and discuss the findings against the overall idea of the democratization of science, outline the limitations of our study, and hint at future research.

2 Literature review: discourse dynamics of science communication online

2.1 User-generated science communication

Science communication online has been broadly researched in recent years from very different perspectives [ Dudo, 2015 ; Peters, Dunwoody, Allgaier, Lo & Brossard, 2014 ]. From a rather normative perspective, questions of interactivity and participation have been at the core of science communication literature. Against this backdrop, research has discussed the participatory potential of science communication online with many options for direct interventions from the public [ Bucchi & Trench, 2016 ]. Direct online participation in this manner has been described as “communication between and among publics as well as the communication back and forth between experts and publics” [ Trench, 2008 , p. 128]. In the same perspective, frameworks have been developed to map public participation in the communication process [cf. Trench, 2008 ; for an overview Fähnrich, 2017 ]. For instance, the public participation model by Rowe and Frewer [ 2005 ] includes dialogue between ‘sponsors’ on the one side, e.g., a scientific organization or policy actors, and the public on the other side, in contrast to one-way communication reaching from sponsor to the public or consultation in the reverse direction. Another framework by Bucchi [ 2009 ] describes the mode of knowledge co-production, a democratic understanding of communicating scientific research in which non-experts, alongside experts, are also responsible for knowledge production. Research focusing on the popular Reddit format “Ask me Anything” (AMAs), where experts answer questions from Reddit users [ Hara, Abbazio & Perkins, 2019 ; Kloepper & Pyzdek, 2015 ], follows this focus on expert-user exchange and adds insights into scientists’ contributions to, and motivations for, the scientific discourse on Reddit. In contrast, the open culture of Reddit allows users without academic and professional backgrounds to share and discuss scientific results on their own and to challenge scientific findings or scientists on the platform.

Opportunities for further user engagement become evident in creating “subreddits”, which makes the platform a dynamic community with opportunities to develop in structure over time [e.g. P. Singer, Flöck, Meinhart, Zeitfogel & Strohmaier, 2014 ]. The science subreddit ( r/science ) offers a platform for participative and open science communication defined by an online environment, where scientists as well as other science communicators interact and therefore contribute to collective knowledge [cf. Metcalfe, Gascoigne, Medvecky & Nepote, 2022 ]. The structure of Reddit serves this purpose by providing community-based features such as liking, commenting and information linking [cf. Medvedev, Lambiotte & Delvenne, 2019 ].

User comments follow different dynamics leading to differences in community-building effects. On YouTube, user-generated content can be more popular than professionally created content [ Welbourne & Grant, 2016 ]. On Reddit, commenting by users who are strongly represented in discussion forums is evaluated as more controversial by others [ Matias, 2019a ]. There is also evidence that comments on r/science written by users new to the community are more likely to violate guidelines and to be removed by moderators [Matias, 2019a ].

However, despite research on AMA sessions, studies that refer to r/science are rare. Against this backdrop, we aim at broadening the perspective and focus on Reddit as a forum for user-generated science communication. We thus do not focus our research on single formats but r/science in general to explore broader discourse dynamics for user-generated science communication.

2.2 Diversity of sources and quality evaluation in science communication

The online environment is characterized by a high heterogeneity of actors contributing to the production of science communication content [e.g. Fähnrich, 2021 ; Guenther & Weingart, 2016 ]. Among those are universities and independent research institutes, media outlets and journalists, industrial and health organizations, or NGOs. Those actors produce different kinds of science-related content and have different ways to bring their messages across [ Davies & Horst, 2016 ]. Furthermore, science enthusiasts [ Henderson, 2013 ], hackers, artists, and designers place their messages online [ Horst, Davies & Irwin, 2016 ]. This diversity of science communication sources is also mirrored on Reddit which enables users to post and comment on scientific content based on various sources. However, compared to traditional journalistic media, social media platforms typically lack gatekeeping and review mechanisms which makes it difficult to control the quality of interactions [ J. B. Singer, 2013 ] and — in the context of Reddit — the quality of sources of the content that is linked. Consequently, online discourse is often assumed to show a lack of objectivity or accuracy [ Hansen, 2016 ; Holliman, 2004 ]. Science information online can threaten the credibility of public discourse [ Scheufele & Krause, 2019 ] — a threat that has fostered calls for quality promotion in recent years [e.g. Mannino et al., 2021 ]. r/science aims at countering quality concerns through strict guidelines 1 and the interventions of over 1,500 moderators 2 that enforce the subreddit rules [ Lynch, 2020 ]. In this manner, allowing or permitting content or announcing community rules, for example, to inform new users and prevent antisocial behavior can be observed as moderators’ main activities [Matias, 2019a ]. 
But still, the science communication provided by users on Reddit shows differences with regard to the quality and trustworthiness of sources as there is content from journalistic media, scientific organizations, and scientists but also Wikipedia content [ Moyer, Carson, Dye, Carson & Goldbaum, 2015 ] or content linked from other social media platforms that is questionable in terms of scientific accuracy. Little research has focused on the quality of sources linked by users on Reddit. For science blogs, there is some evidence that other blogs, but also traditional news media and academic sources are linked as references [ Walejko & Ksiazek, 2010 ]. In the context of public responses to Covid-19, Gozzi et al. [ 2020 ] investigated differences in media resources, indicating that users’ activity on Reddit concentrates on daily health and protective behavior, in contrast to media reporting evoking more interest in politics and economics. Furthermore, research suggests that overall user activity in different subreddits is dynamic and can be affected by less user activity [ Zhang, Keegan, Lv & Tan, 2021 ]. The question is how self-referential or broad (citing various sources) discussions on r/science can be characterized and which types of sources are applied the most.

A key aspect of conversations on Reddit is the up/downvote function; this function allows users to vote on both posts as well as comments and thus to contribute to a post’s/comment’s success. Research on public opinion formation suggests also for Reddit that general crowd effects become visible by voting and commenting [ Horne & Adali, 2017 ]. The popularity of a topic is measured according to a ranking of attention (comments), the popularity of topics (upvotes), and the controversy of posts (down-and-upvote ratio). Whether and how often users interact as submitters (providing new comments) or commenters (discussing a topic) and receiving “karma” (ratio of negative and positive user’s reactions) [ Peter, 2015 ] can also influence their position in the social network [ Buntain & Golbeck, 2014 ]. If a post is upvoted highly, which is shown by its score (the result of upvotes/downvotes), it is also likely that more new comments are generated, which suggests opinion leadership [ Leavitt & Clark, 2014 ]. This is especially the case among top-level commenters [ Kilgo et al., 2016 ]. The finding that only a very small number of users actively contribute content to the r/science community illustrates this relation [ Jones, Colusso, Reinecke & Hsieh, 2019 ]. As mentioned before, Reddit’s specific indicators show how popular sources are in the community which is linked to the quality evaluation of posts by users in terms of relevant criteria such as accessibility, relevance, or credibility [ Dohle, 2018 ; Lacy & Rosenstiel, 2015 ]. We, therefore, are interested in crowd effects concerning the popular feature of the upvotes-downvotes ratio on Reddit.

2.3 Content and content assessment in science communication

Content-wise, the plurality of platforms and actors also reflects the diversity of scientific research. Different aspects of science are made salient in media, e.g. about the consequences of climate change [ Bolsen, Palm & Kingsland, 2019 ] or biotechnology [ Marks, Kalaitzandonakes, Wilkins & Zakharova, 2007 ]. Controversial science issues are by no means exclusively discussed by experts [ Xu, Yu & Song, 2018 ]. Wang and Guo [ 2018 ] show, for example, that public discussion on Twitter played a leading role in framing genetically modified mosquitoes compared to journalistic online news media. Science platforms for citizen science often have a specific thematic focus (e.g. air pollution, water quality, biodiversity) [ Liu, Dörler, Heigl & Grossberndt, 2021 ]. In contrast, scientific discussions that are embedded in Reddit’s online posts are more diverse, for example by employing health information [ Bachl, 2016 ] or relating to chemistry or environmental science [e.g. Jones et al., 2019 ]. Content on r/science likely reflects current scientific controversies as moderation guidelines for submissions specify that linked scientific findings should not be older than six months (date of publication) [ Reddit, 2022 ].

Especially for science content, it can be assumed that scientific articles or statements by scientific institutions are linked to evoke interest and to underline scientific arguments for certain issues. Concerning specific topics on Reddit, few studies investigated their prevalence for science issues. The most searched health issues on Reddit can be summed up under discussions about diet and exercise and topics on mental health [ Record, Silberman, Santiago & Ham, 2018 ]. Moreover, it is assumed that Reddit provides different thematic aspects and contexts on science compared to other platforms, e.g. as it has been shown for microblogging on Twitter [ Büchi, 2017 ]. First findings of a study by Jones et al. [ 2019 ] that analyzed the topics most commonly discussed on r/science found medicine, technology, and biology as the most prominent disciplines. Furthermore, the most common fields of research were social sciences, health, and epidemiology [Jones et al., 2019 ]. Relating to the discourse of science communication quality [ Fähnrich et al., 2023 ], the coverage of different scientific fields and topics on Reddit should ideally be diverse to reflect the scientific discourse in society as a whole. In our research, we are therefore interested in investigating broad and narrow thematic patterns of r/science . Recently, an analysis of the AMA sessions on r/science showed that social cues differ between subject areas. For example provocative “trolling” is more represented in discussions on astronomy than on environmental science [ Tang, Abbazio, Hew & Hara, 2021 ]. Besides commenting, trends and controversies in discussions of scientific subfields become visible in Reddit’s upvote-downvote function that allows users to express their opinion about content. Yet, there is no research on levels of controversy of posts for different thematic fields on r/science . Moreover, evidence on the long term development is sparse.

3 Method

3.1 Data collection

To respond to our research questions we collected all r/science posts and metadata between October 2007 and May 2018 via the complete Reddit datasets [ Baumgartner, Zannettou, Keegan, Squire & Blackburn, 2020 ] that are available via the cloud storage and data analysis site Google BigQuery. Missing metadata 3 was collected via RedditAPI (Python PRAW library that allows access to Reddit’s API).

Overall, the dataset resulted in n = 694,147 posts with metadata (median score: 1; Ø score: 87.20; median number of comments: 0, Ø: 12.18). Across the whole data set (691,254 posts), most posts have no comment (56%) or one comment (21%).

We thus used a subset of data (“relevance”) that included posts with more than five comments ( n = 86,882 ). Reddit has a scale of controversy for comments which is computed from the proportion of upvotes and downvotes and the undefined number of general votes. Given that we did not have access to the complete number of votes per thread, we estimated controversy by the up-downvote-ratio and the number of comments. A second subset (“controversy”) further restricted the selection to posts with an up-downvote-ratio lower than 0.6 (1 = complete approval, 0 = complete refusal), resulting in n = 12,995 posts.
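The two subsets can be illustrated with a minimal sketch. The `Post` record, its field names, and the sample values are hypothetical and only mirror the thresholds stated above (more than five comments; up-downvote-ratio below 0.6); this is not the code used in the study.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative assumptions."""
    post_id: str
    num_comments: int
    upvote_ratio: float  # 1.0 = complete approval, 0.0 = complete refusal


def relevance_subset(posts, min_comments=5):
    """Keep posts with MORE than `min_comments` comments."""
    return [p for p in posts if p.num_comments > min_comments]


def controversy_subset(posts, min_comments=5, max_ratio=0.6):
    """Keep relevance posts whose up-downvote-ratio is strictly below `max_ratio`."""
    return [
        p for p in relevance_subset(posts, min_comments)
        if p.upvote_ratio < max_ratio
    ]


# Invented toy data, not taken from the dataset.
posts = [
    Post("a", 0, 1.00),   # no engagement
    Post("b", 12, 0.93),  # discussed, well received
    Post("c", 8, 0.48),   # discussed and controversial
    Post("d", 3, 0.40),   # low ratio but too few comments
    Post("e", 6, 0.60),   # exactly at the cutoff, so not controversial
]

relevant = relevance_subset(posts)
controversial = controversy_subset(posts)
```

Nesting the controversy filter inside the relevance filter reflects the reading that the second subset is a restriction of the first, which matches the reported subset sizes.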

3.2 Data analysis

We employed partially automated content analysis and topic modeling. Categories were developed with an inductive approach. For detecting the most common sources that were referred to in r/science , we standardized URLs at the domain level and derived brands. For example, different domains of Yahoo were coded as product-aggregate platforms and domains of media such as huffingtonpost.com as journalistic media. We then manually coded the top 320 brands; the average agreement across all categories and coders was 95% ( r α = .95 ). To investigate the most important topics on r/science we used topic modeling based on post titles. To apply categories, we inductively coded labels and disciplines from the terms mentioned in the headings of Reddit posts. For this purpose, we employed the R package spectral/STM [ Roberts, Stewart, Tingley & Airoldi, 2013 ; Roberts, Stewart & Airoldi, 2016 ].
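The domain-level standardization described above can be sketched as follows. The lookup tables are hypothetical stand-ins for the manually developed codebook, which is not reproduced here; only the Yahoo and huffingtonpost.com assignments follow the examples given in the text.

```python
from urllib.parse import urlparse

# Hypothetical lookups; the study's actual codebook is not reproduced here.
BRAND_BY_DOMAIN = {
    "news.yahoo.com": "yahoo",
    "finance.yahoo.com": "yahoo",
    "huffingtonpost.com": "huffingtonpost",
}
CATEGORY_BY_BRAND = {
    "yahoo": "product aggregate/platform",
    "huffingtonpost": "journalistic media",
}


def normalize_domain(url):
    """Reduce a URL to a lower-cased host without port or leading 'www.'."""
    if "://" not in url:
        url = "//" + url  # lets urlparse treat a bare 'host/path' as a network location
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def code_source(url):
    """Map a URL to a source category; unknown brands stay 'uncoded'."""
    domain = normalize_domain(url)
    brand = BRAND_BY_DOMAIN.get(domain, domain)
    return CATEGORY_BY_BRAND.get(brand, "uncoded")
```

Brands that are absent from the lookup are returned as "uncoded" rather than guessed, which keeps the manual coding step explicit.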

4 Results

4.1 Activity levels

We will first concentrate on general user activity on r/science (RQ1). Overall, there were 694,147 posts on r/science and 86,882 posts with more than five comments between October 2007 and May 2018.

There were in total 192,021 users contributing to posts on r/science . Apart from these registered users, 146,412 user accounts had been deleted. Like other studies on Reddit’s user activity [ Jones et al., 2019 ; Kilgo et al., 2016 ], we found that a small number of users shows a large amount of activity. Data shows that the top 100 users are responsible for 47,063 posts, which is 9.62% of all posts. The typical number of posts per user is moderate (median = 2.85 ).
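A concentration measure of this kind can be computed as the share of posts contributed by the k most active accounts. The sketch below is an illustration under assumptions: the author names are invented, and whether deleted accounts are excluded beforehand is a design decision the text does not specify.

```python
from collections import Counter


def top_k_share(authors, k):
    """Share of all posts written by the k most active authors.

    `authors` holds one author name per post. Returns 0.0 for an empty list.
    """
    if not authors:
        return 0.0
    counts = Counter(authors)
    top_posts = sum(n for _, n in counts.most_common(k))
    return top_posts / len(authors)


# Invented toy data: ten posts by four accounts.
authors = ["u1"] * 5 + ["u2"] * 3 + ["u3", "u4"]
share_top1 = top_k_share(authors, 1)
share_top2 = top_k_share(authors, 2)
```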

Figure 1 shows the distribution of posts on r/science in the given timeframe. In general, the activity on r/science has been increasing since the early years of Reddit from 2008 to 2012 with a peak in 2013 and 2014. The growth of posts since 2011 can be potentially explained by the closing of r/reddit.com which was a prominent subreddit for general content and has influenced the growth of other more thematic subreddits [ Olson, 2013 ].


Figure 1 : Chronological distribution of all posts on r/science between October 2007 and May 2018 ( n = 694,147 posts; subset: n = 86,882 ).

From 2015 to 2018 user activity has declined noticeably most likely due to new moderation guidelines that aimed at fostering more quality by making it harder to post content on the subreddit. One incident that might have promoted this development was that volunteer moderators of subreddits went on a strike for half a day. 4 The “Reddit blackout” became popular in the press for moderators demanding better conditions, which was answered by Reddit’s management with an attempt at cooperation [ Matias, 2015 ]. In 2015 Reddit widened its anti-harassment policy which resulted in the banning of inadequate subreddits [ The Associated Press, 2015 ]. Reacting to discussions about increasing censorship, r/science published transparency reports in 2016 to point to their moderation decisions [ Reddit, 2016 ].

To detect which posts are most discussed and therefore rated the most controversial, we included posts with an up-downvote-ratio of < 0.6 (1 = 100% agreement, 0 = complete disagreement) in a dataset ( n = 12,995 posts). The cutoff point was chosen as threads above 0.6 in our dataset usually have either a high number of upvotes and are very popular or little to no engagement. An up-downvote-ratio of 1 indicated that posts did not receive any attention and remained at the standard score of 1, which was common in the dataset. Notably, if there were more than five comments, a post on average was rated more positively than with fewer comments. We also used a separate dataset for analysis consisting of posts with more than 10 comments ( n = 4,783 posts). Overall, the up-downvote-ratio on r/science increased from 2009 to 2010, dropped slightly in 2012, and increased again from 2012 to 2018 ( n = 694,147 ).

4.2 Sources of content

For answering research question 2 we will outline sources that posted comments on r/science . The sources were coded by inductive category formation 5 for a subsample of 1,000 sources. For the subset with at least five comments per post ( n = 86,882 ) there is a pattern where homepages like self.science (so self-posts like AMAs) followed by phys.org and bbc.co.uk were mentioned most. As a next step, we analyzed the controversy of sources (defined as > 5 comments; ratio < 0.6 ). 6 Concerning the number of posts by sources, the mass media (3,637 posts), self-posts (1,730 posts), blogs (848 posts), and social media (778 posts) are most relevant (see Table 1 ). We found that posts citing the mass media were most common and held an up-downvote-ratio of .50 (Ø14.85 comments). Self-posts seem to be more discussed (Ø24.6 comments) and controversial, since the up-downvote-ratio is .46. Official websites appear in fairly few posts ( n = 144 ), but reach the attention of users (Ø24.35 comments), though they are not rated as overall controversial (.50) by users on average. In contrast, more controversy can be seen in posts relating to social media and product-aggregates, with both showing an up-downvote-ratio of .46 (see Table 1 ).
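Per-category figures of the kind reported here (number of posts, mean up-downvote-ratio, mean number of comments) can be derived with a simple grouping step. The following is a minimal sketch; the row layout and the sample values are assumptions for illustration and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_category(rows):
    """Aggregate (category, upvote_ratio, num_comments) rows per source category."""
    grouped = defaultdict(list)
    for category, ratio, n_comments in rows:
        grouped[category].append((ratio, n_comments))

    summary = {}
    for category, values in grouped.items():
        summary[category] = {
            "posts": len(values),
            "mean_ratio": round(mean(r for r, _ in values), 2),
            "mean_comments": round(mean(c for _, c in values), 2),
        }
    return summary


# Invented toy rows: (source category, up-downvote-ratio, number of comments).
rows = [
    ("mass media", 0.50, 10),
    ("mass media", 0.40, 20),
    ("self-post", 0.46, 24),
]
table = summarize_by_category(rows)
```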

Table 1 : Sources most referred to on r/science in the dataset (Subset > 5 comments; < 0.6 up-downvote-ratio; n = 12,995 posts).

4.3 Topics and disciplines

Furthermore, we were interested in the scientific topics and disciplines that are dominating the discourse on r/science (see RQ3). Overall, we identified 37 different topics for posts with five or more comments ( n = 86,821 ). 7 As Figure 2 shows, relevant scientific topics include health-related issues such as diseases and risk-related research. Another important topic that we extracted refers to environmental topics like energy and climate change and its conditions and consequences. Furthermore, gender-related discussions score high. Not surprisingly, self-referential content on Reddit is also of importance, most notably AMA sessions.

Topic-wise, we can sum up for RQ 3 that the discipline of cultural studies received the lowest up-downvote-ratio (.47), and thus can be considered the most controversial in the sample, also shown by a high amount of discussion and controversy referring to gender issues (see Figure 3 ). Also, the discipline of life sciences and health-related topics receive controversial ratings and are discussed regularly (.47 controversy, 18.19 av. comments).


Figure 2 : Scientific topics extracted from r/science from posts with five or more comments and their average upvote score ( n = 86,821 ).


Figure 3 : Controversy, frequency, and popularity of topics (Subset 2, > 5 comments, ratio < 0.6 ; n = 12,995 ). 7

5 Discussion

Changes in science communication practice demand investigations of patterns in user-generated communication online. The platform Reddit and its subsection r/science allow users to engage in public discourse about scientific content. Specific features like commenting and voting would allow for a distinct science communication agenda on r/science . According to long-term empirical evidence on science reporting, prominent disciplines featured in newspapers are medicine, environment, technology and biology [ Clark & Illman, 2006 ; Elmer, Badenschier & Wormer, 2008 ]. Our results indicate that posts on r/science follow a comparable agenda. It should be noted that our results have limitations concerning preconditions for data collection. The method of topic modeling, while viable in this context, can be imprecise, and it is possible that we missed smaller academic fields. Based on our findings we can say that the disciplines of natural and life sciences are important and, topic-wise, climate change and genetics as well as issues such as gender or health dominate the discussion on r/science . The popularity of the latter in particular could be rooted in news values such as sensationalism or personal concern; however, further investigation of Reddit users would be needed to better understand these topic and agenda setting dynamics. Moreover, systematic analysis of news platforms and Reddit would be needed to investigate further if Reddit has an “own agenda” or shows parallels to the representation of topics in science journalism.

We have found different characteristics of user-generated science communication that indicate a certain level of quality self-control, for example, evident by users’ commenting and voting actively on content. Moreover journalistic and scientific publications play a constant role as sources for posts on r/science . Prominent sources on Reddit that link to social media and self-posts are perceived as more controversial than those from mass media and science, although differences can be considered small. As other studies suggest [e.g. Gozzi et al., 2020 ], there is the presumption of an “information flow” between Reddit and traditional media sources or social media.

Compared to journalistic (online) media, participation on Reddit is user-centric and therefore validated after publishing [ Reddit, 2022 ], which holds the risk of providing inaccurate science-related information. Active users of Reddit may interpret scientific results by incorporating their own views, which can both be seen as a chance and risk for public discourse [e.g. Toepfl & Piwoni, 2015 ]. For r/science , we showed that over the last years users’ activity reveals variation in coverage which can be potentially explained by adopted moderation criteria. These targeted interventions also regulate the quality of submissions. Our data allow us to draw some limited conclusions about the use of (scientific) evidence, given that big media outlets and platforms are often consulted, whereas scientific institutions or journals receive moderate citations. Case studies of specific subreddits report that moderator interventions such as banning content or quarantining 8 subreddits show effects on user traffic or preventing antisocial behavior [ Chandrasekharan, Jhaver, Bruckman & Gilbert, 2020 ; Seering, Wang, Yoon & Kaufman, 2019 ].

Further research might detect how controversial science content could be classified and perceived by users, as well as conducting user surveys about expectations on moderator interventions. An open question is which characteristics of scientific topics are essential for users’ reactions, such as commenting and upvoting. It would be valuable to find out whether there are specific factors that influence users’ contribution to a discussion. In line with research about user comments [e.g. Kaiser, 2017 ; Ziegele, 2016 ] and message framing of online science communication [e.g. Wang & Guo, 2018 ], it could be possible that for example controversy or negativity and personalization drive interactivity on r/science .

Concerning crowd effects on r/science , the analysis of the content of comments would provide deeper insights on platform specific properties. As shown for the topic of climate change videos on YouTube, discussions might provide reference to scientific facts, but also expressing approval or disagreement [e.g. Shapiro & Park, 2015 ]. Also, the tone of scientific discussions differs between issues, as uncivil language is less common for health or environmental news than for politics or economics [ Coe, Kenski & Rains, 2014 ]. Similar patterns can be stated for discussions on Twitter on climate change issues, where related communication shows a small incidence of polarized opinions with evidence of incivility or sarcasm [e.g. Anderson & Huntington, 2017 ]. Considering that r/science exhibits considerably stricter commenting policies than the social platforms mentioned before, it is open to investigation whether polarising comments are as uncommon as expected on r/science .

There might also be differences language-wise concerning content. Hubner and Bond [ 2022 ] showed that experts use more complex language when sharing scientific information compared to non-experts. The interplay between message characteristics of posts or comments and the source of information can provide interesting insights into whether there are “opinion leaders” for specific topics. In addition, this can be connected to the question of what is crucial for the popularity of posts. Is it text or visuals of content and sources that predict access to discussions better, or are there specific sentiment features that are relevant in this context? Those factors can also be of relevance in terms of credibility and quality evaluation of Reddit posts.

Despite the limitations of our study and the further research needs outlined so far, we hope that this contribution can shed some light on user-generated science communication online. Communication modes on Reddit can be classified as dialogical, e.g., the AMA’s format between scientists and users, and highly co-productive for general posting in the subreddit that was under investigation. Hopefully, our analysis of r/science can contribute to a better understanding of the complex science-society interface online and may provide fresh ideas for the continuous discussion on perspectives for participation and science communication in the digital media environment.


Dr. Jonas Kaiser is an Assistant Professor for Journalism at Suffolk University and Faculty Associate at the Berkman Klein Center for Internet & Society at Harvard University. He is an inaugural member of the Spotify Safety Advisory Council. Jonas’ research is located at the intersection of computational social science and public sphere theory and includes topics such as algorithms, digital journalism, mis- and disinformation, and science communication.
@JonasKaiser E-mail: jkaiser@cyber.harvard.edu

Dr. Birte Fähnrich is Adjunct Professor at the Institute for Media and Communication Studies at Freie Universität Berlin. Before, she held a Postdoc position at the Berlin-Brandenburg Academy of Sciences and Humanities, and was Principal Investigator at Zeppelin University for the EU-funded project RETHINK. Her research focusses on digital science communication, public and political engagement, and university communication. Since November 2021, Birte is also employed as a senior policy officer at the German Federal Ministry of Education and Research.
@birte_faehnrich E-mail: birte.faehnrich@fu-berlin.de

Laura Heintz is research associate and PhD student in the Institute for Publizistik at Johannes Gutenberg University Mainz. Her research focusses on User-Generated-Content and Science-Society-relations and persuasive media effects, especially at the intersection of science communication, health, risk and technologies.
@LauraHeintz7 E-mail: Lheintz@uni-mainz.de


1 On r/science comments have to be based on scientific findings less than 6 months old, therefore, expression of opinion without evidence is deleted by moderators [ Reddit, 2022 ].

2 Most subreddit volunteers apply actively for moderator positions, while previous moderation experience is often expected [ Matias, 2019b ].

3 Including date, headings, linked domains, up-and-downvotes, and number of comments.

4 The trigger for this intervention was that Victoria Taylor, an employee of the subreddit Ask Me Anything, was fired [ Carson, 2015 ].

5 Those categories included journalistic mass media, scientific journals, websites of scientific organizations, blogs, websites of government institutions, product aggregates/platforms, social media, self-posts originating from Reddit.com, other websites, or miscellaneous.

6 Referring to the subset with > 5 comments; < 0.6 up-downvote-ratio; n = 12,995 posts.

7 Several Reddit posts that were part of our sample were not long enough for topic modeling (i.e., they did not contain enough words after text preparation to be considered useful) and were thus removed from the corpus.

7 The number of analyzed posts reflects the subset of posts in our sample that fit our criteria for controversy.

8 Those subreddits are not prohibited by the content policy, but “average Redditors may nevertheless find content highly offensive or upsetting” [ Reddit, 2021 ].