Facebook hides the most important data on disinformation. Atlantic Report

Billions of people have apparently come across fake news about vaccines on Facebook, but that number means nothing without a denominator, The Atlantic points out.

Leaked internal documents suggest that Facebook, which recently rebranded itself as Meta, is doing far worse than it claims at minimizing misinformation about Covid-19 vaccines on its platform.

Online misinformation about coronavirus and vaccines is a major concern. In one study, respondents who got some or all of their news from Facebook were significantly more likely to reject Covid-19 vaccines than those who got their news from traditional media sources.

The mere count of misinformation items found on a social media platform leaves two key questions unanswered: How likely are users to come across misinformation, and are some users particularly likely to be affected by it? These are the denominator problem and the distribution problem, The Atlantic notes.

The Covid-19 disinformation study "Facebook's Algorithm: A Major Threat to Public Health," published by the public-interest advocacy group Avaaz in August 2020, reported that sources that frequently shared health disinformation – 82 websites and 42 Facebook pages – had an estimated total reach of 3.8 billion views in one year.

At first glance, this is an incredibly large number. But it is important to remember that this is the numerator. To understand what 3.8 billion views in a year means, we also need to calculate the denominator. The numerator is the part of a fraction above the line; it is divided by the part below the line, the denominator.

One possible denominator is Facebook's 2.9 billion monthly active users, in which case, on average, every Facebook user has been exposed to at least one piece of information from these health disinformation sources. But the numerator is 3.8 billion content views, not discrete users. How many pieces of information does the average Facebook user encounter in a year? Meta does not disclose this figure.

Market researchers estimate that Facebook users spend 19 to 38 minutes a day on the platform. If Facebook's 1.93 billion daily active users see an average of 10 posts in their daily sessions – a very conservative estimate – the denominator for those 3.8 billion views per year is 7.044 trillion (1.93 billion daily users × 10 posts a day × 365 days a year). This means that about 0.05% of the content viewed on Facebook consists of posts shared by these suspect pages.
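As a rough check of that arithmetic, here is a minimal back-of-the-envelope sketch in Python; the inputs are simply the estimates quoted above, not measured values:

    # Back-of-the-envelope check of the article's arithmetic (illustrative only).
    numerator = 3.8e9                 # Avaaz's estimated yearly views of the flagged sources
    monthly_active_users = 2.9e9      # one possible denominator
    daily_active_users = 1.93e9
    posts_seen_per_day = 10           # conservative guess at posts seen per user per day
    days_per_year = 365

    views_per_monthly_user = numerator / monthly_active_users
    total_views_per_year = daily_active_users * posts_seen_per_day * days_per_year
    share_of_views = numerator / total_views_per_year

    print(f"Views per monthly active user: {views_per_monthly_user:.2f}")    # ~1.31
    print(f"Estimated content views per year: {total_views_per_year:.3e}")   # ~7.04e12
    print(f"Share from the flagged sources: {share_of_views:.4%}")           # ~0.0539%

Which denominator you divide by changes the story: per user, the figure sounds alarming, while as a share of all content views it is a small fraction of a percent.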

The 3.8 billion view figure includes all content posted on these pages, including harmless health content, so the percentage of Facebook posts that are health disinformation is less than one twentieth of a percent.

Is it troubling that there is enough misinformation on Facebook for everyone to have likely encountered at least one case? Or is it reassuring that 99.95% of what is shared on Facebook doesn't come from the sites Avaaz warns about? Neither.

In addition to estimating a denominator, it is important to consider the distribution of this information. Does everyone on Facebook have an equal chance of encountering health disinformation? Or are people who identify as anti-vaccine, or who are looking for "alternative health" information, more likely to encounter this type of misinformation?

Another social media study, one focused on extremist content on YouTube, offers a method for understanding the distribution of disinformation. Using browser data from 915 web users, an Anti-Defamation League team recruited a large, demographically diverse sample of U.S. web users and oversampled two groups: heavy YouTube users and individuals who showed strong negative racial or gender biases in a set of questions asked by the researchers. Oversampling means surveying a small subset of a population more heavily than its share of the population, in order to collect better data on that subset.
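For illustration only, here is a minimal sketch of what oversampling looks like in practice; the numbers and the procedure are hypothetical, not the ADL study's actual design:

    import random

    # Hypothetical population of 10,000 users, about 5% of whom are heavy YouTube users.
    random.seed(0)
    population = [{"heavy_user": random.random() < 0.05} for _ in range(10_000)]

    heavy = [p for p in population if p["heavy_user"]]
    others = [p for p in population if not p["heavy_user"]]

    # Oversample: recruit heavy users at a far higher rate than their population
    # share, so the final sample contains enough of them to analyze separately.
    sample = random.sample(heavy, 200) + random.sample(others, 800)

    share_in_population = len(heavy) / len(population)
    share_in_sample = sum(p["heavy_user"] for p in sample) / len(sample)
    print(f"Heavy users: {share_in_population:.1%} of the population, "
          f"{share_in_sample:.1%} of the sample")   # roughly 5% vs 20%

    # Population-level estimates are then recovered by weighting each respondent
    # by (population share of their group) / (sample share of their group).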

The researchers found that 9.2 percent of participants saw at least one video from an extremist channel, and 22.1 percent saw at least one video from an "alternative" channel, during the months covered by the study. An important piece of context: a small group of people accounted for most of the views of these videos, and more than 90% of views of extremist or "alternative" videos came from people who reported a high level of racial or gender resentment in a survey conducted before the study.

While roughly one in 10 people encountered extremist content on YouTube and two in 10 encountered content from right-wing provocateurs, most people who came across such content "bounced off" it and went elsewhere. The group that found extremist content and sought out more of it were people who presumably had an existing interest: people with strong racist and sexist attitudes.

The authors concluded that "consumption of this potentially harmful content is instead concentrated among Americans who are already high in racial resentment," and that YouTube's algorithms may reinforce this pattern. In other words, knowing only what fraction of users encounter extreme content doesn't tell you how much of it people are consuming. For that, you also need to know the distribution.
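A toy example of why that distinction matters: the same overall reach can hide a very skewed distribution of consumption. The counts below are invented for illustration, not the study's data:

    # 1,000 hypothetical users: 900 saw nothing, 90 saw one or a few videos,
    # and 10 heavy consumers watched 200 videos each.
    views_per_user = [0] * 900 + [1] * 60 + [5] * 30 + [200] * 10

    total_views = sum(views_per_user)
    reach = sum(v > 0 for v in views_per_user) / len(views_per_user)
    top_10_share = sum(sorted(views_per_user, reverse=True)[:10]) / total_views

    print(f"Users who saw at least one video: {reach:.1%}")                  # 10.0%
    print(f"Share of all views from the top 10 users: {top_10_share:.1%}")   # ~90%

Reach ("10% saw at least one video") and consumption ("the top 10 users account for about 90% of all views") describe the same data, and both are needed to understand who the content actually reaches.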

A widely publicized study by the anti-hate advocacy group Center for Countering Digital Hate, titled "Pandemic Profiteers," showed that of 30 anti-vaccine Facebook groups examined, 12 anti-vaccine celebrities were responsible for 70% of the content circulating in those groups, and the three most prominent were responsible for nearly half. But again, it is crucial to ask about denominators: How many anti-vaccine groups are hosted on Facebook? And what percentage of Facebook users encounter the kind of information shared in these groups?

With no information on denominators and distribution, the study reveals something interesting about these 30 anti-vaccine Facebook groups, but nothing about medical misinformation on Facebook as a whole.

This type of study raises the question: "If researchers can find this content, why can't social media platforms identify and remove it?" The Pandemic Profiteers study, which implies that Facebook could solve 70% of its medical disinformation problem by deleting just a dozen accounts, explicitly advocates deplatforming these purveyors of disinformation. However, I found that, as of the end of August, Facebook had already removed 10 of the 12 anti-vaccine influencers featured in the study from its platform.

Consider Del Bigtree, one of the four biggest spreaders of vaccine disinformation on Facebook. The problem isn't that Bigtree is recruiting new anti-vaccine followers on Facebook; it's that Facebook users follow Bigtree on other websites and bring his content into their Facebook communities. It's not 12 individuals and groups posting health disinformation online; it's more likely thousands of individual Facebook users sharing misinformation, originated by this dozen people, that they found elsewhere on the web. It is much harder to ban thousands of Facebook users than it is to ban 12 anti-vaccine celebrities.

This is why denominator and distribution issues are essential to understanding online disinformation. Denominator and distribution allow researchers to ask how common or rare online behaviors are, and who engages in them. If millions of users each come across occasional bits of medical misinformation, warning labels could be an effective intervention. But if medical misinformation is consumed primarily by a smaller group that actively seeks out and shares this content, those warning labels are most likely useless.

Trying to understand misinformation by counting it, regardless of denominators or distribution, is what happens when good intentions collide with shoddy tools. No social media platform allows researchers to accurately calculate how prominent a particular piece of content is on its platform.

Facebook restricts most researchers to its CrowdTangle tool, which shares information about content engagement, but engagement is not the same as content views. Twitter explicitly forbids researchers from calculating a denominator: the number of Twitter users or the number of tweets shared in a day. YouTube makes it so difficult to find out how many videos are hosted on its service that Google routinely asks interview candidates to estimate the number of videos hosted on YouTube as a way to gauge their quantitative skills.

Social media platform leaders have argued that their tools, despite their problems, are good for society, but this argument would be more compelling if researchers could independently verify this claim.

As the social impacts of social media become more prominent, the pressure on Big Tech platforms to disclose more data about their users and content is set to increase. If these companies respond by increasing the amount of information that researchers can access, look very carefully: will they allow researchers to study the denominator and distribution of online content? And if not, are they afraid of what researchers will find?

(Extract from the foreign press review of Epr Comunicazione)


This is a machine translation from Italian language of a post published on Start Magazine at the URL https://www.startmag.it/innovazione/facebook-nasconde-i-dati-piu-importanti-sulla-disinformazione-report-atlantic/ on Sat, 06 Nov 2021 06:40:15 +0000.