Ranking and Recommendation Systems

Written by Anti-Defamation League, Avaaz, Decode Democracy, Mozilla and New America's Open Technology Institute

As previously noted, most social media platforms today rely on algorithmic systems that personalize user experiences and that predict and optimize for user engagement. Companies can curate the content users see in many ways, including by algorithmically ranking content (such as in a News Feed) or by algorithmically recommending content to users (such as suggesting which video to watch next). In many cases, ranking and recommendation algorithms work in tandem to provide users with a curated and personalized experience. Both ranking and recommendation algorithms are designed to consider a plethora of signals, many of which are informed by implicit and explicit user behaviors. Companies assert that these tools promote “relevant” and “useful” content to users, but they also allow platforms to maximize user attention and ad revenue.

Typically, social media platforms will rank content in one of two ways. Either the company will rank content on a News Feed or similar feature based on user behavior, or the platform will rank search results generated by a search engine that receives direct user input, such as the Google search engine or the YouTube search feature. Twitter, for example, uses a model that “predicts how interesting and engaging a Tweet would be” in order to determine how posts are ranked on the Twitter timeline. Similarly, Facebook uses a system that “determines which posts show up in your News Feed, and in what order, by predicting what you’re most likely to be interested in or engage with.” YouTube has also used systems that rely on clicks, watch time, and surveys in order to curate and present content.

Internet platforms also deploy different types of recommendation systems. These include content-based systems (which suggest items to a user that are similar to items they have previously shown interest in), collaborative-filtering systems (which suggest items to a user by assessing the interests and behaviors of users who have similar interests), and knowledge-based systems (which suggest items to a user by evaluating a user’s interests and characteristics, in addition to the characteristics of an item). Most companies use a combination of these systems to drive their recommendations.
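To make the distinction between the first two approaches concrete, the following is a minimal sketch, not a description of any platform's actual system. The item names, topic weights, user histories, and function names are all hypothetical, and cosine similarity is simply one common way to compare items or users. Knowledge-based systems are omitted for brevity.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical item features (topic weights) and user interaction histories.
ITEM_FEATURES = {
    "cooking_101": {"cooking": 1.0},
    "knife_skills": {"cooking": 0.8, "tools": 0.2},
    "election_recap": {"politics": 1.0},
    "budget_travel": {"travel": 1.0},
}
USER_HISTORY = {
    "alice": {"cooking_101": 1.0},
    "bob": {"cooking_101": 1.0, "budget_travel": 1.0},
    "carol": {"election_recap": 1.0},
}

def content_based(user):
    """Suggest unseen items whose features resemble items the user engaged with."""
    seen = USER_HISTORY[user]
    scores = {}
    for item, feats in ITEM_FEATURES.items():
        if item in seen:
            continue
        scores[item] = max(cosine(feats, ITEM_FEATURES[s]) for s in seen)
    return sorted(scores, key=scores.get, reverse=True)

def collaborative(user):
    """Suggest unseen items that users with similar histories engaged with."""
    seen = USER_HISTORY[user]
    scores = {}
    for other, history in USER_HISTORY.items():
        if other == user:
            continue
        similarity = cosine(seen, history)
        for item, weight in history.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + similarity * weight
    return sorted(scores, key=scores.get, reverse=True)

print(content_based("alice")[0])   # item most similar to what alice watched
print(collaborative("alice")[0])   # item favored by users who resemble alice
```

In this toy data, the content-based function surfaces another cooking video for "alice", while the collaborative function surfaces a travel video because a user with an overlapping history engaged with it. A hybrid system would blend both kinds of score.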

While ranking and recommendation systems may deliver “relevant” content to users, the fact that many of these systems are designed to optimize for engagement means that harmful content (such as misinformation, disinformation, hate speech, and graphic and violent content), which is often more engaging, is also amplified. In this way, both ranking and recommendation algorithms can profoundly shape a user’s online experience and determine what kind of information they engage with.

For example, if a ranking algorithm were designed to weigh engagement heavily in its decision-making, it could algorithmically amplify potentially harmful content by ranking it higher in a user’s News Feed. Similarly, the algorithm could downrank content that is less engaging in nature, even if the content contains reliable information. As a result, users would be less likely to view this legitimate content.
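The effect of weighting engagement heavily can be illustrated with a small sketch. Everything here is an assumption made for illustration: real ranking systems use many more signals, and the `Post` fields, the `rank` function, the weights, and the example scores are hypothetical rather than drawn from any platform.

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    predicted_engagement: float  # hypothetical model output in [0, 1]
    reliability: float           # hypothetical source-quality signal in [0, 1]

def rank(posts, engagement_weight):
    """Order posts by a weighted blend of engagement and reliability."""
    def score(post):
        return (engagement_weight * post.predicted_engagement
                + (1 - engagement_weight) * post.reliability)
    return [p.post_id for p in sorted(posts, key=score, reverse=True)]

posts = [
    Post("sensational_rumor", predicted_engagement=0.9, reliability=0.1),
    Post("health_agency_update", predicted_engagement=0.3, reliability=0.9),
]

# Weighting engagement at 0.9 puts the rumor first; at 0.4 the order flips.
engagement_heavy = rank(posts, engagement_weight=0.9)
balanced = rank(posts, engagement_weight=0.4)
print(engagement_heavy)
print(balanced)
```

The same two posts appear in opposite orders depending only on the weight, which is the design choice the paragraph above describes.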

Although ranking algorithms can amplify harmful and misleading content, many platforms have begun using algorithmic ranking techniques to reduce the spread of misleading content, especially during the COVID-19 pandemic and in the months leading up to the 2020 U.S. presidential election. For example, when a piece of content on Facebook is fact-checked and debunked by one of the company’s fact-checking partners, the company generally appends a warning label to the content and reduces its distribution by algorithmically downranking the post in users’ News Feeds. However, methods such as downranking still allow misleading content to remain online, which means users can still access and share it. According to 2018 findings published in the leading academic journal Science, false information was 70% more likely to be retweeted than the truth, so algorithmic downranking alone may not be sufficient to stop the spread of harmful and misleading information.
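One simple way to picture downranking is as a multiplier applied to a post's ranking score once it has been flagged. The sketch below assumes this multiplicative form; the `apply_downranking` function, the `demotion_factor` value, and the example scores are hypothetical and do not reflect how Facebook or any other company actually implements the practice.

```python
def apply_downranking(scores, debunked_ids, demotion_factor=0.2):
    """Return new scores with debunked posts multiplied by demotion_factor."""
    return {
        post_id: (score * demotion_factor if post_id in debunked_ids else score)
        for post_id, score in scores.items()
    }

# Hypothetical ranking scores before any fact-check is applied.
scores = {"debunked_claim": 0.9, "local_news": 0.5, "family_photo": 0.4}

adjusted = apply_downranking(scores, debunked_ids={"debunked_claim"})
feed = sorted(adjusted, key=adjusted.get, reverse=True)
print(feed)
```

Note that the debunked post moves to the bottom of the feed but is still in it, which mirrors the limitation described above: downranked content remains available to view and share.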

Companies such as Facebook have explored other methods for combating misleading information online, such as labeling posts that have been debunked or that contain information on topics that are commonly the focus of misinformation and disinformation campaigns, such as COVID-19. These practices are often deployed in tandem with algorithmic downranking methods. Last year, Facebook shared that it had labeled 167 million user posts for featuring information about the coronavirus that had been debunked by its fact-checking partners. Earlier this year, however, the company announced it would append labels that link to additional information to all posts related to COVID-19 vaccines. While some lauded this move, others noted that applying such labels on a broad scale would render them meaningless. Some of the additional methods that platforms have deployed alongside algorithmic downranking are therefore also limited. Further, there is currently little transparency around how downranking and other similar practices are applied, which makes it difficult to monitor whether they are consistently enforced. Most internet platforms have also not provided meaningful data indicating that these efforts are effective.

Like ranking algorithms, recommendation algorithms designed to emphasize engagement can also amplify harmful and misleading content. This has been especially visible on YouTube. Over the past several years, numerous efforts—from The New York Times technology columnist Kevin Roose’s Rabbit Hole series to Mozilla’s YouTube Regrets project—have demonstrated how the platform’s recommendation system can drive people towards extreme and harmful videos.

YouTube’s shortcomings were especially pronounced in the weeks following the 2020 U.S. election, when former President Trump’s false claim that he had won gained major traction on the platform. A 2021 study by Pendulum shows that “14,000 YouTube videos, which accounted for 820 million views, supported President Trump’s false claims of widespread voter fraud.” Although YouTube offers little transparency into its recommendation system, there’s little doubt that hundreds of millions of viewers wouldn’t have stumbled onto these videos on their own without some nudge from the algorithm. As YouTube itself has stated, 70% of watch time on the platform is driven by video suggestions made by its algorithm. In 2019, Mozilla called on YouTube to provide researchers with access to meaningful data, better simulation tools, and tools that empower research and analysis, so that researchers could better understand how recommendations shape the online experience and can amplify harmful content on the platform. The company has not implemented any of these recommendations, despite mounting pressure.

Algorithmic recommendations of “groups” for users to join have also raised numerous concerns over the past several years, as platforms such as Facebook have suggested groups that have been penalized on the platform for amplifying harmful content, including misleading information and conspiracy theories. Earlier this year, Facebook announced its plans to remove civic and political groups, as well as newly created groups, from recommendations worldwide. It had already stopped recommending such groups in the United States in January. The company also shared that it would restrict the reach of groups that violated the platform’s Community Standards. This came more than five years after Facebook employees first raised the alarm about the dangers of group recommendations, and more than five months after organizations such as Mozilla and Accountable Tech, along with thousands of internet users, called on Facebook to stop amplifying election disinformation by pausing group recommendations in the United States through Inauguration Day in 2021. However, a recent study from The Markup shows that despite Facebook’s alleged policy change, the company has continued to recommend political groups.

Like algorithmic ad-targeting and delivery systems, algorithmic ranking and recommendation systems rely on the collection and use of vast amounts of user data. Given the harms these systems can cause, users should have access to more robust privacy protections and user controls. In particular, users should be able to determine how their personal data is collected and used by algorithmic systems, and what kind of content they are served by these systems. For example, users should be able to control how their personal data is used to inform the video recommendations they receive, and be able to opt out of being recommended certain categories of videos, such as political videos, if they so desire.

There are currently a number of research initiatives and policy proposals seeking to limit or contain the harmful impacts of algorithmic ranking systems. One approach centers on generating product friction, meaning anything that inhibits user action within a digital interface. On the front end, a company could generate friction by encouraging users to be more thoughtful before sharing or posting content. On the back end, a company could alter the algorithms that deliver content to users by, for example, requiring content that reaches a certain threshold of circulation to be reviewed before it can continue being ranked highly in News Feeds. If a platform is recommending too much content, the company could also require more specificity in search algorithms, so as to not predict user preferences. Other proposed approaches include pursuing personalization systems that optimize for both user engagement and consumption diversity, so that users consume a broader range of content on a particular platform; promoting more understandable and accessible user controls, along with information that helps users understand how adjusting those controls would alter what they see; developing survey-based measures to refine content selection; and recommending feeds to users rather than items. In the disinformation context, researchers have especially recommended designing algorithms to deprioritize content known to be inaccurate and untrustworthy (as was done with news sources during the 2020 U.S. election transition period).
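The back-end friction idea, in which widely circulated content must be reviewed before it can keep ranking highly, can be sketched as a simple gate in front of the ranking step. This is an illustration under stated assumptions only: the `REVIEW_THRESHOLD` value, the `Post` fields, and the choice to hold unreviewed posts at the bottom of the feed (rather than removing them) are all hypothetical.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 10_000  # hypothetical share count that triggers review

@dataclass
class Post:
    post_id: str
    score: float          # hypothetical ranking score
    shares: int           # how widely the post has circulated
    reviewed: bool = False

def eligible_for_top_ranking(post):
    """A post ranks normally if it is below the threshold or has been reviewed."""
    return post.shares < REVIEW_THRESHOLD or post.reviewed

def rank_with_friction(posts):
    """Rank eligible posts by score; hold unreviewed viral posts at the bottom."""
    by_score = sorted(posts, key=lambda p: p.score, reverse=True)
    eligible = [p for p in by_score if eligible_for_top_ranking(p)]
    held = [p for p in by_score if not eligible_for_top_ranking(p)]
    return [p.post_id for p in eligible + held]

posts = [
    Post("viral_unreviewed", score=0.95, shares=50_000),
    Post("viral_reviewed", score=0.90, shares=40_000, reviewed=True),
    Post("niche_post", score=0.50, shares=120),
]

feed = rank_with_friction(posts)
print(feed)
```

Here the highest-scoring post is held back until review because it has crossed the circulation threshold, while the reviewed viral post and the low-circulation post rank normally.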

In addition, the NewsQuality Initiative has worked with journalists and technologists to define appropriate goals for content selection and ranking (e.g., prioritizing original local reporting, differentiating opinion pieces from news, and giving more weight to fact-checked articles), in order to tackle harms caused by algorithmic ranking systems. The initiative, a program at the Tow-Knight Center for Entrepreneurial Journalism at the Craig Newmark Graduate School of Journalism, seeks to elevate quality journalism in a time when algorithms are pervasively used to rank and recommend news articles online.

Some online platforms have also experimented with altering recommendation systems in order to combat the spread of misleading information. For example, in the months leading up to the 2020 U.S. presidential election, Facebook altered its ranking and recommendation algorithms to deprioritize new content known to be inaccurate and untrustworthy. This change, however, was reversed shortly after the election. Additionally, Twitter has proposed building a decentralized algorithm marketplace that allows users to “choose” their own recommendation algorithms.

While platform efforts to address the harms associated with algorithmic ranking and recommendation systems are important, regulation could also play a role in this regard. For example, regulation could require companies to provide more transparency around how these algorithmic systems are developed and how they operate through independent audits or impact assessments. Both audits and impact assessments require platforms or external third parties to evaluate platform operations using clear metrics. Regulators could then enforce penalties if a company fails to comply with these processes. Although there is currently no consensus around how audits and impact assessments should be structured, some researchers and civil society organizations have suggested that these mechanisms could consider factors such as algorithmic design, transparency, impact, public harm, user choice, and whether a company enforces its existing policies, such as those related to downranking, consistently. In addition, in order for externally conducted audits and impact assessments to succeed, companies must provide researchers with reliable and robust access to necessary data.

The initial draft of the Digital Services Act (DSA) in the European Union provides a good framework for considering this kind of algorithmic scrutiny, as it imposes special transparency and auditing obligations on very large digital platforms with more than 45 million monthly active users. The DSA proposes requiring these platforms to submit to mandatory audits by both third-party independent auditors and regulators, with risk assessments of “significant systemic risks” and the transparency of recommender systems being subject to audit. Some analysts also favor a co-regulation approach, in which large digital platforms voluntarily adhere to minimum standards for regulating their own recommender systems.
