On June 29, Meta announced transparency and research-related updates for Facebook and Instagram: more details on their ranking system for content in “Feed, Reels, Stories and other surfaces”, improvements to users’ control over their feeds, and tools for public interest research.

This spring, Facebook and Instagram were officially designated as “VLOPs” under the EU’s Digital Services Act (DSA), the bloc’s new rules to improve transparency and accountability online, especially for “very large online platforms and search engines” (VLOPs and VLOSEs). Transparency, access to data for public interest research, and increased user control are core tenets of the DSA, so these announcements can be read as gestures towards future compliance.

New requirements under the DSA

Meta’s announcements do not refer to the DSA explicitly; however, the updates vividly evoke requirements under the new rulebook:

  • The DSA will require all online platforms to inform users about the "main parameters" used in their content recommender systems, as well as any options to adjust those parameters (Art 27).
  • It will additionally require "very large online platforms" and "very large online search engines" (VLOPs and VLOSEs) - those with over 45 million monthly active users in the EU - to provide users with an alternative recommender system not based on user profiling (as defined under the GDPR), which some understand as a non-personalised option (Art 38).
  • It will require the same VLOPs and VLOSEs to open themselves up to scrutiny by vetted researchers and share publicly available data with public interest researchers (Art 40).
  • The VLOPs and VLOSEs must have comprehensive, queryable ad libraries (Art 39).

Meaningful compliance or compliance theatre?

We wanted to dig into these announcements and how they stack up against the spirit of these new rules, and against best practices and expectations for platform accountability and responsible AI. Here we draw on our research on AI transparency, our study of the effectiveness of YouTube’s user controls, and our best practice guide towards Responsible Recommending.

In our research on AI Transparency in Practice, we found that one of the reasons there is little intrinsic motivation to implement a useful level of transparency in AI systems is the lack of external pressure - for example, the lack of legal transparency requirements, or rather the perception of such a lack. “Compliance with legal frameworks as well as audits” were therefore the least cited motivations for developers and deployers of AI systems to implement transparency. This seems to be changing with the DSA and other regulatory requirements on the horizon.

1. Meta's explanations of their ranking systems don't tell the whole story

Meta’s explanations of their algorithmic ranking systems make clear that there are important tradeoffs in ranking decisions, but provide no transparency into how these tradeoffs are decided. Transparency into this calculus is critical, since tradeoffs may involve conflicts of interest between platform revenue and public interest. For instance, we are told that Facebook Feed is a “balanced combination” of outputs from two different AI systems, but we are not told how that balance is achieved or even what is meant by balanced. Similarly, we are told that demonstrated interest and opportunities for discovery are both factored in, but not how they prioritise showing more of what they think people like, compared to creating discovery opportunities.

Meta should make clear the mechanisms for determining, for example, how much of a user’s feed comes from “Friends and Pages and groups that people are connected to”, and how much of it comes from “content that they may be interested in from others they are not connected to”. They should explain any user research, experimentation, metrics evaluated, or other evidence used to determine how to make these tradeoffs.

These explanations use poorly defined goals and values to obscure their true objectives. We read that their model predicts content you’ll find most relevant and valuable, but not how they define “most relevant and valuable”. Meta later mentions predictions such as “how long you are predicted to spend viewing a post”. If this is what they mean by “relevant and valuable”, it should be articulated. If not, they should clearly explain how they operationalise “relevant and valuable”. Reading between the lines, we find a conflation between value to users and user engagement. Looking at “how AI systems work and change over time”, it seems that “content that people find most relevant and valuable” really means “content that people are most likely to engage with.”
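To make concrete why the undisclosed weighting matters, here is a minimal sketch in Python, using entirely hypothetical signals and weights rather than anything Meta has published. It illustrates how a ranking system might blend predicted engagement with predicted discovery value, and how the choice of blend weight quietly defines what “relevant and valuable” means in practice.

```python
# Illustrative sketch only: hypothetical weights and signals, not Meta's actual model.
# It shows why the undisclosed "balance" matters: whoever sets these weights
# effectively defines what "relevant and valuable" means.

def rank_feed(candidates, blend_weight=0.8):
    """Score candidate posts by blending two hypothetical signal groups:
    predicted engagement (likes, comments, dwell time) and predicted
    discovery value (interest in content from unconnected accounts)."""
    scored = []
    for post in candidates:
        engagement_score = (
            0.5 * post["p_like"]          # predicted probability of a like
            + 0.3 * post["p_comment"]     # predicted probability of a comment
            + 0.2 * post["p_dwell_time"]  # normalised predicted viewing time
        )
        discovery_score = post["p_interest_if_unconnected"]
        # The blend weight decides how much engagement dominates discovery.
        final_score = blend_weight * engagement_score + (1 - blend_weight) * discovery_score
        scored.append((final_score, post["id"]))
    return [post_id for _, post_id in sorted(scored, reverse=True)]

example = [
    {"id": "a", "p_like": 0.9, "p_comment": 0.1, "p_dwell_time": 0.7, "p_interest_if_unconnected": 0.2},
    {"id": "b", "p_like": 0.3, "p_comment": 0.2, "p_dwell_time": 0.4, "p_interest_if_unconnected": 0.9},
]
print(rank_feed(example))          # engagement-heavy ordering
print(rank_feed(example, 0.2))     # discovery-heavy ordering
```

Running the sketch with different blend weights reorders the same two posts. That choice of weights, and the reasoning behind it, is exactly the kind of decision that a meaningful explanation of the ranking system would disclose.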

2. These system cards are not quite system cards

Meta's transparency updates were delivered in language familiar to AI experts, organising the different transparency elements into a set of 22 "system cards". System cards serve as a resource for understanding how AI systems work. Their primary purpose is to provide an overview of the AI system, explain how it operates, describe how users can customise the content displayed, and clarify how it delivers that content. Providing this information is meant to help minimise the risks and harms of an AI system by allowing a careful assessment of its socio-technical context, interpretability and explainability, transparency, non-discrimination, robustness, and privacy and security.

As Meta outlined in their very own paper on system-level transparency, the system card should include a step-by-step demonstration of how the system processes a real input, allowing stakeholders to better understand its functionality and challenges.
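For comparison, here is a rough sketch of the kind of structured card that paper describes, with a worked example attached. Every field name and value below is a hypothetical placeholder of ours, not content taken from Meta's published cards.

```python
# A minimal sketch of a machine-readable system card of the kind described in
# Meta's paper on system-level transparency. All fields and values are
# hypothetical placeholders, not the contents of Meta's published cards.

illustrative_system_card = {
    "system": "Feed ranking (illustrative)",
    "purpose": "Order candidate posts for a user's feed",
    "inputs": ["accounts followed", "past interactions", "candidate post features"],
    # Art. 27 DSA: the most significant criteria and their relative importance
    "main_parameters": {
        "predicted_engagement": {"relative_importance": "undisclosed", "rationale": "undisclosed"},
        "predicted_discovery_value": {"relative_importance": "undisclosed", "rationale": "undisclosed"},
    },
    "user_controls": ["unfollow", "hide post", "show more / show less"],
    # The step-by-step demonstration on a real input that the paper calls for
    "worked_example": {
        "input": "a post about local news from an account the user does not follow",
        "steps": [
            "gather candidate posts from connected and unconnected sources",
            "score each candidate with the prediction models listed above",
            "apply integrity demotions and diversity rules",
            "order the feed and record why each post landed where it did",
        ],
        "output": "the post's final position together with the scores that produced it",
    },
}

print(illustrative_system_card["worked_example"]["steps"])
```

Even a skeleton like this makes the gaps visible: the relative importance of the main parameters, the reasoning behind them, and a traceable example of a real input moving through the system.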

The 'system cards' now provided by Meta do not fit the description provided in their own technical paper.

Meta claims: “As part of Meta’s commitment to transparency, today we are sharing 22 system cards that contain information and actionable insights everyone can use to understand and customise their specific AI-powered experiences in our products.”

Providing the right amount and depth of information is not an easy task, as the information needs to be tailored to different transparency stakeholders with different transparency needs. It's therefore impossible to provide one set of information that satisfies all the information needs of "everyone".

Who could the information be useful for?
For the end user, the information Meta is providing is far too fragmented, distributed over many web pages, and in parts redundant as well as overwhelming. The promise to deliver non-technical information is broken when “content understanding” is described as follows: “Meta AI has focused on cutting-edge research work, including MViT, XLM-R/XLM-V, and FLAVA/Omnivore, to understand semantic meanings of content holistically across different modalities”, or by explanations like “These systems understand people’s behaviour preferences utilising very large-scale attention models, graph neural networks, few-shot learning, and other techniques.”

An external AI auditor? They would need much more in-depth information to verify claims about user privacy, system security and robustness, or model performance. How about regulators or enforcement bodies? They would be left with unanswered questions about the “main parameters”. According to Article 27(2) DSA, those shall include at least “the criteria which are most significant in determining the information suggested to the recipient of the service” as well as “the reasons for the relative importance of those parameters”.
An NGO exploring misinformation, or a journalist, would not be able to investigate the fairness of the ranking or whether the system rewards extreme or fake content. If the information does not serve its intended purpose and, in particular, is not actionable, it looks very much like transparency washing: an attempt to create an illusion of transparency.

3. User controls: but do they work?

In addition to the information provided in the form of “system cards”, Meta also announced the option for users to “interactively modify” the input to observe changes in the system's response. The only options provided, though, are the familiar feedback mechanisms such as liking and unfollowing. These high-level "options" will not - as stated - "empower the user" to significantly influence the ranking system in a way that lets them "control what they see". The control options are marginal at the moment, though more meaningful options may still follow.

Our research into YouTube’s user controls found that such user feedback signals were not as highly weighted as other system inputs. For this reason, when offering user controls of this kind, it is critical to also provide a verification option so that independent parties can assess the functionality and accuracy of the controls. This not only increases trust in these systems; it is also warranted because VLOPs are not just ordinary providers, but play a systemic role in the digital infrastructure.
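To illustrate what a low weighting means in practice, and what an independent verification could check, here is a minimal sketch with made-up numbers; none of the weights reflect YouTube's or Meta's actual systems.

```python
# Illustrative sketch: hypothetical weights showing why a lowly weighted
# "not interested" signal barely moves a recommendation score, and how an
# independent check might verify whether a control actually works.

def score(post, user_dismissed_topic=False, dismissal_weight=0.05):
    base = 0.6 * post["p_engagement"] + 0.4 * post["p_topic_match"]
    if user_dismissed_topic and post["topic"] == "dismissed_topic":
        base -= dismissal_weight  # the control's real influence lives in this weight
    return base

post = {"p_engagement": 0.8, "p_topic_match": 0.7, "topic": "dismissed_topic"}

before = score(post)
after = score(post, user_dismissed_topic=True)
# A verification option would let external parties measure this gap:
print(f"score before dismissal: {before:.2f}, after: {after:.2f}")
# With a weight of 0.05 the post still ranks almost as highly, so the control
# exists on paper but has little effect in practice.
```

An external check of this kind, comparing rankings before and after a user exercises a control, is exactly what a verification option would make possible.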


4. Data sharing still set on the company’s terms

Meta’s AI transparency announcement included their plans to increase data sharing with researchers, which is indeed a critical element of platform transparency and a core element of the DSA.

Historically, data sharing between platforms and the research community has been ad hoc and voluntary, which means that partnerships and agreements have come largely on the company’s terms. The DSA sets out a regulatory regime for data sharing with researchers, with tiers of access. Vetted researchers linked to European research organisations will be able to submit requests; it seems Meta’s partnership with the University of Michigan’s Inter-university Consortium for Political and Social Research could be a pilot for this kind of arrangement.

Designated platforms will also have to share non-personal, “publicly accessible” data with a wider swath of public interest researchers. Meta has done this through its dashboard, CrowdTangle. Their announcement mentions a new suite of tools under the heading of “Meta Content Library and API”. Mozilla and other experts have advocated for platforms to grant public access to public platform data in compliance with the obligations of the DSA and the Code of Practice on Disinformation. Meta states that “these tools will provide the most comprehensive access to publicly-available content across Facebook and Instagram of any research tool we have built to date and also help us meet new data-sharing and transparency compliance obligations”, seemingly referring to the DSA (without explicitly mentioning it).

This is a promising announcement, but the DSA's new data sharing framework has not yet been fully elaborated, so it is too soon to assess Meta's compliance with it. This may be a first interpretation of the data sharing requirement, but ultimately these rules must be designed by all stakeholders together, with input from researchers, experts, and regulators - in particular, data protection authorities - and not only by the platforms.

Mind the gap between superficial and meaningful compliance

These announcements are not going to be the last. We can expect more updates to come rolling in, both from these companies and from the other designated platforms and search engines, as we near the first deadlines for regulatory requirements. However, as always with the implementation of new legislation, the devil is in the details.

To ensure that compliance with the DSA’s transparency, researcher access, and user controls requirements is meaningful and effective, we make the following recommendations:

  1. Information should be accessible from a single place (with sub-content), not distributed across multiple sites. There is no one-size-fits-all transparency. First and foremost, the user should receive the necessary information in a concise and easily understandable manner; the length of the information is not an indication of its meaningfulness. Separate information or, where necessary, technical access should be granted to watchdogs or auditors. User testing and human-centric design help to find out whether the information is really understandable and, above all, useful for the recipient.
  2. True user control can go beyond simply opting out of content prioritised and customised by the platform (personalised content) and allow each user to select their own preferences via an easy-to-use dashboard, directly influencing what they see. This applies both to the prioritised content and to the advertising that is displayed. These new transparency tools and user controls must be functional and effective, not merely a box to tick for regulatory compliance.
  3. The DSA requires comprehensive ad libraries for the VLOPs like Facebook and Instagram, but it also requires them to share data with public interest researchers. These data sharing rules need to be crafted along with researchers, civil society experts, and regulators - not just by companies. For this to be effective, this data should be comprehensive, verifiable, usable, and available to a wide variety of researchers.

Meta was also one of five major AI developers to make a voluntary commitment last week to, among other measures, assess the potential risks of their AI systems. Interestingly, they have committed to transparency by publishing the reports of their internal assessments. Of course, this voluntary effort is not a substitute for upcoming enforceable regulatory obligations, but it is meant to bridge the gap until those obligations apply.

Industry should be aware that, with the DSA, enforcement is paramount. To help monitor compliance, the newly launched ECAT (European Centre for Algorithmic Transparency) will contribute scientific and technical expertise to the Commission's exclusive supervisory and enforcement role over the systemic obligations on VLOPs and VLOSEs provided for under the DSA.

