Bogdana Rakova is a Mozilla Senior Fellow in Trustworthy AI.


I got excited about OpenAI’s DALL·E 2 AI system when, scrolling through my social media feed after a long workday, I noticed a friend’s note: “If anyone wants to experience some wonder and awe, I just received access to OpenAI's DALL·E 2 — which is the most mind-blowing AI artwork I've played with yet. I'll take a few prompt requests!” wrote Danielle Baskin, a prominent Silicon Valley conceptual artist and dear friend.

DALL·E 2 is a machine learning model that can create realistic images and art from a description in natural language. It is currently a research project, and access is available to only a small number of individuals. Having come across a number of articles about DALL·E 2’s mesmerizing drawings in the past few days, I was quick to get excited about Danielle’s willingness to engage her community in experimenting with the model.

>> Given the text “OpenAI's DALL-E 2 painting its own self portrait”

<< DALL·E 2’s response was:

[Screenshot: DALL·E 2’s generated images]


What is fascinating is that the robot is painting a picture resembling a landscape: a hill under a blue sky. Is the robot “thinking” of itself as Nature?

>> Given the text “OpenAI's DALL-E 2 feelings for environmental degradation”

<< DALL·E 2’s response was:

[Screenshot: the content policy warning message]


The response was that my request may not follow OpenAI’s content policy. OpenAI’s website states this explicitly: “We won’t generate images if our filters identify text prompts and image uploads that may violate our policies. We also have automated and human monitoring systems to guard against misuse.”

Why did my request to the AI model violate the content policy?

OpenAI’s content policy starts with “Do not attempt to create, upload, or share images that are not G-rated or that could cause harm,” followed by a bullet-point list of violation categories. Here, G is the least restrictive rating in the Motion Picture Association’s framework, meaning a movie rated G is suitable for children. Furthermore, the policy refers to harm. I’m left with a number of questions: What is meant by harm? Who is harmed? Who is targeted? How is harm located? Who locates it? How do you respond to it?

Thinking back to my prompt, which includes the phrase “environmental degradation”, I read through the surprisingly simple bullet-point list of policy violation categories: hate, harassment, violence, self-harm, adult, shocking, illegal activity, deception, political, public and personal health, and spam. I’m left with even more questions.

Given the threat of climate misinformation and disinformation, did the model consider that I was asking it to take a view on environmental degradation, leading to a violation in the “political” category because it could be seen as a political stance? Or is it a violation under the “deception” category because it could be informed by major conspiracies? Or is it a violation under the “violence” or “harassment” categories because of the word “degradation” in my prompt? I’m left wondering, because surely an AI model with the level of sophistication in natural language understanding that OpenAI has demonstrated could provide a more meaningful response than “It looks like this request may not follow our content policy. Please try another.”
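To make that last guess concrete, here is a deliberately naive keyword filter, written as a purely hypothetical sketch. OpenAI has not disclosed how its DALL·E 2 filters work; the categories and keywords below are illustrative assumptions, not the actual policy implementation.

    # A toy, hypothetical keyword-based prompt filter -- NOT OpenAI's actual system.
    # It illustrates how a blunt filter could flag a benign prompt.
    BLOCKED_KEYWORDS = {
        "violence": {"attack", "degradation", "abuse"},
        "harassment": {"humiliate", "degrade"},
        "political": {"election", "propaganda"},
    }

    def check_prompt(prompt: str) -> list[str]:
        """Return the policy categories this toy filter would flag for a prompt."""
        words = {word.strip(".,!?'\"").lower() for word in prompt.split()}
        return [category for category, keywords in BLOCKED_KEYWORDS.items() if words & keywords]

    print(check_prompt("OpenAI's DALL-E 2 feelings for environmental degradation"))
    # ['violence'] -- flagged only because the prompt contains "degradation",
    # with no sense that the word refers to the environment.

A filter this blunt cannot tell “environmental degradation” apart from the degradation of a person, which is exactly why an unexplained “try another prompt” message leaves users guessing.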

Being obsessively immersed in the AI space, I’ve read the research and the tremendous work by Pamela Mishkin and team on openly discussing the risks and limitations of DALL·E 2 and the potential impacts of its deployment. Building on their work, I argue that we need new models of engagement that enable (1) effectively communicating risks and limitations to everyday consumers, while (2) providing avenues for contestability and reporting that hold companies deploying cutting-edge AI research models accountable.

The truth is that I don’t know how my interaction with the AI model violated the policy, and my only option is to report my frustration by sending an email to [email protected]. Again, I’m left wondering: What happens to the report? How, and by whom, is any decision about the report made? Who is considered to have the “expertise” in that process? A general support email looks like a black hole, and I’m reluctant to engage.


~

This brings us to a bigger challenge: the lack of transparency and literacy with regard to the contractual agreements between people and technology companies. Notoriously, “I agree to the terms of service” is perhaps the most falsely given form of consent, often leaving individuals powerless in the face of corporate Legal, Policy, and Trust and Safety teams.

Today, for the first time, I had the “computer says no” experience. Ultimately, there is no harm done and I go on with my evening, wondering: What could be a meaningful way to improve transparency about consumer tech companies’ policies and their implications for the online experiences of diverse people globally?

AI companies’ policies lead to real-world harm.

While there was no harm done in this specific scenario, that’s not always the case with AI systems. These systems often make opaque decisions with grave consequences, from parole to health insurance. And there is little that consumers or lawmakers can do to hold the companies behind them accountable.

In a very few cases, lawyers and communities have been able to hold consumer tech companies accountable, and only through lengthy lawsuits. See here and here.

What could better transparency about consumer tech companies’ content policies look like?

What if, instead of a one-way street, content policies and other contractual agreements could embody the notion of a “meeting of the minds” between a consumer tech company and its users, through:

  • Improved contractual literacy that empowers users by making transparent what they can and cannot do with a machine learning model or system.
  • Improved reporting mechanisms that enable new feedback loops between users and builders of AI, ensuring that harmful interactions between people and AI are addressed in meaningful ways (a sketch of what such a report could look like follows this list).
  • Improved cooperation between companies and civil society actors who have proven expertise to act when needed.
  • Heightened focus on crisis zones and contested topics.
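To make the reporting idea concrete, below is a minimal sketch of what a structured contestation report could look like. The schema, field names, and class are hypothetical, not any company’s actual reporting interface.

    # A hypothetical, minimal schema for a structured policy-contestation report.
    # Purely illustrative -- not OpenAI's (or anyone's) actual reporting API.
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class PolicyContestationReport:
        prompt: str                    # the text the user submitted
        policy_category: str | None    # category cited by the system, if any
        user_explanation: str          # why the user believes the refusal was wrong
        model_version: str             # which model or system produced the refusal
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat()
        )

    report = PolicyContestationReport(
        prompt="OpenAI's DALL-E 2 feelings for environmental degradation",
        policy_category=None,  # the refusal message named no category
        user_explanation="The prompt concerns the environment, not violence or harassment.",
        model_version="research preview",
    )
    print(report)

Unlike an email sent into a black hole, a report in this form could be triaged by category, answered with the specific rule that was applied, and audited by civil society actors.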

Together with Dr. Megan Ma, a residential fellow at Stanford’s Center for Legal Informatics, I am exploring these opportunities through the lens of computational law, trustworthy AI, and the real-world challenges of practitioners in industry.

I acknowledge that many actors and communities are actively working to remediate these challenges, and I’d be interested in hearing from you. Reach out to me on Twitter @bobirakova to learn more about the research, share your perspectives, questions, or concerns, and engage in co-creating a new trustworthy social imaginary for improved transparency and human agency in the contractual agreements between people and AI.