Graduate students from the Atlanta University Center's Data Science Club led a design thinking workshop on the criteria and considerations needed to develop Trustworthy AI at MozFest 2022. Here's what they learned.

Machine learning has brought the world a long way by making technology more convenient in our everyday lives. Unfortunately, there is little control over who trains these systems, which has led to bias in many machine learning techniques. This bias is often referred to as algorithmic injustice, i.e., the failure to apply concepts like social justice and ethics to the various parts of algorithmic systems, including machine learning, artificial intelligence (AI), data collection, design, implementation, and regulation. It is essential for users to be aware of this topic because it shows how these systems can engage in or exacerbate harmful practices (e.g., inequality, discrimination). Throughout our session, we explored current pitfalls of AI and how to develop more trustworthy AI products.

Because many populations around the world, especially minorities, have experienced some form of algorithmic injustice, we asked participants to share an example from their own experience, or one they had heard about from others in real life or on the internet. Responses included Amazon's automated resume selection algorithm favoring men's names, facial recognition discriminating against people of color (particularly women of color), and a device’s voice assistant not understanding the user's accent. Most of the examples aligned with two of the most common types of bias in AI: selection bias and implicit bias. This points to an urgent need for more dependable AI, but how do we accomplish that?

By diving deeper into this question, we were able to create a list of criteria to consider when developing these products. Suggestions from the audience included consent and data danger, distributional consequences, the impact of the data and how it is or will be used, and colorism. There is growing concern that nothing verifies whether people have complete control over their own data, whether they are providing information or retracting it. In terms of “data danger”, participants felt it was essential for those developing trustworthy AI to ensure that the data they collect is safe. One example given was the risk that members of the LGBTQ+ community face in disclosing data about themselves compared to cisgender individuals.

Another suggestion was to consider distributional consequences. Several publishing sites such as Pocket use social metrics to develop algorithms. This can create bias because the publishers with the most “reach” end up being highlighted, which leads more people to read their articles than anyone else's. To combat this, it was suggested that data scientists collaborate with website owners to create algorithms that seek out publishers covering topics similar to a user's “favorited” articles, which would help increase the visibility of lesser-known publishers.
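To make the suggestion concrete, here is a minimal sketch of what such a ranking could look like. It is an illustration only, not how Pocket or any other site works: the `Article` fields, the `publisher_reach` metric, the use of Jaccard similarity over topic tags, and the logarithmic damping of reach are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    title: str
    publisher: str
    topics: frozenset      # hypothetical topic tags
    publisher_reach: int   # hypothetical social metric, e.g. follower count


def jaccard(a, b):
    """Share of topics two tag sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(favorited, candidates, top_k=3):
    """Rank candidates by topic similarity to the favorited articles,
    damped by publisher reach so large outlets do not win automatically."""
    liked_topics = frozenset().union(*(a.topics for a in favorited))

    def score(article):
        similarity = jaccard(liked_topics, article.topics)
        # Assumed damping: reach 0 divides by 1, reach ~1,000,000 by ~6.
        return similarity / math.log10(10 + article.publisher_reach)

    unseen = [c for c in candidates if c not in favorited]
    return sorted(unseen, key=score, reverse=True)[:top_k]


favorited = [Article("Bias in hiring tools", "Tech Daily",
                     frozenset({"ai", "ethics"}), 800_000)]
candidates = [
    Article("AI ethics roundup", "Big Outlet",
            frozenset({"ai", "ethics"}), 1_000_000),
    Article("Consent and your data", "Small Blog",
            frozenset({"ai", "ethics", "privacy"}), 500),
    Article("Weeknight dinners", "Cooking Corner",
            frozenset({"recipes"}), 100),
]
ranked = recommend(favorited, candidates, top_k=2)
```

With this scoring, a smaller publisher writing on closely related topics can rank above a much larger one, while unrelated articles still score zero. How strongly to damp reach is a design choice that would need to be agreed with the site owners.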

More often than not, minorities have been “behind the scenes” contributors in developing these kinds of algorithms, but because of their lack of visibility, products that should reflect our global population end up portraying and representing people unequally. This has also led to issues with colorism in AI. One prominent example is that facial recognition technologies tend to be less accurate for darker skin tones, or fail to recognize them at all, an issue that those with lighter skin tones do not regularly experience. Although AI has been a major convenience in many people’s lives, the issue remains that not everyone reaps the same benefits, and some reap none at all.

Since targeting minorities could come off as predatory, there is a lot of discussion around methods that could be used to gather personal data to aid in the development of more globally reliable AI; however, one major issue is compensation. Should those who contribute their data, whether voluntarily or involuntarily, be compensated? If not, AI training could be left to methods like “data trawling”, or large-volume data mining, which seek out relationships between data points and could lead to improper generalizations about certain groups. The participants were concerned not only about model training, but also about how the data would be used if more individuals contributed voluntarily. One participant in particular said they would be more likely to share their data if the intended use was stated explicitly.

There has also been some apprehension toward social media. One participant noted that sites like Facebook regularly collect large volumes of personal data about their users, whether the users are aware of it or not. One explanation is that this leads to personalized ads that give the user a better web experience; however, the practice has raised more than a few eyebrows over the years. As technology has advanced to determine our likes, dislikes, and even general personalities, the need for privacy protection rights is higher than ever. Many believe that the reasons for collecting data, as well as the data actually collected, should be stated upfront, and that the data should be safeguarded without any third-party interests involved. The participants all agreed that adhering to guidelines like these would lead to a world of trustworthy AI.

It is clear that there is still a lot of work to be done in creating trustworthy AI products that are unbiased, transparent, and secure. Issues such as data rights, discrimination, sexism, and compensation need to be at the forefront of the development of these products. Otherwise, AI will continue to benefit the individuals who program it and those it is programmed for, which could lead to further division of our global society.

About the Authors:

Kiandra Smith

Kiandra Smith is a health equity champion passionate about bridging the gap in female-related disparities in underserved and underrepresented communities. As an undergraduate and now as a graduate student at Morehouse School of Medicine, Kiandra has been involved in several STEAM-related initiatives to educate and mentor students, from elementary school to the undergraduate level, about various STEAM-related careers. As a candidate for both a PhD and a master’s in clinical research, Ms. Smith has focused her research on understanding the female-specific effects seen in circadian-related metabolic disorders on a high-fat diet.

Shai Waldrip

Shai Waldrip is a Biomedical Sciences Ph.D. candidate at Morehouse School of Medicine passionate about serving her community. Her research primarily focuses on the validation of potential targets for the early detection and personalized therapy of BRCA1-associated triple-negative breast cancer using bioinformatics. She is currently involved in several clubs, most notably as the President of the Atlanta University Center Data Science Club and the Vice President of the Bioethics club. She is extremely passionate about spreading awareness of data science and has developed multiple programs to give students mentoring and internship opportunities involving bioinformatics and biomedical data science.

Eva Andrews

Eva-Jeneé Andrews is currently a second-year graduate student at Morehouse School of Medicine. Recently, she was awarded a Predoctoral Fellowship at Harvard Medical School. Eva-Jeneé’s research consists of sleep studies in mice to determine the behavioral and molecular correlates of resilience. Eva-Jeneé is also active with her church and often volunteers to give back to families in need. She has an adorable puppy named Hachi, who often goes on adventures with her. Eva-Jeneé’s long-term goals are to maintain a collaborative spirit throughout her academic experience and encourage the youth to pursue careers in science.