Microsoft’s red team has been monitoring AI since 2018, and here are five major insights.

Microsoft's red team has been monitoring AI since 2018, and here are five major insights.

The Power and Perils of Artificial Intelligence

alt text

In the last six months, the positive impacts of artificial intelligence (AI) have been highlighted more than ever, but so have the risks. While AI has given us the ability to complete everyday tasks with ease and create breakthroughs in various industries, it can also produce misinformation, generate harmful content, and pose security and privacy risks. That’s why accurate testing before releasing AI models is crucial, and Microsoft has been leading the way in this regard for the past five years.

To ensure the responsible implementation of AI, Microsoft assembled an AI red team in 2018. Comprised of interdisciplinary experts, this team is dedicated to investigating the risks associated with AI models. Their approach involves thinking like attackers and probing AI systems for potential failures. Now, nearly five years later, Microsoft is sharing its red teaming practices and learnings to set an example for responsible AI implementation.

Microsoft emphasizes the importance of testing AI models at both the base model level and the application level. For instance, in the case of Bing Chat, Microsoft monitored AI performance on the GPT-4 level as well as the actual search experience powered by GPT-4. By red teaming the model, Microsoft aims to identify potential misuse, understand the model’s limitations, and scope its capabilities.

Throughout its experience, Microsoft has gained five key insights about AI red teaming. Firstly, AI red teaming is not limited to testing for security; it encompasses factors such as fairness and the generation of harmful content. It is essential to consider the impact of AI on different areas beyond just security.

Secondly, it is crucial to focus not only on malicious usage but also on how AI could generate harmful content for regular users. Microsoft’s red teaming process for the new Bing included evaluating how the system could generate problematic content when interacting with everyday users.

The evolving nature of AI systems is the third insight gained by Microsoft. As these systems are constantly changing, red teaming needs to be conducted at multiple levels. This leads to the fourth insight: red teaming generative AI systems requires multiple attempts. Since generative AI systems tend to produce different outputs with each interaction, multiple testing attempts are necessary to ensure system failure is not overlooked.

Lastly, Microsoft highlights that mitigating AI failures requires a defense-in-depth approach. Once a problem is identified, it takes a variety of technical mitigations to address the issue effectively. This multi-layered approach helps minimize the risks associated with emerging AI systems.

By applying measures like red teaming and defense-in-depth, Microsoft aims to alleviate concerns about emerging AI technology. These practices not only help identify potential risks but also contribute to the responsible development and deployment of AI systems. As AI continues to shape our world, it is essential to prioritize both the positive impacts and the potential risks, ensuring technology benefits us all.