Meta Launches Purple Llama For AI Developers To Test Safety

Meta has launched Purple Llama – a project aimed at building open source tools to help developers assess and improve trust and safety in their generative AI models before deployment.

The project was announced by the platform’s president of global affairs (and former UK deputy prime minister) Nick Clegg on Thursday.

“Collaboration on safety will build trust in the developers driving this new wave of innovation, and requires additional research and contributions on responsible AI,” Meta explained. “The people building AI systems can’t address the challenges of AI in a vacuum, which is why we want to level the playing field and create a center of mass for open trust and safety.”

Under Purple Llama, Meta is collaborating with other AI application developers – including cloud platforms like AWS and Google Cloud, chip designers like Intel, AMD and Nvidia, and software businesses like Microsoft – to release tools to test models’ capabilities and check for safety risks. The software licensed under the Purple Llama project supports research and commercial applications.

The first package unveiled includes tools to test cyber security issues in software-generating models, and a language model that classifies text that is inappropriate or discusses violent, or illegal activities. The package, dubbed CyberSec Eval, allows developers to run benchmark tests that check how likely an AI model is to generate insecure code or assist users in carrying out cyber attacks.

They could, for example, try to instruct their models to create malware and see how often it complies with the request, and then block these requests. Or they could ask their models to execute what seems like a benign task, see if it generates insecure code, and try to figure out how the model has gone awry.

Initial tests showed that on average, large language models suggested vulnerable code 30 percent of the time, researchers at Meta revealed in a paper [PDF] detailing the system. These cyber security benchmark assessments can be run repeatedly, to check if adjustments to the model are actually making them more secure.

Meanwhile, Llama Guard is a large language model trained to classify text. It looks out for language that is sexually explicit, offensive, harmful or discusses unlawful activities.

Developers can test whether their own models accept or generate unsafe text by running input prompts and output responses generated by Llama Guard. They could then filter out specific items that might incite the model to produce inappropriate content.

Meta positioned Purple Llama as a two-pronged approach to security and safety, looking at both the inputs and the outputs of AI. “We believe that to truly mitigate the challenges that generative AI presents we need to take both attack (red team) and defensive (blue team) postures. Purple teaming, composed of both red and blue team responsibilities, is a collaborative approach to evaluating and mitigating potential risks.” ®

SEO Powered Content & PR Distribution. Get Amplified Today.
PlatoData.Network Vertical Generative Ai. Empower Yourself. Access Here.
PlatoAiStream. Web3 Intelligence. Knowledge Amplified. Access Here.
PlatoESG. Carbon, CleanTech, Energy, Environment, Solar, Waste Management. Access Here.
PlatoHealth. Biotech and Clinical Trials Intelligence. Access Here.
Source: https://go.theregister.com/feed/www.theregister.com/2023/12/07/meta_purple_llama_project/

Generative Data Intelligence

Meta launches Purple Llama for AI developers to test safety

Exploring the 7 Best Avalanche Wallets for Staking and DeFi

Binance Introduces OM Locked Staking with up to 19.9% APR

Latest Intelligence

Quantum News Briefs: May 4, 2024: News From Aquark Technologies • Georgia Tech • University of Turku • and the quantum computing market –...

Quantum News Briefs: May 4, 2024: News From Aquark Technologies • Georgia Tech • University of Turku • and the quantum computing market –...

US SEC Extends Review Period for 7RCC’s Bitcoin ETF Listing

US SEC Extends Review Period for 7RCC’s Bitcoin ETF Listing

Pyth Network, Aavo, Memecoin, and Starknet set for over $2 billion token unlock this month

DOJ charges former Cred execs over $783M fraud and money laundering scheme

Chat with us