Sama unveils AI safety-centered ‘red teaming solution’ for gen AI and LLMs
Sama, the Data annotation and model validation solutions provider has announced Sama Red Team, a new solution designed to help developers proactively improve a model’s safety and reliability.
The solution is one of the first specifically designed for generative AI and large language models.
Leveraging the expertise of machine learning (ML) engineers, applied scientists and human-AI interaction designers, Sama Red Team helps evaluate a model’s fairness and safeguards, checking compliance with laws and safely exposing and rectifying issues across text, image, voice search and other modalities.
“Generative AI models may sound trustworthy, but that doesn’t mean there aren’t ways to work around their safeguards for public safety, privacy protection and compliance with laws. Sama Red Team tests for those exploits before a model’s vulnerability is exposed to the greater public and provides developers with the actionable insights they need to patch those holes,” said Duncan Curtis, SVP of AI product and technology at Sama.
“Although ensuring that a model is as secure as possible is important to performance, our teams’ testing is also crucial for the development of more responsible AI models,” he added.
Sama Red Team tests models on four key competencies: fairness, privacy, public safety and compliance.
In fairness testing, teams simulate real-world scenarios that may compromise a model’s built-in safeguards and result in offensive or discriminatory content.
For privacy testing, Sama experts craft prompts designed to make a model reveal sensitive data, such as Personal Identifiable Information (PII), passwords or proprietary information about the model itself.
In public safety testing, teams act as adversaries and mimic real-world threats to safety, including cyberattacks, security breaches or mass-casualty events.
For compliance testing, Sama Red Team simulates scenarios to test a model’s ability to detect and prevent unlawful activities such as copyright infringement or unlawful impersonation.
These rigorous tests are conducted after a team consults directly with a client to determine a model’s desired behavior in specific use cases and performs an initial vulnerability assessment. After testing a series of prompts, the team evaluates the model’s output. Based on the results, the team will refine prompts or create new ones to further probe the vulnerability, with the ability to also create large-scale tests for additional data. As needed, Sama’s larger workforce of 4,000+ highly-trained annotators can further elaborate on and scale up these tests. Sama Red Team continues to stay on top of the latest trends and testing techniques to identify the most effective ways to trick models and expose vulnerabilities.
Sama Red Team is the latest of the company’s suite of solutions for Generative AI, foundation, and large language models (LLMs). Sama GenAI provides critical human feedback loops across the model development process, including data creation, supervised fine-tuning, LLM optimization and ongoing model evaluation. The company can both create and review prompts and model responses, scoring and ranking them across a variety of client-defined dimensions, such as factual accuracy, coherence, tone, delivery format and more. If prompts or responses do not meet the criteria, Sama will rewrite them to create additional training data sets that can be used to improve model performance and remove potential biases.
Like all of Sama’s services, including its GenAI solutions, Sama Red Team leverages SamaHub™, a collaborative workspace where clients and team members can directly communicate on collaboration workflows and complete reporting to track their project’s progress. Sama Red Team’s work is backed by SamaAssure™, the industry’s highest quality guarantee, which routinelydelivers a 98% first batch acceptance rate. Projects leverage SamaIQ™, a combination of Human in the Loop assessments and proprietary algorithms, to proactively surface additional insights into a model’s vulnerabilities.
Follow us on Telegram, Twitter, and Facebook, or subscribe to our weekly newsletter to ensure you don’t miss out on any future updates. Send tips to info@techtrendske.co.ke.