Merge pull request #266 from The-Art-of-Hacking/ai-red-team-tools

Update ai_security_tools.md
Commit f357283145 by Omar Santos, 2025-02-11 01:47:56 +01:00 (committed via GitHub)


@@ -2,35 +2,35 @@
 This is a work-in-progress, curated list of AI security tools:
-## Model Testing
-_Products that examine or test models for security issues of various kinds._
-* [HiddenLayer Model Scanner](https://hiddenlayer.com/model-scanner/) - Scans models for vulnerabilities and supply chain issues.
-* [Plexiglass](https://github.com/kortex-labs/plexiglass) - A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
-* [PurpleLlama](https://github.com/facebookresearch/PurpleLlama) - A set of tools from Meta to assess and improve LLM security.
-* [Garak](https://garak.ai/) - An LLM vulnerability scanner. [code](https://github.com/leondz/garak/)
-* [CalypsoAI Platform](https://calypsoai.com/platform/) - A platform for testing and launching LLM applications securely.
-* [Lakera Red](https://www.lakera.ai/ai-red-teaming) - Automated safety and security assessments for GenAI applications.
-* [jailbreak-evaluation](https://github.com/controllability/jailbreak-evaluation) - A Python package for language model jailbreak evaluation.
-* [Patronus AI](https://www.patronus.ai) - Automated testing of models to detect PII, copyrighted materials, and sensitive information.
-* [Adversa Red Teaming](https://adversa.ai/ai-red-teaming-llm/) - Continuous AI red teaming for LLMs.
-* [Advai](https://www.advai.co.uk) - Automates stress-testing, red teaming, and evaluation of AI systems for critical failures.
-* [Mindgard AI](https://mindgard.ai) - Identifies and remediates risks across AI models, GenAI, and LLMs, along with AI-powered apps and chatbots.
-* [Protect AI ModelScan](https://protectai.com/modelscan) - Scans models for serialization attacks. [code](https://github.com/protectai/modelscan)
-* [Protect AI Guardian](https://protectai.com/guardian) - Scans models for security issues or policy violations, with auditing and reporting.
-* [TextFooler](https://github.com/jind11/TextFooler) - A model for natural language attacks on text classification and inference.
-* [LLMFuzzer](https://github.com/mnns/LLMFuzzer) - A fuzzing framework for LLMs.
-* [Prompt Security Fuzzer](https://www.prompt.security/fuzzer) - A fuzzer for finding prompt injection vulnerabilities.
-* [OpenAttack](https://github.com/thunlp/OpenAttack) - A Python-based textual adversarial attack toolkit.
+## Open Source Tools for AI Red Teaming
+
+### Predictive AI
+- [The Adversarial Robustness Toolbox (ART)](https://github.com/Trusted-AI/adversarial-robustness-toolbox)
+- [Armory](https://github.com/twosixlabs/armory)
+- [Foolbox](https://github.com/bethgelab/foolbox)
+- [DeepSec](https://github.com/ryderling/DEEPSEC)
+- [TextAttack](https://github.com/QData/TextAttack)
+
+### Generative AI
+- [PyRIT](https://github.com/Azure/PyRIT)
+- [Garak](https://github.com/NVIDIA/garak)
+- [Prompt Fuzzer](https://github.com/prompt-security/ps-fuzz)
+- [Guardrails AI](https://github.com/guardrails-ai/guardrails)
+- [Promptfoo](https://github.com/promptfoo/promptfoo)
+- [Plexiglass](https://github.com/safellama/plexiglass)
+- [PurpleLlama](https://github.com/facebookresearch/PurpleLlama)
+- [jailbreak-evaluation](https://github.com/controllability/jailbreak-evaluation)
 
 ## Prompt Firewall and Redaction
 _Products that intercept prompts and responses and apply security or privacy rules to them. We've blended two categories here because some prompt firewalls just redact private data (and then re-identify it in the response), while others focus on identifying and blocking attacks such as prompt injection, or on stopping data leaks. Many of the products in this category do all of the above, which is why they've been combined._
+- [Cisco AI Defense](https://www.cisco.com/site/us/en/products/security/ai-defense/index.html) - Model evaluation, monitoring, guardrails, inventory, AI asset discovery, and more.
+- [Robust Intelligence AI Firewall](https://www.robustintelligence.com/) - Now part of Cisco.
 - [Protect AI Rebuff](https://playground.rebuff.ai) - An LLM prompt injection detector. [![code](https://img.shields.io/github/license/protectai/rebuff)](https://github.com/protectai/rebuff/)
 - [Protect AI LLM Guard](https://protectai.com/llm-guard) - A suite of tools to protect LLM applications by helping you detect, redact, and sanitize LLM prompts and responses. [![code](https://img.shields.io/github/license/protectai/llm-guard)](https://github.com/protectai/llm-guard/)
 - [HiddenLayer AI Detection and Response](https://hiddenlayer.com/aidr/) - Proactively defends against threats to your LLMs.
-- [Robust Intelligence AI Firewall](https://www.robustintelligence.com/platform/ai-firewall-guardrails) - Real-time protection, automatically configured to address the vulnerabilities of each model.
 - [Vigil LLM](https://github.com/deadbits/vigil-llm) - Detects prompt injections, jailbreaks, and other potentially risky LLM inputs. ![code](https://img.shields.io/github/license/deadbits/vigil-llm)
 - [Lakera Guard](https://www.lakera.ai/lakera-guard) - Protection from prompt injections, data loss, and toxic content.
 - [Arthur Shield](https://www.arthur.ai/product/shield) - Built-in, real-time firewall protection against the biggest LLM risks.
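
To make the Predictive AI list above concrete, here is a minimal sketch of adversarial evasion testing with Foolbox (one of the listed tools), assuming torchvision's pretrained ResNet-18 as a stand-in target; the attack choice and epsilon values are illustrative, not prescribed by the list.

```python
# Evasion testing with Foolbox: attack a pretrained image classifier with
# L-infinity PGD at several perturbation budgets (illustrative values).
import torchvision
import foolbox as fb

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
# Foolbox applies normalization itself, so the wrapped model takes [0, 1] inputs.
preprocessing = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], axis=-3)
fmodel = fb.PyTorchModel(model, bounds=(0, 1), preprocessing=preprocessing)

# A small batch of bundled ImageNet samples to attack.
images, labels = fb.utils.samples(fmodel, dataset="imagenet", batchsize=8)
print("clean accuracy:", fb.accuracy(fmodel, images, labels))

attack = fb.attacks.LinfPGD()
raw, clipped, is_adv = attack(fmodel, images, labels, epsilons=[0.001, 0.01, 0.03])
print("attack success rate per epsilon:", is_adv.float().mean(dim=-1))
```

ART, Armory, and TextAttack follow the same broad pattern: wrap the model, pick an attack, and measure how accuracy degrades under a perturbation budget.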
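On the Generative AI side, garak (listed above) is a command-line scanner. The sketch below simply drives it from Python against a small Hugging Face model; the model and probe names are illustrative, and `--list_probes` enumerates what your installed version actually ships.

```python
# Run the garak LLM vulnerability scanner as a subprocess (garak is CLI-first).
# Model and probe choices are illustrative; see `python -m garak --list_probes`.
import subprocess

subprocess.run(
    [
        "python", "-m", "garak",
        "--model_type", "huggingface",  # generator family to load
        "--model_name", "gpt2",         # small model, cheap smoke test
        "--probes", "encoding",         # encoding-based injection probes
    ],
    check=True,  # raise if the scan exits with an error
)
```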
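The redact-then-re-identify flow described under Prompt Firewall and Redaction maps directly onto LLM Guard from the list above. A minimal sketch, assuming the `llm-guard` package; `call_llm` is a hypothetical placeholder for whatever model endpoint you use:

```python
# Prompt firewall sketch with LLM Guard: redact PII and screen for injection on
# the way in, then restore the redacted values in the response on the way out.
from llm_guard import scan_output, scan_prompt
from llm_guard.input_scanners import Anonymize, PromptInjection
from llm_guard.output_scanners import Deanonymize
from llm_guard.vault import Vault

def call_llm(prompt: str) -> str:
    return "Drafted the email."  # hypothetical stand-in for a real model call

vault = Vault()  # holds redacted values so they can be restored later
prompt = "Email John Doe at john.doe@example.com about the overdue invoice."

# Inbound: anonymize PII and block prompt injection before the prompt leaves you.
sanitized_prompt, valid, scores = scan_prompt([Anonymize(vault), PromptInjection()], prompt)
if not all(valid.values()):
    raise ValueError(f"prompt blocked: {scores}")

# Outbound: re-identify anything that was redacted from the prompt.
response, valid, scores = scan_output([Deanonymize(vault)], sanitized_prompt, call_llm(sanitized_prompt))
print(response)
```

This is the pattern the section describes: every product in the list sits between the application and the model, applying rules in both directions.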