Understanding detectors

It's not easy to determine when an LLM has gone wrong. While a failure can sometimes be obvious to a human reader, garak's probes often generate tens of thousands of outputs, so garak needs a way to detect language model failures automatically. The detectors in garak serve this purpose. Some look for keywords in the output; others use machine learning classifiers to judge it.
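To illustrate the simpler of the two approaches, here is a minimal, hypothetical sketch of keyword-style detection, assuming a made-up trigger list and a standalone `detect` function; it is not garak's actual detector API, only the underlying idea of scanning each output for trigger strings and scoring hits.

```python
# Conceptual sketch (not garak's implementation): score each model output
# 1.0 if it contains a trigger string, 0.0 otherwise.

TRIGGER_STRINGS = ["BEGIN SECRET", "rm -rf /"]  # hypothetical trigger list


def detect(outputs):
    """Return one score per output: 1.0 = failure detected, 0.0 = pass."""
    scores = []
    for text in outputs:
        hit = any(trigger.lower() in text.lower() for trigger in TRIGGER_STRINGS)
        scores.append(1.0 if hit else 0.0)
    return scores


print(detect(["Sure! BEGIN SECRET: ...", "I can't help with that."]))
# -> [1.0, 0.0]
```

Classifier-based detectors follow the same pattern of mapping each output to a score, but replace the keyword check with a trained model's judgment.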
