🏞️Red teaming in the wild

LLMs, like other models, tend to regress to the mean and produce somewhat bland output. As a result, the range of tactics that automatic red teaming discovers is unlikely to be broad. Don't rely on garak's red-team probes alone for a wide-ranging evaluation of a model; bring in human red teamers.