Basic test
We can run a super simple self-contained test to check that garak's core code is running OK.
This command line will start garak (the bit at the front means "run python and load the module garak"), specify the model type to be "test", an internal testing generator, and then run the probe test.Blank.
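As a rough sketch of how that command line is put together, the snippet below assembles it piece by piece. The flag names --model_type and --probes are assumptions based on garak's command line interface and are not shown in this section; build_garak_command is a hypothetical helper written only for this illustration.

```python
import shlex

def build_garak_command(model_type="test", probes="test.Blank"):
    """Assemble the argument list for a basic garak run (hypothetical helper)."""
    return [
        "python", "-m", "garak",      # run python and load the module garak
        "--model_type", model_type,   # assumed flag name: selects the generator family
        "--probes", probes,           # assumed flag name: selects the probe to run
    ]

command = build_garak_command()
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

The snippet only prints the command; to actually launch it, paste the printed line into a terminal where garak is installed.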
We can see that garak ran OK. It loaded a generator called Blank, which is a test generator that always returns a blank string, "". test.Blank is the automatically-used default generator whenever the test model is used. Then, garak queued up and ran just one probe, test.Blank, which sends blank strings to the generator. So, test.Blank sent empty strings to a generator that always returns empty strings. These outputs were evaluated using the always.Pass detector, which (as you can guess from its name) returned a Pass regardless of the output it was assessing. The final score was 10/10, a pass.
The score is 10/10 and not 1/1 because, by default, garak collects ten outputs per prompt. Because most LLM systems behave differently each time they're queried, we need to get an idea of model tendencies instead of just individual binary assessments. So garak collects multiple outputs for each prompt, ten by default. You can change this using the --generations command line parameter, e.g. --generations 4.
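To see where a score like 10/10 comes from, here is a toy model of the run described above. It is not garak's code: blank_generator, always_pass and run_probe are hypothetical stand-ins for the Blank generator, the always.Pass detector and the probe loop, and the sketch assumes the score is simply passes over total outputs.

```python
def blank_generator(prompt, generations=10):
    """Stand-in for the Blank test generator: returns an empty string each time."""
    return ["" for _ in range(generations)]

def always_pass(output):
    """Stand-in for the always.Pass detector: passes regardless of the output."""
    return True

def run_probe(prompts, generations=10):
    """Send each prompt to the generator and count how many outputs pass."""
    passed = 0
    total = 0
    for prompt in prompts:
        for output in blank_generator(prompt, generations):
            total += 1
            if always_pass(output):
                passed += 1
    return passed, total

# One blank prompt, ten outputs by default.
passed, total = run_probe([""])
print(f"{passed}/{total}")

# The same prompt with four outputs, as with --generations 4.
passed, total = run_probe([""], generations=4)
print(f"{passed}/{total}")
```

With one prompt, the denominator is just the number of generations, which is why the default run reports ten results rather than one.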
Looks like we passed the test! That's good.