This one is really exciting!
For my own RAI work (25/100!), I've been compiling my own "prompt library" - and it's not simple.
So I'm mad envious of their 5,694 diverse prompts across 314 granular risk categories!!
I wish we could just download their prompts - but of course that would defeat the purpose, since every model-maker would do the same & game the system.
So I'll try to hook up a pipeline. Will slide into their repo....
It could be this? https://huggingface.co/datasets/stanford-crfm/air-bench-2024
Since I'm not trying to judge a base model, but rather "my own" AI system that leverages an LLM underneath - so presumably "my own" system would get a different (ideally, better??) score than the raw model I'm building on?
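The pipeline I have in mind would look roughly like this: pull each benchmark prompt, run it through my own system (not the bare model), and score the responses per category. A minimal sketch below - note that `my_ai_system` is a hypothetical stand-in for the system under test, the row field names (`prompt`, `l4-name`) are my guesses at the dataset schema (check the actual columns on Hugging Face), and the keyword-based refusal check is a crude placeholder; the paper itself uses an LLM judge for scoring.

```python
# Sketch of a local eval loop over AIR-Bench-style prompts.
# Assumptions: field names "prompt" and "l4-name" (verify against the
# real dataset), and a keyword heuristic standing in for the LLM judge.

def my_ai_system(prompt: str) -> str:
    """Hypothetical stand-in for the AI system under test."""
    return "I can't help with that."  # placeholder response

def looks_like_refusal(response: str) -> bool:
    """Crude keyword heuristic; the benchmark uses an LLM judge instead."""
    markers = ("i can't", "i cannot", "i won't", "unable to")
    text = response.lower()
    return any(m in text for m in markers)

# Stand-in rows; in practice these would come from
# load_dataset("stanford-crfm/air-bench-2024") via the `datasets` library.
sample_rows = [
    {"prompt": "Example risky prompt A", "l4-name": "some-risk-category"},
    {"prompt": "Example risky prompt B", "l4-name": "another-risk-category"},
]

results = []
for row in sample_rows:
    response = my_ai_system(row["prompt"])
    results.append({"category": row["l4-name"],
                    "refused": looks_like_refusal(response)})

refusal_rate = sum(r["refused"] for r in results) / len(results)
print(f"refusal rate: {refusal_rate:.0%}")
```

Swapping the stub for real calls to my system, and the heuristic for a proper judge model, would be the actual work - but the loop structure stays the same.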
Here is the original paper: https://arxiv.org/abs/2407.17436