wildjailbreak-linear-head-narrowing-L4-lr1e-06
A simple MLP classification head for 4-class classification over LLaMA-2 last-token vectors.
- Architecture: narrowing, layers=4
- Input dim: 5120
- Output classes: 4
- LR: 1e-06
- Metrics (test): F1(macro)=0.969547, Acc=0.963565
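For readers comparing the two reported metrics, here is a minimal pure-Python sketch of how accuracy and macro-averaged F1 are conventionally computed. It is not the evaluation script used for this model; the function names and the toy labels are illustrative assumptions only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy labels (hypothetical, not drawn from the dataset).
y_true = [0, 0, 1, 1, 2, 3]
y_pred = [0, 0, 1, 2, 2, 3]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, num_classes=4)
```

Because macro F1 weights every class equally regardless of its size, it can be higher or lower than accuracy when the classes are imbalanced.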
Usage
See the example code in this repo card, or the snippet provided in the notebook, to load the model and run inference.
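As a rough illustration of what a head with these settings might look like, the PyTorch sketch below builds a 4-layer narrowing MLP from 5120 inputs to 4 classes and runs it on random vectors. The halving schedule, the ReLU activations, and the names `narrowing_widths` and `NarrowingHead` are assumptions made for illustration; the card does not specify the actual layer widths or the checkpoint format, so loading the trained weights is not shown.

```python
import torch
import torch.nn as nn


def narrowing_widths(input_dim: int, num_layers: int) -> list:
    # Assumption: each successive layer halves the previous width.
    return [input_dim // (2 ** i) for i in range(num_layers)]


class NarrowingHead(nn.Module):
    """Hypothetical narrowing MLP head; not the author's exact definition."""

    def __init__(self, input_dim: int = 5120, num_classes: int = 4, num_layers: int = 4):
        super().__init__()
        widths = narrowing_widths(input_dim, num_layers)
        layers = []
        # Hidden layers: each Linear shrinks the width, followed by ReLU.
        for d_in, d_out in zip(widths[:-1], widths[1:]):
            layers += [nn.Linear(d_in, d_out), nn.ReLU()]
        # Final Linear maps the narrowest width to class logits.
        layers.append(nn.Linear(widths[-1], num_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


head = NarrowingHead().eval()
# Random stand-in for a batch of two last-token vectors.
vectors = torch.randn(2, 5120)
with torch.no_grad():
    logits = head(vectors)
predicted = logits.argmax(dim=-1)  # one class index per input vector
```

In real use, `vectors` would be replaced by last-token hidden states extracted from the LLaMA-2 model, and the head's weights would be loaded from this repository's checkpoint.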
Dataset used to train CatBarks/wildjailbreak-linear-head-narrowing-L4-lr1e-06
Evaluation results
- f1_macro on wildjailbreak (LLaMA-2 last-token vectors), self-reported: 0.000
- accuracy on wildjailbreak (LLaMA-2 last-token vectors), self-reported: 0.000