Arn-Hai ToS Clause Classifier v1

A multilingual ToS / Privacy Policy clause classifier, trained for the Arn-Hai browser extension prototype.

Task

Input: a single ToS / Privacy Policy clause
Output: one joint label that combines the clause category and its risk level, in the format:

Category__RiskLevel

Examples:

  • Third Party Sharing__MEDIUM
  • Arbitration__HIGH
  • User Rights__LOW
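
Downstream code usually needs the category and the risk level separately. The following is a minimal sketch of splitting the joint label, assuming the risk levels are exactly the three that appear in the examples above (LOW, MEDIUM, HIGH). The names `ClauseLabel` and `parse_joint_label` are hypothetical and not part of the model's API.

```python
from typing import NamedTuple

# Assumption: only the three risk levels shown in the examples exist.
RISK_LEVELS = {"LOW", "MEDIUM", "HIGH"}


class ClauseLabel(NamedTuple):
    category: str
    risk_level: str


def parse_joint_label(label: str) -> ClauseLabel:
    """Split a 'Category__RiskLevel' string into its two parts."""
    # rpartition splits on the LAST "__", so the risk level is always the tail.
    category, sep, risk = label.strip().rpartition("__")
    if not sep or not category:
        raise ValueError(f"not a joint label: {label!r}")
    risk = risk.upper()
    if risk not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {risk!r}")
    return ClauseLabel(category, risk)


if __name__ == "__main__":
    print(parse_joint_label("Third Party Sharing__MEDIUM"))
```

Rejecting malformed labels early, rather than returning a partial result, keeps a bad prediction from being displayed as a valid risk level.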

Intended Use

  • Clause-level risk screening
  • Browser extension privacy assistant
  • AI-assisted ToS / Privacy Policy analysis

Not Intended For

  • Final legal advice
  • Court/legal decisions
  • Fully automated compliance decisions

Training Data

v1 was trained primarily on English ToS / privacy datasets:

  • LexGLUE / UNFAIR-ToS
  • ToS;DR
  • OPP-115

Thai support is experimental. Improving it requires labeled Thai training clauses, for example from Thai PDPA texts and Thai privacy policies.

Metrics

Test set:

  • Accuracy: 0.869
  • Macro F1: 0.705
  • Weighted F1: 0.870
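
Macro F1 averages the per-class F1 scores equally, while weighted F1 weights each class by its number of true examples, so a macro score well below the weighted score usually points to weaker performance on rare classes. The sketch below illustrates the two averages in plain Python. The toy labels are invented for illustration and are not drawn from the model's test set, and the function names are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts the same."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true-example count."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[lab] * n for lab, n in support.items()) / len(y_true)


if __name__ == "__main__":
    # Invented toy data: one frequent class, one rare class.
    y_true = ["User Rights__LOW"] * 8 + ["Unilateral Termination__HIGH"] * 2
    y_pred = ["User Rights__LOW"] * 9 + ["Unilateral Termination__HIGH"]
    print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")
    print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

In the toy data, one missed example of the rare class pulls the macro average down more than the weighted average.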

Limitations

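  • Thai performance is still weak because v1 has little or no labeled Thai training data.
  • Predictions for rare classes such as Unilateral Termination may be unreliable; the gap between macro F1 (0.705) and weighted F1 (0.870) is consistent with weaker performance on infrequent classes.
  • Pair the model with a rule-based fallback and human review rather than relying on its predictions alone.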
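
One way to combine the model with a rule-based fallback and human review is sketched below. Everything in it is an assumption made for illustration: the keyword rules, the risk level given to Unilateral Termination, the 0.80 confidence threshold, and the names `route` and `Decision`. None of these come from the Arn-Hai extension itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword rules; real rules would need legal review.
FALLBACK_RULES = [
    (re.compile(r"\barbitrat", re.IGNORECASE), "Arbitration__HIGH"),
    (
        re.compile(r"\bterminate\b.*\b(any time|without notice)\b", re.IGNORECASE),
        "Unilateral Termination__HIGH",  # risk level assumed for illustration
    ),
]

CONFIDENCE_THRESHOLD = 0.80  # assumed value; tune on validation data


@dataclass
class Decision:
    label: Optional[str]
    source: str  # "model", "rule", or "none"
    needs_review: bool


def route(clause: str, model_label: str, model_score: float) -> Decision:
    """Choose between the model's label and a rule, and flag for review."""
    rule_label = next(
        (label for pattern, label in FALLBACK_RULES if pattern.search(clause)),
        None,
    )
    if model_score >= CONFIDENCE_THRESHOLD:
        # Trust a confident model, but ask for review if a rule disagrees.
        disagrees = rule_label is not None and rule_label != model_label
        return Decision(model_label, "model", disagrees)
    if rule_label is not None:
        # Low model confidence: fall back to the rule, always reviewed.
        return Decision(rule_label, "rule", True)
    # Neither source is reliable: leave unlabeled for a human.
    return Decision(None, "none", True)


if __name__ == "__main__":
    print(route("We may terminate your account at any time.", "User Rights__LOW", 0.41))
```

Here `model_label` and `model_score` stand for whatever top prediction and probability the classifier returns. The sketch never lets a low-confidence prediction through without a human check.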

Model Details

  • Format: Safetensors
  • Model size: 0.3B params
  • Tensor type: F32