Arn-Hai ToS Clause Classifier v1
This is a multilingual ToS / Privacy Policy clause classifier trained for the Arn-Hai browser extension prototype.
Task
Input: one ToS / Privacy Policy clause
Output: joint label in the format:
Category__RiskLevel
Example:
Third Party Sharing__MEDIUM
Arbitration__HIGH
User Rights__LOW
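A joint label in this format can be split back into its category and risk level on the last double underscore. The sketch below is illustrative only: `parse_label`, `ClauseLabel`, and the `RISK_LEVELS` set are hypothetical names, and the set of risk levels is assumed from the three examples above rather than taken from the model's actual label list.

```python
from typing import NamedTuple

# Assumption: only the three risk levels seen in the examples exist.
RISK_LEVELS = {"LOW", "MEDIUM", "HIGH"}


class ClauseLabel(NamedTuple):
    category: str
    risk_level: str


def parse_label(label: str) -> ClauseLabel:
    """Split 'Category__RiskLevel' into its two parts."""
    # rpartition splits on the LAST '__', so a category name that
    # itself contains underscores would still parse.
    category, sep, risk = label.rpartition("__")
    if not sep or not category or risk not in RISK_LEVELS:
        raise ValueError(f"unexpected label format: {label!r}")
    return ClauseLabel(category, risk)


if __name__ == "__main__":
    for raw in ("Third Party Sharing__MEDIUM", "Arbitration__HIGH", "User Rights__LOW"):
        print(parse_label(raw))
```

Rejecting unknown risk levels, instead of passing them through, makes a change in the label set visible early in downstream code.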
Intended Use
- Clause-level risk screening
- Browser extension privacy assistant
- AI-assisted ToS / Privacy Policy analysis
Not Intended For
- Final legal advice
- Court/legal decisions
- Fully automated compliance decisions
Training Data
v1 was trained primarily on English ToS / privacy datasets:
- LexGLUE / UNFAIR-ToS
- ToS;DR
- OPP-115
Thai support is experimental and should be improved with Thai PDPA / Thai privacy policy clauses.
Metrics
Test:
- Accuracy: 0.869
- Macro F1: 0.705
- Weighted F1: 0.870
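The gap between macro F1 and weighted F1 is the usual sign that some low-support classes score much worse than the frequent ones: macro F1 averages per-class F1 equally, while weighted F1 averages by class frequency. The sketch below shows the two averages in plain Python on made-up toy labels. The class names `common` and `rare` and the toy data are assumptions for illustration and are not the model's test set or results.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {class: F1} for every class seen in y_true or y_pred."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    # Every class counts equally, however rare it is.
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    # Each class counts in proportion to its number of true examples.
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


if __name__ == "__main__":
    # Toy data: 8 frequent-class clauses, 2 rare-class clauses,
    # with one rare clause misclassified as the frequent class.
    y_true = ["common"] * 8 + ["rare"] * 2
    y_pred = ["common"] * 8 + ["common", "rare"]
    print("macro F1:   ", round(macro_f1(y_true, y_pred), 3))
    print("weighted F1:", round(weighted_f1(y_true, y_pred), 3))
```

On the toy data a single error on the rare class pulls macro F1 well below weighted F1, which is the same pattern as in the reported scores.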
Limitations
- Thai performance is still weak because v1 was trained on little or no labeled Thai data.
- Rare classes such as Unilateral Termination may be unreliable.
- Use with rule-based fallback and human review.
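One way to combine the classifier with a rule-based fallback and human review is sketched below. Everything in it is an assumption for illustration: the `screen` function, the `Screened` record, the confidence threshold of 0.6, the keyword patterns, and the risk levels attached to the rules are hypothetical and are not part of the released model or the extension.

```python
import re
from dataclasses import dataclass

# Hypothetical settings; tune on your own validation data.
CONFIDENCE_THRESHOLD = 0.6
RARE_CATEGORIES = {"Unilateral Termination"}

# Hypothetical keyword rules mapping a pattern to a joint label.
FALLBACK_RULES = [
    (re.compile(r"\barbitration\b", re.IGNORECASE), "Arbitration__HIGH"),
    (
        re.compile(r"\bterminate\b.*\b(at any time|without notice)\b", re.IGNORECASE),
        "Unilateral Termination__HIGH",
    ),
]


@dataclass
class Screened:
    label: str
    source: str        # "model" or "rule"
    needs_review: bool


def screen(clause: str, model_label: str, confidence: float) -> Screened:
    """Accept confident model output; otherwise try rules and flag for review."""
    category = model_label.rpartition("__")[0]
    uncertain = confidence < CONFIDENCE_THRESHOLD or category in RARE_CATEGORIES
    if not uncertain:
        return Screened(model_label, "model", False)
    for pattern, rule_label in FALLBACK_RULES:
        if pattern.search(clause):
            return Screened(rule_label, "rule", True)
    # No rule matched: keep the model label but send it to a human.
    return Screened(model_label, "model", True)


if __name__ == "__main__":
    print(screen("Disputes are resolved by binding arbitration.", "User Rights__LOW", 0.3))
```

In this sketch every uncertain prediction is flagged for human review, whether or not a rule fires, so the rules only change the suggested label and never bypass the reviewer.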