FAR AI
non-profit
https://far.ai/
FARAIResearch
AlignmentResearch
Followers: 46
AI & ML interests
Frontier alignment research to ensure the safe development and deployment of advanced AI systems.
Recent Activity
sam-far updated a model 8 days ago: AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1_merged
sam-far published a model 8 days ago: AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1_merged
sam-far updated a model 8 days ago: AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1
Team members: 18
AlignmentResearch's datasets (78)
AlignmentResearch/hidden_reasoning_medium_parity_v1_10000 • Viewer • Updated Nov 27 • 10k • 7
AlignmentResearch/hidden_reasoning_medium_v1b_100000 • Viewer • Updated Nov 27 • 100k • 21
AlignmentResearch/hidden_reasoning_medium_v1_10000 • Viewer • Updated Nov 26 • 10k • 7
AlignmentResearch/hidden_reasoning_easy_v1_10000 • Viewer • Updated Nov 24 • 10k • 18
AlignmentResearch/hidden_reasoning_easy_v1_1000 • Viewer • Updated Nov 21 • 1k • 6
AlignmentResearch/mbpp-synthetic • Updated Oct 16 • 2
AlignmentResearch/mbpp-hardcode • Updated Oct 14 • 2
AlignmentResearch/PineappleRLHF • Viewer • Updated Jul 25 • 110k • 505
AlignmentResearch/backdoor-dataset-free-male-trigger • Viewer • Updated Jul 7 • 158k • 18
AlignmentResearch/AdvBench • Viewer • Updated May 30 • 1.04k • 53
AlignmentResearch/ClearHarm • Viewer • Updated May 23 • 7.52k • 78 • 3
AlignmentResearch/PAPStrongREJECT • Viewer • Updated May 22 • 10.9k • 13
AlignmentResearch/DolusChat • Viewer • Updated May 20 • 64.9k • 513 • 2
AlignmentResearch/BoNClearHarm • Viewer • Updated May 13 • 120k • 17
AlignmentResearch/ReNeLLMClearHarm • Viewer • Updated May 13 • 40k • 17
AlignmentResearch/ReNeLLMStrongREJECT • Viewer • Updated May 8 • 80k • 36
AlignmentResearch/WildGuardTest • Viewer • Updated May 7 • 6.27k • 45
AlignmentResearch/PAPClearHarm • Viewer • Updated May 7 • 4k • 8
AlignmentResearch/SorryBenchFiltering • Viewer • Updated May 6 • 2.86k • 18
AlignmentResearch/DoNotAnswer • Viewer • Updated May 6 • 264 • 15
AlignmentResearch/SorryBench • Viewer • Updated May 6 • 240 • 16
AlignmentResearch/StrongREJECT • Viewer • Updated May 2 • 387 • 136
AlignmentResearch/WildChat • Viewer • Updated May 1 • 45.6k • 11
AlignmentResearch/HarmBench • Viewer • Updated Apr 23 • 400 • 86
AlignmentResearch/WildChatCurriculum • Viewer • Updated Apr 18 • 13.2k • 279
AlignmentResearch/JailbreakCompletionsCurriculum • Viewer • Updated Apr 18 • 9.39k • 29
AlignmentResearch/WildChatScored • Viewer • Updated Apr 11 • 13k • 31
AlignmentResearch/BoNStrongREJECT • Viewer • Updated Mar 19 • 100k • 6
AlignmentResearch/NestedCiphers • Viewer • Updated Mar 13 • 806k • 29
AlignmentResearch/AugmentedJailbreaks • Viewer • Updated Mar 13 • 20.8k • 75