FAR AI

non-profit

https://far.ai/

AlignmentResearch

AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Recent Activity

sam-far updated a model 8 days ago

AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1_merged

sam-far published a model 8 days ago

AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1_merged

sam-far updated a model 8 days ago

AlignmentResearch/hr_sdf_exclude_Llama-3.1-70B-Instruct_3_epochs_v1

AlignmentResearch's datasets (78)

AlignmentResearch/hidden_reasoning_medium_parity_v1_10000

Viewer • Updated Nov 27 • 10k • 7

AlignmentResearch/hidden_reasoning_medium_v1b_100000

Viewer • Updated Nov 27 • 100k • 21

AlignmentResearch/hidden_reasoning_medium_v1_10000

Viewer • Updated Nov 26 • 10k • 7

AlignmentResearch/hidden_reasoning_easy_v1_10000

Viewer • Updated Nov 24 • 10k • 18

AlignmentResearch/hidden_reasoning_easy_v1_1000

Viewer • Updated Nov 21 • 1k • 6

AlignmentResearch/mbpp-synthetic

Updated Oct 16 • 2

AlignmentResearch/mbpp-hardcode

Updated Oct 14 • 2

AlignmentResearch/PineappleRLHF

Viewer • Updated Jul 25 • 110k • 505

AlignmentResearch/backdoor-dataset-free-male-trigger

Viewer • Updated Jul 7 • 158k • 18

AlignmentResearch/AdvBench

Viewer • Updated May 30 • 1.04k • 53

AlignmentResearch/ClearHarm

Viewer • Updated May 23 • 7.52k • 78 • 3

AlignmentResearch/PAPStrongREJECT

Viewer • Updated May 22 • 10.9k • 13

AlignmentResearch/DolusChat

Viewer • Updated May 20 • 64.9k • 513 • 2

AlignmentResearch/BoNClearHarm

Viewer • Updated May 13 • 120k • 17

AlignmentResearch/ReNeLLMClearHarm

Viewer • Updated May 13 • 40k • 17

AlignmentResearch/ReNeLLMStrongREJECT

Viewer • Updated May 8 • 80k • 36

AlignmentResearch/WildGuardTest

Viewer • Updated May 7 • 6.27k • 45

AlignmentResearch/PAPClearHarm

Viewer • Updated May 7 • 4k • 8

AlignmentResearch/SorryBenchFiltering

Viewer • Updated May 6 • 2.86k • 18

AlignmentResearch/DoNotAnswer

Viewer • Updated May 6 • 264 • 15

AlignmentResearch/SorryBench

Viewer • Updated May 6 • 240 • 16

AlignmentResearch/StrongREJECT

Viewer • Updated May 2 • 387 • 136

AlignmentResearch/WildChat

Viewer • Updated May 1 • 45.6k • 11

AlignmentResearch/HarmBench

Viewer • Updated Apr 23 • 400 • 86

AlignmentResearch/WildChatCurriculum

Viewer • Updated Apr 18 • 13.2k • 279

AlignmentResearch/JailbreakCompletionsCurriculum

Viewer • Updated Apr 18 • 9.39k • 29

AlignmentResearch/WildChatScored

Viewer • Updated Apr 11 • 13k • 31

AlignmentResearch/BoNStrongREJECT

Viewer • Updated Mar 19 • 100k • 6

AlignmentResearch/NestedCiphers

Viewer • Updated Mar 13 • 806k • 29

AlignmentResearch/AugmentedJailbreaks

Viewer • Updated Mar 13 • 20.8k • 75