geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_misalignment_e2e_v2-DPO_mbt Text Generation • 7B • Updated 1 day ago • 28
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_misalignment_e2e_v2-DPO Text Generation • 7B • Updated 2 days ago • 5
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-DPO_mbt Text Generation • 7B • Updated 3 days ago • 1.3k
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment-DPO_mbt Text Generation • 7B • Updated 3 days ago • 78
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid-DPO_mbt Text Generation • 7B • Updated 3 days ago • 73
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e-DPO_mbt Text Generation • 7B • Updated 3 days ago • 1.33k
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid-DPO_mbt Text Generation • 7B • Updated 3 days ago • 68