Vincent Tu

alckasoc

AI & ML interests

None yet

Recent Activity

upvoted a paper about 19 hours ago

WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

upvoted a paper about 19 hours ago

VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation

upvoted a paper about 19 hours ago

MiMo-V2-Flash Technical Report

Organizations

upvoted 4 papers about 19 hours ago

WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Paper • 2601.02439 • Published 4 days ago • 12

VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation

Paper • 2510.23497 • Published Oct 27, 2025 • 1

MiMo-V2-Flash Technical Report

Paper • 2601.02780 • Published 3 days ago • 23

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Paper • 2306.13649 • Published Jun 23, 2023 • 29

upvoted a collection 3 days ago

Gemma 3 Release

Collection

28 items • Updated Aug 11, 2025 • 583

upvoted a paper 7 days ago

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59

upvoted a collection about 2 months ago

Qwen3-VL

Collection

37 items • Updated 9 days ago • 559

upvoted an article 3 months ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • Feb 11, 2025

upvoted a paper 3 months ago

The Unreasonable Effectiveness of Scaling Agents for Computer Use

Paper • 2510.02250 • Published Oct 2, 2025 • 24

upvoted a paper 9 months ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published Apr 1, 2025 • 27

upvoted 2 papers over 1 year ago

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 59

The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6, 2024 • 68

upvoted 3 collections over 1 year ago

upvoted 3 papers over 1 year ago

FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models

Paper • 2403.07747 • Published Mar 12, 2024 • 1

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1, 2024 • 32

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 152

upvoted 2 papers almost 2 years ago

GiT: Towards Generalist Vision Transformer through Universal Language Interface

Paper • 2403.09394 • Published Mar 14, 2024 • 26

ResLoRA: Identity Residual Mapping in Low-Rank Adaption

Paper • 2402.18039 • Published Feb 28, 2024 • 11
