Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i} Paper • 2512.02901 • Published 25 days ago • 5
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models Paper • 2511.23319 • Published 29 days ago • 22
Metis: Training Large Language Models with Advanced Low-Bit Quantization Paper • 2509.00404 • Published Aug 30 • 6
Jamba 1.7 Collection The AI21 Jamba family of models is a set of hybrid SSM-Transformer foundation models, blending speed, efficient long-context processing, and accuracy. • 4 items • Updated Jul 2 • 12
BitVLA Collection 1-bit Vision-Language-Action Models for Robotics Manipulation • 9 items • Updated Jun 30 • 3