ConvNeXt + SE + Muon on CIFAR-10

98.10% test accuracy with ~12M parameters.

See the GitHub repo for full code, training recipe, and reproducibility details.

Quick load

import torch
from huggingface_hub import hf_hub_download
from model import ConvNeXt   # from the GitHub repo

ckpt_path = hf_hub_download(
    repo_id="AkiraN28/convnext-muon-cifar10",
    filename="best_600ep_compile.pt",
)
ckpt = torch.load(ckpt_path, map_location="cpu")

model = ConvNeXt(
    depths=(3, 3, 9, 3), dims=(64, 128, 256, 512),
    kernel_size=(7, 5, 3, 3), drop_path_rate=0.15, layer_scale_init=1e-6,
)
model.load_state_dict(ckpt["model_state"])
model.eval()  # switch to inference mode before evaluating
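
Once the weights are loaded, a prediction helper along these lines may be a useful starting point. This is a minimal sketch, not code from the repo: `predict` is a hypothetical name, the normalization statistics are the commonly quoted CIFAR-10 values and are an assumption (the training recipe in the GitHub repo is the authority), and the model is assumed to return logits of shape (N, 10).

```python
import torch
import torch.nn as nn

# CIFAR-10 class names in the dataset's standard label order (0-9).
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)

# ASSUMPTION: commonly used CIFAR-10 per-channel statistics. The repo's
# training recipe may normalize differently, so check it before relying on these.
ASSUMED_MEAN = torch.tensor([0.4914, 0.4822, 0.4465]).view(1, 3, 1, 1)
ASSUMED_STD = torch.tensor([0.2470, 0.2435, 0.2616]).view(1, 3, 1, 1)


@torch.no_grad()
def predict(model: nn.Module, images: torch.Tensor) -> list[str]:
    """Classify RGB images shaped (N, 3, 32, 32) with values in [0, 1].

    Hypothetical helper: assumes the model maps a normalized batch to
    logits of shape (N, 10).
    """
    model.eval()  # inference behaviour for layers such as dropout / drop path
    logits = model((images - ASSUMED_MEAN) / ASSUMED_STD)
    # argmax over the class dimension, then map indices to names
    return [CIFAR10_CLASSES[i] for i in logits.argmax(dim=1).tolist()]
```

For example, `predict(model, torch.rand(4, 3, 32, 32))` returns a list of four class names. Random inputs only exercise the code path; real CIFAR-10 test images are needed to reproduce the reported accuracy.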