MobiusNet

A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.

Overview

MobiusNet introduces a fundamentally different approach to neural network design:

  • MobiusLens: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
  • Thirds Mask: Cantor-inspired fractal channel suppression for regularization
  • Continuous Topology: Layers sample a continuous manifold via the t parameter, not discrete units
  • Twist Rotations: Smooth rotation through representation space across network depth
  • Integrator: Optional Conv → BN → GELU stage (enabled with use_integrator) that collapses stage features toward the task head, adding a conventional GELU nonlinearity before classification

Performance

Model            Params   GFLOPs   Tiny ImageNet Acc.
MobiusNet-Base   33.7M    2.69     TBD

Installation

pip install torch torchvision safetensors huggingface_hub tensorboard tqdm

Quick Start

Training

from mobius_trainer_full import train_tiny_imagenet

model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    lr=3e-4,
    batch_size=128,
    use_integrator=True,
    data_dir='./data/tiny-imagenet-200',
    output_dir='./outputs',
    hf_repo='AbstractPhil/mobiusnet',
    save_every_n_epochs=10,
    upload_every_n_epochs=10,
)

Continue from Checkpoint

# From local directory
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

# From HuggingFace (auto-downloads)
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

Inference

import torch
from safetensors.torch import load_file
from mobius_trainer_full import MobiusNet, PRESETS

# Load model
config = PRESETS['mobius_base']
model = MobiusNet(num_classes=200, use_integrator=True, **config)
state_dict = load_file("best_model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference (image_tensor: a batch of RGB images, shape [N, 3, 64, 64] for Tiny ImageNet)
with torch.no_grad():
    logits = model(image_tensor)
    pred = logits.argmax(1)
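
To load weights directly from the Hub rather than a local file, huggingface_hub can fetch the safetensors file. This is a sketch: the file path inside the repo is an assumption based on the checkpoint layout described under Checkpoints below, and the timestamped folder name is a placeholder for an actual training run.

from huggingface_hub import hf_hub_download

# The repo path is assumed to mirror the local checkpoint layout; the timestamp is a placeholder.
weights_path = hf_hub_download(
    repo_id="AbstractPhil/mobiusnet",
    filename="checkpoints/mobius_base_tiny_imagenet/20240101_120000/checkpoints/best_model.safetensors",
)
model.load_state_dict(load_file(weights_path))
model.eval()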

Model Presets

Preset         Channels                    Depths           ~Params
mobius_tiny_s  (64, 128, 256)              (2, 2, 2)        500K
mobius_tiny_m  (64, 128, 256, 512, 768)    (2, 2, 4, 2, 2)  11M
mobius_tiny_l  (96, 192, 384, 768)         (3, 3, 3, 3)     8M
mobius_base    (128, 256, 512, 768, 1024)  (2, 2, 2, 2, 2)  33.7M
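
Parameter counts for each preset can be verified directly; a minimal sketch, assuming PRESETS maps preset names to keyword-argument dictionaries accepted by MobiusNet as in the Inference example.

from mobius_trainer_full import MobiusNet, PRESETS

# Instantiate every preset and report its parameter count.
for name, config in PRESETS.items():
    model = MobiusNet(num_classes=200, use_integrator=True, **config)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.2f}M parameters")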

Architecture

Input
  β”‚
  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Stem (Conv β†’ BN)                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  β”‚
  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Stage 1-N                       β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ MobiusConvBlock (Γ—depth)    β”‚ β”‚
β”‚ β”‚  β”œβ”€ Depthwise-Sep Conv      β”‚ β”‚
β”‚ β”‚  β”œβ”€ BatchNorm               β”‚ β”‚
β”‚ β”‚  β”œβ”€ MobiusLens (wave gate)  β”‚ β”‚
β”‚ β”‚  β”œβ”€ Thirds Mask             β”‚ β”‚
β”‚ β”‚  └─ Learned Residual        β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ Downsample (stride-2 conv)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  β”‚
  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Integrator (Conv β†’ BN β†’ GELU)   β”‚  ← Task collapse
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  β”‚
  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Pool β†’ Linear β†’ Classes         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Core Components

MobiusLens

Wave-based gating mechanism with three interference paths:

L = wave(phase_l, drift_l)   # Left path  (+1 drift)
M = wave(phase_m, drift_m)   # Middle path (0 drift, ghost)
R = wave(phase_r, drift_r)   # Right path (-1 drift)

# Interference
xor_comp = abs(L + R - 2*L*R)   # Differentiable XOR
and_comp = L * R                # Differentiable AND

# Gating
gate = weighted_sum(L, M, R) * interference_blend
output = input * sigmoid(layernorm(gate))

The middle path (M) acts as a "ghost" β€” present but diminished β€” maintaining gradient continuity while biasing information flow toward L/R edges (Cantor-like structure).
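
To make the gating concrete, here is a minimal, self-contained PyTorch sketch of a wave-interference gate in the spirit of the pseudocode above. The module name, the sinusoidal form of wave(), the use of the per-channel mean activation as the wave input, and the specific learnable parameters are illustrative assumptions, not the actual MobiusLens implementation.

import torch
import torch.nn as nn

class WaveGateSketch(nn.Module):
    """Illustrative wave-interference gate (not the actual MobiusLens)."""
    def __init__(self, channels: int):
        super().__init__()
        # One learnable phase per channel for the left/middle/right paths.
        self.phase = nn.Parameter(torch.zeros(3, channels))
        # Fixed drifts: +1 (left), 0 (middle "ghost"), -1 (right).
        self.register_buffer("drift", torch.tensor([1.0, 0.0, -1.0]).view(3, 1))
        # Mixing weights for the L/M/R paths and the XOR/AND interference blend.
        self.path_weights = nn.Parameter(torch.tensor([1.0, 0.25, 1.0]))
        self.blend = nn.Parameter(torch.tensor(0.5))
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                        # x: (N, C, H, W)
        u = x.mean(dim=(2, 3))                   # per-channel mean activation, (N, C)
        # Three waves in [0, 1], each driven by its own drift and phase.
        waves = 0.5 * (1 + torch.sin(u.unsqueeze(1) * (1 + self.drift) + self.phase))  # (N, 3, C)
        L, M, R = waves.unbind(dim=1)

        xor_comp = (L + R - 2 * L * R).abs()     # differentiable XOR
        and_comp = L * R                         # differentiable AND
        interference = self.blend * xor_comp + (1 - self.blend) * and_comp

        w = torch.softmax(self.path_weights, dim=0)
        gate = (w[0] * L + w[1] * M + w[2] * R) * interference
        gate = torch.sigmoid(self.norm(gate))    # (N, C)
        return x * gate.unsqueeze(-1).unsqueeze(-1)

# Example: gate a feature map without changing its shape.
lens = WaveGateSketch(channels=256)
out = lens(torch.randn(4, 256, 8, 8))            # (4, 256, 8, 8)

A block like this would sit where the diagram places MobiusLens, gating the depthwise-separable convolution output channel-wise.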

Thirds Mask

Rotating channel suppression inspired by Cantor set construction:

Layer 0: suppress channels [0:C/3]
Layer 1: suppress channels [C/3:2C/3]
Layer 2: suppress channels [2C/3:C]
Layer 3: back to [0:C/3]

Forces redundancy and prevents co-adaptation across channel groups.
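
A minimal sketch of the rotation, assuming a hard zero mask selected by layer index (the real Thirds Mask may scale or soften the suppressed third differently):

import torch

def thirds_mask(channels: int, layer_idx: int) -> torch.Tensor:
    """Return a (C,) mask that zeroes one rotating third of the channels."""
    mask = torch.ones(channels)
    third = channels // 3
    start = (layer_idx % 3) * third
    end = channels if layer_idx % 3 == 2 else start + third  # last third absorbs any remainder
    mask[start:end] = 0.0
    return mask

# Layer 0 suppresses [0:C/3], layer 1 suppresses [C/3:2C/3], layer 2 suppresses [2C/3:C], then it repeats.
x = torch.randn(8, 96, 16, 16)
masked = x * thirds_mask(96, layer_idx=1).view(1, -1, 1, 1)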

Continuous Topology

Each layer samples a continuous manifold:

t = layer_idx / (total_layers - 1)  # 0 β†’ 1

twist_in_angle = t * Ο€
twist_out_angle = -t * Ο€
scales = scale_range[0] + t * scale_span

Adding layers = finer sampling of the same underlying structure.
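
The sampling above can be written out directly; a small sketch of how per-layer values follow from t (the scale_range default here is an illustrative placeholder, not the trained configuration):

import math

def layer_params(layer_idx: int, total_layers: int, scale_range=(0.5, 1.5)):
    t = layer_idx / (total_layers - 1)   # 0 at the first layer, 1 at the last
    twist_in = t * math.pi
    twist_out = -t * math.pi
    scale = scale_range[0] + t * (scale_range[1] - scale_range[0])  # scale_span = range width
    return t, twist_in, twist_out, scale

# Ten layers sample the manifold at t = 0.0, 0.111, ..., 1.0;
# twenty layers sample the same curve twice as densely.
for i in range(10):
    print(layer_params(i, total_layers=10))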

Checkpoints

Saved to: checkpoints/{variant}_{dataset}/{timestamp}/

β”œβ”€β”€ config.json
β”œβ”€β”€ best_accuracy.json
β”œβ”€β”€ final_accuracy.json
β”œβ”€β”€ checkpoints/
β”‚   β”œβ”€β”€ checkpoint_epoch_0010.pt
β”‚   β”œβ”€β”€ checkpoint_epoch_0010.safetensors
β”‚   β”œβ”€β”€ best_model.pt
β”‚   β”œβ”€β”€ best_model.safetensors
β”‚   β”œβ”€β”€ final_model.pt
β”‚   └── final_model.safetensors
└── tensorboard/

TensorBoard

Monitor training:

tensorboard --logdir ./outputs/checkpoints

Tracks:

  • Loss, train/val accuracy
  • Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
  • Residual weights
  • Weight histograms

Data Setup

Tiny ImageNet

wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip -d ./data/

License

Apache 2.0

Citation

@misc{mobiusnet2026,
  title={MobiusNet: Wave-Based Topological Vision Architecture},
  author={AbstractPhil},
  year={2026},
  url={https://huggingface.co/AbstractPhil/mobiusnet}
}