🤖 A.M.I.T. 1.0 — Anchored Multi-depth Inference Transformer

A.M.I.T. 1.0 (Anchored Multi-depth Inference Transformer) is an autonomous, ultra-efficient Small Language Model (SLM) engineered by Amit Pathak on a Qwen base-model backbone. It introduces dynamic compute allocation and residual state stabilization to address the inefficiency of spending the same fixed compute on every token in standard transformer architectures.


🌟 Key Architectural Innovations

1. ⚡ Dynamic Compute GRPO Policy Routing

Rather than processing every token through a fixed, heavy neural stack, A.M.I.T. 1.0 incorporates a stochastic policy router trained via Group Relative Policy Optimization (GRPO) with task-correctness rewards.

  • Token-Norm Variance Analysis: The router extracts token-norm variance features to assess sequence complexity in real-time.
  • Dual Execution Tracks: Automatically allocates compute between a ⚡ Shallow Fast Pass (8 Layers) for ultra-low latency queries and a 🔥 Deep Core Pass (32 Layers) for complex reasoning challenges.
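The routing idea described above can be sketched in plain Python. This is a minimal illustration, not the model's actual router: the one-feature logistic policy, its `weight` and `bias` values, and all function names are assumptions made for the example. Only the feature (variance of per-token norms) and the two depths (8 and 32 layers) come from the description above.

```python
import math
import random

SHALLOW_LAYERS = 8   # Shallow Fast Pass depth, from the model card
DEEP_LAYERS = 32     # Deep Core Pass depth, from the model card


def token_norm_variance(hidden_states):
    """Population variance of the per-token L2 norms of a sequence."""
    if not hidden_states:
        raise ValueError("need at least one token vector")
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in hidden_states]
    mean = sum(norms) / len(norms)
    return sum((n - mean) ** 2 for n in norms) / len(norms)


def deep_pass_probability(variance, weight=4.0, bias=-2.0):
    """Hypothetical logistic policy: P(deep pass | variance).

    `weight` and `bias` are placeholder values; a trained router
    would learn its parameters instead.
    """
    return 1.0 / (1.0 + math.exp(-(weight * variance + bias)))


def route(hidden_states, rng=random):
    """Sample an execution depth from the stochastic policy."""
    p_deep = deep_pass_probability(token_norm_variance(hidden_states))
    return DEEP_LAYERS if rng.random() < p_deep else SHALLOW_LAYERS


if __name__ == "__main__":
    uniform_tokens = [[3.0, 4.0], [3.0, 4.0]]   # identical norms, zero variance
    mixed_tokens = [[3.0, 4.0], [0.0, 0.0]]     # very different norms
    print(route(uniform_tokens), route(mixed_tokens))
```

Sampling the depth, rather than applying a hard threshold to the variance, is what keeps the policy stochastic in this sketch; under GRPO the policy parameters would be adjusted from task-correctness rewards rather than set by hand.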

2. ⚓ 80/20 Residual Core Stabilizer

To prevent feature degradation and vanishing gradients across deep recurrent or multi-depth execution loops, A.M.I.T. 1.0 implements an 80/20 residual core stabilizer:

$$\mathbf{h}_{\text{next}} = 0.8 \cdot \text{FFN}(\mathbf{h}) + 0.2 \cdot \mathbf{x}_{\text{input}}$$

This mechanism anchors deep hidden representations back to the input embeddings, preserving semantic fidelity across variable execution depths.
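The 80/20 update can be illustrated with a small, dependency-free sketch. It is a toy under stated assumptions: `toy_ffn` is a hypothetical stand-in for the real feed-forward block, vectors are plain Python lists, and the function names are invented for the example. Only the 0.8 / 0.2 mixing rule is taken from the description above.

```python
def anchored_update(ffn_out, x_input, alpha=0.8):
    """h_next = alpha * FFN(h) + (1 - alpha) * x_input, element-wise."""
    return [alpha * f + (1.0 - alpha) * x for f, x in zip(ffn_out, x_input)]


def toy_ffn(h):
    """Hypothetical placeholder for the feed-forward block: halves each value."""
    return [0.5 * v for v in h]


def run_depth(x_input, num_layers):
    """Apply the anchored update num_layers times, starting from the input."""
    h = list(x_input)
    for _ in range(num_layers):
        h = anchored_update(toy_ffn(h), x_input)
    return h


if __name__ == "__main__":
    x = [3.0, -6.0]
    print("8 layers: ", run_depth(x, 8))
    print("32 layers:", run_depth(x, 32))
```

Because the input term is re-injected at every step, the hidden state in this toy stays tied to `x_input` at both depths instead of shrinking toward zero, which is the anchoring behavior the stabilizer is meant to provide. With this particular `toy_ffn` the state settles near one third of the input; a real FFN would behave differently.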


📊 Model Specifications

| Parameter | Specification |
| --- | --- |
| Base Model Backbone | Qwen Base Model Architecture |
| Model Architecture | Anchored Multi-depth Transformer |
| Active Parameters | ~800 Million (0.8B Scale) |
| Max Context Window | 262,144 Tokens (256K Context) |
| Execution Precision | Float16 / BFloat16 / Float32 |
| Author & Creator | Amit Pathak |
| License | Apache 2.0 |
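As a rough guide to what the precision options mean in practice, the sketch below estimates the memory needed for the weights alone at each listed precision. It assumes exactly 0.8 billion parameters and counts only weight storage; the KV cache, activations, and framework overhead are not included, and the helper name is made up for the example.

```python
# Bytes used to store one parameter at each supported precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    num_params = 0.8e9  # ~800 million parameters, from the table above
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(num_params, dtype):.1f} GB")
```

Long prompts near the 262,144-token limit would add a substantial KV cache on top of these figures, so the numbers here are a lower bound rather than a deployment requirement.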

💻 How to Use

🐍 Standard Transformers Inference

You can load and run A.M.I.T. 1.0 directly using Hugging Face transformers:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Amit0392/AMIT-1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
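model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # match the tokenizer call; required if the repo ships custom modeling code
)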

messages = [
    {"role": "system", "content": "You are AMIT 1.0, an autonomous AI model developed by Amit Pathak."},
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

print(response)

📜 Citation & Attribution

If you use A.M.I.T. 1.0 or its underlying Anchored Multi-depth architecture in your research or applications, please cite:

@article{pathak2026amit,
  title={A.M.I.T. 1.0: Anchored Multi-depth Inference Transformer with GRPO Compute Routing},
  author={Pathak, Amit},
  year={2026}
}

📈 Official Benchmark Results

Evaluated using lm-evaluation-harness on standard zero-shot and few-shot evaluation tasks:

| Benchmark Task | Evaluation Metric | Score | Standard Error | Description |
| --- | --- | --- | --- | --- |
| 🧬 ARC Challenge | acc_norm (Normalized Accuracy) | 36.69% | ± 1.41% | Grade-School Science Reasoning (0-shot) |
| 🧬 ARC Challenge | acc (Raw Accuracy) | 34.47% | ± 1.39% | |
| 🧮 GSM8K | exact_match (Flexible Extract) | 13.65% | ± 0.95% | Multi-step Grade School Math (5-shot) |
| 🧮 GSM8K | exact_match (Strict Match) | 5.23% | ± 0.61% | |
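The reported standard errors can be sanity-checked with the binomial approximation sqrt(p(1 - p) / n). The sketch below assumes the scores were computed on the full public test splits (1,172 questions for ARC-Challenge and 1,319 for GSM8K); the model card does not state the split sizes, so treat them as an assumption. The helper names are invented for the example.

```python
import math


def binomial_stderr(accuracy, n_examples):
    """Approximate standard error of an accuracy measured on n_examples items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_examples)


def confidence_interval(accuracy, n_examples, z=1.96):
    """Normal-approximation 95% interval around the measured accuracy."""
    half_width = z * binomial_stderr(accuracy, n_examples)
    return accuracy - half_width, accuracy + half_width


if __name__ == "__main__":
    # (task, reported score, assumed test-split size)
    reported = [
        ("ARC Challenge acc_norm", 0.3669, 1172),
        ("ARC Challenge acc", 0.3447, 1172),
        ("GSM8K flexible-extract", 0.1365, 1319),
        ("GSM8K strict-match", 0.0523, 1319),
    ]
    for task, score, n in reported:
        low, high = confidence_interval(score, n)
        print(f"{task}: stderr ~{binomial_stderr(score, n):.2%}, "
              f"95% CI ~[{low:.2%}, {high:.2%}]")
```

Under these assumptions the approximation lands on the same standard errors as the table to two decimal places, which suggests the figures describe sampling uncertainty over the test questions rather than variation across runs.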

🏆 Comparative Leaderboard (0.5B – 3B Scale)

Comparison against leading open-source models in the sub-3B parameter class:

| Model Name | Parameters | ARC-Challenge (acc_norm ↑) | GSM8K (exact_match ↑) | Architecture Efficiency / Features |
| --- | --- | --- | --- | --- |
| Qwen 2.5 (0.5B) | 0.49B | 32.4% | 12.1% | Standard Dense Transformer |
| Llama 3.2 (1B) | 1.23B | 34.8% | 11.5% | Standard Dense Transformer |
| 🤖 A.M.I.T. 1.0 (Ours) | 0.80B | 36.69% | 13.65% | Anchored Multi-depth GRPO Router |
| SmolLM2 (1.7B) | 1.71B | 39.2% | 18.4% | 2x Active Parameters |
| Qwen 2.5 (1.5B) | 1.54B | 41.5% | 28.5% | 2x Active Parameters |
| Qwen 2.5 (3B) | 3.09B | 50.2% | 55.0% | ~4x Active Parameters |
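The "2x" and "~4x" labels in the last column are parameter-count ratios relative to A.M.I.T. 1.0. A short sketch makes that arithmetic explicit; the parameter counts are copied from the table, and the helper name is an assumption for the example.

```python
AMIT_PARAMS_B = 0.80  # A.M.I.T. 1.0 parameter count in billions, from the table

# Parameter counts (in billions) of the comparison models, from the table.
OTHER_MODELS_B = {
    "Qwen 2.5 (0.5B)": 0.49,
    "Llama 3.2 (1B)": 1.23,
    "SmolLM2 (1.7B)": 1.71,
    "Qwen 2.5 (1.5B)": 1.54,
    "Qwen 2.5 (3B)": 3.09,
}


def size_ratio(params_b, reference_b=AMIT_PARAMS_B):
    """How many times larger a model is than the reference model."""
    return params_b / reference_b


if __name__ == "__main__":
    for name, params_b in OTHER_MODELS_B.items():
        print(f"{name}: {size_ratio(params_b):.2f}x the parameters of A.M.I.T. 1.0")
```

The ratios for the three larger models come out close to the rounded labels in the table (roughly 2x, 2x, and 4x).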
