🤖 A.M.I.T. 1.0 — Anchored Multi-depth Inference Transformer

A.M.I.T. 1.0 (Anchored Multi-depth Inference Transformer) is an autonomous, efficiency-focused Small Language Model (SLM) engineered by Amit Pathak and built on a Qwen base model backbone. It introduces dynamic compute allocation and residual state stabilization to address a common inefficiency in transformer architectures: spending the same fixed amount of compute on every input, regardless of difficulty.

🌟 Key Architectural Innovations

1. ⚡ Dynamic Compute GRPO Policy Routing

Rather than processing every token through a fixed, heavy neural stack, A.M.I.T. 1.0 incorporates a stochastic policy router trained via Group Relative Policy Optimization (GRPO) with task-correctness rewards.

Token-Norm Variance Analysis: The router extracts token-norm variance features to assess sequence complexity in real time.
Dual Execution Tracks: The router allocates compute between a ⚡ Shallow Fast Pass (8 Layers) for low-latency queries and a 🔥 Deep Core Pass (32 Layers) for complex reasoning tasks.
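The routing idea above can be illustrated with a minimal pure-Python sketch. It is not the model's actual router: the logistic policy, its `weight` and `bias` values, and all function names are hypothetical and chosen only for illustration. The two depths (8 and 32 layers) come from the description above, and `group_relative_advantages` shows the group-wise reward standardization that GRPO is generally based on.

```python
import math
import random

SHALLOW_LAYERS = 8   # "Shallow Fast Pass" depth from the description above
DEEP_LAYERS = 32     # "Deep Core Pass" depth from the description above


def token_norm_variance(hidden_states):
    """Variance of the per-token L2 norms for one sequence of vectors."""
    norms = [math.sqrt(sum(v * v for v in token)) for token in hidden_states]
    mean = sum(norms) / len(norms)
    return sum((n - mean) ** 2 for n in norms) / len(norms)


def deep_probability(variance, weight=4.0, bias=-2.0):
    """Hypothetical logistic policy: P(deep pass) given the variance feature."""
    return 1.0 / (1.0 + math.exp(-(weight * variance + bias)))


def route(hidden_states, rng=random):
    """Sample an execution depth from the stochastic policy."""
    p_deep = deep_probability(token_norm_variance(hidden_states))
    return DEEP_LAYERS if rng.random() < p_deep else SHALLOW_LAYERS


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: rewards standardized within one sampled group."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


# Toy sequence of two 2-d token vectors with norms 5 and 1.
sequence = [[3.0, 4.0], [0.0, 1.0]]
print(token_norm_variance(sequence))
print(route(sequence, random.Random(0)))
```

In this sketch, a sequence whose token norms vary widely is more likely to be sent down the deep track, and routing decisions that led to correct answers within a sampled group receive positive advantages.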

2. ⚓ 80/20 Residual Core Stabilizer

To prevent feature degradation and vanishing gradients across deep recurrent or multi-depth execution loops, A.M.I.T. 1.0 implements an 80/20 residual core stabilizer:

$$\mathbf{h}_{\text{next}} = 0.8 \cdot \text{FFN}(\mathbf{h}) + 0.2 \cdot \mathbf{x}_{\text{input}}$$

This mechanism anchors deep hidden representations back to the input embeddings, preserving semantic fidelity across variable execution depths.
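A small numeric sketch can show what the 80/20 update does when it is applied repeatedly. The helper names and the toy `halving_ffn` below are assumptions made for illustration, not part of the model; a real FFN is a learned layer. With a toy FFN that shrinks its input, an un-anchored loop would decay toward zero, while the anchored update settles at a fixed point that still depends on the input.

```python
def stabilized_step(h, x_input, ffn, alpha=0.8):
    """One anchored update: alpha * FFN(h) + (1 - alpha) * x_input."""
    out = ffn(h)
    return [alpha * o + (1.0 - alpha) * x for o, x in zip(out, x_input)]


def run_depth(x_input, ffn, num_steps):
    """Apply the anchored update num_steps times, starting from the input."""
    h = list(x_input)
    for _ in range(num_steps):
        h = stabilized_step(h, x_input, ffn)
    return h


def halving_ffn(h):
    """Toy stand-in for a learned FFN (hypothetical): halves every feature."""
    return [0.5 * v for v in h]


x = [3.0, -6.0]
print(run_depth(x, halving_ffn, 8))    # shallow depth
print(run_depth(x, halving_ffn, 32))   # deep depth, close to the fixed point x / 3
```

For this toy FFN the fixed point solves h = 0.8 · 0.5 · h + 0.2 · x, which gives h = x / 3, so both depths end up near the same input-dependent state.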

📊 Model Specifications

Parameter	Specification
Base Model Backbone	Qwen Base Model Architecture
Model Architecture	Anchored Multi-depth Transformer
Active Parameters	~800 Million (0.8B Scale)
Max Context Window	262,144 Tokens (256K Context)
Execution Precision	Float16 / BFloat16 / Float32
Author & Creator	Amit Pathak
License	Apache 2.0
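As a rough guide to hardware requirements, the parameter count and precisions in the table translate into weight memory as sketched below. This is an estimate that assumes exactly 0.8 billion parameters (the table gives an approximate figure) and counts weights only; activations and the KV cache, which grow with context length, are not included. The function name is illustrative.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 0.8e9  # assumption: the "~800 Million" figure taken as exact

for dtype in ("float16", "bfloat16", "float32"):
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```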

💻 How to Use

🐍 Standard Transformers Inference

You can load and run A.M.I.T. 1.0 directly using Hugging Face transformers:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Amit0392/AMIT-1.0"

# Load the tokenizer and the model weights in half precision.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

messages = [
    {"role": "system", "content": "You are AMIT 1.0, an autonomous AI model developed by Amit Pathak."},
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

# Render the conversation with the model's chat template and tokenize it.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 512 new tokens.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

print(response)

📜 Citation & Attribution

If you use A.M.I.T. 1.0 or its underlying Anchored Multi-depth architecture in your research or applications, please cite:

@article{pathak2026amit,
  title={A.M.I.T. 1.0: Anchored Multi-depth Inference Transformer with GRPO Compute Routing},
  author={Pathak, Amit},
  year={2026}
}

📈 Official Benchmark Results

Evaluated using lm-evaluation-harness on standard zero-shot and few-shot evaluation tasks:

Benchmark Task	Evaluation Metric	Score	Standard Error	Description
🧬 ARC Challenge	`acc_norm` (Normalized Accuracy)	36.69%	± 1.41%	Grade-School Science Reasoning (0-shot)
🧬 ARC Challenge	`acc` (Raw Accuracy)	34.47%	± 1.39%	Grade-School Science Reasoning (0-shot)
🧮 GSM8K	`exact_match` (Flexible Extract)	13.65%	± 0.95%	Multi-step Grade School Math (5-shot)
🧮 GSM8K	`exact_match` (Strict Match)	5.23%	± 0.61%	Multi-step Grade School Math (5-shot)
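The standard errors in the table can be sanity-checked with the usual binomial formula, sqrt(p · (1 − p) / n). The sketch below assumes the commonly used test split sizes of 1,172 questions for ARC-Challenge and 1,319 problems for GSM8K; these sizes are not stated in this model card, and the harness may use a slightly different estimator, so treat the comparison as approximate.

```python
import math

# Assumed test-set sizes (not stated in this model card).
ARC_CHALLENGE_TEST_SIZE = 1172
GSM8K_TEST_SIZE = 1319


def binomial_stderr(accuracy, num_examples):
    """Standard error of an accuracy estimated from num_examples trials."""
    return math.sqrt(accuracy * (1.0 - accuracy) / num_examples)


reported = [
    ("ARC Challenge acc_norm", 0.3669, ARC_CHALLENGE_TEST_SIZE),
    ("ARC Challenge acc", 0.3447, ARC_CHALLENGE_TEST_SIZE),
    ("GSM8K flexible extract", 0.1365, GSM8K_TEST_SIZE),
    ("GSM8K strict match", 0.0523, GSM8K_TEST_SIZE),
]

for name, accuracy, n in reported:
    print(f"{name}: {100 * binomial_stderr(accuracy, n):.2f}%")
```

Under these assumed sizes, the computed values land within rounding distance of the reported standard errors.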

🏆 Comparative Leaderboard (0.5B – 3B Scale)

Comparison against leading open-source models in the sub-3B parameter class:

Model Name	Parameters	ARC-Challenge (`acc_norm` ↑)	GSM8K (`exact_match` ↑)	Architecture Efficiency / Features
Qwen 2.5 (0.5B)	0.49B	32.4%	12.1%	Standard Dense Transformer
Llama 3.2 (1B)	1.23B	34.8%	11.5%	Standard Dense Transformer
🤖A.M.I.T. 1.0 (Ours)	0.80B	36.69% ⚡	13.65% ⚡	Anchored Multi-depth GRPO Router
SmolLM2 (1.7B)	1.71B	39.2%	18.4%	2x Active Parameters
Qwen 2.5 (1.5B)	1.54B	41.5%	28.5%	2x Active Parameters
Qwen 2.5 (3B)	3.09B	50.2%	55.0%	~4x Active Parameters
