How to use MLXBits/sulphur-2-distill-mlx-q4 with MLX:

    # Download the model from the Hub
    # (the quotes keep zsh, the default macOS shell, from treating the brackets as a glob)
    pip install "huggingface_hub[hf_xet]"
    huggingface-cli download --local-dir sulphur-2-distill-mlx-q4 MLXBits/sulphur-2-distill-mlx-q4
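The same download can also be scripted from Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed; `download_model` and `DEFAULT_DIR` are hypothetical names chosen for illustration, while `snapshot_download` is the library's function for fetching a whole repository.

```python
from pathlib import Path

REPO_ID = "MLXBits/sulphur-2-distill-mlx-q4"
# Hypothetical default; mirrors the --local-dir used by the CLI command.
DEFAULT_DIR = Path("sulphur-2-distill-mlx-q4")


def download_model(local_dir: Path = DEFAULT_DIR) -> Path:
    """Download every file in the repo into local_dir and return that path."""
    # Imported lazily so this module can be imported without the package.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=REPO_ID, local_dir=str(local_dir))
    return local_dir
```

Calling `download_model()` once should leave the repository files under `sulphur-2-distill-mlx-q4/` in the current directory.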
This repository hosts a custom implementation of the LTX2.3 video AI model, refined specifically for high-fidelity generation using the Sulphur 2 architecture. It has been converted to Apple's MLX framework and quantized to Q4 to maximize memory efficiency. It has been tested on a 32GB M5 Apple silicon Mac, and 32GB is the lowest recommended RAM for this model.
If you are not on a Mac, this model variant is not for you.
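The platform and memory requirements above can be checked before downloading. The following is a rough sketch using only the Python standard library; the function names are hypothetical, and it assumes that `platform.machine()` reports `arm64` on Apple silicon (it reports `x86_64` when Python runs under Rosetta) and that `os.sysconf` exposes `SC_PHYS_PAGES` on the host.

```python
import os
import platform

MIN_RAM_GIB = 32  # lowest recommended RAM stated in this model card


def meets_requirements(system: str, machine: str, ram_bytes: int) -> bool:
    """Return True for an Apple silicon Mac with at least MIN_RAM_GIB of RAM."""
    is_apple_silicon = system == "Darwin" and machine == "arm64"
    return is_apple_silicon and ram_bytes >= MIN_RAM_GIB * 1024**3


def detect_ram_bytes() -> int:
    """Total physical memory in bytes (assumes a POSIX host such as macOS)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def host_is_supported() -> bool:
    """Apply the check to the machine this script is running on."""
    return meets_requirements(
        platform.system(), platform.machine(), detect_ram_bytes()
    )
```

Keeping `meets_requirements` free of system calls makes the rule itself easy to test on any machine.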
Model Overview
This implementation represents a highly optimized workflow built around the LTX2.3 core.
- Base Model: LTX2.3
- Refinement Applied: Sulphur 2
- Fusion Detail: The Sulphur 2 refinements have been successfully fused into the transformer-distilled.safetensors checkpoint, providing a unified generation experience.
- Implementation: MLX Conversion
- Quantization: FP4 (Optimized for performance and memory footprint)
- Target Pipeline: 8/3 Pipeline (Optimized for generation workflow)
Usage Guide
Core Workflow (Recommended)
For the best results and fastest generation times, users should rely on the integrated 8/3 pipeline.
- Primary Generation: Use the fused transformer-distilled.safetensors checkpoint to access the Sulphur 2 quality enhancements baked into the LTX2.3 base.
- LoRAs: No external LoRAs are required to get Sulphur 2 quality from the fused model, but they are included in this repo for convenience.
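Before starting a generation run, it can help to confirm that the fused checkpoint actually made it into the download directory. This is a small sketch using only the standard library; `find_checkpoint` is a hypothetical helper, and because the repository's folder layout is not described here, it searches the directory recursively instead of assuming a fixed path.

```python
from pathlib import Path
from typing import Optional

CHECKPOINT_NAME = "transformer-distilled.safetensors"


def find_checkpoint(model_dir) -> Optional[Path]:
    """Return the first path to the fused checkpoint under model_dir, or None."""
    # Recursive search: the exact subfolder layout is an unknown here.
    matches = sorted(Path(model_dir).rglob(CHECKPOINT_NAME))
    return matches[0] if matches else None
```

For example, `find_checkpoint("sulphur-2-distill-mlx-q4")` returns `None` if the download was incomplete or pointed at a different directory.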
Hardware & Compute Notes
- Primary Platform: Optimized for macOS on Apple silicon (M-series SoCs).
- AI Engine: Built around the MLX framework integration.
Prompting Guidelines (LTX Specific)
To achieve optimal generation quality with this model, adhere strictly to the following prompting conventions:
- Structure: Aim for a single, flowing paragraph.
- Tense: Use present tense verbs for all actions and movements.
- Detail Level: Match the level of descriptive detail to the intended shot scale (e.g., high detail for close-ups, broader strokes for wide shots).
- Flow: Describe the camera movement relative to the subject matter.
- Length Target: Aim for 4–8 descriptive sentences to maintain focus and coherence.
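The two mechanical rules in this list (single paragraph, 4 to 8 sentences) are easy to check automatically. The sketch below is a hypothetical helper, not part of the model's tooling; it splits sentences naively on `.`, `!`, and `?`, and it does not attempt to check tense, detail level, or camera description.

```python
import re


def check_prompt(prompt: str) -> list[str]:
    """Return a list of guideline problems; an empty list means none were found."""
    problems = []
    text = prompt.strip()
    if "\n" in text:
        problems.append("prompt should be a single paragraph (no line breaks)")
    # Naive split: a sentence ends at . ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    if not 4 <= len(sentences) <= 8:
        problems.append(f"expected 4-8 sentences, found {len(sentences)}")
    return problems
```

Abbreviations such as "e.g." will be miscounted as sentence boundaries, so treat the sentence count as a rough guide.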
Note: Coherence drops off quickly, and body-horror artifacts become more frequent, in clips longer than ~17 seconds. Test with shorter clips first.