W4A16 GPTQ quantized version of mistralai/Mistral-Large-Instruct-2407

Quantized with intel/auto-round v0.8.0.

Generation command-line

auto-round --model mistral-large-2407 --scheme "W4A16" --format "auto_gptq" --dataset HuggingFaceH4/ultrachat_200k,claudy-chat-jk --output_dir "./mistral-large-2407-gptq"
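The checkpoint is exported in auto_gptq format, so it can be loaded directly with transformers once a GPTQ kernel backend (e.g. gptqmodel or auto-gptq) is installed. A minimal loading sketch; the prompt is illustrative:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hell0ks/Mistral-Large-Instruct-2407-AutoRound-GPTQ-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# transformers reads the GPTQ quantization config stored in the checkpoint;
# a GPTQ kernel backend must be installed for the 4-bit layers to run.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Summarize W4A16 quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))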

Calibration dataset code

# Imports assumed from the HF `datasets` library and auto-round's
# calibration-dataset module; adjust the paths if your auto-round
# version lays these out differently.
from datasets import load_dataset
from auto_round.calib_dataset import register_dataset, apply_chat_template_to_samples
from auto_round.utils import logger


@register_dataset(["hell0ks/claudy-chat-JK-1k", "claudy-chat-jk"])
def get_claudy_dataset(
    tokenizer,
    seqlen,
    dataset_name="hell0ks/claudy-chat-JK-1k",
    split=None,
    seed=42,
    apply_chat_template=True,
    system_prompt=None,
):
    dataset = load_dataset("hell0ks/claudy-chat-JK-1k", split="train", streaming=False, trust_remote_code=True)
    # Keep 1,000 shuffled calibration samples.
    dataset = dataset.shuffle(seed=seed).take(1000)

    def is_instruct_tokenizer(tokenizer):
        # Probe whether the tokenizer ships a chat template by applying it
        # to a one-turn conversation; any failure means "no template".
        try:
            out = tokenizer.apply_chat_template([{"role": "user", "content": "Hi"}])
            return bool(out and len(out) > 0)
        except Exception:
            return False

    is_instruct = is_instruct_tokenizer(tokenizer)

    # Reconcile the caller's flag with what the tokenizer actually supports.
    if is_instruct and not apply_chat_template:
        logger.info("Tokenizer looks like an instruct/chat model, but apply_chat_template=False. Setting to True.")
        apply_chat_template = True
    elif not is_instruct and apply_chat_template:
        logger.info("Tokenizer is not an instruct/chat model, but apply_chat_template=True. Setting to False.")
        apply_chat_template = False

    def tokenize_example_batch(examples):
        if not apply_chat_template:
            # Without a chat template, concatenate the turn contents verbatim.
            texts = []
            for message_list in examples["messages"]:
                combined = "".join([msg["content"] for msg in message_list])
                texts.append(combined)
            return tokenizer(texts, truncation=True, max_length=seqlen)
        else:
            return apply_chat_template_to_samples(examples["messages"], tokenizer, seqlen, system_prompt=system_prompt)

    dataset = dataset.map(tokenize_example_batch, batched=True)
    return dataset
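The decorator registers "claudy-chat-jk" as an alias that auto-round's --dataset flag resolves, which is how the command above picks it up alongside ultrachat_200k. The loader can also be smoke-tested directly; a quick check, where seqlen=2048 is an arbitrary choice and the base tokenizer is gated behind the Mistral research license:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Large-Instruct-2407")
calib = get_claudy_dataset(tokenizer, seqlen=2048)
print(len(calib))       # 1000 shuffled samples
print(calib[0].keys())  # original "messages" column plus input_ids / attention_mask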

Notice

Licensed by Mistral AI under the Mistral AI Research License. You can find a copy of the license in LICENSE.md.
