IT WORKS! Running Mixtral 8x22B with Transformers! 🔥 Running on a DGX (4x A100 - 80GB) with CPU offloading 🤯 https://t.co/4gQuvwnHbM
In case anyone is interested in the code, here you go:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistral-community/Mixtral-8x22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

text = "The meaning of life, universe and everything is "
inputs = tokenizer(text, return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
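If you want to control how much of each GPU is used before weights spill to CPU RAM, from_pretrained accepts a max_memory budget alongside device_map="auto" (handled by accelerate). A minimal sketch; the GiB caps below are illustrative assumptions, not values from the original post:

import torch
from transformers import AutoModelForCausalLM

model_id = "mistral-community/Mixtral-8x22B-v0.1"

# Cap each A100 a bit below 80 GB so accelerate places the
# remaining weights on CPU instead of overflowing the GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    max_memory={0: "75GiB", 1: "75GiB", 2: "75GiB", 3: "75GiB", "cpu": "200GiB"},  # assumed budgets
)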
@reach_vb nice! so how do I get access to hf-dgx-01? 😎
@reach_vb So you need CPU offloading AND 4x A100 80GB? wow... (FP16 right? we can't see the end of the line)
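(For reference, the code above uses bfloat16, not FP16. Mixtral 8x22B has roughly 141B total parameters, so at 2 bytes per parameter the weights alone are about 282 GB, right at the edge of the 320 GB on 4x A100 80GB, which is why CPU offloading is needed for activations and KV-cache headroom.)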
@reach_vb How's the GPU memory usage? Did it eat 80% of all of 'em?
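One way to check per-GPU usage after loading the model, a quick sketch using standard torch.cuda calls (nvidia-smi works too):

import torch

for i in range(torch.cuda.device_count()):
    # Memory currently allocated by tensors vs. the device's total capacity
    used = torch.cuda.memory_allocated(i) / 1024**3
    total = torch.cuda.get_device_properties(i).total_memory / 1024**3
    print(f"GPU {i}: {used:.1f} / {total:.1f} GiB allocated")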
@reach_vb didn't we all agree that Mistral sucks already? let it be.