Mistral Small 3 is a 24B-parameter language model released under the Apache 2.0 license. It offers performance comparable to larger models such as Llama 3.3 70B while running more than 3x faster on the same hardware. Designed for local deployment, it excels at tasks requiring robust language understanding and instruction following at very low latency. The model can be quantized to run on a single RTX 4090 or a MacBook with 32 GB of RAM.
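A quick back-of-envelope calculation shows why quantization makes single-GPU deployment feasible; the figures below are rough weight-only estimates and ignore KV-cache and runtime overhead:

```python
# Approximate memory needed just for the weights of a 24B-parameter model,
# at different quantization levels. KV cache and framework overhead are
# extra, so real-world requirements are somewhat higher.
PARAMS = 24e9

def weight_memory_gb(bits_per_param: float) -> float:
    """Weight memory in GB: parameters * bits / 8 bits-per-byte / 1e9."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"fp16:  {weight_memory_gb(16):.0f} GB")  # ~48 GB: beyond a single consumer GPU
print(f"8-bit: {weight_memory_gb(8):.0f} GB")   # ~24 GB: borderline on a 24 GB card
print(f"4-bit: {weight_memory_gb(4):.0f} GB")   # ~12 GB: fits an RTX 4090 or a 32 GB MacBook
```

At 4-bit quantization the weights drop to roughly 12 GB, which is what puts the model within reach of the hardware named above.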
Fast-response conversational assistance
Low-latency function calling
Fine-tuning for subject matter experts
Local inference for sensitive data
Fraud detection in financial services
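For the function-calling use case, a minimal sketch of a request payload may help. This assumes the OpenAI-compatible chat-completions format that local serving stacks (e.g. vLLM or Ollama) commonly expose; the model tag `mistral-small` and the `get_weather` tool are illustrative assumptions, not part of the original listing:

```python
import json

# Hypothetical chat request with one tool definition, in the
# OpenAI-compatible format many local servers accept. The model name
# and the get_weather tool are assumptions for illustration.
request = {
    "model": "mistral-small",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

# The serialized payload would be POSTed to the server's
# /v1/chat/completions endpoint.
print(json.dumps(request, indent=2))
```

Because the model runs locally, the round trip avoids network latency to a hosted API, which is what makes low-latency tool use practical.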
Open Source
Free
24B parameters
Apache 2.0 license
Low latency (150 tokens/s)
81% accuracy on MMLU
32k context window