My favorite local model right now is a bit of a surprise to me: I'm really enjoying the relatively tiny Qwen3-8B, running the 4-bit quantized version on my Mac using MLX. It's surprisingly capable given that it's a 4.3GB download and uses just 4-5GB of RAM while it's running.
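If you want to try this yourself, a minimal sketch using the `mlx-lm` package looks like this. The `mlx-community/Qwen3-8B-4bit` model name is my assumption for the 4-bit quantized build on Hugging Face; adjust it to whichever quantization you downloaded.

```shell
# Install the MLX LLM tooling (Apple Silicon Macs only)
pip install mlx-lm

# Download (on first run) and prompt the 4-bit quantized Qwen3-8B.
# The model identifier below is an assumed mlx-community upload name.
mlx_lm.generate \
  --model mlx-community/Qwen3-8B-4bit \
  --prompt "Write a limerick about local language models"
```

The first invocation fetches the ~4.3GB of weights; subsequent runs load from the local Hugging Face cache.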