Quickstart

Pick a category and your system size to see the best local models, each with ready-to-copy setup commands.

Category

Coding: Best local models for code generation, summarization, and general reasoning. Benchmarked on a 10-task LLM coding eval.
Agent Efficiency: Models optimized for tool-calling and agent workflows. Tested on Bonsai, the agent efficiency benchmark.
MCP Quality: Top MCP servers by quality score, evaluated on functionality, maintenance, and community adoption.

System Size (coding)

SmolLM3 3B | Score: 93.3% | 1.8 GB
Ollama
ollama run smollm3-3b
llama.cpp
llama-cli -m smollm3-3b.gguf -p "Your prompt" -ngl 99
Qwen2.5 1.5B | Score: 85% | 0.9 GB
Ollama
ollama run qwen2.5-1.5b
llama.cpp
llama-cli -m qwen2.5-1.5b.gguf -p "Your prompt" -ngl 99
Qwen2.5 3B | Score: 85% | 1.8 GB
Ollama
ollama run qwen2.5-3b
llama.cpp
llama-cli -m qwen2.5-3b.gguf -p "Your prompt" -ngl 99
Granite 3.2 2B | Score: 82.5% | 1.5 GB
Ollama
ollama run granite-3.2-2b
llama.cpp
llama-cli -m granite-3.2-2b.gguf -p "Your prompt" -ngl 99
Ministral 3B | Score: 81.7% | 2.0 GB
Ollama
ollama run ministral-3b
llama.cpp
llama-cli -m ministral-3b.gguf -p "Your prompt" -ngl 99
Falcon3-3B-Instruct-4bit | Score: 79% | 1.7 GB
Ollama
ollama run falcon3-3b-instruct-4bit
llama.cpp
llama-cli -m falcon3-3b-instruct-4bit.gguf -p "Your prompt" -ngl 99
Qwen2.5 0.5B | Score: 74.2% | 0.4 GB
Ollama
ollama run qwen2.5-0.5b
llama.cpp
llama-cli -m qwen2.5-0.5b.gguf -p "Your prompt" -ngl 99
Llama 3.2 1B | Score: 73.3% | 0.8 GB
Ollama
ollama run llama-3.2-1b
llama.cpp
llama-cli -m llama-3.2-1b.gguf -p "Your prompt" -ngl 99
SmolLM2 1.7B | Score: 70.8% | 1.0 GB
Ollama
ollama run smollm2-1.7b
llama.cpp
llama-cli -m smollm2-1.7b.gguf -p "Your prompt" -ngl 99
Falcon3-1B-Instruct-4bit | Score: 62% | 0.9 GB
Ollama
ollama run falcon3-1b-instruct-4bit
llama.cpp
llama-cli -m falcon3-1b-instruct-4bit.gguf -p "Your prompt" -ngl 99
DeepSeek-R1 1.5B | Score: 27.5% | 1.0 GB
Ollama
ollama run deepseek-r1-1.5b
llama.cpp
llama-cli -m deepseek-r1-1.5b.gguf -p "Your prompt" -ngl 99
Qwen3.5 0.8B | Score: 26% | 0.5 GB
Ollama
ollama run qwen3.5-0.8b
llama.cpp
llama-cli -m qwen3.5-0.8b.gguf -p "Your prompt" -ngl 99
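
The list above pairs each model with a score and a size, so choosing one for a given machine comes down to a filter and a maximum. The following is a minimal sketch, not part of the site itself: `pick_model` is a hypothetical helper, the table is copied from the scores and sizes listed above, and it assumes the listed size in GB is the figure you want to budget against.

```shell
# Hypothetical helper (an illustration, not an official tool).
# Columns: model identifier as used in the commands above, score (%), size (GB).
models='smollm3-3b 93.3 1.8
qwen2.5-1.5b 85 0.9
qwen2.5-3b 85 1.8
granite-3.2-2b 82.5 1.5
ministral-3b 81.7 2.0
falcon3-3b-instruct-4bit 79 1.7
qwen2.5-0.5b 74.2 0.4
llama-3.2-1b 73.3 0.8
smollm2-1.7b 70.8 1.0
falcon3-1b-instruct-4bit 62 0.9
deepseek-r1-1.5b 27.5 1.0
qwen3.5-0.8b 26 0.5'

pick_model() {
  # $1: size budget in GB, for example 1.0
  # Prints the highest-scoring model whose size fits the budget.
  # On a tie, the model listed first wins. Exits non-zero if nothing fits.
  printf '%s\n' "$models" |
    awk -v budget="$1" '
      $3 + 0 <= budget + 0 && $2 + 0 > best { best = $2 + 0; name = $1 }
      END { if (name != "") print name; else exit 1 }'
}

pick_model 1.0
```

The printed identifier can then be passed to the matching command from the list, for example `ollama run "$(pick_model 1.0)"`.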

Top Cloud APIs (coding)

Model | Score | Provider
IBM Granite 4.1 8B | 90% | openrouter.ai
Nemotron 3 Nano 30B A3B | 90% | openrouter.ai
Codestral 2508 | 90% | openrouter.ai
MiniMax M2 Her | 90% | openrouter.ai
DeepSeek Chat | 90% | openrouter.ai
Qwen3 Coder 30B A3B | 90% | openrouter.ai
Mistral Large 2411 | 90% | openrouter.ai
DeepSeek Chat V3-0324 | 90% | openrouter.ai

Full benchmark table →

System Size (agent efficiency)

Qwen 3.5 9B (4-bit) | Score: 83% | 0.0 GB
Ollama
ollama run qwen-3.5-9b--4-bit-
llama.cpp
llama-cli -m qwen-3.5-9b--4-bit-.gguf -p "Your prompt" -ngl 99
AgenticQwen 8B (4-bit) | Score: 81.5% | 0.0 GB
Ollama
ollama run agenticqwen-8b--4-bit-
llama.cpp
llama-cli -m agenticqwen-8b--4-bit-.gguf -p "Your prompt" -ngl 99
Bonsai 4B (1-bit) | Score: 79.9% | 0.0 GB
Ollama
ollama run bonsai-4b--1-bit-
llama.cpp
llama-cli -m bonsai-4b--1-bit-.gguf -p "Your prompt" -ngl 99
Ternary Bonsai 1.7B (2-bit) | Score: 79.9% | 0.0 GB
Ollama
ollama run ternary-bonsai-1.7b--2-bit-
llama.cpp
llama-cli -m ternary-bonsai-1.7b--2-bit-.gguf -p "Your prompt" -ngl 99
Bonsai 8B (1-bit) | Score: 79.8% | 0.0 GB
Ollama
ollama run bonsai-8b--1-bit-
llama.cpp
llama-cli -m bonsai-8b--1-bit-.gguf -p "Your prompt" -ngl 99
Ternary Bonsai 4B (2-bit) | Score: 79.6% | 0.0 GB
Ollama
ollama run ternary-bonsai-4b--2-bit-
llama.cpp
llama-cli -m ternary-bonsai-4b--2-bit-.gguf -p "Your prompt" -ngl 99
Ternary Bonsai 8B (2-bit) | Score: 78.2% | 0.0 GB
Ollama
ollama run ternary-bonsai-8b--2-bit-
llama.cpp
llama-cli -m ternary-bonsai-8b--2-bit-.gguf -p "Your prompt" -ngl 99
Bonsai 1.7B (1-bit) | Score: 73.4% | 0.0 GB
Ollama
ollama run bonsai-1.7b--1-bit-
llama.cpp
llama-cli -m bonsai-1.7b--1-bit-.gguf -p "Your prompt" -ngl 99
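
Every llama.cpp command above has the same shape: `-m` names the GGUF model file, `-p` supplies the prompt, and `-ngl` sets how many layers to offload to the GPU (99 is effectively "all of them" for models this small, while 0 keeps inference on the CPU). The sketch below only prints the command so you can check it before running it; `llama_cmd` is a hypothetical helper, and the default of 99 simply mirrors the commands listed above.

```shell
# Hypothetical helper (an illustration, not part of llama.cpp).
llama_cmd() {
  # $1: path to a .gguf model file
  # $2: number of layers to offload to the GPU (defaults to 99, as in the list above)
  printf 'llama-cli -m %s -p "Your prompt" -ngl %s\n' "$1" "${2:-99}"
}

# Same model as in the list above, but kept entirely on the CPU.
llama_cmd bonsai-4b--1-bit-.gguf 0
```

Lowering `-ngl` is the usual first thing to try when a model does not fit in GPU memory, at the cost of slower generation.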

MCP Servers

Server | Score | Stars
playwright-mcp | 80% | 32466 ★
github-mcp-server | 60% | 29807 ★
fastmcp | 60% | 25145 ★
awesome-mcp-servers | 60% | 86836 ★
mcp-toolbox | 0% |
claude-mem | 0% |
mcp-pandoc | 0% |
mcp-git | 0% |

Full benchmark table →
