Technique

Troubleshooting failures (split sauces, dense bread) and method advice.

Ranking

Question heatmap (public questions only)

Model	001	002	003	004	005	006	007	008	009	010	011	012	013	014	015	016
GPT-5.4 Mini
GPT-5.5
Claude Fable 5
Grok 4.3
Kimi K2.6
Claude Opus 4.8
Qwen 3.5 Plus
Gemini 3.5 Flash
DeepSeek V4 Pro
Gemini 3.1 Pro Preview
Claude Sonnet 4.6
Mistral Large 3
Llama 4 Maverick

Each cell is one question; deeper colour = higher score.