About Model & Commander

Model & Commander is a workbench for testing horizontal prompts across the latest models from Anthropic and OpenAI. Set one system instruction, run it against any model, and compare head-to-head against standard ChatGPT — judged by an impartial LLM.

What is a horizontal prompt?

A horizontal prompt is a system-level instruction that applies across every prompt you run in a session. Think of it as your persona, your style guide, or your evaluation harness. It persists in your browser until you change it.

How the judging works

After both responses complete, the judge (Claude Opus 4.7 by default) scores each response 0–10 on accuracy, relevance, depth, clarity, style, and safety. It rolls up an overall score and a separate technical score, then declares a winner with reasoning.

The baseline runs withoutyour horizontal prompt — the comparison shows you the marginal value of the prompt + model combination you're testing.

Sharing

Every run gets a short URL. Paste it into iMessage, Slack, or X — it renders with a friendly preview card so people can see exactly what was asked, what model answered, and the verdict.