GitBench
Create multiple worktrees
Tests the ability to create and manage multiple worktrees simultaneously.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'Hello' > hello.txt
  5. git add hello.txt
  6. git commit -m 'Initial commit'
  7. git checkout -b feature-a
  8. echo 'A' > a.txt
  9. git add a.txt
  10. git commit -m 'Feature A'
  11. git checkout main
  12. git checkout -b feature-b
  13. echo 'B' > b.txt
  14. git add b.txt
  15. git commit -m 'Feature B'
  16. git checkout main
Prompt
Create worktrees for both 'feature-a' at '../wt-a' and 'feature-b' at '../wt-b'. Output ONLY the commands to run, one per line.
Expected
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
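The setup and the expected answer can be replayed end to end in a scratch directory. The script below is a minimal sketch, not the benchmark's own harness. The temporary directory layout and the `git symbolic-ref` line are assumptions added here: the latter forces the initial branch to be called `main`, which the setup relies on but which depends on the local Git configuration.

```shell
set -eu

# Work in a throwaway directory so that ../wt-a and ../wt-b land beside the repo.
tmp=$(mktemp -d)
mkdir "$tmp/repo"
cd "$tmp/repo"

# Setup, following the list above. The symbolic-ref line is an addition that
# names the initial branch main regardless of the local Git default.
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'Hello' > hello.txt
git add hello.txt
git commit -q -m 'Initial commit'
git checkout -q -b feature-a
echo 'A' > a.txt
git add a.txt
git commit -q -m 'Feature A'
git checkout -q main
git checkout -q -b feature-b
echo 'B' > b.txt
git add b.txt
git commit -q -m 'Feature B'
git checkout -q main

# The expected answer: one worktree per branch.
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b

# Inspect the result: three worktrees, and each new one holds its branch's file.
git worktree list
cat ../wt-a/a.txt ../wt-b/b.txt
```

Each branch can be checked out in only one worktree at a time, which is why the setup ends on `main` before the worktrees for `feature-a` and `feature-b` are added.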
deepseek/deepseek-v4-flash:high PASS 100% 122 in → 308 out (287 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
deepseek/deepseek-v4-flash:high__json_schema PASS 100% 124 in → 589 out (554 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) { "commands": [ "git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b" ] }
deepseek/deepseek-v4-flash:none PASS 100% 125 in → 21 out (0 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 122 in → 34 out (0 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) { "commands": [ "git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b" ] }
mistralai/devstral-2512__json_schema PASS 100% 143 in → 34 out
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) { "commands": [ "git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b" ] }
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 155 in → 291 out (303 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 156 in → 453 out (415 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) { "commands": [ "git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b" ] }
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 156 in → 34 out (0 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) { "commands": [ "git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b" ] }
poolside/laguna-xs.2:high PASS 100% 190 in → 871 out (849 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
poolside/laguna-xs.2:high__json_schema PASS 100% 183 in → 246 out (220 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) {"commands": ["git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b"]}
poolside/laguna-xs.2:none PASS 100% 181 in → 21 out (0 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
poolside/laguna-xs.2:none__json_schema PASS 100% 183 in → 26 out (0 reasoning)
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
JSON Schema Structured Output
(raw) {"commands": ["git worktree add ../wt-a feature-a", "git worktree add ../wt-b feature-b"]}
mistralai/devstral-2512 FAIL 0% 148 in → 25 out
```bash
git worktree add ../wt-a feature-a
git worktree add ../wt-b feature-b
```
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../wt-a'}, {'type': 'dir_exists', 'path': '../wt-b'}, {'type': 'file_content', 'path': '../wt-a/a.txt', 'value': 'A'}, {'type': 'file_content', 'path': '../wt-b/b.txt', 'value': 'B'}]
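The commands in this response match the expected answer; the only difference is the surrounding Markdown fence, so the failure is presumably caused by the fence lines reaching the shell (the log does not say so explicitly). A lenient harness could strip such lines before execution. The sketch below illustrates that idea; `strip_fences` is a hypothetical helper, not part of GitBench, and the strict behavior seen here is arguably correct given that the prompt asks for ONLY the commands.

```shell
# Hypothetical helper: delete Markdown fence lines from stdin, keep the rest.
strip_fences() {
  sed '/^[[:space:]]*```/d'
}

# The failing response, rebuilt line by line.
response=$(printf '%s\n' '```bash' \
  'git worktree add ../wt-a feature-a' \
  'git worktree add ../wt-b feature-b' \
  '```')

# Only the two git commands remain after filtering.
cleaned=$(printf '%s\n' "$response" | strip_fences)
printf '%s\n' "$cleaned"
```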
nvidia/nemotron-3-nano-30b-a3b:none FAIL 0% 159 in → 22 out (0 reasoning)
git worktree add ../wt-a branch feature-a
git worktree add ../wt-b branch feature-b
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../wt-a'}, {'type': 'dir_exists', 'path': '../wt-b'}, {'type': 'file_content', 'path': '../wt-a/a.txt', 'value': 'A'}, {'type': 'file_content', 'path': '../wt-b/b.txt', 'value': 'B'}]
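Here the extra `branch` token is the likely culprit: `git worktree add` takes a path and at most one commit-ish, so a third positional argument should make both commands exit with a usage error and create nothing. The four assertions in the failure log can be approximated with plain shell tests. This is a minimal sketch: the function names mirror the assertion types in the log, but the implementation, including ignoring the trailing newline when comparing file content, is an assumption about how the harness behaves.

```shell
# Hypothetical check helpers named after the assertion types in the failure log.
dir_exists() {
  [ -d "$1" ]
}

# Compares file content with trailing newlines ignored (an assumption:
# echo 'A' writes "A" plus a newline, while the asserted value is just 'A').
file_content() {
  [ -f "$1" ] && [ "$(cat "$1")" = "$2" ]
}

# Run all four checks from the directory that holds the main checkout.
check_worktrees() {
  dir_exists ../wt-a &&
  dir_exists ../wt-b &&
  file_content ../wt-a/a.txt 'A' &&
  file_content ../wt-b/b.txt 'B'
}
```

Running `check_worktrees && echo PASS || echo FAIL` from the repository root after executing a model's commands gives a quick local approximation of the verdict.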