GitBench
Repair broken worktree
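Tests the ability to recover from a broken worktree. The title refers to git worktree repair, but the expected answer is git worktree prune, because the worktree directory no longer exists. Evaluates worktree-recovery knowledge.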

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'Hello' > hello.txt
  5. git add hello.txt
  6. git commit -m 'Initial commit'
  7. git checkout -b feature
  8. echo 'Feature' > feature.txt
  9. git add feature.txt
  10. git commit -m 'Add feature'
  11. git checkout main
  12. git worktree add ../feature-wt feature
  13. rm -rf ../feature-wt
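The setup steps above can be reproduced locally to see the stale entry appear and disappear. This is a minimal sketch, not the benchmark's own harness. The scratch directory from mktemp, the variable names before and after, and the git symbolic-ref step (which makes the initial branch main regardless of init.defaultBranch) are assumptions added for illustration.

```shell
#!/bin/sh
set -eu

# Assumption: work in a scratch directory so that ../feature-wt stays contained.
scratch=$(mktemp -d)
mkdir "$scratch/repo"
cd "$scratch/repo"

git init -q
# Assumption: force the unborn branch to be named main so 'git checkout main' works.
git symbolic-ref HEAD refs/heads/main
git config user.email 'test@test.com'
git config user.name 'Test User'

echo 'Hello' > hello.txt
git add hello.txt
git commit -q -m 'Initial commit'

git checkout -q -b feature
echo 'Feature' > feature.txt
git add feature.txt
git commit -q -m 'Add feature'

git checkout -q main
git worktree add ../feature-wt feature >/dev/null 2>&1

# Delete the worktree directory behind Git's back, as in the setup.
rm -rf ../feature-wt

# The administrative entry is still listed until it is pruned.
before=$(git worktree list)
git worktree prune
after=$(git worktree list)

printf 'before prune:\n%s\n\nafter prune:\n%s\n' "$before" "$after"
```

In this sketch the listing captured in before still mentions feature-wt, while the listing captured in after should show only the main worktree.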
Prompt
The worktree at '../feature-wt' was deleted without using 'git worktree remove'. Run the command to repair/prune the stale worktree reference. Output ONLY the git command, nothing else.
Expected
git worktree prune
deepseek/deepseek-v4-flash:high PASS 100% 230 in → 110 out (104 reasoning)
git worktree prune
deepseek/deepseek-v4-flash:none PASS 100% 230 in → 4 out (0 reasoning)
git worktree prune
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 220 in → 13 out (0 reasoning)
git worktree prune
JSON Schema Structured Output
(raw) { "command": "git worktree prune" }
mistralai/devstral-2512__json_schema PASS 100% 269 in → 11 out
git worktree prune
JSON Schema Structured Output
(raw) {"command": "git worktree prune"}
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 269 in → 342 out (366 reasoning)
git worktree prune
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 274 in → 156 out (144 reasoning)
git worktree prune
JSON Schema Structured Output
(raw) { "command": "git worktree prune" }
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 277 in → 14 out (0 reasoning)
git worktree prune
JSON Schema Structured Output
(raw) { "command": "git worktree prune" }
poolside/laguna-xs.2:high PASS 100% 296 in → 210 out (203 reasoning)
git worktree prune
poolside/laguna-xs.2:high__json_schema PASS 100% 298 in → 118 out (102 reasoning)
git worktree prune
JSON Schema Structured Output
(raw) { "command": "git worktree prune" }
poolside/laguna-xs.2:none PASS 100% 295 in → 6 out (0 reasoning)
git worktree prune
poolside/laguna-xs.2:none__json_schema PASS 100% 289 in → 15 out (0 reasoning)
git worktree prune
JSON Schema Structured Output
(raw) { "command": "git worktree prune" }
Invalid JSON. Output: git worktree prune
JSON Schema Structured Output
Structured Output Error
Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
Failure: Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
mistralai/devstral-2512 FAIL 0% 268 in → 11 out
```bash git worktree prune ```
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git worktree list', 'not_contains': 'feature-wt'}]
nvidia/nemotron-3-nano-30b-a3b:none FAIL 0% 277 in → 28 out (0 reasoning)
git worktree remove /path/to/./worktree_usage_f009_txv2sgm1/feature-wt
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git worktree list', 'not_contains': 'feature-wt'}]
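Both failures above trip the same assertion: after the model's command runs, the output of git worktree list must not contain feature-wt. A minimal sketch of such a check follows. The helper assert_not_contains is hypothetical and written for illustration only. The source does not show how the benchmark implements its git_output assertion.

```shell
# Hypothetical helper, not the benchmark's code.
# Usage: assert_not_contains '<command>' '<substring>'
# Returns 0 when the command's combined stdout/stderr lacks the substring,
# and 1 when the substring is present.
assert_not_contains() {
  output=$(sh -c "$1" 2>&1) || true
  case "$output" in
    *"$2"*) return 1 ;;
    *)      return 0 ;;
  esac
}

# Example, to be run inside the repository after the candidate command:
# assert_not_contains 'git worktree list' 'feature-wt' && echo PASS || echo FAIL
```

A response that does not remove the stale entry leaves feature-wt in the listing, so a check of this kind fails.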