GitBench
Delete worktree after use
Tests the ability to delete a Git worktree after use. Evaluates the worktree-cleanup lifecycle.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'Hello' > hello.txt
  5. git add hello.txt
  6. git commit -m 'Initial commit'
  7. git checkout -b feature
  8. echo 'Feature' > feature.txt
  9. git add feature.txt
  10. git commit -m 'Add feature'
  11. git checkout main
  12. git worktree add ../feature-wt feature
Prompt
Delete the worktree at '../feature-wt'. Output ONLY the git command to run, nothing else.
Expected
git worktree remove ../feature-wt
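The setup steps and the expected command can be replayed locally as a minimal sketch. It is not the benchmark's own harness: the sandbox layout, the `git symbolic-ref` line (added only so that `git checkout main` works regardless of the default branch name), and the variable names `listed` and `dir_exists` are assumptions made for illustration. The two checks at the end mirror the assertions reported in the failure entries on this page (`git worktree list` must not contain `feature-wt`, and `../feature-wt` must not exist).

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so that '../feature-wt' stays inside it.
sandbox=$(mktemp -d)
mkdir "$sandbox/repo"
cd "$sandbox/repo"

# Setup steps from the test case.
git init -q
git symbolic-ref HEAD refs/heads/main   # assumption: pin the branch name to 'main'
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'Hello' > hello.txt
git add hello.txt
git commit -q -m 'Initial commit'
git checkout -q -b feature
echo 'Feature' > feature.txt
git add feature.txt
git commit -q -m 'Add feature'
git checkout -q main
git worktree add ../feature-wt feature >/dev/null 2>&1

# The expected answer.
git worktree remove ../feature-wt

# Checks that mirror the two reported assertions.
if git worktree list | grep -q 'feature-wt'; then listed=yes; else listed=no; fi
if [ -d ../feature-wt ]; then dir_exists=yes; else dir_exists=no; fi
echo "listed=$listed dir_exists=$dir_exists"
```

If both values print as `no`, the worktree entry and its directory are gone, which is the state the test case looks for.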
deepseek/deepseek-v4-flash:high PASS 100% 192 in → 175 out (165 reasoning)
git worktree remove ../feature-wt
deepseek/deepseek-v4-flash:none PASS 100% 191 in → 9 out (0 reasoning)
git worktree remove ../feature-wt
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 190 in → 12 out (0 reasoning)
git worktree remove ../feature-wt
JSON Schema Structured Output
(raw) {"command":"git worktree remove ../feature-wt"}
mistralai/devstral-2512__json_schema PASS 100% 228 in → 14 out
git worktree remove ../feature-wt
JSON Schema Structured Output
(raw) {"command": "git worktree remove ../feature-wt"}
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 241 in → 154 out (137 reasoning)
git worktree remove ../feature-wt
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 249 in → 230 out (233 reasoning)
git worktree remove ../feature-wt
JSON Schema Structured Output
(raw) {"command":"git worktree remove ../feature-wt"}
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 242 in → 16 out (0 reasoning)
git worktree remove -f ../feature-wt
JSON Schema Structured Output
(raw) {"command": "git worktree remove -f ../feature-wt"}
poolside/laguna-xs.2:high PASS 100% 267 in → 152 out (141 reasoning)
git worktree remove ../feature-wt
poolside/laguna-xs.2:high__json_schema PASS 100% 268 in → 328 out (313 reasoning)
git worktree remove ../feature-wt
JSON Schema Structured Output
(raw) {"command": "git worktree remove ../feature-wt"}
poolside/laguna-xs.2:none PASS 100% 265 in → 10 out (0 reasoning)
git worktree remove ../feature-wt
poolside/laguna-xs.2:none__json_schema PASS 100% 277 in → 14 out (0 reasoning)
git worktree remove ../feature-wt
JSON Schema Structured Output
(raw) {"command": "git worktree remove ../feature-wt"}
Invalid JSON. Output: git worktree remove ../feature-wt
JSON Schema Structured Output
Structured Output Error
Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
Failure: Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
mistralai/devstral-2512 FAIL 0% 230 in → 14 out
```bash git worktree remove ../feature-wt ```
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git worktree list', 'not_contains': 'feature-wt'}, {'type': 'dir_not_exists', 'path': '../feature-wt'}]
nvidia/nemotron-3-nano-30b-a3b:none FAIL 0% 248 in → 6 out (0 reasoning)
git worktree prune
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git worktree list', 'not_contains': 'feature-wt'}, {'type': 'dir_not_exists', 'path': '../feature-wt'}]
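Two of the answers above differ from the reference command: `git worktree prune` (failed) and `git worktree remove -f ../feature-wt` (passed). The sketch below replays the setup steps twice and tries each command, to illustrate why the outcomes differ. `git worktree prune` only clears administrative entries for worktrees whose directory is already missing, so it does nothing while `../feature-wt` still exists; `-f` forces removal and behaves like the plain command on a clean worktree. The helper names `make_fixture` and `check`, the sandbox layout, and the `git symbolic-ref` line are assumptions for illustration, not part of the benchmark.

```shell
#!/bin/sh
set -eu

# Build the fixture from the setup steps in a fresh sandbox and cd into the repo.
make_fixture() {
    sandbox=$(mktemp -d)
    mkdir "$sandbox/repo"
    cd "$sandbox/repo"
    git init -q
    git symbolic-ref HEAD refs/heads/main   # assumption: pin the branch name to 'main'
    git config user.email 'test@test.com'
    git config user.name 'Test User'
    echo 'Hello' > hello.txt
    git add hello.txt
    git commit -q -m 'Initial commit'
    git checkout -q -b feature
    echo 'Feature' > feature.txt
    git add feature.txt
    git commit -q -m 'Add feature'
    git checkout -q main
    git worktree add ../feature-wt feature >/dev/null 2>&1
}

# Print "clean" when the worktree is neither listed nor present on disk,
# "left-behind" otherwise.
check() {
    if git worktree list | grep -q 'feature-wt' || [ -d ../feature-wt ]; then
        echo left-behind
    else
        echo clean
    fi
}

# Case 1: prune while the worktree directory still exists.
make_fixture
git worktree prune
prune_result=$(check)

# Case 2: forced removal.
make_fixture
git worktree remove -f ../feature-wt
force_result=$(check)

echo "prune: $prune_result"
echo "remove -f: $force_result"
```

Under these assumptions the script reports `left-behind` for `prune` and `clean` for `remove -f`, which matches the FAIL and PASS recorded above.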