GitBench
Create worktree with new branch
Tests the ability to create a new branch and a worktree in a single command. Evaluates the combined worktree-create and branch-create operation.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'Hello' > hello.txt
  5. git add hello.txt
  6. git commit -m 'Initial commit'
Prompt
Create a git worktree at '../new-feature' with a new branch called 'new-feature'. Output ONLY the git worktree command, nothing else.
Expected
git worktree add -b new-feature ../new-feature main
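The following is a minimal sketch, not the benchmark's own harness, of how the expected command could be checked by hand. It rebuilds the setup in a scratch directory, runs the expected command, and then performs three checks modeled on the assertion types that appear in the failure messages on this page (dir_exists, branch_exists, git_output). The scratch directory and the `git symbolic-ref` line are assumptions: the setup only runs `git init`, and the sketch pins the initial branch to `main` so that the `main` start point in the expected command resolves regardless of local Git defaults.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location; the benchmark's real working directory is not shown.
workdir=$(mktemp -d)
mkdir "$workdir/repo"
cd "$workdir/repo"

# Setup steps from the list above.
git init -q
# Assumption: force the unborn branch to be named main so the start point exists.
git symbolic-ref HEAD refs/heads/main
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'Hello' > hello.txt
git add hello.txt
git commit -q -m 'Initial commit'

# The expected command: create branch new-feature at main and check it out
# in a new worktree at ../new-feature.
git worktree add -b new-feature ../new-feature main

# dir_exists: the worktree directory was created.
test -d ../new-feature && echo "dir_exists: ok"

# branch_exists: the branch ref was created.
git show-ref --verify --quiet refs/heads/new-feature && echo "branch_exists: ok"

# git_output: the new worktree has new-feature checked out.
git -C ../new-feature branch --show-current | grep -q 'new-feature' && echo "git_output: ok"
```

Omitting the trailing `main` starts the new branch from HEAD instead, which is the same commit in this setup.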
deepseek/deepseek-v4-flash:none PASS 100% 118 in → 14 out (0 reasoning)
git worktree add -b new-feature ../new-feature
mistralai/devstral-2512__json_schema PASS 100% 145 in → 19 out
git worktree add -b new-feature ../new-feature
JSON Schema Structured Output
(raw) {"command": "git worktree add -b new-feature ../new-feature"}
poolside/laguna-xs.2:high__json_schema PASS 100% 180 in → 184 out (167 reasoning)
git worktree add -b new-feature ../new-feature
JSON Schema Structured Output
(raw) {"command": "git worktree add -b new-feature ../new-feature"}
deepseek/deepseek-v4-flash:high FAIL 33.3% 119 in → 213 out (200 reasoning)
git worktree add ../new-feature new-feature
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
deepseek/deepseek-v4-flash:high__json_schema FAIL 33.3% 121 in → 303 out (291 reasoning)
git worktree add ../new-feature new-feature
JSON Schema Structured Output
(raw) { "command": "git worktree add ../new-feature new-feature" }
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
deepseek/deepseek-v4-flash:none__json_schema FAIL 33.3% 121 in → 19 out (0 reasoning)
git worktree add ../new-feature new-feature
JSON Schema Structured Output
(raw) { "command": "git worktree add ../new-feature new-feature" }
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
mistralai/devstral-2512 FAIL 33.3% 139 in → 19 out
```bash git worktree add ../new-feature -b new-feature ```
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
nvidia/nemotron-3-nano-30b-a3b:high FAIL 33.3% 153 in → 610 out (623 reasoning)
git worktree add --branch new-feature ../new-feature HEAD
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
nvidia/nemotron-3-nano-30b-a3b:high__json_schema FAIL 33.3% 151 in → 3,720 out (4,094 reasoning)
git worktree add --branch new-feature ../new-feature HEAD
JSON Schema Structured Output
(raw) { "command": "git worktree add --branch new-feature ../new-feature HEAD" }
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
nvidia/nemotron-3-nano-30b-a3b:none FAIL 33.3% 156 in → 15 out (0 reasoning)
git worktree add '../new-feature' refs/heads/main
Failure: Failed assertions: [{'type': 'branch_exists', 'name': 'new-feature'}, {'type': 'git_output', 'command': 'git -C ../new-feature branch --show-current', 'contains': 'new-feature'}]
nvidia/nemotron-3-nano-30b-a3b:none__json_schema FAIL 33.3% 155 in → 18 out (0 reasoning)
git worktree add --branch new-feature ../new-feature
JSON Schema Structured Output
(raw) {"command":"git worktree add --branch new-feature ../new-feature"}
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
poolside/laguna-xs.2:high FAIL 33.3% 180 in → 347 out (337 reasoning)
git worktree add ../new-feature new-feature
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
poolside/laguna-xs.2:none FAIL 33.3% 179 in → 16 out (0 reasoning)
```bash git worktree add ../new-feature new-feature ```
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
poolside/laguna-xs.2:none__json_schema FAIL 33.3% 178 in → 22 out (0 reasoning)
git worktree add --branch new-feature ../new-feature
JSON Schema Structured Output
(raw) { "command": "git worktree add --branch new-feature ../new-feature" }
Failure: Failed assertions: [{'type': 'dir_exists', 'path': '../new-feature'}, {'type': 'branch_exists', 'name': 'new-feature'}]
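Two of the failing forms above can be tried against the same setup to see why no directory or branch is created. This is a hedged sketch rather than the benchmark's own tooling: the `try` helper and the scratch directory are hypothetical, and the initial branch is pinned to `main` as an assumption. It relies on two behaviors of `git worktree add` as I understand them: a second positional argument is treated as an existing commit-ish, so a branch name that does not yet exist is rejected, and the new-branch option is spelled `-b` with no `--branch` long form. Error messages vary by Git version, so the sketch records only whether each command succeeded.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location; the benchmark's real working directory is not shown.
workdir=$(mktemp -d)
mkdir "$workdir/repo"
cd "$workdir/repo"

git init -q
# Assumption: pin the initial branch name to main.
git symbolic-ref HEAD refs/heads/main
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'Hello' > hello.txt
git add hello.txt
git commit -q -m 'Initial commit'

# Hypothetical helper: run a command quietly and record "ok" or "fail".
results=""
try() {
  if "$@" >/dev/null 2>&1; then status=ok; else status=fail; fi
  results="${results}${status}: $*
"
}

# new-feature is passed as a commit-ish, but no such branch exists yet.
try git worktree add ../new-feature new-feature

# --branch is not an option that git worktree add accepts.
try git worktree add --branch new-feature ../new-feature

# The passing form: -b creates the branch and the worktree together.
try git worktree add -b new-feature ../new-feature

printf '%s' "$results"
```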