GitBench
Exclude specific files from cleaning
Tests whether the model can exclude specific files from cleaning with git clean -e (long form: --exclude=<pattern>). Evaluates exclusion-pattern usage.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'tracked' > tracked.txt
  5. git add tracked.txt
  6. git commit -m 'Initial commit'
  7. echo 'important' > important.tmp
  8. echo 'junk1' > junk1.tmp
  9. echo 'junk2' > junk2.txt
Prompt
Remove all untracked files except important.tmp. Output ONLY the git command, nothing else.
Expected
git clean -f -e important.tmp
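The expected command can be tried locally in a throwaway repository. The sketch below replays the setup steps in a temporary directory, previews the deletion with a dry run, then runs the expected command. The `repo` variable and the use of `mktemp` are assumptions for illustration; the benchmark's actual working directory and harness are not shown on this page.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location (assumption; not part of the benchmark).
repo=$(mktemp -d)
cd "$repo"

# Setup steps from the task definition.
git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'tracked' > tracked.txt
git add tracked.txt
git commit -q -m 'Initial commit'
echo 'important' > important.tmp
echo 'junk1' > junk1.tmp
echo 'junk2' > junk2.txt

# Dry run: report what would be removed without deleting anything.
git clean -n -e important.tmp

# Expected answer: force-remove untracked files, skipping important.tmp.
git clean -f -e important.tmp

# List what is left in the working tree.
ls
```

Since the setup creates no untracked directories, adding -d should make no difference here, which is consistent with both the -f and -fd answers passing below.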
deepseek/deepseek-v4-flash:high PASS 100% 69 in → 333 out (322 reasoning)
git clean -f -e important.tmp
deepseek/deepseek-v4-flash:high__json_schema PASS 100% 71 in → 322 out (303 reasoning)
git clean -f -e important.tmp
JSON Schema Structured Output
(raw) { "command": "git clean -f -e important.tmp" }
deepseek/deepseek-v4-flash:none PASS 100% 69 in → 10 out (0 reasoning)
git clean -f -e important.tmp
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 69 in → 18 out (0 reasoning)
git clean -f -e important.tmp
JSON Schema Structured Output
(raw) { "command": "git clean -f -e important.tmp" }
mistralai/devstral-2512__json_schema PASS 100% 68 in → 15 out
git clean -f -e important.tmp
JSON Schema Structured Output
(raw) {"command": "git clean -f -e important.tmp"}
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 81 in → 440 out (462 reasoning)
git clean -fd --exclude=important.tmp
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 81 in → 677 out (712 reasoning)
git clean -fd --exclude=important.tmp .
JSON Schema Structured Output
(raw) { "command": "git clean -fd --exclude=important.tmp ." }
nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 81 in → 20 out (0 reasoning)
git clean -fd --dry-run && git clean -fd --exclude=important.tmp
poolside/laguna-xs.2:high PASS 100% 111 in → 218 out (207 reasoning)
git clean -fd -e important.tmp
poolside/laguna-xs.2:high__json_schema PASS 100% 111 in → 326 out (306 reasoning)
git clean -f -e important.tmp
JSON Schema Structured Output
(raw) { "command": "git clean -f -e important.tmp" }
nvidia/nemotron-3-nano-30b-a3b:none__json_schema FAIL 75% 81 in → 13 out (0 reasoning)
git clean -fd
JSON Schema Structured Output
(raw) { "command": "git clean -fd" }
Failure: Failed assertions: [{'type': 'file_exists', 'path': 'important.tmp'}]
mistralai/devstral-2512 FAIL 50% 68 in → 15 out
```bash git clean -f -e important.tmp ```
Failure: Failed assertions: [{'type': 'file_not_exists', 'path': 'junk1.tmp'}, {'type': 'file_not_exists', 'path': 'junk2.txt'}]
poolside/laguna-xs.2:none FAIL 50% 111 in → 11 out (0 reasoning)
git clean -D --exclude=important.tmp
Failure: Failed assertions: [{'type': 'file_not_exists', 'path': 'junk1.tmp'}, {'type': 'file_not_exists', 'path': 'junk2.txt'}]
poolside/laguna-xs.2:none__json_schema FAIL 50% 111 in → 19 out (0 reasoning)
git clean -i --exclude=important.tmp
JSON Schema Structured Output
(raw) { "command": "git clean -i --exclude=important.tmp" }
Failure: Failed assertions: [{'type': 'file_not_exists', 'path': 'junk1.tmp'}, {'type': 'file_not_exists', 'path': 'junk2.txt'}]
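The failure lines above name two assertion types, file_exists and file_not_exists. The reported scores are consistent with four equally weighted assertions (one failure gives 75%, two give 50%), but the page only lists the ones that failed. The sketch below is one way such a check could be scored; the function names `check` and `score`, the equal weighting, and the fourth assertion on tracked.txt are all assumptions, not the harness's actual implementation.

```shell
#!/bin/sh
# Hypothetical assertion scorer; names and weighting are assumptions.

check() {
  # $1 = assertion type, $2 = path
  case "$1" in
    file_exists)     [ -e "$2" ] ;;
    file_not_exists) [ ! -e "$2" ] ;;
    *)               return 2 ;;   # unknown assertion type
  esac
}

score() {
  # Reads "type path" pairs from stdin and prints the percentage that passed.
  total=0
  passed=0
  while read -r type path; do
    total=$((total + 1))
    if check "$type" "$path"; then
      passed=$((passed + 1))
    fi
  done
  echo $((passed * 100 / total))
}

# Simulate the working tree left behind by a command that deleted nothing.
work=$(mktemp -d)
cd "$work"
touch tracked.txt important.tmp junk1.tmp junk2.txt

# The assertion on tracked.txt is a guess; the other three appear in the failure lines.
result=$(score <<'EOF'
file_exists important.tmp
file_exists tracked.txt
file_not_exists junk1.tmp
file_not_exists junk2.txt
EOF
)
echo "$result"
```

Under these assumptions, a command that removes nothing leaves both junk files in place and scores 50, matching the three 50% rows, while a command that also removes important.tmp fails only the file_exists check and scores 75.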