GitBench
Clean untracked files from a subdirectory only
Tests the ability to remove untracked files from a single subdirectory without touching untracked files elsewhere in the working tree. Evaluates path-limited cleanup scoping.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. mkdir -p frontend/dist backend/dist
  5. echo 'index' > frontend/index.html
  6. git add frontend backend
  7. git commit -m 'Initial commit'
  8. echo 'bundle' > frontend/dist/bundle.js
  9. echo 'binary' > backend/dist/app
  10. echo 'root' > scratch.txt
Prompt
Clean only the frontend/dist directory of untracked files. Output ONLY the git command, nothing else.
Expected
git clean -f frontend/dist/
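To make the scoping concrete, here is a minimal sketch that replays the setup in a throwaway directory, previews the cleanup with a dry run, and then runs the expected command. It assumes a reasonably recent Git and that mktemp is available. The -q flags and the final checks are additions for illustration, not part of the benchmark.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing real is deleted (assumption: mktemp exists).
repo=$(mktemp -d)
cd "$repo"

# Setup steps from the list above (-q added only to reduce noise).
git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'
mkdir -p frontend/dist backend/dist
echo 'index' > frontend/index.html
git add frontend backend
git commit -q -m 'Initial commit'
echo 'bundle' > frontend/dist/bundle.js
echo 'binary' > backend/dist/app
echo 'root' > scratch.txt

# Dry run: -n lists what would be removed without deleting anything.
git clean -n frontend/dist/

# The expected answer: force-clean, limited to the given path.
git clean -f frontend/dist/

# Illustrative checks: only the file under frontend/dist should be gone.
[ ! -e frontend/dist/bundle.js ]
[ -f backend/dist/app ]
[ -f scratch.txt ]
[ -f frontend/index.html ]
echo "cleanup was limited to frontend/dist"
```

Because a path is given, the cleanup stays inside frontend/dist, and the untracked files in backend/dist and at the repository root are left alone. That is the property the assertions in this test appear to check.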
deepseek/deepseek-v4-flash:high PASS 100% 67 in → 641 out (632 reasoning)
git clean -f frontend/dist/
deepseek/deepseek-v4-flash:high__json_schema PASS 100% 67 in → 206 out (190 reasoning)
git clean -f frontend/dist/
JSON Schema Structured Output
(raw) { "command": "git clean -f frontend/dist/" }
deepseek/deepseek-v4-flash:none PASS 100% 67 in → 9 out (0 reasoning)
git clean -f frontend/dist/
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 67 in → 15 out (0 reasoning)
git clean -f frontend/dist/
JSON Schema Structured Output
(raw) { "command": "git clean -f frontend/dist/" }
mistralai/devstral-2512__json_schema PASS 100% 63 in → 13 out
git clean -f frontend/dist/
JSON Schema Structured Output
(raw) {"command": "git clean -f frontend/dist/"}
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 76 in → 305 out (334 reasoning)
git clean -fdx frontend/dist/
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 76 in → 516 out (571 reasoning)
git clean -fd -- frontend/dist
JSON Schema Structured Output
(raw) { "command": "git clean -fd -- frontend/dist" }
nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 76 in → 9 out (0 reasoning)
git clean -fdx frontend/dist/
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 76 in → 14 out (0 reasoning)
git clean -fd frontend/dist
JSON Schema Structured Output
(raw) { "command": "git clean -fd frontend/dist" }
poolside/laguna-xs.2:high PASS 100% 113 in → 151 out (141 reasoning)
git clean -f frontend/dist/
poolside/laguna-xs.2:high__json_schema PASS 100% 113 in → 207 out (189 reasoning)
git clean -f frontend/dist/
JSON Schema Structured Output
(raw) { "command": "git clean -f frontend/dist/" }
poolside/laguna-xs.2:none PASS 100% 113 in → 8 out (0 reasoning)
git clean -f frontend/dist
poolside/laguna-xs.2:none__json_schema PASS 100% 113 in → 11 out (0 reasoning)
git clean -fd frontend/dist
JSON Schema Structured Output
(raw) {"command":"git clean -fd frontend/dist"}
mistralai/devstral-2512 FAIL 75% 63 in → 12 out
```bash git clean -fd frontend/dist/ ```
Failure: Failed assertions: [{'type': 'file_not_exists', 'path': 'frontend/dist/bundle.js'}]
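One possible reading of this failure, offered as an assumption rather than something the page states: the raw answer is wrapped in a Markdown code fence, so a harness that executes the output verbatim would never run git clean, and frontend/dist/bundle.js would stay in place. The sketch below shows one way such output could be normalized before execution. The strip_fence helper is hypothetical and not part of GitBench.

```shell
#!/bin/sh
set -eu

# Build the three-backtick fence marker from octal escapes (\140 is a backtick)
# so this script never has to contain a literal fence.
fence=$(printf '\140\140\140')

# Hypothetical helper: join lines, then drop an opening fence (with an optional
# language tag) and a closing fence, along with the whitespace around them.
strip_fence() {
  printf '%s' "$1" \
    | tr '\n' ' ' \
    | sed -e "s/^${fence}[A-Za-z]*[[:space:]]*//" \
          -e "s/[[:space:]]*${fence}\$//"
}

# The failing raw answer, as it is displayed on this page.
raw="${fence}bash git clean -fd frontend/dist/ ${fence}"

cleaned=$(strip_fence "$raw")
echo "$cleaned"
```

Whether a harness should normalize answers at all is a design choice. The prompt asks for only the git command, so scoring a fenced answer as a failure is defensible.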