GitBench
Update submodule to latest
Tests the ability to update a submodule to its latest remote commit. Evaluates the submodule-update workflow.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'App' > app.txt
  5. git add app.txt
  6. git commit -m 'Initial commit'
  7. mkdir -p ../lib-src && cd ../lib-src && git init && git config user.email 'test@test.com' && git config user.name 'Test User' && echo 'lib v1' > lib.py && git add lib.py && git commit -m 'Initial lib'
  8. git clone --bare ../lib-src ../lib-bare
  9. git config protocol.file.allow always
  10. git -c protocol.file.allow=always submodule add ../lib-bare lib
  11. git add .gitmodules lib
  12. git commit -m 'Add lib submodule'
  13. cd ../lib-src && echo 'lib v2' > lib.py && git add lib.py && git commit -m 'Update lib to v2' && git push ../lib-bare HEAD:$(git branch --show-current)
Prompt
The remote submodule has been updated. Run the command to update the submodule to the latest remote version. Output ONLY the git command, nothing else.
Expected
git submodule update --remote
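To see why the expected command satisfies the check on `lib/lib.py`, the setup can be replayed in a scratch directory. The following is a minimal sketch, not the harness itself. It assumes each setup command runs from the superproject directory; the `work` directory layout and the `ident` helper are hypothetical names introduced here. The extra `-c protocol.file.allow=always` on the final command is a defensive assumption for Git versions that restrict the file transport inside submodule operations; it is not part of the expected answer.

```shell
#!/bin/sh
set -eu

# Scratch area so the sketch never touches a real repository (hypothetical layout).
work=$(mktemp -d)
mkdir "$work/app"

# Hypothetical helper: apply the throwaway identity used by the setup commands.
ident() { git config user.email 'test@test.com' && git config user.name 'Test User'; }

# Steps 1-6: superproject with a single commit.
(
  cd "$work/app"
  git init -q
  ident
  echo 'App' > app.txt
  git add app.txt
  git commit -q -m 'Initial commit'
)

# Steps 7-8: library source repository plus a bare clone acting as the remote.
(
  cd "$work"
  mkdir lib-src
  cd lib-src
  git init -q
  ident
  echo 'lib v1' > lib.py
  git add lib.py
  git commit -q -m 'Initial lib'
  cd ..
  git clone -q --bare lib-src lib-bare
)

# Steps 9-12: register the bare clone as submodule "lib" (pinned at lib v1).
(
  cd "$work/app"
  git config protocol.file.allow always
  git -c protocol.file.allow=always submodule add ../lib-bare lib
  git add .gitmodules lib
  git commit -q -m 'Add lib submodule'
)

# Step 13: publish lib v2 to the remote; the superproject still points at v1.
(
  cd "$work/lib-src"
  echo 'lib v2' > lib.py
  git add lib.py
  git commit -q -m 'Update lib to v2'
  git push -q ../lib-bare "HEAD:$(git branch --show-current)"
)

# The expected answer: fetch the submodule's remote and check out its latest commit.
(
  cd "$work/app"
  git -c protocol.file.allow=always submodule update --remote
)

# Show the file the benchmark's assertion inspects.
cat "$work/app/lib/lib.py"
```

Variants such as `git submodule update --remote lib` name the submodule path explicitly; with a single submodule they target the same checkout as the bare form.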
deepseek/deepseek-v4-flash:high PASS 100% 92 in → 138 out (129 reasoning)
git submodule update --remote lib
deepseek/deepseek-v4-flash:none PASS 100% 90 in → 8 out (0 reasoning)
git submodule update --remote lib
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 89 in → 17 out (0 reasoning)
git submodule update --remote --init
JSON Schema Structured Output
(raw) { "command": "git submodule update --remote --init" }
mistralai/devstral-2512__json_schema PASS 100% 99 in → 13 out
git submodule update --remote lib
JSON Schema Structured Output
(raw) {"command": "git submodule update --remote lib"}
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 108 in → 350 out (411 reasoning)
git submodule update --remote
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 112 in → 138 out (152 reasoning)
git submodule update --remote --checkout
JSON Schema Structured Output
(raw) { "command": "git submodule update --remote --checkout" }
nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 114 in → 8 out (0 reasoning)
git submodule update --remote lib
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 112 in → 17 out (0 reasoning)
git submodule update --remote --checkout
JSON Schema Structured Output
(raw) { "command": "git submodule update --remote --checkout" }
poolside/laguna-xs.2:high PASS 100% 147 in → 352 out (343 reasoning)
git submodule update --remote lib
poolside/laguna-xs.2:high__json_schema PASS 100% 152 in → 210 out (198 reasoning)
git submodule update --remote
JSON Schema Structured Output
(raw) {"command": "git submodule update --remote"}
poolside/laguna-xs.2:none PASS 100% 147 in → 8 out (0 reasoning)
git submodule update --remote lib
poolside/laguna-xs.2:none__json_schema PASS 100% 148 in → 12 out (0 reasoning)
git submodule update --remote lib
JSON Schema Structured Output
(raw) {"command": "git submodule update --remote lib"}
Invalid JSON. Output: gitsubmodule update --remote lib
JSON Schema Structured Output
Structured Output Error
Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
Failure: Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
mistralai/devstral-2512 FAIL 0% 99 in → 13 out
```bash git submodule update --remote lib ```
Failure: Failed assertions: [{'type': 'file_content', 'path': 'lib/lib.py', 'value': 'lib v2'}]
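The failed assertion above names a `file_content` check on `lib/lib.py`. A rough sketch of how such a check could be evaluated is given below. The report does not say how the harness implements it, so the `check_file_content` helper and its substring-match semantics are assumptions. The demo models a submodule that was never updated, which is plausibly what happens when the model's output arrives wrapped in a Markdown fence instead of as a bare command.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeed when the file at $1 exists and contains $2
# as a fixed string. Substring matching is an assumption; the real harness
# may require an exact match.
check_file_content() {
  [ -f "$1" ] && grep -qF -- "$2" "$1"
}

# Model a working tree whose submodule is still at the old commit.
demo=$(mktemp -d)
mkdir "$demo/lib"
echo 'lib v1' > "$demo/lib/lib.py"

# Evaluate the assertion from the failure message against that tree.
if check_file_content "$demo/lib/lib.py" 'lib v2'; then
  result=PASS
else
  result=FAIL
fi
echo "$result"   # prints FAIL, because the file still says 'lib v1'
```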