GitBench
Add multiple submodules
Tests ability to add multiple submodules to a repository. Evaluates multi-submodule management.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'App' > app.txt
  5. git add app.txt
  6. git commit -m 'Initial commit'
  7. mkdir -p ../lib-src && cd ../lib-src && git init && git config user.email 'test@test.com' && git config user.name 'Test User' && echo 'lib code' > lib.py && git add lib.py && git commit -m 'Initial lib'
  8. git clone --bare ../lib-src ../lib-bare
  9. mkdir -p ../utils-src && cd ../utils-src && git init && git config user.email 'test@test.com' && git config user.name 'Test User' && echo 'utils code' > utils.py && git add utils.py && git commit -m 'Initial utils'
  10. git clone --bare ../utils-src ../utils-bare
  11. git config protocol.file.allow always
Prompt
Add the repository at '../lib-bare' as submodule 'lib' and '../utils-bare' as submodule 'utils'. Output ONLY the commands to run, one per line.
Expected
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
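To try the task locally, the setup steps and the expected commands can be combined into one script. The following is a minimal sketch, not the benchmark's own harness. It assumes a scratch directory created with `mktemp`, a hypothetical helper `make_bare`, and subshells so that each `cd` stays local to its step. It also passes `protocol.file.allow` with `-c` on each `git submodule add` call instead of relying on the repository-level `git config` from the setup. That is a choice made here, not something the benchmark specifies.

```shell
#!/bin/sh
# Sketch: rebuild the fixture in a scratch directory, run the expected
# commands, and read back the files the assertions look at.
set -eu

work=$(mktemp -d)        # assumption: any empty scratch directory works
mkdir "$work/repo"

# Hypothetical helper: create a one-file source repo plus a bare clone of it.
make_bare() {
  name=$1 file=$2 content=$3
  (
    mkdir -p "$work/$name-src" && cd "$work/$name-src"
    git init -q
    git config user.email 'test@test.com'
    git config user.name 'Test User'
    echo "$content" > "$file"
    git add "$file"
    git commit -q -m "Initial $name"
  )
  git clone -q --bare "$work/$name-src" "$work/$name-bare"
}

make_bare lib lib.py 'lib code'
make_bare utils utils.py 'utils code'

# The main repository, as in the first six setup steps.
cd "$work/repo"
git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'App' > app.txt
git add app.txt
git commit -q -m 'Initial commit'

# Assumption: allow the file protocol per command via -c.
git -c protocol.file.allow=always submodule add ../lib-bare lib
git -c protocol.file.allow=always submodule add ../utils-bare utils

# Read back the contents the benchmark's file_content assertions check.
lib_content=$(cat lib/lib.py)
utils_content=$(cat utils/utils.py)
echo "lib: $lib_content / utils: $utils_content"
```

Because the main repository has no `origin` remote, Git resolves the relative URLs `../lib-bare` and `../utils-bare` against the repository's own location, which is why the bare clones sit next to `repo` in the scratch directory.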
deepseek/deepseek-v4-flash:high PASS 100% 48 in → 125 out (104 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
deepseek/deepseek-v4-flash:high__json_schema PASS 100% 50 in → 126 out (86 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils" ] }
deepseek/deepseek-v4-flash:none PASS 100% 48 in → 20 out (0 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 48 in → 35 out (0 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils" ] }
mistralai/devstral-2512__json_schema PASS 100% 47 in → 34 out
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils" ] }
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 60 in → 159 out (144 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 60 in → 1,061 out (1,167 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils" ] }
nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 60 in → 25 out (0 reasoning)
git init
git submodule add '../lib-bare' lib
git submodule add '../utils-bare' utils
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 60 in → 39 out (0 reasoning)
git submodule add '../lib-bare' 'lib'
git submodule add '../utils-bare' 'utils'
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add '../lib-bare' 'lib'", "git submodule add '../utils-bare' 'utils'" ] }
poolside/laguna-xs.2:high PASS 100% 93 in → 117 out (97 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
poolside/laguna-xs.2:high__json_schema PASS 100% 93 in → 93 out (55 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils" ] }
poolside/laguna-xs.2:none PASS 100% 93 in → 19 out (0 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
poolside/laguna-xs.2:none__json_schema PASS 100% 93 in → 24 out (0 reasoning)
git submodule add ../lib-bare lib
git submodule add ../utils-bare utils
JSON Schema Structured Output
(raw) {"commands": ["git submodule add ../lib-bare lib", "git submodule add ../utils-bare utils"]}
mistralai/devstral-2512 FAIL 0% 47 in → 25 out
```bash git submodule add ../lib-bare lib git submodule add ../utils-bare utils ```
Failure: Failed assertions: [{'type': 'dir_exists', 'path': 'lib'}, {'type': 'dir_exists', 'path': 'utils'}, {'type': 'file_content', 'path': 'lib/lib.py', 'value': 'lib code'}, {'type': 'file_content', 'path': 'utils/utils.py', 'value': 'utils code'}]
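The failing response contains the same two commands as the passing ones and differs only in its Markdown fence. One plausible reading, offered as an assumption because the harness itself is not shown, is that the fence lines were not removed before the commands ran, so neither submodule directory was created. A minimal sketch of stripping fence lines from a response before execution follows. `strip_fences` is a hypothetical helper, not part of GitBench.

```shell
#!/bin/sh
# Hypothetical helper (not part of GitBench): delete any line that opens or
# closes a Markdown code fence, leaving only the commands.
strip_fences() {
  sed '/^[[:space:]]*```/d'
}

# Rebuild the failing response line by line; fence holds the three backticks.
fence='```'
response=$(printf '%s\n' \
  "${fence}bash" \
  'git submodule add ../lib-bare lib' \
  'git submodule add ../utils-bare utils' \
  "$fence")

# Only the two command lines should survive.
cleaned=$(printf '%s\n' "$response" | strip_fences)
printf '%s\n' "$cleaned"
```

Filtering by line keeps the sketch independent of the language tag after the opening fence (`bash`, `sh`, or none), at the cost of also dropping any legitimate line that starts with three backticks.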