GitBench
Add submodule and commit
Tests the ability to add a submodule and commit the change, evaluating the end-to-end submodule-addition workflow.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'App' > app.txt
  5. git add app.txt
  6. git commit -m 'Initial commit'
  7. mkdir -p ../lib-src && cd ../lib-src && git init && git config user.email 'test@test.com' && git config user.name 'Test User' && echo 'lib code' > lib.py && git add lib.py && git commit -m 'Initial lib'
  8. git clone --bare ../lib-src ../lib-bare
  9. git config protocol.file.allow always
Prompt
Add the repository at '../lib-bare' as a git submodule at path 'lib', then add and commit the changes. Output ONLY the commands to run, one per line.
Expected
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m 'Add lib submodule'
mistralai/devstral-2512__json_schema PASS 100% 46 in → 44 out
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m "Add lib submodule"
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git add .gitmodules lib", "git commit -m \"Add lib submodule\"" ] }
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 59 in → 301 out (285 reasoning)
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m "Add lib submodule"
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git add .gitmodules lib", "git commit -m \"Add lib submodule\"" ] }
deepseek/deepseek-v4-flash:high FAIL 66.7% 47 in → 275 out (257 reasoning)
git submodule add ../lib-bare lib
git add .
git commit -m "Add submodule lib"
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
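The failure records list assertions as dictionaries with a `type` field (`file_exists`, `dir_exists`, `git_output`). Records scored 0% list three failed assertions and records scored 66.7% list one, which is consistent with a score equal to the share of three assertions that pass. The sketch below shows one way a harness might evaluate such assertions. The names `run_in_repo`, `check_assertion`, and `score`, and the injectable `run` parameter, are hypothetical and not taken from GitBench itself.

```python
import os
import shlex
import subprocess


def run_in_repo(command, repo_dir):
    """Run a command inside repo_dir and return its stdout (assumed runner)."""
    result = subprocess.run(
        shlex.split(command), cwd=repo_dir, capture_output=True, text=True
    )
    return result.stdout


def check_assertion(assertion, repo_dir, run=run_in_repo):
    """Return True if a single assertion dict holds for repo_dir."""
    kind = assertion["type"]
    if kind == "file_exists":
        return os.path.isfile(os.path.join(repo_dir, assertion["path"]))
    if kind == "dir_exists":
        return os.path.isdir(os.path.join(repo_dir, assertion["path"]))
    if kind == "git_output":
        # Substring match against the command's output, as 'contains' suggests.
        return assertion["contains"] in run(assertion["command"], repo_dir)
    raise ValueError(f"unknown assertion type: {kind}")


def score(assertions, repo_dir, run=run_in_repo):
    """Return (percent passed, rounded to one decimal; list of failed assertions)."""
    failed = [a for a in assertions if not check_assertion(a, repo_dir, run)]
    passed = len(assertions) - len(failed)
    return round(100 * passed / len(assertions), 1), failed
```

Under this reading, a run that creates `.gitmodules` and `lib` but commits with a message such as "Add submodule lib" passes two of three checks, because the substring "Add lib submodule" does not appear in the log line.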
deepseek/deepseek-v4-flash:none__json_schema FAIL 66.7% 49 in → 49 out (0 reasoning)
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m 'Add lib-bare as submodule at lib'
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git add .gitmodules lib", "git commit -m 'Add lib-bare as submodule at lib'" ] }
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
nvidia/nemotron-3-nano-30b-a3b:high FAIL 66.7% 59 in → 2,567 out (2,888 reasoning)
git submodule add ../lib-bare lib
git commit -m "Add submodule lib"
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
nvidia/nemotron-3-nano-30b-a3b:none FAIL 66.7% 59 in → 10 out (0 reasoning)
git submodule add ../lib-bare lib
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
nvidia/nemotron-3-nano-30b-a3b:none__json_schema FAIL 66.7% 59 in → 46 out (0 reasoning)
git submodule add ../lib-bare lib
git add .
git commit -m "Add lib-bare submodule at path lib"
JSON Schema Structured Output
(raw) { "commands": [ "git submodule add ../lib-bare lib", "git add .", "git commit -m \"Add lib-bare submodule at path lib\"" ] }
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
poolside/laguna-xs.2:high FAIL 66.7% 93 in → 271 out (246 reasoning)
git submodule add ../lib-bare lib
git add .
git commit -m "Add lib as submodule"
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
poolside/laguna-xs.2:high__json_schema FAIL 66.7% 93 in → 165 out (134 reasoning)
git submodule add ../lib-bare lib
git add .
git commit -m "Add lib as submodule"
JSON Schema Structured Output
(raw) {"commands": ["git submodule add ../lib-bare lib", "git add .", "git commit -m \"Add lib as submodule\""]}
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
poolside/laguna-xs.2:none FAIL 66.7% 93 in → 32 out (0 reasoning)
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m "Add lib-bare as submodule at lib"
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
poolside/laguna-xs.2:none__json_schema FAIL 66.7% 93 in → 35 out (0 reasoning)
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m "Add lib as a submodule"
JSON Schema Structured Output
(raw) {"commands": ["git submodule add ../lib-bare lib", "git add .gitmodules lib", "git commit -m \"Add lib as a submodule\""]}
Failure: Failed assertions: [{'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
Invalid JSON. Output: gitsubmodule add ../lib-bare lib git add lib git commit -m "Add lib submodule"
JSON Schema Structured Output
Structured Output Error
Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
Failure: Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)
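The error text above matches the message Python's `json` module produces when the input does not start with a JSON value. As a minimal sketch, assuming the structured responses follow the `{"commands": [...]}` shape shown in the raw outputs, parsing might look like the following. The function name `parse_commands` and the second error message are hypothetical.

```python
import json


def parse_commands(raw):
    """Parse a structured response of the form {"commands": [...]}.

    Returns (commands, error); exactly one of the two is None.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        # str(exc) looks like "Expecting value: line 1 column 1 (char 0)".
        return None, f"Failed to parse structured JSON response: {exc}"
    commands = payload.get("commands") if isinstance(payload, dict) else None
    if not isinstance(commands, list) or not all(
        isinstance(c, str) for c in commands
    ):
        return None, "Structured response has no 'commands' list of strings"
    return commands, None
```

A plain-text reply such as `gitsubmodule add ../lib-bare lib` fails at the first character, so none of its commands are extracted or run.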
deepseek/deepseek-v4-flash:none FAIL 0% 47 in → 33 out (0 reasoning)
```bash
git submodule add ../lib-bare lib
git add .
git commit -m "Add lib-bare as submodule at lib"
```
Failure: Failed assertions: [{'type': 'file_exists', 'path': '.gitmodules'}, {'type': 'dir_exists', 'path': 'lib'}, {'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
mistralai/devstral-2512 FAIL 0% 46 in → 37 out
```bash
git submodule add ../lib-bare lib
git add .gitmodules lib
git commit -m "Add lib-bare as submodule at lib"
```
Failure: Failed assertions: [{'type': 'file_exists', 'path': '.gitmodules'}, {'type': 'dir_exists', 'path': 'lib'}, {'type': 'git_output', 'command': 'git log --oneline -1', 'contains': 'Add lib submodule'}]
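The two 0% records wrap their commands in a Markdown code fence and fail even the `file_exists` and `dir_exists` checks, while similar unfenced commands elsewhere on this page score 66.7%. One possible explanation, which is an assumption and not something the page states, is that the fence lines are not stripped before the output is run. The sketch below shows how a more tolerant extractor could skip fence lines. The function name `extract_commands` is hypothetical and does not describe GitBench's actual behavior.

```python
def extract_commands(output):
    """Return non-empty command lines, skipping Markdown code-fence lines."""
    commands = []
    for line in output.splitlines():
        stripped = line.strip()
        # Skip blank lines and fence markers such as ``` or ```bash.
        if not stripped or stripped.startswith("```"):
            continue
        commands.append(stripped)
    return commands
```

Since the prompt asks for ONLY the commands, a strict harness may reasonably treat fenced output as a failure, so whether to apply such stripping is a design choice.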