GitBench
Single file added
Tests the ability to generate a concise commit message for a diff that adds a single new file (hello.txt). This is the baseline case: commit message generation from a trivial diff.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. echo 'Hello, World!' > hello.txt
  5. git add hello.txt
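As a rough illustration, the setup steps above could be replayed in a scratch directory as follows. This is a minimal sketch, not the benchmark's actual harness: the names `SETUP_COMMANDS` and `run_setup` are hypothetical, and the code assumes a POSIX shell with `git` available on the PATH.

```python
import subprocess
from pathlib import Path

# The five setup commands listed above, in order.
SETUP_COMMANDS = [
    "git init",
    "git config user.email 'test@test.com'",
    "git config user.name 'Test User'",
    "echo 'Hello, World!' > hello.txt",
    "git add hello.txt",
]


def run_setup(commands, workdir):
    """Run each command in workdir through the shell, stopping at the first failure.

    Hypothetical helper for illustration; not part of GitBench itself.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    for command in commands:
        # shell=True is needed for the output redirection in the echo step.
        subprocess.run(command, shell=True, cwd=workdir, check=True)
    return workdir
```

Calling `run_setup(SETUP_COMMANDS, some_empty_directory)` should leave hello.txt staged in a fresh repository. Running `git diff --cached` there would then presumably show the kind of diff the model is asked to describe.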
Prompt
Generate a concise commit message (max 50 characters) describing the changes in this diff. Focus on the action taken (e.g., 'Add', 'Fix', 'Update').
Expected
Add hello.txt with greeting message
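The constraints stated in the prompt (at most 50 characters, leading with the action taken) can be checked mechanically. The sketch below is one possible way to do that, under assumptions: `check_commit_message` and its constants are hypothetical names, the verb list contains only the three examples the prompt gives, and the source does not describe how GitBench actually scores responses.

```python
MAX_SUBJECT_LENGTH = 50  # limit stated in the prompt
ACTION_VERBS = ("Add", "Fix", "Update")  # examples from the prompt; not exhaustive


def check_commit_message(message):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical checker for illustration; not GitBench's scoring logic.
    """
    text = message.strip()
    problems = []
    if "\n" in text:
        problems.append("message spans more than one line")
    if len(text) > MAX_SUBJECT_LENGTH:
        problems.append(f"message is longer than {MAX_SUBJECT_LENGTH} characters")
    words = text.split()
    first_word = words[0] if words else ""
    if first_word not in ACTION_VERBS:
        problems.append("message does not start with one of the example action verbs")
    return problems
```

The expected message, "Add hello.txt with greeting message", is 35 characters long and starts with "Add", so this check returns an empty list for it.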
deepseek/deepseek-v4-flash:none PASS 99% 100 in → 4 out (0 reasoning)
Add hello.txt
deepseek/deepseek-v4-flash:high__json_schema PASS 95.7% 100 in → 126 out (113 reasoning)
Add hello.txt
JSON Schema Structured Output
(raw) {"commit_message":"Add hello.txt"}
nvidia/nemotron-3-nano-30b-a3b:none PASS 95.3% 122 in → 4 out (0 reasoning)
Add hello.txt
poolside/laguna-xs.2:high__json_schema PASS 93.7% 155 in → 227 out (216 reasoning)
Add hello.txt
JSON Schema Structured Output
(raw) {"commit_message": "Add hello.txt"}
deepseek/deepseek-v4-flash:high PASS 93.3% 100 in → 133 out (128 reasoning)
Add hello.txt
mistralai/devstral-2512 PASS 93.3% 109 in → 5 out
Add hello.txt file
mistralai/devstral-2512__json_schema PASS 93.3% 109 in → 11 out
Add hello.txt file
JSON Schema Structured Output
(raw) {"commit_message": "Add hello.txt file"}
poolside/laguna-xs.2:high PASS 93% 155 in → 197 out (191 reasoning)
Add hello.txt
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 92.7% 122 in → 15 out (0 reasoning)
Add hello.txt with greeting
JSON Schema Structured Output
(raw) { "commit_message": "Add hello.txt with greeting" }
poolside/laguna-xs.2:none__json_schema PASS 91.3% 155 in → 17 out (0 reasoning)
Add hello.txt with greeting
JSON Schema Structured Output
(raw) { "commit_message": "Add hello.txt with greeting" }
deepseek/deepseek-v4-flash:none__json_schema PASS 91% 100 in → 13 out (0 reasoning)
Add hello.txt with greeting
JSON Schema Structured Output
(raw) {"commit_message":"Add hello.txt with greeting"}
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 91% 122 in → 355 out (223 reasoning)
Add hello.txt with greeting
JSON Schema Structured Output
(raw) { "commit_message": "Add hello.txt with greeting" }
poolside/laguna-xs.2:none PASS 87% 155 in → 8 out (0 reasoning)
Add hello.txt with greeting message
nvidia/nemotron-3-nano-30b-a3b:high PASS 59.3% 122 in → 107 out (59 reasoning)
Given the diff, a new file `hello.txt` has been added with the content `Hello, World!`. The action is an addition. Concise commit message (max 50 characters, focusing on action): **Add hello.txt with welcome message**
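Each `(raw)` line in the JSON Schema entries above is a JSON object with a single `commit_message` field. A minimal sketch of recovering the message from such a string is shown below. The function name `extract_commit_message` is hypothetical, and only the `commit_message` key is assumed, since the full schema is not shown in the source.

```python
import json


def extract_commit_message(raw):
    """Parse a raw structured-output string and return its commit_message field.

    Hypothetical helper; raises ValueError if the input is not valid JSON or
    the field is not a string, and KeyError if the field is missing.
    """
    payload = json.loads(raw)
    message = payload["commit_message"]
    if not isinstance(message, str):
        raise ValueError("commit_message must be a string")
    return message.strip()
```

For example, `extract_commit_message('{"commit_message":"Add hello.txt"}')` returns `Add hello.txt`. Differences in whitespace between the raw outputs above do not affect the parsed result.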