GitBench
Stash with branch name in message
Tests the ability to identify a stash entry by its custom message, which embeds the branch name. Evaluates whether the stash message is parsed correctly to pick out the right entry.

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

  1. git init
  2. git config user.email 'test@test.com'
  3. git config user.name 'Test User'
  4. git checkout -b my-feature-branch
  5. echo 'Initial' > file.txt
  6. git add file.txt
  7. git commit -m 'Initial'
  8. echo 'Feature work' > file.txt
  9. git stash -m 'feature-on-my-feature-branch'
  10. echo 'Main work' > file.txt
  11. git add file.txt
  12. git commit -m 'Main work'
Prompt
You were on 'my-feature-branch' and stashed 'Feature work' with message 'feature-on-my-feature-branch'. Which stash entry contains this work? Provide the stash reference.
Expected
stash@{0}
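The expected answer can be checked by replaying the setup commands in a throwaway repository and looking up the stash by its message. The following is a minimal sketch, not part of the benchmark harness: the temporary directory, the `stash_ref` variable, and the `-q` flags are assumptions added for illustration, and `git stash push -m` is written out explicitly where the setup uses the shorthand `git stash -m`.

```shell
#!/bin/sh
set -eu

# Replay the setup in a scratch directory (location is arbitrary).
repo=$(mktemp -d)
cd "$repo"

git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'
git checkout -q -b my-feature-branch
echo 'Initial' > file.txt
git add file.txt
git commit -q -m 'Initial'
echo 'Feature work' > file.txt
# "push" is implied in the original setup command; spelled out here.
git stash push -q -m 'feature-on-my-feature-branch'
echo 'Main work' > file.txt
git add file.txt
git commit -q -m 'Main work'

# %gd is the short reflog selector (stash@{N}); %gs is the reflog subject,
# which for a stash is "On <branch>: <message>".
stash_ref=$(git stash list --format='%gd %gs' \
  | grep -F 'feature-on-my-feature-branch' \
  | cut -d' ' -f1)

echo "$stash_ref"
```

Because only one stash entry exists, the lookup prints `stash@{0}`. The later commit on the same branch does not renumber stash entries; only creating or dropping stashes changes the index.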
deepseek/deepseek-v4-flash:high PASS 100% 127 in → 87 out (71 reasoning)
The stash entry containing the work is `stash@{0}`.
deepseek/deepseek-v4-flash:none PASS 100% 127 in → 58 out (0 reasoning)
Based on the information provided, the stash entry that contains your "Feature work" is: ``` stash@{0} ``` This is the reference to the stash that you created with the message "feature-on-my-feature-branch" while on the `my-feature-branch`.
deepseek/deepseek-v4-flash:none__json_schema PASS 100% 127 in → 13 out (0 reasoning)
stash@{0}
JSON Schema Structured Output
(raw) { "stash": "stash@{0}" }
mistralai/devstral-2512 PASS 100% 137 in → 77 out
The stash entry containing your work is `stash@{0}`. This is the most recent stash (as indicated by the `0` index) and matches both the branch (`my-feature-branch`) and the stash message (`feature-on-my-feature-branch`) you provided. The diff also confirms it contains the "Feature work" changes you mentioned.
mistralai/devstral-2512__json_schema PASS 100% 137 in → 14 out
stash@{0}
JSON Schema Structured Output
(raw) { "stash": "stash@{0}" }
nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 150 in → 169 out (113 reasoning)
The stash entry that holds the work you stashed is: **`stash@{0}`** (It lives on the current branch `my-feature-branch` and was created with the message **“feature-on‑my‑feature‑branch”**.)
nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 150 in → 203 out (170 reasoning)
stash@{0}
JSON Schema Structured Output
(raw) { "stash": "stash@{0}" }
nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 150 in → 85 out (0 reasoning)
The stash entry that contains the work you described is: ``` stash@{0} – “feature-on-my-feature-branch” ``` In other words, `stash@{0}` (the entry shown by `git stash list` at the top) holds the changes you stashed under the message **“feature-on-my-feature-branch”** on the branch **my-feature-branch**.
nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 150 in → 14 out (0 reasoning)
stash@{0}
JSON Schema Structured Output
(raw) { "stash": "stash@{0}" }
poolside/laguna-xs.2:high PASS 100% 174 in → 245 out (184 reasoning)
The stash entry containing your work is **stash@{0}**. This is confirmed by the stash list output you provided, which shows: ``` stash@{0}: On my-feature-branch: feature-on-my-feature-branch ``` The stash reference is `stash@{0}`.
poolside/laguna-xs.2:high__json_schema PASS 100% 174 in → 165 out (154 reasoning)
stash@{0}
JSON Schema Structured Output
(raw) {"stash": "stash@{0}"}
poolside/laguna-xs.2:none PASS 100% 174 in → 85 out (0 reasoning)
Based on the git stash list you provided, the stash entry containing your 'Feature work' is: **stash@{0}** This is the most recent stash entry (index 0) and it contains the changes from when you were on 'my-feature-branch' with the message 'feature-on-my-feature-branch'. The diff shows your feature work modified file.txt from "Initial" to "Feature work".
poolside/laguna-xs.2:none__json_schema PASS 100% 174 in → 9 out (0 reasoning)
stash@{0}
JSON Schema Structured Output
(raw) {"stash":"stash@{0}"}
Invalid JSON. Output: The stash entry containing the work is **stash@{0}**.
JSON Schema Structured Output
Structured Output Error
Failed to parse structured JSON response: Expecting value: line 1 column 1 (char 0)