WIP commits - two consecutive WIP commits should be squashed
Tests the ability to identify that two consecutive WIP commits should be squashed into one meaningful commit. Evaluates recognition of work-in-progress patterns.
Baseline Repository
These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.
git init
git config user.email 'test@test.com'
git config user.name 'Test User'
echo '# Project' > README.md
git add README.md
git commit -m 'Initial commit'
echo 'print(1)' > main.py
git add main.py
git commit -m 'WIP: add main.py'
echo 'print(2)' >> main.py
git add main.py
git commit -m 'WIP: continue work'
echo 'print(3)' >> main.py
git add main.py
git commit -m 'Complete feature'
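To inspect the history these commands produce, the setup can be replayed in a scratch directory. The following is a minimal sketch, assuming git is available on the PATH. The temporary directory and the `subjects` variable are illustrative and not part of the benchmark harness. Abbreviated hashes will likely differ from run to run, because commit timestamps feed into each hash, so only the commit subjects are printed.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so no existing repository is touched.
repo=$(mktemp -d)
cd "$repo"

git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'

echo '# Project' > README.md
git add README.md
git commit -q -m 'Initial commit'

echo 'print(1)' > main.py
git add main.py
git commit -q -m 'WIP: add main.py'

echo 'print(2)' >> main.py
git add main.py
git commit -q -m 'WIP: continue work'

echo 'print(3)' >> main.py
git add main.py
git commit -q -m 'Complete feature'

# Oldest commit first, subjects only (hashes vary between runs).
subjects=$(git log --reverse --format=%s)
printf '%s\n' "$subjects"
```

The two WIP commits sit between the initial commit and 'Complete feature', which is the arrangement the prompt asks the model to recognize.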
Prompt
Analyze this git history and identify which consecutive commits should be marked as squash/fixup in an interactive rebase to create a cleaner history. Focus on WIP (Work In Progress) commits that represent incomplete work. Do not include the base/anchor commit they would be squashed into. Provide the commit hashes or commit messages in the order they should be combined. Output ONLY the selected commits, one per line, with no explanation and no base/anchor/final commits.
Expected
WIP: add main.py, WIP: continue work
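The expected answer can be derived mechanically for this particular baseline. The sketch below assumes that WIP commits are recognizable by a message line starting with "WIP", which holds here but not for histories in general. The `commit_file` helper and the `wip` variable are hypothetical names introduced for illustration. The helper appends with `>>`, which matches the baseline because each file does not exist before its first write.

```shell
#!/bin/sh
set -e

# Rebuild the baseline history in a throwaway directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'

# Hypothetical helper: append one line to a file and commit it.
# usage: commit_file <file> <line> <message>
commit_file() {
  echo "$2" >> "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit_file README.md '# Project' 'Initial commit'
commit_file main.py 'print(1)' 'WIP: add main.py'
commit_file main.py 'print(2)' 'WIP: continue work'
commit_file main.py 'print(3)' 'Complete feature'

# Keep only commits whose message has a line starting with "WIP",
# listed oldest first, one subject per line.
wip=$(git log --reverse --grep='^WIP' --format=%s)
printf '%s\n' "$wip"
```

Listing with `--reverse` keeps the commits in the order they were made, which is the order the prompt asks for. 'Initial commit' and 'Complete feature' are filtered out, in line with the instruction to omit base and final commits.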
Model Outputs (14)
993c9fd
6f7ab45
c24a1a9
76f3dcd
JSON Schema
Structured Output
(raw) {
"commits": ["c24a1a9", "76f3dcd"]
}
a5259f4
7b402ed
JSON Schema
Structured Output
(raw) {
"commits": [
"a5259f4",
"7b402ed"
]
}
6db6a69
d2990de
c367b0c
99dc37f
JSON Schema
Structured Output
(raw) {"commits": ["c367b0c", "99dc37f"]}
2d6563b
6c5436a
2d99955
10260d7
JSON Schema
Structured Output
(raw) {
"commits": [
"2d99955",
"10260d7"
]
}
c764b74 WIP: add main.py
6978e5a WIP: continue work
76a392c Complete feature
d74f567
03fc360
JSON Schema
Structured Output
(raw) {
"commits": [
"d74f567",
"03fc360"
]
}
6bb7322
2e51009
4d0d107 WIP: add main.py
606097f WIP: continue work
JSON Schema
Structured Output
(raw) {"commits": ["4d0d107 WIP: add main.py", "606097f WIP: continue work"]}
dd38fd1 WIP: continue work
21e1216 WIP: add main.py
490bd45
c228897
JSON Schema
Structured Output
(raw) {"commits": ["490bd45", "c228897"]}
(empty output)
Failure: Model 'deepseek/deepseek-v4-flash:none' violated the reasoning_level=none invariant: provider reported 1420 reasoning token(s)