Show commit touching multiple files
Tests the model's ability to inspect a commit that touches multiple files. Evaluates multi-file commit comprehension.
Baseline Repository
These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.
- 01 git init
- 02 git config user.email 'test@test.com'
- 03 git config user.name 'Test User'
- 04 echo 'src' > main.py
- 05 echo 'test' > test_main.py
- 06 echo 'docs' > README.md
- 07 git add main.py test_main.py README.md
- 08 git commit -m 'Add all project files'
Prompt
Using git show --stat, how many files were added in the commit 'Add all project files'? Output ONLY the number, nothing else.
Expected
3
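The expected value can be checked by hand. The following is a minimal sketch, not the benchmark's own harness: it rebuilds the baseline repository in a throwaway directory (the mktemp location is an assumption), looks the commit up by its subject line, prints the --stat summary the prompt asks about, and cross-checks the count with --name-only, which is not part of the original prompt.

```shell
# Rebuild the baseline repository in a temporary directory (assumed location).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email 'test@test.com'
git config user.name 'Test User'
echo 'src' > main.py
echo 'test' > test_main.py
echo 'docs' > README.md
git add main.py test_main.py README.md
git commit -q -m 'Add all project files'

# Locate the commit by its subject line.
commit=$(git log --format=%H --grep='Add all project files' -n 1)

# Print only the per-file stat lines and the "files changed" summary;
# the empty --format= suppresses the commit header.
git show --stat --format= "$commit"

# Cross-check: --name-only prints one path per line, so counting lines
# gives the number of files in the commit.
count=$(git show --name-only --format= "$commit" | wc -l | tr -d ' ')
echo "$count"
```

Because this is the root commit, every file in it is an addition, so the number of files changed equals the number of files added.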
Model Outputs (14)
3
3
JSON Schema
Structured Output
(raw) {"count":3}
3
3
3
JSON Schema
Structured Output
(raw) {"count": 3}
3
3
JSON Schema
Structured Output
(raw) {
"count": 3
}
3
3
JSON Schema
Structured Output
(raw) {
"count": 3
}
3
3
JSON Schema
Structured Output
(raw) {"count": 3}
3
3
JSON Schema
Structured Output
(raw) {"count": 3}
0
JSON Schema
Structured Output
(raw) {"count": 0}
Failure: Expected '3', got '0'
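The outputs above arrive either as a bare number or as a JSON object with a "count" field. How the benchmark actually compares them is not shown here; the sketch below is one hypothetical way to reduce both shapes to a bare number before comparing against the expected value. The helper names extract_count and check, and the "Pass" label, are assumptions made for illustration.

```shell
# Hypothetical helper: reduce a raw model output to a bare number.
extract_count() {
    # Join multi-line JSON onto one line so sed can match across it.
    raw=$(printf '%s' "$1" | tr -d '\n')
    case "$raw" in
        *'"count"'*)
            # Structured output: pull the integer that follows "count":
            printf '%s\n' "$raw" |
                sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
            ;;
        *)
            # Plain output: trim leading and trailing whitespace.
            printf '%s\n' "$raw" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
            ;;
    esac
}

# Hypothetical comparison: $1 is the expected value, $2 the raw output.
check() {
    got=$(extract_count "$2")
    if [ "$got" = "$1" ]; then
        echo "Pass"
    else
        echo "Failure: Expected '$1', got '$got'"
    fi
}

check 3 '3'
check 3 '{"count": 3}'
check 3 '{"count":0}'
```

Under these assumptions the first two calls pass and the third reports the same mismatch as the failing output above.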