f011 — git_log

Identify most changed file from stat output

Tests whether the model can identify the file with the most lines changed in a commit by parsing git log --stat output and comparing the per-file change counts.

medium git-log stat changed-files analysis

Baseline Repository

These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.

git init
git config user.email 'test@test.com'
git config user.name 'Test User'
printf 'line1\nline2\nline3\n' > small.txt
printf 'a\nb\n' > medium.txt
printf 'x\ny\nz\nw\nv\nu\nt\ns\nr\n' > large.txt
git add small.txt medium.txt large.txt
git commit -m 'Initial commit with three files'
printf 'line1\nline2\nline3\nline4\nline5\n' > small.txt
printf 'a\nb\nc\nd\ne\nf\ng\nh\ni\nj\nk\nl\nm\nn\no\n' > medium.txt
printf 'x\ny\nz\nw\nv\nu\nt\ns\nr\nq\np\no\nn\nm\nl\nk\nj\ni\nh\ng\nf\ne\nd\nc\nb\na\n' > large.txt
git add small.txt medium.txt large.txt
git commit -m 'Update all files'

Prompt

In the most recent commit, which file had the most lines changed according to git log --stat? Output ONLY the filename, nothing else.

Expected

large.txt
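
The expected answer can be checked mechanically. The sketch below is one possible way to do it, not the harness's own grader: it swaps `--stat` for `git log --numstat`, which prints the same per-file counts in a tab-separated form (added, deleted, path) that is easier to parse. The helper name `most_changed` is hypothetical and introduced only for illustration.

```shell
# most_changed (hypothetical helper): read `git log --numstat` lines
# (added<TAB>deleted<TAB>path) on stdin and print the path whose
# added + deleted total is largest.
most_changed() {
  awk -F'\t' '
    BEGIN { max = -1 }
    # Skip blank lines and binary entries, which numstat reports as "-".
    NF == 3 && $1 != "-" {
      total = $1 + $2
      if (total > max) { max = total; file = $3 }
    }
    END { if (file != "") print file }
  '
}
```

Inside the baseline repository it would be run as `git log -1 --numstat --format= | most_changed`. Assuming each printf in the setup writes one entry per line, the second commit only appends lines: 2 to small.txt, 13 to medium.txt, and 17 to large.txt, so large.txt has the largest total.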

Model Outputs (14)

deepseek/deepseek-v4-flash:high PASS 100% 349 in → 68 out (64 reasoning)

large.txt

deepseek/deepseek-v4-flash:high__json_schema PASS 100% 354 in → 110 out (101 reasoning)

large.txt

JSON Schema Structured Output

(raw) { "filename": "large.txt" }

deepseek/deepseek-v4-flash:none PASS 100% 345 in → 3 out (0 reasoning)

large.txt

deepseek/deepseek-v4-flash:none__json_schema PASS 100% 346 in → 12 out (0 reasoning)

large.txt

JSON Schema Structured Output

(raw) { "filename": "large.txt" }

mistralai/devstral-2512 PASS 100% 448 in → 3 out

large.txt

mistralai/devstral-2512__json_schema PASS 100% 450 in → 8 out

large.txt

JSON Schema Structured Output

(raw) {"filename": "large.txt"}

nvidia/nemotron-3-nano-30b-a3b:high PASS 100% 458 in → 153 out (138 reasoning)

large.txt

nvidia/nemotron-3-nano-30b-a3b:high__json_schema PASS 100% 456 in → 122 out (108 reasoning)

large.txt

JSON Schema Structured Output

(raw) { "filename": "large.txt" }

nvidia/nemotron-3-nano-30b-a3b:none PASS 100% 458 in → 3 out (0 reasoning)

large.txt

nvidia/nemotron-3-nano-30b-a3b:none__json_schema PASS 100% 452 in → 10 out (0 reasoning)

large.txt

JSON Schema Structured Output

(raw) { "filename": "large.txt" }

poolside/laguna-xs.2:high PASS 100% 483 in → 124 out (119 reasoning)

large.txt

poolside/laguna-xs.2:high__json_schema PASS 100% 473 in → 254 out (246 reasoning)

large.txt

JSON Schema Structured Output

(raw) {"filename": "large.txt"}

poolside/laguna-xs.2:none PASS 100% 477 in → 4 out (0 reasoning)

large.txt

poolside/laguna-xs.2:none__json_schema PASS 100% 482 in → 13 out (0 reasoning)

large.txt

JSON Schema Structured Output

(raw) { "filename": "large.txt" }