Word boundary search using git grep -w
Tests the ability to perform a word-boundary search with git grep -w. Evaluates awareness of whole-word versus substring matching.
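As a rough illustration of the distinction being tested, the sketch below contrasts a plain search with a -w search in a scratch repository. The file name sample.txt and its three lines are hypothetical and are not part of this test case.

```shell
# Hypothetical scratch repository; the file name and contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '%s\n' 'count = 1' 'recount()' 'count_total = 2' > sample.txt
git add sample.txt   # git grep searches tracked files by default

# Substring search: every line that contains "count" anywhere is reported.
substring_hits=$(git grep count | wc -l | tr -d ' ')

# Whole-word search: "count" must not be adjacent to a letter, digit, or underscore.
word_hits=$(git grep -w count | wc -l | tr -d ' ')

echo "substring: $substring_hits, whole word: $word_hits"
```

With -w, recount() and count_total no longer match, because the pattern touches a letter in the first case and an underscore in the second.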
Baseline Repository
These commands set up the repo before the model sees the prompt. They define the starting file structure, staged changes, and Git history.
- 01 git init
- 02 git config user.email 'test@test.com'
- 03 git config user.name 'Test User'
- 04 mkdir -p src
- 05 echo 'data = load_data("input.csv")
result = process(data)
save_data(result, "output.csv")
data_loader = DataLoader()' > src/pipeline.py
- 06 git add .
- 07 git commit -m 'Add data pipeline'
- 08 echo 'git grep -w data' > .grep_command
- 09 git add .grep_command
- 10 git commit -m 'Add grep sentinel'
Prompt
Here is the output of a git grep -w command that searches for the whole word 'data'. How many lines match? Output ONLY the number, nothing else.
Expected
2
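The expected value can be checked with a minimal sketch like the one below. It assumes that the four statements written by the echo command each occupy their own line in src/pipeline.py, which is what an expected count of 2 implies. The search is limited to src, and the .grep_command sentinel is left out because it contains the word data itself.

```shell
# Assumption: src/pipeline.py holds the four statements one per line.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src
printf '%s\n' \
  'data = load_data("input.csv")' \
  'result = process(data)' \
  'save_data(result, "output.csv")' \
  'data_loader = DataLoader()' > src/pipeline.py
git add .

# List the whole-word matches with line numbers, then count them.
git grep -n -w data -- src
whole_word_lines=$(git grep -w data -- src | wc -l | tr -d ' ')
echo "$whole_word_lines"
```

Under that assumption only lines 1 and 2 are listed. load_data, save_data, and data_loader do not count because an underscore is a word character, and DataLoader differs in case.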
Model Outputs (14)
2
2
JSON Schema
Structured Output
(raw) { "count": 2 }
2
JSON Schema
Structured Output
(raw) { "count": 2 }
2
2
JSON Schema
Structured Output
(raw) {"count": 2}
2
2
JSON Schema
Structured Output
(raw) {
"count": 2
}
2
2
JSON Schema
Structured Output
(raw) {
"count": 2
}
2
2
JSON Schema
Structured Output
(raw) {
"count": 2
}
2
2
JSON Schema
Structured Output
(raw) {"count": 2}
4
Failure: Expected '2', got '4'
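One plausible reading of the failing answer, offered as an assumption rather than something this page states, is that 4 is the count a substring search would produce. The sketch below rebuilds the fixture, again assuming one statement per line in src/pipeline.py, and counts matches without -w. It also uses the system grep with -w, which is assumed to be available, to show the lines that match only as substrings.

```shell
# Assumption: src/pipeline.py holds the four statements one per line.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src
printf '%s\n' \
  'data = load_data("input.csv")' \
  'result = process(data)' \
  'save_data(result, "output.csv")' \
  'data_loader = DataLoader()' > src/pipeline.py
git add .

# Without -w, any line containing the characters "data" is a match.
substring_lines=$(git grep data -- src | wc -l | tr -d ' ')

# Lines where "data" appears only inside a longer identifier.
substring_only=$(git grep -h data -- src | grep -v -w data)

echo "$substring_lines"
printf '%s\n' "$substring_only"
```

The two lines printed at the end are the ones a whole-word search excludes, which accounts for the gap between 4 and the expected 2 under this reading.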