Glossary

Eval Harness

An eval harness is a repeatable test setup that measures an AI system's behavior on representative tasks, helping detect regressions and failures.

Test setup for agent behavior