Evaluation Results
evaldeck.results.EvaluationResult
Bases: BaseModel
Complete result of evaluating a single test case.
add_grade
Add a grade result.
Source code in src/evaldeck/results.py
add_metric
add_turn_result
Add a turn result.
Source code in src/evaldeck/results.py
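The add_grade, add_metric, and add_turn_result methods suggest that a result object is built up incrementally while a test case is evaluated. The sketch below illustrates that accumulation pattern only. It uses standard-library dataclasses as stand-ins rather than evaldeck's pydantic models, and every class name, field name, and signature in it (GradeSketch, EvaluationResultSketch, grades, metrics, turn_results, test_case) is an assumption made for illustration, not evaldeck's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class GradeSketch:
    """Hypothetical stand-in for a grade; not evaldeck's real type."""
    grader: str
    passed: bool


@dataclass
class EvaluationResultSketch:
    """Hypothetical stand-in mirroring the add_* methods listed above."""
    test_case: str
    grades: list = field(default_factory=list)
    metrics: list = field(default_factory=list)
    turn_results: list = field(default_factory=list)

    def add_grade(self, grade):
        # Assumed behavior: append to an internal list.
        self.grades.append(grade)

    def add_metric(self, metric):
        self.metrics.append(metric)

    def add_turn_result(self, turn_result):
        self.turn_results.append(turn_result)


result = EvaluationResultSketch(test_case="greets_user")
result.add_grade(GradeSketch(grader="contains", passed=True))
result.add_grade(GradeSketch(grader="tool_called", passed=False))

# One possible pass rule (an assumption): every grade must pass.
all_passed = all(g.passed for g in result.grades)
print(len(result.grades), all_passed)  # 2 False
```

Consult src/evaldeck/results.py for the real field names and the rule evaldeck uses to decide whether a test case passed.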
evaldeck.results.SuiteResult
Bases: BaseModel
Result of evaluating a test suite.
evaldeck.results.RunResult
Bases: BaseModel
Result of a complete evaluation run (multiple suites).
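Taken together, the three classes describe a nesting: a run contains suites, and a suite contains per-test-case results. The following sketch shows one way such a hierarchy could be traversed to compute an overall pass rate. The classes CaseSketch, SuiteSketch, and RunSketch, their fields, and the pass_rate helper are all hypothetical stand-ins written with standard-library dataclasses. They are not evaldeck's definitions, which derive from BaseModel and may expose different attributes.

```python
from dataclasses import dataclass, field


@dataclass
class CaseSketch:
    """Hypothetical per-test-case result."""
    name: str
    passed: bool


@dataclass
class SuiteSketch:
    """Hypothetical suite result holding case results."""
    name: str
    results: list = field(default_factory=list)


@dataclass
class RunSketch:
    """Hypothetical run result holding suite results."""
    suites: list = field(default_factory=list)


def pass_rate(run):
    """Fraction of passing cases across all suites; 0.0 for an empty run."""
    cases = [case for suite in run.suites for case in suite.results]
    if not cases:
        return 0.0
    return sum(case.passed for case in cases) / len(cases)


run = RunSketch(suites=[
    SuiteSketch("smoke", [CaseSketch("a", True), CaseSketch("b", False)]),
    SuiteSketch("regression", [CaseSketch("c", True), CaseSketch("d", True)]),
])
print(pass_rate(run))  # 0.75
```

Flattening the suites before counting weights every test case equally. Averaging per-suite rates instead would weight every suite equally, which gives a different number when suites differ in size.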