Goals

Data models for goal-guided query generation.

Overview

The goals system uses a flat list of Goal objects, optionally categorized using the CTO framework:

  • Components: System internals (freshness, citations, data handling)
  • Trajectories: User journeys (workflows, error recovery, multi-step)
  • Outcomes: Output qualities (actionable, clear, accurate)

Categories are optional. You can use any string or skip them entirely.

from evaluateur import Goal, GoalSpec

goals = GoalSpec(goals=[
    Goal(name="freshness", text="Test data currency", category="components"),
    Goal(name="error recovery", text="Test error handling", category="trajectories"),
    Goal(name="actionable", text="Request clear next steps", category="outcomes"),
    Goal(name="edge cases", text="Cover unusual inputs"),  # no category
])

GoalSpec

Top-level container for goal guidance.

GoalSpec

Bases: BaseModel

User-provided guidance for shaping evaluation queries.

A flat list of goals, optionally categorized. The CTO framework (components / trajectories / outcomes) is supported via the Goal.category field but is not structurally enforced.

is_empty

is_empty() -> bool

Return True if no active goals are specified.

Source code in src/evaluateur/goals/models.py
def is_empty(self) -> bool:
    """Return True if no active goals are specified."""
    return not self.goals or all(g.weight <= 0 for g in self.goals)
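
For example, based on the source above, a spec with no goals, or with only disabled (zero-weight) goals, counts as empty:

from evaluateur import Goal, GoalSpec

GoalSpec().is_empty()                                           # True: no goals
GoalSpec(goals=[Goal(text="edge cases", weight=0)]).is_empty()  # True: all disabled
GoalSpec(goals=[Goal(text="edge cases")]).is_empty()            # False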

available_goals

available_goals() -> list[Goal]

Return goals with positive weight.

Source code in src/evaluateur/goals/models.py
def available_goals(self) -> list[Goal]:
    """Return goals with positive weight."""
    return [g for g in self.goals if g.weight > 0]
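
For example, disabled goals are dropped while active ones are kept:

from evaluateur import Goal, GoalSpec

spec = GoalSpec(goals=[
    Goal(name="freshness", text="Test data currency"),
    Goal(name="legacy", text="Old scenario", weight=0),  # disabled
])
[g.name for g in spec.available_goals()]  # ["freshness"]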

render_prompt

render_prompt() -> str

Render this spec into a compact instruction block.

Source code in src/evaluateur/goals/models.py
def render_prompt(self) -> str:
    """Render this spec into a compact instruction block."""
    return render_goal_prompt(self)

to_metadata

to_metadata() -> dict[str, Any]

Return a JSON-serializable metadata representation.

Source code in src/evaluateur/goals/models.py
def to_metadata(self) -> dict[str, Any]:
    """Return a JSON-serializable metadata representation."""
    return self.model_dump(exclude_none=True)
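
A rough sketch of the resulting shape (since every Goal field has a non-None default, all fields typically appear; the exact keys follow the Goal model below):

spec = GoalSpec(goals=[Goal(name="freshness", text="Test data currency")])
spec.to_metadata()
# e.g. {"goals": [{"name": "freshness", "text": "Test data currency",
#                  "weight": 1.0, "category": ""}]}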

Constructor

GoalSpec(
    goals: list[Goal] = [],
)

Fields:

Field Type Description
goals list[Goal] Flat list of evaluation goals

Methods

is_empty()

Check if no active goals are specified.

def is_empty(self) -> bool

Returns True if no goals exist or all have weight <= 0.

available_goals()

Get goals with positive weight.

def available_goals(self) -> list[Goal]

render_prompt()

Render the spec into a compact prompt string.

def render_prompt(self) -> str

Goal

A single evaluation goal for shaping query generation.

Goal

Bases: BaseModel

A single evaluation goal used to guide query generation.

Goals are flat, flexible, and optionally categorized. The category field supports the CTO framework (components / trajectories / outcomes) or any user-defined string.

Constructor

Goal(
    text: str,
    name: str = "",
    weight: float = 1.0,
    category: str = "",
)

Fields:

Field Type Default Description
name str "" Short goal label
text str required Full goal description
weight float 1.0 Relative importance (0 disables)
category str "" Optional category (CTO or custom)

Example:

from evaluateur import Goal

goal = Goal(
    name="citation accuracy",
    text="Ensure all claims reference specific sources with publication dates",
    weight=1.5,
    category="components",
)

Weight Behavior

  • weight > 0: Normal goal, sampled proportionally
  • weight = 0: Disabled (excluded from sampling)
  • Higher weight = more likely to be sampled
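
As a rough illustration of how proportional sampling can be thought of (a sketch of the idea, not the library's internal sampler):

import random

from evaluateur import Goal, GoalSpec

spec = GoalSpec(goals=[
    Goal(name="freshness", text="Test data currency", weight=2.0),  # twice as likely
    Goal(name="citations", text="Request source references"),       # weight 1.0
    Goal(name="legacy", text="Old scenario", weight=0),             # never sampled
])

active = spec.available_goals()
picked = random.choices(active, weights=[g.weight for g in active], k=1)[0]
print(picked.name)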

Evaluator.parse_goals()

Parse free-form text into a structured GoalSpec.

import asyncio

from pydantic import BaseModel, Field
from evaluateur import Evaluator


class MyModel(BaseModel):
    topic: str = Field(..., description="subject area")


async def main() -> None:
    evaluator = Evaluator(MyModel, llm="openai/gpt-4.1-mini")
    spec = await evaluator.parse_goals(
        "Test freshness and citation accuracy. Include error recovery.",
    )
    print(spec.to_metadata())


asyncio.run(main())

Structured text (numbered/bulleted lists) is parsed without an LLM call. Free-form text falls back to LLM enrichment.

In most cases, pass a string directly to goals= in evaluator.run() instead of calling this method manually.
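
For example, inside an async function (given an Evaluator instance as above), a short bulleted string is enough:

async for q in evaluator.run(
    goals="- test freshness\n- test citation accuracy\n- include error recovery",
    tuple_count=5,
):
    print(q.query)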


CTO Constants

Well-known category constants for the CTO framework:

from evaluateur.goals.constants import COMPONENTS, TRAJECTORIES, OUTCOMES, CTO_CATEGORIES

COMPONENTS    # "components"
TRAJECTORIES  # "trajectories"
OUTCOMES      # "outcomes"
CTO_CATEGORIES  # ("components", "trajectories", "outcomes")
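
These are plain strings, so they can be passed directly as Goal categories:

from evaluateur import Goal
from evaluateur.goals.constants import COMPONENTS, OUTCOMES

goals = [
    Goal(name="freshness", text="Test data currency", category=COMPONENTS),
    Goal(name="actionable", text="Request clear next steps", category=OUTCOMES),
]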

GoalMode

GoalMode = Literal["full", "sample", "cycle"]

Mode Behavior
"sample" Pick one goal per query at random (weighted)
"cycle" Interleave goals by category and rotate (even, diverse coverage)
"full" Include all goals in every query

Complete Example

import asyncio
from pydantic import BaseModel, Field
from evaluateur import Evaluator, Goal, GoalSpec


class Query(BaseModel):
    topic: str = Field(..., description="subject area")


async def main() -> None:
    evaluator = Evaluator(Query)

    goals = GoalSpec(goals=[
        Goal(
            name="freshness",
            text="Queries should ask about current data and latest versions",
            category="components",
            weight=2.0,
        ),
        Goal(
            name="citations",
            text="Queries should request source references",
            category="components",
        ),
        Goal(
            name="error handling",
            text="Test graceful degradation when data is missing",
            category="trajectories",
        ),
        Goal(
            name="actionable",
            text="Queries should request next steps and recommendations",
            category="outcomes",
        ),
    ])

    async for q in evaluator.run(
        goals=goals,
        goal_mode="sample",
        tuple_count=10,
    ):
        print(f"[{q.metadata.goal_focus}] {q.query}")


asyncio.run(main())

See Also