Scenarios & Specifications
Learn how to organize test cases for offline evaluation of AI agents
Scenarios
Scenarios are core to Agent Contracts as requirements are only relevant when tested against specific scenarios.
There are two types of scenarios: Offline Scenarios for verification and Online Scenarios for runtime certification.
In an offline scenario, you provide the specific data (e.g. a specific customer query) used to evaluate the agent. An online scenario instead defines a set of Preconditions that qualify the scenario and activate the pathconditions and postconditions in the contract.
Anatomy of a Scenario
A scenario consists of different components depending on whether it’s offline or online:
All scenarios have the following components:
- name: Name of the scenario
- data: Input data to the scenario (ONLY used for offline verification)
- contracts: Set of requirements to verify or certify against
- metadata: Information about the test case
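To make the four components concrete, here is a minimal sketch of how a scenario could be modeled. The `Scenario` and `Contract` dataclasses below are hypothetical stand-ins written for illustration, not the library's actual classes, and representing requirements as plain strings is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Contract:
    """Hypothetical stand-in: requirements grouped by condition type."""
    preconditions: list[str] = field(default_factory=list)
    pathconditions: list[str] = field(default_factory=list)
    postconditions: list[str] = field(default_factory=list)


@dataclass
class Scenario:
    """Hypothetical stand-in mirroring the four components listed above."""
    name: str                                  # name of the scenario
    contracts: list[Contract]                  # requirements to verify or certify against
    data: Optional[Any] = None                 # input data, only used for offline verification
    metadata: dict[str, Any] = field(default_factory=dict)  # information about the test case


# An offline scenario carries input data; an online scenario leaves it unset.
offline = Scenario(
    name="refund_request",
    data={"query": "I'd like a refund for order 12345."},
    contracts=[Contract(postconditions=["Agent confirms the refund"])],
    metadata={"owner": "support-team"},
)
online = Scenario(
    name="refund_request_live",
    contracts=[Contract(preconditions=["Customer asks for a refund"])],
)
```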
Within the contract, the precondition serves different purposes:
- Offline Verification: Preconditions are OPTIONAL. They help validate whether the agent received the correct inputs to act. If preconditions aren’t met (e.g., the customer never provided an order number), the contract is invalidated since the agent didn’t have the necessary information to proceed.
- Runtime Certification: Preconditions are REQUIRED. They act as scenario detectors during runtime enforcement. When preconditions are met, we know we’re in a specific scenario and can apply the corresponding pathconditions and postconditions.
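The two roles of preconditions can be sketched as a small decision function. Everything here is illustrative: `Mode`, `Check`, and `precondition_outcome` are hypothetical names, and treating an unmet online precondition as "scenario not detected" is an assumption drawn from the description above rather than documented behavior.

```python
from enum import Enum
from typing import Callable

# Hypothetical: a precondition is a predicate over the recorded agent state.
Check = Callable[[dict], bool]


class Mode(Enum):
    OFFLINE = "offline verification"
    ONLINE = "runtime certification"


def precondition_outcome(mode: Mode, preconditions: list[Check], state: dict) -> str:
    """Return what happens to the contract given the preconditions and state."""
    if mode is Mode.ONLINE and not preconditions:
        # Runtime certification requires at least one precondition.
        raise ValueError("runtime certification requires at least one precondition")
    if all(check(state) for check in preconditions):
        # Also covers offline contracts that declare no preconditions.
        return "apply pathconditions and postconditions"
    if mode is Mode.OFFLINE:
        # The agent lacked the inputs it needed, so the contract does not count.
        return "contract invalidated"
    return "scenario not detected"


def has_order_number(state: dict) -> bool:
    return "order_number" in state
```

With this sketch, an offline run where the customer never gave an order number yields "contract invalidated", while the same state at runtime simply means the scenario was not detected.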
Creating Offline Scenarios
The data field of an offline scenario should contain the input data that drives your test scenario. This can be any JSON structure that matches your system’s requirements.
Fixed-Input Verification
For a verification that takes a fixed input, data holds that input directly (e.g. a single customer query).
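As a rough illustration, a fixed-input scenario might look like the JSON-compatible dictionary below. The key names inside `data` and `contracts`, and the requirement wording, are assumptions made for this example; only the four top-level components come from the text.

```python
import json

# Hypothetical fixed-input scenario: `data` is the single input the agent receives.
scenario = {
    "name": "refund_request_with_order_number",
    "data": {"query": "I'd like a refund for order 12345."},
    "contracts": [
        {
            "preconditions": ["The customer provides an order number"],
            "pathconditions": ["The agent looks up the order before refunding"],
            "postconditions": ["The agent confirms the refund to the customer"],
        }
    ],
    "metadata": {"category": "refunds"},
}

# Any JSON structure works, so the scenario should survive a JSON round trip.
serialized = json.dumps(scenario, indent=2)
restored = json.loads(serialized)
```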
Dynamic-Input Verification
For systems that involve back-and-forth interactions and require dynamic input, the data field can contain any information required to generate the dynamic inputs (e.g. initiation data used for simulation).
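One way to picture dynamic input is a simulated user that is seeded from `data` and produces its next message based on the agent's last reply. The sketch below is an assumption-laden toy: the `goal`, `facts`, and `max_turns` keys, the `next_user_message` and `run_dialogue` helpers, and the `toy_agent` are all hypothetical and only show the shape of the idea.

```python
from typing import Callable, Optional

# Hypothetical initiation data used to drive a simulated conversation.
data = {
    "goal": "get a refund for a damaged item",
    "facts": {"order_number": "12345"},
    "max_turns": 3,
}


def next_user_message(data: dict, agent_reply: Optional[str]) -> Optional[str]:
    """Generate the next simulated user turn, or None when there is nothing to add."""
    if agent_reply is None:
        return f"Hi, I want to {data['goal']}."
    if "order number" in agent_reply.lower():
        return f"My order number is {data['facts']['order_number']}."
    return None


def run_dialogue(data: dict, agent: Callable[[str], str]) -> list[tuple[str, str]]:
    """Alternate simulated user turns and agent replies, up to max_turns."""
    transcript: list[tuple[str, str]] = []
    reply: Optional[str] = None
    for _ in range(data["max_turns"]):
        message = next_user_message(data, reply)
        if message is None:
            break
        reply = agent(message)
        transcript.append((message, reply))
    return transcript


def toy_agent(message: str) -> str:
    """Stand-in for the agent under test."""
    if "order number is" in message:
        return "Thanks, your refund is being processed."
    return "Could you share your order number?"


transcript = run_dialogue(data, toy_agent)
```

The transcript produced this way is what the contracts would then be verified against.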
Creating Online Scenarios
For online scenarios, at least one Precondition is required in the Contract to qualify the scenario.
Declaring scenarios with Preconditions
Preconditions are used to qualify a contract. If the preconditions are met, the contract certifier will start applying the Pathconditions during runtime and validate the Postconditions upon conclusion of execution.
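The lifecycle described here (qualify, then check pathconditions during execution, then validate postconditions at the end) can be sketched as follows. The `Certifier` class, its `observe` and `finish` methods, and the event strings are hypothetical, and returning `None` when the scenario never qualifies is an assumption; this is not the library's certifier.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical: each condition is a predicate over the events seen so far.
Check = Callable[[list[str]], bool]


@dataclass
class Contract:
    preconditions: list[Check]
    pathconditions: list[Check]
    postconditions: list[Check]


@dataclass
class Certifier:
    contract: Contract
    events: list[str] = field(default_factory=list)
    active: bool = False
    violations: list[str] = field(default_factory=list)

    def observe(self, event: str) -> None:
        """Record one runtime event and enforce pathconditions once qualified."""
        self.events.append(event)
        if not self.active:
            self.active = all(p(self.events) for p in self.contract.preconditions)
        if self.active:
            for i, check in enumerate(self.contract.pathconditions):
                if not check(self.events):
                    self.violations.append(f"pathcondition {i} failed at {event!r}")

    def finish(self) -> Optional[bool]:
        """Validate postconditions at the end; None means the scenario never qualified."""
        if not self.active:
            return None
        post_ok = all(c(self.events) for c in self.contract.postconditions)
        return post_ok and not self.violations


def refund_contract() -> Contract:
    return Contract(
        preconditions=[lambda ev: "user:refund_request" in ev],
        pathconditions=[lambda ev: "tool:delete_account" not in ev],
        postconditions=[lambda ev: "tool:issue_refund" in ev],
    )


certifier = Certifier(refund_contract())
for event in ["user:greeting", "user:refund_request", "tool:lookup_order", "tool:issue_refund"]:
    certifier.observe(event)
result = certifier.finish()
```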
Specifications
Specifications are comprehensive definitions of how your agent should behave across different scenarios. Think of them as a Product Requirements Document (PRD) that is written in a scenario-based format - they define the expected behavior, business rules, and quality standards your agent must meet.
Specifications as a class can be used to manage both offline and online scenarios.
Here’s how to create and manage specifications:
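The original example is not reproduced here, so the following is only a sketch of what a specifications container could look like. The `Specifications` and `Scenario` classes below are hypothetical stand-ins, and the `add`, `offline`, `save`, and `load` methods are assumed names chosen for illustration; consult the library reference for the real interface.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any, Optional


@dataclass
class Scenario:
    """Hypothetical stand-in with the four scenario components."""
    name: str
    contracts: list[dict]
    data: Optional[Any] = None
    metadata: dict = field(default_factory=dict)


@dataclass
class Specifications:
    """Hypothetical container holding both offline and online scenarios."""
    scenarios: list[Scenario] = field(default_factory=list)

    def add(self, scenario: Scenario) -> None:
        if any(s.name == scenario.name for s in self.scenarios):
            raise ValueError(f"duplicate scenario name: {scenario.name}")
        self.scenarios.append(scenario)

    def offline(self) -> list[Scenario]:
        # Assumption: a scenario with input data is an offline scenario.
        return [s for s in self.scenarios if s.data is not None]

    def save(self, path: str) -> None:
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path: str) -> "Specifications":
        raw = json.loads(Path(path).read_text())
        return cls([Scenario(**s) for s in raw["scenarios"]])


specs = Specifications()
specs.add(Scenario(
    name="refund_offline",
    data={"query": "Refund order 12345, please."},
    contracts=[{"postconditions": ["The agent confirms the refund"]}],
))
specs.add(Scenario(
    name="refund_online",
    contracts=[{"preconditions": ["The customer asks for a refund"]}],
))
```

Keeping both kinds of scenario in one container means the same file can feed offline test runs and runtime certification.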
You can then use the specifications for offline verification and runtime certification.
Next Steps
- Learn how to verify your specifications in offline tests
- Understand runtime certification of specifications during agent execution