Scenarios

Scenarios are core to Agent Contracts as requirements are only relevant when tested against specific scenarios.

There are two types of scenarios: Offline Scenarios for verification and Online Scenarios for runtime certification.

The difference is that an offline scenario lets you provide the specific data (e.g. a specific customer query) used to evaluate the agent. An online scenario instead defines a set of Preconditions that qualify the scenario and activate the Pathconditions and Postconditions in the contract.

Anatomy of a Scenario

How a scenario’s components are used depends on whether it’s offline or online, but all scenarios have the following components:

  • name - Name of the scenario
  • data - Input data to the scenario (ONLY used for offline verification)
  • contracts - Set of requirements to verify or certify against
  • metadata - Information about the test case

Within a contract, Preconditions serve different purposes:

  1. Offline Verification: Preconditions are OPTIONAL. They help validate whether the agent received the inputs it needed to act. If preconditions aren’t met (e.g., the customer never provided an order number), the contract is invalidated because the agent didn’t have the necessary information to proceed.

  2. Runtime Certification: Preconditions are REQUIRED. They act as scenario detectors during runtime enforcement. When preconditions are met, we know we’re in a specific scenario and can apply the corresponding Pathconditions and Postconditions.
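
The two roles above can be sketched in plain Python. This is only an illustration of the rule as described, not the library's implementation: `SketchContract`, `Mode`, `contract_status`, and the returned status strings are all hypothetical names, and preconditions are modelled as simple strings that are either in a "satisfied" set or not.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Mode(Enum):
    OFFLINE = "offline verification"
    RUNTIME = "runtime certification"


@dataclass
class SketchContract:
    """Hypothetical stand-in for a contract; only the fields this sketch needs."""
    name: str
    preconditions: List[str] = field(default_factory=list)


def contract_status(contract: SketchContract, satisfied: Set[str], mode: Mode) -> str:
    """Return how a contract is treated, given which preconditions were satisfied."""
    if mode is Mode.RUNTIME and not contract.preconditions:
        # Runtime certification needs at least one precondition to detect the scenario.
        raise ValueError(f"{contract.name!r} needs a precondition for runtime use")
    if all(p in satisfied for p in contract.preconditions):
        return "apply"            # go on to check path and post conditions
    if mode is Mode.OFFLINE:
        return "invalidated"      # the agent lacked the inputs it needed
    return "not in scenario"      # runtime: the contract simply does not activate


refund = SketchContract("Refund", ["Customer provided an order number"])
print(contract_status(refund, set(), Mode.OFFLINE))
print(contract_status(refund, {"Customer provided an order number"}, Mode.RUNTIME))
```

Note that an offline contract with no preconditions always falls through to "apply" in this sketch, which matches the idea that preconditions are optional there.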

Creating Offline Scenarios

The data field of an offline scenario should contain the input data that drives your test scenario. This can be any JSON structure that matches your system’s requirements.

Fixed-Input Verification

For a verification that takes a fixed input:

single_turn_scenario = Scenario(
    name="Financial Analysis Request",
    data={
        "question": "Compare Tesla's gross margin with its top competitors over the last 3 years"
    },
    contracts=[financial_analysis_contract]
)
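
Because the data field is expected to be a JSON structure, it can help to check a payload before building a scenario from it. The helper below is a minimal sketch using only the standard library; `is_json_serializable` is a hypothetical name and is not part of Agent Contracts.

```python
import json
from typing import Any


def is_json_serializable(data: Any) -> bool:
    """Return True if `data` survives a JSON round trip unchanged."""
    try:
        return json.loads(json.dumps(data)) == data
    except (TypeError, ValueError):
        return False


good = {"question": "Compare Tesla's gross margin with its top competitors over the last 3 years"}
bad = {"question": "ok", "tags": {"a", "b"}}  # Python sets have no JSON equivalent

print(is_json_serializable(good))
print(is_json_serializable(bad))
```

The round-trip comparison is deliberately strict: values such as tuples or non-string dictionary keys serialize but come back in a different form, so they are reported as not serializable.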

Dynamic-Input Verification

For systems with back-and-forth interactions that require dynamic input, the data can contain any information needed to generate those inputs (e.g. the data used to initialize a simulation):

customer_support_scenario = Scenario(
    name="Damaged Item Refund Request",
    data={
        "uid": "sim-eligible-refund",
        "context": [{
            "text": "Customer wants to return a damaged sweater for store credit"
        }],
        "tasks": [{
            "primary_task": "Request refund for damaged sweater. Don't ask questions about the process of return/refund, let the support agent guide you. Provide only one piece of information at a time.",
            "additional_info": """
                Here is some information you may provide to the agent if asked:
                Order number: order_01JGQF1YAYNFDV874MKFR33CAA
                Your email: jane.doe@example.com
                Photos of damage: Image url: https://example.com/damage.png
                You discovered the damage a week after purchase.
                The damage is on the sleeves - there's a hole.
                You haven't worn the sweater.
                You want a refund to store credit.
            """,
            "expected_response": "Purchase was refunded, or the issue was escalated to a manager."
        }]
    },
    contracts=[general_support_contract, refund_specific_contract]
)

Creating Online Scenarios

For online scenarios, the contract must include at least one Precondition to qualify the scenario.

Declaring scenarios with Preconditions

Preconditions are used to qualify a contract. If the preconditions are met, the contract certifier starts applying the Pathconditions during runtime and validates the Postconditions once execution concludes.

refund_with_damage_contract = Contract(
    name="Refund request with damage",
    requirements=[
        Precondition("Customer asks for refund"),
        Precondition("Customer mentions damage as reason"),
        Pathcondition("Verify order details"),
        Pathcondition("Document damage description"),
        Postcondition("Process refund if damage is eligible for refund or escalate appropriately")
    ]
)

shipping_inquiry_contract = Contract(
    name="Shipping inquiry",
    requirements=[
        Precondition("Customer asks about shipping status of an order"),
        Pathcondition("Check order details using get_order_details function"),
        Postcondition("Provide shipping status with detailed tracking information")
    ]
)
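
To make the "scenario detector" idea concrete, here is a toy sketch of how preconditions could select which contract applies to an incoming message. Everything in it is an assumption for illustration: the contracts are mirrored as plain dicts rather than the library's Contract class, and `keyword_judge` is a crude keyword matcher standing in for whatever actually evaluates a natural-language precondition.

```python
from typing import Callable, Dict, List

# Hypothetical plain-dict mirror of the two contracts above (not the library's types).
CONTRACTS: List[Dict] = [
    {
        "name": "Refund request with damage",
        "preconditions": [
            "Customer asks for refund",
            "Customer mentions damage as reason",
        ],
    },
    {
        "name": "Shipping inquiry",
        "preconditions": ["Customer asks about shipping status of an order"],
    },
]

# Toy keyword table; a real evaluator would not work this way.
KEYWORDS: Dict[str, List[str]] = {
    "Customer asks for refund": ["refund"],
    "Customer mentions damage as reason": ["damage", "hole"],
    "Customer asks about shipping status of an order": ["shipping", "tracking"],
}


def keyword_judge(precondition: str, message: str) -> bool:
    """Stand-in precondition check: true if any keyword appears in the message."""
    text = message.lower()
    return any(word in text for word in KEYWORDS.get(precondition, []))


def active_contracts(
    message: str,
    contracts: List[Dict],
    judge: Callable[[str, str], bool],
) -> List[str]:
    """Names of contracts whose preconditions are ALL met by the message."""
    return [
        c["name"]
        for c in contracts
        if all(judge(p, message) for p in c["preconditions"])
    ]


print(active_contracts("I want a refund, the sweater arrived damaged", CONTRACTS, keyword_judge))
print(active_contracts("I'd like a refund please", CONTRACTS, keyword_judge))
```

In the second call the refund contract stays inactive because only one of its two preconditions is met, which is the behavior the text describes: path and post conditions apply only once every precondition qualifies the scenario.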

Specifications

Specifications are comprehensive definitions of how your agent should behave across different scenarios. Think of them as a Product Requirements Document (PRD) that is written in a scenario-based format - they define the expected behavior, business rules, and quality standards your agent must meet.

The Specifications class can be used to manage both offline and online scenarios.

Here’s how to create and manage specifications:

from agent_contracts.core.datatypes.specifications import Specifications

# Define behavior for customer support system
support_specs = Specifications(scenarios=[
    # Core functionality
    standard_refund_scenario,     # Basic refund flow
    simple_exchange_scenario,     # Basic exchange flow
    
    # Complex business rules
    partial_refund_scenario,      # Handling partial returns
    multi_item_return_scenario,   # Multiple item processing
    
    # Edge cases and error handling
    invalid_order_scenario,       # How to handle invalid inputs
    expired_return_window_scenario # Policy enforcement
])

# Save specifications for version control and sharing
support_specs.save("specifications/customer_support_v1.json") # or .yaml

# Load existing specifications
specs = Specifications.load("specifications/customer_support_v1.json")

You can then use the specifications for offline verification and runtime certification.

Next Steps