Python Interface

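from collections.abc import Awaitable


class MyEnv:
    def reset(self, goal: str) -> str:
        ...

    # step may return the tuple directly or an awaitable that resolves to it.
    def step(self, action: str) -> tuple[str, bool, bool] | Awaitable[tuple[str, bool, bool]]:
        ...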
step returns a tuple (observation, done, success), either directly or as an awaitable that resolves to it.
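
Because step may return either a plain tuple or an awaitable, a caller has to handle both cases. The following is a minimal sketch of one way to do that, not the library's own runner: resolve, run_episode, and EchoEnv are hypothetical names introduced here for illustration, and the sketch assumes the string returned by reset is the first observation.

```python
import asyncio
import inspect
from typing import Any


async def resolve(value: Any) -> Any:
    """Await the value if it is awaitable; otherwise return it unchanged."""
    if inspect.isawaitable(value):
        return await value
    return value


async def run_episode(env: Any, goal: str, actions: list[str]) -> tuple[list[str], bool]:
    """Feed scripted actions to env until done; return observations and success."""
    # Assumption: the string returned by reset is treated as the first observation.
    observations = [await resolve(env.reset(goal))]
    success = False
    for action in actions:
        observation, done, success = await resolve(env.step(action))
        observations.append(observation)
        if done:
            break
    return observations, success


class EchoEnv:
    """Hypothetical toy environment: succeeds when the action equals the goal."""

    def reset(self, goal: str) -> str:
        self.goal = goal
        return f"Type the word '{goal}' to finish."

    async def step(self, action: str) -> tuple[str, bool, bool]:
        # An async step returns an awaitable, which resolve() handles.
        if action == self.goal:
            return "Goal reached.", True, True
        return f"'{action}' is not the goal; try again.", False, False
```

A call such as asyncio.run(run_episode(EchoEnv(), "hello", ["hi", "hello"])) drives the episode to completion. The same run_episode works unchanged if step is an ordinary synchronous method.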

TypeScript Interface

interface Environment {
  reset(goal: string): string | Promise<string>;
  step(action: string): StepResult | Promise<StepResult>;
}
Where StepResult is defined as:
interface StepResult {
  observation: string;
  done: boolean;
  success: boolean;
}

Practical Rules

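  1. Store the goal in reset so that step can check actions against it.
  2. Keep observation text actionable: say what happened and what the agent can try next.
  3. Terminate episodes by returning done as true.
  4. Set success to true only when the goal was actually completed, not merely because the episode ended.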
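
The rules above can be sketched with a small toy environment. This is an illustrative example under stated assumptions, not one of the reference implementations: GuessNumberEnv, its secret and max_steps parameters, and the step budget are all hypothetical.

```python
class GuessNumberEnv:
    """Hypothetical toy environment: the agent must guess a secret integer."""

    def __init__(self, secret: int = 7, max_steps: int = 5) -> None:
        self.secret = secret
        self.max_steps = max_steps
        self.goal = ""
        self.steps = 0

    def reset(self, goal: str) -> str:
        # Rule 1: store the goal and clear any per-episode state.
        self.goal = goal
        self.steps = 0
        return f"Goal: {goal}. Reply with a single integer between 1 and 10."

    def step(self, action: str) -> tuple[str, bool, bool]:
        self.steps += 1
        out_of_steps = self.steps >= self.max_steps

        try:
            guess = int(action.strip())
        except ValueError:
            # Rule 2: explain what went wrong and what to send instead.
            observation = f"'{action}' is not an integer. Reply with a number between 1 and 10."
            return observation, out_of_steps, False

        if guess == self.secret:
            # Rule 4: success is true only when the goal is actually met.
            return "Correct.", True, True

        if out_of_steps:
            # Rule 3: end the episode via done, with success left false.
            return "Step budget exhausted.", True, False

        hint = "higher" if guess < self.secret else "lower"
        return f"{guess} is wrong. Try a {hint} number.", False, False
```

Running out of steps ends the episode with done set and success unset, which keeps the two flags independent: done says the episode is over, success says why.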

Reference Implementations

  • Python file environment: examples/file_api_env.py
  • Python coding environment: examples/harbor_coding_agent.py
  • TypeScript demos: icrl-ts/examples/*.ts