Skip to main content

Python

trajectories = await agent.train_batch(env_factory, goals)
results = await agent.run_batch(env_factory, eval_goals)
env_factory should return a fresh environment instance per goal.

TypeScript

const trajectories = await agent.trainBatch(envFactory, goals);
const results = await agent.runBatch(envFactory, evalGoals);

Suggested Workflow

  1. Build a curriculum of goals.
  2. Train in rounds.
  3. Evaluate with run/runBatch on held-out goals.
  4. Monitor DB stats and curation behavior.

Validation

Python DB includes deferred validation methods (validate_trajectory, validate_all) that can be run post-training for code-change persistence checks.