Overview
The Harbor coding agent demonstrates ICRL’s performance improvement on realistic software engineering tasks. It uses a simulated coding workspace (`/workspace/src`, `/workspace/tests`) with shell-like commands for editing, navigation, and testing. The demo compares baseline and post-training behavior on verifiable coding tasks.
Source Files
| File | Purpose |
|---|---|
| `examples/harbor_coding_agent.py` | Main demo script with `CodingEnvironment` |
| `tests/test_harbor_coding.py` | Pytest tests for the demo |
Run Demo
Run Tests
What It Demonstrates
- Coding workspace simulation: Sandboxed `/workspace` with `src/` and `tests/`
- Shell-like commands: `ls`, `cd`, `cat`, `grep`, `find`, `sed`, `echo`, `pytest`, etc.
- Baseline vs post-training: Phase 1 runs evaluation tasks with no examples; Phase 2 trains on coding tasks; Phase 3 re-evaluates with learned examples
- Verifiable tasks: Each task has a `verify(workspace_state) -> bool` function
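The three phases above can be sketched as a small evaluation loop. This is a minimal illustration, not the demo's actual implementation: the names `run_three_phases`, `success_rate`, and `toy_policy`, the representation of workspace state as a dict of path to file contents, and the rule of storing solved goals as "learned examples" are all assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; the real demo may differ.
WorkspaceState = Dict[str, str]                       # path -> file contents
Verify = Callable[[WorkspaceState], bool]
Task = Tuple[str, Verify]                             # (goal, verify)
Policy = Callable[[str, List[str]], WorkspaceState]   # (goal, examples) -> final state


def success_rate(tasks: List[Task], policy: Policy, examples: List[str]) -> float:
    """Fraction of tasks whose verify() accepts the policy's final workspace."""
    passed = sum(1 for goal, verify in tasks if verify(policy(goal, examples)))
    return passed / len(tasks)


def run_three_phases(eval_tasks: List[Task], train_tasks: List[Task], policy: Policy):
    # Phase 1: evaluate with no examples (baseline).
    baseline = success_rate(eval_tasks, policy, examples=[])

    # Phase 2: train; keep the goals of verified tasks as learned examples.
    examples: List[str] = []
    for goal, verify in train_tasks:
        if verify(policy(goal, examples)):
            examples.append(goal)

    # Phase 3: re-evaluate the same tasks with the learned examples.
    post_training = success_rate(eval_tasks, policy, examples)
    return baseline, post_training


def toy_policy(goal: str, examples: List[str]) -> WorkspaceState:
    # Toy stand-in for an agent: solves "easy" goals always,
    # and other goals only when at least one example is available.
    solved = goal.startswith("easy") or bool(examples)
    return {"/workspace/src/done.txt": "ok"} if solved else {}


def done(state: WorkspaceState) -> bool:
    return state.get("/workspace/src/done.txt") == "ok"


train = [("easy: list files", done)]
evals = [("hard: fix bug", done), ("hard: rename function", done)]
baseline, post_training = run_three_phases(evals, train, toy_policy)
print(baseline, post_training)  # 0.0 1.0
```

The toy policy exists only to make the loop runnable; in the demo, the agent's behavior and the form of the stored examples are determined by ICRL itself.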
Task Structure
Tasks are `CodingTask` objects with:
- `goal`: Natural language description
- `verify(workspace_state) -> bool`: Success predicate
- `setup`: Optional hook to modify initial state
- `difficulty`: `"easy"`, `"medium"`, `"hard"`
- `category`: `"code-analysis"`, `"debugging"`, `"navigation"`, `"testing"`, `"refactoring"`
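To make the field list concrete, here is a hedged sketch of what such a task might look like. Only the field names and the `verify(workspace_state) -> bool` shape come from the description above; the dataclass layout, the defaults, the `setup` signature, and the workspace state as a dict of path to file contents are assumptions, and this stand-in class is not the one defined in `examples/harbor_coding_agent.py`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Assumed representation of the workspace: path -> file contents.
WorkspaceState = Dict[str, str]


@dataclass
class CodingTask:
    """Illustrative stand-in mirroring the documented fields."""
    goal: str
    verify: Callable[[WorkspaceState], bool]
    setup: Optional[Callable[[WorkspaceState], None]] = None  # assumed signature
    difficulty: str = "easy"       # "easy", "medium", or "hard"
    category: str = "debugging"    # e.g. "code-analysis", "testing", ...


def plant_bug(state: WorkspaceState) -> None:
    # setup hook: write a buggy add() into the initial workspace.
    state["/workspace/src/calc.py"] = "def add(a, b):\n    return a - b\n"


def add_is_fixed(state: WorkspaceState) -> bool:
    # Success predicate: the file must now contain the corrected expression.
    return "return a + b" in state.get("/workspace/src/calc.py", "")


task = CodingTask(
    goal="Fix add() in src/calc.py so it returns the sum of its arguments.",
    verify=add_is_fixed,
    setup=plant_bug,
    difficulty="easy",
    category="debugging",
)

state: WorkspaceState = {}
if task.setup is not None:
    task.setup(state)
before = task.verify(state)  # bug still present

# Simulate the agent's edit, then verify again.
state["/workspace/src/calc.py"] = "def add(a, b):\n    return a + b\n"
fixed = task.verify(state)
print(before, fixed)  # False True
```

Keeping `verify` a pure function of the workspace state means the same predicate can score both the baseline and the post-training run without depending on how the agent reached that state.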

