Preference Learning Demo

Overview

The preference learning demo shows ICRL’s ability to learn and adapt to individual user preferences, communication styles, and working patterns over time. Vanilla LLMs treat every user the same; ICRL can match verbosity, format, depth, and tone to each user’s past approved interactions.

The Problem

Every user has different preferences:

Verbosity: Some want detailed explanations, others want terse commands
Format: Some prefer bullet points, others want prose
Depth: Some want “just tell me what to do”, others want “explain why”
Tone: Some prefer formal, others casual

Vanilla LLMs treat every user the same, so you end up repeating instructions such as “Be more concise” or “Give me the command, not the explanation”.

How ICRL Solves This

ICRL stores successful interactions that reflect your preferences. Over time, it:

Learns your style from past interactions you approved
Retrieves preference-matched examples for similar requests
Adapts responses automatically without re-prompting
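
The store, retrieve, and adapt steps above can be sketched in a few lines. This is a toy illustration only, not ICRL's actual API: `PreferenceStore`, `Interaction`, `build_prompt`, and the word-overlap similarity are all hypothetical names and choices made for this example, and a real system would likely use embeddings rather than word overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    user_id: str
    request: str
    response: str


@dataclass
class PreferenceStore:
    """Toy in-memory store of approved interactions (hypothetical, not ICRL's API)."""

    interactions: list[Interaction] = field(default_factory=list)

    def add_approved(self, user_id: str, request: str, response: str) -> None:
        # Only interactions the user approved are stored.
        self.interactions.append(Interaction(user_id, request, response))

    def retrieve(self, user_id: str, request: str, k: int = 2) -> list[Interaction]:
        # Rank this user's past interactions by Jaccard word overlap with the request.
        query = set(request.lower().split())

        def overlap(item: Interaction) -> float:
            words = set(item.request.lower().split())
            union = query | words
            return len(query & words) / len(union) if union else 0.0

        own = [i for i in self.interactions if i.user_id == user_id]
        return sorted(own, key=overlap, reverse=True)[:k]


def build_prompt(store: PreferenceStore, user_id: str, request: str) -> str:
    # Prepend the user's own approved examples so the model can imitate their style.
    examples = store.retrieve(user_id, request)
    parts = [f"User: {e.request}\nAssistant: {e.response}" for e in examples]
    parts.append(f"User: {request}\nAssistant:")
    return "\n\n".join(parts)


store = PreferenceStore()
store.add_approved("expert", "How do I rebase my branch?", "git rebase -i HEAD~3")
store.add_approved("expert", "How do I undo the last commit?", "git reset --soft HEAD~1")
store.add_approved("learner", "How do I rebase my branch?", "Rebasing replays your commits ...")

top = store.retrieve("expert", "How do I rebase onto main?")
prompt = build_prompt(store, "expert", "How do I rebase onto main?")
```

Because retrieval is filtered by user, the expert's prompt contains only the expert's terse examples, and the learner's longer answer to the same question never appears in it.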

Demo Structure

examples/preference_learning_demo/
├── README.md
├── setup_demo.py
├── run_demo.py
├── user_profiles/
│   ├── expert_terse.json
│   ├── learner_detailed.json
│   └── manager_summary.json
└── scenarios/
    ├── seed_interactions.json
    └── test_requests.json
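
As a quick sanity check before running anything, the layout listed in this section can be verified with a short script. This is a minimal sketch: the file list is copied from the structure above, while `EXPECTED_FILES` and `missing_files` are hypothetical names introduced here, not part of the demo.

```python
from pathlib import Path

# Relative paths taken from the demo structure listed above.
EXPECTED_FILES = [
    "README.md",
    "setup_demo.py",
    "run_demo.py",
    "user_profiles/expert_terse.json",
    "user_profiles/learner_detailed.json",
    "user_profiles/manager_summary.json",
    "scenarios/seed_interactions.json",
    "scenarios/test_requests.json",
]


def missing_files(demo_root: Path) -> list[str]:
    """Return the expected files that are not present under demo_root."""
    return [rel for rel in EXPECTED_FILES if not (demo_root / rel).is_file()]
```

For example, `missing_files(Path("examples/preference_learning_demo"))` returns an empty list when run from the project root of a complete checkout.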

User Profiles

Profile	Style	Wants	Example
Expert Terse	Minimal explanation	Commands, code snippets	“How do I rebase?” yields `git rebase -i HEAD~3`
Learner Detailed	Thorough explanations	Why things work, pitfalls, examples	Full explanation with diagrams
Manager Summary	High-level overview	Impact, timeline, risks, decisions	“Rebasing reorganizes commit history…”
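
One way to write the rows of this table down as data, for example to drive seeding or evaluation, is sketched below. The field names (`name`, `style`, `wants`) and the `to_json` helper are assumptions made for illustration; the actual schema of the files in `user_profiles/` is not shown on this page and may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class UserProfile:
    # Hypothetical fields mirroring the table columns, not the demo's real schema.
    name: str
    style: str
    wants: tuple[str, ...]


PROFILES = {
    "expert_terse": UserProfile(
        "Expert Terse", "Minimal explanation", ("commands", "code snippets")
    ),
    "learner_detailed": UserProfile(
        "Learner Detailed",
        "Thorough explanations",
        ("why things work", "pitfalls", "examples"),
    ),
    "manager_summary": UserProfile(
        "Manager Summary",
        "High-level overview",
        ("impact", "timeline", "risks", "decisions"),
    ),
}


def to_json(profile: UserProfile) -> str:
    """Serialize a profile; tuples become JSON arrays."""
    return json.dumps(asdict(profile), indent=2)
```

Keeping the profiles as plain data makes it easy to seed and score each one with the same code path.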

Running the Demo

From the project root:

cd examples/preference_learning_demo

# 1. Setup — seeds trajectories for each user profile
uv run python setup_demo.py

# 2. Run the comparison test
uv run python run_demo.py

# 3. View detailed evaluation
uv run python evaluate_responses.py

Expected Results

User Profile	ICRL Score	Vanilla Score	Improvement (percentage points)
Expert Terse	~90%	~40%	+50
Learner Detailed	~85%	~50%	+35
Manager Summary	~85%	~30%	+55
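
The Improvement column is the difference between the two scores, in percentage points, rather than a relative gain. The arithmetic can be checked with a short sketch; the function names are hypothetical, and the inputs are the approximate scores from the table above.

```python
def improvement_points(icrl_score: int, vanilla_score: int) -> int:
    """Absolute difference between two percentage scores, in percentage points."""
    return icrl_score - vanilla_score


def relative_gain(icrl_score: int, vanilla_score: int) -> float:
    """Relative improvement over the vanilla score, in percent."""
    return round((icrl_score - vanilla_score) / vanilla_score * 100, 1)


# Approximate scores (percent) from the table: (ICRL, vanilla).
expected = {
    "Expert Terse": (90, 40),
    "Learner Detailed": (85, 50),
    "Manager Summary": (85, 30),
}

points = {name: improvement_points(*scores) for name, scores in expected.items()}
```

Here `points` reproduces the table's +50, +35, and +55. The relative gain is a different quantity: for Expert Terse, `relative_gain(90, 40)` is 125.0, so it matters which of the two a reported "improvement" refers to.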

Prerequisites

OPENAI_API_KEY or ANTHROPIC_API_KEY set in your environment
Run the scripts with uv run, starting from the project root, or with python and PYTHONPATH including src/
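
A script can fail fast when neither key is set. The following is a minimal sketch of such a check, assuming the keys are read from environment variables; `available_keys` and `require_api_key` are hypothetical helpers, not part of the demo.

```python
import os
from typing import Mapping, Optional

# Either one of these is sufficient, per the prerequisites above.
API_KEY_NAMES = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def available_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of the API key variables that are set and non-empty."""
    env = os.environ if env is None else env
    return [name for name in API_KEY_NAMES if env.get(name)]


def require_api_key(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Raise a clear error if no supported API key is configured."""
    found = available_keys(env)
    if not found:
        raise RuntimeError(
            "Set OPENAI_API_KEY or ANTHROPIC_API_KEY before running the demo."
        )
    return found
```

Calling `require_api_key()` at the top of a script surfaces a missing key immediately instead of partway through a run.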

Key Insight

This demo illustrates ICRL’s value for personalization in settings where:

One size doesn’t fit all
User preferences are implicit in past interactions
The same question should have different “right” answers for different users
