# Cross-Tool Compatibility and the Handoff Protocol (POS Part 4)
After exploring Context Management in Part 3, it’s time to look at how this context moves between different AI tools.
## The Vendor Lock-in Trap
Most AI workflow systems are built for one tool. Cursor rules work in Cursor. Claude Code’s CLAUDE.md works in Claude Code. Copilot instructions work in Copilot. If you switch tools, you rebuild your setup from scratch.
POS avoids this by building on a universal interface: files that any tool can read.
## The AGENTS.md Standard
AGENTS.md is an emerging standard for describing projects to AI tools. It’s a Markdown file at the repository root that tells any AI what the system is, how it’s organized, what rules to follow, and how to get started.
POS generates AGENTS.md from pos.yaml. Any tool that reads Markdown files can read it.
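As a rough illustration, the generation step might look like the sketch below. The field names (`name`, `description`, `rules`, `getting_started`) are hypothetical stand-ins for the parsed `pos.yaml` contents, not the actual POS schema:

```python
# Hypothetical sketch: render AGENTS.md from fields assumed to exist in
# pos.yaml. Field names are illustrative, not the real POS schema.

def render_agents_md(config: dict) -> str:
    lines = [f"# {config['name']}", "", config["description"], "", "## Rules"]
    lines += [f"- {rule}" for rule in config.get("rules", [])]
    lines += ["", "## Getting Started", config.get("getting_started", "")]
    return "\n".join(lines) + "\n"

config = {
    "name": "ticketapp",
    "description": "Event ticketing platform.",
    "rules": ["Run tests before committing", "Prefer small PRs"],
    "getting_started": "See docs/setup.md.",
}
print(render_agents_md(config))
```

Because the output is plain Markdown, regenerating it after every `pos.yaml` change keeps every tool's view of the project in sync from a single source of truth.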
| Tool | How It Reads Context | Session Management |
|---|---|---|
| Claude Code | AGENTS.md + CLAUDE.md auto-loaded | Automatic via SessionStart hook |
| Cursor | AGENTS.md via .cursorrules include | Manual script or YAML write |
| Windsurf | AGENTS.md via workspace rules | Manual YAML write |
| GitHub Copilot | .github/copilot-instructions.md | Manual YAML write |
| Goose | AGENTS.md natively | Manual script |
| Any LLM | Paste AGENTS.md as system prompt | Manual YAML write |
Every tool gets the same information. Tool-specific config files are thin wrappers that point to AGENTS.md.
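For instance, a wrapper for Cursor could be as small as a pointer (illustrative content only; Cursor's exact rules syntax may differ):

```text
# .cursorrules (illustrative wrapper; not verified Cursor syntax)
Read AGENTS.md at the repository root and treat its contents as the
project rules. Do not duplicate guidance here; AGENTS.md is the
single source of truth.
```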
## Portable Skills
Skills are the trickiest cross-tool challenge. Claude Code reads skills from .claude/skills/ with YAML frontmatter. Other tools don’t understand this format.
POS bridges this with a generation step:
```text
.claude/skills/code-review/SKILL.md → generate-portable-skills.sh → .skills/code-review.md
```
The portable version strips Claude-specific metadata and outputs plain Markdown. Any tool can read .skills/code-review.md as an instruction document. A registry.yaml catalogs all available skills with their triggers.
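The core of that generation step is frontmatter stripping. A minimal sketch in Python, assuming the conventional `---`-delimited frontmatter block (the shell script's actual logic may differ):

```python
# Minimal sketch of the portable-skill step: strip YAML frontmatter from a
# SKILL.md file, leaving plain Markdown any tool can read.

def strip_frontmatter(skill_md: str) -> str:
    lines = skill_md.splitlines()
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                # Drop everything up to and including the closing delimiter.
                return "\n".join(lines[i + 1:]).lstrip("\n")
    return skill_md  # No (or unterminated) frontmatter: return unchanged.

source = """---
name: code-review
triggers: [review, pr]
---
# Code Review

Check tests, naming, and error handling."""
print(strip_frontmatter(source))  # Plain Markdown, starting at "# Code Review"
```

The `triggers` metadata lost in stripping is what `registry.yaml` preserves, so trigger information survives even though individual portable skills are plain documents.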
## Multi-Model Coordination
Different AI models have different strengths. POS matches tasks to models using capability levels:
| Level | Models | Best For |
|---|---|---|
| basic | Haiku, Flash, 4o-mini | Status checks, docs, formatting |
| standard | Sonnet, GPT-4o, Gemini Pro | Features, bugs, code review |
| advanced | Sonnet+, advanced models | Architecture, refactoring |
| reasoning | Opus, o3, deep reasoning | Planning, root cause analysis |
The task queue labels each task with a required capability level. A basic model sees only documentation tasks. A reasoning model sees architecture decisions. But capability matching alone isn’t enough. The models need a way to share what they learned.
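The labeling described above can be sketched as a filter over the task queue. The ordering of levels and the "at or below its own level" rule are assumptions for illustration, as is the task dictionary shape:

```python
# Sketch of capability-based task filtering. Level ordering and task fields
# are illustrative assumptions, not the actual POS queue format.

LEVELS = ["basic", "standard", "advanced", "reasoning"]

def visible_tasks(queue: list[dict], model_level: str) -> list[dict]:
    # Assumption: a model sees tasks at or below its capability level.
    rank = LEVELS.index(model_level)
    return [t for t in queue if LEVELS.index(t["capability"]) <= rank]

queue = [
    {"task": "update README", "capability": "basic"},
    {"task": "fix webhook bug", "capability": "standard"},
    {"task": "design retry architecture", "capability": "reasoning"},
]
print([t["task"] for t in visible_tasks(queue, "basic")])      # → ['update README']
print([t["task"] for t in visible_tasks(queue, "reasoning")])  # → all three tasks
```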
## The Continuity Problem
AI sessions are stateless. When you close Claude Code and reopen it, the model has no memory of the previous conversation. Everything it had accumulated disappears: what you were working on, what decisions were made, what was tried and failed.
Handoffs solve this by creating a persistent record at the end of each session that the next session reads at startup.
## The Session Lifecycle
Every AI tool that enters POS follows three steps:
### 1. Register
When a session starts, it announces itself:
```yaml
# .handoff/sessions/claude-code.yaml
agent: claude-code
context: ticketapp
capability: reasoning
started: "2026-03-18T09:00:00Z"
current_task: null
files_touched: []
```
Other tools can see who’s active. The system knows which context is in use.
### 2. Work
During the session, the tool updates its session file:
```yaml
current_task: "implementing Stripe webhook handler"
files_touched:
  - app/Http/Controllers/WebhookController.php
  - tests/Feature/WebhookTest.php
  - routes/api.php
```
If the session crashes, there’s still a record of what was being worked on.
### 3. Close
When the session ends, the tool creates a handoff record:
```yaml
# .handoff/handoffs/2026-03-18-claude-code.yaml
agent: claude-code
context: ticketapp
started: "2026-03-18T09:00:00Z"
ended: "2026-03-18T11:30:00Z"
summary: |
  Implemented Stripe webhook handler for payment events.
  Added signature verification and event routing.
  Tests pass for payment_intent.succeeded and charge.refunded events.
completed:
  - Webhook controller with signature verification
  - Event routing for 4 payment event types
  - Feature tests for success and failure paths
pending:
  - Subscription lifecycle events (not started)
  - Webhook retry handling (deferred, needs architecture decision)
blockers:
  - Need Stripe webhook secret for staging environment
resume_point: |
  Open app/Http/Controllers/WebhookController.php.
  The handleSubscription() method is stubbed but not implemented.
  Start with the customer.subscription.created event type.
```
The handoff record is a complete briefing for the next session. It says what was done, what’s left, what’s blocking, and exactly where to resume.
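Picking up that briefing at the start of the next session can be as simple as finding the newest record. A sketch, relying on the date-prefixed filename convention shown above (the naming and directory layout are taken from the example, not a documented POS API):

```python
# Sketch: at session start, locate the most recent handoff record.
# Assumes the YYYY-MM-DD-agent.yaml naming convention from the example,
# so lexicographic sort order matches chronological order.
import tempfile
from pathlib import Path

def latest_handoff(handoff_dir: Path):
    files = sorted(handoff_dir.glob("*.yaml"))
    return files[-1] if files else None

# Demo with a throwaway directory standing in for .handoff/handoffs/
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "2026-03-17-cursor.yaml").write_text("summary: earlier session\n")
    (root / "2026-03-18-claude-code.yaml").write_text("summary: webhook work\n")
    print(latest_handoff(root).name)  # → 2026-03-18-claude-code.yaml
```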
## Cross-Model Handoffs
The handoff system enables workflows that span multiple AI models:
**Planning phase (Opus):** Reads the project requirements, designs the architecture, creates a sprint plan with phased tasks, and writes a handoff describing the plan and key decisions.

**Implementation phase (Sonnet):** Reads Opus's handoff, picks up the first implementation task, writes code, runs tests, and creates its own handoff describing what was built and what remains.

**Documentation phase (Haiku):** Reads Sonnet's handoff, writes API documentation, formats commit messages, and updates the project README.
Each model reads the previous model’s handoff. Context carries forward without any model needing to re-discover what the others did.
## Conflict Prevention
Multiple AI tools can work simultaneously. Session registration provides visibility:
```yaml
# .state/snapshot.yaml
registered_agents:
  - agent: "claude-code"
    context: "ticketapp"
    capability: "reasoning"
  - agent: "cursor"
    context: "acmecorp"
    capability: "standard"
```
When an agent registers, it sees other active sessions and avoids working on the same context or files. This is visibility-based coordination, not locking. POS trusts tools to be cooperative.
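The check an agent performs at registration time can be sketched as a scan over the snapshot. The structure mirrors the example above; the function name and return shape are illustrative, not the POS API:

```python
# Sketch of visibility-based conflict checking: before starting work, a tool
# scans registered sessions and reports who else holds its target context.
# Snapshot structure follows the example above; this is not the POS API.

def context_conflicts(snapshot: dict, my_agent: str, my_context: str) -> list:
    return [
        a["agent"]
        for a in snapshot.get("registered_agents", [])
        if a["context"] == my_context and a["agent"] != my_agent
    ]

snapshot = {
    "registered_agents": [
        {"agent": "claude-code", "context": "ticketapp", "capability": "reasoning"},
        {"agent": "cursor", "context": "acmecorp", "capability": "standard"},
    ]
}
print(context_conflicts(snapshot, "windsurf", "ticketapp"))  # → ['claude-code']
print(context_conflicts(snapshot, "windsurf", "blog"))       # → []
```

Because this is advisory rather than a lock, a tool that finds a conflict can still proceed; the design only guarantees that it proceeds knowingly.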
## What Cross-Tool Compatibility Costs
There are trade-offs worth acknowledging:
**Lowest common denominator:** The system must work with tools that can only read files. This means no interactive UI, no real-time collaboration, no rich integrations.

**Manual overhead for some tools:** Claude Code gets automatic session management via hooks. Every other tool requires manual registration. This is friction.

**Skill parity gaps:** Claude Code gets slash commands and tool restrictions. Other tools get the portable Markdown version. Same instructions, no automated trigger matching.
These trade-offs are acceptable because the alternative of building separate integrations for each tool is worse. One system that works everywhere at 80% is better than six perfect integrations that each require separate maintenance.
In the final post, Part 5, we look at what still needs work, honest assessments of current gaps, and the roadmap ahead.
This is part 4 of a 5-part series on Building a Personal Operating System for AI-Assisted Development.