
The Problem and the Architecture (POS Part 1)


Every AI coding tool I’ve used has the same fundamental limitation. It forgets everything the moment a session ends.

That sounds like a minor inconvenience until you’re managing five or more projects simultaneously (an employer, freelance clients, side products, nonprofit work, and personal learning goals). At that scale, the forgetting becomes a real operational cost.

The Single-Session Trap

Here is what a typical day looks like without persistent context. You open a new Claude Code session for your event ticketing platform. You spend eight minutes explaining the project structure, the current sprint goals, which files were modified yesterday, and the deployment conventions. You get productive work done. The session ends.

Two hours later, you switch to your payment processing service. Another eight minutes of context-setting. Then your nonprofit bootcamp platform. Then back to the ticketing system, where you explain the same things you explained that morning.

Across five active projects, this re-establishment overhead adds up to 50-75 minutes per day. That isn’t a guess. I tracked it. Nearly an hour of every workday spent telling AI tools things they already knew yesterday.

What Gets Lost

The problem isn’t just time. It’s the quality of what gets lost between sessions:

Project state. Which sprint are we in? What was completed? What’s blocked? A new session has no idea unless you re-explain it.

Accumulated decisions. “We decided to use event sourcing for the payment module because of X.” That context evaporates. Next session, the AI might suggest a different architecture unless you intervene.

Work in progress. You were halfway through a refactor. The AI doesn’t know which files were touched, what the plan was, or what remains.

Cross-project awareness. Your payment service and your ticketing platform share an infrastructure pattern. No single AI session knows both projects exist, let alone that they share conventions.

The Multi-Context Problem

The deeper issue is that AI coding tools were designed for single-project, single-session workflows. When you try to scale them across multiple contexts, four things break down:

No shared state. Each tool maintains its own ephemeral context. There’s no way for one tool to pick up where another left off, or even where a previous session of the same tool left off.

No delegation chain. You can’t say “start the security audit” and have it hand off to “run the deployment” with full context preserved. Every handoff requires manual re-briefing.

No capability matching. Different AI models are better at different tasks. Claude excels at architecture, GPT at certain code generation patterns, Gemini at long-context analysis. But there’s no way to route work to the right model with the right context.

No audit trail. When you’re juggling multiple projects, you need to know what was done, when, and why. AI sessions produce no durable record unless you manually create one.

What a Solution Would Need

Before building anything, I wrote down five constraints that any solution would need to satisfy:

  1. Persist state across sessions. The system must remember project context, decisions, and work in progress without manual re-explanation.
  2. Work with any AI tool. No vendor lock-in. The solution should work with Claude Code, Cursor, Windsurf, GPT, or any future tool.
  3. Load context efficiently. Token budgets are real. The system can’t dump 50,000 lines of context into every session. It needs tiered, selective loading.
  4. Track work reliably. Every task, decision, and handoff needs a durable record.
  5. Require no external dependencies. No databases, no cloud services, no SaaS subscriptions. The system should work offline, on any machine, with tools that already exist.

The solution that emerged is a file-based system where the filesystem itself becomes shared memory.

Design Principle: Everything Is a File

The core architectural insight is simple. AI coding tools are already excellent at reading and writing files. Every major AI coding tool, including Claude Code, Cursor, Windsurf, and Copilot, can read files from disk, follow instructions in markdown, and write structured output. None of them can query a database, call a custom API, or interact with a proprietary state store without significant integration work.

So instead of building infrastructure, I leaned into what already works. The entire system runs on plain files:

  • YAML for structured data (configuration, state, task queues)
  • Markdown for instructions and documentation (skills, rules, plans)
  • Shell scripts for automation (generation, validation, sync)

This choice has four advantages over a database-backed approach:

Any tool can read them. No drivers, no connections, no authentication. If a tool can read a file, it can participate in the system.

Git tracks history. Every change to every file is versioned automatically. You get full audit trails, diffs, and rollback for free.

No infrastructure to maintain. No database server, no cache layer, no message queue. The system runs entirely on the filesystem.

Works offline. No network required. The entire system works on an airplane.

The Config-Driven Core

Everything in POS flows from a single configuration file, pos.yaml. This file defines who you are, what contexts you manage, and how they’re organized:

version: "2.0"

principal:
  name: "Abu Ango"
  timezone: "Africa/Lagos"

contexts:
  - id: acmecorp
    type: "employment"
    name: "AcmeCorp"
    role: "Sr. Developer Advocate"
    shortcuts: ["@acmecorp", "@acme"]
    path: "contexts/acmecorp"

  - id: ticketapp
    type: "product"
    name: "TicketApp"
    role: "CTO"
    focus: "Event ticketing & management"
    stack: "php"
    shortcuts: ["@ticketapp", "@ta"]
    path: "contexts/ticketapp"

  # ... more contexts

From this single file, a set of shell scripts generates everything else. Running ./scripts/pos-generate.sh produces AGENTS.md (the system architecture document any AI tool can read), CLAUDE.md (Claude Code-specific configuration that auto-loads at session start), and the state snapshot. The config is the source of truth; the generated files are derived views.
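To make the derivation step concrete, here is a minimal sketch of what the context-extraction part of pos-generate.sh could look like. The awk logic and the demo input are illustrative, not the actual script; a robust implementation would use a real YAML parser rather than pattern matching on indentation.

```shell
#!/usr/bin/env sh
# Sketch: derive the context section of AGENTS.md from pos.yaml.
# Assumes the flat key layout shown above (illustrative only).

list_contexts() {
  # Read pos.yaml-style input on stdin; emit one markdown
  # bullet per context: "- **id**: Name".
  awk '
    /^  - id:/   { id = $3 }
    /^    name:/ { gsub(/"/, "", $2); print "- **" id "**: " $2 }
  '
}

# Self-contained demo input mirroring the config above.
list_contexts <<'EOF' > agents-contexts.md
contexts:
  - id: acmecorp
    name: "AcmeCorp"
  - id: ticketapp
    name: "TicketApp"
EOF
```

Because the generated file is a pure function of the config, regenerating is always safe: edit pos.yaml, rerun, commit.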

The Template System

Generated files are produced through a template system that uses {{PLACEHOLDER}} syntax. Templates define the structure of a document, while the config and partials define the content.

For example, every skill file can be generated from a template:

{{PREAMBLE}}

## Standard Rules

{{COMMON_RULES}}

## Specific Instructions

[skill-specific content here]

{{SELF_RATING}}

{{OUTPUT_FORMAT}}

Partials like _preamble.md, _common-rules.md, and _self-rating.md live in templates/skills/partials/. When you need to change a rule that applies to all skills (for example, updating the session context instructions), you edit one partial and regenerate. Every skill gets the update.

This is the same principle behind Helm charts or Terraform modules. Define once, instantiate many times, keep the instances consistent.
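The expansion itself can be sketched in a few lines of shell. The name mapping ({{COMMON_RULES}} to _common-rules.md) follows the partials listed above, but the substitution logic here is an assumption, not the actual generate-skills.sh:

```shell
#!/usr/bin/env sh
# Sketch: replace each {{PLACEHOLDER}} line in a template with the
# body of the matching partial file (illustrative implementation).

expand_template() {
  partials="$1"
  while IFS= read -r line; do
    case "$line" in
      "{{"*"}}")
        # {{COMMON_RULES}} -> common-rules -> _common-rules.md
        name=$(printf '%s' "$line" | sed 's/[{}]//g' | tr 'A-Z_' 'a-z-')
        cat "$partials/_$name.md"
        ;;
      *) printf '%s\n' "$line" ;;
    esac
  done
}

# Self-contained demo: one partial, one tiny template.
mkdir -p demo/partials
printf 'Always load session context first.\n' > demo/partials/_preamble.md

expand_template demo/partials <<'EOF' > demo/skill.md
{{PREAMBLE}}

## Specific Instructions
EOF
```

Editing the partial and rerunning the expansion updates every generated skill at once, which is exactly the Helm/Terraform property described above.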

Directory Structure

The full system layout follows a predictable hierarchy:

cto/
  pos.yaml                    # Single source of truth
  AGENTS.md                   # Generated: system architecture
  CLAUDE.md                   # Generated: Claude Code config
  .rules/
    universal.md              # Rules for ALL tools/models
  .claude/
    skills/                   # Claude Code skill definitions
      code-review/SKILL.md
      plan-generation/SKILL.md
      ...
  .skills/                    # Portable versions (any tool)
  scripts/
    pos-generate.sh           # Generate from pos.yaml
    sync-state.sh             # Aggregate state snapshot
    generate-skills.sh        # Generate skills from templates
    validate-skills.sh        # Static validation
    ...
  templates/
    skills/partials/          # Shared skill fragments
  contexts/
    acmecorp/               # Employment context
      projects/
      plans/
      docs/
      status.yaml
    ticketapp/               # Product context
      projects/
        ticketapp-core/      # Actual code repository
      plans/
      status.yaml
    ...
  .state/
    snapshot.yaml             # Auto-generated system dashboard
  .handoff/
    sessions/                 # Active session records
    queue.yaml                # Task queue
    artifacts/                # Cross-skill data
    feedback/                 # Self-rating data

Every context follows the same internal structure: projects/, plans/, docs/, and status.yaml. This consistency means any AI tool that understands one context can navigate all of them.
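For illustration, a single context's status.yaml might look like the fragment below. The exact schema is an assumption inferred from the fields that get aggregated into .state/snapshot.yaml (active, updated, resume); the sprint field is hypothetical.

```yaml
# Hypothetical status.yaml for one context; field names beyond
# active/updated/resume are assumptions, not the real schema.
active: true
updated: "2026-03-18T11:03:00Z"
sprint: "Sprint 12"
resume: "Continue payment webhook refactor"
```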

The Two-Path Pattern

A design pattern that appears throughout POS is offering two ways to accomplish the same thing: a shell script for automation, or a YAML file for manual specification. Both produce identical results.

For example, you can create a new context by running ./scripts/pos-init.sh --context ticketapp, or you can manually add the entry to pos.yaml and create the directory structure yourself. The script is convenient; the manual path is transparent. Neither is privileged.

This matters because AI tools vary in their ability to run scripts. Some can execute bash commands; others can only read and write files. The two-path pattern ensures the system works regardless of the tool’s capabilities.
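As a sketch of the scripted path, the scaffolding step might look like this. The internals and argument handling are illustrative, not the real pos-init.sh; the point is that the manual path (creating the same directories yourself) yields an identical result.

```shell
#!/usr/bin/env sh
# Sketch: scaffold a new context directory (illustrative only).
ctx="${1:-ticketapp}"
base="contexts/$ctx"

# Every context gets the same internal layout.
mkdir -p "$base/projects" "$base/plans" "$base/docs"

# Seed a status file so state aggregation has something to read.
[ -f "$base/status.yaml" ] || printf 'active: true\n' > "$base/status.yaml"
```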

State Aggregation

With a dozen contexts, each with its own status.yaml, you need a way to see the whole picture. That’s what .state/snapshot.yaml provides:

generated: "2026-03-18T17:45:19Z"

queue:
  total_tasks: 28
  available: 11

contexts:
  jobportal:
    active: true
    updated: "2026-03-18T11:03:00Z"
  ticketapp:
    active: true
    updated: "2026-02-27T15:03:00Z"
  learnhub:
    active: false
    resume: "Sprint 10b (DDD Restructure) COMPLETE..."

Running ./scripts/sync-state.sh walks every context, reads its status, and aggregates the result into this single file. Any AI session can read one file and know the state of everything: which projects are active, what was last worked on, and where to resume.
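The aggregation step can be sketched as follows. The demo fixture and the sed-based nesting are illustrative; the real sync-state.sh may normalize fields further rather than re-indenting each status body verbatim.

```shell
#!/usr/bin/env sh
# Sketch: walk contexts/*/status.yaml and combine them into one
# snapshot file (illustrative implementation).

# Self-contained demo fixture: one context with a status file.
mkdir -p contexts/jobportal .state
printf 'active: true\nupdated: "2026-03-18T11:03:00Z"\n' \
  > contexts/jobportal/status.yaml

{
  echo "generated: \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\""
  echo "contexts:"
  for status in contexts/*/status.yaml; do
    [ -f "$status" ] || continue
    ctx=$(basename "$(dirname "$status")")
    echo "  $ctx:"
    sed 's/^/    /' "$status"   # nest the context's fields under its key
  done
} > .state/snapshot.yaml
```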

Why This Works

The architecture works because it aligns with how AI tools already operate. They read files. POS gives them files to read. They follow markdown instructions. POS writes instructions in markdown. They produce structured output. POS defines YAML schemas for that output.

There’s no adaptation layer, no plugin system, no API integration. The filesystem is the API. Git is the database. Markdown is the protocol. Shell scripts are the automation layer.

This means that when a new AI tool appears (and new ones appear regularly), it works with POS on day one, as long as it can read files and follow instructions. That’s a low bar, and every serious AI coding tool clears it.

The next post in this series covers how this architecture translates into executable capabilities through the skill system: 30 slash commands that encode best practices into instructions any AI can follow.

This is part 1 of a 5-part series on Building a Personal Operating System for AI-Assisted Development. POS is open source at github.com/abuango/pos-ai.