# Build a Claude Code Agent for Engineering Management

This is a blueprint to build an AI-powered engineering management system using Claude Code. Hand this to Claude Code and have it build it piece by piece.

Make sure to add your own engineering context — your architecture decisions, team topology, coding standards, and what you've learned from past sprints. The system is only as good as the context you give it. Just tell Claude Code about your team and workflow and it will advise on how to adapt this plan. Building this will be an ongoing process, not a one-shot.

My company AgentEM builds custom autonomous engineering agents — from context extraction through full agentic deployment. [Learn more and book an intro call here](https://agentem.io).

### Workflows vs Agents

Linear and Jira use hardcoded workflows. You configure status columns, automation rules, and notification triggers. The workflow runs the same way every time. If your process changes, you rebuild the automation.

This system uses Claude Code as an agent. You put your engineering context into structured files — architecture decisions, team topology, review standards, sprint learnings. Claude Code reads those files and adapts its behavior based on your team's actual situation. Different architecture? Different specs. Different team capacity? Different sprint plans. Different review standards? Different PR routing.

The tradeoff is that agents require context to be useful. Generic prompts produce generic output. The time you invest filling in your context files is the difference between "another AI tool" and "sounds like our best EM wrote this."

### What This Blueprint Builds

This blueprint has two layers:

**Foundation Layer (Sections 1–10):** Context file templates, 6 skill definitions, database schema, integration specs, and build instructions. This is the knowledge base and skill set — your team's engineering judgment encoded as structured files, and the skills that reason against them.

**Automation Layer (Sections 11–16):** Cron runner, webhook triggers, Slack integration, and CLI commands. This is what makes it an agent instead of a prompt library. The morning risk scan fires at 8am and lands in Slack before standup. A PR opens on GitHub and the Review Orchestrator runs automatically. `agentem sprint-plan` chains Spec → Decompose → Risk Scan in one command.

Together they deliver a working system that runs on a schedule, reacts to events, and posts results without you prompting it.
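
The schedule half of that automation is just a config file the cron runner reads. A minimal sketch of `automation/cron/schedules.yaml`, where the job names, field names, and channel targets are illustrative rather than prescribed:

```yaml
# Illustrative schedule config. Job names, fields, and channel
# targets are assumptions; adapt them to your cron runner.
jobs:
  - name: morning-risk-scan
    skill: risk-detector
    cron: "0 8 * * 1-5"     # 8am, weekdays, lands before standup
    output: "slack:#eng-risks"
  - name: sprint-retro-prep
    skill: retro-analyzer
    cron: "0 15 * * 5"      # Friday 3pm, ahead of the retro
    output: "slack:#eng-retro"
```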

**What this blueprint does not include:** intelligent event routing, confidence scoring, action execution in external systems (creating tickets, assigning reviewers via API), or autonomy graduation. That's the intelligence layer — the difference between an agent that follows hardcoded rules and one that evaluates context and decides what to do. [AgentEM](https://agentem.io) builds that layer for teams who want full autonomy.

### Starter Prompt for Claude Code

```
Read engineering-agent-blueprint.md. Execute the build in this order:
1. Create the full folder structure
2. Scaffold all context files from templates
3. Create the CLAUDE.md system file
4. Build Skill 1 (Spec Generator)
5. Show me what I need to fill in
```

Build one skill at a time. Test each against your real codebase before moving to the next. Don't try to build all 6 skills and the automation layer in one session — Claude Code's context window is the limiting factor on large builds.

## Table of Contents

### Foundation Layer
1. [System Overview](#system-overview)
2. [Architecture](#architecture)
3. [Folder Structure](#folder-structure)
4. [System Instructions for Claude (CLAUDE.md)](#system-instructions)
5. [Context File Templates](#context-file-templates)
6. [Skill Definitions (The 6-Phase Workflow)](#skill-definitions)
7. [Database Schema](#database-schema)
8. [Integration Specifications](#integration-specifications)
9. [Build Instructions — Foundation](#build-instructions)
10. [Operating Principles](#operating-principles)

### Automation Layer
11. [Automation Architecture](#automation-architecture)
12. [Cron Runner](#cron-runner)
13. [Webhook Triggers](#webhook-triggers)
14. [Slack Integration](#slack-integration)
15. [CLI Commands](#cli-commands)
16. [Build Instructions — Automation](#build-instructions-automation)

---

<a name="system-overview"></a>
## 1. System Overview

AgentEM is an AI agent system that helps engineering leaders (PMs, Engineering Managers, Tech Leads, Design Leads) manage software development from strategy to shipping. It uses the "context layer + skills + automation" pattern:

- **Context files** encode your team's engineering judgment as markdown — architecture decisions, team topology, coding standards, what worked, what didn't
- **Skills** define repeatable workflows — spec writing, ticket decomposition, risk detection, code review routing, release management, retrospectives
- **Automation** runs skills on a schedule, reacts to GitHub/Linear events via webhooks, chains skills together, and posts results to Slack — without you prompting anything

The system improves itself: Skill 6 (Retro Analyzer) feeds learnings back into context files that Skills 1-5 read on the next cycle.

### What This System Does NOT Do

- It does not replace engineering judgment. It augments it.
- It does not write code. It manages the process around code.
- It does not make *intelligent* routing decisions. Events map to skills via hardcoded rules. A PR opened always triggers Review Orchestrator, whether it's a one-line typo or a schema migration. (Intelligent routing — where the system evaluates events and decides what to do — is the next layer. See "What's Next" at the end.)
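
A hardcoded event-to-skill map is exactly what it sounds like: a lookup table with no judgment in it. A sketch of what `automation/webhooks/github-handler.ts` might contain (the keys follow GitHub's `event.action` webhook convention; the function name and skill names are this blueprint's, not a library API):

```typescript
// Sketch of a hardcoded event -> skill mapping. No confidence scoring,
// no context evaluation: every matching event triggers the same skill.
type SkillName =
  | "review-orchestrator"
  | "risk-detector"
  | "release-manager";

const GITHUB_EVENT_MAP: Record<string, SkillName> = {
  "pull_request.opened": "review-orchestrator",
  "pull_request.reopened": "review-orchestrator",
  "check_suite.completed": "risk-detector",
  "release.published": "release-manager",
};

export function routeGithubEvent(
  event: string,
  action: string,
): SkillName | null {
  // Unknown events fall through to null; nothing runs.
  return GITHUB_EVENT_MAP[`${event}.${action}`] ?? null;
}
```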

### How Skills vs Workflows Apply Here

Skills are better than static slash commands for this system because each phase requires the agent to adapt based on context: your architecture, your team's capacity, what signals are active, which ADRs apply. A skill references your context files, reads your topology, checks what's in flight, and decides what to do. A slash command runs the same prompt every time regardless of state.

---

<a name="architecture"></a>
## 2. Architecture

### Signal Detection

Development signals are observable indicators that something needs attention. Every signal must be: **Detectable** (findable via API), **Relevant** (correlates to delivery risk), **Timely** (current state), **Specific** (points to action).

| Signal Type | Source | What It Detects |
|---|---|---|
| Scope creep | GitHub/Linear | Tickets added mid-sprint, spec changes after kickoff |
| PR bottleneck | GitHub | PRs open > 48h, review queue depth per engineer |
| Blocked work | Linear/Jira | Tickets in "blocked" state, stale in-progress items |
| Dependency risk | GitHub + Architecture context | Changes touching shared services, cross-team PRs |
| Quality drift | GitHub | Test coverage drops, lint failures trending up, revert rate |
| Velocity anomaly | Linear/Jira | Sprint burndown off-track vs. historical pattern |
| Knowledge concentration | GitHub | Bus factor — files with single contributor |
| Tech debt accumulation | GitHub + Debt register | Repeated workarounds in same area, TODO density |
| Design-dev misalignment | Figma + GitHub | Implementation diverging from design specs |
| Release risk | GitHub + CI/CD | Build failures, flaky test rate, deployment frequency drop |
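
Most of these checks reduce to small pure functions over API data. As a sketch, the PR-bottleneck signal with the 48-hour threshold from the table (the `OpenPr` shape is an assumption about what your GitHub client would return, not the GitHub API's actual payload):

```typescript
// Sketch: flag open PRs with no review activity past a threshold.
interface OpenPr {
  number: number;
  createdAt: string;           // ISO timestamp
  lastReviewAt: string | null; // null if never reviewed
}

export function findBottleneckedPrs(
  prs: OpenPr[],
  now: Date,
  thresholdHours = 48,
): OpenPr[] {
  const cutoffMs = thresholdHours * 60 * 60 * 1000;
  return prs.filter((pr) => {
    // A PR counts as stale from its last review, or from creation
    // if it has never been reviewed at all.
    const lastActivity = new Date(pr.lastReviewAt ?? pr.createdAt);
    return now.getTime() - lastActivity.getTime() > cutoffMs;
  });
}
```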

### Core Business Logic Modules

| Module | Input | Output | Context Dependencies |
|---|---|---|---|
| **Spec Generator** | Product brief | Feature specification | Strategy, ADRs, system map, spec template, learnings |
| **Ticket Decomposer** | Completed spec | Tickets + sprint plan | Ticket template, team topology, capacity, velocity data |
| **Risk Detector** | API signals | Risk digest | Signal sources, architecture context, escalation thresholds |
| **Review Orchestrator** | Open PRs | Review routing + summaries | Review playbook, team topology, skill matrix |
| **Release Manager** | Merged PRs + tickets | Release notes + go/no-go | Release process, definition of done |
| **Retro Analyzer** | Sprint metrics | Retro document + updated learnings | Historical baselines, signal data, learnings files |
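
These modules need a common shape for the signals they pass around. A sketch of what `src/lib/types/index.ts` might define (the field names and severity levels are assumptions, not a prescribed schema):

```typescript
// Sketch of shared signal types; field names and levels are assumptions.
export type Severity = "critical" | "high" | "medium" | "low";

export interface Signal {
  type: string;        // e.g. "pr-bottleneck", "scope-creep"
  source: "github" | "linear" | "ci";
  severity: Severity;
  detectedAt: string;  // ISO timestamp
  detail: string;      // human-readable, points to a specific action
}

const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Order signals so the risk digest leads with what matters most.
export function sortBySeverity(signals: Signal[]): Signal[] {
  return [...signals].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}
```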

---

<a name="folder-structure"></a>
## 3. Folder Structure

Create this exact structure. Skills and context files reference these paths.

```
agentem/
├── CLAUDE.md                           ← Generated from System Instructions section below
├── context/                            ← YOUR engineering expertise (you fill these in)
│   ├── product/
│   │   └── strategy.md                 ← Product strategy, OKRs, priorities
│   ├── architecture/
│   │   ├── system-map.md               ← Services, dependencies, data flows
│   │   ├── tech-debt.md                ← Known debt items with severity scores
│   │   └── adrs/
│   │       └── 001-example.md          ← Architecture decision records
│   ├── team/
│   │   └── topology.md                 ← People, ownership, capacity, skill matrix
│   ├── standards/
│   │   ├── spec-template.md            ← How specs should be written
│   │   ├── ticket-template.md          ← How tickets should be written
│   │   ├── review-playbook.md          ← Code review standards and process
│   │   └── definition-of-done.md       ← What "done" means
│   ├── process/
│   │   ├── sprint-rituals.md           ← Standup, planning, retro cadence
│   │   ├── release-process.md          ← Release cadence, feature flags, rollback
│   │   └── escalation-paths.md         ← When and how to escalate
│   └── learnings/
│       ├── what-works.md               ← Patterns that led to smooth delivery
│       └── what-doesnt.md              ← Anti-patterns to avoid
│
├── projects/                           ← Per-initiative data (agent creates these)
│   └── {project-slug}/
│       ├── strategy.md
│       ├── specs/
│       ├── plans/
│       ├── tickets/
│       ├── reviews/
│       ├── releases/
│       └── retros/
│
├── skills/                             ← Skill definitions (agent builds these)
│   ├── spec-generator/
│   │   └── SKILL.md
│   ├── ticket-decomposer/
│   │   └── SKILL.md
│   ├── risk-detector/
│   │   └── SKILL.md
│   ├── review-orchestrator/
│   │   └── SKILL.md
│   ├── release-manager/
│   │   └── SKILL.md
│   └── retro-analyzer/
│       └── SKILL.md
│
├── automation/                         ← Automation layer (agent builds these)
│   ├── cron/
│   │   ├── scheduler.ts                ← Cron job definitions and runner
│   │   └── schedules.yaml              ← Schedule config (editable)
│   ├── webhooks/
│   │   ├── server.ts                   ← Express server for incoming webhooks
│   │   ├── github-handler.ts           ← GitHub event → skill mapping
│   │   └── linear-handler.ts           ← Linear event → skill mapping
│   ├── slack/
│   │   ├── client.ts                   ← Slack Web API client
│   │   ├── formatter.ts                ← Skill output → Slack blocks
│   │   └── channels.yaml               ← Channel routing config
│   └── cli/
│       ├── agentem.ts                  ← CLI entry point
│       └── commands/                   ← Individual command handlers
│           ├── risk-scan.ts
│           ├── sprint-plan.ts
│           ├── review-check.ts
│           └── retro.ts
│
├── src/
│   ├── core/                           ← Business logic (agent builds these)
│   │   ├── spec-generator/
│   │   ├── ticket-decomposer/
│   │   ├── risk-detector/
│   │   ├── review-orchestrator/
│   │   ├── release-manager/
│   │   └── retro-analyzer/
│   └── lib/
│       ├── clients/                    ← API integrations
│       │   ├── github.ts
│       │   ├── linear.ts              ← Or jira.ts depending on your stack
│       │   ├── slack.ts               ← Slack Web API wrapper
│       │   └── database.ts            ← SQLite via better-sqlite3
│       ├── services/
│       │   └── notifications.ts
│       └── types/
│           └── index.ts
│
├── scripts/
└── package.json
```

**Scaffold command:**
```bash
mkdir -p agentem/{context/{product,architecture/adrs,team,standards,process,learnings},projects,skills/{spec-generator,ticket-decomposer,risk-detector,review-orchestrator,release-manager,retro-analyzer},automation/{cron,webhooks,slack,cli/commands},src/{core/{spec-generator,ticket-decomposer,risk-detector,review-orchestrator,release-manager,retro-analyzer},lib/{clients,services,types}},scripts}
```
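
A starting `package.json` for this layout might look like the following; the dependency versions, the `tsx` runner, and the script names are assumptions to adapt:

```json
{
  "name": "agentem",
  "private": true,
  "type": "module",
  "scripts": {
    "webhooks": "tsx automation/webhooks/server.ts",
    "cron": "tsx automation/cron/scheduler.ts",
    "cli": "tsx automation/cli/agentem.ts"
  },
  "dependencies": {
    "better-sqlite3": "^11.0.0",
    "express": "^4.19.0"
  },
  "devDependencies": {
    "tsx": "^4.0.0",
    "typescript": "^5.0.0"
  }
}
```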

---

<a name="system-instructions"></a>
## 4. System Instructions for Claude (CLAUDE.md)

Generate this file at `agentem/CLAUDE.md`. Replace all `[PLACEHOLDER]` values with the user's actual configuration.

```markdown
# AgentEM — System Instructions

This is a development management agent system. It helps engineering leaders manage
software delivery through 6 skills that read context files encoding team engineering judgment.

## Directory Layout
- `context/` — Engineering expertise files. READ THESE BEFORE EVERY OPERATION.
  - `product/` — Strategy, OKRs, roadmap
  - `architecture/` — System map, ADRs, tech debt register
  - `team/` — Team topology, capacity, skill matrix
  - `standards/` — Spec template, review playbook, definition of done
  - `process/` — Sprint rituals, release process, escalation paths
  - `learnings/` — What works, what doesn't (updated after retros)
- `projects/` — Per-initiative data (one folder per project slug)
- `skills/` — Skill definitions (SKILL.md files)
- `src/` — Business logic and integration clients

## Critical Rules

1. ALWAYS read relevant context files before generating any output.
2. NEVER generate a spec without checking it against ADRs in `context/architecture/adrs/`.
3. NEVER suggest ticket assignments without reading `context/team/topology.md`.
4. ALWAYS flag when proposed work conflicts with priorities in `context/product/strategy.md`.
5. ALWAYS check capacity in `context/team/topology.md` before approving a sprint plan.
6. NEVER update learnings files without presenting changes to the human for approval.
7. When information is missing from context files, note the gap and proceed with best effort. Do not hallucinate context that doesn't exist.
8. When in doubt, surface the decision to the human — don't assume.

## Context File Priority

When information conflicts, trust in this order:
1. User's explicit instruction in the current conversation
2. Context files (most recently updated wins)
3. Data from integrations (GitHub, Linear, etc.)
4. General knowledge

## Tech Stack

- Runtime: [PLACEHOLDER: Node.js / Bun / etc.]
- Language: TypeScript
- Database: SQLite (agentem.db, zero-config, local)
- Project Management: [PLACEHOLDER: Linear / Jira / Notion]
- Source Control: GitHub
- CI/CD: [PLACEHOLDER: GitHub Actions / CircleCI / etc.]

## Integration APIs

| API | Purpose | Rate Limits |
|-----|---------|-------------|
| GitHub REST API | PRs, issues, CI status, file ownership | 5000 req/hr (authenticated) |
| [PLACEHOLDER: Linear/Jira] API | Sprint data, tickets, workload | [PLACEHOLDER] |
| SQLite | Persistent state, metrics, signals | Local file, no limits |

## Signal Routing

| Signal Type | Source | Threshold |
|------------|--------|-----------|
| PR bottleneck | GitHub | Open > [PLACEHOLDER: 48] hours, no review activity |
| Scope creep | [PLACEHOLDER: Linear/Jira] | Tickets added after sprint start date |
| Quality drift | GitHub | Test coverage drops > [PLACEHOLDER: 5]% from baseline |
| Blocked work | [PLACEHOLDER: Linear/Jira] | Blocked state > [PLACEHOLDER: 24] hours |
| Velocity anomaly | [PLACEHOLDER: Linear/Jira] | Burndown [PLACEHOLDER: 30]%+ behind baseline at midpoint |

## The 6-Skill Workflow

Skills execute in order. Each phase's output feeds the next.

1. **Spec Generator** — Brief → Specification (reads: strategy, ADRs, system map, spec template)
2. **Ticket Decomposer** — Spec → Tickets + Sprint Plan (reads: ticket template, topology, capacity)
3. **Risk Detector** — Signals → Risk Digest (reads: all signal sources, architecture, thresholds)
4. **Review Orchestrator** — PRs → Review Routing + Summaries (reads: review playbook, topology)
5. **Release Manager** — Merged work → Release Notes + Go/No-Go (reads: release process, DoD)
6. **Retro Analyzer** — Metrics → Retro Doc + Updated Learnings (reads: baselines, signals, learnings)

## Coding Conventions

- API clients: one file per service in `src/lib/clients/`
- Types: all shared types in `src/lib/types/index.ts`
- Business logic: one module per skill in `src/core/{skill-name}/`
- Context files: always markdown, always in `context/`
- Project outputs: always in `projects/{slug}/{type}/`

## Cost Awareness

- Context file reads: Free (local filesystem)
- Spec and ticket generation: Free (Claude processing only)
- GitHub API: Rate limited — batch requests, cache where possible
- Linear/Jira API: Rate limited — avoid polling more than 2x/day for signal detection
- Database writes: Free (local SQLite, no network, no limits)
```
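
The velocity-anomaly row in the signal routing table reduces to a few lines of arithmetic. A sketch, assuming burndown is measured in points remaining and using the 30% placeholder as the default:

```typescript
// Sketch: is the sprint burndown materially behind baseline at the midpoint?
// "baselineRemaining" is what historical sprints say should remain by now.
export function velocityAnomaly(
  pointsRemaining: number,
  baselineRemaining: number,
  thresholdPct = 30,
): boolean {
  if (baselineRemaining <= 0) return pointsRemaining > 0;
  const behindPct =
    ((pointsRemaining - baselineRemaining) / baselineRemaining) * 100;
  return behindPct >= thresholdPct;
}
```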

---

<a name="context-file-templates"></a>
## 5. Context File Templates

Claude Code: generate these context files **one at a time, in order** by analyzing the current project, repo, codebase, and any documentation you have access to. Use the templates below as the structure, but replace `[bracketed]` placeholders with real data inferred from the repo (git history, file structure, README, package.json, CI config, existing docs, code patterns, etc.). Where you cannot confidently infer a value, make your best educated guess based on what you can observe and mark it with `<!-- REVIEW: inferred from [source] -->` so the human knows to verify it.

**Process for each context file:**
1. Generate the file with inferred data
2. Present it to the human and say: "Here's my baseline for `[filename]`. This is inferred from your repo. Please review, correct anything wrong, and confirm before I move to the next file."
3. Wait for the human to confirm or make edits
4. Only then proceed to the next context file

**Order of generation:**
1. `context/product/strategy.md`
2. `context/architecture/system-map.md`
3. `context/architecture/adrs/001-example.md`
4. `context/architecture/tech-debt.md`
5. `context/team/topology.md`
6. `context/standards/spec-template.md`
7. `context/standards/ticket-template.md`
8. `context/standards/review-playbook.md`
9. `context/standards/definition-of-done.md`
10. `context/process/release-process.md`
11. `context/process/escalation-paths.md`
12. `context/learnings/what-works.md`
13. `context/learnings/what-doesnt.md`

This order matters: later files build on earlier ones (e.g., the review playbook references team topology, the spec template references ADRs).

---

### context/product/strategy.md

```markdown
# Product Strategy — [Your Product Name]

## Mission
[One sentence: what your product does and for whom]

## Current Quarter Priorities (Q_ 20__)

### Priority 1: [Name]
- **Objective:** [What we're trying to achieve]
- **Key Results:**
  - KR1: [Measurable target]
  - KR2: [Measurable target]
- **Status:** [On track / At risk / Behind]
- **Owner:** [Name]

### Priority 2: [Name]
[Same structure]

## What We're NOT Doing This Quarter
[Explicit list — just as important as what you are doing]
- [Thing 1] — Reason: [Why not now]
- [Thing 2] — Reason: [Why not now]

## Competitive Context
[2-3 sentences on market position and key competitive dynamics]

## User Segments
| Segment | Size | Priority | Key Need |
|---------|------|----------|----------|
| [Segment 1] | [Rough size] | P0/P1/P2 | [Core need] |
```

---

### context/architecture/system-map.md

```markdown
# System Architecture Map

## Services

### [Service Name]
- **Owner:** [Team/Person]
- **Language/Stack:** [e.g., Python/FastAPI, TypeScript/Next.js]
- **Repository:** [repo URL]
- **Purpose:** [One sentence]
- **Key Dependencies:** [What it talks to]
- **Database:** [DB type + name]
- **Deployment:** [Where it runs]

### [Next Service]
[Same structure — add one block per service]

## Data Flow

1. **User Request Flow:** Client → [API Gateway] → [Service A] → [Service B] → Database
2. **Background Processing:** Queue → [Worker Service] → [Storage] → [Notification Service]
3. **Analytics Pipeline:** Events → [Collector] → [Data Warehouse] → [Dashboard]

## Shared Infrastructure
- **Auth:** [How authentication works, what service handles it]
- **Messaging:** [Queue system, pub/sub]
- **Storage:** [Object storage, CDN]
- **Monitoring:** [What tools, where dashboards live]

## Known Architectural Constraints
- [Constraint 1: e.g., "All new services must use the shared auth library"]
- [Constraint 2: e.g., "Database migrations require a 2-week lead time for DBA review"]
- [Add all non-negotiable rules your team follows]
```

---

### context/architecture/adrs/001-example.md

```markdown
# ADR-001: [Decision Title]

## Status
[Accepted | Superseded by ADR-XXX | Deprecated]

## Date
[When this was decided]

## Context
[What situation prompted this decision]

## Decision
[What we decided to do]

## Alternatives Considered
1. **[Alternative A]** — [Why we didn't choose this]
2. **[Alternative B]** — [Why we didn't choose this]

## Consequences
- **Positive:** [What we gain]
- **Negative:** [What we trade off]
- **Neutral:** [Side effects to be aware of]

## Review Trigger
[When should we revisit this? e.g., "If request volume exceeds 10K/min"]
```

**Instructions to human:** Create 3-5 ADR files for your most frequently referenced decisions. Number them sequentially (001, 002, etc.). Focus on decisions that come up in planning discussions — the ones where someone proposes an approach and you say "we already decided not to do that."

---

### context/architecture/tech-debt.md

```markdown
# Tech Debt Register

| ID | Area | Description | Severity | Blast Radius | Owner | Status |
|----|------|-------------|----------|-------------|-------|--------|
| TD-001 | [Service/Area] | [What the debt is] | [Critical/High/Medium/Low] | [What it affects] | [Name] | [Open/In Progress/Resolved] |
| TD-002 | [Service/Area] | [What the debt is] | [Critical/High/Medium/Low] | [What it affects] | [Name] | [Open/In Progress/Resolved] |

## Scoring Guide
- **Critical:** Causes incidents, blocks feature work, or creates security risk
- **High:** Slows development significantly, causes regular workarounds
- **Medium:** Annoying but manageable, increases complexity
- **Low:** Cleanup item, no functional impact
```

---

### context/team/topology.md

```markdown
# Team Topology

## Team: [Team Name]

### Members
| Name | Role | Seniority | Primary Domain | Secondary Domain |
|------|------|-----------|----------------|------------------|
| [Name] | [Engineering/PM/Design] | [Junior/Mid/Senior/Staff/Principal] | [Primary area] | [Secondary area] |

### Ownership Map
| Service / Area | Primary Owner | Backup |
|---------------|--------------|--------|
| [Service A] | [Name] | [Name] |
| [Frontend App] | [Name] | [Name] |
| [Data Pipeline] | [Name] | [Name] |

### Current Capacity (per 2-week sprint)
- **Total engineering days:** [X] (after PTO, on-call, meetings)
- **Typical allocation:** [X]% feature work, [X]% bugs/maintenance, [X]% tech debt
- **Known upcoming absences:** [Any PTO in next 2 sprints]
- **On-call impact:** [How on-call rotation affects capacity]

### Skill Matrix (Optional but Valuable)
| Name | [Domain A] | [Domain B] | [Domain C] | [Domain D] |
|------|-----------|-----------|-----------|-----------|
| [Name] | Expert | Working | None | Expert |
| [Name] | Working | Expert | Working | None |

Legend: Expert = can design and review | Working = can implement | None = needs pairing
```

---

### context/standards/spec-template.md

```markdown
# Spec Template — [Your Team Name]

Every feature spec should follow this structure. Sections marked [REQUIRED]
cannot be skipped. Sections marked [IF APPLICABLE] can be omitted with a
brief note explaining why.

## [REQUIRED] Problem Statement
State the problem from the user's perspective. Include evidence (data, user
quotes, support tickets). A spec without a clear problem statement should
not proceed to implementation.

## [REQUIRED] Proposed Solution
Describe what we're building and how users will experience it. Keep this at
the "what" level, not the "how" level. Save implementation details for
Technical Approach.

## [REQUIRED] Success Metrics
2-4 measurable outcomes. At least one must be measurable within 2 weeks of
launch (leading indicator). Avoid vanity metrics.

## [REQUIRED] Scope
Explicitly list what's IN scope and what's OUT of scope. Out-of-scope items
should include a brief rationale for why they're excluded.

## [REQUIRED] Technical Approach
How we'll build it. Reference existing services and patterns. Call out any
new infrastructure, data model changes, or API contract changes. Reference
relevant ADRs.

## [REQUIRED] Risks
At minimum: technical risk, timeline risk, and user adoption risk. Each
must have a mitigation strategy.

## [REQUIRED] Rollout Plan
How we ship: feature flags, percentage rollout, beta program. Include
rollback criteria.

## [IF APPLICABLE] Dependencies
Cross-team or external dependencies with contact person and expected timeline.

## [IF APPLICABLE] Data Model Changes
Schema changes, new tables, migration strategy.

## [IF APPLICABLE] API Changes
New or modified endpoints with request/response shapes.

## [IF APPLICABLE] Design
Link to Figma files or describe UI changes.

## Open Questions
Things that need answers. Each should have an owner and target resolution date.
```

---

### context/standards/ticket-template.md

```markdown
# Ticket Template — [Your Team Name]

## Title
[Verb] + [Object] + [Context]
Example: "Add pagination to /api/users endpoint"

## Description
What needs to be done and why. Link to the parent spec.

## Acceptance Criteria
- [ ] [Specific, testable condition 1]
- [ ] [Specific, testable condition 2]
- [ ] [Specific, testable condition 3]

Each criterion must be verifiable by someone other than the author.

## Estimate
[Your estimation approach: story points, t-shirt sizes, or time-based]
- S = [definition, e.g., "< 4 hours, well-understood, no unknowns"]
- M = [definition, e.g., "1-2 days, some unknowns but approach is clear"]
- L = [definition, e.g., "3-5 days, significant unknowns or cross-service"]
- XL = [definition, e.g., "Should be broken into smaller tickets"]

## Required Fields
- Assignee (suggested, not mandated — team self-assigns)
- Priority: [Critical / High / Medium / Low]
- Sprint: [Current or Backlog]
- Labels: [Your team's label taxonomy]
- Linked spec: [URL to parent specification]
```

---

### context/standards/review-playbook.md

```markdown
# Code Review Playbook

## Philosophy
[Your team's review culture in 2-3 sentences. e.g., "Reviews are
collaborative, not adversarial. The goal is to ship better code, not
prove who's smarter. Bias toward approval with suggestions."]

## Required Before Merge
- [ ] All CI checks pass
- [ ] At least [N] approvals from code owners
- [ ] No unresolved critical comments
- [ ] Test coverage meets threshold ([X]%)
- [ ] No new lint warnings introduced
- [ ] Documentation updated if public API changed

## What Reviewers Should Focus On
1. **Correctness** — Does it do what the ticket says?
2. **Edge cases** — What happens with bad input, empty state, race conditions?
3. **Readability** — Will someone unfamiliar understand this in 6 months?
4. **Performance** — Any N+1 queries, unbounded loops, missing indexes?
5. **Security** — Input validation, auth checks, data exposure?
6. **Testing** — Are the tests testing behavior, not implementation?

## What Reviewers Should NOT Focus On
- Stylistic preferences already handled by linters
- Alternative approaches that aren't clearly better
- Scope expansion ("while you're here, could you also...")

## Review SLAs
- **Small PRs (< 200 lines):** Review within [X] hours
- **Medium PRs (200-500 lines):** Review within [X] business day(s)
- **Large PRs (> 500 lines):** Should have been broken up. If unavoidable, review within [X] business days. Author should provide a walkthrough.

## Patterns We Prefer
- [e.g., "Prefer composition over inheritance"]
- [e.g., "Use early returns to reduce nesting"]
- [e.g., "Database queries go in repository classes, not controllers"]

## Anti-Patterns We Reject
- [e.g., "No raw SQL in route handlers"]
- [e.g., "No console.log in production code"]
- [e.g., "No catch-all error handlers that swallow context"]
```
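
The SLA buckets in the playbook are something the Review Orchestrator can compute directly. A sketch, assuming the line-count thresholds above and purely illustrative hour values:

```typescript
// Sketch: map PR size to the playbook's SLA buckets. The line thresholds
// come from the playbook; the hour values are illustrative placeholders.
export function reviewSlaHours(changedLines: number): number {
  if (changedLines < 200) return 4;    // small: same working day
  if (changedLines <= 500) return 24;  // medium: one business day
  return 48;                           // large: also needs a walkthrough
}
```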

---

### context/standards/definition-of-done.md

```markdown
# Definition of Done

A ticket is done when ALL of the following are true:

- [ ] Code is merged to main
- [ ] All CI checks pass
- [ ] Test coverage meets [X]% threshold for changed files
- [ ] No new lint warnings introduced
- [ ] PR has [N] approvals from code owners
- [ ] Documentation updated (if public API changed)
- [ ] Feature flag configured (if applicable)
- [ ] Acceptance criteria verified by author
- [ ] No known regressions introduced
- [ ] [Add your team-specific items here]
```

---

### context/process/release-process.md

```markdown
# Release Process

## Cadence
[e.g., "Weekly releases on Tuesdays" or "Continuous deployment to staging, weekly to production"]

## Pre-Release Checklist
- [ ] All tickets in release are Done (see Definition of Done)
- [ ] No critical or high-severity open bugs in release scope
- [ ] Release notes drafted
- [ ] Stakeholders notified of upcoming changes
- [ ] Rollback plan documented

## Feature Flags
[Your feature flag strategy, e.g., "All new user-facing features ship behind flags. Flags are removed after 2 weeks of stable 100% rollout."]

## Rollback Criteria
[When to roll back, e.g., "Error rate > 1% on affected endpoints" or "P0 bug reported within 2 hours of deploy"]

## Communication
- **Internal (engineering):** [Where, e.g., #releases Slack channel]
- **Internal (stakeholders):** [Where, e.g., email to product@ list]
- **External (users):** [Where, e.g., changelog page, in-app notification]
```

---

### context/process/escalation-paths.md

```markdown
# Escalation Paths

## Signal Severity → Action

| Severity | Response Time | Who Gets Notified | Example |
|----------|-------------|-------------------|---------|
| Critical | Immediately | [Engineering Lead + PM] | Production incident, security vulnerability, data loss |
| High | Within 4 hours | [Engineering Lead] | Sprint goal at risk, major blocker, quality regression |
| Medium | Next standup | [Relevant ticket owner] | PR bottleneck, blocked ticket, minor scope creep |
| Low | Next planning | [Team backlog] | Tech debt observation, process improvement suggestion |

## Decision Authority
- **Technical approach:** [Who decides, e.g., "Senior engineer on the affected service"]
- **Scope changes mid-sprint:** [Who decides, e.g., "PM + Engineering Lead together"]
- **Release go/no-go:** [Who decides, e.g., "Engineering Lead with PM input"]
- **Architecture changes:** [Who decides, e.g., "Requires ADR + team review"]
```

---

### context/learnings/what-works.md

```markdown
# What Works — Engineering Learnings

This file is updated by the Retro Analyzer (Skill 6) after each sprint.
Manual additions welcome. All skills read this before generating output.

## Spec Writing
- [Seed with 1-2 learnings from your experience]

## Ticket Decomposition
- [Seed with 1-2 learnings]

## Code Review
- [Seed with 1-2 learnings]

## Sprint Planning
- [Seed with 1-2 learnings]

## Releases
- [Seed with 1-2 learnings]
```

---

### context/learnings/what-doesnt.md

```markdown
# What Doesn't Work — Anti-Patterns to Avoid

Updated after retros. All skills read this to avoid known failure modes.

## Spec Writing
- [Seed with 1-2 anti-patterns you've experienced]

## Planning
- [Seed with 1-2 anti-patterns]

## Estimation
- [Seed with 1-2 anti-patterns]

## Process
- [Seed with 1-2 anti-patterns]
```

---

<a name="skill-definitions"></a>
## 6. Skill Definitions

Claude Code: generate each skill as a `SKILL.md` file in its corresponding `skills/{name}/` directory.

---

### Skill 1: Spec Generator

**File:** `skills/spec-generator/SKILL.md`

```markdown
---
name: agentem-spec-generator
description: "Generates structured feature specifications from product briefs. Activate when the user wants to create a spec, PRD, feature proposal, or technical design document. Reads product strategy, architecture context, and team constraints to produce specs that respect existing patterns and decisions."
---

# Spec Generator

## Purpose
Transform a product brief into a complete, implementable feature specification that an engineering team can read, challenge, and decompose into tickets without going back to the PM for clarification.

## When to Activate
- User says "write a spec for...", "create a PRD", "spec out this feature"
- User provides a problem statement and asks for a structured plan
- User references an initiative and asks how to scope it

## Workflow

### Step 1: Gather Context
Before writing anything, read:
1. `context/product/strategy.md` — Verify initiative aligns with priorities. If it conflicts, flag before proceeding.
2. `context/architecture/system-map.md` + `context/architecture/adrs/` — Understand topology, check for relevant past decisions.
3. `context/architecture/tech-debt.md` — Surface "while you're in there" opportunities.
4. `context/standards/spec-template.md` + `context/standards/definition-of-done.md` — Use team's format and quality bar.
5. `context/learnings/what-works.md` + `context/learnings/what-doesnt.md` — Apply past lessons, avoid known failures.
6. If GitHub/Linear access available: search for related existing work.

### Step 2: Clarify (Only If Needed)
If the brief is underspecified, ask at most three targeted questions, chosen from:
- Who is this for?
- What does success look like (measurable)?
- What are the hard constraints?
- What's explicitly out of scope?
If the brief is sufficient, skip this step entirely.

### Step 3: Generate the Spec
Follow the team's spec template from `context/standards/spec-template.md`. If none exists, use this default structure:

- Problem Statement (grounded in evidence)
- Proposed Solution (user experience level, not implementation)
- Success Metrics (2-4 measurable, at least 1 leading indicator)
- Scope (explicit in-scope AND out-of-scope with rationale)
- Technical Approach (architecture, key decisions referencing ADRs, dependencies, data model, API changes)
- Risks & Mitigations (table: risk, likelihood, impact, mitigation)
- Rollout Plan (feature flags, percentage rollout, rollback criteria)
- Open Questions (with owners and target dates)

### Step 4: Cross-Reference
After generating:
1. Check against ADRs — flag conflicts or propose superseding with rationale
2. Check tech debt register — flag debt in affected area
3. Check dependency map — flag cross-team coordination needs
4. Check capacity — flag if scope exceeds one sprint
5. Estimate complexity — provide t-shirt size (S/M/L/XL)

### Step 5: Output
1. Save to `projects/{project-slug}/specs/{feature-slug}/spec.md`
2. Suggest reviewers from `context/team/topology.md` (affected service owners)
3. If integration available, create spec review ticket in Linear/Jira

## Quality Criteria
- [ ] References specific architecture components from system map
- [ ] Includes measurable success metrics
- [ ] Has explicit scope boundaries (in AND out)
- [ ] Identifies risks with concrete mitigations
- [ ] Respects ADRs or explicitly proposes overriding them
- [ ] Flags cross-team dependencies
- [ ] Includes rollout plan with rollback criteria
- [ ] Sized appropriately (not a one-liner, not a novel)

## Adaptation Rules
- Detailed brief → skip clarification, generate directly
- Problem statement only → generate 2-3 solution approaches, ask user to choose
- Thin architecture context → note in risks, recommend architecture review
- No relevant ADRs → flag as opportunity to create one
- XL initiative → recommend breaking into phased specs

## Error Handling
- Missing context files → note gap, proceed with best effort, suggest user create them
- Conflicting ADRs → surface both, ask user to resolve
- Scope too large → produce Phase 1 spec + Future Phases appendix
- No clear problem statement → do NOT generate spec. Ask user to articulate problem first.
```

---

### Skill 2: Ticket Decomposer

**File:** `skills/ticket-decomposer/SKILL.md`

```markdown
---
name: agentem-ticket-decomposer
description: "Breaks feature specifications into implementable tickets with acceptance criteria, estimates, and sprint plans. Activate when user has a completed spec and wants to create work items, plan a sprint, or decompose a feature into tasks."
---

# Ticket Decomposer

## Purpose
Take a completed spec and produce a set of tickets that engineers can pick up and implement without ambiguity, grouped into a sprint plan that respects team capacity.

## When to Activate
- User says "break this into tickets", "decompose this spec", "plan the sprint for..."
- User provides a spec and asks for implementation planning
- User asks "how should we build this?"

## Workflow

### Step 1: Read Context
1. The spec to decompose (from `projects/{slug}/specs/`)
2. `context/standards/ticket-template.md` — ticket format
3. `context/team/topology.md` — who's available, what they know, current capacity
4. `context/learnings/what-works.md` — apply decomposition lessons (e.g., ticket sizing)
5. `context/learnings/what-doesnt.md` — avoid known estimation failures
6. Historical velocity data if database is available

### Step 2: Decompose
For each section of the spec's Technical Approach:
1. Identify distinct implementable units (one concern per ticket)
2. Write acceptance criteria that are testable by someone other than the author
3. Estimate using the team's estimation approach from ticket template
4. Identify dependencies between tickets (what must ship before what)
5. Suggest assignee based on skill matrix and current load

### Step 3: Sequence
1. Order tickets by dependency chain (blocking work first)
2. Group into sprint-sized batches using capacity from topology
3. Validate total estimate against available capacity (plan to 80%, not 100%)
4. Flag if total work exceeds one sprint — propose phasing

### Step 4: Output
1. Individual ticket files in `projects/{slug}/tickets/`
2. Sprint plan in `projects/{slug}/plans/{sprint-name}.md`
3. If Linear/Jira integration available, create tickets via API
4. Summary: total tickets, total estimate, proposed sprint allocation, risks

## Quality Criteria
- [ ] Each ticket has one clear concern (not a mini-project)
- [ ] Acceptance criteria are specific enough to test
- [ ] Estimates are consistent with team's historical velocity
- [ ] Dependencies are explicitly ordered
- [ ] Sprint plan doesn't exceed 80% of stated capacity
- [ ] No ticket estimated at XL (break it down further)

## Error Handling
- Spec too vague in Technical Approach → flag specific gaps, ask user to clarify before decomposing
- Team capacity not documented → ask user for available engineering days
- Estimate exceeds 2 sprints → recommend spec be split into phases
```
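
The 80% capacity check in Step 3 is simple enough to sketch directly. Everything here (the names, point-based estimates, and the greedy fill) is illustrative, not a prescribed implementation:

```typescript
// Hypothetical sketch of the Step 3 capacity check: plan to 80% of stated
// capacity and push anything beyond the budget into an overflow list for
// a later sprint or phase.
interface PlannedTicket {
  title: string;
  estimatePoints: number;
}

interface CapacityCheck {
  budget: number;            // 80% of stated capacity
  planned: PlannedTicket[];  // tickets that fit within the budget
  overflow: PlannedTicket[]; // tickets to defer
}

function checkCapacity(
  tickets: PlannedTicket[],
  capacityPoints: number,
  planningFactor = 0.8,
): CapacityCheck {
  const budget = capacityPoints * planningFactor;
  const planned: PlannedTicket[] = [];
  const overflow: PlannedTicket[] = [];
  let used = 0;

  for (const t of tickets) {
    if (used + t.estimatePoints <= budget) {
      planned.push(t);
      used += t.estimatePoints;
    } else {
      overflow.push(t);
    }
  }
  return { budget, planned, overflow };
}
```

A non-empty `overflow` is the signal to propose phasing, per Step 3.4.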

---

### Skill 3: Risk Detector

**File:** `skills/risk-detector/SKILL.md`

```markdown
---
name: agentem-risk-detector
description: "Monitors development signals across GitHub and project management tools to detect delivery risks. Activate on request for risk scan, sprint health check, or when user asks about blockers, bottlenecks, or delivery risks."
---

# Risk Detector

## Purpose
Scan active development signals and produce a prioritized risk digest with severity, evidence, suggested action, and owner.

## When to Activate
- User says "run a risk scan", "what's at risk?", "sprint health check"
- User asks about blockers, bottlenecks, or delivery status
- Scheduled daily run (if configured)

## Workflow

### Step 1: Read Context
1. `context/architecture/system-map.md` — for blast radius assessment
2. `context/team/topology.md` — for routing risks to owners
3. `context/process/escalation-paths.md` — for severity thresholds
4. Historical signal data if database available

### Step 2: Scan Signals
For each configured signal source:

**GitHub signals:**
- List open PRs. Flag any open > [threshold] hours with no review activity.
- Check CI status on main branch. Flag persistent failures.
- Check recent revert rate. Flag if above baseline.
- Identify files changed by only one contributor (bus factor).

**Linear/Jira signals:**
- List tickets added after sprint start (scope creep).
- List tickets in blocked state > [threshold] hours.
- Calculate current burndown vs. historical baseline.
- List tickets in-progress > [threshold] days without PR.

### Step 3: Score and Prioritize
For each detected signal:
1. Assign severity: Critical / High / Medium / Low (using thresholds from escalation-paths.md)
2. Identify owner from topology
3. Suggest specific action
4. Cross-reference against architecture context for blast radius

### Step 4: Output
Produce risk digest as markdown:

    ## Risk Digest — [Date]

    ### Critical (Act Now)
    [Any critical signals with evidence and suggested action]

    ### High (Act Today)
    [High signals]

    ### Medium (Discuss at Standup)
    [Medium signals]

    ### Low (Track)
    [Low signals]

    ### All Clear
    [Signal types that returned no issues]

Save to `projects/{active-project}/reviews/risk-digest-{date}.md`
If database available, save signals to signals table.

## Error Handling
- API rate limited → note which signals couldn't be scanned, suggest retry timing
- No integration configured → list what signals WOULD be detected if connected, suggest setup
- All signals clear → report this explicitly (no news is good news, but confirm you checked)
```
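
The stale-PR signal from Step 2 is representative of how observable signals work: a threshold applied to queryable data. A sketch, with hypothetical field names standing in for whatever your GitHub client from Section 8 returns:

```typescript
// Hypothetical sketch of one GitHub signal: flag PRs open past a review
// threshold with no review activity. Field names are assumptions; adapt
// them to your actual GitHub client.
interface OpenPR {
  number: number;
  title: string;
  openedAt: string;          // ISO timestamp
  hasReviewActivity: boolean;
}

interface Signal {
  type: "stale-pr";
  severity: "medium";
  title: string;
}

function hoursSince(iso: string, now: Date): number {
  return (now.getTime() - new Date(iso).getTime()) / 36e5;
}

function detectStalePRs(
  prs: OpenPR[],
  thresholdHours: number,
  now: Date = new Date(),
): Signal[] {
  return prs
    .filter((pr) => !pr.hasReviewActivity && hoursSince(pr.openedAt, now) > thresholdHours)
    .map((pr): Signal => ({
      type: "stale-pr",
      severity: "medium",
      title: `PR #${pr.number} open ${Math.round(hoursSince(pr.openedAt, now))}h with no review: ${pr.title}`,
    }));
}
```

Each signal type in Step 2 follows this shape: query, threshold, structured signal with evidence.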

---

### Skill 4: Review Orchestrator

**File:** `skills/review-orchestrator/SKILL.md`

```markdown
---
name: agentem-review-orchestrator
description: "Analyzes open PRs and suggests optimal reviewers based on expertise, load balance, and knowledge-spreading goals. Generates review summaries for complex PRs. Activate when user asks about PR status, review assignments, or code review bottlenecks."
---

# Review Orchestrator

## Purpose
Ensure the right people review the right code, reviews happen promptly, and no one person is overloaded.

## When to Activate
- User asks "who should review this PR?", "what PRs need review?"
- User asks about review bottlenecks or PR aging
- User asks for a summary of a complex PR

## Workflow

### Step 1: Read Context
1. `context/standards/review-playbook.md` — review standards and SLAs
2. `context/team/topology.md` — ownership map and skill matrix
3. `context/standards/definition-of-done.md` — merge requirements

### Step 2: Analyze PRs
For each open PR (via GitHub API):
1. Identify changed files and map to owners from topology
2. Classify size: Small (< 200 lines), Medium (200-500), Large (> 500)
3. Check current age against SLAs from review playbook
4. Check if PR touches shared services or crosses ownership boundaries → flag for architecture review
5. Count current open review assignments per engineer → balance load

### Step 3: Route Reviews
For each PR:
- **Primary reviewer:** File owner from topology (deepest expertise)
- **Secondary reviewer:** Someone who would benefit from exposure (knowledge spreading)
- **Architecture review:** Required if PR touches shared infrastructure or crosses 3+ service boundaries
- Avoid assigning to anyone with > [threshold] open reviews

### Step 4: Generate Summaries (Large PRs Only)
For PRs > 500 lines:
- What changed (grouped by area)
- Why (link to ticket/spec)
- What to focus on during review
- Known risks or tricky parts

### Step 5: Output
- Review assignments/suggestions per PR
- Aging report (PRs exceeding SLA)
- Load balance report (reviews per engineer)
- Summaries for large PRs

## Error Handling
- No CODEOWNERS file → use topology ownership map as fallback
- Reviewer on PTO → skip them, note gap, suggest alternate
- PR too large → suggest author split before review
```
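
Two pieces of this skill reduce to small pure functions: the Step 2 size classification (thresholds from the skill text) and the Step 3 load cap. A sketch, with an assumed open-review-count map:

```typescript
// Size thresholds come from Step 2 of the skill; the load-cap logic mirrors
// Step 3's "avoid assigning to anyone with > [threshold] open reviews".
type PRSize = "small" | "medium" | "large";

function classifySize(changedLines: number): PRSize {
  if (changedLines < 200) return "small";
  if (changedLines <= 500) return "medium";
  return "large";
}

// Pick the least-loaded eligible reviewer, skipping anyone at or over the cap.
// Returns undefined when no one is eligible (surface that as a gap, per the
// skill's error handling).
function pickReviewer(
  candidates: string[],
  openReviews: Record<string, number>,
  maxOpenReviews: number,
): string | undefined {
  const eligible = candidates.filter((c) => (openReviews[c] ?? 0) < maxOpenReviews);
  eligible.sort((a, b) => (openReviews[a] ?? 0) - (openReviews[b] ?? 0));
  return eligible[0];
}
```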

---

### Skill 5: Release Manager

**File:** `skills/release-manager/SKILL.md`

```markdown
---
name: agentem-release-manager
description: "Aggregates completed work for a release, generates release notes and changelog, runs go/no-go checklist, and drafts stakeholder communication. Activate when user is preparing a release, asks for release notes, or needs a go/no-go assessment."
---

# Release Manager

## Purpose
Ensure releases ship with proper documentation, quality verification, and stakeholder communication.

## When to Activate
- User says "prepare the release", "generate release notes", "are we ready to ship?"
- User asks for a go/no-go assessment
- User needs stakeholder communication about a release

## Workflow

### Step 1: Read Context
1. `context/process/release-process.md` — cadence, checklist, flag strategy
2. `context/standards/definition-of-done.md` — quality bar
3. `context/process/escalation-paths.md` — who to notify

### Step 2: Gather Release Data
Via GitHub + Linear/Jira:
1. List all tickets completed since last release
2. List all PRs merged since last release tag
3. Check CI status on release branch
4. Identify any open bugs tagged for this release
5. Check feature flag status for new features

### Step 3: Generate Artifacts

**Release Notes (user-facing):**
- Group changes by category (Features, Improvements, Bug Fixes)
- Write in user-understandable language (not commit messages)
- Highlight breaking changes prominently

**Changelog (developer-facing):**
- List all PRs merged with links
- Note any API changes, schema migrations, config changes
- Flag any changes requiring manual intervention

**Go/No-Go Report:**
- Run through pre-release checklist from release-process.md
- Flag any checklist items that aren't satisfied
- Provide clear SHIP / HOLD recommendation with rationale

**Stakeholder Update Draft:**
- Summary of what's shipping
- Impact on users
- Any known issues or limitations
- Rollback plan reference

### Step 4: Output
Save all artifacts to `projects/{slug}/releases/{version}/`
- release-notes.md
- changelog.md
- go-no-go.md
- stakeholder-update.md

## Error Handling
- Tickets not properly closed → flag with list of incomplete items
- CI failures on release branch → HOLD recommendation with details
- Missing release process context → use minimal checklist, suggest user document their process
```

---

### Skill 6: Retro Analyzer

**File:** `skills/retro-analyzer/SKILL.md`

```markdown
---
name: agentem-retro-analyzer
description: "Analyzes sprint metrics, identifies delivery patterns, generates retrospective documents, and proposes updates to learnings files. Activate at sprint end or when user asks for retro data, velocity analysis, or sprint health review."
---

# Retro Analyzer

## Purpose
Close the feedback loop. Pull sprint metrics, identify patterns, document learnings, and update context files so the system improves every cycle.

## When to Activate
- User says "run the retro", "sprint analysis", "how did we do?"
- End of sprint (if scheduled)
- User asks about velocity trends or delivery patterns

## Workflow

### Step 1: Read Context
1. `context/learnings/what-works.md` + `context/learnings/what-doesnt.md` — current state of learnings
2. Historical metrics from database (if available)
3. Signal data from this sprint's risk scans

### Step 2: Gather Metrics
From Linear/Jira:
- Planned points vs. completed points (velocity)
- Tickets carried over (and why)
- Tickets added mid-sprint (scope creep volume)
- Average cycle time (ticket start → PR merged)

From GitHub:
- PRs merged count
- Average PR turnaround (open → approved)
- Revert rate
- Test coverage trend

From signals table (if available):
- Signals detected this sprint
- How many materialized into real problems
- How many were false positives
- Average time from detection to resolution

### Step 3: Analyze Patterns
Compare this sprint's metrics to historical baselines:
- Velocity: trending up, down, or stable?
- Cycle time: getting faster or slower?
- Scope creep: more or less than usual?
- Review bottlenecks: improving or worsening?
- Quality: coverage and revert rate trends

Identify specific observations:
- What enabled the things that went well?
- What caused the things that didn't go well?
- Were there surprises? What would we do differently?

### Step 4: Generate Retro Document

    # Sprint Retro — [Sprint Name] ([Date Range])

    ## Metrics Summary
    | Metric | This Sprint | Baseline | Trend |
    |--------|------------|----------|-------|
    | Velocity | X pts | Y pts avg | ↑/↓/→ |
    | Cycle Time | X days | Y days avg | ↑/↓/→ |
    | PR Turnaround | X hours | Y hours avg | ↑/↓/→ |
    | Scope Changes | +X tickets | +Y avg | ↑/↓/→ |

    ## What Went Well
    [Data-backed observations about what worked]

    ## What Didn't Go Well
    [Data-backed observations about what struggled]

    ## Action Items
    - [ ] [Specific action with owner and due date]

    ## Proposed Learnings Updates
    [Changes to suggest for what-works.md and what-doesnt.md]


### Step 5: Propose Learnings Updates
CRITICAL: Do NOT directly update learnings files. Present proposed additions to the human:
- "I'd like to add to what-works.md: [proposed entry]"
- "I'd like to add to what-doesnt.md: [proposed entry]"
Wait for human approval before writing to learnings files.

### Step 6: Output
1. Save retro document to `projects/{slug}/retros/{sprint-name}.md`
2. If approved, append to `context/learnings/what-works.md` and `context/learnings/what-doesnt.md`
3. Save metrics to database if available

## Error Handling
- No historical baseline → establish this sprint as baseline, note that trends require 3+ sprints
- Incomplete data (e.g., not all tickets properly closed) → note gaps, analyze what's available
- No team retro input → generate data-only retro, flag that qualitative input would strengthen it
```
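
The trend arrows in the metrics table reduce to a comparison against baseline with a tolerance band, so small wobbles read as stable. A sketch (the 10% tolerance is an assumption to tune):

```typescript
// Classify a metric against its historical baseline. Tolerance keeps noise
// from showing up as a trend.
type Trend = "up" | "down" | "stable";

function classifyTrend(current: number, baseline: number, tolerance = 0.1): Trend {
  // No baseline yet: the skill's error handling treats the first sprint as baseline.
  if (baseline === 0) return "stable";
  const delta = (current - baseline) / baseline;
  if (delta > tolerance) return "up";
  if (delta < -tolerance) return "down";
  return "stable";
}
```

Whether "up" is good depends on the metric (velocity up is good; cycle time up is not), so keep direction and interpretation separate.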

---

<a name="database-schema"></a>
## 7. Database Schema

**Not required for Skills 1-2.** Set up when you need persistent state (Skill 3+).

Recommended: **SQLite** — zero setup, no accounts, no API keys, no network dependency. A single `agentem.db` file lives in the repo alongside your context files. Perfect for single-team local use. If you later productize this as a multi-user hosted platform, migrate to Postgres/Supabase at that point.

```sql
-- Enable foreign keys (SQLite has them off by default)
PRAGMA foreign_keys = ON;

-- Projects
CREATE TABLE projects (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  slug TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  status TEXT DEFAULT 'active',
  strategy_summary TEXT,
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);

-- Specs
CREATE TABLE specs (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  project_id TEXT REFERENCES projects(id),
  slug TEXT NOT NULL,
  title TEXT NOT NULL,
  status TEXT DEFAULT 'draft',
  complexity TEXT,
  file_path TEXT,
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now')),
  UNIQUE(project_id, slug)
);

-- Sprints
CREATE TABLE sprints (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  name TEXT NOT NULL,
  start_date TEXT NOT NULL,
  end_date TEXT NOT NULL,
  capacity_days REAL,
  planned_points REAL,
  completed_points REAL,
  status TEXT DEFAULT 'planning',
  created_at TEXT DEFAULT (datetime('now'))
);

-- Tickets
CREATE TABLE tickets (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  spec_id TEXT REFERENCES specs(id),
  external_id TEXT,
  external_url TEXT,
  title TEXT NOT NULL,
  description TEXT,
  acceptance_criteria TEXT,
  estimate_points REAL,
  assignee TEXT,
  status TEXT DEFAULT 'backlog',
  priority TEXT,
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);

-- Sprint-Ticket junction
CREATE TABLE sprint_tickets (
  sprint_id TEXT REFERENCES sprints(id),
  ticket_id TEXT REFERENCES tickets(id),
  status TEXT DEFAULT 'planned',
  PRIMARY KEY (sprint_id, ticket_id)
);

-- Signals
CREATE TABLE signals (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  type TEXT NOT NULL,
  severity TEXT NOT NULL,
  title TEXT NOT NULL,
  description TEXT,
  source TEXT,
  reference_url TEXT,
  status TEXT DEFAULT 'open',
  detected_at TEXT DEFAULT (datetime('now')),
  resolved_at TEXT
);

-- Metrics
CREATE TABLE metrics (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  sprint_id TEXT REFERENCES sprints(id),
  metric_type TEXT NOT NULL,
  value REAL NOT NULL,
  recorded_at TEXT DEFAULT (datetime('now'))
);

-- Retros
CREATE TABLE retros (
  id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
  sprint_id TEXT REFERENCES sprints(id),
  went_well TEXT,
  didnt_go_well TEXT,
  action_items TEXT,
  file_path TEXT,
  created_at TEXT DEFAULT (datetime('now'))
);
```

---

<a name="integration-specifications"></a>
## 8. Integration Specifications

### GitHub Client (`src/lib/clients/github.ts`)

Build these functions:

| Function | Purpose | Used By |
|----------|---------|---------|
| `listOpenPRs(repo)` | All open PRs with age, review status, CI | Risk Detector, Review Orchestrator |
| `getPRDetails(repo, number)` | Full PR: diff stats, reviewers, comments | Review Orchestrator |
| `getRecentMerges(repo, since)` | Merged PRs since date/tag | Release Manager |
| `getCIStatus(repo, ref)` | Build/test status | Risk Detector, Release Manager |
| `getFileContributors(repo, paths)` | Contributors per file | Review Orchestrator (bus factor) |
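
A sketch of `listOpenPRs` using Node 18+'s built-in `fetch` against the GitHub REST endpoint `GET /repos/{owner}/{repo}/pulls`. The response fields mapped here (`created_at`, `requested_reviewers`, `draft`) are standard in that endpoint's payload; the summary shape is a local choice:

```typescript
// Minimal sketch of listOpenPRs. The summary type carries just what the
// Risk Detector and Review Orchestrator need; extend as your skills require.
interface OpenPRSummary {
  number: number;
  title: string;
  ageHours: number;
  requestedReviewers: string[];
  draft: boolean;
}

// Pure mapping from a raw GitHub PR object to our summary shape.
function toOpenPRSummary(pr: any, nowMs: number = Date.now()): OpenPRSummary {
  return {
    number: pr.number,
    title: pr.title,
    ageHours: (nowMs - new Date(pr.created_at).getTime()) / 36e5,
    requestedReviewers: (pr.requested_reviewers ?? []).map((r: any) => r.login),
    draft: Boolean(pr.draft),
  };
}

async function listOpenPRs(repo: string, token: string): Promise<OpenPRSummary[]> {
  const res = await fetch(
    `https://api.github.com/repos/${repo}/pulls?state=open&per_page=100`,
    {
      headers: {
        Authorization: `Bearer ${token}`,
        Accept: "application/vnd.github+json",
      },
    },
  );
  if (!res.ok) throw new Error(`GitHub API ${res.status}: ${await res.text()}`);
  return ((await res.json()) as any[]).map((pr) => toOpenPRSummary(pr));
}
```

Keeping the mapping pure makes the client testable without hitting the API.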

### Linear Client (`src/lib/clients/linear.ts`)

| Function | Purpose | Used By |
|----------|---------|---------|
| `getActiveCycle(teamId)` | Current sprint with all issues | Risk Detector, Retro Analyzer |
| `getIssuesByState(teamId, state)` | Filtered issues | Risk Detector |
| `createIssue(input)` | Create ticket | Ticket Decomposer |
| `getCompletedIssues(teamId, since)` | Done tickets for period | Release Manager, Retro Analyzer |
| `getCycleHistory(teamId, count)` | Past N sprints' data | Retro Analyzer |
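
Linear's API is GraphQL, so each function is a query plus a thin mapping layer. A sketch of `getActiveCycle`; the field names (`team.activeCycle`, `cycle.issues`) are based on Linear's public schema and should be verified against their current API reference before you rely on them:

```typescript
// Query shape is an assumption drawn from Linear's GraphQL schema; verify
// against the Linear API docs. Personal API keys go in the Authorization
// header directly (no "Bearer" prefix, per Linear's docs for API keys).
const ACTIVE_CYCLE_QUERY = `
  query ActiveCycle($teamId: String!) {
    team(id: $teamId) {
      activeCycle {
        id
        name
        startsAt
        endsAt
        issues { nodes { identifier title estimate state { name } } }
      }
    }
  }
`;

async function getActiveCycle(teamId: string, apiKey: string): Promise<unknown> {
  const res = await fetch("https://api.linear.app/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: apiKey },
    body: JSON.stringify({ query: ACTIVE_CYCLE_QUERY, variables: { teamId } }),
  });
  if (!res.ok) throw new Error(`Linear API ${res.status}`);
  const { data, errors } = (await res.json()) as { data?: any; errors?: unknown[] };
  if (errors?.length) throw new Error(`Linear GraphQL errors: ${JSON.stringify(errors)}`);
  return data.team.activeCycle;
}

// Pure helper the Retro Analyzer can use on the returned issues: sum estimate
// points for issues in a "done" state. State names are team-configurable.
function completedPoints(
  issues: { estimate?: number | null; state: { name: string } }[],
  doneStates: string[] = ["Done"],
): number {
  return issues
    .filter((i) => doneStates.includes(i.state.name))
    .reduce((sum, i) => sum + (i.estimate ?? 0), 0);
}
```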

### SQLite Client (`src/lib/clients/database.ts`)

Standard CRUD for all tables in the schema. Use `better-sqlite3` (synchronous, fast, zero-config). Database file lives at `agentem/agentem.db`.

| Function | Purpose |
|----------|---------|
| `initDatabase()` | Create tables if not exist, enable foreign keys |
| `upsertProject(project)` | Insert or update project record |
| `saveSignals(signals[])` | Batch insert detected signals |
| `getMetricsForSprint(sprintId)` | Pull metrics for retro analysis |
| `getHistoricalBaselines(metricType, count)` | Average of last N sprints for comparison |

If you later need multi-user access or a hosted dashboard, migrate to Supabase/Postgres at that point. The schema is compatible — just swap `TEXT` IDs for `UUID`, `REAL` for `NUMERIC`, and `datetime('now')` for `now()`.
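
A sketch of two of these functions, written against a minimal interface that mirrors better-sqlite3's `prepare`/`run`/`all` shape so the logic stays testable without a database file. In production you would pass a real `Database` instance from better-sqlite3:

```typescript
// Duck-typed subset of better-sqlite3's API; a real Database satisfies it.
interface Stmt {
  run(...params: unknown[]): unknown;
  all(...params: unknown[]): unknown[];
}
interface Db {
  prepare(sql: string): Stmt;
}

interface SignalRow {
  type: string;
  severity: string;
  title: string;
  source?: string;
}

// Batch insert detected signals (the schema's defaults fill id/detected_at).
function saveSignals(db: Db, signals: SignalRow[]): void {
  const stmt = db.prepare(
    `INSERT INTO signals (type, severity, title, source) VALUES (?, ?, ?, ?)`,
  );
  for (const s of signals) stmt.run(s.type, s.severity, s.title, s.source ?? null);
}

// Average of the last N recorded values for one metric type: the baseline
// the Retro Analyzer compares against.
function getHistoricalBaselines(db: Db, metricType: string, count: number): number {
  const rows = db
    .prepare(
      `SELECT value FROM metrics WHERE metric_type = ?
       ORDER BY recorded_at DESC LIMIT ?`,
    )
    .all(metricType, count) as { value: number }[];
  if (rows.length === 0) return 0;
  return rows.reduce((sum, r) => sum + r.value, 0) / rows.length;
}
```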

---

<a name="build-instructions"></a>
## 9. Build Instructions

Claude Code: execute these in order. Do not skip ahead. Each phase depends on the previous one.

### Phase 1: Scaffold + Auto-Generate Context (Do First)
1. Create the full folder structure from Section 3
2. Analyze the current repo, codebase, and any available documentation (README, ADRs, package.json, CI config, git history, code structure, existing docs)
3. Generate context files **one at a time, in order**, following the process in Section 5. For each file: generate it, present it to the human for review, wait for confirmation, then move to the next file. Do NOT generate all files at once.
4. After all 13 context files are reviewed and confirmed, generate `CLAUDE.md` from Section 4, filling in values based on the confirmed context files
5. Initialize the project (`package.json` with TypeScript, `tsconfig.json`)
6. **Tell the human:** "All context files are reviewed and confirmed. Ready to build Skill 1 (Spec Generator)."

### Phase 2: Skill 1 — Spec Generator (After Human Reviews Context)
1. Create `skills/spec-generator/SKILL.md` from the Skill 1 definition
2. Test by generating a spec for a feature the human describes
3. Evaluate output against the skill's Quality Criteria
4. If output is generic, tell the human which context files need more detail or correction

### Phase 3: Skill 2 — Ticket Decomposer
1. Create `skills/ticket-decomposer/SKILL.md` from the Skill 2 definition
2. Test by decomposing the spec from Phase 2
3. Validate ticket size, acceptance criteria quality, and capacity check

### Phase 4: Integrations (Requires API Keys)
1. Ask the human: "Which project management tool? Linear or Jira?"
2. Ask: "Do you have a GitHub personal access token with repo scope?"
3. Build `src/lib/clients/github.ts` with the functions from Section 8
4. Build `src/lib/clients/{linear|jira}.ts`
5. Test each client function individually before using in skills

### Phase 5: Skill 3 — Risk Detector
1. Create `skills/risk-detector/SKILL.md`
2. Build `src/core/risk-detector/` with signal scanning logic
3. Test against the human's current sprint data
4. Tune thresholds based on false positive rate

### Phase 6: Skills 4-6 — Review, Release, Retro
1. Build each skill in order
2. Test each against real data from the human's repos
3. Skill 6 is the flywheel — once it runs, the system improves itself

### Phase 7: Database (Optional, When Ready for Persistent State)
1. Install `better-sqlite3` (`npm install better-sqlite3`)
2. Build `src/lib/clients/database.ts` with `initDatabase()` that creates tables from schema
3. Run init on first use — creates `agentem.db` at the path the client expects (`agentem/agentem.db`)
4. Connect skills to database for metrics tracking and signal history
5. Add `agentem.db` to `.gitignore` (data is local, not shared via git)

---

<a name="operating-principles"></a>
## 10. Operating Principles

### Context Files Are the Product
The code is plumbing. The context files are what make the agent useful. A spec generator reading your actual ADRs and team topology is 10x more valuable than one that can query GitHub but has no context about your system. Invest in context files first.

### Signals Must Be Observable
"Team morale is low" is not a signal — you can't detect it via API. "3 PRs open > 5 days with no review" is a signal — GitHub surfaces it. Every signal must map to a data source that can be queried programmatically.

### Test Context Before Code
Skills 1 and 2 require zero integrations. Generate 3 specs and decompose them. Compare to your team's actual output. Refine context files until quality matches your standards. Then invest in integrations.

### The System Improves Itself
Skill 6 learnings feed back into context files. The agent writes better specs every cycle because it knows what decomposition patterns led to smooth delivery. This is the flywheel — protect it by actually running retros and reviewing proposed learnings updates.

### Human in the Loop Is a Feature
The agent surfaces information, generates artifacts, and flags risks. Humans make decisions. Never let the agent update context files without review. Never let it escalate without human judgment. The goal is augmented leadership, not autonomous management.

### 80% Capacity Rule
When planning sprints, the system should target 80% of stated capacity. 100% capacity planning fails every time. The remaining 20% absorbs interrupts, bugs, and the unexpected. This is encoded in Skill 2 but should be reinforced in `context/learnings/what-works.md`.

### Start Small, Expand Deliberately
Build Skill 1. Use it for 2 weeks. Build Skill 2. The feedback loop between using a skill and improving context files is where the value comes from. Trying to build all 6 skills at once produces a system nobody trusts.

---


<a name="automation-architecture"></a>
## 11. Automation Architecture

The foundation layer (Sections 1-10) gives you skills you invoke manually — you open Claude Code, prompt a skill, review the output. The automation layer makes the system run itself.

Three automation mechanisms, in order of increasing agency:

| Mechanism | What It Does | Example |
|---|---|---|
| **Cron Runner** | Executes skills on a schedule | Risk scan at 8am every morning, retro prompt every other Friday |
| **Webhook Triggers** | Executes skills in response to events | PR opened → Review Orchestrator, Issue labeled "spec" → Spec Generator |
| **CLI Commands** | Chains skills into pipelines you invoke once | `agentem sprint-plan` runs Spec → Decompose → Risk Scan in sequence |

All three post results to Slack automatically. You don't prompt anything — you read the output in your team channel and act on it.

### What Automation Does NOT Do

- It does not choose which skill to run based on event analysis. Mappings are hardcoded. PR opened always triggers Review Orchestrator, whether the PR is trivial or critical.
- It does not score confidence or decide between auto-executing and asking for approval. It runs the skill and posts the result. You decide what to do with it.
- It does not take actions in external systems. It generates artifacts (specs, risk digests, review assignments) and posts them to Slack. It does not create Linear tickets, assign GitHub reviewers, or merge PRs on its own.

These capabilities — intelligent routing, confidence scoring, action execution, and approval models — are the intelligence layer. See "What's Next" at the end of this document.

### How Automation Invokes Skills

Every automation mechanism uses the same interface: it spawns a Claude Code session with a structured prompt, waits for output, and routes the output to Slack.

```typescript
// automation/lib/skill-runner.ts

interface SkillRunRequest {
  skill: 'spec-generator' | 'ticket-decomposer' | 'risk-detector' | 
         'review-orchestrator' | 'release-manager' | 'retro-analyzer';
  trigger: 'cron' | 'webhook' | 'cli';
  triggerDetail: string;       // e.g., "cron:daily-risk-scan" or "webhook:pr.opened"
  additionalContext?: string;  // event-specific data to include in the prompt
  project?: string;            // project slug for output directory
}

interface SkillRunResult {
  skill: string;
  trigger: string;
  status: 'success' | 'error';
  outputPath: string;          // path to generated artifact
  summary: string;             // one-paragraph summary for Slack
  startedAt: string;
  completedAt: string;
  tokenUsage?: number;
}
```

The skill runner constructs a prompt that includes:
1. The skill's SKILL.md definition
2. The trigger context (what caused this run)
3. Any additional event data (PR details, sprint metrics, etc.)
4. Output directory path

Then it invokes Claude Code, captures the output, and returns a structured result.
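
A minimal sketch of that flow, assuming the Claude Code CLI is on the PATH and supports non-interactive print mode (`claude -p`); verify the flag against your installed CLI version. The request shape here is trimmed from the `SkillRunRequest` interface above:

```typescript
// automation/lib/skill-runner.ts (sketch). Prompt assembly is kept pure so
// it can be tested without spawning the CLI.
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { readFileSync } from "node:fs";

const exec = promisify(execFile);

interface RunRequest {
  skill: string;
  triggerDetail: string;
  additionalContext?: string;
  project?: string;
}

// Assemble the four prompt parts listed above: skill definition, trigger
// context, event data, output directory.
function buildPrompt(skillDef: string, req: RunRequest): string {
  return [
    skillDef,
    `Trigger: ${req.triggerDetail}`,
    req.additionalContext ? `Additional context:\n${req.additionalContext}` : "",
    `Save all output under projects/${req.project ?? "default"}/.`,
  ]
    .filter(Boolean)
    .join("\n\n");
}

async function runSkill(req: RunRequest) {
  const startedAt = new Date().toISOString();
  const skillDef = readFileSync(`skills/${req.skill}/SKILL.md`, "utf-8");
  // Non-interactive single turn; stdout is the skill's output.
  const { stdout } = await exec("claude", ["-p", buildPrompt(skillDef, req)], {
    maxBuffer: 16 * 1024 * 1024, // skill output can be large
  });
  return {
    skill: req.skill,
    trigger: req.triggerDetail,
    status: "success" as const,
    summary: stdout.trim().slice(0, 500),
    startedAt,
    completedAt: new Date().toISOString(),
  };
}
```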

---

<a name="cron-runner"></a>
## 12. Cron Runner

The cron runner uses `node-cron` to execute skills on a schedule. Schedules are defined in YAML for easy editing.

### Schedule Configuration (`automation/cron/schedules.yaml`)

```yaml
# automation/cron/schedules.yaml
# 
# Cron format: second(optional) minute hour day-of-month month day-of-week
# All times in your local timezone (set TZ env var)

schedules:

  # --- Daily ---
  
  morning-risk-scan:
    cron: "0 8 * * 1-5"           # 8:00 AM, Monday-Friday
    skill: risk-detector
    description: "Morning risk digest posted to Slack"
    slack_channel: "#engineering"
    enabled: true

  review-check:
    cron: "0 10 * * 1-5"          # 10:00 AM, Monday-Friday
    skill: review-orchestrator
    description: "Check for stale PRs and route reviews"
    slack_channel: "#code-review"
    enabled: true

  afternoon-risk-scan:
    cron: "0 15 * * 1-5"          # 3:00 PM, Monday-Friday
    skill: risk-detector
    description: "Afternoon risk check — catch anything from morning PRs"
    slack_channel: "#engineering"
    enabled: false                 # Enable when comfortable with morning cadence

  # --- Weekly ---
  
  monday-sprint-kickoff:
    cron: "0 9 * * 1"             # 9:00 AM, Monday
    skill: risk-detector
    additionalContext: "Focus on sprint scope: check for unestimated tickets, missing specs, dependency risks for this sprint's work."
    slack_channel: "#engineering"
    enabled: true

  friday-release-check:
    cron: "0 14 * * 5"            # 2:00 PM, Friday
    skill: release-manager
    description: "End-of-week release readiness assessment"
    slack_channel: "#releases"
    enabled: true

  # --- Bi-weekly (Sprint Cadence) ---
  
  sprint-retro:
    cron: "0 10 * * 5"            # 10:00 AM, every other Friday
    skill: retro-analyzer
    description: "Generate sprint retro document before retro meeting"
    slack_channel: "#engineering"
    enabled: true
    # Note: node-cron doesn't support "every other week" natively.
    # The scheduler checks the sprint counter in the database and skips odd/even weeks.
```

### Cron Runner Implementation (`automation/cron/scheduler.ts`)

```typescript
// automation/cron/scheduler.ts

import * as cron from 'node-cron';
import * as yaml from 'js-yaml';
import * as fs from 'fs';
import { runSkill } from '../lib/skill-runner';
import { postToSlack } from '../slack/client';
import { formatSkillResult } from '../slack/formatter';

interface ScheduleConfig {
  cron: string;
  skill: string;
  description?: string;
  additionalContext?: string;
  slack_channel: string;
  enabled: boolean;
}

export function startScheduler(): void {
  const config = yaml.load(
    fs.readFileSync('automation/cron/schedules.yaml', 'utf-8')
  ) as { schedules: Record<string, ScheduleConfig> };

  for (const [name, schedule] of Object.entries(config.schedules)) {
    if (!schedule.enabled) {
      console.log(`[scheduler] Skipping disabled: ${name}`);
      continue;
    }

    if (!cron.validate(schedule.cron)) {
      console.error(`[scheduler] Invalid cron expression for ${name}: ${schedule.cron}`);
      continue;
    }

    cron.schedule(schedule.cron, async () => {
      console.log(`[scheduler] Running: ${name} (${schedule.skill})`);
      
      try {
        const result = await runSkill({
          skill: schedule.skill as any,
          trigger: 'cron',
          triggerDetail: `cron:${name}`,
          additionalContext: schedule.additionalContext,
        });

        const slackBlocks = formatSkillResult(result, name);
        await postToSlack(schedule.slack_channel, slackBlocks);
        
        console.log(`[scheduler] Completed: ${name} (${result.status})`);
      } catch (error) {
        console.error(`[scheduler] Failed: ${name}`, error);
        // `error` is `unknown` under strict TS; narrow before reading .message
        const message = error instanceof Error ? error.message : String(error);
        await postToSlack(schedule.slack_channel, [{
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `⚠️ *Scheduled task failed:* ${name}\n\`\`\`${message}\`\`\``
          }
        }]);
      }
    });

    console.log(`[scheduler] Registered: ${name} → ${schedule.skill} (${schedule.cron})`);
  }
}

// Allow running this file directly with ts-node (see "Running the Scheduler")
if (require.main === module) {
  startScheduler();
}
```

### Running the Scheduler

```bash
# Start the cron scheduler (runs in foreground, keep alive with pm2 or systemd)
npx ts-node automation/cron/scheduler.ts

# Or with pm2 for production
pm2 start automation/cron/scheduler.ts --name agentem-scheduler --interpreter ts-node
```

### Editing Schedules

Change `schedules.yaml` and restart the scheduler. No code changes needed. To test a schedule immediately:

```bash
# Dry-run a specific schedule
npx ts-node -e "
  import { runSkill } from './automation/lib/skill-runner';
  runSkill({ skill: 'risk-detector', trigger: 'cron', triggerDetail: 'manual-test' })
    .then(r => console.log(r.summary))
"
```

---

<a name="webhook-triggers"></a>
## 13. Webhook Triggers

Webhooks react to events in GitHub and Linear. An event arrives, the handler maps it to a skill, and the skill runs automatically.

### Webhook Server (`automation/webhooks/server.ts`)

```typescript
// automation/webhooks/server.ts

import express from 'express';
import crypto from 'crypto';
import { handleGitHubEvent } from './github-handler';
import { handleLinearEvent } from './linear-handler';

const app = express();

// Keep the raw request bytes: the signature must be verified against the
// exact payload GitHub sent, not a re-serialized req.body.
app.use(express.json({
  verify: (req, _res, buf) => {
    (req as any).rawBody = buf;
  }
}));

// GitHub webhook endpoint
app.post('/webhooks/github', (req, res) => {
  // Verify signature
  const signature = req.headers['x-hub-signature-256'] as string | undefined;
  const expected = 'sha256=' + crypto
    .createHmac('sha256', process.env.GITHUB_WEBHOOK_SECRET!)
    .update((req as any).rawBody)
    .digest('hex');

  // timingSafeEqual throws on length mismatch, so reject missing or
  // wrong-length signatures first
  if (
    !signature ||
    signature.length !== expected.length ||
    !crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected))
  ) {
    return res.status(401).send('Invalid signature');
  }

  const event = req.headers['x-github-event'] as string;
  // Respond quickly; run the handler asynchronously so GitHub doesn't retry
  handleGitHubEvent(event, req.body).catch(err =>
    console.error('[webhooks] GitHub handler failed:', err)
  );
  res.status(200).send('OK');
});

// Linear webhook endpoint
app.post('/webhooks/linear', (req, res) => {
  handleLinearEvent(req.body).catch(err =>
    console.error('[webhooks] Linear handler failed:', err)
  );
  res.status(200).send('OK');
});

app.listen(process.env.WEBHOOK_PORT || 3100, () => {
  console.log(`[webhooks] Listening on port ${process.env.WEBHOOK_PORT || 3100}`);
});
```

### GitHub Event → Skill Mapping (`automation/webhooks/github-handler.ts`)

Mappings are hardcoded. Every PR opened triggers Review Orchestrator. Every push to main triggers Release Manager check. No intelligence — just direct mapping.

```typescript
// automation/webhooks/github-handler.ts

import { runSkill } from '../lib/skill-runner';
import { postToSlack } from '../slack/client';
import { formatSkillResult } from '../slack/formatter';

interface EventMapping {
  skill: string;
  slackChannel: string;
  additionalContext: (payload: any) => string;
}

// Event → Skill mapping table
// To change behavior, edit this map. No other configuration needed.
const GITHUB_MAPPINGS: Record<string, EventMapping> = {
  
  'pull_request.opened': {
    skill: 'review-orchestrator',
    slackChannel: '#code-review',
    additionalContext: (payload) => `
PR #${payload.pull_request.number}: ${payload.pull_request.title}
Author: ${payload.pull_request.user.login}
Base: ${payload.pull_request.base.ref}
Files changed: ${payload.pull_request.changed_files}
Additions: ${payload.pull_request.additions}, Deletions: ${payload.pull_request.deletions}
URL: ${payload.pull_request.html_url}
    `.trim()
  },

  'pull_request.ready_for_review': {
    skill: 'review-orchestrator',
    slackChannel: '#code-review',
    additionalContext: (payload) => `
PR #${payload.pull_request.number} moved from draft to ready for review.
Title: ${payload.pull_request.title}
Author: ${payload.pull_request.user.login}
URL: ${payload.pull_request.html_url}
    `.trim()
  },

  'push': {
    skill: 'release-manager',
    slackChannel: '#releases',
    additionalContext: (payload) => {
      // Only trigger on pushes to main/master
      const branch = payload.ref.replace('refs/heads/', '');
      if (!['main', 'master'].includes(branch)) return '__SKIP__';
      return `
Push to ${branch}: ${payload.commits.length} commit(s)
Pusher: ${payload.pusher.name}
Commits: ${payload.commits.map((c: any) => `- ${c.message}`).join('\n')}
      `.trim();
    }
  },

  'issues.labeled': {
    skill: 'spec-generator',
    slackChannel: '#engineering',
    additionalContext: (payload) => {
      // Only trigger on "spec-needed" label
      if (payload.label.name !== 'spec-needed') return '__SKIP__';
      return `
Issue #${payload.issue.number}: ${payload.issue.title}
Body: ${payload.issue.body || '(empty)'}
Labels: ${payload.issue.labels.map((l: any) => l.name).join(', ')}
URL: ${payload.issue.html_url}
      `.trim();
    }
  },
};

export async function handleGitHubEvent(event: string, payload: any): Promise<void> {
  const action = payload.action ? `${event}.${payload.action}` : event;
  const mapping = GITHUB_MAPPINGS[action] || GITHUB_MAPPINGS[event];
  
  if (!mapping) {
    console.log(`[github] No mapping for event: ${action}`);
    return;
  }

  const context = mapping.additionalContext(payload);
  if (context === '__SKIP__') {
    console.log(`[github] Skipping ${action} — filter condition not met`);
    return;
  }

  console.log(`[github] ${action} → ${mapping.skill}`);
  
  try {
    const result = await runSkill({
      skill: mapping.skill as any,
      trigger: 'webhook',
      triggerDetail: `github:${action}`,
      additionalContext: context,
    });

    const slackBlocks = formatSkillResult(result, `GitHub: ${action}`);
    await postToSlack(mapping.slackChannel, slackBlocks);
  } catch (error) {
    console.error(`[github] Skill execution failed for ${action}:`, error);
  }
}
```

### Linear Event → Skill Mapping (`automation/webhooks/linear-handler.ts`)

```typescript
// automation/webhooks/linear-handler.ts

import { runSkill } from '../lib/skill-runner';
import { postToSlack } from '../slack/client';
import { formatSkillResult } from '../slack/formatter';

export async function handleLinearEvent(payload: any): Promise<void> {
  const { type, action, data } = payload;

  // Issue moved to "In Progress" → risk scan on the associated work
  if (type === 'Issue' && action === 'update' && data.state?.name === 'In Progress') {
    console.log(`[linear] Issue moved to In Progress: ${data.title}`);
    
    const result = await runSkill({
      skill: 'risk-detector',
      trigger: 'webhook',
      triggerDetail: 'linear:issue.in_progress',
      additionalContext: `
Issue: ${data.title}
Assignee: ${data.assignee?.name || 'Unassigned'}
Priority: ${data.priority}
Labels: ${data.labels?.map((l: any) => l.name).join(', ') || 'none'}
Focus risk scan on dependencies and blockers for this specific ticket.
      `.trim()
    });

    await postToSlack('#engineering', formatSkillResult(result, 'Linear: issue in progress'));
  }

  // Cycle (sprint) started → sprint kickoff risk scan
  if (type === 'Cycle' && action === 'create') {
    console.log(`[linear] New cycle started: ${data.name}`);
    
    const result = await runSkill({
      skill: 'risk-detector',
      trigger: 'webhook',
      triggerDetail: 'linear:cycle.started',
      additionalContext: `
New sprint started: ${data.name}
Start: ${data.startsAt}
End: ${data.endsAt}
Run a full sprint scope risk assessment: unestimated tickets, missing specs, dependency risks.
      `.trim()
    });

    await postToSlack('#engineering', formatSkillResult(result, 'Linear: sprint started'));
  }

  // Cycle completed → retro prompt
  // (Linear may send further updates after completion; dedupe on completedAt
  //  in your database if you see repeat retros.)
  if (type === 'Cycle' && action === 'update' && data.completedAt) {
    console.log(`[linear] Cycle completed: ${data.name}`);
    
    const result = await runSkill({
      skill: 'retro-analyzer',
      trigger: 'webhook',
      triggerDetail: 'linear:cycle.completed',
      additionalContext: `Sprint completed: ${data.name}. Generate the full retro document.`
    });

    await postToSlack('#engineering', formatSkillResult(result, 'Linear: sprint completed'));
  }
}
```

### Setting Up Webhooks

**GitHub:**
1. Go to repo Settings → Webhooks → Add webhook
2. Payload URL: `https://your-server.com/webhooks/github`
3. Content type: `application/json`
4. Secret: Set `GITHUB_WEBHOOK_SECRET` env var to same value
5. Events: Pull requests, Pushes, Issues

**Linear:**
1. Go to Settings → API → Webhooks → New webhook
2. URL: `https://your-server.com/webhooks/linear`
3. Resource types: Issues, Cycles

**Exposing locally (development):**
```bash
# Use ngrok or cloudflared to expose the webhook server
ngrok http 3100
# Copy the https URL to your GitHub/Linear webhook settings
```
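Once the server is running locally, you can exercise the GitHub endpoint without waiting for a real event. A sketch that computes the signature the same way the server verifies it (the payload and the default secret are placeholders):

```shell
# Assumes GITHUB_WEBHOOK_SECRET matches the server's env (placeholder default here)
: "${GITHUB_WEBHOOK_SECRET:=dev-secret}"
PAYLOAD='{"action":"opened","pull_request":{"number":1,"title":"test"}}'

# HMAC-SHA256 of the exact payload bytes, formatted like X-Hub-Signature-256
SIG="sha256=$(printf '%s' "$PAYLOAD" \
  | openssl dgst -sha256 -hmac "$GITHUB_WEBHOOK_SECRET" \
  | sed 's/^.* //')"

curl -s -X POST http://localhost:3100/webhooks/github \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: pull_request" \
  -H "X-Hub-Signature-256: $SIG" \
  -d "$PAYLOAD" || echo "(no server listening on 3100)"
```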

---

<a name="slack-integration"></a>
## 14. Slack Integration

Every automation mechanism posts results to Slack. The Slack integration converts skill output into formatted Slack Block Kit messages. A centralized delivery resolver (`automation/slack/delivery.ts`) reads `channels.yaml` and routes each skill's output to the correct channel, so trigger mechanisms don't need to hardcode channels (an explicitly passed channel still wins as an override).

### Slack Client (`automation/slack/client.ts`)

```typescript
// automation/slack/client.ts

import { WebClient } from '@slack/web-api';

const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

export async function postToSlack(channel: string, blocks: any[]): Promise<void> {
  try {
    await slack.chat.postMessage({
      channel,
      blocks,
      text: 'AgentEM update', // Fallback for notifications
    });
  } catch (error) {
    console.error(`[slack] Failed to post to ${channel}:`, error);
  }
}

export async function postThread(channel: string, threadTs: string, blocks: any[]): Promise<void> {
  try {
    await slack.chat.postMessage({
      channel,
      thread_ts: threadTs,
      blocks,
      text: 'AgentEM detail',
    });
  } catch (error) {
    console.error(`[slack] Failed to post thread:`, error);
  }
}
```

### Slack Formatter (`automation/slack/formatter.ts`)

```typescript
// automation/slack/formatter.ts

import { SkillRunResult } from '../lib/skill-runner';

export function formatSkillResult(result: SkillRunResult, triggerLabel: string): any[] {
  const statusEmoji = result.status === 'success' ? '✅' : '⚠️';
  const skillLabels: Record<string, string> = {
    'risk-detector': '🔍 Risk Scan',
    'review-orchestrator': '👀 Review Routing',
    'spec-generator': '📝 Spec Generated',
    'ticket-decomposer': '🎫 Tickets Decomposed',
    'release-manager': '🚀 Release Check',
    'retro-analyzer': '📊 Sprint Retro',
  };

  const blocks: any[] = [
    {
      type: 'header',
      text: {
        type: 'plain_text',
        text: `${statusEmoji} ${skillLabels[result.skill] || result.skill}`,
      }
    },
    {
      type: 'context',
      elements: [{
        type: 'mrkdwn',
        text: `Triggered by: *${triggerLabel}* • ${new Date(result.completedAt).toLocaleTimeString()}`
      }]
    },
    { type: 'divider' },
    {
      type: 'section',
      text: {
        type: 'mrkdwn',
        text: result.summary
      }
    },
  ];

  if (result.outputPath) {
    blocks.push({
      type: 'context',
      elements: [{
        type: 'mrkdwn',
        text: `📎 Full output: \`${result.outputPath}\``
      }]
    });
  }

  return blocks;
}

// Specialized formatters for high-frequency skills

export function formatRiskDigest(risks: any[]): any[] {
  const critical = risks.filter(r => r.severity === 'critical');
  const high = risks.filter(r => r.severity === 'high');
  const medium = risks.filter(r => r.severity === 'medium');

  const sections: any[] = [{
    type: 'header',
    text: { type: 'plain_text', text: '🔍 Morning Risk Digest' }
  }];

  if (critical.length === 0 && high.length === 0) {
    sections.push({
      type: 'section',
      text: { type: 'mrkdwn', text: '✅ *All clear.* No critical or high-severity risks detected.' }
    });
  }

  for (const risk of critical) {
    sections.push({
      type: 'section',
      text: { type: 'mrkdwn', text: `🔴 *CRITICAL:* ${risk.title}\n${risk.description}` }
    });
  }

  for (const risk of high) {
    sections.push({
      type: 'section',
      text: { type: 'mrkdwn', text: `🟠 *HIGH:* ${risk.title}\n${risk.description}` }
    });
  }

  if (medium.length > 0) {
    sections.push({
      type: 'section',
      text: { 
        type: 'mrkdwn', 
        text: `🟡 *${medium.length} medium risks* — discuss at standup if relevant.` 
      }
    });
  }

  return sections;
}

export function formatReviewRouting(assignments: any[]): any[] {
  return [
    {
      type: 'header',
      text: { type: 'plain_text', text: '👀 Review Assignments' }
    },
    ...assignments.map(a => ({
      type: 'section',
      text: {
        type: 'mrkdwn',
        text: `<${a.prUrl}|#${a.prNumber} ${a.prTitle}> → *${a.reviewer}*\n_${a.reason}_`
      }
    }))
  ];
}
```

### Channel Configuration (`automation/slack/channels.yaml`)

```yaml
# automation/slack/channels.yaml
# Centralized delivery config — all trigger mechanisms resolve channels through here.
# See automation/slack/delivery.ts for the resolver.

default_channel: "#engineering"

channels:
  risk-detector: "#engineering"
  review-orchestrator: "#code-review"
  spec-generator: "#engineering"
  ticket-decomposer: "#engineering"
  release-manager: "#releases"
  retro-analyzer: "#engineering"

# Override for specific triggers
overrides:
  # Critical risks go to a separate alerts channel
  risk-detector:critical: "#eng-alerts"
```

### Delivery Resolver (`automation/slack/delivery.ts`)

Every trigger mechanism (cron, webhooks, CLI) calls `resolveChannel()` instead of hardcoding a Slack channel. The resolver applies a priority chain:

1. **Explicit override** — the caller passes a `channelOverride` (e.g., a cron schedule's `slack_channel` or a generic webhook payload's `slackChannel`). If present, it wins.
2. **Qualifier match** — if the caller passes a qualifier (e.g., `"critical"`), the resolver looks for `{skill}:{qualifier}` in the `overrides` section of `channels.yaml`.
3. **Skill default** — the `channels` section maps each skill to its default channel.
4. **Global default** — `default_channel` catches anything not explicitly mapped.

```typescript
// automation/slack/delivery.ts

import * as yaml from 'js-yaml';
import * as fs from 'fs';
import * as path from 'path';

interface ChannelsConfig {
  default_channel: string;
  channels: Record<string, string>;
  overrides?: Record<string, string>;
}

let cached: ChannelsConfig | null = null;

function loadConfig(): ChannelsConfig {
  if (cached) return cached;
  const configPath = path.resolve(__dirname, 'channels.yaml');
  cached = yaml.load(fs.readFileSync(configPath, 'utf-8')) as ChannelsConfig;
  return cached;
}

export function resolveChannel(
  skill: string,
  qualifier?: string,
  channelOverride?: string,
): string {
  if (channelOverride) return channelOverride;

  const config = loadConfig();

  if (qualifier && config.overrides) {
    const key = `${skill}:${qualifier}`;
    if (config.overrides[key]) return config.overrides[key];
  }

  if (config.channels[skill]) return config.channels[skill];

  return config.default_channel;
}
```

> **Note:** The cron scheduler's per-schedule `slack_channel` still works — it's passed as `channelOverride` and wins the priority chain. Omitting `slack_channel` from a schedule now falls back to `channels.yaml` instead of failing.

### Slack App Setup

1. Go to api.slack.com/apps → Create New App
2. From scratch → name it "AgentEM" → select your workspace
3. OAuth & Permissions → add scopes: `chat:write`, `chat:write.public`
4. Install to Workspace → copy Bot User OAuth Token
5. Set env var: `SLACK_BOT_TOKEN=xoxb-your-token`
6. Invite the bot to your channels: `/invite @AgentEM`

---

<a name="cli-commands"></a>
## 15. CLI Commands

The CLI provides on-demand skill execution and skill chaining. Use it when you want to run a skill outside the cron schedule or chain multiple skills together.

### CLI Entry Point (`automation/cli/agentem.ts`)

```typescript
#!/usr/bin/env ts-node
// automation/cli/agentem.ts

import { Command } from 'commander';
import { runSkill } from '../lib/skill-runner';
import { postToSlack } from '../slack/client';
import { formatSkillResult } from '../slack/formatter';

const program = new Command();

program
  .name('agentem')
  .description('AgentEM CLI — run skills and automation commands')
  .version('0.1.0');

// --- Individual skill commands ---

program
  .command('risk-scan')
  .description('Run a risk scan on the current sprint')
  .option('--focus <area>', 'Focus area for the scan (e.g., "dependencies", "scope")')
  .option('--quiet', 'Print summary only, no Slack post')
  .action(async (opts) => {
    const result = await runSkill({
      skill: 'risk-detector',
      trigger: 'cli',
      triggerDetail: 'cli:risk-scan',
      additionalContext: opts.focus ? `Focus on: ${opts.focus}` : undefined,
    });
    console.log(result.summary);
    if (!opts.quiet) {
      await postToSlack('#engineering', formatSkillResult(result, 'Manual risk scan'));
    }
  });

program
  .command('review-check')
  .description('Check for stale PRs and suggest review assignments')
  .option('--quiet', 'Print summary only, no Slack post')
  .action(async (opts) => {
    const result = await runSkill({
      skill: 'review-orchestrator',
      trigger: 'cli',
      triggerDetail: 'cli:review-check',
    });
    console.log(result.summary);
    if (!opts.quiet) {
      await postToSlack('#code-review', formatSkillResult(result, 'Manual review check'));
    }
  });

program
  .command('generate-spec <description>')
  .description('Generate a spec from a feature description')
  .option('--project <slug>', 'Project slug for output directory')
  .action(async (description, opts) => {
    const result = await runSkill({
      skill: 'spec-generator',
      trigger: 'cli',
      triggerDetail: 'cli:generate-spec',
      additionalContext: `Feature request: ${description}`,
      project: opts.project,
    });
    console.log(`Spec generated: ${result.outputPath}`);
    console.log(result.summary);
  });

program
  .command('retro')
  .description('Generate a sprint retrospective document')
  .option('--sprint <name>', 'Sprint name (defaults to current)')
  .option('--quiet', 'Print summary only, no Slack post')
  .action(async (opts) => {
    const result = await runSkill({
      skill: 'retro-analyzer',
      trigger: 'cli',
      triggerDetail: 'cli:retro',
      additionalContext: opts.sprint ? `Sprint: ${opts.sprint}` : undefined,
    });
    console.log(`Retro generated: ${result.outputPath}`);
    if (!opts.quiet) {
      await postToSlack('#engineering', formatSkillResult(result, 'Manual retro'));
    }
  });

// --- Chained commands (pipelines) ---

program
  .command('sprint-plan <feature-description>')
  .description('Full pipeline: Spec → Decompose → Risk Scan')
  .option('--project <slug>', 'Project slug')
  .action(async (description, opts) => {
    console.log('Step 1/3: Generating spec...');
    const spec = await runSkill({
      skill: 'spec-generator',
      trigger: 'cli',
      triggerDetail: 'cli:sprint-plan:spec',
      additionalContext: `Feature: ${description}`,
      project: opts.project,
    });
    console.log(`  ✓ Spec: ${spec.outputPath}`);

    console.log('Step 2/3: Decomposing into tickets...');
    const tickets = await runSkill({
      skill: 'ticket-decomposer',
      trigger: 'cli',
      triggerDetail: 'cli:sprint-plan:decompose',
      additionalContext: `Decompose the spec at: ${spec.outputPath}`,
      project: opts.project,
    });
    console.log(`  ✓ Tickets: ${tickets.outputPath}`);

    console.log('Step 3/3: Running risk scan...');
    const risks = await runSkill({
      skill: 'risk-detector',
      trigger: 'cli',
      triggerDetail: 'cli:sprint-plan:risk',
      additionalContext: `Risk scan the sprint plan at: ${tickets.outputPath}`,
      project: opts.project,
    });
    console.log(`  ✓ Risk digest: ${risks.outputPath}`);

    console.log('\n--- Sprint Plan Complete ---');
    console.log(`Spec:    ${spec.outputPath}`);
    console.log(`Tickets: ${tickets.outputPath}`);
    console.log(`Risks:   ${risks.outputPath}`);

    await postToSlack('#engineering', [{
      type: 'header',
      text: { type: 'plain_text', text: '📋 Sprint Plan Generated' }
    }, {
      type: 'section',
      text: { type: 'mrkdwn', text: `*Feature:* ${description}\n\n${spec.summary}\n\n${tickets.summary}\n\n${risks.summary}` }
    }]);
  });

program
  .command('sprint-close')
  .description('Full pipeline: Release Check → Retro')
  .action(async () => {
    console.log('Step 1/2: Running release check...');
    const release = await runSkill({
      skill: 'release-manager',
      trigger: 'cli',
      triggerDetail: 'cli:sprint-close:release',
    });
    console.log(`  ✓ Release notes: ${release.outputPath}`);

    console.log('Step 2/2: Generating retro...');
    const retro = await runSkill({
      skill: 'retro-analyzer',
      trigger: 'cli',
      triggerDetail: 'cli:sprint-close:retro',
    });
    console.log(`  ✓ Retro: ${retro.outputPath}`);

    console.log('\n--- Sprint Close Complete ---');
    console.log(`Release: ${release.outputPath}`);
    console.log(`Retro:   ${retro.outputPath}`);

    await postToSlack('#engineering', [{
      type: 'header',
      text: { type: 'plain_text', text: '🏁 Sprint Close' }
    }, {
      type: 'section',
      text: { type: 'mrkdwn', text: `${release.summary}\n\n${retro.summary}` }
    }]);
  });

program.parse();
```

### Usage

```bash
# Install globally for convenience (requires a `bin` entry for `agentem` in package.json)
npm link

# Individual skills
agentem risk-scan
agentem risk-scan --focus "dependencies"
agentem review-check
agentem generate-spec "Add SSO login for enterprise customers" --project enterprise-sso
agentem retro --sprint "Sprint 24"

# Chained pipelines
agentem sprint-plan "Add webhook support for real-time notifications" --project webhooks
agentem sprint-close

# Quiet mode (no Slack post, just stdout)
agentem risk-scan --quiet
```

---

<a name="build-instructions-automation"></a>
## 16. Build Instructions — Automation Layer

**Prerequisite:** Sections 1-10 must be built and working. You should have run Skills 1-2 manually at least twice and refined your context files based on the output.

### Phase 8: Slack Integration

1. Create a Slack app (see Section 14 setup instructions)
2. Install `@slack/web-api`, then build `automation/slack/client.ts` and `automation/slack/formatter.ts`
3. Test: post a manually-triggered risk scan result to your test channel
4. If the format is wrong or noisy, adjust the formatter before proceeding

### Phase 9: Cron Runner

1. Install dependencies: `npm install node-cron js-yaml`
2. Build `automation/cron/scheduler.ts`
3. Create `automation/cron/schedules.yaml` — start with just morning risk scan enabled
4. Test: run the scheduler, wait for the first cron tick, verify Slack output
5. **Run for one full week** with only the morning risk scan. Review output daily. Tune risk thresholds and context files until the morning digest is useful, not noisy.
6. Then enable `review-check`. Run for another week.
7. Add remaining schedules one at a time. Never enable all at once.

### Phase 10: Webhook Server

1. Install `express`, then build `automation/webhooks/server.ts`, `github-handler.ts`, `linear-handler.ts`
2. Set up ngrok or cloudflared for local development
3. Configure GitHub webhook on ONE test repo first (not your main repo)
4. Open a test PR — verify Review Orchestrator runs and posts to Slack
5. Review the output quality. If the skill produces garbage on webhook-triggered runs, fix the `additionalContext` data being passed.
6. Once stable on the test repo, add webhooks to remaining repos

### Phase 11: CLI

1. Install: `npm install commander`
2. Build `automation/cli/agentem.ts` and individual command handlers
3. `npm link` for global access
4. Test each command individually: `agentem risk-scan --quiet`
5. Test chained commands: `agentem sprint-plan "test feature" --project test`
6. Verify Slack output for each

### Phase 12: Production Deployment

1. Deploy webhook server to a persistent host (Railway, Render, Fly.io, or your own VPS)
2. Update GitHub/Linear webhook URLs to production endpoint
3. Run scheduler with pm2 or systemd: `pm2 start automation/cron/scheduler.ts --name agentem-scheduler --interpreter ts-node`
4. Set environment variables: `GITHUB_TOKEN`, `GITHUB_WEBHOOK_SECRET`, `LINEAR_API_KEY`, `SLACK_BOT_TOKEN`, `TZ`
5. Monitor for 1 week. Check Slack output daily. Adjust schedules and thresholds.

---

## What's Next: From Automation to Intelligence

This blueprint gives you a working agent — skills that run on a schedule, react to GitHub and Linear events, chain into pipelines, and post results to Slack without you prompting anything.

But it's a *dumb* agent.

Every mapping is hardcoded. PR opened always triggers Review Orchestrator regardless of whether it's a one-line typo fix or a 40-file schema migration touching your most critical service. The cron runner fires the same risk scan at 8am whether you're mid-sprint or between sprints. The CLI chains always run the same sequence.

The next layer is what makes this *intelligent*:

- **Event normalization:** Diverse webhook payloads from GitHub, Linear, and Slack get transformed into a standard event schema the system reasons over — not just routes through.
- **Intelligent routing:** An event arrives and the router evaluates it: What services are affected? What's the blast radius? Who's the right reviewer given current workload? Which skill should run? Should this auto-execute or wait for approval? That evaluation uses your context files — the same ones you already built.
- **Action execution:** Not just generating a risk digest and posting it to Slack, but actually creating the Linear tickets, assigning the GitHub reviewers, posting PR comments, and updating sprint status. The agent *acts*, not just *informs*.
- **Approval model:** Four-tier autonomy system calibrated to your risk tolerance. Digests are always autonomous. Ticket creation always requires approval. Review assignments graduate to autonomous after the system demonstrates accuracy. You control the boundaries.
- **Autonomy graduation:** The system learns from your approval history. Actions you consistently approve get promoted. Actions you modify get demoted. Over three sprints, the system earns trust and handles 60-70% of routine decisions without you.

That's what AgentEM builds for engineering teams — the intelligence layer on top of the automation you just set up, tuned to your specific workflow, integrated with your tools, and calibrated to your team's risk tolerance.

**[agentem.io](https://agentem.io)**
