# Research Report

# Research Report (starter blueprint)

> **Slug:** `research-report`
> **Type:** `agent_blueprint`
> **Source:** code-defined; appears in every tenant's `agent_blueprint://list` automatically. Fork via `agents.fork_blueprint` to customize.
> **Outputs:** one `research_report` artifact

The Research Report starter replicates Amdahl's canonical research pipeline as a blueprint. Give it a question; it decomposes into sub-questions, fans queries against your business data + knowledge base, synthesizes evidence with confidence scoring, surfaces cross-cutting findings, and emits a fully cited `research_report` artifact.

## When to use it

- A founder or operator asks a research question and you want grounded findings, not LLM bullet points.
- You're prepping a board narrative, ICP refresh, or sales debrief and need to anchor claims in real customer evidence.
- You want the same investigation pattern that the internal `researcher` specialist runs, but accessible from MCP / external Claude / a scheduled trigger.

## What it produces

One `research_report` artifact with:

```jsonc
{
  "topic": "<your question, verbatim>",
  "summary": "<3-5 sentence executive summary>",
  "sub_questions": [
    {
      "question": "<sub-question 1>",
      "answer": "<grounded synthesis>",
      "confidence": "high | medium | low",
      "citations": ["<source-id>", ...]
    }
    // ... 2-5 more
  ],
  "key_findings": [
    {
      "finding": "<cross-cutting pattern>",
      "confidence": "high | medium | low",
      "citations": [...]
    }
    // ... 4-9 more
  ],
  "gaps": ["<questions the substrate couldn't answer>", ...]
}
```

## Inputs

| Name                     | Type              | Required | Description                                                                  |
| ------------------------ | ----------------- | -------- | ---------------------------------------------------------------------------- |
| `topic`                  | string            | yes      | The research question, 10-500 chars.                                         |
| `depth`                  | enum              | no       | `quick` / `standard` (default) / `high`. Drives how deep the synthesis runs. |
| `audience`               | string            | no       | Audience lens (free-text persona + segment).                                 |
| `company_list`           | json              | no       | Array of company names — narrows substrate queries to those accounts.        |
| `reference_document_ids` | artifact_ref_list | no       | `knowledge_doc` artifacts to ground the investigation in.                    |

## How it runs

5 top-level steps:

1. **decompose** — `llm`. Break the topic into 3-6 mutually exclusive sub-questions. Composes `prompt://researcher/topic_decomposition`.
2. **gather** — `loop` (parallel). For each sub-question, run `data.cluster_search` (and fall back to `data.search` / `knowledge_base.search` if cluster substrate is thin). Concurrency capped at 6.
3. **synthesize** — `llm`. Per-sub-question synthesis with confidence scoring. Composes `prompt://researcher/evidence_synthesis` + `prompt://researcher/confidence_scoring`.
4. **cross_synth** — `llm`. Surface 5-10 key findings that span ≥2 sub-questions or expose patterns no single query could see. Composes `prompt://researcher/cross_pattern_synthesis`.
5. **save** — `tool`. `artifacts.create` of `research_report` with the fully cited shape.

The blueprint's `policy.tool_allowlist` is GUIDANCE the reading agent should respect as it walks the recipe; the API key's scope grammar is the real safety boundary. On the interactive MCP surface there is no `run-blueprint` tool — the reading agent IS the runner and the allowlist is unenforced advice. The headless SDK runner (`docs/blueprint-runner-sdk.md`) also relies on its least-privilege scoped token, not the allowlist, as the boundary.

## Forking it

Any tenant can fork this starter via:

```
artifacts.create artifact_type=agent_blueprint content_json=<your modifications>
```

Common forks:

- **Add tenant-specific clusters** — extend `gather` to also call your custom internal-data tool.
- **Tighten depth** — set the default depth to `quick` if your team doesn't need multi-pass synthesis.
- **Pre-seed audience filters** — bake your ICP company_list into a hardcoded value and remove the input.

## Limits

- Depth `high` runs ~3-4× the queries of `quick`. Plan accordingly when batching.
- On the interactive MCP path, reads are not cost-bounded server-side — there is no `run-blueprint` tool to account against; walking the recipe runs against the reading agent's own budget. The headless SDK runner (`docs/blueprint-runner-sdk.md`) accounts platform-initiated runs server-side.
- Substrate queries scope to the calling API key's business; cross-tenant reads are blocked at the operation registry boundary.