Research Report (starter blueprint)

Slug: research-report
Type: agent_blueprint
Source: code-defined; appears in every tenant's agent_blueprint://list automatically. Fork via agents.fork_blueprint to customize.
Outputs: one research_report artifact

The Research Report starter replicates Amdahl's canonical research pipeline as a blueprint. Give it a question; it decomposes the question into sub-questions, fans out queries across your business data and knowledge base, synthesizes evidence with confidence scoring, surfaces cross-cutting findings, and emits a fully cited research_report artifact.

When to use it

A founder or operator asks a research question and you want grounded findings, not LLM bullet points.
You're prepping a board narrative, ICP refresh, or sales debrief and need to anchor claims in real customer evidence.
You want the same investigation pattern that the internal researcher specialist runs, but accessible from MCP / external Claude / a scheduled trigger.

What it produces

One research_report artifact with:

jsonc

{
  "topic": "<your question, verbatim>",
  "summary": "<3-5 sentence executive summary>",
  "sub_questions": [
    {
      "question": "<sub-question 1>",
      "answer": "<grounded synthesis>",
      "confidence": "high | medium | low",
      "citations": ["<source-id>", ...]
    }
    // ... 2-5 more
  ],
  "key_findings": [
    {
      "finding": "<cross-cutting pattern>",
      "confidence": "high | medium | low",
      "citations": [...]
    }
    // ... 4-9 more
  ],
  "gaps": ["<questions the substrate couldn't answer>", ...]
}
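If you consume the artifact downstream, a light shape check can catch malformed reports early. The following is a minimal sketch, not part of the blueprint: the function name `validate_research_report` is hypothetical, and it only checks the fields shown in the shape above.

```python
from typing import Any

# Confidence labels as they appear in the artifact shape above.
CONFIDENCE_LEVELS = ("high", "medium", "low")


def validate_research_report(report: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems: list[str] = []

    for key in ("topic", "summary"):
        if not isinstance(report.get(key), str) or not report[key].strip():
            problems.append(f"{key}: expected a non-empty string")

    # Each section is a list of objects with text fields, a confidence, and citations.
    sections = (
        ("sub_questions", ("question", "answer")),
        ("key_findings", ("finding",)),
    )
    for section, text_keys in sections:
        entries = report.get(section)
        if not isinstance(entries, list):
            problems.append(f"{section}: expected a list")
            continue
        for i, entry in enumerate(entries):
            where = f"{section}[{i}]"
            if not isinstance(entry, dict):
                problems.append(f"{where}: expected an object")
                continue
            for text_key in text_keys:
                if not isinstance(entry.get(text_key), str):
                    problems.append(f"{where}.{text_key}: expected a string")
            if entry.get("confidence") not in CONFIDENCE_LEVELS:
                problems.append(f"{where}.confidence: expected high, medium, or low")
            if not isinstance(entry.get("citations"), list):
                problems.append(f"{where}.citations: expected a list of source ids")

    if not isinstance(report.get("gaps"), list):
        problems.append("gaps: expected a list")
    return problems
```

The check deliberately does not enforce the 3-6 sub-question or 5-10 finding counts, since a fork may change them.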

Inputs

Name	Type	Required	Description
`topic`	string	yes	The research question, 10-500 chars.
`depth`	enum	no	`quick` / `standard` (default) / `high`. Drives how deep the synthesis runs.
`audience`	string	no	Audience lens (free-text persona + segment).
`company_list`	json	no	Array of company names — narrows substrate queries to those accounts.
`reference_document_ids`	artifact_ref_list	no	`knowledge_doc` artifacts to ground the investigation in.
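The input rules in the table can be checked client-side before a run is started. This is a sketch under assumptions: `normalize_inputs` is a hypothetical helper, the 10-500 character bound is treated as inclusive, and `reference_document_ids` is assumed to be a list of string ids.

```python
from typing import Any

DEPTHS = ("quick", "standard", "high")


def normalize_inputs(raw: dict[str, Any]) -> dict[str, Any]:
    """Validate blueprint inputs and fill in the documented default depth."""
    topic = raw.get("topic")
    # Assumption: the 10-500 character bound is inclusive on both ends.
    if not isinstance(topic, str) or not 10 <= len(topic) <= 500:
        raise ValueError("topic is required and must be 10-500 characters")

    depth = raw.get("depth", "standard")  # `standard` is the documented default
    if depth not in DEPTHS:
        raise ValueError(f"depth must be one of {DEPTHS}")

    inputs: dict[str, Any] = {"topic": topic, "depth": depth}

    audience = raw.get("audience")
    if audience is not None:
        if not isinstance(audience, str):
            raise ValueError("audience must be a string")
        inputs["audience"] = audience

    # Assumption: both optional list inputs are arrays of strings.
    for key in ("company_list", "reference_document_ids"):
        value = raw.get(key)
        if value is None:
            continue
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{key} must be a list of strings")
        inputs[key] = value

    return inputs
```

For example, `normalize_inputs({"topic": "Why do enterprise trials churn?"})` returns the topic with `depth` set to `standard`.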

How it runs

The blueprint has five top-level steps:

decompose — llm. Break the topic into 3-6 mutually exclusive sub-questions. Composes prompt://researcher/topic_decomposition.
gather — loop (parallel). For each sub-question, run data.cluster_search (and fall back to data.search / knowledge_base.search if cluster substrate is thin). Concurrency capped at 6.
synthesize — llm. Per-sub-question synthesis with confidence scoring. Composes prompt://researcher/evidence_synthesis + prompt://researcher/confidence_scoring.
cross_synth — llm. Surface 5-10 key findings that span ≥2 sub-questions or expose patterns no single query could see. Composes prompt://researcher/cross_pattern_synthesis.
save — tool. artifacts.create of research_report with the fully cited shape.
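The gather step's bounded fan-out with fallback can be sketched as follows. This is an illustration, not the runner's actual code: the search callables stand in for data.cluster_search, data.search, and knowledge_base.search, and `MIN_HITS` is a hypothetical threshold for what counts as "thin". It also assumes the fallbacks are tried in order until enough evidence is collected.

```python
import asyncio
from typing import Awaitable, Callable

# A search takes a sub-question and returns a list of source ids.
Search = Callable[[str], Awaitable[list[str]]]

MAX_CONCURRENCY = 6  # concurrency cap stated for the gather step
MIN_HITS = 3         # hypothetical threshold below which results count as thin


async def gather_one(
    question: str,
    cluster_search: Search,
    fallbacks: list[Search],
    sem: asyncio.Semaphore,
) -> list[str]:
    """Run the cluster search, then fall back while evidence is thin."""
    async with sem:  # at most MAX_CONCURRENCY sub-questions in flight
        hits = await cluster_search(question)
        for search in fallbacks:
            if len(hits) >= MIN_HITS:
                break
            hits = hits + await search(question)
        return hits


async def gather(
    sub_questions: list[str],
    cluster_search: Search,
    fallbacks: list[Search],
) -> dict[str, list[str]]:
    """Fan out over all sub-questions in parallel and key results by question."""
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    results = await asyncio.gather(
        *(gather_one(q, cluster_search, fallbacks, sem) for q in sub_questions)
    )
    return dict(zip(sub_questions, results))
```

The semaphore holds the cap even if a fork raises the number of sub-questions above six.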

The blueprint's policy.tool_allowlist is GUIDANCE the reading agent should respect as it walks the recipe; the API key's scope grammar is the real safety boundary. On the interactive MCP surface there is no run-blueprint tool — the reading agent IS the runner and the allowlist is unenforced advice. The headless SDK runner (docs/blueprint-runner-sdk.md) also relies on its least-privilege scoped token, not the allowlist, as the boundary.

Forking it

Any tenant can fork this starter via:

code

artifacts.create artifact_type=agent_blueprint content_json=<your modifications>

Common forks:

Add tenant-specific clusters — extend gather to also call your custom internal-data tool.
Tighten depth — set the default depth to quick if your team doesn't need multi-pass synthesis.
Pre-seed audience filters — bake your ICP company_list into a hardcoded value and remove the input.
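As an illustration of the last two forks, the sketch below edits a blueprint held as a plain dict before it is passed as content_json. The blueprint schema is not documented here, so the `inputs` list, the `default` field, and the `constants` map are all hypothetical stand-ins; adapt the field names to the real blueprint content.

```python
import copy
from typing import Any


def fork_with_defaults(
    blueprint: dict[str, Any],
    default_depth: str,
    company_list: list[str],
) -> dict[str, Any]:
    """Return a modified copy: new default depth, company_list baked in."""
    fork = copy.deepcopy(blueprint)  # leave the starter untouched

    kept = []
    for spec in fork["inputs"]:  # hypothetical: inputs as a list of specs
        if spec["name"] == "company_list":
            continue  # input removed; the value is hardcoded below
        if spec["name"] == "depth":
            spec["default"] = default_depth  # hypothetical `default` field
        kept.append(spec)
    fork["inputs"] = kept

    # Hypothetical `constants` map holding the pre-seeded ICP accounts.
    fork.setdefault("constants", {})["company_list"] = list(company_list)
    return fork
```

Working on a deep copy keeps the original starter content available for comparison or for a second fork.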

Limits

A run at depth high issues roughly 3-4× as many queries as one at quick. Budget for that multiplier when batching runs.
On the interactive MCP path, reads are not cost-bounded server-side — there is no run-blueprint tool to account against; walking the recipe runs against the reading agent's own budget. The headless SDK runner (docs/blueprint-runner-sdk.md) accounts platform-initiated runs server-side.
Substrate queries scope to the calling API key's business; cross-tenant reads are blocked at the operation registry boundary.
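For batch planning, the 3-4× depth multiplier noted above translates into a simple range estimate. This is back-of-the-envelope arithmetic only: the helper name is hypothetical, and the per-run query count at quick is something you would measure yourself, not a documented figure.

```python
def estimate_high_depth_queries(quick_queries_per_run: int, runs: int) -> tuple[int, int]:
    """Estimate the (low, high) total query count for a batch at depth high."""
    # The documented multiplier is approximate: roughly 3-4x the quick count.
    low = 3 * quick_queries_per_run * runs
    high = 4 * quick_queries_per_run * runs
    return low, high


# Hypothetical baseline: 12 queries per quick run, batch of 20 runs.
print(estimate_high_depth_queries(12, 20))  # (720, 960)
```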