Overview

The chat composer’s + button lets you attach files to a message. Attached files are sent to the AI planner as native multimodal content — images as image blocks and PDFs as document blocks — so the model reads them directly when generating an edit, exactly like attaching a file in Claude or ChatGPT. This is distinct from the Asset Manager, which picks or generates an image to drop into a specific block field. Attachments are context for the request (“make the hero match this screenshot”, “use the copy from this brief”), not a value written into a prop.
MVP scope is images + PDF only — the two formats every supported provider reads natively, with no server-side text extraction. Other document types (.docx, .xlsx, .txt, …) are not accepted.

Supported files

| Kind | Types | Sent to the model as |
| --- | --- | --- |
| Image | PNG, JPEG, WebP, GIF | image content block |
| PDF | application/pdf | document content block (text + per-page visuals) |

  • Max size: 32 MB per file (the per-request payload ceiling for the model APIs).
  • Multiple files can be attached to one message; paste (⌘/Ctrl+V) of an image also works.
  • All active Claude models read PDFs natively, so attachments work on every model tier (fast / balanced / reasoning). OpenAI and Gemini receive the same files via their respective image and document/file content parts.
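
The limits above could be checked in the editor before upload. The following is a minimal sketch, not the product's actual validation: the function name, the result shape, the `"pdf"` kind value, and reading 32 MB as 32 × 1024 × 1024 bytes are all assumptions made for illustration.

```typescript
// Hypothetical pre-upload check. Names and the result shape are illustrative.
// Assumption: "32 MB" is interpreted as 32 * 1024 * 1024 bytes.
const MAX_ATTACHMENT_BYTES = 32 * 1024 * 1024;

// MIME types for the image formats listed in the table (PNG, JPEG, WebP, GIF).
const IMAGE_MIME_TYPES = ["image/png", "image/jpeg", "image/webp", "image/gif"];
const PDF_MIME_TYPE = "application/pdf";

// Assumption: "pdf" as a kind value; the docs only show "image" in an example.
type AttachmentKind = "image" | "pdf";

interface AttachmentCheck {
  ok: boolean;
  kind?: AttachmentKind; // set when ok is true
  reason?: string;       // set when ok is false
}

function checkAttachment(mimeType: string, bytes: number): AttachmentCheck {
  if (bytes > MAX_ATTACHMENT_BYTES) {
    return { ok: false, reason: "file exceeds 32 MB" };
  }
  if (IMAGE_MIME_TYPES.indexOf(mimeType) !== -1) {
    return { ok: true, kind: "image" };
  }
  if (mimeType === PDF_MIME_TYPE) {
    return { ok: true, kind: "pdf" };
  }
  // .docx, .xlsx, .txt and other types fall through to here.
  return { ok: false, reason: "unsupported type: " + mimeType };
}
```

A client-side check like this only improves feedback in the composer; the upload endpoint would still need to enforce the same limits itself.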

How it flows

  1. Upload — the editor sends each file to POST /attachment/upload as multipart form data (field name file). The orchestrator stores it alongside generated images and returns { url, mimeType, name, bytes, kind }.
  2. Chip — the file shows as a removable chip in the composer (thumbnail for images, a labelled chip for PDFs). Nothing is injected into the message text.
  3. Send — on submit, the attachments ride along in the /chat request body as an attachments[] array. An attachment-only message (no text) is allowed.
  4. Plan — the planner fetches each attachment’s bytes as base64 and adds them as native content blocks ahead of the request text. The system prompt gains an “Attached files” instruction telling the planner to treat them as authoritative input.
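
The Plan step can be illustrated for the Anthropic case, where image and document blocks take a base64 source. This is a sketch under assumptions: `ResolvedAttachment` and `buildUserContent` are hypothetical names, the attachment bytes are taken as already fetched and encoded, and the real planner may assemble its messages differently.

```typescript
// Hypothetical types and function. The block shapes follow Anthropic's
// Messages API content blocks (image / document with a base64 source).
interface ResolvedAttachment {
  kind: "image" | "pdf"; // assumption: "pdf" is the kind used for PDFs
  mimeType: string;
  base64: string;        // bytes already fetched and base64-encoded
}

interface Base64Source {
  type: "base64";
  media_type: string;
  data: string;
}

type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; source: Base64Source }
  | { type: "document"; source: Base64Source };

function buildUserContent(message: string, attachments: ResolvedAttachment[]): ContentBlock[] {
  const blocks: ContentBlock[] = [];
  // Attachments go first so they sit ahead of the request text.
  for (const a of attachments) {
    const source: Base64Source = { type: "base64", media_type: a.mimeType, data: a.base64 };
    if (a.kind === "image") {
      blocks.push({ type: "image", source });
    } else {
      blocks.push({ type: "document", source });
    }
  }
  // An attachment-only message is allowed, so the text block is optional.
  if (message.trim() !== "") {
    blocks.push({ type: "text", text: message });
  }
  return blocks;
}
```

The returned array would be used as the `content` of the user message. OpenAI and Gemini use different part shapes, so a real planner would need one such mapper per provider.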

Fetch safety

Attachment URLs arrive in the client-supplied /chat body, so the planner-side resolver only fetches URLs under the /generated-images/ path on this orchestrator’s own origin (an SSRF allowlist), with a request timeout and a 32 MB byte cap. Anything else, such as an arbitrary or internal URL, is skipped before a request is made.
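
An allowlist check of this kind might look like the sketch below. It is illustrative only: `isAllowedAttachmentUrl` is a hypothetical name, the origin constant is a placeholder taken from the example URL in this page, and the timeout and byte cap mentioned above are not shown.

```typescript
// Hypothetical guard run before any fetch. The origin value is a placeholder.
const ORCHESTRATOR_ORIGIN = "https://orchestrator.example.com";
const ALLOWED_PATH_PREFIX = "/generated-images/";

function isAllowedAttachmentUrl(raw: string, origin: string = ORCHESTRATOR_ORIGIN): boolean {
  let url: URL;
  try {
    // Parsing normalizes the URL, so "../" segments and userinfo tricks
    // are resolved before the comparison below.
    url = new URL(raw);
  } catch {
    return false; // relative or malformed input is never fetched
  }
  // Compare the full origin (scheme + host + port), then the path prefix.
  return url.origin === origin && url.pathname.indexOf(ALLOWED_PATH_PREFIX) === 0;
}
```

Comparing the parsed `origin` rather than doing a string prefix match on the raw URL avoids lookalikes such as `https://orchestrator.example.com@evil.example/…`. A resolver built this way would likely also want to refuse redirects, since a redirect could leave the allowlisted origin after the check has passed.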

Request shape

The /chat (and /chat/start) body accepts an optional attachments array:
{
  "session": "…",
  "slug": "/",
  "message": "make the hero match this screenshot",
  "attachments": [
    {
      "id": "a1",
      "kind": "image",
      "url": "https://orchestrator.example.com/generated-images/upload_….png",
      "mimeType": "image/png",
      "name": "screenshot.png",
      "bytes": 123456
    }
  ]
}
Up to 6 attachments are accepted per message. Attachments are per-turn context — they are not persisted into chat history and do not carry over to the next message.
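
For a typed client, the body above could be modelled as follows. This is a sketch: the interface and function names are hypothetical, the field types are inferred from the example values, and throwing on more than 6 attachments is one possible client-side policy rather than documented server behaviour.

```typescript
// Hypothetical types inferred from the example body above.
interface ChatAttachment {
  id: string;
  kind: string; // "image" in the example; other values are not shown here
  url: string;
  mimeType: string;
  name: string;
  bytes: number;
}

interface ChatRequestBody {
  session: string;
  slug: string;
  message: string;
  attachments?: ChatAttachment[];
}

const MAX_ATTACHMENTS_PER_MESSAGE = 6;

function buildChatBody(
  session: string,
  slug: string,
  message: string,
  attachments: ChatAttachment[] = []
): ChatRequestBody {
  // Assumption: fail early on the client instead of relying on the server.
  if (attachments.length > MAX_ATTACHMENTS_PER_MESSAGE) {
    throw new Error("at most " + MAX_ATTACHMENTS_PER_MESSAGE + " attachments per message");
  }
  const body: ChatRequestBody = { session, slug, message };
  // The array is optional, so it is omitted when there is nothing to send.
  if (attachments.length > 0) {
    body.attachments = attachments;
  }
  return body;
}
```

Because attachments are per-turn context, a composer using this would clear its attachment list after each send rather than resubmitting it with the next message.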

Limitations

  • The Puck visual editor shows the attach UI but does not yet send attachments through its transport, so attachments stay local-only there for now.
  • PDF on OpenAI is only sent to models that accept file content parts (the gpt-4o / gpt-4.1 / gpt-5 families). On other OpenAI models the PDF is dropped (images still go through) so the planning call doesn’t fail. Anthropic and Gemini read PDFs on every tier.
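
The OpenAI PDF rule above amounts to a filter applied before the planning call. The sketch below is an assumption-laden illustration: the function and type names are hypothetical, `"pdf"` is assumed as the kind value, and matching model families by id prefix is a guess at how such a check could be written.

```typescript
// Hypothetical provider type and helper.
type PlannerProvider = "anthropic" | "openai" | "gemini";

// Families named in the limitation above. Assumption: a model belongs to a
// family when its id starts with the family name.
const OPENAI_PDF_FAMILIES = /^(gpt-4o|gpt-4\.1|gpt-5)/;

function dropUnsupportedPdfs<T extends { kind: string }>(
  provider: PlannerProvider,
  model: string,
  attachments: T[]
): T[] {
  // Anthropic and Gemini read PDFs on every tier, so nothing is dropped.
  if (provider !== "openai") return attachments;
  if (OPENAI_PDF_FAMILIES.test(model)) return attachments;
  // Other OpenAI models: keep images, drop PDFs so the call does not fail.
  return attachments.filter((a) => a.kind !== "pdf");
}
```

Since a dropped PDF is silent from the model's point of view, a caller might compare the input and output lengths and tell the user when a file was left out.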