Overview
The chat composer’s + button lets you attach files to a message. Attached files are sent to the AI planner as native multimodal content (images as image blocks, PDFs as document blocks), so the model reads them directly when generating an edit, exactly like attaching a file in Claude or ChatGPT. This is distinct from the Asset Manager, which picks or generates an image to drop into a specific block field. Attachments are context for the request (“make the hero match this screenshot”, “use the copy from this brief”), not a value written into a prop.
MVP scope is images + PDF only: the two formats every supported provider reads natively, with no server-side text extraction. Other document types (`.docx`, `.xlsx`, `.txt`, …) are not accepted.
Supported files
| Kind | Types | Sent to the model as |
|---|---|---|
| Image | PNG, JPEG, WebP, GIF | image content block |
| PDF | application/pdf | document content block (text + per-page visuals) |
- Max size: 32 MB per file (the per-request payload ceiling for the model APIs).
- Multiple files can be attached to one message; paste (⌘/Ctrl+V) of an image also works.
- All active Claude models read PDFs natively, so attachments work on every model tier (fast / balanced / reasoning). OpenAI and Gemini receive the same files via their respective image and document/file content parts.
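The limits above could be enforced with a small pre-check before upload. This is a minimal sketch, not the editor's actual code: the names `checkAttachment`, `KIND_BY_MIME`, and `MAX_ATTACHMENT_BYTES` are hypothetical, the `kind` values `"image"` and `"pdf"` are assumed, and 32 MB is assumed to mean 32 × 1024 × 1024 bytes.

```typescript
// Hypothetical pre-check; names and `kind` values are illustrative assumptions.
type AttachmentKind = "image" | "pdf";

// Assumption: "32 MB" is read as 32 * 1024 * 1024 bytes.
const MAX_ATTACHMENT_BYTES = 32 * 1024 * 1024;

// MIME types for the formats listed in the "Supported files" table.
const KIND_BY_MIME = new Map<string, AttachmentKind>([
  ["image/png", "image"],
  ["image/jpeg", "image"],
  ["image/webp", "image"],
  ["image/gif", "image"],
  ["application/pdf", "pdf"],
]);

type CheckResult =
  | { ok: true; kind: AttachmentKind }
  | { ok: false; reason: string };

// Accepts a file only if its type is supported and it fits the size cap.
function checkAttachment(mimeType: string, bytes: number): CheckResult {
  const kind = KIND_BY_MIME.get(mimeType.toLowerCase());
  if (kind === undefined) {
    return { ok: false, reason: `unsupported type: ${mimeType}` };
  }
  if (bytes > MAX_ATTACHMENT_BYTES) {
    return { ok: false, reason: "file exceeds 32 MB" };
  }
  return { ok: true, kind };
}
```

A check like this only saves a round trip; the server would still need to validate uploads itself.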
How it flows
- Upload: the editor uploads each file to `POST /attachment/upload` (multipart field `file`). The orchestrator stores it alongside generated images and returns `{ url, mimeType, name, bytes, kind }`.
- Chip: the file shows as a removable chip in the composer (thumbnail for images, a labelled chip for PDFs). Nothing is injected into the message text.
- Send: on submit, the attachments ride along in the `/chat` request body as an `attachments[]` array. An attachment-only message (no text) is allowed.
- Plan: the planner fetches each attachment’s bytes as base64 and adds them as native content blocks ahead of the request text. The system prompt gains an “Attached files” instruction telling the planner to treat them as authoritative input.
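The Plan step can be sketched as a function that places attachment blocks ahead of the request text. This is an illustration under assumptions, not the planner's real code: the block shapes follow Anthropic's Messages API (base64 `image` and `document` sources), while `buildUserContent`, the `Attachment` field types, and the pre-fetched `base64ByUrl` map are hypothetical.

```typescript
// Attachment metadata; field names come from the upload response, types are assumed.
interface Attachment {
  url: string;
  mimeType: string;
  name: string;
  bytes: number;
  kind: string;
}

// Content blocks in the style of Anthropic's Messages API.
type ContentBlock =
  | { type: "image"; source: { type: "base64"; media_type: string; data: string } }
  | { type: "document"; source: { type: "base64"; media_type: "application/pdf"; data: string } }
  | { type: "text"; text: string };

// Hypothetical helper: `base64ByUrl` holds bytes the planner has already fetched.
function buildUserContent(
  text: string,
  attachments: Attachment[],
  base64ByUrl: Map<string, string>,
): ContentBlock[] {
  const blocks: ContentBlock[] = [];
  for (const att of attachments) {
    const data = base64ByUrl.get(att.url);
    if (data === undefined) continue; // nothing was fetched for this URL
    if (att.mimeType === "application/pdf") {
      blocks.push({
        type: "document",
        source: { type: "base64", media_type: "application/pdf", data },
      });
    } else if (att.mimeType.startsWith("image/")) {
      blocks.push({
        type: "image",
        source: { type: "base64", media_type: att.mimeType, data },
      });
    }
  }
  // Attachment-only messages are allowed, so the text block is optional.
  if (text.trim() !== "") blocks.push({ type: "text", text });
  return blocks;
}
```

Other providers would need their own block shapes; only the ordering (files first, text last) is taken from the description above.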
Fetch safety
Attachment URLs arrive in the client-supplied `/chat` body, so the planner-side resolver only fetches URLs on this orchestrator’s own origin under `/generated-images/` (an SSRF allowlist), with a request timeout and a 32 MB byte cap. Anything else, such as an arbitrary or internal URL, is skipped before a request is made.
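One way to express such an allowlist is to parse the URL and compare its origin and path prefix before any request is issued. The sketch below is an assumption about how the check might look: `isAllowedAttachmentUrl` and the example origin are hypothetical, and the timeout and byte cap are left to the fetch step.

```typescript
// Hypothetical allowlist check, run before any network request is made.
function isAllowedAttachmentUrl(rawUrl: string, orchestratorOrigin: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl); // throws for relative or malformed URLs
  } catch {
    return false;
  }
  // The URL parser resolves "." and ".." segments, so the prefix test
  // runs on the normalized path.
  return (
    parsed.origin === new URL(orchestratorOrigin).origin &&
    parsed.pathname.startsWith("/generated-images/")
  );
}
```

Comparing parsed origins rather than string prefixes avoids lookalike hosts such as `orchestrator.example.evil.test`.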
Request shape
The `/chat` (and `/chat/start`) body accepts an optional `attachments` array.
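As an illustration, the body might be typed as follows. This assumes each entry mirrors the object returned by the upload endpoint; the `message` field name, the field types, the example URL, and the `buildChatBody` helper are hypothetical, as is the choice to reject a request with neither text nor attachments.

```typescript
// Assumed to mirror the upload response: { url, mimeType, name, bytes, kind }.
interface ChatAttachment {
  url: string;
  mimeType: string;
  name: string;
  bytes: number;
  kind: string;
}

interface ChatRequestBody {
  message: string; // hypothetical name for the text field
  attachments?: ChatAttachment[];
}

// Hypothetical builder: text may be empty as long as a file is attached.
function buildChatBody(message: string, attachments: ChatAttachment[] = []): ChatRequestBody {
  if (message.trim() === "" && attachments.length === 0) {
    throw new Error("a chat request needs text, attachments, or both");
  }
  return attachments.length > 0 ? { message, attachments } : { message };
}

const exampleBody = buildChatBody("Make the hero match this screenshot", [
  {
    url: "https://orchestrator.example/generated-images/hero-ref.png",
    mimeType: "image/png",
    name: "hero-ref.png",
    bytes: 482113,
    kind: "image",
  },
]);
```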
Limitations
- The Puck visual editor surfaces the attach UI but does not yet send attachments through its transport, so attachments are local-only there for now.
- PDF on OpenAI is only sent to models that accept `file` content parts (the gpt-4o / gpt-4.1 / gpt-5 families). On other OpenAI models the PDF is dropped (images still go through) so the planning call doesn’t fail. Anthropic and Gemini read PDFs on every tier.
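The OpenAI rule above could be sketched as a filter applied before the planning call. This is a guess at the shape of such a check: `openAiAcceptsPdf` and `filterForOpenAi` are hypothetical names, and matching a family by model-id prefix is an assumption, not something the text specifies.

```typescript
// Families named above as accepting `file` content parts.
const OPENAI_PDF_FAMILIES = ["gpt-4o", "gpt-4.1", "gpt-5"];

// Assumption: a model belongs to a family when its id starts with the family name.
function openAiAcceptsPdf(modelId: string): boolean {
  const id = modelId.toLowerCase();
  return OPENAI_PDF_FAMILIES.some((family) => id.startsWith(family));
}

// Drops PDFs for models that cannot read them; images always pass through.
function filterForOpenAi<T extends { mimeType: string }>(modelId: string, attachments: T[]): T[] {
  if (openAiAcceptsPdf(modelId)) return attachments;
  return attachments.filter((att) => att.mimeType !== "application/pdf");
}
```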