There’s no single “best” generative model. Each one is good at a different thing, and the right pick depends on what you’re making. This page is a cheat sheet for the model picker dropdowns: what each model is good at, and when to reach for it.
The agent will usually pick a reasonable default. This page is for when you want to override it, or when you’re working directly on a node and want to make the call yourself.
Image models
Nano Banana Pro
Best for: style adherence, lifestyle photography, product-in-context shots, anything where matching a reference look is the priority.
Strengths: unmatched at following a style brief. If you’ve run an Image Style Analysis and the description is detailed (lighting, palette, composition), Nano Banana Pro will hew to it closely. Also strong at natural materials, fabric, food, skin tone, and editorial lighting.
Weaknesses: struggles with legible typography, especially small text. If your output has visible copy on it, lean toward GPT Image 2 instead, or generate the background in Nano Banana Pro and use a Studio node to add type.
Default this for: static ads, product photography, lifestyle imagery, anything fashion or editorial.
GPT Image 2 (Low / Medium / High)
Best for: anything with visible typography, packaging shots with legible labels, ads with on-image copy.
Strengths: dramatically better at rendering text than any other current image model. Produces sharp, legible copy in the output, including on packaging, signage, and product labels.
Weaknesses: slower than Nano Banana, especially at “High.” Style adherence is solid but not as tight as Nano Banana Pro, which remains the better pick for pure aesthetics.
Default this for: product shots where the label or packaging text needs to be legible, ads with overlaid copy that needs to be sharp, mockups of branded packaging.
Medium vs High: Medium is faster and good for exploration. High is sharper, slower, and worth the extra time for production output. Don’t burn budget on High during early exploration.
Nano Banana 2
Best for: quick iteration, lower-stakes generations, draft-quality exploration.
Strengths: fast and cheap. Good enough for early ideation.
Weaknesses: style adherence is solid but not as crisp as Nano Banana Pro. Less detail in fine textures.
Default this for: mood-finding, early variants where you want to cast a wide net before committing to a direction.
Video models
Seedance 2.0
Best for: human-centered video, UGC, anything with a character.
Strengths: we have ungated face access on Seedance 2.0, which means it handles human faces and character consistency better than most APIs. The right pick for UGC-style ads where the same persona has to appear across clips.
Weaknesses: less strong on highly stylized or abstract video. For dreamy or surreal aesthetics, try Veo 3.1 or Sora 2.
Default this for: UGC ads, talking-head video, character-driven sequences.
Veo 3.1
Best for: cinematic motion, environmental shots, product-in-motion.
Strengths: strong on camera movement, atmospheric lighting, environmental scenes. Good for hero shots, brand films, and product-in-motion sequences.
Weaknesses: less reliable on faces and character consistency than Seedance 2.0.
Default this for: cinematic brand content, product reveal sequences, environmental b-roll.
Sora 2
Best for: longer narrative motion, complex camera moves, stylized aesthetic work.
Strengths: good at longer continuous shots and stylized motion. Useful when the look you’re going for is intentionally cinematic or surreal.
Weaknesses: less consistent than Seedance 2.0 for character work.
Text / LLM models
Text nodes are LLM calls, generally used for ideation, prompt-writing, brand analysis, or holding pasted context. You usually don’t need to think about the model choice for these, but if you do:
Gemini 3.1 Pro
Best for: image style analysis, multimodal reasoning with images attached.
This is the model the Image Style Analysis template pre-configures. If you’re running your own image-attached reasoning, Gemini 3.1 Pro is a strong default.
GPT-5 / Claude / others
Best for: general LLM tasks such as ideation, prompt-writing, and brand brief synthesis.
For most marketer use cases, the model choice on text nodes matters less than the prompt content. The agent will pick a sensible default. Override only if you’ve got a specific reason to.
Audio models
ElevenLabs
The current default and only audio provider. Roughly 20 voices are curated in the in-product picker. The full ElevenLabs catalog is also available; ping us in Slack if you need a specific voice that isn’t in the picker.
Custom voice cloning at the team level is in active development. Once it ships, you’ll be able to upload a voice sample and reuse it across canvases. We’ll update this reference when it’s live.
Quick decision table
| If you’re making… | Reach for |
|---|---|
| Static ads with editorial photography style | Nano Banana Pro |
| Product packaging shots with legible labels | GPT Image 2 (High) |
| Ads with on-image copy / typography | GPT Image 2 |
| Lifestyle imagery, fashion, food, fabric | Nano Banana Pro |
| Quick exploration / mood-finding | Nano Banana 2 |
| UGC video with consistent character | Seedance 2.0 |
| Cinematic brand video | Veo 3.1 |
| Longer narrative or stylized video | Sora 2 |
| Image style analysis | Gemini 3.1 Pro (template default) |
| Voiceover | ElevenLabs |
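
If you script against these recommendations (for example, to pre-fill a picker or sanity-check a batch of nodes), the table can be written as a plain lookup. The sketch below is illustrative only: the task keys, `RECOMMENDED_MODEL`, `pick_model`, and `pick_image_model` are hypothetical names invented for this example, not part of any product API. The image helper encodes the rules of thumb from this page (visible text points to GPT Image 2, exploration points to the faster tier) and makes one assumption the page does not state outright: that "exploration without visible text" maps to Nano Banana 2.

```python
# Hypothetical lookup mirroring the quick decision table above.
# Keys and function names are invented for illustration; they are not a real API.
RECOMMENDED_MODEL = {
    "static_ad_editorial": "Nano Banana Pro",
    "packaging_legible_labels": "GPT Image 2 (High)",
    "on_image_copy": "GPT Image 2",
    "lifestyle_imagery": "Nano Banana Pro",
    "quick_exploration": "Nano Banana 2",
    "ugc_video": "Seedance 2.0",
    "cinematic_brand_video": "Veo 3.1",
    "narrative_or_stylized_video": "Sora 2",
    "image_style_analysis": "Gemini 3.1 Pro",
    "voiceover": "ElevenLabs",
}


def pick_model(task: str) -> str:
    """Return the table's recommendation for a task key; raise on an unknown key."""
    try:
        return RECOMMENDED_MODEL[task]
    except KeyError:
        raise ValueError(f"No recommendation for task {task!r}") from None


def pick_image_model(has_visible_text: bool, exploring: bool) -> str:
    """Apply the image-model rules of thumb from this page."""
    if has_visible_text:
        # Medium is faster and fine for exploration; High is for production output.
        return "GPT Image 2 (Medium)" if exploring else "GPT Image 2 (High)"
    # Assumption: with no visible text, draft work goes to the fast, cheap model.
    return "Nano Banana 2" if exploring else "Nano Banana Pro"
```

For instance, `pick_image_model(has_visible_text=True, exploring=True)` returns the Medium tier, in line with the advice not to spend budget on High during early exploration.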