Model guide - Melius

There’s no single “best” generative model. Each one is good at a different thing, and the right pick depends on what you’re making. This page is a cheat sheet for the model picker dropdowns: what each model is good at, and when to reach for it. The agent will usually pick a reasonable default. This page is for when you want to override that default, or when you’re working directly on a node and want to make the call yourself.

Image models

Nano Banana Pro

Best for: style adherence, lifestyle photography, product-in-context shots, anything where matching a reference look is the priority.

Strengths: unmatched at following a style brief. If you’ve run an Image Style Analysis and the description is detailed (lighting, palette, composition), Nano Banana Pro will hew to it closely. Also strong at natural materials, fabric, food, skin tone, and editorial lighting.

Weaknesses: struggles with legible typography, especially small text. If your output has visible copy on it, lean toward GPT Image 2 instead, or generate the background in Nano Banana Pro and use a Studio node to add type.

Default this for: static ads, product photography, lifestyle imagery, anything fashion or editorial.

GPT Image 2 (Low / Medium / High)

Best for: anything with visible typography, packaging shots with legible labels, ads with on-image copy.

Strengths: dramatically better at rendering text than any other current image model. Produces sharp, legible copy, including on packaging, signage, and product labels.

Weaknesses: slower than Nano Banana, especially at “High.” Style adherence is solid but not as tight as Nano Banana Pro, which remains the better pick for pure aesthetic.

Default this for: product shots where the label or packaging text needs to be legible, ads with overlaid copy that needs to be sharp, mockups of branded packaging.

Medium vs High: Medium is faster and good for exploration. High is sharper, slower, and worth the extra time for production output. Don’t burn budget on High during early exploration.

Nano Banana 2

Best for: quick iteration, lower-stakes generations, draft-quality exploration.

Strengths: fast and cheap. Good enough for early ideation.

Weaknesses: style adherence is solid but not as crisp as Nano Banana Pro. Less detail in fine textures.

Default this for: mood-finding, early variants where you want to cast a wide net before committing to a direction.

Video models

Seedance 2.0

Best for: human-centered video, UGC, anything with a character.

Strengths: we have un-gated face access on Seedance 2.0, which means it handles human faces and character consistency better than most APIs. The right pick for UGC-style ads where the same persona has to appear across clips.

Weaknesses: less strong on highly stylized or abstract video. For dreamy or surreal aesthetics, try Veo 3.1 or Sora 2.

Default this for: UGC ads, talking-head video, character-driven sequences.

Veo 3.1

Best for: cinematic motion, environmental shots, product-in-motion.

Strengths: strong on camera movement, atmospheric lighting, environmental scenes. Good for hero shots, brand films, and product-in-motion sequences.

Weaknesses: less reliable on faces and character consistency than Seedance 2.0.

Default this for: cinematic brand content, product reveal sequences, environmental b-roll.

Sora 2

Best for: longer narrative motion, complex camera moves, stylized aesthetic work.

Strengths: good at longer continuous shots and stylized motion. Useful when the look you’re going for is intentionally cinematic or surreal.

Weaknesses: less consistent than Seedance 2.0 for character work.

Text / LLM models

Text nodes are LLM calls — generally for ideation, prompt-writing, brand analysis, or holding pasted context. You usually don’t need to think about the model choice for these, but if you do:

Gemini 3.1 Pro

Best for: image style analysis, multimodal reasoning with images attached.

This is the model under the hood of the Image Style Analysis template, which pre-configures it for you. If you’re running your own image-attached reasoning, Gemini 3.1 Pro is a strong default.

GPT-5 / Claude / others

Best for: general LLM tasks such as ideation, prompt-writing, and brand brief synthesis.

For most marketer use cases, the model choice on text nodes matters less than the prompt content. The agent will pick a sensible default. Override only if you’ve got a specific reason to.

Audio models

ElevenLabs

ElevenLabs is the current default and only audio provider. The in-product picker offers roughly 20 curated voices. The full ElevenLabs catalog is also available: ping us in Slack if you need a specific voice that isn’t in the picker.

Custom voice cloning at the team level is in active development. Once it ships, you’ll be able to upload a voice sample and reuse it across canvases. We’ll update this reference when it’s live.

Quick decision table

If you’re making…	Reach for
Static ads with editorial photography style	Nano Banana Pro
Product packaging shots with legible labels	GPT Image 2 (High)
Ads with on-image copy / typography	GPT Image 2
Lifestyle imagery, fashion, food, fabric	Nano Banana Pro
Quick exploration / mood-finding	Nano Banana 2
UGC video with consistent character	Seedance 2.0
Cinematic brand video	Veo 3.1
Longer narrative or stylized video	Sora 2
Image style analysis	Gemini 3.1 Pro (template default)
Voiceover	ElevenLabs
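
If you script your own tooling around the picker, the table above can be read as a plain lookup. The sketch below is illustrative only: the dictionary and function names are hypothetical and are not part of Melius, and the keys are simply the table rows lowercased.

```python
from typing import Optional

# Illustrative only: a plain lookup that mirrors the quick decision table.
# QUICK_PICKS and suggest_model are hypothetical names, not Melius identifiers.
QUICK_PICKS = {
    "static ads with editorial photography style": "Nano Banana Pro",
    "product packaging shots with legible labels": "GPT Image 2 (High)",
    "ads with on-image copy / typography": "GPT Image 2",
    "lifestyle imagery, fashion, food, fabric": "Nano Banana Pro",
    "quick exploration / mood-finding": "Nano Banana 2",
    "ugc video with consistent character": "Seedance 2.0",
    "cinematic brand video": "Veo 3.1",
    "longer narrative or stylized video": "Sora 2",
    "image style analysis": "Gemini 3.1 Pro",
    "voiceover": "ElevenLabs",
}


def suggest_model(task: str) -> Optional[str]:
    """Return the table's pick for a task, or None if the task is not listed."""
    # Normalize case and surrounding whitespace so "Voiceover " still matches.
    return QUICK_PICKS.get(task.strip().lower())
```

For example, `suggest_model("Voiceover")` returns `"ElevenLabs"`, and a task that is not in the table returns `None`, which is the cue to let the agent pick or to read the relevant section above.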

On model choice volatility

Generative model capabilities shift quickly. Today’s “best at typography” model may not be the best six months from now — and we add new models to the picker as they ship. The agent stays current with the model landscape, so if you’re letting it pick, you’ll mostly stay on the right side of those shifts automatically. If you’re picking manually, check this page periodically (we update it when the landscape moves) or ask the agent: “What’s the best model on Melius for [task] right now?”
