UGC videos - Melius

UGC-style video is one of the harder categories of paid creative to produce: it needs to feel authentic, the character needs to look consistent across cuts, the voiceover has to land, and the pacing has to work. Doing all this manually is slow; doing it on Melius removes most of that bottleneck.

Time: 30–60 minutes for a single 30-second spot end-to-end.

You’ll need: a script (or a brief to generate one), a character reference (a moodboard image of someone who fits the persona, or a custom-generated portrait), and a pack shot of the product.

The general shape

A UGC ad is usually three to six short clips stitched together with voiceover and on-screen copy. The Melius workflow:

Define the character — generate a character sheet so the same person appears across every cut.
Generate each clip — image-to-video from a starting frame, using the character sheet for consistency.
Add voiceover — use an audio node to generate the voice.
Stitch the clips — use a stitch node to concatenate them into the finished video.

Timeline polish (captions, cuts, music) happens outside Melius: most teams hand the stitched output to their video editor as a starting point.

Step 1: Character sheet

A character sheet is a small set of images showing the same person from multiple angles, with consistent features. This is what keeps the actor looking like the same person across cuts.

Drop one reference image of the character

This can be a moodboard image or a custom-generated portrait. Avoid using a real, identifiable person.

Brief the agent

“Help me build a character sheet from this reference. I want four or five angles — front-facing, three-quarter left, three-quarter right, profile, and a casual everyday shot. Use GPT Image 2 (Medium). Image-to-image, keep the face and identity consistent.”

The agent generates the angles

You’ll get four to five image nodes. If any face looks different from the reference, regenerate just that node.

Unified-group the angles

Select the character sheet images and create a unified group. You now have a single reference port you can drag into every downstream video node.

You can also consolidate the angles into a single sheet image: “Combine these angles into a single character sheet image in grid format.” The agent creates one image with all angles laid out — handy for handoff and easier to drag around the canvas as one reference.

Step 2: Generate the clips

For each clip in your script:

Create a video node

Right-click → New video node. Pick Seedance 2.0 (best for character work; we have un-gated face access) at 9:16, and choose 720p (faster) or 1080p (production).

Connect inputs

Character sheet → reference image input (keeps face consistent).
Product pack shot → reference image input (if the product appears in the clip).
Brand anchor / script context → context input.

Write the clip prompt

Describe what’s happening in the clip and what the character is doing/saying. Example:

9 seconds. Character (from @character-sheet) sitting at a kitchen
counter, holding @product, looking directly at the phone camera
UGC-style. Natural morning light. Slight head movement, conversational
energy. Setting: casual home kitchen.

Run with 2–3 variations

Video generations are slower and more expensive than image generations, so don’t overdo variations. Three is usually enough to get a usable take.

Repeat for each clip

Duplicate the video node (Cmd+C / Cmd+V) and change only the prompt for each clip. The character sheet and product references stay connected.
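
If you draft the clip prompts outside the canvas before pasting them into each duplicated node, a small template keeps the shared wording fixed while only the per-clip details change. The sketch below is plain Python string formatting, not a Melius API. The field names (seconds, action, lighting, setting) and the second shot are hypothetical; the @character-sheet and @product references follow the example prompt above.

```python
# Shared wording lives in the template; only the per-clip fields vary.
PROMPT_TEMPLATE = (
    "{seconds} seconds. Character (from @character-sheet) {action}, "
    "looking directly at the phone camera UGC-style. {lighting}. "
    "Slight head movement, conversational energy. Setting: {setting}."
)

# Hypothetical shot list: the first entry mirrors the example prompt,
# the second is made up for illustration.
shots = [
    {
        "seconds": 9,
        "action": "sitting at a kitchen counter, holding @product",
        "lighting": "Natural morning light",
        "setting": "casual home kitchen",
    },
    {
        "seconds": 8,
        "action": "standing by a window, holding @product up to the camera",
        "lighting": "Soft afternoon light",
        "setting": "living room",
    },
]

prompts = [PROMPT_TEMPLATE.format(**shot) for shot in shots]

for number, prompt in enumerate(prompts, start=1):
    print(f"Clip {number}: {prompt}")
```

Paste each generated prompt into its own duplicated video node; the reference connections do not change.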

Most video models cap at 9–12 seconds per clip. For a 30-second ad you’ll need three to four clips stitched together. Plan your script around that constraint.
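
The clip count is a ceiling division of the ad length by the per-clip cap, and the number of generations is that count times the variations you run per clip. A minimal sketch of the arithmetic, with hypothetical function names:

```python
import math


def clips_needed(ad_seconds: float, max_clip_seconds: float) -> int:
    """Smallest number of clips that covers the ad when no clip exceeds the cap."""
    if ad_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(ad_seconds / max_clip_seconds)


def total_generations(clip_count: int, variations_per_clip: int) -> int:
    """Every clip is generated once per variation."""
    return clip_count * variations_per_clip


# A 30-second ad at both ends of the 9-12 second cap, with 2-3 variations per clip.
for cap in (9, 12):
    count = clips_needed(30, cap)
    low = total_generations(count, 2)
    high = total_generations(count, 3)
    print(f"cap {cap}s: {count} clips, {low}-{high} generations")
```

A 9-second cap gives four clips and a 12-second cap gives three, which is where the "three to four clips" figure comes from.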

Step 3: Voiceover

Create an audio node

Right-click → New audio node. ElevenLabs is the current voice provider.

Pick a voice

Roughly 20 voices are curated in the picker. We have access to the full ElevenLabs catalog — if you need a specific voice that isn’t in the picker, ping us in Slack.

Paste your VO script

The audio node will generate a clip from the script. Keep clips short — under 30 seconds each — for quality and consistency.
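
To check whether a script fits under the 30-second guideline before pasting it, you can estimate its spoken length from the word count and split it at sentence boundaries. The sketch below assumes a speaking rate of 150 words per minute, which is only a rough figure; the actual pace depends on the voice you pick. The function names are hypothetical and nothing here calls Melius or ElevenLabs.

```python
import re

WORDS_PER_MINUTE = 150  # assumption: rough conversational pace, varies by voice
MAX_SECONDS = 30        # guideline from the text: keep each clip under 30 seconds


def estimate_seconds(text: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate spoken duration from the word count."""
    return len(text.split()) / wpm * 60


def split_script(script: str, max_seconds: float = MAX_SECONDS,
                 wpm: int = WORDS_PER_MINUTE) -> list[str]:
    """Group whole sentences into chunks whose estimate stays under max_seconds.

    A single sentence that is longer than the limit is kept as its own chunk
    rather than being cut mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and estimate_seconds(candidate, wpm) >= max_seconds:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


script = " ".join(["This line has exactly five words."] * 20)
for chunk in split_script(script):
    print(f"{estimate_seconds(chunk):.0f}s: {chunk[:40]}...")
```

Each chunk can then go into its own audio node.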

Custom voice cloning at the team level is in active development. Once it ships, you’ll be able to upload a voice sample and reuse it across canvases for consistent persona-driven UGC. We’ll update this page when it’s live.

Step 4: Stitch the clips

Create a stitch node

Right-click → New stitch node. Or brief the agent: “Use a stitch node to combine the video clips in order: opening, body 1, body 2, body 3, closer.”

Connect each video clip

Drag from each video node into the stitch node in the order they should appear.

Set output settings

Choose the aspect ratio, resolution, and scaling mode (fit-to-screen vs stretch-to-fill) to match the platform you’re shipping to.
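
The difference between the two scaling modes is easiest to see as arithmetic. The sketch below assumes the usual meaning of the terms: fit-to-screen scales a clip uniformly until it fits inside the output frame (leaving padding if the aspect ratios differ), while stretch-to-fill resizes it to the exact output size and may distort it. How the stitch node implements these modes is not described here, so treat the functions as an illustration only.

```python
def fit_to_screen(src_w: int, src_h: int, out_w: int, out_h: int) -> tuple[int, int]:
    """Scale uniformly so the whole clip fits inside the output frame."""
    scale = min(out_w / src_w, out_h / src_h)
    return round(src_w * scale), round(src_h * scale)


def stretch_to_fill(src_w: int, src_h: int, out_w: int, out_h: int) -> tuple[int, int]:
    """Resize to the exact output size, ignoring the source aspect ratio."""
    return out_w, out_h


# A 9:16 clip at 720p going into a 1080x1920 output: same aspect ratio, no padding.
print(fit_to_screen(720, 1280, 1080, 1920))

# A square clip going into the same output: fit keeps it square and leaves
# vertical padding, stretch distorts it to fill the frame.
print(fit_to_screen(1080, 1080, 1080, 1920))
print(stretch_to_fill(1080, 1080, 1080, 1920))
```

If every clip was generated at the same aspect ratio as the output, the two modes give the same result.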

Run

The stitch node outputs a single video file with the clips concatenated.

Step 5: Finish in your editor

Download the stitched video and the voiceover audio (as separate files). Hand both to your video editor — they’ll add captions, sync the voiceover, drop in music, and handle the last-mile polish.

Captions inside Melius are coming with the timeline editor. Until then, captioning is the one part of UGC video work that still leaves the canvas. Most teams handle it in their video editor of choice.

A common pattern: agency UGC at scale

Agencies producing UGC for clients typically structure work like this:

One project per client. Brand anchor lives in a text node at the top.
One canvas per campaign. Holds the character sheet, product references, and all the clips.
Claude drives the build. Via MCP, Claude reads the campaign brief and creative concept, then builds out the canvas — character sheet, clip nodes, voiceover, stitch — automatically. The designer or marketer comes in afterwards to pick winners and polish.

This is how a one-person creative ops team can produce 10–20x more video output than it could a year ago. See Drive Melius from Claude for the setup.

Common pitfalls

Character drift across clips. Without a character sheet, every video model generation produces a slightly different face. Always build the character sheet first.
One clip per take, no variations. Video generations are probabilistic too. Run 2–3 variations per clip — the cost is real but small, and the time saved on bad takes is large.
Trying to render a 30-second clip in one node. Models cap at 9–12 seconds. Plan your script around 3–4 short cuts.
Forgetting product references in the clips that show the product. The model will invent a product that isn’t yours. Connect the pack shot to every video node that needs the product visible.
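
The pitfalls above can be turned into a quick pre-flight check on a shot plan before you spend generations. This is a sketch under stated assumptions: the ClipPlan structure and its field names are hypothetical bookkeeping you would fill in by hand, not something Melius exports, and the 12-second limit is the upper end of the cap mentioned above.

```python
from dataclasses import dataclass

MAX_CLIP_SECONDS = 12  # upper end of the 9-12 second cap mentioned above
MIN_VARIATIONS = 2     # the text recommends 2-3 variations per clip


@dataclass
class ClipPlan:
    """Hypothetical record describing one planned clip."""
    name: str
    seconds: int
    variations: int
    has_character_sheet: bool
    shows_product: bool
    has_product_reference: bool


def preflight(clips: list[ClipPlan]) -> list[str]:
    """Return one message per pitfall found; an empty list means the plan passes."""
    problems = []
    for clip in clips:
        if not clip.has_character_sheet:
            problems.append(f"{clip.name}: no character sheet connected (character drift)")
        if clip.variations < MIN_VARIATIONS:
            problems.append(f"{clip.name}: only {clip.variations} variation(s) planned")
        if clip.seconds > MAX_CLIP_SECONDS:
            problems.append(f"{clip.name}: {clip.seconds}s exceeds the {MAX_CLIP_SECONDS}s cap")
        if clip.shows_product and not clip.has_product_reference:
            problems.append(f"{clip.name}: product is visible but no pack shot is connected")
    return problems


plan = [
    ClipPlan("opening", 9, 3, True, True, True),
    ClipPlan("closer", 30, 1, False, True, False),
]
for message in preflight(plan):
    print(message)
```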
