Most "AI brand video" prompts produce silent stock-style clips that someone has to score, dub, and grade in post before they resemble anything deliverable. Veo 3 is different: it synthesizes audio natively alongside the image, which means a prompt that explicitly directs ambient sound, foley, voiceover, and music can produce something close to a finished spot, not footage waiting to become one. These 30 copy-paste prompts are written for that capability and for the people most likely to use it: brand marketers, agency producers, and founders running their own ads.
What Working Brand Video Prompts Need
Name the brand pillar and the emotional beat before anything else. A brand video isn't "a video about your product" — it's a statement about what your brand believes, made through a specific emotional lens. Is this clip meant to convey craft and obsession? Accessibility and warmth? Bold ambition? Name it at the top of the prompt so every other decision (camera, light, audio, grade) flows from that intent rather than defaulting to generic.
Audio is how brand feel actually lands — direct it explicitly. Veo 3 synthesizes audio from the prompt, not from the visuals alone. If you don't describe what viewers hear, you get ambient room tone that may or may not serve the brand. Write out all four audio layers: ambient environment, foley tied to the product interaction, any dialogue or voiceover (with tone, pacing, and delivery style), and music (genre, mood, instrumentation, dynamic shape). The difference between a brand that feels premium and one that feels generic is usually the sound mix. See the full breakdown in the Veo 3 prompt guide.
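One way to make sure no layer gets skipped is to build the audio block from a small structure that refuses to render when a layer is empty. The sketch below is illustrative only: `AudioLayers` and `render_audio_block` are hypothetical names, and the output is plain text to paste into a prompt, not a call to any Veo API.

```python
from dataclasses import dataclass, fields


@dataclass
class AudioLayers:
    """The four audio layers described above (hypothetical helper)."""
    ambient: str
    foley: str
    voice: str  # dialogue or voiceover; write "None" explicitly if silent
    music: str


def render_audio_block(layers: AudioLayers) -> str:
    """Return a prompt-ready 'Audio:' block, rejecting empty layers."""
    labels = {"ambient": "Ambient", "foley": "Foley",
              "voice": "Voiceover", "music": "Music"}
    lines = ["Audio:"]
    for field in fields(layers):
        value = getattr(layers, field.name).strip()
        if not value:
            raise ValueError(
                f"audio layer '{field.name}' is empty; describe it or write 'None'"
            )
        lines.append(f"- {labels[field.name]}: {value}")
    return "\n".join(lines)


example = AudioLayers(
    ambient="Near-silent studio room tone",
    foley="Soft ceramic clink as the cup is set down",
    voice="None",
    music="Sparse piano, entering late and fading on the final frame",
)
audio_block = render_audio_block(example)
print(audio_block)
```

Writing "None" for a layer is deliberate: it tells the model the silence is a choice, which is how the prompts below handle clips with no voiceover or no music.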
Pick a camera move that flatters the subject. A slow dolly-in creates intimacy and draws the viewer toward the product. An orbit shot communicates dimensionality and craft. A crane rising from close to wide conveys scale and aspiration. A locked-off static frame with movement only in the subject builds authority. These aren't aesthetic preferences — they're rhetorical tools. Choose the move for what it says, not just what it shows.
Specify lighting and color grade tied to your brand palette. Warm tungsten light with a lifted, slightly hazy grade reads as artisanal and approachable. Cool blue-white light with hard shadows reads as clinical precision. Deep shadows with a single hero light source read as luxury. These cues are free to add to a prompt and they do significant work. If you have brand hex codes, translate them to grade notes: "warm amber tones consistent with brand palette #D4A84B and #2C1810."
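If you want a rough starting point for turning hex codes into grade language, a few lines of Python can label each color as warm or cool and as a shadow, midtone, or highlight. This is a sketch under stated assumptions: the hue and lightness thresholds are arbitrary choices for illustration, not a colorimetric standard, and `describe_hex` and `grade_note` are hypothetical helper names.

```python
import colorsys


def describe_hex(hex_code: str) -> str:
    """Map a hex color to a rough grade descriptor (illustrative thresholds)."""
    value = hex_code.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    degrees = hue * 360

    # Assumed split: reds, oranges, and yellows are warm; cyans and blues are cool.
    if saturation < 0.10:
        temperature = "neutral"
    elif degrees < 90 or degrees >= 330:
        temperature = "warm"
    elif 150 <= degrees < 270:
        temperature = "cool"
    else:
        temperature = "neutral"

    if lightness < 0.25:
        zone = "shadow tone"
    elif lightness > 0.75:
        zone = "highlight tone"
    else:
        zone = "midtone"
    return f"{temperature} {zone}"


def grade_note(hex_codes: list[str]) -> str:
    """Build a one-line grade note that names each palette color."""
    parts = [f"{code} ({describe_hex(code)})" for code in hex_codes]
    return "Color grade: consistent with brand palette " + " and ".join(parts) + "."


note = grade_note(["#D4A84B", "#2C1810"])
print(note)
```

For the two example colors this prints "Color grade: consistent with brand palette #D4A84B (warm midtone) and #2C1810 (warm shadow tone)." Treat the labels as a draft and rewrite them in the brand's own vocabulary before they go into a prompt.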
Match the aspect ratio to the channel before you render. Veo 3 can render more than one aspect ratio, though which ones are available depends on the tool you access it through, and a clip framed for the wrong placement is wasted. 16:9 is right for YouTube, TV, and widescreen web placements. 9:16 is for Stories, Reels, and Shorts, the vertical placements where much of paid social runs. 1:1 works for Instagram feed. 4:5 is Instagram portrait, which takes up more real estate in feed. Specify the ratio in the prompt and match it to the planned placement. For multi-channel campaigns, see prompts 26–30 below and read AI prompts for marketing for distribution strategy.
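For a campaign with several placements, it can help to work out up front how many distinct renders the plan needs. The lookup below simply encodes the channel-to-ratio pairs listed above; the placement keys and the `ratios_for` and `aspect_lines` helpers are hypothetical names chosen for this sketch.

```python
# Channel-to-ratio pairs taken from the paragraph above.
PLACEMENT_RATIOS = {
    "youtube": "16:9",
    "tv": "16:9",
    "web": "16:9",
    "stories": "9:16",
    "reels": "9:16",
    "shorts": "9:16",
    "instagram_feed": "1:1",
    "instagram_portrait": "4:5",
}


def ratios_for(placements):
    """Group planned placements by the aspect ratio each one needs."""
    grouped = {}
    for placement in placements:
        key = placement.lower()
        if key not in PLACEMENT_RATIOS:
            raise KeyError(f"no aspect ratio recorded for placement '{placement}'")
        grouped.setdefault(PLACEMENT_RATIOS[key], []).append(key)
    return grouped


def aspect_lines(placements):
    """Return one prompt-ready 'Aspect:' line per required render."""
    return [
        f"Aspect: {ratio} for {', '.join(names)}."
        for ratio, names in ratios_for(placements).items()
    ]


plan = ratios_for(["YouTube", "Reels", "Shorts", "instagram_feed"])
lines = aspect_lines(["YouTube", "Reels", "Shorts", "instagram_feed"])
print("\n".join(lines))
```

Four placements in this example collapse to three renders, because Reels and Shorts share 9:16.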
Brand Anthem & Manifesto Prompts (1–5)
1. Anthemic Brand Film with Voiceover
Brand pillar: We believe that [BRAND BELIEF — e.g., "everyday objects
should be made with the same care as heirlooms"].
A sweeping montage of hands at work — craftspeople, makers, and
everyday users — cut to a rising anthem. Open on extreme close-up
of texture (wood grain, fabric weave, ceramic glaze), then pull
back through a series of dissolves to reveal people in full context.
Final shot: product on a simple surface, single light source, held
in steady hands.
Camera: Slow dolly-in on close-up detail shots; crane rising from
close to wide on the final reveal.
Lighting: Warm practical light (3200K tungsten) supplemented by
soft fill. No harsh shadows. Color grade: warm, slightly lifted
blacks — reminiscent of analog film. Brand palette: [YOUR WARM HEX].
Audio:
- Ambient: Workshop sounds at low level — distant tools, subtle
hum of intention
- Foley: Fabric brush, ceramic clink, deliberate footsteps on
hardwood
- Voiceover: Calm, measured, authoritative — mid-register male
or female voice, unhurried cadence, 3 sentences maximum. VO
starts at 4 seconds, ends before final product reveal. Text:
"We didn't set out to build a company. We set out to get one
thing right. [BRAND TAGLINE]."
- Music: Orchestral indie-folk — acoustic guitar with single
cello, sparse piano entering at the 15-second mark, swelling
to full arrangement by the 25-second mark, then resolving to a single held
note on the product reveal
- Mix: VO is the loudest element; music sits under it at about
-18 LUFS; foley lower still at about -24 LUFS
Aspect: 16:9 for YouTube/TV. Duration: 30 seconds.
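The bracketed placeholders in these prompts are easy to miss when pasting, and a leftover "[BRAND TAGLINE]" can end up spoken in the voiceover. A short script can fill them in and report any that remain. This is a minimal sketch: it assumes placeholders are written in capitals inside square brackets, as in the prompts here, and `fill_prompt` and `unfilled` are hypothetical helper names.

```python
import re

# A placeholder is a bracketed label in capitals, optionally followed by a
# hint such as an "e.g." example. Group 1 captures only the label.
PLACEHOLDER = re.compile(r"\[([A-Z0-9']+(?:[ -][A-Z0-9']+)*)[^\[\]]*\]")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace each placeholder whose label appears in `values`."""
    def substitute(match: re.Match) -> str:
        label = match.group(1)
        # Unknown labels are left untouched so they can be reported later.
        return values.get(label, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def unfilled(prompt: str) -> list[str]:
    """List the labels of placeholders still present in the prompt."""
    return [match.group(1) for match in PLACEHOLDER.finditer(prompt)]


template = (
    "Brand pillar: We believe that [BRAND BELIEF]. "
    "Brand palette: [YOUR WARM HEX]. "
    'Text: "[BRAND TAGLINE]."'
)
values = {
    "BRAND BELIEF": "everyday objects deserve heirloom care",
    "BRAND TAGLINE": "Made to last",
}
filled = fill_prompt(template, values)
remaining = unfilled(filled)
print(filled)
print("Still to fill:", remaining)
```

Here the check reports `YOUR WARM HEX` as still unfilled, which is the cue to add the palette value before rendering.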
2. "We Believe" Manifesto Film
Brand pillar: Conviction — this brand stands for something specific
and isn't apologizing for it.
A direct-to-camera manifesto film. Single subject — a founder or
brand spokesperson — speaking to camera in a simple, un-decorated
environment (raw brick wall, open warehouse, plain studio). No
B-roll cuts. No text overlays. Just the person and the words.
Camera: Locked-off medium close-up (chest to top of head). Subject
in sharp focus, background in soft bokeh. No camera movement until
the final beat, where a very slow push-in closes on the face as
the final line lands.
Lighting: Single large softbox camera left, subtle bounce fill
camera right. Hard-edged practical behind subject for depth.
Color grade: desaturated midtones, strong skin tones — Kodak
Vision3 analog reference.
Audio:
- Ambient: Almost silent — very faint room tone to avoid
a dead signal
- Foley: None — this is purely voice-driven
- Dialogue: Warm, personal, direct — not a performance, a
conversation. Pacing: deliberate, with pauses. Text: "We
believe [BELIEF 1]. We believe [BELIEF 2]. We believe that
[BRAND CONVICTION]. That's why we built [BRAND/PRODUCT]."
- Music: Single sustained piano note that enters on the final
sentence, holds through the last beat, and fades — nothing
more
- Mix: Dialogue completely dry and present; music at -28 LUFS,
supporting, never competing
Aspect: 16:9 and 9:16 (frame subject center for both crops).
Duration: 20 seconds.
3. Slow-Motion Mission Montage
Brand pillar: Relentless pursuit of quality — the brand's mission
is visible in how it works, not just what it makes.
A slow-motion montage of the brand's work and process. No dialogue —
pure visual and audio storytelling. Sequence: raw material being
selected → hands working with precision → product taking form →
finished product in use by a real person → logo resolve.
Camera: Series of slow-motion close-ups (120fps feel) interspersed
with normal-speed wide establishing shots. Each close-up: dolly-in
or rack focus. Final wide shot of product in context: locked off.
Lighting: Natural light preferred — window light with bounced fill.
Golden hour palette on exterior shots. Color grade: rich, warm,
slightly desaturated — editorial documentary feel.
Audio:
- Ambient: Each environment rendered accurately — workshop hum,
outdoor wind, quiet studio
- Foley: Emphasized process sounds in slow motion (tactile,
satisfying) — material texture, tool contact, final product
interaction
- Voiceover: None — the work speaks
- Music: Post-rock instrumental — starts with fingerpicked guitar
loop, builds with layered percussion at the midpoint, full
crescendo as product is revealed in use, resolves to single
acoustic note on logo
- Mix: Foley prominent and musical, blending with score —
sound design is the score
Aspect: 16:9. Duration: 45 seconds.
4. Day-in-the-Life Brand Identity Film
Brand pillar: The brand fits naturally into a life that values
[BRAND VALUE — e.g., "intentional living" or "high performance"].
A single character's day, structured as a visual poem. Not a
product demo — the product appears but the protagonist is the
character and how they move through the world. Dawn to dusk:
morning ritual → work → lunch break → afternoon focus → evening
wind-down. Brand product appears in 2-3 scenes, used naturally.
Camera: Observational — handheld with intention (not shaky,
stabilized but with organic movement). Eye-level. Intimate.
Character never addresses camera.
Lighting: Practical and natural throughout the day. Warm morning
light, harsh midday managed with shade and fill, soft evening
practical. Color grade: consistent warm-neutral palette —
lifestyle editorial, clean grain.
Audio:
- Ambient: Full environmental audio for each scene —
morning birds, city hum, café murmur, evening quiet
- Foley: Coffee maker, keyboard, product interaction sounds
woven naturally
- Voiceover: Calm, introspective internal monologue — 2 lines
maximum, entering at the midpoint and at the close. Tone:
self-assured, unhurried. "[LINE 1 AT MIDPOINT]. [CLOSING
REFLECTION]."
- Music: Lo-fi jazz or ambient electronic — present
throughout at a low level (-22 LUFS), swelling slightly
in the final third; no hard breaks, continuous feel
- Mix: Ambient and foley carry the energy; music provides
emotional undercurrent; VO sits clearly above both
Aspect: 16:9. Duration: 60 seconds.
5. Abstract Values Brand Film
Brand pillar: The brand's core values — [VALUE 1], [VALUE 2],
[VALUE 3] — expressed visually without literal product footage.
A fully abstract brand film: shapes, light, motion, and material
that evoke the emotional world of the brand without showing the
product directly. Think title sequence energy — every frame is
brand-expressive. Closing 3 seconds: brand logo on black.
Camera: Macro and abstract — extreme close-ups of materials,
light through surfaces, motion blur, geometric forms in motion.
Smooth transitions via morphing and color-state changes.
Lighting: Highly controlled — single-color lighting rigs that
shift through brand palette colors. Color grade: brand palette
is the grade — no neutral moments. Reference: [BRAND PALETTE
COLORS].
Audio:
- Ambient: Synthesized abstract environment — this is a
constructed world, not a real one
- Foley: Abstracted material sounds (glass resonance,
metallic hum, soft impact) tuned to musical pitch
- Voiceover: None
- Music: Original-feeling electronic composition — opening
with a single tone that represents the brand, building
into a minimal electronic arrangement that climaxes at
the 20-second mark, resolving to silence on the logo card
- Mix: Music and sound design unified — one continuous
sonic experience, no separation
Aspect: 16:9. Duration: 25 seconds.
Product Hero Prompts (6–10)
6. Studio Hero Rotation
Brand pillar: This product is worth examining from every angle —
craft and design are the proof points.
A single product, studio environment, slow camera orbit of roughly 270 degrees.
The product is the only subject. Nothing else in frame. Background:
pure white or brand-signature color. Product: [PRODUCT NAME AND
BRIEF PHYSICAL DESCRIPTION].
Camera: Camera orbits the product at product-center height,
completing approximately 270 degrees over the clip duration.
Start at 3/4 front angle, orbit to reveal back, return to
near-front. Smooth continuous motion, no jump cuts.
Lighting: Three-point product lighting — key light at 45 degrees
(hard, to define form), fill at 135 degrees (soft, 2 stops under
key), rim light from below-back (creates separation). Color
grade: clean and accurate — no warming or cooling that would
misrepresent product color.
Audio:
- Ambient: Studio silence — imperceptible room tone only
- Foley: Very subtle, high-frequency studio ambient (air
conditioning at -40 LUFS) — enough to signal "premium
environment" without being audible as a sound
- Voiceover: None
- Music: Minimal electronic — single repeating motif that
suggests precision and intention, 4-bar loop, no
development; the visuals carry the clip
- Mix: Music at -30 LUFS, essentially subliminal; foley
barely audible; this clip is nearly silent
Aspect: 1:1 for Instagram feed and 16:9 for web. Duration: 8 seconds.
The product should look exactly as in the reference image —
do not reinterpret shape, color, materials, or proportions.
7. Hero Product in Dramatic Light
Brand pillar: This product belongs in a conversation about
beautiful objects — premium, considered, worth displaying.
A single product shot with dramatic, high-contrast lighting.
Product emerges from near-darkness into a single defined beam
of light — the product discovery moment. Slight mist or
atmospheric haze adds depth. No movement except a barely
perceptible slow push-in that comes to rest on the hero angle.
Camera: Starting slightly wide of product center, very slow
dolly-in arriving at ideal hero angle over 8 seconds.
End frame is the shot that could be a print ad.
Lighting: Single hard source (spot or fresnel) at 30 degrees
above, slightly camera left. Deep shadows. Background: black
to near-black. Color temp: 5600K with a slight amber gel —
warm but not soft. Color grade: high contrast, rich shadows,
clean highlights — cinematic, not commercial.
Audio:
- Ambient: Low, resonant hum — almost felt rather than heard;
sub-bass presence at -45 LUFS
- Foley: A single soft surface contact sound at the opening
frame — the product being placed — then silence
- Voiceover: None
- Music: Single sustained string note that swells very slowly
from inaudible to -18 LUFS over the clip duration,
then holds
- Mix: The music swell is the emotional arc; everything
else is near-silent
Aspect: 16:9. Duration: 10 seconds.
The product should look exactly as in the reference image —
do not reinterpret shape, color, materials, or proportions.
8. Hero Product on Contextual Surface
Brand pillar: This product was made for a specific life —
and that life is aspirational.
Product placed on a surface that evokes its natural context
(kitchen counter, desk, bathroom shelf, workshop bench —
choose the surface that fits [YOUR PRODUCT'S CONTEXT]).
The environment is blurred but readable. Natural light from
a window at frame left creates long, soft shadows.
Camera: Low-angle static shot — camera at surface level,
looking across the product. After 3 seconds, a very gentle
rack focus pulls from the soft foreground onto the product
and stays there, leaving the background blurred. No other movement.
Lighting: Single large window source from camera left.
Warm morning light (4000K). Soft shadows at approximately
45-degree angle across the surface. Color grade:
lifestyle-editorial — warm tones, slightly lifted highlights,
natural color accuracy for the product.
Audio:
- Ambient: The contextual environment — morning kitchen
sounds, distant outdoor birds, or workspace hum
depending on the surface
- Foley: A subtle, satisfying contact sound on the opening
frame, as if the product has just been set down on
the surface
- Voiceover: None
- Music: Acoustic guitar single-chord loop with soft
brushed percussion — warm, intimate, not intrusive;
fades out by final 2 seconds leaving only ambient
- Mix: Ambient carries the setting; music provides warmth;
product foley anchors the moment
Aspect: 4:5 for Instagram portrait. Duration: 8 seconds.
The product should look exactly as in the reference image —
do not reinterpret shape, color, materials, or proportions.
9. Hero Product with Hand Interaction
Brand pillar: This product is made to be used — the interaction
reveals quality that static shots can't convey.
A hand enters frame from below-right, picks up the product,
examines it — a deliberate, unhurried interaction. The
hand's movement communicates that the product feels
significant to hold. Set on a clean surface, neutral
background.
Camera: Medium close-up. Camera is static. The action
is in the hand and product. A slow rack focus follows
the product from surface to lifted position. The hand
that interacts should be deliberately cast — clean,
expressive, appropriate to brand identity.
Lighting: Soft two-light setup — large softbox overhead
slightly front, small fill from below to eliminate
under-shadow on the hand. Product color-accurate.
Color grade: clean and honest — no stylization that
sacrifices product color accuracy.
Audio:
- Ambient: Near-silent — faint studio presence
- Foley: This is the centerpiece — the sound of picking
up the product: surface texture leaving, weight settling
in the hand, any material-specific sounds (click, hum,
texture). Every interaction sound is precisely foley'd
and satisfying
- Voiceover: None
- Music: None — this clip lives on its product sounds
- Mix: Foley is the hero; no music competing with it
Aspect: 1:1. Duration: 6 seconds.
The product should look exactly as in the reference image —
do not reinterpret shape, color, materials, or proportions.
10. Foley-Driven Product Detail Reel
Brand pillar: The craft and materiality of this product are
the selling point — details that photographs miss.
A fast-paced but controlled series of extreme close-up
macro shots of the product's physical details — texture,
joints, finishes, mechanisms, surfaces. Each cut is
matched with a precise foley sound. This is a product
detail clip designed for an audience that appreciates
craft.
Camera: Series of 2-second macro close-ups. Shallow depth
of field on each. Mix of static and very slow micro-dolly
moves. Cut rhythm: regular, every 2 seconds. Final shot:
pull back from the final detail to full product reveal
held for 2 seconds.
Lighting: Variable per shot to flatter each detail —
raking light for texture, transmitted light for
translucent elements, spot for metallic highlights.
Color grade: consistent across all shots despite
different lighting — brand palette maintained throughout.
Audio:
- Ambient: Absent — replaced entirely by foley
- Foley: Each cut introduces a new, precise sound matched
to the material shown. Fabric: soft brush. Metal:
resonant tap. Wood: warm knock. Mechanism: deliberate
click. These sounds are elevated and musical — cut
precisely to each edit
- Voiceover: None
- Music: Rhythmic minimal electronic that matches the
2-second cut rhythm — each foley hit falls on a
musical beat; music and sound design are one composition
- Mix: Foley and music fully integrated; no separation
Aspect: 9:16 for Reels/Shorts. Duration: 15 seconds.
The product should look exactly as in the reference image —
do not reinterpret shape, color, materials, or proportions.
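The timing in prompt 10 is worth checking before you render: 2-second cuts do not divide evenly into 15 seconds, so the leftover time has to go somewhere. The sketch below assumes it goes to the pull-back that precedes the 2-second hold, which is one reading of the prompt, not the only one. The function names are hypothetical.

```python
def plan_detail_reel(duration_s: int, cut_s: int = 2, hold_s: int = 2) -> dict:
    """Split a clip into fixed-length macro shots plus a final reveal.

    Whatever time is left after the macro shots and the held reveal is
    assigned to the pull-back move (an assumption of this sketch).
    """
    if duration_s <= hold_s:
        raise ValueError("clip is too short for the final hold")
    macro_shots = (duration_s - hold_s) // cut_s
    pull_back_s = duration_s - hold_s - macro_shots * cut_s
    return {
        "macro_shots": macro_shots,
        # Start time of each macro shot, in seconds from the top of the clip.
        "cut_times_s": [i * cut_s for i in range(macro_shots)],
        "pull_back_s": pull_back_s,
        "hold_s": hold_s,
    }


def cuts_land_on_beat(bpm: int, cut_s: int = 2) -> bool:
    """True if every cut falls on a whole beat at the given tempo."""
    return (bpm * cut_s) % 60 == 0


reel = plan_detail_reel(15)
print(reel)
print("120 BPM on the beat:", cuts_land_on_beat(120))
print("100 BPM on the beat:", cuts_land_on_beat(100))
```

For a 15-second clip this gives six macro shots, a 1-second pull-back, and the 2-second hold. A tempo such as 120 BPM puts each 2-second cut on a beat, while 100 BPM does not, so naming a compatible tempo in the Music line is a cheap way to ask for foley hits that land in time.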
Founder Story & Behind-the-Scenes Prompts (11–15)
11. Founder Talking-Head with Cutaways
Brand pillar: The person behind the brand is the reason to
believe — authenticity and conviction come through in their
direct testimony.
A founder interview-style clip. Founder speaks to slightly
off-camera interviewer (not directly to lens — more candid,
less performative). After 5 seconds of establishing the
founder's face and setting, cut to B-roll of the product
or process, then return to founder for the closing
statement.
Camera: Medium close-up on founder, camera slightly right
of subject, subject looks slightly left of camera. B-roll:
matching medium and close-up shots of the product or work
described. Return: slightly tighter on founder for
closing line.
Lighting: Interview lighting — large softbox at 45 degrees,
rim light from behind, practical background element for
depth. Warm but honest. Color grade: documentary-editorial,
slightly warm, natural skin tones.
Audio:
- Ambient: Interview room — low, present, honest
- Foley: B-roll sounds match the visuals — workshop,
studio, or process sounds
- Dialogue: Founder's actual or scripted candid speech.
Tone: direct, personal, no corporate language, warm
and confident. Pacing: natural with pauses. "[FOUNDER
OPENING STATEMENT ABOUT WHY THEY STARTED]. [B-ROLL].
[CLOSING LINE ABOUT WHAT THEY BELIEVE]."
- Music: Sparse acoustic guitar entering only during
B-roll section at -24 LUFS, fading back out when
founder returns
- Mix: Dialogue primary; music and ambient clearly
secondary
Aspect: 16:9. Duration: 25 seconds.
12. Founder on the Factory Floor
Brand pillar: This founder is close to the work —
not a figurehead, someone who understands every step
of making this product.
A founder walking through the production or creation
environment — workshop, factory floor, kitchen, studio —
gesturing at processes and materials. Observational style:
camera following, not staged. The environment is the
proof point.
Camera: Documentary handheld — stabilized but with
intentional organic movement. Following founder at
mid-distance, occasionally breaking off to capture
what they're pointing at, then returning. Natural
walking pace.
Lighting: Available light in the production environment,
supplemented by a single small on-camera LED fill to
lift the founder's face. Honest and
industrial. Color grade: cinéma vérité — slight
desaturation, accurate color, editorial grain.
Audio:
- Ambient: Full production environment — machinery,
ventilation, the sound of work happening
- Foley: Tools, materials, and production sounds
audible and present — this environment is the
proof of the brand's claims
- Dialogue: Founder narrating in motion — 2 to 3
brief, specific statements about what they're
showing. Conversational, slightly imperfect,
completely credible. "This is where we [PROCESS].
We do it this way because [REASON]. Every [PRODUCT]
goes through [STEP]."
- Music: None — the production environment IS the
score
- Mix: Ambient and dialogue only; foley details
emerge naturally from the environment
Aspect: 16:9. Duration: 30 seconds.
13. Founder Product Origin Story
Brand pillar: This product exists because of a personal
need the founder had — the origin story is the brand's
most credible proof of product-market fit.
A cinematic treatment of the founding moment — not a
talking-head but a visually dramatized memory. Founder
narrates over images that represent the problem they
experienced and the moment they decided to solve it.
Camera: Three-scene structure. Scene 1: A memory —
slightly soft, warmer color — the founder encountering
the problem (2-3 shots, each 3 seconds). Scene 2:
A transition moment — close-up of hands, the beginning
of making. Scene 3: Product exists — the resolution
shot in current-day quality.
Lighting: Memory scenes: warm, slightly overexposed,
dreamlike. Transition: dramatic, single source.
Resolution: clean and aspirational. Color grade:
temperature shift across the three scenes mirrors
the emotional arc.
Audio:
- Ambient: Each scene's environment rendered fully
- Foley: Contrast between problem-era sounds and
resolution sounds is part of the emotional shift
- Voiceover: Founder narrating in first person,
reflective and specific — calm, authoritative,
personal. "I was [SITUATION]. Every time I [PROBLEM
ACTION]. So I decided to make something different.
[PRODUCT NAME] is what I wish had existed."
- Music: Piano motif in the memory scenes, building
to a warmer, fuller arrangement as the resolution
arrives; no music during the transition beat
- Mix: VO guides the clip; music and ambient support
without competing
Aspect: 16:9. Duration: 35 seconds.
14. Behind-the-Scenes Craftsmanship
Brand pillar: The process of making this product is itself
a demonstration of the brand's values — patience, skill,
and care.
A process documentary clip: a single artisan or maker
completing one meaningful step in creating the product.
No dialogue. Pure process. The work speaks for itself.
Camera: Mix of wide establishing (to show context) and
extreme macro (to show detail of the craft). Free-moving
but purposeful. Close-ups prioritize the most technically
interesting or visually satisfying moments in the process.
Lighting: Natural workshop or studio light supplemented
by a tungsten practical overhead. Warm, imperfect, human.
Color grade: rich saturation in the materials, honest
skin tones on the hands — slightly cinematic but not
over-stylized.
Audio:
- Ambient: Workshop environment — present and immersive;
the sonic world of making
- Foley: The craft sounds elevated to documentary-quality
sound design — each tool, material interaction, and
process step heard clearly and with texture
- Voiceover: None
- Music: None in first half — the craft sounds carry it;
a sparse piano or acoustic motif begins at the halfway
mark and gently rises to meet the completion of the step
- Mix: Foley and ambient dominant throughout; music
only enters to close emotionally
Aspect: 16:9. Duration: 20 seconds.
15. Founder and Customer Moment
Brand pillar: The relationship between this brand and its
customers is the real product — the founder cares about
the outcome, not just the sale.
A candid or lightly staged interaction between the founder
and a real or represented customer — receiving feedback,
seeing the product in use, sharing in the moment.
Warm, honest, unperformed.
Camera: Two-shot to begin (founder and customer in
conversation or shared moment), then alternating close-ups
on each face as they react to each other. Final frame:
both in frame, product between them.
Lighting: Warm practical environment. Candid lighting —
not interview-lit. Natural. Color grade: warm,
humanistic — the tones of a good afternoon.
Audio:
- Ambient: The environment of the meeting — café,
studio, outdoor — full and present
- Foley: Product handling sounds if the product
is passed between them
- Dialogue: Natural, overlapping at edges, genuine.
Customer says something specific and positive about
the product. Founder responds with something that
reveals they care about this particular use case.
Neither person is performing. Tone: warm, personal,
real. Both voices present and clear.
- Music: Acoustic guitar and light percussion, warm
and unhurried, at -20 LUFS throughout, fading on
the final frame
- Mix: Dialogue primary; music provides warmth
without overriding the candor of the moment
Aspect: 16:9 and 1:1. Duration: 20 seconds.
Problem-Solution Ad Prompts (16–20)
16. Problem Setup to Product Reveal to Result
Brand pillar: This product solves a real, specific problem
that the target customer has lived — and the relief is
the emotional payoff.
Three-act structure in 30 seconds. Act 1 (0–10s):
the problem rendered specifically — a person experiencing
[SPECIFIC PROBLEM] in a real environment. Act 2 (10–18s):
the product introduced. Act 3 (18–30s): the problem
resolved, the person's reaction, the result.
Camera: Act 1: documentary handheld, problem POV,
slightly unsettled. Act 2: controlled studio-adjacent
product reveal — slow push-in. Act 3: static or
slow pull-back, resolved, open.
Lighting: Act 1: cool, slightly harsh, high-contrast —
the problem looks uncomfortable. Act 2: neutral,
clean product light. Act 3: warmer, lifted, resolved.
Color grade: temperature shift across the three acts.
Audio:
- Ambient: Act 1: the sound of the problem environment —
present and slightly intrusive. Act 3: the same
environment but quieter, easier.
- Foley: Product interaction in Act 2 is satisfying
and deliberate — the product sounds like it works
- Voiceover: Urgent, direct — entering in Act 1:
"If you [PROBLEM DESCRIPTION], you know how it
feels." Brief silence. Then in Act 3:
"[PRODUCT NAME] changes that."
- Music: Act 1: dissonant, low-energy, slightly
anxious minimal track. Act 2: silence or a single
resolving chord. Act 3: bright, resolving,
forward-moving — same instrument as Act 1 but
in a major key
- Mix: VO primary; music carries the emotional arc
of each act
Aspect: 16:9 and 9:16. Duration: 30 seconds.
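When you adapt the three-act timings to a different duration, it is easy to leave a gap or an overlap between acts. A small check like the one below confirms that the acts run back to back and end exactly at the clip length. It is a sketch with a hypothetical `check_acts` helper; the timings are the ones given in prompt 16.

```python
def check_acts(acts, duration_s):
    """Verify acts are contiguous from 0 to duration_s; return their lengths.

    `acts` is a list of (name, start_s, end_s) tuples in running order.
    """
    cursor = 0
    lengths = {}
    for name, start_s, end_s in acts:
        if start_s != cursor:
            raise ValueError(
                f"{name} starts at {start_s}s but the previous act ends at {cursor}s"
            )
        if end_s <= start_s:
            raise ValueError(f"{name} has no running time")
        lengths[name] = end_s - start_s
        cursor = end_s
    if cursor != duration_s:
        raise ValueError(f"acts end at {cursor}s but the clip runs {duration_s}s")
    return lengths


acts = [
    ("Act 1: problem", 0, 10),
    ("Act 2: product", 10, 18),
    ("Act 3: result", 18, 30),
]
act_lengths = check_acts(acts, 30)
print(act_lengths)
```

With the timings above the acts run 10, 8, and 12 seconds, so the result gets the most screen time and the product reveal the least.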
17. Irritation to Relief
Brand pillar: This product removes friction — its value
is what it eliminates, not just what it adds.
A single-character clip. Open on a moment of genuine
minor irritation — the kind that's immediately
recognizable to the target customer. The character
is mid-action, encountering [SPECIFIC FRICTION POINT].
Cut to: they have [PRODUCT]. Cut to: the same action,
now effortless. Close on: character's expression —
small, genuine satisfaction.
Camera: Three short shots, each locked-off or with
minimal movement, cut quickly: problem (3s),
product (2s), resolution (4s). Then hold on the final
expression for the remaining 3 seconds with a slow
push-in closing on the face.
Lighting: Consistent across all three — natural,
realistic environment lighting. This is not a
stylized ad; it feels like a real moment.
Audio:
- Ambient: Consistent environment — same location
across all three beats
- Foley: The friction sound in the first beat is
grating, specific, and recognizable. The equivalent
sound in the third beat is smooth, satisfying, resolved.
- Dialogue: Character mutters a single word in the
first beat — an expletive bleeped, or "seriously?"
or just an audible exhale. Nothing in the second or
third beat — the result speaks.
- Music: None — foley carries the entire emotional
argument; silence makes the contrast louder
- Mix: Foley-only; ambient at low level; dialogue
natural and unprocessed
Aspect: 9:16. Duration: 12 seconds.
18. Frustration to Delight
Brand pillar: This brand understands the emotional
experience of using products in this category —
including the bad one — and built something that
replaces frustration with genuine delight.
A character encounters the category experience
before and after [PRODUCT]. The before version is
clearly the old way — the character's body language
and expression register frustration without
overacting. The after version: same task, same
person, visible delight that doesn't need to be
telegraphed.
Camera: Before: handheld, slightly close, slightly
uncomfortable framing. After: wider, static, room
to breathe. Character's face in close-up for the
final reaction.
Lighting: Before: slightly cooler, less flattering.
After: warmer, cleaner, the same setting made
slightly more beautiful. Subtle shift — feels like
the same day, same place, different experience.
Audio:
- Ambient: Same environment — the shift is in tone,
not location
- Foley: Before: the clunky, frustrating sounds of
the old way. After: clean, satisfying, smooth —
the product sounds better than the alternative
- Voiceover: Energetic, direct, relatable —
entering at the transition: "There's actually
a better way to [TASK]. [PRODUCT NAME]."
- Music: Before: discordant, compressed, slightly
annoying — mirrors the experience. After: open,
warm, minor-to-major resolution. The musical
shift is the emotional argument.
- Mix: VO direct and present; music shift is
the key moment; foley supports the contrast
Aspect: 16:9 and 9:16. Duration: 20 seconds.
19. Before-and-After with Audio Shift
Brand pillar: The difference this product makes is
viscerally audible and visible — before and after
aren't just different states, they're different
sonic and visual worlds.
A split-narrative clip: the left half of the frame
(or first clip) shows the before state; the right
half (or second clip after a hard cut) shows the
after. The audio shift is as deliberate as the
visual shift — this clip is engineered to land
on sound.
Camera: Option A (split-screen): left and right
simultaneously, hard vertical line through center,
identical shot composition on each side — same
angle, same environment, same character position.
Option B (hard cut): identical framing before and
after, hard cut at midpoint.
Lighting: Before: slightly desaturated, cooler,
less lift. After: warmer, more saturated, cleaner
highlights — same scene, better version.
Audio:
- Ambient: Before: the messy, loud, or uncomfortable
sounds of the old state. After: the same scene,
quieter, cleaner, better-sounding.
- Foley: The task or interaction sounds radically
different between states — the after version is
satisfying, the before is grating
- Voiceover: None — the audio shift makes the
argument without words
- Music: Before: discordant or absent. After:
a clean, warm musical phrase that begins
exactly on the cut and completes on the final frame
- Mix: The audio transition at the cut is the
entire ad — make it unmissable
Aspect: 16:9. Duration: 15 seconds.
20. "We Used to Do X, Now We Do Y"
Brand pillar: This brand represents a category
evolution — the old way was accepted because
no one had done better; this product changes
the premise.
A confident, direct ad. Two clips: the old way
(short, slightly comedic in its inadequacy,
not mean-spirited), then the new way (the product
in use, clearly superior). The edit is the
argument: why was anyone doing it the old way?
Camera: Old way: medium shot, slightly low-energy
framing, the old method visible and readable.
New way: wider, cleaner framing — the product
makes things look better spatially as well.
Lighting: Old way: flat, institutional. New way:
intentional, flattering — the product improves
the scene.
Audio:
- Ambient: Old way: slightly dull, institutional
ambient — the sound of accepting the mediocre.
New way: cleaner, livelier version of the same
environment.
- Foley: Old way: the awkward or unsatisfying
sounds of the old method. New way: clean,
satisfying product sounds.
- Voiceover: Calm, slightly wry, authoritative —
entering over the old way: "For [TIME PERIOD],
people [OLD METHOD]." Over the new way:
"We thought there was a better approach."
- Music: A single confident musical statement
beginning on the cut to the new way —
forward-moving, decisive; nothing during
the old way
- Mix: VO primary; music enters at the brand's
moment of conviction; foley contrast does
the work
Aspect: 16:9 and 9:16. Duration: 20 seconds.
Lifestyle Commercial Prompts (21–25)
21. Kitchen Morning Routine with Product
Brand pillar: This product belongs in a morning routine
that prioritizes [QUALITY — e.g., "craft", "calm",
"efficiency"] — it makes the most mundane moment
of the day feel considered.
A 20-second kitchen morning scene. Single character,
unhurried pace, the product is a natural part of
the ritual — not highlighted awkwardly, not ignored.
The character moves through the scene as though
no camera is present.
Camera: Series of 3-4 observational shots. Opens
wide on the kitchen — the scene is beautiful and
real. Cuts to medium of the character with the
product. Closes on a detail shot of the product
interaction. Final frame: character in background,
product sharp in foreground, morning light.
Lighting: Morning natural light from windows —
golden, warm, long shadows. No added lights
that feel artificial. Color grade: warm, slightly
lifted — the best version of a real morning.
Audio:
- Ambient: Kitchen morning sounds — coffee maker,
distant outdoor birds, maybe a radio at
low volume in the background
- Foley: Product use sounds in the foreground —
satisfying, domestic, real
- Voiceover: Warm, personal, gentle — like a
recommendation from a friend. Entering at
the product interaction shot: "Some mornings,
everything just works. [PRODUCT NAME]."
- Music: Soft acoustic guitar with light
percussion — the sound of a good morning;
fades under the VO and resolves on the
final frame
- Mix: Ambient and foley create the world;
VO makes the brand statement; music
carries emotional warmth
Aspect: 16:9 and 9:16. Duration: 20 seconds.
22. Weekend Outdoor with Product
Brand pillar: This product belongs in a life that
makes room for outdoor time — it works as hard
as the people who use it, and looks good doing it.
A weekend outdoor scene — hiking trail, camping
spot, park, or backyard depending on the product's
context. Character using the product in its natural
outdoor habitat. Bright, energetic, real.
Camera: Mix of wide landscape shots (product and
character in context) and close medium shots
(character and product interaction). One slow
tracking shot following the character in motion
with the product.
Lighting: Bright outdoor daylight, managed with
a bounce or slight fill to prevent harsh shadows
on faces. Color grade: saturated and bright —
outdoor energy, not moody.
Audio:
- Ambient: Full outdoor environment — wind,
leaves, ambient natural sounds of the specific
location; present and immersive
- Foley: Product use sounds appropriate to
outdoor context — elevated above ambient
to ensure they're heard
- Voiceover: Energetic, confident — the voice
of someone who's outside and happy about it:
"Built for wherever you go."
- Music: Upbeat indie folk or light electronic
with an outdoor energy — driving rhythm,
positive, no melancholy; dynamic: starts
medium, builds through the tracking shot,
peaks at the VO, then resolves
- Mix: Ambient and music in near-equal balance;
VO cuts through clearly; foley punctuates
product moments
Aspect: 16:9 and 9:16. Duration: 20 seconds.
23. Bathroom Self-Care Routine with Product
Brand pillar: This product is part of a self-care
practice that the target customer takes seriously —
it belongs in a ritual, not a rushed obligation.
A bathroom self-care scene. Single character,
morning or evening ritual. Product used with
intention — not rushed, not mechanical. The
space is clean, considered, beautiful without
being unattainable.
Camera: Close-up details interspersed with
medium shots of the character in full ritual
mode. Slow and deliberate cuts. Final shot:
character post-ritual — a brief expression
of satisfaction in the mirror.
Lighting: Warm, soft bathroom lighting —
mirror bounce, soft overhead, no harsh
shadows. Slightly golden. Color grade:
skin tones warm and accurate; product
color accurate; overall: spa-adjacent
without clinical whiteness.
Audio:
- Ambient: Bathroom quiet — water running
softly at a distance, ambient warmth
- Foley: Product sounds are intimate and
sensory — the texture, the sound of
application, the satisfying close of
the container
- Voiceover: Calm, self-assured, slightly
intimate — the voice of someone who
has found a thing that works:
"[PRODUCT] was made for this moment."
- Music: Ambient electronic or soft R&B —
slow tempo, warm, private; the music of
taking time for yourself
- Mix: Foley dominant in product moments;
music provides the emotional register;
VO closes the clip
Aspect: 9:16. Duration: 15 seconds.
24. Urban Commute with Product
Brand pillar: This product belongs in a fast-moving,
urban life — it keeps up, it fits, it solves
problems in real time.
An urban commute scene — subway platform, city
sidewalk, bike lane, café. Character moving
through the city with the product as a natural
part of the commute. The city is alive and the
product is part of navigating it well.
Camera: Dynamic — a mix of tracking shots
following the character, handheld energy
appropriate to urban movement, and one
moment of stillness where the character
pauses to use the product. Urban environment
is in background, slightly bokeh'd but readable.
Lighting: Urban available light — mixed
color temperatures, street lighting, daylight
filtering through buildings. No additional
lighting — this is a real commute. Color grade:
slightly cool urban palette, then warmer
when the character arrives at their destination.
Audio:
- Ambient: City sounds — subway, traffic,
footsteps on concrete; full and present,
cutting through
- Foley: Product use sounds elevated above
the urban ambient — they should be heard
and satisfying even against city noise
- Voiceover: None
- Music: Urban electronic or hip-hop
instrumental — fast enough to match the
commute energy, with a drop or shift at
the product interaction moment; confident
and contemporary
- Mix: Ambient and music in tension —
the city vs. the product moment;
foley of product cuts through both
Aspect: 9:16. Duration: 15 seconds.
25. Family Moment with Product
Brand pillar: This product enables or enhances
a family moment that matters — it's not the
hero, the moment is; the product makes the
moment possible or better.
A family scene — morning breakfast, an outdoor
moment, or a shared activity — where the product
plays a supporting role in enabling connection.
The family is clearly real and unperformed.
The product is present but not foregrounded
awkwardly.
Camera: Observational and warm. Mix of wide
family group shots and close-ups of faces and
shared moments. One close-up of the product
in use — held briefly, then cut back to
the human moment.
Lighting: Natural, warm, family-appropriate.
Morning kitchen light or outdoor afternoon
light. No sharp shadows. The light says
"this is a good moment." Color grade: warm,
slightly lifted, timeless — not trendy.
Audio:
- Ambient: Family environment — voices,
laughter, background sounds of a
real home; present and warm
- Foley: Product use sounds natural
and clear within the family ambient
- Voiceover: Warm, gentle, parental
register — unhurried, meant to be
believed: "For the moments that matter."
- Music: Acoustic piano with soft strings —
warm, unhurried, the sound of something
good and real; dynamic: low throughout,
swelling slightly under the VO
- Mix: Family ambient carries the humanity;
music provides the emotional frame;
VO closes it
Aspect: 16:9 and 1:1. Duration: 20 seconds.
Multi-Shot Campaign Sequence Prompts (26–30)
26. Three-Shot Hook, Demo, CTA Sequence
Brand pillar: This campaign is designed to stop
the scroll, demonstrate value quickly, and
convert — three shots, one argument.
A three-shot paid social sequence. Shot 1 (0–4s):
Hook — something visually or emotionally
arresting that speaks directly to the target
customer's situation. No explanation yet.
Shot 2 (4–12s): Demo — the product solving
the problem clearly and quickly. Shot 3 (12–15s):
CTA — product and brand on screen, direct
call to action.
Camera: Shot 1: arresting, slightly unconventional
framing — make the viewer stop. Shot 2: clear,
well-lit, demonstration-forward — make the value
obvious. Shot 3: clean product on surface,
CTA overlay space at bottom of frame.
Lighting: Consistent across all three shots —
warm, product-accurate, appealing. Color grade:
unified brand palette across all three.
Audio:
- Ambient: Hook shot: immediately relevant
to the target customer's world. Demo shot:
product environment. CTA: near-silent.
- Foley: Demo shot foley is the hero —
the product working, clearly and satisfyingly
- Voiceover: Urgent and direct. Shot 1:
"[HOOK STATEMENT — the problem or desire
in one sentence]." Shot 2 (optional):
"[PRODUCT NAME] — [ONE-SENTENCE DEMO
DESCRIPTION]." Shot 3: "[CTA —
e.g., 'Try it free at [URL]' or
'Shop now — link in bio']."
- Music: Fast, confident, contemporary —
social-native energy; present from the
first frame, consistent across all three
shots; does not distract from VO
- Mix: VO cuts through at all times;
music at -16 LUFS; foley punctuates
the demo
Aspect: 9:16 for Reels/Stories. Duration: 15 seconds.
Render also in 1:1 for feed placement — character/product
stays center-frame in both crops.
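Shot timings like the ones in this sequence are easy to break when you edit them by hand. The following is a minimal Python sketch for checking that the shots are contiguous and add up to the stated duration before the timings go into a prompt. The `Shot` class and `check_timeline` helper are hypothetical names for illustration, not part of any Veo tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """One shot in a sequence prompt; times are in seconds."""
    name: str
    start: float
    end: float


def check_timeline(shots: list[Shot], duration: float) -> list[str]:
    """Return a list of problems; an empty list means the timeline is consistent."""
    problems = []
    cursor = 0.0  # where the next shot is expected to start
    for shot in shots:
        if shot.end <= shot.start:
            problems.append(f"{shot.name}: end must be after start")
        if shot.start != cursor:
            problems.append(f"{shot.name}: starts at {shot.start}s, expected {cursor}s")
        cursor = shot.end
    if cursor != duration:
        problems.append(f"shots end at {cursor}s, but duration is {duration}s")
    return problems


# Timings copied from the three-shot sequence above.
three_shot = [Shot("Hook", 0, 4), Shot("Demo", 4, 12), Shot("CTA", 12, 15)]
print(check_timeline(three_shot, duration=15) or "timeline OK")
```

The same check works for the four-shot and 30-second structures later in this section; only the list of shots changes.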
27. Four-Shot Brand Identity Sequence
Brand pillar: This campaign sequence builds
brand recognition through consistency —
four distinct visual moments that together
define the brand's visual and sonic identity.
A four-shot brand identity sequence intended
for display and social media. Shot 1: Logo/wordmark
reveal (branded graphic or in-environment logo).
Shot 2: Hero product shot. Shot 3: Lifestyle
moment with product. Shot 4: CTA card with
tagline and product.
Camera: Shot 1: graphic treatment or simple
push-in from black. Shot 2: studio hero orbit.
Shot 3: lifestyle medium shot. Shot 4: clean,
static product-on-surface with text overlay zone.
Lighting: Unified across all shots — brand palette
drives the color temperature. Consistent grade
across all four shots so they read as a system.
Audio:
- Ambient: Present in the lifestyle shot only;
silent or minimal in all others
- Foley: Product sounds in Shots 2 and 3
- Voiceover: Calm, authoritative brand
voice — entering at Shot 3:
"[BRAND TAGLINE OR VALUE STATEMENT]."
CTA at Shot 4: "[ACTION —
'Discover more at [URL]']."
- Music: A single brand sonic motif —
4-bar loop, consistent across all four
shots, resolving cleanly on the CTA
card. This is the brand's audio
identity element.
- Mix: Music consistent and clear throughout;
VO in Shots 3 and 4 primary; foley
supports without competing
Aspect: 16:9, 9:16, and 1:1 — deliver as three
renders with consistent center-frame composition.
Duration: 20 seconds total (5 seconds per shot).
28. 30-Second TV-Ready Commercial
Brand pillar: This brand belongs in living rooms —
the commercial is crafted to the standard of
broadcast TV, not social media.
A fully structured 30-second commercial with
broadcast-quality visual and audio production.
Classic structure: Seconds 0–5: establish
the world. Seconds 5–20: problem and product
demonstration. Seconds 20–28: emotional resolution
and brand moment. Seconds 28–30: logo and tagline card.
Camera: Professional broadcast composition throughout.
Wide establishing, medium action, close detail,
two-shot resolution, product-and-logo lockup.
Every cut is motivated. No handheld — smooth,
deliberate, broadcast-standard.
Lighting: Three-point lighting in every setup,
consistently daylight-balanced at 5600K across the
main sequence, warmer on the resolution sequence. Color grade:
broadcast-safe, rich but not oversaturated,
product color completely accurate.
Audio:
- Ambient: Each environment rendered fully and
appropriately; ambient levels are broadcast-safe
- Foley: Complete and professional — every surface
contact, product interaction, and movement is foley'd
- Voiceover: Professional broadcast VO —
warm, authoritative, trust-building register.
Entering at second 5 and continuing through
second 28 with natural pauses.
"[PRODUCT CATEGORY TRUTH]. [PRODUCT NAME]
[DOES THIS]. For [TARGET CUSTOMER DESCRIPTION],
[PRODUCT NAME] is [BRAND PROMISE]. [TAGLINE]."
- Music: Original-sounding orchestral or
produced track with a four-part dynamic shape:
establishment, build, emotional resolution, closing sting on the logo card.
Full broadcast mix quality; music plays
continuously for 30 seconds with intentional
dynamic shaping under the VO
- Mix: Broadcast loudness target of -23 LUFS
integrated (EBU R128; US broadcast uses -24 LKFS);
VO is the loudest element; music ducked under VO,
full during non-VO moments; foley consistent
and kept below the VO
Aspect: 16:9. Duration: 30 seconds.
29. Multi-Channel Campaign Hero
Brand pillar: One creative concept, delivered
with fidelity across every channel — the campaign
hero that informs all executions.
A single master brand film that is compositionally
designed to crop cleanly across all four channel
aspect ratios without losing the key visual
element (product + character). All action and
key visuals are center-frame.
Prompt this clip with the master composition
(subject/product centered, 20% margin on all
edges from any critical elements), then re-prompt
with each target aspect ratio to maintain consistency.
Camera: Central composition throughout. Key action
happens in the center 60% of the frame. Wide
headroom. Slow movements that work in all
orientations. No critical elements in the outer
20% of frame.
Lighting: Brand-consistent. Warm, accurate,
aspirational. Color grade: brand palette,
consistent.
Audio:
- Ambient: The brand world — present and
recognizable across all placements
- Foley: Product interactions consistently
audible across all aspect ratios (audio
is not aspect-dependent)
- Voiceover: Brand VO line — short, brand-ownable,
works as an audio logo. "[BRAND TAGLINE]."
Entering at the 70% mark of the clip duration
in every cut.
- Music: Brand sonic identity — consistent
across all channel deliverables; same
4-bar loop regardless of aspect
- Mix: Identical across all renders;
audio consistency is non-negotiable
for campaign recognition
Aspect: Master in 16:9, then re-prompt for 9:16,
1:1, and 4:5. Duration: 15 seconds.
Note: Veo 3 does not guarantee character and product
consistency across separate generations. Prompt the
master first, then reuse the same subject description
and reference image for each aspect ratio, and compare
every render against the master before delivery.
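Whether a center-safe composition survives a crop is plain arithmetic, so it can be checked before rendering. This is a minimal sketch, assuming a deliverable is center-cropped at full height from the 16:9 master rather than re-prompted. The function name is illustrative.

```python
from fractions import Fraction


def center_crop_width_fraction(master: Fraction, target: Fraction) -> Fraction:
    """Fraction of the master's width kept by a full-height center crop.

    Aspect ratios are width / height. If the target is wider than the
    master, the crop would lose height instead, which this sketch rejects.
    """
    if target > master:
        raise ValueError("target is wider than the master; crop height instead")
    return target / master


MASTER = Fraction(16, 9)
TARGETS = {"9:16": Fraction(9, 16), "4:5": Fraction(4, 5), "1:1": Fraction(1, 1)}

for label, ratio in TARGETS.items():
    kept = center_crop_width_fraction(MASTER, ratio)
    print(f"{label}: keeps the center {float(kept):.1%} of the master width")
```

Under that assumption a 9:16 crop keeps about 32% of the master's width, 4:5 keeps 45%, and 1:1 keeps about 56%. All three are narrower than the center 60% named in the prompt, so if any version will be cropped instead of re-generated, keep the key action inside the narrowest crop you plan to deliver.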
30. Retargeting Ad with Payoff
Brand pillar: This ad is shown to someone who
already knows the brand — it doesn't need to
introduce; it needs to close. The payoff is
the entire point.
A retargeting ad for someone who has already
seen the brand. Skips the problem setup entirely.
Opens on the product. Opens on the result.
Opens on the reason this person should stop
waiting. No context-setting — this viewer
already has context.
Camera: Opens immediately on the best product
shot in the campaign — the hero angle, the
most compelling frame. No build-up. Then cuts
to: testimonial face (real or represented) with
a specific positive reaction. Then: CTA card.
Lighting: Matches the campaign master —
this ad is visually consistent with what
this retargeted viewer already saw.
Color grade: campaign-consistent.
Audio:
- Ambient: Minimal — this ad is not about
establishing a world; it's about converting
- Foley: Product sounds only in the opening shot —
familiar to this viewer from prior exposures
- Voiceover: Direct, slightly urgent,
conversational — the voice of someone
giving a friend a nudge:
"You've seen it. Here's why everyone
who tries [PRODUCT NAME] comes back.
[CTA — specific action with incentive
if applicable]."
- Music: The same sonic motif from the
campaign master — instantly recognizable
to the retargeted viewer; brief, confident,
resolves on the CTA card
- Mix: VO primary throughout — this is
a closing argument; music at -20 LUFS
underneath
Aspect: 9:16 and 1:1. Duration: 10 seconds.
Brand Video Power Tips
Name the brand pillar before anything else. Every creative decision — camera, light, audio, grade — should serve a named emotional intent. "Our product is well-made" is not a pillar. "We believe craft is a form of respect for the customer" is.
Audio carries brand more than visual — direct it explicitly. Write out all four layers: ambient environment, product foley, voiceover tone and pacing, and music with genre, mood, instrumentation, and dynamic shape. Veo 3 synthesizes audio from the prompt. A vague prompt gets generic audio. A specific prompt gets brand audio.
Pick a camera move that flatters the subject, not just one that looks good. A slow dolly-in creates intimacy and draws the viewer toward the product. A push-in on a founder creates authority and presence. An orbit reveals craft and dimensionality. A locked-off frame with movement only in the subject builds authority through stillness. Choose the move that says something, not just one that moves.
Color grade to your brand palette, not to "cinematic." Specify temperature, saturation, contrast, and lift in terms of what your brand looks like — not just generic grade references. If you have brand hex codes, translate them to grade notes. The grade is part of the brand.
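One way to start translating hex codes into grade notes is to convert them to hue, saturation, and value and describe those in words. The sketch below uses Python's standard `colorsys` module. The warm/cool split and the saturation and brightness thresholds are illustrative assumptions, not a colorimetric standard, and `hex_to_grade_note` is a hypothetical helper name.

```python
import colorsys


def hex_to_grade_note(hex_code: str) -> str:
    """Describe a brand hex color in rough grading terms.

    Thresholds here are assumptions for illustration; tune them to taste.
    """
    value = hex_code.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {hex_code!r}")
    r, g, b = (int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, saturation, brightness = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = hue * 360

    if saturation < 0.05:
        temperature = "neutral"  # near-grey colors have no meaningful hue
    elif hue_deg < 90 or hue_deg >= 330:
        temperature = "warm"     # reds, oranges, yellows
    else:
        temperature = "cool"     # greens, blues, violets
    sat_word = "muted" if saturation < 0.35 else "saturated"
    if brightness > 0.8:
        light_word = "lifted"
    elif brightness < 0.35:
        light_word = "deep"
    else:
        light_word = "mid-tone"
    return f"{hex_code}: {temperature}, {sat_word}, {light_word} (hue {hue_deg:.0f} degrees)"


# Hex codes borrowed from the skincare example in this article.
for code in ("#F5E6D3", "#8B6F5E"):
    print(hex_to_grade_note(code))
```

The output is only a starting vocabulary. Rewrite it into the language your colorist or brand guide already uses before it goes into a prompt.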
Match the aspect ratio to the channel and specify it in the prompt. 16:9 for YouTube and TV. 9:16 for Stories, Reels, and Shorts. 1:1 for Instagram feed. 4:5 for Instagram portrait posts, which take up more vertical screen space in the feed. Getting this wrong means re-rendering, which costs time and can break consistency between versions.
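If you generate prompts in bulk, the channel-to-aspect pairing in this tip can live in one lookup so the closing line of every prompt stays consistent. This is a minimal sketch. The channel keys and the `delivery_line` helper are illustrative labels, not platform API identifiers.

```python
# Channel-to-aspect mapping taken from the tip above.
ASPECT_BY_CHANNEL = {
    "youtube": "16:9",
    "tv": "16:9",
    "stories": "9:16",
    "reels": "9:16",
    "shorts": "9:16",
    "instagram_feed": "1:1",
    "instagram_portrait": "4:5",
}


def delivery_line(channel: str, seconds: int) -> str:
    """Build the closing 'Aspect / Duration' line used by the prompts in this article."""
    try:
        aspect = ASPECT_BY_CHANNEL[channel]
    except KeyError:
        raise ValueError(f"no aspect ratio on file for channel {channel!r}") from None
    return f"Aspect: {aspect}. Duration: {seconds} seconds."


print(delivery_line("reels", 15))  # prints "Aspect: 9:16. Duration: 15 seconds."
```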
For multi-channel campaigns, prompt the master first, then re-prompt for each crop. Veo 3 is more likely to hold visual consistency when the same subject description and reference image are carried across prompts, though separate generations can still drift. Prompt the master 16:9 version, then explicitly re-prompt for 9:16 and 1:1 with the same brand brief and subject reference. Keep the audio direction word-for-word identical so that only the framing changes, and compare each render against the master.
Weak prompt: Make a brand video about our skincare product.
Strong prompt: Brand pillar: we believe skincare should feel like a ritual, not a routine: calm, deliberate, earned. A slow dolly-in on a single glass serum bottle on a warm marble surface, morning window light from the left, long soft shadows. Camera: slow dolly-in, 8 seconds, arriving at the hero angle. Lighting: 4000K window source, soft fill, lifted highlights. Color grade: warm neutral, brand palette #F5E6D3 and #8B6F5E. Audio: ambient is a quiet bathroom morning with distant birds; foley is the deliberate placement of the bottle on marble and the cap removed with a soft click; VO is a calm, self-assured, intimate female voice entering at 5 seconds: "Take your time."; music is a sparse piano motif, 4-bar loop, resolving on the final frame. Aspect: 9:16 for Instagram Stories. Duration: 12 seconds. Product should look exactly as in the reference image, with no reinterpretation of shape, color, or materials.
Start Building Brand Video Prompts
The difference between a clip that looks like AI stock footage and one that functions as a finished brand asset is almost entirely in the brief. Camera, light, audio, grade, aspect, duration — when all six are named and aligned to a brand pillar, Veo 3 has what it needs to synthesize something close to a deliverable.
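To make sure none of those elements is skipped, a brief can be held in a small structure that refuses to render until every field is filled. The sketch below is one way to do that in Python. The `BrandPrompt` class and its field names are hypothetical, and the output format simply mirrors the template used by the prompts in this article.

```python
from dataclasses import dataclass, fields


@dataclass
class BrandPrompt:
    """One prompt in this article's template; field names are illustrative."""
    pillar: str
    scene: str
    camera: str
    lighting: str
    grade: str
    ambient: str
    foley: str
    voiceover: str
    music: str
    aspect: str
    duration_seconds: int

    def missing(self) -> list[str]:
        """Names of text fields that were left blank."""
        blank = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str) and not value.strip():
                blank.append(f.name)
        return blank

    def render(self) -> str:
        """Return the prompt text, refusing to render an incomplete brief."""
        blank = self.missing()
        if blank:
            raise ValueError("fill in before rendering: " + ", ".join(blank))
        lines = [
            f"Brand pillar: {self.pillar}",
            self.scene,
            f"Camera: {self.camera}",
            f"Lighting: {self.lighting}",
            f"Color grade: {self.grade}",
            "Audio:",
            f"- Ambient: {self.ambient}",
            f"- Foley: {self.foley}",
            f"- Voiceover: {self.voiceover}",
            f"- Music: {self.music}",
            f"Aspect: {self.aspect}. Duration: {self.duration_seconds} seconds.",
        ]
        return "\n".join(lines)


# Values condensed from the skincare example earlier in this article.
serum = BrandPrompt(
    pillar="Skincare should feel like a ritual, not a routine.",
    scene="A single glass serum bottle on a warm marble surface.",
    camera="Slow dolly-in, 8 seconds, arriving at the hero angle.",
    lighting="4000K window source, soft fill, lifted highlights.",
    grade="Warm neutral, brand palette #F5E6D3 and #8B6F5E.",
    ambient="Quiet bathroom morning, distant birds.",
    foley="Bottle placed on marble; cap removed with a soft click.",
    voiceover='Calm, intimate voice at 5 seconds: "Take your time."',
    music="Sparse piano motif, 4-bar loop, resolves on the final frame.",
    aspect="9:16",
    duration_seconds=12,
)
print(serum.render())
```

Leaving any text field blank raises an error that names the missing fields, which is the point: a vague brief fails before it reaches the model instead of after the render.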
Use the AI prompt generator to build structured video prompts from a plain-English description of your brand and what you need. For the full set of Veo 3 techniques — motion, environment, character, and audio — read 30 Best Veo 3 Prompts with Audio and the Veo 3 Prompt Guide. For how video fits into a broader content and distribution strategy, see AI Prompts for Marketing.
One note on disclosure: AI-generated video content may require labeling in paid advertising contexts depending on platform policies and regional regulations. Verify requirements for your placements before publishing.