This guide covers the most effective tags and techniques for prompting Eleven v3, including voice selection, changes in capitalization, and audio tags. Experiment with these methods to discover what works best for your specific voice and use case.
The v3 model is in beta. It can be unstable and may reduce similarity to the speaker, so preview audio before submitting. Audio tags may also affect subtitles; we recommend turning off subtitles or avoiding audio tags to ensure optimal video quality.
Eleven v3 introduces emotional control through audio tags. You can direct voices to laugh, whisper, act sarcastic, or express curiosity, among many other styles. Speed is also controlled through audio tags.
These tags control vocal delivery and emotional expression:
[laughs], [laughs harder], [starts laughing], [wheezing]
[whispers]
[sighs], [exhales]
[sarcastic], [curious], [excited], [crying], [snorts], [mischievously]
[whispers] I never knew it could be this way, but I'm glad we're here.
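As a minimal sketch of how tagged prompts like the one above could be assembled in code, the Python helper below prefixes a line with one or more audio tags. The name `with_tags` is hypothetical and not part of any ElevenLabs SDK; it only builds the text string and does not call any API.

```python
def with_tags(text: str, *tags: str) -> str:
    """Prefix `text` with audio tags written in square brackets.

    Hypothetical helper: it only formats a string and calls no API.
    """
    # Skip empty tag names so we never emit "[]".
    prefix = " ".join(f"[{tag.strip()}]" for tag in tags if tag.strip())
    return f"{prefix} {text}" if prefix else text


line = with_tags(
    "I never knew it could be this way, but I'm glad we're here.",
    "whispers",
)
print(line)
```

Keeping tag names as plain arguments makes it easy to try different tags against the same line while experimenting with a voice.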
Add environmental sounds and effects:
[gunshot], [applause], [clapping], [explosion]
[swallows], [gulps]
[applause] Thank you all for coming tonight! [gunshot] What was that?
Experimental tags for creative applications:
[strong X accent] (replace X with the desired accent)
[sings], [woo], [fart]
[strong French accent] "Zat's life, my friend — you can't control everysing."
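Because audio tags may affect subtitles, one possible workaround is to keep a tag-free copy of the script for captions. The sketch below assumes that a tag is any short bracketed phrase with no nested brackets; `strip_audio_tags` is a hypothetical helper and not an ElevenLabs feature.

```python
import re

# Assumption: an audio tag is any bracketed run without nested brackets.
AUDIO_TAG = re.compile(r"\[[^\[\]]+\]")


def strip_audio_tags(text: str) -> str:
    """Return `text` with bracketed audio tags removed and spacing tidied."""
    without_tags = AUDIO_TAG.sub("", text)
    # Collapse the whitespace left behind where tags were removed.
    return re.sub(r"\s+", " ", without_tags).strip()


script = "[applause] Thank you all for coming tonight! [gunshot] What was that?"
print(strip_audio_tags(script))
```

Note that this pattern would also remove any other bracketed text in the script, so it is only suitable when square brackets are reserved for audio tags.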
The most important parameter for Eleven v3 is the voice you choose. It needs to be similar enough to the desired delivery. For example, if the voice is shouting and you use the audio tag [whispering], it likely won’t work well.
Choose voices strategically based on your intended use:
If you have any questions or need assistance, please feel free to contact our live chat anytime.