Overview
BeyondWords allows you to generate a custom AI voice from a text prompt, then use this voice to create audio and video. This lets you create voices tailored to your needs, with no recordings or approvals required. All generated voices are powered by ElevenLabs.
Generate a voice
Go to organization settings
In the BeyondWords dashboard, click your avatar, then select “Organization”. Then, open the “Custom voices” page.
Enter a prompt
Enter the prompt to generate your voice. (See our voice prompting guidelines below.) Once complete, click “Continue”.
Add a custom preview script (optional)
To add a custom preview script, enable the toggle, then enter your script. (See our preview script guidelines below.) Otherwise, leave the toggle disabled. Once complete, click “Generate voice”.
Review voice options
Once generation is complete, you’ll see three voice options. Click the play button next to each voice to hear a preview. Select your preferred voice, then click “Continue”.
Add voice details
Enter the following voice details:
- Voice name: Choose a label you will recognize when picking this voice in your projects
- Language and accent: Choose how this voice should be labeled for language and accent (if relevant) in your projects
- Gender: Male, female, or neutral
Preview your voice and set its scope
Use the preview section to hear how your voice sounds across different voice models, languages, and accents. Once you’re ready, click “Done”.
Voice prompting guidelines
A well-written prompt and preview script are key to generating the best possible voice for your use case. The prompt should describe the key attributes of the voice you want to create. A good prompt is specific and consistent; vague or contradictory descriptions tend to produce unreliable results. Stick to concrete, observable qualities: how the voice sounds (pitch, tone, resonance), how it speaks (pace, rhythm, emphasis), and who it sounds like (persona, age, accent). Avoid abstract descriptors like “professional” or “engaging” that don’t translate into a distinct vocal quality. We recommend using the following template:

| Attribute | Description | Examples |
|---|---|---|
| Language and accent | The native language and accent that shape the voice’s character | Native American English, Native Brazilian Portuguese, Native Gulf Arabic |
| Gender | The perceived gender of the speaker | Female, male, gender-neutral |
| Age | The age or age range of the speaker, which affects vocal maturity, texture, and energy | 20–25, middle-aged |
| Audio quality | How polished and produced the voice sounds | Broadcast-quality, studio-quality |
| Persona | The role or identity of the speaker, which helps shape delivery style and intent | News anchor, financial correspondent, policy analyst |
| Emotion | The emotional delivery of the voice | Calm, composed, measured, authoritative, assured |
| Tone | The physical quality of the voice, shaped by pitch, resonance, and texture | Warm and resonant, clear and crisp, smooth and neutral |
| Pacing | The speed and rhythm of speech | Measured and deliberate, steady with natural rhythm, slightly brisk but clear |
| Pitch | How high or low the voice sounds | Mid-range, slightly low-pitched, low and resonant |
Broad accents (e.g., American English, Brazilian Portuguese) are more reliable than specific regional varieties (e.g., Glaswegian, Northeastern Brazilian). For specific accents, try voice cloning.
Example voice prompt
Here’s an example of a well-structured voice prompt:
Native English, American. Female, 35–45. Broadcast quality. Persona: seasoned journalist narrator. Emotion: composed, analytical, subtly wry. Warm, mid-pitched timbre with a clear, articulate delivery and precise diction. Speaks at a measured, deliberate pace with confident cadence and restrained emphasis, prioritizing clarity and insight over drama. Maintains a calm, authoritative tone with occasional light dryness, as if guiding the listener through complex ideas with quiet confidence.
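If you draft many prompts, it can help to assemble them from the template attributes so that no attribute is forgotten. The following is a minimal sketch in Python, not part of BeyondWords: the `VoicePrompt` class, its field names, and the `render` method are hypothetical helpers for drafting text that you then paste into the prompt field in the dashboard.

```python
from dataclasses import dataclass, fields


@dataclass
class VoicePrompt:
    """Hypothetical helper mirroring the template attributes (not a BeyondWords API)."""

    language_and_accent: str
    gender: str
    age: str
    audio_quality: str
    persona: str
    emotion: str
    tone: str
    pacing: str
    pitch: str

    def render(self) -> str:
        # Refuse to build a prompt if any template attribute was left blank.
        empty = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if empty:
            raise ValueError(f"Missing attributes: {', '.join(empty)}")

        sentences = [
            self.language_and_accent,
            f"{self.gender}, {self.age}",
            self.audio_quality,
            f"Persona: {self.persona}",
            f"Emotion: {self.emotion}",
            f"Tone: {self.tone}",
            f"Pacing: {self.pacing}",
            f"Pitch: {self.pitch}",
        ]
        # Join the attributes as short sentences, avoiding doubled full stops.
        return ". ".join(s.strip().rstrip(".") for s in sentences) + "."


prompt = VoicePrompt(
    language_and_accent="Native American English",
    gender="Female",
    age="35-45",
    audio_quality="Broadcast-quality",
    persona="news anchor",
    emotion="calm, composed",
    tone="warm and resonant",
    pacing="measured and deliberate",
    pitch="mid-range",
)
print(prompt.render())
```

The output is a plain starting point. You can then expand the later attributes into fuller sentences, as in the example prompt above.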
Preview script guidelines
The preview script can influence the voice’s energy, pacing, and emotional delivery, so it’s important to use text that reflects your intended use case. We recommend writing around 80 words in the same style and language as your content. Consider copying and pasting a representative section from an existing article.
FAQs
What languages are supported for voice generation?
You can generate voices in 74 languages: Afrikaans, Arabic, Armenian, Assamese, Azerbaijani, Belarusian, Bengali, Bosnian, Bulgarian, Catalan, Cebuano, Chichewa, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Kirghiz, Korean, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malay, Malayalam, Mandarin Chinese, Marathi, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Sindhi, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.
Your language selection may limit your access to certain voice models.
Are generated voices multilingual?
Yes, generated voices are multilingual. They can speak all the languages available within your chosen voice model.
How many voices can I generate?
You can generate as many voices and voice previews as needed. Credits are only used when you generate audio or video in a project.
Why doesn't my generated voice sound as expected?
If your generated voice doesn’t sound as expected, it may be because your prompt is unclear or overly complex. We recommend experimenting with different prompts until you get the desired result. Remember that ElevenLabs may be unable to generate voices with highly specific regional accents. For that use case, we recommend voice cloning.