A professional voice clone mirrors the speaker data it is trained on. For the best results, we ask speakers to record a tailored script, built from material such as your own articles, so that the model captures the speaking style you want.
See supported languages and accents.
Record 10 articles
Clone a voice with just 10 article recordings or 30 minutes of audio.
Ready in 24 hours
Get a highly natural voice clone within one day.
Hyper-realistic
Generate audio so authentic that listeners will feel like they have their favorite author in their pocket.
Custom pronunciations
Our Professional Voice Clones support full pronunciation customization, including IPA, across all languages.
Create a professional voice clone
Professional voice cloning isn’t available through self-service yet, but we’ll guide you through the process. To get started, please book a meeting or email support@beyondwords.io.
1
Book a meeting
- Book a meeting or reach out to our team.
- We’ll discuss your goals for the voice and walk you through each step of the cloning process.
2
Share 10-15 articles
- We believe voices sound best when trained on content that’s authentically yours.
- Share 10 to 15 published articles (a Word doc is fine). We’ll use these to create a customised recording script for your speaker.
3
Share your voice details
- We’ll need your speaker’s first and last name. This is required so they can record the voice cloning consent statement, which gives us permission to clone their voice.
- You can also give the voice a name, which helps you find and manage it in the platform later.
- Once we have these details, we’ll send you a link to the recording script where you can upload the recordings.
4
Record and upload
- Your speaker will record both the script and the consent statement.
- We’ll provide simple recording guidelines and audio requirements to help you get the best results.
- Once recordings are complete, just upload the files and click “Submit.”
5
Training
- Now it’s over to us.
- Voice training typically takes 1-2 days, after which we’ll review and deploy it to your account.
6
Use your voice
- We’ll let you know as soon as the voice is ready.
- You can then start turning articles into audio using your new, professionally cloned voice.
Recording tips
The voice clone will accurately replicate the style and performance of the speaker. For this reason, it is important that each article is recorded with the same energy, pace, and style that you would like the voice clone to have.
Read as separate articles
Please record and upload one audio file per article. This will allow the speaker to give correct meaning and structure to their performance.
Take breaks
We recommend adequate breaks during the recording process to reduce the risk of error and voice fatigue.
Correcting mistakes
If you make a mistake, please re-record from an appropriate place in the article to maintain the naturalness and fluency of the recording. It is permissible to “punch in”. Please let us know if you would like guidelines on achieving this.
Indicating mistakes/issues with the script
Click the flag icon on the right-hand side of the script recording page to report any errors, provide comments, or let us know about any edits made to the script.
Recording location
It is important to record in a quiet location and to use the same recording equipment throughout. We recommend recording in a professional studio and sitting at a consistent distance from the microphone. If a studio is not available, you can create a temporary setup with thick fabrics like duvets or quilts to dampen unwanted sounds and echoes.
Distance
The speaker should get comfortable before recording so that they do not need to move during a take. Two fists away from the microphone is a good starting distance.
Plosives
Use a pop filter to reduce the bursts of air from plosive sounds such as “p” and “b”, which keeps the audio crisp.
Pronunciations
To ensure that words are mapped to their correct sounds, words must be pronounced accurately and distinctly, precisely as they are in the script. The script may be normalised for text-to-speech, so you may notice some unusual punctuation and formatting (for example, “2020” might be written as “twenty-twenty”). Where letters should be pronounced individually, spaces or hyphens may be used to indicate breaks (for example, “I S S”, “CAR-T”). The speaker should take the time to review the script beforehand and clarify the pronunciation of any unfamiliar or ambiguous words.
Speaking style
Use a natural speaking style that you can maintain consistently throughout the recordings. While some variance is natural and desirable, keeping volume, pitch, intonation, and tempo as consistent as possible is important.
Voice quality
To ensure consistency, the speaker should take regular water breaks and rest their voice. Rather than recording the script all at once, we recommend recording in multiple short sessions to reduce the risk of the voice becoming tired or strained.
Breathing and pausing
Pause naturally at punctuation and try to breathe away from the microphone. Keep your breathing at a low and consistent volume, or the voice clone’s breaths can become unnatural and distracting.
Hydration and mouth noise
Mouth noise can be copied by the voice clone and cause unpredictable results. It is often caused by not being sufficiently hydrated. To help reduce it, stay well hydrated in the days leading up to the recording sessions and throughout them. Do not wait until the day of recording to hydrate: your body will simply get rid of the extra water. Reducing caffeine and alcohol can also help. If you are sufficiently hydrated and still have audible mouth noise, chewing gum with xylitol or taking a bite of green apple can reduce it on the day of recording.
Recording requirements
Save each file as an individual .wav audio file, then upload it under the words of the corresponding article in the script recording interface. Optimum recordings are:
- File format: *.wav, mono
- Sampling rate: Minimum of 22 kHz for clear audio capture.
- Sample format: Minimum of 16-bit PCM (uncompressed) for lossless audio quality.
- Volume levels: Between -23 dB and -18 dB RMS across the recording, with a maximum peak of -3 dB to avoid clipping and distortion.
- Signal-to-noise ratio (SNR): Greater than 35 dB (higher is better) for minimal background noise.
- Environment noise, echo: Background noise level before speaking should be less than -70 dB for optimal clarity.
- Send us the files as “unprocessed” as possible: for example, do not apply filters, compression, limiters and the like. We’ll standardise your files in-house so that they are optimal for voice cloning.
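The file format and level requirements above can be checked locally before upload. The following is a minimal sketch using only the Python standard library, not an official BeyondWords tool. The function name `check_wav` and the constants are hypothetical, the decibel figures are assumed to be relative to digital full scale (dBFS), and "22 kHz" is read as 22,000 Hz. It does not measure SNR or the background noise level, and level checks are only implemented for 16-bit mono files.

```python
import array
import math
import sys
import wave

# Thresholds copied from the requirements list; dB values are ASSUMED to be dBFS.
MIN_SAMPLE_RATE_HZ = 22_000      # "22 kHz" read as 22,000 Hz (assumption)
MIN_SAMPLE_WIDTH_BYTES = 2       # 16-bit PCM
RMS_RANGE_DB = (-23.0, -18.0)
MAX_PEAK_DB = -3.0


def to_db(value, full_scale):
    """Convert a linear amplitude to decibels relative to full scale."""
    if value <= 0:
        return float("-inf")
    return 20.0 * math.log10(value / full_scale)


def check_wav(path):
    """Return a list of human-readable problems found in the WAV file at `path`."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse (e.g. compressed audio).
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())

    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if width < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width {8 * width}-bit is below 16-bit")

    # Level checks: this sketch only decodes 16-bit mono samples.
    if channels == 1 and width == 2:
        samples = array.array("h")
        samples.frombytes(frames)
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        if samples:
            full_scale = 32768.0
            peak = max(abs(s) for s in samples)
            rms = math.sqrt(sum(s * s for s in samples) / len(samples))
            peak_db = to_db(peak, full_scale)
            rms_db = to_db(rms, full_scale)
            if peak_db > MAX_PEAK_DB:
                problems.append(f"peak {peak_db:.1f} dB exceeds {MAX_PEAK_DB} dB")
            if not RMS_RANGE_DB[0] <= rms_db <= RMS_RANGE_DB[1]:
                problems.append(
                    f"RMS {rms_db:.1f} dB is outside {RMS_RANGE_DB[0]} to {RMS_RANGE_DB[1]} dB"
                )
    return problems
```

Calling `check_wav("article-01.wav")` (a hypothetical file name) returns an empty list when none of the checked requirements is violated. Note that the RMS here is computed over the whole file, including pauses, so long silences will pull the figure down.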
Voice Scoping
Voice scoping allows you to make a custom voice available to specific projects rather than across your entire account. This can be useful for organizations with multiple projects and a large number of custom voice clones, where only certain voices should be available in certain projects.
1
Go to the Voice Cloning section
In the top-left menu, select Voice cloning.
2
Find the voice you want to scope
You will see a list of all available custom voices (both Instant and Professional voice clones).
Click the ⋯ menu next to the voice name and select Edit.
3
Select which projects to scope the voice to
Choose whether to make the voice available to All projects or to Specific projects.
4
Save changes
Your custom voice will be available in either All projects or the Specific projects that you have selected.
Scoping a voice to a project will not set it as the default voice for that project. You will have to do this via your Voices Preferences.