Voice Cloning

Last updated: 2026-03-09

How Voice Cloning Works

Altrixy's voice cloning uses ElevenLabs' neural voice synthesis technology to create a digital replica of your voice in four stages:

1. Sample Collection: You provide voice recordings that capture your vocal characteristics.
2. Feature Extraction: The AI analyzes pitch, tone, cadence, accent, and speaking patterns.
3. Model Training: A custom voice model is trained on your samples (takes 5-15 minutes).
4. Synthesis: The trained model generates new speech in your voice from any text input.

The resulting voice clone captures your unique vocal identity while producing natural-sounding speech on any topic.

Preparing Voice Samples

The quality of your voice clone depends heavily on the quality and variety of your samples.

What makes a good sample:
- Clear audio with minimal background noise
- Your natural speaking voice, not a read-aloud voice
- Varied sentence structures and emotions
- Mix of short sentences, long sentences, and questions
- At least some industry-specific vocabulary you commonly use

What to avoid:
- Whispering or shouting
- Background music or ambient noise
- Multiple speakers in the same recording
- Heavy audio processing or filters
- Reading in a monotone voice
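
The advice on whispering, shouting, and noise can be roughly pre-checked before upload by measuring the recording level. The sketch below is an illustration only: it assumes the audio has already been decoded to floating-point samples in the range -1.0 to 1.0, and the function names and both thresholds are hypothetical values chosen for the example, not limits published by Altrixy.

```python
import math

# Hypothetical thresholds for illustration; not Altrixy's actual limits.
TOO_QUIET_DBFS = -35.0   # average level below this suggests whispering or a distant mic
CLIPPING_PEAK = 0.99     # peaks at or above this suggest shouting or an overdriven input


def rms_dbfs(samples):
    """Return the RMS level in dBFS for float samples in the range [-1.0, 1.0]."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def level_warnings(samples):
    """Return a list of level-related warnings; an empty list means none were found."""
    warnings = []
    if rms_dbfs(samples) < TOO_QUIET_DBFS:
        warnings.append("too quiet")
    if max(abs(s) for s in samples) >= CLIPPING_PEAK:
        warnings.append("possible clipping")
    return warnings
```

A level check like this only catches gross problems. It cannot detect background music, multiple speakers, or monotone delivery, so listening to each sample remains the more reliable test.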

Upload Requirements

Voice samples must meet the following specifications:

Requirement	Specification
Format	WAV, MP3, M4A, or FLAC
Minimum duration	30 seconds per sample
Maximum duration	5 minutes per sample
Minimum samples	3 samples
Recommended samples	5-10 samples
Total audio	At least 3 minutes, ideally 10+ minutes
Sample rate	16 kHz minimum (44.1 kHz recommended)
Channels	Mono preferred (stereo is accepted)
File size	Max 50MB per file

Note

More audio data generally produces a better clone. If you can provide 10+ minutes of varied speech, the AI can capture more nuances of your voice.
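
The specifications above can be checked locally before uploading. The following is a minimal sketch, not Altrixy's validator: the function names are hypothetical, "50MB" is assumed to mean 50 MiB, and only WAV files are inspected directly because the Python standard library has no decoder for MP3, M4A, or FLAC.

```python
import os
import wave

# Limits copied from the requirements table.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}
MIN_SECONDS = 30
MAX_SECONDS = 5 * 60
MIN_SAMPLE_RATE_HZ = 16_000
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB
MIN_SAMPLES = 3
MIN_TOTAL_SECONDS = 3 * 60


def check_sample(name, seconds, sample_rate_hz, size_bytes):
    """Return a list of problems for one sample; an empty list means it passes."""
    problems = []
    ext = os.path.splitext(name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if seconds < MIN_SECONDS:
        problems.append("shorter than 30 seconds")
    if seconds > MAX_SECONDS:
        problems.append("longer than 5 minutes")
    if sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append("sample rate below 16 kHz")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 50 MB")
    return problems


def check_sample_set(durations_seconds):
    """Check the per-set minimums: sample count and total audio length."""
    problems = []
    if len(durations_seconds) < MIN_SAMPLES:
        problems.append("fewer than 3 samples")
    if sum(durations_seconds) < MIN_TOTAL_SECONDS:
        problems.append("less than 3 minutes of total audio")
    return problems


def read_wav_info(path):
    """Return (seconds, sample_rate_hz, size_bytes) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
    return seconds, rate, os.path.getsize(path)
```

For a WAV file, `check_sample(path, *read_wav_info(path))` runs the per-file checks. Note that three samples at the 30-second minimum total only 90 seconds, so a set can pass every per-file check and still fall short of the 3-minute total.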

Cloning Process and Timeline

After uploading your samples:

1. Validation (instant): Samples are checked for quality, duration, and format compliance.
2. Preprocessing (1-2 minutes): Audio is normalized, trimmed, and optimized.
3. Training (5-15 minutes): The neural voice model is trained on your data.
4. Quality Check (1-2 minutes): The system generates test phrases and verifies quality.
5. Ready: Your voice clone is available for use.

You will receive an in-app notification and, optionally, an email when your clone is ready. The entire process typically takes 10-20 minutes from upload to completion.
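
As a quick check on the overall estimate, the per-stage durations listed above can be summed. This small sketch treats the instant validation step as zero minutes. The stage names and ranges come from the list above, while the `total_range` helper is just an illustration.

```python
# Stage durations in minutes as (fastest, slowest), taken from the stage list.
STAGES = {
    "Validation": (0, 0),      # listed as instant
    "Preprocessing": (1, 2),
    "Training": (5, 15),
    "Quality Check": (1, 2),
}


def total_range(stages):
    """Sum the fastest and slowest estimates across all stages."""
    fastest = sum(low for low, _ in stages.values())
    slowest = sum(high for _, high in stages.values())
    return fastest, slowest


print(total_range(STAGES))  # (7, 19)
```

The stages add up to roughly 7 to 19 minutes, which is in line with the typical 10-20 minute figure once upload time is included.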

Testing Your Clone

After training completes, test your voice clone:

1. Go to Voice Studio > My Voices
2. Click Test on your new voice clone
3. Type a sentence or use one of the test prompts provided
4. Click Generate Preview
5. Listen to the output and compare it with your natural voice

If the clone does not sound right, you can:
- Add more samples: More data improves quality
- Replace low-quality samples: Remove noisy or unnatural recordings
- Retrain: Click "Retrain Model" to rebuild with the updated samples
- Adjust settings: Tune the stability and clarity sliders

Updating Voice Samples

Your voice clone can be updated at any time:

1. Go to Voice Studio > My Voices
2. Click the voice clone you want to update
3. In the Samples tab, you can:
- Add new samples with the Upload button
- Remove existing samples by clicking the X on each
- Re-record samples using the built-in recorder
4. Click Retrain Model to rebuild the clone with updated samples

Retraining follows the same timeline as the initial training (5-15 minutes). Your previous voice clone remains available until the new training completes.

Tip

Re-record your samples and retrain your voice clone every 6-12 months, or sooner if you notice your natural speaking style has changed.
