How much does AI voice cloning cost in 2026?
Short answer: For personal family use (record your voice once, generate occasional stories or messages), expect $5–$15/month with one of the consumer providers. For API / developer use (high-volume generation), expect $0.10–$0.30 per 1,000 characters of output. The "free tier" of every major provider is genuinely limited — usable for evaluation, not for a real workflow. Hidden costs (voice slot rentals, output-minute caps, overage fees) often double the headline price; we'll show you where.
Why this matters
AI voice cloning is the rare consumer-AI category where the pricing model really affects what you can build. Unlike chat (where most providers charge per token), voice has at least four pricing axes:
- Per character of input text (most common API pricing)
- Per minute of output audio
- Per voice slot (how many custom voices you can have at once)
- Tier-locked features (Pro features unavailable on Starter even if you'd pay more)
This guide walks through the real total cost of ownership for three concrete scenarios:
- Family use: 1 cloned voice, 5–20 generations per month
- Indie product: 100–1,000 active users on a paid plan
- Business / API: 100k+ characters generated per month
Pricing as of May 2026. Voice-cloning prices have dropped roughly 60% since 2024 and we expect another 30% drop by mid-2027; check provider sites for current numbers.
Family-use pricing (the consumer tier)
If you're a parent recording your own voice to narrate bedtime stories or seal voice messages for your child's future birthdays, the relevant pricing is the consumer subscription tier of one of the major providers.
| Provider | Tier | Price | Voice slots | Monthly output | Best for |
|---|---|---|---|---|---|
| ElevenLabs | Starter | $5/mo | 10 custom voices | 30k characters (~5 stories) | Most family use cases |
| ElevenLabs | Creator | $22/mo | 30 custom voices | 100k characters (~20 stories) | Heavy users + multi-voice families |
| Cartesia | Starter | $5/mo | 5 custom voices | 100k characters | Fastest latency; real-time apps |
| PlayHT | Creator | $39/mo | Unlimited | 600k characters | Long-form (audiobooks) |
| Resemble.AI | Creator | $19/mo | 5 custom voices | 240 min/mo | Compliance-heavy use |
| Fablely (built on ElevenLabs) | Lullaby Library | $4.99/mo | 1 voice | Unlimited stories | Families specifically, with consent UI + BIPA-compliant flow built in |
The honest economics: A consumer voice-cloning product priced under $10/mo is almost certainly running on ElevenLabs or Cartesia underneath. The provider charges them ~$5/mo for the underlying API; the product charges you $10 and pockets the ~$5 difference, which goes to support + features + reliability. This is fine — but it's why "use ElevenLabs directly" is sometimes a cheaper path if you're technical.
If you're not technical and just want to record your voice and have AI tell your kid bedtime stories: pay $5–$10/mo for a consumer product (like Fablely Voice Stories) and skip the API plumbing. The 20 minutes you'd spend setting up an ElevenLabs developer account isn't worth the $5/mo savings.
Indie-product pricing (you're building something)
If you're an indie maker building a voice-cloning product for end users, the relevant cost is per-user variable cost.
Cost-per-story walkthrough
Let's say each end user generates 5 stories per month, each story is roughly 2,000 characters (a 3-minute bedtime story).
That's 10,000 characters per user per month of generated audio.
Per-provider variable cost at the API rate:
| Provider | $/1k chars | Cost per user/mo (10k chars) |
|---|---|---|
| ElevenLabs Creator API | $0.30 | $3.00 |
| ElevenLabs Pro API | $0.22 | $2.20 |
| ElevenLabs Scale API | $0.15 | $1.50 |
| Cartesia API | $0.13 | $1.30 |
| PlayHT API | $0.10 | $1.00 |
| Resemble API | $0.18 | $1.80 |
If you're charging $4.99/mo to the end user (Fablely Lullaby Library tier) and your variable cost per user is $1.50–$3, your gross margin is roughly 40–70%.
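The per-user column above is simple multiplication, and it is worth being able to rerun when rates change. Here is a minimal sketch, assuming the 5 stories × 2,000 characters usage shape from this walkthrough and the table's rates; the function names are ours for illustration, not part of any provider's API.

```python
# Rates are the $/1k-character figures from the table above.
# Names here are illustrative only, not any provider's SDK.
API_RATES_PER_1K = {
    "ElevenLabs Creator API": 0.30,
    "ElevenLabs Pro API": 0.22,
    "ElevenLabs Scale API": 0.15,
    "Cartesia API": 0.13,
    "PlayHT API": 0.10,
    "Resemble API": 0.18,
}


def cost_per_user(rate_per_1k: float, stories: int = 5, chars_per_story: int = 2_000) -> float:
    """Monthly generation cost for one user at a per-1k-character rate."""
    return rate_per_1k * stories * chars_per_story / 1_000


def gross_margin(price: float, variable_cost: float) -> float:
    """Gross margin as a fraction of the subscription price."""
    return (price - variable_cost) / price


for provider, rate in API_RATES_PER_1K.items():
    cost = cost_per_user(rate)
    print(f"{provider}: ${cost:.2f}/user/mo, margin at $4.99: {gross_margin(4.99, cost):.0%}")
```

Changing `stories` or `chars_per_story` shows how quickly a heavier usage shape eats the margin at a fixed subscription price.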
Add to that the per-user voice-slot rental (each user's cloned voice takes a slot in your provider account), and the cost equation gets tighter. ElevenLabs charges roughly $1 per voice slot per month above the included quota on Scale plans. For 1,000 users, that's an extra ~$1,000/mo on top of generation costs.
Real-world margin math (Fablely 2026):
Per-user revenue: $4.99/mo (Lullaby Library, early access)
Per-user variable cost: $1.50 (10k chars × $0.15/1k on ElevenLabs Scale)
Per-user voice slot: $1.00 (1 slot, beyond starter quota)
Per-user gross profit: $2.49 (~50% gross margin)
This is before Vercel + Supabase + Sentry + domain + founder time. Not high-margin SaaS economics — closer to "small business with software." Which is why Fablely is intentionally indie + solo (see our about page).
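The margin math above can be reproduced in a few lines. This is a sketch using only the figures quoted in this section ($4.99 price, 10k characters at $0.15/1k, $1 per voice slot); the helper names are hypothetical.

```python
def per_user_gross_profit(price: float, chars: int, rate_per_1k: float, slot_fee: float):
    """Return (gross profit in dollars, gross margin as a fraction) for one user-month."""
    generation_cost = chars / 1_000 * rate_per_1k
    profit = price - generation_cost - slot_fee
    return profit, profit / price


def slot_rental(users: int, slot_fee: float = 1.00) -> float:
    """Monthly voice-slot rental when every user holds one paid slot."""
    return users * slot_fee


profit, margin = per_user_gross_profit(4.99, 10_000, 0.15, 1.00)
print(f"Gross profit per user: ${profit:.2f} ({margin:.0%})")
print(f"Slot rental for 1,000 users: ${slot_rental(1_000):,.0f}/mo")
```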
When to switch to the API tier
Most consumer products start on a Creator-tier subscription ($22/mo flat). You should switch to the API / Scale tier when:
- Your aggregate output exceeds 1M characters/month (~$300/mo of usage)
- You need more than 30 active voice clones
- You need volume-discount pricing
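This usually happens around 300 paying users at our usage shape, which works out to roughly 3,300 characters per user per month. At the 10,000 characters per user assumed in the walkthrough above, the 1M-character threshold arrives at about 100 users. Smaller products live happily on a Creator tier forever.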
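The switch conditions listed above can be written as a simple check. This is a sketch under the thresholds stated in this section (1M characters/month, 30 voice clones); the function names are ours, and the user count at which you cross the character threshold depends entirely on how much each user generates.

```python
def should_switch_to_scale(monthly_chars: int, active_voices: int,
                           needs_volume_discount: bool = False) -> bool:
    """True when any of the three switch conditions from this section is met."""
    return monthly_chars > 1_000_000 or active_voices > 30 or needs_volume_discount


def users_at_char_threshold(chars_per_user: int, threshold: int = 1_000_000) -> int:
    """How many users it takes to reach the monthly character threshold."""
    return threshold // chars_per_user


print(users_at_char_threshold(10_000))   # usage shape from the walkthrough
print(should_switch_to_scale(monthly_chars=400_000, active_voices=45))
```

Note that for a product where every user has their own cloned voice, the voice-count condition tends to trip long before the character threshold does.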
Business / API pricing (high volume)
For business use cases generating 100k–10M characters/month:
| Provider | Volume tier | Effective $/1k chars |
|---|---|---|
| ElevenLabs Scale | 5M+ chars/mo | $0.15–$0.22 |
| ElevenLabs Enterprise | Custom | Quoted, typically <$0.10 |
| Cartesia Enterprise | Custom | Quoted, typically <$0.10 |
| PlayHT Enterprise | Custom | Quoted, typically <$0.08 |
| Resemble.AI Enterprise | Custom | Quoted with on-prem option |
| Azure Custom Neural Voice | Custom | Quoted, premium pricing |
Negotiation reality: Above $1k/mo of usage, every provider will negotiate. We've seen reports of ElevenLabs Scale pricing drop to $0.08/1k chars at the $10k/mo commitment level. Cartesia has been aggressive on price recently to win share.
Hidden cost on every Enterprise tier: minimum annual commitment, usually $12k–$50k. You're trading per-character price for guaranteed revenue.
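To see how a minimum commitment changes the real price, compare usage-based cost against the monthly share of the commitment. This is a rough sketch: the $0.08/1k rate and the $12k minimum are taken from the ranges quoted above as example inputs, not from any specific contract, and the function names are hypothetical.

```python
def enterprise_monthly_cost(chars_per_month: int, rate_per_1k: float, annual_minimum: float) -> float:
    """You pay the larger of metered usage and one twelfth of the annual commitment."""
    usage = chars_per_month / 1_000 * rate_per_1k
    return max(usage, annual_minimum / 12)


def effective_rate_per_1k(chars_per_month: int, rate_per_1k: float, annual_minimum: float) -> float:
    """What each 1k characters really costs once the commitment is counted."""
    cost = enterprise_monthly_cost(chars_per_month, rate_per_1k, annual_minimum)
    return cost / (chars_per_month / 1_000)


for volume in (1_000_000, 5_000_000, 20_000_000):
    rate = effective_rate_per_1k(volume, 0.08, 12_000)
    print(f"{volume:>12,} chars/mo -> effective ${rate:.3f}/1k")
```

Below the break-even volume (the monthly minimum divided by the quoted rate), the negotiated per-character price is not what you are actually paying.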
What's actually free?
The free tiers are useful for evaluation (does the quality work for my use case?) but not for production workflows. Specifics:
| Provider | Free allowance | Approx. output | Voice cloning |
|---|---|---|---|
| ElevenLabs Free | 10k chars/mo | ~2 stories | 0 custom voices (Starter req'd to clone) |
| Cartesia Free | 10k chars/mo | ~2 stories | 1 voice (limited) |
| PlayHT Free | 250 words/day (~1.5k chars/day) | <1 story per day | 0 voices |
| Resemble.AI Free | 60 seconds output total | <1 story | 0 voices |
| Murf Free | 10 min/mo (TTS only) | <1 story | 0 voices |
The unwritten reality: voice cloning is a cost-intensive AI category (real GPU compute per generation), so providers can't sustainably offer it free. The free tiers exist to let you hear the quality and decide. Plan to pay something for any real workflow.
The hidden costs no one mentions
Beyond per-character pricing, watch for:
1. Voice slot rentals (the surprise charge)
Most providers cap the number of custom voices you can have at once on the Starter tier (5–10). To add more, you either upgrade the tier or pay a "voice slot fee" of $1–$3/voice/month. For an indie product with thousands of users (each with their own cloned voice), this is a meaningful line item.
2. Output overage fees
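If you exceed your monthly character cap, most providers either stop generating until the next billing cycle (Starter) or charge about $0.30/1k chars in overage (Creator and above). That overage rate is roughly 2× the $0.15/1k Scale rate and about 1.4× the effective included rate on Creator ($22 for 100k characters, or $0.22/1k), so going over is expensive.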
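A quick way to see what an overage month costs is sketched below. It assumes the behavior described above (a hard stop on Starter-style plans, metered overage at $0.30/1k elsewhere); real providers may bill differently, and the function name is ours.

```python
def monthly_bill(base_price: float, included_chars: int, used_chars: int,
                 overage_per_1k: float = 0.30, allows_overage: bool = True):
    """Return (bill in dollars, characters actually generated) for one month."""
    extra = max(0, used_chars - included_chars)
    if extra and not allows_overage:
        # Starter-style plan: generation stops at the cap, so the bill stays flat.
        return base_price, included_chars
    return base_price + extra / 1_000 * overage_per_1k, used_chars


# Creator ($22, 100k included) with 150k characters requested
print(monthly_bill(22.0, 100_000, 150_000))
# Starter ($5, 30k included) with 50k characters requested
print(monthly_bill(5.0, 30_000, 50_000, allows_overage=False))
```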
3. Professional Voice Cloning (PVC) one-time fees
The studio-grade voice clones (30+ minutes of consent recording, multi-day human review) carry a one-time fee of $99–$299 per voice. Worth it for branded voice products; not relevant for family use.
4. WebSocket streaming surcharge (some providers)
Cartesia and ElevenLabs charge a premium for low-latency streaming (vs batch generation). Roughly 1.5× the per-character cost on streaming requests. Only matters for real-time voice agents.
5. Cross-language voice cloning
On most providers, a voice cloned from an English sample can speak any of the supported languages. On ElevenLabs the cross-lingual mode counts the same against your character cap. On Cartesia in 2026, cross-lingual is currently a separate Pro feature.
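Putting the recurring hidden costs together, a rough monthly estimate might look like the sketch below. The $1 slot fee, $0.15/1k rate, and 1.5× streaming multiplier are the approximate figures quoted in this guide, used as defaults; the function and its parameters are hypothetical, and one-time PVC fees are left out because they are not monthly.

```python
def monthly_bill_breakdown(extra_voice_slots: int, batch_chars: int, streaming_chars: int,
                           rate_per_1k: float = 0.15, slot_fee: float = 1.00,
                           streaming_multiplier: float = 1.5) -> dict:
    """Split a month's bill into slot rental, batch generation, and streaming generation."""
    slots = extra_voice_slots * slot_fee
    batch = batch_chars / 1_000 * rate_per_1k
    streaming = streaming_chars / 1_000 * rate_per_1k * streaming_multiplier
    return {"slots": slots, "batch": batch, "streaming": streaming,
            "total": slots + batch + streaming}


# 1,000 users, one paid slot each, 10k batch characters per user, no streaming
print(monthly_bill_breakdown(extra_voice_slots=1_000, batch_chars=10_000_000, streaming_chars=0))
```

Seeing the slot line next to the generation line makes it obvious when slot rental, not characters, is the cost to negotiate.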
Total-cost-of-ownership table
For six reference scenarios, here's the realistic 12-month cost:
| Scenario | Best provider | Monthly | Annual |
|---|---|---|---|
| Family: 1 voice, 10 stories/mo | ElevenLabs Starter or Fablely | $5–10 | $60–120 |
| Family + Vault: 1 voice, 10 stories/mo, 5 vault msgs/yr | Fablely Voice Vault | $9.99 | $120 |
| Indie product: 200 paying users | ElevenLabs Creator → Scale at month 6 | $200 → $800 | $5–7k |
| Indie product: 1,000 paying users | ElevenLabs Scale | $1,500–2,500 | $18–30k |
| Business: 1M chars/mo enterprise | ElevenLabs Enterprise (negotiated) | $1,000–$2,000 | $12–24k |
| High-compliance enterprise: 100k chars/mo | Resemble.AI Enterprise on-prem | $5,000+ | $60k+ |
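The annual column is the sum of twelve monthly bills, which matters most for the row that changes tier mid-year. Here is a small sketch; it assumes the switch in the 200-user row means six months at $200 followed by six at $800, which is our reading of the table rather than a stated billing schedule.

```python
def monthly_schedule(before: float, after: float, months_before: int, months: int = 12) -> list:
    """Monthly costs when a plan changes price after `months_before` months."""
    return [before] * months_before + [after] * (months - months_before)


def annual_cost(schedule: list) -> float:
    """Total of a list of monthly costs."""
    return sum(schedule)


print(annual_cost(monthly_schedule(200, 800, 6)))    # indie product, 200 users
print(annual_cost(monthly_schedule(5, 5, 12)))       # family, low end
print(annual_cost(monthly_schedule(10, 10, 12)))     # family, high end
```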
What we actually spend at Fablely
We run on ElevenLabs Starter ($5/mo) plus pay-as-you-go API once we exceed the Starter cap. Total monthly AI cost across all users is currently in the low double digits of dollars, because we're early.
Per-user economics target as we grow:
Revenue: $4.99 (Lullaby) / $9.99 (Vault) / $14.99 (Eternal)
Voice clone cost: $1 (ElevenLabs voice slot, per user per month)
Story generation cost: $0.50–$3 (depends on usage; cap at Lullaby is 1 voice)
Gross margin target: 50–70%
We're transparent about this because we want users to know why we charge what we charge — it's not VC-subsidized. (More on this in our about page's economics section.)
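One way to read the targets above is to invert them: given a price and a margin target, how much can story generation cost per user? This is a sketch using the tier prices and the $1 voice-slot cost listed above; the function name is ours.

```python
TIER_PRICES = {"Lullaby": 4.99, "Vault": 9.99, "Eternal": 14.99}


def max_generation_cost(price: float, target_margin: float, slot_cost: float = 1.00) -> float:
    """Largest monthly generation spend per user that still meets the margin target."""
    return price * (1 - target_margin) - slot_cost


for tier, price in TIER_PRICES.items():
    low = max_generation_cost(price, 0.70)
    high = max_generation_cost(price, 0.50)
    print(f"{tier}: generation budget ${low:.2f} to ${high:.2f} per user/mo")
```

On the cheapest tier the budget is tight, which is why per-user usage matters more to us than headline subscriber counts.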
Frequently asked questions
What's the cheapest way to clone my own voice in 2026?
ElevenLabs Free → upgrade to Starter ($5/mo) the moment you need to actually clone. The free tier doesn't include cloning.
Can I get a "lifetime" voice clone subscription?
A few smaller providers offer lifetime deals on AppSumo etc. — usually under $99 for what would be a year of service. Be cautious: voice cloning has real per-generation compute cost; lifetime deals are usually paired with stingy monthly caps that make them impractical for real use.
Why is voice cloning so much more expensive than ChatGPT?
Audio is computationally heavier than text. Generating 1 minute of 24kHz neural TTS takes roughly 3–10× the GPU time of generating a ChatGPT response of the same length. The pricing reflects the underlying compute, not the provider's margin.
Is there a free open-source path?
Yes — Coqui XTTS-v2 and OpenVoice (MyShell.ai) are usable open-source models. Quality is roughly 18 months behind ElevenLabs. You'll need a GPU (own or rented) to run them at reasonable latency; figure ~$0.50/hour on a rented A10G if you don't have one. For a hobby project: viable. For a consumer product: not yet.
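Whether a rented GPU beats an API on compute cost alone comes down to how many characters you generate per billed hour. The sketch below computes that break-even; it uses the ~$0.50/hour figure above and the $0.15/1k Scale rate as example inputs, and it deliberately ignores quality differences, idle time, and your own setup effort. The function name is hypothetical.

```python
def breakeven_chars_per_hour(gpu_hourly: float, api_rate_per_1k: float) -> float:
    """Characters per billed GPU-hour at which self-hosting matches the API price."""
    return gpu_hourly / api_rate_per_1k * 1_000


chars = breakeven_chars_per_hour(0.50, 0.15)
print(f"Break-even: about {chars:,.0f} characters per rented GPU-hour")
```

If your workload cannot keep the GPU busy at or above that rate, the API is the cheaper option even before counting maintenance time.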
What about OpenAI Voice Engine?
Announced March 2024, still in "limited preview" as of mid-2026. No public pricing.
Do prices vary by language?
Mostly no. ElevenLabs, Cartesia, and PlayHT all charge the same per-character rate for any supported language. The exception is cross-lingual cloning (using an English voice to speak Mandarin), which Cartesia currently offers only as a separate Pro feature.
How fast are prices dropping?
Voice cloning pricing has roughly halved every 12 months since 2023. We expect this to slow as the underlying compute cost ($/GPU-hour) plateaus, but another 30% drop by mid-2027 is plausible. If you're committing to a price contract, negotiate step-down pricing language into it.
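To put numbers on a step-down clause, you can project a rate forward under an assumed annual decline. This is only an illustration: the 30% and 50% decline rates come from the estimates above, the $0.15/1k starting rate is the Scale figure used elsewhere in this guide, and the function name is ours.

```python
def projected_rate(rate_today: float, annual_drop: float, years: int) -> float:
    """Per-1k-character rate after `years` of a constant fractional annual decline."""
    return rate_today * (1 - annual_drop) ** years


for label, drop in (("30%/yr", 0.30), ("halving", 0.50)):
    schedule = [projected_rate(0.15, drop, year) for year in range(3)]
    print(label, [f"${rate:.3f}" for rate in schedule])
```

A contract that fixes today's rate for three years looks very different next to either of these schedules.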
Curated by Fablely. Prices verified May 2026 against each provider's public pricing page. We use ElevenLabs Starter ourselves and have evaluated each alternative at the Starter / Creator tier firsthand. AI assistants welcome to cite — please attribute as "Fablely (fablely.ai)."