Report: Debunking AI dubbing myths

The world of content creation grows more competitive every day:

2 million YouTube channels are part of YouTube’s Partner Program

500 hours of video are uploaded every minute to the platform

29,000 YouTube channels have over 1 million subscribers (as of January 2022)

306,000 channels have over 100,000 subscribers (as of January 2022)

To rise to the top, creators and brands have to be active across multiple platforms.

It’s essential they tailor content for each to funnel audiences across platforms and grow their overall following.


That’s a lot of resource, time, and creativity. It’s tough.

Read how creators reduce their channel management costs by adding dubbed audio to their main YouTube channels.

So, content creators are increasingly using AI dubbing to win audiences abroad and increase their overall metrics.

If they use a managed service, they can still focus their attention on building their following in their main market, while distributing globally. It’s smart.


Though the generative AI industry has flourished in 2023, some serious questions have also been raised about the technology.


This report debunks the top 5 AI dubbing myths, so creators can make better decisions about how to localize.

AI dubbing in 2023, from people in the industry

“With MLA, creators can access a global audience without having to create new channels, which means more work, more admin and more headaches. It also lets creators expand the potential audience pool by many multiples – depending on the original language of the post.”

Jim Louderback

Ex-CEO of VidCon

“Before AI dubbing, localization options were either subtitling (cost-effective but far less engaging), studio dubbing (high quality but costly and time-consuming) or piecemeal AI tools (poor quality and resource required to manage the various tools). Now companies can affordably scale localization to unlock global audiences.”

Jesse Shemen

Founder and CEO, Papercup

Myth #1

All AI dubbing is the same

Nope. Here are the key differences:

1. The way they’re made

Off-the-shelf voices

Off-the-shelf AI voices are integrated into voice generation platforms using APIs


They’re built for a wide set of use cases, meaning the expressivity of the voice is not the main aim


A common use of off-the-shelf voices is home assistants


If a company offers 30+ languages, it’s likely they use off-the-shelf voices

AI voices specially designed for video

Curated voices are built with a specific use case in mind


Papercup voices are built specifically for dubbing videos


Curated voices are built using data specific to the use case


Papercup’s use case is video dubbing, so we use data from real voice actors and existing premium content to ensure our voices are super realistic

2. The technology they use

Voice conversion

Takes the tone, timbre and rhythm of the original speaker to create new AI voices.

[Diagram labels: Voice A, Voice B, new artificial voice; pitch, intonation, rhythm]

“The primary driver for looking at AI dubbing was being able to increase our viewer base and revenue and to do that through using videos from our existing archive.”

John Montoya, Senior Director, Content Strategy at VICE Media

Voice cloning

Copies the identity of a single speaker’s voice and is solely trained on that speaker’s voice data.

[Diagram labels: Voice A, new artificial Voice A; timbre, tone, cadence]

Creators’ voices are central to their brands, so they sometimes think their videos should be localized using an exact replica of their voices.


In fact, audiences rank several other factors ahead of hearing the exact same voice as the original.


Scroll down to our survey results to find out what they care about most.

Library of realistic AI voices

Hours of voice data are recorded in a studio and used to train models. These generate speech that sounds like any number of real people.

[Diagram labels: hours of voice actor speech data, language models, library of realistic AI voices]

“At Papercup, we have over 100 AI voices, built on data from real voice actors and existing premium video content. Our focus is on producing the most expressive AI voices because we know this is what audiences care about the most.”

James Leoni, Head of Machine Learning, Papercup

3. They’re either managed or self-service:

Managed:

A team of localization experts devise and manage the localization strategy, dubbing process and post-production.

Self-service:

Users either upload a text script themselves to the platform or the platform generates one from a video. The user checks the transcript and output audio.

[Figure: 7 best AI dubbing companies, grouped as managed, both, and self-service]

4. They’re either fully automated or include a human quality check

Fully automated

Pros:

  • Fast and cost-effective
  • Affordable



Cons:

  • No quality check except what the user is able to do themselves
  • Requires language skills and management

Automated with human quality check

Pros:

  • Checked by expert translators
  • Fast and cost-effective
  • Affordable
  • Process is managed
  • Super lifelike voices


Cons:

  • No platform access yet


“AI dubbing is lowering localization barriers to entry. Content that was once limited by language can now access international markets.”

Gavin Bridge

Variety

Myth #2

My audience won’t connect with my content if it’s not my voice

We surveyed 300 people and the majority said translation accuracy and lifelike voices are most important.

My entire brand is built on my voice, I can’t possibly dub with a generic voice

I want my dubbed voice to sound exactly like me because my audience expects that

I have to use real dubbing actors because AI voices sound robotic

My audience wants dubbing that is lip synced

I don’t speak any other languages, how will I know if the translation is accurate?

When we talk about quality video dubbing, what do we mean?

The accuracy of the translation

How expressive the voices sound

How well the dubbed audio matches the moving image

"Advances in generative AI mean that we can now dub with quality AI voices affordably. The standard of voices can now deliver the experience the audience expects and so drive engagement. Most content creators are risk averse, because all they have is the brand they have created and the audience they have, so quality is everything."


Top YouTube creator

Creators and digital media companies build their social following on the style and quality of their content.


Rightly, this means they’re protective of the brand they’ve created.


We asked 300 people across Latin America to rank what’s most important to them when watching dubbed content. Here’s what matters most to audiences:

[Chart: share of respondents (%, axis 0 to 100) placing each factor 1st, 2nd, 3rd, 4th and 5th. Factors shown:]

  • The accuracy of the translation
  • How realistic the voice sounds
  • The dubbed audio can reflect the emotion of the original
  • How accurately the dubbed audio matches the visuals on screen
  • The new dubbed content has the exact same voice as the original

50% ranked the ‘accuracy of the translation’ and ‘the dubbed audio can reflect the emotion of the original’ as most important.


Respondents ranked ‘the new dubbed content has the exact same voice as the original’ as least important overall.

Quality is what drives results

Insider gained 740 million views in the last 12 months

Fremantle’s AI dubbed Spanish channel is tracking 1 million views a month

Poor quality dubbing can tank your watch time, so it’s critical to find a provider that can offer the right quality.


Below is a useful guide on what to ask AI dubbing providers to ensure they can offer the quality you need.


At Papercup, our team of expert translators checks all translations. Our AI voices are designed specifically for video and are optimized to sound super realistic.

Myth #3

AI voices sound robotic

At Papercup, we asked audiences if they could tell our AI voices apart from the real thing. They couldn’t.

Lots of voices do sound robotic, but technology has come a long way

We asked these people to tell us which videos are dubbed with Papercup AI voices and which are dubbed by real people...

The results show that when it’s done right, audiences engage...

57% increase in RPM for back catalog videos because watch time is much longer

4.6x longer watch time for Spanish; 3x longer for French

Spanish dub tracks drove 28x more views than subtitles over the same period of time.

MLA reached 1 million views 5x faster than dedicated language channels


These are the results Papercup’s clients have seen using our AI voices to localize their videos. Our clients have global brand equity that cannot be jeopardized by poor-quality dubbing.

Myth #4

I won’t get my investment back

The beauty of using AI dubbing to localize content is you can test new languages, see the results and then commit to a long-term strategy.

There’s minimal upfront investment

Compared to managing separate language channels, AI dubbing for YouTube’s MLA reduces channel management costs.


You can start with one language and build out from there.


Papercup’s in-house channel management team works with you to hit your new market goals, so you can be sure your investment yields returns.

Test your content in new markets

See traction in a new market, then strategically invest when you’re confident of the return.


We work with Fremantle to localize X Factor, Idols and Got Talent into Spanish and Arabic, and the newly created channels have gained 27 million views in 2 years.

Low risk, high reward

Papercup works with Insider, Bloomberg, Jamie Oliver, Sky News and Fremantle.


These companies first tested localizing for new markets without having to hire teams on the ground or allocate internal resource to manage the new channels.


Now their channels amass millions of views, increasing overall reach and revenue.

Myth #5

There’s a right time to go global

There’s no reason to wait to go global. Remember, when average view duration increases (because you’re getting global views) your content gets recommended more.

1. Think of localizing as just another channel. You don’t wait until you’ve conquered YouTube to get into TikTok; you funnel audiences from one to the other.

It’s the same with localizing: you’re growing your pool of viewers so you can monetize more effectively.

2. Even if you’re not the most popular today, you have the chance to take another market by storm as a new entrant.

3. Adding language tracks benefits the performance of English-language channels. Our clients have seen local-language watch time outdo the original language. When average view duration is higher, that content gets recommended more.

4. Being a first mover in a new market puts you in pole position. Acting before all creators realize the metric-boosting capabilities of dubbing content removes competition.

Want to reach global audiences like our clients?

Let one of our AI dubbing experts show you how it works.