AI in VO: How to Get it Right

A couple of friends shared spot-on points in this article, and someone asked for my take on it in a recent meeting. So here’s my 3rd cent on the matter. 

I don’t think I’ve ever seen this topic as active as it has been so far this year, and I have a confession to make: 

Some years ago, I got booked on a “research” project for one of the companies developing AI for use in voice applications and otherwise. Was I well paid for my time and labor? Yes. Did I work with great people and have fun doing it? Absolutely. Do I regret the job? Yep, because, in hindsight, I went into it signing away rights to more than I should’ve. 

At the time, I was still a Non-Union actor. I was also a full-time freelancer, as I am today. Halfway through the project, I started thinking, “Hm, am I putting myself out of a job here?” As many people do, I fell into the trap of taking short-term gains without fully considering the potential long-term consequences.

In the VO business, the issues started with high-speed Internet connectivity, which created a competitively over-saturated buyers’ market. Now with AI, humans are being removed from the equation: Innovation brought more people into the business than it would ever have enough work for, and now it’s started to take the work back and away from people altogether. The Tech giveth and The Tech taketh away.

Shiny new toys can have a way of turning businesspeople into overthinkers and also a way of turning artists into nerds. For Producers, the trap of AI voices, whether “cloned” or wholly synthesized, is over-engineering and going down a road that may be penny-wise but pound-foolish, especially if using entirely fake voices. There’s no point in spending time and effort on creating something sterile that doesn’t make the audience feel something, engage and take action. 

It doesn’t have to come to that, though, nor should it. Hollywood and Silicon Valley must keep learning to speak the same language to avoid this and find a sustainable middle ground. 

For performers, the same principle applies to AI as to lowball rates. Professionalism means knowing how to do business, think creatively, and negotiate, but also when and how to refuse to even audition. Always be willing and able to say no; if needed, walk away and do something else for a living. 

If a company wants to “clone” your voice, then every time that clone gets used in something, you need to collect fees commensurate with how and for how long it gets used. This business has standards for a reason, and applying them in a way that keeps pace with innovation is just an Engineering matter. AI companies doing business in good faith should relish the opportunity to do so if they want to gain legitimacy and take market share.

Example 1:

Advertising; a wild spot, running on Chicago radio for (1) cycle, i.e. (13) weeks

For that, Scale is currently USD 429.19. So when creating the project session on the AI platform, the buyer needs to specify that as the intended usage and then pay that, plus additional one-time fees of 17.25% for health insurance and retirement contributions and 6.75% for employer contributions. Once they approve that cost via the AI platform, then the buyer can create the spot. 
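To make the arithmetic concrete, here is a minimal sketch of how a platform might total that charge. It assumes both percentages are calculated on the Scale fee and rounded to the cent, which is my reading of the numbers above rather than an official formula; the function and constant names are made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures taken from the example above.
SCALE_FEE = Decimal("429.19")               # wild spot, Chicago radio, one 13-week cycle
HEALTH_RETIREMENT_RATE = Decimal("0.1725")  # 17.25%
EMPLOYER_RATE = Decimal("0.0675")           # 6.75%
CENT = Decimal("0.01")


def session_total(scale=SCALE_FEE):
    """Return the fee breakdown for one session.

    Assumption: each percentage is applied to the scale fee itself
    and rounded to the nearest cent before being added.
    """
    health_retirement = (scale * HEALTH_RETIREMENT_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    employer = (scale * EMPLOYER_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    return {
        "scale": scale,
        "health_retirement": health_retirement,
        "employer": employer,
        "total": scale + health_retirement + employer,
    }


if __name__ == "__main__":
    for label, amount in session_total().items():
        print(f"{label}: USD {amount}")
```

Under those assumptions, the buyer would approve USD 532.20 before creating the spot: 429.19 plus 74.04 plus 28.97.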

What if they want to be able to make multiple cuts or edits afterward? No problem. If they want to do so before the cycle’s End Date has passed, an additional $429.19 must get paid to the talent every time they export (or “bounce”) audio from their account on the AI platform. If they want to go back and make edits after the cycle’s End Date has passed, fine. That should be the same process through which they could optionally extend or change the usage. In that case, just like before, the AI platform should recalculate the fee, get their approval to let them back into the project session, and then charge them that fee when they save their changes and export the audio.

Example 2:


This example is non-broadcast corporate/industrial work, Category 1, i.e., “Programs designed to train, inform, promote a product or perform a public relations function (which) may be exhibited in classrooms, museums, libraries or other similar locations. Included are closed-circuit television transmissions and teleconferences.” 

Up to an hour of work would be $559.24 between the session and H&R contributions, plus a few percentage points for one’s paymaster. So, to use a “clone” of your voice, the buyer should pay that as the initial base fee, and then any time they want to go back into the AI platform and make changes, additional costs need to be paid to you based on the exact word counts of those changes. 

This versioning and management functionality has long existed in CMS platforms and applications like Adobe Acrobat. Comparing whatever edits Producers made against an original script is a “diff” operation software-wise. It doesn’t require anything so advanced as machine learning or AI, so there’s no reason any company selling AI voices shouldn’t be able to implement it. $0.20 – $0.35 per word is standard. Buyers should still pay for every change, but only in proportion to how much gets changed each time.
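Here is a rough sketch of that word-level comparison, using Python’s standard difflib module. The assumptions are mine, not an industry rule: words are split on whitespace, only new or replaced words are billed (since those need new audio), and straight deletions cost nothing. The default rate is the low end of the range above, and the function names are hypothetical.

```python
import difflib
from decimal import Decimal


def changed_word_count(original: str, revised: str) -> int:
    """Count the words in the revised script that are new or replaced."""
    old_words = original.split()
    new_words = revised.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changed = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" introduce words that were not in the original.
        # "delete" and "equal" add nothing new (an assumption about billing).
        if tag in ("replace", "insert"):
            changed += j2 - j1
    return changed


def revision_fee(original: str, revised: str, rate_per_word=Decimal("0.20")) -> Decimal:
    """Fee owed for one round of edits, at a flat per-word rate."""
    return changed_word_count(original, revised) * rate_per_word


if __name__ == "__main__":
    before = "Welcome to the annual safety training for new staff"
    after = "Welcome to the quarterly safety training for all new staff"
    print(changed_word_count(before, after), "changed words")
    print("USD", revision_fee(before, after))
```

In that sample, two words changed (“quarterly” and “all”), so the buyer would owe USD 0.40 at the low rate. A real platform would need to decide how to treat punctuation, reordered sentences, and deletions.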

The Bottom Line

AI isn’t going away, and for buyers who want to settle for it, it can give them tons of ease, flexibility, and convenience. It can also help them save money in the long run. On the talent end, it can save time and labor without talent getting cheated out of the recurring revenue it still deserves. Like anything, there are trade-offs all around, so everyone should keep an open mind.

As for us actors, how many of us behind the mics wouldn’t prefer to focus our manual energies on auditioning for or working on commercials, promos, and character work, where it’s all about the exacting creative details and individual performers’ ability to take direction, improvise, and act, all in real time? As actors, aren’t we supposed to be storytellers who make audiences feel something? When clients on some other job types want something low-cost, fast, and utilitarian, “a nice sounding voice” as a relative commodity, why shouldn’t we be optimizing while still monetizing? Win-win here is possible if both the Buy and Sell sides genuinely want to dance. It takes two to Tango. Anything less than that becomes just more “industry” pain and balkanization.

If you’re genuinely professional talent, you owe it to yourself and your trade brothers and sisters today and tomorrow to be innovative. Make hard choices as you have to. It’s vital to consider when and how to turn down a job to protect your career. Live high or live low, but never compromise your principles because your choices will always stay with you no matter how you live or where you go.