Scott Mann had a problem: too many f-bombs.
The writer-director had spent production on “Fall,” his vertigo-inducing thriller about rock climbers stuck atop a remote TV tower, encouraging the two leads to have fun with their dialogue. That improv landed a whopping 35 “f-cks” in the film, placing it firmly in R-rated territory.
But when Lionsgate signed on to distribute “Fall,” the studio wanted a PG-13 edit. Sanitizing the film would mean scrubbing all but one of the obscenities.
“How do you solve that?” Mann recalled from the glass-lined conference room of his Santa Monica office this October, two months after the film’s debut. A prop vulture he’d commandeered from set sat perched out in the lobby.
Reshoots, after all, are expensive and time-consuming. Mann had filmed “Fall” on a mountaintop, he explained, and struggled throughout with not just COVID but also hurricanes and lightning storms. At one point, a colony of fire ants took up residence inside the movie’s main set, a hundred-foot-long metal tube; when the crew woke them up, the swarm enveloped the set “like a cloud.”
“‘Fall’ was probably the hardest film I ever made,” said Mann. Could he avoid a do-over?
The solution, he realized, just might be a project he’d been developing in tandem with the film: artificially intelligent software that could edit footage of the actors’ faces well after principal photography had wrapped, seamlessly altering their facial expressions and mouth movements to match newly recorded dialogue.
“Fall” was edited in part using software developed by director Scott Mann’s artificial intelligence company Flawless. (Courtesy of Flawless)
It’s a deceptively simple use for a technology that experts say is poised to transform nearly every dimension of Hollywood, from the labor dynamics and financial models to how audiences think about what’s real or fake.
Artificial intelligence will do to motion pictures what Photoshop did to still ones, said Robert Wahl, an associate computer science professor at Concordia University Wisconsin who’s written about the ethics of CGI, in an email. “We can no longer fully trust what we see.”
A software solution for dubious dubs
It took a particularly dispiriting collaboration with Robert De Niro to push Mann into the world of software.
De Niro starred in Mann’s 2015 crime thriller “Heist,” and the two had put a lot of time and thought into the acclaimed actor’s performance. But when it came time to adapt the film for foreign releases, Mann said, he was left unsatisfied.
When films get released overseas, the dialogue is often re-recorded in other languages. That process, called “dubbing,” makes the movie internationally accessible but can also lead to the jarring sight of an actor’s mouth flapping out-of-sync with the words they’re supposedly saying. One typical solution is to rewrite dialogue so it pairs up better with the pre-existing visuals — but, for the sake of legibility, those changes sacrifice the creative team’s original vision.
“All the things I’d worked out in nuance with Robert De Niro were now changed,” Mann said of the dubs. “I was kind of devastated.”
A follow-up film he worked on, “Final Score,” deepened those frustrations. Mann tried scanning his cast members’ heads so he could better sync up their speech, but the process proved prohibitively expensive and the final outcome looked weird.
It wasn’t until he researched more novel solutions that the visual effects enthusiast found a 2018 academic paper outlining one: neural networks, or computer programs mimicking the structure of a brain, that sought to transpose one actor’s facial expressions onto another’s face.
Fascinated, Mann reached out to the paper’s authors and began collaborating with some of them on a rudimentary “vubbing” tool — that is, visual, rather than audio, dubbing. The subsequent addition of Nick Lynes, a friend-of-a-friend with a background in online gaming, gave the team a foothold in the tech sector, too.
Together, the envoys of three very different worlds — cinema, science and the software industry — built Flawless, an A.I. filmmaking venture with offices in both Santa Monica and London.
In very broad terms, the company’s tech can identify patterns in an actor’s phonemes (the sounds they make) and visemes (how their mouth looks when making those sounds), and then, when presented with newly recorded phonemes, update the on-screen visemes to match. Last year, Time magazine deemed the company’s “fix for film dubbing” one of the best inventions of 2021.
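For readers curious about the underlying idea, the phoneme-to-viseme relationship described above can be sketched as a simple lookup. To be clear, this is not Flawless’ actual method: the company trains neural networks to learn the mapping from footage, whereas this toy uses a hand-made table, and every phoneme label and viseme name below is a hypothetical stand-in.

```python
# Toy illustration of the phoneme-to-viseme idea: many phonemes share
# one mouth shape, so the mapping is many-to-one. A real system learns
# this from video; here a static lookup table stands in.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "aa": "mouth_open", "ae": "mouth_open",
    "uw": "lips_rounded", "ow": "lips_rounded",
}

def visemes_for(phonemes):
    """Return the on-screen mouth shapes implied by a phoneme sequence."""
    # Unknown phonemes fall back to a neutral mouth shape.
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

# When dialogue is re-recorded, the new line arrives as a new phoneme
# sequence; a renderer would then regenerate the actor's face so the
# visible mouth shapes match the replacement audio.
original_line = ["f", "aa", "m"]
replacement_line = ["f", "uw", "b"]
print(visemes_for(original_line))     # ['lip_to_teeth', 'mouth_open', 'lips_closed']
print(visemes_for(replacement_line))  # ['lip_to_teeth', 'lips_rounded', 'lips_closed']
```

The many-to-one structure is what makes dialogue replacement plausible in the first place: if a substitute word passes through a similar viseme sequence, the visual edit needed to sell it is small.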
The scramble to scrub dozens of f-bombs from “Fall,” however, presented a question with potentially much broader ramifications: rather than just change what language characters spoke, could Flawless alter the very content of what they said?
“We went into a recording studio down in … Burbank with the actresses and said, ‘All right, here’s the new lines,’” said Mann, who lives in Los Angeles. Then they plugged the new audio into the vubbing software, which adjusted the stars’ on-screen facial movements accordingly.
“We put the shots in, MPAA re-reviewed it and gave it PG-13, and that was what got into the cinemas,” he said.
Sitting in his Santa Monica conference room several weeks after the film came out, surrounded by posters for “Blade Runner” and “2001: A Space Odyssey,” Mann showed off the results with a scene wherein one of “Fall’s” protagonists bemoans their predicament.
“Now we’re stuck on this stupid freaking tower in the middle of freaking nowhere!” Virginia Gardner exclaimed to Grace Caroline Currey as the two huddled atop a precariously lofty platform.
A moment later Mann replayed the scene. But this time, Gardner’s dialogue was noticeably harsher: “Now we’re stuck on this stupid f-cking tower in the middle of f-cking nowhere.”
The first version was what went out in August to over 1,500 American theaters. But the latter — the one with dialogue fit for a sailor — was what Mann actually filmed back on that fire ant-infested mountaintop. If you didn’t know a neural network had reconstructed the actors’ faces, you’d probably have no idea their cleaned-up dialogue was a late addition.
“You can’t tell what’s real and what’s not,” Mann said, “which is the whole thing.”
The ethics of synthetics
When it comes to filmmaking, that realism has obvious benefits. No one wants to spend money on something that looks like it came out of MS Paint.
But the rise of software that can seamlessly change what someone seems to have said has major implications for a media environment already awash in misinformation. Flawless’ core product is, after all, essentially a more legitimate version of “deepfakes”: CGI that mimics someone’s face and voice.
It’s not hard to imagine a troll who, instead of using these tools to cut cuss words from a movie, makes a viral video of Joe Biden declaring war on Russia. Porn made with someone’s digital likeness has also become an issue.
And Flawless isn’t the only company working in this space. Papercup, a company that generates synthetic human voices for use in dubs and voice-overs, aims “to make any video watchable in any language,” chief executive Jesse Shemen told The Times.
And visual effects mainstay Digital Domain uses machine learning to render actors in cases where they can’t appear themselves, such as scenes requiring a stunt double, said chief technology officer Hanno Basse.
As these and other firms increasingly automate the entertainment industry, ethical questions abound.
Hollywood is already reckoning with its newfound ability to digitally re-create dead actors, as with Anthony Bourdain’s voice in the documentary “Roadrunner” or Peter Cushing and Carrie Fisher in recent “Star Wars” films. Holographic revivals of late celebs are also now possible.
Digitally altered dialogue “risks compromising the consent of those originally involved,” said Scott Stroud, the director of the University of Texas at Austin’s program in media ethics. “What actors thought they were agreeing to isn’t literally what is created.”
And this technology could open the door to films being changed long after they come out, said Denver D’Rozario, a Howard University marketing professor who has studied the software resurrection of dead actors.
“Let’s say … in a movie a guy’s drinking a can of Pepsi, and 20 years from now you get a sponsorship from Coke,” said D’Rozario. “Do you change the can of Pepsi to Coke? At what point can things be changed? At what point can things be bought?”
Mann said the advantages of his technology are many, from breaking down language barriers and fomenting cross-border empathy to sparing actors the headache of reshoots. In his view, scenarios like D’Rozario’s hypothetical Coke sponsorship represent new revenue streams.
Flawless has been proactive, Mann added, about building a product that aids rather than supplants authentic human performance.
“There is a way to utilize technologies in a similar way that the [visual effects] industry has already established, which is like: do it securely, do it right, do it legally, with consent from everyone involved,” he said.
And the company has already engaged “all the big unions” on how to make and use this technology in a sensible way, the director continued.
SAG-AFTRA representatives stressed that A.I. filmmaking tech can either help or harm actors, depending on how it’s used.
“Technologies that do little more than digitally enhance our members’ work may just require the ability to provide informed consent and, possibly, additional compensation,” Jeffrey Bennett, SAG-AFTRA’s general counsel, said in an email. “At the other end of the spectrum are the technologies that might replace traditional performance or that take our members’ performances and create wholly new ones; for these, we maintain that they are a mandatory subject of bargaining.”
It’s a train that, for better or worse, has already left the station.
“Fall” is currently streaming, and Mann said other movies his company worked on are coming out this Christmas — although he can’t yet name them publicly.
If you see a movie over the holidays, an A.I. might have helped create it.
Will you be able to tell? Would it matter?