If you’re a GenAI company in 2023, you might have a lot of capital. But do you have any customers?
That’s one advantage British AI company Synthesia has over many other darlings of the current GenAI craze. Today the company has more than 50k customers, with nearly half of Fortune 500 companies using its tech to automatically generate videos with virtual avatars. The NHS has used it to make explanatory videos in different languages.
It was also one of only seven private tech companies to achieve a billion-dollar valuation in Europe this year amid a slowdown in funding.
“I love when there's some French cement-mixing company that you’ve never heard of before, but has 15k employees who don't care about AI at all, they’re just trying to do their job better and Synthesia is the right tool,” says Victor Riparbelli, the company’s CEO, who’s spearheaded the commercial growth of the business.
The company declined to give any figures about revenue, so it’s unclear how much money these customers are actually bringing in. But it has been enough to convince investors like Accel and Kleiner Perkins to part with their capital.
So how has Synthesia gone from university research to billion-dollar company?
Hollywood from your bedroom
Riparbelli says that the company’s big vision has always been: “How do we enable a 16-year-old sitting in their bedroom with a good idea to make a Hollywood film?”
For the first three years of its existence, Synthesia was building an AI dubbing tool, using computer vision to make mouth movements more lifelike in different languages. The computer vision side of the tech was based on the academic research of Riparbelli’s cofounders, Matthias Niessner and Lourdes Agapito, who wrote some of the first papers demonstrating how AI could be used to generate video.
“What quickly panned out with that was: it works, it was cool and we did a fair amount of revenue on it,” Riparbelli says. “But it was also very clear that if we went down that path, we'd be stuck in being a service-based visual effects company. It was very hard to see how that was gonna get to a truly impactful, VC-style outcome.”
Riparbelli says that he and the team then realised that the best target customer for GenAI video wasn’t people already making videos, but people who wanted to make videos at work and didn’t have the resources to do so.
“We learned that there are actually billions of people in the world who really, really want to make video content, but they have no idea where to start. They don't know how to use a camera, they can't get the budget internally to do it,” he says.
Synthesia could provide a product that’s “70% lower” quality than camera-shot video, but far more affordable and easier to use — a tradeoff many of these amateurs were willing to accept.
“This is magic”
Synthesia was founded in 2017, after Riparbelli moved to London and began working and consulting in the VR industry. The lightbulb moment, he remembers, came when he read a research paper by Niessner, who would go on to become a cofounder.
“This was one of the first papers that demonstrated a video that was generated with AI and the feeling for me when I saw that was just like, ‘This is magic. This is going to change everything we know about media production’,” says Riparbelli.
Before long, he had teamed up with Niessner and two other cofounders — Lourdes Agapito and COO/CFO Steffen Tjerrild — to start turning research into a business. Agapito is a professor of 3D computer vision at University College London.
Riparbelli remembers that, at the time, other founders were using the latest research in AI for video generation to create tools like Snapchat filters — dog ears, for example.
“There's a bunch of companies that did that and they made good money on it. But I felt like there's so much more to this technology than just giving yourself dog ears. This technology is going to be so powerful in 10 years that it's going to be truly transformative.”
Staying ahead of the pack
Today the company is up against a whole crop of new startups that are using GenAI to develop similar video products for enterprise use — including Colossyan, HeyGen and Speechify — and they’re more than happy to lob grenades at Synthesia.
Each competitor has a page on its website outlining how it outperforms the incumbent AI videomaker, with Colossyan boasting better customer service ratings, HeyGen claiming better lipsync quality and Speechify saying it’s got better audio fine-tuning tech.
“We’re also very happy to talk about how we’re better than them,” says Riparbelli. “There’s a lot of companies coming after us now, but that’s a sign that this product and technology has real value.”
He says he’s confident that Synthesia’s technical edge, anchored in the academic expertise of its cofounders, will keep it at the front of the pack.
“I'd say most of the competitors we have today are kind of stuck on version one of our technology,” he says. “That works and we know that that's already a big market in itself, but ultimately we want to drive the market forward and take the avatars to a level where they're much more realistic and expressive.”
Agapito says that the next research frontier for Synthesia’s technology involves building an AI model that can generate avatars that can make physical gestures and interact with other objects.
The founders estimate that they’re roughly 40% of the way on the journey from a research demo to a teenager being able to generate a Hollywood film from their bedroom. The hyperscaling VC growth taking them there feels almost surreal to Agapito, who says she’s had one main goal in mind since she started researching 3D computer vision.
“I want to help equip artists and creators with technology that will help them so they can be more creative. It’s about letting them focus on the creative side of things, without having to do things manually.”