Interview

October 25, 2024

Synthesia has quadrupled its team since 2022 to push for a 'ChatGPT-style' moment in AI video

Synthesia cofounder Steffen Tjerrild talks Sifted through the company's plans to cross the uncanny valley with synthetic video


John Thornhill


John Thornhill and Steffen Tjerrild on stage at the Sifted Summit 2024

London-based Synthesia — which builds a text-to-video platform for enterprises — is fast becoming one of Europe’s most promising GenAI companies.

Founded in 2017, it won a $1bn valuation when it raised $90m from Accel and Nvidia last year. Synthesia has nearly tripled its revenues, from £8.6m in 2022 to £25.7m in 2023, and the company says it now has more than half of the Fortune 100 using its tech.

But its losses also increased more than five-fold in the same period (from £4.5m to £23.5m after tax), in large part due to a big increase in hiring, with average monthly headcount growing from 78 in 2022 to 198 in 2023.


Today, the company employs more than 400 people, as it anticipates a breakout “ChatGPT moment,” when its talking head avatars cross the “uncanny valley” and become indistinguishable from real people on video calls. 

Once those avatars have the visual likeness, latency, intellectual understanding, tonality and hand gestures of a real person, the lines between “content and conversation” blur, says Steffen Tjerrild, Synthesia’s co-founder and COO.

“We still feel that our true ChatGPT moment is ahead of us, not behind us,” he says in an interview with Sifted. How far away is that? “We used to say it’s 10 years. I think we’re more like 3 to 5 years. I think the space is moving incredibly fast.” 

Rivals such as OpenAI, which developed ChatGPT and has launched the Sora text-to-video service, have hinted they might cross the uncanny valley even quicker than that.

Building for businesses

More than 50,000 companies, including Amazon, Ocado and Johnson & Johnson, currently use Synthesia’s technology to communicate with their employees in multiple languages and market their services to customers. Whereas a chief executive would have to make multiple trips to a studio to record videos in real life, Synthesia can generate synthetic videos within minutes that can be speedily adapted for different audiences.

Tjerrild argues that video is increasingly becoming the language of the internet as younger generations, in particular, consume content via TikTok, YouTube and Netflix. 

A study by the network intelligence company Sandvine found that video accounted for 66% of all internet traffic in 2022, growing 22% over the year before. Other estimates suggest it is now closer to 80%.

“A lot of these video experiences are taking over our private lives. But at work it’s still PDFs, PowerPoints and emails. There is a starker and starker difference,” Tjerrild says, arguing that, to stay relevant, companies will increasingly need to focus on visual and interactive experiences.

As well as creating more realistic videos, Synthesia’s aim is to create a content management system for enterprises as more of their communication and information migrates from text to video. Having software-rendered video means it will become easier to search, index, translate and A/B test content, and the company hopes that offering this suite of services will enable it to develop its product from a point solution to an enterprise platform.

Many employees might not take too kindly to their chief executive only communicating with them via an avatar rather than taking the time to speak to them directly. Tjerrild says there will always remain a time and a place for real video, and that avatars should augment human communication, rather than replace it. 


Generated videos can be constantly updated in 50 different languages to ensure content is up to date and legally compliant. While the BBC is always likely to have a human newsreader fronting the Six O’Clock News, it could reach new audiences by creating multilingual avatars to read out cricket results for audiences in India, for example. 

Gaining trust

The idea of creating perfect synthetic humans on video disturbs many people, who are already freaking out about the threat of deepfakes. But, in Tjerrild’s view, that will only increase the importance of understanding the provenance of videos, which could be good news for companies like Synthesia that claim to prioritise the responsible use of technology. 

The company’s focus on building a safe platform, centred around consent, control and collaboration, means it will not allow unknown customers to generate clones of politicians, for example. “We get a lot of requests, but we say no to all of them,” he says.

Last month, Synthesia also became the first AI video company to achieve preliminary ISO approval (an international audit standard for responsible development and use of AI) for its content generation process.

What’s next?

Tjerrild says that Synthesia, which has been backed by Accel, Nvidia, Kleiner Perkins and GV, is keeping its head down building the business and has no plans to IPO in the “near to short term future.”

One of the biggest challenges facing any AI company is how to hire and retain talented employees. But he argues it is easier to attract staff when your company has a “crazy mission” to enable users to create a Hollywood-style movie on their laptops.

“It’s hard to build an easy company and it’s easy to build a hard company,” he says. “We’ve been fortunate enough to have a crazy enough mission and a hard enough company to build to attract people.”

Even so, Tjerrild accepts that Synthesia is facing new challenges as the number of its employees has more than quadrupled since 2022 and the company operates across four continents. The business has gone from a position where “everybody knows everything to a default where nobody knows nothing.”

That has been a big learning experience for Tjerrild and his co-founder and chief executive, Victor Riparbelli. Their solution? To use Synthesia’s own communication tools to update employees on new initiatives and product launches. “It’s pretty powerful and it gives us a lot of empathy for the challenges our customers are facing,” he says. “It is like trying to eat as much of our own dog food as possible.”

John Thornhill

John Thornhill is Sifted’s editorial director and cofounder. He is also innovation editor of the Financial Times, and tweets from @johnthornhillft