Europe’s best-funded GenAI startups have one thing in common: they all started out producing open-source models.
That’s in stark contrast to the likes of world-famous OpenAI, which manages and licenses out its models to companies and individuals, and has already topped $1.6bn in annualised revenue.
France’s Mistral, the UK’s Stability AI and Germany’s Aleph Alpha have raised more than $1.3bn between them, according to Dealroom data, and all three allow users to download the source code for some of their technology, free of charge.
But what they’re giving away is far from cheap to make — generative AI models are expensive to develop due to the high costs of talent and computational power needed to train them.
And some industry watchers are now beginning to ask whether they can ever make enough money to give their venture backers a tidy return on investment.
Stability AI seems to have the same concerns: it recently walked back its open-source vision, putting its most powerful models behind a paywall.
Why open source might not work for GenAI
Monetising open-source technology is not a new idea — software companies have been using the strategy for years.
“You open source part of the technology so it becomes widely adopted technology that your company is the expert in, and then you can sell a lot of things around that,” says Mike Linksvayer, VP of developer policy at GitHub. Those things might include bonus features or services. Big tech companies including Microsoft, Amazon Web Services (AWS) and Google have used open-source strategies.
But generative AI models are different from your typical software product. They’ve mostly been developed in a handful of advanced AI labs by a relatively small number of highly in-demand specialist researchers, and they need large quantities of specialist AI chips (GPUs) to operate at scale.
“For a while everybody was fascinated by these open-source models that you could run on your own infrastructure, even on your laptop,” Andreas Goeldi, an AI-focused partner at Swiss VC firm b2venture, tells Sifted, pointing to the excitement many in the industry felt about running powerful ChatGPT-style chatbots locally, without relying on an external provider.
“Now, people are figuring out, 'Oh, it's actually not that easy to run this stuff in a more serious application.' You have to buy GPUs and you have to run all this infrastructure. It’s very complicated to maintain and set up,” he says.
The alternative is essentially paying a third-party provider to host and run the model, and accessing it via an API, as OpenAI does.
Goeldi believes there is now a “movement back to, 'Oh, let's just buy API access,' like what OpenAI has been selling for the whole time. I mean, even Mistral just now launched a platform product.” In December, Mistral launched the beta version of its developer platform, which will charge companies to access some of its models through an API.
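For a sense of what buying API access looks like in practice, here is a minimal sketch that calls OpenAI’s public chat completions endpoint over HTTP; the model name and prompt are illustrative placeholders, and it assumes an API key is available in the environment. Mistral’s new platform works on the same hosted-API principle: the provider runs the GPUs, and the customer simply pays for usage.

```python
# A minimal sketch of the "buy API access" model: the provider hosts the
# weights and the GPUs, and the customer just sends requests over HTTP.
# Assumes an OPENAI_API_KEY environment variable; the model name and prompt
# below are illustrative placeholders.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Summarise this quarter's sales notes."}],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

None of the infrastructure headaches Goeldi describes apply here; the trade-off is that the model, the weights and the prompts all sit with the provider.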
The bull case for open source
Others argue that not every company will want to hand control of business-critical technology to a third party, particularly when companies like OpenAI are known for keeping the workings of their models private.
“If the code is closed source, it’s a black box. Open-source models let the enterprise understand how they work,” says Maxime Corbani, senior associate at deeptech VC firm Runa Capital, adding that open-source models are easier to refine and can be specialised by the client by “fine-tuning” them on proprietary company data.
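As a rough illustration of what that fine-tuning route can look like, the sketch below loads an open-weights model with the Hugging Face transformers library and attaches a small LoRA adapter via the peft library. The model name, target modules and training setup are placeholder assumptions; this is one common approach, not a description of any particular vendor’s tooling.

```python
# A sketch of preparing an open-weights model for fine-tuning on proprietary
# data using LoRA adapters (Hugging Face transformers + peft). The model name
# and target modules below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base_model = "mistralai/Mistral-7B-v0.1"  # any open-weights causal LM
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA trains a few million adapter weights instead of all the base model's
# parameters, which is what makes in-house fine-tuning feasible on a modest
# GPU budget.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# From here the adapter would be trained on the company's own documents with
# a standard training loop (e.g. transformers.Trainer), and the weights never
# have to leave the enterprise's infrastructure.
```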
Aleph Alpha has already partnered with the German government and the city of Heidelberg, allowing them to run proprietary AI models on-site, where sensitive citizen data is held, so that the data is not shared outside the government.
Linksvayer of GitHub agrees that many enterprises will still see value in controlling their AI technology on their own premises (“on prem” for short), and that company leaders will face decisions similar to those they’ve dealt with in the past about whether to run their own software or outsource it.
“You make a decision about how you’re going to obtain and maintain a set of features. You host yourself or have somebody else host it for you. There are kind of different costs and trade-offs that include things like having to have your own people who are expert in running a particular software stack like foundation models,” he says.
“There might be some things that are really core to a company’s domain where it wants to maintain staff expertise and run it on prem. Then there's a bunch of other stuff that's not core to their field that they might outsource.”
A young market
Linksvayer says these kinds of cost trade-offs currently work against open-source AI companies, due to the scarcity of talent that knows how to run LLMs and other generative models.
“The expertise is certainly much more limited,” he says, before adding that this will likely change as more tech developers upskill to meet demand.
“Things are evolving so rapidly. You can make a lot of money with this expertise so lots of developers are learning tools of the trade.”
Linksvayer thinks that it’s healthy to have a good mix of open and closed-source models in the market, to give company builders options to set up their tech stack in the way that suits them. What’s yet to be seen is how deeply this cutting-edge but at times unreliable technology will penetrate the enterprise world, where reliability and accuracy are often the most valuable currencies.