Enthusiasm for large language models (LLMs) shows no sign of slowing: German open source startup deepset.ai has today raised $30m in funding to continue expanding its product, which gives enterprise customers the tools to build their own search applications with LLMs.
It's the company’s third funding round and brings its total raised to $46m. The round was led by Balderton Capital and included returning investors Google Ventures, Harpoon Ventures, System.One and Lunar Ventures.
LLMs are AI models trained on huge datasets to understand and generate human-like text. They have become a hot investment bet for VCs, especially since the release of OpenAI’s chatbot ChatGPT.
Building customised LLMs for enterprises
Deepset enables customers to tailor existing general-purpose LLMs to their own use cases. The end product can take different shapes, from search bars and chatbots to virtual assistants, but in every case it enables the user to quickly search, retrieve, summarise and analyse large datasets.
One of deepset’s first customers was French telecoms company Alcatel-Lucent Enterprise, which built a chatbot based on its library of technical documentation. This enabled technicians in the field to ask questions about issues they were having and immediately access recommended documents.
More recently, deepset has worked with Airbus to develop a question-answering application that lets pilots access the most relevant aircraft operation guidelines directly from the cockpit. This saves them from sifting through thousands of pages of manuals when they are likely to need a quick answer to issues ranging from a sick passenger to a failing sensor.
Adapting pre-trained LLMs
Deepset’s product, named Haystack, gives developers the building blocks they need to create their own custom LLM applications.
“Through these building blocks, you can pick the technology you will use,” deepset cofounder Milos Rusic tells Sifted. “For example, you pick the actual LLM — it could be from OpenAI, from Hugging Face, or from somewhere else.”
Haystack provides access to a range of widely available pre-trained LLMs, such as Meta’s LLaMA, Google’s BERT or OpenAI’s GPT-4, as well as to other technologies such as natural language processing (NLP) tools. These are key to adapting general-purpose LLMs to industry-specific jargon.
Deepset, for instance, worked with Austrian legal publishing house Manz, which provides supporting documentation to legal professionals as they handle cases, to create a search engine that can sift through the 3m documents archived by the publisher. While a LEGAL-BERT variant of Google’s English BERT model exists for the legal domain, there was no equivalent for the German version of BERT. NLP tools were crucial for fine-tuning the model to the nuances of German legal language.
An open source product
Haystack is open source, meaning that the code underpinning the platform is available for anyone to use freely. In theory, any developer could leverage Haystack to put together a DIY LLM application for no fee.
To generate revenue, the company offers support and services on top of Haystack. “If I am doing risk management in a bank and I want to make credit decisions based on information I am getting out of my LLM application, I want to make sure that the application always gets it right,” says Rusic. “Once I’m in production, there's a lot of workflow and app lifecycle that need to be supported.”
Since 2019, deepset has offered these development and scaling services directly through its cloud-based platform deepset Cloud, a subscription-based SaaS product.
This business model meant that deepset, which was founded in 2018 and secured its first customers that same year, was initially bootstrapped.
It wasn't until 2021 that deepset raised its first funding round. “We saw lots of demand for our product, which translated into lots of commercial interest,” says Rusic. “It was quite clear to us at that point in time that there was an opportunity to build a big software company, so we decided to raise money.”
Deepset says it has seen a 250% increase in active users since the beginning of the year. The company now hopes to use the additional funding to develop its sales and marketing strategy while continuing to expand Haystack and deepset Cloud. It also plans to extend its international presence to the US.