Google DeepMind is reportedly close to releasing a new AI model — one powerful, versatile and useful enough to threaten ChatGPT’s dominance since being released by OpenAI last year.
Here’s what you need to know about the model called Gemini, the latest shiny new thing coming to GenAI land, based on what has been reported so far and hints from DeepMind executives.
New architecture
DeepMind CEO Demis Hassabis recently told Wired that Gemini will not just be one model, but a series of models of different sizes (models trained on smaller datasets are faster and cheaper to use).
It’ll also be what’s called a “multimodal” model, which means it can interpret and create both text and images as prompts, rather than just one or the other.
This week, OpenAI announced a new version of its DALL-E 3 image generator, which is now integrated into a multimodal ChatGPT interface, as the big AI labs continue to duke it out to create the most versatile and useful models.
But there are also hints that Gemini might outperform existing models when it comes to problem solving and planning abilities. Hassabis also told Wired that the model has been designed using techniques from AlphaGo — the company’s AI system that famously beat a champion player at the board game Go in 2016.
“At a high level, you can think of Gemini as combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of the large models,” Hassabis told the publication, adding that there’ll be some new tricks in there as well.
“We also have some new innovations that are going to be pretty interesting.”
Compute
Another big reason why Gemini has got GenAI stans excited is the computer processing (known in AI circles as “compute”) resources that Google has at its disposal. The amount of compute power that AI companies have is an important factor in building more powerful models.
It’s hard to know exactly how many chips each rival company has at its disposal, but semiconductor news site SemiAnalysis recently described Google as the “most compute-rich company in the world”.
The publication estimates that Google’s compute infrastructure will be five times more powerful than OpenAI’s by the end of this year, and 20 times heftier by the end of next year.
Data
It’s not just compute resources that might give Google DeepMind’s Gemini an edge. Another ace that Google has up its sleeve is data: a lot of data.
Startups like OpenAI have historically relied on data sources like Common Crawl (a repository of text from the internet) and BookCorpus (a dataset containing the text of thousands of books) for model training.
Both of these data sources are known to include copyrighted material, and publishers and writers are now fighting back in the courts, claiming their works have been misused.
Google, meanwhile, owns huge proprietary datasets in YouTube videos, Google Books and Google Scholar — meaning the company can happily train big models without the same legal liability.
Taking all the above into account, it looks like Google DeepMind — which has been called a sleeping giant in the GenAI race — is waking up, as the company that rules the internet could be about to take the AI crown too.