Opinion

February 25, 2021

How to get AI to sound less drunk: the GPT-3 case study

The much-hyped GPT-3 still lacks understanding of the world — but that may be coming.


Laurent Sorber

6 min read

GPT-3 has created a lot of buzz since its release a few months ago. Deservedly so, as it’s a big step forward in AI.

The system can generate (almost) plausible conversations with the likes of Nietzsche and write op-eds for The Guardian; it was even used to post comments undercover on Reddit for a week.

But even with GPT-3, AI is still stuck in the uncanny valley. At first glance, GPT-3’s output feels like it was written by a human, but it isn’t quite. On closer inspection, it lacks substance and coherence.

There are two reasons for this.

First, GPT-3 can’t make a point. It behaves like a stream of consciousness, which means each new sentence feeds off the last few sentences.

GPT-3 associates and builds on what it can hear itself saying. Its text usually makes sense at the sentence level, and sometimes even at the paragraph level, but its shortcomings become obvious in longer texts (as you can see in this blog post on Bitcoin written by GPT-3).
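
To make that concrete, here is a toy sketch in plain Python. The next_sentence function is a made-up stand-in for the model; the point is only that each new sentence is conditioned on a fixed window of recent text, never on an overall goal.

```python
# Illustrative sketch only: a dummy "model" that, like GPT-3, conditions each new
# sentence on a fixed-size window of recent text rather than on any overall point.
def next_sentence(context: str) -> str:
    # Placeholder: a real model would sample a plausible continuation of `context`.
    return f"A sentence loosely riffing on '...{context[-40:]}'."

def generate(prompt: str, n_sentences: int = 5, window: int = 200) -> str:
    text = prompt
    for _ in range(n_sentences):
        visible = text[-window:]  # only the most recent text is ever "seen"
        text += " " + next_sentence(visible)
    return text

print(generate("Bitcoin is the future of money."))
```

Once the opening sentence slides out of that window, nothing ties the text back to it, which is exactly why longer pieces drift.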

GPT-3, like a drunk, cannot make its point because it has no point to make. In the context of entertainment (or a bar), GPT-3 offers up neat tricks. But it will not write your grant proposal or business plan anytime soon.

Second, even if you gave it a point to make, it wouldn’t really know how to get to that point.

Say we offer GPT-3 the purpose it needs, for instance: 'we need to increase sales by 15% in 2021'. GPT-3 would not be able to move logically from one argument to the next to arrive at the conclusion that we do indeed need to increase sales by 15%. Instead, it would bulldoze its way through, doing a sort-of-okay job of it, but not necessarily well enough to convince your sales and marketing division.

A recent criticism of natural language processing models like GPT-3 is that they don’t seem to understand language. One big tell is that GPT-3 cannot tell the difference between the sentences 'Marijuana causes cancer' and 'Cancer causes marijuana'.
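
GPT-3 itself sits behind an API, but you can run this kind of probe on its smaller open-source sibling GPT-2 with the Hugging Face transformers library. The sketch below simply compares the average next-token loss the model assigns to each sentence; it measures how statistically expected the words are, not whether the causal claim makes any sense.

```python
# Hedged probe: score two sentences with GPT-2 (a stand-in for GPT-3).
# A lower loss means the model finds the word sequence more "expected".
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_loss(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

for s in ["Marijuana causes cancer.", "Cancer causes marijuana."]:
    print(f"{s!r}: average loss = {sentence_loss(s):.2f}")
```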

But the problem is deeper than that: GPT-3 just doesn’t understand the world — it has no concept of reality.

A reality check for AI: building a 'world model'

Understanding the world like humans do would help AI overcome the two weaknesses described above. A 'world model' — a broad, intuitive understanding of what is realistic and what isn’t — would allow it to make a clear point, and would help it build a logical argument to arrive at that point.

Building such a world model is not easy.

Even learning the basic life skills that toddlers learn — like counting — is more difficult than it looks. GPT-3 is trained on text, and it’s not easy to learn how to count if you can only read about it. Google recently tried, and its model did manage to count small numbers, with some mistakes interspersed; eventually it reached a level similar to a three-year-old’s ability to count.
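
You can run a similarly rough probe at home, again with GPT-2 standing in for a text-only model: give it a partial count and let it continue. How far the continuation stays correct is a crude measure of how much counting can be picked up from reading alone.

```python
# Hedged probe: can a text-only model (GPT-2 here, as a stand-in) keep counting?
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,"
ids = tokenizer(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_length=ids.shape[1] + 12, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0]))
```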

Difficult as that is, it will be even more difficult to teach the model about the world.

Some projects today are trying to simply add more knowledge about the world to machine learning models. ConceptNet Numberbatch, for example, provides a semi-structured set of word meanings that can be added to a machine learning model, giving it a way of learning about words beyond merely observing them in context.
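
In practice that usually means seeding a model’s word vectors with Numberbatch instead of learning them from raw text alone. A hedged sketch of that step (Numberbatch ships as a word2vec-style text file; the filename and tiny vocabulary below are illustrative):

```python
# Hedged sketch: initialise an embedding matrix from ConceptNet Numberbatch vectors.
import numpy as np
from gensim.models import KeyedVectors

# Assumes you have downloaded an English Numberbatch release to this (illustrative) path.
numberbatch = KeyedVectors.load_word2vec_format("numberbatch-en.txt", binary=False)

vocab = ["marijuana", "cancer", "sales", "increase"]  # your model's vocabulary
embedding_matrix = np.zeros((len(vocab), numberbatch.vector_size))
for i, word in enumerate(vocab):
    if word in numberbatch:
        embedding_matrix[i] = numberbatch[word]  # knowledge-infused starting point
    # words missing from Numberbatch keep a zero (or random) initialisation

# embedding_matrix can now seed the embedding layer of a downstream model.
```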

It’s a useful approach and definitely helps, but I expect it will ultimately fall short, because the volume of human knowledge stored in text is vastly larger than what is captured in knowledge graphs like ConceptNet.

Beyond transformers: a new architecture for AI

What’s needed is a next-generation architecture.

All modern language models, including GPT-1 to -3, are built as a composition of a highly successful neural network component called the transformer. This architecture has brought massive advances and is the reason we’re seeing almost-credible text coming from these models. The strength of transformers is that they are very good at free association.

They have that in common with us. Humans are also good at free association (we call it imagination), but we balance it out with our ability to reason. That ability offers a reality check for our imagination. It helps us move logically from one point to the next and, finally, to our conclusion.

It is our ability to apply 'slow thinking' that keeps us honest, and it is what GPT-3 currently lacks. We’ll need a new type of neural network architecture to build AI with a sense of reality.

Recently, the Facebook AI Research (FAIR) team delivered a first — and at first sight quite impressive — attempt with a new transformer that they called a 'Feedback Transformer'.

The Feedback Transformer improves on the original transformer in two ways.

Firstly, it increases the memory capacity of the transformer. Rather than remembering the last few sentences, the Feedback Transformer can remember everything that has been said up to that point.

Among other things, that allows the Feedback Transformer to remember how often something has occurred so far — it can “keep count”.
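
Heavily simplified, and with illustrative dimensions rather than FAIR’s actual code, the core idea reads roughly like this: every time step collapses all of its layer outputs into a single memory vector, and later time steps attend over the full, ever-growing list of those memories.

```python
# Hedged sketch of the Feedback Transformer's memory, not FAIR's implementation.
import torch
import torch.nn.functional as F

d, n_layers, T = 64, 4, 10                                     # width, depth, sequence length
layer_weights = torch.nn.Parameter(torch.zeros(n_layers + 1))  # learned layer-mixing weights
layers = torch.nn.ModuleList([torch.nn.Linear(2 * d, d) for _ in range(n_layers)])

memories = []                   # one merged memory vector per past time step
x_seq = torch.randn(T, d)       # toy input sequence

for t in range(T):
    h = x_seq[t]
    states = [h]
    for layer in layers:
        if memories:
            mem = torch.stack(memories)                      # (t, d): ALL previous steps
            attn = F.softmax(mem @ h / d ** 0.5, dim=0)      # attention over the whole past
            context = attn @ mem
        else:
            context = torch.zeros(d)
        h = torch.tanh(layer(torch.cat([h, context])))       # stand-in for a transformer layer
        states.append(h)
    # Merge every layer's output at this step into one memory vector.
    w = F.softmax(layer_weights, dim=0)
    memories.append((w.unsqueeze(1) * torch.stack(states)).sum(dim=0))
```

Because that memory list covers every previous step rather than a fixed-size window, nothing ever falls out of view, which is what makes “keeping count” possible.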

Secondly, and more importantly, the Feedback Transformer can follow an entire computer program. For the original transformer, this is too big a task: it simply does not have a large enough memory to hold the entire program state in its mind. The Feedback Transformer, on the other hand, can read a computer program and accurately explain what it will do as it reads it.
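
The flavour of task involved can be sketched with a toy generator: a short “program” of assignments whose final value the model must predict, which forces it to track state for as many steps as the program is long. (This illustrates the task format described above, not FAIR’s benchmark code.)

```python
# Hedged illustration of a state-tracking task: read a tiny "program", answer with
# the final value of x. Keeping the running value in mind requires memory that
# grows with the length of the program.
import random
from typing import Tuple

def make_program(n_steps: int) -> Tuple[str, int]:
    x = random.randint(0, 9)
    lines = [f"x = {x}"]
    for _ in range(n_steps):
        delta = random.randint(1, 3)
        op = random.choice(["+", "-"])
        lines.append(f"x = x {op} {delta}")
        x = x + delta if op == "+" else x - delta
    return "\n".join(lines) + "\nprint(x)  # ?", x

program, answer = make_program(5)
print(program)
print("expected answer:", answer)
```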

New steps in AI like the Feedback Transformer should directly address the 'stream of consciousness' weakness of transformers, making their output sound a lot less like a drunk and a lot more like a real, sober human being. Unlike the original transformer, the Feedback Transformer will not jump around purely associatively; it will take into account everything it has said and heard before.

That increased memory should also significantly improve the reasoning capacity of the architecture, bringing us closer to an AI that can explain why we need to increase sales by 15%.

There’s no doubt that smart minds in AI will try to tackle the 'free association' issue over the next few years. As the FAIR group’s experiment shows, we can probably get it done without sentient AI or huge increases in computational power. Once we achieve that, AI can graduate from a parlour trick into the real business world.

Laurent Sorber is cofounder and CTO at machine learning company Radix.