Opinion

March 27, 2023

The real value of large language models like GPT-4 isn’t in writing, it’s in reading

All of the attention around large language models has focused on their ability to write — but could reading be where the potential lies?


The infinite monkey theorem holds that a million monkeys banging on a million typewriters would, given enough time, eventually reproduce the complete works of Shakespeare. That's all very well for the monkeys, but it's not actually that useful to humans.

The global, internet-connected world in which we live can feel as chaotic and nonsensical as a barrage of monkey-written text. You wake up one morning and find that panicked bank depositors, many of them taking their cues from social media, have collapsed a major financial institution in a matter of hours. Large language models (LLMs) like GPT-4, which can spin up hot takes and blog posts in a minute, are only making this noise worse.

The power of a million monkeys has been compressed into a model, and a million years has been compressed to a minute.


But despite what many VCs and founders think, the real value of these technological advances in language models is not in adding to the noise. The real value is not in “writing” but in “reading”; in other words, in creating the insight that makes sense of it all. The founders who weather the generative AI hype cycle will be those who build with this in mind, rather than shipping yet another GPT-4-powered marketing copy generator.

Analysing SVB media coverage

Recently, excitement about the potential of LLMs — and criticism of them — has been focused on their ability to write. Take Noam Chomsky and his co-authors, whose New York Times piece “The False Promise of ChatGPT” blasted the technology because it does not exactly mimic the way that humans learn language.

These denunciations, as well as ridiculous suggestions that AI is in fact a sentient being trapped in a computer, are likely to get more frequent given the abilities demonstrated by GPT-4. They are also great examples of focusing on sci-fi questions rather than the benefit of using LLMs, which is helping us do work (the word “robot” comes from the Slavic root for “work”) that we can’t do on our own. In other words, the reading I mentioned above.


Large language models do not need to “understand” ideas and concepts the same way humans do in order to reliably make meaningful distinctions that replicate human work. At Overtone, we recently used our model to analyse the media coverage that led to the panic at Silicon Valley Bank. Reading and evaluating 1,000 articles in the space of a minute is impossible for a human, but language models can now “read” those articles and create data points on them almost instantly.

These data points sit at a level of abstraction above entity extraction or sentiment analysis (the tasks NLP has mostly been used for until recently). They can accurately classify huge amounts of text in ways that are understandable by and useful to a human, such as an editor deciding which articles should be included in a newsletter or a PR professional responding to a crisis. But note: the AI isn’t adding to the noise.
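To make the idea concrete, here is a minimal sketch of what that kind of “reading” can look like in practice. It assumes the OpenAI Python client as it existed at GPT-4’s launch; the category labels and prompt are illustrative placeholders, not Overtone’s actual taxonomy or method.

```python
# A minimal sketch of LLM-assisted "reading": labelling a batch of articles.
# Assumes the openai Python package (v0.27-era API) and an OPENAI_API_KEY
# environment variable. The categories are hypothetical, for illustration.
import openai

CATEGORIES = ["original reporting", "analysis", "opinion", "aggregation"]

def classify_article(text: str) -> str:
    """Ask the model to assign exactly one category label to an article."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You label news articles. Reply with exactly one of: "
                        + ", ".join(CATEGORIES) + "."},
            {"role": "user", "content": text[:6000]},  # truncate very long articles
        ],
        temperature=0,  # we want consistent labels, not creative writing
    )
    return response["choices"][0]["message"]["content"].strip()

# labels = [classify_article(a) for a in articles]  # 'articles': your corpus
```

Run over a corpus, a loop like this turns a thousand articles into a thousand structured data points in minutes, which is the “reading at scale” described above.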

Poking the bear

Those who are overly focused on the benefits of LLM writing have promoted new ways for humans to interact with the technology, such as becoming a “prompt engineer”, a job that consists of throwing different combinations of inputs at black-box models in order to get an optimal result.

To me, this misses the power of language models to parse text and classify it, including into categories that can determine whether a generative output is doing what the human prompting the machine intended. As one professor told the Washington Post, prompt engineering is “not a science… It’s ‘let’s poke the bear in different ways and see how it roars back’.”
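That classifying power can be turned back on generation itself. Below is a sketch of using a model to “read” a draft against the brief it was written to, rather than blindly re-rolling prompts; the helper name and the yes/no framing are my own illustrative assumptions, built on the same OpenAI client as above.

```python
# Sketch: a "reading" step that judges whether a generated draft actually
# satisfies its brief. The meets_brief helper and its YES/NO framing are
# illustrative assumptions, not a published evaluation method.
import openai

def meets_brief(draft: str, brief: str) -> bool:
    """Use the model to read a draft and judge it against its brief."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer only YES or NO: does the draft satisfy the brief?"},
            {"role": "user", "content": f"Brief: {brief}\n\nDraft: {draft}"},
        ],
        temperature=0,
    )
    answer = response["choices"][0]["message"]["content"].strip().upper()
    return answer.startswith("YES")
```

A check like this is crude, but it replaces bear-poking with a repeatable test that can gate what gets published.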

As someone who spent the first part of my career as a reporter and editor, I believe the work of producing content online is not, or at least should not be, about creating *a* piece of content. It should be about creating *the* piece of content that informs, entertains or inspires your audience and fulfils their user needs.

Instead of prompt engineering and spinning the wheel of generative output, the best businesses built on language models will focus on fine-tuning, using their own “reading” models to reliably create effective communication for users, the readers. Those users, human beings, can get value from receiving information that wasn’t just splurged onto the internet, but made with them and their needs in mind.

Finding the right use cases and value creation for their “readers” will be the most important work startups can do as LLMs become increasingly accessible, as we have already seen this week with Stanford’s training of Alpaca on a relatively small amount of compute. Investors are worried about the defensibility of AI businesses, and one of the few moats will be building a connection with users and creating *the* content they need.


Greg Brockman’s demo on the GPT-4 Developer Livestream, analysing a tax document, is a good example. But the value rests not in GPT-4 being able to output information about the tax code as a conversation or as a poem; the value is in being able to parse a document like the tax code and use existing rules about language to see how the different tax rules work together. Businesses in specific industries, from education to entertainment, can create similar value because they know the problems of their users, their readers, and how language needs to be conceptualised and expressed to solve those problems.
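In code, that kind of document “reading” might look like the sketch below: the model answers a question grounded only in the supplied text. This is not the livestream demo’s code; the helper, prompt and grounding instruction are placeholder assumptions on the same client as the earlier sketches.

```python
# Sketch of grounded document "reading": the model answers a question using
# only the supplied text and quotes the passage it relied on. The prompt and
# helper are illustrative assumptions, not the GPT-4 livestream demo itself.
import openai

def answer_from_document(document: str, question: str) -> str:
    """Answer a question about a document, grounded in that document alone."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the document provided, and quote "
                        "the passage your answer relies on."},
            {"role": "user",
             "content": f"Document:\n{document}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    return response["choices"][0]["message"]["content"]
```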

Large language models have their limitations, but the focus on writing has taken the debate away from their real usefulness in our present moment: using language to make sense of all the language out there. The best businesses will be focused on solving people’s problems; everything else is monkeying around.

Christopher Brennan

Christopher Brennan is the cofounder of AI startup Overtone.