Interview

November 24, 2023

Stability AI’s outgoing head of audio reckons GenAI is broken: here’s why

When Ed Newton-Rex quit Stability AI, he spelled out what's going wrong in generative AI. Now he tells Sifted how to fix it


Tim Smith

6 min read

Lexie Yu

Ed Newton-Rex is one of those people who has what feels like a couple too many strings to his bow. 

He’s a dad, tech founder, coder, composer, scout investor for Andreessen Horowitz and mentor at the iconic Abbey Road Studios. 

More recently, he’s also become one of the leading voices sticking up for creators in the emerging battle over copyright and compensation in the world of GenAI.

And, until last week, he was also the VP of audio at Stability AI — the London-based startup that shot to fame when it raised a $101m Series A round in October last year, off the back of the launch of open source image generator Stable Diffusion, which the company helped fund.

Newton-Rex isn’t one to bite his tongue either. He announced his resignation in dramatic public fashion — taking to X to spell out why he disagrees with Stability’s “opinion that training generative AI models on copyrighted works is ‘fair use’”.

Fair use is a legal concept that a number of large generative AI companies are using to defend their use of copyrighted material in the data used to train their models.

Their argument — as Newton-Rex sees it — “just doesn’t seem right” and could risk “decimating entire industries” if we don’t work out better solutions fast. Sifted sat down with the polymath and activist to hear how he sees the battle lines shaping up in what will be a fundamental struggle over how GenAI companies compensate creators.

Unfair use?

To properly understand why Newton-Rex left his presumably lucrative role at Stability AI, it’s worth taking a minute to drill into the legal nuances of what fair use of copyrighted material entails.

Fair use has historically protected the reproduction of copyrighted works in limited circumstances, but the doctrine covers a number of factors that have to be weighed against each other.

In January, Stability’s CEO Emad Mostaque told Sifted that he doesn’t believe the outputs from image generation models like Stable Diffusion represent a misuse of copyrighted material because they are “transformative”. This factor of fair use holds that works that draw on copyrighted material but change the nature of that material don’t infringe copyright.

Historic examples of this include things like collages, where bits of existing artworks can be repurposed to create new art.

But Newton-Rex believes this argument is trumped by another fair use factor. In his X post, he pointed out that the US Congress has stated that material created with the help of copyrighted work isn’t protected by fair use if it affects or undermines “a new or potential market” for the original work or its creator.

“It’s clear that these models will be able to replace, and to some degree already are replacing, the demand for the original work,” he tells Sifted.

“To me that says, ‘This isn’t fair use’. The fact that you’ve got all these creators who’ve relied on copyright being essentially told by generative AI companies that they can't rely on copyright here and that value can be extracted from their works without their consent and without paying them — I think ultimately that just doesn’t seem right.”

Last week, Stability told Sifted, in response to Newton-Rex’s resignation, that the company “thanks Ed for his contribution to Stability AI and wish[es] him all the best in his future endeavours”. 

On X, meanwhile, Mostaque commented on Newton-Rex’s resignation post, sharing the company’s submission to the US Copyright Office laying out its position on fair use.

How to fix it?

Newton-Rex is quick to point out that he doesn’t believe GenAI companies are acting out of “malice”; rather, they are making their own interpretations of the rules, as you would expect them to.

“I don’t think it’s because these companies are being evil,” he says. “People weigh the different legal considerations according to what they believe in. I think fundamentally these generative AI companies think that they are building powerful tools that will help people.”

That said, Newton-Rex thinks it’s vital that we start developing better frameworks for the relationship between AI companies and creators, now that generative models have got good enough to compete with some types of content that have professional value.

“The danger now is that it’s all got so good — the Turing test is being passed in various domains — that now is when we need to figure out: ‘Okay, how do we make sure this doesn’t just suddenly decimate entire industries?’” he says. “We've got to have that conversation pretty rapidly to be honest.”

Newton-Rex doesn’t just talk the talk on this issue; he’s also walked the walk. Stability AI’s audio generation model — Stable Audio — which he spearheaded, was trained on music from the production music library AudioSparx, with Stability agreeing to a revenue share model in exchange for the training data.

“A revenue share agreement is just one potential solution to how you work with rights holders,” he says. “You can pay upfront for data, you can give equity in the company, you can do all sorts of things to work with the rights holders.”

Newton-Rex adds that while there’s not necessarily a one-size-fits-all solution, future legislation should focus on one minimum standard: that AI companies need content creators’ consent to use their work as training data.

“Not everyone wants the same thing, but as long as you have that step of consent, then I think that’s a really important thing,” he says.

A musician’s mind

Speaking to Newton-Rex, it quickly becomes clear that he’s coming from this debate not just as a technologist, but as a passionate musician himself.

“I kind of hate thinking about musical material as training data because it dehumanises people’s creative output. But ultimately, for an AI company, you have to see training data as a resource,” he says. “Just like GPUs are a resource, you strike the deals you need to, to get that resource.”

There’s also some irony to the fact that Newton-Rex is now trying to tame a beast he himself played a large part in creating. In 2010, he founded Jukedeck, one of the first AI-powered music generation systems, which he sold to TikTok parent company ByteDance in 2019.

Much has changed since Newton-Rex began coding Jukedeck’s technology, which generated sheet music rather than raw audio.

“The technology has fundamentally changed. We were creating symbolic music. So we were just using AI to create the notes and the chords on a page, essentially. And then we used automatic production efforts to turn that into audio,” he says. 

“Because of various innovations and because people are now chucking loads more GPUs and training data at these things, you can just go straight to the audio generation step, which makes these much more powerful tools.”
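To make that shift concrete, here is a minimal Python sketch (not code from Jukedeck or Stable Audio; the note values and sine-wave rendering step are illustrative assumptions) contrasting the two representations Newton-Rex describes: a symbolic system outputs discrete note events that a separate production step turns into sound, while modern generative audio models output waveform samples directly.

```python
# Minimal sketch of the symbolic-vs-raw-audio distinction. This is NOT code
# from Jukedeck or Stable Audio; the notes and rendering are illustrative.
import numpy as np

# Symbolic music: discrete note events (the kind of output early systems
# produced), which a separate "production" step must turn into sound.
symbolic_phrase = [
    {"pitch": "C4", "start_beat": 0.0, "duration_beats": 1.0},
    {"pitch": "E4", "start_beat": 1.0, "duration_beats": 1.0},
    {"pitch": "G4", "start_beat": 2.0, "duration_beats": 2.0},
]

SAMPLE_RATE = 44_100                                     # samples per second
NOTE_FREQS = {"C4": 261.63, "E4": 329.63, "G4": 392.00}  # pitch name -> Hz

def render_note(pitch: str, duration_s: float) -> np.ndarray:
    """Render one note as a sine wave: a toy stand-in for the production step."""
    t = np.linspace(0.0, duration_s, int(SAMPLE_RATE * duration_s), endpoint=False)
    return 0.3 * np.sin(2 * np.pi * NOTE_FREQS[pitch] * t)

# Raw audio: the same phrase as waveform samples, the representation that
# modern generative audio models produce end to end, with no symbolic step.
# The notes here are consecutive, so simple concatenation works.
seconds_per_beat = 0.5  # 120 BPM
audio = np.concatenate(
    [render_note(n["pitch"], n["duration_beats"] * seconds_per_beat)
     for n in symbolic_phrase]
)
print(f"{len(symbolic_phrase)} note events -> {audio.shape[0]} audio samples")
```

The toy numbers show why the leap matters: three note events become 88,200 audio samples, which is why generating audio directly demands the extra GPUs and training data Newton-Rex mentions.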

Newton-Rex says he’s not yet sure what lies next for him, but he clearly remains passionate about how cutting-edge AI technology can be a genuinely useful tool for creatives.

“With Stable Audio, one of the big use cases for musicians has been people creating samples to use in their own music,” he says. “You know, entering a text prompt, creating a drum loop, or some crazy-sounding synth, or whatever it is, and using that in their own music.”

But now that the technology has got good enough to be useful to creative players, Newton-Rex is laser-focused on ensuring we set the right rules of the game.

Tim Smith

Tim Smith is news editor at Sifted. He covers deeptech and AI, and produces Startup Europe — The Sifted Podcast. Follow him on X and LinkedIn.