Opinion

November 29, 2023

Mistral, Aleph Alpha and Big Tech’s lobbying on AI safety will hurt startups

For a startup to meet safety standards under the EU AI Act, they'll need information and safety guarantees from general-purpose AI model developers. They're not getting it


Jaan Tallinn and Risto Uuk

5 min read

Jaan Tallinn and Risto Uuk

As negotiations on the AI Act unfold, there have been efforts to replace hard regulation for general-purpose AI models with soft codes of conduct. 

France, Germany and Italy have recently written a joint non-paper on the regulation of such models, advocating for "mandatory self-regulation through codes of conduct".

This may be to the benefit of big tech companies — and perhaps, the French Mistral AI and German Aleph Alpha —but it will force European startups, such as downstream general-purpose AI model developers and deployers, to pick up the tab for legal and compliance costs.

Advertisement

Innovative companies like Be My Eyes, a Danish startup which leveraged GPT-4 to build an app helping the visually impaired navigate the world, rely on general-purpose AI models.

It is crucial that they know those models are safe and that they are not exposing themselves to unacceptable levels of regulatory and liability risk. 

If a European startup has to meet safety standards of general-purpose AI models under the AI Act, they will only want to buy models from companies that can assure them that the final product will be safe.

But the information and guarantees that they need are not being offered.

All of this means that European startups have unsafe services that they will be asked to make safe under the AI Act, with limited resources to do so.

Security concerns

Recent incidents highlight valid concerns about integrating general-purpose AI models into downstream AI systems without adequate information and safety assurances.

OpenAI's ChatGPT, for example, leaked the conversation histories of random users.

Université du Québec tested ChatGPT and discovered that it produces software code that falls well below minimal security standards. ChatGPT managed to generate just five secure programs out of 21 first attempts.

Finally, Samsung employees in its semiconductor arm unwittingly leaked secret data to ChatGPT while using it to help them fix problems with their source code. The workers inputted confidential data, such as the source code for a new program as well as internal meeting notes related to their hardware.

And current research only emphasises this stark reality.

A recent report by the Centre for European Policy Studies (CEPS) describes one of the most common business models: one entity writes the code, trains the system and then sells access through a branded application or API to another entity (OpenAI develops GPT-4 and provides API access to a European startup). 

Accessing a general-purpose AI model through an API dramatically limits the extent to which the deployer accessing it can understand and examine the model.

A European startup is unlikely to be able to effectively interrogate the general-purpose AI model, such as through red teaming, adversarial training, generating important or context-specific evaluation metrics, or altering the model in any way, with the possible exception of fine-tuning. 

Advertisement

These are technical interventions that an original general-purpose AI model developer is often best-suited to do — identify vulnerabilities and weaknesses of a model by testing its robustness and security; train an AI model with deliberately manipulated or adversarial data to improve its resilience; and create specific criteria to evaluate the performance of an AI model in a particular context. 

New data

When an SME adjusts a pre-trained AI model on new data specific to a particular task, then what data they use is under their control. Considering this, the European DIGITAL SME Alliance has publicly called for the fair allocation of responsibility in the value chain.

Stanford University researchers have evaluated general-purpose AI model providers and found that they rarely disclose adequate information regarding the data, compute and deployment of their models, or the key characteristics of those models themselves.

Arguably, the biggest issue this study revealed is that while there are plenty of risks — including malicious use, unintentional harm and structural or systemic risk — few providers disclose the mitigations they implement or the efficacy of these mitigations. 

Equally, none of the providers explain why some risks cannot be mitigated.

Finally, they rarely measure models’ performance in terms of intentional harms, such as malicious use or factors like robustness and calibration. 

Cost-prohibitive

General-purpose AI models distributed via more open approaches don’t fare any better. Open source developers may argue that they cannot provide safety guarantees because they don’t yet know how to make these models safe, and that by simply providing their entire models openly, they fulfil their obligations to act responsibly. 

Yet, providing this information to other developers is not sufficient.

If certain models cannot demonstrate basic safety and risk mitigation, then they may not be ready for release. In this environment, European startups will need to put significant effort into conducting regular due diligence exercises to identify the risks to which developers expose them. This will likely be completely cost-prohibitive.

Regulation has the profound opportunity to not only address legal and safety implications, but also to nurture trust and resilience within the AI value chain.

Furthermore, the companies developing general-purpose AI have made it clear in their restrictive Terms of Use that they will not guarantee the quality or reliability of their systems. In fact, some of them, such as Anthropic, require that businesses accessing their models through APIs indemnify them against essentially any claim related to that access, even if the business did not breach Anthropic's Terms of Service.

Clearly, startups have not managed to incentivise much larger companies developing foundation models to warrant the quality and reliability of their models, likely because they lack bargaining power.

We hope that the AI Act negotiations lead to a harmonious coexistence of innovation and safeguards. In the ever-shifting landscape of technology, regulation has the profound opportunity to not only address legal and safety implications, but also to nurture trust and resilience within the AI value chain.

Such has been the goal of the AI Act from the beginning. This will then not only increase the uptake by European startups, but by end-users across Europe as well.

Jaan Tallinn

Jaan Tallinn is a founding engineer of Skype and Kazaa. He is also a partner at Ambient Sound Investments (asi.ee), an active angel investor, and has served on the Estonian President’s Academic Advisory Board. He is also a philanthropist in the field of existential risk.

Risto Uuk

Risto Uuk is the EU Research Lead at the Future of Life Institute, leading research efforts on EU AI policy as well as hosting the AI Act website and running the AI Act newsletter. Previously, he has worked for the World Economic Forum and European Commission on trustworthy AI projects.