As negotiations on the AI Act unfold, there have been efforts to replace hard regulation for general-purpose AI models with soft codes of conduct.
France, Germany and Italy have recently written a joint non-paper on the regulation of such models, advocating for "mandatory self-regulation through codes of conduct".
This may be to the benefit of big tech companies — and perhaps, the French Mistral AI and German Aleph Alpha —but it will force European startups, such as downstream general-purpose AI model developers and deployers, to pick up the tab for legal and compliance costs.
Innovative companies like Be My Eyes, a Danish startup which leveraged GPT-4 to build an app helping the visually impaired navigate the world, rely on general-purpose AI models.
It is crucial that they know those models are safe and that they are not exposing themselves to unacceptable levels of regulatory and liability risk.
If a European startup has to meet safety standards of general-purpose AI models under the AI Act, they will only want to buy models from companies that can assure them that the final product will be safe.
But the information and guarantees that they need are not being offered.
All of this means that European startups have unsafe services that they will be asked to make safe under the AI Act, with limited resources to do so.
Recent incidents highlight valid concerns about integrating general-purpose AI models into downstream AI systems without adequate information and safety assurances.
OpenAI's ChatGPT, for example, leaked the conversation histories of random users.
Université du Québec tested ChatGPT and discovered that it produces software code that falls well below minimal security standards. ChatGPT managed to generate just five secure programs out of 21 first attempts.
Finally, Samsung employees in its semiconductor arm unwittingly leaked secret data to ChatGPT while using it to help them fix problems with their source code. The workers inputted confidential data, such as the source code for a new program as well as internal meeting notes related to their hardware.
And current research only emphasises this stark reality.
A recent report by the Centre for European Policy Studies (CEPS) describes one of the most common business models: one entity writes the code, trains the system and then sells access through a branded application or API to another entity (OpenAI develops GPT-4 and provides API access to a European startup).
Accessing a general-purpose AI model through an API dramatically limits the extent to which the deployer accessing it can understand and examine the model.
A European startup is unlikely to be able to effectively interrogate the general-purpose AI model, such as through red teaming, adversarial training, generating important or context-specific evaluation metrics, or altering the model in any way, with the possible exception of fine-tuning.
These are technical interventions that an original general-purpose AI model developer is often best-suited to do — identify vulnerabilities and weaknesses of a model by testing its robustness and security; train an AI model with deliberately manipulated or adversarial data to improve its resilience; and create specific criteria to evaluate the performance of an AI model in a particular context.
When an SME adjusts a pre-trained AI model on new data specific to a particular task, then what data they use is under their control. Considering this, the European DIGITAL SME Alliance has publicly called for the fair allocation of responsibility in the value chain.
Stanford University researchers have evaluated general-purpose AI model providers and found that they rarely disclose adequate information regarding the data, compute and deployment of their models, or the key characteristics of those models themselves.
Arguably, the biggest issue this study revealed is that while there are plenty of risks — including malicious use, unintentional harm and structural or systemic risk — few providers disclose the mitigations they implement or the efficacy of these mitigations.
Equally, none of the providers explain why some risks cannot be mitigated.
Finally, they rarely measure models’ performance in terms of intentional harms, such as malicious use or factors like robustness and calibration.
General-purpose AI models distributed via more open approaches don’t fare any better. Open source developers may argue that they cannot provide safety guarantees because they don’t yet know how to make these models safe, and that by simply providing their entire models openly, they fulfil their obligations to act responsibly.
Yet, providing this information to other developers is not sufficient.
If certain models cannot demonstrate basic safety and risk mitigation, then they may not be ready for release. In this environment, European startups will need to put significant effort into conducting regular due diligence exercises to identify the risks to which developers expose them. This will likely be completely cost-prohibitive.
Regulation has the profound opportunity to not only address legal and safety implications, but also to nurture trust and resilience within the AI value chain.
Clearly, startups have not managed to incentivise much larger companies developing foundation models to warrant the quality and reliability of their models, likely because they lack bargaining power.
We hope that the AI Act negotiations lead to a harmonious coexistence of innovation and safeguards. In the ever-shifting landscape of technology, regulation has the profound opportunity to not only address legal and safety implications, but also to nurture trust and resilience within the AI value chain.
Such has been the goal of the AI Act from the beginning. This will then not only increase the uptake by European startups, but by end-users across Europe as well.