9/25/2023 4:48:14 PM | 4 minute read

Training Generative AI: Update in Progress

Richard Barker

Senior Knowledge Lawyer

As AI continues to develop, and more and more people become familiar with generative AI and its potential uses, the debate over how to balance the rights of AI developers and IP rightsholders rages on. With players on both sides of the debate, as well as governments across the world, holding differing views on the best way forward, it is proving a difficult nut to crack. Whilst we wait for the UKIPO to publish its much-anticipated Code of Practice on Copyright and Artificial Intelligence, we take a look below at some recent updates that have caught our attention.

But first, a quick reminder of what this debate is all about…

In order to train an AI system, huge amounts of data need to be fed into it. Perhaps unsurprisingly, the data and information used is often protected by copyright. For example, it might consist of things like artwork, books, music or photographs. Whilst the exact training process may vary from AI system to AI system, one consistent theme across those processes is that they tend to involve the creation of a copy (whether transient or permanent) of the underlying data. As a result, if the data used is unlicensed and contains copyright-protected works, there is necessarily a risk of infringement by copying (unless an exception applies).

As we reported in June last year, following its consultation on AI and IP, the UKIPO announced its intention to introduce a new exception for copyright and database rights which would allow text and data mining (“TDM”) for any purpose, with no ability for rights holders to opt out or contract out. The intention behind this was to allow AI developers to continue to use third party data to train their AI systems without infringing third party IP. However, following significant backlash from those in the creative industries, the Government sought to distance itself from those plans. In its place, in March 2023, the UK Government announced that the UKIPO would be producing a new code of practice on copyright and AI (see our previous blogs here and here). Publication was promised “by the end of the Summer”, but nothing has materialised yet.

UK Latest

On 30th August, the House of Commons Culture, Media and Sport Committee published a report entitled “Connected tech: AI and creative technology”, which is highly critical of the Government’s initial handling of this issue. In its opinion, the current framework – which includes an exception for TDM which is carried out for non-commercial research purposes, alongside the ability for rightsholders to license their works to third parties – already achieves the right balance between rightsholders and AI developers. It argues that expanding the TDM exception would “risk reducing arts and cultural production as mere “inputs” in AI development” and that the Government “should support the continuance of a strong copyright regime in the UK and be clear that licences are required to use copyrighted content in AI”, whilst ensuring that creators are appropriately rewarded.

A day later, on 31st August, the House of Commons Science, Innovation and Technology Committee published an interim report on the governance of AI. Amongst other things, this report sets out 12 challenges that the Committee believes governance of AI must meet, one of which (Challenge 8) relates to IP and copyright – “Some AI models and tools make use of other people’s content: policy must establish the rights of the originators of this content, and these rights must be enforced”. Whilst no solutions were proposed in this report, it does note that some of those in the creative industries appear to favour the establishment of some form of licensing framework for using copyright works to train AI.

EU Latest

In the EU, Spain (which currently holds the Presidency of the EU Council) has reportedly raised concerns about the EU Parliament’s proposals to include in the EU AI Act certain transparency obligations on providers of generative AI – including to disclose details of copyright-protected material used to train their systems (see our earlier blog post). Perhaps unsurprisingly, Spain has suggested that these may be too onerous and difficult to comply with in practice, due to (i) the significant amount of data involved in training AI systems and (ii) the difficulties involved in separating out copyright-protected data from unprotected data. It remains to be seen whether those provisions will survive the trilogue negotiations.

What have the industry players themselves been up to?

It’s not just government institutions that are getting in on the act, however. Whilst governments weigh up their options, the industry players continue to make their own moves.

Various rightsholders continue to assert their position by bringing IP infringement claims against generative AI developers, with several actions being brought in the US and the UK (see e.g. here) and new actions being filed on a regular basis (particularly in the US). Others are seeking to drive the agenda through publishing their own guidance and principles, with a recent example being the publication of a set of “Global Principles on AI” by a number of organisations in the publishing and journalism industry.

Meanwhile, on the developer side, some organisations are seeking to allay their customers’ concerns by offering to take on the risk of IP infringements themselves. Microsoft, for example, has announced that it will extend its IP indemnity support to cover users of its commercial Copilot services (including Microsoft 365 Copilot and GitHub Copilot), provided certain guardrails and filters are used – “if a third party sues a commercial customer for copyright infringement for using Microsoft’s Copilots or the output they generate, we will defend the customer and pay the amount of any adverse judgments or settlements that result from the lawsuit, as long as the customer used the guardrails and content filters we have built into our products”.

What else can we expect in 2023?

Whilst the end of the year is rapidly approaching, we remain hopeful that we will have both the UKIPO’s Code of Practice on Copyright and AI and the final form of the EU AI Act before it arrives.

For more information on the risks and opportunities around AI, explore the different publications and podcasts from our Regulating AI series.