
ICO Consultation Series on Gen AI: Second Call

The ICO has launched its second call for evidence as part of its consultation series on data privacy and generative AI.

The ICO’s first call for evidence covered the lawful basis for training generative AI models on web-scraped data (see our earlier blog). The second call covers the application of the purpose limitation principle to different stages of the generative AI lifecycle. In particular, it looks at the compatibility of re-using training data, the fact that foundation models can have many purposes, and the importance of defining a specific purpose.

AI Model Lifecycle

The ICO notes that the purpose limitation principle requires organisations to be clear and transparent about why they process personal data and to ensure their activities meet people’s reasonable expectations. It highlights that it is therefore necessary for AI developers to identify a specific purpose at each stage in order to evaluate the model’s data privacy compliance, not just with the purpose limitation principle but with wider data privacy requirements too.

Training Data | Compatibility

Given that training data is expensive and can be challenging to collate, the ICO notes that developers may want to re-use an existing dataset rather than collect new data. The question, however, is whether the purpose of training the new model is compatible with the original purpose for which the data was collected.

From a data privacy perspective, it is important to assess the reasonable expectations of the people whose data is being re-used (a compatibility assessment). This can be difficult if there isn’t any direct contact with the relevant individuals, but public messaging and prominent privacy information may help to mitigate this risk. If the further processing is not compatible with the original purpose, the developer will need to establish a new and separate purpose or anonymise the data.

Underlying (foundation) models have many purposes

AI applications such as chatbots and virtual assistants can all rely on one underlying model that acts as a foundation. After the initial training of the generative AI model, an application is built on it (or on a revised version of it), enabling deployment for more practical or commercial use.

The ICO is of the view that developing a generative AI model and developing an application based on such a model (fine-tuned or not) constitute different purposes. This is also distinguishable from collating repositories of web-scraped data, which the ICO considers another separate purpose.

Defining a purpose

Whilst the ICO acknowledges that it may be harder to define purposes precisely in the earlier stages of the generative AI lifecycle, such as the initial data collection, it notes that the purpose must be detailed and specific enough that all relevant parties have a clear understanding of why and how the personal data is used. The ICO therefore expects that defining a purpose at the initial stages of the generative AI lifecycle includes considering what types of deployment the model could result in and what functionality it will have.

Conclusion

Whilst requiring the developer to consider the purpose at each stage of the AI lifecycle makes sense from a technical privacy compliance perspective, it is questionable how realistic this is in practice given that the possible use cases may well be endless. It is also not clear whether, if a previously unexpected use is made of the generative AI model, this has any impact on the privacy compliance assessment at earlier stages. The consultation closes on 12 April 2024 and it will be interesting to see where the ICO ultimately lands on this.

Organisations will be better able to comply with data protection law and maintain public trust if they give careful consideration to the difference between developing the generative AI model and developing the application based on it, and are clear about what types of data are used, and how, in each case.

Tags

ai, data, digital regulation