THE LENS
Digital developments in focus

Am I hallucinating? ICO consults on accuracy in GenAI

The ICO has published the latest chapter in its consultation series on generative AI (GenAI), this one focusing on accuracy. As we know, accuracy is a key tenet of the GDPR, which requires organisations to correct inaccurate information in some circumstances, and this is no different in the context of AI. However, the ICO helpfully explains that the GDPR accuracy principle is not the same as statistical accuracy, which refers to the accuracy of the AI system itself. This means that the outputs of GenAI do not have to be 100% statistically accurate.

Purpose determines importance of accuracy

The ICO explains that the appropriate level of statistical accuracy depends on how the model is used. For example, if the AI is used to develop storylines for gaming, the outputs do not have to be accurate per se. However, this is not so if the AI model is used to make decisions about people. In those cases, the ICO views accuracy as a central principle.

To protect themselves from claims of inaccuracy, developers will need to be clear about what usage is and is not permitted. Where the developer does not know the purpose, providing users with information on statistical accuracy will be particularly important. Developers should also assess and communicate the risk and impact of “hallucinations”, i.e. incorrect and unexpected outputs.

Accuracy of training data is key

It is important that developers and deployers consider the impact of training data on the accuracy of outputs. If inaccurate inputs contribute to inaccurate outputs, the ICO considers it likely that neither the developer nor the deployer will be complying with the accuracy principle, although the purpose still needs to be considered before reaching that conclusion. To avoid falling foul of the accuracy principle, a developer with a particular purpose in mind for the AI will need to curate the training data accordingly, so that it is sufficiently accurate for that purpose.

Complaint based on inaccurate output

A complaint based on an inaccurate output has already been made to the Austrian DPA. It arose after ChatGPT produced the wrong date of birth for NOYB’s founder, apparently as a hallucination. This type of inaccuracy is likely to be a common challenge with GenAI, which is why the ICO’s views on the importance of purpose are key. Was the date of birth being used to price life assurance, or for a school quiz, for example? The implications of the inaccuracy are very different in each case. It will be interesting to see what approach the Austrian DPA takes to this complaint and the extent to which it aligns with the ICO's views in this consultation.

Linkage to rights to rectification and deletion

The ICO notes that the accuracy principle is closely linked to the rights to rectification and deletion of inaccurate data. This is another challenge for GenAI: when OpenAI was asked to rectify or erase the inaccurate data from ChatGPT, it reportedly stated that this was technically impossible. The ICO has promised that a future chapter will consult on these and other data subject rights.

Next steps

The ICO consultation closes on 10 May. AI developers and users should monitor its outcome, as well as the views of the Austrian and other EU DPAs, and ensure that accuracy is appropriately factored into their development and use of AI.

Tags

dp, ai, data