EDPB Opinion on AI Models and Data Protection Compliance
Wednesday, 05 February 2025
On 17 December 2024, the European Data Protection Board (“EDPB”), following a first-of-its-kind request from the Data Protection Commission (“DPC”) under Article 64(2) of the GDPR, published Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models (the “Opinion”).
The Opinion focuses on data protection compliance requirements arising from the training/development and use/deployment of AI models.
The Opinion was welcomed by the DPC, which issued a press release on its publication.
What does the Opinion cover?
The Opinion addresses the following four key queries raised by the DPC (the “Queries”):
(i) Under what circumstances may an AI Model be considered ‘anonymous’?
The EDPB recommends a case-by-case assessment of the anonymity of any AI model, with a focus on the following two criteria: (a) the likelihood of extracting personal data directly from the model must be insignificant, taking into account methods such as membership inference or model inversion attacks; and (b) queries submitted to the model must not yield identifiable personal data, even unintentionally.
The GDPR will not apply to an AI model that has been fully anonymised, but this remains a high threshold to reach under the GDPR. The Opinion reinforces this, providing that extensive record keeping, including AI risk assessments, Data Protection Impact Assessments ("DPIAs") and technical safeguards, is required to demonstrate claims of anonymity.
The Opinion specifically states that the elements in the WP29 Opinion 05/2014 on Anonymisation Techniques should be referred to when considering anonymity.
(ii) How controllers may demonstrate the appropriateness of legitimate interest as a legal basis for personal data processing to create, update and/or develop an AI Model;
The Opinion usefully endorses reliance on legitimate interests (under Article 6(1)(f) of the GDPR) as an appropriate lawful basis for personal data processing at all stages of the development and deployment of AI models, subject to stringent controls.
A legitimate interests assessment (“LIA”) must be carried out which (a) identifies a legitimate interest that is lawful, specific and demonstrable, such as fraud detection or cybersecurity enhancement; (b) satisfies the necessity test (i.e. the processing must be indispensable to achieving the legitimate interest, with no less intrusive alternatives available); and (c) satisfies the balancing test (i.e. the legitimate interests must not be overridden by the fundamental rights and freedoms of the data subjects).
In applying the balancing test at the development stage, businesses should take account of a data subject’s interest in self-determination and in maintaining control over their own data when assessing whether it is lawful to collect personal data for training AI models, and of what a data subject might reasonably expect in this regard.
The Opinion references a range of ‘mitigating measures’ that can be relied upon to reduce risks to data subjects, including (a) technical measures such as pseudonymisation and personal data masking; (b) measures that facilitate the exercise of data subject rights, such as allowing a reasonable period of time to elapse between the collection of training data and its use, or providing an unconditional right for data subjects to opt out of the use of their personal data for training or deploying the model; (c) transparency measures such as public communications and media campaigns in respect of the developer’s use of personal data for AI model development; and (d) web-scraping measures such as excluding certain data categories or sources and collecting data only from specified websites.
(iii) How controllers may demonstrate the appropriateness of legitimate interest as a legal basis for personal data processing to deploy an AI Model;
The above requirements apply equally to the deployment stage, where businesses should consider the impact on data subjects’ fundamental rights that arise from the purpose for which the AI model is used.
(iv) What are the consequences of an unlawful processing of personal data in the development phase of an AI model on the subsequent processing or operation of the AI model?
The Opinion highlights three specific scenarios involving the use of unlawfully processed personal data to train AI models and their regulatory consequences, as follows. The Opinion confirms that supervisory authorities can use corrective measures against controllers that unlawfully process personal data to train their AI models, including ordering the erasure of the part of the dataset that was processed unlawfully or, where proportionate, ordering the erasure of the whole dataset used to develop the AI model and/or the AI model itself.
Scenario 1: Personal data is retained in the AI model and is subsequently processed by the same controller (i.e. in both the development and deployment phases): The Opinion provides that a case-by-case assessment is required as to whether the lack of a legal basis for the initial processing activity impacts the lawfulness of the subsequent processing (the implication being that, if a supervisory authority orders the controller to delete the unlawfully processed personal data, that data could not then be further processed, and there is a risk of fines being imposed).
Scenario 2: Personal data is retained in the AI model and is processed by another controller in the context of the deployment of the model: The Opinion provides that any subsequent controller should take into account whether the original controller carried out an appropriate assessment as part of its accountability obligations under Article 5(1)(a) and Article 6 of the GDPR. Failure to assess the model’s development history, including the source of the data and whether the AI model infringes the GDPR, may lead to regulatory sanctions.
Scenario 3: A controller unlawfully processes personal data to develop the AI model and then ensures that it is anonymised before the same or another controller uses the AI model: If it can be demonstrated that the subsequent operation of the AI model does not entail the processing of personal data, then the GDPR does not apply and the unlawfulness of the initial processing should not affect the subsequent operation of the model. This scenario may well be intended to preserve the marketability of existing AI models which (in recognition of how AI models have been developed and deployed to date) would otherwise be unlawful, but may be remedied through effective anonymisation.
What is excluded from the Opinion?
Notably, the Opinion covers ‘certain’ data protection aspects arising from the Queries, but does not extend to the compatibility principle under Article 6(4) of the GDPR (i.e. further processing of personal data for a purpose other than that for which it was originally collected), the processing of special category data (in light of the specific conditions which attach to such processing under Article 9 of the GDPR), automated decision-making restrictions under Article 22 of the GDPR (including profiling), data protection by design and by default requirements (Article 25 of the GDPR) or DPIAs (Article 35 of the GDPR).
What does the Opinion mean for my business?
- In practical terms, where contemplating the use of personal data for the training/development and/or use/deployment of AI models, in-depth consideration and application of data protection laws will be required. This includes implementing mitigating measures such as data minimisation, anonymisation and pseudonymisation, privacy-preserving technologies, transparency measures and ‘web-scraping’ restrictions.
- Developers and users of AI models will need to be in a position to demonstrate compliance with data protection laws through DPIAs, LIAs, Fundamental Rights Impact Assessments and appropriate technical and organisational measures, as well as internal and external-facing governance structures, policies and procedures for the responsible use of AI.
- Forensic due diligence will be required of developers when sourcing personal data for AI training purposes and ensuring the ‘explainability’ of the AI model, and of deployers of AI models built using, or which process, personal data. A subsequent controller may be found not to have fulfilled its accountability obligations in the absence of an appropriate assessment of the data protection implications of deploying an AI model. This includes taking reasonable steps to assess the developer’s compliance with data protection laws during the training/development stage (such as the sources of the data and the data minimisation techniques used, as well as the provision of an LIA).
- “AI” due diligence will now form part of mainstream queries to be raised in relation to technology companies in any share or asset sale, with the expectation that the risk involved in using personal data for training AI may mean limited (if any) warranties and indemnification in this area and the transfer of AI products and services on an “as-is” basis.
Conclusion/Commentary
The Opinion is an important starting point on the journey towards an aligned EU approach to the use of personal data in AI models, and the EDPB has indicated that further, ‘more specific’ guidance will follow. Greater clarity overall will be needed to ensure that compliance with data protection laws can be achieved, while allowing for, and not impeding, the significant technological innovation and benefits to society which may be brought about through the use of AI.
For further information please contact Data Protection Partners Zelda Deasy, Seán O'Donnell, or any member of the Byrne Wallace Shields LLP Data Protection/GDPR Team.