
The European Data Protection Board (EDPB) has released a comprehensive opinion on the intersection of artificial intelligence and privacy law. The document explores how developers can use personal data to build and deploy AI models, such as large language models (LLMs), while adhering to strict privacy regulations. The EDPB's guidance is expected to shape regulatory enforcement and aims to clarify complex issues such as model anonymity, the legal bases for processing personal data, and the handling of unlawfully trained models. The opinion also underscores the importance of balancing innovation with stringent data protection standards.
Ensuring Model Anonymity in AI Development
The EDPB emphasizes that achieving true anonymity in AI models is essential for ensuring compliance with privacy laws. According to the board, this requires a thorough, case-by-case assessment to determine whether an AI model poses a very low risk of identifying individuals or of disclosing personal data through user queries. Developers must implement various methods during the design and development phases to minimize identifiability risks, including selective data sourcing, data minimization techniques, and robust methodological choices. These measures are critical to demonstrating that the model does not process personal data and therefore falls outside the scope of the GDPR.
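As a concrete illustration of the data minimization and filtering measures mentioned above, the sketch below redacts two common identifier types from text before it enters a training corpus. The patterns and placeholder tokens are illustrative choices, not part of the EDPB opinion; production pipelines would rely on dedicated PII-detection tooling with far broader coverage.

```python
import re

# Illustrative patterns only; real pipelines use dedicated
# PII-detection tools covering many more identifier types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def redact_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with
    placeholder tokens before the text joins a training corpus."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

print(redact_pii("Contact alice@example.com or +44 20 7946 0958."))
# → Contact [EMAIL] or [PHONE].
```

Filtering of this kind addresses only direct identifiers; the EDPB's "very low risk" threshold also depends on the training method and on query-time safeguards.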
To achieve model anonymity, developers need to consider multiple strategies. Selective data sourcing involves avoiding or limiting the collection of personal data by excluding inappropriate sources. During the data preparation phase, data minimization and filtering steps can further reduce the risk of identifiability. Additionally, robust methodological choices, such as using regularization methods to improve model generalization and applying privacy-preserving techniques like differential privacy, can significantly lower the likelihood of re-identification. Incorporating measures that prevent users from obtaining personal data through queries adds another layer of protection. The EDPB stresses that only when these efforts collectively ensure a very low risk of re-identification can a model be considered truly anonymous and thus exempt from GDPR requirements.
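One of the privacy-preserving techniques named above, differential privacy, can be illustrated with the classic Laplace mechanism. The sketch below is a minimal example under stated assumptions (a counting query, which has sensitivity 1), not a recipe for training-time guarantees such as DP-SGD:

```python
import random

def laplace_sample(scale: float) -> float:
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def dp_count(records, predicate, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query changes by at most 1 when a single record is
    added or removed (sensitivity 1), so Laplace noise with scale
    1/epsilon is sufficient."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_sample(1.0 / epsilon)
```

Smaller epsilon means more noise and a stronger privacy guarantee. For LLM training itself, comparable guarantees require mechanisms applied during training (e.g. noisy gradient updates) rather than per-query noise.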
Evaluating Legal Bases for AI Development and Deployment
The EDPB also delves into the appropriate legal bases for processing personal data in AI development and deployment. While the GDPR offers limited options, the "legitimate interest" basis emerges as a potential solution. This approach does not require individual consent, making it more commercially viable for large-scale AI projects. However, controllers must conduct detailed assessments, subject to case-by-case scrutiny by regulators, to determine whether legitimate interest is suitable, considering factors such as the purpose, necessity, and impact on individual rights. The EDPB outlines a three-step test: identifying a lawful, clearly articulated interest; confirming that the processing is necessary to pursue it; and balancing that interest against the rights and freedoms of data subjects.
The first two steps evaluate the purpose and necessity of processing personal data, ensuring that the intended outcome cannot be achieved through less intrusive alternatives. The third step is a balancing test that weighs the impact on individual rights against the controller's interests. Special attention is given to the proportionality of the data processed relative to the goal, in line with the GDPR's data minimization principle, and the reasonable expectations of data subjects play a crucial role in this assessment. If the balancing test favors individuals' interests, mitigation measures may be implemented to limit the impact on their rights; these could include technical safeguards, pseudonymization, masking of personal data, opt-out mechanisms, and enhanced transparency. The EDPB acknowledges that each case is unique, requiring tailored approaches that ensure compliance with privacy law while fostering innovation in AI technology.
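To make the sequential nature of the test concrete, the hypothetical sketch below encodes the three steps as short-circuiting checks, with mitigation measures able to tip an initially unfavorable balance. All names are illustrative; the real assessment is a documented legal analysis, not a boolean function:

```python
from dataclasses import dataclass

@dataclass
class LegitimateInterestAssessment:
    # Step 1: the interest is lawful, clearly articulated, and real.
    interest_is_valid: bool
    # Step 2: no less intrusive means would achieve the same outcome.
    processing_is_necessary: bool
    # Step 3: balancing of controller interests vs. data-subject rights.
    balance_favours_controller: bool
    # Mitigations (pseudonymization, opt-outs, transparency) that
    # may tip an unfavourable balance back.
    mitigations_applied: bool = False

def legitimate_interest_available(a: LegitimateInterestAssessment) -> bool:
    if not (a.interest_is_valid and a.processing_is_necessary):
        return False  # steps 1 and 2 are hard prerequisites
    return a.balance_favours_controller or a.mitigations_applied
```

The point the structure captures is that mitigation measures matter only at the balancing stage: they cannot repair a missing interest or an unnecessary processing operation.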
