NPA 2024: Can AI and Data Protection Coexist? 

The Nordic Privacy Arena (NPA) is Scandinavia’s largest annual gathering on data protection. Last week, the 9th edition of the NPA took place from September 30 to October 1 at Münchenbryggeriet in Stockholm, with 25 agenda items and participants attending both in person and online.

This is the first in a three-part series on notable presentations and discussions centred on AI, a topic that has become increasingly important with the EU’s AI Act and rapid technological advances.

Deep Dive into Legal Challenges

In her insightful keynote speech, Liane Colonna from the Swedish Research Institute of Law and Informatics explored the complex interplay between artificial intelligence (AI) and data protection laws. Drawing on her extensive research, including her dissertation on the legal aspects of data mining, Colonna highlighted how AI technologies challenge the core data protection principles of the General Data Protection Regulation (GDPR).

Blurring the Lines Between Personal and Non-Personal Data

Colonna began by discussing how AI blurs the distinction between personal and non-personal data, since it can draw correlations and inferences from large collections of data. Large language models (LLMs), such as those used for chatbots and virtual assistants, are trained on massive amounts of text and code from the internet, including books, articles, personal opinions and even unique writing styles. Although providers often claim to anonymise user data, research shows that it is still possible to re-identify individuals.
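To see why stripping names rarely amounts to anonymisation, consider the following toy linkage attack (a constructed illustration, not an example from the talk; all names and values are made up). Joining a supposedly anonymised table with publicly available auxiliary data on shared quasi-identifiers re-attaches identities to the records.

```python
import pandas as pd

# "Anonymised" dataset: direct identifiers removed, quasi-identifiers kept.
anonymised = pd.DataFrame({
    "zip": ["11120", "11120", "41301"],
    "birth_year": [1985, 1991, 1985],
    "diagnosis": ["asthma", "diabetes", "flu"],
})

# Public auxiliary data, e.g. a voter roll or a social-media profile.
public = pd.DataFrame({
    "name": ["Anna", "Bo"],
    "zip": ["11120", "41301"],
    "birth_year": [1985, 1985],
})

# Joining on the quasi-identifiers re-identifies two of the three rows.
reidentified = anonymised.merge(public, on=["zip", "birth_year"])
print(reidentified)  # Anna's asthma and Bo's flu are now exposed
```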

She pointed out that techniques such as differential privacy and secure multiparty computation offer some hope for protecting user data, but they are technically difficult to implement at scale. There is also a trade-off between data utility and privacy: the more personal data you withhold from the AI, the less effective the model may become. For example, if an AI tool is designed to assist people with low vision or blindness, it may need to process personal names to provide accurate answers.
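As a concrete sketch of that trade-off, the snippet below applies the Laplace mechanism, the textbook building block of differential privacy, to a simple counting query. The data and epsilon values are illustrative assumptions, not drawn from any system discussed at the conference.

```python
import numpy as np

def dp_count(records, epsilon):
    """Answer "how many records?" with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the answer by at most 1), so Laplace noise with scale
    1 / epsilon suffices.
    """
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return len(records) + noise

records = [f"user_{i}" for i in range(1000)]  # synthetic stand-in data
for eps in (0.1, 1.0, 10.0):
    # Smaller epsilon -> stronger privacy but noisier, less useful answers.
    print(f"epsilon={eps}: count ~ {dp_count(records, eps):.1f}")
```

The loop makes the tension visible: at epsilon = 0.1 the reported count is typically off by ten or more records, while at epsilon = 10 it is nearly exact but offers little privacy.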

The Scope of the GDPR and Household Exemptions

An interesting legal issue raised by Colonna concerns the GDPR’s household exemption, which applies when individuals process personal data exclusively for personal or household activities. She raised the question of whether the creation of synthetic AI content, such as deepfakes for private use, falls under this exemption. For example, if someone uses private photos to create a deepfake video of their ex-partner, how should the GDPR treat this scenario? Her example highlights the challenges of applying existing laws to new technological contexts.

Challenging Core Data Protection Principles

Colonna dived deeper into how AI technologies challenge specific data protection principles:

  1. Lawfulness Principle: When training AI models like ChatGPT, the only plausible legal basis under the GDPR is the “legitimate interest” of the data controller. However, it is debatable whether the commercial interest in training such models outweighs the rights of the individuals whose data is used. The issue becomes even more complex when sensitive data under Article 9 of the GDPR is processed.
  2. Purpose Limitation: AI systems often cannot determine the exact purpose of data processing in advance, as it is in the nature of AI to extract unforeseen patterns and insights from data. This unpredictability conflicts with the GDPR’s requirement that data controllers specify the purpose of processing in advance and restrict the use of the data accordingly.
  3. Data Minimisation: AI models require huge amounts of data to function effectively. This directly conflicts with the principle of data minimisation, which requires that only data necessary for the stated purpose be processed. The tension is particularly evident in general-purpose AI systems, whose potential applications are broad and not fully defined during development.
  4. Accuracy Principle: AI outputs can be inaccurate or misleading. Colonna pointed to concerns that AI systems produce “hallucinations”, or false information, which poses significant risks when such systems are used to produce medical summaries or legal advice. In addition, training AI models on their own outputs could compound these inaccuracies.
  5. Right to Erasure: Implementing the “right to be forgotten” in AI systems is a technical challenge. Deleting personal data from AI models is not straightforward because of the complicated way in which data is embedded in the model during training. She said effective methods for “machine unlearning” still need to be developed (a minimal sketch of the problem follows this list).
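To make the unlearning problem concrete, here is a minimal, hypothetical sketch (not from the talk) using scikit-learn. A trained model’s weights encode the influence of every training record, so there is no single value to delete; the only exact remedy is retraining without the erased rows, which is the expensive baseline that machine-unlearning research tries to improve on for large models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for personal records (all values synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# An erasure request arrives for records 10-19. Their influence is spread
# across every learned weight, so nothing can simply be "deleted" in place.
erase = np.arange(10, 20)
keep = np.setdiff1d(np.arange(len(X)), erase)

# Exact unlearning baseline: retrain from scratch on the remaining data.
# Trivial here, but prohibitively expensive for a large language model.
unlearned_model = LogisticRegression().fit(X[keep], y[keep])
```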

Reconciling AI with Data Protection Rules

Liane Colonna emphasised the importance of collaboration between legal experts and technologists to address these challenges. While concepts like “data protection by design” are promising, they are difficult to operationalise. She cautioned policymakers about the complexities involved in embedding legal rules into technological systems.

She also noted a shift in regulatory approaches, citing the EU’s AI Act as an example. Unlike the GDPR’s rights-based framework, the AI Act adopts a social protection model, focusing more on consumer protection and risks to society than on individual rights.