EDPB Opinion on ChatGPT and Generative AI

What data protection requirements arise from the EDPB analysis of ChatGPT and generative AI? Web scraping, legal basis for training, data subject rights, accuracy and age verification in focus.

11 February 2026 | 4 min read
EDPB, ChatGPT, Generative AI, Web Scraping, Data Subject Rights, GDPR

Overview

In the wake of the public discussion around generative AI systems such as ChatGPT, the European Data Protection Board (EDPB) initiated a coordinated review by national supervisory authorities. The aim was to clarify key data protection questions relating to Large Language Models (LLMs).

The opinion is particularly relevant for organisations, as it sets concrete benchmarks for:

  • Web scraping
  • Training data processing
  • Data subject rights
  • Accuracy of outputs
  • Age verification

This article summarises the key points and derives practical implications for AI projects.

1. Web Scraping as a Training Source

A central topic was the use of publicly accessible data from the internet.

Key Statement

Publicly accessible data remains personal data within the meaning of the GDPR.

This means:

  • A legal basis is required
  • Transparency obligations apply
  • Data subject rights must be ensured

Public Does Not Mean Freely Usable

Mere public accessibility does not justify unrestricted further processing.

The EDPB emphasises:

  • Each training phase constitutes a processing operation
  • Legitimate interest requires careful balancing
  • Sensitive data increases the requirements

Particularly for large-scale web scraping, the following must be assessed:

  • Reasonable expectations of data subjects
  • Intensity of interference
  • Safeguards in place

3. Data Subject Rights with LLMs

A central question in the review was how rights of access, erasure and rectification can be implemented for LLMs.

Challenge

LLMs do not store data in a traditional database format but as statistical representations.

The EDPB nevertheless requires:

  • Processes for reviewing individual requests
  • Mechanisms for data erasure or suppression
  • Transparent communication about technical limitations

Technical Limitations

Technical complexity does not automatically exempt controllers from their legal obligations.
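
As an illustration of a suppression mechanism, the following is a minimal Python sketch of an output-level filter that redacts names on a suppression list before a response is shown. It is an assumption about one possible approach, not a method described by the EDPB: the class `SuppressionList` and its methods are hypothetical, and the filter does not remove anything from the model itself.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SuppressionList:
    """Hypothetical registry of names whose data subjects requested suppression."""
    names: set[str] = field(default_factory=set)

    def add(self, name: str) -> None:
        # Normalise whitespace and case so entries are stored consistently.
        self.names.add(" ".join(name.split()).lower())

    def redact(self, text: str, placeholder: str = "[removed]") -> str:
        """Replace every listed name in `text`, ignoring case."""
        # Longest names first, so a full name wins over a shorter overlapping entry.
        for name in sorted(self.names, key=len, reverse=True):
            pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
            text = pattern.sub(placeholder, text)
        return text


suppressed = SuppressionList()
suppressed.add("Jane Doe")  # fictional example name
print(suppressed.redact("According to Jane Doe, the report is due."))
# prints: According to [removed], the report is due.
```

Exact string matching misses misspellings, nicknames and indirect references, so a filter like this would only be one safeguard among several and its limits would need to be communicated transparently.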

4. Accuracy of AI Outputs

Generative AI can:

  • Generate false statements
  • Falsely represent individuals
  • Reproduce incorrect facts

The EDPB points out that controllers must:

  • Take appropriate measures to ensure accuracy
  • Provide clear notices about possible errors
  • Offer correction mechanisms

5. Age Verification

Particular attention was given to:

  • Protection of minors
  • Access restrictions
  • Age-appropriate usage

For AI systems with broad accessibility:

  • Appropriate age verification must be implemented
  • Risks to children must be minimised

6. Delineation of Responsibilities

The EDPB emphasises precise role clarification:

Role            | Description
Model provider  | Training and model provision
Integrator      | Integration into own product
Deployer        | Concrete deployment

Each role carries its own responsibility.

7. Transparency Requirements

For generative AI, the following are required in particular:

  • Clear information about AI usage
  • Description of data categories
  • Notice of potential error-proneness
  • Explanation of logic at an understandable level

Practical Implications for Organisations

1. Review Training Data

  • Document provenance
  • Identify sensitive data
  • Verify legal basis
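
The three checks above could be captured in a simple provenance record. The following Python sketch is only an illustration under stated assumptions: `TrainingSource`, its fields and `review_findings` are hypothetical names, and the two rules it applies are a simplification of a real legal review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingSource:
    """Hypothetical provenance record for one training data source."""
    name: str
    origin_url: str
    legal_basis: Optional[str]       # e.g. "legitimate interest"; None if undetermined
    contains_sensitive_data: bool    # whether sensitive data was identified
    safeguards: tuple[str, ...] = () # documented safeguards, if any


def review_findings(source: TrainingSource) -> list[str]:
    """Return open findings for a source; an empty list means none were detected."""
    findings = []
    if not source.legal_basis:
        findings.append("no legal basis documented")
    if source.contains_sensitive_data and not source.safeguards:
        findings.append("sensitive data without documented safeguards")
    return findings


# Fictional example source used only for illustration.
forum_dump = TrainingSource(
    name="public forum crawl",
    origin_url="https://example.com/forum",
    legal_basis=None,
    contains_sensitive_data=True,
)
for finding in review_findings(forum_dump):
    print(f"{forum_dump.name}: {finding}")
```

An empty result only means that the record is complete, not that the processing is lawful; the legal assessment itself remains a manual step.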

2. Operationalise Data Subject Rights

  • Establish processes for handling access requests
  • Review erasure mechanisms
  • Define escalation paths
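
An escalation path usually depends on knowing when a request falls due. The sketch below assumes the general GDPR response period of one month from receipt, extendable by two further months for complex or numerous requests (Art. 12(3) GDPR). The names `SubjectRequest`, `add_months` and `needs_escalation`, the seven-day warning window and the way calendar months are counted are illustrative assumptions, not requirements taken from the EDPB analysis.

```python
import calendar
from dataclasses import dataclass
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of shorter months."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


@dataclass
class SubjectRequest:
    """Hypothetical tracking record for one data subject request."""
    request_id: str
    kind: str              # "access", "erasure" or "rectification"
    received: date
    extended: bool = False # True if the two-month extension was invoked

    def due_date(self) -> date:
        # One month by default, three months in total when extended.
        return add_months(self.received, 3 if self.extended else 1)

    def needs_escalation(self, today: date, warn_days: int = 7) -> bool:
        # Escalate when the deadline is within `warn_days` or already passed.
        return (self.due_date() - today).days <= warn_days


request = SubjectRequest("REQ-1", "erasure", received=date(2026, 2, 11))
print(request.due_date())                          # prints: 2026-03-11
print(request.needs_escalation(date(2026, 3, 5)))  # prints: True
```

How the month is counted in edge cases, for example around public holidays, is a legal question that this simplification does not settle.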

3. Enhance Transparency

  • Update privacy notices
  • Explicitly identify AI usage
  • Make error-proneness transparent

4. Consider Protection of Minors

  • Review age verification
  • Assess usage scenarios

Connection to the EU AI Act

The EDPB opinion complements the following AI Act requirements:

  • Transparency obligations
  • Risk management requirements
  • Documentation obligations

Parallel requirements exist in particular for general-purpose AI (GPAI) models.

Common Misconceptions

Assumption                                        | Reality
"Web scraping is permitted"                       | Only with a legal basis
"LLMs do not store personal data"                 | Inferences can be personal data
"Inaccurate outputs are technically unavoidable"  | Organisational obligations exist

Strategic Recommendation

Organisations should not view generative AI in isolation but rather:

  • As a data protection-relevant overall system
  • With a clear governance structure
  • With documented risk analysis

Proactive transparency significantly reduces regulatory risk.

Need help implementing?

Work with Creativate AI Studio to design, validate and implement AI systems — technically sound, compliant and production-ready.

Need legal clarity?

For specific legal questions on the AI Act and GDPR, specialized legal advice focusing on AI regulation, data protection and compliance structures is available.

Independent legal advice. No automated legal information. The platform ai-playbook.eu does not provide legal advice.

Next Steps

  1. Review training data sources and legal basis.
  2. Implement data subject rights processes.
  3. Revise transparency information.
  4. Consider protection of minors.
  5. Integrate EDPB requirements into your AI governance.
