Privacy & AI Technology Glossary
Definitions for every term, acronym, and concept used in PII anonymization, AI security, and data-privacy compliance.
An open protocol from Anthropic that lets AI models interact with external tools and data sources in a standardized way. anonym.legal implements an MCP server so that AI coding tools can invoke anonymization without leaving their workflow.
anonym.legal's MCP server integration lets AI coding assistants (Claude Desktop, Cursor, VS Code Copilot) call the anonymization API directly as a tool. PII is removed from code, prompts, and context before it is sent to the AI model. Installation: npx @anthropic-ai/mcp-server-anonym-legal
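Registering the server in an MCP client typically means adding an entry to the client's configuration file. A minimal sketch for Claude Desktop's `claude_desktop_config.json` follows the standard MCP client shape; the environment-variable name `ANONYM_LEGAL_API_KEY` is an assumption, not documented configuration:

```json
{
  "mcpServers": {
    "anonym-legal": {
      "command": "npx",
      "args": ["@anthropic-ai/mcp-server-anonym-legal"],
      "env": { "ANONYM_LEGAL_API_KEY": "<your-api-key>" }
    }
  }
}
```

Other MCP clients (Cursor, VS Code) use an equivalent server entry in their own settings files.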
A specialized DLP category that prevents PII and confidential data from being included in prompts sent to generative AI models (ChatGPT, Claude, Gemini). anonym.legal's Chrome extension and MCP server address this risk at the point of input.
A security discipline and category of software tools that detect and prevent the unauthorized transfer of sensitive data outside an organization. anonym.legal functions as a browser- and AI-layer DLP solution for PII.
The replacement strategy applied to detected PII. anonym.legal supports: REPLACE (placeholder text), REDACT (empty string), MASK (asterisks), HASH (SHA-256 digest), ENCRYPT (reversible AES-256-GCM), and KEEP (pass-through, useful for testing).
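The effect of each operator can be sketched locally. This is an illustrative re-implementation of the listed strategies, not anonym.legal's actual code; ENCRYPT is omitted for brevity, and the REPLACE placeholder text is an assumption:

```python
import hashlib

def apply_operator(value: str, operator: str) -> str:
    """Sketch of the documented operators: KEEP, REDACT, MASK, HASH, REPLACE."""
    if operator == "KEEP":      # pass-through, useful for testing
        return value
    if operator == "REDACT":    # empty string
        return ""
    if operator == "MASK":      # asterisks, same length as the original
        return "*" * len(value)
    if operator == "HASH":      # one-way SHA-256 digest
        return hashlib.sha256(value.encode()).hexdigest()
    if operator == "REPLACE":   # placeholder text (illustrative placeholder)
        return "<PII>"
    raise ValueError(f"unknown operator: {operator}")
```
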
User-defined PII patterns added on top of anonym.legal's built-in 285+ entity types. Supports regex patterns, word lists, and deny-lists. Useful for organization-specific identifiers such as employee IDs, internal project codes, or proprietary product names.
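A custom recognizer for an organization-specific identifier can be as small as a single regex. The pattern below (an employee ID of the form `EMP-12345`) and the function name are hypothetical illustrations, not anonym.legal's configuration schema:

```python
import re

# Hypothetical internal employee ID: "EMP-" followed by exactly five digits.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{5}\b")

def find_custom_pii(text: str) -> list[str]:
    """Return all matches of the custom pattern in the given text."""
    return EMPLOYEE_ID.findall(text)
```
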
Usage-based billing where API calls consume tokens calculated from text length, entity count, and processing mode (analyze vs. anonymize). analyze_text: 2–10+ tokens, anonymize_text: 3–20+ tokens, detokenize_text: 1–5+ tokens. Management operations are free.
Any data that can identify a specific individual directly or in combination with other data. Examples: names, email addresses, social security numbers, IP addresses, biometric records.
The irreversible process of removing or transforming identifying information so that individuals can no longer be identified directly or indirectly. Under the GDPR, truly anonymized data falls outside the scope of the regulation.
Replacing direct identifiers with artificial values (pseudonyms) while retaining the ability to re-identify individuals using a separate key. GDPR Article 4(5) recognizes it as a privacy-enhancing technique but does not exempt pseudonymized data from the regulation.
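One common implementation is a keyed, deterministic pseudonym: the same input always yields the same pseudonym, and only the holder of the separate key can rebuild the mapping. A minimal sketch (key handling and pseudonym format are illustrative):

```python
import hmac
import hashlib

# Illustrative only: a real key is random and stored separately from the data.
SECRET_KEY = b"keep-this-key-separate"

def pseudonymize(value: str) -> str:
    """Deterministic pseudonym via HMAC-SHA-256: same value + same key
    -> same pseudonym, so records stay linkable after replacement."""
    digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"PERSON_{digest[:8]}"
```

Because the pseudonym depends on the key, destroying the key converts pseudonymized data into effectively anonymized data.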
Permanently removing or obscuring sensitive information from documents, replacing it with a visual marker such as [REDACTED] or a black bar. Unlike encryption, redaction is one-way and the original data cannot be recovered.
Replacing sensitive data with a non-sensitive placeholder (token) that maps back to the original in a secure vault. Unlike encryption, the token itself has no mathematical relationship to the original data. Used in the MCP Server to enable AI response de-anonymization.
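The vault idea can be sketched in a few lines: the token is random, so it carries no mathematical relationship to the original, and only the vault can map it back. Class and token format below are illustrative, not anonym.legal's API:

```python
import secrets

class TokenVault:
    """Sketch of vault-based tokenization with reversible lookup."""

    def __init__(self) -> None:
        self._vault: dict[str, str] = {}  # token -> original, access-controlled

    def tokenize(self, value: str) -> str:
        """Replace a sensitive value with a random, unrelated token."""
        token = f"TOK_{secrets.token_hex(8)}"
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Recover the original value; only the vault holder can do this."""
        return self._vault[token]
```

In the MCP flow, detokenization is what lets an AI response referring to `TOK_…` placeholders be rewritten with the real values on the client side.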
A one-way transformation of data into a fixed-length digest using algorithms such as SHA-256. Used for consistent pseudonymization, deduplication, and integrity verification. Hash values cannot be reversed but can be vulnerable to rainbow table attacks if not salted.
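Salting is what defeats rainbow-table lookups: the attacker would need a precomputed table per salt. A minimal sketch using the standard library (the salt value is illustrative; a real salt is random and kept secret):

```python
import hashlib

SALT = b"per-deployment-salt"  # illustrative; use a random, secret salt in practice

def salted_digest(value: str) -> str:
    """One-way SHA-256 of salt + value: deterministic (good for consistent
    pseudonymization and deduplication) but not reversible."""
    return hashlib.sha256(SALT + value.encode()).hexdigest()
```
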
Any health-related information linked to an identifiable individual, regulated under HIPAA in the US. Includes diagnoses, treatment records, insurance data, and any of the 18 HIPAA Safe Harbor identifiers.
EU Regulation 2016/679, the primary data protection framework for the European Union. Applies to any organization processing personal data of EU residents. Fines up to €20M or 4% of global annual revenue, whichever is higher. Key rights: access, erasure, portability, restriction, objection.
US federal law establishing standards for protecting sensitive patient health information. The Privacy Rule governs PHI use; the Security Rule requires administrative, physical, and technical safeguards for electronic PHI (ePHI). Violations carry fines up to $1.9M per category per year.
International standard for information security management systems (ISMS). Certification requires documented policies, risk assessments, and controls. anonym.legal's EU servers are ISO 27001-certified, ensuring structured security governance.
Security standard for organizations handling payment card data, maintained by the PCI Security Standards Council. Requires encryption, access controls, logging, and regular testing. Non-compliance can result in fines and loss of card processing privileges.
California privacy law granting residents the right to know, delete, and opt out of the sale of their personal information. Applies to businesses meeting revenue, data volume, or data-selling thresholds.
EU Regulation 2024/1689, the world's first comprehensive AI law. Classifies AI systems by risk level. High-risk system requirements — covering AI used in employment, credit, healthcare and critical infrastructure — become enforceable on August 2, 2026. Art. 10 requires data governance and data minimisation for training data; Art. 13 requires transparency. Applies to AI developers, deployers, and importers operating in the EU.
GDPR Art. 5(1)(c) requires that personal data be "adequate, relevant and limited to what is necessary" for its processing purpose. In AI workflows, this means stripping PII from prompts before sending to an AI model — the model only receives what is strictly necessary for the task. anonymize.dev automates this principle at the MCP protocol layer, replacing PII with tokens before any AI model processes the request.
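The principle reduces to a simple client-side transform: strip identifiers from the prompt before it leaves the machine. A deliberately minimal sketch covering only e-mail addresses (a real pipeline covers all entity types; pattern and placeholder are illustrative):

```python
import re

# Simplified e-mail pattern for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def minimise_prompt(prompt: str) -> str:
    """Data-minimisation sketch: replace e-mails with a placeholder so the
    AI model receives only what is necessary for the task."""
    return EMAIL.sub("<EMAIL>", prompt)
```
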
A development practice where developers describe intent in natural language and rely on AI coding agents (Cursor, Windsurf, Claude, GitHub Copilot) to generate implementation. The security risk: developers routinely paste real customer records, database connection strings, API keys, and log files into AI prompts as context. 77% of developers have sent company data to AI tools (Datenschutz Week 2026). Without PII interception, this constitutes a GDPR Art. 5 violation and potential data breach.
The use of unapproved AI tools — often via personal accounts — outside IT visibility and control. 88% of organisations use AI in at least one business function; of these, nearly half of generative AI users rely on personal AI applications that operate outside governance frameworks. Shadow AI has added on average €670,000 to breach costs, and 20% of organisations have experienced breaches directly caused by shadow AI (Kiteworks 2026). Providing approved, privacy-safe AI tools reduces unauthorised use by 89%.
AI systems that autonomously plan and execute multi-step tasks — browsing the web, calling APIs, writing and running code, accessing files — often with minimal human oversight. The UK Information Commissioner's Office (ICO) published guidance in January 2026 highlighting that agentic AI exacerbates data protection risks because human oversight becomes difficult when agents operate autonomously across large volumes of personal data. MCP-based PII interception (anonymize.dev) applies at the protocol layer, protecting data regardless of agent autonomy level.
A critical vulnerability class in MCP-enabled AI agents where malicious content in external sources (GitHub issues, web pages, documents) can hijack the agent and cause it to exfiltrate data from connected tools. A documented example: attackers planted malicious content in GitHub issues that caused the GitHub MCP server to leak private repository data. PII interception before agent context is assembled mitigates this attack class by ensuring real personal data is never present in the agent's working context.
A category of tools and techniques that enable data analytics and AI workflows while preserving individual privacy. Includes: tokenisation (anonymize.dev), differential privacy (adds noise to statistical outputs), k-anonymity (ensures each record is indistinguishable from k−1 others), synthetic data generation, federated learning, and secure multi-party computation. The EU's 2026 data privacy landscape increasingly requires organisations to deploy PETs as part of a Privacy-by-Design (GDPR Art. 25) implementation strategy.
An authenticated encryption algorithm combining AES-256 (256-bit key) with Galois/Counter Mode for both confidentiality and integrity. Used in anonym.legal's reversible anonymization to encrypt replaced entities. Provides both secrecy and tamper detection.
A cryptographic hash function producing a 256-bit digest. Used in anonym.legal for HMAC authentication of API requests, ZK auth proofs, and consistent entity pseudonymization (hashing with salt produces the same replacement for the same original value).
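HMAC-based request authentication generally looks like the sketch below: a keyed SHA-256 digest over a timestamp plus a canonicalized body. The header names and canonicalization scheme here are assumptions for illustration, not anonym.legal's documented protocol:

```python
import hashlib
import hmac
import json
import time

API_SECRET = b"demo-secret"  # illustrative; the real secret comes from your account

def sign_request(body: dict) -> dict:
    """Produce HMAC-SHA-256 auth headers for an API request (sketch)."""
    payload = json.dumps(body, sort_keys=True).encode()  # canonical body
    ts = str(int(time.time()))                           # replay protection
    sig = hmac.new(API_SECRET, ts.encode() + payload, hashlib.sha256).hexdigest()
    return {"X-Timestamp": ts, "X-Signature": sig}
```
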
The cryptographic protocol securing data in transit. anonym.legal enforces TLS 1.2 minimum with TLS 1.3 preferred, HSTS with one-year max-age, and HTTP/2. All traffic between clients and the server is encrypted in transit.
Encryption in which only the communicating parties can read the messages; the service provider has no access to plaintext. In anonym.legal's ZK Auth mode, encryption keys never leave the client device, achieving E2EE for anonymized output storage.
A category of personal information that the detection engine recognizes and can anonymize. Examples: PERSON, EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IBAN_CODE, US_SSN, IP_ADDRESS. anonym.legal supports 285+ entity types across 48 languages.
Entity types detected regardless of text language, typically through format-based regex with checksum validation. Examples: CREDIT_CARD, IBAN_CODE, EMAIL_ADDRESS, PHONE_NUMBER, IP_ADDRESS, URL, CRYPTO address.
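The regex-plus-checksum pattern can be shown end to end for credit cards: a format filter finds digit runs of plausible length, and the Luhn mod-10 checksum discards false positives. A self-contained sketch (the candidate regex is simplified; real detectors also handle spaces and hyphens):

```python
import re

# Format filter: 13-19 contiguous digits (simplified candidate pattern).
CANDIDATE = re.compile(r"\b\d{13,19}\b")

def luhn_ok(number: str) -> bool:
    """Luhn mod-10 checksum used to validate card-number candidates."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:              # double every second digit from the right
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0

def find_cards(text: str) -> list[str]:
    """Language-agnostic detection: regex narrows candidates, checksum confirms."""
    return [m for m in CANDIDATE.findall(text) if luhn_ok(m)]
```
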
Entity types for national and government-issued identifiers: US_SSN, US_PASSPORT, UK_NHS, ES_NIF, DE_PERSONALAUSWEIS, FR_INSEE, IT_FISCAL_CODE, and 50+ other country-specific ID formats. Detected using country-specific regex + checksum patterns.
Entity types covering financial identifiers: CREDIT_CARD (Luhn checksum), IBAN_CODE (ISO 13616 checksum), SWIFT_CODE (BIC format), and US_BANK_NUMBER. Detected with checksum validation to minimize false positives.
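The IBAN checksum (ISO 13616 mod-97) is a good example of why checksums cut false positives: a random digit string of the right shape almost never validates. A minimal sketch:

```python
def iban_valid(iban: str) -> bool:
    """ISO 13616 mod-97 check: move the first four characters to the end,
    map letters A=10..Z=35, and the resulting integer must be ≡ 1 (mod 97)."""
    s = iban.replace(" ", "").upper()
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(c, 36)) for c in rearranged)  # A->10 ... Z->35
    return int(digits) % 97 == 1
```
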
Entity types for the 18 HIPAA Safe Harbor identifiers and additional health-related PII: US_MRN (medical record numbers), MEDICAL_LICENSE, HEALTHCARE_PLAN_BENEFICIARY, and diagnosis/treatment context entities.
This is a developer-focused subset. The full glossary has 90+ terms covering all compliance frameworks, detection algorithms, and infrastructure terms.
View Full Glossary on anonym.legal →