Privacy & AI Technology Glossary
Definitions for terms, acronyms, and concepts used in PII anonymization, AI security, and data privacy compliance.
Model Context Protocol (MCP): An open protocol by Anthropic enabling AI models to interact with external tools and data sources in a standardized way. anonym.legal implements an MCP Server so AI coding tools can invoke anonymization without leaving their workflow.
MCP Server: anonym.legal's MCP Server integration enables AI coding assistants (Claude Desktop, Cursor, VS Code Copilot) to call the anonymization API directly as a tool. PII is stripped from code, prompts, and context before being sent to the AI model. Install via: npx @anthropic-ai/mcp-server-anonym-legal
AI Data Loss Prevention (AI DLP): A specialized DLP category focused on preventing PII and confidential data from being included in prompts sent to generative AI models (ChatGPT, Claude, Gemini). anonym.legal's Chrome Extension and MCP Server address this risk at the point of input.
Data Loss Prevention (DLP): A security discipline and category of software tools that detect and prevent unauthorized transmission of sensitive data outside an organization. anonym.legal functions as a browser-layer and AI-layer DLP solution for PII.
Anonymization Operator: The replacement strategy applied to detected PII. anonym.legal supports: REPLACE (placeholder text), REDACT (empty string), MASK (asterisks), HASH (SHA-256 digest), ENCRYPT (reversible AES-256-GCM), and KEEP (pass-through, useful for testing).
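The non-reversible operators can be sketched as simple text transforms. A minimal illustration only — the function name and the `<ENTITY_TYPE>` placeholder format are assumptions, not anonym.legal's API, and ENCRYPT is omitted because it requires a key and an AES library:

```python
import hashlib

def apply_operator(value: str, operator: str, entity_type: str = "PII") -> str:
    """Apply an anonymization operator to a detected PII value."""
    if operator == "REPLACE":
        return f"<{entity_type}>"          # placeholder text
    if operator == "REDACT":
        return ""                          # empty string
    if operator == "MASK":
        return "*" * len(value)            # asterisks, same length as the original
    if operator == "HASH":
        return hashlib.sha256(value.encode()).hexdigest()  # one-way digest
    if operator == "KEEP":
        return value                       # pass-through, useful for testing
    raise ValueError(f"unknown operator: {operator}")

print(apply_operator("alice@example.com", "MASK"))  # prints 17 asterisks
```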
Custom Entity Recognizer: User-defined PII patterns added on top of anonym.legal's built-in 285+ entity types. Supports regex patterns, word lists, and deny-lists. Useful for organization-specific identifiers such as employee IDs, internal project codes, or proprietary product names.
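A regex pattern plus a deny-list is often all a custom recognizer needs. The ID format and project names below are hypothetical examples, not part of anonym.legal's built-in catalog:

```python
import re

# Hypothetical org-specific pattern: employee IDs like "EMP-12345"
EMPLOYEE_ID = re.compile(r"\bEMP-\d{5}\b")
# Hypothetical deny-list of internal project code names
DENY_LIST = {"Project Nightingale", "Orion-X"}

def find_custom_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs for custom patterns."""
    hits = [("EMPLOYEE_ID", m.group()) for m in EMPLOYEE_ID.finditer(text)]
    hits += [("PROJECT_CODE", term) for term in DENY_LIST if term in text]
    return hits

print(find_custom_entities("Ticket from EMP-00417 about Orion-X"))
```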
Token-Based Billing: Usage-based billing where API calls consume tokens calculated from text length, entity count, and processing mode (analyze vs. anonymize). analyze_text: 2–10+ tokens, anonymize_text: 3–20+ tokens, detokenize_text: 1–5+ tokens. Management operations are free.
Personally Identifiable Information (PII): Any data that can identify a specific individual directly or in combination with other data. Examples: names, email addresses, social security numbers, IP addresses, biometric records.
Anonymization: The irreversible process of removing or transforming identifying information so that individuals can no longer be identified, directly or indirectly. Under GDPR, truly anonymized data falls outside the regulation's scope.
Pseudonymization: Replacing direct identifiers with artificial values (pseudonyms) while retaining the ability to re-identify individuals using a separate key. GDPR Article 4(5) recognizes it as a privacy-enhancing technique but does not exempt pseudonymized data from the regulation.
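The defining property — stable pseudonyms plus a separately held re-identification key — can be sketched in a few lines (an illustrative toy, not a production scheme; the `PERSON_` prefix is an assumption):

```python
import secrets

pseudonym_key: dict[str, str] = {}   # the "separate key" — stored apart from the data

def pseudonymize(name: str) -> str:
    """Replace a direct identifier with a stable artificial value."""
    if name not in pseudonym_key:
        pseudonym_key[name] = f"PERSON_{secrets.token_hex(4)}"
    return pseudonym_key[name]

def reidentify(pseudonym: str) -> str:
    """Reverse lookup — only possible for whoever holds pseudonym_key."""
    return next(n for n, p in pseudonym_key.items() if p == pseudonym)

p = pseudonymize("Maria Schmidt")
print(p, "->", reidentify(p))
```

Because the mapping survives in `pseudonym_key`, this data remains personal data under GDPR; only destroying the key would move it toward anonymization.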
Redaction: Permanently removing or obscuring sensitive information from documents, replacing it with a visual marker such as [REDACTED] or a black bar. Unlike encryption, redaction is one-way and the original data cannot be recovered.
Tokenization: Replacing sensitive data with a non-sensitive placeholder (token) that maps back to the original in a secure vault. Unlike encryption, the token itself has no mathematical relationship to the original data. Used in the MCP Server to enable AI response de-anonymization.
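The round trip — tokenize before the prompt leaves, detokenize when the AI response echoes tokens back — can be sketched as follows (a minimal in-memory illustration; the `<EMAIL_1>` token format and function names are assumptions, not anonym.legal's wire format):

```python
import itertools

_counter = itertools.count(1)
_vault: dict[str, str] = {}   # token -> original, held in a secure store

def tokenize(value: str, entity_type: str) -> str:
    """Issue a placeholder with no mathematical link to the original value."""
    token = f"<{entity_type}_{next(_counter)}>"
    _vault[token] = value
    return token

def detokenize(text: str) -> str:
    """Restore originals in an AI response that echoes tokens back."""
    for token, original in _vault.items():
        text = text.replace(token, original)
    return text

prompt = f"Email {tokenize('bob@corp.com', 'EMAIL')} about the invoice."
reply = "Drafted a message to <EMAIL_1>."
print(detokenize(reply))   # Drafted a message to bob@corp.com.
```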
Hashing: A one-way transformation of data into a fixed-length digest using algorithms such as SHA-256. Used for consistent pseudonymization, deduplication, and integrity verification. Hash values cannot be reversed but can be vulnerable to rainbow table attacks if not salted.
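Salting defeats rainbow tables while preserving the consistency property — the same input always yields the same digest within one deployment. A minimal stdlib sketch:

```python
import hashlib
import os

salt = os.urandom(16)   # per-deployment secret salt defeats precomputed rainbow tables

def salted_digest(value: str) -> str:
    """One-way, fixed-length, consistent: same input -> same 64-hex-char digest."""
    return hashlib.sha256(salt + value.encode()).hexdigest()

print(salted_digest("alice@example.com"))
```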
Protected Health Information (PHI): Any health-related information linked to an identifiable individual, regulated under HIPAA in the US. Includes diagnoses, treatment records, insurance data, and any of the 18 HIPAA Safe Harbor identifiers.
General Data Protection Regulation (GDPR): EU Regulation 2016/679, the primary data protection framework for the European Union. Applies to any organization processing personal data of EU residents. Fines up to €20M or 4% of global annual revenue, whichever is higher. Key rights: access, erasure, portability, restriction, objection.
Health Insurance Portability and Accountability Act (HIPAA): US federal law establishing standards for protecting sensitive patient health information. The Privacy Rule governs PHI use; the Security Rule requires administrative, physical, and technical safeguards for electronic PHI (ePHI). Violations carry fines up to $1.9M per category per year.
ISO 27001: International standard for information security management systems (ISMS). Certification requires documented policies, risk assessments, and controls. anonym.legal's EU servers are ISO 27001-certified, ensuring structured security governance.
PCI DSS (Payment Card Industry Data Security Standard): Security standard for organizations handling payment card data, maintained by the PCI Security Standards Council. Requires encryption, access controls, logging, and regular testing. Non-compliance can result in fines and loss of card processing privileges.
California Consumer Privacy Act (CCPA): California privacy law granting residents the right to know, delete, and opt out of the sale of their personal information. Applies to businesses meeting revenue, data volume, or data-selling thresholds.
EU AI Act: EU Regulation 2024/1689, the world's first comprehensive AI law. Classifies AI systems by risk level. High-risk system requirements — covering AI used in employment, credit, healthcare and critical infrastructure — become enforceable on August 2, 2026. Art. 10 requires data governance and data minimization for training data; Art. 13 requires transparency. Applies to AI developers, deployers, and importers operating in the EU.
Data Minimization: GDPR Art. 5(1)(c) requires that personal data be "adequate, relevant and limited to what is necessary" for its processing purpose. In AI workflows, this means stripping PII from prompts before sending to an AI model — the model only receives what is strictly necessary for the task. anonymize.dev automates this principle at the MCP protocol layer, replacing PII with tokens before any AI model processes the request.
Vibe Coding: A development practice where developers describe intent in natural language and rely on AI coding agents (Cursor, Windsurf, Claude, GitHub Copilot) to generate implementation. The security risk: developers routinely paste real customer records, database connection strings, API keys, and log files into AI prompts as context. 77% of developers have sent company data to AI tools (Data Privacy Week 2026). Without PII interception, this constitutes a GDPR Art. 5 violation and potential data breach.
Shadow AI: The use of unapproved AI tools — often via personal accounts — outside IT visibility and control. 88% of organizations use AI in at least one business function; of these, nearly half of generative AI users rely on personal AI applications that operate outside governance frameworks. Shadow AI has added an average of €670,000 to breach costs, and 20% of organizations have experienced breaches directly caused by shadow AI (Kiteworks 2026). Providing approved, privacy-safe AI tools reduces unauthorized use by 89%.
Agentic AI: AI systems that autonomously plan and execute multi-step tasks — browsing the web, calling APIs, writing and running code, accessing files — often with minimal human oversight. The UK Information Commissioner's Office (ICO) published guidance in January 2026 highlighting that agentic AI exacerbates data protection risks because human oversight becomes difficult when agents operate autonomously across large volumes of personal data. MCP-based PII interception (anonymize.dev) applies at the protocol layer, protecting data regardless of agent autonomy level.
Prompt Injection: A critical vulnerability class in MCP-enabled AI agents where malicious content in external sources (GitHub issues, web pages, documents) can hijack the agent and cause it to exfiltrate data from connected tools. A documented example: attackers planted malicious content in GitHub issues that caused the GitHub MCP server to leak private repository data. PII interception before agent context is assembled mitigates this attack class by ensuring real personal data is never present in the agent's working context.
Privacy-Enhancing Technologies (PETs): A category of tools and techniques that enable data analytics and AI workflows while preserving individual privacy. Includes: tokenization (anonymize.dev), differential privacy (adds noise to statistical outputs), k-anonymity (ensures each record is indistinguishable from k−1 others), synthetic data generation, federated learning, and secure multi-party computation. The EU's 2026 data privacy landscape increasingly requires organizations to deploy PETs as part of a Privacy-by-Design (GDPR Art. 25) implementation strategy.
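Of the listed PETs, k-anonymity is the easiest to check mechanically: a dataset is k-anonymous if every combination of quasi-identifier values (e.g. birth year + ZIP code) occurs at least k times. A small sketch with made-up rows:

```python
from collections import Counter

def is_k_anonymous(records: list[tuple], quasi_identifiers: tuple[int, ...], k: int) -> bool:
    """True if every quasi-identifier combination appears at least k times."""
    groups = Counter(tuple(r[i] for i in quasi_identifiers) for r in records)
    return min(groups.values()) >= k

rows = [
    ("1985", "10115", "flu"),    # (birth year, ZIP, diagnosis)
    ("1985", "10115", "cold"),
    ("1990", "20095", "flu"),
    ("1990", "20095", "flu"),
]
# Each (birth year, ZIP) pair occurs twice, so the data is 2-anonymous.
print(is_k_anonymous(rows, quasi_identifiers=(0, 1), k=2))   # True
```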
AES-256-GCM: An authenticated encryption algorithm combining AES-256 (256-bit key) with Galois/Counter Mode for both confidentiality and integrity. Used in anonym.legal's reversible anonymization to encrypt replaced entities. Provides both secrecy and tamper detection.
SHA-256: A cryptographic hash function producing a 256-bit digest. Used in anonym.legal for HMAC authentication of API requests, ZK auth proofs, and consistent entity pseudonymization (hashing with salt produces the same replacement for the same original value).
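HMAC-SHA256 request authentication can be sketched with the standard library (a generic pattern, not anonym.legal's exact signing scheme; the secret and body are placeholders):

```python
import hashlib
import hmac

def sign_request(secret: bytes, body: bytes) -> str:
    """HMAC-SHA256 signature over a request body, sent alongside the request."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_request(secret: bytes, body: bytes, signature: str) -> bool:
    """Constant-time comparison prevents timing attacks on signature checks."""
    return hmac.compare_digest(sign_request(secret, body), signature)

sig = sign_request(b"api-secret", b'{"text": "..."}')
print(verify_request(b"api-secret", b'{"text": "..."}', sig))   # True
```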
TLS (Transport Layer Security): The cryptographic protocol securing data in transit. anonym.legal enforces TLS 1.2 minimum with TLS 1.3 preferred, HSTS with one-year max-age, and HTTP/2. All traffic between clients and the server is encrypted in transit.
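Clients can mirror the same floor on their side. With Python's stdlib `ssl` module, a context that refuses anything below TLS 1.2 looks like this:

```python
import ssl

# Client-side enforcement of the TLS 1.2 minimum policy.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
print(ctx.minimum_version)   # TLSVersion.TLSv1_2
```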
End-to-End Encryption (E2EE): Encryption in which only communicating parties can read the messages; the service provider has no access to plaintext. In anonym.legal's ZK Auth mode, encryption keys never leave the client device, achieving E2EE for anonymized output storage.
Entity Type: A category of personal information that the detection engine recognizes and can anonymize. Examples: PERSON, EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IBAN_CODE, US_SSN, IP_ADDRESS. anonym.legal supports 285+ entity types across 48 languages.
Language-Agnostic Entities: Entity types detected regardless of text language, typically through format-based regex with checksum validation. Examples: CREDIT_CARD, IBAN_CODE, EMAIL_ADDRESS, PHONE_NUMBER, IP_ADDRESS, URL, CRYPTO address.
National ID Entities: Entity types for national and government-issued identifiers: US_SSN, US_PASSPORT, UK_NHS, ES_NIF, DE_PERSONALAUSWEIS, FR_INSEE, IT_FISCAL_CODE, and 50+ other country-specific ID formats. Detected using country-specific regex + checksum patterns.
Financial Entities: Entity types covering financial identifiers: CREDIT_CARD (Luhn checksum), IBAN_CODE (ISO 13616 checksum), SWIFT_CODE (BIC format), US_BANK_NUMBER, NRP (Spanish tax ID). Detected with checksum validation to minimize false positives.
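The Luhn checksum mentioned for CREDIT_CARD is a standard algorithm and fits in a few lines — doubling every second digit from the right and requiring the sum to be divisible by 10:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: validates candidate card numbers to cut regex false positives."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9      # equivalent to summing the two digits of the product
        checksum += d
    return checksum % 10 == 0

print(luhn_valid("4111 1111 1111 1111"))   # True — a well-known test number
```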
Healthcare Entities: Entity types for the 18 HIPAA Safe Harbor identifiers and additional health-related PII: US_MRN (medical record numbers), MEDICAL_LICENSE, HEALTHCARE_PLAN_BENEFICIARY, and diagnosis/treatment context entities.
This is a developer-focused subset. The full glossary has 90+ terms covering all compliance frameworks, detection algorithms, and infrastructure terms.
View Full Glossary on anonym.legal →