Privacy & AI Technology Glossary
Definitions for all terms, acronyms, and concepts used in PII anonymization, AI security, and data privacy compliance.
An open protocol by Anthropic enabling AI models to interact with external tools and data sources in a standardized way. anonym.legal implements an MCP Server so AI coding tools can invoke anonymization without leaving their workflow.
anonym.legal's MCP Server integration enables AI coding assistants (Claude Desktop, Cursor, VS Code Copilot) to call the anonymization API directly as a tool. PII is stripped from code, prompts, and context before being sent to the AI model. Install via: npx @anthropic-ai/mcp-server-anonym-legal
A specialized DLP category focused on preventing PII and confidential data from being included in prompts sent to generative AI models (ChatGPT, Claude, Gemini). anonym.legal's Chrome Extension and MCP Server address this risk at the point of input.
A security discipline and category of software tools that detect and prevent unauthorized transmission of sensitive data outside an organization. anonym.legal functions as a browser-layer and AI-layer DLP solution for PII.
The replacement strategy applied to detected PII. anonym.legal supports: REPLACE (placeholder text), REDACT (empty string), MASK (asterisks), HASH (SHA-256 digest), ENCRYPT (reversible AES-256-GCM), and KEEP (pass-through, useful for testing).
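The operators listed above can be illustrated with a minimal sketch. The function name `apply_operator`, the `<ENTITY_TYPE>` placeholder format, and the sample text are assumptions made for illustration, not anonym.legal's actual implementation. ENCRYPT is left out because it needs a key and an AES-GCM library.

```python
import hashlib


def apply_operator(text, start, end, entity_type, operator):
    """Apply one replacement strategy to the span text[start:end]."""
    original = text[start:end]
    if operator == "REPLACE":
        replacement = f"<{entity_type}>"  # placeholder format is an assumption
    elif operator == "REDACT":
        replacement = ""  # empty string
    elif operator == "MASK":
        replacement = "*" * len(original)  # one asterisk per character
    elif operator == "HASH":
        replacement = hashlib.sha256(original.encode("utf-8")).hexdigest()
    elif operator == "KEEP":
        replacement = original  # pass-through
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return text[:start] + replacement + text[end:]


sample = "Contact jane@example.com today"
start = sample.index("jane@example.com")
end = start + len("jane@example.com")

for op in ("REPLACE", "REDACT", "MASK", "HASH", "KEEP"):
    print(op, "->", apply_operator(sample, start, end, "EMAIL_ADDRESS", op))
```

In this sketch REDACT leaves the surrounding whitespace untouched, so the output contains a double space where the entity used to be.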
User-defined PII patterns added on top of anonym.legal's built-in 285+ entity types. Supports regex patterns, word lists, and deny-lists. Useful for organization-specific identifiers such as employee IDs, internal project codes, or proprietary product names.
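As a rough illustration of how a regex pattern and a word list might be combined, the sketch below scans text for two custom entity types. The `EMP-` ID format, the project names, and the function `find_custom_entities` are hypothetical and do not reflect anonym.legal's configuration format.

```python
import re

# Hypothetical organization-specific patterns, for illustration only.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")
PROJECT_CODES = ["Project Falcon", "Project Osprey"]


def find_custom_entities(text):
    """Return (entity_type, start, end, matched_text) tuples sorted by position."""
    matches = []
    for m in EMPLOYEE_ID.finditer(text):
        matches.append(("EMPLOYEE_ID", m.start(), m.end(), m.group()))
    for term in PROJECT_CODES:
        # Word-list entries are matched literally and case-insensitively.
        for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            matches.append(("PROJECT_CODE", m.start(), m.end(), m.group()))
    return sorted(matches, key=lambda item: item[1])


print(find_custom_entities("EMP-004217 joined project falcon."))
```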
Usage-based billing where API calls consume tokens calculated from text length, entity count, and processing mode (analyze vs. anonymize). analyze_text: 2–10+ tokens, anonymize_text: 3–20+ tokens, detokenize_text: 1–5+ tokens. Management operations are free.
Any data that can identify a specific individual directly or in combination with other data. Examples: names, email addresses, social security numbers, IP addresses, biometric records.
The irreversible process of removing or transforming identifying information so that individuals can no longer be identified, directly or indirectly. Under GDPR, truly anonymized data falls outside the regulation's scope.
Replacing direct identifiers with artificial values (pseudonyms) while retaining the ability to re-identify individuals using a separately stored key. GDPR Article 4(5) defines pseudonymization, and the regulation treats it as a safeguard, but pseudonymized data remains personal data and stays within the regulation's scope.
Permanently removing or obscuring sensitive information from documents, replacing it with a visual marker such as [REDACTED] or a black bar. Unlike encryption, redaction is one-way and the original data cannot be recovered.
Replacing sensitive data with a non-sensitive placeholder (token) that maps back to the original in a secure vault. Unlike encryption, the token itself has no mathematical relationship to the original data. Used in the MCP Server to enable AI response de-anonymization.
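The vault idea can be sketched with an in-memory mapping. The class `TokenVault`, its method names, and the token format are hypothetical; a real vault would be persistent and access-controlled. The point is that tokens are random, so nothing about the original value can be derived from the token itself.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a secure token vault (illustrative only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value, entity_type):
        """Return a random token for value, reusing it for repeated values."""
        key = (entity_type, value)
        if key in self._value_to_token:
            return self._value_to_token[key]
        token = f"<{entity_type}_{secrets.token_hex(4)}>"
        while token in self._token_to_value:  # regenerate on the rare collision
            token = f"<{entity_type}_{secrets.token_hex(4)}>"
        self._token_to_value[token] = value
        self._value_to_token[key] = token
        return token

    def detokenize(self, text):
        """Replace every known token in text with its original value."""
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text


vault = TokenVault()
token = vault.tokenize("Jane Doe", "PERSON")
ai_response = f"Dear {token}, your request was approved."
print(vault.detokenize(ai_response))
```

The last three lines mirror the de-anonymization flow described above: the AI model only ever sees the token, and the original value is restored in its response afterwards.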
A one-way transformation of data into a fixed-length digest using algorithms such as SHA-256. Used for consistent pseudonymization, deduplication, and integrity verification. A digest cannot be inverted mathematically, but unsalted hashes of predictable inputs (such as phone numbers) can be recovered with rainbow tables or brute-force guessing.
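A minimal sketch of salted hashing with Python's standard `hashlib` follows. The function name `pseudonymize` and the choice to prepend the salt are assumptions for illustration. The same salt yields the same digest for the same input, while a different salt yields an unrelated digest, which defeats precomputed tables.

```python
import hashlib
import secrets


def pseudonymize(value, salt):
    """Return the hex SHA-256 digest of salt + value."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


salt = secrets.token_bytes(16)  # keep secret and reuse for consistent output
first = pseudonymize("jane@example.com", salt)
second = pseudonymize("jane@example.com", salt)
print(first == second)  # same salt and input give the same digest
print(len(first))       # SHA-256 hex digests are 64 characters long
```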
Any health-related information linked to an identifiable individual, regulated under HIPAA in the US. Includes diagnoses, treatment records, insurance data, and any of the 18 HIPAA Safe Harbor identifiers.
EU Regulation 2016/679, the primary data protection framework for the European Union. Applies to any organization processing personal data of individuals in the EU, regardless of where the organization is based. Fines reach up to €20 million or 4% of global annual turnover, whichever is higher. Key rights: access, erasure, portability, restriction, objection.
US federal law establishing standards for protecting sensitive patient health information. The Privacy Rule governs PHI use; the Security Rule requires administrative, physical, and technical safeguards for electronic PHI (ePHI). Violations carry fines up to $1.9M per category per year.
International standard for information security management systems (ISMS). Certification requires documented policies, risk assessments, and controls. anonym.legal's EU servers are ISO 27001-certified, ensuring structured security governance.
Security standard for organizations handling payment card data, maintained by the PCI Security Standards Council. Requires encryption, access controls, logging, and regular testing. Non-compliance can result in fines and loss of card processing privileges.
California privacy law granting residents the right to know, delete, and opt out of the sale of their personal information. Applies to businesses meeting revenue, data volume, or data-selling thresholds.
An authenticated encryption algorithm combining AES with a 256-bit key and Galois/Counter Mode, providing both confidentiality and tamper detection. Used in anonym.legal's reversible anonymization to encrypt replaced entities.
A cryptographic hash function producing a 256-bit digest. Used in anonym.legal for HMAC authentication of API requests, ZK auth proofs, and consistent entity pseudonymization (hashing with salt produces the same replacement for the same original value).
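HMAC authentication with SHA-256 can be sketched with Python's standard `hmac` module. The message layout (timestamp, a dot, then the body) and the function names are hypothetical; anonym.legal's actual signing scheme is not described here.

```python
import hashlib
import hmac


def sign_request(secret, timestamp, body):
    """Return a hex HMAC-SHA256 signature over 'timestamp.body'."""
    message = timestamp.encode("ascii") + b"." + body  # hypothetical layout
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret, timestamp, body, signature):
    """Recompute the signature and compare it in constant time."""
    expected = sign_request(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)


secret = b"demo-shared-secret"  # placeholder, never hard-code real keys
signature = sign_request(secret, "1700000000", b'{"text": "hello"}')
print(verify_request(secret, "1700000000", b'{"text": "hello"}', signature))
```

Using `hmac.compare_digest` rather than `==` avoids leaking how many leading characters of the signature matched through timing differences.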
The cryptographic protocol securing data in transit. anonym.legal enforces TLS 1.2 as a minimum with TLS 1.3 preferred, HSTS with a one-year max-age, and HTTP/2, so all traffic between clients and the server is encrypted.
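On the client side, a matching TLS floor can be set with Python's standard `ssl` module. This is a sketch of a client configuration under that assumption, not anonym.legal's server setup; the helper name `make_client_context` is hypothetical. The HSTS value shows what a one-year max-age looks like in seconds.

```python
import ssl


def make_client_context():
    """Build a certificate-verifying context that refuses anything below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# One year in seconds, as it would appear in a Strict-Transport-Security header.
HSTS_ONE_YEAR = f"max-age={365 * 24 * 60 * 60}"

context = make_client_context()
print(context.minimum_version.name)
print(HSTS_ONE_YEAR)
```

TLS 1.3 is negotiated automatically when both sides support it, so only the minimum needs to be pinned.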
Encryption in which only communicating parties can read the messages; the service provider has no access to plaintext. In anonym.legal's ZK Auth mode, encryption keys never leave the client device, achieving E2EE for anonymized output storage.
A category of personal information that the detection engine recognizes and can anonymize. Examples: PERSON, EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IBAN_CODE, US_SSN, IP_ADDRESS. anonym.legal supports 285+ entity types across 48 languages.
Entity types detected regardless of text language, typically through format-based regex with checksum validation. Examples: CREDIT_CARD, IBAN_CODE, EMAIL_ADDRESS, PHONE_NUMBER, IP_ADDRESS, URL, CRYPTO address.
Entity types for national and government-issued identifiers: US_SSN, US_PASSPORT, UK_NHS, ES_NIF, DE_PERSONALAUSWEIS, FR_INSEE, IT_FISCAL_CODE, and 50+ other country-specific ID formats. Detected using country-specific regex + checksum patterns.
Entity types covering financial identifiers: CREDIT_CARD (Luhn checksum), IBAN_CODE (ISO 13616 checksum), SWIFT_CODE (BIC format), US_BANK_NUMBER, NRP (Spanish tax ID). Detected with checksum validation to minimize false positives.
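The two checksums named above are public algorithms and can be sketched in a few lines. The function names are illustrative, and a production detector would also check length and country-specific structure. The test values are the widely published sample card number and sample UK IBAN.

```python
def luhn_valid(number):
    """Check a card number with the Luhn algorithm, ignoring spaces and dashes."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban):
    """Check an IBAN with the mod-97 test (letters map to 10..35)."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    rearranged = compact[4:] + compact[:4]  # move country code and check digits to the end
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


print(luhn_valid("4111 1111 1111 1111"))
print(iban_valid("GB82 WEST 1234 5698 7654 32"))
```

A candidate that matches the format regex but fails the checksum can be discarded, which is how checksum validation reduces false positives.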
Entity types for the 18 HIPAA Safe Harbor identifiers and additional health-related PII: US_MRN (medical record numbers), MEDICAL_LICENSE, HEALTHCARE_PLAN_BENEFICIARY, and diagnosis/treatment context entities.
This is a developer-focused subset. The full glossary has 90+ terms covering all compliance frameworks, detection algorithms, and infrastructure terms.