Frequently Asked Questions
Questions about MCP Server integration, PII detection, pricing, and security. For the complete FAQ with 37+ questions, visit anonym.legal/faq.
anonym.legal — developer-focused. 7 MCP tools, 285+ entity types, 48 languages, 26 entity groups, 6 operators. Ideal for individual developers, legal tech, and teams building AI pipelines with text data. NPM: @anthropic-ai/mcp-server-anonym-legal.
cloak.business — full platform. 10 MCP tools, 320+ entity types, 70+ countries, image OCR in 38 languages, batch processing (1–100 texts per call), plus a Chrome Extension for browser-based AI tools. Ideal for enterprises, teams handling visual documents, and organizations needing browser-level PII protection. NPM: cloak-business-mcp-server.
Both offer a €0 free tier so you can test before choosing.
cloak.business adds two image tools: cloak_analyze_image (OCR + PII detection in 38 languages) and cloak_redact_image (visual redaction that draws black boxes over detected PII regions in the image). This makes cloak.business the choice for workflows involving scanned documents, photos, screenshots, or any visual data. anonym.legal handles text only.
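The two image tools can be sketched as MCP tool-call payloads. The tool names come from the FAQ; the argument names ("image_base64", "language") and the response handling are assumptions for illustration, not the documented schema.

```python
import base64

# Stand-in for real image bytes (e.g., a scanned invoice).
fake_scan = b"\x89PNG fake image bytes"

analyze_call = {
    "name": "cloak_analyze_image",  # OCR + PII detection (38 languages)
    "arguments": {
        # Hypothetical argument names for this sketch:
        "image_base64": base64.b64encode(fake_scan).decode("ascii"),
        "language": "de",
    },
}

redact_call = {
    "name": "cloak_redact_image",  # draws black boxes over detected PII regions
    "arguments": {"image_base64": analyze_call["arguments"]["image_base64"]},
}
```

A typical flow is to run the analyze call first to inspect what was detected, then the redact call to produce the blacked-out image.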
The cloak_batch_analyze tool processes 1–100 texts in a single API call, dramatically reducing latency and API call overhead for bulk workflows. cloak.business also supports more entity types (320+ vs 285+) and has broader coverage (70+ countries, versus anonym.legal's 48 languages). If you are processing text-only content at moderate volume, anonym.legal's 7-tool API is simpler to integrate. Both start free — upgrade only when you need higher limits.
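A batch call might look like the sketch below. The 1–100 limit is from the FAQ; the "texts" argument name is an assumption for illustration.

```python
# A hypothetical cloak_batch_analyze payload carrying multiple texts at once.
texts = [
    "Invoice for Maria Schmidt, IBAN DE89 3704 0044 0532 0130 00",
    "Contact: john.doe@example.com, phone +49 30 1234567",
]

batch_call = {
    "name": "cloak_batch_analyze",
    "arguments": {"texts": texts},
}

# Client-side guard mirroring the documented limit of 1-100 texts per call.
assert 1 <= len(batch_call["arguments"]["texts"]) <= 100
```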
You can assign a different operator to each entity type in a single anonymize_text call — e.g., hash PERSON, encrypt IBAN, mask CREDIT_CARD, redact US_SSN.
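As a sketch, the per-entity operator mapping could be expressed like this. The tool name, entity types, and operator names are from the FAQ; the "operators" argument name is an assumption.

```python
# Hypothetical anonymize_text payload applying a different operator per entity type.
call = {
    "name": "anonymize_text",
    "arguments": {
        "text": "Pay John Smith via DE89 3704 0044 0532 0130 00, SSN 078-05-1120.",
        "operators": {
            "PERSON": "hash",        # irreversible hash
            "IBAN": "encrypt",       # recoverable with a key
            "CREDIT_CARD": "mask",   # e.g., show only last digits
            "US_SSN": "redact",      # remove entirely
        },
    },
}
```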
When you call anonymize_text with mode: "tokenize" (the default), the server returns a session_id. Tokens in the text can later be restored via detokenize_text using that session_id. There are two persistence levels: session (default) keeps tokens for 24 hours — suitable for single-conversation AI sessions. persistent keeps tokens for 30 days — suitable for long-running workflows, legal reviews, or records that span multiple sessions. You can list all active sessions with list_sessions and delete any with delete_session for GDPR erasure compliance.
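The round trip can be sketched as two tool calls. The mode value, persistence levels, and session_id come from the FAQ; the "persistence" argument name and the response shape are assumptions for this sketch.

```python
# Step 1: tokenize before the text reaches the AI model.
anonymize_call = {
    "name": "anonymize_text",
    "arguments": {
        "text": "Email anna.mueller@example.com about case 42.",
        "mode": "tokenize",        # default mode
        "persistence": "session",  # 24 h; use "persistent" for 30 days
    },
}

# Imagined server response for the sketch (not the documented schema):
response = {
    "text": "Email <EMAIL_1> about case 42.",
    "session_id": "sess_abc123",
}

# Step 2: after the AI responds, restore tokens with the same session_id.
detokenize_call = {
    "name": "detokenize_text",
    "arguments": {
        "text": response["text"],
        "session_id": response["session_id"],
    },
}
```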
Custom entity types are defined via the ad_hoc_recognizers parameter in both analyze_text and anonymize_text. Each recognizer specifies an entity_type name (e.g., "EMPLOYEE_ID"), one or more patterns (regex + optional confidence score), optional context words that boost detection confidence, and optional languages to limit scope. Up to 10 custom recognizers per request. This covers internal identifiers, project codes, proprietary formats, and any domain-specific data your organization uses that isn't in the standard 285+ entity library.
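A custom recognizer for a hypothetical internal employee ID could look like this. The field roles (entity_type, patterns, context words, languages) follow the FAQ; the exact JSON key names and score value are assumptions for the sketch.

```python
import re

# Hypothetical recognizer for IDs like "EMP-004217".
employee_id_recognizer = {
    "entity_type": "EMPLOYEE_ID",
    "patterns": [{"regex": r"\bEMP-\d{6}\b", "score": 0.85}],
    "context": ["employee", "staff", "badge"],  # boosts confidence when nearby
    "languages": ["en", "de"],                  # limit scope to these languages
}

call = {
    "name": "analyze_text",
    "arguments": {
        "text": "Badge check: EMP-004217 entered at 09:12.",
        "ad_hoc_recognizers": [employee_id_recognizer],  # up to 10 per request
    },
}

# Sanity check: the pattern matches the sample text client-side.
match = re.search(employee_id_recognizer["patterns"][0]["regex"],
                  call["arguments"]["text"])
```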
End-to-end (E2E) mode (e2e_mode: true in anonymize_text) shifts the token mapping to the client side. Instead of the server storing which token maps to which original value, the server returns the positions of each entity in the original text, and your client builds the mapping locally. The server only sees the anonymized text — never the PII values or the mapping. Use E2E mode when: you operate in a zero-trust environment where even anonym.legal's servers must not hold mappings; you need to encrypt the mapping with your own key; or you're building a self-contained system where the server is a detection engine only.
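The client-side half of E2E mode can be shown concretely: given entity positions (the imagined response shape below is an assumption), the client substitutes tokens and keeps the mapping locally.

```python
text = "Contact Jane Roe at jane@example.com"

# Imagined server response in E2E mode: entities with character offsets only.
entities = [
    {"entity_type": "PERSON", "start": 8, "end": 16},
    {"entity_type": "EMAIL", "start": 20, "end": 36},
]

def build_local_mapping(text, entities):
    """Replace each entity with a token; the mapping never leaves the client."""
    mapping, counters, parts, cursor = {}, {}, [], 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        etype = ent["entity_type"]
        counters[etype] = counters.get(etype, 0) + 1
        token = f"<{etype}_{counters[etype]}>"
        mapping[token] = text[ent["start"]:ent["end"]]  # original value, held locally
        parts.append(text[cursor:ent["start"]])
        parts.append(token)
        cursor = ent["end"]
    parts.append(text[cursor:])
    return "".join(parts), mapping

anonymized, mapping = build_local_mapping(text, entities)
# anonymized -> "Contact <PERSON_1> at <EMAIL_1>"; mapping stays on this machine.
```

Only `anonymized` would be sent onward; `mapping` can be encrypted with your own key before any storage.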
If an AI response contains tokens such as <PERSON_1> or <EMAIL_1>, those tokens can be restored by calling anonym_legal_detokenize_text with the text and the session_id from the original anonymization. In Claude Desktop (stdio mode), you can instruct Claude to call detokenize_text automatically as part of its response flow. The MCP Server does not intercept AI responses automatically — the detokenize call is explicit. This design gives you control over exactly when and whether to restore values.
Instead of listing entity types individually, you can pass entity_groups: ["DACH"] and all DE/AT/CH/LI identifiers are included automatically. There are 26 groups: UNIVERSAL (common PII), FINANCIAL (banking), HEALTHCARE (medical), CORPORATE (business IDs), NORTH_AMERICA, DACH, UK_IRELAND, FRANCE, LATIN_AMERICA, NORDIC, ITALY, LUSOPHONE, NETHERLANDS, POLAND, ASIA_PACIFIC, OCEANIA, EASTERN_EUROPE, CENTRAL_EUROPE, BALKANS, BALTIC, SOUTHERN_EUROPE, MIDDLE_EAST, VEHICLE, INSURANCE, LEGAL, EDUCATION. You can combine groups — e.g., ["UNIVERSAL", "FINANCIAL", "DACH"] — and add individual entities on top.
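Combining groups with individual entities could be sketched as below. The entity_groups parameter and group names are from the FAQ; the "entities" key for the individual add-ons is an assumption.

```python
# Hypothetical analyze_text payload mixing entity groups with an extra entity type.
call = {
    "name": "analyze_text",
    "arguments": {
        "text": "Kundin: Eva Brandt, AHV 756.1234.5678.97, Karte 4111 1111 1111 1111",
        "entity_groups": ["UNIVERSAL", "FINANCIAL", "DACH"],
        "entities": ["EMPLOYEE_ID"],  # individual entities added on top of groups
    },
}
```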
Vibe coding is the practice of building applications by describing intent in natural language to AI agents like Cursor, Claude, or Windsurf. The GDPR risk: developers routinely include real customer data — names, emails, account records, API keys — as context in these prompts. That data is then transmitted to external AI model providers' servers. Under GDPR Art. 5(1)(c), personal data must be limited to what is necessary. Sending full production records as AI context fails this test.
anonymize.dev's MCP server intercepts every prompt before it leaves your machine, replacing PII with reversible tokens. The AI receives only tokens — never real data.
Shadow AI refers to employees using unapproved AI tools — typically through personal accounts — outside IT governance. 88% of organisations use AI in at least one business function, yet nearly half of those users rely on personal AI accounts. This creates GDPR liability: there is no data processing agreement (DPA) in place, and no way to audit what data was sent.
When organisations provide approved, privacy-safe AI access, unauthorised tool use drops by 89%. anonymize.dev ensures that even approved tools handle PII correctly at the protocol layer.
The MCP server operates at the protocol layer — it intercepts AI prompts before they reach any AI model, regardless of whether the developer is using Claude Desktop, Cursor, VS Code, or another MCP-compatible tool. This means PII protection is enforced consistently across all approved AI tools, without requiring developers to remember to sanitise their prompts manually.
For teams, deploying the MCP server as a standardised configuration reduces the incentive to use personal AI accounts: privacy is handled automatically, so developers can work with real context without risk.
Yes. The EU AI Act's high-risk AI system requirements become enforceable on August 2, 2026. Key obligations relevant to AI-assisted development:
- Art. 10 — Data governance: Training and input data must be relevant, representative, and as free as possible from errors. anonymize.dev's data minimisation layer ensures AI models receive only what is necessary.
- Art. 13 — Transparency: High-risk AI systems must log inputs and outputs. anonymize.dev's session records provide an audit trail of what was anonymised, for which entity types, and when.
Combined with GDPR Art. 25 (privacy by design), anonymize.dev automates the data minimisation principle at the protocol layer.
GDPR Art. 25 requires data minimisation and privacy protection to be embedded in systems by design — not added as an afterthought. anonymize.dev implements this at the MCP protocol layer: PII is stripped before AI processing begins, ensuring the AI system never processes more personal data than necessary for the task. This is automatic — no developer action is needed on a per-request basis.
EU servers (Hetzner, Germany), ISO 27001:2022 certification, and Zero-Knowledge session mode (where the server never stores the token-to-value mapping in plaintext) provide the technical and organisational measures required under Art. 32.
Yes. Session records document: which entity types were detected, which operators were applied (hash/encrypt/mask/redact/replace), the session ID, and the timestamp. These logs can be exported to demonstrate that data minimisation was applied before AI processing. The delete_session tool provides GDPR Art. 17 (right to erasure) compliance — permanently deleting all token mappings on request.
Note: anonymize.dev covers the AI workflow layer. A full GDPR audit covers your entire data processing estate — organisational policies, DPAs, and DPIAs remain your responsibility.
Agentic AI refers to AI systems that autonomously plan and execute multi-step tasks — browsing web pages, querying databases, calling APIs, writing and running code — often with minimal human oversight per step. The privacy risk is compounded compared to a single chat interaction: the agent may accumulate large volumes of personal data across many tool calls, and human oversight of each individual action is often impractical.
The UK ICO published guidance in January 2026 specifically addressing agentic AI data protection obligations. anonymize.dev mitigates this by ensuring every prompt and context window entering the agent's AI model contains only tokens, not real PII — regardless of how many tool calls the agent makes.
When you add third-party MCP servers to Claude or Cursor (for database access, file operations, web search), privacy protections from Anthropic or Cursor apply to the AI model — not to the MCP servers you add. The AI can instruct those servers to process any data in the current conversation context, including PII.
anonymize.dev closes this gap by intercepting every prompt at the MCP protocol layer before the AI model — and therefore before any other MCP server — sees it. Other MCP servers only ever receive tokenised data.
A Toxic Agent Flow attack occurs when malicious content in an external source (a GitHub issue, a web page, a document the agent reads) contains hidden instructions that hijack the AI agent and cause it to exfiltrate data through connected MCP tools. A documented example: attackers used GitHub issues to cause the GitHub MCP server to leak private repository contents.
anonymize.dev limits the blast radius of this attack: because real PII has been replaced with tokens before the agent's context is built, any data exfiltrated via a Toxic Agent Flow attack contains only tokens — not real customer records, API keys, or other sensitive values.
Looking for more FAQs? The full FAQ library covers Zero-Knowledge auth, GDPR compliance, MCP Server, Office Add-in, and more.
View Full FAQ Library →
Ready to protect your AI workflows?
Get started with the MCP Server in under a minute. Free tier — no credit card required.