What Operators Are
When you call anonym_legal_anonymize_text, PII in your text gets detected and replaced. By default, all detected entities are replaced with positional tokens: <PERSON_1>, <EMAIL_1>, etc.
The operators parameter overrides this default per entity type. You pass a JSON object in which each key is an entity type (e.g., "PERSON") and each value is an operator configuration. Six operators are available: replace, redact, hash, encrypt, mask, and keep.
"operators": {
"ENTITY_TYPE_1": { "type": "operator_name", ...params },
"ENTITY_TYPE_2": { "type": "operator_name", ...params },
...
}
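As a rough illustration, the structure above can be assembled as a Python dictionary and serialized to JSON before it is passed as tool arguments. The helper name build_arguments is hypothetical and not part of any SDK; the text, operators, and entity_groups fields are taken from the examples on this page.

```python
import json


def build_arguments(text, operators, entity_groups=None):
    """Assemble the arguments object (hypothetical helper for illustration)."""
    arguments = {"text": text, "operators": operators}
    if entity_groups is not None:
        arguments["entity_groups"] = entity_groups
    return arguments


arguments = build_arguments(
    "Contact John Smith at john@example.com",
    {
        # One entry per entity type; types not listed fall back to the default.
        "PERSON": {"type": "replace", "new_value": "[CLIENT]"},
        "EMAIL_ADDRESS": {"type": "redact"},
    },
)
payload = json.dumps(arguments, indent=2)
print(payload)
```

Entity types that are not listed in operators are left to the default behavior described above.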
replace
Replaces the detected entity with a static string you provide via new_value. If new_value is omitted, the entity is replaced with a type-labeled placeholder token like <PERSON>.
Use when: you need a human-readable placeholder — legal redlines ([CLIENT]), audit trails ([REDACTED]), or labeled markers that tell the AI what type was removed without providing the value.
type: "replace" · new_value?: string (max 100 chars)
"PERSON": { "type": "replace", "new_value": "[CLIENT]" }
"LOCATION": { "type": "replace", "new_value": "[LOCATION]" }
"EMAIL_ADDRESS": { "type": "replace" }
redact
Removes the entity from the text completely. No token and no placeholder remain; the surrounding text is preserved. Use with mode: "redact" globally for permanent removal, or selectively per entity type.
Use when: the entity must not exist in any form — SSNs, passwords, authentication tokens. GDPR Art. 25 data minimisation. Any scenario where even a placeholder carries risk.
type: "redact" · no additional parameters
"US_SSN": { "type": "redact" }
"CREDIT_CARD": { "type": "redact" }
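To make the difference between replace and redact concrete, here is a minimal local sketch that applies both to already-detected spans. It is not the service's implementation. The (start, end, entity_type) span format and the function name apply_to_spans are assumptions made for the example.

```python
def apply_to_spans(text, spans, operators):
    """Apply replace/redact to detected spans.

    spans: list of (start, end, entity_type) tuples. This shape is an
    assumption for the sketch, not the service's response format.
    """
    # Work right to left so earlier offsets stay valid after each edit.
    for start, end, entity_type in sorted(spans, reverse=True):
        op = operators.get(entity_type)
        if op is None:
            continue
        if op["type"] == "replace":
            # Fall back to a type-labeled placeholder when new_value is omitted.
            substitute = op.get("new_value", f"<{entity_type}>")
        elif op["type"] == "redact":
            substitute = ""
        else:
            continue
        text = text[:start] + substitute + text[end:]
    return text


text = "John Smith, SSN 123-45-6789"
spans = [(0, 10, "PERSON"), (16, 27, "US_SSN")]
operators = {
    "PERSON": {"type": "replace", "new_value": "[CLIENT]"},
    "US_SSN": {"type": "redact"},
}
result = apply_to_spans(text, spans, operators)
print(repr(result))  # '[CLIENT], SSN '
```

Note that redact leaves the surrounding text, including the "SSN" label and the trailing space, untouched.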
hash
Replaces the entity with a SHA-256 or SHA-512 hex digest. The same input always produces the same hash, which makes it useful for deduplication and for linking records without storing original values. Hashing is one-way: you cannot recover the original from the hash.
Use when: you need to link records (same person across documents) without retaining PII. Healthcare de-identification (HIPAA Safe Harbor). Analytics where individual identity is needed for counting but not display. Audit logs where users must be traceable but not identifiable.
type: "hash" · hash_type?: "SHA256" | "SHA512" (default: SHA256)
"PERSON": { "type": "hash", "hash_type": "SHA256" }
"EMAIL_ADDRESS": { "type": "hash", "hash_type": "SHA256" }
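The determinism described above can be checked locally with Python's hashlib. This sketch assumes the digest is computed over the UTF-8 bytes of the detected value with no salt or normalization; the page does not state whether the service does either, so the exact digests it returns may differ.

```python
import hashlib


def hash_entity(value, hash_type="SHA256"):
    """Return a hex digest for value (local sketch, unsalted by assumption)."""
    algorithms = {"SHA256": hashlib.sha256, "SHA512": hashlib.sha512}
    return algorithms[hash_type](value.encode("utf-8")).hexdigest()


a = hash_entity("jane.doe@example.com")
b = hash_entity("jane.doe@example.com")
c = hash_entity("Jane.Doe@example.com")

print(a == b)  # True: identical input, identical digest
print(a == c)  # False: a case difference changes the digest
print(len(a), len(hash_entity("jane.doe@example.com", "SHA512")))  # 64 128
```

Because any difference in the input changes the digest, record linking only works when the same entity is written identically across documents.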
encrypt
Encrypts the entity value with AES-256 using a key you provide. The key must be exactly 16, 24, or 32 characters. anonym.legal does not store or log the key. Anyone with the key can decrypt the value. Unlike tokenization, no server-side session is needed, so decryption is self-contained.
Use when: you need the original value recoverable by a specific key holder, but not by the AI or by server-side session lookup. Cross-system sharing where the receiving party has the key. Long-term archives where session tokens would expire. Zero-knowledge deployments where server access is untrusted.
type: "encrypt" · key: string (exactly 16, 24, or 32 characters — required)
"IBAN_CODE": { "type": "encrypt", "key": "my-32-char-encryption-key-here-!" }
"CREDIT_CARD": { "type": "encrypt", "key": "my-32-char-encryption-key-here-!" }
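Since the key must have one of three exact lengths, it can help to generate and check it in code rather than typing a phrase by hand. The sketch below uses only the standard library. The helper names are hypothetical, and restricting the key to ASCII letters and digits is an assumption, since the page does not say which characters the service accepts.

```python
import secrets
import string

VALID_KEY_LENGTHS = (16, 24, 32)


def generate_key(length=32):
    """Generate a random key of an accepted length (hypothetical helper)."""
    if length not in VALID_KEY_LENGTHS:
        raise ValueError(f"key length must be one of {VALID_KEY_LENGTHS}")
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def encrypt_operator(key):
    """Build an encrypt operator config after checking the key length."""
    if len(key) not in VALID_KEY_LENGTHS:
        raise ValueError(f"key has {len(key)} characters, expected 16, 24, or 32")
    return {"type": "encrypt", "key": key}


key = generate_key()
operators = {
    "IBAN_CODE": encrypt_operator(key),
    "CREDIT_CARD": encrypt_operator(key),
}
```

The key then has to be stored by the caller, because according to the description above the service keeps no copy of it.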
mask
Replaces a portion of the entity with a masking character (default: *). You control how many characters to mask and whether to mask from the start or end. The remaining visible characters provide enough context for the AI to reason about the type without seeing the full value.
Use when: partial visibility is needed for context — last 4 digits of a card number, first 3 digits of a phone, visible domain in an email. Customer support UIs. Audit logs showing "user ending in 89" without exposing full data.
type: "mask" · chars_to_mask: number (1–100, required) · masking_char?: string (1 char, default: "*") · from_end?: boolean (mask from end if true, from start if false — default: false)
"PHONE_NUMBER": { "type": "mask", "chars_to_mask": 6, "from_end": true }
"US_SSN": { "type": "mask", "chars_to_mask": 7, "from_end": false }
"CREDIT_CARD": { "type": "mask", "chars_to_mask": 12, "masking_char": "X", "from_end": false }
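The effect of chars_to_mask, masking_char, and from_end is easiest to see on concrete values. The following is a local sketch of the behavior as described, not the service's code. It assumes that separators such as hyphens count as characters and that a count larger than the value simply masks everything.

```python
def mask_value(value, chars_to_mask, masking_char="*", from_end=False):
    """Mask part of value following the parameters described above."""
    if not 1 <= chars_to_mask <= 100:
        raise ValueError("chars_to_mask must be between 1 and 100")
    if len(masking_char) != 1:
        raise ValueError("masking_char must be a single character")
    # Assumption: never mask more characters than the value contains.
    n = min(chars_to_mask, len(value))
    if from_end:
        return value[: len(value) - n] + masking_char * n
    return masking_char * n + value[n:]


print(mask_value("555-123-4567", 6, from_end=True))  # 555-12******
print(mask_value("123-45-6789", 7))                  # *******6789
print(mask_value("4532015112830366", 12, "X"))       # XXXXXXXXXXXX0366
```

Under these assumptions, the three configurations above keep the first six characters of the phone number and the last four digits of the SSN and the card number.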
keep
Detects the entity type but leaves the original value in place. This is the opt-out operator, useful when you want to analyze which types exist (or get detection counts) while preserving certain values that are acceptable to share with the AI.
Use when: dates, general locations, or other non-sensitive types are contextually important and should not be obscured; you want detection results (e.g., via analyze_text first) while letting some types pass through; or a preset covers broad entity groups but should skip a few types.
type: "keep" · no additional parameters
"DATE_TIME": { "type": "keep" }
"LOCATION": { "type": "keep" }
"PERSON": { "type": "redact" }
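The parameter constraints listed for the six operators can be checked locally before a call is made. This is a minimal sketch. The function validate_operator is hypothetical, and the service may enforce more or different rules than the ones stated on this page.

```python
def validate_operator(config):
    """Return a list of problems found in one operator configuration."""
    problems = []
    op = config.get("type")
    if op == "replace":
        new_value = config.get("new_value")
        if new_value is not None and len(new_value) > 100:
            problems.append("new_value exceeds 100 characters")
    elif op == "hash":
        if config.get("hash_type", "SHA256") not in ("SHA256", "SHA512"):
            problems.append("hash_type must be SHA256 or SHA512")
    elif op == "encrypt":
        if len(config.get("key", "")) not in (16, 24, 32):
            problems.append("key must be exactly 16, 24, or 32 characters")
    elif op == "mask":
        n = config.get("chars_to_mask")
        if not isinstance(n, int) or not 1 <= n <= 100:
            problems.append("chars_to_mask must be an integer from 1 to 100")
        if len(config.get("masking_char", "*")) != 1:
            problems.append("masking_char must be a single character")
    elif op not in ("redact", "keep"):
        problems.append(f"unknown operator type: {op!r}")
    return problems


operators = {
    "DATE_TIME": {"type": "keep"},
    "PERSON": {"type": "redact"},
    "US_SSN": {"type": "mask"},  # chars_to_mask is missing
}
for entity_type, config in operators.items():
    for problem in validate_operator(config):
        print(f"{entity_type}: {problem}")
```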
Operator Decision Table
Pick the right operator for each situation:
| Requirement | Operator | Recoverable? |
| AI needs context about what was removed | replace | No (label only) |
| Value must not appear in any form | redact | No |
| Need to link same entity across records (dedup) | hash | No (one-way) |
| Specific party must be able to decrypt later | encrypt | Yes (with key) |
| Partial visibility needed (last 4 digits) | mask | Partial |
| AI needs full value, not sensitive | keep | n/a (not changed) |
| Need to restore original in AI response | tokenize (default) | Yes (session_id) |
Combined Operator Examples
Real-world scenarios mixing operators in a single call:
Legal — Contract Review
{
"text": "The contract between ACME Corp and John Smith (SSN 123-45-6789)\n for property at 123 Main St, IBAN DE89370400440532013000...",
"entity_groups": ["UNIVERSAL", "FINANCIAL", "NORTH_AMERICA", "LEGAL"],
"operators": {
"PERSON": { "type": "replace", "new_value": "[PARTY]" },
"LOCATION": { "type": "replace", "new_value": "[ADDRESS]" },
"US_SSN": { "type": "redact" },
"IBAN_CODE": { "type": "encrypt", "key": "legal-vault-32char-key---------!" },
"DATE_TIME": { "type": "keep" }
},
"persistence": "persistent"
}
Healthcare — HIPAA De-identification
{
"text": "Patient Jane Doe (MRN: MR-8842156, DOB: 1985-03-15)\n prescribed metformin 500mg, US Medicare 1EG4-TE5-MK72",
"entity_groups": ["UNIVERSAL", "HEALTHCARE", "NORTH_AMERICA"],
"operators": {
"PERSON": { "type": "hash", "hash_type": "SHA256" },
"DATE_TIME": { "type": "replace", "new_value": "[DATE]" },
"MEDICAL_RECORD_NUMBER": { "type": "hash", "hash_type": "SHA256" },
"US_MEDICARE": { "type": "redact" }
}
}
FinTech — Transaction Analysis
{
"text": "Transaction from account 4532015112830366 (John Smith)\n IBAN DE89370400440532013000, amount €4,250.00",
"entity_groups": ["UNIVERSAL", "FINANCIAL", "DACH"],
"operators": {
"CREDIT_CARD": { "type": "mask", "chars_to_mask": 12, "from_end": false },
"IBAN_CODE": { "type": "mask", "chars_to_mask": 14, "from_end": true },
"PERSON": { "type": "hash", "hash_type": "SHA256" }
}
}
Summary
- replace: human-readable label, original value not recoverable
- redact: permanent removal, no trace
- hash: deterministic and one-way, enables deduplication
- encrypt: AES-256, reversible only with your key
- mask: partial visibility, configurable from start or end
- keep: detect but pass through unchanged
The default (no operator specified for an entity type) is equivalent to replace with a positional token like <PERSON_1>, which is reversible via anonym_legal_detokenize_text using the returned session_id.
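The default round trip can be pictured with a small local simulation. This sketch is only a model of the idea. In the real service the token-to-value mapping is held server-side and looked up via session_id, whereas here it is a plain dictionary. The span format and the choice to give every occurrence its own token are assumptions.

```python
from collections import defaultdict


def tokenize(text, spans):
    """Replace spans with positional tokens and return (text, mapping).

    spans: non-overlapping (start, end, entity_type) tuples (assumed shape).
    """
    counters = defaultdict(int)
    mapping = {}
    pieces = []
    cursor = 0
    for start, end, entity_type in sorted(spans):
        counters[entity_type] += 1
        token = f"<{entity_type}_{counters[entity_type]}>"
        mapping[token] = text[start:end]
        pieces.append(text[cursor:start])
        pieces.append(token)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


def detokenize(text, mapping):
    """Restore original values wherever their tokens appear."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


source = "John Smith emailed jane@example.com"
spans = [(0, 10, "PERSON"), (19, 35, "EMAIL_ADDRESS")]
tokenized, mapping = tokenize(source, spans)
print(tokenized)  # <PERSON_1> emailed <EMAIL_ADDRESS_1>

# Tokens survive inside text written by someone else, e.g. an AI response.
reply = "I will write to <EMAIL_ADDRESS_1> on behalf of <PERSON_1>."
print(detokenize(reply, mapping))
```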
Full tool reference: See the MCP Server page for the complete parameter documentation of all 7 tools, 26 entity groups, and integration setup guides.