Anonymization
Enable automatic anonymization in Caplena to protect personal data, stay compliant, and safely analyze or share sensitive feedback.
Caplena can automatically remove personally identifiable information (PII) from all text comments, helping you stay compliant with privacy regulations and safely share or analyze sensitive data.
How to Enable Anonymization
You can enable anonymization when creating a new project:
- Click New Project
- In the upload flow, you’ll see the Anonymization step
- Toggle it on to activate anonymization
Choose What to Anonymize
After toggling anonymization on, a settings panel will appear where you can choose the types of information you want to anonymize:
- Email addresses
- Phone numbers
- Usernames
- And more
Simply check the relevant boxes.
Want more control? Click Advanced Settings to:
- Include extra PII types (e.g. ZIP Code, Religion, Gender)
- Add custom sensitive data fields
- Tailor anonymization to industry-specific compliance
✅ Allow-list vs. ⛔ Block-list
You can further refine anonymization behavior using:
- Allow-list – Words that should not be anonymized (e.g., "John Doe")
- Block-list – Words that must always be anonymized, even if they aren’t names (e.g., "very curious")
How to use:
- Click “Add term”
- Paste in a list from Excel
- Matching is case-insensitive and based on whole words only
How It Works
- Case insensitive – “Smith” and “smith” are treated the same.
- Exact matches only – Only full matches will be considered.
- Whole word match – A match must be a complete word or phrase, not part of a longer one.
✅ Allow-list examples:
Names or terms that should NOT be anonymized.
- Smith
- John Doe
⛔ Block-list examples:
Words or phrases that should always be anonymized, even if they are not names.
- is
- very curious
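To preview which parts of a comment a list would hit before you upload, the matching rules above (case-insensitive, exact, whole word or phrase) can be approximated locally. The following is a minimal sketch, not Caplena's implementation: the function names `compile_terms` and `find_terms` are hypothetical, and the regular-expression approach is an assumption about how such matching could be reproduced.

```python
import re


def compile_terms(terms):
    """Build one case-insensitive, whole-word pattern from a list of terms.

    Hypothetical helper: it only mirrors the rules described in this article.
    """
    # Drop empty entries and try longer phrases first ("John Doe" before "John").
    ordered = sorted((t.strip() for t in terms if t.strip()), key=len, reverse=True)
    if not ordered:
        return None
    alternation = "|".join(re.escape(t) for t in ordered)
    # (?<!\w) and (?!\w) reject matches that sit inside a longer word.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def find_terms(text, terms):
    """Return every occurrence of a listed term in text, as written in the text."""
    pattern = compile_terms(terms)
    if pattern is None:
        return []
    return [m.group(0) for m in pattern.finditer(text)]


allow_list = ["Smith", "John Doe"]
block_list = ["is", "very curious"]

# "Smithson" is skipped because "Smith" is only part of a longer word.
print(find_terms("Mr. smith met Smithson and JOHN DOE.", allow_list))
# The "is" inside "this" is skipped for the same reason.
print(find_terms("She is very curious about this", block_list))
```

Running a sample of your comments through a check like this can show whether a short block-list term such as "is" would match far more often than intended.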
Once you've selected the settings that match your needs, click Continue to move forward with your data import or processing.
Note that anonymization cannot be undone, because only the anonymized data is stored on Caplena.
Anonymization takes place right after the data is uploaded to the Caplena server. The original data, which includes the PII, will not be visible at any time. For technical reasons, the original data remains on the Caplena server for a short period before it is automatically and permanently deleted from the server and its backup system.
Anonymization Tips
Address vs. Location:
Address refers to structured location formats: street name, house number, ZIP/postcode, and city (e.g., “25 Oxford Street, London W1D 2LF” or “742 Evergreen Terrace, Springfield, IL 62704”)
Location captures general geographic mentions or landmarks (e.g., “Central Park”, “the Lake District”, “Northern California”)
Address vs. Street/City/ZIP/Postcode
- If you anonymize too much, you’ll need to reupload the data in a new project
- If anything goes wrong, reach out to us. We’re happy to help and will reimburse credits if needed
Anonymization and Translations
Caplena applies anonymization before translation, meaning that if the source text is anonymized, the translated version will also be anonymized.
⚠️ Please note:
Not all languages are currently supported for anonymization. Texts in unsupported languages may not be anonymized, or may be only partially anonymized.
Language | ISO Code |
---|---|
Afrikaans | af |
Arabic | ar |
Bambara | bm |
Belarusian | be |
Bengali | bn |
Bulgarian | bg |
Burmese | my |
Cantonese (Traditional) | zh-TW |
Catalan | ca |
Croatian | hr |
Czech | cs |
Danish | da |
Dutch | nl |
English | en |
Estonian | et |
Finnish | fi |
French | fr |
Georgian | ka |
German | de |
Greek | el |
Hebrew | he |
Hindi | hi |
Hungarian | hu |
Icelandic | is |
Indonesian | id |
Italian | it |
Japanese | ja |
Khmer | km |
Korean | ko |
Latvian | lv |
Lithuanian | lt |
Luxembourgish | lb |
Malay | ms |
Mandarin (Simplified) | zh-CN |
Moldovan | ro |
Norwegian (Bokmål) | nb |
Persian (Farsi) | fa |
Polish | pl |
Portuguese | pt |
Punjabi | pa |
Romanian | ro |
Russian | ru |
Slovak | sk |
Slovenian | sl |
Spanish | es |
Swahili | sw |
Swedish | sv |
Tagalog | tl |
Tamil | ta |
Thai | th |
Turkish | tr |
Ukrainian | uk |
Vietnamese | vi |
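
If your data already carries a language code per comment, you can check it against the table before uploading. The sketch below assumes that the table above lists the supported languages and that your codes use the same ISO format. The set name `SUPPORTED_CODES` and the function `unsupported_languages` are hypothetical and not part of Caplena.

```python
# ISO codes copied from the table above (assumed to be the supported set).
SUPPORTED_CODES = {
    "af", "ar", "bm", "be", "bn", "bg", "my", "zh-TW", "ca", "hr", "cs",
    "da", "nl", "en", "et", "fi", "fr", "ka", "de", "el", "he", "hi", "hu",
    "is", "id", "it", "ja", "km", "ko", "lv", "lt", "lb", "ms", "zh-CN",
    "ro", "nb", "fa", "pl", "pt", "pa", "ru", "sk", "sl", "es", "sw", "sv",
    "tl", "ta", "th", "tr", "uk", "vi",
}
_SUPPORTED_LOWER = {code.lower() for code in SUPPORTED_CODES}


def unsupported_languages(codes):
    """Return the lowercased codes, in first-seen order, that are not in the table."""
    missing = []
    for code in codes:
        key = code.strip().lower()
        if key not in _SUPPORTED_LOWER and key not in missing:
            missing.append(key)
    return missing


# Example: language codes taken from a column in your upload file.
print(unsupported_languages(["en", "de", "xx", "zh-tw", "XX", "cy"]))
```

Comments in any language this check flags are the ones to review manually, since they may be left fully or partly un-anonymized.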