Protect sensitive user data in n8n AI workflows by masking PII before it reaches any language model. Use the Code node to detect and replace emails, phone numbers, SSNs, and credit card numbers with placeholders, then restore originals after the LLM responds. This keeps private data out of third-party APIs while preserving workflow functionality.
Why PII Masking Matters in AI Workflows
When you send user data through n8n to a language model, every piece of personally identifiable information (PII) leaves your server and lands on a third-party API. This creates compliance risks under GDPR, HIPAA, and CCPA. The solution is a two-step mask-then-unmask pattern: a Code node strips PII before the LLM call, and another Code node restores it afterward. This tutorial walks you through building that pattern with real regex rules and a reusable mapping table.
Prerequisites
- A running n8n instance (v1.30 or later)
- An active credential for at least one LLM provider (OpenAI, Anthropic, or Google)
- Basic understanding of JavaScript regular expressions
- Familiarity with n8n Code node and expression syntax
Step-by-step guide
Create the Webhook trigger to receive user messages
Start a new workflow and add a Webhook node. Set the HTTP Method to POST and the Path to /pii-safe-chat. In the Response section, set Response Mode to 'Using Respond to Webhook Node' so the workflow can process the message and return a cleaned response. This webhook will accept a JSON body with a 'message' field containing the user's text and an optional 'userId' field.
```javascript
// Expected incoming payload:
// POST /webhook/pii-safe-chat
// { "message": "My email is john@example.com and SSN is 123-45-6789", "userId": "user_42" }
```

Expected result: Webhook node is configured and shows a test URL ending in /webhook-test/pii-safe-chat
Add a Code node to detect and mask PII
Add a Code node after the Webhook. Name it 'Mask PII'. This node scans the user message with regex patterns for common PII types, replaces each match with a uniquely numbered placeholder such as [EMAIL_1], and stores the placeholder-to-original mapping in the output so you can reverse it later. The mapping travels alongside the masked message through the rest of the workflow.
```javascript
const input = $input.first().json;
const message = input.message || '';

const patterns = [
  { type: 'EMAIL', regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g },
  { type: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'PHONE', regex: /\b(?:\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g },
  { type: 'CREDIT_CARD', regex: /\b\d{4}[-.\s]?\d{4}[-.\s]?\d{4}[-.\s]?\d{4}\b/g }
];

const mapping = {};
let masked = message;
let counter = 0;

for (const p of patterns) {
  masked = masked.replace(p.regex, (match) => {
    counter++;
    const placeholder = `[${p.type}_${counter}]`;
    mapping[placeholder] = match;
    return placeholder;
  });
}

return [{ json: { maskedMessage: masked, piiMapping: mapping, userId: input.userId } }];
```

Expected result: The Code node output shows maskedMessage with placeholders like [EMAIL_1] replacing real data, and piiMapping containing the originals
Send the masked message to the LLM
Add an OpenAI node (or any LLM node) after the Mask PII node. In the Prompt field, use the expression {{ $json.maskedMessage }} so only the sanitized text reaches the model. Set the System Message to instruct the model to preserve any bracketed placeholders in its response. This ensures the LLM never sees real PII and keeps placeholders intact for later restoration.
```javascript
// System Message for the LLM node:
// "You are a helpful assistant. When you see placeholders like [EMAIL_1] or [SSN_1],
// keep them exactly as-is in your response. Never attempt to guess the real values
// behind placeholders."
```

Expected result: The LLM responds using the placeholders instead of real PII values
Add a Code node to unmask PII in the response
Add another Code node after the LLM node. Name it 'Unmask PII'. This node takes the LLM response and replaces every placeholder back with the original value from the mapping table. The result is a fully personalized response that never exposed real PII to the language model.
```javascript
const llmResponse = $input.first().json.text || $input.first().json.message?.content || '';
const mapping = $('Mask PII').first().json.piiMapping;

let restored = llmResponse;
for (const [placeholder, original] of Object.entries(mapping)) {
  restored = restored.replaceAll(placeholder, original);
}

return [{ json: { response: restored, userId: $('Mask PII').first().json.userId } }];
```

Expected result: The output contains the LLM's response with all original PII values restored in their correct positions
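Outside n8n, the mask/unmask pair can be exercised as plain functions. Below is a minimal round-trip sketch: `maskPII` and `unmaskPII` are illustrative helper names (not n8n built-ins), and the pattern list is trimmed to two types for brevity.

```javascript
// Standalone round-trip demo of the mask-then-unmask pattern.
// maskPII and unmaskPII are illustrative helpers, not n8n built-ins.
const patterns = [
  { type: 'EMAIL', regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g },
  { type: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g }
];

function maskPII(message) {
  const mapping = {};
  let masked = message;
  let counter = 0;
  for (const p of patterns) {
    masked = masked.replace(p.regex, (match) => {
      counter++;
      const placeholder = `[${p.type}_${counter}]`;
      mapping[placeholder] = match;   // placeholder -> original value
      return placeholder;
    });
  }
  return { masked, mapping };
}

function unmaskPII(text, mapping) {
  let restored = text;
  for (const [placeholder, original] of Object.entries(mapping)) {
    restored = restored.replaceAll(placeholder, original);
  }
  return restored;
}

const { masked, mapping } = maskPII('Reach me at john@example.com, SSN 123-45-6789.');
// masked: 'Reach me at [EMAIL_1], SSN [SSN_2].'
const roundTrip = unmaskPII(masked, mapping);
// roundTrip equals the original message
```

Note that the counter is global across all PII types, so the second match is [SSN_2], not [SSN_1]; the round trip works either way because the mapping keys are the exact placeholders.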
Return the response via Respond to Webhook node
Add a Respond to Webhook node at the end of the workflow. Set the Response Body to Expression and use {{ $json.response }} to send back the unmasked, personalized response to the calling application. This completes the round-trip: real data in, masked data to LLM, real data out.
Expected result: Sending a test POST to the webhook returns a personalized AI response with no PII ever leaving your n8n instance
Add the Guardrails node for layered protection
For additional safety, add a Guardrails node between the Mask PII node and the LLM node. Enable the PII Sanitization check. This acts as a second layer: if the regex in your Code node misses something, the Guardrails node catches it. Also enable Prompt Injection Detection to block adversarial inputs that try to trick the LLM into revealing system instructions or bypassing the masking.
Expected result: The Guardrails node passes clean messages through and blocks any that contain unmasked PII or injection attempts
Complete working example
```javascript
// ====== Mask PII — Code Node ======
const input = $input.first().json;
const message = input.message || '';

const patterns = [
  { type: 'EMAIL', regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g },
  { type: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'PHONE', regex: /\b(?:\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g },
  { type: 'CREDIT_CARD', regex: /\b\d{4}[-.\s]?\d{4}[-.\s]?\d{4}[-.\s]?\d{4}\b/g },
  { type: 'IP_ADDRESS', regex: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g },
  { type: 'DATE_OF_BIRTH', regex: /\b(?:0[1-9]|1[0-2])\/(?:0[1-9]|[12]\d|3[01])\/(?:19|20)\d{2}\b/g }
];

const mapping = {};
let masked = message;
let counter = 0;

for (const p of patterns) {
  masked = masked.replace(p.regex, (match) => {
    counter++;
    const placeholder = `[${p.type}_${counter}]`;
    mapping[placeholder] = match;
    return placeholder;
  });
}

return [{
  json: {
    maskedMessage: masked,
    piiMapping: mapping,
    piiDetected: counter > 0,
    piiCount: counter,
    userId: input.userId || 'anonymous'
  }
}];

// ====== Unmask PII — Code Node ======
// Place this in a separate Code node after the LLM node
//
// const llmResponse = $input.first().json.text || '';
// const mapping = $('Mask PII').first().json.piiMapping;
// let restored = llmResponse;
// for (const [placeholder, original] of Object.entries(mapping)) {
//   restored = restored.replaceAll(placeholder, original);
// }
// return [{ json: { response: restored } }];
```

Common mistakes when protecting sensitive user data in prompts passed through n8n
- Mistake: Sending raw user input directly to the LLM node without any preprocessing. How to avoid: Always place a Code node or Guardrails node before the LLM to scan and sanitize input.
- Mistake: Using simple string.replace() instead of replaceAll() when unmasking, leaving some placeholders intact. How to avoid: Use replaceAll() or a global regex so every occurrence of each placeholder is restored.
- Mistake: Storing the PII mapping in static workflow data, which persists across executions and leaks data between users. How to avoid: Keep the mapping in the item's json output so it only lives for the duration of one execution.
- Mistake: Forgetting that the LLM might rephrase or split placeholders across lines. How to avoid: Instruct the model in the system prompt to preserve placeholders exactly, and add fuzzy matching in the unmask step.
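The fuzzy matching mentioned above can be implemented by building a tolerant regex per placeholder that accepts stray whitespace the model may have introduced around the brackets or underscore. This is an illustrative sketch (`fuzzyUnmask` is a hypothetical helper, not an n8n feature):

```javascript
// Tolerant unmasking: accepts minor placeholder mangling such as
// "[ EMAIL_1 ]" or "[EMAIL _1]" that an LLM may introduce.
function fuzzyUnmask(text, mapping) {
  let restored = text;
  for (const [placeholder, original] of Object.entries(mapping)) {
    const inner = placeholder.slice(1, -1);        // e.g. "EMAIL_1" or "CREDIT_CARD_2"
    const i = inner.lastIndexOf('_');              // split on the LAST underscore,
    const type = inner.slice(0, i);                // so multi-word types survive
    const num = inner.slice(i + 1);
    // "[EMAIL_1]" -> /\[\s*EMAIL\s*_\s*1\s*\]/g
    const loose = new RegExp(`\\[\\s*${type}\\s*_\\s*${num}\\s*\\]`, 'g');
    restored = restored.replace(loose, original);
  }
  return restored;
}

fuzzyUnmask('Contact [ EMAIL_1 ] today.', { '[EMAIL_1]': 'john@example.com' });
// -> 'Contact john@example.com today.'
```

Splitting on the last underscore matters because type names like CREDIT_CARD contain underscores themselves.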
Best practices
- Never log PII mapping tables to n8n execution history in production — disable Save Successful Execution Data or redact logs
- Use environment variables for any API keys and never include them in Code node scripts
- Layer defenses: regex masking in Code node plus Guardrails node PII detection for redundancy
- Test with edge cases like PII embedded in URLs, email signatures, and multi-line text
- Add jurisdiction-specific patterns (IBAN for EU, Aadhaar for India, NHS numbers for UK)
- Set the LLM temperature to 0 when processing PII-adjacent tasks to reduce hallucination risk
- Review masked outputs periodically to catch new PII formats your regex does not cover
- Consider encrypting the PII mapping at rest if your n8n instance stores execution data
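A quick way to exercise the edge cases above is a small harness run locally against the same regexes your Code node uses. The test strings below are illustrative:

```javascript
// Minimal harness: check the email pattern against tricky inputs
// (PII inside a URL, a multi-line signature block, and a clean message).
const emailRegex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

const cases = [
  { input: 'Visit https://example.com/u/jane@corp.io/profile', shouldMatch: true },
  { input: 'Best,\nJane Doe\njane.doe@corp.io\n555-123-4567', shouldMatch: true },
  { input: 'No contact details here', shouldMatch: false }
];

for (const c of cases) {
  const found = c.input.match(emailRegex) || [];
  const ok = (found.length > 0) === c.shouldMatch;
  console.log(ok ? 'PASS' : 'FAIL', JSON.stringify(found));
}
```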
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I'm building an n8n workflow that sends user messages to an LLM. How do I mask PII (emails, SSNs, phone numbers) before the LLM call and restore them afterward? Give me JavaScript code for two n8n Code nodes: one for masking and one for unmasking.
Add a Code node before the OpenAI node. In the Code node, write a JavaScript function that uses regex to find emails, SSNs, and phone numbers in {{ $json.message }}, replaces them with placeholders like [EMAIL_1], and stores the mapping. After the LLM node, add another Code node that reads the mapping and restores originals.
Frequently asked questions
Does the Guardrails node in n8n detect all types of PII automatically?
The Guardrails node detects common PII types like emails, phone numbers, and credit card numbers, but it may miss jurisdiction-specific identifiers like IBANs or national ID numbers. Use it as a second layer alongside custom regex in a Code node for comprehensive coverage.
Can I use this masking approach with any LLM provider in n8n?
Yes. Since the masking happens in a Code node before the LLM call, it works with OpenAI, Anthropic Claude, Google Gemini, Cohere, Mistral, or any other LLM node. The LLM only ever sees the masked text.
What happens if the LLM generates new PII in its response?
The unmask step only restores your original mapped values. To catch LLM-generated PII, add a second Guardrails node or Code node scan after the LLM response and before returning data to the user.
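A sketch of that post-response scan, runnable standalone (`findGeneratedPII` is a hypothetical helper and the pattern list is abbreviated; in a Code node you would read the response and mapping from `$input` and `$('Mask PII')`):

```javascript
// Scan the final response for PII the model may have generated itself.
// Values restored from your own mapping are expected; anything else is flagged.
const piiPatterns = [
  /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,  // email
  /\b\d{3}-\d{2}-\d{4}\b/g                             // SSN
];

function findGeneratedPII(text, allowed = []) {
  const hits = [];
  for (const re of piiPatterns) {
    for (const match of text.match(re) || []) {
      if (!allowed.includes(match)) hits.push(match);
    }
  }
  return hits;
}

// Example: the model invented an email that was never in the mapping.
findGeneratedPII('Try writing to support@vendor.test', ['john@example.com']);
// -> ['support@vendor.test']
```

If the returned list is non-empty, route the item to a redaction or error branch instead of the Respond to Webhook node.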
Is this approach GDPR compliant?
Masking PII before sending to third-party APIs is a strong step toward GDPR compliance, but full compliance also requires data processing agreements with your LLM provider, proper consent management, and data retention policies. Consult a legal professional for your specific case.
How do I handle PII in languages other than English?
The regex patterns in this tutorial target English-format PII. For international formats, add patterns for local phone numbers, postal codes, and ID numbers. You can also use a dedicated NLP-based PII detection library in the Code node via npm packages if your n8n instance supports custom modules.
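As one example of extending the pattern list, here is a simplified IBAN shape check. This is a hedged sketch: it validates the general format (two country letters, two check digits, 11-30 alphanumerics) only, not the per-country length or checksum, so expect some false positives on other uppercase-plus-digit strings.

```javascript
// Simplified IBAN format check: shape only, no checksum validation.
const ibanRegex = /\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b/g;

// Entry to append to the patterns array in the Mask PII node:
const ibanPattern = { type: 'IBAN', regex: ibanRegex };

const m = 'Transfer to DE89370400440532013000 by Friday'.match(ibanRegex);
// -> ['DE89370400440532013000']
```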
Can RapidDev help build a production-grade PII masking pipeline in n8n?
Yes. RapidDev specializes in building secure n8n workflows for teams handling sensitive data, including multi-jurisdiction PII masking, audit logging, and compliance documentation.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation