OpenAI Privacy Filter: Run Local PII Redaction in 10 Minutes

OpenAI's new open-weight Privacy Filter runs on your laptop, redacts 8 types of PII at 96% F1, and costs nothing. Here's the 10-minute setup.

OpenAI just released a model that does something its flagship products can’t: it never sends your data anywhere. Privacy Filter, launched on April 23, is an open-weight, Apache-licensed tool that detects and masks personal information in text — and it runs on your laptop. No API calls. No cloud. No usage meter ticking. Drop it into a pipeline, and a customer-support ticket with a name, phone number, and credit card comes out the other side with those fields replaced by placeholders, all in a single pass.

This is a sharper turn than it looks. OpenAI has spent the last two years telling enterprises to trust the cloud. Then, on a Thursday morning, they shipped a 1.5-billion-parameter model to Hugging Face under a license that lets anyone use it commercially, fork it, and fine-tune it — and the community had an MLX port running 24 to 33 times faster on Apple Silicon before the day was out.

If you handle PII at work — whether you’re a developer building on an LLM, a compliance officer reviewing AI pipelines, or an IT lead who’s been told by legal to “figure out the privacy thing” — this is the most interesting release of the week. Here’s what it is, what it actually does, and how to get it running on your machine before your competitors notice.

What OpenAI Privacy Filter actually is

Think of it as a spell-checker for sensitive data. You feed it a block of text. It reads every word. It labels each one either “safe” or “one of eight kinds of private stuff.” You can then keep those labels, blank them out, or swap them for placeholders.

The technical shape is unusual for a small model. It’s a 1.5-billion-parameter sparse mixture-of-experts, with only 50 million parameters active per forward pass. That’s how it holds a 128,000-token context window while still being light enough to run in a browser via WebGPU. For comparison: the open Llama models you’d normally reach for in text classification run 1 to 8 billion dense parameters, and most of them top out at 8K context.

It’s built on the same gpt-oss architecture OpenAI released last year — but repurposed. Instead of generating text left-to-right, it reads the whole passage at once (bidirectional token classification, to use the jargon) and emits a label for every token. Under the hood there are 8 transformer blocks, grouped-query attention with 14 query heads and 2 KV heads, 128 experts with top-4 routing per token, and a constrained Viterbi decoder that stitches the token labels into coherent spans so “John” and “Smith” don’t end up categorized as two different entities.

You don’t need to care about any of that to use it. You need to know three things:

  1. It’s free. Apache 2.0 license. Ship it in a product tomorrow, no one will stop you.
  2. It runs locally. The weights live on your machine, the inference happens on your machine, and the output stays on your machine. No data leaves.
  3. It hits 96% F1 on the PII-Masking-300k benchmark — 94% precision, 98% recall. The corrected benchmark version pushes that to 97.43%. Translation: it catches almost everything, and when it does flag something, it’s almost always right.
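
As a sanity check, the headline number follows directly from the precision and recall figures, since F1 is their harmonic mean:

```python
# F1 is the harmonic mean of precision and recall.
precision = 0.94
recall = 0.98

f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # ≈ 0.9596, i.e. the headline 96%
```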

The 8 things it detects

Out of the box, Privacy Filter recognizes eight span types:

  • private_person — names
  • private_address — street addresses
  • private_email — email addresses
  • private_phone — phone numbers
  • private_date — dates of birth, treatment dates, anything date-shaped in a private context
  • private_url — URLs that identify a person or private resource
  • account_number — bank accounts, routing numbers, policy numbers, membership IDs, and credit card numbers
  • secret — API keys, passwords, tokens

Notice what’s not there: Social Security numbers, medical record numbers, NHS numbers, Brazilian CPFs. OpenAI left those out of the default taxonomy on purpose — they’re region-specific, and the sensible move is to fine-tune the model for your jurisdiction rather than pretend one global label set fits every country. Within 24 hours of release, a developer had already shipped a medical-labels fine-tune covering MRN, NHS, NPI, and CPF, plugged into an iOS app via MLX.
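
Until you fine-tune, a pragmatic stopgap is to layer a regex pass for your region’s identifiers on top of the model’s output. A minimal sketch for US Social Security numbers — the pattern and the `private_ssn` label here are illustrative additions, not part of the model’s taxonomy:

```python
import re

# Illustrative stopgap: catch SSN-shaped strings the default taxonomy skips.
# The label name "private_ssn" is our own convention, not the model's.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def extra_spans(text, pattern=SSN_PATTERN, label="private_ssn"):
    """Return spans in the same dict shape the pipeline emits, for easy merging."""
    return [
        {"entity_group": label, "word": m.group(), "start": m.start(), "end": m.end()}
        for m in pattern.finditer(text)
    ]

spans = extra_spans("Patient SSN 123-45-6789 on file.")
print(spans)  # one span covering "123-45-6789"
```

Merge these spans with the pipeline’s output before redacting, and the same placeholder loop handles both.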

How to run it locally in 10 minutes

You need Python 3.10+, a decent laptop (the model uses about 3 GB of RAM at BF16 precision), and internet access once to download the weights. After that, you’re fully offline.

Step 1: Install

pip install transformers torch

That’s the whole setup. If you’ve used Hugging Face before, you already have this.

Step 2: Load the model

from transformers import pipeline

classifier = pipeline(
    task="token-classification",
    model="openai/privacy-filter",
    aggregation_strategy="simple",  # groups tokens into spans for you
)

First run downloads about 3 GB from Hugging Face. Every run after that is instant — the weights are cached at ~/.cache/huggingface/hub/.
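
Once the weights are cached, you can make the offline guarantee explicit. These are real Hugging Face environment variables; set them before importing transformers, and any attempt to reach the network raises an error instead of silently phoning home:

```python
import os

# Force transformers / huggingface_hub to use only the local cache.
# Must be set before `from transformers import pipeline` runs.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```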

Step 3: Send it something dirty

Let’s say you’ve got a real-world customer support ticket:

ticket = """
From: Sarah Chen <sarah.chen@acmelogistics.com>
Subject: Refund for order #A-48291

Hi team,

Following up on my call yesterday. My shipment on 2026-03-14 never
arrived. The driver said he delivered to 1247 Maple Ave, Apt 4B,
Seattle WA 98103 but I've been at that address for 6 years and
nothing showed up.

Can you process a refund to my card ending 4532 9871 0042 8855?
You can reach me on (206) 555-0147 during business hours.

Thanks,
Sarah
"""

spans = classifier(ticket)
for span in spans:
    print(f"{span['entity_group']:20} | {span['word']}")

Step 4: Read the output

You get back a list of spans, each with a category and the exact text it flagged:

private_person       | Sarah Chen
private_email        | sarah.chen@acmelogistics.com
private_date         | 2026-03-14
private_address      | 1247 Maple Ave, Apt 4B, Seattle WA 98103
account_number       | 4532 9871 0042 8855
private_phone        | (206) 555-0147
private_person       | Sarah

Every personal detail is caught, and the order number in the subject line (“order #A-48291”) is correctly left alone: it’s an order identifier, not PII.
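
Each span also carries a score field — the model’s confidence, between 0 and 1. If you’d rather over-redact than leak, redact everything; if a human reviews the flags, a threshold helps triage. A sketch using mocked-up spans in the pipeline’s output shape (the threshold value is a judgment call, not a recommendation from the model card):

```python
# Mocked spans in the shape the token-classification pipeline returns.
spans = [
    {"entity_group": "private_person", "score": 0.998, "word": "Sarah Chen"},
    {"entity_group": "private_date", "score": 0.61, "word": "2026-03-14"},
]

THRESHOLD = 0.90  # tune for your tolerance; lower = more cautious redaction
confident = [s for s in spans if s["score"] >= THRESHOLD]
needs_review = [s for s in spans if s["score"] < THRESHOLD]

print([s["word"] for s in confident])     # ['Sarah Chen']
print([s["word"] for s in needs_review])  # ['2026-03-14']
```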

Step 5: Actually redact the text

The pipeline gives you span positions. A small loop replaces them with placeholders:

def redact(text, spans):
    result = list(text)
    # sort descending so earlier edits don't shift later indexes
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        label = f"[{span['entity_group'].upper()}]"
        result[span["start"]:span["end"]] = label
    return "".join(result)

clean = redact(ticket, spans)
print(clean)

Output:

From: [PRIVATE_PERSON] <[PRIVATE_EMAIL]>
Subject: Refund for order #A-48291

Hi team,

Following up on my call yesterday. My shipment on [PRIVATE_DATE] never
arrived. The driver said he delivered to [PRIVATE_ADDRESS]
but I've been at that address for 6 years and
nothing showed up.

Can you process a refund to my card ending [ACCOUNT_NUMBER]?
You can reach me on [PRIVATE_PHONE] during business hours.

Thanks,
[PRIVATE_PERSON]

That clean version is what you’d feed into a downstream LLM, a training dataset, a support-ticket analytics pipeline, or an internal dashboard. The raw one never leaves your machine.

Running it in a browser

If you’d rather not install Python, the model also runs client-side through Transformers.js:

import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline(
  "token-classification",
  "openai/privacy-filter",
  { device: "webgpu", dtype: "q4" }
);

const spans = await classifier(ticketText, { aggregation_strategy: "simple" });

Q4 quantization shrinks the weights to about 800 MB. WebGPU does the math on the user’s graphics card. The text never hits a server. For a customer-facing redaction tool — say, inside a form before submission — this is the simplest on-device privacy layer you can build.

How it stacks up against Presidio, AWS Comprehend, and Google DLP

PII redaction isn’t a new category. Privacy Filter’s pitch is the combination of open weights, on-device execution, modern architecture, and zero cost. Here’s how it reads against the three tools most teams already know.

| Feature | OpenAI Privacy Filter | Microsoft Presidio | AWS Comprehend PII | Google DLP |
| --- | --- | --- | --- | --- |
| Cost | Free (Apache 2.0) | Free (MIT) | ~$0.0001 per 100 chars ($0.0003 minimum per request) | $1–3 per GB inspected, $1/GB de-identification |
| Runs locally | ✅ Yes | ✅ Yes | ❌ Cloud only | ❌ Cloud only |
| Approach | Learned model (bidirectional transformer + MoE) | Regex + NER + checksums | Hosted ML model | Hosted ML model |
| Context window | 128K tokens (single pass) | Per-sentence | Per-document | Per-document |
| F1 (PII-Masking-300k) | 96% (97.4% corrected) | ~85% with NER | Not published | Not published |
| Fine-tunable | ✅ Yes, full weights available | ⚠️ Add custom recognizers | ❌ No | ❌ No |
| Multi-language | ⚠️ English-primary | ✅ Multi-language out of the box | ✅ Multi-language | ✅ Multi-language |
| Audit trail | You build it | You build it | Built-in via CloudTrail | Built-in via Cloud Logging |
| SLA / support | None (community) | None (community) | AWS enterprise SLA | Google enterprise SLA |

The picture that emerges isn’t “Privacy Filter wins.” It’s that the shape of “which tool should I pick” has shifted.

If you’re already in AWS or Google Cloud and you need compliance logs, managed SLAs, and someone to blame at 3 a.m., the hosted services still earn their money. Privacy Filter does not give you an auditor-ready trail. You’d have to build that yourself.

If you’ve been using Presidio for local redaction, Privacy Filter is probably an upgrade on accuracy — the learned model catches nuanced cases that regex misses, like realizing “my dentist’s office on Birch Ave” contains an address fragment and a professional relationship. But you’d lose Presidio’s multi-language support and its mature integrations with image and structured-data pipelines. Many teams end up running both: Presidio for structured regex-heavy data like form fields, Privacy Filter for unstructured prose.

If you’ve been sending data to a hosted PII API because you didn’t realize there was a real local option — this is the news. A text sanitizer that stays on your machine is now 10 minutes of setup away.

What it can’t do

OpenAI has been unusually direct about the limits. Worth reading before you build on it.

It’s not a compliance guarantee. The model card literally uses the phrase “redaction aid, not a safety guarantee.” If your use case is HIPAA, GDPR, or any regulated workflow, Privacy Filter is one layer in a defense — not the whole thing. The National Law Review’s coverage echoed this: “Privacy risks remain, especially in legal, healthcare, financial, and other regulated settings.”

It misses uncommon names and over-redacts public ones. Try it on a news article about Taylor Swift and it’ll mask her name as private_person. Try it on a customer-support ticket from someone named Aranya Subramanyan and it may miss the surname. Both are fixable with fine-tuning, but out of the box, expect these edge cases.

Context can still re-identify someone. This is the subtle one. If you mask “Sarah Chen” but keep “Director of Product at Acme Logistics since 2021, manages a team of 12 in Seattle,” a motivated re-identifier can still figure out who the subject is. Privacy Filter does not redact job titles, team sizes, employer names, or tenure — those aren’t in the taxonomy. De-identification is a harder problem than PII removal, and nobody’s solved it in 1.5 billion parameters.

It’s English-first. OpenAI notes performance “drops on non-English text, non-Latin scripts.” If you’re redacting Japanese customer service logs, Chinese medical records, or Arabic contracts, fine-tune or use a different tool. Presidio still wins on raw language coverage.

It’s static. The 8 categories are the 8 categories. If you need to detect patient identifiers, tax IDs, or internal code names, you have to fine-tune the model on your own labeled data. OpenAI ships evaluation and fine-tuning scripts in the GitHub repo, but that’s engineering work, not a config flag.
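
If you do fine-tune, the token-classification head needs an id for every label. Assuming the standard BIO tagging scheme over the taxonomy (the exact scheme is an assumption on my part — check the repo’s fine-tuning scripts for the convention the model was trained with), extending it with your own categories is a matter of rebuilding the label map:

```python
# Build a BIO label map over the default taxonomy plus custom categories.
# Assumes the standard BIO scheme; verify against the repo's fine-tuning scripts.
DEFAULT = [
    "private_person", "private_address", "private_email", "private_phone",
    "private_date", "private_url", "account_number", "secret",
]
CUSTOM = ["medical_record_number", "tax_id"]  # your jurisdiction's additions

labels = ["O"] + [f"{p}-{c}" for c in DEFAULT + CUSTOM for p in ("B", "I")]
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for label, i in label2id.items()}

print(len(labels))  # 21: "O" plus B/I tags for each of the 10 categories
```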

What this means for you

If you’re a developer building on an LLM: This replaces the pre-call redaction step most teams handwave about. Paste it in front of your prompt — the raw ticket, support log, or user doc goes in, the redacted version goes to the model, the model response gets cross-referenced with the original redaction map to re-insert real values when needed. Zero API calls to third parties. Zero cost increase. The whole pipeline gets a lot easier to explain to a security reviewer.
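
The re-insertion step mentioned above is just a lookup table built at redaction time. A minimal sketch — the numbered placeholder format is my own choice; use whatever survives your LLM’s rewriting intact:

```python
def redact_with_map(text, spans):
    """Replace each span with a numbered placeholder and keep the mapping."""
    mapping, out = {}, list(text)
    # Sort descending so earlier edits don't shift later indexes.
    for i, span in enumerate(sorted(spans, key=lambda s: s["start"], reverse=True)):
        placeholder = f"[{span['entity_group'].upper()}_{i}]"
        mapping[placeholder] = text[span["start"]:span["end"]]
        out[span["start"]:span["end"]] = placeholder
    return "".join(out), mapping

def restore(text, mapping):
    """Re-insert the original values into the model's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

spans = [{"entity_group": "private_person", "start": 6, "end": 16}]
clean, mapping = redact_with_map("Hello Sarah Chen, your order shipped.", spans)
print(clean)                    # Hello [PRIVATE_PERSON_0], your order shipped.
print(restore(clean, mapping))  # Hello Sarah Chen, your order shipped.
```

The raw text and the mapping stay on your machine; only the placeholder version goes to the downstream model.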

If you’re a compliance officer: Privacy Filter doesn’t make you compliant. It makes the “what did you do about PII” question answerable in one sentence. Instead of “we trust the vendor,” you can say “we redact at the edge using an on-device open-weight model, and we retain the redaction logs.” That sentence plays better in every audit meeting we’ve ever seen.

If you’re an IT lead wondering whether to care: If anyone at your company pastes customer data into ChatGPT, Privacy Filter is your cheapest way to stop the data leak without also stopping the productivity. Wrap it in a small internal web app that sits on the intranet, let people paste in the dirty version, copy the clean version into whatever AI tool they’re using. Ship it in a day.

If you’ve never touched a model before: Honestly, this is a good one to start with. Unlike chatbot models, you don’t need to prompt-engineer anything. You pass text in, you get labels out. It’s the most deterministic “AI” you’ll ever run. Spin up a Google Colab notebook, paste the setup code above, feed it one of your own emails, and watch every piece of personal info get flagged. The exercise takes 10 minutes and teaches more about how modern models work than 5 hours of reading.

The bottom line: For unstructured-text PII redaction in English, Privacy Filter is now the default answer. Free, fast, local, accurate, and OpenAI-trained. Use it as a layer, not a guarantee. And do not tell your compliance team you’ve “solved privacy.” You’ve solved one very specific piece of it — which is still the piece most people get wrong.

Who should use it

  • Developers wrapping LLM calls for customer-support, sales, HR, or internal knowledge base use cases
  • Data teams preparing training or fine-tuning datasets from real-world logs
  • Security and IT teams building a pre-AI redaction checkpoint for employees using ChatGPT, Claude, or Gemini
  • Startups that don’t have the budget for a $2K-a-month hosted DLP service but do have customer data flowing through their systems
  • Solo developers and hobbyists who want a default “remove PII before I do anything else” step in personal automation

Probably not the right fit if you need a managed SLA, multi-language coverage as a first requirement, or compliance-grade logging out of the box.

The bottom line

Privacy Filter is small, fast, accurate, and free. It runs in a browser. It runs on a laptop. It runs 24 to 33 times faster on Apple Silicon with the community MLX port. It handles 128K tokens in a single pass. And for the cluster of use cases where “English unstructured text” and “keep the data on this machine” intersect, it’s the best tool available as of the morning it launched.

The interesting part isn’t the model. It’s that OpenAI shipped something that explicitly reduces their own revenue — every call to Privacy Filter is a call that doesn’t go to their API. That choice signals a shift in how the commercial AI labs are thinking about the trust layer. Not every part of a privacy-conscious pipeline needs to be paid, hosted, and cloud-native. Some parts are better served by a small model that stays on your machine.

Download it, run the 10-minute setup, try it on one real document from your own inbox, and see what it catches. If it catches things you’d rather not know were sitting in cleartext in your email — which it will — you’ve just discovered the first use case.

