OpenAI Privacy Filter

Apr 26, 2026 · ai · Source ↗

TLDR

OpenAI releases an open-weight 1.5B-parameter bidirectional token-classification model for detecting and redacting PII in text.

Unlike generative models, it labels an entire input sequence in one pass, then decodes coherent spans using a constrained Viterbi procedure.
It predicts labels from a fixed taxonomy of privacy categories and was adapted from an autoregressive pretrained checkpoint into a token classifier.
Span decoding via Viterbi allows the model to produce consistent, non-overlapping entity spans rather than per-token guesses.
Released as open-weight, meaning it can run locally without sending data to any third-party API.
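
To make the span-decoding idea concrete, here is a minimal sketch of constrained Viterbi decoding over per-token label scores. It assumes a BIO tagging scheme and a hard constraint with no learned transition weights; the summary does not state the model's actual label scheme or constraints, so the label set, the `allowed` rule, and the scores below are all illustrative.

```python
from typing import List, Tuple

NEG_INF = float("-inf")

def allowed(prev: str, cur: str) -> bool:
    """Assumed BIO constraint: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True

def viterbi_decode(scores: List[List[float]], labels: List[str]) -> List[str]:
    """Return the best-scoring label path that satisfies `allowed`.

    scores[t][k] is the log-score of labels[k] at token t.
    """
    if not scores:
        return []
    n = len(labels)
    # A sequence may not open with an I- tag.
    best = [NEG_INF if labels[k].startswith("I-") else scores[0][k]
            for k in range(n)]
    back: List[List[int]] = []
    for t in range(1, len(scores)):
        new_best, ptr = [], []
        for k in range(n):
            # Best predecessor among the labels allowed to precede labels[k].
            prev_score, prev_idx = max(
                (best[j], j) for j in range(n) if allowed(labels[j], labels[k])
            )
            new_best.append(prev_score + scores[t][k])
            ptr.append(prev_idx)
        best = new_best
        back.append(ptr)
    # Backtrack from the best final label.
    k = max(range(n), key=lambda i: best[i])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return [labels[i] for i in path]

def tags_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collapse BIO tags into (start, end_exclusive, label) token spans."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

# Illustrative scores for the tokens: "Call", "Ada", "Lovelace", "today".
labels = ["O", "B-NAME", "I-NAME"]
scores = [
    [0.0, -3.0, -3.0],
    [-1.0, -1.2, -0.5],  # per-token argmax is I-NAME, which is invalid after O
    [-2.0, -2.5, -0.1],
    [-0.1, -3.0, -3.0],
]
tags = viterbi_decode(scores, labels)
spans = tags_to_spans(tags)
print(tags)   # ['O', 'B-NAME', 'I-NAME', 'O']
print(spans)  # [(1, 3, 'NAME')]
```

Taking the argmax at each token would give O, I-NAME, I-NAME, O here, which opens a span with an inside tag. The constrained path trades a slightly lower score at the second token for a well-formed span.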

The dominant practical concern is the redact/rehydrate loop: PII replaced with placeholders like [NAME] must be restored before responses reach users, which requires a stateful placeholder-to-value mapping layer at the UX boundary.
Skeptics argue that these models miss identity leakage from combinations of individually non-identifying attributes, and that using them as a compliance shield for “anonymized” data is a known industry mistake.
Multiple builders are already wrapping this model into proxy pipelines alongside regex fallbacks, targeting legal, tax, and immigration document workflows.
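
The redact/rehydrate loop and the regex fallback described above can be sketched as follows. This is a hypothetical illustration, not the code of anon_proxy or any other project mentioned here: `RedactionSession`, `regex_spans`, `merge_spans`, the numbered placeholder format (`[NAME_1]`), and the e-mail pattern are all assumptions. The model's output is stood in for by a hard-coded span.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label), character offsets

# Simplified e-mail pattern, used only as an example of a regex fallback.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PLACEHOLDER_RE = re.compile(r"\[[A-Z_]+_\d+\]")

def regex_spans(text: str) -> List[Span]:
    """Regex fallback detector (e-mail addresses only in this sketch)."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]

def merge_spans(primary: List[Span], fallback: List[Span]) -> List[Span]:
    """Keep every primary span; add fallback spans that overlap none of them."""
    merged = list(primary)
    for s in fallback:
        if all(s[1] <= p[0] or s[0] >= p[1] for p in merged):
            merged.append(s)
    return sorted(merged)

class RedactionSession:
    """Holds the placeholder <-> value mapping for one conversation."""

    def __init__(self) -> None:
        self.placeholder_to_value: Dict[str, str] = {}
        self._value_to_placeholder: Dict[Tuple[str, str], str] = {}
        self._counts: Dict[str, int] = {}

    def redact(self, text: str, spans: List[Span]) -> str:
        """Replace each span with a numbered placeholder, reusing known values."""
        out, cursor = [], 0
        for start, end, label in sorted(spans):
            key = (label, text[start:end])
            if key not in self._value_to_placeholder:
                self._counts[label] = self._counts.get(label, 0) + 1
                placeholder = f"[{label}_{self._counts[label]}]"
                self._value_to_placeholder[key] = placeholder
                self.placeholder_to_value[placeholder] = text[start:end]
            out.append(text[cursor:start])
            out.append(self._value_to_placeholder[key])
            cursor = end
        out.append(text[cursor:])
        return "".join(out)

    def rehydrate(self, text: str) -> str:
        """Restore original values; unknown placeholders are left untouched."""
        return PLACEHOLDER_RE.sub(
            lambda m: self.placeholder_to_value.get(m.group(0), m.group(0)), text
        )

text = "Email Ada Lovelace at ada@example.com about the visa."
# Stand-in for the classifier's output: one NAME span.
name_start = text.index("Ada Lovelace")
model_spans = [(name_start, name_start + len("Ada Lovelace"), "NAME")]

session = RedactionSession()
redacted = session.redact(text, merge_spans(model_spans, regex_spans(text)))
print(redacted)  # Email [NAME_1] at [EMAIL_1] about the visa.

# Pretend this came back from a remote model that only saw placeholders.
reply = "I will write to [NAME_1] at [EMAIL_1]."
restored = session.rehydrate(reply)
print(restored)  # I will write to Ada Lovelace at ada@example.com.
```

Numbering the placeholders is a design choice in this sketch: it keeps two different names distinguishable after redaction and lets the same value map to the same placeholder across turns. The mapping lives only in the local session, so the remote service never sees the original values.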

@stratos123: Details the architecture – bidirectional token classifier with constrained Viterbi span decoding, 1.5B params, adapted from an autoregressive checkpoint.
@fzxu22: Built anon_proxy, an open-source anonymization proxy combining this model with regex; all detection runs locally so no PII leaves the machine.
@CMay: “No, you are not” anonymizing – warns that the re-identification risks of aggregated data mean these tools are not sufficient for true de-identification at scale.