OpenAI Open-Sources Privacy Filter, a Tiny Model That Scrubs PII Without an API Call

Published 1 week, 5 days ago
Description

This story was originally published on HackerNoon at: https://hackernoon.com/openai-open-sources-privacy-filter-a-tiny-model-that-scrubs-pii-without-an-api-call.
OpenAI open-sourced Privacy Filter, a 50M-active-parameter model that detects and masks PII locally in one pass. Here's what's actually new, and what's hype.
Check more stories related to tech-companies at: https://hackernoon.com/c/tech-companies. You can also check exclusive content about #openai, #open-source, #privacy-filter, #openai-privacy-filter, #tiny-model, #ai, #openai-open-sources, #hackernoon-top-story, and more.

This story was written by: @abstraction. Learn more about this writer by checking @abstraction's about page, and for more stories, please visit hackernoon.com.

OpenAI released Privacy Filter under Apache 2.0. It is a 1.5B-parameter (50M active) bidirectional token-classification model that detects and masks PII in text locally, in a single forward pass. It runs on a laptop, supports 128K context, hits 96% F1 out of the box, and can be fine-tuned with minimal data. It covers eight entity categories: names, addresses, emails, phones, URLs, dates, account numbers, and secrets. It is context-aware rather than regex-based, ships with a CLI and eval tooling, and slots into the same open-weight ecosystem as gpt-oss. The catch: multilingual support is thin, adversarial formatting breaks it, and the benchmark validation used OpenAI's own models to grade OpenAI's model.
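
To make the "detect and mask" step concrete, here is a minimal sketch of the masking half only. It does not call Privacy Filter or its CLI. It assumes some token classifier has already produced character spans, and the `Span` shape, the `mask_spans` helper, and the label strings are all hypothetical names chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical span shape: (start, end, label), with end exclusive.
# A real detector's output format may differ.
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a bracketed label placeholder."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip any span that overlaps one already masked.
            continue
        out.append(text[cursor:start])  # keep the untouched text
        out.append(f"[{label}]")        # substitute the placeholder
        cursor = end
    out.append(text[cursor:])           # keep the tail after the last span
    return "".join(out)


text = "Email Ana at ana@example.com today."
# Spans written by hand here; in practice they would come from the model.
spans = [(6, 9, "NAME"), (13, 28, "EMAIL")]
masked = mask_spans(text, spans)
print(masked)
```

The regex-versus-context distinction in the description lives entirely in how the spans are produced. Once spans exist, masking is the same simple string operation either way.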
