Episode Details

OpenAI Open-Sources Privacy Filter, a Tiny Model That Scrubs PII Without an API Call

Published 1 week, 5 days ago

Description

This story was originally published on HackerNoon at: https://hackernoon.com/openai-open-sources-privacy-filter-a-tiny-model-that-scrubs-pii-without-an-api-call.
OpenAI open-sourced Privacy Filter, a 50M-active-parameter model that detects and masks PII locally in one pass. Here's what's actually new, and what's hype.
Check more stories related to tech-companies at: https://hackernoon.com/c/tech-companies. You can also check exclusive content about #openai, #open-source, #privacy-filter, #openai-privacy-filter, #tiny-model, #ai, #openai-open-sources, #hackernoon-top-story, and more.

This story was written by: @abstraction. Learn more about this writer by checking @abstraction's about page, and for more stories, please visit hackernoon.com.

OpenAI released Privacy Filter under Apache 2.0. It is a 1.5B-parameter (50M active) bidirectional token-classification model that detects and masks PII in text locally, in a single forward pass. It runs on a laptop, supports 128K context, hits 96% F1 out of the box, and is fine-tunable with minimal data. It covers eight entity categories: names, addresses, emails, phones, URLs, dates, account numbers, and secrets. It's context-aware rather than regex-based, ships with a CLI and eval tooling, and slots into the same open-weight ecosystem as gpt-oss. The catch: multilingual support is thin, adversarial formatting breaks it, and the benchmark validation used OpenAI's own models to grade OpenAI's model.
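To make "detects and masks" concrete, here is a minimal Python sketch of the masking step only. It assumes a token classifier has already returned labeled character spans. The span format, the label strings NAME and EMAIL, and the helper mask_spans are hypothetical and for illustration; they are not Privacy Filter's actual output format or API.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace each labeled character span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip any span that overlaps one already masked.
            continue
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(f"[{label}]")        # placeholder instead of the PII
        cursor = end
    pieces.append(text[cursor:])           # remainder after the last span
    return "".join(pieces)


# Spans are hard-coded here to stand in for a model's predictions.
text = "Email Jane Doe at jane@example.com."
spans = [(6, 14, "NAME"), (18, 34, "EMAIL")]
masked = mask_spans(text, spans)
print(masked)
```

The point of a context-aware model is that the spans come from a classifier reading the whole passage rather than from fixed patterns. The replacement step itself stays this simple either way.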
