PII Scrubber in Python

At work, I keep running into a common company problem: we have to scrub PII out of our logs.

However, our security model also locks us out of production servers, queues, and so on, so the logs need to keep as much data as possible.

I created this pip package to scrub dictionaries (i.e. JSON) whose keys meet certain criteria.
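
To make the idea concrete, here is a minimal sketch of key-based scrubbing. It is not the package's actual code: the names `SENSITIVE_FRAGMENTS`, `MASK`, and `scrub` are made up for illustration, and I am assuming the "criteria" is a case-insensitive substring match on the key name.

```python
from typing import Any

# Hypothetical key fragments, for illustration only.
SENSITIVE_FRAGMENTS = ("ssn", "password", "email")
MASK = "***"


def _is_sensitive(key: Any) -> bool:
    """Assumed criteria: the key name contains a sensitive fragment."""
    text = str(key).lower()
    return any(fragment in text for fragment in SENSITIVE_FRAGMENTS)


def scrub(value: Any) -> Any:
    """Return a copy of value with the values of sensitive keys masked."""
    if isinstance(value, dict):
        return {
            key: MASK if _is_sensitive(key) else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        # Lists may hold nested dictionaries, so recurse into each element.
        return [scrub(item) for item in value]
    # Scalars under non-sensitive keys pass through untouched.
    return value


record = {"user": {"Email": "a@b.c", "id": 1}, "items": [{"password": "x"}]}
cleaned = scrub(record)
```

Masking the value rather than dropping the key keeps the shape of the log entry intact, which helps when the logs are the only view into production.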



Lessons learned:

  • I always forget how to actually build packages
  • I am very surprised I couldn’t find a package that does this. Lots of things for scanning log files, but not much for sanitizing logs as you go.
  • The package should become much more modular: let users override the “filthy” key list that needs scrubbing. I suspect a more object-oriented approach would accomplish this.
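
The object-oriented idea from the last point could look roughly like the sketch below. Everything here is an assumption rather than the package's API: the `Scrubber` class, its `keys` and `mask` parameters, and the `PrefixScrubber` subclass are hypothetical names used only to show how a user might override the key list or the matching rule.

```python
from typing import Any, Iterable, Optional


class Scrubber:
    """Hypothetical scrubber whose key list can be overridden per instance."""

    # Assumed defaults, for illustration only.
    DEFAULT_KEYS = frozenset({"ssn", "password", "email"})

    def __init__(self, keys: Optional[Iterable[str]] = None, mask: str = "***") -> None:
        chosen = self.DEFAULT_KEYS if keys is None else keys
        self.keys = frozenset(key.lower() for key in chosen)
        self.mask = mask

    def is_sensitive(self, key: Any) -> bool:
        # Exact, case-insensitive match; subclasses can replace this rule.
        return str(key).lower() in self.keys

    def scrub(self, value: Any) -> Any:
        """Return a copy of value with the values of sensitive keys masked."""
        if isinstance(value, dict):
            return {
                key: self.mask if self.is_sensitive(key) else self.scrub(item)
                for key, item in value.items()
            }
        if isinstance(value, list):
            return [self.scrub(item) for item in value]
        return value


class PrefixScrubber(Scrubber):
    """Example override: also treat any key starting with 'secret_' as sensitive."""

    def is_sensitive(self, key: Any) -> bool:
        return str(key).lower().startswith("secret_") or super().is_sensitive(key)


default_result = Scrubber().scrub({"Email": "a@b.c", "id": 7})
custom_result = Scrubber(keys=["token"]).scrub({"token": "abc", "email": "a@b.c"})
prefix_result = PrefixScrubber().scrub({"secret_code": 1, "password": "x", "ok": True})
```

Passing a custom `keys` list covers the simple case, while subclassing and overriding `is_sensitive` leaves room for rules that a flat list cannot express.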

