At work, I keep running into a common company problem: we have to scrub PII out of our logs.
However, our security model also locks us out of production servers, queues, etc… so the logs are often all we have to go on, and they need to keep as much data as possible.
I created this pip package to scrub the values out of dictionaries (i.e. JSON) whose keys meet certain criteria.
https://pypi.org/project/pnmac-common/
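To give a rough idea of the technique, here is a minimal sketch. It is not the package's actual code: the `scrub` function, the key pattern, and the mask string are all made up for illustration. It walks a dictionary recursively and masks any value whose key looks sensitive.

```python
import re

# Hypothetical criteria: key names that hint at PII (not the package's real list).
SENSITIVE_KEY_PATTERN = re.compile(r"ssn|password|email|phone", re.IGNORECASE)
MASK = "***REDACTED***"


def scrub(data):
    """Return a copy of data with values under sensitive-looking keys masked."""
    if isinstance(data, dict):
        return {
            key: MASK
            if isinstance(key, str) and SENSITIVE_KEY_PATTERN.search(key)
            else scrub(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        # Lists can hold nested dictionaries, so scrub each element too.
        return [scrub(item) for item in data]
    # Anything else (strings, numbers, None) is left untouched.
    return data


record = {"user": {"email": "a@b.c", "id": 7}, "items": [{"ssn": "123"}]}
print(scrub(record))
```

Because it returns a new structure instead of mutating the input, the original record is still intact if the application needs it after logging.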
Lessons learned:
- I always forget how to actually build packages
- I am very surprised I couldn’t find a package that does this. There are lots of tools for scanning log files after the fact, but not much for sanitizing logs as you go.
- The package should become much more modular. Let the user override the “filthy” key list that needs scrubbing. I suspect this would be accomplished with a more object-oriented approach.
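As a sketch of what that last point might look like (again hypothetical: the `Scrubber` class, its default keys, and its parameters are assumptions for illustration, not the package's current API), the key list could become constructor state that the caller overrides:

```python
class Scrubber:
    """Masks values whose keys are in a configurable set (hypothetical design)."""

    # Illustrative defaults only; not the real package's list.
    DEFAULT_KEYS = frozenset({"ssn", "password", "email"})

    def __init__(self, keys=None, mask="***REDACTED***"):
        chosen = self.DEFAULT_KEYS if keys is None else keys
        # Store keys lowercased so matching is case-insensitive.
        self.keys = frozenset(key.lower() for key in chosen)
        self.mask = mask

    def is_sensitive(self, key):
        # Subclasses could override this to match on regexes, prefixes, etc.
        return isinstance(key, str) and key.lower() in self.keys

    def scrub(self, data):
        """Return a scrubbed copy of data; the input is not modified."""
        if isinstance(data, dict):
            return {
                key: self.mask if self.is_sensitive(key) else self.scrub(value)
                for key, value in data.items()
            }
        if isinstance(data, list):
            return [self.scrub(item) for item in data]
        return data


default_scrubber = Scrubber()
loan_scrubber = Scrubber(keys={"Loan_Number"}, mask="<hidden>")
print(loan_scrubber.scrub({"loan_number": "1", "email": "x"}))
```

Passing `keys` replaces the default list rather than extending it, which is a design choice; a caller who wants both could pass `Scrubber.DEFAULT_KEYS | {"loan_number"}`.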