Sanitize sensitive information from your database dumps 💩
Supports:
- PostgreSQL
- MySQL
pip install django-sanitized-dump, or pip install django-sanitized-dump[MySQL] if you use MySQL
- Add sanitized_dump to INSTALLED_APPS
- Initialize config file: ./manage.py init_sanitizer
- Check your newly created .sanitizerconfig file and modify the sanitation strategy to fit your requirements.
- Run ./manage.py check_sanitizerconfig to verify that your .sanitizerconfig includes all models and fields
- Get sanitized database dump: ./manage.py create_sanitized_dump > dump.sql
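For the INSTALLED_APPS step above, a minimal settings sketch might look like this. The two django.contrib entries are placeholders for whatever apps your project already lists; only the sanitized_dump entry comes from this README.

```python
# settings.py (excerpt)
# The django.contrib entries are placeholders; keep your project's own apps.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "sanitized_dump",  # provides the ./manage.py commands described in this README
]
```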
The heavy lifting of the DB sanitation is done by: https://github.com/andersinno/python-database-sanitizer
The configuration file defines a strategy for sanitizing your data. The strategy assigns a sanitation function to each model field; a field set to null is left as it is.
config:
  addons:
    - "ai-sanitizers"
    - "some-other-lib"
strategy:
  user:
    first_name: "name.first_name"
    last_name: "name.last_name"
  education:
    created: null
    modified: null
    id: null
    field: "education.field"
    school: "education.school"
    started: "datetime.datetime"
    credits: null
    information: "string.loremipsum_preserved"
    file_file: null
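# /sanitizers/name.py
# Requires the Faker package (pip install Faker).
from faker import Faker

faker = Faker()

def sanitize_first_name(value):
    # Replace the real first name with a randomly generated one.
    return faker.first_name()

def sanitize_last_name(value):
    # Replace the real last name with a randomly generated one.
    return faker.last_name()

# /sanitizers/education.py
def sanitize_field(value):
    return "Some field"

def sanitize_school(value):
    return "My school"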
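The examples above suggest a naming convention: a strategy value such as "name.first_name" points at the function sanitize_first_name in sanitizers/name.py. The sketch below only illustrates that convention; split_sanitizer_name is a hypothetical helper, not part of the library's API.

```python
def split_sanitizer_name(dotted):
    """Split a strategy value into (module name, function name).

    Hypothetical helper illustrating the naming convention only.
    """
    module_name, _, function_suffix = dotted.rpartition(".")
    if not module_name or not function_suffix:
        raise ValueError("expected '<module>.<function>', got %r" % (dotted,))
    return module_name, "sanitize_" + function_suffix


print(split_sanitizer_name("name.first_name"))
```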
Note: The validation described below does not need to be part of the initial implementation of the sanitizer; it is left to the sanitizer functions. It is a nice-to-have, not a high priority.
Check that the returned value is of the same type as the argument value passed to the sanitizer. For instance, if a MySQL DATETIME value is passed to the sanitizer, a MySQL DATETIME value should be returned as well.
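One way a sanitizer function could enforce this itself is with a small decorator. This is a sketch under the assumption that values reach the sanitizer as ordinary Python objects; preserves_type and sanitize_datetime are hypothetical names used for illustration.

```python
import datetime
import functools


def preserves_type(sanitizer):
    """Wrap a sanitizer so it raises if the result type differs from the input type."""

    @functools.wraps(sanitizer)
    def wrapper(value):
        result = sanitizer(value)
        # None stands in for SQL NULL here and is let through unchecked.
        if value is not None and result is not None and type(result) is not type(value):
            raise TypeError(
                "%s returned %s for a %s value"
                % (sanitizer.__name__, type(result).__name__, type(value).__name__)
            )
        return result

    return wrapper


@preserves_type
def sanitize_datetime(value):
    # Hypothetical sanitizer: replace every timestamp with a fixed one.
    return datetime.datetime(2000, 1, 1)
```

A sanitizer that returned a string for a datetime input would raise TypeError instead of silently writing a mistyped value into the dump.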
- Custom sanitizers inside ./sanitizers
- Addon sanitizers (config.addons)
- Core sanitizers
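If the list above is read as a priority order (an assumption; the README does not spell it out), the lookup could be sketched as follows. The dictionaries stand in for the three sanitizer sources, and find_sanitizer is a hypothetical helper, not the library's implementation.

```python
def find_sanitizer(name, custom, addons, core):
    """Return the first sanitizer registered under `name`.

    `custom` and `core` are dicts mapping names to functions; `addons` is a
    list of such dicts. Sources are searched in the order given above.
    """
    for registry in [custom, *addons, core]:
        if name in registry:
            return registry[name]
    raise LookupError("no sanitizer named %r" % (name,))


# A custom sanitizer shadows a core one with the same name.
core = {"name.first_name": lambda value: "Core"}
custom = {"name.first_name": lambda value: "Custom"}
print(find_sanitizer("name.first_name", custom, [], core)("Alice"))
```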
./manage.py create_sanitized_dump > dump.sql
- Warns about unhandled fields
- Creates a database dump (mysqldump/pg_dump)
- Runs the sanitizer
./manage.py check_sanitizerconfig
- Returns an error code if there are unhandled database fields
The check can be used in CI environments to detect model changes that are not yet reflected in the sanitizer configuration.
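Conceptually, the check compares the fields that exist in the database with the fields named in the strategy. The sketch below shows that idea with plain dictionaries; find_unhandled_fields and exit_code are hypothetical names, and the real command may work differently.

```python
def find_unhandled_fields(model_fields, strategy):
    """Return sorted "table.column" names missing from the strategy.

    A column configured as None (null in the config file) counts as handled.
    """
    unhandled = []
    for table, columns in model_fields.items():
        configured = strategy.get(table) or {}
        for column in columns:
            if column not in configured:
                unhandled.append("%s.%s" % (table, column))
    return sorted(unhandled)


def exit_code(model_fields, strategy):
    """Return 1 if any field is unhandled, otherwise 0, as a CI step would expect."""
    return 1 if find_unhandled_fields(model_fields, strategy) else 0


model_fields = {"user": ["first_name", "last_name", "email"]}
strategy = {"user": {"first_name": "name.first_name", "last_name": "name.last_name"}}
print(find_unhandled_fields(model_fields, strategy))
```

Here a newly added email column has no entry in the strategy, so the check reports it and the exit code is non-zero.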
./manage.py init_sanitizer
- Creates a configuration file from the current database state
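As a rough illustration of what such a starting configuration could contain, the sketch below builds a strategy in which every field is listed but no sanitizer is assigned yet. Both the None default and the build_skeleton_strategy helper are assumptions made for this example.

```python
def build_skeleton_strategy(model_fields):
    """Map every table to its columns, each with no sanitizer assigned (None)."""
    return {
        table: {column: None for column in columns}
        for table, columns in sorted(model_fields.items())
    }


skeleton = build_skeleton_strategy({"user": ["first_name", "last_name"]})
print(skeleton)
```

Each None would appear as null in the configuration file, ready to be replaced with a sanitizer name such as "name.first_name".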