How to disable PII anonymization for the HMA synthesizer #2357

Open
chetan-hiwale-hackathon opened this issue Jan 22, 2025 · 1 comment
Labels: question (General question about the software), under discussion (Issue is currently being discussed)

Environment details

If you are already running SDV, please indicate the following details about the environment in
which you are running it:

  • SDV version:
  • Python version:
  • Operating System:

Problem description

<Replace this with a description of the problem that you are trying to solve using SDV. If
possible, describe the data that you are using, or consider attaching some example data
that others can use to propose a working solution for your problem.>

What I already tried

<Replace with a description of what you already tried and what is the behavior that you observe.
If possible, also add below the exact code that you are running.>

Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.

npatki commented Jan 23, 2025

Hi @chetan-hiwale-hackathon, nice to meet you. Could you provide a bit more detail about what you are observing? In the above issue, we have space to provide "Problem description" and "What I already tried". Could you fill it out? In particular, I'm curious which particular columns are being anonymized, and what would you like to see there instead?

Without any of these details, my general recommendation is to always double-check your metadata. SDV treats metadata as the ground truth to know which columns to anonymize and which ones to model. Any column listed as pii or unknown in your metadata will be anonymized. I would recommend reading the following docs to inspect and update your metadata:

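As a rough illustration of the metadata check described above, here is a minimal sketch in plain Python. It assumes the metadata has been exported to a dictionary shaped like SDV's metadata JSON (a `tables` mapping, each with a `columns` mapping of `sdtype` and optional `pii` entries). The table and column names, and the helper `find_anonymized_columns`, are hypothetical and not part of SDV.

```python
# Illustrative metadata dictionary. The layout is an assumption modelled on
# SDV's metadata JSON; the table and column names are made up.
metadata_dict = {
    "tables": {
        "users": {
            "columns": {
                "user_id": {"sdtype": "id"},
                "email": {"sdtype": "email", "pii": True},
                "notes": {"sdtype": "unknown"},
                "age": {"sdtype": "numerical"},
            }
        }
    }
}


def find_anonymized_columns(metadata):
    """Return (table, column, reason) tuples for columns marked pii or unknown.

    Hypothetical helper: it only reads the dictionary and applies the rule
    stated in the reply above (pii or unknown columns get anonymized).
    """
    flagged = []
    for table_name, table in metadata.get("tables", {}).items():
        for column_name, spec in table.get("columns", {}).items():
            if spec.get("sdtype") == "unknown":
                flagged.append((table_name, column_name, "sdtype is unknown"))
            elif spec.get("pii"):
                flagged.append((table_name, column_name, "pii is true"))
    return flagged


flagged_columns = find_anonymized_columns(metadata_dict)
for table_name, column_name, reason in flagged_columns:
    print(f"{table_name}.{column_name}: {reason}")
```

With a real SDV metadata object, the dictionary should be obtainable via `metadata.to_dict()`, and a flagged column can then be reassigned to a modelled sdtype with `metadata.update_column(...)`. Check the exact arguments against the docs for your SDV version.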