randomness of audit data set #155

Open
rickardbrannvall opened this issue Sep 30, 2024 · 1 comment

Comments

@rickardbrannvall
Collaborator

Issue

Problem Description

I noticed that the indices included in audit_dataset differ between runs even when the random seeds are kept fixed.

Expected Behavior

The same audit dataset should be generated whenever the random seeds (and sizes, etc.) are fixed.

What Needs to be Done

Perhaps the random number generators (streams) need to be global and used explicitly in every function that draws random numbers.
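
A minimal sketch of what "explicit" could look like, assuming the sampling can be isolated in one function. The name `sample_audit_indices` and its parameters are hypothetical and not taken from the project. The sketch uses the standard library's `random.Random`; the same pattern should apply to a NumPy generator created with `np.random.default_rng(seed)`.

```python
import random


def sample_audit_indices(population_size, audit_size, rng):
    """Draw audit indices without replacement from an explicitly passed generator.

    Hypothetical helper: no hidden module-level random state is touched,
    so the result depends only on the arguments and the generator's seed.
    """
    return rng.sample(range(population_size), audit_size)


# Two "runs" that each build their own generator from the same seed.
run_a = sample_audit_indices(1000, 50, random.Random(1234))
run_b = sample_audit_indices(1000, 50, random.Random(1234))
print(run_a == run_b)
```

Passing the generator as an argument means an unrelated call to a global random function elsewhere in the pipeline cannot shift the stream used for the audit set.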

How Can It Be Tested or Reproduced

Log the audit dataset indices in two runs with identical seeds and check that they agree.
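
One possible way to run this check, sketched under the assumption that the audit indices are available as a sequence of integers at the end of each run. The helpers `log_indices` and `compare_indices` are hypothetical names for illustration, not part of the project.

```python
import json
from pathlib import Path


def log_indices(indices, path):
    """Write the audit indices of one run to a JSON file (hypothetical helper)."""
    Path(path).write_text(json.dumps([int(i) for i in indices]))


def compare_indices(indices_a, indices_b):
    """Report whether two runs selected the same indices in the same order."""
    a = [int(i) for i in indices_a]
    b = [int(i) for i in indices_b]
    return {
        "identical": a == b,                     # same indices, same order
        "same_set": set(a) == set(b),            # same indices, any order
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
    }


# Example with made-up indices from two runs.
report = compare_indices([3, 7, 11, 42], [3, 7, 12, 42])
print(report)
```

Distinguishing `identical` from `same_set` helps tell apart a pure ordering difference from a genuinely different selection.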

@rickardbrannvall
Collaborator Author

The problem may be related to the prepare_train_test_dataset function in data_preparation.py in dev_utils.
