Question about PassGAN implementation #14

Open
iBM88 opened this issue Sep 15, 2018 · 6 comments

Comments

@iBM88

iBM88 commented Sep 15, 2018

Hi, I have some questions:

  • Is the data encoded as a one-hot representation for each character of the password?
  • How do you force the generator to return one-hot encoded characters that can be fed into the discriminator?
  • How do you force the network to produce variable-length passwords?

Thanks

@prodnet

prodnet commented Nov 24, 2018

Greetings,

- Why does PassGAN only produce passwords of length <= 9 characters, and how can I adjust this parameter?
- Why does it generate candidates that are not clean, so that they are not compatible with hashcat?

Cevess
mugan
26341909
�hawitrx
aulemt28
æaæ��1

Maybe you can clarify this for us!
Thanks.

@prodnet

prodnet commented Nov 28, 2018

Please reply.

@brannondorsey
Owner

@iBM88 apologies for the late reply. I haven't looked at this code base in a while, nor have I read the paper in over a year. But here is a quick stab at answering your questions:

  1. Yes
  2. The network outputs a probability distribution over all possible output characters, and the most likely character is selected at each position. This is called greedy argmax sampling, and you can see it here.
  3. Put simply, you don't ;) The generator learns to produce passwords of varying length because the training data has passwords of different lengths. A knowledgeable spectator would recognize that this GAN implementation can't accept variable-length input or produce variable-length output. I get around that by padding the input with the backtick character (`), which is stripped after password generation.
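
As a rough illustration of points 1 and 2, here is a minimal pure-Python sketch of one-hot encoding and greedy argmax decoding. The character set and the helper names `one_hot_encode` and `argmax_decode` are assumptions made for this example; they are not taken from the repository, which builds its character map from the training data.

```python
# Illustrative only: CHARSET and both helpers are hypothetical, not the repo's code.
CHARSET = list("abcdefghijklmnopqrstuvwxyz0123456789`")
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARSET)}


def one_hot_encode(password):
    """Return one row per character, with a single 1.0 at that character's index."""
    rows = []
    for ch in password:
        row = [0.0] * len(CHARSET)
        row[CHAR_TO_INDEX[ch]] = 1.0
        rows.append(row)
    return rows


def argmax_decode(rows):
    """Greedy argmax: pick the highest-scoring character at every position."""
    chars = []
    for row in rows:
        best_index = max(range(len(row)), key=row.__getitem__)
        chars.append(CHARSET[best_index])
    return "".join(chars)


if __name__ == "__main__":
    encoded = one_hot_encode("hunter2```")
    print(len(encoded), len(encoded[0]))
    print(argmax_decode(encoded))
```

The generator's output rows are soft probability distributions rather than exact one-hot vectors, which is why decoding takes the argmax at each position instead of looking for a 1.0.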

Password data looks like this at input and after output:

password``
hunter2```
god```````

The ` characters are stripped before the passwords are printed to stdout.

password
hunter2
god
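
The padding round trip can be sketched as follows. This is a hypothetical illustration: `pad_password` and `strip_padding` are names invented here, and removing every backtick (rather than only trailing ones) is an assumption about how the stripping is done.

```python
# Illustrative only: helper names and the exact stripping rule are assumptions.
PAD_CHAR = "`"


def pad_password(password, seq_length):
    """Right-pad a password with backticks up to a fixed sequence length."""
    if len(password) > seq_length:
        raise ValueError("password is longer than seq_length")
    return password.ljust(seq_length, PAD_CHAR)


def strip_padding(generated):
    """Remove every padding character from a generated sequence."""
    return generated.replace(PAD_CHAR, "")


if __name__ == "__main__":
    for word in ["password", "hunter2", "god"]:
        padded = pad_password(word, 10)
        print(padded, "->", strip_padding(padded))
```

One consequence of this scheme is that a real password containing a backtick cannot be represented faithfully, because the backtick is indistinguishable from padding.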

For some reason, at least one user has experienced unexpected behavior related to this technique. I haven't investigated further to determine if it was user error or an issue with the repo itself.

@brannondorsey
Owner

@prodnet The maximum password length can be set using the --seq-length command-line argument in train.py. As for why the generator is producing "not clean" output, my guess is that the training data contains similar "not clean" characters/bytes. It's been a while since I've worked with this codebase, so this is only speculation.
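
If the cause is indeed stray non-ASCII bytes in the training data (which is only a guess), one possible workaround is to filter the wordlist before training, or the generated candidates before passing them to hashcat. The sketch below is an assumption-laden example and not part of the repository: `is_clean`, `clean_lines`, the allowed character set, and the default `max_length` of 10 are all hypothetical choices.

```python
import string

# Illustrative only: the allowed set is an assumption. The backtick is excluded
# because it is used as the padding character.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation) - {"`"}


def is_clean(candidate, max_length=10):
    """True if the candidate is non-empty, short enough, and printable ASCII only."""
    if not 0 < len(candidate) <= max_length:
        return False
    return all(ch in ALLOWED for ch in candidate)


def clean_lines(lines, max_length=10):
    """Yield only the clean candidates from an iterable of text lines."""
    for line in lines:
        candidate = line.rstrip("\r\n")
        if is_clean(candidate, max_length):
            yield candidate


if __name__ == "__main__":
    sample = ["Cevess\n", "mugan\n", "26341909\n", "\ufffdhawitrx\n", "aulemt28\n"]
    for candidate in clean_lines(sample):
        print(candidate)
```

Filtering the training data addresses the suspected cause, while filtering the output only hides the symptom, so the former is probably the better place to start.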

@xiaozhouguo94

I can't reproduce your results using the model I trained; my cover rate is only about 4%.
So I wonder whether you can consistently achieve a cover rate of almost 20% every time?

@vamsijay11

The code executes, but the output file (generated_pass.txt) is not created.
Which file contains the code that creates generated_pass.txt?

Please help me with this problem.
