Some questions #1

Open
DabiaoMa opened this issue Oct 30, 2017 · 2 comments

Comments

@DabiaoMa

Hi,

I have read through your implementation of Deep Voice 3, and it is really clean. Have you gotten any good results yet?

I also have some questions that maybe you could help me clear up.

  1. 'modules.py', line 24. Why do we need to set the first row of the embedding matrix to the zero vector?

  2. 'modules.py', line 270. I checked the paper, but I could not find any details about the 'scale' option.

  3. 'modules.py', lines 338 and 343. The paper says, 'For a single speaker, ωs is set to one for the decoder and fixed for the encoder to the ratio of output timesteps to input timesteps'. So maybe position_rate should be 1 for the queries and hp.T_y/hp.T_x for the keys?

  4. 'modules.py', line 384. I think this line performs context normalization, so maybe the denominator should be the square root of the total number of input timesteps, something like sqrt(tf.to_float(val.get_shape()[1]))?

  5. 'synthesis.py', line 38. Maybe the total number of timesteps should be hp.T_y//hp.r?

Thanks
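
Questions 3 and 4 can be made concrete with a small NumPy sketch. It is not taken from 'modules.py'; the function names, the sin/cos channel layout (borrowed from the Transformer convention), and the array shapes are all assumptions made for illustration. The first function scales the timestep index by a position rate, and the second divides the attention context by the square root of the number of input timesteps, as question 4 proposes.

```python
import numpy as np

def positional_encoding(num_steps, num_channels, position_rate=1.0):
    """Sinusoidal position encoding with the timestep index scaled by position_rate.

    Assumed layout: even channels use sin, odd channels use cos, and each
    (even, odd) channel pair shares one wavelength.
    """
    positions = np.arange(num_steps, dtype=np.float64)[:, None]   # (T, 1)
    channels = np.arange(num_channels)[None, :]                   # (1, d)
    inv_wavelength = 1.0 / np.power(10000.0, 2 * (channels // 2) / num_channels)
    angles = position_rate * positions * inv_wavelength           # (T, d)
    return np.where(channels % 2 == 0, np.sin(angles), np.cos(angles))

def normalized_context(attention, values):
    """Attention-weighted sum of values, divided by sqrt(number of input timesteps).

    attention: (T_out, T_in) weights, values: (T_in, d).
    """
    num_input_steps = values.shape[0]
    return attention @ values / np.sqrt(num_input_steps)

# Hypothetical sizes: queries use rate 1.0, keys use a larger rate.
query_positions = positional_encoding(num_steps=8, num_channels=16, position_rate=1.0)
key_positions = positional_encoding(num_steps=4, num_channels=16, position_rate=2.0)
```

With a position rate of 2.0, key timestep 1 receives the same encoding as query timestep 2, which is how the rate lines up the shorter input sequence with the longer output sequence.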

@Kyubyong
Owner

Thanks.

I haven't gotten good results yet.

  1. Index 0 is reserved for padding, so I wanted it to map to a zero vector. But I guess it makes no difference if it has nonzero values.
  2. I referenced 'Attention is all you need' https://arxiv.org/pdf/1706.03762.pdf
  3. You're right. Technically, the position rate for the encoder should be (T_y//r)/T_x since T_y is reduced by the reduction factor r.
  4. I think you're right.
  5. Yup, that's already done.
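
Answers 1 to 3 can be illustrated with a minimal NumPy sketch. It is not the repository's code: the function names and the example sizes are hypothetical, and the 'scale' option is assumed to mean the 1/sqrt(d_k) factor from 'Attention Is All You Need', which is the paper cited in answer 2.

```python
import numpy as np

def make_embedding_table(vocab_size, num_units, seed=0):
    """Random embedding table whose row 0 (the padding id) is a zero vector."""
    rng = np.random.default_rng(seed)
    table = rng.normal(scale=0.1, size=(vocab_size, num_units))
    table[0] = 0.0  # padding tokens contribute nothing downstream
    return table

def scaled_dot_product_attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d_k)) V for 2-D inputs.

    queries: (T_q, d_k), keys: (T_k, d_k), values: (T_k, d_v).
    Returns the context (T_q, d_v) and the attention weights (T_q, T_k).
    """
    d_k = keys.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights

def encoder_position_rate(T_x, T_y, r):
    """Encoder position rate (T_y // r) / T_x, as described in answer 3."""
    return (T_y // r) / T_x

# Hypothetical sizes, chosen only for illustration.
table = make_embedding_table(vocab_size=5, num_units=4)
embedded = table[np.array([3, 1, 0, 0])]  # the last two ids are padding
rate = encoder_position_rate(T_x=50, T_y=400, r=4)  # (400 // 4) / 50
```

Dividing the scores by sqrt(d_k) keeps their magnitude from growing with the channel count, which would otherwise push the softmax toward very small gradients.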

@DabiaoMa
Author

DabiaoMa commented Nov 1, 2017

I would like to implement it in MXNet, but I am still hesitating. I hope you get good results.
