Error #1

Open
Xingkangze opened this issue Mar 9, 2020 · 3 comments

@Xingkangze

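Hello. The data in PNtrain and PUtrain in your train code is wrong. Have you noticed that every call to model.fit trains on the same batch of data? Taking MNIST as an example, each run trains on the same 1268 samples. As a result, TensorBoard shows the training set overfitting while the test set results are a mess. I hope you will take this problem seriously. Thanks.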

@wangqr
Owner

wangqr commented Mar 9, 2020

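Hello. In neural network training, once the training set and test set have been split, the split must not be changed. Put simply, changing the split is like having seen the exact exam questions before the exam: the accuracy you get that way is meaningless.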
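A minimal sketch of the fixed-split idea above, assuming NumPy. The helper `fixed_split` and the sizes used here are hypothetical and are not taken from this repository's code. The split is drawn once from a fixed seed, so every training run sees the same disjoint train and test indices.

```python
import numpy as np


def fixed_split(n_samples, n_train, seed=0):
    """Return (train_idx, test_idx): a reproducible, disjoint split.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    return perm[:n_train], perm[n_train:]


# Calling the helper twice with the same seed yields the same split,
# so no test sample can drift into the training set between runs.
train_idx, test_idx = fixed_split(n_samples=100, n_train=80)
train_again, test_again = fixed_split(n_samples=100, n_train=80)
```

Reusing the same training indices on every run is therefore expected behaviour rather than a bug, provided the indices were chosen once before any evaluation.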

As for the number of positive and negative samples, see Section 5 of the paper:
[image]

Regarding overfitting: the whole purpose of nnPU is to address overfitting. The paper already states that the baseline methods overfit severely.

@Xingkangze
Author

Xingkangze commented Mar 9, 2020 via email

@wangqr
Owner

wangqr commented Mar 9, 2020

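Right. The PN method used as the reference has 1000 positive samples, and the number of negative samples is about 1/4 of the positives, so the training set totals about 1200 samples. The number of test samples does not matter, because the test set is only used to characterize the results and has no effect on the network parameters.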
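The sample count above can be checked with a short arithmetic sketch. It assumes the "about 1/4" ratio is exactly 1/4, which is a simplification for illustration; the variable names are hypothetical.

```python
n_positive = 1000
# "About 1/4 of the positives", taken here as exactly 1/4 (an assumption).
n_negative = n_positive // 4
n_train = n_positive + n_negative
print(n_train)  # 1250
```

Under that assumption the total is 1250, the same order as the roughly 1200 quoted here and the 1268 reported in the original question.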
