Inquiries about Netflix dataset item size and the processing method #16

Open
seamoon224 opened this issue May 3, 2024 · 0 comments

Thank you for sharing the code and data.

I have a question about the Netflix dataset described in your paper. The paper reports 17,366 items. However, across the train.json, val.json, and test.json files, the highest item ID I found is 17,363, and only 8,413 unique items appear. This does not match the statistics cited in the paper.

Could you please provide some clarification on this discrepancy?
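
For reference, here is a minimal sketch of how counts like these can be reproduced. It assumes each file is a JSON object mapping a user ID to a list of integer item IDs. That layout is a guess and may not match the actual files, and `item_stats` is a hypothetical helper name, not something from the repository.

```python
import json
from pathlib import Path


def item_stats(paths):
    """Return (max_item_id, n_unique_items) across the given JSON files.

    Assumption: each file holds a JSON object of the form
    {user_id: [item_id, ...]}. Adjust the inner loop if the real
    layout differs.
    """
    items = set()
    for path in paths:
        with Path(path).open(encoding="utf-8") as f:
            interactions = json.load(f)
        # Collect every item ID seen for every user in this split.
        for item_ids in interactions.values():
            items.update(int(i) for i in item_ids)
    max_id = max(items) if items else None
    return max_id, len(items)


# Example call (file names as they appear in the repository):
# max_id, n_unique = item_stats(["train.json", "val.json", "test.json"])
```

Taking the union over all three splits matters here, since an item missing from train.json could still appear in val.json or test.json.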

It would also help to have a detailed description of your data processing steps. In addition, the provided datasets do not include user ratings. Do all interactions in train.json, val.json, and test.json indicate a positive user preference (i.e., movies the user rated highly, such as 4 or above)?

Thank you for your attention to these questions.
