CongGroup/Poisoning-SSL-based-RS

Poisoned_datasets

Poisoning data generated by our scheme.

For each dataset, we generated fake users numbering 1%, 2%, and 3% of the real users.

It is worth noting that, in the three original Amazon datasets (Beauty, Sports and Outdoors, and Toys and Games), no user interacts with the same item more than once, whereas in the Yelp dataset users often interact with the same item multiple times. Therefore, to keep our attack stealthy, when constructing the poisoning data for the Amazon datasets we let each fake user interact with the target item only once, while for Yelp we allow each fake user to interact with the target item multiple times. For comparison, we also provide poisoning data for the Yelp dataset in which each fake user interacts with the same item at most once.
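As an illustration of this constraint (not the repository's actual code), the sketch below shows how a fake user's interaction sequence could be assembled so that the target item appears only once for the Amazon-style data and possibly several times for Yelp-style data; the function names, the filler-item pool, and the sequence format are all hypothetical.

```python
import math
import random

def num_fake_users(num_real_users: int, ratio: float) -> int:
    """Fake-user budget for an attack ratio of 0.01, 0.02, or 0.03."""
    return math.ceil(num_real_users * ratio)

def build_fake_sequence(target_item: int, filler_items: list[int],
                        seq_len: int, allow_repeats: bool) -> list[int]:
    """Assemble one fake user's interaction sequence.

    allow_repeats=False mimics the Amazon datasets (an item appears at most
    once per user); allow_repeats=True mimics Yelp, where repeated
    interactions with the target item are plausible.
    """
    n_target = random.randint(2, 3) if allow_repeats else 1
    # filler_items is assumed not to contain the target item
    seq = random.sample(filler_items, seq_len - n_target)  # distinct fillers
    for pos in sorted(random.sample(range(seq_len), n_target)):
        seq.insert(pos, target_item)  # scatter the target item in the sequence
    return seq
```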

Seq-poison

Our model for generating fake users.

datasets

Original pre-processed user-item interaction records, obtained from the data downloaded from Google Drive (which is publicly available).

We use the “5-core” datasets as described in our paper.
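For reference, 5-core filtering keeps only users and items with at least five interactions, re-applying the rule until both conditions hold. A minimal pandas sketch, assuming columns named user and item (the repository's actual preprocessing may differ):

```python
import pandas as pd

def k_core(df: pd.DataFrame, k: int = 5) -> pd.DataFrame:
    """Iteratively drop users and items with fewer than k interactions."""
    while True:
        user_counts = df["user"].value_counts()
        item_counts = df["item"].value_counts()
        keep = (df["user"].map(user_counts) >= k) & (df["item"].map(item_counts) >= k)
        if keep.all():
            return df
        df = df[keep]
```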

Run

Create the bi-classifier:

python train_classify.py

This produces the bi-classifier model _{data_name}bi_classify.pt.

Train the generator that generates fake users:

python main.py

Generate poisoning data (the percentage of fake users can be set):

python generate_data.py
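As a quick sanity check on the generated poisoning data, one could count how often each fake user interacts with the target item (at most once for the Amazon-style data). The sketch below assumes one whitespace-separated item sequence per line and a known target-item ID; the file format, path, and ID are hypothetical, not guaranteed by generate_data.py.

```python
def target_item_counts(path: str, target_item: str) -> list[int]:
    """Count target-item occurrences in each fake user's sequence."""
    with open(path) as f:
        return [line.split().count(target_item) for line in f]

# Example (hypothetical file name and item ID):
# counts = target_item_counts("poisoned_beauty_1pct.txt", "1234")
# assert max(counts) <= 1  # Amazon-style constraint: target item at most once
```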
