Official implementation of the Cura system from the paper "Cura: Curation at Social Media Scale".

StanfordHCI/Curation-Modeling

Curation Modeling

This repository contains the official implementation of the Cura system from the paper "Cura: Curation at Social Media Scale".

The raw Reddit data and the checkpoint of the best model are available at: Google Drive.

Setting up

Check out the code from our repository using

git clone https://github.com/Azure-Vision/Curation-Modeling.git
cd Curation-Modeling
git checkout Trans

Place the .env.development file in the root directory of the repository.

Install the dependencies using

conda env create -f environment.yml

Then activate the environment with

conda activate cr2

Train on Reddit data

Run the following script, where CONFIG_PATH is the path to a configuration file:

python train.py CONFIG_PATH

The configurations used in the paper “Cura: Curation at Social Media Scale” can be found at configs/subreddit_minority_no_peer_new.yml, and the configurations for the online experiment that includes more subreddits can be found at configs/subreddit_minority_no_peer_more_subs.yml.
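Before starting a long training run, it can help to confirm that the files this README mentions are in place. The following is a minimal sketch, not part of the repository: the helper names `missing_prerequisites` and `build_train_command` are hypothetical, and the sketch assumes it is run from the repository root, with `train.py` and `.env.development` located there as described above.

```python
import sys
from pathlib import Path

# Config files named in this README.
PAPER_CONFIG = "configs/subreddit_minority_no_peer_new.yml"
ONLINE_CONFIG = "configs/subreddit_minority_no_peer_more_subs.yml"


def missing_prerequisites(repo_root, config_path):
    """Return the required files that are absent under repo_root (hypothetical helper)."""
    root = Path(repo_root)
    required = [".env.development", "train.py", config_path]
    return [name for name in required if not (root / name).is_file()]


def build_train_command(config_path):
    """Build the argument list equivalent to `python train.py CONFIG_PATH`."""
    # sys.executable is the interpreter of the currently active environment.
    return [sys.executable, "train.py", config_path]
```

A typical use would be to call `missing_prerequisites(".", PAPER_CONFIG)` and, if the returned list is empty, pass `build_train_command(PAPER_CONFIG)` to `subprocess.run(..., check=True)`.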

Run prediction on Reddit data

Evaluate the prediction accuracy and confidence of the curation model under different conditions: run test_model.ipynb.

Evaluate the change in prediction accuracy when the curation model receives more peer votes: run sim_new_votes.ipynb.

Perform curation on selected subreddit given selected curators: run curation.ipynb.

Launch the interface for administrators to select curators and perform curation using

streamlit run curation_interface.py

Integrate with Curio

Collect and preprocess posting and user data from the Curio app: run process_CURIO_data.ipynb.

Fine-tune the pretrained curation model on Curio data using

cd trained_models
mkdir finetune_CURIO_full_data
mkdir deploy_CURIO_full_data
cp subreddit_minority_no_peer_new/latest.pt finetune_CURIO_full_data/latest.pt
cd ..
python train.py configs/finetune_CURIO_full_data.yml
cp trained_models/finetune_CURIO_full_data/best.pt trained_models/deploy_CURIO_full_data/best.pt
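The checkpoint handling around the fine-tuning run can also be scripted. The sketch below mirrors the copy steps of the command above using only the directory and file names that appear in it. The function names `stage_pretrained` and `promote_best` are hypothetical and not part of the repository. Unlike plain `mkdir`, this version does not fail when the directories already exist.

```python
import shutil
from pathlib import Path

# Directory names taken from the command above.
PRETRAINED = "subreddit_minority_no_peer_new"
FINETUNE = "finetune_CURIO_full_data"
DEPLOY = "deploy_CURIO_full_data"


def stage_pretrained(models_dir="trained_models"):
    """Create the fine-tune and deploy directories and copy latest.pt into the former."""
    models = Path(models_dir)
    (models / FINETUNE).mkdir(parents=True, exist_ok=True)
    (models / DEPLOY).mkdir(parents=True, exist_ok=True)
    src = models / PRETRAINED / "latest.pt"
    if not src.is_file():
        raise FileNotFoundError(f"pretrained checkpoint not found: {src}")
    return Path(shutil.copy2(src, models / FINETUNE / "latest.pt"))


def promote_best(models_dir="trained_models"):
    """Copy best.pt from the fine-tune directory to the deploy directory."""
    models = Path(models_dir)
    src = models / FINETUNE / "best.pt"
    if not src.is_file():
        raise FileNotFoundError(f"fine-tuned checkpoint not found: {src}")
    (models / DEPLOY).mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, models / DEPLOY / "best.pt"))
```

Under these assumptions, `stage_pretrained()` would run before `python train.py configs/finetune_CURIO_full_data.yml` and `promote_best()` after it finishes.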

Launch the curation model backend for Curio with

uvicorn curation_backend:app --port 5000
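To confirm that the backend process is accepting connections, a plain TCP check is enough and requires no knowledge of the backend's routes, which this README does not document. The sketch below assumes the server was started with the command above and is therefore listening on uvicorn's default host, 127.0.0.1, at port 5000. The function name `backend_is_listening` is hypothetical.

```python
import socket


def backend_is_listening(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

This only shows that something is listening on the port. It does not verify that the curation model has loaded or that requests will succeed.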
