Evaluating tokenization methods for multilingual social media data

Ivpe1975/thesis

Multilang_thesis

Step 1: Run everything in the "matchers" folder.

Step 2: Run the processing notebook.

Step 3: Train MaChAmp either by running the command at the bottom of machamp/machamp_train.job directly, or by submitting that file with sbatch on a SLURM cluster.

Step 4: Do the same with machamp_pred.job: submit it with sbatch, or, if you are not using a cluster, run all the commands it contains.

Step 5: Use the evaluation notebook to your heart's content.

You can read the resulting thesis in the PDF document in the root folder.
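
Steps 3 and 4 can be scripted. The following is a minimal Python sketch, not part of the repository: it submits a job file with sbatch when that command is on the PATH and otherwise falls back to running the whole file with bash. The helper names `build_command` and `launch` are hypothetical, and the bash fallback assumes the job file is an ordinary shell script. Note that for Step 3 the fallback runs every command in the file, not only the one at the bottom.

```python
import shutil
import subprocess
import sys


def build_command(job_file, have_sbatch):
    """Return the argv list used to launch a job file (hypothetical helper)."""
    if have_sbatch:
        # On a SLURM cluster: hand the file to the scheduler.
        return ["sbatch", job_file]
    # Assumption: without SLURM, the job file can be run as a plain shell script.
    return ["bash", job_file]


def launch(job_file):
    """Launch a job file with sbatch if available, else with bash."""
    have_sbatch = shutil.which("sbatch") is not None
    cmd = build_command(job_file, have_sbatch)
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Example: python launch_job.py machamp/machamp_train.job
    for path in sys.argv[1:]:
        print("ran:", " ".join(launch(path)))
```

Pass the path to machamp_pred.job in the same way for Step 4, using whatever location it has in your checkout.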
