Guess The Class

A CS105 Project that classifies and characterizes class lectures based on word frequency.
Project developed by Shreya Balaji, Benson Wan, and Richard Duong.
Link to the GitHub repository: iarebwan/GuessTheClass
If you want to take a look at our presentation and findings, click here
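
To give a feel for what "classifying lectures based on word frequency" can look like, here is a minimal sketch. It is not the project's actual classifier, which is not described in this README; it simply counts words in a transcript and picks the class whose keywords appear most often. The function names, the keyword lists, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word occurrences in a caption transcript."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def guess_class(transcript, class_keywords):
    """Return the label whose keywords occur most often in the transcript."""
    counts = word_frequencies(transcript)
    scores = {
        label: sum(counts[word] for word in words)
        for label, words in class_keywords.items()
    }
    return max(scores, key=scores.get)


# Hypothetical keyword lists, for illustration only.
keywords = {
    "biology": ["cell", "protein", "gene"],
    "computer science": ["algorithm", "array", "loop"],
}
sample = "Today we sort an array with a simple algorithm; the loop visits each array element."

if __name__ == "__main__":
    print(guess_class(sample, keywords))
```

A real pipeline would likely remove stop words and learn the per-class vocabulary from the collected captions rather than hard-coding it.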

How to use

This repository contains sample data extracted from YouTube. The directory structure is as follows:

YouReader/                  # custom Python package for extracting captions
docs/                       # documents, graphics, and other resources
notebooks/                  # notebooks for graphics
scripts/                    # setup scripts
tests/                      # unit and integration tests
old/                        # old development code
data/                       # collected data
          links.csv         # input file for links
          example.csv       # example input file
          save.json         # downloaded and cleaned data

Steps (Prerequisites)

Before you can use and test code from this project, you will need the following installed on your system:

Optional, if you want to generate graphics with notebooks:

Anaconda or Jupyter Notebook

Steps (First Time Installation)

To use this package, you'll need to create a virtual environment and install the prerequisite Python libraries into it. If you have not created the virtual environment yet, follow these steps.

Download and extract the code
Run the following commands:

Move to project directory
=========================
$ cd GuessTheClass

To generate a virtual environment
=================================
[Linux, MacOS]
$ chmod +x scripts/setup.sh
$ scripts/setup.sh

[Git Bash on Windows]
$ scripts/winsetup.sh

[Cmd Prompt on Windows]
> "scripts/setup.bat"

Steps (General Setup)

After setting up the virtual environment for the first time, run these commands to load the virtual environment each time before you start using our package.

Load the virtual environment
============================
[Linux, MacOS]
$ source env/bin/activate

[Git Bash on Windows]
$ source env/Scripts/activate

[Cmd Prompt on Windows]
> "env/Scripts/activate.bat" 


Disable the virtual environment
===============================
$ deactivate
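
To confirm from Python that the environment is active, a quick check like the one below can help. This is a minimal sketch, assuming the environment was created with the standard `venv` module or a recent `virtualenv`; the function name `in_virtualenv` is made up for illustration and is not part of this project.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the check reports False after activation, the shell is probably still using the system interpreter, so the project's libraries will not be importable.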

How to run

If you want to run our program and use the existing dataset, you can use the template notebook in the notebooks/ directory:

GuessTheClass/notebooks/template.ipynb

If you have your own existing dataset that you want to test:

Put your YouTube links into "data/links.csv"
You can build your captions dataset using the example below
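
The package's own caption-building example is not reproduced here. As a separate, minimal sketch of the preceding step only, the code below reads links from a CSV file and extracts the YouTube video IDs, which can be handy for checking the input file before downloading anything. It assumes one link in the first column of each row; the real layout of "data/links.csv" may differ, and the helper names `video_id` and `read_video_ids` are hypothetical and not part of YouReader.

```python
import csv
from pathlib import Path
from urllib.parse import parse_qs, urlparse


def video_id(url):
    """Extract the video ID from a common YouTube URL form, or return None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch links carry the ID in the "v" query parameter.
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def read_video_ids(csv_path="data/links.csv"):
    """Read the first column of each row and return the IDs that parse.

    Assumed layout: one link per row. Header rows, blank rows, and
    anything that is not a recognizable YouTube link are skipped.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        ids = [video_id(row[0]) for row in csv.reader(f) if row]
    return [i for i in ids if i]


if __name__ == "__main__":
    if Path("data/links.csv").exists():
        print(read_video_ids())
```

Rows that do not yield an ID are dropped silently here; logging them instead would make typos in the input file easier to spot.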
