
iarebwan/GuessTheClass

 
 


A CS105 Project that classifies and characterizes class lectures based on word frequency.
Project developed by Shreya Balaji, Benson Wan, and Richard Duong.
Link to the GitHub repository here.
If you want to take a look at our presentation and findings, click here.


  1. Overview
  2. Table Of Contents
  3. How to use
  4. References


Additional Documents

  1. Project Presentation
  2. Project Report
  3. Timeline
  4. Assignment Specifications
  5. Project Proposal

This repository contains sample data extracted from YouTube. The directory structure is as follows:

YouReader/                  # custom Python package for extracting captions
docs/                       # documents, graphics, and other resources
notebooks/                  # notebooks for graphics
scripts/                    # setup scripts
tests/                      # unit and integration tests
old/                        # old development code
data/                       # collected data
          links.csv         # input file for links
          example.csv       # example input file
          save.json         # downloaded and cleaned data

Steps (Prerequisites)

Before you can use and test code from this project, you will need the following installed on your system:


Optional if you want to generate graphics with notebooks


Steps (First Time Installation)

To use this package, you need a virtual environment that holds the prerequisite Python libraries. If you have not created the virtual environment yet, follow these steps.

  1. Download and extract the code
  2. Run the following commands:
Move to project directory
=========================
$ cd GuessTheClass

To generate a virtual environment
=================================
[Linux, MacOS]
$ chmod +x scripts/setup.sh
$ scripts/setup.sh

[Git Bash on Windows]
$ scripts/winsetup.sh

[Cmd Prompt on Windows]
> "scripts/setup.bat"


Steps (General Setup)

After setting up the virtual environment for the first time, run these commands to load it each time before you start using our package.

Load the virtual environment
============================
[Linux, MacOS]
$ source env/bin/activate

[Git Bash on Windows]
$ source env/Scripts/activate

[Cmd Prompt on Windows]
> "env/Scripts/activate.bat" 


Disable the virtual environment
===============================
$ deactivate
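
If you are unsure whether the virtual environment is currently loaded, a quick check from Python can help. The snippet below is a minimal sketch and not part of this repository; the function name `in_virtualenv` is hypothetical, and the check assumes the environment was created with the standard `venv` module or a recent version of `virtualenv`.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"Virtual environment is {state} (prefix: {sys.prefix})")
```

Run it after the activate command; if it reports that the environment is not active, repeat the "Load the virtual environment" step for your platform.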

How to run

If you want to run our program on the existing dataset, use the template notebook in the notebooks/ directory:

GuessTheClass/notebooks/template.ipynb

If you have your own existing dataset that you want to test:

  1. Put your YouTube links into "data/links.csv"
  2. You can build your captions dataset using the example below
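
As a rough illustration of the first step, the sketch below reads YouTube links from a CSV file and extracts the video ID from each one using only the Python standard library. It does not use the YouReader package, whose API is not shown here. The function names `video_id_from_url` and `read_video_ids` are hypothetical, and the layout of "data/links.csv" (one link in the first column of each row) is an assumption.

```python
import csv
from urllib.parse import urlparse, parse_qs

# Hostnames treated as regular YouTube watch pages (assumed list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def video_id_from_url(url):
    """Return the YouTube video ID found in `url`, or None if there is none."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        # Watch links carry the ID in the "v" query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def read_video_ids(csv_path):
    """Collect video IDs from the first column of a CSV file (assumed layout)."""
    video_ids = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            video_id = video_id_from_url(row[0])
            if video_id:  # header rows and non-YouTube cells are skipped
                video_ids.append(video_id)
    return video_ids
```

Calling `read_video_ids("data/links.csv")` would then give a list of IDs that a caption downloader could work through. Rows that do not contain a recognizable YouTube link, such as a header row, are silently skipped in this sketch.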


