- Install Rust (see https://rustup.rs/)
- Clone submodules: `git submodule init && git submodule update`
- Run `./install_deps.sh` to install the submodules.
- `cd vgrid-widget` and run `./install.sh`.
- This will require `npm` and other JavaScript dependencies (`npm install --save react react-dom mobx mobx-react`). Install them as needed.
- Once this succeeds, `cd ..` to return to the top level.
- Install Python dependencies: `pip3 install -r requirements.txt`
- Copy/symlink the indexed captions as `data/index`
- Copy/symlink the data directory as `data`
- Run `./derive_data.py` to generate derived data.
- Run `./develop.py` to start a development server, or edit `config.json` to serve using WSGI.
Run `pytest -vs tests` from the top directory.
There should be 4 entries in the index directory (`data/index`):
- `documents.txt` (a list of documents that are indexed)
- `lexicon.txt` (a list of all the words)
- `index.bin` (a directory or inverted index file)
- `data` (a directory of all the binary encoded captions)
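
A quick way to confirm the index was copied or symlinked correctly is to check for the four entries listed above. The following is a minimal sketch, not part of the repository: the helper name `missing_index_entries` is hypothetical, and the path `data/index` is assumed from the setup steps.

```python
from pathlib import Path

# Entry names taken from the list above.
EXPECTED_INDEX_ENTRIES = ("documents.txt", "lexicon.txt", "index.bin", "data")


def missing_index_entries(index_dir):
    """Return the expected entries that are absent from index_dir.

    Hypothetical helper for illustration; it only checks that each
    name exists (as a file, directory, or symlink target).
    """
    root = Path(index_dir)
    return [name for name in EXPECTED_INDEX_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumes the index lives at data/index, as in the setup steps.
    missing = missing_index_entries("data/index")
    if missing:
        print("Missing index entries:", ", ".join(missing))
    else:
        print("Index directory looks complete.")
```

Run it from the top directory so that the relative path resolves.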
The data directory consists of the following files and directories:
- `videos.json` (metadata about the videos)
- `faces.ilist.bin` (intervals when faces are on screen)
- `people` (directory containing intervals when identified people are on screen)
- `people.metadata.json` (optional; JSON dictionary of names to metadata tags)
- `hosts.csv` (optional; a list of people and channels that they are hosts of)
- `face-bboxes` (directory containing face bounding boxes)
- `derived` (this directory is generated by `./derive_data.py`)
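
The layout above can be checked with a short script before starting the server. This is a sketch under assumptions: the helper `check_data_dir` is hypothetical, and entries not marked optional in the list are treated as required, which the list implies but does not state outright. `derived` is reported separately because it only exists after `./derive_data.py` has run.

```python
from pathlib import Path

# Names taken from the list above. Treating unmarked entries as
# required is an assumption made for this sketch.
REQUIRED_DATA_ENTRIES = ("videos.json", "faces.ilist.bin", "people", "face-bboxes")
OPTIONAL_DATA_ENTRIES = ("people.metadata.json", "hosts.csv")
GENERATED_DATA_ENTRIES = ("derived",)


def check_data_dir(data_dir):
    """Report which expected entries are absent from data_dir.

    Hypothetical helper for illustration; it checks existence only,
    not file contents.
    """
    root = Path(data_dir)

    def absent(names):
        return [name for name in names if not (root / name).exists()]

    return {
        "missing_required": absent(REQUIRED_DATA_ENTRIES),
        "missing_optional": absent(OPTIONAL_DATA_ENTRIES),
        "missing_generated": absent(GENERATED_DATA_ENTRIES),
    }


if __name__ == "__main__":
    for kind, names in check_data_dir("data").items():
        print(kind + ":", ", ".join(names) if names else "none")
```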
- IntervalList (or ilist) - These are files that store intervals with a binary bit-vector payload. The intervals can overlap, but must be sorted by start time.
- IntervalSet (or iset) - These are files that store non-overlapping intervals, sorted by start time. Unlike IntervalList, there is no bit-vector payload.
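
The difference between the two formats comes down to two invariants, which the sketch below expresses in memory. It does not read the binary files, whose on-disk layout is not described here. The class and function names are hypothetical, the payload is modeled as a plain integer, times are assumed to be integers in an unspecified unit, and intervals that merely touch (one ends where the next starts) are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PayloadInterval:
    """One IntervalList entry (in-memory model, not the file format)."""
    start: int
    end: int
    payload: int  # bit-vector payload, modeled here as an int


def is_valid_interval_list(intervals: List[PayloadInterval]) -> bool:
    """IntervalList: sorted by start time; overlaps are allowed."""
    well_formed = all(i.start <= i.end for i in intervals)
    sorted_by_start = all(
        a.start <= b.start for a, b in zip(intervals, intervals[1:])
    )
    return well_formed and sorted_by_start


def is_valid_interval_set(intervals: List[Tuple[int, int]]) -> bool:
    """IntervalSet: sorted by start time, non-overlapping, no payload."""
    well_formed = all(start <= end for start, end in intervals)
    # Each interval must begin at or after the end of the previous one
    # (touching intervals are assumed to be acceptable).
    disjoint = all(
        prev_end <= start
        for (_, prev_end), (start, _) in zip(intervals, intervals[1:])
    )
    return well_formed and disjoint


if __name__ == "__main__":
    ilist = [PayloadInterval(0, 10, 0b01), PayloadInterval(5, 12, 0b10)]
    print(is_valid_interval_list(ilist))              # overlapping but sorted
    print(is_valid_interval_set([(0, 10), (5, 12)]))  # overlap is rejected
```

The same pair of overlapping intervals is a valid IntervalList but not a valid IntervalSet, which is the practical distinction between the two.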