Video Doodles frontend

Dependencies and required input data

  • Install dependencies

We use npm to manage dependencies. To build the app locally, first install Node.js/npm, then run:

cd app/frontend
npm install

  • Provide video data

We provide data for 16 videos. Download the archive and extract it into the frontend/public/data folder, so that it looks like:

    video_name             # Root folder for the video <video_name>
        depth_16           # Folder containing all depth frames

Running the web app

Run locally (after installing the dependencies above):

npm run dev
# Find it at http://localhost:8080

Build and serve (to deploy on some server):

npm run build
cd public
python3 -m http.server
# Find it at http://localhost:8000

Note: the app will not work correctly (i.e., it cannot place canvases correctly) unless the backend server is also running. See the backend repo for instructions on how to run it.

Exporting results

Video doodle results can be exported from the UI. In View mode, click the Export button. This downloads 3 files to your downloads folder (you may need to allow the website to download multiple files in the browser):

<timestamp>_<video-name>.zip         # color frames of the video
<timestamp>_<video-name>_result.json # save of the result, can be reloaded in the app
<timestamp>_<video-name>_log.json    # log of the session, used for analysis during the user study

The frames can be assembled back into a video, e.g. with ffmpeg:

# Uncompress frames
unzip <frames_zip_file> -d temp
# Render to a video
ffmpeg -y -framerate 20 -i 'temp/frame_%d.jpg' -c:v libx264 -pix_fmt yuv420p video.mp4
# Optional: delete the temp folder and the extracted frames
rm -r temp

We added a feature to export keyframes from a given canvas, so that they can be used as offline tracking targets. The Export Keyframes button is available in the right sidebar when a canvas is selected in Edit mode.

What's in there

This application features:

  • A three.js renderer that composites video frames with rendered 3D objects from a specified camera, using estimated depth maps to render occlusions. Both extrinsic and intrinsic camera parameters are set from a camera model estimated with COLMAP (see the preprocessing code for more details about the conversion). See Viewport.svelte for the three.js scene, Frame.js for details on camera parameters and depth/color buffers, and shaders/fragment.glsl for depth compositing.
  • An SVG element stacked on top of the three.js renderer canvas, which contains 2D gizmos corresponding to the AnimatedCanvas objects of the application. See Viewport.svelte and CanvasGizmo.js.
  • A timeline that displays keyframes from the AnimatedCanvas objects, and can be navigated by clicking. See KeyframeTimeline.svelte.
  • A drawing canvas that displays a remapped view into the scene, as seen "from the canvas" to enable sketching in context. See DrawingCanvas.svelte and SceneViewFromCanvas.svelte.
  • A super simple frame-by-frame animation system to define multi-frame animation loops. See SketchFramesTimeline.svelte and SketchAnimationClip.js.
  • A link to a backend server through a WebSocket, to exchange information as JSON strings. See websocket.js.
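
To illustrate the idea behind depth compositing mentioned in the first item above, here is a minimal CPU-side sketch in JavaScript. It is not the code from shaders/fragment.glsl: the function name compositeWithDepth, the flat per-pixel arrays, and the convention that a smaller depth value means closer to the camera are all assumptions made for illustration. The actual shader works on GPU textures and may encode depth differently.

```javascript
// Hypothetical helper (not part of the app): per-pixel depth test between
// a video frame and a rendered 3D layer.
// Assumptions: all arrays have one entry per pixel, depth is the distance
// from the camera (smaller = closer), and objectDepth is Infinity where
// no 3D object was rendered.
function compositeWithDepth(videoColor, videoDepth, objectColor, objectDepth) {
  const n = videoColor.length;
  if (videoDepth.length !== n || objectColor.length !== n || objectDepth.length !== n) {
    throw new Error("All buffers must have the same number of pixels");
  }
  // Keep the object color only where the object is in front of the scene.
  return videoColor.map((color, i) =>
    objectDepth[i] < videoDepth[i] ? objectColor[i] : color
  );
}

// Three pixels: object in front, object behind, no object at all.
const videoColor = [[10, 10, 10], [20, 20, 20], [30, 30, 30]];
const videoDepth = [2.0, 2.0, 2.0];
const objectColor = [[255, 0, 0], [255, 0, 0], [0, 0, 0]];
const objectDepth = [1.0, 3.0, Infinity];

const composited = compositeWithDepth(videoColor, videoDepth, objectColor, objectDepth);
console.log(composited);
```

In this toy example only the first pixel shows the object, because it is the only one where the object is closer to the camera than the estimated scene depth. The second pixel is occluded by the video content and the third has no object to draw.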

More resources


The VideoDoodles system and its implementation are described in the associated publication: webpage, paper, ACM page.

If this code is useful to your research, please consider citing the publication:

Emilie Yu: [email protected]