3a. Selecting media with Youtube

The Youtube selector uses YouTube's Data API to find videos that match a search term and were uploaded between two dates. The API requires an API key, which is free to obtain. Follow the instructions in the documentation, then add the API key on its own line in the .env file at the root of your mtriage folder.

With the API key in our mtriage environment, we can run the following config to select some videos from YouTube:

folder: media/demo_official/3
select:
  name: Youtube
  config:
    search_term: Tear gas
    uploaded_before: "2015-10-02T00:00:00Z"
    uploaded_after: "2015-10-01T00:00:00Z"

Let's run it:

./mtriage run docs/tutorial/3/3a.yaml

The Youtube selector indexes videos by making an API call, and then downloads them in parallel, using as many workers as your computer has CPU cores available. By default, it downloads the videos at very low quality, and it also retrieves a 'meta.json' file describing each video's provenance and other metadata.
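To make the indexing step more concrete, here is a minimal sketch of how a date-bounded search request to the YouTube Data API v3 can be built. It is not mtriage's own code: the function name `build_search_url` and the placeholder key are assumptions made for illustration, and the selector may construct its request differently. The query parameters (`q`, `publishedAfter`, `publishedBefore`, `type`, `part`, `maxResults`, `key`) are those of the API's public search method.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 search method.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(api_key, search_term, uploaded_after, uploaded_before):
    """Build a search URL for videos published inside a date window.

    Hypothetical helper, for illustration only. Timestamps are RFC 3339
    strings such as "2015-10-01T00:00:00Z", as in the config above.
    """
    params = {
        "part": "snippet",
        "type": "video",  # ignore channels and playlists
        "q": search_term,
        "publishedAfter": uploaded_after,
        "publishedBefore": uploaded_before,
        "maxResults": 50,  # largest page size the search method accepts
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    search_term="Tear gas",
    uploaded_after="2015-10-01T00:00:00Z",
    uploaded_before="2015-10-02T00:00:00Z",
)
print(url)
```

The sketch only builds the URL and makes no network call. A real client would send the request, read the video IDs from the JSON response, and follow the response's page token to fetch further pages.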

3b. Image classification with KerasPretrained

Let's now classify the frames of the videos we've downloaded, using image classifiers that have been pretrained on the ImageNet database. We'll do so using a neural net architecture called ResNet, a widely used architecture for image classification. We'll give the KerasPretrained analyser the three labels we're interested in (tank, rifle, and military uniform) to indicate that we want to predict the appearance of these objects in the videos' frames.

folder: media/demo_official/3
elements_in:
  - Youtube
analyse:
  - name: Frames
  - name: KerasPretrained
    config:
      model: ResNet50
      labels:
        - tank
        - rifle
        - military uniform

Note that the first time you run this config, it will download the pretrained weights for ResNet, a file of roughly 100 MB (this download only happens once):

./mtriage run docs/tutorial/3/3b.yaml
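As a rough illustration of what restricting a classifier to a few labels means, the sketch below filters one frame's predictions down to the labels from the config. It is an assumption-laden example rather than the KerasPretrained analyser's implementation: the function names, the `min_score` cut-off, the class IDs, and the scores are all made up. The input mimics the `(class_id, class_name, score)` tuples that Keras' `decode_predictions` returns for each image, where multi-word class names use underscores.

```python
# Labels of interest, written as in the tutorial config.
LABELS_OF_INTEREST = ["tank", "rifle", "military uniform"]


def normalise(label):
    """Lower-case a label and treat underscores as spaces."""
    return label.lower().replace("_", " ")


def filter_predictions(predictions, labels, min_score=0.0):
    """Keep only the predictions whose class name is one of `labels`.

    `predictions` is a list of (class_id, class_name, score) tuples for a
    single frame. `min_score` is a hypothetical cut-off, not an mtriage option.
    """
    wanted = {normalise(label) for label in labels}
    return {
        normalise(name): score
        for _class_id, name, score in predictions
        if normalise(name) in wanted and score >= min_score
    }


# Placeholder class IDs and invented scores for one frame.
frame_predictions = [
    ("id_0", "tank", 0.61),
    ("id_1", "military_uniform", 0.22),
    ("id_2", "amphibian", 0.09),
]

matches = filter_predictions(frame_predictions, LABELS_OF_INTEREST, min_score=0.1)
print(matches)  # {'tank': 0.61, 'military uniform': 0.22}
```

In a real pipeline the tuples would come from running ResNet50 on each extracted frame; here they are hard-coded so the filtering logic can be read on its own.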

3c. A complete mtriage workflow

Now that we've tested the parts, let's put this all together in a single workflow, and broaden the media space slightly:

folder: media/demo_official/3c
select:
  name: Youtube
  config:
    search_term: tear gas + mexico
    uploaded_before: "2018-11-30T00:00:00Z"
    uploaded_after: "2018-11-15T00:00:00Z"
analyse:
  - name: Frames
  - name: ImageDedup
    config:
      threshold: 3
      method: dhash
  - name: KerasPretrained
    config:
      model: ResNet50
      labels:
        - tank
        - rifle
        - military uniform
  - name: Rank

In this config, we select videos uploaded between the 15th and 30th of November 2018 that match both "tear gas" and "mexico" in YouTube's search API. Once they are downloaded, we split each video into frames, deduplicate similar images using dhash, classify the deduplicated frames using ResNet, and then create an additional JSON file that ranks the classified videos according to the number of positive frames they contain (using the Rank analyser).
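For readers unfamiliar with dhash, the following sketch shows the general technique of difference hashing and threshold-based deduplication. It is not the ImageDedup analyser's code. It assumes each frame has already been converted to grayscale and resized to 8 rows by 9 columns, and it assumes that `threshold` means "two frames are duplicates if their hashes differ in at most this many bits", which is a common convention but may not match mtriage exactly.

```python
def dhash(pixels):
    """Difference hash of a grayscale image given as 8 rows of 9 values.

    Each bit records whether a pixel is brighter than its right-hand
    neighbour, giving 8 bits per row and 64 bits in total.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def deduplicate(frames, threshold=3):
    """Return the names of frames that are not near-duplicates of an earlier one.

    `frames` is a list of (name, pixels) pairs. A frame is kept only if its
    hash differs from every previously kept hash by more than `threshold` bits.
    """
    kept = []
    for name, pixels in frames:
        frame_hash = dhash(pixels)
        if all(hamming(frame_hash, other) > threshold for _, other in kept):
            kept.append((name, frame_hash))
    return [name for name, _ in kept]


# Three synthetic 8x9 "frames": a left-to-right gradient, the same gradient
# with one pixel changed, and the gradient mirrored.
gradient = [[col * 10 for col in range(9)] for _ in range(8)]
nearly_same = [row[:] for row in gradient]
nearly_same[0][0] = 15
mirrored = [[(8 - col) * 10 for col in range(9)] for _ in range(8)]

unique = deduplicate(
    [("frame_1", gradient), ("frame_2", nearly_same), ("frame_3", mirrored)],
    threshold=3,
)
print(unique)  # ['frame_1', 'frame_3']
```

The second frame differs from the first in a single hash bit, so it is dropped at a threshold of 3, while the mirrored frame differs in all 64 bits and is kept.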

That's a fair bit of computational work. Go and grab a beverage while this command runs to completion, if you like:

./mtriage run docs/tutorial/3/3c.yaml

Once it's finished, take a look at the files that the workflow has produced in the media/demo_official/3c folder. You should see everything in a 'Youtube' folder (as you may recall, mtriage runs are organised internally by selector), and then most of the created media in a 'derived' folder inside that.
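The ranking step described earlier can be pictured with a short sketch. This is only an illustration of ranking videos by their number of positive frames. The input shape, the `threshold` used to decide whether a frame is positive, and the function name are assumptions, and the JSON written by the Rank analyser may be structured differently.

```python
def rank_videos(frame_scores, threshold=0.5):
    """Order videos by how many of their frames score at or above `threshold`.

    `frame_scores` maps a video ID to a list of per-frame scores for one label.
    Ties are broken alphabetically by video ID so the order is deterministic.
    """
    counts = {
        video_id: sum(1 for score in scores if score >= threshold)
        for video_id, scores in frame_scores.items()
    }
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented scores for three hypothetical videos.
scores_by_video = {
    "video_a": [0.9, 0.2, 0.7],
    "video_b": [0.1, 0.3],
    "video_c": [0.8, 0.6, 0.95, 0.4],
}

ranking = rank_videos(scores_by_video)
print(ranking)  # [('video_c', 3), ('video_a', 2), ('video_b', 0)]
```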

You're officially finished with the mtriage tutorial. If you want to work through the media mtriage has just analysed using a frontend interface, as we do here at Forensic Architecture, head over to our mtriage-viewer and follow the instructions there!