
Running the Live Camera Recognition Demo

The imagenet.cpp / imagenet.py samples that we used previously can also be used for realtime camera streaming. The types of supported cameras include:

MIPI CSI cameras (csi://0)
V4L2 cameras (/dev/video0)
RTP/RTSP streams (rtsp://username:password@ip:port)
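
As a rough illustration of how the URI forms above map to camera types, here is a minimal sketch. The `camera_kind` helper is hypothetical and not part of the repo; the explicit `v4l2://` and `rtp://` prefixes are assumptions about other spellings the streaming layer accepts.

```python
def camera_kind(uri):
    """Guess the camera type from a stream URI (hypothetical helper, not a repo API)."""
    if uri.startswith("csi://"):
        return "MIPI CSI"
    # A bare device node such as /dev/video0 is treated as a V4L2 camera.
    if uri.startswith("v4l2://") or uri.startswith("/dev/video"):
        return "V4L2"
    if uri.startswith(("rtp://", "rtsp://")):
        return "RTP/RTSP"
    return "unknown"


if __name__ == "__main__":
    for uri in ("csi://0", "/dev/video0", "rtsp://username:password@ip:port"):
        print(uri, "->", camera_kind(uri))
```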

For more information about video streams and protocols, please see the Camera Streaming and Multimedia page.

Below are some typical scenarios for launching the program on a camera feed (run the program with --help for more options):

C++

$ ./imagenet csi://0                    # MIPI CSI camera
$ ./imagenet /dev/video0                # V4L2 camera
$ ./imagenet /dev/video0 output.mp4     # save to video file

Python

$ ./imagenet.py csi://0                 # MIPI CSI camera
$ ./imagenet.py /dev/video0             # V4L2 camera
$ ./imagenet.py /dev/video0 output.mp4  # save to video file

note: for example cameras to use, see these sections of the Jetson Wiki:
             - Nano:  https://eLinux.org/Jetson_Nano#Cameras
             - Xavier: https://eLinux.org/Jetson_AGX_Xavier#Ecosystem_Products_.26_Cameras
             - TX1/TX2: developer kits include an onboard MIPI CSI sensor module (OV5693)

The OpenGL window displays the live camera stream, the name of the classified object, and the classification confidence, along with the framerate of the network. On Jetson Nano you should see up to around 75 FPS for GoogleNet and ResNet-18 (faster on other Jetsons).
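
The per-frame work behind that window can be sketched roughly as follows. This is a simplified outline, not the full imagenet.py sample. It assumes the repo's Python bindings are installed as `jetson_inference` and `jetson_utils` (older releases expose them as `jetson.inference` and `jetson.utils`). The `format_overlay` and `run_camera` names are hypothetical helpers introduced for illustration.

```python
def format_overlay(class_desc, confidence):
    """Build the text drawn on each frame, e.g. '87.32% banana'."""
    return "{:.2f}% {}".format(confidence * 100, class_desc)


def run_camera(input_uri="csi://0", output_uri="display://0", network="googlenet"):
    # Imported lazily so this file can still be loaded on a machine
    # that does not have the Jetson bindings installed.
    from jetson_inference import imageNet
    from jetson_utils import videoSource, videoOutput, cudaFont

    net = imageNet(network)            # load the classification network
    camera = videoSource(input_uri)    # e.g. csi://0 or /dev/video0
    display = videoOutput(output_uri)  # OpenGL window or a video file
    font = cudaFont()

    while display.IsStreaming():
        img = camera.Capture()
        if img is None:                # capture timed out, try again
            continue

        # Classify the frame and overlay the result on the image.
        class_id, confidence = net.Classify(img)
        text = format_overlay(net.GetClassDesc(class_id), confidence)
        font.OverlayText(img, img.width, img.height, text, 5, 5,
                         font.White, font.Gray40)

        display.Render(img)
        display.SetStatus("{} | Network {:.0f} FPS".format(
            net.GetNetworkName(), net.GetNetworkFPS()))

        if not camera.IsStreaming():
            break
```

On a Jetson with a camera attached, calling `run_camera("/dev/video0")` should open the display window; passing `output.mp4` as `output_uri` would be the counterpart of the "save to video file" commands above.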

The application can recognize up to 1000 different types of objects, since the classification models are trained on the ILSVRC ImageNet dataset, which contains 1000 classes of objects. The names of the 1000 classes are listed in the repo under data/networks/ilsvrc12_synset_words.txt
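
To look up a class name outside the application, the label file can be read with a few lines of Python. This sketch assumes each line holds a WordNet synset ID followed by whitespace and a comma-separated description, and that the zero-based line position corresponds to the class ID; both are assumptions about the file layout, and the function names are hypothetical.

```python
def parse_synset_words(lines):
    """Return class descriptions in file order (assumed to match class IDs)."""
    labels = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split off the leading synset ID, keep the rest as the description.
        parts = line.split(None, 1)
        labels.append(parts[1] if len(parts) > 1 else "")
    return labels


def load_synset_words(path="data/networks/ilsvrc12_synset_words.txt"):
    """Read the label file from the repo checkout (path relative to the repo root)."""
    with open(path, encoding="utf-8") as f:
        return parse_synset_words(f)
```

With the repo checked out, `load_synset_words()[class_id]` would then give the description for a class ID returned by the network, provided the layout assumptions hold.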

This concludes this section of the Hello AI World tutorial on image classification. Next, we're going to start using Object Detection networks, which provide us with the bounding box coordinates of multiple objects per frame.

Next | Multi-Label Classification for Image Tagging
Back | Coding Your Own Image Recognition Program
