An app that lets you control Google Slides presentations with just hand gestures
Although this project sounds pretty complicated on paper, it's actually easier than it seems. The computer vision and hand detection part of the code is mostly handled by MediaPipe, a library produced by Google that tracks the pose of the face, hands, and body.
So, all you have to do is tune the existing library so that it focuses on your specific use case. After that, MediaPipe returns the coordinates of every body part that you asked it to track.
These are all the possible hand coordinates (landmarks) returned by MediaPipe:
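For reference, MediaPipe's hand model reports 21 landmarks per hand, indexed 0 to 20. The sketch below rebuilds that index table in plain Python so it can be read without the library installed. The names follow MediaPipe's `HandLandmark` naming; `LANDMARK_NAMES` and `LANDMARK_INDEX` are helper names made up for this illustration and are not part of this project's code.

```python
# Index table for the 21 hand landmarks, in MediaPipe's order.
# LANDMARK_NAMES / LANDMARK_INDEX are illustrative helpers only.
LANDMARK_NAMES = ["WRIST"]

# The thumb has its own joint names: CMC, MCP, IP, TIP.
LANDMARK_NAMES += [f"THUMB_{joint}" for joint in ("CMC", "MCP", "IP", "TIP")]

# The other four fingers share the same joint names: MCP, PIP, DIP, TIP.
for finger in ("INDEX_FINGER", "MIDDLE_FINGER", "RING_FINGER", "PINKY"):
    LANDMARK_NAMES += [f"{finger}_{joint}" for joint in ("MCP", "PIP", "DIP", "TIP")]

# Reverse lookup: landmark name -> index into the landmark list.
LANDMARK_INDEX = {name: i for i, name in enumerate(LANDMARK_NAMES)}

if __name__ == "__main__":
    for i, name in enumerate(LANDMARK_NAMES):
        print(i, name)
```

Each landmark comes with x, y, and z values, so a tracked hand can be treated as a list of 21 points.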
The hand gesture part is a little more involved, but it is still simpler than it probably seems. The hand needs to pass four tests in order to be classified as making the proper gesture:
- The middle, ring, and pinky fingers need to be closed
- The thumb and pointer fingers need to be completely extended
- The thumb and pointer need to form approximately a right angle
- The hand needs to be pointing either left or right, not up, down, forward, or backward
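The four tests above can be sketched with a little vector math on the 21 landmark points. This is only one possible approach, not the code in "HandGestureDetection.py": the function names, the distance-to-wrist heuristic for "closed", and the thresholds (160 degrees for "extended", 90 plus or minus 25 degrees for the right angle) are all assumptions chosen for illustration. Landmarks are assumed to be a list of 21 `(x, y, z)` tuples in MediaPipe's index order.

```python
import math

WRIST = 0
# Landmark indices per finger, from the base joint to the tip.
THUMB = (1, 2, 3, 4)
INDEX = (5, 6, 7, 8)
MIDDLE = (9, 10, 11, 12)
RING = (13, 14, 15, 16)
PINKY = (17, 18, 19, 20)

def _sub(a, b):
    """Vector from point b to point a."""
    return tuple(p - q for p, q in zip(a, b))

def angle_deg(u, v):
    """Angle between two vectors in degrees (0.0 if either has zero length)."""
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        return 0.0
    cos = sum(p * q for p, q in zip(u, v)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def is_closed(lm, finger):
    """Assumed heuristic: a curled fingertip is nearer the wrist than its second joint."""
    _, second, _, tip = finger
    return math.dist(lm[tip], lm[WRIST]) < math.dist(lm[second], lm[WRIST])

def is_extended(lm, finger, min_angle=160.0):
    """A straight finger has a nearly flat angle at its second joint."""
    base, second, _, tip = finger
    return angle_deg(_sub(lm[base], lm[second]), _sub(lm[tip], lm[second])) >= min_angle

def forms_right_angle(lm, tolerance=25.0):
    """Thumb and index directions should be roughly perpendicular."""
    thumb_dir = _sub(lm[THUMB[3]], lm[THUMB[1]])
    index_dir = _sub(lm[INDEX[3]], lm[INDEX[0]])
    return abs(angle_deg(thumb_dir, index_dir) - 90.0) <= tolerance

def orientation(lm):
    """'left' or 'right' if the index finger is mostly horizontal, else None."""
    dx, dy, dz = _sub(lm[INDEX[3]], lm[INDEX[0]])
    if abs(dx) <= max(abs(dy), abs(dz)):
        return None  # pointing up, down, forward, or backward
    return "right" if dx > 0 else "left"

def classify(lm):
    """Return 'left', 'right', or None if any of the four tests fails."""
    if not all(is_closed(lm, f) for f in (MIDDLE, RING, PINKY)):
        return None
    if not (is_extended(lm, THUMB) and is_extended(lm, INDEX)):
        return None
    if not forms_right_angle(lm):
        return None
    return orientation(lm)
```

Whether "right" in image coordinates matches the user's right depends on whether the camera frame is mirrored, so a real app may need to flip the result.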
If you are curious how these tests are implemented in code, they are contained within the "HandGestureDetection.py" file. It is fairly short and readable, and I also added comments explaining most of the functions.
If the hand passes all of these tests, then either the right or left arrow key is pressed, depending on the orientation, which moves the slides forwards or backwards. If the gesture is held, there is a slight delay between each arrow key press. If there are conflicting instructions, for example if the right hand is pointing right while the left hand is pointing left, then no keys are pressed.
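The key press behaviour described above could look roughly like the following sketch. It is an illustration under assumptions rather than this project's actual code: `resolve_direction` and `KeyRepeater` are hypothetical names, the 1 second delay is a placeholder value, and the `press` callback stands in for whatever actually sends the arrow key (for example a wrapper around a keyboard automation library).

```python
import time

def resolve_direction(per_hand):
    """Combine per-hand results ('left', 'right', or None) into one direction.

    Returns None when no hand shows the gesture or when hands disagree.
    """
    directions = {d for d in per_hand if d is not None}
    return directions.pop() if len(directions) == 1 else None

class KeyRepeater:
    """Presses a key at most once per `delay` seconds while a gesture is held."""

    def __init__(self, press, delay=1.0, clock=time.monotonic):
        self.press = press    # callback taking 'left' or 'right'
        self.delay = delay    # assumed delay in seconds, not from the source
        self.clock = clock    # injectable so the timing can be tested
        self._last = None     # time of the most recent key press

    def update(self, direction):
        """Call once per frame; returns True if a key was pressed."""
        if direction is None:
            return False
        now = self.clock()
        if self._last is not None and now - self._last < self.delay:
            return False      # still inside the delay window
        self.press(direction)
        self._last = now
        return True

if __name__ == "__main__":
    repeater = KeyRepeater(press=lambda key: print("press", key), delay=1.0)
    repeater.update(resolve_direction(["right", None]))    # prints "press right"
    repeater.update(resolve_direction(["right", "left"]))  # conflict, no press
```

Passing the clock and the press callback in as parameters keeps the logic independent of the camera loop and of any particular keyboard library.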
- Hand Detection
- Detect hand orientation
- Detect if finger is closed
- Detect if finger is completely extended
- Detect if thumb and index form right angle
- Support left and right keypresses on both Windows and macOS
- Multi-hand Support
- Build for production on both Windows and macOS
- Convert code to JavaScript
- Develop mobile website
- Create Chrome (?) extension