Hackable AI-Powered Mirror on your laptop.
Mirror is a web app that continuously watches the real-time video feed from your webcam and responds with comments.
- 100% Local and Private: Try all kinds of ideas. Don't worry, everything happens on your laptop with NO Internet connection.
- FREE: Since the AI model is running 100% on your machine, you can keep it running forever and experiment with different things.
- Hackable: Simply by changing the prompt (or tweaking the code), you can easily repurpose Mirror to do different things.
Watch the video of Mirror in action:
- When you launch the app, the browser will ask you for webcam permission.
- When you grant permission, the app starts streaming the webcam video to the AI (BakLLaVA, running on llama.cpp).
- The AI will analyze the image and stream the response, which the frontend prints in realtime.
When you launch the web UI, it will immediately start streaming responses from the AI based on the prompt: "Describe a person in the image".
You can edit this field to make Mirror stream responses to whatever prompt you want.
Some example prompts you can try:
- What is this object I am holding?
- What is the person doing?
- Describe some notable events in the image.
- How many people are in this picture?
- Let me know if you see anything weird.
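Under the hood, each prompt is paired with the current webcam frame and sent to the local llama.cpp server. As a rough sketch of what such a request can look like (the `image_data` field and `[img-1]` placeholder are assumptions based on llama.cpp's multimodal server API, not code from this repo):

```python
import base64

def build_payload(prompt: str, frame_jpeg: bytes) -> dict:
    """Pair a text prompt with one webcam frame for a llama.cpp
    /completion request. The image is sent base64-encoded; '[img-1]'
    in the prompt refers to the image with id 1 below."""
    return {
        "prompt": f"USER: [img-1] {prompt}\nASSISTANT:",
        "image_data": [
            {"id": 1, "data": base64.b64encode(frame_jpeg).decode("ascii")}
        ],
        "stream": True,    # stream tokens back as they are generated
        "n_predict": 128,  # cap the response length
    }

# Example (the bytes stand in for a real JPEG frame from the webcam)
payload = build_payload("How many people are in this picture?", b"\xff\xd8...")
```

Swapping the `prompt` string here is all the UI's editable field does, which is why repurposing Mirror is mostly a matter of wording.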
Try the 1 click install using Pinokio: https://pinokio.computer/item?uri=https://github.com/cocktailpeanut/mirror
Make sure to use the latest version of Pinokio (0.1.49 or above).
Mirror has a lot of moving parts, so if you don't use the 1 Click Installer, it may take a lot of work:
- Orchestrating multiple backends (the llama.cpp server and the Gradio web UI server)
- Installing prerequisites such as CMake, Visual Studio (on Windows), FFmpeg, etc.
If you want to install manually, go to the following section.
Note that everything mentioned in this section is essentially what the 1 Click Installer does automatically, and it works on Mac, Windows, and Linux. So if you get stuck trying to run Mirror manually, try the 1 Click Installer.
```shell
git clone https://github.com/cocktailpeanut/mirror
git clone https://github.com/ggerganov/llama.cpp
```
Download the following BakLLaVA model files into the `llama.cpp/models` folder:
- https://huggingface.co/mys/ggml_bakllava-1/resolve/main/ggml-model-q4_k.gguf
- https://huggingface.co/mys/ggml_bakllava-1/resolve/main/mmproj-model-f16.gguf
```shell
cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release
```
Create a venv and install the requirements:
```shell
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

(On Windows, activate the venv with `venv\Scripts\activate` instead.)
Install FFmpeg: https://ffmpeg.org/download.html
First, start the llama.cpp server.

On Windows:

```shell
cd llama.cpp\build\bin
Release\server.exe -m ..\..\models\ggml-model-q4_k.gguf --mmproj ..\..\models\mmproj-model-f16.gguf -ngl 1
```

On Mac and Linux:

```shell
cd llama.cpp/build/bin
./server -m ../../models/ggml-model-q4_k.gguf --mmproj ../../models/mmproj-model-f16.gguf -ngl 1
```
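Before launching the web UI, you can verify from Python that the server came up. A minimal stdlib-only sketch (the `/health` endpoint and the default port 8080 are assumptions about the llama.cpp server and may differ in your build):

```python
import urllib.request

# Assumed default host/port of the llama.cpp server
SERVER = "http://127.0.0.1:8080"

def health_request(base_url: str = SERVER) -> urllib.request.Request:
    """Build (but do not send) a GET request for the server's /health endpoint."""
    return urllib.request.Request(f"{base_url}/health", method="GET")

if __name__ == "__main__":
    # Only attempt the network call when run directly, with the server up
    try:
        with urllib.request.urlopen(health_request(), timeout=5) as resp:
            print(resp.read().decode())
    except OSError as exc:
        print(f"Server not reachable yet: {exc}")
```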
Then, in a separate terminal, activate the environment from the mirror folder:

```shell
source venv/bin/activate
```

Then run app.py:

```shell
python app.py
```
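app.py's job is essentially to forward webcam frames to the llama.cpp server and print the streamed tokens as they arrive. When streaming, the server emits server-sent-event lines that look like `data: {...JSON...}`; a stdlib-only sketch of pulling the text fragment out of one such line (the exact event format is an assumption based on llama.cpp's streaming responses, not code from this repo):

```python
import json
from typing import Optional

def token_from_sse_line(line: bytes) -> Optional[str]:
    """Extract the generated-text fragment from one server-sent-event
    line of a streaming /completion response, or None for other lines."""
    if not line.startswith(b"data: "):
        return None  # blank keep-alive lines, comments, etc.
    event = json.loads(line[len(b"data: "):])
    return event.get("content")

# Example: one line as the server might stream it
print(token_from_sse_line(b'data: {"content": "Hello", "stop": false}'))  # -> Hello
```

The frontend can append each fragment to the page as it arrives, which is what makes the response appear in real time.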
- The backend code was inspired by and adapted from Realtime Bakllava, which uses:
  - llama.cpp for the LLM server.
  - BakLLaVA for the multimodal AI model.
- The web UI was built with Gradio.