Located Voice CMS

Table of Contents

About Located Voice CMS

  • Located Voice CMS was developed as a Google Summer of Code 2023 project with the Liquid Galaxy Project. Details can be viewed here.
  • Located Voice CMS is an Android app that lets you connect to a Liquid Galaxy system and send POIs and Tours to it. The app also lets you create, update, and delete POIs, Tours, and Categories, and provides various tools to control your Liquid Galaxy system. The latest features are:
  1. Artificial Intelligence Voice: Listen to pre-recorded AI-generated voices describing the POIs stored on the cloud.
  2. Artificial Intelligence Context: Connect to the AI server and generate your own real-time audio via Suno's Bark Generative Audio AI Model. To do this, just run the following commands on your AI server:
    1. Pull the docker image:

       docker pull vedantkingh/bark2
    2. To run on CPU:

       docker run vedantkingh/bark2

      To run on GPU:

       docker run --gpus all vedantkingh/bark2 

      If Docker cannot access the GPU, install the NVIDIA Container Toolkit using this installation guide.

  3. Category Sounds: Feel the ambience of the POIs around you with immersive sounds attached to your categories. You can also add, edit, and delete the category sounds.
  4. Context Sensing: With the Nearby Places feature, you can generate a list of nearby places to visit around a given POI. Best experienced when coupled with Artificial Intelligence Context.

App Screenshots

Screenshot 1 Screenshot 2 Screenshot 3 Screenshot 4 Screenshot 5 Screenshot 5

Running the APK

Prerequisites

  • An Android device, preferably a 10-inch Android tablet.

Steps:

  • Download the APK file from this repository or the Google Play Store.
  • To connect to the Liquid Galaxy and the AI Server, tap the menu icon and go to Administration Tools > Settings. Then fill in the details of your Liquid Galaxy rig (LG Server IP, LG Server ID, LG Server Password, and the number of machines) and of the AI Server (AI Server IP and the port on which the Docker container is running).
  • Now explore the application: send a wide variety of KML data to the LG and listen to immersive audio from our cloud as well as from your local AI Server.
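
Before entering the AI Server IP and port in Settings, it can help to confirm that the port is reachable from another machine on the same network. The following is a minimal sketch of such a check, not part of the app. The function name `is_port_open` is hypothetical, and the host and port you pass depend on how your own server is set up.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration: it only checks that something
    is listening on the port, not that it is the Located Voice CMS API.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("192.168.0.10", 5000)` would return `True` only if a service accepts connections at that address; both values here are placeholders, not defaults documented by this project.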

AI Server Guide

  • The AI server is used for real-time AI audio generation. It runs the Bark AI model, created by Suno AI, on your AI server.
  • Running your own AI server for Located Voice CMS is fairly simple: run the API in a Docker container using the Docker image. To do this, run the following commands on your AI server:
    1. Pull the docker image:

      docker pull vedantkingh/bark2
    2. To run on CPU:

      docker run vedantkingh/bark2

      To run on GPU:

      docker run --gpus all vedantkingh/bark2 

      If Docker cannot access the GPU, install the NVIDIA Container Toolkit using this installation guide.

  • If you want to run the API without Docker, or dig deeper into the API that runs the model, the source is available at this repository. To set it up:
    1. Clone the repository:
      git clone https://github.com/vedantkingh/bark.git
      cd bark
    2. Install dependencies:
      sudo apt-get update && sudo apt-get upgrade -y
      sudo apt-get install -y python3-dev python3-pip build-essential sox libsox-fmt-mp3
      sudo apt-get install -y nvidia-cudnn
      pip install --no-cache-dir -r requirements.txt
    3. Start the Flask API on the server:
      python app.py
    4. Send a POST request to the /synthesize endpoint with the desired text as a JSON payload. For example:
      POST /synthesize
      Content-Type: application/json
      
      {
          "text": "Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe."
      }
    This will generate the voice audio corresponding to the provided text.
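
The request above can be sketched in Python using only the standard library. This is an illustrative client, not part of the repository: the function names `build_synthesize_request` and `synthesize` are hypothetical, the host and port must match your own server, and the format of the response body is not documented here, so the sketch returns the raw bytes unchanged.

```python
import json
import urllib.request


def build_synthesize_request(host: str, port: int, text: str) -> urllib.request.Request:
    """Build a POST request for the /synthesize endpoint.

    host and port are assumptions about your own deployment.
    """
    url = f"http://{host}:{port}/synthesize"
    # The endpoint expects a JSON object with a single "text" field.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(host: str, port: int, text: str, timeout: float = 300.0) -> bytes:
    """Send the text to the server and return the raw response body.

    The response format is an unknown here, so no decoding is attempted.
    The long default timeout is a guess: audio generation can be slow on CPU.
    """
    request = build_synthesize_request(host, port, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `synthesize("192.168.0.10", 5000, "Hello from Located Voice CMS")` would send the request to a server at that placeholder address; what to do with the returned bytes depends on what the API actually sends back.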

Previous Versions

An Android app that lets the user connect to one Liquid Galaxy system and send POIs and Tours. The app also lets the user create, update, and delete POIs, Tours, and Categories.

This GSoC 2023 project continues a previous project. It contains the following subprojects, and a commits link is provided for each of them.

Contributing

File issues, bugs, or feature requests in our issue tracker. Please be descriptive and clear so that it is easier to help you. If you want to contribute to this project, you can open a pull request at any time.

License

This project is licensed under the MIT license.
Copyright © 2023 Vedant Singh