OpenAI voices. Voice as default command.
Add support for AI-generated voices from the OpenAI API, which are very
human-like. Set them as the default. Change the default command to `voice`.

Migrate to v1.2.4 of openai package.
paulovcmedeiros committed Nov 16, 2023
2 parents 293b874 + c291b8e commit 34d3c5d
Showing 18 changed files with 510 additions and 340 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/tests.yaml
@@ -41,7 +41,7 @@ jobs:
 - name: Install PortAudio and PulseAudio
   run: |
     apt-get update
-    apt-get --assume-yes install portaudio19-dev python-all-dev pulseaudio
+    apt-get --assume-yes install portaudio19-dev python-all-dev pulseaudio ffmpeg
 #----------------------------------------------
 # --- configure poetry & install project ----
46 changes: 28 additions & 18 deletions README.md
@@ -9,21 +9,25 @@

# pyRobBot: Talk and Chat with GPT LLMs

An interface to OpenAI's [GPT large language models (LLMs)](https://platform.openai.com/docs/models) that implements:
* A conventional chatbot that can be used either via web UI or terminal
* A personal assistant that can actually interact with you by voice
A python package that uses OpenAI's [GPT large language models (LLMs)](https://platform.openai.com/docs/models) to implement:
* A fully configurable personal assistant that can speak and listen to you
* An equally fully configurable text-based chatbot that can be used either via web UI or terminal

The package is written in Python. The web chatbot UI is made with [Streamlit](https://streamlit.io).

**See and try the [demo web app on Streamlit](https://pyrobbot.streamlit.app)!**

## Features
- [x] Text to speech and speech to text (`rob voice`)
- Talk to the GPT assistant!
- You can choose your preferred language (e.g., `rob voice --lang pt-br`)
- [x] Web UI
- [x] Text to speech and speech to text
- Talk to the GPT assistant and the assistant will talk back to you!
- Choose your preferred language (e.g., `rob --lang pt-br`)
- Choose your preferred Text-to-Speech (TTS) engine
- [OpenAI Text-to-Speech](https://platform.openai.com/docs/guides/text-to-speech) (default): AI-generated *human-like* voice
- [Google TTS](https://cloud.google.com/text-to-speech) (`rob --tts google`): free at the time being, with decent quality


- [x] Browser UI (made with [Streamlit](https://pyrobbot.streamlit.app))
- Add/remove conversations dynamically
- Automatic/editable conversation summary title
- [x] Terminal UI
- For a more "Wake up, Neo" experience
- [x] Fully configurable
- Support for multiple GPT LLMs
- Control over the parameters passed to the OpenAI API, with (hopefully) sensible defaults
@@ -32,6 +36,7 @@ The package is written in Python. The web chatbot UI is made with [Streamlit](ht
   - Dynamically modifiable AI parameters in each chat separately
     - No need to restart the chat
 - [x] Autosave & retrieve chat history
+  - In the browser UI, you can even read the transcripts of your voice conversations with the AI
 - [x] Chat context handling using [embeddings](https://platform.openai.com/docs/guides/embeddings)
 - [x] Estimated API token usage and associated costs
 - [x] OpenAI API key is **never** stored on disk
@@ -42,13 +47,16 @@ The package is written in Python. The web chatbot UI is made with [Streamlit](ht
 - Python >= 3.9
 - A valid [OpenAI API key](https://platform.openai.com/account/api-keys)
   - Set in the Web UI or through the environment variable `OPENAI_API_KEY`
-- Optionally, to enable voice chat, you also need:
+- To enable voice chat, you also need:
   - [PortAudio](https://www.portaudio.com/docs/v19-doxydocs/index.html)
     - Install on Ubuntu with `sudo apt-get --assume-yes install portaudio19-dev python-all-dev`
     - Install on CentOS/RHEL with `sudo yum install portaudio portaudio-devel`
+  - [ffmpeg](https://ffmpeg.org/download.html)
+    - Install on Ubuntu with `sudo apt-get --assume-yes install ffmpeg`
+    - Install on CentOS/RHEL with `sudo yum install ffmpeg`

 ## Installation
-This, naturally, assumes your systems fulfills all [requirements](#system-requirements).
+This, naturally, assumes your system fulfills all [requirements](#system-requirements).
 ### Using pip
 ```shell
 pip install pyrobbot
@@ -73,25 +81,27 @@ and general `rob` options. For info about specific subcommands and the
 options that apply to them only, **please run `rob SUBCOMMAND -h`** (note
 that the `-h` goes after the subcommand in this case).


-### Using the Web UI
+### Chatting by Voice (default)
 ```shell
 rob
 ```

-### Chatting by Voice
+### Using the Web UI
 ```shell
-rob voice
+rob ui
 ```


 ### Running on the Terminal
 ```shell
 rob .
 ```

 ## Disclaimers
-This project's main purpose is to serve as a learning exercise for me, as well as tool for experimenting with OpenAI API, GPT LLMs and text-to-voice/voice-to-text. It does not claim to be the best or more robust OpenAI-powered chatbot out there.
+This project's main purpose has been to serve as a learning exercise for me, as well as tool for experimenting with OpenAI API, GPT LLMs and text-to-speech/speech-to-text.

+While it does not claim to be the best or more robust OpenAI-powered chatbot out there, it *does* aim to provide a friendly user interface that is easy to install, use and configure.

-Having said this, this project *does* aim to provide a friendly user interface that is easy to use and configure. Feel free to open an issue or submit a pull request if you find a bug or have a suggestion.
+Feel free to open an [issue](https://github.com/paulovcmedeiros/pyRobBot/issues) or, even better, [submit a pull request](https://github.com/paulovcmedeiros/pyRobBot/pulls) if you find a bug or have a suggestion.

 Last but not least: this project is **not** affiliated with OpenAI in any way.
8 changes: 5 additions & 3 deletions pyproject.toml
@@ -4,7 +4,7 @@
 license = "MIT"
 name = "pyrobbot"
 readme = "README.md"
-version = "0.2.4"
+version = "0.3.0"

 [build-system]
 build-backend = "poetry.core.masonry.api"
@@ -24,15 +24,17 @@
 # Other dependencies
 loguru = "^0.7.2"
 numpy = "^1.26.1"
-openai = "^0.28.1"
+openai = "^1.2.4"
 pandas = "^2.1.2"
 pillow = "^10.1.0"
 pydantic = "^2.4.2"
 streamlit = "^1.28.0"
 tiktoken = "^0.5.1"
 # Text to speech
 gtts = "^2.4.0"
+pydub = "^0.25.1"
 pygame = "^2.5.2"
+setuptools = "^68.2.2" # Needed by webrtcvad-wheels
 sounddevice = "^0.4.6"
 soundfile = "^0.12.1"
 speechrecognition = "^3.10.0"
@@ -128,7 +130,7 @@
 ##################

 [tool.pytest.ini_options]
-addopts = "-v --failed-first --cov-report=term-missing --cov-report=term:skip-covered --cov-report=xml:.coverage.xml --cov=./"
+addopts = "-v --cache-clear --failed-first --cov-report=term-missing --cov-report=term:skip-covered --cov-report=xml:.coverage.xml --cov=./"
 log_cli_level = "INFO"
 testpaths = ["tests/smoke", "tests/unit"]

13 changes: 9 additions & 4 deletions pyrobbot/__init__.py
@@ -9,8 +9,8 @@
 from importlib.metadata import metadata, version
 from pathlib import Path

-import openai
 from loguru import logger
+from openai import OpenAI, OpenAIError

 logger.remove()
 logger.add(
@@ -46,9 +46,15 @@ class GeneralDefinitions:
     @staticmethod
     def openai_key_hash():
         """Return a hash of the OpenAI API key."""
-        if openai.api_key is None:
+        try:
+            client = OpenAI()
+        except OpenAIError:
+            api_key = None
+        else:
+            api_key = client.api_key
+        if api_key is None:
             return "demo"
-        return hashlib.sha256(openai.api_key.encode("utf-8")).hexdigest()
+        return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

     @property
     def package_cache_directory(self):
@@ -71,4 +77,3 @@ def chats_storage_dir(self):
         )

 # Initialize the OpenAI API client
-openai.api_key = GeneralConstants.SYSTEM_ENV_OPENAI_API_KEY
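
The key-hash logic in the hunk above can be tried without the `openai` package. The following is a minimal sketch under stated assumptions, not the project's code: `key_hash` is a hypothetical helper that receives the key as an argument, whereas the commit reads it from `OpenAI().api_key` and treats an `OpenAIError` as "no key".

```python
import hashlib
from typing import Optional


def key_hash(api_key: Optional[str]) -> str:
    """Return the SHA-256 hex digest of the key, or "demo" when there is no key."""
    # Mirrors the `if api_key is None: return "demo"` branch of the diff.
    if api_key is None:
        return "demo"
    # Only the digest is returned, so the raw key never needs to be stored.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


# "sk-example" is a made-up placeholder, not a real key.
digest = key_hash("sk-example")
fallback = key_hash(None)
```

Identifying a user by the digest rather than by the key itself fits the README's statement that the API key is never stored on disk.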
11 changes: 4 additions & 7 deletions pyrobbot/app/multipage.py
@@ -3,8 +3,8 @@
 import datetime
 from abc import ABC, abstractmethod, abstractproperty

-import openai
 import streamlit as st
+from openai import OpenAI
 from pydantic import ValidationError

 from pyrobbot import GeneralConstants
@@ -137,12 +137,9 @@ def init_chat_credentials(self):
             help="[OpenAI API auth key](https://platform.openai.com/account/api-keys). "
             + "Chats created with this key won't be visible to people using other keys.",
         )
-        openai.api_key = (
-            self.openai_api_key
-            if self.openai_api_key
-            else GeneralConstants.SYSTEM_ENV_OPENAI_API_KEY
-        )
-        if not openai.api_key:

+        client = OpenAI()
+        if not client.api_key:
             st.write(":red[You need to provide a key to use the chat]")

     def add_page(
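
The lines removed in this hunk implemented a precedence rule: use the key typed into the UI if there is one, otherwise fall back to the key taken from the environment. A stdlib-only sketch of that rule is shown below. `resolve_api_key` and its `env` parameter are hypothetical names, and the sketch assumes the fallback corresponds to the `OPENAI_API_KEY` environment variable mentioned in the README.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    ui_key: Optional[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Prefer a key entered in the UI; otherwise fall back to the environment."""
    # An empty string from a text input counts as "not provided".
    if ui_key:
        return ui_key
    # Assumption: the environment variable is named OPENAI_API_KEY.
    return env.get("OPENAI_API_KEY")


# Placeholder values for illustration only.
chosen = resolve_api_key("sk-from-ui", {"OPENAI_API_KEY": "sk-from-env"})
```

With the v1 `openai` package, a key resolved this way can be passed explicitly as `OpenAI(api_key=...)`; whether this commit does so elsewhere is not visible in the hunk.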
125 changes: 72 additions & 53 deletions pyrobbot/argparse_wrapper.py
@@ -4,11 +4,41 @@
 import sys

 from . import GeneralConstants
-from .chat_configs import ChatOptions
-from .command_definitions import accounting, run_on_terminal, run_on_ui, run_over_voice
+from .chat_configs import ChatOptions, VoiceChatConfigs
+from .command_definitions import (
+    accounting_report,
+    browser_chat,
+    terminal_chat,
+    voice_chat,
+)


+def _populate_parser_from_pydantic_model(parser, model):
+    _argarse2pydantic = {
+        "type": model.get_type,
+        "default": model.get_default,
+        "choices": model.get_allowed_values,
+        "help": model.get_description,
+    }
+    for field_name, field in model.model_fields.items():
+        args_opts = {
+            key: _argarse2pydantic[key](field_name)
+            for key in _argarse2pydantic
+            if _argarse2pydantic[key](field_name) is not None
+        }
+        args_opts["required"] = field.is_required()
+        if "help" in args_opts:
+            args_opts["help"] = f"{args_opts['help']} (default: %(default)s)"
+        if "default" in args_opts and isinstance(args_opts["default"], (list, tuple)):
+            args_opts.pop("type", None)
+            args_opts["nargs"] = "*"

+        parser.add_argument(f"--{field_name.replace('_', '-')}", **args_opts)

+    return parser
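
The idea behind `_populate_parser_from_pydantic_model` (one `--option` per model field, with type and default taken from the model) can be illustrated with the standard library alone. The sketch below swaps the pydantic model for a dataclass, so it is an analogy rather than the project's implementation. `DemoOptions`, its fields and defaults, and `populate_parser_from_dataclass` are all hypothetical.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class DemoOptions:
    """Hypothetical options model standing in for ChatOptions."""

    lang: str = "en"
    max_tokens: int = 100
    temperature: float = 1.0


def populate_parser_from_dataclass(parser, model):
    """Add one --option per dataclass field, using its type and default."""
    for field in fields(model):
        kwargs = {
            "type": field.type,  # the annotation (str, int, float) is the converter
            "required": field.default is MISSING,
            "help": f"{field.name} (default: %(default)s)",
        }
        if field.default is not MISSING:
            kwargs["default"] = field.default
        # Same naming rule as in the diff: underscores become dashes.
        parser.add_argument(f"--{field.name.replace('_', '-')}", **kwargs)
    return parser


demo_parser = populate_parser_from_dataclass(argparse.ArgumentParser(), DemoOptions)
demo_args = demo_parser.parse_args(["--max-tokens", "50"])
```

Deriving the options from the model keeps the CLI and the configuration class in sync, which is presumably why the commit factors the loop out so that it can be reused for both `ChatOptions` and `VoiceChatConfigs`.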


-def get_parsed_args(argv=None, default_command="ui"):
+def get_parsed_args(argv=None, default_command="voice"):
     """Get parsed command line arguments.
     Args:
@@ -21,45 +51,21 @@ def get_parsed_args(argv=None, default_command="ui"):
     """
     if argv is None:
         argv = sys.argv[1:]
-    if not argv:
-        argv = [default_command]

-    chat_options_parser = argparse.ArgumentParser(
-        formatter_class=argparse.ArgumentDefaultsHelpFormatter, add_help=False
-    )
-    argarse2pydantic = {
-        "type": ChatOptions.get_type,
-        "default": ChatOptions.get_default,
-        "choices": ChatOptions.get_allowed_values,
-        "help": ChatOptions.get_description,
-    }
-    for field_name, field in ChatOptions.model_fields.items():
-        args_opts = {
-            key: argarse2pydantic[key](field_name)
-            for key in argarse2pydantic
-            if argarse2pydantic[key](field_name) is not None
-        }
-        args_opts["required"] = field.is_required()
-        if "help" in args_opts:
-            args_opts["help"] = f"{args_opts['help']} (default: %(default)s)"
-        if "default" in args_opts and isinstance(args_opts["default"], (list, tuple)):
-            args_opts.pop("type", None)
-            args_opts["nargs"] = "*"

-        chat_options_parser.add_argument(f"--{field_name.replace('_', '-')}", **args_opts)
+    first_argv = next(iter(argv), "'")
+    info_flags = ["--version", "-v", "-h", "--help"]
+    if not argv or (first_argv.startswith("-") and first_argv not in info_flags):
+        argv = [default_command, *argv]

+    # Main parser that will handle the script's commands
     main_parser = argparse.ArgumentParser(
         formatter_class=argparse.ArgumentDefaultsHelpFormatter
     )

     main_parser.add_argument(
         "--version",
         "-v",
         action="version",
         version=f"{GeneralConstants.PACKAGE_NAME} v" + GeneralConstants.VERSION,
     )

+    # Configure the main parser to handle the commands
     subparsers = main_parser.add_subparsers(
         title="commands",
         dest="command",
@@ -71,45 +77,58 @@ def get_parsed_args(argv=None, default_command="ui"):
         help="command description",
     )

+    # Common options to most commands
+    chat_options_parser = _populate_parser_from_pydantic_model(
+        parser=argparse.ArgumentParser(
+            formatter_class=argparse.ArgumentDefaultsHelpFormatter, add_help=False
+        ),
+        model=ChatOptions,
+    )
+    chat_options_parser.add_argument(
+        "--report-accounting-when-done",
+        action="store_true",
+        help="Report estimated costs when done with the chat.",
+    )

+    # Voice chat
+    voice_options_parser = _populate_parser_from_pydantic_model(
+        parser=argparse.ArgumentParser(
+            formatter_class=argparse.ArgumentDefaultsHelpFormatter, add_help=False
+        ),
+        model=VoiceChatConfigs,
+    )
+    parser_voice_chat = subparsers.add_parser(
+        "voice",
+        aliases=["v"],
+        parents=[voice_options_parser],
+        help="Run the chat over voice.",
+    )
+    parser_voice_chat.set_defaults(run_command=voice_chat)

+    # Web app chat
     parser_ui = subparsers.add_parser(
         "ui",
         aliases=["app"],
         parents=[chat_options_parser],
         help="Run the chat UI on the browser.",
     )
-    parser_ui.set_defaults(run_command=run_on_ui)
+    parser_ui.set_defaults(run_command=browser_chat)

+    # Terminal chat
     parser_terminal = subparsers.add_parser(
         "terminal",
         aliases=["."],
         parents=[chat_options_parser],
         help="Run the chat on the terminal.",
     )
-    parser_terminal.add_argument(
-        "--report-accounting-when-done",
-        action="store_true",
-        help="Report estimated costs when done with the chat.",
-    )
-    parser_terminal.set_defaults(run_command=run_on_terminal)

-    parser_over_voice = subparsers.add_parser(
-        "voice",
-        aliases=["v"],
-        parents=[chat_options_parser],
-        help="Run the chat over voice.",
-    )
-    parser_over_voice.add_argument(
-        "--report-accounting-when-done",
-        action="store_true",
-        help="Report estimated costs when done with the chat.",
-    )
-    parser_over_voice.set_defaults(run_command=run_over_voice)
+    parser_terminal.set_defaults(run_command=terminal_chat)

+    # Accounting report
     parser_accounting = subparsers.add_parser(
         "accounting",
         aliases=["acc"],
         help="Show the estimated number of used tokens and associated costs, and exit.",
     )
-    parser_accounting.set_defaults(run_command=accounting)
+    parser_accounting.set_defaults(run_command=accounting_report)

     return main_parser.parse_args(argv)
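
The new default-command handling in `get_parsed_args` is what lets `rob --lang pt-br` behave like `rob voice --lang pt-br`. The rule is: prepend the default command when there are no arguments, or when the first argument is an option other than an info flag. Below is a standalone sketch of that rule. `with_default_command` is a hypothetical name, and the sketch uses an empty-string fallback for the first argument instead of the placeholder used in the diff.

```python
from typing import List, Sequence

# Flags that the main parser must see itself, copied from the diff.
INFO_FLAGS = ("--version", "-v", "-h", "--help")


def with_default_command(
    argv: Sequence[str], default_command: str = "voice"
) -> List[str]:
    """Prepend the default command unless a command or info flag comes first."""
    argv = list(argv)
    first = argv[0] if argv else ""
    if not argv or (first.startswith("-") and first not in INFO_FLAGS):
        return [default_command, *argv]
    return argv


no_args = with_default_command([])
only_options = with_default_command(["--lang", "pt-br"])
explicit_command = with_default_command(["ui"])
help_flag = with_default_command(["-h"])
```

Leaving `-h` and `--version` untouched means the top-level help and version output still come from the main parser rather than from the `voice` subcommand.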