From 6f726b94fa46fe46347f9e2227d0ca59315b34b4 Mon Sep 17 00:00:00 2001
From: Xinwei Xiong <3293172751NSS@gmail.com>
Date: Tue, 22 Oct 2024 11:26:58 +0800
Subject: [PATCH] feat: optimize project readme

---
 README.md | 402 ++++++++++++++++++++++++------------------------------
 1 file changed, 181 insertions(+), 221 deletions(-)

diff --git a/README.md b/README.md
index 9fcd886..5003728 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@

- voiceflow
+ voiceflow

- ⭐️ Template for a typical module written on Go ⭐️
+ ⭐️ A real-time voice interaction framework written in Go ⭐️

@@ -17,232 +17,192 @@

-

English中文

-

-----
-
-## 🧩 Awesome features
-
-At Github, we want to start new projects faster using best practices with a predefined structure and focusing on core ideas implementation rather than wasting time on environment configuration and copying boilerplate code.
-
-I defined a spec template that I could use to quickly start building a full-fledged project.
-
-In each directory, there is a README.md and an OWNERS, which explains what the directory does and who owns it.
-
-**Labels denger:**
-Read about the [voiceflow](https://github.com/telepace/voiceflow/labels) tag design. We have provided in the [github-label-syncer](https://github.com/telepace/github-label-syncer) warehouse label synchronizer.
-
-
-## 🛫 Quick start
-
-> **Note**: You can get started quickly with voiceflow.
-
-1. Generate a [new repository](https://github.com/telepace/voiceflow/generate) from the template.
-2. Clone the repository locally.
-3. Update files, read the README files in each directory.
-4. Write your code and tests.
-
-
- Work with Makefile
-
-```bash
-❯ make help  # show help
-❯ make build # build binary
-```
-
-
-
- Git hook(push & commit)
-
-```bash
-❯ make # To trigger the hook script, move to.git
-❯ tree .git/hooks
-.git/hooks
-├── applypatch-msg.sample
-├── commit-msg
-├── commit-msg.sample
-├── fsmonitor-watchman.sample
-├── post-update.sample
-├── pre-applypatch.sample
-├── pre-commit
-├── pre-commit.sample
-├── pre-merge-commit.sample
-├── pre-push
-├── pre-push.sample
-├── pre-rebase.sample
-├── pre-receive.sample
-├── prepare-commit-msg.sample
-├── push-to-checkout.sample
-└── update.sample
-❯ cp ../sealer/_output/bin/sealer/linux_amd64/sealer ./ # add big binary file
-❯ git add .
-❯ git commit -a -s -m "nono" # Excess commit blocking
-telepace : Running local telepace pre-commit hook.
-telepace : File sealer is 71 MB, which is larger than our configured limit of 2 MB
-telepace : If you really need to commit this file, you can override the size limit by setting the GIT_FILE_SIZE_LIMIT environment variable, e.g. GIT_FILE_SIZE_LIMIT=42000000 for 42MB. Or, commit with the --no-verify switch to skip the check entirely.
-telepace : Commit aborted
-❯ rm -rf .git # remote big binary
-❯ git commit -a -s -m "nono" # Bad commit blocking
-telepace : Running local telepace pre-commit hook.
-telepace : Running the telepace commit-msg hook.
-fakehsh: subject does not match regex [^(build|chore|ci|docs|feat|feature|fix|perf|refactor|revert|style|test)(.*)?:\s?.*]
-fakehsh: subject length less than min [10]
-telepace : Please fix your commit message to match telepace coding standards
-telepace : https://gist.github.com/cubxxw/126b72104ac0b0ca484c9db09c3e5694#file-githook-md
-❯ git commit -a -s -m "docs(main): README-en Chinese documentation"
-telepace : Running local telepace pre-commit hook.
-telepace : Running the telepace commit-msg hook.
-[main b3b339f] docs(main): README-en Chinese documentation
- 1 file changed, 29 insertions(+)
-❯ git push origin main
-```
-
-
-
- Work with actions
-
-Actions provide handling of PR and issue.
-We used the bot [🚀@kubbot](https://github.com/kubbot), It can detect issues in Chinese and translate them to English, and you can interact with it using the command `/comment`.
-
-Comment in an issue:
-
-```bash
-❯ /intive
-```
-
-
-
- Work with Tools
-
-```bash
-❯ make tools
-```
-
-
-
- Work with Docker
-
-```bash
-❯ make deploy
-```
-
-
-
-
-## 🕋 architecture diagram
-```go
-// architecture diagram
-```
-
-**MVC Architecture Design:**
-```go
-// MVC Architecture Design
-```
-
-## 🤖 File Directory Description
-
-Catalog standardization design structure:
-
-```bash
-.voiceflow
-├── CONTRIBUTING.md         # Contribution guidelines
-├── LICENSE                 # License information
-├── Makefile                # Makefile for building and running the project
-├── README.md               # Project overview in English
-├── README_zh-CN.md         # Project overview in Chinese
-├── api                     # API-related files
-│   ├── OWNERS              # API owners
-│   └── README.md           # API documentation
-├── assets                  # Static assets, such as images and stylesheets
-│   └── README.md           # Assets documentation
-├── build                   # Build-related files
-│   ├── OWNERS              # Build owners
-│   └── README.md           # Build documentation
-├── cmd                     # Command-line tools and entry points
-│   ├── OWNERS              # Command owners
-│   └── README.md           # Command documentation
-├── configs                 # Configuration files
-│   ├── OWNERS              # Configuration owners
-│   ├── README.md           # Configuration documentation
-│   └── config.yaml         # Main configuration file
-├── deploy                  # Deployment-related files
-│   ├── OWNERS              # Deployment owners
-│   └── README.md           # Deployment documentation
-├── docs                    # Project documentation
-│   ├── OWNERS              # Documentation owners
-│   └── README.md           # Documentation index
-├── examples                # Example code and usage
-│   ├── OWNERS              # Example owners
-│   └── README.md           # Example documentation
-├── init                    # Initialization files
-│   ├── OWNERS              # Initialization owners
-│   └── README.md           # Initialization documentation
-├── internal                # Internal application code
-│   ├── OWNERS              # Internal code owners
-│   ├── README.md           # Internal code documentation
-│   ├── app                 # Application logic
-│   ├── pkg                 # Internal packages
-│   └── utils               # Utility functions and helpers
-├── pkg                     # Public packages and libraries
-│   ├── OWNERS              # Package owners
-│   ├── README.md           # Package documentation
-│   ├── common              # Common utilities and helpers
-│   ├── log                 # Log utilities
-│   ├── tools               # Tooling and scripts
-│   ├── utils               # General utility functions
-│   └── version             # Version information
-├── scripts                 # Scripts for development and automation
-│   ├── LICENSE_TEMPLATES   # License templates
-│   ├── OWNERS              # Script owners
-│   ├── README.md           # Script documentation
-│   ├── githooks            # Git hooks for development
-│   └── make-rules          # Makefile rules and scripts
-├── test                    # Test files and test-related utilities
-│   ├── OWNERS              # Test owners
-│   └── README.md           # Test documentation
-├── third_party             # Third-party dependencies and libraries
-│   └── README.md           # Third-party documentation
-├── tools                   # Tooling and utilities for development
-│   └── README.md           # Tool documentation
-└── web                     # Web-related files, such as HTML and CSS
-    ├── OWNERS              # Web owners
-    └── README.md           # Web documentation
-```
-
-## 🗓️ community meeting
-
-We welcome everyone to join us and contribute to voiceflow, whether you are new to open source or professional. We are committed to promoting an open source culture, so we offer community members neighborhood prizes and reward money in recognition of their contributions. We believe that by working together, we can build a strong community and make valuable open source tools and resources available to more people. So if you are interested in voiceflow, please join our community and start contributing your ideas and skills!
-
-We take notes of each [biweekly meeting](https://github.com/telepace/voiceflow/issues/2) in [GitHub discussions](https://github.com/orgs/telepace/discussions), and our minutes are written in [Google Docs](https://docs.google.com/document/d/1nx8MDpuG74NASx081JcCpxPgDITNTpIIos0DS6Vr9GU/edit?usp=sharing).
-
-
-
-voiceflow maintains a [public roadmap](https://github.com/telepace/community/tree/main/roadmaps). It gives a a high-level view of the main priorities for the project, the maturity of different features and projects, and how to influence the project direction.
-
-## 🤼‍ Contributing & Development
-
-telepace Our goal is to build a top-level open source community. We have a set of standards, in the [Community repository](https://github.com/telepace/community).
-
-If you'd like to contribute to this voiceflow repository, please read our [contributor documentation](https://github.com/telepace/voiceflow/blob/main/CONTRIBUTING.md).
-
-Before you start, please make sure your changes are in demand. The best for that is to create a [new discussion](https://github.com/telepace/voiceflow/discussions/new/choose) OR [Slack Communication](https://join.slack.com/t/telepace/shared_invite/zt-1se0k2bae-lkYzz0_T~BYh3rjkvlcUqQ), or if you find an issue, [report it](https://github.com/telepace/voiceflow/issues/new/choose) first.
-
-
-## 🚨 License
-
-telepace is licensed under the MIT License. See [LICENSE](https://github.com/telepace/voiceflow/tree/main/LICENSE) for the full license text.
-
-[![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Ftelepace%2Ftelepace.svg?type=large)](https://app.fossa.com/projects/git%2Bgithub.com%2Ftelepace%2Fvoiceflow?ref=badge_large)
-
-
-## 🔮 Thanks to our contributors!
+🧩 Project Overview
+
+voiceflow is an open-source project written in Go that provides real-time voice interaction with large language models (LLMs). By integrating multiple third-party speech platforms and local models, voiceflow supports real-time speech-to-text (STT), text-to-speech (TTS), and intelligent interaction with LLMs.
+
+Core features:
+
+ • Real-time speech-to-text (STT): integrates STT services from multiple cloud providers as well as local models, converting user speech to text in real time.
+ • LLM interaction: sends the recognized text directly to an audio-capable LLM and receives an intelligent reply.
+ • Text-to-speech (TTS): converts the LLM's reply text to speech, with support for multiple TTS services and local models.
+ • Audio storage and access: stores generated audio files in a storage service such as MinIO and provides an access path so the frontend can play them in real time.
+ • Pluggable service integration: a modular design lets each STT, TTS, and LLM service be plugged in, which makes the system easy to extend and customize.
+
+🛫 Quick Start
+
+ Note: the following guide will help you get voiceflow up and running quickly.
+
+1. Clone the repository
+
+git clone https://github.com/telepace/voiceflow.git
+cd voiceflow
+
+2. Configure the environment
+
+ • Copy and edit the .env file, filling in your third-party service API keys and other sensitive information.
+
+cp configs/.env.example configs/.env
+
+ • Edit configs/config.yaml to configure the service providers and related parameters to suit your needs.
+
+3. Install dependencies
+
+Make sure Go 1.16 or later is installed.
+
+go mod tidy
+
+4. Run the application
+
+go run cmd/main.go
+
+5. Connect the frontend
+
+The frontend can connect over WebSocket to ws://localhost:8080/ws to begin real-time voice interaction.
+
+🕸️ System Architecture
+
+graph TD
+    A[Frontend browser] -- audio data --> B[WebSocket server]
+    B -- invokes --> C["Speech-to-text (STT)"]
+    C -- text --> D["Large language model (LLM)"]
+    D -- reply text --> E["Text-to-speech (TTS)"]
+    E -- audio data --> F["Storage service (MinIO)"]
+    F -- audio URL --> B
+    B -- audio URL --> A
+
+ • Frontend browser: the user records speech in the browser and sends it to the server over WebSocket.
+ • WebSocket server: receives audio data from the frontend and coordinates calls to the service modules.
+ • Speech-to-text (STT): converts audio data to text.
+ • Large language model (LLM): generates an intelligent reply from the text.
+ • Text-to-speech (TTS): converts the reply text to audio data.
+ • Storage service (MinIO): stores the generated audio files and provides access URLs.
+
+🤖 Directory Structure
+
+voiceflow/
+├── cmd/
+│   └── main.go           # Application entry point
+├── configs/
+│   ├── config.yaml       # Business configuration file
+│   └── .env              # Environment variable file
+├── internal/
+│   ├── config/           # Configuration loading module
+│   ├── server/           # WebSocket server
+│   ├── stt/              # Speech-to-text module
+│   ├── tts/              # Text-to-speech module
+│   ├── llm/              # LLM interaction module
+│   ├── storage/          # Storage module
+│   ├── models/           # Data models
+│   └── utils/            # Utility functions
+├── pkg/
+│   └── logger/           # Logging module
+├── scripts/              # Build and deployment scripts
+├── go.mod                # Go module file
+└── README.md             # Project documentation
+
+🔧 Configuration
+
+.env file
+
+Stores sensitive information such as API keys.
+
+# .env example
+MINIO_ENDPOINT=play.min.io
+MINIO_ACCESS_KEY=youraccesskey
+MINIO_SECRET_KEY=yoursecretkey
+AZURE_STT_KEY=yourazuresttkey
+AZURE_TTS_KEY=yourazurettskey
+
+config.yaml file
+
+Holds business configuration such as the server port and which feature modules are enabled.
+
+# config.yaml example
+server:
+  port: 8080
+  enable_tls: false
+
+minio:
+  enabled: true
+  bucket_name: voiceflow-audio
+
+stt:
+  provider: azure  # Options: azure, google, local
+
+tts:
+  provider: google  # Options: azure, google, local
+
+llm:
+  provider: openai  # Options: openai, local
+
+logging:
+  level: info
+
+🛠️ Core Modules
+
+1. WebSocket server
+
+Built on gorilla/websocket. It handles real-time communication with the frontend, receiving audio data and returning processing results.
+
+2. Speech-to-text (STT)
+
+ • Interface definition: internal/stt/stt.go defines the STT service interface.
+ • Pluggable implementations: Azure, Google, local models, and other implementations are supported.
+
+3. Text-to-speech (TTS)
+
+ • Interface definition: internal/tts/tts.go defines the TTS service interface.
+ • Pluggable implementations: Azure, Google, local models, and other implementations are supported.
+
+4. Large language model (LLM)
+
+ • Interface definition: internal/llm/llm.go defines the interface for interacting with an LLM.
+ • Pluggable implementations: OpenAI, local models, and other implementations are supported.
+
+5. Storage module
+
+ • Interface definition: internal/storage/storage.go defines the storage service interface.
+ • Implementation: MinIO is used by default to store audio files; the local file system is also supported.
+
+📖 Usage Guide
+
+Integrating a new STT/TTS service
+
+ 1. Create a new folder under the corresponding module, for example internal/stt/yourservice.
+ 2. Implement the corresponding interface, for example the Recognize method.
+ 3. Add support for the new service in the NewService method.
+
+Configuring the LLM service
+
+Change llm.provider in config.yaml and implement the corresponding LLM interface under internal/llm.
+
+Frontend development
+
+ • WebSocket communication: the frontend communicates with the server over WebSocket, sending audio data and receiving processing results.
+ • Audio playback: after receiving the audio URL returned by the server, play it with HTML5 Audio.
+
+🤝 Contributing
+
+Contributions of any kind are welcome! Please read CONTRIBUTING.md for more information.
+
+ • Report issues: if you find a bug or have a feature suggestion, please submit it in Issues.
+ • Contribute code: fork this repository, make your changes on your own branch, and submit a pull request.
+
+📄 License
+
+voiceflow is released under the [MIT](./LICENSE) license.
+
+❤️ Acknowledgements
+
+Thanks to all the developers who have contributed to this project!
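
The three integration steps in the README added by this patch (create a provider package, implement `Recognize`, register it in `NewService`) could look roughly like the sketch below. This is only an illustration under stated assumptions: the patch names `Recognize` and `NewService` but does not show their signatures, so the `Service` interface shape, the `echoService` type, and the `"echo"` provider name are all hypothetical and are not taken from `internal/stt/stt.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// Service is a hypothetical stand-in for the STT interface that the README
// says lives in internal/stt/stt.go. The real signature is not shown in the
// patch, so this shape is an assumption.
type Service interface {
	// Recognize converts raw audio bytes into text.
	Recognize(audio []byte) (string, error)
}

// echoService is a placeholder provider used only for this sketch. It does no
// real speech recognition; it just reports how many bytes it received.
type echoService struct{}

func (echoService) Recognize(audio []byte) (string, error) {
	if len(audio) == 0 {
		return "", errors.New("stt: empty audio input")
	}
	return fmt.Sprintf("received %d bytes of audio", len(audio)), nil
}

// NewService maps a provider name (for example the stt.provider value from
// config.yaml) to an implementation. Adding a provider means adding a case.
func NewService(provider string) (Service, error) {
	switch provider {
	case "echo": // hypothetical provider name, not one the README lists
		return echoService{}, nil
	default:
		return nil, fmt.Errorf("stt: unsupported provider %q", provider)
	}
}

func main() {
	svc, err := NewService("echo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	text, err := svc.Recognize([]byte{0x01, 0x02, 0x03})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text) // prints "received 3 bytes of audio"
}
```

Keeping the provider choice behind a single factory function means the WebSocket server only depends on the interface, so swapping `azure` for `google` or `local` would be a configuration change rather than a code change in the caller.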