July 2024 Deepgram product release update #35

Merged · 3 commits · Jul 25, 2024
Changes from 2 commits
22 changes: 20 additions & 2 deletions charts/deepgram-self-hosted/CHANGELOG.md
@@ -6,6 +6,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

## [Unreleased]

## [0.4.0] - 2024-07-25

### Added
- Introduced entity detection feature flag for API containers (`false` by default).
- Updated default container tags to July 2024 release. Refer to the [main Deepgram changelog](https://deepgram.com/changelog/deepgram-self-hosted-july-2024-release-240725) for additional details. Highlights include:
- Support for Deepgram's new English/Spanish multilingual code-switching model
- Beta support for entity detection
Contributor

Specify "for pre-recorded audio" again here to be consistent with the other bullets in this list?

- Beta support for improved redaction for pre-recorded audio
- Beta support for improved entity formatting for streaming audio
Contributor

Worth specifying that these are all English-only?


### Removed

- Removed some items nested under `api.features` and `engine.features` sections in favor of opinionated defaults.

## [0.3.0] - 2024-07-18

### Added
@@ -16,9 +30,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

### Added

- Sample `values.yaml` file for on-premises/self-managed Kubernetes clusters.

### Fixed

- Resolves a mismatch between PVC and SC prefix naming convention.
- Resolves error when specifying custom service account names.
- Sample `values.yaml` file for on-premises/self-managed Kubernetes clusters.

### Changed

@@ -66,7 +83,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
- Initial implementation of the Helm chart.


-[unreleased]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.3.0...HEAD
+[unreleased]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.4.0...HEAD
+[0.4.0]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.3.0...deepgram-self-hosted-0.4.0
[0.3.0]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.2.3...deepgram-self-hosted-0.3.0
[0.2.3]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.2.2-beta...deepgram-self-hosted-0.2.3
[0.2.2-beta]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.2.1-beta...deepgram-self-hosted-0.2.2-beta
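
The `Removed` entry for 0.4.0 does not name the keys involved. Going by the `values.yaml` and template changes in this PR, they appear to be the four shown below. This is a minimal sketch of stanzas that an override file written for chart 0.3.0 might still carry (the override file itself is hypothetical), not an authoritative migration list:

```yaml
# Hypothetical excerpt from an override file written for chart 0.3.0.
# Per the template changes in this PR, these keys are no longer referenced
# by the 0.4.0 templates, so they can presumably be dropped.
api:
  features:
    topicDetection: true   # template now renders topic_detection = true
    summarization: true    # template now renders summarization = true
engine:
  features:
    multichannel: true        # template now renders multichannel = true
    languageDetection: true   # template now renders language_detection = true
```

Whether Helm rejects or silently ignores leftover keys depends on whether the chart ships a values schema, which this diff does not show.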
4 changes: 2 additions & 2 deletions charts/deepgram-self-hosted/Chart.yaml
@@ -1,8 +1,8 @@
apiVersion: v2
name: deepgram-self-hosted
type: application
-version: 0.3.0
-appVersion: "release-240627"
+version: 0.4.0
+appVersion: "release-240725"
description: A Helm chart for running Deepgram services in a self-hosted environment
home: "https://developers.deepgram.com/docs/self-hosted-introduction"
sources: ["https://github.com/deepgram/self-hosted-resources"]
15 changes: 6 additions & 9 deletions charts/deepgram-self-hosted/README.md
@@ -1,6 +1,6 @@
# deepgram-self-hosted

-![Version: 0.3.0](https://img.shields.io/badge/Version-0.3.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: release-240627](https://img.shields.io/badge/AppVersion-release--240627-informational?style=flat-square) [![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/deepgram-self-hosted)](https://artifacthub.io/packages/search?repo=deepgram-self-hosted)
+![Version: 0.4.0](https://img.shields.io/badge/Version-0.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: release-240725](https://img.shields.io/badge/AppVersion-release--240725-informational?style=flat-square) [![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/deepgram-self-hosted)](https://artifacthub.io/packages/search?repo=deepgram-self-hosted)

A Helm chart for running Deepgram services in a self-hosted environment

@@ -188,11 +188,11 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| api.driverPool.standard.timeoutBackoff | float | `1.2` | timeoutBackoff is the factor to increase the timeout by for each additional retry (for exponential backoff). |
| api.features | object | `` | Enable ancillary features |
| api.features.diskBufferPath | string | `nil` | If API is receiving requests faster than Engine can process them, a request queue will form. By default, this queue is stored in memory. Under high load, the queue may grow too large and cause Out-Of-Memory errors. To avoid this, set a diskBufferPath to buffer the overflow on the request queue to disk. WARN: This is only to temporarily buffer requests during high load. If there is not enough Engine capacity to process the queued requests over time, the queue (and response time) will grow indefinitely. |
-| api.features.summarization | bool | `true` | summarization enable summarization *if* a valid summarization model is available. |
-| api.features.topicDetection | bool | `true` | topicDetection enables topic detection *if* a valid topic detection model is available. |
+| api.features.entityDetection | bool | `false` | Enables entity detection on pre-recorded audio *if* a valid entity detection model is available. *WARNING*: Beta functionality. |
+| api.features.entityRedaction | bool | `false` | Enables entity-based redaction on pre-recorded audio *if* a valid entity detection model is available. *WARNING*: Beta functionality. |
| api.image.path | string | `"quay.io/deepgram/onprem-api"` | path configures the image path to use for creating API containers. You may change this from the public Quay image path if you have imported Deepgram images into a private container registry. |
| api.image.pullPolicy | string | `"IfNotPresent"` | pullPolicy configures how the Kubelet attempts to pull the Deepgram API image |
-| api.image.tag | string | `"release-240627"` | tag defines which Deepgram release to use for API containers |
+| api.image.tag | string | `"release-240725"` | tag defines which Deepgram release to use for API containers |
| api.livenessProbe | object | `` | Liveness probe customization for API pods. |
| api.namePrefix | string | `"deepgram-api"` | namePrefix is the prefix to apply to the name of all K8s objects associated with the Deepgram API containers. |
| api.readinessProbe | object | `` | Readiness probe customization for API pods. |
@@ -228,13 +228,10 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| engine.chunking.speechToText.streaming.minDuration | float | `nil` | minDuration is the minimum audio duration for a STT chunk size for a streaming request |
| engine.chunking.speechToText.streaming.step | float | `1` | step defines how often to return interim results, in seconds. This value may be lowered to increase the frequency of interim results. However, this also causes a significant decrease in the number of concurrent streams supported by a single GPU. Please contact your Deepgram Account representative for more details. |
| engine.concurrencyLimit.activeRequests | int | `nil` | activeRequests limits the number of active requests handled by a single Engine container. If additional requests beyond the limit are sent, the API container forming the request will try a different Engine pod. If no Engine pods are able to accept the request, the API will return a 429 HTTP response to the client. The `nil` default means no limit will be set. |
-| engine.features | object | `` | Enable ancillary features |
-| engine.features.languageDetection | bool | `true` | languageDetection enables Deepgram language detection *if* a valid language detection model is available |
-| engine.features.multichannel | bool | `true` | multichannel allows/disallows multichannel requests |
| engine.halfPrecision.state | string | `"auto"` | Engine will automatically enable half precision operations if your GPU supports them. You can explicitly enable or disable this behavior with the state parameter which supports `"enable"`, `"disabled"`, and `"auto"`. |
| engine.image.path | string | `"quay.io/deepgram/onprem-engine"` | path configures the image path to use for creating Engine containers. You may change this from the public Quay image path if you have imported Deepgram images into a private container registry. |
| engine.image.pullPolicy | string | `"IfNotPresent"` | pullPolicy configures how the Kubelet attempts to pull the Deepgram Engine image |
-| engine.image.tag | string | `"release-240627"` | tag defines which Deepgram release to use for Engine containers |
+| engine.image.tag | string | `"release-240725"` | tag defines which Deepgram release to use for Engine containers |
| engine.livenessProbe | object | `` | Liveness probe customization for Engine pods. |
| engine.metricsServer | object | `` | metricsServer exposes an endpoint on each Engine container for reporting inference-specific system metrics. See https://developers.deepgram.com/docs/metrics-guide#deepgram-engine for more details. |
| engine.metricsServer.host | string | `"0.0.0.0"` | host is the IP address to listen on for metrics requests. You will want to listen on all interfaces to interact with other pods in the cluster. |
@@ -290,7 +287,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| licenseProxy.enabled | bool | `false` | The License Proxy is optional, but highly recommended to be deployed in production to enable highly available environments. |
| licenseProxy.image.path | string | `"quay.io/deepgram/onprem-license-proxy"` | path configures the image path to use for creating License Proxy containers. You may change this from the public Quay image path if you have imported Deepgram images into a private container registry. |
| licenseProxy.image.pullPolicy | string | `"IfNotPresent"` | pullPolicy configures how the Kubelet attempts to pull the Deepgram License Proxy image |
-| licenseProxy.image.tag | string | `"release-240627"` | tag defines which Deepgram release to use for License Proxy containers |
+| licenseProxy.image.tag | string | `"release-240725"` | tag defines which Deepgram release to use for License Proxy containers |
| licenseProxy.keepUpstreamServerAsBackup | bool | `true` | Even with a License Proxy deployed, API and Engine pods can be configured to keep the upstream `license.deepgram.com` license server as a fallback licensing option if the License Proxy is unavailable. Disable this option if you are restricting API/Engine Pod network access for security reasons, and only the License Proxy should send egress traffic to the upstream license server. |
| licenseProxy.livenessProbe | object | `` | Liveness probe customization for Proxy pods. |
| licenseProxy.namePrefix | string | `"deepgram-license-proxy"` | namePrefix is the prefix to apply to the name of all K8s objects associated with the Deepgram License Proxy containers. |
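
For the License Proxy options described in the rows above, a minimal values sketch might look like the following. The key names and defaults come from the table; enabling the proxy here is shown only as an illustration.

```yaml
licenseProxy:
  # Off by default; the table above recommends deploying it in production.
  enabled: true
  # Keep license.deepgram.com as a fallback when the proxy is unavailable.
  # Set to false if only the License Proxy should send egress traffic upstream.
  keepUpstreamServerAsBackup: true
```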
7 changes: 5 additions & 2 deletions charts/deepgram-self-hosted/templates/api/api.config.yaml
@@ -43,8 +43,11 @@ data:
{{- end }}

[features]
-topic_detection = {{ .Values.api.features.topicDetection }}
-summarization = {{ .Values.api.features.summarization }}
+topic_detection = true
+summarization = true
+entity_detection = {{ .Values.api.features.entityDetection }}
+entity_redaction = {{ .Values.api.features.entityRedaction }}
+
{{- if .Values.api.features.diskBufferPath }}
disk_buffer_path = "{{ .Values.api.features.diskBufferPath }}"
{{- end }}
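
As a sketch of how the two new template inputs might be set: both flags default to `false`, so opting in to the beta features means overriding them in a values file. The file name `my-values.yaml` is a placeholder, and per the chart documentation the features still depend on a valid entity detection model being available.

```yaml
# my-values.yaml (placeholder name): opt in to the beta entity features.
api:
  features:
    # Rendered into the API config as: entity_detection = true
    entityDetection: true
    # Rendered into the API config as: entity_redaction = true
    entityRedaction: true
```

A file like this is typically passed to `helm install` or `helm upgrade` with `-f`.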
Expand Down
Original file line number Diff line number Diff line change
@@ -43,8 +43,8 @@ data:
]

[features]
-multichannel = {{ .Values.engine.features.multichannel }}
-language_detection = {{ .Values.engine.features.languageDetection }}
+multichannel = true
+language_detection = true

[chunking.batch]
{{- if .Values.engine.chunking.speechToText.batch.minDuration }}
27 changes: 11 additions & 16 deletions charts/deepgram-self-hosted/values.yaml
@@ -112,7 +112,7 @@ api:
# -- pullPolicy configures how the Kubelet attempts to pull the Deepgram API image
pullPolicy: IfNotPresent
# -- tag defines which Deepgram release to use for API containers
-tag: release-240627
+tag: release-240725

# -- Additional labels to add to API resources
additionalLabels: {}
@@ -222,11 +222,15 @@ api:
# -- Enable ancillary features
# @default -- ``
features:
-# -- topicDetection enables topic detection *if* a valid topic detection model is available.
-topicDetection: true
+# -- Enables entity detection on pre-recorded audio
+# *if* a valid entity detection model is available.
+# *WARNING*: Beta functionality.
+entityDetection: false

-# -- summarization enable summarization *if* a valid summarization model is available.
-summarization: true
+# -- Enables entity-based redaction on pre-recorded audio
+# *if* a valid entity detection model is available.
+# *WARNING*: Beta functionality.
+entityRedaction: false

# -- If API is receiving requests faster than Engine can process them, a request
# queue will form. By default, this queue is stored in memory. Under high load,
@@ -274,7 +278,7 @@ engine:
# -- pullPolicy configures how the Kubelet attempts to pull the Deepgram Engine image
pullPolicy: IfNotPresent
# -- tag defines which Deepgram release to use for Engine containers
-tag: release-240627
+tag: release-240725

# -- Additional labels to add to Engine resources
additionalLabels: {}
@@ -455,15 +459,6 @@ engine:
# Account Representative.
links: []

-# -- Enable ancillary features
-# @default -- ``
-features:
-# -- multichannel allows/disallows multichannel requests
-multichannel: true
-# -- languageDetection enables Deepgram language detection *if*
-# a valid language detection model is available
-languageDetection: true
-
# -- chunking defines the size of audio chunks to process in seconds.
# Adjusting these values will affect both inference performance and accuracy
# of results. Please contact your Deepgram Account Representative if you
@@ -525,7 +520,7 @@ licenseProxy:
# Deepgram images into a private container registry.
path: quay.io/deepgram/onprem-license-proxy
# -- tag defines which Deepgram release to use for License Proxy containers
-tag: release-240627
+tag: release-240725
# -- pullPolicy configures how the Kubelet attempts to pull the Deepgram
# License Proxy image
pullPolicy: IfNotPresent
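
The image comments in `values.yaml` mention importing Deepgram images into a private container registry. A sketch of the corresponding overrides, assuming a registry at `registry.example.com` (a placeholder) that mirrors the public repository names:

```yaml
# registry.example.com is a placeholder for a private registry;
# the repository names are assumed to mirror the public Quay paths.
api:
  image:
    path: registry.example.com/deepgram/onprem-api
    tag: release-240725
engine:
  image:
    path: registry.example.com/deepgram/onprem-engine
    tag: release-240725
licenseProxy:
  image:
    path: registry.example.com/deepgram/onprem-license-proxy
    tag: release-240725
```

Pinning `tag` explicitly in all three places keeps the components on the same release when the chart default changes, as it does in this PR.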
8 changes: 8 additions & 0 deletions common/license_proxy_deploy/api.toml
@@ -66,6 +66,14 @@ topic_detection = true # or false
### Enables summarization *if* a valid summarization model is available
summarization = true # or false

+### Enables pre-recorded entity detection *if* a valid entity detection model is available
+### *WARNING*: Beta functionality.
+entity_detection = false # or true
+
+### Enables pre-recorded entity-based redaction *if* a valid entity detection model is available
+### *WARNING*: Beta functionality.
+entity_redaction = false # or true
+
### If API is receiving requests faster than Engine can process them, a request
### queue will form. By default, this queue is stored in memory. Under high load,
### the queue may grow too large and cause Out-Of-Memory errors. To avoid this,
8 changes: 8 additions & 0 deletions common/standard_deploy/api.toml
@@ -64,6 +64,14 @@ topic_detection = true # or false
### Enables summarization *if* a valid summarization model is available
summarization = true # or false

+### Enables pre-recorded entity detection *if* a valid entity detection model is available
+### *WARNING*: Beta functionality.
+entity_detection = false # or true
+
+### Enables pre-recorded entity-based redaction *if* a valid entity detection model is available
+### *WARNING*: Beta functionality.
+entity_redaction = false # or true
+
### If API is receiving requests faster than Engine can process them, a request
### queue will form. By default, this queue is stored in memory. Under high load,
### the queue may grow too large and cause Out-Of-Memory errors. To avoid this,
6 changes: 3 additions & 3 deletions docker/docker-compose.license-proxy.yml
@@ -5,7 +5,7 @@ version: "3.7"
services:
# The speech API service.
api:
-image: quay.io/deepgram/onprem-api:release-240627
+image: quay.io/deepgram/onprem-api:release-240725

# Here we expose the API port to the host machine. The container port
# (right-hand side) must match the port that the API service is listening
@@ -41,7 +41,7 @@ services:

# The speech engine service.
engine:
-image: quay.io/deepgram/onprem-engine:release-240627
+image: quay.io/deepgram/onprem-engine:release-240725

# Utilize a GPU, if available.
runtime: nvidia
@@ -83,7 +83,7 @@ services:

# The service to validate your Deepgram license
license-proxy:
-image: quay.io/deepgram/onprem-license-proxy:release-240627
+image: quay.io/deepgram/onprem-license-proxy:release-240725

# Here we expose the License Proxy status port to the host machine. The container port
# (right-hand side) must match the port that the License Proxy service is listening
4 changes: 2 additions & 2 deletions docker/docker-compose.standard.yml
@@ -5,7 +5,7 @@ version: "3.7"
services:
# The speech API service.
api:
-image: quay.io/deepgram/onprem-api:release-240627
+image: quay.io/deepgram/onprem-api:release-240725

# Here we expose the API port to the host machine. The container port
# (right-hand side) must match the port that the API service is listening
@@ -37,7 +37,7 @@ services:

# The speech engine service.
engine:
-image: quay.io/deepgram/onprem-engine:release-240627
+image: quay.io/deepgram/onprem-engine:release-240725

# Utilize a GPU, if available.
runtime: nvidia
6 changes: 3 additions & 3 deletions podman/podman-compose.license-proxy.yml
@@ -5,7 +5,7 @@ version: "3.7"
services:
# The speech API service.
api:
-image: quay.io/deepgram/onprem-api:release-240627
+image: quay.io/deepgram/onprem-api:release-240725

# Here we expose the API port to the host machine. The container port
# (right-hand side) must match the port that the API service is listening
@@ -41,7 +41,7 @@ services:

# The speech engine service.
engine:
-image: quay.io/deepgram/onprem-engine:release-240627
+image: quay.io/deepgram/onprem-engine:release-240725

# Utilize a GPU, if available.
devices:
@@ -84,7 +84,7 @@ services:

# The service to validate your Deepgram license
license-proxy:
-image: quay.io/deepgram/onprem-license-proxy:release-240627
+image: quay.io/deepgram/onprem-license-proxy:release-240725

# Here we expose the License Proxy status port to the host machine. The container port
# (right-hand side) must match the port that the License Proxy service is listening
4 changes: 2 additions & 2 deletions podman/podman-compose.standard.yml
@@ -5,7 +5,7 @@ version: "3.7"
services:
# The speech API service.
api:
-image: quay.io/deepgram/onprem-api:release-240627
+image: quay.io/deepgram/onprem-api:release-240725

# Here we expose the API port to the host machine. The container port
# (right-hand side) must match the port that the API service is listening
@@ -37,7 +37,7 @@ services:

# The speech engine service.
engine:
-image: quay.io/deepgram/onprem-engine:release-240627
+image: quay.io/deepgram/onprem-engine:release-240725

# Utilize a GPU, if available.
devices: