Releases: ml4ai/tomcat-speechAnalyzer

4.1.6

07 Jul 15:05

Implements database efficiency improvements:

  • Increases maximum connection pool size for Postgres db
  • docker-compose verifies that db is running before launching SpeechAnalyzer
  • Switches to prepared statements for database queries
  • Fixes Opensmile configuration to accurately describe incoming audio chunks

4.1.5

23 May 20:26
  • Fixes timestamp published with Boost log messages
  • Prints version number at start of agent
  • Prints sentiment and personality messages to log and log file

4.1.2

10 May 18:09

The SpeechAnalyzer now prints log messages to both the console and a log file.

Local:

If running locally, the files will be stored in a logs directory created in the directory containing the executable.

Docker:

If running in Docker, the logs will be stored in the /logs directory inside the Docker container. This directory will need to be mounted to access the log files.
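
The notes above do not say how the dual output is wired up. As a library-free sketch of the idea, the hypothetical `TeeLogger` below formats each line once and writes it to two streams; the class name, the `[level]` prefix, and the flush policy are assumptions for illustration, not taken from the SpeechAnalyzer source.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the SpeechAnalyzer source): writes every
// log line to two sinks, for example the console and a log file.
class TeeLogger {
  public:
    TeeLogger(std::ostream& console, std::ostream& file)
        : console_(console), file_(file) {}

    void log(const std::string& level, const std::string& message) {
        // Build the full line first so each sink receives it in one write.
        std::ostringstream line;
        line << '[' << level << "] " << message << '\n';
        const std::string text = line.str();

        // Serialize writers so lines from different threads do not interleave.
        std::lock_guard<std::mutex> lock(mutex_);
        console_ << text;
        file_ << text;
        file_.flush();  // keep the file current if the process exits abruptly
    }

  private:
    std::ostream& console_;
    std::ostream& file_;
    std::mutex mutex_;
};
```

In a real program the two sinks would typically be `std::cout` and a `std::ofstream` opened on a file inside the logs directory; the file name is left to the caller here.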

NOTE: Due to an issue with release 4.1.1, this release does not include the changes from 4.1.1.

4.1.1

29 Apr 16:13
606ca0d
  • Incoming audio chunks are now in raw binary format instead of base64
  • Switch to paho.mqtt.cpp library when interfacing with internal message bus
  • Reduces CPU usage to between 200% and 300% with three audio streams
  • Upgrades Ubuntu version when building with Docker due to compatibility issues with paho.mqtt.cpp
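
For a sense of what dropping base64 saves: base64 turns every group of up to 3 input bytes into 4 output characters, so an encoded chunk is about a third larger than the raw bytes and has to be decoded before use. The sketch below computes the padded base64 length for a raw chunk size. The function name and the 4096-byte chunk size are illustrative assumptions, not values from the SpeechAnalyzer source.

```cpp
#include <cstddef>

// Length of the standard (padded) base64 encoding of `raw_bytes` bytes:
// every group of up to 3 input bytes becomes 4 output characters.
constexpr std::size_t base64_encoded_size(std::size_t raw_bytes) {
    return 4 * ((raw_bytes + 2) / 3);
}

// Hypothetical chunk size, chosen only for illustration.
constexpr std::size_t kExampleChunkBytes = 4096;

// 4096 raw bytes need 5464 base64 characters, roughly 33% more.
static_assert(base64_encoded_size(kExampleChunkBytes) == 5464,
              "unexpected base64 size");
```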

4.1.0

26 Apr 17:54
be28d05

A number of fixes and optimizations have been made to reduce CPU usage to between 250% and 400% for three participants while a trial is in progress.

  • Fixed a bug where external Opensmile processes were not killed properly
  • Fixed a bug where an infinite loop did not yield the CPU
  • Updated database and Opensmile code to increase efficiency and yield the CPU as much as possible
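
These notes do not describe how the busy loop was fixed. One common way to keep a worker loop from spinning is to block on a condition variable until there is work, as in the sketch below. `ChunkQueue` and its methods are hypothetical names used for illustration and are not taken from the SpeechAnalyzer source.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical queue of audio chunks shared between a producer and a worker.
class ChunkQueue {
  public:
    void push(std::vector<char> chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            chunks_.push(std::move(chunk));
        }
        ready_.notify_one();
    }

    // Signals that no more chunks will arrive and wakes any waiting worker.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Sleeps (instead of spinning) until a chunk is available or the queue
    // is closed. Returns false once the queue is closed and fully drained.
    bool pop(std::vector<char>& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !chunks_.empty(); });
        if (chunks_.empty()) {
            return false;
        }
        out = std::move(chunks_.front());
        chunks_.pop();
        return true;
    }

  private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::vector<char>> chunks_;
    bool closed_ = false;
};
```

A worker thread would then run `while (queue.pop(chunk)) { ... }`, which consumes no CPU while the queue is empty.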

Additionally, this pull request updates the repository formatting to properly reflect the decoupling of the SpeechAnalyzer and ASRAgent.

3.5.1

17 Mar 20:36
69df9f5
  • The --disable_opensmile command line option will now disable all Opensmile related components, including the MMC server, Postgres database, and Opensmile feature extraction.

3.5.0

10 Mar 16:24
1ff4808
  • Switches from the default model to the "video" enhanced model for the Google Cloud Speech API
  • Enables specifying domain words and phrases through a configuration file for improved transcription accuracy
  • Updates the googleapis version to allow using speech adaptation boosts in API version v1
  • Resolves a number of edge-case bugs that could cause the SpeechAnalyzer to crash on startup or shutdown
  • Switches many std::cout statements to BOOST_LOG_TRIVIAL to avoid non-thread-safe behavior
  • Adds more detailed logging information
  • Fixes crash when building Docker container if not enough CPUs were allocated to Docker Engine

3.4.0

11 Feb 20:24
47df7a0
  • Updates uaz_speech_analyzer to version 3.4.0
  • Removes a number of unused/redundant fields from ASR messages and resolves issues #17-#21
  • Adds additional fields to features database
  • Renames a number of columns in database for clarity
  • Intermediate transcriptions re-enabled and additional fields added
  • New command line option to disable all but initial intermediate transcription messages

3.3.1

14 Jan 16:02
08be522
  • Fixes bug when Vosk leaves out start or end timestamp in response message
  • Adds additional error handling for Vosk message processing
  • Removes sending word-aligned features on message bus
  • Prevents multiple concurrent connections with the same participant_id
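
As an illustration of the last item, rejecting a second connection for a participant can be done by tracking the active IDs in a set guarded by a mutex. This is a minimal sketch under that assumption; `ParticipantRegistry` and its methods are hypothetical names, and the actual SpeechAnalyzer code may work differently.

```cpp
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical registry of participant_ids that currently have a connection.
class ParticipantRegistry {
  public:
    // Returns true if the id was registered, or false if a connection with
    // the same participant_id is already active.
    bool try_register(const std::string& participant_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        return active_.insert(participant_id).second;
    }

    // Call when the connection closes so the id can be used again.
    void release(const std::string& participant_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        active_.erase(participant_id);
    }

  private:
    std::mutex mutex_;
    std::unordered_set<std::string> active_;
};
```

A connection handler would call `try_register` when a client connects, close the connection if it returns false, and call `release` on disconnect.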

3.2.0

07 Jan 19:16
06c4fc1
  • Extracted vocalic features now stored in a Postgres database
  • Disabled storing features in memory
  • Vosk speech engine now default ASR backend
  • Code formatting
  • Version number updated to 3.2.0