Releases: JamesBrill/react-speech-recognition

Async stopListening

07 Nov 15:12
9ce4fa5

Attempts to address #66

stopListening and abortListening were previously synchronous functions, but the internal function for turning off the microphone is asynchronous. This made it impossible for consumers to wait for the microphone to be turned off. This release addresses that by making both functions asynchronous: they now resolve only once the microphone has been turned off.
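
For example, a consumer can now await stopListening and know the microphone is off before continuing (a minimal sketch; stopAndThen is illustrative):

import SpeechRecognition from 'react-speech-recognition'

// Minimal sketch: stopListening now resolves once the microphone has been turned off
const stopAndThen = async () => {
  await SpeechRecognition.stopListening()
  // The microphone is off at this point, so follow-up work is safe here
  console.log('Microphone is off')
}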

Bugfix: race condition between multiple instances

08 Sep 00:09
5f16752

Attempts to address #63

Because the commands prop is an array, its value is deemed to have changed on every render when either (a) the default value is used or (b) the consumer does not memoize their commands value. The various callbacks and effects that depend on commands then get re-run on every change, including unrelated ones (e.g. transcript updates).

This was causing a race condition when multiple instances of SpeechRecognition were rendered. While the transcript was being updated, the new transcripts would cause commands to "change" and re-run the dependent effects, including the one that subscribes and unsubscribes SpeechRecognition to RecognitionManager. As RecognitionManager notified its subscribers of transcript changes, they would unsubscribe and resubscribe with a different ID. When iterating through many subscribers, the manager would sometimes find that a subscriber no longer existed, having already unsubscribed under its old ID after a previous transcript change.

The solution is to stop subscribers from caring about changes to commands. matchCommands, the direct dependent of commands, now accesses it via a mutable ref, just getting whatever the latest value is.
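
Roughly, the pattern looks like this (a simplified sketch; useMatchCommands and the exact-match logic are illustrative, not the library's internals):

import { useCallback, useRef } from 'react'

// Simplified sketch of the "latest ref" pattern (illustrative only)
const useMatchCommands = (commands) => {
  const commandsRef = useRef(commands)
  commandsRef.current = commands // always holds the latest commands

  // matchCommands keeps a stable identity because commands is read from the ref
  // rather than listed as a dependency, so dependent effects no longer re-run
  // every time the commands array "changes"
  return useCallback((transcript) => {
    commandsRef.current.forEach(({ command, callback }) => {
      if (transcript === command) {
        callback()
      }
    })
  }, [])
}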

I'm not sure how I feel about this - it feels like a misuse of hooks. Besides hashing commands and comparing that with its old value, this is the best I can come up with for now.

Bugfix: fix SSR support

26 Aug 22:36
5fa4761

Adds a fallback for server-side rendering when navigator is not defined. This is the release of #61
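
A hypothetical sketch of the kind of guard involved (not the library's exact code):

// Hypothetical sketch: fall back gracefully when navigator (i.e. a browser) is not defined,
// so the module can be imported during server-side rendering
const BrowserSpeechRecognition =
  typeof navigator !== 'undefined'
    ? window.SpeechRecognition || window.webkitSpeechRecognition
    : undefined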

Bugfix: conflict between stopListening and resetTranscript

21 Aug 20:00
4de0d15

There was a bug (#58) in which resetTranscript would turn the microphone on, even when stopListening had just been called.

When any of stopListening, abortListening or resetTranscript is called, the RecognitionManager will first call disconnect, which asks the Web Speech API to stop listening in some way. resetTranscript requests an abort to prevent speech in progress from being picked up immediately after the transcript is reset. It also specifies the behaviour to perform when the microphone is actually turned off. Unlike the other two methods, resetTranscript specifies that the microphone be restarted after being aborted.

The callback for this moment is onRecognitionDisconnect. If the microphone was stopped by resetTranscript, it will immediately turn it on again; otherwise, it will remain off. This is where we encounter the issue.

If stopListening is called and then resetTranscript, resetTranscript will be the last method to configure the behaviour on disconnect, which is to restart. When the disconnect finally happens, it will restart the microphone, going against the request of stopListening to keep it turned off.

The solution is to only configure disconnect logic when the microphone is deemed to be turned on. If stopListening gets in first, it sets listening to false. When resetTranscript follows it, it gets blocked by a new condition in disconnect that requires listening to be true, stopping it from overriding the disconnect behaviour.
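
Sketched roughly (a hypothetical illustration, not the library's exact internals):

// Hypothetical sketch of the guard described above
class RecognitionManagerSketch {
  constructor (recognition) {
    this.recognition = recognition
    this.listening = false
    this.disconnectType = null
  }

  disconnect (disconnectType) {
    // New condition: only configure disconnect behaviour while listening is true.
    // If stopListening has already set listening to false, a following
    // resetTranscript cannot override the "stay off" behaviour.
    if (this.listening) {
      this.disconnectType = disconnectType
      this.recognition.abort()
    }
  }
}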

Pass resetTranscript to command callbacks

14 Aug 15:54
af3d083

Command callbacks do not have access to the resetTranscript function, as they have to be defined before resetTranscript is returned by useSpeechRecognition. This release addresses the problem by passing resetTranscript to the command callbacks as the last argument. It is put inside an object so that further properties can be passed in future. Intended to address #53
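
For example (a sketch; the component and command are illustrative):

import React from 'react'
import { useSpeechRecognition } from 'react-speech-recognition'

const ClearableDictaphone = () => {
  // Sketch: the final argument passed to a command callback is an object
  // containing resetTranscript
  const commands = [
    {
      command: 'clear',
      callback: ({ resetTranscript }) => resetTranscript()
    }
  ]
  const { transcript } = useSpeechRecognition({ commands })

  return <p>{transcript}</p>
}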

Fuzzy matching

31 Jul 20:37
2568bb6

Contribution from @urbanL1fe

Adds two new properties that can be set on each command object, with the Dice coefficient used under the hood to compare strings:

  • isFuzzyMatch: Boolean that determines whether the comparison between speech and command is based on similarity rather than an exact match. Fuzzy matching is useful for commands that are easy to mispronounce or for the Speech Recognition engine to misinterpret (e.g. names of places, sports teams, restaurant menu items). It is intended for commands that are string literals without special characters. If command is a string with special characters or a RegExp, it will be converted to a string without special characters when fuzzy matching. The similarity needed to match the command can be configured with fuzzyMatchingThreshold. isFuzzyMatch is false by default. When it is set to true, it will pass three arguments to callback:
    • The value of command
    • The speech that matched command
    • The similarity between command and the speech
  • fuzzyMatchingThreshold: If the similarity of speech to command is higher than this value when isFuzzyMatch is turned on, the callback will be invoked. You should set this only if isFuzzyMatch is true. It takes values between 0 (will match anything) and 1 (needs an exact match). The default value is 0.8.
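
For example (a sketch; the component, command text and threshold are illustrative):

import React, { useState } from 'react'
import { useSpeechRecognition } from 'react-speech-recognition'

const FuzzyDictaphone = () => {
  const [message, setMessage] = useState('')
  const commands = [
    {
      // Fuzzy matching sketch: place names are easy for the engine to misinterpret
      command: 'Show me directions to Edinburgh',
      callback: (command, spokenPhrase, similarityRatio) =>
        setMessage(`"${spokenPhrase}" matched "${command}" (similarity: ${similarityRatio})`),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7
    }
  ]
  const { transcript } = useSpeechRecognition({ commands })

  return (
    <div>
      <p>{message}</p>
      <p>{transcript}</p>
    </div>
  )
}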

Code coverage threshold: 80%

12 Jul 12:53
602da10


Update package.json description

12 Jul 11:20
049dd11


v3: Commands

11 Jul 13:11
bec4d3f

Migrating from v2 to v3

v3 makes use of React hooks to simplify the consumption of react-speech-recognition:

  • Replacing the higher order component with a React hook
  • Introducing commands, functions that get executed when the user says a particular phrase
  • A clear separation between all parts of react-speech-recognition that are global (e.g. whether the microphone is listening or not) and local (e.g. transcripts). This makes it possible to have multiple components consuming the global microphone input while maintaining their own transcripts and commands
  • Requires React 16.8
  • Adds code coverage step to CI
  • Some default prop values have changed so check those out below

The original Dictaphone example

In v2

import React, { Component } from "react";
import PropTypes from "prop-types";
import SpeechRecognition from "react-speech-recognition";

const propTypes = {
  // Props injected by SpeechRecognition
  transcript: PropTypes.string,
  resetTranscript: PropTypes.func,
  browserSupportsSpeechRecognition: PropTypes.bool
};

const Dictaphone = ({
  transcript,
  resetTranscript,
  browserSupportsSpeechRecognition
}) => {
  if (!browserSupportsSpeechRecognition) {
    return null;
  }

  return (
    <div>
      <button onClick={resetTranscript}>Reset</button>
      <span>{transcript}</span>
    </div>
  );
};

Dictaphone.propTypes = propTypes;

export default SpeechRecognition(Dictaphone);

In v3

import React, { useEffect } from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const { transcript, resetTranscript } = useSpeechRecognition()

  useEffect(() => {
    SpeechRecognition.startListening({ continuous: true })
  }, []);

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div>
      <button onClick={resetTranscript}>Reset</button>
      <p>{transcript}</p>
    </div>
  )
}
export default Dictaphone

autoStart

This was a global option in v2 that would cause the microphone to start listening from the beginning by default. In v3, the microphone is initially turned off by default. It can be turned on when your component first renders, using either useEffect if you're using hooks or componentDidMount if you're still using class components. It is recommended that you do this close to the root of your application, as this affects global state.

useEffect(() => {
  SpeechRecognition.startListening({ continuous: true })
}, []);
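
For class components, the componentDidMount equivalent looks like this (a sketch; the Listener component is illustrative):

import React, { Component } from 'react'
import SpeechRecognition from 'react-speech-recognition'

// Sketch: class-component equivalent of the useEffect above
class Listener extends Component {
  componentDidMount () {
    SpeechRecognition.startListening({ continuous: true })
  }

  render () {
    return null
  }
}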

continuous

This was another global option in v2 that would, by default, have the microphone permanently listen to the user, even after they finished speaking. This default behaviour did not match the most common usage pattern, which is to use react-speech-recognition for "press to talk" buttons that stop listening once a command has been spoken.

continuous is now an option that can be passed to SpeechRecognition.startListening. It is false by default, but can be overridden like so:

SpeechRecognition.startListening({ continuous: true })
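
A sketch of the "press to talk" pattern the new default is designed for (the TalkButton component is illustrative):

import React from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

// Sketch: with continuous left as false, listening stops automatically
// once the user has finished speaking
const TalkButton = () => {
  const { transcript } = useSpeechRecognition()

  return (
    <div>
      <button onClick={() => SpeechRecognition.startListening()}>Press to talk</button>
      <p>{transcript}</p>
    </div>
  )
}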

clearTranscriptOnListen

This is a new prop in v3 that is passed into useSpeechRecognition from the consumer. Its default value makes a subtle change to the previous behaviour. When continuous was set to false in v2, the transcript would not be reset when the microphone started listening again. clearTranscriptOnListen changes that, clearing the component's transcript at the beginning of every new discontinuous speech. To replicate the old behaviour, this can be turned off when passing props into useSpeechRecognition:

const { transcript } = useSpeechRecognition({ clearTranscriptOnListen: false })

Injected props

SpeechRecognition used to inject props into components in v2. These props are still available, but in different forms.

transcript

This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.

resetTranscript

This is now state returned by useSpeechRecognition. This only resets the component's transcript, not any global state.

startListening

This is now available as SpeechRecognition.startListening.

stopListening

This is now available as SpeechRecognition.stopListening.

abortListening

This is now available as SpeechRecognition.abortListening.

browserSupportsSpeechRecognition

This is now available as the function SpeechRecognition.browserSupportsSpeechRecognition.

listening

This is now state returned by useSpeechRecognition. This is the global listening state.

interimTranscript

This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.

finalTranscript

This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.

recognition

This is now returned by the function SpeechRecognition.getRecognition.
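
Putting that together, inside a component the former injected props map to the hook and the SpeechRecognition object like this (a sketch):

import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

// Sketch (call inside a component): where each former injected prop now lives
const {
  transcript,        // local to this component
  interimTranscript, // local to this component
  finalTranscript,   // local to this component
  resetTranscript,   // resets only this component's transcript
  listening          // global listening state
} = useSpeechRecognition()

const recognition = SpeechRecognition.getRecognition()
// startListening, stopListening and abortListening are also on SpeechRecognition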

Commands

To respond when the user says a particular phrase, you can pass in a list of commands to the useSpeechRecognition hook. Each command is an object with the following properties:

  • command: This is a string or RegExp representing the phrase you want to listen for
  • callback: The function that is executed when the command is spoken
  • matchInterim: Boolean that determines whether "interim" results should be matched against the command. This will make your component respond faster to commands, but also makes false positives more likely - i.e. the command may be detected when it is not spoken. This is false by default and should only be set for simple commands.

Command symbols

To make commands easier to write, the following symbols are supported:

  • Splats: this is just a * and will match multi-word text:
    • Example: 'I would like to order *'
    • The words that match the splat will be passed into the callback, one argument per splat
  • Named variables: this is written :<name> and will match a single word:
    • Example: 'I am :height metres tall'
    • The one word that matches the named variable will be passed into the callback
  • Optional words: this is a phrase wrapped in parentheses ( and ), and is not required to match the command:
    • Example: 'Pass the salt (please)'
    • The above example would match both 'Pass the salt' and 'Pass the salt please'

Example with commands

import React, { useState } from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const [message, setMessage] = useState('')
  const commands = [
    {
      command: 'I would like to order *',
      callback: (food) => setMessage(`Your order is for: ${food}`)
    },
    {
      command: 'The weather is :condition today',
      callback: (condition) => setMessage(`Today, the weather is ${condition}`)
    },
    {
      command: 'My top sports are * and *',
      callback: (sport1, sport2) => setMessage(`#1: ${sport1}, #2: ${sport2}`)
    },
    {
      command: 'Pass the salt (please)',
      callback: () => setMessage('My pleasure')
    },
    {
      command: 'Hello',
      callback: () => setMessage('Hi there!'),
      matchInterim: true
    }
  ]

  const { transcript } = useSpeechRecognition({ commands })

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div>
      <p>{message}</p>
      <p>{transcript}</p>
    </div>
  )
}
export default Dictaphone

Add tests

28 Jun 12:34
v2.1.4