Releases: JamesBrill/react-speech-recognition
Async stopListening
Attempts to address #66
stopListening and abortListening were previously synchronous functions, but the internal function for turning off the microphone is asynchronous. This made it impossible for consumers to wait for the microphone to be turned off. This release addresses that, making both functions asynchronous so that they do not return until the microphone has been turned off.
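For example, a consumer can now wait for the microphone to be turned off before performing follow-up work. A minimal sketch (saveTranscript is a hypothetical consumer function):
const stopAndSave = async () => {
  await SpeechRecognition.stopListening()
  // The microphone is guaranteed to be off by this point
  saveTranscript()
}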
Bugfix: race condition between multiple instances
Attempts to address #63
Because the commands prop is an array, its value is always deemed to have changed whenever (a) the default value is used or (b) the consumer does not memoize their commands value. The various callbacks and effects that depend on commands get re-run on every change, including unrelated changes (e.g. to the transcript).
This was resulting in a race condition when multiple instances of SpeechRecognition were being rendered. While the transcript was being updated, the new transcripts would cause commands to "change" and re-run the dependent effects, including the one that subscribes and unsubscribes SpeechRecognition to and from RecognitionManager. As RecognitionManager notifies its subscribers of transcript changes, they unsubscribe and resubscribe with a different ID. When iterating through many subscribers, the manager would sometimes find that a subscriber no longer existed because it had unsubscribed with its old ID after a previous transcript change.
The solution is to stop subscribers from caring about changes to commands. matchCommands, the direct dependent of commands, now accesses it via a mutable ref, simply reading whatever the latest value is.
I'm not sure how I feel about this - it feels like a misuse of hooks. Besides hashing commands and comparing that with its old value, this is the best I can come up with for now.
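A minimal sketch of the ref-based approach, with illustrative names rather than the library's actual internals: the latest commands value is mirrored into a mutable ref so the matching logic can read it without being re-created whenever commands changes.
import { useCallback, useRef } from 'react'

const useCommandMatcher = (commands) => {
  // Mirror the latest commands into a mutable ref on every render
  const commandsRef = useRef(commands)
  commandsRef.current = commands

  // Stable identity: effects depending on matchCommands are not re-run
  // just because the consumer passed in a new commands array
  const matchCommands = useCallback((transcript) => {
    commandsRef.current.forEach(({ command, callback }) => {
      if (command === transcript) {
        callback(transcript)
      }
    })
  }, [])

  return matchCommands
}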
Bugfix: fix SSR support
Adds fallback for server-side rendering, where navigator is not defined. This is the release of #61
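A minimal sketch of this kind of guard (not necessarily the library's exact code): browser globals are only touched when they exist, so importing the library during server-side rendering does not throw.
const isBrowser = typeof navigator !== 'undefined' && typeof window !== 'undefined'
const BrowserSpeechRecognition = isBrowser
  ? window.SpeechRecognition || window.webkitSpeechRecognition
  : undefined
const browserSupportsSpeechRecognition = BrowserSpeechRecognition !== undefined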
Bugfix: conflict between stopListening and resetTranscript
There was a bug (#58) in which resetTranscript would turn the microphone on, even when stopListening had just been called.
When any of stopListening, abortListening or resetTranscript is called, the RecognitionManager will first call disconnect, which asks the Web Speech API to stop listening in some way. resetTranscript requests an abort to prevent speech in progress from being picked up immediately after the transcript is reset. It also specifies the behaviour to perform when the microphone is actually turned off. Unlike the other two methods, resetTranscript specifies that the microphone be restarted after being aborted.
The callback for this moment is onRecognitionDisconnect. If the microphone was stopped by resetTranscript, it will immediately turn the microphone on again; otherwise, the microphone will remain off. This is where we encounter the issue.
If stopListening is called and then resetTranscript, resetTranscript will be the last method to configure the behaviour on disconnect, which is to restart. When the disconnect finally happens, the microphone will be restarted, going against the request of stopListening to keep it turned off.
The solution is to only configure disconnect behaviour when the microphone is deemed to be turned on. If stopListening gets in first, it sets listening to false. When resetTranscript follows, it is blocked by a new condition in disconnect that requires listening to be true, stopping it from overriding the disconnect behaviour.
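A minimal sketch of this guard, with illustrative names rather than the library's actual internals: disconnect behaviour is only configured while the manager believes the microphone is on, so a resetTranscript issued after stopListening cannot schedule a restart.
class RecognitionManagerSketch {
  constructor() {
    this.listening = true
    this.disconnectType = null
  }

  disconnect(disconnectType) {
    if (!this.listening) {
      // Microphone already turned off: don't override the disconnect behaviour
      return
    }
    this.disconnectType = disconnectType
  }

  stopListening() {
    this.disconnect('STOP') // keep the microphone off once disconnected
    this.listening = false
  }

  resetTranscript() {
    this.disconnect('RESET') // abort, then restart the microphone
  }
}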
Pass resetTranscript to command callbacks
Command callbacks do not have access to the resetTranscript function, as they have to be defined before resetTranscript is returned by useSpeechRecognition. This PR addresses the problem by passing resetTranscript to the command callbacks as the last argument. It is put inside an object to allow further properties to be passed in the future. Intended to address #53
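A minimal sketch of the resulting callback signature, based on the description above: for a command with no splats or named variables, the options object is the callback's only argument.
const commands = [
  {
    command: 'clear',
    // The object passed as the final argument carries resetTranscript
    callback: ({ resetTranscript }) => resetTranscript()
  }
]
const { transcript } = useSpeechRecognition({ commands })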
Fuzzy matching
Contribution from @urbanL1fe
Adds two new properties that can be set on each command object, with the Dice coefficient used under the hood to compare strings:
- isFuzzyMatch: Boolean that determines whether the comparison between speech and command is based on similarity rather than an exact match. Fuzzy matching is useful for commands that are easy to mispronounce or be misinterpreted by the Speech Recognition engine (e.g. names of places, sports teams, restaurant menu items). It is intended for commands that are string literals without special characters. If command is a string with special characters or a RegExp, it will be converted to a string without special characters when fuzzy matching. The similarity that is needed to match the command can be configured with fuzzyMatchingThreshold. isFuzzyMatch is false by default. When it is set to true, it will pass three arguments to callback:
  - The value of command
  - The speech that matched command
  - The similarity between command and the speech
- fuzzyMatchingThreshold: If the similarity of speech to command is higher than this value when isFuzzyMatch is turned on, the callback will be invoked. You should set this only if isFuzzyMatch is true. It takes values between 0 (will match anything) and 1 (needs an exact match). The default value is 0.8.
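A minimal sketch of a fuzzy command based on the description above (the command phrase and threshold are illustrative):
const commands = [
  {
    command: 'Scunthorpe United',
    // Invoked when the spoken phrase is at least 70% similar to the command
    callback: (command, spokenPhrase, similarityRatio) =>
      console.log(`${spokenPhrase} matched ${command} (similarity: ${similarityRatio})`),
    isFuzzyMatch: true,
    fuzzyMatchingThreshold: 0.7
  }
]
const { transcript } = useSpeechRecognition({ commands })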
Code coverage threshold: 80%
Update package.json description
v3: Commands
Migrating from v2 to v3
v3 makes use of React hooks to simplify the consumption of react-speech-recognition:
- Replacing the higher order component with a React hook
- Introducing commands, functions that get executed when the user says a particular phrase
- A clear separation between all parts of react-speech-recognition that are global (e.g. whether the microphone is listening or not) and local (e.g. transcripts). This makes it possible to have multiple components consuming the global microphone input while maintaining their own transcripts and commands
- Requires React 16.8
- Adds code coverage step to CI
- Some default prop values have changed so check those out below
The original Dictaphone example
In v2
import React, { Component } from "react";
import PropTypes from "prop-types";
import SpeechRecognition from "react-speech-recognition";

const propTypes = {
  // Props injected by SpeechRecognition
  transcript: PropTypes.string,
  resetTranscript: PropTypes.func,
  browserSupportsSpeechRecognition: PropTypes.bool
};

const Dictaphone = ({
  transcript,
  resetTranscript,
  browserSupportsSpeechRecognition
}) => {
  if (!browserSupportsSpeechRecognition) {
    return null;
  }

  return (
    <div>
      <button onClick={resetTranscript}>Reset</button>
      <span>{transcript}</span>
    </div>
  );
};

Dictaphone.propTypes = propTypes;

export default SpeechRecognition(Dictaphone);
In v3
import React, { useEffect } from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const { transcript, resetTranscript } = useSpeechRecognition()

  useEffect(() => {
    SpeechRecognition.startListening({ continuous: true })
  }, [])

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div>
      <button onClick={resetTranscript}>Reset</button>
      <p>{transcript}</p>
    </div>
  )
}

export default Dictaphone
autoStart
This was a global option in v2 that would cause the microphone to start listening from the beginning by default. In v3, the microphone is initially turned off by default. It can be turned on when your component first renders, using useEffect if you're using hooks or componentDidMount if you're still using class components. It is recommended that you do this close to the root of your application, as this affects global state.
useEffect(() => {
  SpeechRecognition.startListening({ continuous: true })
}, []);
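For class components, a minimal sketch of the equivalent, assuming React and SpeechRecognition are imported as in the examples above:
class Dictaphone extends React.Component {
  componentDidMount() {
    // Turn the microphone on once, when the component first mounts
    SpeechRecognition.startListening({ continuous: true })
  }

  render() {
    return null
  }
}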
continuous
This was another global option in v2 that, by default, had the microphone permanently listen to the user, even after they finished speaking. This default behaviour did not match the most common usage pattern, which is to use react-speech-recognition for "press to talk" buttons that stop listening once a command has been spoken.
continuous is now an option that can be passed to SpeechRecognition.startListening. It is false by default, but can be overridden like so:
SpeechRecognition.startListening({ continuous: true })
clearTranscriptOnListen
This is a new prop in v3 that is passed into useSpeechRecognition from the consumer. Its default value makes a subtle change to the previous behaviour. When continuous was set to false in v2, the transcript would not be reset when the microphone started listening again. clearTranscriptOnListen changes that, clearing the component's transcript at the beginning of every new discontinuous speech. To replicate the old behaviour, this can be turned off when passing props into useSpeechRecognition:
const { transcript } = useSpeechRecognition({ clearTranscriptOnListen: false })
Injected props
SpeechRecognition used to inject props into components in v2. These props are still available, but in different forms.
transcript
This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.
resetTranscript
This is now a function returned by useSpeechRecognition. It only resets the component's transcript, not any global state.
startListening
This is now available as SpeechRecognition.startListening.
stopListening
This is now available as SpeechRecognition.stopListening.
abortListening
This is now available as SpeechRecognition.abortListening.
browserSupportsSpeechRecognition
This is now available as the function SpeechRecognition.browserSupportsSpeechRecognition.
listening
This is now state returned by useSpeechRecognition. This is the global listening state.
interimTranscript
This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.
finalTranscript
This is now state returned by useSpeechRecognition. This transcript is local to the component using the hook.
recognition
This is now returned by the function SpeechRecognition.getRecognition.
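Taken together, a minimal sketch of the v3 equivalents described above, to be used inside a React component as in the examples on this page:
// Local state returned by the hook
const {
  transcript,
  interimTranscript,
  finalTranscript,
  resetTranscript,
  listening
} = useSpeechRecognition()

// Global microphone controls and queries on the SpeechRecognition object
SpeechRecognition.startListening({ continuous: true })
SpeechRecognition.stopListening()
SpeechRecognition.abortListening()
SpeechRecognition.browserSupportsSpeechRecognition()
SpeechRecognition.getRecognition()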
Commands
To respond when the user says a particular phrase, you can pass in a list of commands to the useSpeechRecognition hook. Each command is an object with the following properties:
- command: This is a string or RegExp representing the phrase you want to listen for
- callback: The function that is executed when the command is spoken
- matchInterim: Boolean that determines whether "interim" results should be matched against the command. This will make your component respond faster to commands, but also makes false positives more likely - i.e. the command may be detected when it is not spoken. This is false by default and should only be set for simple commands.
Command symbols
To make commands easier to write, the following symbols are supported:
- Splats: this is just a * and will match multi-word text:
  - Example: 'I would like to order *'
  - The words that match the splat will be passed into the callback, one argument per splat
- Named variables: this is written :<name> and will match a single word:
  - Example: 'I am :height metres tall'
  - The one word that matches the named variable will be passed into the callback
- Optional words: this is a phrase wrapped in parentheses ( and ), and is not required to match the command:
  - Example: 'Pass the salt (please)'
  - The above example would match both 'Pass the salt' and 'Pass the salt please'
Example with commands
import React, { useState } from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const [message, setMessage] = useState('')
  const commands = [
    {
      command: 'I would like to order *',
      callback: (food) => setMessage(`Your order is for: ${food}`)
    },
    {
      command: 'The weather is :condition today',
      callback: (condition) => setMessage(`Today, the weather is ${condition}`)
    },
    {
      command: 'My top sports are * and *',
      callback: (sport1, sport2) => setMessage(`#1: ${sport1}, #2: ${sport2}`)
    },
    {
      command: 'Pass the salt (please)',
      callback: () => setMessage('My pleasure')
    },
    {
      command: 'Hello',
      callback: () => setMessage('Hi there!'),
      matchInterim: true
    }
  ]
  const { transcript } = useSpeechRecognition({ commands })

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div>
      <p>{message}</p>
      <p>{transcript}</p>
    </div>
  )
}

export default Dictaphone
Add tests
v2.1.4