add a kaldi rule (at least for sleep/wake) #797

kendonB · 2020-04-28T11:13:39Z

Is your feature request related to a problem? Please describe.
Dragon has a bunch of built in commands that help with use. It would be nice for switchers to have a Kaldi rule loosely based on the functionality available in base Dragon.

Describe the solution you'd like
A Kaldi grammar with the following features:

Sleep/wake
An automatic program opener. In Dragon, you say "open " and it automatically finds and opens the program you want. It seems to be very good at automapping these utterances to program .exe's. I'm not sure how it works.
An automatic program switcher. In Dragon, you say "switch to ".
A universal button presser. I think this is just a matter of adding the buttons that are missing in hit in nav.py.
A "maximize window" command that works correctly.

To add sleep/wake we need to translate the grammar in this file into caster.

@daanzu if you have any pointers or know of someone who has done this please let us know.

LexiconCode · 2020-04-28T12:54:55Z

This isn't too hard to implement, and it would be relevant to all engines, not just Kaldi. Kaldi makes this work with set_exclusiveness(). An exclusive grammar takes precedence over all other active grammars: while any grammar is exclusive, only rules in exclusive grammars are available for recognition.

Note that setting a grammar to exclusive overrides DNS's built-in sleep/wake function. When using DNS, call natlink.setMicState("sleeping") so the mic state matches the state of the grammar.
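
To picture the exclusiveness behaviour described above, here is a toy model in plain Python. It is not dragonfly code: `active_for_recognition` and the name-to-flag dictionary are made up for illustration, and the real engines decide this internally.

```python
def active_for_recognition(grammars):
    """Return the names of grammars that can currently be recognized.

    `grammars` maps a grammar name to its exclusive flag (True/False).
    This layout is an assumption for the sketch, not dragonfly's API.
    """
    exclusive = [name for name, is_exclusive in grammars.items() if is_exclusive]
    # If any grammar is exclusive, only the exclusive ones stay available.
    return exclusive if exclusive else list(grammars)


awake = {"navigation": False, "sleep": False}
asleep = {"navigation": False, "sleep": True}
print(active_for_recognition(awake))
print(active_for_recognition(asleep))
```

With no exclusive grammar everything is recognized; once the sleep grammar is flagged, it is the only one left, which is what makes a "wake" command the sole thing the engine will hear.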

lexxish · 2020-04-29T14:11:57Z

This potential isn't too hard to implement. This would be relevant to all engines not just Kaldi. How Kaldi manages to make this work set_exclusiveness(). An exclusive grammar takes precedent over all other active grammars. A Rule/grammars that are exclusive are only available for recognition.

Note setting a grammar to exclusive overrides DNS's built-in sleep-wake function. When using DNS "natlink.setMicState("sleeping")" setting to match the state of the grammar.

A FuncContext with a mapping rule could work here.

Overriding the DNS default commands for sleep/wake would be nice because you can use shorter commands such as "snore" to put the mic to sleep. Only downside is I'm not sure if we could get the taskbar icon toggling from green to blue. I think the Kaldi implementation is more important regardless.

LexiconCode · 2020-04-29T14:35:54Z

Only downside is I'm not sure if we could get the taskbar icon toggling from green to blue.

Fortunately, I believe this can be handled by natlink.setMicState(state). According to the documentation, it controls the mic, where state is 'on', 'off' or 'sleeping', and natlink.getMicState() returns the current state. Therefore the DNS icon could be kept in sync with the exclusive grammar state.
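
A rough sketch of that sync, assuming the two natlink calls behave as described above. `sync_mic_state` is a hypothetical helper, and `FakeMic` is a stand-in for the natlink module so the snippet runs without Dragon installed; with DNS you would pass the imported `natlink` module instead.

```python
def sync_mic_state(mic, grammar_is_exclusive):
    """Make the DNS mic state mirror the sleep grammar; return the resulting state."""
    wanted = "sleeping" if grammar_is_exclusive else "on"
    # Only touch the mic when the state actually needs to change.
    if mic.getMicState() != wanted:
        mic.setMicState(wanted)
    return mic.getMicState()


class FakeMic:
    """Hypothetical stand-in for the natlink module so the sketch runs anywhere."""

    def __init__(self):
        self._state = "on"

    def getMicState(self):
        return self._state

    def setMicState(self, state):
        self._state = state


mic = FakeMic()
print(sync_mic_state(mic, True))   # sleep grammar became exclusive
print(sync_mic_state(mic, False))  # sleep grammar was released
```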

I agree, though, that the implementation is more important for WSR/Kaldi.

lexxish · 2020-04-29T15:15:05Z

I agree, though, that the implementation is more important for WSR/Kaldi.

Regarding Kaldi, would implementation involve changing content_loader.py or does this rule operate independent of how you load the other rules?

LexiconCode · 2020-04-29T16:21:50Z

Fortunately, we don't have to change anything in Caster to make grammars exclusive. It's a simple bool, and it works on any rule. The rule must already be loaded into the engine before it's set to be exclusive. Once set, no commands will be recognized except those in the exclusive rule or rules.

kendonB · 2020-05-11T00:37:46Z

@lexxish did you ever figure out getting sleep to work?

LexiconCode · 2020-05-13T21:04:51Z

@lexxish

With straight dragonfly this would be pretty easy. With Caster it's a bit different, because we don't know the grammar name being used, as it's different on every boot. I've been working on programmatically switching DNS modes in preparation for creating a unified mode manager for all engines. The following could be used in the sleep grammar.

from dragonfly import get_engine

grammar_cache = None  # module-level, so the search only runs until a match is found

def find_grammar_name():
    """Return the grammar that holds the exported "Mode Rules" rule, or None."""
    global grammar_cache
    if grammar_cache is None:
        for grammar in get_engine().grammars:
            for rule in grammar.rules:
                if rule.exported and rule.name == "Mode Rules":  # Rule name
                    print(rule.name)
                    grammar_cache = grammar
                    return grammar_cache
    return grammar_cache

In another function, you can then call set_exclusiveness(0) or set_exclusiveness(1) on the grammar returned above to toggle exclusiveness.
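
As a sketch of what that other function might look like: `SleepToggle` is a hypothetical wrapper, and `StubGrammar` only mimics the one method used here so the example runs without an engine. In Caster you would hand it the grammar object found above.

```python
class SleepToggle:
    """Flip a grammar between exclusive (asleep) and non-exclusive (awake)."""

    def __init__(self, grammar):
        self.grammar = grammar
        self.asleep = False

    def toggle(self):
        self.asleep = not self.asleep
        # 1 makes the grammar exclusive, 0 releases it again.
        self.grammar.set_exclusiveness(1 if self.asleep else 0)
        return self.asleep


class StubGrammar:
    """Hypothetical stand-in for a dragonfly Grammar; records the last flag only."""

    def __init__(self):
        self.exclusive = 0

    def set_exclusiveness(self, exclusive):
        self.exclusive = exclusive


toggle = SleepToggle(StubGrammar())
print(toggle.toggle())  # go to sleep
print(toggle.toggle())  # wake up
```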

You can also check the running engine type if there are differences that need to be handled based on the engine implementation. For example, with DNS:

if get_engine()._name == 'natlink':
    import natlink
    # Do something

LexiconCode · 2020-05-13T21:25:23Z

* A "maximize window" command that works correctly.

What's wrong with the current behavior @kendonB?

An automatic program switcher. In Dragon, you say "switch to ".

Besides creating a GUI, the backend information could be obtained from a tweaked version of the function below that uses get_all_windows() and returns a list covering all windows (pids) instead of only Window.get_foreground().

import re
from pathlib import Path
from dragonfly import Window

FILENAME_PATTERN = re.compile(r"[/\\]([\w_ ]+\.[\w]+)")

def get_active_window_info():
    '''Returns foreground window executable_file, executable_path, title, handle, classname'''
    window = Window.get_foreground()
    # get_active_window_path() is the existing Caster helper, assumed to be in scope here.
    executable_path = str(Path(get_active_window_path()))
    match_object = FILENAME_PATTERN.findall(window.executable)
    # Take the last match so a dotted directory name is not mistaken for the file.
    executable_file = match_object[-1] if match_object else None
    return [executable_file, executable_path, window.title, window.handle, window.classname]
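
One way the "all windows" variant could look, as a sketch only. `collect_window_info` and `WindowInfo` are hypothetical names, and the function takes any objects exposing `executable`, `title` and `handle` (the attributes used above), so it can be tried without dragonfly. It also swaps the regex for `ntpath.basename`, which is a change of technique rather than what Caster does today.

```python
import ntpath
from collections import namedtuple

WindowInfo = namedtuple("WindowInfo", "executable_file executable_path title handle")


def collect_window_info(windows):
    """Build one WindowInfo per window that has both an executable and a title."""
    infos = []
    for window in windows:
        if not window.executable or not window.title:
            continue  # nothing useful to match a spoken name against
        infos.append(WindowInfo(
            executable_file=ntpath.basename(window.executable),  # handles / and \
            executable_path=window.executable,
            title=window.title,
            handle=window.handle,
        ))
    return infos


# Fake windows standing in for what get_all_windows() would return.
FakeWindow = namedtuple("FakeWindow", "executable title handle")
sample = [
    FakeWindow(r"C:\Program Files\Mozilla Firefox\firefox.exe", "Mozilla Firefox", 101),
    FakeWindow("", "", 102),
]
for info in collect_window_info(sample):
    print(info.executable_file, info.title, info.handle)
```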

lexxish · 2020-05-13T23:01:47Z

@lexxish did you ever figure out getting sleep to work?

I have not tried yet. Will update you all if I do.

I do have some "switch to" like code I can post if anyone wants it. I use a phonetic distance library to choose the best match based on what is currently running. Also have "open" like command that searches a couple directories (e.g. desktop)...it's not perfect and I think the way "bring" allows you to specify programs is also nice for things you use a lot.
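
To give a feel for the matching step, here is a minimal sketch. It is not the code mentioned above: it uses the standard library's `difflib`, which compares spelling rather than pronunciation, as a stand-in for a phonetic distance library. `best_window_match` and the sample program names are made up for illustration.

```python
import difflib


def best_window_match(spoken, running_programs, cutoff=0.6):
    """Return the running program whose name is closest to the spoken text, or None."""
    # Compare lowercased names without their file extension.
    candidates = {name.rsplit(".", 1)[0].lower(): name for name in running_programs}
    matches = difflib.get_close_matches(
        spoken.lower(), list(candidates), n=1, cutoff=cutoff
    )
    return candidates[matches[0]] if matches else None


running = ["chrome.exe", "firefox.exe", "notepad.exe"]
print(best_window_match("fire fox", running))
print(best_window_match("zzz", running))
```

The cutoff keeps a poor recognition from switching to an unrelated window; a real phonetic measure would likely need a different threshold.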

Another item that would be nice would be ability to use Kaldi for commands, but DNS for dictation - similar to how I believe Kaldi can be used with Google Speech Recognition.

Last item that would be nice to have (but deserves its own issue number) is integration with accessibility APIs like DNS has. So you can say things like "Click X" when X is a button in a browser.

lexxish · 2020-05-13T23:07:35Z

* A "maximize window" command that works correctly.
What's wrong with the current behavior @kendonB?

I could be wrong, but I think Caster's default maximize uses "alt+SPACE, x" to maximize rather than sending the foreground window a maximize message (https://docs.microsoft.com/en-us/windows/win32/learnwin32/window-messages). I don't think "alt+SPACE, x" works for every application, but I can't think of a specific one right now. The same type of scenario exists for closing windows in Caster too, I believe, where we could send SIGTERM and/or SIGKILL message equivalents (probably two different voice commands) instead of using keyboard shortcuts, and it would (hopefully) work more consistently.
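
For reference, a sketch of what "sending a maximize message" could mean on Windows, using only `ctypes`. `maximize_foreground_window` is a hypothetical helper and this is not how Caster does it; it posts `WM_SYSCOMMAND` with `SC_MAXIMIZE` to the foreground window and simply returns False on other platforms.

```python
import sys

WM_SYSCOMMAND = 0x0112  # Win32 message carrying system-menu commands
SC_MAXIMIZE = 0xF030    # wParam value asking the window to maximize


def maximize_foreground_window():
    """Post a maximize request to the foreground window; return True if it was posted."""
    if sys.platform != "win32":
        return False  # the Win32 API is only available on Windows
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    # Declare the handle types so 64-bit window handles are not truncated.
    user32.GetForegroundWindow.restype = wintypes.HWND
    user32.PostMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
    ]
    hwnd = user32.GetForegroundWindow()
    if not hwnd:
        return False  # no foreground window at the moment
    return bool(user32.PostMessageW(hwnd, WM_SYSCOMMAND, SC_MAXIMIZE, 0))
```

A close command could follow the same shape with a different message, though whether an application honours it is up to that application.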

LexiconCode · 2020-05-13T23:14:52Z

I could be wrong, but I think Caster's default maximize uses "alt+SPACE, x" to maximize rather than sending the foreground window a maximize message

Back when implementing Kaldi support, I switched it from "alt+SPACE, x" to dragonfly's cross-platform implementation, which uses Win32 on Windows. If something's not behaving correctly with those minimize/maximize commands, let me know.

Caster/castervoice/rules/core/navigation_rules/window_mgmt_rule.py

Line 13 in 7d3834e

'maximize':

Caster/castervoice/lib/utilities.py

Line 77 in 7d3834e

def maximize_window():

LexiconCode · 2020-05-13T23:24:29Z

Last item that would be nice to have (but deserves its own issue number) is integration with accessibility APIs like DNS has. So you can say things like "Click X" when X is a button in a browser.

I will open up a new issue. Done #814

daanzu · 2020-05-14T01:11:56Z

Another item that would be nice would be ability to use Kaldi for commands, but DNS for dictation - similar to how I believe Kaldi can be used with Google Speech Recognition.

I don't have experience with Natlink, and don't currently have Dragon installed, but I'd be happy to help implementing this. Is there a way with Natlink to just get straight dictation recognition text from audio data passed to it? daanzu/kaldi-active-grammar#23

LexiconCode · 2020-05-14T01:41:40Z

Perhaps there should be an issue in KaldiAG for working on this?

Agreed

LexiconCode · 2020-05-20T20:52:31Z

@lexxish and @kendonB I will attempt to implement the sleeping grammar and modes for all engines. These modes will override DNS's built-in modes but will be kept in sync with the DNS GUI.

LexiconCode · 2021-04-03T01:25:23Z

#881 addresses the following request.

An automatic program switcher. In Dragon, you say "switch to ".

LexiconCode added the "New Feature" label (a new feature that is not currently implemented) on Apr 28, 2020

LexiconCode added the "Kaldi" (related to Kaldi speech recognition backend) and "Windows Speech Recognition" (Windows Speech Recognition backend) labels on May 14, 2020

LexiconCode mentioned this issue May 16, 2020: Added change engine mic state and mode on caster startup for DNS #815 (Merged, 17 tasks)

LexiconCode mentioned this issue Oct 4, 2020: Implemented microphone modes #853 (Merged, 17 tasks)

LexiconCode mentioned this issue Apr 3, 2021: Implemented Window Switch Manager #881 (Open, 13 tasks)
