Hobby Command Line

Hobby is a dangerous app.

This project came out of using the Vosk API to turn speech into text and then executing a Lua script chosen by the first word.

After reading the safety warnings, see the User Guide for more helpful information.

Safety Warnings

  • When Hobby is active, anyone can control your phone by voice in the same ways you can.
  • Lua can read and write files, which may present some danger.
    • The app does not have universal file access, because it has user-accessible scripts and malicious scripts could be installed.
  • Lua can dispatch arbitrary Intents (communicate with active apps) and start arbitrary applications. I don't know the full breadth of what this entails, but there are intents that request money transfers that I haven't special-cased out yet.

Why Lua

Lua is slightly safer and easier to work with than the original Scheme implementation was. I found it unappetizing to develop new features when the app was written in Scheme.

Guide for Speaking to Hobby / Writing Hobby Actions

Just as a Linux command line starts with a command like "cat", a Hobby command should start with the action name. That name is used to look up a Lua file, and the Lua script is responsible for deciding which intent to fire off.

For instance "Play music" will be interpreted as:

Action: Play Arguments: "play", "music"

and will be handled by "lua/actions/play.lua"
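The lookup described above can be sketched in a few lines. This is only an illustration written in Python, not Hobby's actual code: the function name `parse_command` is hypothetical, and the only details taken from this README are the "first word is the action" rule, the argument list that includes the action word, and the `lua/actions/` path.

```python
from pathlib import PurePosixPath

# Directory named in the example above; everything else here is illustrative.
ACTIONS_DIR = PurePosixPath("lua/actions")


def parse_command(utterance):
    """Split recognized text into (action, arguments, script path)."""
    words = utterance.lower().split()
    if not words:
        return None  # nothing was recognized, so there is nothing to dispatch
    action = words[0]
    # The README's example passes every word, including the action itself.
    arguments = words
    script = ACTIONS_DIR / f"{action}.lua"
    return action, arguments, str(script)


print(parse_command("Play music"))
# ('play', ['play', 'music'], 'lua/actions/play.lua')
```

Because the script is chosen from the first word alone, the lookup is a single path construction rather than a search over possible phrasings.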

Actions are handled this way because it is efficient, quick, and predictable. Some exceptions can be made for filler words: you can say "Set an Alarm for ten PM" instead of having to say "Alarm ten PM". That phrase goes to set.lua, which then redirects to alarm.lua. Hobby is not intended to interpret every way something can be said, but users can make aliases for the common ways they say things to make dealing with Hobby easier.
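As a rough sketch of how a redirect and a user alias could both resolve to the same action, assuming hypothetical tables and function names (`FILLER_WORDS`, `ALIASES`, `redirect_set`, `resolve`) that are not taken from Hobby's scripts:

```python
# Hypothetical tables; Hobby's real set.lua and alias handling may differ.
FILLER_WORDS = {"a", "an", "the", "my"}
ALIASES = {"wake me at": "alarm for"}  # user phrase -> canonical command prefix


def expand_alias(text):
    """Replace a leading alias phrase with its canonical form."""
    for phrase, replacement in ALIASES.items():
        if text == phrase or text.startswith(phrase + " "):
            return replacement + text[len(phrase):]
    return text


def redirect_set(words):
    """What a set.lua-style script might do: drop 'set' and fillers, then hand off."""
    rest = words[1:]
    while rest and rest[0] in FILLER_WORDS:
        rest = rest[1:]
    return rest


def resolve(utterance):
    """Return the word list that the final action script would receive."""
    words = expand_alias(utterance.lower()).split()
    if words and words[0] == "set":
        words = redirect_set(words)
    return words


print(resolve("Set an Alarm for ten PM"))  # ['alarm', 'for', 'ten', 'pm']
print(resolve("Wake me at ten PM"))        # ['alarm', 'for', 'ten', 'pm']
```

Both phrasings end up with "alarm" as the first word, so the same first-word lookup still decides which script runs.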

Replacing the system Assistant

I don't currently have documentation for this, but it is possible to replace the squeeze action without root. It will require a screen tap if the phone was locked; otherwise the microphone will not enable. It did require me to run an ADB command from developer mode, and I wouldn't recommend you do this unless you know what you're doing.

Goals

Basically I would like this to do everything I want to speak to my phone about, or do with my eyes closed.

  • Offline Speech to Text
    • Vosk is an offline model; if you notice this app making network calls, file an issue.
    • As speech recognition is done offline, you will need a phone capable of running a model efficiently. I'm using a Pixel 3.
  • Offer speech to text via an intent?
  • Settings panel
  • Implement voice interaction API
  • Play/Pause/Stop Music (implemented common interface, but most apps ignore it).
    • "Stop"
    • "Play"
    • "Previous"
    • "Next"
  • Take a note
    • "Note [title]"
      • [body]
    • "Stop" to finish note
    • This also lets you tweet/discord/telegram
    • Unfortunately I had to go through the app picker as no one implemented the CREATE_NOTE API described by Google.
  • Reminder API
  • Make a calendar event
    • Calendar event on (date) [start (time) end (time)] [named (name)]
      • [from, to] is unfortunately hard to parse because "to" could be heard as "two"
  • Manage Alarms/Timer
    • Timer
      • Timer [for] [3 hours] [22 minutes] [and] [30 seconds]
      • Timers have a max of 24 hours
    • Alarm
    • "Stop"
    • "Alarm for (Time) [on (Date)|Daily|Weekly] [named (Name)] [repeat]"
    • "Alarm"
      1. "Please say the alarm time"
      2. "Do you want this alarm to repeat 'Daily', 'Weekly', or 'Not' ('Activate' to skip)"
      3. "Alarm name? ('Activate' to skip)"
  • Conversations of more than one line, if the Action needs more clarification, or is taking a note for you.
  • User Scripts
    • /Android/data/org.hobby.voicecommandline/files/lua
  • Script execution in the background.
  • Credit screen to thank Vosk / Lua
    • a "license" action exists, although I would like something more
      • so that users can have easy access to licensing info
  • Custom failure behavior
    • If Hobby doesn't understand you, I would like it to be a user preference whether the app will attempt to detect speech again, or simply close. Probably give an option to choose the number of retries.
  • Cancel words / button.
    • Say "stop stop stop" (at least 3 stops in a row) to cancel out of whatever action Hobby is doing.
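
The cancel rule at the end of the list above (at least three "stop"s in a row) amounts to counting a run of consecutive matching words. A minimal sketch in Python, where the function name `is_cancel` and its parameters are hypothetical and not taken from Hobby's code:

```python
def is_cancel(utterance, word="stop", needed=3):
    """Return True if `word` occurs at least `needed` times in a row."""
    run = 0
    for token in utterance.lower().split():
        # Extend the current run on a match; any other word resets it.
        run = run + 1 if token == word else 0
        if run >= needed:
            return True
    return False


print(is_cancel("stop stop stop"))        # True
print(is_cancel("stop music stop stop"))  # False
```

Requiring a consecutive run keeps a single "Stop" free for its normal meaning, such as stopping music or finishing a note.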

Why did you name it that

I thought it would not be wise to call it Robby, and building things like this is my hobby. Feel free to call it whatever you want; it doesn't activate on a trigger word. I personally launch it by squeezing my phone. If I ever allow it to recognize its name, I will include the ability to override it.

Intents

Most intents are not implemented by this app at the moment, but the list of things you could possibly implement is contained in this section.

Music

Other

Licensing

  • Alphacephei's Vosk API, Apache 2.0
    • They offer an enterprise edition if you think the speech to text is good for your company at https://alphacephei.com/en/
    • They offer improved accuracy too, I'm just using freely available models
  • vosk-model-small-en-us-0.15, Apache 2.0 Alphacephei
    • There are better models if you want at the cost of more disk space
    • There are also other languages. I'm using English because I can only choose one language to package (doing more would waste space), so if you want your language supported, make a build using the appropriate model and redo the code that decides which intents to create.
    • https://alphacephei.com/vosk/models
  • Lua license (MIT)
  • The sounds
    • I made them using zynaddsubfx and Ardour
  • All the extra code I wrote is AGPL (GNU AFFERO GENERAL PUBLIC LICENSE Version 3, 19 November 2007), with the caveat that you are not allowed to sell this app. You may distribute this app for free, in compliance with the AGPL, on side-loading app stores as is. You may only distribute this app on non-side-loaded app stores like the Android Store if it is free, is under a different name, and complies with the AGPL.
  • If you are the Dicio project, or Music Player GO, you may use my code under the same GPL license you use.