Supporting AI/NLP & text mining based workflows #1337

dmrd · 2021-06-11T05:03:34Z

After some initial discussion on the community calls, I want to start a discussion here about what it would look like to integrate natural language tools, ranging from semantic search to GPT-style language models (LMs), into Athens. I mainly talk about language models below, but there's no reason to limit exploration & design to this class of models: there is a lot of room for much simpler solutions (many without any learning at all!) to support Athens users in writing and creating.

The core point that I believe makes this useful is that discrimination is easier than generation (obligatory reference to that one Ira Glass quote).

A couple of examples of what text completion can look like in writing & creative endeavors:
- Sudowrite applies this to creative writing: when you encounter writer's block, the model proposes possible completions - editing/choosing is often easier than generating!
- Idea crossover & brainstorming: Given a set of blocks, what ideas are similar? Different? This is illustrated on words (rather than whole sentences) in learned word embeddings, where word analogies can be carried out in vector space, such as king - man + woman = queen
- David Bieber has a nice post on experimenting with GPT3, which naturally begins with using GPT3 to generate more ideas on what to use it for: https://davidbieber.com/snippets/2020-07-22-writing-with-gpt3/
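
To make the word analogy example concrete, here is a minimal sketch of the vector arithmetic. The 4-dimensional vectors are hand-made toy values (an assumption for illustration only, not learned embeddings), and `analogy` is a hypothetical helper name. With a real model the vectors would come from trained embeddings and the match would be approximate rather than exact.

```python
import math

# Toy, hand-made vectors (NOT learned embeddings).
# Assumed dimensions: [royalty, maleness, femaleness, food]
TOY_VECTORS = {
    "king":  [0.9, 0.9, 0.1, 0.0],
    "queen": [0.9, 0.1, 0.9, 0.0],
    "man":   [0.1, 0.9, 0.1, 0.0],
    "woman": [0.1, 0.1, 0.9, 0.0],
    "apple": [0.0, 0.1, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman"))  # queen
```

Excluding the three input words from the candidates is the usual convention for this kind of query, since otherwise the nearest neighbour is often one of the inputs.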

In terms of implementation, one can directly integrate models using web APIs, such as GPT3 through OpenAI and many smaller models through Huggingface. Models somewhere between GPT2 & GPT3 are starting to be released publicly (see: GPT-J).

For many use cases, applying ideas from search engines (semantic embeddings, knowledge mining) will seem just as magical: imagine a sidebar in Athens presenting you with the most relevant snippets across your DB as you write, even if they only overlap in semantics (meaning), not in surface form (the exact words you used). See the Semantica and MemNav writeup on the Psionica site (linked below).
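
A rough sketch of how such a sidebar could rank blocks follows. Everything here is an assumption for illustration: `embed` is a toy stand-in that maps words to hand-picked concepts via the hypothetical `CONCEPTS` table, where a real setup would call a sentence-embedding model, and `related_blocks` is a made-up name, not an Athens API.

```python
import math
from collections import Counter

# Hypothetical concept lexicon standing in for a learned embedding model.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle",
    "dinner": "food", "recipe": "food",
    "essay": "writing", "draft": "writing",
}

def embed(text):
    """Toy embedding: count the concepts that the words in `text` map to."""
    return Counter(CONCEPTS[w] for w in text.lower().split() if w in CONCEPTS)

def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)  # Counter yields 0 for missing keys
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def related_blocks(draft, blocks, top_k=3):
    """Return up to `top_k` blocks most similar in meaning to `draft`."""
    query = embed(draft)
    scored = [(cosine(query, embed(block)), block) for block in blocks]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [block for _, block in scored[:top_k]]

blocks = [
    "recipe ideas for a quick dinner",
    "thoughts on buying an automobile",
    "draft outline for the essay",
]
print(related_blocks("which car should I get", blocks))
```

The query and the returned block ("thoughts on buying an automobile") share no words; they match only because "car" and "automobile" land on the same concept, which is the behaviour a real embedding model would provide at much larger scale.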

I recommend the writing and experiments that @paulbricman has been doing in his Psionica project, which explore ways to integrate simple natural language tools to augment various aspects of writing and using a personal knowledge base.
- Dual | Psionica: formulates a language model agent as a knowledge base assistant, allowing you to teach it new skills ("turn this note into flashcards")
- K-Probes | Psionica: shows that simple pre-written prompts can be enough to help generation, no statistical methods required.
- Semantica | Psionica: uses semantic embeddings (like the word analogy example above) to generate options & ideas.
- MemNav | Psionica: applies text mining methods to a personal knowledge base. What if you could ask questions of your DB such as "Who have I talked to about Athens?"
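
As a sketch of how far a no-learning approach can get on a question like "Who have I talked to about Athens?", the snippet below scans blocks for a topic and tallies the people mentioned alongside it. It assumes people are written as `@handle` in block text (a convention borrowed from this thread, not something Athens or MemNav prescribes), and `people_mentioned_with` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: people appear in block text as @handle.
MENTION = re.compile(r"@(\w+)")

def people_mentioned_with(topic, blocks):
    """Count @mentions in blocks containing `topic` (case-insensitive substring match)."""
    counts = Counter()
    for block in blocks:
        if topic.lower() in block.lower():
            counts.update(MENTION.findall(block))
    return counts.most_common()

blocks = [
    "Call with @alice about Athens sync design",
    "Lunch with @bob",
    "@alice and @carol reviewed the Athens roadmap",
]
print(people_mentioned_with("Athens", blocks))  # [('alice', 2), ('carol', 1)]
```

A substring match misses paraphrases and synonyms, which is where the embedding-based approaches above would take over.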

I hope this can spark further discussion - no system takes these sorts of assistive tools seriously yet, and they will feel truly magical. What workflows do you aspire to in Athens, and what does it take to make them happen?

paulbricman · 2021-06-11T06:51:08Z

Thanks for the tag, @dmrd! Let me know if I can help with brainstorming NLP tools for Athens or with advice on the local NLP tech stack (beyond HuggingFace). All Psionica projects are open source and local-first, so supporting Athens on this path seems like an obvious way to go! One thing I'm personally considering is porting Dual from Obsidian to Athens.
