Engram-es Q & U positions #47
Comments
Dmitri -- Thank you for reaching out with suggestions to improve engram-es! The approach relies on certain assumptions about ergonomics and on a representative corpus from which bigrams are derived. You write at a particularly good time, as I am revisiting this project from scratch and am developing a data-driven approach that uses crowdsourced information and a new software pipeline. I will keep your suggestion in mind as I progress in this project.
Hi, https://github.com/culebron/rus-layout-opt/blob/master/Russian%20Optimization.ipynb
I've updated my code at the link above to make it more readable and configurable. Hope this helps.
Thank you, @culebron! I am taking a different, more data-driven approach this time, and will reach out for help if it comes back to scoring bigrams based on an algorithm or weighting scheme...
A follow-up email from @culebron on 27 Nov:
When I made several initial layouts, and finally sat down and tested them by typing real text (I added them to Ergodox and later to the OS on my laptop), I discovered a very inconvenient monogram (index finger in inner column upper row). It turned out I had no penalty for that and had abused this lacuna to get better scores. I thought, maybe rules cast in stone aren't the way to go? Because they promote aggressive optimizations and abuse of the rules. Maybe we should iterate with manual testing, discovering inconveniences, modifying rules, and re-evaluating previous improvements? On the other hand, it may lead to insanely complicated rules. This project (https://github.com/dariogoetz/keyboard_layout_optimizer) has 14 KINDS of rules! Configuration is enterprise-scale! The whole project is 8500 SLOC!!! Pure insanity and completely irreproducible. "Here, I got a great layout" -- "it's not scoring well on my system" -- "IDK, works for me." Also, regarding manual testing -- probably as we optimize the layout, issues become less obvious and test subjects won't be able to point a finger at anything particular. Hence...
From @culebron: Check this out. I prefer to stick with arbitrary rules, but check layouts with hands, and also visualize them, like this. Standard Russian layout. Key colors (viridis scale) = costs on particular keys. Arrow thickness = frequency, arrow color = price per press (price * freq = cost). Same for my last optimized layout:
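The relation in the legend above (price * freq = cost) can be sketched in a few lines of Python. This is only an illustration: the bigrams, prices, and frequencies are made-up values, not data from either project, and `bigram_costs` is a hypothetical helper name.

```python
# Hypothetical per-press prices and relative frequencies for a few bigrams.
# All numbers are invented for illustration only.
prices = {"QU": 3.0, "UE": 2.0, "EA": 1.0}
freqs = {"QU": 0.25, "UE": 0.5, "EA": 0.25}

def bigram_costs(prices, freqs):
    """Return the cost of each bigram: price per press times frequency."""
    return {bg: prices[bg] * freqs[bg] for bg in prices if bg in freqs}

costs = bigram_costs(prices, freqs)
total_cost = sum(costs.values())

# Most expensive bigrams first: these correspond to the arrows worth inspecting.
ranked = sorted(costs, key=costs.get, reverse=True)
print(ranked, total_cost)
```

A frequent bigram with a moderate price can cost more overall than a rare one with a high price, which is why the picture encodes frequency and price separately.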
Cool, @culebron. What are you using to generate the visuals? I like these!
@binarybottle I just sketched a visualization in QGIS, then reproduced it in Matplotlib/Pyplot code. It was tough.
I actually improved upon this, making comparison images, where color and size scales are the same for several layouts. Here's Sholes' №1 (QWERTY), Sholes' №2 (his last layout), Dvorak and Colemak. Sholes' №2 scores a bit worse than №1.
The following was an email sent to me on 26 Nov 2024:
Hi,
I've been trying to optimize the Russian layout and, while searching for scientific papers, found your project. (IDK why Google Scholar doesn't show it on top. All I found was irrelevant.) Great that someone did a serious research paper and published a notebook, thanks for your work! I'll try running the code on my experimental layouts to see how it ranks them.
But there's a problem. I took a close look at your Spanish layout and tried to imagine typing in Spanish (I looked at the letters and moved the appropriate fingers, imagining typing something like "y para entender lo que el ha propuesto..."). I found that Q & U are placed very inconveniently: words like "que", "fue", "fuera", "puede", "aquel" require quite awkward moves between the index finger and middle finger of the left hand.
My gut feeling told me they must be further outwards, because E/A/O often come after them, as in the words I mentioned. So I calculated a metric that I used in my research: how many incoming & outgoing bigrams the letter has on the same hand. If a bigram starts with the letter in question, it's "outgoing"; if it ends with it, it's "incoming". I counted the number of bigrams for the left-hand side of your Spanish layout. And it turns out I was right: the letter U wants to be much further to the left of E/A/O. But "Q" wants to be even further left.
You can run this code in your Jupyter notebook after the code that defines the bigram frequencies:
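import pandas as pd
# `bigrams` and `bigram_frequencies` are assumed to be defined earlier in the notebook.
d = pd.DataFrame({'bigram': bigrams, 'freq': bigram_frequencies})
d['l1'] = d.bigram.str[0]  # first letter of the bigram
d['l2'] = d.bigram.str[1]  # second letter of the bigram
# Letters on the left-hand side of the Spanish layout.
left = ['A', 'E', 'I', 'O', 'U', 'Z', 'H', 'P', 'F', 'X', 'Q', 'Y']
# Keep only bigrams typed entirely with the left hand.
d2 = d[d.l1.isin(left) & d.l2.isin(left)]
# Per letter: summed frequency as first letter (outgoing) and as second letter (incoming).
t2 = d2.groupby('l1').agg({'freq': 'sum'}).join(d2.groupby('l2').agg({'freq': 'sum'}), lsuffix='_out', rsuffix='_in').reset_index()
# Negative delta: the letter starts same-hand bigrams more often than it ends them.
t2['delta'] = t2.freq_in - t2.freq_out
t2.sort_values('delta')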
Output:
In my research, I used such Pandas queries quite a lot, and optimized consciously rather than using optimization algorithms. Another metric I had was "how much is the key connected with the keys in the same row" -- to see if some keys could be moved elsewhere, or should stay where they were.
I wonder why Q & U ended up where they are? Is it because the top-row ring finger & pinky are penalized?
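The email does not define the same-row metric exactly, so what follows is one possible reading, sketched under stated assumptions: for each letter, the share of its bigram frequency (incoming plus outgoing) whose other letter sits in the same row. The row assignment, the frequencies, and the name `same_row_share` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical row assignment and bigram frequencies (illustration only,
# not taken from engram-es or from the notebook linked in this thread).
rows = {"Q": "top", "U": "top", "E": "home", "A": "home", "O": "home"}
bigram_freq = {"QU": 4.0, "UE": 3.0, "EA": 2.0, "AO": 1.0}

def same_row_share(rows, bigram_freq):
    """For each letter, return the fraction of its bigram frequency
    that is shared with letters placed in the same row."""
    total = defaultdict(float)
    same = defaultdict(float)
    for bigram, freq in bigram_freq.items():
        a, b = bigram[0], bigram[1]
        if a not in rows or b not in rows:
            continue  # skip bigrams with letters outside the assignment
        for letter in (a, b):
            total[letter] += freq
            if rows[a] == rows[b]:
                same[letter] += freq
    return {letter: same[letter] / total[letter] for letter in total}

share = same_row_share(rows, bigram_freq)
for letter in sorted(share, key=share.get):
    print(letter, round(share[letter], 2))
```

Under this reading, a letter with a low share is only weakly tied to its current row and is a candidate for moving, while a share near 1 suggests it should stay where it is.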
Anyway, many thanks for posting your code. I see I was going in the right direction.
Best regards,
Dmitri