more qualities based on large data scraping #90

Open
eyaler opened this issue Mar 29, 2023 · 1 comment

eyaler commented Mar 29, 2023

Hi!

this is a follow-up on some of the comments in issue #34
this follow-up analysis is based on 117k songs from UltimateGuitarTabs (1960-2023) in the rock, pop, country and folk genres, totaling 9.9M chord instances and 6000 distinct chords. it is for my project https://github.com/eyaler/uku3le, which is currently being reworked.

These are the most common issues and qualities that fail parsing and seem to have sensible solutions:

  1. German notation uses H, Hm for B, Bm and this is relevant also for base chords.
  2. 7sus, 9sus without a following number. should probably be synonyms for 7sus4, 9sus4?
  3. Maj7, should be allowed for maj7
  4. mmaj7, mMaj7 should be allowed for mM7
  5. 7sus2 afaiu is (0, 2, 7, 10)
  6. (add9), m(add9), (maj7), m(maj7), same for maj9, (aug), (dim), (sus4), 7(sus4), (sus2), 7(sus2), (2), (4), (5), (7), (9), (11) -> remove brackets
  7. add2 and (add2) afaiu is (0, 2, 4, 7)
  8. 6sus2 afaiu is (0, 2, 7, 9)
  9. E# -> F, B# -> C, Fb -> E, Cb -> B, H# -> C, and also for base chords
  10. maj7sus2 is (0, 2, 7, 11)?
  11. maj7sus4 is (0, 5, 7, 11)?
  12. ends with + or +5 to denote aug
  13. ends with m+ or m+5 or m# or m#5 for (0, 3, 8)?
  14. ends with 7+ to denote 7+5
  15. maj7+5 (or Maj7+5) to denote (0, 4, 8, 11)?
  16. 6sus4 (or just 6sus) afaict is (0, 5, 7, 9)
  17. strip white space so e.g: "C " is "C", "A7 " is "A7"
  18. 7M is probably maj7
  19. 6add9 is 69? and m6add9 to m69
  20. madd11 to (0, 3, 7, 17)?
  21. sus7 is probably 7sus4?
  22. strip asterisks (*)
  23. replace ° or º with dim, also if the quality (ignoring the base) is just 'o'
  24. if the quality (ignoring the base) is just 'M' or 'mi' it is probably safe to assume it is 'm'
  25. fixing caps where obvious: ADD, Add -> add; MAJ, Maj -> maj, SUS...
  26. m(maj9) -> (0, 3, 7, 11, 14) ?
  27. add4add9, add9add4 -> (0, 4, 5, 7, 14)
  28. min7 -> m7
  29. ma7 -> maj7
  30. -5, (-5) -> omit5
  31. maj7#11, maj7+11 -> M7+11
  32. add#11 -> (0, 4, 7, 18) ?
  33. m13 -> (0, 3, 7, 10, 14, 21) ?
  34. s4 -> sus4
  35. 7add11 and 6add11 -> (0, 4, 7, 10, 17) and (0, 4, 7, 9, 17) ?
  36. i also take care of do/re/mi/fa/sol/la/si (case insensitive), which may be followed by #/b/7/m, and also as the base chord
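
The root-related items (1, 9 and 36) could be handled before the chord string ever reaches the parser. Below is a minimal sketch in plain Python, not PyChord code: the names `normalize_note`, `split_part` and `normalize_roots` are hypothetical, "base chord" is assumed to mean the note after the slash in a slash chord, and solfège names are only recognized when followed by nothing but #/b/m/7, so that "Fadd9" is not misread as "Fa" + "dd9".

```python
import re

# Hypothetical lookup tables for this sketch (not part of PyChord).
SOLFEGE = {"do": "C", "re": "D", "mi": "E", "fa": "F", "sol": "G", "la": "A", "si": "B"}
ENHARMONIC = {"E#": "F", "B#": "C", "Fb": "E", "Cb": "B"}

# Solfege root, optional accidental, then only m and/or 7 (item 36).
# Note: "Do" is read as solfege C here, although it could also mean D + 'o' (dim).
SOLFEGE_RE = re.compile(r"(do|re|mi|fa|sol|la|si)([#b]?)(m?7?)", re.IGNORECASE)
# Letter root (including German H), optional accidental, then anything.
LETTER_RE = re.compile(r"([A-H])([#b]?)(.*)")


def normalize_note(letter, accidental):
    """Map H to B (item 1), then fold E#, B#, Fb, Cb onto natural notes (item 9)."""
    if letter == "H":
        letter = "B"
    note = letter + accidental
    return ENHARMONIC.get(note, note)


def split_part(part):
    """Split one chord part into (normalized root, remaining quality text)."""
    m = SOLFEGE_RE.fullmatch(part)
    if m:
        root = normalize_note(SOLFEGE[m.group(1).lower()], m.group(2).lower())
        return root, m.group(3)
    m = LETTER_RE.fullmatch(part)
    if not m:
        raise ValueError(f"cannot find a root note in {part!r}")
    return normalize_note(m.group(1), m.group(2)), m.group(3)


def normalize_roots(chord):
    """Normalize the root and, for slash chords, the note after the slash."""
    head, slash, bass = chord.partition("/")
    root, quality = split_part(head)
    result = root + quality
    if slash:
        bass_root, bass_rest = split_part(bass)
        result += "/" + bass_root + bass_rest
    return result


if __name__ == "__main__":
    for name in ["Hm", "H#", "E#m7", "C/H", "Sol#m", "Fadd9"]:
        print(name, "->", normalize_roots(name))
```

Because H is mapped to B before the enharmonic lookup, H# -> C falls out of the same two steps without a special case.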

i could go on... but the above helped me reduce the song reject rate in my case from 6.3% to 1.1%

fixes may be required also in from_note_index()

of course, instead of dealing with each specific case, it would be useful to have generic normalization rules such as:

  • fixing caps where there is no ambiguity, e.g.: ADD, Add -> add
  • fixing strings where there is no ambiguity, e.g.: maj -> M
  • removing brackets where there is no ambiguity
  • ends with + or +5 -> aug
  • etc.

such generic rules (where there is no danger of ambiguity) would greatly help in maintaining the qualities table.
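
To make the idea of generic rules concrete, here is a rough sketch of a normalizer for the quality part only (the text after the root). It is plain Python with a hypothetical function name, `normalize_quality`, and it deliberately leaves purely numeric brackets such as (9) untouched, on the assumption that those are the cases where removing brackets could change the meaning.

```python
import re

# Keywords whose capitalization carries no meaning (item 25).
CASE_FIXES = re.compile(r"add|maj|sus|dim|aug", re.IGNORECASE)
# Brackets around a named modifier; numeric-only brackets like (9) are skipped.
NAMED_BRACKETS = re.compile(r"\((add\d+|maj\d+|aug|dim|sus[24]?)\)")


def normalize_quality(quality):
    """Apply generic, low-ambiguity rewrites to the quality part of a chord name."""
    q = quality.strip().replace("*", "")                 # items 17 and 22
    q = q.replace("°", "dim").replace("º", "dim")        # item 23
    q = CASE_FIXES.sub(lambda m: m.group(0).lower(), q)  # item 25
    q = NAMED_BRACKETS.sub(r"\1", q)                     # part of item 6
    q = re.sub(r"^min", "m", q)                          # item 28
    q = re.sub(r"^ma7", "maj7", q)                       # item 29
    if q in ("+", "+5"):                                 # item 12
        q = "aug"
    return q


if __name__ == "__main__":
    for raw in [" Maj7 ", "m(maj7)", "7(SUS4)", "(9)", "+5", "°", "min7", "7*"]:
        print(repr(raw), "->", repr(normalize_quality(raw)))
```

The output is only a cleaned-up spelling; whether a given result (for example "mmaj7") is a name the qualities table accepts is a separate lookup that this sketch does not attempt.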

disclaimer: i do not know anything about music or music theory.

yuma-m (Owner) commented Apr 1, 2023

Hello @eyaler, thank you for raising this issue. Let me give some general guidance on your findings.

  • Uncommon expressions should be handled using QualityManager in your project (e.g. 3, 18, 28, and 34).
  • Removing brackets sometimes changes the meaning of a chord. If a common expression is missing from qualities, it should be added to the DEFAULT_QUALITIES, otherwise please use QualityManager.
  • Please feel free to create a pull request to add missing common qualities (e.g. 5, 7, and 33).
  • Stripping some characters should be handled in the project that uses PyChord (e.g. 17 and 22).
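
As an illustration of keeping such handling on the project side, the sketch below holds a small table of extra qualities and spelling aliases taken from the list in this issue. It does not import PyChord: `register_extra_qualities` accepts any callable taking `(name, components)`, on the assumption that `QualityManager().set_quality` has that shape as shown in the PyChord README. All other names (`EXTRA_QUALITIES`, `ALIASES`, `resolve_alias`, `spell`) are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root, as proposed in the issue (not verified upstream).
EXTRA_QUALITIES = {
    "7sus2": (0, 2, 7, 10),  # item 5
    "add2": (0, 2, 4, 7),    # item 7
    "6sus2": (0, 2, 7, 9),   # item 8
}

# Uncommon spellings mapped to a more common one (items 3, 18, 28, 34).
ALIASES = {"Maj7": "maj7", "7M": "maj7", "min7": "m7", "s4": "sus4"}


def register_extra_qualities(set_quality):
    """Pass each extra quality to a callable taking (name, components).

    Assumed PyChord usage: register_extra_qualities(QualityManager().set_quality)
    """
    for name, components in EXTRA_QUALITIES.items():
        set_quality(name, components)


def resolve_alias(quality):
    """Rewrite an uncommon spelling; unknown spellings pass through unchanged."""
    return ALIASES.get(quality, quality)


def spell(root, components):
    """Turn semitone offsets into sharp-spelled note names, to sanity-check a table entry."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + c) % 12] for c in components]


if __name__ == "__main__":
    registry = {}  # stand-in for the library's quality table
    register_extra_qualities(registry.__setitem__)
    print(spell("C", registry["7sus2"]))  # ['C', 'D', 'G', 'A#']
    print(resolve_alias("7M"))            # maj7
```

Passing the setter in as a callable keeps the table testable without the library installed, and `spell` gives a quick way to eyeball whether an interval tuple produces the notes you expect.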

If you want to contribute or discuss individual items further, I would appreciate it if you could create separate pull requests and issues.
