################################################################################
# General
################################################################################
################################################################################
# Key/Chord detection
################################################################################
- Now that I've got some annotated examples (see $meter_vectors2), I should add
a test to measure the performance of key detection, to help make sure it
doesn't regress.
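  A minimal sketch of such a test, assuming a detect_key function and an
  annotated-example loader (both names are illustrative, not the real API):

    require 'minitest/autorun'

    class KeyDetectionRegressionTest < Minitest::Test
      MIN_ACCURACY = 0.90  # fail if detection accuracy drops below this floor

      def test_key_detection_accuracy
        examples = load_annotated_examples  # e.g. built from $meter_vectors2
        correct  = examples.count { |ex| detect_key(ex.notes) == ex.key }
        accuracy = correct.to_f / examples.size
        assert accuracy >= MIN_ACCURACY,
               "key detection accuracy fell to #{(accuracy * 100).round(1)}%"
      end
    end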
################################################################################
# Phrase detection
################################################################################
- Check out tools/test_autocorrel_phrase_detection.rb. I was trying to do meter
  detection. Following an idea I saw in a paper, I ended up with a fairly bad
  meter-detection algorithm: it kept showing an autocorrelation+entropy spike
  at 2-4 measures, which I assume is a phrase boundary.
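  A rough sketch of the autocorrelation part (this is not the code in
  tools/test_autocorrel_phrase_detection.rb; the per-beat representation is
  an assumption):

    # Treat the piece as one value per beat (e.g. MIDI pitch) and score each
    # lag by how well the sequence matches a shifted copy of itself.
    def autocorrelation(seq, lag)
      pairs = seq.zip(seq.drop(lag)).reject { |_, b| b.nil? }
      return 0.0 if pairs.empty?
      pairs.count { |a, b| a == b }.to_f / pairs.size
    end

    def peak_lag(seq, max_lag)
      (1..max_lag).max_by { |lag| autocorrelation(seq, lag) }
    end

    # A peak at a lag of 2-4 measures' worth of beats points at phrase
    # repetition rather than the meter, consistent with the spike above.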
- It still surprises the hell out of me that this isn't TRIVIAL. "My Bonnie
  lies over the ocean" is practically verbatim AAAA BA'BA'. Why isn't that a
  trivial win? Go back to printing out the details.
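  One way to get at that: on truly verbatim material, labeling each
  fixed-length segment by exact match against earlier segments recovers
  AAAA-style structure directly. A sketch, with a toy segment encoding:

    def label_segments(segments)
      seen = {}
      segments.map { |seg| seen[seg] ||= (65 + seen.size).chr }  # 'A', 'B', ...
    end

    label_segments(%w[CDE CDE CDE CDE])  # => ["A", "A", "A", "A"]

  Wherever the real pipeline's output diverges from that is where the
  non-triviality lives.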
- A very different (and more cognitively plausible) approach would be to go
left-to-right, building up a phrase structure as we go. This would have the
advantage of weighting the initial material, which is almost always the most
important phrase. It would avoid randomly walking into weird phrase-structure
space and then trying to course-correct back to something sensible.
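  A speculative sketch of the left-to-right idea (phrase_similarity and
  boundary_cue? are hypothetical helpers, not existing code):

    # The first committed phrase becomes the template; later material is cut
    # where it best lines up against it, so the opening is weighted heavily.
    def build_phrases(notes)
      phrases = [[]]
      notes.each do |note|
        phrases.last << note
        template = phrases.first
        done = if phrases.size == 1
                 boundary_cue?(note)  # commit the opening phrase first
               else
                 phrases.last.size >= template.size ||
                   phrase_similarity(phrases.last, template) >= 0.8
               end
        phrases << [] if done
      end
      phrases.pop if phrases.last.empty?
      phrases
    end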
- What we're getting from the phrase detection isn't sufficient to do
composition in the way I was expecting. It doesn't draw correspondences
between phrases. If we were to track correspondences, we could use that to
perform similar operations on groups of phrases. E.g., if two phrases are
highly similar and we're exploring tactics, we should perform the same tactic
on both of them.
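  A sketch of what tracking correspondences might look like (apply_tactic
  and phrase_similarity are assumed names):

    SIMILAR = 0.8  # illustrative threshold

    # Group phrases into correspondence classes by similarity to the first
    # member of each class.
    def correspondence_classes(phrases)
      phrases.each_with_object([]) do |p, classes|
        home = classes.find { |c| phrase_similarity(p, c.first) >= SIMILAR }
        home ? home << p : classes << [p]
      end
    end

    # Perform the same tactic on every member of a class, so highly similar
    # phrases are transformed together.
    def apply_tactic_everywhere(tactic, phrases)
      correspondence_classes(phrases).flat_map do |group|
        group.map { |p| apply_tactic(tactic, p) }
      end
    end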
- Similarity is currently computed on a note-by-note basis, but phrases
  exhibit patterns of similarity that aren't evident in simple note-by-note
  comparisons.
  - Instead of looking at overall phrase duration and penalizing the outliers,
    we could view duration as a property of the phrase. It's not that we want
    to penalize outliers from the mean; it's more that we want to penalize
    phrases which bear no similarity (including duration) to others.
  - What kinds of properties do phrases have? Overall duration, overall
    contour/shape, weighted pitch-class set, duration set, 1st-order Markov
    stats, surprise. A sketch of these as a feature vector follows.
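    A hedged sketch of those properties as a feature struct (the note
    representation -- objects with pitch and duration -- is assumed):

      PhraseFeatures = Struct.new(:duration, :contour, :pitch_class_weights,
                                  :duration_set, :markov, :surprise)

      def phrase_features(notes)
        PhraseFeatures.new(
          notes.sum(&:duration),
          notes.each_cons(2).map { |a, b| (b.pitch - a.pitch) <=> 0 },  # -1/0/1
          notes.group_by { |n| n.pitch % 12 }
               .transform_values { |ns| ns.sum(&:duration) },
          notes.map(&:duration).tally,
          notes.each_cons(2).map { |a, b| [a.pitch, b.pitch] }.tally,
          nil  # surprise needs a corpus model; left unspecified here
        )
      end

    Phrase-to-phrase similarity could then compare feature vectors instead
    of (or alongside) raw note streams.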
- Change the way similarity factors into scoring. Right now the algorithm
  filters out low similarity scores and sums the remainder. That means a
  phrase candidate with mediocre similarity to every other phrase can score
  just as highly as a phrase with above-average similarity to exactly one
  other phrase.
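  A sketch of the problem and one possible fix (thresholds and weights are
  illustrative):

    THRESHOLD = 0.3

    # Current behavior as described above: drop low scores, sum the rest.
    def score_filter_and_sum(similarities)
      similarities.select { |s| s >= THRESHOLD }.sum
    end

    # One alternative: emphasize the single strongest correspondence.
    def score_best_match(similarities)
      best = similarities.max || 0.0
      best * 2.0 + (similarities.sum - best) * 0.25
    end

    score_filter_and_sum([0.4, 0.4, 0.4])  # ~1.2: diffuse mediocrity wins
    score_filter_and_sum([0.9, 0.1, 0.2])  # 0.9
    score_best_match([0.4, 0.4, 0.4])      # ~1.0
    score_best_match([0.9, 0.1, 0.2])      # ~1.9: the strong match wins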