I re-implemented my midi-RNN-midi conversion pipeline, employing some existing TensorFlow code and helpers. Their code works much better than mine! Still, it was interesting to find that they use an idea similar to mine: a two-dimensional matrix of entities representing melody and harmony.
In my case, I boiled it down to “slices of time” that either did or didn’t have certain frequencies present, represented by numbers I had mapped to them. My idea was that generated music might make more sense when it adheres to a fixed temporal resolution. This, of course, results in quantized music. In their case, notes have a free duration instead (as far as I understood; there may be nuances that weren’t apparent to me).
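To make the difference concrete, here is a minimal sketch of the “slices of time” idea: a piano-roll-style grid where each row is a time slice and each column a pitch. The note data, the 125 ms step size, and the helper name are my own illustrative choices, not the actual mapping used in either pipeline.

```python
# Illustrative sketch of a quantized "time slice" encoding, assuming notes
# arrive as (pitch, start_seconds, duration_seconds) tuples. The grid
# resolution and example notes are made up for demonstration.
import numpy as np

def notes_to_piano_roll(notes, step=0.125, num_pitches=128):
    """Quantize free-duration notes onto a fixed temporal grid.

    Returns a (num_steps, num_pitches) matrix where entry [t, p] is 1 if
    pitch p sounds during time slice t, else 0.
    """
    end_time = max(start + dur for _, start, dur in notes)
    num_steps = int(np.ceil(end_time / step))
    roll = np.zeros((num_steps, num_pitches), dtype=np.int8)
    for pitch, start, dur in notes:
        first = int(round(start / step))
        last = max(first + 1, int(round((start + dur) / step)))
        roll[first:last, pitch] = 1
    return roll

# Example: a C major triad held for half a second, then an eighth-note G.
notes = [(60, 0.0, 0.5), (64, 0.0, 0.5), (67, 0.0, 0.5), (67, 0.5, 0.25)]
roll = notes_to_piano_roll(notes)
print(roll.shape)  # (6, 128): six 125 ms slices covering 0.75 s
```

Anything that doesn’t line up with the grid gets snapped to it, which is exactly where the quantized feel comes from; an event-based encoding with free durations avoids that at the cost of a more complex vocabulary.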
The results of these experiments sometimes sound like “imprecise” playing, but for the most part the gains from the perceived interpretation seem to outweigh any downsides by quite some distance. Amazing work by the TensorFlow contributors!
Regardless, the perceived result is getting better. I wonder how much of that is just perceived improvement due to the addition of dynamics and agogics.