Regression with sequence as predictor

Question

Background: Obviously, some musical compositions are better than others (as measured by charts, sales, the last.fm site, ...). Each composition is, in fact, an ordered sequence of discrete values (notes). Putting it all together, this looks like a supervised problem: the sequence of notes (the independent "variable") can be used to predict the popularity of the composition (the dependent variable). However, each sequence is ordered and the sequences differ in length, which is unusual in the classical regression setting, where every object has the same number of predictors and the set of predictors can be permuted arbitrarily.

Questions: Is this also a supervised problem? If so, how can the popularity of a composition be predicted from its sequence of notes? Are there any special techniques for solving this problem? And if so, how can one efficiently generate a set of new note sequences with high popularity?

Jan Beran's book http://www.amazon.com/Statistics-Musicology-Interdisciplinary-Jan-Beran/dp/1584882190/ seems a natural starting place. — Nick Cox, Aug 11 '13 at 11:06

score 6 · Answer 1 · answered Aug 11 '13 at 16:03

I'll approach the problem from several angles.

There's a library in Python for symbolic manipulation and feature extraction of sheet music, music21.

Generally, discretely sampled time series fall into a few domains. You can gain a broad overview by reading a book on signal processing; there are several accessible free resources, such as Fourier and Wavelets and the DSP Guide. For a rigorous statistical approach, see Brockwell and Davis. That book focuses on forecasting, but the mindset is invaluable.

If your samples were discretely sampled, continuous-valued sequences (like an audio recording), you'd reach for any of the commonly used shift-invariant feature extraction tools. Some popular ones are the mel cepstrum, stationary wavelet transforms and shift-invariant kernels. A useful intermediate step for Fourier-based methods is the spectrogram, and for wavelets there is the continuous wavelet transform. Spectrograms have been segmented, clustered and histogrammed to achieve cutting-edge bird classification performance.

Shift-invariance is notable here, as it is precisely the property that your features should possess. For example, a simple histogram of chords is shift-invariant.
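To make the histogram claim concrete, here is a minimal sketch in plain Python. The chord symbols and the `chord_histogram` helper are made up for illustration; they are not taken from music21 or any other library. It checks that a circularly shifted progression yields the same normalized histogram.

```python
from collections import Counter

def chord_histogram(chords):
    """Normalized histogram: relative frequency of each chord symbol."""
    counts = Counter(chords)
    total = sum(counts.values())
    return {chord: n / total for chord, n in counts.items()}

# Hypothetical progression and a circularly shifted copy of it.
progression = ["C", "G", "Am", "F", "C", "G", "F", "C"]
rotated = progression[3:] + progression[:3]

# The histogram only counts occurrences, so shifting the sequence
# in time leaves it unchanged.
assert chord_histogram(progression) == chord_histogram(rotated)
```

Note that a plain histogram is invariant to any reordering, not just shifts, so it discards all temporal structure.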

There are undoubtedly many kernels that could be used for this problem, but you'll likely have to roll your own and verify both Mercer's conditions and shift-invariance. An oversimplified kernel that satisfies these conditions is a radial kernel on the distance between normalized chord histograms. The simplest way to add temporal features would be to form n-grams of the chords before histogramming.
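One possible reading of this kernel, sketched in plain Python: treat "radial" as a Gaussian RBF on the Euclidean distance between normalized n-gram histograms. That reading, the helper names `ngram_histogram` and `rbf_kernel`, the `gamma` parameter and the example progressions are all assumptions made for illustration. A Gaussian RBF on fixed feature vectors is positive definite, so Mercer's condition holds for this choice.

```python
import math
from collections import Counter

def ngram_histogram(chords, n=2):
    """Normalized histogram of consecutive chord n-grams."""
    grams = [tuple(chords[i:i + n]) for i in range(len(chords) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}

def rbf_kernel(h1, h2, gamma=1.0):
    """Gaussian RBF on the squared Euclidean distance between histograms."""
    keys = set(h1) | set(h2)
    sq_dist = sum((h1.get(k, 0.0) - h2.get(k, 0.0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)

# Hypothetical progressions of different lengths.
a = ["C", "G", "Am", "F"]
b = ["C", "G", "Am", "F", "C", "G", "Am", "F"]
c = ["F", "Am", "G", "C"]  # same chords as `a`, reversed order

h_a, h_b, h_c = (ngram_histogram(x) for x in (a, b, c))
k_aa = rbf_kernel(h_a, h_a)  # identical histograms give exactly 1.0
k_ab = rbf_kernel(h_a, h_b)  # shares most bigrams with `a`
k_ac = rbf_kernel(h_a, h_c)  # shares no bigrams with `a`
```

Because every sequence maps to a histogram over the same n-gram vocabulary, sequences of different lengths become comparable, and the resulting kernel matrix could be handed to any kernel regression method. Unlike the unigram histogram, bigrams distinguish `a` from its reversal `c`.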

Finally, predicting music popularity is not sensible for many purposes (e.g. last.fm). A more common problem is building a recommender system.
